📦 Helm Operator

Manages the complete Helm chart lifecycle — from intelligent generation through validation, live cluster operations, and GitHub persistence.

The Helm Operator is the largest domain in k8s-autopilot, coordinating 7 sub-agents through two major pipelines: a chart generation pipeline for creating new charts from scratch, and a live operations pipeline for managing releases on real clusters.

Architecture

Chart Generation Pipeline

When you ask k8s-autopilot to create a new Helm chart, the Helm Operator orchestrates a multi-phase pipeline:

Phase 1: Planning (`helm-planner`)

The Planner is itself a deep agent (compiled subgraph) that runs two internal sub-agents:

Internal Sub-Agent	Role
Requirements Analyzer	Extracts 12 critical fields from the user's request (App Type, Framework, Image, Port, etc.), classifies complexity (Simple/Medium/Complex), and detects gaps
Architecture Planner	Designs K8s resource topology (Deployment vs StatefulSet, HPA, PDB), estimates CPU/Memory, and checks dependencies

Smart Gap Detection: If critical information is missing (e.g., container image, port), the planner pauses and asks the user — prioritizing critical questions over optional ones.
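
The prioritization described above can be illustrated with a small sketch. It is not the planner's actual code: the field names, the critical/optional split, and the `detect_gaps` helper are assumptions made for this example.

```python
from typing import Optional

# Hypothetical field lists; the real planner extracts 12 fields, not listed here.
CRITICAL_FIELDS = ["image", "port"]
OPTIONAL_FIELDS = ["framework", "replicas"]


def detect_gaps(requirements: dict[str, Optional[object]]) -> list[str]:
    """Return missing fields, critical ones first, so they are asked first."""
    def missing(name: str) -> bool:
        return requirements.get(name) in (None, "")

    critical = [f for f in CRITICAL_FIELDS if missing(f)]
    optional = [f for f in OPTIONAL_FIELDS if missing(f)]
    return critical + optional


# The port is absent and the framework is blank, so both are reported.
questions = detect_gaps({"image": "nginx:1.27", "framework": "", "replicas": 2})
print(questions)
```

Returning one ordered list keeps the pause-and-ask step simple: the caller can ask the first few questions and stop, knowing the critical ones were covered first.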

Phase 2: Skill Building (`helm-skill-builder`)

If no skill exists for the requested application type, the Skill Builder generates a complete skill directory:

/skills/{app}-chart-generator/
├── SKILL.md                          # YAML frontmatter + workflow instructions
└── references/
    ├── resource-patterns.md          # K8s resource patterns
    ├── values-schema.md              # values.yaml structure
    └── templates-schema.md           # Template patterns
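
As a rough illustration of the layout above, the following sketch scaffolds an empty skill directory. The file names come from the tree; the `scaffold_skill` helper, the `name` frontmatter key, and the placeholder file contents are assumptions, not what the Skill Builder actually writes.

```python
import tempfile
from pathlib import Path

# File names are taken from the layout above; the contents are placeholders.
REFERENCE_FILES = ["resource-patterns.md", "values-schema.md", "templates-schema.md"]


def scaffold_skill(root: Path, app: str) -> Path:
    """Create an empty {app}-chart-generator skill directory under root."""
    skill_dir = root / f"{app}-chart-generator"
    refs = skill_dir / "references"
    refs.mkdir(parents=True, exist_ok=True)
    # SKILL.md starts with YAML frontmatter followed by workflow instructions.
    # The "name" key is an assumed frontmatter field.
    (skill_dir / "SKILL.md").write_text(
        f"---\nname: {app}-chart-generator\n---\n\n# Workflow\n", encoding="utf-8"
    )
    for name in REFERENCE_FILES:
        (refs / name).write_text(f"# {name}\n", encoding="utf-8")
    return skill_dir


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp), "redis")
    listing = sorted(p.relative_to(created).as_posix() for p in created.rglob("*"))
    print(listing)
```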

Phase 3: Generation (`helm-generator`)

The Generator reads the app-specific skill and generic template patterns, then writes a complete chart:

_helpers.tpl is always written first (all charts reference it)
Chart.yaml and values.yaml follow
Each template file merges generic Go patterns with app-specific configuration
Self-validates that every .Values.* reference has a corresponding key in values.yaml
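
The self-validation step in the list above could look something like the sketch below. This is an assumed approach (a regular-expression scan over template text), not the Generator's documented implementation; `missing_values` and `has_key` are hypothetical helpers. It ignores template scoping such as `with` and `range` blocks, and keys containing hyphens.

```python
import re

# Matches references such as `.Values.image.repository` inside Go templates.
VALUES_REF = re.compile(r"\.Values\.([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def has_key(values: dict, dotted: str) -> bool:
    """Walk a nested dict following a dotted path like 'image.tag'."""
    node = values
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def missing_values(template: str, values: dict) -> list[str]:
    """Return dotted paths referenced in the template but absent from values."""
    refs = sorted(set(VALUES_REF.findall(template)))
    return [ref for ref in refs if not has_key(values, ref)]


template = (
    "image: {{ .Values.image.repository }}:{{ .Values.image.tag }}\n"
    "port: {{ .Values.service.port }}"
)
values = {"image": {"repository": "nginx"}, "service": {"port": 80}}
missing = missing_values(template, values)
print(missing)  # image.tag is referenced but not defined
```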

Mandatory standards: Security contexts, resource limits, liveness/readiness probes, .enabled toggles for optional resources, no hardcoded namespaces.

Phase 4: Validation (`helm-validator`)

Runs a triple-check validation pipeline:

Check	Tool	Purpose
Lint	`helm lint .`	Syntax and structure
Template	`helm template test-release . --debug`	Logic flow and variable interpolation
Dry-Run	Server-side compatibility	Cluster API compatibility

Self-Healing: If validation fails, the validator analyzes the error, applies a fix using edit_file, and retries. After 2 consecutive failures, it escalates to the user via ask_human.
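
The retry-then-escalate behaviour can be sketched as a small loop. The three callables are stand-ins injected for illustration: `run_checks` represents the lint, template, and dry-run checks, `apply_fix` represents an edit_file-based fix, and `escalate` represents ask_human. Their signatures are assumptions.

```python
from typing import Callable, Optional

MAX_CONSECUTIVE_FAILURES = 2  # threshold stated in the text above


def validate_with_healing(
    run_checks: Callable[[], Optional[str]],
    apply_fix: Callable[[str], None],
    escalate: Callable[[str], None],
) -> bool:
    """Run checks, fixing and retrying until success or the failure limit."""
    failures = 0
    while True:
        error = run_checks()  # None means every check passed
        if error is None:
            return True
        failures += 1
        if failures >= MAX_CONSECUTIVE_FAILURES:
            escalate(error)  # hand the last error to the user
            return False
        apply_fix(error)  # attempt a targeted fix, then loop to re-validate


# Simulated run: the first check fails, the fix is applied, the second passes.
results = iter(["missing key .Values.image.tag", None])
fixes: list[str] = []
ok = validate_with_healing(lambda: next(results), fixes.append, print)
```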

Phase 5: Update (`helm-updater`)

The Updater modifies existing charts rather than creating them from scratch. It fetches the chart files, analyzes the changes needed, and applies targeted updates without rewriting the entire chart.

Live Operations Pipeline

The helm-operation sub-agent manages Helm releases on real clusters. Read-only queries take a fast path, while state-modifying operations go through a 5-phase safety pipeline.

Dual-Path Architecture

Path	Trigger	Flow
Query Fast-Path	Read-only operations (list releases, status, search)	Call tool → format → return
Safe-Track Pipeline	State-modifying operations (install, upgrade, rollback, uninstall)	Full 5-phase pipeline with HITL gates
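
A minimal sketch of the routing decision in the table above follows. The operation identifiers and the `route` function are illustrative, and sending unknown operations down the safe track is an assumption made here, not documented behaviour.

```python
from enum import Enum


class ExecutionPath(Enum):
    FAST = "query-fast-path"
    SAFE = "safe-track-pipeline"


# Operation names mirror the table above; the string identifiers are illustrative.
READ_ONLY = {"list", "status", "search"}
STATE_MODIFYING = {"install", "upgrade", "rollback", "uninstall"}


def route(operation: str) -> ExecutionPath:
    """Pick the execution path for a Helm operation."""
    op = operation.strip().lower()
    if op in READ_ONLY:
        return ExecutionPath.FAST
    if op in STATE_MODIFYING:
        return ExecutionPath.SAFE
    # Assumption: anything unrecognized is treated as mutating, to err on the safe side.
    return ExecutionPath.SAFE


print(route("status"), route("upgrade"))
```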

Read-Only Operations (Fast-Path)

Query Type	Tool
List releases	`kubernetes_get_helm_releases`
Release status	`helm_get_release_status`
Release history	`helm_get_release_history`
Search charts	`helm_search_charts`
Chart info	`helm_get_chart_info`

State-Modifying Operations (5-Phase Pipeline)

Phase	Activity	HITL Gate?
1. Discovery	Detect if release exists (Install vs Upgrade), fetch chart metadata	No
2. Planning	Validate values, generate installation plan via `helm_get_installation_plan`	No
3. Approval	Present formatted plan, wait for user sign-off	Yes
4. Execution	New installs: `helm_dry_run_install` first, then `helm_install_chart`. Upgrades: `helm_upgrade_release` with `reuse_values=true`	No
5. Verification	Confirm health via `helm_get_release_status`	No
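
The five phases can be sketched as a single function over injected tools. The tool names are those listed in the table, but their keyword arguments are assumed here, and `release_exists` is a hypothetical helper standing in for the discovery check. Stubs replace the real MCP tools so the example runs without a cluster.

```python
from typing import Any, Callable

Tool = Callable[..., Any]


def run_safe_track(
    tools: dict[str, Tool],
    approve: Callable[[Any], bool],
    release: str,
    chart: str,
    values: dict,
) -> Any:
    """Walk the five phases; tool signatures are assumed, not documented."""
    # 1. Discovery: an existing release means upgrade, otherwise install.
    exists = tools["release_exists"](release)  # hypothetical helper
    # 2. Planning
    plan = tools["helm_get_installation_plan"](chart=chart, values=values)
    # 3. Approval (HITL gate): on rejection, stop before touching the cluster.
    if not approve(plan):
        return "rejected"
    # 4. Execution
    if exists:
        tools["helm_upgrade_release"](
            release=release, chart=chart, values=values, reuse_values=True
        )
    else:
        tools["helm_dry_run_install"](release=release, chart=chart, values=values)
        tools["helm_install_chart"](release=release, chart=chart, values=values)
    # 5. Verification
    return tools["helm_get_release_status"](release=release)


calls: list[str] = []


def stub(name: str, result: Any = None) -> Tool:
    """Build a fake tool that records its name and returns a canned result."""
    def _tool(*args: Any, **kwargs: Any) -> Any:
        calls.append(name)
        return result
    return _tool


names = ["helm_get_installation_plan", "helm_upgrade_release",
         "helm_dry_run_install", "helm_install_chart"]
tools = {n: stub(n) for n in names}
tools["release_exists"] = stub("release_exists", False)
tools["helm_get_release_status"] = stub("helm_get_release_status", "deployed")

status = run_safe_track(tools, lambda plan: True, "web", "example/app", {})
print(status, calls)
```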

Context Recovery

Before asking the user for any parameter, the helm-operation sub-agent checks three sources:

Task description — the coordinator should have included full context
Operations journal — /memories/helm-operator/operations-log.md records all previous operations
User — absolute last resort
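
The lookup order above can be sketched as a fallback chain. The `resolve_parameter` helper is hypothetical, and the `name: value` journal line format is an assumption, since the real format of operations-log.md is not described here.

```python
from typing import Callable


def resolve_parameter(
    name: str,
    task_context: dict[str, str],
    journal_text: str,
    ask_user: Callable[[str], str],
) -> str:
    """Resolve a parameter from the task first, then the journal, then the user."""
    if name in task_context:
        return task_context[name]
    # Assumed journal format: one "name: value" entry per line, newest last.
    for line in reversed(journal_text.splitlines()):
        key, sep, value = line.partition(":")
        if sep and key.strip() == name:
            return value.strip()
    return ask_user(name)  # last resort


journal = "namespace: staging\nrelease: web\nnamespace: prod"
print(resolve_parameter("namespace", {}, journal, input))
```

Scanning the journal in reverse means the most recent entry wins when a parameter was recorded more than once.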

GitHub Persistence (`github-agent`)

Once a chart is validated and approved via HITL, the GitHub agent:

Discovers all generated files via ls /workspace/helm-charts/{chart}/
Reads each file's content with read_file
Commits to GitHub using GitHub MCP tools (never shell git commands)
Returns the commit URL

For new files: Uses create_or_update_file directly without checking SHA.
For existing files: Calls get_file_contents first to get the current SHA.
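
The new-versus-existing distinction can be sketched as follows. The tool names come from the text, but `call_tool`, the argument names, and the response shape are assumptions; repository, owner, and branch arguments are omitted to keep the example short.

```python
from typing import Any, Callable

CallTool = Callable[[str, dict[str, Any]], dict[str, Any]]


def commit_file(
    call_tool: CallTool, path: str, content: str, message: str, is_new: bool
) -> dict[str, Any]:
    """Create or update one file, fetching the SHA only for existing files."""
    args: dict[str, Any] = {"path": path, "content": content, "message": message}
    if not is_new:
        current = call_tool("get_file_contents", {"path": path})
        args["sha"] = current["sha"]  # assumed response field
    return call_tool("create_or_update_file", args)


log: list[tuple[str, dict[str, Any]]] = []


def fake_call_tool(name: str, args: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for an MCP client; records calls and returns canned data."""
    log.append((name, dict(args)))
    if name == "get_file_contents":
        return {"sha": "abc123"}
    return {"commit_url": "https://example.invalid/commit/1"}


commit_file(fake_call_tool, "Chart.yaml", "apiVersion: v2\n", "Add chart", is_new=True)
commit_file(fake_call_tool, "values.yaml", "replicaCount: 1\n", "Update values", is_new=False)
print([name for name, _ in log])
```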

Skills Reference

Skill Directory	Sub-Agent	Purpose
`helm-operator/helm-generator`	helm-generator	Template patterns, values schema, resource structure
`helm-operator/helm-skill-builder`	helm-skill-builder	Skill directory generation conventions
`helm-operator/helm-operation`	helm-operation	Live operations workflow references
`helm-operator/helm-validator`	helm-validator	Validation patterns and self-healing rules
`helm-operator/helm-updater`	helm-updater	Chart modification workflows
`helm-operator/github-agent`	github-agent	GitHub commit patterns

MCP Integration

MCP Server	Package	Transport	Used By
`helm_mcp_server`	`helm-mcp-server`	stdio	`helm-operation`, `helm-validator`
`github_mcp`	GitHub Copilot API	HTTP	`github-agent`

Available MCP Resources

Resource URI	Description
`helm://releases`	List all Helm releases
`helm://releases/{release_name}`	Release details and history
`helm://charts`	List available charts
`helm://charts/{repo}/{name}`	Chart metadata
`helm://charts/{repo}/{name}/readme`	Chart README
`kubernetes://cluster-info`	Kubernetes cluster info
`kubernetes://namespaces`	List namespaces
`helm://best_practices`	Helm best practices reference

Safety & Governance

HITL Gates

Gate	When	Purpose
Planning Review	After planner completes architecture design	Review requirements analysis and resource decisions
Generation Review	After templates are generated and validated	Review artifacts, specify workspace path
Execution Approval	Before any cluster mutation	Confirm install/upgrade/rollback/uninstall plan
Commit Gate	Before pushing to GitHub	Confirm branch and commit details

`[PLAN-LOCKED]` Execution

When the coordinator has already obtained user approval, sub-agents receive the task with the [PLAN-LOCKED] prefix and behave as follows:

Skip their own planning phase
Execute exactly the pre-approved parameters
Do not re-ask or modify any parameter
HumanInTheLoopMiddleware still gates the actual tool call as a safety backstop
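
Detecting the prefix can be as simple as the sketch below. The `parse_task` helper is hypothetical, as is the assumption that the prefix appears at the very start of the task description.

```python
PLAN_LOCKED_PREFIX = "[PLAN-LOCKED]"


def parse_task(description: str) -> tuple[bool, str]:
    """Return (locked, body); locked tasks skip the sub-agent's own planning."""
    text = description.lstrip()
    if text.startswith(PLAN_LOCKED_PREFIX):
        return True, text[len(PLAN_LOCKED_PREFIX):].strip()
    return False, text.strip()


locked, body = parse_task("[PLAN-LOCKED] upgrade web to chart version 1.2.0")
print(locked, body)
```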

Rejection Protocol

If a user rejects a plan, the sub-agent does not retry autonomously. It returns to the coordinator, which handles re-engagement — with a maximum of 2 plan presentations before asking the user to rephrase.
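
The presentation limit can be sketched from the coordinator's side. `negotiate_plan` and its two callables are hypothetical. A return value of `None` signals that the user should be asked to rephrase.

```python
from typing import Callable, Optional

MAX_PLAN_PRESENTATIONS = 2  # limit stated in the text above


def negotiate_plan(
    build_plan: Callable[[int], str],
    user_approves: Callable[[str], bool],
) -> Optional[str]:
    """Present up to two plans; return the approved one, or None if all are rejected."""
    for attempt in range(1, MAX_PLAN_PRESENTATIONS + 1):
        plan = build_plan(attempt)
        if user_approves(plan):
            return plan
    return None


shown: list[str] = []


def reject_all(plan: str) -> bool:
    """Simulated user who rejects every plan, recording what was shown."""
    shown.append(plan)
    return False


outcome = negotiate_plan(lambda n: f"plan-{n}", reject_all)
print(outcome, shown)
```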
