
📦 Helm Operator

Manages the complete Helm chart lifecycle — from intelligent generation through validation, live cluster operations, and GitHub persistence.

The Helm Operator is the largest domain in k8s-autopilot, coordinating 7 sub-agents through two major pipelines: a chart generation pipeline for creating new charts from scratch, and a live operations pipeline for managing releases on real clusters.


Architecture


Chart Generation Pipeline

When you ask k8s-autopilot to create a new Helm chart, the Helm Operator orchestrates a multi-phase pipeline:

Phase 1: Planning (helm-planner)

The Planner is itself a deep agent (compiled subgraph) that runs two internal sub-agents:

| Internal Sub-Agent | Role |
| --- | --- |
| Requirements Analyzer | Extracts 12 critical fields from the user's request (App Type, Framework, Image, Port, etc.), classifies complexity (Simple/Medium/Complex), and detects gaps |
| Architecture Planner | Designs K8s resource topology (Deployment vs StatefulSet, HPA, PDB), estimates CPU/Memory, and checks dependencies |

Smart Gap Detection: If critical information is missing (e.g., container image, port), the planner pauses and asks the user — prioritizing critical questions over optional ones.
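
The gap-detection behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the planner's actual code: the field names, the split into critical and optional lists, and the `detect_gaps` helper are all assumptions made for the example.

```python
from typing import Any

# Hypothetical field lists for illustration; the real analyzer extracts 12 fields.
CRITICAL_FIELDS = ["image", "port"]
OPTIONAL_FIELDS = ["framework", "replicas"]


def detect_gaps(requirements: dict[str, Any]) -> list[str]:
    """Return the names of missing fields, critical ones first."""

    def missing(fields: list[str]) -> list[str]:
        # A field counts as missing when it is absent, None, or an empty string.
        return [f for f in fields if requirements.get(f) in (None, "")]

    return missing(CRITICAL_FIELDS) + missing(OPTIONAL_FIELDS)


questions = detect_gaps({"image": "nginx:1.27", "framework": "static"})
print(questions)  # ['port', 'replicas']
```

Because critical gaps are listed first, a caller can ask only the first question and pause, which matches the prioritization described above.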

Phase 2: Skill Building (helm-skill-builder)

If no skill exists for the requested application type, the Skill Builder generates a complete skill directory:

/skills/{app}-chart-generator/
├── SKILL.md                     # YAML frontmatter + workflow instructions
└── references/
    ├── resource-patterns.md     # K8s resource patterns
    ├── values-schema.md         # values.yaml structure
    └── templates-schema.md      # Template patterns
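
As a rough sketch of the layout above, the following creates an empty skeleton with the same file names. The `scaffold_skill` helper and the idea of creating empty placeholder files are assumptions for illustration; the real Skill Builder generates the file contents as well.

```python
from pathlib import Path
import tempfile

# Relative paths taken from the directory tree above.
SKILL_FILES = [
    "SKILL.md",
    "references/resource-patterns.md",
    "references/values-schema.md",
    "references/templates-schema.md",
]


def scaffold_skill(root: Path, app: str) -> Path:
    """Create an empty {app}-chart-generator skeleton under root (hypothetical helper)."""
    skill_dir = root / f"{app}-chart-generator"
    for rel in SKILL_FILES:
        target = skill_dir / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.touch()
    return skill_dir


with tempfile.TemporaryDirectory() as tmp:
    created = scaffold_skill(Path(tmp), "redis")
    print(sorted(p.name for p in (created / "references").iterdir()))
```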

Phase 3: Generation (helm-generator)

The Generator reads the app-specific skill and generic template patterns, then writes a complete chart:

  • _helpers.tpl is always written first (all charts reference it)
  • Chart.yaml and values.yaml follow
  • Each template file merges generic Go patterns with app-specific configuration
  • The generator self-validates that every .Values.* reference has a corresponding key in values.yaml
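
One way to picture the self-validation step is a scan for `.Values.*` references that are then looked up in the values mapping. The sketch below is an assumption about how such a check could work, not the generator's implementation; it uses a simple regular expression, takes values as an already-parsed dict, and does not handle constructs such as `index .Values "key"`.

```python
import re
from typing import Any

# Matches dotted references such as .Values.image.repository
VALUES_REF = re.compile(r"\.Values\.([A-Za-z_]\w*(?:\.[A-Za-z_]\w*)*)")


def has_key(values: dict[str, Any], dotted: str) -> bool:
    """Walk a dotted path through nested dicts; False if any segment is missing."""
    node: Any = values
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return False
        node = node[part]
    return True


def missing_value_keys(template: str, values: dict[str, Any]) -> list[str]:
    """Return the referenced .Values paths that the values mapping lacks, sorted."""
    refs = set(VALUES_REF.findall(template))
    return sorted(r for r in refs if not has_key(values, r))


template = (
    "image: {{ .Values.image.repository }}:{{ .Values.image.tag }}\n"
    "port: {{ .Values.service.port }}"
)
values = {"image": {"repository": "nginx"}, "service": {"port": 80}}
print(missing_value_keys(template, values))  # ['image.tag']
```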

Mandatory standards: Security contexts, resource limits, liveness/readiness probes, .enabled toggles for optional resources, no hardcoded namespaces.

Phase 4: Validation (helm-validator)

Runs a triple-check validation pipeline:

| Check | Tool | Purpose |
| --- | --- | --- |
| Lint | helm lint . | Syntax and structure |
| Template | helm template test-release . --debug | Logic flow and variable interpolation |
| Dry-Run | Server-side compatibility | Cluster API compatibility |

Self-Healing: If validation fails, the validator analyzes the error, applies a fix using edit_file, and retries. After 2 consecutive failures, it escalates to the user via ask_human.
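
The retry-then-escalate behaviour can be sketched as a small loop. This is one reading of the rule above (the first failure triggers a fix and a retry, the second consecutive failure escalates), and the callables `run_checks`, `apply_fix`, and `ask_human` are hypothetical stand-ins for the validator's checks, its edit_file-based fix, and the ask_human escalation.

```python
from typing import Callable, Optional

MAX_CONSECUTIVE_FAILURES = 2  # threshold taken from the text above


def validate_with_healing(
    run_checks: Callable[[], Optional[str]],  # returns an error message, or None on success
    apply_fix: Callable[[str], None],         # stand-in for an edit_file-based fix
    ask_human: Callable[[str], None],         # stand-in for the ask_human escalation
) -> bool:
    """Return True once checks pass, False after escalating to the user."""
    failures = 0
    while True:
        error = run_checks()
        if error is None:
            return True
        failures += 1
        if failures >= MAX_CONSECUTIVE_FAILURES:
            ask_human(error)
            return False
        apply_fix(error)


# Usage: the first run fails, the fix is applied, the second run passes.
results = iter(["values.yaml: missing key image.tag", None])
fixed: list[str] = []
ok = validate_with_healing(lambda: next(results), fixed.append, print)
print(ok, len(fixed))  # True 1
```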

Phase 5: Update (helm-updater)

The Updater is used for modifying existing charts rather than creating them from scratch. It fetches the chart files, analyzes the changes needed, and applies targeted updates without rewriting the entire chart.


Live Operations Pipeline

The helm-operation sub-agent manages Helm releases on real clusters through a 5-phase safety pipeline:

Dual-Path Architecture

| Path | Trigger | Flow |
| --- | --- | --- |
| Query Fast-Path | Read-only operations (list releases, status, search) | Call tool → format → return |
| Safe-Track Pipeline | State-modifying operations (install, upgrade, rollback, uninstall) | Full 5-phase pipeline with HITL gates |
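
The routing decision in the table above amounts to classifying an operation before any tool is called. A minimal sketch, assuming operations arrive as short keywords (the keyword sets and the `choose_path` helper are illustrative, not part of k8s-autopilot):

```python
# Operation keywords are assumptions for illustration, based on the table above.
READ_ONLY = {"list", "status", "history", "search", "info"}
STATE_MODIFYING = {"install", "upgrade", "rollback", "uninstall"}


def choose_path(operation: str) -> str:
    """Route an operation to the fast path or the safe-track pipeline."""
    op = operation.strip().lower()
    if op in READ_ONLY:
        return "fast-path"
    if op in STATE_MODIFYING:
        return "safe-track"
    # Refusing unknown operations is a design choice of this sketch.
    raise ValueError(f"unknown operation: {operation!r}")


print(choose_path("status"))   # fast-path
print(choose_path("upgrade"))  # safe-track
```

Raising on unknown operations keeps anything unclassified away from the cluster; treating unknown operations as state-modifying would be an equally conservative alternative.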

Read-Only Operations (Fast-Path)

| Query Type | Tool |
| --- | --- |
| List releases | kubernetes_get_helm_releases |
| Release status | helm_get_release_status |
| Release history | helm_get_release_history |
| Search charts | helm_search_charts |
| Chart info | helm_get_chart_info |

State-Modifying Operations (5-Phase Pipeline)

| Phase | Activity | HITL Gate? |
| --- | --- | --- |
| 1. Discovery | Detect if release exists (Install vs Upgrade), fetch chart metadata | No |
| 2. Planning | Validate values, generate installation plan via helm_get_installation_plan | No |
| 3. Approval | Present formatted plan, wait for user sign-off | Yes |
| 4. Execution | New installs: helm_dry_run_install first, then helm_install_chart. Upgrades: helm_upgrade_release with reuse_values=true | No |
| 5. Verification | Confirm health via helm_get_release_status | No |
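
To make the ordering concrete, the sketch below lists which tools would be called for one install-or-upgrade request. The tool names come from the table above; the `tool_sequence` function, and the simplification of passing the discovery result and the approval decision as booleans, are assumptions for illustration.

```python
def tool_sequence(release_exists: bool, approved: bool) -> list[str]:
    """Return the ordered tool names for an install-or-upgrade request (illustrative)."""
    # Phase 1 (Discovery) is represented by the release_exists flag.
    calls = ["helm_get_installation_plan"]        # Phase 2: Planning
    if not approved:                              # Phase 3: Approval (HITL gate)
        return calls                              # nothing touches the cluster
    if release_exists:                            # Phase 4: Execution
        calls.append("helm_upgrade_release")      # called with reuse_values=true
    else:
        calls += ["helm_dry_run_install", "helm_install_chart"]
    calls.append("helm_get_release_status")       # Phase 5: Verification
    return calls


print(tool_sequence(release_exists=False, approved=True))
# ['helm_get_installation_plan', 'helm_dry_run_install', 'helm_install_chart', 'helm_get_release_status']
```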

Context Recovery

Before asking the user for any parameter, the helm-operation sub-agent checks three sources:

  1. Task description — the coordinator should have included full context
  2. Operations journal (/memories/helm-operator/operations-log.md), which records all previous operations
  3. User — absolute last resort
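
The lookup order above can be sketched as a simple fallback chain. Representing the task description and the journal as dictionaries is an assumption made to keep the example short (the journal is a Markdown file in practice), and `resolve_parameter` is a hypothetical helper.

```python
from typing import Callable


def resolve_parameter(
    name: str,
    task_context: dict[str, str],   # values parsed from the task description (assumed shape)
    journal: dict[str, str],        # values parsed from the operations journal (assumed shape)
    ask_user: Callable[[str], str],
) -> tuple[str, str]:
    """Return (value, source), consulting the user only as a last resort."""
    for source, store in (("task", task_context), ("journal", journal)):
        if name in store:
            return store[name], source
    return ask_user(name), "user"


value, source = resolve_parameter(
    "namespace",
    task_context={"release": "web"},
    journal={"namespace": "staging"},
    ask_user=lambda n: input(f"{n}? "),
)
print(value, source)  # staging journal
```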

GitHub Persistence (github-agent)

Once a chart is validated and approved via HITL, the GitHub agent:

  1. Discovers all generated files via ls /workspace/helm-charts/{chart}/
  2. Reads each file's content with read_file
  3. Commits to GitHub using GitHub MCP tools (never shell git commands)
  4. Returns the commit URL

For new files, the agent uses create_or_update_file directly without checking the SHA. For existing files, it calls get_file_contents first to get the current SHA.
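
The new-versus-existing distinction can be sketched as a function that only looks up a SHA when the file already exists. The argument names in the returned dictionary, the `file_exists` flag, and the `get_sha` callable (standing in for a get_file_contents call) are assumptions for illustration, not a statement of the GitHub MCP tool's exact schema.

```python
from typing import Any, Callable


def build_commit_args(
    path: str,
    content: str,
    message: str,
    branch: str,
    file_exists: bool,
    get_sha: Callable[[str], str],  # hypothetical wrapper around get_file_contents
) -> dict[str, Any]:
    """Assemble arguments for a create_or_update_file call (illustrative field names)."""
    args: dict[str, Any] = {
        "path": path,
        "content": content,
        "message": message,
        "branch": branch,
    }
    if file_exists:
        # Existing files: fetch the current SHA first.
        args["sha"] = get_sha(path)
    return args


new_file = build_commit_args(
    "charts/web/Chart.yaml", "apiVersion: v2\n", "Add chart", "main",
    file_exists=False, get_sha=lambda p: "unused",
)
print("sha" in new_file)  # False
```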


Skills Reference

| Skill Directory | Sub-Agent | Purpose |
| --- | --- | --- |
| helm-operator/helm-generator | helm-generator | Template patterns, values schema, resource structure |
| helm-operator/helm-skill-builder | helm-skill-builder | Skill directory generation conventions |
| helm-operator/helm-operation | helm-operation | Live operations workflow references |
| helm-operator/helm-validator | helm-validator | Validation patterns and self-healing rules |
| helm-operator/helm-updater | helm-updater | Chart modification workflows |
| helm-operator/github-agent | github-agent | GitHub commit patterns |

MCP Integration

| MCP Server | Package | Transport | Used By |
| --- | --- | --- | --- |
| helm_mcp_server | helm-mcp-server | stdio | helm-operation, helm-validator |
| github_mcp | GitHub Copilot API | HTTP | github-agent |

Available MCP Resources

| Resource URI | Description |
| --- | --- |
| helm://releases | List all Helm releases |
| helm://releases/{release_name} | Release details and history |
| helm://charts | List available charts |
| helm://charts/{repo}/{name} | Chart metadata |
| helm://charts/{repo}/{name}/readme | Chart README |
| kubernetes://cluster-info | Kubernetes cluster info |
| kubernetes://namespaces | List namespaces |
| helm://best_practices | Helm best practices reference |

Safety & Governance

HITL Gates

| Gate | When | Purpose |
| --- | --- | --- |
| Planning Review | After planner completes architecture design | Review requirements analysis and resource decisions |
| Generation Review | After templates are generated and validated | Review artifacts, specify workspace path |
| Execution Approval | Before any cluster mutation | Confirm install/upgrade/rollback/uninstall plan |
| Commit Gate | Before pushing to GitHub | Confirm branch and commit details |

[PLAN-LOCKED] Execution

When the coordinator has already obtained user approval, sub-agents receive the [PLAN-LOCKED] prefix:

  • Skip their own planning phase
  • Execute exactly the pre-approved parameters
  • Do not re-ask or modify any parameter
  • HumanInTheLoopMiddleware still gates the actual tool call as a safety backstop
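
A sub-agent could detect the prefix with a check like the one below. This is a sketch under the assumption that [PLAN-LOCKED] appears at the very start of the task text; the `parse_task` helper and the phase names are hypothetical.

```python
PLAN_LOCKED = "[PLAN-LOCKED]"


def parse_task(task: str) -> tuple[bool, str]:
    """Return (locked, task_body); the prefix is assumed to lead the task text."""
    stripped = task.lstrip()
    if stripped.startswith(PLAN_LOCKED):
        return True, stripped[len(PLAN_LOCKED):].strip()
    return False, task.strip()


locked, body = parse_task("[PLAN-LOCKED] upgrade web to chart 1.4.2 in staging")
# When locked, the planning phase is skipped; tool calls are still gated separately.
phases = ["execute"] if locked else ["plan", "approve", "execute"]
print(locked, phases)  # True ['execute']
```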

Rejection Protocol

If a user rejects a plan, the sub-agent does not retry autonomously. It returns to the coordinator, which handles re-engagement — with a maximum of 2 plan presentations before asking the user to rephrase.
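
From the coordinator's side, the cap on plan presentations can be sketched as a small decision function. The function name and the returned step labels are assumptions for illustration; only the limit of 2 comes from the text above.

```python
MAX_PLAN_PRESENTATIONS = 2  # limit stated in the rejection protocol


def next_step(presentations_so_far: int, approved: bool) -> str:
    """Decide what the coordinator does after the user responds to a plan."""
    if approved:
        return "execute"
    if presentations_so_far >= MAX_PLAN_PRESENTATIONS:
        return "ask_user_to_rephrase"
    return "present_revised_plan"


print(next_step(1, approved=False))  # present_revised_plan
print(next_step(2, approved=False))  # ask_user_to_rephrase
```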