Deep Agent Components

The architecture is built on the Deep Agent pattern: every capability is isolated into a narrow, specialized node. This section documents each component that powers the autonomous infrastructure pipeline.


1. The Supervisor Agent (aws_orchestrator_supervisor.py)

This is the CI-Supervisor tool wrapper. It acts strictly as a router, mapping the overall flow of execution.

  • Key Logic: It uses create_agent_tools to encapsulate the Deep Agent logic.
  • HITL Integration: If an out-of-scope query comes in (e.g., questions about Python coding instead of infrastructure), it triggers _make_request_human_input_tool() to pause execution (interrupt()) and ask the user a clarifying question before ever touching Terraform subroutines.
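
The routing check above can be sketched in plain Python. This is an illustrative sketch, not the actual implementation: the keyword set, function names, and node labels are assumptions; in the real agent the out-of-scope branch is where `_make_request_human_input_tool()` invokes `interrupt()`.

```python
# Hypothetical sketch of the supervisor's HITL routing check.
# Out-of-scope queries pause for human clarification instead of
# ever reaching the Terraform subroutines.
INFRA_KEYWORDS = {"terraform", "aws", "vpc", "s3", "eks", "iam"}

def route_query(query: str) -> str:
    """Return the next node: the TF coordinator for infra queries,
    or a human-input pause for anything out of scope."""
    words = set(query.lower().split())
    if words & INFRA_KEYWORDS:
        return "tf_coordinator"
    # In the real agent, _make_request_human_input_tool() calls
    # interrupt() here to ask the user a clarifying question.
    return "request_human_input"
```

For example, "provision an AWS VPC" routes to the coordinator, while "explain Python decorators" pauses for clarification.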

2. The TF Coordinator (tf_cordinator.py)

This agent is the system's "brain," managing memory limits and cross-agent context matching.

  • Virtual vs Physical Filesystem: It initializes an InMemoryStore to track critical files like hitl-policies.md and AGENTS.md. It injects the sync_workspace tool to materialize the virtual /workspace/ memory paths to the actual physical disk.
  • State Transforms:
    • Implements input_transform to bind the graph input.
    • Generates the TFCoordinatorContext object merging .env configurations.
    • Applies output_transform to flatten the final AIMessage returned to the Supervisor.
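
The transform chain above can be illustrated with a minimal sketch. The dict shapes, field names, and the `build_context` helper are assumptions made for illustration; only the transform names and the `GITHUB_MCP_URL` variable come from the source.

```python
# Minimal sketch (assumed state shapes) of the coordinator's transforms.
import os

def input_transform(raw_input: dict) -> dict:
    """Bind the incoming graph input into the coordinator's state."""
    return {"messages": raw_input.get("messages", []),
            "workspace": "/workspace/"}

def build_context(state: dict) -> dict:
    """Merge .env configuration into a TFCoordinatorContext-like dict."""
    return {**state, "github_mcp_url": os.environ.get("GITHUB_MCP_URL", "")}

def output_transform(state: dict) -> str:
    """Flatten the final AI message content returned to the Supervisor."""
    messages = state.get("messages", [])
    return messages[-1]["content"] if messages else ""
```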

3. The Planner Supervisor Agent & Subgraph (planner_supervisor_agent.py)

Before any code is generated, a full LangGraph subgraph executes a 3-phase planning workflow. It enforces strict transfer paths via Command(goto=..., graph=Command.PARENT) routing logic.
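
The three-phase handoff can be sketched as a simple routing function. The phase order reflects the sections below, but the node names and the `"__parent__"` sentinel are assumptions standing in for the real `Command(goto=..., graph=Command.PARENT)` objects.

```python
# Sketch of the 3-phase planning handoff. Each phase routes to the next,
# and the final phase hands control back to the parent graph, mirroring
# Command(goto=..., graph=Command.PARENT) in the real subgraph.
PHASES = ["req_analyser", "sec_best_practices", "execution_planner"]

def next_phase(current: str) -> str:
    """Route to the next planning phase, or back to the parent graph."""
    idx = PHASES.index(current)
    if idx + 1 < len(PHASES):
        return PHASES[idx + 1]
    return "__parent__"  # stands in for Command.PARENT
```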

3.1. ReqAnalyserAgent (req_analyser_agent.py)

  • Bound by a CRITICAL: SEQUENTIAL EXECUTION ONLY prompt.
  • Executes infra_requirements_parser_tool → aws_service_discovery_tool → get_final_resource_attributes_tool.
  • Halts and automatically requests human input if critical attributes like the region or environment context are missing.
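
The halt-on-missing-attributes behavior can be sketched as follows. The required-attribute names come from the source; the return shape and control flow are illustrative assumptions.

```python
# Hedged sketch of the analyser's halt condition: if critical
# attributes are missing after parsing, execution pauses for human
# input rather than proceeding with guesses.
REQUIRED_ATTRIBUTES = ("region", "environment")

def analyse_requirements(parsed: dict) -> dict:
    """Return an 'ok' result, or flag the run for human input."""
    missing = [a for a in REQUIRED_ATTRIBUTES if not parsed.get(a)]
    if missing:
        return {"status": "needs_human_input", "missing": missing}
    return {"status": "ok", "attributes": parsed}
```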

3.2. SecBestPracticesAgent (sec_n_best_practices_agent.py)

  • Executes security_compliance_tool → best_practices_tool.
  • Enforces structural infrastructure standards (Least-Privilege rules, VPC flow logs, SSE checks).

3.3. ExecutionPlannerAgent (execution_planner_agent.py)

  • Executes structural optimization tools ending with write_service_skills_tool, dumping all findings, attributes, and execution instructions into a single SKILL.md file stored in the virtual filesystem under /skills/.
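
The shape of that SKILL.md dump could look like the sketch below. The markdown layout is an assumption; the source only specifies that findings, attributes, and execution instructions land in a single file.

```python
# Illustrative sketch (assumed format) of rendering a SKILL.md blueprint
# from the planner's collected findings.
def render_skill_md(service: str, attributes: dict, steps: list) -> str:
    lines = [f"# SKILL: {service}", "", "## Resource Attributes"]
    lines += [f"- {k}: {v}" for k, v in attributes.items()]
    lines += ["", "## Execution Steps"]
    lines += [f"{i}. {step}" for i, step in enumerate(steps, 1)]
    return "\n".join(lines)
```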

4. Generator & Operations Subagents (subagents.py)

Code generation, validation, and code-shipment nodes are split into tightly scoped operations subagents to prevent context leakage.

  • tf-skill-builder

    • Fallback generator utilized if a specific AWS service needs ad-hoc SKILL.md definitions that don't match the primary planner outcomes.
  • tf-generator

    • Strictly bound to follow SKILL.md. It is hard-programmed to write ONLY to absolute paths /workspace/terraform_modules/{service}/.
    • If its write_file tool fails 3 times, it halts execution to trigger internal recovery algorithms.
  • tf-updater & update-planner

    • These agents are specifically designed to ingest GitHub data. The planner analyzes dependencies (list_directory_contents), while the updater makes surgical line-edits, explicitly forbidden from performing full rewrites.
  • tf-validator

    • Runs inside an AWS sandbox interface. It is instructed to use execution paths WITHOUT leading slashes (workspace/terraform_modules), distinguishing the actual shell execution directory on the host from the virtual FS /workspace/ where memory resides.
  • github-agent

    • Built via _build_mcp_subagent(include_filesystem=True). This just-in-time (JIT) pattern attaches a FilesystemMiddleware to the agent only when needed, letting it bypass the virtual memory constraints and read physical disk files directly before committing and pushing them through MCP API endpoints.
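
The tf-generator's write discipline described above can be sketched as follows. The path template and the three-failure limit come from the source; the helper names and the recovery signaling are assumptions.

```python
# Sketch of the tf-generator's write discipline: writes go only to
# /workspace/terraform_modules/{service}/, and a third consecutive
# write_file failure halts the run to trigger recovery.
MAX_WRITE_FAILURES = 3

def module_path(service: str, filename: str) -> str:
    """Build the only absolute path the generator may write to."""
    return f"/workspace/terraform_modules/{service}/{filename}"

def write_with_halt(write_file, path: str, content: str) -> bool:
    """Retry write_file up to the limit; False signals a halt."""
    for _ in range(MAX_WRITE_FAILURES):
        try:
            write_file(path, content)
            return True
        except OSError:
            continue
    return False  # halt and trigger internal recovery
```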

5. MCP Server Integrations

The AWS Orchestrator utilizes the Model Context Protocol (MCP) to detach volatile external functions from the LLM prompt. Two fundamental MCP servers drive the system's accuracy and deployment mechanisms:

1. Terraform Registry MCP Server

  • Where it's used: Exclusively inside the tf-planner subgraph (specifically the Requirements Analyser and Execution Planner components).
  • The Scenario: When a user requests a novel AWS module (e.g. an EKS Cluster), the agent does not guess the module's inputs from stale model training data. Instead, the planner queries the live Terraform Registry through MCP tools, fetching the latest AWS provider schemas, module version constraints, and required inputs to author an accurate SKILL.md blueprint.
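
The lookup could be shaped like the sketch below. The tool name, argument names, and response fields are assumptions for illustration, not the actual Terraform Registry MCP schema.

```python
# Hypothetical shape of the planner's registry lookup via MCP: ask the
# live registry for the current schema instead of trusting training data.
def plan_module_inputs(mcp_call, provider: str, resource: str) -> dict:
    """Fetch the latest provider schema for one resource via MCP."""
    schema = mcp_call("get_provider_schema",
                      {"provider": provider, "resource": resource})
    return {"required_inputs": schema.get("required", []),
            "version": schema.get("provider_version", "unknown")}
```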

2. GitHub Copilot MCP Server

  • Where it's used: Driven by GITHUB_MCP_URL in .env, it attaches to github-agent, tf-updater, and update-planner via _build_mcp_subagent().
  • The Scenario:
    • New Code Commits (Delivery): The github-agent uses endpoints like create_or_update_file to push local code buffers directly to the user's remote repository, avoiding brittle shell-based git clone or git add failures.
    • Existing Code Updates (Discovery): When modifying existing infrastructure, the update-planner utilizes list_directory_contents and get_file_contents to fetch variables.tf and main.tf directly from GitHub. This lets the orchestrator read the pre-existing state, define a surgical diff, and push only the changed lines back, ensuring nothing else breaks.
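
The read-diff-push flow above can be sketched in a few lines. The endpoint names come from the source; the single-line diff logic and argument shapes are illustrative assumptions.

```python
# Sketch of the surgical update flow: read the existing file via the
# GitHub MCP endpoints, change only the targeted line, push it back.
def surgical_update(mcp_call, repo: str, path: str,
                    old_line: str, new_line: str) -> bool:
    """Replace one line in a remote file; never a full rewrite."""
    content = mcp_call("get_file_contents", {"repo": repo, "path": path})
    if old_line not in content.splitlines():
        return False  # nothing to update; avoid a risky rewrite
    updated = "\n".join(new_line if ln == old_line else ln
                        for ln in content.splitlines())
    mcp_call("create_or_update_file",
             {"repo": repo, "path": path, "content": updated})
    return True
```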