A/B Testing with Experiments
Run two versions (e.g., v1 and v2) side by side and compare metrics before promoting one as the winner. Argo Rollouts Experiments create temporary ReplicaSets for each variant. See Onboarding to set up a canary rollout with an AnalysisTemplate first.
Prerequisites
| Component | Status |
|---|---|
| Kubernetes cluster | Accessible via kubectl |
| Argo Rollouts | Installed (kubectl get rollouts -A) |
| Argo Rollout MCP Server | Running (Docker recommended) |
| Onboarded Rollout | Canary strategy, onboarded in target namespace |
| AnalysisTemplate | Exists for the rollout (see Onboarding for setup) |
| Prometheus | Required for analysis-backed experiments |
Experiments create separate ReplicaSets for metric comparison. Weighted traffic routing for experiments depends on your ingress (Traefik, Istio, SMI, ALB). Traefik does not support experiment traffic routing — use experiments for metrics comparison only (experiment-as-analysis pattern).
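Under the hood, an Experiment is an ordinary Kubernetes resource. The following is a rough sketch of what the created object looks like, using the names from this guide; the empty `template:` blocks stand in for the pod specs the server copies from the resolved stable and canary ReplicaSets (abbreviated here):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Experiment
metadata:
  name: hello-world-ab-test
  namespace: default
spec:
  duration: 30m
  templates:
    - name: baseline
      replicas: 1
      template: {}   # pod spec copied from the stable ReplicaSet
    - name: candidate
      replicas: 1
      template: {}   # pod spec copied from the canary ReplicaSet
  analyses:
    - name: success-rate
      templateName: hello-world-analysis
```

Both templates get their own temporary ReplicaSet for the lifetime of the experiment; when the experiment completes, those ReplicaSets are scaled down.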
Environment Setup
Verify Rollout and AnalysisTemplate
```shell
kubectl get rollout hello-world -n default
kubectl get analysistemplate -n default
```
Ensure Canary Deployment in Progress
For experiments that use specRef: "stable" and specRef: "canary", the rollout must have a canary deployment in progress. Trigger a new version first:
"Deploy hello-world:v2 to the hello-world rollout in default."
Then create the experiment while the canary is running.
specRef resolution: When using specRef in templates, the MCP server resolves stable/canary (or active/preview for blue-green) from the Rollout's ReplicaSets. You must pass rollout_name so the server can fetch the correct pod templates.
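The resolution step can be illustrated with a short sketch, assuming the Rollout status carries the `stableRS` and `currentPodHash` pod-template hashes that Argo Rollouts records for the stable and latest ReplicaSets. The function below mirrors what the server is described as doing; it is an illustration, not the server's actual code:

```python
def resolve_spec_refs(rollout_status: dict) -> dict:
    """Map specRef names to ReplicaSet pod-template hashes.

    rollout_status is the Rollout's .status dict; stableRS and
    currentPodHash identify the stable and latest ReplicaSets.
    """
    hashes = {
        "stable": rollout_status["stableRS"],
        "canary": rollout_status["currentPodHash"],
    }
    if hashes["stable"] == hashes["canary"]:
        # No canary in progress: stable and latest point at the same
        # ReplicaSet, so specRef "canary" has nothing to resolve to.
        raise ValueError("rollout has no canary deployment in progress")
    return hashes
```

This is also why step 1 (deploying a new image) must happen first: until a canary is rolling out, both hashes point at the same ReplicaSet.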
Start MCP Server (Docker — recommended)
```shell
docker run --rm -it \
  -p 8768:8768 \
  -v ~/.kube:/app/.kube:ro \
  -e K8S_KUBECONFIG=/app/.kube/config \
  talkopsai/argo-rollout-mcp-server:latest
```
Resource for Monitoring
| Resource | Purpose |
|---|---|
| argorollout://experiments/default/hello-world-ab-test/status | Experiment phase, template statuses, analysis results |
Experiment Phases
| Phase | Meaning |
|---|---|
| Pending | Experiment created, pods not yet ready |
| Running | Both templates running, analysis in progress |
| Successful | All analyses passed, experiment complete |
| Failed | Analysis failed or pods unhealthy |
| Error | Configuration issue |
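When scripting around these phases (for example, deciding when to stop polling), only three of them are terminal. A minimal helper using the phase strings above:

```python
# Phases after which the experiment will not change state again.
TERMINAL_PHASES = {"Successful", "Failed", "Error"}

def is_terminal(phase: str) -> bool:
    """True once the experiment has finished, for better or worse."""
    return phase in TERMINAL_PHASES
```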
Create Experiment Workflow
Run two versions side-by-side for metric comparison. Templates reference the rollout's stable and canary ReplicaSets.
| Step | Action | Prompt / Tool |
|---|---|---|
| 1 | Trigger canary (if not already) | "Deploy hello-world:v2 to the hello-world rollout in default." |
| 2 | Create experiment | "Create an A/B test experiment called hello-world-ab-test in default — run baseline (stable) and candidate (canary) side by side for 30 minutes." |
| 3 | Or use tool | argo_create_experiment(name="hello-world-ab-test", namespace="default", templates=[{"name": "baseline", "specRef": "stable"}, {"name": "candidate", "specRef": "canary"}], duration="30m", rollout_name="hello-world", analyses=[{"name": "success-rate", "templateName": "hello-world-analysis"}]) |
Notes:
- rollout_name is required when templates use specRef — the server resolves stable/canary from the Rollout's ReplicaSets.
- Replace hello-world-analysis with your actual AnalysisTemplate name from onboarding.
Monitor Experiment Workflow
| Step | Action | Prompt / Tool |
|---|---|---|
| 1 | Poll status | "What is the status of the hello-world-ab-test experiment in default?" |
| 2 | Or fetch resource | argorollout://experiments/default/hello-world-ab-test/status |
| 3 | Review | Check template_statuses and analysis_runs in response |
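The polling step can be sketched as a small loop. `fetch_status` here is a hypothetical stand-in for however your MCP client reads the status resource above; it just needs to return a dict with a "phase" key:

```python
import time

def wait_for_experiment(fetch_status, interval_s=30, max_polls=120):
    """Poll until the experiment reaches a terminal phase.

    fetch_status: zero-arg callable returning the experiment status
    dict (at minimum a "phase" key). Returns the final status dict,
    or raises TimeoutError if the polling budget runs out.
    """
    terminal = {"Successful", "Failed", "Error"}
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("phase") in terminal:
            return status
        time.sleep(interval_s)
    raise TimeoutError("experiment did not finish within the polling budget")
```

With the defaults shown (30 s between polls, 120 polls), the budget comfortably covers a 30-minute experiment plus pod startup time.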
Promote Candidate Workflow (If Candidate Wins)
| Step | Action | Prompt / Tool |
|---|---|---|
| 1 | Promote candidate | "Deploy hello-world:v2 to the hello-world rollout in default." (if not already) + "Fully promote the hello-world rollout in default." |
| 2 | Or use tool | argo_update_rollout(update_type='image', name="hello-world", new_image="hello-world:v2", namespace="default") then argo_manage_rollout_lifecycle(action='promote_full', ...) |
| 3 | Delete experiment | "Delete the hello-world-ab-test experiment in default." |
Clean Up Workflow (If Baseline Wins)
| Step | Action | Prompt / Tool |
|---|---|---|
| 1 | Abort rollout | "Abort the hello-world rollout in default." |
| 2 | Delete experiment | "Delete the hello-world-ab-test experiment in default." |
| 3 | Or use tool | argo_delete_experiment(name="hello-world-ab-test", namespace="default") |
Inconclusive — Extend Duration or Manual Review
| Step | Action | Prompt / Tool |
|---|---|---|
| 1 | Review analysis | Fetch argorollout://experiments/default/hello-world-ab-test/status |
| 2 | Manual decision | Extend experiment duration (re-create with longer duration) or manually promote/abort based on metrics |
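The manual decision usually reduces to comparing the two success rates against a margin, with a guard against deciding on too few samples. A hedged sketch of that logic (the thresholds are illustrative defaults, not part of the MCP server or the AnalysisTemplate):

```python
def ab_decision(baseline_ok, baseline_total, candidate_ok, candidate_total,
                min_samples=1000, margin=0.01):
    """Return "promote", "abort", or "inconclusive".

    *_ok / *_total are successful and total request counts, e.g. taken
    from the Prometheus queries behind the AnalysisTemplate.
    """
    if baseline_total < min_samples or candidate_total < min_samples:
        # Too little traffic to trust either rate: extend the
        # experiment duration instead of deciding now.
        return "inconclusive"
    base_rate = baseline_ok / baseline_total
    cand_rate = candidate_ok / candidate_total
    if cand_rate >= base_rate + margin:
        return "promote"   # candidate clearly better
    if cand_rate <= base_rate - margin:
        return "abort"     # baseline wins; keep stable
    return "inconclusive"  # within the margin: re-run longer
```

For production use you would replace the fixed margin with a proper statistical test, but the shape of the decision is the same.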
Natural Language Prompts
Copy-paste these prompts into your MCP client (Cursor, Claude, etc.).
Create Experiment
"Create an A/B test experiment called hello-world-ab-test in default — run baseline (stable) and candidate (canary) side by side for 30 minutes. Use rollout hello-world for specRef resolution."
"Start an Argo Experiment named hello-world-ab-test in default with two templates: baseline (stable spec) and candidate (canary spec), running for 1 hour. Resolve specRef from rollout hello-world."
When using specRef, the MCP client must pass rollout_name="hello-world" to argo_create_experiment so the server can resolve stable/canary from the Rollout's ReplicaSets.
Monitor
"What is the status of the hello-world-ab-test experiment in default?"
"Show me the analysis results for the hello-world-ab-test experiment in default."
Clean Up
"Delete the hello-world-ab-test experiment in default — the baseline won, keeping stable."
"Clean up the hello-world-ab-test experiment in default."
Quick Reference: Tool → Prompt Mapping
| Tool | Example Prompt |
|---|---|
| argo_create_experiment | "Create an A/B test experiment called hello-world-ab-test in default — run baseline and candidate side by side for 30 minutes." |
| argo_delete_experiment | "Delete the hello-world-ab-test experiment in default." |
| argorollout://experiments/default/hello-world-ab-test/status | "What is the status of the hello-world-ab-test experiment in default?" |
| argo_update_rollout (image) | "Deploy hello-world:v2 to the hello-world rollout in default." (promote candidate) |
| argo_manage_rollout_lifecycle (abort) | "Abort the hello-world rollout in default." (keep baseline) |
Troubleshooting
| Issue | Check |
|---|---|
| Experiment stuck in Pending | Verify rollout has canary in progress. Check pod status: kubectl get pods -n default -l app=hello-world |
| AnalysisTemplate not found | Ensure AnalysisTemplate exists and templateName in analyses matches. See Onboarding. |
| specRef "stable"/"canary" invalid | Pass rollout_name when using specRef (required). Ensure the Rollout has a canary deployment in progress — trigger argo_update_rollout(update_type='image') first. |
| Traefik traffic routing | Traefik does not support experiment traffic routing. Use experiments for metrics comparison only; configure ingress separately for user-facing A/B tests. |
| Experiment Failed | Check AnalysisRun status: kubectl get analysisruns -n default. Verify Prometheus queries return data. |
Next Steps
- Onboarding — Set up canary rollout with AnalysisTemplate
- Deployment Strategies — Promote winner to production
- Emergency Abort — Rollback when baseline wins
- Tools — argo_create_experiment, argo_delete_experiment
- Examples — Quick reference and prompts