A/B Testing with Experiments

Run two versions (e.g., v1 and v2) side by side and compare their metrics before promoting one as the winner. Argo Rollouts Experiments create a temporary ReplicaSet for each variant. If you have not yet set up a canary rollout with an AnalysisTemplate, see Onboarding first.


Prerequisites

| Component | Status |
| --- | --- |
| Kubernetes cluster | Accessible via kubectl |
| Argo Rollouts | Installed (kubectl get rollouts -A) |
| Argo Rollout MCP Server | Running (Docker recommended) |
| Onboarded Rollout | Canary strategy, onboarded in target namespace |
| AnalysisTemplate | Exists for the rollout (see Onboarding for setup) |
| Prometheus | Required for analysis-backed experiments |
Note: Experiments create separate ReplicaSets for metric comparison. Whether experiment pods can also receive weighted traffic depends on your ingress (Traefik, Istio, SMI, ALB). Traefik does not support experiment traffic routing, so with Traefik use experiments for metrics comparison only (the experiment-as-analysis pattern).


Environment Setup

Verify Rollout and AnalysisTemplate

kubectl get rollout hello-world -n default
kubectl get analysistemplate -n default

Ensure Canary Deployment in Progress

For experiments that use specRef: "stable" and specRef: "canary", the rollout must have a canary deployment in progress. Trigger a new version first:

"Deploy hello-world:v2 to the hello-world rollout in default."

Then create the experiment while the canary is running.

Tip (specRef resolution): When templates use specRef, the MCP server resolves stable/canary (or active/preview for blue-green) from the Rollout's ReplicaSets. You must pass rollout_name so the server can fetch the correct pod templates.
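The rule in the tip can be checked before a request is sent. The sketch below is a hypothetical client-side pre-flight check, not part of the MCP server: the function name and the strategy keys (`canary`, `blueGreen`, borrowed from the Rollout spec's strategy field names) are assumptions made for illustration.

```python
# Hypothetical pre-flight check; this is NOT an API of the MCP server.
# specRef values allowed per Rollout strategy, as described in the tip above.
ALLOWED_SPEC_REFS = {
    "canary": {"stable", "canary"},
    "blueGreen": {"active", "preview"},
}


def check_spec_refs(strategy, templates, rollout_name=None):
    """Return a list of problems; an empty list means the request looks valid."""
    problems = []
    refs = [t["specRef"] for t in templates if "specRef" in t]
    if refs and not rollout_name:
        problems.append("rollout_name is required when templates use specRef")
    allowed = ALLOWED_SPEC_REFS.get(strategy)
    if allowed is None:
        problems.append(f"unknown strategy: {strategy}")
        return problems
    for ref in refs:
        if ref not in allowed:
            problems.append(f"specRef '{ref}' is not valid for a {strategy} rollout")
    return problems


templates = [
    {"name": "baseline", "specRef": "stable"},
    {"name": "candidate", "specRef": "canary"},
]
print(check_spec_refs("canary", templates, rollout_name="hello-world"))
```

An empty list means the templates and rollout_name are consistent with the rule; the check does not verify that a canary is actually in progress on the cluster.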

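If the Argo Rollout MCP Server is not already running, start it with Docker. The command mounts your kubeconfig read-only and publishes port 8768:

docker run --rm -it \
  -p 8768:8768 \
  -v ~/.kube:/app/.kube:ro \
  -e K8S_KUBECONFIG=/app/.kube/config \
  talkopsai/argo-rollout-mcp-server:latest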

Resource for Monitoring

| Resource | Purpose |
| --- | --- |
| argorollout://experiments/default/hello-world-ab-test/status | Experiment phase, template statuses, analysis results |
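The resource URI above appears to follow the pattern argorollout://experiments/{namespace}/{name}/status. Assuming that pattern holds for other namespaces and experiment names, a minimal sketch for building and parsing it might look like this (both helper names are hypothetical):

```python
# Assumption: the URI layout is argorollout://experiments/<namespace>/<name>/status,
# inferred from the single example in this guide.
PREFIX = "argorollout://experiments/"
SUFFIX = "/status"


def experiment_status_uri(namespace, name):
    """Build the status resource URI for an experiment."""
    return f"{PREFIX}{namespace}/{name}{SUFFIX}"


def parse_experiment_status_uri(uri):
    """Return (namespace, name), or raise ValueError if the URI does not match."""
    if not (uri.startswith(PREFIX) and uri.endswith(SUFFIX)):
        raise ValueError(f"not an experiment status URI: {uri}")
    middle = uri[len(PREFIX):-len(SUFFIX)]
    namespace, sep, name = middle.partition("/")
    if not sep or not namespace or not name or "/" in name:
        raise ValueError(f"expected <namespace>/<name> in: {uri}")
    return namespace, name


uri = experiment_status_uri("default", "hello-world-ab-test")
print(uri)
print(parse_experiment_status_uri(uri))
```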

Experiment Phases

| Phase | Meaning |
| --- | --- |
| Pending | Experiment created, pods not yet ready |
| Running | Both templates running, analysis in progress |
| Successful | All analysis passed, experiment complete |
| Failed | Analysis failed or pods unhealthy |
| Error | Configuration issue |
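When scripting against the status resource, it can help to encode the phases once. The sketch below mirrors the table; treating Successful, Failed, and Error as final is an assumption drawn from their descriptions, and the names are illustrative.

```python
from enum import Enum


class ExperimentPhase(str, Enum):
    PENDING = "Pending"
    RUNNING = "Running"
    SUCCESSFUL = "Successful"
    FAILED = "Failed"
    ERROR = "Error"


# Assumption: these three phases are final and will not change on later polls.
TERMINAL_PHASES = {
    ExperimentPhase.SUCCESSFUL,
    ExperimentPhase.FAILED,
    ExperimentPhase.ERROR,
}

MEANINGS = {
    ExperimentPhase.PENDING: "Experiment created, pods not yet ready",
    ExperimentPhase.RUNNING: "Both templates running, analysis in progress",
    ExperimentPhase.SUCCESSFUL: "All analysis passed, experiment complete",
    ExperimentPhase.FAILED: "Analysis failed or pods unhealthy",
    ExperimentPhase.ERROR: "Configuration issue",
}


def is_terminal(phase):
    """True if the phase string is final; raises ValueError for unknown phases."""
    return ExperimentPhase(phase) in TERMINAL_PHASES


for phase in ExperimentPhase:
    print(f"{phase.value:<10} terminal={is_terminal(phase.value)}  {MEANINGS[phase]}")
```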

Create Experiment Workflow

Run two versions side by side for metric comparison. Templates reference the rollout's stable and canary ReplicaSets.

| Step | Action | Prompt / Tool |
| --- | --- | --- |
| 1 | Trigger canary (if not already) | "Deploy hello-world:v2 to the hello-world rollout in default." |
| 2 | Create experiment | "Create an A/B test experiment called hello-world-ab-test in default: run baseline (stable) and candidate (canary) side by side for 30 minutes." |
| 3 | Or use tool | argo_create_experiment(name="hello-world-ab-test", namespace="default", templates=[{"name": "baseline", "specRef": "stable"}, {"name": "candidate", "specRef": "canary"}], duration="30m", rollout_name="hello-world", analyses=[{"name": "success-rate", "templateName": "hello-world-analysis"}]) |

Notes:

  • rollout_name is required when templates use specRef — the server resolves stable/canary from the Rollout's ReplicaSets.
  • Replace hello-world-analysis with your actual AnalysisTemplate name from onboarding.
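The tool call in step 3 takes a fairly long argument list. As a sketch, the arguments can be assembled and inspected as plain data before handing them to an MCP client. The helper name below is hypothetical; the keys and values copy the call shown above, and how the dictionary is actually submitted depends on your client.

```python
import json


def build_experiment_args(name, namespace, rollout_name, analysis_template, duration="30m"):
    """Assemble keyword arguments mirroring the argo_create_experiment call above."""
    if not rollout_name:
        # Both templates below use specRef, so the server needs the Rollout name.
        raise ValueError("rollout_name is required when templates use specRef")
    return {
        "name": name,
        "namespace": namespace,
        "templates": [
            {"name": "baseline", "specRef": "stable"},
            {"name": "candidate", "specRef": "canary"},
        ],
        "duration": duration,
        "rollout_name": rollout_name,
        "analyses": [{"name": "success-rate", "templateName": analysis_template}],
    }


args = build_experiment_args(
    name="hello-world-ab-test",
    namespace="default",
    rollout_name="hello-world",
    analysis_template="hello-world-analysis",  # replace with your AnalysisTemplate name
)
print(json.dumps(args, indent=2))
```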

Monitor Experiment Workflow

| Step | Action | Prompt / Tool |
| --- | --- | --- |
| 1 | Poll status | "What is the status of the hello-world-ab-test experiment in default?" |
| 2 | Or fetch resource | argorollout://experiments/default/hello-world-ab-test/status |
| 3 | Review | Check template_statuses and analysis_runs in the response |
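The polling step can be sketched as a loop that stops once the experiment reaches a final phase. Everything here is an assumption for illustration: fetch_status stands in for whatever your MCP client uses to read the status resource, and the response is assumed to be a dictionary with a "phase" key alongside the template_statuses and analysis_runs fields mentioned above.

```python
import time

# Assumption: these phases are final (see the Experiment Phases table).
TERMINAL_PHASES = {"Successful", "Failed", "Error"}


def wait_for_experiment(fetch_status, interval_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Call fetch_status() until the phase is final; raise TimeoutError otherwise."""
    for attempt in range(max_polls):
        status = fetch_status()
        if status.get("phase") in TERMINAL_PHASES:
            return status
        if attempt < max_polls - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"experiment not finished after {max_polls} polls")


# Usage with a fake fetcher so the sketch runs without a cluster.
fake_responses = iter([
    {"phase": "Pending"},
    {"phase": "Running"},
    {"phase": "Successful", "template_statuses": [], "analysis_runs": []},
])
result = wait_for_experiment(lambda: next(fake_responses), sleep=lambda seconds: None)
print(result["phase"])
```

Passing sleep as a parameter keeps the loop easy to test; in real use the default time.sleep applies and interval_seconds should be tuned to the experiment's duration.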

Promote Candidate Workflow (If Candidate Wins)

| Step | Action | Prompt / Tool |
| --- | --- | --- |
| 1 | Promote candidate | "Deploy hello-world:v2 to the hello-world rollout in default." (if not already) + "Fully promote the hello-world rollout in default." |
| 2 | Or use tool | argo_update_rollout(update_type='image', name="hello-world", new_image="hello-world:v2", namespace="default") then argo_manage_rollout_lifecycle(action='promote_full', ...) |
| 3 | Delete experiment | "Delete the hello-world-ab-test experiment in default." |

Clean Up Workflow (If Baseline Wins)

| Step | Action | Prompt / Tool |
| --- | --- | --- |
| 1 | Abort rollout | "Abort the hello-world rollout in default." |
| 2 | Delete experiment | "Delete the hello-world-ab-test experiment in default." |
| 3 | Or use tool | argo_delete_experiment(name="hello-world-ab-test", namespace="default") |

Inconclusive: Extend Duration or Manual Review

| Step | Action | Prompt / Tool |
| --- | --- | --- |
| 1 | Review analysis | Fetch argorollout://experiments/default/hello-world-ab-test/status |
| 2 | Manual decision | Extend the experiment (re-create it with a longer duration), or manually promote or abort based on the metrics |
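The three outcomes (candidate wins, baseline wins, inconclusive) each map to a short sequence of prompts from the workflows above. A minimal sketch of that mapping follows; the function name and outcome labels are hypothetical, and the final inconclusive entry is a reminder for the operator rather than a prompt taken from this guide.

```python
def next_steps(outcome, rollout="hello-world", experiment="hello-world-ab-test",
               namespace="default", candidate_image="hello-world:v2"):
    """Return the ordered prompts to issue for a given experiment outcome."""
    delete = f"Delete the {experiment} experiment in {namespace}."
    plans = {
        "candidate_wins": [
            f"Deploy {candidate_image} to the {rollout} rollout in {namespace}.",
            f"Fully promote the {rollout} rollout in {namespace}.",
            delete,
        ],
        "baseline_wins": [
            f"Abort the {rollout} rollout in {namespace}.",
            delete,
        ],
        "inconclusive": [
            f"What is the status of the {experiment} experiment in {namespace}?",
            # Operator note, not a prompt from the guide:
            "Re-create the experiment with a longer duration, or promote/abort manually.",
        ],
    }
    if outcome not in plans:
        raise ValueError(f"unknown outcome: {outcome}")
    return plans[outcome]


for step in next_steps("baseline_wins"):
    print(step)
```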

Natural Language Prompts

Copy and paste these prompts into your MCP client (Cursor, Claude, etc.).

Create Experiment

"Create an A/B test experiment called hello-world-ab-test in default — run baseline (stable) and candidate (canary) side by side for 30 minutes. Use rollout hello-world for specRef resolution."
"Start an Argo Experiment named hello-world-ab-test in default with two templates: baseline (stable spec) and candidate (canary spec), running for 1 hour. Resolve specRef from rollout hello-world."

When using specRef, the MCP client must pass rollout_name="hello-world" to argo_create_experiment so the server can resolve stable/canary from the Rollout's ReplicaSets.

Monitor

"What is the status of the hello-world-ab-test experiment in default?"
"Show me the analysis results for the hello-world-ab-test experiment in default."

Clean Up

"Delete the hello-world-ab-test experiment in default — the baseline won, keeping stable."
"Clean up the hello-world-ab-test experiment in default."

Quick Reference: Tool → Prompt Mapping

| Tool or Resource | Example Prompt |
| --- | --- |
| argo_create_experiment | "Create an A/B test experiment called hello-world-ab-test in default: run baseline and candidate side by side for 30 minutes." |
| argo_delete_experiment | "Delete the hello-world-ab-test experiment in default." |
| argorollout://experiments/default/hello-world-ab-test/status | "What is the status of the hello-world-ab-test experiment in default?" |
| argo_update_rollout (image) | "Deploy hello-world:v2 to the hello-world rollout in default." (promote candidate) |
| argo_manage_rollout_lifecycle (abort) | "Abort the hello-world rollout in default." (keep baseline) |

Troubleshooting

| Issue | Check |
| --- | --- |
| Experiment stuck in Pending | Verify the rollout has a canary in progress. Check pod status: kubectl get pods -n default -l app=hello-world |
| AnalysisTemplate not found | Ensure the AnalysisTemplate exists and that templateName in analyses matches its name. See Onboarding. |
| specRef "stable"/"canary" invalid | Pass rollout_name when using specRef (required). Ensure the Rollout has a canary deployment in progress; if it does not, trigger argo_update_rollout(update_type='image') first. |
| Traefik traffic routing | Traefik does not support experiment traffic routing. Use experiments for metrics comparison only; configure ingress separately for user-facing A/B tests. |
| Experiment Failed | Check AnalysisRun status: kubectl get analysisruns -n default. Verify that the Prometheus queries return data. |

Next Steps