Workflow: PromQL Querying
Step-by-step guide for safely querying Prometheus metrics — with counter enforcement, auto-downsampling, and query validation to prevent unsafe or unbounded queries.
When to Use
Use this workflow when:
- Querying Prometheus metrics from an AI assistant
- Exploring available metrics and their labels
- Running both instant and range queries
- Calculating latency from histogram metrics
Step-by-Step
| Step | Action | Tool / Resource | Result / Behavior |
|---|---|---|---|
| 1 | Discover service metrics | Resource: prom://topology/services/{job}/metrics | Returns all metrics emitted by a service |
| 2 | Explore metric labels | prom_explore_labels(metric_name="<metric>") | Returns label keys and their top values |
| 3 | Validate query syntax | prom_validate_promql(query="rate(...)") | Returns {valid: true/false, error: ...} |
| 4 | Run instant query | prom_query_instant(query="rate(...)") | Point-in-time vector result |
| 5 | Run range query | prom_query_range(query="rate(...)", start=<unix>, end=<unix>) | Auto-computes step, downsamples to ~200 pts |
Guided Prompt: Use prom-query-guided for the full step-by-step flow.
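Steps 3 and 5 of the table can be chained so that a query is never executed before it has been validated. The following is a minimal sketch, not the server's own client code: `call_tool` is a hypothetical callable standing in for whatever transport invokes the tools (for example, a thin MCP client wrapper), and the shape of the range-query result is an assumption. Only the tool names, the `query`, `start`, and `end` parameters, and the `valid` / `error` fields come from the table above.

```python
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that invokes a named tool with keyword
# arguments and returns the decoded result as a dict.
ToolCaller = Callable[..., Dict[str, Any]]


def run_range_query(call_tool: ToolCaller, query: str, start: int, end: int) -> Dict[str, Any]:
    """Validate a PromQL query (step 3), then run it as a range query (step 5)."""
    check = call_tool("prom_validate_promql", query=query)
    if not check.get("valid"):
        # Surface the validator's message instead of sending a bad query on.
        raise ValueError(f"invalid PromQL: {check.get('error')}")
    # `step` is omitted on purpose so the server auto-computes it.
    return call_tool("prom_query_range", query=query, start=start, end=end)
```

Passing the transport in as an argument keeps the sketch independent of any particular client library and makes it easy to exercise with a stub.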
Safety Guardrails
| Guardrail | Description | Override |
|---|---|---|
| Counter Enforcement | Counters must use rate() or increase() | allow_raw_counters=true |
| Auto-Downsampling | Range queries capped at ~200 points/series | max_points_per_series param |
| Query Validation | Syntax checked before execution | Use action=validate first |
| Timeout | Default 30s query timeout | timeout param |
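To illustrate the idea behind counter enforcement, the sketch below flags counter-like metrics that are not passed directly to rate() or increase(). It is a rough client-side heuristic, not the server's actual check: it assumes counters can be recognised by the conventional `_total` suffix, and the function name `raw_counters` is hypothetical. A real implementation would more likely use metric metadata and a PromQL parser.

```python
import re
from typing import List

# Assumption: counters follow the Prometheus naming convention of a `_total` suffix.
COUNTER_NAME = re.compile(r"\b[a-zA-Z_:][a-zA-Z0-9_:]*_total\b")
# Matches text that ends with an opening rate( or increase( call.
WRAPPED = re.compile(r"\b(?:rate|increase)\(\s*$")


def raw_counters(query: str) -> List[str]:
    """Return counter-like metric names that are not the direct argument of rate() or increase()."""
    offenders = []
    for match in COUNTER_NAME.finditer(query):
        prefix = query[: match.start()]
        if not WRAPPED.search(prefix):
            offenders.append(match.group())
    return offenders
```

For example, `raw_counters("sum(http_requests_total)")` reports `http_requests_total`, while `raw_counters("sum(rate(http_requests_total[5m]))")` returns an empty list. Because the check is purely textual, a label value that happens to end in `_total` would produce a false positive.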
Auto-Step Computation
When step is omitted from range queries, the server auto-computes:
step = (end - start) / max_points_per_series
This caps the number of points returned per series, so a wide time range cannot flood the LLM context window with data.
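The formula can be worked through with a small function. This is a sketch under stated assumptions: the default of 200 is taken from the "~200 points" figure above, while rounding up to whole seconds and the 1-second floor are illustrative choices that the server may handle differently.

```python
import math


def auto_step(start: int, end: int, max_points_per_series: int = 200) -> int:
    """Step in seconds, following step = (end - start) / max_points_per_series."""
    if end <= start:
        raise ValueError("end must be after start")
    if max_points_per_series <= 0:
        raise ValueError("max_points_per_series must be positive")
    # Assumption: round up to whole seconds and never go below 1 second.
    return max(1, math.ceil((end - start) / max_points_per_series))
```

With these assumptions, a one-hour window (3600 seconds) yields an 18-second step and a 24-hour window (86400 seconds) yields a 432-second step.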
Natural Language Prompts
"What is the current request rate for http_requests_total?"
"Show me CPU usage across all pods in production over the last hour."
"Calculate p99 latency from the http_request_duration_seconds histogram."
"Show me the error rate trend for the checkout service — last 24 hours."
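For orientation, the prompts above might translate into PromQL along the following lines. These are illustrative queries, not what the assistant is guaranteed to produce: the label names (`job`, `status`, `namespace`, `pod`), the 5-minute rate window, and the use of the cAdvisor metric `container_cpu_usage_seconds_total` for CPU usage are all assumptions about the target environment.

```python
# Illustrative prompt-to-PromQL mapping. Label names, the 5m window, and the
# CPU metric are assumptions; adjust them to the metrics your services emit.
EXAMPLE_QUERIES = {
    "current request rate": "sum(rate(http_requests_total[5m]))",
    "cpu usage by pod in production": (
        'sum by (pod) (rate(container_cpu_usage_seconds_total{namespace="production"}[5m]))'
    ),
    # Histogram quantiles are computed from the per-bucket rates, grouped by `le`.
    "p99 latency": (
        "histogram_quantile(0.99, "
        "sum by (le) (rate(http_request_duration_seconds_bucket[5m])))"
    ),
    # Error ratio: 5xx request rate divided by total request rate.
    "checkout error rate": (
        'sum(rate(http_requests_total{job="checkout", status=~"5.."}[5m]))'
        ' / sum(rate(http_requests_total{job="checkout"}[5m]))'
    ),
}
```

Every counter in these queries is wrapped in rate(), so they pass counter enforcement without the allow_raw_counters override. The first three suit an instant query or a one-hour range query; the error rate trend would be run as a range query over the last 24 hours.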
Next Steps
- TSDB FinOps — Cardinality analysis and optimization
- Rule Management — Rule authoring, testing, and simulation
- Troubleshooting — Diagnosing failed scrape targets