Workflow: PromQL Querying

Step-by-step guide for safely querying Prometheus metrics, with counter enforcement, auto-downsampling, and query validation to prevent unsafe or unbounded queries.

When to Use

Use this workflow when:

Querying Prometheus metrics from an AI assistant
Exploring available metrics and their labels
Running both instant and range queries
Calculating latency from histogram metrics

Step-by-Step

Step	Action	Tool / Resource	Result
1	Discover service metrics	Resource: `prom://topology/services/{job}/metrics`	Returns all metrics emitted by a service
2	Explore metric labels	`prom_explore_labels(metric_name="<metric>")`	Returns label keys and their top values
3	Validate query syntax	`prom_validate_promql(query="rate(...)")`	Returns `{valid: true/false, error: ...}`
4	Run instant query	`prom_query_instant(query="rate(...)")`	Point-in-time vector result
5	Run range query	`prom_query_range(query="rate(...)", start=<unix>, end=<unix>)`	Auto-computes step, downsamples to ~200 points per series

Guided Prompt: Use `prom-query-guided` for the full step-by-step flow.
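Step 5 takes `start` and `end` as Unix timestamps. The sketch below shows one way to build those values for a "last hour" window. The `last_window` helper and the `args` dictionary are hypothetical names used only for illustration; how the arguments are actually passed to `prom_query_range` depends on the client in use.

```python
import time


def last_window(seconds, now=None):
    """Return (start, end) Unix timestamps covering the last `seconds` seconds.

    `now` can be pinned to a fixed timestamp for reproducible examples;
    by default the current time is used.
    """
    end = int(time.time()) if now is None else int(now)
    return end - seconds, end


# Pin "now" so the example is deterministic.
start, end = last_window(3600, now=1_700_000_000)

# Arguments shaped like the prom_query_range call in step 5 (illustrative only).
args = {"query": "rate(http_requests_total[5m])", "start": start, "end": end}
print(args)
```

Leaving `step` out of the arguments lets the server choose it, as described under Auto-Step Computation.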

Safety Guardrails

Guardrail	Description	Override
Counter Enforcement	Counters must use `rate()` or `increase()`	`allow_raw_counters=true`
Auto-Downsampling	Range queries capped at ~200 points/series	`max_points_per_series` param
Query Validation	Syntax checked before execution	Use `action=validate` first
Timeout	Default 30s query timeout	`timeout` param
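To make the counter enforcement rule concrete, here is a minimal sketch of how such a check could work. It is not the server's implementation: it assumes counters can be recognized by the `_total` naming convention (a real check would more likely consult metric metadata), and the `raw_counters` function is a hypothetical name.

```python
import re

WRAPPERS = ("rate", "increase")
# Identifiers and parentheses are the only tokens this heuristic needs.
TOKEN = re.compile(r"[A-Za-z_:][A-Za-z0-9_:]*|[()]")


def raw_counters(query):
    """Return counter-like metric names not enclosed in rate() or increase().

    Assumption: a name ending in `_total` is a counter.
    """
    # Blank out quoted label values so their contents are not tokenized.
    query = re.sub(r'"[^"]*"', '""', query)
    stack = []      # name preceding each currently open parenthesis
    prev = ""
    offenders = []
    for tok in TOKEN.findall(query):
        if tok == "(":
            stack.append(prev)
        elif tok == ")":
            if stack:
                stack.pop()
        elif tok.endswith("_total") and not any(f in WRAPPERS for f in stack):
            offenders.append(tok)
        prev = tok
    return offenders


print(raw_counters("rate(http_requests_total[5m])"))  # wrapped counter
print(raw_counters("http_requests_total"))            # raw counter
```

A query with an empty result would pass the check. A non-empty result corresponds to the case where the guardrail rejects the query unless `allow_raw_counters=true` is set.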

Auto-Step Computation

When `step` is omitted from a range query, the server computes it automatically:

step = (end - start) / max_points_per_series

Capping the number of points per series this way protects LLM context windows from massive data payloads.
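The formula can be worked through with a short sketch. The default of 200 comes from the "~200 points" figure in the guardrails table. Rounding up to whole seconds and enforcing a 1-second minimum are assumptions made here for illustration; the server's exact rounding is not documented in this guide.

```python
import math


def auto_step(start, end, max_points_per_series=200):
    """Compute step = (end - start) / max_points_per_series.

    Assumptions: the result is rounded up to whole seconds and is never
    below 1 second.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    return max(1, math.ceil((end - start) / max_points_per_series))


print(auto_step(0, 3600))    # 1-hour window
print(auto_step(0, 86400))   # 24-hour window
```

Under these assumptions a 1-hour window gives an 18-second step and a 24-hour window gives a 432-second step, so both return about 200 points per series.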

Natural Language Prompts

"What is the current request rate for http_requests_total?"
"Show me CPU usage across all pods in production over the last hour."
"Calculate p99 latency from the http_request_duration_seconds histogram."
"Show me the error rate trend for the checkout service — last 24 hours."
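As an illustration of what the p99 prompt above might translate to, the sketch below builds a PromQL expression using the standard `histogram_quantile` function over the histogram's `_bucket` series. The `histogram_quantile_query` helper is a hypothetical name, and the 5-minute rate window is an assumed default, not something this workflow prescribes.

```python
def histogram_quantile_query(quantile, metric, window="5m"):
    """Build a PromQL quantile query for a classic Prometheus histogram.

    `metric` is the base histogram name without the `_bucket` suffix.
    """
    if not 0 <= quantile <= 1:
        raise ValueError("quantile must be between 0 and 1")
    return (
        f"histogram_quantile({quantile}, "
        f"sum by (le) (rate({metric}_bucket[{window}])))"
    )


query = histogram_quantile_query(0.99, "http_request_duration_seconds")
print(query)
```

Because the bucket counters are wrapped in `rate()`, a query of this shape also satisfies the counter enforcement guardrail.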

Next Steps

TSDB FinOps — Cardinality analysis and optimization
Rule Management — Rule authoring, testing, and simulation
Troubleshooting — Diagnosing failed scrape targets
