Range misaligned to step sizes

One of the most subtle and common mistakes in PromQL is using a range selector that doesn't match your visualization or evaluation interval. This creates a dangerous situation where your charts and alerts don't reflect the actual behavior you're trying to monitor.

The Problem

When you query with a range selector like sum(increase({otel_metric_name="dash0.spans"}[1m])), you're calculating the increase over a 1-minute window. However, if your chart's step size (the interval between data points) is set to 2 minutes, the query is evaluated every 2 minutes, and each evaluation only looks at the most recent 1 minute of data.

This means you're missing half of the data. If you have a traffic spike that lasts 1 minute but occurs in the middle of a 2-minute evaluation window, you might only capture a fraction of it or miss it entirely.
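The arithmetic behind this can be sketched in a few lines of Python. This is a simplified model, not how Prometheus or Dash0 is implemented: it assumes evaluations happen at start, start + step, and so on, and that each evaluation reads exactly the last range seconds before its timestamp. The function names are made up for illustration.

```python
def covered_fraction(range_s: float, step_s: float) -> float:
    """Fraction of each step that a range selector of range_s seconds reads."""
    if range_s <= 0 or step_s <= 0:
        raise ValueError("range and step must be positive")
    return min(range_s / step_s, 1.0)


def uncovered_windows(range_s, step_s, start_s, end_s):
    """List the (from, to) gaps, in seconds, that no evaluation reads.

    Assumption: evaluations run at start_s, start_s + step_s, ... up to end_s,
    and each one looks back exactly range_s seconds.
    """
    gaps = []
    t = start_s + step_s
    while t <= end_s:
        window_start = t - range_s   # oldest point this evaluation can see
        previous_eval = t - step_s   # newest point the previous evaluation saw
        if window_start > previous_eval:
            gaps.append((previous_eval, window_start))
        t += step_s
    return gaps


# 1m range, 2m step, over 8 minutes
print(covered_fraction(60, 120))           # 0.5
print(uncovered_windows(60, 120, 0, 480))  # [(0, 60), (120, 180), (240, 300), (360, 420)]
```

With a 2-minute range and the same 2-minute step, `uncovered_windows(120, 120, 0, 480)` returns an empty list: every second of the time range is read by some evaluation.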

Concrete Example

Consider these underlying data points powered by Dash0’s synthetic dash0.spans metric. In this example we assume a step size of 2 minutes.

A time series chart visualizing a two-minute step size and raw metric data

The following example illustrates a metric query that can yield surprising and incorrect results. The query covers only half of each step, so part of the data between minutes two and four is never read. Spikes that occur in the uncovered part of a step don't appear in any data point. Notice how the range selector is explicitly set to 1m.

A one minute range visualized as not covering the two minute step size

Let’s look at a query where the step size and range are aligned. This query explicitly sets the range to 2m, and hence, all the underlying data is covered by the query evaluation.

A two minute range covering the whole two minute step size

However, hard-coding the range has a downside: the step size is dynamic. The Dash0 UI chooses it at render time based on the selected time range and the space available for the chart. For example, it doesn't make sense to query with a step size that results in 2000 data points when only 500 horizontal pixels are available; that would leave just 0.25 pixels per data point. To solve this, use the $__interval or $__rate_interval variable instead of the explicit 2m in the range selector. Both variables stay aligned as closely as possible to the step size.
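To make the relationship between time range, chart width, and step size more tangible, here is a rough Python sketch. It is not how Dash0 computes the step: `choose_step_seconds` is a hypothetical heuristic (at most one data point per horizontal pixel), and `render_query` simplifies the variable by substituting exactly the step size for $__interval.

```python
import math

# Query from this article, with the variable in place of a hard-coded range.
QUERY_TEMPLATE = 'sum(increase({otel_metric_name="dash0.spans"}[$__interval]))'


def choose_step_seconds(time_range_s: int, width_px: int, min_step_s: int = 1) -> int:
    """Hypothetical heuristic: never request more than one point per pixel."""
    if time_range_s <= 0 or width_px <= 0:
        raise ValueError("time range and width must be positive")
    return max(min_step_s, math.ceil(time_range_s / width_px))


def render_query(template: str, step_s: int) -> str:
    """Simplification: replace $__interval with exactly the step size."""
    return template.replace("$__interval", f"{step_s}s")


# A 6-hour time range on a chart that is 500 pixels wide
step = choose_step_seconds(6 * 60 * 60, 500)
print(step)                                # 44
print(render_query(QUERY_TEMPLATE, step))  # sum(increase({otel_metric_name="dash0.spans"}[44s]))
```

Because the range is derived from the step rather than typed in by hand, resizing the chart or changing the time range cannot reintroduce a gap between evaluations.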

info

Time series chart tooltips communicate the actual step size through the timestamps they show. In dashboards, you can also configure a minimum step size.

Impact on Alerting

This issue is even more critical for alerts. If your alert rule uses a range of 30s and is evaluated every 1m, you are always going to miss 30 seconds of data in your alert evaluation! The solution is simple: make sure the range is equal to or larger than the evaluation frequency. See the following screenshot for a correctly configured example.
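The rule of thumb for alerts can be expressed as a small check. The following Python sketch is illustrative only: the helper names are invented, and the duration parser handles just the simple Prometheus-style forms such as 30s, 1m, or 1h30m.

```python
import re

# Seconds per unit for Prometheus-style durations.
_UNITS = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800, "y": 31536000}


def parse_duration(text: str) -> float:
    """Parse a duration such as "30s", "1m" or "1h30m" into seconds."""
    parts = re.findall(r"(\d+)(ms|s|m|h|d|w|y)", text)
    # Reject input that has anything besides number+unit pairs.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"not a valid duration: {text!r}")
    return sum(int(n) * _UNITS[u] for n, u in parts)


def missed_seconds(range_selector: str, evaluation_interval: str) -> float:
    """Seconds of data per evaluation cycle that the range never reads."""
    return max(0, parse_duration(evaluation_interval) - parse_duration(range_selector))


print(missed_seconds("30s", "1m"))  # 30
print(missed_seconds("1m", "1m"))   # 0
```

A result of 0 means the range is at least as long as the evaluation frequency, which is the condition described above.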

A screenshot of the Dash0 check rule editor, where you can configure the range and evaluation frequency.
