Last updated: June 8, 2026
About Alerting
The Failed Checks View shows you which check rules are currently failing, helping you identify and respond to performance issues and outages affecting your infrastructure and services.
All alerting is based on counting or measuring values and comparing them against defined numerical thresholds. A single check rule can produce multiple failed checks simultaneously — one for each distinct result returned by the query.
Checks can be in one of three states:
- RESOLVED (grey) — previously failing but now recovered
- DEGRADED (yellow) — threshold exceeded but not yet critical
- CRITICAL (red) — critical threshold exceeded
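The threshold comparison and the three states can be sketched in a few lines of Python. This is only an illustration, not Dash0's implementation: the function name `next_status`, its parameters, and the assumption that higher values are worse are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Status(Enum):
    # Colors as listed in the state descriptions above.
    RESOLVED = "grey"
    DEGRADED = "yellow"
    CRITICAL = "red"


def next_status(
    previous: Optional[Status],
    value: float,
    degraded_above: float,
    critical_above: float,
) -> Optional[Status]:
    """Compare a measured value against two numerical thresholds.

    Assumption: higher values are worse and critical_above >= degraded_above.
    Returns None when the check is not failing and never was.
    """
    if value > critical_above:
        return Status.CRITICAL
    if value > degraded_above:
        return Status.DEGRADED
    if previous is None:
        # Never failed, so there is no failed check to report.
        return None
    # Previously failing (or already resolved) and now below both thresholds.
    return Status.RESOLVED
```

In this sketch, a check that drops below both thresholds after failing moves to RESOLVED, while a check that never crossed a threshold produces no failed check at all.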
Failed Checks and Service Health
Failed checks directly impact service health visualizations throughout Dash0. In the service map and service health views, services are colored based on the failed checks related to them.
The most severe failed-check status determines the service color. If any failed check has status CRITICAL, the service displays as red even if other failed checks are degraded or resolved.
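The "most severe status wins" rule amounts to taking a maximum over a severity ordering. The following is a minimal sketch under that reading; the `SEVERITY` ranking and the `worst_status` helper are hypothetical names chosen for illustration.

```python
from typing import Iterable, Optional

# Assumed ordering: RESOLVED is least severe, CRITICAL is most severe.
SEVERITY = {"RESOLVED": 0, "DEGRADED": 1, "CRITICAL": 2}


def worst_status(statuses: Iterable[str]) -> Optional[str]:
    """Return the most severe status among a service's failed checks.

    Returns None when the service has no failed checks.
    """
    statuses = list(statuses)
    if not statuses:
        return None
    return max(statuses, key=lambda status: SEVERITY[status])
```

A service with failed checks in states RESOLVED, DEGRADED, and CRITICAL would be reported as CRITICAL, which matches the red coloring described above.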
A check rule produces multiple failed checks when its PromQL query returns multiple time series with different label combinations, because each series is evaluated against the thresholds separately. For example, one query can result in separate failed checks for multiple services, with each unique service.name getting its own failed check.
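To make the per-series behavior concrete, here is a small Python sketch that treats a query result as a list of labeled series and evaluates each one on its own. The `Series` type, the `failed_checks` function, and the sample values are hypothetical; a real query result would come from the PromQL engine.

```python
from typing import Dict, List, NamedTuple


class Series(NamedTuple):
    labels: Dict[str, str]  # label set identifying the time series
    value: float            # latest value returned by the query


def failed_checks(series: List[Series], threshold: float) -> Dict[str, float]:
    """Return one entry per service whose series exceeds the threshold."""
    return {
        s.labels["service.name"]: s.value
        for s in series
        if s.value > threshold
    }


# Hypothetical result of one query: error ratio per service.
result = [
    Series({"service.name": "frontend"}, 0.02),
    Series({"service.name": "checkout"}, 0.12),
    Series({"service.name": "payments"}, 0.30),
]
failing = failed_checks(result, threshold=0.05)
```

Here a single rule with a threshold of 0.05 yields two failed checks, one for `checkout` and one for `payments`, while `frontend` stays healthy.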
Admin Access Needed
Creating check rules requires elevated access: you need Admin privileges or a Maintainer role within a dataset.
This restriction is in place because a check rule directly affects the health status of a service and can send notifications to your team's phones and on-call channels. If you don't have the required access, you will not see the alert creation options.
Dashboards, by contrast, can be created by any user since they are purely visual and do not affect service health.
AI-Assisted Alert Creation
Agent0 can generate check rules from natural language descriptions, automatically creating PromQL queries, setting thresholds, and configuring notification channels.
Describe what you want to monitor ("alert when frontend latency exceeds 500ms"), and Agent0 generates the complete check rule configuration. See Generate Check Rules with Agent0 for details.
Similar But Different: Check Rules and Synthetic Checks
Check rules and synthetic checks serve different purposes:
- Check rules monitor metrics, logs, spans, and web events using PromQL queries. They evaluate aggregated telemetry data against thresholds and fire when conditions are met. Use them to monitor backend performance, error rates, resource usage, and business metrics across your entire system.
- Synthetic checks actively test your HTTP endpoints from multiple global locations by making real requests at regular intervals. Each synthetic check validates a specific endpoint's availability and response characteristics. Use them to monitor uptime, API availability, and external-facing services from a user's perspective.
Both check rules and synthetic checks use notification channels to send notifications when checks fail, keeping your team informed via Slack, PagerDuty, email, and other integrations.
You can combine both: synthetic checks generate metrics that check rules can query.
For example, create a synthetic check to test your API endpoint, then create a check rule to notify you if the synthetic check failure rate exceeds 10% across all locations.
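The arithmetic behind such a rule is a failure rate aggregated over all locations and compared against 10%. The sketch below shows that calculation in Python with made-up data; the function names and location keys are hypothetical, and in practice the check rule would express this as a PromQL query over the metrics the synthetic check generates.

```python
from typing import Dict, List


def failure_rate(results_by_location: Dict[str, List[bool]]) -> float:
    """Fraction of failed runs across all locations (True means success)."""
    total = sum(len(runs) for runs in results_by_location.values())
    if total == 0:
        return 0.0
    failures = sum(runs.count(False) for runs in results_by_location.values())
    return failures / total


def should_notify(results_by_location: Dict[str, List[bool]],
                  threshold: float = 0.10) -> bool:
    """Fire only when the combined failure rate exceeds the threshold."""
    return failure_rate(results_by_location) > threshold


# Hypothetical run results from two locations: 2 failures out of 10 runs.
runs = {
    "location-a": [True, True, False, True, True],
    "location-b": [True, False, True, True, True],
}
```

With the sample data, the combined failure rate is 2 out of 10 runs, so `should_notify(runs)` returns `True` for a 10% threshold.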
Further Reading
- Create Check Rules — Set up check rules to monitor metrics, logs, spans, and web events using PromQL expressions and threshold values.
- Investigate Failed Checks — Troubleshoot failed checks by exploring the underlying telemetry and identifying root causes.
- Send Check Rule Notifications — Configure notification channels to keep your team informed when checks fail.
- Route Check Rule Notifications — Use label-based routing to direct alerts to the right teams automatically.


