Last updated: July 21, 2025
Understanding the Prometheus Metric Types
You’ve heard about Prometheus. You know it’s the king of metrics-based monitoring for cloud-native systems. But talk is cheap. To truly harness its power, you must master its fundamental language: the four core metric types.
This isn’t just another blog post. This is a definitive, no-nonsense guide for engineers. We’ll move beyond the trivial examples and dive deep into what each metric type is, when to use it, and, most importantly, the common pitfalls that turn promising monitoring setups into dumpster fires. Forget the hand-waving; let’s get our hands dirty.
Understanding the Prometheus data model
Before you can use a tool, you must understand its model of the world. In Prometheus, everything is a time series: a stream of timestamped values belonging to the same metric and the same set of labeled dimensions.
Each time series is uniquely identified by two things:
- The metric name: a descriptive, general name like `http_requests_total`.
- Labels: a set of key-value pairs like `{method="POST", handler="/api/users"}` that provide the specific dimensions.

A request to `GET /api/status` and a request to `POST /api/users` are two different time series under the same metric name `http_requests_total`.
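For example, a scrape of that metric might expose two series like these (the sample values are illustrative):

```
http_requests_total{method="GET", handler="/api/status"}  1027
http_requests_total{method="POST", handler="/api/users"}  96
```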
The cardinality beast
This leads us to the single most important concept you must understand: cardinality.
Cardinality is the number of unique time series a metric produces. It’s the product of the number of unique values for each of its labels.
- `http_requests_total{method=["GET", "POST"], status_code=["200", "500"]}` has a cardinality of 2 * 2 = 4. This is great.
- `http_requests_total{user_id=["1", "2", ..., "100000"]}` has a cardinality of 100,000. This is a cardinality explosion.
A cardinality explosion will bring your Prometheus server to its knees. It inflates memory usage, slows down queries, and can cause a complete system meltdown.
Rule of Thumb: Never use labels for values with unbounded or high-cardinality sets. User IDs, request IDs, email addresses, or any unique identifiers are forbidden as label values. Stick to low-cardinality attributes like status codes, HTTP methods, queues, or regions.
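A quick way to keep yourself honest is to count the series a metric currently produces, directly in PromQL:

```
# How many unique time series does this metric have right now?
count(http_requests_total)
```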
The four horsemen of metrics
Prometheus offers four metric types. Understanding which one to use is not optional; it’s the core of effective instrumentation.
1. Counter
Of the four metric types, the Counter is the most straightforward. It represents a single, cumulative value that only ever increases, much like a car’s odometer which can’t be wound back.
This monotonic behavior makes it the perfect tool for observing the occurrences of an event. You would use a Counter to track the total number of HTTP requests your service has handled, the number of jobs processed by a worker, or the total exceptions logged since the application started.
Now, if a service restarts, this cumulative value resets to zero. However, Prometheus is smart enough to handle these resets in its calculations through its `rate()` and `increase()` functions.
In practice, you’ll instrument a counter in your code by defining it with a name, some help text, and a set of labels:
```js
import promClient from "prom-client";

const httpRequestsTotal = new promClient.Counter({
  name: "http_requests_total",
  help: "Total number of HTTP requests made",
  labelNames: ["method", "path", "status_code"],
});
```
Then, within your application logic, you’ll increment this counter each time the event occurs:
```js
// In your middleware, after a response is sent.
// Use the parameterized route template from the router (e.g. "/api/users/:id"),
// never the raw URL, to keep label cardinality low.
httpRequestsTotal.labels(req.method, req.route.path, res.statusCode).inc();
```
When it comes to querying, the absolute value of a counter is rarely interesting on its own. Instead, you’ll almost always care about its rate of change which allows you to derive meaningful performance indicators like requests per second or errors per minute.
PromQL provides several key functions for this. The workhorse is `rate()`, which calculates the per-second average rate of increase over a specified time window, making it ideal for graphing trends and for alerting.
For graphing volatile, fast-moving counters, `irate()` calculates an instantaneous rate based on the last two data points, making it far more responsive. Note that the Prometheus documentation recommends sticking with `rate()` for alerting, since `irate()`'s responsiveness makes alerts prone to flapping.
Finally, if you need to know the total number of events over a period, `increase()` will tell you exactly how much the counter went up in that time, which is useful for answering questions like “How many errors did we have in the last hour?”.
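Putting these to work against the counter defined above (the label values here are illustrative):

```
# Requests per second to /api/users, averaged over the last 5 minutes
rate(http_requests_total{path="/api/users"}[5m])

# Instantaneous per-second rate from the last two samples (volatile graphs)
irate(http_requests_total{path="/api/users"}[5m])

# Total number of 500 responses in the last hour
increase(http_requests_total{status_code="500"}[1h])
```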
2. Gauge
A Gauge represents a single numerical value that can arbitrarily go up and down. It gives you a snapshot of a value right now, making it ideal for measuring things that are, not counting things that have happened.
A common use of gauges is for monitoring the number of jobs in a background processing queue, such as one used for sending welcome emails or processing uploads, to understand if your workers are keeping up with demand.
The most robust way to instrument this is to periodically check the queue’s true size from the source of truth and update the gauge using the `set()` method:

```js
const backgroundJobsGauge = new promClient.Gauge({
  name: "app_background_jobs_pending",
  help: "Number of jobs currently in a background processing queue",
  labelNames: ["name"],
});

// Periodically update the gauge by checking the actual queue state
setInterval(() => {
  const counts = jobs.getCounts();
  for (const queue in counts) {
    backgroundJobsGauge.set({ name: queue }, counts[queue]);
  }
}, 10000);
```
This set() approach should be preferred for measuring stateful values like the number of items in a queue or current memory usage. By periodically checking the actual source of truth, your gauge remains stateless and self-correcting.
Most Prometheus clients also allow you to use `inc()` and `dec()` for incrementing and decrementing a gauge, respectively. This is a useful pattern when the metric’s value cannot be computed directly, but must instead be tracked by observing individual events as they happen.

A common use case is tracking the number of concurrent, in-flight HTTP requests, where you’d call `inc()` when a request begins and `dec()` when it finishes:
```js
activeRequestsGauge.inc(); // when a new request is received
activeRequestsGauge.dec(); // when a request completes
```
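In an Express-style app, that pairing typically lives in a middleware. A minimal sketch, assuming `activeRequestsGauge` is defined with the same `Gauge` constructor shown above:

```js
app.use((req, res, next) => {
  activeRequestsGauge.inc(); // a new request is in flight
  // "close" fires whether the response finished or the client disconnected,
  // so the gauge can't leak increments on aborted requests
  res.on("close", () => activeRequestsGauge.dec());
  next();
});
```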
The rule of thumb is to always use `set()` if you can query the current value from a source of truth (like a database or a system file), and to reach for `inc()` and `dec()` only when no such source exists and you can only observe events as they happen.
When plotting a gauge, querying its raw value is the most direct way to visualize its behavior over time:

```
app_background_jobs_pending{name="email"}
```
To move from visualization to automated analysis, PromQL provides powerful functions. For instance, `delta(<metric>[10m])` calculates the net increase or decrease over a time range, helping you answer “how much did the queue grow or shrink in the last 10 minutes?”.
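For example, against the queue gauge defined earlier:

```
# Net change in the email queue over the last 10 minutes
delta(app_background_jobs_pending{name="email"}[10m])
```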
Even more powerfully, you can use the `predict_linear()` function to forecast a gauge’s value one hour into the future based on its recent trend. This can help you create proactive alerts that warn you if a resource is depleting or a queue is growing uncontrollably, long before it becomes a critical issue.
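A sketch of such an alert expression; the lookback window and the 10,000-job threshold are illustrative, so tune both to your workload:

```
# Fire if the email queue is projected to exceed 10,000 jobs one hour from now,
# based on its growth over the last 30 minutes
predict_linear(app_background_jobs_pending{name="email"}[30m], 3600) > 10000
```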
3. Histogram
The Histogram is the most powerful way to measure latency and other value distributions in Prometheus. Instead of storing every single measurement, it counts how many fall into pre-configured buckets. This is ideal for understanding the distribution of API request durations, database query times, or any value where averages can be misleading.
When you create a histogram metric, it actually exposes multiple time series, distinguished by the suffixes `_bucket`, `_sum`, and `_count`:
```
# HELP http_request_duration_seconds Histogram of HTTP request durations in seconds
# TYPE http_request_duration_seconds histogram
http_request_duration_seconds_bucket{le="0.005"} 5
http_request_duration_seconds_bucket{le="0.01"} 12
http_request_duration_seconds_bucket{le="0.025"} 25
http_request_duration_seconds_bucket{le="0.05"} 40
http_request_duration_seconds_bucket{le="0.1"} 68
http_request_duration_seconds_bucket{le="0.25"} 105
http_request_duration_seconds_bucket{le="0.5"} 131
http_request_duration_seconds_bucket{le="1"} 142
http_request_duration_seconds_bucket{le="2.5"} 148
http_request_duration_seconds_bucket{le="5"} 148
http_request_duration_seconds_bucket{le="10"} 150
http_request_duration_seconds_bucket{le="+Inf"} 150
http_request_duration_seconds_sum 43.819204
http_request_duration_seconds_count 150
```
This `http_request_duration_seconds` metric summarizes 150 HTTP requests by dividing their latencies into the following default buckets, which are optimized for typical application latencies:

```
[0.005, 0.01, 0.025, 0.05, 0.1, 0.25, 0.5, 1, 2.5, 5, 10, +Inf]
```
From this output, you can see the following:
- `_count`: a total of 150 requests were observed.
- `_sum`: the combined duration of all requests was ~43.8 seconds.
- `_bucket`: these are cumulative counters where the `le` (less-or-equal) label defines each bucket’s upper bound. The `le="1"` bucket with a value of 142 means that 142 requests took one second or less to complete. Because the buckets are cumulative, subtracting adjacent ones gives the count in a range: 142 - 131 = 11 requests took between 0.5 and 1 second.
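Since `_sum` and `_count` are themselves counters, you can also combine their rates to graph the average request duration, a useful complement to percentiles:

```
# Average request duration over the last 5 minutes
rate(http_request_duration_seconds_sum[5m]) / rate(http_request_duration_seconds_count[5m])
```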
When instrumenting your code, it’s often necessary to update the default buckets so that they better align with what you’re measuring.
For instance, if your Service Level Objective (SLO) for an endpoint is that 99% of requests must be faster than 250ms, your buckets should provide fine-grained resolution around that value so that you can distinguish between comfortably meeting that target (e.g., at 150ms) versus just barely scraping by (e.g., at 245ms).
```js
const httpRequestDuration = new promClient.Histogram({
  name: "http_request_duration_seconds",
  help: "Duration of HTTP requests in seconds.",
  labelNames: ["method", "path", "status_code"],
  buckets: [0.05, 0.1, 0.15, 0.2, 0.25, 0.5, 1], // From 50ms to 1s
});

// In your middleware, start a timer and observe the duration on finish
const end = httpRequestDuration.startTimer();
res.on("finish", () => {
  end({
    method: req.method,
    path: req.route.path,
    status_code: res.statusCode,
  });
});
```
The primary reason to use histograms is to calculate accurate quantiles (percentiles), which is essential for measuring and alerting on SLOs. This is done with the `histogram_quantile()` function.
```
# Calculate the 99th percentile for the /api/cart path over 5 minutes
histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{path="/api/cart"}[5m])) by (le))
```
This tells you the response time that 99% of users experienced, which is a far better indicator of performance than a simple average.
If `histogram_quantile()` queries are slow, use recording rules in Prometheus to pre-calculate them at regular intervals and save the result to a new metric. This makes dashboards and alerts lightning fast.
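A minimal sketch of such a rule file; the group name and recorded metric name are illustrative, following the common `level:metric:operations` naming convention:

```yaml
groups:
  - name: latency_percentiles
    interval: 1m
    rules:
      # Pre-compute the per-path 99th percentile every minute
      - record: path:http_request_duration_seconds:p99_5m
        expr: >
          histogram_quantile(0.99,
            sum(rate(http_request_duration_seconds_bucket[5m])) by (le, path))
```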
As of v2.40, native histograms have been introduced as a newer, more efficient format that provides higher resolution with reduced data overhead. However, they are still considered experimental and must be enabled with the `--enable-feature=native-histograms` flag.
4. Summary
The Summary metric, like a Histogram, is used for tracking distributions. However, it operates like a personal statistician on board your application: it calculates streaming quantiles on the client-side and exposes them directly, ready to be scraped.
This client-side calculation comes with a critical limitation: you cannot aggregate quantiles from a Summary across multiple instances. It is mathematically invalid to average a 99th percentile from ten different servers to get a meaningful global 99th percentile. This makes Summaries unsuitable for most modern, distributed architectures where you need a system-wide view of performance.
So, when are Summaries useful? Their niche is in scenarios where aggregation isn’t needed, and you require highly accurate quantiles with low server-side query overhead. This makes them a good choice for monitoring a specific, single-instance service where you care about its individual performance characteristics.
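When a Summary does fit, instrumenting one with prom-client looks much like the other types. A minimal sketch; the metric name, percentiles, and sliding-window settings here are illustrative:

```js
const dbQueryDuration = new promClient.Summary({
  name: "db_query_duration_seconds",
  help: "Duration of database queries in seconds",
  percentiles: [0.5, 0.9, 0.99], // quantiles computed client-side
  maxAgeSeconds: 600, // compute quantiles over a 10-minute sliding window...
  ageBuckets: 5, // ...rotated across 5 buckets
});

// Record each measured duration in seconds
dbQueryDuration.observe(0.23);
```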
Final thoughts
Mastering Prometheus begins with mastering its four metric types. Each one is a specialized tool designed for a specific job, and choosing the right one is the foundation of effective monitoring.
Use Counters to track rates of events, Gauges to monitor current states, and Histograms to understand the distribution of your system’s performance, reserving Summaries for the rare single-instance cases where client-side quantiles fit.
Ultimately, thoughtful instrumentation isn’t just a technical exercise; it’s how you gain actionable insights, build reliable services, and ensure a better experience for your users.
Thanks for reading, and happy monitoring!
