
Observable Event-Driven Autoscaling with KEDA, OpenTelemetry, and Dash0

KEDA brings event-driven autoscaling to Kubernetes by managing Horizontal Pod Autoscalers behind the scenes. That simplicity is powerful - but it also makes it essential to see what’s happening inside its control plane. Learn how to combine KEDA with OpenTelemetry and Dash0 to observe its internal metrics, and how to use those same signals to drive fast workload autoscaling.

In a previous post, Autoscaling Your Kubernetes Application with Dash0, we showed how to use the Kubernetes Horizontal Pod Autoscaler (HPA) to scale workloads based on custom application metrics from Dash0. By wiring the HPA to Dash0 through the Prometheus Adapter, you can move beyond CPU and memory and base scaling decisions on meaningful application signals like request rate or latency.

That approach works well for many workloads - but it’s not the only way to autoscale in Kubernetes.

While the HPA reacts to metrics, KEDA (Kubernetes Event-Driven Autoscaling) lets you scale workloads based on events and external signals. It can still scale on metrics like the HPA, but it goes further: it can watch message queue depth, database job tables, or scheduled times, and even scale workloads all the way down to zero when no work is waiting. It takes care of creating and managing the HPA under the hood, letting you simply declare what should trigger scaling and how the workload should respond.

And because KEDA can scale rapidly and down to zero, observability becomes critical. When something doesn’t scale, you need to see not just what your application is doing - but what KEDA itself thinks is happening. Is the scaler firing? Are the metrics arriving? Is the control loop running on time?

In this post, you’ll see how to make that visible with OpenTelemetry and Dash0, and how to use the very same telemetry to power event-driven autoscaling with KEDA.

How KEDA works

KEDA is a lightweight component you install into your cluster. It’s designed to be as invisible as possible when idle - it only becomes active when a scaling decision needs to be made.

There are two main pieces:

  • The KEDA operator watches your triggers (defined as ScaledObject or ScaledJob resources). Every few seconds, it checks whether those triggers report enough work to justify scaling. If they do, it updates the target workload’s replica count; if not, it reduces it back toward the minimum.
  • The metrics adapter exposes the External Metrics API so that the Horizontal Pod Autoscaler can see non-resource metrics like queue length or HTTP requests per second. This is how KEDA integrates seamlessly with Kubernetes’ native autoscaling framework - it feeds external signals into the HPA as if they were regular metrics.

From your perspective, you just define a ScaledObject. It points to a workload (like a Deployment) and one or more triggers that describe how to detect work. KEDA takes care of the rest: evaluating triggers, pushing data to the HPA, and adjusting replicas as needed.

When a trigger reports no work at all, KEDA can scale the workload down to zero. That’s something the HPA can’t do on its own, and it’s what makes KEDA a good fit for on-demand, event-driven workloads.
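
To make that concrete, here is a minimal sketch of a ScaledObject. The workload name and the cron trigger are placeholders chosen for illustration; full, working examples follow later in this post:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: worker-scaler
spec:
  scaleTargetRef:
    name: worker              # the Deployment to scale (hypothetical name)
  minReplicaCount: 0          # allow scaling all the way down to zero
  maxReplicaCount: 10
  triggers:
    - type: cron              # any of KEDA's scalers can be used as a trigger
      metadata:
        timezone: Europe/Berlin
        start: "0 8 * * *"
        end: "0 18 * * *"
        desiredReplicas: "2"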

But before you can trust an autoscaler to act on your behalf, you need to be able to see what it’s doing - and why. That’s where OpenTelemetry comes in.

Observing the KEDA control plane with OpenTelemetry

When you look at a deployment and see it’s stuck at one replica even though traffic is rising, how do you know what’s wrong? Is KEDA not firing the scaler? Is the metric not being fetched? Is the control loop behind?

Starting with version 2.12, KEDA can emit its own internal metrics via OpenTelemetry (still experimental). This gives you visibility into the autoscaler itself. You can see which scalers are active, what values they’re returning, how long they take to fetch data, whether they’ve encountered errors, and how quickly the control loop is running.

Enabling OpenTelemetry in the KEDA operator

You enable this by adding a single flag to the KEDA operator deployment, along with a few environment variables that point it at your OpenTelemetry Collector (if you install KEDA with its Helm chart, the chart exposes equivalent configuration options):

yaml
containers:
  - name: keda-operator
    image: ghcr.io/kedacore/keda:2.17.0
    args:
      - --enable-opentelemetry-metrics=true
    env:
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: "http://<opentelemetry collector endpoint>:4317"
      - name: OTEL_EXPORTER_OTLP_PROTOCOL
        value: "grpc"

With this enabled, KEDA begins emitting a stream of OTel-formatted metrics, including:

  • keda.scaler.active - 1 if a scaler is firing, 0 if not
  • keda.scaler.metrics.value - the current values fetched from your triggers
  • keda.scaler.metrics.latency.seconds - how long it takes to fetch them
  • keda.internal.scale.loop.latency.seconds - how long each control loop takes

When these metrics arrive in Dash0, you can explore them in two ways. Dash0 automatically makes them available using Prometheus-style names with underscores (for example, keda.scaler.active becomes keda_scaler_active), which you can query with PromQL. You can also query them as native OpenTelemetry metrics using their original names, for example:

plain text
{otel_metric_name = "keda.scaler.metrics.latency", otel_metric_type = "gauge"}
Screenshot of the Dash0 Metrics Explorer showing all keda.* metrics collected from the KEDA operator
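
For example, a minimal PromQL query against the underscore-named form - listing the scalers that are currently firing - could look like this (exact label names depend on how the metrics arrive in your setup):

plain text
keda_scaler_active > 0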

This gives you full visibility into KEDA’s decision-making - but first, you need a place to send the data.

Shipping KEDA metrics to Dash0

OpenTelemetry metrics are usually sent to an OpenTelemetry Collector, which acts as a pipeline. It receives data from KEDA, optionally processes it, and exports it to Dash0.

The simplest way to deploy a Collector is via the official Helm chart. Create a values.yaml file like this:

yaml
mode: deployment
config:
  receivers:
    otlp:
      protocols:
        grpc:
        http:
  exporters:
    otlp/dash0:
      auth:
        authenticator: bearertokenauth/dash0
      endpoint: ingress.eu-west-1.aws.dash0.com:4317
  extensions:
    bearertokenauth/dash0:
      scheme: Bearer
      token: ${env:DASH0_AUTH_TOKEN}
  service:
    extensions:
      - bearertokenauth/dash0
    pipelines:
      metrics:
        receivers: [otlp]
        exporters: [otlp/dash0]   # must match the exporter name defined above

Then install the Collector:

sh
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install otel-collector open-telemetry/opentelemetry-collector \
  -f values.yaml \
  --namespace opentelemetry \
  --create-namespace

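Note that the values.yaml above reads the Dash0 auth token from the DASH0_AUTH_TOKEN environment variable, so that variable has to be present in the Collector pod. With the opentelemetry-collector Helm chart, one way to provide it is via extraEnvs in values.yaml - the Secret name and key below are placeholders, not part of the demo:

yaml
extraEnvs:
  - name: DASH0_AUTH_TOKEN
    valueFrom:
      secretKeyRef:
        name: dash0-secrets      # hypothetical Secret holding your Dash0 auth token
        key: token
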
Once the Collector is running, set OTEL_EXPORTER_OTLP_ENDPOINT=http://<OpenTelemetry Collector Service>:4317 in the KEDA operator. Within a few minutes, you’ll see metrics like keda.scaler.active and keda.scaler.metrics.value appearing in Dash0, showing exactly what KEDA is doing in real time.

Screenshot of a dashboard showing KEDA scaling behavior; the dashboard can be imported via the KEDA integration in Dash0.

Scaling based on HTTP request metrics from Dash0

With the KEDA control plane now observable, let’s make it do something.

In the Dash0 examples repository, we have created a small demo that shows how scaling works with metrics from Dash0. The demo deploys a simple HTTP service, and an OpenTelemetry Collector collects request counts and sends them to Dash0. A ScaledObject using KEDA’s Prometheus scaler then queries Dash0’s Prometheus-compatible API for the request rate.

Here’s the simplified ScaledObject manifest (the full version is in the examples repository). It scales the service between 1 and 10 replicas whenever the request rate exceeds 1 request/second:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: http-scaler
  namespace: keda-demo
spec:
  scaleTargetRef:
    name: http-app
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      name: http-requests
      metadata:
        serverAddress: https://api.eu-west-1.aws.dash0.com/api/prometheus
        query: |
          sum(rate({otel_metric_name="http.server.duration", service_name="http-app"}[1m]))
        threshold: "1"
        authModes: "bearer"
      authenticationRef:
        name: dash0-auth

When the incoming request rate crosses 1 req/s, keda.scaler.active becomes 1 and pods scale up rapidly. When traffic stops, the replicas scale back down to the configured minimum of one pod.

This demo does not scale to zero: if there were no pod to serve incoming requests, we would not have any telemetry to evaluate - creating a chicken-and-egg problem. Keeping one replica ensures that requests can continuously be served, and that KEDA can make scaling decisions.

Authenticating the trigger with TriggerAuthentication

Because Dash0’s Prometheus API is protected behind token-based authentication, you need to give KEDA an API token. Instead of embedding the token in the ScaledObject, use a TriggerAuthentication resource referencing a Kubernetes Secret:

sh
kubectl create secret generic dash0-api-secret \
  --from-literal=apiToken='<YOUR_DASH0_API_TOKEN>'

Reference the secret in the TriggerAuthentication resource:

yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: dash0-auth
spec:
  secretTargetRef:
    - parameter: bearerToken
      name: dash0-api-secret
      key: apiToken

Then reference it from the trigger:

yaml
triggers:
  - type: prometheus
    authenticationRef:
      name: dash0-auth

This keeps credentials secure and out of your manifests while letting KEDA authenticate to Dash0.

Scaling based on RabbitMQ queue depth

The HTTP example showed how KEDA can scale on custom metrics coming from your own application. But one of KEDA’s biggest strengths is that it can scale based on external event sources without requiring those sources to expose Kubernetes metrics.

A good example is RabbitMQ.

In the same examples repository, we’ve included a demo that runs a RabbitMQ broker, pushes messages into a queue, and uses KEDA’s built-in RabbitMQ scaler to scale a worker deployment based on the number of messages waiting in that queue.

Here’s the ScaledObject manifest from the RabbitMQ demo:

yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: keda-demo
spec:
  scaleTargetRef:
    name: rabbitmq-consumer
  minReplicaCount: 0
  maxReplicaCount: 10
  cooldownPeriod: 10
  pollingInterval: 5
  idleReplicaCount: 0
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: work_queue
        mode: QueueLength
        value: "5"
        host: amqp://guest:guest@rabbitmq.keda-demo.svc.cluster.local:5672/
        vhostName: /

Whenever the queue length exceeds 5, KEDA begins scaling the worker pods up. When the queue is drained, it scales them all the way down to zero.

To make this observable, the demo also configures an OpenTelemetry Collector to scrape RabbitMQ metrics - including rabbitmq.queue.messages_ready - and send them to Dash0. This lets you watch queue depth over time alongside either KEDA’s own control-plane metrics such as keda.scaler.active and keda.scaler.metrics.value or Kubernetes HPA metrics.
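
The actual Collector configuration for this lives in the examples repository. As a rough, hedged sketch, scraping RabbitMQ’s Prometheus plugin with the Collector’s prometheus receiver could look like this (the target address, port, and interval are assumptions, not copied from the demo):

yaml
receivers:
  prometheus:
    config:
      scrape_configs:
        - job_name: rabbitmq
          scrape_interval: 10s
          static_configs:
            # RabbitMQ's Prometheus plugin exposes metrics on port 15692 by default
            - targets: ["rabbitmq.keda-demo.svc.cluster.local:15692"]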

The graph shows queue depth rising as messages are published, worker replicas scaling up from 0 to 10, and everything draining back down to 0 pods as the queue empties (logarithmic y-axis).

This RabbitMQ example highlights the core value of KEDA:

It doesn’t just scale based on what’s happening inside your cluster - it can scale based on real-world work waiting to be done.

What else you can do with KEDA

These two demos show how KEDA can scale both on metrics-based signals (like HTTP request rates) and event-based signals (like RabbitMQ queues). But these are just starting points.

KEDA ships with more than 70 scalers covering everything from message queues to databases to cloud services. Once you’ve made KEDA observable, you can apply the same pattern to triggers like:

  • PostgreSQL scaler - run a SQL query that returns a number (like rows waiting in a job table) and scale based on that.
  • Kafka scaler - scale consumers based on topic lag to keep up with incoming events.
  • Cron scaler - run batch jobs at scheduled times by scaling from zero just when needed.
  • Azure Service Bus, SQS, or GCP Pub/Sub scalers - scale serverless-style workloads based on messages in cloud queues.

You can even combine multiple triggers on the same ScaledObject. If any of them crosses its threshold, KEDA scales your workload up. This lets you model more complex autoscaling logic, such as “scale if either queue length > 100 or error rate > 5%.”
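
As an illustration, here is what the triggers section of such a combined ScaledObject could look like, reusing the RabbitMQ and Prometheus triggers from the demos above (the thresholds are made up for this example):

yaml
triggers:
  # scale up when the queue backs up ...
  - type: rabbitmq
    metadata:
      queueName: work_queue
      mode: QueueLength
      value: "100"
      host: amqp://guest:guest@rabbitmq.keda-demo.svc.cluster.local:5672/
  # ... or when the request rate climbs - KEDA acts on whichever trigger crosses its threshold
  - type: prometheus
    metadata:
      serverAddress: https://api.eu-west-1.aws.dash0.com/api/prometheus
      query: |
        sum(rate({otel_metric_name="http.server.duration", service_name="http-app"}[1m]))
      threshold: "1"
      authModes: "bearer"
    authenticationRef:
      name: dash0-auth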

And because you’re collecting KEDA’s own control-plane metrics with OpenTelemetry, you can see exactly how it reacts to each trigger - making these event-driven patterns observable and predictable.

Final thoughts

In our earlier post, you saw how the HPA can scale on custom application metrics from Dash0. In this post, you’ve seen how KEDA can do the same - and much more. It can scale on metrics, but also on external events, and it can scale to zero when no work is waiting. By enabling OpenTelemetry metrics in the KEDA operator and sending them to Dash0, you can observe the autoscaler itself - and by using Dash0’s Prometheus API, you can feed those same metrics back into KEDA to drive scaling decisions.

KEDA brings event-driven autoscaling to Kubernetes. OpenTelemetry brings visibility to the autoscaler itself. And Dash0 ties it all together - collecting the signals, making them explorable, and feeding them back as scaling input.

You don’t have to hope your autoscaler is working correctly anymore. You can see it, measure it, and trust it.