KEDA brings event-driven autoscaling to Kubernetes by managing Horizontal Pod Autoscalers behind the scenes. That simplicity is powerful - but it also makes it essential to see what’s happening inside its control plane. Learn how to combine KEDA with OpenTelemetry and Dash0 to observe its internal metrics, and how to use those same signals to drive fast workload autoscaling.
In a previous post, Autoscaling Your Kubernetes Application with Dash0, we showed how to use the Kubernetes Horizontal Pod Autoscaler (HPA) to scale workloads based on custom application metrics from Dash0. By wiring the HPA to Dash0 through the Prometheus Adapter, you can move beyond CPU and memory and base scaling decisions on meaningful application signals like request rate or latency.
That approach works well for many workloads - but it’s not the only way to autoscale in Kubernetes.
While the HPA reacts to metrics, KEDA (Kubernetes Event-Driven Autoscaling) lets you scale workloads based on events and external signals. It can still scale on metrics like the HPA, but it goes further: it can watch message queue depth, database job tables, or scheduled times, and even scale workloads all the way down to zero when no work is waiting. It takes care of creating and managing the HPA under the hood, letting you simply declare what should trigger scaling and how the workload should respond.
And because KEDA can scale rapidly and down to zero, observability becomes critical. When something doesn’t scale, you need to see not just what your application is doing - but what KEDA itself thinks is happening. Is the scaler firing? Are the metrics arriving? Is the control loop running on time?
In this post, you’ll see how to make that visible with OpenTelemetry and Dash0, and how to use the very same telemetry to power event-driven autoscaling with KEDA.
How KEDA works
KEDA is a lightweight component you install into your cluster. It’s designed to be as invisible as possible when idle - it only becomes active when a scaling decision needs to be made.
There are two main pieces:
- The KEDA operator watches your triggers (defined as `ScaledObject` or `ScaledJob` resources). Every few seconds, it checks whether those triggers report enough work to justify scaling. If they do, it updates the target workload's replica count; if not, it reduces it back toward the minimum.
- The metrics adapter exposes the External Metrics API so that the Horizontal Pod Autoscaler can see non-resource metrics like queue length or HTTP requests per second. This is how KEDA integrates seamlessly with Kubernetes' native autoscaling framework - it feeds external signals into the HPA as if they were regular metrics.
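If you installed KEDA with its Helm chart, you can see these components running as separate Deployments (the `keda` namespace and the admission webhooks Deployment below reflect a default chart installation; adjust if your setup differs):

```shell
# List the KEDA control-plane components (output abridged)
kubectl get deployments -n keda
# NAME                              READY
# keda-admission-webhooks           1/1
# keda-operator                     1/1
# keda-operator-metrics-apiserver   1/1
```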
From your perspective, you just define a `ScaledObject`. It points to a workload (like a Deployment) and one or more triggers that describe how to detect work. KEDA takes care of the rest: evaluating triggers, pushing data to the HPA, and adjusting replicas as needed.
When a trigger reports no work at all, KEDA can scale the workload down to zero. That’s something the HPA can’t do on its own, and it’s what makes KEDA a good fit for on-demand, event-driven workloads.
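As a minimal sketch of what that looks like, here is a `ScaledObject` that uses KEDA's cron scaler to run a hypothetical `batch-worker` Deployment only during a nightly window and scale it to zero the rest of the time; the workload name and schedule are placeholders:

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: batch-worker-scaler
spec:
  scaleTargetRef:
    name: batch-worker          # the Deployment to scale (placeholder)
  minReplicaCount: 0            # allow scaling all the way down to zero
  maxReplicaCount: 5
  triggers:
    - type: cron
      metadata:
        timezone: Europe/Berlin
        start: 0 2 * * *        # scale up at 02:00
        end: 0 4 * * *          # scale back down after 04:00
        desiredReplicas: "5"
```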
But before you can trust an autoscaler to act on your behalf, you need to be able to see what it’s doing - and why. That’s where OpenTelemetry comes in.
Observing the KEDA control plane with OpenTelemetry
When you look at a deployment and see it’s stuck at one replica even though traffic is rising, how do you know what’s wrong? Is KEDA not firing the scaler? Is the metric not being fetched? Is the control loop behind?
Starting with version 2.12, KEDA can emit its own internal metrics via OpenTelemetry (still experimental). This gives you visibility into the autoscaler itself. You can see which scalers are active, what values they’re returning, how long they take to fetch data, whether they’ve encountered errors, and how quickly the control loop is running.
Enabling OpenTelemetry in the KEDA operator
You enable this by adding a single flag to the KEDA operator deployment, along with a few environment variables to point it at your OpenTelemetry Collector (if you installed KEDA with Helm, the chart exposes equivalent settings; see its documentation):
```yaml
containers:
  - name: keda-operator
    image: ghcr.io/kedacore/keda:2.17.0
    args:
      - --enable-opentelemetry-metrics=true
    env:
      - name: OTEL_EXPORTER_OTLP_ENDPOINT
        value: "http://<opentelemetry collector endpoint>:4317"
      - name: OTEL_EXPORTER_OTLP_PROTOCOL
        value: "grpc"
```
With this enabled, KEDA begins emitting a stream of OTel-formatted metrics, including:
- `keda.scaler.active` - 1 if a scaler is firing, 0 if not
- `keda.scaler.metrics.value` - the current values fetched from your triggers
- `keda.scaler.metrics.latency.seconds` - how long it takes to fetch them
- `keda.internal.scale.loop.latency.seconds` - how long each control loop takes
When these metrics arrive in Dash0, you can explore them in two ways. Dash0 automatically makes them available using Prometheus-style names with underscores (for example `keda.scaler.active` → `keda_scaler_active`), which you can query with PromQL. You can also query them as native OpenTelemetry metrics using their original names, for example:
```
{otel_metric_name = "keda.scaler.metrics.latency", otel_metric_type = "gauge"}
```
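If you prefer the Prometheus-style names, an equivalent PromQL query could look like the one below; the `scaledObject` label name is an assumption about how KEDA attaches attributes and may differ between KEDA versions:

```
keda_scaler_metrics_value{scaledObject="<your ScaledObject name>"}
```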
This gives you full visibility into KEDA’s decision-making - but first, you need a place to send the data.
Shipping KEDA metrics to Dash0
OpenTelemetry metrics are usually sent to an OpenTelemetry Collector, which acts as a pipeline. It receives data from KEDA, optionally processes it, and exports it to Dash0.
The simplest way to deploy a Collector is via the official Helm chart. Create a `values.yaml` file like this:
```yaml
mode: deployment
config:
  receivers:
    otlp:
      protocols:
        grpc:
        http:
  exporters:
    # Export to Dash0 over OTLP/gRPC with bearer-token authentication
    otlp/dash0:
      auth:
        authenticator: bearertokenauth/dash0
      endpoint: ingress.eu-west-1.aws.dash0.com:4317
  extensions:
    bearertokenauth/dash0:
      scheme: Bearer
      token: ${env:DASH0_AUTH_TOKEN}
  service:
    extensions:
      - bearertokenauth/dash0
    pipelines:
      metrics:
        receivers: [otlp]
        exporters: [otlp/dash0]
```
Then install the Collector:
```shell
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

helm install otel-collector open-telemetry/opentelemetry-collector \
  -f values.yaml \
  --namespace opentelemetry \
  --create-namespace
```
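One thing to note: the `${env:DASH0_AUTH_TOKEN}` reference in the values file expects that environment variable to be present inside the Collector pod. One way to provide it, assuming you keep the token in a Kubernetes Secret (the Secret name `dash0-secrets` and key `token` here are placeholders) and use the chart's `extraEnvs` setting, is to add something like this to the same `values.yaml`:

```yaml
extraEnvs:
  - name: DASH0_AUTH_TOKEN
    valueFrom:
      secretKeyRef:
        name: dash0-secrets   # placeholder Secret holding your Dash0 auth token
        key: token
```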
Once the Collector is running, set `OTEL_EXPORTER_OTLP_ENDPOINT=http://<OpenTelemetry Collector Service>:4317` in the KEDA operator. Within a few minutes, you'll see metrics like `keda.scaler.active` and `keda.scaler.metrics.value` appearing in Dash0, showing exactly what KEDA is doing in real time.
Scaling based on HTTP request metrics from Dash0
With the KEDA control plane now observable, let’s make it do something.
In the Dash0 examples repository, we've created a small demo that shows how scaling with metrics from Dash0 works. The demo deploys a simple HTTP service. An OpenTelemetry Collector collects request counts and sends them to Dash0. We then configure a `ScaledObject` with KEDA's Prometheus scaler to query Dash0's Prometheus-compatible API for the request rate.
Here’s the simplified `ScaledObject` manifest (see the full manifest in the examples repository). It scales the workload between 1 and 10 replicas, adding pods whenever the request rate exceeds 1 request/second:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: http-scaler
  namespace: keda-demo
spec:
  scaleTargetRef:
    name: http-app
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: prometheus
      name: http-requests
      metadata:
        serverAddress: https://api.eu-west-1.aws.dash0.com/api/prometheus
        query: |
          sum(rate({otel_metric_name="http.server.duration", service_name="http-app"}[1m]))
        threshold: "1"
        authModes: "bearer"
      authenticationRef:
        name: dash0-auth
```
When the incoming request rate crosses 1 req/s, `keda.scaler.active` becomes 1 and pods scale up rapidly. When traffic stops, the replicas scale back down to the configured minimum of one pod.
This demo does not scale to zero: if there were no pod to serve incoming requests, we would not have any telemetry to evaluate - creating a chicken-and-egg problem. Keeping one replica ensures that requests can continuously be served, and that KEDA can make scaling decisions.
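To watch this happen, you can follow the HPA that KEDA manages for the `ScaledObject` (KEDA names it `keda-hpa-<ScaledObject name>`) while sending some traffic. The load-generation command and service URL below are placeholders for whatever tooling and endpoint you use:

```shell
# Watch the HPA that KEDA created for the http-scaler ScaledObject
kubectl get hpa keda-hpa-http-scaler -n keda-demo -w

# In another terminal, generate a few requests per second for two minutes
# (hey is one option; the URL is a placeholder for your service endpoint)
hey -z 2m -q 5 -c 2 http://<http-app endpoint>/
```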
Authenticating the trigger with TriggerAuthentication
Because Dash0’s Prometheus API is protected by token-based authentication, you need to give KEDA an API token. Instead of embedding the token in the `ScaledObject`, use a `TriggerAuthentication` resource referencing a Kubernetes Secret:
```shell
# Create the Secret in the same namespace as the ScaledObject and TriggerAuthentication
kubectl create secret generic dash0-api-secret \
  --namespace keda-demo \
  --from-literal=apiToken='<YOUR_DASH0_API_TOKEN>'
```
Reference the secret in the `TriggerAuthentication` resource:
```yaml
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: dash0-auth
spec:
  secretTargetRef:
    - parameter: bearerToken
      name: dash0-api-secret
      key: apiToken
```
Then reference it from the trigger:
```yaml
triggers:
  - type: prometheus
    authenticationRef:
      name: dash0-auth
```
This keeps credentials secure and out of your manifests while letting KEDA authenticate to Dash0.
Scaling based on RabbitMQ queue depth
The HTTP example showed how KEDA can scale on custom metrics coming from your own application. But one of KEDA’s biggest strengths is that it can scale based on external event sources without requiring those sources to expose Kubernetes metrics.
A good example is RabbitMQ.
In the same examples repository, we’ve included a demo that runs a RabbitMQ broker, pushes messages into a queue, and uses KEDA’s built-in RabbitMQ scaler to scale a worker deployment based on the number of messages waiting in that queue.
Here’s the `ScaledObject` manifest from the RabbitMQ demo:
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: rabbitmq-consumer-scaler
  namespace: keda-demo
spec:
  scaleTargetRef:
    name: rabbitmq-consumer
  minReplicaCount: 0
  maxReplicaCount: 10
  cooldownPeriod: 10
  pollingInterval: 5
  idleReplicaCount: 0
  triggers:
    - type: rabbitmq
      metadata:
        protocol: amqp
        queueName: work_queue
        mode: QueueLength
        value: "5"
        host: amqp://guest:guest@rabbitmq.keda-demo.svc.cluster.local:5672/
        vhostName: /
```
Whenever the queue length exceeds 5, KEDA begins scaling the worker pods up. In QueueLength mode the value is a target per replica, so with 23 messages waiting KEDA asks for ceil(23 / 5) = 5 consumers. When the queue is drained, it scales them all the way down to zero.
To make this observable, the demo also configures an OpenTelemetry Collector to scrape RabbitMQ metrics - including `rabbitmq.queue.messages_ready` - and send them to Dash0. This lets you watch queue depth over time alongside either KEDA's own control-plane metrics (such as `keda.scaler.active` and `keda.scaler.metrics.value`) or the Kubernetes HPA metrics.
This RabbitMQ example highlights the core value of KEDA:
It doesn’t just scale based on what’s happening inside your cluster - it can scale based on real-world work waiting to be done.
What else you can do with KEDA
These two demos show how KEDA can scale both on metrics-based signals (like HTTP request rates) and event-based signals (like RabbitMQ queues). But these are just starting points.
KEDA ships with more than 70 scalers covering everything from message queues to databases to cloud services. Once you’ve made KEDA observable, you can apply the same pattern to triggers like:
- PostgreSQL scaler - run a SQL query that returns a number (like rows waiting in a job table) and scale based on that.
- Kafka scaler - scale consumers based on topic lag to keep up with incoming events.
- Cron scaler - run batch jobs at scheduled times by scaling from zero just when needed.
- Azure Service Bus, SQS, or GCP Pub/Sub scalers - scale serverless-style workloads based on messages in cloud queues.
You can even combine multiple triggers on the same `ScaledObject`. If any of them crosses its threshold, KEDA scales your workload up. This lets you model more complex autoscaling logic, like “scale if either queue length > 100 or error rate > 5%” - see the sketch below.
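As a rough sketch of what that could look like (the queue name, thresholds, and the error-rate query are placeholders you would adapt to your own metrics and label names), the triggers section might combine the two scalers from this post:

```yaml
triggers:
  # Scale up when more than ~100 messages are waiting per replica...
  - type: rabbitmq
    metadata:
      protocol: amqp
      queueName: work_queue
      mode: QueueLength
      value: "100"
      host: amqp://guest:guest@rabbitmq.keda-demo.svc.cluster.local:5672/
  # ...or when the error rate reported by Dash0 exceeds 5%
  - type: prometheus
    metadata:
      serverAddress: https://api.eu-west-1.aws.dash0.com/api/prometheus
      query: |
        100 * sum(rate({otel_metric_name="http.server.duration", service_name="http-app", http_status_code=~"5.."}[5m]))
          / sum(rate({otel_metric_name="http.server.duration", service_name="http-app"}[5m]))
      threshold: "5"
      authModes: "bearer"
    authenticationRef:
      name: dash0-auth
```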
And because you’re collecting KEDA’s own control-plane metrics with OpenTelemetry, you can see exactly how it reacts to each trigger - making these event-driven patterns observable and predictable.
Final thoughts
In our earlier post, you saw how the HPA can scale on custom application metrics from Dash0. In this post, you’ve seen how KEDA can do the same - and much more. It can scale on metrics, but also on external events, and it can scale to zero when no work is waiting. By enabling OpenTelemetry metrics in the KEDA operator and sending them to Dash0, you can observe the autoscaler itself - and by using Dash0’s Prometheus API, you can feed those same metrics back into KEDA to drive scaling decisions.
KEDA brings event-driven autoscaling to Kubernetes. OpenTelemetry brings visibility to the autoscaler itself. And Dash0 ties it all together - collecting the signals, making them explorable, and feeding them back as scaling input.
You don’t have to hope your autoscaler is working correctly anymore. You can see it, measure it, and trust it.