Last updated: September 1, 2025
Code Red Newsletter #10
Hi there,
Adoption, redaction, and reality checks.
OpenTelemetry is no longer the cool side project tucked away in a platform team’s backlog. It’s being rolled out at scale - across public agencies, travel platforms, and internal teams with real constraints. And as adoption deepens, the conversations shift from “why OTel?” to “how do we make it stick?”
In focus: Observability, operationalized
In this edition, we highlight how teams are doing exactly that - through operator-based rollouts, cultural shifts, and smarter metrics. We also share a story about achieving observability even without OpenTelemetry, and preview what’s coming at KubeCon Japan, where I’ll be talking about debugging broken signals.
Telemetry is getting serious. Let’s make it work.
NAV: 0 to 100
Norway’s welfare agency (NAV) rolled out OpenTelemetry across more than 1,600 microservices. Their approach? Use the OpenTelemetry Operator to inject agents via annotations and support engineers with clear docs, filters, and pre-built config templates. They introduced span metrics and custom YAML extensions to drive adoption at scale - without overwhelming individual teams. The result is an observability setup that’s broadly adopted and battle-tested.
Read the post here.
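For the curious, annotation-based injection looks roughly like this - a minimal sketch assuming the OpenTelemetry Operator is already installed in the cluster (this is not NAV's actual config; names, images, and endpoints are placeholders):

```yaml
# Minimal sketch of Operator-based auto-instrumentation (placeholders throughout).
apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: default-instrumentation
spec:
  exporter:
    endpoint: http://otel-collector:4317   # hypothetical in-cluster Collector
  propagators:
    - tracecontext
    - baggage
---
# On the workload side, a single pod annotation asks the Operator to inject the agent.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service                    # hypothetical workload
spec:
  selector:
    matchLabels:
      app: example-service
  template:
    metadata:
      labels:
        app: example-service
      annotations:
        instrumentation.opentelemetry.io/inject-java: "true"
    spec:
      containers:
        - name: app
          image: ghcr.io/example/app:latest   # hypothetical image
```

Teams opt in per workload via the annotation, while the shared Instrumentation resource keeps exporter and propagator settings in one place.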
Skyscanner: Cost down, clarity up
Skyscanner reduced their observability bill by 90% after standardizing on OpenTelemetry. But this wasn’t just a tooling swap - it was a cultural shift. They aligned telemetry with business metrics, invested in trace-first debugging, and rolled out Game Days to teach engineers how to read traces like detectives. It's proof that good observability isn't about more data - it's about the right data.
Read more about Skyscanner’s adoption story here.
Warner Bros. Discovery: Observability with guardrails
WBD built observability directly into the rollout of Max, their streaming platform. By embedding metadata into every resource and setting budget-aware alerting policies, they ensured every trace and metric served a purpose. Observability wasn’t just for SREs - it was a tool for everyone to stay within cost, SLA, and scale boundaries. This is telemetry with FinOps baked in.
Read the article here.
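What “embedding metadata into every resource” can look like in practice: a hedged sketch using the standard OTEL_RESOURCE_ATTRIBUTES environment variable. The attribute keys below are illustrative, not WBD's actual schema.

```yaml
# Illustrative only - ownership and cost attributes are assumptions, not WBD's schema.
# OTEL_RESOURCE_ATTRIBUTES and OTEL_SERVICE_NAME are standard OpenTelemetry SDK settings.
env:
  - name: OTEL_SERVICE_NAME
    value: "playback-api"                       # hypothetical service name
  - name: OTEL_RESOURCE_ATTRIBUTES
    value: "service.namespace=streaming,deployment.environment=prod,team=playback,cost.center=max-platform"
```

Once every signal carries attributes like these, budget-aware alerting and per-team cost attribution become query filters rather than guesswork.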
Want more real-world OpenTelemetry stories? Check out the adopters page. And if your team is using OTel in production, add your company to the list and share your journey.
KubeCon Japan: Debugging, tracing, and Tokyo
KubeCon + CloudNativeCon Japan is just around the corner - June 16–17 in Tokyo - and the Observability track is stronger than ever. Expect talks on real-time Kafka tracing, semantic conventions, green telemetry, and building observability into high-stakes platforms.
I’ll also be giving a talk:
Debugging OpenTelemetry: Ensuring Your Observability Signals Are Spot On
📅 Tuesday June 17 · 🕝 14:50–15:20 JST · 📍 Level 1 | Orion
I’ll walk through what really happens when OpenTelemetry breaks - missing traces, mismatched metrics, or semantic chaos. Expect live demos, practical debugging workflows, and tips on how to validate signal pipelines with confidence. Whether you’re new to OTel or knee-deep in Collector YAML, this talk’s for you.
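If you want a head start before the talk, one simple way to sanity-check a pipeline is to wire the Collector's debug exporter next to your real backend and compare what actually flows through. A minimal sketch (the backend endpoint is a placeholder):

```yaml
# Minimal Collector config for inspecting signals in-flight.
# The debug exporter prints to stdout; the otlphttp endpoint is a placeholder.
receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  debug:
    verbosity: detailed                          # print full spans to stdout
  otlphttp:
    endpoint: https://example-backend.invalid    # replace with your backend

service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [debug, otlphttp]
```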
Dash0 also has a booth - come by S4 for a demo or just to chat about how OpenTelemetry-native observability can level up your platform. Book a live onsite demo here.
See more OpenTelemetry specific talks in this summary.
Code RED Podcast: Rethinking Query Standards
Jacek Migdal (Quesma, ex-Sumo Logic) joins Mirko Novakovic to unpack the messy reality of observability query languages.
From SQL translation and multi-engine portability to the hidden costs of query rewrites, they dive into why standardization is so elusive - and what role AI might play in making observability pipelines smarter (or weirder).
Listen to the episode now.
Choice cuts
Every issue needs a snack break. These didn’t fit the main thread, but they’re too good to leave behind. Pour yourself a cup of coffee and dig in.
OTTL: Your telemetry control layer
The OpenTelemetry Transformation Language is becoming essential. Redact headers, fix timestamps, or drop noisy metrics - all without touching app code.
Explore the guide here.
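To give you a flavor, here's a hedged Collector snippet - the header and metric names are illustrative, not taken from the guide - that redacts an authorization header on spans and drops a noisy metric by name:

```yaml
# Illustrative OTTL usage; attribute and metric names are assumptions.
processors:
  transform:
    trace_statements:
      - context: span
        statements:
          # Redact sensitive request headers before they leave the cluster.
          - set(attributes["http.request.header.authorization"], "REDACTED") where attributes["http.request.header.authorization"] != nil
  filter/drop-noise:
    metrics:
      metric:
        # Drop a hypothetical high-cardinality metric by name.
        - 'name == "http.client.request.duration.debug"'
```

Add the processors to the relevant pipelines in your service config and the telemetry is cleaned up before it ever reaches a backend.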
RPC semantic conventions are stabilizing
Consistent span names, duration histograms, and better framework coverage are landing soon. It’s a win for everyone tired of regexing their way through spaghetti telemetry.
Learn more here.
Kubeletstats: Say goodbye to .cpu.utilization
The kubeletstats receiver is deprecating all .cpu.utilization metrics - including k8s.node.cpu.utilization, k8s.pod.cpu.utilization, and container.cpu.utilization - in favor of .cpu.usage.
This update improves accuracy by aligning names with what the values actually represent: cumulative CPU usage time, not percentages. But if you're relying on these old names, this change could silently break dashboards.
Using Dash0? Semantic convention upgrades can automatically migrate your metrics to the correct names - no code changes or manual cleanup required.
Read the full update here.
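If you run your own Collector, the migration boils down to per-metric toggles on the kubeletstats receiver - something along these lines (exact toggle availability depends on your Collector version, so check the receiver's README for your release):

```yaml
# Sketch of switching off the deprecated utilization metrics and opting into usage.
# Version-dependent; verify metric names against your Collector release.
receivers:
  kubeletstats:
    collection_interval: 20s
    auth_type: serviceAccount
    metrics:
      k8s.node.cpu.utilization:
        enabled: false
      k8s.pod.cpu.utilization:
        enabled: false
      container.cpu.utilization:
        enabled: false
      k8s.node.cpu.usage:
        enabled: true
      k8s.pod.cpu.usage:
        enabled: true
      container.cpu.usage:
        enabled: true
```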
OpenTelemetry is no longer the experiment - it’s the expectation.
We’re constantly seeing more teams standardize, sanitize, and scale their telemetry practices. Whether that means adopting semantic conventions, automating pipelines, or embedding observability into platform workflows, the trend is clear: clean signals, shared context, and real control.
If you're heading to KubeCon Japan, stop by my talk or the Dash0 booth. We’d love to hear your story - or help you fix the parts that still feel broken.
Until next time - scrub your labels, validate your pipelines, and trace like you mean it.
Kasper, out!
