
Last updated: March 2, 2026

Mastering the OpenTelemetry Transform Processor

The OpenTelemetry Collector ships with a rich set of processors for common tasks: the attributes processor for key-value manipulation, the resource processor for resource-level changes, and various others for specific, bounded jobs.

But occasionally you need to do something none of them can: restructure a log body, compute a value from two existing fields, convert a metric type, or promote an attribute from the record level up to the resource level. That's when you reach for the transform processor.

The transform processor is powered by the OpenTelemetry Transformation Language (OTTL), a purpose-built expression language for manipulating telemetry in-flight. Think of it as a lightweight scripting layer built directly into your pipeline: you write statements that run against every span, log record, or metric data point as it flows through the Collector, and OTTL executes them in order.

The official docs cover what the processor does but rarely explain why certain patterns exist, what breaks when you misconfigure it, or how to debug statements that appear to do nothing. This guide closes those gaps, taking you from the basics through advanced patterns with the context you need to use it confidently in production.

Let's begin!

What the transform processor is actually doing

Before touching configuration, it helps to have a mental model of how the processor executes.

When a batch of telemetry arrives, the transform processor iterates over it hierarchically. For traces, it walks every resource span, then every scope span within it, then every individual span and span event. For logs, it walks every resource log, every scope log, and every log record. For metrics, it walks every resource metric, every scope metric, every metric, and then every data point within that metric.

At each level, the processor evaluates your OTTL statements. Each statement is a function call with an optional where clause that acts as a guard — the function only runs if the condition evaluates to true. Statements are executed in the exact order you define them, which matters enormously once you start chaining operations.
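As a minimal sketch of how ordering and where guards interact (the tier and plan attributes here are hypothetical, not from the demo), note that a later statement can overwrite an earlier one for exactly the records its guard matches:

```yaml
log_statements:
  # Runs unconditionally on every log record
  - set(log.attributes["tier"], "free")
  # Runs only where the guard holds; because it comes second,
  # it overwrites the value the first statement just set
  - set(log.attributes["tier"], "paid") where log.attributes["plan"] != nil
```

Swap the two statements and every record ends up with "free", because the unconditional set now runs last.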

The data model you're operating on follows the OpenTelemetry Protocol (OTLP) structure directly, which is both a strength and a gotcha. You have access to everything — resource attributes, instrumentation scope metadata, individual record fields — but you need to reference each level with the correct path prefix. Using span.attributes in a log statement will cause a parse error. Using resource.attributes works at any level because all signals carry resource information.

Quick start: your first transform

One thing the transform processor can do that nothing else in the Collector can is parse a structured JSON log body and promote its fields into proper attributes — in a single pipeline step, with no external tooling.

Consider a common situation: your application writes logs like this, with everything packed into the body as a JSON string:

```json
{
  "level": "error",
  "message": "payment gateway timeout",
  "order_id": "ord_8821",
  "duration_ms": 5023,
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736"
}
```

As far as your observability backend is concerned, that's an opaque blob of text. You can't filter by level, you can't alert on duration_ms, and the trace_id sitting in the body isn't wired to the actual trace context field that backends use for correlation. The transform processor fixes all of this in one shot.

Setting up the demo

Create a directory with three files. First, a log file that the Collector will tail:

```text
# app.log
{"level":"info","message":"server started","port":8080,"trace_id":"00000000000000000000000000000000"}
{"level":"info","message":"user login successful","user_id":"usr_4421","trace_id":"4bf92f3577b34da6a3ce929d0e0e4736"}
{"level":"warn","message":"high memory usage","percent":87.4,"host":"node-3","trace_id":"b7ad6b7169203331d166c7b39e74e7e3"}
{"level":"error","message":"payment gateway timeout","order_id":"ord_8821","duration_ms":5023,"trace_id":"a3ce929d0e0e47364bf92f3577b34da6","span_id":"4bf92f3577b34da6"}
{"level":"error","message":"database connection lost","retries":3,"trace_id":"d0e0e47364bf92f3577b34da6a3ce929"}
```

Then the Collector configuration itself:

<!-- prettier-ignore-start -->
```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [app.log]
    start_at: beginning

processors:
  transform:
    error_mode: ignore
    log_statements:
      - conditions:
          - IsString(log.body) and IsMatch(log.body, "^\\s*\\{")
        statements:
          # Parse the JSON body into the temporary cache map
          - merge_maps(log.cache, ParseJSON(log.body), "upsert")
          # Promote fields to proper log attributes
          - set(log.attributes["level"], log.cache["level"])
          - set(log.attributes["message"], log.cache["message"])
          - set(log.attributes["order_id"], log.cache["order_id"])
          - set(log.attributes["duration_ms"], log.cache["duration_ms"])
          - set(log.attributes["user_id"], log.cache["user_id"])
          - set(log.attributes["host"], log.cache["host"])
          - set(log.attributes["percent"], log.cache["percent"])
          - set(log.attributes["retries"], log.cache["retries"])
          # Set the actual trace context fields
          - set(log.trace_id.string, log.cache["trace_id"]) where log.cache["trace_id"] != nil
          - set(log.span_id.string, log.cache["span_id"]) where log.cache["span_id"] != nil
          # Set severity from the parsed level field
          - set(log.severity_text, log.cache["level"])
          - set(log.severity_number, SEVERITY_NUMBER_ERROR) where log.cache["level"] == "error"
          - set(log.severity_number, SEVERITY_NUMBER_WARN) where log.cache["level"] == "warn"
          - set(log.severity_number, SEVERITY_NUMBER_INFO) where log.cache["level"] == "info"

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [transform]
      exporters: [debug]
```
<!-- prettier-ignore-end -->

Finally, add the Docker Compose file to run the OpenTelemetry Collector:

```yaml
# docker-compose.yaml
services:
  otelcol:
    image: otel/opentelemetry-collector-contrib:0.146.1
    volumes:
      - ./otelcol.yaml:/etc/otelcol-contrib/config.yaml
      - ./app.log:/app.log
```

Run it with:

```bash
docker compose up -d
```

What to look for in the output

Without the transform processor, the debug exporter would show each record with a raw string body and no attributes beyond what the filelog receiver adds:

```text
LogRecord #3
ObservedTimestamp: ...
Body: Str({"level":"error","message":"payment gateway timeout","order_id":"ord_8821","duration_ms":5023,"trace_id":"a3ce929d0e0e47364bf92f3577b34da6","span_id":"4bf92f3577b34da6"})
Attributes:
     -> log.file.name: Str(app.log)
Trace ID:
Span ID:
Flags: 0
```

With the transform processor in the pipeline, the same record arrives at the exporter fully decomposed:

```text
LogRecord #3
ObservedTimestamp: 2026-03-02 07:01:01.578273845 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText: error
SeverityNumber: Error(17)
Body: Str({"level":"error","message":"payment gateway timeout","order_id":"ord_8821","duration_ms":5023,"trace_id":"a3ce929d0e0e47364bf92f3577b34da6","span_id":"4bf92f3577b34da6"})
Attributes:
     -> log.file.name: Str(app.log)
     -> level: Str(error)
     -> message: Str(payment gateway timeout)
     -> order_id: Str(ord_8821)
     -> duration_ms: Double(5023)
Trace ID: a3ce929d0e0e47364bf92f3577b34da6
Span ID: 4bf92f3577b34da6
Flags: 0
```

Three things happened here that no other processor could have done in combination: the JSON body was parsed and its fields became queryable attributes; the severity fields were populated so backends can filter and alert by log level; and the trace and span IDs were moved from plain strings in the body into the appropriate OTLP fields, which means this log record is now automatically correlated with its trace in any OpenTelemetry-native backend.

A few things are worth understanding up front, because they introduce patterns you'll use throughout this guide:

  1. The group-level conditions block acts as a gate: the entire group of statements only runs for records whose body looks like a JSON object. Non-JSON logs pass through untouched with no errors.

  2. ParseJSON() deserializes the body into an OTTL map value. merge_maps() writes that map into log.cache, a temporary scratch space that exists only for the duration of statement evaluation. You can then pull individual fields out of log.cache and place them where they belong.

  3. The log.trace_id.string accessor is specific to OTTL's type system. The top-level trace_id field expects bytes internally, and .string tells OTTL to accept a hex string and convert it automatically. Without it, the assignment silently fails even in ignore mode because the types don't align.

Now that you've seen the transform processor in action, it's worth pausing on a setting that will save you from a lot of silent data loss before you go any further: error_mode.

Understanding and configuring error modes

The error_mode setting is the first thing to get right, because its default will surprise you.

The processor-level default is propagate, which means if any OTTL statement encounters a type error, a missing function argument, or tries to access a path that doesn't exist on a particular record, the entire batch is dropped and the error is returned up the pipeline.

In development this is useful for catching problems early. In production it is an outage waiting to happen, because real-world telemetry is messy and a single malformed record will cause every record in the batch to be discarded.

The three available modes are:

  • ignore: Logs the error and moves on to the next statement. This is almost always what you want in production.
  • silent: Ignores the statement that caused the error and moves on without logging. Use this when you've confirmed a statement will fail for some records by design and you don't want the noise.
  • propagate: Returns the error up the pipeline, dropping the payload. Use this only in development or when a transformation failure should be treated as a pipeline failure.

You can set ignore at the top level and override it per statement group when you need stricter behavior for a specific critical transformation:

```yaml
transform:
  error_mode: ignore # safe default for most statements
  log_statements:
    - error_mode: propagate # this group must succeed or the batch is dropped
      statements:
        - set(log.attributes["account_id"], log.attributes["required_account_id"])
    - statements: # inherits top-level "ignore"
        - set(log.attributes["region"], resource.attributes["cloud.region"])
```

To make this concrete, add the following statement to the demo configuration from the quick start (after merge_maps()). It attempts to call Split() on duration_ms — a numeric field that only exists on one of the five log records, and even when it does exist, Split() only accepts strings:

```yaml
- set(log.attributes["duration_parts"], Split(log.cache["duration_ms"], ","))
```

Run the demo with docker compose up -d --force-recreate and you'll see a "failed to execute statement" log in the Collector logs:

```text
2026-03-02T07:24:36.761Z warn ottl@v0.146.0/parser.go:410 failed to execute statement {
  "resource": {
    "service.instance.id": "ea76a497-ab58-4aca-8059-46fb486781ed",
    "service.name": "otelcol-contrib",
    "service.version": "0.146.1"
  },
  "otelcol.component.id": "transform",
  "otelcol.component.kind": "processor",
  "otelcol.pipeline.id": "logs",
  "otelcol.signal": "logs",
  "error": "expected string but got nil",
  "statement": "set(log.attributes[\"duration_parts\"], Split(log.cache[\"duration_ms\"], \",\"))"
}
```

Two things are worth noting here. First, the Collector logged the error and moved on — all the statements before the bad one already ran, and all the statements after it continue to run normally on that same record.

Second, the error message tells you exactly which statement failed and why, which makes ignore genuinely useful for diagnosing problems in addition to being the safe production default.

Switch error_mode to propagate and recreate the Collector instance. This time no records make it through at all: the Collector returns the error upstream and the debug exporter shows nothing. You'll see the following error instead, accompanied by a stack trace:

```text
2026-03-02T07:37:54.301Z error logs/processor.go:62 failed processing logs {
  "resource": {
    "service.instance.id": "ad4bcf8d-101b-4f79-a6c5-7fe95cbbb51e",
    "service.name": "otelcol-contrib",
    "service.version": "0.146.1"
  },
  "otelcol.component.id": "transform",
  "otelcol.component.kind": "processor",
  "otelcol.pipeline.id": "logs",
  "otelcol.signal": "logs",
  "error": "failed to execute statement: set(log.attributes[\"duration_parts\"], Split(log.cache[\"duration_ms\"], \",\")), expected string but got nil"
}
```

That's the behavior you want in a controlled test environment when you need to be certain every statement is valid, and the behavior you want to avoid in production where imperfect data is often a fact of life.

Set error_mode to silent and the failures disappear from the logs entirely. Reserve this for statements you've already validated with ignore and confirmed will fail on certain record shapes by design — not as a way to hide problems you haven't investigated yet.

The path system: how to address your data

Every OTTL statement operates on paths — dot-separated identifiers that point to fields in the telemetry data model. Understanding the path prefix system is the single most important thing to internalize before writing any non-trivial statements.

Each signal type exposes a specific set of path prefixes:

| Signal | Available path prefixes |
| --- | --- |
| trace_statements | resource, scope, span, spanevent |
| metric_statements | resource, scope, metric, datapoint |
| log_statements | resource, scope, log |

You cannot mix prefixes across signals. Using span.attributes in a log_statements block or datapoint.attributes in a trace_statements block will result in a parse error, and the Collector will refuse to start with an invalid configuration.
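As a quick sketch of valid versus invalid prefixes (the env attribute here is just an example):

```yaml
log_statements:
  # Valid: both "log" and "resource" are available in log_statements
  - set(log.attributes["env"], resource.attributes["deployment.environment"])
  # Invalid: "span" is not a log context; uncommenting this fails at startup
  # - set(span.attributes["env"], "prod")
```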

Within each prefix, you address fields using the OTLP data model:

```yaml
# Common paths for logs
log.body                    # the log body (any OTTL value type)
log.severity_number         # numeric severity (SeverityNumber enum)
log.severity_text           # string severity
log.attributes["key"]       # a specific log attribute
log.trace_id                # the trace ID bytes
log.span_id                 # the span ID bytes

# Common paths for spans
span.name                   # span name
span.kind                   # SpanKind enum
span.attributes["key"]      # a specific span attribute
span.status.code            # StatusCode enum
span.start_time             # start timestamp
span.end_time               # end timestamp

# Common paths for metrics
metric.name                 # metric name string
metric.description          # metric description string
metric.unit                 # metric unit string
metric.type                 # MetricDataType enum
datapoint.attributes["key"] # a datapoint attribute
datapoint.value             # the numeric value (gauge/sum)

# Resource and scope paths (available across all signals)
resource.attributes["key"]
scope.name
scope.version
scope.attributes["key"]
```

Attribute access uses bracket notation with a string key: log.attributes["http.status_code"]. If the key doesn't exist, accessing it returns nil rather than erroring — which is useful for nil-checks in where clauses.
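That nil behavior enables guard patterns like the following, a sketch that assumes an integer http.status_code attribute is present on some spans:

```yaml
trace_statements:
  # The nil check keeps the numeric comparison from erroring
  # on spans that never recorded an HTTP status code
  - set(span.attributes["status_class"], "5xx") where span.attributes["http.status_code"] != nil and span.attributes["http.status_code"] >= 500
```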

OTTL in the transform processor

The transform processor is powered by the OpenTelemetry Transformation Language (OTTL) — a purpose-built expression language for manipulating telemetry in-flight.

If you're new to OTTL, the dedicated OTTL guide covers the syntax, path expressions, operators, and function library in full. This section focuses only on the two concepts that are specific to how the transform processor uses OTTL: the cache field and context inference.

The cache field

Every record processed by the transform processor gets a cache field — a temporary map that exists only for the duration of that record's statement evaluation and is discarded afterwards. It is OTTL's scratch space, and it exists because some transformations can't be expressed in a single statement.

The most common use is JSON parsing, where you need to materialize the parsed structure before you can extract individual fields from it:

```yaml
log_statements:
  - merge_maps(log.cache, ParseJSON(log.body), "upsert")
  - set(log.attributes["level"], log.cache["level"])
  - set(log.attributes["request_id"], log.cache["request_id"])
```

You can also use it to hold intermediate computed values that multiple subsequent statements depend on, keeping your logic readable without repeating expressions.
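For example, a derived value can be staged once in the cache and reused by later statements. This is a sketch assuming a numeric duration_ms attribute exists on some records:

```yaml
log_statements:
  # Compute once into scratch space...
  - set(log.cache["latency_s"], log.attributes["duration_ms"] / 1000.0) where log.attributes["duration_ms"] != nil
  # ...then reuse it without repeating the arithmetic
  - set(log.attributes["latency_s"], log.cache["latency_s"]) where log.cache["latency_s"] != nil
  - set(log.attributes["slow"], true) where log.cache["latency_s"] != nil and log.cache["latency_s"] > 1.0
```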

Context inference

The transform processor doesn't execute all statements at the same level of the telemetry hierarchy. It infers the correct OTTL context — resource, scope, span, spanevent, metric, or datapoint — from the path prefixes present in your statements, then iterates over the data at that level.

This inference is automatic and transparent in most cases. It becomes relevant when you mix paths from incompatible contexts in the same statement group. For example, convert_sum_to_gauge() only works at the metric context level, while datapoint.attributes requires the datapoint context.

```yaml
# This will fail — conflicting contexts in one group
metric_statements:
  - convert_sum_to_gauge() where metric.name == "process.cpu.time"
  - limit(datapoint.attributes, 10, ["host.name"])
```

Combining them in one group produces a parse error at startup:

```text
Error: invalid configuration: processors::transform: unable to infer a valid context (["resource" "scope" "metric" "datapoint"]) from statements ["convert_sum_to_gauge() where metric.name == \"process.cpu.time\"" "limit(datapoint.attributes, 10, [\"host.name\"])"] and conditions []: inferred context "datapoint" does not support the function "convert_sum_to_gauge"
```

The fix is to separate them into distinct groups so each can be inferred independently:

```yaml
metric_statements:
  - statements:
      - convert_sum_to_gauge() where metric.name == "process.cpu.time"
  - statements:
      - limit(datapoint.attributes, 10, ["host.name"])
```

If the Collector refuses to start with a context-related parse error, this is almost always the cause.

Configuring statement groups

The basic configuration style is a flat list of statements. This works well for simple pipelines but becomes unwieldy when you want to apply different error modes or reuse conditions across multiple statements. The advanced style introduces statement groups, which are objects with their own context, error_mode, and conditions fields:

```yaml
transform:
  error_mode: ignore
  log_statements:
    - conditions:
        - IsMatch(log.body, "^\\{") # Only process JSON-looking bodies
      statements:
        - merge_maps(log.cache, ParseJSON(log.body), "upsert")
        - set(log.attributes["parsed.level"], log.cache["level"])
        - set(log.attributes["parsed.message"], log.cache["message"])
        - set(log.attributes["parsed.timestamp"], log.cache["timestamp"])
    - conditions:
        - log.severity_number == SEVERITY_NUMBER_UNSPECIFIED
      statements:
        - set(log.severity_number, SEVERITY_NUMBER_INFO) where IsMatch(log.body, "\\sINFO[:\\s]")
        - set(log.severity_number, SEVERITY_NUMBER_WARN) where IsMatch(log.body, "\\sWARN(ING)?[:\\s]")
        - set(log.severity_number, SEVERITY_NUMBER_ERROR) where IsMatch(log.body, "\\sERROR[:\\s]")
```

There are two important things to understand about the conditions field:

First, conditions within a group are ORed together. If you list three conditions, the group's statements run if any one of them is true. For AND logic, use a single condition with OTTL's and operator, or chain multiple where clauses directly on individual statements.

Second, the group-level conditions are evaluated first, and then each statement's own where clause is evaluated independently. A record must pass the group condition to enter the group at all, and then each individual statement applies its own guard on top of that.
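Put together, OR at the group level versus AND inside a single condition looks like this (the attribute names are illustrative):

```yaml
log_statements:
  # OR: the group runs if severity is "error" OR "fatal"
  - conditions:
      - log.severity_text == "error"
      - log.severity_text == "fatal"
    statements:
      - set(log.attributes["alert"], true)
  # AND: a single condition using OTTL's "and" operator
  - conditions:
      - log.severity_text == "error" and resource.attributes["deployment.environment"] == "production"
    statements:
      - set(log.attributes["page_oncall"], true)
```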

The flatten_data option for log transformations

When you move attributes from log records up to the resource level, you can run into a subtle problem. Multiple log records in the same batch may have different values for the attribute you're promoting — say, kubernetes.pod.name differs between records. But all records under a single resource share that resource. If you promote a per-record attribute to the resource, you'll get the value from whichever record happened to be processed last, silently losing the others.

The flatten_data option addresses this by giving each log record a distinct copy of its resource and scope before transformation, then regrouping them afterwards:

```yaml
transform:
  flatten_data: true
  log_statements:
    - set(resource.attributes["pod_name"], log.attributes["kubernetes.pod.name"])
    - delete_key(log.attributes, "kubernetes.pod.name")
```

With flatten_data: true, records that produce different resource attribute values after transformation end up in separate resource groups — correctly preserving the per-record data.

This option carries a performance cost proportional to the number of unique resource/scope combinations in your batches. Enable it only when your transformations genuinely require per-record resource isolation, and enable the transform.flatten.logs feature gate at startup:

```bash
./otelcol --config config.yaml --feature-gates=transform.flatten.logs
```

Debugging OTTL statements

When a statement appears to do nothing — no error, no change — there are a few likely causes: the condition is evaluating to false, the path doesn't exist on the records you think it does, or there's a type mismatch that error_mode: ignore is silently swallowing.

Enable debug logging

The transform processor emits detailed debug logs that show the full TransformContext before and after each statement, including the exact value of every field. Enable it by setting the Collector's log level to debug:

```yaml
service:
  telemetry:
    logs:
      level: debug
```

You'll see output like this for every statement on every record:

```text
debug ottl/parser.go TransformContext after statement execution
{
  "statement": "set(log.attributes[\"environment\"], \"production\")",
  "condition matched": true,
  "TransformContext": {
    "log_record": {
      "attributes": {
        "environment": "production"
      }
    }
  }
}
```

The "condition matched": true/false field is especially useful — if you see false, your where clause is the problem. If you see true but the field isn't changing, there's a type issue in the statement itself.

This output is extremely verbose in any production-volume environment. Use it on a test Collector with a controlled data source, and turn it off before deploying.

Use the debug exporter to verify results

Pair the transform processor with the debug exporter using chained pipelines to see a before-and-after view of your transformations:

```yaml
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
  otlp/internal:
    protocols:
      grpc:
        endpoint: 127.0.0.1:4316

processors:
  transform:
    error_mode: ignore
    log_statements:
      - set(log.attributes["environment"], "production")

exporters:
  debug/before:
    verbosity: detailed
  debug/after:
    verbosity: detailed
  otlp/internal:
    endpoint: 127.0.0.1:4316
    tls:
      insecure: true

service:
  pipelines:
    logs/raw:
      receivers: [otlp]
      exporters: [debug/before, otlp/internal]
    logs/transformed:
      receivers: [otlp/internal]
      processors: [transform]
      exporters: [debug/after]
```

Compare the two outputs and the change should be immediately visible. If the before and after are identical, your statement isn't matching anything — which narrows the problem down to the condition or the path.

Common failure modes

Statement silently does nothing: Check error_mode. If it's ignore or silent, a failed type conversion or an invalid path won't surface as an error. Temporarily switch to propagate to force failures to be visible.

Condition never matches: Print the raw value of the field you're testing against using the debug exporter at detailed verbosity. Type mismatches in conditions are the most frequent culprit — comparing an integer attribute to a string literal will never match, for example.

Metric statements conflict on context: If the Collector refuses to start with a context inference error, separate your statements into distinct groups so each group contains only paths from a single context level.

set(log.trace_id, ...) does nothing: The top-level trace context fields have strict type requirements. Use log.trace_id.string for hex string input or log.trace_id.bytes for raw bytes. The plain log.trace_id path expects a specific internal type that you generally can't construct directly from an attribute value.
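A minimal corrective pattern, assuming the ID arrived as a hex string in a hypothetical trace_id attribute:

```yaml
log_statements:
  # .string accepts a hex string and converts it to the internal bytes type
  - set(log.trace_id.string, log.attributes["trace_id"]) where log.attributes["trace_id"] != nil
  # Remove the now-redundant attribute copy
  - delete_key(log.attributes, "trace_id")
```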

copy_metric matches the copied metric: When you copy a metric, the new metric is appended to the same list and gets evaluated by all subsequent statements. Always use a where clause that won't match the new metric's name. If you forget this, you can end up in an unintended transformation loop within the same batch.

Warnings and things to be careful about

The official documentation flags several categories of risk worth taking seriously.

Unsound metric type conversions: Functions like convert_gauge_to_sum and convert_sum_to_gauge perform transformations that have no canonical definition in the OpenTelemetry data model specification. You're asserting semantics that the data may not actually have. If a Gauge represents an instantaneous measurement that isn't cumulative, converting it to a cumulative Sum is factually wrong — and backends may process it incorrectly as a result. Know your data before converting types.

Metric identity conflicts: Changing metric.name, removing datapoint attributes, or adding new dimensions can alter the identity of a metric from your backend's perspective. This can cause a metric stream to disappear and a new one to appear, breaking alerting rules and dashboards that reference the original name or label set. Be especially cautious when reducing attributes — adding attributes is safe, removing them is not.

Orphaned telemetry: Modifying span.trace_id, span.span_id, span.parent_span_id, log.trace_id, or log.span_id can produce spans that no longer connect to their parent or children, and logs that no longer correlate to traces. This is almost never what you want. If you're promoting these fields from attributes into the correct top-level positions (as in the trace context repair pattern above), the original attribute values should still be consistent — but modifying the actual IDs to something else will break your traces.

Best practices

Default to error_mode: ignore in production. Real telemetry data is never perfectly clean. A single record without an expected attribute will cause propagate to drop the entire batch. Use propagate in testing, ignore in production.

Keep statement lists short and focused. A transform processor with 40 statements is difficult to reason about and difficult to debug. If you find yourself writing many unrelated transformations in one processor, split them into multiple named instances — transform/normalize, transform/enrich, transform/redact — each with a clear purpose. The Collector supports multiple instances of the same processor type with the /name suffix.
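Sketched out, that split might look like this, with each named instance owning one concern (the statements themselves are illustrative):

```yaml
processors:
  transform/normalize:
    error_mode: ignore
    log_statements:
      - set(log.severity_text, "error") where log.severity_number == SEVERITY_NUMBER_ERROR
  transform/redact:
    error_mode: ignore
    log_statements:
      - replace_pattern(log.attributes["message"], "card=\\d+", "card=***") where log.attributes["message"] != nil

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [transform/normalize, transform/redact]
      exporters: [debug]
```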

Use where clauses aggressively. Every statement without a condition runs on every record. In a high-throughput pipeline, unconditional regex matching across every log body adds up. Narrow your statements to only the records that need them.
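For instance, an expensive redaction regex can be narrowed so it only runs on records that can actually contain the pattern (the paths here are illustrative):

```yaml
trace_statements:
  # Skip the regex entirely for spans without a URL attribute
  - replace_pattern(span.attributes["url.full"], "password=[^&]+", "password=***") where span.attributes["url.full"] != nil
```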

Order matters — think carefully about dependencies. Statements execute sequentially. If statement B depends on the output of statement A (for example, reading an attribute that statement A just set), it must come after A. If statement B reads an attribute that statement C will delete, it must come before C. Draw out your dependency graph before writing the configuration.

Validate transformations with the debug exporter before deploying. This is especially important for metric transformations, where an unexpected change to metric identity can silently break your observability for however long it takes someone to notice the dashboards are wrong.

Be explicit about nil checks. Accessing a missing attribute returns nil. Passing nil to a function that doesn't expect it will produce an error. Writing where span.attributes["optional_key"] != nil before operating on optional fields prevents a large class of silent failures.

Final thoughts

The transform processor is the most versatile tool in the Collector's processor ecosystem — and the one that most rewards careful study. OTTL's expression model is consistent once you internalize the path prefix system and the context hierarchy, and the library of available functions covers an impressive range of real-world needs without requiring you to write a custom plugin.

The patterns in this guide — JSON parsing, trace context repair, attribute level promotion, severity extraction, metric type normalization — cover the majority of production use cases. Once you're comfortable with them, the OTTL function reference opens up a wide range of more specialized transformations.

Before data reaches your observability backend, the transform processor gives you a final opportunity to ensure it's clean, correctly structured, and genuinely useful. Paired with a platform like Dash0 that understands the full OpenTelemetry data model, well-shaped telemetry translates directly into faster debugging, more reliable alerting, and clearer insights across your systems.

Try Dash0 for free and see what your telemetry looks like when it's processed right.

Authors
Ayooluwa Isaiah