Last updated: September 22, 2025
OpenTelemetry Logging Explained: Data Model, Bridges, and Best Practices
Logs have always been central to understanding how software behaves. They record events, errors, and state changes that reveal what was happening inside a system at a given moment.
But as architectures have shifted toward microservices and distributed systems, the old model of siloed, free-form log files has become impossible to work with. With each service speaking its own logging dialect, troubleshooting is like trying to read a book where every page is in a different language.
OpenTelemetry was created to solve exactly this kind of fragmentation. By defining a common log data model and a language-neutral API and SDK, OTel provides a standard way to represent, process, and export logs.
The goal is not to replace your existing logging libraries, but to make their output interoperable, consistent, and enriched with the same contextual information that powers traces and metrics.
This unification has important consequences. Once logs are expressed in the OpenTelemetry model, they can be processed in the same pipelines as metrics and traces, enriched with consistent resource metadata, and correlated with distributed traces. Instead of isolated fragments of text, logs become structured events that participate in the full observability story.
This article breaks down the OpenTelemetry approach to logging, from its data model and SDK to the crucial role of log bridges. I'll also show you why the seamless correlation between logs and traces is a game-changer for debugging complex service interactions.
Let's begin!
Understanding the OpenTelemetry logs data model
At the heart of OpenTelemetry's logging capabilities is its logs data model, a stable specification that defines what a log record is and how it should be represented.
Its purpose is to create a standardized, vendor-neutral structure that can represent logs from diverse sources and frameworks, including application log files, system logs, and machine-generated events.
It is flexible enough to map existing formats into the model without ambiguity, and in many cases, to reconstruct the original format without loss of information.
An OpenTelemetry log record is composed of several core fields with defined semantics, alongside flexible attributes for contextual data. The most important fields include:
- `Timestamp`: The time when the event originally occurred at the source.
- `ObservedTimestamp`: The time when the log was collected by an OpenTelemetry system. For logs generated directly by an OTel SDK, this is typically the same as the `Timestamp`.
- `Body`: The main content of the log message. This can be a simple string or structured JSON, but it is best treated as the human-readable message.
- `Attributes`: A set of key-value pairs that provides additional, machine-readable contextual information about a specific log event.
- `Resource`: Describes the entity that produced the log, such as the application, host, or Kubernetes pod. All logs from the same entity share the same resource attributes.
To enable trace–log correlation, the model also incorporates fields from the W3C Trace Context specification:
- `TraceId`: The unique identifier for a distributed trace.
- `SpanId`: The identifier for a specific span (operation) within that trace.
- `TraceFlags`: Flags providing metadata about the trace, such as whether it was sampled.
When these fields are present, a single log message can be linked directly to its broader distributed trace to accelerate debugging in microservice environments.
The data model also standardizes how log severity is represented:
- `SeverityText`: The original string representation of the log level, as it appeared at the source.
- `SeverityNumber`: A numeric value that enables consistent comparison and filtering across systems. Smaller numbers represent less severe events, larger numbers more severe ones.
| SeverityNumber Range | Category |
| --- | --- |
| 1–4 | TRACE |
| 5–8 | DEBUG |
| 9–12 | INFO |
| 13–16 | WARN |
| 17–20 | ERROR |
| 21–24 | FATAL |
In OTLP JSON, a log record looks like this:
```json
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": { "stringValue": "checkoutservice" }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": { "name": "logback", "version": "1.4.0" },
          "logRecords": [
            {
              "timeUnixNano": "1756571696706248000",
              "observedTimeUnixNano": "1756571696710000000",
              "severityNumber": 17,
              "severityText": "ERROR",
              "body": { "stringValue": "Database connection failed" },
              "attributes": [
                { "key": "thread.id", "value": { "intValue": 42 } },
                {
                  "key": "exception.type",
                  "value": { "stringValue": "SQLException" }
                }
              ],
              "traceId": "da5b97cecb0fe7457507a876944b3cf",
              "spanId": "fa7f0ea9cb73614c"
            }
          ]
        }
      ]
    }
  ]
}
```
At the top level, `resourceLogs` groups all logs that originate from the same `Resource`. In this case, the `service.name` attribute identifies the specific microservice and ensures that all logs from this service are bundled together.
Within a resource, `scopeLogs` further groups logs by the instrumentation scope. This identifies the specific library or module that generated the log. Here, the scope tells us the log was emitted by version 1.4.0 of the Logback library.
Finally, the `logRecords` array contains the individual log entries, each with its timestamps, severity, body, attributes, and (when available) trace context. In this snippet, there is a single record describing a database error, complete with a `traceId` and `spanId` that allow you to jump directly to the trace where the error occurred.
By defining a consistent data model, OpenTelemetry makes it possible to represent logs from any source in a uniform way.
But having a model alone isn't enough. You also need a way to generate logs that already conform to it. That's where application instrumentation comes in.
Bringing your application logs into OpenTelemetry
Unlike traces and metrics, where OpenTelemetry introduces its own API that you must call directly, logging follows a different approach. Because there is a long history of diverse logging frameworks and standards, OTel is designed to integrate with existing libraries rather than replace them.
Logs flow into the OpenTelemetry ecosystem through bridges: adapters that forward records from familiar libraries like Python's logging, Java's SLF4J/Logback, or .NET's Serilog.
This design means you can keep your existing logging code and tooling, while still gaining the benefits of the OpenTelemetry log data model, correlation with traces, and consistent export to observability backends.
Understanding the Logs API and SDK
The Logs API defines the contract for passing log records into the OpenTelemetry pipeline. It's primarily intended for library authors building appenders or handlers, but it can also be called directly from instrumentation libraries or application code. Its core components are listed below, followed by a short sketch of what calling the API directly looks like:
- `LoggerProvider`: Creates and manages `Logger` instances. Typically, one is configured per process and registered globally for consistent access.
- `Logger`: A `Logger` is responsible for emitting logs as `LogRecord`s. In practice, your existing logging library (via a bridge) will call it for you.
- `LogRecord`: The data structure representing a single log event, with all the fields defined in the log data model.
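Most applications never call this API directly, but it helps to see what a bridge does under the hood. Here is a minimal sketch using the `@opentelemetry/api-logs` package; the logger name and attributes are illustrative:

```javascript
import { logs, SeverityNumber } from "@opentelemetry/api-logs";

// Obtain a Logger from the globally registered LoggerProvider
// (a no-op provider is used until an SDK registers a real one).
const logger = logs.getLogger("checkout-instrumentation", "1.0.0");

// Emit a LogRecord with the fields defined by the log data model.
logger.emit({
  severityNumber: SeverityNumber.ERROR,
  severityText: "ERROR",
  body: "Database connection failed",
  attributes: { "exception.type": "SQLException" },
});
```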
While the Logs API defines how logs are created, the Logs SDK is responsible for processing and exporting them. It provides:
- A concrete `LoggerProvider` implementation.
- A `LogRecordProcessor` that sits between log creation and export and is responsible for enrichment, filtering/transforming, and batching of `LogRecord`s.
- A `LogRecordExporter` that takes processed records and exports them to destinations such as the console or an OTLP endpoint.
Where log bridges fit in
It's important to note that the Logs SDK does not automatically capture your application's logs. It provides the pipeline, but log records must be fed into it by a bridge.
A log bridge is an adapter (sometimes called a handler, appender, or transport) that connects your existing logging framework to the OpenTelemetry Logs API. Instead of rewriting your application to use OTel loggers directly, you only need to attach a bridge to the logger you already use.
For example, consider a Node.js application using Pino:
```javascript
import pino from "pino";

const logger = pino();
logger.info("hi");
```
By default, Pino produces JSON logs like this:
```json
{
  "level": 30,
  "time": 1758515262941,
  "pid": 55904,
  "hostname": "falcon",
  "msg": "hi"
}
```
To bring these logs into an OpenTelemetry pipeline, you must configure the OpenTelemetry SDK, register a `LogRecordProcessor` and `LogRecordExporter`, and include the Pino log bridge via the `@opentelemetry/instrumentation-pino` package:
```javascript
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.SimpleLogRecordProcessor(
    new logs.ConsoleLogRecordExporter(),
  ),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();

const logger = pino();
logger.info("hi");
```
The `SimpleLogRecordProcessor` immediately exports each log, which is useful for development and debugging. In production, you'd typically replace it with a `BatchLogRecordProcessor` (to reduce network overhead) and swap the `ConsoleLogRecordExporter` for an `OTLPLogExporter` that streams logs to the Collector:
```javascript
import { OTLPLogExporter } from "@opentelemetry/exporter-logs-otlp-http";
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.BatchLogRecordProcessor(new OTLPLogExporter()),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();
```
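On the Collector side, a minimal configuration that accepts these logs over OTLP and prints them with the debug exporter might look like the following sketch (the endpoint is the default OTLP/HTTP port):

```yaml
receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [otlp]
      exporters: [debug]
```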
This shows up in your Collector pipeline as follows, assuming you're using the debug exporter:
```text
2025-09-22T05:31:27.964Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(falcon)
     -> host.arch: Str(amd64)
     -> host.id: Str(4a3dc42bf0564d50807d1553f485552a)
     -> process.pid: Int(59532)
     -> process.executable.name: Str(node)
     -> process.executable.path: Str(/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node)
     -> process.command_args: Slice(["/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node","--experimental-loader=@opentelemetry/instrumentation/hook.mjs","/home/ayo/dev/dash0/repro-contrib-2838/index.js"])
     -> process.runtime.version: Str(24.8.0)
     -> process.runtime.name: Str(nodejs)
     -> process.runtime.description: Str(Node.js)
     -> process.command: Str(/home/ayo/dev/dash0/repro-contrib-2838/index.js)
     -> process.owner: Str(ayo)
     -> service.name: Str(unknown_service:node)
     -> telemetry.sdk.language: Str(nodejs)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(2.0.1)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope @opentelemetry/instrumentation-pino 0.49.0
LogRecord #0
ObservedTimestamp: 2025-09-22 05:31:27.924 +0000 UTC
Timestamp: 2025-09-22 05:31:27.924 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(hi)
Trace ID:
Span ID:
Flags: 0
```
This output demonstrates how the log bridge and the Logs SDK work together:
- The raw Pino record has been translated into an OpenTelemetry `LogRecord`.
- Resource attributes are automatically populated to identify where the log came from.
- Pino's `level: 30` (info) is mapped to `severityText: "info"` and `severityNumber: 9`.
- The `body` field carries the original human-readable log message.
- The `instrumentationScope` identifies that the log was captured via the Pino bridge.
- Trace correlation fields (`traceId`, `spanId`, `traceFlags`) are present but unset in this example.
While this is the standard approach, be aware that the availability and maturity of these bridges vary by language and framework. Check your language's OpenTelemetry documentation to see what's supported and how to configure it.
Correlating logs and traces
OpenTelemetry offers a capability that structured logging alone cannot: when you use the OTel SDK for both tracing and logging, it automatically correlates the two.
To see this in action, your service must emit logs within an active trace span. Whenever that happens, the SDK automatically attaches the current trace and span identifiers to each log record.
In most cases, you will rely on zero-code instrumentation to create spans around common operations such as HTTP requests or database calls, but you can also create spans manually:
```javascript
import { api, logs, NodeSDK } from "@opentelemetry/sdk-node";

const tracer = api.trace.getTracer("example");

tracer.startActiveSpan("manual-span", (span) => {
  logger.info("in a span");
  span.end();
});
```
The resulting log record now includes the active trace context:
```text
LogRecord #1
ObservedTimestamp: 2025-09-22 05:51:37.685 +0000 UTC
Timestamp: 2025-09-22 05:51:37.685 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(in a span)
Trace ID: 6691c3b82c157705904ba3b5b921d60a
Span ID: 72efdc9ec81b179a
Flags: 1
```
This correlation creates a two-way street for debugging. From a trace, you can jump directly to the logs that occurred within its spans; from a log, you can pivot back to the full distributed trace that produced it.
This is the core value proposition of using the complete OpenTelemetry ecosystem. It elevates logs from a simple record of events to a deeply contextualized part of a larger observability narrative.
When a log bridge isn't available
Not every framework has a bridge today. If your logging library lacks one, you don’t need to rewrite your code. Instead, enrich your logs with trace context and let the Collector do the mapping.
Most libraries let you inject fields into log output. By adding `trace_id`, `span_id`, and `trace_flags` to each log record, you'll ensure logs can be correlated later:
```json
{
  "level": "ERROR",
  "timestamp": "2025-10-05T15:34:11.428Z",
  "message": "Payment authorization failed",
  "trace_id": "c8f4a2171adf3de0a2c0b2e8f649a21f",
  "span_id": "d6e2b6c1a2f53e4b",
  "user_id": "user-1234"
}
```
Once these enriched logs reach the Collector, you can parse and map them into canonical OpenTelemetry fields, preserving correlation across logs and traces without changing your application's logging calls.
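As a sketch of that mapping, the filelog receiver's `json_parser` operator can promote the severity and trace fields from the JSON above into the corresponding log record fields (the file path and field names follow the hypothetical example):

```yaml
receivers:
  filelog:
    include: [/var/log/payments/*.log]
    operators:
      - type: json_parser
        # Map the "level" field to SeverityText/SeverityNumber.
        severity:
          parse_from: attributes.level
        # Promote the injected trace context into the top-level fields.
        trace:
          trace_id:
            parse_from: attributes.trace_id
          span_id:
            parse_from: attributes.span_id
```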
Ingesting and transforming logs with the OpenTelemetry Collector
Up to this point, we've been looking at how applications can emit OpenTelemetry-native logs directly via the SDK and log bridges. But not every system in your stack is under your control. Legacy applications, third-party dependencies, and infrastructure components often write logs in their own formats, without any awareness of OpenTelemetry.
The OpenTelemetry Collector solves this problem. It can ingest logs from many different sources, parse them, and map them into the OpenTelemetry log data model. That way, even systems that know nothing about OTel can still participate in the same observability pipeline.
The Collector uses receivers to handle log ingestion. Each receiver supports a different input format or protocol. Some common options include the following (a minimal pipeline wiring one of them up is sketched after the list):

- filelogreceiver: for tailing local log files.
- kafkareceiver: if you already forward logs to Kafka.
- awscloudwatchreceiver: for AWS CloudWatch log groups and streams.
- syslogreceiver: for logs sent over the network via syslog.
- fluentforwardreceiver: for integration with Fluentd or Vector.
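For instance, a minimal pipeline that tails a local file with the filelog receiver and prints the resulting records with the debug exporter might look like this sketch (the file path is illustrative):

```yaml
receivers:
  filelog:
    include: [/var/log/auth.log]

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```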
Once the logs are ingested by a receiver, they will be in the OpenTelemetry log data model, but this doesn't mean that all the fields will be correctly populated. Consider this raw Linux authentication log line:
```text
Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth]
```
When this record is ingested by the `filelog` receiver, it appears as follows with the debug exporter:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 17:25:01.598645527 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
Trace ID:
Span ID:
```
Here, the entire log line is stored in the `Body`, while useful fields like `Timestamp` and `SeverityNumber` remain unset. This is typical of log receivers: they capture the raw log string and the minimal metadata that the receiver can infer directly.
The record is now technically in the OpenTelemetry log data model, but it is effectively just an unstructured string plus a small amount of context. To make the logs useful, you need to process them further so that key fields are extracted and normalized into structured attributes.
This usually involves:
- Parsing the timestamp from the raw message and placing it in the `Timestamp` field.
- Identifying severity levels and mapping them to `SeverityText` and `SeverityNumber` values.
- Extracting contextual fields from the log and storing them as structured `Attributes` using the appropriate semantic conventions.
- Enriching with environment metadata such as Kubernetes pod labels, host or process information, or cloud resource tags.
- Populating the trace context fields where available.
This is accomplished by using operators within the receiver's configuration or processors in the Collector pipeline.
For example, you can apply the `syslog_parser` operator within the `filelog` receiver to extract standard syslog fields:
```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/auth.log]
    operators:
      - type: syslog_parser
        protocol: rfc3164
        allow_skip_pri_header: true
```
This produces the following result:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 18:40:22.780865051 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
Trace ID:
Span ID:
Flags: 0
```
After applying this operator, the log record is already much improved. The `Timestamp` is now correctly parsed from the original log line, and the `Attributes` field has been enriched with the syslog header fields (hostname, appname, proc_id) and the message.
To extract more specific details, you can chain a `regex_parser` operator to parse the `message` attribute created by the previous step:
```yaml
# otelcol.yaml
- type: regex_parser
  parse_from: attributes.message
  regex: 'Received disconnect from (?P<client_ip>[\d.]+) port (?P<client_port>\d+)'
```
This operator will extract the IP address and port from the message and add them as new attributes:
```text
Attributes:
     -> client_ip: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> client_port: Str(11349)
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
```
For more complex transformations, the OpenTelemetry Transformation Language (OTTL) is the recommended tool. It's a powerful and flexible language used in the transform processor to manipulate log data.
For example, you can use OTTL to:
- Conditionally set `SeverityNumber` and `SeverityText` based on the content of the log message.
- Restructure or clean up the `Body` and `Attributes` of a log record.
- Conform attribute names to the OpenTelemetry semantic conventions.
- Set top-level fields like `Trace ID` and `Span ID` from attributes.
```yaml
processors:
  transform/auth_logs:
    error_mode: ignore
    log_statements:
      # Move host and process attributes to the Resource
      - set(resource.attributes["host.name"], log.attributes["hostname"])
      - set(resource.attributes["process.executable.name"], log.attributes["appname"])
      - set(resource.attributes["process.pid"], Int(log.attributes["proc_id"]))

      # Conform attributes to semantic conventions
      - set(log.attributes["client.address"], log.attributes["client_ip"])
      - set(log.attributes["client.port"], Int(log.attributes["client_port"]))
      - set(log.attributes["log.record.original"], log.body)
      - set(log.body, log.attributes["message"])

      # Severity mapping
      - set(log.severity_number, SEVERITY_NUMBER_INFO) where IsMatch(log.body, "^Received disconnect")
      - set(log.severity_text, "INFO") where log.severity_number >= SEVERITY_NUMBER_INFO and log.severity_number <= SEVERITY_NUMBER_INFO4

      # Delete the old, non-compliant attributes
      - delete_key(log.attributes, "hostname")
      - delete_key(log.attributes, "appname")
      - delete_key(log.attributes, "proc_id")
      - delete_key(log.attributes, "client_ip")
      - delete_key(log.attributes, "client_port")
      - delete_key(log.attributes, "message")
```
This results in the following output:
```text
2025-09-22T03:14:29.229Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(ubuntu-lts)
     -> process.executable.name: Str(sshd)
     -> process.pid: Int(47339)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope
. . .
LogRecord #2
ObservedTimestamp: 2025-09-22 03:14:29.130188792 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> client.port: Int(11349)
     -> client.address: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> log.record.original: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Trace ID:
Span ID:
Flags: 0
```
By building a pipeline of these operators and processors, you can systematically convert any raw log entry into a fully populated, structured record that conforms to the OpenTelemetry data model.
Best practices for OpenTelemetry logging
Rolling out OpenTelemetry logging in production requires more than just turning on a bridge or deploying a Collector. To get reliable, cost-effective, and actionable logs, keep these practices in mind:
1. Start with structure
When logs are unstructured, like the raw `sshd` example, you end up stacking operators and transform rules just to pull out basics like the timestamp or client IP.
That effort disappears if your application emits structured logs from the start. With structure in place, bridges and the Collector can pass fields straight into the OTel model without guesswork.
The lesson is clear: save parsers and regex operators for legacy systems you can’t change, and let new services output structured logs from day one.
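For example, with Pino you can attach structured fields to each call instead of interpolating them into the message string. This is a sketch; the field names are illustrative:

```javascript
import pino from "pino";

const logger = pino();

// Structured fields become attributes a bridge can forward as-is,
// instead of details buried inside the message text.
logger.error(
  { "user.id": "user-1234", order_id: "ord-5678", payment_provider: "stripe" },
  "Payment authorization failed",
);
```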
2. Embrace high-cardinality attributes
High-cardinality fields in logs and traces are essential for OpenTelemetry’s cross-signal correlation and per-tenant/root-cause analysis.
They allow you to slice observability data along high-dimensional axes so you can filter and correlate millions of datapoints down to the exact parameters responsible for a problem.
While many observability backends discourage you from using them by charging per GB, they are exactly what you need for precise root cause identification.
Therefore, choose an OpenTelemetry-native backend that supports high cardinality and high dimensionality without penalty, so you can keep the fidelity needed for deep debugging and rich analytics.
3. Use semantic conventions
For contextual log fields, rely on OpenTelemetry's semantic conventions wherever possible. This ensures your logs can be understood by any OTel-compliant backend and align with traces and metrics. It also spares you the cleanup step we saw in the Collector, where attributes had to be renamed and normalized after parsing.
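As a sketch of what this looks like with the Pino setup from earlier, the attribute keys below come from the stable HTTP and client semantic conventions rather than ad-hoc names:

```javascript
import pino from "pino";

const logger = pino();

// Attribute keys follow OpenTelemetry semantic conventions
// rather than ad-hoc names like "method" or "status".
logger.info(
  {
    "http.request.method": "POST",
    "http.response.status_code": 502,
    "url.path": "/api/checkout",
    "client.address": "203.0.113.7",
  },
  "Upstream request failed",
);
```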
4. Always include resource attributes
Resource attributes provide the anchor that ties logs to services and
environments. Ensure to set them in the SDK Resource
or enrich them in the
Collector with processors like
resourcedetection
or
k8sattributes.
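A sketch of the Collector side, assuming the contrib distribution and the `otlp` receiver and `debug` exporter from the earlier examples:

```yaml
processors:
  # Detect host and environment metadata and attach it as resource attributes.
  resourcedetection:
    detectors: [env, system]
  # Attach Kubernetes pod, namespace, and node metadata when running in a cluster.
  k8sattributes: {}

service:
  pipelines:
    logs:
      receivers: [otlp]
      processors: [k8sattributes, resourcedetection]
      exporters: [debug]
```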
5. Scrub sensitive data
Logs often capture more than they should: tokens, passwords, emails, or PII can easily slip in. Make sure these values are redacted before logs leave your environment.
Use Collector processors like transform or redaction to strip or mask sensitive fields, and configure your logging libraries to avoid writing secrets in the first place. This keeps your logs safe to share across teams and compliant with security and privacy requirements.
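A sketch of that second line of defense using the transform processor, assuming log bodies may contain `password=...` fragments and an `email` attribute you never want to export:

```yaml
processors:
  transform/scrub:
    error_mode: ignore
    log_statements:
      # Mask password values embedded in the log body.
      - replace_pattern(log.body, "password=[^\\s]+", "password=***")
      # Drop attributes that should never leave your environment.
      - delete_key(log.attributes, "email")
```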
6. Control your log volume
Production doesn't need the same firehose of `DEBUG` logs as development. Set appropriate levels in your logging framework, and use the Collector to filter or sample when necessary. This keeps costs predictable and avoids overwhelming your backend, while still preserving high-value logs tied to traces or errors.
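For example, a sketch of dropping low-severity records in the Collector with the filter processor:

```yaml
processors:
  filter/drop_debug:
    error_mode: ignore
    logs:
      log_record:
        # Drop anything below INFO (i.e. TRACE and DEBUG records).
        - 'severity_number < SEVERITY_NUMBER_INFO'
```

Keep in mind that anything dropped here is gone for good, so only filter severities you are confident you won't need.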
Final thoughts
Once your logs are normalized into the OpenTelemetry data model, the last step is getting them into the observability tools you rely on. The recommended approach is exporting to an OpenTelemetry-native backend via the OTLP exporter.
An OpenTelemetry-native backend is one that sees the OpenTelemetry Protocol as its linchpin. That means it can ingest OpenTelemetry-native traces, metrics, and logs without requiring custom shims, proprietary agents, or lossy format conversions. Your data flows through the pipeline exactly as OpenTelemetry defines it without translation steps that strip away detail or break correlations.
Dash0 is one such backend. Built from the ground up to be OpenTelemetry-native, it accepts OTLP out of the box and preserves every field of the data model, so you can immediately explore logs in full fidelity, seamlessly correlate them with traces and metrics, and unlock the power of high-cardinality, high-dimensional analysis without compromise.
To see what that looks like in practice, start your free trial with Dash0 today.
