Last updated: October 30, 2025
The Complete Guide to OpenTelemetry Logging and Data Model
Logs have always been central to understanding how software behaves. They record events, errors, and state changes that reveal what was happening inside a system at a given moment.
But as architectures have shifted toward microservices and distributed systems, the old model of siloed, free-form log files has become impossible to work with. With each service speaking its own logging dialect, troubleshooting is like trying to read a book where every page is in a different language.
OpenTelemetry was created to solve exactly this kind of fragmentation. By defining a common log data model and a language-neutral API and SDK, OTel provides a standard way to represent, process, and export logs.
The goal isn't to replace your existing logging libraries, but to make their output interoperable, consistent, and enriched with the same contextual information that powers traces and metrics.
This unification has important consequences. Once logs are expressed in the OpenTelemetry model, they can be processed in the same pipelines as metrics and traces, enriched with consistent resource metadata, and correlated with distributed traces. Instead of isolated fragments of text, logs become structured events that participate in the full observability story.
This article breaks down the OpenTelemetry approach to logging, from its data model and SDK to the crucial role of log bridges. I'll also show you why the seamless correlation between logs and traces is a game-changer for debugging complex service interactions.
Let's begin!
Understanding the OpenTelemetry logs data model
At the heart of OpenTelemetry's logging capabilities is its logs data model, a stable specification that defines what a log record is and how it should be represented.
Its purpose is to create a standardized, vendor-neutral structure that can represent logs from diverse sources and frameworks, including application log files, system logs, and machine-generated events.
It is flexible enough to map existing formats into the model without ambiguity, and in many cases, to reconstruct the original format without loss of information.
An OpenTelemetry log record is composed of several core fields with defined semantics, alongside flexible attributes for contextual data. The most important fields include:
- `Timestamp`: The time when the event originally occurred at the source.
- `ObservedTimestamp`: The time when the log was collected by an OpenTelemetry system. For logs generated directly by an OTel SDK, this is typically the same as the `Timestamp`.
- `Body`: The main content of the log message. This can be a simple string or structured JSON, but it is best treated as a human-readable message.
- `Attributes`: A set of key-value pairs that provides additional, machine-readable contextual information about a specific log event.
- `Resource`: Describes the entity that produced the log, such as the application, host, or Kubernetes pod. All logs from the same entity share the same resource attributes.
To enable trace–log correlation, the model also incorporates fields from the W3C Trace Context specification:
- `TraceId`: The unique identifier for a distributed trace.
- `SpanId`: The identifier for a specific span (operation) within that trace.
- `TraceFlags`: Flags providing metadata about the trace, such as whether it was sampled.
When these fields are present, a single log message can be linked directly to its broader distributed trace to accelerate debugging in microservice environments.
The data model also standardizes how log severity is represented:
- `SeverityText`: The original string representation of the log level, as it appeared at the source.
- `SeverityNumber`: A numeric value that enables consistent comparison and filtering across systems. Smaller numbers represent less severe events, larger numbers more severe ones.
| SeverityNumber Range | Category |
|---|---|
| 1–4 | TRACE |
| 5–8 | DEBUG |
| 9–12 | INFO |
| 13–16 | WARN |
| 17–20 | ERROR |
| 21–24 | FATAL |
In OTLP JSON, a log record looks like this:
```json
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": { "stringValue": "checkoutservice" }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": { "name": "logback", "version": "1.4.0" },
          "logRecords": [
            {
              "timeUnixNano": "1756571696706248000",
              "observedTimeUnixNano": "1756571696710000000",
              "severityNumber": 17,
              "severityText": "ERROR",
              "body": { "stringValue": "Database connection failed" },
              "attributes": [
                { "key": "thread.id", "value": { "intValue": 42 } },
                {
                  "key": "exception.type",
                  "value": { "stringValue": "SQLException" }
                }
              ],
              "traceId": "da5b97cecb0fe7457507a876944b3cf",
              "spanId": "fa7f0ea9cb73614c"
            }
          ]
        }
      ]
    }
  ]
}
```
At the top level, `resourceLogs` groups all logs that originate from the same `Resource`. In this case, the `service.name` attribute identifies the specific microservice and ensures that all logs from this service are bundled together.
Within a resource, scopeLogs further groups logs by the instrumentation
scope. This identifies the specific library or module that generated the log.
Here, the scope tells us the log was emitted by version 1.4.0 of the Logback
library.
Finally, the logRecords array contains the individual log entries, each with
its timestamps, severity, body, attributes, and (when available) trace context.
In this snippet, there is a single record describing a database error, complete
with a traceId and spanId that allow you to jump directly to the trace where
the error occurred.
By defining a consistent data model, OpenTelemetry makes it possible to represent logs from any source in a uniform way.
But having a model alone isn't enough. You also need a way to generate logs that already conform to it. That's where application instrumentation comes in.
Bringing your application logs into OpenTelemetry
Unlike traces and metrics, where OpenTelemetry introduces its own API that you must call directly, logging follows a different approach. Because there is a long history of diverse logging frameworks and standards, OTel is designed to integrate with existing libraries rather than replace them.
Logs flow into the OpenTelemetry ecosystem through bridges: adapters that forward records from familiar libraries like Python's logging, Java's SLF4J/Logback, or .NET's Serilog.
This design means you can keep your existing logging code and tooling, while still gaining the benefits of the OpenTelemetry log data model, correlation with traces, and consistent export to observability backends.
Understanding the Logs API and SDK
The Logs API defines the contract for passing log records into the OpenTelemetry pipeline. It's primarily intended for library authors building appenders or handlers, but it can also be called directly from instrumentation libraries or application code. It consists of the following core components:
- `LoggerProvider`: Creates and manages `Logger` instances. Typically, one is configured per process and registered globally for consistent access.
- `Logger`: A `Logger` is responsible for emitting logs as `LogRecord`s. In practice, your existing logging library (via a bridge) will call it for you.
- `LogRecord`: The data structure representing a single log event, with all the fields defined in the log data model.
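To make the contract concrete, here is a minimal sketch of how a bridge or instrumentation author might emit a record through the Logs API, assuming the `@opentelemetry/api-logs` package (the logger name and attributes are illustrative):

```javascript
import { logs, SeverityNumber } from "@opentelemetry/api-logs";

// Obtain a Logger from the globally registered LoggerProvider.
const logger = logs.getLogger("my-log-bridge", "1.0.0");

// Emit a LogRecord with the fields defined by the log data model.
logger.emit({
  severityNumber: SeverityNumber.ERROR,
  severityText: "ERROR",
  body: "Database connection failed",
  attributes: { "exception.type": "SQLException" },
});
```

By design, these calls are no-ops until an SDK is registered, which is what makes it safe for libraries to call the API unconditionally.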
While the Logs API defines how logs are created, the Logs SDK is responsible for processing and exporting them. It provides:
- A concrete `LoggerProvider` implementation.
- A `LogRecordProcessor` that sits between log creation and export and is responsible for enrichment, filtering/transforming, and batching of `LogRecord`s.
- A `LogRecordExporter` that takes processed records and exports them to destinations such as the console or an OTLP endpoint.
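Wiring these pieces together by hand looks roughly like this. This is a sketch assuming the `@opentelemetry/sdk-logs` and `@opentelemetry/api-logs` packages; depending on your SDK version, processors are passed via the constructor (as here) or via an `addLogRecordProcessor` method:

```javascript
import { logs } from "@opentelemetry/api-logs";
import {
  ConsoleLogRecordExporter,
  LoggerProvider,
  SimpleLogRecordProcessor,
} from "@opentelemetry/sdk-logs";

// The processor and exporter form the pipeline; the provider owns it.
const loggerProvider = new LoggerProvider({
  processors: [new SimpleLogRecordProcessor(new ConsoleLogRecordExporter())],
});

// Register globally so bridges and the Logs API can find it.
logs.setGlobalLoggerProvider(loggerProvider);
```

In practice you rarely do this yourself: as shown in the next section, `NodeSDK` (or your language's equivalent) sets up the provider, processor, and exporter for you.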
Where log bridges fit in
It's important to note that the Logs SDK does not automatically capture your application's logs. It provides the pipeline, but log records must be fed into it by a bridge.
A log bridge is an adapter (sometimes called a handler, appender, or transport) that connects your existing logging framework to the OpenTelemetry Logs API. Instead of rewriting your application to use OTel loggers directly, you only need to attach a bridge to the logger you already use.
For example, consider a Node.js application using Pino:
```javascript
import pino from "pino";

const logger = pino();
logger.info("hi");
```
By default, Pino produces JSON logs like this:
```json
{
  "level": 30,
  "time": 1758515262941,
  "pid": 55904,
  "hostname": "falcon",
  "msg": "hi"
}
```
To bring these logs into an OpenTelemetry pipeline, you must configure the
OpenTelemetry SDK,
register a LogRecordProcessor and LogRecordExporter, and include the Pino
log bridge via the
@opentelemetry/instrumentation-pino
package:
```javascript
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.SimpleLogRecordProcessor(
    new logs.ConsoleLogRecordExporter(),
  ),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();

const logger = pino();
logger.info("hi");
```
The SimpleLogRecordProcessor immediately exports each log, which is useful for
development and debugging. In production, you'd typically replace it with a
BatchLogRecordProcessor (to reduce network overhead) and swap the
ConsoleLogRecordExporter for an OTLPLogExporter that streams logs to the
Collector:
```javascript
import { OTLPLogExporter } from "@opentelemetry/exporter-logs-otlp-http";
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.BatchLogRecordProcessor(new OTLPLogExporter()),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();
```
Assuming you're using the debug exporter, this shows up in your Collector pipeline as follows:
```text
2025-09-22T05:31:27.964Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(falcon)
     -> host.arch: Str(amd64)
     -> host.id: Str(4a3dc42bf0564d50807d1553f485552a)
     -> process.pid: Int(59532)
     -> process.executable.name: Str(node)
     -> process.executable.path: Str(/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node)
     -> process.command_args: Slice(["/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node","--experimental-loader=@opentelemetry/instrumentation/hook.mjs","/home/ayo/dev/dash0/repro-contrib-2838/index.js"])
     -> process.runtime.version: Str(24.8.0)
     -> process.runtime.name: Str(nodejs)
     -> process.runtime.description: Str(Node.js)
     -> process.command: Str(/home/ayo/dev/dash0/repro-contrib-2838/index.js)
     -> process.owner: Str(ayo)
     -> service.name: Str(unknown_service:node)
     -> telemetry.sdk.language: Str(nodejs)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(2.0.1)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope @opentelemetry/instrumentation-pino 0.49.0
LogRecord #0
ObservedTimestamp: 2025-09-22 05:31:27.924 +0000 UTC
Timestamp: 2025-09-22 05:31:27.924 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(hi)
Trace ID:
Span ID:
Flags: 0
```
This output demonstrates how the log bridge and the Logs SDK work together:
- The raw Pino record has been translated into an OpenTelemetry `LogRecord`.
- Resource attributes are automatically populated to identify where the log came from.
- Pino's `level: 30` (info) is mapped to `severityText: "info"` and `severityNumber: 9`.
- The `body` field carries the original human-readable log message.
- The `instrumentationScope` identifies that the log was captured via the Pino bridge.
- Trace correlation fields (`traceId`, `spanId`, `traceFlags`) are present but unset in this example.
While this is the standard approach, be aware that the availability and maturity of these bridges vary by language and framework. Check your language's OpenTelemetry documentation to see what's supported and how to configure it.
Correlating logs and traces
OpenTelemetry offers a capability that structured logging alone cannot: when you use the OTel SDK for both tracing and logging, it automatically correlates the two.
To see this in action, your service must emit logs within an active trace span. Whenever that happens, the SDK automatically attaches the current trace and span identifiers to each log record.
In most cases, you will rely on zero-code instrumentation to create spans around common operations such as HTTP requests or database calls, but you can also create spans manually:
```javascript
import { api, logs, NodeSDK } from "@opentelemetry/sdk-node";

const tracer = api.trace.getTracer("example");

tracer.startActiveSpan("manual-span", (span) => {
  // `logger` is the Pino logger configured earlier
  logger.info("in a span");
  span.end();
});
```
The resulting log record now includes the active trace context:
```text
LogRecord #1
ObservedTimestamp: 2025-09-22 05:51:37.685 +0000 UTC
Timestamp: 2025-09-22 05:51:37.685 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(in a span)
Trace ID: 6691c3b82c157705904ba3b5b921d60a
Span ID: 72efdc9ec81b179a
Flags: 1
```
This correlation creates a two-way street for debugging. From a trace, you can jump directly to the logs that occurred within its spans; from a log, you can pivot back to the full distributed trace that produced it.
This is the core value proposition of using the complete OpenTelemetry ecosystem alongside an OpenTelemetry-native observability tool. It's what elevates logs from a simple record of events to a deeply contextualized part of a larger observability narrative.
When a log bridge isn't available
Not every framework has a bridge today. If your logging library lacks one, you don’t need to rewrite your code. Instead, enrich your logs with trace context and let the Collector do the mapping.
Most libraries let you inject fields into log output. By adding trace_id,
span_id, and trace_flags to each log record, you'll ensure logs can be
correlated later:
```json
{
  "level": "ERROR",
  "timestamp": "2025-10-05T15:34:11.428Z",
  "message": "Payment authorization failed",
  "trace_id": "c8f4a2171adf3de0a2c0b2e8f649a21f",
  "span_id": "d6e2b6c1a2f53e4b",
  "user_id": "user-1234"
}
```
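With Pino, for example, this enrichment can be done with a `mixin` that reads the active span context from the OpenTelemetry API. A minimal sketch, assuming `@opentelemetry/api` is installed and tracing is already set up:

```javascript
import { trace } from "@opentelemetry/api";
import pino from "pino";

const logger = pino({
  // Merged into every log record; adds trace context when a span is active.
  mixin() {
    const span = trace.getActiveSpan();
    if (!span) return {};
    const { traceId, spanId, traceFlags } = span.spanContext();
    return { trace_id: traceId, span_id: spanId, trace_flags: traceFlags };
  },
});

logger.error("Payment authorization failed");
```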
Once these enriched logs reach the Collector, you can parse and map them into canonical OpenTelemetry fields, preserving correlation across logs and traces without changing your application's logging calls.
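As a sketch of that mapping, the `filelog` receiver can parse the JSON lines and promote the trace fields into the log record's trace context using a `json_parser` and a `trace_parser` operator (the file path is illustrative; the field names match the example above):

```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/myapp/*.log]
    operators:
      # Parse each JSON log line into attributes
      - type: json_parser
        parse_from: body
      # Promote trace_id/span_id to the top-level trace context fields
      - type: trace_parser
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
```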
Ingesting and transforming logs with the OpenTelemetry Collector
Up to this point, we've been looking at how applications can emit OpenTelemetry-native logs directly via the SDK and log bridges. But not every system in your stack is under your control. Legacy applications, third-party dependencies, and infrastructure components often write logs in their own formats, without any awareness of OpenTelemetry.
The OpenTelemetry Collector solves this problem. It can ingest logs from many different sources, parse them, and map them into the OpenTelemetry log data model. That way, even systems that know nothing about OTel can still participate in the same observability pipeline.
The Collector uses receivers to handle log ingestion, and each one supports a different input or protocol. Some common options include:
- `filelog` receiver for tailing local log files.
- `kafka` receiver if you already forward logs to Kafka.
- `awscloudwatch` receiver for AWS CloudWatch log groups and streams.
- `syslog` receiver for logs sent over the network via syslog.
- `fluentforward` receiver for integration with Fluentd or Vector.
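Each receiver is wired into a logs pipeline in the Collector configuration. A minimal sketch using the `filelog` receiver and the `debug` exporter (the file path matches the example that follows):

```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/auth.log]

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```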
Once logs are ingested by a receiver, they are represented in the OpenTelemetry log data model, but that doesn't mean all of the fields will be correctly populated. Take this Linux authentication log record:
```text
Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11:  [preauth]
```
When this line is ingested by the `filelog` receiver, it appears as follows with the debug exporter:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 17:25:01.598645527 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11:  [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
Trace ID:
Span ID:
```
Here, the entire log line is stored in the Body, while useful fields like
Timestamp and SeverityNumber remain unset. This is typical of log receivers:
they capture the raw log string and minimal metadata that the receiver can infer
directly.
The record is now technically in the OpenTelemetry log data model, but it is effectively just an unstructured string plus a small amount of context. To make the logs useful, you need to process them further so that key fields are extracted and normalized into structured attributes.
This usually involves:
- Parsing the timestamp from the raw message and placing it in the `Timestamp` field.
- Identifying severity levels and mapping them to `SeverityText` and `SeverityNumber` values.
- Extracting contextual fields from the log and storing them as structured `Attributes` using the appropriate semantic conventions.
- Enriching with environment metadata such as Kubernetes pod labels, host or process information, and cloud resource tags.
- Populating the trace context fields where available.
This is accomplished with operators, which act on a single log entry as it is ingested by a specific receiver, or with processors, which act on batches of telemetry data as they pass through the pipeline, regardless of the source.
For example, you can apply the `syslog_parser` operator within the `filelog` receiver to extract standard syslog fields:
```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/auth.log]
    operators:
      - type: syslog_parser
        protocol: rfc3164
        allow_skip_pri_header: true
```
This produces the following result:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 18:40:22.780865051 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
Trace ID:
Span ID:
Flags: 0
```
After applying this operator, the log record is already much improved. The `Timestamp` is now correctly parsed from the original log line, and the `Attributes` have been enriched with fields extracted from the syslog header, such as the hostname, application name, and process ID.
To extract more specific details, you can chain a regex_parser operator to
parse the message attribute created by the previous step:
```yaml
# otelcol.yaml
- type: regex_parser
  parse_from: attributes.message
  regex: 'Received disconnect from (?P<client_ip>[\d.]+) port (?P<client_port>\d+)'
```
This operator will extract the IP address and port from the message and add them as new attributes:
```text
Attributes:
     -> client_ip: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> client_port: Str(11349)
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
```
For more complex transformations, the OpenTelemetry Transformation Language (OTTL) is the recommended tool. It's a powerful and flexible language used in the transform processor to manipulate log data.
For example, you can use OTTL to:
- Conditionally set `SeverityNumber` and `SeverityText` based on the content of the log message.
- Restructure or clean up the `Body` and `Attributes` of a log record.
- Conform attribute names to the OpenTelemetry semantic conventions.
- Set top-level fields like `Trace ID` and `Span ID` from attributes.

Applied to the sshd example, the configuration looks like this:
```yaml
# otelcol.yaml
processors:
  transform/auth_logs:
    error_mode: ignore
    log_statements:
      # Move host and process attributes to the Resource
      - set(resource.attributes["host.name"], log.attributes["hostname"])
      - set(resource.attributes["process.executable.name"], log.attributes["appname"])
      - set(resource.attributes["process.pid"], Int(log.attributes["proc_id"]))
      # Conform attributes to semantic conventions
      - set(log.attributes["client.address"], log.attributes["client_ip"])
      - set(log.attributes["client.port"], Int(log.attributes["client_port"]))
      - set(log.attributes["log.record.original"], log.body)
      - set(log.body, log.attributes["message"])
      # Severity mapping
      - set(log.severity_number, SEVERITY_NUMBER_INFO) where IsMatch(log.body, "^Received disconnect")
      - set(log.severity_text, "INFO") where log.severity_number >= SEVERITY_NUMBER_INFO and log.severity_number <= SEVERITY_NUMBER_INFO4
      # Delete the old, non-compliant attributes
      - delete_key(log.attributes, "hostname")
      - delete_key(log.attributes, "appname")
      - delete_key(log.attributes, "proc_id")
      - delete_key(log.attributes, "client_ip")
      - delete_key(log.attributes, "client_port")
      - delete_key(log.attributes, "message")
```
This results in the following output:
```text
2025-09-22T03:14:29.229Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(ubuntu-lts)
     -> process.executable.name: Str(sshd)
     -> process.pid: Int(47339)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope . . .
LogRecord #2
ObservedTimestamp: 2025-09-22 03:14:29.130188792 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> client.port: Int(11349)
     -> client.address: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> log.record.original: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Trace ID:
Span ID:
Flags: 0
```
By building a pipeline of these operators and processors, you can systematically convert any raw log entry into a fully populated, structured record that conforms to the OpenTelemetry data model.
Best practices for OpenTelemetry logging
Rolling out OpenTelemetry logging in production requires more than just turning on a bridge or deploying a Collector. To get reliable, cost-effective, and actionable logs, keep these practices in mind:
1. Start with structure
When logs are unstructured, like the raw sshd example, you end up stacking
operators and transform rules just to pull out basics like timestamp or client
IP.
That effort disappears if your application emits structured logs from the start. With structure in place, bridges and the Collector can pass fields straight into the OTel model without guesswork.
The lesson is clear: save parsers and regex operators for legacy systems you can’t change, and let new services output structured logs from day one.
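As an example, most structured loggers let you attach machine-readable fields directly to each call. With Pino it might look like this (the attribute names are illustrative; prefer semantic conventions where they exist):

```javascript
import pino from "pino";

const logger = pino();

// Contextual fields become attributes instead of being baked
// into the message string.
logger.info(
  { "user.id": "user-1234", "order.id": "ord-5678", "payment.provider": "stripe" },
  "Payment authorized",
);
```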
2. Embrace high-cardinality attributes
High-cardinality fields in logs and traces are essential for OpenTelemetry’s cross-signal correlation and per-tenant/root-cause analysis.
They are what allow you to ask the most important question when debugging: "Is this happening to everyone, or just a specific subset of users?"
Without high cardinality, you can see that you have a spike in errors. With it,
you can see that the error spike is coming from user:8675309 on the canary
deployment in eu-west-1 who has the new-checkout-flow feature flag enabled.
3. Use semantic conventions
For contextual log fields, rely on OpenTelemetry's semantic conventions wherever possible. This ensures your logs can be understood by any OTel-compliant backend and align with traces and metrics. It also avoids the kind of cleanup we saw in the Collector, where attributes had to be renamed and normalized after parsing.
4. Always include resource attributes
Resource attributes provide the anchor that ties logs to services and environments. Set them in the SDK `Resource`, or enrich them in the Collector with processors like `resourcedetection` or `k8sattributes`.
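A sketch of what that enrichment might look like in the Collector (the detectors and extracted metadata will vary by environment):

```yaml
# otelcol.yaml
processors:
  resourcedetection:
    detectors: [env, system]
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.deployment.name
```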
5. Scrub sensitive data
Logs often capture more than they should: tokens, passwords, emails, or PII can easily slip in. Make sure these values are redacted before logs leave your environment.
Use Collector processors like transform or redaction to strip or mask sensitive fields, and configure your logging libraries to avoid writing secrets in the first place. This keeps your logs safe to share across teams and compliant with security and privacy requirements.
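As a sketch, the `transform` processor can drop or mask suspect fields before export (the attribute names and the email-matching pattern here are illustrative):

```yaml
# otelcol.yaml
processors:
  transform/scrub:
    error_mode: ignore
    log_statements:
      # Drop attributes that should never leave the environment
      - delete_key(log.attributes, "password")
      - delete_key(log.attributes, "authorization")
      # Mask email addresses that slip into the log body
      - replace_pattern(log.body, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+", "***")
```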
6. Control your log volume
Production doesn’t need the same firehose of DEBUG logs as development. Set
appropriate levels in your logging framework, and
use the Collector to filter
or sample when necessary. This keeps costs predictable and avoids overwhelming
your backend, while still preserving high-value logs tied to traces or errors.
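For instance, the `filter` processor can drop records below a severity threshold before they leave the Collector; a minimal sketch:

```yaml
# otelcol.yaml
processors:
  filter/drop_debug:
    error_mode: ignore
    logs:
      log_record:
        # Drop anything below INFO (i.e. TRACE and DEBUG records)
        - severity_number < SEVERITY_NUMBER_INFO
```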
Final thoughts
Once your logs are normalized into the OpenTelemetry data model, the last step is getting them into the observability tools you rely on. The recommended approach is exporting to an OpenTelemetry-native backend via the OTLP exporter.
An OpenTelemetry-native backend is one that sees the OpenTelemetry Protocol as its linchpin. That means it can ingest OpenTelemetry-native traces, metrics, and logs without requiring custom shims, proprietary agents, or lossy format conversions. Your data flows through the pipeline exactly as OpenTelemetry defines it without translation steps that strip away detail or break correlations.
Dash0 is one such backend. Built from the ground up to be OpenTelemetry-native, it accepts OTLP out of the box and preserves every field of the data model, so you can immediately explore logs in full fidelity, seamlessly correlate them with traces and metrics, and unlock the power of high-cardinality, high-dimensional analysis without compromise.
To see what that looks like in practice, start your free trial with Dash0 today.




