Last updated: October 30, 2025
The Complete Guide to OpenTelemetry Logging and Data Model
Logs have always been central to understanding how software behaves. They record events, errors, and state changes that reveal what was happening inside a system at a given moment.
But as architectures have shifted toward microservices and distributed systems, the old model of siloed, free-form log files has become impossible to work with. With each service speaking its own logging dialect, troubleshooting is like trying to read a book where every page is in a different language.
OpenTelemetry was created to solve exactly this kind of fragmentation. By defining a common log data model and a language-neutral API and SDK, OTel provides a standard way to represent, process, and export logs.
The goal isn't to replace your existing logging libraries, but to make their output interoperable, consistent, and enriched with the same contextual information that powers traces and metrics.
This unification has important consequences. Once logs are expressed in the OpenTelemetry model, they can be processed in the same pipelines as metrics and traces, enriched with consistent resource metadata, and correlated with distributed traces. Instead of isolated fragments of text, logs become structured events that participate in the full observability story.
This article breaks down the OpenTelemetry approach to logging, from its data model and SDK to the crucial role of log bridges. I'll also show you why the seamless correlation between logs and traces is a game-changer for debugging complex service interactions.
Let's begin!
Understanding the OpenTelemetry logs data model
At the heart of OpenTelemetry's logging capabilities is its logs data model, a stable specification that defines what a log record is and how it should be represented.
Its purpose is to create a standardized, vendor-neutral structure that can represent logs from diverse sources and frameworks, including application log files, system logs, and machine-generated events.
It is flexible enough to map existing formats into the model without ambiguity, and in many cases, to reconstruct the original format without loss of information.
An OpenTelemetry log record is composed of several core fields with defined semantics, alongside flexible attributes for contextual data. The most important fields include:
- `Timestamp`: The time when the event originally occurred at the source.
- `ObservedTimestamp`: The time when the log was collected by an OpenTelemetry system. For logs generated directly by an OTel SDK, this is typically the same as the `Timestamp`.
- `Body`: The main content of the log message. This can be a simple string or structured JSON, but it is best treated as a human-readable message.
- `Attributes`: A set of key-value pairs that provides additional, machine-readable contextual information about a specific log event.
- `Resource`: Describes the entity that produced the log, such as the application, host, or Kubernetes pod. All logs from the same entity share the same resource attributes.
To enable trace–log correlation, the model also incorporates fields from the W3C Trace Context specification:
- `TraceId`: The unique identifier for a distributed trace.
- `SpanId`: The identifier for a specific span (operation) within that trace.
- `TraceFlags`: Flags providing metadata about the trace, such as whether it was sampled.
When these fields are present, a single log message can be linked directly to its broader distributed trace to accelerate debugging in microservice environments.
The data model also standardizes how log severity is represented:
- `SeverityText`: The original string representation of the log level, as it appeared at the source.
- `SeverityNumber`: A numeric value that enables consistent comparison and filtering across systems. Smaller numbers represent less severe events, larger numbers more severe ones.
| SeverityNumber Range | Category |
|---|---|
| 1–4 | TRACE |
| 5–8 | DEBUG |
| 9–12 | INFO |
| 13–16 | WARN |
| 17–20 | ERROR |
| 21–24 | FATAL |
In OTLP JSON, a log record looks like this:
```json
{
  "resourceLogs": [
    {
      "resource": {
        "attributes": [
          {
            "key": "service.name",
            "value": { "stringValue": "checkoutservice" }
          }
        ]
      },
      "scopeLogs": [
        {
          "scope": { "name": "logback", "version": "1.4.0" },
          "logRecords": [
            {
              "timeUnixNano": "1756571696706248000",
              "observedTimeUnixNano": "1756571696710000000",
              "severityNumber": 17,
              "severityText": "ERROR",
              "body": { "stringValue": "Database connection failed" },
              "attributes": [
                { "key": "thread.id", "value": { "intValue": 42 } },
                {
                  "key": "exception.type",
                  "value": { "stringValue": "SQLException" }
                }
              ],
              "traceId": "da5b97cecb0fe7457507a876944b3cf",
              "spanId": "fa7f0ea9cb73614c"
            }
          ]
        }
      ]
    }
  ]
}
```
At the top level, `resourceLogs` groups all logs that originate from the same `Resource`. In this case, the `service.name` attribute identifies the specific microservice and ensures that all logs from this service are bundled together.
Within a resource, scopeLogs further groups logs by the instrumentation
scope. This identifies the specific library or module that generated the log.
Here, the scope tells us the log was emitted by version 1.4.0 of the Logback
library.
Finally, the logRecords array contains the individual log entries, each with
its timestamps, severity, body, attributes, and (when available) trace context.
In this snippet, there is a single record describing a database error, complete
with a traceId and spanId that allow you to jump directly to the trace where
the error occurred.
By defining a consistent data model, OpenTelemetry makes it possible to represent logs from any source in a uniform way.
But having a model alone isn't enough. You also need a way to generate logs that already conform to it. That's where application instrumentation comes in.
Bringing your application logs into OpenTelemetry
Unlike traces and metrics, where OpenTelemetry introduces its own API that you must call directly, logging follows a different approach. Because there is a long history of diverse logging frameworks and standards, OTel is designed to integrate with existing libraries rather than replace them.
Logs flow into the OpenTelemetry ecosystem through bridges: adapters that forward records from familiar libraries like Python's logging, Java's SLF4J/Logback, or .NET's Serilog.
This design means you can keep your existing logging code and tooling, while still gaining the benefits of the OpenTelemetry log data model, correlation with traces, and consistent export to observability backends.
Understanding the Logs API and SDK
The Logs API defines the contract for passing log records into the OpenTelemetry pipeline. It's primarily intended for library authors building appenders or handlers, but it can also be called directly from instrumentation libraries or application code. It consists of the following core components:
- `LoggerProvider`: Creates and manages `Logger` instances. Typically, one is configured per process and registered globally for consistent access.
- `Logger`: A `Logger` is responsible for emitting logs as `LogRecord`s. In practice, your existing logging library (via a bridge) will call it for you.
- `LogRecord`: The data structure representing a single log event, with all the fields defined in the log data model.
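To make the contract concrete, here is a minimal sketch of how a bridge or instrumentation author might emit a record through the Logs API, assuming the `@opentelemetry/api-logs` package (the logger name and attributes are illustrative):

```javascript
import { logs, SeverityNumber } from "@opentelemetry/api-logs";

// Obtain a Logger from the globally registered LoggerProvider.
const logger = logs.getLogger("my-log-bridge", "1.0.0");

// Emit a LogRecord with the fields defined by the log data model.
logger.emit({
  severityNumber: SeverityNumber.ERROR,
  severityText: "ERROR",
  body: "Database connection failed",
  attributes: { "exception.type": "SQLException" },
});
```

By design, these calls are no-ops until an SDK is registered, which is what makes it safe for libraries to call the API unconditionally.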
While the Logs API defines how logs are created, the Logs SDK is responsible for processing and exporting them. It provides:
- A concrete `LoggerProvider` implementation.
- A `LogRecordProcessor` that sits between log creation and export and is responsible for enrichment, filtering/transforming, and batching of `LogRecord`s.
- A `LogRecordExporter` that takes processed records and exports them to destinations such as the console or an OTLP endpoint.
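Wiring these pieces together by hand looks roughly like this. This is a sketch assuming the `@opentelemetry/sdk-logs` and `@opentelemetry/api-logs` packages; depending on your SDK version, processors are passed via the constructor (as here) or via an `addLogRecordProcessor` method:

```javascript
import { logs } from "@opentelemetry/api-logs";
import {
  ConsoleLogRecordExporter,
  LoggerProvider,
  SimpleLogRecordProcessor,
} from "@opentelemetry/sdk-logs";

// The processor and exporter form the pipeline; the provider owns it.
const loggerProvider = new LoggerProvider({
  processors: [new SimpleLogRecordProcessor(new ConsoleLogRecordExporter())],
});

// Register globally so bridges and the Logs API can find it.
logs.setGlobalLoggerProvider(loggerProvider);
```

In practice you rarely do this yourself: as shown in the next section, `NodeSDK` (or your language's equivalent) sets up the provider, processor, and exporter for you.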
Where log bridges fit in
It's important to note that the Logs SDK does not automatically capture your application's logs. It provides the pipeline, but log records must be fed into it by a bridge.
A log bridge is an adapter (sometimes called a handler, appender, or transport) that connects your existing logging framework to the OpenTelemetry Logs API. Instead of rewriting your application to use OTel loggers directly, you only need to attach a bridge to the logger you already use.
For example, consider a Node.js application using Pino:
```javascript
import pino from "pino";

const logger = pino();
logger.info("hi");
```
By default, Pino produces JSON logs like this:
```json
{
  "level": 30,
  "time": 1758515262941,
  "pid": 55904,
  "hostname": "falcon",
  "msg": "hi"
}
```
To bring these logs into an OpenTelemetry pipeline, you must configure the
OpenTelemetry SDK,
register a LogRecordProcessor and LogRecordExporter, and include the Pino
log bridge via the
@opentelemetry/instrumentation-pino
package:
```javascript
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.SimpleLogRecordProcessor(
    new logs.ConsoleLogRecordExporter(),
  ),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();

const logger = pino();
logger.info("hi");
```
The SimpleLogRecordProcessor immediately exports each log, which is useful for
development and debugging. In production, you'd typically replace it with a
BatchLogRecordProcessor (to reduce network overhead) and swap the
ConsoleLogRecordExporter for an OTLPLogExporter that streams logs to the
Collector:
```javascript
import { OTLPLogExporter } from "@opentelemetry/exporter-logs-otlp-http";
import { PinoInstrumentation } from "@opentelemetry/instrumentation-pino";
import { logs, NodeSDK } from "@opentelemetry/sdk-node";
import pino from "pino";

const sdk = new NodeSDK({
  logRecordProcessor: new logs.BatchLogRecordProcessor(new OTLPLogExporter()),
  instrumentations: [new PinoInstrumentation()],
});
sdk.start();
```
Assuming you're using the debug exporter, this shows up in your Collector pipeline as follows:
```text
2025-09-22T05:31:27.964Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(falcon)
     -> host.arch: Str(amd64)
     -> host.id: Str(4a3dc42bf0564d50807d1553f485552a)
     -> process.pid: Int(59532)
     -> process.executable.name: Str(node)
     -> process.executable.path: Str(/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node)
     -> process.command_args: Slice(["/home/ayo/.local/share/mise/installs/node/24.8.0/bin/node","--experimental-loader=@opentelemetry/instrumentation/hook.mjs","/home/ayo/dev/dash0/repro-contrib-2838/index.js"])
     -> process.runtime.version: Str(24.8.0)
     -> process.runtime.name: Str(nodejs)
     -> process.runtime.description: Str(Node.js)
     -> process.command: Str(/home/ayo/dev/dash0/repro-contrib-2838/index.js)
     -> process.owner: Str(ayo)
     -> service.name: Str(unknown_service:node)
     -> telemetry.sdk.language: Str(nodejs)
     -> telemetry.sdk.name: Str(opentelemetry)
     -> telemetry.sdk.version: Str(2.0.1)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope @opentelemetry/instrumentation-pino 0.49.0
LogRecord #0
ObservedTimestamp: 2025-09-22 05:31:27.924 +0000 UTC
Timestamp: 2025-09-22 05:31:27.924 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(hi)
Trace ID:
Span ID:
Flags: 0
```
This output demonstrates how the log bridge and the Logs SDK work together:
- The raw Pino record has been translated into an OpenTelemetry `LogRecord`.
- Resource attributes are automatically populated to identify where the log came from.
- Pino's `level: 30` (info) is mapped to `severityText: "info"` and `severityNumber: 9`.
- The `body` field carries the original human-readable log message.
- The `instrumentationScope` identifies that the log was captured via the Pino bridge.
- Trace correlation fields (`traceId`, `spanId`, `traceFlags`) are present but unset in this example.
While this is the standard approach, be aware that the availability and maturity of these bridges vary by language and framework. Check your language's OpenTelemetry documentation to see what's supported and how to configure it.
Correlating logs and traces
OpenTelemetry offers a capability that structured logging alone cannot: when you use the OTel SDK for both tracing and logging, it automatically correlates the two.
To see this in action, your service must emit logs within an active trace span. Whenever that happens, the SDK automatically attaches the current trace and span identifiers to each log record.
In most cases, you will rely on zero-code instrumentation to create spans around common operations such as HTTP requests or database calls, but you can also create spans manually:
```javascript
import { api, logs, NodeSDK } from "@opentelemetry/sdk-node";

const tracer = api.trace.getTracer("example");

tracer.startActiveSpan("manual-span", (span) => {
  // `logger` is the Pino logger configured earlier
  logger.info("in a span");
  span.end();
});
```
The resulting log record now includes the active trace context:
```text
LogRecord #1
ObservedTimestamp: 2025-09-22 05:51:37.685 +0000 UTC
Timestamp: 2025-09-22 05:51:37.685 +0000 UTC
SeverityText: info
SeverityNumber: Info(9)
Body: Str(in a span)
Trace ID: 6691c3b82c157705904ba3b5b921d60a
Span ID: 72efdc9ec81b179a
Flags: 1
```
This correlation creates a two-way street for debugging. From a trace, you can jump directly to the logs that occurred within its spans; from a log, you can pivot back to the full distributed trace that produced it.
This is the core value proposition of using the complete OpenTelemetry ecosystem alongside an OpenTelemetry-native observability tool. It's what elevates logs from a simple record of events to a deeply contextualized part of a larger observability narrative.
When a log bridge isn't available
Not every framework has a bridge today. If your logging library lacks one, you don’t need to rewrite your code. Instead, enrich your logs with trace context and let the Collector do the mapping.
Most libraries let you inject fields into log output. By adding trace_id,
span_id, and trace_flags to each log record, you'll ensure logs can be
correlated later:
```json
{
  "level": "ERROR",
  "timestamp": "2025-10-05T15:34:11.428Z",
  "message": "Payment authorization failed",
  "trace_id": "c8f4a2171adf3de0a2c0b2e8f649a21f",
  "span_id": "d6e2b6c1a2f53e4b",
  "user_id": "user-1234"
}
```
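With Pino, for example, this enrichment can be done with a `mixin` that reads the active span context from the OpenTelemetry API. A minimal sketch, assuming `@opentelemetry/api` is installed and tracing is already set up:

```javascript
import { trace } from "@opentelemetry/api";
import pino from "pino";

const logger = pino({
  // Merged into every log record; adds trace context when a span is active.
  mixin() {
    const span = trace.getActiveSpan();
    if (!span) return {};
    const { traceId, spanId, traceFlags } = span.spanContext();
    return { trace_id: traceId, span_id: spanId, trace_flags: traceFlags };
  },
});

logger.error("Payment authorization failed");
```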
Once these enriched logs reach the Collector, you can parse and map them into canonical OpenTelemetry fields, preserving correlation across logs and traces without changing your application's logging calls.
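As a sketch of that mapping, the `filelog` receiver can parse the JSON lines and promote the trace fields into the log record's trace context using a `json_parser` and a `trace_parser` operator (the file path is illustrative; the field names match the example above):

```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/myapp/*.log]
    operators:
      # Parse each JSON log line into attributes
      - type: json_parser
        parse_from: body
      # Promote trace_id/span_id to the top-level trace context fields
      - type: trace_parser
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
```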
Ingesting and transforming logs with the OpenTelemetry Collector
Up to this point, we've been looking at how applications can emit OpenTelemetry-native logs directly via the SDK and log bridges. But not every system in your stack is under your control. Legacy applications, third-party dependencies, and infrastructure components often write logs in their own formats, without any awareness of OpenTelemetry.
The OpenTelemetry Collector solves this problem. It can ingest logs from many different sources, parse them, and map them into the OpenTelemetry log data model. That way, even systems that know nothing about OTel can still participate in the same observability pipeline.
The Collector uses receivers to handle log ingestion, and each one supports a different input or protocol. Some common options include:
- `filelog` receiver for tailing local log files.
- `kafka` receiver if you already forward logs to Kafka.
- `awscloudwatch` receiver for AWS CloudWatch log groups and streams.
- `syslog` receiver for logs sent over the network via syslog.
- `fluentforward` receiver for integration with Fluentd or Vector.
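Each receiver is wired into a logs pipeline in the Collector configuration. A minimal sketch using the `filelog` receiver and the `debug` exporter (the file path matches the example that follows):

```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/auth.log]

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      exporters: [debug]
```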
Once logs are ingested by a receiver, they are represented in the OpenTelemetry log data model, but that doesn't mean all of the fields will be correctly populated. Take this Linux authentication log record:
```text
Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11:  [preauth]
```
When this line is ingested by the `filelog` receiver, it appears as follows with the debug exporter:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 17:25:01.598645527 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11:  [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
Trace ID:
Span ID:
```
Here, the entire log line is stored in the Body, while useful fields like
Timestamp and SeverityNumber remain unset. This is typical of log receivers:
they capture the raw log string and minimal metadata that the receiver can infer
directly.
The record is now technically in the OpenTelemetry log data model, but it is effectively just an unstructured string plus a small amount of context. To make the logs useful, you need to process them further so that key fields are extracted and normalized into structured attributes.
This usually involves:
- Parsing the timestamp from the raw message and placing it in the `Timestamp` field.
- Identifying severity levels and mapping them to `SeverityText` and `SeverityNumber` values.
- Extracting contextual fields from the log and storing them as structured `Attributes` using the appropriate semantic conventions.
- Enriching with environment metadata such as Kubernetes pod labels, host or process information, and cloud resource tags.
- Populating the trace context fields where available.
This is accomplished with operators, which act on a single log entry as it is ingested by a specific receiver, or with processors, which act on batches of telemetry data as they pass through the pipeline, regardless of the source.
For example, you can apply the `syslog_parser` operator within the `filelog` receiver to extract standard syslog fields:
```yaml
# otelcol.yaml
receivers:
  filelog:
    include: [/var/log/auth.log]
    operators:
      - type: syslog_parser
        protocol: rfc3164
        allow_skip_pri_header: true
```
This produces the following result:
```text
LogRecord #2
ObservedTimestamp: 2025-09-21 18:40:22.780865051 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> log.file.name: Str(auth.log)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
Trace ID:
Span ID:
Flags: 0
```
After applying this operator, the log record is already much improved. The `Timestamp` is now correctly parsed from the original log line, and the `Attributes` have been enriched with fields extracted from the syslog header, such as the hostname, application name, and process ID.
To extract more specific details, you can chain a regex_parser operator to
parse the message attribute created by the previous step:
```yaml
# otelcol.yaml
- type: regex_parser
  parse_from: attributes.message
  regex: 'Received disconnect from (?P<client_ip>[\d.]+) port (?P<client_port>\d+)'
```
This operator will extract the IP address and port from the message and add them as new attributes:
```text
Attributes:
     -> client_ip: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> client_port: Str(11349)
     -> hostname: Str(ubuntu-lts)
     -> appname: Str(sshd)
     -> proc_id: Str(47339)
     -> message: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
```
For more complex transformations, the OpenTelemetry Transformation Language (OTTL) is the recommended tool. It's a powerful and flexible language used in the transform processor to manipulate log data.
For example, you can use OTTL to:
- Conditionally set `SeverityNumber` and `SeverityText` based on the content of the log message.
- Restructure or clean up the `Body` and `Attributes` of a log record.
- Conform attribute names to the OpenTelemetry semantic conventions.
- Set top-level fields like `Trace ID` and `Span ID` from attributes.

Applied to the sshd example, the configuration looks like this:
```yaml
# otelcol.yaml
processors:
  transform/auth_logs:
    error_mode: ignore
    log_statements:
      # Move host and process attributes to the Resource
      - set(resource.attributes["host.name"], log.attributes["hostname"])
      - set(resource.attributes["process.executable.name"], log.attributes["appname"])
      - set(resource.attributes["process.pid"], Int(log.attributes["proc_id"]))
      # Conform attributes to semantic conventions
      - set(log.attributes["client.address"], log.attributes["client_ip"])
      - set(log.attributes["client.port"], Int(log.attributes["client_port"]))
      - set(log.attributes["log.record.original"], log.body)
      - set(log.body, log.attributes["message"])
      # Severity mapping
      - set(log.severity_number, SEVERITY_NUMBER_INFO) where IsMatch(log.body, "^Received disconnect")
      - set(log.severity_text, "INFO") where log.severity_number >= SEVERITY_NUMBER_INFO and log.severity_number <= SEVERITY_NUMBER_INFO4
      # Delete the old, non-compliant attributes
      - delete_key(log.attributes, "hostname")
      - delete_key(log.attributes, "appname")
      - delete_key(log.attributes, "proc_id")
      - delete_key(log.attributes, "client_ip")
      - delete_key(log.attributes, "client_port")
      - delete_key(log.attributes, "message")
```
This results in the following output:
```text
2025-09-22T03:14:29.229Z info ResourceLog #0
Resource SchemaURL:
Resource attributes:
     -> host.name: Str(ubuntu-lts)
     -> process.executable.name: Str(sshd)
     -> process.pid: Int(47339)
ScopeLogs #0
ScopeLogs SchemaURL:
InstrumentationScope . . .
LogRecord #2
ObservedTimestamp: 2025-09-22 03:14:29.130188792 +0000 UTC
Timestamp: 2025-08-20 18:23:23 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Attributes:
     -> client.port: Int(11349)
     -> client.address: Str(180.101.88.228)
     -> log.file.name: Str(auth.log)
     -> log.record.original: Str(Aug 20 18:23:23 ubuntu-lts sshd[47339]: Received disconnect from 180.101.88.228 port 11349:11: [preauth])
Trace ID:
Span ID:
Flags: 0
```
By building a pipeline of these operators and processors, you can systematically convert any raw log entry into a fully populated, structured record that conforms to the OpenTelemetry data model.
Best practices for OpenTelemetry logging
Rolling out OpenTelemetry logging in production requires more than just turning on a bridge or deploying a Collector. To get reliable, cost-effective, and actionable logs, keep these practices in mind:
1. Start with structure
When logs are unstructured, like the raw sshd example, you end up stacking
operators and transform rules just to pull out basics like timestamp or client
IP.
That effort disappears if your application emits structured logs from the start. With structure in place, bridges and the Collector can pass fields straight into the OTel model without guesswork.
The lesson is clear: save parsers and regex operators for legacy systems you can’t change, and let new services output structured logs from day one.
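As an example, most structured loggers let you attach machine-readable fields directly to each call. With Pino it might look like this (the attribute names are illustrative; prefer semantic conventions where they exist):

```javascript
import pino from "pino";

const logger = pino();

// Contextual fields become attributes instead of being baked
// into the message string.
logger.info(
  { "user.id": "user-1234", "order.id": "ord-5678", "payment.provider": "stripe" },
  "Payment authorized",
);
```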
2. Embrace high-cardinality attributes
High-cardinality fields in logs and traces are essential for OpenTelemetry’s cross-signal correlation and per-tenant/root-cause analysis.
They are what allow you to ask the most important question when debugging: "Is this happening to everyone, or just a specific subset of users?"
Without high cardinality, you can see that you have a spike in errors. With it,
you can see that the error spike is coming from user:8675309 on the canary
deployment in eu-west-1 who has the new-checkout-flow feature flag enabled.
3. Use semantic conventions
For contextual log fields, rely on OpenTelemetry's semantic conventions wherever possible. This ensures your logs can be understood by any OTel-compliant backend and align with traces and metrics. It also avoids the kind of cleanup we saw in the Collector, where attributes had to be renamed and normalized after parsing.
4. Always include resource attributes
Resource attributes provide the anchor that ties logs to services and environments. Set them in the SDK `Resource`, or enrich them in the Collector with processors like `resourcedetection` or `k8sattributes`.
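A sketch of what that enrichment might look like in the Collector (the detectors and extracted metadata will vary by environment):

```yaml
# otelcol.yaml
processors:
  resourcedetection:
    detectors: [env, system]
  k8sattributes:
    extract:
      metadata:
        - k8s.namespace.name
        - k8s.pod.name
        - k8s.deployment.name
```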
5. Scrub sensitive data
Logs often capture more than they should: tokens, passwords, emails, or PII can easily slip in. Make sure these values are redacted before logs leave your environment.
Use Collector processors like transform or redaction to strip or mask sensitive fields, and configure your logging libraries to avoid writing secrets in the first place. This keeps your logs safe to share across teams and compliant with security and privacy requirements.
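As a sketch, the `transform` processor can drop or mask suspect fields before export (the attribute names and the email-matching pattern here are illustrative):

```yaml
# otelcol.yaml
processors:
  transform/scrub:
    error_mode: ignore
    log_statements:
      # Drop attributes that should never leave the environment
      - delete_key(log.attributes, "password")
      - delete_key(log.attributes, "authorization")
      # Mask email addresses that slip into the log body
      - replace_pattern(log.body, "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+", "***")
```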
6. Control your log volume
Production doesn’t need the same firehose of DEBUG logs as development. Set
appropriate levels in your logging framework, and
use the Collector to filter
or sample when necessary. This keeps costs predictable and avoids overwhelming
your backend, while still preserving high-value logs tied to traces or errors.
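For instance, the `filter` processor can drop records below a severity threshold before they leave the Collector; a minimal sketch:

```yaml
# otelcol.yaml
processors:
  filter/drop_debug:
    error_mode: ignore
    logs:
      log_record:
        # Drop anything below INFO (i.e. TRACE and DEBUG records)
        - severity_number < SEVERITY_NUMBER_INFO
```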
Final thoughts
Once your logs are normalized into the OpenTelemetry data model, the last step is getting them into the observability tools you rely on. The recommended approach is exporting to an OpenTelemetry-native backend via the OTLP exporter.
An OpenTelemetry-native backend is one that sees the OpenTelemetry Protocol as its linchpin. That means it can ingest OpenTelemetry-native traces, metrics, and logs without requiring custom shims, proprietary agents, or lossy format conversions. Your data flows through the pipeline exactly as OpenTelemetry defines it without translation steps that strip away detail or break correlations.
Dash0 is one such backend. Built from the ground up to be OpenTelemetry-native, it accepts OTLP out of the box and preserves every field of the data model, so you can immediately explore logs in full fidelity, seamlessly correlate them with traces and metrics, and unlock the power of high-cardinality, high-dimensional analysis without compromise.
To see what that looks like in practice, start your free trial with Dash0 today.




