Last updated: October 6, 2025

Mastering NGINX Logs with JSON and OpenTelemetry

NGINX sits at the front door of countless applications, acting as a web server, reverse proxy, or load balancer. Every request that passes through it generates logs, and those records are often the most immediate and reliable source of truth when something goes wrong.

But most existing guides only teach you the basics: turning on the log files and showing how to tail them. That level of knowledge hasn't kept pace with modern observability practices, where logs need to be structured and integrated with frameworks like OpenTelemetry.

This guide takes you further. We'll cover JSON-based access logs, strategies to cut down on log spam, best practices for error logs, and how to bring them into the wider observability ecosystem.

Let's begin!

Understanding NGINX logging fundamentals

Before you can configure advanced features or integrate with observability tools, you need a clear understanding of what logs NGINX creates and where they live.

Types of NGINX logs

NGINX categorizes its log output into two files, each serving a distinct purpose:

  1. Access log (access.log): This records every request that NGINX processes. For each request, it notes who made it (IP address), when it happened, what resource was requested (URI), how the server responded (HTTP status code), and the size of the response. These logs are what you need for analyzing traffic and user behavior.

  2. Error log (error.log): This is NGINX's diary of problems and significant events. When something goes wrong, the details are recorded here. They are your primary tool for troubleshooting and diagnostics, but they don't just contain outright errors; they also record warnings and other informational notices that can help you proactively identify potential issues before they become critical.

Where to find NGINX logs

Understanding the types of logs is only half the picture. To make use of them, you need to know where they're written, and this depends on how NGINX was installed and the environment it runs in. Let's look at a few common scenarios below.

Standard Linux distributions (Ubuntu, Debian, CentOS)

For most installations from a package manager on Linux, the default location for NGINX logs is the /var/log/nginx/ directory:

bash
ls -l /var/log/nginx/

text
total 8
-rw-r--r-- 1 root adm 2134 Oct 05 10:20 access.log
-rw-r--r-- 1 root adm 1789 Oct 05 11:15 error.log

If you can't find your logs there, the definitive source of truth is your NGINX configuration file. You can find its location by running nginx -t, which tests the configuration and reports the path to the main nginx.conf file.

bash
sudo nginx -t

text
nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful

Inside this file (or its included sub-configurations), the access_log and error_log directives will specify the exact paths:

nginx
# /etc/nginx/nginx.conf
error_log /var/log/nginx/error.log;

http {
    access_log /var/log/nginx/access.log;
}
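
Because these directives can also live in included files (for example under conf.d/), a quick way to surface every configured log path is to dump the full merged configuration with nginx -T and filter it:

bash
# -T tests the configuration and prints the merged config to stdout
sudo nginx -T 2>/dev/null | grep -E '(access|error)_log'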

We'll be diving deep into these directives shortly.

Containerized environments like Docker

In the world of containers, writing logs to a file inside the ephemeral container filesystem is a bad practice. When the container is destroyed, the logs are gone forever. The standard pattern is to forward logs to the container's standard output (stdout) and standard error (stderr) streams.

The official NGINX Docker image is configured to do this by default. It creates symbolic links from the traditional log file paths to the special device files for stdout and stderr:

  • /var/log/nginx/access.log -> /dev/stdout
  • /var/log/nginx/error.log -> /dev/stderr
Dockerfile
# forward request and error logs to docker log collector
&& ln -sf /dev/stdout /var/log/nginx/access.log \
&& ln -sf /dev/stderr /var/log/nginx/error.log \

This setup lets Docker capture the logs, and you can then use the docker logs command to view NGINX logs for any container produced by that image. It's the standard approach in containerized environments since it cleanly separates log generation from log collection.
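
For example, with the lab container we create below (named nginx-server), you could follow both streams like this:

bash
# access log entries arrive on stdout, error log entries on stderr
docker logs -f nginx-server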

Setting up a lab environment with Docker Compose

To experiment with NGINX logs, it helps to have a clean and repeatable setup. Docker Compose makes this easy by letting us define and run the whole environment with a single configuration file.

Start by creating a new project directory. Inside it, add a file named docker-compose.yml:

yaml
# docker-compose.yml
services:
  nginx:
    image: nginx:1.29.1
    container_name: nginx-server
    ports:
      - "8080:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro

Next, execute the command below to grab the default NGINX configuration file from the nginx image and save it locally as nginx.conf:

bash
docker run --rm --entrypoint=cat nginx /etc/nginx/nginx.conf > nginx.conf

This gives a starting point you can customize for logging experiments. Here's what the file looks like:

nginx
# nginx.conf
user nginx;
worker_processes auto;

error_log /var/log/nginx/error.log notice;
pid /run/nginx.pid;

events {
    worker_connections 1024;
}

http {
    include /etc/nginx/mime.types;
    default_type application/octet-stream;

    log_format main '$remote_addr - $remote_user [$time_local] "$request" '
                    '$status $body_bytes_sent "$http_referer" '
                    '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /var/log/nginx/access.log main;

    sendfile on;
    #tcp_nopush on;

    keepalive_timeout 65;

    #gzip on;

    include /etc/nginx/conf.d/*.conf;
}

This configuration shows how NGINX sets up its workers, handles basic HTTP behavior, and defines both access and error logging. The important parts for us are the access_log, error_log, and the log_format directive, which together control what gets logged and how.

Now, from your terminal in the project directory, start the services:

bash
docker compose up -d

text
[+] Running 2/2
✔ Network nginx-logging-tutorial_default Created 0.1s
✔ Container nginx-server Started 0.4s

You now have a running NGINX server accessible at http://localhost:8080:

Nginx default index page

To see NGINX logs in real-time, you can stream them directly from the container:

bash
docker compose logs -f nginx

As you access http://localhost:8080 in your browser, you'll see access log entries appear in the terminal:

text
172.19.0.1 - - [05/Oct/2025:12:49:47 +0000] "GET / HTTP/1.1" 200 615 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:143.0) Gecko/20100101 Firefox/143.0" "-"

If you try to access a non-existent page like http://localhost:8080/notfound, you'll see both an access log entry for the 404 and a corresponding error log entry:

text
172.19.0.1 - - [05/Oct/2025:12:56:23 +0000] "GET /notfound HTTP/1.1" 404 153 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:143.0) Gecko/20100101 Firefox/143.0" "-"
2025/10/05 12:56:23 [error] 33#33: *5 open() "/usr/share/nginx/html/notfound" failed (2: No such file or directory), client: 172.19.0.1, server: localhost, request: "GET /notfound HTTP/1.1", host: "localhost:8080"

This simple setup will be our laboratory for the rest of this guide.

Configuring NGINX access logs: from plain text to JSON

NGINX ships with a default access log format that works well enough for quick inspection and can be parsed by most log processing tools. The problem is that it's fragile, since any change in the format can break downstream parsing.

To build a reliable logging pipeline, it's better to move beyond the default and define a structured format from the start. Logging in JSON is the answer, since it's a universal format that is trivial for any programming language or logging tool to parse.

The access_log directive defines both where logs are written and the format they use:

text
access_log <path> [format];

To enable structured logging, let's create a custom log_format that outputs JSON. This can be done by placing the following code before the log_format main <format> line in your NGINX config file:

nginx
log_format json escape=json
  '{'
  '"timestamp": "$time_iso8601",'
  '"client_ip": "$remote_addr",'
  '"request_id": "$request_id",'
  '"http_method": "$request_method",'
  '"http_uri": "$request_uri",'
  '"protocol": "$server_protocol",'
  '"host": "$host",'
  '"user_agent": "$http_user_agent",'
  '"referer": "$http_referer",'
  '"status_code": $status,'
  '"bytes_sent": $body_bytes_sent,'
  '"request_time_ms": $request_time'
  '}';

Then apply the json format in your access_log directive:

nginx
access_log /var/log/nginx/access.log json;

With this in place, every request will be written as structured JSON instead of the default main format. In the log_format directive, any static text you wrap in quotes is written literally, and any $variable is replaced with its current value at runtime.

Note that request headers can be logged with the $http_<headername> variables (for example $http_user_agent or $http_x_forwarded_for), and response headers use $sent_http_<headername>.
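
As an illustration, here are two extra fields you could add to the json format above, assuming you care about the X-Forwarded-For request header and the Content-Type response header:

nginx
# extra fields to add among the existing ones in the json format
# (keep the comma placement consistent so the output stays valid JSON)
'"forwarded_for": "$http_x_forwarded_for",'
'"response_content_type": "$sent_http_content_type",'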

If you're wondering about other variables you can include, NGINX exposes a long list through its core and module configuration. The NGINX variable index is the authoritative list, and specific modules may provide even more.

After updating your NGINX config once more, restart the nginx service:

bash
docker compose restart nginx

Now, incoming requests to the server will be logged in a clean, flat JSON format:

json
{
  "timestamp": "2025-10-05T13:33:14+00:00",
  "client_ip": "172.19.0.1",
  "request_id": "72e5a0eeb44b6608d161254a5eaf1662",
  "http_method": "GET",
  "http_uri": "/",
  "protocol": "HTTP/1.1",
  "host": "localhost",
  "user_agent": "curl/8.5.0",
  "referer": "",
  "status_code": 200,
  "bytes_sent": 615,
  "request_time_ms": 0.0
}
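
To spot-check the output, you can send a request and pretty-print the newest log line (this assumes jq is installed and that the latest line is an access entry):

bash
curl -s -o /dev/null http://localhost:8080/
docker compose logs --no-log-prefix nginx | tail -n 1 | jq .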

Setting up NGINX access log levels

Unlike error logs, access logs in NGINX don't have a built-in concept of log levels. By default, every access log entry is treated the same, whether it represents a successful request or a server error. However, it's often useful to attach a severity level so that downstream tools can prioritize or filter events more effectively.

The most common way to introduce log levels into access logs is by using the map directive. This allows you to define a variable whose value depends on another field, such as the response status code. You can then insert this variable into your log_format.

Here's an example that maps HTTP status codes to log levels:

nginx
# nginx.conf
http {
    map $status $log_level {
        ~^2 "INFO";
        ~^3 "INFO";
        ~^4 "WARN";
        ~^5 "ERROR";
        default "INFO";
    }

    log_format json escape=json
      '{'
      '"timestamp":"$time_iso8601",'
      '"level":"$log_level",' # add this line
      # [...]
      '}';
}

With this configuration:

  • Successful and redirection responses are labeled as INFO.
  • Client errors are labeled as WARN.
  • Server errors are labeled as ERROR.
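
A quick sanity check after restarting NGINX: request a missing page and confirm the entry carries the WARN level:

bash
curl -s -o /dev/null http://localhost:8080/notfound
# the corresponding access log entry should now include the WARN level
docker compose logs --no-log-prefix nginx | grep WARN | tail -n 1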

Reducing noise with conditional logging

In busy production systems, logging every request is usually unnecessary. For example, writing thousands of lines for successful 200 OK responses to static assets can have you paying for logs that don't add much value.

Conditional logging in NGINX is controlled by the if= parameter on the access_log directive. The condition is a variable: when its value is empty or "0", the request is not logged; any other value means it is. You can drive that variable in several ways, but a common pattern is using the map directive.

Here's an example configuration that suppresses 2xx responses, but records everything else:

nginx
# Define a map that sets $should_log to 0 for 2xx responses, and 1 for others
map $status $should_log {
    ~^[2] 0;
    default 1;
}

# Apply our JSON format
log_format json ...; # (your JSON format from above)

# Use the 'if' condition on the access_log directive
access_log /var/log/nginx/access.log json if=$should_log;

This configuration will drastically reduce your log volume while ensuring you capture all redirections (3xx), client-side (4xx) and server-side (5xx) errors.
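
The condition doesn't have to hinge on the status code alone. As a rough sketch (the extension list and variable names are just examples), you could instead skip only successful requests for static assets while still logging everything else:

nginx
# flag requests for common static assets
map $uri $is_static {
    "~*\.(css|js|png|jpg|jpeg|gif|svg|ico|woff2?)$" 1;
    default 0;
}

# skip logging only when the response is 2xx AND the request was for a static asset
map "$status:$is_static" $should_log {
    ~^2\d\d:1 0;
    default 1;
}

access_log /var/log/nginx/access.log json if=$should_log;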

Disabling NGINX access logs

You can also completely disable the access log using the special off value or by redirecting to /dev/null:

nginx
access_log off;
access_log /dev/null;

For example, it's common to disable logging for health checks or metrics endpoints:

nginx
location /metrics {
    access_log off;
}

This approach keeps your logs focused on meaningful traffic while cutting noise from predictable or low-value requests.

Configuring NGINX error logs

The error_log directive configures where and how NGINX reports problems. Its syntax is straightforward:

nginx
error_log <path> [level];

The path defines where the log file is stored, and the level sets the minimum severity of messages to include.

Error logs can be tuned to capture only what you care about. Levels range from most to least severe:

  • emerg: Emergency situations where the system is unusable.
  • alert: A serious issue requiring immediate action.
  • crit: Critical conditions that need to be addressed.
  • error: A standard error occurred during request processing. This is the default level.
  • warn: A warning about an unusual event that is not necessarily an error but should be investigated.
  • notice: A noteworthy, normal event.
  • info: Informational messages about processing.
  • debug: Highly detailed debugging information.

When you choose a level, NGINX logs messages at that level and all levels above it. For example, setting warn will also log error, crit, alert, and emerg messages.

A common configuration looks like this:

nginx
error_log /var/log/nginx/error.log warn;

This keeps logs concise in production, while still recording warnings and errors that may require attention.
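
The directive can also be overridden per context, so you can keep the global level strict while raising verbosity only where you need it. A hypothetical sketch (the server name and file path are placeholders):

nginx
error_log /var/log/nginx/error.log warn;

http {
    server {
        listen 80;
        server_name staging.example.com;

        # more detail for this one virtual host only
        error_log /var/log/nginx/staging-error.log info;
    }
}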

Understanding the default error log format

NGINX error logs are plain text, with each line following a consistent structure:

text
2025/10/05 13:42:23 [notice] 1#1: start worker process 43
2025/10/05 14:32:10 [error] 12345#12345: *7 open() "/usr/share/nginx/html/missing.html" failed (2: No such file or directory), client: 172.19.0.1, server: localhost, request: "GET /missing.html HTTP/1.1", host: "localhost:8080"

A typical entry contains:

  • When the error occurred,
  • The log level in square brackets ([error], [warn]),
  • Process and thread ID (12345#12345),
  • The connection ID,
  • And the actual issue, often including system error codes and contextual information.

This format is easy to read for humans and generally good enough for troubleshooting, but it isn't designed to be customizable. You only configure what gets logged (via severity levels), not how the line is structured.

This means that the only way to customize the error log format is through downstream log processors rather than modifying the format in NGINX itself. We'll look at how to handle that in the next section.

Integrating NGINX logs with OpenTelemetry

Structured logs are valuable on their own, but they become much more powerful when plugged into the rest of your observability stack. The industry standard way to do that is with OpenTelemetry (OTel).

OpenTelemetry is an open, vendor-neutral standard for collecting metrics, traces, and logs. When you align NGINX output with the OTel model and conventions, your logs become first-class signals in a unified pipeline.

Converting NGINX logs into the OpenTelemetry data model

The first step is ingesting your NGINX logs into an OTel pipeline through the OpenTelemetry Collector.

Start by adding an otelcol service to your Docker Compose file, and share the NGINX log directory between the two containers so that the Collector can read the log files:

yaml
# docker-compose.yml
services:
  nginx:
    image: nginx:1.29.1
    container_name: nginx-server
    ports:
      - 8080:80
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
      - nginx-logs:/var/log/nginx # shared so the Collector can read the log files
  otelcol:
    image: otel/opentelemetry-collector-contrib:0.136.0
    container_name: otelcol
    volumes:
      - ./otelcol.yaml:/etc/otelcol-contrib/config.yaml
      - nginx-logs:/var/log/nginx:ro # mount the NGINX log directory read-only
    restart: unless-stopped

volumes:
  nginx-logs:

Then create a Collector configuration file in the same directory and populate it as follows:

yaml
# otelcol.yaml
receivers:
  filelog:
    include:
      - /var/log/nginx/*.log
    start_at: beginning

processors:
  batch:

exporters:
  debug:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [debug]

The filelog receiver is the appropriate Collector component for ingesting NGINX logs. You only need to specify the file paths to read and set start_at to beginning so that all existing logs are ingested (and enable checkpointing so that a Collector restart doesn't re-read the files and produce duplicates).
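
Checkpointing is handled by pairing the receiver with a storage extension. Here's a minimal sketch using the file_storage extension (the directory is an example and must exist, be writable, and survive restarts, for example via a mounted volume):

yaml
# otelcol.yaml (checkpointing sketch)
extensions:
  file_storage:
    directory: /var/lib/otelcol

receivers:
  filelog:
    include:
      - /var/log/nginx/*.log
    start_at: beginning
    storage: file_storage # persist read offsets so restarts don't re-ingest files

service:
  extensions: [file_storage]
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch]
      exporters: [debug]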

Afterwards, launch the otelcol service with:

bash
docker compose up otelcol -d

Once the service is running, you can tail the Collector logs to see what it's doing. The debug exporter prints every processed log record in the OpenTelemetry data model:

bash
docker compose logs -f --no-log-prefix otelcol

You'll likely see multiple entries depending on what's in your access log. A typical record looks like this:

text
LogRecord #0
ObservedTimestamp: 2025-10-05 20:34:40.70007603 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str({"timestamp": "2025-10-05T20:15:48+00:00","level": "INFO","client_ip": "172.19.0.1","request_id": "b9737cfe87f3f53c9dc9a6c3317f9f1d","http_method": "GET","http_uri": "/?userID=1234","protocol": "HTTP/1.1","host": "localhost","user_agent": "curl/8.5.0","referer": "","status_code": 200,"bytes_sent": 615,"request_time_ms": 0.000})
Attributes:
-> log.file.name: Str(/var/log/nginx/access.log)
Trace ID:
Span ID:
Flags: 0

This is the raw OpenTelemetry representation of a log record. At this stage, the JSON log is stored entirely in the Body field, and the only attribute set is the path of the file the log was read from. Important fields like Timestamp and SeverityNumber are missing.

To enrich these records, you need to add parsing operators. Since the Body already contains JSON, the correct tool for the job is the json_parser operator, which extracts fields from the string and places them into structured attributes:

yaml
# otelcol.yaml
receivers:
  filelog:
    include:
      - /var/log/nginx/access.log
    start_at: beginning
    operators:
      - type: json_parser

With the json_parser operator in place, each log entry is parsed into key–value pairs and stored as attributes. In addition to having the entire JSON blob inside Body, you now get a structured set of Attributes that can be queried or transformed downstream:

text
LogRecord #0
ObservedTimestamp: 2025-10-05 20:34:40.70007603 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str({"timestamp": "2025-10-05T20:15:48+00:00","level": "INFO","client_ip": "172.19.0.1","request_id": "b9737cfe87f3f53c9dc9a6c3317f9f1d","http_method": "GET","http_uri": "/?userID=1234","protocol": "HTTP/1.1","host": "localhost","user_agent": "curl/8.5.0","referer": "","status_code": 200,"bytes_sent": 615,"request_time_ms": 0.000})
Attributes:
-> protocol: Str(HTTP/1.1)
-> host: Str(localhost)
-> user_agent: Str(curl/8.5.0)
-> status_code: Double(200)
-> level: Str(INFO)
-> log.file.name: Str(/var/log/nginx/access.log)
-> bytes_sent: Double(615)
-> http_method: Str(GET)
-> request_time_ms: Double(0)
-> timestamp: Str(2025-10-05T20:15:48+00:00)
-> client_ip: Str(172.19.0.1)
-> request_id: Str(b9737cfe87f3f53c9dc9a6c3317f9f1d)
-> http_uri: Str(/?userID=1234)
-> referer: Str()
Trace ID:
Span ID:
Flags: 0

At this stage, the attributes are extracted, but OpenTelemetry still doesn't know which ones represent the event time or severity. To address this, you can tell the Collector which attributes should be treated as the event Timestamp and which should define the log Severity*:

yaml
# otelcol.yaml
receivers:
  filelog:
    include:
      - /var/log/nginx/access.log
    start_at: beginning
    operators:
      - type: json_parser
        timestamp:
          parse_from: attributes.timestamp
          layout: "%Y-%m-%dT%H:%M:%S%j"
        severity:
          parse_from: attributes.level

The Timestamp and Severity* fields are now set correctly:

text
LogRecord #0
ObservedTimestamp: 2025-10-05 20:43:02.643812431 +0000 UTC
Timestamp: 2025-10-05 20:15:48 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
. . .

The next step is to align your attributes with the OpenTelemetry semantic conventions so that your logs use consistent field names across different services and components.

Mapping attributes to semantic conventions

Semantic conventions are predefined attribute names designed to make telemetry data consistent and portable. Instead of each service inventing its own field names, OpenTelemetry defines a shared vocabulary that all telemetry data should use.

Take client IPs as an example: one system might log it as client_ip, another as remote_addr. If both are mapped to the client.address attribute, your observability backend can recognize them as the same thing.

For NGINX access logs, you can update your log format to use semantic attributes directly. By doing so, your logs won't just be structured but also aligned with the same conventions used across the rest of your applications and infrastructure.

Here's an updated log_format directive that uses OpenTelemetry semantic conventions:

nginx
log_format otel_json escape=json
  '{'
  '"timestamp": "$time_iso8601",'
  '"level": "INFO",'
  '"client.address": "$remote_addr",'
  '"client.port": "$remote_port",'
  '"url.path": "$request_uri",'
  '"url.query": "$args",'
  '"network.protocol.name": "$server_protocol",'
  '"server.address": "$host",'
  '"server.port": "$server_port",'
  '"user_agent.original": "$http_user_agent",'
  '"http.request.method": "$request_method",'
  '"http.request.header.referer": "$http_referer",'
  '"http.response.status_code": $status,'
  '"http.response.body.size": $body_bytes_sent,'
  '"request_time_ms": $request_time'
  '}';

Here's the resulting access log as seen through the debug exporter:

text
LogRecord #0
ObservedTimestamp: 2025-10-05 22:00:34.676431611 +0000 UTC
Timestamp: 2025-10-05 14:54:12 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str({ "timestamp": "2025-10-05T14:54:12+00:00", "level": "INFO", "client.address": "172.19.0.1", "client.port": "59660", "url.path": "/?userID=1234", "url.query": "userID=1234", "network.protocol.name": "HTTP/1.1", "server.address": "localhost", "server.port": "80", "user_agent.original": "curl/8.5.0", "http.request.method": "GET", "http.request.header.referer": "", "http.response.status_code": 200, "http.response.body.size": 615, "request_time_ms": 0.0})
Attributes:
-> client.port: Str(59660)
-> user_agent.original: Str(curl/8.5.0)
-> timestamp: Str(2025-10-05T14:54:12+00:00)
-> url.query: Str(userID=1234)
-> level: Str(INFO)
-> http.request.header.referer: Str()
-> server.port: Str(80)
-> log.file.name: Str(/var/log/nginx/access.log)
-> http.response.body.size: Double(615)
-> request_time_ms: Double(0)
-> http.request.method: Str(GET)
-> server.address: Str(localhost)
-> network.protocol.name: Str(HTTP/1.1)
-> http.response.status_code: Double(200)
-> client.address: Str(172.19.0.1)
-> url.path: Str(/?userID=1234)
Trace ID:
Span ID:
Flags: 0

With this approach, your NGINX logs are not only structured and machine-friendly, but also immediately usable within any OpenTelemetry-compliant observability backend.

If you can't change nginx.conf directly, you can still achieve the same result by handling the mapping in the Collector through processors such as transform or attributes.
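
For instance, here's a rough sketch of that remapping with the transform processor (the attribute names are taken from the earlier json format, and the processor still needs to be added to the logs pipeline):

yaml
# otelcol.yaml (sketch: remap a few attributes to semantic conventions in the Collector)
processors:
  transform/semconv:
    log_statements:
      - set(log.attributes["client.address"], log.attributes["client_ip"])
      - delete_key(log.attributes, "client_ip")
      - set(log.attributes["http.request.method"], log.attributes["http_method"])
      - delete_key(log.attributes, "http_method")
      - set(log.attributes["http.response.status_code"], log.attributes["status_code"])
      - delete_key(log.attributes, "status_code")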

One advantage of that approach is that you decouple log generation from log normalization. Your NGINX servers can continue to emit logs in whichever format is convenient, while the Collector takes responsibility for aligning them with semantic conventions before exporting them downstream.

Whenever possible, defining semantic attributes at the source (in NGINX itself) is preferable. But if that's not feasible, Collector-side remapping ensures your logs are still consistent and compliant with OpenTelemetry standards.

Correlating NGINX logs and traces

Logs become even more valuable when you can tie them to traces. A trace represents the journey of a single request through your system: every step shares the same trace_id, and each step is recorded as a span with its own span_id.

If your NGINX access logs capture these identifiers, you can jump directly from a log to the distributed trace that shows the request's entire lifecycle.

To support this, you need the ngx_otel_module, which adds OpenTelemetry tracing to NGINX. Start with an image that includes the module:

yaml
# docker-compose.yml
services:
  nginx:
    image: nginx:1.29.1-otel

Then load the module in your NGINX config:

nginx
# nginx.conf
user nginx;
worker_processes auto;
error_log /var/log/nginx/error.log notice;
pid /run/nginx.pid;
load_module modules/ngx_otel_module.so;
. . .

Inside the http block, enable propagation of W3C trace context headers:

nginx
# nginx.conf
http {
    otel_trace_context propagate;
    . . .
}

The otel_trace_context propagate directive ensures that if a request includes trace context headers, NGINX reuses them. If no headers are present, NGINX generates a new trace ID automatically.

Adding trace fields to logs

To include trace identifiers in your access logs, extend your log_format with the $otel_* variables:

nginx
log_format otel_json escape=json
  '{'
  '"timestamp": "$time_iso8601",'
  '"level": "INFO",'
  '"client.address": "$remote_addr",'
  '"client.port": "$remote_port",'
  '"url.path": "$request_uri",'
  '"url.query": "$args",'
  '"network.protocol.name": "$server_protocol",'
  '"server.address": "$host",'
  '"server.port": "$server_port",'
  '"user_agent.original": "$http_user_agent",'
  '"http.request.method": "$request_method",'
  '"http.request.header.referer": "$http_referer",'
  '"http.response.status_code": $status,'
  '"http.response.body.size": $body_bytes_sent,'
  '"request_time_ms": $request_time,'
  '"trace_id": "$otel_trace_id",'
  '"span_id": "$otel_span_id",'
  '"parent_id": "$otel_parent_id",'
  '"parent_sampled": $otel_parent_sampled'
  '}';

With this in place, sending a request that includes a W3C traceparent header will result in those IDs being recorded:

bash
curl 'http://localhost:8080?userID=1234' \
  -H 'traceparent: 00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01'

json
{
  [...],
  "trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
  "span_id": "05a62836f6cc1b17",
  "parent_id": "00f067aa0ba902b7",
  "parent_sampled": 1
}

Requests without the header generate a new trace:

bash
curl 'http://localhost:8080?userID=1234'

json
{
  [...],
  "trace_id": "eede940b121ead9ef3fd3b3d912b7596",
  "span_id": "c0cd388d0f7d8cea",
  "parent_id": "",
  "parent_sampled": 0
}

With the trace context now present in your log attributes, the final step is to ensure they're promoted into the standard Trace ID, Span ID, and Flags fields.

The Collector's transform processor is the right tool for the job here:

yaml
# otelcol.yaml
processors:
  transform:
    log_statements:
      # set trace context fields
      - set(log.trace_id.string, log.attributes["trace_id"])
      - set(log.span_id.string, log.attributes["span_id"])
      - set(log.flags, log.attributes["parent_sampled"])
  . . .

service:
  pipelines:
    logs:
      receivers: [filelog]
      processors: [batch, transform] # add transform here
      exporters: [debug]

The result is a perfectly compliant OpenTelemetry access log record with Trace ID, Span ID, and Flags now set:

text
LogRecord #0
ObservedTimestamp: 2025-10-05 22:52:23.784085403 +0000 UTC
Timestamp: 2025-10-05 22:34:38 +0000 UTC
SeverityText: INFO
SeverityNumber: Info(9)
Body: Str(...)
Attributes:
. . .
Trace ID: eede940b121ead9ef3fd3b3d912b7596
Span ID: c0cd388d0f7d8cea
Flags: 1

Bringing NGINX error logs into the pipeline

So far we've focused exclusively on access logs, since they're much easier to parse and extract attributes from (as long as they're structured at the source). Let's now see how to do the same with error logs.

If you haven't already, ensure that your error log files are being ingested by the filelog receiver. In the debug exporter, such logs will look like this:

text
LogRecord #1
ObservedTimestamp: 2025-10-06 06:49:46.814022102 +0000 UTC
Timestamp: 1970-01-01 00:00:00 +0000 UTC
SeverityText:
SeverityNumber: Unspecified(0)
Body: Str(2025/10/05 14:32:10 [error] 12345#12345: *7 open() "/usr/share/nginx/html/missing.html" failed (2: No such file or directory), client: 172.19.0.1, server: localhost, request: "GET /missing.html HTTP/1.1", host: "localhost:8080")
Attributes:
-> log.file.name: Str(/var/log/nginx/error.log)
Trace ID:
Span ID:
Flags: 0

Since error logs don't natively support JSON, parsing is about extracting fields and aligning them with the OpenTelemetry data model. For example:

  • The [error] token maps directly to SeverityNumber and SeverityText.
  • The timestamp at the start of the line becomes Timestamp.
  • The trailing diagnostic string (e.g. "open() ... failed") maps to Body.
  • Key-value hints like client: 172.19.0.1 can be pulled into Attributes.

To achieve this, you need to use the regex_parser operator like this:

yaml
# otelcol.yaml
receivers:
  filelog:
    include:
      - /var/log/nginx/error.log
    start_at: beginning
    operators:
      - type: regex_parser
        regex: '^(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) \[(?P<severity>[a-z]+)\] (?P<pid>\d+)#(?P<tid>\d+):(?: \*(?P<cid>\d+))? (?P<message>.*)$'
        timestamp:
          parse_from: attributes.timestamp
          layout: "%Y/%m/%d %H:%M:%S"
        severity:
          parse_from: attributes.severity

This produces:

text
LogRecord #1
ObservedTimestamp: 2025-10-06 07:18:44.43975869 +0000 UTC
Timestamp: 2025-10-05 14:32:10 +0000 UTC
SeverityText: error
SeverityNumber: Error(17)
Body: Str(2025/10/05 14:32:10 [error] 12345#12345: *7 open() "/usr/share/nginx/html/missing.html" failed (2: No such file or directory), client: 172.19.0.1, server: localhost, request: "GET /missing.html HTTP/1.1", host: "localhost:8080")
Attributes:
-> severity: Str(error)
-> pid: Str(12345)
-> tid: Str(12345)
-> cid: Str(7)
-> message: Str(open() "/usr/share/nginx/html/missing.html" failed (2: No such file or directory), client: 172.19.0.1, server: localhost, request: "GET /missing.html HTTP/1.1", host: "localhost:8080")
-> log.file.name: Str(/var/log/nginx/error.log)
Trace ID:
Span ID:
Flags: 0

Here, the timestamp and severity have been correctly mapped, and the attributes common to every error log record have been extracted accordingly.
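
If you also want the key-value hints from the message (client, server, request) as their own attributes, one option is a second regex_parser that parses from attributes.message. This is only a sketch: not every error line carries these hints, hence on_error: send to keep records that don't match, and the attribute names are choices rather than requirements.

yaml
# otelcol.yaml — appended below the regex_parser operator shown above
operators:
  # [...] existing regex_parser operator
  - type: regex_parser
    parse_from: attributes.message
    on_error: send # keep records whose message doesn't contain these hints
    regex: 'client: (?P<client_address>[^,]+), server: (?P<server_name>[^,]+), request: "(?P<http_request>[^"]+)"'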

Final thoughts

With these practices in place, NGINX stops being just an entry point for traffic and becomes an entry point for insight. Instead of grepping through unstructured files, you'll be querying, correlating, and visualizing logs as part of a complete observability workflow.

Logging doesn't have to be an afterthought. Get it right, and it becomes one of your most powerful tools for understanding and improving your systems.

Authors
Ayooluwa Isaiah