Closed
Component(s)
receiver/datadog
What happened?
Description
When doing W3C Trace Context propagation from an app instrumented with OTel to an app instrumented with the Datadog agent, where the Datadog agent sends its spans to the OTel Collector Datadog Receiver, the trace ID reported by the Datadog Receiver differs from the trace ID of the parent spans, which breaks the trace.
I can't confirm it, but I suspect this is caused by the logic in the OTel Collector Datadog Receiver that produces the OTel trace ID from the Datadog IDs.
Architecture:
┌──────────┐                                 ┌───────────┐───┐
│OTel Java ┼──────┐                          │           │   │
└─────┬────┘      │                          │Receiver   │ O │
      │           │                          │           │ T │
      │           │                          │───────────│ e │
      │           └─────────────────────────►│ OTLP      │ l │
      │                                      │           │   │
      │traceparent: trace=xyz, parent=abc    │           │ C │
      │                                      │───────────│ o │
      │                                      │           │ l │
┌─────▼──────┐ ┌────────────────────────────►┤Datadog    │   │
│Datadog Java┼─┘                             └───────────┘───┘
└────────────┘                               span:
                                               spanId=...
                                               parent=abc   <--CORRECT
                                               traceId=uvw  <--WRONG
OTel Collector debug log:
- First span: HTTP client call span emitted by a Java Spring Boot app instrumented by the OTel Java Agent v2.10.0 (OTel SDK v1.44.1) with
  traceId=37940834c74a2dfc11835c979eca1433
  spanId=bb4331d223d59950
- Second span: HTTP server span emitted by a Spring Boot app instrumented by the Datadog Java Agent v1.44.1 with
  parentId=bb4331d223d59950, as expected
  traceId=000000000000000011835c979eca1433, which is NOT expected; we expect 37940834c74a2dfc11835c979eca1433
Resource SchemaURL: https://opentelemetry.io/schemas/1.24.0
Resource attributes:
-> deployment.environment.name: Str(staging)
-> host.arch: Str(aarch64)
-> host.name: Str(cyrille-le-clerc-macbook.local)
-> os.description: Str(Mac OS X 15.2)
-> os.type: Str(darwin)
-> process.command_args: Slice([...,"-jar","target/checkout-1.1-SNAPSHOT.jar"])
-> process.executable.path: Str(.../bin/java)
-> process.pid: Int(14768)
-> process.runtime.description: Str(Homebrew OpenJDK 64-Bit Server VM 17.0.13+0)
-> process.runtime.name: Str(OpenJDK Runtime Environment)
-> process.runtime.version: Str(17.0.13+0)
-> service.instance.id: Str(ccad3c44-aebc-4f8b-96b9-c4ed6a5433c4)
-> service.name: Str(checkout)
-> service.namespace: Str(shop)
-> service.version: Str(1.1)
-> telemetry.distro.name: Str(opentelemetry-java-instrumentation)
-> telemetry.distro.version: Str(2.10.0)
-> telemetry.sdk.language: Str(java)
-> telemetry.sdk.name: Str(opentelemetry)
-> telemetry.sdk.version: Str(1.44.1)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope io.opentelemetry.java-http-client 2.10.0-alpha
Span #0
Trace ID : 37940834c74a2dfc11835c979eca1433
Parent ID : 179ce2ee48649594
ID : bb4331d223d59950
Name : POST
Kind : Client
Start time : 2024-12-23 17:34:54.397226541 +0000 UTC
End time : 2024-12-23 17:34:54.63652375 +0000 UTC
Status code : Unset
Status message :
Attributes:
-> server.address: Str(shipping.local)
-> tenant_id: Str(tenant-1)
-> http.request.method: Str(POST)
-> network.protocol.version: Str(1.1)
-> http.response.status_code: Int(200)
-> thread.id: Int(160)
-> server.port: Int(8088)
-> thread.name: Str(grpc-default-executor-36)
-> url.full: Str(http://shipping.local:8088/shipOrder)
ResourceSpans #1
Resource SchemaURL: https://opentelemetry.io/schemas/1.16.0
Resource attributes:
-> telemetry.sdk.language: Str(java)
-> process.runtime.version: Str(17.0.13)
-> service.version: Str(1.1)
-> telemetry.sdk.version: Str(Datadog-1.44.1~13a9a2d011)
-> telemetry.sdk.name: Str(Datadog)
-> service.name: Str(shipping)
-> host.name: Str(localhost)
-> os.type: Str(darwin)
ScopeSpans #0
ScopeSpans SchemaURL:
InstrumentationScope Datadog 1.44.1~13a9a2d011
Span #0
Trace ID : 000000000000000011835c979eca1433
Parent ID : bb4331d223d59950
ID : 6176a9d3ea94c1f7
Name : servlet.request
Kind : Server
Start time : 2024-12-23 17:34:54.453192875 +0000 UTC
End time : 2024-12-23 17:34:54.638774084 +0000 UTC
Status code : Ok
Status message :
Attributes:
-> dd.span.Resource: Str(POST /shipOrder)
-> sampling.priority: Str(1.000000)
-> datadog.span.id: Str(7022987396569678327)
-> datadog.trace.id: Str(1261954126867731507)
-> servlet.path: Str(/shipOrder)
-> deployment.environment: Str(production)
-> peer.ipv4: Str(127.0.0.1)
-> thread.name: Str(http-nio-8088-exec-1)
-> language: Str(jvm)
-> service.version: Str(1.1)
-> span.kind: Str(server)
-> http.method: Str(POST)
-> _dd.p.dm: Str(-0)
-> http.status_code: Str(200)
-> _dd.tracer_host: Str(cyrille-le-clerc-macbook.local)
-> http.url: Str(http://shipping.local:8088/shipOrder)
-> http.hostname: Str(shipping.local)
-> _dd.p.tid: Str(37940834c74a2dfc)
-> servlet.context: Str(/)
-> http.route: Str(/shipOrder)
-> runtime-id: Str(8265563a-4256-4741-ba0c-ebbb676d4473)
-> http.useragent: Str(Java-http-client/17.0.13)
-> component: Str(tomcat-server)
-> thread.id: Double(37)
-> process.pid: Double(73021)
-> _dd.profiling.enabled: Double(0)
-> peer.port: Double(64145)
-> _dd.trace_span_attribute_schema: Double(0)
-> _sampling_priority_v1: Double(1)
-> _dd.measured: Double(1)
-> _dd.top_level: Double(1)
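The attributes of the Datadog span hint at where the missing bits are: `_dd.p.tid` carries `37940834c74a2dfc`, which is the upper half of the expected trace ID, while `datadog.trace.id` is the lower half written in decimal. Below is a minimal Python sketch, not the receiver's actual code, of how the two values could be combined. The function name `otel_trace_id` is hypothetical, and treating `_dd.p.tid` as the high-order 64 bits is an assumption drawn from the values in this log.

```python
def otel_trace_id(dd_trace_id: int, dd_p_tid: str = "") -> str:
    """Build a 32-hex-char trace ID from a 64-bit Datadog trace ID.

    Assumption: dd_p_tid holds the high-order 64 bits as hex.
    When it is missing, the high half is zero-padded, which reproduces
    the trace ID observed in this report.
    """
    if not 0 <= dd_trace_id < 2**64:
        raise ValueError("Datadog trace ID must fit in 64 bits")
    high = int(dd_p_tid, 16) if dd_p_tid else 0
    if not 0 <= high < 2**64:
        raise ValueError("_dd.p.tid must fit in 64 bits")
    return f"{high:016x}{dd_trace_id:016x}"


# Values copied from the Datadog span in the debug log above.
low = 1261954126867731507  # datadog.trace.id (decimal)

print(otel_trace_id(low))                      # 000000000000000011835c979eca1433
print(otel_trace_id(low, "37940834c74a2dfc"))  # 37940834c74a2dfc11835c979eca1433
```

The first call yields the trace ID reported by the Datadog Receiver; the second yields the trace ID of the upstream OTel span.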
Steps to Reproduce
- Set up an OTel Collector with both the OTLP and Datadog receivers and the debug exporter
- Create two Spring Boot apps, one "upstream_app" calling the "downstream_app" through an HTTP call
  - On the HTTP handler of the "downstream_app", dump the traceparent HTTP header to verify the context is propagated
- Instrument the upstream app with OTel Java Auto Instr v2.10.0
- Instrument the downstream app with dd-trace-java v1.44.1
export DD_TRACE_AGENT_URL="http://localhost:8126"
# disabling remote config to ensure no weird behavior
export DD_REMOTE_CONFIGURATION_ENABLED=false
java \
-javaagent:"$DATADOG_AGENT_JAR" \
-Dserver.port=8088 \
-jar target/shipping-1.1-SNAPSHOT.jar
- Invoke the "upstream_app" to trigger an http call to the "downstream_app"
- Inspect the produced spans in the OTel collector logs
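To compare the `traceparent` header dumped in the downstream handler with the IDs in the collector logs, the header can be split into its fields. The following is a minimal Python sketch that assumes the version `00` layout of W3C Trace Context (`version-traceid-parentid-flags`). The helper name `parse_traceparent` is hypothetical, and the sample header is assembled from the IDs in this report with an assumed flags value of `01`.

```python
import re

# Version 00 layout: 2 hex chars, 32 hex chars, 16 hex chars, 2 hex chars.
_TRACEPARENT = re.compile(
    r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})"
)


def parse_traceparent(header: str) -> dict:
    """Split a traceparent header into its four fields."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {header!r}")
    version, trace_id, parent_id, flags = match.groups()
    # All-zero IDs and version ff are invalid per the W3C spec.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        raise ValueError(f"invalid traceparent: {header!r}")
    return {
        "version": version,
        "trace_id": trace_id,
        "parent_id": parent_id,
        "flags": flags,
    }


# Sample built from the IDs in this report; the "01" flags are assumed.
sample = "00-37940834c74a2dfc11835c979eca1433-bb4331d223d59950-01"
fields = parse_traceparent(sample)
print(fields["trace_id"], fields["parent_id"])
```

If the propagation works, `trace_id` should equal the trace ID of both spans in the collector logs, and `parent_id` should equal the span ID of the upstream HTTP client span.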
Expected Result
- The spans in the OTel Collector logs show that the trace context is properly propagated: there is just one traceId, and the parentId of the HTTP handler of the "downstream_app" matches the spanId of the HTTP call of the "upstream_app".
Actual Result
The parentId is properly propagated, but the traceId is wrong.
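This symptom can be detected mechanically: a span whose parent exists in the exported data but under a different trace ID. Here is a minimal Python sketch of such a check. The function name `find_split_traces` and the dictionary shape are hypothetical; the sample values are copied from the two spans in the debug log. It assumes span IDs are unique across the inspected batch.

```python
def find_split_traces(spans):
    """Return (child_span_id, parent_span_id) pairs where the parent
    span is present but carries a different trace ID than the child."""
    by_span_id = {span["span_id"]: span for span in spans}
    broken = []
    for span in spans:
        parent = by_span_id.get(span.get("parent_id"))
        if parent is not None and parent["trace_id"] != span["trace_id"]:
            broken.append((span["span_id"], parent["span_id"]))
    return broken


# The two spans from the collector debug log in this report.
spans = [
    {
        "trace_id": "37940834c74a2dfc11835c979eca1433",
        "span_id": "bb4331d223d59950",
        "parent_id": "179ce2ee48649594",
    },
    {
        "trace_id": "000000000000000011835c979eca1433",
        "span_id": "6176a9d3ea94c1f7",
        "parent_id": "bb4331d223d59950",
    },
]

print(find_split_traces(spans))  # [('6176a9d3ea94c1f7', 'bb4331d223d59950')]
```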
Collector version
v0.116.0
Environment information
Environment
macOS 15.2
Demo app:
- http client call: https://github.com/cyrille-leclerc/my-shopping-cart/blob/934ff3c0aa99af9c92e8a80ea51cb9a833ef6e1a/checkout/src/main/java/com/mycompany/checkout/CheckoutServiceServer.java#L112-L115
- http server handler: https://github.com/cyrille-leclerc/my-shopping-cart/blob/934ff3c0aa99af9c92e8a80ea51cb9a833ef6e1a/shipping/src/main/java/com/mycompany/shipping/ShippingController.java#L31-L49
OpenTelemetry Collector configuration
receivers:
  datadog:
    endpoint: localhost:8126
    read_timeout: 60s
exporters:
  debug:
    verbosity: detailed
service:
  pipelines:
    traces:
      receivers: [datadog]
      processors:
      exporters: [debug]
Log output
See bug description
Additional context
No response