Skip to content

Commit 5f6067f

Browse files
authored
Add trace log correlation (#21)
* adjust version code and add version tag to readme * Add trace/log correlation info * Add instructions for how to add trace ids to your logs * Made some decisions about some log levels * Add span ID, add more detail in the README * Update README and version number * Explicitly mention cloning the lambda pipeline
1 parent 0947f53 commit 5f6067f

File tree

6 files changed

+285
-7
lines changed

6 files changed

+285
-7
lines changed

README.md

Lines changed: 54 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -38,7 +38,7 @@ Once [installed](#installation), you should be able to submit custom metrics fro
3838

3939
Check out the instructions for [submitting custom metrics from AWS Lambda functions](https://docs.datadoghq.com/integrations/amazon_lambda/?tab=java#custom-metrics).
4040

41-
## Tracing
41+
## Distributed Tracing
4242

4343
Wrap your outbound HTTP requests with trace headers to see your lambda in context in APM.
4444
The Lambda Java Client Library provides instrumented HTTP connection objects as well as helper methods for
@@ -168,6 +168,59 @@ public class Handler implements RequestHandler<APIGatewayV2ProxyRequestEvent, AP
168168
}
169169
```
170170

171+
### Trace/Log Correlations
172+
173+
In order to correlate your traces with your logs, you must inject the trace context
174+
into your log messages. We've added the these into the slf4j MDC under the key `dd.trace_context`
175+
and provided convenience methods to get it automatically. The trace context is added to the MDC as a side
176+
effect of instantiating any `new DDLambda(...)`.
177+
178+
This is an example trace context: `[dd.trace_id=3371139772690049666 dd.span_id=13006875827893840236]`
179+
180+
#### JSON Logs
181+
182+
If you are using JSON logs, add the trace ID and span ID to each log message with the keys
183+
`dd.trace_id` and `dd.span_id` respectively. To get a map containing trace and span IDs,
184+
call `DDLambda.getTraceContext()`. Union this map with the JSON data being logged.
185+
186+
#### Plain text logs
187+
188+
If you are using plain text logs, then you must create a new [Parser](https://docs.datadoghq.com/logs/processing/parsing/?tab=matcher)
189+
by cloning the existing Lambda Pipeline. The new parser can extract the trace context from the correct position in the logs.
190+
Use the helper `_trace_context` to extract the trace context. For example, if your log line looked like:
191+
192+
```
193+
INFO 2020-11-11T14:00:00Z LAMBDA_REQUEST_ID [dd.trace_id=12345 dd.span_id=67890] This is a log message
194+
```
195+
196+
Then your parser rule would look like:
197+
198+
```
199+
my_custom_rule \[%{word:level}\]?\s+%{_timestamp}\s+%{notSpace:lambda.request_id}%{_trace_context}?.*
200+
```
201+
202+
#### Log4j / SLF4J
203+
204+
We have added the Trace ID into the slf4j MDC under the key `dd.trace_context`. That can be accessed
205+
using the `%X{dd.trace_context}` operator. Here is an example `log4j.properties`:
206+
207+
```
208+
log = .
209+
log4j.rootLogger = DEBUG, LAMBDA
210+
211+
log4j.appender.LAMBDA=com.amazonaws.services.lambda.runtime.log4j.LambdaAppender
212+
log4j.appender.LAMBDA.layout=org.apache.log4j.PatternLayout
213+
log4j.appender.LAMBDA.layout.conversionPattern=%d{yyyy-MM-dd HH:mm:ss} %X{dd.trace_context} %-5p %c:%L - %m%n
214+
```
215+
216+
would result in log lines looking like `2020-11-13 19:21:53 [dd.trace_id=1168910694192328743 dd.span_id=3204959397041471598] INFO com.serverless.Handler:20 - Test Log Message`
217+
218+
#### Other logging solutions
219+
220+
If you are using a different logging solution, the trace ID can be accessed using the method
221+
`DDLambda.getTraceContextString()`. That returns your trace ID as a string that can be added
222+
to any log message.
223+
171224
## Opening Issues
172225

173226
If you encounter a bug with this package, we want to hear about it. Before opening a new issue,

build.gradle

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -28,18 +28,20 @@ dependencies {
2828
implementation 'org.apache.httpcomponents:httpclient:4.5.11'
2929
implementation 'com.squareup.okhttp3:okhttp:4.4.0'
3030
implementation 'org.apache.commons:commons-lang3:3.10'
31-
31+
implementation 'org.slf4j:slf4j-log4j12:1.7.28'
32+
implementation 'org.jetbrains:annotations:15.0'
3233

3334
// Use JUnit test framework
3435
testImplementation 'junit:junit:4.12'
35-
implementation 'org.jetbrains:annotations:15.0'
36+
testImplementation 'org.apache.logging.log4j:log4j-api:2.13.3'
37+
testImplementation 'org.apache.logging.log4j:log4j-core:2.13.3'
3638
}
3739

3840
sourceCompatibility = 1.8
3941
targetCompatibility = 1.8
4042

4143
group = 'com.datadoghq'
42-
version= '0.0.6'
44+
version= '0.0.7'
4345

4446
allprojects {
4547
repositories {

src/main/java/com/datadoghq/datadog_lambda_java/DDLambda.java

Lines changed: 51 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,6 @@
11
package com.datadoghq.datadog_lambda_java;
22

3+
import org.slf4j.MDC;
34
import java.io.IOException;
45
import java.net.URL;
56
import java.net.URLConnection;
@@ -20,6 +21,7 @@ public class DDLambda {
2021
private String ENHANCED_PREFIX = "aws.lambda.enhanced.";
2122
private String INVOCATION = "invocations";
2223
private String ERROR = "errors";
24+
private String MDC_TRACE_CONTEXT_FIELD = "dd.trace_context";
2325
private Tracing tracing;
2426
private boolean enhanced = true;
2527

@@ -31,6 +33,19 @@ public DDLambda(Context cxt){
3133
this.tracing = new Tracing();
3234
this.enhanced = checkEnhanced();
3335
recordEnhanced(INVOCATION, cxt);
36+
MDC.put(MDC_TRACE_CONTEXT_FIELD, getTraceContextString());
37+
}
38+
39+
/**
40+
* Testing only: create a DDLambda instrumenter with a given context and xrayTraceInfo
41+
* @param cxt Enhanced Metrics pulls information from the Lambda context.
42+
* @param xrayTraceInfo This would normally be the contents of the "_X_AMZN_TRACE_ID" env var
43+
*/
44+
protected DDLambda(Context cxt, String xrayTraceInfo){
45+
this.tracing = new Tracing(xrayTraceInfo);
46+
this.enhanced = checkEnhanced();
47+
recordEnhanced(INVOCATION, cxt);
48+
MDC.put(MDC_TRACE_CONTEXT_FIELD, getTraceContextString());
3449
}
3550

3651
/**
@@ -44,6 +59,7 @@ public DDLambda(APIGatewayProxyRequestEvent req, Context cxt){
4459
recordEnhanced(INVOCATION, cxt);
4560
this.tracing = new Tracing(req);
4661
this.tracing.submitSegment();
62+
MDC.put(MDC_TRACE_CONTEXT_FIELD, getTraceContextString());
4763
}
4864

4965
/**
@@ -57,6 +73,7 @@ public DDLambda(APIGatewayV2ProxyRequestEvent req, Context cxt){
5773
recordEnhanced(INVOCATION, cxt);
5874
this.tracing = new Tracing(req);
5975
this.tracing.submitSegment();
76+
MDC.put(MDC_TRACE_CONTEXT_FIELD, getTraceContextString());
6077
}
6178

6279
/**
@@ -70,6 +87,7 @@ public DDLambda(Headerable req, Context cxt){
7087
recordEnhanced(INVOCATION, cxt);
7188
this.tracing = new Tracing(req);
7289
this.tracing.submitSegment();
90+
MDC.put(MDC_TRACE_CONTEXT_FIELD, getTraceContextString());
7391
}
7492

7593
private boolean checkEnhanced(){
@@ -212,4 +230,37 @@ public Request addTraceHeaders(Request request){
212230

213231
return rb.build();
214232
}
233+
234+
/**
235+
* Get the trace context for trace/log correlation. Inject this into your logs in order to correlate logs with traces.
236+
* @return a map of the current trace context
237+
*/
238+
public Map<String,String> getTraceContext(){
239+
if (this.tracing == null){
240+
DDLogger.getLoggerImpl().debug("No tracing context; unable to get Trace ID");
241+
return null;
242+
}
243+
return this.tracing.getLogCorrelationTraceAndSpanIDsMap();
244+
}
245+
246+
/**
247+
* Get the trace context in string form. Inject this into your logs in order to correlate logs with traces.
248+
* @return a string representation of the current trace context
249+
*/
250+
public String getTraceContextString(){
251+
Map<String,String> traceInfo = getTraceContext();
252+
if (traceInfo == null){
253+
DDLogger.getLoggerImpl().debug("No Trace/Log correlation IDs returned");
254+
return "";
255+
}
256+
257+
String traceID = traceInfo.get(this.tracing.TRACE_ID_KEY);
258+
String spanID = traceInfo.get(this.tracing.SPAN_ID_KEY);
259+
return formatTraceContext(this.tracing.TRACE_ID_KEY, traceID, this.tracing.SPAN_ID_KEY, spanID);
260+
}
261+
262+
private String formatTraceContext(String traceKey, String trace, String spanKey, String span){
263+
return String.format("[%s=%s %s=%s]", traceKey, trace, spanKey, span);
264+
}
265+
215266
}

src/main/java/com/datadoghq/datadog_lambda_java/Tracing.java

Lines changed: 89 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -16,6 +16,9 @@ public class Tracing {
1616
protected DDTraceContext cxt;
1717
protected XRayTraceContext xrt;
1818

19+
protected String TRACE_ID_KEY = "dd.trace_id";
20+
protected String SPAN_ID_KEY = "dd.span_id";
21+
1922
public Tracing(){
2023
this.xrt = new XRayTraceContext();
2124
}
@@ -35,6 +38,14 @@ public Tracing(Headerable req){
3538
this.xrt = new XRayTraceContext();
3639
}
3740

41+
/**
42+
* Test constructor that can take a dummy _X_AMZN_TRACE_ID value
43+
* @param xrayTraceInfo
44+
*/
45+
protected Tracing(String xrayTraceInfo){
46+
this.xrt = new XRayTraceContext(xrayTraceInfo);
47+
}
48+
3849

3950
public DDTraceContext getDDContext() {
4051
if (this.cxt == null) {
@@ -50,6 +61,31 @@ public XRayTraceContext getXrayContext() {
5061
return xrt;
5162
}
5263

64+
public Map<String,String> getLogCorrelationTraceAndSpanIDsMap(){
65+
if (this.cxt != null){
66+
String traceID = this.cxt.getTraceID();
67+
String spanID = this.cxt.getParentID();
68+
Map<String, String> out = new HashMap<String, String>();
69+
out.put(TRACE_ID_KEY, traceID);
70+
out.put(SPAN_ID_KEY, spanID);
71+
return out;
72+
}
73+
if (this.xrt != null){
74+
String traceID = this.xrt.getAPMTraceID();
75+
String spanID = this.xrt.getAPMParentID();
76+
Map<String, String> out = new HashMap<String, String>();
77+
out.put(TRACE_ID_KEY, traceID);
78+
out.put(SPAN_ID_KEY, spanID);
79+
return out;
80+
}
81+
DDLogger.getLoggerImpl().debug("No DD trace context or XRay trace context set!");
82+
return null;
83+
}
84+
85+
private String formatLogCorrelation(String trace, String span){
86+
return String.format("[dd.trace_id=%s dd.span_id=%s]", trace, span);
87+
}
88+
5389
private static DDTraceContext populateDDContext(Map<String,String> headers){
5490
DDTraceContext ctx = null;
5591
try{
@@ -317,7 +353,7 @@ class XRayTraceContext{
317353
public XRayTraceContext(){
318354
//Root=1-5e41a79d-e6a0db584029dba86a594b7e;Parent=8c34f5ad8f92d510;Sampled=1
319355
String traceId = System.getenv("_X_AMZN_TRACE_ID");
320-
if (traceId == null){
356+
if (traceId == null || traceId == ""){
321357
DDLogger.getLoggerImpl().debug("Unable to find _X_AMZN_TRACE_ID");
322358
return;
323359
}
@@ -337,6 +373,27 @@ public XRayTraceContext(){
337373
this.traceIdHeader = traceId;
338374
}
339375

376+
/**
377+
* Test constructor that can take a dummy _X_AMZN_TRACE_ID value rather than reading from env vars
378+
* @param traceId
379+
*/
380+
protected XRayTraceContext(String traceId){
381+
String[] traceParts = traceId.split(";");
382+
if(traceParts.length != 3){
383+
DDLogger.getLoggerImpl().error("Malformed _X_AMZN_TRACE_ID value: "+ traceId);
384+
return;
385+
}
386+
387+
try {
388+
this.traceId = traceParts[0].split("=")[1];
389+
parentId = traceParts[1].split("=")[1];
390+
} catch (Exception e){
391+
DDLogger.getLoggerImpl().error("Malformed _X_AMZN_TRACE_ID value: "+ traceId);
392+
return;
393+
}
394+
this.traceIdHeader = traceId;
395+
}
396+
340397
public void setTraceIdHeader(String traceIdHeader) {
341398
this.traceIdHeader = traceIdHeader;
342399
}
@@ -375,4 +432,35 @@ public String getAPMParentID(){
375432
return null;
376433
}
377434
}
435+
436+
public String getAPMTraceID(){
437+
//trace ID looks like 1-5e41a79d-e6a0db584029dba86a594b7e
438+
String bigid = "";
439+
try {
440+
bigid = this.traceId.split("-")[2];
441+
}
442+
catch (ArrayIndexOutOfBoundsException | NullPointerException ai){
443+
DDLogger.getLoggerImpl().debug("Unexpected format for the trace ID. Unable to parse it. " + this.traceId);
444+
return "";
445+
}
446+
447+
//just to verify
448+
if (bigid.length() != 24){
449+
DDLogger.getLoggerImpl().debug("Got an unusual traceid from x-ray. Unable to convert that to an APM id. " + this.traceId);
450+
return "";
451+
}
452+
453+
String last16 = bigid.substring(bigid.length()-16); // should be the last 16 characters of the big id
454+
455+
Long parsed = 0L;
456+
try {
457+
parsed = Long.parseUnsignedLong(last16, 16); //unsigned because parseLong throws a numberformatexception at anything greater than 0x7FFFF...
458+
}
459+
catch (NumberFormatException ne){
460+
DDLogger.getLoggerImpl().debug("Got a NumberFormatException trying to parse the traceID. Unable to convert to an APM id. " + this.traceId);
461+
return "";
462+
}
463+
parsed = parsed & 0x7FFFFFFFFFFFFFFFL; //take care of that pesky first bit...
464+
return parsed.toString();
465+
}
378466
}
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
1+
package com.datadoghq.datadog_lambda_java;
2+
3+
import java.io.ByteArrayOutputStream;
4+
import java.io.PrintStream;
5+
import java.util.Map;
6+
import org.junit.Assert;
7+
import org.junit.Test;
8+
import org.apache.log4j.Logger;
9+
import org.apache.log4j.ConsoleAppender;
10+
import org.apache.log4j.WriterAppender;
11+
import org.apache.log4j.PatternLayout;
12+
13+
14+
public class TraceLogCorrelationTest {
15+
private final ByteArrayOutputStream outContent = new ByteArrayOutputStream();
16+
private final PrintStream originalErr = System.err;
17+
18+
@Test
19+
public void TestWithLog4J(){
20+
Logger logger = Logger.getLogger(TraceLogCorrelationTest.class.getName());
21+
PatternLayout layout = new PatternLayout("%d{ISO8601} %X{dd.trace_context} [%t] %-5p %c %x - %m%n");
22+
logger.addAppender(new ConsoleAppender((layout)));
23+
24+
logger.info("Test log message");
25+
26+
EnhancedMetricTest.MockContext mc = new EnhancedMetricTest.MockContext();
27+
28+
//try to grab the contents of StdErr
29+
logger.addAppender(new WriterAppender(layout, outContent));
30+
DDLambda ddl = new DDLambda(mc, "Root=1-5fa98677-5dda00b12ec8b1c31f61aa82;Parent=b481b4a6ef3a296c;Sampled=1");
31+
outContent.reset();
32+
logger.info("Test log message 2");
33+
logger.removeAllAppenders();
34+
35+
String logOutput = outContent.toString();
36+
String[] logFields = logOutput.split(" ");
37+
Assert.assertEquals("[dd.trace_id=3371139772690049666", logFields[2]);
38+
Assert.assertEquals("dd.span_id=13006875827893840236]", logFields[3]);
39+
}
40+
41+
@Test
42+
public void TestGetTraceLogCorrelationId(){
43+
EnhancedMetricTest.MockContext mc = new EnhancedMetricTest.MockContext();
44+
DDLambda ddl = new DDLambda(mc, "Root=1-5fa98677-5dda00b12ec8b1c31f61aa82;Parent=b481b4a6ef3a296c;Sampled=1");
45+
46+
Map<String,String> traceInfo = ddl.getTraceContext();
47+
Assert.assertEquals("3371139772690049666", traceInfo.get("dd.trace_id"));
48+
Assert.assertEquals("13006875827893840236", traceInfo.get("dd.span_id"));
49+
}
50+
51+
}
52+

0 commit comments

Comments
 (0)