-
Notifications
You must be signed in to change notification settings - Fork 35
Description
/cc @lahsivjar
APM AWS Lambda extension version (apm-aws-lambda version): latest main (commit 02f4ef1) plus a patch for some debug logging
Describe the bug
I believe the extension is reporting a transaction twice in a certain case -- at least it appears that CreateProxyTxn is called twice. My case is when (a) using the extension and Node.js APM agent layers that support the recent partial transactions work and (b) a lambda function executes that results in terminating the function process (via process.exit(42) in my case). This results in the extension receiving the platform.runtimeDone event from the Logs API, and then the SHUTDOWN event. The handlers for both of those events result in calling b.finalizeInvocation(...).
I'll show full repro steps below, but the annotated log is here: https://gist.github.com/trentm/66ebb5eb18c43eea0a8c013e193a41e6
My guess at the issue: should this code be deleting reqID from b.invocations?
func (b *Batch) OnLambdaLogRuntimeDone(reqID, status string, time time.Time) error {
b.mu.Lock()
defer b.mu.Unlock()
return b.finalizeInvocation(reqID, status, time)
}To Reproduce
-
create a new Lambda function, mine is named
trentm-play-fn3name: trentm-play-fn3 runtime: Node.js 18.x architecture: x86_64 advanced settings/ [x] Enable function URL Auth type: NONEFunction URL: https://[REDACTED].lambda-url.us-west-2.on.aws/
You don't have to enable a function URL, but it is a convenient way to invoke the function. -
Handler code.
WARNING: The AWS Lambda console defaults to creating anindex.mjshandler
file. This uses ES modules that the Node.js APM agent doesn't currently support.
You must (a) deleteindex.mjsand (b) create aindex.jsfile with:module.exports.handler = async (event) => { console.log('[handler] going to process.exit now'); process.exit(42); };
-
Add layers to get the latest (unreleased) ext change, and the draft Node.js APM
changes (feat: pre-register partial transactions in AWS Lambda apm-agent-nodejs#3285) to make the
POST /register/transactioncalls to the extension. You can use the following, or build your own:arn:aws:lambda:us-west-2:627286350134:layer:trentm-play-apm-lambda-extension:8 arn:aws:lambda:us-west-2:627286350134:layer:trentm-dev-elastic-apm-node:53See note below for the patch to the extension in my build (for debug logging).
-
Configure envvars for the Lambda:
NODE_OPTIONS=-r elastic-apm-node/start ELASTIC_APM_LAMBDA_APM_SERVER=https://trentm-play11.apm.us-west2.gcp.elastic-cloud.com:9243 ELASTIC_APM_SECRET_TOKEN=[REDACTED] ELASTIC_APM_LOG_LEVEL=trace -
Call the Lambda function once. We expect a 502 because it will crash.
% curl -i https://[REDACTED].lambda-url.us-west-2.on.aws/ HTTP/1.1 502 Bad Gateway Date: Tue, 25 Apr 2023 22:30:32 GMT Content-Type: application/json Content-Length: 21 Connection: keep-alive x-amzn-RequestId: 343754cf-7763-402d-a7d8-0d2e5bda956a X-Amzn-Trace-Id: root=1-64485487-0126d8ea0a9690263d1bb94b;sampled=0;lineage=d073293c:0 Internal Server Error -
Look at the logs via
aws logs tail --since 10s --follow /aws/lambda/trentm-play-fn3
Gist of the annotated log output: https://gist.github.com/trentm/66ebb5eb18c43eea0a8c013e193a41e6
Patch to apm-aws-lambda.git for debug logging
the diff
diff --git a/accumulator/batch.go b/accumulator/batch.go
index 61618a4..5db1869 100644
--- a/accumulator/batch.go
+++ b/accumulator/batch.go
@@ -25,6 +25,7 @@ import (
"time"
"github.com/tidwall/gjson"
+ "go.uber.org/zap"
)
var (
@@ -79,15 +80,17 @@ type Batch struct {
// invoke lifecycle then it is possible to receive the agent init request
// before extension invoke is registered.
currentlyExecutingRequestID string
+ logger *zap.SugaredLogger
}
// NewBatch creates a new BatchData which can accept a
// maximum number of entries as specified by the arguments.
-func NewBatch(maxSize int, maxAge time.Duration) *Batch {
+func NewBatch(maxSize int, maxAge time.Duration, logger *zap.SugaredLogger) *Batch {
return &Batch{
invocations: make(map[string]*Invocation),
maxSize: maxSize,
maxAge: maxAge,
+ logger: logger,
}
}
@@ -311,6 +314,7 @@ func (b *Batch) finalizeInvocation(reqID, status string, time time.Time) error {
if !ok {
return fmt.Errorf("invocation for requestID %s does not exist", reqID)
}
+ b.logger.Infof("XXX call inc.CreateProxyTxn: status=%s time=%s", status, time)
proxyTxn, err := inc.CreateProxyTxn(status, time)
if err != nil {
return err
diff --git a/app/app.go b/app/app.go
index 23ef134..d981c00 100644
--- a/app/app.go
+++ b/app/app.go
@@ -61,7 +61,6 @@ func New(ctx context.Context, opts ...ConfigOption) (*App, error) {
app := &App{
extensionName: c.extensionName,
- batch: accumulator.NewBatch(defaultMaxBatchSize, defaultMaxBatchAge),
}
var err error
@@ -70,6 +69,8 @@ func New(ctx context.Context, opts ...ConfigOption) (*App, error) {
return nil, err
}
+ app.batch = accumulator.NewBatch(defaultMaxBatchSize, defaultMaxBatchAge, app.logger)
+
apmServerAPIKey, apmServerSecretToken, err := loadAWSOptions(ctx, c.awsConfig, app.logger)
if err != nil {
return nil, err