Skip to content

Conversation

lahsivjar
Copy link
Contributor

@lahsivjar lahsivjar commented Oct 26, 2022

Motivation

Create proxy transactions if agent is not able to report the transaction due to unexpected errors like OutOfMemory.

Related issues

Part of #315

Testing

(Currently, the PoC for agents is implemented only in golang so the lambda function must use go1.x runtime)

  1. Create a lambda function (in golang) that causes an timeout (or OOM) and run it a few times.
  2. Make sure to import apm-agent-go with go-agent PoC pull. Example function for reference.
  3. Observe in elastic to assert that a transaction for the function invocation is reported as event.outcome: failure.

The above steps can also be executed using the testing/benchmarking module by editing the test function (same as this example).

@github-actions github-actions bot added the aws-λ-extension AWS Lambda Extension label Oct 26, 2022

// WithBatch configures a batch to be used for batching data
// before sending to APM Server.
func WithBatch(batch *accumulator.Batch) ClientOption {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to pass some kind of callback in, maybe make it an interface (or interfaces)? That would remove the direct dependency on accumulator.

I wonder if we should just move ProcessLogs out of this package though, and into package app.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we need to pass some kind of callback in, maybe make it an interface (or interfaces)? That would remove the direct dependency on accumulator.

Thanks for the tip, I have added an interface to remove accumulator dependency on logsapi.

I wonder if we should just move ProcessLogs out of this package though, and into package app.

This sounds worth trying. I can give it a try with the PR to remove the buffer from logsapi or do you think we should address it in this PR itself?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doing it in a separate PR sounds good.

@ghost
Copy link

ghost commented Oct 28, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-11-01T12:41:06.320+0000

  • Duration: 9 min 57 sec

Test stats 🧪

Test Results
Failed 0
Passed 202
Skipped 2
Total 204

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

@lahsivjar lahsivjar changed the title Enrich txn metrics and create txn if not reported by agent Create proxy transasction if not reported by agent Oct 28, 2022
@lahsivjar lahsivjar marked this pull request as ready for review October 29, 2022 02:02
@lahsivjar lahsivjar requested a review from axw October 31, 2022 03:29
@lahsivjar lahsivjar requested a review from axw November 1, 2022 02:17
Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good to me now.

The approach you've taken of sending a partial transaction is quite similar to what @felixbarny suggested with the transaction-start event, and looks good. I still think it could be useful to send the whole payload to the extension and have it parse it, but that's a lot more work and can be deferred until/unless we need it. Maybe specifying a Content-Type would help us handle that change gracefully.

Copy link
Member

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one other minor comment.

@lahsivjar lahsivjar merged commit 8dbac56 into elastic:main Nov 2, 2022
@lahsivjar lahsivjar deleted the 315_txn branch November 2, 2022 02:25
@kruskall kruskall assigned kruskall and unassigned kruskall Dec 5, 2022
@carsonip carsonip self-assigned this Dec 7, 2022
@carsonip
Copy link
Member

carsonip commented Dec 8, 2022

Tested extension on main branch (commit 3032883), with ESS 8.6.0 and a PoC that takes a param to decide whether it times out. Not all timed out calls are recorded in ES. Will have to investigate.

@carsonip
Copy link
Member

carsonip commented Dec 12, 2022

Tested extension on main branch with ESS 8.6.0 and PoC with go agent on branch https://github.com/lahsivjar/apm-agent-go/tree/1323-lambdatxn

This PR is working well. Timed out requests will report event.outcome: failure, with the exception that if the first invocation fails in a new execution environment, the data will be dropped due to missing metadata. The discovery is shared with @lahsivjar @felixbarny @axw

@carsonip carsonip removed their assignment Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

aws-λ-extension AWS Lambda Extension

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants