@jlvoiseux jlvoiseux commented Feb 18, 2022

Motivation / Summary

Following issue #108, the goal of this pull request is to send the following metrics to Elasticsearch through the AWS Lambda APM extension:

  • Init Duration
  • Total Duration
  • Billed Duration
  • Memory Used
  • Memory Sized

This information is typically provided by the Lambda Platform Logs API as follows:

REPORT RequestId: cc18960e-80e4-4160-bfdc-a5dae659993b	Duration: 162.81 ms	Billed Duration: 163 ms	Memory Size: 128 MB	Max Memory Used: 79 MB	Init Duration: 646.33 ms
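
As a rough illustration, a REPORT line like the one above could be split into its individual values as follows. This is a minimal sketch, not the parser used in the extension: `ReportMetrics` and `ParseReportLine` are hypothetical names, and the code assumes the fields are tab-separated `Key: value unit` pairs, as in the sample.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ReportMetrics holds the values carried by one REPORT line.
// The type and its field names are illustrative only.
type ReportMetrics struct {
	RequestID        string
	DurationMs       float64
	BilledDurationMs float64
	MemorySizeMB     float64
	MaxMemoryUsedMB  float64
	InitDurationMs   float64
	HasInitDuration  bool // true only when the line carries an Init Duration
}

// ParseReportLine splits a REPORT line on tabs and reads each "Key: value unit" pair.
func ParseReportLine(line string) (ReportMetrics, error) {
	var m ReportMetrics
	if !strings.HasPrefix(line, "REPORT ") {
		return m, fmt.Errorf("not a REPORT line: %q", line)
	}
	for _, part := range strings.Split(strings.TrimPrefix(line, "REPORT "), "\t") {
		kv := strings.SplitN(strings.TrimSpace(part), ": ", 2)
		if len(kv) != 2 {
			continue // skip empty or malformed segments
		}
		key, value := kv[0], kv[1]
		if key == "RequestId" {
			m.RequestID = value
			continue
		}
		fields := strings.Fields(value) // e.g. ["162.81", "ms"]
		if len(fields) == 0 {
			continue
		}
		n, err := strconv.ParseFloat(fields[0], 64)
		if err != nil {
			return m, fmt.Errorf("parsing %s: %w", key, err)
		}
		switch key {
		case "Duration":
			m.DurationMs = n
		case "Billed Duration":
			m.BilledDurationMs = n
		case "Memory Size":
			m.MemorySizeMB = n
		case "Max Memory Used":
			m.MaxMemoryUsedMB = n
		case "Init Duration":
			m.InitDurationMs = n
			m.HasInitDuration = true
		}
	}
	return m, nil
}

func main() {
	line := "REPORT RequestId: cc18960e-80e4-4160-bfdc-a5dae659993b\tDuration: 162.81 ms\t" +
		"Billed Duration: 163 ms\tMemory Size: 128 MB\tMax Memory Used: 79 MB\tInit Duration: 646.33 ms"
	m, err := ParseReportLine(line)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", m)
}
```

Splitting on tabs rather than matching `Duration:` directly avoids confusing `Duration` with `Billed Duration` and `Init Duration`.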

Current demo dashboard: [screenshot]

Memory usage is also displayed in the APM UI: [screenshot]

Files / Added Elements

  • A new file, process_metrics.go, has been added to the extension (as well as the corresponding test file). This file contains the structs required to implement the Metrics model. Most of the code comes from the Go Agent codebase, with some simplifications (most notably at the Metrics aggregation level).
  • In the logs_api folder, the client and the http_listener have been modified to take events of type Report into account.
  • The main function has been modified to call the process_metrics functions upon detection of a Report event.

Limitations / Elements to discuss

  • The Platform Logs report only becomes available after the function invocation has finished. As such, metrics are only sent if another invocation is triggered soon enough that the function is still in its warm state. By definition, any workaround would extend the Lambda runtime and thus incur additional billing for users.
  • Metadata retrieval is currently very limited, and done minimally through a mix of environment variables and hard-coded values fitting the test example. There are several ways to populate the metadata, for example creating a dedicated endpoint on the extension for the agents to send the information, or decompressing and analysing the first received transaction.
  • The newly created fields have to be added manually to the APM index pattern to reproduce the results.

Checklist

  • Provide a first proof of concept
  • Devise and implement a strategy for metadata retrieval
  • Confirm the names of the newly added fields
  • Refine proof of concept and dashboard
  • Settle on a final list of metrics to collect and expand scope
  • Expose Lambda specific metrics via UI

How to test

  1. Make sure that your Go Runtime is at least 1.17
  2. Clone/fork the current repository and switch to the following branch
  3. Build a new version of the extension and save it as a layer
  4. Instrument the Lambda function of your choice
  5. On your Elastic instance, add the following fields to the APM index pattern with the following configuration:
     • faas.metrics.duration.init.ms, Type double, Format Duration (Milliseconds)
     • faas.metrics.duration.measured.ms, Type double, Format Duration (Milliseconds)
     • faas.metrics.duration.billed.ms, Type long, Format Duration (Milliseconds)
     • faas.metrics.memory.maxUsed.bytes, Type long, Format Bytes (b - Binary)
     • faas.metrics.memory.total.bytes, Type long, Format Bytes (b - Binary)
  6. Run the Lambda function several times to trigger the fetching of several reports.
  7. Observe the data in Elastic
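
To make the field types and units above concrete, the sketch below maps the values of one report onto the listed field names. It is illustrative only: `ReportToFields` is a hypothetical helper, and the conversion of 1 MB to 1024 × 1024 bytes is an assumption made for this example, not something stated in this pull request.

```go
package main

import "fmt"

// bytesPerMB is an assumption for this sketch: the report's "MB" is treated as 1024*1024 bytes.
const bytesPerMB = 1024 * 1024

// ReportToFields is a hypothetical helper that keys report values by the
// field names listed above, using double for fractional durations and
// long (int64) for the billed duration and the memory sizes.
func ReportToFields(initMs, durationMs float64, billedMs, maxUsedMB, totalMB int64) map[string]interface{} {
	return map[string]interface{}{
		"faas.metrics.duration.init.ms":     initMs,
		"faas.metrics.duration.measured.ms": durationMs,
		"faas.metrics.duration.billed.ms":   billedMs,
		"faas.metrics.memory.maxUsed.bytes": maxUsedMB * bytesPerMB,
		"faas.metrics.memory.total.bytes":   totalMB * bytesPerMB,
	}
}

func main() {
	// Values taken from the sample REPORT line in the description.
	fields := ReportToFields(646.33, 162.81, 163, 79, 128)
	for name, value := range fields {
		fmt.Printf("%s = %v\n", name, value)
	}
}
```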

@jlvoiseux jlvoiseux added enhancement New feature or request aws-λ-extension AWS Lambda Extension labels Feb 18, 2022
@jlvoiseux jlvoiseux linked an issue Feb 18, 2022 that may be closed by this pull request

ghost commented Feb 18, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-05-31T15:34:39.864+0000

  • Duration: 8 min 32 sec

Test stats 🧪

Test Results
Failed 0
Passed 216
Skipped 4
Total 220

@jlvoiseux
Contributor Author

Following our discussions, I have extended and modified the initial implementation:

Metrics naming

The metrics are now named as follows (for consistency with the Metricbeat Lambda integration):

  • aws.lambda.metrics.TotalMemory
  • aws.lambda.metrics.UsedMemory
  • aws.lambda.metrics.Duration
  • aws.lambda.metrics.BilledDuration
  • aws.lambda.metrics.ColdStartDuration
  • aws.lambda.metrics.Timeout
    The last metric, Timeout, was introduced in the most recent commit. Its computation is based on the field deadlineMs (a Unix epoch in milliseconds corresponding to the absolute time at which the Lambda function is supposed to time out). The computation is an approximation based on the assumption that the timeout is always a whole number of seconds.
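
One way to picture this approximation is sketched below. It is not the code from the commit: `ApproximateTimeout` and `receivedAtMs` are hypothetical names, and rounding the remaining time up to the next whole second is an assumption, based on the idea that the event is observed shortly after the invocation starts.

```go
package main

import (
	"fmt"
	"time"
)

// ApproximateTimeout estimates the configured timeout from deadlineMs.
// receivedAtMs is the Unix time in milliseconds at which the invocation
// event was observed (a hypothetical input for this sketch).
func ApproximateTimeout(deadlineMs, receivedAtMs int64) time.Duration {
	remainingMs := deadlineMs - receivedAtMs
	if remainingMs <= 0 {
		return 0 // deadline already passed, nothing meaningful to report
	}
	// Round up to a whole number of seconds (assumption: integer timeouts).
	seconds := (remainingMs + 999) / 1000
	return time.Duration(seconds) * time.Second
}

func main() {
	// The event is seen 12 ms after the start of an invocation with a 3 s timeout.
	deadlineMs := int64(1645180003000)
	receivedAtMs := int64(1645180000012)
	fmt.Println(ApproximateTimeout(deadlineMs, receivedAtMs))
}
```

The estimate is wrong if the event is observed more than one second after the invocation starts, which is one reason the value is only an approximation.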

Metadata

Metadata is obtained by decompressing the very first transaction received (during cold start). Aside from the standard metadata obtained from this initial transaction, the following fields have been modified:

  • agent.name : When metadata is appended to an extension-generated metricset, the Agent name is set to aws-lambda-extension
  • agent.version : Similarly to the method used by the Go APM agent, the version number is now stored in a const (extension/version.go)

While changing these fields to reflect the features of the extension is good for consistency, it confuses the APM UI, as the language used is not recognized (the language badge is thus an empty hexagon).
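
The metadata handling described above could look roughly like the following sketch. Several points are assumptions rather than facts from this pull request: the agent payload is taken to be gzip-compressed newline-delimited JSON whose first line is a `{"metadata": ...}` object, the agent fields are taken to live under `service.agent` in that object, and `ExtractMetadata`, `OverrideAgent`, `gzipBytes` and the `extensionVersion` value are hypothetical.

```go
package main

import (
	"bufio"
	"bytes"
	"compress/gzip"
	"encoding/json"
	"fmt"
)

// extensionVersion stands in for the const kept in extension/version.go (placeholder value).
const extensionVersion = "0.0.0"

// ExtractMetadata decompresses the payload and decodes the metadata
// object from its first line.
func ExtractMetadata(payload []byte) (map[string]interface{}, error) {
	zr, err := gzip.NewReader(bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	defer zr.Close()
	line, err := bufio.NewReader(zr).ReadBytes('\n')
	if err != nil && len(line) == 0 {
		return nil, err
	}
	var event map[string]map[string]interface{}
	if err := json.Unmarshal(line, &event); err != nil {
		return nil, err
	}
	meta, ok := event["metadata"]
	if !ok {
		return nil, fmt.Errorf("first event is not a metadata event")
	}
	return meta, nil
}

// OverrideAgent replaces the agent name and version with the extension's own.
func OverrideAgent(meta map[string]interface{}) {
	service, ok := meta["service"].(map[string]interface{})
	if !ok {
		service = map[string]interface{}{}
		meta["service"] = service
	}
	service["agent"] = map[string]interface{}{
		"name":    "aws-lambda-extension",
		"version": extensionVersion,
	}
}

// gzipBytes compresses data in memory; writes to a bytes.Buffer do not fail.
func gzipBytes(data []byte) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write(data)
	zw.Close()
	return buf.Bytes()
}

func main() {
	payload := gzipBytes([]byte(
		`{"metadata":{"service":{"name":"demo","agent":{"name":"go","version":"1.0"}}}}` + "\n" +
			`{"transaction":{}}` + "\n"))
	meta, err := ExtractMetadata(payload)
	if err != nil {
		panic(err)
	}
	OverrideAgent(meta)
	out, _ := json.Marshal(meta)
	fmt.Println(string(out))
}
```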

One of the objectives of this work was also to introduce the ECS faas fields in the Metrics metadata (faas.coldstart, faas.execution, faas.id). However, the faas object seems to be exclusive to transaction events (when added to a metricset event, it is ignored by the APM server). For this specific use case, we might want to move the faas object to the specification of metadata events. As a workaround, I chose to use labels. The aforementioned information is thus available in the following fields:

  • labels.faas_coldstart
  • labels.faas_execution
  • labels.faas_id

The steps for testing remain the same as in the initial post, except that step 5 can be skipped: adding the fields manually to the index pattern is, in fact, not required.

@jlvoiseux
Contributor Author

Superseded by #202.

@jlvoiseux jlvoiseux closed this May 31, 2022

Development

Successfully merging this pull request may close these issues.

Support for Metrics