-
Notifications
You must be signed in to change notification settings - Fork 315
Use connection pool for http communications with extension. #6417
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| .writeTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .readTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .callTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .connectionPool(new ConnectionPool(5, 300, SECONDS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 connections, 300 second keep-alive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where do the magic numbers come from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thin air. I mostly just made them up. Very happy to take any suggestions for edits though.
I figured we'll only be creating two connections, one for start and one for end. I wanted the number of connections in this pool to be a bit larger than the number we're expecting to need. This is why I chose 5.
The keep alive value of 5 minutes also just felt like it was a good amount of time to keep a connection alive. I wanna say that lambda only keeps the container alive for like 6 minutes, though I could very well be wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could define a constant for them?
Sounds good, I'd verify with Darcy or AJ about the max time a lambda container will be alive, AFAIK, it can be longer than that (15 minutes(?) I'm not pretty sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this client is static, does it mean it's reused for multiple invocation? Can we test on how many connections it need by firing a bunch a requests (> 5) within a short period of time?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duncanista Good idea about adding the constants. And we can't merge this until next week anyway, so I'll be sure to ask @DarcyRaynerDD what he thinks before we merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joeyzhao2018 the test I performed yesterday (with the image and link to notebook above) was using a simple hello world java function executed 10x/sec. I looked to see if there was a way to log when a new connection is made, but while this functionality exists in later versions of okhttp, it doesn't in the one we're using. (We actually vendor the package, see https://github.com/DataDog/okhttp/blob/java7/okhttp/src/main/java/okhttp3/ConnectionPool.java)
So, I don't know how else to confirm this. But, I supposed that if we're seeing all the data appear as expected and that the average runtime duration is less than it was, then this seems like a good thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately, we're stuck on an old version of okhttp because the new version depends on kotlin.
But sounds like you've done your diligence.
Thanks
BenchmarksStartupParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 41 metrics, 13 unstable metrics. LoadParameters
See matching parameters
SummaryFound 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 15 unstable metrics. Request duration reports for insecure-bankgantt
title insecure-bank - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~72bee06be2, baseline=1.31.0-SNAPSHOT~c96efd6181
dateFormat X
axisFormat %s
section baseline
no_agent (369.81 µs) : 350, 390
. : milestone, 370,
iast (482.325 µs) : 461, 503
. : milestone, 482,
iast_FULL (534.882 µs) : 515, 555
. : milestone, 535,
iast_GLOBAL (497.112 µs) : 475, 519
. : milestone, 497,
iast_HARDCODED_SECRET_DISABLED (486.868 µs) : 466, 508
. : milestone, 487,
iast_INACTIVE (458.742 µs) : 437, 480
. : milestone, 459,
iast_TELEMETRY_OFF (473.846 µs) : 452, 495
. : milestone, 474,
tracing (445.381 µs) : 424, 466
. : milestone, 445,
section candidate
no_agent (373.615 µs) : 353, 394
. : milestone, 374,
iast (468.6 µs) : 448, 489
. : milestone, 469,
iast_FULL (539.831 µs) : 519, 560
. : milestone, 540,
iast_GLOBAL (498.995 µs) : 478, 520
. : milestone, 499,
iast_HARDCODED_SECRET_DISABLED (471.592 µs) : 451, 492
. : milestone, 472,
iast_INACTIVE (451.986 µs) : 431, 473
. : milestone, 452,
iast_TELEMETRY_OFF (470.474 µs) : 449, 492
. : milestone, 470,
tracing (452.673 µs) : 432, 474
. : milestone, 453,
Request duration reports for petclinicgantt
title petclinic - request duration [CI 0.99] : candidate=1.31.0-SNAPSHOT~72bee06be2, baseline=1.31.0-SNAPSHOT~c96efd6181
dateFormat X
axisFormat %s
section baseline
no_agent (1.349 ms) : 1330, 1367
. : milestone, 1349,
appsec (1.778 ms) : 1752, 1804
. : milestone, 1778,
iast (1.525 ms) : 1501, 1549
. : milestone, 1525,
profiling (1.528 ms) : 1501, 1555
. : milestone, 1528,
tracing (1.51 ms) : 1484, 1535
. : milestone, 1510,
section candidate
no_agent (1.354 ms) : 1335, 1373
. : milestone, 1354,
appsec (1.789 ms) : 1764, 1814
. : milestone, 1789,
iast (1.536 ms) : 1512, 1561
. : milestone, 1536,
profiling (1.579 ms) : 1553, 1605
. : milestone, 1579,
tracing (1.5 ms) : 1475, 1525
. : milestone, 1500,
|
35176fa to
0c3db2c
Compare
joeyzhao2018
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
![]()
| .writeTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .readTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .callTimeout(REQUEST_TIMEOUT_IN_S, SECONDS) | ||
| .connectionPool(new ConnectionPool(MAX_IDLE_CONNECTIONS, KEEP_ALIVE_DURATION, SECONDS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@DarcyRaynerDD do you have any opinions on the values we choose here?
0c3db2c to
3272716
Compare
3272716 to
72bee06
Compare
What Does This Do
Instead of creating a new tcp connection with each call to the extension (ie start and end invocation), use a pool of available connections.
Motivation
This change reduces
aws.lambda.enhanced.runtime_durationby approximately 2%.See https://ddserverless.datadoghq.com/notebook/2987494/reys-purple-notebook?range=3600000&view=view-mode&start=1703617009962&live=false
Additional Notes
Jira ticket: [PROJ-IDENT]