Conversation

@danielmitterdorfer
Member

@danielmitterdorfer danielmitterdorfer commented Apr 10, 2018

With this commit we determine the maximum number of buffers that Netty
keeps while accumulating one HTTP request based on the maximum content
length. Previously, we kept the default value of 1024 which is too small
for bulk requests which leads to unnecessary copies of byte buffers
internally.

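For illustration, a minimal standalone sketch (not the actual Elasticsearch code) of how a limit derived this way can be applied to Netty's HttpObjectAggregator; the pipeline setup and the 100MB / 1500-byte values are assumptions:

```java
import io.netty.channel.ChannelInitializer;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;

public class AggregatorSizingSketch extends ChannelInitializer<SocketChannel> {
    private static final int MAX_CONTENT_LENGTH = 100 * 1024 * 1024; // assumed 100MB default
    private static final int ASSUMED_MTU = 1500;                     // assumed Ethernet MTU

    @Override
    protected void initChannel(SocketChannel ch) {
        // Roughly one buffer component per network packet of a maximally sized request.
        int maxComponents = MAX_CONTENT_LENGTH / ASSUMED_MTU; // 69905
        HttpObjectAggregator aggregator = new HttpObjectAggregator(MAX_CONTENT_LENGTH);
        // Caps the number of components Netty accumulates before consolidating them;
        // it does not pre-allocate any memory.
        aggregator.setMaxCumulationBufferComponents(maxComponents);
        ch.pipeline().addLast(new HttpServerCodec(), aggregator);
    }
}
```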
@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

@danielmitterdorfer
Member Author

I have now run several benchmarks in our benchmarking suite with different heap sizes. Below are the results for two extreme cases:

  • geopoint, which contains only very small documents. I ran the benchmarks with an increased bulk size of 50,000 docs per bulk (instead of 5,000).
  • pmc, where each document is one academic paper. pmc is rather I/O-intensive and has large bulk requests compared to geopoint.

I ran these benchmarks with different heap sizes to see for which configurations the max cumulation buffer changes have an effect. The results match the expectation that with these changes, Elasticsearch performs better under low-memory conditions while performing similarly to or slightly better than before when more heap memory is available. One data point where this is very apparent is that we could successfully finish pmc with a 768MB heap with these changes, whereas the benchmark candidate OOMEd after indexing 30% of the document corpus without them (indicated by indexing throughput set to zero in the graph below).

The graphs below show the achieved median indexing throughput for those tracks, along with error bars for the minimum and maximum throughput. In blue we see the current default of at most 1024 cumulation buffer components, whereas in orange we see the new default of 69905 cumulation buffer components, assuming the default HTTP max content length of 100MB (the value is derived from the max content length and the assumed MTU; see the code for details).
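For reference, a quick back-of-the-envelope check of the 69905 figure (assuming the 100MB default max content length and a 1500-byte MTU):

```java
public class ComponentEstimate {
    public static void main(String[] args) {
        long maxContentLengthBytes = 100L * 1024 * 1024; // 100MB default = 104,857,600 bytes
        long assumedMtuBytes = 1500;                     // assumed Ethernet MTU
        // Integer division: 104,857,600 / 1500 = 69905
        System.out.println(maxContentLengthBytes / assumedMtuBytes);
    }
}
```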

throughput_geopoint

throughput_pmc

I intend to merge these changes to master after a successful review, let them bake there for a while, and then backport them to 6.x.

Member

@jasontedor jasontedor left a comment

I left a question.

// Note that we are *not* pre-allocating any memory based on this setting but rather determine the CompositeByteBuf's capacity.
// The tradeoff is between fewer (but larger) buffers that are contained in the CompositeByteBuf and more (but smaller) buffers.
// With the default max content length of 100MB and an MTU of 1500 bytes we would allow 69905 entries.
long maxBufferComponentsEstimate = Math.round((double) (maxContentLength.getBytes() / MTU_ETHERNET.getBytes()));
Member

We can get the interface MTU from the NetworkInterface API; should we use that here?
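For reference, a rough sketch of such a lookup (a hypothetical helper, assuming the publish address is already known; it falls back to 1500 when the interface cannot be resolved):

```java
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;

public class MtuLookupSketch {
    // Resolve the interface that owns the given (publish) address and read its MTU.
    static int mtuFor(InetAddress publishAddress) {
        try {
            NetworkInterface iface = NetworkInterface.getByInetAddress(publishAddress);
            return iface != null ? iface.getMTU() : 1500;
        } catch (SocketException e) {
            return 1500;
        }
    }

    public static void main(String[] args) {
        // Example: loopback typically reports a much larger MTU than Ethernet's 1500 bytes.
        System.out.println(mtuFor(InetAddress.getLoopbackAddress()));
    }
}
```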

Member Author

I had a look at this and I really like your idea. However, I am not sure it is feasible: I guess we need to resolve the network interface that is associated with the publish address (leveraging NetworkService).

While I also dislike having the MTU specified as a constant here (instead of retrieving it from the network interface), it serves as an input for an estimation. The "worst" practical MTU is 1500 bytes; another typical one is 65536 bytes for loopback (that's not always the case for loopback, but it is usually higher than for Ethernet). What happens if we assume an MTU of 1500 for our estimation in the loopback case? Our estimate would allow more buffer components than are expected for loopback. But in practice we do not expect to reach that theoretical capacity. A specific example for the loopback case (with a max content length of 100 MB):

  • Estimated maximum number of buffers: 100 MB / 1500 bytes = 69905
  • Actual maximum number of buffers: 100 MB / 65536 bytes = 1600

This means that we expect to reserve space for at most 1600 buffers in the HttpObjectAggregator but overestimate it to at most 69905 (which we do not reach in practice).

To summarize, I see two possibilities:

  • Determine the publish address with NetworkService, find the matching network interface and determine its MTU. This is correct but more complex.
  • Rename the constant MTU_ETHERNET to something like SMALLEST_EXPECTED_MTU (while RFC 791 states 68 bytes as the minimum, I'd stick to 1500 for practical purposes). It's not ideal but in very special cases, users can explicitly define this expert setting (http.netty.max_composite_buffer_components) directly.

Member Author

@jasontedor wdyt about the above?

Member

I think that jumbo frames are common enough that we should try to go the extra mile here. If it's not possible to do cleanly, I would at least like to see a system property that can be used to set the MTU to 9000.

Member Author

Thanks for the suggestion. I have now introduced a system property with a default of 1500 bytes.

//
// Note that we are *not* pre-allocating any memory based on this setting but rather determine the CompositeByteBuf's capacity.
// The tradeoff is between fewer (but larger) buffers that are contained in the CompositeByteBuf and more (but smaller) buffers.
// With the default max content length of 100MB and an MTU of 1500 bytes we would allow 69905 entries.
Member

It is a nit, but would you please use the multi-line comment style?

/*
 *
 */

Member Author

Sure. Addressed in b5c14fd.

@danielmitterdorfer
Member Author

@elasticmachine retest this please

@Tim-Brooks
Contributor

At some point it may be worth benchmarking this with TLS/SSL enabled, as some of the assumptions (1500-byte buffer sizes making it to the HttpObjectAggregator) may not hold. A full TLS/SSL packet (which could be 16 KB) will need to be decrypted prior to HTTP decoding. The risk of course is that larger component lists consume more heap. But I think it will not be a problem as Netty seems to start with lists of 16 components in its CompositeByteBufs and resizes them as necessary. Still, it might be worth checking. Or we could just monitor the security benchmarks after the merge.

@danielmitterdorfer
Member Author

Good point @tbrooks8. I ran the pmc benchmarks with x-pack now and the results are similar to Elasticsearch without x-pack (i.e. we still see a benefit from this change):

throughput_pmc x-pack

The risk of course is that larger component lists consume more heap.

The number that is adjusted here defines the point at which the original (small) buffers get copied into one larger buffer. We do not preallocate any memory based on that number. I agree that we might have more elements in the CompositeByteBuf, but based on the benchmark results I'd conclude that more entries in the internal array list are less problematic than the additional buffer copies.
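To illustrate that outside of Elasticsearch, a small standalone sketch with Netty's CompositeByteBuf (the cap of 4 and the tiny buffers are arbitrary demo values): the maxNumComponents argument only caps how many components accumulate before Netty consolidates them into a single buffer; nothing is reserved up front.

```java
import io.netty.buffer.ByteBuf;
import io.netty.buffer.CompositeByteBuf;
import io.netty.buffer.Unpooled;

public class CompositeBufferSketch {
    public static void main(String[] args) {
        // Allow at most 4 components before Netty consolidates (copies) them into one buffer.
        CompositeByteBuf composite = Unpooled.compositeBuffer(4);
        for (int i = 0; i < 6; i++) {
            ByteBuf chunk = Unpooled.wrappedBuffer(new byte[] { (byte) i });
            composite.addComponent(true, chunk);
            // The component count stays bounded because exceeding the cap triggers a consolidating
            // copy (the exact counts printed depend on the Netty version).
            System.out.println("components after add " + i + ": " + composite.numComponents());
        }
        composite.release();
    }
}
```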

@danielmitterdorfer
Member Author

@elasticmachine retest this please

Member

@jasontedor jasontedor left a comment


I left one comment but this LGTM now.

*
* By default we assume the Ethernet MTU (1500 bytes) but users can override it with a system property.
*/
private static final ByteSizeValue MTU = new ByteSizeValue(Long.valueOf(System.getProperty("es.net.mtu", "1500")));
Member

Nit: it does not really matter here, but this boxes unnecessarily and should instead be Long.parseLong.
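For readers wondering about the difference: Long.parseLong returns a primitive long, whereas Long.valueOf returns a boxed java.lang.Long that is then auto-unboxed when passed to the ByteSizeValue constructor. A trivial illustration:

```java
public class ParseVsValueOf {
    public static void main(String[] args) {
        long viaParse = Long.parseLong("1500");   // primitive long, no boxing
        long viaValueOf = Long.valueOf("1500");   // boxed Long allocated, then auto-unboxed
        System.out.println(viaParse == viaValueOf); // true; only the intermediate allocation differs
    }
}
```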

Member Author

Thanks, I fixed it in 635c9fb.

Member

🚢

@danielmitterdorfer
Member Author

Thanks for the review @jasontedor and for your comments @tbrooks8! I'll let this bake on master for a while and will then backport it to 6.x.

@danielmitterdorfer
Member Author

@elasticmachine retest this please

@danielmitterdorfer danielmitterdorfer merged commit 09cf530 into elastic:master May 11, 2018
@danielmitterdorfer danielmitterdorfer deleted the max-composite-buffer-sizing branch May 11, 2018 08:01
@jasontedor
Member

@danielmitterdorfer I do not think this change should be backported to the 6.3 branch.

@jasontedor jasontedor removed the v6.3.1 label May 11, 2018
danielmitterdorfer added a commit that referenced this pull request May 15, 2018
With this commit we determine the maximum number of buffers that Netty keeps while
accumulating one HTTP request based on the maximum content length and the assumed MTU
(default 1500 bytes, overridable with the system property `es.net.mtu`). Previously, we
kept the default value of 1024, which is too small for bulk requests and leads to
unnecessary copies of byte buffers internally.

Relates #29448
@danielmitterdorfer
Member Author

Agreed w.r.t. releases. Backported to 6.x in 71d3297.
