Conversation

@tlrx (Member) commented Oct 25, 2019

This PR adjusts the total time that a Google Cloud Storage request can take in the GCS repository tests from 50 seconds to 120 seconds (see https://gradle-enterprise.elastic.co/s/zz5bstxyoqunk/tests/avmwsupzrdux6-dp22mjd7itzv6 for a test failure).

@original-brownbear I still think that, instead of increasing the timeout, we should configure an executor service in the HTTP handler, as GCS requests take more time to process than S3/Azure ones (most of them are multipart encoded).
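
As an illustration of that idea (not code from this PR), here is a minimal sketch of wiring an executor into a JDK `com.sun.net.httpserver.HttpServer`; the class name, pool size, and bind address are made up for the example:

```java
import com.sun.net.httpserver.HttpServer;

import java.net.InetSocketAddress;
import java.util.concurrent.Executors;

public class HttpHandlerExecutorSketch {
    public static void main(String[] args) throws Exception {
        // Bind to an ephemeral port on localhost, as the repository tests do.
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        // Without an executor, the JDK HttpServer runs handlers on a single default thread;
        // a small pool would let slower (e.g. multipart) requests be handled in parallel.
        server.setExecutor(Executors.newFixedThreadPool(4));
        server.start();
        // ... register handlers via server.createContext(...) and run requests against
        // server.getAddress() here ...
        server.stop(0);
    }
}
```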

@tlrx tlrx added >test Issues or PRs that are addressing/adding tests :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.5.0 v7.6.0 labels Oct 25, 2019
@elasticmachine (Collaborator)

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@tlrx (Member, Author) commented Oct 28, 2019

@elasticmachine run elasticsearch-ci/1
(unrelated test failure)

@original-brownbear (Contributor) left a comment

Hmm, it seems to me something other than slowness in the HttpServer must be going on here?
I just profiled this test a little and it doesn't seem like we're spending a lot of time on the http server's IO loop in the failing test here.

What we are doing, though, is allocating a ton of buffers on it during the error simulations; maybe that's where we're burning so much time when under memory pressure?

How about adding this optimization:

--- a/test/framework/src/main/java/org/elasticsearch/repositories/blobstore/ESMockAPIBasedRepositoryIntegTestCase.java
+++ b/test/framework/src/main/java/org/elasticsearch/repositories/blobstore/ESMockAPIBasedRepositoryIntegTestCase.java
@@ -26,7 +26,6 @@ import org.elasticsearch.action.admin.indices.forcemerge.ForceMergeResponse;
 import org.elasticsearch.cluster.metadata.IndexMetaData;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.SuppressForbidden;
-import org.elasticsearch.common.io.Streams;
 import org.elasticsearch.common.network.InetAddresses;
 import org.elasticsearch.common.settings.Settings;
 import org.elasticsearch.mocksocket.MockHttpServer;
@@ -37,6 +36,7 @@ import org.junit.Before;
 import org.junit.BeforeClass;
 
 import java.io.IOException;
+import java.io.InputStream;
 import java.net.InetAddress;
 import java.net.InetSocketAddress;
 import java.util.Map;
@@ -165,8 +165,11 @@ public abstract class ESMockAPIBasedRepositoryIntegTestCase extends ESBlobStoreR
             }
         }
 
+        private static final byte[] BUFFER = new byte[1024];
+
         protected void handleAsError(final HttpExchange exchange) throws IOException {
-            Streams.readFully(exchange.getRequestBody());
+            final InputStream inputStream = exchange.getRequestBody();
+            while (inputStream.read(BUFFER) >= 0);
             exchange.sendResponseHeaders(HttpStatus.SC_INTERNAL_SERVER_ERROR, -1);
             exchange.close();
         }

to drain the bytes more efficiently than by actually instantiating a buffer of 1MB or so on every run, and see if that helps?
From profiling here I don't see where we'd need another executor, and I also don't think we should up the timeout. If it's not GC pressure, something else must be off here, unless you were able to find some massive CPU usage on the http server dispatcher thread?
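
For illustration (this is not code from the PR, which patches ESMockAPIBasedRepositoryIntegTestCase directly), a minimal standalone sketch of the same idea: drain the request body with a small reusable buffer before answering with an error. The class name and the hard-coded 500 status are assumptions of the sketch; the actual change uses HttpStatus.SC_INTERNAL_SERVER_ERROR:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;

import java.io.IOException;
import java.io.InputStream;

public class ErrorSimulatingHandler implements HttpHandler {

    // Reusable scratch buffer: the bytes read into it are discarded, so its contents
    // never matter. Sharing it across requests is only safe if the server dispatches
    // requests to this handler one at a time (an assumption of this sketch).
    private static final byte[] DRAIN_BUFFER = new byte[1024];

    @Override
    public void handle(HttpExchange exchange) throws IOException {
        final InputStream body = exchange.getRequestBody();
        // Drain the whole request body so the client is not left blocked on an
        // unread upload before the error response goes out.
        while (body.read(DRAIN_BUFFER) >= 0) {
            // keep reading until EOF
        }
        // A response length of -1 means "no response body"; 500 simulates a server error.
        exchange.sendResponseHeaders(500, -1);
        exchange.close();
    }
}
```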

@tlrx (Member, Author) commented Oct 28, 2019

Thanks @original-brownbear. I'm not sure the allocations are the cause of the read timeouts, but your suggestion is a good one anyway, so I pushed 71fd095 👍

We'll see if it has any impact on the test failures. I also agree that increasing the timeout is not a good thing to do unless we have exhausted all other sources of trouble.

@original-brownbear (Contributor) left a comment

LGTM, thanks Tanguy :)

Let's see if this helps (at least it should be an improvement; it's even a speedup on my local box, where things are completely quiet otherwise) and otherwise try to find ways of adding debugging that would track down the source of the slowness, maybe?

@tlrx tlrx merged commit 47a1fc1 into elastic:master Oct 29, 2019
@tlrx (Member, Author) commented Oct 29, 2019

Thanks Armin!

@tlrx tlrx deleted the adjust-total-timeout-gcs-tests branch October 29, 2019 08:13
@tlrx tlrx changed the title from "Give more time to GCS requests to complete" to "Reduce allocations when draining HTTP requests bodies in repository tests" Oct 29, 2019
tlrx added a commit that referenced this pull request Oct 29, 2019
Reduce allocations when draining HTTP requests bodies in repository tests (#48541)

In repository integration tests, we drain the HTTP request body before
returning a response. Before this change this operation was done using
Streams.readFully(), which uses an 8kb buffer to read the input stream;
it now uses a 1kb buffer for the same operation. This should reduce the
allocations made during the tests and speed them up a bit on CI.

Co-authored-by: Armin Braun <[email protected]>