
Conversation

@original-brownbear
Contributor

@original-brownbear original-brownbear commented Mar 4, 2019

Found while profiling the impact of #39286:

* `findNextMarker` took almost 1ms per invocation during the PMC rally track
  * Fixed to be about an order of magnitude faster by using Netty's bulk `ByteBuf` search
* It is unnecessary to instantiate an object (the input stream wrapper) and throw it away, just to read the `int` length from the message bytes
  * Fixed by adding bulk `int` read to `BytesReference` (see the sketch after this list)
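
For illustration, a minimal sketch of the two optimizations above; the class and method names are assumptions for the example, not the actual Elasticsearch code touched by this PR:

```java
import io.netty.buffer.ByteBuf;

// Illustrative sketch only, not the code in this PR.
final class BulkMessageParsingSketch {

    /**
     * Bulk search for the next marker byte. Netty's ByteBuf#indexOf scans the
     * backing memory natively instead of going through a per-byte accessor on
     * the caller's side, which is where the order-of-magnitude speedup comes from.
     */
    static int findNextMarker(ByteBuf buffer, int from, byte marker) {
        // search from `from` (inclusive) to the end of the readable bytes (exclusive)
        return buffer.indexOf(from, buffer.writerIndex(), marker);
    }

    /**
     * Read a big-endian int directly from the message bytes instead of
     * instantiating a stream wrapper and throwing it away after one readInt().
     */
    static int readIntBigEndian(byte[] bytes, int offset) {
        return ((bytes[offset] & 0xFF) << 24)
            | ((bytes[offset + 1] & 0xFF) << 16)
            | ((bytes[offset + 2] & 0xFF) << 8)
            | (bytes[offset + 3] & 0xFF);
    }
}
```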
@elasticmachine
Collaborator

Pinging @elastic/es-distributed

Contributor

@s1monw s1monw left a comment

LGTM, left one comment

@jpountz
Contributor

jpountz commented Mar 5, 2019

Nit: this looks very similar to `String#indexOf` and `List#indexOf`, should we use the same name here? It also looks like we might be able to simplify the signature by dropping `to` and assuming it is always the length of the `BytesReference`?

@original-brownbear
Contributor Author

@jpountz you're right, it is always the length for the `to` value here => removed the param + renamed the method to `indexOf` => much nicer, I agree :)
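
For reference, a sketch of what the simplified method might look like after the rename; the signature and the naive linear fallback are assumptions, and concrete implementations would override it with the bulk access described above:

```java
// Sketch only: assumed shape of the renamed method on a BytesReference-like class.
public abstract class AbstractBytesReferenceSketch {

    public abstract int length();

    public abstract byte get(int index);

    /**
     * Offset of the first occurrence of {@code marker} at or after {@code from},
     * or -1 if it is not found, mirroring String#indexOf and List#indexOf.
     * The search always runs to the end of the reference, so no `to` parameter
     * is needed.
     */
    public int indexOf(byte marker, int from) {
        final int length = length();
        for (int i = from; i < length; i++) {
            if (get(i) == marker) {
                return i;
            }
        }
        return -1;
    }
}
```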

Contributor

@jpountz jpountz left a comment

Let's add some tests to AbstractBytesReferenceTestCase? Other than that LGTM.
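
A test along these lines would cover the new method; this is only a hypothetical sketch, and the helpers (`newBytesReference`, `randomIntBetween`) follow the usual Elasticsearch test-case conventions rather than the exact test that was added:

```java
// Hypothetical test shape for a BytesReference test case; compares indexOf
// against a brute-force scan of the same reference.
public void testIndexOf() throws IOException {
    final int size = randomIntBetween(0, 100);
    final BytesReference bytes = newBytesReference(size);
    final byte marker = (byte) '\n';
    int expected = -1;
    for (int i = 0; i < size; i++) {
        if (bytes.get(i) == marker) {
            expected = i;
            break;
        }
    }
    assertEquals(expected, bytes.indexOf(marker, 0));
}
```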

@original-brownbear
Contributor Author

@jpountz thanks! Adjusted the Javadoc and added tests as requested, can you take another look? :)

@original-brownbear
Contributor Author

Small issue with the new test, my bad, fixing ...

@original-brownbear
Contributor Author

@jpountz alright, sorry for the noise above. Now it's green and should be good to review :)

@original-brownbear
Contributor Author

@jpountz thanks!

@original-brownbear original-brownbear merged commit 0619e6e into elastic:master Mar 5, 2019
@original-brownbear original-brownbear deleted the optimize-marker-search branch March 5, 2019 16:45
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Mar 6, 2019
* Optimize Bulk Message Parsing and Message Length Parsing

* findNextMarker took almost 1ms per invocation during the PMC rally track
  * Fixed to be about an order of magnitude faster by using Netty's bulk `ByteBuf` search
* It is unnecessary to instantiate an object (the input stream wrapper) and throw it away, just to read the `int` length from the message bytes
  * Fixed by adding bulk `int` read to BytesReference
original-brownbear added a commit that referenced this pull request Mar 6, 2019
…9730)

* Optimize Bulk Message Parsing and Message Length Parsing

* findNextMarker took almost 1ms per invocation during the PMC rally track
  * Fixed to be about an order of magnitude faster by using Netty's bulk `ByteBuf` search
* It is unnecessary to instantiate an object (the input stream wrapper) and throw it away, just to read the `int` length from the message bytes
  * Fixed by adding bulk `int` read to BytesReference
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Jul 18, 2019
* Copying the request is not necessary here. We can simply release it once the response has been generated and avoid a lot of `Unpooled` allocations that way (see the sketch after this list)
* Relates elastic#32228
   * I think the issue that prevented that PR from being merged was solved by elastic#39634, which moved the bulk index marker search to `ByteBuf` bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue whatever bounds checks we add, we save by not copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
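
A minimal sketch of the release-instead-of-copy pattern described in the first bullet; the handler and method names are assumptions, not the actual Elasticsearch transport code:

```java
import io.netty.buffer.ByteBuf;
import io.netty.util.ReferenceCountUtil;

// Illustrative only: work on the pooled (possibly composite) inbound buffer
// directly instead of copying it into an Unpooled heap buffer, and release it
// once the response has been generated.
abstract class ReleaseAfterResponseHandlerSketch {

    void handle(ByteBuf pooledRequestBuffer) {
        try {
            // parse the request and generate the response straight from the
            // pooled buffer, with no up-front copy
            dispatch(pooledRequestBuffer);
        } finally {
            // drop the reference so the pooled memory can be reclaimed
            ReferenceCountUtil.release(pooledRequestBuffer);
        }
    }

    // hypothetical dispatch step that produces and sends the response
    abstract void dispatch(ByteBuf requestBuffer);
}
```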
original-brownbear added a commit that referenced this pull request Dec 3, 2019
* Copying the request is not necessary here. We can simply release it once the response has been generated and avoid a lot of `Unpooled` allocations that way
* Relates #32228
   * I think the issue that prevented that PR from being merged was solved by #39634, which moved the bulk index marker search to `ByteBuf` bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue whatever bounds checks we add, we save by not copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
original-brownbear added a commit to original-brownbear/elasticsearch that referenced this pull request Dec 3, 2019
* Copying the request is not necessary here. We can simply release it once the response has been generated and avoid a lot of `Unpooled` allocations that way
* Relates elastic#32228
   * I think the issue that prevented that PR from being merged was solved by elastic#39634, which moved the bulk index marker search to `ByteBuf` bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue whatever bounds checks we add, we save by not copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
original-brownbear added a commit that referenced this pull request Dec 4, 2019
* Copying the request is not necessary here. We can simply release it once the response has been generated and avoid a lot of `Unpooled` allocations that way
* Relates #32228
   * I think the issue that prevented that PR from being merged was solved by #39634, which moved the bulk index marker search to `ByteBuf` bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue whatever bounds checks we add, we save by not copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
* Copying the request is not necessary here. We can simply release it once the response has been generated and avoid a lot of `Unpooled` allocations that way
* Relates elastic#32228
   * I think the issue that prevented that PR from being merged was solved by elastic#39634, which moved the bulk index marker search to `ByteBuf` bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue whatever bounds checks we add, we save by not copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)