Skip to content

Conversation

@jimczi
Copy link
Contributor

@jimczi jimczi commented Apr 18, 2018

This commit refactors ScriptDocValues.Strings to directly creates String objects
instead of using an intermediate BytesRef's copy.
ScriptDocValues.Binary is also changed to create a single copy of BytesRef per consumed value.

Relates #29567

This commit refactors ScriptDocValues.Strings to directly creates String objects
instead of using an intermediate BytesRef's copy.
ScriptDocValues.Binary is also changed to create a single copy of BytesRef per consumed value.

Relates elastic#29567
@jimczi jimczi added >enhancement :Search/Search Search-related issues that do not fall into other categories v7.0.0 v6.3.0 labels Apr 18, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search-aggs

@jpountz
Copy link
Contributor

jpountz commented Apr 18, 2018

We might want to benchmark this change. It does indeed remove one memcpy, but it also makes the String object allocation impossible to skip with escape analysis. I don't know whether escape analysis did succeed to skip the object allocation before, but if it did then this change would introduce a number of object allocations that is linear with the number of matches in the index?

Should we also look into making the utf8 conversion lazy so that scripts that only get the string value based on some other condition only pay the price for the utf8 conversion when the string is actually used?

@jimczi
Copy link
Contributor Author

jimczi commented Apr 18, 2018

Should we also look into making the utf8 conversion lazy so that scripts that only get the string value based on some other condition only pay the price for the utf8 conversion when the string is actually used?

That sounds like the solution we have today and maybe a good reason to not proceed with this pr ?

@jpountz
Copy link
Contributor

jpountz commented Apr 19, 2018

This specific issue might be fixable differently, eg. by reading doc values lazily, only if at least one value is requested?

@jimczi
Copy link
Contributor Author

jimczi commented Apr 19, 2018

I pushed a commit to copy the values lazily, there's an IOException converted into an UncheckedIOException to make it work but as Adrien said if there is an exception here it's not good anyway. I'll see how I can benchmark this change now.

@colings86
Copy link
Contributor

@jimczi was there any result from benchmarking yet? Is this still something we would like to get merged?

@jimczi
Copy link
Contributor Author

jimczi commented Mar 29, 2019

I don't have time to work on this currently so I am closing this pr and will revisit later.

@jimczi jimczi closed this Mar 29, 2019
@jimczi jimczi deleted the script_doc_values_binary branch March 29, 2019 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

>enhancement :Search/Search Search-related issues that do not fall into other categories v6.4.1 v7.0.0-rc2

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants