-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Collapse pipeline aggs into single package #34658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Collapse pipeline aggs into single package #34658
Conversation
- Restrict visibility of Aggregators and Factories - Move PipelineAggregatorBuilders up a level so it is consistent with AggregatorBuilders - Checkstyle line length fixes for a few classes - Minor odds/ends (swapping to method references, formatting, etc)
And a missed long line
|
Pinging @elastic/es-search-aggs |
|
Jenkins, run gradle build tests |
colings86
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM - I left a small comment more out of curiosity than anything else
| BucketSelectorPipelineAggregator(String name, Map<String, String> bucketsPathsMap, Script script, GapPolicy gapPolicy, | ||
| Map<String, Object> metadata) { | ||
| super(name, bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]), metadata); | ||
| super(name, bucketsPathsMap.values().toArray(new String[0]), metadata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this change is needed? My understanding was that if you initialise the array to the size of the map it avoids having to perform resizes which reinitialise the array inside the toArray() method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, so this was a rollercoaster. I thought the same too :)
I noticed a new IntelliJ hint which suggested using zero-sized arrays:
Inspection info: There are two styles to convert a collection to an array: either using a pre-sized array (like c.toArray(new String[c.size()])) or using an empty array (like c.toArray(new String[0]).
In older Java versions using pre-sized array was recommended, as the reflection call which is necessary to create an array of proper size was quite slow. However since late updates of OpenJDK 6 this call was intrinsified, making the performance of the empty array version the same and sometimes even better, compared to the pre-sized version. Also passing pre-sized array is dangerous for a concurrent or synchronized collection as a data race is possible between the size and toArray call which may result in extra nulls at the end of the array, if the collection was concurrently shrunk during the operation.
Which led down the rabbit hole and I found this extremely thorough benchmark blog: https://shipilev.net/blog/2016/arrays-wisdom-ancients/
Tl;dr: newer JVMs can optimize the zero-sized array version because it knows the size (empty) and can zero/initialize at the same time. In contrast, the pre-sized version has to be initialized first, then zero'd out in a second step which the JVM can't optimize (yet).
I'm sure it's moot, especially in a ctor like this, but figured since I was tidying up code might as well go by the current recommendations :)
| BucketScriptPipelineAggregator(String name, Map<String, String> bucketsPathsMap, Script script, DocValueFormat formatter, | ||
| GapPolicy gapPolicy, Map<String, Object> metadata) { | ||
| super(name, bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]), metadata); | ||
| super(name, bucketsPathsMap.values().toArray(new String[0]), metadata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this change is needed? My understanding was that if you initialise the array to the size of the map it avoids having to perform resizes which reinitialise the array inside the toArray() method.
| BucketScriptPipelineAggregator(String name, Map<String, String> bucketsPathsMap, Script script, DocValueFormat formatter, | ||
| GapPolicy gapPolicy, Map<String, Object> metadata) { | ||
| super(name, bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]), metadata); | ||
| super(name, bucketsPathsMap.values().toArray(new String[0]), metadata); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious why this change is needed? My understanding was that if you initialise the array to the size of the map it avoids having to perform resizes which reinitialise the array inside the toArray() method.
colings86
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
jimczi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
jimczi
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
- Restrict visibility of Aggregators and Factories - Move PipelineAggregatorBuilders up a level so it is consistent with AggregatorBuilders - Checkstyle line length fixes for a few classes - Minor odds/ends (swapping to method references, formatting, etc)
Collapses all the
org.elasticsearch.search.aggregations.pipeline.*packages down to a single level.Relates #22868