@polyfractal
Contributor

Collapses all the org.elasticsearch.search.aggregations.pipeline.* packages down to a single level.

  • Restrict visibility of Aggregators and Factories
  • Move PipelineAggregatorBuilders up a level so it is consistent with AggregatorBuilders
  • Checkstyle line length fixes for a few classes
  • Minor odds/ends (swapping to method references, formatting, private variables, deprecated logger, etc)

Relates #22868

@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@polyfractal
Contributor Author

Jenkins, run gradle build tests

Contributor

@colings86 colings86 left a comment

LGTM - I left a small comment more out of curiosity than anything else

  BucketSelectorPipelineAggregator(String name, Map<String, String> bucketsPathsMap, Script script, GapPolicy gapPolicy,
          Map<String, Object> metadata) {
-     super(name, bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]), metadata);
+     super(name, bucketsPathsMap.values().toArray(new String[0]), metadata);
Contributor

I'm curious why this change is needed? My understanding was that if you initialise the array to the size of the map it avoids having to perform resizes which reinitialise the array inside the toArray() method.

Contributor Author

Yeah, so this was a rollercoaster. I thought the same too :)

I noticed a new IntelliJ hint which suggested using zero-sized arrays:

Inspection info: There are two styles to convert a collection to an array: either using a pre-sized array (like c.toArray(new String[c.size()])) or using an empty array (like c.toArray(new String[0])).

In older Java versions, using a pre-sized array was recommended, as the reflection call which is necessary to create an array of proper size was quite slow. However, since late updates of OpenJDK 6 this call was intrinsified, making the performance of the empty array version the same and sometimes even better, compared to the pre-sized version. Also, passing a pre-sized array is dangerous for a concurrent or synchronized collection, as a data race is possible between the size and toArray calls, which may result in extra nulls at the end of the array if the collection was concurrently shrunk during the operation.

Which led down the rabbit hole and I found this extremely thorough benchmark blog: https://shipilev.net/blog/2016/arrays-wisdom-ancients/

TL;DR: newer JVMs can optimize the zero-sized version because toArray() then allocates the correctly sized array itself, and the JIT can see that the array is completely overwritten by the copy and skip zeroing it. In contrast, the pre-sized version is allocated and zeroed by the caller first, then filled in a second step, and the JVM can't optimize that zeroing away (yet).

I'm sure it's moot, especially in a ctor like this, but figured since I was tidying up code might as well go by the current recommendations :)
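
For anyone reading along, here is a minimal sketch of the two styles side by side. It is not code from this PR: the class `ToArrayStyles`, its helper methods, and the sample map contents are made up for illustration. The last helper is only a deterministic stand-in for the race mentioned in the inspection text, where the array was sized before the collection shrank.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not part of the Elasticsearch code base.
public class ToArrayStyles {

    // Pre-sized style: the caller allocates an array of the expected size up front.
    static String[] preSized(Map<String, String> bucketsPathsMap) {
        return bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]);
    }

    // Zero-sized style: toArray() allocates a correctly sized array itself.
    static String[] zeroSized(Map<String, String> bucketsPathsMap) {
        return bucketsPathsMap.values().toArray(new String[0]);
    }

    // Simulates a stale size: the array is longer than the collection,
    // so toArray() leaves a null after the last copied element.
    static String[] withStaleLength(List<String> shrunk, int staleLength) {
        return shrunk.toArray(new String[staleLength]);
    }

    public static void main(String[] args) {
        // Sample data, assumed for illustration only.
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("var1", "sales");
        paths.put("var2", "costs");

        System.out.println(Arrays.toString(preSized(paths)));   // [sales, costs]
        System.out.println(Arrays.toString(zeroSized(paths)));  // [sales, costs]

        List<String> shrunk = new ArrayList<>(Arrays.asList("sales", "costs"));
        System.out.println(Arrays.toString(withStaleLength(shrunk, 3))); // [sales, costs, null]
    }
}
```

Both styles return the same contents for a collection that nobody else is modifying, so the difference in a constructor like this one comes down to style and the (small) performance edge described in the blog post.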

  BucketScriptPipelineAggregator(String name, Map<String, String> bucketsPathsMap, Script script, DocValueFormat formatter,
          GapPolicy gapPolicy, Map<String, Object> metadata) {
-     super(name, bucketsPathsMap.values().toArray(new String[bucketsPathsMap.size()]), metadata);
+     super(name, bucketsPathsMap.values().toArray(new String[0]), metadata);
Contributor

I'm curious why this change is needed? My understanding was that if you initialise the array to the size of the map it avoids having to perform resizes which reinitialise the array inside the toArray() method.


Contributor

@colings86 colings86 left a comment

LGTM

Contributor

@jimczi jimczi left a comment

LGTM


@polyfractal polyfractal merged commit 299d044 into elastic:master Oct 23, 2018
kcm pushed a commit that referenced this pull request Oct 30, 2018