-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Add parsing methods for Percentiles aggregations #24183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add parsing methods for Percentiles aggregations #24183
Conversation
javanna
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
left a couple of minors, LGTM otherwise
|
|
||
| Aggregation parsedAggregation; | ||
| try (XContentParser parser = xContentType.xContent().createParser(xContentRegistry, originalBytes)) { | ||
| try (XContentParser parser = xContentType.xContent().createParser(xContentRegistry(), originalBytes)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
createParser from the base test class?
| assertIterators(aggregation, parse(aggregation, randomFrom(XContentType.values()), randomBoolean(), false)); | ||
| } | ||
|
|
||
| private void assertIterators(Iterable<Percentile> aggregation, Iterable<Percentile> parsedAggregation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also merge it with the testPercentilesIterators() method if this is the only place where it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok
cbuescher
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tlrx great change and nice to see how far you could unify all those aggregation and tests. I left a few minor nits and a question, nothing big that should stop you from merging this into the feature branch
|
|
||
| private double[] percents; | ||
| private boolean keyed; | ||
| private DocValueFormat docValueFormat; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason these three fields need to be test instance members be randomized in the init() method? Otherwise I would prfer making them local in createTestIntstance().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When reducing percentiles aggregations it is important that all reduced aggregations have the same set of percentiles otherwise the reduce phase will just fail. Before adding the parsing method, the test used a fixed 0.5 percent but I think it's better to have one or more randomized percentiles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I find it more rationale that all aggregations have the same format and keyed too, because it shouldn't be different from one or the other during the aggregation reducing phase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that makes sense. I didn't think about the reducing tests.
| for (int i = 0; i < numValues; ++i) { | ||
| values[i] = randomDouble(); | ||
| } | ||
| return createTestInstance(name, pipelineAggregators, metaData, keyed, docValueFormat, percents, values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer percents/keyed/docValueFormat to be declared and instantiated here if there is no other reason I am missing atm.
| assertIterators(aggregation, parse(aggregation, randomFrom(XContentType.values()), randomBoolean(), false)); | ||
| } | ||
|
|
||
| private void assertIterators(Iterable<Percentile> aggregation, Iterable<Percentile> parsedAggregation) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also merge it with the testPercentilesIterators() method if this is the only place where it is used.
| return new InternalHDRPercentileRanks(name, cdfValues, state, keyed, format, aggregators, metadata); | ||
| boolean keyed, DocValueFormat format, double[] percents, double[] values) { | ||
| final DoubleHistogram state = new DoubleHistogram(3); | ||
| Arrays.stream(values).forEach(state::recordValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to understand why this step got necessary, can you briefly explain? I'm just curious what happens here since it didn't happen in the test before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That was a mistake, now it is fixed.
|
Thanks a lot @javanna @cbuescher ! |
Note: pull request against feature branch
This pull request adds the parsing methods for
InternalHDRPercentilesandInternalTDigestPercentilesaggregations. Parsing code is straightforward because it uses the ParsedPercentiles class created in #23974 for percentiles ranks aggregations.Most of the changes are in test classes where code has been mutualized between classes.