Add parsing to some single value aggregations #24085
tlrx
left a comment
I had a first look, and I'm happy to see that many things can be shared. I don't really like how the parsed Cardinality/ValueCount extend ParsedNumericMetricsAggregation.SingleValue but don't use the double there and provide their own long value instead. I'm wondering if we could let SingleValue take care of the stringified value and have a Long and a Double version of SingleValue where all parsing logic and XContent rendering would be placed.
I have to admit that I'm not sure if that's possible at all, so I'll take a look at this too
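To make the idea above concrete, here is a minimal sketch of what a shared base with a Long and a Double variant could look like. All class and method names below are hypothetical and not taken from the PR; the sketch only assumes that the base owns the optional stringified value and that each variant supplies its own fallback string.

```java
// Hypothetical base: owns the optional stringified value; subclasses own the typed number.
abstract class SingleValueSketch {
    // May be null when the response carried no formatted string.
    private final String valueAsString;

    SingleValueSketch(String valueAsString) {
        this.valueAsString = valueAsString;
    }

    /** The numeric value widened to double. */
    abstract double value();

    /** Fallback rendering used when no formatted string was parsed. */
    abstract String defaultString();

    final String getValueAsString() {
        return valueAsString != null ? valueAsString : defaultString();
    }
}

// Hypothetical variant for aggregations that store a double (Min/Max/Sum/Avg style).
final class DoubleSingleValueSketch extends SingleValueSketch {
    private final double value;

    DoubleSingleValueSketch(double value, String valueAsString) {
        super(valueAsString);
        this.value = value;
    }

    @Override
    double value() {
        return value;
    }

    @Override
    String defaultString() {
        return Double.toString(value);
    }
}

// Hypothetical variant for aggregations that store a long (Cardinality/ValueCount style).
final class LongSingleValueSketch extends SingleValueSketch {
    private final long value;

    LongSingleValueSketch(long value, String valueAsString) {
        super(valueAsString);
        this.value = value;
    }

    long longValue() {
        return value;
    }

    @Override
    double value() {
        return value;
    }

    @Override
    String defaultString() {
        return Long.toString(value);
    }
}
```

Whether this split actually works with the real parsing and rendering code is exactly the open question raised in the comment.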
I tend to use the current class name, because this name is only used in log/error traces so that one can easily spot the class to look at. It seems that there's no convention in the codebase for that, so let's fix one at least for aggregations.
I'm +1 on ParsedAvg.class.getSimpleName() (but I'm OK to use AvgAggregationBuilder.NAME if both @javanna and you prefer)
I don't have a strong opinion here, I don't think it makes a whole lot of difference
I will go with the suggestion by @tlrx then, since I also don't mind much.
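For reference, the two naming options discussed above differ as follows. This is a minimal sketch using empty stand-in classes rather than the real Elasticsearch ones; the PARSER_NAME constant is a hypothetical name used only for illustration.

```java
// Stand-in for the parsed aggregation class; only the name matters here.
class ParsedAvg {
    // Option 1: derive the parser name from the class, so traces point at the class to open.
    static final String PARSER_NAME = ParsedAvg.class.getSimpleName();
}

// Stand-in for the builder; the real builder exposes the aggregation type name as NAME.
class AvgAggregationBuilder {
    // Option 2: reuse the aggregation type name.
    static final String NAME = "avg";
}
```

With option 1 an error trace would mention "ParsedAvg", while option 2 would mention "avg".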
nit: Maybe declareSingleValueFields?
okay
Maybe declareAggregationFields()? I think that "common" doesn't make much sense when there are 2-3 levels of inheritance.
okay
I think it can be shortened to
(parser, context) -> parseValue(parser, defaultNullValue)
sure
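As an illustration of the shortening suggested above: the original form of the lambda is not shown in this conversation, so the block-bodied "before" version below is an assumption. parseValue here is a simplified, hypothetical stand-in that takes a nullable String instead of a real parser, and BiFunction stands in for whatever functional interface the PR's parser declaration expects.

```java
import java.util.function.BiFunction;

class LambdaShorteningSketch {
    // Hypothetical stand-in for parseValue: a null token falls back to the default.
    static double parseValue(String token, double defaultNullValue) {
        return token == null ? defaultNullValue : Double.parseDouble(token);
    }

    // Assumed "before": a block-bodied lambda with an explicit return.
    static BiFunction<String, Void, Double> verbose(double defaultNullValue) {
        return (parser, context) -> {
            return parseValue(parser, defaultNullValue);
        };
    }

    // Suggested form: an expression lambda with identical behavior.
    static BiFunction<String, Void, Double> concise(double defaultNullValue) {
        return (parser, context) -> parseValue(parser, defaultNullValue);
    }
}
```

Both forms capture defaultNullValue the same way; the change is purely cosmetic.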
Can we move all statics down? (after doXContentBody())
done
javanna
left a comment
left a few comments but LGTM
do we need this class? and why does SingleValue have to be an inner class?
I don't know, the idea was to pull out value()/getValueAsString() and the internal fields for the Min/Max/Sum/Avg aggregation, maybe some others of the single value aggregations that just store a double. I tried to mimic the inheritance structure of InternalNumericMetricsAggregation here, which contains inner classes for SingleValue and MultiValue. Maybe we don't need it, but I'd like to keep it for now and revisit this decision once we are done with all the single value aggregations.
I think this hierarchy makes little sense and we should not copy it to our objects unless there are good reasons to do so. Is the only reason consistency?
Is sharing value/valueAsString() a good reason to keep ParsedNumericMetricsAggregation? Or rename it to ParsedSingleValueMetricsAggregation and remove the inner SingleValue class? My suspicion at this point is that we might also reuse some stuff for the MultiValue case but I cannot tell until we start working on those.
I do not see what ParsedNumericMetricsAggregation holds. Seems like it is used only as an ancestor for its inner class? In that case let's have ParsedSingleValueNumericMetricsAggregation as a top level class?
should we try and do without depending on DocValueFormat here? We can just do what this method does?
I replaced this with Double.toString(value).
I don't have a strong opinion here, I don't think it makes a whole lot of difference
where is this logic in InternalAvg?
In InternalAvg the value field is only rendered if the avg normalizer (count) is not 0 (count != 0 ? getValue() : null). I chose to parse null back as Double.POSITIVE_INFINITY because that is what you get back when you divide a positive double by 0L. So I check for that value here to get the same xContent output as InternalAvg.
sounds good! leave a comment?
I added a comment.
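The null round trip described in this thread can be sketched roughly as follows. This is a simplified illustration that renders the value as a plain string instead of going through the real xContent classes; the class and method names are hypothetical.

```java
class AvgNullRoundTripSketch {
    // Internal side: the value is only rendered when at least one document was counted.
    static String renderInternal(double sum, long count) {
        return count != 0 ? Double.toString(sum / count) : "null";
    }

    // Parsed side: a null value is read back as the POSITIVE_INFINITY sentinel.
    static double parse(String rendered) {
        return "null".equals(rendered) ? Double.POSITIVE_INFINITY : Double.parseDouble(rendered);
    }

    // Parsed side: the sentinel is turned back into null so the output matches the internal one.
    static String renderParsed(double value) {
        return value != Double.POSITIVE_INFINITY ? Double.toString(value) : "null";
    }
}
```

Rendering, parsing and rendering again should give the same string in both the empty case (count of 0) and the regular case.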
what's left to do here? the static method is already in the interface?
This was an example of how we could share the xContent rendering logic between InternalXY and ParsedXY aggregations. I don't know if it would work in all places. I only did it for SUM, I could try to do it for Min/Max/Avg etc but that means slight changes in the existing internal aggregations. Not sure which way to go here atm, maybe you have some opinion?
I think we should do it only when it's very simple to do. It is a bit funky to have this static method in the interface but that is the best place for it. @tlrx what do you think?
I tried doing this for the aggregations in this PR in 1048678, but I'm not sure I like it. It avoids some code duplication but makes reading and reasoning about the code much harder in my opinion. Maybe we should only do it when it really saves a lot of duplication (like really long xContent methods), wdyt?
yea I tend to agree, I am +0 on this :) I defer to Tanguy
here too, shall we try and not depend on the formatter?
I changed this to Long.toString(valueCount).
s/formated/formatted
yes
leftover
yes
complete POSITIVE with POSITIVE_INFINITY ?
done
0762331 to 9e6d11d
Yes, I agree. I think we should only do it when it saves a lot of duplication with custom logic. In this case it saves 1 line but adds many more, and it also forces people to look at what Sum.renderXContent() is doing. I think we should not add those.
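For context, the pattern being weighed here (a static rendering helper on the shared interface, called by both the internal and the parsed implementation) looks roughly like the sketch below. The types are simplified stand-ins with hypothetical names and produce a plain JSON string instead of using the real xContent builder.

```java
// Stand-in for the shared aggregation interface.
interface SumSketch {
    double getValue();

    // Shared rendering logic lives in a static method on the interface.
    static String renderXContent(SumSketch sum) {
        return "{\"value\":" + sum.getValue() + "}";
    }
}

// Stand-in for the server-side implementation.
final class InternalSumSketch implements SumSketch {
    private final double sum;

    InternalSumSketch(double sum) {
        this.sum = sum;
    }

    @Override
    public double getValue() {
        return sum;
    }

    String toXContent() {
        // The reader has to jump to the interface to see what is rendered.
        return SumSketch.renderXContent(this);
    }
}

// Stand-in for the client-side parsed implementation.
final class ParsedSumSketch implements SumSketch {
    private final double value;

    ParsedSumSketch(double value) {
        this.value = value;
    }

    @Override
    public double getValue() {
        return value;
    }

    String toXContent() {
        return SumSketch.renderXContent(this);
    }
}
```

The sketch shows the trade-off raised above: the duplicated line disappears, but each toXContent() now requires a detour through the interface to understand the output.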
1048678 to a6112a5
a6112a5 to 210e101
@tlrx thanks, I reverted the commit where I pulled the xContent rendering into the interfaces, rebased and reordered the PR so changes are more self contained. I think I will merge this one manually to the feature branch without squashing so that the history better reflects the individual changes to each aggregation.
Similar to #23973, this adds parsing from xContent for the Min, Max, Avg, Sum and ValueCount aggregations.
I tried to pull out common functionality to ParsedNumericMetricsAggregation.SingleValue. Since InternalCardinality
and InternalValueCount store a long value instead of a double, parsing and getters/setters for the value are slightly
different so I didn't include their parsed counterparts under ParsedNumericMetricsAggregation.SingleValue for now.
Also included one case where I tried pulling out common xContent rendering into the shared interface (in Sum) for discussion.
I'm unsure if I like that one but it would help reduce some code duplication.
This PR is WIP against a feature branch.