Increase InternalHistogramTests coverage #36004
In this test we were randomizing different values, but minDocCount was hardcoded to 1. It's important to test other values, especially `0`, as it's the default. The test needed some adapting in the way buckets are randomly generated: all aggs need to share the same interval, minDocCount and emptyBucketInfo. The assertions also need to take into account that more (or fewer) buckets are expected depending on minDocCount. This originated from elastic#35921 and its need to test adding empty buckets as part of the reduce phase. It also relates to elastic#26856, as one more key comparison needed to use `Double.compare` to properly handle `NaN` values; this was triggered by the increased test coverage.
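As a rough illustration of why the key comparison matters, here is a minimal sketch contrasting `==` with `Double.compare` on `NaN` keys. The class and method names are hypothetical and are not taken from the Elasticsearch code base; only the behaviour of `==` and `Double.compare` is standard Java.

```java
public class NaNKeyComparison {

    // Hypothetical helper: compares two bucket keys with the == operator.
    // NaN == NaN is always false, so two NaN keys never look equal.
    static boolean equalWithOperator(double a, double b) {
        return a == b;
    }

    // Hypothetical helper: compares two bucket keys with Double.compare,
    // which treats NaN as equal to itself (and greater than any other value).
    static boolean equalWithCompare(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        double key = Double.NaN;
        System.out.println("== on NaN keys:             " + equalWithOperator(key, key));
        System.out.println("Double.compare on NaN keys: " + equalWithCompare(key, key));
    }
}
```

Note that `Double.compare` also distinguishes `0.0` from `-0.0`, which `==` does not, so the two approaches differ in more than the `NaN` case.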
Pinging @elastic/es-analytics-geo
jimczi
left a comment
I left one question, LGTM otherwise
public void setUp() throws Exception {
    super.setUp();
    keyed = randomBoolean();
    format = randomNumericDocValueFormat();
Can you add a small comment explaining why we need to use the same interval, offset, ... in all tests?
The other solution would be to add an abstract method in InternalAggregationTestCase that creates a list of random instances for the reduce test.
I will add a comment. Given the issues that this triggered in other test methods, and the fact that we were already doing the same for a couple of fields, I think it makes sense to make this change overall rather than changing only the reduce test. I think it makes the test more realistic too.
I thought a bit more about this and I think I understand your comment better now. Why limit the randomization of different test instances if the same interval etc. are needed only for proper reduction tests? On the other hand, it seems like reduction tests are the only case where we call createTestInstance multiple times as part of the same test method. The consequence of the current change is that all of the test methods in the same run will reuse the same interval, offset, minDocCount and emptyBucketInfo values, which may limit test coverage a bit (it's not true that it makes the test more realistic, as I previously said). I wonder if it's worth changing this, though, and making the base class more complicated.
We could have a createTestReduceInstances that by default calls createTestInstance and that can be overridden by subclasses like this one to ensure that the instances share the same parameters. I agree that it would complicate the base class a little bit, but it would also make the test more realistic, so I have mixed feelings. Let's continue the discussion since we also need to fix the date_histogram tests.
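To make the suggestion concrete, here is a minimal, self-contained sketch of what such a hook could look like. Everything in it is an assumption for illustration: `BaseTestCase`, `Histogram`, `HistogramTestCase` and the `createTestReduceInstances(int)` signature are hypothetical stand-ins, not the actual `InternalAggregationTestCase` API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReduceInstancesSketch {

    // Hypothetical stand-in for the base test case.
    abstract static class BaseTestCase<T> {
        protected final Random random = new Random();

        protected abstract T createTestInstance();

        // Default: each instance is created independently, so parameters may differ.
        protected List<T> createTestReduceInstances(int count) {
            List<T> instances = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                instances.add(createTestInstance());
            }
            return instances;
        }
    }

    // Hypothetical, heavily simplified histogram holding only two of the shared parameters.
    static final class Histogram {
        final double interval;
        final long minDocCount;

        Histogram(double interval, long minDocCount) {
            this.interval = interval;
            this.minDocCount = minDocCount;
        }
    }

    static class HistogramTestCase extends BaseTestCase<Histogram> {
        private double randomInterval() {
            return 1 + random.nextInt(10);
        }

        private long randomMinDocCount() {
            return random.nextInt(3); // includes 0
        }

        // Fully randomized for tests that only need a single instance.
        @Override
        protected Histogram createTestInstance() {
            return new Histogram(randomInterval(), randomMinDocCount());
        }

        // Reduce tests need instances that agree on interval and minDocCount,
        // so the shared values are drawn once and reused for every instance.
        @Override
        protected List<Histogram> createTestReduceInstances(int count) {
            double interval = randomInterval();
            long minDocCount = randomMinDocCount();
            List<Histogram> instances = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                instances.add(new Histogram(interval, minDocCount));
            }
            return instances;
        }
    }
}
```

With this shape, single-instance tests keep full randomization, while only the reduce path pays the cost of shared parameters. The trade-off is the one raised above: an extra extension point in the base class.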
In `InternalHistogramTests` we were randomizing different values, but `minDocCount` was hardcoded to `1`. It's important to test other values, especially `0`, as it's the default. To make this possible, the test needed some adapting in the way buckets are randomly generated: all aggs need to share the same `interval`, `minDocCount` and `emptyBucketInfo`. The assertions also need to take into account that more (or fewer) buckets are expected depending on `minDocCount`. This originated from #35921 and its need to test adding empty buckets as part of the reduce phase. It also relates to #26856, as one more key comparison needed to use `Double.compare` to properly handle `NaN` values, which was triggered by the increased test coverage.