@javanna javanna commented Nov 28, 2018

In `InternalHistogramTests` we were randomizing different values, but `minDocCount` was hardcoded to `1`. It's important to test other values, especially `0`, as it's the default. To make this possible, the test needed some adapting in the way buckets are randomly generated: all aggs need to share the same `interval`, `minDocCount` and `emptyBucketInfo`. Assertions also need to take into account that more (or fewer) buckets are expected depending on `minDocCount`.

This originated from #35921 and its need to test adding empty buckets as part of the reduce phase.
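To illustrate why the expected bucket count depends on `minDocCount`, here is a rough, standalone sketch. It is not the code from this PR: `MinDocCountSketch` and `expectedKeys` are hypothetical names, keys are assumed to be aligned to the interval, and the bounds carried by `emptyBucketInfo` are ignored for simplicity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, simplified model of which bucket keys a test would expect after a reduce.
class MinDocCountSketch {

    // merged: key -> summed doc count across all reduced aggs (keys assumed interval-aligned)
    static List<Double> expectedKeys(TreeMap<Double, Long> merged, double interval, long minDocCount) {
        List<Double> keys = new ArrayList<>();
        if (merged.isEmpty()) {
            return keys;
        }
        if (minDocCount == 0) {
            // Gaps are filled with empty buckets, so every step between
            // the first and the last key is expected to be present.
            double first = merged.firstKey();
            double last = merged.lastKey();
            long steps = Math.round((last - first) / interval);
            for (long i = 0; i <= steps; i++) {
                keys.add(first + i * interval);
            }
        } else {
            // No gap filling: only buckets that reach minDocCount survive.
            for (Map.Entry<Double, Long> entry : merged.entrySet()) {
                if (entry.getValue() >= minDocCount) {
                    keys.add(entry.getKey());
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        TreeMap<Double, Long> merged = new TreeMap<>();
        merged.put(0.0, 3L);
        merged.put(20.0, 1L);
        merged.put(30.0, 5L);
        System.out.println(expectedKeys(merged, 10.0, 0)); // one more bucket (the gap at 10.0)
        System.out.println(expectedKeys(merged, 10.0, 1)); // exactly the non-empty buckets
        System.out.println(expectedKeys(merged, 10.0, 2)); // one bucket fewer (20.0 is dropped)
    }
}
```

With `minDocCount` set to `0` the gap at `10.0` is filled, while a value of `2` drops the bucket at `20.0`, which is why a hardcoded `1` never exercised either path.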

Also relates to #26856 as one more key comparison needed to use Double.compare to properly handle NaN values, which was triggered by the increased test coverage.
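For context on the `NaN` point, a minimal standalone illustration of the difference between the `==` operator and `Double.compare` (the class and method names are made up for this example and do not appear in the PR):

```java
// Shows why comparing bucket keys with == breaks for NaN, while Double.compare does not.
class NaNKeyComparison {

    // Primitive comparison: NaN is never equal to anything, including itself.
    static boolean sameKeyWithOperator(double a, double b) {
        return a == b;
    }

    // Double.compare defines a total order in which NaN equals NaN
    // and sorts after every other value, including positive infinity.
    static boolean sameKeyWithCompare(double a, double b) {
        return Double.compare(a, b) == 0;
    }

    public static void main(String[] args) {
        System.out.println(sameKeyWithOperator(Double.NaN, Double.NaN)); // false
        System.out.println(sameKeyWithCompare(Double.NaN, Double.NaN));  // true
        System.out.println(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) > 0); // true
    }
}
```

Note that `Double.compare` also distinguishes `0.0` from `-0.0`, which `==` does not, so the two approaches are not interchangeable in general.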

@javanna javanna added the >test, :Analytics/Aggregations, v7.0.0 and v6.6.0 labels Nov 28, 2018
@javanna javanna requested review from colings86 and jimczi November 28, 2018 14:29
@elasticmachine

Pinging @elastic/es-analytics-geo

@jimczi jimczi left a comment

I left one question, LGTM otherwise

public void setUp() throws Exception {
    super.setUp();
    keyed = randomBoolean();
    format = randomNumericDocValueFormat();
Contributor

Can you add a small comment explaining why we need to use the same interval, offset, ... in all tests?
The other solution would be to add an abstract method in InternalAggregationTestCase that creates a list of random instances for the reduce test.

Member Author

I will add a comment. Given the issues that this triggered in other test methods, and the fact that we were already doing the same for a couple of fields, I think it makes sense to make this change overall rather than changing only the reduce test. I think it makes the test more realistic too.

Member Author

I thought a bit more about this and I think I understand your comment better now. Why limit the randomization of different test instances if the same interval etc. is needed only for proper reduction tests? On the other hand, it seems like reduction tests are the only case where we call `createTestInstance` multiple times as part of the same test method. The consequence of the current change is that all of the test methods in the same run will reuse the same interval, offset, `minDocCount` and `emptyBucketInfo` values, which may limit test coverage a bit (it's not true that it makes the test more realistic, as I previously said). I wonder if it's worth changing this, though, and making the base class more complicated.

Contributor

We could have a `createTestReduceInstances` that by default calls `createTestInstance` and that can be overridden by subclasses like this one to ensure that the instances share the same parameters. I agree that it would complicate the base class a little bit, but it would also make the test more realistic, so I have mixed feelings. Let's continue the discussion since we also need to fix the date_histogram tests.
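A rough sketch of what such a hook could look like, using stand-in classes rather than the real `InternalAggregationTestCase`. Everything here (`BaseTestCase`, `Histo`, `HistoTests`, the method signature) is hypothetical and only meant to show the shape of the idea:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class ReduceInstancesSketch {

    // Stand-in for an aggregation result; it only carries the shared parameters.
    static final class Histo {
        final double interval;
        final long minDocCount;

        Histo(double interval, long minDocCount) {
            this.interval = interval;
            this.minDocCount = minDocCount;
        }
    }

    // Stand-in for the base test case.
    abstract static class BaseTestCase<T> {
        final Random random = new Random();

        abstract T createTestInstance();

        // Hypothetical hook: by default the reduce test gets independent random instances.
        List<T> createTestReduceInstances(int size) {
            List<T> instances = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                instances.add(createTestInstance());
            }
            return instances;
        }
    }

    static class HistoTests extends BaseTestCase<Histo> {
        @Override
        Histo createTestInstance() {
            // Fully randomized, as other test methods would want.
            return new Histo(1 + random.nextInt(100), random.nextInt(3));
        }

        @Override
        List<Histo> createTestReduceInstances(int size) {
            // Instances that get reduced together must share these parameters.
            double interval = 1 + random.nextInt(100);
            long minDocCount = random.nextInt(3);
            List<Histo> instances = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                instances.add(new Histo(interval, minDocCount));
            }
            return instances;
        }
    }

    public static void main(String[] args) {
        List<Histo> instances = new HistoTests().createTestReduceInstances(5);
        for (Histo h : instances) {
            System.out.println(h.interval + " / " + h.minDocCount);
        }
    }
}
```

The trade-off is the one discussed above: the other test methods keep full randomization through `createTestInstance`, at the cost of one extra overridable method in the base class.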

@javanna javanna merged commit 4b85769 into elastic:master Nov 28, 2018
javanna added a commit that referenced this pull request Dec 3, 2018
Labels

:Analytics/Aggregations, >test, v6.6.0, v7.0.0-beta1
