-
Notifications
You must be signed in to change notification settings - Fork 25.6k
Fix a bug with missing fields in sig_terms #57757
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When you run a `significant_terms` aggregation on a field and it *is* mapped but there aren't any values for it then the count of the documents that match the query on that shard still have to be added to the overall doc count. I broke that in elastic#57361. This fixes that. Closes elastic#57402
|
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
|
Two little notes:
|
| } | ||
| } | ||
|
|
||
| public void testAllDocsWithoutStringFieldviaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These actually reproduce the issue.
| } | ||
| } | ||
|
|
||
| public void testSomeDocsWithoutStringFieldviaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought this would reproduce the issue but it doesn't. It still feels useful to add.
| } | ||
| } | ||
|
|
||
| public void testThreeLayerStringViaGlobalOrds() throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are coming in #57758 and I thought they might reproduce the issue so I added them as well. No dice. Either way, they are nice to have.
| IndexReader topReader = searcher.getIndexReader(); | ||
| int supersetSize = topReader.numDocs(); | ||
| return new SignificantStringTerms(name, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), | ||
| metadata(), format, 0, supersetSize, significanceHeuristic, emptyList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0 bad!
not-napoleon
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left a few asks around clarifying tests.
| testAllDocsWithoutStringField("map"); | ||
| } | ||
|
|
||
| private void testAllDocsWithoutStringField(String executionHint) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test (or it's mapped/global wrappers, if you'd rather) needs javadoc. It's not at all clear from the name what path this is intended to exercise.
| try (Directory dir = newDirectory()) { | ||
| try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) { | ||
| Document d = new Document(); | ||
| d.add(new SortedDocValuesField("f", new BytesRef("f"))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does d not get added to the index? Why are we even creating this document?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be a copy and paste error. Sorry!
| public void testAllDocsWithoutNumericField() throws IOException { | ||
| try (Directory dir = newDirectory()) { | ||
| try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) { | ||
| Document d = new Document(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As above, why do we even need this document?
|
Thanks @not-napoleon! I'll push a cleanup from your review comments in a moment and merge when CI is happy. |
When you run a `significant_terms` aggregation on a field and it *is* mapped but there aren't any values for it then the count of the documents that match the query on that shard still have to be added to the overall doc count. I broke that in elastic#57361. This fixes that. Closes elastic#57402
When you run a
significant_termsaggregation on a field and it ismapped but there aren't any values for it then the count of the
documents that match the query on that shard still have to be added to
the overall doc count. I broke that in #57361. This fixes that.
Closes #57402