@nik9000
Member

@nik9000 nik9000 commented Jun 5, 2020

When you run a `significant_terms` aggregation on a field that *is*
mapped but has no values, the count of the documents that match the
query on that shard still has to be added to the overall doc count.
I broke that in #57361. This fixes that.

Closes #57402
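
To make the failure mode concrete, here is a minimal sketch in plain Java. It is not the Elasticsearch code. It assumes each shard reports a subset size (docs matching the query) and a superset size (docs on the shard), and that the reduce step simply sums them. `SubsetSizeSketch`, `ShardResult`, `reduce`, `emptyResult`, and the numbers are all hypothetical names and values chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class SubsetSizeSketch {
    /** Hypothetical per-shard result: docs matching the query and total docs on the shard. */
    static final class ShardResult {
        final long subsetSize;
        final long supersetSize;

        ShardResult(long subsetSize, long supersetSize) {
            this.subsetSize = subsetSize;
            this.supersetSize = supersetSize;
        }
    }

    /** Sums subset and superset sizes across shards, standing in for the reduce step. */
    static ShardResult reduce(List<ShardResult> shards) {
        long subset = 0;
        long superset = 0;
        for (ShardResult shard : shards) {
            subset += shard.subsetSize;
            superset += shard.supersetSize;
        }
        return new ShardResult(subset, superset);
    }

    /**
     * Empty result for a shard where the field is mapped but has no values.
     * The buggy variant reports 0 matched docs; the fixed variant keeps the real count.
     */
    static ShardResult emptyResult(long matchedDocs, long docsOnShard, boolean buggy) {
        return new ShardResult(buggy ? 0 : matchedDocs, docsOnShard);
    }

    public static void main(String[] args) {
        // Shard A has values for the field: 3 of its 10 docs match the query.
        ShardResult withValues = new ShardResult(3, 10);
        // Shard B has no values for the field, but 2 of its 5 docs still match the query.
        ShardResult buggy = reduce(Arrays.asList(withValues, emptyResult(2, 5, true)));
        ShardResult fixed = reduce(Arrays.asList(withValues, emptyResult(2, 5, false)));
        System.out.println("buggy subset size: " + buggy.subsetSize);
        System.out.println("fixed subset size: " + fixed.subsetSize);
    }
}
```

In this toy setup the buggy variant reports a subset size of 3 and the fixed variant reports 5, while the superset size is 15 either way.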

@elasticmachine
Collaborator

Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)

@elasticmachine elasticmachine added the Team:Analytics label (Meta label for analytical engine team (ESQL/Aggs/Geo)) Jun 5, 2020
@nik9000
Member Author

nik9000 commented Jun 5, 2020

Two little notes:

  1. I marked this as a non-issue because it fixes a bug we haven't released.
  2. I've opened this against the 7.x branch because that is the branch where I could reproduce it. I'll forward port the whole thing to master once this is merged. That isn't normal, but it is convenient in this case because the change is small and, like I said, I couldn't reproduce this in master.

}
}

public void testAllDocsWithoutStringFieldviaGlobalOrds() throws IOException {
Member Author

These actually reproduce the issue.

}
}

public void testSomeDocsWithoutStringFieldviaGlobalOrds() throws IOException {
Member Author

I thought this would reproduce the issue but it doesn't. It still feels useful to add.

}
}

public void testThreeLayerStringViaGlobalOrds() throws IOException {
Member Author

These are coming in #57758 and I thought they might reproduce the issue so I added them as well. No dice. Either way, they are nice to have.

IndexReader topReader = searcher.getIndexReader();
int supersetSize = topReader.numDocs();
return new SignificantStringTerms(name, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(),
        metadata(), format, 0, supersetSize, significanceHeuristic, emptyList());
Member Author

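0 bad! This hardcoded `0` sits in the subset size position, so the docs that matched the query on this shard were never counted.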

Member

@not-napoleon not-napoleon left a comment

LGTM. Left a few asks around clarifying tests.

testAllDocsWithoutStringField("map");
}

private void testAllDocsWithoutStringField(String executionHint) throws IOException {
Member

This test (or its mapped/global wrappers, if you'd rather) needs javadoc. It's not at all clear from the name what path this is intended to exercise.

try (Directory dir = newDirectory()) {
    try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) {
        Document d = new Document();
        d.add(new SortedDocValuesField("f", new BytesRef("f")));
Member

Does d not get added to the index? Why are we even creating this document?

Member Author

That'd be a copy and paste error. Sorry!

public void testAllDocsWithoutNumericField() throws IOException {
    try (Directory dir = newDirectory()) {
        try (RandomIndexWriter writer = new RandomIndexWriter(random(), dir)) {
            Document d = new Document();
Member

As above, why do we even need this document?

@nik9000
Member Author

nik9000 commented Jun 8, 2020

Thanks @not-napoleon! I'll push a cleanup from your review comments in a moment and merge when CI is happy.

@nik9000 nik9000 merged commit ee0ce8f into elastic:7.x Jun 8, 2020
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Jun 8, 2020
nik9000 added a commit that referenced this pull request Jun 8, 2020
Labels

:Analytics/Aggregations (Aggregations), >non-issue, Team:Analytics (Meta label for analytical engine team (ESQL/Aggs/Geo)), v7.9.0, v8.0.0-alpha1
