Filter(s) aggregation should create weights only once. #15998
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
We have a performance bug that if a filter aggregation is below a terms
aggregation that has a cardinality of 1000, we will call Query.createWeight
1000 times as well. However, Query.createWeight can be a costly operation.
For instance in the case of a TermQuery it will seek the term in every
segment. Instead, we should create the Weight once, and then get as many
iterators as we need from this Weight.
I found this problem while trying to diagnose a performance regression while
upgrading from 1.7 to 2.1[1]. While the problem was not introduced in 2.x, the
fact that 1.7 cached very aggressively had hidden this problem, since you don't
need to seek the term anymore on a cached TermFilter.
Doing things once for every aggregator is not easy with the current API but
I discussed this with Colin and Aggregator factories will need to get an init
method for different reasons, where we will be able to put these steps that
need to be performed only once, no matter haw many aggregators need to be
created.
[1] https://discuss.elastic.co/t/aggregations-in-2-1-0-much-slower-than-1-6-0/38056/26