Should we continue to account 5kb per bucket in the request circuit breaker #62240

@jimczi

Today, aggregations that run on shards add 5kb to the circuit breaker every time they create a new bucket. This estimation was added because most of the memory consumed in aggregations was not accounted for properly.
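To make the mechanism concrete, here is a minimal toy model of the per-bucket accounting described above. It is not Elasticsearch's implementation: `ToyRequestBreaker`, `add_estimate`, `collect_buckets`, and the 1 MiB limit are all hypothetical names and values chosen for illustration, and 5kb is assumed to mean 5 * 1024 bytes.

```python
class CircuitBreakingError(Exception):
    """Raised when the toy breaker's limit would be exceeded."""


class ToyRequestBreaker:
    """Hypothetical stand-in for a request circuit breaker (illustration only)."""

    def __init__(self, limit_bytes: int) -> None:
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def add_estimate(self, num_bytes: int) -> None:
        # Reject the reservation if it would push usage past the limit.
        new_used = self.used_bytes + num_bytes
        if new_used > self.limit_bytes:
            raise CircuitBreakingError(
                f"would use {new_used} bytes, limit is {self.limit_bytes}"
            )
        self.used_bytes = new_used


# Flat overhead charged per new bucket (assumption: 5kb == 5 * 1024 bytes).
BUCKET_OVERHEAD_BYTES = 5 * 1024


def collect_buckets(breaker: ToyRequestBreaker, bucket_count: int) -> int:
    """Charge the flat overhead once per bucket; return how many were created."""
    created = 0
    for _ in range(bucket_count):
        breaker.add_estimate(BUCKET_OVERHEAD_BYTES)
        created += 1
    return created


if __name__ == "__main__":
    breaker = ToyRequestBreaker(limit_bytes=1024 * 1024)  # hypothetical 1 MiB limit
    try:
        collect_buckets(breaker, 1000)
    except CircuitBreakingError as err:
        print(f"tripped after {breaker.used_bytes // BUCKET_OVERHEAD_BYTES} buckets: {err}")
```

In this toy model the breaker trips purely because of the flat charge, regardless of how much memory the buckets really use, which is the concern raised in this issue.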

However, the situation has changed: we now properly account for almost every allocation in aggregations, and the missing ones are tracked. The extra memory that was used to create doc values iterators is also gone, thanks to a large refactoring that allows creating a single instance no matter how many buckets the parent creates. Finally, the coordinating node should be able to track the memory used to reduce aggregations, so we need to make sure that the accounting in the request circuit breaker is not too far from the truth.

Adding an arbitrary 5kb for each bucket means that a node running an aggregation that creates 1M buckets will occupy 5GB in the request circuit breaker, even though we return only the top N. The reality for a terms aggregation is closer to a few bytes than to 5kb, so I hope we can revive some tests to check the validity of this estimation again.
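The arithmetic behind the 5GB figure can be checked with a short sketch. The 16 bytes per bucket used for comparison is a purely hypothetical stand-in for "a few bytes", not a measured value, and 5kb is assumed to mean 5 * 1024 bytes.

```python
# Flat per-bucket overhead from the issue (assumption: 5kb == 5 * 1024 bytes).
FLAT_BUCKET_OVERHEAD_BYTES = 5 * 1024


def flat_estimate_bytes(bucket_count: int) -> int:
    """Bytes reserved in the breaker under the flat 5kb-per-bucket rule."""
    return bucket_count * FLAT_BUCKET_OVERHEAD_BYTES


def per_bucket_estimate_bytes(bucket_count: int, bytes_per_bucket: int) -> int:
    """Bytes reserved if each bucket were charged an assumed real cost."""
    return bucket_count * bytes_per_bucket


if __name__ == "__main__":
    buckets = 1_000_000
    flat = flat_estimate_bytes(buckets)
    # Hypothetical 16 bytes per bucket, standing in for "a few bytes".
    assumed = per_bucket_estimate_bytes(buckets, 16)
    print(f"flat rule:      {flat / 1e9:.2f} GB")
    print(f"assumed actual: {assumed / 1e6:.2f} MB")
    print(f"overestimate:   {flat // assumed}x")
```

Under these assumptions the flat rule reserves hundreds of times more than the per-bucket figure, which is why re-validating the estimate with tests seems worthwhile.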

Labels: :Analytics/Geo, >enhancement, Team:Analytics, team-discuss
