Revisit defaults for the cardinality aggregation? #13985

Description

@jpountz

The precision_threshold parameter of the cardinality aggregation has an impact not only on accuracy but also on memory usage. This is why, by default, we decide how much memory a cardinality aggregation may use based on how deep it sits in the aggregation tree. For instance, a top-level cardinality aggregation would use 16KB of memory, a cardinality aggregation under a terms aggregation would use 512 bytes per bucket, and a cardinality aggregation under two (or more) levels of terms aggregations would use 16 bytes per bucket.
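For context, the memory/accuracy trade-off can also be pinned explicitly per request via precision_threshold rather than relying on these depth-based defaults. A minimal sketch of what that looks like (the index and field names are hypothetical, and the threshold value is only an example):

```json
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "distinct_authors": {
      "cardinality": {
        "field": "author",
        "precision_threshold": 3000
      }
    }
  }
}
```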

Unfortunately, it's not easy to get precise counts with only 16 bytes of memory, which can make the out-of-the-box experience a bit disappointing. I think we have several (non-exclusive) options here:

  • increase default memory usage, but I'm nervous about making it even easier to trigger circuit-breaker errors or, worse, out-of-memory errors. Maybe "Define good heuristics to use collect_mode: breadth_first" (#9825) could help here: we could decide to always run terms aggs in breadth-first mode if there is a cardinality agg under them, so that the cardinality aggregation is computed on fewer buckets (see the sketch after this list)
  • better document these defaults
  • move parts of the aggs computation to disk so that we can increase our defaults more safely
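To illustrate the collect_mode: breadth_first idea from the first option, here is a rough sketch of a terms aggregation run in breadth-first mode with a cardinality sub-aggregation, so the cardinality is only computed on the buckets that survive pruning (index and field names are hypothetical):

```json
POST /my-index/_search
{
  "size": 0,
  "aggs": {
    "by_category": {
      "terms": {
        "field": "category",
        "collect_mode": "breadth_first"
      },
      "aggs": {
        "distinct_users": {
          "cardinality": {
            "field": "user_id"
          }
        }
      }
    }
  }
}
```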
