Enhancement: Range agg specified as max bucket count rather than explicit ranges

**NOTE:** This is very similar to #9572 about `histogram`, and is probably similarly blocked by #12316, but I didn't want to hijack that one given that it's a different aggregation. I also couldn't see any mention of the "swamping" problem described below as the second motivation.

A [range aggregation](https://www.elastic.co/guide/en/elasticsearch/reference/5.3/search-aggregations-bucket-range-aggregation.html) is specified using an array of explicit bucket ranges:

~~~
"range" : {
    "field" : "price",
    "ranges" : [
        { "to" : 50 },
        { "from" : 50, "to" : 100 },
        { "from" : 100 }
    ]
}
~~~

I'm proposing an alternative which just requests a maximum number of buckets to return. The syntax is open to bikeshedding, but could be e.g.:

~~~
"range" : {
    "field" : "price",
    "ranges" : 3
}
~~~

The motivation for this is twofold:

1. One very common use case for range aggs is creating a drilldown UI to augment a primary text-based search. The end-user doesn't enter the range boundaries; rather, the client application does a calibration request with something like a `percentiles` agg, aiming for ***N*** buckets with doc counts as even as possible, and sets the `range` bucket boundaries for the main request based on the results. The problem here is duplicated work: it means two round-trips to Elasticsearch rather than one, both requests have to apply the same filtering criteria, and the dependency on those criteria means that the result of the `percentiles` agg can't usefully be cached. Doing the `percentiles` step internally on the server would avoid the network overhead of the second call, and could hopefully avoid repeating the document filtering step.
2. Occasionally the matching docs are swamped by one value, meaning that (for example) requesting `[25,50,75]` percentiles can return the same value for all three, making the result almost useless for drilldown. We can work around this when using the Java client by binary-chopping the returned `Percentiles` until we find something useful, but this isn't possible via the REST API, and won't be possible via the Java client either once it switches to REST transport (see #23610).
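
For illustration, here is a rough Python sketch of the client-side step that sits between the two round-trips today. The function names are hypothetical, and it assumes the calibration response's `values` object maps percentile keys to numbers (e.g. `{"25.0": 40.0}`). It only collapses repeated boundaries in the swamped case; it does not go back and look for better ones.

```python
def calibration_percents(n_buckets):
    """Evenly spaced percents to request from a `percentiles` agg
    when aiming for `n_buckets` buckets (hypothetical helper)."""
    return [100.0 * i / n_buckets for i in range(1, n_buckets)]


def ranges_from_percentiles(values):
    """Build a `ranges` array for a range agg from percentile values.

    `values` is assumed to look like the `values` object of a
    percentiles agg response, e.g. {"25.0": 40.0, "50.0": 50.0}.
    """
    # Sort and de-duplicate the boundaries; skip percentiles with no value.
    boundaries = sorted({v for v in values.values() if v is not None})
    if not boundaries:
        return []
    # Open-ended first bucket, closed middle buckets, open-ended last bucket.
    ranges = [{"to": boundaries[0]}]
    for lo, hi in zip(boundaries, boundaries[1:]):
        ranges.append({"from": lo, "to": hi})
    ranges.append({"from": boundaries[-1]})
    return ranges
```

With distinct boundaries this reproduces the shape of the explicit `ranges` array shown earlier. When all percentiles come back equal, it degrades to just two buckets, which is exactly the swamping problem in point 2.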

Enhancement: Range agg specified as max bucket count rather than explicit ranges #24254
