Closed
Labels
:Analytics/Aggregations, >enhancement, Team:Analytics, high hanging fruit
Description
NOTE: This is very similar to #9572 about histogram, and is probably similarly blocked by #12316, but I didn't want to hijack that one given that it's a different aggregation. I also couldn't see any mention of the "swamping" problem described below as the second motivation.
A range aggregation is specified using an array of explicit bucket ranges:
    "range" : {
        "field" : "price",
        "ranges" : [
            { "to" : 50 },
            { "from" : 50, "to" : 100 },
            { "from" : 100 }
        ]
    }
I'm proposing an alternative which just requests a maximum number of buckets to return; syntax is open to bikeshedding but could be e.g.
    "range" : {
        "field" : "price",
        "ranges" : 3
    }
The motivation for this is twofold:
- One very common use case for range aggs is creating drilldown UI to augment a primary text-based search. The end user doesn't enter the range boundaries; rather, the client application does a calibration request with something like a `percentiles` agg, aiming for N buckets with doc counts as even as possible, and sets the `range` bucket boundaries for the main request based on the results. The problem here is duplicate work: it means two round-trips to Elastic rather than one, both trips need to apply the request's filtering criteria, and the dependency on those criteria means that the result of the `percentiles` agg can't usefully be cached. Doing the `percentiles` internally on the server would avoid the network overhead of the second call, and could hopefully avoid repeating the document filtering step.
- Occasionally the matching docs are swamped by one value, meaning that (for example) requesting `[25,50,75]` percentiles can return the same value for all three, making it almost useless for drilldown. We can work around this when using the Java client by binary-chopping the returned `Percentiles` until we find something useful, but this isn't possible via the REST API and won't be possible via the Java client either (see "Feature gap between Java and HTTP APIs for Percentiles aggregation", #23610) once it switches to REST transport.
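For illustration, the two-step client-side workaround described in the first point might look roughly like the sketch below. This is a minimal sketch and not code from any Elasticsearch client: the helper names `calibration_request` and `ranges_from_percentiles` and the agg name `calibration` are hypothetical, and the percentile values are assumed to have already been pulled out of the first response as a plain list.

```python
from typing import Dict, List, Optional


def calibration_request(field: str, n_buckets: int) -> Dict:
    """Body of the first (calibration) request: a percentiles agg whose cut
    points would split the matching docs into n_buckets roughly even parts.
    Hypothetical helper; the agg name "calibration" is arbitrary."""
    percents = [100.0 * i / n_buckets for i in range(1, n_buckets)]
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "aggs": {
            "calibration": {
                "percentiles": {"field": field, "percents": percents}
            }
        },
    }


def ranges_from_percentiles(values: List[Optional[float]]) -> List[Dict[str, float]]:
    """Turn percentile values from the calibration response into the explicit
    "ranges" array for the main request. Repeated boundaries (the swamping
    case) are collapsed, so fewer buckets than requested may come back."""
    boundaries = sorted({v for v in values if v is not None})
    if not boundaries:
        return []
    ranges: List[Dict[str, float]] = [{"to": boundaries[0]}]
    for lo, hi in zip(boundaries, boundaries[1:]):
        ranges.append({"from": lo, "to": hi})
    ranges.append({"from": boundaries[-1]})
    return ranges


if __name__ == "__main__":
    print(calibration_request("price", 4))
    print(ranges_from_percentiles([50, 100]))     # normal case
    print(ranges_from_percentiles([50, 50, 50]))  # swamped by one value
```

Note that in the swamped case the sketch only collapses the duplicate boundaries; it does not recover useful buckets, which is the gap the second point describes.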