Description
This is intended to be a starting discussion about the options and possibilities for range scripting. While working on aggregation support for range fields, we've encountered a couple of situations that seem better served by scripting than by new aggregations or aggregation parameters. A couple of examples:
- We would like to provide numerical metric aggregations, such as max and min, on ranges, but ranges have no natural ordering, so users would need a way to specify how to order them (e.g. by start point, end point, or range width). To support this strictly in aggs, we'd need either range-specific metric aggregations or a new parameter to select which mode to use. Neither is terribly appealing. Instead, we'd like to let users send a script that extracts whatever single number the user wants the metric to operate on.
- Open-ended ranges (or any very large range, really) currently generate too many buckets and trip a circuit breaker when aggregating (see "date_histogram of date_range with null end points triggers a circuit breaker", #50109). Scripting could offer a recourse for this situation, allowing users either to rewrite large ranges (in the linked issue, they want to replace their unbounded endpoint with `now()`, for example) or simply to omit ranges that would produce too many buckets.
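To make the two examples concrete, here is a minimal sketch in plain Java (not Painless, and not actual Elasticsearch code). Every name in it (`SimpleRange`, `RangeScriptSketch`, `clampEnd`, `bucketCount`) is hypothetical and exists only for illustration. It assumes numeric endpoints and models an open end as positive infinity.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Hypothetical stand-in for a range value; not an Elasticsearch class.
final class SimpleRange {
    final double start;
    final double end; // Double.POSITIVE_INFINITY models an open end

    SimpleRange(double start, double end) {
        this.start = start;
        this.end = end;
    }
}

final class RangeScriptSketch {
    // Candidate "scripts": each one reduces a range to a single number.
    static final ToDoubleFunction<SimpleRange> BY_START = r -> r.start;
    static final ToDoubleFunction<SimpleRange> BY_END = r -> r.end;
    static final ToDoubleFunction<SimpleRange> BY_WIDTH = r -> r.end - r.start;

    // A max metric that operates on whatever number the script extracts.
    // Returns NaN for an empty list.
    static double max(List<SimpleRange> ranges, ToDoubleFunction<SimpleRange> script) {
        return ranges.stream().mapToDouble(script).max().orElse(Double.NaN);
    }

    // Rewrite an unbounded end point to a caller-supplied bound (e.g. "now").
    static SimpleRange clampEnd(SimpleRange r, double bound) {
        return r.end == Double.POSITIVE_INFINITY ? new SimpleRange(r.start, bound) : r;
    }

    // Number of fixed-width histogram buckets the range would touch.
    // An open-ended range yields infinity, so a script could use this
    // value to omit ranges that are too expensive to bucket.
    static double bucketCount(SimpleRange r, double interval) {
        return Math.floor(r.end / interval) - Math.floor(r.start / interval) + 1;
    }
}
```

With something along these lines, the ordering choice lives in the script instead of in an aggregation parameter, and the same mechanism could cover both rewriting and filtering large ranges.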
To support this, several things need to happen:
- Painless needs a way to access Range fields and provide an interface to their start and end fields. Those fields are `Object`s in Java, and can resolve to a couple of different types depending on the type of range. See https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/index/mapper/RangeType.java for the list of supported types.
- We need to be able to return ranges from a script, which will likely require some `ValueType` or `ValuesSourceType` support on the aggregation side. I'm not sure what's required on the scripting side.
- Value Scripts seem like a natural choice here, but the aggregations `ValuesSource` logic currently assumes that a Value Script will yield the same type as its input field, which means we can't use that to extract a single numeric value from a range. We probably need to make some changes on the aggregation side for this.
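As a rough illustration of the first point, the sketch below shows one way a script-facing wrapper might expose `Object`-typed endpoints as numbers. It is plain Java with hypothetical names (`ObjectRange`, `asDouble`, `width`), not a proposal for the actual Painless API, and it assumes that only endpoints which are `Number` instances can be reduced to a double. Other endpoint types, such as those of an IP range, are simply rejected here.

```java
// Hypothetical wrapper over a range whose endpoints are untyped Objects.
// This is not an Elasticsearch or Painless class.
final class ObjectRange {
    final Object from;
    final Object to;

    ObjectRange(Object from, Object to) {
        this.from = from;
        this.to = to;
    }

    // Resolve an endpoint to a double if it is numeric; reject anything else.
    static double asDouble(Object endpoint) {
        if (endpoint instanceof Number) {
            return ((Number) endpoint).doubleValue();
        }
        throw new IllegalArgumentException(
            "endpoint is not numeric: " + (endpoint == null ? "null" : endpoint.getClass().getName()));
    }

    // A script-style accessor that yields a single number from the range.
    // The output type (double) differs from the input type (a range).
    double width() {
        return asDouble(to) - asDouble(from);
    }
}
```

Note that `width()` returns a type different from its input, which is exactly the case the current Value Script assumption does not allow for.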