I've been trying to make sense of Elasticsearch histogram aggregations for the last couple of days, and I've found that they don't work as expected, or even as documented.
Let's say I want to aggregate like so:
"aggregations": {
"sea_water_temperature": {
"histogram": {
"field": "sea_water_temperature",
"interval": 3
}
}
}
The response buckets look fine at first glance, but when I query for the documents within the bounds of a bucket, I don't get the document count the bucket reported. E.g.
"filter": {
"range": {
"sea_water_temperature": {
"lt": 0,
"gte": -3
}
}
}
This can return x hits while the bucket with key "-3" has a doc_count of y, where x and y differ. It seems to be an issue only for negative bucket keys.
The docs for the histogram aggregation state that the bucket key for a given value is:
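For reference, this is how I derive the filter bounds from a bucket: a minimal sketch in Python, assuming each bucket covers the half-open range from its key (inclusive) to key + interval (exclusive). The helper name `bucket_range_filter` is made up for illustration and is not part of any Elasticsearch client.

```python
def bucket_range_filter(field, key, interval):
    """Build the range filter that should match exactly one histogram bucket.

    Assumption: a bucket with key k holds the values in [k, k + interval).
    """
    return {"range": {field: {"gte": key, "lt": key + interval}}}


# The bucket with key -3 and interval 3 yields the filter shown above.
print(bucket_range_filter("sea_water_temperature", -3, 3))
```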
rem = value % interval
if (rem < 0) {
rem += interval
}
bucket_key = value - rem
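To show what I expect from that formula, here is a small Python sketch of it, checked against a plain range count. The sample temperatures are invented for illustration, and `math.fmod` is used on the assumption that `%` in the docs keeps the sign of the dividend (as it does in Java), which is why the `rem < 0` correction is needed.

```python
import math
from collections import Counter


def bucket_key(value, interval):
    """Bucket key as described in the histogram docs."""
    rem = math.fmod(value, interval)  # keeps the sign of `value`
    if rem < 0:
        rem += interval
    return value - rem


# Hypothetical sample data, not taken from a real index.
temperatures = [-4.5, -3.0, -2.5, -0.5, 0.0, 1.5, 3.0]
interval = 3

# Emulate the aggregation: count values per bucket key.
doc_counts = Counter(bucket_key(t, interval) for t in temperatures)

# Emulate the range filter for the "-3" bucket: gte -3, lt 0.
in_range = sum(1 for t in temperatures if -3 <= t < 0)

print(dict(doc_counts))
print(in_range)
```

With this formula the count for key -3 and the range count agree, which is the behaviour I expected from the histogram aggregation as well.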
However, I tried a terms aggregation with that formula as a value script:
"aggregations": {
"sea_water_temperature": {
"terms": {
"field": "sea_water_temperature",
"script": "rem = _value % interval; rem = rem < 0 ? rem + interval : rem; _value - rem",
"params": {
"interval": 3
}
}
}
}
That gives me the same kind of bucketing as histogram does, but now my filter queries actually match the doc_counts of the buckets (!). Why isn't histogram working as described? Or am I missing something?