Histogram bucketing negative values incorrectly #8082

@baelter

I've been trying to make sense of Elasticsearch histogram aggregations for the last couple of days, and I've found that they don't work as expected, or even as documented.

Let's say I want to aggregate like so:

"aggregations": {
  "sea_water_temperature": {
    "histogram": {
      "field": "sea_water_temperature",
      "interval": 3
    }
  }
}

The response buckets look fine at first glance, but when I query for documents within the bounds of a bucket, I don't get the same document count that the bucket reported. E.g.

"filter": {
  "range": {
    "sea_water_temperature": {
      "lt": 0,
      "gte": -3
    }
  }
}

This can return x documents while the bucket with key -3 reports a doc_count of y, where x and y differ. It seems to be an issue only for negative bucket keys.
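To spell out the expectation: every bucket's doc_count should equal the number of documents in the half-open range [key, key + interval). Below is a minimal Python sketch of that invariant. The sample temperatures are made up for illustration, and `range_count` is a hypothetical helper that stands in for the range filter; nothing here calls Elasticsearch.

```python
import math
from collections import Counter

# Made-up sample values, not taken from any real index.
temps = [-2.5, -0.5, -3.0, 0.0, 1.2, -4.1]
interval = 3

# Bucket each value by flooring toward negative infinity.
buckets = Counter(math.floor(t / interval) * interval for t in temps)

def range_count(values, gte, lt):
    """Hypothetical stand-in for a range filter with gte/lt bounds."""
    return sum(1 for v in values if gte <= v < lt)

# Each bucket count should match the equivalent range query.
for key, doc_count in sorted(buckets.items()):
    assert doc_count == range_count(temps, key, key + interval)
```

With floor-based keys the check passes for negative and positive buckets alike, which is the behaviour I expected from histogram.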

The histogram docs state that the bucket key for a given value is:

rem = value % interval
if (rem < 0) {
  rem += interval
}
bucket_key = value - rem
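As a sanity check, here is a rough Python sketch of that formula next to a hypothetical variant that rounds toward zero. The truncating variant is only my guess at the kind of rounding that would produce the symptoms above; I have not confirmed that histogram does this. `math.fmod` is used because it keeps the sign of the dividend, like `%` in Java-style languages, whereas Python's own `%` does not.

```python
import math

def bucket_key(value, interval):
    """Bucket key as described in the docs (floors toward -infinity)."""
    rem = math.fmod(value, interval)  # sign follows the dividend
    if rem < 0:
        rem += interval
    return value - rem

def truncating_key(value, interval):
    """Hypothetical faulty variant: rounds the quotient toward zero."""
    return math.trunc(value / interval) * interval

# Negative values inside (-3, 0) are where the two disagree.
for v in (-3.0, -2.5, -0.5, 0.0, 2.9):
    print(v, bucket_key(v, 3), truncating_key(v, 3))
```

For -2.5 and -0.5 the documented formula gives -3, while the truncating variant puts both in bucket 0. That would leave the -3 bucket undercounted relative to a range filter on [-3, 0), and only negative keys would be affected.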

However, I tried a terms aggregation with that formula as a value script:

"aggregations": {
  "sea_water_temperature": {
    "terms": {
      "field": "sea_water_temperature",
      "script": "rem = _value % interval; rem = rem < 0 ? rem + interval : rem; _value - rem",
      "params": {
        "interval": 3
      }
    }
  }
}

That gives me the same kind of bucketing as histogram, but now my filter queries actually match the doc_counts of the buckets(!). Why isn't histogram working as described? Or am I missing something?