Description
I would like to create an NP chart, but I can't currently find a way to do so with Elasticsearch.
An NP chart is a line chart in which the value of each point is the percentage of a fixed number of items that meet some criterion. For example, if my data looks like this:
[
  { date: '2019-01-01', failed: true },
  { date: '2019-01-01', failed: true },
  { date: '2019-01-02', failed: false },
  { date: '2019-01-04', failed: false },
  { date: '2019-01-05', failed: false },
  { date: '2019-01-08', failed: true },
  { date: '2019-01-08', failed: false },
  { date: '2019-01-08', failed: false },
]
Then I want to write a histogram-style aggregation like this, where fixed_size_buckets is the proposed aggregation and does not exist today:
aggs: {
  np_chart: {
    fixed_size_buckets: {
      max_bucket_count: 10,
      max_bucket_documents: 3,
      sort: [{
        date: {
          order: 'asc',
        },
      }],
    },
    aggs: {
      failed_count: {
        filter: {
          term: {
            'failed': true,
          },
        },
      },
    },
  },
},
This should return buckets like this:
[
  {
    key: ...,
    key_as_string: '2019-01-01',
    doc_count: 3,
    failed_count: { doc_count: 2 },
  },
  {
    key: ...,
    key_as_string: '2019-01-04',
    doc_count: 3,
    failed_count: { doc_count: 1 },
  },
  {
    key: ...,
    key_as_string: '2019-01-08',
    doc_count: 2,
    failed_count: { doc_count: 0 },
  },
]
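To make the intended semantics concrete, here is a minimal client-side sketch in Python of what the proposed aggregation would compute. It is not an Elasticsearch API: the function name `fixed_size_buckets` and its parameters simply mirror the hypothetical request above, and I'm assuming that documents with equal dates keep their input order and that each bucket is labelled with the date of its first document.

```python
from itertools import islice


def fixed_size_buckets(docs, max_bucket_documents, max_bucket_count, sort_key="date"):
    """Emulate the proposed aggregation: sort the docs, then cut them into groups of N.

    Hypothetical helper for illustration only; nothing here calls Elasticsearch.
    """
    # ISO-8601 date strings sort correctly as plain strings; sorted() is stable,
    # so documents sharing a date keep their original relative order.
    ordered = iter(sorted(docs, key=lambda d: d[sort_key]))
    buckets = []
    while len(buckets) < max_bucket_count:
        chunk = list(islice(ordered, max_bucket_documents))
        if not chunk:
            break  # ran out of documents before reaching max_bucket_count
        buckets.append({
            "key_as_string": chunk[0][sort_key],  # assumption: label by first doc
            "doc_count": len(chunk),
            "failed_count": {"doc_count": sum(1 for d in chunk if d["failed"])},
        })
    return buckets


docs = [
    {"date": "2019-01-01", "failed": True},
    {"date": "2019-01-01", "failed": True},
    {"date": "2019-01-02", "failed": False},
    {"date": "2019-01-04", "failed": False},
    {"date": "2019-01-05", "failed": False},
    {"date": "2019-01-08", "failed": True},
    {"date": "2019-01-08", "failed": False},
    {"date": "2019-01-08", "failed": False},
]

buckets = fixed_size_buckets(docs, max_bucket_documents=3, max_bucket_count=10)
for b in buckets:
    print(b["key_as_string"], b["doc_count"], b["failed_count"]["doc_count"])
```

Each NP chart point is then `failed_count.doc_count / doc_count` for its bucket. Note that the last bucket can be smaller than N when the documents don't divide evenly, as in the example.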
Obviously, with this few documents I could load them into memory and process them manually, but I'd like to have up to a few thousand documents per bucket, and that's too much to process that way.
There's a variant of the NP chart in which, instead of dividing all the documents into groups of N, we first make a date histogram and then take a random sample of N documents from each day. The proposal above would support both cases.
I think this is different from the other requests for variable-width histograms I've been able to find, but please correct me if this has been proposed elsewhere.
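As a rough illustration of that variant, the sketch below groups documents by day and then samples up to N from each day, again purely client-side in Python. The function name `sampled_daily_buckets` and the `seed` parameter are my own assumptions for the example, and I'm assuming that a day with fewer than N documents contributes all of them.

```python
import random
from collections import defaultdict


def sampled_daily_buckets(docs, sample_size, seed=None):
    """Group docs by day, then draw a random sample of up to sample_size per day.

    Hypothetical helper for illustration only; nothing here calls Elasticsearch.
    """
    rng = random.Random(seed)  # a seed makes the sample reproducible
    by_day = defaultdict(list)
    for doc in docs:
        by_day[doc["date"]].append(doc)

    buckets = []
    for day in sorted(by_day):  # ISO-8601 dates sort correctly as strings
        group = by_day[day]
        # Assumption: days with fewer than sample_size docs are used in full.
        sample = rng.sample(group, min(sample_size, len(group)))
        buckets.append({
            "key_as_string": day,
            "doc_count": len(sample),
            "failed_count": {"doc_count": sum(1 for d in sample if d["failed"])},
        })
    return buckets


docs = [
    {"date": "2019-01-01", "failed": True},
    {"date": "2019-01-01", "failed": True},
    {"date": "2019-01-02", "failed": False},
    {"date": "2019-01-04", "failed": False},
    {"date": "2019-01-05", "failed": False},
    {"date": "2019-01-08", "failed": True},
    {"date": "2019-01-08", "failed": False},
    {"date": "2019-01-08", "failed": False},
]

daily = sampled_daily_buckets(docs, sample_size=2, seed=0)
for b in daily:
    print(b["key_as_string"], b["doc_count"], b["failed_count"]["doc_count"])
```

Here the bucket boundaries come from the calendar rather than from the document count, so only the sampling step caps each bucket at N.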