Description
Describe the feature:
We have a use case to generate high precision time series from Netflow records stored as documents in Elasticsearch.
Given a Netflow record with the following fields:
```json
{
  "timestamp": 460,
  "netflow.first_switched": 100,
  "netflow.last_switched": 450,
  "netflow.bytes": 350
}
```

We would like to be able to generate a time series with start=0, end=500, step=100, and have the following data points:
t=0, bytes=0
t=100, bytes=100
t=200, bytes=100
t=300, bytes=100
t=400, bytes=50
t=500, bytes=0
The flow record indicates that a total of 350 bytes were transferred between t=[100,450], and the resulting series splits this value proportionally across the buckets based on the overlap between the flow's interval and each bucket.
When many documents are aggregated, each bucket contains the sum of the per-document contributions.
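To make the intended semantics concrete, here is a minimal sketch of the proportional-split logic in Python. The function name `proportional_buckets` and the flat tuple representation of a flow are our own illustration, not part of the plugin's API; the plugin itself implements this as an Elasticsearch aggregation.

```python
def proportional_buckets(flows, start, end, step):
    """Split each flow's byte count across fixed-width buckets,
    proportionally to the overlap between the flow's
    [first_switched, last_switched] interval and each bucket.
    Multiple flows are summed into the same buckets.

    flows: iterable of (first_switched, last_switched, bytes) tuples.
    Returns a list of (bucket_start, bytes) data points.
    """
    n = (end - start) // step + 1
    buckets = [0.0] * n
    for first, last, nbytes in flows:
        duration = last - first
        if duration <= 0:
            continue  # zero-length flows need a policy decision; skipped here
        rate = nbytes / duration  # bytes per time unit, assumed uniform
        for i in range(n):
            bucket_start = start + i * step
            bucket_end = bucket_start + step
            overlap = min(last, bucket_end) - max(first, bucket_start)
            if overlap > 0:
                buckets[i] += rate * overlap
    return [(start + i * step, buckets[i]) for i in range(n)]

# Reproduces the example above: one flow of 350 bytes over t=[100,450]
points = proportional_buckets([(100, 450, 350)], start=0, end=500, step=100)
# → [(0, 0.0), (100, 100.0), (200, 100.0), (300, 100.0), (400, 50.0), (500, 0.0)]
```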
During some previous research, we could not find a way to achieve this with existing facilities, so we developed a custom aggregation plugin - see https://github.com/OpenNMS/elasticsearch-drift-plugin.
Some example queries for the plugin in practice look like: https://gist.github.com/j-white/5c188e5c56fea3f14ea99ab6c1280ceb#time-series-for-top-n-applications
We're ready to clean up the code and make the changes necessary to get this upstream, but before we do, we'd like to confirm:
- Are there any existing facilities for achieving this? Perhaps something we missed, or something added since we last checked in 6.2.x.
- Is there interest in having this functionality as part of the core?