Shortcut aggs for TSDB #90423

@tmgordeeva

Description

Our goal is to enable efficient pipeline aggregations on TSDB by taking
advantage of how TSDB distributes data. Data nodes could already process
queries themselves, but with shards spread equally among nodes there has
been no reason to do so. With TSDB, the data distribution gives us one.

The initial steps would be to enable processing on data nodes:

  • Add ability for data nodes to propagate requests back up
    • Push down request to data node
    • Suppress response from coordinating node in favor of data nodes
  • Test data node propagation with a trivial agg (i.e., no merging logic required)
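
As a rough illustration of the steps above, here is a toy Python model of the push-down path, not Elasticsearch code. `DataNode`, `Coordinator`, `handle`, `search`, and the `final` flag are all hypothetical names invented for this sketch, and `max` over a single node's values stands in for a "trivial agg" that needs no merging.

```python
from dataclasses import dataclass


@dataclass
class DataNode:
    """Hypothetical stand-in for a data node that holds one shard's values."""
    name: str
    values: list

    def handle(self, agg):
        # The data node computes the final answer itself, so nothing is
        # left for the coordinating node to merge.
        if agg != "max":
            raise ValueError(f"unsupported agg: {agg}")
        return {"agg": agg, "value": max(self.values), "final": True, "node": self.name}


class Coordinator:
    """Hypothetical coordinating node that forwards requests to one data node."""

    def __init__(self, node):
        self.node = node

    def search(self, agg):
        response = self.node.handle(agg)  # push the request down
        if response["final"]:
            # Pass the data node's response through unchanged instead of
            # building a response on the coordinating node.
            return response
        raise NotImplementedError("the merging path is not modelled here")


result = Coordinator(DataNode("data-1", [1.0, 5.0, 3.0])).search("max")
```

The only point of the sketch is the control flow: the request goes down, and the coordinating node returns the data node's response as-is.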

The point of pushing requests down to data nodes is that when a data node
holds all the data in a bucket, as it can in TSDB, we can optimize by
processing that bucket entirely on that node. The next steps would be
something like:

  • Enable processing whole buckets on data nodes and propagating the request up
  • Data nodes with incomplete buckets must return up to the coordinating node
    • The request will then have to be processed on multiple nodes as usual
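
The split between complete and incomplete buckets could look roughly like the Python sketch below. This is a toy model under stated assumptions, not the actual design: it assumes a hypothetical `owners` map that says which nodes hold data for each bucket, uses a simple sum as the aggregation, and the names `data_node_phase` and `coordinator_phase` are invented for illustration.

```python
from collections import defaultdict


def data_node_phase(local_docs, owners, node):
    """Split one node's work into finished buckets and partial buckets.

    local_docs: list of (bucket_key, value) pairs stored on this node.
    owners: hypothetical map of bucket_key -> set of nodes holding that bucket.
    """
    finished = {}
    partial = defaultdict(list)
    for key, value in local_docs:
        if owners[key] == {node}:
            # Whole bucket lives here: aggregate it fully on the data node.
            finished[key] = finished.get(key, 0) + value
        else:
            # Bucket is spread over several nodes: hand the raw values back up.
            partial[key].append(value)
    return finished, dict(partial)


def coordinator_phase(partials):
    """Merge the incomplete buckets returned by all data nodes, as usual."""
    merged = defaultdict(int)
    for partial in partials:
        for key, values in partial.items():
            merged[key] += sum(values)
    return dict(merged)


owners = {"host-1": {"A"}, "host-2": {"A", "B"}}
finished_a, partial_a = data_node_phase(
    [("host-1", 2), ("host-1", 3), ("host-2", 1)], owners, "A"
)
finished_b, partial_b = data_node_phase([("host-2", 4)], owners, "B")
merged = coordinator_phase([partial_a, partial_b])
```

Here `host-1` is finished entirely on node A, while `host-2` spans both nodes and still goes through the coordinating node's merge.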
