Implement adaptive replica selection for coordinating nodes performing queries

In #23884 (and #3890) we added the `fixed_auto_queue_size` threadpool which could automatically raise or lower the queue size of the search threadpool depending on the arrival rate of operations and target response rate.

We'd like to take the next step and implement adaptive replica selection. This is a partial application of the [C3 algorithm](https://www.usenix.org/system/files/conference/nsdi15/nsdi15-paper-suresh.pdf), used on the coordinating node to select the appropriate replica instead of our current round-robin behavior. Note that we cannot currently implement the rate control and backpressure from the paper, since we cannot treat each request as having identical cost. However, the automatic queue sizing already implemented gives us a good way to provide backpressure on the execution nodes themselves.

The formula for replica ranking, `Ψ(s)` (see page 6 of the linked paper; EWMA = Exponentially Weighted Moving Average):

    Ψ(s) = R(s) - 1/µ̄(s) + (q̂(s))^b / µ̄(s)

Where `q̂(s)` is:

    q̂(s) = 1 + (os(s) * n) + q(s)

Here `(os(s) * n)` is the "concurrency compensation", where `os(s)` is the number of outstanding requests to a node and `n` is the number of clients in the system. `R(s)`, `q(s)`, and `1/µ̄(s)` are EWMAs of the response time (as seen from the coordinating node), queue size, and service time received from the execution node (`µ̄(s)` itself is the service rate, so a lower `Ψ(s)` means a better replica).
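To make the ranking concrete, here is a minimal Python sketch of the two formulas above. It assumes, as in the paper, that `1/µ̄(s)` is the service-time EWMA; the function names, the `stats` layout, and the sample numbers are hypothetical and only serve as illustration, not as the eventual implementation.

```python
def replica_rank(response_time, service_time, queue_size,
                 outstanding, num_clients, b=3):
    """Compute Ψ(s) for one replica; a lower rank is better.

    service_time stands for 1/µ̄(s), the EWMA of service time (assumption).
    """
    # q̂(s) = 1 + (os(s) * n) + q(s)
    q_hat = 1 + outstanding * num_clients + queue_size
    # Ψ(s) = R(s) - 1/µ̄(s) + (q̂(s))^b / µ̄(s)
    return response_time - service_time + (q_hat ** b) * service_time


def select_replica(stats, num_clients, b=3):
    """Return the id of the replica with the lowest rank."""
    return min(
        stats,
        key=lambda node: replica_rank(num_clients=num_clients, b=b, **stats[node]),
    )


# Hypothetical per-node statistics held by a coordinating node (milliseconds).
stats = {
    "node_a": dict(response_time=10.0, service_time=4.0,
                   queue_size=2.0, outstanding=1),
    "node_b": dict(response_time=12.0, service_time=4.0,
                   queue_size=0.0, outstanding=0),
}

best = select_replica(stats, num_clients=2)
```

With these numbers `node_a` has the faster response time, but its queue and outstanding request give `q̂ = 5`, which is cubed, so `node_b` ends up with the lower rank and is selected.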

This will require a number of steps in order to be implemented:

-   [x] Track EWMA of task execution time (service time) for requests on the execution node (#24989)
-   [x] Piggyback service time EWMA and current queue size from execution node back to coordinating node with the search response (#25430)
-   [x] Track EWMA of response time of an execution node on the coordinating node (#25430)
-   [x] Track EWMA of queue size on the coordinating node (#25430)
-   [x] Implement the actual adaptive replica ranking on the coordinating node when deciding which copy of the data to execute the read operation on (#26128)
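The first four steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the linked PRs: the class names, the response field names (`service_time_ewma`, `queue_size`), and the default `alpha=0.3` are all hypothetical placeholders.

```python
class Ewma:
    """Exponentially weighted moving average with smoothing factor alpha."""

    def __init__(self, alpha, initial):
        self.alpha = alpha
        self.value = initial

    def add(self, sample):
        # New samples get weight alpha, the running average gets 1 - alpha.
        self.value = self.alpha * sample + (1 - self.alpha) * self.value
        return self.value


class ExecutionNode:
    """Tracks its own service-time EWMA and piggybacks it on each response."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.service_time = None
        self.queue_size = 0  # current search queue size (set by the caller here)

    def execute(self, task_millis):
        if self.service_time is None:
            self.service_time = Ewma(self.alpha, task_millis)
        else:
            self.service_time.add(task_millis)
        # Statistics returned alongside the (omitted) search result.
        return {"service_time_ewma": self.service_time.value,
                "queue_size": self.queue_size}


class CoordinatorStats:
    """Per-node statistics kept on the coordinating node."""

    def __init__(self, alpha=0.3):
        self.alpha = alpha
        self.nodes = {}

    def record(self, node_id, response_millis, piggyback):
        entry = self.nodes.setdefault(node_id, {})
        # EWMAs maintained by the coordinating node itself.
        for name, sample in (("response_time", response_millis),
                             ("queue_size", piggyback["queue_size"])):
            if name in entry:
                entry[name].add(sample)
            else:
                entry[name] = Ewma(self.alpha, sample)
        # The service-time EWMA is computed remotely; keep the latest value.
        entry["service_time"] = piggyback["service_time_ewma"]
```

A coordinating node would call `record` once per search response and feed the resulting values into the ranking formula when choosing a shard copy.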

There is a little flexibility here, since we could possibly use some of our existing metrics (like "took" time) instead of adding new measurements; this is only a rough overview.

Additionally, we will need to decide on a value for `b` to correctly penalize long queues (the paper uses 3) as well as a good `α` value for the EWMA calculations. We could also make these configurable if desired.
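As a rough guide to what these two knobs control, the arithmetic below assumes the standard EWMA form, in which a sample observed `k` updates ago carries weight `α(1 - α)^k`; the specific values of `α` and `q̂(s)` are arbitrary examples, not proposals.

```latex
\begin{aligned}
% Weight of a sample k updates old: larger alpha forgets history faster.
w_k &= \alpha (1 - \alpha)^k, &
\alpha = 0.3,\ k = 5 &\Rightarrow w_5 \approx 0.050, &
\alpha = 0.8,\ k = 5 &\Rightarrow w_5 \approx 0.00026 \\
% Queue penalty with b = 3: a 5x longer estimated queue costs 125x more.
\hat{q}(s)^b &: &
\hat{q}(s) = 2 &\Rightarrow 2^3 = 8, &
\hat{q}(s) = 10 &\Rightarrow 10^3 = 1000
\end{aligned}
```

A larger `b` therefore steers traffic away from nodes with long queues more aggressively, while a larger `α` makes the statistics react faster to recent responses at the cost of more noise.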


Implement adaptive replica selection for coordinating nodes performing queries #24915
