HBASE-29351 Quotas: adaptive wait intervals #7396
99ca827 to
1d23753
Compare
🎊 +1 overall
This message was automatically generated.
hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FeedbackAdaptiveRateLimiter.java
  currentBackoffMultiplier.set(Math
    .min(currentBackoffMultiplier.get() + backoffMultiplierIncrement, maxBackoffMultiplier));
} else {
  currentBackoffMultiplier
    .set(Math.max(currentBackoffMultiplier.get() - backoffMultiplierDecrement, 1.0));
Be aware that the .get() and .set() calls on the AtomicDouble are not, taken together, an atomic operation.
👍 Good observation. I believe this is okay because getWaitIntervalMs is synchronized and is the only public mechanism for triggering a refill — and refill is protected — so races don't actually occur in practice
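For readers following this thread, here is a minimal sketch of the two ways such a read-modify-write can be made safe. It is not code from this PR: the class and method names are hypothetical, and a JDK AtomicLong holding the double's raw bits stands in for AtomicDouble so the example has no external dependencies. The first variant relies on a shared lock, as described in the reply above; the second retries the update until no other thread has interleaved.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical names throughout; this is not code from the PR.
final class MultiplierUpdateSketch {

  // Approach discussed in the thread: the read-modify-write is two steps,
  // so it is safe only while every caller holds the same monitor.
  static final class Locked {
    private double multiplier = 1.0;

    synchronized void increase(double increment, double max) {
      multiplier = Math.min(multiplier + increment, max);
    }

    synchronized double get() {
      return multiplier;
    }
  }

  // Lock-free alternative: updateAndGet re-applies the function until no other
  // thread has changed the value between the read and the write.
  static final class LockFree {
    // The double is stored as raw long bits so a JDK AtomicLong can hold it.
    private final AtomicLong bits = new AtomicLong(Double.doubleToLongBits(1.0));

    void increase(double increment, double max) {
      bits.updateAndGet(b ->
        Double.doubleToLongBits(Math.min(Double.longBitsToDouble(b) + increment, max)));
    }

    double get() {
      return Double.longBitsToDouble(bits.get());
    }
  }
}
```

If every update already happens under one lock, the first form is sufficient and the atomic wrapper mainly provides visibility to readers outside the lock.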
Co-authored-by: Ray Mattingly <[email protected]>
1d23753 to
2b5fa7e
Compare
Co-authored-by: Ray Mattingly <[email protected]> Signed-off-by: Charles Connell <[email protected]>
We've been using this at HubSpot quite successfully, and would like to contribute it.
Throttling has been great for us at HubSpot — we've written about the success here. But we've had a couple of problems at scale:
To fix this, we've implemented a FeedbackAdaptiveRateLimiter, inspired by Philipp Janert's Feedback Control for Computer Systems.
The FeedbackAdaptiveRateLimiter (FARL) works much like the FixedIntervalRateLimiter, but adds logic for dynamic wait interval multiplication and modest oversubscription. The goal is more consistent, fuller quota utilization while serving only a fraction of the previous RpcThrottlingException volume.
The additional FARL logic can be described with the following categories:
1. Closed-Loop Feedback Control
The limiter implements a classic closed-loop control system where:
2. Proportional Control with Integral Behavior
The implementation uses two separate control mechanisms:
Backoff Multiplier Control (hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FeedbackAdaptiveRateLimiter.java:221-227):
This is essentially an integral controller that accumulates error over time, increasing pressure when contention is detected and decreasing it when there is none.
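As a rough illustration of the backoff multiplier behavior described above, here is a standalone sketch. The field names follow the diff quoted in the review thread; the class, its constructor, the hadContention flag, and the scaleWait helper are assumptions made for this example and may not match the PR's code.

```java
// Hypothetical standalone sketch; only the field names come from the quoted diff.
final class BackoffMultiplierController {
  private final double backoffMultiplierIncrement;
  private final double backoffMultiplierDecrement;
  private final double maxBackoffMultiplier;
  private double currentBackoffMultiplier = 1.0;

  BackoffMultiplierController(double increment, double decrement, double max) {
    this.backoffMultiplierIncrement = increment;
    this.backoffMultiplierDecrement = decrement;
    this.maxBackoffMultiplier = max;
  }

  // Intended to be called once per refill interval. Contention pushes the
  // multiplier up toward the cap; a quiet interval lets it decay toward 1.0.
  synchronized double onInterval(boolean hadContention) {
    if (hadContention) {
      currentBackoffMultiplier =
        Math.min(currentBackoffMultiplier + backoffMultiplierIncrement, maxBackoffMultiplier);
    } else {
      currentBackoffMultiplier =
        Math.max(currentBackoffMultiplier - backoffMultiplierDecrement, 1.0);
    }
    return currentBackoffMultiplier;
  }

  // Assumption: the multiplier scales the base wait interval handed to clients.
  synchronized long scaleWait(long baseWaitMs) {
    return (long) (baseWaitMs * currentBackoffMultiplier);
  }
}
```

With an increment of 0.5, a decrement of 0.25, and a cap of 2.0, three contended intervals take the multiplier to 1.5, 2.0, and 2.0; one quiet interval then lowers it to 1.75, so a 100 ms base wait becomes 175 ms.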
Oversubscription Control (hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FeedbackAdaptiveRateLimiter.java:229-236):
3. Exponential Moving Average (EMA) for Smoothing
The system uses EMA to track utilization (hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FeedbackAdaptiveRateLimiter.java:216-218):
double util = (double) consumed / intendedUsage;
utilizationEma = emaAlpha * util + (1.0 - emaAlpha) * utilizationEma;
This is a standard signal processing technique from control theory to filter out noise and
respond smoothly to changes in system behavior.
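To make the smoothing concrete, the two-line update above can be wrapped in a small standalone class. This is a sketch only: the class name, constructor, and the guard against a non-positive intendedUsage are assumptions added for the example, while the arithmetic is the formula quoted above.

```java
// Hypothetical wrapper around the EMA update quoted from the PR description.
final class UtilizationEmaSketch {
  private final double emaAlpha;
  private double utilizationEma;

  UtilizationEmaSketch(double emaAlpha, double initialEma) {
    this.emaAlpha = emaAlpha;
    this.utilizationEma = initialEma;
  }

  // Folds one interval's utilization into the running average and returns it.
  double record(long consumed, long intendedUsage) {
    if (intendedUsage <= 0) {
      // Assumption: skip the update rather than divide by zero.
      return utilizationEma;
    }
    double util = (double) consumed / intendedUsage;
    utilizationEma = emaAlpha * util + (1.0 - emaAlpha) * utilizationEma;
    return utilizationEma;
  }
}
```

With emaAlpha = 0.5 and a starting average of 1.0, one interval at 50% utilization moves the average to 0.75, and a following interval at 100% moves it to 0.875. A smaller emaAlpha would react more slowly to each new sample.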
4. Saturation Limits (Anti-Windup)
The implementation includes caps on both control parameters:
This prevents "integral windup"—a common problem in control systems where the controller
overshoots dramatically.
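A small simulation can show why the caps matter. The helper below is hypothetical and uses illustrative parameters, not the PR's defaults: it counts how many contention-free intervals an additive multiplier needs to return to 1.0 after a long burst of contention, with and without an upper bound.

```java
// Hypothetical simulation; parameters are illustrative, not the PR's defaults.
final class WindupDemo {
  // Returns the number of contention-free intervals needed to decay back to 1.0
  // after `contendedIntervals` consecutive intervals with contention.
  static int recoveryIntervals(int contendedIntervals, double increment,
      double decrement, double cap) {
    double multiplier = 1.0;
    for (int i = 0; i < contendedIntervals; i++) {
      multiplier = Math.min(multiplier + increment, cap);
    }
    int intervals = 0;
    while (multiplier > 1.0) {
      multiplier = Math.max(multiplier - decrement, 1.0);
      intervals++;
    }
    return intervals;
  }
}
```

With an increment and decrement of 0.25, a burst of 100 contended intervals needs 100 quiet intervals to unwind when the cap is Double.POSITIVE_INFINITY, but only 4 when the cap is 2.0.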
5. Error Budget / Deadband
The utilizationErrorBudget parameter (hbase-server/src/main/java/org/apache/hadoop/hbase/quotas/FeedbackAdaptiveRateLimiter.java:95-97) creates a deadband around the target utilization (1.0):
6. Dual-Control Strategy
The system elegantly addresses two different control objectives:
optimal steady-state utilization
This mirrors the proportional + integral (PI) control pattern where you need both fast response
to disturbances and elimination of steady-state error.
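The deadband from section 5 and the second control loop can be sketched together as follows. This is an assumption-laden illustration rather than the PR's code: only utilizationErrorBudget is named in the description, so the class, the other field names, the update rule (raise oversubscription when smoothed utilization falls below the band, lower it when utilization rises above), and the oversubscribedLimit helper are all hypothetical.

```java
// Hypothetical sketch of a deadband-gated oversubscription controller.
final class OversubscriptionControllerSketch {
  private final double utilizationErrorBudget;
  private final double increment;
  private final double decrement;
  private final double maxOversubscription;
  private double oversubscriptionProportion = 0.0;

  OversubscriptionControllerSketch(double utilizationErrorBudget, double increment,
      double decrement, double maxOversubscription) {
    this.utilizationErrorBudget = utilizationErrorBudget;
    this.increment = increment;
    this.decrement = decrement;
    this.maxOversubscription = maxOversubscription;
  }

  // Takes the smoothed utilization for the last interval (target is 1.0).
  double onInterval(double utilizationEma) {
    if (utilizationEma < 1.0 - utilizationErrorBudget) {
      // Under-utilized: allow a little more than the nominal limit.
      oversubscriptionProportion =
        Math.min(oversubscriptionProportion + increment, maxOversubscription);
    } else if (utilizationEma > 1.0 + utilizationErrorBudget) {
      // Over-utilized: pull back toward the nominal limit.
      oversubscriptionProportion = Math.max(oversubscriptionProportion - decrement, 0.0);
    }
    // Inside the deadband nothing changes, which avoids oscillating on noise.
    return oversubscriptionProportion;
  }

  // Assumption: the proportion is applied as a fractional bump to the limit.
  long oversubscribedLimit(long limit) {
    return limit + (long) (limit * oversubscriptionProportion);
  }
}
```

With a budget of 0.125, a smoothed utilization of 0.9 sits inside the band [0.875, 1.125] and leaves the proportion untouched, while 0.5 raises it and 1.5 lowers it.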
Why This Design?
The traditional fixed-interval rate limiter suffers from two problems this feedback control
approach solves:
By applying control theory:
limits on average
In practice, we have deployed this with these default settings across all of our hundreds of clusters at HubSpot with great success, powering everything from live user-facing requests to async batch jobs. See our reduced RpcThrottlingException volume below (this is the average of our top n RegionServers by RTE/sec):

And with the variety of configuration levers available in the FARL, you can adapt the rate limiter to whatever your priorities might be (strict backoffs, lenient oversubscription). Together with HBASE-29663, which made rate limiter configurations dynamically refreshable, this substantially improves the usability and scalability of HBase's Quotas system.