attester: Add an on-chain last attestation timestamp and rate limit arg #622
As a result, attester clients can rate-limit attestations across _all_ active attesters, because the new last attestation timestamp is kept up to date on chain. Since this value is shared by concurrent clients, the feature limits our tx expenses while still fulfilling our preferred attestation rates.
jayantk
left a comment
looks great to me
- group_name: fast_interval_rate_limited
  conditions:
    min_interval_secs: 1
    rate_limit_interval_secs: 2
Do we want a version of this under default_attestation_conditions as well? In practice I think we're going to set this to 1 for most batches, so a default would save us some configuration.
I think that makes sense, will do
This made sense to me. PR description was great. main.rs was hard to review because I don't fully understand the crank logic.
wormhole_attester/client/src/main.rs
Outdated
    rate_limit_interval_secs,
)?;

// Detect rate limiting error early
I am a bit hesitant about this change because the RPC latency (which seems to be around 1s) might make this check worthless and even reduce the actual attestation rate hitting the network, regardless of the simulation result.
Two unfortunate consequences are:
- You decide not to attest because the simulation fails, but had you gone ahead and sent the transaction, it might not have been rate limited.
- You decide to attest, and then the transaction gets rate limited when it's sent for real.
I think it's better to send the tx and only capture the log of its result instead.
One more nuance here is that the RPC performs a simulation before actually sending the tx. I assume that when the contract returns an error the transaction gets reverted, right?
So with that preflight simulation the scenarios above can still happen, which makes our performance a bit unreliable.
My suggestion is that you do not return an error when the tx is rate limited and just keep logging the result (whether it attested or not). Then you can also have 2 metrics instead of 1 for sent messages: one for sent txs and one for txs that really attest.
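To make the two-metric idea concrete, here is a minimal sketch. The type and field names (`AttestationMetrics`, `txs_sent`, `txs_attested`, `SendOutcome`) are hypothetical and not taken from the attester code; it only assumes the client can tell an attesting tx apart from a rate-limited one.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counters; the names are illustrative, not from the PR.
#[derive(Default)]
pub struct AttestationMetrics {
    /// Every transaction handed to the RPC, whatever its outcome.
    txs_sent: AtomicU64,
    /// Transactions that actually produced an attestation.
    txs_attested: AtomicU64,
}

/// Outcome of one send attempt as seen by the client (assumed shape).
pub enum SendOutcome {
    Attested,
    RateLimited,
}

impl AttestationMetrics {
    /// Counts the send, and counts the attestation only if one happened.
    pub fn record(&self, outcome: &SendOutcome) {
        self.txs_sent.fetch_add(1, Ordering::Relaxed);
        if matches!(outcome, SendOutcome::Attested) {
            self.txs_attested.fetch_add(1, Ordering::Relaxed);
        }
    }

    pub fn sent(&self) -> u64 {
        self.txs_sent.load(Ordering::Relaxed)
    }

    pub fn attested(&self) -> u64 {
        self.txs_attested.load(Ordering::Relaxed)
    }
}
```

The gap between the two counters would then show how often the rate limit is hit, without treating those sends as errors.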
Thanks for noticing this. I thought about this and for some reason I couldn't find a way to get the logs from an error in solana_client :(
Because in that case, we would only need to do a map_err and fish the expected log line out of the error type.
Come to think of it, the right thing to do might be to use a custom error code instead of a log line. I think I'll do that instead.
But doesn't a custom error code face the same simulation problem I mentioned, this time in the RPC?
But our concern is not so much the fees; they are essentially free, right?
I think you can disable preflight simulation (I know how to do it in javascript).
I think Guillermo is right. I'll keep thinking about this for a while and settle for a preflight-less tx. This is probably the only way to get 100% precise rate limiting of the wormhole cross-txs
Okay, I think I came up with the right compromise. We use an error code, look for it in the failing tx, finish the attestation job if it's detected, and pass the error through otherwise. This means the attester will never breach the desired rate limit, and at absolute worst pays for the failing tx if the rare race condition is met.
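A rough sketch of that compromise, with stand-in types rather than the real solana_client error type: `TxError`, `JobResult`, `finish_job` and the value of `RATE_LIMIT_ERROR_CODE` are all assumptions made up for illustration, since the actual error code is not shown in this thread.

```rust
/// Hypothetical custom error code meaning "rate limit interval not yet elapsed".
/// The value is a placeholder, not the one used by the attester program.
pub const RATE_LIMIT_ERROR_CODE: u32 = 0x1234;

/// Simplified stand-in for a failed-transaction error from the RPC client.
#[derive(Debug, PartialEq)]
pub enum TxError {
    /// The program returned a custom error code.
    Custom(u32),
    /// Anything else (network problems, other program errors, ...).
    Other(String),
}

/// How an attestation job ended when it did not fail.
#[derive(Debug, PartialEq)]
pub enum JobResult {
    Attested,
    /// Rate limit hit: the job finishes quietly and is not an error.
    RateLimited,
}

/// Finishes the job on success or on the rate limit code,
/// and passes every other error through unchanged.
pub fn finish_job(tx_result: Result<(), TxError>) -> Result<JobResult, TxError> {
    match tx_result {
        Ok(()) => Ok(JobResult::Attested),
        Err(TxError::Custom(code)) if code == RATE_LIMIT_ERROR_CODE => {
            Ok(JobResult::RateLimited)
        }
        Err(other) => Err(other),
    }
}
```

Matching on a numeric code rather than a log line keeps the check independent of log formatting, which seems to be the motivation for the switch discussed above.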
Tilt just went green locally.
ali-behjati
left a comment
Amazing code and PR description 🔥
I just have one comment that I'd like to know your opinion about. I might be wrong about what I think.
}

pub const fn default_rate_limit_interval_secs() -> Option<u32> {
    Some(1)
good default value :D
uuugh, now we can't pass none 🥇
This lets users pass 0 to disable the feature (0-rate limiting means no rate limiting at all), which was not possible with the Option type.
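As an illustration of the "0 disables the feature" convention, here is a minimal sketch. It assumes the config field becomes a plain `u32` with a default of 1 second (the default shown in the diff above); `effective_rate_limit` is a hypothetical helper, not a function from the PR.

```rust
/// Assumed replacement for the Option-returning default: a plain u32,
/// still 1 second when the config omits the field.
pub const fn default_rate_limit_interval_secs() -> u32 {
    1
}

/// Hypothetical helper that interprets the raw config value:
/// 0 means "no rate limiting at all", anything else is the interval in seconds.
pub fn effective_rate_limit(raw_secs: u32) -> Option<u32> {
    if raw_secs == 0 {
        None
    } else {
        Some(raw_secs)
    }
}
```

With this shape, omitting the field yields the default, while an explicit 0 opts out.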
This new value is bumped on chain with each new attestation, on a per-symbol basis. The timestamp is common for all cranks. This enables attester cranks to rate limit their attestations of a symbol together with all other attesters. Ultimately, our tx expenses should drop while still fulfilling our preferred attestation rates.
Rate limiting logic in depth
The rate limiting works by letting clients specify a desired rate limiting interval in the instruction args. If all of the batch's symbols were already attested within that interval, the TX fails and no SOL is spent. Note: this rate limit may be reached because a different attester is working in parallel with the client that sees this soft error. The rate limiting error does not contribute to metrics or the healthcheck.
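The check described above can be sketched roughly as follows. This is not the code from attest.rs: `SymbolState`, `RateLimited`, `batch_is_rate_limited` and `attest_batch` are hypothetical names, and the strict `<` comparison and the handling of an empty batch are assumptions.

```rust
/// Hypothetical error returned when the whole batch is still within the interval.
#[derive(Debug, PartialEq)]
pub struct RateLimited;

/// Hypothetical per-symbol state: unix time of the last attestation.
pub struct SymbolState {
    pub last_attested_at: i64,
}

/// True only if every symbol in the batch was attested less than
/// `interval_secs` ago. An interval of 0 disables the check.
pub fn batch_is_rate_limited(symbols: &[SymbolState], now: i64, interval_secs: u32) -> bool {
    if interval_secs == 0 || symbols.is_empty() {
        return false;
    }
    let interval = i64::from(interval_secs);
    symbols
        .iter()
        .all(|s| now.saturating_sub(s.last_attested_at) < interval)
}

/// Fails early when rate limited; otherwise bumps each symbol's timestamp,
/// standing in for the real attestation.
pub fn attest_batch(
    symbols: &mut [SymbolState],
    now: i64,
    interval_secs: u32,
) -> Result<(), RateLimited> {
    if batch_is_rate_limited(symbols, now, interval_secs) {
        return Err(RateLimited);
    }
    for s in symbols.iter_mut() {
        s.last_attested_at = now;
    }
    Ok(())
}
```

Because the timestamps live in shared state, a second caller using the same interval gets `Err(RateLimited)` right after the first one succeeds, which is the cross-attester effect the description relies on.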
Review areas of interest
- attest.rs code path - there's new logic evaluating the rate limit on chain and updating the value afterwards.
- main.rs - We use tx simulation to detect the exact rate limit error and avoid counting it in monitoring ok/error counters. This is done by returning early from a rather dense code block. Style/readability suggestions are very welcome there.
Testing
The mock attester in Tilt gets a 2-second rate limit using the config value on its 1-second min interval attestation group. The rate limiting code can be seen as non-error INFO level messages about the rate limit. This happens only on 50-66% of resend attempts, as expected for the fast resends on a slower rate limit.