-
Notifications
You must be signed in to change notification settings - Fork 421
Limit outbound gossip buffer by size, rather than length #4096
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Limit outbound gossip buffer by size, rather than length #4096
Conversation
👋 Thanks for assigning @tnull as a reviewer! |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4096 +/- ##
==========================================
- Coverage 88.75% 88.72% -0.03%
==========================================
Files 177 177
Lines 133318 133326 +8
Branches 133318 133326 +8
==========================================
- Hits 118330 118298 -32
- Misses 12293 12319 +26
- Partials 2695 2709 +14
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
0afa343
to
8e58fa5
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
limiting the total outbound message buffer size before gossip forwards are dropped to 128 KiB per peer
Want to note that this change might make less of a difference once we look into message padding, as size might scale ~linerly with the number of messages then (at least if we really go ahead with MTU-size padding).
Should still make a big difference as we aren't considering max-size messages but rather MTU-sized messages. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm looks like CI is failing still on this one cargo test --no-default-features gossip_flood_pause
, looked around briefly couldn't find the root cause
Good CI #4107 |
86d115c
to
8f3775d
Compare
Now based on #4107 should pass CI. |
In 686a586 we stopped punishing peers for slowly draining the gossip forwarding buffer, delaying responding to our ping message. While that change was nice on its own, it also now allows us to be a bit more flexible with what enters the `gossip_broadcast_buffer`. Because we now do not count pending messages in `gossip_broadcast_buffer` against the peer's ping-response timer, there's no reason to continue to limit it based on `messages_sent_since_pong`. Thus, we drop that restriction here. However, in practice, the reason for the vast majority of gossip forwarding drops on my node is the 24-message total queue limit, rather than the `messages_sent_since_pong` limit. This limit was set to bend over backwards trying to avoid counting message buffer sizes while keeping peer message buffers small. In practice, there is really no reason to do that - summing the capacity of tens of buffers is negligible cost and allows us to be much more flexible with how many messages we queue. Here we do so, limiting the total outbound message buffer size before gossip forwards are dropped to 128 KiB per peer, rather than 24 messages. In practice, this appears to almost entirely remove gossip forward drops on my node.
8f3775d
to
4b60d66
Compare
Squashed and rebased 🎉 |
for _ in 0..OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP / encoded_size { | ||
cfgs[0].routing_handler.pending_events.lock().unwrap().push(msg_ev.clone()); | ||
peers[0].process_events(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have raised this earlier sorry, looks like increasing the number of iterations here breaks the test is that expected ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely. This loop pushes exactly the new buffer limit worth of messages into the gossip_broadcast_buffer
and then we test the correct handling of that later in the test.
In 686a586 we stopped punishing peers for slowly draining the gossip forwarding buffer, delaying responding to our ping message. While that change was nice on its own, it also now allows us to be a bit more flexible with what enters the
gossip_broadcast_buffer
.Because we now do not count pending messages in
gossip_broadcast_buffer
against the peer's ping-response timer, there's no reason to continue to limit it based onmessages_sent_since_pong
. Thus, we drop that restriction here.However, in practice, the reason for the vast majority of gossip forwarding drops on my node is the 24-message total queue limit, rather than the
messages_sent_since_pong
limit. This limit was set to bend over backwards trying to avoid counting message buffer sizes while keeping peer message buffers small.In practice, there is really no reason to do that - summing the capacity of tens of buffers is negligible cost and allows us to be much more flexible with how many messages we queue.
Here we do so, limiting the total outbound message buffer size before gossip forwards are dropped to 128 KiB per peer, rather than 24 messages. In practice, this appears to almost entirely remove gossip forward drops on my node.