
Commit 8e58fa5

Limit outbound gossip buffer by size, rather than length
In 686a586 we stopped punishing peers for slowly draining the gossip forwarding buffer and thereby delaying their responses to our ping messages. While that change was nice on its own, it also lets us be more flexible about what enters the `gossip_broadcast_buffer`: because pending messages in `gossip_broadcast_buffer` no longer count against the peer's ping-response timer, there is no reason to keep limiting it based on `msgs_sent_since_pong`. We drop that restriction here.

In practice, however, the vast majority of gossip forwarding drops on my node are caused by the 24-message total queue limit rather than by the `msgs_sent_since_pong` limit. That limit was chosen to bend over backwards to avoid counting message buffer sizes while still keeping peer message buffers small. There is really no reason to do that: summing the capacities of a few dozen buffers is negligible cost, and doing so lets us be much more flexible in how many messages we queue.

Here we make that change, dropping gossip forwards only once a peer's total outbound message buffer exceeds 128 KiB, rather than after 24 queued messages. In practice, this appears to almost entirely eliminate gossip forward drops on my node.
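As a rough, self-contained sketch of the new policy (not part of the commit; `Vec<u8>` stands in for the real `MessageBuf` type and the function name is illustrative, the actual check is `buffer_full_drop_gossip_broadcast` in the diff below):

use std::collections::VecDeque;

// Per-peer byte budget for queued outbound messages: 64 * 1024 * 2 = 128 KiB.
const OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP: usize = 64 * 1024 * 2;

// Drop further gossip forwards once the bytes already queued for a peer (regular
// outbound messages plus pending gossip broadcasts) exceed the byte budget.
fn should_drop_gossip_forwards(
	pending_outbound_buffer: &VecDeque<Vec<u8>>, gossip_broadcast_buffer: &VecDeque<Vec<u8>>,
) -> bool {
	let total_outbound_buffered: usize = pending_outbound_buffer
		.iter()
		.chain(gossip_broadcast_buffer.iter())
		.map(|buf| buf.capacity())
		.sum();
	total_outbound_buffered > OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP
}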
1 parent 0cb7f54 commit 8e58fa5

2 files changed: +35 -23 lines changed

lightning/src/ln/peer_channel_encryptor.rs

Lines changed: 5 additions & 0 deletions
@@ -641,6 +641,11 @@ impl PeerChannelEncryptor {
 /// padding to allow for future encryption/MACing.
 pub struct MessageBuf(Vec<u8>);
 impl MessageBuf {
+	/// The total allocated space for this message
+	pub fn capacity(&self) -> usize {
+		self.0.capacity()
+	}
+
 	/// Creates a new buffer from an encoded message (i.e. the two message-type bytes followed by
 	/// the message contents).
 	///
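A brief aside: the new accessor exposes `capacity()` (the allocation size) rather than `len()`; since `MessageBuf` is documented as reserving padding for later encryption/MACing, the allocation can exceed the bytes written so far, which is presumably why the size limit below sums capacities. A self-contained illustration, with a plain `Vec<u8>` standing in for `MessageBuf` and arbitrary sizes:

fn main() {
	// Allocate headroom up front, as MessageBuf does for later encryption/MAC bytes.
	let mut buf: Vec<u8> = Vec::with_capacity(1024);
	buf.extend_from_slice(&[0u8; 100]);
	assert_eq!(buf.len(), 100); // bytes written so far
	assert!(buf.capacity() >= 1024); // bytes actually allocated
}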

lightning/src/ln/peer_handler.rs

Lines changed: 30 additions & 23 deletions
@@ -718,19 +718,11 @@ enum MessageBatchImpl {
 	CommitmentSigned(Vec<msgs::CommitmentSigned>),
 }
 
-/// The ratio between buffer sizes at which we stop sending initial sync messages vs when we stop
-/// forwarding gossip messages to peers altogether.
-const FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO: usize = 2;
-
 /// When the outbound buffer has this many messages, we'll stop reading bytes from the peer until
 /// we have fewer than this many messages in the outbound buffer again.
 /// We also use this as the target number of outbound gossip messages to keep in the write buffer,
 /// refilled as we send bytes.
 const OUTBOUND_BUFFER_LIMIT_READ_PAUSE: usize = 12;
-/// When the outbound buffer has this many messages, we'll simply skip relaying gossip messages to
-/// the peer.
-const OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP: usize =
-	OUTBOUND_BUFFER_LIMIT_READ_PAUSE * FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO;
 
 /// If we've sent a ping, and are still awaiting a response, we may need to churn our way through
 /// the socket receive buffer before receiving the ping.
@@ -754,10 +746,20 @@ const MAX_BUFFER_DRAIN_TICK_INTERVALS_PER_PEER: i8 = 4;
 /// process before the next ping.
 ///
 /// Note that we continue responding to other messages even after we've sent this many messages, so
-/// it's more of a general guideline used for gossip backfill (and gossip forwarding, times
-/// [`FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO`]) than a hard limit.
+/// this really limits gossip broadcast, gossip backfill, and onion message relay.
 const BUFFER_DRAIN_MSGS_PER_TICK: usize = 32;
 
+/// The maximum number of bytes which we allow in a peer's outbound buffers before we start
+/// dropping outbound gossip forwards.
+///
+/// This is currently 128KiB, or two messages at the maximum message size (though in practice we
+/// refuse to forward gossip messages which are substantially larger than we expect, so this is
+/// closer to ~85 messages if all queued messages are maximum-sized channel announcements).
+///
+/// Note that as we always drain the gossip forwarding queue before continuing gossip backfill,
+/// the equivalent maximum buffer size for gossip backfill is zero.
+const OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP: usize = 64 * 1024 * 2;
+
 struct Peer {
 	channel_encryptor: PeerChannelEncryptor,
 	/// We cache a `NodeId` here to avoid serializing peers' keys every time we forward gossip
@@ -889,12 +891,11 @@ impl Peer {
 
 	/// Returns whether this peer's outbound buffers are full and we should drop gossip broadcasts.
 	fn buffer_full_drop_gossip_broadcast(&self) -> bool {
-		let total_outbound_buffered =
-			self.gossip_broadcast_buffer.len() + self.pending_outbound_buffer.len();
+		let total_outbound_buffered: usize =
+			self.gossip_broadcast_buffer.iter().map(|m| m.capacity()).sum::<usize>()
+				+ self.pending_outbound_buffer.iter().map(|m| m.capacity()).sum::<usize>();
 
-		total_outbound_buffered > OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP
-			|| self.msgs_sent_since_pong
-				> BUFFER_DRAIN_MSGS_PER_TICK * FORWARD_INIT_SYNC_BUFFER_LIMIT_RATIO
+		total_outbound_buffered > OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP
 	}
 
 	fn set_their_node_id(&mut self, node_id: PublicKey) {
@@ -4619,22 +4620,26 @@ mod tests {
 		let secp_ctx = Secp256k1::new();
 		let key = SecretKey::from_slice(&[1; 32]).unwrap();
 		let msg = channel_announcement(&key, &key, ChannelFeatures::empty(), 42, &secp_ctx);
+		// The message buffer size is the message length plus two 16-byte MACs plus a 2-byte length.
+		let encoded_size = msg.encode().len() + 16 * 2 + 2;
 		let msg_ev = MessageSendEvent::BroadcastChannelAnnouncement { msg, update_msg: None };
 
 		fd_a.hang_writes.store(true, Ordering::Relaxed);
 
 		// Now push an arbitrarily large number of messages and check that only
 		// `OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP` messages end up in the queue.
-		for _ in 0..OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP * 2 {
+		for _ in 0..OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP / encoded_size {
 			cfgs[0].routing_handler.pending_events.lock().unwrap().push(msg_ev.clone());
 			peers[0].process_events();
 		}
 
 		{
 			let peer_a_lock = peers[0].peers.read().unwrap();
-			let buf_len =
-				peer_a_lock.get(&fd_a).unwrap().lock().unwrap().gossip_broadcast_buffer.len();
-			assert_eq!(buf_len, OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP);
+			let peer_state_lock = peer_a_lock.get(&fd_a).unwrap().lock().unwrap();
+			let buf_len: usize =
+				peer_state_lock.gossip_broadcast_buffer.iter().map(|m| m.capacity()).sum();
+			assert!(buf_len > OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP - encoded_size);
+			assert!(buf_len < OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP);
 		}
 
 		// Check that if a broadcast message comes in from the channel handler (i.e. it is an
@@ -4644,9 +4649,11 @@
 
 		{
 			let peer_a_lock = peers[0].peers.read().unwrap();
-			let buf_len =
-				peer_a_lock.get(&fd_a).unwrap().lock().unwrap().gossip_broadcast_buffer.len();
-			assert_eq!(buf_len, OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP + 1);
+			let peer_state_lock = peer_a_lock.get(&fd_a).unwrap().lock().unwrap();
+			let buf_len: usize =
+				peer_state_lock.gossip_broadcast_buffer.iter().map(|m| m.capacity()).sum();
+			assert!(buf_len > OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP);
+			assert!(buf_len < OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP + encoded_size);
 		}
 
 		// Finally, deliver all the messages and make sure we got the right count. Note that there
// Finally, deliver all the messages and make sure we got the right count. Note that there
@@ -4666,7 +4673,7 @@
 
 		assert_eq!(
 			cfgs[1].routing_handler.chan_anns_recvd.load(Ordering::Relaxed),
-			OUTBOUND_BUFFER_LIMIT_DROP_GOSSIP + 2
+			OUTBOUND_BUFFER_SIZE_LIMIT_DROP_GOSSIP / encoded_size + 1
 		);
 	}
 