Persist inbound channel peer info #170
Conversation
@tnull would love your input here (:
00bfe53 to f994c45 (force-push)
tnull left a comment
Thanks for looking into this, and excuse the delay!
I think the approach of persisting the peer data when accepting the channel in the OpenChannelRequest handler makes sense. However, note that this event won't currently be emitted if the peer doesn't enable 0conf support. For this to work we also need to always enable user_config.manually_accept_inbound_channels in builder.rs, independently of the 0conf case.
Also, can we add a regression test checking that we would reconnect to a peer from which we accepted an inbound channel after we reinit (or at least stop()/start()) our node?
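For reference, a minimal sketch of the builder.rs change being asked for here, based on the config snippet quoted later in this thread (the surrounding code is assumed, not authoritative):

// Before (as quoted further down in this thread): inbound channels are only
// manually accepted when 0conf peers are configured.
//
//     if !config.trusted_peers_0conf.is_empty() {
//         user_config.manually_accept_inbound_channels = true;
//     }
//
// After: always accept inbound channels manually, so Event::OpenChannelRequest
// is emitted for every inbound channel and gives us a chance to persist the
// counterparty's peer info. The 0conf trust decision can still be made
// per-request in the event handler.
user_config.manually_accept_inbound_channels = true;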
src/event.rs (outdated)
    if allow_0conf { " trusted" } else { "" },
    counterparty_node_id,
);
// Persist channel
nit: We're actually not persisting the 'channel' here. As the comment is somewhat superfluous, probably best to remove it.
src/event.rs (outdated)
};
log_info!(
    self.logger,
    "Persisted inbound channel from peer {}", counterparty_node_id
nit: Likewise here: we haven't persisted the channel. However, we probably don't need to log this separately in the success case.
src/event.rs (outdated)
    }
};
let peer = PeerInfo { node_id: counterparty_node_id, address: NetAddress(address) };
if let Err(e) = self.peer_store.add_peer(peer) {
Could you use a match here?
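A minimal sketch of the match-based form being asked for, with the same behavior as the if let Err(e) branch in the quoted code (assuming add_peer returns a Result, as it does above):

match self.peer_store.add_peer(peer) {
    Ok(_) => {}
    Err(e) => {
        // Same error handling as before, just written as an explicit match.
        log_error!(self.logger, "Failed to add peer to peer store: {}", e);
    }
}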
src/event.rs (outdated)
let address = match peer.1.clone() {
    Some(address) => address,
    None => {
        log_error!(self.logger, "Failed to add peer to peer store: no network address found");
Should we log the counterparty_node_id in this case, too?
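E.g., something along these lines (a sketch, exact wording up to you):

log_error!(
    self.logger,
    "Failed to add peer {} to peer store: no network address found",
    counterparty_node_id
);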
@jbesraa Could you make any progress here?
da4621d to 4cbd450 (force-push)
src/test/functional_tests.rs (outdated)
let node_b_peers = node_b_peers.get(0).unwrap();
assert_eq!(node_b_peers.node_id, node_a_id);
assert_eq!(node_b_peers.is_persisted, true);
assert_eq!(node_b_peers.is_connected, true);
This assertion is failing. I tried to debug it: when I restart the node, it tries to reconnect to peer A but gets ConnectionRefused; specifically, it fails here: https://github.com/lightningdevkit/ldk-node/pull/170/files#diff-b1a35a68f14e696205874893c07fd24fdb88882b47c23cc0e0c80a30c7d53759R584
Any hints why this can happen?
Yes: a) Reconnecting might take some time, so you might need to add a small (500ms, maybe?) sleep to give the node a chance to reconnect before checking (see the sketch below). b) When A connects to B, the port associated with A's outgoing socket is not its listening port. So rather than just reusing the connected address, we'll have to look up the listening address in the network graph, and just not persist the info if we can't find it.
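A rough sketch of point a) for the test, assuming the list_peers()/PeerDetails API used elsewhere in this thread (the 500ms figure is just a guess):

// Give node B a moment to re-establish the connection after restart before
// asserting on its peer state.
std::thread::sleep(std::time::Duration::from_millis(500));
let node_b_peers = node_b.list_peers();
let peer_a = node_b_peers.iter().find(|p| p.node_id == node_a_id).unwrap();
assert!(peer_a.is_persisted);
assert!(peer_a.is_connected);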
4cbd450 to 0ea26f9 (force-push)
src/test/functional_tests.rs (outdated)
builder_b.set_esplora_server(esplora_url.clone());
let node_b = builder_b.build().unwrap();
node_b.start().unwrap();
open_channel(&node_a, &node_b, &bitcoind, &electrsd, true);
Will add another flow to cover allow_0conf = false as well.
I don't think we need to cover 0conf as a special case; it shouldn't be connected to persistence at all.
0ea26f9 to 2d4b66b (force-push)
tnull left a comment
This still won't work if we don't update our manual-accept settings as mentioned above:

> I think the approach of persisting the peer data when accepting the channel in the OpenChannelRequest handler makes sense. However, note that this event won't currently be emitted if the peer doesn't enable 0conf support. For this to work we also need to always enable user_config.manually_accept_inbound_channels in builder.rs, independently of the 0conf case.
src/test/functional_tests.rs (outdated)
    do_channel_full_cycle(node_a, node_b, &bitcoind, &electrsd, true)
}

fn open_channel<K: KVStore + Sync + Send>(
Now merged #198, feel free to rebase on main.
src/test/functional_tests.rs (outdated)
// B should persist A as a peer by adding their
// node_id to peer_store
let (bitcoind, electrsd) = setup_bitcoind_and_electrsd();
println!("== Node A ==");
Please use setup_two_nodes for this.
src/event.rs (outdated)
    .iter()
    .find(|(node_id, _)| node_id == &counterparty_node_id)
{
    let address = match peer.1.clone() {
Rather than using this address, we need to retrieve the listening address from the public network graph and not proceed to persist the peer info if we can't find it.
Yes, I was just wondering why the code is trying to reconnect to the wrong port.
Trying to access self.network_graph.read_only().nodes, it returns an empty map. Do I need to set up gossip somewhere?
Been trying to resolve this, will get back to it tomorrow.
A couple of questions please:
- I feel there is an error in rust-lightning: in the following logs, the port shown in node_b_peers is not in the node_a listening addresses array. In this test I'm not running my code; this is just starting 2 nodes, creating a channel, and then listing peers without stopping any node.
== Node A ==
Setting network: regtest
Setting random LDK storage dir: /tmp/xrPtejR
Setting random LDK listening addresses: [TcpIpV4 { addr: [127, 0, 0, 1], port: 9937 }, TcpIpV4 { addr: [127, 0, 0, 1], port: 57309 }]
== Node B ==
Setting network: regtest
Setting random LDK storage dir: /tmp/uu1Os5I
Setting random LDK listening addresses: [TcpIpV4 { addr: [127, 0, 0, 1], port: 12888 }, TcpIpV4 { addr: [127, 0, 0, 1], port: 14086 }]
Generating 101 blocks... Done!
Generating 1 blocks... Done!
A -- connect_open_channel -> B
029fca8b65b2b981ca69872a8ec09efd5d1b618e179a63d05a1dbc460b91d6d945 got event ChannelPending { channel_id: ChannelId([70, 215, 99, 34, 66, 47, 198, 223, 183, 31, 221, 47, 60, 193, 73, 26, 144, 183, 150, 22, 160, 33, 188, 229, 222, 222, 250, 236, 216, 40, 60, 92]), user_channel_id: UserChannelId(189294706845638377905056823224737092261), former_temporary_channel_id: ChannelId([180, 20, 248, 6, 111, 228, 250, 185, 45, 159, 200, 208, 14, 64, 2, 62, 15, 102, 60, 100, 168, 118, 239, 64, 165, 230, 199, 102, 126, 70, 42, 192]), counterparty_node_id: PublicKey(25fa24b0a53b330dd5e3ebc3a822c1b2cc4b9f5be27667a7b30bf37ad6061bec0d4b751e8f47b46ac7c1b07e5ca045347febf1a78328b974d69a57d6093ecd95), funding_txo: OutPoint { txid: 5d3c28d8ecfadedee5bc21a01696b7901a49c13c2fdd1fb7dfc62f422263d746, vout: 1 } }
node_b got event ChannelPending { channel_id: ChannelId([70, 215, 99, 34, 66, 47, 198, 223, 183, 31, 221, 47, 60, 193, 73, 26, 144, 183, 150, 22, 160, 33, 188, 229, 222, 222, 250, 236, 216, 40, 60, 92]), user_channel_id: UserChannelId(244287129025742860562500776048931337989), former_temporary_channel_id: ChannelId([180, 20, 248, 6, 111, 228, 250, 185, 45, 159, 200, 208, 14, 64, 2, 62, 15, 102, 60, 100, 168, 118, 239, 64, 165, 230, 199, 102, 126, 70, 42, 192]), counterparty_node_id: PublicKey(45d9d6910b46bc1d5ad0639a178e611b5dfd9ec08e2a8769ca81b9b2658bca9fe6f20a47262070c9128a9623d4e7cf46d7fdb21da90d1a54a1c75003ffad9194), funding_txo: OutPoint { txid: 5d3c28d8ecfadedee5bc21a01696b7901a49c13c2fdd1fb7dfc62f422263d746, vout: 1 } }
.. generating blocks ..
Generating 6 blocks... Done!
029fca8b65b2b981ca69872a8ec09efd5d1b618e179a63d05a1dbc460b91d6d945 got event ChannelReady { channel_id: ChannelId([70, 215, 99, 34, 66, 47, 198, 223, 183, 31, 221, 47, 60, 193, 73, 26, 144, 183, 150, 22, 160, 33, 188, 229, 222, 222, 250, 236, 216, 40, 60, 92]), user_channel_id: UserChannelId(189294706845638377905056823224737092261), counterparty_node_id: Some(PublicKey(25fa24b0a53b330dd5e3ebc3a822c1b2cc4b9f5be27667a7b30bf37ad6061bec0d4b751e8f47b46ac7c1b07e5ca045347febf1a78328b974d69a57d6093ecd95)) }
node_b got event ChannelReady { channel_id: ChannelId([70, 215, 99, 34, 66, 47, 198, 223, 183, 31, 221, 47, 60, 193, 73, 26, 144, 183, 150, 22, 160, 33, 188, 229, 222, 222, 250, 236, 216, 40, 60, 92]), user_channel_id: UserChannelId(244287129025742860562500776048931337989), counterparty_node_id: Some(PublicKey(45d9d6910b46bc1d5ad0639a178e611b5dfd9ec08e2a8769ca81b9b2658bca9fe6f20a47262070c9128a9623d4e7cf46d7fdb21da90d1a54a1c75003ffad9194)) }
[src/test/functional_tests.rs:57] &node_b_peers = PeerDetails {
node_id: PublicKey(
45d9d6910b46bc1d5ad0639a178e611b5dfd9ec08e2a8769ca81b9b2658bca9fe6f20a47262070c9128a9623d4e7cf46d7fdb21da90d1a54a1c75003ffad9194,
),
address: TcpIpV4 {
addr: [
127,
0,
0,
1,
],
port: 43256,
},
is_persisted: false,
is_connected: true,
}
- When the event OpenChannelRequest fired, the network_graph was not updated yet, so I am actually not sure we can rely on that. When moving the code to ChannelReady, I do see the node info and announcements, but this is still a bit tricky because we only listen to announcements from peers we have channels with.
Maybe we should tackle this here: https://github.com/lightningdevkit/ldk-node/blob/main/src/lib.rs#L552 ?
> Been trying to resolve this, will get back to it tomorrow. A couple of questions please:
> - I feel there is an error in rust-lightning: in the following logs, the port shown in node_b_peers is not in the node_a listening addresses array. In this test I'm not running my code; this is just starting 2 nodes, creating a channel, and then listing peers without stopping any node.

No, this is just how basic TCP connection handling usually works: you bind a socket to a particular port to listen for incoming connections, which you then accept, receiving a new per-connection socket. If you connect outbound you immediately get a per-connection socket with a random port that is required to be different from your listening port (as you already bound to that one). See the small illustration after this reply.

> - When the event OpenChannelRequest fired, the network_graph was not updated yet, so I am actually not sure we can rely on that. When moving the code to ChannelReady, I do see the node info and announcements, but this is still a bit tricky because we only listen to announcements from peers we have channels with.

Hm, I don't think this is technically true: we should also receive node announcements from peers we're only connected to. That said, I'm afraid we can't do much about this, as there really is no other way to be informed of the listening address for inbound connections. While this will be a bit tricky to test, it shouldn't be too big of a deal in practice, as we'd generally expect to have synced the gossip state via RGS or P2P during normal operation.
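A tiny standalone illustration of the first point above, using plain std networking rather than anything ldk-node-specific: the listening socket owns one port, while an outbound connection to it gets a different, ephemeral local port.

use std::net::{TcpListener, TcpStream};

fn main() -> std::io::Result<()> {
    // Bind a listening socket on an OS-assigned port.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let listen_addr = listener.local_addr()?;

    // Connect outbound to it: the connecting side is handed a per-connection
    // socket whose local port is ephemeral and differs from the listening port.
    let outbound = TcpStream::connect(listen_addr)?;
    assert_ne!(outbound.local_addr()?.port(), listen_addr.port());
    println!("listening on {}, connected from {}", listen_addr, outbound.local_addr()?);
    Ok(())
}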
7ab6ca9 to 7bebcc0 (force-push)
@tnull I think we should:
> LDK will only persist channels across restart whose funding is in-flight; however, they are not required to be confirmed. So if we want to move it to another event

Hm, I'm not sure what you mean by that? I think having this behavior covered in tests might become quite complicated. We would need to make sure that the node opening a channel already has a publicly announced channel and did broadcast its announcement before, i.e., set up two nodes, open a channel, give them time to exchange messages, and then disconnect and close the channel. Then we could open the actual channel of interest and test persistence by a) taking the channel-opening node offline, deleting its peer data so that it won't reconnect itself (i.e., it would need to use

TLDR: After thinking about it once more, I'm not sure testing this behavior is worth adding all this complexity after all. While it's nice to try to reconnect inbound channels if we come back online, it's really not that crucial, as peers establishing outbound connections are expected to reconnect eventually, and nodes accepting inbound channels are expected to be routing nodes that have a longer uptime, i.e., it's fine if they reconnect some of their channels only after a slight delay.
Ok, thanks for the explanation!

Mh, I think moving it to
7bebcc0 to 964ffd6 (force-push)
OpenChannelRequest
964ffd6 to 8da7500 (force-push)
thanks @tnull

if !pending_channel.is_outbound && self.peer_store.get_peer(&counterparty_node_id).is_none()

reverted to this as well:

if !config.trusted_peers_0conf.is_empty() {
    // Manually accept inbound channels if we expect 0conf channel requests, avoid
    // generating the events otherwise.
    user_config.manually_accept_inbound_channels = true;
}
src/lib.rs (outdated)
.await;
.await {
    Ok(_) => {
        log_info!(connect_logger, "Successfully reconnected to peer {}", node_id);
Added this while debugging, but I think it's nice to have.
Yeah, why not. However, could you please keep a let res = ... and then match on res? That avoids unnecessary indents.
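A minimal sketch of the shape being asked for; do_connect is a hypothetical stand-in for the real awaited connect call in lib.rs:

// Hypothetical placeholder for the actual connection future in lib.rs.
async fn do_connect(_node_id: &str) -> Result<(), std::io::Error> {
    Ok(())
}

async fn reconnect_and_log(node_id: &str) {
    // Bind the result first, then match on it, instead of matching on the
    // `.await` expression directly; this avoids an extra indentation level.
    let res = do_connect(node_id).await;
    match res {
        // The real code would use log_info!/log_error! with connect_logger here.
        Ok(()) => println!("Successfully reconnected to peer {}", node_id),
        Err(e) => eprintln!("Failed to reconnect to peer {}: {}", node_id, e),
    }
}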
8da7500 to add5880 (force-push)
No code changes, just pushed with a verified commit.
src/event.rs (outdated)
runtime: Arc<RwLock<Option<tokio::runtime::Runtime>>>,
logger: L,
config: Arc<Config>,
peer_store: Arc<PeerStore<K, Arc<FilesystemLogger>>>,
Please use the generic L here and below:
- peer_store: Arc<PeerStore<K, Arc<FilesystemLogger>>>,
+ peer_store: Arc<PeerStore<K, L>>,
src/event.rs (outdated)
channel_manager: Arc<ChannelManager<K>>, network_graph: Arc<NetworkGraph>,
keys_manager: Arc<KeysManager>, payment_store: Arc<PaymentStore<K, L>>,
runtime: Arc<RwLock<Option<tokio::runtime::Runtime>>>, logger: L, config: Arc<Config>,
peer_store: Arc<PeerStore<K, Arc<FilesystemLogger>>>,
- peer_store: Arc<PeerStore<K, Arc<FilesystemLogger>>>,
+ peer_store: Arc<PeerStore<K, L>>,
src/event.rs (outdated)
if !pending_channel.is_outbound
    && self.peer_store.get_peer(&counterparty_node_id).is_none()
{
    let nodes = network_graph.nodes();
You can write this much more compactly (and more readably) as:
if let Some(address) = network_graph
.nodes()
.get(&NodeId::from_pubkey(&counterparty_node_id))
.and_then(|node_info| node_info.announcement_info.as_ref())
.and_then(|ann_info| ann_info.addresses().first())
{
let peer = PeerInfo {
node_id: counterparty_node_id,
address: address.clone(),
};
self.peer_store.add_peer(peer).unwrap_or_else(|e| {
log_error!(
self.logger,
"Failed to add peer {} to peer store: {}",
counterparty_node_id,
e
);
});
}

Also, given we don't log peer store operations elsewhere, we don't need to log in the success case here (at least for now).
thanks!
src/lib.rs (outdated)
        log_info!(connect_logger, "Successfully reconnected to peer {}", node_id);
    }
    Err(e) => {
        log_error!(connect_logger, "Failed to connect to peer: {}", e);
nit: Let's also speak of reconnect here then.
When we get an inbound channel request, we don't persist the peer info of the node that initiated the request. In this commit, we use `[LdkEvent::ChannelPending]` to track channels that are inbound and whose counterparty node id is not persisted, and if so, we add their `PeerInfo` to `PeerStore`. Note that in order to persist this kind of data, we first need to get the listening address of the counterparty node, which is not always available; in that case we don't persist the peer info.
add5880 to fa404d1 (force-push)
Pushed a new version taking into account all the comments mentioned above! Thanks for the feedback & patience!
tnull left a comment
LGTM.
resolves #133

When we get an inbound channel request, we don't persist the peer info of the node that initiated the request. In this commit, we use [LdkEvent::ChannelPending] to track channels that are inbound and whose counterparty node id is not persisted, and if so, we add their PeerInfo to PeerStore. Note that in order to persist this kind of data, we first need to get the listening address of the counterparty node, which is not always available; in that case we don't persist the peer info.