
Conversation

TheBlueMatt
Collaborator

@TheBlueMatt TheBlueMatt commented Oct 6, 2025

Backport of #4107, #4081, #4080, #4078, #3504 (as a prereq for tests in #3928), #3928, #3988, #4004, and #3984

@TheBlueMatt TheBlueMatt added this to the 0.1.6 milestone Oct 6, 2025
@ldk-reviews-bot

ldk-reviews-bot commented Oct 6, 2025

👋 Thanks for assigning @valentinewallace as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

TheBlueMatt and others added 6 commits October 6, 2025 15:23
If we're doing gossip backfill to a peer, we first forward all our
`channel_announcement`s and `channel_update`s in SCID-order. While
doing so, we don't forward any fresh `channel_update` messages for
any channels which we haven't yet backfilled (as we'll eventually
send the new update anyway, and it might get rejected without the
corresponding `channel_announcement`).

Sadly, our comparison for this was backwards, so we actually
*only* forwarded updates for channels we hadn't yet backfilled,
and dropped updates for channels we had already backfilled.

Backport of edc7903
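
To illustrate the flipped check, here is a minimal sketch; the function and parameter names (e.g. `backfill_progress_scid`) are hypothetical, not LDK's actual gossip-sync internals. A fresh `channel_update` should only be forwarded once its channel's SCID has already been covered by the backfill:

```rust
/// Hypothetical helper, not LDK's real code: `backfill_progress_scid` stands
/// in for "the highest SCID we have already backfilled to this peer".
fn should_forward_fresh_update(update_scid: u64, backfill_progress_scid: u64) -> bool {
    // The buggy comparison was effectively `update_scid > backfill_progress_scid`,
    // forwarding only updates for channels *not* yet backfilled. The fix flips
    // it so updates are forwarded only once the peer already has the
    // corresponding `channel_announcement`.
    update_scid <= backfill_progress_scid
}

fn main() {
    assert!(should_forward_fresh_update(100, 200)); // already backfilled: forward now
    assert!(!should_forward_fresh_update(300, 200)); // not yet backfilled: send later
}
```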
If a peer opens a channel to us, but never actually broadcasts the
funding transaction, we'll still keep a `ChannelMonitor` around for
the channel. While we maybe shouldn't do this either, when the
channel ultimately times out 2016 blocks later, we should at least
immediately archive the `ChannelMonitor`, which we do here.

Fixes lightningdevkit#3384

Backport of 744664e

Addressed nontrivial conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing changes in
   `Balance` creation upstream

and trivial ones in:
 * lightning/src/ln/functional_tests.rs
During startup, the lightning protocol forces us to fetch a ton of
gossip for channels where there is a `channel_update` in only one
direction. We then have to wait around a while until we can prune
that data, because we don't know when the gossip sync has completed.

Sadly, doing a large prune via `remove_stale_channels_and_tracking`
is somewhat slow. Removing a large portion of our graph currently
takes a bit more than 7.5 seconds on an i9-14900K, which can
effectively hang a node with a few less GHz more or less forever.

The bulk of this time is in our `IndexedMap` removals, where we
walk the entire `keys` `Vec` to remove the entry, then shift it
down after removing.

Here we shift to a bulk removal model when removing channels, doing
a single `Vec` iterate + shift. This reduces the same test to
around 1.38 seconds on the same hardware.

Backport of cc028f4 without
conflicts.
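
A rough sketch of the change in removal strategy, assuming a simplified `IndexedMap` that pairs a `HashMap` with a sorted `keys` `Vec`; this is an illustration, not LDK's actual implementation:

```rust
use std::collections::{HashMap, HashSet};

/// Simplified stand-in for illustration only.
struct IndexedMap<V> {
    map: HashMap<u64, V>,
    keys: Vec<u64>, // kept sorted for ordered iteration
}

impl<V> IndexedMap<V> {
    /// Old model: every removal shifts the `keys` Vec, so pruning k of n
    /// entries costs roughly O(k * n).
    fn remove(&mut self, key: &u64) -> Option<V> {
        let removed = self.map.remove(key);
        if removed.is_some() {
            if let Ok(idx) = self.keys.binary_search(key) {
                self.keys.remove(idx); // O(n) shift per removed entry
            }
        }
        removed
    }

    /// Bulk model: one retain pass over `keys` (a single iterate + shift)
    /// no matter how many entries are being pruned.
    fn remove_bulk(&mut self, doomed: &HashSet<u64>) {
        self.map.retain(|k, _| !doomed.contains(k));
        self.keys.retain(|k| !doomed.contains(k));
    }
}

fn main() {
    let mut m = IndexedMap { map: HashMap::new(), keys: Vec::new() };
    for k in 0..5u64 {
        m.map.insert(k, ());
        m.keys.push(k);
    }
    m.remove(&2);
    m.remove_bulk(&HashSet::from([0, 4]));
    assert_eq!(m.keys, vec![1, 3]);
}
```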
During startup, the lightning protocol forces us to fetch a ton of
gossip for channels where there is a `channel_update` in only one
direction. We then have to wait around a while until we can prune
that data, because we don't know when the gossip sync has completed.

Sadly, doing a large prune via `remove_stale_channels_and_tracking`
is somewhat slow. Removing a large portion of our graph currently
takes a bit more than 7.5 seconds on an i9-14900K, which can
effectively hang a node with a few less GHz more or less forever.

The bulk of this time is in our `IndexedMap` removals, where we
walk the entire `keys` `Vec` to remove the entry, then shift it
down after removing.

In the previous commit we shifted to a bulk removal model for
channels, here we do the same for nodes. This reduces the same test
to around 340 milliseconds on the same hardware.

Backport of 28b526a

Resolved trivial `use` conflicts in:
 * lightning/src/routing/gossip.rs
Previously, we had a bug that particularly affected async payments: if an
outbound payment was in the state {Static}InvoiceReceived and there was a call
to process_pending_htlc_forwards, the payment would be automatically abandoned.
We would correctly avoid abandoning if the payment was awaiting an invoice, but
not if the payment already had an invoice while its HTLCs weren't yet locked
in.

Backport of ebb8e79

Resolved trivial conflicts in:
 * lightning/src/ln/outbound_payment.rs

and removed changes in
 * lightning/src/ln/async_payments_tests.rs as it doesn't exist in this
   branch.
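A hedged sketch of the state check the fix implies; `PendingOutboundPayment` below is a toy stand-in for LDK's internal payment state machine, not its real definition:

```rust
/// Toy stand-in for LDK's internal outbound-payment state machine.
enum PendingOutboundPayment {
    AwaitingInvoice,
    StaticInvoiceReceived,
    InvoiceReceived,
    Retryable, // HTLCs locked in / in flight
}

/// Payments that have not yet locked in HTLCs must not be abandoned by the
/// forwarding pass. Previously only `AwaitingInvoice` was spared; the fix
/// also spares the `{Static}InvoiceReceived` states.
fn exempt_from_forward_pass_abandon(payment: &PendingOutboundPayment) -> bool {
    matches!(
        payment,
        PendingOutboundPayment::AwaitingInvoice
            | PendingOutboundPayment::StaticInvoiceReceived
            | PendingOutboundPayment::InvoiceReceived
    )
}

fn main() {
    assert!(exempt_from_forward_pass_abandon(&PendingOutboundPayment::StaticInvoiceReceived));
    assert!(!exempt_from_forward_pass_abandon(&PendingOutboundPayment::Retryable));
}
```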
Note:
The `actions_blocking_raa_monitor_updates` list may contain stale entries
in the form of `(channel_id, [])`, which do not represent actual dangling actions.

To handle this, stale entries are ignored when accumulating pending actions
before clearing them. This ensures that the logic focuses only on relevant
actions and avoids unnecessary accumulation of already processed data.

Backport of 86a0109
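
As a rough illustration of the note above (types simplified; the real map lives in LDK's per-peer state and the action type is not a `String`):

```rust
use std::collections::HashMap;

type ChannelId = [u8; 32];
type RaaBlockingAction = String; // placeholder for the real action type

/// Accumulate the dangling actions and clear the map, skipping stale
/// `(channel_id, [])` entries which carry no actual pending work.
fn take_pending_actions(
    actions_blocking_raa_monitor_updates: &mut HashMap<ChannelId, Vec<RaaBlockingAction>>,
) -> Vec<(ChannelId, Vec<RaaBlockingAction>)> {
    actions_blocking_raa_monitor_updates
        .drain()
        .filter(|(_, actions)| !actions.is_empty())
        .collect()
}

fn main() {
    let mut map = HashMap::new();
    map.insert([0u8; 32], Vec::new()); // stale entry, ignored
    map.insert([1u8; 32], vec!["blocker".to_string()]);
    assert_eq!(take_pending_actions(&mut map).len(), 1);
    assert!(map.is_empty());
}
```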
@TheBlueMatt
Collaborator Author

Sadly, some of the later backports were not entirely trivial, so it might be worth taking a closer look

shaavan and others added 9 commits October 6, 2025 19:32
Co-authored-by: Matt Corallo <[email protected]>

Backport of 45aa824 as a
prerequisite to backporting
583a9a3.

`use` conflicts resolved in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
In 9cc6e08, we started using the
`RAAMonitorUpdateBlockingAction` logic to block RAAs which may
remove a preimage from one `ChannelMonitor` if it isn't durably
stored in another that is a part of the same MPP claim.

Then, in 254b78f, when we claimed
a payment, if we saw that the HTLC was already claimed in the
channel, we'd simply drop the new RAA blocker. This can happen on
reload when replaying MPP claims.

However, just because an HTLC is no longer present in
`ChannelManager`'s `Channel`, doesn't mean that the
`ChannelMonitorUpdate` which stored the preimage actually made it
durably into the `ChannelMonitor` on disk.

We could begin an MPP payment, have one channel get the preimage
durably into its `ChannelMonitor`, then step forward another update
with the peer. Then, we could reload, causing the MPP claims to be
replayed across all channels, leading to the RAA blocker(s) being
dropped and all channels being unlocked. Finally, if the first
channel managed to step forward a further update with its peer
before the (now-replayed) `ChannelMonitorUpdate`s for all MPP parts
make it to disk we could theoretically lose the preimage.

This is, of course, a somewhat comically unlikely scenario, but I
had an old note to expand the test and it turned up the issue, so
we might as well fix it.

Backport of 583a9a3

Resolved conflicts in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
In 0.1 we started requiring `counterparty_node_id` to be set in
various previous-hop data structures when claiming HTLCs. While
we can't switch `HTLCSource`'s
`HTLCPreviousHopData::counterparty_node_id` to required (as it
might cause us to fail to read old `ChannelMonitor`s which still
hold `HTLCSource`s we no longer need to claim), we can at least
start requiring the field in `PendingAddHTLCInfo` and
`HTLCClaimSource`. This simplifies `claim_mpp_part` marginally.

Backport of f624047

Addressed substantial conflicts in:
 * lightning/src/ln/channelmanager.rs
When we claim a payment, `Event::PaymentClaimed` contains a list of
the HTLCs we claimed from as `ClaimedHTLC` objects. While they
include the `channel_id` the payment came to us over, in theory
`channel_id`s aren't guaranteed to be unique (though in practice
they are in all opened channels aside from 0conf ones with a
malicious counterparty). Further, our APIs often require passing
both the `counterparty_node_id` and the `channel_id` to do things
to channels.

Thus, here we add the missing `counterparty_node_id` to
`ClaimedHTLC`.

Backport of 7c735d9
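
A sketch of the shape of the change; the field set is trimmed, the types are placeholders, and the optionality of the new field is an assumption of this sketch rather than a statement about the real `ClaimedHTLC` definition:

```rust
type ChannelId = [u8; 32];
type PublicKey = [u8; 33]; // placeholder for a real secp256k1 public key

/// Simplified stand-in for the `ClaimedHTLC` struct attached to
/// `Event::PaymentClaimed`.
struct ClaimedHTLC {
    channel_id: ChannelId,
    /// Newly added so users can address the channel with the
    /// (counterparty_node_id, channel_id) pair most APIs expect. Shown as
    /// `Option` purely as an assumption of this sketch.
    counterparty_node_id: Option<PublicKey>,
    value_msat: u64,
}

fn main() {
    let _htlc = ClaimedHTLC {
        channel_id: [0u8; 32],
        counterparty_node_id: Some([2u8; 33]),
        value_msat: 1_000,
    };
}
```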
Historically we indexed channels by
`(counterparty_node_id, funding outpoint)` in several pipelines,
especially the `ChannelMonitorUpdate` pipeline. This ended up
complicating quite a few things as we always needed to store the
full `(counterparty_node_id, funding outpoint, channel_id)` tuple
to ensure we can always access a channel no matter its state.

Over time we want to move to only the
`(counterparty_node_id, channel_id)` tuple as *the* channel index,
especially as we move towards V2 channels that have a
globally-unique `channel_id` anyway.

Here we take one small step towards this, avoiding using the
channel funding outpoint in the `EventCompletionAction` pipeline.

Backport of 10df89d

Resolved conflicts in:
 * lightning/src/ln/channel.rs
 * lightning/src/ln/channelmanager.rs
We added the ability to block `ChannelMonitorUpdate`s on receipt of
an RAA in order to avoid dropping a payment preimage from a channel
that created a `PaymentSent` event in
9ede794. We did not at the time
use the same infrastructure for `PaymentClaimed` events, but really
should have. While a `PaymentClaimed` event may seem a bit less
critical than a `PaymentSent` event (it doesn't contain a payment
preimage that the user needs to make sure they store for proof of
payment), it's still important for users to ensure their payment
tracking logic is always correct.

Here we take the (relatively straightforward) action of setting an
`EventCompletionAction` to block RAA monitor updates on channels
which created a `PaymentClaimed` event. Note that we only block one
random channel from an MPP payment, not all of them, as any single
channel should provide enough information for us to recreate the
`PaymentClaimed` event on restart.

Backport of a80c855

Resolved conflicts in:
 * lightning/src/ln/async_signer_tests.rs due to changes only being
   on tests which don't exist on this branch,
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/quiescence_tests.rs due to the file not being
   present in this branch.

Resolved silent conflicts in:
 * lightning/src/ln/monitor_tests.rs due to slightly different test
   APIs.
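
A minimal sketch of the idea of blocking only one MPP part; all types here are simplified stand-ins, not LDK's actual `EventCompletionAction` plumbing:

```rust
type ChannelId = [u8; 32];
type PublicKey = [u8; 33]; // placeholder key type

struct ClaimedPart {
    counterparty_node_id: PublicKey,
    channel_id: ChannelId,
}

enum EventCompletionAction {
    /// Holds back RAA-triggered `ChannelMonitorUpdate`s on one channel until
    /// the `PaymentClaimed` event has been processed by the user.
    ReleaseRAAChannelMonitorUpdate { counterparty_node_id: PublicKey, channel_id: ChannelId },
}

/// Any single part carries enough information to recreate `PaymentClaimed`
/// on restart, so blocking one channel (rather than every MPP part) suffices.
fn payment_claimed_completion_action(parts: &[ClaimedPart]) -> Option<EventCompletionAction> {
    parts.first().map(|p| EventCompletionAction::ReleaseRAAChannelMonitorUpdate {
        counterparty_node_id: p.counterparty_node_id,
        channel_id: p.channel_id,
    })
}

fn main() {
    let parts = vec![ClaimedPart { counterparty_node_id: [2u8; 33], channel_id: [0u8; 32] }];
    assert!(payment_claimed_completion_action(&parts).is_some());
}
```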
`test_dup_htlc_onchain_doesnt_fail_on_reload` made reference to
`ChainMonitor` persisting `ChannelMonitor`s on each new block,
which hasn't been the case in some time. Instead, we update the
comment and code to make explicit that it doesn't impact the test.

Backport of 0a6c3fb

Resolved `use` conflicts in:
 * lightning/src/ln/payment_tests.rs
During testing, we check that a `ChannelMonitor` will round-trip
through serialization exactly. However, we recently added a fix to
change a value in `PackageTemplate` on reload to fix some issues in
the field in 0.1. This can cause the round-trip tests to fail as a
field is modified during read.

We fix it here by simply exempting the field from the equality test
in the condition where it would be updated on read.

We also make the `ChannelMonitor` `PartialEq` trait implementation
non-public as weird workarounds like this make clear that such a
comparison is a brittle API at best.

Backport of a8ec966

Resolved `use` and `rustfmt` conflicts in:
 * lightning/src/chain/channelmonitor.rs

In the upstream version of this commit, the `PartialEq`
implementation for `ChannelMonitor` was made test-only; to avoid
breaking a public API, we do not do so here. However, the changes
to `PartialEq` for `PackageTemplate` are `test`-only, so this
shouldn't result in any behavioral change (not that the marginal
`PartialEq` changes are likely to impact downstream crates in any
case).
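
A toy sketch of the equality workaround, under the assumption that the reload fix rewrites a single timer-like field; the field name and the zero sentinel are hypothetical, not `PackageTemplate`'s real layout:

```rust
/// Toy stand-in for `PackageTemplate` with a hypothetical field that reload
/// may rewrite.
#[derive(Debug)]
struct PackageTemplate {
    value_sats: u64,
    height_timer: u32, // hypothetical: rewritten on read in some conditions
}

impl PartialEq for PackageTemplate {
    fn eq(&self, other: &Self) -> bool {
        // If either side is in the state that gets rewritten on read, skip
        // comparing that field so serialization round-trip tests still pass.
        let rewritten_on_read = self.height_timer == 0 || other.height_timer == 0;
        self.value_sats == other.value_sats
            && (rewritten_on_read || self.height_timer == other.height_timer)
    }
}

fn main() {
    let pre = PackageTemplate { value_sats: 10_000, height_timer: 0 };
    let post = PackageTemplate { value_sats: 10_000, height_timer: 123 };
    assert_eq!(pre, post); // still "equal" despite the field being fixed up on read
}
```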
On `ChannelManager` reload we rebuild the pending outbound payments
list by looking for any missing payments in `ChannelMonitor`s.
However, in the same loop over `ChannelMonitor`s, we also re-claim
any pending payments which we see we have a payment preimage for.

If we send an MPP payment across different channels, the result may
be that we'll iterate the loop, and in each iteration add a
pending payment with only one known path, then claim/fail it and
remove the pending payment (at least for the claim case). This may
result in spurious extra events, or even both a `PaymentFailed` and
`PaymentSent` event on startup for the same payment.

Backport of 8106dbf

Resolved substantial conflicts in:
 * lightning/src/ln/channelmanager.rs by simply rewriting the
   patch.

Note that the `is_channel_closed` variable used in the upstream
version of this commit replaced simply checking if the
`outpoint_to_peer` map had an entry for the channel's funding
outpoint.

Note that the next upstream commit in this series
(3239d67) is unnecessary on this
branch as we use `outpoint_to_peer` rather than `per_peer_state` to
detect channel closure here.
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

When we restart and, during `ChannelManager` load, see a
`ChannelMonitor` for a closed channel, we scan it for preimages
that we passed to it and re-apply those to any pending or forwarded
payments. However, we didn't scan it for preimages it learned from
transactions on-chain. In cases where a `MonitorEvent` is lost,
this can lead to a lost preimage. Here we fix it by simply tracking
preimages we learned on-chain the same way we track preimages
picked up during normal channel operation.

Backport of 543cc85

Resolved conflicts in
 * lightning/src/ln/monitor_tests.rs due to trivial changes
   upstream as well as changes to upstream bump events and
   commitment announcement logic.
Backport of 9186900

Resolved trivial conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing's
   introduction of the `funding` field in monitors.
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In a previous commit we handled the case of claimed HTLCs by
replaying payment preimages on startup to avoid `MonitorEvent` loss
causing us to miss an HTLC claim. Here we handle the HTLC-failed
case similarly.

Unlike with HTLC claims via preimage, we don't already have replay
logic in `ChannelManager` startup, but it's easy enough to add.
Luckily, we already track when an HTLC reaches permanently-failed
state in `ChannelMonitor` (i.e. it has `ANTI_REORG_DELAY`
confirmations on-chain on the failing transaction), so all we need
to do is add the ability to query for that and fail them on
`ChannelManager` startup.

Backport of f809e6c

Resolved conflicts in:
 * lightning/src/chain/channelmonitor.rs due to splicing-related
   changes in the upstream branch,
 * lightning/src/ln/channelmanager.rs due to lack of the
   `LocalHTLCFailureReason` type in this branch, and
 * lightning/src/ln/monitor_tests.rs due to changes to upstream
   bump events and commitment announcement logic.
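
A sketch of the permanence check described above, assuming the monitor records the height at which the failing transaction confirmed; names here are illustrative, not the real monitor API (ANTI_REORG_DELAY is the constant named in the commit message, with LDK's value of 6):

```rust
/// Confirmations required before an on-chain resolution is treated as final.
const ANTI_REORG_DELAY: u32 = 6;

struct OnchainHtlcFailure {
    payment_hash: [u8; 32],
    /// Height at which the failing transaction confirmed (illustrative field).
    failure_confirmation_height: u32,
}

/// HTLC failures with at least ANTI_REORG_DELAY confirmations are the ones a
/// `ChannelManager` would replay as failed on startup.
fn permanently_failed<'a>(
    failures: &'a [OnchainHtlcFailure],
    best_block_height: u32,
) -> Vec<&'a OnchainHtlcFailure> {
    failures
        .iter()
        .filter(|f| {
            best_block_height.saturating_sub(f.failure_confirmation_height) + 1 >= ANTI_REORG_DELAY
        })
        .collect()
}

fn main() {
    let failures = [
        OnchainHtlcFailure { payment_hash: [1; 32], failure_confirmation_height: 100 },
        OnchainHtlcFailure { payment_hash: [2; 32], failure_confirmation_height: 104 },
    ];
    // At height 105, only the first failure has the required 6 confirmations.
    assert_eq!(permanently_failed(&failures, 105).len(), 1);
}
```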
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it would
generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events with the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

This matches the criteria for generating a `MonitorEvent`,
and works great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we take the first step towards that notification, adding a new
`ChannelMonitorUpdateStep` for the completion notification, and
tracking HTLCs which make it to the `ChannelMonitor` in such
updates in a new map.

Backport of c49ce57

Trivial conflicts resolved in:
 * lightning/src/chain/channelmonitor.rs
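
The shape of the new tracking, roughly; the variant name is taken from the commit message, while everything else (the key type, the map's name, and the variant's fields) is a hypothetical simplification:

```rust
use std::collections::HashMap;

/// Hypothetical identifier for an HTLC we sent; LDK's real key type is richer.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
struct SentHtlcId(u64);

type PaymentHash = [u8; 32];

enum ChannelMonitorUpdateStep {
    // ...existing variants elided...
    /// New step: the payment resolution for this HTLC has been handed all the
    /// way through to a user `Event`. Fields are assumptions of this sketch.
    ReleasePaymentComplete { htlc: SentHtlcId, payment_hash: PaymentHash },
}

/// Hypothetical slice of monitor state holding the new tracking map.
struct MonitorPaymentTracking {
    fully_resolved_payment_htlcs: HashMap<SentHtlcId, PaymentHash>,
}

impl MonitorPaymentTracking {
    fn apply(&mut self, step: ChannelMonitorUpdateStep) {
        match step {
            ChannelMonitorUpdateStep::ReleasePaymentComplete { htlc, payment_hash } => {
                // Remember the HTLC so startup re-hydration can skip it later.
                self.fully_resolved_payment_htlcs.insert(htlc, payment_hash);
            },
        }
    }
}

fn main() {
    let mut tracking = MonitorPaymentTracking { fully_resolved_payment_htlcs: HashMap::new() };
    let step = ChannelMonitorUpdateStep::ReleasePaymentComplete {
        htlc: SentHtlcId(1),
        payment_hash: [0; 32],
    };
    tracking.apply(step);
    assert!(tracking.fully_resolved_payment_htlcs.contains_key(&SentHtlcId(1)));
}
```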
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it would
generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events with the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

This matches the criteria for generating a `MonitorEvent`,
and works great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we prepare to generate the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates, adding
a new `PaymentCompleteUpdate` struct to track the new update before
we generate the `ChannelMonitorUpdate` and passing through to the
right places in `ChannelManager`.

The only cases where we want to generate the new update is after a
`PaymentSent` or `PaymentFailed` event when the event was the
result of a `MonitorEvent` or the equivalent read during startup.

Backport of 8b637cc

Conflicts resolved in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/outbound_payment.rs
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it would
generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events with the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

This matches the criteria for generating a `MonitorEvent`,
and works great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we begin generating the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates,
updating functional tests for the new `ChannelMonitorUpdate`s where
required.

Backport of 71a364c

Conflicts resolved in:
 * lightning/src/ln/chanmon_update_fail_tests.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_test_utils.rs
 * lightning/src/ln/functional_tests.rs
 * lightning/src/ln/monitor_tests.rs
 * lightning/src/ln/outbound_payment.rs
 * lightning/src/ln/payment_tests.rs

Note that unlike the original commit, on this branch we do not fail
to deserialize a `ChannelMonitor` if the `counterparty_node_id` is
`None` (implying it has not seen a `ChannelMonitorUpdate` since
LDK 0.0.118). Thus, we skip the new logic in some cases, generating
a warning log instead.

As we assumed that it is now reasonable to require
`counterparty_node_id`s in LDK 0.2, it seems reasonable to skip the
new logic here as well (potentially generating some additional
spurious payment events on restart).
`MonitorEvent`s aren't delivered to the `ChannelManager` in a
durable fashion - if the `ChannelManager` fetches the pending
`MonitorEvent`s, the `ChannelMonitor` is then persisted (e.g. due
to a block update), and the node crashes prior to persisting the
`ChannelManager` again, the `MonitorEvent` and its effects on the
`ChannelManager` will be lost. This isn't likely in a sync persist
environment, but in an async one this could be an issue.

Note that this is only an issue for closed channels -
`MonitorEvent`s only inform the `ChannelManager` that a channel is
closed (which the `ChannelManager` will learn on startup or when it
next tries to advance the channel state), that
`ChannelMonitorUpdate` writes completed (which the `ChannelManager`
will detect on startup), or that HTLCs resolved on-chain post
closure. Of the three, only the last is problematic to lose prior
to a reload.

In previous commits we ensured that HTLC resolutions which came to
`ChannelManager` via a `MonitorEvent` were replayed on startup if
the `MonitorEvent` was lost. However, in cases where the
`ChannelManager` was so stale that it didn't have the payment state
for an HTLC at all, we only re-add it in cases where
`ChannelMonitor::get_pending_or_resolved_outbound_htlcs` includes
it.

Constantly re-adding a payment state and then failing it would
generate lots of noise for users on startup (not to mention the
risk of confusing stale payment events with the latest state of a
payment when the `PaymentId` has been reused to retry a payment).
Thus, `get_pending_or_resolved_outbound_htlcs` does not include
state for HTLCs which were resolved on chain with a preimage or
HTLCs which were resolved on chain with a timeout after
`ANTI_REORG_DELAY` confirmations.

This matches the criteria for generating a `MonitorEvent`,
and works great under the assumption that `MonitorEvent`s are
reliably delivered. However, if they are not, and our
`ChannelManager` is lost or substantially old (or, in a future
where we do not persist `ChannelManager` at all), we will not end
up seeing payment resolution events for an HTLC.

Instead, we really want to tell our `ChannelMonitor`s when the
resolution of an HTLC is complete. Note that we don't particularly
care about non-payment HTLCs, as there is no re-hydration of state
to do there - `ChannelManager` load ignores forwarded HTLCs coming
back from `get_pending_or_resolved_outbound_htlcs` as there's
nothing to do - we always attempt to replay the success/failure and
figure out if it mattered based on whether there was still an HTLC
to claim/fail.

Here we, finally, begin actually using the new
`ChannelMonitorUpdateStep::ReleasePaymentComplete` updates,
skipping re-hydration of pending payments once they have been fully
resolved through to a user `Event`.

Backport of 226520b

Fixed conflicts in:
 * lightning/src/chain/channelmonitor.rs
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/functional_tests.rs to address differences in
   between `ChannelMonitorUpdate` generation for channels
   force-closed by commitment transaction confirmation between
   upstream and this branch,
 * lightning/src/ln/payment_tests.rs
When a payment was sent and ultimately completed through an
on-chain HTLC claim which we discover during startup, we
deliberately break the payment tracking logic to keep it around
forever, declining to send a `PaymentPathSuccessful` event but
ensuring that we don't constantly replay the claim on every
startup.

However, we now have logic to complete a claim by marking it as
completed in a `ChannelMonitor`, rather than replaying information
about the claim on every startup. Thus, we no longer need to take
the conservative stance and can correctly replay claims,
generating `PaymentPathSuccessful` events and allowing the state to
be removed.

Backport of ba6528f

Fixed conflicts in:
 * lightning/src/ln/channelmanager.rs
 * lightning/src/ln/payment_tests.rs
`startup_replay` should always exactly mirror
`background_events_processed_since_startup`, so we should just use
that rather than having a dedicated argument for it.

Backport of 77026c9

Fixed conflicts in:
 * lightning/src/ln/channelmanager.rs
@ldk-reviews-bot

✅ Added second reviewer: @wpaulino

@ldk-reviews-bot

✅ Added second reviewer: @tankyleo

Contributor

@tankyleo tankyleo left a comment


haven't caught anything on first pass, will take another pass tomorrow, the last three PRs are non-trivial as you say

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.
