Replace dual-sync-async persistence panic with Watch contract by joostjager · Pull Request #4436 · lightningdevkit/rust-lightning

joostjager · 2026-02-24T10:13:27Z

Fix false panic by relaxing the Watch contract from a global "pick one mode" restriction to a per-channel rule: don't return Completed while prior InProgress updates are still pending for that channel.

ldk-reviews-bot · 2026-02-24T10:13:30Z

👋 Thanks for assigning @TheBlueMatt as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

joostjager · 2026-02-24T10:26:56Z

I didn't add a dedicated test for the InProgress -> complete -> Completed progression since it's already well-covered by existing tests that exercise this flow without disabling the contract check: test_monitor_update_fail_cs, test_monitor_update_fail_no_rebroadcast, test_monitor_update_fail_reestablish, monitor_failed_no_reestablish_response, test_monitor_update_fail_claim, test_monitor_update_on_pending_forwards, monitor_update_claim_fail_no_response, do_during_funding_monitor_fail, test_path_paused_mpp, test_temporary_error_during_shutdown, double_temp_error, test_manager_persisted_post_outbound_edge_holding_cell.

The ChainMonitor override path (where update_monitor fails and the return is overridden to InProgress) is also exercised without disabling the check by: test_monitor_and_persister_update_fail, test_concurrent_monitor_claim, test_update_err_monitor_lockdown.

codecov · 2026-02-24T13:07:42Z

Codecov Report

❌ Patch coverage is 90.90909% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 85.97%. Comparing base (24062c0) to head (c37be93).
⚠️ Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
lightning/src/ln/channelmanager.rs	85.71%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #4436      +/-   ##
==========================================
+ Coverage   85.94%   85.97%   +0.03%     
==========================================
  Files         159      159              
  Lines      104607   104618      +11     
  Branches   104607   104618      +11     
==========================================
+ Hits        89901    89943      +42     
+ Misses      12194    12168      -26     
+ Partials     2512     2507       -5

Flag	Coverage Δ
tests	`85.97% <90.90%> (+0.03%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

TheBlueMatt

There's more context in the PR description than the commit message, please update the commit message with the context (PR descriptions should almost always just be the commit messages themselves!). Also, IIRC the chanmon_consistency target is conservative to avoid violating the old API, we should update it to allow returning Completed more often to match the API changes here.

ldk-reviews-bot · 2026-02-24T14:07:55Z

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.

joostjager · 2026-02-24T17:29:41Z

Updated fuzzer

TheBlueMatt · 2026-02-24T18:15:43Z

fuzz/src/chanmon_consistency.rs

@@ -2101,21 +2101,57 @@ pub fn do_test<Out: Output + MaybeSend + MaybeSync>(
 			// harm in doing so.
 			0x00 => {
 				*mon_style[0].borrow_mut() = ChannelMonitorUpdateStatus::InProgress;


We can probably drop mon_style again, no? Given we added it in the being-basically-reverted commit?

It's currently only used to remember what style to use after a node reload. But I could read from update_ret too and set that again on the new persister.

We could also reduce the number of command bytes by defining a command to toggle the mode.

Commit 0760f99 ("Disallow dual-sync-async persistence without restarting") added a panic in non-test builds when a Persist implementation returns both Completed and InProgress from the same ChannelManager instance. However, this check runs against the status that ChainMonitor returns to ChannelManager, not the raw Persist result. When ChannelMonitor::update_monitor fails (e.g. a counterparty commitment_signed arrives after a funding spend confirms), ChainMonitor persists the full monitor successfully but overrides the return value to InProgress. If the user's Persist impl only ever returns Completed, this override triggers a false mode-mismatch panic. This replaces the panic with a cleaner contract at the Watch trait level: a Watch implementation must not return Completed for a channel update if there are still pending InProgress updates for that channel. This matches the pure-async interface where an update cannot complete until all prior updates have also completed. Persist implementors are expected to be consistent (always sync or always async), so ChainMonitor naturally satisfies this contract without code changes. The mode tracking and panic checks from 0760f99 are removed and replaced with a panic that validates the new contract directly on the in-flight update state. Legacy tests that switch the persister between modes mid-flight can opt out via Node::disable_monitor_completeness_assertion(). The chanmon_consistency fuzz target is also updated to propagate monitor style changes to the live persister immediately rather than only on reload. Switching to Completed is guarded to only take effect when no channels have pending InProgress updates, satisfying the new Watch contract. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joostjager · 2026-02-24T19:25:45Z

Pushed two commits that remove mon_style and also save command bytes by toggling the style.

The only functional change is that nodes now always start in sync mode, but the fuzzer can immediately toggle it to async. On reload, the style is carried over.

TheBlueMatt

DId you run the fuzzer for a while? I'm not sure that we shouldn't be able to hit a failure like the below.

TheBlueMatt · 2026-02-24T20:21:27Z

lightning/src/chain/mod.rs

-	/// [`Persist`]/[`Watch`] without first restarting.
+	/// A [`Watch`] implementation must not return this for a channel update if there are still
+	/// pending [`InProgress`] updates for that channel. That is, an update can only be considered
+	/// complete once all prior updates have also completed.


Hmm, thinking about this a bit more there's a weird race. Async updates are "completed" only after release_pending_monitor_events returns the completed update, but also only after it has been processed, which may be some arbitrary and indeterminate amount of time later.

For the ChainMonitor return case, this is fine (though we need to update ChainMonitor - we should probably be returning InProgress for any new update that comes after one for which we have to return a spurious InProgress. Do you already intend to do that in a followup? should we not do it here?). But for the general API its pretty weird - you can in theory return a Completed after returning InProgress as long as you want some indeterminate and arbitrary amount of time, so in practice you can't...

We could fix this by adding some mutual exclusion in channelmanager.rs where we wait before processing Completed monitor updates until after any pending MonitorEvents are processed - this should be basically zero cost as ~no one is going to be using mixed-async-sync persistence in practice so we'll rarely if ever have any contention.

On the flip side, we could say that implementations are allowed to flip from Completed to InProgress for a channel, but never back (without a restart). Its more complicated to document, but it captures what we need to allow the ChainMonitor behavior.

implementations are allowed to flip from Completed to InProgress for a channel, but never back

This sounds like the easier way. There is no need to support mixing sync and async for Persist implementations. I do wonder now whether my initial version #4435 wasn't just sufficient. Moving the mixed mode check to the chainmonitor level and keeping everything else the same.

Fuzzer run was planned for tonight.

The mon_style array was a shadow copy of each node's persistence mode, kept in sync with the persister's update_ret. On node reload, it was used to restore the persistence mode to the new monitor. Instead, read the current update_ret directly from the old monitor's persister during reload, making the persister the single source of truth. This also removes the initial fuzz input byte that was used to seed mon_style, so all nodes now always start with Completed as their persistence mode (the fuzzer can still switch to InProgress via the 0x00-0x02 opcodes). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Replace the 6 separate opcodes (0x00-0x02 for InProgress, 0x04-0x06 for Completed) with 3 toggle opcodes (0x00-0x02) that flip the persistence style for each node. When toggling back to Completed, the existing guard that prevents switching while monitors are pending is preserved. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

joostjager force-pushed the revert-mixed-mode-check branch from 721f491 to 9aff3f2 Compare February 24, 2026 10:19

joostjager mentioned this pull request Feb 24, 2026

Move ChannelMonitorUpdateStatus mode check from ChannelManager to ChainMonitor #4435

Closed

joostjager force-pushed the revert-mixed-mode-check branch from 9aff3f2 to 93a9d79 Compare February 24, 2026 10:26

joostjager force-pushed the revert-mixed-mode-check branch 2 times, most recently from c20c819 to 77a6a30 Compare February 24, 2026 11:07

joostjager marked this pull request as ready for review February 24, 2026 12:07

joostjager requested a review from TheBlueMatt February 24, 2026 12:07

TheBlueMatt reviewed Feb 24, 2026

View reviewed changes

joostjager force-pushed the revert-mixed-mode-check branch from 77a6a30 to 218e89a Compare February 24, 2026 15:11

joostjager requested a review from TheBlueMatt February 24, 2026 17:29

TheBlueMatt reviewed Feb 24, 2026

View reviewed changes

joostjager force-pushed the revert-mixed-mode-check branch from 218e89a to 7e2b509 Compare February 24, 2026 19:24

TheBlueMatt reviewed Feb 24, 2026

View reviewed changes

joostjager and others added 2 commits February 24, 2026 22:12

joostjager force-pushed the revert-mixed-mode-check branch from 7e2b509 to c37be93 Compare February 24, 2026 21:22

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Comments

Replace dual-sync-async persistence panic with Watch contract#4436

Replace dual-sync-async persistence panic with Watch contract#4436
joostjager wants to merge 3 commits intolightningdevkit:mainfrom
joostjager:revert-mixed-mode-check

joostjager commented Feb 24, 2026 •

edited

Loading

Uh oh!

ldk-reviews-bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

joostjager commented Feb 24, 2026 •

edited

Loading

Uh oh!

codecov bot commented Feb 24, 2026 •

edited

Loading

Uh oh!

TheBlueMatt left a comment

Uh oh!

ldk-reviews-bot commented Feb 24, 2026

Uh oh!

joostjager commented Feb 24, 2026

Uh oh!

TheBlueMatt Feb 24, 2026

Uh oh!

joostjager Feb 24, 2026

Uh oh!

joostjager commented Feb 24, 2026 •

edited

Loading

Uh oh!

TheBlueMatt left a comment

Uh oh!

TheBlueMatt Feb 24, 2026

Uh oh!

joostjager Feb 24, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Comments

Conversation

joostjager commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ldk-reviews-bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

joostjager commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

codecov bot commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

ldk-reviews-bot commented Feb 24, 2026

Uh oh!

joostjager commented Feb 24, 2026

Uh oh!

TheBlueMatt Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

joostjager Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

joostjager commented Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheBlueMatt left a comment

Choose a reason for hiding this comment

Uh oh!

TheBlueMatt Feb 24, 2026

Choose a reason for hiding this comment

Uh oh!

joostjager Feb 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

joostjager commented Feb 24, 2026 •

edited

Loading

ldk-reviews-bot commented Feb 24, 2026 •

edited

Loading

joostjager commented Feb 24, 2026 •

edited

Loading

codecov bot commented Feb 24, 2026 •

edited

Loading

joostjager commented Feb 24, 2026 •

edited

Loading

joostjager Feb 24, 2026 •

edited

Loading