Commit Graph

17 Commits

Author SHA1 Message Date
Kent Overstreet
1cddad0fcb bcachefs: Call bch2_fs_init_rw() early if we'll be going rw
kthread creation checks for pending signals, which is _very_ annoying if
we have to do a long recovery and don't go rw until we've done
significant work.

Check if we'll be going rw and pre-allocate kthreads/workqueues.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-16 19:04:48 -04:00
Kent Overstreet
0942b852d4 bcachefs: BCH_RECOVERY_PASS_NO_RATELIMIT
Add a superblock flag to temporarily disable ratelimiting for a recovery
pass.

This will be used to make check_key_has_snapshot safer: we don't want to
delete a key for a missing snapshot unless we know that the snapshots
and subvolumes btrees are consistent, i.e. check_snapshots and
check_subvols have run recently.

Changing those btrees - creating/deleting a subvolume or snapshot - will
set the "disable ratelimit" flag, i.e. ensuring that those passes run if
check_key_has_snapshot discovers an error.

We're only disabling ratelimiting in the snapshot/subvol delete paths,
we're not so concerned about the create paths.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:36 -04:00
Kent Overstreet
a2ffab0e65 bcachefs: bch2_require_recovery_pass()
Add a helper for requiring that a recovery pass has already run: either
run it directly, if we're still in recovery, or if we're not in recovery
check if it has run recently and schedule it if it hasn't.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
06977ea82b bcachefs: Run recovery passes asynchronously
When we request a recovery pass to be run online, i.e. not during
recovery, if it's an online pass it'll now be run in the background,
instead of waiting for the next mount.

To avoid situations where recovery passes are running continuously, this
also includes ratelimiting: if the RUN_RECOVERY_PASS_ratelimit flag is
passed, the pass may be deferred until later - depending on the runtime
and last run stats in the recovery_passes superblock section.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:04 -04:00
Kent Overstreet
d4b30ed90c bcachefs: bch2_run_explicit_recovery_pass() cleanup
Consolidate the run_explicit_recovery_pass() interfaces by adding a
flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit
flag.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:04 -04:00
Kent Overstreet
06266465cc bcachefs: bch2_recovery_pass_status_to_text()
Show recovery pass status in sysfs - important now that we're running
them automatically in the background.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
ab35552030 bcachefs: __bch2_run_recovery_passes()
Consolidate bch2_run_recovery_passes() and
bch2_run_online_recovery_passes(), prep work for automatically
scheduling and running recovery passes in the background.

- Now takes a mask of which passes to run, automatic background repair
  will pass in sb.recovery_passes_required.

- Skips passes that are failing: a pass that failed may be reattempted
  after another pass succeeds (some passes depend on repair done by
  other passes for successful completion).

- bch2_recovery_passes_match() helper to skip alloc passes on a
  filesystem without alloc info.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:03 -04:00
Kent Overstreet
001c1d146f bcachefs: online_fsck_mutex -> run_recovery_passes_lock
Prep work for automatically running recovery passes asynchronously.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:53 -04:00
Kent Overstreet
e21f997721 bcachefs: bch_sb_field_recovery_passes
New superblock section for statistics on recovery passes - last time
ran (successfully), last runtime.

This will be used by self healing code to determine when to kick off
potentially expensive recovery passes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:53 -04:00
Kent Overstreet
7677859a47 bcachefs: Run most explicit recovery passes persistent
If we detect an error that requires running a recovery pass, and we're
not in recovery, we won't be able to fix it until the next mount - make
sure we're noting in the superblock that it needs to run.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:37 -04:00
Kent Overstreet
aff2b6a7fc bcachefs: provide unlocked version of run_explicit_recovery_pass_persistent
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:36 -04:00
Kent Overstreet
600a9207c8 bcachefs: Plumb printbuf through bch2_btree_lost_data()
Part of the ongoing project to improve error messages by building them
up in printbufs and emitting them all at once, so that we can easily see
what events are related in the log.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:34 -04:00
Kent Overstreet
300904700f bcachefs: kill bch2_run_explicit_recovery_pass_persistent()
No longer has users, so we can kill it and rename
bch2_run_explicit_recovery_pass_persistent_locked().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:33 -04:00
Kent Overstreet
bb36a12921 bcachefs: bch2_run_explicit_recovery_pass_printbuf()
We prefer helpers that emit log messages to printbufs rather than
printing them directly; that way, we can ensure that different log
messages from the same event are grouped together and formatted
appropriately in the dmesg log.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:16 -04:00
Kent Overstreet
e3c43dbe8e bcachefs: bch2_btree_lost_data() now uses run_explicit_rceovery_pass_persistent()
Also get a bit more fine grained about which passes to run for which
btrees.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:15 -05:00
Kent Overstreet
060ff30a85 bcachefs: bch2_run_explicit_recovery_pass_persistent()
Flag that we need to run a recovery pass and run it - persistenly, so if
we crash it'll still get run.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31 20:36:12 -04:00
Kent Overstreet
d2554263ad bcachefs: Split out recovery_passes.c
We've grown a fair amount of code for managing recovery passes; tracking
which ones we're running, which ones need to be run, and flagging in the
superblock which ones need to be run on the next recovery.

So it's worth splitting out into its own file, this code is pretty
different from the code in recovery.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31 20:36:11 -04:00