Commit Graph

83 Commits

Author SHA1 Message Date
Kent Overstreet
bbc3a0b17a bcachefs: fsck: Fix check_directory_structure when no check_dirents
check_directory_structure runs after check_dirents, so it expects that
it won't see any inodes with missing backpointers - normally.

But online fsck can't run check_dirents yet, or the user might only be
running a specific pass, so we need to be careful that this isn't an
error. If an inode is unreachable, that's handled by a separate pass.

Also, add a new 'bch2_inode_has_backpointer()' helper, since we were
doing this inconsistently.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-17 13:35:19 -04:00
Kent Overstreet
d47db3e636 bcachefs: Delete redundant fsck_err()
'inode_has_wrong_backpointer'; we have more specific errors for every
case afterwards.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-06-02 12:16:35 -04:00
Kent Overstreet
4ba99dde33 bcachefs: BCH_INODE_has_case_insensitive
Add a flag for tracking whether a directory has case-insensitive
descendents - so that overlayfs can disallow mounting, even though the
filesystem supports case insensitivity.

This is a new on disk format version, with a (cheap) upgrade to ensure
the flag is correctly set on existing inodes.

Create, rename and fssetxattr are all plumbed to ensure the new flag is
set, and we've got new fsck code that hooks into check_inode(0.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:10 -04:00
Kent Overstreet
77eac89c79 bcachefs: bch2_inode_find_by_inum_snapshot()
Move a fsck.c helper into inode.c, eliminate some duplicate and organize
the inode lookup helpers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:09 -04:00
Kent Overstreet
011d644b76 bcachefs: subvol_inum_eq()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:15:08 -04:00
Kent Overstreet
123d2d09ff bcachefs: bch2_inode_find_snapshot_root()
Factor out a small common helper.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:59 -04:00
Kent Overstreet
8d5ac187da bcachefs: Fix casefold opt via xattr interface
Changing the casefold option requires extra checks/work - factor out a
helper from bch2_fileattr_set() for the xattr code to use.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:09 -04:00
Kent Overstreet
b9e1f873d2 bcachefs: Casefold is now a regular opts.h option
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-24 19:09:00 -04:00
Kent Overstreet
9b0d00a369 bcachefs: Refactor bch2_check_dirent_target()
Prep work for calling bch2_check_dirent_target() from bch2_lookup().

- Add an inline wrapper, if the target and backpointer match we can skip
  the function call.

- We don't (yet?) want to remove the dirent we did the lookup from (when
  we find a directory or subvol with multiple valid dirents pointing to
  it), we can defer on that until later. For now, add an "are we in
  fsck?" parameter.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24 09:50:36 -04:00
Kent Overstreet
4be214c269 bcachefs: bch2_bkey_sectors_need_rebalance() now only depends on bch_extent_rebalance
Previously, bch2_bkey_sectors_need_rebalance() called
bch2_target_accepts_data(), checking whether the target is writable.

However, this means that adding or removing devices from a target would
change the value of bch2_bkey_sectors_need_rebalance() for an existing
extent; this needs to be invariant so that the extent trigger can
correctly maintain rebalance_work accounting.

Instead, check target_accepts_data() in io_opts_to_rebalance_opts(),
before creating the bch_extent_rebalance entry.

This fixes (one?) cause of rebalance_work accounting being off.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-06 22:35:11 -05:00
Kent Overstreet
df448ca355 bcachefs: bcachefs_metadata_version_persistent_inode_cursors
Persistent cursors for inode allocation.

A free inodes btree would add substantial overhead to inode allocation
and freeing - a "next num to allocate" cursor is always going to be
faster.

We just need it to be persistent, to avoid scanning the inodes btree
from the start on startup.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-01-09 23:38:41 -05:00
Kent Overstreet
59c50511f7 bcachefs: bcachefs_metadata_version_inode_depth
This adds a new inode field, bi_depth, for directory inodes: this allows
us to make the check_directory_structure pass much more efficient.

Currently, to ensure the filesystem is fully connect and has no loops,
for every directory we follow backpointers until we find the root. But
by adding a depth counter, it sufficies to only check the parent of each
directory, and check that the parent's bi_depth is smaller.

(fsck doesn't require that bi_depth = parent->bi_depth + 1; if a rename
causes bi_depth off, but the chain to the root is still strictly
decreasing, then the algorithm still works and there's no need for fsck
to fixup the bi_depth fields).

We've already checked backpointers, so we know that every directory
(excluding the root)has a valid parent: if bi_depth is always
decreasing, every chain must terminate, and terminate at the root
directory.

bi_depth will not necessarily be correct when fsck runs, due to
directory renames - we can't change bi_depth on every child directory
when renaming a directory. That's ok; fsck will silently fix the
bi_depth field as needed, and future fsck runs will be much faster.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-29 13:30:39 -05:00
Kent Overstreet
a6f4794fcd bcachefs: struct bkey_validate_context
Add a new parameter to bkey validate functions, and use it to improve
invalid bkey error messages: we can now print the btree and depth it
came from, or if it came from the journal, or is a btree root.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:20 -05:00
Kent Overstreet
4ae6bbb522 bcachefs: bch2_write_inode() now checks for changing rebalance options
Previously, BCHFS_IOC_REINHERIT_ATTRS didn't trigger rebalance scans
when changing rebalance options - it had been missed, only the xattr
interface triggered them.

Ideally they'd be done by the transactional trigger, but unpacking the
inode to get the options is too heavy to be done in the low level
trigger - the inode trigger is run on every extent update, since the
bch_inode.bi_journal_seq has to be updated for fsync.

bch2_write_inode() is a good compromise, it already unpacks and repacks
and is not run in any super-fast paths.

Additionally, creating the new rebalance entry to trigger the scan is
now done in the same transaction as the inode update that changed the
options.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:16 -05:00
Kent Overstreet
de92b1ee67 bcachefs: bch2_inode_should_have_bp -> bch2_inode_should_have_single_bp
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:14 -05:00
Kent Overstreet
78cf0ae636 bcachefs: INODE_STR_HASH() for bch_inode_unpacked
Trivial cleanup - add a normal BITMASK() helper for bch_inode_unpacked.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-18 00:49:48 -04:00
Kent Overstreet
9b23fdbd5d bcachefs: bcachefs_metadata_version_inode_has_child_snapshots
There's an inherent race in taking a snapshot while an unlinked file is
open, and then reattaching it in the child snapshot.

In the interior snapshot node the file will appear unlinked, as though
it should be deleted - it's not referenced by anything in that snapshot
- but we can't delete it, because the file data is referenced by the
child snapshot.

This was being handled incorrectly with
propagate_key_to_snapshot_leaves() - but that doesn't resolve the
fundamental inconsistency of "this file looks like it should be deleted
according to normal rules, but - ".

To fix this, we need to fix the rule for when an inode is deleted. The
previous rule, ignoring snapshots (there was no well-defined rule
for with snapshots) was:
  Unlinked, non open files are deleted, either at recovery time or
  during online fsck

The new rule is:
  Unlinked, non open files, that do not exist in child snapshots, are
  deleted.

To make this work transactionally, we add a new inode flag,
BCH_INODE_has_child_snapshot; it overrides BCH_INODE_unlinked when
considering whether to delete an inode, or put it on the deleted list.

For transactional consistency, clearing it handled by the inode trigger:
when deleting an inode we check if there are parent inodes which can now
have the BCH_INODE_has_child_snapshot flag cleared.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-09 16:42:51 -04:00
Kent Overstreet
1f73cb4d34 bcachefs: Add warn param to subvol_get_snapshot, peek_inode
These shouldn't always be fatal errors - logged op resume, in
particular, and we want it as a parameter there.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
72350ee0ea bcachefs: Kill snapshot arg to fsck_write_inode()
It was initially believed that it would be better to be explicit about
the snapshot we're updating when writing inodes in fsck; however, it
turns out that passing around the snapshot separately is more error
prone and we're usually updating the inode in the same snapshow we read
it from.

This is different from normal filesystem paths, where we do the update
in the snapshot of the subvolume we're in.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-10-04 20:25:32 -04:00
Kent Overstreet
2a1df87346 bcachefs: Add snapshot to bch_inode_unpacked
this allows for various cleanups in fsck

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-27 21:46:34 -04:00
Kent Overstreet
d97de0d017 bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
bkey_fsck_err() was added as an interface that looks like fsck_err(),
but previously all it did was ensure that the appropriate error counter
was incremented in the superblock.

This is a cleanup and bugfix patch that converts it to a wrapper around
fsck_err(). This is needed to fix an issue with the upgrade path to
disk_accounting_v3, where the "silent fix" error list now includes
bkey_fsck errors; fsck_err() handles this in a unified way, and since we
need to change printing of bkey fsck errors from the caller to the inner
bkey_fsck_err() calls, this ends up being a pretty big change.

Als,, rename .invalid() methods to .validate(), for clarity, while we're
changing the function signature anyways (to drop the printbuf argument).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13 23:00:50 -04:00
Kent Overstreet
7b6dda7282 bcachefs: drop packed, aligned from bkey_inode_buf
Unnecessary here, and this broke the rust bindings:

error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29025:1
      |
29025 | pub struct bkey_i_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^^^^
      |
note: `bch_inode_v3` has a `#[repr(align)]` attribute
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
      |
8949  | pub struct bch_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^

error[E0588]: packed type cannot transitively contain a `#[repr(align)]` type
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32826:1
      |
32826 | pub struct bkey_inode_buf {
      | ^^^^^^^^^^^^^^^^^^^^^^^^^
      |
note: `bch_inode_v3` has a `#[repr(align)]` attribute
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:8949:1
      |
8949  | pub struct bch_inode_v3 {
      | ^^^^^^^^^^^^^^^^^^^^^^^
note: `bkey_inode_buf` contains a field of type `bkey_i_inode_v3`
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:32827:9
      |
32827 |     pub inode: bkey_i_inode_v3,
      |         ^^^^^
note: ...which contains a field of type `bch_inode_v3`
     --> /build/source/target/release/build/bch_bindgen-9445b24c90aca2a3/out/bcachefs.rs:29027:9
      |
29027 |     pub v: bch_inode_v3,
      |         ^

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:16 -04:00
Kent Overstreet
65eaf4e24a bcachefs: s/bkey_invalid_flags/bch_validate_flags
We're about to start using bch_validate_flags for superblock section
validation - it's no longer bkey specific.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 16:23:36 -04:00
Kent Overstreet
4da1713a8d bcachefs: check for inodes that should have backpointers in fsck
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:21 -04:00
Nathan Chancellor
2d288745eb bcachefs: Fix type of flags parameter for some ->trigger() implementations
When building with clang's -Wincompatible-function-pointer-types-strict
(a warning designed to catch potential kCFI failures at build time),
there are several warnings along the lines of:

  fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict]
    118 |         BCH_BKEY_TYPES()
        |         ^~~~~~~~~~~~~~~~
  fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES'
    394 |         x(inode,                8)                      \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
    117 | #define x(name, nr) [KEY_TYPE_##name]   = bch2_bkey_ops_##name,
        |                                           ^~~~~~~~~~~~~~~~~~~~
  <scratch space>:277:1: note: expanded from here
    277 | bch2_bkey_ops_inode
        | ^~~~~~~~~~~~~~~~~~~
  fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode'
     26 |         .trigger        = bch2_trigger_inode,           \
      |                           ^~~~~~~~~~~~~~~~~~

There are several functions that did not have their flags parameter
converted to 'enum btree_iter_update_trigger_flags' in the recent
unification, which will cause kCFI failures at runtime because the
types, while ABI compatible (hence no warning from the non-strict
version of this warning), do not match exactly.

Fix up these functions (as well as a few other obvious functions that
should have it, even if there are no warnings currently) to resolve the
warnings and potential kCFI runtime failures.

Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:21 -04:00
Kent Overstreet
5dd8c60e1e bcachefs: iter/update/trigger/str_hash flag cleanup
Combine iter/update/trigger/str_hash flags into a single enum, and
x-macroize them for a to_text() function later.

These flags are all for a specific iter/key/update context, so it makes
sense to group them together - iter/update/trigger flags were already
given distinct bits, this cleans up and unifies that handling.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:18 -04:00
Kent Overstreet
688a769409 bcachefs: Pass inode bkey to check_path()
prep work for improving logging/error messages

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
4c20278eb1 bcachefs: Check subvol <-> inode pointers in check_subvol()
Subvolumes and subvolume root inodes point to each other: this verifies
the subvolume -> inode -> subvolme path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
69c8e6ce02 bcachefs: move fsck_write_inode() to inode.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Kent Overstreet
f0431c5f47 bcachefs: Combine .trans_trigger, .atomic_trigger
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:20 -05:00
Kent Overstreet
08bc959010 bcachefs: unify inode trigger
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
ad00bce07d bcachefs: mark now takes bkey_s
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
717296c34c bcachefs: trans_mark now takes bkey_s
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
103ffe9aaf bcachefs: x-macro-ify inode flags enum
This lets us use bch2_prt_bitflags to print them out.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:12:18 -05:00
Kent Overstreet
4bd156c4b4 bcachefs: Fix bch2_delete_dead_inodes()
- the fsck_err() check for the filesystem being clean was incorrect,
   causing us to always fail to delete unlinked inodes
 - if a snapshot had been taken, the unlinked inode needs to be
   propagated to snapshot leaves so the unlink can happen there - fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04 22:19:13 -04:00
Kent Overstreet
b65db750e2 bcachefs: Enumerate fsck errors
This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.

Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01 21:11:08 -04:00
Kent Overstreet
55c11a159d bcachefs: bch2_inum_opts_get()
New helper for new rebalance code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:38 -04:00
Kent Overstreet
da187cacb8 bcachefs: Kill missing inode warnings in bch2_quota_read()
bch2_quota_read(), when scanning for inodes, may attempt to look up
inodes that have been deleted in the main subvolume - this is not an
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:12 -04:00
Kent Overstreet
dde8cb1164 bcachefs: bcachefs_metadata_version_deleted_inodes
Add a new bitset btree for inodes pending deletion; this means we no
longer have to scan the full inodes btree after an unclean shutdown.

Specifically, this adds:
 - a trigger to update the deleted_inodes btree based on changes to the
   inodes btree
 - a new recovery pass
 - and check_inodes is now only a fsck pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
7904c82cea bcachefs: Move fsck_inode_rm() to inode.c
Prep work for the new deleted inodes btree

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
4dc5bb9adf bcachefs: move inode triggers to inode.c
bit of reorg

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:08 -04:00
Kent Overstreet
8726dc936f bcachefs: Change check for invalid key types
As part of the forward compatibility patch series, we need to allow for
new key types without complaining loudly when running an old version.

This patch changes the flags parameter of bkey_invalid to an enum, and
adds a new flag to indicate we're being called from the transaction
commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:06 -04:00
Kent Overstreet
174f930b8e bcachefs: bkey_ops.min_val_size
This adds a new field to bkey_ops for the minimum size of the value,
which standardizes that check and also enforces the new rule (previously
done somewhat ad-hoc) that we can extend value types by adding new
fields on to the end.

To make that work we do _not_ initialize min_val_size with sizeof,
instead we initialize it to the size of the first version of those
values.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:00 -04:00
Kent Overstreet
facafdcbc1 bcachefs: Change bkey_invalid() rw param to flags
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:52 -04:00
Kent Overstreet
8dd69d9f64 bcachefs: KEY_TYPE_inode_v3, metadata_version_inode_v3
Move bi_size and bi_sectors into the non-varint portion of the inode, so
that the write path can update them without going through the relatively
expensive unpack/pack operations.

Other changes:
 - Add a field for the offset of the varint section, so we can add new
   non-varint fields without needing a new inode type, like alloc_v3
 - Move bi_mode into the flags field, so that the varint section can be
   u64 aligned

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:51 -04:00
Kent Overstreet
01ad673727 bcachefs: bch2_inode_opts_get()
This improves io_opts() and makes it a non-inline function - it's big
enough that it probably shouldn't be.

Also, bch_io_opts no longer needs fields for whether options are
defined, so we can slim it down a bit.

We'd like to stop passing around the full bch_io_opts, but that'll be
tricky because of bch2_rebalance_add_key().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:49 -04:00
Kent Overstreet
abb936fb9f bcachefs: Improve bch2_inode_opts_to_opts()
It turns out the *_defined entries of bch_io_opts are only used in one
place - in the xattr get path - and there we immediately convert to a
bch_opts struct, which also has the *_defined entries.

This patch changes bch2_inode_opts_to_opts() to go directly from
bch_inode_unpacked to bch_opts, which is a minor simplification and will
also let us slim down struct bch_io_opts in another patch.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:46 -04:00
Kent Overstreet
a101957649 bcachefs: More style fixes
Fixes for various checkpatch errors.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:45 -04:00
Kent Overstreet
fd0c767966 bcachefs: Convert to __packed and __aligned
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:45 -04:00
Kent Overstreet
77671e8fff bcachefs: Move bkey bkey_unpack_key() to bkey.h
Long ago, bkey_unpack_key() was added to bset.h instead of bkey.h
because bkey.h didn't include btree_types.h, which it needs for the
compiled unpack function.

This patch finally moves it to the proper location.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:45 -04:00