Commit Graph

141 Commits

Author SHA1 Message Date
Kent Overstreet
d97de0d017 bcachefs: Make bkey_fsck_err() a wrapper around fsck_err()
bkey_fsck_err() was added as an interface that looks like fsck_err(),
but previously all it did was ensure that the appropriate error counter
was incremented in the superblock.

This is a cleanup and bugfix patch that converts it to a wrapper around
fsck_err(). This is needed to fix an issue with the upgrade path to
disk_accounting_v3, where the "silent fix" error list now includes
bkey_fsck errors; fsck_err() handles this in a unified way, and since we
need to change printing of bkey fsck errors from the caller to the inner
bkey_fsck_err() calls, this ends up being a pretty big change.

Als,, rename .invalid() methods to .validate(), for clarity, while we're
changing the function signature anyways (to drop the printbuf argument).

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-08-13 23:00:50 -04:00
Kent Overstreet
a850bde649 bcachefs: fsck_err() may now take a btree_trans
fsck_err() now optionally takes a btree_trans; if the current thread has
one, it is required that it be passed.

The next patch will use this to unlock when waiting for user input.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:14 -04:00
Kent Overstreet
fb23d57a6d bcachefs: Convert gc to new accounting
Rewrite fsck/gc for the new accounting scheme.

This adds a second set of in-memory accounting counters for gc to use;
like with other parts of gc we run all trigger in TRIGGER_GC mode, then
compare what we calculated to existing in-memory accounting at the end.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:13 -04:00
Kent Overstreet
1d16c605cc bcachefs: Disk space accounting rewrite
Main part of the disk accounting rewrite.

This is a wholesale rewrite of the existing disk space accounting, which
relies on percepu counters that are sharded by journal buffer, and
rolled up and added to each journal write.

With the new scheme, every set of counters is a distinct key in the
accounting btree; this fixes scaling limitations of the old scheme,
where counters took up space in each journal entry and required multiple
percpu counters.

Now, in memory accounting requires a single set of percpu counters - not
multiple for each in flight journal buffer - and in the future we'll
probably also have counters that don't use in memory percpu counters,
they're not strictly required.

An accounting update is now a normal btree update, using the btree write
buffer path. At transaction commit time, we apply accounting updates to
the in memory counters, which are percpu counters indexed in an
eytzinger tree by the accounting key.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:13 -04:00
Kent Overstreet
3811f48aa3 bcachefs: bch2_printbuf_strip_trailing_newline()
Add a new helper to fix inode_to_text()

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:12 -04:00
Youling Tang
da6fa380d3 bcachefs: Align the display format of btrees/inodes/keys
Before patch:
```
 #cat btrees/inodes/keys
 u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0:   mode=40755
   flags= (16300000)
   bi_size=0
```

After patch:
```
 #cat btrees/inodes/keys
 u64s 17 type inode_v3 0:4096:U32_MAX len 0 ver 0:
   mode=40755
   flags=(16300000)
   bi_size=0
```

Signed-off-by: Youling Tang <tangyouling@kylinos.cn>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-07-14 19:00:12 -04:00
Kent Overstreet
65eaf4e24a bcachefs: s/bkey_invalid_flags/bch_validate_flags
We're about to start using bch_validate_flags for superblock section
validation - it's no longer bkey specific.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-09 16:23:36 -04:00
Nathan Chancellor
2d288745eb bcachefs: Fix type of flags parameter for some ->trigger() implementations
When building with clang's -Wincompatible-function-pointer-types-strict
(a warning designed to catch potential kCFI failures at build time),
there are several warnings along the lines of:

  fs/bcachefs/bkey_methods.c:118:2: error: incompatible function pointer types initializing 'int (*)(struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, enum btree_iter_update_trigger_flags)' with an expression of type 'int (struct btree_trans *, enum btree_id, unsigned int, struct bkey_s_c, struct bkey_s, unsigned int)' [-Werror,-Wincompatible-function-pointer-types-strict]
    118 |         BCH_BKEY_TYPES()
        |         ^~~~~~~~~~~~~~~~
  fs/bcachefs/bcachefs_format.h:394:2: note: expanded from macro 'BCH_BKEY_TYPES'
    394 |         x(inode,                8)                      \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~
  fs/bcachefs/bkey_methods.c:117:41: note: expanded from macro 'x'
    117 | #define x(name, nr) [KEY_TYPE_##name]   = bch2_bkey_ops_##name,
        |                                           ^~~~~~~~~~~~~~~~~~~~
  <scratch space>:277:1: note: expanded from here
    277 | bch2_bkey_ops_inode
        | ^~~~~~~~~~~~~~~~~~~
  fs/bcachefs/inode.h:26:13: note: expanded from macro 'bch2_bkey_ops_inode'
     26 |         .trigger        = bch2_trigger_inode,           \
      |                           ^~~~~~~~~~~~~~~~~~

There are several functions that did not have their flags parameter
converted to 'enum btree_iter_update_trigger_flags' in the recent
unification, which will cause kCFI failures at runtime because the
types, while ABI compatible (hence no warning from the non-strict
version of this warning), do not match exactly.

Fix up these functions (as well as a few other obvious functions that
should have it, even if there are no warnings currently) to resolve the
warnings and potential kCFI runtime failures.

Fixes: 31e4ef3280c8 ("bcachefs: iter/update/trigger/str_hash flag cleanup")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:21 -04:00
Kent Overstreet
5dd8c60e1e bcachefs: iter/update/trigger/str_hash flag cleanup
Combine iter/update/trigger/str_hash flags into a single enum, and
x-macroize them for a to_text() function later.

These flags are all for a specific iter/key/update context, so it makes
sense to group them together - iter/update/trigger flags were already
given distinct bits, this cleans up and unifies that handling.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:18 -04:00
Kent Overstreet
7423330e30 bcachefs: prt_printf() now respects \r\n\t
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-05-08 17:29:17 -04:00
Kent Overstreet
c258c08add bcachefs: fix integer conversion bug
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-04-28 21:34:29 -04:00
Thomas Bertschinger
688d750d10 bcachefs: fix misplaced newline in __bch2_inode_unpacked_to_text()
before:

u64s 18 type inode_v3 0:1879048192:U32_MAX len 0 ver 0:   mode=40700
  flags= (15300000)
  journal_seq=4
  bi_size=0
  bi_sectors=0

  bi_version=0bi_atime=227064388944
  ...

after:

u64s 18 type inode_v3 0:1879048192:U32_MAX len 0 ver 0:   mode=40700
  flags= (15300000)
  journal_seq=4
  bi_size=0
  bi_sectors=0
  bi_version=0
  bi_atime=227064388944
  ...

Signed-off-by: Thomas Bertschinger <tahbertschinger@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-31 20:36:10 -04:00
Kent Overstreet
5d04409a62 bcachefs: Always flush write buffer in delete_dead_inodes()
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:26 -04:00
Kent Overstreet
506b187603 bcachefs: bch2_btree_bit_mod -> bch2_btree_bit_mod_buffered
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:24 -04:00
Kent Overstreet
4c20278eb1 bcachefs: Check subvol <-> inode pointers in check_subvol()
Subvolumes and subvolume root inodes point to each other: this verifies
the subvolume -> inode -> subvolme path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-13 21:22:23 -04:00
Kent Overstreet
69c8e6ce02 bcachefs: move fsck_write_inode() to inode.c
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:09 -04:00
Guoyu Ou
0be5b38bce bcachefs: skip invisible entries in empty subvolume checking
When we are checking whether a subvolume is empty in the specified snapshot,
entries that do not belong to this subvolume should be skipped.

This fixes the following case:

    $ bcachefs subvolume create ./sub
    $ cd sub
    $ bcachefs subvolume create ./sub2
    $ bcachefs subvolume snapshot . ./snap
    $ ls -a snap
    . ..
    $ rmdir snap
    rmdir: failed to remove 'snap': Directory not empty

As Kent suggested, we pass 0 in may_delete_deleted_inode() to ignore subvols
in the subvol we are checking, because inode.bi_subvol is only set on
subvolume roots, and we can't go through every inode in the subvolume and
change bi_subvol when taking a snapshot. It makes the check less strict, but
that's ok, the rest of fsck will still catch it.

Signed-off-by: Guoyu Ou <benogy@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-03-10 15:34:07 -04:00
Kent Overstreet
249f441f83 bcachefs: Improve inode_to_text()
Add line breaks - inode_to_text() is now much easier to read.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 13:27:11 -05:00
Kent Overstreet
8e7834a883 bcachefs: bch_fs_usage_base
Split out base filesystem usage into its own type; prep work for
breaking up bch2_trans_fs_usage_apply().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 06:01:45 -05:00
Kent Overstreet
38c23fb809 bcachefs: BTREE_TRIGGER_ATOMIC
Add a new flag to be explicit about when we're running atomic triggers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-21 06:01:45 -05:00
Kent Overstreet
08bc959010 bcachefs: unify inode trigger
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
ad00bce07d bcachefs: mark now takes bkey_s
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
717296c34c bcachefs: trans_mark now takes bkey_s
Prep work for disk space accounting rewrite: we're going to want to use
a single callback for both of our current triggers, so we need to change
them to have the same type signature first.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-05 23:24:19 -05:00
Kent Overstreet
80eab7a7c2 bcachefs: for_each_btree_key() now declares loop iter
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:42 -05:00
Kent Overstreet
cf904c8d96 bcachefs: bch_err_(fn|msg) check if should print
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:41 -05:00
Kent Overstreet
27b2df982f bcachefs: Kill for_each_btree_key()
for_each_btree_key() handles transaction restarts, like
for_each_btree_key2(), but only calls bch2_trans_begin() after a
transaction restart - for_each_btree_key2() wraps every loop iteration
in a transaction.

The for_each_btree_key() behaviour is problematic when it leads to
holding the SRCU lock that prevents key cache reclaim for an unbounded
amount of time - there's no real need to keep it around.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:40 -05:00
Kent Overstreet
183bcc89b8 bcachefs: Clean up btree write buffer write ref handling
__bch2_btree_write_buffer_flush() now assumes a write ref is already
held (as called by the transaction commit path); and the wrappers
bch2_write_buffer_flush() and flush_sync() take an explicit write ref.

This means internally the write buffer code can always use
BTREE_INSERT_NOCHECK_RW, instead of in the previous code passing flags
around and hoping the NOCHECK_RW flag was always carried around
correctly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:39 -05:00
Kent Overstreet
3c471b6588 bcachefs: convert bch_fs_flags to x-macro
Now we can print out filesystem flags in sysfs, useful for debugging
various "what's my filesystem doing" issues.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:38 -05:00
Kent Overstreet
cb52d23e77 bcachefs: Rename BTREE_INSERT flags
BTREE_INSERT flags are actually transaction commit flags - rename them
for clarity.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:37 -05:00
Kent Overstreet
6d1980f0af bcachefs: Fix deleted inode check for dirs
We could delete directories transactionally on rmdir()/unlink(), but we
don't; instead, like with regular files we wait for the VFS to call
evict().

That means that our check for directories in the deleted inodes btree is
wrong - the check should be for non-empty directories.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-12-08 00:39:56 -05:00
Kent Overstreet
f42fa17883 bcachefs: Fix missing transaction commit
In may_delete_deleted_inode(), there's a corner case when a snapshot was
taken while we had an unlinked inode: we don't want to delete the inode
in the internal (shared) snapshot node, since it might have been
reattached in a descendent snapshot.

Instead we propagate the key to any snapshot leaves it doesn't exist in,
so that it can be deleted there if necessary, and then clear the
unlinked flag in the internal node.

But we forgot to commit after clearing the unlinked flag, causing us to
go into an infinite loop.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
069749688e bcachefs: Fix iterator leak in may_delete_deleted_inode()
may_delete_deleted_inode() was returning without exiting a btree
iterator, eventually causing propagate_key_to_snaphot_leaves() to go
into an infinite loop hitting btree_trans_too_many_iters().

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-14 23:44:43 -05:00
Kent Overstreet
103ffe9aaf bcachefs: x-macro-ify inode flags enum
This lets us use bch2_prt_bitflags to print them out.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-05 13:12:18 -05:00
Kent Overstreet
4bd156c4b4 bcachefs: Fix bch2_delete_dead_inodes()
- the fsck_err() check for the filesystem being clean was incorrect,
   causing us to always fail to delete unlinked inodes
 - if a snapshot had been taken, the unlinked inode needs to be
   propagated to snapshot leaves so the unlink can happen there - fixed.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04 22:19:13 -04:00
Kent Overstreet
b65db750e2 bcachefs: Enumerate fsck errors
This patch adds a superblock error counter for every distinct fsck
error; this means that when analyzing filesystems out in the wild we'll
be able to see what sorts of inconsistencies are being found and repair,
and hence what bugs to look for.

Errors validating bkeys are not yet considered distinct fsck errors, but
this patch adds a new helper, bkey_fsck_err(), in order to add distinct
error types for them as well.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-01 21:11:08 -04:00
Kent Overstreet
55c11a159d bcachefs: bch2_inum_opts_get()
New helper for new rebalance code

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:38 -04:00
Kent Overstreet
6ddedca218 bcachefs: Guard against unknown compression options
Since compression options now include compression level, proper
validation is a bit more involved.

This adds bch2_compression_opt_valid(), and plumbs it around
appropriately.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-31 12:18:37 -04:00
Kent Overstreet
4220666398 bcachefs: Fix bch2_inode_delete_keys()
bch2_inode_delete_keys() was using BTREE_ITER_NOT_EXTENTS, on the
assumption that it would never need to split extents.

But that caused a race with extents being split by other threads -
specifically, the data move path. Extents iterators have the iterator
position pointing to the start of the extent, which avoids the race.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:15 -04:00
Kent Overstreet
d2a990d1b1 bcachefs: bch_err_msg(), bch_err_fn() now filters out transaction restart errors
These errors aren't actual errors, and should never be printed - do this
in the common helpers.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:15 -04:00
Kent Overstreet
6bd68ec266 bcachefs: Heap allocate btree_trans
We're using more stack than we'd like in a number of functions, and
btree_trans is the biggest object that we stack allocate.

But we have to do a heap allocatation to initialize it anyways, so
there's no real downside to heap allocating the entire thing.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:13 -04:00
Kent Overstreet
96dea3d599 bcachefs: Fix W=12 build errors
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:13 -04:00
Kent Overstreet
da187cacb8 bcachefs: Kill missing inode warnings in bch2_quota_read()
bch2_quota_read(), when scanning for inodes, may attempt to look up
inodes that have been deleted in the main subvolume - this is not an
error.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:12 -04:00
Kent Overstreet
8e877caaad bcachefs: Split out snapshot.c
subvolume.c has gotten a bit large, this splits out a separate file just
for managing snapshot trees - BTREE_ID_snapshots.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:11 -04:00
Kent Overstreet
c036359197 bcachefs: Check for directories in deleted inodes btree
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:10 -04:00
Kent Overstreet
e691b391f0 bcachefs: Add logging to bch2_inode_peek() & related
Add error messages when we fail to lookup an inode, and also add a few
missing bch2_err_class() calls.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
dde8cb1164 bcachefs: bcachefs_metadata_version_deleted_inodes
Add a new bitset btree for inodes pending deletion; this means we no
longer have to scan the full inodes btree after an unclean shutdown.

Specifically, this adds:
 - a trigger to update the deleted_inodes btree based on changes to the
   inodes btree
 - a new recovery pass
 - and check_inodes is now only a fsck pass.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
7904c82cea bcachefs: Move fsck_inode_rm() to inode.c
Prep work for the new deleted inodes btree

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:09 -04:00
Kent Overstreet
4dc5bb9adf bcachefs: move inode triggers to inode.c
bit of reorg

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:08 -04:00
Kent Overstreet
8726dc936f bcachefs: Change check for invalid key types
As part of the forward compatibility patch series, we need to allow for
new key types without complaining loudly when running an old version.

This patch changes the flags parameter of bkey_invalid to an enum, and
adds a new flag to indicate we're being called from the transaction
commit path.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:06 -04:00
Kent Overstreet
e47a390aa5 bcachefs: Convert -ENOENT to private error codes
As with previous conversions, replace -ENOENT uses with more informative
private error codes.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:03 -04:00