linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-01 15:14:52 +00:00

Author	SHA1	Message	Date
Kent Overstreet	e58f963cec	bcachefs: helpers for printing data types We need bounds checking since new versions may introduce new data types. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-21 06:01:45 -05:00
Kent Overstreet	f5d4481c3e	bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent() This is useful for btree ptrs as well, when we're just updating sectors_written. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:46 -05:00
Kent Overstreet	4f9ec59f8f	bcachefs: unify extent trigger Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	5a82ec3fea	bcachefs: bch2_trigger_stripe_ptr() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	1f34c21bc6	bcachefs: bch2_trigger_pointer() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	f4f78779bb	bcachefs: move stripe triggers to ec.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	6820ac2cdc	bcachefs: move bch2_mark_alloc() to alloc_background.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	6cacd0c414	bcachefs: unify reservation trigger Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:20 -05:00
Kent Overstreet	282e7c37eb	bcachefs: kill mem_trigger_run_overwrite_then_insert() now that type signatures are unified, redundant Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:19 -05:00
Kent Overstreet	ad00bce07d	bcachefs: mark now takes bkey_s Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:19 -05:00
Kent Overstreet	717296c34c	bcachefs: trans_mark now takes bkey_s Prep work for disk space accounting rewrite: we're going to want to use a single callback for both of our current triggers, so we need to change them to have the same type signature first. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-05 23:24:19 -05:00
Kent Overstreet	41b84fb489	bcachefs: for_each_member_device_rcu() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:42 -05:00
Kent Overstreet	9fea2274f7	bcachefs: for_each_member_device() now declares loop iter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:42 -05:00
Kent Overstreet	cf904c8d96	bcachefs: bch_err_(fn\|msg) check if should print Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:41 -05:00
Kent Overstreet	0d963a635d	bcachefs: Move reflink_p triggers into reflink.c Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:40 -05:00
Kent Overstreet	9e243d3cda	bcachefs: Kill journal_seq/gc args to bch2_dev_usage_update_m() This is only used by gc (fsck). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	9b34f02cdc	bcachefs: Kill dev_usage->buckets_ec This counter is redundant; it's simply the sum of BCH_DATA_stripe and BCH_DATA_parity buckets. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	ed0cd515cd	bcachefs: bch2_dev_usage_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	dafff7e575	bcachefs: New bucket sector count helpers This introduces bch2_bucket_sectors() and bch2_bucket_sectors_dirty(), prep work for separately accounting stripe sectors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	3b05b8e082	bcachefs: Simplify check_bucket_ref() We only need the sector count being modified. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	25f64e997e	bcachefs: Don't use update_cached_sectors() in bch2_mark_alloc() bch2_update_cached_sectors_list() is closer to how the new disk space accounting works, called from trans_mark(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	086a52f7fa	bcachefs: Rename bch_replicas_entry -> bch_replicas_entry_v1 Prep work for introducing bch_replicas_entry_v2 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2024-01-01 11:47:38 -05:00
Kent Overstreet	03013bb0c6	bcachefs: Fix bucket data type for stripe buckets Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-28 17:18:24 -05:00
Kent Overstreet	202a7c292e	bcachefs: Don't stop copygc thread on device resize copygc no longer has to scan the buckets, so it's no longer a problem if the number of buckets is changing while it's running. This also fixes a bug where we forgot to restart copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-24 02:30:32 -05:00
Kent Overstreet	df94cb2e57	bcachefs: Fix an integer overflow Fixes: bcachefs (e7fdc10e-54a3-49d9-bd0c-390370889d84): disk usage increased 4294967296 more than 2823707312 sectors reserved) transaction updates for __bchfs_fallocate journal seq 467859 update: btree=extents cached=0 bch2_trans_update+0x4e8/0x540 old u64s 5 type deleted 536925940:3559337304:4294967283 len 0 ver 0 new u64s 6 type reservation 536925940:3559337304:4294967283 len 3559337304 ver 0: generation 0 replicas 2 update: btree=inodes cached=1 bch2_extent_update_i_size_sectors+0x305/0x3b0 old u64s 19 type inode_v3 0:536925940:4294967283 len 0 ver 0: mode 100600 flags 15300000 journal_seq 467859 bi_size 0 bi_sectors 0 bi_version 0 bi_atime 40905301656446 bi_ctime 40905301656446 bi_mtime 40905301656446 bi_otime 40905301656446 bi_uid 0 bi_gid 0 bi_nlink 0 bi_generation 0 bi_dev 0 bi_data_checksum 0 bi_compression 0 bi_project 0 bi_background_compression 0 bi_data_replicas 0 bi_promote_target 0 bi_foreground_target 0 bi_background_target 0 bi_erasure_code 0 bi_fields_set 0 bi_dir 1879048193 bi_dir_offset 3384856038735393365 bi_subvol 0 bi_parent_subvol 0 bi_nocow 0 new u64s 19 type inode_v3 0:536925940:4294967283 len 0 ver 0: mode 100600 flags 15300000 journal_seq 467859 bi_size 0 bi_sectors 3559337304 bi_version 0 bi_atime 40905301656446 bi_ctime 40905301656446 bi_mtime 40905301656446 bi_otime 40905301656446 bi_uid 0 bi_gid 0 bi_nlink 0 bi_generation 0 bi_dev 0 bi_data_checksum 0 bi_compression 0 bi_project 0 bi_background_compression 0 bi_data_replicas 0 bi_promote_target 0 bi_foreground_target 0 bi_background_target 0 bi_erasure_code 0 bi_fields_set 0 bi_dir 1879048193 bi_dir_offset 3384856038735393365 bi_subvol 0 bi_parent_subvol 0 bi_nocow 0 Kernel panic - not syncing: bcachefs (e7fdc10e-54a3-49d9-bd0c-390370889d84): panic after error CPU: 4 PID: 5154 Comm: rsync Not tainted 6.5.9-gateway-gca1614174cc0-dirty #1 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570 Phantom Gaming 4, BIOS P4.20 08/02/2021 Call Trace: <TASK> dump_stack_lvl+0x5a/0x90 panic+0x105/0x300 ? console_unlock+0xf1/0x130 ? bch2_printbuf_exit+0x16/0x30 ? srso_return_thunk+0x5/0x10 bch2_inconsistent_error+0x6f/0x80 bch2_trans_fs_usage_apply+0x279/0x3d0 __bch2_trans_commit+0x112a/0x1df0 ? bch2_extent_update+0x13a/0x1d0 bch2_extent_update+0x13a/0x1d0 bch2_extent_fallocate+0x58e/0x740 bch2_fallocate_dispatch+0xb7c/0x1030 ? do_filp_open+0xa0/0x140 vfs_fallocate+0x18e/0x1d0 __x64_sys_fallocate+0x46/0x70 do_syscall_64+0x48/0xa0 ? exit_to_user_mode_prepare+0x4d/0xa0 entry_SYSCALL_64_after_hwframe+0x6e/0xd8 RIP: 0033:0x7fc85d91bbb3 Code: 64 89 02 b8 ff ff ff ff eb bd 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 80 3d 31 da 0d 00 00 49 89 ca 74 14 b8 1d 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 5d c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 10 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-01 21:11:08 -04:00
Kent Overstreet	b65db750e2	bcachefs: Enumerate fsck errors This patch adds a superblock error counter for every distinct fsck error; this means that when analyzing filesystems out in the wild we'll be able to see what sorts of inconsistencies are being found and repair, and hence what bugs to look for. Errors validating bkeys are not yet considered distinct fsck errors, but this patch adds a new helper, bkey_fsck_err(), in order to add distinct error types for them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-01 21:11:08 -04:00
Kent Overstreet	fb3f57bb11	bcachefs: rebalance_work This adds a new btree, rebalance_work, to eliminate scanning required for finding extents that need work done on them in the background - i.e. for the background_target and background_compression options. rebalance_work is a bitset btree, where a KEY_TYPE_set corresponds to an extent in the extents or reflink btree at the same pos. A new extent field is added, bch_extent_rebalance, which indicates that this extent has work that needs to be done in the background - and which options to use. This allows per-inode options to be propagated to indirect extents - at least in some circumstances. In this patch, changing IO options on a file will not propagate the new options to indirect extents pointed to by that file. Updating (setting/clearing) the rebalance_work btree is done by the extent trigger, which looks at the bch_extent_rebalance field. Scanning is still requrired after changing IO path options - either just for a given inode, or for the whole filesystem. We indicate that scanning is required by adding a KEY_TYPE_cookie key to the rebalance_work btree: the cookie counter is so that we can detect that scanning is still required when an option has been flipped mid-way through an existing scan. Future possible work: - Propagate options to indirect extents when being changed - Add other IO path options - nr_replicas, ec, to rebalance_work so they can be applied in the background when they change - Add a counter, for bcachefs fs usage output, showing the pending amount of rebalance work: we'll probably want to do this after the disk space accounting rewrite (moving it to a new btree) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-11-01 21:11:05 -04:00
Kent Overstreet	523f33efbf	bcachefs: All triggers are BTREE_TRIGGER_WANTS_OLD_AND_NEW Upcoming rebalance_work btree will require extent triggers to be BTREE_TRIGGER_WANTS_OLD_AND_NEW - so to reduce potential confusion, let's just make all triggers BTREE_TRIGGER_WANTS_OLD_AND_NEW. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-31 12:18:37 -04:00
Kent Overstreet	bbe682c767	bcachefs: Ensure devices are always correctly initialized We can't mark device superblocks or allocate journal on a device that isn't online. That means we may need to do this on every mount, because we may have formatted a new filesystem and then done the first mount (bch2_fs_initialize()) in degraded mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-31 12:18:37 -04:00
Kent Overstreet	88d39fd544	bcachefs: Switch to unsafe_memcpy() in a few places The new fortify checking doesn't work for us in all places; this switches to unsafe_memcpy() where appropriate to silence a few warnings/errors. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:16 -04:00
Kent Overstreet	73bbeaa2de	bcachefs: bucket_lock() is now a sleepable lock fsck_err() may sleep - it takes a mutex and may allocate memory, so bucket_lock() needs to be a sleepable lock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:15 -04:00
Kent Overstreet	a55fc65eb2	bcachefs: Fix an overflow check When bucket sector counts were changed from u16s to u32s, a few things were missed. This fixes an overflow check, and a truncation that prevented the overflow check from firing. Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:14 -04:00
Kent Overstreet	6bd68ec266	bcachefs: Heap allocate btree_trans We're using more stack than we'd like in a number of functions, and btree_trans is the biggest object that we stack allocate. But we have to do a heap allocatation to initialize it anyways, so there's no real downside to heap allocating the entire thing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:13 -04:00
Colin Ian King	7cb0e6992e	bcachefs: remove redundant initialization of pointer d The pointer d is being initialized with a value that is never read, it is being re-assigned later on when it is used in a for-loop. The initialization is redundant and can be removed. Cleans up clang-scan build warning: fs/bcachefs/buckets.c:1303:25: warning: Value stored to 'd' during its initialization is never read [deadcode.DeadStores] Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:12 -04:00
Kent Overstreet	aef32bf7cc	bcachefs: __bch2_btree_insert() -> bch2_btree_insert_trans() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:12 -04:00
Kent Overstreet	1e81f89b02	bcachefs: Fix assorted checkpatch nits Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:10 -04:00
Kent Overstreet	4dc5bb9adf	bcachefs: move inode triggers to inode.c bit of reorg Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:08 -04:00
Kent Overstreet	813e0cecd1	bcachefs: Upgrade path fixes Some minor fixes to not print errors that are actually due to a verson upgrade. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:07 -04:00
Kent Overstreet	73bd774d28	bcachefs: Assorted sparse fixes - endianness fixes - mark some things static - fix a few __percpu annotations - fix silent enum conversions Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:06 -04:00
Kent Overstreet	3a63b32f12	bcachefs: bch2_trans_mark_pointer() refactoring bch2_bucket_backpointer_mod() doesn't need to update the alloc key, we can exit the alloc iter earlier. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:04 -04:00
Kent Overstreet	1bb3c2a974	bcachefs: New error message helpers Add two new helpers for printing error messages with __func__ and bch2_err_str(): - bch_err_fn - bch_err_msg Also kill the old error strings in the recovery path, which were causing us to incorrectly report memory allocation failures - they're not needed anymore. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:04 -04:00
Kent Overstreet	21da6101bd	bcachefs: replicas_deltas_realloc() uses allocate_dropping_locks() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	19c304bebd	bcachefs: GFP_NOIO -> GFP_NOFS GFP_NOIO dates from the bcache days, when we operated under the block layer. Now, GFP_NOFS is more appropriate, so switch all GFP_NOIO uses to GFP_NOFS. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	e47a390aa5	bcachefs: Convert -ENOENT to private error codes As with previous conversions, replace -ENOENT uses with more informative private error codes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:03 -04:00
Kent Overstreet	962210b281	bcachefs: Fix a buffer overrun in bch2_fs_usage_read() We were copying the size of a struct bch_fs_usage_online to a struct bch_fs_usage, which is 8 bytes smaller. This adds some new helpers so we can do this correctly, and get rid of some magic +1s too. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	f12a798a89	bcachefs: bch2_bkey_get_mut() now calls bch2_trans_update() It's safe to call bch2_trans_update with a k/v pair where the value hasn't been filled out, as long as the key part has been and the value is filled out by transaction commit time. This patch folds the bch2_trans_update() call into bch2_bkey_get_mut(), eliminating a bit of boilerplate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	34dfa5db19	bcachefs: bch2_bkey_get_mut() improvements - bch2_bkey_get_mut() now handles types increasing in size, allocating a buffer for the type's current size when necessary - bch2_bkey_make_mut_typed() - bch2_bkey_get_mut() now initializes the iterator, like bch2_bkey_get_iter() Also, refactor so that most of the code is in functions - now macros are only used for wrappers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:10:01 -04:00
Kent Overstreet	62a03559d6	bcachefs: Rip out code for storing backpointers in alloc keys We don't store backpointers in alloc keys anymore, since we gained the btree write buffer. This patch drops support for backpointers in alloc keys, and revs the on disk format version so that we know a fsck is required. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:59 -04:00
Kent Overstreet	65d48e3525	bcachefs: Private error codes: ENOMEM This adds private error codes for most (but not all) of our ENOMEM uses, which makes it easier to track down assorted allocation failures. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:57 -04:00
Kent Overstreet	2640faeb17	bcachefs: Journal resize fixes - Fix a sleeping-in-atomic bug due to calling bch2_journal_buckets_to_sb() under the journal lock. - Additionally, now we mark buckets as journal buckets before adding them to the journal in memory and the superblock. This ensures that if we crash part way through we'll never be writing to journal buckets that aren't marked correctly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:56 -04:00
Kent Overstreet	910659763e	bcachefs: Mark stripe buckets with correct data type Currently, we don't use bucket data type for tracking whether buckets are part of a stripe; parity buckets are BCH_DATA_parity, but data buckets in a stripe are BCH_DATA_user. There's a separate counter, buckets_ec, outside the BCH_DATA_TYPES system for tracking number of buckets on a device that are part of a stripe. The trouble with this approach is that it's too coarse grained, and we need better information on fragmentation for debugging copygc. With this patch, data buckets in a stripe are now tracked as BCH_DATA_stripe buckets. This doesn't yet differentiate between erasure coded and non-erasure coded data in a stripe bucket, nor do we yet track empty data buckets in stripes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	2611a041ae	bcachefs: bch2_mark_key() now takes btree_id & level btree & level are passed to trans_mark - for backpointers - bch2_mark_key() should take them as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:55 -04:00
Kent Overstreet	27616a3124	bcachefs: Simplify ec stripes heap Now that we have a separate data structure for tracking open stripes, the stripes heap can track all existing stripes, which is a nice simplification. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:54 -04:00
Kent Overstreet	627a231239	bcachefs: Switch ec_stripes_heap_lock to a mutex Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:53 -04:00
Daniel Hill	3277081522	bcachefs: Don't run triggers when repairing in __bch2_mark_reflink_p() Triggers current trip-up on the faulty reflink we're trying to repair, Disabling them lets us fix broken reflink and continue. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Daniel Hill	8ffa11a2c5	bcachefs: let __bch2_btree_insert() pass in flags This patch is prep work for the following patch. Signed-off-by: Daniel Hill <daniel@gluo.nz> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	c1f59ef6d0	bcachefs: More info on check_bucket_ref() error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:52 -04:00
Kent Overstreet	8dd69d9f64	bcachefs: KEY_TYPE_inode_v3, metadata_version_inode_v3 Move bi_size and bi_sectors into the non-varint portion of the inode, so that the write path can update them without going through the relatively expensive unpack/pack operations. Other changes: - Add a field for the offset of the varint section, so we can add new non-varint fields without needing a new inode type, like alloc_v3 - Move bi_mode into the flags field, so that the varint section can be u64 aligned Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:51 -04:00
Kent Overstreet	a8c752bb1d	bcachefs: New on disk format: Backpointers This patch adds backpointers: we now have a reverse index from device and offset on that device (specifically, offset within a bucket) back to btree nodes and (non cached) data extents. The first 40 backpointers within a bucket are stored in the alloc key; after that backpointers spill over to the next backpointers btree. This is to help avoid performance regressions from additional btree updates on large streaming workloads. This patch adds all the code for creating, checking and repairing backpointers. The next patch in the series is going to use backpointers for copygc - finally getting rid of the need to scan all extents to do copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	920e69bc3d	bcachefs: Btree write buffer This adds a new method of doing btree updates - a straight write buffer, implemented as a flat fixed size array. This is only useful when we don't need to read from the btree in order to do the update, and when reading is infrequent - perfect for the LRU btree. This will make LRU btree updates fast enough that we'll be able to use it for persistently indexing buckets by fragmentation, which will be a massive boost to copygc performance. Changes: - A new btree_insert_type enum, for btree_insert_entries. Specifies btree, btree key cache, or btree write buffer. - bch2_trans_update_buffered(): updates via the btree write buffer don't need a btree path, so we need a new update path. - Transaction commit path changes: The update to the btree write buffer both mutates global, and can fail if there isn't currently room. Therefore we do all write buffer updates in the transaction all at once, and also if it fails we have to revert filesystem usage counter changes. If there isn't room we flush the write buffer in the transaction commit error path and retry. - A new persistent option, for specifying the number of entries in the write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:50 -04:00
Kent Overstreet	19a614d2e4	bcachefs: Better inlining for bch2_alloc_to_v4_mut This separates out the slowpath into a separate function, and inlines bch2_alloc_v4_mut into bch2_trans_start_alloc_update(), the main place it's called. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	7c909f654b	bcachefs: Fix repair path in bch2_mark_reflink_p() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	ad5d3d820a	bcachefs: Kill fs_usage_apply_warn() We now have bch2_trans_inconsistent() which generically does the same thing - dumps pending btree transaction updates. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:49 -04:00
Kent Overstreet	2cc9c0db89	bcachefs: Fix some memcpy() warnings With CONFIG_FORTIFY_SOURCE, the compiler attempts to warn about mempcys that extend past struct field boundaries. This results in some spurious warnings where we use embedded variable length structs, this patch switches to unsafe_mecpy() to fix the warnings. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	994ba47543	bcachefs: New btree helpers This introduces some new conveniences, to help cut down on boilerplate: - bch2_trans_kmalloc_nomemzero() - performance optimiation - bch2_bkey_make_mut() - bch2_bkey_get_mut() - bch2_bkey_get_mut_typed() - bch2_bkey_alloc() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:48 -04:00
Kent Overstreet	8852501fe5	bcachefs: Improve fs_usage_apply_warn() message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Kent Overstreet	df6a24f81a	bcachefs: Make error messages more uniform Use __func__ in error messages that refer to function name, and do so more uniformly - suggested by checkpatch.pl Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:45 -04:00
Kent Overstreet	3e3e02e6bc	bcachefs: Assorted checkpatch fixes checkpatch.pl gives lots of warnings that we don't want - suggested ignore list: ASSIGN_IN_IF UNSPECIFIED_INT - bcachefs coding style prefers single token type names NEW_TYPEDEFS - typedefs are occasionally good FUNCTION_ARGUMENTS - we prefer to look at functions in .c files (hopefully with docbook documentation), not .h file prototypes MULTISTATEMENT_MACRO_USE_DO_WHILE - we have _many_ x-macros and other macros where we can't do this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:44 -04:00
Kent Overstreet	ed80c5699a	bcachefs: Optimize bch2_dev_usage_read() - add bch2_dev_usage_read_fast(), which doesn't return by value - bch_dev_usage is big enough that we don't want the silent memcpy - tweak the allocation path to only call bch2_dev_usage_read() once per bucket allocated Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:44 -04:00
Kent Overstreet	5b3243cb52	bcachefs: Fix cached data accounting Negating without casting to a signed integer means the value wasn't getting sign extended properly - oops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:43 -04:00
Kent Overstreet	6c22eb7085	bcachefs: Fix "multiple types of data in same bucket" with ec Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:42 -04:00
Kent Overstreet	098ef98d5b	bcachefs: Add private error codes for ENOSPC Continuing the saga of introducing private dedicated error codes for each error path, this patch converts ENOSPC to error codes that are subtypes of ENOSPC. We've recently had a test failure where we got -ENOSPC where we shouldn't have, and didn't have enough information to tell where it came from, so this patch will solve that problem. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:40 -04:00
Kent Overstreet	dadecd02c4	bcachefs: bch2_trans_run() This adds a new helper, bch2_trans_run(), that runs a function with a btree_transaction context but without handling transaction restarts. We're adding checks for nested transaction restart handling: when an inner transaction handles a transaction restart it will still have to return it to the outer transaction, or else assertions will be popped in the outer transaction. But some places don't need restart handling at the outer scope, so this helper does what they need. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:36 -04:00
Kent Overstreet	f501ad2b81	bcachefs: bch2_mark_alloc(): Do wakeups after updating usage We have an obvious wake up race if we do the wakeup _before_ updating the counters the thing doing the waiting is reading. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:36 -04:00
Kent Overstreet	e68914ca84	bcachefs: Rename __bch2_trans_do() -> commit_do() Better/more descriptive naming, and prep for adding nested_lockrestart_do() and nested_commit_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	401ec4db63	bcachefs: Printbuf rework This converts bcachefs to the modern printbuf interface/implementation, synced with the version to be submitted upstream. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:33 -04:00
Kent Overstreet	1f93726e63	bcachefs: Tracepoint improvements Delete some obsolete tracepoints, organize alloc tracepoints better, make a few tracepoints more consistent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	e1b8f5f5ca	bcachefs: Plumb btree_id & level to trans_mark For backpointers, we'll need the full key location - that means btree_id and btree level. This patch plumbs it through. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	75c8d0305a	bcachefs: Kill old rebuild_replicas option This option was useful when the replicas mechism was new and still being debugged, but hasn't been used in ages - let's delete it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	822835ffea	bcachefs: Fold bucket_state in to BCH_DATA_TYPES() Previously, we were missing accounting for buckets in need_gc_gens and need_discard states. This matters because buckets in those states need other btree operations done before they can be used, so they can't be conuted when checking current number of free buckets against the allocation watermark. Also, we weren't directly counting free buckets at all. Now, data type 0 == BCH_DATA_free, and free buckets are counted; this means we can get rid of the separate (poorly defined) count of unavailable buckets. This is a new on disk format version, with upgrade and fsck required for the accounting changes. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	e1effd42a1	bcachefs: More improvements for alloc info checks - Move checks for whether the device & bucket are valid from the .key_invalid method to bch2_check_alloc_key(). This is because .key_invalid() is called on keys that may no longer exist (post journal replay), which is a problem when removing/resizing devices. - We weren't checking the need_discard btree to ensure that every set bucket has a corresponding alloc key. This refactors the code for checking the freespace btree, so that it now checks both. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	c6b6d41612	bcachefs: gc mark fn fixes, cleanups mark_stripe_bucket() was busted; it was using @new unitialized. Also, clean up all the gc mark functions, and convert them to the same style. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	75f02de43f	bcachefs: Use crc_is_compressed() Trivial cleanup. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	66d9082385	bcachefs: Kill struct bucket_mark This switches struct bucket to using a lock, instead of cmpxchg. And now that the protected members no longer need to fit into a u64, we can expand the sector counts to 32 bits. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5735608c14	bcachefs: Kill main in-memory bucket array All code using the in-memory bucket array, excluding GC, has now been converted to use the alloc btree directly - so we can finally delete it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	5f43f99c6e	bcachefs: bch2_dev_usage_update() no longer depends on bucket_mark This is one of the last steps in getting rid of the main in-memory bucket array. This changes bch2_dev_usage_update() to take bkey_alloc_unpacked instead of bucket_mark, and for the places where we are in fact working with bucket_mark and don't have bkey_alloc_unpacked, we add a wrapper that takes bucket_mark and converts to bkey_alloc_unpacked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	caece7fe3f	bcachefs: New bucket invalidate path In the old allocator code, preparing an existing empty bucket was part of the same code path that invalidated buckets containing cached data. In the new allocator code this is no longer the case: the main allocator path finds empty buckets (via the new freespace btree), and can't allocate buckets that contain cached data. We now need a separate code path to invalidate buckets containing cached data when we're low on empty buckets, which this patch implements. When the number of free buckets decreases that triggers the new invalidate path to run, which uses the LRU btree to pick cached data buckets to invalidate until we're above our watermark. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	59cc38b8d4	bcachefs: New discard implementation In the old allocator code, buckets would be discarded just prior to being used - this made sense in bcache where we were discarding buckets just after invalidating the cached data they contain, but in a filesystem where we typically have more free space we want to be discarding buckets when they become empty. This patch implements the new behaviour - it checks the need_discard btree for buckets awaiting discards, and then clears the appropriate bit in the alloc btree, which moves the buckets to the freespace btree. Additionally, discards are now enabled by default. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	f25d8215f4	bcachefs: Kill allocator threads & freelists Now that we have new persistent data structures for the allocator, this patch converts the allocator to use them. Now, foreground bucket allocation uses the freespace btree to find buckets to allocate, instead of popping buckets off the freelist. The background allocator threads are no longer needed and are deleted, as well as the allocator freelists. Now we only need background tasks for invalidating buckets containing cached data (when we are low on empty buckets), and for issuing discards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	c6b2826cd1	bcachefs: Freespace, need_discard btrees This adds two new btrees for the upcoming allocator rewrite: an extents btree of free buckets, and a btree for buckets awaiting discards. We also add a new trigger for alloc keys to keep the new btrees up to date, and a compatibility path to initialize them on existing filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3d48a7f85f	bcachefs: KEY_TYPE_alloc_v4 This introduces a new alloc key which doesn't use varints. Soon we'll be adding backpointers and storing them in alloc keys, which means our pack/unpack workflow for alloc keys won't really work - we'll need to be mutating alloc keys in place. Instead of bch2_alloc_unpack(), we now have bch2_alloc_to_v4() that converts older types of alloc keys to v4 if needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	3e1547116f	bcachefs: x-macroize alloc_reserve enum This makes an array of strings available, like our other enums. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:29 -04:00
Kent Overstreet	78668fe0bb	bcachefs: Move deletion of refcount=0 indirect extents to their triggers For backpointers, we need to switch the order triggers are run in: we need to run triggers for deletions/overwrites before triggers for inserts. To avoid breaking the reflink triggers, this patch moves deleting of indirect extents with refcount=0 to their triggers, instead of doing it when we update those keys. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:29 -04:00
Kent Overstreet	880e2275f9	bcachefs: Move trigger fns to bkey_ops This replaces the switch statements in bch2_mark_key(), bch2_trans_mark_key() with new bkey methods - prep work for the next patch, which fixes BTREE_TRIGGER_WANTS_OLD_AND_NEW. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:28 -04:00
Kent Overstreet	2158fe463b	bcachefs: bch2_trans_inconsistent() Add a new error macro that also dumps transaction updates in addition to doing an emergency shutdown - when a transaction update discovers or is causing a fs inconsistency, it's helpful to see what updates it was doing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:27 -04:00
Kent Overstreet	fa8e94faee	bcachefs: Heap allocate printbufs This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:25 -04:00
Kent Overstreet	3598c56eb9	bcachefs: Consolidate trigger code a bit Upcoming patches are doing more work on the triggers code, this patch just moves code around. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:25 -04:00
Kent Overstreet	ae94c78fb1	bcachefs: bch2_trans_mark_key() now takes a bkey_i * We're now coming up with triggers that modify the update being done. A bkey_s_c is const - bkey_i is the correct type to be using here. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:25 -04:00
Kent Overstreet	b0551285e1	bcachefs: Improve reflink repair code When a reflink pointer points to a missing indirect extent, we replace it with an error key. Instead of replacing the entire reflink pointer with an error key, this patch replaces only the missing range with an error key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:25 -04:00
Kent Overstreet	78c8fe20be	bcachefs: Normal update/commit path now works before going RW This improves __bch2_trans_commit - early in the recovery process, when we're running btree_gc and before we want to go RW, it now uses bch2_journal_key_insert() to add the update to the list of updates for journal replay to do, instead of btree_gc having to use separate interfaces depending on whether we're running at bringup or, later, runtime. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:25 -04:00
Kent Overstreet	2232fa397c	bcachefs: Only allocate buckets_nouse when requested It's only needed by the migrate tool - this patch adds an option to enable allocating it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:24 -04:00
Kent Overstreet	2e63e18066	bcachefs: Stash a copy of key being overwritten in btree_insert_entry We currently need to call bch2_btree_path_peek_slot() multiple times in the transaction commit path - and some of those need to be updated to also check the keys from journal replay, too. Let's consolidate this and stash the key being overwritten in btree_insert_entry. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:23 -04:00
Kent Overstreet	80bf2f3454	bcachefs: Fix freeing in bch2_dev_buckets_resize() We were double-freeing old_buckets and not freeing old_buckets_gens: also, the code was supposed to free buckets, not old_buckets; old_buckets is only needed because we have to use rcu_assign_pointer() instead of swap(), and won't be set if we hit the error path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:23 -04:00
Kent Overstreet	0678cbe2cb	bcachefs: Ignore cached data when calculating fragmentation Previously, bucket fragmentation was considered to be bucket size - total amount of live data, both dirty and cached. This meant that if a bucket was full but only a small amount of data in it was dirty - the rest cached, we'd get stuck: copygc wouldn't move the dirty data out of the bucket and the allocator wouldn't be able to invalidate and drop the cached data. This changes fragmentation to exclude cached data, so that copygc will evacuate these buckets and copygc/the allocator will always be able to make forward progress. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	3763cb9566	bcachefs: Don't use in-memory bucket array for alloc updates More prep work for getting rid of the in-memory bucket array: now that we have BTREE_ITER_WITH_JOURNAL, the allocator code can do ntree lookups before journal replay is finished, and there's no longer any need for it to get allocation information from the in-memory bucket array. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	21aec962df	bcachefs: New data structure for buckets waiting on journal commit Implement a hash table, using cuckoo hashing, for empty buckets that are waiting on a journal commit before they can be reused. This replaces the journal_seq field of bucket_mark, and is part of eventually getting rid of the in memory bucket array. We may need to make bch2_bucket_needs_journal_commit() lockless, pending profiling and testing. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	f443fa66c9	bcachefs: Also print out in-memory gen on stale dirty pointer We're trying to track down a bug that shows itself as newly-created extents having stale dirty pointers - possibly due to the in memory gen and the btree gen being inconsistent. This patch changes the error message to also print out the in memory bucket gen when this happens. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:22 -04:00
Kent Overstreet	f0f41a6d74	bcachefs: Add error messages for memory allocation failures This adds some missing diagnostics from rare but annoying to debug runtime allocation failure paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	e3ad29379e	bcachefs: Optimize bucket reuse If the btree updates pointing to a bucket were never flushed by the journal before the bucket became empty again, we can reuse the bucket without a journal flush. This tweaks the tracking of journal sequence numbers in alloc keys to implement this optimization: now, we only update the journal sequence number in alloc keys on transitions to and from empty. When a bucket becomes empty, we check if we can tell the journal not to flush entries starting from when the bucket was used. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	13f914ecb9	bcachefs: Kill bch2_ec_mem_alloc() bch2_ec_mem_alloc() was only used by GC, and there's no real need to preallocate the stripes radix tree since we can cope fine with memory allocation failure when we use the radix tree. This deletes a fair bit of code, and it's also needed for the upcoming patch because bch2_btree_iter_peek_prev() won't be working before journal replay completes (and using it was incorrect previously, as well). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	36f035e908	bcachefs: Fix allocator + journal interaction The allocator needs to wait until the last update touching a bucket has been commited before writing to it again. However, the code was checking against the last dirty journal sequence number, not the last flushed journal sequence number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	a786087744	bcachefs: New in-memory array for bucket gens The main in-memory bucket array is going away, but we'll still need to keep bucket generations in memory, at least for now - ptr_stale() needs to be an efficient operation. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	47ac34ec98	bcachefs: Separate out gc_bucket() Since the main in memory bucket array is going away, we don't want to be calling bucket() or __bucket() when what we want is the GC in-memory bucket. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	e75b2d4c1c	bcachefs: bch2_journal_key_insert() no longer transfers ownership bch2_journal_key_insert() used to assume that the key passed to it was allocated with kmalloc(), and on success took ownership. This patch deletes that behaviour, making it more similar to bch2_trans_update()/bch2_trans_commit(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	77170d0dd7	bcachefs: bch2_bucket_alloc_new_fs() no longer depends on bucket marks Now that bch2_bucket_alloc_new_fs() isn't looking at bucket marks to decide what buckets are eligible to allocate, we can clean up the filesystem initialization and device add paths. Previously, we had to use ancient code to mark superblock/journal buckets in the in memory bucket marks as we allocated them, and then zero that out and re-do that marking using the newer transational bucket mark paths. Now, we can simply delete the in-memory bucket marking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	8244f3209b	bcachefs: Option improvements This adds flags for options that must be a power of two (block size and btree node size), and options that are stored in the superblock as a power of two (encoded extent max). Also: options are now stored in memory in the same units they're displayed in (bytes): we now convert when getting and setting from the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:19 -04:00
Kent Overstreet	20572300dc	bcachefs: Improve alloc_mem_to_key() This moves some common code into alloc_mem_to_key(), which translates from the in-memory format for a bucket to the btree key format. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	fb0e480872	bcachefs: bch2_alloc_write() This adds a new helper that much like the one we have for inode updates, that allocates the packed alloc key, packs it and calls bch2_trans_update. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:18 -04:00
Kent Overstreet	990d42d187	bcachefs: Split out struct gc_stripe from struct stripe We have two radix trees of stripes - one that mirrors some information from the stripes btree in normal operation, and another that GC uses to recalculate block usage counts. The normal one is now only used for finding partially empty stripes in order to reuse them - the normal stripes radix tree and the GC stripes radix tree are used significantly differently, so this patch splits them into separate types. In an upcoming patch we'll be replacing c->stripes with a btree that indexes stripes by the order we want to reuse them. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	94a3e1a6c1	bcachefs: bch2_trans_update() is now __must_check With snapshots, bch2_trans_update() has to check if we need a whitout, which can cause a transaction restart, so this is important now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	b547d005d5	bcachefs: Erasure coding fixes When we added the stripe and stripe_redundancy fields to alloc keys, we neglected to add them to the functions that convert back and forth with the in-memory types. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	181fe42a75	bcachefs: Handle replica marking fsck errors locally This simplifies the code quite a bit and eliminates an inconsistency - a given bkey doesn't necessarily translate to a single replicas entry for disk space accounting. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	58e1ea4bcb	bcachefs: Push c->mark_lock usage down to where it is needed This changes the bch2_mark_key() and related paths to take mark lock where it is needed, instead of taking it in the upper transaction commit path - by pushing down locking we'll be able to handle fsck errors locally instead of requiring a separate check in the btree_gc code for replicas being marked. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	502cfb3591	bcachefs: Kill bch2_replicas_delta_list_marked() This changes bch2_trans_fs_usage_apply() to handle failure (replicas entry missing) by reverting the changes it made - meaning we can make the main transaction commit path a bit slimmer, and perhaps also simplify some locking in upcoming patches. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:18 -04:00
Kent Overstreet	f0c3f88b35	bcachefs: Run insert triggers before overwrite triggers Currently, btree triggers are run in natural key order, which presents a problem for fallocate in INSERT_RANGE mode: since we're moving existing extents to higher offsets, the trigger for deleting the old extent runs before the trigger that adds the new extent, potentially leading to indirect extents being deleted that shouldn't be when the delete causes the refcount to hit 0. This changes the order we run triggers so that for a givin btree, we run all insert triggers before overwrite triggers, nicely sidestepping this issue. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:17 -04:00
Kent Overstreet	c714614bd0	bcachefs: Disk space accounting fix on brand-new fs The filesystem initialization path first marks superblock and journal buckets non transactionally, since the btree isn't functional yet. That path was updating the per-journal-buf percpu counters via bch2_dev_usage_update(), and updating the wrong set of counters so those updates didn't get written out until journal entry 4. The relevant code is going to get significantly rewritten in the future as we transition away from the in memory bucket array, so this just hacks around it for now. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:17 -04:00
Kent Overstreet	076c783cd3	bcachefs: Fix upgrade path for reflink_p fix Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	3e52c22255	bcachefs: Add journal_seq to inode & alloc keys Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	2debb1b875	bcachefs: BTREE_TRIGGER_INSERT now only means insert This allows triggers to distinguish between a key entering the btree - i.e. being called from the trans commit path - vs. being called on a key that already exists, i.e. by GC. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	904823de49	bcachefs: Convert bch2_mark_key() to take a btree_trans * This helps to unify the interface between bch2_mark_key() and bch2_trans_mark_key() - and it also gives access to the journal reservation and journal seq in the mark_key path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	961b2d6282	bcachefs: Assorted ec fixes - The backpointer that ec_stripe_update_ptrs() uses now needs to include the snapshot ID, which means we have to change where we add the backpointer to after getting the snapshot ID for the new extents - ec_stripe_update_ptrs() needs to be calling bch2_trans_begin() - improve error message in bch2_mark_stripe() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	37f72492f4	bcachefs: Fix bch2_mark_update() When the old or new key doesn't exist, we should still pass in a deleted key with the correct pos. This fixes a bug in the ec code, when bch2_mark_stripe() was looking up the wrong in-memory stripe. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	f3b1e19379	bcachefs: Improve error messages in trans_mark_reflink_p() We should always print out the key we were marking. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	396a887d8f	bcachefs: Fix fsck path for refink pointers The way __bch2_mark_reflink_p returns errors was clashing with returning the number of sectors processed - we weren't returning FSCK_ERR_EXIT correctly. Fix this by only using the return code for errors, which actually ends up simplifying the overall logic. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	6d76aefea1	bcachefs: Fix for leaking of reflinked extents When a reflink pointer points to only part of an indirect extent, and then that indirect extent is fragmented (e.g. by copygc), if the reflink pointer only points to one of the fragments we leak a reference. Fix this by storing front/back pad values in reflink pointers - when inserting reflink pointesr, we initialize them to cover the full range of the indirect extents we reference. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	dfc276df91	bcachefs: Improve reflink repair code When a reflink pointer points to an indirect extent that doesn't exist, we need to replace it with a KEY_TYPE_error key. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	14b393ee76	bcachefs: Subvolumes, snapshots This patch adds subvolume.c - support for the subvolumes and snapshots btrees and related data types and on disk data structures. The next patches will start hooking up this new code to existing code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	6fba6b83b4	bcachefs: Prefer using btree_insert_entry to btree_iter This moves some data dependencies forward, to improve pipelining. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:11 -04:00
Brett Holman	fd0bd123d5	bcachefs: Fix 32 bit build failures This fix replaces multiple 64 bit divisions with do_div() equivalents. Signed-off-by: Brett Holman <bholman.devel@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:10 -04:00
Kent Overstreet	62df3d443c	bcachefs: Disk space accounting fix DIV_ROUND_UP() wasn't doing what we wanted when passing it negative numbers - fix it by just not passing it negative numbers anymore. Also, no need to do the scaling by compression ratio for incompressible data. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	297d89343d	bcachefs: Extensive triggers cleanups - We no longer mark subsets of extents, they're marked like regular keys now - which means we can drop the offset & sectors arguments to trigger functions - Drop other arguments that are no longer needed anymore in various places - fs_usage - Drop the logic for handling extents in bch2_mark_update() that isn't needed anymore, to match bch2_trans_mark_update() - Better logic for hanlding the BTREE_ITER_CACHED_NOFILL case, where we don't have an old key to mark Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:07 -04:00
Kent Overstreet	8c3f6da9fc	bcachefs: Improve iter->should_be_locked Adding iter->should_be_locked introduced a regression where it ended up not being set on the iterator passed to bch2_btree_update_start(), which is definitely not what we want. This patch requires it to be set when calling bch2_trans_update(), and adds various fixups to make that happen. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	8ee529e9c1	bcachefs: Make sure bch2_trans_mark_update uses correct iter flags Now that bch2_btree_iter_peek_with_updates() has been removed in favor of BTREE_ITER_WITH_UPDATES, we need to make sure it's not used where we don't want it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	290448ed2e	bcachefs: Don't underflow c->sectors_available This rarely used error path should've been checking for underflow - oops. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	953ee28a3e	bcachefs: Kill bch2_btree_iter_peek_cached() It's now been rolled into bch2_btree_iter_peek_slot() Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	c1949baa51	bcachefs: Simplify reflink trigger Now that we only mark entire extents, we can ditch the "reflink_p_frag_references" code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	8e6bbc4181	bcachefs: Move extent_handle_overwrites() to bch2_trans_update() This lifts handling of overlapping extents out of __bch2_trans_commit() and moves it to where we first do the update - which means that BTREE_ITER_WITH_UPDATES can now work correctly in extents mode. Also, this patch reworks how extent triggers work: previously, on partial extent overwrite we would pass this information to the trigger, telling it what part of the extent was being overwritten. But, this approach has had too many subtle corner cases - now, we only mark whole extents, meaning on partial extent overwrite we unmark the old extent and mark the new extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:06 -04:00
Kent Overstreet	224ec3e677	bcachefs: Don't mark superblocks past end of usable space bcachefs-tools recently started putting a backup superblock at the end of the device. This causes a problem if the bucket size doesn't divide the device size - but we can fix it by just skipping marking that part. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00
Kent Overstreet	bc3f8b25f3	bcachefs: Check for errors from bch2_trans_update() Upcoming refactoring is going to change bch2_trans_update() to start returning transaction restarts. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:05 -04:00

1 2 3 4 5 ...

411 Commits