linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-08-30 21:52:21 +00:00

Author	SHA1	Message	Date
Kent Overstreet	136d082abc	bcachefs: Improve trace_trans_restart_upgrade - Convert to a 'fs_str' tracepoint that just emits as a string: this lets us build up the tracepoint with a printbuf, using our pretty printers, and they're much easier to manage - Include locks_held, before and after - Include the btree node pointer we failed on (error pointer, null, or real node) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:11 -04:00
Kent Overstreet	f638b84224	bcachefs: fix bch2_inum_snapshot_to_path() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:11 -04:00
Kent Overstreet	2faa8ab0d0	bcachefs: fix duplicate printk Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:10 -04:00
Kent Overstreet	4ba99dde33	bcachefs: BCH_INODE_has_case_insensitive Add a flag for tracking whether a directory has case-insensitive descendants - so that overlayfs can disallow mounting, even though the filesystem supports case insensitivity. This is a new on disk format version, with a (cheap) upgrade to ensure the flag is correctly set on existing inodes. Create, rename and fssetxattr are all plumbed to ensure the new flag is set, and we've got new fsck code that hooks into check_inode(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:10 -04:00
Kent Overstreet	77eac89c79	bcachefs: bch2_inode_find_by_inum_snapshot() Move a fsck.c helper into inode.c, eliminate some duplication and organize the inode lookup helpers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:09 -04:00
Kent Overstreet	77aeaa2f0f	bcachefs: bch2_inum_snapshot_to_path() Add a better helper for printing out paths of inodes when we don't know the subvolume, for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:09 -04:00
Kent Overstreet	7c4f22af25	bcachefs: bch2_rename_trans() only runs rename-to-dir code if needed Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	011d644b76	bcachefs: subvol_inum_eq() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	c3a7fd95e0	bcachefs: Don't set bi_casefold on non directories bi_casefold only makes sense for directories, and since it's one of the variable length fields setting it unnecessarily wastes space. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Alan Huang	a96c5e5045	bcachefs: Remove duplicate call to bch2_trans_begin() There is one in for_each_btree_key_max(). Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	c631bb41f5	bcachefs: Call bch2_bkey_set_needs_rebalance() earlier in write path There's no reason to be running this inside our transaction; it forces us to copy the key we're updating to a temporary, which we'd like to skip. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:07 -04:00
Kent Overstreet	f132a78095	bcachefs: Simplify bch2_extent_atomic_end() It used to be that we had a fixed maximum number of btree paths to work with - 64. That's no longer the case, so bch2_extent_atomic_end() doesn't have to be as strict. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:07 -04:00
Kent Overstreet	7fd643c032	bcachefs: Coalesce accounting in trans commit Accounting has gotten quite heavy, and there's lots of redundancy in accounting updates within a transaction, as we often add/delete multiple extents that touch the same accounting counters. This will reduce the amount of data that we journal, and reduce pressure downstream on the btree write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	e8f9992b0a	bcachefs: Split out accounting in transaction commit There can be a lot of redundancy in accounting updates within a single btree transaction. Split out accounting updates so that they can be deduped, in the next commit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	247abee6ae	bcachefs: btree_trans_subbuf Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	81c42933a5	bcachefs: Make accounting mismatch errors more readable Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	51e23c9d60	bcachefs: async objs now support bch_write_ops Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	8c3fc7cca3	bcachefs: fix bch2_debugfs_flush_buf() when tabstops are in use Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	6b86da9282	bcachefs: fsck: Include loops in error messages This fixes the subvol loop checking and directory loop checking to print the loop. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	39cea302f1	bcachefs: bch2_check_bucket_backpointer_mismatch() Detect buckets with missing backpointers, and run repair on demand. __bch2_move_data_phys() now calls bch2_check_bucket_backpointer_mismatch() as it walks buckets, which checks for missing backpointers by comparing backpointers against bucket sector counts. When missing backpointers are detected, we kick off bch2_check_extents_to_backpointers() asynchronously - right away if we're trying to evacuate, or with a threshold if we're just running copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	15f969326e	bcachefs: Improve bucket_bitmap code Add some more helpers, and mismatches is now a superset of the empty bitmap - simplifies most checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	06977ea82b	bcachefs: Run recovery passes asynchronously When we request a recovery pass to be run online, i.e. not during recovery, if it's an online pass it'll now be run in the background, instead of waiting for the next mount. To avoid situations where recovery passes are running continuously, this also includes ratelimiting: if the RUN_RECOVERY_PASS_ratelimit flag is passed, the pass may be deferred until later - depending on the runtime and last run stats in the recovery_passes superblock section. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	d4b30ed90c	bcachefs: bch2_run_explicit_recovery_pass() cleanup Consolidate the run_explicit_recovery_pass() interfaces by adding a flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit flag. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	06266465cc	bcachefs: bch2_recovery_pass_status_to_text() Show recovery pass status in sysfs - important now that we're running them automatically in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	7ed4c14e20	bcachefs: Reduce usage of recovery.curr_pass We want recovery.curr_pass to be private to the recovery passes code, for better showing recovery pass status; also, it may rewind and is generally not the correct member to use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	ab35552030	bcachefs: __bch2_run_recovery_passes() Consolidate bch2_run_recovery_passes() and bch2_run_online_recovery_passes(), prep work for automatically scheduling and running recovery passes in the background. - Now takes a mask of which passes to run, automatic background repair will pass in sb.recovery_passes_required. - Skips passes that are failing: a pass that failed may be reattempted after another pass succeeds (some passes depend on repair done by other passes for successful completion). - bch2_recovery_passes_match() helper to skip alloc passes on a filesystem without alloc info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	68708efcac	bcachefs: struct bch_fs_recovery bch_fs has gotten obnoxiously big, let's start organizing things a bit better. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	878713b5f5	bcachefs: kill copy in bch2_disk_accounting_mod() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	295dbf50e5	bcachefs: Optimize bch2_trans_start_alloc_update() Avoid doing more updates if we already have one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	9469556a5f	bcachefs: btree key cache asserts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	a78a11900e	bcachefs: journal path now uses discard_opt_enabled() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	8a6fa52e07	bcachefs: relock_fail tracepoint now includes btree Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	84b9f17195	bcachefs: do_rebalance_scan() now only updates bch_extent_rebalance This ensures that our pending rebalance work accounting is accurate quickly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	bde41d9a58	bcachefs: better error message for subvol_fs_path_parent_wrong Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:00 -04:00
Kent Overstreet	fdd0807f81	bcachefs: Improve bch2_repair_inode_hash_info() Improve this so it can be used by fsck.c check_inode(); it provides a much better error message than the check_inode() version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:00 -04:00
Kent Overstreet	123d2d09ff	bcachefs: bch2_inode_find_snapshot_root() Factor out a small common helper. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Alan Huang	4a67b94bd8	bcachefs: Early return to avoid unnecessary lock Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Alan Huang	688321f97e	bcachefs: Kill BTREE_TRIGGER_bucket_invalidate Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Kent Overstreet	e882906929	bcachefs: Fix opt hooks in sysfs for non sb option We weren't checking if the option changed for non-superblock options - this led to rebalance not waking up when enabling the "rebalance_enabled" option. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Kent Overstreet	648c1142c9	bcachefs: fix can_write_extent() Failing to check the return value of bch2_dev_rcu(): we could (technically) race with device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	c7378d0e5e	bcachefs: Add tracepoint, counter for io_move_created_rebalance Internal moves shouldn't add new rebalance_work, but it's been reported that this seems to be happening. Add a tracepoint and counter so we can see what's going on. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	e4e513f2d5	bcachefs: move_buckets in rhashtable when allocated Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	fb7e78cc25	bcachefs: Move pending buckets queue to buckets_in_flight Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:57 -04:00
Kent Overstreet	49188a9313	bcachefs: kill move_bucket_in_flight Small cleanup/simplification, and prep work for the next patch, which will add checking if buckets don't get evacuated because they're missing backpointers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:57 -04:00
Kent Overstreet	b42fac043f	bcachefs: bch2_fs_emergency_read_only2() More error message cleanup: instead of multiple printk()s per error, we want to be building up a single error message in a printbuf, so that it can be printed with indenting that shows grouping and avoid errors getting interspersed or lost in the log. This gets rid of most calls to bch2_fs_emergency_read_only(). We still have calls to - bch2_fatal_error() - bch2_fs_fatal_error() - bch2_fs_fatal_err_on() that need work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	ac4c7ac90e	bcachefs: Extra write buffer asserts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	7ad7497862	bcachefs: add missing locking in bch2_write_point_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	177ac4925f	bcachefs: Don't rewind recovery if not in recovery Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	367cad0966	bcachefs: Rename fsck_running, recovery_running flags Slightly more readable. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	5b1247ca5f	bcachefs: debug_check_bkey_unpack Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:55 -04:00
Kent Overstreet	34aeb820f9	bcachefs: debug_check_bset_lookups Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:55 -04:00
Kent Overstreet	c4e3889440	bcachefs: debug_check_iterators no longer requires BCACHEFS_DEBUG Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:55 -04:00
Kent Overstreet	110bb6cb8b	bcachefs: debug_check_btree_locking modparam Don't put btree locking asserts behind CONFIG_BCACHEFS_DEBUG, put them behind a module parameter. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:54 -04:00
Kent Overstreet	2842515575	bcachefs: Debug params are now static_keys We'd like users to be able to debug without building custom kernels, so this will help us get rid of CONFIG_BCACHEFS_DEBUG, at least for most things. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:54 -04:00
Kent Overstreet	b51b4055c3	bcachefs: Slim down inlined part of bch2_btree_path_upgrade() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:53 -04:00
Kent Overstreet	001c1d146f	bcachefs: online_fsck_mutex -> run_recovery_passes_lock Prep work for automatically running recovery passes asynchronously. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:53 -04:00
Kent Overstreet	e21f997721	bcachefs: bch_sb_field_recovery_passes New superblock section for statistics on recovery passes - last time ran (successfully), last runtime. This will be used by self healing code to determine when to kick off potentially expensive recovery passes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:53 -04:00
Kent Overstreet	20a4b7f3b8	bcachefs: recovery_passes_types.h -> recovery_passes_format.h Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:52 -04:00
Kent Overstreet	3b7b0c3996	bcachefs: print label correctly in sb_member_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:52 -04:00
Kent Overstreet	13ffcbae86	bcachefs: "buckets with backpointer mismatches" now allocated on demand More self healing work: we're going to be calling check_bucket_backpointer_mismatch() at runtime, outside of fsck. Then when we need to we'll kick off the full check_extents_to_backpointers recovery pass. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:52 -04:00
Kent Overstreet	7f9dada701	bcachefs: delete dead items in bch_dev Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:51 -04:00
Kent Overstreet	3ffda8c219	bcachefs: kill dead code in move_data_phys() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:50 -04:00
Kent Overstreet	82067c9169	bcachefs: buckets_in_flight on stack copygc runs with a full stack available, there's no reason to dynamically allocate this. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:50 -04:00
Kent Overstreet	1dfa01ef24	bcachefs: bch2_copygc_dev_wait_amount() Factor out the per-device calculations, for better introspection. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:50 -04:00
Kent Overstreet	970dde8271	bcachefs: Add missing include fix debug build in userspace Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:50 -04:00
Kent Overstreet	8c69e2b52e	bcachefs: Knob for manual snapshot deletion Add 'opts.snapshot_deletion_enabled', enabled by default. This may be turned off so that the new sysfs knob, 'internal/trigger_delete_dead_snapshots', may be used instead - this will allow snapshot deletion to be profiled more easily. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:49 -04:00
Kent Overstreet	a8539ad8fa	bcachefs: bcachefs_metadata_version_fast_device_removal Fast device removal, that uses backpointers to find pointers to the device being removed instead of a full metadata scan. This requires BCH_SB_MEMBER_DELETED_UUID, which is an incompatible change - hence the version number bump. We don't fully trust backpointers, so we don't want to reuse device indexes until after a fsck has verified that there aren't any pointers to removed devices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:49 -04:00
Kent Overstreet	09fa6c3039	bcachefs: bch2_dev_data_drop_by_backpointers() Currently, device removal has to scan all metadata for pointers to the device being removed. Add a new method, with the same interface as bch2_dev_data_drop(), that scans by backpointers instead - this will drastically speed up device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:49 -04:00
Kent Overstreet	b3f80d0923	bcachefs: BCH_SB_MEMBER_DELETED_UUID Add a sentinel value for devices that have been removed, but whose index we don't want to reuse until a fsck has completed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:49 -04:00
Kent Overstreet	66e9a7f139	bcachefs: bch2_dev_remove_stripes() respects degraded flags Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:48 -04:00
Kent Overstreet	96fc7d8adb	bcachefs: opts.rebalance_on_ac_only Add an option for setting rebalance to only run when connected to mains power. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:48 -04:00
Kent Overstreet	502222041c	bcachefs: __bch2_fs_free() cleanup Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:47 -04:00
Kent Overstreet	39430cfd27	bcachefs: Improve bch2_extent_ptr_set_cached() Preferentially keep existing cached pointers instead of adding new ones. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:47 -04:00
Kent Overstreet	fbe728f956	bcachefs: improve check_inode_hash_info_matches_root() error message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:47 -04:00
Kent Overstreet	84bd6afee1	bcachefs: inline bch2_ob_ptr() This was an oversight, we want bch2_alloc_sectors_append_ptrs_inlined() fully inlined. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:46 -04:00
Kent Overstreet	e02888faab	bcachefs: bch2_dev_in_target() no longer takes rcu_read_lock() Minor optimization, the caller generally has it already. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:45 -04:00
Kent Overstreet	7d4f2687ef	bcachefs: bch2_journal_write() refactoring Make the locking easier to follow; also take io_refs earlier, in __journal_write_alloc(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:45 -04:00
Kent Overstreet	88f62ed60c	bcachefs: delete_dead_snapshot_keys_v2() Since extents, dirents and xattrs require an inode with the corresponding snapshot ID to exist, we can avoid a lot of scanning by only scanning those trees for keys to process if the corresponding inode exists. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:45 -04:00
Kent Overstreet	e9756dd29f	bcachefs: bcachefs_metadata_version_snapshot_deletion_v2 We're going to be speeding up snapshot deletion, by only having it process the extents/dirents/xattrs btrees if an inode of a given snapshot ID was present. This raises the possibility of 'bkey_in_missing_snapshot' errors popping up, if we ever accidentally don't do the corresponding inode update, or if the new algorithm has bugs. So instead of deleting snapshot IDs, add a new deleted flag, so that 'key in missing snapshot' errors can more definitively tell what happened and automatically repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:45 -04:00
Kent Overstreet	08d14d90a4	bcachefs: BCH_SNAPSHOT_DELETED -> BCH_SNAPSHOT_WILL_DELETE We're going to be speeding up snapshot deletion, by only having it process the extents/dirents/xattrs btrees if an inode of a given snapshot ID was present. This raises the possibility of 'bkey_in_missing_snapshot' errors popping up, if we ever accidentally don't do the corresponding inode update, or if the new algorithm has bugs. So we'll want to be able to differentiate more definitively between 'snapshot went missing' (and perhaps needs to be reconstructed), and 'key in snapshot that was deleted'. So instead of deleting snapshot IDs, we'll be adding a new deleted flag and leaving them permanently. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:44 -04:00
Kent Overstreet	3f8e977265	bcachefs: Skip unrelated snapshot trees in snapshot deletion Don't scan keys in inodes for which the snapshot tree doesn't match any we're deleting from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:43 -04:00
Kent Overstreet	0afdf4969e	bcachefs: BCH_FSCK_ERR_snapshot_key_missing_inode_snapshot We're going to be doing some snapshot deletion performance improvements, and those will strictly require that if an extent/dirent/xattr is present, an inode is present in that snapshot ID. We already check for this, but we don't repair it on disk: this patch adds that repair and turns it into a real fsck_err(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:43 -04:00
Kent Overstreet	855070dc0b	bcachefs: get_inodes_all_snapshots() now includes whiteouts The next patch is going to change lookup_inode_for_snapshot() to rigorously require that extent/dirent/xattr keys have a corresponding inode key present - whiteouts included, so this simplifies the checks lookup_inode_for_snapshot() will have to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:43 -04:00
Kent Overstreet	a9421140fc	bcachefs: bch2_inode_unpack() cleanup bi_snapshot is now handled like other fields Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:43 -04:00
Kent Overstreet	00757984d5	bcachefs: Improve bch2_request_incompat_feature() message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:42 -04:00
Alan Huang	3c97ebea61	bcachefs: Fix inconsistent req->ec There is req->ec = erasure_code above. Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:42 -04:00
Kent Overstreet	6f2bbd5747	bcachefs: kill inode_walker_entry.snapshot redundant Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:41 -04:00
Kent Overstreet	7b8c41c178	bcachefs: Add comments for inode snapshot requirements Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:41 -04:00
Kent Overstreet	15dbd0d814	bcachefs: snapshot delete progress indicator Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:40 -04:00
Kent Overstreet	e3006cb010	bcachefs: Don't emit bch_sb_field_members_v1 if not required In 'bcachefs_metadata_extent_flags', we stopped requiring members_v1 to be present - only that either v1 or v2 is present. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:40 -04:00
Alan Huang	9180c5f918	bcachefs: Rename x_name to x_name_and_value The flexible array contains name and value, the x_name is misleading. Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:39 -04:00
Kent Overstreet	a42f709f9a	bcachefs: Improve bch2_disk_groups_to_text() Print out the actual name of each path/label, instead of just the integer indexes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:39 -04:00
Kent Overstreet	8a6b883e78	bcachefs: Fix setting ca->name in device add Device add doesn't get the device index and attach to the filesystem until after attaching the block device, and setting the device name from the block device name - these need some minor tweaks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:39 -04:00
Kent Overstreet	5ce11d9d1b	bcachefs: sysfs trigger_recalc_capacity For bug diagnosis Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:38 -04:00
Gustavo A. R. Silva	ae0386e111	bcachefs: Avoid -Wflex-array-member-not-at-end warnings -Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Refactor a couple of structs that contain flexible arrays in the middle by replacing them with unions. So, with these changes, fix the following warnings: fs/bcachefs/disk_accounting.c:429:51: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] fs/bcachefs/ec_types.h:8:41: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:38 -04:00
Kent Overstreet	98e5e36d8c	bcachefs: bch2_dev_add() can run on a non-started fs Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:37 -04:00
Kent Overstreet	a349868b5e	bcachefs: bch2_fs_open() now takes a darray Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:37 -04:00
Kent Overstreet	cf95296295	bcachefs: bch2_trans_update_ip() Allow btree_insert_entry.ip_allocated to be passed in, so we get better info on where alloc updates are coming from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:37 -04:00
Kent Overstreet	7677859a47	bcachefs: Run most explicit recovery passes persistent If we detect an error that requires running a recovery pass, and we're not in recovery, we won't be able to fix it until the next mount - make sure we're noting in the superblock that it needs to run. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:37 -04:00
Kent Overstreet	aff2b6a7fc	bcachefs: provide unlocked version of run_explicit_recovery_pass_persistent Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:36 -04:00
Kent Overstreet	c21f41f690	bcachefs: bch2_dirent_to_text() shows casefolded dirents Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:36 -04:00
Kent Overstreet	cd3cdb1ef7	bcachefs: Single err message for btree node reads Like we just did with the data read path, emit a single error message per btree node read, nicely formatted, with all the actions we took grouped together. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:35 -04:00
Kent Overstreet	9c2472658b	bcachefs: bch2_mark_btree_validate_failure() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:35 -04:00
Kent Overstreet	d31f155964	bcachefs: bch2_fsck_err_opt() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:34 -04:00
Kent Overstreet	600a9207c8	bcachefs: Plumb printbuf through bch2_btree_lost_data() Part of the ongoing project to improve error messages by building them up in printbufs and emitting them all at once, so that we can easily see what events are related in the log. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:34 -04:00
Kent Overstreet	300904700f	bcachefs: kill bch2_run_explicit_recovery_pass_persistent() No longer has users, so we can kill it and rename bch2_run_explicit_recovery_pass_persistent_locked(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:33 -04:00
Kent Overstreet	3aecbb01a1	bcachefs: Remove redundant calls to btree_lost_data() The btree node read path calls this before returning the read error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:33 -04:00
Kent Overstreet	3be132f93c	bcachefs: bch2_btree_lost_data() now handles snapshots tree We have a consolidated place for "this btree lost data, run this repair", so use it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:33 -04:00
Kent Overstreet	b3bbd47f83	bcachefs: Kill redundant error message in topology repair The btree node read path already logs btree node read errors, this isn't needed. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:32 -04:00
Kent Overstreet	156d9e8341	bcachefs: Emit a single log message on data read error Instead of emitting a message immediately when we get an error in the read path, and then another at the end if we successfully retry - emit one single log message before returning from bch2_rbio_retry(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:32 -04:00
Kent Overstreet	353b89c6e6	bcachefs: bch2_io_failures_to_text() Pretty printer for bch_io_failures, to be used for better read error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:31 -04:00
Kent Overstreet	dbc18c97f1	bcachefs: print_string_as_lines: avoid printing empty line If the final line in the message to be printed is blank, don't print it. This happens with indented printbufs - after a newline we emit spaces up to the indent level. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:31 -04:00
Kent Overstreet	41e51769b8	bcachefs: Make various async objs visible in debugfs Add async objs list for - promote_op - bch_read_bio - btree_read_bio - btree_write_bio This gets us introspection on in-flight async ops, and because under the hood it uses fast_lists (percpu slot buffer on top of a radix tree), it'll be fast enough to enable in production. This will be very helpful for debugging "something got stuck" issues, which have been cropping up from time to time (in the CI, especially with folio writeback). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:30 -04:00
Kent Overstreet	0499a82b18	bcachefs: Async object debugging Debugging infrastructure for async objs: this lets us easily create fast_lists for various object types so they'll be visible in debugfs. Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a pretty-printer wrapper in async_objs.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:29 -04:00
Kent Overstreet	d49bafdc5d	bcachefs: fast_list A fast "list" data structure, which is actually a radix tree, with an IDA for slot allocation and a percpu buffer on top of that. Items cannot be added or moved to the head or tail, only added at some (arbitrary) position and removed. The advantage is that adding, removing and iteration is generally lockless, only hitting the lock in ida when the percpu buffer is full or empty. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:29 -04:00
Kent Overstreet	989b4c375a	bcachefs: bch2_read_bio_to_text Pretty printer for struct bch_read_bio. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:29 -04:00
Kent Overstreet	5f0de475f9	bcachefs: bch2_bio_to_text() Pretty printer for struct bio, to be used for async object debugging. This is pretty minimal, we'll add more to it as we discover what we need. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:28 -04:00
Kent Overstreet	cca2c0d224	bcachefs: bch_dev.io_ref -> enumerated_ref Convert device IO refs to enumerated_refs, for easier debugging of refcount issues. Simple conversion: enumerate all users and convert to the new helpers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:28 -04:00
Kent Overstreet	c9b1d94a21	bcachefs: bch_fs.writes -> enumerated_refs Drop the single-purpose write ref code in bcachefs.h, and convert to enumerated refs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:27 -04:00
Kent Overstreet	f5241e4127	bcachefs: enumerated_ref.c Factor out the debug code for rw filesystem refs into a small library. In release mode an enumerated ref is a normal percpu refcount, but in debug mode all enumerated users of the ref get their own atomic_long_t ref - making it much easier to chase down refcount usage bugs for when a refcount has many users. For debugging, we have enumerated_ref_to_text(), which prints the current value of each different user. Additionally, in debug mode enumerated_ref_stop() has a 10 second timeout, after which it will dump outstanding refcounts. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:27 -04:00
Kent Overstreet	6d67de1079	bcachefs: for_each_rw_member_rcu() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:27 -04:00
Kent Overstreet	e14e06e91d	bcachefs: __bch2_fs_read_write() no longer depends on io_ref Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:26 -04:00
Kent Overstreet	9fa4a8a3bd	bcachefs: for_each_online_member_rcu() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:26 -04:00
Kent Overstreet	2483dd1243	bcachefs: recalc_capacity() no longer depends on io_ref Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:25 -04:00
Kent Overstreet	c53be0ffaa	bcachefs: bch2_target_to_text() no longer depends on io_ref Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:25 -04:00
Kent Overstreet	834f9475aa	bcachefs: bch2_check_rebalance_work() Add a pass for checking the rebalance_work btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:24 -04:00
Alan Huang	09279bba72	bcachefs: Kill dead code Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:24 -04:00
Kent Overstreet	62095464e9	bcachefs: Fix struct with flex member ABI warning This pops up when building in userspace. The structs aren't actually variable length, but no way to tell the compiler that... Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:24 -04:00
Kent Overstreet	fe27298b92	bcachefs: bch2_move_data_btree() can now walk roots Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:22 -04:00
Kent Overstreet	3484840ece	bcachefs: bch2_move_data_btree() can move btree nodes Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:22 -04:00
Kent Overstreet	7a274285d3	bcachefs: plumb btree_id through move_pred_fd Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:22 -04:00
Kent Overstreet	f3c8eaf7a1	bcachefs: Plumb target parameter through btree_node_rewrite_pos() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:21 -04:00
Kent Overstreet	ecedc87cfa	bcachefs: export bch2_move_data_phys() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:21 -04:00
Kent Overstreet	0ca375b177	bcachefs: BCH_MEMBER_RESIZE_ON_MOUNT Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:21 -04:00
Kent Overstreet	530112d88e	bcachefs: BCH_FEATURE_small_image We can't go RW if it's an image file that hasn't been resized. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:20 -04:00
Kent Overstreet	203852d9db	bcachefs: BCH_FEATURE_no_alloc_info If a filesystem is going to only be used read-only, and will be a deployable image, we can strip out alloc info for a substantial reduction in metadata size - around half, due to backpointers. Alloc info will be regenerated on first read-write mount. Remounting RW is disallowed for now, since we don't yet have check_allocations running in RW mode. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:20 -04:00
Kent Overstreet	576493133f	bcachefs: Print features on startup with -o verbose Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:20 -04:00
Kent Overstreet	0dc73809e9	bcachefs: Shrink superblock downgrade table Don't generate entries for versions that won't be able to mount. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:20 -04:00
Kent Overstreet	1c8dfd7ba5	bcachefs: sb_validate() no longer requires members_v1 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:19 -04:00
Kent Overstreet	d12bd41018	bcachefs: Add a recovery pass for making sure root inode is readable If the root inode/subvolume is unreadable we can repair automatically - but only if we're still in recovery, so that we can rewind to the appropriate recovery pass. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:19 -04:00
Kent Overstreet	bdad8962c9	bcachefs: Flag for repair on missing subvolume Instead of going emergency read only with a bch2_fs_inconsistent() call, log the error and recovery pass appropriately. If we're still in recovery it'll be repaired immediately, otherwise it'll be repaired on the next mount. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:18 -04:00
Kent Overstreet	ebf561b208	bcachefs: print_str_as_lines() -> print_str() bch2_print_string_as_lines() is a low level helper that allows messages longer than 1k to be printed without truncation. But we should always be printing with the helpers that take a filesystem object, if we're in fsck they direct output to the userspace process controlling fsck instead of the dmesg log. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:18 -04:00
Kent Overstreet	040c762152	bcachefs: bch2_dev_missing_bkey() Part of the ongoing project to kill off bch2_(fs|trans)_inconsistent calls - they generally need to be replaced with either - a fsck_err() call that can repair the error, or - logging an error of the appropriate type in the superblock, and flagging the appropriate recovery pass to repair the error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:17 -04:00
Kent Overstreet	2085325171	bcachefs: Simplify bch2_count_fsck_err() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:17 -04:00
Kent Overstreet	bb36a12921	bcachefs: bch2_run_explicit_recovery_pass_printbuf() We prefer helpers that emit log messages to printbufs rather than printing them directly; that way, we can ensure that different log messages from the same event are grouped together and formatted appropriately in the dmesg log. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:16 -04:00
Kent Overstreet	5022d0e183	bcachefs: Incompatible features may now be enabled at runtime version_upgrade is now a runtime option. In the future we'll want to add compatible upgrades at runtime, and call the full check_version_upgrade() when the option changes, but we don't have compatible optional upgrades just yet. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:16 -04:00
Kent Overstreet	c79eb06da4	bcachefs: Clean up option pre/post hooks, small fixes The helpers are now: - bch2_opt_hook_pre_set() - bch2_opts_hooks_pre_set() - bch2_opt_hook_post_set Fix a bug where the filesystem discard option would incorrectly be changed when setting the device option, and don't trigger rebalance scans unnecessarily (when options aren't changing). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:16 -04:00
Kent Overstreet	83ecd1b122	bcachefs: Use drop_locks_do() in bch2_inode_hash_find() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:16 -04:00
Kent Overstreet	c02e5b5728	bcachefs: Single device mode Single device filesystems are now identified by the block device name, not the UUID - and single device filesystems with the same UUID can be mounted simultaneously, without any special options. This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which indicates whether a filesystem has ever been multi device. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:15 -04:00
Kent Overstreet	58c36e6710	bcachefs: Initialize c->name earlier on single dev filesystems On single device filesystems, c->name contains the block device name, not the UUID. Initialize this earlier, so that single device mode can use it for initializing sysfs/debugfs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:15 -04:00
Alan Huang	0e43bf5a6a	bcachefs: Simplify logic Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:14 -04:00
Alan Huang	152bae193c	bcachefs: Remove spurious +1/-1 operation Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:14 -04:00
Alan Huang	f013b4ca35	bcachefs: Kill bch2_trans_unlock_noassert Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:14 -04:00
Kent Overstreet	6f03e30e7c	bcachefs: Clean up duplicated code in bch2_journal_halt() It's now a wrapper around bch2_journal_halt_locked(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:13 -04:00
Kent Overstreet	03f8f9a129	bcachefs: bch2_dev_allocator_set_rw() Add a helper that lets us change bch_member.data_allowed at runtime. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:13 -04:00
Kent Overstreet	2e0d51d00e	bcachefs: bch2_dev_journal_alloc() now respects data_allowed Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:13 -04:00
Kent Overstreet	93ac4d5f92	bcachefs: Improve bch2_btree_cache_to_text() Make the output slightly clearer, and include a counter for "nodes we couldn't free because we would have gone under our reserve". Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:13 -04:00
Kent Overstreet	e50fe14c54	bcachefs: __btree_node_reclaim_checks() Factor out a helper so we're not duplicating checks after locking the btree node. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:13 -04:00
Kent Overstreet	68aaeb7c8b	bcachefs: kill BTREE_CACHE_NOT_FREED_INCREMENT() Small cleanup, just always increment the counters. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:12 -04:00
Kent Overstreet	ef8dd631f7	bcachefs: Improve opts.degraded Kill 'opts.very_degraded', and make 'opts.degraded' a persistent option, stored in the superblock. It's now an enum, with available choices ask/yes/very/no. "ask" mode will be handled by the mount helper, for prompting the user (on a machine used interactively) for whether to do a degraded mount. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:12 -04:00
Kent Overstreet	2758c28aca	bcachefs: export bch2_chacha20 Needed for userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:11 -04:00
Integral	dd1b99f706	bcachefs: indent error messages of invalid compression This patch uses printbuf_indent_add_nextline() to set a consistent indentation level for error messages of invalid compression. In my previous patch [1], the newline is added by using '\n' in the argument of prt_str(). This patch replaces prt_str() with prt_printf() to make indentation level work correctly. [1] Link: https://lore.kernel.org/20250406152659.205997-2-integral@archlinuxcn.org Signed-off-by: Integral <integral@archlinuxcn.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:10 -04:00
Integral	84ccd47d26	bcachefs: split error messages of invalid compression into two lines When an invalid compression type or level is passed as an argument to `--compression`, two error messages are squashed into one line: > bcachefs format --compression=lzo bcachefs-comp.img invalid option: invalid compression typecompression: parse error > bcachefs format --compression=lz4:16 bcachefs-comp.img invalid option: invalid compression levelcompression: parse error To resolve this issue, add a newline character at the end of the first error message to separate them into two lines. Signed-off-by: Integral <integral@archlinuxcn.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:08 -04:00
Integral	0e790469bf	bcachefs: early return for negative values when parsing BCH_OPT_UINT Currently, when passing a negative integer as argument, the error message is "too big" due to casting to an unsigned integer: > bcachefs format --block_size=-1 bcachefs.img invalid option: block_size: too big (max 65536) When a negative value is detected in the argument, return early before calling bch2_opt_validate(). A new error code `BCH_ERR_option_negative` is added. Signed-off-by: Integral <integral@archlinuxcn.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:07 -04:00
Kent Overstreet	3a2a0d08b2	bcachefs: move_data_phys: stats are not required Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:05 -04:00
Kent Overstreet	d4d71b58e5	bcachefs: RO mounts now use less memory Defer memory allocations only needed in RW mode until we actually go RW. This is part of improved support for RO images. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:04 -04:00
Kent Overstreet	a17e985be9	bcachefs: Move various init code to _init_early() _init_early() is for initialization that cannot fail, and often must happen for teardown partway through initialization to work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:02 -04:00
Kent Overstreet	31813dcf37	bcachefs: alphabetize init function calls Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:00 -04:00
Kent Overstreet	25ee021c7f	bcachefs: simplify journal pin initialization Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:59 -04:00
Kent Overstreet	2767f4f258	bcachefs: btree_io_complete_wq -> btree_write_complete_wq Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:56 -04:00
Kent Overstreet	c9b5d9cd26	bcachefs: bch2_kvmalloc() mem alloc profiling Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:56 -04:00
Kent Overstreet	bcaea61adc	bcachefs: add missing include Hygiene, and fix build in userspace. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:52 -04:00
Kent Overstreet	b974357c63	bcachefs: bch2_snapshot_table_make_room() Add a better helper for check_snapshot_exists(). create_snapids() can't be changed to use this, unfortunately, because the transaction that creates new snapshot will also be inserting other keys (e.g. root inode) that reference that snapshot ID, and they expect the snapshot table to already be updated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:50 -04:00
Kent Overstreet	ea27e8ca5d	bcachefs: darray: provide typedefs for primitive types Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:47 -04:00
Kent Overstreet	2a81bd454c	bcachefs: reduce new_stripe_alloc_buckets() stack usage Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:44 -04:00
Kent Overstreet	a0b0b9bb9e	bcachefs: alloc_request no longer on stack Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:43 -04:00
Kent Overstreet	95f2315af7	bcachefs: alloc_request.ptrs2 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:40 -04:00
Kent Overstreet	e038213658	bcachefs: alloc_request.ca Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:39 -04:00
Kent Overstreet	7f65d1cf5c	bcachefs: alloc_request.counters Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:37 -04:00
Kent Overstreet	4d00e88d21	bcachefs: alloc_request.usage Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:36 -04:00
Kent Overstreet	a0312f4251	bcachefs: alloc_request: deallocate_extra_replicas() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:35 -04:00
Kent Overstreet	ac0952b0e5	bcachefs: new_stripe_alloc_buckets() takes alloc_request More stack usage improvements: instead of creating a new alloc_request (currently on the stack), save/restore just the fields we need to reuse. This is a bit tricky, because we're doing a normal alloc_foreground.c allocation, which calls into ec.c to get a stripe, which then does more normal allocations - some of the fields get reused, and used differently. So we have to save and restore them - but the stack usage improvements will be well worth it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:33 -04:00
Kent Overstreet	7100344301	bcachefs: bch2_ec_stripe_head_get() takes alloc_request Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:32 -04:00
Kent Overstreet	9259883b79	bcachefs: bch2_bucket_alloc_trans() takes alloc_request Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:31 -04:00
Kent Overstreet	799c418303	bcachefs: alloc_request.data_type Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:30 -04:00
Kent Overstreet	ad63f9f1e9	bcachefs: struct alloc_request Add a struct for common state for satisfying an on disk allocation, instead of passing the same long list of items to every function. This will help with stack usage, performance, and perhaps enable some code cleanups. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:27 -04:00
Kent Overstreet	d02755b8c5	bcachefs: trace bch2_trans_kmalloc() We're occasionally seeing the WARN_ON() for bump allocator usage exceeding BTREE_TRANS_MEM_MAX; add some tracing so we can see what's going on. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:27 -04:00
Roxana Nicolescu	caa6baa45f	bcachefs: replace memcpy with memcpy_and_pad for jset_entry_log->d buff This was achieved before by zero-ing out the source buffer and then copying the bytes into the destination buffer. This can also be done with memcpy_and_pad which will zero out only the destination buffer if its size is bigger than the size of the source buffer. This is already used in the same way in journal_transaction_name(). Moreover, zero-ing the source buffer was done twice, first in __bch2_fs_log_msg() and then in bch2_trans_log_msg(). And this method may also require allocating some extra memory for the source buffer. In conclusion, using memcpy_and_pad is better even though the result is the same because it brings uniformity with what's already used in journal_transaction_name, it avoids code duplication and reallocating extra memory. Signed-off-by: Roxana Nicolescu <nicolescu.roxana@protonmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:25 -04:00
Roxana Nicolescu	4e2caf82ce	bcachefs: replace strncpy() with memcpy_and_pad in journal_transaction_name Strncpy is now deprecated. The buffer destination is not required to be NULL-terminated, but we also want to zero out the rest of the buffer as it is already done in other places. Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Roxana Nicolescu <nicolescu.roxana@protonmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:24 -04:00
Kent Overstreet	8c087d2ddf	bcachefs: Rebalance now skips poisoned extents Let's not move poisoned extents unnecessarily, since we can't guard against introducing more bitrot. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:22 -04:00
Kent Overstreet	cb8336ca42	bcachefs: Data move can read from poisoned extents Now, if an extent is poisoned we can move it even if there was a checksum error. We'll have to give it a new checksum, but the poison bit means that userspace will still see the appropriate error when they try to read it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:21 -04:00
Kent Overstreet	760be1ad5e	bcachefs: Poison extents that can't be read due to checksum errors Copygc needs to be able to move extents that have bitrotted. We don't want to delete them - in the future we'll have an API for "read me the data even if there's checksum errors", and in general we don't want to delete anything unless the user asks us to. That will require writing it with a new checksum, which means we can't forget that there was a checksum error so we return the correct error to userspace. Rebalance also wants to skip bad extents; we can now use the poison flag for that. This is currently disabled by default, as we want read fua support so that we can distinguish between transient and permanent errors from the device. It may be enabled with the module parameter: poison_extents_on_checksum_error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:19 -04:00
Kent Overstreet	6659ba3b18	bcachefs: Be precise about bch_io_failures If the extent we're reading from changes, due to being overwritten or moved (possibly partially) - we need to reset bch_io_failures so that we don't accidentally mark a new extent as poisoned prematurely. This means we have to separately track (in the retry path) the extent we previously read from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:17 -04:00
Kent Overstreet	0e5f1f3f8f	bcachefs: bch2_subvolume_wait_for_pagecache_and_delete() cleanup Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:15 -04:00
Kent Overstreet	010c894681	bcachefs: Check for casefolded dirents in non casefolded dirs Check for mismatches between casefold dirents and casefold directories. A mismatch will cause lookups to fail, as we'll be doing the lookup with the casefolded name, which won't match the non-casefolded dirent, and vice versa. Reported-by: Christopher Snowhill <chris@kode54.net> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:14 -04:00
Kent Overstreet	ecd76c5f10	bcachefs: Fix bch2_dirent_create_snapshot() for casefolding bch2_dirent_create_snapshot(), used in fsck, neglected to create a casefolded dirent. Just move this into dirent_create_key(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:13 -04:00
Kent Overstreet	8d5ac187da	bcachefs: Fix casefold opt via xattr interface Changing the casefold option requires extra checks/work - factor out a helper from bch2_fileattr_set() for the xattr code to use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:13:09 -04:00
Kent Overstreet	cbed8287e5	bcachefs: mkwrite() now only dirties one page Don't dirty the whole folio - fixes write amplification with applications doing mmaped writes. https://www.reddit.com/r/bcachefs/comments/1klzcg1/incredible_amounts_of_write_amplification_when/ Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-19 08:28:41 -04:00
Kent Overstreet	494d458cfa	bcachefs: fix extent_has_stripe_ptr() This wasn't checking indirect extents. Fixes: https://github.com/koverstreet/bcachefs/issues/887 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-18 22:35:33 -04:00
Kent Overstreet	49771a7578	bcachefs: Fix bch2_btree_path_traverse_cached() when paths realloced btree_key_cache_fill() will allocate and traverse another path (for the underlying btree), so we can't hold pointers to paths across a call - we have to pass indices. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-17 18:46:17 -04:00
Kent Overstreet	9c09e59cc5	bcachefs: fix wrong arg to fsck_err() fsck_err() needs the btree transaction passed to it if there is one - so that it can unlock/relock around prompting userspace for fixing the error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 18:59:15 -04:00
Kent Overstreet	d1041d8eab	bcachefs: Fix missing commit in backpointer to missing target Fsck wants to do transaction commits from an outer context; it may have other repair to do (i.e. duplicate backpointers). But when calling backpointer_not_found() from runtime code, i.e. runtime self healing, we should be doing the commit - the outer context expects to just be doing lookups. This fixes bugs where we get stuck spinning, reported as "RCU lock hold time" warnings. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	a12cb6f758	bcachefs: Fix accidental O(n^2) in fiemap Since bch2_seek_pagecache_data() searches for dirty data, we only want to call it for holes in the extents btree - otherwise we have an accidental O(n^2), as we repeatedly search the same range. Reported-by: Marcin Mirosław <marcin@mejor.pl> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	43b9fece2d	bcachefs: Fix set_should_be_locked() call in peek_slot() set_should_be_locked() needs to be called before peek_key_cache(), which traverses other paths and may do a trans unlock/relock. This fixes an assertion pop in path_peek_slot(), when the path we're using is unexpectedly not uptodate. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Alan Huang	61198e6287	bcachefs: Fix self deadlock Before invoking bch2_accounting_mem_mod_locked in bch2_gc_accounting_done, we have already write-locked mark_lock; in bch2_accounting_mem_insert, we lock mark_lock again. Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	19b22d04cd	bcachefs: Don't set btree nodes as accessed on fill Prevent jobs that do lots of scanning (i.e. evacuate, scrub) from causing OOMs. The shrinker code seems to be having issues when it doesn't do any freeing because it's just flipping off the accessed bit - and the accessed bit shouldn't be set on first use anyways. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	7b6759b199	bcachefs: Fix livelock in journal_entry_open() When the journal is low on space, we might do discards from journal_res_get() -> journal_entry_open(). Make sure we set j->can_discard correctly, so that if we're low on space but not because discards aren't keeping up we don't livelock. Fixes: `8e4d28036c` ("bcachefs: Don't aggressively discard the journal") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	b1c71cb492	bcachefs: Fix broken btree_path lock invariants in next_node() This fixes btree locking assert pops users were seeing during evacuate: https://github.com/koverstreet/bcachefs/issues/878 May 09 22:45:02 sharon kernel: bcachefs (68116e25-fa2d-4c6f-86c7-e8b431d792ae): bch2_btree_insert_node(): node not locked at level 1 May 09 22:45:02 sharon kernel: bch2_btree_node_rewrite [bcachefs]: watermark=btree no_check_rw alloc l=0-1 mode=none nodes_written=0 cl.remaining=2 journal_seq=0 May 09 22:45:02 sharon kernel: path: idx 1 ref 1:0 S B btree=alloc level=0 pos 0:3699637:0 0:3698012:1-0:3699637:0 bch2_move_btree.isra.0+0x1db/0x490 [bcachefs] uptodate 0 locks_want 2 May 09 22:45:02 sharon kernel: l=0 locks intent seq 4 node ffff8bd700c93600 May 09 22:45:02 sharon kernel: l=1 locks unlocked seq 1712 node ffff8bd6fd5e7a00 May 09 22:45:02 sharon kernel: l=2 locks unlocked seq 2295 node ffff8bd6cc725400 May 09 22:45:02 sharon kernel: l=3 locks unlocked seq 0 node 0000000000000000 Evacuate walks btree nodes with bch2_btree_iter_next_node() and rewrites them, bch2_btree_update_start() upgrades the path to take intent locks as far as it needs to. But next_node() does low level unlock/relock calls on individual nodes, and didn't handle the case where a path is supposed to be holding multiple intent locks. If a path has locks_want > 1, it needs to be either holding locks on all the btree nodes (at each level) requested, or none of them. Fix this with a bch2_btree_path_downgrade(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Kent Overstreet	cd52cc3544	bcachefs: Don't strip rebalance_opts from indirect extents Fix bch2_bkey_clear_needs_rebalance(): indirect extents are never supposed to have bch_extent_rebalance stripped off, because that's how we get the IO path options when we don't have the original inode it belonged to. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-14 17:05:19 -04:00
Eric Biggers	607c92141c	crypto: lib/chacha - add strongly-typed state zeroization Now that the ChaCha state matrix is strongly-typed, add a helper function chacha_zeroize_state() which zeroizes it. Then convert all applicable callers to use it instead of direct memzero_explicit. No functional changes. Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-05-12 13:32:53 +08:00
Eric Biggers	98066f2f89	crypto: lib/chacha - strongly type the ChaCha state The ChaCha state matrix is 16 32-bit words. Currently it is represented in the code as a raw u32 array, or even just a pointer to u32. This weak typing is error-prone. Instead, introduce struct chacha_state: struct chacha_state { u32 x[16]; }; Convert all ChaCha and HChaCha functions to use struct chacha_state. No functional changes. Signed-off-by: Eric Biggers <ebiggers@google.com> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>	2025-05-12 13:32:53 +08:00
Fedor Pchelkin	f3def8270c	sort.h: hoist cmp_int() into generic header file Deduplicate the same functionality implemented in several places by moving the cmp_int() helper macro into linux/sort.h. The macro performs a three-way comparison of the arguments mostly useful in different sorting strategies and algorithms. Link: https://lkml.kernel.org/r/20250427201451.900730-1-pchelkin@ispras.ru Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru> Suggested-by: Darrick J. Wong <djwong@kernel.org> Acked-by: Kent Overstreet <kent.overstreet@linux.dev> Acked-by: Coly Li <colyli@kernel.org> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Carlos Maiolino <cem@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Coly Li <colyli@kernel.org> Cc: Jan Kara <jack@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2025-05-11 17:54:12 -07:00
Ingo Molnar	aad823aa3a	treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack() Move this API to the canonical timer_*() namespace. Signed-off-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://lore.kernel.org/all/20250507175338.672442-10-mingo@kernel.org	2025-05-08 19:49:33 +02:00
Kent Overstreet	8e4d28036c	bcachefs: Don't aggressively discard the journal We frequently use 'bcachefs list_journal -a' for debugging, as it provides a record of all btree transactions, and a history of what happened. But it's not so useful if we immediately discard journal buckets right after they're no longer dirty. This tweaks journal reclaim to only discard when we're low on space, keeping the journal mostly un-discarded. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-07 17:10:10 -04:00
Kent Overstreet	da18dabc37	bcachefs: Ensure superblock gets written when we go ERO When we go emergency read-only, make sure we do a final write_super() to persist counters and error counts - this can be critical for piecing together what fsck was doing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-07 17:09:59 -04:00
Kent Overstreet	2fea3aa76e	bcachefs: Filter out harmless EROFS error messages These just indicate that we're shutting down. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-07 16:58:32 -04:00
Kent Overstreet	473f09f362	bcachefs: journal_shutdown is EROFS, not EIO We often filter out EROFS errors to avoid log spew after an emergency shutdown - journal_shutdown is just another emergency shutdown error. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-07 16:58:26 -04:00
Kent Overstreet	9c61856099	bcachefs: Call bch2_fs_start before getting vfs superblock This reverts `1fdbe0b184` ("bcachefs: Make sure c->vfs_sb is set before starting fs"). That commit switched up bch2_fs_get_tree() so that we got a superblock before calling bch2_fs_start, so that c->vfs_sb would always be initialized while the filesystem was active. This turned out not to be necessary, because blk_holder_ops were implemented using our own locking, not vfs locking. And this had the side effect of creating a super_block and doing our full recovery (including potentially fsck) before setting SB_BORN, which causes things like sync calls to hang until our recovery is finished. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-05 16:06:35 -04:00
Kent Overstreet	aed4ccbf45	bcachefs: fix hung task timeout in journal read Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-05 14:21:28 -04:00
Kent Overstreet	7a69fa6571	bcachefs: Add missing barriers before wake_up_bit() wake_up() doesn't require a barrier - but wake_up_bit() does. This only affected non-x86 architectures, and primarily led to lost wakeups after btree node reads. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-05 14:19:10 -04:00
Kent Overstreet	50a7b899a0	bcachefs: Ensure proper write alignment There was a buggy version of bcachefs-tools which picked misaligned bucket sizes when formatting, and we're also about to do dynamic block sizes - which will allow picking logical block size or physical block size of the device per-write, allowing for better compression ratios at the cost of slightly worse write performance (i.e. forcing the device to do RMW or extra buffering). To account for this, tweak bch2_alloc_sectors_start() to properly align open_buckets to the blocksize of the write we're about to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-05 14:19:01 -04:00
Kent Overstreet	844f766e02	bcachefs: Improve want_cached_ptr() If promote target isn't set, rebalance should still leave a cached copy on the faster device. Fall back to foreground_target if it's set, or allow a cached copy on any device if neither is set. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-05 14:16:20 -04:00
Kent Overstreet	df2e19a883	bcachefs: thread_with_stdio: fix spinning instead of exiting bch2_stdio_redirect_vprintf() was missing a check for stdio->done, i.e. exiting. This caused the thread attempting to print to spin, and since it was being called from the kthread run by thread_with_stdio, the userspace side hung as well. Change it to return -EPIPE - i.e. writing to a pipe that's been closed. Reported-by: Jan Solanti <jhs@psonet.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-04 14:00:14 -04:00
Alan Huang	6846100b00	bcachefs: Remove incorrect __counted_by annotation This actually reverts `86e92eeeb2` ("bcachefs: Annotate struct bch_xattr with __counted_by()"). After the x_name, there is a value. According to the discussion [1], __counted_by assumes that the flexible array member contains exactly the number of elements specified. Users have now come across a false-positive detection of an out-of-bounds write caused by the __counted_by here [2], so revert that. [1] https://lore.kernel.org/lkml/Zv8VDKWN1GzLRT-_@archlinux/T/#m0ce9541c5070146320efd4f928cc1ff8de69e9b2 [2] https://privatebin.net/?a0d4e97d590d71e1#9bLmp2Kb5NU6X6cZEucchDcu88HzUQwHUah8okKPReEt Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 16:38:58 -04:00
Kent Overstreet	28580052e6	bcachefs: add missing sched_annotate_sleep() 00594 ------------[ cut here ]------------ 00594 do not call blocking ops when !TASK_RUNNING; state=2 set at [<000000003e51ef4a>] prepare_to_wait_event+0x5c/0x1c0 00594 WARNING: CPU: 12 PID: 1117 at kernel/sched/core.c:8741 __might_sleep+0x74/0x88 00594 Modules linked in: 00594 CPU: 12 UID: 0 PID: 1117 Comm: umount Not tainted 6.15.0-rc4-ktest-g3a72e369412d #21845 PREEMPT 00594 Hardware name: linux,dummy-virt (DT) 00594 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00594 pc : __might_sleep+0x74/0x88 00594 lr : __might_sleep+0x74/0x88 00594 sp : ffffff80c8d67a90 00594 x29: ffffff80c8d67a90 x28: ffffff80f5903500 x27: 0000000000000000 00594 x26: 0000000000000000 x25: ffffff80cf5002a0 x24: ffffffc087dad000 00594 x23: ffffff80c8d67b40 x22: 0000000000000000 x21: 0000000000000000 00594 x20: 0000000000000242 x19: ffffffc080b92020 x18: 00000000ffffffff 00594 x17: 30303c5b20746120 x16: 74657320323d6574 x15: 617473203b474e49 00594 x14: 0000000000000001 x13: 00000000000c0000 x12: ffffff80facc0000 00594 x11: 0000000000000001 x10: 0000000000000001 x9 : ffffffc0800b0774 00594 x8 : c0000000fffbffff x7 : ffffffc087dac670 x6 : 00000000015fffa8 00594 x5 : ffffff80facbffa8 x4 : ffffff80fbd30b90 x3 : 0000000000000000 00594 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffffff80f5903500 00594 Call trace: 00594 __might_sleep+0x74/0x88 (P) 00594 __mutex_lock+0x64/0x8d8 00594 mutex_lock_nested+0x28/0x38 00594 bch2_fs_ec_flush+0xf8/0x128 00594 __bch2_fs_read_only+0x54/0x1d8 00594 bch2_fs_read_only+0x3e0/0x438 00594 __bch2_fs_stop+0x5c/0x250 00594 bch2_put_super+0x18/0x28 00594 generic_shutdown_super+0x6c/0x140 00594 bch2_kill_sb+0x1c/0x38 00594 deactivate_locked_super+0x54/0xd0 00594 deactivate_super+0x70/0x90 00594 cleanup_mnt+0xec/0x188 00594 __cleanup_mnt+0x18/0x28 00594 task_work_run+0x90/0xd8 00594 do_notify_resume+0x138/0x148 00594 el0_svc+0x9c/0xa0 00594 el0t_64_sync_handler+0x104/0x130 00594 el0t_64_sync+0x154/0x158 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 13:54:58 -04:00
Kent Overstreet	e2699274d5	bcachefs: Fix __bch2_dev_group_set() bch2_sb_disk_groups_to_cpu() goes off of the superblock member info, so we need to set that first. Reported-by: Stijn Tintel <stijn@linux-ipv6.be> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 12:22:10 -04:00
Kent Overstreet	e660d7ca74	bcachefs: Kill ERO for i_blocks check in truncate Replace with logging the error in the superblock. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 06:19:58 -04:00
Kent Overstreet	3a72e36941	bcachefs: check for inode.bi_sectors underflow Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 06:19:58 -04:00
Kent Overstreet	05450c48a3	bcachefs: Kill ERO in __bch2_i_sectors_acct() We won't be root causing this in the immediate future, and it's fairly innocuous - so just log it in the superblock. https://github.com/koverstreet/bcachefs/issues/869 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-01 06:19:58 -04:00
Kent Overstreet	5e63d579e7	bcachefs: readdir fixes - Don't call bch2_trans_relock() after dir_emit(); taking a transaction restart here will cause us to emit the same dirent to userspace twice - Fix incorrect checking of the return value on dir_emit(): "true" means success, keep going, but bch2_dir_emit() needs to return true when we're finished iterating. https://github.com/koverstreet/bcachefs/issues/867 Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-30 11:49:34 -04:00
Kent Overstreet	2feaa92c7c	bcachefs: improve missing journal write device error message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-30 11:49:28 -04:00
Kent Overstreet	dbe4674802	bcachefs: Topology error after insert is now an ERO A user hit this, and this will naturally be easier to debug if we don't panic. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 22:42:17 -04:00
Kent Overstreet	9a4a858c9b	bcachefs: Use bch2_kvmalloc() for journal keys array We can hit this limit fairly easily when we have to reconstruct large amounts of alloc info on large filesystems. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 22:42:17 -04:00
Kent Overstreet	e5a3b8cf33	bcachefs: More informative error message when shutting down due to error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 22:42:17 -04:00
Kent Overstreet	652dd6558b	bcachefs: btree_root_unreadable_and_scan_found_nothing autofix for non data btrees If losing a btree won't cause data loss - i.e. it's an alloc btree, or we can easily reconstruct it - we shouldn't require user action to continue repair. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 22:42:17 -04:00
Kent Overstreet	c366b1672d	bcachefs: btree_node_data_missing is now autofix Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:13 -04:00
Kent Overstreet	eca5b56ccf	bcachefs: Don't generate alloc updates to invalid buckets Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:13 -04:00
Kent Overstreet	e7f1a52849	bcachefs: Improve bch2_dev_bucket_missing() More useful error message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:13 -04:00
Kent Overstreet	002466446a	bcachefs: fix bch2_dev_buckets_resize() The resize memcpy path was totally busted. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:13 -04:00
Kent Overstreet	9e9c28acfd	bcachefs: Add upgrade table entry from 0.14 There are a few errors that needed to be marked as autofix. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	3c24020119	bcachefs: Run BCH_RECOVERY_PASS_reconstruct_snapshots on missing subvol -> snapshot Fix this repair path. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	bdc32a10a2	bcachefs: Add missing utf8_unload() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	70c3d89f49	bcachefs: Emit unicode version message on startup fstests expects this Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	c83311c5b9	bcachefs: Use generic_set_sb_d_ops for standard casefolding d_ops Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	a2f546330e	bcachefs: Fix losing return code in next_fiemap_extent() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-28 16:46:12 -04:00
Kent Overstreet	d1b0f9aa73	bcachefs: Rework fiemap transaction restart handling Restart handling in the previous patch was incorrect, so: move btree operations into a separate helper, and run it with a lockrestart_do(). Additionally, clarify whether pagecache or the btree takes precedence. Right now, the btree takes precedence: this is incorrect, but it's needed to pass fstests. Add a giant comment explaining why. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-04-24 19:10:29 -04:00
Brian Foster	b9b0494017	bcachefs: add fiemap delalloc extent detection bcachefs currently populates fiemap data from the extents btree. This works correctly when the fiemap sync flag is provided, but if not, it skips all delalloc extents that have not yet been flushed. This is because delalloc extents from buffered writes are first stored as reservation in the pagecache, and only become resident in the extents btree after writeback completes. Update the fiemap implementation to process holes between extents by scanning pagecache for data, via seek data/hole. If a valid data range is found over a hole in the extent btree, fake up an extent key and flag the extent as delalloc for reporting to userspace. Note that this does not necessarily change behavior for the case where there is dirty pagecache over already written extents, where when in COW mode, writeback will allocate new blocks for the underlying ranges. The existing behavior is consistent with btrfs and it is recommended to use the sync flag for the most up to date extent state from fiemap. Signed-off-by: Brian Foster <bfoster@redhat.com>	2025-04-24 19:10:29 -04:00
Brian Foster	2d55a63709	bcachefs: refactor fiemap processing into extent helper and struct The bulk of the loop in bch2_fiemap() involves processing the current extent key from the iter, including following indirections and trimming the extent size and such. This patch makes a few changes to reduce the size of the loop and facilitate future changes to support delalloc extents. Define a new bch_fiemap_extent structure to wrap the bkey buffer that holds the extent key to report to userspace along with associated fiemap flags. Update bch2_fill_extent() to take the bch_fiemap_extent as a param instead of the individual fields. Finally, lift the bulk of the extent processing into a bch2_fiemap_extent() helper that takes the current key and formats the bch_fiemap_extent appropriately for the fill function. No functional changes intended by this patch. Signed-off-by: Brian Foster <bfoster@redhat.com>	2025-04-24 19:10:29 -04:00
Brian Foster	d020a9fb11	bcachefs: track current fiemap offset in start variable Signed-off-by: Brian Foster <bfoster@redhat.com>	2025-04-24 19:10:28 -04:00
Brian Foster	28d2d19ccc	bcachefs: drop duplicate fiemap sync flag FIEMAP_FLAG_SYNC handling was deliberately moved into core code in commit `45dd052e67` ("fs: handle FIEMAP_FLAG_SYNC in fiemap_prep"), released in kernel v5.8. Update bcachefs accordingly. Signed-off-by: Brian Foster <bfoster@redhat.com>	2025-04-24 19:10:28 -04:00

5280 Commits