linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-08-30 21:52:21 +00:00

Author	SHA1	Message	Date
Kent Overstreet	af5b88618a	bcachefs: Update /dev/disk/by-uuid on device add Invalidate pagecache after we write the new superblock and send a uevent. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-11 23:21:30 -04:00
Kent Overstreet	b76cce1270	bcachefs: Add more flags to btree nodes for rewrite reason It seems excessive forced btree node rewrites can cause interior btree updates to become wedged during recovery, before we're using the write buffer for backpointer updates. Add more flags so we can determine where these are coming from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-11 23:21:30 -04:00
Kent Overstreet	c7e351be7a	bcachefs: Add range being updated to btree_update_to_text() We had a deadlock during recovery where interior btree updates became wedged and all open_buckets were consumed; start adding more introspection. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-11 23:21:29 -04:00
Kent Overstreet	b43f724927	bcachefs: Log fsck errors in the journal Log the specific error being corrected in the journal when we're repairing, this helps greatly with 'bcachefs list_journal' analysis. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-11 23:21:29 -04:00
Kent Overstreet	47fe65b105	bcachefs: Add missing restart handling to check_topology() The next patch will add logging of the specific error being corrected in repair paths to the journal; this means __bch2_fsck_err() can return transaction restarts in places that previously weren't expecting them. check_topology() is old code that doesn't use btree iterators for btree node locking - it'll have to be rewritten in the future to work online. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-11 23:21:29 -04:00
Linus Torvalds	ff0905bbf9	Merge tag 'bcachefs-2025-06-04' of git://evilpiepirate.org/bcachefs Pull more bcachefs updates from Kent Overstreet: "More bcachefs updates: - More stack usage improvements (~600 bytes) - Define CLASS()es for some commonly used types, and convert most rcu_read_lock() uses to the new lock guards - New introspection: - Superblock error counters are now available in sysfs: previously, they were only visible with 'show-super', which doesn't provide a live view - New tracepoint, error_throw(), which is called any time we return an error and start to unwind - Repair - check_fix_ptrs() can now repair btree node roots - We can now repair when we've somehow ended up with the journal using a superblock bucket - Revert some leftovers from the aborted directory i_size feature, and add repair code: some userspace programs (e.g. sshfs) were getting confused It seems in 6.15 there's a bug where i_nlink on the vfs inode has been getting incorrectly set to 0, with some unfortunate results; list_journal analysis showed bch2_inode_rm() being called (by bch2_evict_inode()) when it clearly should not have been. - bch2_inode_rm() now runs "should we be deleting this inode?" checks that were previously only run when deleting unlinked inodes in recovery - check_subvol() was treating a dangling subvol (pointing to a missing root inode) like a dangling dirent, and deleting it. This was the really unfortunate one: check_subvol() will now recreate the root inode if necessary This took longer to debug than it should have, and we lost several filesystems unnecessarily, because users have been ignoring the release notes and blindly running 'fsck -y'. Debugging required reconstructing what happened through analyzing the journal, when ideally someone would have noticed 'hey, fsck is asking me if I want to repair this: it usually doesn't, maybe I should run this in dry run mode and check what's going on?' As a reminder, fsck errors are being marked as autofix once we've verified, in real world usage, that they're working correctly; blindly running 'fsck -y' on an experimental filesystem is playing with fire Up to this incident we've had an excellent track record of not losing data, so let's try to learn from this one This is a community effort, I wouldn't be able to get this done without the help of all the people QAing and providing excellent bug reports and feedback based on real world usage. But please don't ignore advice and expect me to pick up the pieces If an error isn't marked as autofix, and it /is/ happening in the wild, that's also something I need to know about so we can check it out and add it to the autofix list if repair looks good. I haven't been getting those reports, and I should be; since we don't have any sort of telemetry yet I am absolutely dependent on user reports Now I'll be spending the weekend working on new repair code to see if I can get a filesystem back for a user who didn't have backups" * tag 'bcachefs-2025-06-04' of git://evilpiepirate.org/bcachefs: (69 commits) bcachefs: add cond_resched() to handle_overwrites() bcachefs: Make journal read log message a bit quieter bcachefs: Fix subvol to missing root repair bcachefs: Run may_delete_deleted_inode() checks in bch2_inode_rm() bcachefs: delete dead code from may_delete_deleted_inode() bcachefs: Add flags to subvolume_to_text() bcachefs: Fix oops in btree_node_seq_matches() bcachefs: Fix dirent_casefold_mismatch repair bcachefs: Fix bch2_fsck_rename_dirent() for casefold bcachefs: Redo bch2_dirent_init_name() bcachefs: Fix -Wc23-extensions in bch2_check_dirents() bcachefs: Run check_dirents second time if required bcachefs: Run snapshot deletion out of system_long_wq bcachefs: Make check_key_has_snapshot safer bcachefs: BCH_RECOVERY_PASS_NO_RATELIMIT bcachefs: bch2_require_recovery_pass() bcachefs: bch_err_throw() bcachefs: Repair code for directory i_size bcachefs: Kill un-reverted directory i_size code bcachefs: Delete redundant fsck_err() ...	2025-06-04 19:14:24 -07:00
Kent Overstreet	3d11125ff6	bcachefs: add cond_resched() to handle_overwrites() Fix soft lockup warnings in btree nodes scan. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	a4b0f75050	bcachefs: Make journal read log message a bit quieter Users seem to be assuming that the 'dropped unflushed entries' message at the end of journal read indicates some sort of problem, when it does not - we expect there to be entries in the journal that weren't committed; it's purely informational so that we can correlate journal sequence numbers elsewhere when debugging. Shorten the log message a bit to hopefully make this clearer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	29cc6fb7c0	bcachefs: Fix subvol to missing root repair We had a bug where the root inode of a subvolume was erroneously deleted: bch2_evict_inode() called bch2_inode_rm(), meaning the VFS inode's i_nlink was somehow set to 0 when it shouldn't have - the inode in the btree indicated it clearly was not unlinked. This has been addressed with additional safety checks in bch2_inode_rm() - pulling in the safety checks we already were doing when deleting unlinked inodes in recovery - but the really disastrous bug was in check_subvols(), which on finding a dangling subvol (subvol with a missing root inode) would delete the subvolume. I assume this bug dates from early check_directory_structure() code, which originally handled subvolumes and normal paths - the idea being that still live contents of the subvolume would get reattached somewhere. But that's incorrect, and disastrously so; deleting a subvolume triggers deleting the snapshot ID it points to, deleting the entire contents. The correct way to repair is to recreate the root inode if it's missing; then any contents will get reattached under that subvolume's lost+found. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	09fb85ae56	bcachefs: Run may_delete_deleted_inode() checks in bch2_inode_rm() We had a bug where bch2_evict_inode() incorrectly called bch2_inode_rm() - the journal clearly showed the inode was not unlinked. We've got checks that we use in recovery when cleaning up deleted inodes, lift them to bch2_inode_rm() as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	bb6689bbee	bcachefs: delete dead code from may_delete_deleted_inode() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	bfaac2c546	bcachefs: Add flags to subvolume_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	9f2dc5f394	bcachefs: Fix oops in btree_node_seq_matches() btree_update_nodes_written() needs to wait on in-flight writes to old nodes before marking them as freed. But it has no reason to pin those old nodes in memory, so some trickiness ensues. The update we're completing deleted references to those nodes from the btree, so we know if they've been evicted they can't be pulled back in. We just have to check if the nodes we have pointers to are still those old nodes, and haven't been reused. To do that we check the node's "sequence number" (actually a random 64 bit cookie), but that lives in the node's data buffer. 'struct btree' can't be freed until filesystem shutdown (as they're quite small), but the data buffers can be freed or swapped around. Commit `1f88c35674`, which was fixing a kmsan warning, assumed that we could safely do this locklessly with just a READ_ONCE() - if we've got a non-null ptr it would be safe to read from. But that's not true if the data buffer is a vmalloc allocation, so we need to restore the locking that commit deleted (or alternatively RCU free those data buffers, but there's no other reason for that). Fixes: `1f88c35674` ("bcachefs: Fix a KMSAN splat in btree_update_nodes_written()") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	2bf380c005	bcachefs: Fix dirent_casefold_mismatch repair Instead of simply recreating a mis-casefolded dirent, use the str_hash repair code, which will rename it if necessary - the dirent might have been created again with the correct casefolding. Factor out bch2_str_hash_repair_key() from __bch2_str_hash_check_key() for the new path to use, and export bch2_dirent_create_key() as well. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	b938d3c970	bcachefs: Fix bch2_fsck_rename_dirent() for casefold bch2_fsck_rename_dirent() was creating bch_dirent keys open-coded - but we need to use the appropriate helper if the directory is casefolded. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Kent Overstreet	35c1f131bc	bcachefs: Redo bch2_dirent_init_name() Redo (and simplify somewhat) how casefolded and non casefolded dirents are initialized, and export this to be used by fsck_rename_dirent(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:41 -04:00
Nathan Chancellor	01d925f7e1	bcachefs: Fix -Wc23-extensions in bch2_check_dirents() Clang warns (or errors with CONFIG_WERROR=y): fs/bcachefs/fsck.c:2325:2: error: label followed by a declaration is a C23 extension [-Werror,-Wc23-extensions] 2325 \| int ret = bch2_trans_run(c, \| ^ On clang-17 and older, this is an unconditional error: fs/bcachefs/fsck.c:2325:2: error: expected expression 2325 \| int ret = bch2_trans_run(c, \| ^ Move the declaration of ret to the top of the function to resolve both ways this issue manifests. Fixes: `c72def5237` ("bcachefs: Run check_dirents second time if required") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-04 16:45:38 -04:00
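The pattern behind this fix can be sketched in plain C. Before C23, a label must be followed by a statement, and a declaration is not a statement, so `again: int ret = ...;` is either an extension warning or a hard error depending on the compiler version. The sketch below hoists the declaration to the top of the function, as the commit describes. `run_pass()` and `check_twice()` are hypothetical stand-ins, not bcachefs functions.

```c
/* Hypothetical stand-in for a check that asks for one more pass the first time. */
static int run_pass(int pass)
{
	return pass == 0 ? 1 : 0;
}

/* Returns the number of passes that were run. */
int check_twice(void)
{
	int passes = 0;
	int ret;		/* declared up front, not directly after the label */
again:
	ret = run_pass(passes++);	/* a statement may follow a label in any C version */
	if (ret > 0)
		goto again;
	return passes;
}
```

Writing `again: int ret = run_pass(passes++);` instead would reproduce the diagnostic quoted in the commit message.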
Kent Overstreet	c72def5237	bcachefs: Run check_dirents second time if required If we move a key backwards, we'll need a second pass to run the rest of the fsck checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:36 -04:00
Kent Overstreet	a4907d7f33	bcachefs: Run snapshot deletion out of system_long_wq We don't want this running out of the same workqueue as, and blocking, writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:36 -04:00
Kent Overstreet	e49cf9b54b	bcachefs: Make check_key_has_snapshot safer Snapshot deletion v2 added sentinel values for deleted snapshots, so "key for deleted snapshot" - i.e. snapshot deletion missed something - is safe to repair automatically. But if we find a key for a missing snapshot we have no idea what happened, and we shouldn't delete it unless we're very sure that everything else is consistent. So hook it up to the new bch2_require_recovery_pass(); we'll now only delete if snapshots and subvolumes have recently been checked. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:36 -04:00
Kent Overstreet	0942b852d4	bcachefs: BCH_RECOVERY_PASS_NO_RATELIMIT Add a superblock flag to temporarily disable ratelimiting for a recovery pass. This will be used to make check_key_has_snapshot safer: we don't want to delete a key for a missing snapshot unless we know that the snapshots and subvolumes btrees are consistent, i.e. check_snapshots and check_subvols have run recently. Changing those btrees - creating/deleting a subvolume or snapshot - will set the "disable ratelimit" flag, i.e. ensuring that those passes run if check_key_has_snapshot discovers an error. We're only disabling ratelimiting in the snapshot/subvol delete paths, we're not so concerned about the create paths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:36 -04:00
Kent Overstreet	a2ffab0e65	bcachefs: bch2_require_recovery_pass() Add a helper for requiring that a recovery pass has already run: either run it directly, if we're still in recovery, or if we're not in recovery check if it has run recently and schedule it if it hasn't. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	09b9c72bd4	bcachefs: bch_err_throw() Add a tracepoint for any time we return an error and unwind. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	36a2fdf7c5	bcachefs: Repair code for directory i_size We had a bug due to an incomplete revert of the patch implementing directory i_size (summing up the size of the dirents), leading to completely screwy i_size values that underflow. Most userspace programs don't seem to care (e.g. du ignores it), but it turns out this broke sshfs, so needs to be repaired. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	95fafc0f34	bcachefs: Kill un-reverted directory i_size code Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	d47db3e636	bcachefs: Delete redundant fsck_err() 'inode_has_wrong_backpointer'; we have more specific errors for every case afterwards. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	165815c296	bcachefs: Convert BUG() to error Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-02 12:16:35 -04:00
Kent Overstreet	132263220d	bcachefs: Add better logging to fsck_rename_dirent() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-01 00:03:12 -04:00
Kent Overstreet	18dad454cd	bcachefs: Replace rcu_read_lock() with guards The new guard(), scoped_guard() allow for more natural code. Some of the uses with creative flow control have been left. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-01 00:03:12 -04:00
Kent Overstreet	9cb49fbf73	bcachefs: CLASS(btree_trans) Allow btree_trans to be used with CLASS(). Automatic cleanup, instead of manually calling bch2_trans_put(). We don't use DEFINE_CLASS because using a static inline for the constructor breaks bch2_trans_get()'s use of __func__, so we have to open code it. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-06-01 00:03:12 -04:00
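The kernel's guard(), scoped_guard() and CLASS() helpers are built on the compiler's `cleanup` variable attribute (a GCC/Clang extension). The userspace sketch below shows the same idea under stated assumptions: `struct buf`, `buf_get()`, `buf_put()`, `SCOPED_BUF()` and `copy_len()` are hypothetical names for illustration and are not part of bcachefs or the kernel API.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical resource type; live_bufs counts buffers not yet released. */
struct buf { char *data; };
static int live_bufs;

static struct buf buf_get(size_t n)
{
	live_bufs++;
	return (struct buf) { .data = calloc(n, 1) };
}

/* Cleanup handlers receive a pointer to the variable going out of scope. */
static void buf_put(struct buf *b)
{
	free(b->data);		/* free(NULL) is a no-op, so a failed calloc() is fine */
	live_bufs--;
}

/* Declare a buffer that is released automatically at end of scope. */
#define SCOPED_BUF(name, n) \
	struct buf name __attribute__((cleanup(buf_put))) = buf_get(n)

size_t copy_len(const char *s)
{
	SCOPED_BUF(b, 64);

	if (!b.data)
		return 0;		/* buf_put() still runs on this early return */

	strncpy(b.data, s, 63);		/* calloc() zeroed byte 63, so the copy stays terminated */
	return strlen(b.data);		/* buf_put() runs after the return value is computed */
}
```

Every exit path releases the buffer without an explicit call, which is the property the commit relies on when it drops manual bch2_trans_put() calls.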
Linus Torvalds	7d4e49a77d	- The 3 patch series "hung_task: extend blocking task stacktrace dump to semaphore" from Lance Yang enhances the hung task detector. The detector presently dumps the blocking task's stack when it is blocked on a mutex. Lance's series extends this to semaphores. - The 2 patch series "nilfs2: improve sanity checks in dirty state propagation" from Wentao Liang addresses a couple of minor flaws in nilfs2. - The 2 patch series "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn fixes a couple of issues in the gdb scripts. - The 9 patch series "Support kdump with LUKS encryption by reusing LUKS volume keys" from Coiby Xu addresses a usability problem with kdump. When the dump device is LUKS-encrypted, the kdump kernel may not have the keys to the encrypted filesystem. A full writeup of this is in the series [0/N] cover letter. - The 2 patch series "sysfs: add counters for lockups and stalls" from Max Kellermann adds /sys/kernel/hardlockup_count, /sys/kernel/softlockup_count and /sys/kernel/rcu_stall_count. - The 3 patch series "fork: Page operation cleanups in the fork code" from Pasha Tatashin implements a number of code cleanups in fork.c. - The 3 patch series "scripts/gdb/symbols: determine KASLR offset on s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in the gdb scripts. Merge tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "hung_task: extend blocking task stacktrace dump to semaphore" from Lance Yang enhances the hung task detector. The detector presently dumps the blocking task's stack when it is blocked on a mutex. Lance's series extends this to semaphores - "nilfs2: improve sanity checks in dirty state propagation" from Wentao Liang addresses a couple of minor flaws in nilfs2 - "scripts/gdb: Fixes related to lx_per_cpu()" from Illia Ostapyshyn fixes a couple of issues in the gdb scripts - "Support kdump with LUKS encryption by reusing LUKS volume keys" from Coiby Xu addresses a usability problem with kdump. When the dump device is LUKS-encrypted, the kdump kernel may not have the keys to the encrypted filesystem. A full writeup of this is in the series [0/N] cover letter - "sysfs: add counters for lockups and stalls" from Max Kellermann adds /sys/kernel/hardlockup_count, /sys/kernel/softlockup_count and /sys/kernel/rcu_stall_count - "fork: Page operation cleanups in the fork code" from Pasha Tatashin implements a number of code cleanups in fork.c - "scripts/gdb/symbols: determine KASLR offset on s390 during early boot" from Ilya Leoshkevich fixes some s390 issues in the gdb scripts * tag 'mm-nonmm-stable-2025-05-31-15-28' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (67 commits) llist: make llist_add_batch() a static inline delayacct: remove redundant code and adjust indentation squashfs: add optional full compressed block caching crash_dump, nvme: select CONFIGFS_FS as built-in scripts/gdb/symbols: determine KASLR offset on s390 during early boot scripts/gdb/symbols: factor out pagination_off() scripts/gdb/symbols: factor out get_vmlinux() kernel/panic.c: format kernel-doc comments mailmap: update and consolidate Casey Connolly's name and email nilfs2: remove wbc->for_reclaim handling fork: define a local GFP_VMAP_STACK fork: check charging success before zeroing stack fork: clean-up naming of vm_stack/vm_struct variables in vmap stacks code fork: clean-up ifdef logic around stack allocation kernel/rcu/tree_stall: add /sys/kernel/rcu_stall_count kernel/watchdog: add /sys/kernel/{hard,soft}lockup_count x86/crash: make the page that stores the dm crypt keys inaccessible x86/crash: pass dm crypt keys to kdump kernel Revert "x86/mm: Remove unused __set_memory_prot()" crash_dump: retrieve dm crypt keys in kdump kernel ...	2025-05-31 19:12:53 -07:00
Kent Overstreet	42359f1615	bcachefs: CLASS(darray) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	237a8e16bd	bcachefs: CLASS(printbuf) Add a DEFINE_CLASS() for printbufs. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	a0f7437906	bcachefs: sysfs trigger_journal_commit Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	1f42a0335a	bcachefs: sysfs trigger_emergency_read_only Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	5802caf74f	bcachefs: darray_find(), darray_find_p() New helpers to avoid open coded loops. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	9a1accd3a5	bcachefs: Journal keys are retained until shutdown, or journal replay finishes If we don't finish journal replay we need to keep journal keys around until the filesystem shuts down - otherwise e.g. -o norecovery, various tools (dump, list) break, and eventually we'll be doing journal replay in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	6447544c3d	bcachefs: Improve error printing in btree_node_check_topology() We had a bug report where the errors from btree_node_check_topology() don't seem to be getting printed; log_fsck_err() does some fancy ratelimiting-type stuff that we don't want here. Instead, just use bch2_count_fsck_err(); this is simpler, and modelled after how we're currently handling bucket ref update errors in buckets.c. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	f402d9710b	bcachefs: bch2_readdir() now calls str_hash_check_key() More self healing code: readdir will now notice if there are dirents hashed incorrectly, and it'll repair them if errors=fix_safe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	a592268260	bcachefs: bch2_str_hash_check_key() may now be called without snapshots_seen We don't track snapshot overwrites outside of fsck, so for this to be called at runtime outside of fsck we need to create it on demand, when we have repair to do. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	cb6f5d0dec	bcachefs: __bch2_insert_snapshot_whiteouts() refactoring Now uses bch2_get_snapshot_overwrites(), and much shorter. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	801cb2bd6c	bcachefs: bch2_get_snapshot_overwrites() New helper for getting a list of snapshot IDs that have overwritten a given key. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	d21262d4e3	bcachefs: bch2_dev_journal_bucket_delete() Recover from "journal and btree in same bucket". Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	0224d17d76	bcachefs: Runtime self healing for keys for deleted snapshots If snapshot deletion incorrectly misses some keys and leaves keys for deleted snapshots, that causes a bit of a problem for data move - we can't move an extent for a nonexistent snapshot, because the extent might have to be fragmented, and maintaining correct visibility in child snapshots doesn't work if it doesn't have a snapshot. Previously we'd just skip these keys, but it turns out that causes copygc to spin. So we need runtime self healing, i.e. calling check_key_has_snapshot() from the data move path. Snapshot deletion v2 included sentinel values for deleted snapshot nodes, so this is quite safe. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	f02d153274	bcachefs: Don't unlock trans before data_update_init() data_update_init() does need to do btree operations, delay doing the unlock-before-io. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:17 -04:00
Kent Overstreet	642c1aabb0	bcachefs: Use bch2_err_matches() for BCH_ERR_fsck_(fix\|ignore) We'll be adding subtypes of these errors, and new error code tracing. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-31 22:03:16 -04:00
Kent Overstreet	dc43f6a70b	bcachefs: Mark bch_errcode helpers __attribute__((const)) These don't access global memory or dereference pointer arguments - this enables CSE optimizations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 11:20:18 -04:00
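A minimal sketch of what the attribute expresses, assuming a made-up error hierarchy: the enum values and the helpers `err_parent()` and `err_matches()` below are hypothetical and only loosely modelled on the idea of error subtypes; they are not the bcachefs definitions. Because the result depends only on the argument values, the compiler is free to merge repeated calls with the same arguments (common subexpression elimination).

```c
/* Hypothetical error codes: two subtypes of ERR_fsck plus an unrelated code. */
enum {
	ERR_fsck = 2048,
	ERR_fsck_fix,
	ERR_fsck_ignore,
	ERR_other,
};

/* Pure function of its argument: reads no globals, dereferences no pointers. */
__attribute__((const))
static int err_parent(int err)
{
	switch (err) {
	case ERR_fsck_fix:
	case ERR_fsck_ignore:
		return ERR_fsck;
	default:
		return 0;	/* no parent class */
	}
}

/* Returns 1 if err is cls or one of its subtypes, else 0. */
__attribute__((const))
int err_matches(int err, int cls)
{
	while (err) {
		if (err == cls)
			return 1;
		err = err_parent(err);	/* walk up the hierarchy */
	}
	return 0;
}
```

With this shape, a caller that tests `err_matches(ret, ERR_fsck)` several times in one function gives the optimizer the option of evaluating it once.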
Kent Overstreet	66621f016d	bcachefs: Add missing printbuf_reset() in bch2_check_dirent_inode_dirent() We were accidentally including the contents from the previous fsck_err(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 11:20:18 -04:00
Kent Overstreet	f1dc067bc1	bcachefs: sysfs/errors Make the superblock error counters available in sysfs; the only other way they can be seen is 'show-super', but we don't write the superblock every time the error count gets incremented. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 11:20:18 -04:00
Kent Overstreet	66b7c51ceb	bcachefs: bch2_check_fix_ptrs() can now repair btree roots This is straightforward enough: check_fix_ptrs() currently only runs before we go RW, so updating the btree root pointer in c->btree_roots suffices - it'll be written out in the first journal write we do. For that, do_bch2_trans_commit_to_journal_replay() now handles JSET_ENTRY_btree_root entries. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	a7c9add482	bcachefs: Include b->ob.nr in cached_btree_node_to_text() We have a bug report that looks like we might be leaking open buckets - let's check if they got left attached to the cached btree node. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	e87de7d491	bcachefs: Move devs_sorted to alloc_request More stack usage work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	ff6369da9a	bcachefs: reduce stack usage in alloc_sectors_start() With typical config options, variables in different inline functions aren't sharing stack space - and these are slowpaths. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	eabef52ff8	bcachefs: bch2_alloc_v4_to_text() Specialize the .to_text() for alloc_v4, to avoid the temporary on the stack for conversion from old versions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	0c34e7ff69	bcachefs: Tweak bch2_data_update_init() for stack usage - Separate out a slowpath for bkey_nocow_lock() - Don't call bch2_bkey_ptrs_c() or loop over pointers more than necessary Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	56e5c7f65f	bcachefs: kill replicas_sectors arg to __trigger_extent() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:13 -04:00
Kent Overstreet	92caf17189	bcachefs: Don't stack allocate bch_writepage_state Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	cd831a9494	bcachefs: factor out break_cycle_fail() More stack usage work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	19c0a8aa8a	bcachefs: btree_node_missing_err() Factor out an error path for a small stack usage improvement. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	0d25264ecf	bcachefs: Kill bkey_buf in btree_path_down() Allocate some (smaller) temporary storage in btree_trans for this - btree_path_down() is in our max-stack call stack. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	99813d88e3	bcachefs: Add missing error logging in delete_dead_inodes() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	f54b2a80d0	bcachefs: Fix misaligned bucket check in journal space calculations Fix an assertion pop in the tiering_misaligned test: rounding down to bucket size at the end of the journal space calculations leaves cur_entry_sectors == 0, which is incorrect with !cur_entry_err. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	813825d241	bcachefs: Fix incorrect multiple dev check in journal write path It's uncommon to have multiple devices with journalling only on a subset, but can be specified with the 'data_allowed' option. We need to know if we're doing data/metadata writes to multiple devices, as that requires issuing flushes before the journal writes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	327971cef5	bcachefs: Catch data_update_done events in trace_io_move_start_fail Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	c7897b5055	bcachefs: io_move_evacuate_bucket tracepoint, counter Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	060ff4b794	bcachefs: trace_io_move_pred Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	d6efd42a84	bcachefs: Fix infinite loop in journal_entry_btree_keys_to_text() Fix an infinite loop when bkey_i->k.u64s is 0. This only happens in userspace, where 'bcachefs list_journal' can print the entire contents of the journal, and non-dirty entries aren't validated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Kent Overstreet	cd04497b10	bcachefs: Journal read error message improvements - Don't print a checksum error when we first read a journal entry: we print a checksum error later if we'll be using the journal entry. - Continuing with the theme of improving error messages and grouping errors into a single log message per error, print a single 'checksum error' message per journal entry, and use bch2_journal_ptr_to_text() to print out where on the device it was. - Factor out checksum error messages and checking for missing journal entries into helpers, bch2_journal_read() has gotten obnoxiously big. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-30 01:21:12 -04:00
Linus Torvalds	5e8bbb2caa	Merge tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "Another set of timer API cleanups: - Convert init_timer(), try_to_del_timer_sync() and destroy_timer_on_stack() over to the canonical timer_() namespace convention. There is another large conversion pending, which has not been included because it would have caused a gazillion of merge conflicts in next. The conversion scripts will be run towards the end of the merge window and a pull request sent once all conflict dependencies have been merged" * tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack() treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try() timers: Rename init_timers() as timers_init() timers: Rename NEXT_TIMER_MAX_DELTA as TIMER_NEXT_MAX_DELTA timers: Rename __init_timer_on_stack() as __timer_init_on_stack() timers: Rename __init_timer() as __timer_init() timers: Rename init_timer_on_stack_key() as timer_init_key_on_stack() timers: Rename init_timer_key() as timer_init_key()	2025-05-27 08:31:21 -07:00
Kent Overstreet	72ab5136e8	bcachefs: Don't rewind to run a recovery pass we already ran Fix a small regression from the "run recovery passes" rewrite, which enabled async recovery passes. This fixes getting stuck in a loop in recovery. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:03:45 -04:00
Kent Overstreet	686db67a8e	bcachefs: Move unicode message to after the startup message Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:03:45 -04:00
Kent Overstreet	1cda5b88e6	bcachefs: Fix missing commit in check_dirents Other repair code seems to be doing commits themselves, but check_key_has_snapshot() does not. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:02:44 -04:00
Kent Overstreet	9e2c3c2ed4	bcachefs: Fix lost rebalance wakeups Fix a missing wakeup in 'bcachefs set-file-option' -> xattr option update -> inode_write; this was missing because the wakeup needs to happen after transaction commit. Also, add a 'kick' counter, to make sure we don't miss a wakeup that occurred right after we finished checking the rebalance_work btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:02:44 -04:00
Kent Overstreet	dc37dcca8c	bcachefs: bch2_kthread_io_clock_wait_once() Add a version of bch2_kthread_io_clock_wait() that only schedules once - behaving more like schedule_timeout(). This will be used for fixing rebalance wakeups. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:02:44 -04:00
Kent Overstreet	ff875d4b47	bcachefs: Ensure we print output of run_recovery_pass if it errors Also, don't error out in bucket_ref_update_err(): we don't want to return -BCH_ERR_cannot_rewind_recovery if it's not an insert, if it's an overwrite we continue. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-27 00:02:44 -04:00
Linus Torvalds	14418ddcc2	Merge tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix memcpy_sglist to handle partially overlapping SG lists - Use memcpy_sglist to replace null skcipher - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS - Hide CRYPTO_MANAGER - Add delayed freeing of driver crypto_alg structures Compression: - Allocate large buffers on first use instead of initialisation in scomp - Drop destination linearisation buffer in scomp - Move scomp stream allocation into acomp - Add acomp scatter-gather walker - Remove request chaining - Add optional async request allocation Hashing: - Remove request chaining - Add optional async request allocation - Move partial block handling into API - Add ahash support to hmac - Fix shash documentation to disallow usage in hard IRQs Algorithms: - Remove unnecessary SIMD fallback code on x86 and arm/arm64 - Drop avx10_256 xts(aes)/ctr(aes) on x86 - Improve avx-512 optimisations for xts(aes) - Move chacha arch implementations into lib/crypto - Move poly1305 into lib/crypto and drop unused Crypto API algorithm - Disable powerpc/poly1305 as it has no SIMD fallback - Move sha256 arch implementations into lib/crypto - Convert deflate to acomp - Set block size correctly in cbcmac Drivers: - Do not use sg_dma_len before mapping in sun8i-ss - Fix warm-reboot failure by making shutdown do more work in qat - Add locking in zynqmp-sha - Remove cavium/zip - Add support for PCI device 0x17D8 to ccp - Add qat_6xxx support in qat - Add support for RK3576 in rockchip-rng - Add support for i.MX8QM in caam Others: - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up - Add new SEV/SNP platform shutdown API in ccp" * tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits) x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining crypto: qat - add missing header inclusion crypto: api - Redo lookup on EEXIST Revert "crypto: testmgr - Add hash export format testing" crypto: marvell/cesa - Do not chain submitted requests crypto: powerpc/poly1305 - add depends on BROKEN for now Revert "crypto: powerpc/poly1305 - Add SIMD fallback" crypto: ccp - Add missing tee info reg for teev2 crypto: ccp - Add missing bootloader info reg for pspv5 crypto: sun8i-ce - move fallback ahash_request to the end of the struct crypto: octeontx2 - Use dynamic allocated memory region for lmtst crypto: octeontx2 - Initialize cptlfs device info once crypto: xts - Only add ecb if it is not already there crypto: lrw - Only add ecb if it is not already there crypto: testmgr - Add hash export format testing crypto: testmgr - Use ahash for generic tfm crypto: hmac - Add ahash support crypto: testmgr - Ignore EEXIST on shash allocation crypto: algapi - Add driver template support to crypto_inst_setname crypto: shash - Set reqsize in shash_alg ...	2025-05-26 13:47:28 -07:00
Kent Overstreet	97e69f12ed	bcachefs: Fix missing BTREE_UPDATE_internal_snapshot_node Repair code will do updates on older snapshot versions, so needs the correct annotation. Reported-by: syzbot+42581416dba62b364750@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-25 03:22:18 -04:00
Kent Overstreet	7098ba57c4	bcachefs: fix REFLINK_P_MAY_UPDATE_OPTIONS If we're doing a reflink copy of existing reflinked data, we may only set REFLINK_P_MAY_UPDATE_OPTIONS if it was set on the reflink pointer we're copying from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-25 03:22:18 -04:00
Kent Overstreet	9caea9208f	bcachefs: Don't mount bs > ps without TRANSPARENT_HUGEPAGE Large folios aren't supported without TRANSPARENT_HUGEPAGE Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 22:00:07 -04:00
Kent Overstreet	3f2f028814	bcachefs: Fix btree_iter_next_node() for new locking asserts We can't unlock a should_be_locked path unless we're in a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 22:00:07 -04:00
Kent Overstreet	521f9584c2	bcachefs: Ensure we don't use a blacklisted journal seq Different versions differ on the size of the blacklist range; it is theoretically possible that we could end up with blacklisted journal sequence numbers newer than the newest seq we find in the journal, and pick a new start seq that's blacklisted. Explicitly check for this in bch2_fs_journal_start(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 19:52:31 -04:00
Kent Overstreet	9b133c0d74	bcachefs: Small check_fix_ptr fixes We don't want to change the bucket gen, on gen mismatch: it's possible to have multiple btree nodes with different gens in the same bucket that we want to keep, if we have to recover from btree node scan. It's also not necessary to set g->gen_valid; add a comment to that effect. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 19:52:31 -04:00
Kent Overstreet	cade003209	bcachefs: Fix opts.recovery_pass_last This was lost in the giant recovery pass rework - but it's used heavily by bcachefs subcommand utilities. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 19:52:31 -04:00
Kent Overstreet	f351d91edd	bcachefs: Fix allocate -> self healing path When we go to allocate and find that a bucket in the freespace btree is actually allocated, we're supposed to return nonzero to tell the allocator to skip it. This fixes an emergency read only due to a bucket/ptr gen mismatch - we also don't return the correct bucket gen when this happens. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 19:52:31 -04:00
Kent Overstreet	016c4b48b8	bcachefs: Fix endianness in casefold check/repair Fixes: `010c894681` ("bcachefs: Check for casefolded dirents in non casefolded dirs") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 19:52:31 -04:00
Kent Overstreet	b41ac97fe0	bcachefs: Path must be locked if trans->locked && should_be_locked If path->should_be_locked is true, that means user code (of the btree API) has seen, in this transaction, something guarded by the node this path has locked, and we have to keep it locked until the end of the transaction. Assert that we're not violating this; should_be_locked should also be cleared only in _very_ special situations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	22e921a6f9	bcachefs: Simplify bch2_path_put() Simplify the "do we need to keep this locked?" checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	80a160e494	bcachefs: Plumb btree_trans for more locking asserts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	df92f3500b	bcachefs: Clear trans->locked before unlock We're adding new should_be_locked assertions: it's going to be illegal to unlock a should_be_locked path when trans->locked is true. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	eb34365ada	bcachefs: Clear should_be_locked before unlock in key_cache_drop() We're adding new should_be_locked assertions, also add a comment explaining why clearing should_be_locked is safe here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	be9fecdcda	bcachefs: bch2_path_get() reuses paths if upgrade_fails & !should_be_locked Small additional optimization over the previous patch, bringing us closer to the original behaviour, except when we need to clone to avoid a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	aac49471b6	bcachefs: Give out new path if upgrade fails Avoid transaction restarts due to failure to upgrade - we can traverse a new iterator without a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	66782b2acb	bcachefs: Fix btree_path_get_locks when not doing trans restart btree_path_get_locks, on failure, shouldn't unlock if we're not issuing a transaction restart: we might drop locks we're not supposed to (if path->should_be_locked is set). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	5b7b342c40	bcachefs: btree_node_locked_type_nowrite() Small helper to improve locking assertions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	659489f37b	bcachefs: Kill bch2_path_put_nokeep() bch2_path_put_nokeep() was intended for paths we wouldn't need to preserve for a transaction restart - it always frees them right away when the ref hits 0. But since paths are shared, freeing unconditionally is a bug, the path might have been used elsewhere and have should_be_locked set, i.e. we need to keep it locked until the end of the transaction. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-23 07:59:43 -04:00
Kent Overstreet	2a6c0136ae	bcachefs: bch2_journal_write_checksum() We need to delay checksumming the journal write; we don't know the blocksize until after we allocate the write. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-22 15:13:17 -04:00
Kent Overstreet	d385ca5603	bcachefs: Reduce stack usage in data_update_index_update() Separate tracepoint message generation and other slowpath code into non-inline functions, and use bch2_trans_log_str() instead of using a printbuf for our journal message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-22 15:13:17 -04:00
Kent Overstreet	7d886a82bf	bcachefs: bch2_trans_log_str() The data update path doesn't need a printbuf for its log message - this will help reduce stack usage. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-22 15:13:17 -04:00
Kent Overstreet	4a9eb20efa	bcachefs: Kill bkey_buf usage in data_update_index_update() Reduce stack usage - bkey_buf has a 96 byte buffer on the stack, but the btree_trans bump allocator works just fine here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-22 15:13:17 -04:00
Kent Overstreet	bfc0c6fecf	bcachefs: Drop empty accounting updates Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:19:24 -04:00
Kent Overstreet	136d082abc	bcachefs: Improve trace_trans_restart_upgrade - Convert to a 'fs_str' tracepoint that just emits as a string: this lets us build up the tracepoint with a printbuf, using our pretty printers, and they're much easier to manage - Include locks_held, before and after - Include the btree node pointer we failed on (error pointer, null, or real node) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:11 -04:00
Kent Overstreet	f638b84224	bcachefs: fix bch2_inum_snapshot_to_path() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:11 -04:00
Kent Overstreet	2faa8ab0d0	bcachefs: fix duplicate printk Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:10 -04:00
Kent Overstreet	4ba99dde33	bcachefs: BCH_INODE_has_case_insensitive Add a flag for tracking whether a directory has case-insensitive descendants - so that overlayfs can disallow mounting, even though the filesystem supports case insensitivity. This is a new on disk format version, with a (cheap) upgrade to ensure the flag is correctly set on existing inodes. Create, rename and fssetxattr are all plumbed to ensure the new flag is set, and we've got new fsck code that hooks into check_inode(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:10 -04:00
Kent Overstreet	77eac89c79	bcachefs: bch2_inode_find_by_inum_snapshot() Move a fsck.c helper into inode.c, eliminate some duplication and organize the inode lookup helpers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:09 -04:00
Kent Overstreet	77aeaa2f0f	bcachefs: bch2_inum_snapshot_to_path() Add a better helper for printing out paths of inodes when we don't know the subvolume, for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:09 -04:00
Kent Overstreet	7c4f22af25	bcachefs: bch2_rename_trans() only runs rename-to-dir code if needed Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	011d644b76	bcachefs: subvol_inum_eq() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	c3a7fd95e0	bcachefs: Don't set bi_casefold on non directories bi_casefold only makes sense for directories, and since it's one of the variable length fields setting it unnecessarily wastes space. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Alan Huang	a96c5e5045	bcachefs: Remove duplicate call to bch2_trans_begin() There is one in for_each_btree_key_max(). Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:08 -04:00
Kent Overstreet	c631bb41f5	bcachefs: Call bch2_bkey_set_needs_rebalance() earlier in write path There's no reason to be running this inside our transaction; it forces us to copy the key we're updating to a temporary, which we'd like to skip. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:07 -04:00
Kent Overstreet	f132a78095	bcachefs: Simplify bch2_extent_atomic_end() It used to be that we had a fixed maximum number of btree paths to work with - 64. That's no longer the case, so bch2_extent_atomic_end() doesn't have to be as strict. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:07 -04:00
Kent Overstreet	7fd643c032	bcachefs: Coalesce accounting in trans commit Accounting has gotten quite heavy, and there's lots of redundancy in accounting updates within a transaction, as we often add/delete multiple extents that touch the same accounting counters. This will reduce the amount of data that we journal, and reduce pressure downstream on the btree write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	e8f9992b0a	bcachefs: Split out accounting in transaction commit There can be a lot of redundancy in accounting updates within a single btree transaction. Split out accounting updates so that they can be deduped, in the next commit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	247abee6ae	bcachefs: btree_trans_subbuf Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	81c42933a5	bcachefs: Make accounting mismatch errors more readable Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:06 -04:00
Kent Overstreet	51e23c9d60	bcachefs: async objs now support bch_write_ops Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	8c3fc7cca3	bcachefs: fix bch2_debugfs_flush_buf() when tabstops are in use Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	6b86da9282	bcachefs: fsck: Include loops in error messages This fixes the subvol loop checking and directory loop checking to print the loop. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:05 -04:00
Kent Overstreet	39cea302f1	bcachefs: bch2_check_bucket_backpointer_mismatch() Detect buckets with missing backpointers, and run repair on demand. __bch2_move_data_phys() now calls bch2_check_bucket_backpointer_mismatch() as it walks buckets, which checks for missing backpointers by comparing backpointers against bucket sector counts. When missing backpointers are detected, we kick off bch2_check_extents_to_backpointers() asynchronously - right away if we're trying to evacuate, or with a threshold if we're just running copygc. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	15f969326e	bcachefs: Improve bucket_bitmap code Add some more helpers, and mismatches is now a superset of the empty bitmap - simplifies most checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	06977ea82b	bcachefs: Run recovery passes asynchronously When we request a recovery pass to be run online, i.e. not during recovery, if it's an online pass it'll now be run in the background, instead of waiting for the next mount. To avoid situations where recovery passes are running continuously, this also includes ratelimiting: if the RUN_RECOVERY_PASS_ratelimit flag is passed, the pass may be deferred until later - depending on the runtime and last run stats in the recovery_passes superblock section. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	d4b30ed90c	bcachefs: bch2_run_explicit_recovery_pass() cleanup Consolidate the run_explicit_recovery_pass() interfaces by adding a flags parameter; this will also let us add a RUN_RECOVERY_PASS_ratelimit flag. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:04 -04:00
Kent Overstreet	06266465cc	bcachefs: bch2_recovery_pass_status_to_text() Show recovery pass status in sysfs - important now that we're running them automatically in the background. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	7ed4c14e20	bcachefs: Reduce usage of recovery.curr_pass We want recovery.curr_pass to be private to the recovery passes code, for better showing recovery pass status; also, it may rewind and is generally not the correct member to use. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	ab35552030	bcachefs: __bch2_run_recovery_passes() Consolidate bch2_run_recovery_passes() and bch2_run_online_recovery_passes(), prep work for automatically scheduling and running recovery passes in the background. - Now takes a mask of which passes to run, automatic background repair will pass in sb.recovery_passes_required. - Skips passes that are failing: a pass that failed may be reattempted after another pass succeeds (some passes depend on repair done by other passes for successful completion). - bch2_recovery_passes_match() helper to skip alloc passes on a filesystem without alloc info. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	68708efcac	bcachefs: struct bch_fs_recovery bch_fs has gotten obnoxiously big, let's start organizing things a bit better. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:03 -04:00
Kent Overstreet	878713b5f5	bcachefs: kill copy in bch2_disk_accounting_mod() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	295dbf50e5	bcachefs: Optimize bch2_trans_start_alloc_update() Avoid doing more updates if we already have one. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	9469556a5f	bcachefs: btree key cache asserts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:02 -04:00
Kent Overstreet	a78a11900e	bcachefs: journal path now uses discard_opt_enabled() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	8a6fa52e07	bcachefs: relock_fail tracepoint now includes btree Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	84b9f17195	bcachefs: do_rebalance_scan() now only updates bch_extent_rebalance This ensures that our pending rebalance work accounting is accurate quickly. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:01 -04:00
Kent Overstreet	bde41d9a58	bcachefs: better error message for subvol_fs_path_parent_wrong Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:00 -04:00
Kent Overstreet	fdd0807f81	bcachefs: Improve bch2_repair_inode_hash_info() Improve this so it can be used by fsck.c check_inode(); it provides a much better error message than the check_inode() version. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:15:00 -04:00
Kent Overstreet	123d2d09ff	bcachefs: bch2_inode_find_snapshot_root() Factor out a small common helper. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Alan Huang	4a67b94bd8	bcachefs: Early return to avoid unnecessary lock Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Alan Huang	688321f97e	bcachefs: Kill BTREE_TRIGGER_bucket_invalidate Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Kent Overstreet	e882906929	bcachefs: Fix opt hooks in sysfs for non sb option We weren't checking if the option changed for non-superblock options - this led to rebalance not waking up when enabling the "rebalance_enabled" option. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:59 -04:00
Kent Overstreet	648c1142c9	bcachefs: fix can_write_extent() Failing to check the return value of bch2_dev_rcu(): we could (technically) race with device removal. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	c7378d0e5e	bcachefs: Add tracepoint, counter for io_move_created_rebalance Internal moves shouldn't add new rebalance_work, but it's been reported that this seems to be happening. Add a tracepoint and counter so we can see what's going on. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	e4e513f2d5	bcachefs: move_buckets in rhashtable when allocated Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:58 -04:00
Kent Overstreet	fb7e78cc25	bcachefs: Move pending buckets queue to buckets_in_flight Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:57 -04:00
Kent Overstreet	49188a9313	bcachefs: kill move_bucket_in_flight Small cleanup/simplification, and prep work for the next patch, which will add checking if buckets don't get evacuated because they're missing backpointers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:57 -04:00
Kent Overstreet	b42fac043f	bcachefs: bch2_fs_emergency_read_only2() More error message cleanup: instead of multiple printk()s per error, we want to be building up a single error message in a printbuf, so that it can be printed with indenting that shows grouping and avoid errors getting interspersed or lost in the log. This gets rid of most calls to bch2_fs_emergency_read_only(). We still have calls to - bch2_fatal_error() - bch2_fs_fatal_error() - bch2_fs_fatal_err_on() that need work. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	ac4c7ac90e	bcachefs: Extra write buffer asserts Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	7ad7497862	bcachefs: add missing locking in bch2_write_point_to_text() Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	177ac4925f	bcachefs: Don't rewind recovery if not in recovery Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	367cad0966	bcachefs: Rename fsck_running, recovery_running flags Slightly more readable. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:56 -04:00
Kent Overstreet	5b1247ca5f	bcachefs: debug_check_bkey_unpack Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2025-05-21 20:14:55 -04:00


5280 Commits