Commit Graph

31 Commits

Author SHA1 Message Date
Kent Overstreet
0499a82b18 bcachefs: Async object debugging
Debugging infrastructure for async objs: this lets us easily create
fast_lists for various object types so they'll be visible in debugfs.

Add new object types to the BCH_ASYNC_OBJS_TYPES() enum, and drop a
pretty-printer wrapper in async_objs.c.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:14:29 -04:00
Kent Overstreet
d02755b8c5 bcachefs: trace bch2_trans_kmalloc()
We're occasionally seeing the WARN_ON() for bump allocator usage
exceeding BTREE_TRANS_MEM_MAX; add some tracing so we can see what's
going on.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21 20:13:27 -04:00
Linus Torvalds
ef77858826 bcachefs fixes for 6.15-rc2
Mostly minor fixes.
 
 Eric Biggers' crypto API conversion is included because of long standing
 sporadic crashes - mostly, but not entirely syzbot - in the crypto API
 code when calling poly1305, which have been nigh impossible to reproduce
 and debug. His rework deletes the code where we've seen the crashes, so
 either it'll be a fix or we'll end up with backtraces we can debug.
 (Thanks Eric!).
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmf4Yq8ACgkQE6szbY3K
 bnb1fxAAu68Ll/4PLWr3xHVp7ETWgEZSzuwRqA87fqs/Q0jNRC2aDIO03Wmj28qM
 ckEM3PFPuiQubLhOmU21Osta/sFU6GxL0IggMoEC50F5XiVlcKRiNSWhRLnr07Qp
 v7sc5MQ1HDGnZXQMcdRymzRWixn2hzdRMwKOvbhBHuj0YUPd4yk5I/+AtJi5SAnT
 ri+dGIQLBRmG7J2pd4AZNza0YZ5pkTAFj9/4wRVoJdKX2pPzf40e7qzYNPt4/6rK
 A6P9ecU3TDDDQEE4S7s8Dng4rXwsa+9qTUcpXnTTC1L6YbbnZd/IQYzWI4b+FUsS
 wqnUD+aE7UEMANZh891QlJpGj3ih/6z8opUP4T6RdsVuJwt9X1vFJY99CsOTEf1o
 7jAcssL+ueEWPZj8tBoN1niujikyFsXM+xKiUOMZxbuM6BhE40j/WrA77sRhI5+I
 7DXlf5s8SDh+gw0IGUboBJe3ofGisRXnfxeZAKQHGHgtEFboY4bDAURcGW4MbIqE
 uN5Cd+5IJlcKmJdXLCbHMb5KktfBNWu9/VrOMcZ2QHhIuOfd3fFgLzE0ZEroj4lN
 kTWxpzKeNDt3bPF4esYnvduafHDbzClwfkTt5TBgcOeE4TcIL2mOmweLE2LTKIwW
 xr5Xhqx1/9//PeaOTwxbCoeZ26G0Q9B8L1+eUZgjPS0FcRdZH9w=
 =DpNC
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2025-04-10' of git://evilpiepirate.org/bcachefs

Pull bcachefs fixes from Kent Overstreet:
 "Mostly minor fixes.

  Eric Biggers' crypto API conversion is included because of long
  standing sporadic crashes - mostly, but not entirely syzbot - in the
  crypto API code when calling poly1305, which have been nigh impossible
  to reproduce and debug.

  His rework deletes the code where we've seen the crashes, so either
  it'll be a fix or we'll end up with backtraces we can debug. (Thanks
  Eric!)"

* tag 'bcachefs-2025-04-10' of git://evilpiepirate.org/bcachefs:
  bcachefs: Use sort_nonatomic() instead of sort()
  bcachefs: Remove unnecessary softdep on xxhash
  bcachefs: use library APIs for ChaCha20 and Poly1305
  bcachefs: Fix duplicate "ro,read_only" in opts at startup
  bcachefs: Fix UAF in bchfs_read()
  bcachefs: Use cpu_to_le16 for dirent lengths
  bcachefs: Fix type for parameter in journal_advance_devs_to_next_bucket
  bcachefs: Fix escape sequence in prt_printf
2025-04-10 19:38:22 -07:00
Linus Torvalds
97c484ccb8 CRC cleanups for 6.15
Finish cleaning up the CRC kconfig options by removing the remaining
 unnecessary prompts and an unnecessary 'default y', removing
 CONFIG_LIBCRC32C, and documenting all the CRC library options.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ/P7QhQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyoOAQCynFcS1dWuD27S+SdUREmBjMAoZo5M
 zdsIvlPv9KLycgD/QX5lXjW3KIYY6jQ8vHUuLVwfDl/JEp4GJS9dLGU+agg=
 =0R1T
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC cleanups from Eric Biggers:
 "Finish cleaning up the CRC kconfig options by removing the remaining
  unnecessary prompts and an unnecessary 'default y', removing
  CONFIG_LIBCRC32C, and documenting all the CRC library options"

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux:
  lib/crc: remove CONFIG_LIBCRC32C
  lib/crc: document all the CRC library kconfig options
  lib/crc: remove unnecessary prompt for CONFIG_CRC_ITU_T
  lib/crc: remove unnecessary prompt for CONFIG_CRC_T10DIF
  lib/crc: remove unnecessary prompt for CONFIG_CRC16
  lib/crc: remove unnecessary prompt for CONFIG_CRC_CCITT
  lib/crc: remove unnecessary prompt for CONFIG_CRC32 and drop 'default y'
2025-04-08 12:09:28 -07:00
Eric Biggers
4bf4b5046d bcachefs: use library APIs for ChaCha20 and Poly1305
Just use the ChaCha20 and Poly1305 libraries instead of the clunky
crypto API.  This is much simpler.  It is also slightly faster, since
the libraries provide more direct access to the same
architecture-optimized ChaCha20 and Poly1305 code.

I've tested that existing encrypted bcachefs filesystems can be continue
to be accessed with this patch applied.

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-06 19:33:53 -04:00
Eric Biggers
b261d22220 lib/crc: remove CONFIG_LIBCRC32C
Now that LIBCRC32C does nothing besides select CRC32, make every option
that selects LIBCRC32C instead select CRC32 directly.  Then remove
LIBCRC32C.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Martin K. Petersen" <martin.petersen@oracle.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20250401221600.24878-8-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2025-04-04 11:31:42 -07:00
Eric Biggers
a07c43e6c2 bcachefs: add missing selection of XARRAY_MULTI
When CONFIG_XARRAY_MULTI is not set, reading from a bcachefs file hits
the 'BUG_ON(order > 0);' in xas_set_order(), because it tries to insert
a large folio in the page cache.  Fix this by making bcachefs select
XARRAY_MULTI.

Fixes: be212d86b1 ("bcachefs: bs > ps support")
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-04-02 10:24:34 -04:00
Eric Biggers
71fbb0b86e bcachefs: use sha256() instead of crypto_shash API
Just use sha256() instead of the clunky crypto API.  This is much
simpler.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-24 09:50:34 -04:00
Kent Overstreet
9cf6b84b71 bcachefs: CONFIG_BCACHEFS_INJECT_TRANSACTION_RESTARTS
Incorrectly handled transaction restarts can be a source of heisenbugs;
add a mode where we randomly inject them to shake them out.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-02-12 18:40:19 -05:00
Linus Torvalds
37b33c68b0 CRC updates for 6.14
- Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API.  This is much simpler and more efficient.
 
 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API.  More conversions like this will come later.
 
 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.
 
 - Add an entry to MAINTAINERS for the kernel's CRC library code.  I'm
   volunteering to maintain it.  I have additional cleanups and
   optimizations planned for future cycles.
 
 These patches have been in linux-next since -rc1.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQSacvsUNc7UX4ntmEPzXCl4vpKOKwUCZ418ZRQcZWJpZ2dlcnNA
 Z29vZ2xlLmNvbQAKCRDzXCl4vpKOKyJYAP9kBlpm8W9/XY6N8SpjKaXE/vKQYHQl
 Nobhak06Us8uJwEAkcUTymWP4IwQj5A9jgBAPRw53FQcNVKIc+01C7gRHw0=
 =mqSH
 -----END PGP SIGNATURE-----

Merge tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux

Pull CRC updates from Eric Biggers:

 - Reorganize the architecture-optimized CRC32 and CRC-T10DIF code to be
   directly accessible via the library API, instead of requiring the
   crypto API. This is much simpler and more efficient.

 - Convert some users such as ext4 to use the CRC32 library API instead
   of the crypto API. More conversions like this will come later.

 - Add a KUnit test that tests and benchmarks multiple CRC variants.
   Remove older, less-comprehensive tests that are made redundant by
   this.

 - Add an entry to MAINTAINERS for the kernel's CRC library code. I'm
   volunteering to maintain it. I have additional cleanups and
   optimizations planned for future cycles.

* tag 'crc-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiggers/linux: (31 commits)
  MAINTAINERS: add entry for CRC library
  powerpc/crc: delete obsolete crc-vpmsum_test.c
  lib/crc32test: delete obsolete crc32test.c
  lib/crc16_kunit: delete obsolete crc16_kunit.c
  lib/crc_kunit.c: add KUnit test suite for CRC library functions
  powerpc/crc-t10dif: expose CRC-T10DIF function through lib
  arm64/crc-t10dif: expose CRC-T10DIF function through lib
  arm/crc-t10dif: expose CRC-T10DIF function through lib
  x86/crc-t10dif: expose CRC-T10DIF function through lib
  crypto: crct10dif - expose arch-optimized lib function
  lib/crc-t10dif: add support for arch overrides
  lib/crc-t10dif: stop wrapping the crypto API
  scsi: target: iscsi: switch to using the crc32c library
  f2fs: switch to using the crc32 library
  jbd2: switch to using the crc32c library
  ext4: switch to using the crc32c library
  lib/crc32: make crc32c() go directly to lib
  bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
  x86/crc32: expose CRC32 functions through lib
  x86/crc32: update prototype for crc32_pclmul_le_16()
  ...
2025-01-22 19:55:08 -08:00
Geert Uytterhoeven
d36b3e74b6 bcachefs: BCACHEFS_PATH_TRACEPOINTS should depend on TRACING
When tracing is disabled, there is no point in asking the user about
enabling extra btree_path tracepoints in bcachefs.

Fixes: 32ed4a620c ("bcachefs: Btree path tracepoints")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-12-21 01:36:21 -05:00
Eric Biggers
cc354fa7f0 bcachefs: Explicitly select CRYPTO from BCACHEFS_FS
Explicitly select CRYPTO from BCACHEFS_FS, so that this dependency of
CRYPTO_SHA256, CRYPTO_CHACHA20, and CRYPTO_POLY1305 (which are also
selected) is satisfied.  Currently this dependency is satisfied
indirectly via LIBCRC32C, but this is fragile and is planned to change
(https://lore.kernel.org/r/20241021002935.325878-13-ebiggers@kernel.org).

Acked-by: Kent Overstreet <kent.overstreet@linux.dev>
Link: https://lore.kernel.org/r/20241202010844.144356-15-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
2024-12-01 17:23:02 -08:00
Kuan-Wei Chiu
92a8b224b8 lib/min_heap: introduce non-inline versions of min heap API functions
Patch series "Enhance min heap API with non-inline functions and
optimizations", v2.

Add non-inline versions of the min heap API functions in lib/min_heap.c
and updates all users outside of kernel/events/core.c to use these
non-inline versions.  To mitigate the performance impact of indirect
function calls caused by the non-inline versions of the swap and compare
functions, a builtin swap has been introduced that swaps elements based on
their size.  Additionally, it micro-optimizes the efficiency of the min
heap by pre-scaling the counter, following the same approach as in
lib/sort.c.  Documentation for the min heap API has also been added to the
core-api section.


This patch (of 10):

All current min heap API functions are marked with '__always_inline'. 
However, as the number of users increases, inlining these functions
everywhere leads to a increase in kernel size.

In performance-critical paths, such as when perf events are enabled and
min heap functions are called on every context switch, it is important to
retain the inline versions for optimal performance.  To balance this, the
original inline functions are kept, and additional non-inline versions of
the functions have been added in lib/min_heap.c.

Link: https://lkml.kernel.org/r/20241020040200.939973-1-visitorckw@gmail.com
Link: https://lore.kernel.org/20240522161048.8d8bbc7b153b4ecd92c50666@linux-foundation.org
Link: https://lkml.kernel.org/r/20241020040200.939973-2-visitorckw@gmail.com
Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com>
Suggested-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ching-Chun (Jim) Huang <jserv@ccns.ncku.edu.tw>
Cc: Coly Li <colyli@suse.de>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: "Liang, Kan" <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Sakai <msakai@redhat.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-05 17:12:34 -08:00
Kent Overstreet
32ed4a620c bcachefs: Btree path tracepoints
Fastpath tracepoints, rarely needed, only enabled with
CONFIG_BCACHEFS_PATH_TRACEPOINTS.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-09-09 09:41:48 -04:00
Kent Overstreet
31403dca5b bcachefs: optimize __bch2_trans_get(), kill DEBUG_TRANSACTIONS
- Some tweaks to greatly reduce locking overhead for the list of btree
   transactions, so that it can always be enabled: leave btree_trans
   objects on the list when they're on the percpu single item freelist,
   and only check for duplicates in the same process when
   CONFIG_BCACHEFS_DEBUG is enabled

 - don't zero out the full btree_trans() unless we allocated it from
   the mempool

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:44 -05:00
Kent Overstreet
011173321f bcachefs: six locks: Simplify optimistic spinning
osq lock maintainers don't want it to be used outside of kernel/locking/
- but, we can do better.

Since we have lock handoff signalled via waitlist entries, there's no
reason for optimistic spinning to have to look at the lock at all -
aside from checking lock-owner; we can just spin looking at our waitlist
entry.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2024-01-01 11:47:38 -05:00
Kent Overstreet
6201d91ee3 bcachefs: Put erasure coding behind an EXPERIMENTAL kconfig option
We still have disk space accounting changes coming for erasure coding,
and the changes won't be as strictly backwards compatible as they'd
ought to be - specifically, we need to start accounting striped data
under a separate counter in bch_alloc (which describes buckets).

A fsck will suffice for upgrading/downgrading, but since erasure coding
is the most incomplete major feature of bcachefs it still makes sense to
put behind a separate kconfig option, so that users are fully aware.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-24 00:29:58 -05:00
Kent Overstreet
bf61dcdfc1 bcachefs: CONFIG_BCACHEFS_DEBUG_TRANSACTIONS no longer defaults to y
BCACHEFS_DEBUG_TRANSACTIONS is useful, but it's too expensive to have on
by default - and it hasn't been coming up in bug reports.

Turn it off by default until we figure out a way to make it cheaper.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04 22:19:13 -04:00
Kent Overstreet
4db8ac8629 bcachefs: Fix MEAN_AND_VARIANCE kconfig options
Fixes:

https://lore.kernel.org/linux-bcachefs/CAMuHMdXpwMdLuoWsNGa8qacT_5Wv-vSTz0xoBR5n_fnD9cNOuQ@mail.gmail.com/

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-11-04 14:17:11 -04:00
Kent Overstreet
986e9842fb bcachefs: Compression levels
This allows including a compression level when specifying a compression
type, e.g.
  compression=zstd:15

Values from 1 through 15 indicate compression levels, 0 or unspecified
indicates the default.

For LZ4, values 3-15 specify that the HC algorithm should be used.

Note that for compatibility, extents themselves only include the
compression type, not the compression level. This means that specifying
the same compression algorithm but different compression levels for the
compression and background_compression options will have no effect.

XXX: perhaps we could add a warning for this

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:07 -04:00
Kent Overstreet
dbc7deb2af bcachefs: Mark as EXPERIMENTAL
As discussed on list, bcachefs is going to be marked as experimental for
a few releases, until the inevitable tide of new bug reports subsides.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:10:06 -04:00
Daniel Hill
bf8f8b20a1 bcachefs: time stats now uses the mean_and_variance module.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:43 -04:00
Daniel Hill
92095781e0 bcachefs: Mean and variance
This module provides a fast 64bit implementation of basic statistics
functions, including mean, variance and standard deviation in both
weighted and unweighted variants, the unweighted variant has a 32bit
limitation per sample to prevent overflow when squaring.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:43 -04:00
Kent Overstreet
615f867c14 bcachefs: Improved errcodes
Instead of overloading standard error codes (EINTR/EAGAIN), and defining
short lists of error codes in multiple places that potentially end up
overlapping & conflicting, we're now going to have one master list of
error codes.

Error codes are defined with an x-macro: thus we also have
bch2_err_str() now.

Also, error codes have a class field. Now, instead of checking for
errors with ==, code should use bch2_err_matches(), which returns true
if the error is equal to or a sub-error of the error class.

This means we can define unique errors for every source location where
an error is generated, which will help improve our error messages.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:36 -04:00
Daniel Hill
c807ca95a6 bcachefs: added lock held time stats
We now record the length of time btree locks are held and expose this in debugfs.

Enabled via CONFIG_BCACHEFS_LOCK_TIME_STATS.

Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:35 -04:00
Kent Overstreet
b84d42c31f bcachefs: Split out CONFIG_BCACHEFS_DEBUG_TRANSACTIONS
This puts the btree_transactions sysfs/debugfs file behind a separate
config option - it's highly useful, but not cheap enough to enable
permenantly.

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:19 -04:00
jpsollie
41e633826a bcachefs: add bcachefs xxhash support
xxhash is a much faster algorithm compared to crc32.
could be used to speed up checksum calculation.
xxhash 64-bit only, as it is much faster on 64-bit CPUs compared to xxh32.

Signed-off-by: jpsollie <janpieter.sollie@edpnet.be>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:09:07 -04:00
Kent Overstreet
876c7af3a6 bcachefs: Take a SRCU lock in btree transactions
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:47 -04:00
Kent Overstreet
04c2c34f00 bcachefs: use crc64 from lib/
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:14 -04:00
Kent Overstreet
cd575ddf57 bcachefs: Erasure coding
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:11 -04:00
Kent Overstreet
1c6fdbd8f2 bcachefs: Initial commit
Initially forked from drivers/md/bcache, bcachefs is a new copy-on-write
filesystem with every feature you could possibly want.

Website: https://bcachefs.org

Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2023-10-22 17:08:07 -04:00