Commit Graph

447 Commits

Author SHA1 Message Date
David Sterba
ab5fcbb1ad btrfs: use pgoff_t for page index variables
Any conversion of offsets in the logical or the physical mapping space
of the pages is done by a shift and the target type should be pgoff_t
(type of struct page::index). Fix the locations where it's still
unsigned long.

Signed-off-by: David Sterba <dsterba@suse.com>
2025-07-21 23:58:05 +02:00
David Sterba
d64ef1d23f btrfs: rename err to ret in btrfs_alloc_from_bitmap()
Unify naming of return value to the preferred way.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-07-21 23:53:28 +02:00
Filipe Manana
443e4d0e1c btrfs: return real error from __filemap_get_folio() calls
We have a few places that always assume a -ENOMEM error happened in case a
call to __filemap_get_folio() returns an error, which is just too much of
an assumption and even if it would be the case at some point in time, it's
not future proof and there's nothing in the documentation that guarantees
that only ERR_PTR(-ENOMEM) can be returned with the flags we are passing
to it.

So use the exact error returned by __filemap_get_folio() instead.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:57 +02:00
David Sterba
2d44a15afd btrfs: use list_first_entry() everywhere
Using the helper makes it a bit more clear that we're accessing the
first list entry.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:47 +02:00
Filipe Manana
66da9c1bed btrfs: rename the functions to search for bits in extent ranges
These functions are exported so they should have a 'btrfs_' prefix by
convention, to make it clear they are btrfs specific and to avoid
collisions with functions from elsewhere in the kernel.

So add a 'btrfs_' prefix to their name to make it clear they are from
btrfs.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:44 +02:00
Filipe Manana
9d222562b4 btrfs: rename the functions to clear bits for an extent range
These functions are exported so they should have a 'btrfs_' prefix by
convention, to make it clear they are btrfs specific and to avoid
collisions with functions from elsewhere in the kernel. One of them has a
double underscore prefix which is also discouraged.

So remove double underscore prefix where applicable and add a 'btrfs_'
prefix to their name to make it clear they are from btrfs.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:43 +02:00
Filipe Manana
242570e80b btrfs: add btrfs prefix to main lock, try lock and unlock extent functions
These functions are exported so they should have a 'btrfs_' prefix by
convention, to make it clear they are btrfs specific and to avoid
collisions with functions from elsewhere in the kernel. So add a prefix to
their name.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:43 +02:00
David Sterba
e235418118 btrfs: do more trivial BTRFS_PATH_AUTO_FREE conversions
The most trivial pattern for the auto freeing when the variable is
declared with the macro and the final btrfs_free_path() is removed.
There are almost none goto -> return conversions and there's no other
function cleanup.

Reviewed-by: Daniel Vacek <neelx@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-05-15 14:30:41 +02:00
Filipe Manana
92be661a57 btrfs: make btrfs_iget_path() return a btrfs inode instead
It's an internal function and btrfs_iget() is now returning a btrfs inode,
so change btrfs_iget_path() to also return a btrfs inode instead of a VFS
inode.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-18 20:35:50 +01:00
David Sterba
c42c0db1bb btrfs: use BTRFS_PATH_AUTO_FREE in btrfs_remove_free_space_inode()
This is the trivial pattern for path auto free, initialize at the
beginning and free at the end with simple goto -> return conversions.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-18 20:35:48 +01:00
David Sterba
dba6ae0b43 btrfs: unify ordering of btrfs_key initializations
The btrfs_key is defined as objectid/type/offset and the keys are also
printed like that. For better readability, update all key
initializations to match this order.

Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-18 20:35:42 +01:00
Matthew Wilcox (Oracle)
8be4cb04cb btrfs: convert io_ctl_prepare_pages() to work on folios
Retrieve folios instead of pages and work on them throughout.  Removes
a few calls to compound_head() and a reference to page->mapping.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-03-17 14:44:43 +01:00
David Sterba
3a1c46dbc9 btrfs: open code set_page_extent_mapped()
The function set_page_extent_mapped() is now a simple wrapper so use the
folio helper.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13 14:53:22 +01:00
Filipe Manana
038d6999ec btrfs: free-space-cache: remove unnecessary calls to btrfs_mark_buffer_dirty()
We have several places explicitly calling btrfs_mark_buffer_dirty() but
that is not necessarily since the target leaf came from a path that was
obtained for a btree search function that modifies the btree, something
like btrfs_insert_empty_item() or anything else that ends up calling
btrfs_search_slot() with a value of 1 for its 'cow' argument.

These just make the code more verbose, confusing and add a little extra
overhead and well as increase the module's text size, so remove them.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13 14:53:19 +01:00
Filipe Manana
07174a3429 btrfs: move extent-tree function declarations out of ctree.h
We have 3 functions that have their prototypes declared in ctree.h but
they are defined at extent-tree.c and they are unrelated to the btree
data structure. Move the prototypes out of ctree.h and into extent-tree.h.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2025-01-13 14:53:17 +01:00
Qu Wenruo
e820dbeb6a btrfs: convert btrfs_buffered_write() to use folios
The buffered write path is still heavily utilizing the page interface.
Since we have converted it to do a page-by-page copying, it's much easier
to convert all involved functions to folio interface, this involves:

- btrfs_copy_from_user()
- btrfs_drop_folio()
- prepare_uptodate_page()
- prepare_one_page()
- lock_and_cleanup_extent_if_need()
- btrfs_dirty_page()

All function are changed to accept a folio parameter, and if the word
"page" is in the function name, change that to "folio" too.

The function btrfs_dirty_page() is exported for v1 space cache, convert
v1 cache call site to convert its page to folio for the new interface.

And there is a small enhancement for prepare_one_folio(), instead of
manually waiting for the page writeback, let __filemap_get_folio() to
handle that by using FGP_WRITEBEGIN, which implies
(FGP_LOCK | FGP_WRITE | FGP_CREAT | FGP_STABLE).

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11 14:34:19 +01:00
Qu Wenruo
c87c299776 btrfs: make buffered write to copy one page a time
Currently the btrfs_buffered_write() is preparing multiple page a time,
allowing a better performance.

But the current trend is to support larger folio as an optimization,
instead of implementing own multi-page optimization.

This is inspired by generic_perform_write(), which is copying one folio
a time.

Such change will prepare us to migrate to implement the write_begin()
and write_end() callbacks, and make every involved function a little
easier.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11 14:34:19 +01:00
Thorsten Blum
4f285a7752 btrfs: use str_yes_no() helper function in btrfs_dump_free_space()
Remove hard-coded strings by using the str_yes_no() and str_no_yes()
helper functions.

Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11 14:34:18 +01:00
Qu Wenruo
00c5135dce btrfs: remove the dirty_page local variable
Inside btrfs_buffered_write(), we have a local variable @dirty_pages,
recording the number of pages we dirtied in the current iteration.

However we do not really need that variable, since it can be calculated
from @pos and @copied.

In fact there is already a problem inside the short copy path, where we
use @dirty_pages to calculate the range we need to release.
But that usage assumes sectorsize == PAGE_SIZE, which is no longer true.

Instead of keeping @dirty_pages and cause incorrect usage, just
calculate the number of dirtied pages inside btrfs_dirty_pages().

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-11-11 14:34:14 +01:00
Luca Stefani
69313850dc btrfs: add cancellation points to trim loops
There are reports that system cannot suspend due to running trim because
the task responsible for trimming the device isn't able to finish in
time, especially since we have a free extent discarding phase, which can
trim a lot of unallocated space. There are no limits on the trim size
(unlike the block group part).

Since trime isn't a critical call it can be interrupted at any time,
in such cases we stop the trim, report the amount of discarded bytes and
return an error.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219180
Link: https://bugzilla.suse.com/show_bug.cgi?id=1229737
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Luca Stefani <luca.stefani.ge1@gmail.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-10-07 23:21:56 +02:00
Naohiro Aota
e30729d4bd btrfs: zoned: properly take lock to read/update block group's zoned variables
__btrfs_add_free_space_zoned() references and modifies bg's alloc_offset,
ro, and zone_unusable, but without taking the lock. It is mostly safe
because they monotonically increase (at least for now) and this function is
mostly called by a transaction commit, which is serialized by itself.

Still, taking the lock is a safer and correct option and I'm going to add a
change to reset zone_unusable while a block group is still alive. So, add
locking around the operations.

Fixes: 169e0da91a ("btrfs: zoned: track unusable bytes for zones")
CC: stable@vger.kernel.org # 5.15+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-08-15 20:35:56 +02:00
Naohiro Aota
8cd44dd1d1 btrfs: zoned: fix zone_unusable accounting on making block group read-write again
When btrfs makes a block group read-only, it adds all free regions in the
block group to space_info->bytes_readonly. That free space excludes
reserved and pinned regions. OTOH, when btrfs makes the block group
read-write again, it moves all the unused regions into the block group's
zone_unusable. That unused region includes reserved and pinned regions.
As a result, it counts too much zone_unusable bytes.

Fortunately (or unfortunately), having erroneous zone_unusable does not
affect the calculation of space_info->bytes_readonly, because free
space (num_bytes in btrfs_dec_block_group_ro) calculation is done based on
the erroneous zone_unusable and it reduces the num_bytes just to cancel the
error.

This behavior can be easily discovered by adding a WARN_ON to check e.g,
"bg->pinned > 0" in btrfs_dec_block_group_ro(), and running fstests test
case like btrfs/282.

Fix it by properly considering pinned and reserved in
btrfs_dec_block_group_ro(). Also, add a WARN_ON and introduce
btrfs_space_info_update_bytes_zone_unusable() to catch a similar mistake.

Fixes: 169e0da91a ("btrfs: zoned: track unusable bytes for zones")
CC: stable@vger.kernel.org # 5.15+
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-29 19:21:19 +02:00
Filipe Manana
320d8dc612 btrfs: fix bitmap leak when loading free space cache on duplicate entry
If we failed to link a free space entry because there's already a
conflicting entry for the same offset, we free the free space entry but
we don't free the associated bitmap that we had just allocated before.
Fix that by freeing the bitmap before freeing the entry.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 15:52:25 +02:00
David Sterba
e108c86b10 btrfs: switch btrfs_block_group::inode to struct btrfs_inode
The structure is internal so we should use struct btrfs_inode for that.

Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 15:33:28 +02:00
Filipe Manana
d383eb69eb btrfs: remove super block argument from btrfs_iget_path()
It's pointless to pass a super block argument to btrfs_iget_path() because
we always pass a root and from it we can get the super block through:

   root->fs_info->sb

So remove the super block argument.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 15:33:25 +02:00
Filipe Manana
e641e323ab btrfs: pass a btrfs_inode to btrfs_wait_ordered_range()
Instead of passing a (VFS) inode pointer argument, pass a btrfs_inode
instead, as this is generally what we do for internal APIs, making it
more consistent with most of the code base. This will later allow to
help to remove a lot of BTRFS_I() calls in btrfs_sync_file().

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 15:33:18 +02:00
Filipe Manana
cef2daba42 btrfs: pass a btrfs_inode to btrfs_fdatawrite_range()
Instead of passing a (VFS) inode pointer argument, pass a btrfs_inode
instead, as this is generally what we do for internal APIs, making it
more consistent with most of the code base. This will later allow to
help to remove a lot of BTRFS_I() calls in btrfs_sync_file().

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-07-11 15:33:18 +02:00
Linus Torvalds
66e55ff12e for-6.10-rc5-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmZ9gV0ACgkQxWXV+ddt
 WDsK2hAAmbOArbK5QHprawdOqqvJL46yoGCMba798EjYo53+hO1F8/lb531+zCUI
 GDhdIC2mHdRIkARJ8Cde5POUjID1Kv3Rlc0rdHy3nOw38WZmA/+HdkcKzQhsDFSR
 /FX9RKSWiu0xl6JdCLh4KkIWE+2m1v1kybhvRHCKb+70iBua1e+OSoM33BeiIhrP
 yoFwMwIbzG2CoZOHoobDxUjs9ZMUHm4wH0csJYG9R59Vv7uLBOgpWuQB46iqpoj4
 EYR8Sg8PscI7YXa0y8VTP3pdrNMW48IC6jerIAKHUeWRWRoTCh9He+3/E4xjNHxz
 3Pm+Aat5QYdsqmE68IbeN5c7QB1YAdUCgoJJJwAFjwe9WtTn8RS9RiMjWIr+VHYm
 E3REibQI151p0yHwAl8xPHDiTecmlNisof0eg6gzHdvODm/NYFuFapD+aDxWribX
 G63dOa8Fy0h4pwDoF73Rd2YYbtO6tDVSNVIG3bWpPep3r+SI/oC4JMHbmn1aqmqF
 c0/ZYVbsx/Hm066l4LCrpgi7kJ8en2zQ8MmcZHHt+2gXe1AyAON7kvRHaEizUCaA
 fnLVhQvaOofC4g7DJc1JkwyLc9VF5hMTYUhldoJvqj1wm2qsT/siJgKVvllRGoMs
 FU2qlWYaN/fPRylyEySyZPq/sWKHOAZZSOhyWM8SB2nYoBXNkO0=
 =qpEp
 -----END PGP SIGNATURE-----

Merge tag 'for-6.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix quota root leak after quota disable failure

 - fix condition when checking if a zone can be added as free

 - allocate inode in NOFS context during logging or tree-log replay

 - handle raid-stripe-tree lookup correctly during scrub

* tag 'for-6.10-rc5-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: qgroup: fix quota root leak after quota disable failure
  btrfs: scrub: handle RST lookup error correctly
  btrfs: zoned: fix initial free space detection
  btrfs: use NOFS context when getting inodes during logging and log replay
2024-06-27 10:26:16 -07:00
Naohiro Aota
b9fd2affe4 btrfs: zoned: fix initial free space detection
When creating a new block group, it calls btrfs_add_new_free_space() to add
the entire block group range into the free space accounting.
__btrfs_add_free_space_zoned() checks if size == block_group->length to
detect the initial free space adding, and proceed that case properly.

However, if the zone_capacity == zone_size and the over-write speed is fast
enough, the entire zone can be over-written within one transaction. That
confuses __btrfs_add_free_space_zoned() to handle it as an initial free
space accounting. As a result, that block group becomes a strange state: 0
used bytes, 0 zone_unusable bytes, but alloc_offset == zone_capacity (no
allocation anymore).

The initial free space accounting can properly be checked by checking
alloc_offset too.

Fixes: 98173255bd ("btrfs: zoned: calculate free space from zone capacity")
CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-06-25 00:33:57 +02:00
Alexander Lobakin
4ca532d646 btrfs: rename bitmap_set_bits() -> btrfs_bitmap_set_bits()
bitmap_set_bits() does not start with the FS' prefix and may collide
with a new generic helper one day. It operates with the FS-specific
types, so there's no change those two could do the same thing.
Just add the prefix to exclude such possible conflict.

Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: David Sterba <dsterba@suse.com>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-04-01 10:49:28 +01:00
Chengming Zhou
ef5a05c557 btrfs: remove SLAB_MEM_SPREAD flag use
The SLAB_MEM_SPREAD flag used to be implemented in SLAB, which was
removed as of v6.8-rc1, so it became a dead flag since the commit
16a1d96835 ("mm/slab: remove mm/slab.c and slab_def.h"). And the
series[1] went on to mark it obsolete to avoid confusion for users.
Here we can just remove all its users, which has no functional change.

[1] https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-05 17:13:23 +01:00
Kunwu Chan
06c9564980 btrfs: use KMEM_CACHE() to create btrfs_free_space cache
Use the KMEM_CACHE() macro instead of kmem_cache_create() to simplify
the creation of SLAB caches when the default values are used.

Signed-off-by: Kunwu Chan <chentao@kylinos.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04 16:24:54 +01:00
David Sterba
41044b41ad btrfs: add helper to get fs_info from struct inode pointer
Add a convenience helper to get a fs_info from a VFS inode pointer
instead of open coding the chain or using btrfs_sb() that in some cases
does one more pointer hop.  This is implemented as a macro (still with
type checking) so we don't need full definitions of struct btrfs_inode,
btrfs_root or btrfs_fs_info.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04 16:24:49 +01:00
Lijuan Li
737e6e5f0c btrfs: mark __btrfs_add_free_space static
__btrfs_add_free_space is only used in free-space-cache.c,
so mark it static.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Lijuan Li <lilijuan@iscas.ac.cn>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04 16:24:48 +01:00
David Sterba
2b712e3bb2 btrfs: remove unused included headers
With help of neovim, LSP and clangd we can identify header files that
are not actually needed to be included in the .c files. This is focused
only on removal (with minor fixups), further cleanups are possible but
will require doing the header files properly with forward declarations,
minimized includes and include-what-you-use care.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2024-03-04 16:24:46 +01:00
Qu Wenruo
55151ea9ec btrfs: migrate subpage code to folio interfaces
Although subpage itself is conflicting with higher folio, since subpage
(sectorsize < PAGE_SIZE and nodesize < PAGE_SIZE) means we will never
need higher order folio, there is a hidden pitfall:

- btrfs_page_*() helpers

Those helpers are an abstraction to handle both subpage and non-subpage
cases, which means we're going to pass pages pointers to those helpers.

And since those helpers are shared between data and metadata paths, it's
unavoidable to let them to handle folios, including higher order
folios).

Meanwhile for true subpage case, we should only have a single page
backed folios anyway, thus add a new ASSERT() for btrfs_subpage_assert()
to ensure that.

Also since those helpers are shared between both data and metadata, add
some extra ASSERT()s for data path to make sure we only get single page
backed folio for now.

Signed-off-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-12-15 23:03:58 +01:00
Filipe Manana
8b9d032225 btrfs: remove redundant root argument from btrfs_update_inode()
The root argument for btrfs_update_inode() always matches the root of the
given inode, so remove the root argument and get it from the inode
argument.

Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12 16:44:12 +02:00
Filipe Manana
50564b651d btrfs: abort transaction on generation mismatch when marking eb as dirty
When marking an extent buffer as dirty, at btrfs_mark_buffer_dirty(),
we check if its generation matches the running transaction and if not we
just print a warning. Such mismatch is an indicator that something really
went wrong and only printing a warning message (and stack trace) is not
enough to prevent a corruption. Allowing a transaction to commit with such
an extent buffer will trigger an error if we ever try to read it from disk
due to a generation mismatch with its parent generation.

So abort the current transaction with -EUCLEAN if we notice a generation
mismatch. For this we need to pass a transaction handle to
btrfs_mark_buffer_dirty() which is always available except in test code,
in which case we can pass NULL since it operates on dummy extent buffers
and all test roots have a single node/leaf (root node at level 0).

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12 16:44:07 +02:00
Josef Bacik
03e8634896 btrfs: remove btrfs_crc32c wrapper
This simply sends the same arguments into crc32c(), and is just used in
a few places.  Remove this wrapper and directly call crc32c() in these
instances.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12 16:44:02 +02:00
Josef Bacik
102f2640a3 btrfs: move btrfs_crc32c_final into free-space-cache.c
This is the only place this helper is used, take it out of ctree.h and
move it into free-space-cache.c.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-10-12 16:44:02 +02:00
Naohiro Aota
6a8ebc773e btrfs: zoned: no longer count fresh BG region as zone unusable
Now that we switched to write time activation, we no longer need to (and
must not) count the fresh region as zone unusable. This commit is similar
to revert of commit fa2068d7e9 ("btrfs: zoned: count fresh BG
region as zone unusable").

Signed-off-by: Naohiro Aota <naohiro.aota@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21 14:52:19 +02:00
Filipe Manana
4d2024e90d btrfs: print target number of bytes when dumping free space
When dumping free space, with btrfs_dump_free_space(), we pass a bytes
argument in order to count how many free space entries in the block group
have a size greater than or equal to that number of bytes. We then print
how many suitable entries we found, but we don't print the target number
of bytes, we just say "bytes". Change the message to actually print the
number of bytes, which makes debugging -ENOSPC issues a bit easier.

Also sligthly change the odd grammar and terminology: the sentence is
ending with 'is', which doesn't make sense, and the term 'blocks' is
confusing as we are referring to free space entries within the block
group's free space cache.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21 14:52:17 +02:00
Filipe Manana
e5860f8207 btrfs: make find_first_extent_bit() return a boolean
Currently find_first_extent_bit() returns a 0 if it found a range in the
given io tree and 1 if it didn't find any. There's no need to return any
errors, so make the return value a boolean and invert the logic to make
more sense: return true if it found a range and false if it didn't find
any range.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-08-21 14:52:12 +02:00
Josef Bacik
54d687c13a btrfs: move btrfs_check_trunc_cache_free_space into block-rsv.c
This is completely related to block rsv's, move it out of the free space
cache code and into block-rsv.c.

Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
7e5ba55994 btrfs: assert tree lock is held when removing free space entries
Removing a free space entry from an in memory space cache requires having
the corresponding btrfs_free_space_ctl's 'tree_lock' held. We have several
code paths that remove an entry, so add assertions where appropriate to
verify we are holding the lock, as the lock is acquired by some other
function up in the call chain, which makes it easy to miss in the future.

Note: for this to work we need to lock the local btrfs_free_space_ctl at
load_free_space_cache(), which was not being done because it's local,
declared on the stack, so no other task has access to it.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
9649bd9a29 btrfs: assert tree lock is held when linking free space
When linking a free space entry, at link_free_space(), the caller should
be holding the spinlock 'tree_lock' of the given btrfs_free_space_ctl
argument, which is necessary for manipulating the red black tree of free
space entries (done by tree_insert_offset(), which already asserts the
lock is held) and for manipulating the 'free_space', 'free_extents',
'discardable_extents' and 'discardable_bytes' counters of the given
struct btrfs_free_space_ctl.

So assert that the spinlock 'tree_lock' of the given btrfs_free_space_ctl
is held by the current task. We have multiple code paths that end up
calling link_free_space(), and all currently take the lock before calling
it.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
91de9e978d btrfs: assert tree lock is held when searching for free space entries
When searching for a free space entry by offset, at tree_search_offset(),
we are supposed to have the btrfs_free_space_ctl's 'tree_lock' held, so
assert that. We have multiple callers of tree_search_offset(), and all
currently hold the necessary lock before calling it.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
13c2018fcc btrfs: assert proper locks are held at tree_insert_offset()
There are multiple code paths leading to tree_insert_offset(), and each
path takes the necessary locks before tree_insert_offset() is called,
since they do other things that require those locks to be held. This makes
it easy to miss the locking somewhere, so make tree_insert_offset() assert
that the required locks are being held by the calling task.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
0d6bac4d30 btrfs: simplify arguments to tree_insert_offset()
For the in-memory component of space caching (free space cache and free
space tree), three of the arguments passed to tree_insert_offset() can
always be taken from the new free space entry that we are about to add.

So simplify tree_insert_offset() to take the new entry instead of the
'offset', 'node' and 'bitmap' arguments. This will also allow to make
further changes simpler.

Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00
Filipe Manana
b77433b144 btrfs: use precomputed end offsets at do_trimming()
The are two computations of end offsets at do_trimming() that are not
necessary, as they were previously computed and stored in local const
variables. So just use the variables instead, to make the source code
shorter and easier to read.

Reviewed-by: Anand Jain <anand.jain@oracle.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2023-06-19 13:59:24 +02:00