mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs synced 2025-04-28 06:00:44 +00:00

Author	SHA1	Message	Date
Alexander Motin	ae1d11882d	BRT: Clear bv_entcount_dirty on destroy This fixes assertion in brt_sync_table() on debug builds when last cloned block on the vdev is freed and bv_meta_dirty is cleared, while bv_entcount_dirty is not. Should not matter in production. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16791	2024-11-21 08:20:22 -08:00
Brian Behlendorf	225e76cd7d	Linux 6.12 compat: META Update the META file to reflect compatibility with the 6.12 kernel. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16793	2024-11-21 07:56:30 -08:00
Alexander Motin	9a81484e35	ZAP: Reduce leaf array and free chunks fragmentation Previous implementation of zap_leaf_array_free() put chunks on the free list in reverse order. Also zap_leaf_transfer_entry() and zap_entry_remove() were freeing name and value arrays in reverse order. Together this created a mess in the free list, making following allocations much more fragmented than necessary. This patch re-implements zap_leaf_array_free() to keep existing chunks order, and implements non-destructive zap_leaf_array_copy() to be used in zap_leaf_transfer_entry() to allow properly ordered freeing name and value arrays there and in zap_entry_remove(). With this change test of some writes and deletes shows percent of non-contiguous chunks in DDT reducing from 61% and 47% to 0% and 17% for arrays and frees respectively. Sure some explicit sorting could do even better, especially for ZAPs with variable-size arrays, but it would also cost much more, while this should be very cheap. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16766	2024-11-20 13:37:52 -08:00
tleydxdy	d02257c280	fix: block incompatible kernel from being installed The current "Requires" lines only ensure the old kernel is available on the system but it does not prevent fedora from updating to an incompatible and breaking user's system. Set Conflicts to block incompatible kernels from being installed. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: tleydxdy <shironeko.github@tesaguri.club> Closes #16139	2024-11-20 13:19:07 -08:00
Mark Johnston	d76d79fd27	zio: Avoid sleeping in the I/O path zio_delay_interrupt(), apparently used for fault injection, is executed in the I/O pipeline. It can cause the calling thread to go to sleep, which is not allowed on FreeBSD. This happens only for small delays, though, and there's no apparent reason to avoid deferring to a taskqueue in that case, as it already does otherwise. Simply go to sleep unconditionally. This fixes an occasional panic I see when running the ZTS on FreeBSD. Also remove an unhelpful comment referencing the non-existent timeout_generic(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #16785	2024-11-20 08:23:08 -08:00
Alexander Motin	457f8b76e7	BRT: More optimizations after per-vdev splitting - With both pending and current AVL-trees being per-vdev and having effectively identical comparison functions (pending tree compared also birth time, but I don't believe it is possible for them to be different for the same offset within one transaction group), it makes no sense to move entries from one to another. Instead inline dramatically simplified brt_entry_addref() into brt_pending_apply(). It no longer requires bv_lock, since there is nothing concurrent to it at the time. And it does not need to search the tree for the previous entries, since it is the same tree, we already have the entry and we know it is unique. - Put brt_vdev_lookup() and brt_vdev_addref() into different tree traversals to avoid false positives in the first due to the second entcount modifications. It saves dramatic amount of time when a file cloned first time by not looking for non-existent ZAP entries. - Remove avl_is_empty(bv_tree) check from brt_maybe_exists(). I don't think it is needed, since by the time all added entries are already accounted in bv_entcount. The extra check must be producing too many false positives for no reason. Also we don't need bv_lock there, since bv_entcount pointer must be table at this point, and we don't care about false positive races here, while false negative should be impossible, since all brt_vdev_addref() have already completed by this point. This dramatically reduces lock contention on massive deletes of cloned blocks. The only remaining one is between multiple parallel free threads calling brt_entry_decref(). - Do not update ZAP if net change for a block over the TXG was 0. In combination with above it makes file move between datasets as cheap operation as originally intended if it fits into one TXG. - Do not allocate vdevs on pool creation or import if it did not have active block cloning. This allows to save a bit in few cases. - While here, add proper error handling in brt_load() on pool import instead of assertions. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16773	2024-11-20 06:20:14 -08:00
Brian Behlendorf	49a377aa30	ZTS: Fix zpool_status_008_pos false positive Increase the injected delay to 1000ms and the ZIO_SLOW_IO_MS threshold to 750ms to avoid false positives due to unrelated slow IOs which may occur in the CI environment. Additionally, clear the fault injection as soon as it is no longer required for the test case. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16769	2024-11-20 06:13:32 -08:00
Alexander Motin	0ca82c5680	L2ARC: Stop rebuild before setting spa_final_txg Without doing that there is a race window on export when history log write by completed rebuild dirties transaction beyond final, triggering assertion. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Amanakis <gamanakis@gmail.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16714 Closes #16782	2024-11-20 06:11:51 -08:00
Alexander Motin	534688948c	Remove hash_elements_max accounting from DBUF and ARC Those values require global atomics to get current hash_elements values in few of the hottest code paths, while in all the years I never cared about it. If somebody wants, it should be easy to get it by periodic sampling, since neither ARC header nor DBUF counts change so fast that it would be difficult to catch. For now I've left hash_elements_max kstat for ARC, since it was used/reported by arc_summary and it would break older versions, but now it just reports the current value. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16759	2024-11-19 07:00:16 -08:00
Rob Norris	ffe2112795	Move "no name changes" from compression to checksum table Compression names actually aren't used in dedup table names, but checksum names are. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16776	2024-11-19 06:55:27 -08:00
Steve Mokris	e08e832b10	Expand zpool-remove.8 manpage with example results Also fix comment cross-referencing to zpool.8. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Steve Mokris <smokris@softpixel.com> Closes #16777	2024-11-19 06:52:04 -08:00
Alexander Motin	0d6306be8c	Fix few __VA_ARGS typos in assertions It should be __VA_ARGS__, not __VA_ARGS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16780	2024-11-19 06:48:06 -08:00
Ameer Hamza	ff3df1211c	zed: prevent automatic replacement of offline vdevs When an OFFLINE device is physically removed, a spare is automatically activated. However, this behavior differs in FreeBSD, where we do not transition from OFFLINE state to REMOVED. Our support team has encountered cases where customers experienced unexpected behavior during drive replacements, with multiple spares activating for the same VDEV due to a single disk replacement. This patch ensures that a drive in an OFFLINE state remains in that state, preventing it from transitioning to REMOVED and being automatically replaced by a spare. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #16751	2024-11-15 15:08:16 -08:00
Alexander Motin	fd6e8c1d2a	BRT: Rework structures and locks to be per-vdev While block cloning operation from the beginning was made per-vdev, before this change most of its data were protected by two pool- wide locks. It created lots of lock contention in many workload. This change makes most of block cloning data structures per-vdev, which allows to lock them separately. The only pool-wide lock now it spa_brt_lock, protecting array of per-vdev pointers and in most cases taken as reader. Also this splits per-vdev locks into three different ones: bv_pending_lock protects the AVL-tree of pending operations in open context, bv_mos_entries_lock protects BRT ZAP object from while being prefetched, and bv_lock protects the rest of per-vdev context during TXG commit process. There should be no functional difference aside of some optimizations. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16740	2024-11-15 15:04:11 -08:00
Alexander Motin	309ce6303f	ZAP: Add by_dnode variants to lookup/prefetch_uint64 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16740	2024-11-15 15:04:02 -08:00
Alexander Motin	1ee251bdde	BRT: Don't call brt_pending_remove() on holes/embedded We are doing exactly the same checks around all brt_pending_add(). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16740	2024-11-15 15:03:57 -08:00
Alexander Motin	483087b06f	ZTS: Avoid embedded blocks in bclone/bclone_prop_sync If we write less than 113 bytes with enabled compression we get embeded block, which then fails check for number of cloned blocks in bclone_test. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16740	2024-11-15 15:02:44 -08:00
Rob Norris	648873f020	AUTHORS: refresh with recent new contributors Welcome to the party 🎉 Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16762	2024-11-15 15:00:06 -08:00
José Luis Salvador Rufo	de2e9a5c6b	tests: fix uClibc for getversion.c This patch fixes compilation with uClibc by applying the same fallback as commit `e12d76176d` to the `getversion.c` file, which was previously overlooked. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com> Closes #16735 Closes #16741	2024-11-14 16:56:12 -08:00
Ameer Hamza	3462f3bd50	zvol_os.c: Increase optimal IO size Since zvol read and write can process up to (DMU_MAX_ACCESS / 2) bytes in a single operation, the current optimal I/O size is too low. SCST directly reports this value as the optimal transfer length for the target SCSI device. Increasing it from the previous volblocksize results in performance improvement for large block parallel I/O workloads. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ameer Hamza <ahamza@ixsystems.com> Closes #16750	2024-11-14 14:14:33 -08:00
Mark Johnston	8dc452d907	Fix some nits in zfs_getpages() - If we don't want dmu_read_pages() to perform extra readahead/behind, pass a pointer to 0 instead of a null pointer, as dum_read_pages() expects rahead and rbehind to be non-null. - Avoid unneeded iterations in a loop. Sponsored-by: Klara, Inc. Reported-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #16758	2024-11-14 14:12:57 -08:00
Rob Norris	46c4f2ce0b	dsl_dataset: put IO-inducing frees on the pool deadlist dsl_free() calls zio_free() to free the block. For most blocks, this simply calls metaslab_free() without doing any IO or putting anything on the IO pipeline. Some blocks however require additional IO to free. This at least includes gang, dedup and cloned blocks. For those, zio_free() will issue a ZIO_TYPE_FREE IO and return. If a huge number of blocks are being freed all at once, it's possible for dsl_dataset_block_kill() to be called millions of time on a single transaction (eg a 2T object of 128K blocks is 16M blocks). If those are all IO-inducing frees, that then becomes 16M FREE IOs placed on the pipeline. At time of writing, a zio_t is 1280 bytes, so for just one 2T object that requires a 20G allocation of resident memory from the zio_cache. If that can't be satisfied by the kernel, an out-of-memory condition is raised. This would be better handled by improving the cases that the dmu_tx_assign() throttle will handle, or by reducing the overheads required by the IO pipeline, or with a better central facility for freeing blocks. For now, we simply check for the cases that would cause zio_free() to create a FREE IO, and instead put the block on the pool's freelist. This is the same place that blocks from destroyed datasets go, and the async destroy machinery will automatically see them and trickle them out as normal. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #6783 Closes #16708 Closes #16722 Closes #16697	2024-11-13 07:38:42 -08:00
Alexander Motin	a60ed3822b	L2ARC: Move different stats updates earlier ..., before we make the header or the log block visible to others. It should fix assertion on allocated space going negative if the header is freed once the lock is dropped, while the write is still going. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16040 Closes #16743	2024-11-13 07:31:50 -08:00
Mark Johnston	178682506f	Grab the rangelock unconditionally in zfs_getpages() As a deadlock avoidance measure, zfs_getpages() would only try to acquire a rangelock, falling back to a single-page read if this was not possible. However, this is incompatible with direct I/O. Instead, release the busy lock before trying to acquire the rangelock in blocking mode. This means that it's possible for the page to be replaced, so we have to re-lookup. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #16643	2024-11-13 07:25:39 -08:00
Mark Johnston	25eb538778	Fix a potential page leak in mappedread_sf() mappedread_sf() may allocate pages; if it fails to populate a page can't free it, it needs to ensure that it's placed into a page queue, otherwise it can't be reclaimed until the vnode is destroyed. I think this is quite unlikely to happen in practice, it was noticed by code inspection. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Mark Johnston <markj@FreeBSD.org> Closes #16643	2024-11-13 07:24:14 -08:00
Umer Saleem	1c9a4c8cb4	Fix user properties output for zpool list In zpool_get_user_prop, when called from zpool_expand_proplist and collect_pool, we often have zpool_props present in zpool_handle_t equal to NULL. This mostly happens when only one user property is requested using zpool list -o <user_property>. Checking for this case and correctly initializing the zpool_props field in zpool_handle_t fixes this issue. Interestingly, this issue does not occur if we query any other property like name or guid along with a user property with -o flag because while accessing properties like guid, zpool_prop_get_int is called which checks for this case specifically and calls zpool_get_all_props. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16734	2024-11-11 09:46:45 -08:00
Umer Saleem	3a0a142f1c	JSON: fix user properties output for zpool list This commit fixes JSON output for zpool list when user properties are requested with -o flag. This case needed to be handled specifically since zpool_prop_to_name does not return property name for user properties, instead it is stored in pl->pl_user_prop. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16734	2024-11-11 09:46:19 -08:00
Umer Saleem	57fc5971f6	JSON: fix user properties output for zfs list This commit fixes JSON output for zfs list when user properties are requested with -o flag. This case needed to be handled specifically since zfs_prop_to_name does not return property name for user properties, instead it is stored in pl->pl_user_prop. Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16732	2024-11-07 11:31:14 -08:00
Sam James	4a7a0a0290	Use <fcntl.h> instead of <sys/fcntl.h> When building on musl, we get: ``` In file included from tests/zfs-tests/cmd/getversion.c:22: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] 1 \| #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> In file included from module/os/linux/zfs/vdev_file.c:36: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] 1 \| #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> ``` Bug: https://bugs.gentoo.org/925235 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sam James <sam@gentoo.org> Closes #15925	2024-11-07 11:20:37 -08:00
Brian Atkinson	187f931372	Update ABD stats for linear page Linux `a10e552` updated abd_free_linear_page() to no longer call abd_update_scatter_stat(). This meant that linear pages that were not attached to Direct I/O requests were not doing waste accounting for the ARC. This led to performance issues due to incorrect ARC accounting that resulted in 100% of CPU time being spent in arc_evict() during prolonged I/O workloads with the ARC. The call to abd_update_scatter_stats() is now conditionally called in abd_free_linear_page() when the ABD is not from a Direct I/O request. Reviewed-by: Mark Maybee <mmaybee@delphix.com> Reviewed-by: Tony Nguyen <tony.nguyen@delphix.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #16729	2024-11-07 11:11:05 -08:00
Chunwei Chen	5945676bcc	ZFS send should use spill block prefetched from send_reader_thread Currently, even though send_reader_thread prefetches spill block, do_dump() will not use it and issues its own blocking arc_read. This causes significant performance degradation when sending datasets with lots of spill blocks. For unmodified spill blocks, we also create send_range struct for them in send_reader_thread and issue prefetches for them. We piggyback them on the dnode send_range instead of enqueueing them so we don't break send_range_after check. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Co-authored-by: david.chen <david.chen@nutanix.com> Closes #16701	2024-11-06 11:52:01 -08:00
tstabrawa	7b6e9675da	Use simple folio migration function Avoids using fallback_migrate_folio, which starts unnecessary writeback (leading to BUG in migrate_folio_extra). Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com> Closes #16568 Closes #16723	2024-11-06 11:44:10 -08:00
tstabrawa	f38e2d239f	Revert "Avoid BUG in migrate_folio_extra" This reverts commit `b052035990`. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: tstabrawa <59430211+tstabrawa@users.noreply.github.com> Closes #16568 Closes #16723	2024-11-06 11:43:05 -08:00
наб	60c202cca4	module: unicode: remove unused tolower transformations With the previous patch this yields $ size -G ./module/zfs.ko ./module/zfs.new.ko text data bss total filename 2865126 1597982 755768 5218876 ./module/zfs.ko 2864038 1429784 755768 5049590 ./module/zfs.new.ko -1088 -168198 -1k -164k Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #16704	2024-11-04 17:26:35 -08:00
наб	e17a9698fa	module: unicode: #ifdef out unused U8_UNICODE_320 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #16704	2024-11-04 17:26:17 -08:00
наб	5e726779f5	module: unicode: u8_textprep_data.h tables: 2 => U8_UNICODE_LATEST+1 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #16704	2024-11-04 17:25:37 -08:00
Alexander Motin	5a2333b10f	CI: Automate some GitHub PR status labels manipulations - Set/remove "Work in Progress"/"Code Review Needed" for drafts. - Remove "Accepted", "Inactive", "Revision Needed" and "Stale" on pushes and reopens. I hope this reduce chances of PRs being forgotten after requested modifications done due to stale labels. It is better to have no labels than incorrect ones saying there is nothing to look at. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16721	2024-11-04 17:16:32 -08:00
Uglymotha	c8aed9f973	Verify parent_dev before calling udev_device_get_sysattr_value Not all udev devices have parent devices. Calling udev_device_get_ functions yield an assertion error if called with a NULL pointer. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Sietse <sietse@wizdom.nu> Co-authored-by: Sietse <sietse@wizdom.nu> Closes #16705 Closes #16717	2024-11-04 16:44:38 -08:00
Alexander Motin	b16e096198	Reduce dirty records memory usage Small block workloads may use a very large number of dirty records. During simple block cloning test due to BRT still using 4KB blocks I can easily see up to 2.5M of those used. Before this change dbuf_dirty_record_t structures representing them were allocated via kmem_zalloc(), that rounded their size up to 512 bytes. Introduction of specialized kmem cache allows to reduce the size from 512 to 408 bytes. Additionally, since override and raw params in dirty records are mutually exclusive, puting them into a union allows to reduce structure size down to 368 bytes, increasing the saving to 28%, that can be a 0.5GB or more of RAM. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16694	2024-11-04 16:42:06 -08:00
Rob Norris	91bd12dfeb	zfs(4): remove "experimental" from zfs_bclone_enabled I think we've done enough experiments. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16189 Closes #16712	2024-11-01 14:43:25 -07:00
наб	1c7d4b4c94	module: unicode: remove unused uconv.c Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Closes #16702	2024-11-01 12:12:13 -07:00
Tony Hutter	db2354534d	ZTS: Add Fedora 41, remove Fedora 39 Fedora 41 was released 10/29/24, and Fedora 39 will be EOL on 11/12/24. Update Fedora runners in the test suite. Some minor tweaks also needed to support ksh 1.0.10. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16700	2024-11-01 10:02:03 -07:00
Rob Norris	673efbbf5d	zdb: add extra -T flag to show histograms of BRT refcounts Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16692	2024-11-01 12:08:33 -04:00
Rich Ercolani	acb6e71eda	Added output to `zpool online` and `offline` I was surprised to discover today that `zpool online` and `zpool offline` don't print any information about why they failed in many cases, they just return 1 with no information about why. Let's improve that where we can without changing the library function. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #16244	2024-10-31 22:32:31 -04:00
Rob Norris	3c650bec15	Revert "Workaround issue of Linux vdev_disk.c, (#16678 )" Now that we can handle these different alignments, we don't this workaround. This reverts commit `aefc2da8a5`. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16687	2024-10-31 17:00:53 -07:00
Rob Norris	e7425ae624	vdev_disk: move abd return and free off the interrupt handler Freeing an ABD can take sleeping locks to update various stats. We aren't allowed to sleep on an interrupt handler. So, move the free off to the io_done callback. We should never have been freeing things in the interrupt handler, but we got away with it because we were usually freeing a linear ABD, which at most is returning two objects to a cache and never sleeping. Scatter ABDs can be used now, and those have more complex locking. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16687	2024-10-31 17:00:53 -07:00
Rob Norris	63bafe60ec	vdev_disk: try harder to ensure IO alignment rules It seems out our notion of "properly" aligned IO was incomplete. In particular, dm-crypt does its own splitting, and assumes that a logical block will never cross an order-0 page boundary (ie, the physical page size, not compound size). This effectively means that it needs to be possible to split a BIO at any page or block size boundary and have it work correctly. This updates the alignment check function to enforce these rules (to the extent possible). Our response to misaligned data is to make some new allocation that is properly aligned, and copy the data into it. It turns out that linearising (via abd_borrow_buf()) is not enough, because we allocate eg 4K blocks from a general purpose slab, and so may receive (or already have) a 4K block that crosses pages. So instead, we allocate a new ABD, which is guaranteed to be aligned properly to block sizes, and then copy everything into it, and back out on the way back. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16687 #16631 #15646 #15533 #14533	2024-10-31 17:00:42 -07:00
Serapheim Dimitropoulos	ae93aeb849	Add warning for external consumers of dmu_tx_callback_register While reading some code @grwilson came across the above function that seemingly had no consumers besides a ztest callback that ensures that the tx_callback infrastructure works correctly. It turns out that Lustre is the main (and potentially the only) consumer of this. Refer to `osd_trans_commit_cb` of `lustre/osd-zfs/osd_handler.c` in the Lustre repo for more info. Let's add a comment highlighting this before someone removes it by mistake. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Serapheim Dimitropoulos <serapheimd@gmail.com> Closes #16698	2024-10-30 20:11:40 -04:00
Alexander Motin	6187b19434	On the first vdev open ignore impossible ashift hints If on the first open device's logical ashift is bigger than set by pool's ashift property, ignore the last as unusable instead of creating vdev that will fail most of I/Os due to misalignment. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16690	2024-10-29 15:23:24 -04:00
Dimitry Andric	2bf1520211	Fix gcc uninitialized warning in FreeBSD zio_crypt.c In FreeBSD's `zio_do_crypt_data()`, ensure that two `struct uio` variables are cleared before copying data out of them. This avoids accessing garbage data, and fixes gcc `-Wuninitialized` warnings. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Toomas Soome <tsoome@me.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Dimitry Andric <dimitry@andric.com> Closes #16688	2024-10-29 15:05:02 -04:00

... 3 4 5 6 7 ...

9849 Commits