mirror_zfs

mirror of https://git.proxmox.com/git/mirror_zfs synced 2025-04-28 16:36:26 +00:00

Author	SHA1	Message	Date
Alexander Motin	0fea7fc109	ZTS: Reduce file size in redacted_panic to 1GB This test takes 3 minutes on RELEASE FreeBSD bots, but on CURRENT, probably due to debugging it has in kernel, it does not complete within 10 minutes, ending up killed. As I see all the redacting here happens within the first ~128MB of the file, so I hope it won't matter if there is 1GB of data instead of 2GB. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by:Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #11141	2024-12-29 11:53:45 -08:00
Alexander Motin	0f6d955a35	ZTS: Remove procfs use from zpool_import_status procfs might be not mounted on FreeBSD. Plus checking for specific PID might be not exactly reliable. Check for empty list of jobs instead. Premature loop exit can result in failed test and failed cleanup, failing also some following tests. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by:Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #11141	2024-12-29 11:53:45 -08:00
Alexander Motin	c3d2412b05	ZTS: Remove non-standard awk hex numbers usage FreeBSD recently removed non-standard hex numbers support from awk. Neither it supports -n argument, enabling it in gawk. Instead of depending on those rewrite list_file_blocks() fuction to handle the hex math in shell instead of awk. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by:Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #11141	2024-12-29 11:53:45 -08:00
Alexander Motin	30b97ce218	ZTS: Increase write sizes for RAIDZ/dRAID tests Many RAIDZ/dRAID tests filled files doing millions of 100 or even 10 byte writes. It makes very little sense since we are not micro-benchmarking syscalls or VFS layer here, while before the blocks reach the vdev layer absolute majority of the small writes will be aggregated. In some cases I see we spend almost as much time creating the test files as actually running the tests. And sometimes the tests even time out after that. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16905	2024-12-29 11:53:45 -08:00
Alexander Motin	f54052a122	Fix false assertion in dmu_tx_dirty_buf() on cloning Same as writes block cloning can increase block size and number of indirection levels. That means it can dirty block 0 at level 0 or at new top indirection level without explicitly holding them. A block cloning test case for large offsets has been added. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16825	2024-12-05 11:49:06 -08:00
Mariusz Zaborski	3b0c1131ef	Add ability to scrub from last scrubbed txg Some users might want to scrub only new data because they would like to know if the new write wasn't corrupted. This PR adds possibility scrub only newly written data. This introduces new `last_scrubbed_txg` property, indicating the transaction group (TXG) up to which the most recent scrub operation has checked and repaired the dataset, so users can run scrub only from the last saved point. We use a scn_max_txg and scn_min_txg which are already built into scrub, to accomplish that. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com> Sponsored-By: Wasabi Technology, Inc. Sponsored-By: Klara Inc. Closes #16301	2024-12-05 09:33:21 -08:00
Rob Norris	1aee375947	zdb: show dedup table and log attributes There's interesting info in there that is going to help with understanding dedup behavior at any given moment. Since this is a format change, tests that rely on that output have been modified to match. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #16755	2024-12-02 18:14:26 -08:00
Alexander Motin	9753feaa63	ZTS: Avoid embedded blocks in bclone/bclone_prop_sync If we write less than 113 bytes with enabled compression we get embeded block, which then fails check for number of cloned blocks in bclone_test. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Pawel Jakub Dawidek <pjd@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16740	2024-11-21 08:24:37 -08:00
Brian Behlendorf	7fb7eb9a63	ZTS: Fix zpool_status_008_pos false positive Increase the injected delay to 1000ms and the ZIO_SLOW_IO_MS threshold to 750ms to avoid false positives due to unrelated slow IOs which may occur in the CI environment. Additionally, clear the fault injection as soon as it is no longer required for the test case. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16769	2024-11-21 08:24:37 -08:00
José Luis Salvador Rufo	260065099e	tests: fix uClibc for getversion.c This patch fixes compilation with uClibc by applying the same fallback as commit `e12d76176d` to the `getversion.c` file, which was previously overlooked. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: José Luis Salvador Rufo <salvador.joseluis@gmail.com> Closes #16735 Closes #16741	2024-11-14 16:56:34 -08:00
Sam James	e12d76176d	Use <fcntl.h> instead of <sys/fcntl.h> When building on musl, we get: ``` In file included from tests/zfs-tests/cmd/getversion.c:22: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] 1 \| #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> In file included from module/os/linux/zfs/vdev_file.c:36: /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp] 1 \| #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> ``` Bug: https://bugs.gentoo.org/925235 Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Sam James <sam@gentoo.org> Closes #15925	2024-11-07 11:33:51 -08:00
Tony Hutter	d367ef2995	ZTS: Add Fedora 41, remove Fedora 39 Fedora 41 was released 10/29/24, and Fedora 39 will be EOL on 11/12/24. Update Fedora runners in the test suite. Some minor tweaks also needed to support ksh 1.0.10. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16700	2024-11-01 10:03:05 -07:00
Rob Norris	19a8dd48e1	vdev_disk: try harder to ensure IO alignment rules It seems out our notion of "properly" aligned IO was incomplete. In particular, dm-crypt does its own splitting, and assumes that a logical block will never cross an order-0 page boundary (ie, the physical page size, not compound size). This effectively means that it needs to be possible to split a BIO at any page or block size boundary and have it work correctly. This updates the alignment check function to enforce these rules (to the extent possible). Our response to misaligned data is to make some new allocation that is properly aligned, and copy the data into it. It turns out that linearising (via abd_borrow_buf()) is not enough, because we allocate eg 4K blocks from a general purpose slab, and so may receive (or already have) a 4K block that crosses pages. So instead, we allocate a new ABD, which is guaranteed to be aligned properly to block sizes, and then copy everything into it, and back out on the way back. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16687 #16631 #15646 #15533 #14533	2024-11-01 09:49:14 -07:00
Tony Hutter	f7b4bca66a	ZTS: Add LUKS sanity test Add a LUKS sanity test to trigger: #16631 Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16681	2024-11-01 09:48:47 -07:00
Brian Behlendorf	5237760b17	ZTS: Add additional exceptions The following tests have been observed to occasionally fail when running under the CI. Updated our exceptions list to track them. Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16670	2024-10-21 13:02:07 -07:00
Brian Behlendorf	c645b07eaa	ZTS: Increase zpool_import_parallel_pos import margin Increase the pool import time allowed by assuming a minimum reduction to 1/2 instead of 1/3 when comparing sequential to parallel import times. This is sufficient to verify parallel imports are working as intended and should address the occasional false positive failure when the time is slightly exceeded. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16638	2024-10-11 16:21:25 -07:00
Brian Behlendorf	5bc27acf51	ZTS: Slightly increase dedup_quota limit As described in the comment above this check the space used by logged entries is not accounted for and some margin needs to be added in. While uncommon we have slightly exceeded the 600,000 threshold on some CI run so we increase the limit a bit more. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16637	2024-10-11 16:21:25 -07:00
Rob Norris	666903610d	zpool/zfs: allow --json wherever -j is allowed Mostly so that with the JSON formatting options are also used, they all look the same. To my eye, `-j --json-flat-vdevs` suggests that they are different or unrelated, while `--json --json-flat-vdevs` invites no further questions. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16632	2024-10-11 16:21:25 -07:00
Brian Atkinson	26ecd8b993	Always validate checksums for Direct I/O reads This fixes an oversight in the Direct I/O PR. There is nothing that stops a process from manipulating the contents of a buffer for a Direct I/O read while the I/O is in flight. This can lead checksum verify failures. However, the disk contents are still correct, and this would lead to false reporting of checksum validation failures. To remedy this, all Direct I/O reads that have a checksum verification failure are treated as suspicious. In the event a checksum validation failure occurs for a Direct I/O read, then the I/O request will be reissued though the ARC. This allows for actual validation to happen and removes any possibility of the buffer being manipulated after the I/O has been issued. Just as with Direct I/O write checksum validation failures, Direct I/O read checksum validation failures are reported though zpool status -d in the DIO column. Also the zevent has been updated to have both: 1. dio_verify_wr -> Checksum verification failure for writes 2. dio_verify_rd -> Checksum verification failure for reads. This allows for determining what I/O operation was the culprit for the checksum verification failure. All DIO errors are reported only on the top-level VDEV. Even though FreeBSD can write protect pages (stable pages) it still has the same issue as Linux with Direct I/O reads. This commit updates the following: 1. Propogates checksum failures for reads all the way up to the top-level VDEV. 2. Reports errors through zpool status -d as DIO. 3. Has two zevents for checksum verify errors with Direct I/O. One for read and one for write. 4. Updates FreeBSD ABD code to also check for ABD_FLAG_FROM_PAGES and handle ABD buffer contents validation the same as Linux. 5. Updated manipulate_user_buffer.c to also manipulate a buffer while a Direct I/O read is taking place. 6. Adds a new ZTS test case dio_read_verify that stress tests the new code. 7. Updated man pages. 8. Added an IMPLY statement to zio_checksum_verify() to make sure that Direct I/O reads are not issued as speculative. 9. Removed self healing through mirror, raidz, and dRAID VDEVs for Direct I/O reads. This issue was first observed when installing a Windows 11 VM on a ZFS dataset with the dataset property direct set to always. The zpool devices would report checksum failures, but running a subsequent zpool scrub would not repair any data and report no errors. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #16598	2024-10-09 13:45:06 -07:00
Brian Behlendorf	10f46d2aba	ZTS: resilver_restart_001.ksh restore defaults Update resilver_restart_001.ksh to restore the default resilver_defer_percent when the test completes. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Pavel Snajdr <snajpa@snajpa.net> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16618	2024-10-09 13:44:50 -07:00
JKDingwall	1ebb6b866f	Fix generation of kernel uevents for snapshot rename on linux `zvol_rename_minors()` needs to be given the full path not just the snapshot name. Use code removed in `a0bd735ad` as a guide to providing the necessary values. Add ZTS check for /dev changes after snapshot rename. After renaming a snapshot with 'snapdev=visible' ensure that the /dev entries are updated to reflect the rename. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: James Dingwall <james@dingwall.me.uk> Closes #14223 Closes #16600	2024-10-09 13:44:22 -07:00
Pavel Snajdr	0d77e738e6	Defer resilver only when progress is above a threshold Restart a resilver from scratch, if the current one in progress is below a new tunable, zfs_resilver_defer_percent (defaulting to 10%). The original rationale for deferring additional resilvers, when there is already one in progress, was to help achieving data redundancy sooner for the data that gets scanned at the end of the resilver. But in case the admin wants to attach multiple disks to a single vdev, it wasn't immediately obvious the admin is supposed to run `zpool resilver` afterwards to reset the deferred resilvers and start a new one from scratch. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pavel Snajdr <snajpa@snajpa.net> Closes #15810	2024-10-04 10:41:17 -07:00
Rob Norris	224393a321	feature: large_microzap In `a4b21eadec` we added the zap_micro_max_size tuneable to raise the size at which "micro" (single-block) ZAPs are upgraded to "fat" (multi-block) ZAPs. Before this, a microZAP was limited to 128KiB, which was the old largest block size. The side effect of raising the max size past 128KiB is that it be stored in a large block, requiring the large_blocks feature. Unfortunately, this means that a backup stream created without the --large-block (-L) flag to zfs send would split the microZAP block into smaller blocks and send those, as is normal behaviour for large blocks. This would be received correctly, but since microZAPs are limited to the first block in the object by definition, the entries in the later blocks would be inaccessible. For directory ZAPs, this gives the appearance of files being lost. This commit adds a feature flag, large_microzap, that must be enabled for microZAPs to grow beyond 128KiB, and which will be activated the first time that occurs. This feature is later checked when generating the stream and if active, the send operation will abort unless --large-block has also been requested. Changing the limit still requires zap_micro_max_size to be changed. The state of this flag effectively sets the upper value for this tuneable, that is, if the feature is disabled, the tuneable will be clamped to 128KiB. A stream flag is also added to ensure that the receiver also activates its own feature flag upon receiving the stream. This is not strictly necessary to _use_ the received microZAP, since it doesn't care how large its block is, but it is required to send the microZAP object on, otherwise the original problem occurs again. Because it's difficult to reliably distinguish a microZAP from a fatZAP from outside the ZAP code, and because it seems unlikely that most users are affected (a fairly niche tuneable combined with what should be an uncommon use of send), and for the sake of expediency, this change activates the feature the first time a microZAP grows to use a large block, and is never deactivated after that. This can be improved in the future. This commit changes nothing for existing pools that already have large microZAPs. The feature will not be retroactively applied, but will be activated the next time a microZAP grows past the limit. Don't use large_blocks feature for enable/disable tests. The large_microzap depends on large_blocks, so it gets enabled as a dependency, breaking the test. Instead use feature "longname", which has the exact same feature characteristics. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16593	2024-10-02 20:47:11 -07:00
Brian Behlendorf	d34d4f97a8	snapdir: add 'disabled' value to make .zfs inaccessible In some environments, just making the .zfs control dir hidden from sight might not be enough. In particular, the following scenarios might warrant not allowing access at all: - old snapshots with wrong permissions/ownership - old snapshots with exploitable setuid/setgid binaries - old snapshots with sensitive contents Introducing a new 'disabled' value that not only hides the control dir, but prevents access to its contents by returning ENOENT solves all of the above. The new property value takes advantage of 'iuv' semantics ("ignore unknown value") to automatically fall back to the old default value when a pool is accessed by an older version of ZFS that doesn't yet know about 'disabled' semantics. I think that technically the zfs_dirlook change is enough to prevent access, but preventing lookups and dir entries in an already opened .zfs handle might also be a good idea to prevent races when modifying the property at runtime. Add zfs_snapshot_no_setuid parameter to control whether automatically mounted snapshots have the setuid mount option set or not. this could be considered a partial fix for one of the scenarios mentioned in desired. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Co-authored-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Closes #3963 Closes #16587	2024-10-02 09:12:02 -07:00
Sanjeev Bagewadi	20232ecfaa	Support for longnames for files/directories (Linux part) This patch adds the ability for zfs to support file/dir name up to 1023 bytes. This number is chosen so we can support up to 255 4-byte characters. This new feature is represented by the new feature flag feature@longname. A new dataset property "longname" is also introduced to toggle longname support for each dataset individually. This property can be disabled, even if it contains longname files. In such case, new file cannot be created with longname but existing longname files can still be looked up. Note that, to my knowledge native Linux filesystems don't support name longer than 255 bytes. So there might be programs not able to work with longname. Note that NFS server may needs to use exportfs_get_name to reconnect dentries, and the buffer being passed is limit to NAME_MAX+1 (256). So NFS may not work when longname is enabled. Note, FreeBSD vfs layer imposes a limit of 255 name lengh, so even though we add code to support it here, it won't actually work. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #15921	2024-10-01 13:40:27 -07:00
Don Brady	141368a4b6	Restrict raidz faulted vdev count Specifically, a child in a replacing vdev won't count when assessing the dtl during a vdev_fault() Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #16569	2024-10-01 09:12:11 -07:00
Tino Reichardt	b1958b531b	ZTS: Replace MD5 and SHA256 wit XXH128 For data integrity checks as done in ZTS, the verification for unintended data corruption with xxhash128 should be a lot faster and perfectly usable. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #16577	2024-09-28 09:24:05 -07:00
Brian Behlendorf	834b90fb81	ZTS: Fix zpool_import_hostid_changed_unclean_export Update the test case to freeze the pool then export it to better simulate a hard failure. This is preferable to copying the vdev while the pool's imported since with a copy we're not guaranteed the on-disk state will be consistent. That can in turn result in a pool import failure and a spurious test failure. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16578	2024-09-28 09:18:21 -07:00
Brian Behlendorf	ac8837d4d6	ZTS: Update deadman_sync threshold Lower the minimum number of expected deadman events from 4 to 3. All that is strictly required is a single event to consider the test a pass. However, since I've never seen a count of less than 3 reported by the CI that should be sufficient. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16575	2024-09-27 09:14:13 -07:00
Brian Behlendorf	ab1b87e747	ZTS: Fix zpool_import_hostid_changed_cachefile_unclean_export Update the test case to freeze the pool then export it to better simulate a hard failure. This is preferable to copying the vdev while the pool's imported since with a copy we're not guaranteed the on-disk state will be consistent. That can in turn result in a pool import failure and a spurious test failure. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16570	2024-09-26 09:58:11 -07:00
Brian Behlendorf	b2ca5105b7	ZTS: Fix zpool_reguid Makefile.am and tests The zpool_reguid tests were not being included the dist tarball resulting in them not running. This is reported as a "failed verification" warning by the CI. Add the tests to the correct Makefile.am. Additionally, remove the usage of 'bc -e <expr>' from the tests. This option is only supported by the FreeBSD version of bc. Update the test case to reflect the 0 is not a valid GUID. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16559	2024-09-25 07:50:04 -07:00
George Melikov	e419a63bf4	xattr dataset prop: change defaults to sa It's the main recommendation to set xattr=sa even in man pages, so let's set it by default. xattr=sa don't use feature flag, so in the worst case we'll have non-readable xattrs by other non-openzfs platforms. Non-overridden default `xattr` prop of existing pools will automatically use `sa` after this commit too. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: George Melikov <mail@gmelikov.ru> Closes #15147	2024-09-23 09:50:48 -07:00
Brian Behlendorf	d565835c47	ZTS: Add additional exceptions The following tests have been observed to occasionally fail when running under the CI. Updated our exceptions list to track them. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16553	2024-09-22 13:04:17 -07:00
Brian Behlendorf	5520fdde29	ZTS: Retire "tmpfile_reason" exception All supported Linux kernels, 4.18 and newer, provide O_TMPFILE. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16553	2024-09-22 13:04:06 -07:00
Brian Behlendorf	9122772b2a	ZTS: Retire "ci_reason" exceptions There is no longer be a need for the ci_reason exception with the update CI GitHub Actions infrastruture. Retire it. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #16553	2024-09-22 13:03:47 -07:00
Umer Saleem	aeb79f2b29	ZTS: Fix skipping over comment lines in zpool_create.shlib In zpool_create.shlib, check_feature_set iterates over all features mentioned in provided compatibility file to check if only those features are enabled on the pool. This commit fixes skipping over comment lines correctly. Otherwise, the test case fails as comment lines are also treated as feature names by check_feature_set function. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #15909	2024-09-21 10:29:20 -07:00
Tino Reichardt	4bf6a2ab87	ZTS: use openssl for md5digest and sha256digest On larger files this should improve the speed. Sample values of my system: [mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k \| sha256sum 254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917 - real 0m1,050s user 0m0,985s sys 0m0,153s [mcmilk@xz]$ time dd if=/dev/zero bs=128k count=1k \| openssl sha256 -r 254bcc3fc4f27172636df4bf32de9f107f620d559b20d760197e452b97453917 *stdin real 0m0,254s user 0m0,206s sys 0m0,160s I think cli_root/zdb/zdb_backup.ksh runs also an FreeBSD and I needed to include the sysutils/coreutils package for the FreeBSD tests within the QEMU patchset. This could be reverted, when this pull request gets upstream Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #16543	2024-09-19 15:53:57 -07:00
Don Brady	ddf5f34f06	Avoid fault diagnosis if multiple vdevs have errors When multiple drives are throwing errors, it is likely not a drive failing but rather a failure above the drives, like a controller. The active cases context of the drive's peers is now considered when making a diagnosis. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #16531	2024-09-18 11:36:48 -07:00
Tino Reichardt	bca9b64e7b	ZTS: Use QEMU for tests on Linux and FreeBSD This commit adds functional tests for these systems: - AlmaLinux 8, AlmaLinux 9, ArchLinux - CentOS Stream 9, Fedora 39, Fedora 40 - Debian 11, Debian 12 - FreeBSD 13, FreeBSD 14, FreeBSD 15 - Ubuntu 20.04, Ubuntu 22.04, Ubuntu 24.04 - enabled by default: - AlmaLinux 8, AlmaLinux 9 - Debian 11, Debian 12 - Fedora 39, Fedora 40 - FreeBSD 13, FreeBSD 14 Workflow for each operating system: - install qemu on the github runner - download current cloud image of operating system - start and init that image via cloud-init - install dependencies and poweroff system - start system and build openzfs and then poweroff again - clone build system and start 2 instances of it - run functional testings and complete in around 3h - when tests are done, do some logfile preparing - show detailed results for each system - in the end, generate the job summary Real-world benefits from this PR: 1. The github runner scripts are in the zfs repo itself. That means you can just open a PR against zfs, like "Add Fedora 41 tester", and see the results directly in the PR. ZFS admins no longer need manually to login to the buildbot server to update the buildbot config with new version of Fedora/Almalinux. 2. Github runners allow you to run the entire test suite against your private branch before submitting a formal PR to openzfs. Just open a PR against your private zfs repo, and the exact same Fedora/Alma/FreeBSD runners will fire up and run ZTS. This can be useful if you want to iterate on a ZTS change before submitting a formal PR. 3. buildbot is incredibly cumbersome. Our buildbot config files alone are ~1500 lines (not including any build/setup scripts)! It's a huge pain to setup. 4. We're running the super ancient buildbot 0.8.12. It's so ancient it requires python2. We actually have to build python2 from source for almalinux9 just to get it to run. Ugrading to a more modern buildbot is a huge undertaking, and the UI on the newer versions is worse. 5. Buildbot uses EC2 instances. EC2 is a pain because: * It costs money * They throttle IOPS and CPU usage, leading to mysterious, * hard-to-diagnose, failures and timeouts in ZTS. * EC2 is high maintenance. We have to setup security groups, SSH * keys, networking, users, etc, in AWS and it's a pain. We also * have to periodically go in an kill zombie EC2 instances that * buildbot is unable to kill off. 6. Buildbot doesn't always handle failures well. One of the things we saw in the past was the FreeBSD builders would often die, and each builder death would take up a "slot" in buildbot. So we would periodically have to restart buildbot via a cron job to get the slots back. 7. This PR divides up the ZTS test list into two parts, launches two VMs, and on each VM runs half the test suite. The test results are then merged and shown in the sumary page. So we're basically parallelizing ZTS on the same github runner. This leads to lower overall ZTS runtimes (2.5-3 hours vs 4+ hours on buildbot), and one unified set of results per runner, which is nice. 8. Since the tests are running on a VM, we have much more control over what happens. We can capture the serial console output even if the test completely brings down the VM. In the future, we could also restart the test on the VM where it left off, so that if a single test panics the VM, we can just restart it and run the remaining ZTS tests (this functionaly is not yet implemented though, just an idea). 9. Using the runners, users can manually kill or restart a test run via the github IU. That really isn't possible with buildbot unless you're an admin. 10. Anecdotally, the tests seem to be more stable and constant under the QEMU runners. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16537	2024-09-17 12:03:27 -07:00
Tino Reichardt	c4d1a19b33	ZTS: increase timeout of mmap_sync_001_pos On load the test needs sometimes a bit more time then just one second. Doubling the time will help on the QEMU based testings. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #16537	2024-09-17 12:03:08 -07:00
Tino Reichardt	5cb3e2861e	ZTS: fix raidz_expand_001_pos and raidz_expand_002_pos Sometimes the pool may start an auto scrub. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #16537	2024-09-17 12:02:58 -07:00
Tino Reichardt	4999f49513	ZTS: fix zpool_status_008_pos test on qemu vm's The test needs some adjusting within the timings. Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Co-authored-by: Tino Reichardt <milky-zfs@mcmilk.de> Closes #16537	2024-09-17 12:01:13 -07:00
Brian Atkinson	a10e552b99	Adding Direct IO Support Adding O_DIRECT support to ZFS to bypass the ARC for writes/reads. O_DIRECT support in ZFS will always ensure there is coherency between buffered and O_DIRECT IO requests. This ensures that all IO requests, whether buffered or direct, will see the same file contents at all times. Just as in other FS's , O_DIRECT does not imply O_SYNC. While data is written directly to VDEV disks, metadata will not be synced until the associated TXG is synced. For both O_DIRECT read and write request the offset and request sizes, at a minimum, must be PAGE_SIZE aligned. In the event they are not, then EINVAL is returned unless the direct property is set to always (see below). For O_DIRECT writes: The request also must be block aligned (recordsize) or the write request will take the normal (buffered) write path. In the event that request is block aligned and a cached copy of the buffer in the ARC, then it will be discarded from the ARC forcing all further reads to retrieve the data from disk. For O_DIRECT reads: The only alignment restrictions are PAGE_SIZE alignment. In the event that the requested data is in buffered (in the ARC) it will just be copied from the ARC into the user buffer. For both O_DIRECT writes and reads the O_DIRECT flag will be ignored in the event that file contents are mmap'ed. In this case, all requests that are at least PAGE_SIZE aligned will just fall back to the buffered paths. If the request however is not PAGE_SIZE aligned, EINVAL will be returned as always regardless if the file's contents are mmap'ed. Since O_DIRECT writes go through the normal ZIO pipeline, the following operations are supported just as with normal buffered writes: Checksum Compression Encryption Erasure Coding There is one caveat for the data integrity of O_DIRECT writes that is distinct for each of the OS's supported by ZFS. FreeBSD - FreeBSD is able to place user pages under write protection so any data in the user buffers and written directly down to the VDEV disks is guaranteed to not change. There is no concern with data integrity and O_DIRECT writes. Linux - Linux is not able to place anonymous user pages under write protection. Because of this, if the user decides to manipulate the page contents while the write operation is occurring, data integrity can not be guaranteed. However, there is a module parameter `zfs_vdev_direct_write_verify` that controls the if a O_DIRECT writes that can occur to a top-level VDEV before a checksum verify is run before the contents of the I/O buffer are committed to disk. In the event of a checksum verification failure the write will return EIO. The number of O_DIRECT write checksum verification errors can be observed by doing `zpool status -d`, which will list all verification errors that have occurred on a top-level VDEV. Along with `zpool status`, a ZED event will be issues as `dio_verify` when a checksum verification error occurs. ZVOLs and dedup is not currently supported with Direct I/O. A new dataset property `direct` has been added with the following 3 allowable values: disabled - Accepts O_DIRECT flag, but silently ignores it and treats the request as a buffered IO request. standard - Follows the alignment restrictions outlined above for write/read IO requests when the O_DIRECT flag is used. always - Treats every write/read IO request as though it passed O_DIRECT and will do O_DIRECT if the alignment restrictions are met otherwise will redirect through the ARC. This property will not allow a request to fail. There is also a module parameter zfs_dio_enabled that can be used to force all reads and writes through the ARC. By setting this module parameter to 0, it mimics as if the direct dataset property is set to disabled. Reviewed-by: Brian Behlendorf <behlendorf@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Co-authored-by: Mark Maybee <mark.maybee@delphix.com> Co-authored-by: Matt Macy <mmacy@FreeBSD.org> Co-authored-by: Brian Behlendorf <behlendorf@llnl.gov> Closes #10018	2024-09-14 13:47:59 -07:00
Rob Norris	63253dbf4f	zts-report: don't crash on non-UTF-8 chars in the log (#16497 ) The report generator expects the log to be clean and tidy UTF-8. That can be a problem if you use some of the verbose/debug test runner options, which sends all sorts of weird output from arbitrary programs to the log. This just makes Python a little more relaxed about such things. It shouldn't matter in practice, as those lines didn't match the test result regex anyway, and are discarded immediately. Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-09-09 17:49:14 -07:00
Rob Norris	b3b7491615	build: rename FORCEDEBUG_CPPFLAGS to LIBZPOOL_CPPFLAGS This is just a very small attempt to make it more obvious that these flags aren't optional for libzpool-using programs, by not making it seem like there's an option to say "well, I don't _want_ to force debugging". Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Rich Ercolani <rincebrain@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Issue #16476 Closes #16477	2024-08-27 12:53:27 -07:00
Mateusz Piotrowski	6be8bf5552	zpool: Provide GUID to zpool-reguid(8) with -g (#16239 ) This commit extends the zpool-reguid(8) command with a -g flag, which allows the user to specify the GUID to set. This change also adds some general tests for zpool-reguid(8). Sponsored-by: Wasabi Technology, Inc. Sponsored-by: Klara, Inc. Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-08-26 09:27:24 -07:00
Rob Norris	a1902f4950	ddt: block scan until log is flushed, and flush aggressively The dedup log does not have a stable cursor, so its not possible to persist our current scan location within it across pool reloads. Beccause of this, when walking (scanning), we can't treat it like just another source of dedup entries. Instead, when a scan is wanted, we switch to an aggressive flushing mode, pushing out entries older than the scan start txg as fast as we can, before starting the scan proper. Entries after the scan start txg will be handled via other methods; the DDT ZAPs and logs will be written as normal, and blocks not seen yet will be offered to the scan machinery as normal. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Closes #15895	2024-08-16 12:03:43 -07:00
Rob Norris	cd69ba3d49	ddt: dedup log Adds a log/journal to dedup. At the end of txg, instead of writing the entry directly to the ZAP, instead its adding to an in-memory tree and appended to an on-disk object. The on-disk object is only read at import, to reload the in-memory tree. Lookups first go the the log tree before going to the ZAP, so recently-used entries will remain close by in memory. This vastly reduces overhead from dedup IO, as it will not have to do so many read/update/write cycles on ZAP leaf nodes. A flushing facility is added at end of txg, to push logged entries out to the ZAP. There's actually two separate "logs" (in-memory tree and on-disk object), one active (recieving updated entries) and one flushing (writing out to disk). These are swapped (ie flushing begins) based on memory used by the in-memory log trees and time since we last flushed something. The flushing facility monitors the amount of entries coming in and being flushed out, and calibrates itself to try to flush enough each txg to keep up with the ingest rate without competing too much with other IO. Multiple tuneables are provided to control the flushing facility. All the histograms and stats are update to accomodate the log as a separate entry store. zdb gains knowledge of how to count them and dump them. Documentation included! Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Closes #15895	2024-08-16 12:03:35 -07:00
Rob Norris	2b131d7345	ZTS: tests for dedup legacy/FDT tables Very basic coverage to make sure things appear to work, have the right format on disk, and pool upgrades and mixed table types work as expected. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Closes #15892	2024-08-16 12:00:58 -07:00
Rob Norris	db2b1fdb79	ddt: add FDT feature and support for legacy and new on-disk formats This is the supporting infrastructure for the upcoming dedup features. Traditionally, dedup objects live directly in the MOS root. While their details vary (checksum, type and class), they are all the same "kind" of thing - a store of dedup entries. The new features are more varied than that, and are better thought of as a set of related stores for the overall state of a dedup table. This adds a new feature flag, SPA_FEATURE_FAST_DEDUP. Enabling this will cause new DDTs to be created as a ZAP in the MOS root, named DDT-<checksum>. The is used as the root object for the normal type/class store objects, but will also be a place for any storage required by new features. This commit adds two new fields to ddt_t, for version and flags. These are intended to describe the structure and features of the overall dedup table, and are stored as-is in the DDT root. In this commit, flags are always zero, but the intent is that they can be used to hang optional logic or state onto for new dedup features. Version is always 1. For a "legacy" dedup table, where no DDT root directory exists, the version will be 0. ddt_configure() is expected to determine the version and flags features currently in operation based on whether or not the fast_dedup feature is enabled, and from what's available on disk. In this way, its possible to support both old and new tables. This also provides a migration path. A legacy setup can be upgraded to FDT by creating the DDT root ZAP, moving the existing objects into it, and setting version and flags appropriately. There's no support for that here, but it would be straightforward to add later and allows the possibility that newer features could be applied to existing dedup tables. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Co-authored-by: Allan Jude <allan@klarasystems.com> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Closes #15892	2024-08-16 11:58:59 -07:00

1 2 3 4 5 ...

1411 Commits