Proxmox-Port/zfs - zfs - Gitea: Git with a cup of tea

mirror of https://github.com/openzfs/zfs.git synced 2025-10-01 11:26:49 +00:00

Author	SHA1	Message	Date
nav1s	0939787e83	manuals: fix typos in zpool-upgrade man page Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: nav1s <nav1s@proton.me> Closes #17797	2025-09-29 16:50:57 -07:00
trick2011	9bcda0b5fe	Use "vdev" instead of "devices" when referring to vdevs Update documentation to use the correct terminology. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: trick2011 <trick2011@users.noreply.github.com> Closes #17734 Closes #17755	2025-09-25 12:07:52 -07:00
Allan Jude	6c4ede4026	ZFS allow send:encrypted A new `zfs allow` permissions that ONLY allows sending replication streams in raw (encrypted) mode, so encrypted data will not be decrypted as part of the replication process. Sponsored-by: Klara, Inc. Sponsored-by: Karakun AG Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Co-authored-by: JT Pennington <jt.pennington@klarasystems.com> Signed-off-by: Allan Jude <allan@klarasystems.com> Closes #17543	2025-09-12 15:05:02 -07:00
Tony Hutter	4a7a04630d	zed: Add synchronous zedlets Historically, ZED has blindly spawned off zedlets in parallel and never worried about their completion order. This means that you can potentially have zedlets for event number 2 starting before zedlets for event number 1 had finished. Most of the time this is fine, and it actually helps a lot when the system is getting spammed with hundreds of events. However, there are times when you want your zedlets to be executed in sequence with the event ID. That is where synchronous zedlets come in. ZED will wait for all previously spawned zedlets to finish before running a synchronous zedlet. Synchronous zedlets are guaranteed to be the only zedlet running. No other zedlets may run in parallel with a synchronous zedlet. Users should be careful to only use synchronous zedlets when needed, since they decrease parallelism. To make a zedlet synchronous, simply add a "-sync-" immediately following the event name in the zedlet's file name: EVENT_NAME-sync-ZEDLETNAME.sh For example, if you wanted a synchronous statechange script: statechange-sync-myzedlet.sh Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tony Hutter <hutter2@llnl.gov> Closes #17335	2025-09-11 15:58:59 -07:00
Paul Dagnelie	df55ba7c49	Detect a slow raidz child during reads A single slow responding disk can affect the overall read performance of a raidz group. When a raidz child disk is determined to be a persistent slow outlier, then have it sit out during reads for a period of time. The raidz group can use parity to reconstruct the data that was skipped. Each time a slow disk is placed into a sit out period, its `vdev_stat.vs_slow_ios count` is incremented and a zevent class `ereport.fs.zfs.delay` is posted. The length of the sit out period can be changed using the `raid_read_sit_out_secs` module parameter. Setting it to zero disables slow outlier detection. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Contributions-by: Don Brady <don.brady@klarasystems.com> Contributions-by: Brian Behlendorf <behlendorf1@llnl.gov> Closes #17227	2025-09-10 15:31:30 -07:00
Paul Dagnelie	26983d6fa7	Add allocation profile export and zhack subcommand for import When attempting to debug performance problems on large systems, one of the major factors that affect performance is free space fragmentation. This heavily affects the allocation process, which is an area of active development in ZFS. Unfortunately, fragmenting a large pool for testing purposes is time consuming; it usually involves filling the pool and then repeatedly overwriting data until the free space becomes fragmented, which can take many hours. And even if the time is available, artificial workloads rarely generate the same fragmentation patterns as the natural workloads they're attempting to mimic. This patch has two parts. First, in zdb, we add the ability to export the full allocation map of the pool. It iterates over each vdev, printing every allocated segment in the ms_allocatable range tree. This can be done while the pool is online, though in that case the allocation map may actually be from several different TXGs as new ones are loaded on demand. The second is a new subcommand for zhack, zhack metaslab leak (and its supporting kernel changes). This is a zhack subcommand that imports a pool and then modified the range trees of the metaslabs, allowing the sync process to write them out normall. It does not currently store those allocations anywhere to make them reversible, and there is no corresponding free subcommand (which would be extremely dangerous); this is an irreversible process, only intended for performance testing. The only way to reclaim the space afterwards is to destroy the pool or roll back to a checkpoint. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #17576	2025-09-10 15:01:28 -07:00
Alexander Ziaee	92d4b135b6	manuals: Audit/bump dates for last content change Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Ziaee <ziaee@FreeBSD.org> Closes #17676	2025-09-09 17:04:19 -07:00
Shawn Bayern	8604e67dc9	Add description of default sorting behavior to zfs_list.8 The sorting logic is all in cmd/zfs/zfs_iter.c. I borrowed where I could from the comments in the source code, but please note that the comment to zfs_sort() is a little imprecise, or at least incomplete, because it doesn't give any indication of the chronological sort that will be used by default for snapshots in zfs_compare(). While adding this description, I took the liberty to copy-edit the rest of the file lightly. In those edits, I've removed "If specified, you can list property information by the absolute pathname or the relative pathname" because, in context, it seems more confusing than helpful. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Shawn Bayern <sbayern@law.fsu.edu> Closes #15713 Closes #15869	2025-09-09 17:03:55 -07:00
r-ricci	30a915efed	zfs-send.8: mention combination of -c/-e flags and zstd_compress feature Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: George Melikov <mail@gmelikov.ru> Signed-off-by: Roberto Ricci <io@r-ricci.it> Closes #17647	2025-08-19 10:56:58 -04:00
Alan Somers	d3c1d27afd	zdb: better handling for corrupt block pointers When dumping indirect blocks, attempt to print corrupt block pointers rather than abort the program. When corruption is detected zdb will exit with an error code of 3. Sponsored by: ConnectWise Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Alek Pinchuk <alek.pinchuk@connectwise.com> Signed-off-by: Alan Somers <asomers@gmail.com> Closes #17166	2025-08-12 14:16:37 -07:00
Alexander Motin	60f714e6e2	Implement physical rewrites Based on previous commit this implements `zfs rewrite -P` flag, making ZFS to keep blocks logical birth times while rewriting files. It should exclude the rewritten blocks from incremental sends, snapshot diffs, etc. Snapshots space usage same time will reflect the additional space usage from newly allocated blocks. Since this begins to use new "rewrite" flag in the block pointers, this commit introduces a new read-compatible per-dataset feature physical_rewrite. It must be enabled for the command to not fail, it is activated on first use and deactivated on deletion of the last affected dataset. Reviewed-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <alexander.motin@TrueNAS.com> Closes #17565	2025-08-06 10:36:56 -07:00
Mariusz Zaborski	894edd084e	Add TXG timestamp database This feature enables tracking of when TXGs are committed to disk, providing an estimated timestamp for each TXG. With this information, it becomes possible to perform scrubs based on specific date ranges, improving the granularity of data management and recovery operations. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <alexander.motin@TrueNAS.com> Reviewed-by: Paul Dagnelie <paul.dagnelie@klarasystems.com> Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #16853	2025-08-06 10:31:21 -07:00
Akash B	b6e8db509d	zpool/zfs: Add '-a\|--all' option to scrub, trim, initialize Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details Add support for the '-a \| --all' option to perform trim, scrub, and initialize operations on all pools. Previously, specifying a pool name was mandatory for these operations. With this enhancement, users can now execute these operations across all pools at once, without needing to manually iterate over each pool from the command line. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Akash B <akash-b@hpe.com> Closes #17524	2025-07-29 14:50:44 -07:00
Rob Norris	fce18e04d5	libzpool: tunable-based option interface for zdb/ztest Removes the old dlsym() based option setter and adds a new function handle_tunable_option() that can set, get and list all the tunables in the system. And then wire it up to zdb and ztest. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #17537	2025-07-15 15:47:03 -07:00
Rob Norris	6af8db61b1	metaslab: don't pass whole zio to throttle reserve APIs Some checks failed checkstyle / checkstyle (push) Has been cancelled Details CodeQL / Analyze (cpp) (push) Has been cancelled Details CodeQL / Analyze (python) (push) Has been cancelled Details zloop / zloop (push) Has been cancelled Details They only need a couple of fields, and passing the whole thing just invites fiddling around inside it, like modifying flags, which then makes it much harder to understand the zio state from inside zio.c. We move the flag update to just after a successful throttle in zio.c. Rename ZIO_FLAG_IO_ALLOCATING to ZIO_FLAG_ALLOC_THROTTLED Better describes what it means, and makes it look less like IO_IS_ALLOCATING, which means something different. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17508	2025-07-04 23:22:22 -04:00
Rob Norris	3ff2eca0be	zfs-program(8): document zfs.sync.clone() Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #17426	2025-06-10 14:53:18 -07:00
Rob Norris	44e3266894	events: include zio type in IO error reports Usually the IO type can be inferred from the other fields (in particular, priority and flags) sometimes it's not easy to see. This is just another little debug helper. May 27 2025 00:54:54.024110493 ereport.fs.zfs.data class = "ereport.fs.zfs.data" ena = 0x1f5ecfae600801 ... zio_delta = 0x0 zio_type = 0x2 [WRITE] zio_priority = 0x3 [ASYNC_WRITE] zio_objset = 0x0 Document zio_type and zio_priority. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #17381	2025-05-30 10:29:29 -04:00
Cameron Harr	92157c840c	Refactor man page and CLI help output per mandoc Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details The man page and the usage statement from the CLI have been refactored to abide by the ManDoc standard. Style changes include: * Upper-case letters before lower-case * List short options w/o arguments first * Then list short options w/ arguments * Then list long arguments Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Cameron Harr <harr1@llnl.gov> Closes #17357	2025-05-23 09:10:30 -07:00
Cameron Harr	cdb4c44684	Reformat cli help and man page to be in sync The man page and CLI usage statements were both a little out of sync and neither fully alphabetized correctly. That has been fixed. One outstanding question is whether to get rid of the ellipses on the CLI usage. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Cameron Harr <harr1@llnl.gov> Closes #16004 Closes #17357	2025-05-23 09:10:21 -07:00
Alexander Motin	49fbdd4533	Introduce zfs rewrite subcommand (#17246 ) This allows to rewrite content of specified file(s) as-is without modifications, but at a different location, compression, checksum, dedup, copies and other parameter values. It is faster than read plus write, since it does not require data copying to user-space. It is also faster for sync=always datasets, since without data modification it does not require ZIL writing. Also since it is protected by normal range range locks, it can be done under any other load. Also it does not affect file's modification time or other properties. Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com>	2025-05-12 10:22:17 -07:00
Quentin Thébault	63de2d2dbd	zfs-rollback.8: fix typo in example number Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details Reviewed-by: George Melikov <mail@gmelikov.ru> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Alexander Ziaee <ziaee@FreeBSD.org> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Quentin Thébault <quentin.thebault@defenso.fr> Closes #17282	2025-04-28 15:38:08 -04:00
Simon Howard	ef81812726	Fix spelling errors Unlike some of my other fixes which are more subtle, these are unambigously spelling errors. Signed-off-by: Simon Howard <fraggle@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org>	2025-03-24 14:37:40 -07:00
Simon Howard	e759a86fa5	Correct "umount" to "unmount" in a couple of places This is admittedly a nitpicky change, but `umount` is the command that performs an unmount. So if we are talking about unmounting something we should phrase it that way. Signed-off-by: Simon Howard <fraggle@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org>	2025-03-24 14:37:36 -07:00
Simon Howard	1d4505d7a1	Capitalize in various places where appropriate These are mostly acronyms (CPUs; ZILs) but also proper nouns such as "Unix" and "Unicode" which should also be capitalized. Signed-off-by: Simon Howard <fraggle@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org>	2025-03-24 14:37:34 -07:00
Simon Howard	b386bf87c1	Fix cases where "descendent" is used as a noun As per Wiktionary: "descendent" may be used as an adjective (e.g. "a descendent dataset") but for nouns (e.g. "descendants of this dataset"), "descendant" is the correct spelling. Signed-off-by: Simon Howard <fraggle@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org>	2025-03-24 14:37:31 -07:00
Simon Howard	530ddcd5f1	Harmonize on American spelling in several places Most of the documentation is written in American English, so it makes sense to be consistent. Signed-off-by: Simon Howard <fraggle@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org>	2025-03-24 14:36:34 -07:00
Rob Norris	7d8dd8d9a5	SPDX: license tags: MIT Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>	2025-03-13 17:56:54 -07:00
Rob Norris	eb9098ed47	SPDX: license tags: CDDL-1.0 Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>	2025-03-13 17:56:27 -07:00
shodanshok	201d262949	Add receive:append permission for limited receive Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details Force receive (zfs receive -F) can rollback or destroy snapshots and file systems that do not exist on the sending side (see zfs-receive man page). This means an user having the receive permission can effectively delete data on receiving side, even if such user does not have explicit rollback or destroy permissions. This patch adds the receive:append permission, which only permits limited, non-forced receive. Behavior for users with full receive permission is not changed in any way. Fixes #16943 Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Gionatan Danti <g.danti@assyoma.it> Closes #17015	2025-03-13 13:54:14 -04:00
mnrx	0be1da26cb	Clarify documentation of `zfs destroy` on snapshots (#17021 ) Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details The current documentation of `zfs destroy` in application to snapshots is particularly difficult to understand. The following changes are made: - Remove circular reference to `zfs destroy` in the documentation of that command. - Remove use of "for example", which implies there are more, undocumented reasons that ZFS may fail to destroy a snapshot immediately. - Mention properties `defer_destroy` and `userrefs`. - Add `zfsprops(8)` to "SEE ALSO" list. - Clarify meaning of `-d` option. Requires-builders: none Signed-off-by: mnrx <83848843+mnrx@users.noreply.github.com> Co-authored-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2025-02-05 14:47:03 -08:00
Rob Norris	26e38aec46	zinject: add "probe" device injection type Injecting a device probe failure is not possible by matching IO types, because probe IO goes to the label regions, which is explicitly excluded from injection. Even if it were possible, it would be awkward to do, because a probe is sequence of reads and writes. This commit adds a new IO "type" to match for injection, which looks for the ZIO_FLAG_PROBE flag instead. Any probe IO will be match the injection record and recieve the wanted error. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16947	2025-01-22 16:13:21 -08:00
Tim Smith	b8c0c154ad	Fix several typos in the man pages Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details Reviewed-by: George Amanakis <gamanakis@gmail.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Tim Smith <tsmith84@gmail.com> Closes #16965	2025-01-21 10:30:17 -05:00
Alexander Ziaee	919bc4d10e	zfs-destroy.8: Fix minor formatting typo Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zloop / zloop (push) Waiting to run Details The warning at the end of the second example in the description section was actually inside the options table. Move the El macro to match what is done in the first section for improved readability. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Ziaee <ziaee@FreeBSD.org> Closes #16962	2025-01-20 14:37:52 -05:00
Mariusz Zaborski	4b4e346b9f	Add ability to scrub from last scrubbed txg Some users might want to scrub only new data because they would like to know if the new write wasn't corrupted. This PR adds possibility scrub only newly written data. This introduces new `last_scrubbed_txg` property, indicating the transaction group (TXG) up to which the most recent scrub operation has checked and repaired the dataset, so users can run scrub only from the last saved point. We use a scn_max_txg and scn_min_txg which are already built into scrub, to accomplish that. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Mariusz Zaborski <mariusz.zaborski@klarasystems.com> Sponsored-By: Wasabi Technology, Inc. Sponsored-By: Klara Inc. Closes #16301	2024-12-04 14:21:45 -05:00
Rob Norris	027b3e06ed	zinject(8): rename "ioctl" to "flush" Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zfs-qemu / Setup (push) Waiting to run Details zfs-qemu / qemu-x86 (push) Blocked by required conditions Details zfs-qemu / Cleanup (push) Blocked by required conditions Details zloop / zloop (push) Waiting to run Details Doc bug missed in `d7605ae77`. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16827	2024-12-02 17:21:55 -08:00
Steve Mokris	e08e832b10	Expand zpool-remove.8 manpage with example results Also fix comment cross-referencing to zpool.8. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Steve Mokris <smokris@softpixel.com> Closes #16777	2024-11-19 06:52:04 -08:00
Rob Norris	673efbbf5d	zdb: add extra -T flag to show histograms of BRT refcounts Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16692	2024-11-01 12:08:33 -04:00
Rob Norris	7bf525530a	zpool/zfs: allow --json wherever -j is allowed Mostly so that with the JSON formatting options are also used, they all look the same. To my eye, `-j --json-flat-vdevs` suggests that they are different or unrelated, while `--json --json-flat-vdevs` invites no further questions. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Umer Saleem <usaleem@ixsystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16632	2024-10-11 09:37:57 -07:00
Brian Atkinson	b4e4cbeb20	Always validate checksums for Direct I/O reads This fixes an oversight in the Direct I/O PR. There is nothing that stops a process from manipulating the contents of a buffer for a Direct I/O read while the I/O is in flight. This can lead checksum verify failures. However, the disk contents are still correct, and this would lead to false reporting of checksum validation failures. To remedy this, all Direct I/O reads that have a checksum verification failure are treated as suspicious. In the event a checksum validation failure occurs for a Direct I/O read, then the I/O request will be reissued though the ARC. This allows for actual validation to happen and removes any possibility of the buffer being manipulated after the I/O has been issued. Just as with Direct I/O write checksum validation failures, Direct I/O read checksum validation failures are reported though zpool status -d in the DIO column. Also the zevent has been updated to have both: 1. dio_verify_wr -> Checksum verification failure for writes 2. dio_verify_rd -> Checksum verification failure for reads. This allows for determining what I/O operation was the culprit for the checksum verification failure. All DIO errors are reported only on the top-level VDEV. Even though FreeBSD can write protect pages (stable pages) it still has the same issue as Linux with Direct I/O reads. This commit updates the following: 1. Propogates checksum failures for reads all the way up to the top-level VDEV. 2. Reports errors through zpool status -d as DIO. 3. Has two zevents for checksum verify errors with Direct I/O. One for read and one for write. 4. Updates FreeBSD ABD code to also check for ABD_FLAG_FROM_PAGES and handle ABD buffer contents validation the same as Linux. 5. Updated manipulate_user_buffer.c to also manipulate a buffer while a Direct I/O read is taking place. 6. Adds a new ZTS test case dio_read_verify that stress tests the new code. 7. Updated man pages. 8. Added an IMPLY statement to zio_checksum_verify() to make sure that Direct I/O reads are not issued as speculative. 9. Removed self healing through mirror, raidz, and dRAID VDEVs for Direct I/O reads. This issue was first observed when installing a Windows 11 VM on a ZFS dataset with the dataset property direct set to always. The zpool devices would report checksum failures, but running a subsequent zpool scrub would not repair any data and report no errors. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Closes #16598	2024-10-09 12:28:08 -07:00
Rob Norris	224393a321	feature: large_microzap In `a4b21eadec` we added the zap_micro_max_size tuneable to raise the size at which "micro" (single-block) ZAPs are upgraded to "fat" (multi-block) ZAPs. Before this, a microZAP was limited to 128KiB, which was the old largest block size. The side effect of raising the max size past 128KiB is that it be stored in a large block, requiring the large_blocks feature. Unfortunately, this means that a backup stream created without the --large-block (-L) flag to zfs send would split the microZAP block into smaller blocks and send those, as is normal behaviour for large blocks. This would be received correctly, but since microZAPs are limited to the first block in the object by definition, the entries in the later blocks would be inaccessible. For directory ZAPs, this gives the appearance of files being lost. This commit adds a feature flag, large_microzap, that must be enabled for microZAPs to grow beyond 128KiB, and which will be activated the first time that occurs. This feature is later checked when generating the stream and if active, the send operation will abort unless --large-block has also been requested. Changing the limit still requires zap_micro_max_size to be changed. The state of this flag effectively sets the upper value for this tuneable, that is, if the feature is disabled, the tuneable will be clamped to 128KiB. A stream flag is also added to ensure that the receiver also activates its own feature flag upon receiving the stream. This is not strictly necessary to _use_ the received microZAP, since it doesn't care how large its block is, but it is required to send the microZAP object on, otherwise the original problem occurs again. Because it's difficult to reliably distinguish a microZAP from a fatZAP from outside the ZAP code, and because it seems unlikely that most users are affected (a fairly niche tuneable combined with what should be an uncommon use of send), and for the sake of expediency, this change activates the feature the first time a microZAP grows to use a large block, and is never deactivated after that. This can be improved in the future. This commit changes nothing for existing pools that already have large microZAPs. The feature will not be retroactively applied, but will be activated the next time a microZAP grows past the limit. Don't use large_blocks feature for enable/disable tests. The large_microzap depends on large_blocks, so it gets enabled as a dependency, breaking the test. Instead use feature "longname", which has the exact same feature characteristics. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16593	2024-10-02 20:47:11 -07:00
Brian Atkinson	a10e552b99	Adding Direct IO Support Adding O_DIRECT support to ZFS to bypass the ARC for writes/reads. O_DIRECT support in ZFS will always ensure there is coherency between buffered and O_DIRECT IO requests. This ensures that all IO requests, whether buffered or direct, will see the same file contents at all times. Just as in other FS's , O_DIRECT does not imply O_SYNC. While data is written directly to VDEV disks, metadata will not be synced until the associated TXG is synced. For both O_DIRECT read and write request the offset and request sizes, at a minimum, must be PAGE_SIZE aligned. In the event they are not, then EINVAL is returned unless the direct property is set to always (see below). For O_DIRECT writes: The request also must be block aligned (recordsize) or the write request will take the normal (buffered) write path. In the event that request is block aligned and a cached copy of the buffer in the ARC, then it will be discarded from the ARC forcing all further reads to retrieve the data from disk. For O_DIRECT reads: The only alignment restrictions are PAGE_SIZE alignment. In the event that the requested data is in buffered (in the ARC) it will just be copied from the ARC into the user buffer. For both O_DIRECT writes and reads the O_DIRECT flag will be ignored in the event that file contents are mmap'ed. In this case, all requests that are at least PAGE_SIZE aligned will just fall back to the buffered paths. If the request however is not PAGE_SIZE aligned, EINVAL will be returned as always regardless if the file's contents are mmap'ed. Since O_DIRECT writes go through the normal ZIO pipeline, the following operations are supported just as with normal buffered writes: Checksum Compression Encryption Erasure Coding There is one caveat for the data integrity of O_DIRECT writes that is distinct for each of the OS's supported by ZFS. FreeBSD - FreeBSD is able to place user pages under write protection so any data in the user buffers and written directly down to the VDEV disks is guaranteed to not change. There is no concern with data integrity and O_DIRECT writes. Linux - Linux is not able to place anonymous user pages under write protection. Because of this, if the user decides to manipulate the page contents while the write operation is occurring, data integrity can not be guaranteed. However, there is a module parameter `zfs_vdev_direct_write_verify` that controls the if a O_DIRECT writes that can occur to a top-level VDEV before a checksum verify is run before the contents of the I/O buffer are committed to disk. In the event of a checksum verification failure the write will return EIO. The number of O_DIRECT write checksum verification errors can be observed by doing `zpool status -d`, which will list all verification errors that have occurred on a top-level VDEV. Along with `zpool status`, a ZED event will be issues as `dio_verify` when a checksum verification error occurs. ZVOLs and dedup is not currently supported with Direct I/O. A new dataset property `direct` has been added with the following 3 allowable values: disabled - Accepts O_DIRECT flag, but silently ignores it and treats the request as a buffered IO request. standard - Follows the alignment restrictions outlined above for write/read IO requests when the O_DIRECT flag is used. always - Treats every write/read IO request as though it passed O_DIRECT and will do O_DIRECT if the alignment restrictions are met otherwise will redirect through the ARC. This property will not allow a request to fail. There is also a module parameter zfs_dio_enabled that can be used to force all reads and writes through the ARC. By setting this module parameter to 0, it mimics as if the direct dataset property is set to disabled. Reviewed-by: Brian Behlendorf <behlendorf@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Co-authored-by: Mark Maybee <mark.maybee@delphix.com> Co-authored-by: Matt Macy <mmacy@FreeBSD.org> Co-authored-by: Brian Behlendorf <behlendorf@llnl.gov> Closes #10018	2024-09-14 13:47:59 -07:00
Don Brady	d4d79451cb	Add DDT prune command Requires the new 'flat' physical data which has the start time for a class entry. The amount to prune can be based on a target percentage of the unique entries or based on the age (i.e., every entry older than N days). Sponsored-by: Klara, Inc. Sponsored-by: iXsystems, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Don Brady <don.brady@klarasystems.com> Closes #16277	2024-09-04 14:17:02 -07:00
Mateusz Piotrowski	6be8bf5552	zpool: Provide GUID to zpool-reguid(8) with -g (#16239 ) This commit extends the zpool-reguid(8) command with a -g flag, which allows the user to specify the GUID to set. This change also adds some general tests for zpool-reguid(8). Sponsored-by: Wasabi Technology, Inc. Sponsored-by: Klara, Inc. Signed-off-by: Mateusz Piotrowski <0mp@FreeBSD.org> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-08-26 09:27:24 -07:00
Umer Saleem	959e963c81	JSON output support for zpool status This commit adds support for zpool status command to displpay status of ZFS pools in JSON format using '-j' option. Status information is collected in nvlist which is later dumped on stdout in JSON format. Existing options for zpool status work with '-j' flag. man page for zpool status is updated accordingly. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:47:10 -07:00
Umer Saleem	4e6b3f7e1d	JSON output support for zpool list This commit adds support for zpool list command to output the list of ZFS pools in JSON format using '-j' option.. Information about available pools is collected in nvlist which is later printed to stdout in JSON format. Existing options for zfs list command work with '-j' flag. man page for zpool list is updated accordingly. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:47:04 -07:00
Umer Saleem	eb2b824bde	JSON output support for zpool get This commit adds support for zpool get command to output the list of properties for ZFS Pools and VDEVS in JSON format using '-j' option. Man page for zpool get is updated to include '-j' option. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:47:00 -07:00
Umer Saleem	5cbdd5ea4f	JSON output support for zpool version This commit adds support for zpool version to output in JSON format using '-j' option. Userland kernel module version is collected in nvlist which is later displayed in JSON format. man page for zpool is updated. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:46:51 -07:00
Umer Saleem	cad4c0ef1a	JSON output support zfs mount This commit adds support for zfs mount to display mounted file systems in JSON format using '-j' option. Data is collected in nvlist which is printed in JSON format. man page for zfs mount is updated accordingly. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:46:40 -07:00
Umer Saleem	443abfc71d	JSON output support for zfs list This commit adds support for JSON output for zfs list using '-j' option. Information is collected in JSON format which is later printed in jSON format. Existing options for zfs list also work with '-j'. man pages are updated with relevant information. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:46:34 -07:00
Umer Saleem	aa15b60e58	JSON output support for zfs version and zfs get This commit adds support for JSON output for zfs version and zfs get commands. '-j' flag can be used to get output in JSON format. Information is collected in nvlist objects which is later printed in JSON format. Existing options that work for zfs get and zfs version also work with '-j' flag. man pages for zfs get and zfs version are updated accordingly. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16217	2024-08-06 12:45:45 -07:00

1 2 3 4 5 ...

659 Commits