Proxmox-Port/zfs - zfs - Gitea: Git with a cup of tea

mirror of https://github.com/openzfs/zfs.git synced 2025-10-01 02:46:29 +00:00

Author	SHA1	Message	Date
Alexander Motin	654ade8ca2	FreeBSD: Remove some illumos compat from vnode.h Some checks are pending checkstyle / checkstyle (push) Waiting to run Details CodeQL / Analyze (cpp) (push) Waiting to run Details CodeQL / Analyze (python) (push) Waiting to run Details zfs-qemu / Setup (push) Waiting to run Details zfs-qemu / qemu-x86 (push) Blocked by required conditions Details zfs-qemu / Cleanup (push) Blocked by required conditions Details zloop / zloop (push) Waiting to run Details Should make no difference, just some dead code cleanup. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Martin Matuska <mm@FreeBSD.org> Signed-off-by:Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16808	2024-12-03 09:32:54 -08:00
Alexander Motin	ae00c807dc	FreeBSD: Return ifndef IN_BASE back to fix the build FreeBSD's libprocstat seems to build kernel code in user space, which does not work here due to undefined vnode_t. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Martin Matuska <mm@FreeBSD.org> Signed-off-by:Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16808	2024-12-03 09:32:07 -08:00
Alexander Motin	3e9ba0f223	Add missing parenthesis in VERIFYF() Without them the order of operations might get unexpected. Reviewed-by: Adam Moss <c@yotes.com> Reviewed-by: Jorgen Lundman <lundman@lundman.net> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16826	2024-12-02 16:55:51 -08:00
Alexander Motin	b3b0ce64d5	FreeBSD: Lock vnode in zfs_ioctl() Previously vnode was not locked there, unlike Linux. It required locking it in vn_flush_cached_data(), which recursed on the lock if called from zfs_clone_range(), having the vnode locked. Reviewed-by: Alan Somers <asomers@gmail.com> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16789 Closes #16796	2024-11-23 14:26:52 -08:00
Alexander Motin	0d6306be8c	Fix few __VA_ARGS typos in assertions It should be __VA_ARGS__, not __VA_ARGS. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Rob Norris <robn@despairlabs.com> Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Closes #16780	2024-11-19 06:48:06 -08:00
Dimitry Andric	c480e06d88	Fix gcc unused value warning in FreeBSD simd.h The macros `simd_stat_init()` and `simd_stat_fini()` in FreeBSD's `simd.h` are defined as zero, but they are actually only used as statements. Replace the definitions with `do {} while (0)` instead, to avoid gcc `-Wunused-value` warnings. Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Toomas Soome <tsoome@me.com> Signed-off-by: Dimitry Andric <dimitry@andric.com> Closes #16693	2024-10-29 14:49:54 -04:00
Rob Norris	96f382d113	spl/thread: explicitly define thread_func_t as noreturn All of our thread entry functions have this signature: void ()(void) __attribute__((noreturn)) The low-level `__thread_create()` function accepts a `thread_func_t` as the entry point, which is defined more simply as: void ()(void ) And then the `thread_create()` and `thread_create_named()` macros cast the passed-in function point down to `thread_func_t`, that is, casting away the `noreturn` attribute. Clang considers casting between these two types to be invalid because both the caller and the callee may have elided parts of the stack frame save and restore, knowing that they won't be needed. Recent Linux appears to be setting `-Wcast-function-type-strict`, which causes this invalid cast to emit a warning, which with `-Werror` is converted to an error, breaking the build. This commit fixes this in the simplest possible way: adding `noreturn` to the `thread_func_t` attribute. Since all our thread entry functions already have this attribute, it's arguably a just a consistency fix anyway. I considered removing the casts in the macros, which silences the warnings, but it turns out that Clang has a bug that won't emit this error for implicit conversions, only explicit casts. So leaving them there seems like a reasonable belt-and-suspenders approach. Also, frankly, this whole mechanism seems a little undercooked inside LLVM, so I'm content go with my intuition about the smallest, least invaisve change. NOTE: `__thread_create` is exported by `spl.ko` and has a `thread_func_t` arg, so this is an ABI break. Whether that matters in practice, I have no idea. Further reading: - `1aad641c79` - https://github.com/llvm/llvm-project/issues/7325 - https://github.com/llvm/llvm-project/issues/41465 Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16672 Closes #16673	2024-10-21 11:38:30 -07:00
Tomohiro Kusumi	a9851ea3dd	Fix compile-time warnings caused by duplicate struct typedefs Some compiler/versions warn these typedefs according to #16660. The platform specific header sys/abd_os.h shouldn't define or use abd_t, as it's defined in its non-platform specific consumer sys/abd.h. Do the same as what FreeBSD header does. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Tomohiro Kusumi <kusumi.tomohiro@gmail.com> Closes #16660 Closes #16665	2024-10-20 09:43:16 -07:00
Umer Saleem	27e8f56102	Fix inconsistent mount options for ZFS root While mounting ZFS root during boot on Linux distributions from initrd, mount from busybox is effectively used which executes mount system call directly. This skips the ZFS helper mount.zfs, which checks and enables the mount options as specified in dataset properties. As a result, datasets mounted during boot from initrd do not have correct mount options as specified in ZFS dataset properties. There has been an attempt to use mount.zfs in zfs initrd script, responsible for mounting the ZFS root filesystem (PR#13305). This was later reverted (PR#14908) after discovering that using mount.zfs breaks mounting of snapshots on root (/) and other child datasets of root have the same issue (Issue#9461). This happens because switching from busybox mount to mount.zfs correctly parses the mount options but also adds 'mntpoint=/root' to the mount options, which is then prepended to the snapshot mountpoint in '.zfs/snapshot'. '/root' is the directory on Debian with initramfs-tools where root filesystem is mounted before pivot_root. When Linux runtime is reached, trying to access the snapshots on root results in automounting the snapshot on '/root/.zfs/*', which fails. This commit attempts to fix the automounting of snapshots on root, while using mount.zfs in initrd script. Since the mountpoint of dataset is stored in vfs_mntpoint field, we can check if current mountpoint of dataset and vfs_mntpoint are same or not. If they are not same, reset the vfs_mntpoint field with current mountpoint. This fixes the mountpoints of root dataset and children in respective vfs_mntpoint fields when we try to access the snapshots of root dataset or its children. With correct mountpoint for root dataset and children stored in vfs_mntpoint, all snapshots of root dataset are mounted correctly and become accessible. This fix will come into play only if current process, that is trying to access the snapshots is not in chroot context. The Linux kernel API that is used to convert struct path into char format (d_path), returns the complete path for given struct path. It works in chroot environment as well and returns the correct path from original filesystem root. However d_path fails to return the complete path if any directory from original root filesystem is mounted using --bind flag or --rbind flag in chroot environment. In this case, if we try to access the snapshot from outside the chroot environment, d_path returns the path correctly, i.e. it returns the correct path to the directory that is mounted with --bind flag. However inside the chroot environment, it only returns the path inside chroot. For now, there is not a better way in my understanding that gives the complete path in char format and handles the case where directories from root filesystem are mounted with --bind or --rbind on another path which user will later chroot into. So this fix gets enabled if current process trying to access the snapshot is not in chroot context. With the snapshots issue fixed for root filesystem, using mount.zfs in ZFS initrd script, mounts the datasets with correct mount options. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Ameer Hamza <ahamza@ixsystems.com> Signed-off-by: Umer Saleem <usaleem@ixsystems.com> Closes #16646	2024-10-17 09:09:39 -04:00
Warner Losh	38a04f0a7c	freebsd: Use compiler.h from FreeBSD's base's linuxkpi The FreeBSD linux/compiler.h in OpenZFS was copied from a very old version of FreeBSD's linuxkpi's linux/compiler.h. There's no need for this duplication. Use FreeBSD's linuxkpi version instead, and provide zfs_fallthrough to augment it (it's all that's needed). Use #pragma once to avoid naming issues for guard variables. Since this is a complete rewrite, use my copyright here (the original code in FreeBSD still credits everybody). This works back at least to FreeBSD 12.4, which is not out of support, and all newer releases. Remove extra copies of macros that were defined elsewhere, but are now properly defined in LinuxKPI so are redundant. Sponsored-by: Netflix Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Warner Losh <imp@bsdimp.com> Closes #16650	2024-10-16 13:00:40 -04:00
Martin Matuška	efeb60b86a	FreeBSD: ignore some includes when not building kernel The function abd_alloc_from_pages() is used only in kernel. Excluding sys/vm.h, and vm/vm_page.h includes avoids dependency problems. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Martin Matuska <mm@FreeBSD.org> Closes #16616	2024-10-09 09:27:46 -07:00
Brian Behlendorf	d34d4f97a8	snapdir: add 'disabled' value to make .zfs inaccessible In some environments, just making the .zfs control dir hidden from sight might not be enough. In particular, the following scenarios might warrant not allowing access at all: - old snapshots with wrong permissions/ownership - old snapshots with exploitable setuid/setgid binaries - old snapshots with sensitive contents Introducing a new 'disabled' value that not only hides the control dir, but prevents access to its contents by returning ENOENT solves all of the above. The new property value takes advantage of 'iuv' semantics ("ignore unknown value") to automatically fall back to the old default value when a pool is accessed by an older version of ZFS that doesn't yet know about 'disabled' semantics. I think that technically the zfs_dirlook change is enough to prevent access, but preventing lookups and dir entries in an already opened .zfs handle might also be a good idea to prevent races when modifying the property at runtime. Add zfs_snapshot_no_setuid parameter to control whether automatically mounted snapshots have the setuid mount option set or not. this could be considered a partial fix for one of the scenarios mentioned in desired. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Co-authored-by: Fabian Grünbichler <f.gruenbichler@proxmox.com> Closes #3963 Closes #16587	2024-10-02 09:12:02 -07:00
Rob Norris	0cf14bf4b5	Linux 6.12: PG_error flag was removed torvalds/linux@09022bc196 removes the flag, and the corresponding SetPageError() and ClearPageError() macros, with no replacement offered. Going back through the upstream history, use of this flag has been gradually removed over the last year as part of the long tail of converting everything to folios. Interesting tidbit comments from torvalds/linux@29e9412b25 and torvalds/linux@420e05d0de suggest that this flag has not been used meaningfully since page writeback failures started being recorded in errseq_t instead (the whole "fsyncgate" thing, ~2017, around torvalds/linux@8ed1e46aaf). Given that, it's possible that since perhaps Linux 4.13 we haven't been getting anything by setting the flag. I don't know if that's true and/or if there's something we should be doing instead, but my gut feel is that its probably fine we only use the page cache as a proxy to allow mmap() to work, rather than backing IO with it. As such, I'm expecting that removing this will do no harm, but I'm leaving it in for older kernels to maintain status quo, and if there is an overall better way, that is left for a future change. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16582	2024-10-01 13:54:05 -07:00
Rob Norris	1c7f2f6a50	Linux 6.12: f_version removed from struct file linux/torvalds@11068e0b64 removes it, suggesting this was a always there as a helper to handle concurrent seeks, which all filesystems now handle themselves if necessary. Without looking into the mechanism, I can imagine how it might have been used, but we have always set it to zero and never read from it, presumably because we've always tracked per-caller position through the znode anyway. So I don't see how there can be any functional change for us by removing it. I've stayed conservative though and left it in for older kernels, since its clearly not hurting anything there. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16582	2024-10-01 13:54:00 -07:00
Rob Norris	df3b9d881b	Linux 6.12: FMODE_UNSIGNED_OFFSET is now FOP_UNSIGNED_OFFSET torvalds/linux@641bb4394f asserts that this is a static flag, not intended to be variable per-file, so it moves it to file_operations instead. We just change our check to follow. No configure check is necessary because FOP_UNSIGNED_OFFSET didn't exist before this commit, and FMODE_UNSIGNED_OFFSET flag is removed in the same commit, so there's no chance of a conflict. It's not clear to me that we need this check at all, as we never set this flag on our own files, and I can't see any way that our llseek handler could recieve a file from another filesystem. But, the whole zpl_llseek() has a number of opportunities for pleasing cleanup that are nothing to do with this change, so I'll leave that for a future change. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16582	2024-10-01 13:53:55 -07:00
Rob Norris	bc96b80550	Linux 6.12: avoid kmem_cache_create redefinition torvalds/linux@b2e7456b5c makes kmem_cache_create() a macro, which gets in the way of our our own redefinition, so we undef the macro first for our own clients. This follows what we did for kmem_cache_alloc(), see `e951dba48`. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16582	2024-10-01 13:53:33 -07:00
Sanjeev Bagewadi	20232ecfaa	Support for longnames for files/directories (Linux part) This patch adds the ability for zfs to support file/dir name up to 1023 bytes. This number is chosen so we can support up to 255 4-byte characters. This new feature is represented by the new feature flag feature@longname. A new dataset property "longname" is also introduced to toggle longname support for each dataset individually. This property can be disabled, even if it contains longname files. In such case, new file cannot be created with longname but existing longname files can still be looked up. Note that, to my knowledge native Linux filesystems don't support name longer than 255 bytes. So there might be programs not able to work with longname. Note that NFS server may needs to use exportfs_get_name to reconnect dentries, and the buffer being passed is limit to NAME_MAX+1 (256). So NFS may not work when longname is enabled. Note, FreeBSD vfs layer imposes a limit of 255 name lengh, so even though we add code to support it here, it won't actually work. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Chunwei Chen <david.chen@nutanix.com> Closes #15921	2024-10-01 13:40:27 -07:00
Rich Ercolani	5d01243964	Add SIMD metadata in /proc on Linux Too many times, people's performance problems have amounted to "somehow your SIMD support isn't working", and determining that at runtime is difficult to describe to people. This adds a /proc/spl/kstat/zfs/simd node, which exposes metadata about which instructions ZFS thinks it can use, on AArch64 and x86_64 Linux, to make investigating things like this much easier. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #16530	2024-09-20 08:16:44 -07:00
Rob Norris	a83762b3f4	linux: remove kernel version checks for unsupported kernels Following `2b069768a` (#16479), anything gated on a kernel version before 4.18 can be always included/excluded. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16545	2024-09-19 15:43:44 -07:00
Rob Norris	5df65ca9c1	config: remove HAVE_GET_USER_PAGES_* get_user_pages_unlocked() had stabilised by 4.9. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:51 -07:00
Rob Norris	1cb46d9d1a	config: remove HAVE_MODE_LOOKUP_BDEV Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:51 -07:00
Rob Norris	0a61e51736	config: rework ZFS_GENHD_FL_* Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:51 -07:00
Rob Norris	d4e5538014	config: remove HAVE_GENERIC_IO_ACCT_3ARG Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:51 -07:00
Rob Norris	7a02229293	config: remove HAVE_VFSMOUNT_IOPS_GETATTR Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:51 -07:00
Rob Norris	dcb8e5ec7c	config: remove HAVE_BLK_MQ Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	1bf93713d8	config: remove HAVE_BLK_QUEUE_FLAG_* Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	ec48dd0976	config: remove ZFS_GLOBAL_ZONE_PAGE_STATE and ZFS_ENUM_* generation Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	8d9cb04ea8	config: remove ZFS_GLOBAL_ZONE_PAGE_STATE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	0a00804585	config: remove HAVE_WAIT_QUEUE_* Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	2de203163d	config: remove HAVE_SUPER_SETUP_BDI_NAME Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	8396c84346	config: remove HAVE_SET_SPECIAL_STATE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	4059455cda	config: remove HAVE_SCHED_SIGNAL_HEADER Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	7736b68376	config: remove HAVE_PERCPU_COUNTER_ADD_BATCH Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	f95d76cc28	config: remove HAVE_KTIME_GET_RAW_TS64 Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	b3458270b5	config: remove HAVE_KTIME_GET_COARSE_REAL_TS64 Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	536a0a8a84	config: remove HAVE_KERNEL_TIMER_SETUP Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	940f5d5658	config: remove HAVE_KERNEL_TIMER_FUNCTION_TIMER_LIST Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	80d7f0f98e	config: remove HAVE_INODE_TIMESPEC64_TIMES Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	06c34465b7	config: remove HAVE_INODE_SET_IVERSION Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	a817992559	config: remove HAVE_FILEMAP_RANGE_HAS_PAGE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	f4c4df1638	config: remove HAVE_BIO_BI_STATUS and bio error compat Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	3008e691a2	config: remove HAVE_ACL_REFCOUNT Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	d69bb93c9f	config: remove HAVE_BLK_QUEUE_SECDISCARD Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	cde79fe11a	config: remove ZFS_GLOBAL_NODE_PAGE_STATE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	1af510b550	config: remove HAVE_XATTR_GET_DENTRY_INODE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	e055f0e053	config: remove HAVE_XATTR_LIST_SIMPLE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	e6713cfd54	config: remove HAVE_XATTR_(GET\|SET\|LIST)_HANDLER Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	72d3fa215f	config: remove HAVE_VFS_ITERATE/HAVE_VFS_ITERATE_SHARED Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	432f6e9eec	config: remove HAVE_SUPER_USER_NS Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	846b598519	config: remove HAVE_REQ_OP_* and HAVE_REQ_* Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	7cc89f83ff	config: remove HAVE_POSIX_ACL_VALID_WITH_NS Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	d1a07741f8	config: remove HAVE_INODE_LOCK_SHARED Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	6374171bb4	config: remove HAVE_IN_COMPAT_SYSCALL Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	6a17061213	config: remove HAVE_CURRENT_TIME Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	ddf5122a29	config: remove HAVE_BLK_QUEUE_WRITE_CACHE/HAVE_BLK_QUEUE_FLUSH Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	43bf9a712b	config: remove HAVE_BIO_BI_OPF Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	9a1c7240ba	config: remove HAVE_RENAME2_OPERATIONS_WRAPPER Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	230bc538cb	config: remove HAVE_VFS_FILE_OPERATIONS_EXTEND Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	733317966f	config: remove HAVE_XATTR_(GET\|SET\|LIST)_DENTRY Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	eb73000dbb	config: remove HAVE_WAIT_ON_BIT_ACTION Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	2bba420245	config: remove HAVE_LSEEK_EXECUTE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	6a28491f8e	config: remove HAVE_3ARGS_BDI_SETUP_AND_REGISTER Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	d60d4ad809	config: remove HAVE_SET_CACHED_ACL_USABLE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	6840e3b18b	config: remove HAVE_SET_ACL Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	3d37b1d6d4	config: remove HAVE_POSIX_ACL_RELEASE and HAVE_POSIX_ACL_RELEASE_GPL_ONLY Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:50 -07:00
Rob Norris	67b0c883df	config: remove HAVE___POSIX_ACL_CHMOD Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	583e2e25b9	config: remove HAVE_PERCPU_COUNTER_INIT_WITH_GFP Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	1b522c4583	config: remove HAVE_KERNEL_TIMER_LIST_FLAGS Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	eb230c789a	config: remove HAVE_KERNEL_STRSCPY Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	6a1d8a9cf0	config: remove HAVE_FILE_INODE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	257e40f9d9	config: remove HAVE_FILE_DENTRY Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	dc6af4a4b5	config: remove HAVE_D_MAKE_ROOT Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:49 -07:00
Rob Norris	efc293e371	config: remove HAVE_DENTRY_D_U_ALIASES Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:48 -07:00
Rob Norris	147c82bd5e	config: remove HAVE_CLEAR_INODE and HAVE_EVICT_INODE Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:48 -07:00
Rob Norris	609559e5b9	config: remove HAVE_BIO_BVEC_ITER Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:45 -07:00
Rob Norris	233bed67a8	config: remove HAVE_1ARG_BIO_END_IO_T Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16479	2024-09-18 11:23:40 -07:00
Pavel Snajdr	90af1e83e8	Linux 6.10 compat: Fix tracepoints definitions __string field definition includes the source variable for a value of the string when the TP hits; in 6.10+ kernels, __assign_str() uses that to copy a value from src to the string, with older kernels, __assign_str still accepted src as a second parameter. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pavel Snajdr <snajpa@snajpa.net> Co-authored-by: Tony Hutter <hutter2@llnl.gov> Closes #16475 Closes #16515	2024-09-17 13:38:02 -07:00
Brian Atkinson	a10e552b99	Adding Direct IO Support Adding O_DIRECT support to ZFS to bypass the ARC for writes/reads. O_DIRECT support in ZFS will always ensure there is coherency between buffered and O_DIRECT IO requests. This ensures that all IO requests, whether buffered or direct, will see the same file contents at all times. Just as in other FS's , O_DIRECT does not imply O_SYNC. While data is written directly to VDEV disks, metadata will not be synced until the associated TXG is synced. For both O_DIRECT read and write request the offset and request sizes, at a minimum, must be PAGE_SIZE aligned. In the event they are not, then EINVAL is returned unless the direct property is set to always (see below). For O_DIRECT writes: The request also must be block aligned (recordsize) or the write request will take the normal (buffered) write path. In the event that request is block aligned and a cached copy of the buffer in the ARC, then it will be discarded from the ARC forcing all further reads to retrieve the data from disk. For O_DIRECT reads: The only alignment restrictions are PAGE_SIZE alignment. In the event that the requested data is in buffered (in the ARC) it will just be copied from the ARC into the user buffer. For both O_DIRECT writes and reads the O_DIRECT flag will be ignored in the event that file contents are mmap'ed. In this case, all requests that are at least PAGE_SIZE aligned will just fall back to the buffered paths. If the request however is not PAGE_SIZE aligned, EINVAL will be returned as always regardless if the file's contents are mmap'ed. Since O_DIRECT writes go through the normal ZIO pipeline, the following operations are supported just as with normal buffered writes: Checksum Compression Encryption Erasure Coding There is one caveat for the data integrity of O_DIRECT writes that is distinct for each of the OS's supported by ZFS. FreeBSD - FreeBSD is able to place user pages under write protection so any data in the user buffers and written directly down to the VDEV disks is guaranteed to not change. There is no concern with data integrity and O_DIRECT writes. Linux - Linux is not able to place anonymous user pages under write protection. Because of this, if the user decides to manipulate the page contents while the write operation is occurring, data integrity can not be guaranteed. However, there is a module parameter `zfs_vdev_direct_write_verify` that controls the if a O_DIRECT writes that can occur to a top-level VDEV before a checksum verify is run before the contents of the I/O buffer are committed to disk. In the event of a checksum verification failure the write will return EIO. The number of O_DIRECT write checksum verification errors can be observed by doing `zpool status -d`, which will list all verification errors that have occurred on a top-level VDEV. Along with `zpool status`, a ZED event will be issues as `dio_verify` when a checksum verification error occurs. ZVOLs and dedup is not currently supported with Direct I/O. A new dataset property `direct` has been added with the following 3 allowable values: disabled - Accepts O_DIRECT flag, but silently ignores it and treats the request as a buffered IO request. standard - Follows the alignment restrictions outlined above for write/read IO requests when the O_DIRECT flag is used. always - Treats every write/read IO request as though it passed O_DIRECT and will do O_DIRECT if the alignment restrictions are met otherwise will redirect through the ARC. This property will not allow a request to fail. There is also a module parameter zfs_dio_enabled that can be used to force all reads and writes through the ARC. By setting this module parameter to 0, it mimics as if the direct dataset property is set to disabled. Reviewed-by: Brian Behlendorf <behlendorf@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov> Signed-off-by: Brian Atkinson <batkinson@lanl.gov> Co-authored-by: Mark Maybee <mark.maybee@delphix.com> Co-authored-by: Matt Macy <mmacy@FreeBSD.org> Co-authored-by: Brian Behlendorf <behlendorf@llnl.gov> Closes #10018	2024-09-14 13:47:59 -07:00
Rob Norris	5b9e695392	abd_os: break out platform-specific header parts Removing the platform #ifdefs from shared headers in favour of per-platform headers. Makes abd_t much leaner, among other things. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16253	2024-08-21 13:37:18 -07:00
Rob Norris	db40fe4cf6	spl-taskq: per-taskq kstats This exposes a variety of per-taskq stats under /proc/spl/kstat/taskq, one file per taskq, named for the taskq name.instance. These include a small amount of info about the taskq config, the current state of the threads and queues, and various counters for thread and queue activity since the taskq was created. To assist with decrementing queue size counters, the list an entry is on is encoded in spare bits in the entry flags. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Syneto Closes #16171	2024-08-19 09:50:35 -07:00
Rob Norris	3abffc8781	Linux 6.11: add compat macro for page_mapping() Since the change to folios it has just been a wrapper anyway. Linux has removed their wrapper, so we add one. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400	2024-08-13 17:47:18 -07:00
Rob Norris	7e98d30f46	Linux 6.11: get backing_dev_info through queue gendisk It's no longer available directly on the request queue, but its easy to get from the attached disk. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400	2024-08-13 17:46:49 -07:00
Rob Norris	e95b732e49	Linux 6.11: enable queue flush through queue limits In 6.11 struct queue_limits gains a 'features' field, where, among other things, flush and write-cache are enabled. Detect it and use it. Along the way, the blk_queue_set_write_cache() compat wrapper gets a little cleanup. Since both flags are alway set together, its now a single bool. Also the very very ancient version that sets q->flush_flags directly couldn't actually turn it off, so I've fixed that. Not that we use it, but still. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16400	2024-08-13 17:46:41 -07:00
Rob Norris	6c82951d11	FreeBSD: remove support for FreeBSD < 13.0-RELEASE (#16372 ) This includes the last 12.x release (now EOL) and 13.0 development versions (<1300139). Sponsored-by: https://despairlabs.com/sponsor/ Signed-off-by: Rob Norris <robn@despairlabs.com> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-08-05 16:56:45 -07:00
Alexander Motin	d4b5517ef9	Linux: Report reclaimable memory to kernel as such (#16385 ) Linux provides SLAB_RECLAIM_ACCOUNT and __GFP_RECLAIMABLE flags to mark memory allocations that can be freed via shinker calls. It should allow kernel to tune and group such allocations for lower memory fragmentation and better reclamation under pressure. This patch marks as reclaimable most of ARC memory, directly evictable via ZFS shrinker, plus also dnode/znode/sa memory, indirectly evictable via kernel's superblock shrinker. Signed-off-by: Alexander Motin <mav@FreeBSD.org> Sponsored by: iXsystems, Inc. Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Allan Jude <allan@klarasystems.com>	2024-07-30 11:40:47 -07:00
Jason Lee	41902c8e6d	Use kmap_local_page instead of kmap_atomic (#16329 ) Changed zfs_k(un)map_atomic to zfs_k(un)map_local Signed-off-by: Jason Lee <jasonlee@lanl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Atkinson <batkinson@lanl.gov>	2024-07-16 17:27:29 -07:00
Rob Norris	e951dba48a	Linux 6.10: work harder to avoid kmem_cache_alloc reuse Linux 6.10 change kmem_cache_alloc to be a macro, rather than a function, such that the old #undef for it in spl-kmem-cache.c would remove its definition completely, breaking the build. This inverts the model used before. Rather than always defining the kmem_cache_* macro, then undefining then inside spl-kmem-cache.c, instead we make a special tag to indicate we're currently inside spl-kmem-cache.c, and not defining those in macros in the first place, so we can use the kernel-supplied kmem_cache_* functions to implement spl_kmem_cache_*, as we expect. For all other callers, we create the macros as normal and remove access to the kernel's own conflicting names. Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Tony Hutter <hutter2@llnl.gov> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>	2024-07-15 17:10:02 -07:00
Daniel Berlin	f7d8b13336	Fix missing semicolon in trace_dbuf.h (#16281 ) On fedora 40, on the 6.9.4 kernel (in updates-testing), assign_str expands to a "do {<stuff> } while(0)" loop. Without this semicolon, the while(0) is unterminated, causing a cascade of useless errors. With this semicolon, it compiles fine. It also compiles fine on 6.8.11 (the previous kernel). I have not tested earlier kernels than that, but at worst it should add a pointless semicolon. All other instances in the source tree are already terminated with semicolons. Signed-off-by: Daniel Berlin <dberlin@dberlin.org> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-07-12 17:44:10 -07:00
Mark Johnston	f72e081fbf	FreeBSD: Use a statement expression to implement SET_ERROR() (#16284 ) This way we can avoid making assumptions about the SDT probe implementation. No functional change intended. Signed-off-by: Mark Johnston <markj@FreeBSD.org> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Allan Jude <allan@klarasystems.com> Reviewed-by: Rob Norris <rob.norris@klarasystems.com> Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Tony Hutter <hutter2@llnl.gov>	2024-07-08 17:59:08 -07:00
Zhenlei Huang	e2357561b9	FreeBSD: Add const qualifier to members of struct opensolaris_utsname These members have directly references to the global variables exposed by the kernel. They are not going to be changed by this kernel module. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Zhenlei Huang <zlei@FreeBSD.org> Closes #16210	2024-05-30 09:58:20 -07:00
Pawel Jakub Dawidek	01c8efdd59	Simplify issig(). We always call it twice with JUSTLOOKING and then FORREAL. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Pawel Jakub Dawidek <pawel@dawidek.net> Closes #16225	2024-05-29 10:49:11 -07:00
Rob N	708be0f415	Linux 6.7 compat: detect if kernel defines intptr_t Since Linux 6.7 the kernel has defined intptr_t. Clang has -Wtypedef-redefinition by default, which causes the build to fail because we also have a typedef for intptr_t. Since its better to use the kernel's if it exists, detect it and skip our own. Sponsored-by: https://despairlabs.com/sponsor/ Reviewed-by: Tino Reichardt <milky-zfs@mcmilk.de> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Closes #16201	2024-05-24 18:54:24 -07:00
chenqiuhao1997	41ae864b69	Replace P2ALIGN with P2ALIGN_TYPED and delete P2ALIGN. In P2ALIGN, the result would be incorrect when align is unsigned integer and x is larger than max value of the type of align. In that case, -(align) would be a positive integer, which means high bits would be zero and finally stay zero after '&' when align is converted to a larger integer type. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Youzhong Yang <yyang@mathworks.com> Signed-off-by: Qiuhao Chen <chenqiuhao1997@gmail.com> Closes #15940	2024-05-10 08:47:21 -07:00
Rob N	8f1b7a6fa6	vdev_disk: disable flushes if device does not support it If the underlying device doesn't have a write-back cache, the kernel will just return a successful response. This doesn't hurt anything, but it's extra work on the IO taskqs that are unnecessary. So, detect this when we open the device for the first time. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16148	2024-05-02 15:18:35 -07:00
Rob N	b181b2e604	bdev_discard_supported: understand discard_granularity=0 Kernel documentation for the discard_granularity property says: A discard_granularity of 0 means that the device does not support discard functionality. Some older kernels had drivers (notably loop, but also some USB-SATA adapters) that would set the QUEUE_FLAG_DISCARD capability flag, but have discard_granularity=0. Since 5.10 (torvalds/linux@b35fd7422c) the discard entry point blkdev_issue_discard() has had a check for this, which would immediately reject the call with EOPNOTSUPP, and throw a scary diagnostic message into the log. See #16068. Since 6.8, the block layer sets a non-zero default for discard_granularity (torvalds/linux@3c407dc723), and a future kernel will remove the check entirely[1]. As such, there's no good reason for us to enable discard when discard_granularity=0. The kernel will never let the request go in anyway; better that we just disable it so we can report it properly to the user. 1. https://patchwork.kernel.org/project/linux-block/patch/20240312144826.1045212-2-hch@lst.de/ Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16068 Closes #16082	2024-04-12 09:00:20 -07:00
Rob Norris	b613709c46	dkio: remove kernel dkio.h compatibility header Without DKIOCFLUSHWRITECACHE, we no longer need the compat header. Note that we're keeping the userspace SPL compat header, which is used by libefi. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16064	2024-04-11 17:17:18 -07:00
Rob Norris	c9c838aa1f	zio: remove io_cmd and DKIOCFLUSHWRITECACHE There's no other options, so we can just always assume its a flush. Includes some light refactoring where a switch statement was doing control flow that no longer works. Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Alexander Motin <mav@FreeBSD.org> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Closes #16064	2024-04-11 17:17:11 -07:00
Rich Ercolani	e5e2a5a3b8	Add custom debug printing for your asserts Being able to print custom debug information on assert trip seems useful. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Reviewed-by: Paul Dagnelie <pcd@delphix.com> Signed-off-by: Rich Ercolani <rincebrain@gmail.com> Closes #15792	2024-04-10 13:30:25 -07:00
Rob N	a9a4290173	xdr: header cleanup #16047 notes that include/os/freebsd/spl/rpc/xdr.h carried an (apparently) incompatible license. While looking into it, it seems that this file is actually unnecessary these days - FreeBSD's kernel XDR has XDR_CONTROL, xdrmem_control and XDR_GET_BYTES_AVAIL, while userspace has XDR_CONTROL and xdrmem_control, and our implementation of XDR_GET_BYTES_AVAIL for libspl works nicely with it. So this removes that file outright. To keep the includes in nvpair.c tidy, I've made a few small adjustments to the Linux headers. By definition, rpc/types.h provides bool_t and is included before rpc/xdr.h, so I've created rpc/types.h for Linux. This isn't necessary for userspace; both FreeBSD native and tirpc on Linux already have these headers set up correctly. Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <robn@despairlabs.com> Sponsored-by: https://despairlabs.com/sponsor/ Closes #16047 Closes #16051	2024-04-03 15:13:27 -07:00
Rob Norris	06a196020e	vdev_disk: rewrite BIO filling machinery to avoid split pages This commit tackles a number of issues in the way BIOs (`struct bio`) are constructed for submission to the Linux block layer. The kernel has a hard upper limit on the number of pages/segments that can be added to a BIO, as well as a separate limit for each device (related to its queue depth and other scheduling characteristics). ZFS counts the number of memory pages in the request ABD (`abd_nr_pages_off()`, and then uses that as the number of segments to put into the BIO, up to the hard upper limit. If it requires more than the limit, it will create multiple BIOs. Leaving aside the fact that page count method is wrong (see below), not limiting to the device segment max means that the device driver will need to split the BIO in half. This is alone is not necessarily a problem, but it interacts with another issue to cause a much larger problem. The kernel function to add a segment to a BIO (`bio_add_page()`) takes a `struct page` pointer, and offset+len within it. `struct page` can represent a run of contiguous memory pages (known as a "compound page"). In can be of arbitrary length. The ZFS functions that count ABD pages and load them into the BIO (`abd_nr_pages_off()`, `bio_map()` and `abd_bio_map_off()`) will never consider a page to be more than `PAGE_SIZE` (4K), even if the `struct page` is for multiple pages. In this case, it will load the same `struct page` into the BIO multiple times, with the offset adjusted each time. With a sufficiently large ABD, this can easily lead to the BIO being entirely filled much earlier than it could have been. This is also further contributes to the problem caused by the incorrect segment limit calculation, as its much easier to go past the device limit, and so require a split. Again, this is not a problem on its own. The logic for "never submit more than `PAGE_SIZE`" is actually a little more subtle. It will actually never submit a buffer that crosses a 4K page boundary. In practice, this is fine, as most ABDs are scattered, that is a list of complete 4K pages, and so are loaded in as such. Linear ABDs are typically allocated from slabs, and for small sizes they are frequently not aligned to page boundaries. For example, a 12K allocation can span four pages, eg: -- 4K -- -- 4K -- -- 4K -- -- 4K -- \| \| \| \| \| :## ######## ######## ######: [1K, 4K, 4K, 3K] Such an allocation would be loaded into a BIO as you see: [1K, 4K, 4K, 3K] This tends not to be a problem in practice, because even if the BIO were filled and needed to be split, each half would still have either a start or end aligned to the logical block size of the device (assuming 4K at least). --- In ideal circumstances, these shortcomings don't cause any particular problems. Its when they start to interact with other ZFS features that things get interesting. Aggregation will create a "gang" ABD, which is simply a list of other ABDs. Iterating over a gang ABD is just iterating over each ABD within it in turn. Because the segments are simply loaded in order, we can end up with uneven segments either side of the "gap" between the two ABDs. For example, two 12K ABDs might be aggregated and then loaded as: [1K, 4K, 4K, 3K, 2K, 4K, 4K, 2K] Should a split occur, each individual BIO can end up either having an start or end offset that is not aligned to the logical block size, which some drivers (eg SCSI) will reject. However, this tends not to happen because the default aggregation limit usually keeps the BIO small enough to not require more than one split, and most pages are actually full 4K pages, so hitting an uneven gap is very rare anyway. If the pool is under particular memory pressure, then an IO can be broken down into a "gang block", a 512-byte block composed of a header and up to three block pointers. Each points to a fragment of the original write, or in turn, another gang block, breaking the original data up over and over until space can be found in the pool for each of them. Each gang header is a separate 512-byte memory allocation from a slab, that needs to be written down to disk. When the gang header is added to the BIO, its a single 512-byte segment. Pulling all this together, consider a large aggregated write of gang blocks. This results a BIO containing lots of 512-byte segments. Given our tendency to overfill the BIO, a split is likely, and most possible split points will yield a pair of BIOs that are misaligned. Drivers that care, like the SCSI driver, will reject them. --- This commit is a substantial refactor and rewrite of much of `vdev_disk` to sort all this out. `vdev_bio_max_segs()` now returns the ideal maximum size for the device, if available. There's also a tuneable `zfs_vdev_disk_max_segs` to override this, to assist with testing. We scan the ABD up front to count the number of pages within it, and to confirm that if we submitted all those pages to one or more BIOs, it could be split at any point with creating a misaligned BIO. If the pages in the BIO are not usable (as in any of the above situations), the ABD is linearised, and then checked again. This is the same technique used in `vdev_geom` on FreeBSD, adjusted for Linux's variable page size and allocator quirks. `vbio_t` is a cleanup and enhancement of the old `dio_request_t`. The idea is simply that it can hold all the state needed to create, submit and return multiple BIOs, including all the refcounts, the ABD copy if it was needed, and so on. Apart from what I hope is a clearer interface, the major difference is that because we know how many BIOs we'll need up front, we don't need the old overflow logic that would grow the BIO array, throw away all the old work and restart. We can get it right from the start. Reviewed-by: Alexander Motin <mav@FreeBSD.org> Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov> Signed-off-by: Rob Norris <rob.norris@klarasystems.com> Sponsored-by: Klara, Inc. Sponsored-by: Wasabi Technology, Inc. Closes #15533 Closes #15588	2024-03-25 16:51:14 -07:00

1 2 3 4 5 ...

518 Commits