linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git synced 2025-08-16 20:39:07 +00:00

Author	SHA1	Message	Date
Jeff Layton	f67eef8da0	nfsd: drop inode parameter from nfsd4_change_attribute() The inode that nfs4_open_delegation() passes to this function is wrong, which throws off the result. The inode will end up getting a directory-style change attr instead of a regular-file-style one. Fix up nfs4_delegation_stat() to fetch STATX_MODE, and then drop the inode parameter from nfsd4_change_attribute(), since it's no longer needed. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-11-11 13:42:06 -05:00
Yang Erkun	d5ff2fb2e7	nfsd: cancel nfsd_shrinker_work using sync mode in nfs4_state_shutdown_net In the normal case, when we excute `echo 0 > /proc/fs/nfsd/threads`, the function `nfs4_state_destroy_net` in `nfs4_state_shutdown_net` will release all resources related to the hashed `nfs4_client`. If the `nfsd_client_shrinker` is running concurrently, the `expire_client` function will first unhash this client and then destroy it. This can lead to the following warning. Additionally, numerous use-after-free errors may occur as well. nfsd_client_shrinker echo 0 > /proc/fs/nfsd/threads expire_client nfsd_shutdown_net unhash_client ... nfs4_state_shutdown_net /* won't wait shrinker exit / / cancel_work(&nn->nfsd_shrinker_work) * nfsd_file for this /* won't destroy unhashed client1 / client1 still alive nfs4_state_destroy_net / nfsd_file_cache_shutdown / trigger warning / kmem_cache_destroy(nfsd_file_slab) kmem_cache_destroy(nfsd_file_mark_slab) / release nfsd_file and mark */ __destroy_client ==================================================================== BUG nfsd_file (Not tainted): Objects remaining in nfsd_file on __kmem_cache_shutdown() -------------------------------------------------------------------- CPU: 4 UID: 0 PID: 764 Comm: sh Not tainted 6.12.0-rc3+ #1 dump_stack_lvl+0x53/0x70 slab_err+0xb0/0xf0 __kmem_cache_shutdown+0x15c/0x310 kmem_cache_destroy+0x66/0x160 nfsd_file_cache_shutdown+0xac/0x210 [nfsd] nfsd_destroy_serv+0x251/0x2a0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e ==================================================================== BUG nfsd_file_mark (Tainted: G B W ): Objects remaining nfsd_file_mark on __kmem_cache_shutdown() -------------------------------------------------------------------- dump_stack_lvl+0x53/0x70 slab_err+0xb0/0xf0 __kmem_cache_shutdown+0x15c/0x310 kmem_cache_destroy+0x66/0x160 nfsd_file_cache_shutdown+0xc8/0x210 [nfsd] nfsd_destroy_serv+0x251/0x2a0 [nfsd] nfsd_svc+0x125/0x1e0 [nfsd] write_threads+0x16a/0x2a0 [nfsd] nfsctl_transaction_write+0x74/0xa0 [nfsd] vfs_write+0x1a5/0x6d0 ksys_write+0xc1/0x160 do_syscall_64+0x5f/0x170 entry_SYSCALL_64_after_hwframe+0x76/0x7e To resolve this issue, cancel `nfsd_shrinker_work` using synchronous mode in nfs4_state_shutdown_net. Fixes: `7c24fa2250` ("NFSD: replace delayed_work with work_struct for nfsd_client_shrinker") Signed-off-by: Yang Erkun <yangerkun@huaweicloud.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-21 10:27:36 -04:00
Olga Kornievskaia	8dd91e8d31	nfsd: fix race between laundromat and free_stateid There is a race between laundromat handling of revoked delegations and a client sending free_stateid operation. Laundromat thread finds that delegation has expired and needs to be revoked so it marks the delegation stid revoked and it puts it on a reaper list but then it unlock the state lock and the actual delegation revocation happens without the lock. Once the stid is marked revoked a racing free_stateid processing thread does the following (1) it calls list_del_init() which removes it from the reaper list and (2) frees the delegation stid structure. The laundromat thread ends up not calling the revoke_delegation() function for this particular delegation but that means it will no release the lock lease that exists on the file. Now, a new open for this file comes in and ends up finding that lease list isn't empty and calls nfsd_breaker_owns_lease() which ends up trying to derefence a freed delegation stateid. Leading to the followint use-after-free KASAN warning: kernel: ================================================================== kernel: BUG: KASAN: slab-use-after-free in nfsd_breaker_owns_lease+0x140/0x160 [nfsd] kernel: Read of size 8 at addr ffff0000e73cd0c8 by task nfsd/6205 kernel: kernel: CPU: 2 UID: 0 PID: 6205 Comm: nfsd Kdump: loaded Not tainted 6.11.0-rc7+ #9 kernel: Hardware name: Apple Inc. Apple Virtualization Generic Platform, BIOS 2069.0.0.0.0 08/03/2024 kernel: Call trace: kernel: dump_backtrace+0x98/0x120 kernel: show_stack+0x1c/0x30 kernel: dump_stack_lvl+0x80/0xe8 kernel: print_address_description.constprop.0+0x84/0x390 kernel: print_report+0xa4/0x268 kernel: kasan_report+0xb4/0xf8 kernel: __asan_report_load8_noabort+0x1c/0x28 kernel: nfsd_breaker_owns_lease+0x140/0x160 [nfsd] kernel: nfsd_file_do_acquire+0xb3c/0x11d0 [nfsd] kernel: nfsd_file_acquire_opened+0x84/0x110 [nfsd] kernel: nfs4_get_vfs_file+0x634/0x958 [nfsd] kernel: nfsd4_process_open2+0xa40/0x1a40 [nfsd] kernel: nfsd4_open+0xa08/0xe80 [nfsd] kernel: nfsd4_proc_compound+0xb8c/0x2130 [nfsd] kernel: nfsd_dispatch+0x22c/0x718 [nfsd] kernel: svc_process_common+0x8e8/0x1960 [sunrpc] kernel: svc_process+0x3d4/0x7e0 [sunrpc] kernel: svc_handle_xprt+0x828/0xe10 [sunrpc] kernel: svc_recv+0x2cc/0x6a8 [sunrpc] kernel: nfsd+0x270/0x400 [nfsd] kernel: kthread+0x288/0x310 kernel: ret_from_fork+0x10/0x20 This patch proposes a fixed that's based on adding 2 new additional stid's sc_status values that help coordinate between the laundromat and other operations (nfsd4_free_stateid() and nfsd4_delegreturn()). First to make sure, that once the stid is marked revoked, it is not removed by the nfsd4_free_stateid(), the laundromat take a reference on the stateid. Then, coordinating whether the stid has been put on the cl_revoked list or we are processing FREE_STATEID and need to make sure to remove it from the list, each check that state and act accordingly. If laundromat has added to the cl_revoke list before the arrival of FREE_STATEID, then nfsd4_free_stateid() knows to remove it from the list. If nfsd4_free_stateid() finds that operations arrived before laundromat has placed it on cl_revoke list, it marks the state freed and then laundromat will no longer add it to the list. Also, for nfsd4_delegreturn() when looking for the specified stid, we need to access stid that are marked removed or freeable, it means the laundromat has started processing it but hasn't finished and this delegreturn needs to return nfserr_deleg_revoked and not nfserr_bad_stateid. The latter will not trigger a FREE_STATEID and the lack of it will leave this stid on the cl_revoked list indefinitely. Fixes: `2d4a532d38` ("nfsd: ensure that clp->cl_revoked list is protected by clp->cl_lock") CC: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-18 16:40:37 -04:00
Olga Kornievskaia	c88c150a46	nfsd: fix possible badness in FREE_STATEID When multiple FREE_STATEIDs are sent for the same delegation stateid, it can lead to a possible either use-after-free or counter refcount underflow errors. In nfsd4_free_stateid() under the client lock we find a delegation stateid, however the code drops the lock before calling nfs4_put_stid(), that allows another FREE_STATE to find the stateid again. The first one will proceed to then free the stateid which leads to either use-after-free or decrementing already zeroed counter. Fixes: `3f29cc82a8` ("nfsd: split sc_status out of sc_type") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-10-05 15:44:25 -04:00
Christian Brauner	09ee2a670d	Merge patch series "Fixup NLM and kNFSD file lock callbacks" Benjamin Coddington <bcodding@redhat.com> says: Last year both GFS2 and OCFS2 had some work done to make their locking more robust when exported over NFS. Unfortunately, part of that work caused both NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send lock notifications to clients. This in itself is not a huge problem because most NFS clients will still poll the server in order to acquire a conflicted lock, but now that I've noticed it I can't help but try to fix it because there are big advantages for setups that might depend on timely lock notifications, and we've supported that as a feature for a long time. Its important for NLM and kNFSD that they do not block their kernel threads inside filesystem's file_lock implementations because that can produce deadlocks. We used to make sure of this by only trusting that posix_lock_file() can correctly handle blocking lock calls asynchronously, so the lock managers would only setup their file_lock requests for async callbacks if the filesystem did not define its own lock() file operation. However, when GFS2 and OCFS2 grew the capability to correctly handle blocking lock requests asynchronously, they started signalling this behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting posix_lock_file() was inadvertently dropped, so now most filesystems no longer produce lock notifications when exported over NFS. I tried to fix this by simply including the old check for lock(), but the resulting include mess and layering violations was more than I could accept. There's a much cleaner way presented here using an fop_flag, which while potentially flag-greedy, greatly simplifies the problem and grooms the way for future uses by both filesystems and lock managers alike. * patches from https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com: exportfs: Remove EXPORT_OP_ASYNC_LOCK NLM/NFSD: Fix lock notifications for async-capable filesystems gfs2/ocfs2: set FOP_ASYNC_LOCK fs: Introduce FOP_ASYNC_LOCK NFS: trace: show TIMEDOUT instead of 0x6e nfsd: use system_unbound_wq for nfsd_file_gc_worker() nfsd: count nfsd_file allocations nfsd: fix refcount leak when file is unhashed after being found nfsd: remove unneeded EEXIST error check in nfsd_do_file_acquire nfsd: add list_head nf_gc to struct nfsd_file Link: https://lore.kernel.org/r/cover.1726083391.git.bcodding@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-02 07:52:07 +02:00
Benjamin Coddington	7e64c5bc49	NLM/NFSD: Fix lock notifications for async-capable filesystems Instead of checking just the exportfs flag, use the new locks_can_async_lock() helper which allows NLM and NFSD to once again support lock notifications for all filesystems which use posix_lock_file(). Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Link: https://lore.kernel.org/r/865c40da44af67939e8eb560d17a26c9c50f23e0.1726083391.git.bcodding@redhat.com Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-10-01 17:00:18 +02:00
NeilBrown	45bb63ed20	nfsd: fix delegation_blocked() to block correctly for at least 30 seconds The pair of bloom filtered used by delegation_blocked() was intended to block delegations on given filehandles for between 30 and 60 seconds. A new filehandle would be recorded in the "new" bit set. That would then be switch to the "old" bit set between 0 and 30 seconds later, and it would remain as the "old" bit set for 30 seconds. Unfortunately the code intended to clear the old bit set once it reached 30 seconds old, preparing it to be the next new bit set, instead cleared the new bit set before switching it to be the old bit set. This means that the "old" bit set is always empty and delegations are blocked between 0 and 30 seconds. This patch updates bd->new before clearing the set with that index, instead of afterwards. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Cc: stable@vger.kernel.org Fixes: `6282cd5655` ("NFSD: Don't hand out delegations for 30 seconds after recalling them.") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:38 -04:00
Jeff Layton	bf92e5008b	nfsd: fix initial getattr on write delegation At this point in compound processing, currentfh refers to the parent of the file, not the file itself. Get the correct dentry from the delegation stateid instead. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:37 -04:00
NeilBrown	a078a7dc0e	nfsd: untangle code in nfsd4_deleg_getattr_conflict() The code in nfsd4_deleg_getattr_conflict() is convoluted and buggy. With this patch we: - properly handle non-nfsd leases. We must not assume flc_owner is a delegation unless fl_lmops == &nfsd_lease_mng_ops - move the main code out of the for loop - have a single exit which calls nfs4_put_stid() (and other exits which don't need to call that) [ jlayton: refactored on top of Neil's other patch: nfsd: fix nfsd4_deleg_getattr_conflict in presence of third party lease ] Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:36 -04:00
Chuck Lever	aadc3bbea1	NFSD: Limit the number of concurrent async COPY operations Nothing appears to limit the number of concurrent async COPY operations that clients can start. In addition, AFAICT each async COPY can copy an unlimited number of 4MB chunks, so can run for a long time. Thus IMO async COPY can become a DoS vector. Add a restriction mechanism that bounds the number of concurrent background COPY operations. Start simple and try to be fair -- this patch implements a per-namespace limit. An async COPY request that occurs while this limit is exceeded gets NFS4ERR_DELAY. The requesting client can choose to send the request again after a delay or fall back to a traditional read/write style copy. If there is need to make the mechanism more sophisticated, we can visit that in future patches. Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	15392c8cd1	nfsd: avoid races with wake_up_var() wake_up_var() needs a barrier after the important change is made in the var and before wake_up_var() is called, else it is possible that a wake up won't be sent when it should. In each case here the var is changed in an "atomic" manner, so smb_mb__after_atomic() is sufficient. In one case the important change (removing the lease) is performed after the wake_up, which is backwards. The code survives in part because the wait_var_event is given a timeout. This patch adds the required barriers and calls destroy_delegation() before waking any threads waiting for the delegation to be destroyed. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	985eeae9c8	nfsd: use clear_and_wake_up_bit() nfsd has two places that open-code clear_and_wake_up_bit(). One has the required memory barriers. The other does not. Change both to use clear_and_wake_up_bit() so we have the barriers without the noise. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Jeff Layton	ba017fd391	nfsd: add more nfsd_cb tracepoints Add some tracepoints in the callback client RPC operations. Also add a tracepoint to nfsd4_cb_getattr_done. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Jeff Layton	c1c9f3ea74	nfsd: track the main opcode for callbacks Keep track of the "main" opcode for the callback, and display it in the tracepoint. This makes it simpler to discern what's happening when there is more than one callback in flight. The one special case is the CB_NULL RPC. That's not a CB_COMPOUND opcode, so designate the value 0 for that. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	76a3f3f164	nfsd: fix some spelling errors in comments Fix spelling errors in comments of nfsd4_release_lockowner and nfs4_set_delegation. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Hongbo Li	c2feb7ee39	nfsd: use LIST_HEAD() to simplify code list_head can be initialized automatically with LIST_HEAD() instead of calling INIT_LIST_HEAD(). Signed-off-by: Hongbo Li <lihongbo22@huawei.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Li Lingfeng	2039c5da5d	NFSD: remove redundant assignment operation Commit `5826e09bf3` ("NFSD: OP_CB_RECALL_ANY should recall both read and write delegations") added a new assignment statement to add RCA4_TYPE_MASK_WDATA_DLG to ra_bmval bitmask of OP_CB_RECALL_ANY. So the old one should be removed. Signed-off-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Mark Grimes	32b34fa485	nfsd: Add quotes to client info 'callback address' The 'callback address' in client_info_show is output without quotes causing yaml parsers to fail on processing IPv6 addresses. Adding quotes to 'callback address' also matches that used by the 'address' field. Signed-off-by: Mark Grimes <mark.grimes@ixsystems.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	4f67d24f72	nfsd: use nfsd_v4client() in nfsd_breaker_owns_lease() nfsd_breaker_owns_lease() currently open-codes the same test that nfsd_v4client() performs. With this patch we use nfsd_v4client() instead. Also as i_am_nfsd() is only used in combination with kthread_data(), replace it with nfsd_current_rqst() which combines the two and returns a valid svc_rqst, or NULL. The test for NULL is moved into nfsd_v4client() for code clarity. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	9fd45c16f3	nfsd: Pass 'cred' instead of 'rqstp' to some functions. nfsd_permission(), exp_rdonly(), nfsd_setuser(), and nfsexp_flags() only ever need the cred out of rqstp, so pass it explicitly instead of the whole rqstp. This makes the interfaces cleaner. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
Sagi Grimberg	11673b2a91	nfsd: don't assume copy notify when preprocessing the stateid Move the stateid handling to nfsd4_copy_notify. If nfs4_preprocess_stateid_op did not produce an output stateid, error out. Copy notify specifically does not permit the use of special stateids, so enforce that outside generic stateid pre-processing. Signed-off-by: Sagi Grimberg <sagi@grimberg.me> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Olga Kornievskaia <aglo@umich.edu> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-20 19:31:03 -04:00
NeilBrown	8203ab8a9d	nfsd: don't EXPORT_SYMBOL nfsd4_ssc_init_umount_work() nfsd4_ssc_init_umount_work() is only used in the nfsd module, so there is no need to EXPORT it. Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-09-01 10:04:56 -04:00
NeilBrown	40927f3d09	nfsd: fix nfsd4_deleg_getattr_conflict in presence of third party lease It is not safe to dereference fl->c.flc_owner without first confirming fl->fl_lmops is the expected manager. nfsd4_deleg_getattr_conflict() tests fl_lmops but largely ignores the result and assumes that flc_owner is an nfs4_delegation anyway. This is wrong. With this patch we restore the "!= &nfsd_lease_mng_ops" case to behave as it did before the change mentioned below. This is the same as the current code, but without any reference to a possible delegation. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-08-30 10:48:29 -04:00
Jeff Layton	7e8ae8486e	fs/nfsd: fix update of inode attrs in CB_GETATTR Currently, we copy the mtime and ctime to the in-core inode and then mark the inode dirty. This is fine for certain types of filesystems, but not all. Some require a real setattr to properly change these values (e.g. ceph or reexported NFS). Fix this code to call notify_change() instead, which is the proper way to effect a setattr. There is one problem though: In this case, the client is holding a write delegation and has sent us attributes to update our cache. We don't want to break the delegation for this since that would defeat the purpose. Add a new ATTR_DELEG flag that makes notify_change bypass the try_break_deleg call. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-08-26 19:04:00 -04:00
Jeff Layton	1116e0e372	nfsd: fix potential UAF in nfsd4_cb_getattr_release Once we drop the delegation reference, the fields embedded in it are no longer safe to access. Do that last. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-08-26 11:53:05 -04:00
Jeff Layton	da05ba23d4	nfsd: hold reference to delegation when updating it for cb_getattr Once we've dropped the flc_lock, there is nothing that ensures that the delegation that was found will still be around later. Take a reference to it while holding the lock and then drop it when we've finished with the delegation. Fixes: `c5967721e1` ("NFSD: handle GETATTR conflict with write delegation") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-08-26 11:52:40 -04:00
Olga Kornievskaia	a204501e17	nfsd: prevent panic for nfsv4.0 closed files in nfs4_show_open Prior to commit `3f29cc82a8` ("nfsd: split sc_status out of sc_type") states_show() relied on sc_type field to be of valid type before calling into a subfunction to show content of a particular stateid. From that commit, we split the validity of the stateid into sc_status and no longer changed sc_type to 0 while unhashing the stateid. This resulted in kernel oopsing for nfsv4.0 opens that stay around and in nfs4_show_open() would derefence sc_file which was NULL. Instead, for closed open stateids forgo displaying information that relies of having a valid sc_file. To reproduce: mount the server with 4.0, read and close a file and then on the server cat /proc/fs/nfsd/clients/2/states [ 513.590804] Call trace: [ 513.590925] _raw_spin_lock+0xcc/0x160 [ 513.591119] nfs4_show_open+0x78/0x2c0 [nfsd] [ 513.591412] states_show+0x44c/0x488 [nfsd] [ 513.591681] seq_read_iter+0x5d8/0x760 [ 513.591896] seq_read+0x188/0x208 [ 513.592075] vfs_read+0x148/0x470 [ 513.592241] ksys_read+0xcc/0x178 Fixes: `3f29cc82a8` ("nfsd: split sc_status out of sc_type") Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-08-23 12:25:46 -04:00
NeilBrown	d43113fbbf	nfsd: optimise recalculate_deny_mode() for a common case recalculate_deny_mode() takes time that is linear in the number of stateids active on the file. When called from release_openowner -> free_ol_stateid_reaplist ->nfs4_free_ol_stateid -> release_all_access the number of times it is called is linear in the number of stateids. The net result is that time taken by release_openowner is quadratic in the number of stateids. When the nfsd server is shut down while there are many active stateids this can result in a soft lockup. ("CPU stuck for 302s" seen in one case). In many cases all the states have the same deny modes and there is no need to examine the entire list in recalculate_deny_mode(). In particular, recalculate_deny_mode() will only reduce the deny mode, never increase it. So if some prefix of the list causes the original deny mode to be required, there is no need to examine the remainder of the list. So we can improve recalculate_deny_mode() to usually run in constant time, so release_openowner will typically be only linear in the number of states. Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:19 -04:00
Jeff Layton	9320f27fda	nfsd: add tracepoint in mark_client_expired_locked Show client info alongside the number of cl_rpc_users. If that's elevated, then we can infer that this function returned nfserr_jukebox. [ cel: For additional debugging of RPC user refcounting ] Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Vladimir Benes <vbenes@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:19 -04:00
Chuck Lever	2d49901150	nfsd: new tracepoint for check_slot_seqid Replace a dprintk in check_slot_seqid with tracepoints. These new tracepoints track slot sequence numbers during operation. Suggested-by: Jeffrey Layton <jlayton@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:18 -04:00
Chuck Lever	38f080f3cd	NFSD: Move callback_wq into struct nfs4_client Commit `8838203667` ("nfsd: update workqueue creation") made the callback_wq single-threaded, presumably to protect modifications of cl_cb_client. See documenting comment for nfsd4_process_cb_update(). However, cl_cb_client is per-lease. There's no other reason that all callback operations need to be dispatched via a single thread. The single threading here means all client callbacks can be blocked by a problem with one client. Change the NFSv4 callback client so it serializes per-lease instead of serializing all NFSv4 callback operations on the server. Reported-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:16 -04:00
NeilBrown	56c35f43ee	nfsd: drop st_mutex before calling move_to_close_lru() move_to_close_lru() is currently called with ->st_mutex held. This can lead to a deadlock as move_to_close_lru() waits for sc_count to drop to 2, and some threads holding a reference might be waiting for the mutex. These references will never be dropped so sc_count will never reach 2. There can be no harm in dropping ->st_mutex before move_to_close_lru() because the only place that takes the mutex is nfsd4_lock_ol_stateid(), and it quickly aborts if sc_type is NFS4_CLOSED_STID, which it will be before move_to_close_lru() is called. See also https://lore.kernel.org/lkml/4dd1fe21e11344e5969bb112e954affb@jd.com/T/ where this problem was raised but not successfully resolved. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:16 -04:00
NeilBrown	eec7620800	nfsd: replace rp_mutex to avoid deadlock in move_to_close_lru() move_to_close_lru() waits for sc_count to become zero while holding rp_mutex. This can deadlock if another thread holds a reference and is waiting for rp_mutex. By the time we get to move_to_close_lru() the openowner is unhashed and cannot be found any more. So code waiting for the mutex can safely retry the lookup if move_to_close_lru() has started. So change rp_mutex to an atomic_t with three states: RP_UNLOCK - state is still hashed, not locked for reply RP_LOCKED - state is still hashed, is locked for reply RP_UNHASHED - state is not hashed, no code can get a lock. Use wait_var_event() to wait for either a lock, or for the owner to be unhashed. In the latter case, retry the lookup. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:16 -04:00
NeilBrown	b3f03739ca	nfsd: move nfsd4_cstate_assign_replay() earlier in open handling. Rather than taking the rp_mutex (via nfsd4_cstate_assign_replay) in nfsd4_cleanup_open_state() (which seems counter-intuitive), take it and assign rp_owner as soon as possible - in nfsd4_process_open1(). This will support a future change when nfsd4_cstate_assign_replay() might fail. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:15 -04:00
NeilBrown	23df17788c	nfsd: perform all find_openstateowner_str calls in the one place. Currently find_openstateowner_str look ups are done both in nfsd4_process_open1() and alloc_init_open_stateowner() - the latter possibly being a surprise based on its name. It would be easier to follow, and more conformant to common patterns, if the lookup was all in the one place. So replace alloc_init_open_stateowner() with find_or_alloc_open_stateowner() and use the latter in nfsd4_process_open1() without any calls to find_openstateowner_str(). This means all finds are find_openstateowner_str_locked() and find_openstateowner_str() is no longer needed. So discard find_openstateowner_str() and rename find_openstateowner_str_locked() to find_openstateowner_str(). Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-05-06 09:07:15 -04:00
Linus Torvalds	f2f80ac809	nfsd-6.9 fixes: - Address a slow memory leak with RPC-over-TCP - Prevent another NFS4ERR_DELAY loop during CREATE_SESSION -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmYReWEACgkQM2qzM29m f5fsfxAAhVkcd5Om9iBI7/Ib2QtJdeyn9+Q6hOJi9ITDPpdbSrd1Fmd8ufyKNuxH dwGLyV0+ELbUl1RRNfdnl+TkzYHMTURuvDEgUyhYA28GOJVd9GWXwX2KZR7J+AP5 HtpSGLXt+XvuO7uB+SFS85wwF0DJL39Qy4jCVYCOuN2Z8zqfTg5TwstOQ8X794QN b5JzLkUlxQfd6kGRvU+BZHNf7R/yBfjUQWVybyhqzdjnCbbnPH+cl0hTlEIQTYJH G31Gty1J/RGt1ZeURuF4OG4lFocRJW/SqoruneweBAOksN9PVcwsoMf6m16l3+AD ZMnBt7FInQc/mAqRqIoLTsmYT8OyDa3a6qjubqWCYicCXvj1FxxOd7IaYytXxv/2 Z8ZvKSSvyXRwM3mUt+3E5DTM8NnsxPxnO9iSGIMUeH7n96LU0X39b/Ll6in6+eu2 /go8cLe59uuYDF9n2srX/LLWHj5wAWxVi+OgiSsAbsDFYTtJXK+syT2CpsEFXiUZ 5AYUbfGVqQ8uNtfGaaJd71CNCuEKC5qYpeC5cS2nnruV6SArfG69DMRAO0pxJYAC 6X7gm9Se1zyI8r9gR0rKjJ5ojeTPQBLfk6oVavum6CCwHzkKQTLG2jHBq8cdpwoL KxXc37fhW9m9c2B3g2dikclM2+XrMyUzJ5Ync9SSiwFJN/956I0= =dGcu -----END PGP SIGNATURE----- Merge tag 'nfsd-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address a slow memory leak with RPC-over-TCP - Prevent another NFS4ERR_DELAY loop during CREATE_SESSION * tag 'nfsd-6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: hold a lighter-weight client reference over CB_RECALL_ANY SUNRPC: Fix a slow server-side memory leak with RPC-over-TCP	2024-04-06 09:37:50 -07:00
Jeff Layton	10396f4df8	nfsd: hold a lighter-weight client reference over CB_RECALL_ANY Currently the CB_RECALL_ANY job takes a cl_rpc_users reference to the client. While a callback job is technically an RPC that counter is really more for client-driven RPCs, and this has the effect of preventing the client from being unhashed until the callback completes. If nfsd decides to send a CB_RECALL_ANY just as the client reboots, we can end up in a situation where the callback can't complete on the (now dead) callback channel, but the new client can't connect because the old client can't be unhashed. This usually manifests as a NFS4ERR_DELAY return on the CREATE_SESSION operation. The job is only holding a reference to the client so it can clear a flag after the RPC completes. Fix this by having CB_RECALL_ANY instead hold a reference to the cl_nfsdfs.cl_ref. Typically we only take that sort of reference when dealing with the nfsdfs info files, but it should work appropriately here to ensure that the nfs4_client doesn't disappear. Fixes: `44df6f439a` ("NFSD: add delegation reaper to react to low memory condition") Reported-by: Vladimir Benes <vbenes@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-04-05 14:05:35 -04:00
Linus Torvalds	d8e8fbec00	nfsd-6.9 fixes: - Address three recently introduced regressions -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmYFtQkACgkQM2qzM29m f5fSbA//S0YiPCU+bVwN+mK47vhai1Dzaw1kd2rzlfnvTCyBFVocfQU85bFV8uns ZMcHhO6jPjRg1o/qkCzeNf+n6AVGwvgOn1jC628jTP1wJOS8WPsZtdwz8vHX/xw5 qnMwEUtfFPiDn85VyK6z6mJBWRcF/vStPc022Ie5ba5ydxD0rFmaeIc6zYX8csUe 8tbHlulYtRR97mXjxgtXC/H0scvag07MA1fRKiZRNk0jwUdtcAsHktiBfWPV2KXO ew8saA+BLyBBdE01cq1CoMH/3wt/a5rpaiV5ePPtYkmlPVT5IEmSLZUwUVZAjMuu qlvKF9xrH7x8/7g/SwOJdH8po0IjiRdW9EKID+6PPqXz+YYKyTe7JqZtvEzt/4x9 tGbIfyHxpUPPTZ6VedlXYW95uWG+lrKzur+zHQwsNf+U0d5Xila8euEKqvB4kDtk riHUlvs8O6WroeHcOQ5pK72l7x8gELdVqHP6n+E0t5+2VKp/Vcqf8WlWV0suoDfa D2EYGIzxn8NrNuivGESHfzafmHKKMn51UGrm1Sl0vF5w2LlqMfL4Yo4NIT+e69VR E3h6wIhK7Jor9px5/tBD3Y6o9k8wT9upjA+qsrOUhxyU3T9r+s88I64gvGyABTIM kZJHLJYROoNO6A5FSRySSH+bk/UU+wJ0s6avgHL1A2yYnyrQUzU= =YV5O -----END PGP SIGNATURE----- Merge tag 'nfsd-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address three recently introduced regressions * tag 'nfsd-6.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: CREATE_SESSION must never cache NFS4ERR_DELAY replies SUNRPC: Revert `561141dd49` nfsd: Fix error cleanup path in nfsd_rename()	2024-03-28 14:35:32 -07:00
Chuck Lever	99dc2ef039	NFSD: CREATE_SESSION must never cache NFS4ERR_DELAY replies There are one or two cases where CREATE_SESSION returns NFS4ERR_DELAY in order to force the client to wait a bit and try CREATE_SESSION again. However, after commit `e4469c6cc6` ("NFSD: Fix the NFSv4.1 CREATE_SESSION operation"), NFSD caches that response in the CREATE_SESSION slot. Thus, when the client resends the CREATE_SESSION, the server always returns the cached NFS4ERR_DELAY response rather than actually executing the request and properly recording its outcome. This blocks the client from making further progress. RFC 8881 Section 15.1.1.3 says: > If NFS4ERR_DELAY is returned on an operation other than SEQUENCE > that validly appears as the first operation of a request ... [t]he > request can be retried in full without modification. In this case > as well, the replier MUST avoid returning a response containing > NFS4ERR_DELAY as the response to an initial operation of a request > solely on the basis of its presence in the reply cache. Neither the original NFSD code nor the discussion in section 18.36.4 refer explicitly to this important requirement, so I missed it. Note also that not only must the server not cache NFS4ERR_DELAY, but it has to not advance the CREATE_SESSION slot sequence number so that it can properly recognize and accept the client's retry. Reported-by: Dai Ngo <dai.ngo@oracle.com> Fixes: `e4469c6cc6` ("NFSD: Fix the NFSv4.1 CREATE_SESSION operation") Tested-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-27 13:19:47 -04:00
Linus Torvalds	a01c9fe323	NFSD 6.9 Release Notes The bulk of the patches for this release are optimizations, code clean-ups, and minor bug fixes. One new feature to mention is that NFSD administrators now have the ability to revoke NFSv4 open and lock state. NFSD's NFSv3 support has had this capability for some time. As always I am grateful to NFSD contributors, reviewers, and testers. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmXwV4QACgkQM2qzM29m f5c7cg/8CRe0mGbeEMonoSycBjANDuiRolCM+DhVccUvSyWPqf4blF5yrNHcf5zN WmjQHVXIJUMVpLovcakj+4aBIuXGgdSmBJamFTy9fVfcFadiWYRceNgMMXpLMDDI fMAszRUyfL/r0Evj0Zajt86R5/gGn+W9X6HlDc1k7VV0Z+fzRw9WMxADy11cgHLp mh2bzyPmwu0EfBYlWNWLqzWVZm1C5UCGnlInyr0KXImCLOkpJqAVXTDvDkGFW2Qw 1kJhodyabf6fRV2ZqPjLUuR4aRqABey83rB0N5z7MumO/dJUBW3CHR3uNMqvkmh3 XevI8bPzS2Kypijcx7dONtkDWwU+fsvCdepNpmVDB73B19BFiLG+HDbMypJ0dmp+ rvvfILRDCmIb+FA1DUeT3lIc6ac1f1+qAVc7hi3E7rGctEJWeHDsZg+E1PuTvpxM 3XfRaFnucY5vwyiB2/uI4eblBHcVXoKho+pUqQMegLPRbgsEUyFUfg3+ZMtntagd OVUXvWYIARP97HNh0J5ChcGI72UpXtFWMlbbiTiCzYx4FeiCffeczIERXNJ4FYAg fKUaiBhdAN1PPFCRXJORZ5XlSIeZttUNSJUPfmuOpkscMdkpRUIhuEUYo9K8/1eL O+YZeGW/kTG+llxOERfEHJoekLf1TgGdU7oBmTIgQIK03hTUih8= =75G4 -----END PGP SIGNATURE----- Merge tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "The bulk of the patches for this release are optimizations, code clean-ups, and minor bug fixes. One new feature to mention is that NFSD administrators now have the ability to revoke NFSv4 open and lock state. NFSD's NFSv3 support has had this capability for some time. As always I am grateful to NFSD contributors, reviewers, and testers" * tag 'nfsd-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (75 commits) NFSD: Clean up nfsd4_encode_replay() NFSD: send OP_CB_RECALL_ANY to clients when number of delegations reaches its limit NFSD: Document nfsd_setattr() fill-attributes behavior nfsd: Fix NFSv3 atomicity bugs in nfsd_setattr() nfsd: Fix a regression in nfsd_setattr() NFSD: OP_CB_RECALL_ANY should recall both read and write delegations NFSD: handle GETATTR conflict with write delegation NFSD: add support for CB_GETATTR callback NFSD: Document the phases of CREATE_SESSION NFSD: Fix the NFSv4.1 CREATE_SESSION operation nfsd: clean up comments over nfs4_client definition svcrdma: Add Write chunk WRs to the RPC's Send WR chain svcrdma: Post WRs for Write chunks in svc_rdma_sendto() svcrdma: Post the Reply chunk and Send WR together svcrdma: Move write_info for Reply chunks into struct svc_rdma_send_ctxt svcrdma: Post Send WR chain svcrdma: Fix retry loop in svc_rdma_send() svcrdma: Prevent a UAF in svc_rdma_send() svcrdma: Fix SQ wake-ups svcrdma: Increase the per-transport rw_ctx count ...	2024-03-12 14:27:37 -07:00
Linus Torvalds	0c750012e8	vfs-6.9.file -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZem4tQAKCRCRxhvAZXjc ohnfAP4sm946PZfiC4y5Euk96WDC3hC8WCSBar+fpFmYVzeD9wEAy+NVCsjkMElz vqNxwFULUwQjFxxvsM9gvhrgGUud1AE= =UZk/ -----END PGP SIGNATURE----- Merge tag 'vfs-6.9.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull file locking updates from Christian Brauner: "A few years ago struct file_lock_context was added to allow for separate lists to track different types of file locks instead of using a singly-linked list for all of them. Now leases no longer need to be tracked using struct file_lock. However, a lot of the infrastructure is identical for leases and locks so separating them isn't trivial. This splits a group of fields used by both file locks and leases into a new struct file_lock_core. The new core struct is embedded in struct file_lock. Coccinelle was used to convert a lot of the callers to deal with the move, with the remaining 25% or so converted by hand. Afterwards several internal functions in fs/locks.c are made to work with struct file_lock_core. Ultimately this allows to split struct file_lock into struct file_lock and struct file_lease. The file lease APIs are then converted to take struct file_lease" * tag 'vfs-6.9.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (51 commits) filelock: fix deadlock detection in POSIX locking filelock: always define for_each_file_lock() smb: remove redundant check filelock: don't do security checks on nfsd setlease calls filelock: split leases out of struct file_lock filelock: remove temporary compatibility macros smb/server: adapt to breakup of struct file_lock smb/client: adapt to breakup of struct file_lock ocfs2: adapt to breakup of struct file_lock nfsd: adapt to breakup of struct file_lock nfs: adapt to breakup of struct file_lock lockd: adapt to breakup of struct file_lock fuse: adapt to breakup of struct file_lock gfs2: adapt to breakup of struct file_lock dlm: adapt to breakup of struct file_lock ceph: adapt to breakup of struct file_lock afs: adapt to breakup of struct file_lock 9p: adapt to breakup of struct file_lock filelock: convert seqfile handling to use file_lock_core filelock: convert locks_translate_pid to take file_lock_core ...	2024-03-11 10:37:45 -07:00
Dai Ngo	bad4c585cc	NFSD: send OP_CB_RECALL_ANY to clients when number of delegations reaches its limit The NFS server should ask clients to voluntarily return unused delegations when the number of granted delegations reaches the max_delegations. This is so that the server can continue to grant delegations for new requests. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Tested-by: Chen Hanxiao <chenhx.fnst@fujitsu.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-05 10:22:33 -05:00
Trond Myklebust	24d92de918	nfsd: Fix NFSv3 atomicity bugs in nfsd_setattr() The main point of the guarded SETATTR is to prevent races with other WRITE and SETATTR calls. That requires that the check of the guard time against the inode ctime be done after taking the inode lock. Furthermore, we need to take into account the 32-bit nature of timestamps in NFSv3, and the possibility that files may change at a faster rate than once a second. Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:33 -05:00
Dai Ngo	5826e09bf3	NFSD: OP_CB_RECALL_ANY should recall both read and write delegations Add RCA4_TYPE_MASK_WDATA_DLG to ra_bmval bitmask of OP_CB_RECALL_ANY Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:32 -05:00
Dai Ngo	c5967721e1	NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the request is handled as below: Server sends CB_GETATTR to client to get the latest change info and file size. If these values are the same as the server's cached values then the GETATTR proceeds as normal. If either the change info or file size is different from the server's cached values, or the file was already marked as modified, then: . update time_modify and time_metadata into file's metadata with current time . encode GETATTR as normal except the file size is encoded with the value returned from CB_GETATTR . mark the file as modified If the CB_GETATTR fails for any reasons, the delegation is recalled and NFS4ERR_DELAY is returned for the GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:32 -05:00
Chuck Lever	b910544a5a	NFSD: Document the phases of CREATE_SESSION As described in RFC 8881 Section 18.36.4, CREATE_SESSION can be split into four phases. NFSD's implementation now does it like that description. Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:31 -05:00
Chuck Lever	e4469c6cc6	NFSD: Fix the NFSv4.1 CREATE_SESSION operation RFC 8881 Section 18.36.4 discusses the implementation of the NFSv4.1 CREATE_SESSION operation. The section defines four phases of operation. Phase 2 processes the CREATE_SESSION sequence ID. As a separate step, Phase 3 evaluates the CREATE_SESSION arguments. The problem we are concerned with is when phase 2 is successful but phase 3 fails. The spec language in this case is "No changes are made to any client records on the server." RFC 8881 Section 18.35.4 defines a "client record", and it does /not/ contain any details related to the special CREATE_SESSION slot. Therefore NFSD is incorrect to skip incrementing the CREATE_SESSION sequence id when phase 3 (see Section 18.36.4) of CREATE_SESSION processing fails. In other words, even though NFSD happens to store the cs_slot in a client record, in terms of the protocol the slot is logically separate from the client record. Three complications: 1. The world has moved on since commit `86c3e16cc7` ("nfsd4: confirm only on succesful create_session") broke this. So we can't simply revert that commit. 2. NFSD's CREATE_SESSION implementation does not cleanly delineate the logic of phases 2 and 3. So this won't be a surgical fix. 3. Because of the way it currently handles the CREATE_SESSION slot sequence number, nfsd4_create_session() isn't caching error responses in the CREATE_SESSION slot. Instead of replaying the response cache in those cases, it's executing the transaction again. Reorganize the CREATE_SESSION slot sequence number accounting. This requires that error responses are appropriately cached in the CREATE_SESSION slot (once it is found). Reported-by: Connor Smith <connor.smith@hitachivantara.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218382 Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:31 -05:00
Kunwu Chan	649e58d593	nfsd: Simplify the allocation of slab caches in nfsd4_init_slabs Use the new KMEM_CACHE() macro instead of direct kmem_cache_create to simplify the creation of SLAB caches. Make the code cleaner and more readable. Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:24 -05:00
NeilBrown	05eda6e757	nfsd: don't call locks_release_private() twice concurrently It is possible for free_blocked_lock() to be called twice concurrently, once from nfsd4_lock() and once from nfsd4_release_lockowner() calling remove_blocked_locks(). This is why a kref was added. It is perfectly safe for locks_delete_block() and kref_put() to be called in parallel as they use locking or atomicity respectively as protection. However locks_release_private() has no locking. It is safe for it to be called twice sequentially, but not concurrently. This patch moves that call from free_blocked_lock() where it could race with itself, to free_nbl() where it cannot. This will slightly delay the freeing of private info or release of the owner - but not by much. It is arguably more natural for this freeing to happen in free_nbl() where the structure itself is freed. This bug was found by code inspection - it has not been seen in practice. Fixes: `47446d74f1` ("nfsd4: add refcount for nfsd4_blocked_lock") Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:23 -05:00
NeilBrown	1e33e1414b	nfsd: allow layout state to be admin-revoked. When there is layout state on a filesystem that is being "unlocked" that is now revoked, which involves closing the nfsd_file and releasing the vfs lease. To avoid races, ->ls_file can now be accessed either: - under ->fi_lock for the state's sc_file or - under rcu_read_lock() if nfsd_file_get() is used. To support this, ->fence_client and nfsd4_cb_layout_fail() now take a second argument being the nfsd_file. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:22 -05:00
NeilBrown	06efa66750	nfsd: allow delegation state ids to be revoked and then freed Revoking state through 'unlock_filesystem' now revokes any delegation states found. When the stateids are then freed by the client, the revoked stateids will be cleaned up correctly. As there is already support for revoking delegations, we build on that for admin-revoking. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:22 -05:00
NeilBrown	39657c7406	nfsd: allow open state ids to be revoked and then freed Revoking state through 'unlock_filesystem' now revokes any open states found. When the stateids are then freed by the client, the revoked stateids will be cleaned up correctly. Possibly the related lock states should be revoked too, but a subsequent patch will do that for all lock state on the superblock. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:21 -05:00
NeilBrown	1c13bf9f2e	nfsd: allow lock state ids to be revoked and then freed Revoking state through 'unlock_filesystem' now revokes any lock states found. When the stateids are then freed by the client, the revoked stateids will be cleaned up correctly. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:21 -05:00
NeilBrown	d688d8585e	nfsd: allow admin-revoked NFSv4.0 state to be freed. For NFSv4.1 and later the client easily discovers if there is any admin-revoked state and will then find and explicitly free it. For NFSv4.0 there is no such mechanism. The client can only find that state is admin-revoked if it tries to use that state, and there is no way for it to explicitly free the state. So the server must hold on to the stateid (at least) for an indefinite amount of time. A RELEASE_LOCKOWNER request might justify forgetting some of these stateids, as would the whole clients lease lapsing, but these are not reliable. This patch takes two approaches. Whenever a client uses an revoked stateid, that stateid is then discarded and will not be recognised again. This might confuse a client which expect to get NFS4ERR_ADMIN_REVOKED consistently once it get it at all, but should mostly work. Hopefully one error will lead to other resources being closed (e.g. process exits), which will result in more stateid being freed when a CLOSE attempt gets NFS4ERR_ADMIN_REVOKED. Also, any admin-revoked stateids that have been that way for more than one lease time are periodically revoke. No actual freeing of state happens in this patch. That will come in future patches which handle the different sorts of revoked state. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:21 -05:00
NeilBrown	11b2cfbf6c	nfsd: report in /proc/fs/nfsd/clients/*/states when state is admin-revoke Add "admin-revoked" to the status information for any states that have been admin-revoked. This can be useful for confirming correct behaviour. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:20 -05:00
NeilBrown	39e1be6471	nfsd: allow state with no file to appear in /proc/fs/nfsd/clients/*/states Change the "show" functions to show some content even if a file cannot be found. This is the case for admin-revoked state. This is primarily useful for debugging - to ensure states are being removed eventually. So change several seq_printf() to seq_puts(). Some of these are needed to keep checkpatch happy. Others were done for consistency. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:20 -05:00
NeilBrown	1ac3629bf0	nfsd: prepare for supporting admin-revocation of state The NFSv4 protocol allows state to be revoked by the admin and has error codes which allow this to be communicated to the client. This patch - introduces a new state-id status SC_STATUS_ADMIN_REVOKED which can be set on open, lock, or delegation state. - reports NFS4ERR_ADMIN_REVOKED when these are accessed - introduces a per-client counter of these states and returns SEQ4_STATUS_ADMIN_STATE_REVOKED when the counter is not zero. Decrements this when freeing any admin-revoked state. - introduces stub code to find all interesting states for a given superblock so they can be revoked via the 'unlock_filesystem' file in /proc/fs/nfsd/ No actual states are handled yet. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:19 -05:00
NeilBrown	3f29cc82a8	nfsd: split sc_status out of sc_type sc_type identifies the type of a state - open, lock, deleg, layout - and also the status of a state - closed or revoked. This is a bit untidy and could get worse when "admin-revoked" states are added. So clean it up. With this patch, the type is now all that is stored in sc_type. This is zero when the state is first added to ->cl_stateids (causing it to be ignored), and is then set appropriately once it is fully initialised. It is set under ->cl_lock to ensure atomicity w.r.t lookup. It is now never cleared. sc_type is still a bit-set even though at most one bit is set. This allows lookup functions to be given a bitmap of acceptable types. sc_type is now an unsigned short rather than char. There is no value in restricting to just 8 bits. All the constants now start SC_TYPE_ matching the field in which they are stored. Keeping the existing names and ensuring clear separation from non-type flags would have required something like NFS4_STID_TYPE_CLOSED which is cumbersome. The "NFS4" prefix is redundant was they only appear in NFS4 code, so remove that and change STID to SC to match the field. The status is stored in a separate unsigned short named "sc_status". It has two flags: SC_STATUS_CLOSED and SC_STATUS_REVOKED. CLOSED combines NFS4_CLOSED_STID, NFS4_CLOSED_DELEG_STID, and is used for SC_TYPE_LOCK and SC_TYPE_LAYOUT instead of setting the sc_type to zero. These flags are only ever set, never cleared. For deleg stateids they are set under the global state_lock. For open and lock stateids they are set under ->cl_lock. For layout stateids they are set under ->ls_lock nfs4_unhash_stid() has been removed, and we never set sc_type = 0. This was only used for LOCK and LAYOUT stids and they now use SC_STATUS_CLOSED. Also TRACE_DEFINE_NUM() calls for the various STID #define have been removed because these things are not enums, and so that call is incorrect. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:19 -05:00
NeilBrown	83e733161f	nfsd: avoid race after unhash_delegation_locked() NFS4_CLOSED_DELEG_STID and NFS4_REVOKED_DELEG_STID are similar in purpose. REVOKED is used for NFSv4.1 states which have been revoked because the lease has expired. CLOSED is used in other cases. The difference has two practical effects. 1/ REVOKED states are on the ->cl_revoked list 2/ REVOKED states result in nfserr_deleg_revoked from nfsd4_verify_open_stid() and nfsd4_validate_stateid while CLOSED states result in nfserr_bad_stid. Currently a state that is being revoked is first set to "CLOSED" in unhash_delegation_locked(), then possibly to "REVOKED" in revoke_delegation(), at which point it is added to the cl_revoked list. It is possible that a stateid test could see the CLOSED state which really should be REVOKED, and so return the wrong error code. So it is safest to remove this window of inconsistency. With this patch, unhash_delegation_locked() always sets the state correctly, and revoke_delegation() no longer changes the state. Also remove a redundant test on minorversion when NFS4_REVOKED_DELEG_STID is seen - it can only be seen when minorversion is non-zero. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:19 -05:00
NeilBrown	c6540026df	nfsd: don't call functions with side-effecting inside WARN_ON() Code like: WARN_ON(foo()) looks like an assertion and might not be expected to have any side effects. When testing if a function with side-effects fails a construct like if (foo()) WARN_ON(1); makes the intent more obvious. nfsd has several WARN_ON calls where the test has side effects, so it would be good to change them. These cases don't really need the WARN_ON. They have never failed in 8 years of usage so let's just remove the WARN_ON wrapper. Suggested-by: Chuck Lever <chuck.lever@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:18 -05:00
NeilBrown	779457285a	nfsd: hold ->cl_lock for hash_delegation_locked() The protocol for creating a new state in nfsd is to allocate the state leaving it largely uninitialised, add that state to the ->cl_stateids idr so as to reserve a state-id, then complete initialisation of the state and only set ->sc_type to non-zero once the state is fully initialised. If a state is found in the idr with ->sc_type == 0, it is ignored. The ->cl_lock lock is used to avoid races - it is held while checking sc_type during lookup, and held when a non-zero value is stored in ->sc_type. ... except... hash_delegation_locked() finalises the initialisation of a delegation state, but does NOT hold ->cl_lock. So this patch takes ->cl_lock at the appropriate time w.r.t other locks, and so ensures there are no races (which are extremely unlikely in any case). As ->fi_lock is often taken when ->cl_lock is held, we need to take ->cl_lock first of those two. Currently ->cl_lock and state_lock are never both taken at the same time. We need both for this patch so an arbitrary choice is needed concerning which to take first. As state_lock is more global, it might be more contended, so take it first. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:18 -05:00
NeilBrown	6b4ca49dc3	nfsd: remove stale comment in nfs4_show_deleg() As we do now support write delegations, this comment is unhelpful and misleading. Reported-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:17 -05:00
Chuck Lever	f52f1975b1	NFSD: Add nfsd_seq4_status trace event Add a trace point that records SEQ4_STATUS flags returned in an NFSv4.1 SEQUENCE response. SEQ4_STATUS flags report backchannel issues and changes to lease state to clients. Knowing what the server is reporting to clients is useful for debugging both configuration and operational issues in real time. For example, upcoming patches will enable server administrators to revoke parts of a client's lease; that revocation is indicated to the client when a subsequent SEQUENCE operation has one or more SEQ4_STATUS flags that are set. Sample trace records: nfsd-927 [006] 615.581821: nfsd_seq4_status: xid=0x095ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT nfsd-927 [006] 615.588043: nfsd_seq4_status: xid=0x0a5ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT nfsd-928 [003] 615.588448: nfsd_seq4_status: xid=0x0b5ded07 sessionid=65a032c3:b7845faf:00000001:00000000 status_flags=BACKCHANNEL_FAULT Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Benjamin Coddington <bcodding@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:13 -05:00
Josef Bacik	4b14885411	nfsd: make all of the nfsd stats per-network namespace We have a global set of counters that we modify for all of the nfsd operations, but now that we're exposing these stats across all network namespaces we need to make the stats also be per-network namespace. We already have some caching stats that are per-network namespace, so move these definitions into the same counter and then adjust all the helpers and users of these stats to provide the appropriate nfsd_net struct so that the stats are maintained for the per-network namespace objects. Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-03-01 09:12:10 -05:00
NeilBrown	5ea9a7c5fe	nfsd: don't take fi_lock in nfsd_break_deleg_cb() A recent change to check_for_locks() changed it to take ->flc_lock while holding ->fi_lock. This creates a lock inversion (reported by lockdep) because there is a case where ->fi_lock is taken while holding ->flc_lock. ->flc_lock is held across ->fl_lmops callbacks, and nfsd_break_deleg_cb() is one of those and does take ->fi_lock. However it doesn't need to. Prior to v4.17-rc1~110^2~22 ("nfsd: create a separate lease for each delegation") nfsd_break_deleg_cb() would walk the ->fi_delegations list and so needed the lock. Since then it doesn't walk the list and doesn't need the lock. Two actions are performed under the lock. One is to call nfsd_break_one_deleg which calls nfsd4_run_cb(). These doesn't act on the nfs4_file at all, so don't need the lock. The other is to set ->fi_had_conflict which is in the nfs4_file. This field is only ever set here (except when initialised to false) so there is no possible problem will multiple threads racing when setting it. The field is tested twice in nfs4_set_delegation(). The first test does not hold a lock and is documented as an opportunistic optimisation, so it doesn't impose any need to hold ->fi_lock while setting ->fi_had_conflict. The second test in nfs4_set_delegation() is make under ->fi_lock, so removing the locking when ->fi_had_conflict is set could make a change. The change could only be interesting if ->fi_had_conflict tested as false even though nfsd_break_one_deleg() ran before ->fi_lock was unlocked. i.e. while hash_delegation_locked() was running. As hash_delegation_lock() doesn't interact in any way with nfs4_run_cb() there can be no importance to this interaction. So this patch removes the locking from nfsd_break_one_deleg() and moves the final test on ->fi_had_conflict out of the locked region to make it clear that locking isn't important to the test. It is still tested after vfs_setlease() has succeeded. This might be significant and as vfs_setlease() takes ->flc_lock, and nfsd_break_one_deleg() is called under ->flc_lock this "after" is a true ordering provided by a spinlock. Fixes: `edcf972515` ("nfsd: fix RELEASE_LOCKOWNER") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-02-05 09:49:47 -05:00
Jeff Layton	7b8001013d	filelock: don't do security checks on nfsd setlease calls Zdenek reported seeing some AVC denials due to nfsd trying to set delegations: type=AVC msg=audit(09.11.2023 09:03:46.411:496) : avc: denied { lease } for pid=5127 comm=rpc.nfsd capability=lease scontext=system_u:system_r:nfsd_t:s0 tcontext=system_u:system_r:nfsd_t:s0 tclass=capability permissive=0 When setting delegations on behalf of nfsd, we don't want to do all of the normal capabilty and LSM checks. nfsd is a kernel thread and runs with CAP_LEASE set, so the uid checks end up being a no-op in most cases anyway. Some nfsd functions can end up running in normal process context when tearing down the server. At that point, the CAP_LEASE check can fail and cause the client to not tear down delegations when expected. Also, the way the per-fs ->setlease handlers work today is a little convoluted. The non-trivial ones are wrappers around generic_setlease, so when they fail due to permission problems they usually they end up doing a little extra work only to determine that they can't set the lease anyway. It would be more efficient to do those checks earlier. Transplant the permission checking from generic_setlease to vfs_setlease, which will make the permission checking happen earlier on filesystems that have a ->setlease operation. Add a new kernel_setlease function that bypasses these checks, and switch nfsd to use that instead of vfs_setlease. There is one behavioral change here: prior this patch the setlease_notifier would fire even if the lease attempt was going to fail the security checks later. With this change, it doesn't fire until the caller has passed them. I think this is a desirable change overall. nfsd is the only user of the setlease_notifier and it doesn't benefit from being notified about failed attempts. Cc: Ondrej Mosnáček <omosnacek@gmail.com> Reported-by: Zdenek Pytela <zpytela@redhat.com> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2248830 Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240205-bz2248830-v1-1-d0ec0daecba1@kernel.org Acked-by: Tom Talpey <tom@talpey.com> Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:53:03 +01:00
Jeff Layton	c69ff40719	filelock: split leases out of struct file_lock Add a new struct file_lease and move the lease-specific fields from struct file_lock to it. Convert the appropriate API calls to take struct file_lease instead, and convert the callers to use them. There is zero overlap between the lock manager operations for file locks and the ones for file leases, so split the lease-related operations off into a new lease_manager_operations struct. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-47-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:44 +01:00
Jeff Layton	05580bbfc6	nfsd: adapt to breakup of struct file_lock Most of the existing APIs have remained the same, but subsystems that access file_lock fields directly need to reach into struct file_lock_core now. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-42-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:43 +01:00
Jeff Layton	60f3154d19	nfsd: convert to using new filelock helpers Convert to using the new file locking helper functions. Also, in later patches we're going to introduce some macros with names that clash with the variable names in nfsd4_lock. Rename them. Signed-off-by: Jeff Layton <jlayton@kernel.org> Link: https://lore.kernel.org/r/20240131-flsplit-v3-12-c6129007ee8d@kernel.org Reviewed-by: NeilBrown <neilb@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>	2024-02-05 13:11:36 +01:00
NeilBrown	edcf972515	nfsd: fix RELEASE_LOCKOWNER The test on so_count in nfsd4_release_lockowner() is nonsense and harmful. Revert to using check_for_locks(), changing that to not sleep. First: harmful. As is documented in the kdoc comment for nfsd4_release_lockowner(), the test on so_count can transiently return a false positive resulting in a return of NFS4ERR_LOCKS_HELD when in fact no locks are held. This is clearly a protocol violation and with the Linux NFS client it can cause incorrect behaviour. If RELEASE_LOCKOWNER is sent while some other thread is still processing a LOCK request which failed because, at the time that request was received, the given owner held a conflicting lock, then the nfsd thread processing that LOCK request can hold a reference (conflock) to the lock owner that causes nfsd4_release_lockowner() to return an incorrect error. The Linux NFS client ignores that NFS4ERR_LOCKS_HELD error because it never sends NFS4_RELEASE_LOCKOWNER without first releasing any locks, so it knows that the error is impossible. It assumes the lock owner was in fact released so it feels free to use the same lock owner identifier in some later locking request. When it does reuse a lock owner identifier for which a previous RELEASE failed, it will naturally use a lock_seqid of zero. However the server, which didn't release the lock owner, will expect a larger lock_seqid and so will respond with NFS4ERR_BAD_SEQID. So clearly it is harmful to allow a false positive, which testing so_count allows. The test is nonsense because ... well... it doesn't mean anything. so_count is the sum of three different counts. 1/ the set of states listed on so_stateids 2/ the set of active vfs locks owned by any of those states 3/ various transient counts such as for conflicting locks. When it is tested against '2' it is clear that one of these is the transient reference obtained by find_lockowner_str_locked(). It is not clear what the other one is expected to be. In practice, the count is often 2 because there is precisely one state on so_stateids. If there were more, this would fail. In my testing I see two circumstances when RELEASE_LOCKOWNER is called. In one case, CLOSE is called before RELEASE_LOCKOWNER. That results in all the lock states being removed, and so the lockowner being discarded (it is removed when there are no more references which usually happens when the lock state is discarded). When nfsd4_release_lockowner() finds that the lock owner doesn't exist, it returns success. The other case shows an so_count of '2' and precisely one state listed in so_stateid. It appears that the Linux client uses a separate lock owner for each file resulting in one lock state per lock owner, so this test on '2' is safe. For another client it might not be safe. So this patch changes check_for_locks() to use the (newish) find_any_file_locked() so that it doesn't take a reference on the nfs4_file and so never calls nfsd_file_put(), and so never sleeps. With this check is it safe to restore the use of check_for_locks() rather than testing so_count against the mysterious '2'. Fixes: `ce3c4ad7f4` ("NFSD: Fix possible sleep during nfsd4_release_lockowner()") Signed-off-by: NeilBrown <neilb@suse.de> Reviewed-by: Jeff Layton <jlayton@kernel.org> Cc: stable@vger.kernel.org # v6.2+ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-01-24 09:49:11 -05:00
Dan Carpenter	3c86e615d1	nfsd: remove unnecessary NULL check We check "state" for NULL on the previous line so it can't be NULL here. No need to check again. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202312031425.LffZTarR-lkp@intel.com/ Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2024-01-07 17:54:26 -05:00
Linus Torvalds	ac1c13e257	nfsd-6.7 fixes: - Address a few recently-introduced issues -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmWB+F4ACgkQM2qzM29m f5dZKBAAoAxQA9FTIcb+uMIviGvMFBylKv9mhQxKSkhWhhP9XX2ESXAl+6pcBQJP ikjWw09WO/6ttvxg4hY7exk6FgkDuCALnx1xB6azEq5Ndf4T2aauuFUTEERfQ8th mtGSKEX1a98bGlkBIVQgfr3VFbQ0MRwcr0iGKfbHC3X5GiuFZOHn5fFE+0LjXWyJ 7MjI0Du3lXDlrG5mJ88T5ySMcbaOWBTzqlMY3kbwcf1dWy3TZ6uh6faXqbemrDPg Zixj6JGi+oLlrbYYjQV1Sm5QLW3e882QCQ0U9g2sOEjmLRN/hbCE90i6l1rVEe9a E1ZkAY5qtl+iFK6cbAYUt6lOcZaF8lNeEtBW5JhcFm7f7CJAUSmMb05klZ6rwpXR l6UDwQtnAbmghqAwuKaWdfxys/yqFTvKNJRida7+qK1fs8H8q0y/acKsIv/5PoWz cJvWXova4wIT7Q3xneaYXhHk1aWTQTooErQ0VrgnsE0Ch7Vc4sONhQ3SLajllChD x3JL0DgDJ6mO/JaFZCfIy3ihEaprV0uL3bqgXaad1SszEWJt4HCCE1fQpI0cSbXH Vyq5H0R74EoLh8TGc6dbJ5wZwU81P6Rc0nHz/VU+gUPrEeNYKv8P2bahnXfR7Nsc F93sl2DqQOEv/Q/h/9UolwL7AGl6TkzdPKEoYKNafagUWT031y0= =F1lB -----END PGP SIGNATURE----- Merge tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Address a few recently-introduced issues * tag 'nfsd-6.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: SUNRPC: Revert `5f7fc5d69f` NFSD: Revert `738401a9bd` NFSD: Revert `6c41d9a9bd` nfsd: hold nfsd_mutex across entire netlink operation nfsd: call nfsd_last_thread() before final nfsd_put()	2023-12-20 11:16:50 -08:00
Chuck Lever	862bee84d7	NFSD: Revert `6c41d9a9bd` For some reason, the wait_on_bit() in nfsd4_deleg_getattr_conflict() is waiting forever, preventing a clean server shutdown. The requesting client might also hang waiting for a reply to the conflicting GETATTR. Invoking wait_on_bit() in an nfsd thread context is a hazard. The correct fix is to replace this wait_on_bit() call site with a mechanism that defers the conflicting GETATTR until the CB_GETATTR completes or is known to have failed. That will require some surgery and extended testing and it's late in the v6.7-rc cycle, so I'm reverting now in favor of trying again in a subsequent kernel release. This is my fault: I should have recognized the ramifications of calling wait_on_bit() in here before accepting this patch. Thanks to Dai Ngo <dai.ngo@oracle.com> for diagnosing the issue. Reported-by: Wolfgang Walter <linux-nfs@stwm.de> Closes: https://lore.kernel.org/linux-nfs/e3d43ecdad554fbdcaa7181833834f78@stwm.de/ Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-12-18 11:22:16 -05:00
Linus Torvalds	bb28378af3	nfsd-6.7 fixes: - Fix several long-standing bugs in the duplicate reply cache - Fix a memory leak -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmVY6iwACgkQM2qzM29m f5fDqhAAnsHqZNG2I6asqh/5pPoqcp7kXkEe1l+6jr4jz/00R0lg+oLbsC6/S6eY tzGVkQtIkl2OpL8lt4JTgUL/xiO2JaqbdiIuHelnT62l97r7kbKWJ0ALHahLafiX hCQdJWOdud86kZ5x6/cYVfqyO08bhqDLUFvWd7zSLmzW/9U3bG4v6yXvHT3qqnnE dCJtuM9+DUfnDJKHe6+BFIobkyta8+Tpsg4QSgSAu4hg+dTcqtPCOxMeT+YwgNQd uZY1xiIjPLkufsBF86xzC3tyoFNaZc5QhIwv7ZBtmzNUw3906SbuST9hJiWeHeWq m7p0YeWDJrygiyIrvYxv6NDUCqnkoOxbuKTUAniTEj1SsE2gcQCfij0bU1OSRk6r CpT8TdJ6j2zP78+xxMsSIiA/gR7uJtCKs7LABru7DX25+sDkKK2Te9+PXINarJ1k fraeDeuXMQ1cu71WRXUJ3QKGn1/bC8DYGHFVQqcB+gVXqcm3BuKNYGt01nFyCp5s +1jL+lRxUjPydI2J1VKw5g+5jW9rBgLOT0T9xlr+TZDnYADBAakwpbujrBH5Ey+W BswGoNzqsW9B0U4N6o8OP0FA6IxcddX6Uan2czienmqj217w4AaPPT/vuA5EyC9W zNzBAi87rnaqQs4LnWiS09K+RvYIedQl1lJIzwIdQZfnMKmCSEI= =GOU4 -----END PGP SIGNATURE----- Merge tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix several long-standing bugs in the duplicate reply cache - Fix a memory leak * tag 'nfsd-6.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: Fix checksum mismatches in the duplicate reply cache NFSD: Fix "start of NFS reply" pointer passed to nfsd_cache_update() NFSD: Update nfsd_cache_append() to use xdr_stream nfsd: fix file memleak on client_opens_release	2023-11-18 11:23:32 -08:00
Mahmoud Adam	bc1b5acb40	nfsd: fix file memleak on client_opens_release seq_release should be called to free the allocated seq_file Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Mahmoud Adam <mngyadam@amazon.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: `78599c42ae` ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown <neilb@suse.de> Tested-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-11-17 15:12:39 -05:00
Linus Torvalds	ecae0bd517	Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series "Fixes and cleanups to compaction". - Joel Fernandes has a patchset ("Optimize mremap during mutual alignment within PMD") which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested. - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series "Do not try to access unaccepted memory" Adrian Hunter provides some fixups for the recently-added "unaccepted memory' feature. To increase the feature's checking coverage. "Plug a few gaps where RAM is exposed without checking if it is unaccepted memory". - In the series "cleanups for lockless slab shrink" Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code. - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series "use refcount+RCU method to implement lockless slab shrink". - David Hildenbrand contributes some maintenance work for the rmap code in the series "Anon rmap cleanups". - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series "mm: migrate: more folio conversion and unification". - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series "Add and use bdev_getblk()". - In the series "Use nth_page() in place of direct struct page manipulation" Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames. - In the series "mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO" has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use. - Matthew Wilcox has sent the series "Small hugetlb cleanups" - code rationalization and folio conversions in the hugetlb code. - Yin Fengwei has improved mlock()'s handling of large folios in the series "support large folio for mlock" - In the series "Expose swapcache stat for memcg v1" Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2. - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named "MDWE without inheritance". - Kefeng Wang has provided the series "mm: convert numa balancing functions to use a folio" which does what it says. - In the series "mm/ksm: add fork-exec support for prctl" Stefan Roesch makes is possible for a process to propagate KSM treatment across exec(). - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use "high bandwidth memory" in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named "memory tiering: calculate abstract distance based on ACPI HMAT" - In the series "Smart scanning mode for KSM" Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans. - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series "mm: memcg: fix tracking of pending stats updates values". - In the series "Implement IOCTL to get and optionally clear info about PTEs" Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU. - Hugh Dickins contributed the series "shmem,tmpfs: general maintenance" - a bunch of relatively minor maintenance tweaks to this code. - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series "Handle more faults under the VMA lock". Some rationalizations of the fault path became possible as a result. - In the series "mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()" David Hildenbrand has implemented some cleanups and folio conversions. - In the series "various improvements to the GUP interface" Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements. - Andrey Konovalov has sent along the series "kasan: assorted fixes and improvements" which does those things. - Some page allocator maintenance work from Kemeng Shi in the series "Two minor cleanups to break_down_buddy_pages". - In thes series "New selftest for mm" Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults. - In the series "Add folio_end_read" Matthew Wilcox provides cleanups and an optimization to the core pagecache code. - Nhat Pham has added memcg accounting for hugetlb memory in the series "hugetlb memcg accounting". - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series "Abstract vma_merge() and split_vma()". - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series "Fix page_owner's use of free timestamps". - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series "permit write-sealed memfd read-only shared mappings". - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series "Batch hugetlb vmemmap modification operations". - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series "Finish the create_empty_buffers() transition". - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series "mm: PCP high auto-tuning". - Roman Gushchin has contributed the patchset "mm: improve performance of accounted kernel memory allocations" which improves their performance by ~30% as measured by a micro-benchmark. - folio conversions from Kefeng Wang in the series "mm: convert page cpupid functions to folios". - Some kmemleak fixups in Liu Shixin's series "Some bugfix about kmemleak". - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series "handle memoryless nodes more appropriately". - khugepaged conversions from Vishal Moola in the series "Some khugepaged folio conversions". -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZULEMwAKCRDdBJ7gKXxA jhQHAQCYpD3g849x69DmHnHWHm/EHQLvQmRMDeYZI+nx/sCJOwEAw4AKg0Oemv9y FgeUPAD1oasg6CP+INZvCj34waNxwAc= =E+Y4 -----END PGP SIGNATURE----- Merge tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "Many singleton patches against the MM code. The patch series which are included in this merge do the following: - Kemeng Shi has contributed some compation maintenance work in the series 'Fixes and cleanups to compaction' - Joel Fernandes has a patchset ('Optimize mremap during mutual alignment within PMD') which fixes an obscure issue with mremap()'s pagetable handling during a subsequent exec(), based upon an implementation which Linus suggested - More DAMON/DAMOS maintenance and feature work from SeongJae Park i the following patch series: mm/damon: misc fixups for documents, comments and its tracepoint mm/damon: add a tracepoint for damos apply target regions mm/damon: provide pseudo-moving sum based access rate mm/damon: implement DAMOS apply intervals mm/damon/core-test: Fix memory leaks in core-test mm/damon/sysfs-schemes: Do DAMOS tried regions update for only one apply interval - In the series 'Do not try to access unaccepted memory' Adrian Hunter provides some fixups for the recently-added 'unaccepted memory' feature. To increase the feature's checking coverage. 'Plug a few gaps where RAM is exposed without checking if it is unaccepted memory' - In the series 'cleanups for lockless slab shrink' Qi Zheng has done some maintenance work which is preparation for the lockless slab shrinking code - Qi Zheng has redone the earlier (and reverted) attempt to make slab shrinking lockless in the series 'use refcount+RCU method to implement lockless slab shrink' - David Hildenbrand contributes some maintenance work for the rmap code in the series 'Anon rmap cleanups' - Kefeng Wang does more folio conversions and some maintenance work in the migration code. Series 'mm: migrate: more folio conversion and unification' - Matthew Wilcox has fixed an issue in the buffer_head code which was causing long stalls under some heavy memory/IO loads. Some cleanups were added on the way. Series 'Add and use bdev_getblk()' - In the series 'Use nth_page() in place of direct struct page manipulation' Zi Yan has fixed a potential issue with the direct manipulation of hugetlb page frames - In the series 'mm: hugetlb: Skip initialization of gigantic tail struct pages if freed by HVO' has improved our handling of gigantic pages in the hugetlb vmmemmep optimizaton code. This provides significant boot time improvements when significant amounts of gigantic pages are in use - Matthew Wilcox has sent the series 'Small hugetlb cleanups' - code rationalization and folio conversions in the hugetlb code - Yin Fengwei has improved mlock()'s handling of large folios in the series 'support large folio for mlock' - In the series 'Expose swapcache stat for memcg v1' Liu Shixin has added statistics for memcg v1 users which are available (and useful) under memcg v2 - Florent Revest has enhanced the MDWE (Memory-Deny-Write-Executable) prctl so that userspace may direct the kernel to not automatically propagate the denial to child processes. The series is named 'MDWE without inheritance' - Kefeng Wang has provided the series 'mm: convert numa balancing functions to use a folio' which does what it says - In the series 'mm/ksm: add fork-exec support for prctl' Stefan Roesch makes is possible for a process to propagate KSM treatment across exec() - Huang Ying has enhanced memory tiering's calculation of memory distances. This is used to permit the dax/kmem driver to use 'high bandwidth memory' in addition to Optane Data Center Persistent Memory Modules (DCPMM). The series is named 'memory tiering: calculate abstract distance based on ACPI HMAT' - In the series 'Smart scanning mode for KSM' Stefan Roesch has optimized KSM by teaching it to retain and use some historical information from previous scans - Yosry Ahmed has fixed some inconsistencies in memcg statistics in the series 'mm: memcg: fix tracking of pending stats updates values' - In the series 'Implement IOCTL to get and optionally clear info about PTEs' Peter Xu has added an ioctl to /proc/<pid>/pagemap which permits us to atomically read-then-clear page softdirty state. This is mainly used by CRIU - Hugh Dickins contributed the series 'shmem,tmpfs: general maintenance', a bunch of relatively minor maintenance tweaks to this code - Matthew Wilcox has increased the use of the VMA lock over file-backed page faults in the series 'Handle more faults under the VMA lock'. Some rationalizations of the fault path became possible as a result - In the series 'mm/rmap: convert page_move_anon_rmap() to folio_move_anon_rmap()' David Hildenbrand has implemented some cleanups and folio conversions - In the series 'various improvements to the GUP interface' Lorenzo Stoakes has simplified and improved the GUP interface with an eye to providing groundwork for future improvements - Andrey Konovalov has sent along the series 'kasan: assorted fixes and improvements' which does those things - Some page allocator maintenance work from Kemeng Shi in the series 'Two minor cleanups to break_down_buddy_pages' - In thes series 'New selftest for mm' Breno Leitao has developed another MM self test which tickles a race we had between madvise() and page faults - In the series 'Add folio_end_read' Matthew Wilcox provides cleanups and an optimization to the core pagecache code - Nhat Pham has added memcg accounting for hugetlb memory in the series 'hugetlb memcg accounting' - Cleanups and rationalizations to the pagemap code from Lorenzo Stoakes, in the series 'Abstract vma_merge() and split_vma()' - Audra Mitchell has fixed issues in the procfs page_owner code's new timestamping feature which was causing some misbehaviours. In the series 'Fix page_owner's use of free timestamps' - Lorenzo Stoakes has fixed the handling of new mappings of sealed files in the series 'permit write-sealed memfd read-only shared mappings' - Mike Kravetz has optimized the hugetlb vmemmap optimization in the series 'Batch hugetlb vmemmap modification operations' - Some buffer_head folio conversions and cleanups from Matthew Wilcox in the series 'Finish the create_empty_buffers() transition' - As a page allocator performance optimization Huang Ying has added automatic tuning to the allocator's per-cpu-pages feature, in the series 'mm: PCP high auto-tuning' - Roman Gushchin has contributed the patchset 'mm: improve performance of accounted kernel memory allocations' which improves their performance by ~30% as measured by a micro-benchmark - folio conversions from Kefeng Wang in the series 'mm: convert page cpupid functions to folios' - Some kmemleak fixups in Liu Shixin's series 'Some bugfix about kmemleak' - Qi Zheng has improved our handling of memoryless nodes by keeping them off the allocation fallback list. This is done in the series 'handle memoryless nodes more appropriately' - khugepaged conversions from Vishal Moola in the series 'Some khugepaged folio conversions'" [ bcachefs conflicts with the dynamically allocated shrinkers have been resolved as per Stephen Rothwell in https://lore.kernel.org/all/20230913093553.4290421e@canb.auug.org.au/ with help from Qi Zheng. The clone3 test filtering conflict was half-arsed by yours truly ] * tag 'mm-stable-2023-11-01-14-33' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (406 commits) mm/damon/sysfs: update monitoring target regions for online input commit mm/damon/sysfs: remove requested targets when online-commit inputs selftests: add a sanity check for zswap Documentation: maple_tree: fix word spelling error mm/vmalloc: fix the unchecked dereference warning in vread_iter() zswap: export compression failure stats Documentation: ubsan: drop "the" from article title mempolicy: migration attempt to match interleave nodes mempolicy: mmap_lock is not needed while migrating folios mempolicy: alloc_pages_mpol() for NUMA policy without vma mm: add page_rmappable_folio() wrapper mempolicy: remove confusing MPOL_MF_LAZY dead code mempolicy: mpol_shared_policy_init() without pseudo-vma mempolicy trivia: use pgoff_t in shared mempolicy tree mempolicy trivia: slightly more consistent naming mempolicy trivia: delete those ancient pr_debug()s mempolicy: fix migrate_pages(2) syscall return nr_failed kernfs: drop shared NUMA mempolicy hooks hugetlbfs: drop shared NUMA mempolicy pretence mm/damon/sysfs-test: add a unit test for damon_sysfs_set_targets() ...	2023-11-02 19:38:47 -10:00
Sicong Huang	2ffda63c98	NFSD: clean up alloc_init_deleg() Modify the conditional statement for null pointer check in the function 'alloc_init_deleg' to make this function more robust and clear. Otherwise, this function may have potential pointer dereference problem in the future, when modifying or expanding the nfs4_delegation structure. Signed-off-by: Sicong Huang <huangsicong@iie.ac.cn> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:40 -04:00
KaiLong Wang	03a0497f83	nfsd: Clean up errors in nfs4state.c Fix the following errors reported by checkpatch: ERROR: spaces required around that '=' (ctx:VxW) ERROR: space required after that ',' (ctx:VxO) ERROR: space required before that '~' (ctx:OxV) Signed-off-by: KaiLong Wang <wangkailong@jari.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:38 -04:00
Chuck Lever	e4ad7ce775	NFSD: Add nfsd4_encode_open_read_delegation4() Refactor nfsd4_encode_open() so the open_read_delegation4 type is encoded in a separate function. This makes it more straightforward to later add support for returning an nfsace4 in OPEN responses that offer a delegation. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:31 -04:00
Chuck Lever	92d82e995e	NFSD: Remove a layering violation when encoding lock_denied An XDR encoder is responsible for marshaling results, not releasing memory that was allocated by the upper layer. We have .op_release for that purpose. Move the release of the ld_owner.data string to op_release functions for LOCK and LOCKT. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:30 -04:00
Dai Ngo	6c41d9a9bd	NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the request is handled as below: Server sends CB_GETATTR to client to get the latest change info and file size. If these values are the same as the server's cached values then the GETATTR proceeds as normal. If either the change info or file size is different from the server's cached values, or the file was already marked as modified, then: . update time_modify and time_metadata into file's metadata with current time . encode GETATTR as normal except the file size is encoded with the value returned from CB_GETATTR . mark the file as modified If the CB_GETATTR fails for any reasons, the delegation is recalled and NFS4ERR_DELAY is returned for the GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:09 -04:00
Alexander Aring	2dd10de8e6	lockd: introduce safe async lock op This patch reverts mostly commit `40595cdc93` ("nfs: block notification on fs with its own ->lock") and introduces an EXPORT_OP_ASYNC_LOCK export flag to signal that the "own ->lock" implementation supports async lock requests. The only main user is DLM that is used by GFS2 and OCFS2 filesystem. Those implement their own lock() implementation and return FILE_LOCK_DEFERRED as return value. Since commit `40595cdc93` ("nfs: block notification on fs with its own ->lock") the DLM implementation were never updated. This patch should prepare for DLM to set the EXPORT_OP_ASYNC_LOCK export flag and update the DLM plock implementation regarding to it. Acked-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Alexander Aring <aahringo@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-10-16 12:44:01 -04:00
Qi Zheng	d17452aa33	nfsd: dynamically allocate the nfsd-client shrinker In preparation for implementing lockless slab shrink, use new APIs to dynamically allocate the nfsd-client shrinker, so that it can be freed asynchronously via RCU. Then it doesn't need to wait for RCU read-side critical section when releasing the struct nfsd_net. Link: https://lkml.kernel.org/r/20230911094444.68966-33-zhengqi.arch@bytedance.com Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com> Acked-by: Chuck Lever <chuck.lever@oracle.com> Acked-by: Jeff Layton <jlayton@kernel.org> Cc: Jeff Layton <jlayton@kernel.org> Cc: Olga Kornievskaia <kolga@netapp.com> Cc: Dai Ngo <Dai.Ngo@oracle.com> Cc: Tom Talpey <tom@talpey.com> Cc: Abhinav Kumar <quic_abhinavk@quicinc.com> Cc: Alasdair Kergon <agk@redhat.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Cc: Andreas Gruenbacher <agruenba@redhat.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Bob Peterson <rpeterso@redhat.com> Cc: Borislav Petkov <bp@alien8.de> Cc: Carlos Llamas <cmllamas@google.com> Cc: Chandan Babu R <chandan.babu@oracle.com> Cc: Chao Yu <chao@kernel.org> Cc: Chris Mason <clm@fb.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Christian Koenig <christian.koenig@amd.com> Cc: Chuck Lever <cel@kernel.org> Cc: Coly Li <colyli@suse.de> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: "Darrick J. Wong" <djwong@kernel.org> Cc: Dave Chinner <david@fromorbit.com> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: David Airlie <airlied@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Sterba <dsterba@suse.com> Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org> Cc: Gao Xiang <hsiangkao@linux.alibaba.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Huang Rui <ray.huang@amd.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jani Nikula <jani.nikula@linux.intel.com> Cc: Jan Kara <jack@suse.cz> Cc: Jason Wang <jasowang@redhat.com> Cc: Jeffle Xu <jefflexu@linux.alibaba.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Juergen Gross <jgross@suse.com> Cc: Kent Overstreet <kent.overstreet@gmail.com> Cc: Kirill Tkhai <tkhai@ya.ru> Cc: Marijn Suijten <marijn.suijten@somainline.org> Cc: "Michael S. Tsirkin" <mst@redhat.com> Cc: Mike Snitzer <snitzer@kernel.org> Cc: Minchan Kim <minchan@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Muchun Song <songmuchun@bytedance.com> Cc: Nadav Amit <namit@vmware.com> Cc: Neil Brown <neilb@suse.de> Cc: Oleksandr Tyshchenko <oleksandr_tyshchenko@epam.com> Cc: Paul E. McKenney <paulmck@kernel.org> Cc: Richard Weinberger <richard@nod.at> Cc: Rob Clark <robdclark@gmail.com> Cc: Rob Herring <robh@kernel.org> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Sean Paul <sean@poorly.run> Cc: Sergey Senozhatsky <senozhatsky@chromium.org> Cc: Song Liu <song@kernel.org> Cc: Stefano Stabellini <sstabellini@kernel.org> Cc: Steven Price <steven.price@arm.com> Cc: "Theodore Ts'o" <tytso@mit.edu> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Tomeu Vizoso <tomeu.vizoso@collabora.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Cc: Yue Hu <huyue2@coolpad.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2023-10-04 10:32:25 -07:00
Dai Ngo	1d3dd1d56c	NFSD: Enable write delegation support This patch grants write delegations for OPEN with NFS4_SHARE_ACCESS_WRITE if there is no conflict with other OPENs. Write delegation conflicts with another OPEN, REMOVE, RENAME and SETATTR are handled the same as read delegation using notify_change, try_break_deleg. The NFSv4.0 protocol does not enable a server to determine that a conflicting GETATTR originated from the client holding the delegation versus coming from some other client. With NFSv4.1 and later, the SEQUENCE operation that begins each COMPOUND contains a client ID, so delegation recall can be safely squelched in this case. With NFSv4.0, however, the server must recall or send a CB_GETATTR (per RFC 7530 Section 16.7.5) even when the GETATTR originates from the client holding that delegation. An NFSv4.0 client can trigger a pathological situation if it always sends a DELEGRETURN preceded by a conflicting GETATTR in the same COMPOUND. COMPOUND execution will always stop at the GETATTR and the DELEGRETURN will never get executed. The server eventually revokes the delegation, which can result in loss of open or lock state. Tracepoint added to track whether read or write delegation is granted. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-08-29 17:45:22 -04:00
Dai Ngo	fd19ca36fd	NFSD: handle GETATTR conflict with write delegation If the GETATTR request on a file that has write delegation in effect and the request attributes include the change info and size attribute then the write delegation is recalled. If the delegation is returned within 30ms then the GETATTR is serviced as normal otherwise the NFS4ERR_DELAY error is returned for the GETATTR. Add counter for write delegation recall due to conflict GETATTR. This is used to evaluate the need to implement CB_GETATTR to adoid recalling the delegation with conflit GETATTR. Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-08-29 17:45:22 -04:00
Benjamin Coddington	3b816601e2	nfsd: Fix race to FREE_STATEID and cl_revoked We have some reports of linux NFS clients that cannot satisfy a linux knfsd server that always sets SEQ4_STATUS_RECALLABLE_STATE_REVOKED even though those clients repeatedly walk all their known state using TEST_STATEID and receive NFS4_OK for all. Its possible for revoke_delegation() to set NFS4_REVOKED_DELEG_STID, then nfsd4_free_stateid() finds the delegation and returns NFS4_OK to FREE_STATEID. Afterward, revoke_delegation() moves the same delegation to cl_revoked. This would produce the observed client/server effect. Fix this by ensuring that the setting of sc_type to NFS4_REVOKED_DELEG_STID and move to cl_revoked happens within the same cl_lock. This will allow nfsd4_free_stateid() to properly remove the delegation from cl_revoked. Link: https://bugzilla.redhat.com/show_bug.cgi?id=2217103 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2176575 Signed-off-by: Benjamin Coddington <bcodding@redhat.com> Cc: stable@vger.kernel.org # v4.17+ Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-08-04 11:38:33 -04:00
Trond Myklebust	f75546f58a	nfsd: Remove incorrect check in nfsd4_validate_stateid If the client is calling TEST_STATEID, then it is because some event occurred that requires it to check all the stateids for validity and call FREE_STATEID on the ones that have been revoked. In this case, either the stateid exists in the list of stateids associated with that nfs4_client, in which case it should be tested, or it does not. There are no additional conditions to be considered. Reported-by: "Frank Ch. Eigler" <fche@redhat.com> Fixes: `7df302f75e` ("NFSD: TEST_STATEID should not return NFS4ERR_STALE_STATEID") Cc: stable@vger.kernel.org # v5.7+ Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-07-18 11:34:09 -04:00
Linus Torvalds	9fc2f99030	NFSD 6.3 Release Notes Two significant security enhancements are part of this release: * NFSD's RPC header encoding and decoding, including RPCSEC GSS and gssproxy header parsing, has been overhauled to make it more memory-safe. * Support for Kerberos AES-SHA2-based encryption types has been added for both the NFS client and server. This provides a clean path for deprecating and removing insecure encryption types based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with Kerberos may now be used on systems with fips enabled. In addition to these, NFSD is now able to handle crossing into an auto-mounted mount point on an exported NFS mount. A number of fixes have been made to NFSD's server-side copy implementation. RPC metrics have been converted to per-CPU variables. This helps reduce unnecessary cross-CPU and cross-node memory bus traffic, and significantly reduces noise when KCSAN is enabled. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmPzgiYACgkQM2qzM29m f5dB2A//eqjpj+FgAN+UjygrwMC4ahAsPX3Sc3FG8/lTAiao3NFVFY2gxAiCPyVE CFk+tUyfL23oXvbyfIBe3LhxSBOf621xU6up2OzqAzJqh1Q9iUWB6as3I14to8ZU sWpxXo5ofwk1hzkbrvOAVkyfY0emwsr00iBeWMawkpBe8FZEQA31OYj3/xHr6bBI zEVlZPBZAZlp0DZ74tb+bBLs/EOnqKj+XLWcogCH13JB3sn2umF6cQNkYgsxvHGa TNQi4LEdzWZGme242LfBRiGGwm1xuVIjlAhYV/R1wIjaknE3QBzqfXc6lJx74WII HaqpRJGrKqdo7B+1gaXCl/AMS7YluED1CBrxuej0wBG7l2JEB7m2MFMQ4LTQjgsn nrr3P70DgbB4LuPCPyUS7dtsMmUXabIqP7niiCR4T1toH6lBmHAgEi4cFmkzg7Cd EoFzn888mtDpfx4fghcsRWS5oKXEzbPJfu5+IZOD63+UB+NGpi0Xo2s23sJPK8vz kqK/X63JYOUxWUvK0zkj/c/wW1cLqIaBwnSKbShou5/BL+cZVI+uJYrnEesgpoB2 5fh/cZv3hdcoOPO7OfcjCLQYy4J6RCWajptnk/hcS3lMvBTBrnq697iAqCVURDKU Xfmlf7XbBwje+sk4eHgqVGEqqVjrEmoqbmA2OS44WSS5LDvxXdI= =ZG/7 -----END PGP SIGNATURE----- Merge tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd updates from Chuck Lever: "Two significant security enhancements are part of this release: - NFSD's RPC header encoding and decoding, including RPCSEC GSS and gssproxy header parsing, has been overhauled to make it more memory-safe. - Support for Kerberos AES-SHA2-based encryption types has been added for both the NFS client and server. This provides a clean path for deprecating and removing insecure encryption types based on DES and SHA-1. AES-SHA2 is also FIPS-140 compliant, so that NFS with Kerberos may now be used on systems with fips enabled. In addition to these, NFSD is now able to handle crossing into an auto-mounted mount point on an exported NFS mount. A number of fixes have been made to NFSD's server-side copy implementation. RPC metrics have been converted to per-CPU variables. This helps reduce unnecessary cross-CPU and cross-node memory bus traffic, and significantly reduces noise when KCSAN is enabled" * tag 'nfsd-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (121 commits) NFSD: Clean up nfsd_symlink() NFSD: copy the whole verifier in nfsd_copy_write_verifier nfsd: don't fsync nfsd_files on last close SUNRPC: Fix occasional warning when destroying gss_krb5_enctypes nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open NFSD: fix problems with cleanup on errors in nfsd4_copy nfsd: fix race to check ls_layouts nfsd: don't hand out delegation on setuid files being opened for write SUNRPC: Remove ->xpo_secure_port() SUNRPC: Clean up the svc_xprt_flags() macro nfsd: remove fs/nfsd/fault_inject.c NFSD: fix leaked reference count of nfsd4_ssc_umount_item nfsd: clean up potential nfsd_file refcount leaks in COPY codepath nfsd: zero out pointers after putting nfsd_files on COPY setup error SUNRPC: Fix whitespace damage in svcauth_unix.c nfsd: eliminate __nfs4_get_fd nfsd: add some kerneldoc comments for stateid preprocessing functions nfsd: eliminate find_deleg_file_locked nfsd: don't take nfsd4_copy ref for OP_OFFLOAD_STATUS SUNRPC: Add encryption self-tests ...	2023-02-22 14:21:40 -08:00
Linus Torvalds	575a7e0f81	File locking changes for v6.3 -----BEGIN PGP SIGNATURE----- iQJHBAABCAAxFiEES8DXskRxsqGE6vXTAA5oQRlWghUFAmPuC5kTHGpsYXl0b25A a2VybmVsLm9yZwAKCRAADmhBGVaCFdXpEADFN/loTtcANLPQmLmgmJLDuZr2zKrf aziMJRjqGMx6BLdwBgX8/XGBNwpG4tkVbI+zdRoVHkpcayMDpLq0dnrvi79a/dGU fBrI72ZDMd/S9lzbodObHMziLqvgFthsPm9ldVAZ2Kt400KKNE+ozcveiC3yVGy0 n1k5BSt/78abzpqut5whVgJBooHtUMCh3XvBJPKwgOneHfAXCm+jqaXlKKpKlpZj s2OUyn8BLfNkTgpAZ88L5Rkf0mftjziL6C8KOMy1hvOsyiP0IkwLuQ/kO+2H0Ate p3tbOGvUT+n1gYpFYBDLnuWB4G8+CVPxfoO6KGhwT4OlCJpPlNCM8O+w/A/dKn4I 858spkpYPMy91lEkcrRLRkg/MARWGTgZ3k76fp3OWNnfruWd6ekMlYKx9n6CIy34 Aoc3Svy9KeA7oRrbRDltmw3UVmz53GcDo337ZL1J6Jph3s86dMG7AwGYvoDfKuKK b0oNK5db5v50scBnRHX6UWejE5fSnnHvgC7pU57u08odCVEUALB+r8f04vmkxcVJ Qed7lolQdFzn9ddaOXzpg5KeCe/cX3p4IPZSTHad7CPr8gswmC135DfXCr64x2hC 5jyNzKbe/x+7B2xCweHmEk4ojt8IU3UaYxLJoQkNeVr8rEGC9gqZgkSDe7BnTpOf wT0ijzhy2u5RKg== =Zhf3 -----END PGP SIGNATURE----- Merge tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux Pull file locking updates from Jeff Layton: "The main change here is that I've broken out most of the file locking definitions into a new header file. I also went ahead and completed the removal of locks_inode function" * tag 'locks-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux: fs: remove locks_inode filelock: move file locking definitions to separate header file	2023-02-20 11:10:38 -08:00
Jeff Layton	dcd779dc46	nfsd: fix courtesy client with deny mode handling in nfs4_upgrade_open The nested if statements here make no sense, as you can never reach "else" branch in the nested statement. Fix the error handling for when there is a courtesy client that holds a conflicting deny mode. Fixes: `3d69427151` ("NFSD: add support for share reservation conflict to courteous server") Reported-by: 張智諺 <cc85nod@gmail.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Dai Ngo <dai.ngo@oracle.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:57 -05:00
Dai Ngo	81e722978a	NFSD: fix problems with cleanup on errors in nfsd4_copy When nfsd4_copy fails to allocate memory for async_copy->cp_src, or nfs4_init_copy_state fails, it calls cleanup_async_copy to do the cleanup for the async_copy which causes page fault since async_copy is not yet initialized. This patche rearranges the order of initializing the fields in async_copy and adds checks in cleanup_async_copy to skip un-initialized fields. Fixes: `ce0887ac96` ("NFSD add nfs4 inter ssc to nfsd4_copy") Fixes: `87689df694` ("NFSD: Shrink size of struct nfsd4_copy") Signed-off-by: Dai Ngo <dai.ngo@oracle.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:57 -05:00
Jeff Layton	826b67e637	nfsd: don't hand out delegation on setuid files being opened for write We had a bug report that xfstest generic/355 was failing on NFSv4.0. This test sets various combinations of setuid/setgid modes and tests whether DIO writes will cause them to be stripped. What I found was that the server did properly strip those bits, but the client didn't notice because it held a delegation that was not recalled. The recall didn't occur because the client itself was the one generating the activity and we avoid recalls in that case. Clearing setuid bits is an "implicit" activity. The client didn't specifically request that we do that, so we need the server to issue a CB_RECALL, or avoid the situation entirely by not issuing a delegation. The easiest fix here is to simply not give out a delegation if the file is being opened for write, and the mode has the setuid and/or setgid bit set. Note that there is a potential race between the mode and lease being set, so we test for this condition both before and after setting the lease. This patch fixes generic/355, generic/683 and generic/684 for me. (Note that 355 fails only on v4.0, and 683 and 684 require NFSv4.2 to run and fail). Reported-by: Boyang Xue <bxue@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:55 -05:00
Jeff Layton	edd2f5526e	nfsd: eliminate __nfs4_get_fd This is wrapper is pointless, and just obscures what's going on. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:52 -05:00
Jeff Layton	ee97e73015	nfsd: add some kerneldoc comments for stateid preprocessing functions Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:52 -05:00
Jeff Layton	45ba66cc2c	nfsd: eliminate find_deleg_file_locked We really don't need an accessor function here. Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:51 -05:00
Jeff Layton	bd6aaf781d	nfsd: fix potential race in nfs4_find_file The WARN_ON_ONCE check is not terribly useful. It also seems possible for nfs4_find_file to race with the destruction of an fi_deleg_file while trying to take a reference to it. Now that it's safe to pass nfs_get_file a NULL pointer, remove the WARN and NULL pointer check. Take the fi_lock when fetching fi_deleg_file. Cc: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:22 -05:00
Jeff Layton	70f62231cd	nfsd: allow nfsd_file_get to sanely handle a NULL pointer ...and remove some now-useless NULL pointer checks in its callers. Suggested-by: NeilBrown <neilb@suse.de> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-20 09:20:21 -05:00
Linus Torvalds	3402351a5a	nfsd-6.2 fixes: - Fix a teardown bug in the new nfs4_file hashtable -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmPqS30ACgkQM2qzM29m f5fV2BAAgkmq0MF6nTjjVcFYSUyJvPO4jCcT5ZPqJcv2CHCWp/EkhOrBMHtli3y2 zUspjZfHpuV5oi2j7J1/PnlLBbov7oXjg9vZ2xGazo1Zkr5wQHxwyuFXdVzte2Qk fGcCEgLA3aPiluafzWHMYeoZUqVSQCT41UaUW4iauc/byRZmNs0nwVsYlfhWzg1U dsQsK03S357mmqhevxIbBVvTRNakxJwqALBVZcEgfjulESwgQ6gIJpo8srrAkZ4v 2k5bPVVTRhyDCv+8fOe7UqIyW7HwT5WDvMJg7C883wcP8xk7NHqg54t7NVoPNv5R oZLkwUBr5G75OThCmrm8zDMt6zqT76Wyc29fBQ0hjyNml0HOiB131yYv7zGW6qqD 2e+jQpAKDkc6IIy8IzqxgHan9lMvtYfz7uIQjGGXb+vPTF5S7BIdWJjSEvgo/7a7 RNlou48UAF4NeZHcDESXFArcQ0MXd9rBnawxRVDqMzQ3owMyn3Nt8FRx7K/dNGKl 6qnIR+H7G65KQgQCxiqrGvvO3NssCBurv/9BRNRmVshVzI0pdn/seZff4TngvDfI GWvEE7nMFE1jDBrN5o5ecLUVI+zTDuNY3AAnaH7afiRCBQauq+AziCfvXoLtfGw2 +FktTeZRgZQzMFA62H3eEDvy5dKDp/a0ivP1QglIid6oBRH/muw= =pQWv -----END PGP SIGNATURE----- Merge tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fix from Chuck Lever: - Fix a teardown bug in the new nfs4_file hashtable * tag 'nfsd-6.2-6' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: don't destroy global nfs4_file table in per-net shutdown	2023-02-15 11:48:56 -08:00
Jeff Layton	4102db175b	nfsd: don't destroy global nfs4_file table in per-net shutdown The nfs4_file table is global, so shutting it down when a containerized nfsd is shut down is wrong and can lead to double-frees. Tear down the nfs4_file_rhltable in nfs4_state_shutdown instead of nfs4_state_shutdown_net. Fixes: `d47b295e8d` ("NFSD: Use rhashtable for managing nfs4_file objects") Link: https://bugzilla.redhat.com/show_bug.cgi?id=2169017 Reported-by: JianHong Yin <jiyin@redhat.com> Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>	2023-02-11 12:09:44 -05:00
Linus Torvalds	c1649ec557	nfsd-6.2 fixes: - Fix recently introduced use-after-free bugs -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmPG0eQACgkQM2qzM29m f5fjNA/+NTUFM/e0Ol0WKbQisBZudTvpE/G/iFqdWq50JNwVc1mwr2teNaWmyagp o/Sdrp/DX0aWNK+CAb9VcgpuA2tZiCqMGxjw1YwZhPnfn15l/kMX1a7ueWFXU3bz EuYtiDyIvu+9CxYYqBGS6brBmiccjbAL02GaVakc1SiJ4dhn1ZrCnuVXbx8uDwSX /XmKTpW+yGiQ6KpJOPXIe2mx4+hcvOkSjLXteYMMbbMHhMK3zNWpyS/d9X4AJWxA ZFxH6x8Hi+7cDK3doXL3xLX+fbcofC6NKalvDplUff8NhKowWlFTcH84JZnJ8lt1 8lwZ2GAPwxYd54L04mn8adBZjkFo4pF++sjoTltpUES12bN3qtd+zzLC416FGN7a B2mq3DetAFCodNcasb7zW861WtTQRs3a8w9H7DJ39kRRF52ScGVucrfBOHaOmGud T1jrol8kQvjzdQ+wVbMpN14Vf0eqo4Z+bWNekMzM2ROZCnHKy1uHe78iS32oXh6H YiSO9PH82lesomVxEyF32V3DM4HSd3IG5IWcPS0+U2PHWj4N7vzWFJQf/JXJl8mG h6+rX2hI7iW1P585jkNxahAP4A+NCF/kf0J4xgDfeLw8FH8HEfkfaEky2K9wyZaS ZEEb34F7bIGS4jeENjmRbgbmxDgva9yH5qoqK9EhpjMPoHgzBNY= =qZ7R -----END PGP SIGNATURE----- Merge tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: - Fix recently introduced use-after-free bugs * tag 'nfsd-6.2-4' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: replace delayed_work with work_struct for nfsd_client_shrinker NFSD: register/unregister of nfsd-client shrinker at nfsd startup/shutdown time NFSD: fix use-after-free in nfsd4_ssc_setup_dul()	2023-01-17 09:29:17 -08:00

1 2 3 4 5 ...

1356 Commits