Commit Graph

32 Commits

Author SHA1 Message Date
Shyam Prasad N
3edc68de56 cifs: add new field to track the last access time of cfid
The handlecache code today tracks the time at which the dir lease was
acquired and the laundromat thread uses that to check for old
entries to clean up.

However, if a directory is actively accessed, it should not
be chosen to expire first.

This change adds a new last_access_time field to cfid and
uses that to decide expiry of the cfid.
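
A minimal sketch of the idea, not the actual patch: expiry is decided
from the last access time rather than from the lease acquisition time.
The struct, the function names and the tick-based timeout below are
assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cached-directory entry (cfid). */
struct cached_dir_sketch {
	unsigned long time;             /* when the dir lease was acquired */
	unsigned long last_access_time; /* when the entry was last used */
};

/* Record a use of the entry; `now` is a monotonic tick count. */
static void cached_dir_touch(struct cached_dir_sketch *cfid, unsigned long now)
{
	assert(cfid != NULL);
	cfid->last_access_time = now;
}

/* Expire only when the entry has been idle for more than `timeout` ticks. */
static bool cached_dir_expired(const struct cached_dir_sketch *cfid,
			       unsigned long now, unsigned long timeout)
{
	assert(cfid != NULL);
	return now - cfid->last_access_time > timeout;
}
```

With this shape, an entry that was opened long ago but touched recently
is skipped by the expiry check, while an idle one is still collected.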

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-07-27 16:59:59 -05:00
Bharath SM
2aaf178483 smb: change return type of cached_dir_lease_break() to bool
cached_dir_lease_break() has a return type of int but only
returns true or false. Change the return type of this function
to bool for clarity.

Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-07-27 16:59:59 -05:00
Paul Aurich
a2182743a8 smb: Log an error when close_all_cached_dirs fails
Under low-memory conditions, close_all_cached_dirs() can't move the
dentries to a separate list to dput() them once the locks are dropped.
This will result in a "Dentry still in use" error, so add an error
message that makes it clear this is what happened:

[  495.281119] CIFS: VFS: \\otters.example.com\share Out of memory while dropping dentries
[  495.281595] ------------[ cut here ]------------
[  495.281887] BUG: Dentry ffff888115531138{i=78,n=/}  still in use (2) [unmount of cifs cifs]
[  495.282391] WARNING: CPU: 1 PID: 2329 at fs/dcache.c:1536 umount_check+0xc8/0xf0

Also, bail out of looping through all tcons as soon as a single
allocation fails, since we're already in trouble, and kmalloc() attempts
for subsequent tcons are likely to fail just like the first one did.
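
The bail-out behaviour can be sketched as below. This is an
illustration only: the allocator hook, the function names and the
return convention are assumptions, and the real code works on tcons and
dentries rather than on opaque slots.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator hook so that a failure can be simulated. */
typedef void *(*alloc_fn)(size_t size);

/* Simulated allocator that always fails, as under memory pressure. */
static void *failing_alloc(size_t size)
{
	(void)size;
	return NULL;
}

/*
 * Try one allocation per item and stop at the first failure instead of
 * attempting the remaining items. Returns the number of successful
 * allocations and sets *failed when the loop bailed out early.
 */
static size_t alloc_until_failure(void **slots, size_t count, size_t size,
				  alloc_fn alloc, bool *failed)
{
	size_t i;

	assert(slots != NULL && alloc != NULL && failed != NULL);
	*failed = false;
	for (i = 0; i < count; i++) {
		slots[i] = alloc(size);
		if (slots[i] == NULL) {
			*failed = true; /* the caller logs the error once */
			break;          /* later attempts would likely fail too */
		}
	}
	return i;
}
```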

Signed-off-by: Paul Aurich <paul@darkrain42.org>
Acked-by: Bharath SM <bharathsm@microsoft.com>
Suggested-by: Ruben Devos <rdevos@oxya.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-21 11:03:25 -05:00
Linus Torvalds
6d9b5f6b81 13 smb3/cifs client fixes

Merge tag 'v6.16-rc-part1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull smb client updates from Steve French:

 - multichannel fixes (mostly reconnect related), and clarification of
   locking documentation

 - automount null pointer check fix

 - fixes to add support for ParentLeaseKey

 - minor cleanup

 - smb1/cifs fixes

* tag 'v6.16-rc-part1-smb-client-fixes' of git://git.samba.org/sfrench/cifs-2.6:
  cifs: update the lock ordering comments with new mutex
  cifs: dns resolution is needed only for primary channel
  cifs: update dstaddr whenever channel iface is updated
  cifs: reset connections for all channels when reconnect requested
  smb: client: use ParentLeaseKey in cifs_do_create
  smb: client: use ParentLeaseKey in open_cached_dir
  smb: client: add ParentLeaseKey support
  cifs: Fix cifs_query_path_info() for Windows NT servers
  cifs: Fix validation of SMB1 query reparse point response
  cifs: Correctly set SMB1 SessionKey field in Session Setup Request
  cifs: Fix encoding of SMB1 Session Setup NTLMSSP Request in non-UNICODE mode
  smb: client: add NULL check in automount_fullpath
  smb: client: Remove an unused function and variable
2025-06-03 16:04:29 -07:00
Henrique Carvalho
0f220d30c6 smb: client: use ParentLeaseKey in open_cached_dir
Implement ParentLeaseKey logic in open_cached_dir() by looking up the
parent cfid, copying its lease key into the fid struct, and setting
the appropriate lease flag.
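
A rough sketch of the copy-and-flag step described above. The structs
and the function name are hypothetical simplifications; only the 16-byte
lease key size and the flag value (SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET
in MS-SMB2) are taken from the protocol specification. Byte order
handling and the parent lookup are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LEASE_KEY_SIZE 16
/* Value of SMB2_LEASE_FLAG_PARENT_LEASE_KEY_SET in MS-SMB2. */
#define PARENT_LEASE_KEY_SET 0x00000004u

/* Hypothetical, simplified views of the parent cfid and the fid. */
struct parent_dir_sketch {
	uint8_t lease_key[LEASE_KEY_SIZE];
};

struct fid_sketch {
	uint8_t  lease_key[LEASE_KEY_SIZE];
	uint8_t  parent_lease_key[LEASE_KEY_SIZE];
	uint32_t lease_flags;
};

/* Copy the parent's lease key into the fid and flag it as present. */
static void fid_set_parent_lease_key(struct fid_sketch *fid,
				     const struct parent_dir_sketch *parent)
{
	assert(fid != NULL);
	if (parent == NULL)
		return; /* no cached parent: leave key and flag untouched */
	memcpy(fid->parent_lease_key, parent->lease_key, LEASE_KEY_SIZE);
	fid->lease_flags |= PARENT_LEASE_KEY_SET;
}
```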

Fixes: f047390a09 ("CIFS: Add create lease v2 context for SMB3")
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-06-01 20:46:45 -05:00
Linus Torvalds
6d5b940e1e vfs-6.16-rc1.async.dir

Merge tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull vfs directory lookup updates from Christian Brauner:
 "This contains cleanups for the lookup_one*() family of helpers.

  We expose a set of functions with names containing "lookup_one_len"
  and others without the "_len". This difference has nothing to do with
  "len". It's rather a historical accident that can be confusing.

  The functions without "_len" take a "mnt_idmap" pointer. This is found
  in the "vfsmount" and that is an important question when choosing
  which to use: do you have a vfsmount, or are you "inside" the
  filesystem. A related question is "is permission checking relevant
  here?".

  nfsd and cachefiles *do* have a vfsmount but *don't* use the non-_len
  functions. They pass nop_mnt_idmap and refuse to work on filesystems
  which have any other idmap.

  This work changes nfsd and cachefiles to use the lookup_one family of
  functions and to explicitly pass &nop_mnt_idmap which is consistent
  with all other vfs interfaces used where &nop_mnt_idmap is explicitly
  passed.

  The remaining uses of the "_one" functions do not require permission
  checks so these are renamed to be "_noperm" and the permission
  checking is removed.

  This series also changes these lookup functions to take a qstr instead
  of separate name and len. In many cases this simplifies the call"

* tag 'vfs-6.16-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  VFS: change lookup_one_common and lookup_noperm_common to take a qstr
  Use try_lookup_noperm() instead of d_hash_and_lookup() outside of VFS
  VFS: rename lookup_one_len family to lookup_noperm and remove permission check
  cachefiles: Use lookup_one() rather than lookup_one_len()
  nfsd: Use lookup_one() rather than lookup_one_len()
  VFS: improve interface for lookup_one functions
2025-05-26 08:02:43 -07:00
Paul Aurich
3ca02e63ed smb: client: Avoid race in open_cached_dir with lease breaks
A pre-existing valid cfid returned from find_or_create_cached_dir might
race with a lease break, meaning open_cached_dir doesn't consider it
valid, and thinks it's newly-constructed. This leaks a dentry reference
if the allocation occurs before the queued lease break work runs.

Avoid the race by holding the cfid_list_lock across
find_or_create_cached_dir and the check of its result.

Cc: stable@vger.kernel.org
Reviewed-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2025-05-07 15:24:46 -05:00
NeilBrown
fa6fe07d15 VFS: rename lookup_one_len family to lookup_noperm and remove permission check
The lookup_one_len family of functions is (now) only used internally by
a filesystem on itself either
- in a context where permission checking is irrelevant such as by a
  virtual filesystem populating itself, or xfs accessing its ORPHANAGE
  or dquota accessing the quota file; or
- in a context where a permission check (MAY_EXEC on the parent) has just
  been performed such as a network filesystem finding a "silly-rename"
  file in the same directory.  This is also the context after the
  _parentat() functions where currently lookup_one_qstr_excl() is used.

So the permission check is pointless.

The name "one_len" is unhelpful in understanding the purpose of these
functions and should be changed.  Most of the callers pass the len as
"strlen()" so using a qstr and QSTR() can simplify the code.

This patch renames these functions (including lookup_positive_unlocked()
which is part of the family despite the name) to have a name based on
"lookup_noperm".  They are changed to receive a 'struct qstr' instead
of separate name and len.  In a few cases the use of QSTR() results in a
new call to strlen().

try_lookup_noperm() takes a pointer to a qstr instead of the whole
qstr.  This is consistent with d_hash_and_lookup() (which is nearly
identical) and useful for lookup_noperm_unlocked().

The new lookup_noperm_common() doesn't take a qstr yet.  That will be
tidied up in a subsequent patch.
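
The move from separate name and len arguments to a single counted
string can be sketched as follows. The type, the macro and both lookup
functions are hypothetical stand-ins, not the kernel's struct qstr,
QSTR() or lookup_noperm(); the "lookup" here only checks that the name
is a single non-empty path component.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted-string type standing in for a qstr. */
struct name_sketch {
	const char *name;
	size_t len;
};

/* Build a counted string from a NUL-terminated one (calls strlen()). */
#define NAME_SKETCH(s) ((struct name_sketch){ .name = (s), .len = strlen(s) })

/* Old style: the caller passes the name and its length separately. */
static int lookup_by_name_len(const char *name, size_t len)
{
	assert(name != NULL);
	return len > 0 && memchr(name, '/', len) == NULL;
}

/* New style: one argument carries both, so they cannot get out of sync. */
static int lookup_by_counted_name(const struct name_sketch *n)
{
	assert(n != NULL);
	return lookup_by_name_len(n->name, n->len);
}
```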

Signed-off-by: NeilBrown <neil@brown.name>
Link: https://lore.kernel.org/r/20250319031545.2999807-5-neil@brown.name
Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-04-08 11:24:36 +02:00
Paul Aurich
c353ee4fb1 smb: Initialize cfid->tcon before performing network ops
Avoid leaking a tcon ref when a lease break races with opening the
cached directory. Processing the lease break might take a reference to
the tcon in cached_dir_lease_break() and then fail to release the ref in
cached_dir_offload_close, since cfid->tcon is still NULL.

Fixes: ebe98f1447 ("cifs: enable caching of directories for which a lease is held")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-26 23:35:34 -06:00
Paul Aurich
3fa640d035 smb: During unmount, ensure all cached dir instances drop their dentry
The unmount process (cifs_kill_sb() calling close_all_cached_dirs()) can
race with various cached directory operations, which ultimately results
in dentries not being dropped and these kernel BUGs:

BUG: Dentry ffff88814f37e358{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!

This happens when a cfid is in the process of being cleaned up and
has been removed from the cfids->entries list, including:

- Receiving a lease break from the server
- Server reconnection triggers invalidate_all_cached_dirs(), which
  removes all the cfids from the list
- The laundromat thread decides to expire an old cfid.

To solve these problems, dropping the dentry is done in queued work done
in a newly-added cfid_put_wq workqueue, and close_all_cached_dirs()
flushes that workqueue after it drops all the dentries of which it's
aware. This is a global workqueue (rather than scoped to a mount), but
the queued work is minimal.

The final cleanup work for cleaning up a cfid is performed via work
queued in the serverclose_wq workqueue; this is done separate from
dropping the dentries so that close_all_cached_dirs() doesn't block on
any server operations.

Both of these queued works expect to be invoked with a cfid reference and
a tcon reference to prevent those objects from being freed while the work
is ongoing.

While we're here, add proper locking to close_all_cached_dirs(), and
locking around the freeing of cfid->dentry.

Fixes: ebe98f1447 ("cifs: enable caching of directories for which a lease is held")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-26 18:47:08 -06:00
Henrique Carvalho
07bdf9272a smb: client: change return value in open_cached_dir_by_dentry() if !cfids
Change return value from -ENOENT to -EOPNOTSUPP to maintain consistency
with the return value of open_cached_dir() for the same case. This
change is safe as the only calling function does not differentiate
between these return values.

Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-25 13:20:24 -06:00
Henrique Carvalho
f6e8883840 smb: client: remove unnecessary checks in open_cached_dir()
Checks inside open_cached_dir() can be removed because if dir caching is
disabled then tcon->cfids is necessarily NULL. Therefore, all other checks
are redundant.

Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com>
Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.com>
Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-25 13:19:30 -06:00
Paul Aurich
a9685b409a smb: prevent use-after-free due to open_cached_dir error paths
If open_cached_dir() encounters an error parsing the lease from the
server, the error handling may race with receiving a lease break,
resulting in open_cached_dir() freeing the cfid while the queued work is
pending.

Update open_cached_dir() to drop refs rather than directly freeing the
cfid.
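
The difference between dropping a reference and freeing directly can be
sketched with a plain counter. This is a single-threaded illustration
under assumed names; the real code relies on the kernel's reference
counting and on cfid_list_lock, neither of which is modelled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical reference-counted object standing in for a cfid. */
struct ref_obj {
	int refcount;
};

static struct ref_obj *ref_obj_new(void)
{
	struct ref_obj *obj = malloc(sizeof(*obj));

	if (obj != NULL)
		obj->refcount = 1; /* the creator holds the first reference */
	return obj;
}

/* Take an extra reference, e.g. on behalf of queued work. */
static void ref_obj_get(struct ref_obj *obj)
{
	assert(obj != NULL && obj->refcount > 0);
	obj->refcount++;
}

/*
 * Drop one reference; the object is freed only when the last holder
 * lets go. Returns true if this call freed the object.
 */
static bool ref_obj_put(struct ref_obj *obj)
{
	assert(obj != NULL && obj->refcount > 0);
	if (--obj->refcount > 0)
		return false;
	free(obj);
	return true;
}
```

An error path that calls the put function instead of free() cannot
release memory that pending work still holds a reference to.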

Have cached_dir_lease_break(), cfids_laundromat_worker(), and
invalidate_all_cached_dirs() clear has_lease immediately while still
holding cfids->cfid_list_lock, and then use this to also simplify the
reference counting in cfids_laundromat_worker() and
invalidate_all_cached_dirs().

Fixes this KASAN splat (which manually injects an error and lease break
in open_cached_dir()):

==================================================================
BUG: KASAN: slab-use-after-free in smb2_cached_lease_break+0x27/0xb0
Read of size 8 at addr ffff88811cc24c10 by task kworker/3:1/65

CPU: 3 UID: 0 PID: 65 Comm: kworker/3:1 Not tainted 6.12.0-rc6-g255cf264e6e5-dirty #87
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
Workqueue: cifsiod smb2_cached_lease_break
Call Trace:
 <TASK>
 dump_stack_lvl+0x77/0xb0
 print_report+0xce/0x660
 kasan_report+0xd3/0x110
 smb2_cached_lease_break+0x27/0xb0
 process_one_work+0x50a/0xc50
 worker_thread+0x2ba/0x530
 kthread+0x17c/0x1c0
 ret_from_fork+0x34/0x60
 ret_from_fork_asm+0x1a/0x30
 </TASK>

Allocated by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 __kasan_kmalloc+0xaa/0xb0
 open_cached_dir+0xa7d/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Freed by task 2464:
 kasan_save_stack+0x33/0x60
 kasan_save_track+0x14/0x30
 kasan_save_free_info+0x3b/0x60
 __kasan_slab_free+0x51/0x70
 kfree+0x174/0x520
 open_cached_dir+0x97f/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

Last potentially related work creation:
 kasan_save_stack+0x33/0x60
 __kasan_record_aux_stack+0xad/0xc0
 insert_work+0x32/0x100
 __queue_work+0x5c9/0x870
 queue_work_on+0x82/0x90
 open_cached_dir+0x1369/0x1fb0
 smb2_query_path_info+0x43c/0x6e0
 cifs_get_fattr+0x346/0xf10
 cifs_get_inode_info+0x157/0x210
 cifs_revalidate_dentry_attr+0x2d1/0x460
 cifs_getattr+0x173/0x470
 vfs_statx_path+0x10f/0x160
 vfs_statx+0xe9/0x150
 vfs_fstatat+0x5e/0xc0
 __do_sys_newfstatat+0x91/0xf0
 do_syscall_64+0x95/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The buggy address belongs to the object at ffff88811cc24c00
 which belongs to the cache kmalloc-1k of size 1024
The buggy address is located 16 bytes inside of
 freed 1024-byte region [ffff88811cc24c00, ffff88811cc25000)

Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-21 10:45:50 -06:00
Paul Aurich
7afb867336 smb: Don't leak cfid when reconnect races with open_cached_dir
open_cached_dir() may either race with the tcon reconnection even before
compound_send_recv() or directly trigger a reconnection via
SMB2_open_init() or SMB2_query_info_init().

The reconnection process invokes invalidate_all_cached_dirs() via
cifs_mark_open_files_invalid(), which removes all cfids from the
cfids->entries list but doesn't drop a ref if has_lease isn't true. This
results in the currently-being-constructed cfid not being on the list,
but still having a refcount of 2. It leaks if returned from
open_cached_dir().

Fix this by setting cfid->has_lease when the ref is actually taken; the
cfid will not be used by other threads until it has a valid time.

Addresses these kmemleaks:

unreferenced object 0xffff8881090c4000 (size 1024):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 32 bytes):
    00 01 00 00 00 00 ad de 22 01 00 00 00 00 ad de  ........".......
    00 ca 45 22 81 88 ff ff f8 dc 4f 04 81 88 ff ff  ..E"......O.....
  backtrace (crc 6f58c20f):
    [<ffffffff8b895a1e>] __kmalloc_cache_noprof+0x2be/0x350
    [<ffffffff8bda06e3>] open_cached_dir+0x993/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e
unreferenced object 0xffff8881044fdcf8 (size 8):
  comm "bash", pid 1860, jiffies 4295126592
  hex dump (first 8 bytes):
    00 cc cc cc cc cc cc cc                          ........
  backtrace (crc 10c106a9):
    [<ffffffff8b89a3d3>] __kmalloc_node_track_caller_noprof+0x363/0x480
    [<ffffffff8b7d7256>] kstrdup+0x36/0x60
    [<ffffffff8bda0700>] open_cached_dir+0x9b0/0x1fb0
    [<ffffffff8bdaa750>] cifs_readdir+0x15a0/0x1d50
    [<ffffffff8b9a853f>] iterate_dir+0x28f/0x4b0
    [<ffffffff8b9a9aed>] __x64_sys_getdents64+0xfd/0x200
    [<ffffffff8cf6da05>] do_syscall_64+0x95/0x1a0
    [<ffffffff8d00012f>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

And addresses these BUG splats when unmounting the SMB filesystem:

BUG: Dentry ffff888140590ba0{i=1000000000080,n=/}  still in use (2) [unmount of cifs cifs]
WARNING: CPU: 3 PID: 3433 at fs/dcache.c:1536 umount_check+0xd0/0x100
Modules linked in:
CPU: 3 UID: 0 PID: 3433 Comm: bash Not tainted 6.12.0-rc4-g850925a8133c-dirty #49
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:umount_check+0xd0/0x100
Code: 8d 7c 24 40 e8 31 5a f4 ff 49 8b 54 24 40 41 56 49 89 e9 45 89 e8 48 89 d9 41 57 48 89 de 48 c7 c7 80 e7 db ac e8 f0 72 9a ff <0f> 0b 58 31 c0 5a 5b 5d 41 5c 41 5d 41 5e 41 5f e9 2b e5 5d 01 41
RSP: 0018:ffff88811cc27978 EFLAGS: 00010286
RAX: 0000000000000000 RBX: ffff888140590ba0 RCX: ffffffffaaf20bae
RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8881f6fb6f40
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed1023984ee3
R10: ffff88811cc2771f R11: 00000000016cfcc0 R12: ffff888134383e08
R13: 0000000000000002 R14: ffff8881462ec668 R15: ffffffffaceab4c0
FS:  00007f23bfa98740(0000) GS:ffff8881f6f80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556de4a6f808 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 d_walk+0x6a/0x530
 shrink_dcache_for_umount+0x6a/0x200
 generic_shutdown_super+0x52/0x2a0
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
Code: ff ff ff ff c3 66 0f 1f 44 00 00 48 8b 0d 11 93 0d 00 f7 d8 64 89 01 b8 ff ff ff ff eb bf 0f 1f 44 00 00 b8 50 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d e9 92 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007ffee9138598 EFLAGS: 00000246 ORIG_RAX: 0000000000000050
RAX: 0000000000000000 RBX: 0000558f1803e9a0 RCX: 00007f23bfb93ae7
RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000558f1803e9a0
RBP: 0000558f1803e600 R08: 0000000000000007 R09: 0000558f17fab610
R10: d91d5ec34ab757b0 R11: 0000000000000246 R12: 0000000000000001
R13: 0000000000000000 R14: 0000000000000015 R15: 0000000000000000
 </TASK>
irq event stamp: 1163486
hardirqs last  enabled at (1163485): [<ffffffffac98d344>] _raw_spin_unlock_irqrestore+0x34/0x60
hardirqs last disabled at (1163486): [<ffffffffac97dcfc>] __schedule+0xc7c/0x19a0
softirqs last  enabled at (1163482): [<ffffffffab79a3ee>] __smb_send_rqst+0x3de/0x990
softirqs last disabled at (1163480): [<ffffffffac2314f1>] release_sock+0x21/0xf0
---[ end trace 0000000000000000 ]---

VFS: Busy inodes after unmount of cifs (cifs)
------------[ cut here ]------------
kernel BUG at fs/super.c:661!
Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
CPU: 1 UID: 0 PID: 3433 Comm: bash Tainted: G        W          6.12.0-rc4-g850925a8133c-dirty #49
Tainted: [W]=WARN
Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 11/12/2020
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0
Call Trace:
 <TASK>
 kill_anon_super+0x22/0x40
 cifs_kill_sb+0x159/0x1e0
 deactivate_locked_super+0x66/0xe0
 cleanup_mnt+0x140/0x210
 task_work_run+0xfb/0x170
 syscall_exit_to_user_mode+0x29f/0x2b0
 do_syscall_64+0xa1/0x1a0
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7f23bfb93ae7
 </TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:generic_shutdown_super+0x290/0x2a0
Code: e8 15 7c f7 ff 48 8b 5d 28 48 89 df e8 09 7c f7 ff 48 8b 0b 48 89 ee 48 8d 95 68 06 00 00 48 c7 c7 80 7f db ac e8 00 69 af ff <0f> 0b 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 90 90 90 90 90 90
RSP: 0018:ffff88811cc27a50 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffffffae994420 RCX: 0000000000000027
RDX: 0000000000000000 RSI: ffffffffab06180e RDI: ffff8881f6eb18c8
RBP: ffff8881462ec000 R08: 0000000000000001 R09: ffffed103edd6319
R10: ffff8881f6eb18cb R11: 00000000016d3158 R12: ffff8881462ec9c0
R13: ffff8881462ec050 R14: 0000000000000001 R15: 0000000000000000
FS:  00007f23bfa98740(0000) GS:ffff8881f6e80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8364005d68 CR3: 0000000123c80000 CR4: 0000000000350ef0

This reproduces eventually with an SMB mount and two shells running
these loops concurrently:

- while true; do
      cd ~; sleep 1;
      for i in {1..3}; do cd /mnt/test/subdir;
          echo $PWD; sleep 1; cd ..; echo $PWD; sleep 1;
      done;
      echo ...;
  done
- while true; do
      iptables -F OUTPUT; mount -t cifs -a;
      for _ in {0..2}; do ls /mnt/test/subdir/ | wc -l; done;
      iptables -I OUTPUT -p tcp --dport 445 -j DROP;
      sleep 10
      echo "unmounting"; umount -l -t cifs -a; echo "done unmounting";
      sleep 20
      echo "recovering"; iptables -F OUTPUT;
      sleep 10;
  done

Fixes: ebe98f1447 ("cifs: enable caching of directories for which a lease is held")
Fixes: 5c86919455 ("smb: client: fix use-after-free in smb2_query_info_compound()")
Cc: stable@vger.kernel.org
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-21 10:45:50 -06:00
Paul Aurich
128630e1db smb: cached directories can be more than root file handle
Update this log message since cached fids may represent things other
than the root of a mount.

Fixes: e4029e0726 ("cifs: find and use the dentry for cached non-root directories also")
Signed-off-by: Paul Aurich <paul@darkrain42.org>
Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-21 10:41:35 -06:00
Steve French
28e0947651 smb3: fix Open files on server counter going negative
We were decrementing the count of open files on server twice
for the case where we were closing cached directories.

Fixes: 8e843bf38f ("cifs: return a single-use cfid if we did not get a lease")
Cc: stable@vger.kernel.org
Acked-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-11 16:02:02 -05:00
Ritvik Budhiraja
173217bd73 smb3: retrying on failed server close
In the current implementation, CIFS close sends a close to the
server and does not check for the success of the server close.
This patch adds functionality to check for server close return
status and retries in case of an EBUSY or EAGAIN error.

This can help avoid handle leaks.

Cc: stable@vger.kernel.org
Signed-off-by: Ritvik Budhiraja <rbudhiraja@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-04-03 14:43:24 -05:00
Eugene Korenevsky
f1b8224b4e cifs: open_cached_dir(): add FILE_READ_EA to desired access
Since smb2_query_eas() reads EA and uses cached directory,
open_cached_dir() should request FILE_READ_EA access.

Otherwise listxattr() and getxattr() will fail with EACCES
(0xc0000022 STATUS_ACCESS_DENIED SMB status).

Link: https://bugzilla.kernel.org/show_bug.cgi?id=218543
Cc: stable@vger.kernel.org
Signed-off-by: Eugene Korenevsky <ekorenevsky@astralinux.ru>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-03-14 13:54:27 -05:00
Shyam Prasad N
79520587fe cifs: update the same create_guid on replay
File open requests made to the server contain a
CreateGuid, which is used by the server to identify
the open request. If the same request needs to be
replayed, it needs to be sent with the same CreateGuid
in the durable handle v2 context.

Without doing so, we could end up leaking handles on
the server when:
1. multichannel is used AND
2. connection goes down, but not for all channels

This is because the replayed open request would have a
new CreateGuid and the server will treat this as a new
request and open a new handle.

This change fixes this by reusing the existing create_guid
stored in the cached fid struct.

REF: MS-SMB2 4.9 Replay Create Request on an Alternate Channel

Fixes: 4f1fffa237 ("cifs: commands that are retried should have replay flag set")
Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-02-11 19:07:08 -06:00
Shyam Prasad N
4f1fffa237 cifs: commands that are retried should have replay flag set
MS-SMB2 states that the header flag SMB2_FLAGS_REPLAY_OPERATION
needs to be set when a command needs to be retried, so that
the server is aware that this is a replay for an operation that
appeared before.

This can be very important, for example, for state changing
operations and opens which get retried following a reconnect,
since the client may be unaware of the status of the previous
open.

This is particularly important for the multichannel scenario, since
disconnection of one connection does not mean that the session
is lost. The requests can be replayed on another channel.

This change also makes use of exponential back-off before replays
and also limits the number of retries to the value of the "retrans"
mount option.
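
A minimal sketch of a bounded retry loop with exponential back-off. The
callback types, the 1 ms starting delay and the function names are
assumptions for illustration; setting the replay flag on the resent
request is not modelled, and only -EAGAIN is treated as retryable here.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical operation callback: returns 0 or a negative errno. */
typedef int (*op_fn)(void *ctx);

/* Hypothetical sleep hook so that the delay can be observed or stubbed. */
typedef void (*sleep_fn)(unsigned int msecs);

/* Example operation: fails with -EAGAIN until *remaining reaches zero. */
static int flaky_op(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -EAGAIN;
	}
	return 0;
}

/* Example sleep hook that only records the requested delay. */
static unsigned int total_slept_ms;

static void record_sleep(unsigned int msecs)
{
	total_slept_ms += msecs;
}

/*
 * Run `op`, retrying on -EAGAIN at most `max_retries` times and doubling
 * the delay before each retry.
 */
static int run_with_replay(op_fn op, void *ctx, int max_retries,
			   sleep_fn sleep_ms)
{
	unsigned int delay_ms = 1;
	int retries = 0;
	int rc;

	assert(op != NULL && sleep_ms != NULL);
	for (;;) {
		rc = op(ctx);
		if (rc != -EAGAIN || retries >= max_retries)
			return rc;
		sleep_ms(delay_ms);
		delay_ms *= 2;
		retries++;
	}
}
```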

Also, this change does not modify the read/write codepath.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-23 20:23:29 -06:00
Shyam Prasad N
64cc377b76 cifs: helper function to check replayable error codes
The error code to check for replay is not just -EAGAIN. In some
cases, the send request or receive response may result in
network errors, which we're now mapping to -ECONNABORTED.

This change introduces a helper function which checks
if the error returned is one of the above two errors.
All checks for replays will now use this helper.
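
Such a helper can be sketched in a few lines. The function names below
are chosen for illustration and are not claimed to match the patch; the
second function is a hypothetical caller that combines the check with a
retry limit.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* True for the two error codes described above as replayable. */
static bool error_is_replayable(int error)
{
	return error == -EAGAIN || error == -ECONNABORTED;
}

/* Hypothetical caller: replay only while retries remain. */
static bool should_replay(int error, int retries, int max_retries)
{
	assert(retries >= 0 && max_retries >= 0);
	return retries < max_retries && error_is_replayable(error);
}
```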

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-23 20:23:28 -06:00
Shyam Prasad N
936eba9cfb cifs: open_cached_dir should not rely on primary channel
open_cached_dir today selects ses->server, a.k.a. the primary channel,
to send requests. When multichannel is used, the primary
channel may be down. So it does not make sense to rely only
on that channel.

This fix makes this function pick a channel with the standard
helper function cifs_pick_channel.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-01-18 21:01:04 -06:00
Paulo Alcantara
af1689a9b7 smb: client: fix potential OOBs in smb2_parse_contexts()
Validate offsets and lengths before dereferencing create contexts in
smb2_parse_contexts().
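
The kind of validation meant here can be sketched as a bounds-checked
walk over a chain of create contexts. This is not the kernel's
smb2_parse_contexts(); the function names are hypothetical, and only the
16-byte header layout (Next, NameOffset, NameLength, Reserved,
DataOffset, DataLength) follows MS-SMB2. Offsets are taken relative to
the start of each context.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Fixed create context header size in MS-SMB2. */
#define CTX_HDR_SIZE 16u

static uint16_t get_le16(const uint8_t *p)
{
	return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Return true only if every context in the chain, including its name
 * and data, lies inside buf[0..len). *count receives the number of
 * contexts accepted before the walk stopped.
 */
static bool create_contexts_valid(const uint8_t *buf, size_t len,
				  size_t *count)
{
	size_t off = 0;

	assert(buf != NULL && count != NULL);
	*count = 0;
	for (;;) {
		size_t remaining = len - off; /* off <= len always holds */
		const uint8_t *cc = buf + off;
		uint32_t next, data_len;
		uint16_t name_off, name_len, data_off;

		if (remaining < CTX_HDR_SIZE)
			return false; /* header would run past the buffer */
		next = get_le32(cc);
		name_off = get_le16(cc + 4);
		name_len = get_le16(cc + 6);
		data_off = get_le16(cc + 10);
		data_len = get_le32(cc + 12);
		if ((uint64_t)name_off + name_len > remaining)
			return false;
		if ((uint64_t)data_off + data_len > remaining)
			return false;
		(*count)++;
		if (next == 0)
			return true; /* last context in the chain */
		if (next < CTX_HDR_SIZE || next > remaining)
			return false;
		off += next;
	}
}
```

Every field is checked against the bytes that remain before anything is
dereferenced through it, so a server-supplied length cannot push a read
past the end of the response.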

This fixes the following oops when accessing invalid create contexts from
server:

  BUG: unable to handle page fault for address: ffff8881178d8cc3
  #PF: supervisor read access in kernel mode
  #PF: error_code(0x0000) - not-present page
  PGD 4a01067 P4D 4a01067 PUD 0
  Oops: 0000 [#1] PREEMPT SMP NOPTI
  CPU: 3 PID: 1736 Comm: mount.cifs Not tainted 6.7.0-rc4 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
  rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  RIP: 0010:smb2_parse_contexts+0xa0/0x3a0 [cifs]
  Code: f8 10 75 13 48 b8 93 ad 25 50 9c b4 11 e7 49 39 06 0f 84 d2 00
  00 00 8b 45 00 85 c0 74 61 41 29 c5 48 01 c5 41 83 fd 0f 76 55 <0f> b7
  7d 04 0f b7 45 06 4c 8d 74 3d 00 66 83 f8 04 75 bc ba 04 00
  RSP: 0018:ffffc900007939e0 EFLAGS: 00010216
  RAX: ffffc90000793c78 RBX: ffff8880180cc000 RCX: ffffc90000793c90
  RDX: ffffc90000793cc0 RSI: ffff8880178d8cc0 RDI: ffff8880180cc000
  RBP: ffff8881178d8cbf R08: ffffc90000793c22 R09: 0000000000000000
  R10: ffff8880180cc000 R11: 0000000000000024 R12: 0000000000000000
  R13: 0000000000000020 R14: 0000000000000000 R15: ffffc90000793c22
  FS: 00007f873753cbc0(0000) GS:ffff88806bc00000(0000)
  knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: ffff8881178d8cc3 CR3: 00000000181ca000 CR4: 0000000000750ef0
  PKRU: 55555554
  Call Trace:
   <TASK>
   ? __die+0x23/0x70
   ? page_fault_oops+0x181/0x480
   ? search_module_extables+0x19/0x60
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? exc_page_fault+0x1b6/0x1c0
   ? asm_exc_page_fault+0x26/0x30
   ? smb2_parse_contexts+0xa0/0x3a0 [cifs]
   SMB2_open+0x38d/0x5f0 [cifs]
   ? smb2_is_path_accessible+0x138/0x260 [cifs]
   smb2_is_path_accessible+0x138/0x260 [cifs]
   cifs_is_path_remote+0x8d/0x230 [cifs]
   cifs_mount+0x7e/0x350 [cifs]
   cifs_smb3_do_mount+0x128/0x780 [cifs]
   smb3_get_tree+0xd9/0x290 [cifs]
   vfs_get_tree+0x2c/0x100
   ? capable+0x37/0x70
   path_mount+0x2d7/0xb80
   ? srso_alias_return_thunk+0x5/0xfbef5
   ? _raw_spin_unlock_irqrestore+0x44/0x60
   __x64_sys_mount+0x11a/0x150
   do_syscall_64+0x47/0xf0
   entry_SYSCALL_64_after_hwframe+0x6f/0x77
  RIP: 0033:0x7f8737657b1e

Reported-by: Robert Morris <rtm@csail.mit.edu>
Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-12-11 12:30:39 -06:00
Paulo Alcantara
5c86919455 smb: client: fix use-after-free in smb2_query_info_compound()
The following UAF was triggered when running fstests generic/072 with
KASAN enabled against Windows Server 2022 and mount options
'multichannel,max_channels=2,vers=3.1.1,mfsymlinks,noperm':

  BUG: KASAN: slab-use-after-free in smb2_query_info_compound+0x423/0x6d0 [cifs]
  Read of size 8 at addr ffff888014941048 by task xfs_io/27534

  CPU: 0 PID: 27534 Comm: xfs_io Not tainted 6.6.0-rc7 #1
  Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-3-gd478f380-rebuilt.opensuse.org 04/01/2014
  Call Trace:
   dump_stack_lvl+0x4a/0x80
   print_report+0xcf/0x650
   ? srso_alias_return_thunk+0x5/0x7f
   ? srso_alias_return_thunk+0x5/0x7f
   ? __phys_addr+0x46/0x90
   kasan_report+0xda/0x110
   ? smb2_query_info_compound+0x423/0x6d0 [cifs]
   ? smb2_query_info_compound+0x423/0x6d0 [cifs]
   smb2_query_info_compound+0x423/0x6d0 [cifs]
   ? __pfx_smb2_query_info_compound+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0x7f
   ? __stack_depot_save+0x39/0x480
   ? kasan_save_stack+0x33/0x60
   ? kasan_set_track+0x25/0x30
   ? ____kasan_slab_free+0x126/0x170
   smb2_queryfs+0xc2/0x2c0 [cifs]
   ? __pfx_smb2_queryfs+0x10/0x10 [cifs]
   ? __pfx___lock_acquire+0x10/0x10
   smb311_queryfs+0x210/0x220 [cifs]
   ? __pfx_smb311_queryfs+0x10/0x10 [cifs]
   ? srso_alias_return_thunk+0x5/0x7f
   ? __lock_acquire+0x480/0x26c0
   ? lock_release+0x1ed/0x640
   ? srso_alias_return_thunk+0x5/0x7f
   ? do_raw_spin_unlock+0x9b/0x100
   cifs_statfs+0x18c/0x4b0 [cifs]
   statfs_by_dentry+0x9b/0xf0
   fd_statfs+0x4e/0xb0
   __do_sys_fstatfs+0x7f/0xe0
   ? __pfx___do_sys_fstatfs+0x10/0x10
   ? srso_alias_return_thunk+0x5/0x7f
   ? lockdep_hardirqs_on_prepare+0x136/0x200
   ? srso_alias_return_thunk+0x5/0x7f
   do_syscall_64+0x3f/0x90
   entry_SYSCALL_64_after_hwframe+0x6e/0xd8

  Allocated by task 27534:
   kasan_save_stack+0x33/0x60
   kasan_set_track+0x25/0x30
   __kasan_kmalloc+0x8f/0xa0
   open_cached_dir+0x71b/0x1240 [cifs]
   smb2_query_info_compound+0x5c3/0x6d0 [cifs]
   smb2_queryfs+0xc2/0x2c0 [cifs]
   smb311_queryfs+0x210/0x220 [cifs]
   cifs_statfs+0x18c/0x4b0 [cifs]
   statfs_by_dentry+0x9b/0xf0
   fd_statfs+0x4e/0xb0
   __do_sys_fstatfs+0x7f/0xe0
   do_syscall_64+0x3f/0x90
   entry_SYSCALL_64_after_hwframe+0x6e/0xd8

  Freed by task 27534:
   kasan_save_stack+0x33/0x60
   kasan_set_track+0x25/0x30
   kasan_save_free_info+0x2b/0x50
   ____kasan_slab_free+0x126/0x170
   slab_free_freelist_hook+0xd0/0x1e0
   __kmem_cache_free+0x9d/0x1b0
   open_cached_dir+0xff5/0x1240 [cifs]
   smb2_query_info_compound+0x5c3/0x6d0 [cifs]
   smb2_queryfs+0xc2/0x2c0 [cifs]

This is a race between open_cached_dir() and cached_dir_lease_break()
where the cache entry for the open directory handle receives a lease
break while creating it.  And before returning from open_cached_dir(),
we put the last reference of the new @cfid because of
!@cfid->has_lease.

Besides the UAF, while running xfstests a lot of missed lease breaks
have been noticed in tests that run several concurrent statfs(2) calls
on those cached fids:

  CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
  CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
  CIFS: VFS: \\w22-root1.gandalf.test smb buf 00000000715bfe83 len 108
  CIFS: VFS: Dump pending requests:
  CIFS: VFS: \\w22-root1.gandalf.test No task to wake, unknown frame...
  CIFS: VFS: \\w22-root1.gandalf.test Cmd: 18 Err: 0x0 Flags: 0x1...
  CIFS: VFS: \\w22-root1.gandalf.test smb buf 000000005aa7316e len 108
  ...

To fix both, in open_cached_dir() ensure that @cfid->has_lease is set
right before sending out the compounded request, so that any potential
lease break will get processed by the demultiplex thread while we're
still caching @cfid.  And, if the open failed for some reason, re-check
@cfid->has_lease to decide whether or not to put the lease reference.
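
The reference handling described above can be modelled in a few lines of
userspace C.  This is only a toy sketch, not the kernel code: struct
cfid_refs, cfid_put(), lease_break_model() and open_cached_model() are
hypothetical names, and the race is simulated with a flag rather than a
second thread.  It assumes one reference for the caller and one for the
lease.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached fid; not the kernel structure. */
struct cfid_refs {
	int refcount;
	bool has_lease;
};

/* Drop one reference; a put on a zero count would be the UAF case. */
static void cfid_put(struct cfid_refs *cfid)
{
	assert(cfid->refcount > 0);
	cfid->refcount--;
}

/* Models a lease break handled by the demultiplex thread. */
static void lease_break_model(struct cfid_refs *cfid)
{
	if (cfid->has_lease) {
		cfid->has_lease = false;
		cfid_put(cfid);		/* drop the lease reference */
	}
}

/*
 * Models the open path.  break_during_open simulates a lease break
 * arriving while the request is in flight.
 */
static int open_cached_model(struct cfid_refs *cfid, bool open_fails,
			     bool break_during_open)
{
	cfid->refcount = 2;		/* caller reference + lease reference */
	cfid->has_lease = true;		/* set before the request is sent */

	if (break_during_open)
		lease_break_model(cfid);

	if (open_fails) {
		/* Re-check: only put the lease reference if still held. */
		if (cfid->has_lease) {
			cfid->has_lease = false;
			cfid_put(cfid);
		}
		cfid_put(cfid);		/* caller reference */
		return -1;
	}
	return 0;
}
```

In this model each reference is dropped exactly once whether or not a
break arrives during the open, so the count never goes below zero.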

Cc: stable@vger.kernel.org
Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-11-02 08:06:06 -05:00
Paulo Alcantara
81ba109599 smb: client: prevent new fids from being removed by laundromat
Check if @cfid->time is set in laundromat so we guarantee that only
fully cached fids will be selected for removal.  While we're at it,
add missing locks to protect access of @cfid fields in order to avoid
races with open_cached_dir() and cfids_laundromat_worker(),
respectively.
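
As a rough illustration of the check described above, the selection rule
can be sketched in userspace C.  The names struct cfid_age and
laundromat_should_expire() are hypothetical, locking is omitted, and
plain seconds are used in place of jiffies; the assumption is that a
zero @time means the entry is not fully cached yet.

```c
#include <stdbool.h>

/* Hypothetical stand-in: time == 0 means "not fully cached yet". */
struct cfid_age {
	unsigned long time;	/* when the entry became fully cached */
};

/* Only fully cached entries older than the timeout are selected. */
static bool laundromat_should_expire(const struct cfid_age *cfid,
				     unsigned long now,
				     unsigned long timeout)
{
	if (cfid->time == 0)
		return false;	/* still being opened; leave it alone */
	return now >= cfid->time && now - cfid->time >= timeout;
}
```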

Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-12 09:41:32 -05:00
Paulo Alcantara
e95f3f7446 smb: client: make laundromat a delayed worker
Having the laundromat kthread process cached directories every
second turned out to be overkill, especially with multiple SMB
mounts.

Relax it by using a delayed worker instead, scheduled every
@dir_cache_timeout (default=30) seconds per tcon.

This also fixes the 1s delay when tearing down tcon.

Signed-off-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-10-12 09:41:04 -05:00
Steve French
2da338ff75 smb3: do not start laundromat thread when dir leases disabled

When there is no directory lease support, or for IPC shares where
directories cannot be opened, do not start an unneeded laundromat
thread for that mount (it wastes resources).

Fixes: d14de8067e ("cifs: Add a laundromat thread for cached directories")
Reviewed-by: Paulo Alcantara (SUSE) <pc@manguebit.com>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-19 13:32:02 -05:00
Steve French
6a50d71d0f smb3: allow controlling maximum number of cached directories
Allow adjusting the maximum number of cached directories per share
(defaults to 16) via mount parm "max_cached_dirs"

Signed-off-by: Steve French <stfrench@microsoft.com>
2023-09-07 00:06:04 -05:00
Steve French
238b351d09 smb3: allow controlling length of time directory entries are cached with dir leases
Currently with directory leases we cache directory contents for a fixed period
of time (default 30 seconds) but for many workloads this is too short.  Allow
configuring the maximum amount of time directory entries are cached when a
directory lease is held on that directory. Add module load parm "dir_cache_timeout".

For example, to set the timeout to 10 minutes you would do:

  echo 600 > /sys/module/cifs/parameters/dir_cache_timeout

or to disable caching directory contents:

  echo 0 > /sys/module/cifs/parameters/dir_cache_timeout

Reviewed-by: Bharath SM <bharathsm@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-31 10:06:36 -05:00
Bharath SM
b6d44d4231 cifs: update desired access while requesting for directory lease
We read and cache directory contents when we get a directory
lease, so we should request read access in order to read the
contents of the directory.

Signed-off-by: Bharath SM <bharathsm@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Cc: stable@vger.kernel.org
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-08-22 10:31:00 -05:00
Ronnie Sahlberg
d14de8067e cifs: Add a laundromat thread for cached directories
and drop cached directories after 30 seconds

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-07-05 22:36:07 -05:00
Steve French
38c8a9a520 smb: move client and server files to common directory fs/smb
Move CIFS/SMB3 related client and server files (cifs.ko and ksmbd.ko
and helper modules) to new fs/smb subdirectory:

   fs/cifs --> fs/smb/client
   fs/ksmbd --> fs/smb/server
   fs/smbfs_common --> fs/smb/common

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Acked-by: Namjae Jeon <linkinjeon@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2023-05-24 16:29:21 -05:00