Commit Graph

574 Commits

Author SHA1 Message Date
Aurelien Aptel
ec4e4862a9 cifs: remove old dead code
While reviewing a patch clarifying locks and locking hierarchy I
realized some locks were unused.

This commit removes old data and code that isn't actually used
anywhere, or hidden in ifdefs which cannot be enabled from the kernel
config.

* The uid/gid trees and associated locks are left-overs from when
  uid/sid mapping had an extra caching layer on top of the keyring and
  are now unused.
  See commit faa65f07d2 ("cifs: simplify id_to_sid and sid_to_id mapping code")
  from 2012.

* cifs_oplock_break_ops is a left-over from when slow_work was remplaced
  by regular workqueue and is now unused.
  See commit 9b64697246 ("cifs: use workqueue instead of slow-work")
  from 2010.

* CIFSSMBSetAttrLegacy is SMB1 cruft dealing with some legacy
  NT4/Win9x behaviour.

* Remove CONFIG_CIFS_DNOTIFY_EXPERIMENTAL left-overs. This was already
  partially removed in 392e1c5dc9 ("cifs: rename and clarify CIFS_ASYNC_OP and CIFS_NO_RESP")
  from 2019. Kill it completely.

* Another candidate that was considered but spared is
  CONFIG_CIFS_NFSD_EXPORT which has an empty implementation and cannot
  be enabled by a config option (although it is listed but disabled with
  "BROKEN" as a dep). It's unclear whether this could even function
  today in its current form but it has it's own .c file and Kconfig
  entry which is a bit more involved to remove and might make a come
  back?

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Aurelien Aptel
b7fd0fa0ea cifs: simplify SWN code with dummy funcs instead of ifdefs
This commit doesn't change the logic of SWN.

Add dummy implementation of SWN functions when SWN is disabled instead
of using ifdef sections.

The dummy functions get optimized out, this leads to clearer code and
compile time type-checking regardless of config options with no
runtime penalty.

Leave the simple ifdefs section as-is.

A single bitfield (bool foo:1) on its own will use up one int. Move
tcon->use_witness out of ifdefs with the other tcon bitfields.

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Samuel Cabrero <scabrero@suse.de>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-25 16:28:22 -05:00
Maciek Borzecki
0fc9322ab5 cifs: escape spaces in share names
Commit 653a5efb84 ("cifs: update super_operations to show_devname")
introduced the display of devname for cifs mounts. However, when mounting
a share which has a whitespace in the name, that exact share name is also
displayed in mountinfo. Make sure that all whitespace is escaped.

Signed-off-by: Maciek Borzecki <maciek.borzecki@gmail.com>
CC: <stable@vger.kernel.org> # 5.11+
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-04-07 21:30:27 -05:00
Paulo Alcantara
14302ee330 cifs: return proper error code in statfs(2)
In cifs_statfs(), if server->ops->queryfs is not NULL, then we should
use its return value rather than always returning 0.  Instead, use rc
variable as it is properly set to 0 in case there is no
server->ops->queryfs.

Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-03-08 21:23:11 -06:00
Linus Torvalds
c19798af2e cifs/smb3 fixes including improvements to mode bit conversion when using cifsacl mount option, new mount options for controlling attribute caching, improvements to crediting and reconnect, improved debugging
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAmA4bocACgkQiiy9cAdy
 T1HuMwv/bmZ53ilDwhCph/UKoLJm/bRyvCp+GBECtLLS/C/4qz5IBLpPPr2yhOyH
 gmkeCZZWhj0nzGAYxhVDAdBz9IPEA7bae503IaOuk5uXaCXC8htsq/Cd7qpJmHlf
 5vCdBTmHBiUt02dUcZ9A3bm855xJLEINHH9YdJM157ysqLgttIibLQB0F/gJ49DR
 QIIdq7sZNJXcTgRsUzJZNnrWLDi2oVIoUlq5M6d8ypmZC0ArPNfrSafjW6h5rqpj
 UYBwtUDNwQiS0lgwR4mji4PCen0GGwMFtyVDOpdJLJq3fO995yse2BRk0BFHVH1i
 xfAskQjkxAHEcfzQC1cM4ouT/WYu8nHaLK1vp/1lVr93mo8KqSX+SW/bLsXfpQkm
 w//xMy94HdM2pgyM6J1pYnKPb7s/DG19RYPktQ5oYn0fR5qYlqALAmd02JRjO3xV
 cbjbmWXXzFFsFJc5MJmM6wVLJzRb4a50SN1W37aHNXWi8ktpsaNzz33LGw1pF2OT
 K6P2DqJe
 =RQo/
 -----END PGP SIGNATURE-----

Merge tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:

 - improvements to mode bit conversion, chmod and chown when using
   cifsacl mount option

 - two new mount options for controlling attribute caching

 - improvements to crediting and reconnect, improved debugging

 - reconnect fix

 - add SMB3.1.1 dialect to default dialects for vers=3

* tag '5.12-smb3-part1' of git://git.samba.org/sfrench/cifs-2.6: (27 commits)
  cifs: update internal version number
  cifs: use discard iterator to discard unneeded network data more efficiently
  cifs: introduce helper for finding referral server to improve DFS target resolution
  cifs: check all path components in resolved dfs target
  cifs: fix DFS failover
  cifs: fix nodfs mount option
  cifs: fix handling of escaped ',' in the password mount argument
  cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout
  cifs: convert revalidate of directories to using directory metadata cache timeout
  cifs: Add new mount parameter "acdirmax" to allow caching directory metadata
  cifs: If a corrupted DACL is returned by the server, bail out.
  cifs: minor simplification to smb2_is_network_name_deleted
  TCON Reconnect during STATUS_NETWORK_NAME_DELETED
  cifs: cleanup a few le16 vs. le32 uses in cifsacl.c
  cifs: Change SIDs in ACEs while transferring file ownership.
  cifs: Retain old ACEs when converting between mode bits and ACL.
  cifs: Fix cifsacl ACE mask for group and others.
  cifs: clarify hostname vs ip address in /proc/fs/cifs/DebugData
  cifs: change confusing field serverName (to ip_addr)
  cifs: Fix inconsistent IS_ERR and PTR_ERR
  ...
2021-02-26 14:09:41 -08:00
Steve French
5780464614 cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout
The new optional mount parameter "acregmax" allows a different
timeout for file metadata ("acdirmax" now allows controlling timeout
for directory metadata).  Setting "actimeo" still works as before,
and changes timeout for both files and directories, but
specifying "acregmax" or "acdirmax" allows overriding the
default more granularly which can be a big performance benefit
on some workloads. "acregmax" is already used by NFS as a mount
parameter (albeit with a larger default and thus looser caching).

Suggested-by: Tom Talpey <tom@talpey.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-25 11:47:49 -06:00
Steve French
4c9f948142 cifs: Add new mount parameter "acdirmax" to allow caching directory metadata
nfs and cifs on Linux currently have a mount parameter "actimeo" to control
metadata (attribute) caching but cifs does not have additional mount
parameters to allow distinguishing between caching directory metadata
(e.g. needed to revalidate paths) and that for files.

Add new mount parameter "acdirmax" to allow caching metadata for
directories more loosely than file data.  NFS adjusts metadata
caching from acdirmin to acdirmax (and another two mount parms
for files) but to reduce complexity, it is safer to just introduce
the one mount parm to allow caching directories longer. The
defaults for acdirmax and actimeo (for cifs.ko) are conservative,
1 second (NFS defaults acdirmax to 60 seconds). For many workloads,
setting acdirmax to a higher value is safe and will improve
performance.  This patch leaves unchanged the default values
for caching metadata for files and directories but gives the
user more flexibility in adjusting them safely for their workload
via the new mount parm.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-By: Tom Talpey <tom@talpey.com>
2021-02-25 11:47:42 -06:00
Linus Torvalds
7d6beb71da idmapped-mounts-v5.12
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCYCegywAKCRCRxhvAZXjc
 ouJ6AQDlf+7jCQlQdeKKoN9QDFfMzG1ooemat36EpRRTONaGuAD8D9A4sUsG4+5f
 4IU5Lj9oY4DEmF8HenbWK2ZHsesL2Qg=
 =yPaw
 -----END PGP SIGNATURE-----

Merge tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux

Pull idmapped mounts from Christian Brauner:
 "This introduces idmapped mounts which has been in the making for some
  time. Simply put, different mounts can expose the same file or
  directory with different ownership. This initial implementation comes
  with ports for fat, ext4 and with Christoph's port for xfs with more
  filesystems being actively worked on by independent people and
  maintainers.

  Idmapping mounts handle a wide range of long standing use-cases. Here
  are just a few:

   - Idmapped mounts make it possible to easily share files between
     multiple users or multiple machines especially in complex
     scenarios. For example, idmapped mounts will be used in the
     implementation of portable home directories in
     systemd-homed.service(8) where they allow users to move their home
     directory to an external storage device and use it on multiple
     computers where they are assigned different uids and gids. This
     effectively makes it possible to assign random uids and gids at
     login time.

   - It is possible to share files from the host with unprivileged
     containers without having to change ownership permanently through
     chown(2).

   - It is possible to idmap a container's rootfs and without having to
     mangle every file. For example, Chromebooks use it to share the
     user's Download folder with their unprivileged containers in their
     Linux subsystem.

   - It is possible to share files between containers with
     non-overlapping idmappings.

   - Filesystem that lack a proper concept of ownership such as fat can
     use idmapped mounts to implement discretionary access (DAC)
     permission checking.

   - They allow users to efficiently changing ownership on a per-mount
     basis without having to (recursively) chown(2) all files. In
     contrast to chown (2) changing ownership of large sets of files is
     instantenous with idmapped mounts. This is especially useful when
     ownership of a whole root filesystem of a virtual machine or
     container is changed. With idmapped mounts a single syscall
     mount_setattr syscall will be sufficient to change the ownership of
     all files.

   - Idmapped mounts always take the current ownership into account as
     idmappings specify what a given uid or gid is supposed to be mapped
     to. This contrasts with the chown(2) syscall which cannot by itself
     take the current ownership of the files it changes into account. It
     simply changes the ownership to the specified uid and gid. This is
     especially problematic when recursively chown(2)ing a large set of
     files which is commong with the aforementioned portable home
     directory and container and vm scenario.

   - Idmapped mounts allow to change ownership locally, restricting it
     to specific mounts, and temporarily as the ownership changes only
     apply as long as the mount exists.

  Several userspace projects have either already put up patches and
  pull-requests for this feature or will do so should you decide to pull
  this:

   - systemd: In a wide variety of scenarios but especially right away
     in their implementation of portable home directories.

         https://systemd.io/HOME_DIRECTORY/

   - container runtimes: containerd, runC, LXD:To share data between
     host and unprivileged containers, unprivileged and privileged
     containers, etc. The pull request for idmapped mounts support in
     containerd, the default Kubernetes runtime is already up for quite
     a while now: https://github.com/containerd/containerd/pull/4734

   - The virtio-fs developers and several users have expressed interest
     in using this feature with virtual machines once virtio-fs is
     ported.

   - ChromeOS: Sharing host-directories with unprivileged containers.

  I've tightly synced with all those projects and all of those listed
  here have also expressed their need/desire for this feature on the
  mailing list. For more info on how people use this there's a bunch of
  talks about this too. Here's just two recent ones:

      https://www.cncf.io/wp-content/uploads/2020/12/Rootless-Containers-in-Gitpod.pdf
      https://fosdem.org/2021/schedule/event/containers_idmap/

  This comes with an extensive xfstests suite covering both ext4 and
  xfs:

      https://git.kernel.org/brauner/xfstests-dev/h/idmapped_mounts

  It covers truncation, creation, opening, xattrs, vfscaps, setid
  execution, setgid inheritance and more both with idmapped and
  non-idmapped mounts. It already helped to discover an unrelated xfs
  setgid inheritance bug which has since been fixed in mainline. It will
  be sent for inclusion with the xfstests project should you decide to
  merge this.

  In order to support per-mount idmappings vfsmounts are marked with
  user namespaces. The idmapping of the user namespace will be used to
  map the ids of vfs objects when they are accessed through that mount.
  By default all vfsmounts are marked with the initial user namespace.
  The initial user namespace is used to indicate that a mount is not
  idmapped. All operations behave as before and this is verified in the
  testsuite.

  Based on prior discussions we want to attach the whole user namespace
  and not just a dedicated idmapping struct. This allows us to reuse all
  the helpers that already exist for dealing with idmappings instead of
  introducing a whole new range of helpers. In addition, if we decide in
  the future that we are confident enough to enable unprivileged users
  to setup idmapped mounts the permission checking can take into account
  whether the caller is privileged in the user namespace the mount is
  currently marked with.

  The user namespace the mount will be marked with can be specified by
  passing a file descriptor refering to the user namespace as an
  argument to the new mount_setattr() syscall together with the new
  MOUNT_ATTR_IDMAP flag. The system call follows the openat2() pattern
  of extensibility.

  The following conditions must be met in order to create an idmapped
  mount:

   - The caller must currently have the CAP_SYS_ADMIN capability in the
     user namespace the underlying filesystem has been mounted in.

   - The underlying filesystem must support idmapped mounts.

   - The mount must not already be idmapped. This also implies that the
     idmapping of a mount cannot be altered once it has been idmapped.

   - The mount must be a detached/anonymous mount, i.e. it must have
     been created by calling open_tree() with the OPEN_TREE_CLONE flag
     and it must not already have been visible in the filesystem.

  The last two points guarantee easier semantics for userspace and the
  kernel and make the implementation significantly simpler.

  By default vfsmounts are marked with the initial user namespace and no
  behavioral or performance changes are observed.

  The manpage with a detailed description can be found here:

      1d7b902e28

  In order to support idmapped mounts, filesystems need to be changed
  and mark themselves with the FS_ALLOW_IDMAP flag in fs_flags. The
  patches to convert individual filesystem are not very large or
  complicated overall as can be seen from the included fat, ext4, and
  xfs ports. Patches for other filesystems are actively worked on and
  will be sent out separately. The xfstestsuite can be used to verify
  that port has been done correctly.

  The mount_setattr() syscall is motivated independent of the idmapped
  mounts patches and it's been around since July 2019. One of the most
  valuable features of the new mount api is the ability to perform
  mounts based on file descriptors only.

  Together with the lookup restrictions available in the openat2()
  RESOLVE_* flag namespace which we added in v5.6 this is the first time
  we are close to hardened and race-free (e.g. symlinks) mounting and
  path resolution.

  While userspace has started porting to the new mount api to mount
  proper filesystems and create new bind-mounts it is currently not
  possible to change mount options of an already existing bind mount in
  the new mount api since the mount_setattr() syscall is missing.

  With the addition of the mount_setattr() syscall we remove this last
  restriction and userspace can now fully port to the new mount api,
  covering every use-case the old mount api could. We also add the
  crucial ability to recursively change mount options for a whole mount
  tree, both removing and adding mount options at the same time. This
  syscall has been requested multiple times by various people and
  projects.

  There is a simple tool available at

      https://github.com/brauner/mount-idmapped

  that allows to create idmapped mounts so people can play with this
  patch series. I'll add support for the regular mount binary should you
  decide to pull this in the following weeks:

  Here's an example to a simple idmapped mount of another user's home
  directory:

	u1001@f2-vm:/$ sudo ./mount --idmap both:1000:1001:1 /home/ubuntu/ /mnt

	u1001@f2-vm:/$ ls -al /home/ubuntu/
	total 28
	drwxr-xr-x 2 ubuntu ubuntu 4096 Oct 28 22:07 .
	drwxr-xr-x 4 root   root   4096 Oct 28 04:00 ..
	-rw------- 1 ubuntu ubuntu 3154 Oct 28 22:12 .bash_history
	-rw-r--r-- 1 ubuntu ubuntu  220 Feb 25  2020 .bash_logout
	-rw-r--r-- 1 ubuntu ubuntu 3771 Feb 25  2020 .bashrc
	-rw-r--r-- 1 ubuntu ubuntu  807 Feb 25  2020 .profile
	-rw-r--r-- 1 ubuntu ubuntu    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw------- 1 ubuntu ubuntu 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ ls -al /mnt/
	total 28
	drwxr-xr-x  2 u1001 u1001 4096 Oct 28 22:07 .
	drwxr-xr-x 29 root  root  4096 Oct 28 22:01 ..
	-rw-------  1 u1001 u1001 3154 Oct 28 22:12 .bash_history
	-rw-r--r--  1 u1001 u1001  220 Feb 25  2020 .bash_logout
	-rw-r--r--  1 u1001 u1001 3771 Feb 25  2020 .bashrc
	-rw-r--r--  1 u1001 u1001  807 Feb 25  2020 .profile
	-rw-r--r--  1 u1001 u1001    0 Oct 16 16:11 .sudo_as_admin_successful
	-rw-------  1 u1001 u1001 1144 Oct 28 00:43 .viminfo

	u1001@f2-vm:/$ touch /mnt/my-file

	u1001@f2-vm:/$ setfacl -m u:1001:rwx /mnt/my-file

	u1001@f2-vm:/$ sudo setcap -n 1001 cap_net_raw+ep /mnt/my-file

	u1001@f2-vm:/$ ls -al /mnt/my-file
	-rw-rwxr--+ 1 u1001 u1001 0 Oct 28 22:14 /mnt/my-file

	u1001@f2-vm:/$ ls -al /home/ubuntu/my-file
	-rw-rwxr--+ 1 ubuntu ubuntu 0 Oct 28 22:14 /home/ubuntu/my-file

	u1001@f2-vm:/$ getfacl /mnt/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: mnt/my-file
	# owner: u1001
	# group: u1001
	user::rw-
	user:u1001:rwx
	group::rw-
	mask::rwx
	other::r--

	u1001@f2-vm:/$ getfacl /home/ubuntu/my-file
	getfacl: Removing leading '/' from absolute path names
	# file: home/ubuntu/my-file
	# owner: ubuntu
	# group: ubuntu
	user::rw-
	user:ubuntu:rwx
	group::rw-
	mask::rwx
	other::r--"

* tag 'idmapped-mounts-v5.12' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux: (41 commits)
  xfs: remove the possibly unused mp variable in xfs_file_compat_ioctl
  xfs: support idmapped mounts
  ext4: support idmapped mounts
  fat: handle idmapped mounts
  tests: add mount_setattr() selftests
  fs: introduce MOUNT_ATTR_IDMAP
  fs: add mount_setattr()
  fs: add attr_flags_to_mnt_flags helper
  fs: split out functions to hold writers
  namespace: only take read lock in do_reconfigure_mnt()
  mount: make {lock,unlock}_mount_hash() static
  namespace: take lock_mount_hash() directly when changing flags
  nfs: do not export idmapped mounts
  overlayfs: do not mount on top of idmapped mounts
  ecryptfs: do not mount on top of idmapped mounts
  ima: handle idmapped mounts
  apparmor: handle idmapped mounts
  fs: make helpers idmap mount aware
  exec: handle idmapped mounts
  would_dump: handle idmapped mounts
  ...
2021-02-23 13:39:45 -08:00
Shyam Prasad N
6d82c27ae5 cifs: Identify a connection by a conn_id.
Introduced a new field conn_id in TCP_Server_Info structure.
This is a non-persistent unique identifier maintained by the client
for a connection to a file server. For this, a global counter named
tcpSesNextId is maintained. On allocating a new TCP_Server_Info,
this counter is incremented and assigned.

Changed the dynamic tracepoints related to reconnects and
crediting to be more informative (with conn_id printed).
Debugging a crediting issue helped me understand the
important things to print here.

Always call dynamic tracepoints outside the scope of spinlocks.
To do this, copy out the credits and in_flight fields of the
server struct before dropping the lock.

Signed-off-by: Shyam Prasad N <sprasad@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-16 15:48:02 -06:00
Ronnie Sahlberg
af1a3d2ba9 cifs: In the new mount api we get the full devname as source=
so we no longer need to handle or parse the UNC= and prefixpath=
options that mount.cifs are generating.

This also fixes a bug in the mount command option where the devname
would be truncated into just //server/share because we were looking
at the truncated UNC value and not the full path.

I.e.  in the mount command output the devive //server/share/path
would show up as just //server/share

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-02-11 10:58:08 -06:00
Ronnie Sahlberg
0d4873f9aa cifs: fix dfs domain referrals
The new mount API requires additional changes to how DFS
is handled. Additional testing of DFS uncovered problems
with domain based DFS referrals (a follow on patch addresses
DFS links) which this patch addresses.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Paulo Alcantara (SUSE) <pc@cjr.nz>
Signed-off-by: Steve French <stfrench@microsoft.com>
2021-01-28 21:40:43 -06:00
Christian Brauner
549c729771
fs: make helpers idmap mount aware
Extend some inode methods with an additional user namespace argument. A
filesystem that is aware of idmapped mounts will receive the user
namespace the mount has been marked with. This can be used for
additional permission checking and also to enable filesystems to
translate between uids and gids if they need to. We have implemented all
relevant helpers in earlier patches.

As requested we simply extend the exisiting inode method instead of
introducing new ones. This is a little more code churn but it's mostly
mechanical and doesnt't leave us with additional inode methods.

Link: https://lore.kernel.org/r/20210121131959.646623-25-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:20 +01:00
Christian Brauner
47291baa8d
namei: make permission helpers idmapped mount aware
The two helpers inode_permission() and generic_permission() are used by
the vfs to perform basic permission checking by verifying that the
caller is privileged over an inode. In order to handle idmapped mounts
we extend the two helpers with an additional user namespace argument.
On idmapped mounts the two helpers will make sure to map the inode
according to the mount's user namespace and then peform identical
permission checks to inode_permission() and generic_permission(). If the
initial user namespace is passed nothing changes so non-idmapped mounts
will see identical behavior as before.

Link: https://lore.kernel.org/r/20210121131959.646623-6-christian.brauner@ubuntu.com
Cc: Christoph Hellwig <hch@lst.de>
Cc: David Howells <dhowells@redhat.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-fsdevel@vger.kernel.org
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: James Morris <jamorris@linux.microsoft.com>
Acked-by: Serge Hallyn <serge@hallyn.com>
Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
2021-01-24 14:27:16 +01:00
Ronnie Sahlberg
6cf5abbfa8 cifs: fix use after free in cifs_smb3_do_mount()
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-15 16:55:57 -06:00
Steve French
0c2b5f7ce5 cifs: fix rsize/wsize to be negotiated values
Also make sure these are displayed in /proc/mounts

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-12-15 15:13:59 -06:00
Steve French
653a5efb84 cifs: update super_operations to show_devname
This is needed so that we display the correct //server/share vs
\\server\share in /proc/mounts for the device name (in the new
mount API).

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
2020-12-15 15:12:51 -06:00
Ronnie Sahlberg
51acd208bd cifs: remove ctx argument from cifs_setup_cifs_sb
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:26:30 -06:00
Ronnie Sahlberg
7c7ee628f8 cifs: uncomplicate printing the iocharset parameter
There is no need to load the default nls to check if the iocharset argument
was specified or not since we have it in cifs_sb->ctx

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:26:30 -06:00
Ronnie Sahlberg
9ccecae8d1 cifs: we do not allow changing username/password/unc/... during remount
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:26:30 -06:00
Ronnie Sahlberg
522aa3b575 cifs: move [brw]size from cifs_sb to cifs_sb->ctx
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:26:30 -06:00
Ronnie Sahlberg
c741cba2cd cifs: move cifs_cleanup_volume_info[_content] to fs_context.c
and rename it to smb3_cleanup_fs_context[_content]

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:26:30 -06:00
Ronnie Sahlberg
af1e40d9ac cifs: remove actimeo from cifs_sb
Can now be accessed via the ctx

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:16:23 -06:00
Ronnie Sahlberg
8401e93678 cifs: remove [gu]id/backup[gu]id/file_mode/dir_mode from cifs_sb
We can already access these from cifs_sb->ctx so we no longer need
a local copy in cifs_sb.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:16:23 -06:00
Samuel Cabrero
0ac4e2919a cifs: add witness mount option and data structs
Add 'witness' mount option to register for witness notifications.

Signed-off-by: Samuel Cabrero <scabrero@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:16:22 -06:00
Samuel Cabrero
06f08dab3c cifs: Register generic netlink family
Register a new generic netlink family to talk to the witness service
userspace daemon.

Signed-off-by: Samuel Cabrero <scabrero@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-14 09:16:22 -06:00
Ronnie Sahlberg
a2a52a8a36 cifs: get rid of cifs_sb->mountdata
as we now have a full smb3_fs_context as part of the cifs superblock
we no longer need a local copy of the mount options and can just
reference the copy in the smb3_fs_context.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-13 19:12:07 -06:00
Ronnie Sahlberg
d17abdf756 cifs: add an smb3_fs_context to cifs_sb
and populate it during mount in cifs_smb3_do_mount()

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Shyam Prasad N <sprasad@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-13 19:12:07 -06:00
Ronnie Sahlberg
24e0a1eff9 cifs: switch to new mount api
See Documentation/filesystems/mount_api.rst for details on new mount API

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-13 19:12:07 -06:00
Ronnie Sahlberg
3fa1c6d1b8 cifs: rename smb_vol as smb3_fs_context and move it to fs_context.h
Harmonize and change all such variables to 'ctx', where possible.
No changes to actual logic.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-12-13 19:12:07 -06:00
Steve French
29e2792304 smb3.1.1: add new module load parm enable_gcm_256
Add new module load parameter enable_gcm_256. If set, then add
AES-256-GCM (strongest encryption type) to the list of encryption
types requested. Put it in the list as the second choice (since
AES-128-GCM is faster and much more broadly supported by
SMB3 servers).  To make this stronger encryption type, GCM-256,
required (the first and only choice, you would use module parameter
"require_gcm_256."

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-10-15 23:58:15 -05:00
Steve French
fbfd0b46af smb3.1.1: add new module load parm require_gcm_256
Add new module load parameter require_gcm_256. If set, then only
request AES-256-GCM (strongest encryption type).

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-10-15 23:58:15 -05:00
Steve French
7866c177a0 smb3: fix typo in mount options displayed in /proc/mounts
Missing the final 's' in "max_channels" mount option when displayed in
/proc/mounts (or by mount command)

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Shyam Prasad N <nspmangalore@gmail.com>
2020-06-10 12:05:15 -05:00
Steve French
82e9367c43 smb3: Add new parm "nodelete"
In order to handle workloads where it is important to make sure that
a buggy app did not delete content on the drive, the new mount option
"nodelete" allows standard permission checks on the server to work,
but prevents on the client any attempts to unlink a file or delete
a directory on that mount point.  This can be helpful when running
a little understood app on a network mount that contains important
content that should not be deleted.

Signed-off-by: Steve French <stfrench@microsoft.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2020-06-01 00:10:18 -05:00
Steve French
4e8aea30f7 smb3: enable swap on SMB3 mounts
Add experimental support for allowing a swap file to be on an SMB3
mount.  There are use cases where swapping over a secure network
filesystem is preferable. In some cases there are no local
block devices large enough, and network block devices can be
hard to setup and secure.  And in some cases there are no
local block devices at all (e.g. with the recent addition of
remote boot over SMB3 mounts).

There are various enhancements that can be added later e.g.:
- doing a mandatory byte range lock over the swapfile (until
the Linux VFS is modified to notify the file system that an open
is for a swapfile, when the file can be opened "DENY_ALL" to prevent
others from opening it).
- pinning more buffers in the underlying transport to minimize memory
allocations in the TCP stack under the fs
- documenting how to create ACLs (on the server) to secure the
swapfile (or adding additional tools to cifs-utils to make it easier)

Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2020-04-10 13:32:32 -05:00
Steve French
c7e9f78f7b cifs: do d_move in rename
See commit 349457ccf2
"Allow file systems to manually d_move() inside of ->rename()"

Lessens possibility of race conditions in rename

Signed-off-by: Steve French <stfrench@microsoft.com>
2020-03-22 22:49:09 -05:00
Steve French
ec57010acd cifs: add missing mount option to /proc/mounts
We were not displaying the mount option "signloosely" in /proc/mounts
for cifs mounts which some users found confusing recently

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2020-02-24 14:20:38 -06:00
Petr Pavlu
3f6166aaf1 cifs: fix mount option display for sec=krb5i
Fix display for sec=krb5i which was wrongly interleaved by cruid,
resulting in string "sec=krb5,cruid=<...>i" instead of
"sec=krb5i,cruid=<...>".

Fixes: 96281b9e46 ("smb3: for kerberos mounts display the credential uid used")
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-02-10 08:32:30 -06:00
Amir Goldstein
0f060936e4 SMB3: Backup intent flag missing from some more ops
When "backup intent" is requested on the mount (e.g. backupuid or
backupgid mount options), the corresponding flag was missing from
some of the operations.

Change all operations to use the macro cifs_create_options() to
set the backup intent flag if needed.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2020-02-03 16:12:47 -06:00
Linus Torvalds
0aecba6173 Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull vfs d_inode/d_flags memory ordering fixes from Al Viro:
 "Fallout from tree-wide audit for ->d_inode/->d_flags barriers use.
  Basically, the problem is that negative pinned dentries require
  careful treatment - unless ->d_lock is locked or parent is held at
  least shared, another thread can make them positive right under us.

  Most of the uses turned out to be safe - the main surprises as far as
  filesystems are concerned were

   - race in dget_parent() fastpath, that might end up with the caller
     observing the returned dentry _negative_, due to insufficient
     barriers. It is positive in memory, but we could end up seeing the
     wrong value of ->d_inode in CPU cache. Fixed.

   - manual checks that result of lookup_one_len_unlocked() is positive
     (and rejection of negatives). Again, insufficient barriers (we
     might end up with inconsistent observed values of ->d_inode and
     ->d_flags). Fixed by switching to a new primitive that does the
     checks itself and returns ERR_PTR(-ENOENT) instead of a negative
     dentry. That way we get rid of boilerplate converting negatives
     into ERR_PTR(-ENOENT) in the callers and have a single place to
     deal with the barrier-related mess - inside fs/namei.c rather than
     in every caller out there.

  The guts of pathname resolution *do* need to be careful - the race
  found by Ritesh is real, as well as several similar races.
  Fortunately, it turns out that we can take care of that with fairly
  local changes in there.

  The tree-wide audit had not been fun, and I hate the idea of repeating
  it. I think the right approach would be to annotate the places where
  we are _not_ guaranteed ->d_inode/->d_flags stability and have sparse
  catch regressions. But I'm still not sure what would be the least
  invasive way of doing that and it's clearly the next cycle fodder"

* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  fs/namei.c: fix missing barriers when checking positivity
  fix dget_parent() fastpath race
  new helper: lookup_positive_unlocked()
  fs/namei.c: pull positivity check into follow_managed()
2019-12-06 09:06:58 -08:00
Linus Torvalds
937d6eefc7 Here's the main documentation changes for 5.5:
- Various kerneldoc script enhancements.
 
  - More RST conversions; those are slowing down as we run out of things to
    convert, but we're a ways from done still.
 
  - Dan's "maintainer profile entry" work landed at last.  Now we just need
    to get maintainers to fill in the profiles...
 
  - A reworking of the parallel build setup to work better with a variety of
    systems (and to not take over huge systems entirely in particular).
 
  - The MAINTAINERS file is now converted to RST during the build.
    Hopefully nobody ever tries to print this thing, or they will need to
    load a lot of paper.
 
  - A script and documentation making it easy for maintainers to add Link:
    tags at commit time.
 
 Also included is the removal of a bunch of spurious CR characters.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAl3j5B0PHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YtBcH/jIN2cO8/0YW2rjVT+1G6ytSdFUKx5WJ/lpf
 5uBeCvuCeYhtCB6+BgnXvjykJ7jDW11/NJNjWqz/gsvD5l5FJK1rXarI/oz2Klyi
 kcPtDmBF/ki4wz9qXzEpa0vg8LXdjeys50S1vE75qCzxZoPP7YjuRbPnLrlIJukv
 JbDVi4p9kxgeHfRB4+BHOe5rFwA3mMmaxKNIX34Y+UUO2KZ0g/yUi1bAaQwQAdt+
 PsORmkVQ8Puh3K9xRIr7dYlcWBlBiPqzYdvDgTVxSjrxdK6wjYjSgVk2VjC5MBUN
 mTSTWgyfsIcD/76/s8tq7ZRl2fw+SkCSkFo79Rb/hJwDTb7Vnng=
 =LPBr
 -----END PGP SIGNATURE-----

Merge tag 'docs-5.5a' of git://git.lwn.net/linux

Pull Documentation updates from Jonathan Corbet:
 "Here are the main documentation changes for 5.5:

   - Various kerneldoc script enhancements.

   - More RST conversions; those are slowing down as we run out of
     things to convert, but we're a ways from done still.

   - Dan's "maintainer profile entry" work landed at last. Now we just
     need to get maintainers to fill in the profiles...

   - A reworking of the parallel build setup to work better with a
     variety of systems (and to not take over huge systems entirely in
     particular).

   - The MAINTAINERS file is now converted to RST during the build.
     Hopefully nobody ever tries to print this thing, or they will need
     to load a lot of paper.

   - A script and documentation making it easy for maintainers to add
     Link: tags at commit time.

  Also included is the removal of a bunch of spurious CR characters"

* tag 'docs-5.5a' of git://git.lwn.net/linux: (91 commits)
  docs: remove a bunch of stray CRs
  docs: fix up the maintainer profile document
  libnvdimm, MAINTAINERS: Maintainer Entry Profile
  Maintainer Handbook: Maintainer Entry Profile
  MAINTAINERS: Reclaim the P: tag for Maintainer Entry Profile
  docs, parallelism: Rearrange how jobserver reservations are made
  docs, parallelism: Do not leak blocking mode to other readers
  docs, parallelism: Fix failure path and add comment
  Documentation: Remove bootmem_debug from kernel-parameters.txt
  Documentation: security: core.rst: fix warnings
  Documentation/process/howto/kokr: Update for 4.x -> 5.x versioning
  Documentation/translation: Use Korean for Korean translation title
  docs/memory-barriers.txt: Remove remaining references to mmiowb()
  docs/memory-barriers.txt/kokr: Update I/O section to be clearer about CPU vs thread
  docs/memory-barriers.txt/kokr: Fix style, spacing and grammar in I/O section
  Documentation/kokr: Kill all references to mmiowb()
  docs/memory-barriers.txt/kokr: Rewrite "KERNEL I/O BARRIER EFFECTS" section
  docs: Add initial documentation for devfreq
  Documentation: Document how to get links with git am
  docs: Add request_irq() documentation
  ...
2019-12-02 11:51:02 -08:00
Ronnie Sahlberg
32546a9586 cifs: move cifsFileInfo_put logic into a work-queue
This patch moves the final part of the cifsFileInfo_put() logic where we
need a write lock on lock_sem to be processed in a separate thread that
holds no other locks.
This is to prevent deadlocks like the one below:

> there are 6 processes looping to while trying to down_write
> cinode->lock_sem, 5 of them from _cifsFileInfo_put, and one from
> cifs_new_fileinfo
>
> and there are 5 other processes which are blocked, several of them
> waiting on either PG_writeback or PG_locked (which are both set), all
> for the same page of the file
>
> 2 inode_lock() (inode->i_rwsem) for the file
> 1 wait_on_page_writeback() for the page
> 1 down_read(inode->i_rwsem) for the inode of the directory
> 1 inode_lock()(inode->i_rwsem) for the inode of the directory
> 1 __lock_page
>
>
> so processes are blocked waiting on:
>   page flags PG_locked and PG_writeback for one specific page
>   inode->i_rwsem for the directory
>   inode->i_rwsem for the file
>   cifsInodeInflock_sem
>
>
>
> here are the more gory details (let me know if I need to provide
> anything more/better):
>
> [0 00:48:22.765] [UN]  PID: 8863   TASK: ffff8c691547c5c0  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff9965007e3ba8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007e3c38] schedule at ffffffff9b6e64df
>  #2 [ffff9965007e3c48] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965007e3cb8] legitimize_path at ffffffff9b0f975d
>  #4 [ffff9965007e3d08] path_openat at ffffffff9b0fe55d
>  #5 [ffff9965007e3dd8] do_filp_open at ffffffff9b100a33
>  #6 [ffff9965007e3ee0] do_sys_open at ffffffff9b0eb2d6
>  #7 [ffff9965007e3f38] do_syscall_64 at ffffffff9ae04315
> * (I think legitimize_path is bogus)
>
> in path_openat
>         } else {
>                 const char *s = path_init(nd, flags);
>                 while (!(error = link_path_walk(s, nd)) &&
>                         (error = do_last(nd, file, op)) > 0) {  <<<<
>
> do_last:
>         if (open_flag & O_CREAT)
>                 inode_lock(dir->d_inode);  <<<<
>         else
> so it's trying to take inode->i_rwsem for the directory
>
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68bb8e79c0 ffff8c691158ef20 ffff8c6915bf9000 DIR  /mnt/vm1_smb/
> inode.i_rwsem is ffff8c691158efc0
>
> <struct rw_semaphore 0xffff8c691158efc0>:
>         owner: <struct task_struct 0xffff8c6914275d00> (UN -   8856 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 2
>         0xffff9965007e3c90     8863   reopen_file      UN 0  1:29:22.926
>   RWSEM_WAITING_FOR_WRITE
>         0xffff996500393e00     9802   ls               UN 0  1:17:26.700
>   RWSEM_WAITING_FOR_READ
>
>
> the owner of the inode.i_rwsem of the directory is:
>
> [0 00:00:00.109] [UN]  PID: 8856   TASK: ffff8c6914275d00  CPU: 3
> COMMAND: "reopen_file"
>  #0 [ffff99650065b828] __schedule at ffffffff9b6e6095
>  #1 [ffff99650065b8b8] schedule at ffffffff9b6e64df
>  #2 [ffff99650065b8c8] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff99650065b940] msleep at ffffffff9af573a9
>  #4 [ffff99650065b948] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff99650065ba38] cifs_writepage_locked at ffffffffc0a0b8f3 [cifs]
>  #6 [ffff99650065bab0] cifs_launder_page at ffffffffc0a0bb72 [cifs]
>  #7 [ffff99650065bb30] invalidate_inode_pages2_range at ffffffff9b04d4bd
>  #8 [ffff99650065bcb8] cifs_invalidate_mapping at ffffffffc0a11339 [cifs]
>  #9 [ffff99650065bcd0] cifs_revalidate_mapping at ffffffffc0a1139a [cifs]
> #10 [ffff99650065bcf0] cifs_d_revalidate at ffffffffc0a014f6 [cifs]
> #11 [ffff99650065bd08] path_openat at ffffffff9b0fe7f7
> #12 [ffff99650065bdd8] do_filp_open at ffffffff9b100a33
> #13 [ffff99650065bee0] do_sys_open at ffffffff9b0eb2d6
> #14 [ffff99650065bf38] do_syscall_64 at ffffffff9ae04315
>
> cifs_launder_page is for page 0xffffd1e2c07d2480
>
> crash> page.index,mapping,flags 0xffffd1e2c07d2480
>       index = 0x8
>       mapping = 0xffff8c68f3cd0db0
>   flags = 0xfffffc0008095
>
>   PAGE-FLAG       BIT  VALUE
>   PG_locked         0  0000001
>   PG_uptodate       2  0000004
>   PG_lru            4  0000010
>   PG_waiters        7  0000080
>   PG_writeback     15  0008000
>
>
> inode is ffff8c68f3cd0c40
> inode.i_rwsem is ffff8c68f3cd0ce0
>      DENTRY           INODE           SUPERBLK     TYPE PATH
> ffff8c68a1f1b480 ffff8c68f3cd0c40 ffff8c6915bf9000 REG
> /mnt/vm1_smb/testfile.8853
>
>
> this process holds the inode->i_rwsem for the parent directory, is
> laundering a page attached to the inode of the file it's opening, and in
> _cifsFileInfo_put is trying to down_write the cifsInodeInflock_sem
> for the file itself.
>
>
> <struct rw_semaphore 0xffff8c68f3cd0ce0>:
>         owner: <struct task_struct 0xffff8c6914272e80> (UN -   8854 -
> reopen_file), counter: 0x0000000000000003
>         waitlist: 1
>         0xffff9965005dfd80     8855   reopen_file      UN 0  1:29:22.912
>   RWSEM_WAITING_FOR_WRITE
>
> this is the inode.i_rwsem for the file
>
> the owner:
>
> [0 00:48:22.739] [UN]  PID: 8854   TASK: ffff8c6914272e80  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff99650054fb38] __schedule at ffffffff9b6e6095
>  #1 [ffff99650054fbc8] schedule at ffffffff9b6e64df
>  #2 [ffff99650054fbd8] io_schedule at ffffffff9b6e68e2
>  #3 [ffff99650054fbe8] __lock_page at ffffffff9b03c56f
>  #4 [ffff99650054fc80] pagecache_get_page at ffffffff9b03dcdf
>  #5 [ffff99650054fcc0] grab_cache_page_write_begin at ffffffff9b03ef4c
>  #6 [ffff99650054fcd0] cifs_write_begin at ffffffffc0a064ec [cifs]
>  #7 [ffff99650054fd30] generic_perform_write at ffffffff9b03bba4
>  #8 [ffff99650054fda8] __generic_file_write_iter at ffffffff9b04060a
>  #9 [ffff99650054fdf0] cifs_strict_writev.cold.70 at ffffffffc0a4469b [cifs]
> #10 [ffff99650054fe48] new_sync_write at ffffffff9b0ec1dd
> #11 [ffff99650054fed0] vfs_write at ffffffff9b0eed35
> #12 [ffff99650054ff00] ksys_write at ffffffff9b0eefd9
> #13 [ffff99650054ff38] do_syscall_64 at ffffffff9ae04315
>
> the process holds the inode->i_rwsem for the file to which it's writing,
> and is trying to __lock_page for the same page as in the other processes
>
>
> the other tasks:
> [0 00:00:00.028] [UN]  PID: 8859   TASK: ffff8c6915479740  CPU: 2
> COMMAND: "reopen_file"
>  #0 [ffff9965007b39d8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007b3a68] schedule at ffffffff9b6e64df
>  #2 [ffff9965007b3a78] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007b3af0] msleep at ffffffff9af573a9
>  #4 [ffff9965007b3af8] cifs_new_fileinfo.cold.61 at ffffffffc0a42a07 [cifs]
>  #5 [ffff9965007b3b78] cifs_open at ffffffffc0a0709d [cifs]
>  #6 [ffff9965007b3cd8] do_dentry_open at ffffffff9b0e9b7a
>  #7 [ffff9965007b3d08] path_openat at ffffffff9b0fe34f
>  #8 [ffff9965007b3dd8] do_filp_open at ffffffff9b100a33
>  #9 [ffff9965007b3ee0] do_sys_open at ffffffff9b0eb2d6
> #10 [ffff9965007b3f38] do_syscall_64 at ffffffff9ae04315
>
> this is opening the file, and is trying to down_write cinode->lock_sem
>
>
> [0 00:00:00.041] [UN]  PID: 8860   TASK: ffff8c691547ae80  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.057] [UN]  PID: 8861   TASK: ffff8c6915478000  CPU: 3
> COMMAND: "reopen_file"
> [0 00:00:00.059] [UN]  PID: 8858   TASK: ffff8c6914271740  CPU: 2
> COMMAND: "reopen_file"
> [0 00:00:00.109] [UN]  PID: 8862   TASK: ffff8c691547dd00  CPU: 6
> COMMAND: "reopen_file"
>  #0 [ffff9965007c3c78] __schedule at ffffffff9b6e6095
>  #1 [ffff9965007c3d08] schedule at ffffffff9b6e64df
>  #2 [ffff9965007c3d18] schedule_timeout at ffffffff9b6e9f89
>  #3 [ffff9965007c3d90] msleep at ffffffff9af573a9
>  #4 [ffff9965007c3d98] _cifsFileInfo_put.cold.63 at ffffffffc0a42dd6 [cifs]
>  #5 [ffff9965007c3e88] cifs_close at ffffffffc0a07aaf [cifs]
>  #6 [ffff9965007c3ea0] __fput at ffffffff9b0efa6e
>  #7 [ffff9965007c3ee8] task_work_run at ffffffff9aef1614
>  #8 [ffff9965007c3f20] exit_to_usermode_loop at ffffffff9ae03d6f
>  #9 [ffff9965007c3f38] do_syscall_64 at ffffffff9ae0444c
>
> closing the file, and trying to down_write cifsi->lock_sem
>
>
> [0 00:48:22.839] [UN]  PID: 8857   TASK: ffff8c6914270000  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965006a7cc8] __schedule at ffffffff9b6e6095
>  #1 [ffff9965006a7d58] schedule at ffffffff9b6e64df
>  #2 [ffff9965006a7d68] io_schedule at ffffffff9b6e68e2
>  #3 [ffff9965006a7d78] wait_on_page_bit at ffffffff9b03cac6
>  #4 [ffff9965006a7e10] __filemap_fdatawait_range at ffffffff9b03b028
>  #5 [ffff9965006a7ed8] filemap_write_and_wait at ffffffff9b040165
>  #6 [ffff9965006a7ef0] cifs_flush at ffffffffc0a0c2fa [cifs]
>  #7 [ffff9965006a7f10] filp_close at ffffffff9b0e93f1
>  #8 [ffff9965006a7f30] __x64_sys_close at ffffffff9b0e9a0e
>  #9 [ffff9965006a7f38] do_syscall_64 at ffffffff9ae04315
>
> in __filemap_fdatawait_range
>                         wait_on_page_writeback(page);
> for the same page of the file
>
>
>
> [0 00:48:22.718] [UN]  PID: 8855   TASK: ffff8c69142745c0  CPU: 7
> COMMAND: "reopen_file"
>  #0 [ffff9965005dfc98] __schedule at ffffffff9b6e6095
>  #1 [ffff9965005dfd28] schedule at ffffffff9b6e64df
>  #2 [ffff9965005dfd38] rwsem_down_write_slowpath at ffffffff9af283d7
>  #3 [ffff9965005dfdf0] cifs_strict_writev at ffffffffc0a0c40a [cifs]
>  #4 [ffff9965005dfe48] new_sync_write at ffffffff9b0ec1dd
>  #5 [ffff9965005dfed0] vfs_write at ffffffff9b0eed35
>  #6 [ffff9965005dff00] ksys_write at ffffffff9b0eefd9
>  #7 [ffff9965005dff38] do_syscall_64 at ffffffff9ae04315
>
>         inode_lock(inode);
>
>
> and one 'ls' later on, to see whether the rest of the mount is available
> (the test file is in the root, so we get blocked up on the directory
> ->i_rwsem), so the entire mount is unavailable
>
> [0 00:36:26.473] [UN]  PID: 9802   TASK: ffff8c691436ae80  CPU: 4
> COMMAND: "ls"
>  #0 [ffff996500393d28] __schedule at ffffffff9b6e6095
>  #1 [ffff996500393db8] schedule at ffffffff9b6e64df
>  #2 [ffff996500393dc8] rwsem_down_read_slowpath at ffffffff9b6e9421
>  #3 [ffff996500393e78] down_read_killable at ffffffff9b6e95e2
>  #4 [ffff996500393e88] iterate_dir at ffffffff9b103c56
>  #5 [ffff996500393ec8] ksys_getdents64 at ffffffff9b104b0c
>  #6 [ffff996500393f30] __x64_sys_getdents64 at ffffffff9b104bb6
>  #7 [ffff996500393f38] do_syscall_64 at ffffffff9ae04315
>
> in iterate_dir:
>         if (shared)
>                 res = down_read_killable(&inode->i_rwsem);  <<<<
>         else
>                 res = down_write_killable(&inode->i_rwsem);
>

Reported-by: Frank Sorenson <sorenson@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:17:12 -06:00
Aurelien Aptel
bcc8880115 cifs: add multichannel mount options and data structs
adds:
- [no]multichannel to enable/disable multichannel
- max_channels=N to control how many channels to create

these options are then stored in the volume struct.

- store channels and max_channels in cifs_ses

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Ronnie Sahlberg
3591bb83ee cifs: don't use 'pre:' for MODULE_SOFTDEP
It can cause
to fail with
modprobe: FATAL: Module <module> is builtin.

RHBZ: 1767094

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-11-25 01:16:30 -06:00
Steve French
d0677992d2 cifs: add support for flock
The flock system call locks the whole file rather than a byte
range and so is currently emulated by various other file systems
by simply sending a byte range lock for the whole file.
Add flock handling for cifs.ko in similar way.

xfstest generic/504 passes with this as well

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-11-25 01:14:14 -06:00
Al Viro
6c2d4798a8 new helper: lookup_positive_unlocked()
Most of the callers of lookup_one_len_unlocked() treat negatives are
ERR_PTR(-ENOENT).  Provide a helper that would do just that.  Note
that a pinned positive dentry remains positive - it's ->d_inode is
stable, etc.; a pinned _negative_ dentry can become positive at any
point as long as you are not holding its parent at least shared.
So using lookup_one_len_unlocked() needs to be careful;
lookup_positive_unlocked() is safer and that's what the callers
end up open-coding anyway.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-11-15 13:49:04 -05:00
Jonathan Corbet
822bbba0ca Linux 5.4-rc4
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl2su/AeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvm4H/1jkheCrvB/GJS69
 wd18vizAg+eFmNCzxlGVhpQTKGymNRy+g6clnoli3cNJ3pSVKcYgVyB3oXaONIhp
 g/ANudnBjTdjqYgJzfLij5AGecrGwDpF3YL0kuKrCB63s2I/HwQGYy/aPrYY8emy
 gAYdaf1DGRu5/DIIB6soTo/TnpKoAyTE+XY5MaPSug++t/Flov19tlU40IZxXW94
 bjTXbm0yklrsIx+LL5mYYGGnygSTCF66JjFg1qhDCBQaS2MZ21h1ZgaOtGZTwZcc
 WgEiqLC5S1Iyj96zir1t78RcVQ4RzgvDbhUOgIqUFsYAO2wOicvxyFE3Hj8rPOKd
 uGgVPRM=
 =xgZa
 -----END PGP SIGNATURE-----

Merge tag 'v5.4-rc4' into docs-next

I need to pick up the independent changes made to
Documentation/core-api/memory-allocation.rst to be able to merge further
work without creating a total mess.
2019-10-29 04:43:29 -06:00
Steve French
553292a634 cifs: clarify comment about timestamp granularity for old servers
It could be confusing why we set granularity to 1 seconds rather
than 2 seconds (1 second is the max the VFS allows) for these
mounts to very old servers ...

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-10-20 19:19:49 -05:00
Mauro Carvalho Chehab
0ac624f47d docs: fix some broken references
There are a number of documentation files that got moved or
renamed. update their references.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by: Shannon Nelson <snelson@pensando.io>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Acked-by: Paul Walmsley <paul.walmsley@sifive.com> # RISC-V
Acked-by: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
2019-10-10 11:25:39 -06:00
Steve French
d4cfbf04b2 smb3: Fix regression in time handling
Fixes: cb7a69e605 ("cifs: Initialize filesystem timestamp ranges")

Only very old servers (e.g. OS/2 and DOS) did not support
DCE TIME (100 nanosecond granularity).  Fix the checks used
to set minimum and maximum times.

Fixes xfstest generic/258 (on 5.4-rc1 and later)

CC: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-10-09 00:10:38 -05:00
Linus Torvalds
7e3d2c8210 various cifs/smb3 fixes (including for share deleted cases) and features including improved encrypted read performance, and various debugging improvements
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl2Dq7wACgkQiiy9cAdy
 T1H5Kgv+NdJCBEfMjYzDPCci0plAFjurbPh5GvlrgkrYuRyCEw7KcG0/DltQSoCk
 TOQ0JTVYulyKPd8uwha2p8ylscKwEi3BPHa0+CMg6qRmCtrRMMhh85TF2pScy8Jz
 s1s9RskEvdJMpZ0disu1uge9O4JHNFB3LnfM7bDMxGTDUYMtuAXYIXTqMpLX3KVq
 8NzcXUD0gAdjqo8iYZ/BtdeE7MBUCTvjp4O7pJh0cq+EV7p2G2DzaPhLJ3U5dL/y
 IlMgMBfyuYNH6wZueB+R2ZO1GeiPe6M4sq/Beh2/5dXKGaNECxo3feDFOHsvsasT
 WTllXAe8rz9AH8g8+QT4mxPQYBYw0IcJaQk5y/12gTiD2CR4BiTcNmgYv49tqiTH
 b4Gl6KbjTvyT0iieStUBywCpU93qD04nwuzBOPqT0FG0T6Papx4sTvvYM3ThGs16
 X891RLxdvj9QOjpPnQ4vw9yAd1s7PDjVm7BA5CRhRf0Yvdc7q2YVyitzGYBzjqF0
 K2f/4HRJ
 =4/w5
 -----END PGP SIGNATURE-----

Merge tag '5.4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:
 "Various cifs/smb3 fixes (including for share deleted cases) and
  features including improved encrypted read performance, and various
  debugging improvements.

  Note that since I am at a test event this week with the Samba team,
  and at the annual Storage Developer Conference/SMB3 Plugfest test
  event next week a higher than usual number of fixes is expected later
  next week as other features in progress get additional testing and
  review during these two events"

* tag '5.4-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (38 commits)
  cifs: update internal module version number
  cifs: modefromsid: write mode ACE first
  cifs: cifsroot: add more err checking
  smb3: add missing worker function for SMB3 change notify
  cifs: Add support for root file systems
  cifs: modefromsid: make room for 4 ACE
  smb3: fix potential null dereference in decrypt offload
  smb3: fix unmount hang in open_shroot
  smb3: allow disabling requesting leases
  smb3: improve handling of share deleted (and share recreated)
  smb3: display max smb3 requests in flight at any one time
  smb3: only offload decryption of read responses if multiple requests
  cifs: add a helper to find an existing readable handle to a file
  smb3: enable offload of decryption of large reads via mount option
  smb3: allow parallelizing decryption of reads
  cifs: add a debug macro that prints \\server\share for errors
  smb3: fix signing verification of large reads
  smb3: allow skipping signature verification for perf sensitive configurations
  smb3: add dynamic tracepoints for flush and close
  smb3: log warning if CSC policy conflicts with cache mount option
  ...
2019-09-19 10:32:16 -07:00
Linus Torvalds
cfb82e1df8 y2038: add inode timestamp clamping
This series from Deepa Dinamani adds a per-superblock minimum/maximum
 timestamp limit for a file system, and clamps timestamps as they are
 written, to avoid random behavior from integer overflow as well as having
 different time stamps on disk vs in memory.
 
 At mount time, a warning is now printed for any file system that can
 represent current timestamps but not future timestamps more than 30
 years into the future, similar to the arbitrary 30 year limit that was
 added to settimeofday().
 
 This was picked as a compromise to warn users to migrate to other file
 systems (e.g. ext4 instead of ext3) when they need the file system to
 survive beyond 2038 (or similar limits in other file systems), but not
 get in the way of normal usage.
 
 Signed-off-by: Arnd Bergmann <arnd@arndb.de>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2
 
 iQIcBAABCAAGBQJdcs20AAoJEJpsee/mABjZaOwQALl3lBEhg0aV6a0ZZ1uYehtd
 vcjZ6OpehfiOAxYJu0wfLPATo4T0FuBxZKz3+trkJDICcxyc68AJ2wijwInIQnZW
 MrSKnPyv/fSGp8Jr5w/0CLdp6yT6Dh7z4j2UxhwusR1bQh4cCYSswDg29/nmxgKp
 Nu8m7jMvJQ2Q0r4Zy0sT/MaycUcSH5yvpyTcsYFixGOz1niNy91ISs1+aq6HZ3i3
 +cuYTUy13y40iNUHzFBTcJItBnikwZOQ/zjNfJFXZ3bVEUPg8ZTLPYQ0OZz+pM0Z
 AlXCKghb2EOKgq729LtA6oaY+Nom/1Gm1p80q3G+nGRVOqRgC+dfAVPZQoiER5Y1
 zNPEDf2Sf7J9xktvfC+Qqa9QEUPLKs22ZIccG+vYBW65sS8IAiEDH3LAt444GGls
 yB/Cx/Qw7BftpR5Om27Mhm5jDQzr43iTkZaPQWq7ydJXpfxnjlg9L19yS1omDFyV
 hdbBXY6FikUICPKUW6I49z5BhjL+kmK9M2DVljImmdKNDTrfr0xY5M/EWjJZ7X+I
 rnSe9qTY+iQ5/AXANn5wfj1Y6L5IxkmdWI/zDIbKhYMZLCqqFLd3mJERbs+CMDJq
 qNrYyFPReFrg50oSduBPAByMTR4x9hus7iIC7r77kpoz5i60DPmIJoTfFm3844Gv
 sBEyvWV08CpE9mSzXuv6
 =em9y
 -----END PGP SIGNATURE-----

Merge tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground

Pull y2038 vfs updates from Arnd Bergmann:
 "Add inode timestamp clamping.

  This series from Deepa Dinamani adds a per-superblock minimum/maximum
  timestamp limit for a file system, and clamps timestamps as they are
  written, to avoid random behavior from integer overflow as well as
  having different time stamps on disk vs in memory.

  At mount time, a warning is now printed for any file system that can
  represent current timestamps but not future timestamps more than 30
  years into the future, similar to the arbitrary 30 year limit that was
  added to settimeofday().

  This was picked as a compromise to warn users to migrate to other file
  systems (e.g. ext4 instead of ext3) when they need the file system to
  survive beyond 2038 (or similar limits in other file systems), but not
  get in the way of normal usage"

* tag 'y2038-vfs' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/playground:
  ext4: Reduce ext4 timestamp warnings
  isofs: Initialize filesystem timestamp ranges
  pstore: fs superblock limits
  fs: omfs: Initialize filesystem timestamp ranges
  fs: hpfs: Initialize filesystem timestamp ranges
  fs: ceph: Initialize filesystem timestamp ranges
  fs: sysv: Initialize filesystem timestamp ranges
  fs: affs: Initialize filesystem timestamp ranges
  fs: fat: Initialize filesystem timestamp ranges
  fs: cifs: Initialize filesystem timestamp ranges
  fs: nfs: Initialize filesystem timestamp ranges
  ext4: Initialize timestamps limits
  9p: Fill min and max timestamps in sb
  fs: Fill in max and min timestamps in superblock
  utimes: Clamp the timestamps before update
  mount: Add mount warning for impending timestamp expiry
  timestamp_truncate: Replace users of timespec64_trunc
  vfs: Add timestamp_truncate() api
  vfs: Add file timestamp range support
2019-09-19 09:42:37 -07:00
Steve French
3e7a02d478 smb3: allow disabling requesting leases
In some cases to work around server bugs or performance
problems it can be helpful to be able to disable requesting
SMB2.1/SMB3 leases on a particular mount (not to all servers
and all shares we are mounted to). Add new mount parm
"nolease" which turns off requesting leases on directory
or file opens.  Currently the only way to disable leases is
globally through a module load parameter. This is more
granular.

Suggested-by: Pavel Shilovsky <pshilov@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
CC: Stable <stable@vger.kernel.org>
2019-09-16 11:43:38 -05:00
Steve French
10328c44cc smb3: only offload decryption of read responses if multiple requests
No point in offloading read decryption if no other requests on the
wire

Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French
563317ec30 smb3: enable offload of decryption of large reads via mount option
Disable offload of the decryption of encrypted read responses
by default (equivalent to setting this new mount option "esize=0").

Allow setting the minimum encrypted read response size that we
will choose to offload to a worker thread - it is now configurable
via on a new mount option "esize="

Depending on which encryption mechanism (GCM vs. CCM) and
the number of reads that will be issued in parallel and the
performance of the network and CPU on the client, it may make
sense to enable this since it can provide substantial benefit when
multiple large reads are in flight at the same time.

Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French
35cf94a397 smb3: allow parallelizing decryption of reads
decrypting large reads on encrypted shares can be slow (e.g. adding
multiple milliseconds per-read on non-GCM capable servers or
when mounting with dialects prior to SMB3.1.1) - allow parallelizing
of read decryption by launching worker threads.

Testing to Samba on localhost showed 25% improvement.
Testing to remote server showed very large improvement when
doing more than one 'cp' command was called at one time.

Signed-off-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-09-16 11:43:38 -05:00
Steve French
41e033fecd smb3: add mount option to allow RW caching of share accessed by only 1 client
If a share is known to be only to be accessed by one client, we
can aggressively cache writes not just reads to it.

Add "cache=" option (cache=singleclient) for mounting read write shares
(that will not be read or written to from other clients while we have
it mounted) in order to improve performance.

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16 11:43:38 -05:00
Steve French
83bbfa706d smb3: add mount option to allow forced caching of read only share
If a share is immutable (at least for the period that it will
be mounted) it would be helpful to not have to revalidate
dentries repeatedly that we know can not be changed remotely.

Add "cache=" option (cache=ro) for mounting read only shares
in order to improve performance in cases in which we know that
the share will not be changing while it is in use.

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-09-16 11:43:37 -05:00
Deepa Dinamani
cb7a69e605 fs: cifs: Initialize filesystem timestamp ranges
Fill in the appropriate limits to avoid inconsistencies
in the vfs cached inode times when timestamps are
outside the permitted range.

Also fixed cnvrtDosUnixTm calculations to avoid int overflow
while computing maximum date.

References:

http://cifs.com/

https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-cifs/d416ff7c-c536-406e-a951-4f04b2fd1d2b

Signed-off-by: Deepa Dinamani <deepa.kernel@gmail.com>
Acked-by: Jeff Layton <jlayton@kernel.org>
Cc: sfrench@samba.org
Cc: linux-cifs@vger.kernel.org
2019-08-30 07:27:18 -07:00
Ard Biesheuvel
9a394d1208 fs: cifs: move from the crypto cipher API to the new DES library interface
Some legacy code in the CIFS driver uses single DES to calculate
some password hash, and uses the crypto cipher API to do so. Given
that there is no point in invoking an accelerated cipher for doing
56-bit symmetric encryption on a single 8-byte block of input, the
flexibility of the crypto cipher API does not add much value here,
and so we're much better off using a library call into the generic
C implementation.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-08-22 14:57:34 +10:00
Amir Goldstein
bf3c90ee1e cifs: copy_file_range needs to strip setuid bits and update timestamps
cifs has both source and destination inodes locked throughout the copy.
Like ->write_iter(), we update mtime and strip setuid bits of destination
file before copy and like ->read_iter(), we update atime of source file
after copy.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-18 14:03:03 -05:00
Linus Torvalds
ae9b728c8d smb3/cifs fixes (3 for stable) and improvements including much faster encryption (SMB3.1.1 GCM)
-----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEE6fsu8pdIjtWE/DpLiiy9cAdyT1EFAl0wDEQACgkQiiy9cAdy
 T1E3CQv/e+8uTD0dSmU+bEBopYCtihRq7ZGXtCGSE8U/fj0l34qBxds/JLvTSSeY
 NhUD+F5e2NYSU7LZx8d9HkOJStcLaNx5Jq1YrxmGvVfUC6s7VKn9637nByXhrgrM
 t/rQj8Ot6RDGMNs7PlMUt1jjtP3zL9ugQ2DHsjLoCY+w07qbsVWCZlm9sJEmr8lS
 3umvfPPi8LKNsOxTT+DsSwZ+XN/BctCExeojVkdFRCBsYJyHbJtejeJPXWxv4/6m
 lQpY0uLwjxgRO6aZxFvMW18vhI8977f1svwA4CmgaVYB0A7yr1VptINWVPfN+mGK
 BYJRe1i54JSBZ8/vp1POvKrhLa6Y623BNpa6myjxOXYQ3/M7PDU+PycosI4V61Bp
 yyH451jdKGZYojG6O7qGGE8kTDyjCs/k/2GeNeUKvHcNX9juDBMTxx2G5kP+w/xd
 2lgvgrYlSWVG/p1ADlHtwsAEupg8xZcl/y3IGBIAw57uKAX2LRzujbeT/CpZ3phm
 k5ZljExt
 =bdbT
 -----END PGP SIGNATURE-----

Merge tag '4.3-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6

Pull cifs updates from Steve French:
 "Fixes (three for stable) and improvements including much faster
  encryption (SMB3.1.1 GCM)"

* tag '4.3-rc-smb3-fixes' of git://git.samba.org/sfrench/cifs-2.6: (27 commits)
  smb3: smbdirect no longer experimental
  cifs: fix crash in smb2_compound_op()/smb2_set_next_command()
  cifs: fix crash in cifs_dfs_do_automount
  cifs: fix parsing of symbolic link error response
  cifs: refactor and clean up arguments in the reparse point parsing
  SMB3: query inode number on open via create context
  smb3: Send netname context during negotiate protocol
  smb3: do not send compression info by default
  smb3: add new mount option to retrieve mode from special ACE
  smb3: Allow query of symlinks stored as reparse points
  cifs: Fix a race condition with cifs_echo_request
  cifs: always add credits back for unsolicited PDUs
  fs: cifs: cifsssmb: Change return type of convert_ace_to_cifs_ace
  add some missing definitions
  cifs: fix typo in debug message with struct field ia_valid
  smb3: minor cleanup of compound_send_recv
  CIFS: Fix module dependency
  cifs: simplify code by removing CONFIG_CIFS_ACL ifdef
  cifs: Fix check for matching with existing mount
  cifs: Properly handle auto disabling of serverino option
  ...
2019-07-18 11:11:51 -07:00
Linus Torvalds
40f06c7995 Changes to copy_file_range for 5.3 from Dave and Amir:
- Create a generic copy_file_range handler and make individual
   filesystems responsible for calling it (i.e. no more assuming that
   do_splice_direct will work or is appropriate)
 - Refactor copy_file_range and remap_range parameter checking where they
   are the same
 - Install missing copy_file_range parameter checking(!)
 - Remove suid/sgid and update mtime like any other file write
 - Change the behavior so that a copy range crossing the source file's
   eof will result in a short copy to the source file's eof instead of
   EINVAL
 - Permit filesystems to decide if they want to handle cross-superblock
   copy_file_range in their local handlers.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEUzaAxoMeQq6m2jMV+H93GTRKtOsFAl0BGvAACgkQ+H93GTRK
 tOu2aw/+KGG7PiXm9ED3ZXUppKVddrZMOgqM7mSfHo6TBgW3pJUJcRIhawK0Wz/P
 stgTsOkurHSl3iT3vQyX4GTZvLoGN/rfsRLPxogJptBUqVv3BOrXsrI53f7V/kbm
 rtjlYsgExji7VBUiMTe5kOWWqxyR7B4nXyvY/8rier57rW/8C1I58B0OrxAmTK0k
 rz1e5BtE1dg91xA7cSdEc38FInz8MW8cvsrEzW9vyYY4IVE0PBuhhA1EvryxTrAZ
 hfthHFfzwxhJkI0mdha8uqNufNWrHLSqiwyjYC7pwAwSQzQPiQz9U17flu+URnfF
 kXaR5LdXbBP3pl46RdthrfuonWsv612cC1Qwfjs8PBG9lG7b9PGJ40MGVTiw7LlQ
 924/03ho0zAnV0E8Qn5O9nPshQNDJhwhzMS39EmMyFKb1D5XGzdMV0gDdIfx6hdO
 HDbw6VQ33S59gvk7v/gxsFB5Bs4PKfamHx/QmwQwpqWM5XExcr0yJ90OTBtAuY4r
 S+9gwG6uED3aPh8HbQ5UgnA8bZmMmi8AkcBvqJ9GgNw5SbZl0oyv9Sj6JNpoOejV
 8y9JkhoZUxqiihnKTw/vtMrj5RCOfifNBjMSwrShfLdLKtK0AZl1mXC0/1Q3VnEQ
 TUcyRHEzrtHgJ9/AK9xIyDNvNYzvHSLZj7maoZZumgQa2FOFrmw=
 =qM44
 -----END PGP SIGNATURE-----

Merge tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull copy_file_range updates from Darrick Wong:
 "This fixes numerous parameter checking problems and inconsistent
  behaviors in the new(ish) copy_file_range system call.

  Now the system call will actually check its range parameters
  correctly; refuse to copy into files for which the caller does not
  have sufficient privileges; update mtime and strip setuid like file
  writes are supposed to do; and allows copying up to the EOF of the
  source file instead of failing the call like we used to.

  Summary:

   - Create a generic copy_file_range handler and make individual
     filesystems responsible for calling it (i.e. no more assuming that
     do_splice_direct will work or is appropriate)

   - Refactor copy_file_range and remap_range parameter checking where
     they are the same

   - Install missing copy_file_range parameter checking(!)

   - Remove suid/sgid and update mtime like any other file write

   - Change the behavior so that a copy range crossing the source file's
     eof will result in a short copy to the source file's eof instead of
     EINVAL

   - Permit filesystems to decide if they want to handle
     cross-superblock copy_file_range in their local handlers"

* tag 'copy-file-range-fixes-1' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  fuse: copy_file_range needs to strip setuid bits and update timestamps
  vfs: allow copy_file_range to copy across devices
  xfs: use file_modified() helper
  vfs: introduce file_modified() helper
  vfs: add missing checks to copy_file_range
  vfs: remove redundant checks from generic_remap_checks()
  vfs: introduce generic_file_rw_checks()
  vfs: no fallback for ->copy_file_range
  vfs: introduce generic_copy_file_range()
2019-07-10 20:32:37 -07:00
Linus Torvalds
4d2fa8b44b Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto updates from Herbert Xu:
 "Here is the crypto update for 5.3:

  API:
   - Test shash interface directly in testmgr
   - cra_driver_name is now mandatory

  Algorithms:
   - Replace arc4 crypto_cipher with library helper
   - Implement 5 way interleave for ECB, CBC and CTR on arm64
   - Add xxhash
   - Add continuous self-test on noise source to drbg
   - Update jitter RNG

  Drivers:
   - Add support for SHA204A random number generator
   - Add support for 7211 in iproc-rng200
   - Fix fuzz test failures in inside-secure
   - Fix fuzz test failures in talitos
   - Fix fuzz test failures in qat"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (143 commits)
  crypto: stm32/hash - remove interruptible condition for dma
  crypto: stm32/hash - Fix hmac issue more than 256 bytes
  crypto: stm32/crc32 - rename driver file
  crypto: amcc - remove memset after dma_alloc_coherent
  crypto: ccp - Switch to SPDX license identifiers
  crypto: ccp - Validate the the error value used to index error messages
  crypto: doc - Fix formatting of new crypto engine content
  crypto: doc - Add parameter documentation
  crypto: arm64/aes-ce - implement 5 way interleave for ECB, CBC and CTR
  crypto: arm64/aes-ce - add 5 way interleave routines
  crypto: talitos - drop icv_ool
  crypto: talitos - fix hash on SEC1.
  crypto: talitos - move struct talitos_edesc into talitos.h
  lib/scatterlist: Fix mapping iterator when sg->offset is greater than PAGE_SIZE
  crypto/NX: Set receive window credits to max number of CRBs in RxFIFO
  crypto: asymmetric_keys - select CRYPTO_HASH where needed
  crypto: serpent - mark __serpent_setkey_sbox noinline
  crypto: testmgr - dynamically allocate crypto_shash
  crypto: testmgr - dynamically allocate testvec_config
  crypto: talitos - eliminate unneeded 'done' functions at build time
  ...
2019-07-08 20:57:08 -07:00
Steve French
412094a8fb smb3: add new mount option to retrieve mode from special ACE
There is a special ACE used by some servers to allow the mode
bits to be stored.  This can be especially helpful in scenarios
in which the client is trusted, and access checking on the
client vs the POSIX mode bits is sufficient.

Add mount option to allow enabling this behavior.
Follow on patch will add support for chmod and queryinfo
(stat) by retrieving the POSIX mode bits from the special
ACE, SID: S-1-5-88-3

See e.g.
https://docs.microsoft.com/en-us/previous-versions/windows/it-pro/windows-server-2008-R2-and-2008/hh509017(v=ws.10)

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-07-07 22:37:43 -05:00
Steve French
73cf8085dc cifs: simplify code by removing CONFIG_CIFS_ACL ifdef
SMB3 ACL support is needed for many use cases now and should not be
ifdeffed out, even for SMB1 (CIFS).  Remove the CONFIG_CIFS_ACL
ifdef so ACL support is always built into cifs.ko

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07 22:37:43 -05:00
Steve French
dc179268cd smb3: if max_credits is specified then display it in /proc/mounts
If "max_credits" is overridden from its default by specifying
it on the smb3 mount then display it in /proc/mounts

Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07 22:37:43 -05:00
Aurelien Aptel
5fc3681fa5 cifs: add missing GCM module dependency
Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-07-07 22:37:42 -05:00
Ard Biesheuvel
97a5fee2bd fs: cifs: switch to RC4 library interface
The CIFS code uses the sync skcipher API to invoke the ecb(arc4) skcipher,
of which only a single generic C code implementation exists. This means
that going through all the trouble of using scatterlists etc buys us
very little, and we're better off just invoking the arc4 library directly.

This also reverts commit 5f4b55699a ("CIFS: Fix BUG() in calc_seckey()"),
since it is no longer necessary to allocate sec_key on the heap.

Cc: linux-cifs@vger.kernel.org
Cc: Steve French <sfrench@samba.org>
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Acked-by: Steve French <stfrench@microsoft.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-06-20 14:19:55 +08:00
Ronnie Sahlberg
487317c994 cifs: add spinlock for the openFileList to cifsInodeInfo
We can not depend on the tcon->open_file_lock here since in multiuser mode
we may have the same file/inode open via multiple different tcons.

The current code is race prone and will crash if one user deletes a file
at the same time a different user opens/create the file.

To avoid this we need to have a spinlock attached to the inode and not the tcon.

RHBZ:  1580165

CC: Stable <stable@vger.kernel.org>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2019-06-13 14:21:09 -05:00
Amir Goldstein
5dae222a5f vfs: allow copy_file_range to copy across devices
We want to enable cross-filesystem copy_file_range functionality
where possible, so push the "same superblock only" checks down to
the individual filesystem callouts so they can make their own
decisions about cross-superblock copy offload and fallack to
generic_copy_file_range() for cross-superblock copy.

[Amir] We do not call ->remap_file_range() in case the files are not
on the same sb and do not call ->copy_file_range() in case the files
do not belong to the same filesystem driver.

This changes behavior of the copy_file_range(2) syscall, which will
now allow cross filesystem in-kernel copy.  CIFS already supports
cross-superblock copy, between two shares to the same server. This
functionality will now be available via the copy_file_range(2) syscall.

Cc: Steve French <stfrench@microsoft.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:20 -07:00
Dave Chinner
64bf5ff58d vfs: no fallback for ->copy_file_range
Now that we have generic_copy_file_range(), remove it as a fallback
case when offloads fail. This puts the responsibility for executing
fallbacks on the filesystems that implement ->copy_file_range and
allows us to add operational validity checks to
generic_copy_file_range().

Rework vfs_copy_file_range() to call a new do_copy_file_range()
helper to execute the copying callout, and move calls to
generic_file_copy_range() into filesystem methods where they
currently return failures.

[Amir] overlayfs is not responsible of executing the fallback.
It is the responsibility of the underlying filesystem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2019-06-09 10:06:19 -07:00
Ronnie Sahlberg
dece44e381 cifs: add support for SEEK_DATA and SEEK_HOLE
Add llseek op for SEEK_DATA and SEEK_HOLE.
Improves xfstests/285,286,436,445,448 and 490

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-15 22:27:53 -05:00
Kovtunenko Oleksandr
9ab70ca653 Fixed https://bugzilla.kernel.org/show_bug.cgi?id=202935 allow write on the same file
Copychunk allows source and target to be on the same file.
For details on restrictions see MS-SMB2 3.3.5.15.6

Signed-off-by: Kovtunenko Oleksandr <alexander198961@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-15 22:27:53 -05:00
Ronnie Sahlberg
2f3ebaba13 cifs: add fiemap support
Useful for improved copy performance as well as for
applications which query allocated ranges of sparse
files.

Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:55 -05:00
Kenneth D'souza
c8b6ac1a9d CIFS: Show locallease in /proc/mounts for cifs shares mounted with locallease feature.
Missing parameter that should be displayed in the mount list

Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Kenneth D'souza <kdsouza@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-05-07 23:24:54 -05:00
Al Viro
c2e6802e7b cifs: switch to ->free_inode()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2019-05-01 22:43:24 -04:00
Steve French
ca567eb2b3 SMB3: Allow persistent handle timeout to be configurable on mount
Reconnecting after server or network failure can be improved
(to maintain availability and protect data integrity) by allowing
the client to choose the default persistent (or resilient)
handle timeout in some use cases.  Today we default to 0 which lets
the server pick the default timeout (usually 120 seconds) but this
can be problematic for some workloads.  Add the new mount parameter
to cifs.ko for SMB3 mounts "handletimeout" which enables the user
to override the default handle timeout for persistent (mount
option "persistenthandles") or resilient handles (mount option
"resilienthandles").  Maximum allowed is 16 minutes (960000 ms).
Units for the timeout are expressed in milliseconds. See
section 2.2.14.2.12 and 2.2.31.3 of the MS-SMB2 protocol
specification for more information.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2019-04-01 14:33:36 -05:00
Xiaoli Feng
b073a08016 cifs: fix that return -EINVAL when do dedupe operation
dedupe_file_range operations is combiled into remap_file_range.
But it's always skipped for dedupe operations in function
cifs_remap_file_range.

Example to test:
Before this patch:
  # dd if=/dev/zero of=cifs/file bs=1M count=1
  # xfs_io -c "dedupe cifs/file 4k 64k 4k" cifs/file
  XFS_IOC_FILE_EXTENT_SAME: Invalid argument

After this patch:
  # dd if=/dev/zero of=cifs/file bs=1M count=1
  # xfs_io -c "dedupe cifs/file 4k 64k 4k" cifs/file
  XFS_IOC_FILE_EXTENT_SAME: Operation not supported

Influence for xfstests:
generic/091
generic/112
generic/127
generic/263
These tests report this error "do_copy_range:: Invalid
argument" instead of "FIDEDUPERANGE: Invalid argument".
Because there are still two bugs cause these test failed.
https://bugzilla.kernel.org/show_bug.cgi?id=202935
https://bugzilla.kernel.org/show_bug.cgi?id=202785

Signed-off-by: Xiaoli Feng <fengxiaoli0714@gmail.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2019-03-22 22:36:54 -05:00
Steve French
96281b9e46 smb3: for kerberos mounts display the credential uid used
For kerberos mounts, the cruid is helpful to display in
/proc/mounts in order to tell which uid's krb5 cache we
got the ticket for and to tell in the multiuser krb5 case
which local users (uids) we have Kerberos authentic sessions
for.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2019-03-05 18:10:48 -06:00
Steve French
e8506d25f7 smb3: make default i/o size for smb3 mounts larger
We negotiate rsize mounts (and it can be overridden by user) to
typically 4MB, so using larger default I/O sizes from userspace
(changing to 1MB default i/o size returned by stat) the
performance is much better (and not just for long latency
network connections) in most use cases for SMB3 than the default I/O
size (which ends up being 128K for cp and can be even smaller for cp).
This can be 4x slower or worse depending on network latency.

By changing inode->blocksize from 32K (which was perhaps ok
for very old SMB1/CIFS) to a larger value, 1MB (but still less than
max size negotiated with the server which is 4MB, in order to minimize
risk) it significantly increases performance for the
noncached case, and slightly increases it for the cached case.
This can be changed by the user on mount (specifying bsize=
values from 16K to 16MB) to tune better for performance
for applications that depend on blocksize.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
CC: Stable <stable@vger.kernel.org>
2019-03-04 20:05:35 -06:00
Paulo Alcantara
1c780228e9 cifs: Make use of DFS cache to get new DFS referrals
This patch will make use of DFS cache routines where appropriate and
do not always request a new referral from server.

Signed-off-by: Paulo Alcantara <palcantara@suse.de>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-12-28 10:09:46 -06:00
Colin Ian King
8c6c9bed87 cifs: don't dereference smb_file_target before null check
There is a null check on dst_file->private data which suggests
it can be potentially null. However, before this check, pointer
smb_file_target is derived from dst_file->private and dereferenced
in the call to tlink_tcon, hence there is a potential null pointer
deference.

Fix this by assigning smb_file_target and target_tcon after the
null pointer sanity checks.

Detected by CoverityScan, CID#1475302 ("Dereference before null check")

Fixes: 04b38d6012 ("vfs: pull btrfs clone API to vfs layer")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-11-02 14:09:42 -05:00
Long Li
be4eb68846 CIFS: Add direct I/O functions to file_operations
With direct read/write functions implemented, add them to file_operations.

Dircet I/O is used under two conditions:
1. When mounting with "cache=none", CIFS uses direct I/O for all user file
data transfer.
2. When opening a file with O_DIRECT, CIFS uses direct I/O for all data
transfer on this file.

Signed-off-by: Long Li <longli@microsoft.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-11-02 14:09:42 -05:00
Linus Torvalds
c2aa1a444c vfs: rework data cloning infrastructure
Rework the vfs_clone_file_range and vfs_dedupe_file_range infrastructure to use
 a common .remap_file_range method and supply generic bounds and sanity checking
 functions that are shared with the data write path. The current VFS
 infrastructure has problems with rlimit, LFS file sizes, file time stamps,
 maximum filesystem file sizes, stripping setuid bits, etc and so they are
 addressed in these commits.
 
 We also introduce the ability for the ->remap_file_range methods to return short
 clones so that clones for vfs_copy_file_range() don't get rejected if the entire
 range can't be cloned. It also allows filesystems to sliently skip deduplication
 of partial EOF blocks if they are not capable of doing so without requiring
 errors to be thrown to userspace.
 
 All existing filesystems are converted to user the new .remap_file_range method,
 and both XFS and ocfs2 are modified to make use of the new generic checking
 infrastructure.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJb29gEAAoJEK3oKUf0dfodpOAQAL2VbHjvKXEwNMDTKscSRMmZ
 Z0xXo3gamFKQ+VGOqy2g2lmAYQs9SAnTuCGTJ7zIAp7u+q8gzUy5FzKAwLS4Id6L
 8siaY6nzlicfO04d0MdXnWz0f3xykChgzfdQfVUlUi7WrDioBUECLPmx4a+USsp1
 DQGjLOZfoOAmn2rijdnH9RTEaHqg+8mcTaLN9TRav4gGqrWxldFKXw2y6ouFC7uo
 /hxTRNXR9VI+EdbDelwBNXl9nU9gQA0WLOvRKwgUrtv6bSJohTPsmXt7EbBtNcVR
 cl3zDNc1sLD1bLaRLEUAszI/33wXaaQgom1iB51obIcHHef+JxRNG/j6rUMfzxZI
 VaauGv5EIvtaKN0LTAqVVLQ8t2MQFYfOr8TykmO+1UFog204aKRANdVMHDSjxD/0
 dTGKJGcq+HnKQ+JHDbTdvuXEL8sUUl1FiLjOQbZPw63XmuddLKFUA2TOjXn6htbU
 1h1MG5d9KjGLpabp2BQheczD08NuSmcrOBNt7IoeI3+nxr3HpMwprfB9TyaERy9X
 iEgyVXmjjc9bLLRW7A2wm77aW64NvPs51wKMnvuNgNwnCewrGS6cB8WVj2zbQjH1
 h3f3nku44s9ctNPSBzb/sJLnpqmZQ5t0oSmrMSN+5+En6rNTacoJCzxHRJBA7z/h
 Z+C6y1GTZw0euY6Zjiwu
 =CE/A
 -----END PGP SIGNATURE-----

Merge tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull vfs dedup fixes from Dave Chinner:
 "This reworks the vfs data cloning infrastructure.

  We discovered many issues with these interfaces late in the 4.19 cycle
  - the worst of them (data corruption, setuid stripping) were fixed for
  XFS in 4.19-rc8, but a larger rework of the infrastructure fixing all
  the problems was needed. That rework is the contents of this pull
  request.

  Rework the vfs_clone_file_range and vfs_dedupe_file_range
  infrastructure to use a common .remap_file_range method and supply
  generic bounds and sanity checking functions that are shared with the
  data write path. The current VFS infrastructure has problems with
  rlimit, LFS file sizes, file time stamps, maximum filesystem file
  sizes, stripping setuid bits, etc and so they are addressed in these
  commits.

  We also introduce the ability for the ->remap_file_range methods to
  return short clones so that clones for vfs_copy_file_range() don't get
  rejected if the entire range can't be cloned. It also allows
  filesystems to sliently skip deduplication of partial EOF blocks if
  they are not capable of doing so without requiring errors to be thrown
  to userspace.

  Existing filesystems are converted to user the new remap_file_range
  method, and both XFS and ocfs2 are modified to make use of the new
  generic checking infrastructure"

* tag 'xfs-4.20-merge-2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (28 commits)
  xfs: remove [cm]time update from reflink calls
  xfs: remove xfs_reflink_remap_range
  xfs: remove redundant remap partial EOF block checks
  xfs: support returning partial reflink results
  xfs: clean up xfs_reflink_remap_blocks call site
  xfs: fix pagecache truncation prior to reflink
  ocfs2: remove ocfs2_reflink_remap_range
  ocfs2: support partial clone range and dedupe range
  ocfs2: fix pagecache truncation prior to reflink
  ocfs2: truncate page cache for clone destination file before remapping
  vfs: clean up generic_remap_file_range_prep return value
  vfs: hide file range comparison function
  vfs: enable remap callers that can handle short operations
  vfs: plumb remap flags through the vfs dedupe functions
  vfs: plumb remap flags through the vfs clone functions
  vfs: make remap_file_range functions take and return bytes completed
  vfs: remap helper should update destination inode metadata
  vfs: pass remap flags to generic_remap_checks
  vfs: pass remap flags to generic_remap_file_range_prep
  vfs: combine the clone and dedupe into a single remap_file_range
  ...
2018-11-02 09:33:08 -07:00
Darrick J. Wong
42ec3d4c02 vfs: make remap_file_range functions take and return bytes completed
Change the remap_file_range functions to take a number of bytes to
operate upon and return the number of bytes they operated on.  This is a
requirement for allowing fs implementations to return short clone/dedupe
results to the user, which will enable us to obey resource limits in a
graceful manner.

A subsequent patch will enable copy_file_range to signal to the
->clone_file_range implementation that it can handle a short length,
which will be returned in the function's return value.  For now the
short return is not implemented anywhere so the behavior won't change --
either copy_file_range manages to clone the entire range or it tries an
alternative.

Neither clone ioctl can take advantage of this, alas.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-30 10:41:49 +11:00
Darrick J. Wong
2e5dfc99f2 vfs: combine the clone and dedupe into a single remap_file_range
Combine the clone_file_range and dedupe_file_range operations into a
single remap_file_range file operation dispatch since they're
fundamentally the same operation.  The differences between the two can
be made in the prep functions.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2018-10-30 10:41:21 +11:00
Steve French
8c1beb9801 cifs: minor clarification in comments
Clarify meaning (in comments) meaning of various
options for debug messages in cifs.ko. Also fixed
trivial formatting/style issue with previous patch.

Signed-off-by: Steve French <stfrench@microsoft.com>
2018-10-23 21:16:05 -05:00
Rodrigo Freire
f80eaedd6c CIFS: Print message when attempting a mount
Currently, no messages are printed when mounting a CIFS filesystem and
no debug configuration is enabled.

However, a CIFS mount information is valuable when troubleshooting
and/or forensic analyzing a system and finding out if was a CIFS
endpoint mount attempted.

Other filesystems such as XFS, EXT* does issue a printk() when mounting
their filesystems.

A terse log message is printed only if cifsFYI is not enabled. Otherwise,
the default full debug message is printed.

In order to not clutter and classify correctly the event messages, these
are logged as KERN_INFO level.

Sample mount operations:

[root@corinthians ~]# mount -o user=administrator //172.25.250.18/c$ /mnt
(non-existent system)

[root@corinthians ~]# mount -o user=administrator //172.25.250.19/c$ /mnt
(Valid system)

Kernel message log for the mount operations:

[  450.464543] CIFS: Attempting to mount //172.25.250.18/c$
[  456.478186] CIFS VFS: Error connecting to socket. Aborting operation.
[  456.478381] CIFS VFS: cifs_mount failed w/return code = -113
[  467.688866] CIFS: Attempting to mount //172.25.250.19/c$

Signed-off-by: Rodrigo Freire <rfreire@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2018-10-23 21:16:05 -05:00
Aurelien Aptel
8393072bab CIFS: make 'nodfs' mount opt a superblock flag
tcon->Flags is only used by SMB1 code and changing it is not permanent
(you lose the setting on tcon reconnect).

* Move the setting to superblock flags (per mount-points).
* Make automount callback exit early when flag present
* Make dfs resolving happening in mount syscall exit early if flag present

Signed-off-by: Aurelien Aptel <aaptel@suse.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-10-23 21:16:05 -05:00
Steve French
00778e2294 smb3: add way to control slow response threshold for logging and stats
/proc/fs/cifs/Stats when CONFIG_CIFS_STATS2 is enabled logs 'slow'
responses, but depending on the server you are debugging a
one second timeout may be too fast, so allow setting it to
a larger number of seconds via new module parameter

/sys/module/cifs/parameters/slow_rsp_threshold

or via modprobe:

slow_rsp_threshold:Amount of time (in seconds) to wait before
logging that a response is delayed.
Default: 1 (if set to 0 disables msg). (uint)

Recommended values are 0 (disabled) to 32767 (9 hours) with
the default remaining as 1 second.

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2018-10-23 21:16:04 -05:00
Steve French
1c3a13a38a cifs: minor updates to module description for cifs.ko
note smb3 (and common more modern servers) in the module description

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-10-23 21:16:04 -05:00
Ronnie Sahlberg
e55954a5f7 cifs: don't show domain= in mount output when domain is empty
Reported-by: Xiaoli Feng <xifeng@redhat.com>
Signed-off-by: Ronnie Sahlberg <lsahlber@redhat.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Pavel Shilovsky <pshilov@microsoft.com>
2018-08-10 11:53:51 -05:00
Steve French
8a69e96e61 smb3: snapshot mounts are read-only and make sure info is displayable about the mount
snapshot mounts were not marked as read-only and did not display the snapshot
time (in /proc/mounts) specified on mount

With this patch - note that can not write to the snapshot mount (see "ro" in
/proc/mounts line) and also the missing snapshot timewarp token time is
dumped.  Sample line from /proc/mounts with the patch:

//127.0.0.1/scratch /mnt2 smb3 ro,relatime,vers=default,cache=strict,username=testuser,domain=,uid=0,noforceuid,gid=0,noforcegid,addr=127.0.0.1,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,noperm,rsize=1048576,wsize=1048576,echo_interval=60,snapshot=1234567,actimeo=1 0 0

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Paulo Alcantara <palcantara@suse.de>
2018-08-07 14:15:56 -05:00
Steve French
0fdfef9aa7 smb3: simplify code by removing CONFIG_CIFS_SMB311
We really, really want to be encouraging use of secure dialects,
and SMB3.1.1 offers useful security features, and will soon
be the recommended dialect for many use cases. Simplify the code
by removing the CONFIG_CIFS_SMB311 ifdef so users don't disable
it in the build, and create compatibility and/or security issues
with modern servers - many of which have been supporting this
dialect for multiple years.

Also clarify some of the Kconfig text for cifs.ko about
SMB3.1.1 and current supported features in the module.

Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-08-07 14:15:56 -05:00
Steve French
21ba3845b5 smb3: fill in statfs fsid and correct namelen
Fil in the correct namelen (typically 255 not 4096) in the
statfs response and also fill in a reasonably unique fsid
(in this case taken from the volume id, and the creation time
of the volume).

In the case of the POSIX statfs all fields are now filled in,
and in the case of non-POSIX mounts, all fields are filled
in which can be.

Signed-off-by: Steve French <stfrench@gmail.com>
CC: Stable <stable@vger.kernel.org>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
2018-08-07 14:15:41 -05:00
Steve French
c7c137b931 smb3: do not allow insecure cifs mounts when using smb3
if mounting as smb3 do not allow cifs (vers=1.0) or insecure vers=2.0
mounts.

For example:
root@smf-Thinkpad-P51:~/cifs-2.6# mount -t smb3 //127.0.0.1/scratch /mnt -o username=testuser,password=Testpass1
root@smf-Thinkpad-P51:~/cifs-2.6# umount /mnt
root@smf-Thinkpad-P51:~/cifs-2.6# mount -t smb3 //127.0.0.1/scratch /mnt -o username=testuser,password=Testpass1,vers=1.0
mount: /mnt: wrong fs type, bad option, bad superblock on //127.0.0.1/scratch ...
root@smf-Thinkpad-P51:~/cifs-2.6# dmesg | grep smb3
[ 4302.200122] CIFS VFS: vers=1.0 (cifs) not permitted when mounting with smb3
root@smf-Thinkpad-P51:~/cifs-2.6# mount -t smb3 //127.0.0.1/scratch /mnt -o username=testuser,password=Testpass1,vers=3.11

Signed-off-by: Steve French <stfrench@microsoft.com>
Acked-by: Pavel Shilovsky <pshilov@microsoft.com>
Reviewed-by: Aurelien Aptel <aaptel@suse.com>
Reviewed-by: Sachin Prabhu <sprabhu@redhat.com>
2018-06-07 08:36:39 -05:00
Steve French
b326614ea2 smb3: allow "posix" mount option to enable new SMB311 protocol extensions
If "posix" (or synonym "unix" for backward compatibility) specified on mount,
and server advertises support for SMB3.11 POSIX negotiate context, then
enable the new posix extensions on the tcon.  This can be viewed by
looking for "posix" in the mount options displayed by /proc/mounts
for that mount (ie if posix extensions allowed by server and the
experimental POSIX extensions also requested on the mount by specifying
"posix" at mount time).

Also add check to warn user if conflicting unix/nounix or posix/noposix specified
on mount.

Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30 16:06:18 -05:00
Steve French
f92a720ee9 cifs: allow disabling less secure legacy dialects
To improve security it may be helpful to have additional ways to restrict the
ability to override the default dialects (SMB2.1, SMB3 and SMB3.02) on mount
with old dialects (CIFS/SMB1 and SMB2) since vers=1.0 (CIFS/SMB1) and vers=2.0
are weaker and less secure.

Add a module parameter "disable_legacy_dialects"
(/sys/module/cifs/parameters/disable_legacy_dialects) which can be set to
1 (or equivalently Y) to forbid use of vers=1.0 or vers=2.0 on mount.

Also cleans up a few build warnings about globals for various module parms.

Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30 16:06:18 -05:00
Steve French
11911b956f cifs: make minor clarifications to module params for cifs.ko
Note which ones of the module params are cifs dialect only
(N/A for default dialect now that has moved to SMB2.1 or later)

Signed-off-by: Steve French <stfrench@microsoft.com>
Reviewed-by: Ronnie Sahlberg <lsahlber@redhat.com>
2018-05-30 16:06:18 -05:00
Steve French
49218b4f57 smb3: add module alias for smb3 to cifs.ko
We really don't want to be encouraging people to use the old
(less secure) cifs dialect (SMB1) and it can be confusing for them
with SMB3 (or later) being recommended but the module name is cifs.

Add a module alias for "smb3" to cifs.ko to make this less confusing.

Signed-off-by: Steve French <smfrench@gmail.com>
2018-05-30 16:06:18 -05:00