Commit Graph

211 Commits

Author SHA1 Message Date
Heinz Mauelshagen
6cc435fa76 dm: avoid 'do {} while(0)' loop in single statement macros
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:07 -05:00
Heinz Mauelshagen
1c13188669 dm: prefer '"%s...", __func__'
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:07 -05:00
Heinz Mauelshagen
2d0f25cbc0 dm: remove unnecessary braces from single statement blocks
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Heinz Mauelshagen
0ef0b4717a dm: add missing empty lines
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Heinz Mauelshagen
03b1888770 dm: fix trailing statements
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Heinz Mauelshagen
255e264649 dm: address indent/space issues
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Heinz Mauelshagen
86a3238c7b dm: change "unsigned" to "unsigned int"
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Heinz Mauelshagen
3bd9400307 dm: add missing SPDX-License-Indentifiers
'GPL-2.0-only' is used instead of 'GPL-2.0' because SPDX has
deprecated its use.

Suggested-by: John Wiele <jwiele@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-14 14:23:06 -05:00
Herbert Xu
dcfe653d7c dm: Remove completion function scaffolding
This patch removes the temporary scaffolding now that the comletion
function signature has been converted.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-02-13 18:35:15 +08:00
Herbert Xu
96747228b7 dm: Add scaffolding to change completion function signature
This patch adds temporary scaffolding so that the Crypto API
completion function can take a void * instead of crypto_async_request.
Once affected users have been converted this can be removed.

Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Acked-by: Mike Snitzer <snitzer@kernel.org>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2023-02-13 18:34:48 +08:00
Jiapeng Chong
5cd6d1d53a dm integrity: Remove bi_sector that's only used by commented debug code
drivers/md/dm-integrity.c:1738:13: warning: variable 'bi_sector' set but not used.

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=3895
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2023-02-02 14:26:09 -05:00
Luo Meng
f50cb2cbab dm integrity: Fix UAF in dm_integrity_dtr()
Dm_integrity also has the same UAF problem when dm_resume()
and dm_destroy() are concurrent.

Therefore, cancelling timer again in dm_integrity_dtr().

Cc: stable@vger.kernel.org
Fixes: 7eada909bf ("dm: add integrity target")
Signed-off-by: Luo Meng <luomeng12@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-11-30 13:29:34 -05:00
Linus Torvalds
f4408c3dfc block-6.1-2022-11-18
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmN38ZUQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpgXxD/9tUSFUKIVGIn4pmNILfY3XV45HOi1w44yR
 zCxCELupcBeT+YixmaJcT8sunrrg2fLPOXMrDJk1cG/izXHzkjAQsHZvERfqC7hC
 f5onH+2MyGm3qBwxV0iGqITJgTwQGInVJijT4f9UZd/8ultymyZR2nOdIdIydHCF
 qzlOjq6hgIuGKHhFgOqRUg/OAkx510ZEEilUDcZ6XVV+zL7ccN6J9+eNTI3c58wT
 7jvxZC4u6QGKteGvVniE3WXgk3QdFiQRORvV09g+PkbG/vPjAIZ5tJFb9PdIOebD
 3guDiNUasgz2vnDetMK+yk4LcedcRfWnqgn+Vm8C26j5Fxs13eDx5kMDteVy7CYh
 3bokOATHohoZZ9qTApgQUswTfGJfBdoy0nUTPuffxPdKDyUPteIxFCADcnyDHnDG
 d/+PjU3FKF31o2HcUfvYp7OMO0VZP0hJSWps8znoVXKxb+LH9qKkYzHVlfni5kkS
 k9XqqD1Ki98Erb346YqgvQjCkz+CUd5DxtGyh9Oh2+oS2qHP6WjdKo1QPFmWD5dp
 EyXGSqGoZrIPtnKohLUN9EiVXanRQWJr3L0gw2CYXpmwfSKfMC3CQraEC1jOc01l
 TfsLJGbl3L5XpLzxoBwDu44cqp+VvbalergdcmsDTLDFHhONY2g5LJh6C9/EDdnQ
 Cde1uHikGw==
 =sOGG
 -----END PGP SIGNATURE-----

Merge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux

Pull block fixes from Jens Axboe:

 - NVMe pull request via Christoph:
      - Two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
      - Memory leak fix in nvmet (Sagi Grimberg)

 - Regression fix for block cgroups pinning the wrong blkcg, causing
   leaks of cgroups and blkcgs (Chris)

 - UAF fix for drbd setup error handling (Dan)

 - Fix DMA alignment propagation in DM (Keith)

* tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux:
  dm-log-writes: set dma_alignment limit in io_hints
  dm-integrity: set dma_alignment limit in io_hints
  block: make blk_set_default_limits() private
  dm-crypt: provide dma_alignment limit in io_hints
  block: make dma_alignment a stacking queue_limit
  nvmet: fix a memory leak in nvmet_auth_set_key
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
  drbd: use after free in drbd_create_device()
  nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro
  blk-cgroup: properly pin the parent in blkcg_css_online
2022-11-18 13:59:45 -08:00
Mikulas Patocka
984bf2cc53 dm integrity: clear the journal on suspend
There was a problem that a user burned a dm-integrity image on CDROM
and could not activate it because it had a non-empty journal.

Fix this problem by flushing the journal (done by the previous commit)
and clearing the journal (done by this commit). Once the journal is
cleared, dm-integrity won't attempt to replay it on the next
activation.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-11-18 11:05:09 -05:00
Mikulas Patocka
5e5dab5ec7 dm integrity: flush the journal on suspend
This commit flushes the journal on suspend. It is prerequisite for the
next commit that enables activating dm integrity devices in read-only mode.

Note that we deliberately didn't flush the journal on suspend, so that the
journal replay code would be tested. However, the dm-integrity code is 5
years old now, so that journal replay is well-tested, and we can make this
change now.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-11-18 10:57:17 -05:00
Keith Busch
29aa778bb6 dm-integrity: set dma_alignment limit in io_hints
This device mapper needs bio vectors to be sized and memory aligned to
the logical block size. Set the minimum required queue limit
accordingly.

Signed-off-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Mike Snitzer <snitzer@kernel.org>
Link: https://lore.kernel.org/r/20221110184501.2451620-5-kbusch@meta.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-11-16 15:58:11 -07:00
Linus Torvalds
20cf903a0c - Add flags argument to dm_bufio_client_create and introduce
DM_BUFIO_CLIENT_NO_SLEEP flag to have dm-bufio use spinlock rather
   than mutex for its locking.
 
 - Add optional "try_verify_in_tasklet" feature to DM verity target.
   This feature gives users the option to improve IO latency by using a
   tasklet to verify, using hashes in bufio's cache, rather than wait
   to schedule a work item via workqueue. But if there is a bufio cache
   miss, or an error, then the tasklet will fallback to using workqueue.
 
 - Incremental changes to both dm-bufio and the DM verity target to use
   jump_label to minimize cost of branching associated with the niche
   "try_verify_in_tasklet" feature. DM-bufio in particular is used by
   quite a few other DM targets so it doesn't make sense to incur
   additional bufio cost in those targets purely for the benefit of
   this niche verity feature if the feature isn't ever used.
 
 - Optimize verity_verify_io, which is used by both workqueue and
   tasklet based verification, if FEC is not configured or tasklet
   based verification isn't used.
 
 - Remove DM verity target's verify_wq's use of the WQ_CPU_INTENSIVE
   flag since it uses WQ_UNBOUND. Also, use the WQ_HIGHPRI flag if
   "try_verify_in_tasklet" is specified.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmLtYU0ACgkQxSPxCi2d
 A1pIDwgAjQi7jSxN7n+Fb4sJLL5x3WvuVGcockIkucj+Pvr3nvijwkf27+kbCWhn
 d4bDhA60gCebd87lf2PZTf8LL2+h9SLzFDTrgBVg5eC4O8aoQNrgwMMKVvYn+MmK
 OShurwHXS/7iqCETFaUA7hVtH/NwSWzP7WL5+QIDVOWVGaTLnqdvA4TYSZnljEg2
 c02bL2KK+ndsYYshDq7HnVuqr4hIBWKF6y0lApU42mfTCnghX8ZnUMG9pO9K+20X
 qVfQH58CjOTP0MaHsddyR1sTKKZ1qY1HdoDhnlMVfZD5XqnCMhzefKoMxbxJKmJ3
 7hS5w2tNxSx4yYWGj3dXHKhEZi0buA==
 =ZBi4
 -----END PGP SIGNATURE-----

Merge tag 'for-6.0/dm-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull more device mapper updates from Mike Snitzer:

 - Add flags argument to dm_bufio_client_create and introduce
   DM_BUFIO_CLIENT_NO_SLEEP flag to have dm-bufio use spinlock rather
   than mutex for its locking.

 - Add optional "try_verify_in_tasklet" feature to DM verity target.
   This feature gives users the option to improve IO latency by using a
   tasklet to verify, using hashes in bufio's cache, rather than wait to
   schedule a work item via workqueue. But if there is a bufio cache
   miss, or an error, then the tasklet will fallback to using workqueue.

 - Incremental changes to both dm-bufio and the DM verity target to use
   jump_label to minimize cost of branching associated with the niche
   "try_verify_in_tasklet" feature. DM-bufio in particular is used by
   quite a few other DM targets so it doesn't make sense to incur
   additional bufio cost in those targets purely for the benefit of this
   niche verity feature if the feature isn't ever used.

 - Optimize verity_verify_io, which is used by both workqueue and
   tasklet based verification, if FEC is not configured or tasklet based
   verification isn't used.

 - Remove DM verity target's verify_wq's use of the WQ_CPU_INTENSIVE
   flag since it uses WQ_UNBOUND. Also, use the WQ_HIGHPRI flag if
   "try_verify_in_tasklet" is specified.

* tag 'for-6.0/dm-changes-2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm verity: have verify_wq use WQ_HIGHPRI if "try_verify_in_tasklet"
  dm verity: remove WQ_CPU_INTENSIVE flag since using WQ_UNBOUND
  dm verity: only copy bvec_iter in verity_verify_io if in_tasklet
  dm verity: optimize verity_verify_io if FEC not configured
  dm verity: conditionally enable branching for "try_verify_in_tasklet"
  dm bufio: conditionally enable branching for DM_BUFIO_CLIENT_NO_SLEEP
  dm verity: allow optional args to alter primary args handling
  dm verity: Add optional "try_verify_in_tasklet" feature
  dm bufio: Add DM_BUFIO_CLIENT_NO_SLEEP flag
  dm bufio: Add flags argument to dm_bufio_client_create
2022-08-06 11:09:55 -07:00
Nathan Huckleberry
0fcb100d50 dm bufio: Add flags argument to dm_bufio_client_create
Add a flags argument to dm_bufio_client_create and update all the
callers. This is in preparation to add the DM_BUFIO_NO_SLEEP flag.

Signed-off-by: Nathan Huckleberry <nhuck@google.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-07-28 17:46:14 -04:00
Bart Van Assche
c9154a4cb8 dm/dm-integrity: Combine request operation and flags
Combine the request operation type and request flags into a single
argument. Improve static type checking by using the enum req_op type for
variables that represent a request operation and the new blk_opf_t type for
variables that represent request flags.

Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-27-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:31 -06:00
Bart Van Assche
581075e4f6 dm/core: Reduce the size of struct dm_io_request
Combine the bi_op and bi_op_flags into the bi_opf member. Use the new
blk_opf_t type to improve static type checking. This patch does not
change any functionality.

Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-22-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:31 -06:00
Bart Van Assche
ff07a02e9e treewide: Rename enum req_opf into enum req_op
The type name enum req_opf is misleading since it suggests that values of
this type include both an operation type and flags. Since values of this
type represent an operation only, change the type name into enum req_op.

Convert the enum req_op documentation into kernel-doc format. Move a few
definitions such that the enum req_op documentation occurs just above
the enum req_op definition.

The name "req_opf" was introduced by commit ef295ecf09 ("block: better op
and flags encoding").

Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Cc: Hannes Reinecke <hare@suse.de>
Cc: Damien Le Moal <damien.lemoal@wdc.com>
Cc: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-2-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-07-14 12:14:30 -06:00
Dan Carpenter
d3f2a14b89 dm integrity: fix error code in dm_integrity_ctr()
The "r" variable shadows an earlier "r" that has function scope.  It
means that we accidentally return success instead of an error code.
Smatch has a warning for this:

	drivers/md/dm-integrity.c:4503 dm_integrity_ctr()
	warn: missing error code 'r'

Fixes: 7eada909bf ("dm: add integrity target")
Cc: stable@vger.kernel.org
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-05-09 12:14:00 -04:00
Mikulas Patocka
08c1af8f1c dm integrity: fix memory corruption when tag_size is less than digest size
It is possible to set up dm-integrity in such a way that the
"tag_size" parameter is less than the actual digest size. In this
situation, a part of the digest beyond tag_size is ignored.

In this case, dm-integrity would write beyond the end of the
ic->recalc_tags array and corrupt memory. The corruption happened in
integrity_recalc->integrity_sector_checksum->crypto_shash_final.

Fix this corruption by increasing the tags array so that it has enough
padding at the end to accomodate the loop in integrity_recalc() being
able to write a full digest size for the last member of the tags
array.

Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-04-13 12:38:49 -04:00
Mikulas Patocka
cc09e8a9de dm integrity: set journal entry unused when shrinking device
Commit f6f72f32c2 ("dm integrity: don't replay journal data past the
end of the device") skips journal replay if the target sector points
beyond the end of the device. Unfortunatelly, it doesn't set the
journal entry unused, which resulted in this BUG being triggered:
BUG_ON(!journal_entry_is_unused(je))

Fix this by calling journal_entry_set_unused() for this case.

Fixes: f6f72f32c2 ("dm integrity: don't replay journal data past the end of the device")
Cc: stable@vger.kernel.org # v5.7+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Tested-by: Milan Broz <gmazyland@gmail.com>
[snitzer: revised header]
Signed-off-by: Mike Snitzer <snitzer@kernel.org>
2022-04-01 10:31:23 -04:00
Christoph Hellwig
0a806cfde8 dm-integrity: stop using bio_devname
Use the %pg format specifier to save on stack consuption and code size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220304180105.409765-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2022-03-07 06:42:33 -07:00
Kees Cook
f069c7ab6c dm integrity: Use struct_group() to zero struct journal_sector
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.

Add struct_group() to mark region of struct journal_sector that should be
initialized to zero.

Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2022-01-06 09:48:33 -05:00
Mike Snitzer
1cef171abd dm integrity: fix data corruption due to improper use of bvec_kmap_local
Commit 25058d1c72 ("dm integrity: use bvec_kmap_local in
__journal_read_write") didn't account for __journal_read_write() later
adding the biovec's bv_offset. As such using bvec_kmap_local() caused
the start of the biovec to be skipped.

Trivial test that illustrates data corruption:

  # integritysetup format /dev/pmem0
  # integritysetup open /dev/pmem0 integrityroot
  # mkfs.xfs /dev/mapper/integrityroot
  ...
  bad magic number
  bad magic number
  Metadata corruption detected at xfs_sb block 0x0/0x1000
  libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
  releasing dirty buffer (bulk) to free list!

Fix this by using kmap_local_page() instead of bvec_kmap_local() in
__journal_read_write().

Fixes: 25058d1c72 ("dm integrity: use bvec_kmap_local in __journal_read_write")
Reported-by: Tony Asleson <tasleson@redhat.com>
Reviewed-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-12-15 14:16:35 -05:00
Linus Torvalds
c183e1707a - Add DM core support for emitting audit events through the audit
subsystem. Also enhance both the integrity and crypt targets to emit
   events to via dm-audit.
 
 - Various other simple code improvements and cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmGJlFkACgkQxSPxCi2d
 A1pqwwf/YZ6kNKRQaKF1mbkkHOxa/ULf7qIhi/R0epwJu4j1RGsCACS34EqzLc4c
 x15h6flCNj1IBVAqTvMUETYTjTLtyrcfD0yBRWYw2RL0ksHMHyMvd1r/7aE64+pj
 EeZk9Xzcx3Gsq9GOzKfYA2AX0PrypkKSjgHK7hgv+Jh5heqkFcnMXSl3l7BQ6vbr
 ue9joPSI7+6eVFMDn32KxyHzfm6zZo1nmKZ6tQBBHD1D9yBqWTAhXiyXhRA+BOYH
 Tg5wE1fvZ/htyZNEc1cMRArzLF6q9pEU4r8j472N6IcJbhIJzSu0V60zVvexNWG3
 fJSIWqlta1KFK8SQttmDmfFnJiFcyw==
 =t097
 -----END PGP SIGNATURE-----

Merge tag 'for-5.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Add DM core support for emitting audit events through the audit
   subsystem. Also enhance both the integrity and crypt targets to emit
   events to via dm-audit.

 - Various other simple code improvements and cleanups.

* tag 'for-5.16/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm table: log table creation error code
  dm: make workqueue names device-specific
  dm writecache: Make use of the helper macro kthread_run()
  dm crypt: Make use of the helper macro kthread_run()
  dm verity: use bvec_kmap_local in verity_for_bv_block
  dm log writes: use memcpy_from_bvec in log_writes_map
  dm integrity: use bvec_kmap_local in __journal_read_write
  dm integrity: use bvec_kmap_local in integrity_metadata
  dm: add add_disk() error handling
  dm: Remove redundant flush_workqueue() calls
  dm crypt: log aead integrity violations to audit subsystem
  dm integrity: log audit events for dm-integrity target
  dm: introduce audit event module for device mapper
2021-11-09 11:02:04 -08:00
Christoph Hellwig
25058d1c72 dm integrity: use bvec_kmap_local in __journal_read_write
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:46 -04:00
Christoph Hellwig
c12d205dae dm integrity: use bvec_kmap_local in integrity_metadata
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-11-01 13:28:45 -04:00
Michael Weiß
82bb85998c dm integrity: log audit events for dm-integrity target
dm-integrity signals integrity violations by returning I/O errors
to user space. To identify integrity violations by a controlling
instance, the kernel audit subsystem can be used to emit audit
events to user space. We use the new dm-audit submodule allowing
to emit audit events on relevant I/O errors.

The construction and destruction of integrity device mappings are
also relevant for auditing a system. Thus, those events are also
logged as audit events.

Signed-off-by: Michael Weiß <michael.weiss@aisec.fraunhofer.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-10-27 16:54:36 -04:00
Christoph Hellwig
6dcbb52cdd dm: use bdev_nr_sectors and bdev_nr_bytes instead of open coding them
Use the proper helpers to read the block device size.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20211018101130.1838532-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-10-18 14:43:22 -06:00
Linus Torvalds
efa916af13 - Add DM infrastructure for IMA-based remote attestion. These changes
are the basis for deploying DM-based storage in a "cloud" that must
   validate configurations end-users run to maintain trust. These DM
   changes allow supported DM targets' configurations to be measured
   via IMA. But the policy and enforcement (of which configurations are
   valid) is managed by something outside the kernel (e.g. Keylime).
 
 - Fix DM crypt scalability regression on systems with many cpus due to
   percpu_counter spinlock contention in crypt_page_alloc().
 
 - Use in_hardirq() instead of deprecated in_irq() in DM crypt.
 
 - Add event counters to DM writecache to allow users to further assess
   how the writecache is performing.
 
 - Various code cleanup in DM writecache's main IO mapping function.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAmEuWG0ACgkQxSPxCi2d
 A1rZIgf+JSSR2/DBg4j9w0oVsay+rfFB+tyZLVvHFEraukDbxOKy7Dck1GZybQBq
 mFTqCWKQHOvME4nf4swIY/klPi3VhPNyWDY/hI/FAFaiTskLqjxhQQc1+cECLkMx
 ittIKYvWgcg7kflCuN6LiUslTB/P4Lo6GmNqMOhFn3nkN5hg76xaxPK+JCMGLgTM
 qs+mbZfB1Z51G+cDlU0E5WCn37k/jqqwhb8NN90Zozgi7ByQEO01bd2EkSsYT0T/
 ZrDOWP8M8u14QHAV0e8n9e6a/d5atIV5g/+XrDbVDvzwtq7eI+ojBNHDBpcgxiH7
 /AVb9AM4Pd87ExWMbsBxr3Hgbc5+dQ==
 =yIsi
 -----END PGP SIGNATURE-----

Merge tag 'for-5.15/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Add DM infrastructure for IMA-based remote attestion. These changes
   are the basis for deploying DM-based storage in a "cloud" that must
   validate configurations end-users run to maintain trust. These DM
   changes allow supported DM targets' configurations to be measured via
   IMA. But the policy and enforcement (of which configurations are
   valid) is managed by something outside the kernel (e.g. Keylime).

 - Fix DM crypt scalability regression on systems with many cpus due to
   percpu_counter spinlock contention in crypt_page_alloc().

 - Use in_hardirq() instead of deprecated in_irq() in DM crypt.

 - Add event counters to DM writecache to allow users to further assess
   how the writecache is performing.

 - Various code cleanup in DM writecache's main IO mapping function.

* tag 'for-5.15/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm crypt: use in_hardirq() instead of deprecated in_irq()
  dm ima: update dm documentation for ima measurement support
  dm ima: update dm target attributes for ima measurements
  dm ima: add a warning in dm_init if duplicate ima events are not measured
  dm ima: prefix ima event name related to device mapper with dm_
  dm ima: add version info to dm related events in ima log
  dm ima: prefix dm table hashes in ima log with hash algorithm
  dm crypt: Avoid percpu_counter spinlock contention in crypt_page_alloc()
  dm: add documentation for IMA measurement support
  dm: update target status functions to support IMA measurement
  dm ima: measure data on device rename
  dm ima: measure data on table clear
  dm ima: measure data on device remove
  dm ima: measure data on device resume
  dm ima: measure data on table load
  dm writecache: add event counters
  dm writecache: report invalid return from writecache_map helpers
  dm writecache: further writecache_map() cleanup
  dm writecache: factor out writecache_map_remap_origin()
  dm writecache: split up writecache_map() to improve code readability
2021-08-31 14:55:09 -07:00
Tushar Sugandhi
33ace4ca12 dm ima: update dm target attributes for ima measurements
Certain DM targets ('integrity', 'multipath', 'verity') need to update the
way their attributes are recorded in the ima log, so that the attestation
servers can interpret the data correctly and decide if the devices
meet the attestation requirements.  For instance, the "mode=%c" attribute
in the 'integrity' target is measured twice, the 'verity' target is
missing the attribute "root_hash_sig_key_desc=%s", and the 'multipath'
target needs to index the attributes properly.

Update 'integrity' target to remove the duplicate measurement of
the attribute "mode=%c".  Add "root_hash_sig_key_desc=%s" attribute
for the 'verity' target.  Index various attributes in 'multipath'
target.  Also, add "nr_priority_groups=%u" attribute to 'multipath'
target to record the number of priority groups.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Suggested-by: Thore Sommer <public@thson.de>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-08-20 16:07:36 -04:00
Christoph Hellwig
964cacfdd3 dm-integrity: use bvec_virt
Use bvec_virt instead of open coding it.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20210804095634.460779-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-08-16 10:50:32 -06:00
Tushar Sugandhi
8ec456629d dm: update target status functions to support IMA measurement
For device mapper targets to take advantage of IMA's measurement
capabilities, the status functions for the individual targets need to be
updated to handle the status_type_t case for value STATUSTYPE_IMA.

Update status functions for the following target types, to log their
respective attributes to be measured using IMA.
 01. cache
 02. crypt
 03. integrity
 04. linear
 05. mirror
 06. multipath
 07. raid
 08. snapshot
 09. striped
 10. verity

For rest of the targets, handle the STATUSTYPE_IMA case by setting the
measurement buffer to NULL.

For IMA to measure the data on a given system, the IMA policy on the
system needs to be updated to have the following line, and the system
needs to be restarted for the measurements to take effect.

/etc/ima/ima-policy
 measure func=CRITICAL_DATA label=device-mapper template=ima-buf

The measurements will be reflected in the IMA logs, which are located at:

/sys/kernel/security/integrity/ima/ascii_runtime_measurements
/sys/kernel/security/integrity/ima/binary_runtime_measurements

These IMA logs can later be consumed by various attestation clients
running on the system, and send them to external services for attesting
the system.

The DM target data measured by IMA subsystem can alternatively
be queried from userspace by setting DM_IMA_MEASUREMENT_FLAG with
DM_TABLE_STATUS_CMD.

Signed-off-by: Tushar Sugandhi <tusharsu@linux.microsoft.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-08-10 13:34:23 -04:00
Mikulas Patocka
bc8f3d4647 dm integrity: fix sparse warnings
Use the types __le* instead of __u* to fix sparse warnings.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-05-13 14:53:49 -04:00
Mikulas Patocka
dbae70d452 dm integrity: revert to not using discard filler when recalulating
Revert the commit 7a5b96b478 ("dm integrity:
use discard support when recalculating").

There's a bug that when we write some data beyond the current recalculate
boundary, the checksum will be rewritten with the discard filler later.
And the data will no longer have integrity protection. There's no easy
fix for this case.

Also, another problematic case is if dm-integrity is used to detect
bitrot (random device errors, bit flips, etc); dm-integrity should
detect that even for unused sectors. With commit 7a5b96b478 it can
happen that such change is undetected (because discard filler is not a
valid checksum).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Acked-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-05-13 14:53:48 -04:00
Mikulas Patocka
7a5b96b478 dm integrity: use discard support when recalculating
If we have discard support we don't have to recalculate hash - we can
just fill the metadata with the discard pattern.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-30 14:02:06 -04:00
Mikulas Patocka
b1a2b93320 dm integrity: increase RECALC_SECTORS to improve recalculate speed
Increase RECALC_SECTORS because it improves recalculate speed slightly
(from 390kiB/s to 410kiB/s).

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-30 14:02:05 -04:00
Mikulas Patocka
a9c0fda4c0 dm integrity: don't re-write metadata if discarding same blocks
If we discard already discarded blocks we do not need to write discard
pattern to the metadata, because it is already there.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-30 14:01:39 -04:00
Tian Tao
17e9e134a8 dm integrity: fix missing goto in bitmap_flush_interval error handling
Fixes: 468dfca38b ("dm integrity: add a bitmap mode")
Cc: stable@vger.kernel.org
Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-04-19 13:17:10 -04:00
Mikulas Patocka
db7b93e381 dm integrity: add the "reset_recalculate" feature flag
Add a new flag "reset_recalculate" that will restart recalculating
from the beginning of the device. It can be used if we want to change
the hash function. Example:

dmsetup remove_all
rmmod brd
set -e
modprobe brd rd_size=1048576
dmsetup create in --table '0 2000000 integrity /dev/ram0 0 16 J 2 internal_hash:sha256 recalculate'
sleep 10
dmsetup status
dmsetup remove in
dmsetup create in --table '0 2000000 integrity /dev/ram0 0 16 J 2 internal_hash:sha3-256 reset_recalculate'

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-03-26 14:53:42 -04:00
Mikulas Patocka
09d85f8d89 dm integrity: introduce the "fix_hmac" argument
The "fix_hmac" argument improves security of internal_hash and
journal_mac:
- the section number is mixed to the mac, so that an attacker can't
  copy sectors from one journal section to another journal section
- the superblock is protected by journal_mac
- a 16-byte salt stored in the superblock is mixed to the mac, so
  that the attacker can't detect that two disks have the same hmac
  key and also to disallow the attacker to move sectors from one
  disk to another

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com> # ReST fix
Tested-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-03 10:10:05 -05:00
Colin Ian King
23c4ecbc3e dm integrity: fix spelling mistake "flusing" -> "flushing"
There is a spelling mistake in a dm_integrity_io_error error
message. Fix it.

Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-02-03 10:10:05 -05:00
Mikulas Patocka
5c02406428 dm integrity: conditionally disable "recalculate" feature
Otherwise a malicious user could (ab)use the "recalculate" feature
that makes dm-integrity calculate the checksums in the background
while the device is already usable. When the system restarts before all
checksums have been calculated, the calculation continues where it was
interrupted even if the recalculate feature is not requested the next
time the dm device is set up.

Disable recalculating if we use internal_hash or journal_hash with a
key (e.g. HMAC) and we don't have the "legacy_recalculate" flag.

This may break activation of a volume, created by an older kernel,
that is not yet fully recalculated -- if this happens, the user should
add the "legacy_recalculate" flag to constructor parameters.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Reported-by: Daniel Glockner <dg@emlix.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-21 15:05:21 -05:00
Mikulas Patocka
2d06dfecb1 dm integrity: fix a crash if "recalculate" used without "internal_hash"
Recalculate can only be specified with internal_hash.

Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-21 14:41:43 -05:00
Mikulas Patocka
17ffc193cd dm integrity: fix the maximum number of arguments
Advance the maximum number of arguments from 9 to 15 to account for
all potential feature flags that may be supplied.

Linux 4.19 added "meta_device"
(356d9d52e1) and "recalculate"
(a3fcf72531) flags.

Commit 468dfca38b added
"sectors_per_bit" and "bitmap_flush_interval".

Commit 84597a44a9 added
"allow_discards".

And the commit d537858ac8 added
"fix_padding".

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-12 15:55:06 -05:00
Mikulas Patocka
9b5948267a dm integrity: fix flush with external metadata device
With external metadata device, flush requests are not passed down to the
data device.

Fix this by submitting the flush request in dm_integrity_flush_buffers. In
order to not degrade performance, we overlap the data device flush with
the metadata device flush.

Reported-by: Lukas Straub <lukasstraub2@web.de>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2021-01-08 15:57:29 -05:00
Mikulas Patocka
a7a10bce8a dm integrity: don't use drivers that have CRYPTO_ALG_ALLOCATES_MEMORY
Don't use crypto drivers that have the flag CRYPTO_ALG_ALLOCATES_MEMORY
set. These drivers allocate memory and thus they are not suitable for
block I/O processing.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-11-17 10:45:06 -05:00
Mikulas Patocka
e27fec66f0 dm integrity: fix error reporting in bitmap mode after creation
The dm-integrity target did not report errors in bitmap mode just after
creation. The reason is that the function integrity_recalc didn't clean up
ic->recalc_bitmap as it proceeded with recalculation.

Fix this by updating the bitmap accordingly -- the double shift serves
to rounddown.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Fixes: 468dfca38b ("dm integrity: add a bitmap mode")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-09-01 16:41:57 -04:00
Waiman Long
453431a549 mm, treewide: rename kzfree() to kfree_sensitive()
As said by Linus:

  A symmetric naming is only helpful if it implies symmetries in use.
  Otherwise it's actively misleading.

  In "kzalloc()", the z is meaningful and an important part of what the
  caller wants.

  In "kzfree()", the z is actively detrimental, because maybe in the
  future we really _might_ want to use that "memfill(0xdeadbeef)" or
  something. The "zero" part of the interface isn't even _relevant_.

The main reason that kzfree() exists is to clear sensitive information
that should not be leaked to other future users of the same memory
objects.

Rename kzfree() to kfree_sensitive() to follow the example of the recently
added kvfree_sensitive() and make the intention of the API more explicit.
In addition, memzero_explicit() is used to clear the memory to make sure
that it won't get optimized away by the compiler.

The renaming is done by using the command sequence:

  git grep -w --name-only kzfree |\
  xargs sed -i 's/kzfree/kfree_sensitive/'

followed by some editing of the kfree_sensitive() kerneldoc and adding
a kzfree backward compatibility macro in slab.h.

[akpm@linux-foundation.org: fs/crypto/inline_crypt.c needs linux/slab.h]
[akpm@linux-foundation.org: fix fs/crypto/inline_crypt.c some more]

Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: David Howells <dhowells@redhat.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Joe Perches <joe@perches.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Dan Carpenter <dan.carpenter@oracle.com>
Cc: "Jason A . Donenfeld" <Jason@zx2c4.com>
Link: http://lkml.kernel.org/r/20200616154311.12314-3-longman@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-07 11:33:22 -07:00
Linus Torvalds
382625d0d4 for-5.9/block-20200802
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl8m7YwQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpt+dEAC7a0HYuX2OrkyawBnsgd1QQR/soC7surec
 yDDa7SMM8cOq3935bfzcYHV9FWJszEGIknchiGb9R3/T+vmSohbvDsM5zgwya9u/
 FHUIuTq324I6JWXKl30k4rwjiX9wQeMt+WZ5gC8KJYCWA296i2IpJwd0A45aaKuS
 x4bTjxqknE+fD4gQiMUSt+bmuOUAp81fEku3EPapCRYDPAj8f5uoY7R2arT/POwB
 b+s+AtXqzBymIqx1z0sZ/XcdZKmDuhdurGCWu7BfJFIzw5kQ2Qe3W8rUmrQ3pGut
 8a21YfilhUFiBv+B4wptfrzJuzU6Ps0BXHCnBsQjzvXwq5uFcZH495mM/4E4OJvh
 SbjL2K4iFj+O1ngFkukG/F8tdEM1zKBYy2ZEkGoWKUpyQanbAaGI6QKKJA+DCdBi
 yPEb7yRAa5KfLqMiocm1qCEO1I56HRiNHaJVMqCPOZxLmpXj19Fs71yIRplP1Trv
 GGXdWZsccjuY6OljoXWdEfnxAr5zBsO3Yf2yFT95AD+egtGsU1oOzlqAaU1mtflw
 ABo452pvh6FFpxGXqz6oK4VqY4Et7WgXOiljA4yIGoPpG/08L1Yle4eVc2EE01Jb
 +BL49xNJVeUhGFrvUjPGl9kVMeLmubPFbmgrtipW+VRg9W8+Yirw7DPP6K+gbPAR
 RzAUdZFbWw==
 =abJG
 -----END PGP SIGNATURE-----

Merge tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block

Pull core block updates from Jens Axboe:
 "Good amount of cleanups and tech debt removals in here, and as a
  result, the diffstat shows a nice net reduction in code.

   - Softirq completion cleanups (Christoph)

   - Stop using ->queuedata (Christoph)

   - Cleanup bd claiming (Christoph)

   - Use check_events, moving away from the legacy media change
     (Christoph)

   - Use inode i_blkbits consistently (Christoph)

   - Remove old unused writeback congestion bits (Christoph)

   - Cleanup/unify submission path (Christoph)

   - Use bio_uninit consistently, instead of bio_disassociate_blkg
     (Christoph)

   - sbitmap cleared bits handling (John)

   - Request merging blktrace event addition (Jan)

   - sysfs add/remove race fixes (Luis)

   - blk-mq tag fixes/optimizations (Ming)

   - Duplicate words in comments (Randy)

   - Flush deferral cleanup (Yufen)

   - IO context locking/retry fixes (John)

   - struct_size() usage (Gustavo)

   - blk-iocost fixes (Chengming)

   - blk-cgroup IO stats fixes (Boris)

   - Various little fixes"

* tag 'for-5.9/block-20200802' of git://git.kernel.dk/linux-block: (135 commits)
  block: blk-timeout: delete duplicated word
  block: blk-mq-sched: delete duplicated word
  block: blk-mq: delete duplicated word
  block: genhd: delete duplicated words
  block: elevator: delete duplicated word and fix typos
  block: bio: delete duplicated words
  block: bfq-iosched: fix duplicated word
  iocost_monitor: start from the oldest usage index
  iocost: Fix check condition of iocg abs_vdebt
  block: Remove callback typedefs for blk_mq_ops
  block: Use non _rcu version of list functions for tag_set_list
  blk-cgroup: show global disk stats in root cgroup io.stat
  blk-cgroup: make iostat functions visible to stat printing
  block: improve discard bio alignment in __blkdev_issue_discard()
  block: change REQ_OP_ZONE_RESET and REQ_OP_ZONE_RESET_ALL to be odd numbers
  block: defer flush request no matter whether we have elevator
  block: make blk_timeout_init() static
  block: remove retry loop in ioc_release_fn()
  block: remove unnecessary ioc nested locking
  block: integrate bd_start_claiming into __blkdev_get
  ...
2020-08-03 11:57:03 -07:00
Mikulas Patocka
5df96f2b9f dm integrity: fix integrity recalculation that is improperly skipped
Commit adc0daad36 ("dm: report suspended
device during destroy") broke integrity recalculation.

The problem is dm_suspended() returns true not only during suspend,
but also during resume. So this race condition could occur:
1. dm_integrity_resume calls queue_work(ic->recalc_wq, &ic->recalc_work)
2. integrity_recalc (&ic->recalc_work) preempts the current thread
3. integrity_recalc calls if (unlikely(dm_suspended(ic->ti))) goto unlock_ret;
4. integrity_recalc exits and no recalculating is done.

To fix this race condition, add a function dm_post_suspending that is
only true during the postsuspend phase and use it instead of
dm_suspended().

Signed-off-by: Mikulas Patocka <mpatocka redhat com>
Fixes: adc0daad36 ("dm: report suspended device during destroy")
Cc: stable vger kernel org # v4.18+
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-07-23 14:39:37 -04:00
Christoph Hellwig
ed00aabd5e block: rename generic_make_request to submit_bio_noacct
generic_make_request has always been very confusingly misnamed, so rename
it to submit_bio_noacct to make it clear that it is submit_bio minus
accounting and a few checks.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-07-01 07:27:24 -06:00
Linus Torvalds
b25c6644bf - Largest change for this cycle is the DM zoned target's metadata
version 2 feature that adds support for pairing regular block
   devices with a zoned device to ease performance impact associated
   with finite random zones of zoned device.  Changes came in 3
   batches: first prepared for and then added the ability to pair a
   single regular block device, second was a batch of fixes to improve
   zoned's reclaim heuristic, third removed the limitation of only
   adding a single additional regular block device to allow many
   devices.  Testing has shown linear scaling as more devices are
   added.
 
 - Add new emulated block size (ebs) target that emulates a smaller
   logical_block_size than a block device supports.  Primary use-case
   is to emulate "512e" devices that have 512 byte logical_block_size
   and 4KB physical_block_size.  This is useful to some legacy
   applications otherwise wouldn't be ablee to be used on 4K devices
   because they depend on issuing IO in 512 byte granularity.
 
 - Add discard interfaces to DM bufio.  First consumer of the interface
   is the dm-ebs target that makes heavy use of dm-bufio.
 
 - Fix DM crypt's block queue_limits stacking to not truncate
   logic_block_size.
 
 - Add Documentation for DM integrity's status line.
 
 - Switch DMDEBUG from a compile time config option to instead use
   dynamic debug via pr_debug.
 
 - Fix DM multipath target's hueristic for how it manages
   "queue_if_no_path" state internally.  DM multipath now avoids
   disabling "queue_if_no_path" unless it is actually needed (e.g. in
   response to configure timeout or explicit "fail_if_no_path"
   message).  This fixes reports of spurious -EIO being reported back
   to userspace application during fault tolerance testing with an NVMe
   backend.  Added various dynamic DMDEBUG messages to assist with
   debugging queue_if_no_path in the future.
 
 - Add a new DM multipath "Historical Service Time" Path Selector.
 
 - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.
 
 - Improve DM writecache target performance by using explicit
   cache flushing for target's single-threaded usecase and a small
   cleanup to remove unnecessary test in persistent_memory_claim.
 
 - Other small cleanups in DM core, dm-persistent-data, and DM integrity.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAl7alrgTHHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWl42B/9sBd+j60emy4Bliu/f3pd7SEkFrSXv
 K2jXicRFx4E5kO0aLK+fX65cOiq2vvLsDh8c++0TLXcD9q7oK0qxK9c8TCPq29Cx
 W2J2dwdjyyqqbr3/FZHYYM9KLOl5rsxJLygXwhJhQ2Gny44L7nVACrAXNzXIHJ3r
 f8xr+GLdF/jz7WLj8bwEDo3Bf8wvxDvl2ijqj7EceOhTutNE8xHQ6UcTPqTtozJy
 sNM8UQNk1L43DBAvXfrKZ+yQ5DYAKdXKJpV9C8qv5DEGbaEikuMrHddgO4KlDdp4
 VjPk9GSfPwGcJ4ecN8vgecZVGvh52ZU7OZ8qey/q0zqps74jeHTZQQyM
 =jRun
 -----END PGP SIGNATURE-----

Merge tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - The largest change for this cycle is the DM zoned target's metadata
   version 2 feature that adds support for pairing regular block devices
   with a zoned device to ease the performance impact associated with
   finite random zones of zoned device.

   The changes came in three batches: the first prepared for and then
   added the ability to pair a single regular block device, the second
   was a batch of fixes to improve zoned's reclaim heuristic, and the
   third removed the limitation of only adding a single additional
   regular block device to allow many devices.

   Testing has shown linear scaling as more devices are added.

 - Add new emulated block size (ebs) target that emulates a smaller
   logical_block_size than a block device supports

   The primary use-case is to emulate "512e" devices that have 512 byte
   logical_block_size and 4KB physical_block_size. This is useful to
   some legacy applications that otherwise wouldn't be able to be used
   on 4K devices because they depend on issuing IO in 512 byte
   granularity.

 - Add discard interfaces to DM bufio. First consumer of the interface
   is the dm-ebs target that makes heavy use of dm-bufio.

 - Fix DM crypt's block queue_limits stacking to not truncate
   logic_block_size.

 - Add Documentation for DM integrity's status line.

 - Switch DMDEBUG from a compile time config option to instead use
   dynamic debug via pr_debug.

 - Fix DM multipath target's hueristic for how it manages
   "queue_if_no_path" state internally.

   DM multipath now avoids disabling "queue_if_no_path" unless it is
   actually needed (e.g. in response to configure timeout or explicit
   "fail_if_no_path" message).

   This fixes reports of spurious -EIO being reported back to userspace
   application during fault tolerance testing with an NVMe backend.
   Added various dynamic DMDEBUG messages to assist with debugging
   queue_if_no_path in the future.

 - Add a new DM multipath "Historical Service Time" Path Selector.

 - Fix DM multipath's dm_blk_ioctl() to switch paths on IO error.

 - Improve DM writecache target performance by using explicit cache
   flushing for target's single-threaded usecase and a small cleanup to
   remove unnecessary test in persistent_memory_claim.

 - Other small cleanups in DM core, dm-persistent-data, and DM
   integrity.

* tag 'for-5.8/dm-changes' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (62 commits)
  dm crypt: avoid truncating the logical block size
  dm mpath: add DM device name to Failing/Reinstating path log messages
  dm mpath: enhance queue_if_no_path debugging
  dm mpath: restrict queue_if_no_path state machine
  dm mpath: simplify __must_push_back
  dm zoned: check superblock location
  dm zoned: prefer full zones for reclaim
  dm zoned: select reclaim zone based on device index
  dm zoned: allocate zone by device index
  dm zoned: support arbitrary number of devices
  dm zoned: move random and sequential zones into struct dmz_dev
  dm zoned: per-device reclaim
  dm zoned: add metadata pointer to struct dmz_dev
  dm zoned: add device pointer to struct dm_zone
  dm zoned: allocate temporary superblock for tertiary devices
  dm zoned: convert to xarray
  dm zoned: add a 'reserved' zone flag
  dm zoned: improve logging messages for reclaim
  dm zoned: avoid unnecessary device recalulation for secondary superblock
  dm zoned: add debugging message for reading superblocks
  ...
2020-06-05 15:45:03 -07:00
Christoph Hellwig
9398554fb3 block: remove the error_sector argument to blkdev_issue_flush
The argument isn't used by any caller, and drivers don't fill out
bi_sector for flush requests either.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2020-05-22 08:45:46 -06:00
Gustavo A. R. Silva
b18ae8dd9d dm: replace zero-length array with flexible-array
The current codebase makes use of the zero-length array language
extension to the C90 standard, but the preferred mechanism to declare
variable-length types such as these ones is a flexible array member[1][2],
introduced in C99:

struct foo {
        int stuff;
        struct boo array[];
};

By making use of the mechanism above, we will get a compiler warning
in case the flexible array does not occur last in the structure, which
will help us prevent some kind of undefined behavior bugs from being
inadvertently introduced[3] to the codebase from now on.

Also, notice that, dynamic memory allocations won't be affected by
this change:

"Flexible array members have incomplete type, and so the sizeof operator
may not be applied. As a quirk of the original implementation of
zero-length arrays, sizeof evaluates to zero."[1]

sizeof(flexible-array-member) triggers a warning because flexible array
members have incomplete type[1]. There are some instances of code in
which the sizeof operator is being incorrectly/erroneously applied to
zero-length arrays and the result is zero. Such instances may be hiding
some bugs. So, this work (flexible-array member conversions) will also
help to get completely rid of those sorts of issues.

This issue was found with the help of Coccinelle.

[1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html
[2] https://github.com/KSPP/linux/issues/21
[3] commit 7649773293 ("cxgb3/l2t: Fix undefined behaviour")

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-05-20 17:09:44 -04:00
YueHaibing
a86fe8be51 dm integrity: remove set but not used variables
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/md/dm-integrity.c: In function 'integrity_metadata':
drivers/md/dm-integrity.c:1557:12: warning:
 variable 'save_metadata_offset' set but not used [-Wunused-but-set-variable]
drivers/md/dm-integrity.c:1556:12: warning:
 variable 'save_metadata_block' set but not used [-Wunused-but-set-variable]

They are never used, so remove it.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-05-15 10:29:35 -04:00
Mikulas Patocka
8267d8fb48 dm integrity: fix logic bug in integrity tag testing
If all the bytes are equal to DISCARD_FILLER, we want to accept the
buffer. If any of the bytes are different, we must do thorough
tag-by-tag checking.

The condition was inverted.

Fixes: 84597a44a9 ("dm integrity: add optional discard support")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03 13:07:41 -04:00
Mike Snitzer
e7fc1e57d9 dm integrity: fix ppc64le warning
Otherwise:

In file included from drivers/md/dm-integrity.c:13:
drivers/md/dm-integrity.c: In function 'dm_integrity_status':
drivers/md/dm-integrity.c:3061:10: error: format '%llu' expects
argument of type 'long long unsigned int', but argument 4 has type
'long int' [-Werror=format=]
   DMEMIT("%llu %llu",
          ^~~~~~~~~~~
    atomic64_read(&ic->number_of_mismatches),
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
./include/linux/device-mapper.h:550:46: note: in definition of macro 'DMEMIT'
      0 : scnprintf(result + sz, maxlen - sz, x))
                                              ^
cc1: all warnings being treated as errors

Fixes: 7649194a16 ("dm integrity: remove sector type casts")
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-04-03 10:44:24 -04:00
Mikulas Patocka
31843edab7 dm integrity: improve discard in journal mode
When we discard something that is present in the journal, we flush the
journal first, so that discarded blocks are not overwritten by the journal
content.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 13:09:49 -04:00
Mikulas Patocka
84597a44a9 dm integrity: add optional discard support
Add an argument "allow_discards" that enables discard processing on
dm-integrity device. Discards are only allowed to devices using
internal hash.

When a block is discarded the integrity tag is filled with
DISCARD_FILLER (0xf6) bytes.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 13:05:25 -04:00
Mikulas Patocka
1ac2c15a7b dm integrity: allow resize of the integrity device
If the size of the underlying device changes, change the size of the
integrity device too.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 12:52:53 -04:00
Mikulas Patocka
87fb177b4c dm integrity: factor out get_provided_data_sectors()
Move code to a new function get_provided_data_sectors().

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 12:46:35 -04:00
Mikulas Patocka
f6f72f32c2 dm integrity: don't replay journal data past the end of the device
Following commits will make it possible to shrink or extend the device. If
the device was shrunk, we don't want to replay journal data pointing past
the end of the device.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 12:43:21 -04:00
Mikulas Patocka
7649194a16 dm integrity: remove sector type casts
Since the commit 72deb455b5 ("block:
remove CONFIG_LBDAF") sector_t is always defined as unsigned long
long.

Delete the needless type casts in printk and avoids some warnings if
DEBUG_PRINT is defined.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 12:40:18 -04:00
Mikulas Patocka
b93b6643e9 dm integrity: fix a crash with unusually large tag size
If the user specifies tag size larger than HASH_MAX_DIGESTSIZE,
there's a crash in integrity_metadata().

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 12:34:45 -04:00
Erich Eckner
eaab4bde6e dm integrity: print device name in integrity_metadata() error message
Similar to f710126cfc ("dm crypt: print
device name in integrity error message"), this message should also
better identify the device with the integrity failure.

Signed-off-by: Erich Eckner <git@eckner.net>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-24 11:25:11 -04:00
Mike Snitzer
636be4241b dm: bump version of core and various targets
Changes made during the 5.6 cycle warrant bumping the version number
for DM core and the targets modified by this commit.

It should be noted that dm-thin, dm-crypt and dm-raid already had
their target version bumped during the 5.6 merge window.

Signed-off-by; Mike Snitzer <snitzer@redhat.com>
2020-03-03 11:10:21 -05:00
Mike Snitzer
248aa2645a dm integrity: use dm_bio_record and dm_bio_restore
In cases where dec_in_flight() has to requeue the integrity_bio_wait
work to transfer the rest of the data, the bio's __bi_remaining might
already have been decremented to 0, e.g.: if bio passed to underlying
data device was split via blk_queue_split().

Use dm_bio_{record,restore} rather than effectively open-coding them in
dm-integrity -- these methods now manage __bi_remaining too.

Depends-on: f7f0b057a9c1 ("dm bio record: save/restore bi_end_io and bi_integrity")
Reported-by: Daniel Glöckner <dg@emlix.com>
Suggested-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-03-03 10:02:47 -05:00
Mikulas Patocka
adc0daad36 dm: report suspended device during destroy
The function dm_suspended returns true if the target is suspended.
However, when the target is being suspended during unload, it returns
false.

An example where this is a problem: the test "!dm_suspended(wc->ti)" in
writecache_writeback is not sufficient, because dm_suspended returns
zero while writecache_suspend is in progress.  As is, without an
enhanced dm_suspended, simply switching from flush_workqueue to
drain_workqueue still emits warnings:
workqueue writecache-writeback: drain_workqueue() isn't complete after 10 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 100 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 200 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 300 tries
workqueue writecache-writeback: drain_workqueue() isn't complete after 400 tries

writecache_suspend calls flush_workqueue(wc->writeback_wq) - this function
flushes the current work. However, the workqueue may re-queue itself and
flush_workqueue doesn't wait for re-queued works to finish. Because of
this - the function writecache_writeback continues execution after the
device was suspended and then concurrently with writecache_dtr, causing
a crash in writecache_writeback.

We must use drain_workqueue - that waits until the work and all re-queued
works finish.

As a prereq for switching to drain_workqueue, this commit fixes
dm_suspended to return true after the presuspend hook and before the
postsuspend hook - just like during a normal suspend. It allows
simplifying the dm-integrity and dm-writecache targets so that they
don't have to maintain suspended flags on their own.

With this change use of drain_workqueue() can be used effectively.  This
change was tested with the lvm2 testsuite and cryptsetup testsuite and
the are no regressions.

Fixes: 48debafe4f ("dm: add writecache target")
Cc: stable@vger.kernel.org # 4.18+
Reported-by: Corey Marthaler <cmarthal@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-27 16:40:58 -05:00
Mikulas Patocka
7fc2e47f40 dm integrity: fix invalid table returned due to argument count mismatch
If the flag SB_FLAG_RECALCULATE is present in the superblock, but it was
not specified on the command line (i.e. ic->recalculate_flag is false),
dm-integrity would return invalid table line - the reported number of
arguments would not match the real number.

Fixes: 468dfca38b ("dm integrity: add a bitmap mode")
Cc: stable@vger.kernel.org # v5.2+
Reported-by: Ondrej Kozina <okozina@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25 12:06:08 -05:00
Mikulas Patocka
53770f0ec5 dm integrity: fix a deadlock due to offloading to an incorrect workqueue
If we need to perform synchronous I/O in dm_integrity_map_continue(),
we must make sure that we are not in the map function - in order to
avoid the deadlock due to bio queuing in generic_make_request. To
avoid the deadlock, we offload the request to metadata_wq.

However, metadata_wq also processes metadata updates for write requests.
If there are too many requests that get offloaded to metadata_wq at the
beginning of dm_integrity_map_continue, the workqueue metadata_wq
becomes clogged and the system is incapable of processing any metadata
updates.

This causes a deadlock because all the requests that need to do metadata
updates wait for metadata_wq to proceed and metadata_wq waits inside
wait_and_add_new_range until some existing request releases its range
lock (which doesn't happen because the range lock is released after
metadata update).

In order to fix the deadlock, we create a new workqueue offload_wq and
offload requests to it - so that processing of offload_wq is independent
from processing of metadata_wq.

Fixes: 7eada909bf ("dm: add integrity target")
Cc: stable@vger.kernel.org # v4.12+
Reported-by: Heinz Mauelshagen <heinzm@redhat.com>
Tested-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Heinz Mauelshagen <heinzm@redhat.com>
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25 12:06:07 -05:00
Mikulas Patocka
d5bdf66108 dm integrity: fix recalculation when moving from journal mode to bitmap mode
If we resume a device in bitmap mode and the on-disk format is in journal
mode, we must recalculate anything above ic->sb->recalc_sector. Otherwise,
there would be non-recalculated blocks which would cause I/O errors.

Fixes: 468dfca38b ("dm integrity: add a bitmap mode")
Cc: stable@vger.kernel.org # v5.2+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2020-02-25 12:06:06 -05:00
Mikulas Patocka
d537858ac8 dm integrity: fix excessive alignment of metadata runs
Metadata runs are supposed to be aligned on 4k boundary (so that they work
efficiently with disks with 4k sectors). However, there was a programming
bug that makes them aligned on 128k boundary instead. The unused space is
wasted.

Fix this bug by providing a proper 4k alignment. In order to keep
existing volumes working, we introduce a new flag SB_FLAG_FIXED_PADDING
- when the flag is clear, we calculate the padding the old way. In order
to make sure that the old version cannot mount the volume created by the
new version, we increase superblock version to 4.

Also in order to not break with old integritysetup, we fix alignment
only if the parameter "fix_padding" is present when formatting the
device.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-11-15 14:49:16 -05:00
Max Gurtovoy
54d4e6ab91 block: centralize PI remapping logic to the block layer
Currently t10_pi_prepare/t10_pi_complete functions are called during the
NVMe and SCSi layers command preparetion/completion, but their actual
place should be the block layer since T10-PI is a general data integrity
feature that is used by block storage protocols. Introduce .prepare_fn
and .complete_fn callbacks within the integrity profile that each type
can implement according to its needs.

Suggested-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Suggested-by: Martin K. Petersen <martin.petersen@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Max Gurtovoy <maxg@mellanox.com>

Fixed to not call queue integrity functions if BLK_DEV_INTEGRITY
isn't defined in the config.

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-09-17 20:03:49 -06:00
Mikulas Patocka
5729b6e5a1 dm integrity: fix a crash due to BUG_ON in __journal_read_write()
Fix a crash that was introduced by the commit 724376a04d. The crash is
reported here: https://gitlab.com/cryptsetup/cryptsetup/issues/468

When reading from the integrity device, the function
dm_integrity_map_continue calls find_journal_node to find out if the
location to read is present in the journal. Then, it calculates how many
sectors are consecutively stored in the journal. Then, it locks the range
with add_new_range and wait_and_add_new_range.

The problem is that during wait_and_add_new_range, we hold no locks (we
don't hold ic->endio_wait.lock and we don't hold a range lock), so the
journal may change arbitrarily while wait_and_add_new_range sleeps.

The code then goes to __journal_read_write and hits
BUG_ON(journal_entry_get_sector(je) != logical_sector); because the
journal has changed.

In order to fix this bug, we need to re-check the journal location after
wait_and_add_new_range. We restrict the length to one block in order to
not complicate the code too much.

Fixes: 724376a04d ("dm integrity: implement fair range locks")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-08-15 16:01:57 -04:00
Fuqian Huang
131670c262 dm integrity: use kzalloc() instead of kmalloc() + memset()
Signed-off-by: Fuqian Huang <huangfq.daxian@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-09 14:14:20 -04:00
Milan Broz
5f1c56b34e dm integrity: always set version on superblock update
The new integrity bitmap mode uses the dirty flag.  The dirty flag
should not be set in older superblock versions.

The current code sets it unconditionally, even if the superblock
was already formatted without bitmap in older system.

Fix this by moving the version check to one common place and check
version on every superblock write.

Signed-off-by: Milan Broz <gmazyland@gmail.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-07-09 13:46:01 -04:00
Linus Torvalds
311f71281f - Improve DM snapshot target's scalability by using finer grained
locking.  Requires some list_bl interface improvements.
 
 - Add ability for DM integrity to use a bitmap mode, that tracks regions
   where data and metadata are out of sync, instead of using a journal.
 
 - Improve DM thin provisioning target to not write metadata changes to
   disk if the thin-pool and associated thin devices are merely
   activated but not used.  This avoids metadata corruption due to
   concurrent activation of thin devices across different OS instances
   (e.g. split brain scenarios, which ultimately would be avoided if
   proper device filters were used -- but not having proper filtering has
   proven a very common configuration mistake)
 
 - Fix missing call to path selector type->end_io in DM multipath.  This
   fixes reported performance problems due to inaccurate path selector IO
   accounting causing an imbalance of IO (e.g. avoiding issuing IO to
   particular path due to it seemingly being heavily used).
 
 - Fix bug in DM cache metadata's loading of its discard bitset that
   could lead to all cache blocks being discarded if the very first cache
   block was discarded (thankfully in practice the first cache block is
   generally in use; be it FS superblock, partition table, disk label,
   etc).
 
 - Add testing-only DM dust target which simulates a device that has
   failing sectors and/or read failures.
 
 - Fix a DM init error path reference count hang that caused boot hangs
   if user supplied malformed input on kernel commandline.
 
 - Fix a couple issues with DM crypt target's logging being overly
   verbose or lacking context.
 
 - Various other small fixes to DM init, DM multipath, DM zoned, and DM
   crypt.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEJfWUX4UqZ4x1O2wixSPxCi2dA1oFAlzdcCgTHHNuaXR6ZXJA
 cmVkaGF0LmNvbQAKCRDFI/EKLZ0DWsxZB/9idHl8LmwwL1JzBfi/XX7bWxwqDQLo
 j1b3ycQ14AKVau4VCkmgDuRIfMDuU6PIAVvsMeVbF3aCE0fZ7zbEV1qHefbtJuCL
 MMm//KbrhIT8oMKYUWtlOj7XI9MT6ErFzfActBZ6UF6r21m1N3bohhVGN7kvCnJm
 wgmSlnz/m2GLKK8gQx+OisnAh0nlje3PIdIYPu7uWN6t0FF2XRz3UwWTuyw7lYhC
 Rx2J+sOIL02CtadhHKLMCG8OutRXWP01cBSohUVJIMGihWfbe6aqvhG5afbqb4bG
 UQrXl477ry5zyQ4fAU2JKZ+8qFvc1FoLLknKrZQu+uYPRokUPw/AwiL7
 =mOH3
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2/dm-changes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper updates from Mike Snitzer:

 - Improve DM snapshot target's scalability by using finer grained
   locking. Requires some list_bl interface improvements.

 - Add ability for DM integrity to use a bitmap mode, that tracks
   regions where data and metadata are out of sync, instead of using a
   journal.

 - Improve DM thin provisioning target to not write metadata changes to
   disk if the thin-pool and associated thin devices are merely
   activated but not used. This avoids metadata corruption due to
   concurrent activation of thin devices across different OS instances
   (e.g. split brain scenarios, which ultimately would be avoided if
   proper device filters were used -- but not having proper filtering
   has proven a very common configuration mistake)

 - Fix missing call to path selector type->end_io in DM multipath. This
   fixes reported performance problems due to inaccurate path selector
   IO accounting causing an imbalance of IO (e.g. avoiding issuing IO to
   particular path due to it seemingly being heavily used).

 - Fix bug in DM cache metadata's loading of its discard bitset that
   could lead to all cache blocks being discarded if the very first
   cache block was discarded (thankfully in practice the first cache
   block is generally in use; be it FS superblock, partition table, disk
   label, etc).

 - Add testing-only DM dust target which simulates a device that has
   failing sectors and/or read failures.

 - Fix a DM init error path reference count hang that caused boot hangs
   if user supplied malformed input on kernel commandline.

 - Fix a couple issues with DM crypt target's logging being overly
   verbose or lacking context.

 - Various other small fixes to DM init, DM multipath, DM zoned, and DM
   crypt.

* tag 'for-5.2/dm-changes-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: (42 commits)
  dm: fix a couple brace coding style issues
  dm crypt: print device name in integrity error message
  dm crypt: move detailed message into debug level
  dm ioctl: fix hang in early create error condition
  dm integrity: whitespace, coding style and dead code cleanup
  dm integrity: implement synchronous mode for reboot handling
  dm integrity: handle machine reboot in bitmap mode
  dm integrity: add a bitmap mode
  dm integrity: introduce a function add_new_range_and_wait()
  dm integrity: allow large ranges to be described
  dm ingerity: pass size to dm_integrity_alloc_page_list()
  dm integrity: introduce rw_journal_sectors()
  dm integrity: update documentation
  dm integrity: don't report unused options
  dm integrity: don't check null pointer before kvfree and vfree
  dm integrity: correctly calculate the size of metadata area
  dm dust: Make dm_dust_init and dm_dust_exit static
  dm dust: remove redundant unsigned comparison to less than zero
  dm mpath: always free attached_handler_name in parse_path()
  dm init: fix max devices/targets checks
  ...
2019-05-16 15:55:48 -07:00
Mike Snitzer
05d6909ea9 dm integrity: whitespace, coding style and dead code cleanup
Just some things that stood out like a sore thumb.
Also, converted some printk(KERN_CRIT, ...) to DMCRIT(...)

Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-09 16:00:31 -04:00
Mikulas Patocka
482714932e dm integrity: implement synchronous mode for reboot handling
Unfortunatelly, there may be bios coming even after the reboot notifier
was called.  We don't want these bios to make the bitmap dirty again.

To address this, implement a synchronous mode - when a bio is about to
be terminated, we clean the bitmap and terminate the bio after the clean
operation succeeds.  This obviously slows down bio processing, but it
makes sure that when all bios are finished, the bitmap will be clean.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08 13:41:59 -04:00
Mikulas Patocka
1f5a77591b dm integrity: handle machine reboot in bitmap mode
When in bitmap mode the bitmap must be cleared when rebooting.  This
commit adds the reboot hook.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08 13:41:58 -04:00
Mikulas Patocka
468dfca38b dm integrity: add a bitmap mode
Introduce an alternate mode of operation where dm-integrity uses a
bitmap instead of a journal. If a bit in the bitmap is 1, the
corresponding region's data and integrity tags are not synchronized - if
the machine crashes, the unsynchronized regions will be recalculated.
The bitmap mode is faster than the journal mode, because we don't have
to write the data twice, but it is also less reliable, because if data
corruption happens when the machine crashes, it may not be detected.

Benchmark results for an SSD connected to a SATA300 port, when doing
large linear writes with dd:

buffered I/O:
        raw device throughput - 245MB/s
        dm-integrity with journaling - 120MB/s
        dm-integrity with bitmap - 238MB/s

direct I/O with 1MB block size:
        raw device throughput - 248MB/s
        dm-integrity with journaling - 123MB/s
        dm-integrity with bitmap - 223MB/s

For more info see dm-integrity in Documentation/device-mapper/

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08 13:41:58 -04:00
Mikulas Patocka
8b3bbd490d dm integrity: introduce a function add_new_range_and_wait()
Introduce a function add_new_range_and_wait() in order to avoid
repetitive code.  It will be used in the following commit.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-08 13:40:28 -04:00
Linus Torvalds
67a2422239 for-5.2/block-20190507
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAlzR0AAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpo0MD/47D1kBK9rGzkAwIz1Jkh1Qy/ITVaDJzmHJ
 UP5uncQsgKFLKMR1LbRcrWtmk2MwFDNULGbteHFeCYE1ypCrTgpWSp5+SJluKd1Q
 hma9krLSAXO9QiSaZ4jafshXFIZxz6IjakOW8c9LrT80Ze47yh7AxiLwDafcp/Jj
 x6NW790qB7ENDtfarDkZk14NCS8HGLRHO5B21LB+hT0Kfbh0XZaLzJdj7Mck1wPA
 VT8hL9mPuA++AjF7Ra4kUjwSakgmajTa3nS2fpkwTYdztQfas7x5Jiv7FWxrrelb
 qbabkNkWKepcHAPEiZR7o53TyfCucGeSK/jG+dsJ9KhNp26kl1ci3frl5T6PfVMP
 SPPDjsKIHs+dqFrU9y5rSGhLJqewTs96hHthnLGxyF67+5sRb5+YIy+dcqgiyc/b
 TUVyjCD6r0cO2q4v9VhwnhOyeBUA9Rwbu8nl7JV5Q45uG7qI4BC39l1jfubMNDPO
 GLNGUUzb6ER7z6lYINjRSF2Jhejsx8SR9P7jhpb1Q7k/VvDDxO1T4FpwvqWFz9+s
 Gn+s6//+cA6LL+42eZkQjvwF2CUNE7TaVT8zdb+s5HP1RQkZToqUnsQCGeRTrFni
 RqWXfW9o9+awYRp431417oMdX/LvLGq9+ZtifRk9DqDcowXevTaf0W2RpplWSuiX
 RcCuPeLAVg==
 =Ot0g
 -----END PGP SIGNATURE-----

Merge tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block

Pull block updates from Jens Axboe:
 "Nothing major in this series, just fixes and improvements all over the
  map. This contains:

   - Series of fixes for sed-opal (David, Jonas)

   - Fixes and performance tweaks for BFQ (via Paolo)

   - Set of fixes for bcache (via Coly)

   - Set of fixes for md (via Song)

   - Enabling multi-page for passthrough requests (Ming)

   - Queue release fix series (Ming)

   - Device notification improvements (Martin)

   - Propagate underlying device rotational status in loop (Holger)

   - Removal of mtip32xx trim support, which has been disabled for years
     (Christoph)

   - Improvement and cleanup of nvme command handling (Christoph)

   - Add block SPDX tags (Christoph)

   - Cleanup/hardening of bio/bvec iteration (Christoph)

   - A few NVMe pull requests (Christoph)

   - Removal of CONFIG_LBDAF (Christoph)

   - Various little fixes here and there"

* tag 'for-5.2/block-20190507' of git://git.kernel.dk/linux-block: (164 commits)
  block: fix mismerge in bvec_advance
  block: don't drain in-progress dispatch in blk_cleanup_queue()
  blk-mq: move cancel of hctx->run_work into blk_mq_hw_sysfs_release
  blk-mq: always free hctx after request queue is freed
  blk-mq: split blk_mq_alloc_and_init_hctx into two parts
  blk-mq: free hw queue's resource in hctx's release handler
  blk-mq: move cancel of requeue_work into blk_mq_release
  blk-mq: grab .q_usage_counter when queuing request from plug code path
  block: fix function name in comment
  nvmet: protect discovery change log event list iteration
  nvme: mark nvme_core_init and nvme_core_exit static
  nvme: move command size checks to the core
  nvme-fabrics: check more command sizes
  nvme-pci: check more command sizes
  nvme-pci: remove an unneeded variable initialization
  nvme-pci: unquiesce admin queue on shutdown
  nvme-pci: shutdown on timeout during deletion
  nvme-pci: fix psdt field for single segment sgls
  nvme-multipath: don't print ANA group state by default
  nvme-multipath: split bios with the ns_head bio_set before submitting
  ...
2019-05-07 18:14:36 -07:00
Mikulas Patocka
4f43446ddf dm integrity: allow large ranges to be described
Change n_sectors data type from unsigned to sector_t.  Following commits
will need to lock large ranges.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:12 -04:00
Mikulas Patocka
d5027e0345 dm ingerity: pass size to dm_integrity_alloc_page_list()
Pass size to dm_integrity_alloc_page_list().  This is needed so
following commits can pass a size that is different from
ic->journal_pages.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:12 -04:00
Mikulas Patocka
981e8a980d dm integrity: introduce rw_journal_sectors()
Introduce a function rw_journal_sectors() that takes sector and length
as its arguments instead of a section and the number of sections.

This functions will be used in further patches.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:11 -04:00
Mikulas Patocka
88ad5d1eb1 dm integrity: update documentation
Update documentation with the "meta_device" parameter and flags.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:10 -04:00
Mikulas Patocka
893e3c395b dm integrity: don't report unused options
If we are not journaling, don't report journaling options in the table
status.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:09 -04:00
Mikulas Patocka
97abfde17a dm integrity: don't check null pointer before kvfree and vfree
The functions kfree, vfree and kvfree do nothing if we pass a NULL
pointer to them.  So we don't need to test the pointer for NULL.

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:08 -04:00
Mikulas Patocka
30bba430dd dm integrity: correctly calculate the size of metadata area
When we use separate devices for data and metadata, dm-integrity would
incorrectly calculate the size of the metadata device as if it had
512-byte block size - and it would refuse activation with larger block
size and smaller metadata device.

Fix this so that it takes actual block size into account, which fixes
the following reported issue:
https://gitlab.com/cryptsetup/cryptsetup/issues/450

Fixes: 356d9d52e1 ("dm integrity: allow separate metadata device")
Cc: stable@vger.kernel.org # v4.19+
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-05-07 16:05:08 -04:00
Linus Torvalds
81ff5d2cba Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
 "API:
   - Add support for AEAD in simd
   - Add fuzz testing to testmgr
   - Add panic_on_fail module parameter to testmgr
   - Use per-CPU struct instead multiple variables in scompress
   - Change verify API for akcipher

  Algorithms:
   - Convert x86 AEAD algorithms over to simd
   - Forbid 2-key 3DES in FIPS mode
   - Add EC-RDSA (GOST 34.10) algorithm

  Drivers:
   - Set output IV with ctr-aes in crypto4xx
   - Set output IV in rockchip
   - Fix potential length overflow with hashing in sun4i-ss
   - Fix computation error with ctr in vmx
   - Add SM4 protected keys support in ccree
   - Remove long-broken mxc-scc driver
   - Add rfc4106(gcm(aes)) cipher support in cavium/nitrox"

* 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (179 commits)
  crypto: ccree - use a proper le32 type for le32 val
  crypto: ccree - remove set but not used variable 'du_size'
  crypto: ccree - Make cc_sec_disable static
  crypto: ccree - fix spelling mistake "protedcted" -> "protected"
  crypto: caam/qi2 - generate hash keys in-place
  crypto: caam/qi2 - fix DMA mapping of stack memory
  crypto: caam/qi2 - fix zero-length buffer DMA mapping
  crypto: stm32/cryp - update to return iv_out
  crypto: stm32/cryp - remove request mutex protection
  crypto: stm32/cryp - add weak key check for DES
  crypto: atmel - remove set but not used variable 'alg_name'
  crypto: picoxcell - Use dev_get_drvdata()
  crypto: crypto4xx - get rid of redundant using_sd variable
  crypto: crypto4xx - use sync skcipher for fallback
  crypto: crypto4xx - fix cfb and ofb "overran dst buffer" issues
  crypto: crypto4xx - fix ctr-aes missing output IV
  crypto: ecrdsa - select ASN1 and OID_REGISTRY for EC-RDSA
  crypto: ux500 - use ccflags-y instead of CFLAGS_<basename>.o
  crypto: ccree - handle tee fips error during power management resume
  crypto: ccree - add function to handle cryptocell tee fips error
  ...
2019-05-06 20:15:06 -07:00
Eric Biggers
877b5691f2 crypto: shash - remove shash_desc::flags
The flags field in 'struct shash_desc' never actually does anything.
The only ostensibly supported flag is CRYPTO_TFM_REQ_MAY_SLEEP.
However, no shash algorithm ever sleeps, making this flag a no-op.

With this being the case, inevitably some users who can't sleep wrongly
pass MAY_SLEEP.  These would all need to be fixed if any shash algorithm
actually started sleeping.  For example, the shash_ahash_*() functions,
which wrap a shash algorithm with the ahash API, pass through MAY_SLEEP
from the ahash API to the shash API.  However, the shash functions are
called under kmap_atomic(), so actually they're assumed to never sleep.

Even if it turns out that some users do need preemption points while
hashing large buffers, we could easily provide a helper function
crypto_shash_update_large() which divides the data into smaller chunks
and calls crypto_shash_update() and cond_resched() for each chunk.  It's
not necessary to have a flag in 'struct shash_desc', nor is it necessary
to make individual shash algorithms aware of this at all.

Therefore, remove shash_desc::flags, and document that the
crypto_shash_*() functions can be called from any context.

Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
2019-04-25 15:38:12 +08:00
Jens Axboe
5c61ee2cd5 Linux 5.1-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAly8rGYeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGmZMH/1IRB0E1Qmzz8yzw
 wj79UuRGYPqxDDSWW+wNc8sU4Ic7iYirn9APHAztCdQqsjmzU/OVLfSa3JhdBe5w
 THo7pbGKBqEDcWnKfNk/21jXFNLZ1vr9BoQv2DGU2MMhHAyo/NZbalo2YVtpQPmM
 OCRth5n+LzvH7rGrX7RYgWu24G9l3NMfgtaDAXBNXesCGFAjVRrdkU5CBAaabvtU
 4GWh/nnutndOOLdByL3x+VZ3H3fIBnbNjcIGCglvvqzk7h3hrfGEl4UCULldTxcM
 IFsfMUhSw1ENy7F6DHGbKIG90cdCJcrQ8J/ziEzjj/KLGALluutfFhVvr6YCM2J6
 2RgU8CY=
 =CfY1
 -----END PGP SIGNATURE-----

Merge tag 'v5.1-rc6' into for-5.2/block

Pull in v5.1-rc6 to resolve two conflicts. One is in BFQ, in just a
comment, and is trivial. The other one is a conflict due to a later fix
in the bio multi-page work, and needs a bit more care.

* tag 'v5.1-rc6': (770 commits)
  Linux 5.1-rc6
  block: make sure that bvec length can't be overflow
  block: kill all_q_node in request_queue
  x86/cpu/intel: Lower the "ENERGY_PERF_BIAS: Set to normal" message's log priority
  coredump: fix race condition between mmget_not_zero()/get_task_mm() and core dumping
  mm/kmemleak.c: fix unused-function warning
  init: initialize jump labels before command line option parsing
  kernel/watchdog_hld.c: hard lockup message should end with a newline
  kcov: improve CONFIG_ARCH_HAS_KCOV help text
  mm: fix inactive list balancing between NUMA nodes and cgroups
  mm/hotplug: treat CMA pages as unmovable
  proc: fixup proc-pid-vm test
  proc: fix map_files test on F29
  mm/vmstat.c: fix /proc/vmstat format for CONFIG_DEBUG_TLBFLUSH=y CONFIG_SMP=n
  mm/memory_hotplug: do not unlock after failing to take the device_hotplug_lock
  mm: swapoff: shmem_unuse() stop eviction without igrab()
  mm: swapoff: take notice of completion sooner
  mm: swapoff: remove too limiting SWAP_UNUSE_MAX_TRIES
  mm: swapoff: shmem_find_swap_entries() filter out other types
  slab: store tagged freelist for off-slab slabmgmt
  ...

Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-22 09:47:36 -06:00
Christoph Hellwig
72deb455b5 block: remove CONFIG_LBDAF
Currently support for 64-bit sector_t and blkcnt_t is optional on 32-bit
architectures.  These types are required to support block device and/or
file sizes larger than 2 TiB, and have generally defaulted to on for
a long time.  Enabling the option only increases the i386 tinyconfig
size by 145 bytes, and many data structures already always use
64-bit values for their in-core and on-disk data structures anyway,
so there should not be a large change in dynamic memory usage either.

Dropping this option removes a somewhat weird non-default config that
has cause various bugs or compiler warnings when actually used.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2019-04-06 10:48:35 -06:00
Mikulas Patocka
4ed319c6ac dm integrity: fix deadlock with overlapping I/O
dm-integrity will deadlock if overlapping I/O is issued to it, the bug
was introduced by commit 724376a04d ("dm integrity: implement fair
range locks").  Users rarely use overlapping I/O so this bug went
undetected until now.

Fix this bug by correcting, likely cut-n-paste, typos in
ranges_overlap() and also remove a flawed ranges_overlap() check in
remove_range_unlocked().  This condition could leave unprocessed bios
hanging on wait_list forever.

Cc: stable@vger.kernel.org # v4.19+
Fixes: 724376a04d ("dm integrity: implement fair range locks")
Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-05 18:49:08 -04:00
YueHaibing
5efedc9b62 dm integrity: make dm_integrity_init and dm_integrity_exit static
Fix sparse warnings:

drivers/md/dm-integrity.c:3619:12: warning:
 symbol 'dm_integrity_init' was not declared. Should it be static?
drivers/md/dm-integrity.c:3638:6: warning:
 symbol 'dm_integrity_exit' was not declared. Should it be static?

Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Mike Snitzer <snitzer@redhat.com>
2019-04-01 16:16:36 -04:00