Pass a peer device parameter through the bitmap I/O functions to the I/O
handlers. In after_state_ch(), set that parameter when queuing the
drbd_send_bitmap operation so that this operation knows where to send the
bitmap.
Signed-off-by: Andreas Gruenbacher <agruen@kernel.org>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20230330102744.2128122-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Trying to remove an "empty" (just initialized, or "cleared") interval
from the tree, this results in an endless loop.
As we typically protect the tree with a spinlock_irq,
the result is a hung system.
Be nice to error cleanup code paths, ignore removal of empty intervals.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20230113123538.144276-8-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This require_context attribute originated in a proposed sparse patch by
Philipp Reisner back in 2008. Johannes Berg had a different solution to
a similar problem, and that patch "won" in the end; so the require_context
thing never got merged. The whole history can be read at [0].
DRBD kept using these annotations anyway for a while. Nowadays, on a
modern unmodified sparse, they obviously do nothing, and they are hardly
used anymore anyway.
So, just remove the definitions of these macros.
[0] https://www.spinics.net/lists/linux-sparse/msg01150.html
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123538.144276-6-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To be more similar to what we do in the out-of-tree module and ease the
upstreaming process.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123506.144082-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
To be more similar to what we do in the out-of-tree module and ease the
upstreaming process.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Link: https://lore.kernel.org/r/20230113123506.144082-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmO4SiAQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgc9D/0XJufUgHsLeFCF5G+q6iL5Bz+d7ymw+VFv
xrNjOz8wUKYKXcJqxLrPdkmL1tcd1+fESNGgyBidn4P53BWoHB9dtbs8+Lova08t
I4lQmZHgxgbAMhSOwGvHlOTkdlBIw/fBgQ6XdI+1qmpxzma5+gjImjyp7oH+pODP
zqsg3DKRQmDApKWtvB6D5iItsWc1Jx5TEuOfU5/JjLuVZWl6O2qynNVUccF5T89O
jkt624yO+r70CVfX3NAdFTm/mOEUiGH97l4l/8OkekJ40pf73xzvNRF/S8z8nHb/
QUGY1tKvr08xfPusl3epmQ5aO938F0aFpKi2x6P+z3G6Uq+dqMMrjJl8XMDG+J+d
+yBow5yRH7o6oBb0YPPz/6S5zBjslsHtuKFd/rs4mCDfjp9GHiIIiIpdLxZEWawJ
WaYlc5WlzSdopT/IxfaRZ9HMHzscdKadjiFngSKdpEdCUw7wxdIey+/9xbKR+xh0
Es13MzyCCurj4OnyDl5cnetGJUNNiL1JvQmIaFVndyxnMfvOaZBBmKW7h9RYBIU/
nqi4vZwYoafnGUIfLFL6uq9F627lF/EhodDuLheqz0G2pWhmFJITOJUAakGNFf83
22CiKY2GyTrOy5tKqkNzv7BG/KyJZGP+CxyyQ/7xm0k2C9wEjYSpZHKcjaNZygU5
eswPKbZMkw==
=LJ5Q
-----END PGP SIGNATURE-----
Merge tag 'block-2023-01-06' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
"The big change here is obviously the revert of the pktcdvd driver
removal. Outside of that, just minor tweaks. In detail:
- Re-instate the pktcdvd driver, which necessitates adding back
bio_copy_data_iter() and the fops->devnode() hook for now (me)
- Fix for splitting of a bio marked as NOWAIT, causing either nowait
reads or writes to error with EAGAIN even if parts of the IO
completed (me)
- Fix for ublk, punting management commands to io-wq as they can all
easily block for extended periods of time (Ming)
- Removal of SRCU dependency for the block layer (Paul)"
* tag 'block-2023-01-06' of git://git.kernel.dk/linux:
block: Remove "select SRCU"
Revert "pktcdvd: remove driver."
Revert "block: remove devnode callback from struct block_device_operations"
Revert "block: bio_copy_data_iter"
ublk: honor IO_URING_F_NONBLOCK for handling control command
block: don't allow splitting of a REQ_NOWAIT bio
block: handle bio_split_to_limits() NULL return
This can't happen right now, but in preparation for allowing
bio_split_to_limits() returning NULL if it ended the bio, check for it
in all the callers.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown". After a timer is set to this state, then it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since moving to memalloc_nofs_save/restore, SUNRPC has stopped setting the
GFP_NOIO flag on sk_allocation which the networking system uses to decide
when it is safe to use current->task_frag. The results of this are
unexpected corruption in task_frag when SUNRPC is involved in memory
reclaim.
The corruption can be seen in crashes, but the root cause is often
difficult to ascertain as a crashing machine's stack trace will have no
evidence of being near NFS or SUNRPC code. I believe this problem to
be much more pervasive than reports to the community may indicate.
Fix this by having kernel users of sockets that may corrupt task_frag due
to reclaim set sk_use_task_frag = false. Preemptively correcting this
situation for users that still set sk_allocation allows them to convert to
memalloc_nofs_save/restore without the same unexpected corruptions that are
sure to follow, unlikely to show up in testing, and difficult to bisect.
CC: Philipp Reisner <philipp.reisner@linbit.com>
CC: Lars Ellenberg <lars.ellenberg@linbit.com>
CC: "Christoph Böhmwalder" <christoph.boehmwalder@linbit.com>
CC: Jens Axboe <axboe@kernel.dk>
CC: Josef Bacik <josef@toxicpanda.com>
CC: Keith Busch <kbusch@kernel.org>
CC: Christoph Hellwig <hch@lst.de>
CC: Sagi Grimberg <sagi@grimberg.me>
CC: Lee Duncan <lduncan@suse.com>
CC: Chris Leech <cleech@redhat.com>
CC: Mike Christie <michael.christie@oracle.com>
CC: "James E.J. Bottomley" <jejb@linux.ibm.com>
CC: "Martin K. Petersen" <martin.petersen@oracle.com>
CC: Valentina Manea <valentina.manea.m@gmail.com>
CC: Shuah Khan <shuah@kernel.org>
CC: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
CC: David Howells <dhowells@redhat.com>
CC: Marc Dionne <marc.dionne@auristor.com>
CC: Steve French <sfrench@samba.org>
CC: Christine Caulfield <ccaulfie@redhat.com>
CC: David Teigland <teigland@redhat.com>
CC: Mark Fasheh <mark@fasheh.com>
CC: Joel Becker <jlbec@evilplan.org>
CC: Joseph Qi <joseph.qi@linux.alibaba.com>
CC: Eric Van Hensbergen <ericvh@gmail.com>
CC: Latchesar Ionkov <lucho@ionkov.net>
CC: Dominique Martinet <asmadeus@codewreck.org>
CC: Ilya Dryomov <idryomov@gmail.com>
CC: Xiubo Li <xiubli@redhat.com>
CC: Chuck Lever <chuck.lever@oracle.com>
CC: Jeff Layton <jlayton@kernel.org>
CC: Trond Myklebust <trond.myklebust@hammerspace.com>
CC: Anna Schumaker <anna@kernel.org>
CC: Steffen Klassert <steffen.klassert@secunet.com>
CC: Herbert Xu <herbert@gondor.apana.org.au>
Suggested-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmOScsgQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpi5ID/9pLXFYOq1+uDjU0KO/MdjMjK8Ukr34lCnk
WkajRLheE8JBKOFDE54XJk56sQSZHX9bTWqziar0h1fioh7FlQR/tVvzsERCm2M9
2y9THJNJygC68wgybStyiKlshFjl7TD7Kv5N9Y3xP3mkQygT+D6o8fXZk5xQbYyH
YdFSoq4rJVHxRL03yzQiReGGIYdOUEQQh8l1FiLwLlKa3lXAey1KuxWIzksVN0KK
aZB4QhiBpOiPgDHUVisq2XtyQjpZ2byoCImPzgrcqk9Jo4esvm/e6esrg4xlsvII
LKFFkTmbVqjUZtFjqakFHmfuzVor4nU5f+xb90ZHExuuODYckkxWp5rWhf9QwqqI
0ik6WYgI1/5vnHnX8f2DYzOFQf9qa/rLgg0CshyUODlD6RfHa9vntqYvlIFkmOBd
Q7KblIoK8YTzUS1M+v7X8JQ7gDR2KwygH37Da2KJS+vgvfIb8kJGr1ZORuhJuJJ7
Bl69gaNkHTHrqufp7UI64YXfueeuNu2J9z3zwzGoxeaFaofF/phDn0/2gCQE1fQI
XBhsMw+ETqI6B2SPHMnzYDu2DM1S8ZTOYQlaD4G3uqgWnAM1tG707395uAy5yu4n
D5azU1fVG4UocoNIyPujpaoSRs2zWZycEFEeUQkhyDDww/j4hlHi6H33eOnk0zsr
wxzFGfvHfw==
=k/vv
-----END PGP SIGNATURE-----
Merge tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux
Pull block updates from Jens Axboe:
- NVMe pull requests via Christoph:
- Support some passthrough commands without CAP_SYS_ADMIN (Kanchan
Joshi)
- Refactor PCIe probing and reset (Christoph Hellwig)
- Various fabrics authentication fixes and improvements (Sagi
Grimberg)
- Avoid fallback to sequential scan due to transient issues (Uday
Shankar)
- Implement support for the DEAC bit in Write Zeroes (Christoph
Hellwig)
- Allow overriding the IEEE OUI and firmware revision in configfs
for nvmet (Aleksandr Miloserdov)
- Force reconnect when number of queue changes in nvmet (Daniel
Wagner)
- Minor fixes and improvements (Uros Bizjak, Joel Granados, Sagi
Grimberg, Christoph Hellwig, Christophe JAILLET)
- Fix and cleanup nvme-fc req allocation (Chaitanya Kulkarni)
- Use the common tagset helpers in nvme-pci driver (Christoph
Hellwig)
- Cleanup the nvme-pci removal path (Christoph Hellwig)
- Use kstrtobool() instead of strtobool (Christophe JAILLET)
- Allow unprivileged passthrough of Identify Controller (Joel
Granados)
- Support io stats on the mpath device (Sagi Grimberg)
- Minor nvmet cleanup (Sagi Grimberg)
- MD pull requests via Song:
- Code cleanups (Christoph)
- Various fixes
- Floppy pull request from Denis:
- Fix a memory leak in the init error path (Yuan)
- Series fixing some batch wakeup issues with sbitmap (Gabriel)
- Removal of the pktcdvd driver that was deprecated more than 5 years
ago, and subsequent removal of the devnode callback in struct
block_device_operations as no users are now left (Greg)
- Fix for partition read on an exclusively opened bdev (Jan)
- Series of elevator API cleanups (Jinlong, Christoph)
- Series of fixes and cleanups for blk-iocost (Kemeng)
- Series of fixes and cleanups for blk-throttle (Kemeng)
- Series adding concurrent support for sync queues in BFQ (Yu)
- Series bringing drbd a bit closer to the out-of-tree maintained
version (Christian, Joel, Lars, Philipp)
- Misc drbd fixes (Wang)
- blk-wbt fixes and tweaks for enable/disable (Yu)
- Fixes for mq-deadline for zoned devices (Damien)
- Add support for read-only and offline zones for null_blk
(Shin'ichiro)
- Series fixing the delayed holder tracking, as used by DM (Yu,
Christoph)
- Series enabling bio alloc caching for IRQ based IO (Pavel)
- Series enabling userspace peer-to-peer DMA (Logan)
- BFQ waker fixes (Khazhismel)
- Series fixing elevator refcount issues (Christoph, Jinlong)
- Series cleaning up references around queue destruction (Christoph)
- Series doing quiesce by tagset, enabling cleanups in drivers
(Christoph, Chao)
- Series untangling the queue kobject and queue references (Christoph)
- Misc fixes and cleanups (Bart, David, Dawei, Jinlong, Kemeng, Ye,
Yang, Waiman, Shin'ichiro, Randy, Pankaj, Christoph)
* tag 'for-6.2/block-2022-12-08' of git://git.kernel.dk/linux: (247 commits)
blktrace: Fix output non-blktrace event when blk_classic option enabled
block: sed-opal: Don't include <linux/kernel.h>
sed-opal: allow using IOC_OPAL_SAVE for locking too
blk-cgroup: Fix typo in comment
block: remove bio_set_op_attrs
nvmet: don't open-code NVME_NS_ATTR_RO enumeration
nvme-pci: use the tagset alloc/free helpers
nvme: add the Apple shared tag workaround to nvme_alloc_io_tag_set
nvme: only set reserved_tags in nvme_alloc_io_tag_set for fabrics controllers
nvme: consolidate setting the tagset flags
nvme: pass nr_maps explicitly to nvme_alloc_io_tag_set
block: bio_copy_data_iter
nvme-pci: split out a nvme_pci_ctrl_is_dead helper
nvme-pci: return early on ctrl state mismatch in nvme_reset_work
nvme-pci: rename nvme_disable_io_queues
nvme-pci: cleanup nvme_suspend_queue
nvme-pci: remove nvme_pci_disable
nvme-pci: remove nvme_disable_admin_queue
nvme: merge nvme_shutdown_ctrl into nvme_disable_ctrl
nvme: use nvme_wait_ready in nvme_shutdown_ctrl
...
direction misannotations and (hopefully) preventing
more of the same for the future.
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
-----BEGIN PGP SIGNATURE-----
iHQEABYIAB0WIQQqUNBr3gm4hGXdBJlZ7Krx/gZQ6wUCY5ZzQAAKCRBZ7Krx/gZQ
65RZAP4nTkvOn0NZLVFkuGOx8pgJelXAvrteyAuecVL8V6CR4AD40qCVY51PJp8N
MzwiRTeqnGDxTTF7mgd//IB6hoatAA==
=bcvF
-----END PGP SIGNATURE-----
Merge tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter updates from Al Viro:
"iov_iter work; most of that is about getting rid of direction
misannotations and (hopefully) preventing more of the same for the
future"
* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
use less confusing names for iov_iter direction initializers
iov_iter: saner checks for attempt to copy to/from iterator
[xen] fix "direction" argument of iov_iter_kvec()
[vhost] fix 'direction' argument of iov_iter_{init,bvec}()
[target] fix iov_iter_bvec() "direction" argument
[s390] memcpy_real(): WRITE is "data source", not destination...
[s390] zcore: WRITE is "data source", not destination...
[infiniband] READ is "data destination", not source...
[fsi] WRITE is "data source", not destination...
[s390] copy_oldmem_kernel() - WRITE is "data source", not destination
csum_and_copy_to_iter(): handle ITER_DISCARD
get rid of unlikely() on page_copy_sane() calls
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEq5lC5tSkz8NBJiCnSfxwEqXeA64FAmOU+U8ACgkQSfxwEqXe
A67NnQ//Y5DltmvibyPd7r1TFT2gUYv+Rx3sUV9ZE1NYptd/SWhhcL8c5FZ70Fuw
bSKCa1uiWjOxosjXT1kGrWq3de7q7oUpAPSOGxgxzoaNURIt58N/ajItCX/4Au8I
RlGAScHy5e5t41/26a498kB6qJ441fBEqCYKQpPLINMBAhe8TQ+NVp0rlpUwNHFX
WrUGg4oKWxdBIW3HkDirQjJWDkkAiklRTifQh/Al4b6QDbOnRUGGCeckNOhixsvS
waHWTld+Td8jRrA4b82tUb2uVZ2/b8dEvj/A8CuTv4yC0lywoyMgBWmJAGOC+UmT
ZVNdGW02Jc2T+Iap8ZdsEmeLHNqbli4+IcbY5xNlov+tHJ2oz41H9TZoYKbudlr6
/ReAUPSn7i50PhbQlEruj3eg+M2gjOeh8OF8UKwwRK8PghvyWQ1ScW0l3kUhPIhI
PdIG6j4+D2mJc1FIj2rTVB+Bg933x6S+qx4zDxGlNp62AARUFYf6EgyD6aXFQVuX
RxcKb6cjRuFkzFiKc8zkqg5edZH+IJcPNuIBmABqTGBOxbZWURXzIQvK/iULqZa4
CdGAFIs6FuOh8pFHLI3R4YoHBopbHup/xKDEeAO9KZGyeVIuOSERDxxo5f/ITzcq
APvT77DFOEuyvanr8RMqqh0yUjzcddXqw9+ieufsAyDwjD9DTuE=
=QRhK
-----END PGP SIGNATURE-----
Merge tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
- Replace prandom_u32_max() and various open-coded variants of it,
there is now a new family of functions that uses fast rejection
sampling to choose properly uniformly random numbers within an
interval:
get_random_u32_below(ceil) - [0, ceil)
get_random_u32_above(floor) - (floor, U32_MAX]
get_random_u32_inclusive(floor, ceil) - [floor, ceil]
Coccinelle was used to convert all current users of
prandom_u32_max(), as well as many open-coded patterns, resulting in
improvements throughout the tree.
I'll have a "late" 6.1-rc1 pull for you that removes the now unused
prandom_u32_max() function, just in case any other trees add a new
use case of it that needs to converted. According to linux-next,
there may be two trivial cases of prandom_u32_max() reintroductions
that are fixable with a 's/.../.../'. So I'll have for you a final
conversion patch doing that alongside the removal patch during the
second week.
This is a treewide change that touches many files throughout.
- More consistent use of get_random_canary().
- Updates to comments, documentation, tests, headers, and
simplification in configuration.
- The arch_get_random*_early() abstraction was only used by arm64 and
wasn't entirely useful, so this has been replaced by code that works
in all relevant contexts.
- The kernel will use and manage random seeds in non-volatile EFI
variables, refreshing a variable with a fresh seed when the RNG is
initialized. The RNG GUID namespace is then hidden from efivarfs to
prevent accidental leakage.
These changes are split into random.c infrastructure code used in the
EFI subsystem, in this pull request, and related support inside of
EFISTUB, in Ard's EFI tree. These are co-dependent for full
functionality, but the order of merging doesn't matter.
- Part of the infrastructure added for the EFI support is also used for
an improvement to the way vsprintf initializes its siphash key,
replacing an sleep loop wart.
- The hardware RNG framework now always calls its correct random.c
input function, add_hwgenerator_randomness(), rather than sometimes
going through helpers better suited for other cases.
- The add_latent_entropy() function has long been called from the fork
handler, but is a no-op when the latent entropy gcc plugin isn't
used, which is fine for the purposes of latent entropy.
But it was missing out on the cycle counter that was also being mixed
in beside the latent entropy variable. So now, if the latent entropy
gcc plugin isn't enabled, add_latent_entropy() will expand to a call
to add_device_randomness(NULL, 0), which adds a cycle counter,
without the absent latent entropy variable.
- The RNG is now reseeded from a delayed worker, rather than on demand
when used. Always running from a worker allows it to make use of the
CPU RNG on platforms like S390x, whose instructions are too slow to
do so from interrupts. It also has the effect of adding in new inputs
more frequently with more regularity, amounting to a long term
transcript of random values. Plus, it helps a bit with the upcoming
vDSO implementation (which isn't yet ready for 6.2).
- The jitter entropy algorithm now tries to execute on many different
CPUs, round-robining, in hopes of hitting even more memory latencies
and other unpredictable effects. It also will mix in a cycle counter
when the entropy timer fires, in addition to being mixed in from the
main loop, to account more explicitly for fluctuations in that timer
firing. And the state it touches is now kept within the same cache
line, so that it's assured that the different execution contexts will
cause latencies.
* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
random: include <linux/once.h> in the right header
random: align entropy_timer_state to cache line
random: mix in cycle counter when jitter timer fires
random: spread out jitter callback to different CPUs
random: remove extraneous period and add a missing one in comments
efi: random: refresh non-volatile random seed when RNG is initialized
vsprintf: initialize siphash key using notifier
random: add back async readiness notifier
random: reseed in delayed work rather than on-demand
random: always mix cycle counter in add_latent_entropy()
hw_random: use add_hwgenerator_randomness() for early entropy
random: modernize documentation comment on get_random_bytes()
random: adjust comment to account for removed function
random: remove early archrandom abstraction
random: use random.trust_{bootloader,cpu} command line option only
stackprotector: actually use get_random_canary()
stackprotector: move get_random_canary() into stackprotector.h
treewide: use get_random_u32_inclusive() when possible
treewide: use get_random_u32_{above,below}() instead of manual loop
treewide: use get_random_u32_below() instead of deprecated function
...
Use call site specific ratelimit instead of one single static global.
Also ratelimit ASSERTION messages generated by expect().
Originally-from: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221201110349.1282687-5-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Incorporate as many out-of-tree changes as possible without changing the
genl API.
Over the years, we restructured this several times, and also changed the
log format.
One breaking change is that DRBD 9 gained "implicit options", like a
connection name. This cannot be replayed here without changing the API,
so save it for later.
Originally-from: Andreas Gruenbacher <agruen@linbit.com>
Originally-from: Philipp Reisner <philipp.reisner@linbit.com>
Originally-from: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221201110349.1282687-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.
Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
A submitter workqueue is dynamically allocated by init_submitter()
called by drbd_create_device(), we should destroy it when this
device is not needed or destroyed.
Fixes: 113fef9e20 ("drbd: prepare to queue write requests on a submit worker")
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lore.kernel.org/r/20221124015817.2729789-3-bobo.shaobowang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This revert c2258ffc56 ("drbd: poison free'd device, resource and
connection structs"), add memset is odd here for debugging, there are
some methods to accurately show what happened, such as kdump.
Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
Link: https://lore.kernel.org/r/20221124015817.2729789-2-bobo.shaobowang@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
DRBD currently has a mix of GPL-2.0 and GPL-2.0-or-later SPDX license
identifiers. We have decided to stick with GPL 2.0 only, so consistently
use that identifier.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221122134301.69258-5-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmN38ZUQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgXxD/9tUSFUKIVGIn4pmNILfY3XV45HOi1w44yR
zCxCELupcBeT+YixmaJcT8sunrrg2fLPOXMrDJk1cG/izXHzkjAQsHZvERfqC7hC
f5onH+2MyGm3qBwxV0iGqITJgTwQGInVJijT4f9UZd/8ultymyZR2nOdIdIydHCF
qzlOjq6hgIuGKHhFgOqRUg/OAkx510ZEEilUDcZ6XVV+zL7ccN6J9+eNTI3c58wT
7jvxZC4u6QGKteGvVniE3WXgk3QdFiQRORvV09g+PkbG/vPjAIZ5tJFb9PdIOebD
3guDiNUasgz2vnDetMK+yk4LcedcRfWnqgn+Vm8C26j5Fxs13eDx5kMDteVy7CYh
3bokOATHohoZZ9qTApgQUswTfGJfBdoy0nUTPuffxPdKDyUPteIxFCADcnyDHnDG
d/+PjU3FKF31o2HcUfvYp7OMO0VZP0hJSWps8znoVXKxb+LH9qKkYzHVlfni5kkS
k9XqqD1Ki98Erb346YqgvQjCkz+CUd5DxtGyh9Oh2+oS2qHP6WjdKo1QPFmWD5dp
EyXGSqGoZrIPtnKohLUN9EiVXanRQWJr3L0gw2CYXpmwfSKfMC3CQraEC1jOc01l
TfsLJGbl3L5XpLzxoBwDu44cqp+VvbalergdcmsDTLDFHhONY2g5LJh6C9/EDdnQ
Cde1uHikGw==
=sOGG
-----END PGP SIGNATURE-----
Merge tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux
Pull block fixes from Jens Axboe:
- NVMe pull request via Christoph:
- Two more bogus nid quirks (Bean Huo, Tiago Dias Ferreira)
- Memory leak fix in nvmet (Sagi Grimberg)
- Regression fix for block cgroups pinning the wrong blkcg, causing
leaks of cgroups and blkcgs (Chris)
- UAF fix for drbd setup error handling (Dan)
- Fix DMA alignment propagation in DM (Keith)
* tag 'block-6.1-2022-11-18' of git://git.kernel.dk/linux:
dm-log-writes: set dma_alignment limit in io_hints
dm-integrity: set dma_alignment limit in io_hints
block: make blk_set_default_limits() private
dm-crypt: provide dma_alignment limit in io_hints
block: make dma_alignment a stacking queue_limit
nvmet: fix a memory leak in nvmet_auth_set_key
nvme-pci: add NVME_QUIRK_BOGUS_NID for Netac NV7000
drbd: use after free in drbd_create_device()
nvme-pci: add NVME_QUIRK_BOGUS_NID for Micron Nitro
blk-cgroup: properly pin the parent in blkcg_css_online
This is a simple mechanical transformation done by:
@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The drbd_destroy_connection() frees the "connection" so use the _safe()
iterator to prevent a use after free.
Fixes: b6f85ef953 ("drbd: Iterate over all connections")
Signed-off-by: Dan Carpenter <error27@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/Y3Jd5iZRbNQ9w6gm@kili
Signed-off-by: Jens Axboe <axboe@kernel.dk>
(Sort of) cherry-picked from the out-of-tree drbd9 branch. Original
commit message by Joel Colledge:
This simplifies drbd_submit_peer_request by removing most of the
arguments. It also makes the treatment of the op better aligned with
that in struct bio.
Determine fault_type dynamically using information which is already
available instead of passing it in as a parameter.
Note: The opf in receive_rs_deallocated was changed from
REQ_OP_WRITE_ZEROES to REQ_OP_DISCARD. This was required in the
out-of-tree module, and does not matter in-tree. The opf is ignored
anyway in drbd_submit_peer_request, since the discard/zero-out is
decided by the EE_TRIM flag.
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221109133453.51652-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The discard_granularity describes the minimum unit of a discard.
If that is larger than the maximal discard size, we need to disable
discards completely.
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221109133453.51652-3-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We currently only set q->limits.max_discard_sectors, but that is not
enough. Another field, max_hw_discard_sectors, was introduced in
commit 0034af0365 ("block: make /sys/block/<dev>/queue/discard_max_bytes
writeable").
The difference is that max_discard_sectors can be changed from user
space via sysfs, while max_hw_discard_sectors is the "hardware" upper
limit.
So use this helper, which sets both.
This is also a fixup for commit 998e9cbcd6 ("drbd: cleanup
decide_on_discard_support"): if discards are not supported, that does
not necessarily mean we also want to disable write_zeroes.
Fixes: 998e9cbcd6 ("drbd: cleanup decide_on_discard_support")
Reviewed-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20221109133453.51652-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Commit c347a787e3 (drbd: set ->bi_bdev in drbd_req_new) moved a
bio_set_dev call (which has since been removed) to "earlier", from
drbd_request_prepare to drbd_req_new.
The problem is that this accesses device->ldev->backing_bdev, which is
not NULL-checked at this point. When we don't have an ldev (i.e. when
the DRBD device is diskless), this leads to a null pointer deref.
So, only allocate the private_bio if we actually have a disk. This is
also a small optimization, since we don't clone the bio to only to
immediately free it again in the diskless case.
Fixes: c347a787e3 ("drbd: set ->bi_bdev in drbd_req_new")
Co-developed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Co-developed-by: Joel Colledge <joel.colledge@linbit.com>
Signed-off-by: Joel Colledge <joel.colledge@linbit.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20221020085205.129090-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:
@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)
@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@
- RAND = get_random_u32();
... when != RAND
- RAND %= (E);
+ RAND = prandom_u32_max(E);
// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@
((T)get_random_u32()@p & (LITERAL))
// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@
value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
print("Skipping 0x%x for cleanup elsewhere" % (value))
cocci.include_match(False)
elif value & (value + 1) != 0:
print("Skipping 0x%x because it's not a power of two minus one" % (value))
cocci.include_match(False)
elif literal.startswith('0x'):
coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))
// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@
- (FUNC()@p & (LITERAL))
+ prandom_u32_max(RESULT)
@collapse_ret@
type T;
identifier VAR;
expression E;
@@
{
- T VAR;
- VAR = (E);
- return VAR;
+ return E;
}
@drop_var@
type T;
identifier VAR;
@@
{
- T VAR;
... when != VAR
}
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
All implementations of req->collision, _req_may_be_done and
drbd_fail_pending_reads have been removed, so remove the comments
in receive_DataReply() that provide no useful information.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220920015216.782190-3-cuigaosheng1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The _req_may_be_done() has been removed by
commit 6870ca6d46 ("drbd: factor out master_bio completion
and drbd_request destruction paths"), so remove the orphan
declaration.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220920015216.782190-2-cuigaosheng1@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
w_start_resync has been removed since
commit ac0acb9e39 ("drbd: use drbd_device_post_work()
in more places"), so remove it.
Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The double indirect bio leads to somewhat suboptimal code generation.
Instead return the (original or split) bio, and make sure the
request_queue arguments to the lower level helpers is passed after the
bio to avoid constant reshuffling of the argument passing registers.
Also give it and the helpers used to implement it more descriptive names.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220727162300.3089193-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We usually do all our bitmap IO in units of PAGE_SIZE.
With very small or oddly sized external meta data, or with
PAGE_SIZE != 4k, it can happen that our last on-disk bitmap page
is not fully PAGE_SIZE aligned, so we may need to adjust the size
of the IO.
We used to do that with
min_t(unsigned int, PAGE_SIZE,
last_allowed_sector - current_offset);
And for just the right diff, (unsigned int)(diff) will result in 0.
A bio of length 0 will correctly be rejected with an IO error
(and some scary WARN_ON_ONCE()) by the scsi layer.
Do the calculation properly.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220622204932.196830-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Combine the drbd_submit_peer_request() 'op' and 'op_flags' arguments
into a single argument. This patch does not change any functionality.
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-15-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Improve static type checking by using the enum req_op type for variables
that represent a request operation and the new blk_opf_t type for
variables that represent request flags.
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
Link: https://lore.kernel.org/r/20220714180729.1065367-14-bvanassche@acm.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just use the %pg format specifier instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20220713055317.1888500-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
blk_cleanup_disk is nothing but a trivial wrapper for put_disk now,
so remove it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Link: https://lore.kernel.org/r/20220619060552.1850436-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions. This fixes the following warnings from coccicheck:
./drivers/block/drbd/drbd_req.c:912:9-10: WARNING: return of 0/1 in
function 'remote_due_to_read_balancing' with return type bool
Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-8-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Instead of invoking a synchronize_rcu() to free a pointer
after a grace period we can directly make use of new API
that does the same but in more efficient way.
TO: Jens Axboe <axboe@kernel.dk>
TO: Philipp Reisner <philipp.reisner@linbit.com>
TO: Jason Gunthorpe <jgg@nvidia.com>
TO: drbd-dev@lists.linbit.com
TO: linux-block@vger.kernel.org
Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-7-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
when run checkpath.pl for the first patch, found that
WARNING: Prefer 'unsigned int' to bare use of 'unsigned'.
so fix it. BTW
Signed-off-by: Cai Huoqing <caihuoqing@baidu.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-6-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Variable err is set to '-EIO' but this value is never read as
it is overwritten or not used later on, hence it is a redundant
assignment and can be removed.
Clean up the following clang-analyzer warning:
drivers/block/drbd/drbd_receiver.c:3955:5: warning: Value stored to
'err' is never read [clang-analyzer-deadcode.DeadStores].
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-4-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
gcc -Wextra warns about mixing drbd_state_rv with drbd_ret_code
in a couple of places:
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_set_role':
drivers/block/drbd/drbd_nl.c:777:14: warning: comparison between 'enum drbd_state_rv' and 'enum drbd_ret_code' [-Wenum-compare]
777 | if (retcode != NO_ERROR)
| ^~
drivers/block/drbd/drbd_nl.c:784:12: warning: implicit conversion from 'enum drbd_ret_code' to 'enum drbd_state_rv' [-Wenum-conversion]
784 | retcode = ERR_MANDATORY_TAG;
| ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_attach':
drivers/block/drbd/drbd_nl.c:1965:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
1965 | retcode = rv; /* FIXME: Type mismatch. */
| ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_connect':
drivers/block/drbd/drbd_nl.c:2690:10: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
2690 | retcode = conn_request_state(connection, NS(conn, C_UNCONNECTED), CS_VERBOSE);
| ^
drivers/block/drbd/drbd_nl.c: In function 'drbd_adm_disconnect':
drivers/block/drbd/drbd_nl.c:2803:11: warning: implicit conversion from 'enum drbd_state_rv' to 'enum drbd_ret_code' [-Wenum-conversion]
2803 | retcode = rv; /* FIXME: Type mismatch. */
| ^
In each case, both are passed into drbd_adm_finish(), which just takes
a 32-bit integer and is happy with either, presumably intentionally.
Restructure the code to pass either type directly in there in most
cases, avoiding the warnings.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-3-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There are two initializers for P_RETRY_WRITE:
drivers/block/drbd/drbd_main.c:3676:22: warning: initialized field overwritten [-Woverride-init]
Remove the first one since it was already ignored by the compiler
and reorder the list to match the enum definition. As P_ZEROES had
no entry, add that one instead.
Fixes: 036b17eaab ("drbd: Receiving part for the PROTOCOL_UPDATE packet")
Fixes: f31e583aa2 ("drbd: introduce P_ZEROES (REQ_OP_WRITE_ZEROES on the "wire")")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220406190715.1938174-2-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Secure erase is a very different operation from discard in that it is
a data integrity operation vs hint. Fully split the limits and helper
infrastructure to make the separation more clear.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nifs2]
Acked-by: Jaegeuk Kim <jaegeuk@kernel.org> [f2fs]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Acked-by: Chao Yu <chao@kernel.org>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-27-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Abstract away implementation details from file systems by providing a
block_device based helper to retrieve the discard granularity.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Link: https://lore.kernel.org/r/20220415045258.199825-26-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Just use a non-zero max_discard_sectors as an indicator for discard
support, similar to what is done for write zeroes.
The only places where needs special attention is the RAID5 driver,
which must clear discard support for security reasons by default,
even if the default stacking rules would allow for it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Jan Höppner <hoeppner@linux.ibm.com> [s390]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-25-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Add a helper to query the number of sectors support per each discard bio
based on the block device and use this helper to stop various places from
poking into the request_queue to see if discard is supported and if so how
much. This mirrors what is done e.g. for write zeroes as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> [drbd]
Acked-by: Coly Li <colyli@suse.de> [bcache]
Acked-by: David Sterba <dsterba@suse.com> [btrfs]
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220415045258.199825-24-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Sanitize the calling conventions and use a goto label to cleanup the
code flow.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220415045258.199825-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The bdev version does the right thing for partitions, so use that.
Fixes: 9104d31a75 ("drbd: introduce WRITE_SAME support")
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220415045258.199825-7-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Use the bdev based limits helpers where they exist.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220415045258.199825-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fold each branch into its only caller.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220415045258.199825-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
We want our pages not to change while they are being written.
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The bug is here:
idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() don't exit early (no goto hit
inside the loop), the iterator 'connection' after the loop will be a
bogus pointer to an invalid structure object containing the HEAD
(&resource->connections). As a result, the use of 'connection' above
will lead to a invalid memory access (including a possible invalid free
as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices,
but the following lines have already done the work. So just remove
this line and the unneeded label, to fix this bug.
Cc: stable@vger.kernel.org
Fixes: c06ece6ba6 ("drbd: Turn connection->volumes into connection->peer_devices")
Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Reviewed-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In get_initial_state, it calls notify_initial_state_done(skb,..) if
cb->args[5]==1. If genlmsg_put() failed in notify_initial_state_done(),
the skb will be freed by nlmsg_free(skb).
Then get_initial_state will goto out and the freed skb will be used by
return value skb->len, which is a uaf bug.
What's worse, the same problem goes even further: skb can also be
freed in the notify_*_state_change -> notify_*_state calls below.
Thus 4 additional uaf bugs happened.
My patch lets the problem callee functions: notify_initial_state_done
and notify_*_state_change return an error code if errors happen.
So that the error codes could be propagated and the uaf bugs can be avoid.
v2 reports a compilation warning. This v3 fixed this warning and built
successfully in my local environment with no additional warnings.
v2: https://lore.kernel.org/patchwork/patch/1435218/
Fixes: a29728463b ("drbd: Backport the "events2" command")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmJHUgMQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpgMOD/9J9mO8CQ3THnyJeZb8hy+k7fPHw+P3OrAR
umOea3ujoMqzJsd/aRenMMAHsr7Phnb5PmljvryOo59nvwxOZ5MIBzSf2H4qJ8U2
B4jGwESVW4OFNS6Mu+lgYH7XMyDHvqCSVdIhcnqkseoFyndpnTfsu4cCphqajVaP
gOmXLBSQAetULxMfbqm7ofKk8F7zA80LFbwVs1VWVnCMeLVDccmJJbfn97jDZaJc
rl8xmcvmarYLOTxDoOSdmfp4ek7QzRKQuKDlfvn1Xi+lDtkKAnYygMHhqQ0WYYc1
/jJEd3iLCeV0jYfsDpVq6n2KRAGPCtrP0HkujifMGtuL5N2MAn/Aq3Fqoztas1yK
p3T3SBIBVeznTtOXxl4Fm8tvim3i3rxn2vPdYnm/8uuNxqCQy78gVf5bPlLY+bzT
4ytrP7AUpyzFj5E+8F33mZd0Vj2AL1kvgjbfEWqyQdXu7zs98UJL3xWicLbrvt/E
nmdlZjOOBbEgV3vGQD5wTRvlsBJswtl4mHBpYYLzZtmDr8wZxHD3DCUuM1i4xyHG
qDxNTKME2KfCXA8DIlcPxOAnNeXtwW3J7KTDuPDwC6XW84hFiC2tkM5M8aWqRos+
GiRczSArhaomaGq8W+vRqgCnPQgFAIt+oyJ9aqen0xHG8qCOTv3N6mMTJk/uBnMI
qcfpguHp+g==
=0Axq
-----END PGP SIGNATURE-----
Merge tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block
Pull block driver fixes from Jens Axboe:
"Followup block driver updates and fixes for the 5.18-rc1 merge window.
In detail:
- NVMe pull request
- Fix multipath hang when disk goes live over reconnect (Anton
Eidelman)
- fix RCU hole that allowed for endless looping in multipath
round robin (Chris Leech)
- remove redundant assignment after left shift (Colin Ian King)
- add quirks for Samsung X5 SSDs (Monish Kumar R)
- fix the read-only state for zoned namespaces with unsupposed
features (Pankaj Raghav)
- use a private workqueue instead of the system workqueue in
nvmet (Sagi Grimberg)
- allow duplicate NSIDs for private namespaces (Sungup Moon)
- expose use_threaded_interrupts read-only in sysfs (Xin Hao)"
- nbd minor allocation fix (Zhang)
- drbd fixes and maintainer addition (Lars, Jakob, Christoph)
- n64cart build fix (Jackie)
- loop compat ioctl fix (Carlos)
- misc fixes (Colin, Dongli)"
* tag 'for-5.18/drivers-2022-04-01' of git://git.kernel.dk/linux-block:
drbd: remove check of list iterator against head past the loop body
drbd: remove usage of list iterator variable after loop
nbd: fix possible overflow on 'first_minor' in nbd_dev_add()
MAINTAINERS: add drbd co-maintainer
drbd: fix potential silent data corruption
loop: fix ioctl calls using compat_loop_info
nvme-multipath: fix hang when disk goes live over reconnect
nvme: fix RCU hole that allowed for endless looping in multipath round robin
nvme: allow duplicate NSIDs for private namespaces
nvmet: remove redundant assignment after left shift
nvmet: use a private workqueue instead of the system workqueue
nvme-pci: add quirks for Samsung X5 SSDs
nvme-pci: expose use_threaded_interrupts read-only in sysfs
nvme: fix the read-only state for zoned namespaces with unsupposed features
n64cart: convert bi_disk to bi_bdev->bd_disk fix build
xen/blkfront: fix comment for need_copy
xen-blkback: remove redundant assignment to variable i
When list_for_each_entry() completes the iteration over the whole list
without breaking the loop, the iterator value will be a bogus pointer
computed based on the head element.
While it is safe to use the pointer to determine if it was computed
based on the head element, either with list_entry_is_head() or
&pos->member == head, using the iterator variable after the loop should
be avoided.
In preparation to limit the scope of a list iterator to the list
traversal loop, use a dedicated pointer to point to the found element [1].
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Link: https://lore.kernel.org/r/20220331220349.885126-2-jakobkoschel@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation to limit the scope of a list iterator to the list
traversal loop, use a dedicated pointer to iterate through the list [1].
Since that variable should not be used past the loop iteration, a
separate variable is used to 'remember the current location within the
loop'.
To either continue iterating from that position or skip the iteration
(if the previous iteration was complete) list_prepare_entry() is used.
Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWXJ6w@mail.gmail.com/ [1]
Signed-off-by: Jakob Koschel <jakobkoschel@gmail.com>
Link: https://lore.kernel.org/r/20220331220349.885126-1-jakobkoschel@gmail.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Scenario:
---------
bio chain generated by blk_queue_split().
Some split bio fails and propagates its error status to the "parent" bio.
But then the (last part of the) parent bio itself completes without error.
We would clobber the already recorded error status with BLK_STS_OK,
causing silent data corruption.
Reproducer:
-----------
How to trigger this in the real world within seconds:
DRBD on top of degraded parity raid,
small stripe_cache_size, large read_ahead setting.
Drop page cache (sysctl vm.drop_caches=1, fadvise "DONTNEED",
umount and mount again, "reboot").
Cause significant read ahead.
Large read ahead request is split by blk_queue_split().
Parts of the read ahead that are already in the stripe cache,
or find an available stripe cache to use, can be serviced.
Parts of the read ahead that would need "too much work",
would need to wait for a "stripe_head" to become available,
are rejected immediately.
For larger read ahead requests that are split in many pieces, it is very
likely that some "splits" will be serviced, but then the stripe cache is
exhausted/busy, and the remaining ones will be rejected.
Signed-off-by: Lars Ellenberg <lars.ellenberg@linbit.com>
Signed-off-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com>
Cc: <stable@vger.kernel.org> # 4.13.x
Link: https://lore.kernel.org/r/20220330185551.3553196-1-christoph.boehmwalder@linbit.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
This series consists of the usual driver updates (qla2xxx, pm8001,
libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates
and bug fixes. The high blast radius core update is the removal of
write same, which affects block and several non-SCSI devices. The
other big change, which is more local, is the removal of the SCSI
pointer.
Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
-----BEGIN PGP SIGNATURE-----
iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCYjzDQyYcamFtZXMuYm90
dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishQMYAQDEWUGV
6U0+736AHVtOfiMNfiRN79B1HfXVoHvemnPcTwD/UlndwFfy/3GGOtoZmqEpc73J
Ec1HDuUCE18H1H2QAh0=
=/Ty9
-----END PGP SIGNATURE-----
Merge tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi
Pull SCSI updates from James Bottomley:
"This series consists of the usual driver updates (qla2xxx, pm8001,
libsas, smartpqi, scsi_debug, lpfc, iscsi, mpi3mr) plus minor updates
and bug fixes.
The high blast radius core update is the removal of write same, which
affects block and several non-SCSI devices. The other big change,
which is more local, is the removal of the SCSI pointer"
* tag 'scsi-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi: (281 commits)
scsi: scsi_ioctl: Drop needless assignment in sg_io()
scsi: bsg: Drop needless assignment in scsi_bsg_sg_io_fn()
scsi: lpfc: Copyright updates for 14.2.0.0 patches
scsi: lpfc: Update lpfc version to 14.2.0.0
scsi: lpfc: SLI path split: Refactor BSG paths
scsi: lpfc: SLI path split: Refactor Abort paths
scsi: lpfc: SLI path split: Refactor SCSI paths
scsi: lpfc: SLI path split: Refactor CT paths
scsi: lpfc: SLI path split: Refactor misc ELS paths
scsi: lpfc: SLI path split: Refactor VMID paths
scsi: lpfc: SLI path split: Refactor FDISC paths
scsi: lpfc: SLI path split: Refactor LS_RJT paths
scsi: lpfc: SLI path split: Refactor LS_ACC paths
scsi: lpfc: SLI path split: Refactor the RSCN/SCR/RDF/EDC/FARPR paths
scsi: lpfc: SLI path split: Refactor PLOGI/PRLI/ADISC/LOGO paths
scsi: lpfc: SLI path split: Refactor base ELS paths and the FLOGI path
scsi: lpfc: SLI path split: Introduce lpfc_prep_wqe
scsi: lpfc: SLI path split: Refactor fast and slow paths to native SLI4
scsi: lpfc: SLI path split: Refactor lpfc_iocbq
scsi: lpfc: Use kcalloc()
...
These functions are no longer useful as no BDIs report congestions any
more.
Removing the test on bdi_write_contested() in current_may_throttle()
could cause a small change in behaviour, but only when PF_LOCAL_THROTTLE
is set.
So replace the calls by 'false' and simplify the code - and remove the
functions.
[akpm@linux-foundation.org: fix build]
Link: https://lkml.kernel.org/r/164549983742.9187.2570198746005819592.stgit@noble.brown
Signed-off-by: NeilBrown <neilb@suse.de>
Acked-by: Ryusuke Konishi <konishi.ryusuke@gmail.com> [nilfs]
Cc: Anna Schumaker <Anna.Schumaker@Netapp.com>
Cc: Chao Yu <chao@kernel.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Cc: Ilya Dryomov <idryomov@gmail.com>
Cc: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Layton <jlayton@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>
Cc: Paolo Valente <paolo.valente@linaro.org>
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Trond Myklebust <trond.myklebust@hammerspace.com>
Cc: Wu Fengguang <fengguang.wu@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220303111905.321089-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Using local kmaps slightly reduces the chances to stray writes, and
the bvec interface cleans up the code a little bit.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220303111905.321089-9-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
REQ_OP_WRITE_SAME was only ever submitted by the legacy Linux zeroing code,
which has switched to use REQ_OP_WRITE_ZEROES long ago.
Link: https://lore.kernel.org/r/20220209082828.2629273-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Pass a block_device to bio_clone_fast and __bio_clone_fast and give
the functions more suitable names.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20220202160109.108149-14-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Make sure the newly allocated bio has the correct bi_bdev set from the
start.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Mike Snitzer <snitzer@redhat.com>
Link: https://lore.kernel.org/r/20220202160109.108149-2-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the block_device and operation that we plan to use this bio for to
bio_alloc to optimize the assignment. NULL/0 can be passed, both for the
passthrough case on a raw request_queue and to temporarily avoid
refactoring some nasty code.
Also move the gfp_mask argument after the nr_vecs argument for a much
more logical calling convention matching what most of the kernel does.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220124091107.642561-18-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Pass the block_device and operation that we plan to use this bio for to
bio_alloc_bioset to optimize the assigment. NULL/0 can be passed, both
for the passthrough case on a raw request_queue and to temporarily avoid
refactoring some nasty code.
Also move the gfp_mask argument after the nr_vecs argument for a much
more logical calling convention matching what most of the kernel does.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Link: https://lore.kernel.org/r/20220124091107.642561-16-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Remove handling of NULL returns from sleeping bio_alloc calls given that
those can't fail.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20220124091107.642561-10-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
There is no good reason to keep genhd.h separate from the main blkdev.h
header that includes it. So fold the contents of genhd.h into blkdev.h
and remove genhd.h entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chaitanya Kulkarni <kch@nvidia.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Link: https://lore.kernel.org/r/20220124093913.742411-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation for FORTIFY_SOURCE performing compile-time and run-time
field bounds checking for memset(), avoid intentionally writing across
neighboring fields.
Add a struct_group() for the algs so that memset() can correctly reason
about the size.
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/20211118203712.1288866-1-keescook@chromium.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
All modern drivers can support extra partitions using the extended
dev_t. In fact except for the ioctl method drivers never even see
partitions in normal operation.
So remove the GENHD_FL_EXT_DEVT and allow extra partitions for all
block devices that do support partitions, and require those that
do not support partitions to explicit disallow them using
GENHD_FL_NO_PART.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20211122130625.1136848-12-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmGKqOcQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpr3yEADD9Cx8oNk3KzWV3c3JlIR4JQtvpczS3dho
KkGU0D5fOh1sViXbLBNr6VxypcEIKQoHxDQQ6qid1kOu/B3mCNM1duLsVjyj3Qa0
7nbm2dVUsD/EVDuXedRmMvcfCUx6Z23DbpI182wXtIPaCsEEmsANzHnZNg38OV44
25SYG0QUvb9ViSz1Y1GORu0ttEJNF2GhZfiBpb0WveRnY7eTSL/PnHNDzHsSeFv4
zD0W205g7jKbt0+57kgNElTz7DbdM3p8XVex+aXPlFaHz2qx4ZoJJIsaMv/P8tT5
14b50cB41xnPvlGTvqr1WfZZfJocDNq2rG+fh6N5D1sO86ogWpj7psiiADfa0pb6
ZWoJqhk3BvEUMPQ5N/BJ/8j3FWGIYWtKQf4QcyxrJYpqDwtwbBfMlzKkc7JMPFYk
JAi6uq1uF5SbA4x99G90tK85LvxsbkseyIYXgBJ/GIyW5doIPkD9TPDEzJMCdHOe
laynHS5PMHzuhPLuEDDn9sTVXpZWAMBnoy4j1L4wGmBjiogYWLTSJVobODzCAqHY
1Va2oP6SXfCdVRkCysFbcrdsjJuoIWlMKrdE40tNvkmU0v7sEX0Zd+GLHiaWdIZa
fgxC9fmZtDDOowCp+Iw0VaAqPeeptmyUrof06ZktJleOAscX7kSwbxPdmr1FM0jy
dbnLDyaq/A==
=QaFI
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/drivers-2021-11-09' of git://git.kernel.dk/linux-block
Pull more block driver updates from Jens Axboe:
- Last series adding error handling support for add_disk() in drivers.
After this one, and once the SCSI side has been merged, we can
finally annotate add_disk() as must_check. (Luis)
- bcache fixes (Coly)
- zram fixes (Ming)
- ataflop locking fix (Tetsuo)
- nbd fixes (Ye, Yu)
- MD merge via Song
- Cleanup (Yang)
- sysfs fix (Guoqing)
- Misc fixes (Geert, Wu, luo)
* tag 'for-5.16/drivers-2021-11-09' of git://git.kernel.dk/linux-block: (34 commits)
bcache: Revert "bcache: use bvec_virt"
ataflop: Add missing semicolon to return statement
floppy: address add_disk() error handling on probe
ataflop: address add_disk() error handling on probe
block: update __register_blkdev() probe documentation
ataflop: remove ataflop_probe_lock mutex
mtd/ubi/block: add error handling support for add_disk()
block/sunvdc: add error handling support for add_disk()
z2ram: add error handling support for add_disk()
nvdimm/pmem: use add_disk() error handling
nvdimm/pmem: cleanup the disk if pmem_release_disk() is yet assigned
nvdimm/blk: add error handling support for add_disk()
nvdimm/blk: avoid calling del_gendisk() on early failures
nvdimm/btt: add error handling support for add_disk()
nvdimm/btt: use goto error labels on btt_blk_init()
loop: Remove duplicate assignments
drbd: Fix double free problem in drbd_create_device
nvdimm/btt: do not call del_gendisk() if not needed
bcache: fix use-after-free problem in bcache_device_free()
zram: replace fsync_bdev with sync_blockdev
...
In drbd_create_device(), the 'out_no_io_page' lable has called
blk_cleanup_disk() when return failed.
So remove the 'out_cleanup_disk' lable to avoid double free the
disk pointer.
Fixes: e92ab4eda5 ("drbd: add error handling support for add_disk()")
Signed-off-by: Wu Bo <wubo40@huawei.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/1636013229-26309-1-git-send-email-wubo40@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmF8L70QHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpo9YEAC17yEJ0xwwtUUwZW8avzss4vdcIreFdiZu
gaS+9Oi1bLxj0d2SjaZXJxjT9K+W2LftEsLuQ4oM6VHiLQkcEDbjJdVm3goftTt5
aOvVormDdKbWNcGSbgxA/OcyUT39DH7y17NRVdqYzQSpnrhCod/1tb2ssck0OoYb
VEyBKogMwYeYR55Z3I8yL5pNcEhR8TihZv3rL1iQ7DNpvh5I0I9naSEtGNC84aLP
s4nwRIG+TYll+mg0sfSB29KF7xkoFQO7X7s1rnC/on+gsFEzbJcgkJPDIWeVLnLm
ma8F1i+vJliCGaztyXoleAdg5QDiFmwTQwXRPAk2u8njJhcKi/RwIk2QYMZBZmEJ
bB5EJnlnEaWxjgpCD7JDrtKgIgpbbQHc5QVHRZccsu43UqvDqOZIlvZNYY+h3ivz
jT1zKuKDaTf8YWbfdOJwqm9e+qyR0AFm3rLMdHO58QEh1DBvSLIIdRCNE8wX7nFM
Wx/GmQEkPqNTIZwJOQJMygK+sIuFUDybt3oAH2pjX1zyMx7kTJkrXvj0dhSS/B5u
+gfMs3otWqxQ4P1qfnaUd9mYl8JabV7le2NHzhjdARm4NKFJEtcJe5BJBwiMbo0n
vodqt7aUIAXwMrZXnWZL+w8CobhJBp8I5XHUgng147gDBuCjYQjBQT334auAXxgz
MUCgbjBDqw==
=Vadi
-----END PGP SIGNATURE-----
Merge tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-block
Pull bdev size cleanups from Jens Axboe:
"Clean up the bdev size handling with new bdev_nr_bytes() helper"
* tag 'for-5.16/bdev-size-2021-10-29' of git://git.kernel.dk/linux-block: (34 commits)
partitions/ibm: use bdev_nr_sectors instead of open coding it
partitions/efi: use bdev_nr_bytes instead of open coding it
block/ioctl: use bdev_nr_sectors and bdev_nr_bytes
block: cache inode size in bdev
udf: use sb_bdev_nr_blocks
reiserfs: use sb_bdev_nr_blocks
ntfs: use sb_bdev_nr_blocks
jfs: use sb_bdev_nr_blocks
ext4: use sb_bdev_nr_blocks
block: add a sb_bdev_nr_blocks helper
block: use bdev_nr_bytes instead of open coding it in blkdev_fallocate
squashfs: use bdev_nr_bytes instead of open coding it
reiserfs: use bdev_nr_bytes instead of open coding it
pstore/blk: use bdev_nr_bytes instead of open coding it
ntfs3: use bdev_nr_bytes instead of open coding it
nilfs2: use bdev_nr_bytes instead of open coding it
nfs/blocklayout: use bdev_nr_bytes instead of open coding it
jfs: use bdev_nr_bytes instead of open coding it
hfsplus: use bdev_nr_sectors instead of open coding it
hfs: use bdev_nr_sectors instead of open coding it
...
We never checked for errors on add_disk() as this function
returned void. Now that this is fixed, use the shiny new
error handling.
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Replace the blk_poll interface that requires the caller to keep a queue
and cookie from the submissions with polling based on the bio.
Polling for the bio itself leads to a few advantages:
- the cookie construction can made entirely private in blk-mq.c
- the caller does not need to remember the request_queue and cookie
separately and thus sidesteps their lifetime issues
- keeping the device and the cookie inside the bio allows to trivially
support polling BIOs remapping by stacking drivers
- a lot of code to propagate the cookie back up the submission path can
be removed entirely.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Tested-by: Mark Wunderlich <mark.wunderlich@intel.com>
Link: https://lore.kernel.org/r/20211012111226.760968-15-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The backing device information only makes sense for file system I/O,
and thus belongs into the gendisk and not the lower level request_queue
structure. Move it there.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210809141744.1203023-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
.. and rename the function to disk_update_readahead. This is in
preparation for moving the BDI from the request_queue to the gendisk.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jan Kara <jack@suse.cz>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Link: https://lore.kernel.org/r/20210809141744.1203023-3-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
-----BEGIN PGP SIGNATURE-----
iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmDbd5UQHGF4Ym9lQGtl
cm5lbC5kawAKCRD301j7KXHgpvsNEADCJKP81boFzRcdJo7EqaNDAzZyKOIg9Oq7
4GZE0Wm0SgA6+04bKrNVd9KLcKvQ+NC1pK7UJemSSH2y9ir+zHfyYgAV0/+wFmYm
NgHlDjBvf80XSI5wezcb6MxZT+R7IaIpDsW1ZvV9hFtPSncn5o2OIWiSdJtHT/Rv
enlgZPc7OwNWoVMX8eR58IoO0k3S6GLpctUZHt/AUukaKgoOks0X523qhEPf3Upr
RkbIZuqLWVgpdT6457iSE/OijUczD4thTI8bdprxzhgimOm2vV52sO6F5HtHc7GX
qW+PWYUaiUk7UpObuOuyv0yyUG45ii73iY1W0w66RiyCjVTgtpdwwMQ38VlBcoOg
zcE1jneAEJt6TiS6zfRaER/10JoCIG4gp1+apPuaXud/o3BqWI0cagVHAgaLziBI
F7bDJkbJZIR6GrWMgemBI+mc5/LACBePxzPGLScKFptejtQ/ysfZQ6aCLROJWB2U
4EnysAaUBf6tywj30JqfQvqFNGkHIgY95FKiXJW6GzqqwgBouNf48vS15BgkwI+2
EijcqUhlOVNfc3RIc0ZL5c9KcPIN9t5sqBrWZe3wgCErhxAx6w6Za9nDdP+US9bl
/apCpvDFlu59g8n1wtkNE/uC+XqdKDwsplYhnfpX0FGni5wIknhQq3bSe4dPFgSn
pG5VMrw3pA==
=D6dS
-----END PGP SIGNATURE-----
Merge tag 'for-5.14/drivers-2021-06-29' of git://git.kernel.dk/linux-block
Pull block driver updates from Jens Axboe:
"Pretty calm round, mostly just NVMe and a bit of MD:
- NVMe updates (via Christoph)
- improve the APST configuration algorithm (Alexey Bogoslavsky)
- look for StorageD3Enable on companion ACPI device
(Mario Limonciello)
- allow selecting the network interface for TCP connections
(Martin Belanger)
- misc cleanups (Amit Engel, Chaitanya Kulkarni, Colin Ian King,
Christoph)
- move the ACPI StorageD3 code to drivers/acpi/ and add quirks
for certain AMD CPUs (Mario Limonciello)
- zoned device support for nvmet (Chaitanya Kulkarni)
- fix the rules for changing the serial number in nvmet
(Noam Gottlieb)
- various small fixes and cleanups (Dan Carpenter, JK Kim,
Chaitanya Kulkarni, Hannes Reinecke, Wesley Sheng, Geert
Uytterhoeven, Daniel Wagner)
- MD updates (Via Song)
- iostats rewrite (Guoqing Jiang)
- raid5 lock contention optimization (Gal Ofri)
- Fall through warning fix (Gustavo)
- Misc fixes (Gustavo, Jiapeng)"
* tag 'for-5.14/drivers-2021-06-29' of git://git.kernel.dk/linux-block: (78 commits)
nvmet: use NVMET_MAX_NAMESPACES to set nn value
loop: Fix missing discard support when using LOOP_CONFIGURE
nvme.h: add missing nvme_lba_range_type endianness annotations
nvme: remove zeroout memset call for struct
nvme-pci: remove zeroout memset call for struct
nvmet: remove zeroout memset call for struct
nvmet: add ZBD over ZNS backend support
nvmet: add Command Set Identifier support
nvmet: add nvmet_req_bio put helper for backends
nvmet: add req cns error complete helper
block: export blk_next_bio()
nvmet: remove local variable
nvmet: use nvme status value directly
nvmet: use u32 type for the local variable nsid
nvmet: use u32 for nvmet_subsys max_nsid
nvmet: use req->cmd directly in file-ns fast path
nvmet: use req->cmd directly in bdev-ns fast path
nvmet: make ver stable once connection established
nvmet: allow mn change if subsys not discovered
nvmet: make sn stable once connection was established
...
Fixes scripts/checkpatch.pl warning:
WARNING: Possible unnecessary 'out of memory' message
Remove it can help us save a bit of memory.
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Convert the drbd driver to use the blk_alloc_disk and blk_cleanup_disk
helpers to simplify gendisk and request_queue allocation.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Hannes Reinecke <hare@suse.de>
Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>
Link: https://lore.kernel.org/r/20210521055116.1053587-8-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
In preparation to enable -Wimplicit-fallthrough for Clang, fix a couple
of warnings by explicitly adding a break statement instead of just
letting the code fall through to the next, and by adding a fallthrough
pseudo-keyword in places whre the code is intended to fall through.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
spinlock can be initialized automatically with DEFINE_SPINLOCK()
rather than explicitly calling spin_lock_init().
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Guobin Huang <huangguobin4@huawei.com>
Link: https://lore.kernel.org/r/1617710988-49205-1-git-send-email-huangguobin4@huawei.com
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
from drivers/block/drbd/drbd_nl.c:24:
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_attach’:
drivers/block/drbd/drbd_nl.c:1968:10: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c:930: warning: Function parameter or member 'flags' not described in 'drbd_determine_dev_size'
drivers/block/drbd/drbd_nl.c:930: warning: Function parameter or member 'rs' not described in 'drbd_determine_dev_size'
drivers/block/drbd/drbd_nl.c:1148: warning: Function parameter or member 'dc' not described in 'drbd_check_al_size'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-12-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'op' not described in 'drbd_submit_peer_request'
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'op_flags' not described in 'drbd_submit_peer_request'
drivers/block/drbd/drbd_receiver.c:1641: warning: Function parameter or member 'fault_type' not described in 'drbd_submit_peer_request'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-10-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
drivers/block/drbd/drbd_main.c:278: warning: Function parameter or member 'connection' not described in 'tl_clear'
drivers/block/drbd/drbd_main.c:278: warning: Excess function parameter 'device' description in 'tl_clear'
drivers/block/drbd/drbd_main.c:489: warning: Function parameter or member 'cpu_mask' not described in 'drbd_calc_cpu_mask'
drivers/block/drbd/drbd_main.c:528: warning: Excess function parameter 'device' description in 'drbd_thread_current_set_cpu'
drivers/block/drbd/drbd_main.c:549: warning: Function parameter or member 'connection' not described in 'drbd_header_size'
drivers/block/drbd/drbd_main.c:1204: warning: Function parameter or member 'device' not described in 'send_bitmap_rle_or_plain'
drivers/block/drbd/drbd_main.c:1204: warning: Function parameter or member 'c' not described in 'send_bitmap_rle_or_plain'
drivers/block/drbd/drbd_main.c:1335: warning: Function parameter or member 'peer_device' not described in '_drbd_send_ack'
drivers/block/drbd/drbd_main.c:1335: warning: Excess function parameter 'device' description in '_drbd_send_ack'
drivers/block/drbd/drbd_main.c:1379: warning: Function parameter or member 'peer_device' not described in 'drbd_send_ack'
drivers/block/drbd/drbd_main.c:1379: warning: Excess function parameter 'device' description in 'drbd_send_ack'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'connection' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'sock' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'buffer' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'size' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:1892: warning: Function parameter or member 'msg_flags' not described in 'drbd_send_all'
drivers/block/drbd/drbd_main.c:3525: warning: Function parameter or member 'flags' not described in 'drbd_queue_bitmap_io'
drivers/block/drbd/drbd_main.c:3563: warning: Function parameter or member 'flags' not described in 'drbd_bitmap_io'
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-9-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Fixes the following W=1 kernel build warning(s):
from drivers/block/drbd/drbd_nl.c:24:
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_set_role’:
drivers/block/drbd/drbd_nl.c:793:11: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c:795:11: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_attach’:
drivers/block/drbd/drbd_nl.c:1965:10: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_connect’:
drivers/block/drbd/drbd_nl.c:2690:10: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
drivers/block/drbd/drbd_nl.c: In function ‘drbd_adm_disconnect’:
drivers/block/drbd/drbd_nl.c:2803:11: warning: implicit conversion from ‘enum drbd_state_rv’ to ‘enum drbd_ret_code’ [-Wenum-conversion]
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-8-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>
[P_RETRY_WRITE] is initialised more than once.
Fixes the following W=1 kernel build warning(s):
drivers/block/drbd/drbd_main.c: In function ‘cmdname’:
drivers/block/drbd/drbd_main.c:3660:22: warning: initialized field overwritten [-Woverride-init]
drivers/block/drbd/drbd_main.c:3660:22: note: (near initialization for ‘cmdnames[44]’)
Cc: Philipp Reisner <philipp.reisner@linbit.com>
Cc: Lars Ellenberg <lars.ellenberg@linbit.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: drbd-dev@lists.linbit.com
Cc: linux-block@vger.kernel.org
Signed-off-by: Lee Jones <lee.jones@linaro.org>
Link: https://lore.kernel.org/r/20210312105530.2219008-7-lee.jones@linaro.org
Signed-off-by: Jens Axboe <axboe@kernel.dk>