linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-08-30 21:52:21 +00:00

Author	SHA1	Message	Date
Bob Pearson	8bb143c534	RDMA/rxe: Make the tasklet exits the same Make changes to the three tasklets so that the exit logic from each is the same. This makes the code easier to understand. Link: https://lore.kernel.org/r/20220630190425.2251-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-22 17:43:00 -03:00
Bob Pearson	445fd4f4fb	RDMA/rxe: Fix rnr retry behavior Currently the completer tasklet when retransmit timer or the rnr timer fires the same flag (qp->req.need_retry) is set so that if either timer fires it will attempt to perform a retry flow on the send queue. This has the effect of responding to an RNR NAK at the first retransmit timer event which might not allow the requested rnr timeout. This patch adds a new flag (qp->req.wait_for_rnr_timer) which, if set, prevents a retry flow until the rnr nak timer fires. This patch fixes rnr retry errors which can be observed by running the pyverbs test_rdmacm_async_traffic_external_qp multiple times. With this patch applied they do not occur. Link: https://lore.kernel.org/linux-rdma/a8287823-1408-4273-bc22-99a0678db640@gmail.com/ Link: https://lore.kernel.org/linux-rdma/2bafda9e-2bb6-186d-12a1-179e8f6a2678@talpey.com/ Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220630190425.2251-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-22 17:43:00 -03:00
Bob Pearson	930119a172	RDMA/rxe: Add rxe_is_fenced() subroutine The code thc that decides whether to defer execution of a wqe in rxe_requester.c is isolated into a subroutine rxe_is_fenced() and removed from the call to req_next_wqe(). The condition whether a wqe should be fenced is changed to comply with the IBA. Currently an operation is fenced if the fence bit is set in the wqe flags and the last wqe has not completed. For normal operations the IBA actually only requires that the last read or atomic operation is complete. Link: https://lore.kernel.org/r/20220630190425.2251-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-22 17:43:00 -03:00
Md Haris Iqbal	174e7b1370	RDMA/rxe: For invalidate compare according to set keys in mr The 'rkey' input can be an lkey or rkey, and in rxe the lkey or rkey have the same value, including the variant bits. So, if mr->rkey is set, compare the invalidate key with it, otherwise compare with the mr->lkey. Since we already did a lookup on the non-varient bits to get this far, the check's only purpose is to confirm that the wqe has the correct variant bits. Fixes: `001345339f` ("RDMA/rxe: Separate HW and SW l/rkeys") Link: https://lore.kernel.org/r/20220707073006.328737-1-haris.phnx@gmail.com Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-22 17:35:53 -03:00
Bob Pearson	1603f89935	RDMA/rxe: Fix mw bind to allow any consumer key portion The current implementation of rxe_check_bind_mw() in rxe_mw.c is incorrect since it requires the new key portion provided by the mw consumer to be different than the previous key portion. This is not required by the IBA. Remove the test. Link: https://lore.kernel.org/linux-rdma/fb4614e7-4cac-0dc7-3ef7-766dfd10e8f2@gmail.com/ Fixes: `32a577b4c3` ("Add support for bind MW work requests") Link: https://lore.kernel.org/r/20220714204619.13396-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-21 11:35:31 -03:00
Zhang Jiaming	5abb71b47c	RDMA/rxe: Fix spelling mistake in error print There is a spelling mistake (writeable) in function rxe_check_bind_mw. Fix it. Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-21 09:59:29 +03:00
Xiao Yang	68691bad98	RDMA/rxe: Remove unused qp parameter The qp parameter in free_rd_atomic_resource() has become unused so remove it directly. Fixes: `15ae1375ea` ("RDMA/rxe: Fix qp reference counting for atomic ops") Link: https://lore.kernel.org/all/20220708035547.6592-1-yangx.jy@fujitsu.com/ Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-19 11:31:09 +03:00
lizhijian@fujitsu.com	03905ac285	RDMA/rxe: Remove unused mask parameter This parameter had been deprecated since below commit: `1a7085b342` ("RDMA/rxe: Skip adjusting remote addr for write in retry operation") Link: https://lore.kernel.org/r/20220715035340.1900168-1-lizhijian@fujitsu.com Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-18 14:50:35 +03:00
Xiao Yang	548c56dd2e	RDMA/rxe: Rename rxe_atomic_reply to atomic_reply It's better to use the unified naming format. Link: https://lore.kernel.org/r/20220705145212.12014-2-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-18 14:36:18 +03:00
Xiao Yang	882736fb3b	RDMA/rxe: Add common rxe_prepare_res() It's redundant to prepare resources for Read and Atomic requests by different functions. Replace them by a common rxe_prepare_res() with different parameters. In addition, the common rxe_prepare_res() can also be used by new Flush and Atomic Write requests in the future. Link: https://lore.kernel.org/r/20220705145212.12014-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-18 14:36:11 +03:00
Zhu Yanjun	37da51efe6	RDMA/rxe: Fix BUG: KASAN: null-ptr-deref in rxe_qp_do_cleanup The function rxe_create_qp calls rxe_qp_from_init. If some error occurs, the error handler of function rxe_qp_from_init will set both scq and rcq to NULL. Then rxe_create_qp calls rxe_put to handle qp. In the end, rxe_qp_do_cleanup is called by rxe_put. rxe_qp_do_cleanup directly accesses scq and rcq before checking them. This will cause null-ptr-deref error. The call graph is as below: rxe_create_qp { ... rxe_qp_from_init { ... err1: ... qp->rcq = NULL; <---rcq is set to NULL qp->scq = NULL; <---scq is set to NULL ... } qp_init: rxe_put{ ... rxe_qp_do_cleanup { ... atomic_dec(&qp->scq->num_wq); <--- scq is accessed ... atomic_dec(&qp->rcq->num_wq); <--- rcq is accessed } } Fixes: `4703b4f0d9` ("RDMA/rxe: Enforce IBA C11-17") Link: https://lore.kernel.org/r/20220705225414.315478-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Md Haris Iqbal <haris.iqbal@ionos.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-07-18 14:32:39 +03:00
Zhang Jiaming	2635d2a8d4	IB: Fix spelling of 'writable' There is a typo (writeable) in qib_file_ops.c, qib_sd7220.c's comments, and in rxe_check_bind_mw() Link: https://lore.kernel.org/r/20220701074812.12615-1-jiaming@nfschina.com Link: https://lore.kernel.org/r/20220701080019.13329-1-jiaming@nfschina.com Signed-off-by: Zhang Jiaming <jiaming@nfschina.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-04 10:14:57 -03:00
Bob Pearson	96938258b1	RDMA/rxe: Remove unnecessary include statement rxe_verbs.h includes the file <rdma/rdma_user_rxe.h>. It should have been <uapi/rdma/rdma_user_rxe.h>, however, it is not used and not required in this file. This patch removes the include statement. Link: https://lore.kernel.org/r/20220630190425.2251-4-rpearsonhpe@gmail.com Reported-by: Frank Zago <frank.zago@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-07-04 10:04:09 -03:00
Bob Pearson	f5d1f6d63c	RDMA/rxe: Replace include statement rxe_queue.h currently includes <uapi/rdma/rdma_user_rxe.h> for a definition of struct rxe_queue_buf. But it is only used as a pointer so the definition is not needed. This patch replaces the include statement with the declaration struct rxe_queue_buf; Link: https://lore.kernel.org/r/20220630190425.2251-5-rpearsonhpe@gmail.com Reported-by: Frank Zago <frank.zago@hpe.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 20:45:00 -03:00
Bob Pearson	cae3fa541e	RDMA/rxe: Convert pr_warn/err to pr_debug in pyverbs The pyverbs test suite generates a few dmesg traces from intentional error tests. This patch replaces those messages with pr_debug() calls which improves the usefullness of the tests. Link: https://lore.kernel.org/r/20220630190425.2251-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 20:45:00 -03:00
Bob Pearson	7cb33d1bc1	RDMA/rxe: Fix deadlock in rxe_do_local_ops() When a local operation (invalidate mr, reg mr, bind mw) is finished there will be no ack packet coming from a responder to cause the wqe to be completed. This may happen anyway if a subsequent wqe performs IO. Currently if the wqe is signalled the completer tasklet is scheduled immediately but not otherwise. This leads to a deadlock if the next wqe has the fence bit set in send flags and the operation is not signalled. This patch removes the condition that the wqe must be signalled in order to schedule the completer tasklet which is the simplest fix for this deadlock and is fairly low cost. This is the analog for local operations of always setting the ackreq bit in all last or only request packets even if the operation is not signalled. Link: https://lore.kernel.org/r/20220523223251.15350-1-rpearsonhpe@gmail.com Reported-by: Jenny Hack <jhack@hpe.com> Fixes: `c1a411268a` ("RDMA/rxe: Move local ops to subroutine") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 17:00:23 -03:00
Bob Pearson	dc18483881	RDMA/rxe: Merge normal and retry atomic flows Make the execution of the atomic operation in rxe_atomic_reply() conditional on res->replay and make duplicate_request() call into rxe_atomic_reply() to merge the two flows. This is modeled on the behavior of read reply. Delete the skb from the atomic responder resource since it is no longer used. Adjust the reference counting of the qp in send_atomic_ack() for this flow. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220606143836.3323-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 14:00:21 -03:00
Bob Pearson	8264411595	RDMA/rxe: Move atomic original value to res Move the saved original value to the atomic responder resource. This replaces saving it in the qp. In preparation for merging the normal and retry atomic responder flows. Link: https://lore.kernel.org/r/20220606143836.3323-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 14:00:21 -03:00
Bob Pearson	220e842815	RDMA/rxe: Move atomic responder res to atomic_reply Move the allocation of the atomic responder resource up into rxe_atomic_reply() from send_atomic_ack(). In preparation for merging the normal and retry atomic responder flows. Link: https://lore.kernel.org/r/20220606143836.3323-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 14:00:21 -03:00
Bob Pearson	0ed5493e43	RDMA/rxe: Add a responder state for atomic reply Add a responder state for atomic reply similar to read reply and rename process_atomic() rxe_atomic_reply(). In preparation for merging the normal and retry atomic responder flows. Link: https://lore.kernel.org/r/20220606143836.3323-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 13:54:04 -03:00
Bob Pearson	24f0ab0102	RDMA/rxe: Move code to rxe_prepare_atomic_res() Separate the code that prepares the atomic responder resource into a subroutine. This is preparation for merging the normal and retry atomic responder flows. Link: https://lore.kernel.org/r/20220606143836.3323-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 13:54:03 -03:00
Bob Pearson	b54c2a25ac	RDMA/rxe: Convert read side locking to rcu Use rcu_read_lock() for protecting read side operations in rxe_pool.c. Link: https://lore.kernel.org/r/20220612223434.31462-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 10:56:05 -03:00
Bob Pearson	215d0a755e	RDMA/rxe: Stop lookup of partially built objects Currently the rdma_rxe driver has a security weakness due to giving objects which are partially initialized indices allowing external actors to gain access to them by sending packets which refer to their index (e.g. qpn, rkey, etc) causing unpredictable results. This patch adds a new API rxe_finalize(obj) which enables looking up pool objects from indices using rxe_pool_get_index() for AH, QP, MR, and MW. They are added in create verbs only after the objects are fully initialized. It also adds wait for completion to destroy/dealloc verbs to assure that all references have been dropped before returning to rdma_core by implementing a new rxe_pool API rxe_cleanup() which drops a reference to the object and then waits for all other references to be dropped. When the last reference is dropped the object is completed by kref. After that it cleans up the object and if locally allocated frees the memory. In the special case of address handle objects the delay is implemented separately if the destroy_ah call is not sleepable. Combined with deferring cleanup code to type specific cleanup routines this allows all pending activity referring to objects to complete before returning to rdma_core. Link: https://lore.kernel.org/r/20220612223434.31462-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-30 10:56:01 -03:00
Xiao Yang	80a14dd4c3	RDMA/rxe: Remove useless pkt parameters The pkt parameters in prepare_ack_packet(), send_ack() and send_atomic_ack() have become useless by the following commits. So remove them directly. Fixes: `bf139b58af` ("RDMA/rxe: Remove unused pkt->offset") Fixes: `3896bde92d` ("RDMA/rxe: Fix extra copy in prepare_ack_packet") Link: https://lore.kernel.org/r/20220623131627.18903-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-06-24 19:35:21 -03:00
Dongliang Mu	1a685940e6	RDMA/rxe: fix xa_alloc_cycle() error return value check again Currently rxe_alloc checks ret to indicate error, but 1 is also a valid return and just indicates that the allocation succeeded with a wrap. Fix this by modifying the check to be < 0. Link: https://lore.kernel.org/r/20220609070656.1446121-1-dzm91@hust.edu.cn Fixes: `3225717f6d` ("RDMA/rxe: Replace red-black trees by xarrays") Signed-off-by: Dongliang Mu <mudongliangabcd@gmail.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Leon Romanovsky <leon@kernel.org>	2022-06-16 20:07:07 +03:00
Christophe JAILLET	7f60951ff4	RDMA/rxe: Fix an error handling path in rxe_get_mcg() The commit in the Fixes tag has shuffled some code. Now 'mcg_num' is incremented before the kzalloc(). So if the memory allocation fails, this increment must be undone. Fixes: `a926a903b7` ("RDMA/rxe: Do not call dev_mc_add/del() under a spinlock") Link: https://lore.kernel.org/r/fe137cd8b1f17593243aa73d59c18ea71ab9ee36.1653225896.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-24 12:55:12 -03:00
Jason Gunthorpe	a6f844da39	Linux 5.18 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmKKlIAeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC3oH/iPm/fLG2sJut8My sU0RC9K+6ESV5h2Qy6k00/lqKstlu4EvBjw4V8vYpx3Q2+hbSFMn2SeWqqqT3Lkk Zb8KINCFuuyMtdCBb42PV0zhUf5pCQF7ocm/Ae4jllDHtPmqk3WJ6IGtZBK5JBlw z6RR/wKt0y0MRj9eZyPyYjOee2L2vuVh4tgnexK/4L8g2ZtMMRThhvUzSMWG4zxR STYYNp0uFcfT1Vt85+ODevFH4TvdECAj+SqAegN+seHLM17YY7M0/WiIYpxGRv8P lIpDQl4PBU8EBkpI5hkpJ/3qPincbuVOMLsYfxFtpcjjG12vGjFp2krGpS3TedZQ 3mvaJ7c= =vLke -----END PGP SIGNATURE----- Merge tag 'v5.18' into rdma.git for-next Following patches have dependencies. Resolve the merge conflict in drivers/net/ethernet/mellanox/mlx5/core/main.c by keeping the new names for the fs functions following linux-next: https://lore.kernel.org/r/20220519113529.226bc3e2@canb.auug.org.au/ Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-24 12:40:28 -03:00
Bob Pearson	4703b4f0d9	RDMA/rxe: Enforce IBA C11-17 Add a counter to keep track of the number of WQs connected to a CQ and return an error if destroy_cq() is called while the counter is non zero. Link: https://lore.kernel.org/r/20220421014042.26985-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	cde3f5d682	RDMA/rxe: Move mw cleanup code to rxe_mw_cleanup() Move code from rxe_dealloc_mw() to rxe_mw_cleanup() to allow flows which hold a reference to mw to complete. Link: https://lore.kernel.org/r/20220421014042.26985-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	cf40367961	RDMA/rxe: Move mr cleanup code to rxe_mr_cleanup() Move the code which tears down an mr to rxe_mr_cleanup to allow operations holding a reference to the mr to complete. Link: https://lore.kernel.org/r/20220421014042.26985-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	ed2b5dd0f8	RDMA/rxe: Move qp cleanup code to rxe_qp_do_cleanup() Move the code from rxe_qp_destroy() to rxe_qp_do_cleanup(). This allows flows holding references to qp to complete before the qp object is torn down. Link: https://lore.kernel.org/r/20220421014042.26985-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	4e05a4b329	RDMA/rxe: Check rxe_get() return value In the tasklets (completer, responder, and requester) check the return value from rxe_get() to detect failures to get a reference. This only occurs if the qp has had its reference count drop to zero which indicates that it no longer should be used. The ref is never 0 today because the tasklets are flushed before the ref is dropped. The next patch changes this so that the ref is dropped then the tasklets are flushed. Link: https://lore.kernel.org/r/20220421014042.26985-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:45 -03:00
Bob Pearson	b2a41678fc	RDMA/rxe: Add rxe_srq_cleanup() Move cleanup code from rxe_destroy_srq() to rxe_srq_cleanup() which is called after all references are dropped to allow code depending on the srq object to complete. Link: https://lore.kernel.org/r/20220421014042.26985-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-09 09:03:39 -03:00
Bob Pearson	0b1fbfb9e9	RDMA/rxe: Remove IB_SRQ_INIT_MASK Currently the #define IB_SRQ_INIT_MASK is used to distinguish the rxe_create_srq verb from the rxe_modify_srq verb so that some code can be shared between these two subroutines. This commit splits rxe_srq_chk_attr into two subroutines: rxe_srq_chk_init and rxe_srq_chk_attr which handle the create_srq and modify_srq verbs separately. Link: https://lore.kernel.org/r/20220421014042.26985-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-06 15:33:31 -03:00
Chengguang Xu	1a7085b342	RDMA/rxe: Skip adjusting remote addr for write in retry operation For write request the remote addr will be sent only with first packet so we don't have to adjust wqe->iova in retry operation. Link: https://lore.kernel.org/r/20220502053907.6388-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-06 13:12:56 -03:00
Zhu Yanjun	08d709d5e1	RDMA/rxe: Optimize the mr pool struct Based on the commit `c9f4c69583` ("RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC"), only the mr pool uses the RXE_POOL_ALLOC, As such, replace this flags with pool type to save memory. Link: https://lore.kernel.org/r/20220428041028.1363139-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-04 22:02:58 -03:00
Bob Pearson	bfdc0edd11	RDMA/rxe: Change mcg_lock to a _bh lock rxe_mcast.c currently uses _irqsave spinlocks for rxe->mcg_lock while rxe_recv.c uses _bh spinlocks for the same lock. As there is no case where the mcg_lock can be taken from an IRQ, change these all to bh locks so we don't have confusing mismatched lock types on the same spinlock. Fixes: `6090a0c4c7` ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-04 21:29:25 -03:00
Bob Pearson	a926a903b7	RDMA/rxe: Do not call dev_mc_add/del() under a spinlock These routines were not intended to be called under a spinlock and will throw debugging warnings: raw_local_irq_restore() called with IRQs enabled WARNING: CPU: 13 PID: 3107 at kernel/locking/irqflag-debug.c:10 warn_bogus_irq_restore+0x2f/0x50 CPU: 13 PID: 3107 Comm: python3 Tainted: G E 5.18.0-rc1+ #7 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:warn_bogus_irq_restore+0x2f/0x50 Call Trace: <TASK> _raw_spin_unlock_irqrestore+0x75/0x80 rxe_attach_mcast+0x304/0x480 [rdma_rxe] ib_attach_mcast+0x88/0xa0 [ib_core] ib_uverbs_attach_mcast+0x186/0x1e0 [ib_uverbs] ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0xcd/0x140 [ib_uverbs] ib_uverbs_cmd_verbs+0xdb0/0xea0 [ib_uverbs] ib_uverbs_ioctl+0xd2/0x160 [ib_uverbs] do_syscall_64+0x5c/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Move them out of the spinlock, it is OK if there is some races setting up the MC reception at the ethernet layer with rbtree lookups. Fixes: `6090a0c4c7` ("RDMA/rxe: Cleanup rxe_mcast.c") Link: https://lore.kernel.org/r/20220504202817.98247-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-04 21:16:57 -03:00
Bob Pearson	e7734156b0	RDMA/rxe: Replace paylen by payload In finish_packet() in rxe_req.c a variable was incorrectly called paylen instead of payload. Elsewhere in the rxe source payload is always used for the RoCE payload length and paylen is always used for the UDP payload length. This will cause unnecessary confusion. Replace paylen by payload in finish_packet(). Link: https://lore.kernel.org/r/20220420172316.5465-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-05-04 11:21:10 -03:00
Li Zhijian	0f328c7034	RDMA/rxe: Remove useless parameters for update_state() wqe was not used by update_state() so far. Commit `aaaf62e066` ("RDMA/rxe: Remove useless argument for update_state()") just did a partial fixes. Link: https://lore.kernel.org/r/20220412022903.574238-1-lizhijian@fujitsu.com Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-25 15:51:32 -03:00
Bob Pearson	570a4bf744	RDMA/rxe: Recheck the MR in when generating a READ reply The rping benchmark fails on long runs. The root cause of this failure has been traced to a failure to compute a nonzero value of mr in rare situations. Fix this failure by correctly handling the computation of mr in read_reply() in rxe_resp.c in the replay flow. Fixes: `8a1a0be894` ("RDMA/rxe: Replace mr by rkey in responder resources") Link: https://lore.kernel.org/r/20220418174103.3040-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-20 11:21:24 -03:00
Bob Pearson	290c4a902b	RDMA/rxe: Fix "Replace mr by rkey in responder resources" The referenced commit generates a reference counting error if the rkey has the same index but the wrong key. In this case the reference taken by rxe_pool_get_index() is not dropped. Drop the reference if the keys don't match in rxe_recheck_mr(). Check that the mw and mr are still valid. Fixes: `8a1a0be894` ("RDMA/rxe: Replace mr by rkey in responder resources") Link: https://lore.kernel.org/r/20220411030647.20011-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-12 11:17:52 -03:00
Xiao Yang	2f917af777	RDMA/rxe: Generate a completion for unsupported/invalid opcode Current rxe_requester() doesn't generate a completion when processing an unsupported/invalid opcode. If rxe driver doesn't support a new opcode (e.g. RDMA Atomic Write) and RDMA library supports it, an application using the new opcode can reproduce this issue. Fix the issue by calling "goto err;". Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20220410113513.27537-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-12 11:16:24 -03:00
Bob Pearson	98c8026331	RDMA/rxe: Remove reliable datagram support The rdma_rxe driver does not actually support the reliable datagram transport but contains two references to RD opcodes in driver code. This commit removes these references to RD transport opcodes which are never used. Link: https://lore.kernel.org/r/cce0f07d-25fc-5880-69e7-001d951750b7@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:38:50 -03:00
Bob Pearson	409baed5d7	RDMA/rxe: Remove support for SMI QPs from rdma_rxe Currently the rdma_rxe driver supports SMI type QPs in a few places which is incorrect. RoCE devices never should support SMI QPs. This commit removes SMI QP support from the driver. Link: https://lore.kernel.org/r/20220407185416.16372-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:38:33 -03:00
Bob Pearson	5c477ee768	RDMA/rxe: Remove mc_grp_pool from struct rxe_dev Remove struct rxe_dev mc_grp_pool field. This field is no longer used. Link: https://lore.kernel.org/r/20220407184849.14359-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:38:18 -03:00
Bob Pearson	9227b6cec5	RDMA/rxe: Remove type 2A memory window capability Currently the rdma_rxe driver claims to support both 2A and 2B type memory windows. But the IBA requires 010-37.2.31: If an HCA supports the Base Memory Management extensions, the HCA shall support either Type 2A or Type 2B MWs, but not both. This commit removes the device capability bit for type 2A memory windows and adds a clarifying comment to rxe_mw.c. Link: https://lore.kernel.org/r/20220407184321.14207-1-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-08 14:37:17 -03:00
Jason Gunthorpe	e945c653c8	RDMA: Split kernel-only global device caps from uverbs device caps Split out flags from ib_device::device_cap_flags that are only used internally to the kernel into kernel_cap_flags that is not part of the uapi. This limits the device_cap_flags to being the same bitmap that will be copied to userspace. This cleanly splits out the uverbs flags from the kernel flags to avoid confusion in the flags bitmap. Add some short comments describing which each of the kernel flags is connected to. Remove unused kernel flags. Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-04-06 15:02:13 -03:00
Bob Pearson	3197706abd	RDMA/rxe: Use standard names for ref counting Rename rxe_add_ref() to rxe_get() and rxe_drop_ref() to rxe_put(). Significantly improves readability for new readers. Link: https://lore.kernel.org/r/20220304000808.225811-10-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-16 10:34:42 -03:00
Bob Pearson	3225717f6d	RDMA/rxe: Replace red-black trees by xarrays Currently the rxe driver uses red-black trees to add indices to the rxe object pools. Linux xarrays provide a better way to implement the same functionality for indices. This patch replaces red-black trees by xarrays for pool objects. Since xarrays already have a spinlock use that in place of the pool rwlock. Make sure that all changes in the xarray(index) and kref(ref counnt) occur atomically. Link: https://lore.kernel.org/r/20220304000808.225811-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-16 10:34:42 -03:00
Bob Pearson	df34dc9e03	RDMA/rxe: Shorten pool names in rxe_pool.c Replace pool names like "rxe-xx" with "xx". Just reduces clutter. Link: https://lore.kernel.org/r/20220304000808.225811-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:57 -03:00
Bob Pearson	3ccffe8abf	RDMA/rxe: Move max_elem into rxe_type_info Move the maximum number of elements from a parameter in rxe_pool_init to a member of the rxe_type_info array. Link: https://lore.kernel.org/r/20220304000808.225811-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:57 -03:00
Bob Pearson	b4a47f6836	RDMA/rxe: Replace obj by elem in declaration Fix a harmless typo replacing obj by elem in the cleanup fields. This has no effect but is confusing. Link: https://lore.kernel.org/r/20220304000808.225811-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:57 -03:00
Bob Pearson	3c3e4d582b	RDMA/rxe: Delete _locked() APIs for pool objects Since caller managed locks for indexed objects are no longer used these APIs are deleted. Link: https://lore.kernel.org/r/20220304000808.225811-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:57 -03:00
Bob Pearson	c9f4c69583	RDMA/rxe: Reverse the sense of RXE_POOL_NO_ALLOC There is only one remaining object type that allocates its own memory, that is mr. So the sense of RXE_POOL_NO_ALLOC is changed to RXE_POOL_ALLOC. Add checks to rxe_alloc() and rxe_add_to_pool() to make sure the correct call is used for the setting of this flag. Link: https://lore.kernel.org/r/20220304000808.225811-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:56 -03:00
Bob Pearson	8a1a0be894	RDMA/rxe: Replace mr by rkey in responder resources Currently rxe saves a copy of MR in responder resources for RDMA reads. Since the responder resources are never freed just over written if more are needed this MR may not have a reference freed until the QP is destroyed. This patch uses the rkey instead of the MR and on subsequent packets of a multipacket read reply message it looks up the MR from the rkey for each packet. This makes it possible for a user to deregister an MR or unbind a MW on the fly and get correct behaviour. Link: https://lore.kernel.org/r/20220304000808.225811-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:56 -03:00
Bob Pearson	63221acb0c	RDMA/rxe: Fix ref error in rxe_av.c The commit referenced below can take a reference to the AH which is never dropped. This only happens in the UD request path. This patch optionally passes that AH back to the caller so that it can hold the reference while the AV is being accessed and then drop it. Code to do this is added to rxe_req.c. The AV is also passed to rxe_prepare in rxe_net.c as an optimization. Fixes: `e2fe06c908` ("RDMA/rxe: Lookup kernel AH from ah index in UD WQEs") Link: https://lore.kernel.org/r/20220304000808.225811-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-15 20:49:56 -03:00
Chengguang Xu	aaaf62e066	RDMA/rxe: Remove useless argument for update_state() The argument 'payload' is not used in update_state(), so just remove it. Link: https://lore.kernel.org/r/20220307145047.3235675-2-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-14 20:36:01 -03:00
Chengguang Xu	7e8e611d6a	RDMA/rxe: Change variable and function argument to proper type The type of wqe length is u32 so in order to avoid overflow and shadow casting change variable and relevant function argument to proper type. Link: https://lore.kernel.org/r/20220307145047.3235675-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-03-14 20:36:01 -03:00
Bob Pearson	6090a0c4c7	RDMA/rxe: Cleanup rxe_mcast.c Finish adding subroutine comment headers to subroutines in rxe_mcast.c. Make minor api change cleanups. Link: https://lore.kernel.org/r/20220223230706.50332-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 20:29:15 -04:00
Bob Pearson	a181c4c81a	RDMA/rxe: Collect cleanup mca code in a subroutine Collect cleanup code for struct rxe_mca into a subroutine, __rxe_cleanup_mca() called in rxe_detach_mcg() in rxe_mcast.c. Link: https://lore.kernel.org/r/20220223230706.50332-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 20:29:15 -04:00
Bob Pearson	4a4f107347	RDMA/rxe: Collect mca init code in a subroutine Collect initialization code for struct rxe_mca into a subroutine, __rxe_init_mca(), to cleanup rxe_attach_mcg() in rxe_mcast.c. Check limit on total number of attached qp's. Link: https://lore.kernel.org/r/20220223230706.50332-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 20:29:15 -04:00
Bob Pearson	6a8a2e473b	RDMA/rxe: Warn if mcast memory is not freed Print a warning if memory allocated by mcast is not cleared when the rxe driver is unloaded. Link: https://lore.kernel.org/r/20220223230706.50332-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-23 20:29:15 -04:00
Bob Pearson	3810c1a1cb	RDMA/rxe: Remove mcg from rxe pools Finish removing mcg from rxe pools. Replace rxe pools ref counting by kref's. Replace rxe_alloc by kzalloc. Link: https://lore.kernel.org/r/20220208211644.123457-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 12:11:29 -04:00
Bob Pearson	d2ccf0411d	RDMA/rxe: Remove key'ed object support Now that rxe_mcast.c has it's own red-black tree support there is no longer any requirement for key'ed objects in rxe pools. This patch removes the key APIs and related code. Link: https://lore.kernel.org/r/20220208211644.123457-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 12:10:17 -04:00
Bob Pearson	8a0a5fe0c4	RDMA/rxe: Replace pool key by rxe->mcg_tree Continuing to decouple mcg from rxe pools. Create red-black tree code in rxe_mcast.c to hold mcg index. Replace pool key calls by calls to local red-black routines. Link: https://lore.kernel.org/r/20220208211644.123457-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 12:06:35 -04:00
Bob Pearson	8a99c81f12	RDMA/rxe: Replace int num_qp by atomic_t qp_num Replace int num_qp in struct rxe_mcg by atomic_t qp_num. Link: https://lore.kernel.org/r/20220208211644.123457-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 12:01:22 -04:00
Bob Pearson	5bc15d1f7e	RDMA/rxe: Replace grp by mcg, mce by mca Replace 'grp' by 'mcg', 'mce' by 'mca'. Shorten subroutine names in rxe_mcast.c. These name uses are more in line with other object names used. Link: https://lore.kernel.org/r/20220208211644.123457-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 12:01:19 -04:00
Bob Pearson	d572405518	RDMA/rxe: Use kzmalloc/kfree for mca Remove rxe_mca (was rxe_mc_elem) from rxe pools and use kzmalloc and kfree to allocate and free in rxe_mcast.c. Call kzalloc outside of spinlocks to avoid having to use GFP_ATOMIC. Link: https://lore.kernel.org/r/20220208211644.123457-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 11:59:11 -04:00
Bob Pearson	9fd0eb7c3c	RDMA/rxe: Move mcg_lock to rxe Replace mcg->mcg_lock and mc_grp_pool->pool_lock by rxe->mcg_lock. This is the first step of several intended to decouple the mc_grp and mc_elem objects from the rxe pool code. Link: https://lore.kernel.org/r/20220208211644.123457-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 11:55:28 -04:00
Bob Pearson	a099b08599	RDMA/rxe: Revert changes from irqsave to bh locks A previous patch replaced all irqsave locks in rxe with bh locks. This ran into problems because rdmacm has a bad habit of calling rdma verbs APIs while disabling irqs. This is not allowed during spin_unlock_bh() causing programs that use rdmacm to fail. This patch reverts the changes to locks that had this problem or got dragged into the same mess. After this patch blktests/check -q srp now runs correctly. Link: https://lore.kernel.org/r/20220215194448.44369-1-rpearsonhpe@gmail.com Fixes: `21adfa7a3c` ("RDMA/rxe: Replace irqsave locks with bh locks") Reported-by: Guoqing Jiang <guoqing.jiang@linux.dev> Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Tested-by: Bart Van Assche <bvanassche@acm.org> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-16 11:51:28 -04:00
Xiao Yang	b1377cc37f	RDMA/rxe: Check the last packet by RXE_END_MASK It's wrong to check the last packet by RXE_COMP_MASK because the flag is to indicate if responder needs to generate a completion. Fixes: `9fcd67d177` ("IB/rxe: increment msn only when completing a request") Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20211229034438.1854908-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-02-08 11:54:38 -04:00
Bob Pearson	d3f6899b0b	RDMA/rxe: Remove qp->grp_lock and qp->grp_list Since it is no longer required to cleanup attachments to multicast groups when a QP is destroyed qp->grp_lock and qp->grp_list are no longer needed and are removed. Link: https://lore.kernel.org/r/20220127213755.31697-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 14:33:28 -04:00
Bob Pearson	8a7fa872ff	RDMA/rxe: Remove rxe_drop_all_macst_groups With o10-2.2.3 enforced rxe_drop_all_mcast_groups is completely unnecessary. Remove it and references to it. Link: https://lore.kernel.org/r/20220127213755.31697-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 14:33:28 -04:00
Bob Pearson	f9f4846057	RDMA/rxe: Enforce IBA o10-2.2.3 Add code to check if a QP is attached to one or more multicast groups when destroy_qp is called and return an error if so. Link: https://lore.kernel.org/r/20220127213755.31697-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 14:33:28 -04:00
Bob Pearson	02e3524474	RDMA/rxe: Rename rxe_mc_grp and rxe_mc_elem Rename rxe_mc_grp to rxe_mcg. Rename rxe_mc_elem to rxe_mca. These can be read 'multicast group' and 'multicast attachment'. 'elem' collided with the use of elem in rxe pools and was a little confusing. Link: https://lore.kernel.org/r/20220127213755.31697-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 13:22:09 -04:00
Bob Pearson	758c7f1e9c	RDMA/rxe: Move rxe_mcast_attach/detach to rxe_mcast.c Move rxe_mcast_attach and rxe_mcast_detach from rxe_verbs.c to rxe_mcast.c, Make non-static and add declarations to rxe_loc.h. Make the subroutines in rxe_mcast.c referenced by these routines static and remove their declarations from rxe_loc.h. Link: https://lore.kernel.org/r/20220127213755.31697-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 13:22:09 -04:00
Bob Pearson	7df1023970	RDMA/rxe: Move rxe_mcast_add/delete to rxe_mcast.c Move rxe_mcast_add and rxe_mcast_delete from rxe_net.c to rxe_mcast.c, make static and remove declarations from rxe_loc.h. Link: https://lore.kernel.org/r/20220127213755.31697-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 13:22:09 -04:00
Leon Romanovsky	d7b887ab5d	RDMA/rxe: Delete useless module.h include There is no need in include of module.h in the following files. Link: https://lore.kernel.org/r/8bdb652b01f2316bc57b456fb8c60bfbffe6cc64.1642960861.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-28 13:03:13 -04:00
Linus Torvalds	f4484d138b	Merge branch 'akpm' (patches from Andrew) Merge more updates from Andrew Morton: "55 patches. Subsystems affected by this patch series: percpu, procfs, sysctl, misc, core-kernel, get_maintainer, lib, checkpatch, binfmt, nilfs2, hfs, fat, adfs, panic, delayacct, kconfig, kcov, and ubsan" * emailed patches from Andrew Morton <akpm@linux-foundation.org>: (55 commits) lib: remove redundant assignment to variable ret ubsan: remove CONFIG_UBSAN_OBJECT_SIZE kcov: fix generic Kconfig dependencies if ARCH_WANTS_NO_INSTR lib/Kconfig.debug: make TEST_KMOD depend on PAGE_SIZE_LESS_THAN_256KB btrfs: use generic Kconfig option for 256kB page size limit arch/Kconfig: split PAGE_SIZE_LESS_THAN_256KB from PAGE_SIZE_LESS_THAN_64KB configs: introduce debug.config for CI-like setup delayacct: track delays from memory compact Documentation/accounting/delay-accounting.rst: add thrashing page cache and direct compact delayacct: cleanup flags in struct task_delay_info and functions use it delayacct: fix incomplete disable operation when switch enable to disable delayacct: support swapin delay accounting for swapping without blkio panic: remove oops_id panic: use error_report_end tracepoint on warnings fs/adfs: remove unneeded variable make code cleaner FAT: use io_schedule_timeout() instead of congestion_wait() hfsplus: use struct_group_attr() for memcpy() region nilfs2: remove redundant pointer sbufs fs/binfmt_elf: use PT_LOAD p_align values for static PIE const_structs.checkpatch: add frequently used ops structs ...	2022-01-20 10:41:01 +02:00
Isabella Basso	fd0a146240	hash.h: remove unused define directive Patch series "test_hash.c: refactor into KUnit", v3. We refactored the lib/test_hash.c file into KUnit as part of the student group LKCAMP [1] introductory hackathon for kernel development. This test was pointed to our group by Daniel Latypov [2], so its full conversion into a pure KUnit test was our goal in this patch series, but we ran into many problems relating to it not being split as unit tests, which complicated matters a bit, as the reasoning behind the original tests is quite cryptic for those unfamiliar with hash implementations. Some interesting developments we'd like to highlight are: - In patch 1/5 we noticed that there was an unused define directive that could be removed. - In patch 4/5 we noticed how stringhash and hash tests are all under the lib/test_hash.c file, which might cause some confusion, and we also broke those kernel config entries up. Overall KUnit developments have been made in the other patches in this series: In patches 2/5, 3/5 and 5/5 we refactored the lib/test_hash.c file so as to make it more compatible with the KUnit style, whilst preserving the original idea of the maintainer who designed it (i.e. George Spelvin), which might be undesirable for unit tests, but we assume it is enough for a first patch. This patch (of 5): Currently, there exist hash_32() and __hash_32() functions, which were introduced in a patch [1] targeting architecture specific optimizations. These functions can be overridden on a per-architecture basis to achieve such optimizations. They must set their corresponding define directive (HAVE_ARCH_HASH_32 and HAVE_ARCH__HASH_32, respectively) so that header files can deal with these overrides properly. As the supported 32-bit architectures that have their own hash function implementation (i.e. m68k, Microblaze, H8/300, pa-risc) have only been making use of the (more general) __hash_32() function (which only lacks a right shift operation when compared to the hash_32() function), remove the define directive corresponding to the arch-specific hash_32() implementation. [1] https://lore.kernel.org/lkml/20160525073311.5600.qmail@ns.sciencehorizons.net/ [akpm@linux-foundation.org: hash_32_generic() becomes hash_32()] Link: https://lkml.kernel.org/r/20211208183711.390454-1-isabbasso@riseup.net Link: https://lkml.kernel.org/r/20211208183711.390454-2-isabbasso@riseup.net Reviewed-by: David Gow <davidgow@google.com> Tested-by: David Gow <davidgow@google.com> Co-developed-by: Augusto Durães Camargo <augusto.duraes33@gmail.com> Signed-off-by: Augusto Durães Camargo <augusto.duraes33@gmail.com> Co-developed-by: Enzo Ferreira <ferreiraenzoa@gmail.com> Signed-off-by: Enzo Ferreira <ferreiraenzoa@gmail.com> Signed-off-by: Isabella Basso <isabbasso@riseup.net> Cc: Geert Uytterhoeven <geert@linux-m68k.org> Cc: Brendan Higgins <brendanhiggins@google.com> Cc: Daniel Latypov <dlatypov@google.com> Cc: Shuah Khan <skhan@linuxfoundation.org> Cc: Rodrigo Siqueira <rodrigosiqueiramelo@gmail.com> Cc: kernel test robot <lkp@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2022-01-20 08:52:54 +02:00
Jason Gunthorpe	c0fe82baae	Linux 5.16 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmHbZ+YeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGDs4H/RgC8JOV3Dki1VtO 6OwPxUKKojhVU9LJis7kyG5voB/zE7tK5nI+jC3gYGQUFKWaZ3YY8s3UcV1zvg/b a44b91boA+dKxEwOq4RZNQ9mU+QWnNoG5+UqBkmB8vewi3QC3T8xEmpWcERLbU7d KrI2T6i4ksJ9OYSYMEMyrvrpt7nt3n1tDX8b71faXjf1zbLeGo9zT53t6BJ/LknV AK406Eq/3bg36OZrKFuG7hCJfRE/cSlxF9bxK3sIfMBMQ2YPe1S5+pxl5iBD0nyl NaHOBYcLTxPAne3YgIvK0zDdsS+EtPSlaVdWfSmNjQhX2vqEixldgdrOCmwp37vd 3gV9D28= =hrOo -----END PGP SIGNATURE----- Merge tag 'v5.16' into rdma.git for-next To resolve minor conflict in: drivers/infiniband/hw/mlx5/mlx5_ib.h By merging both hunks. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-13 13:21:03 -04:00
Zhu Yanjun	104f062fd1	RDMA/rxe: Use the standard method to produce udp source port Use the standard method to produce udp source port. Link: https://lore.kernel.org/r/20220106180359.2915060-5-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-07 19:34:56 -04:00
Leon Romanovsky	36783dec8d	RDMA/rxe: Delete deprecated module parameters interface Starting from the commit `66920e1b25` ("rdma_rxe: Use netlink messages to add/delete links") from the 2019, the RXE modules parameters are marked as deprecated in favour of rdmatool. So remove the kernel code too. Link: https://lore.kernel.org/r/c8376d7517aebe7cc851f0baaeef7b13707cf767.1641372460.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-05 15:18:47 -04:00
Li Zhijian	d8b0afd29c	RDMA/rxe: Fix indentations and operators sytle * Fix these up to always have the '+', and '\|' on the continuing line which is the normal kernel style. * Fix indentations correspondingly NOTE: this patch also remove the 2 redundant plus in IB_OPCODE_RD_FETCH_ADD and IB_OPCODE_RD_COMPARE_SWAP Link: https://lore.kernel.org/r/20220105042605.14343-1-lizhijian@fujitsu.com Signed-off-by: Li Zhijian <lizhijian@fujitsu.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-05 15:17:13 -04:00
Chengguang Xu	8d1cfb884e	RDMA/rxe: Fix a typo in opcode name There is a redundant ']' in the name of opcode IB_OPCODE_RC_SEND_MIDDLE, so just fix it. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20211218112320.3558770-1-cgxu519@mykernel.net Signed-off-by: Chengguang Xu <cgxu519@mykernel.net> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-05 13:57:36 -04:00
Zhu Yanjun	8803836fe7	RDMA/rxe: Remove the unused xmit_errors member The member variable xmit_errors can be replaced with rxe_counter_inc(rxe, RXE_CNT_SEND_ERR) Link: https://lore.kernel.org/r/20211216054842.1099428-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-05 13:56:50 -04:00
Minghao Chi	47920e4d2c	RDMA/rxe: Remove redundant err variable Return value directly instead of taking this in another redundant variable. Link: https://lore.kernel.org/20211215075258.442930-1-chi.minghao@zte.com.cn Reported-by: Zeal Robot <zealci@zte.com.cn> Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Devesh Sharma <Devesh.s.sharma@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-05 13:47:43 -04:00
Li Zhijian	8ff5f5d9d8	RDMA/rxe: Prevent double freeing rxe_map_set() The same rxe_map_set could be freed twice: rxe_reg_user_mr() -> rxe_mr_init_user() -> rxe_mr_free_map_set() # 1st -> rxe_drop_ref() ... -> rxe_mr_cleanup() -> rxe_mr_free_map_set() # 2nd Follow normal convection and put resource cleanup either in the error unwind of the allocator, or the overall free function. Leave the object unchanged with a NULL cur_map_set on failure and remove the unncessary free in rxe_mr_init_user(). Link: https://lore.kernel.org/r/20211228014406.1033444-1-lizhijian@cn.fujitsu.com Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2022-01-04 10:29:34 -04:00
Jason Gunthorpe	4922f09209	Linux 5.16-rc5 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmG2fU0eHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGC7EH/3R7Rt+OD8Wn8Ss3 w8V+dBxVwa2u2oMTyUHPxaeOXZ7bi38XlUdLFPOK/76bGwO0a5TmYZqsWdRbGyT0 HfcYjHsQ0lbJXk/nh2oM47oJxJXVpThIHXJEk0FZ0Y5t+DYjIYlNHzqZymUyhLem St74zgWcyT+MXuqY34vB827FJDUnOxhhhi85tObeunaSPAomy9aiYidSC1ARREnz iz2VUntP/QnRnKVvL2nUZNzcz1xL5vfCRSKsRGRSv3qW1Y/1M71ylt6JVmSftWq+ VmMdFxFhdrb1OK/1ct/930Un/UP2NG9EJsWxote2XYlnVSZHzDqH7lUhbqgdCcLz 1m2tVNY= =7wRd -----END PGP SIGNATURE----- Merge tag 'v5.16-rc5' into rdma.git for-next Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-12-14 20:18:48 -04:00
Zhu Yanjun	3fe6d228a0	RDMA/rxe: Remove the unnecessary variable The variable pkey is assigned from a macro. Then this variable is passed to a function bth_init directly, and pkey is not used again. So remove it and use the macro directly. Fixes: `76251e15ea` ("RDMA/rxe: Remove pkey table") Link: https://lore.kernel.org/r/20211207194057.713289-1-yanjun.zhu@linux.dev Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-12-07 13:56:22 -04:00
Pavel Skripkin	84b01721e8	RDMA: Fix use-after-free in rxe_queue_cleanup On error handling path in rxe_qp_from_init() qp->sq.queue is freed and then rxe_create_qp() will drop last reference to this object. qp clean up function will try to free this queue one time and it causes UAF bug. Fix it by zeroing queue pointer after freeing queue in rxe_qp_from_init(). Fixes: `514aee660d` ("RDMA: Globally allocate and release QP memory") Link: https://lore.kernel.org/r/20211121202239.3129-1-paskripkin@gmail.com Reported-by: syzbot+aab53008a5adf26abe91@syzkaller.appspotmail.com Signed-off-by: Pavel Skripkin <paskripkin@gmail.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-25 13:15:59 -04:00
Bob Pearson	88f9335fa7	RDMA/rxe: Remove some #defines from rxe_pool.h RXE_POOL_ALIGN is only used in rxe_pool.c so move RXE_POOL_ALIGN to rxe_pool.c from rxe_pool.h. RXE_POOL_CACHE_FLAGS is never used so it is deleted from rxe_pool.h Link: https://lore.kernel.org/r/20211103050241.61293-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:47:08 -04:00
Bob Pearson	38ee25a311	RDMA/rxe: Remove #include "rxe_loc.h" from rxe_pool.c rxe_loc.h is already included in rxe.h so do not include it in rxe_pool.c Link: https://lore.kernel.org/r/20211103050241.61293-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:47:08 -04:00
Bob Pearson	b92d766c87	RDMA/rxe: Save object pointer in pool element In rxe_pool.c currently there are many cases where it is necessary to compute the offset from a pool element struct to the object containing it in a type independent way where the offset is different for each type. By saving a pointer to the object when they are created extra work can be saved. Link: https://lore.kernel.org/r/20211103050241.61293-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:29:15 -04:00
Bob Pearson	c95acedbff	RDMA/rxe: Copy setup parameters into rxe_pool In rxe_pool.c copy remaining pool setup parameters from rxe_pool_info into rxe_pool. This saves looking up rxe_pool_info in the performance path. Link: https://lore.kernel.org/r/20211103050241.61293-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:29:15 -04:00
Bob Pearson	02827b6708	RDMA/rxe: Cleanup rxe_pool_entry Currently three different names are used to describe rxe pool elements. They are referred to as entries, elems or pelems. This patch chooses one 'elem' and changes the other ones. Link: https://lore.kernel.org/r/20211103050241.61293-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:29:14 -04:00
Bob Pearson	21adfa7a3c	RDMA/rxe: Replace irqsave locks with bh locks Most of the locks in the rxe driver are _irqsave/restore locks but in fact there are no interrupt threads that run rxe code or share data with rxe. There are softirq threads and data sharing so the appropriate lock type is _bh. This patch replaces all irqsave type locks with bh type locks. Link: https://lore.kernel.org/r/20211103050241.61293-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-11-19 13:29:14 -04:00
Joe Perches	000b8490ec	RDMA/rxe: Make rxe_type_info static const Make struct rxe_type_info static const and local to the only uses. Moves a bit of data to text. $ size drivers/infiniband/sw/rxe/rxe_pool.o* (defconfig w/ infiniband swe) text data bss dec hex filename 4456 12 0 4468 1174 drivers/infiniband/sw/rxe/rxe_pool.o.new 3817 652 0 4469 1175 drivers/infiniband/sw/rxe/rxe_pool.o.old Link: https://lore.kernel.org/r/166b715d71f98336e8ecab72b0dbdd266eee9193.camel@perches.com Signed-off-by: Joe Perches <joe@perches.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-28 08:58:27 -03:00
Christophe JAILLET	e30bb300a4	RDMA/rxe: Use 'bitmap_zalloc()' when applicable 'index.table' is a bitmap. So use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid some open-coded arithmetic in allocator arguments. Using 'bitmap_zalloc()' also allows the removal of a now useless 'bitmap_zero()'. Also change the corresponding 'kfree()' into 'bitmap_free()' to keep consistency. Link: https://lore.kernel.org/r/4a3e11d45865678d570333d1962820eb13168848.1635093628.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-28 08:58:27 -03:00
Christophe JAILLET	69d1ed5999	RDMA/rxe: Save a few bytes from struct rxe_pool 'table_size' is never read, it can be removed. In fact, the only place that uses something that could be 'table_size' is 'alloc_index()'. In this function, it is re-computed from 'min_index' and 'max_index'. Link: https://lore.kernel.org/r/2c42065049bb2b99bededdc423a9babf4a98adee.1635093628.git.christophe.jaillet@wanadoo.fr Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-28 08:58:27 -03:00
Bob Pearson	3b87e08242	RDMA/rxe: Convert kernel UD post send to use ah_num Modify ib_post_send for kernel UD sends to put the AH index into the WQE instead of the address vector. Link: https://lore.kernel.org/r/20211007204051.10086-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:27 -03:00
Bob Pearson	e2fe06c908	RDMA/rxe: Lookup kernel AH from ah index in UD WQEs Add code to rxe_get_av in rxe_av.c to use the AH index in UD send WQEs to lookup the kernel AH. For old user providers continue to use the AV passed in WQEs. Move setting pkt->rxe to before the call to rxe_get_av() to get access to the AH pool. Link: https://lore.kernel.org/r/20211007204051.10086-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:26 -03:00
Bob Pearson	4da698eabf	RDMA/rxe: Replace ah->pd by ah->ibah.pd The pd field in struct rxe_ah is redundant with the pd field in the rdma-core's ib_ah. Eliminate the pd field in rxe_ah and add an inline to extract the pd from the ibah field. Link: https://lore.kernel.org/r/20211007204051.10086-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:26 -03:00
Bob Pearson	73a5493210	RDMA/rxe: Create AH index and return to user space Make changes to rdma_user_rxe.h to allow indexing AH objects, passing the index in UD send WRs to the driver and returning the index to the rxe provider. Modify rxe_create_ah() to add an index to AH when created and if called from a new user provider return it to user space. If called from an old provider mark the AH as not having a useful index. Modify rxe_destroy_ah to drop the index before deleting the object. Link: https://lore.kernel.org/r/20211007204051.10086-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:26 -03:00
Bob Pearson	99c13a3e29	RDMA/rxe: Change AH objects to indexed Make changes to rxe_param.h and rxe_pool.c to allow indexing of AH objects. Valid indices are non-zero so older providers can be detected. Link: https://lore.kernel.org/r/20211007204051.10086-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:26 -03:00
Bob Pearson	cfc0312d9c	RDMA/rxe: Move AV from rxe_send_wqe to rxe_send_wr Move the struct rxe_av av from struct rxe_send_wqe to struct rxe_send_wr placing it in wr.ud at the same offset as it was previously. This has the effect of increasing the size of struct rxe_send_wr while keeping the size of struct rxe_send_wqe the same. This better reflects the use of this field which is only used for UD sends. This change has no effect on ABI compatibility so the modified rxe driver will operate with older versions of rdma-core. Link: https://lore.kernel.org/r/20211007204051.10086-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 13:25:26 -03:00
Aharon Landau	13f30b0fa0	RDMA/counter: Add a descriptor in struct rdma_hw_stats Add a counter statistic descriptor structure in rdma_hw_stats. In addition to the counter name, more meta-information will be added. This code extension is needed for optional-counter support in the following patches. Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com Signed-off-by: Aharon Landau <aharonl@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Mark Zhang <markzhang@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-12 12:48:04 -03:00
Xiao Yang	115fda3509	RDMA/rxe: Remove duplicate settings Remove duplicate settings for vendor_err and qp_num. Link: https://lore.kernel.org/r/20210930094813.226888-5-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-06 19:45:30 -03:00
Xiao Yang	262d9fcf85	RDMA/rxe: Set partial attributes when completion status != IBV_WC_SUCCESS As ibv_poll_cq()'s manual said, only partial attributes are valid when completion status != IBV_WC_SUCCESS. Link: https://lore.kernel.org/r/20210930094813.226888-4-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-06 19:45:30 -03:00
Xiao Yang	609bb8c3a3	RDMA/rxe: Change the is_user member of struct rxe_cq to bool Make the is_user members of struct rxe_qp/rxe_cq has the same type. Link: https://lore.kernel.org/r/20210930094813.226888-3-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-06 19:45:30 -03:00
Xiao Yang	1cf2ce8272	RDMA/rxe: Remove the is_user members of struct rxe_sq/rxe_rq/rxe_srq The is_user members of struct rxe_sq/rxe_rq/rxe_srq are unsed since commit `ae6e843fe0` ("RDMA/rxe: Add memory barriers to kernel queues"). In this case, it is fine to remove them directly. Link: https://lore.kernel.org/r/20210930094813.226888-2-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-06 19:45:29 -03:00
Rao Shoaib	0994a1bcd5	RDMA/rxe: Bump up default maximum values used via uverbs In our internal testing we have found that default maximum values are too small. Ideally there should be no limits, but since maximum values are reported via ibv_query_device, we have to return some value. So, the default maximums have been changed to large values. Link: https://lore.kernel.org/r/20210915011220.307585-1-Rao.Shoaib@oracle.com Signed-off-by: Rao Shoaib <Rao.Shoaib@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-10-04 15:55:06 -03:00
Xiao Yang	27da60547d	RDMA/rxe: Remove unused WR_READ_WRITE_OR_SEND_MASK Link: https://lore.kernel.org/r/20210914080253.1145353-4-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-28 11:42:24 -03:00
Xiao Yang	45216d6363	RDMA/rxe: Add MASK suffix for RXE_READ_OR_ATOMIC and RXE_WRITE_OR_SEND To reflect the intention, since it is not just a single bit. Link: https://lore.kernel.org/r/20210914080253.1145353-3-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-28 11:42:24 -03:00
Xiao Yang	373efe0f30	RDMA/rxe: Add new RXE_READ_OR_WRITE_MASK 1) Replace (RXE_READ_MASK \| RXE_WRITE_MASK) with RXE_READ_OR_WRITE_MASK. 2) Change (RXE_READ_MASK \| RXE_WRITE_OR_SEND) to RXE_READ_OR_WRITE_MASK because we don't need to check RETH for RXE_SEND_MASK. Link: https://lore.kernel.org/r/20210914080253.1145353-2-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-28 11:42:24 -03:00
Bob Pearson	450f4f6aa1	RDMA/rxe: Only allow invalidate for appropriate MRs Local and remote invalidate operations are not allowed by IBA for MRs created by (re)register memory verbs. This patch checks the MR type in rxe_invalidate_mr(). Link: https://lore.kernel.org/r/20210914164206.19768-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:15:00 -03:00
Bob Pearson	647bf13ce9	RDMA/rxe: Create duplicate mapping tables for FMRs For fast memory regions create duplicate mapping tables so ib_map_mr_sg() can build a new mapping table which is then swapped into place synchronously with the execution of an IB_WR_REG_MR work request. Currently the rxe driver uses the same table for receiving RDMA operations and for building new tables in preparation for reusing the MR. This exposes users to potentially incorrect results. Link: https://lore.kernel.org/r/20210914164206.19768-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:15:00 -03:00
Bob Pearson	001345339f	RDMA/rxe: Separate HW and SW l/rkeys Separate software and simulated hardware lkeys and rkeys for MRs and MWs. This makes struct ib_mr and struct ib_mw isolated from hardware changes triggered by executing work requests. This change fixes a bug seen in blktest. Link: https://lore.kernel.org/r/20210914164206.19768-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:14:59 -03:00
Bob Pearson	47b7f7064b	RDMA/rxe: Cleanup MR status and type enums Eliminate RXE_MR_STATE_ZOMBIE which is not compatible with IBA. RXE_MR_STATE_INVALID is better. Replace RXE_MR_TYPE_XXX by IB_MR_TYPE_XXX which covers all the needed types. Link: https://lore.kernel.org/r/20210914164206.19768-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:14:59 -03:00
Bob Pearson	ae6e843fe0	RDMA/rxe: Add memory barriers to kernel queues Earlier patches added memory barriers to protect user space to kernel space communications. The user space queues were previously shown to have occasional memory synchonization errors which were removed by adding smp_load_acquire, smp_store_release barriers. This patch extends that to the case where queues are used between kernel space threads. This patch also extends the queue types to include kernel ULP queues which access the other end of the queues in kernel verbs calls like poll_cq and post_send/recv. Link: https://lore.kernel.org/r/20210914164206.19768-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-24 10:14:59 -03:00
Zhu Yanjun	ad17bbef3d	RDMA/rxe: remove the unnecessary variable In the struct rxe_qp, the variable send_pkts is never used. So remove it. Link: https://lore.kernel.org/r/20210915075128.482919-1-yanjun.zhu@intel.com Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-14 16:22:48 -03:00
Zhu Yanjun	d12faf2dee	RDMA/rxe: remove the redundant variable The variable port is not necessary. So remove it. Link: https://lore.kernel.org/r/20210915012456.476728-1-yanjun.zhu@intel.com Signed-off-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-14 16:22:48 -03:00
Junji Wei	dcd3f985b2	RDMA/rxe: Fix wrong port_cap_flags The port->attr.port_cap_flags should be set to enum ib_port_capability_mask_bits in ib_mad.h, not RDMA_CORE_CAP_PROT_ROCE_UDP_ENCAP. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210831083223.65797-1-weijunji@bytedance.com Signed-off-by: Junji Wei <weijunji@bytedance.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-09-14 16:22:48 -03:00
Linus Torvalds	23852bec53	RDMA v5.15 merge window Pull Request - Various cleanup and small features for rtrs - kmap_local_page() conversions - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns - Cache the IB subnet prefix - Rework how CRC is calcuated in rxe - Clean reference counting in iwpm's netlink - Pull object allocation and lifecycle for user QPs to the uverbs core code - Several small hns features and continued general code cleanups - Fix the scatterlist confusion of orig_nents/nents introduced in an earlier patch creating the append operation -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmEudRgACgkQOG33FX4g mxraJA//c6bMxrrTVrzmrtrkyYD4tYWE8RDfgvoyZtleZnnEOJeunCQWakQrpJSv ukSnOGCA3PtnmRMdV54f/11YJ/7otxOJodSO7jWsIoBrqG/lISAdX8mn2iHhrvJ0 dIaFEFPLy0WqoMLCJVIYIupR0IStVHb/mWx0uYL4XnnoYKyt7f7K5JMZpNWMhDN2 ieJw0jfrvEYm8pipWuxUvB16XARlzAWQrjqLpMRI+jFRpbDVBY21dz2/LJvOJPrA LcQ+XXsV/F659ibOAGm6bU4BMda8fE6Lw90B/gmhSswJ205NrdziF5cNYHP0QxcN oMjrjSWWHc9GEE7MTipC2AH8e36qob16Q7CK+zHEJ+ds7R6/O/8XmED1L8/KFpNA FGqnjxnxsl1y27mUegfj1Hh8PfoDp2oVq0lmpEw0CYo4cfVzHSMRrbTR//XmW628 Ie/mJddpFK4oLk+QkSNjSLrnxOvdTkdA58PU0i84S5eUVMNm41jJDkxg2J7vp0Zn sclZsclhUQ9oJ5Q2so81JMWxu4JDn7IByXL0ULBaa6xwQTiVEnyvSxSuPlflhLRW 0vI2ylATYKyWkQqyX7VyWecZJzwhwZj5gMMWmoGsij8bkZhQ/VaQMaesByzSth+h NV5UAYax4GqyOQ/tg/tqT6e5nrI1zof87H64XdTCBpJ7kFyQ/oA= =ZwOe -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma updates from Jason Gunthorpe: "This is quite a small cycle, no major series stands out. The HNS and rxe drivers saw the most activity this cycle, with rxe being broken for a good chunk of time. The significant deleted line count is due to a SPDX cleanup series. Summary: - Various cleanup and small features for rtrs - kmap_local_page() conversions - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns - Cache the IB subnet prefix - Rework how CRC is calcuated in rxe - Clean reference counting in iwpm's netlink - Pull object allocation and lifecycle for user QPs to the uverbs core code - Several small hns features and continued general code cleanups - Fix the scatterlist confusion of orig_nents/nents introduced in an earlier patch creating the append operation" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits) RDMA/mlx5: Relax DCS QP creation checks RDMA/hns: Delete unnecessary blank lines. RDMA/hns: Encapsulate the qp db as a function RDMA/hns: Adjust the order in which irq are requested and enabled RDMA/hns: Remove RST2RST error prints for hw v1 RDMA/hns: Remove dqpn filling when modify qp from Init to Init RDMA/hns: Fix QP's resp incomplete assignment RDMA/hns: Fix query destination qpn RDMA/hfi1: Convert to SPDX identifier IB/rdmavt: Convert to SPDX identifier RDMA/hns: Bugfix for incorrect association between dip_idx and dgid RDMA/hns: Bugfix for the missing assignment for dip_idx RDMA/hns: Bugfix for data type of dip_idx RDMA/hns: Fix incorrect lsn field RDMA/irdma: Remove the repeated declaration RDMA/core/sa_query: Retry SA queries RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append RDMA/hns: Delete unused hns bitmap interface ...	2021-09-02 14:47:21 -07:00
Jason Gunthorpe	6a217437f9	Merge branch 'sg_nents' into rdma.git for-next From Maor Gottlieb ==================== Fix the use of nents and orig_nents in the sg table append helpers. The nents should be used by the DMA layer to store the number of DMA mapped sges, the orig_nents is the number of CPU sges. Since the sg append logic doesn't always create a SGL with exactly orig_nents entries store a total_nents as well to allow the table to be properly free'd and reorganize the freeing logic to share across all the use cases. ==================== Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> * 'sg_nents': RDMA: Use the sg_table directly and remove the opencoded version from umem lib/scatterlist: Fix wrong update of orig_nents lib/scatterlist: Provide a dedicated function to support table append	2021-08-30 09:49:59 -03:00
Maor Gottlieb	79fbd3e124	RDMA: Use the sg_table directly and remove the opencoded version from umem This allows using the normal sg_table APIs and makes all the code cleaner. Remove sgt, nents and nmapd from ib_umem. Link: https://lore.kernel.org/r/20210824142531.3877007-4-maorg@nvidia.com Signed-off-by: Maor Gottlieb <maorg@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-24 19:52:40 -03:00
Xiao Yang	cc4f596cf8	RDMA/rxe: Zero out index member of struct rxe_queue 1) New index member of struct rxe_queue was introduced but not zeroed so the initial value of index may be random. 2) The current index is not masked off to index_mask. In this case producer_addr() and consumer_addr() will get an invalid address by the random index and then accessing the invalid address triggers the following panic: "BUG: unable to handle page fault for address: ffff9ae2c07a1414" Fix the issue by using kzalloc() to zero out index member. Fixes: `5bcf5a59c4` ("RDMA/rxe: Protext kernel index from user space") Link: https://lore.kernel.org/r/20210820111509.172500-1-yangx.jy@fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-20 15:48:58 -03:00
Bob Pearson	65a81b61d8	RDMA/rxe: Fix memory allocation while in a spin lock rxe_mcast_add_grp_elem() in rxe_mcast.c calls rxe_alloc() while holding spinlocks which in turn calls kzalloc(size, GFP_KERNEL) which is incorrect. This patch replaces rxe_alloc() by rxe_alloc_locked() which uses GFP_ATOMIC. This bug was caused by the below mentioned commit and failing to handle the need for the atomic allocate. Fixes: `4276fd0ddd` ("RDMA/rxe: Remove RXE_POOL_ATOMIC") Link: https://lore.kernel.org/r/20210813210625.4484-1-rpearsonhpe@gmail.com Reported-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-19 20:11:16 -03:00
Leon Romanovsky	514aee660d	RDMA: Globally allocate and release QP memory Convert QP object to follow IB/core general allocation scheme. That change allows us to make sure that restrack properly kref the memory. Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com Reviewed-by: Gal Pressman <galpress@amazon.com> #efa Tested-by: Gal Pressman <galpress@amazon.com> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-03 13:44:27 -03:00
Bob Pearson	ef4b96a577	RDMA/rxe: Restore setting tot_len in the IPv4 header An earlier patch removed setting of tot_len in IPv4 headers because it was also set in ip_local_out. However, this change resulted in an incorrect ICRC being computed because the tot_len field is not masked out. This patch restores that line. This fixes the bug reported by Zhu Yanjun. This bug affects anyone using rxe which is currently broken. Fixes: `230bb836ee` ("RDMA/rxe: Fix redundant call to ip_send_check") Link: https://lore.kernel.org/r/20210729220039.18549-3-rpearsonhpe@gmail.com Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-and-tested-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-02 12:45:22 -03:00
Bob Pearson	e2a05339fa	RDMA/rxe: Use the correct size of wqe when processing SRQ The memcpy() that copies a WQE from a SRQ the QP uses an incorrect size. The size should have been the size of the rxe_send_wqe struct not the size of a pointer to it. The result is that IO operations using a SRQ on the responder side will fail. Fixes: `ec0fa2445c` ("RDMA/rxe: Fix over copying in get_srq_wqe") Link: https://lore.kernel.org/r/20210729220039.18549-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-08-02 12:45:22 -03:00
Bob Pearson	923232bbea	RDMA/rxe: Fix types in rxe_icrc.c Currently the ICRC is generated as a u32 type and then forced to a __be32 and stored into the ICRC field in the packet. The actual type of the ICRC is __be32. This patch replaces u32 by __be32 and eliminates the casts. The computation is exactly the same as the original but the types are more consistent. Link: https://lore.kernel.org/r/20210707040040.15434-10-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:35 -03:00
Bob Pearson	e4f5c82fef	RDMA/rxe: Add kernel-doc comments to rxe_icrc.c This patch adds kernel-doc style comments to rxe_icrc.c Link: https://lore.kernel.org/r/20210707040040.15434-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Bob Pearson	add2b3b80e	RDMA/rxe: Move crc32 init code to rxe_icrc.c This patch collects the code from rxe_register_device() that sets up the crc32 calculation into a subroutine rxe_icrc_init() in rxe_icrc.c. Link: https://lore.kernel.org/r/20210707040040.15434-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Bob Pearson	6388751057	RDMA/rxe: Fixup rxe_icrc_hdr rxe_icrc_hdr() in rxe_icrc.c is no longer shared. This patch makes it static and changes the parameter list to match the other routines there. Link: https://lore.kernel.org/r/20210707040040.15434-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Bob Pearson	b6c6cc4acd	RDMA/rxe: Move rxe_crc32 to a subroutine Move rxe_crc32() from rxe.h to rxe_icrc.c as a static local function. Link: https://lore.kernel.org/r/20210707040040.15434-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Bob Pearson	1117f26ea7	RDMA/rxe: Move ICRC generation to a subroutine Isolate ICRC generation into a single subroutine named rxe_generate_icrc() in rxe_icrc.c. Remove scattered crc generation code from elsewhere. Link: https://lore.kernel.org/r/20210707040040.15434-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:34 -03:00
Bob Pearson	13050a0b32	RDMA/rxe: Fixup rxe_send and rxe_loopback Fixup rxe_send() and rxe_loopback() in rxe_net.c to have the same calling sequence. This patch makes them static and have the same parameter list and return value. Link: https://lore.kernel.org/r/20210707040040.15434-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:33 -03:00
Bob Pearson	36fbb03d05	RDMA/rxe: Move rxe_xmit_packet to a subroutine rxe_xmit_packet() was an overlong inline subroutine. This patch moves it into rxe_net.c as an ordinary subroutine. Link: https://lore.kernel.org/r/20210707040040.15434-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:33 -03:00
Bob Pearson	fe87fb17c6	RDMA/rxe: Move ICRC checking to a subroutine Move the code in rxe_recv() that checks the ICRC on incoming packets to a subroutine rxe_check_icrc() and move that to rxe_icrc.c. Link: https://lore.kernel.org/r/20210707040040.15434-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-16 12:43:33 -03:00
Bob Pearson	b18c7da63f	RDMA/rxe: Fix memory leak in error path code In rxe_mr_init_user() at the third error the driver fails to free the memory at mr->map. This patch adds code to do that. This error only occurs if page_address() fails to return a non zero address which should never happen for 64 bit architectures. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210705164153.17652-1-rpearsonhpe@gmail.com Reported by: Haakon Bugge <haakon.bugge@oracle.com> Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-15 14:44:12 -03:00
Xiao Yang	cdbdb77247	RDMA/rxe: Remove the repeated 'mr->umem = umem' Drop duplicated code Link: https://lore.kernel.org/r/20210702123024.37025-1-ice_yangxiao@163.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Reviewed-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-07-15 14:43:19 -03:00
Dan Carpenter	36941dfe0e	RDMA/rxe: Missing unlock on error in get_srq_wqe() This error path needs to unlock before returning. Fixes: `ec0fa2445c` ("RDMA/rxe: Fix over copying in get_srq_wqe") Link: https://lore.kernel.org/r/YNXUCmnPsSkPyhkm@mwanda Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Majd Dibbiny <majd@nvidia.com> Reviewed-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-25 12:00:28 -03:00
Bob Pearson	2d3b2e4427	RDMA/rxe: Fix redundant skb_put_zero rxe_init_packet() in rxe_net.c calls skb_put_zero() to reserve space for the payload and zero it out. All these bytes are then re-written with RoCE headers and payload. Remove this useless extra copy. Fixes: `ecb238f6a7` ("IB/cxgb4: use skb_put_zero()/__skb_put_zero") Link: https://lore.kernel.org/r/20210618045742.204195-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:53 -03:00
Bob Pearson	3896bde92d	RDMA/rxe: Fix extra copy in prepare_ack_packet Currently prepare_ack_packet writes almost all the fields of the BTH in the ack packet twice. Replace code with the subroutine init_bth(). Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210618045742.204195-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:53 -03:00
Bob Pearson	ec0fa2445c	RDMA/rxe: Fix over copying in get_srq_wqe Currently get_srq_wqe() in rxe_resp.c copies the maximum possible number of bytes from the wqe into the QPs copy of the SRQ wqe. This is usually extra work and risks reading past the end of the SRQ circular buffer if the SRQ is configured with less than the maximum possible number of SGEs. Check the number of SGEs is not too large. Compute the actual number of bytes in the WR and copy only those. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210618045742.204195-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	1993cbed65	RDMA/rxe: Fix extra copies in build_rdma_network_hdr build_rdma_network_hdr() in rxe_resp.c does more copying than is needed. Remove this subroutine and eliminate the extra copies for IPV6 and reduce the extra copying for IPV4. Fixes: `e404f945a6` ("IB/rxe: improved debug prints & code cleanup") Link: https://lore.kernel.org/r/20210618045742.204195-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	230bb836ee	RDMA/rxe: Fix redundant call to ip_send_check For IPV4 packets sent on the wire the rxe driver calls ip_local_out() which immediately calls __ip_local_out() which sets iph->tot_len and calls ip_send_check(). This code is duplicated in prepare4(). On the loopback path the IP header checksum and tot_len fields are not used so they do not need to be set. Remove this redundant code. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210618045742.204195-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Bob Pearson	fceb24a73e	RDMA/rxe: Fix useless copy in send_atomic_ack In send_atomic_ack() in rxe_resp.c there is code copying ack_pkt into the skb->cb[]. This doesn't do anything useful because the cb[] is not used in the transmit path by the rxe driver. Remove this code. Fixes: `4c93496f18` ("IB/rxe: do not copy extra stack memory to skb") Link: https://lore.kernel.org/r/20210618045742.204195-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 15:38:52 -03:00
Jason Gunthorpe	fdcebbc2ac	Linux 5.13-rc7 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmDPuyMeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGvxgH/RKvSuRPwkJ2Jcp9 VLi5kCbqtJlYLq6tB6peSJ8otKgxkcRwC0pIY4LlYIAWYboktLQ5RKp/9nB2h2FN aMZUMu6AI/lVJyFMI5MnKnJIUiUq+WXR3lSSlw68vwFLFdzqUZFNq+bqeiVvnIy1 yqA6naj24Tu/RbYffQoPvdSJcU2SLXRMxwD8HRGiU2d51RaFsOvsZvF+P5TVcsEV ZmttJeER21CaI/A809eqaFmyGrUOcZZK9roZEbMwanTZOMw18biEsLu/UH4kBX01 JC4+RlGxcWjQ5YNZgChsgoOK/CHzc6ITztTntdeDWAvwZjQFzV7pCy4/3BWne3O+ 5178yHM= =o8cN -----END PGP SIGNATURE----- Merge tag 'v5.13-rc7' into rdma.git for-next Linux 5.13-rc7 Needed for dependencies in following patches. Merge conflict in rxe_cmop.c resolved by compining both patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-22 14:43:51 -03:00
Xiao Yang	20ec0a6d60	RDMA/rxe: Don't overwrite errno from ib_umem_get() rxe_mr_init_user() always returns the fixed -EINVAL when ib_umem_get() fails so it's hard for user to know which actual error happens in ib_umem_get(). For example, ib_umem_get() will return -EOPNOTSUPP when trying to pin pages on a DAX file. Return actual error as mlx4/mlx5 does. Link: https://lore.kernel.org/r/20210621071456.4259-1-ice_yangxiao@163.com Signed-off-by: Xiao Yang <yangx.jy@fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-21 21:05:02 -03:00
Jason Gunthorpe	915e4af59f	RDMA: Remove rdma_set_device_sysfs_group() The driver's device group can be specified as part of the ops structure like the device's port group. No need for the complicated API. Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:58:32 -03:00
Jason Gunthorpe	4b5f4d3fb4	RDMA: Split the alloc_hw_stats() ops to port and device variants This is being used to implement both the port and device global stats, which is causing some confusion in the drivers. For instance EFA and i40iw both seem to be misusing the device stats. Split it into two ops so drivers that don't support one or the other can leave the op NULL'd, making the calling code a little simpler to understand. Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com Tested-by: Gal Pressman <galpress@amazon.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:58:29 -03:00
Bob Pearson	570d2b99d0	RDMA/rxe: Disallow MR dereg and invalidate when bound Check that an MR has no bound MWs before allowing a dereg or invalidate operation. Link: https://lore.kernel.org/r/20210608042552.33275-11-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:19 -03:00
Bob Pearson	cdd0b85675	RDMA/rxe: Implement memory access through MWs Add code to implement memory access through memory windows. Link: https://lore.kernel.org/r/20210608042552.33275-10-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	3902b429ca	RDMA/rxe: Implement invalidate MW operations Implement invalidate MW and cleaned up invalidate MR operations. Added code to perform remote invalidate for send with invalidate. Added code to perform local invalidation. Deleted some blank lines in rxe_loc.h. Link: https://lore.kernel.org/r/20210608042552.33275-9-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	32a577b4c3	RDMA/rxe: Add support for bind MW work requests Add support for bind MW work requests from user space. Since rdma/core does not support bind mw in ib_send_wr there is no way to support bind mw in kernel space. Added bind_mw local operation in rxe_req.c. Added bind_mw WR operation in rxe_opcode.c. Added bind_mw WC in rxe_comp.c. Added additional fields to rxe_mw in rxe_verbs.h. Added rxe_do_dealloc_mw() subroutine to cleanup an mw when rxe_dealloc_mw is called. Added code to implement bind_mw operation in rxe_mw.c Link: https://lore.kernel.org/r/20210608042552.33275-8-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	c1a411268a	RDMA/rxe: Move local ops to subroutine Simplify rxe_requester() by moving the local operations to a subroutine. Add an error return for illegal send WR opcode. Moved next_index ahead of rxe_run_task which fixed a small bug where work completions were delayed until after the next wqe which was not the intended behavior. Let errors return their own WC status. Previously all errors were reported as protection errors which was incorrect. Changed the return of errors from rxe_do_local_ops() to err: which causes an immediate completion. Without this an error on a last WR may get lost. Changed fill_packet() to finish_packet() which is more accurate. Fixes: 8700e2e7c485 ("The software RoCE driver") Link: https://lore.kernel.org/r/20210608042552.33275-7-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	886441fb2e	RDMA/rxe: Replace WR_REG_MASK by WR_LOCAL_OP_MASK Rxe has two mask bits WR_LOCAL_MASK and WR_REG_MASK with WR_REG_MASK used to indicate any local operation and WR_LOCAL_MASK unused. This patch replaces both of these with one mask bit WR_LOCAL_OP_MASK which is clearer. Link: https://lore.kernel.org/r/20210608042552.33275-6-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:18 -03:00
Bob Pearson	beec0239c3	RDMA/rxe: Add ib_alloc_mw and ib_dealloc_mw verbs Add ib_alloc_mw and ib_dealloc_mw verbs APIs. Added new file rxe_mw.c focused on MWs. Changed the 8 bit random key generator. Added a cleanup routine for MWs. Added verbs routines to ib_device_ops. Link: https://lore.kernel.org/r/20210608042552.33275-5-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:17 -03:00
Bob Pearson	af732adfac	RDMA/rxe: Enable MW object pool Currently the rxe driver has a rxe_mw struct object but nothing about memory windows is enabled. This patch turns on memory windows and some minor cleanup. Set device attribute in rxe.c so max_mw = MAX_MW. Change parameters in rxe_param.h so that MAX_MW is the same as MAX_MR. Reduce the number of MRs and MWs to 4K from 256K. Add device capability bits for 2a and 2b memory windows. Removed RXE_MR_TYPE_MW from the rxe_mr_type enum. Link: https://lore.kernel.org/r/20210608042552.33275-4-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:17 -03:00
Bob Pearson	08224016ab	RDMA/rxe: Return errors for add index and key Modify rxe_add_index() and rxe_add_key() to return an error if the index or key is aleady present in the pool. Currently they print a warning and silently fail with bad consequences to the caller. Link: https://lore.kernel.org/r/20210608042552.33275-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:51:17 -03:00
Bob Pearson	15ae1375ea	RDMA/rxe: Fix qp reference counting for atomic ops Currently the rdma_rxe driver attempts to protect atomic responder resources by taking a reference to the qp which is only freed when the resource is recycled for a new read or atomic operation. This means that in normal circumstances there is almost always an extra qp reference once an atomic operation has been executed which prevents cleaning up the qp and associated pd and cqs when the qp is destroyed. This patch removes the call to rxe_add_ref() in send_atomic_ack() and the call to rxe_drop_ref() in free_rd_atomic_resource(). If the qp is destroyed while a peer is retrying an atomic op it will cause the operation to fail which is acceptable. Link: https://lore.kernel.org/r/20210604230558.4812-1-rpearsonhpe@gmail.com Reported-by: Zhu Yanjun <zyjzyj2000@gmail.com> Fixes: `86af617641` ("IB/rxe: remove unnecessary skb_clone") Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-16 20:20:23 -03:00
Kamal Heib	32a25f2ea6	RDMA/rxe: Fix failure during driver load To avoid the following failure when trying to load the rdma_rxe module while IPv6 is disabled, add a check for EAFNOSUPPORT and ignore the failure, also delete the needless debug print from rxe_setup_udp_tunnel(). $ modprobe rdma_rxe modprobe: ERROR: could not insert 'rdma_rxe': Operation not permitted Fixes: `dfdd6158ca` ("IB/rxe: Fix kernel panic in udp_setup_tunnel") Link: https://lore.kernel.org/r/20210603090112.36341-1-kamalheib1@gmail.com Reported-by: Yi Zhang <yi.zhang@redhat.com> Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-03 16:50:29 -03:00
Bob Pearson	5bcf5a59c4	RDMA/rxe: Protext kernel index from user space In order to prevent user space from modifying the index that belongs to the kernel for shared queues let the kernel use a local copy of the index and copy any new values of that index to the shared rxe_queue_bus struct. This adds more switch statements which decreases the performance of the queue API. Move the type into the parameter list for these functions so that the compiler can optimize out the switch statements when the explicit type is known. Modify all the calls in the driver on performance paths to pass in the explicit queue type. Link: https://lore.kernel.org/r/20210527194748.662636-4-rpearsonhpe@gmail.com Link: https://lore.kernel.org/linux-rdma/20210526165239.GP1002214@@nvidia.com/ Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-03 15:53:01 -03:00
Bob Pearson	0a67c46d2e	RDMA/rxe: Protect user space index loads/stores Modify the queue APIs to protect all user space index loads with smp_load_acquire() and all user space index stores with smp_store_release(). Base this on the types of the queues which can be one of ..KERNEL, ..FROM_USER, ..TO_USER. Kernel space indices are protected by locks which also provide memory barriers. Link: https://lore.kernel.org/r/20210527194748.662636-3-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-03 15:53:01 -03:00
Bob Pearson	59daff49f2	RDMA/rxe: Add a type flag to rxe_queue structs To create optimal code only want to use smp_load_acquire() and smp_store_release() for user indices in rxe_queue APIs since kernel indices are protected by locks which also act as memory barriers. By adding a type to the queues we can determine which indices need to be protected. Link: https://lore.kernel.org/r/20210527194748.662636-2-rpearsonhpe@gmail.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-06-03 15:53:01 -03:00
Lang Cheng	cd5b010fff	RDMA/rxe: Remove unused parameter udata The old version of ib_umem_get() need these udata as a parameter but now they are unnecessary. Fixes: `c320e527e1` ("IB: Allow calls to ib_umem_get from kernel ULPs") Link: https://lore.kernel.org/r/1620807142-39157-5-git-send-email-liweihang@huawei.com Signed-off-by: Lang Cheng <chenglang@huawei.com> Signed-off-by: Weihang Li <liweihang@huawei.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-05-20 11:52:17 -03:00
Leon Romanovsky	dc07628bd2	RDMA/rxe: Return CQE error if invalid lkey was supplied RXE is missing update of WQE status in LOCAL_WRITE failures. This caused the following kernel panic if someone sent an atomic operation with an explicitly wrong lkey. [leonro@vm ~]$ mkt test test_atomic_invalid_lkey (tests.test_atomic.AtomicTest) ... WARNING: CPU: 5 PID: 263 at drivers/infiniband/sw/rxe/rxe_comp.c:740 rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Modules linked in: crc32_generic rdma_rxe ip6_udp_tunnel udp_tunnel rdma_ucm rdma_cm ib_umad ib_ipoib iw_cm ib_cm mlx5_ib ib_uverbs ib_core mlx5_core ptp pps_core CPU: 5 PID: 263 Comm: python3 Not tainted 5.13.0-rc1+ #2936 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014 RIP: 0010:rxe_completer+0x1a6d/0x2e30 [rdma_rxe] Code: 03 0f 8e 65 0e 00 00 3b 93 10 06 00 00 0f 84 82 0a 00 00 4c 89 ff 4c 89 44 24 38 e8 2d 74 a9 e1 4c 8b 44 24 38 e9 1c f5 ff ff <0f> 0b e9 0c e8 ff ff b8 05 00 00 00 41 bf 05 00 00 00 e9 ab e7 ff RSP: 0018:ffff8880158af090 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff888016a78000 RCX: ffffffffa0cf1652 RDX: 1ffff9200004b442 RSI: 0000000000000004 RDI: ffffc9000025a210 RBP: dffffc0000000000 R08: 00000000ffffffea R09: ffff88801617740b R10: ffffed1002c2ee81 R11: 0000000000000007 R12: ffff88800f3b63e8 R13: ffff888016a78008 R14: ffffc9000025a180 R15: 000000000000000c FS: 00007f88b622a740(0000) GS:ffff88806d540000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f88b5a1fa10 CR3: 000000000d848004 CR4: 0000000000370ea0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0xb11/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_responder+0x5532/0x7620 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_rcv+0x9c8/0x1df0 [rdma_rxe] rxe_loopback+0x157/0x1e0 [rdma_rxe] rxe_requester+0x1efd/0x58c0 [rdma_rxe] rxe_do_task+0x130/0x230 [rdma_rxe] rxe_post_send+0x998/0x1860 [rdma_rxe] ib_uverbs_post_send+0xd5f/0x1220 [ib_uverbs] ib_uverbs_write+0x847/0xc80 [ib_uverbs] vfs_write+0x1c5/0x840 ksys_write+0x176/0x1d0 do_syscall_64+0x3f/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/11e7b553f3a6f5371c6bb3f57c494bb52b88af99.1620711734.git.leonro@nvidia.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-05-17 13:52:47 -03:00
Leon Romanovsky	67f29896fd	RDMA/rxe: Clear all QP fields if creation failed rxe_qp_do_cleanup() relies on valid pointer values in QP for the properly created ones, but in case rxe_qp_from_init() failed it was filled with garbage and caused tot the following error. refcount_t: underflow; use-after-free. WARNING: CPU: 1 PID: 12560 at lib/refcount.c:28 refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Modules linked in: CPU: 1 PID: 12560 Comm: syz-executor.4 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:refcount_warn_saturate+0x1d1/0x1e0 lib/refcount.c:28 Code: e9 db fe ff ff 48 89 df e8 2c c2 ea fd e9 8a fe ff ff e8 72 6a a7 fd 48 c7 c7 e0 b2 c1 89 c6 05 dc 3a e6 09 01 e8 ee 74 fb 04 <0f> 0b e9 af fe ff ff 0f 1f 84 00 00 00 00 00 41 56 41 55 41 54 55 RSP: 0018:ffffc900097ceba8 EFLAGS: 00010286 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 RDX: 0000000000040000 RSI: ffffffff815bb075 RDI: fffff520012f9d67 RBP: 0000000000000003 R08: 0000000000000000 R09: 0000000000000000 R10: ffffffff815b4eae R11: 0000000000000000 R12: ffff8880322a4800 R13: ffff8880322a4940 R14: ffff888033044e00 R15: 0000000000000000 FS: 00007f6eb2be3700(0000) GS:ffff8880b9d00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fdbe5d41000 CR3: 000000001d181000 CR4: 00000000001506e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: __refcount_sub_and_test include/linux/refcount.h:283 [inline] __refcount_dec_and_test include/linux/refcount.h:315 [inline] refcount_dec_and_test include/linux/refcount.h:333 [inline] kref_put include/linux/kref.h:64 [inline] rxe_qp_do_cleanup+0x96f/0xaf0 drivers/infiniband/sw/rxe/rxe_qp.c:805 execute_in_process_context+0x37/0x150 kernel/workqueue.c:3327 rxe_elem_release+0x9f/0x180 drivers/infiniband/sw/rxe/rxe_pool.c:391 kref_put include/linux/kref.h:65 [inline] rxe_create_qp+0x2cd/0x310 drivers/infiniband/sw/rxe/rxe_verbs.c:425 _ib_create_qp drivers/infiniband/core/core_priv.h:331 [inline] ib_create_named_qp+0x2ad/0x1370 drivers/infiniband/core/verbs.c:1231 ib_create_qp include/rdma/ib_verbs.h:3644 [inline] create_mad_qp+0x177/0x2d0 drivers/infiniband/core/mad.c:2920 ib_mad_port_open drivers/infiniband/core/mad.c:3001 [inline] ib_mad_init_device+0xd6f/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:717 enable_device_and_get+0x1cd/0x3b0 drivers/infiniband/core/device.c:1331 ib_register_device drivers/infiniband/core/device.c:1413 [inline] ib_register_device+0x7c7/0xa50 drivers/infiniband/core/device.c:1365 rxe_register_device+0x3d5/0x4a0 drivers/infiniband/sw/rxe/rxe_verbs.c:1147 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:503 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x550 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x36d/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2ee/0x430 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1312 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1338 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1927 sock_sendmsg_nosec net/socket.c:654 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:674 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2350 ___sys_sendmsg+0xf3/0x170 net/socket.c:2404 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2433 do_syscall_64+0x3a/0xb0 arch/x86/entry/common.c:47 entry_SYSCALL_64_after_hwframe+0x44/0xae Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/7bf8d548764d406dbbbaf4b574960ebfd5af8387.1620717918.git.leonro@nvidia.com Reported-by: syzbot+36a7f280de4e11c6f04e@syzkaller.appspotmail.com Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-05-11 16:58:14 -03:00
Bob Pearson	45062f4415	RDMA/rxe: Fix a bug in rxe_fill_ip_info() Fix a bug in rxe_fill_ip_info() which was attempting to convert from RDMA_NETWORK_XXX to RXE_NETWORK_XXX. .._IPV6 should have mapped to .._IPV6 not .._IPV4. Fixes: `edebc8407b` ("RDMA/rxe: Fix small problem in network_type patch") Link: https://lore.kernel.org/r/20210421035952.4892-1-rpearson@hpe.com Suggested-by: Frank Zago <frank.zago@hpe.com> Signed-off-by: Bob Pearson <rpearson@hpe.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-21 16:09:04 -03:00
Bob Pearson	ea49225189	RDMA/rxe: Fix missing acks from responder All responder errors from request packets that do not consume a receive WQE fail to generate acks for RC QPs. This patch corrects this behavior by making the flow follow the same path as request packets that do consume a WQE after the completion. Link: https://lore.kernel.org/r/20210402001016.3210-1-rpearson@hpe.com Link: https://lore.kernel.org/linux-rdma/1a7286ac-bcea-40fb-2267-480134dd301b@gmail.com/ Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-04-08 15:59:28 -03:00
Kamal Heib	b1f27f688f	RDMA/rxe: Remove rxe_dma_device declaration The function isn't implemented - delete the declaration. Fixes: `a9d2e9ae95` ("RDMA/siw,rxe: Make emulated devices virtual in the device tree") Link: https://lore.kernel.org/r/20210331102043.691950-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-31 14:45:50 -03:00
Bob Pearson	364e282c4f	RDMA/rxe: Split MEM into MR and MW In the original rxe implementation it was intended to use a common object to represent MRs and MWs but they are different enough to separate these into two objects. This allows replacing the mem name with mr for MRs which is more consistent with the style for the other objects and less likely to be confusing. This is a long patch that mostly changes mem to mr where it makes sense and adds a new rxe_mw struct. Link: https://lore.kernel.org/r/20210325212425.2792-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Acked-by: Zhu Yanjun <zyjzyj2000@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-30 17:11:30 -03:00
Mark Bloch	1fb7f8973f	RDMA: Support more than 255 rdma ports Current code uses many different types when dealing with a port of a RDMA device: u8, unsigned int and u32. Switch to u32 to clean up the logic. This allows us to make (at least) the core view consistent and use the same type. Unfortunately not all places can be converted. Many uverbs functions expect port to be u8 so keep those places in order not to break UAPIs. HW/Spec defined values must also not be changed. With the switch to u32 we now can support devices with more than 255 ports. U32_MAX is reserved to make control logic a bit easier to deal with. As a device with U32_MAX ports probably isn't going to happen any time soon this seems like a non issue. When a device with more than 255 ports is created uverbs will report the RDMA device as having 255 ports as this is the max currently supported. The verbs interface is not changed yet because the IBTA spec limits the port size in too many places to be u8 and all applications that relies in verbs won't be able to cope with this change. At this stage, we are extending the interfaces that are using vendor channel solely Once the limitation is lifted mlx5 in switchdev mode will be able to have thousands of SFs created by the device. As the only instance of an RDMA device that reports more than 255 ports will be a representor device and it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other ULPs aren't effected by this change and their sysfs/interfaces that are exposes to userspace can remain unchanged. While here cleanup some alignment issues and remove unneeded sanity checks (mainly in rdmavt), Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org Signed-off-by: Mark Bloch <mbloch@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-26 09:31:21 -03:00
Bob Pearson	545c4ab463	RDMA/rxe: Fix errant WARN_ONCE in rxe_completer() In rxe_comp.c in rxe_completer() the function free_pkt() did not clear skb which triggered a warning at 'done:' and could possibly at 'exit:'. The WARN_ONCE() calls are not actually needed. The call to free_pkt() is moved to the end to clearly show that all skbs are freed. Fixes: `899aba891c` ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-05 14:15:22 -04:00
Bob Pearson	5e4a7ccc96	RDMA/rxe: Fix extra deref in rxe_rcv_mcast_pkt() rxe_rcv_mcast_pkt() dropped a reference to ib_device when no error occurred causing an underflow on the reference counter. This code is cleaned up to be clearer and easier to read. Fixes: `899aba891c` ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-05 14:15:18 -04:00
Bob Pearson	21e27ac82d	RDMA/rxe: Fix missed IB reference counting in loopback When the noted patch below extending the reference taken by rxe_get_dev_from_net() in rxe_udp_encap_recv() until each skb is freed it was not matched by a reference in the loopback path resulting in underflows. Fixes: `899aba891c` ("RDMA/rxe: Fix FIXME in rxe_udp_encap_recv()") Link: https://lore.kernel.org/r/20210304192048.2958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-05 14:11:02 -04:00
Julian Braha	475f23b8c6	RDMA/rxe: Fix missing kconfig dependency on CRYPTO When RDMA_RXE is enabled and CRYPTO is disabled, Kbuild gives the following warning: WARNING: unmet direct dependencies detected for CRYPTO_CRC32 Depends on [n]: CRYPTO [=n] Selected by [y]: - RDMA_RXE [=y] && (INFINIBAND_USER_ACCESS [=y] \|\| !INFINIBAND_USER_ACCESS [=y]) && INET [=y] && PCI [=y] && INFINIBAND [=y] && INFINIBAND_VIRT_DMA [=y] This is because RDMA_RXE selects CRYPTO_CRC32, without depending on or selecting CRYPTO, despite that config option being subordinate to CRYPTO. Fixes: `cee2688e3c` ("IB/rxe: Offload CRC calculation when possible") Signed-off-by: Julian Braha <julianbraha@gmail.com> Link: https://lore.kernel.org/r/21525878.NYvzQUHefP@ubuntu-mate-laptop Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-03-01 14:46:31 -04:00
Jason Gunthorpe	7289e26f39	Linux 5.11 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmAppPgeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGeXYH/imZPBd4A1jIMehN 5HV2A53Z+MXmmaMuGj9X1KV6vsf55/xB+IhOoFdtRAIsO8c2yYSCO8i4+4R0XfYA +/YFJeq672rojQnmh6XbpR8dugaAV7CUHy6n7KDsyvtT6EOCpwFSwkOb4X3tBRX6 TlYgm2d/xgV/wRHSgLVugK0MdFCLMAnyb7mkPfar9QrMgG1BiDKLq07xmwnS23On TkqpJ9yZ/rJpUrrUqQYPShSO/FmA+fSfWs0CDv7EIrJ40LUScD6PZxSHWTIHtjLk E4jFda6wuqLRVWsBwaBzUIdD0zk7X5quHRzEpbC5ga16SK6yrWvE5YJJXCguIEuZ f3FMRYs= =CAjn -----END PGP SIGNATURE----- Merge tag 'v5.11' into rdma.git for-next Linux 5.11 Merged to resolve conflicts with RDMA rc commits - drivers/infiniband/sw/rxe/rxe_net.c The final logic is to call rxe_get_dev_from_net() again with the master netdev if the packet was rx'd on a vlan. To keep the elimination of the local variables requires a trivial edit to the code in -rc Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-18 11:19:29 -04:00
Bob Pearson	bf139b58af	RDMA/rxe: Remove unused pkt->offset The pkt->offset field is never used except to assign it to 0. But it adds lots of unneeded code. This patch removes the field and related code. This causes a measurable improvement in performance. Link: https://lore.kernel.org/r/20210211210455.3274-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-16 14:42:59 -04:00
Bob Pearson	086f580c01	RDMA/rxe: Cleanup init_send_wqe This patch changes the type of init_send_wqe in rxe_verbs.c to void since it always returns 0. It also separates out the code that copies inline data into the send wqe as copy_inline_data_to_wqe(). Link: https://lore.kernel.org/r/20210206002437.2756-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-08 20:43:11 -04:00
Bob Pearson	dc78074a80	RDMA/rxe: Fix minor coding style issues checkpatch -f found 3 warnings in RDMA/rxe 1. a missing space following switch 2. return followed by else 3. use of strlcpy() instead of strscpy(). This patch fixes each of these. In ... } elseif (...) { ... return 0; } else ... The middle block can be safely moved since it is completely independent of the other code. Link: https://lore.kernel.org/r/20210205230525.49068-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-08 20:42:57 -04:00
Bob Pearson	899aba891c	RDMA/rxe: Fix FIXME in rxe_udp_encap_recv() rxe_udp_encap_recv() drops the reference to rxe->ib_dev taken by rxe_get_dev_from_net() which should be held until each received skb is freed. This patch moves the calls to ib_device_put() to each place a received skb is freed. It also takes references to the ib_device for each cloned skb created to process received multicast packets. Fixes: `4c173f596b` ("RDMA/rxe: Use ib_device_get_by_netdev() instead of open coding") Link: https://lore.kernel.org/r/20210128233318.2591-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-08 15:33:51 -04:00
Bob Pearson	5120bf0a5f	RDMA/rxe: Correct skb on loopback path rxe_net.c sends packets at the IP layer with skb->data pointing at the IP header but receives packets from a UDP tunnel with skb->data pointing at the UDP header. On the loopback path this was not correctly accounted for. This patch corrects for this by using sbk_pull() to strip the IP header from the skb on received packets. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210128182301.16859-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 13:45:21 -04:00
Bob Pearson	8fc1b7027f	RDMA/rxe: Fix coding error in rxe_rcv_mcast_pkt rxe_rcv_mcast_pkt() in rxe_recv.c can leak SKBs in error path code. The loop over the QPs attached to a multicast group creates new cloned SKBs for all but the last QP in the list and passes the SKB and its clones to rxe_rcv_pkt() for further processing. Any QPs that do not pass some checks are skipped. If the last QP in the list fails the tests the SKB is leaked. This patch checks if the SKB for the last QP was used and if not frees it. Also removes a redundant loop invariant assignment. Fixes: `8700e3e7c4` ("Soft RoCE driver") Fixes: `71abf20b28` ("RDMA/rxe: Handle skb_clone() failure in rxe_recv.c") Link: https://lore.kernel.org/r/20210128174752.16128-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 13:44:22 -04:00
Bob Pearson	e328197423	RDMA/rxe: Remove useless code in rxe_recv.c In check_keys() in rxe_recv.c if ((...) && pkt->mask) { ... } always has pkt->mask non zero since in rxe_udp_encap_recv() pkt->mask is always set to RXE_GRH_MASK (!= 0). There is no obvious reason for this additional test and the original intent is lost. This patch simplifies the expression. Fixes: `8b7b59d030` ("IB/rxe: remove redudant qpn check") Link: https://lore.kernel.org/r/20210127224203.2812-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 13:43:30 -04:00
Bob Pearson	7d9ae80e31	RDMA/rxe: Fix coding error in rxe_recv.c check_type_state() in rxe_recv.c is written as if the type bits in the packet opcode were a bit mask which is not correct. This patch corrects this code to compare all 3 type bits to the required type. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20210127214500.3707-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-02-05 13:43:00 -04:00
Bob Pearson	ce2063e387	RDMA/rxe: Replace missing rxe_pool_get_index_locked One of the pool APIs for when caller is holding lock was not defined but is declared in rxe_pool.h. This patch adds the definition. Link: https://lore.kernel.org/r/20210125211641.2694-7-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:56 -04:00
Bob Pearson	eae5f0642e	RDMA/rxe: Remove unneeded pool->state rxe_pool.c uses the field pool->state to mark a pool as invalid when it is shut down and checks it in several pool APIs to verify that the pool has not been shut down. This is unneeded because the pools are not marked invalid unless the entire driver is being removed at which point no functional APIs should or could be executing. This patch removes this field and associated code. Link: https://lore.kernel.org/r/20210125211641.2694-6-rpearson@hpe.com Suggested-by: zyjzyj2000@gmail.c Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:56 -04:00
Bob Pearson	6cde3e8ec1	RDMA/rxe: Remove references to ib_device and pool rxe_pool.c takes references to the pool and ib_device structs for each object allocated and also keeps an atomic num_elem count in each pool. This is more work than is needed. Pool allocation is only called from verbs APIs which already have references to ib_device and pools are only diasbled when the driver is removed so no protection of the pool addresses are needed. The elem count is used to warn if elements are still present in a pool when it is cleaned up which is useful. This patch eliminates the references to the ib_device and pool structs. Link: https://lore.kernel.org/r/20210125211641.2694-5-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:56 -04:00
Bob Pearson	4276fd0ddd	RDMA/rxe: Remove RXE_POOL_ATOMIC rxe_alloc() used the RXE_POOL_ATOMIC flag in rxe_type_info to select GFP_ATOMIC in calls to kzalloc(). This was intended to handle cases where an object could be created in interrupt context. This no longer occurs since allocating those objects has moved into the core so this flag is not necessary. An incorrect use of this flag was still present for rxe_mc_elem objects and is removed. Link: https://lore.kernel.org/r/20210125211641.2694-4-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:56 -04:00
Bob Pearson	88cc77eb8b	RDMA/rxe: Fix misleading comments and names The names and comments of the 'unlocked' pool APIs are very misleading and not what was intended. This patch replaces 'rxe_xxx_nl' with 'rxe_xxx_locked' with comments indicating that the caller is expected to hold the rxe pool lock. Link: https://lore.kernel.org/r/20210125211641.2694-3-rpearson@hpe.com Reported-by: Hillf Danton <hdanton@sina.com> Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:55 -04:00
Bob Pearson	c4369575b2	RDMA/rxe: Fix bug in rxe_alloc() A recent patch which added an 'unlocked' version of rxe_alloc introduced a bug causing kzalloc(..., GFP_KERNEL) to be called while holding a spin lock. This patch corrects that error. rxe_alloc_nl() should always be called while holding the pool->pool_lock so the 2nd argument to kzalloc there should be GFP_ATOMIC. rxe_alloc() prior to the change only locked the code around checking that pool->state is RXE_POOL_STATE_VALID to avoid races between working threads and a thread shutting down the rxe driver. This patch reverts rxe_alloc() to this behavior so the lock is not held when kzalloc() is called. Link: https://lore.kernel.org/r/20210125211641.2694-2-rpearson@hpe.com Reported-by: syzbot+ec2fd72374785d0e558e@syzkaller.appspotmail.com Fixes: `3853c35e24` ("RDMA/rxe: Add unlocked versions of pool APIs") Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-28 15:29:55 -04:00
Martin Wilck	f1b0a8ea9f	Revert "RDMA/rxe: Remove VLAN code leftovers from RXE" This reverts commit `b2d2440430`. It's true that creating rxe on top of 802.1q interfaces doesn't work. Thus, commit `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") was absolutely correct. But `b2d2440430` was incorrect assuming that with this change, RDMA and VLAN don't work togehter at all. It just has to be set up differently. Rather than creating rxe on top of the VLAN interface, rxe must be created on top of the physical interface. RDMA then works just fine through VLAN interfaces on top of that physical interface, via the "upper device" logic. This is hard to see in the rxe logic because it never talks about vlan, but instead rxe carefully selects upper vlan netdevices when working with packets which in turn imply certain vlan tagging. This is all done correctly and interacts with the gid table with VLAN support the same as real HW does. `b2d2440430` broke this setup deliberately and should thus be reverted. Also, `b2d2440430` removed rxe_dma_device(), so adapt the revert to discard that hunk. Fixes: `b2d2440430` ("RDMA/rxe: Remove VLAN code leftovers from RXE") Link: https://lore.kernel.org/r/20210120161913.7347-1-mwilck@suse.com Signed-off-by: Martin Wilck <mwilck@suse.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-20 13:29:28 -04:00
Bob Pearson	8a48ac7f6c	RDMA/rxe: Fix race in rxe_mcast.c Fix a race in rxe_mcast.c that occurs when two QPs try at the same time to attach a multicast address. Both QPs lookup the mgid address in a pool of multicast groups and if they do not find it create a new group elem. Fix this by locking the lookup/alloc/add key sequence and using the unlocked APIs added in this patch set. Link: https://lore.kernel.org/r/20201216231550.27224-8-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:39 -04:00
Bob Pearson	3853c35e24	RDMA/rxe: Add unlocked versions of pool APIs The existing pool APIs use the rw_lock pool_lock to protect critical sections that change the pool state. This does not correctly implement a typical sequence like the following elem = <lookup key in pool> if found use elem else elem = <alloc new elem in pool> <add key to elem> Which is racy if multiple threads are attempting to perform this at the same time. We want the second thread to use the elem created by the first thread not create two equivalent elems. This patch adds new APIs that are the same as existing APIs but do not take the pool_lock. A caller can then take the lock and perform a sequence of pool operations and then release the lock. Link: https://lore.kernel.org/r/20201216231550.27224-7-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:39 -04:00
Bob Pearson	91a42c5bec	RDMA/rxe: Make add/drop key/index APIs type safe Replace 'void ' parameters with 'struct rxe_pool_entry ' and use a macro to allow: rxe_add_index, rxe_drop_index, rxe_add_key, rxe_drop_key and rxe_add_to_pool APIs to be type safe against changing the position of pelem in the objects. Link: https://lore.kernel.org/r/20201216231550.27224-6-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:38 -04:00
Bob Pearson	2622aa718a	RDMA/rxe: Make pool lookup and alloc APIs type safe The allocate, lookup index, lookup key and cleanup routines in rxe_pool.c currently are not type safe against relocating the pelem field in the objects. Planned changes to move allocation of objects into rdma-core make addressing this a requirement. Use the elem_offset field in rxe_type_info make these APIs safe against moving the pelem field. Link: https://lore.kernel.org/r/20201216231550.27224-5-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:38 -04:00
Bob Pearson	b994d49ef4	RDMA/rxe: Add elem_offset field to rxe_type_info The rxe verbs objects each include an rdma-core object 'ib_xxx' and a rxe_pool_entry 'pelem' in addition to rxe specific data. Originally these all had pelem first and ib_xxx second. Currently about half have ib_xxx first and half have pelem first. Saving the offset of the pelem field in rxe_type info will enable making the rxe_pool APIs type safe as the pelem field continues to vary. Link: https://lore.kernel.org/r/20201216231550.27224-4-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:38 -04:00
Bob Pearson	c06ee3a014	RDMA/rxe: Let pools support both keys and indices Allow both indices and keys to exist for objects in pools. Previously you were limited to one or the other. This is required for later implementing rxe memory windows. Link: https://lore.kernel.org/r/20201216231550.27224-3-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:38 -04:00
Bob Pearson	1d11c1b7f9	RDMA/rxe: Remove unneeded RXE_POOL_ATOMIC flag Remove RXE_POOL_ATOMIC flag from rxe_type_info for AH objects. These objects are now allocated by rdma/core so there is no further reason for this flag. Link: https://lore.kernel.org/r/20201216231550.27224-2-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:35:38 -04:00
Xiao Yang	bad07664a5	RDMA/rxe: Add check for supported QP types Current rdma_rxe only supports five QP types, attempting to create any others should return an error - the type check was missed. Link: https://lore.kernel.org/r/20201216071755.149449-2-yangx.jy@cn.fujitsu.com Signed-off-by: Xiao Yang <yangx.jy@cn.fujitsu.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2021-01-12 16:11:35 -04:00
Bob Pearson	d21a1240f5	RDMA/rxe: Use acquire/release for memory ordering Change work and completion queues to use smp_load_acquire() and smp_store_release() to synchronize between driver and users. This commit goes with a matching series of commits in the rxe user space provider. Link: https://lore.kernel.org/r/20201210174258.5234-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-12-11 19:57:48 -04:00
Jason Gunthorpe	a9d2e9ae95	RDMA/siw,rxe: Make emulated devices virtual in the device tree This moves siw and rxe to be virtual devices in the device tree: lrwxrwxrwx 1 root root 0 Nov 6 13:55 /sys/class/infiniband/rxe0 -> ../../devices/virtual/infiniband/rxe0/ Previously they were trying to parent themselves to the physical device of their attached netdev, which doesn't make alot of sense. My hope is this will solve some weird syzkaller hits related to sysfs as it could be possible that the parent of a netdev is another netdev, eg under bonding or some other syzkaller found netdev configuration. Nesting a ib_device under anything but a physical device is going to cause inconsistencies in sysfs during destructions. Link: https://lore.kernel.org/r/0-v1-dcbfc68c4b4a+d6-virtual_dev_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-23 16:14:31 -04:00
Christoph Hellwig	5a7a9e038b	RDMA/core: remove use of dma_virt_ops Use the ib_dma_* helpers to skip the DMA translation instead. This removes the last user if dma_virt_ops and keeps the weird layering violation inside the RDMA core instead of burderning the DMA mapping subsystems with it. This also means the software RDMA drivers now don't have to mess with DMA parameters that are not relevant to them at all, and that in the future we can use PCI P2P transfers even for software RDMA, as there is no first fake layer of DMA mapping that the P2P DMA support. Link: https://lore.kernel.org/r/20201106181941.1878556-8-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-17 15:22:07 -04:00
Jason Gunthorpe	bf3b7b7ba9	Merge branch 'for-rc' into rdma.git From https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git The rc RDMA branch is needed due to dependencies on the next patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-17 15:20:26 -04:00
Christoph Hellwig	b1e678bf29	RMDA/sw: Don't allow drivers using dma_virt_ops on highmem configs dma_virt_ops requires that all pages have a kernel virtual address. Introduce a INFINIBAND_VIRT_DMA Kconfig symbol that depends on !HIGHMEM and make all three drivers depend on the new symbol. Also remove the ARCH_DMA_ADDR_T_64BIT dependency, which has been obsolete since commit `4965a68780` ("arch: define the ARCH_DMA_ADDR_T_64BIT config symbol in lib/Kconfig") Fixes: `551199aca1` ("lib/dma-virt: Add dma_virt_ops") Link: https://lore.kernel.org/r/20201106181941.1878556-2-hch@lst.de Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-12 13:27:41 -04:00
Zhu Yanjun	b2d2440430	RDMA/rxe: Remove VLAN code leftovers from RXE Since the commit `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") does not permit rxe on top of vlan device, all the stuff related with vlan should be removed. Fixes: `fd49ddaf7e` ("RDMA/rxe: prevent rxe creation on top of vlan interface") Link: https://lore.kernel.org/r/1604326422-18625-1-git-send-email-yanjunz@nvidia.com Signed-off-by: Zhu Yanjun <yanjunz@nvidia.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-12 11:38:27 -04:00
Jason Gunthorpe	5c4193669b	RDMA/rxe,siw: Restore uverbs_cmd_mask IB_USER_VERBS_CMD_POST_SEND These two drivers open code the call to POST_SEND and do not use the rdma-core wrapper to do it, thus their usages was missed during the audit. Both drivers use this as a doorbell to signal the kernel to start DMA. Fixes: `628c02bf38` ("RDMA: Remove uverbs cmds from drivers that don't use them") Link: https://lore.kernel.org/r/0-v1-4608c5610afa+fb-uverbs_cmd_post_send_fix_jgg@nvidia.com Reported-by: Bob Pearson <rpearsonhpe@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-02 15:32:06 -04:00
Parav Pandit	683a9c7ed8	RDMA: Fix software RDMA drivers for dma mapping error The commit `f959dcd6dd` ("dma-direct: Fix potential NULL pointer dereference") made dma_mask as mandetory field to be setup even for dma_virt_ops based dma devices. The commit in the fixes tag omitted setting up the dma_mask on virtual devices triggering the below trace when they were combined during the merge window. Fix it by setting empty DMA MASK for software based RDMA devices. WARNING: CPU: 1 PID: 8488 at kernel/dma/mapping.c:149 dma_map_page_attrs+0x493/0x700 CPU: 1 PID: 8488 Comm: syz-executor144 Not tainted 5.9.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 RIP: 0010:dma_map_page_attrs+0x493/0x700 kernel/dma/mapping.c:149 Trace: dma_map_single_attrs include/linux/dma-mapping.h:279 [inline] ib_dma_map_single include/rdma/ib_verbs.h:3967 [inline] ib_mad_post_receive_mads+0x23f/0xd60 drivers/infiniband/core/mad.c:2715 ib_mad_port_start drivers/infiniband/core/mad.c:2862 [inline] ib_mad_port_open drivers/infiniband/core/mad.c:3016 [inline] ib_mad_init_device+0x72b/0x1400 drivers/infiniband/core/mad.c:3092 add_client_context+0x405/0x5e0 drivers/infiniband/core/device.c:680 enable_device_and_get+0x1d5/0x3c0 drivers/infiniband/core/device.c:1301 ib_register_device drivers/infiniband/core/device.c:1376 [inline] ib_register_device+0x7a7/0xa40 drivers/infiniband/core/device.c:1335 rxe_register_device+0x46d/0x570 drivers/infiniband/sw/rxe/rxe_verbs.c:1182 rxe_add+0x12fe/0x16d0 drivers/infiniband/sw/rxe/rxe.c:247 rxe_net_add+0x8c/0xe0 drivers/infiniband/sw/rxe/rxe_net.c:507 rxe_newlink drivers/infiniband/sw/rxe/rxe.c:269 [inline] rxe_newlink+0xb7/0xe0 drivers/infiniband/sw/rxe/rxe.c:250 nldev_newlink+0x30e/0x540 drivers/infiniband/core/nldev.c:1555 rdma_nl_rcv_msg+0x367/0x690 drivers/infiniband/core/netlink.c:195 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0x2f2/0x440 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1304 [inline] netlink_unicast+0x533/0x7d0 net/netlink/af_netlink.c:1330 netlink_sendmsg+0x856/0xd90 net/netlink/af_netlink.c:1919 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:671 ____sys_sendmsg+0x6e8/0x810 net/socket.c:2353 ___sys_sendmsg+0xf3/0x170 net/socket.c:2407 __sys_sendmsg+0xe5/0x1b0 net/socket.c:2440 do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x443699 Link: https://lore.kernel.org/r/20201030093803.278830-1-parav@nvidia.com Reported-by: syzbot+34dc2fea3478e659af01@syzkaller.appspotmail.com Fixes: `e0477b34d9` ("RDMA: Explicitly pass in the dma_device to ib_register_device") Signed-off-by: Parav Pandit <parav@nvidia.com> Tested-by: Guoqing Jiang <guoqing.jiang@cloud.ionos.com> Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Zhu Yanjun <yanjunz@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-11-02 15:14:56 -04:00
Bob Pearson	bb3ab2979f	RDMA/rxe: Compute PSN windows correctly The code which limited the number of unacknowledged PSNs was incorrect. The PSNs are limited to 24 bits and wrap back to zero from 0x00ffffff. The test was computing a 32 bit value which wraps at 32 bits so that qp->req.psn can appear smaller than the limit when it is actually larger. Replace '>' test with psn_compare which is used for other PSN comparisons and correctly handles the 24 bit size. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20201013170741.3590-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-28 10:34:20 -03:00
Bob Pearson	eeed696507	RDMA/rxe: Remove unused RXE_MR_TYPE_FMR This is a left over from the past. It is no longer used. Link: https://lore.kernel.org/r/20201009165112.271143-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 20:00:27 -03:00
Joe Perches	1c7fd72687	RDMA: Convert sysfs device * show functions to use sysfs_emit() Done with cocci script: @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... return - sprintf(buf, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... return - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... return - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... return - strcpy(buf, chr); + sysfs_emit(buf, chr); ...> } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... len = - sprintf(buf, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... len = - snprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... len = - scnprintf(buf, PAGE_SIZE, + sysfs_emit(buf, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; identifier len; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char buf) { <... - len += scnprintf(buf + len, PAGE_SIZE - len, + len += sysfs_emit_at(buf, len, ...); ...> return len; } @@ identifier d_show; identifier dev, attr, buf; expression chr; @@ ssize_t d_show(struct device dev, struct device_attribute attr, char *buf) { ... - strcpy(buf, chr); - return strlen(buf); + return sysfs_emit(buf, chr); } Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com Signed-off-by: Joe Perches <joe@perches.com> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:53:21 -03:00
Jason Gunthorpe	676a80adba	RDMA: Remove AH from uverbs_cmd_mask Drivers that need a uverbs AH should instead set the create_user_ah() op similar to reg_user_mr(). MODIFY_AH and QUERY_AH cmds were never implemented so are just deleted. Link: https://lore.kernel.org/r/11-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:28:00 -03:00
Jason Gunthorpe	628c02bf38	RDMA: Remove uverbs cmds from drivers that don't use them Allowing userspace to invoke these commands is probably going to crash these drivers as they are not tested and not expecting to use them on a user object. For example pvrdma touches cq->ring_state which is not initialized for user QPs. These commands are effected: - IB_USER_VERBS_CMD_REQ_NOTIFY_CQ is ibv_cmd_req_notify_cq() in rdma-core, only hfi1, ipath and rxe calls it. - IB_USER_VERBS_CMD_POLL_CQ is ibv_cmd_poll_cq() in rdma-core, only ipath and hfi1 calls it. - IB_USER_VERBS_CMD_POST_SEND/RECV is ibv_cmd_post_send/recv() in rdma-core, only ipath and hfi1 call them. - IB_USER_VERBS_CMD_POST_SRQ_RECV is ibv_cmd_post_srq_recv() in rdma-core, only ipath and hfi1 calls it. - IB_USER_VERBS_CMD_PEEK_CQ isn't even implemented anywhere - IB_USER_VERBS_CMD_CREATE/DESTROY_AH is ibv_cmd_create/destroy_ah() in rdma-core, only bnxt_re, efa, hfi1, ipath, mlx5, orcrdma, and rxe call it. Link: https://lore.kernel.org/r/10-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:28:00 -03:00
Jason Gunthorpe	1f11a7610e	RDMA: Check create_flags during create_qp Each driver should check that the QP attrs create_flags is supported. Unfortuantely when create_flags was added to the QP attrs the drivers were not updated. uverbs_ex_cmd_mask was used to block it - even though kernel drivers use these flags too. Check that flags is zero in all drivers that don't use it, remove IB_USER_VERBS_EX_CMD_CREATE_QP from uverbs_ex_cmd_mask. Fix the error code to be EOPNOTSUPP. Link: https://lore.kernel.org/r/8-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:59 -03:00
Jason Gunthorpe	1c407cb5d7	RDMA: Check flags during create_cq Each driver should check that the CQ attrs is supported. Unfortuantely when flags was added to the CQ attrs the drivers were not updated, uverbs_ex_cmd_mask was used to block it. This was missed when create CQ was converted to ioctl, so non-zero flags could have been passed into drivers. Check that flags is zero in all drivers that don't use it, remove IB_USER_VERBS_EX_CMD_CREATE_CQ from uverbs_ex_cmd_mask. Fixes: `41b2a71fc8` ("IB/uverbs: Move ioctl path of create_cq and destroy_cq to a new file") Link: https://lore.kernel.org/r/7-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:59 -03:00
Jason Gunthorpe	26e990badd	RDMA: Check attr_mask during modify_qp Each driver should check that it can support the provided attr_mask during modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block modify_qp_ex because the driver didn't check RATE_LIMIT. Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:58 -03:00
Jason Gunthorpe	652caba5b5	RDMA: Check srq_type during create_srq uverbs was blocking srq_types the driver doesn't support based on the CREATE_XSRQ cmd_mask. Fix all drivers to check for supported srq_types during create_srq and move CREATE_XSRQ to the core code. Link: https://lore.kernel.org/r/5-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:58 -03:00
Jason Gunthorpe	44ce37bc8b	RDMA: Move more uverbs_cmd_mask settings to the core These functions all depend on the driver providing a specific op: - REREG_MR is rereg_user_mr(). bnxt_re set this without providing the op - ATTACH/DEATCH_MCAST is attach_mcast()/detach_mcast(). usnic set this without providing the op - OPEN_QP doesn't involve the driver but requires a XRCD. qedr provides xrcd but forgot to set it, usnic doesn't provide XRCD but set it anyhow. - OPEN/CLOSE_XRCD are the ops alloc_xrcd()/dealloc_xrcd() - CREATE_SRQ/DESTROY_SRQ are the ops create_srq()/destroy_srq() - QUERY/MODIFY_SRQ is op query_srq()/modify_srq(). hns sets this but sometimes supplies a NULL op. - RESIZE_CQ is op resize_cq(). bnxt_re sets this boes doesn't supply an op - ALLOC/DEALLOC_MW is alloc_mw()/dealloc_mw(). cxgb4 provided an (now deleted) implementation but no userspace All drivers were checked that no drivers provide the op without also setting uverbs_cmd_mask so this should have no functional change. Link: https://lore.kernel.org/r/4-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:58 -03:00
Jason Gunthorpe	c074bb1e30	RDMA: Remove elements in uverbs_cmd_mask that all drivers set This is a step toward eliminating uverbs_cmd_mask. Preset this list in the core code. Only the op reg_user_mr wasn't already being required from the drivers. Link: https://lore.kernel.org/r/3-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:27:57 -03:00
Bob Pearson	edebc8407b	RDMA/rxe: Fix small problem in network_type patch The patch referenced below has a typo that results in using the wrong L2 header size for outbound traffic. (V4 <-> V6). It also breaks kernel-side RC traffic because they use AVs that use RDMA_NETWORK_XXX enums instead of RXE_NETWORK_TYPE_XXX enums. Fix this by transcoding between these enum types. Fixes: `e0d696d201` ("RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI") Link: https://lore.kernel.org/r/20201016211343.22906-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-26 19:12:17 -03:00
Bob Pearson	71abf20b28	RDMA/rxe: Handle skb_clone() failure in rxe_recv.c If skb_clone() is unable to allocate memory for a new sk_buff this is not detected by the current code. Check for a NULL return and continue. This is similar to other errors in this loop over QPs attached to the multicast address and consistent with the unreliable UD transport. Fixes: `e7ec96fc79` ("RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt()") Addresses-Coverity-ID: 1497804: Null pointer dereferences (NULL_RETURNS) Link: https://lore.kernel.org/r/20201013184236.5231-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-16 13:57:55 -03:00
Jason Gunthorpe	e0d696d201	RDMA/rxe: Move the definitions for rxe_av.network_type to uAPI RXE was wrongly using an internal kernel enum as part of its uAPI, split this out into a dedicated uAPI enum just for RXE. It only uses the IPv4 and IPv6 values. This was exposed by changing the internal kernel enum definition which broke RXE. Fixes: `1c15b4f2a4` ("RDMA/core: Modify enum ib_gid_type and enum rdma_network_type") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-16 13:54:10 -03:00
Jason Gunthorpe	e0477b34d9	RDMA: Explicitly pass in the dma_device to ib_register_device The code in setup_dma_device has become rather convoluted, move all of this to the drivers. Drives now pass in a DMA capable struct device which will be used to setup DMA, or drivers must fully configure the ibdev for DMA and pass in NULL. Other than setting the masks in rvt all drivers were doing this already anyhow. mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for DMA based on their hardweare limits in: __mthca_init_one() dma_set_max_seg_size (1G) __mlx4_init_one() dma_set_max_seg_size (1G) mlx5_pci_init() set_dma_caps() dma_set_max_seg_size (2G) Other non software drivers (except usnic) were extended to UINT_MAX [1, 2] instead of 2G as was before. [1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ [2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/ Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Parav Pandit <parav@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-16 13:53:46 -03:00
Bob Pearson	de55412d02	RDMA/rxe: Fix bug rejecting all multicast packets Fix a bug in rxe_rcv() that causes all multicast packets to be dropped. Currently rxe_match_dgid() is called for each packet to verify that the destination IP address matches one of the entries in the port source GID table. This is incorrect for IP multicast addresses since they do not appear in the GID table. Add code to detect multicast addresses. Change function name to rxe_chk_dgid() which is clearer. Link: https://lore.kernel.org/r/20201008212753.265249-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-08 20:22:12 -03:00
Bob Pearson	e7ec96fc79	RDMA/rxe: Fix skb lifetime in rxe_rcv_mcast_pkt() The changes referenced below replaced sbk_clone)_ by taking additional references, passing the skb along and then freeing the skb. This deleted the packets before they could be processed and additionally passed bad data in each packet. Since pkt is stored in skb->cb changing pkt->qp changed it for all the packets. Replace skb_get() by sbk_clone() in rxe_rcv_mcast_pkt() for cases where multiple QPs are receiving multicast packets on the same address. Delete kfree_skb() because the packets need to live until they have been processed by each QP. They are freed later. Fixes: `86af617641` ("IB/rxe: remove unnecessary skb_clone") Fixes: `fe896ceb57` ("IB/rxe: replace refcount_inc with skb_get") Link: https://lore.kernel.org/r/20201008203651.256958-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-08 20:22:11 -03:00
Bob Pearson	1858d98b83	RDMA/rxe: Remove duplicate entries in struct rxe_mr Struct rxe_mem had pd, lkey and rkey values both in itself and in the struct ib_mr which is also included in rxe_mem. Delete these entries and replace references with the ones in ibmr.Add mr_pd, mr_lkey and mr_rkey macros which extract these values from mr. Link: https://lore.kernel.org/r/20201008212818.265303-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-10-08 20:07:50 -03:00
Jason Gunthorpe	5dee5872f8	Merge branch 'mlx5_active_speed' into rdma.git for-next Leon Romanovsky says: ==================== IBTA declares speed as 16 bits, but kernel stores it in u8. This series fixes in-kernel declaration while keeping external interface intact. ==================== Based on the mlx5-next branch at git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux due to dependencies. * branch 'mlx5_active_speed': RDMA: Fix link active_speed size RDMA/mlx5: Delete duplicated mlx5_ptys_width enum net/mlx5: Refactor query port speed functions	2020-09-18 10:31:45 -03:00
Linus Torvalds	b1df2a0783	RDMA second 5.9-rc pull request A number of driver bug fixes and a few recent regressions: - Several bug fixes for bnxt_re. Crashing, incorrect data reported, and corruption on new HW - Memory leak and crash in rxe - Fix sysfs corruption in rxe if the netdev name is too long - Fix a crash on error unwind in the new cq_pool code - Fix kobject panics in rtrs by working device lifetime properly - Fix a data corruption bug in iser target related to misaligned buffers -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl9atAYACgkQOG33FX4g mxqSdw//Qi29dnxzVGpsaO4/krd/VmI6NT6eNpgK7Nqx80DaCYer0JhtwZOUxHqK KbHIV9XB/f6BSI67c9ydYj4PNX6FpFnoUWQLvqZwip5VM7R6ifIVjm0ap1jCAUSS axDLFZOySIOYNhcZ5I+MtY/kxykKBjMteuMXdpBe4FwZ+XSmsC5KkfRH/+FUhjVG peL6aRVDv9TByH8w+iZE1wSmVrOphOE1C/jN5TyotQTmKe7IHoXJtkalosYHXFWw KZaaz52e4IYKVFl4HIcl6+FfPExhxsyfDtRHluvn+vzY/wFy1RZw6F0BZt7mioy5 J8R6w82xEe/SNugTGuvIzqXOymmy9H4CrG9pHy4NRMMzC28LGI7qHJgVhr/jZy8+ GPxR26cywDhPsd4XA2K3mvs7DVSoBUPYlIUnHdYjfBZl/ghColg9+XGyNv6pdrke Q7Kog5blcpOAahBX+ElBLvIZXk5oEk5W+3H/M0OeuVMQ/DrMtALrCnwpp4wDKVvO 9QuYfGgQ+25xbV9kwzckLGo5eedN3cRD/v4hcqvQUZo+9zLYZ/HZRMjpOdrscQ+I QL4FgpcLpOASKZ+bYjjpFxK3rNVTDT9CYJw4/hxEaOhxRhtAO1Q9mJdvJTK6dj09 oR9LPyefQkyKCAt+heWHKKkEYDiwT8U1SlR8STotg24VHIj6Rb4= =2DDd -----END PGP SIGNATURE----- Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma Pull rdma fixes from Jason Gunthorpe: "A number of driver bug fixes and a few recent regressions: - Several bug fixes for bnxt_re. Crashing, incorrect data reported, and corruption on new HW - Memory leak and crash in rxe - Fix sysfs corruption in rxe if the netdev name is too long - Fix a crash on error unwind in the new cq_pool code - Fix kobject panics in rtrs by working device lifetime properly - Fix a data corruption bug in iser target related to misaligned buffers" * tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: IB/isert: Fix unaligned immediate-data handling RDMA/rtrs-srv: Set .release function for rtrs srv device during device init RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx' RDMA/core: Fix reported speed and width RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address RDMA/bnxt_re: Restrict the max_gids to 256 RDMA/bnxt_re: Static NQ depth allocation RDMA/bnxt_re: Fix the qp table indexing RDMA/bnxt_re: Do not report transparent vlan from QP1 RDMA/mlx4: Read pkey table length instead of hardcoded value RDMA/rxe: Fix panic when calling kmem_cache_create() RDMA/rxe: Fix memleak in rxe_mem_init_user RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars RDMA/rtrs-srv: Replace device_register with device_initialize and device_add	2020-09-11 10:02:36 -07:00
Leon Romanovsky	43d781b9fa	RDMA: Allow fail of destroy CQ Like any other verbs objects, CQ shouldn't fail during destroy, but mlx5_ib didn't follow this contract with mixed IB verbs objects with DEVX. Such mix causes to the situation where FW and kernel are fully interdependent on the reference counting of each side. Kernel verbs and drivers that don't have DEVX flows shouldn't fail. Fixes: `e39afe3d6d` ("RDMA: Convert CQ allocations to be under core responsibility") Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 14:14:29 -03:00
Leon Romanovsky	119181d1d4	RDMA: Restore ability to fail on SRQ destroy In similar way to other IB objects, restore the ability to return error on SRQ destroy. Strictly speaking, this change is not necessary, and provided here to ensure a symmetrical interface like other destroy functions. Fixes: `68e326dea1` ("RDMA: Handle SRQ allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 14:14:24 -03:00
Leon Romanovsky	9a9ebf8cd7	RDMA: Restore ability to fail on AH destroy Like any other IB verbs objects, AH are refcounted by ib_core. The release of those objects are controlled by ib_core with promise that AH destroy can't fail. Being SW object for now, this change makes dealloc_ah() to behave like any other destroy IB flows. Fixes: `d345691471` ("RDMA: Handle AH allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 13:57:22 -03:00
Leon Romanovsky	91a7c58fce	RDMA: Restore ability to fail on PD deallocate The IB verbs objects are counted by the kernel and ib_core ensures that deallocate PD will success so it will be called once all other objects that depends on PD will be released. This is achieved by managing various reference counters on such objects. The mlx5 driver didn't follow this standard flow when allowed DEVX objects that are not managed by ib_core to be interleaved with the ones under ib_core responsibility. In such interleaved scenarios deallocate command can fail and ib_core will leave uobject in internal DB and attempt to clean it later to free resources anyway. This change partially restores returned value from dealloc_pd() for all drivers, but keeping in mind that non-DEVX devices and kernel verbs paths shouldn't fail. Fixes: `21a428a019` ("RDMA: Handle PD allocations by IB/core") Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org Signed-off-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-09 13:57:22 -03:00
Allen Pais	00b3c11879	RDMA/rxe: Convert tasklets to use new tasklet_setup() API In preparation for unconditionally passing the struct tasklet_struct pointer to all tasklet callbacks, switch to using the new tasklet_setup() and from_tasklet() to pass the tasklet pointer explicitly. Link: https://lore.kernel.org/r/20200903060637.424458-6-allen.lkml@gmail.com Signed-off-by: Romain Perier <romain.perier@gmail.com> Signed-off-by: Allen Pais <allen.lkml@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-09-03 12:01:53 -03:00
Jason Gunthorpe	6989aa62d3	Linux 5.9-rc3 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl9ML+IeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGA8EIAIy/kTbFS0yrE9yV hb98oX0z9+EU9YQg9vhaRWwPd+rJF/JMQZLqYcwbhjG9abaUL3T3fEcSAefMHw8E LAt+hYzA38dHt7tqhsFQX3vV1VorvDVICBVN0yRPRWKKikq4OPIHzaAR9tleGAF5 8btQisl1PjN+obwYmLuNb6aX16OCwAF+uXOwehcoJs9dvMNhwtXRzfOflWzOvOo6 tE0bHErlylLDfLv4ZzEfczTdks4QJZ7C0xLSf3oN9AAynW42Xnhct4hi8qZY/hCf CMaqeN4hdpub6TvQIqBdDqMMjEXGFgeNSnAEBQY9VpvUqz8NTu6sQxwgJEKDF5tg d81lv2c= =uW/F -----END PGP SIGNATURE----- Merge tag 'v5.9-rc3' into rdma.git for-next Required due to dependencies in following patches. Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:28:12 -03:00
Bob Pearson	7672dac304	RDMA/rxe: Address an issue with hardened user copy Change rxe pools to use kzalloc instead of kmem_cache to allocate memory for rxe objects. The pools are not really necessary and they trigger hardened user copy warnings as the ioctl framework copies the QP number directly to userspace. Also the general project to move object alloation to the core code will eventually clean these out anyhow. Link: https://lore.kernel.org/r/20200827163535.2632-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:21:16 -03:00
Bob Pearson	63fa15dbd4	RDMA/rxe: Add SPDX hdrs to rxe source files Add SPDX headers to all rxe .c and .h files. Link: https://lore.kernel.org/r/20200827145439.2273-1-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-31 12:20:02 -03:00
Bob Pearson	5f9e2822d1	RDMA/rxe: Fix style warnings Fixed several minor checkpatch warnings in existing rxe source. Link: https://lore.kernel.org/r/20200820224638.3212-3-rpearson@hpe.com Signed-off-by: Bob Pearson <rpearson@hpe.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-27 09:51:08 -03:00
Kamal Heib	d862060a4b	RDMA/rxe: Fix panic when calling kmem_cache_create() To avoid the following kernel panic when calling kmem_cache_create() with a NULL pointer from pool_cache(), Block the rxe_param_set_add() from running if the rdma_rxe module is not initialized. BUG: unable to handle kernel NULL pointer dereference at 000000000000000b PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 4 PID: 8512 Comm: modprobe Kdump: loaded Not tainted 4.18.0-231.el8.x86_64 #1 Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 Gen10, BIOS A40 10/02/2018 RIP: 0010:kmem_cache_alloc+0xd1/0x1b0 Code: 8b 57 18 45 8b 77 1c 48 8b 5c 24 30 0f 1f 44 00 00 5b 48 89 e8 5d 41 5c 41 5d 41 5e 41 5f c3 81 e3 00 00 10 00 75 0e 4d 89 fe <41> f6 47 0b 04 0f 84 6c ff ff ff 4c 89 ff e8 cc da 01 00 49 89 c6 RSP: 0018:ffffa2b8c773f9d0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000005 RDX: 0000000000000004 RSI: 00000000006080c0 RDI: 0000000000000000 RBP: ffff8ea0a8634fd0 R08: ffffa2b8c773f988 R09: 00000000006000c0 R10: 0000000000000000 R11: 0000000000000230 R12: 00000000006080c0 R13: ffffffffc0a97fc8 R14: 0000000000000000 R15: 0000000000000000 FS: 00007f9138ed9740(0000) GS:ffff8ea4ae800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000000b CR3: 000000046d59a000 CR4: 00000000003406e0 Call Trace: rxe_alloc+0xc8/0x160 [rdma_rxe] rxe_get_dma_mr+0x25/0xb0 [rdma_rxe] __ib_alloc_pd+0xcb/0x160 [ib_core] ib_mad_init_device+0x296/0x8b0 [ib_core] add_client_context+0x11a/0x160 [ib_core] enable_device_and_get+0xdc/0x1d0 [ib_core] ib_register_device+0x572/0x6b0 [ib_core] ? crypto_create_tfm+0x32/0xe0 ? crypto_create_tfm+0x7a/0xe0 ? crypto_alloc_tfm+0x58/0xf0 rxe_register_device+0x19d/0x1c0 [rdma_rxe] rxe_net_add+0x3d/0x70 [rdma_rxe] ? dev_get_by_name_rcu+0x73/0x90 rxe_param_set_add+0xaf/0xc0 [rdma_rxe] parse_args+0x179/0x370 ? ref_module+0x1b0/0x1b0 load_module+0x135e/0x17e0 ? ref_module+0x1b0/0x1b0 ? __do_sys_init_module+0x13b/0x180 __do_sys_init_module+0x13b/0x180 do_syscall_64+0x5b/0x1a0 entry_SYSCALL_64_after_hwframe+0x65/0xca RIP: 0033:0x7f9137ed296e This can be triggered if a user tries to use the 'module option' which is not actually a real module option but some idiotic (and thankfully no obsolete) sysfs interface. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200825151725.254046-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-27 09:14:41 -03:00
Dinghao Liu	e3ddd6067e	RDMA/rxe: Fix memleak in rxe_mem_init_user When page_address() fails, umem should be freed just like when rxe_mem_alloc() fails. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200819075632.22285-1-dinghao.liu@zju.edu.cn Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-27 08:45:59 -03:00
Mohammad Heib	fd49ddaf7e	RDMA/rxe: prevent rxe creation on top of vlan interface Creating rxe device on top of vlan interface will create a non-functional device that has an empty gids table and can't be used for rdma cm communication. This is caused by the logic in enum_all_gids_of_dev_cb()/is_eth_port_of_netdev(), which only considers networks connected to "upper devices" of the configured network device, resulting in an empty set of gids for a vlan interface, and attempts to connect via this rdma device fail in cm_init_av_for_response because no gids can be resolved. Apparently, this behavior was implemented to fit the HW-RoCE devices that create RoCE device per port, therefore RXE must behave the same like HW-RoCE devices and create rxe device per real device only. In order to communicate via a vlan interface, the user must use the gid index of the vlan address instead of creating rxe over vlan. Link: https://lore.kernel.org/r/20200811150415.3693-1-goody698@gmail.com Signed-off-by: Mohammad Heib <goody698@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-24 14:12:18 -03:00
Yi Zhang	60b1af64eb	RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars 'parent' sysfs reads will yield '\0' bytes when the interface name has 15 chars, and there will no "\n" output. To reproduce, create one interface with 15 chars: [root@test ~]# ip a s enp0s29u1u7u3c2 2: enp0s29u1u7u3c2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UNKNOWN group default qlen 1000 link/ether 02:21:28:57:47:17 brd ff:ff:ff:ff:ff:ff inet6 fe80::ac41:338f:5bcd:c222/64 scope link noprefixroute valid_lft forever preferred_lft forever [root@test ~]# modprobe rdma_rxe [root@test ~]# echo enp0s29u1u7u3c2 > /sys/module/rdma_rxe/parameters/add [root@test ~]# cat /sys/class/infiniband/rxe0/parent enp0s29u1u7u3c2[root@test ~]# [root@test ~]# f="/sys/class/infiniband/rxe0/parent" [root@test ~]# echo "$(<"$f")" -bash: warning: command substitution: ignored null byte in input enp0s29u1u7u3c2 Use scnprintf and PAGE_SIZE to fill the sysfs output buffer. Cc: stable@vger.kernel.org Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200820153646.31316-1-yi.zhang@redhat.com Suggested-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Yi Zhang <yi.zhang@redhat.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-08-24 13:59:58 -03:00
Gustavo A. R. Silva	df561f6688	treewide: Use fallthrough pseudo-keyword Replace the existing /* fall through */ comments and its variants with the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary fall-through markings when it is the case. [1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>	2020-08-23 17:36:59 -05:00
Kamal Heib	76251e15ea	RDMA/rxe: Remove pkey table The RoCE spec requires RoCE devices to support only the default pkey. However the rxe driver maintains a 64 enties pkey table and uses only the first entry. Remove the pkey table and hard code a table of length one hard wired with the default pkey. Replace all checks of the pkey_table with a comparison to the default_pkey instead. Link: https://lore.kernel.org/r/20200721101618.686110-1-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-31 16:17:56 -03:00
Mikhail Malygin	5f0b2a6093	RDMA/rxe: Prevent access to wr->next ptr afrer wr is posted to send queue rxe_post_send_kernel() iterates over linked list of wr's, until the wr->next ptr is NULL. However if we've got an interrupt after last wr is posted, control may be returned to the code after send completion callback is executed and wr memory is freed. As a result, wr->next pointer may contain incorrect value leading to panic. Store the wr->next on the stack before posting it. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200716190340.23453-1-m.malygin@yadro.com Signed-off-by: Mikhail Malygin <m.malygin@yadro.com> Signed-off-by: Sergey Kojushev <s.kojushev@yadro.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 16:12:07 -03:00
Kamal Heib	420bd9e2d9	RDMA/rxe: Remove rxe_link_layer() Instead of returning IB_LINK_LAYER_ETHERNET from rxe_link_layer, return it directly from get_link_layer callback and remove rxe_link_layer(). Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-5-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 13:57:21 -03:00
Kamal Heib	293d8440a0	RDMA/rxe: Return void from rxe_mem_init_dma() The return value from rxe_mem_init_dma() is always 0 - change it to be void and fix the callers accordingly. Fixes: `8700e3e7c4` ("Soft RoCE driver") Link: https://lore.kernel.org/r/20200705104313.283034-4-kamalheib1@gmail.com Signed-off-by: Kamal Heib <kamalheib1@gmail.com> Reviewed-by: Leon Romanovsky <leonro@mellanox.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>	2020-07-16 13:57:21 -03:00

... 3 4 5 6 7 ...

742 Commits