Commit Graph

16 Commits

Author SHA1 Message Date
Cheng Xu
3b3dfd58ba RDMA/erdma: Refactor the original doorbell allocation mechanism
The original doorbell allocation mechanism is complex and does not meet
the isolation requirement. So we introduce a new doorbell mechanism and the
original mechanism (only be used with CAP_SYS_RAWIO if hardware does not
support the new mechanism) needs to be kept as simple as possible for
compatibility.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230606055005.80729-5-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11 11:57:01 +03:00
Cheng Xu
6534de1fe3 RDMA/erdma: Associate QPs/CQs with doorbells for authorization
For the isolation requirement, each QP/CQ can only issue doorbells from the
allocated mmio space. Configure the relationship between QPs/CQs and
mmio doorbell spaces to hardware in create_qp/create_cq interfaces.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230606055005.80729-4-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11 11:57:00 +03:00
Cheng Xu
7e9a1dada2 RDMA/erdma: Allocate doorbell resources from hardware
Each ucontext will try to allocate doorbell resources in the extended bar
space from hardware. For compatibility, we change nothing for the original
bar space, and it will be used only for applications with CAP_SYS_RAWIO
authority in the older HW/FW environments.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230606055005.80729-3-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11 11:57:00 +03:00
Cheng Xu
128f840430 RDMA/erdma: Configure PAGE_SIZE to hardware
Add a new CMDQ message to configure hardware. Initially the page size (in
the format of shift) will be passed to hardware, so that hardware can
organize the mmio space properly. It's called only if hardware supports it.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230606055005.80729-2-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-11 11:57:00 +03:00
Linus Torvalds
af3877265d v6.4 merge window RDMA pull request
Usual wide collection of unrelated items in drivers:
 
 - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5, rxe,
   usnic, usnic, bnxt_re, ocrdma, iser
    * Unnecessary NULL checks
    * kmap obsolescence
    * pci_enable_pcie_error_reporting() obsolescence
    * Unused variables and macros
    * trace event related warnings
    * casting warnings
 
 - Code cleanups for irdm and erdma
 
 - EFA reporting of 128 byte PCIe TLP support
 
 - mlx5 more agressively uses the out of order HW feature
 
 - Big rework of how state machines and tasks work in rxe
 
 - Fix a syzkaller found crash netdev refcount leak in siw
 
 - bnxt_re revises their HW description header
 
 - Congestion control for bnxt_re
 
 - Use mmu_notifiers more safely in hfi1
 
 - mlx5 gets better support for PCIe relaxed ordering inside VMs
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRRRCHOFoQz/8F5bUaFwuHvBreFYQUCZEva5wAKCRCFwuHvBreF
 YZFmAQC9T3b/XQ3bRknYciuzbatC98o9xB0FTqmEFYGj+Y2lVAD9EEVe3HKfHfi3
 t/GxXYB5r22oxg5bgsblZfEdEdTVCg8=
 =akMm
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "Usual wide collection of unrelated items in drivers:

   - Driver bug fixes and treewide cleanups in hfi1, siw, qib, mlx5,
     rxe, usnic, usnic, bnxt_re, ocrdma, iser:
       - remove unnecessary NULL checks
       - kmap obsolescence
       - pci_enable_pcie_error_reporting() obsolescence
       - unused variables and macros
       - trace event related warnings
       - casting warnings

   - Code cleanups for irdm and erdma

   - EFA reporting of 128 byte PCIe TLP support

   - mlx5 more agressively uses the out of order HW feature

   - Big rework of how state machines and tasks work in rxe

   - Fix a syzkaller found crash netdev refcount leak in siw

   - bnxt_re revises their HW description header

   - Congestion control for bnxt_re

   - Use mmu_notifiers more safely in hfi1

   - mlx5 gets better support for PCIe relaxed ordering inside VMs"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (81 commits)
  RDMA/efa: Add rdma write capability to device caps
  RDMA/mlx5: Use correct device num_ports when modify DC
  RDMA/irdma: Drop spurious WQ_UNBOUND from alloc_ordered_workqueue() call
  RDMA/rxe: Fix spinlock recursion deadlock on requester
  RDMA/mlx5: Fix flow counter query via DEVX
  RDMA/rxe: Protect QP state with qp->state_lock
  RDMA/rxe: Move code to check if drained to subroutine
  RDMA/rxe: Remove qp->req.state
  RDMA/rxe: Remove qp->comp.state
  RDMA/rxe: Remove qp->resp.state
  RDMA/mlx5: Allow relaxed ordering read in VFs and VMs
  net/mlx5: Update relaxed ordering read HCA capabilities
  RDMA/mlx5: Check pcie_relaxed_ordering_enabled() in UMR
  RDMA/mlx5: Remove pcie_relaxed_ordering_enabled() check for RO write
  RDMA: Add ib_virt_dma_to_page()
  RDMA/rxe: Fix the error "trying to register non-static key in rxe_cleanup_task"
  RDMA/irdma: Slightly optimize irdma_form_ah_cm_frame()
  RDMA/rxe: Fix incorrect TASKLET_STATE_SCHED check in rxe_task.c
  IB/hfi1: Place struct mmu_rb_handler on cache line start
  IB/hfi1: Fix bugs with non-PAGE_SIZE-end multi-iovec user SDMA requests
  ...
2023-04-29 17:21:24 -07:00
Cheng Xu
d649c638dc RDMA/erdma: Use fixed hardware page size
Hardware's page size is 4096, but the kernel's page size may vary. Driver
should use hardware's page size when communicating with hardware.

Fixes: 1550557717 ("RDMA/erdma: Add verbs implementation")
Link: https://lore.kernel.org/r/20230307102924.70577-2-chengyou@linux.alibaba.com
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-03-24 11:31:22 -03:00
Cheng Xu
6256aa9ae9 RDMA/erdma: Update default EQ depth to 4096 and max_send_wr to 8192
Max EQ depth of hardware is 32K, the current default EQ depth is too small
for some applications, so change the default depth to 4096.
Max send WRs the hardware can support is 8K, but the driver limits the
value to 4K. Remove this limitation.

Fixes: be3cff0f24 ("RDMA/erdma: Add the hardware related definitions")
Fixes: db23ae64ca ("RDMA/erdma: Add verbs header file")
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230320084652.16807-3-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-20 12:53:24 +02:00
Cheng Xu
3fe26c0493 RDMA/erdma: Fix some typos
FAA is short for atomic fetch and add, not FAD. Fix this.

Fixes: 0ca9c2e284 ("RDMA/erdma: Implement atomic operations support")
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20230320084652.16807-2-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-20 12:53:24 +02:00
Gustavo A. R. Silva
ed73a50548 RDMA/erdma: Replace zero-length arrays with flexible-array members
Zero-length arrays are deprecated[1] and we are moving towards
adopting C99 flexible-array members instead. So, replace zero-length
arrays, in a couple of structures, with flex-array members.

This helps with the ongoing efforts to tighten the FORTIFY_SOURCE
routines on memcpy() and help us make progress towards globally
enabling -fstrict-flex-arrays=3 [2].

Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#zero-length-and-one-element-arrays [1]
Link: https://gcc.gnu.org/pipermail/gcc-patches/2022-October/602902.html [2]
Link: https://github.com/KSPP/linux/issues/78
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Link: https://lore.kernel.org/r/Y7zCBqwC1LtabRJ9@work
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-01-15 13:33:49 +02:00
Cheng Xu
54d8fffc2a RDMA/erdma: Implement the lifecycle of reflushing work for each QP
Each QP has a work for reflushing purpose. In the work, driver will report
the latest pi to hardware.

Link: https://lore.kernel.org/r/20221116023107.82835-3-chengyou@linux.alibaba.com
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-11-24 14:58:52 -04:00
Cheng Xu
0ca9c2e284 RDMA/erdma: Implement atomic operations support
Add atomic operations support in post_send and poll_cq implementation.
Also, rename 'laddr' and 'lkey' in struct erdma_sge to 'addr' and 'key',
because this structure is used for both local and remote SGEs.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221107021845.44598-4-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07 09:55:04 +02:00
Cheng Xu
71c6925f28 RDMA/erdma: Report atomic capacity when hardware supports atomic feature
Introduce "capacity flags" field at where hardware put all zeros originally
in "query device" response. Using this field, hardware can report atomic
feature if supports.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221107021845.44598-3-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07 09:55:04 +02:00
Cheng Xu
ece43fad22 RDMA/erdma: Extend access right field of FRMR and REG MR to support atomic
To support atomic operations, IB_ACCESS_REMOTE_ATOMIC right should be
passed to hardware for permission check. Since "access mode" field in FRMR
SQE and RegMr command is never used by hw, we remove the "access mode"
field, so that we can then have enough space to extend access fields.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20221107021845.44598-2-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-11-07 09:55:04 +02:00
Cheng Xu
9bdb9350f3 RDMA/erdma: Support dynamic mtu
Hardware now support jumbo frame for RDMA. So we introduce a new CMDQ
message to support mtu change notification.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220909093822.33868-5-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-21 10:31:24 +03:00
Cheng Xu
13f42e5166 RDMA/erdma: Make hardware internal opcodes invisible to driver
Some opcodes are used in hardware internally, and driver does not care
about them. So, we change them to reserved opcodes in driver.

Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220909093822.33868-4-chengyou@linux.alibaba.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-09-20 14:27:51 +03:00
Cheng Xu
be3cff0f24 RDMA/erdma: Add the hardware related definitions
ERDMA is a PCIe device, and this file provides ERDMA hardware related
definitions, mainly including PCIe device capabilities and restrictions,
device registers definitions, doorbell space, doorbell structure
definitions and WQE definitions.

Link: https://lore.kernel.org/r/20220727014927.76564-3-chengyou@linux.alibaba.com
Signed-off-by: Cheng Xu <chengyou@linux.alibaba.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-07-27 15:54:38 -03:00