Commit Graph

7560 Commits

Author SHA1 Message Date
Lang Cheng
69455df04e RDMA/hns: Prevent le32 from being implicitly converted to u32
Replace BUILD_BUG_ON_ZERO() with BUILD_BUG_ON() to avoid sparse
complaining "restricted __le32 degrades to integer".
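
A schematic contrast of the two macros (cond, base and offset are
hypothetical names): BUILD_BUG_ON_ZERO() is an expression, so its condition
is evaluated in an integer context, which is where sparse saw the __le32
degrade, while BUILD_BUG_ON() is a standalone statement.

	/* expression form: the check is folded into arithmetic */
	offset = base + BUILD_BUG_ON_ZERO(cond);

	/* statement form: same compile-time check, no expression context */
	BUILD_BUG_ON(cond);
	offset = base;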

Link: https://lore.kernel.org/r/1617354454-47840-10-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:22 -03:00
Yixing Liu
782832f254 RDMA/hns: Simplify the function config_eqc()
Use "hr_reg_write" replace "roce_set_filed".

Link: https://lore.kernel.org/r/1617354454-47840-9-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:22 -03:00
Wenpeng Liang
495c24808c RDMA/hns: Add XRC subtype in QPC and XRC type in SRQC
A field to distinguish basic SRQ from XRC SRQ in the SRQC and a field in the
QPC to determine whether a QP is an XRC TGT QP or an XRC INI QP are missing.

Fixes: 32548870d4 ("RDMA/hns: Add support for XRC on HIP09")
Link: https://lore.kernel.org/r/1617354454-47840-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:22 -03:00
Wenpeng Liang
537bc924f3 RDMA/hns: Remove unsupported QP types
The hns ROCEE does not currently support UC QPs.

Link: https://lore.kernel.org/r/1617354454-47840-7-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:22 -03:00
Yangyang Li
7bd5d90d8f RDMA/hns: Delete unused members in the structure hns_roce_hw
Some structure members in hns_roce_hw have never been used and need to be
deleted.

Fixes: 9a4435375c ("IB/hns: Add driver files for hns RoCE driver")
Fixes: b156269d88 ("RDMA/hns: Add modify CQ support for hip08")
Fixes: c7bcb13442 ("RDMA/hns: Add SRQ support for hip08 kernel mode")
Link: https://lore.kernel.org/r/1617354454-47840-6-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:21 -03:00
Wenpeng Liang
2371efab97 RDMA/hns: Delete redundant abnormal interrupt status
The hardware supports only two types of abnormal interrupts.

Fixes: a5073d6054 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-5-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:21 -03:00
Yangyang Li
714a597baa RDMA/hns: Delete redundant condition judgment related to eq
The register value related to the eq interrupt depends only on
enable_flag, so the redundant condition judgment is deleted.

Fixes: a5073d6054 ("RDMA/hns: Add eq support of hip08")
Link: https://lore.kernel.org/r/1617354454-47840-4-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:21 -03:00
Weihang Li
9eab614338 RDMA/hns: Fix missing assignment of max_inline_data
When querying QP, the ULPs should be informed of the max length of inline
data supported by the hardware.

Fixes: 30b707886a ("RDMA/hns: Support inline data in extented sge space for RC")
Link: https://lore.kernel.org/r/1617354454-47840-3-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:21 -03:00
Weihang Li
24f3f1cd51 RDMA/hns: Avoid enabling RQ inline on UD
RQ inline is not supported on UD/GSI QPs, so it should be disabled in the QPC.

Link: https://lore.kernel.org/r/1617354454-47840-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:08:20 -03:00
Wenpeng Liang
8d78e7b478 RDMA/hns: Modify prints for mailbox and command queue
Use rate-limited prints in mbox and cmq, and print the mailbox operation if
the mailbox fails, because it's useful information for the user.

Link: https://lore.kernel.org/r/1617262341-37571-4-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:05:04 -03:00
Lang Cheng
0835cf5839 RDMA/hns: Support more return types of command queue
Add error code definitions corresponding to the return codes from the
firmware to help find out more detailed reasons why a command fails to be
sent.
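
A plausible shape for such a mapping; the status values and the helper name
are hypothetical:

	/* map firmware CMQ status codes to errnos (values are examples) */
	static int hns_roce_cmq_status_to_errno(u16 status)
	{
		switch (status) {
		case 0:			/* success */
			return 0;
		case 1:			/* parameter error */
			return -EINVAL;
		case 2:			/* operation not supported */
			return -EOPNOTSUPP;
		default:
			return -EIO;
		}
	}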

Link: https://lore.kernel.org/r/1617262341-37571-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:05:03 -03:00
Lang Cheng
a389d016c0 RDMA/hns: Enable all CMDQ context
Fix an error in the cmd context number calculation algorithm to enable all
of the 32 cmd entries and support 32 concurrent accesses.

Link: https://lore.kernel.org/r/1617262341-37571-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-08 16:05:03 -03:00
Kamal Heib
e1ad897b9c RDMA/qedr: Fix kernel panic when trying to access recv_cq
As INI QP does not require a recv_cq, avoid the following null pointer
dereference by checking if the qp_type is not INI before trying to extract
the recv_cq.
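
The essence of the fix, sketched (variable names approximate):

	struct qedr_cq *cq = NULL;

	/* an XRC INI QP carries no recv_cq, so don't dereference it */
	if (attrs->qp_type != IB_QPT_XRC_INI)
		cq = get_qedr_cq(attrs->recv_cq);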

BUG: kernel NULL pointer dereference, address: 00000000000000e0
 #PF: supervisor read access in kernel mode
 #PF: error_code(0x0000) - not-present page
 PGD 0 P4D 0
 Oops: 0000 [#1] SMP PTI
 CPU: 0 PID: 54250 Comm: mpitests-IMB-MP Not tainted 5.12.0-rc5 #1
 Hardware name: Dell Inc. PowerEdge R320/0KM5PX, BIOS 2.7.0 08/19/2019
 RIP: 0010:qedr_create_qp+0x378/0x820 [qedr]
 Code: 02 00 00 50 e8 29 d4 a9 d1 48 83 c4 18 e9 65 fe ff ff 48 8b 53 10 48 8b 43 18 44 8b 82 e0 00 00 00 45 85 c0 0f 84 10 74 00 00 <8b> b8 e0 00 00 00 85 ff 0f 85 50 fd ff ff e9 fd 73 00 00 48 8d bd
 RSP: 0018:ffff9c8f056f7a70 EFLAGS: 00010202
 RAX: 0000000000000000 RBX: ffff9c8f056f7b58 RCX: 0000000000000009
 RDX: ffff8c41a9744c00 RSI: ffff9c8f056f7b58 RDI: ffff8c41c0dfa280
 RBP: ffff8c41c0dfa280 R08: 0000000000000002 R09: 0000000000000001
 R10: 0000000000000000 R11: ffff8c41e06fc608 R12: ffff8c4194052000
 R13: 0000000000000000 R14: ffff8c4191546070 R15: ffff8c41c0dfa280
 FS:  00007f78b2787b80(0000) GS:ffff8c43a3200000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00000000000000e0 CR3: 00000001011d6002 CR4: 00000000001706f0
 Call Trace:
  ib_uverbs_handler_UVERBS_METHOD_QP_CREATE+0x4e4/0xb90 [ib_uverbs]
  ? ib_uverbs_cq_event_handler+0x30/0x30 [ib_uverbs]
  ib_uverbs_run_method+0x6f6/0x7a0 [ib_uverbs]
  ? ib_uverbs_handler_UVERBS_METHOD_QP_DESTROY+0x70/0x70 [ib_uverbs]
  ? __cond_resched+0x15/0x30
  ? __kmalloc+0x5a/0x440
  ib_uverbs_cmd_verbs+0x195/0x360 [ib_uverbs]
  ? xa_load+0x6e/0x90
  ? cred_has_capability+0x7c/0x130
  ? avc_has_extended_perms+0x17f/0x440
  ? vma_link+0xae/0xb0
  ? vma_set_page_prot+0x2a/0x60
  ? mmap_region+0x298/0x6c0
  ? do_mmap+0x373/0x520
  ? selinux_file_ioctl+0x17f/0x220
  ib_uverbs_ioctl+0xa7/0x110 [ib_uverbs]
  __x64_sys_ioctl+0x84/0xc0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xae
 RIP: 0033:0x7f78b120262b

Fixes: 06e8d1df46 ("RDMA/qedr: Add support for user mode XRC-SRQ's")
Link: https://lore.kernel.org/r/20210404125501.154789-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:28:26 -03:00
Wei Yongjun
2abb743173 RDMA/hns: Use GFP_ATOMIC under spin lock
A spin lock is taken here so we should use GFP_ATOMIC.
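
A minimal illustration of the rule (lock and entry names hypothetical):

	spin_lock(&hr_dev->cong_lock);
	/* sleeping allocations (GFP_KERNEL) are forbidden here */
	entry = kzalloc(sizeof(*entry), GFP_ATOMIC);
	spin_unlock(&hr_dev->cong_lock);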

Fixes: f91696f2f0 ("RDMA/hns: Support congestion control type selection according to the FW")
Link: https://lore.kernel.org/r/20210407154900.3486268-1-weiyongjun1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:26:32 -03:00
Praveen Kumar Kannoju
7e111bbff9 IB/mlx5: Reduce max order of memory allocated for xlt update
To update xlt (during mlx5_ib_reg_user_mr()), the driver can request up to
1 MB (order-8) memory, depending on the size of the MR. This costly
allocation can sometimes take very long to return (a few seconds). This
causes the calling application to hang for a long time, especially when
the system is fragmented. To avoid these long latency spikes, the
higher-order allocations need to fail faster in case free pages are not
readily available.

To achieve this, add the __GFP_NORETRY flag to the gfp_mask used when
fetching the free pages, and allow the algorithm to automatically fall back
to smaller allocation sizes.
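
A sketch of the fail-fast fallback loop, assuming a simple order-based
allocation (helper name hypothetical):

	static struct page *alloc_xlt_pages(int max_order, int *got_order)
	{
		gfp_t gfp = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
		int order;

		/* give up quickly on high orders instead of retrying */
		for (order = max_order; order >= 0; order--) {
			struct page *page = alloc_pages(gfp, order);

			if (page) {
				*got_order = order;
				return page;
			}
		}
		return NULL;
	}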

Link: https://lore.kernel.org/r/1617425635-35631-1-git-send-email-praveen.kannoju@oracle.com
Signed-off-by: Praveen Kumar Kannoju <praveen.kannoju@oracle.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:25:17 -03:00
Kaike Wan
fdde1aa09a IB/hfi1: Remove unused function
Remove the unused function sdma_iowait_schedule().

Fixes: 7724105686 ("IB/hfi1: add driver files")
Link: https://lore.kernel.org/r/1617026791-89997-1-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:01 -03:00
Mike Marciniszyn
ca5f72568e IB/hfi1: Use kzalloc() for mmu_rb_handler allocation
The code currently assumes that the mmu_notifier struct
embedded in mmu_rb_handler only contains two fields.

There are now extra fields:

struct mmu_notifier {
        struct hlist_node hlist;
        const struct mmu_notifier_ops *ops;
        struct mm_struct *mm;
        struct rcu_head rcu;
        unsigned int users;
};

Given that there is no init for the mmu_notifier, kzalloc() should be used
to ensure that any newly added fields are given a predictable initial value
of zero.
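
The fix in miniature:

	/* zero the whole handler so fields added to mmu_notifier later
	 * (rcu, users, ...) start from a known state */
	handler = kzalloc(sizeof(*handler), GFP_KERNEL);
	if (!handler)
		return -ENOMEM;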

Fixes: 06e0ffa693 ("IB/hfi1: Re-factor MMU notification code")
Link: https://lore.kernel.org/r/1617026056-50483-9-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:01 -03:00
Mike Marciniszyn
6b13215df1 IB/hfi1: Add additional usdma traces
Add traces that were vital in isolating an issue with pq waitlist in
commit fa8dac3968 ("IB/hfi1: Fix another case where pq is left on
waitlist")

Link: https://lore.kernel.org/r/1617026056-50483-8-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:01 -03:00
Mike Marciniszyn
326a239307 IB/hfi1: Remove indirect call to hfi1_ipoib_send_dma()
hfi1_ipoib_send() directly calls hfi1_ipoib_send_dma() with no value add.

Fix by renaming hfi1_ipoib_send_dma() to hfi1_ipoib_send().

Link: https://lore.kernel.org/r/1617026056-50483-6-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Mike Marciniszyn
70d44c18a7 IB/hfi1: Use napi_schedule_irqoff() for tx napi
The call is from an ISR context and napi_schedule_irqoff() can be used.

Change the call to the more efficient type.
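
A sketch of the pattern (handler name hypothetical): inside an ISR,
interrupts are already disabled, so the irqoff variant skips a redundant
local_irq_save()/restore pair.

	static irqreturn_t tx_isr(int irq, void *data)
	{
		struct napi_struct *napi = data;

		napi_schedule_irqoff(napi);
		return IRQ_HANDLED;
	}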

Link: https://lore.kernel.org/r/1617026056-50483-5-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Mike Marciniszyn
b536d4b2a2 IB/hfi1: Correct oversized ring allocation
The completion ring for tx is being sized with the wrong value, oversizing
the ring by two orders of magnitude.

Correct the allocation size and use kcalloc_node() to allocate the ring.
Fix mistaken GFP defines in similar allocations.

Link: https://lore.kernel.org/r/1617026056-50483-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Mike Marciniszyn
042a00f93a IB/{ipoib,hfi1}: Add a timeout handler for rdma_netdev
The current rdma_netdev handling in ipoib hooks the tx_timeout handler,
but prints out a totally useless message that prevents effective debugging
especially when multiple transmit queues are being used.

Add a tx_timeout rdma_netdev hook and implement the callback in hfi1 to
print additional information.

The existing non-helpful message is avoided when the driver has presented
a callback.
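
In this kernel era the ndo_tx_timeout hook receives the stuck queue's index,
so a per-queue message becomes possible; a sketch of such a callback (the
implementation body is hypothetical):

	static void hfi1_ipoib_tx_timeout(struct net_device *dev,
					  unsigned int txqueue)
	{
		/* report which queue stalled, then dump sdma state */
		netdev_warn(dev, "tx timeout on queue %u\n", txqueue);
	}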

Link: https://lore.kernel.org/r/1617026056-50483-3-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Mike Marciniszyn
4bd00b55c9 IB/hfi1: Add AIP tx traces
Add traces to allow for debugging issues with AIP tx.

Link: https://lore.kernel.org/r/1617026056-50483-2-git-send-email-dennis.dalessandro@cornelisnetworks.com
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 20:19:00 -03:00
Mike Marciniszyn
5de61a47eb IB/hfi1: Fix probe time panic when AIP is enabled with a buggy BIOS
A panic can result when AIP is enabled:

  BUG: unable to handle kernel NULL pointer dereference at 000000000000000
  PGD 0 P4D 0
  Oops: 0000 1 SMP PTI
  CPU: 70 PID: 981 Comm: systemd-udevd Tainted: G OE --------- - - 4.18.0-240.el8.x86_64 #1
  Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.01.01.0005.101720141054 10/17/2014
  RIP: 0010:__bitmap_and+0x1b/0x70
  RSP: 0018:ffff99aa0845f9f0 EFLAGS: 00010246
  RAX: 0000000000000000 RBX: ffff8d5a6fc18000 RCX: 0000000000000048
  RDX: 0000000000000000 RSI: ffffffffc06336f0 RDI: ffff8d5a8fa67750
  RBP: 0000000000000079 R08: 0000000fffffffff R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffffffc06336f0
  R13: 00000000000000a0 R14: ffff8d5a6fc18000 R15: 0000000000000003
  FS: 00007fec137a5980(0000) GS:ffff8d5a9fa80000(0000) knlGS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000000 CR3: 0000000a04b48002 CR4: 00000000001606e0
  Call Trace:
  hfi1_num_netdev_contexts+0x7c/0x110 [hfi1]
  hfi1_init_dd+0xd7f/0x1a90 [hfi1]
  ? pci_bus_read_config_dword+0x49/0x70
  ? pci_mmcfg_read+0x3e/0xe0
  do_init_one.isra.18+0x336/0x640 [hfi1]
  local_pci_probe+0x41/0x90
  pci_device_probe+0x105/0x1c0
  really_probe+0x212/0x440
  driver_probe_device+0x49/0xc0
  device_driver_attach+0x50/0x60
  __driver_attach+0x61/0x130
  ? device_driver_attach+0x60/0x60
  bus_for_each_dev+0x77/0xc0
  ? klist_add_tail+0x3b/0x70
  bus_add_driver+0x14d/0x1e0
  ? dev_init+0x10b/0x10b [hfi1]
  driver_register+0x6b/0xb0
  ? dev_init+0x10b/0x10b [hfi1]
  hfi1_mod_init+0x1e6/0x20a [hfi1]
  do_one_initcall+0x46/0x1c3
  ? free_unref_page_commit+0x91/0x100
  ? _cond_resched+0x15/0x30
  ? kmem_cache_alloc_trace+0x140/0x1c0
  do_init_module+0x5a/0x220
  load_module+0x14b4/0x17e0
  ? __do_sys_finit_module+0xa8/0x110
  __do_sys_finit_module+0xa8/0x110
  do_syscall_64+0x5b/0x1a0

The issue happens when pcibus_to_node() returns NUMA_NO_NODE.

Fix this issue by moving the initialization of dd->node to hfi1_devdata
allocation, removing the other pcibus_to_node() calls in the probe path,
and using dd->node instead.

Affinity logic is adjusted to use a new field dd->affinity_entry as a
guard instead of dd->node.
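
The essence of the fix, sketched (assuming the standard NUMA helpers):

	/* resolve the node once, at hfi1_devdata allocation time */
	dd->node = pcibus_to_node(pdev->bus);
	if (dd->node == NUMA_NO_NODE)
		dd->node = numa_node_id();	/* buggy BIOS fallback */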

Fixes: 4730f4a6c6 ("IB/hfi1: Activate the dummy netdev")
Link: https://lore.kernel.org/r/1617025700-31865-4-git-send-email-dennis.dalessandro@cornelisnetworks.com
Cc: stable@vger.kernel.org
Signed-off-by: Mike Marciniszyn <mike.marciniszyn@cornelisnetworks.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 15:31:59 -03:00
Potnuri Bharat Teja
603c4690b0 RDMA/cxgb4: check for ipv6 address properly while destroying listener
The ipv6 bit is wrongly set by the commit below, which causes fatal adapter
lookup engine errors for ipv4 connections while destroying a listener. Fix
it to properly check the local address for ipv6.

Fixes: 3408be145a ("RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server")
Link: https://lore.kernel.org/r/20210331135715.30072-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-07 15:31:45 -03:00
Yixian Liu
704d68f5f2 RDMA/hns: Reorganize doorbell update interfaces for all queues
The doorbell update interfaces are very similar for different queues, such
as SQ, RQ, SRQ, CQ and EQ. So reorganize this code and also fix some
inappropriate naming.

Link: https://lore.kernel.org/r/1616840738-7866-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:16:44 -03:00
Yixian Liu
cf8cd4ccb2 RDMA/hns: Support configuring doorbell mode of RQ and CQ
HIP08 supports both normal and record doorbell modes for RQ and CQ, and SQ
record doorbell for userspace is also supported by the software for the
flush CQE process. Now that the capabilities of HIP08 are exposed to the
user and are configurable, support for normal doorbell should be added back.

Note that if switching to normal doorbell, the kernel will report "flush
cqe is unsupported" when a QP is modified to the error state, as the flush
is based on the record doorbell.

Link: https://lore.kernel.org/r/1616840738-7866-2-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:16:13 -03:00
Xi Wang
8115f97445 RDMA/hns: Simplify command fields for HEM base address configuration
Use hr_reg_write() instead of roce_set_field() to simplify codes about
configuring HEM BA.

Link: https://lore.kernel.org/r/1616815294-13434-6-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:12:45 -03:00
Xi Wang
c6f0411b96 RDMA/hns: Reorganize process of setting HEM
Encapsulate configuring the GMV base address and other types of HEM tables
into two separate functions to make the process of setting HEM clearer.

Link: https://lore.kernel.org/r/1616815294-13434-5-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:12:45 -03:00
Xi Wang
ee82e68850 RDMA/hns: Refactor reset state checking flow
The 'HNS_ROCE_OPC_QUERY_MB_ST' command returns the mailbox complete status
and the hardware busy flag, and the complete status is only valid when the
busy flag is 0, so it's better to query these two fields at a time rather
than separately.

Link: https://lore.kernel.org/r/1616815294-13434-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:12:45 -03:00
Yixing Liu
d102a6e374 RDMA/hns: Reorganize hns_roce_create_cq()
Encapsulate two subprocesses into functions.

Link: https://lore.kernel.org/r/1616815294-13434-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:12:44 -03:00
Weihang Li
4940b0ab45 RDMA/hns: Refactor hns_roce_v2_poll_one()
Encapsulate the processes of obtaining the current QP and filling the WC
into functions, and merge some duplicate code.

Link: https://lore.kernel.org/r/1616815294-13434-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 16:12:44 -03:00
Yangyang Li
f91696f2f0 RDMA/hns: Support congestion control type selection according to the FW
The supported types of congestion control algorithm include DCQCN, LDCP,
HC3 and DIP. The driver will select one of them according to the firmware
when querying PF capabilities, and then set the related configuration
fields into the QPC.

Link: https://lore.kernel.org/r/1616679236-7795-3-git-send-email-liweihang@huawei.com
Signed-off-by: Yangyang Li <liyangyang20@huawei.com>
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:23:09 -03:00
Wei Xu
e079d87d1d RDMA/hns: Support query information of functions from FW
Add a new type of command to query the mac id of functions from the
firmware, which is used to select the template of the congestion control
algorithm. More info will be supported in the future.

Link: https://lore.kernel.org/r/1616679236-7795-2-git-send-email-liweihang@huawei.com
Signed-off-by: Wei Xu <xuwei5@hisilicon.com>
Signed-off-by: Shengming Shu <shushengming1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-01 15:23:08 -03:00
Tang Yizhou
2e919a32ae RDMA/iw_cxgb4: Use DEFINE_SPINLOCK() for spinlock
A spinlock can be initialized statically with DEFINE_SPINLOCK() rather than
by explicitly calling spin_lock_init().
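
The two styles side by side (lock name hypothetical):

	/* before: requires a runtime call somewhere in init code */
	static spinlock_t stats_lock;

	static int __init mod_init(void)
	{
		spin_lock_init(&stats_lock);
		return 0;
	}

	/* after: fully initialized at compile time, no init call needed */
	static DEFINE_SPINLOCK(stats_lock);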

Link: https://lore.kernel.org/r/20210331020105.4858-1-tangyizhou@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Tang Yizhou <tangyizhou@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-31 14:41:01 -03:00
Ruiqi Gong
de2a246195 RDMA/hns: Fix a spelling mistake in hns_roce_hw_v1.c
s/caculating/calculating

Link: https://lore.kernel.org/r/20210330122912.19989-1-gongruiqi1@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Ruiqi Gong <gongruiqi1@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-30 19:52:06 -03:00
Gal Pressman
7410c2d0f4 RDMA/efa: Use strscpy instead of strlcpy
The strlcpy() function doesn't limit the source length; use the preferred
strscpy() function instead.
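
A minimal illustration (buffer and source names hypothetical):

	char fw_ver[ETHTOOL_FWVERS_LEN];

	/* strscpy() stops at the destination bound and returns -E2BIG on
	 * truncation; strlcpy() walks the entire source to compute its
	 * return value */
	strscpy(fw_ver, src, sizeof(fw_ver));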

Link: https://lore.kernel.org/r/20210329120131.18793-1-galpress@amazon.com
Reviewed-by: Firas JahJah <firasj@amazon.com>
Reviewed-by: Yossi Leybovich <sleybo@amazon.com>
Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-30 17:03:22 -03:00
Bhaskar Chowdhury
4ae6573e69 IB/hfi1: Fix a typo
s/struture/structure/

And add the missing colon for kdoc

Link: https://lore.kernel.org/r/20210322062923.3306167-1-unixbhaskar@gmail.com
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 12:05:28 -03:00
Shay Drory
e5dc370bd9 RDMA/mlx5: Set ODP caps only if device profile support ODP
Currently, ODP caps are set during the init stage of mlx5_ib_dev,
regardless of whether the device profile supports ODP or not.  There is no
point in setting ODP caps if the device profile doesn't support
ODP. Hence, move setting the ODP caps to the odp_init stage.

Link: https://lore.kernel.org/r/20210318135259.681264-1-leon@kernel.org
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Shay Drory <shayd@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 11:57:38 -03:00
Maor Gottlieb
c73700806d RDMA/mlx5: Fix drop packet rule in egress table
Initial drop action support missed that drop action can be added to egress
flow tables as well. Add the missing support.

This requires making sure that dest_type isn't set to PORT which in turn
exposes a possibility of passing dst while indicating number of dsts as
zero. Explicitly check for number of dsts and pass the appropriate
pointer.

Fixes: f29de9eee7 ("RDMA/mlx5: Add support for drop action in DV steering")
Link: https://lore.kernel.org/r/20210318135123.680759-1-leon@kernel.org
Reviewed-by: Mark Bloch <markb@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 11:52:47 -03:00
Selvin Xavier
6845485f9e RDMA/bnxt_re: Move device to error state upon device crash
When the L2 driver detects a device crash or a device reset, it invokes a
stop callback to recover from the error.

The current RoCE driver doesn't recover the device, so move the device to
the error state and dispatch fatal events to all QPs. Release the MSIx
vectors to avoid a crash when the L2 driver disables MSIx. Also, check the
device state to avoid posting further commands to the HW.

Link: https://lore.kernel.org/r/1615968942-30970-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 10:37:01 -03:00
Mark Bloch
1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of an RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon, this seems like a non-issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8, and all applications that rely on
verbs won't be able to cope with this change. At this stage, we are
extending only the interfaces that use the vendor channel.

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device that
exposes itself as a RAW Ethernet only device, CM/MAD/IPoIB and other ULPs
aren't affected by this change, and their sysfs/interfaces that are exposed
to userspace can remain unchanged.

While here, clean up some alignment issues and remove unneeded sanity
checks (mainly in rdmavt).

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Linus Torvalds
2ba9bea2d3 RDMA 5.12 second rc pull request
- Typo causing a regression in mlx5 devx
 
 - Regression in the recent hns rework causing the HW to get out of sync
 
 - Longstanding cxgb4 adaptor crash when destroying cm ids
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmBcxe0ACgkQOG33FX4g
 mxruFQ/+JMfHtWowI9l32N+SI93zWAQVN6bcO5XiUeziCLEsktU2dh5Lu8ZWWVmB
 BX1US24oBuGKs+YNx1ayshlgQj7EgojnGZODGtL4O157IvvgMrm4gFN84EoN3gMA
 8QzPVgZrrvyjDc8tSEINAW5crkpmhqdeg6XYM9BTUrVLqy+rXWv6V5E8Gnvtexmq
 h/+UgbIxW00SVVxpNyAFeuu1IlbgSYU0DvU4xpha/XKX6Ifyl9SeKmn6y+1UEU4q
 AYd/6UuYvK26G8tA3Zteh3lR8cUiLeorIwB6B5WoMDZXhyBz+PhaBS2ypHsrrnyr
 IIEDres5/zEm355nT8j0hTMv40ZUlj4UFRf2eWqcdrLD54yb6n8g4X2rmsASoe1A
 Z3CrhkV39dPEsB7JXmGB9j6W47PWGYBYUGpYTMRr69K7eB3viDIC+i6fDK7eKS0e
 u+fO3K9kU2B/PSJvWVCPKn2GCOw3jRuqwbTFUt0kvo8E90yzV6yqjgyoTKt+PgBn
 t0SUyPEGue+KEKgJd/sp62OL8LPulFksx+E7ksvqqbmSTNAjH+bs/FrbUYoN0r3v
 kmkxgxLzipu1aYtLwcAtNJpEya3c6s4ERBFeJMk01xUkiiWPG9RLa66Zg2IzDqWW
 J55irlJIqBm792vB+vH7+FHU3YjEGh66ycGRqsBWiw2oUxmlxEc=
 =vjhp
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "Not much going on, just some small bug fixes:

   - Typo causing a regression in mlx5 devx

   - Regression in the recent hns rework causing the HW to get out of
     sync

   - Long-standing cxgb4 adaptor crash when destroying cm ids"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
  RDMA/hns: Fix bug during CMDQ initialization
  RDMA/mlx5: Fix typo in destroy_mkey inbox
2021-03-25 11:23:35 -07:00
Potnuri Bharat Teja
3408be145a RDMA/cxgb4: Fix adapter LE hash errors while destroying ipv6 listening server
Not setting the ipv6 bit while destroying ipv6 listening servers may
result in potential fatal adapter errors due to lookup engine memory hash
errors. Therefore always set ipv6 field while destroying ipv6 listening
servers.

Fixes: 830662f6f0 ("RDMA/cxgb4: Add support for active and passive open connection with IPv6 address")
Link: https://lore.kernel.org/r/20210324190453.8171-1-bharat@chelsio.com
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-25 10:25:58 -03:00
Lang Cheng
847d19a451 RDMA/hns: Support to query firmware version
Implement the ops named get_dev_fw_str to support ib_get_device_fw_str().

Link: https://lore.kernel.org/r/1615882161-53827-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-23 17:00:38 -03:00
Shay Drory
ad50294d4d RDMA/mlx5: Create ODP EQ only when ODP MR is created
There is no need to create the ODP EQ if the user doesn't use ODP MRs.
Hence, create it only when the first ODP MR is created. This EQ will be
destroyed only when the device is unloaded.
This will decrease the number of EQs created per device. For example, if we
create 1K devices (SF/VF/etc.), then we will decrease the number of EQs by
1K.

Link: https://lore.kernel.org/r/20210314125418.179716-1-leon@kernel.org
Signed-off-by: Shay Drory <shayd@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-23 17:00:14 -03:00
Weihang Li
783cf673b0 RDMA/hns: Fix memory corruption when allocating XRCDN
It's incorrect to cast the type of the pointer to xrcdn from (u32 *) to
(unsigned long *) and then pass it into hns_roce_bitmap_alloc(); this will
lead to memory corruption.
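
Why the cast corrupts memory, in miniature (call site approximate):

	u32 xrcdn;	/* 4 bytes */

	/* hns_roce_bitmap_alloc() stores through an unsigned long *,
	 * writing 8 bytes on 64-bit and clobbering whatever follows
	 * xrcdn */
	hns_roce_bitmap_alloc(&bitmap, (unsigned long *)&xrcdn);	/* BAD */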

Fixes: 32548870d4 ("RDMA/hns: Add support for XRC on HIP09")
Link: https://lore.kernel.org/r/1616381069-51759-1-git-send-email-liweihang@huawei.com
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 21:46:37 -03:00
Leon Romanovsky
fdb68dd30e RDMA: Delete not-used static inline functions
Perform mass deletion of static inline functions that are not used.

Link: https://lore.kernel.org/r/20210314133908.291945-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 09:31:19 -03:00
Leon Romanovsky
ae360f41b1 RDMA: Fix kernel-doc compilation warnings
This patch fixes a bunch of kernel-doc compilation warnings like the one below:

 drivers/infiniband/hw/i40iw/i40iw_cm.c:4372: warning: expecting prototype for i40iw_ifdown_notify(). Prototype was for i40iw_if_notify() instead

Link: https://lore.kernel.org/r/20210314133908.291945-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 09:31:19 -03:00
Leon Romanovsky
b5486430bb RDMA/mlx5: Add missing returned error check of mlx5_ib_dereg_mr
Fix the following smatch error:

drivers/infiniband/hw/mlx5/mr.c:1950 mlx5_ib_dereg_mr() error: uninitialized symbol 'rc'.

Fixes: e6fb246cca ("RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()")
Link: https://lore.kernel.org/r/20210314082250.10143-1-leon@kernel.org
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 09:28:51 -03:00
Lang Cheng
af06b628a6 RDMA/hns: Fix bug during CMDQ initialization
When reloading the driver, the head/tail pointers of the CMDQ may not be at
position 0. Then during initialization of the CMDQ, if the head is reset
first, the firmware will start to handle the CMDQ because the head is no
longer equal to the tail. The driver can reset the tail first, since the
firmware is triggered only by the head. This bug was introduced by changing
the macros of the head/tail registers without changing the order of
initialization.

Fixes: 292b3352bd ("RDMA/hns: Adjust fields and variables about CMDQ tail/head")
Link: https://lore.kernel.org/r/1615602611-7963-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-22 09:25:57 -03:00
Mark Bloch
3a46f4fb55 net/mlx5: E-Switch, Refactor send to vport to be more generic
Now that each representor stores a pointer to the managing E-Switch
use that information when creating the send-to-vport rules.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-12 13:07:46 -08:00
Mark Bloch
658cfceb62 RDMA/mlx5: Use representor E-Switch when getting netdev and metadata
Now that a pointer to the managing E-Switch is stored in the representor
use it.

Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Acked-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-12 13:07:23 -08:00
Jason Gunthorpe
7610ab57de RDMA/mlx5: Allow larger pages in DevX umem
The umem DMA list calculation was locked at 4k pages due to confusion
around how this API works and is used when larger pages are present.

The conclusion is:

 - umem's cannot extend past what is mapped into the process, so creating
   a large page size and referring to a sub-range is not allowed

 - umem's must always have a page offset of zero, except for sub PAGE_SIZE
   umems

 - The feature of umem_offset to create multiple objects inside a umem
   is buggy and isn't used anyplace. Thus we can assume all users of the
   current API have umem_offset == 0 as well

Provide a new page size calculator that limits the DMA list to the VA
range and enforces umem_offset == 0.

Allow user space to specify the page sizes which it can accept, this
bitmap must be derived from the intended use of the umem, based on
per-usage HW limitations.

Link: https://lore.kernel.org/r/20210304130501.1102577-4-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:20:37 -04:00
Yishai Hadas
2904bb37b3 IB/core: Split uverbs_get_const/default to consider target type
Change uverbs_get_const/uverbs_get_const_default to work properly with
both signed/unsigned parameters.

Current APIs mix s64 and u64 which leads to incorrect check when u64
value was supplied and its upper bit was set. In that case
uverbs_get_const() / uverbs_get_const_default() lower bound check may
fail unexpectedly, target is unsigned (lower bound is 0) but value
became negative as of the s64 usage.
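
A two-line illustration of the failure mode:

	u64 val = 1ULL << 63;		/* legal unsigned attribute value */
	s64 as_signed = (s64)val;	/* negative: a lower bound of 0 rejects it */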

Split to have two different APIs, no change to callers as the required
API will be called internally according to the target type.

Link: https://lore.kernel.org/r/20210304130501.1102577-3-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:20:36 -04:00
Mark Zhang
6fe6e56863 RDMA/mlx5: Fix mlx5 rates to IB rates map
Correct the map between mlx5 rates and corresponding ib rates, as they
don't always have a fixed offset between them.

Fixes: e126ba97db ("mlx5: Add driver for Mellanox Connect-IB adapters")
Link: https://lore.kernel.org/r/20210304124517.1100608-4-leon@kernel.org
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:12:39 -04:00
Maor Gottlieb
7852546f52 RDMA/mlx5: Fix query RoCE port
mlx5_is_roce_enabled() returns the devlink RoCE init value, therefore it
should be used only when the driver is loaded. Instead we just need to read
the roce_en field.

In addition, rename mlx5_is_roce_enabled to mlx5_is_roce_init_enabled.

Fixes: 7a58779edd ("IB/mlx5: Improve query port for representor port")
Link: https://lore.kernel.org/r/20210304124517.1100608-2-leon@kernel.org
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:12:39 -04:00
Jason Gunthorpe
14d05b552b RDMA/mlx5: Rename mlx5_mr_cache_invalidate() to revoke_mr()
Now that this is only used in a few places in mr.c, give it a sensible
name. It has nothing to do with the cache and can be invoked on any
MR. DMA is stopped and the user cannot touch the MR any further once it
completes.

Link: https://lore.kernel.org/r/20210304120745.1090751-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:03:26 -04:00
Jason Gunthorpe
e6fb246cca RDMA/mlx5: Consolidate MR destruction to mlx5_ib_dereg_mr()
Now that the SRCU stuff has been removed the entire MR destroy logic can
be made a lot simpler. Currently there are many different ways to destroy a
MR and it makes it really hard to do this task correctly. Route all
destruction through mlx5_ib_dereg_mr() and make it work for all
situations.

Since it turns out all the different MR types do basically the same thing
this removes a lot of knowledge of MR internals from ODP and leaves ODP
just exporting an operation to clean up children.

This fixes a few weird corner-case bugs and firmly uses the correct
ordering of the MR destruction:
 - Stop parallel access to the mkey via the ODP xarray
 - Stop DMA
 - Release the umem
 - Clean up ODP children
 - Free/Recycle the MR

Link: https://lore.kernel.org/r/20210304120745.1090751-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:03:25 -04:00
Jason Gunthorpe
f18ec42231 RDMA/mlx5: Use a union inside mlx5_ib_mr
The struct mlx5_ib_mr can be used for three different things, but only one
at a time:

 - In the user MR cache
 - As a kernel MR
 - As a user MR

Overlay the three things into a single union with the following rules:

 - If the mr is found on the cache_ent->head list then it is a cache MR
   and umem == NULL. The entire union is zero after the MR is removed from
   the cache.

 - If umem != NULL or type == IB_MR_TYPE_USER then it is a user MR.

 - If umem == NULL then it is a kernel MR

This reduces the size of struct mlx5_ib_mr to 552 bytes from 702.

The only place the three flows overlap in the code is during dereg, so add
a few extra checks along there.

Link: https://lore.kernel.org/r/20210304120745.1090751-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:03:25 -04:00
Jason Gunthorpe
a639e66703 RDMA/mlx5: Zero out ODP related items in the mlx5_ib_mr
All of the ODP code assumes when it calls mlx5_mr_cache_alloc() the ODP
related fields are zero'd. This is true if the MR was just allocated, but
if the MR is recycled through the cache then the values are never zero'd.

This causes a bug in the odp_stats: they don't reset when the MR is
reallocated, and is_odp_implicit is never 0'd.

So that memset() can be used on a block of the mlx5_ib_mr, reorganize the
structure to put all the data that can be zero'd by the cache at the end.

It is organized as an anonymous struct because the next patch will make
this a union.

Delete the unused smr_info. Don't set the kernel only desc_size on the
user path. No longer any need to zero mr->parent before freeing it, the
memset() will get it now.

Fixes: a3de94e3d6 ("IB/mlx5: Introduce ODP diagnostic counters")
Link: https://lore.kernel.org/r/20210304120745.1090751-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 20:03:25 -04:00
Wenpeng Liang
32548870d4 RDMA/hns: Add support for XRC on HIP09
HIP09 supports the XRC transport service, which greatly reduces the number
of QPs required to connect all processes in a large cluster.

Link: https://lore.kernel.org/r/1614826558-35423-1-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 19:51:27 -04:00
Mark Zhang
22053df0a3 RDMA/mlx5: Fix typo in destroy_mkey inbox
Set mkey_index to the correct place when preparing the destroy_mkey inbox,
this will fix the following syndrome:

 mlx5_core 0000:08:00.0: mlx5_cmd_check:782:(pid 980716): DEALLOC_PD(0x801)
 op_mod(0x0) failed, status bad resource state(0x9), syndrome (0x31b635)

Link: https://lore.kernel.org/r/20210311142024.1282004-1-leon@kernel.org
Fixes: 1368ead04c ("RDMA/mlx5: Use strict get/set operations for obj_id")
Reviewed-by: Ido Kalir <idok@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-11 13:35:48 -04:00
Maor Gottlieb
8256c69b2d RDMA/mlx5: Fix timestamp default mode
1. Don't set the ts_format bit to default when it is reserved - the device
   is running in the old mode (free running).
2. XRC doesn't have a CQ, therefore the ts format in the QP
   context should be default / free running.
3. Set ts_format to WQ.

Fixes: 2fe8d4b878 ("RDMA/mlx5: Fail QP creation if the device can not support the CQE TS")
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2021-03-10 11:01:57 -08:00
Lang Cheng
0f00571f94 RDMA/hns: Use new SQ doorbell register for HIP09
HIP09 uses a new address space to map the SQ doorbell registers; the
doorbell of each QP is isolated in a region of 64KB, which can improve
performance in concurrency scenarios.

Link: https://lore.kernel.org/r/1614082833-23130-1-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-10 14:53:46 -04:00
Al Viro
e41d237818 qib_fs: switch to simple_recursive_removal()
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2021-03-08 10:20:48 -05:00
Leon Romanovsky
f91803998c RDMA/mlx5: Set correct kernel-doc identifier
The W=1 allmodconfig build produces the following warning:

drivers/infiniband/hw/mlx5/odp.c:1086: warning: wrong kernel-doc identifier on line:
  * Parse a series of data segments for page fault handling.

Fix it by changing /** to /*, as described in the kernel-doc
documentation.

Fixes: 5e769e444d ("RDMA/hw/mlx5/odp: Fix formatting and add missing descriptions in 'pagefault_data_segments()'")
Link: https://lore.kernel.org/r/20210302074214.1054299-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-03 13:21:24 -04:00
YueHaibing
3a9b3d4536 IB/mlx5: Add missing error code
Set err to -ENOMEM if kzalloc fails instead of 0.

Fixes: 7597385371 ("IB/mlx5: Enable subscription for device events over DEVX")
Link: https://lore.kernel.org/r/20210222122343.19720-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-01 14:49:09 -04:00
Jason Gunthorpe
7289e26f39 Linux 5.11
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmAppPgeHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGeXYH/imZPBd4A1jIMehN
 5HV2A53Z+MXmmaMuGj9X1KV6vsf55/xB+IhOoFdtRAIsO8c2yYSCO8i4+4R0XfYA
 +/YFJeq672rojQnmh6XbpR8dugaAV7CUHy6n7KDsyvtT6EOCpwFSwkOb4X3tBRX6
 TlYgm2d/xgV/wRHSgLVugK0MdFCLMAnyb7mkPfar9QrMgG1BiDKLq07xmwnS23On
 TkqpJ9yZ/rJpUrrUqQYPShSO/FmA+fSfWs0CDv7EIrJ40LUScD6PZxSHWTIHtjLk
 E4jFda6wuqLRVWsBwaBzUIdD0zk7X5quHRzEpbC5ga16SK6yrWvE5YJJXCguIEuZ
 f3FMRYs=
 =CAjn
 -----END PGP SIGNATURE-----

Merge tag 'v5.11' into rdma.git for-next

Linux 5.11

Merged to resolve conflicts with RDMA rc commits

- drivers/infiniband/sw/rxe/rxe_net.c
  The final logic is to call rxe_get_dev_from_net() again with the master
  netdev if the packet was rx'd on a vlan. To keep the elimination of the
  local variables requires a trivial edit to the code in -rc

Link: https://lore.kernel.org/r/20210210131542.215ea67c@canb.auug.org.au
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-18 11:19:29 -04:00
Jason Gunthorpe
68ad4d1cc6 Merge branch 'mlx5_timestamp' into rdma.git for-next
Leon Romanovsky says:

====================
Add an extra timestamp format for mlx5_ib device.
====================

Based on the mlx5-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* branch 'mlx5_timestamp':
  RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
  net/mlx5: Add new timestamp mode bits
2021-02-16 14:49:36 -04:00
Aharon Landau
2fe8d4b878 RDMA/mlx5: Fail QP creation if the device can not support the CQE TS
In the ConnectX6Dx device, HW can work in real-time timestamp mode
according to the device capabilities per RQ/SQ/QP.

When the flag IB_UVERBS_CQ_FLAGS_TIMESTAMP_COMPLETION is set, the user
expects to get the TS on the CQEs in free-running format, so we need to
fail the QP creation if the current mode of the device doesn't support it.

Link: https://lore.kernel.org/r/20210209131107.698833-3-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:48:47 -04:00
Tal Gilboa
7232c132d1 RDMA/mlx5: Allow CQ creation without attached EQs
The traditional DevX CQ creation flow goes through mlx5_core_create_cq()
which checks that the given EQN corresponds to an existing EQ and attaches
a devx handler to the EQN for the CQ.

In some cases the EQ will not be a kernel EQ, but will be controlled by
modify CQ; don't block creating these just because the EQN can't be found
in the kernel.

Link: https://lore.kernel.org/r/20210211085549.1277674-1-leon@kernel.org
Signed-off-by: Tal Gilboa <talgi@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:59 -04:00
Jiapeng Chong
1a93e848b7 RDMA/qedr: Use true and false for bool variable
Fix the following coccicheck warning:

./drivers/infiniband/hw/qedr/qedr.h:629:9-10: WARNING: return of 0/1 in
function 'qedr_qp_has_rq' with return type bool.

./drivers/infiniband/hw/qedr/qedr.h:620:9-10: WARNING: return of 0/1 in
function 'qedr_qp_has_sq' with return type bool.
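
The idiomatic form the warning asks for, sketched (condition hypothetical):

	static inline bool qedr_qp_has_sq(struct qedr_qp *qp)
	{
		/* return true/false, not 1/0, from a bool function */
		return qp->sq.max_wr != 0;
	}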

Link: https://lore.kernel.org/r/1612949901-109873-1-git-send-email-jiapeng.chong@linux.alibaba.com
Reported-by: Abaci Robot<abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Yixing Liu
bf656b029f RDMA/hns: Adjust definition of FRMR fields
FRMR is not well supported on HIP08; it was re-designed for HIP09 and the
position of the related fields has changed. So the ULPs should be forbidden
to use FRMR on older hardware.

Link: https://lore.kernel.org/r/1612924424-28217-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Lang Cheng
5e9914c003 RDMA/hns: Refactor process of posting CMDQ
Simplify __hns_roce_cmq_send() then remove the redundant variables.

Link: https://lore.kernel.org/r/1612688143-28226-6-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Lang Cheng
292b3352bd RDMA/hns: Adjust fields and variables about CMDQ tail/head
The register 0x07014 is actually the head pointer of CMDQ, and 0x07010
means tail pointer. Current definitions are confusing, so rename them and
related variables.

The next_to_use of structure hns_roce_v2_cmq_ring has the same semantics
as head, merge them into one member. The next_to_clean of structure
hns_roce_v2_cmq_ring has the same semantics as tail. After deleting
next_to_clean, tail should also be deleted.

Link: https://lore.kernel.org/r/1612688143-28226-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Lang Cheng
563aeb2266 RDMA/hns: Remove redundant operations on CMDQ
CMDQ works serially, after each successful transmission, the head and tail
pointers will be equal, so there is no need to check whether the queue is
full. At the same time, since the descriptor of each transmission is new,
there is no need to perform a cleanup operation. Thus, the field named
next_to_clean in structure hns_roce_v2_cmq_ring is redundant.

Fixes: a04ff739f2 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-4-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Lang Cheng
8f86e2eada RDMA/hns: Fixes missing error code of CMDQ
When posting a multi-descriptor command, the error code of a previously
failed descriptor may be overwritten with 0 by a later successful
descriptor.
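
The usual pattern for such a fix, sketched with hypothetical field and
helper names:

	int ret = 0, i;

	for (i = 0; i < num; i++) {
		u16 status = le16_to_cpu(desc[i].retval);

		/* keep the first failure; don't let a later success
		 * overwrite it with 0 */
		if (status && !ret)
			ret = hns_roce_cmq_status_to_errno(status);
	}
	return ret;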

Fixes: a04ff739f2 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-3-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Lang Cheng
229557230c RDMA/hns: Remove unused member and variable of CMDQ
last_status of structure hns_roce_v2_cmq has never been used, and the
variable named 'complete' in __hns_roce_cmq_send() is meaningless.

Fixes: a04ff739f2 ("RDMA/hns: Add command queue support for hip08 RoCE driver")
Link: https://lore.kernel.org/r/1612688143-28226-2-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-16 14:42:58 -04:00
Patrisious Haddad
c70f51de85 RDMA/mlx5: Support 400Gbps IB rate in mlx5 driver
Support 400Gbps IB rate in mlx5 driver.

Link: https://lore.kernel.org/r/20210209130429.698237-1-leon@kernel.org
Signed-off-by: Patrisious Haddad <phaddad@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Reviewed-by: Maor Gottlieb <maorg@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-09 14:00:35 -04:00
Sebastian Andrzej Siewior
a14e3caaaa RDMA/qedr: Remove in_irq() usage from debug output
qedr_gsi_post_send() has a debug output which prints the return value of
in_irq() and irqs_disabled().

The result of the in_irq(), even if invoked from an interrupt handler, is
subject to change depending on the `threadirqs' command line switch.  The
result of irqs_disabled() is always 1 because the function acquires
spinlock_t with spin_lock_irqsave().

Remove in_irq() and irqs_disabled() from the debug output because it
provides little value.

Link: https://lore.kernel.org/r/20210208193347.383254-1-bigeasy@linutronix.de
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:43:11 -04:00
Yishai Hadas
db72438c93 RDMA/mlx5: Cleanup the synchronize_srcu() from the ODP flow
Cleanup the synchronize_srcu() from the ODP flow as it was found to be a
very heavy time consumer as part of dereg_mr.

For example, de-registration of 10000 ODP MRs, each with a size of a 2M
hugepage, took 19.6 sec, compared to de-registration of the same number of
non-ODP MRs, which took 172 ms.

The new locking scheme uses the wait_event() mechanism which follows the
use count of the MR instead of using synchronize_srcu().

By that change, the time required for the above test took 95 ms which is
even better than the non ODP flow.

Once the SRCU usage was fully dropped, a lock had to be introduced to
protect the XA access.

As part of using the above mechanism we could also clean the
num_deferred_work stuff and follow the use count instead.
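
A generic sketch of the scheme (names hypothetical):

	/* user path: drop a use and wake any waiter */
	if (atomic_dec_and_test(&mr->num_pending))
		wake_up(&mr->q_pending);

	/* dereg_mr path: block until every in-flight user is done */
	wait_event(mr->q_pending, !atomic_read(&mr->num_pending));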

Link: https://lore.kernel.org/r/20210202071309.2057998-1-leon@kernel.org
Signed-off-by: Yishai Hadas <yishaih@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:31:11 -04:00
Xinhao Liu
a5887d6207 RDMA/hns: Delete redundant judgment when preparing descriptors
There is no need to use a for loop to assign values for an array of cmd
descriptors which has only two elements.

Link: https://lore.kernel.org/r/1612517974-31867-13-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:26 -04:00
Yixian Liu
cd0a4baf36 RDMA/hns: Remove unnecessary wrap around for EQ's consumer index
The hns driver wraps around the consumer index of the AEQ and CEQ when it
reaches twice the number of queue entries for the owner mechanism.
Actually, it is unnecessary to wrap around, since the hardware itself will
mask it before use.

Link: https://lore.kernel.org/r/1612517974-31867-12-git-send-email-liweihang@huawei.com
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:26 -04:00
Lang Cheng
62490fd5a8 RDMA/hns: Avoid unnecessary memset on WQEs in post_send
All fields of the WQE will be rewritten, so the memset is unnecessary. And
when the SQ is working in OWNER mode, the pipeline may prefetch WQEs beyond
the PI, and the memset operation may flip the owner bit too early; then the
pipeline may get a wrong WQE.

Link: https://lore.kernel.org/r/1612517974-31867-11-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:25 -04:00
Xinhao Liu
993703370a RDMA/hns: Remove some magic numbers
Use macros instead of magic numbers to represent the shifts of
dma_handle_wqe and dma_handle_idx and the UDP destination port number of
RoCEv2.
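
For example (the macro names are illustrative; 4791 is the IANA-assigned
RoCEv2 UDP destination port):

#define DMA_IDX_SHIFT		3
#define DMA_WQE_SHIFT		3
#define ROCE_V2_UDP_DPORT	4791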

Link: https://lore.kernel.org/r/1612517974-31867-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xinhao Liu <liuxinhao5@hisilicon.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:25 -04:00
Lang Cheng
c05ffb1f7d RDMA/hns: Move HIP06 related definitions into hns_roce_hw_v1.h
hns_roce_device.h is not specific to any hardware version, but some of
its definitions are only used for HIP06; they should be moved into
hns_roce_hw_v1.h.

Link: https://lore.kernel.org/r/1612517974-31867-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:25 -04:00
Lang Cheng
86f767e6fc RDMA/hns: Replace wmb&__raw_writeq with writeq
Currently, the driver's doorbell update looks like this:

post()
{
	wqe.field = 0x111;
	wmb();
	update_wq_db();
}

update_wq_db()
{
	db.field = 0x222;
	__raw_writeq(db, db_reg);
}

writeq() is a better choice than __raw_writeq() because it calls
dma_wmb() as a barrier on ARM64, and dma_wmb() is better suited than
wmb() for the ROCEE device.

This patch removes all wmb() calls before updating the doorbells of the
SQ/RQ/CQ/SRQ by replacing __raw_writeq() with writeq(), improving
performance. The new process looks like this:

post()
{
	wqe.field = 0x111;
	update_wq_db();
}

update_wq_db()
{
	db.field = 0x222;
	writeq(db, db_reg);
}

Link: https://lore.kernel.org/r/1612517974-31867-8-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:25 -04:00
Yixing Liu
3fe07a008e RDMA/hns: Skip qp_flow_control_init() for HIP09
Since HIP09 does not require this function, it should be skipped.
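
A sketch of the version gate at the top of the function, assuming the
driver's usual PCI revision check (illustrative):

if (hr_dev->pci_dev->revision >= PCI_REVISION_ID_HIP09)
	return 0;	/* HIP09 needs no SQ flow control initialization */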

Link: https://lore.kernel.org/r/1612517974-31867-7-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:25:25 -04:00
Lijun Ou
7373de9adb RDMA/hns: Disable RQ inline by default
This feature should only be enabled according to the capability queried
from firmware.

Fixes: ba6bb7e974 ("RDMA/hns: Add interfaces to get pf capabilities from firmware")
Link: https://lore.kernel.org/r/1612517974-31867-5-git-send-email-liweihang@huawei.com
Signed-off-by: Lijun Ou <oulijun@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:15:10 -04:00
Xi Wang
9ea9a53ea9 RDMA/hns: Add mapped page count checking for MTR
Add a check on the mapped page count to avoid using an invalid page
size when creating an MTR.
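
A sketch of the kind of check added, assuming mapped_cnt counts the
pages actually written to the MTT (names and the error label are
illustrative):

if (mapped_cnt < page_cnt) {
	dev_err(dev, "failed to map mtr: mapped %d, expected %d\n",
		mapped_cnt, page_cnt);
	ret = -ENOBUFS;
	goto err_alloc_mtt;
}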

Fixes: 38389eaa4d ("RDMA/hns: Add mtr support for mixed multihop addressing")
Link: https://lore.kernel.org/r/1612517974-31867-4-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:15:10 -04:00
Weihang Li
ea4092f3b5 RDMA/hns: Fix type of sq_signal_bits
This field should be of type enum ib_sig_type, otherwise sparse will
complain about it.
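
The fix is a one-line type change of the field in the QP structure (a
sketch; the exact previous integer type is an assumption):

enum ib_sig_type	sq_signal_bits;	/* previously a plain integer */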

Fixes: bfe860351e ("RDMA/hns: Fix cast from or to restricted __le32 for driver")
Link: https://lore.kernel.org/r/1612517974-31867-3-git-send-email-liweihang@huawei.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:15:09 -04:00
Weihang Li
773f841ab1 RDMA/hns: Avoid filling sgid index when modifying QP to RTR
ULPs usually set IB(V)_QP_AV when trying to modify a QP to RTR if they
want to record the sgid index in the QPC. For UD QPs this is useless
because the sgid index is carried in the WQE, and for RC QPs it is
filled in hns_roce_set_path(), so the sgid index shouldn't be filled by
default. hns_get_gid_index() is also moved to hns_roce_hw_v1.c because
it is only called there.

Fixes: 926a01dc00 ("RDMA/hns: Add QP operations support for hip08 SoC")
Link: https://lore.kernel.org/r/1612517974-31867-2-git-send-email-liweihang@huawei.com
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:15:09 -04:00
Yixing Liu
01584a5edc RDMA/hns: Add support of direct wqe
Direct WQE is a mechanism to fill a WQE directly into the hardware.
Under light load, the WQE is written into the PCIe BAR space of the
hardware, which saves one memory access operation and therefore reduces
latency.
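
A conceptual sketch (not the actual hns code): instead of only ringing a
doorbell that points at host memory, the whole WQE is pushed through an
ioremapped BAR window:

static void write_dwqe(void __iomem *bar, const u64 *wqe, int nwords)
{
	int i;

	/* the device latches the WQE from the BAR writes directly,
	 * skipping the DMA read of the host WQE buffer */
	for (i = 0; i < nwords; i++)
		writeq(wqe[i], bar + i * sizeof(u64));
}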

Link: https://lore.kernel.org/r/1611997513-27107-1-git-send-email-liweihang@huawei.com
Signed-off-by: Yixing Liu <liuyixing1@huawei.com>
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 20:12:57 -04:00
Wenpeng Liang
204cbe423b RDMA/hns: Add verification of QP type when post_recv
post_recv() only supports the RC, GSI and UD QP types.
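
A sketch of the verification, using the standard ib_qp_type values (the
helper name is illustrative):

static int check_recv_qp_type(struct ib_qp *ibqp)
{
	switch (ibqp->qp_type) {
	case IB_QPT_RC:
	case IB_QPT_GSI:
	case IB_QPT_UD:
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}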

Link: https://lore.kernel.org/r/1611997090-48820-13-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:36 -04:00
Wenpeng Liang
2e07a3d945 RDMA/hns: Refactor hns_roce_v2_post_srq_recv()
The SRQ in the hns driver consists of the following four parts:

* wqe buf: the buffer that stores the WQEs.

* wqe_idx buf: the CQEs of an SRQ may not be generated in the order of
  the WQEs, so the wqe_idx corresponding to an idle WQE needs to be
  pushed into the index queue, a FIFO, which then instructs the hardware
  to fetch the corresponding WQE.

* bitmap: used to allocate and release wqe_idx. When the user posts a
  new WR, the driver finds the idx of an idle WQE in the bitmap; when
  the CQE of that WQE is generated, the driver releases the idx (see the
  sketch after this list).

* wr_id buf: stores the user's wr_id, which is returned to the user when
  the poll_cq verb is invoked.

The process of posting SRQ recv is refactored to make the preceding code
clearer.
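
A minimal sketch of the wqe_idx allocation against the bitmap (names are
illustrative, not the actual hns code); the CQE path releases an index
with a matching clear_bit():

static int srq_alloc_wqe_idx(unsigned long *bitmap, u32 max_wqes)
{
	u32 idx = find_first_zero_bit(bitmap, max_wqes);

	if (idx == max_wqes)
		return -ENOMEM;	/* SRQ is full */
	set_bit(idx, bitmap);
	return idx;
}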

Link: https://lore.kernel.org/r/1611997090-48820-12-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:35 -04:00
Xi Wang
6b981e2bd9 RDMA/hns: Clear remaining unused sges when post_recv
HIP09 requires the driver to clear the unused data segments in the WQE
buffer to make the hns ROCEE stop reading the remaining invalid SGEs of
the RQ.
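
A sketch of the idea (the invalid-lkey value and field names are
assumptions about the driver):

/* terminate the SGE list so the hardware stops scanning after the
 * last valid entry */
if (wr->num_sge < max_sge) {
	dseg[wr->num_sge].lkey = cpu_to_le32(HNS_ROCE_INVALID_LKEY);
	dseg[wr->num_sge].addr = 0;
}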

Link: https://lore.kernel.org/r/1611997090-48820-11-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:35 -04:00
Xi Wang
9ae2a37e6a RDMA/hns: Refactor post recv flow
Refactor the post recv flow by removing unnecessary checks and
duplicated code.

Link: https://lore.kernel.org/r/1611997090-48820-10-git-send-email-liweihang@huawei.com
Signed-off-by: Xi Wang <wangxi11@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:35 -04:00
Lang Cheng
3f31c41265 RDMA/hns: Use new interfaces to write SRQC
Use the new register operation interfaces to simplify the process of
writing the SRQ context.
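
Illustrative usage of the new helpers (the SRQC_* field names are
assumptions following the driver's convention):

hr_reg_write(ctx, SRQC_SRQ_ST, 1);
hr_reg_write(ctx, SRQC_PD, pdn);
hr_reg_write(ctx, SRQC_SRQN, srq->srqn);

Each field is named once, with its mask/shift pair carried by the field
definition instead of being spelled out at every call site.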

Link: https://lore.kernel.org/r/1611997090-48820-9-git-send-email-liweihang@huawei.com
Signed-off-by: Lang Cheng <chenglang@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:35 -04:00
Wenpeng Liang
eacb45ca8f RDMA/hns: Refactor code about SRQ Context
Reduce the number of parameters of write_srqc() and move some related
code into it from alloc_srqc().

Link: https://lore.kernel.org/r/1611997090-48820-8-git-send-email-liweihang@huawei.com
Signed-off-by: Wenpeng Liang <liangwenpeng@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-02-08 19:37:35 -04:00