Commit Graph

570 Commits

Author SHA1 Message Date
Kashyap Desai
bcfee4ce3e RDMA/bnxt_re: remove redundant cmdq_bitmap
cmdq_bitmap is used to derive the next available index in the CMDQ.
This is not required as the we can get the next index
using the existing bnxt_qplib_crsqe array.

Driver will use bnxt_qplib_crsqe array and flag is_in_used to
derive valid entries. is_in_used  is replacement of cmdq_bitmap.
There is no change in the existing mechanism of the circular buffer
used to get index.

Added opcode field in bnxt_qplib_crsqe array so that it is easy to map
opcode associated with pending rcfw command.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-17-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:54 +03:00
Kashyap Desai
f0c875ff62 RDMA/bnxt_re: use firmware provided max request timeout
Firmware provides max request timeout value as part of hwrm_ver_get
API. Driver gets the timeout from firmware and if that interface is
not available then fall back to hardcoded timeout value.
Also, Add a helper function to check the FW status.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-16-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:54 +03:00
Kashyap Desai
a00278521c RDMA/bnxt_re: cancel all control path command waiters upon error
When an error is detected in FW, wake up all the waiters as the
all of them need to be completed with timeout. Add the device
error state also as a wait condition.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-15-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:51 +03:00
Kashyap Desai
bb8c93618f RDMA/bnxt_re: consider timeout of destroy ah as success.
If destroy_ah is timed out, it is likely to be destroyed by firmware
but it is taking longer time due to temporary slowness
in processing the rcfw command. In worst case, there might be
AH resource leak in firmware.

Sending timeout return value can dump warning message from ib_core
which can be avoided if we map timeout of destroy_ah as success.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-14-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
84911cf3b2 RDMA/bnxt_re: post destroy_ah for delayed completion of AH creation
AH create may be called from interrpt context and driver has a special
timeout (8 sec) for this command. This is to avoid soft lockups when
the FW command takes more time. Driver returns -ETIMEOUT and fail
create AH, without waiting for actual completion from firmware.
When FW completion is received, use is_waiter_alive flag to avoid
a regular completion path.

If create_ah opcode is detected in completion path which does not have
waiter alive, driver will fetch ah_id from successful firmware
completion in the interrupt context and sends destroy_ah command
for same ah_id. This special post is done in quick manner using helper
function __send_message_no_waiter.

timeout_send is only used for debugging purposes.
If timeout_send value keeps incrementing, it indicates out of sync
active ah counter between driver and firmware. This is a limitation
but graceful handling is possible in future.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-13-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
b6c7256688 RDMA/bnxt_re: Add firmware stall check detection
Every completion will update last_seen value in the unit of jiffies.
last_seen field will be used to know if firmware is alive and
is useful to detect firmware stall.

Non blocking interface __wait_for_resp will have logic to detect
firmware stall. After every 10 second interval if __wait_for_resp
has not received completion for a given command it will check for
firmware stall condition.

If current jiffies is greater than last_seen
jiffies + RCFW_FW_STALL_TIMEOUT_SEC * HZ, it is a firmware stall.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-12-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
691eb7c611 RDMA/bnxt_re: handle command completions after driver detect a timedout
If calling context detect command timeout, associated memory stored on
stack will not be valid. If firmware complete the same command later,
this causes incorrect memory access by driver.

Added is_waiter_alive to handle delayed completion by firmware.
is_waiter_alive is set and reset under command queue lock.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-11-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
354f5bd985 RDMA/bnxt_re: add helper function __poll_for_resp
This interface will be used if the driver has not enabled interrupt
and/or interrupt is disabled for a short period of time.
Completion is not possible from interrupt so this interface does
self-polling.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-10-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
159cf95e42 RDMA/bnxt_re: Simplify the function that sends the FW commands
- Use __send_message_basic_sanity helper function.
 - Do not retry posting same command if there is a queue full detection.
 - ENXIO is used to indicate controller recovery.
 - In the case of ERR_DEVICE_DETACHED state, the driver should not post
   commands to the firmware, but also return fabricated written code.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-9-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
65288a22dd RDMA/bnxt_re: use shadow qd while posting non blocking rcfw command
Whenever there is a fast path IO and create/destroy resources from
the slow path is happening in parallel, we may notice high latency
of slow path command completion.

Introduces a shadow queue depth to prevent the outstanding requests
to the FW. Driver will not allow more than #RCFW_CMD_NON_BLOCKING_SHADOW_QD
non-blocking commands to the Firmware.

Shadow queue depth is a soft limit only for non-blocking
commands. Blocking commands will be posted to the firmware
as long as there is a free slot.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-8-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:17 +03:00
Kashyap Desai
3022cc1511 RDMA/bnxt_re: Avoid the command wait if firmware is inactive
Add a check to avoid waiting if driver already detects a
FW timeout. Return success for resource destroy in case
the device is detached. Add helper function to map timeout
error code to success.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kashyap Desai
8cf1d12ad5 RDMA/bnxt_re: Enhance the existing functions that wait for FW responses
Use jiffies based timewait instead of counting iteration for
commands that block for FW response.

Also add a poll routine for control path commands. This is for
polling completion if the waiting commands timeout. This avoids cases
where the driver misses completion interrupts.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kashyap Desai
258ee04317 RDMA/bnxt_re: set fixed command queue depth
There is no need of setting max command queue entries based on
firmware version check.

Removing deperecated code.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-5-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kashyap Desai
b021186bca RDMA/bnxt_re: remove virt_func check while creating RoCE FW channel
There is a common FW communication offset for both PF and VF.
Removed code around virt_fn check while creating FW channel.

Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kashyap Desai
3099bcdc19 RDMA/bnxt_re: Avoid calling wake_up threads from spin_lock context
bnxt_qplib_service_creq can be called from interrupt or tasklet or
process context. So the function take irq variant  of spin_lock.
But when wake_up is invoked with the lock held, it is putting the
calling context to sleep.

[exception RIP: __wake_up_common+190]
RIP: ffffffffb7539d7e  RSP: ffffa73300207ad8  RFLAGS: 00000083
RAX: 0000000000000001  RBX: ffff91fa295f69b8  RCX: dead000000000200
RDX: ffffa733344af940  RSI: ffffa73336527940  RDI: ffffa73336527940
RBP: 000000000000001c   R8: 0000000000000002   R9: 00000000000299c0
R10: 0000017230de82c5  R11: 0000000000000002  R12: ffffa73300207b28
R13: 0000000000000000  R14: ffffa733341bf928  R15: 0000000000000000
ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Call the wakeup after releasing the lock.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kashyap Desai
0af91306e1 RDMA/bnxt_re: wraparound mbox producer index
Driver is not handling the wraparound of the mbox producer index correctly.
Currently the wraparound happens once u32 max is reached.

Bit 31 of the producer index register is special and should be set
only once for the first command. Because the producer index overflow
setting bit31 after a long time, FW goes to initialization sequence
and this causes FW hang.

Fix is to wraparound the mbox producer index once it reaches u16 max.

Fixes: cee0c7bba4 ("RDMA/bnxt_re: Refactor command queue management code")
Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1686308514-11996-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-06-12 10:10:16 +03:00
Kamal Heib
18e7e3e421 RDMA/bnxt_re: Fix reporting active_{speed,width} attributes
After commit 6d758147c7 ("RDMA/bnxt_re: Use auxiliary driver interface")
the active_{speed, width} attributes are reported incorrectly, This is
happening because ib_get_eth_speed() is called only once from
bnxt_re_ib_init() - Fix this issue by calling ib_get_eth_speed() from
bnxt_re_query_port().

Fixes: 6d758147c7 ("RDMA/bnxt_re: Use auxiliary driver interface")
Link: https://lore.kernel.org/r/20230529153525.87254-1-kheib@redhat.com
Signed-off-by: Kamal Heib <kheib@redhat.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-06-01 20:01:29 -03:00
Kalesh AP
8c1ee346da RDMA/bnxt_re: Remove unnecessary checks
The NULL check inside bnxt_qplib_del_sgid() and bnxt_qplib_add_sgid()
always return false as the "sgid_tbl" inside "rdev->qplib_res" is a static
memory.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684478897-12247-8-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:19 -03:00
Kalesh AP
07d5ce14b2 RDMA/bnxt_re: Return directly without goto jumps
When there is no cleanup to be done, return directly.  This will help
eliminating unnecessary local variables and goto labels.  This patch fixes
such occurrences in qplib_fp.c file.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Fixes: 159fb4ceac ("RDMA/bnxt_re: introduce a function to allocate swq")
Link: https://lore.kernel.org/r/1684478897-12247-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:19 -03:00
Kalesh AP
43774bc156 RDMA/bnxt_re: Fix to remove an unnecessary log
During destroy_qp, driver sets the qp handle in the existing CQEs
belonging to the QP being destroyed to NULL. As a result, a poll_cq after
destroy_qp can report unnecessary messages.  Remove this noise from system
logs.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684478897-12247-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:18 -03:00
Kalesh AP
b989f90cef RDMA/bnxt_re: Remove a redundant check inside bnxt_re_update_gid
The NULL check inside bnxt_re_update_gid() always return false.  If
sgid_tbl->tbl is not allocated, then dev_init would have failed.

Fixes: 5fac5b1b29 ("RDMA/bnxt_re: Add vlan tag for untagged RoCE traffic when PFC is configured")
Link: https://lore.kernel.org/r/1684478897-12247-5-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:18 -03:00
Kalesh AP
ff2e4bfd16 RDMA/bnxt_re: Use unique names while registering interrupts
bnxt_re currently uses the names "bnxt_qplib_creq" and "bnxt_qplib_nq-0"
while registering IRQs. There is no way to distinguish the IRQs of
different device ports when there are multiple IB devices registered.
This could make the scenarios worse where one want to pin IRQs of a device
port to certain CPUs.

Fixed the code to use unique names which has PCI BDF information while
registering interrupts like: "bnxt_re-nq-0@pci:0000:65:00.0" and
"bnxt_re-creq@pci:0000:65:00.1".

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684478897-12247-4-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Bhargava Chenna Marreddy <bhargava.marreddy@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:18 -03:00
Kalesh AP
9b3ee47796 RDMA/bnxt_re: Fix to remove unnecessary return labels
If there is no cleanup needed then just return directly.  This cleans up
the code and improve readability.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684478897-12247-3-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Saravanan Vajravel <saravanan.vajravel@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:18 -03:00
Selvin Xavier
ab112ee789 RDMA/bnxt_re: Disable/kill tasklet only if it is enabled
When the ulp hook to start the IRQ fails because the rings are not
available, tasklets are not enabled. In this case when the driver is
unloaded, driver calls CREQ tasklet_kill. This causes an indefinite hang
as the tasklet is not enabled.

Driver shouldn't call tasklet_kill if it is not enabled. So using the
creq->requested and nq->requested flags to identify if both tasklets/irqs
are registered. Checking this flag while scheduling the tasklet from
ISR. Also, added a cleanup for disabling tasklet, in case request_irq
fails during start_irq.

Check for return value for bnxt_qplib_rcfw_start_irq and in case the
bnxt_qplib_rcfw_start_irq fails, return bnxt_re_start_irq without
attempting to start NQ IRQs.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684478897-12247-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:26:18 -03:00
Kalesh AP
dd5fb04857 RDMA/bnxt_re: Do not enable congestion control on VFs
Congestion control needs to be enabled only on the PFs. FW fails the
command if issued on VFs. Avoid sending the command on VFs.

Fixes: f13bcef04b ("RDMA/bnxt_re: Enable congestion control by default")
Link: https://lore.kernel.org/r/1684397461-23082-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:14:27 -03:00
Kalesh AP
0fa0d520e2 RDMA/bnxt_re: Fix return value of bnxt_re_process_raw_qp_pkt_rx
bnxt_re_process_raw_qp_pkt_rx() always return 0 and ignores the return
value of bnxt_re_post_send_shadow_qp().

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684397461-23082-3-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:14:19 -03:00
Kalesh AP
349e3c0cf2 RDMA/bnxt_re: Fix a possible memory leak
Inside bnxt_qplib_create_cq(), when the check for NULL DPI fails, driver
returns directly without freeing the memory allocated inside
bnxt_qplib_alloc_init_hwq() routine.

Fixed this by moving the check for NULL DPI before invoking
bnxt_qplib_alloc_init_hwq().

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1684397461-23082-2-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-19 13:09:25 -03:00
Selvin Xavier
08c7f09356 RDMA/bnxt_re: Fix the page_size used during the MR creation
Driver populates the list of pages used for Memory region wrongly when
page size is more than system page size. This is causing a failure when
some of the applications that creates MR with page size as 2M.  Since HW
can support multiple page sizes, pass the correct page size while creating
the MR.

Also, driver need not adjust the number of pages when HW Queues are
created with user memory. It should work with the number of dma blocks
returned by ib_umem_num_dma_blocks. Fix this calculation also.

Fixes: 0c4dcd6028 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Fixes: f6919d5638 ("RDMA/bnxt_re: Code refactor while populating user MRs")
Link: https://lore.kernel.org/r/1683484169-9539-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Kashyap Desai <kashyap.desai@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2023-05-12 18:30:30 -03:00
Selvin Xavier
f13bcef04b RDMA/bnxt_re: Enable congestion control by default
Enable Congesion control by default. Issue FW command
enable the CC during driver load and disable it during
unload.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-8-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
c682c6eda0 RDAM/bnxt_re: Use tlv apis while processing the slow path commands
Use the new TLV APIs for existing slow path commands. The TLV
APIs will be used to populate extended headers for some of the
Firmware commands, which will be introduced in the patches that
follow.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
0722f1f7bf RDMA/bnxt_re: RoCE slow path TLV support
Header file to support  TLV encapsulated commands. These
functions will be used by the driver in the follow up patches.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
ff015bcd21 RDMA/bnxt_re: Reduce number of argumets to control path command APIs
Reducing the number of arguments to bnxt_qplib_rcfw_send_message
by enclosing all its arguments into a command message structure.
Use the same struct while passing the command information to
send_message.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-5-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
e576adf583 RDMA/bnxt_re: Convert RCFW_CMD_PREP macro to static inline function
Convert RCFW_CMD_PREP macro to static inline function.
Also, remove the cmd_flags passed as none of the functions
are using it.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
b400acee06 RDMA/bnxt_re: Remove HW queue mapping from RoCE Driver
bnxt_en driver does the queue mapping for RoCE traffic. Removing the
queue mapping from RoCE driver.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Selvin Xavier
a9a457f338 RDMA/bnxt_re: Update HW interface headers
Updating the HW structures to the latest version.
This is copied from the code maintained internally. No functionality
changes in this patch. Code is re-organized to match the file maintained
in the internal tree. Also, New HW interface structures are added, which
will be used by the drivers in future.

CC: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1680169540-10029-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-04-04 09:17:21 +03:00
Tom Rix
1b69f1e3d7 RDMA/bnxt_re: remove unused num_srqne_processed and num_cqne_processed variables
clang with W=1 reports
drivers/infiniband/hw/bnxt_re/qplib_fp.c:303:6: error: variable
  'num_srqne_processed' set but not used [-Werror,-Wunused-but-set-variable]
        int num_srqne_processed = 0;
            ^
drivers/infiniband/hw/bnxt_re/qplib_fp.c:304:6: error: variable
  'num_cqne_processed' set but not used [-Werror,-Wunused-but-set-variable]
        int num_cqne_processed = 0;
            ^
These variables are not used so remove them.

Signed-off-by: Tom Rix <trix@redhat.com>
Link: https://lore.kernel.org/r/20230325140559.1336056-1-trix@redhat.com
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-29 14:03:56 +03:00
Selvin Xavier
d54bd5abf4 RDMA/bnxt_re: Add resize_cq support
Add resize_cq verb support for user space CQs. Resize operation for
kernel CQs are not supported now.

Driver should free the current CQ only after user library polls
for all the completions and switch to new CQ. So after the resize_cq
is returned from the driver, user library polls for existing completions
and store it as temporary data. Once library reaps all completions in the
current CQ, it invokes the ibv_cmd_poll_cq to inform the driver about
the resize_cq completion. Adding a check for user CQs in driver's
poll_cq and complete the resize operation for user CQs.
Updating uverbs_cmd_mask with poll_cq to support this.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1678868215-23626-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2023-03-29 10:03:43 +03:00
Ajit Khaparde
3034322113 bnxt_en: Remove runtime interrupt vector allocation
Modified the bnxt_en code to create and pre-configure RDMA devices
with the right MSI-X vector count for the ROCE driver to use.
This is to align the ROCE driver to the auxiliary device model which
will simply bind the driver without getting into PCI-related handling.
All PCI-related logic will now be in the bnxt_en driver.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:20 -08:00
Ajit Khaparde
a43c26fa2e RDMA/bnxt_re: Remove the sriov config callback
Remove the SRIOV config callback which the bnxt_en was calling
to reconfigure the chip resources for a PF device when VFs are
created. The code is now modified to provision the VF resources
based on the total VF count instead of the actual VF count.
This allows the SRIOV config callback to be removed from the
list of ulp_ops.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:18 -08:00
Hongguang Gao
848dc857c8 bnxt_en: Remove struct bnxt access from RoCE driver
Decouple RoCE driver from directly accessing L2's private bnxt
structure. Move the fields needed by RoCE driver into bnxt_en_dev.
They'll be passed to RoCE driver by bnxt_rdma_aux_device_add()
function.

Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:16 -08:00
Ajit Khaparde
3b65e9456c bnxt_en: Use auxiliary bus calls over proprietary calls
Wherever possible use the function ops provided by auxiliary bus
instead of using proprietary ops.

Defined bnxt_re_suspend and bnxt_re_resume calls which can be
invoked by the bnxt_en driver instead of the ULP stop/start calls.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:14 -08:00
Ajit Khaparde
63669ab384 bnxt_en: Use direct API instead of indirection
For a single ULP user there is no need for complicating function
indirection calls. Remove all this complexity in favour of direct
function calls exported by the bnxt_en driver. This allows to
simplify the code greatly. Also remove unused ulp_async_notifier.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:12 -08:00
Ajit Khaparde
dafcdf5e2b bnxt_en: Remove usage of ulp_id
Since the driver continues to use the single ULP model,
the extra complexity and indirection is unnecessary.
Remove the usage of ulp_id from the code.

Suggested-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:10 -08:00
Ajit Khaparde
6d758147c7 RDMA/bnxt_re: Use auxiliary driver interface
Use auxiliary driver interface for driver load, unload ROCE driver.
The driver does not need to register the interface using the netdev
notifier anymore. Removed the bnxt_re_dev_list which is not needed.
Currently probe, remove and shutdown ops have been implemented for
the auxiliary device.
Also remove exccessve validation checks for rdev.

Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
2023-02-01 19:02:08 -08:00
Wolfram Sang
2c34bb6dea IB: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Link: https://lore.kernel.org/r/20220818210018.6841-1-wsa+renesas@sang-engineering.com
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
2022-08-21 14:18:02 +03:00
Jiang Jian
fd46ef3d82 RDMA: Correct duplicated words in comments
There is a duplicated word 'is' and 'for' in a comment that needs to be
dropped.

Link: https://lore.kernel.org/r/20220622170853.3644-1-jiangjian@cdjrlc.com
Link: https://lore.kernel.org/r/20220623103708.43104-1-jiangjian@cdjrlc.com
Signed-off-by: Jiang Jian <jiangjian@cdjrlc.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-06-24 16:52:28 -03:00
Jason Gunthorpe
e945c653c8 RDMA: Split kernel-only global device caps from uverbs device caps
Split out flags from ib_device::device_cap_flags that are only used
internally to the kernel into kernel_cap_flags that is not part of the
uapi. This limits the device_cap_flags to being the same bitmap that will
be copied to userspace.

This cleanly splits out the uverbs flags from the kernel flags to avoid
confusion in the flags bitmap.

Add some short comments describing which each of the kernel flags is
connected to. Remove unused kernel flags.

Link: https://lore.kernel.org/r/0-v2-22c19e565eef+139a-kern_caps_jgg@nvidia.com
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2022-04-06 15:02:13 -03:00
Kamal Heib
0a0575a12e RDMA/bnxt_re: Fix endianness warning for req.pkey
Fix the following sparse warning:

drivers/infiniband/hw/bnxt_re/qplib_fp.c:1260:26: sparse: warning: incorrect type in assignment (different base types)

Fixes: 0e938533d9 ("RDMA/bnxt_re: Remove dynamic pkey table")
Link: https://lore.kernel.org/r/20211205204537.14184-1-kamalheib1@gmail.com
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-12-06 19:56:30 -04:00
Christophe JAILLET
81ff48ddda RDMA/bnxt_re: Use bitmap_zalloc() when applicable
Use 'bitmap_zalloc()' to simplify code, improve the semantic and avoid
some open-coded arithmetic in allocator arguments.

Also change the corresponding 'kfree()' into 'bitmap_free()' to keep
consistency.

Link: https://lore.kernel.org/r/5c029daf43b92fdc27926fe8a98084843437c498.1637872888.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-29 14:33:56 -04:00
Kamal Heib
0e938533d9 RDMA/bnxt_re: Remove dynamic pkey table
The RoCE spec requires RoCE devices to support only the default pkey.
However the bnxt_re driver maintains a 0xFFFF entry pkey table and uses
only the first entry. Remove the pkey table and hard code a table of
length one hard wired with the default pkey.

Link: https://lore.kernel.org/r/20211125033615.483750-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-25 13:35:51 -04:00
Christophe JAILLET
a917dfb66c RDMA/bnxt_re: Scan the whole bitmap when checking if "disabling RCFW with pending cmd-bit"
The 'cmdq->cmdq_bitmap' bitmap is 'rcfw->cmdq_depth' bits long.  The size
stored in 'cmdq->bmap_size' is the size of the bitmap in bytes.

Remove this erroneous 'bmap_size' and use 'rcfw->cmdq_depth' directly in
'bnxt_qplib_disable_rcfw_channel()'. Otherwise some error messages may be
missing.

Other uses of 'cmdq_bitmap' already take into account 'rcfw->cmdq_depth'
directly.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/47ed717c3070a1d0f53e7b4c768a4fd11caf365d.1636707421.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-16 13:54:24 -04:00
Changcheng Deng
dd566d586f RDMA/bnxt_re: Remove unneeded variable
Fix the following coccicheck review:
./drivers/infiniband/hw/bnxt_re/main.c: 896: 5-7: Unneeded variable

Remove unneeded variable used to store return value.

Link: https://lore.kernel.org/r/20211109113227.132596-1-deng.changcheng@zte.com.cn
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-16 13:54:24 -04:00
Kamal Heib
dd83f482d2 RDMA/bnxt_re: Remove unsupported bnxt_re_modify_ah callback
There is no need to return always zero for function which is not
supported, especially since 0 is the wrong return code.

Link: https://lore.kernel.org/r/20211102073054.410838-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-11-03 09:06:36 -03:00
Kamal Heib
493620b1c9 RDMA/bnxt_re: Use helper function to set GUIDs
Use addrconf_addr_eui48() helper function to set the GUIDs and remove the
driver specific version.

Link: https://lore.kernel.org/r/20211028094359.160407-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-29 13:24:09 -03:00
Kamal Heib
04567caf96 RDMA/bnxt_re: Fix kernel panic when trying to access bnxt_re_stat_descs
For some reason when introducing the fixed commit the "active_pds" and
"active_ahs" descriptors got dropped, which lead to the following panic
when trying to access the first entry in the descriptors.

 bnxt_re: Broadcom NetXtreme-C/E RoCE Driver
 BUG: kernel NULL pointer dereference, address: 0000000000000000
 CPU: 2 PID: 594 Comm: kworker/u32:1 Not tainted 5.15.0-rc6+ #2
 Hardware name: Dell Inc. PowerEdge R430/0CN7X8, BIOS 2.12.1 12/07/2020
 Workqueue: bnxt_re bnxt_re_task [bnxt_re]
 RIP: 0010:strlen+0x0/0x20
 Code: 48 89 f9 74 09 48 83 c1 01 80 39 00 75 f7 31 d2 44 0f b6 04 16 44 88 04 11 48 83 c2 01 45 84 c0 75 ee c3 0f 1f 80 00 00 00 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 31
 RSP: 0018:ffffb25fc47dfbb0 EFLAGS: 00010246
 RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000008100
 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: 0000000000000000 R08: 00000000fffffff4 R09: 0000000000000000
 R10: ffff8a05c71fc028 R11: 0000000000000000 R12: 0000000000000000
 R13: 0000000000000000 R14: 0000000000000000 R15: ffff8a05c3dee800
 FS:  0000000000000000(0000) GS:ffff8a092fc40000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 0000000000000000 CR3: 000000048d3da001 CR4: 00000000001706e0
 Call Trace:
  kernfs_name_hash+0x12/0x80
  kernfs_find_ns+0x35/0xd0
  kernfs_remove_by_name_ns+0x32/0x90
  remove_files+0x2b/0x60
  create_files+0x1d3/0x1f0
  internal_create_group+0x17b/0x1f0
  internal_create_groups.part.0+0x3d/0xa0
  setup_port+0x180/0x3b0 [ib_core]
  ? __cond_resched+0x16/0x40
  ? kmem_cache_alloc_trace+0x278/0x3d0
  ib_setup_port_attrs+0x99/0x240 [ib_core]
  ib_register_device+0xcc/0x160 [ib_core]
  bnxt_re_task+0xba/0x170 [bnxt_re]
  process_one_work+0x1eb/0x390
  worker_thread+0x53/0x3d0
  ? process_one_work+0x390/0x390
  kthread+0x10f/0x130
  ? set_kthread_struct+0x40/0x40
  ret_from_fork+0x22/0x30

Fixes: 13f30b0fa0 ("RDMA/counter: Add a descriptor in struct rdma_hw_stats")
Link: https://lore.kernel.org/r/20211027205448.127821-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Devesh Sharma <devesh.s.sharma@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-29 12:10:33 -03:00
Jakub Kicinski
fd92213e9a RDMA: Constify netdev->dev_addr accesses
netdev->dev_addr will become const soon, make sure drivers propagate the
qualifier.

Link: https://lore.kernel.org/r/20211019182604.1441387-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-25 14:33:09 -03:00
Aharon Landau
13f30b0fa0 RDMA/counter: Add a descriptor in struct rdma_hw_stats
Add a counter statistic descriptor structure in rdma_hw_stats. In addition
to the counter name, more meta-information will be added.  This code
extension is needed for optional-counter support in the following patches.

Link: https://lore.kernel.org/r/20211008122439.166063-4-markzhang@nvidia.com
Signed-off-by: Aharon Landau <aharonl@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Mark Zhang <markzhang@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-10-12 12:48:04 -03:00
Selvin Xavier
6bda39149d RDMA/bnxt_re: Check if the vlan is valid before reporting
When VF is configured with default vlan, HW strips the vlan from the
packet and driver receives it in Rx completion. VLAN needs to be reported
for UD work completion only if the vlan is configured on the host. Add a
check for valid vlan in the UD receive path.

Link: https://lore.kernel.org/r/1631709163-2287-12-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
7a3c3a121e RDMA/bnxt_re: Correct FRMR size calculation
FRMR WQE requires to provide the log2 value of the PBL and page size.  Use
the standard ilog2() to calculate the log2 value

Link: https://lore.kernel.org/r/1631709163-2287-11-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
690ea7fe00 RDMA/bnxt_re: Use GFP_KERNEL in non atomic context
Use GFP_KERNEL instead of GFP_ATOMIC while allocating control path
structures which will be only called from non atomic context

Link: https://lore.kernel.org/r/1631709163-2287-10-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
2b4ccce6ca RDMA/bnxt_re: Fix FRMR issue with single page MR allocation
When the FRMR is allocated with single page, driver is attempting to
create a level 0 HWQ and not allocating any page because the nopte field
is set. This causes the crash during post_send as the pbl is not
populated.

To avoid this crash, check for the nopte bit during HWQ creation with
single page and create a level 1 page table and populate the pbl address
correctly.

Link: https://lore.kernel.org/r/1631709163-2287-9-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
598d16fa1b RDMA/bnxt_re: Fix query SRQ failure
Fill the missing parameters for the FW command while querying SRQ.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1631709163-2287-8-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:02 -03:00
Selvin Xavier
d195ff03bf RDMA/bnxt_re: Suppress unwanted error messages
Terminal CQEs are expected during QP destroy. Avoid the unwanted error
messages.

Link: https://lore.kernel.org/r/1631709163-2287-7-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Selvin Xavier
6a7296c918 RDMA/bnxt_re: Support multiple page sizes
HW can support multiple page sizes. Enable bits for enabling sizes from 4k
to 1G by reporting page_size_cap.

Link: https://lore.kernel.org/r/1631709163-2287-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Selvin Xavier
b9b43ad3ce RDMA/bnxt_re: Reduce the delay in polling for hwrm command completion
Driver has 1ms delay between the polling for atomic command completion.
Polling immediately after issuing command usually doesn't report any
completions. So all commands in the blocking path needs two iterations. So
effectively 1ms spend on each command. HW requires much lesser time for
each command. So reduce the delay to 1us and increase the iteration count
to wait for the same time.

Link: https://lore.kernel.org/r/1631709163-2287-5-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Edwin Peer
403bc4359a RDMA/bnxt_re: Use separate response buffer for stat_ctx_free
Use separate buffers for the request and response data. Eventhough the
response data is not used, providing the correct length is appropriate.

Link: https://lore.kernel.org/r/1631709163-2287-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:01 -03:00
Selvin Xavier
0cc4a9bdfc RDMA/bnxt_re: Update statistics counter name
Update a statistics counter name as the interface structure got updated.

Fixes: 9d6b648c31 ("bnxt_en: Update firmware interface spec to 1.10.1.65.")
Link: https://lore.kernel.org/r/1631709163-2287-3-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:00 -03:00
Selvin Xavier
9a381f7e5a RDMA/bnxt_re: Add extended statistics counters
Implement extended statistics counters for newer adapters. Check if the FW
support for this command and issue the FW command only if is
supported. Includes code re-organization to handle extended stats. Also,
add AH and PD software counters.

Link: https://lore.kernel.org/r/1631709163-2287-2-git-send-email-selvin.xavier@broadcom.com
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-20 13:37:00 -03:00
Linus Torvalds
4b105f4a25 RDMA v5.15 merge window 2nd Pull Request
An important error case regression fixes in mlx5:
 
 - Wrong size used when computing the error path smaller allocation request
   leads to corruption
 
 - Confusing but ultimately harmless alignment mis-calculation
 
 - Static checker warnings:
     Null pointer subtraction in qib
     kcalloc in bnxt_re
     Missing static on global variable in hfi1
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmE4qagACgkQOG33FX4g
 mxoueg//SfLuxirMifDHOpH2Db5vC2a/BwYW3AuT78zAXzlUmmEWYIUw1fbJA2vK
 vt/fsjaxW2FBsluiIgV8J961Mlrj+/Y1KPxbV+76CKFoBkaYcjEJcIajGXkm71r5
 bXviUuFEHV2810Nxs6QEitMPyEP3sODmcK5EoKIuifVfjhdcJ3XrLbJb0NA5PKk1
 voD2zpAfhumiAzjZ2q3rJgpCv08WWqts30DzAKdii1asdG9tDF8WBS9BJuPC7MHF
 B2AmYVTgwhsepCAUaYGathKxHblqakRXF3gjOIAB4TKuZEv0OwGVzH3flFbukg5L
 icEw/Ai0H2GM7e3yT3BI3QDoXMg52DfCeChgwttkNsJJidjlc7cUCwaaBCJ1oVo/
 860foW9YFA14ftAlBt2BtvfNfOaqYnmz4tmavVNxqW7NiZCqT/+uyV5BSDPD5b5M
 Xtsa8NK/6u65rz92GtNpVYd8W4ZfjOCAc8cWj6+ELeSDsU92sFWBWalWsl59Hq+t
 AdCUi25uwUAlLPiPKRlYxSXoCCOG80iDgrj8sntIGoVYT0nr4ITVv5VmmVWNQLPw
 V5xfDhCmALrNlp1nCBRsmtBHA/Legqu82/lWpkl12WZGOytU/jTEb2gLWPDglwr+
 BWmaq3Us/g36NX9h/GOvqD5tZCogzkjHvWLW0biMRebDyO1DW1Y=
 =Sg6A
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "I don't usually send a second PR in the merge window, but the fix to
  mlx5 is significant enough that it should start going through the
  process ASAP. Along with it comes some of the usual -rc stuff that
  would normally wait for a -rc2 or so.

  Summary:

  Important error case regression fixes in mlx5:

   - Wrong size used when computing the error path smaller allocation
     request leads to corruption

   - Confusing but ultimately harmless alignment mis-calculation

  Static checker warning fixes:

   - NULL pointer subtraction in qib

   - kcalloc in bnxt_re

   - Missing static on global variable in hfi1"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/hfi1: make hist static
  RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic
  IB/qib: Fix null pointer subtraction compiler warning
  RDMA/mlx5: Fix xlt_chunk_align calculation
  RDMA/mlx5: Fix number of allocated XLT entries
2021-09-09 11:14:14 -07:00
Len Baker
f1b195ce81 RDMA/bnxt_re: Prefer kcalloc over open coded arithmetic
As noted in the "Deprecated Interfaces, Language Features, Attributes, and
Conventions" documentation [1], size calculations (especially
multiplication) should not be performed in memory allocator (or similar)
function arguments due to the risk of them overflowing. This could lead to
values wrapping around and a smaller allocation being made than the caller
was expecting. Using those allocations could lead to linear overflows of
heap memory and other misbehaviors.

In this case this is not actually dynamic sizes: both sides of the
multiplication are constant values. However it is best to refactor this
anyway, just to keep the open-coded math idiom out of code.

So, use the purpose specific kcalloc() function instead of the argument
size * count in the kzalloc() function.

Also, remove the unnecessary initialization of the sqp_tbl variable since
it is set a few lines later.

[1] https://www.kernel.org/doc/html/v5.14/process/deprecated.html#open-coded-arithmetic-in-allocator-arguments

Link: https://lore.kernel.org/r/20210905081812.17113-1-len.baker@gmx.com
Signed-off-by: Len Baker <len.baker@gmx.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-09-08 08:32:53 -03:00
Linus Torvalds
23852bec53 RDMA v5.15 merge window Pull Request
- Various cleanup and small features for rtrs
 
 - kmap_local_page() conversions
 
 - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns
 
 - Cache the IB subnet prefix
 
 - Rework how CRC is calcuated in rxe
 
 - Clean reference counting in iwpm's netlink
 
 - Pull object allocation and lifecycle for user QPs to the uverbs core
   code
 
 - Several small hns features and continued general code cleanups
 
 - Fix the scatterlist confusion of orig_nents/nents introduced in an
   earlier patch creating the append operation
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAmEudRgACgkQOG33FX4g
 mxraJA//c6bMxrrTVrzmrtrkyYD4tYWE8RDfgvoyZtleZnnEOJeunCQWakQrpJSv
 ukSnOGCA3PtnmRMdV54f/11YJ/7otxOJodSO7jWsIoBrqG/lISAdX8mn2iHhrvJ0
 dIaFEFPLy0WqoMLCJVIYIupR0IStVHb/mWx0uYL4XnnoYKyt7f7K5JMZpNWMhDN2
 ieJw0jfrvEYm8pipWuxUvB16XARlzAWQrjqLpMRI+jFRpbDVBY21dz2/LJvOJPrA
 LcQ+XXsV/F659ibOAGm6bU4BMda8fE6Lw90B/gmhSswJ205NrdziF5cNYHP0QxcN
 oMjrjSWWHc9GEE7MTipC2AH8e36qob16Q7CK+zHEJ+ds7R6/O/8XmED1L8/KFpNA
 FGqnjxnxsl1y27mUegfj1Hh8PfoDp2oVq0lmpEw0CYo4cfVzHSMRrbTR//XmW628
 Ie/mJddpFK4oLk+QkSNjSLrnxOvdTkdA58PU0i84S5eUVMNm41jJDkxg2J7vp0Zn
 sclZsclhUQ9oJ5Q2so81JMWxu4JDn7IByXL0ULBaa6xwQTiVEnyvSxSuPlflhLRW
 0vI2ylATYKyWkQqyX7VyWecZJzwhwZj5gMMWmoGsij8bkZhQ/VaQMaesByzSth+h
 NV5UAYax4GqyOQ/tg/tqT6e5nrI1zof87H64XdTCBpJ7kFyQ/oA=
 =ZwOe
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This is quite a small cycle, no major series stands out. The HNS and
  rxe drivers saw the most activity this cycle, with rxe being broken
  for a good chunk of time. The significant deleted line count is due to
  a SPDX cleanup series.

  Summary:

   - Various cleanup and small features for rtrs

   - kmap_local_page() conversions

   - Driver updates and fixes for: efa, rxe, mlx5, hfi1, qed, hns

   - Cache the IB subnet prefix

   - Rework how CRC is calcuated in rxe

   - Clean reference counting in iwpm's netlink

   - Pull object allocation and lifecycle for user QPs to the uverbs
     core code

   - Several small hns features and continued general code cleanups

   - Fix the scatterlist confusion of orig_nents/nents introduced in an
     earlier patch creating the append operation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (90 commits)
  RDMA/mlx5: Relax DCS QP creation checks
  RDMA/hns: Delete unnecessary blank lines.
  RDMA/hns: Encapsulate the qp db as a function
  RDMA/hns: Adjust the order in which irq are requested and enabled
  RDMA/hns: Remove RST2RST error prints for hw v1
  RDMA/hns: Remove dqpn filling when modify qp from Init to Init
  RDMA/hns: Fix QP's resp incomplete assignment
  RDMA/hns: Fix query destination qpn
  RDMA/hfi1: Convert to SPDX identifier
  IB/rdmavt: Convert to SPDX identifier
  RDMA/hns: Bugfix for incorrect association between dip_idx and dgid
  RDMA/hns: Bugfix for the missing assignment for dip_idx
  RDMA/hns: Bugfix for data type of dip_idx
  RDMA/hns: Fix incorrect lsn field
  RDMA/irdma: Remove the repeated declaration
  RDMA/core/sa_query: Retry SA queries
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
  RDMA/hns: Delete unused hns bitmap interface
  ...
2021-09-02 14:47:21 -07:00
Jason Gunthorpe
6a217437f9 Merge branch 'sg_nents' into rdma.git for-next
From Maor Gottlieb
====================

Fix the use of nents and orig_nents in the sg table append helpers. The
nents should be used by the DMA layer to store the number of DMA mapped
sges, the orig_nents is the number of CPU sges.

Since the sg append logic doesn't always create a SGL with exactly
orig_nents entries store a total_nents as well to allow the table to be
properly free'd and reorganize the freeing logic to share across all the
use cases.

====================

Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>

* 'sg_nents':
  RDMA: Use the sg_table directly and remove the opencoded version from umem
  lib/scatterlist: Fix wrong update of orig_nents
  lib/scatterlist: Provide a dedicated function to support table append
2021-08-30 09:49:59 -03:00
Dinghao Liu
a036ad0883 RDMA/bnxt_re: Remove unpaired rtnl unlock in bnxt_re_dev_init()
The fixed commit removes all rtnl_lock() and rtnl_unlock() calls in
function bnxt_re_dev_init(), but forgets to remove a rtnl_unlock() in the
error handling path of bnxt_re_register_netdev(), which may cause a
deadlock. This bug is suggested by a static analysis tool.

Fixes: c2b777a959 ("RDMA/bnxt_re: Refactor device add/remove functionalities")
Link: https://lore.kernel.org/r/20210816085531.12167-1-dinghao.liu@zju.edu.cn
Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 14:48:09 -03:00
Naresh Kumar PBS
17f2569dce RDMA/bnxt_re: Add missing spin lock initialization
Add the missing initialization of srq lock.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/1629343553-5843-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-19 10:19:10 -03:00
Leon Romanovsky
514aee660d RDMA: Globally allocate and release QP memory
Convert QP object to follow IB/core general allocation scheme.  That
change allows us to make sure that restrack properly kref the memory.

Link: https://lore.kernel.org/r/48e767124758aeecc433360ddd85eaa6325b34d9.1627040189.git.leonro@nvidia.com
Reviewed-by: Gal Pressman <galpress@amazon.com> #efa
Tested-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com> #rdma and core
Tested-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Tested-by: Tatyana Nikolova <tatyana.e.nikolova@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-08-03 13:44:27 -03:00
Naresh Kumar PBS
0c23af52cc RDMA/bnxt_re: Fix stats counters
Statistical counters are not incrementing in some adapter versions with
newer FW. This is due to the stats context length mismatch between FW and
driver. Since the L2 driver updates the length correctly, use the stats
length from L2 driver while allocating the DMA'able memory and creating
the stats context.

Fixes: 9d6b648c31 ("bnxt_en: Update firmware interface spec to 1.10.1.65.")
Link: https://lore.kernel.org/r/1626010296-6076-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-07-12 14:42:27 -03:00
Colin Ian King
6becfe913b RDMA/bnxt_re: Fix uninitialized struct bit field rsvd1
The bit field rsvd1 in resp is not being initialized and garbage data is
being copied from the stack back to userspace via the ib_copy_to_udata
call. Fix this by setting the entire struct resp to zero; this will ensure
that further new bit fields in the future will be zero'd too.

Link: https://lore.kernel.org/r/20210623182437.163801-1-colin.king@canonical.com
Addresses-Coverity: ("Uninitialized scalar variable")
Fixes: 879740517d ("RDMA/bnxt_re: Update ABI to pass wqe-mode to user space")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
[jgg: remove extra zeroing]
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-24 09:16:42 -03:00
Devesh Sharma
879740517d RDMA/bnxt_re: Update ABI to pass wqe-mode to user space
Changing ucontext ABI response structure to pass wqe_mode to user library.
A flag in comp_mask has been set to indicate presence of wqe_mode.

Moved wqe-mode ABI to uapi/rdma/bnxt_re-abi.h

Link: https://lore.kernel.org/r/20210616202817.1185276-1-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-21 20:58:52 -03:00
Jason Gunthorpe
915e4af59f RDMA: Remove rdma_set_device_sysfs_group()
The driver's device group can be specified as part of the ops structure
like the device's port group. No need for the complicated API.

Link: https://lore.kernel.org/r/8964785a34fd3a29ff5b6693493f575b717e594d.1623427137.git.leonro@nvidia.com
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:32 -03:00
Jason Gunthorpe
4b5f4d3fb4 RDMA: Split the alloc_hw_stats() ops to port and device variants
This is being used to implement both the port and device global stats,
which is causing some confusion in the drivers. For instance EFA and i40iw
both seem to be misusing the device stats.

Split it into two ops so drivers that don't support one or the other can
leave the op NULL'd, making the calling code a little simpler to
understand.

Link: https://lore.kernel.org/r/1955c154197b2a159adc2dc97266ddc74afe420c.1623427137.git.leonro@nvidia.com
Tested-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-16 20:58:29 -03:00
Devesh Sharma
35f5ace5de RDMA/bnxt_re: Enable global atomic ops if platform supports
Enabling Atomic operations for Gen P5 devices if the underlying platform
supports global atomic ops.

Link: https://lore.kernel.org/r/20210603131534.982257-2-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-06-03 16:59:11 -03:00
Guenter Roeck
6dc760027d RDMA/bnxt_re: Drop unnecessary NULL checks after container_of
The result of container_of() operations is never NULL unless the first
element of the embedding structure is extracted. This is either not the
case here, or the pointer passed to container_of() is known to be not
NULL. The NULL checks are therefore unnecessary and misleading.
Remove them.

The channges in this patch were made automatically with the following
Coccinelle script.

@@
type t;
identifier v;
statement s;
@@

<+...
(
  t v = container_of(...);
|
  v = container_of(...);
)
  ...
  when != v
- if (\( !v \| v == NULL \) ) s
...+>

Link: https://lore.kernel.org/r/20210514153606.1377119-1-linux@roeck-us.net
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-05-20 12:01:00 -03:00
Lv Yunlong
34b39efa5a RDMA/bnxt_re: Fix a double free in bnxt_qplib_alloc_res
In bnxt_qplib_alloc_res, it calls bnxt_qplib_alloc_dpi_tbl().  Inside
bnxt_qplib_alloc_dpi_tbl, dpit->dbr_bar_reg_iomem is freed via
pci_iounmap() in unmap_io error branch. After the callee returns err code,
bnxt_qplib_alloc_res calls
bnxt_qplib_free_res()->bnxt_qplib_free_dpi_tbl() in the fail branch. Then
dpit->dbr_bar_reg_iomem is freed in the second time by pci_iounmap().

My patch set dpit->dbr_bar_reg_iomem to NULL after it is freed by
pci_iounmap() in the first time, to avoid the double free.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20210426140614.6722-1-lyl2019@mail.ustc.edu.cn
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-27 15:22:29 -03:00
Leon Romanovsky
1900357e75 RDMA/bnxt_re: Get rid of custom module reference counting
Instead of manually messing with parent driver module reference counting
rely on export symbol mechanism to ensure that proper probe/remove chain
is performed.

Link: https://lore.kernel.org/r/20210401065715.565226-4-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-19 14:57:03 -03:00
Leon Romanovsky
bcf9ee0520 RDMA/bnxt_re: Create direct symbol link between bnxt modules
Convert indirect probe call to its direct equivalent to create a symbol
link between RDMA and netdev modules. This will give us an ability to
remove custom module reference counting that doesn't belong to the driver.

Link: https://lore.kernel.org/r/20210401065715.565226-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-19 14:57:03 -03:00
Leon Romanovsky
ae9884829c RDMA/bnxt_re: Depend on bnxt ethernet driver and not blindly select it
The "select" kconfig keyword provides reverse dependency, however it
doesn't check that selected symbol meets its own dependencies. Usually
"select" is used for non-visible symbols, so instead of trying to keep
dependencies in sync with BNXT ethernet driver, simply "depends on" it,
like Kconfig documentation suggest.

* CONFIG_PCI is already required by BNXT
* CONFIG_NETDEVICES and CONFIG_ETHERNET are needed to chose BNXT

Link: https://lore.kernel.org/r/20210401065715.565226-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-19 14:57:03 -03:00
Wang Wensheng
22efb0a8d1 RDMA/bnxt_re: Fix error return code in bnxt_qplib_cq_process_terminal()
Fix to return a negative error code from the error handling case instead
of 0, as done elsewhere in this function.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20210408113137.97202-1-wangwensheng4@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Wang Wensheng <wangwensheng4@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-04-12 15:11:00 -03:00
Selvin Xavier
6845485f9e RDMA/bnxt_re: Move device to error state upon device crash
When the L2 driver detects a device crash or device undergone reset, it
invokes a stop callback to recover from error.

The current RoCE driver doesn't recover the device. So move the device to
error state and dispatch fatal events to all qps Release the MSIx vectors
to avoid a crash when L2 driver disables the MSIx.  Also, check for the
device state to avoid posting further commands to the HW.

Link: https://lore.kernel.org/r/1615968942-30970-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 10:37:01 -03:00
Mark Bloch
1fb7f8973f RDMA: Support more than 255 rdma ports
Current code uses many different types when dealing with a port of a RDMA
device: u8, unsigned int and u32. Switch to u32 to clean up the logic.

This allows us to make (at least) the core view consistent and use the
same type. Unfortunately not all places can be converted. Many uverbs
functions expect port to be u8 so keep those places in order not to break
UAPIs.  HW/Spec defined values must also not be changed.

With the switch to u32 we now can support devices with more than 255
ports. U32_MAX is reserved to make control logic a bit easier to deal
with. As a device with U32_MAX ports probably isn't going to happen any
time soon this seems like a non issue.

When a device with more than 255 ports is created uverbs will report the
RDMA device as having 255 ports as this is the max currently supported.

The verbs interface is not changed yet because the IBTA spec limits the
port size in too many places to be u8 and all applications that relies in
verbs won't be able to cope with this change. At this stage, we are
extending the interfaces that are using vendor channel solely

Once the limitation is lifted mlx5 in switchdev mode will be able to have
thousands of SFs created by the device. As the only instance of an RDMA
device that reports more than 255 ports will be a representor device and
it exposes itself as a RAW Ethernet only device CM/MAD/IPoIB and other
ULPs aren't effected by this change and their sysfs/interfaces that are
exposes to userspace can remain unchanged.

While here cleanup some alignment issues and remove unneeded sanity
checks (mainly in rdmavt),

Link: https://lore.kernel.org/r/20210301070420.439400-1-leon@kernel.org
Signed-off-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-03-26 09:31:21 -03:00
Selvin Xavier
c930af5ab4 RDMA/bnxt_re: Allow bigger MR creation
Allow users to create bigger MRs. Remove the check that prevented creating
MRs with number of pages more than 512.

Link: https://lore.kernel.org/r/1610012608-14528-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:56:09 -04:00
Selvin Xavier
f6919d5638 RDMA/bnxt_re: Code refactor while populating user MRs
Refactor code that populates MR page buffer list. Instead of allocating a
pbl_tbl to hold the buffer list, pass the struct ib_umem directly to
bnxt_qplib_alloc_init_hwq() as done for other user space memories.  Fix
the PBL level to handle the above mentioned change.

Also, remove an unwanted flag from the input to bnxt_qplib_reg_mr()
function.

Link: https://lore.kernel.org/r/1610012608-14528-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2021-01-18 16:56:08 -04:00
Selvin Xavier
c63e1c4dfc RDMA/bnxt_re: Fix max_qp_wrs reported
While creating qps, the driver adds one extra entry to the sq size passed
by the ULPs in order to avoid queue full condition.  When ULPs creates QPs
with max_qp_wr reported, driver creates QP with 1 more than the max_wqes
supported by HW. Create QP fails in this case. To avoid this error, reduce
1 entry in max_qp_wqes and report it to the stack.

Link: https://lore.kernel.org/r/1606741986-16477-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-12-07 15:43:42 -04:00
Selvin Xavier
b898d5c50c RDMA/bnxt_re: Fix entry size during SRQ create
Only static WQE is supported for SRQ. So always use the max supported SGEs
while calculating SRQ entry size.

Fixes: 2bb3c32c5c ("RDMA/bnxt_re: Change wr posting logic to accommodate variable wqes")
Link: https://lore.kernel.org/r/1602569752-12745-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-27 14:12:26 -03:00
Joe Perches
1c7fd72687 RDMA: Convert sysfs device * show functions to use sysfs_emit()
Done with cocci script:

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	return
-	strcpy(buf, chr);
+	sysfs_emit(buf, chr);
	...>
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	sprintf(buf,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	snprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
	len =
-	scnprintf(buf, PAGE_SIZE,
+	sysfs_emit(buf,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
identifier len;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	<...
-	len += scnprintf(buf + len, PAGE_SIZE - len,
+	len += sysfs_emit_at(buf, len,
	...);
	...>
	return len;
}

@@
identifier d_show;
identifier dev, attr, buf;
expression chr;
@@

ssize_t d_show(struct device *dev, struct device_attribute *attr, char *buf)
{
	...
-	strcpy(buf, chr);
-	return strlen(buf);
+	return sysfs_emit(buf, chr);
}

Link: https://lore.kernel.org/r/7f406fa8e3aa2552c022bec680f621e38d1fe414.1602122879.git.joe@perches.com
Signed-off-by: Joe Perches <joe@perches.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:53:21 -03:00
Jason Gunthorpe
676a80adba RDMA: Remove AH from uverbs_cmd_mask
Drivers that need a uverbs AH should instead set the create_user_ah() op
similar to reg_user_mr(). MODIFY_AH and QUERY_AH cmds were never
implemented so are just deleted.

Link: https://lore.kernel.org/r/11-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:28:00 -03:00
Jason Gunthorpe
1f11a7610e RDMA: Check create_flags during create_qp
Each driver should check that the QP attrs create_flags is supported.
Unfortuantely when create_flags was added to the QP attrs the drivers were
not updated. uverbs_ex_cmd_mask was used to block it - even though kernel
drivers use these flags too.

Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_QP from uverbs_ex_cmd_mask. Fix the error code
to be EOPNOTSUPP.

Link: https://lore.kernel.org/r/8-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:59 -03:00
Jason Gunthorpe
1c407cb5d7 RDMA: Check flags during create_cq
Each driver should check that the CQ attrs is supported. Unfortuantely
when flags was added to the CQ attrs the drivers were not updated,
uverbs_ex_cmd_mask was used to block it. This was missed when create CQ
was converted to ioctl, so non-zero flags could have been passed into
drivers.

Check that flags is zero in all drivers that don't use it, remove
IB_USER_VERBS_EX_CMD_CREATE_CQ from uverbs_ex_cmd_mask.

Fixes: 41b2a71fc8 ("IB/uverbs: Move ioctl path of create_cq and destroy_cq to a new file")
Link: https://lore.kernel.org/r/7-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:59 -03:00
Jason Gunthorpe
26e990badd RDMA: Check attr_mask during modify_qp
Each driver should check that it can support the provided attr_mask during
modify_qp. IB_USER_VERBS_EX_CMD_MODIFY_QP was being used to block
modify_qp_ex because the driver didn't check RATE_LIMIT.

Link: https://lore.kernel.org/r/6-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:58 -03:00
Jason Gunthorpe
44ce37bc8b RDMA: Move more uverbs_cmd_mask settings to the core
These functions all depend on the driver providing a specific op:

- REREG_MR is rereg_user_mr(). bnxt_re set this without providing the op
- ATTACH/DEATCH_MCAST is attach_mcast()/detach_mcast(). usnic set this
  without providing the op
- OPEN_QP doesn't involve the driver but requires a XRCD. qedr provides
  xrcd but forgot to set it, usnic doesn't provide XRCD but set it anyhow.
- OPEN/CLOSE_XRCD are the ops alloc_xrcd()/dealloc_xrcd()
- CREATE_SRQ/DESTROY_SRQ are the ops create_srq()/destroy_srq()
- QUERY/MODIFY_SRQ is op query_srq()/modify_srq(). hns sets this but
  sometimes supplies a NULL op.
- RESIZE_CQ is op resize_cq(). bnxt_re sets this boes doesn't supply an op
- ALLOC/DEALLOC_MW is alloc_mw()/dealloc_mw(). cxgb4 provided an
  (now deleted) implementation but no userspace

All drivers were checked that no drivers provide the op without also
setting uverbs_cmd_mask so this should have no functional change.

Link: https://lore.kernel.org/r/4-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:58 -03:00
Jason Gunthorpe
c074bb1e30 RDMA: Remove elements in uverbs_cmd_mask that all drivers set
This is a step toward eliminating uverbs_cmd_mask. Preset this list in the
core code. Only the op reg_user_mr wasn't already being required from the
drivers.

Link: https://lore.kernel.org/r/3-v1-caa70ba3d1ab+1436e-ucmd_mask_jgg@nvidia.com
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:27:57 -03:00
Kamal Heib
53839b51a7 RDMA/bnxt_re: Set queue pair state when being queried
The API for ib_query_qp requires the driver to set cur_qp_state on return,
add the missing set.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/20201021114952.38876-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-26 19:22:15 -03:00
Jason Gunthorpe
e0477b34d9 RDMA: Explicitly pass in the dma_device to ib_register_device
The code in setup_dma_device has become rather convoluted, move all of
this to the drivers. Drives now pass in a DMA capable struct device which
will be used to setup DMA, or drivers must fully configure the ibdev for
DMA and pass in NULL.

Other than setting the masks in rvt all drivers were doing this already
anyhow.

mthca, mlx4 and mlx5 were already setting up maximum DMA segment size for
DMA based on their hardweare limits in:
__mthca_init_one()
  dma_set_max_seg_size (1G)

__mlx4_init_one()
  dma_set_max_seg_size (1G)

mlx5_pci_init()
  set_dma_caps()
    dma_set_max_seg_size (2G)

Other non software drivers (except usnic) were extended to UINT_MAX [1, 2]
instead of 2G as was before.

[1] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/
[2] https://lore.kernel.org/linux-rdma/20200924114940.GE9475@nvidia.com/

Link: https://lore.kernel.org/r/20201008082752.275846-1-leon@kernel.org
Link: https://lore.kernel.org/r/6b2ed339933d066622d5715903870676d8cc523a.1602590106.git.mchehab+huawei@kernel.org
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-16 13:53:46 -03:00
Colin Ian King
73c5265913 RDMA/bnxt_re: Fix sizeof mismatch for allocation of pbl_tbl.
An incorrect sizeof is being used, u64 * is not correct, it should be just
u64 for a table of umem_pgs number of u64 items in the pbl_tbl.  Use the
idiom sizeof(*pbl_tbl) to get the object type without the need to
explicitly use u64.

Link: https://lore.kernel.org/r/20201006114700.537916-1-colin.king@canonical.com
Addresses-Coverity: ("Sizeof not portable (SIZEOF_MISMATCH)")
Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-06 16:56:36 -03:00
Jason Gunthorpe
6ef999f500 RDMA/bnxt_re: Use rdma_umem_for_each_dma_block()
This driver is taking the SGL out of the umem and passing it through a
struct bnxt_qplib_sg_info. Instead of passing the SGL pass the umem and
then use rdma_umem_for_each_dma_block() directly.

Move the calls of ib_umem_num_dma_blocks() closer to their actual point of
use, npages is only set for non-umem pbl flows.

Link: https://lore.kernel.org/r/0-v1-b37437a73f35+49c-bnxt_re_dma_block_jgg@nvidia.com
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-10-06 16:45:53 -03:00
Jason Gunthorpe
5dee5872f8 Merge branch 'mlx5_active_speed' into rdma.git for-next
Leon Romanovsky says:

====================
IBTA declares speed as 16 bits, but kernel stores it in u8. This series
fixes in-kernel declaration while keeping external interface intact.
====================

Based on the mlx5-next branch at
     git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux
due to dependencies.

* branch 'mlx5_active_speed':
  RDMA: Fix link active_speed size
  RDMA/mlx5: Delete duplicated mlx5_ptys_width enum
  net/mlx5: Refactor query port speed functions
2020-09-18 10:31:45 -03:00
Aharon Landau
376ceb31ff RDMA: Fix link active_speed size
According to the IB spec active_speed size should be u16 and not u8 as
before. Changing it to allow further extensions in offered speeds.

Link: https://lore.kernel.org/r/20200917090223.1018224-4-leon@kernel.org
Signed-off-by: Aharon Landau <aharonl@mellanox.com>
Reviewed-by: Michael Guralnik <michaelgur@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-18 10:31:24 -03:00
Linus Torvalds
b1df2a0783 RDMA second 5.9-rc pull request
A number of driver bug fixes and a few recent regressions:
 
 - Several bug fixes for bnxt_re. Crashing, incorrect data reported,
   and corruption on new HW
 
 - Memory leak and crash in rxe
 
 - Fix sysfs corruption in rxe if the netdev name is too long
 
 - Fix a crash on error unwind in the new cq_pool code
 
 - Fix kobject panics in rtrs by working device lifetime properly
 
 - Fix a data corruption bug in iser target related to misaligned buffers
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl9atAYACgkQOG33FX4g
 mxqSdw//Qi29dnxzVGpsaO4/krd/VmI6NT6eNpgK7Nqx80DaCYer0JhtwZOUxHqK
 KbHIV9XB/f6BSI67c9ydYj4PNX6FpFnoUWQLvqZwip5VM7R6ifIVjm0ap1jCAUSS
 axDLFZOySIOYNhcZ5I+MtY/kxykKBjMteuMXdpBe4FwZ+XSmsC5KkfRH/+FUhjVG
 peL6aRVDv9TByH8w+iZE1wSmVrOphOE1C/jN5TyotQTmKe7IHoXJtkalosYHXFWw
 KZaaz52e4IYKVFl4HIcl6+FfPExhxsyfDtRHluvn+vzY/wFy1RZw6F0BZt7mioy5
 J8R6w82xEe/SNugTGuvIzqXOymmy9H4CrG9pHy4NRMMzC28LGI7qHJgVhr/jZy8+
 GPxR26cywDhPsd4XA2K3mvs7DVSoBUPYlIUnHdYjfBZl/ghColg9+XGyNv6pdrke
 Q7Kog5blcpOAahBX+ElBLvIZXk5oEk5W+3H/M0OeuVMQ/DrMtALrCnwpp4wDKVvO
 9QuYfGgQ+25xbV9kwzckLGo5eedN3cRD/v4hcqvQUZo+9zLYZ/HZRMjpOdrscQ+I
 QL4FgpcLpOASKZ+bYjjpFxK3rNVTDT9CYJw4/hxEaOhxRhtAO1Q9mJdvJTK6dj09
 oR9LPyefQkyKCAt+heWHKKkEYDiwT8U1SlR8STotg24VHIj6Rb4=
 =2DDd
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma fixes from Jason Gunthorpe:
 "A number of driver bug fixes and a few recent regressions:

   - Several bug fixes for bnxt_re. Crashing, incorrect data reported,
     and corruption on new HW

   - Memory leak and crash in rxe

   - Fix sysfs corruption in rxe if the netdev name is too long

   - Fix a crash on error unwind in the new cq_pool code

   - Fix kobject panics in rtrs by working device lifetime properly

   - Fix a data corruption bug in iser target related to misaligned
     buffers"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma:
  IB/isert: Fix unaligned immediate-data handling
  RDMA/rtrs-srv: Set .release function for rtrs srv device during device init
  RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
  RDMA/core: Fix reported speed and width
  RDMA/core: Fix unsafe linked list traversal after failing to allocate CQ
  RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
  RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
  RDMA/bnxt_re: Restrict the max_gids to 256
  RDMA/bnxt_re: Static NQ depth allocation
  RDMA/bnxt_re: Fix the qp table indexing
  RDMA/bnxt_re: Do not report transparent vlan from QP1
  RDMA/mlx4: Read pkey table length instead of hardcoded value
  RDMA/rxe: Fix panic when calling kmem_cache_create()
  RDMA/rxe: Fix memleak in rxe_mem_init_user
  RDMA/rxe: Fix the parent sysfs read when the interface has 15 chars
  RDMA/rtrs-srv: Replace device_register with device_initialize and device_add
2020-09-11 10:02:36 -07:00
Jason Gunthorpe
84e71b4d9b RDMA/bnxt: Do not use ib_umem_page_count() or ib_umem_num_pages()
ib_umem_page_count() returns the number of 4k entries required for a DMA
map, but bnxt_re already computes a variable page size. The correct API to
determine the size of the page table array is ib_umem_num_dma_blocks().

Fix the overallocation of the page array in fill_umem_pbl_tbl() when
working with larger page sizes by using the right function. Lightly
re-organize this function to make it clearer.

Replace the other calls to ib_umem_num_pages().

Fixes: d85582517e ("RDMA/bnxt_re: Use core helpers to get aligned DMA address")
Link: https://lore.kernel.org/r/11-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-11 10:24:54 -03:00
Jason Gunthorpe
ebc24096c4 RDMA/umem: Add rdma_umem_for_each_dma_block()
This helper does the same as rdma_for_each_block(), except it works on a
umem. This simplifies most of the call sites.

Link: https://lore.kernel.org/r/4-v2-270386b7e60b+28f4-umem_1_jgg@nvidia.com
Acked-by: Miguel Ojeda <miguel.ojeda.sandonis@gmail.com>
Acked-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 15:33:17 -03:00
Leon Romanovsky
43d781b9fa RDMA: Allow fail of destroy CQ
Like any other verbs objects, CQ shouldn't fail during destroy, but
mlx5_ib didn't follow this contract with mixed IB verbs objects with
DEVX. Such mix causes to the situation where FW and kernel are fully
interdependent on the reference counting of each side.

Kernel verbs and drivers that don't have DEVX flows shouldn't fail.

Fixes: e39afe3d6d ("RDMA: Convert CQ allocations to be under core responsibility")
Link: https://lore.kernel.org/r/20200907120921.476363-7-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 14:14:29 -03:00
Leon Romanovsky
119181d1d4 RDMA: Restore ability to fail on SRQ destroy
In similar way to other IB objects, restore the ability to return error on
SRQ destroy. Strictly speaking, this change is not necessary, and provided
here to ensure a symmetrical interface like other destroy functions.

Fixes: 68e326dea1 ("RDMA: Handle SRQ allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-5-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 14:14:24 -03:00
Leon Romanovsky
9a9ebf8cd7 RDMA: Restore ability to fail on AH destroy
Like any other IB verbs objects, AH are refcounted by ib_core. The release
of those objects are controlled by ib_core with promise that AH destroy
can't fail.

Being SW object for now, this change makes dealloc_ah() to behave like any
other destroy IB flows.

Fixes: d345691471 ("RDMA: Handle AH allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-3-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:57:22 -03:00
Leon Romanovsky
91a7c58fce RDMA: Restore ability to fail on PD deallocate
The IB verbs objects are counted by the kernel and ib_core ensures that
deallocate PD will success so it will be called once all other objects
that depends on PD will be released. This is achieved by managing various
reference counters on such objects.

The mlx5 driver didn't follow this standard flow when allowed DEVX objects
that are not managed by ib_core to be interleaved with the ones under
ib_core responsibility.

In such interleaved scenarios deallocate command can fail and ib_core will
leave uobject in internal DB and attempt to clean it later to free
resources anyway.

This change partially restores returned value from dealloc_pd() for all
drivers, but keeping in mind that non-DEVX devices and kernel verbs paths
shouldn't fail.

Fixes: 21a428a019 ("RDMA: Handle PD allocations by IB/core")
Link: https://lore.kernel.org/r/20200907120921.476363-2-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:57:22 -03:00
YueHaibing
9e712446a8 RDMA/bnxt_re: Remove set but not used variable 'qplib_ctx'
drivers/infiniband/hw/bnxt_re/main.c:1012:25:
 warning: variable ‘qplib_ctx’ set but not used [-Wunused-but-set-variable]

Fixes: f86b31c6a2 ("RDMA/bnxt_re: Static NQ depth allocation")
Link: https://lore.kernel.org/r/20200905121624.32776-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-09 13:20:48 -03:00
Allen Pais
53c2a706ae RDMA/bnxt_re: Convert tasklets to use new tasklet_setup() API
In preparation for unconditionally passing the struct tasklet_struct
pointer to all tasklet callbacks, switch to using the new tasklet_setup()
and from_tasklet() to pass the tasklet pointer explicitly.

Link: https://lore.kernel.org/r/20200903060637.424458-2-allen.lkml@gmail.com
Signed-off-by: Romain Perier <romain.perier@gmail.com>
Signed-off-by: Allen Pais <allen.lkml@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-09-03 12:01:52 -03:00
Selvin Xavier
097a9d23b7 RDMA/bnxt_re: Remove the qp from list only if the qp destroy succeeds
Driver crashes when destroy_qp is re-tried because of an error
returned. This is because the qp entry was removed from the qp list during
the first call.

Remove qp from the list only if destroy_qp returns success.

The driver will still trigger a WARN_ON due to the memory leaking, but at
least it isn't corrupting memory too.

Fixes: 8dae419f9e ("RDMA/bnxt_re: Refactor queue pair creation code")
Link: https://lore.kernel.org/r/1598292876-26529-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:44 -03:00
Naresh Kumar PBS
934d0ac9a6 RDMA/bnxt_re: Fix driver crash on unaligned PSN entry address
When computing the first psn entry, driver checks for page alignment. If
this address is not page aligned,it attempts to compute the offset in that
page for later use by using ALIGN macro. ALIGN macro does not return
offset bytes but the requested aligned address and hence cannot be used
directly to store as offset.  Since driver was using the address itself
instead of offset, it resulted in invalid address when filling the psn
buffer.

Fixed driver to use PAGE_MASK macro to calculate the offset.

Fixes: fddcbbb02a ("RDMA/bnxt_re: Simplify obtaining queue entry from hw ring")
Link: https://lore.kernel.org/r/1598292876-26529-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:44 -03:00
Naresh Kumar PBS
847b97887e RDMA/bnxt_re: Restrict the max_gids to 256
Some adapters report more than 256 gid entries. Restrict it to 256 for
now.

Fixes: 1ac5a4047975("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1598292876-26529-6-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:44 -03:00
Naresh Kumar PBS
f86b31c6a2 RDMA/bnxt_re: Static NQ depth allocation
At first, driver allocates memory for NQ based on qplib_ctx->cq_count and
qplib_ctx->srqc_count.  Later when creating ring, it uses a static value
of 128K -1.

Fixing this with a static value for now.

Fixes: b08fe048a6 ("RDMA/bnxt_re: Refactor net ring allocation function")
Link: https://lore.kernel.org/r/1598292876-26529-5-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:43 -03:00
Selvin Xavier
84cf229f40 RDMA/bnxt_re: Fix the qp table indexing
qp->id can be a value outside the max number of qp. Indexing the qp table
with the id can cause out of bounds crash. So changing the qp table
indexing by (qp->id % max_qp -1).

Allocating one extra entry for QP1. Some adapters create one more than the
max_qp requested to accommodate QP1.  If the qp->id is 1, store the
inforamtion in the last entry of the qp table.

Fixes: f218d67ef0 ("RDMA/bnxt_re: Allow posting when QPs are in error")
Link: https://lore.kernel.org/r/1598292876-26529-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:43 -03:00
Selvin Xavier
2d0e60ee32 RDMA/bnxt_re: Do not report transparent vlan from QP1
QP1 Rx CQE reports transparent VLAN ID in the completion and this is used
while reporting the completion for received MAD packet. Check if the vlan
id is configured before reporting it in the work completion.

Fixes: 84511455ac ("RDMA/bnxt_re: report vlan_id and sl in qp1 recv completion")
Link: https://lore.kernel.org/r/1598292876-26529-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-27 09:30:43 -03:00
Gustavo A. R. Silva
df561f6688 treewide: Use fallthrough pseudo-keyword
Replace the existing /* fall through */ comments and its variants with
the new pseudo-keyword macro fallthrough[1]. Also, remove unnecessary
fall-through markings when it is the case.

[1] https://www.kernel.org/doc/html/v5.7/process/deprecated.html?highlight=fallthrough#implicit-switch-case-fall-through

Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
2020-08-23 17:36:59 -05:00
Selvin Xavier
a812f2d60a RDMA/bnxt_re: Do not add user qps to flushlist
Driver shall add only the kernel qps to the flush list for clean up.
During async error events from the HW, driver is adding qps to this list
without checking if the qp is kernel qp or not.

Add a check to avoid user qp addition to the flush list.

Fixes: 942c9b6ca8 ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing")
Fixes: c50866e285 ("bnxt_re: fix the regression due to changes in alloc_pbl")
Link: https://lore.kernel.org/r/1596689148-4023-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-08-20 08:31:41 -03:00
Linus Torvalds
d7806bbd22 RDMA 5.9 merge window pull request
Smaller set of RDMA updates. A smaller number of 'big topics' with the
 majority of changes being driver updates.
 
 - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re
 
 - Removal of dead or redundant code across the drivers
 
 - RAW resource tracker dumps to include a device specific data blob for
   device objects to aide device debugging
 
 - Further advance the IOCTL interface, remove the ability to turn it off.
   Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands
 
 - Remove stubs related to devices with no pkey table
 
 - A shared CQ scheme to allow multiple ULPs to share the CQ rings of a
   device to give higher performance
 
 - Several more static checker, syzkaller and rare crashers fixed
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl8sSA0ACgkQOG33FX4g
 mxpp1w/8Df/KIB38PVHpKraIW10bX03KsXwoskMYCA+ITYWM5ce+P7YF+yXXGs69
 Vh2vUYHlr1RvqXQkq3Y3LjzCPKTYFuNFVQRZF1LrfbfOpSS9aoQqoxwgKs08dibm
 YDeRwueWneksWhXeEZLA0QoKd4kEWrScA/n7VGYQ4YcWw8FLKa9t6OMSGivCrFLu
 QA+sA9nytrvMWC5uJUCdeVwlRnoaICPYHmM5yafOykPyEciRw2jU1kzTRVy5Z0Hu
 iCsXm2lJPcVoMgSjW6SgktY3oBkQeSu3ZZesT3eTM6FJsoDYkuSiKjNmWSZjW1zv
 x6CFGjVVin41rN4FMTeqqnwYoML9Q/obbyHvBHs5MTd5J8tLDhesQj3Ev7CUaUed
 b0s38v+oEL1w22nkOChfeyfh7eLcy3yiszqvkIU9ABk8mF0p1guGQYsfguzbsq0K
 3ZRw/361SxCUBvU6P8CdQbIJlhkH+Un7d81qyt+rhLgaZYm/N+d8auIKUxP1jCxh
 q9hss2Cj2U9eZsA/wGNqV1LNazfEAAj/5qjItMirbRd90FL8h+AP2LfJfC7p+id3
 3BfOui0JbZqNTTl4ftTxPuxtWDEdTPgwi7JvQd/be9HRlSV8DYCSMUzYFn8A+Zya
 cbxjxFuBJWmF+y9csDIVBTdFi+j9hO6notw+G89NznuB3QlPl50=
 =0z2L
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A quiet cycle after the larger 5.8 effort. Substantially cleanup and
  driver work with a few smaller features this time.

   - Driver updates for hfi1, rxe, mlx5, hns, qedr, usnic, bnxt_re

   - Removal of dead or redundant code across the drivers

   - RAW resource tracker dumps to include a device specific data blob
     for device objects to aide device debugging

   - Further advance the IOCTL interface, remove the ability to turn it
     off. Add QUERY_CONTEXT, QUERY_MR, and QUERY_PD commands

   - Remove stubs related to devices with no pkey table

   - A shared CQ scheme to allow multiple ULPs to share the CQ rings of
     a device to give higher performance

   - Several more static checker, syzkaller and rare crashers fixed"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (121 commits)
  RDMA/mlx5: Fix flow destination setting for RDMA TX flow table
  RDMA/rxe: Remove pkey table
  RDMA/umem: Add a schedule point in ib_umem_get()
  RDMA/hns: Fix the unneeded process when getting a general type of CQE error
  RDMA/hns: Fix error during modify qp RTS2RTS
  RDMA/hns: Delete unnecessary memset when allocating VF resource
  RDMA/hns: Remove redundant parameters in set_rc_wqe()
  RDMA/hns: Remove support for HIP08_A
  RDMA/hns: Refactor hns_roce_v2_set_hem()
  RDMA/hns: Remove redundant hardware opcode definitions
  RDMA/netlink: Remove CAP_NET_RAW check when dump a raw QP
  RDMA/include: Replace license text with SPDX tags
  RDMA/rtrs: remove WQ_MEM_RECLAIM for rtrs_wq
  RDMA/rtrs-clt: add an additional random 8 seconds before reconnecting
  RDMA/cma: Execute rdma_cm destruction from a handler properly
  RDMA/cma: Remove unneeded locking for req paths
  RDMA/cma: Using the standard locking pattern when delivering the removal event
  RDMA/cma: Simplify DEVICE_REMOVAL for internal_id
  RDMA/efa: Add EFA 0xefa1 PCI ID
  RDMA/efa: User/kernel compatibility handshake mechanism
  ...
2020-08-06 16:43:36 -07:00
Michael Chan
bfc6e5fbcb bnxt_en: Update firmware interface to 1.10.1.54.
Main changes are 200G support and fixing the definitions of discard and
error counters to match the hardware definitions.

Because the HWRM_PORT_PHY_QCFG message size has now exceeded the max.
encapsulated response message size of 96 bytes from the PF to the VF,
we now need to cap this message to 96 bytes for forwarding.  The forwarded
response only needs to contain the basic link status and speed information
and can be capped without adding the new information.

v2: Fix bnxt_re compile error.

Cc: Selvin Xavier <selvin.xavier@broadcom.com>
Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Reviewed-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2020-07-27 11:47:33 -07:00
Devesh Sharma
2bb3c32c5c RDMA/bnxt_re: Change wr posting logic to accommodate variable wqes
Modifying the post-send and post-recv to initialize the wqes slot by slot
dynamically depending on the number of max sges requested by consumer at
the time of QP creation.

Changed the QP creation logic to determine the size of SQ and RQ in 16B
slots based on the number of wqe and number of SGEs requested by consumer

Link: https://lore.kernel.org/r/1594822619-4098-6-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:32:50 -03:00
Devesh Sharma
54ace98443 RDMA/bnxt_re: Add helper data structures
Adding few helper data structure which are useful to initialize hardware
send wqe in variable wqe mode.

Adding a qp flag in HSI to indicate variable wqe is enabled for this qp.

Link: https://lore.kernel.org/r/1594822619-4098-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:32:50 -03:00
Devesh Sharma
5ac5396a6c RDMA/bnxt_re: Pull psn buffer dynamically based on prod
Changing the PSN management memory buffers from statically initialized to
dynamic pull scheme.

During create qp only the start pointers are initialized and during
post-send the psn buffer is pulled based on current producer index.

Adjusting post_send code to accommodate dynamic psn-pull and changing
post_recv code to match post-send code wrt pseudo flush wqe generation.

Link: https://lore.kernel.org/r/1594822619-4098-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:32:49 -03:00
Devesh Sharma
159fb4ceac RDMA/bnxt_re: introduce a function to allocate swq
The bnxt_re driver now allocates shadow sq and rq to maintain per wqe
wr_id and few other flags required to support variable wqe. Segregated the
allocation of shadow queue in a separate function and adjust the cqe
polling logic. The new polling logic is based on shadow queue indices.

Link: https://lore.kernel.org/r/1594822619-4098-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:32:49 -03:00
Devesh Sharma
1da968e0ef RDMA/bnxt_re: introduce wqe mode to select execution path
The bnxt_re driver need to decide on how much SQ and RQ memory should to
be allocated and which wqe posting/polling algorithm to use.

Making changes to set the wqe-mode to a default value during device
registration sequence. The wqe-mode is passed to the lower layer driver as
well. Going forward in the lower layer driver wqe-mode will be used to
decide execution path. Initializing the wqe-mode to static wqe type for
now.

Link: https://lore.kernel.org/r/1594822619-4098-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-20 16:32:49 -03:00
Gal Pressman
42a3b15396 RDMA: Remove the udata parameter from alloc_mr callback
Allocating an MR flow can only be initiated by kernel users, and not from
userspace so a udata parameter is redundant.

Link: https://lore.kernel.org/r/20200706120343.10816-4-galpress@amazon.com
Signed-off-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2020-07-06 19:25:53 -03:00
Masahiro Yamada
a7f7f6248d treewide: replace '---help---' in Kconfig files with 'help'
Since commit 84af7a6194 ("checkpatch: kconfig: prefer 'help' over
'---help---'"), the number of '---help---' has been gradually
decreasing, but there are still more than 2400 instances.

This commit finishes the conversion. While I touched the lines,
I also fixed the indentation.

There are a variety of indentation styles found.

  a) 4 spaces + '---help---'
  b) 7 spaces + '---help---'
  c) 8 spaces + '---help---'
  d) 1 space + 1 tab + '---help---'
  e) 1 tab + '---help---'    (correct indentation)
  f) 1 tab + 1 space + '---help---'
  g) 1 tab + 2 spaces + '---help---'

In order to convert all of them to 1 tab + 'help', I ran the
following commend:

  $ find . -name 'Kconfig*' | xargs sed -i 's/^[[:space:]]*---help---/\thelp/'

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2020-06-14 01:57:21 +09:00
Jason Gunthorpe
7c08bc1956 RDMA/bnxt_re: Remove FMR leftovers
The bnxt_re_fmr struct is never referenced and the max_fmr items
in bnxt_qplib_dev_attr are never read.

Link: https://lore.kernel.org/r/6-v3-f58e6669d5d3+2cf-fmr_removal_jgg@mellanox.com
Reviewed-by: Max Gurtovoy <maxg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-06-02 20:32:53 -03:00
Maor Gottlieb
fa5d010c56 RDMA: Group create AH arguments in struct
Following patch adds additional argument to the create AH function, so it
make sense to group ah_attr and flags arguments in struct.

Link: https://lore.kernel.org/r/20200430192146.12863-13-maorg@mellanox.com
Signed-off-by: Maor Gottlieb <maorg@mellanox.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Weihang Li <liweihang@huawei.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-05-02 20:19:53 -03:00
Leon Romanovsky
322f3d45a1 RDMA/bnxt: Delete 'nq_ptr' variable which is not used
The variable "nq_ptr" is set but never used, this generates the following
warning while compiling kernel with W=1 option.

drivers/infiniband/hw/bnxt_re/qplib_fp.c: In function 'bnxt_qplib_service_nq':
drivers/infiniband/hw/bnxt_re/qplib_fp.c:303:25: warning:
   variable 'nq_ptr' set but not used [-Wunused-but-set-variable]
303 |  struct nq_base *nqe, **nq_ptr;
    |

Fixes: fddcbbb02a ("RDMA/bnxt_re: Simplify obtaining queue entry from hw ring")
Link: https://lore.kernel.org/r/20200419132046.123887-1-leon@kernel.org
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-22 17:05:13 -03:00
Devesh Sharma
8ce111d00e RDMA/bnxt_re: Remove dead code from rcfw
In the previous refactoring serise there were few leftover functions which
are not is use anymore.  Removed them as it is a dead code.

Fixes: 6f53196bc5 ("RDMA/bnxt_re: Refactor doorbell management functions")
Link: https://lore.kernel.org/r/1585851136-2316-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma
fddcbbb02a RDMA/bnxt_re: Simplify obtaining queue entry from hw ring
Restructring the data path and control path queue management code to
simplify the way a queue element is extracted from the hardware ring.

Introduced a new function which will give a pointer to the next ring item
depending upon the current cons/prod index in the hardware queue.

Further, there are hardcoding when size of queue entry is calculated,
replacing it with an inline function. This function would be easier to
expand if need going forward.

The code section to initialize the PSN search areas has also been
restructured and couple of functions has been added there.

Link: https://lore.kernel.org/r/1585851136-2316-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma
c78671a4e6 RDMA/bnxt_re: Update missing hsi data structures
Adding fast path support data structure into hardware HSI. These
structures are header only definition of RQE/SRQE/SQE. This is to help
calculating the size of hardware wqe size.

Link: https://lore.kernel.org/r/1585851136-2316-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:35 -03:00
Devesh Sharma
99bf84e24e RDMA/bnxt_re: Reduce device page size detection code
Getting rid of the repeated code in the driver when deciding on the page
size of the hardware ring memory. A new common function would translate
the ring page size into device specific page size.

Link: https://lore.kernel.org/r/1585851136-2316-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-04-14 16:39:34 -03:00
YueHaibing
b4d8ddf835 RDMA/bnxt_re: make bnxt_re_ib_init static
Fix sparse warning:

drivers/infiniband/hw/bnxt_re/main.c:1313:5:
 warning: symbol 'bnxt_re_ib_init' was not declared. Should it be static?

Link: https://lore.kernel.org/r/20200330110219.24448-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-30 15:03:19 -03:00
Selvin Xavier
b1d56fdcb6 RDMA/bnxt_re: Wait for all the CQ events before freeing CQ data structures
Destroy CQ command to firmware returns the num_cnq_events as a
response. This indicates the driver about the number of CQ events
generated for this CQ. Driver should wait for all these events before
freeing the CQ host structures.  Also, add routine to clean all the
pending notification for the CQs getting destroyed. This avoids the
possibility of accessing the CQ data structures after its freed.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584120842-3200-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-24 20:15:36 -03:00
Selvin Xavier
4e88cef11d RDMA/bnxt_re: Remove unnecessary sched count
Since the lifetime of bnxt_re_task is controlled by the kref of device,
sched_count is no longer required.  Remove it.

Link: https://lore.kernel.org/r/1584117207-2664-4-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Jason Gunthorpe
8a6c617047 RDMA/bnxt_re: Fix lifetimes in bnxt_re_task
A work queue cannot just rely on the ib_device not being freed, it must
hold a kref on the memory so that the BNXT_RE_FLAG_IBDEV_REGISTERED check
works.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Link: https://lore.kernel.org/r/1584117207-2664-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Jason Gunthorpe
3cae58047c RDMA/bnxt_re: Use ib_device_try_get()
There are a couple places in this driver running from a work queue that
need the ib_device to be registered. Instead of using a broken internal
bit rely on the new core code to guarantee device registration.

Link: https://lore.kernel.org/r/1584117207-2664-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-17 20:15:03 -03:00
Christophe JAILLET
24a5b0ce71 RDMA/bnxt_re: Remove a redundant 'memset'
'wqe' is already zeroed at the top of the 'while' loop, just a few lines
below, and is not used outside of the loop.

So there is no need to zero it again, or for the variable to be declared
outside the loop.

Link: https://lore.kernel.org/r/20200308065442.5415-1-christophe.jaillet@wanadoo.fr
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-10 14:33:21 -03:00
Kamal Heib
bb8865f435 RDMA/providers: Fix return value when QP type isn't supported
The proper return code is "-EOPNOTSUPP" when the requested QP type is
not supported by the provider.

Link: https://lore.kernel.org/r/20200130082049.463-1-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-04 12:13:42 -04:00
YueHaibing
75d0366508 RDMA/bnxt_re: Remove set but not used variables 'pg' and 'idx'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/hw/bnxt_re/qplib_rcfw.c: In function '__send_message':
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:101:10: warning:
 variable 'idx' set but not used [-Wunused-but-set-variable]
drivers/infiniband/hw/bnxt_re/qplib_rcfw.c:101:6: warning:
 variable 'pg' set but not used [-Wunused-but-set-variable]

commit cee0c7bba4 ("RDMA/bnxt_re: Refactor command queue management
code") involved this, but not used.

Link: https://lore.kernel.org/r/20200227064900.92255-1-yuehaibing@huawei.com
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-02 11:10:38 -04:00
YueHaibing
a0b404a98e RDMA/bnxt_re: Remove set but not used variable 'dev_attr'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/hw/bnxt_re/ib_verbs.c: In function 'bnxt_re_create_gsi_qp':
drivers/infiniband/hw/bnxt_re/ib_verbs.c:1283:30: warning:
 variable 'dev_attr' set but not used [-Wunused-but-set-variable]

commit 8dae419f9e ("RDMA/bnxt_re: Refactor queue pair creation code")
involved this, but not used, so remove it.

Link: https://lore.kernel.org/r/20200227064542.91205-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-02 11:10:38 -04:00
YueHaibing
6be2067d1e RDMA/bnxt_re: Remove set but not used variable 'pg_size'
Fixes gcc '-Wunused-but-set-variable' warning:

drivers/infiniband/hw/bnxt_re/qplib_res.c: In function '__alloc_pbl':
drivers/infiniband/hw/bnxt_re/qplib_res.c:109:13: warning:
 variable 'pg_size' set but not used [-Wunused-but-set-variable]

commit 0c4dcd6028 ("RDMA/bnxt_re: Refactor hardware queue memory
allocation") involved this, but not used, so remove it.

Link: https://lore.kernel.org/r/20200227064209.87893-1-yuehaibing@huawei.com
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-02 11:10:38 -04:00
Selvin Xavier
66832705c4 RDMA/bnxt_re: Use driver_unregister and unregistration API
Using the new unregister APIs provided by the core.  Provide the
dealloc_driver hook for the core to callback at the time of device
un-registration.

bnxt_re VF resources are created by the corresponding PF driver.  During
ib_unregister_driver, PF might get removed before VF and this could cause
failure when VFs are removed. Driver is explicitly queuing the removal of
VF devices before calling ib_unregister_driver.

Link: https://lore.kernel.org/r/1582731932-26574-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-02 11:10:38 -04:00
Selvin Xavier
c2b777a959 RDMA/bnxt_re: Refactor device add/remove functionalities
- bnxt_re_ib_reg() handles two main functionalities - initializing the
   device and registering with the IB stack.  Split it into 2 functions
   i.e. bnxt_re_dev_init() and bnxt_re_ib_init() to account for the same
   thereby improve modularity. Do the same for
   bnxt_re_ib_unreg()i.e. split into two functions - bnxt_re_dev_uninit()
   and bnxt_re_ib_uninit().

 - Simplify the code by combining the different steps to add and remove
   the device into two functions.

 - Report correct netdev link state during device register

Link: https://lore.kernel.org/r/1582731932-26574-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-03-02 11:10:37 -04:00
Jason Gunthorpe
65a1662015 RDMA/bnxt_re: Using vmalloc requires including vmalloc.h
Add it

Fixes: 0c4dcd6028 ("RDMA/bnxt_re: Refactor hardware queue memory allocation")
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-26 13:24:41 -04:00
Devesh Sharma
6ccad8483b RDMA/bnxt_re: use ibdev based message printing functions
Replacing the dev_err/dbg/warn with ibdev_err/dbg/warn. In the IB device
provider driver these functions are recommended to use.

Currently qplib layer function calls has not been replaced due to
unavailability of ib_device pointer at that layer.

Link: https://lore.kernel.org/r/1581786665-23705-9-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:44 -04:00
Devesh Sharma
6f53196bc5 RDMA/bnxt_re: Refactor doorbell management functions
Moving all the fast path doorbell functions at one place under
qplib_res.h. To pass doorbell record information a new structure
bnxt_qplib_db_info has been introduced.  Every roce object holds an
instance of this structure and doorbell information is initialized during
resource creation.

When DB is rung only the current queue index is read from hardware ring
and rest of the data is taken from pre-initialized dbinfo structure.

Link: https://lore.kernel.org/r/1581786665-23705-8-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:44 -04:00
Devesh Sharma
9555352bac RDMA/bnxt_re: Refactor notification queue management code
Cleaning up the notification queue data structures and management
code. The CQ and SRQ event handlers have been type defined instead of
in-place declaration. NQ doorbell register descriptor has been added in
base NQ structure.  The nq->vector has been renamed to nq->msix_vec.

Link: https://lore.kernel.org/r/1581786665-23705-7-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma
cee0c7bba4 RDMA/bnxt_re: Refactor command queue management code
Refactoring the command queue (rcfw) management code. A new data-structure
is introduced to describe the bar register.  each object which deals with
mmio space should have a descriptor structure. This structure specifically
hold DB register information.  Thus, slow path creq structure now hold a
bar register descriptor.

Further cleanup the rcfw structure to introduce the command queue context
and command response event queue context structures. Rest of the rcfw
related code has been touched to incorporate these three structures.

Link: https://lore.kernel.org/r/1581786665-23705-6-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma
b08fe048a6 RDMA/bnxt_re: Refactor net ring allocation function
Introducing a new attribute structure to reduce the long list of arguments
passed in bnxt_re_net_ring_alloc() function.

The caller of bnxt_re_net_ring_alloc should fill in the list of attributes
in bnxt_re_ring_attr structure and then pass the pointer to the function.

Link: https://lore.kernel.org/r/1581786665-23705-5-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma
0c4dcd6028 RDMA/bnxt_re: Refactor hardware queue memory allocation
At top level there are three major data structure addition.  viz
bnxt_qplib_hwq_attr, bnxt_qplib_sg_info and bnxt_qplib_tqm_ctx

Intorduction of first data structure reduces the arguments list to
bnxt_re_alloc_init_hwq() function. There are changes all over the driver
code to incorporate this new structure. The caller needs to fill the
attribute data structure and pass to this function.

The second data structure is to pass memory region description
viz. sghead, page_size and page_shift. There are changes all over the
driver code to initialize bnxt_re_sg_info data structure. The new data
structure helps to reduce the argument list of __alloc_pbl() function
call.

Till now the TQM rings related members were not collected under any
specific data-structure making it hard to manage. The third data
sctructure bnxt_qplib_tqm_ctx is added to refactor the TQM queue
allocation and initialization.

Link: https://lore.kernel.org/r/1581786665-23705-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma
0cfb329db9 RDMA/bnxt_re: Replace chip context structure with pointer
The chip_ctx member in bnxt_re_dev structure is now a pointer to struct
bnxt_qplib_chip_ctx. Since the member type has changed there are changes
in rest of the code wherever dev->chip_ctx is used.

Link: https://lore.kernel.org/r/1581786665-23705-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:43 -04:00
Devesh Sharma
8dae419f9e RDMA/bnxt_re: Refactor queue pair creation code
Restructuring the bnxt_re_create_qp function. Listing below the major
changes:
 - Monolithic central part of create_qp where attributes are initialized
   is now enclosed in one function and this new function has few more
   sub-functions.
 - Top level qp limit checking code moved to a function.
 - GSI QP creation and GSI Shadow qp creation code is handled in a sub
   function.

Link: https://lore.kernel.org/r/1581786665-23705-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-21 20:21:42 -04:00
Selvin Xavier
0a01623b74 RDMA/bnxt_re: Use rdma_read_gid_hw_context to retrieve HW gid index
bnxt_re HW maintains a GID table with only a single entry for the two
duplicate GID entries (v1 and v2). Driver needs to map stack gid index to
the HW table gid index.  Use the new API rdma_read_gid_hw_context () to
retrieve the HW GID context to get the HW table index.

Link: https://lore.kernel.org/r/1582107594-5180-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-02-19 16:54:25 -04:00
Linus Torvalds
8fdd4019bc RDMA subsystem updates for 5.6
- Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4, rxe,
   i40iw
 
 - Larger series doing cleanup and rework for hns and hfi1.
 
 - Some general reworking of the CM code to make it a little more
   understandable
 
 - Unify the different code paths connected to the uverbs FD scheme
 
 - New UAPI ioctls conversions for get context and get async fd
 
 - Trace points for CQ and CM portions of the RDMA stack
 
 - mlx5 driver support for virtio-net formatted rings as RDMA raw ethernet QPs
 
 - verbs support for setting the PCI-E relaxed ordering bit on DMA traffic
   connected to a MR
 
 - A couple of bug fixes that came too late to make rc7
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAl4zPwQACgkQOG33FX4g
 mxoURw//fuQmuJ7aTMH+0qrhaZUmzXOcI/WKvY0YMyYLvxolRcIO+uCL239wxezR
 9iTHPO7HeYXUQ4W8Hi/fTyuQ9hzaPOP3wgOJfQhm4QT/XDpRW0H3Mb+hTLHTUAcA
 rgKc9suAn+5BbIDOz7hEfeOTssx1wYrLsaHDc11NZ42JuG6uvPR33lhXiKWG+5tH
 2MpfeTU6BjL035dm3YZXCo+ouobpdMuvzJItYIsB2E5Nl0s91SMzsymIYiD0gb3t
 yUJ3wqPW3pchfAl8VEn+W5AHTUYYgGjmEblL8WdVq5JRrkQgQzj8QtCRT9NOPAT0
 LivCvgBrm0kscaQS2TjtG56Ojbwz8z1QPE/4shf0pj/G2lZfacYDAeaUc/2VafxY
 y/KG+3dB1DxtYY3eXJUxbB7Vpk7kfr35p5b75NdMhd2t49oPgV7EKoZMLYGzfX4S
 PtyNyNSiwx8qsRTr4lznOMswmrDLfG4XiywWgYo6NGOWyKYlARWIYBAEQZ0DPTiE
 9mqJ19gusdSdAgm8LGDInPmH6/AojGOVzYonJFWdlOtwCXGNXL4Gx02x4WYHykDG
 w+oy5NMJbU3b6+MWEagkuQNcrwqv02MT1mB/Lgv4GPm6rS0UXR7zUPDeccE50fSL
 X36k28UlftlPlaD7PeJdTOAhyBv5DxfpL5rbB2TfpUTpNxjayuU=
 =hepK
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "A very quiet cycle with few notable changes. Mostly the usual list of
  one or two patches to drivers changing something that isn't quite rc
  worthy. The subsystem seems to be seeing a larger number of rework and
  cleanup style patches right now, I feel that several vendors are
  prepping their drivers for new silicon.

  Summary:

   - Driver updates and cleanup for qedr, bnxt_re, hns, siw, mlx5, mlx4,
     rxe, i40iw

   - Larger series doing cleanup and rework for hns and hfi1.

   - Some general reworking of the CM code to make it a little more
     understandable

   - Unify the different code paths connected to the uverbs FD scheme

   - New UAPI ioctls conversions for get context and get async fd

   - Trace points for CQ and CM portions of the RDMA stack

   - mlx5 driver support for virtio-net formatted rings as RDMA raw
     ethernet QPs

   - verbs support for setting the PCI-E relaxed ordering bit on DMA
     traffic connected to a MR

   - A couple of bug fixes that came too late to make rc7"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (108 commits)
  RDMA/core: Make the entire API tree static
  RDMA/efa: Mask access flags with the correct optional range
  RDMA/cma: Fix unbalanced cm_id reference count during address resolve
  RDMA/umem: Fix ib_umem_find_best_pgsz()
  IB/mlx4: Fix leak in id_map_find_del
  IB/opa_vnic: Spelling correction of 'erorr' to 'error'
  IB/hfi1: Fix logical condition in msix_request_irq
  RDMA/cm: Remove CM message structs
  RDMA/cm: Use IBA functions for complex structure members
  RDMA/cm: Use IBA functions for simple structure members
  RDMA/cm: Use IBA functions for swapping get/set acessors
  RDMA/cm: Use IBA functions for simple get/set acessors
  RDMA/cm: Add SET/GET implementations to hide IBA wire format
  RDMA/cm: Add accessors for CM_REQ transport_type
  IB/mlx5: Return the administrative GUID if exists
  RDMA/core: Ensure that rdma_user_mmap_entry_remove() is a fence
  IB/mlx4: Fix memory leak in add_gid error flow
  IB/mlx5: Expose RoCE accelerator counters
  RDMA/mlx5: Set relaxed ordering when requested
  RDMA/core: Add the core support field to METHOD_GET_CONTEXT
  ...
2020-01-31 14:40:36 -08:00
Linus Torvalds
bd2463ac7d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from David Miller:

 1) Add WireGuard

 2) Add HE and TWT support to ath11k driver, from John Crispin.

 3) Add ESP in TCP encapsulation support, from Sabrina Dubroca.

 4) Add variable window congestion control to TIPC, from Jon Maloy.

 5) Add BCM84881 PHY driver, from Russell King.

 6) Start adding netlink support for ethtool operations, from Michal
    Kubecek.

 7) Add XDP drop and TX action support to ena driver, from Sameeh
    Jubran.

 8) Add new ipv4 route notifications so that mlxsw driver does not have
    to handle identical routes itself. From Ido Schimmel.

 9) Add BPF dynamic program extensions, from Alexei Starovoitov.

10) Support RX and TX timestamping in igc, from Vinicius Costa Gomes.

11) Add support for macsec HW offloading, from Antoine Tenart.

12) Add initial support for MPTCP protocol, from Christoph Paasch,
    Matthieu Baerts, Florian Westphal, Peter Krystad, and many others.

13) Add Octeontx2 PF support, from Sunil Goutham, Geetha sowjanya, Linu
    Cherian, and others.

* git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1469 commits)
  net: phy: add default ARCH_BCM_IPROC for MDIO_BCM_IPROC
  udp: segment looped gso packets correctly
  netem: change mailing list
  qed: FW 8.42.2.0 debug features
  qed: rt init valid initialization changed
  qed: Debug feature: ilt and mdump
  qed: FW 8.42.2.0 Add fw overlay feature
  qed: FW 8.42.2.0 HSI changes
  qed: FW 8.42.2.0 iscsi/fcoe changes
  qed: Add abstraction for different hsi values per chip
  qed: FW 8.42.2.0 Additional ll2 type
  qed: Use dmae to write to widebus registers in fw_funcs
  qed: FW 8.42.2.0 Parser offsets modified
  qed: FW 8.42.2.0 Queue Manager changes
  qed: FW 8.42.2.0 Expose new registers and change windows
  qed: FW 8.42.2.0 Internal ram offsets modifications
  MAINTAINERS: Add entry for Marvell OcteonTX2 Physical Function driver
  Documentation: net: octeontx2: Add RVU HW and drivers overview
  octeontx2-pf: ethtool RSS config support
  octeontx2-pf: Add basic ethtool support
  ...
2020-01-28 16:02:33 -08:00
Linus Torvalds
6a1000bd27 ioremap changes for 5.6
- remove ioremap_nocache given that is is equivalent to
    ioremap everywhere
 -----BEGIN PGP SIGNATURE-----
 
 iQI/BAABCgApFiEEgdbnc3r/njty3Iq9D55TZVIEUYMFAl4vKHwLHGhjaEBsc3Qu
 ZGUACgkQD55TZVIEUYMPGBAAuVNUZaZfWYHpiVP2oRcUQUguFiD3NTbknsyzV2oH
 J9P0GfeENSKwE9OOhZ7XIjnCZAJwQgTK/ppQY5yiQ/KAtYyyXjXEJ6jqqjiTDInr
 +3+I3t/LhkgrK7tMrb7ylTGa/d7KhaciljnOXC8+b75iddvM9I1z2pbHDbppZMS9
 wT4RXL/cFtRb85AfOyPLybcka3f5P2gGvQz38qyimhJYEzHDXZu9VO1Bd20f8+Xf
 eLBKX0o6yWMhcaPLma8tm0M0zaXHEfLHUKLSOkiOk+eHTWBZ3b/w5nsOQZYZ7uQp
 25yaClbameAn7k5dHajduLGEJv//ZjLRWcN3HJWJ5vzO111aHhswpE7JgTZJSVWI
 ggCVkytD3ESXapvswmACSeCIDMmiJMzvn6JvwuSMVB7a6e5mcqTuGo/FN+DrBF/R
 IP+/gY/T7zIIOaljhQVkiEIIwiD/akYo0V9fheHTBnqcKEDTHV4WjKbeF6aCwcO+
 b8inHyXZSKSMG//UlDuN84/KH/o1l62oKaB1uDIYrrL8JVyjAxctWt3GOt5KgSFq
 wVz1lMw4kIvWtC/Sy2H4oB+RtODLp6yJDqmvmPkeJwKDUcd/1JKf0KsZ8j3FpGei
 /rEkBEss0KBKyFAgBSRO2jIpdj2epgcBcsdB/r5mlhcn8L77AS6mHbA173kY4pQ/
 Kdg=
 =TUCJ
 -----END PGP SIGNATURE-----

Merge tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap

Pull ioremap updates from Christoph Hellwig:
 "Remove the ioremap_nocache API (plus wrappers) that are always
  identical to ioremap"

* tag 'ioremap-5.6' of git://git.infradead.org/users/hch/ioremap:
  remove ioremap_nocache and devm_ioremap_nocache
  MIPS: define ioremap_nocache to ioremap
2020-01-27 13:03:00 -08:00
Jason Gunthorpe
e8b3a426fb Use ODP MRs for kernel ULPs
The following series extends MR creation routines to allow creation of
 user MRs through kernel ULPs as a proxy. The immediate use case is to
 allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
 MRs to be created and such MRs were not possible to create prior this
 series.
 
 The first part of this patchset extends RDMA to have special verb
 ib_reg_user_mr(). The common use case that uses this function is a
 userspace application that allocates memory for HCA access but the
 responsibility to register the memory at the HCA is on an kernel ULP.
 This ULP acts as an agent for the userspace application.
 
 The second part provides advise MR functionality for ULPs. This is
 integral part of ODP flows and used to trigger pagefaults in advance
 to prepare memory before running working set.
 
 The third part is actual user of those in-kernel APIs.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQT1m3YD37UfMCUQBNwp8NhrnBAZsQUCXiVO8AAKCRAp8NhrnBAZ
 scTrAP9gb0d3qv0IOtHw5aGI1DAgjTUn/SzUOnsjDEn7DIoh9gEA2+ZmaEyLXKrl
 +UcZb31auy5P8ueJYokRLhLAyRcOIAg=
 =yaHb
 -----END PGP SIGNATURE-----

Merge tag 'rds-odp-for-5.5' into rdma.git for-next

From https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma

Leon Romanovsky says:

====================
Use ODP MRs for kernel ULPs

The following series extends MR creation routines to allow creation of
user MRs through kernel ULPs as a proxy. The immediate use case is to
allow RDS to work over FS-DAX, which requires ODP (on-demand-paging)
MRs to be created and such MRs were not possible to create prior this
series.

The first part of this patchset extends RDMA to have special verb
ib_reg_user_mr(). The common use case that uses this function is a
userspace application that allocates memory for HCA access but the
responsibility to register the memory at the HCA is on an kernel ULP.
This ULP acts as an agent for the userspace application.

The second part provides advise MR functionality for ULPs. This is
integral part of ODP flows and used to trigger pagefaults in advance
to prepare memory before running working set.

The third part is actual user of those in-kernel APIs.
====================

* tag 'rds-odp-for-5.5':
  net/rds: Use prefetch for On-Demand-Paging MR
  net/rds: Handle ODP mr registration/unregistration
  net/rds: Detect need of On-Demand-Paging memory registration
  RDMA/mlx5: Fix handling of IOVA != user_va in ODP paths
  IB/mlx5: Mask out unsupported ODP capabilities for kernel QPs
  RDMA/mlx5: Don't fake udata for kernel path
  IB/mlx5: Add ODP WQE handlers for kernel QPs
  IB/core: Add interface to advise_mr for kernel users
  IB/core: Introduce ib_reg_user_mr
  IB: Allow calls to ib_umem_get from kernel ULPs

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-21 09:55:04 -04:00
Moni Shoua
c320e527e1 IB: Allow calls to ib_umem_get from kernel ULPs
So far the assumption was that ib_umem_get() and ib_umem_odp_get()
are called from flows that start in UVERBS and therefore has a user
context. This assumption restricts flows that are initiated by ULPs
and need the service that ib_umem_get() provides.

This patch changes ib_umem_get() and ib_umem_odp_get() to get IB device
directly by relying on the fact that both UVERBS and ULPs sets that
field correctly.

Reviewed-by: Guy Levi <guyle@mellanox.com>
Signed-off-by: Moni Shoua <monis@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
2020-01-16 16:14:28 +02:00
Christoph Hellwig
4bdc0d676a remove ioremap_nocache and devm_ioremap_nocache
ioremap has provided non-cached semantics by default since the Linux 2.6
days, so remove the additional ioremap_nocache interface.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Arnd Bergmann <arnd@arndb.de>
2020-01-06 09:45:59 +01:00
Selvin Xavier
53bb802315 RDMA/bnxt_re: Report more number of completion vectors
Report the the data path MSIx vectors allocated by driver as number of
completion vectors. One interrupt vector is used for Control path. So
reporting one less than the total number of MSIx vectors allocated by the
driver.

Link: https://lore.kernel.org/r/1574671174-5064-7-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:45:31 -04:00
Selvin Xavier
c527572358 RDMA/bnxt_re: Fix Send Work Entry state check while polling completions
Some adapters need a fence Work Entry to handle retransmission.  Currently
the driver checks for this condition, only if the Send queue entry is
signalled. Implement the condition check, irrespective of the signalled
state of the Work queue entries

Failure to add the fence can result in access to memory that is already
marked as completed, triggering data corruption, transmission failure,
IOMMU failures, etc.

Fixes: 9152e0b722 ("RDMA/bnxt_re: HW workarounds for handling specific conditions")
Link: https://lore.kernel.org/r/1574671174-5064-3-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:27:22 -04:00
Selvin Xavier
9a4467a6b2 RDMA/bnxt_re: Avoid freeing MR resources if dereg fails
The driver returns an error code for MR dereg, but frees the MR structure.
When the MR dereg is retried due to previous error, the system crashes as
the structure is already freed.

  BUG: unable to handle kernel NULL pointer dereference at 00000000000001b8
  PGD 0 P4D 0
  Oops: 0000 [#1] SMP PTI
  CPU: 7 PID: 12178 Comm: ib_send_bw Kdump: loaded Not tainted 4.18.0-124.el8.x86_64 #1
  Hardware name: Dell Inc. PowerEdge R430/03XKDV, BIOS 1.1.10 03/10/2015
  RIP: 0010:__dev_printk+0x2a/0x70
  Code: 0f 1f 44 00 00 49 89 d1 48 85 f6 0f 84 f6 2b 00 00 4c 8b 46 70 4d 85 c0 75 04 4c 8b
46 10 48 8b 86 a8 00 00 00 48 85 c0 74 16 <48> 8b 08 0f be 7f 01 48 c7 c2 13 ac ac 83 83 ef 30 e9 10 fe ff ff
  RSP: 0018:ffffaf7c04607a60 EFLAGS: 00010006
  RAX: 00000000000001b8 RBX: ffffa0010c91c488 RCX: 0000000000000246
  RDX: ffffaf7c04607a68 RSI: ffffa0010c91caa8 RDI: ffffffff83a788eb
  RBP: ffffaf7c04607ac8 R08: 0000000000000000 R09: ffffaf7c04607a68
  R10: 0000000000000000 R11: 0000000000000001 R12: ffffaf7c04607b90
  R13: 000000000000000e R14: 0000000000000000 R15: 00000000ffffa001
  FS:  0000146fa1f1cdc0(0000) GS:ffffa0012fac0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000000001b8 CR3: 000000007680a003 CR4: 00000000001606e0
  Call Trace:
   dev_err+0x6c/0x90
   ? dev_printk_emit+0x4e/0x70
   bnxt_qplib_rcfw_send_message+0x594/0x660 [bnxt_re]
   ? dev_err+0x6c/0x90
   bnxt_qplib_free_mrw+0x80/0xe0 [bnxt_re]
   bnxt_re_dereg_mr+0x2e/0xd0 [bnxt_re]
   ib_dereg_mr+0x2f/0x50 [ib_core]
   destroy_hw_idr_uobject+0x20/0x70 [ib_uverbs]
   uverbs_destroy_uobject+0x2e/0x170 [ib_uverbs]
   __uverbs_cleanup_ufile+0x6e/0x90 [ib_uverbs]
   uverbs_destroy_ufile_hw+0x61/0x130 [ib_uverbs]
   ib_uverbs_close+0x1f/0x80 [ib_uverbs]
   __fput+0xb7/0x230
   task_work_run+0x8a/0xb0
   do_exit+0x2da/0xb40
...
  RIP: 0033:0x146fa113a387
  Code: Bad RIP value.
  RSP: 002b:00007fff945d1478 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff02
  RAX: 0000000000000000 RBX: 000055a248908d70 RCX: 0000000000000000
  RDX: 0000146fa1f2b000 RSI: 0000000000000001 RDI: 000055a248906488
  RBP: 000055a248909630 R08: 0000000000010000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 000055a248906488
  R13: 0000000000000001 R14: 0000000000000000 R15: 000055a2489095f0

Do not free the MR structures, when driver returns error to the stack.

Fixes: 872f357824 ("RDMA/bnxt_re: Add support for MRs with Huge pages")
Link: https://lore.kernel.org/r/1574671174-5064-2-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2020-01-03 15:20:42 -04:00
Devesh Sharma
fca5b9dc09 RDMA/bnxt_re: Fix missing le16_to_cpu
From sparse:

drivers/infiniband/hw/bnxt_re/main.c:1274:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1275:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1276:18: warning: cast from restricted __le16
drivers/infiniband/hw/bnxt_re/main.c:1277:21: warning: restricted __le16 degrades to integer

Fixes: 2b827ea192 ("RDMA/bnxt_re: Query HWRM Interface version from FW")
Link: https://lore.kernel.org/r/1574317343-23300-4-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Devesh Sharma
98998ffe52 RDMA/bnxt_re: Fix stat push into dma buffer on gen p5 devices
Due to recent advances in the firmware for Broadcom's gen p5 series of
adaptors the driver code to report hardware counters has been broken
w.r.t. roce devices.

The new firmware command expects dma length to be specified during stat
dma buffer allocation.

Fixes: 2792b5b95e ("bnxt_en: Update firmware interface spec. to 1.10.0.89.")
Link: https://lore.kernel.org/r/1574317343-23300-3-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Luke Starrett
e284b159c6 RDMA/bnxt_re: Fix chip number validation Broadcom's Gen P5 series
In the first version of Gen P5 ASIC, chip-id was always set to 0x1750 for
all adaptor port configurations. This has been fixed in the new chip rev.

Due to this missing fix users are not able to use adaptors based on latest
chip rev of Broadcom's Gen P5 adaptors.

Fixes: ae8637e131 ("RDMA/bnxt_re: Add chip context to identify 57500 series")
Link: https://lore.kernel.org/r/1574317343-23300-2-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Naresh Kumar PBS <nareshkumar.pbs@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Luke Starrett <luke.starrett@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Krzysztof Kozlowski
6e419e35e6 RDMA/bnxt_re: Fix Kconfig indentation
Adjust indentation from spaces to tab (+optional two spaces) as in coding
style with command like:
	$ sed -e 's/^        /\t/' -i */Kconfig

Link: https://lore.kernel.org/r/20191120134138.15245-1-krzk@kernel.org
Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-25 10:31:47 -04:00
Christoph Hellwig
72b894b09a IB/umem: remove the dmasync argument to ib_umem_get
The argument is always ignored, so remove it.

Link: https://lore.kernel.org/r/20191113073214.9514-3-hch@lst.de
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-11-17 10:37:00 -04:00
Devesh Sharma
39c48c5146 RDMA/bnxt_re: Enable SRIOV VF support on Broadcom's 57500 adapter series
Broadcom's 575xx adapter series has support for SRIOV VFs.  Making changes
to enable SRIOV VF support. There are two major area where changes are
done:

 - Added new DB location for control-path and data-path DB ring

 - New devices do not need to issue the sriov-config slow-path command
   thus, skipping to call that firmware command.

For now enabling support for 64 RoCE VFs.

Link: https://lore.kernel.org/r/1570081715-14301-1-git-send-email-devesh.sharma@broadcom.com
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-08 16:22:24 -03:00
Kamal Heib
39ce85f3b1 RDMA/bnxt_re: Remove unsupported modify_device callback
There is no need to return always zero for function which is not
supported.

Link: https://lore.kernel.org/r/20190923104158.5331-3-kamalheib1@gmail.com
Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-10-01 13:06:10 -03:00
Colin Ian King
d97a3e92f3 RDMA/bnxt_re: Fix spelling mistake "missin_resp" -> "missing_resp"
There is a spelling mistake in a literal string, fix it.

Fixes: 89f81008ba ("RDMA/bnxt_re: expose detailed stats retrieved from HW")
Link: https://lore.kernel.org/r/20190911092856.11146-1-colin.king@canonical.com
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-16 10:58:57 -03:00
Jason Gunthorpe
75c66515e4 Merge tag 'v5.3-rc8' into rdma.git for-next
To resolve dependencies in following patches

mlx5_ib.h conflict resolved by keeing both hunks

Linux 5.3-rc8

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-13 16:59:51 -03:00
Navid Emamdoost
4a9d46a9fe RDMA: Fix goto target to release the allocated memory
In bnxt_re_create_srq(), when ib_copy_to_udata() fails allocated memory
should be released by goto fail.

Fixes: 37cb11acf1 ("RDMA/bnxt_re: Add SRQ support for Broadcom adapters")
Link: https://lore.kernel.org/r/20190910222120.16517-1-navid.emamdoost@gmail.com
Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com>
Reviewed-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-09-13 16:55:55 -03:00
Selvin Xavier
d37b1e5340 RDMA/bnxt_re: Fix stack-out-of-bounds in bnxt_qplib_rcfw_send_message
Driver copies FW commands to the HW queue as  units of 16 bytes. Some
of the command structures are not exact multiple of 16. So while copying
the data from those structures, the stack out of bounds messages are
reported by KASAN. The following error is reported.

[ 1337.530155] ==================================================================
[ 1337.530277] BUG: KASAN: stack-out-of-bounds in bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530413] Read of size 16 at addr ffff888725477a48 by task rmmod/2785

[ 1337.530540] CPU: 5 PID: 2785 Comm: rmmod Tainted: G           OE     5.2.0-rc6+ #75
[ 1337.530541] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 1337.530542] Call Trace:
[ 1337.530548]  dump_stack+0x5b/0x90
[ 1337.530556]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530560]  print_address_description+0x65/0x22e
[ 1337.530568]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530575]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530577]  __kasan_report.cold.3+0x37/0x77
[ 1337.530581]  ? _raw_write_trylock+0x10/0xe0
[ 1337.530588]  ? bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530590]  kasan_report+0xe/0x20
[ 1337.530592]  memcpy+0x1f/0x50
[ 1337.530600]  bnxt_qplib_rcfw_send_message+0x40a/0x850 [bnxt_re]
[ 1337.530608]  ? bnxt_qplib_creq_irq+0xa0/0xa0 [bnxt_re]
[ 1337.530611]  ? xas_create+0x3aa/0x5f0
[ 1337.530613]  ? xas_start+0x77/0x110
[ 1337.530615]  ? xas_clear_mark+0x34/0xd0
[ 1337.530623]  bnxt_qplib_free_mrw+0x104/0x1a0 [bnxt_re]
[ 1337.530631]  ? bnxt_qplib_destroy_ah+0x110/0x110 [bnxt_re]
[ 1337.530633]  ? bit_wait_io_timeout+0xc0/0xc0
[ 1337.530641]  bnxt_re_dealloc_mw+0x2c/0x60 [bnxt_re]
[ 1337.530648]  bnxt_re_destroy_fence_mr+0x77/0x1d0 [bnxt_re]
[ 1337.530655]  bnxt_re_dealloc_pd+0x25/0x60 [bnxt_re]
[ 1337.530677]  ib_dealloc_pd_user+0xbe/0xe0 [ib_core]
[ 1337.530683]  srpt_remove_one+0x5de/0x690 [ib_srpt]
[ 1337.530689]  ? __srpt_close_all_ch+0xc0/0xc0 [ib_srpt]
[ 1337.530692]  ? xa_load+0x87/0xe0
...
[ 1337.530840]  do_syscall_64+0x6d/0x1f0
[ 1337.530843]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1337.530845] RIP: 0033:0x7ff5b389035b
[ 1337.530848] Code: 73 01 c3 48 8b 0d 2d 0b 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 2c 00 f7 d8 64 89 01 48
[ 1337.530849] RSP: 002b:00007fff83425c28 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
[ 1337.530852] RAX: ffffffffffffffda RBX: 00005596443e6750 RCX: 00007ff5b389035b
[ 1337.530853] RDX: 000000000000000a RSI: 0000000000000800 RDI: 00005596443e67b8
[ 1337.530854] RBP: 0000000000000000 R08: 00007fff83424ba1 R09: 0000000000000000
[ 1337.530856] R10: 00007ff5b3902960 R11: 0000000000000206 R12: 00007fff83425e50
[ 1337.530857] R13: 00007fff8342673c R14: 00005596443e6260 R15: 00005596443e6750

[ 1337.530885] The buggy address belongs to the page:
[ 1337.530962] page:ffffea001c951dc0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 1337.530964] flags: 0x57ffffc0000000()
[ 1337.530967] raw: 0057ffffc0000000 0000000000000000 ffffffff1c950101 0000000000000000
[ 1337.530970] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 1337.530970] page dumped because: kasan: bad access detected

[ 1337.530996] Memory state around the buggy address:
[ 1337.531072]  ffff888725477900: 00 00 00 00 f1 f1 f1 f1 00 00 00 00 00 f2 f2 f2
[ 1337.531180]  ffff888725477980: 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 00
[ 1337.531288] >ffff888725477a00: 00 f2 f2 f2 f2 f2 f2 00 00 00 f2 00 00 00 00 00
[ 1337.531393]                                                  ^
[ 1337.531478]  ffff888725477a80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531585]  ffff888725477b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 1337.531691] ==================================================================

Fix this by passing the exact size of each FW command to
bnxt_qplib_rcfw_send_message as req->cmd_size. Before sending
the command to HW, modify the req->cmd_size to number of 16 byte units.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Link: https://lore.kernel.org/r/1566468170-489-1-git-send-email-selvin.xavier@broadcom.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-22 10:52:15 -04:00
Kamal Heib
72a7720fca RDMA: Introduce ib_port_phys_state enum
In order to improve readability, add ib_port_phys_state enum to replace
the use of magic numbers.

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Reviewed-by: Andrew Boyer <aboyer@tobark.org>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Bernard Metzler <bmt@zurich.ibm.com>
Link: https://lore.kernel.org/r/20190807103138.17219-2-kamalheib1@gmail.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-08-12 10:18:52 -04:00
Parav Pandit
bda9045a20 IB/bnxt_re: Do not notifify GID change event
GID table entry operations such as add/remove/modify are triggered
by the IB core for RoCE ports.
Hence, remove GID change notification from hw driver.

Reviewed-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Link: https://lore.kernel.org/r/20190726182652.50037-1-parav@mellanox.com
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-07-29 13:50:48 -04:00
Selvin Xavier
c56b593d2a RDMA/bnxt_re: Honor vlan_id in GID entry comparison
A GID entry consists of GID, vlan, netdev and smac.  Extend GID duplicate
check comparisons to consider vlan_id as well to support IPv6 VLAN based
link local addresses. Introduce a new structure (bnxt_qplib_gid_info) to
hold gid and vlan_id information.

The issue is discussed in the following thread
https://lore.kernel.org/r/AM0PR05MB4866CFEDCDF3CDA1D7D18AA5D1F20@AM0PR05MB4866.eurprd05.prod.outlook.com

Fixes: 823b23da71 ("IB/core: Allow vlan link local address based RoCE GIDs")
Cc: <stable@vger.kernel.org> # v5.2+
Link: https://lore.kernel.org/r/20190715091913.15726-1-selvin.xavier@broadcom.com
Reported-by: Yi Zhang <yi.zhang@redhat.com>
Co-developed-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Tested-by: Yi Zhang <yi.zhang@redhat.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-07-22 15:04:04 -03:00
Jason Gunthorpe
371bb62158 Linux 5.2-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAl0Os1seHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGtx4H/j6i482XzcGFKTBm
 A7mBoQpy+kLtoUov4EtBAR62OuwI8rsahW9di37QKndPoQrczWaKBmr3De6LCdPe
 v3pl3O6wBbvH5ru+qBPFX9PdNbDvimEChh7LHxmMxNQq3M+AjZAZVJyfpoiFnx35
 Fbge+LZaH/k8HMwZmkMr5t9Mpkip715qKg2o9Bua6dkH0AqlcpLlC8d9a+HIVw/z
 aAsyGSU8jRwhoAOJsE9bJf0acQ/pZSqmFp0rDKqeFTSDMsbDRKLGq/dgv4nW0RiW
 s7xqsjb/rdcvirRj3rv9+lcTVkOtEqwk0PVdL9WOf7g4iYrb3SOIZh8ZyViaDSeH
 VTS5zps=
 =huBY
 -----END PGP SIGNATURE-----

Merge tag 'v5.2-rc6' into rdma.git for-next

For dependencies in next patches.

Resolve conflicts:
- Use uverbs_get_cleared_udata() with new cq allocation flow
- Continue to delete nes despite SPDX conflict
- Resolve list appends in mlx5_command_str()
- Use u16 for vport_rule stuff
- Resolve list appends in struct ib_client

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-28 21:18:23 -03:00
Leon Romanovsky
836a0fbb3e RDMA: Check umem pointer validity prior to release
Update ib_umem_release() to behave similarly to kfree() and allow
submitting NULL pointer as safe input to this function.

Fixes: a52c8e2469 ("RDMA: Clean destroy CQ in drivers do not return errors")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-20 15:17:59 -04:00
Leon Romanovsky
e39afe3d6d RDMA: Convert CQ allocations to be under core responsibility
Ensure that CQ is allocated and freed by IB/core and not by drivers.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Tested-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:39:49 -04:00
Leon Romanovsky
a52c8e2469 RDMA: Clean destroy CQ in drivers do not return errors
Like all other destroy commands, .destroy_cq() call is not supposed
to fail. In all flows, the attempt to return earlier caused to memory
leaks.

This patch converts .destroy_cq() to do not return any errors.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Acked-by: Gal Pressman <galpress@amazon.com>
Acked-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
2019-06-11 16:17:10 -04:00
Jason Gunthorpe
7a15414252 RDMA: Move owner into struct ib_device_ops
This more closely follows how other subsytems work, with owner being a
member of the structure containing the function pointers.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:03 -03:00
Jason Gunthorpe
72c6ec18eb RDMA: Move uverbs_abi_ver into struct ib_device_ops
No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Jason Gunthorpe
b9560a419b RDMA: Move driver_id into struct ib_device_ops
No reason for every driver to emit code to set this, just make it part of
the driver's existing static const ops structure.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-06-10 16:56:02 -03:00
Thomas Gleixner
ec8f24b7fa treewide: Add SPDX license identifier - Makefile/Kconfig
Add SPDX license identifiers to all Make/Kconfig files which:

 - Have no license information of any form

These files fall under the project license, GPL v2 only. The resulting SPDX
license identifier is:

  GPL-2.0-only

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-05-21 10:50:46 +02:00
Shiraz Saleem
d85582517e RDMA/bnxt_re: Use core helpers to get aligned DMA address
Call the core helpers to retrieve the HW aligned address to use for the
MR, within a supported bnxt_re page size.

Remove checking the umem->hugtetlb flag as it is no longer required. The
new DMA block iterator will return the 2M aligned address if the MR is
backed by 2M huge pages.

Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-06 13:08:11 -03:00
Parav Pandit
a70c07397f RDMA: Introduce and use GID attr helper to read RoCE L2 fields
Instead of RoCE drivers figuring out vlan, smac fields while working on
QP/AH, provide a helper routine to read the L2 fields such as vlan_id and
source mac address.

This moves logic from mlx5 driver to core for wider usage for RoCE ports.

This is a preparation patch to allow detaching netdev in subsequent patch.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-05-03 11:10:02 -03:00
Matthew Wilcox
a7b36d5fa8 ib/bnxt: Remove mention of idr_alloc from comment
Signed-off-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-25 12:02:18 -03:00
Jason Gunthorpe
4b38da75e0 RDMA/drivers: Convert easy drivers to use ib_device_set_netdev()
Drivers that never change their ndev dynamically do not need to use
the get_netdev callback.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Acked-by: Michal Kalderon <michal.kalderon@marvell.com>
Acked-by: Adit Ranadive <aditr@vmware.com>
2019-04-09 10:14:54 -03:00
Leon Romanovsky
68e326dea1 RDMA: Handle SRQ allocations by IB/core
Convert SRQ allocation from drivers to be in the IB/core

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Leon Romanovsky
d345691471 RDMA: Handle AH allocations by IB/core
Simplify drivers by ensuring lifetime of ib_ah object. The changes
in .create_ah() go hand in hand with relevant update in .destroy_ah().

We will use this opportunity and convert .destroy_ah() to don't fail, as
it was suggested a long time ago, because there is nothing to do in case
of failure during destroy.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-08 13:05:25 -03:00
Shamir Rabinovitch
ff23dfa134 IB: Pass only ib_udata in function prototypes
Now when ib_udata is passed to all the driver's object create/destroy APIs
the ib_udata will carry the ib_ucontext for every user command. There is
no need to also pass the ib_ucontext via the functions prototypes.

Make ib_udata the only argument psssed.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 15:00:47 -03:00
Shamir Rabinovitch
c4367a2635 IB: Pass uverbs_attr_bundle down ib_x destroy path
The uverbs_attr_bundle with the ucontext is sent down to the drivers ib_x
destroy path as ib_udata. The next patch will use the ib_udata to free the
drivers destroy path from the dependency in 'uobject->context' as we
already did for the create path.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-04-01 14:57:35 -03:00
Selvin Xavier
5aa8484080 RDMA/bnxt_re: Use correct sizing on buffers holding page DMA addresses
umem->nmap is used while allocating internal buffer for storing
page DMA addresses. This causes out of bounds array access while iterating
the umem DMA-mapped SGL with umem page combining as umem->nmap can be
less than number of system pages in umem.

Use ib_umem_num_pages() instead of umem->nmap to size the page array.
Add a new structure (bnxt_qplib_sg_info) to pass sglist, npages and nmap.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Shiraz Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-28 14:13:27 -03:00
Enrico Weigelt, metux IT consult
1a2e158327 drivers: infiniband: Fix whitespace in kconfig
Adjust the kconfig whitespace in bnxt_re/iser to match the kernel
standard.

Signed-off-by: Enrico Weigelt, metux IT consult <info@metux.net>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-26 12:49:33 -03:00
Linus Torvalds
a50243b1dd 5.1 Merge Window Pull Request
This has been a slightly more active cycle than normal with ongoing core
 changes and quite a lot of collected driver updates.
 
 - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe
 
 - A new data transfer mode for HFI1 giving higher performance
 
 - Significant functional and bug fix update to the mlx5 On-Demand-Paging MR
   feature
 
 - A chip hang reset recovery system for hns
 
 - Change mm->pinned_vm to an atomic64
 
 - Update bnxt_re to support a new 57500 chip
 
 - A sane netlink 'rdma link add' method for creating rxe devices and fixing
   the various unregistration race conditions in rxe's unregister flow
 
 - Allow lookup up objects by an ID over netlink
 
 - Various reworking of the core to driver interface:
   * Drivers should not assume umem SGLs are in PAGE_SIZE chunks
   * ucontext is accessed via udata not other means
   * Start to make the core code responsible for object memory
     allocation
   * Drivers should convert struct device to struct ib_device
     via a helper
   * Drivers have more tools to avoid use after unregister problems
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEfB7FMLh+8QxL+6i3OG33FX4gmxoFAlyAJYYACgkQOG33FX4g
 mxrWwQ/+OyAx4Moru7Aix0C6GWxTJp/wKgw21CS3reZxgLai6x81xNYG/s2wCNjo
 IccObVd7mvzyqPdxOeyHBsJBbQDqWvoD6O2duH8cqGMgBRgh3CSdUep2zLvPpSAx
 2W1SvWYCLDnCuarboFrCA8c4AN3eCZiqD7z9lHyFQGjy3nTUWzk1uBaOP46uaiMv
 w89N8EMdXJ/iY6ONzihvE05NEYbMA8fuvosKLLNdghRiHIjbMQU8SneY23pvyPDd
 ZziPu9NcO3Hw9OVbkwtJp47U3KCBgvKHmnixyZKkikjiD+HVoABw2IMwcYwyBZwP
 Bic/ddONJUvAxMHpKRnQaW7znAiHARk21nDG28UAI7FWXH/wMXgicMp6LRcNKqKF
 vqXdxHTKJb0QUR4xrYI+eA8ihstss7UUpgSgByuANJ0X729xHiJtlEvPb1DPo1Dz
 9CB4OHOVRl5O8sA5Jc6PSusZiKEpvWoyWbdmw0IiwDF5pe922VLl5Nv88ta+sJ38
 v2Ll5AgYcluk7F3599Uh9D7gwp5hxW2Ph3bNYyg2j3HP4/dKsL9XvIJPXqEthgCr
 3KQS9rOZfI/7URieT+H+Mlf+OWZhXsZilJG7No0fYgIVjgJ00h3SF1/299YIq6Qp
 9W7ZXBfVSwLYA2AEVSvGFeZPUxgBwHrSZ62wya4uFeB1jyoodPk=
 =p12E
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma

Pull rdma updates from Jason Gunthorpe:
 "This has been a slightly more active cycle than normal with ongoing
  core changes and quite a lot of collected driver updates.

   - Various driver fixes for bnxt_re, cxgb4, hns, mlx5, pvrdma, rxe

   - A new data transfer mode for HFI1 giving higher performance

   - Significant functional and bug fix update to the mlx5
     On-Demand-Paging MR feature

   - A chip hang reset recovery system for hns

   - Change mm->pinned_vm to an atomic64

   - Update bnxt_re to support a new 57500 chip

   - A sane netlink 'rdma link add' method for creating rxe devices and
     fixing the various unregistration race conditions in rxe's
     unregister flow

   - Allow lookup up objects by an ID over netlink

   - Various reworking of the core to driver interface:
       - drivers should not assume umem SGLs are in PAGE_SIZE chunks
       - ucontext is accessed via udata not other means
       - start to make the core code responsible for object memory
         allocation
       - drivers should convert struct device to struct ib_device via a
         helper
       - drivers have more tools to avoid use after unregister problems"

* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma: (280 commits)
  net/mlx5: ODP support for XRC transport is not enabled by default in FW
  IB/hfi1: Close race condition on user context disable and close
  RDMA/umem: Revert broken 'off by one' fix
  RDMA/umem: minor bug fix in error handling path
  RDMA/hns: Use GFP_ATOMIC in hns_roce_v2_modify_qp
  cxgb4: kfree mhp after the debug print
  IB/rdmavt: Fix concurrency panics in QP post_send and modify to error
  IB/rdmavt: Fix loopback send with invalidate ordering
  IB/iser: Fix dma_nents type definition
  IB/mlx5: Set correct write permissions for implicit ODP MR
  bnxt_re: Clean cq for kernel consumers only
  RDMA/uverbs: Don't do double free of allocated PD
  RDMA: Handle ucontext allocations by IB/core
  RDMA/core: Fix a WARN() message
  bnxt_re: fix the regression due to changes in alloc_pbl
  IB/mlx4: Increase the timeout for CM cache
  IB/core: Abort page fault handler silently during owning process exit
  IB/mlx5: Validate correct PD before prefetch MR
  IB/mlx5: Protect against prefetch of invalid MR
  RDMA/uverbs: Store PR pointer before it is overwritten
  ...
2019-03-09 15:53:03 -08:00
Devesh Sharma
0fca467e81 bnxt_re: Clean cq for kernel consumers only
Kernel space provider driver should clean the CQs belonging to kernel
space consumers only. The current implementation is doing reverse of it.

Fixing the same by avoiding the call to __clean_cq on a kernel qp during
destroy.

Fixes: c50866e285 ("bnxt_re: fix the regression due to changes in alloc_pbl")
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-03-04 10:51:14 -04:00
Jakub Kicinski
f4b6bcc700 net: devlink: turn devlink into a built-in
Being able to build devlink as a module causes growing pains.
First all drivers had to add a meta dependency to make sure
they are not built in when devlink is built as a module.  Now
we are struggling to invoke ethtool compat code reliably.

Make devlink code built-in, users can still not build it at
all but the dynamically loadable module option is removed.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2019-02-26 08:49:05 -08:00
Leon Romanovsky
a2a074ef39 RDMA: Handle ucontext allocations by IB/core
Following the PD conversion patch, do the same for ucontext allocations.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22 14:11:37 -07:00
Devesh Sharma
c50866e285 bnxt_re: fix the regression due to changes in alloc_pbl
While adding the use of for_each_sg_dma_page iterator for Brodcom's rdma
driver, there was a regression added in the __alloc_pbl path. The change
left bnxt_re in DOA state in for-next branch.

Fixing the regression to avoid the host crash when a user space object is
created. Restricting the unconditional access to hwq.pg_arr when hwq is
initialized for user space objects.

Fixes: 161ebe2498 ("RDMA/bnxt_re: Use for_each_sg_dma_page iterator on umem SGL")
Reported-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-22 11:17:22 -07:00
Shamir Rabinovitch
8994445054 IB/{hw,sw}: Remove 'uobject->context' dependency in object creation APIs
Now when we have the udata passed to all the ib_xxx object creation APIs
and the additional macro 'rdma_udata_to_drv_context' to get the
ib_ucontext from ib_udata stored in uverbs_attr_bundle, we can finally
start to remove the dependency of the drivers in the
ib_xxx->uobject->context.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-15 15:38:38 -07:00
Colin Ian King
a87145957e RDMA/bnxt_re: fix or'ing of data into an uninitialized struct member
The struct member comp_mask has not been initialized however a bit
pattern is being bitwise or'd into the member and hence other bit
fields in comp_mask may contain any garbage from the stack. Fix this
by making the bitwise or into an assignment.

Fixes: 95b86d1c91 ("RDMA/bnxt_re: Update kernel user abi to pass chip context")
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Acked-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-11 15:34:54 -07:00
Shiraz, Saleem
161ebe2498 RDMA/bnxt_re: Use for_each_sg_dma_page iterator on umem SGL
Use the for_each_sg_dma_page iterator variant to walk the umem DMA-mapped
SGL and get the page DMA address. This avoids the extra loop to iterate
pages in the SGE when for_each_sg iterator is used.

Additionally, purge umem->page_shift usage in the driver as its only
relevant for ODP MRs. Use system page size and shift instead.

Signed-off-by: Shiraz, Saleem <shiraz.saleem@intel.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-11 15:02:33 -07:00
Leon Romanovsky
21a428a019 RDMA: Handle PD allocations by IB/core
The PD allocations in IB/core allows us to simplify drivers and their
error flows in their .alloc_pd() paths. The changes in .alloc_pd() go hand
in had with relevant update in .dealloc_pd().

We will use this opportunity and convert .dealloc_pd() to don't fail, as
it was suggested a long time ago, failures are not happening as we have
never seen a WARN_ON print.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-08 16:51:04 -07:00
Devesh Sharma
95b86d1c91 RDMA/bnxt_re: Update kernel user abi to pass chip context
User space verbs provider library would need chip context.  Changing the
ABI to add chip version details in structure.  Furthermore, changing the
kernel driver ucontext allocation code to initialize the abi structure
with appropriate values.

As suggested by community, appended the new fields at the bottom of the
ABI structure and retaining to older fields as those were in the older
versions.

Keeping the ABI version at 1 and adding a new field in the ucontext
response structure to hold the component mask.  The user space library
should check pre-defined flags to figure out if a certain feature is
supported on not.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:49 -07:00
Devesh Sharma
37f91cff2d RDMA/bnxt_re: Add extended psn structure for 57500 adapters
The new 57500 series of adapter has bigger psn search structure.  The size
of new structure is 16B. Changing the control path memory allocation and
fast path code to accommodate the new psn structure while maintaining the
backward compatibility.

There are few additional changes listed below:
 - For 57500 chip max-sge are limited to 6 for now.
 - For 57500 chip max-receive-sge should be set to 6 for now.
 - Add driver/hardware interface structure for new chip.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma
374c5285ab RDMA/bnxt_re: Enable GSI QP support for 57500 series
In the new 57500 series of adapters the GSI qp is a UD type QP unlike the
previous generation where it was a Raw Eth QP. Changing the control and
data path to support the same. Listing all the significant diffs:

 - AH creation resolve network type unconditionally
 - Add check at relevant places to distinguish from Raw Eth
   processing flow.
 - bnxt_re_process_res_ud_wc report completion with GRH flag
   when qp is GSI.
 - Change length, cfa_meta and smac to match new driver/hardware
   interface.
 - Add new driver/hardware interface.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma
e0387e1dd4 RDMA/bnxt_re: Skip backing store allocation for 57500 series
The backing store to keep HW context data structures is allocated and
initialized by L2 driver. For 57500 chip RoCE driver do not require to
allocate and initialize additional memory. Changing to skip duplicate
allocation and initialization for 57500 adapters. Driver continues as
before for older chips.

This patch also takes care of stats context memory alignment to 128
boundary, a requirement for 57500 series of chip. Older chips do not care
of alignment, thus the change is unconditional.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma
b353ce556d RDMA/bnxt_re: Add 64bit doorbells for 57500 series
The new chip series has 64 bit doorbell for notification queues. Thus,
both control and data path event queues need new routines to write 64 bit
doorbell. Adding the same. There is new doorbell interface between the
chip and driver. Changing the chip specific data structure definitions.

Additional significant changes are listed below
- bnxt_re_net_ring_free/alloc takes a new argument
- bnxt_qplib_enable_nq and enable_rcfw uses new doorbell offset
  for new chip.
- DB mapping for NQ and CREQ now maps 8 bytes.
- DBR_DBR_* macros renames to DBC_DBC_*
- store nq_db_offset in a 32bit data type.
- got rid of __iowrite64_copy, used writeq instead.
- changed the DB header initialization to simpler scheme.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Devesh Sharma
ae8637e131 RDMA/bnxt_re: Add chip context to identify 57500 series
Adding setup and destroy routines for chip-context. The chip context would
be used frequently in control and data path to take execution flow
depending on the chip type.  chip context structure pointer is added to
the relevant data structures.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-02-07 13:24:48 -07:00
Leon Romanovsky
459cc69fa4 RDMA: Provide safe ib_alloc_device() function
All callers to ib_alloc_device() provide a larger size than struct
ib_device and rely on the fact that struct ib_device is embedded in their
driver specific structure as the first member.

Provide a safer variant of ib_alloc_device() that checks and enforces this
approach to make sure the drivers are using it right.

Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-30 15:52:30 -07:00
Jason Gunthorpe
55c293c38e Merge branch 'devx-async' into k.o/for-next
Yishai Hadas says:

Enable DEVX asynchronous query commands

This series enables querying a DEVX object in an asynchronous mode.

The userspace application won't block when calling the firmware and it will be
able to get the response back once that it will be ready.

To enable the above functionality:

- DEVX asynchronous command completion FD object was introduced.
- The applicable file operations were implemented to enable using it by
  the user application.
- Query asynchronous method was added to the DEVX object, it will call the
  firmware asynchronously and manages the response on the given input FD.
- Hot unplug support was added for the FD to work properly upon
  unbind/disassociate.
- mlx5 core fence for asynchronous commands was implemented and used to
  prevent racing upon unbind/disassociate.

This branch is based on mlx5-next & v5.0-rc2 due to dependencies, from
git://git.kernel.org/pub/scm/linux/kernel/git/mellanox/linux

* branch 'devx-async':
  IB/mlx5: Implement DEVX hot unplug for async command FD
  IB/mlx5: Implement the file ops of DEVX async command FD
  IB/mlx5: Introduce async DEVX obj query API
  IB/mlx5: Introduce MLX5_IB_OBJECT_DEVX_ASYNC_CMD_FD

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-29 13:49:31 -07:00
Masahiro Yamada
b360ce3b2b infiniband: prefix header search paths with $(srctree)/
Currently, the Kbuild core manipulates header search paths in a crazy
way [1].

To fix this mess, I want all Makefiles to add explicit $(srctree)/ to
the search paths in the srctree. Some Makefiles are already written in
that way, but not all. The goal of this work is to make the notation
consistent, and finally get rid of the gross hacks.

Having whitespaces after -I does not matter since commit 48f6e3cf5b
("kbuild: do not drop -I without parameter").

[1]: https://patchwork.kernel.org/patch/9632347/

Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>
Acked-by: Parvi Kaustubhi <pkaustub@cisco.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-25 15:28:50 -07:00
Dan Carpenter
97099cc652 RDMA/bnxt_re: fix a size calculation
This is from static analysis not from testing.  Depending on the value
of rcfw->cmdq_depth, then this might not cause an issue at runtime.

The BITS_TO_LONGS() macro tells us how many longs it take to hold a
bitmap.  In other words, it divides by the number if bits per long and
rounds up.  Then we want to take that number and multiple by
sizeof(long) to get the number of bytes to allocate.

The code here does the multiplication first so the rounding up is done
in the wrong place.  So imagine we want to allocate 1 bit, then
"(1 * 8) / 64 = 1" when we round up.  But it should be
"(1 / 64) * 8 = 8".  In other words, because of the rounding difference
we might allocate up to "sizeof(long) - 1" bytes fewer than intended.

Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 14:05:54 -07:00
Parav Pandit
5474723115 RDMA: Introduce and use rdma_device_to_ibdev()
Introduce and use rdma_device_to_ibdev() API for those drivers which are
registering one sysfs group and also use in ib_core.

In subsequent patch, device->provider_ibdev one-to-one mapping is no
longer holds true during accessing sysfs entries.
Therefore, introduce an API rdma_device_to_ibdev() that provides such
information.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 13:12:03 -07:00
Parav Pandit
ea4baf7f11 RDMA: Rename port_callback to init_port
Most provider routines are callback routines which ib core invokes.
_callback suffix doesn't convey information about when such callback is
invoked. Therefore, rename port_callback to init_port.

Additionally, store the init_port function pointer in ib_device_ops, so
that it can be accessed in subsequent patches when binding rdma device to
net namespace.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-14 13:05:14 -07:00
Jason Gunthorpe
b0ea0fa543 IB/{core,hw}: Have ib_umem_get extract the ib_ucontext from ib_udata
ib_umem_get() can only be called in a method callback, which always has a
udata parameter. This allows ib_umem_get() to derive the ucontext pointer
directly from the udata without requiring the drivers to find it in some
way or another.

Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
2019-01-10 17:07:45 -07:00
Luis Chamberlain
750afb08ca cross-tree: phase out dma_zalloc_coherent()
We already need to zero out memory for dma_alloc_coherent(), as such
using dma_zalloc_coherent() is superflous. Phase it out.

This change was generated with the following Coccinelle SmPL patch:

@ replace_dma_zalloc_coherent @
expression dev, size, data, handle, flags;
@@

-dma_zalloc_coherent(dev, size, handle, flags)
+dma_alloc_coherent(dev, size, handle, flags)

Suggested-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
[hch: re-ran the script on the latest tree]
Signed-off-by: Christoph Hellwig <hch@lst.de>
2019-01-08 07:58:37 -05:00
Aditya Pakki
94edd87a1c infiniband: bnxt_re: qplib: Check the return value of send_message
In bnxt_qplib_map_tc2cos(), bnxt_qplib_rcfw_send_message() can return an
error value but it is lost. Propagate this error to the callers.

Signed-off-by: Aditya Pakki <pakki001@umn.edu>
Acked-By: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2019-01-02 16:04:24 -07:00
Devesh Sharma
bd1c24ccf9 RDMA/bnxt_re: Increase depth of control path command queue
Increasing the depth of control path command queue to 8K entries to handle
burst of commands. This feature needs support from FW and the driver/fw
compatibility is checked from the interface version number.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:37:33 -07:00
Selvin Xavier
2b827ea192 RDMA/bnxt_re: Query HWRM Interface version from FW
Get HWRM interface major, minor, build and patch version from FW for
checking the FW/Driver compatibility.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:37:32 -07:00
Gal Pressman
50c582de1d RDMA/bnxt_re: Make use of destroy AH sleepable flag
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:04 -07:00
Gal Pressman
90e3edd8cc RDMA/bnxt_re: Make use of create AH sleepable flag
When in a sleepable (non-atomic) context, wait for firmware completion
instead of polling for it.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Acked-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:03 -07:00
Gal Pressman
2553ba217e RDMA: Mark if destroy address handle is in a sleepable context
Introduce a 'flags' field to destroy address handle callback and add a
flag that marks whether the callback is executed in an atomic context or
not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:28:03 -07:00
Gal Pressman
b090c4e3a0 RDMA: Mark if create address handle is in a sleepable context
Introduce a 'flags' field to create address handle callback and add a flag
that marks whether the callback is executed in an atomic context or not.

This will allow drivers to wait for completion instead of polling for it
when it is allowed.

Signed-off-by: Gal Pressman <galpress@amazon.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-19 16:17:19 -07:00
Shamir Rabinovitch
e00b64f7c5 RDMA: Cleanup undesired pd->uobject usage
Drivers should be using udata to determine if a method is invoked from
user space or kernel space. A pd does not necessarily say a different
objects is kernel or user.

Transforming the tests to use udata eliminates a large number of uobject
references from the drivers.

Signed-off-by: Shamir Rabinovitch <shamir.rabinovitch@oracle.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-18 19:15:48 -07:00
Kamal Heib
9615f86be9 RDMA/bnxt_re: Initialize ib_device_ops struct
Initialize ib_device_ops with the supported operations using
ib_set_device_ops().

Signed-off-by: Kamal Heib <kamalheib1@gmail.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-12-11 15:15:07 -07:00
Selvin Xavier
a6c66d6a08 RDMA/bnxt_re: Avoid accessing the device structure after it is freed
When bnxt_re_ib_reg returns failure, the device structure gets
freed. Driver tries to access the device pointer
after it is freed.

[ 4871.034744] Failed to register with netedev: 0xffffffa1
[ 4871.034765] infiniband (null): Failed to register with IB: 0xffffffea
[ 4871.046430] ==================================================================
[ 4871.046437] BUG: KASAN: use-after-free in bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046439] Write of size 4 at addr ffff880fa8406f48 by task kworker/u48:2/17813

[ 4871.046443] CPU: 20 PID: 17813 Comm: kworker/u48:2 Kdump: loaded Tainted: G B OE  4.20.0-rc1+ #42
[ 4871.046444] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[ 4871.046447] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[ 4871.046449] Call Trace:
[ 4871.046454]  dump_stack+0x91/0xeb
[ 4871.046458]  print_address_description+0x6a/0x2a0
[ 4871.046461]  kasan_report+0x176/0x2d0
[ 4871.046463]  ? bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046466]  bnxt_re_task+0x63/0x180 [bnxt_re]
[ 4871.046470]  process_one_work+0x216/0x5b0
[ 4871.046471]  ? process_one_work+0x189/0x5b0
[ 4871.046475]  worker_thread+0x4e/0x3d0
[ 4871.046479]  kthread+0x10e/0x140
[ 4871.046480]  ? process_one_work+0x5b0/0x5b0
[ 4871.046482]  ? kthread_stop+0x220/0x220
[ 4871.046486]  ret_from_fork+0x3a/0x50

[ 4871.046492] The buggy address belongs to the page:
[ 4871.046494] page:ffffea003ea10180 count:0 mapcount:0 mapping:0000000000000000 index:0x0
[ 4871.046495] flags: 0x57ffffc0000000()
[ 4871.046498] raw: 0057ffffc0000000 0000000000000000 ffffea003ea10188 0000000000000000
[ 4871.046500] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 4871.046501] page dumped because: kasan: bad access detected

Avoid accessing the device structure once it is freed.

Fixes: 497158aa5f ("RDMA/bnxt_re: Fix the ib_reg failure cleanup")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21 14:12:21 -07:00
Selvin Xavier
3c4b1419c3 RDMA/bnxt_re: Fix system hang when registration with L2 driver fails
Driver doesn't release rtnl lock if registration with
L2 driver (bnxt_re_register_netdev) fais and this causes
hang while requesting for the next lock.

[  371.635416] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  371.635417] kworker/u48:1   D    0   634      2 0x80000000
[  371.635423] Workqueue: bnxt_re bnxt_re_task [bnxt_re]
[  371.635424] Call Trace:
[  371.635426]  ? __schedule+0x36b/0xbd0
[  371.635429]  schedule+0x39/0x90
[  371.635430]  schedule_preempt_disabled+0x11/0x20
[  371.635431]  __mutex_lock+0x45b/0x9c0
[  371.635433]  ? __mutex_lock+0x16d/0x9c0
[  371.635435]  ? bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[  371.635438]  ? wake_up_klogd+0x37/0x40
[  371.635442]  bnxt_re_ib_reg+0x2b/0xb30 [bnxt_re]
[  371.635447]  bnxt_re_task+0xfd/0x180 [bnxt_re]
[  371.635449]  process_one_work+0x216/0x5b0
[  371.635450]  ? process_one_work+0x189/0x5b0
[  371.635453]  worker_thread+0x4e/0x3d0
[  371.635455]  kthread+0x10e/0x140
[  371.635456]  ? process_one_work+0x5b0/0x5b0
[  371.635458]  ? kthread_stop+0x220/0x220
[  371.635460]  ret_from_fork+0x3a/0x50
[  371.635477] INFO: task NetworkManager:1228 blocked for more than 120 seconds.
[  371.635478]       Tainted: G    B      OE     4.20.0-rc1+ #42
[  371.635479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.

Release the rtnl_lock correctly in the failure path.

Fixes: de5c95d0f5 ("RDMA/bnxt_re: Fix system crash during RDMA resource initialization")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-11-21 14:11:04 -07:00
Parav Pandit
508a523f6b RDMA/drivers: Use core provided API for registering device attributes
Use rdma_set_device_sysfs_group() to register device attributes and
simplify the driver.

Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-17 03:45:01 -06:00
Selvin Xavier
5df9509949 RDMA/bnxt_re: Avoid resource leak in case the NQ registration fails
In case the NQ alloc/enable fails, free up the already allocated/enabled
NQ before reporting failure. Also, track the alloc/enable using proper
state checking.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Selvin Xavier
a08b9e9a70 RDMA/bnxt_re: Wait for delayed work to finish before device removal
Delayed work bnxt_re_worker would be still running even after
cancel_delayed_work returns. This causes crash as the driver proceeds with
device removal. To make sure that the work is finished before returning,
use cancel_delayed_work_sync.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Devesh Sharma
854a202001 RDMA/bnxt_re: Limit max_pkey to 16 bit value
Some FW versios return pkey values more than 0xFFFF. pkey_tbl_len of
ib_port_attr is 16bit value. So restricting max_pkeys to 0xFFFF.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:51 -06:00
Devesh Sharma
4c01f2e3a9 RDMA/bnxt_re: Fix qp async event reporting
Reports affiliated async event on the qp-async event channel instead of
global event channel.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
316dd2825d RDMA/bnxt_re: Report out of sequence hw counters
Expose out of sequence errors received from FW.  This counter is a 32 bit
counter and driver has to accumulate the counter. Stores the previous
value for calculating the difference in the next query.

Also, update the HW statistics structure with new fields.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
5c80c9138e RDMA/bnxt_re: Expose rx discards and drop counters
Expose the RoCE discard and drop counters from the HW statistics context

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Somnath Kotur
bb22c36cba RDMA/bnxt_re: Prevent driver crash due to NULL pointer in error message print
crsqe->resp would be NULL in case the host command timed out before
getting a response from HW. Check for NULL pointer to avoid a potential
crash while printing the error message.

Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Devesh Sharma
f2bd4d096e RDMA/bnxt_re: Drop L2 async events silently
In some FW versions, RoCE driver also receives an async notification which
was directed to L2 driver.  RoCE driver does not handle this and print a
message to syslog.  Drop these notifications silently.

Signed-off-by: Devesh Sharma <devesh.sharma@broadcom.com>
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
ed51efd2ce RDMA/bnxt_re: Avoid accessing nq->bar_reg_iomem in failure case
In the failure path, nq->bar_reg_iomem gets accessed without
initializing. Avoid this by calling the bnxt_qplib_nq_stop_irq only if the
initialization is complete.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 1ac5a40479 ("RDMA/bnxt_re: Add bnxt_re RoCE driver")
Fixes: 6e04b10356 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
eae4ad1b0c RDMA/bnxt_re: Avoid NULL check after accessing the pointer
This is reported by smatch check.  rcfw->creq_bar_reg_iomem is accessed in
bnxt_qplib_rcfw_stop_irq and this variable check afterwards doesn't make
sense.  Also, rcfw->creq_bar_reg_iomem will never be NULL.  So Removing
this check.

Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Fixes: 6e04b10356 ("RDMA/bnxt_re: Fix broken RoCE driver due to recent L2 driver changes")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
1b7042d7a5 RDMA/bnxt_re: Remove the unnecessary version macro definition
Version macro is not required as the driver is not maintaining the
version. Removing the references of this macro too.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
d455f29f6d RDMA/bnxt_re: Fix recursive lock warning in debug kernel
Fix possible recursive lock warning. Its a false warning as the locks are
part of two differnt HW Queue data structure - cmdq and creq. Debug kernel
is throwing the following warning and stack trace.

[  783.914967] ============================================
[  783.914970] WARNING: possible recursive locking detected
[  783.914973] 4.19.0-rc2+ #33 Not tainted
[  783.914976] --------------------------------------------
[  783.914979] swapper/2/0 is trying to acquire lock:
[  783.914982] 000000002aa3949d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.914999]
but task is already holding lock:
[  783.915002] 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915013]
other info that might help us debug this:
[  783.915016]  Possible unsafe locking scenario:

[  783.915019]        CPU0
[  783.915021]        ----
[  783.915034]   lock(&(&hwq->lock)->rlock);
[  783.915035]   lock(&(&hwq->lock)->rlock);
[  783.915037]
 *** DEADLOCK ***

[  783.915038]  May be due to missing lock nesting notation

[  783.915039] 1 lock held by swapper/2/0:
[  783.915040]  #0: 00000000be73920d (&(&hwq->lock)->rlock){..-.}, at: bnxt_qplib_service_creq+0x2a/0x350 [bnxt_re]
[  783.915044]
stack backtrace:
[  783.915046] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.19.0-rc2+ #33
[  783.915047] Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.0.4 08/28/2014
[  783.915048] Call Trace:
[  783.915049]  <IRQ>
[  783.915054]  dump_stack+0x90/0xe3
[  783.915058]  __lock_acquire+0x106c/0x1080
[  783.915061]  ? sched_clock+0x5/0x10
[  783.915063]  lock_acquire+0xbd/0x1a0
[  783.915065]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915069]  _raw_spin_lock_irqsave+0x4a/0x90
[  783.915071]  ? bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915073]  bnxt_qplib_service_creq+0x232/0x350 [bnxt_re]
[  783.915078]  tasklet_action_common.isra.17+0x197/0x1b0
[  783.915081]  __do_softirq+0xcb/0x3a6
[  783.915084]  irq_exit+0xe9/0x100
[  783.915085]  do_IRQ+0x6a/0x120
[  783.915087]  common_interrupt+0xf/0xf
[  783.915088]  </IRQ>

Use nested notation for the spin_lock to avoid this warning.

Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00
Selvin Xavier
5a23e0b1dd RDMA/bnxt_re: Add missing spin lock initialization
Add the missing initalization of the cq_lock and qplib.flush_lock.

Fixes: 942c9b6ca8 ("RDMA/bnxt_re: Avoid Hard lockup during error CQE processing")
Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com>
Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
2018-10-16 00:03:50 -06:00