Commit Graph

48075 Commits

Author SHA1 Message Date
Yunsheng Lin
09d96ee567 page_pool: remove PP_FLAG_PAGE_FRAG
PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count
handling is unified and page_pool_alloc_frag() is supported
in 32-bit arch with 64-bit DMA, so remove it.

Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com>
CC: Lorenzo Bianconi <lorenzo@kernel.org>
CC: Alexander Duyck <alexander.duyck@gmail.com>
CC: Liang Chen <liangchen.linux@gmail.com>
CC: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-23 19:14:48 -07:00
Pavan Kumar Linga
46d913d480 idpf: cancel mailbox work in error path
In idpf_vc_core_init, the mailbox work is queued
on a mailbox workqueue but it is not cancelled on error.
This results in a call trace when idpf_mbx_task tries
to access the freed mailbox queue pointer. Fix it by
cancelling the mailbox work in the error path.

Fixes: 4930fbf419 ("idpf: add core init and interrupt request")
Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Pavan Kumar Linga <pavan.kumar.linga@intel.com>
Tested-by: Krishneil Singh  <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-3-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-23 15:55:31 -07:00
Michal Kubiak
d38b4d0d95 idpf: set scheduling mode for completion queue
The HW must be programmed differently for queue-based scheduling mode.
To program the completion queue context correctly, the control plane
must know the scheduling mode not only for the Tx queue, but also for
the completion queue.
Unfortunately, currently the driver sets the scheduling mode only for
the Tx queues.

Propagate the scheduling mode data for the completion queue as
well when sending the queue configuration messages.

Fixes: 1c325aac10 ("idpf: configure resources for TX queues")
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Michal Kubiak <michal.kubiak@intel.com>
Reviewed-by: Alan Brady <alan.brady@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Krishneil Singh  <krishneil.k.singh@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231023202655.173369-2-jacob.e.keller@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-23 15:55:31 -07:00
Su Hui
6e7ce2d71b net: lan966x: remove useless code in lan966x_xtr_irq_handler
'err' is useless after break, remove this to save space and
be more clear.

Signed-off-by: Su Hui <suhui@nfschina.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-23 09:56:05 +01:00
Eric Dumazet
225d9ddbac chtls: fix tp->rcv_tstamp initialization
tp->rcv_tstamp should be set to tcp_jiffies, not tcp_time_stamp().

Fixes: cc35c88ae4 ("crypto : chtls - CPL handler definition")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Ayush Sawal <ayush.sawal@chelsio.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-23 09:35:00 +01:00
Edwin Peer
5d4e1bf606 bnxt_en: extend media types to supported and autoneg modes
The current driver code does not accurately report the supported and
advertised link modes.  It basically always assumes the media type
is copper for any particular speed.  Utilize the recently added link
mode mappings to accurately report fully qualified ethtool link modes for
advertised and supported speeds.

If the media type is known, we will report the supported link modes for
that media only.  If the media is not known, we will report all possible
supported link modes.  The user can now specify any supported link modes
(including NRZ and PAM4) to advertise for autoneg.  It used to only accept
copper NRZ modes.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Edwin Peer
64d20aea6e bnxt_en: convert to linkmode_set_bit() API
Barring the BNXT_FW_TO_ETHTOOL speed macros, which will be removed
in the next patch, update code to use the newer API.

No functional change.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Michael Chan
5802e30317 bnxt_en: Refactor NRZ/PAM4 link speed related logic
Refactor some NRZ/PAM4 link speed related logic into helper functions.
The NRZ and PAM4 link parameters are stored in separate structure fields.
The driver logic has to check whether it is in NRZ or PAM4 mode and
then use the appropriate field.

Refactor this logic into helper functions for better readability.

Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com>
Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com>
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Edwin Peer
94c89e73d3 bnxt_en: refactor speed independent ethtool modes
A future patch in this series will change the algorithm used to
determine ethtool speed and media modes. Extract the handling of
the unrelated pause, autoneg modes into an independent function.
Also separate FEC handling out of bnxt_fw_to_ethtool_*_spds().

No functional change.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Edwin Peer
d6263677bb bnxt_en: support lane configuration via ethtool
Recent kernels support changing the number of link lanes via ethtool.
This is useful for determining the appropriate signal mode to use when
a given link speed can be achieved using different lane configurations.

Accept the ethtool lanes parameter when configuring forced speed.  If
there is no lanes parameter, select a default.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Edwin Peer
ecdad2a692 bnxt_en: add infrastructure to lookup ethtool link mode
Add infrastructure to look up the enum ethtool_link_mode_bit_indices
from link information provided by the firmware.  The link speed,
signal mode, and media type returned by firmware will be used to
look up the ethtool link mode.

The immediate benefit is that once the link mode is determined, we can
now use ethtool_params_from_link_mode() to fill the basic ethtool
parameters including the number of lanes.  Lanes will be fully
supported in the next patch.

Signed-off-by: Edwin Peer <edwin.peer@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Kalesh AP
fd78ec3fbc bnxt_en: Fix invoking hwmon_notify_event
FW sends the async event to the driver when the device temperature goes
above or below the threshold values.  Only notify hwmon if the
temperature is increasing to the next alert level, not when it is
decreasing.

Cc: Guenter Roeck <linux@roeck-us.net>
Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:46 +01:00
Kalesh AP
55862094a9 bnxt_en: Do not call sleeping hwmon_notify_event() from NAPI
Defer hwmon_notify_event() to bnxt_sp_task() workqueue because
hwmon_notify_event() can try to acquire a mutex shown in the stack trace
below.  Modify bnxt_event_error_report() to return true if we need to
schedule bnxt_sp_task() to notify hwmon.

  __schedule+0x68/0x520
  hwmon_notify_event+0xe8/0x114
  schedule+0x60/0xe0
  schedule_preempt_disabled+0x28/0x40
  __mutex_lock.constprop.0+0x534/0x550
  __mutex_lock_slowpath+0x18/0x20
  mutex_lock+0x5c/0x70
  kobject_uevent_env+0x2f4/0x3d0
  kobject_uevent+0x10/0x20
  hwmon_notify_event+0x94/0x114
  bnxt_hwmon_notify_event+0x40/0x70 [bnxt_en]
  bnxt_event_error_report+0x260/0x290 [bnxt_en]
  bnxt_async_event_process.isra.0+0x250/0x850 [bnxt_en]
  bnxt_hwrm_handler.isra.0+0xc8/0x120 [bnxt_en]
  bnxt_poll_p5+0x150/0x350 [bnxt_en]
  __napi_poll+0x3c/0x210
  net_rx_action+0x308/0x3b0
  __do_softirq+0x120/0x3e0

Cc: Guenter Roeck <linux@roeck-us.net>
Fixes: a19b480145 ("bnxt_en: Event handler for Thermal event")
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-22 11:41:45 +01:00
Shinas Rasheed
e10f4019b1 octeon_ep: assert hardware structure sizes
Clean up structure defines related to hardware data to be
asserted to fixed sizes, as padding is not allowed
by hardware.

Signed-off-by: Shinas Rasheed <srasheed@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-21 15:41:00 +01:00
Su Hui
a1e4c334cb pds_core: add an error code check in pdsc_dl_info_get
check the value of 'ret' after call 'devlink_info_version_stored_put'.

Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://lore.kernel.org/r/20231019083351.1526484-1-suhui@nfschina.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-20 17:39:32 -07:00
Dan Carpenter
a41654c3ed ixgbe: fix end of loop test in ixgbe_set_vf_macvlan()
The list iterator in a list_for_each_entry() loop can never be NULL.
If the loop exits without hitting a break then the iterator points
to an offset off the list head and dereferencing it is an out of
bounds access.

Before we transitioned to using list_for_each_entry() loops, then
it was possible for "entry" to be NULL and the comments mention
this.  I have updated the comments to match the new code.

Fixes: c1fec89045 ("ethernet/intel: Use list_for_each_entry() helper")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:06 +01:00
Dan Carpenter
4690aea589 igb: Fix an end of loop test
When we exit a list_for_each_entry() without hitting a break statement,
the list iterator isn't NULL, it just point to an offset off the
list_head.  In that situation, it wouldn't be too surprising for
entry->free to be true and we end up corrupting memory.

The way to test for these is to just set a flag.

Fixes: c1fec89045 ("ethernet/intel: Use list_for_each_entry() helper")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:06 +01:00
Jacob Keller
640a65f801 ice: cleanup ice_find_netlist_node
The ice_find_netlist_node function was introduced in commit 8a3a565ff2
("ice: add admin commands to access cgu configuration"). Variations of this
function were reviewed concurrently on both intel-wired-lan[1][2], and
netdev [3][4]

[1]: https://lore.kernel.org/intel-wired-lan/20230913204943.1051233-7-vadim.fedorenko@linux.dev/
[2]: https://lore.kernel.org/intel-wired-lan/20230817000058.2433236-5-jacob.e.keller@intel.com/
[3]: https://lore.kernel.org/netdev/20230918212814.435688-1-anthony.l.nguyen@intel.com/
[4]: https://lore.kernel.org/netdev/20230913204943.1051233-7-vadim.fedorenko@linux.dev/

The variant I posted had a few changes due to review feedback which were
never incorporated into the DPLL series:

* Replace the references to ancient and long removed ICE_SUCCESS and
  ICE_ERR_DOES_NOT_EXIST status codes in the function comment.

* Return -ENOENT instead of -ENOTBLK, as a more common way to indicate that
  an entry doesn't exist.

* Avoid the use of memset() and use simple static initialization for the
  cmd variable.

* Use FIELD_PREP to assign the node_type_ctx.

* Remove an unnecessary local variable to keep track of rec_node_handle,
  just pass the node_handle pointer directly into ice_aq_get_netlist_node.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:06 +01:00
Jacob Keller
67918b6b26 ice: make ice_get_pf_c827_idx static
The ice_get_pf_c827_idx function is only called inside of ice_ptp_hw.c, so
there is no reason to export it. Mark it static and remove the declaration
from ice_ptp_hw.h

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:06 +01:00
Michal Swiatkowski
4d38cb44bd ice: manage VFs MSI-X using resource tracking
Track MSI-X for VFs using bitmap, by setting and clearing bitmap during
allocation and freeing.

Try to linearize irqs usage for VFs, by freeing them and allocating once
again. Do it only for VFs that aren't currently running.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Michal Swiatkowski
05c16687e0 ice: set MSI-X vector count on VF
Implement ops needed to set MSI-X vector count on VF.

sriov_get_vf_total_msix() should return total number of MSI-X that can
be used by the VFs. Return the value set by devlink resources API
(pf->req_msix.vf).

sriov_set_msix_vec_count() will set number of MSI-X on particular VF.
Disable VF register mapping, rebuild VSI with new MSI-X and queues
values and enable new VF register mapping.

For best performance set number of queues equal to number of MSI-X.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Michal Swiatkowski
ea4af9b400 ice: add bitmap to track VF MSI-X usage
Create a bitamp to track MSI-X usage for VFs. The bitmap has the size of
total MSI-X amount on device, because at init time the amount of MSI-X
used by VFs isn't known.

The bitmap is used in follow up patchset to provide a block of
continuous block of MSI-X indexes for each created VF.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Michal Swiatkowski
fe1c5ca2fe ice: implement num_msix field per VF
Store the amount of MSI-X per VF instead of storing it in pf struct. It
is used to calculate number of q_vectors (and queues) for VF VSI.

This is necessary because with follow up changes the number of MSI-X can
be different between VFs. Use it instead of using pf->vf_msix value in
all cases.

Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Przemek Kitszel
31642d2854 ice: store VF's pci_dev ptr in ice_vf
Extend struct ice_vf by vfdev.
Calculation of vfdev falls more nicely into ice_create_vf_entries().

Caching of vfdev enables simplification of ice_restore_all_vfs_msi_state().

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Co-developed-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Michal Swiatkowski
9dffb97da2 ice: add drop rule matching on not active lport
Inactive LAG port should not receive any packets, as it can cause adding
invalid FDBs (bridge offload). Add a drop rule matching on inactive lport
in LAG.

Reviewed-by: Simon Horman <horms@kernel.org>
Co-developed-by: Marcin Szycik <marcin.szycik@intel.com>
Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sujai Buvaneswaran <sujai.buvaneswaran@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Przemek Kitszel
4cd7bc7144 ice: remove unused ice_flow_entry fields
Remove ::entry and ::entry_sz fields of &ice_flow_entry,
as they were never set.

Suggested-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 12:53:05 +01:00
Petr Machata
b46c1f3f5e mlxsw: spectrum: Set SW LAG mode on Spectrum>1
On Spectrum-2, Spectrum-3 and Spectrum-4 machines, request SW
responsibility for placement of the LAG table.

On Spectrum-1, some FW versions claim to support lag_mode field despite
quietly ignoring any settings made to that field. Thus refrain from
attempting to configure lag_mode on those systems at all.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:50 +01:00
Petr Machata
c678972580 mlxsw: spectrum: Allocate LAG table when in SW LAG mode
In this patch, if the LAG mode is SW, allocate the LAG table and configure
SGCR to indicate where it was allocated.

We use the default "DDD" (for dynamic data duplication) layout of the LAG
table. In the DDD mode, the membership information for each LAG is copied
in 8 PGT entries. This is done for performance reasons. The LAG table then
needs to be allocated on an address aligned to 8. Deal with this by
moving the LAG init ahead so that the LAG table is allocated at address 0.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:50 +01:00
Petr Machata
8c893abd64 mlxsw: spectrum_pgt: Generalize PGT allocation
PGT blocks are allocated through the function
mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows
which piece of PGT exactly they want to get. That was fine while the FID
code was the only client allocating blocks of PGT. However for SW-allocated
LAG table, there will be an additional client: mlxsw_sp_lag_init(). The
interface should therefore be changed to not require particular
coordinates, but to take just the requested size, allocate the block
wherever, and give back the PGT address.

In this patch, change the interface accordingly. Initialize FID family's
pgt_base from the result of the PGT allocation (note that mlxsw makes a
copy of the family structure, so what gets initialized is not actually the
global structure). Drop the now-unnecessary pgt_base initializations and
the corresponding defines.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:50 +01:00
Petr Machata
f5e293f993 mlxsw: spectrum_fid: Allocate PGT for the whole FID family in one go
PGT blocks are allocated through the function
mlxsw_sp_pgt_mid_alloc_range(). The interface assumes that the caller knows
which piece of PGT exactly they want to get. That was fine while the FID
code was the only client allocating blocks of PGT. However for SW-allocated
LAG table, there will be an additional client: mlxsw_sp_lag_init(). The
interface should therefore be changed to not require particular
coordinates, but to take just the requested size, allocate the block
wherever, and give back the PGT address.

The current FID mode has one place where PGT address can be stored: the FID
family's pgt_base. The allocation scheme should therefore be changed from
allocating a block per FID flood table, to allocating a block per FID
family.

Do just that in this patch.

The per-family allocation is going to be useful for another related feature
as well: the CFF mode.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:50 +01:00
Petr Machata
daee7aaba8 mlxsw: pci: Permit toggling LAG mode
Add to struct mlxsw_config_profile a field lag_mode_prefer_sw for the
driver to indicate that SW LAG mode should be configured if possible. Add
to the PCI module code to set lag_mode as appropriate.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
b2e9b1fe8c mlxsw: core, pci: Add plumbing related to LAG mode
lag_mode describes where the responsibility for LAG table placement lies:
SW or FW. The bus module determines whether LAG is supported, can configure
it if it is, and knows what (if any) configuration has been applied.
Therefore add a bus callback to determine the configured LAG mode. Also add
to core an API to query it.

The LAG mode is for now kept at the default value of 0 for FW-managed. The
code to actually toggle it will be added later.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
8eabd10cdc mlxsw: cmd: Add QUERY_FW.lag_mode_support
Add QUERY_FW.lag_mode_support, which determines whether
CONFIG_PROFILE.lag_mode is available.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
eb26a59232 mlxsw: cmd: Add CONFIG_PROFILE.{set_, }lag_mode
Add CONFIG_PROFILE.lag_mode, which serves for moving responsibility for
placement of the LAG table from FW to SW. Whether lag_mode should be
configured is determined by CONFIG_PROFILE.set_lag_mode, which also add.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
be9ed47d3f mlxsw: cmd: Fix omissions in CONFIG_PROFILE field names in comments
A number of CONFIG_PROFILE fields' comments refer to a field named like
cmd_mbox_config_* instead of cmd_mbox_config_profile_*. Correct these
omissions.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
cf0a86e8ce mlxsw: reg: Add SGCR.lag_lookup_pgt_base
Add SGCR.lag_lookup_pgt_base, which is used for configuring the base
address of the LAG table within the PGT table for cases when the driver
is responsible for the table placement.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:49 +01:00
Petr Machata
66eaaa8541 mlxsw: reg: Drop SGCR.llb
SGCR, Switch General Configuration Register, has not been used since commit
b0d80c013b ("mlxsw: Remove Mellanox SwitchX-2 ASIC support"). We will
need the register again shortly, so instead of dropping it and
reintroducing again, just drop the sole unused field.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:47:48 +01:00
Przemek Kitszel
18256cb2d4 qed: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:50 +01:00
Przemek Kitszel
d17f98bf7c net/mlx5: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:50 +01:00
Przemek Kitszel
1d434b4849 mlxsw: core: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:50 +01:00
Przemek Kitszel
d8cf03fca3 octeontx2-af: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:50 +01:00
Przemek Kitszel
aca7734d94 hinic: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:50 +01:00
Przemek Kitszel
074e1b4221 bnxt_en: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:49 +01:00
Przemek Kitszel
47957bb3f7 pds_core: devlink health: use retained error fmsg API
Drop unneeded error checking.

devlink_fmsg_*() family of functions is now retaining errors,
so there is no need to check for them after each call.

Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-10-20 11:34:49 +01:00
Jakub Kicinski
041c3466f3 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

net/mac80211/key.c
  02e0e426a2 ("wifi: mac80211: fix error path key leak")
  2a8b665e6b ("wifi: mac80211: remove key_mtx")
  7d6904bf26 ("Merge wireless into wireless-next")
https://lore.kernel.org/all/20231012113648.46eea5ec@canb.auug.org.au/

Adjacent changes:

drivers/net/ethernet/ti/Kconfig
  a602ee3176 ("net: ethernet: ti: Fix mixed module-builtin object")
  98bdeae950 ("net: cpmac: remove driver to prepare for platform removal")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 13:29:01 -07:00
MD Danish Anwar
389db4fd67 net: ti: icssg-prueth: Fix r30 CMDs bitmasks
The bitmasks for EMAC_PORT_DISABLE and EMAC_PORT_FORWARD r30 commands are
wrong in the driver.

Update the bitmasks of these commands to the correct ones as used by the
ICSSG firmware. These bitmasks are backwards compatible and work with
any ICSSG firmware version.

Fixes: e9b4ece7d7 ("net: ti: icssg-prueth: Add Firmware config and classification APIs.")
Signed-off-by: MD Danish Anwar <danishanwar@ti.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20231018150715.3085380-1-danishanwar@ti.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 09:04:43 -07:00
Ivan Vecera
f2cab25b0e i40e: Align devlink info versions with ice driver and add docs
Align devlink info versions with ice driver so change 'fw.mgmt'
version to be 2-digit version [major.minor], add 'fw.mgmt.build'
that reports mgmt firmware build number and use '"fw.psid.api'
for NVM format version instead of incorrect '"fw.psid'.
Additionally add missing i40e devlink documentation.

Fixes: 5a423552e0 ("i40e: Add handler for devlink .info_get")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://lore.kernel.org/r/20231018123558.552453-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-10-19 09:03:56 -07:00
Christian Marangi
039550960a net: stmmac: increase TX coalesce timer to 5ms
Commit 8fce333170 ("net: stmmac: Rework coalesce timer and fix
multi-queue races") decreased the TX coalesce timer from 40ms to 1ms.

This caused some performance regression on some target (regression was
reported at least on ipq806x) in the order of 600mbps dropping from
gigabit handling to only 200mbps.

The problem was identified in the TX timer getting armed too much time.
While this was fixed and improved in another commit, performance can be
improved even further by increasing the timer delay a bit moving from
1ms to 5ms.

The value is a good balance between battery saving by prevending too
much interrupt to be generated and permitting good performance for
internet oriented devices.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-19 15:41:31 +02:00
Christian Marangi
a594166387 net: stmmac: move TX timer arm after DMA enable
Move TX timer arm call after DMA interrupt is enabled again.

The TX timer arm function changed logic and now is skipped if a napi is
already scheduled. By moving the TX timer arm call after DMA is enabled,
we permit to correctly skip if a DMA interrupt has been fired and a napi
has been scheduled again.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-19 15:41:31 +02:00
Christian Marangi
2d1a42cf7f net: stmmac: improve TX timer arm logic
There is currently a problem with the TX timer getting armed multiple
unnecessary times causing big performance regression on some device that
suffer from heavy handling of hrtimer rearm.

The use of the TX timer is an old implementation that predates the napi
implementation and the interrupt enable/disable handling.

Due to stmmac being a very old code, the TX timer was never evaluated
again with this new implementation and was kept there causing
performance regression. The performance regression started to appear
with kernel version 4.19 with 8fce333170 ("net: stmmac: Rework coalesce
timer and fix multi-queue races") where the timer was reduced to 1ms
causing it to be armed 40 times more than before.

Decreasing the timer made the problem more present and caused the
regression in the other of 600-700mbps on some device (regression where
this was notice is ipq806x).

The problem is in the fact that handling the hrtimer on some target is
expensive and recent kernel made the timer armed much more times.
A solution that was proposed was reverting the hrtimer change and use
mod_timer but such solution would still hide the real problem in the
current implementation.

To fix the regression, apply some additional logic and skip arming the
timer when not needed.

Arm the timer ONLY if a napi is not already scheduled. Running the timer
is redundant since the same function (stmmac_tx_clean) will run in the
napi TX poll. Also try to cancel any timer if a napi is scheduled to
prevent redundant run of TX call.

With the following new logic the original performance are restored while
keeping using the hrtimer.

Signed-off-by: Christian Marangi <ansuelsmth@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-10-19 15:41:31 +02:00