linux-loongson

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson synced 2025-09-02 16:44:59 +00:00

Author	SHA1	Message	Date
Jakub Kicinski	761d527d5d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.h `c948c0973d` ("bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops") `f2878cdeb7` ("bnxt_en: Add support to call FW to update a VNIC") Link: https://patch.msgid.link/20240822210125.1542769-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-22 17:06:18 -07:00
Somnath Kotur	8baeef7616	bnxt_en: Fix double DMA unmapping for XDP_REDIRECT Remove the dma_unmap_page_attrs() call in the driver's XDP_REDIRECT code path. This should have been removed when we let the page pool handle the DMA mapping. This bug causes the warning: WARNING: CPU: 7 PID: 59 at drivers/iommu/dma-iommu.c:1198 iommu_dma_unmap_page+0xd5/0x100 CPU: 7 PID: 59 Comm: ksoftirqd/7 Tainted: G W 6.8.0-1010-gcp #11-Ubuntu Hardware name: Dell Inc. PowerEdge R7525/0PYVT1, BIOS 2.15.2 04/02/2024 RIP: 0010:iommu_dma_unmap_page+0xd5/0x100 Code: 89 ee 48 89 df e8 cb f2 69 ff 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 31 f6 31 ff 45 31 c0 e9 ab 17 71 00 <0f> 0b 48 83 c4 08 5b 41 5c 41 5d 41 5e 41 5f 5d 31 c0 31 d2 31 c9 RSP: 0018:ffffab1fc0597a48 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff99ff838280c8 RCX: 0000000000000000 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffab1fc0597a78 R08: 0000000000000002 R09: ffffab1fc0597c1c R10: ffffab1fc0597cd3 R11: ffff99ffe375acd8 R12: 00000000e65b9000 R13: 0000000000000050 R14: 0000000000001000 R15: 0000000000000002 FS: 0000000000000000(0000) GS:ffff9a06efb80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000565c34c37210 CR3: 00000005c7e3e000 CR4: 0000000000350ef0 ? show_regs+0x6d/0x80 ? __warn+0x89/0x150 ? iommu_dma_unmap_page+0xd5/0x100 ? report_bug+0x16a/0x190 ? handle_bug+0x51/0xa0 ? exc_invalid_op+0x18/0x80 ? iommu_dma_unmap_page+0xd5/0x100 ? iommu_dma_unmap_page+0x35/0x100 dma_unmap_page_attrs+0x55/0x220 ? bpf_prog_4d7e87c0d30db711_xdp_dispatcher+0x64/0x9f bnxt_rx_xdp+0x237/0x520 [bnxt_en] bnxt_rx_pkt+0x640/0xdd0 [bnxt_en] __bnxt_poll_work+0x1a1/0x3d0 [bnxt_en] bnxt_poll+0xaa/0x1e0 [bnxt_en] __napi_poll+0x33/0x1e0 net_rx_action+0x18a/0x2f0 Fixes: `578fcfd26e` ("bnxt_en: Let the page pool manage the DMA mapping") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://patch.msgid.link/20240820203415.168178-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-21 17:36:56 -07:00
Simon Horman	a99ef548bb	bnx2x: Set ivi->vlan field as an integer In bnx2x_get_vf_config(): * The vlan field of ivi is a 32-bit integer, it is used to store a vlan ID. * The vlan field of bulletin is a 16-bit integer, it is also used to store a vlan ID. In the current code, ivi->vlan is set using memset. But in the case of setting it to the value of bulletin->vlan, this involves reading 32 bits from a 16bit source. This is likely safe, as the following 6 bytes are padding in the same structure, but none the less, it seems undesirable. However, it is entirely unclear to me how this scheme works on big-endian systems. Resolve this by simply assigning integer values to ivi->vlan. Flagged by W=1 builds. f.e. gcc-14 reports: In function 'fortify_memcpy_chk', inlined from 'bnx2x_get_vf_config' at .../bnx2x_sriov.c:2655:4: .../fortify-string.h:580:25: warning: call to '__read_overflow2_field' declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 580 \| __read_overflow2_field(q_size_field, size); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Brett Creeley <brett.creeley@amd.com> Link: https://patch.msgid.link/20240815-bnx2x-int-vlan-v1-1-5940b76e37ad@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-16 18:02:28 -07:00
Pavan Chebbi	c948c0973d	bnxt_en: Don't clear ntuple filters and rss contexts during ethtool ops The driver currently blindly deletes its cache of RSS cotexts and ntuple filters when the ethtool channel count is changing. It also deletes the ntuple filters cache when the default indirection table is changing. The core will not allow ethtool channels to drop below any that have been configured as ntuple destinations since this commit from 2022: `47f3ecf476` ("ethtool: Fail number of channels change when it conflicts with rxnfc") So there is absolutely no need to delete the ntuple filters and RSS contexts when changing ethtool channels. It is also unnecessary to delete ntuple filters when the default RSS indirection table is changing. Remove bnxt_clear_usr_fltrs() and bnxt_clear_rss_ctxis() from the ethtool ops and change them to static functions. This bug will cause confusion to the end user and causes failure when running the rss_ctx.py selftest. Fixes: `1018319f94` ("bnxt_en: Invalidate user filters when needed") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20240725111912.7bc17cf6@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240814225429.199280-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-15 19:12:46 -07:00
Simon Horman	1418e9ab3e	bnxt_en: avoid truncation of per rx run debugfs filename Although it seems unlikely in practice - there would need to be rx ring indexes greater than 10^10 - it is theoretically possible for the filename of per rx ring debugfs files to be truncated. This is because although a 16 byte buffer is provided, the length of the filename is restricted to 10 bytes. Remove this restriction and allow the entire buffer to be used. Also reduce the buffer to 12 bytes, which is sufficient. Given that the range of rx ring indexes likely much smaller than the maximum range of a 32-bit signed integer, a smaller buffer could be used, with some further changes. But this change seems simple, robust, and has minimal stack overhead. Flagged by gcc-14: .../bnxt_debugfs.c: In function 'bnxt_debug_dev_init': drivers/net/ethernet/broadcom/bnxt/bnxt_debugfs.c:69:30: warning: '%d' directive output may be truncated writing between 1 and 11 bytes into a region of size 10 [-Wformat-truncation=] 69 \| snprintf(qname, 10, "%d", ring_idx); \| ^~ In function 'debugfs_dim_ring_init', inlined from 'bnxt_debug_dev_init' at .../bnxt_debugfs.c:87:4: .../bnxt_debugfs.c:69:29: note: directive argument in the range [-2147483643, 2147483646] 69 \| snprintf(qname, 10, "%d", ring_idx); \| ^~~~ .../bnxt_debugfs.c:69:9: note: 'snprintf' output between 2 and 12 bytes into a destination of size 10 69 \| snprintf(qname, 10, "%d", ring_idx); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Compile tested only Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240813-bnxt-str-v2-2-872050a157e7@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-14 20:36:17 -07:00
Simon Horman	ffff7ee843	bnxt_en: Extend maximum length of version string by 1 byte This corrects an out-by-one error in the maximum length of the package version string. The size argument of snprintf includes space for the trailing '\0' byte, so there is no need to allow extra space for it by reducing the value of the size argument by 1. Found by inspection. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240813-bnxt-str-v2-1-872050a157e7@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-14 20:36:16 -07:00
Jakub Kicinski	ec6e57beaf	ethtool: rss: don't report key if device doesn't support it marvell/otx2 and mvpp2 do not support setting different keys for different RSS contexts. Contexts have separate indirection tables but key is shared with all other contexts. This is likely fine, indirection table is the most important piece. Don't report the key-related parameters from such drivers. This prevents driver-errors, e.g. otx2 always writes the main key, even when user asks to change per-context key. The second reason is that without this change tracking the keys by the core gets complicated. Even if the driver correctly reject setting key with rss_context != 0, change of the main key would have to be reflected in the XArray for all additional contexts. Since the additional contexts don't have their own keys not including the attributes (in Netlink speak) seems intuitive. ethtool CLI seems to deal with it just fine. Having to set the flag in majority of the drivers is a bit tedious but not reporting the key is a safer default. Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-12 14:16:24 +01:00
Jakub Kicinski	fb770fe758	eth: remove .cap_rss_ctx_supported from updated drivers Remove .cap_rss_ctx_supported from drivers which moved to the new API. This makes it easy to grep for drivers which still need to be converted. Reviewed-by: Gal Pressman <gal@nvidia.com> Reviewed-by: Edward Cree <ecree.xilinx@gmail.com> Reviewed-by: Joe Damato <jdamato@fastly.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-12 14:16:24 +01:00
David Wei	97cbf3d0ac	bnxt_en: only set dev->queue_mgmt_ops if supported by FW The queue API calls bnxt_hwrm_vnic_update() to stop/start the flow of packets, which can only properly flush the pipeline if FW indicates support. Add a macro BNXT_SUPPORTS_QUEUE_API that checks for the required flags and only set queue_mgmt_ops if true. Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
David Wei	b9d2956e86	bnxt_en: stop packet flow during bnxt_queue_stop/start The current implementation when resetting a queue while packets are flowing puts the queue into an inconsistent state. There needs to be some synchronisation with the FW. Add calls to bnxt_hwrm_vnic_update() to set the MRU for both the default and ntuple vnic during queue start/stop. When the MRU is set to 0, flow is stopped. Each Rx queue belongs to either the default or the ntuple vnic. With calling bnxt_hwrm_vnic_update() the calls to napi_enable() and napi_disable() must be removed for reset to work on a queue that has active traffic flowing e.g. iperf3. Co-developed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
David Wei	d41575f76a	bnxt_en: set vnic->mru in bnxt_hwrm_vnic_cfg() Set the newly added vnic->mru field in bnxt_hwrm_vnic_cfg(). Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
Michael Chan	6e360862c0	bnxt_en: Check the FW's VNIC flush capability Check the HWRM_VNIC_QCAPS FW response for the receive engine flush capability. This capability indicates that we can reliably support RX ring restart when calling HWRM_VNIC_UPDATE with MRU set to 0. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
Michael Chan	f2878cdeb7	bnxt_en: Add support to call FW to update a VNIC Add the function bnxt_hwrm_vnic_update() to call FW to update a VNIC. This call can be used when disabling and enabling a receive ring within a VNIC. The mru which is the maximum receive size of packets received by the VNIC can be updated. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
Michael Chan	fbda8ee64b	bnxt_en: Update firmware interface to 1.10.3.68 The main changes are: 1. HWRM_VNIC_UPDATE used to safely disable and enable an RX ring within the VNIC. 2. New flag in HWRM_VNIC_QCAPS to indicate FW will do the proper flush during HWRM_VNIC_UPDATE. 3. New flag in HWRM_FUNC_QCAPS to indicate that reservations for some resources such as VNIC can be reduced. 4. New backing store memory types not used by the driver yet. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-08-11 13:48:02 +01:00
Simon Horman	a39036847f	bnx2x: Provide declaration of dmae_reg_go_c in header Provide declaration of dmae_reg_go_c in header. This symbol is defined in bnx2x_main.c. And used in that file and bnx2x_stats.c. However, Sparse complains that there is no declaration of the symbol in dmae_reg_go_c nor is the symbol static. .../bnx2x_main.c:291:11: warning: symbol 'dmae_reg_go_c' was not declared. Should it be static? Address this by moving the declaration from bnx2x_stats.c to bnx2x_reg.h. No functional change intended. Compile tested only. Signed-off-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240806-bnx2x-dec-v1-1-ae844ec785e4@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-08 19:43:06 -07:00
Jakub Kicinski	e47fd9beb1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Link: https://patch.msgid.link/20240808170148.3629934-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-08 14:04:17 -07:00
Edward Cree	b54de55990	net: ethtool: fix off-by-one error in max RSS context IDs Both ethtool_ops.rxfh_max_context_id and the default value used when it's not specified are supposed to be exclusive maxima (the former is documented as such; the latter, U32_MAX, cannot be used as an ID since it equals ETH_RXFH_CONTEXT_ALLOC), but xa_alloc() expects an inclusive maximum. Subtract one from 'limit' to produce an inclusive maximum, and pass that to xa_alloc(). Increase bnxt's max by one to prevent a (very minor) regression, as BNXT_MAX_ETH_RSS_CTX is an inclusive max. This is safe since bnxt is not actually hard-limited; BNXT_MAX_ETH_RSS_CTX is just a leftover from old driver code that managed context IDs itself. Rename rxfh_max_context_id to rxfh_max_num_contexts to make its semantics (hopefully) more obvious. Fixes: `847a8ab186` ("net: ethtool: let the core choose RSS context IDs") Signed-off-by: Edward Cree <ecree.xilinx@gmail.com> Link: https://patch.msgid.link/5a2d11a599aa5b0cc6141072c01accfb7758650c.1723045898.git.ecree.xilinx@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-08 08:54:33 -07:00
Florian Fainelli	9ee09edc05	net: bcmgenet: Properly overlay PHY and MAC Wake-on-LAN capabilities Some Wake-on-LAN modes such as WAKE_FILTER may only be supported by the MAC, while others might be only supported by the PHY. Make sure that the .get_wol() returns the union of both rather than only that of the PHY if the PHY supports Wake-on-LAN. Fixes: `7e400ff35c` ("net: bcmgenet: Add support for PHY-based Wake-on-LAN") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20240806175659.3232204-1-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-08 08:29:29 -07:00
Michael Chan	da03f5d1b2	bnxt_en : Fix memory out-of-bounds in bnxt_fill_hw_rss_tbl() A recent commit has modified the code in __bnxt_reserve_rings() to set the default RSS indirection table to default only when the number of RX rings is changing. While this works for newer firmware that requires RX ring reservations, it causes the regression on older firmware not requiring RX ring resrvations (BNXT_NEW_RM() returns false). With older firmware, RX ring reservations are not required and so hw_resc->resv_rx_rings is not always set to the proper value. The comparison: if (old_rx_rings != bp->hw_resc.resv_rx_rings) in __bnxt_reserve_rings() may be false even when the RX rings are changing. This will cause __bnxt_reserve_rings() to skip setting the default RSS indirection table to default to match the current number of RX rings. This may later cause bnxt_fill_hw_rss_tbl() to use an out-of-range index. We already have bnxt_check_rss_tbl_no_rmgr() to handle exactly this scenario. We just need to move it up in bnxt_need_reserve_rings() to be called unconditionally when using older firmware. Without the fix, if the TX rings are changing, we'll skip the bnxt_check_rss_tbl_no_rmgr() call and __bnxt_reserve_rings() may also skip the bnxt_set_dflt_rss_indir_tbl() call for the reason explained in the last paragraph. Without setting the default RSS indirection table to default, it causes the regression: BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 Read of size 2 at addr ffff8881c5809618 by task ethtool/31525 Call Trace: __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460 __bnxt_setup_vnic_p5+0x12e/0x270 __bnxt_open_nic+0x2262/0x2f30 bnxt_open_nic+0x5d/0xf0 ethnl_set_channels+0x5d4/0xb30 ethnl_default_set_doit+0x2f1/0x620 Reported-by: Breno Leitao <leitao@debian.org> Closes: https://lore.kernel.org/netdev/ZrC6jpghA3PWVWSB@gmail.com/ Fixes: `98ba1d931f` ("bnxt_en: Fix RSS logic in __bnxt_reserve_rings()") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Tested-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20240806053742.140304-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-07 20:14:13 -07:00
Jakub Kicinski	5fa35bd39c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Link: https://patch.msgid.link/20240801131917.34494-1-pabeni@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-08-01 10:45:23 -07:00
Allen Pais	8d3beb6bc7	net: cnic: Convert tasklet API to new bottom half workqueue mechanism Migrate tasklet APIs to the new bottom half workqueue mechanism. It replaces all occurrences of tasklet usage with the appropriate workqueue APIs throughout the cnic driver. This transition ensures compatibility with the latest design and enhances performance. Signed-off-by: Allen Pais <allen.lkml@gmail.com> Link: https://patch.msgid.link/20240730183403.4176544-4-allen.lkml@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-31 18:59:46 -07:00
Jakub Kicinski	9dbad38336	eth: bnxt: populate defaults in the RSS context struct As described in the kdoc for .create_rxfh_context we are responsible for populating the defaults. The core will not call .get_rxfh for non-0 context. The problem can be easily observed since Netlink doesn't currently use the cache. Using netlink ethtool: $ ethtool -x eth0 context 1 [...] RSS hash key: 13:60:cd:60:14:d3:55:36:86:df:90:f2:96:14:e2:21:05:57:a8:8f:a5:12:5e:54:62:7f:fd:3c:15:7e:76:05:71:42:a2:9a:73:80:09:9c RSS hash function: toeplitz: on xor: off crc32: off But using IOCTL ethtool shows: $ ./ethtool-old -x eth0 context 1 [...] RSS hash key: 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00 RSS hash function: Operation not supported Fixes: `7964e78846` ("net: ethtool: use the tracking array for get_rxfh on custom RSS contexts") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-29 10:59:07 +01:00
Jakub Kicinski	daefd348a5	eth: bnxt: reject unsupported hash functions In commit under Fixes I split the bnxt_set_rxfh_context() function, and attached the appropriate chunks to new ops. I missed that bnxt_set_rxfh_context() gets called after some initial checks in bnxt_set_rxfh(), namely that the hash function is Toeplitz. Fixes: `5c466b4d4e` ("eth: bnxt: move from .set_rxfh to .create_rxfh_context and friends") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-29 10:59:07 +01:00
Pavan Chebbi	98ba1d931f	bnxt_en: Fix RSS logic in __bnxt_reserve_rings() In __bnxt_reserve_rings(), the existing code unconditionally sets the default RSS indirection table to default if netif_is_rxfh_configured() returns false. This used to be correct before we added RSS contexts support. For example, if the user is changing the number of ethtool channels, we will enter this path to reserve the new number of rings. We will then set the RSS indirection table to default to cover the new number of rings if netif_is_rxfh_configured() is false. Now, with RSS contexts support, if the user has added or deleted RSS contexts, we may now enter this path to reserve the new number of VNICs. However, netif_is_rxfh_configured() will not return the correct state if we are still in the middle of set_rxfh(). So the existing code may set the indirection table of the default RSS context to default by mistake. Fix it to check if the reservation of the RX rings is changing. Only check netif_is_rxfh_configured() if it is changing. RX rings will not change in the middle of set_rxfh() and this will fix the issue. Fixes: `b3d0083caf` ("bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh()") Reported-and-tested-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/20240625010210.2002310-1-kuba@kernel.org Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20240724222106.147744-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-25 16:22:35 -07:00
Linus Torvalds	1722389b0d	A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Including fixes from bpf and netfilter. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaibxAACgkQMUZtbf5S IruuIRAAu96TiN/urPwmKznyb/Sk8x7p8iUzn6OvPS/TUlFUkURQtOh6M9uvbpN4 x/L//EWkMR0hY4SkBegoiXfb1GS0PjBdWTWUiROm5X9nVHqp5KRZAxWXhjFiS1BO BIYOT+JfCl7mQiPs90Mys/cEtYOggMBsCZQVIGw/iYoJLFREqxFSONwa0dG+tGMX jn9WNu4yCVDhJ/jtl2MaTsCNtYUaBUgYrKHJBfNGfJ2Lz/7rH9yFui2WSMlmOd/U QGeCb1DWURlShlCqY37wNinbFsxWkI5JN00ukTtwFAXLIaqc+zgHcIjrDjTJwK43 F4tKbJT3+bmehMU/h3Uo3c7DhXl7n9zDGiDtbCxnkykp0sFGJpjhDrWydo51c+YB qW5HaNrII2LiDicOVN8L29ylvKp7AEkClxgivEhZVGGk2f/szJRXfp9u3WBn5kAx 3paH55YN0DEsKbYbb1ZENEI1Vnc/4ff4PxZJCUNKwzcS8wCn1awqwcriK9TjS/cp fjilNFT4J3/uFrodHWTkx0jJT6UJFT0aF03qPLUH/J5kG+EVukOf1jBPInNdf1si 1j47SpblHUe86HiHphFMt32KZ210lJzWxh8uGma57Y2sB9makdLiK4etrFjkiMJJ Z8A3kGp3KpFjbuK4tHY25rp+5oxLNNOBNpay29lQrWtCL/NDcaQ= =9OsH -----END PGP SIGNATURE----- Merge tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from bpf and netfilter. A lot of networking people were at a conference last week, busy catching COVID, so relatively short PR. Current release - regressions: - tcp: process the 3rd ACK with sk_socket for TFO and MPTCP Current release - new code bugs: - l2tp: protect session IDR and tunnel session list with one lock, make sure the state is coherent to avoid a warning - eth: bnxt_en: update xdp_rxq_info in queue restart logic - eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field Previous releases - regressions: - xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len, the field reuses previously un-validated pad Previous releases - always broken: - tap/tun: drop short frames to prevent crashes later in the stack - eth: ice: add a per-VF limit on number of FDIR filters - af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash" * tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits) tun: add missing verification for short frame tap: add missing verification for short frame mISDN: Fix a use after free in hfcmulti_tx() gve: Fix an edge case for TSO skb validity check bnxt_en: update xdp_rxq_info in queue restart logic tcp: process the 3rd ACK with sk_socket for TFO/MPTCP selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len bpf: Fix a segment issue when downgrading gso_size net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling MAINTAINERS: make Breno the netconsole maintainer MAINTAINERS: Update bonding entry net: nexthop: Initialize all fields in dumped nexthops net: stmmac: Correct byte order of perfect_match selftests: forwarding: skip if kernel not support setting bridge fdb learning limit tipc: Return non-zero value from tipc_udp_addr2str() on error netfilter: nft_set_pipapo_avx2: disable softinterrupts ice: Fix recipe read procedure ice: Add a per-VF limit on number of FDIR filters net: bonding: correctly annotate RCU in bond_should_notify_peers() ...	2024-07-25 13:32:25 -07:00
Linus Torvalds	c2a96b7f18	Driver core changes for 6.11-rc1 Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZqH+aQ8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ymoOQCfVBdLcBjEDAGh3L8qHRGMPy4rV2EAoL/r+zKm cJEYtJpGtWX6aAtugm9E =ZyJV -----END PGP SIGNATURE----- Merge tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updates from Greg KH: "Here is the big set of driver core changes for 6.11-rc1. Lots of stuff in here, with not a huge diffstat, but apis are evolving which required lots of files to be touched. Highlights of the changes in here are: - platform remove callback api final fixups (Uwe took many releases to get here, finally!) - Rust bindings for basic firmware apis and initial driver-core interactions. It's not all that useful for a "write a whole driver in rust" type of thing, but the firmware bindings do help out the phy rust drivers, and the driver core bindings give a solid base on which others can start their work. There is still a long way to go here before we have a multitude of rust drivers being added, but it's a great first step. - driver core const api changes. This reached across all bus types, and there are some fix-ups for some not-common bus types that linux-next and 0-day testing shook out. This work is being done to help make the rust bindings more safe, as well as the C code, moving toward the end-goal of allowing us to put driver structures into read-only memory. We aren't there yet, but are getting closer. - minor devres cleanups and fixes found by code inspection - arch_topology minor changes - other minor driver core cleanups All of these have been in linux-next for a very long time with no reported problems" * tag 'driver-core-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (55 commits) ARM: sa1100: make match function take a const pointer sysfs/cpu: Make crash_hotplug attribute world-readable dio: Have dio_bus_match() callback take a const * zorro: make match function take a const pointer driver core: module: make module_[add\|remove]_driver take a const * driver core: make driver_find_device() take a const * driver core: make driver_[create\|remove]_file take a const * firmware_loader: fix soundness issue in `request_internal` firmware_loader: annotate doctests as `no_run` devres: Correct code style for functions that return a pointer type devres: Initialize an uninitialized struct member devres: Fix memory leakage caused by driver API devm_free_percpu() devres: Fix devm_krealloc() wasting memory driver core: platform: Switch to use kmemdup_array() driver core: have match() callback in struct bus_type take a const * MAINTAINERS: add Rust device abstractions to DRIVER CORE device: rust: improve safety comments MAINTAINERS: add Danilo as FIRMWARE LOADER maintainer MAINTAINERS: add Rust FW abstractions to FIRMWARE LOADER firmware: rust: improve safety comments ...	2024-07-25 10:42:22 -07:00
Taehee Yoo	b537633ce5	bnxt_en: update xdp_rxq_info in queue restart logic When the netdev_rx_queue_restart() restarts queues, the bnxt_en driver updates(creates and deletes) a page_pool. But it doesn't update xdp_rxq_info, so the xdp_rxq_info is still connected to an old page_pool. So, bnxt_rx_ring_info->page_pool indicates a new page_pool, but bnxt_rx_ring_info->xdp_rxq is still connected to an old page_pool. An old page_pool is no longer used so it is supposed to be deleted by page_pool_destroy() but it isn't. Because the xdp_rxq_info is holding the reference count for it and the xdp_rxq_info is not updated, an old page_pool will not be deleted in the queue restart logic. Before restarting 1 queue: ./tools/net/ynl/samples/page-pool enp10s0f1np1[6] page pools: 4 (zombies: 0) refs: 8192 bytes: 33554432 (refs: 0 bytes: 0) recycling: 0.0% (alloc: 128:8048 recycle: 0:0) After restarting 1 queue: ./tools/net/ynl/samples/page-pool enp10s0f1np1[6] page pools: 5 (zombies: 0) refs: 10240 bytes: 41943040 (refs: 0 bytes: 0) recycling: 20.0% (alloc: 160:10080 recycle: 1920:128) Before restarting queues, an interface has 4 page_pools. After restarting one queue, an interface has 5 page_pools, but it should be 4, not 5. The reason is that queue restarting logic creates a new page_pool and an old page_pool is not deleted due to the absence of an update of xdp_rxq_info logic. Fixes: `2d694c27d3` ("bnxt_en: implement netdev_queue_mgmt_ops") Signed-off-by: Taehee Yoo <ap420073@gmail.com> Reviewed-by: David Wei <dw@davidwei.uk> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://patch.msgid.link/20240721053554.1233549-1-ap420073@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-25 07:42:48 -07:00
Jakub Kicinski	30b3560050	Merge branch 'net-make-timestamping-selectable' First part of "net: Make timestamping selectable" from Kory Maincent. Change the driver-facing type already to lower rebasing pain. Link: https://lore.kernel.org/20240709-feature_ptp_netnext-v17-0-b5317f50df2a@bootlin.com/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-15 08:02:30 -07:00
Kory Maincent	2111375b85	net: Add struct kernel_ethtool_ts_info In prevision to add new UAPI for hwtstamp we will be limited to the struct ethtool_ts_info that is currently passed in fixed binary format through the ETHTOOL_GET_TS_INFO ethtool ioctl. It would be good if new kernel code already started operating on an extensible kernel variant of that structure, similar in concept to struct kernel_hwtstamp_config vs struct hwtstamp_config. Since struct ethtool_ts_info is in include/uapi/linux/ethtool.h, here we introduce the kernel-only structure in include/linux/ethtool.h. The manual copy is then made in the function called by ETHTOOL_GET_TS_INFO. Acked-by: Shannon Nelson <shannon.nelson@amd.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com> Link: https://patch.msgid.link/20240709-feature_ptp_netnext-v17-6-b5317f50df2a@bootlin.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-15 08:02:26 -07:00
Jakub Kicinski	46e457a454	eth: bnxt: use the indir table from ethtool context Instead of allocating a separate indir table in the vnic use the one already present in the RSS context allocated by the core. This saves some LoC and also we won't have to worry about syncing the local version back to the core, once core learns how to dump contexts. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-12-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:23 -07:00
Jakub Kicinski	73afb518af	eth: bnxt: bump the entry size in indir tables to u32 Ethtool core stores indirection table with u32 entries, "just to be safe". Switch the type in the driver, so that it's easier to swap local tables for the core ones. Memory allocations already use sizeof(*entry), switch the memset()s as well. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-11-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:23 -07:00
Jakub Kicinski	9c34c6c28c	eth: bnxt: pad out the correct indirection table bnxt allocates tables of max size, and changes the used size based on number of active rings. The unused entries get padded out with zeros. bnxt_modify_rss() seems to always pad out the table of the main / default RSS context, instead of the table of the modified context. I haven't observed any behavior change due to this patch, so I don't think it's a fix. Not entirely sure what role the padding plays, 0 is a valid queue ID. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:23 -07:00
Jakub Kicinski	20c8ad72eb	eth: bnxt: use the RSS context XArray instead of the local list Core already maintains all RSS contexts in an XArray, no need to keep a second list in the driver. Remove bnxt_get_max_rss_ctx_ring() completely since core performs the same check already. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-9-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	63d4769cf7	eth: bnxt: use context priv for struct bnxt_rss_ctx Core can allocate space for per-context driver-private data, use it for struct bnxt_rss_ctx. Inline bnxt_alloc_rss_ctx() at this point, most of the init (as in the actions bnxt_del_one_rss_ctx() will undo) is open coded in bnxt_create_rxfh_context(), anyway. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	bf30162915	eth: bnxt: depend on core cleaning up RSS contexts New RSS context API removes old contexts on netdev unregister. No need to wipe them manually. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	1a49a23c03	eth: bnxt: remove rss_ctx_bmap Core will allocate IDs for the driver, from the range [1, BNXT_MAX_ETH_RSS_CTX], no need to track the allocations. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	5c466b4d4e	eth: bnxt: move from .set_rxfh to .create_rxfh_context and friends Use the new ethtool ops for RSS context management. The conversion is pretty straightforward cut / paste of the right chunks of the combined handler. Main change is that we let the core pick the IDs (bitmap will be removed separately for ease of review), so we need to tell the core when we lose a context. Since the new API passes rxfh as const, change bnxt_modify_rss() to also take const. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	667ac333db	eth: bnxt: allow deleting RSS contexts when the device is down Contexts get deleted from FW when the device is down, but they are kept in SW and re-added back on open. bnxt_set_rxfh_context() apparently does not want to deal with complexity of dealing with both the device down and device up cases. This is perhaps acceptable for creating new contexts, but not being able to delete contexts makes core-driven cleanups messy. Specifically with the new RSS API core will try to delete contexts automatically after bringing the device down. Support the delete-while-down case. Skip the FW logic and delete just the driver state. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240711220713.283778-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-12 22:16:22 -07:00
Jakub Kicinski	7c8267275d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: net/sched/act_ct.c `26488172b0` ("net/sched: Fix UAF when resolving a clash") `3abbd7ed8b` ("act_ct: prepare for stolen verdict coming from conntrack and nat engine") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-11 12:58:13 -07:00
Jakub Kicinski	0d1b7d6c92	bnxt: fix crashes when reducing ring count with active RSS contexts bnxt doesn't check if a ring is used by RSS contexts when reducing ring count. Core performs a similar check for the drivers for the main context, but core doesn't know about additional contexts, so it can't validate them. bnxt_fill_hw_rss_tbl_p5() uses ring id to index bp->rx_ring[], which without the check may end up being out of bounds. BUG: KASAN: slab-out-of-bounds in __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 Read of size 2 at addr ffff8881c5809618 by task ethtool/31525 Call Trace: __bnxt_hwrm_vnic_set_rss+0xb79/0xe40 bnxt_hwrm_vnic_rss_cfg_p5+0xf7/0x460 __bnxt_setup_vnic_p5+0x12e/0x270 __bnxt_open_nic+0x2262/0x2f30 bnxt_open_nic+0x5d/0xf0 ethnl_set_channels+0x5d4/0xb30 ethnl_default_set_doit+0x2f1/0x620 Core does track the additional contexts in net-next, so we can move this validation out of the driver as a follow up there. Fixes: `b3d0083caf` ("bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh()") Signed-off-by: Jakub Kicinski <kuba@kernel.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://patch.msgid.link/20240705020005.681746-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-07-09 11:35:49 +02:00
Dan Carpenter	0c754d9d86	net: bcmasp: Fix error code in probe() Return an error code if bcmasp_interface_create() fails. Don't return success. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Michal Kubiak <michal.kubiak@intel.com> Reviewed-by: Justin Chen <justin.chen@broadcom.com> Link: https://patch.msgid.link/ZoWKBkHH9D1fqV4r@stanley.mountain Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-05 17:13:57 -07:00
Jakub Kicinski	76ed626479	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/phy/aquantia/aquantia.h `219343755e` ("net: phy: aquantia: add missing include guards") `61578f6793` ("net: phy: aquantia: add support for PHY LEDs") drivers/net/ethernet/wangxun/libwx/wx_hw.c `bd07a98178` ("net: txgbe: remove separate irq request for MSI and INTx") `b501d261a5` ("net: txgbe: add FDIR ATR support") https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/ include/linux/mlx5/mlx5_ifc.h `048a403648` ("net/mlx5: IFC updates for changing max EQs") `99be56171f` ("net/mlx5e: SHAMPO, Re-enable HW-GRO") https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/ Adjacent changes: drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c `4130c67cd1` ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference") `3f3126515f` ("wifi: iwlwifi: mvm: add mvm-specific guard") include/net/mac80211.h `816c6bec09` ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP") `5a009b42e0` ("wifi: mac80211: track changes in AP's TPE") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-04 14:16:11 -07:00
Pavan Chebbi	5d350dc342	bnxt_en: Fix the resource check condition for RSS contexts While creating a new RSS context, bnxt_rfs_capable() currently makes a strict check to see if the required VNICs are already available. If the current VNICs are not what is required, either too many or not enough, it will call the firmware to reserve the exact number required. There is a bug in the firmware when the driver tries to relinquish some reserved VNICs and RSS contexts. It will cause the default VNIC to lose its RSS configuration and cause receive packets to be placed incorrectly. Workaround this problem by skipping the resource reduction. The driver will not reduce the VNIC and RSS context reservations when a context is deleted. The resources will be available for use when new contexts are created later. Potentially, this workaround can cause us to run out of VNIC and RSS contexts if there are a lot of VF functions creating and deleting RSS contexts. In the future, we will conditionally disable this workaround when the firmware fix is available. Fixes: `438ba39b25` ("bnxt_en: Improve RSS context reservation infrastructure") Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20240625010210.2002310-1-kuba@kernel.org/ Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20240703180112.78590-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-07-04 07:40:27 -07:00
David Wei	40eca00ae6	bnxt_en: unlink page pool when stopping Rx queue Have bnxt call page_pool_disable_direct_recycling() to unlink the old page pool when resetting a queue prior to destroying it, instead of touching a netdev core struct directly. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-07-02 15:00:11 +02:00
Pavan Chebbi	0603383907	bnxt_en: Remove atomic operations on ptp->tx_avail Now that we require the spinlock to protect ptp->txts_prod, change ptp->tx_avail to non-atomic and protect it under the same spinlock. Add a new helper function bnxt_ptp_get_txts_prod() to decrement ptp->tx_avail under spinlock and return the producer. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:22 +01:00
Pavan Chebbi	8aa2a79e9b	bnxt_en: Increase the max total outstanding PTP TX packets to 4 Start accepting up to 4 TX TS requests on BCM5750X (P5) chips. These PTP TX packets will be queued in the ptp->txts_req[] array waiting for the TX timestamp to complete. The entries in the array will be managed by a producer and consumer index. The producer index is updated under spinlock since multiple TX rings can try to send PTP packets at the same time. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:21 +01:00
Pavan Chebbi	9bf688d40d	bnxt_en: Let bnxt_stamp_tx_skb() return error code Change the function bnxt_stamp_tx_skb() to return 0 for suceess or -EAGAIN if the timestamp is still pending in firmware. The calling PTP aux worker will reschedule based on the return code. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:21 +01:00
Pavan Chebbi	573f2a4bfc	bnxt_en: Remove an impossible condition check for PTP TX pending SKB In the current 5750X PTP code paths, there is always at most one TX SKB requested for timestamp and we won't accept another one until we have retrieved the timestamp or it has timed out. Remove the unnecessary check in bnxt_get_tx_ts_p5() for a pending SKB and change the function to void. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:21 +01:00
Pavan Chebbi	92595a0c02	bnxt_en: Refactor all PTP TX timestamp fields into a struct On the older 5750X (P5) chips, we currently support only 1 TX PTP packet in-flight waiting for the timestamp. Refactor the datastructures to prepare to support up to 4 TX PTP packets. Combine all fields required for PTP TX timestamp query into one structure. An array of this structure will be added in follow-on patches to support multiple outstanding TX timestamps. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:21 +01:00
Pavan Chebbi	4d588d32b0	bnxt_en: Add BCM5760X specific PHC registers mapping BCM5760X firmware will advertise direct 64-bit PHC registers access for the driver from BAR0. Make the necessary changes in handling HWRM_PORT_MAC_PTP_QCFG's response and PHC register mapping for 5760X chips. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:20 +01:00
Michael Chan	1d294b4f90	bnxt_en: Add TX timestamp completion logic The new BCM5760X chips will return the timestamp of TX packets in a new completion. Add logic in __bnxt_poll_work() to handle this completion type to retrieve the timestamp. This feature eliminates the limit on the number of in-flight PTP TX packets. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:20 +01:00
Michael Chan	ba0155f1e9	bnxt_en: Allow some TX packets to be unprocessed in NAPI The driver's current logic will always free all the TX SKBs up to txr->tx_hw_cons within NAPI. In the next patches, we'll be adding logic to handle TX timestamp completion and we may need to hold some remaining TX SKBs if we don't have the timestamp completions yet. Modify __bnxt_poll_work_done() to clear each event bit separately to allow bnapi->tx_int() to decide whether to clear BNXT_TX_CMP_EVENT or not. bnapi->tx_int() will not clear BNXT_TX_CMP_EVENT if some TX SKBs are held waiting for TX timestamps. Note that legacy chips will never hold any SKBs this way. The SKB is always deferred to the PTP worker slow path to retrieve the timestamp from firmware. On the new P7 chips, the timestamp is returned by the hardware directly and we can retrieve it directly from NAPI. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:20 +01:00
Michael Chan	449da97512	bnxt_en: Add is_ts_pkt field to struct bnxt_sw_tx_bd Remove the unused is_gso field and add the is_ts_pkt field to struct bnxt_sw_tx_bd. This field will mark the TX BD that has requested HW TX timestamp. The field needs to be cleared if the timestamp packet is later aborted. This field will be useful when processing the new TX timestamp completion from the hardware in the next patches. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:20 +01:00
Michael Chan	be6b7ca3c2	bnxt_en: Add new TX timestamp completion definitions The new BCM5760X chips will generate this new TX timestamp completion when a TX packet's timestamp has been taken right before transmission. The driver logic to retrieve the timestamp will be added in the next few patches. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-07-01 11:23:20 +01:00
Ghadi Elie Rahme	134061163e	bnx2x: Fix multiple UBSAN array-index-out-of-bounds Fix UBSAN warnings that occur when using a system with 32 physical cpu cores or more, or when the user defines a number of Ethernet queues greater than or equal to FP_SB_MAX_E1x using the num_queues module parameter. Currently there is a read/write out of bounds that occurs on the array "struct stats_query_entry query" present inside the "bnx2x_fw_stats_req" struct in "drivers/net/ethernet/broadcom/bnx2x/bnx2x.h". Looking at the definition of the "struct stats_query_entry query" array: struct stats_query_entry query[FP_SB_MAX_E1x+ BNX2X_FIRST_QUEUE_QUERY_IDX]; FP_SB_MAX_E1x is defined as the maximum number of fast path interrupts and has a value of 16, while BNX2X_FIRST_QUEUE_QUERY_IDX has a value of 3 meaning the array has a total size of 19. Since accesses to "struct stats_query_entry query" are offset-ted by BNX2X_FIRST_QUEUE_QUERY_IDX, that means that the total number of Ethernet queues should not exceed FP_SB_MAX_E1x (16). However one of these queues is reserved for FCOE and thus the number of Ethernet queues should be set to [FP_SB_MAX_E1x -1] (15) if FCOE is enabled or [FP_SB_MAX_E1x] (16) if it is not. This is also described in a comment in the source code in drivers/net/ethernet/broadcom/bnx2x/bnx2x.h just above the Macro definition of FP_SB_MAX_E1x. Below is the part of this explanation that it important for this patch /* * The total number of L2 queues, MSIX vectors and HW contexts (CIDs) is * control by the number of fast-path status blocks supported by the * device (HW/FW). Each fast-path status block (FP-SB) aka non-default * status block represents an independent interrupts context that can * serve a regular L2 networking queue. However special L2 queues such * as the FCoE queue do not require a FP-SB and other components like * the CNIC may consume FP-SB reducing the number of possible L2 queues * * If the maximum number of FP-SB available is X then: * a. If CNIC is supported it consumes 1 FP-SB thus the max number of * regular L2 queues is Y=X-1 * b. In MF mode the actual number of L2 queues is Y= (X-1/MF_factor) * c. If the FCoE L2 queue is supported the actual number of L2 queues * is Y+1 * d. The number of irqs (MSIX vectors) is either Y+1 (one extra for * slow-path interrupts) or Y+2 if CNIC is supported (one additional * FP interrupt context for the CNIC). * e. The number of HW context (CID count) is always X or X+1 if FCoE * L2 queue is supported. The cid for the FCoE L2 queue is always X. */ However this driver also supports NICs that use the E2 controller which can handle more queues due to having more FP-SB represented by FP_SB_MAX_E2. Looking at the commits when the E2 support was added, it was originally using the E1x parameters: commit `f2e0899f0f` ("bnx2x: Add 57712 support"). Back then FP_SB_MAX_E2 was set to 16 the same as E1x. However the driver was later updated to take full advantage of the E2 instead of having it be limited to the capabilities of the E1x. But as far as we can tell, the array "stats_query_entry query" was still limited to using the FP-SB available to the E1x cards as part of an oversignt when the driver was updated to take full advantage of the E2, and now with the driver being aware of the greater queue size supported by E2 NICs, it causes the UBSAN warnings seen in the stack traces below. This patch increases the size of the "stats_query_entry query" array by replacing FP_SB_MAX_E1x with FP_SB_MAX_E2 to be large enough to handle both types of NICs. Stack traces: UBSAN: array-index-out-of-bounds in drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c:1529:11 index 20 is out of range for type 'stats_query_entry [19]' CPU: 12 PID: 858 Comm: systemd-network Not tainted 6.9.0-060900rc7-generic #202405052133 Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/21/2019 Call Trace: <TASK> dump_stack_lvl+0x76/0xa0 dump_stack+0x10/0x20 __ubsan_handle_out_of_bounds+0xcb/0x110 bnx2x_prep_fw_stats_req+0x2e1/0x310 [bnx2x] bnx2x_stats_init+0x156/0x320 [bnx2x] bnx2x_post_irq_nic_init+0x81/0x1a0 [bnx2x] bnx2x_nic_load+0x8e8/0x19e0 [bnx2x] bnx2x_open+0x16b/0x290 [bnx2x] __dev_open+0x10e/0x1d0 RIP: 0033:0x736223927a0a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffc0bb2ada8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000583df50f9c78 RCX: 0000736223927a0a RDX: 0000000000000020 RSI: 0000583df50ee510 RDI: 0000000000000003 RBP: 0000583df50d4940 R08: 00007ffc0bb2adb0 R09: 0000000000000080 R10: 0000000000000000 R11: 0000000000000246 R12: 0000583df5103ae0 R13: 000000000000035a R14: 0000583df50f9c30 R15: 0000583ddddddf00 </TASK> ---[ end trace ]--- ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in drivers/net/ethernet/broadcom/bnx2x/bnx2x_stats.c:1546:11 index 28 is out of range for type 'stats_query_entry [19]' CPU: 12 PID: 858 Comm: systemd-network Not tainted 6.9.0-060900rc7-generic #202405052133 Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/21/2019 Call Trace: <TASK> dump_stack_lvl+0x76/0xa0 dump_stack+0x10/0x20 __ubsan_handle_out_of_bounds+0xcb/0x110 bnx2x_prep_fw_stats_req+0x2fd/0x310 [bnx2x] bnx2x_stats_init+0x156/0x320 [bnx2x] bnx2x_post_irq_nic_init+0x81/0x1a0 [bnx2x] bnx2x_nic_load+0x8e8/0x19e0 [bnx2x] bnx2x_open+0x16b/0x290 [bnx2x] __dev_open+0x10e/0x1d0 RIP: 0033:0x736223927a0a Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89 RSP: 002b:00007ffc0bb2ada8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 0000583df50f9c78 RCX: 0000736223927a0a RDX: 0000000000000020 RSI: 0000583df50ee510 RDI: 0000000000000003 RBP: 0000583df50d4940 R08: 00007ffc0bb2adb0 R09: 0000000000000080 R10: 0000000000000000 R11: 0000000000000246 R12: 0000583df5103ae0 R13: 000000000000035a R14: 0000583df50f9c30 R15: 0000583ddddddf00 </TASK> ---[ end trace ]--- ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in drivers/net/ethernet/broadcom/bnx2x/bnx2x_sriov.c:1895:8 index 29 is out of range for type 'stats_query_entry [19]' CPU: 13 PID: 163 Comm: kworker/u96:1 Not tainted 6.9.0-060900rc7-generic #202405052133 Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 10/21/2019 Workqueue: bnx2x bnx2x_sp_task [bnx2x] Call Trace: <TASK> dump_stack_lvl+0x76/0xa0 dump_stack+0x10/0x20 __ubsan_handle_out_of_bounds+0xcb/0x110 bnx2x_iov_adjust_stats_req+0x3c4/0x3d0 [bnx2x] bnx2x_storm_stats_post.part.0+0x4a/0x330 [bnx2x] ? bnx2x_hw_stats_post+0x231/0x250 [bnx2x] bnx2x_stats_start+0x44/0x70 [bnx2x] bnx2x_stats_handle+0x149/0x350 [bnx2x] bnx2x_attn_int_asserted+0x998/0x9b0 [bnx2x] bnx2x_sp_task+0x491/0x5c0 [bnx2x] process_one_work+0x18d/0x3f0 </TASK> ---[ end trace ]--- Fixes: `50f0a562f8` ("bnx2x: add fcoe statistics") Signed-off-by: Ghadi Elie Rahme <ghadi.rahme@canonical.com> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20240627111405.1037812-1-ghadi.rahme@canonical.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-28 18:19:05 -07:00
David Wei	2d694c27d3	bnxt_en: implement netdev_queue_mgmt_ops Implement netdev_queue_mgmt_ops for bnxt added in [1]. Two bnxt_rx_ring_info structs are allocated to hold the new/old queue memory. Queue memory is copied from/to the main bp->rx_ring[idx] bnxt_rx_ring_info. Queue memory is pre-allocated in bnxt_queue_mem_alloc() into a clone, and then copied into bp->rx_ring[idx] in bnxt_queue_mem_start(). Similarly, when bp->rx_ring[idx] is stopped its queue memory is copied into a clone, and then freed later in bnxt_queue_mem_free(). I tested this patchset with netdev_rx_queue_restart(), including inducing errors in all places that returns an error code. In all cases, the queue is left in a good working state. Rx queues are created/destroyed using bnxt_hwrm_rx_ring_alloc() and bnxt_hwrm_rx_ring_free(), which issue HWRM_RING_ALLOC and HWRM_RING_FREE commands respectively to the firmware. By the time a HWRM_RING_FREE response is received, there won't be any more completions from that queue. Thanks to Somnath for helping me with this patch. With their permission I've added them as Acked-by. [1]: https://lore.kernel.org/netdev/20240501232549.1327174-2-shailend@google.com/ Acked-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-21 10:10:33 +01:00
David Wei	88f56254a2	bnxt_en: split rx ring helpers out from ring helpers To prepare for queue API implementation, split rx ring functions out from ring helpers. These new helpers will be called from queue API implementation. Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-06-21 10:10:33 +01:00
Jakub Kicinski	a6ec08beec	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c `1e7962114c` ("bnxt_en: Restore PTP tx_avail count in case of skb_pad() error") `165f87691a` ("bnxt_en: add timestamping statistics support") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-20 13:49:59 -07:00
Pavan Chebbi	1e7962114c	bnxt_en: Restore PTP tx_avail count in case of skb_pad() error The current code only restores PTP tx_avail count when we get DMA mapping errors. Fix it so that the PTP tx_avail count will be restored for both DMA mapping errors and skb_pad() errors. Otherwise PTP TX timestamp will not be available after a PTP packet hits the skb_pad() error. Fixes: `83bb623c96` ("bnxt_en: Transmit and retrieve packet timestamps") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240618215313.29631-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-20 06:46:15 -07:00
Michael Chan	b7bfcb4c7c	bnxt_en: Set TSO max segs on devices with limits Firmware will now advertise a non-zero TSO max segments if the device has a limit. 0 means no limit. The latest 5760X chip (early revs) has a limit of 2047 that cannot be exceeded. If exceeded, the chip will send out just a small number of segments. Call netif_set_tso_max_segs() if the device has a limit. Fixes: `2012a6abc8` ("bnxt_en: Add 5760X (P7) PCI IDs") Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240618215313.29631-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-20 06:46:08 -07:00
Michael Chan	8ad0440992	bnxt_en: Update firmware interface to 1.10.3.44 The relevant change is the max_tso_segs value returned by firmware in the HWRM_FUNC_QCAPS response. This value will be used in the next patch to cap the TSO segments. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240618215313.29631-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-20 06:46:08 -07:00
Greg Kroah-Hartman	b5dd424181	Linux 6.10-rc4 -----BEGIN PGP SIGNATURE----- iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmZvTbAeHHRvcnZhbGRz QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGVksIAJEn4a9IVM8FNCJy Dxo0BItD1/qJ5mLDptqUFRKlxInjbojofz5CyoeIeXb0DwRfB16ALXqNXAkd3APi saoOpfjFsg2H2OqL9CHdkzWcJEAq2lDnL0zaOjumeDVu/EyeT+tC4e4hq1e6Bm0E fPC5ms2b+07DF9Rg6/DW8yPbdM5n6Mz1bRd3fQOIgvpM3yGOyGztEBgTRub/ZUgH 5pNJauknFAZgdiWhgNpc+lPWYZbgHKULQPhUBPdVhDIXPtQNUlKgNTQc6+L0Nmbb K1sG1q7FLeMJOTFGQfD4r26X5DNQUi894q/9SX8X7rcrECdJKcw2WjVyB4myADpf ae2gP+A= =XjWP -----END PGP SIGNATURE----- Merge tag 'v6.10-rc4' into driver-core-next We need the driver core and sysfs fixes in here to build on top of. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-17 08:33:41 +02:00
Jakub Kicinski	4c7d3d79c7	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts, no adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 13:13:46 -07:00
Aleksandr Mishin	a9b9741854	bnxt_en: Adjust logging of firmware messages in case of released token in __hwrm_send() In case of token is released due to token->state == BNXT_HWRM_DEFERRED, released token (set to NULL) is used in log messages. This issue is expected to be prevented by HWRM_ERR_CODE_PF_UNAVAILABLE error code. But this error code is returned by recent firmware. So some firmware may not return it. This may lead to NULL pointer dereference. Adjust this issue by adding token pointer check. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: `8fa4219dba` ("bnxt_en: add dynamic debug support for HWRM messages") Suggested-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Aleksandr Mishin <amishin@t-argos.ru> Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240611082547.12178-1-amishin@t-argos.ru Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 08:05:46 -07:00
Michael Chan	7d9df38c9c	bnxt_en: Cap the size of HWRM_PORT_PHY_QCFG forwarded response Firmware interface 1.10.2.118 has increased the size of HWRM_PORT_PHY_QCFG response beyond the maximum size that can be forwarded. When the VF's link state is not the default auto state, the PF will need to forward the response back to the VF to indicate the forced state. This regression may cause the VF to fail to initialize. Fix it by capping the HWRM_PORT_PHY_QCFG response to the maximum 96 bytes. The SPEEDS2_SUPPORTED flag needs to be cleared because the new speeds2 fields are beyond the legacy structure. Also modify bnxt_hwrm_fwd_resp() to print a warning if the message size exceeds 96 bytes to make this failure more obvious. Fixes: `84a911db83` ("bnxt_en: Update firmware interface to 1.10.2.118") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240612231736.57823-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-13 07:50:16 -07:00
Greg Kroah-Hartman	ff985c7597	auxbus: make to_auxiliary_drv accept and return a constant pointer In the quest to make struct device constant, start by making to_auxiliary_drv() return a constant pointer so that drivers that call this can be fixed up before the driver core changes. As the return type previously was not constant, also fix up all callers that were assuming that the pointer was not going to be a constant one in order to not break the build. Cc: Dave Ertman <david.m.ertman@intel.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Rafael J. Wysocki <rafael@kernel.org> Cc: Bingbu Cao <bingbu.cao@intel.com> Cc: Tianshu Qiu <tian.shu.qiu@intel.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Cc: Michael Chan <michael.chan@broadcom.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Jakub Kicinski <kuba@kernel.org> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Jesse Brandeburg <jesse.brandeburg@intel.com> Cc: Tony Nguyen <anthony.l.nguyen@intel.com> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Leon Romanovsky <leon@kernel.org> Cc: Tariq Toukan <tariqt@nvidia.com> Cc: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> Cc: Liam Girdwood <lgirdwood@gmail.com> Cc: Peter Ujfalusi <peter.ujfalusi@linux.intel.com> Cc: Bard Liao <yung-chuan.liao@linux.intel.com> Cc: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> Cc: Daniel Baluta <daniel.baluta@nxp.com> Cc: Kai Vehmanen <kai.vehmanen@linux.intel.com> Cc: Jaroslav Kysela <perex@perex.cz> Cc: Takashi Iwai <tiwai@suse.com> Cc: Richard Cochran <richardcochran@gmail.com> Cc: linux-media@vger.kernel.org Cc: netdev@vger.kernel.org Cc: intel-wired-lan@lists.osuosl.org Cc: linux-rdma@vger.kernel.org Cc: sound-open-firmware@alsa-project.org Cc: linux-sound@vger.kernel.org Acked-by: Sakari Ailus <sakari.ailus@linux.intel.com> # drivers/media/pci/intel/ipu6 Acked-by: Mark Brown <broonie@kernel.org> Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com> Link: https://lore.kernel.org/r/20240611130103.3262749-7-gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-06-13 16:43:26 +02:00
Vadim Fedorenko	c790275b5e	bnxt_en: fix atomic counter for ptp packets atomic_dec_if_positive returns new value regardless if it is updated or not. The commit in fixes changed the behavior of the condition to one that differs from original code. Restore original condition to properly maintain atomic counter. Fixes: `165f87691a` ("bnxt_en: add timestamping statistics support") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240604091939.785535-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-05 12:52:42 -07:00
Vadim Fedorenko	165f87691a	bnxt_en: add timestamping statistics support The ethtool_ts_stats structure was introduced earlier this year. Now it's time to support this group of counters in more drivers. This patch adds support to bnxt driver. Signed-off-by: Vadim Fedorenko <vadfed@meta.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240530204751.99636-1-vadfed@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-06-01 16:02:51 -07:00
Vadim Fedorenko	38155539a1	bnxt_en: silence clang build warning Clang build brings a warning: ../drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c:133:12: warning: comparison of distinct pointer types ('typeof (tmo_us) ' (aka 'unsigned int ') and 'typeof (65535) ' (aka 'int ')) [-Wcompare-distinct-pointer-types] 133 \| tmo_us = min(tmo_us, BNXT_PTP_QTS_MAX_TMO_US); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Fix it by specifying proper type for BNXT_PTP_QTS_MAX_TMO_US. Fixes: `7de3c2218e` ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()") Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240509151833.12579-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-10 18:16:35 -07:00
Eric Dumazet	1eb2cded45	net: annotate writes on dev->mtu from ndo_change_mtu() Simon reported that ndo_change_mtu() methods were never updated to use WRITE_ONCE(dev->mtu, new_mtu) as hinted in commit `501a90c945` ("inet: protect against too small mtu values.") We read dev->mtu without holding RTNL in many places, with READ_ONCE() annotations. It is time to take care of ndo_change_mtu() methods to use corresponding WRITE_ONCE() Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Simon Horman <horms@kernel.org> Closes: https://lore.kernel.org/netdev/20240505144608.GB67882@kernel.org/ Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Sabrina Dubroca <sd@queasysnail.net> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Shannon Nelson <shannon.nelson@amd.com> Link: https://lore.kernel.org/r/20240506102812.3025432-1-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-07 16:19:14 -07:00
David Wei	5bfadc5737	bnxt: fix bnxt_get_avail_msix() returning negative values Current net-next/main does not boot for older chipsets e.g. Stratus. Sample dmesg: [ 11.368315] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Able to reserve only 0 out of 9 requested RX rings [ 11.390181] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Unable to reserve tx rings [ 11.438780] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): 2nd rings reservation failed. [ 11.487559] bnxt_en 0000:02:00.0 (unnamed net_device) (uninitialized): Not enough rings available. [ 11.506012] bnxt_en 0000:02:00.0: probe with driver bnxt_en failed with error -12 This is caused by bnxt_get_avail_msix() returning a negative value for these chipsets not using the new resource manager i.e. !BNXT_NEW_RM. This in turn causes hwr.cp in __bnxt_reserve_rings() to be set to 0. In the current call stack, __bnxt_reserve_rings() is called from bnxt_set_dflt_rings() before bnxt_init_int_mode(). Therefore, bp->total_irqs is always 0 and for !BNXT_NEW_RM bnxt_get_avail_msix() always returns a negative number. Historically, MSIX vectors were requested by the RoCE driver during run-time and bnxt_get_avail_msix() was used for this purpose. Today, RoCE MSIX vectors are statically allocated. bnxt_get_avail_msix() should only be called for the BNXT_NEW_RM() case to reserve the MSIX ahead of time for RoCE use. bnxt_get_avail_msix() is also be simplified to handle the BNXT_NEW_RM() case only. Fixes: `d630624ebd` ("bnxt_en: Utilize ulp client resources if RoCE is not registered") Signed-off-by: David Wei <dw@davidwei.uk> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240502203757.3761827-1-dw@davidwei.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-03 16:04:04 -07:00
Jakub Kicinski	e958da0ddb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: include/linux/filter.h kernel/bpf/core.c `66e13b615a` ("bpf: verifier: prevent userspace memory access") `d503a04f8b` ("bpf: Add support for certain atomics in bpf_arena to x86 JIT") https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 12:06:25 -07:00
Ajit Khaparde	54d0b84f40	bnxt_en: Add VF PCI ID for 5760X (P7) chips No driver logic changes are required to support the VFs, so just add the VF PCI ID. Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com> Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:21 -07:00
Kalesh AP	3c163f35bd	bnxt_en: Optimize recovery path ULP locking in the driver In the error recovery path (AER, firmware recovery, etc), the driver notifies the RoCE driver via ULP_STOP before the reset and via ULP_START after the reset, all under RTNL_LOCK. The RoCE driver can take a long time if there are a lot of QPs to destroy, so it is not ideal to hold the global RTNL lock. Rely on the new en_dev_lock mutex instead for ULP_STOP and ULP_START. For the most part, we move the ULP_STOP call before we take the RTNL lock and move the ULP_START after RTNL unlock. Note that SRIOV re-enablement must be done after ULP_START or RoCE on the VFs will not resume properly after reset. The one scenario in bnxt_hwrm_if_change() where the RTNL lock is already taken in the .ndo_open() context requires the ULP restart to be deferred to the bnxt_sp_task() workqueue. Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:21 -07:00
Kalesh AP	de21ec442d	bnxt_en: Add a mutex to synchronize ULP operations The current scheme relies heavily on the RTNL lock for all ULP operations between the L2 and the RoCE driver. Add a new en_dev_lock mutex so that the asynchronous ULP_STOP and ULP_START operations can be serialized with bnxt_register_dev() and bnxt_unregister_dev() calls without relying on the RTNL lock. The next patch will remove the RTNL lock from the ULP_STOP and ULP_START calls. Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:20 -07:00
Michael Chan	f79d7a9f1c	bnxt_en: Don't call ULP_STOP/ULP_START during L2 reset There is no need to call ULP_STOP and ULP_START before and after the L2 reset in bnxt_reset_task(). This L2 reset is done after detecting TX timeout, RX ring errors, or VF config changes. The L2 reset does not affect RoCE since the firmware is not reset and the backing store is left alone. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:20 -07:00
Kalesh AP	895621f1c8	bnxt_en: Don't support offline self test when RoCE driver is loaded Offline self test is a very disruptive operation for RoCE and requires all active QPs to be destroyed. With a large number of QPs, it can take a long time to destroy all the QPs and can timeout. Do not allow ethtool offline self test if the RoCE driver is registered on the device. Reviewed-by: Selvin Thyparampil Xavier <selvin.xavier@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:20 -07:00
Edwin Peer	a75fbb3aa4	bnxt_en: share NQ ring sw_stats memory with subrings On P5_PLUS chips and later, the NQ rings have subrings for RX and TX completions respectively. These subrings are passed to the poll function instead of the base NQ, but each ring carries its own copy of the software ring statistics. For stats to be conveniently accessible in __bnxt_poll_work(), the statistics memory should either be shared between the NQ and its subrings or the subrings need to be included in the ethtool stats aggregation logic. This patch opts for the former, because it's more efficient and less confusing having the software statistics for a ring exist in a single place. Before this patch, the counter will not be displayed if the "wrong" cpr->sw_stats was used to increment a counter. Link: https://lore.kernel.org/netdev/CACKFLikEhVAJA+osD7UjQNotdGte+fth7zOy7yDdLkTyFk9Pyw@mail.gmail.com/ Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240501003056.100607-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-05-02 07:27:20 -07:00
Doug Berger	0d5e2a8223	net: bcmgenet: synchronize UMAC_CMD access The UMAC_CMD register is written from different execution contexts and has insufficient synchronization protections to prevent possible corruption. Of particular concern are the acceses from the phy_device delayed work context used by the adjust_link call and the BH context that may be used by the ndo_set_rx_mode call. A spinlock is added to the driver to protect contended register accesses (i.e. reg_lock) and it is used to synchronize accesses to UMAC_CMD. Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-29 06:24:22 +01:00
Doug Berger	2dbe5f1936	net: bcmgenet: synchronize use of bcmgenet_set_rx_mode() The ndo_set_rx_mode function is synchronized with the netif_addr_lock spinlock and BHs disabled. Since this function is also invoked directly from the driver the same synchronization should be applied. Fixes: `72f9634762` ("net: bcmgenet: set Rx mode before starting netif") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-29 06:24:22 +01:00
Doug Berger	d85cf67a33	net: bcmgenet: synchronize EXT_RGMII_OOB_CTRL access The EXT_RGMII_OOB_CTRL register can be written from different contexts. It is predominantly written from the adjust_link handler which is synchronized by the phydev->lock, but can also be written from a different context when configuring the mii in bcmgenet_mii_config(). The chances of contention are quite low, but it is conceivable that adjust_link could occur during resume when WoL is enabled so use the phydev->lock synchronizer in bcmgenet_mii_config() to be sure. Fixes: `afe3f907d2` ("net: bcmgenet: power on MII block for all MII modes") Cc: stable@vger.kernel.org Signed-off-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-29 06:24:21 +01:00
Jakub Kicinski	2bd87951de	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/ti/icssg/icssg_prueth.c net/mac80211/chan.c `89884459a0` ("wifi: mac80211: fix idle calculation with multi-link") `87f5500285` ("wifi: mac80211: simplify ieee80211_assign_link_chanctx()") https://lore.kernel.org/all/20240422105623.7b1fbda2@canb.auug.org.au/ net/unix/garbage.c `1971d13ffa` ("af_unix: Suppress false-positive lockdep splat for spin_lock() in __unix_gc().") `4090fa373f` ("af_unix: Replace garbage collection algorithm.") drivers/net/ethernet/ti/icssg/icssg_prueth.c drivers/net/ethernet/ti/icssg/icssg_common.c `4dcd0e83ea` ("net: ti: icssg-prueth: Fix signedness bug in prueth_init_rx_chns()") `e2dc7bfd67` ("net: ti: icssg-prueth: Move common functions into a separate file") No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-25 12:41:37 -07:00
Peter Münster	e3eb7dd47b	net: b44: set pause params only when interface is up b44_free_rings() accesses b44::rx_buffers (and ::tx_buffers) unconditionally, but b44::rx_buffers is only valid when the device is up (they get allocated in b44_open(), and deallocated again in b44_close()), any other time these are just a NULL pointers. So if you try to change the pause params while the network interface is disabled/administratively down, everything explodes (which likely netifd tries to do). Link: https://github.com/openwrt/openwrt/issues/13789 Fixes: `1da177e4c3` (Linux-2.6.12-rc2) Cc: stable@vger.kernel.org Reported-by: Peter Münster <pm@a16n.net> Suggested-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Vaclav Svoboda <svoboda@neng.cz> Tested-by: Peter Münster <pm@a16n.net> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Peter Münster <pm@a16n.net> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/87y192oolj.fsf@a16n.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-25 08:34:18 -07:00
Jakub Kicinski	7301177307	eth: bnxt: fix counting packets discarded due to OOM and netpoll I added OOM and netpoll discard counters, naively assuming that the cpr pointer is pointing to a common completion ring. Turns out that is usually a completion ring but not the completion ring which bnapi->cp_ring points to. bnapi->cp_ring is where the stats are read from, so we end up reporting 0 thru ethtool -S and qstat even though the drop events have happened. Make 100% sure we're recording statistics in the correct structure. Fixes: `907fd4a294` ("bnxt: count discards due to memory allocation errors") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240424002148.3937059-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-24 20:16:43 -07:00
Jakub Kicinski	21d9f921f8	Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue Tony Nguyen says: ==================== ice: Support 5 layer Tx scheduler topology Mateusz Polchlopek says: For performance reasons there is a need to have support for selectable Tx scheduler topology. Currently firmware supports only the default 9-layer and 5-layer topology. This patch series enables switch from default to 5-layer topology, if user decides to opt-in. * '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue: ice: Document tx_scheduling_layers parameter ice: Add tx_scheduling_layers devlink param ice: Enable switching default Tx scheduler topology ice: Adjust the VSI/Aggregator layers ice: Support 5 layer topology devlink: extend devlink_param *set pointer ==================== Link: https://lore.kernel.org/r/20240422203913.225151-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-24 20:05:31 -07:00
Asbjørn Sloth Tønnesen	3833e4834d	bnxt_en: flower: validate control flags This driver currently doesn't support any control flags. Use flow_rule_match_has_control_flags() to check for control flags, such as can be set through `tc flower ... ip_flags frag`. In case any control flags are masked, flow_rule_match_has_control_flags() sets a NL extended error message, and we return -EOPNOTSUPP. Only compile-tested. Signed-off-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Tested-by: Sriharsha Basavapatna <sriharsha.basavapatna@broadcom.com> Link: https://lore.kernel.org/r/20240422152626.175569-1-ast@fiberby.net Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-24 19:57:19 -07:00
Mateusz Polchlopek	5625ca5640	devlink: extend devlink_param set pointer Extend devlink_param set function pointer to take extack as a param. Sometimes it is needed to pass information to the end user from set function. It is more proper to use for that netlink instead of passing message to dmesg. Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Mateusz Polchlopek <mateusz.polchlopek@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>	2024-04-22 13:05:19 -07:00
Michael Chan	41e54045b7	bnxt_en: Fix error recovery for 5760X (P7) chips During error recovery, such as AER fatal error slot reset, we call bnxt_try_map_fw_health_reg() to try to get access to the health register to determine the firmware state. Fix bnxt_try_map_fw_health_reg() to recognize the P7 chip correctly and set up the health register. This fixes this type of AER slot reset failure: bnxt_en 0000:04:00.0: AER: PCIe Bus Error: severity=Uncorrectable (Fatal), type=Inaccessible, (Unregistered Agent ID) bnxt_en 0000:04:00.0 enp4s0f0np0: PCI I/O error detected bnxt_en 0000:04:00.0 bnxt_re0: Handle device suspend call bnxt_en 0000:04:00.1 enp4s0f1np1: PCI I/O error detected bnxt_en 0000:04:00.1 bnxt_re1: Handle device suspend call pcieport 0000:00:02.0: AER: Root Port link has been reset (0) bnxt_en 0000:04:00.0 enp4s0f0np0: PCI Slot Reset bnxt_en 0000:04:00.0: enabling device (0000 -> 0002) bnxt_en 0000:04:00.0: Firmware not ready bnxt_en 0000:04:00.1 enp4s0f1np1: PCI Slot Reset bnxt_en 0000:04:00.1: enabling device (0000 -> 0002) bnxt_en 0000:04:00.1: Firmware not ready pcieport 0000:00:02.0: AER: device recovery failed Fixes: `a432a45bdb` ("bnxt_en: Define basic P7 macros") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-22 14:13:18 +01:00
Vikas Gupta	a1acdc226b	bnxt_en: Fix the PCI-AER routines We do not support two simultaneous recoveries so check for reset flag, BNXT_STATE_IN_FW_RESET, and do not proceed with AER further. When the pci channel state is pci_channel_io_frozen, the PCIe link can not be trusted so we disable the traffic immediately and stop BAR access by calling bnxt_fw_fatal_close(). BAR access after AER fatal error can cause an NMI. Fixes: `f75d9a0aa9` ("bnxt_en: Re-write PCI BARs after PCI fatal error.") Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-22 14:13:18 +01:00
Vikas Gupta	7474b1c82b	bnxt_en: refactor reset close code Introduce bnxt_fw_fatal_close() API which can be used to stop data path and disable device when firmware is in fatal state. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-22 14:13:18 +01:00
Justin Chen	9f898fc2c3	net: bcmasp: fix memory leak when bringing down interface When bringing down the TX rings we flush the rings but forget to reclaimed the flushed packets. This leads to a memory leak since we do not free the dma mapped buffers. This also leads to tx control block corruption when bringing down the interface for power management. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240418180541.2271719-1-justin.chen@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-19 20:32:29 -07:00
Jakub Kicinski	94426ed213	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: net/unix/garbage.c `47d8ac011f` ("af_unix: Fix garbage collector racing against connect()") `4090fa373f` ("af_unix: Replace garbage collection algorithm.") Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt.c `faa12ca245` ("bnxt_en: Reset PTP tx_avail after possible firmware reset") `b3d0083caf` ("bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh()") drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c `7ac10c7d72` ("bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init()") `194fad5b27` ("bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions") drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c `958f56e483` ("net/mlx5e: Un-expose functions in en.h") `49e6c93870` ("net/mlx5e: RSS, Block XOR hash with over 128 channels") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-11 14:23:47 -07:00
Michael Chan	008ce0fd39	bnxt_en: Update MODULE_DESCRIPTION Update MODULE_DESCRIPTION to the more generic adapter family name. The old name only includes the first generation of supported adapters. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:06 -07:00
Vikas Gupta	d630624ebd	bnxt_en: Utilize ulp client resources if RoCE is not registered If the RoCE driver is not registered for a RoCE capable device, add flexibility to use the RoCE resources (MSIX/NQs) for L2 purposes, such as additional rings configured by the user or for XDP. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:06 -07:00
Vikas Gupta	2e4592dc9b	bnxt_en: Change MSIX/NQs allocation policy The existing scheme sets aside a number of MSIX/NQs for the RoCE driver whether the RoCE driver is registered or not. This scheme is not flexible and limits the resources available for the L2 rings if RoCE is never used. Modify the scheme so that the RoCE MSIX/NQs can be used by the L2 driver if they are not used for RoCE. The MSIX/NQs are now represented by 3 fields. bp->ulp_num_msix_want contains the desired default value, edev->ulp_num_msix_vec contains the available value (but not necessarily in use), and ulp_tbl->msix_requested contains the actual value in use by RoCE. The L2 driver can dip into edev->ulp_num_msix_vec if necessary. We need to add rtnl_lock() back in bnxt_register_dev() and bnxt_unregister_dev() to synchronize the MSIX usage between L2 and RoCE. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:06 -07:00
Vikas Gupta	194fad5b27	bnxt_en: Refactor bnxt_rdma_aux_device_init/uninit functions In its current form, bnxt_rdma_aux_device_init() not only initializes the necessary data structures of the newly created aux device but also adds the aux device into the aux bus subsytem. Refactor the logic into separate functions, first function to initialize the aux device along with the required resources and second, to actually add the device to the aux bus subsytem. This separation helps to create bnxt_en_dev much earlier and save its resources separately. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:05 -07:00
Vikas Gupta	b58f5a9c70	bnxt_en: Remove unneeded MSIX base structure fields and code Ever since commit: `3034322113` ("bnxt_en: Remove runtime interrupt vector allocation") The MSIX base vector is effectively always 0. Remove all unneeded structure fields and code referencing the MSIX base. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:05 -07:00
Kalesh AP	43226dccd1	bnxt_en: Remove a redundant NULL check in bnxt_register_dev() The memory for "edev->ulp_tbl" is allocated inside the bnxt_rdma_aux_device_init() function. If it fails, the driver will not create the auxiliary device for RoCE. Hence the NULL check inside bnxt_register_dev() is unnecessary. Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:05 -07:00
Pavan Chebbi	17b0dfa1f3	bnxt_en: Skip ethtool RSS context configuration in ifdown state The current implementation requires the ifstate to be up when configuring the RSS contexts. It will try to fetch the RX ring IDs and will crash if it is in ifdown state. Return error if !netif_running() to prevent the crash. An improved implementation is in the works to allow RSS contexts to be changed while in ifdown state. Fixes: `b3d0083caf` ("bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh()") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Link: https://lore.kernel.org/r/20240409215431.41424-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-10 19:55:05 -07:00
Pavan Chebbi	faa12ca245	bnxt_en: Reset PTP tx_avail after possible firmware reset It is possible that during error recovery and firmware reset, there is a pending TX PTP packet waiting for the timestamp. We need to reset this condition so that after recovery, the tx_avail count for PTP is reset back to the initial value. Otherwise, we may not accept any PTP TX timestamps after recovery. Fixes: `118612d519` ("bnxt_en: Add PTP clock APIs, ioctls, and ethtool methods") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-08 13:55:47 +01:00
Vikas Gupta	b5ea7d33ba	bnxt_en: Fix error recovery for RoCE ulp client Since runtime MSIXs vector allocation/free has been removed, the L2 driver needs to repopulate the MSIX entries for the ulp client as the irq table may change during the recovery process. Fixes: `3034322113` ("bnxt_en: Remove runtime interrupt vector allocation") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-08 13:55:47 +01:00
Vikas Gupta	7ac10c7d72	bnxt_en: Fix possible memory leak in bnxt_rdma_aux_device_init() If ulp = kzalloc() fails, the allocated edev will leak because it is not properly assigned and the cleanup path will not be able to free it. Fix it by assigning it properly immediately after allocation. Fixes: `3034322113` ("bnxt_en: Remove runtime interrupt vector allocation") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-08 13:55:46 +01:00
Guillaume Nault	ec20b28300	ipv4: Set scope explicitly in ip_route_output(). Add a "scope" parameter to ip_route_output() so that callers don't have to override the tos parameter with the RTO_ONLINK flag if they want a local scope. This will allow converting flowi4_tos to dscp_t in the future, thus allowing static analysers to flag invalid interactions between "tos" (the DSCP bits) and ECN. Only three users ask for local scope (bonding, arp and atm). The others continue to use RT_SCOPE_UNIVERSE. While there, add a comment to warn users about the limitations of ip_route_output(). Signed-off-by: Guillaume Nault <gnault@redhat.com> Acked-by: Leon Romanovsky <leonro@nvidia.com> # infiniband Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-08 13:20:51 +01:00
Michael Chan	da48a65f3f	bnxt_en: Fix PTP firmware timeout parameter Use the correct tmo_us microsecond parameter for the PTP firmware timeout parameter. Fixes: `7de3c2218e` ("bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query()") Reported-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240404195500.171071-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-05 15:56:52 -07:00
Jakub Kicinski	cf1ca1f66d	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: net/ipv4/ip_gre.c `17af420545` ("erspan: make sure erspan_base_hdr is present in skb->head") `5832c4a77d` ("ip_tunnel: convert __be16 tunnel flags to bitmaps") https://lore.kernel.org/all/20240402103253.3b54a1cf@canb.auug.org.au/ Adjacent changes: net/ipv6/ip6_fib.c `d21d40605b` ("ipv6: Fix infinite recursion in fib6_dump_done().") `5fc68320c1` ("ipv6: remove RTNL protection from inet6_dump_fib()") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 18:01:07 -07:00
Sreekanth Reddy	e193f53aed	bnxt_en: Add warning message about disallowed speed change Some chips may not allow changing default speed when dual rate transceivers modules are used. Firmware on those chips will indicate the same to the driver. Add a warning message when speed change is not supported because a dual rate transceiver is detected by the NIC. Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-8-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:20 -07:00
Pavan Chebbi	4e474addc0	bnxt_en: Update firmware interface to 1.10.3.39 This updated interface supports backing store APIs to configure host FW trace buffers, updates transceivers ID types, updates to TrueFlow features and other changes for the basic L2 features. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-7-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Somnath Kotur	0ae1fafc8b	bnxt_en: Add XDP Metadata support - Change the last arg to xdp_prepare_buff to true from false. - Ensure that when XDP_PASS is returned the xdp->data_meta area is copied to the skb->data area. Account for the meta data size on skb allocation and do a pull after to move it to the "reserved" zone. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-6-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Somnath Kotur	1614f06e09	bnxt_en: Change bnxt_rx_xdp function prototype Change bnxt_rx_xdp() to take a pointer to xdp instead of stack variable. This is in prepartion for the XDP metadata patch change where the BPF program can change the value of the xdp.meta_data. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Daniel Borkmann <daniel@iogearbox.net> Cc: Jesper Dangaard Brouer <hawk@kernel.org> Cc: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-5-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Somnath Kotur	fba2e4e5db	bnxt_en: Allocate page pool per numa node Driver's Page Pool allocation code looks at the node local to the PCIe device to determine where to allocate memory. In scenarios where the core count per NUMA node is low (< default rings) it makes sense to exhaust page pool allocations on Node 0 first and then moving on to allocating page pools for the remaining rings from Node 1. With this patch, and the following configuration on the NIC $ ethtool -L ens1f0np0 combined 16 (core count/node = 12, first 12 rings on node#0, last 4 rings node#1) and traffic redirected to a ring on node#1 , we see a performance improvement of ~20% Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-4-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Somnath Kotur	8635ae8e99	bnxt_en: Enable XPS by default on driver load Enable XPS on default during NIC open. The choice of Tx queue is based on the CPU executing the thread that submits the Tx request. The pool of Tx queues will be spread evenly across both device-attached NUMA nodes(local) and remote NUMA nodes. Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-3-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Vikas Gupta	d5ab32e9b0	bnxt_en: Add delay to handle Downstream Port Containment (DPC) AER In case of DPC, after issuing the hot reset, the kernel waits for 100ms for the device to complete the reset. However on some older chips, the firmware may take up to 1 second to complete the reset, only after which the driver can restart the card. Introduce delay of 900ms to handle this scenario on the older chipsets. Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Link: https://lore.kernel.org/r/20240402093753.331120-2-pavan.chebbi@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-04-04 09:13:19 -07:00
Paolo Abeni	72076fc9fe	Revert "tg3: Remove residual error handling in tg3_suspend" This reverts commit `9ab4ad2956`. I went out of coffee and applied it to the wrong tree. Blame on me. Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-04 10:51:01 +02:00
Nikita Kiryushin	d72b735712	tg3: Remove residual error handling in tg3_suspend As of now, tg3_power_down_prepare always ends with success, but the error handling code from former tg3_set_power_state call is still here. This code became unreachable in commit `c866b7eac0` ("tg3: Do not use legacy PCI power management"). Remove (now unreachable) error handling code for simplification and change tg3_power_down_prepare to a void function as its result is no more checked. Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-04 10:49:42 +02:00
Nikita Kiryushin	9ab4ad2956	tg3: Remove residual error handling in tg3_suspend As of now, tg3_power_down_prepare always ends with success, but the error handling code from former tg3_set_power_state call is still here. This code became unreachable in commit `c866b7eac0` ("tg3: Do not use legacy PCI power management"). Remove (now unreachable) error handling code for simplification and change tg3_power_down_prepare to a void function as its result is no more checked. Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-04-04 10:16:50 +02:00
Phil Elwell	0a6380cb4c	net: bcmgenet: Reset RBUF on first open If the RBUF logic is not reset when the kernel starts then there may be some data left over from any network boot loader. If the 64-byte packet headers are enabled then this can be fatal. Extend bcmgenet_dma_disable to do perform the reset, but not when called from bcmgenet_resume in order to preserve a wake packet. N.B. This different handling of resume is just based on a hunch - why else wouldn't one reset the RBUF as well as the TBUF? If this isn't the case then it's easy to change the patch to make the RBUF reset unconditional. See: https://github.com/raspberrypi/linux/issues/3850 See: https://github.com/raspberrypi/firmware/issues/1882 Signed-off-by: Phil Elwell <phil@raspberrypi.com> Signed-off-by: Maarten Vanraes <maarten@rmail.be> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-04-03 11:07:00 +01:00
Johannes Berg	e8058a49e6	netlink: introduce type-checking attribute iteration There are, especially with multi-attr arrays, many cases of needing to iterate all attributes of a specific type in a netlink message or a nested attribute. Add specific macros to support that case. Also convert many instances using this spatch: @@ iterator nla_for_each_attr; iterator name nla_for_each_attr_type; identifier nla; expression head, len, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_attr(nla, head, len, rem) +nla_for_each_attr_type(nla, ATTR, head, len, rem) { <... T x; ...> -if (nla_type(nla) == ATTR) { ... -} } @@ identifier nla; iterator nla_for_each_nested; iterator name nla_for_each_nested_type; expression attr, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_nested(nla, attr, rem) +nla_for_each_nested_type(nla, ATTR, attr, rem) { <... T x; ...> -if (nla_type(nla) == ATTR) { ... -} } @@ iterator nla_for_each_attr; iterator name nla_for_each_attr_type; identifier nla; expression head, len, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_attr(nla, head, len, rem) +nla_for_each_attr_type(nla, ATTR, head, len, rem) { <... T x; ...> -if (nla_type(nla) != ATTR) continue; ... } @@ identifier nla; iterator nla_for_each_nested; iterator name nla_for_each_nested_type; expression attr, rem; expression ATTR; type T; identifier x; @@ -nla_for_each_nested(nla, attr, rem) +nla_for_each_nested_type(nla, ATTR, attr, rem) { <... T x; ...> -if (nla_type(nla) != ATTR) continue; ... } Although I had to undo one bad change this made, and I also adjusted some other code for whitespace and to use direct variable initialization now. Signed-off-by: Johannes Berg <johannes.berg@intel.com> Link: https://lore.kernel.org/r/20240328203144.b5a6c895fb80.I1869b44767379f204998ff44dd239803f39c23e0@changeid Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-29 15:06:02 -07:00
Pavan Chebbi	2f4f9fe5bf	bnxt_en: Support adding ntuple rules on RSS contexts When the user wants to add an ntuple filter to an RSS context, select the appropriate VNIC belonging to the selected RSS context and add the VNIC destination rule. Make the necessary changes to bnxt_add_ntuple_cls_rule(). Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:41 -07:00
Pavan Chebbi	61c814bf4a	bnxt_en: Refactor bnxt_cfg_rfs_ring_tbl_idx() Refactor bnxt_cfg_rfs_ring_tbl_idx() to pass in the filter structure pointer instead of the RX ring number. This will allow an ntuple filter to be set up for the non-default RSS contexts in the next patch. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:41 -07:00
Pavan Chebbi	b3d0083caf	bnxt_en: Support RSS contexts in ethtool .{get\|set}_rxfh() Support up to 32 RSS contexts per device if supported by the device. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:41 -07:00
Michael Chan	77a614f749	bnxt_en: Refactor bnxt_set_rxfh() Add a new bnxt_modify_rss() function to modify the RSS key and RSS indirection table. The new function can modify the parameters for the default context or additional contexts. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:41 -07:00
Pavan Chebbi	0895926f72	bnxt_en: Add a new_rss_ctx parameter to bnxt_rfs_capable() Modify bnxt_rfs_capable() to check that there are enough resources to support aRFS/ntuple filters for a new RSS context requested by the user. Existing use cases in the driver will always set the new parameter to false. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Michael Chan	b09353437b	bnxt_en: Simplify bnxt_rfs_capable() bnxt_rfs_capable() determines the number of VNICs and RSS_CTXs required to support aRFS and then reserves the resources. We already have functions bnxt_get_total_vnics() and bnxt_get_total_rss_ctxs() to do that. Simplify the code by calling these functions. It is also more correct to do the resource reservation after bnxt_can_reserve_rings() returns true. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Pavan Chebbi	ecb342bb60	bnxt_en: Refactor RSS indir alloc/set functions We will need to dynamically allocate and change indirection tables for additional RSS contexts. Add the rss_ctx pointer parameter to bnxt_alloc_rss_indir_tbl() and bnxt_set_dflt_rss_indir_tbl(). Existing usage will always pass rss_ctx as NULL which means the default RSS context. When supporting additional RSS contexts in subsequent patches, we'll pass the valid rss_ctx to these 2 functions. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Pavan Chebbi	fea41bd766	bnxt_en: Introduce rss ctx structure, alloc/free functions Add struct bnxt_rss_ctx, related storage lists, required defines, and its alloc/free functions. Later patches will use them in order to support multiple RSS contexts. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Pavan Chebbi	a4c11166a6	bnxt_en: Refactor VNIC alloc and cfg functions The current VNIC structures are stored in an array bp->vnic_info[]. The index of the array (vnic_id) is passed to all the functions that need to reference the VNIC. This patch changes the scheme to pass the VNIC pointer instead of the vnic index. Subsequent patches will create additional VNICs that will not be stored in the bp->vnic_info[] array. Using the VNIC pointer will work for all the VNICs. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Pavan Chebbi	1dcd70ba24	bnxt_en: Add helper function bnxt_hwrm_vnic_rss_cfg_p5() This is a pure refactoring patch. The new function bnxt_hwrm_vnic_set_rss_p5() will set up the P5_PLUS specific RSS ring table and then call bnxt_hwrm_vnic_cfg() to setup the vnic for proper RSS operations. This new function will be used later for additional RSS contexts. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240325222902.220712-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Pavan Chebbi	604041643a	bnxt_en: Retry PTP TX timestamp from FW for 1 second Use a new default 1 second timeout value instead of the existing 1 msec value. The driver will keep track of the remaining time before timeout and will pass this value to bnxt_hwrm_port_ts_query(). The firmware supports timeout values up to 65535 usecs. If the timeout value passed to bnxt_hwrm_port_ts_query() is less than the FW max value, we will use that value to precisely control the specified timeout. If it is larger than the FW max value, we will use the FW max value and any additional retry to reach the desired timeout will be done in the context of bnxt_ptp_ts_aux_eork(). Link: https://lore.kernel.org/netdev/20240229070202.107488-2-michael.chan@broadcom.com/ Cc: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20240325222902.220712-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:40 -07:00
Michael Chan	7de3c2218e	bnxt_en: Add a timeout parameter to bnxt_hwrm_port_ts_query() The caller can pass this new timeout parameter to the function to specify the firmware timeout value when requesting the TX timestamp from the firmware. This will allow the caller to precisely control the timeout and will be used in the next patch. In this patch, the parameter is 0 which means to use the current default value. Cc: Richard Cochran <richardcochran@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://lore.kernel.org/r/20240325222902.220712-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-28 22:34:39 -07:00
Justin Chen	4494c10e00	net: bcmasp: Remove phy_{suspend/resume} phy_{suspend/resume} is redundant. It gets called from phy_{stop/start}. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-03-28 10:46:38 +01:00
Justin Chen	dfd222e2ae	net: bcmasp: Bring up unimac after PHY link up The unimac requires the PHY RX clk during reset or it may be put into a bad state. Bring up the unimac after link up to ensure the PHY RX clk exists. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-03-28 10:46:38 +01:00
Linus Torvalds	cba9ffdb99	Including fixes from CAN, netfilter, wireguard and IPsec. Current release - regressions: - rxrpc: fix use of page_frag_alloc_align(), it changed semantics and we added a new caller in a different subtree - xfrm: allow UDP encapsulation only in offload modes Current release - new code bugs: - tcp: fix refcnt handling in __inet_hash_connect() - Revert "net: Re-use and set mono_delivery_time bit for userspace tstamp packets", conflicted with some expectations in BPF uAPI Previous releases - regressions: - ipv4: raw: fix sending packets from raw sockets via IPsec tunnels - devlink: fix devlink's parallel command processing - veth: do not manipulate GRO when using XDP - esp: fix bad handling of pages from page_pool Previous releases - always broken: - report RCU QS for busy network kthreads (with Paul McK's blessing) - tcp/rds: fix use-after-free on netns with kernel TCP reqsk - virt: vmxnet3: fix missing reserved tailroom with XDP Misc: - couple of build fixes for Documentation Signed-off-by: Jakub Kicinski <kuba@kernel.org> -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmX8bXsACgkQMUZtbf5S IrsfBg/+KzrEx0tB/Af57ZZGZ5PMjPy+XFDox4iFfHm338UFuGXVvZrXd7G+6YkH ZwWeF5YDPKzwIEiZ5D3hewZPlkLH0Eg88q74chlE0gUv7t1jhuQHUdIVeFnPcLbN t/8AcCZCJ2fENbr1iNnzZON1RW0fVOl+SDxhiSYFeFqii6FywDfqWL/h0u86H/AF KRktgb0LzH0waH6IiefVV1NZyjnZwmQ6+UVQerTzUnQmWhV1xQKoO3MQpZuFRvr6 O+kPZMkrqnTCCy7RO1BexS5cefqc80i5Z25FLGcaHgpnYd2pDNDMMxqrhqO9Y0Pv 6u/tLgRxzVUDXWouzREIRe50Z9GJswkg78zilAhpqYiHRjd8jaBH6y+9mhGFc7F8 iVAx02WfJhlk0aynFf2qZmR7PQIb9XjtFJ7OAeJrno9UD7zAubtikGM/6m6IZfRV TD1mze95RVnNjbHZMeg6oNLFUMJXVTobtvtqk5pTQvsNsmSYGFvkvWC5/P6ycyYt pMx6E0PA/ZCnQAlThCOCzFa5BO+It3RJHcQJhgbOzHrlWKwmrjBKcKJcLLcxFSUt 4wwjdEcG1Bo2wdnsjwsQwJDHQW+M9TSLdLM3YVptM9jbqOMizoqr6/xSykg3H4wZ t/dSiYSsEr06z7lvwbAjUXJ/mfszZ+JsVAFXAN7ahcM4OZb5WTQ= =gpLl -----END PGP SIGNATURE----- Merge tag 'net-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Jakub Kicinski: "Including fixes from CAN, netfilter, wireguard and IPsec. I'd like to highlight [ lowlight? - Linus ] Florian W stepping down as a netfilter maintainer due to constant stream of bug reports. Not sure what we can do but IIUC this is not the first such case. Current release - regressions: - rxrpc: fix use of page_frag_alloc_align(), it changed semantics and we added a new caller in a different subtree - xfrm: allow UDP encapsulation only in offload modes Current release - new code bugs: - tcp: fix refcnt handling in __inet_hash_connect() - Revert "net: Re-use and set mono_delivery_time bit for userspace tstamp packets", conflicted with some expectations in BPF uAPI Previous releases - regressions: - ipv4: raw: fix sending packets from raw sockets via IPsec tunnels - devlink: fix devlink's parallel command processing - veth: do not manipulate GRO when using XDP - esp: fix bad handling of pages from page_pool Previous releases - always broken: - report RCU QS for busy network kthreads (with Paul McK's blessing) - tcp/rds: fix use-after-free on netns with kernel TCP reqsk - virt: vmxnet3: fix missing reserved tailroom with XDP Misc: - couple of build fixes for Documentation" * tag 'net-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (59 commits) selftests: forwarding: Fix ping failure due to short timeout MAINTAINERS: step down as netfilter maintainer netfilter: nf_tables: Fix a memory leak in nf_tables_updchain net: dsa: mt7530: fix handling of all link-local frames net: dsa: mt7530: fix link-local frames that ingress vlan filtering ports bpf: report RCU QS in cpumap kthread net: report RCU QS on threaded NAPI repolling rcu: add a helper to report consolidated flavor QS ionic: update documentation for XDP support lib/bitmap: Fix bitmap_scatter() and bitmap_gather() kernel doc netfilter: nf_tables: do not compare internal table flags on updates netfilter: nft_set_pipapo: release elements in clone only from destroy path octeontx2-af: Use separate handlers for interrupts octeontx2-pf: Send UP messages to VF only when VF is up. octeontx2-pf: Use default max_active works instead of one octeontx2-pf: Wait till detach_resources msg is complete octeontx2: Detect the mbox up or down message via register devlink: fix port new reply cmd type tcp: Clear req->syncookie in reqsk_alloc(). net/bnx2x: Prevent access to a freed page in page_pool ...	2024-03-21 14:50:39 -07:00
Linus Torvalds	bb41fe35dc	Char/Misc and other driver subsystem updates for 6.9-rc1 Here is the big set of char/misc and a number of other driver subsystem updates for 6.9-rc1. Included in here are: - IIO driver updates, loads of new ones and evolution of existing ones - coresight driver updates - const cleanups for many driver subsystems - speakup driver additions - platform remove callback void cleanups - mei driver updates - mhi driver updates - cdx driver updates for MSI interrupt handling - nvmem driver updates - other smaller driver updates and cleanups, full details in the shortlog All of these have been in linux-next for a long time with no reported issue, other than a build warning with some older versions of gcc for a speakup driver, fix for that will come in a few days when I catch up with my pending patch queues. Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> -----BEGIN PGP SIGNATURE----- iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZfwuLg8cZ3JlZ0Brcm9h aC5jb20ACgkQMUfUDdst+ynKVACgjvR1cD8NYk9PcGWc9ZaXAZ6zSnwAn260kMoe lLFtwszo7m0N6ZULBWBd =y3yz -----END PGP SIGNATURE----- Merge tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc Pull char/misc and other driver subsystem updates from Greg KH: "Here is the big set of char/misc and a number of other driver subsystem updates for 6.9-rc1. Included in here are: - IIO driver updates, loads of new ones and evolution of existing ones - coresight driver updates - const cleanups for many driver subsystems - speakup driver additions - platform remove callback void cleanups - mei driver updates - mhi driver updates - cdx driver updates for MSI interrupt handling - nvmem driver updates - other smaller driver updates and cleanups, full details in the shortlog All of these have been in linux-next for a long time with no reported issue, other than a build warning for the speakup driver" The build warning hits clang and is a gcc (and C23) extension, and is fixed up in the merge. Link: https://lore.kernel.org/all/20240321134831.GA2762840@dev-arch.thelio-3990X/ * tag 'char-misc-6.9-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (279 commits) binder: remove redundant variable page_addr uio_dmem_genirq: UIO_MEM_DMA_COHERENT conversion uio_pruss: UIO_MEM_DMA_COHERENT conversion cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT uio: introduce UIO_MEM_DMA_COHERENT type cdx: add MSI support for CDX bus pps: use cflags-y instead of EXTRA_CFLAGS speakup: Add /dev/synthu device speakup: Fix 8bit characters from direct synth parport: sunbpp: Convert to platform remove callback returning void parport: amiga: Convert to platform remove callback returning void char: xillybus: Convert to platform remove callback returning void vmw_balloon: change maintainership MAINTAINERS: change the maintainer for hpilo driver char: xilinx_hwicap: Fix NULL vs IS_ERR() bug hpet: remove hpets::hp_clocksource platform: goldfish: move the separate 'default' propery for CONFIG_GOLDFISH char: xilinx_hwicap: drop casting to void in dev_set_drvdata greybus: move is_gb_* functions out of greybus.h greybus: Remove usage of the deprecated ida_simple_xx() API ...	2024-03-21 13:21:31 -07:00
Thinh Tran	d27e2da94a	net/bnx2x: Prevent access to a freed page in page_pool Fix race condition leading to system crash during EEH error handling During EEH error recovery, the bnx2x driver's transmit timeout logic could cause a race condition when handling reset tasks. The bnx2x_tx_timeout() schedules reset tasks via bnx2x_sp_rtnl_task(), which ultimately leads to bnx2x_nic_unload(). In bnx2x_nic_unload() SGEs are freed using bnx2x_free_rx_sge_range(). However, this could overlap with the EEH driver's attempt to reset the device using bnx2x_io_slot_reset(), which also tries to free SGEs. This race condition can result in system crashes due to accessing freed memory locations in bnx2x_free_rx_sge() 799 static inline void bnx2x_free_rx_sge(struct bnx2x bp, 800 struct bnx2x_fastpath fp, u16 index) 801 { 802 struct sw_rx_page sw_buf = &fp->rx_page_ring[index]; 803 struct page page = sw_buf->page; .... where sw_buf was set to NULL after the call to dma_unmap_page() by the preceding thread. EEH: Beginning: 'slot_reset' PCI 0011:01:00.0#10000: EEH: Invoking bnx2x->slot_reset() bnx2x: [bnx2x_io_slot_reset:14228(eth1)]IO slot reset initializing... bnx2x 0011:01:00.0: enabling device (0140 -> 0142) bnx2x: [bnx2x_io_slot_reset:14244(eth1)]IO slot reset --> driver unload Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc0080000025065fc Oops: Kernel access of bad area, sig: 11 [#1] ..... Call Trace: [c000000003c67a20] [c00800000250658c] bnx2x_io_slot_reset+0x204/0x610 [bnx2x] (unreliable) [c000000003c67af0] [c0000000000518a8] eeh_report_reset+0xb8/0xf0 [c000000003c67b60] [c000000000052130] eeh_pe_report+0x180/0x550 [c000000003c67c70] [c00000000005318c] eeh_handle_normal_event+0x84c/0xa60 [c000000003c67d50] [c000000000053a84] eeh_event_handler+0xf4/0x170 [c000000003c67da0] [c000000000194c58] kthread+0x1c8/0x1d0 [c000000003c67e10] [c00000000000cf64] ret_from_kernel_thread+0x5c/0x64 To solve this issue, we need to verify page pool allocations before freeing. Fixes: `4cace675d6` ("bnx2x: Alloc 4k fragment for each rx ring buffer element") Signed-off-by: Thinh Tran <thinhtr@linux.ibm.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20240315205535.1321-1-thinhtr@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-19 19:34:27 -07:00
Linus Torvalds	e5eb28f6d1	- Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min heap optimizations". - Kuan-Wei Chiu has also sped up the library sorting code in the series "lib/sort: Optimize the number of swaps and comparisons". - Alexey Gladkov has added the ability for code running within an IPC namespace to alter its IPC and MQ limits. The series is "Allow to change ipc/mq sysctls inside ipc namespace". - Geert Uytterhoeven has contributed some dhrystone maintenance work in the series "lib: dhry: miscellaneous cleanups". - Ryusuke Konishi continues nilfs2 maintenance work in the series "nilfs2: eliminate kmap and kmap_atomic calls" "nilfs2: fix kernel bug at submit_bh_wbc()" - Nathan Chancellor has updated our build tools requirements in the series "Bump the minimum supported version of LLVM to 13.0.1". - Muhammad Usama Anjum continues with the selftests maintenance work in the series "selftests/mm: Improve run_vmtests.sh". - Oleg Nesterov has done some maintenance work against the signal code in the series "get_signal: minor cleanups and fix". Plus the usual shower of singleton patches in various parts of the tree. Please see the individual changelogs for details. -----BEGIN PGP SIGNATURE----- iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZfMnvgAKCRDdBJ7gKXxA jjKMAP4/Upq07D4wjkMVPb+QrkipbbLpdcgJ++q3z6rba4zhPQD+M3SFriIJk/Xh tKVmvihFxfAhdDthseXcIf1nBjMALwY= =8rVc -----END PGP SIGNATURE----- Merge tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - Kuan-Wei Chiu has developed the well-named series "lib min_heap: Min heap optimizations". - Kuan-Wei Chiu has also sped up the library sorting code in the series "lib/sort: Optimize the number of swaps and comparisons". - Alexey Gladkov has added the ability for code running within an IPC namespace to alter its IPC and MQ limits. The series is "Allow to change ipc/mq sysctls inside ipc namespace". - Geert Uytterhoeven has contributed some dhrystone maintenance work in the series "lib: dhry: miscellaneous cleanups". - Ryusuke Konishi continues nilfs2 maintenance work in the series "nilfs2: eliminate kmap and kmap_atomic calls" "nilfs2: fix kernel bug at submit_bh_wbc()" - Nathan Chancellor has updated our build tools requirements in the series "Bump the minimum supported version of LLVM to 13.0.1". - Muhammad Usama Anjum continues with the selftests maintenance work in the series "selftests/mm: Improve run_vmtests.sh". - Oleg Nesterov has done some maintenance work against the signal code in the series "get_signal: minor cleanups and fix". Plus the usual shower of singleton patches in various parts of the tree. Please see the individual changelogs for details. * tag 'mm-nonmm-stable-2024-03-14-09-36' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (77 commits) nilfs2: prevent kernel bug at submit_bh_wbc() nilfs2: fix failure to detect DAT corruption in btree and direct mappings ocfs2: enable ocfs2_listxattr for special files ocfs2: remove SLAB_MEM_SPREAD flag usage assoc_array: fix the return value in assoc_array_insert_mid_shortcut() buildid: use kmap_local_page() watchdog/core: remove sysctl handlers from public header nilfs2: use div64_ul() instead of do_div() mul_u64_u64_div_u64: increase precision by conditionally swapping a and b kexec: copy only happens before uchunk goes to zero get_signal: don't initialize ksig->info if SIGNAL_GROUP_EXIT/group_exec_task get_signal: hide_si_addr_tag_bits: fix the usage of uninitialized ksig get_signal: don't abuse ksig->info.si_signo and ksig->sig const_structs.checkpatch: add device_type Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>" dyndbg: replace kstrdup() + strchr() with kstrdup_and_replace() list: leverage list_is_head() for list_entry_is_head() nilfs2: MAINTAINERS: drop unreachable project mirror site smp: make __smp_processor_id() 0-argument macro fat: fix uninitialized field in nostale filehandles ...	2024-03-14 18:03:09 -07:00
Jakub Kicinski	af7b3b4add	eth: bnxt: support per-queue statistics Support per-queue statistics API in bnxt. $ ethtool -S eth0 NIC statistics: [0]: rx_ucast_packets: 1418 [0]: rx_mcast_packets: 178 [0]: rx_bcast_packets: 0 [0]: rx_discards: 0 [0]: rx_errors: 0 [0]: rx_ucast_bytes: 1141815 [0]: rx_mcast_bytes: 16766 [0]: rx_bcast_bytes: 0 [0]: tx_ucast_packets: 1734 ... $ ./cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' [{'ifindex': 2, 'queue-id': 0, 'queue-type': 'rx', 'rx-alloc-fail': 0, 'rx-bytes': 1164931, 'rx-packets': 1641}, ... {'ifindex': 2, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 631494, 'tx-packets': 1771}, ... Reset the per queue counters: $ ethtool -L eth0 combined 4 Inspect again: $ ./cli.py --spec netlink/specs/netdev.yaml \ --dump qstats-get --json '{"scope": "queue"}' [{'ifindex': 2, 'queue-id': 0, 'queue-type': 'rx', 'rx-alloc-fail': 0, 'rx-bytes': 32397, 'rx-packets': 145}, ... {'ifindex': 2, 'queue-id': 0, 'queue-type': 'tx', 'tx-bytes': 37481, 'tx-packets': 196}, ... $ ethtool -S eth0 \| head NIC statistics: [0]: rx_ucast_packets: 174 [0]: rx_mcast_packets: 3 [0]: rx_bcast_packets: 0 [0]: rx_discards: 0 [0]: rx_errors: 0 [0]: rx_ucast_bytes: 37151 [0]: rx_mcast_bytes: 267 [0]: rx_bcast_bytes: 0 [0]: tx_ucast_packets: 267 ... Totals are still correct: $ ./cli.py --spec netlink/specs/netdev.yaml --dump qstats-get [{'ifindex': 2, 'rx-alloc-fail': 0, 'rx-bytes': 281949995, 'rx-packets': 216524, 'tx-bytes': 52694905, 'tx-packets': 75546}] $ ip -s link show dev eth0 2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP mode DEFAULT group default qlen 1000 link/ether 14:23:f2:61:05:40 brd ff:ff:ff:ff:ff:ff RX: bytes packets errors dropped missed mcast 282519546 218100 0 0 0 516 TX: bytes packets errors dropped carrier collsns 53323054 77674 0 0 0 0 Acked-by: Stanislav Fomichev <sdf@google.com> Reviewed-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Link: https://lore.kernel.org/r/20240306195509.1502746-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-03-07 21:13:26 -08:00
Chris Leech	bfe78793b2	cnic,bnx2,bnx2x: use UIO_MEM_DMA_COHERENT Use the UIO_MEM_DMA_COHERENT type to properly handle mmap for dma_alloc_coherent buffers. The cnic l2_ring and l2_buf mmaps have caused page refcount issues as the dma_alloc_coherent no longer provide __GFP_COMP allocation as per commit "dma-mapping: reject __GFP_COMP in dma_alloc_attrs". Fix this by having the uio device use dma_mmap_coherent. The bnx2 and bnx2x status block allocations are also dma_alloc_coherent, and should use dma_mmap_coherent. They don't allocate multiple pages, but this interface does not work correctly with an iommu enabled unless dma_mmap_coherent is used. Signed-off-by: Nilesh Javali <njavali@marvell.com> Signed-off-by: Chris Leech <cleech@redhat.com> Acked-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/r/20240201233400.3394996-3-cleech@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>	2024-03-07 21:52:59 +00:00
Ahelenia Ziemiańska	6a57a21943	Normalise "name (ad@dr)" MODULE_AUTHORs to "name <ad@dr>" Found with git grep 'MODULE_AUTHOR(".([^)]@' Fixed with sed -i '/MODULE_AUTHOR(".([^)]@/{s/ (/ </g;s/)"/>"/;s/)and/> and/}' \ $(git grep -l 'MODULE_AUTHOR(".([^)]@') Also: in drivers/media/usb/siano/smsusb.c normalise ", INC" to ", Inc"; this is what every other MODULE_AUTHOR for this company says, and it's what the header says in drivers/sbus/char/openprom.c normalise a double-spaced separator; this is clearly copied from the copyright header, where the names are aligned on consecutive lines thusly: * Linux/SPARC PROM Configuration Driver * Copyright (C) 1996 Thomas K. Dyas (tdyas@noc.rutgers.edu) * Copyright (C) 1996 Eddie C. Dost (ecd@skynet.be) but the authorship branding is single-line Link: https://lkml.kernel.org/r/mk3geln4azm5binjjlfsgjepow4o73domjv6ajybws3tz22vb3@tarta.nabijaczleweli.xyz Signed-off-by: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Joe Perches <joe@perches.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-03-06 13:07:39 -08:00
Eric Dumazet	cc15bd10e7	net: adopt skb_network_header_len() more broadly (skb_transport_header(skb) - skb_network_header(skb)) can be replaced by skb_network_header_len(skb) Add a DEBUG_NET_WARN_ON_ONCE() in skb_network_header_len() to catch cases were the transport_header was not set. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-04 08:47:06 +00:00
Eric Dumazet	80bfab79b8	net: adopt skb_network_offset() and similar helpers This is a cleanup patch, making code a bit more concise. 1) Use skb_network_offset(skb) in place of (skb_network_header(skb) - skb->data) 2) Use -skb_network_offset(skb) in place of (skb->data - skb_network_header(skb)) 3) Use skb_transport_offset(skb) in place of (skb_transport_header(skb) - skb->data) 4) Use skb_inner_transport_offset(skb) in place of (skb_inner_transport_header(skb) - skb->data) Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Edward Cree <ecree.xilinx@gmail.com> # for sfc Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-04 08:47:06 +00:00
Justin Chen	cc7f105e76	net: bcmasp: Add support for PHY interrupts Hook up the phy interrupts for internal phys to reduce mdio traffic and improve responsiveness of link changes. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-01 09:22:50 +00:00
Justin Chen	4688f4f41c	net: bcmasp: Keep buffers through power management There is no advantage of freeing and re-allocating buffers through suspend and resume. This waste cycles and makes suspend/resume time longer. We also open ourselves to failed allocations in systems with heavy memory fragmentation. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-01 09:22:50 +00:00
Justin Chen	1d472eb5b6	net: bcmasp: Add support for ASP 2.2 ASP 2.2 improves power savings during low power modes. A new register was added to toggle to a slower clock during low power modes. EEE was broken for ASP 2.0/2.1. A HW workaround was added for ASP 2.2 that requires toggling a chicken bit. Signed-off-by: Justin Chen <justin.chen@broadcom.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-03-01 09:22:49 +00:00
Alexander Lobakin	c4b04a802d	bnxt_en: fix accessing vnic_info before allocating it bnxt_alloc_mem() dereferences ::vnic_info in the variable declaration block, but allocates it much later. As a result, the following crash happens on my setup: BUG: kernel NULL pointer dereference, address: 0000000000000090 fbcon: Taking over console #PF: supervisor write access in kernel mode #PF: error_code (0x0002) - not-present page PGD 12f382067 P4D 0 Oops: 8002 [#1] PREEMPT SMP NOPTI CPU: 47 PID: 2516 Comm: NetworkManager Not tainted 6.8.0-rc5-libeth+ #49 Hardware name: Intel Corporation M50CYP2SBSTD/M58CYP2SBSTD, BIOS SE5C620.86B.01.01.0088.2305172341 05/17/2023 RIP: 0010:bnxt_alloc_mem+0x1609/0x1910 [bnxt_en] Code: 81 c8 48 83 c8 08 31 c9 e9 d7 fe ff ff c7 44 24 Oc 00 00 00 00 49 89 d5 e9 2d fe ff ff 41 89 c6 e9 88 00 00 00 48 8b 44 24 50 <80> 88 90 00 00 00 Od 8b 43 74 a8 02 75 1e f6 83 14 02 00 00 80 74 RSP: 0018:ff3f25580f3432c8 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ff15a5cfc45249e0 RCX: 0000002079777000 RDX: ff15a5dfb9767000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ff15a5dfb9777000 R11: ffffff8000000000 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000020 R15: ff15a5cfce34f540 FS: 000007fb9a160500(0000) GS:ff15a5dfbefc0000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CRO: 0000000080050033 CR2: 0000000000000090 CR3: 0000000109efc00Z CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DRZ: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __die_body+0x68/0xb0 ? page_fault_oops+0x3a6/0x400 ? exc_page_fault+0x7a/0x1b0 ? asm_exc_page_fault+0x26/8x30 ? bnxt_alloc_mem+0x1609/0x1910 [bnxt_en] ? bnxt_alloc_mem+0x1389/8x1918 [bnxt_en] _bnxt_open_nic+0x198/0xa50 [bnxt_en] ? bnxt_hurm_if_change+0x287/0x3d0 [bnxt_en] bnxt_open+0xeb/0x1b0 [bnxt_en] _dev_open+0x12e/0x1f0 _dev_change_flags+0xb0/0x200 dev_change_flags+0x25/0x60 do_setlink+0x463/0x1260 ? sock_def_readable+0x14/0xc0 ? rtnl_getlink+0x4b9/0x590 ? _nla_validate_parse+0x91/0xfa0 rtnl_newlink+0xbac/0xe40 <...> Don't create a variable and dereference the first array member directly since it's used only once in the code. Fixes: `ef4ee64e99` ("bnxt_en: Define BNXT_VNIC_DEFAULT for the default vnic index") Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240226144911.1297336-1-aleksander.lobakin@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-27 08:27:03 -08:00
Jakub Kicinski	fecc51559a	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: net/ipv4/udp.c `f796feabb9` ("udp: add local "peek offset enabled" flag") `56667da739` ("net: implement lockless setsockopt(SO_PEEK_OFF)") Adjacent changes: net/unix/garbage.c `aa82ac51d6` ("af_unix: Drop oob_skb ref before purging queue in GC.") `11498715f2` ("af_unix: Remove io_uring code for GC.") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-22 15:29:26 -08:00
Pavan Chebbi	f6eff053a6	bnxt_en: Use the new VNIC to create ntuple filters The newly created vnic (BNXT_VNIC_NTUPLE) is ready to be used to create ntuple filters when supported by firmware. All RX rings can be used regardless of the RSS indirection setting on the default VNIC. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Pavan Chebbi	93e90104bd	bnxt_en: Create and setup the additional VNIC for adding ntuple filters Allocate and setup the additional VNIC for ntuple filters if this new method is supported by the firmware. Even though this VNIC is only used for ntuple filters with direct ring destinations, we still setup the RSS hash to be identical to the default VNIC so that each RX packet will have the correct hash in the RX completion. This VNIC is always at VNIC index BNXT_VNIC_NTUPLE. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Pavan Chebbi	532c034e4b	bnxt_en: Provision for an additional VNIC for ntuple filters On newer chips that support the ring table index method for ntuple filters, the current scheme of using the same VNIC for both RSS and ntuple filters will not work in all cases. An ntuple filter can only be directed to a destination ring if that destination ring is also in the RSS indirection table. To support ntuple filters with any arbitratry RSS indirection table that may only include a subset of the rings, we need to use a separate VNIC for ntuple filters. This patch provisions the additional VNIC. The next patch will allocate additional VNIC from firmware and set it up. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Pavan Chebbi	ef4ee64e99	bnxt_en: Define BNXT_VNIC_DEFAULT for the default vnic index Replace hard coded 0 index with more meaningful BNXT_VNIC_DEFAULT. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Pavan Chebbi	5d5b90fb4e	bnxt_en: Refactor bnxt_set_features() Refactor bnxt_set_features() function to have a common function to re-init. We'll need this to reinitialize when ntuple configuration changes. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Venkat Duvvuru	8c81ae6c54	bnxt_en: Add bnxt_get_total_vnics() to calculate number of VNICs Refactor the code by adding a new function to calculate the number of required VNICs. This is used in multiple places when reserving or checking resources. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Venkat Duvvuru <venkatkumar.duvvuru@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Michael Chan	9294299867	bnxt_en: Check additional resources in bnxt_check_rings() bnxt_check_rings() is called to check if we have enough resource assets to satisfy the new number of ethtool channels. If the asset test fails, the ethtool operation will fail gracefully. Otherwise we will proceed and commit to use the new number of channels. If it fails to allocate any resources, the chip will fail to come up. For completeness, check all possible resources before committing to the new settings. Add the missing ring group and RSS context asset tests in bnxt_check_rings(). Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:23 +01:00
Pavan Chebbi	438ba39b25	bnxt_en: Improve RSS context reservation infrastructure Add RSS context fields to struct bnxt_hw_rings and struct bnxt_hw_resc. With these, we can now specific the exact number of RSS contexts to reserve and store the reserved value. The original code relies on other resources to infer the number of RSS contexts to reserve and the reserved value is not stored. This improved infrastructure will make the RSS context accounting more complete and is needed by later patches. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:22 +01:00
Michael Chan	ae8186b2d4	bnxt_en: Explicitly specify P5 completion rings to reserve The current code assumes that every RX ring group and every TX ring requires a completion ring on P5_PLUS chips. Now that we have the bnxt_hw_rings structure, add the cp_p5 field so that it can be explicitly specified. This makes the logic more clear. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:22 +01:00
Michael Chan	257bbf45af	bnxt_en: Refactor ring reservation functions The current functions to reserve hardware rings pass in 6 different ring or resource types as parameters. Add a structure bnxt_hw_rings to consolidate all these parameters and pass the structure pointer instead to these functions. Add 2 related helper functions also. This makes the code cleaner and makes it easier to add new resources to be reserved. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-22 15:31:22 +01:00
Florian Fainelli	ba0b78371c	Revert "net: bcmgenet: Ensure MDIO unregistration has clocks enabled" This reverts commit `1b5ea7ffb7` ("net: bcmgenet: Ensure MDIO unregistration has clocks enabled"). This is no longer necessary now that the MDIO bus controller has a clock that it can manage around the I/O accesses. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-21 12:46:17 +00:00
Florian Fainelli	ee2b4cf8b2	net: bcmgenet: Pass "main" clock down to the MDIO driver GENET has historically had to create a MDIO platform device for its controller and pass some auxiliary data to it, like a MDIO completion callback. Now we also pass the "main" clock to allow for the MDIO bus controller to manage that clock adequately around I/O accesses. Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-21 12:46:17 +00:00
Heiner Kallweit	ebb0346a11	tg3: simplify tg3_phy_autoneg_cfg Make use of ethtool_adv_to_mmd_eee_adv_t() to simplify the code. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-21 11:16:12 +00:00
Heiner Kallweit	8306ee08c0	tg3: copy only needed fields from userspace-provided EEE data The current code overwrites fields in tp->eee with unchecked data from edata, e.g. the bitmap with supported modes. ethtool properly returns the received data from get_eee() call, but we have no guarantee that other users of the ioctl set_eee() interface behave properly too. Therefore copy only fields which are actually needed. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-21 11:15:12 +00:00
Justin Chen	f120e62e37	net: bcmasp: Sanity check is off by one A sanity check for OOB write is off by one leading to a false positive when the array is full. Fixes: `9b90aca97f` ("net: ethernet: bcmasp: fix possible OOB write in bcmasp_netfilt_get_all_active()") Signed-off-by: Justin Chen <justin.chen@broadcom.com> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-18 11:32:10 +00:00
Florian Fainelli	5b76d928f8	net: bcmasp: Indicate MAC is in charge of PHY PM Avoid the PHY library call unnecessarily into the suspend/resume functions by setting phydev->mac_managed_pm to true. The ASP driver essentially does exactly what mdio_bus_phy_resume() does. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Justin Chen <justin.chen@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-18 11:32:10 +00:00
Jakub Kicinski	73be9a3aab	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts. Adjacent changes: net/core/dev.c `9f30831390` ("net: add rcu safety to rtnl_prop_list_size()") `723de3ebef` ("net: free altname using an RCU callback") net/unix/garbage.c `11498715f2` ("af_unix: Remove io_uring code for GC.") `25236c91b5` ("af_unix: Fix task hung while purging oob_skb in GC.") drivers/net/ethernet/renesas/ravb_main.c `ed4adc0720` ("net: ravb: Count packets instead of descriptors in GbEth RX path" ) `c2da940857` ("ravb: Add Rx checksum offload support for GbEth") net/mptcp/protocol.c `bdd70eb689` ("mptcp: drop the push_pending field") `28e5c13805` ("mptcp: annotate lockless accesses around read-mostly fields") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-15 16:20:04 -08:00
Florian Fainelli	e5b2e810da	net: bcmasp: Handle RX buffer allocation failure The buffer_pg variable needs to hold an order-5 allocation (32 x PAGE_SIZE) which, under memory pressure may fail to be allocated. Deal with that error condition properly to avoid doing a NULL pointer de-reference in the subsequent call to dma_map_page(). In addition, the err_reclaim_tx error label in bcmasp_netif_init() needs to ensure that the TX NAPI object is properly deleted, otherwise unregister_netdev() will spin forever attempting to test and clear the NAPI_STATE_HASHED bit. Fixes: `490cb41200` ("net: bcmasp: Add support for ASP2.0 Ethernet controller") Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Reviewed-by: Justin Chen <justin.chen@broadcom.com> Link: https://lore.kernel.org/r/20240213173339.3438713-1-florian.fainelli@broadcom.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-02-15 13:27:29 +01:00
Heiner Kallweit	0972d1d979	tg3: fix bug caused by uninitialized variable The reported bug is caused by using mii_eee_cap1_mod_linkmode_t() with an uninitialized bitmap. Fix this by zero-initializing the struct containing the bitmap. Fixes: `9bc791341b` ("tg3: convert EEE handling to use linkmode bitmaps") Reported-by: Srikanth Aithal <sraithal@amd.com> Tested-by: Srikanth Aithal <sraithal@amd.com> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-02-12 11:12:09 +00:00
Heiner Kallweit	1c96a63af5	bnx2x: convert EEE handling to use linkmode bitmaps Convert EEE handling to use linkmode bitmaps. This prepares for removing the legacy bitmaps from struct ethtool_keee. No functional change intended. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/948562fb-c5d8-4912-8b88-bec56238732a@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:55:19 -08:00
Ajit Khaparde	0c36211bac	bnxt_en: Add RSS support for IPSEC headers IPSec uses two distinct protocols, Authentication Header (AH) and Encapsulating Security Payload (ESP). Add support to configure RSS based on AH and ESP headers. This functionality will be enabled based on the capabilities indicated by the firmware in HWRM_VNIC_QCAPS. Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-14-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:42 -08:00
Pavan Chebbi	1018319f94	bnxt_en: Invalidate user filters when needed The cached user filters slated to be reapplied need to be cleared if configured MAC changes, RSS key changes, number of rings changes, or ntuple is disabled. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:42 -08:00
Pavan Chebbi	5de1fce336	bnxt_en: Add support for user configured RSS key Store the user configured or generated Toeplitz key in bp->rss_hash_key. The key stays constant across ifdown/ifup unless updated by the user. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:42 -08:00
Pavan Chebbi	44af4b622a	bnxt_en: Restore all the user created L2 and ntuple filters Walk the usr_fltr_list and call firmware to add these filters when we open the NIC. This will restore all user created filters after reset. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:42 -08:00
Pavan Chebbi	25041467d0	bnxt_en: Retain user configured filters when closing Driver should not free user created filters from its memory when closing since we are going to reconfigure them when we open again. If the "all" parameter is false, do not free user configured filters in bnxt_free_ntp_fltrs() and bnxt_free_l2_filters(). Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:42 -08:00
Pavan Chebbi	8336a974f3	bnxt_en: Save user configured filters in a lookup list Driver needs to maintain a lookup list of all the user configured filters. This is required in order to reconfigure these filters upon interface toggle. We can look up this list to follow the order with which they should be re-applied. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Pavan Chebbi	be40b4e9ca	bnxt_en: Add separate function to delete the filter structure Since we are going to do filter deletion at multiple places in the upcoming patches, add a function that does the deletion. Future patches add more code into this function. Since we are passing the address of the filter base to free the entire filter structure, add a comment to make sure that the base is always at the beginning of the structure. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Vikas Gupta	7efd79c0e6	bnxt_en: Add drop action support for ntuple Add drop action for protocols TCP/UDP/ICMP 1) Drop action for TCP/UDP is supported via flow type tcp4/udp4/tcp6/udp6. 2) Drop action for ICMPV4/ICMPV6/wildcard is supported via flow type ipv4/ipv6. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Vikas Gupta	9ba0e56199	bnxt_en: Enhance ethtool ntuple support for ip flows besides TCP/UDP Enable flow type ipv4/ipv6 1) for protocols ICMPV4 and ICMPV6. 2) for wildcard match. Wildcard matches to TCP/UDP/ICMP. Note that, IPPROTO_RAW(255) i.e. a reserved protocol considered for a wildcard. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Edwin Peer	c8d129c437	bnxt_en: implement fully specified 5-tuple masks Support subfield masking for IP addresses and ports. Previously, only entire fields could be included or excluded in NTUPLE filters. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/r/20240205223202.25341-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Michael Chan	7c8036fb71	bnxt_en: Support ethtool -n to display ether filters. Implement ETHTOOL_GRXCLSRULE for the user defined ether filters. Use the common functions to walk the L2 filter hash table. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/r/20240205223202.25341-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:41 -08:00
Michael Chan	e462998abc	bnxt_en: Add ethtool -N support for ether filters. Add ETHTOOL_SRXCLSRLINS and ETHTOOL_SRXCLSRLDEL support for inserting and deleting L2 ether filter rules. Destination MAC address and optional VLAN are supported for each filter entry. This is currently only supported on older BCM573XX and BCM574XX chips only. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240205223202.25341-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:40 -08:00
Michael Chan	f42822f22b	bnxt_en: Use firmware provided maximum filter counts. While individual filter structures are allocated as needed, there is an array to keep track of the software filter IDs that we allocate ahead of time. Rather than relying on a fixed maximum filter count to allocate this array, get the maximum from the firmware when available. Move these filter related maximum counts queried from the firmware to the bnxt_hw_resc struct. If the firmware is not providing these maximum counts, fall back to the hard-coded constant. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com> Link: https://lore.kernel.org/r/20240205223202.25341-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-09 12:37:40 -08:00
Heiner Kallweit	6fb5dfee27	bnxt: convert EEE handling to use linkmode bitmaps Convert EEE handling to use linkmode bitmaps. This prepares for removing the legacy bitmaps from struct ethtool_keee. No functional change intended. When replacing _bnxt_fw_to_ethtool_adv_spds() with _bnxt_fw_to_linkmode(), remove the fw_pause argument because it's always passed as 0. Note: There's a discussion on whether the underlying implementation is correct, but it's independent of this mechanical conversion w/o functional change. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/9123bf18-a0d0-404e-a7c4-d6c466b4c5e8@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-08 19:06:01 -08:00
Russell King (Oracle)	0cbfdfe3fb	net: bcmasp: remove eee_enabled/eee_active in bcmasp_get_eee() bcmasp_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in bcmasp_get_eee() is redundant, and can be removed. This also makes intf->eee.eee_active unnecessary, so remove this and use a local variable where appropriate. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1rWbNC-002cCt-W7@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-07 09:03:38 -08:00
Russell King (Oracle)	409359c1c2	net: bcmgenet: remove eee_enabled/eee_active in bcmgenet_get_eee() bcmgenet_get_eee() sets edata->eee_active and edata->eee_enabled from its own copy, and then calls phy_ethtool_get_eee() which in turn will call genphy_c45_ethtool_get_eee(). genphy_c45_ethtool_get_eee() will overwrite eee_enabled and eee_active with its own interpretation from the PHYs settings and negotiation result. Therefore, setting these members in bcmgenet_get_eee() is redundant, and can be removed. This also makes priv->eee.eee_active unnecessary, so remove this and use a local variable where appropriate. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://lore.kernel.org/r/E1rWbN7-002cCn-RO@rmk-PC.armlinux.org.uk Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-07 09:03:37 -08:00
Heiner Kallweit	9bc791341b	tg3: convert EEE handling to use linkmode bitmaps Convert EEE handling to use linkmode bitmaps. This prepares for removing the legacy bitmaps from struct ethtool_keee. No functional change intended. Note: The change to mii_eee_cap1_mod_linkmode_t(tp->eee.advertised, val) in tg3_phy_autoneg_cfg() isn't completely obvious, but it doesn't change the current functionality. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Link: https://lore.kernel.org/r/0652b910-6bcc-421f-8769-38f7dae5037e@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-06 18:59:58 -08:00
Jakub Kicinski	cf244463a2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts or adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-02-01 15:12:37 -08:00
Heiner Kallweit	1d756ff13d	ethtool: add suffix _u32 to legacy bitmap members of struct ethtool_keee This is in preparation of using the existing names for linkmode bitmaps. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-31 12:30:47 +00:00
Heiner Kallweit	d80a523353	ethtool: replace struct ethtool_eee with a new struct ethtool_keee on kernel side In order to pass EEE link modes beyond bit 32 to userspace we have to complement the 32 bit bitmaps in struct ethtool_eee with linkmode bitmaps. Therefore, similar to ethtool_link_settings and ethtool_link_ksettings, add a struct ethtool_keee. In a first step it's an identical copy of ethtool_eee. This patch simply does a s/ethtool_eee/ethtool_keee/g for all users. No functional change intended. Suggested-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-31 12:30:47 +00:00
Kees Cook	5642c82b94	bnx2x: Fix firmware version string character counts A potential string truncation was reported in bnx2x_fill_fw_str(), when a long bp->fw_ver and a long phy_fw_ver might coexist, but seems unlikely with real-world hardware. Use scnprintf() to indicate the intent that truncations are tolerated. While reading this code, I found a collection of various buffer size counting issues. None looked like they might lead to a buffer overflow with current code (the small buffers are 20 bytes and might only ever consume 10 bytes twice with a trailing %NUL). However, early truncation (due to a %NUL in the middle of the string) might be happening under likely rare conditions. Regardless fix the formatters and related functions: - Switch from a separate strscpy() to just adding an additional "%s" to the format string that immediately follows it in bnx2x_fill_fw_str(). - Use sizeof() universally instead of using unbound defines. - Fix bnx2x_7101_format_ver() and bnx2x_null_format_ver() to report the number of characters written, not including the trailing %NUL (as already done with the other firmware formatting functions). - Require space for at least 1 byte in bnx2x_get_ext_phy_fw_version() for the trailing %NUL. - Correct the needed buffer size in bnx2x_3_seq_format_ver(). Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401260858.jZN6vD1k-lkp@intel.com/ Cc: Ariel Elior <aelior@marvell.com> Cc: Sudarsana Kalluru <skalluru@marvell.com> Cc: Manish Chopra <manishc@marvell.com> Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://lore.kernel.org/r/20240126041044.work.220-kees@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-26 21:01:30 -08:00
Breno Leitao	281cb9d65a	bnxt_en: Make PTP timestamp HWRM more silent commit `056bce63c4` ("bnxt_en: Make PTP TX timestamp HWRM query silent") changed a netdev_err() to netdev_WARN_ONCE(). netdev_WARN_ONCE() is it generates a kernel WARNING, which is bad, for the following reasons: * You do not a kernel warning if the firmware queries are late * In busy networks, timestamp query failures fairly regularly * A WARNING message doesn't bring much value, since the code path is clear. (This was discussed in-depth in [1]) Transform the netdev_WARN_ONCE() into a netdev_warn_once(), and print a more well-behaved message, instead of a full WARN(). bnxt_en 0000:67:00.0 eth0: TS query for TX timer failed rc = fffffff5 [1] Link: https://lore.kernel.org/all/ZbDj%2FFI4EJezcfd1@gmail.com/ Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Fixes: `056bce63c4` ("bnxt_en: Make PTP TX timestamp HWRM query silent") Link: https://lore.kernel.org/r/20240125134104.2045573-1-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-26 14:06:21 -08:00
Breno Leitao	39535d7ff6	net: fill in MODULE_DESCRIPTION()s for Broadcom bgmac W=1 builds now warn if module is built without a MODULE_DESCRIPTION(). Add descriptions to the Broadcom iProc GBit driver. Signed-off-by: Breno Leitao <leitao@debian.org> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20240123190332.677489-3-leitao@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-24 15:12:20 -08:00
Michael Chan	467739baf6	bnxt_en: Fix possible crash after creating sw mqprio TCs The driver relies on netdev_get_num_tc() to get the number of HW offloaded mqprio TCs to allocate and free TX rings. This won't work and can potentially crash the system if software mqprio or taprio TCs have been setup. netdev_get_num_tc() will return the number of software TCs and it may cause the driver to allocate or free more TX rings that it should. Fix it by adding a bp->num_tc field to store the number of HW offload mqprio TCs for the device. Use bp->num_tc instead of netdev_get_num_tc(). This fixes a crash like this: BUG: kernel NULL pointer dereference, address: 0000000000000000 PGD 42b8404067 P4D 0 Oops: 0000 [#1] PREEMPT SMP NOPTI CPU: 120 PID: 8661 Comm: ifconfig Kdump: loaded Tainted: G OE 5.18.16 #1 Hardware name: Lenovo ThinkSystem SR650 V3/SB27A92818, BIOS ESE114N-2.12 04/25/2023 RIP: 0010:bnxt_hwrm_cp_ring_alloc_p5+0x10/0x90 [bnxt_en] Code: 41 5c 41 5d 41 5e c3 cc cc cc cc 41 8b 44 24 08 66 89 03 eb c6 e8 b0 f1 7d db 0f 1f 44 00 00 41 56 41 55 41 54 55 48 89 fd 53 <48> 8b 06 48 89 f3 48 81 c6 28 01 00 00 0f b6 96 13 ff ff ff 44 8b RSP: 0018:ff65907660d1fa88 EFLAGS: 00010202 RAX: 0000000000000010 RBX: ff4dde1d907e4980 RCX: f400000000000000 RDX: 0000000000000010 RSI: 0000000000000000 RDI: ff4dde1d907e4980 RBP: ff4dde1d907e4980 R08: 000000000000000f R09: 0000000000000000 R10: ff4dde5f02671800 R11: 0000000000000008 R12: 0000000088888889 R13: 0500000000000000 R14: 00f0000000000000 R15: ff4dde5f02671800 FS: 00007f4b126b5740(0000) GS:ff4dde9bff600000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 000000416f9c6002 CR4: 0000000000771ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> bnxt_hwrm_ring_alloc+0x204/0x770 [bnxt_en] bnxt_init_chip+0x4d/0x680 [bnxt_en] ? bnxt_poll+0x1a0/0x1a0 [bnxt_en] __bnxt_open_nic+0xd2/0x740 [bnxt_en] bnxt_open+0x10b/0x220 [bnxt_en] ? raw_notifier_call_chain+0x41/0x60 __dev_open+0xf3/0x1b0 __dev_change_flags+0x1db/0x250 dev_change_flags+0x21/0x60 devinet_ioctl+0x590/0x720 ? avc_has_extended_perms+0x1b7/0x420 ? _copy_from_user+0x3a/0x60 inet_ioctl+0x189/0x1c0 ? wp_page_copy+0x45a/0x6e0 sock_do_ioctl+0x42/0xf0 ? ioctl_has_perm.constprop.0.isra.0+0xbd/0x120 sock_ioctl+0x1ce/0x2e0 __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x59/0x90 ? syscall_exit_work+0x103/0x130 ? syscall_exit_to_user_mode+0x12/0x30 ? do_syscall_64+0x69/0x90 ? exc_page_fault+0x62/0x150 Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240117234515.226944-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-19 21:15:13 -08:00
Michael Chan	c20f482129	bnxt_en: Prevent kernel warning when running offline self test We call bnxt_half_open_nic() to setup the chip partially to run loopback tests. The rings and buffers are initialized normally so that we can transmit and receive packets in loopback mode. That means page pool buffers are allocated for the aggregation ring just like the normal case. NAPI is not needed because we are just polling for the loopback packets. When we're done with the loopback tests, we call bnxt_half_close_nic() to clean up. When freeing the page pools, we hit a WARN_ON() in page_pool_unlink_napi() because the NAPI state linked to the page pool is uninitialized. The simplest way to avoid this warning is just to initialize the NAPIs during half open and delete the NAPIs during half close. Trying to skip the page pool initialization or skip linking of NAPI during half open will be more complicated. This fix avoids this warning: WARNING: CPU: 4 PID: 46967 at net/core/page_pool.c:946 page_pool_unlink_napi+0x1f/0x30 CPU: 4 PID: 46967 Comm: ethtool Tainted: G S W 6.7.0-rc5+ #22 Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.3.8 08/31/2021 RIP: 0010:page_pool_unlink_napi+0x1f/0x30 Code: 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 48 8b 47 18 48 85 c0 74 1b 48 8b 50 10 83 e2 01 74 08 8b 40 34 83 f8 ff 74 02 <0f> 0b 48 c7 47 18 00 00 00 00 c3 cc cc cc cc 66 90 90 90 90 90 90 RSP: 0018:ffa000003d0dfbe8 EFLAGS: 00010246 RAX: ff110003607ce640 RBX: ff110010baf5d000 RCX: 0000000000000008 RDX: 0000000000000000 RSI: ff110001e5e522c0 RDI: ff110010baf5d000 RBP: ff11000145539b40 R08: 0000000000000001 R09: ffffffffc063f641 R10: ff110001361eddb8 R11: 000000000040000f R12: 0000000000000001 R13: 000000000000001c R14: ff1100014553a080 R15: 0000000000003fc0 FS: 00007f9301c4f740(0000) GS:ff1100103fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f91344fa8f0 CR3: 00000003527cc005 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> ? __warn+0x81/0x140 ? page_pool_unlink_napi+0x1f/0x30 ? report_bug+0x102/0x200 ? handle_bug+0x44/0x70 ? exc_invalid_op+0x13/0x60 ? asm_exc_invalid_op+0x16/0x20 ? bnxt_free_ring.isra.123+0xb1/0xd0 [bnxt_en] ? page_pool_unlink_napi+0x1f/0x30 page_pool_destroy+0x3e/0x150 bnxt_free_mem+0x441/0x5e0 [bnxt_en] bnxt_half_close_nic+0x2a/0x40 [bnxt_en] bnxt_self_test+0x21d/0x450 [bnxt_en] __dev_ethtool+0xeda/0x2e30 ? native_queued_spin_lock_slowpath+0x17f/0x2b0 ? __link_object+0xa1/0x160 ? _raw_spin_unlock_irqrestore+0x23/0x40 ? __create_object+0x5f/0x90 ? __kmem_cache_alloc_node+0x317/0x3c0 ? dev_ethtool+0x59/0x170 dev_ethtool+0xa7/0x170 dev_ioctl+0xc3/0x530 sock_do_ioctl+0xa8/0xf0 sock_ioctl+0x270/0x310 __x64_sys_ioctl+0x8c/0xc0 do_syscall_64+0x3e/0xf0 entry_SYSCALL_64_after_hwframe+0x6e/0x76 Fixes: `294e39e0d0` ("bnxt: hook NAPIs to page pools") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240117234515.226944-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-19 21:15:13 -08:00
Michael Chan	523384a6aa	bnxt_en: Fix RSS table entries calculation for P5_PLUS chips The existing formula used in the driver to calculate the number of RSS table entries is to round up the number of RX rings to the next integer multiples of 64 (e.g. 64, 128, 192, ..). This is incorrect. The valid values supported by the chip are 64, 128, 256, 512 only (power of 2 starting from 64). When the number of RX rings is greater than 128, the entry size will likely be wrong. Firmware will round down the invalid value (e.g. 192 rounded down to 128) provided by the driver, causing some RSS rings to not receive any packets. We already have an existing function bnxt_calc_nr_ring_pages() to do this calculation. Use it in bnxt_get_nr_rss_ctxs() to calculate the number of RSS contexts correctly for P5_PLUS chips. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Fixes: `7b3af4f75b` ("bnxt_en: Add RSS support for 57500 chips.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240117234515.226944-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-19 21:15:13 -08:00
Michael Chan	2ad8e57338	bnxt_en: Fix memory leak in bnxt_hwrm_get_rings() bnxt_hwrm_get_rings() can abort and return error when there are not enough ring resources. It aborts without releasing the HWRM DMA buffer, causing a dma_pool_destroy warning when the driver is unloaded: bnxt_en 0000:99:00.0: dma_pool_destroy bnxt_hwrm, 000000005b089ba8 busy Fixes: `f1e50b276d` ("bnxt_en: Fix trimming of P5 RX and TX rings") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240117234515.226944-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-19 21:15:12 -08:00
Michael Chan	3c1069fa42	bnxt_en: Wait for FLR to complete during probe The first message to firmware may fail if the device is undergoing FLR. The driver has some recovery logic for this failure scenario but we must wait 100 msec for FLR to complete before proceeding. Otherwise the recovery will always fail. Fixes: `ba02629ff6` ("bnxt_en: log firmware status on firmware init failure") Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20240117234515.226944-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-19 21:15:12 -08:00
Michael Chan	d8214d0f01	bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer() Similar to the previous patch, RCU locking was released too early in bnxt_rx_flow_steer(). Fix it to unlock after reading fltr->base.sw_id to guarantee that fltr won't be freed while we are still reading it. Fixes: `cb5bdd292d` ("bnxt_en: Add bnxt_lookup_ntp_filter_from_idx() function") Reported-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/netdev/20231225165653.GH5962@kernel.org/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240105235439.28282-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-08 19:15:03 -08:00
Michael Chan	fd7769798d	bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel() After looking up an ntuple filter from a RCU hash list, the rcu_read_unlock() call should be made after reading the structure, or after determining that the filter cannot age out (by aRFS). The existing code was calling rcu_read_unlock() too early in bnxt_srxclsrldel(). As suggested by Simon Horman, change the code to handle the error case of fltr_base not found in the if condition. The code looks cleaner this way. Fixes: `8d7ba028aa` ("bnxt_en: Add support for ntuple filter deletion by ethtool.") Suggested-by: Simon Horman <horms@kernel.org> Reported-by: Jakub Kicinski <kuba@kernel.org> Link: https://lore.kernel.org/netdev/20240104145955.5a6df702@kernel.org/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240105235439.28282-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-08 19:15:02 -08:00
Michael Chan	1ef4cacaae	bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter() After recent refactoring, this function doesn't return error any more. Remove the unneeded rc variable and change the function to void. The caller is not checking for the return value. Fixes: `96c9bedc75` ("bnxt_en: Refactor L2 filter alloc/free firmware commands.") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202401041942.qrB1amZM-lkp@intel.com/ Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://lore.kernel.org/r/20240105235439.28282-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-08 19:15:02 -08:00
Jakub Kicinski	e63c1822ac	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/broadcom/bnxt/bnxt.c `e009b2efb7` ("bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters()") `0f2b214779` ("bnxt_en: Fix compile error without CONFIG_RFS_ACCEL") https://lore.kernel.org/all/20240105115509.225aa8a2@canb.auug.org.au/ Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-04 18:06:46 -08:00
Michael Chan	0f2b214779	bnxt_en: Fix compile error without CONFIG_RFS_ACCEL Fix the following compile error: .../bnxt.c: In function 'bnxt_cfg_ntp_filters': .../bnxt.c:14077:37: error: implicit declaration of function 'rps_may_expire_flow' [-Werror=implicit-function-declaration] 14077 \| if (rps_may_expire_flow(bp->dev, fltr->base.rxq, \| ^~~~~~~~~~~~~~~~~~~ bnxt_cfg_ntp_filters() is only used when CONFIG_RFS_ACCEL is enabled. User configured ntuple filters are directly added and will not go through this function. Wrap the body of bnxt_cfg_ntp_filters() with CONFIG_RFS_ACCEL. Fixes: `59cde76f33` ("bnxt_en: Refactor filter insertion logic in bnxt_rx_flow_steer().") Reported-by: Arnd Bergmann <arnd@kernel.org> Link: https://lore.kernel.org/netdev/20240103102332.3642417-1-arnd@kernel.org/ Reported-by: kernel test robot <lkp@intel.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 11:10:48 +00:00
Michael Chan	e009b2efb7	bnxt_en: Remove mis-applied code from bnxt_cfg_ntp_filters() The 2 lines to check for the BNXT_HWRM_PF_UNLOAD_SP_EVENT bit was mis-applied to bnxt_cfg_ntp_filters() and should have been applied to bnxt_sp_task(). Fixes: `1924136844` ("bnxt_en: Send PF driver unload notification to all VFs.") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-04 11:09:59 +00:00
Adrian Cinal	e584f2ff1e	net: bcmgenet: Fix FCS generation for fragmented skbuffs The flag DMA_TX_APPEND_CRC was only written to the first DMA descriptor in the TX path, where each descriptor corresponds to a single skbuff fragment (or the skbuff head). This led to packets with no FCS appearing on the wire if the kernel allocated the packet in fragments, which would always happen when using PACKET_MMAP/TPACKET (cf. tpacket_fill_skb() in net/af_packet.c). Fixes: `1c1008c793` ("net: bcmgenet: add main driver file") Signed-off-by: Adrian Cinal <adriancinal1@gmail.com> Acked-by: Doug Berger <opendmb@gmail.com> Acked-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231228135638.1339245-1-adriancinal1@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-01-02 16:19:41 -08:00
Michael Chan	8d7ba028aa	bnxt_en: Add support for ntuple filter deletion by ethtool. Add logic to delete a user specified ntuple filter from ethtool. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	c029bc30b9	bnxt_en: Add support for ntuple filters added from ethtool. Add support for adding user defined ntuple TCP/UDP filters. These filters are similar to aRFS filters except that they don't get aged. Source IP, destination IP, source port, or destination port can be unspecifed as wildcard. At least one of these tuples must be specifed. If a tuple is specified, the full mask must be specified. All ntuple related ethtool functions are now no longer compiled only for CONFIG_RFS_ACCEL. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	300c191800	bnxt_en: Add ntuple matching flags to the bnxt_ntuple_filter structure. aRFS filters match all 5 tuples. User defined ntuple filters may specify some of the tuples as wildcards. To support that, we add the ntuple_flags to the bnxt_ntuple_filter struct to specify which tuple fields are to be matched. The matching tuple fields will then be passed to the firmware in bnxt_hwrm_cfa_ntuple_filter_alloc() to create the proper filter. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	4faeadfd7e	bnxt_en: Refactor ntuple filter removal logic in bnxt_cfg_ntp_filters(). Refactor the logic into a new function bnxt_del_ntp_filters(). The same call will be used when the user deletes an ntuple filter. The bnxt_hwrm_cfa_ntuple_filter_free() function to call fw to free the ntuple filter is exported so that the ethtool logic can call it. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	80cfde29ce	bnxt_en: Refactor the hash table logic for ntuple filters. Generalize the ethtool logic that walks the ntuple hash table now that we have the common bnxt_filter_base structure. This will allow the code to easily extend to cover user defined ntuple or ether filters. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	59cde76f33	bnxt_en: Refactor filter insertion logic in bnxt_rx_flow_steer(). Add a new function bnxt_insert_ntp_filter() to insert the ntuple filter into the hash table and other basic setup. We'll use this function to insert a user defined filter from ethtool. Also, export bnxt_lookup_ntp_filter_from_idx() and bnxt_get_ntp_filter_idx() for similar purposes. All ntuple related functions are now no longer compiled only for CONFIG_RFS_ACCEL Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:28 +00:00
Michael Chan	ee908d05dd	bnxt_en: Add new BNXT_FLTR_INSERTED flag to bnxt_filter_base struct. Change the unused flag to BNXT_FLTR_INSERTED. To prepare for multiple pathways that an ntuple filter can be deleted, we add this flag. These filter structures can be retreived from the RCU hash table but only the caller that sees that the BNXT_FLTR_INSERTED flag is set can delete the filter structure and clear the flag under spinlock. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Michael Chan	cb5bdd292d	bnxt_en: Add bnxt_lookup_ntp_filter_from_idx() function Add the helper function to look up the ntuple filter from the hash index and use it in bnxt_rx_flow_steer(). The helper function will also be used by user defined ntuple filters in the next patches. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Pavan Chebbi	d3c982851c	bnxt_en: Add function to calculate Toeplitz hash For ntuple filters added by aRFS, the Toeplitz hash calculated by our NIC is available and is used to store the ntuple filter for quick retrieval. In the next patches, user defined ntuple filter support will be added and we need to calculate the same hash for these filters. The same hash function needs to be used so we can detect duplicates. Add the function bnxt_toeplitz() to calculate the Toeplitz hash for user defined ntuple filters. bnxt_toeplitz() uses the same Toeplitz key and the same key length as the NIC. bnxt_get_ntp_filter_idx() is added to return the hash index. For aRFS, the hash comes from the NIC. For user defined ntuple, we call bnxt_toeplitz() to calculate the hash index. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Michael Chan	96c9bedc75	bnxt_en: Refactor L2 filter alloc/free firmware commands. Refactor the L2 filter alloc/free logic so that these filters can be added/deleted by the user. The bp->ntp_fltr_bmap allocated size is also increased to allow enough IDs for L2 filters. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Michael Chan	bfeabf7e46	bnxt_en: Re-structure the bnxt_ntuple_filter structure. With the new bnxt_l2_filter structure, we can now re-structure the bnxt_ntuple_filter structure to point to the bnxt_l2_filter structure. We eliminate the L2 ether address info from the ntuple filter structure as we can get the information from the L2 filter structure. Note that the source L2 MAC address is no longer used. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Michael Chan	1f6e77cb9b	bnxt_en: Add bnxt_l2_filter hash table. The current driver only has an array of 4 additional L2 unicast addresses to support the netdev uc address list. Generalize and expand this infrastructure with an L2 address hash table so we can support an expanded list of unicast addresses (for bridges, macvlans, OVS, etc). The L2 hash table infrastructure will also allow more generalized n-tuple filter support. This patch creates the bnxt_l2_filter structure and the hash table. This L2 filter structure has the same bnxt_filter_base structure as used in the bnxt_ntuple_filter structure. All currently supported L2 filters will now have an entry in this new table. Note that L2 filters may be created for the VF. VF filters should not be freed when the PF goes down. Add some logic in bnxt_free_l2_filters() to allow keeping the VF filters or to free everything during rmmod. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Michael Chan	992d38d2b9	bnxt_en: Refactor bnxt_ntuple_filter structure. This is in preparation to support user defined L2 (ether) filters, which will have many similarities with ntuple filters. Refactor bnxt_ntuple_filter structure to have a bnxt_filter_base structure that can be re-used by the L2 filters. Reviewed-by: Vasundhara Volam <vasundhara-v.volam@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-01-02 13:52:27 +00:00
Paolo Abeni	56794e5358	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Adjacent changes: drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c `23c93c3b62` ("bnxt_en: do not map packet buffers twice") `6d1add9553` ("bnxt_en: Modify TX ring indexing logic.") tools/testing/selftests/net/Makefile `2258b66648` ("selftests: add vlan hw filter tests") `a0bc96c0cd` ("selftests: net: verify fq per-band packet limit") Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-12-21 22:17:23 +01:00
Andy Gospodarek	23c93c3b62	bnxt_en: do not map packet buffers twice Remove double-mapping of DMA buffers as it can prevent page pool entries from being freed. Mapping is managed by page pool infrastructure and was previously managed by the driver in __bnxt_alloc_rx_page before allowing the page pool infrastructure to manage it. Fixes: `578fcfd26e` ("bnxt_en: Let the page pool manage the DMA mapping") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: David Wei <dw@davidwei.uk> Link: https://lore.kernel.org/r/20231214213138.98095-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-15 10:14:40 -08:00
Jakub Kicinski	8f674972d6	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/intel/iavf/iavf_ethtool.c `3a0b5a2929` ("iavf: Introduce new state machines for flow director") `95260816b4` ("iavf: use iavf_schedule_aq_request() helper") https://lore.kernel.org/all/84e12519-04dc-bd80-bc34-8cf50d7898ce@intel.com/ drivers/net/ethernet/broadcom/bnxt/bnxt.c `c13e268c07` ("bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic") `c2f8063309` ("bnxt_en: Refactor RX VLAN acceleration logic.") `a7445d6980` ("bnxt_en: Add support for new RX and TPA_START completion types for P7") `1c7fd6ee2f` ("bnxt_en: Rename some macros for the P5 chips") https://lore.kernel.org/all/20231211110022.27926ad9@canb.auug.org.au/ drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.c `bd6781c18c` ("bnxt_en: Fix wrong return value check in bnxt_close_nic()") `84793a4995` ("bnxt_en: Skip nic close/open when configuring tstamp filters") https://lore.kernel.org/all/20231214113041.3a0c003c@canb.auug.org.au/ drivers/net/ethernet/mellanox/mlx5/core/fw_reset.c `3d7a3f2612` ("net/mlx5: Nack sync reset request when HotPlug is enabled") `cecf44ea1a` ("net/mlx5: Allow sync reset flow when BF MGT interface device is present") https://lore.kernel.org/all/20231211110328.76c925af@canb.auug.org.au/ No adjacent changes. Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-14 17:14:41 -08:00
Ahmed Zaki	fb6e30a725	net: ethtool: pass a pointer to parameters to get/set_rxfh ethtool ops The get/set_rxfh ethtool ops currently takes the rxfh (RSS) parameters as direct function arguments. This will force us to change the API (and all drivers' functions) every time some new parameters are added. This is part 1/2 of the fix, as suggested in [1]: - First simplify the code by always providing a pointer to all params (indir, key and func); the fact that some of them may be NULL seems like a weird historic thing or a premature optimization. It will simplify the drivers if all pointers are always present. - Then make the functions take a dev pointer, and a pointer to a single struct wrapping all arguments. The set_* should also take an extack. Link: https://lore.kernel.org/netdev/20231121152906.2dd5f487@kernel.org/ [1] Suggested-by: Jakub Kicinski <kuba@kernel.org> Suggested-by: Jacob Keller <jacob.e.keller@intel.com> Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com> Link: https://lore.kernel.org/r/20231213003321.605376-2-ahmed.zaki@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-13 22:07:16 -08:00
Pavan Chebbi	056bce63c4	bnxt_en: Make PTP TX timestamp HWRM query silent In a busy network, especially with flow control enabled, we may experience timestamp query failures fairly regularly. After a while, dmesg may be flooded with timestamp query failure error messages. Silence the error message from the low level hwrm function that sends the firmware message. Change netdev_err() to netdev_WARN_ONCE() if this FW call ever fails. Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-14-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:59 -08:00
Pavan Chebbi	84793a4995	bnxt_en: Skip nic close/open when configuring tstamp filters We don't have to close and open the nic to make sure we have valid rx timestamps. Once we have the timestamp filter applied to the HW and the timestamp_fld_format bit is cleared in the rx completion and the timestamp is non-zero, we can be sure that rx timestamp is valid data. Skip close/open when we set any timestamp filter. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Michael Chan	feeef68f6f	bnxt_en: Add support for UDP GSO on 5760X chips The new 5760X chips supports UDP GSO. Tested using udpgso_bench_tx. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Damodharam Ammepalli	6ce3062254	bnxt_en: add rx_filter_miss extended stats rx_filter_miss counter is newly added to the rx_port_stats_ext stats structure for newer chips. Newer firmware will return the structure size that includes this counter. Add this entry to the bnxt_port_stats_ext_arr array and the ethtool -S code will pick up this counter if it is supported. Signed-off-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Michael Chan	9600963344	bnxt_en: Configure UDP tunnel TPA On the new P7 chips, TPA for tunnel packets can be independently enabled for each VNIC. The default TPA configuration should not include UDP tunnels because the UDP ports for these tunnels are not known yet. The chip should not aggregate these UDP tunneled packets using default UDP ports until the ports are known. Add a new function bnxt_hwrm_vnic_update_tunl_tpa() to enable VXLAN and Geneve TPA if the corresponding UDP ports are known. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Michael Chan	77b0fff55d	bnxt_en: Add support for VXLAN GPE Add a new bnxt_udp_tunnels_p7 struct to support the new P7 chips that can parse VXLAN GPE packets. Add VXLAN GPE tunnel type handling to the .set_port() and .unset_port() functions. .ndo_features_check() is also enhanced to support VXLAN GPE which may encapsulate inner IP packets instead of ethernet packets. Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Michael Chan	e6f8a5a8ec	bnxt_en: Use proper TUNNEL_DST_PORT_ALLOC* commands In bnxt_udp_tunnel_set_port(), use the proper ALLOC commands instead of the FREE commands for correctness. The ALLOC and FREE commands happen to be identical so this is just a cosmetic fix for correctness. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Selvin Xavier	297e625bf8	bnxt_en: Allocate extra QP backing store memory when RoCE FW reports it The Fast QP modify destroy RoCE feature requires additional QP entries in QP context backing store. FW reports the extra count to be allocated during backing store query. Use this value and allocate extra memory. Note that this works for both the V1 and V1 backing store FW APIs. Signed-off-by: Selvin Xavier <selvin.xavier@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:58 -08:00
Michael Chan	6dea3ebe0d	bnxt_en: Support TX coalesced completion on 5760X chips TX coalesced completions are supported on newer chips to provide one TX completion record for multiple TX packets up to the sq_cons_idx in the completion record. This method saves PCIe bandwidth by reducing the number of TX completions. Only very minor changes are now required to support this mode with the new framework that handles TX completions based on the consumer indices. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:57 -08:00
Michael Chan	f12f551b5b	bnxt_en: Prevent TX timeout with a very small TX ring If xmit_more condition is true, the driver may set the TX_BD_FLAGS_NO_CMPL flag. If after this packet, the TX ring can no longer hold a packet with maximum fragments, we will stop the TX queue. When this happens, we must clear the TX_BD_FLAGS_NO_CMPL flag on the last packet or there will be no completion and cause TX timeout. Fixes: `c1056a59ae` ("bnxt_en: Optimize xmit_more TX path") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:57 -08:00
Michael Chan	18fe0a383c	bnxt_en: Fix TX ring indexing logic Two spots were missed when modifying the TX ring indexing logic. The use of unmasked TX index in bnxt_tx_int() will cause unnecessary __bnxt_tx_int() calls. The same issue in bnxt_tx_int_xdp() can result in illegal array index. Fixes: `6d1add9553` ("bnxt_en: Modify TX ring indexing logic.") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:57 -08:00
Somnath Kotur	7fb17a0c18	bnxt_en: Fix AGG ring check logic in bnxt_check_rings() _bnxt_get_max_rings() that is invoked in bnxt_check_rings() already accounts for the AGG ring(s) and gives a max value based on that. Increasing for AGG rings before calling _bnxt_get_max_rings() will result in checking for twice the number of rings than required and it can fail. Fix it by adjusting for AGG rings after calling _bnxt_get_max_rings(). Fixes: `f5b29c6afe` ("bnxt_en: Add helper to get the number of CP rings required for TX rings") Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:57 -08:00
Michael Chan	f1e50b276d	bnxt_en: Fix trimming of P5 RX and TX rings The recent commit to trim the RX and TX rings on P5 chips by assigning each with max CP rings divided by 2 is not correct. Max CP rings divided by 2 may be bigger than the original RX or TX and would lead to failure. In other words, we may be checking for increased RX/TX rings than required and it may fail. Fix it by calling __bnxt_trim_rings() instead that would properly trim RX and TX without the possibility of increasing their values. Fixes: `f5b29c6afe` ("bnxt_en: Add helper to get the number of CP rings required for TX rings") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231212005122.2401-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-12 16:05:57 -08:00
Michael Chan	c13e268c07	bnxt_en: Fix HWTSTAMP_FILTER_ALL packet timestamp logic When the chip is configured to timestamp all receive packets, the timestamp in the RX completion is only valid if the metadata present flag is not set for packets received on the wire. In addition, internal loopback packets will never have a valid timestamp and the timestamp field will always be zero. We must exclude any 0 value in the timestamp field because there is no way to determine if it is a loopback packet or not. Add a new function bnxt_rx_ts_valid() to check for all timestamp valid conditions. Fixes: `66ed81dced` ("bnxt_en: Enable packet timestamping for all RX packets") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231208001658.14230-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-08 17:20:26 -08:00
Kalesh AP	bd6781c18c	bnxt_en: Fix wrong return value check in bnxt_close_nic() The wait_event_interruptible_timeout() function returns 0 if the timeout elapsed, -ERESTARTSYS if it was interrupted by a signal, and the remaining jiffies otherwise if the condition evaluated to true before the timeout elapsed. Driver should have checked for zero return value instead of a positive value. MChan: Print a warning for -ERESTARTSYS. The close operation will proceed anyway when wait_event_interruptible_timeout() returns for any reason. Since we do the close no matter what, we should not return this error code to the caller. Change bnxt_close_nic() to a void function and remove all error handling from some of the callers. Fixes: `c0c050c58d` ("bnxt_en: New Broadcom ethernet driver.") Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231208001658.14230-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-08 17:20:26 -08:00
Sreekanth Reddy	aded5d1feb	bnxt_en: Fix skb recycling logic in bnxt_deliver_skb() Receive SKBs can go through the VF-rep path or the normal path. skb_mark_for_recycle() is only called for the normal path. Fix it to do it for both paths to fix possible stalled page pool shutdown errors. Fixes: `86b05508f7` ("bnxt_en: Use the unified RX page pool buffers for XDP and non-XDP") Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Vikas Gupta <vikas.gupta@broadcom.com> Signed-off-by: Sreekanth Reddy <sreekanth.reddy@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231208001658.14230-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-08 17:20:26 -08:00
Somnath Kotur	9ef7c58f5a	bnxt_en: Clear resource reservation during resume We are issuing HWRM_FUNC_RESET cmd to reset the device including all reserved resources, but not clearing the reservations within the driver struct. As a result, when the driver re-initializes as part of resume, it believes that there is no need to do any resource reservation and goes ahead and tries to allocate rings which will eventually fail beyond a certain number pre-reserved by the firmware. Fixes: `674f50a5b0` ("bnxt_en: Implement new method to reserve rings.") Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231208001658.14230-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-08 17:20:26 -08:00
Jakub Kicinski	2483e7f04c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. Conflicts: drivers/net/ethernet/stmicro/stmmac/dwmac5.c drivers/net/ethernet/stmicro/stmmac/dwmac5.h drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c drivers/net/ethernet/stmicro/stmmac/hwif.h `37e4b8df27` ("net: stmmac: fix FPE events losing") `c3f3b97238` ("net: stmmac: Refactor EST implementation") https://lore.kernel.org/all/20231206110306.01e91114@canb.auug.org.au/ Adjacent changes: net/ipv4/tcp_ao.c `9396c4ee93` ("net/tcp: Don't store TCP-AO maclen on reqsk") `7b0f570f87` ("tcp: Move TCP-AO bits from cookie_v[46]_check() to tcp_ao_syncookie().") Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-07 17:53:17 -08:00
Dinghao Liu	d007caaaf0	net: bnxt: fix a potential use-after-free in bnxt_init_tc When flow_indr_dev_register() fails, bnxt_init_tc will free bp->tc_info through kfree(). However, the caller function bnxt_init_one() will ignore this failure and call bnxt_shutdown_tc() on failure of bnxt_dl_register(), where a use-after-free happens. Fix this issue by setting bp->tc_info to NULL after kfree(). Fixes: `627c89d00f` ("bnxt_en: flow_offload: offload tunnel decap rules via indirect callbacks") Signed-off-by: Dinghao Liu <dinghao.liu@zju.edu.cn> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Link: https://lore.kernel.org/r/20231204024004.8245-1-dinghao.liu@zju.edu.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-05 19:43:42 -08:00
Jakub Kicinski	e3b57ffdb3	eth: bnxt: link NAPI instances to queues and IRQs Make bnxt compatible with the newly added netlink queue GET APIs. Signed-off-by: Amritha Nambiar <amritha.nambiar@intel.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/170147336340.5260.6773000274196548907.stgit@anambiarhost.jf.intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 18:04:06 -08:00
Michael Chan	2012a6abc8	bnxt_en: Add 5760X (P7) PCI IDs Now with basic support for the new chip family, add the PCI IDs of the new devices. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-16-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:49 -08:00
Michael Chan	047a2d38e4	bnxt_en: Report the new ethtool link modes in the new firmware interface Add new look up entries to convert the new supported speeds, advertised speeds, etc to ethtool link modes. Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-15-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	7b60cf2b64	bnxt_en: Support force speed using the new HWRM fields Modify bnxt_force_link_speed() to support the new speeds stored in link_info->support_speeds2, including the new 400G speed. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-14-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	30c0bb63c2	bnxt_en: Support new firmware link parameters Newer firmware supporting PAM4 112Gbps speeds use new parameters in firmware message structures. Detect the new firmware capability and add basic logic to report and store these new fields. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	cf47fa5ca5	bnxt_en: Refactor ethtool speeds logic Add helper functions to refactor the logic that converts firmware speed masks to ethtool speeds. Pass the phy_flags to bnxt_get_ethtool_speeds() and the call chain. The refactoring and the phy_flags will be needed when adding support for the new speeds in the next patches. Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	a7445d6980	bnxt_en: Add support for new RX and TPA_START completion types for P7 These new completion types are supported on the new P7 chips. These new types have commonalities with the legacy types. After the refactoring, we mainly have to add new functions to handle the the new meta data formats and the RX hash information in the new types. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	39b2e62be3	bnxt_en: Refactor and refine bnxt_tpa_start() and bnxt_tpa_end(). Refactor bnxt_tpa_start() by adding bnxt_tpa_metadata() to gather the metadata from the TPA_START completion. This makes it easier to support the new P7 chip which has a modified TPA_START completion structure with different metadata formats. We also add vlan_valid and cfa_code_valid fields to the bnxt_tpa_info structure so that the VLAN and VF rep logic can be common for all chips. The VLAN metadata is now collected in bnxt_tpa_start() only when it is valid and the vlan_valid field will be set. bnxt_tpa_end() can now use common VLAN logic for all chips. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	c2f8063309	bnxt_en: Refactor RX VLAN acceleration logic. Refactor the logic in the RX path that checks for the accelerated VLAN tag by adding a new function. This will make it easier to support the new receive logic on P7 chips. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:48 -08:00
Michael Chan	13d2d3d381	bnxt_en: Add new P7 hardware interface definitions Add new RX, TX, and TPA hardware interface structures and macros for the P7 chips. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Ajit Khaparde	8243345bfa	bnxt_en: Refactor RSS capability fields Add a new rss_cap field in the per device struct bnxt and move all the RSS capability fields there. It will be easier to add new RSS capabilities for the new P7 chips. Signed-off-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Michael Chan	d846992e63	bnxt_en: Implement the new toggle bit doorbell mechanism on P7 chips The new chip family passes the Toggle bits to the driver in the NQE notification. The driver now stores this value and sends it back to hardware when it re-arms the RX and TX CQs. Together with the earlier patch that guarantees the driver will only re-arm the CQ at the end of NAPI polling if it has seen a new NQE, this method allows the hardware to detect any dropped doorbells. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Hongguang Gao	d3c16475dc	bnxt_en: Consolidate DB offset calculation The doorbell offset on P5 chips is hard coded. On the new P7 chips, it is returned by the firmware. Simplify the logic that determines this offset and store it in a new db_offset field in struct bnxt. Also, provide this offset to the RoCE driver in struct bnxt_en_dev. Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Michael Chan	a432a45bdb	bnxt_en: Define basic P7 macros Repurpose the BNXT_FLAG_CHIP_SR2 flag by renaming it to BNXT_FLAG_CHIP_P7 since the SR2 chip never went to production. The SR2 statictics structure is also renamed for the P7 chip. Define the basic P7 doorbell bits (Epoch. Toggle, etc) and implement the Epoch bit logic. The next higher bit beyond the legal doorbell mask is the Epoch bit used for doorbells on P7 chips. This bit is used by the chip to detect dropped doorbells. The 57608 chip ID belonging to the P7 family is also defined. Note that the PCI ID is not added until the last patch in the series. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Michael Chan	397d44bf17	bnxt_en: Update firmware interface to 1.10.3.15 This updated interface supports the new 5760X P7 chip family. It has the changes to support the new link speeds/modes and other changes for the basic L2 features. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Michael Chan	08b386b132	bnxt_en: Fix backing store V2 logic The current code determines the last backing store valid type during bnxt_hwrm_func_backing_store_qcaps_v2(). In effect, the last type is determined based on what firmware advertises. The more correct way is to determine it based on what the driver is configuring. The driver may not configure all the backing store types advertised by firmware. Move the logic to determine the last type to bnxt_backing_store_cfg_v2(). We need to pass the legacy enable flags to the function in case only the legacy types are being configured. Fixes: `236e237f8f` ("bnxt_en: Add support for HWRM_FUNC_BACKING_STORE_CFG_V2 firmware calls") Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201223924.26955-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-04 15:12:47 -08:00
Thinh Tran	16b55b1f22	net/tg3: fix race condition in tg3_reset_task() When an EEH error is encountered by a PCI adapter, the EEH driver modifies the PCI channel's state as shown below: enum { /* I/O channel is in normal state / pci_channel_io_normal = (__force pci_channel_state_t) 1, / I/O to channel is blocked / pci_channel_io_frozen = (__force pci_channel_state_t) 2, / PCI card is dead */ pci_channel_io_perm_failure = (__force pci_channel_state_t) 3, }; If the same EEH error then causes the tg3 driver's transmit timeout logic to execute, the tg3_tx_timeout() function schedules a reset task via tg3_reset_task_schedule(), which may cause a race condition between the tg3 and EEH driver as both attempt to recover the HW via a reset action. EEH driver gets error event --> eeh_set_channel_state() and set device to one of error state above scheduler: tg3_reset_task() get returned error from tg3_init_hw() --> dev_close() shuts down the interface tg3_io_slot_reset() and tg3_io_resume() fail to reset/resume the device To resolve this issue, we avoid the race condition by checking the PCI channel state in the tg3_reset_task() function and skip the tg3 driver initiated reset when the PCI channel is not in the normal state. (The driver has no access to tg3 device registers at this point and cannot even complete the reset task successfully without external assistance.) We'll leave the reset procedure to be managed by the EEH driver which calls the tg3_io_error_detected(), tg3_io_slot_reset() and tg3_io_resume() functions as appropriate. Adding the same checking in tg3_dump_state() to avoid dumping all device registers when the PCI channel is not in the normal state. Signed-off-by: Thinh Tran <thinhtr@linux.vnet.ibm.com> Tested-by: Venkata Sai Duggi <venkata.sai.duggi@ibm.com> Reviewed-by: David Christensen <drc@linux.vnet.ibm.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231201001911.656-1-thinhtr@linux.vnet.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-12-01 16:31:36 -08:00
Jakub Kicinski	7cc9e6d77f	eth: link netdev to page_pools in drivers Link page pool instances to netdev for the drivers which already link to NAPI. Unless the driver is doing something very weird per-NAPI should imply per-netdev. Add netsec as well, Ilias indicates that it fits the mold. Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-28 15:48:39 +01:00
Randy Schacher	1c7fd6ee2f	bnxt_en: Rename some macros for the P5 chips In preparation to support a new P7 chip which has a lot of similarities with the P5 chip, rename the BNXT_FLAG_CHIP_P5 flag to BNXT_FLAG_CHIP_P5_PLUS. This will make it clear that the flag is for P5 and newer chips. Also, since there are no additional P5 variants in production, rename BNXT_FLAG_CHIP_P5_THOR() to BNXT_FLAG_CHIP_P5() to keep the naming more simple. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Randy Schacher <stuart.schacher@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-14-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	f94471f3ce	bnxt_en: Modify the NAPI logic for the new P7 chips Modify the NAPI logic for the new doorbell mechanism on P7 chips. These changes are compatible with the current P5 chips. In the current logic, bnxt_poll_p5() services 1 or more CQs for each MSIX. Each MSIX has an associated NQ and each NQ has 1 or more associated CQs. If any CQ reaches NAPI budget, we'll stay in polling mode and will unconditionally check and service all CQs until we exit polling. We always re-arm all CQs when we exit polling. To be compatible with the new Toggle bit mechanism in P7 chips, we need to modify the logic so that we service and re-arm the CQ only if we receive an NQE notification for work for that CQ. We add a new had_nqe_notify bit to the cp_ring_info structure and it gets set when we see the NQE notification for that CQ anytime during polling. We'll service and re-arm only the CQs with the had_nqe_notify bits set. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-13-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	c09d22674b	bnxt_en: Modify RX ring indexing logic. Modify the RX indexing logic for both RX ring and RX aggregation ring just like the TX logic. Change it so that the index increments unbounded and mask it only when needed. Modify the existing RX macros so that the index is not masked. Add new macros RING_RX()/RING_RX_AGG() to mask it only when needed to get the index of rxr->rx_buf_ring[] and rxr->rx_agg_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-12-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	6d1add9553	bnxt_en: Modify TX ring indexing logic. Change the TX ring logic so that the index increments unbounded and mask it only when needed. Modify the existing macros so that the index is not masked. Add a new macro RING_TX() to mask it only when needed to get the index of txr->tx_buf_ring[]. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-11-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:49 -08:00
Michael Chan	b9e0c47ee2	bnxt_en: Add db_ring_mask and related macro to bnxt_db_info struct. This allows the doorbell related logic to mask the doorbell index to the proper range before writing the doorbell. The current code masks the doorbell index immediately to keep it in the legal ranges for the most part. Subsequent patches will change the logic so that the index increments unbounded and it only gets masked before use. This is preparation work for the new chip that requires an additional Epoch bit in the doorbell that needs to toggle when the index has wrapped around. This patch just adds the basic infrastructure and the logic is largely unchanged. We now replace RING_CMP() with the new DB_RING_IDX() at appropriate places where we mask the completion ring index before writing the doorbell. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-10-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	236e237f8f	bnxt_en: Add support for HWRM_FUNC_BACKING_STORE_CFG_V2 firmware calls Newer chips starting with 57600 will use this new firmware HWRM call to configure backing store memory. Add this new call if it is supported by the firmware. Reviewed-by: Hongguang Gao <hongguang.gao@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-9-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	6a4d0774f0	bnxt_en: Add support for new backing store query firmware API Use the new v2 firmware API if supported by the firmware. We now have the infrastructure to support the v2 API. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-8-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	b098dc5a33	bnxt_en: Add bnxt_setup_ctxm_pg_tbls() helper function In bnxt_alloc_ctx_mem(), the logic to set up the context memory entries and to allocate the context memory tables is done repetitively. Add a helper function to simplify the code. The setup of the Fast Path TQM entries relies on some information from the Slow Path TQM entries. Copy the SP_TQM entries to the FP_TQM entries to simplify the logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	2ad67aea11	bnxt_en: Use the pg_info field in bnxt_ctx_mem_type struct Use the newly added pg_info field in bnxt_ctx_mem_type struct and remove the standalone page info structures in bnxt_ctx_mem_info. This now completes the reorganization of the context memory structures to work better with the new and more flexible firmware interface for newer chips. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	035c576159	bnxt_en: Add page info to struct bnxt_ctx_mem_type This will further improve the organization of the bnxt_ctx_mem_info structure by moving the standalone page info structures into the bnxt_ctx_mem_type array. Add the allocation and free logic first and the next patch will migrate to use the new infrastructure. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	76087d997a	bnxt_en: Restructure context memory data structures The current code uses a flat bnxt_ctx_mem_info structure to store 8 types of context memory for the NIC. All the context memory types are very similar and have similar parameters. They can all share a common structure to improve the organization. Also, new firmware interface will provide a new API to retrieve each type of context memory by calling the API repeatedly. This patch reorganizes the bnxt_ctx_mem_info structure to fit better with the new firmware interface. It will also work with the legacy firmware interface. The flat fields in bnxt_ctx_mem_info are replaced by the bnxt_ctx_mem_type array. The bnxt_mem_init array info will no longer be needed. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:48 -08:00
Michael Chan	e50dc4c220	bnxt_en: Free bp->ctx inside bnxt_free_ctx_mem() We always free bp->ctx right after calling bnxt_free_ctx_mem(), so just free it at the end of that function to make things simpler. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:47 -08:00
Michael Chan	aa8460bacf	bnxt_en: The caller of bnxt_alloc_ctx_mem() should always free bp->ctx bnxt_alloc_ctx_mem() calls bnxt_hwrm_func_backing_store_qcaps() to allocate the memory for bp->ctx. Initialize bp->ctx with the allocated memory and let the caller free it during unwind. The unwind logic is already there, we just need to always set bp->ctx to the allocated memory so the caller will always free it. This simplifies the logic and makes it easier to expand on the backing store logic. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231120234405.194542-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-21 17:32:47 -08:00
Paolo Abeni	56eddc3cb1	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Cross-merge networking fixes after downstream PR. No conflicts. Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2023-11-16 16:05:44 +01:00
Michael Chan	c1056a59ae	bnxt_en: Optimize xmit_more TX path Now that we use the cumulative consumer index scheme for TX completion, we don't need to have one TX completion per TX packet in the xmit_more code path. Set the TX_BD_FLAGS_NO_CMPL flag if xmit_more is true. Fallback to one interrupt per packet if the ring is filled beyond bp->tx_wake_thresh. Also, move the wmb() to bnxt_txr_db_kick(). When xmit_more is true, we'll skip the bnxt_txr_db_kick() call and there is no need to call wmb() to sync. the TX BD data. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	ba09801779	bnxt_en: Use existing MSIX vectors for all mqprio TX rings We can now fully support sharing the same MSIX for all mqprio TX rings belonging to the same ethtool channel with the new infrastructure: 1. Allocate the proper entries for cp_ring_arr in struct bnxt_cp_ring_info to support the additional TX rings. 2. Populate the tx_ring array in struct bnxt_napi for all TX rings sharing the same NAPI. 3. bnxt_num_tx_to_cp() returns the proper NQ/completion rings to support the TX rings in the input. 4. Adjust bnxt_get_num_ring_stats() for the reduced number of ring counters with the new scheme. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	f07b58801b	bnxt_en: Add macros related to TC and TX rings Add 3 macros that handle to conversions between TC numbers and TX ring numbers. These will help to clarify the existing logic and the new logic in the next patch. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	f5b29c6afe	bnxt_en: Add helper to get the number of CP rings required for TX rings Up until now, each TX ring always requires a completion ring/NQ/MSIX. bnxt_trim_rings() and the assignment of bp->cp_nr_rings always make this assumption. This will no longer be true in the next patches, so we refactor and add helper functions to determine the proper relationship between TX rings and the required completion ring/NQ/MSIX. This patch does not change the 1:1 relationship yet. Note that on P5 chips, each RX and TX ring still requires a completion ring. Only the number of NQs has been reduced. We should no longer call bnxt_trim_rings() to adjust the RX and TX rings on P5 chips. Replace with simple logic to check that RX + TX < CP and adjust accordingly. bnxt_check_rings() should call _bnxt_get_max_rings() to get the raw number of rings instead of bnxt_get_max_rings(). If we are about to create TCs, bnxt_get_max_rings() would not be able to calculate the max rings correctly. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	0589a1ed4d	bnxt_en: Support up to 8 TX rings per MSIX For each mqprio TC, we allocate a set of TX rings to map to the new hardware CoS queue. Expand the tx_ring pointer in struct bnxt_napi to an array of 8 to support up to 8 TX rings, one for each TC. Only array entry 0 is used at this time. The rest of the array entries will be used in later patches. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	877edb3473	bnxt_en: Refactor bnxt_hwrm_set_coal() Add 2 helper functions to set coalescing for each RX and TX rings. This will make it easier to expand the number of TX rings per MSIX in the next patches. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:40 +00:00
Michael Chan	5a3c585fa8	bnxt_en: New encoding for the TX opaque field In order to support multiple TX rings on the same MSIX, we'll use the upper byte of the TX opaque field to store the ring index in the new tx_napi_idx field. This tx_napi_idx field is currently always 0 until more infrastructure is added in later patches. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	ebf72319ce	bnxt_en: Refactor bnxt_tx_int() bnxt_tx_int() processes the only one TX ring from the bnxt_napi pointer. To prepare for more TX rings associated with the bnxt_napi structure, add a new __bnxt_tx_int() function that takes the bnxt_tx_ring_info pointer to process that one TX ring. No functional change. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	9c0b06de6f	bnxt_en: Remove BNXT_RX_HDL and BNXT_TX_HDL These 2 constants were used for the one RX and one TX completion ring pointer in the cpr->cp_ring_arr fixed array. Now that we've changed to allocating the array for the exact number of entries to support more TX rings, we no longer use these constants. The array index as well as the type of completion ring (RX/TX) are now encoded in the handle for the completion ring. This will allow us to locate the completion ring during NAPI for any number of completion rings sharing the same MSIX. In the following patches, we'll be adding support for more TX rings associated with the same MSIX vector. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	7845b8dfc7	bnxt_en: Add completion ring pointer in TX and RX ring structures From the TX or RX ring structure, we need to find the corresponding completion ring during initialization. On P5 chips, we use the MSIX/napi entry to locate the completion ring because there is only one RX/TX ring per MSIX. To allow multiple TX rings for each MSIX, we need to add a direct pointer from the TX ring and RX ring structures. This also simplifies the existing logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	d1eec61410	bnxt_en: Restructure cp_ring_arr in struct bnxt_cp_ring_info The cp_ring_arr is currently a fixed array of 2 pointers for the TX and RX completion rings. These pointers are allocated during ring initialization. Currntly, we support up to 2 completion rings for each MSIX. In order to support more completion rings, we change this fixed array to a pointer and allocate the required entries during ring initialization. This patch keeps the current scheme of allocating only 2 entries when needed. Later patches will expand and allocate more entries when required. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	7f0a168b04	bnxt_en: Add completion ring pointer in TX and RX ring structures From the TX or RX ring structure, we need to find the corresponding completion ring during initialization. On P5 chips, we use the MSIX/napi entry to locate the completion ring because there is only one RX/TX ring per MSIX. To allow multiple TX rings for each MSIX, we need to add a direct pointer from the TX ring and RX ring structures. This also simplifies the existing logic. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Michael Chan	34eec1f29a	bnxt_en: Put the TX producer information in the TX BD opaque field Currently, the opaque field in the TX BD is only used for debugging. The TX completion logic relies on getting one TX completion for each packet and they always complete in order. Improve this scheme by putting the producer information (ring index plus number of BDs for the packet) in the opaque field. This way, we can handle TX completion processing by looking at the last TX completion instead of counting the number of completions. Since we no longer need to count the exact number of completions, we can optimize xmit_more by disabling TX completion when the xmit_more condition is true. This will be done in later patches. This patch is only initializing the opaque field in the TX BD and is not changing the driver's TX completion logic yet. Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-15 10:07:39 +00:00
Alex Pakhunov	17dd5efe5f	tg3: Increment tx_dropped in tg3_tso_bug() tg3_tso_bug() drops a packet if it cannot be segmented for any reason. The number of discarded frames should be incremented accordingly. Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com> Signed-off-by: Vincent Wong <vincent.wong2@spacex.com> Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Link: https://lore.kernel.org/r/20231113182350.37472-2-alexey.pakhunov@spacex.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-14 19:50:23 -08:00
Alex Pakhunov	907d1bdb8b	tg3: Move the [rt]x_dropped counters to tg3_napi This change moves [rt]x_dropped counters to tg3_napi so that they can be updated by a single writer, race-free. Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com> Signed-off-by: Vincent Wong <vincent.wong2@spacex.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231113182350.37472-1-alexey.pakhunov@spacex.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-14 19:50:23 -08:00
Alex Pakhunov	c542b39b60	tg3: Fix the TX ring stall The TX ring maintained by the tg3 driver can end up in the state, when it has packets queued for sending but the NIC hardware is not informed, so no progress is made. This leads to a multi-second interruption in network traffic followed by dev_watchdog() firing and resetting the queue. The specific sequence of steps is: 1. tg3_start_xmit() is called at least once and queues packet(s) without updating tnapi->prodmbox (netdev_xmit_more() returns true) 2. tg3_start_xmit() is called with an SKB which causes tg3_tso_bug() to be called. 3. tg3_tso_bug() determines that the SKB is too large, ... if (unlikely(tg3_tx_avail(tnapi) <= frag_cnt_est)) { ... stops the queue, and returns NETDEV_TX_BUSY: netif_tx_stop_queue(txq); ... if (tg3_tx_avail(tnapi) <= frag_cnt_est) return NETDEV_TX_BUSY; 4. Since all tg3_tso_bug() call sites directly return, the code updating tnapi->prodmbox is skipped. 5. The queue is stuck now. tg3_start_xmit() is not called while the queue is stopped. The NIC is not processing new packets because tnapi->prodmbox wasn't updated. tg3_tx() is not called by tg3_poll_work() because the all TX descriptions that could be freed has been freed: /* run TX completion thread */ if (tnapi->hw_status->idx[0].tx_consumer != tnapi->tx_cons) { tg3_tx(tnapi); 6. Eventually, dev_watchdog() fires triggering a reset of the queue. This fix makes sure that the tnapi->prodmbox update happens regardless of the reason tg3_start_xmit() returned. Signed-off-by: Alex Pakhunov <alexey.pakhunov@spacex.com> Signed-off-by: Vincent Wong <vincent.wong2@spacex.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-11-07 22:19:16 +00:00
George Shuklin	9fc3bc7643	tg3: power down device only on SYSTEM_POWER_OFF Dell R650xs servers hangs on reboot if tg3 driver calls tg3_power_down. This happens only if network adapters (BCM5720 for R650xs) were initialized using SNP (e.g. by booting ipxe.efi). The actual problem is on Dell side, but this fix allows servers to come back alive after reboot. Signed-off-by: George Shuklin <george.shuklin@gmail.com> Fixes: `2ca1c94ce0` ("tg3: Disable tg3 device on system reboot to avoid triggering AER") Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231103115029.83273-1-george.shuklin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-11-06 17:23:42 -08:00
Linus Torvalds	1e0c505e13	asm-generic updates for v6.7 The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmVC40IACgkQYKtH/8kJ Uidhmw/9EX+aWSXGoObJ3fngaNSMw+PmrEuP8qEKBHxfKHcCdX3hc451Oh4GlhaQ tru91pPwgNvN2/rfoKusxT+V4PemGIzfNni/04rp+P0kvmdw5otQ2yNhsQNsfVmq XGWvkxF4P2GO6bkjjfR/1dDq7GtlyXtwwPDKeLbYb6TnJOZjtx+EAN27kkfSn1Ms R4Sa3zJ+DfHUmHL5S9g+7UD/CZ5GfKNmIskI4Mz5GsfoUz/0iiU+Bge/9sdcdSJQ kmbLy5YnVzfooLZ3TQmBFsO3iAMWb0s/mDdtyhqhTVmTUshLolkPYyKnPFvdupyv shXcpEST2XJNeaDRnL2K4zSCdxdbnCZHDpjfl9wfioBg7I8NfhXKpf1jYZHH1de4 LXq8ndEFEOVQw/zSpYWfQq1sux8Jiqr+UK/ukbVeFWiGGIUs91gEWtPAf8T0AZo9 ujkJvaWGl98O1g5wmBu0/dAR6QcFJMDfVwbmlIFpU8O+MEaz6X8mM+O5/T0IyTcD eMbAUjj4uYcU7ihKzHEv/0SS9Of38kzff67CLN5k8wOP/9NlaGZ78o1bVle9b52A BdhrsAefFiWHp1jT6Y9Rg4HOO/TguQ9e6EWSKOYFulsiLH9LEFaB9RwZLeLytV0W vlAgY9rUW77g1OJcb7DoNv33nRFuxsKqsnz3DEIXtgozo9CzbYI= =H1vH -----END PGP SIGNATURE----- Merge tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic Pull ia64 removal and asm-generic updates from Arnd Bergmann: - The ia64 architecture gets its well-earned retirement as planned, now that there is one last (mostly) working release that will be maintained as an LTS kernel. - The architecture specific system call tables are updated for the added map_shadow_stack() syscall and to remove references to the long-gone sys_lookup_dcookie() syscall. * tag 'asm-generic-6.7' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: hexagon: Remove unusable symbols from the ptrace.h uapi asm-generic: Fix spelling of architecture arch: Reserve map_shadow_stack() syscall number for all architectures syscalls: Cleanup references to sys_lookup_dcookie() Documentation: Drop or replace remaining mentions of IA64 lib/raid6: Drop IA64 support Documentation: Drop IA64 from feature descriptions kernel: Drop IA64 support from sig_fault handlers arch: Remove Itanium (IA-64) architecture	2023-11-01 15:28:33 -10:00
Michael Chan	9cfe8cf502	bnxt_en: Fix 2 stray ethtool -S counters The recent firmware interface change has added 2 counters in struct rx_port_stats_ext. This caused 2 stray ethtool counters to be displayed. Since new counters are added from time to time, fix it so that the ethtool logic will only display up to the maximum known counters. These 2 counters are not used by production firmware yet. Fixes: `754fbf604f` ("bnxt_en: Update firmware interface to 1.10.2.171") Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231026013231.53271-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-26 19:43:38 -07:00
Florian Fainelli	6ca80638b9	net: dsa: Use conduit and user terms Use more inclusive terms throughout the DSA subsystem by moving away from "master" which is replaced by "conduit" and "slave" which is replaced by "user". No functional changes. Acked-by: Rob Herring <robh@kernel.org> Acked-by: Stephen Hemminger <stephen@networkplumber.org> Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://lore.kernel.org/r/20231023181729.1191071-2-florian.fainelli@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-24 13:08:14 -07:00
Yunsheng Lin	09d96ee567	page_pool: remove PP_FLAG_PAGE_FRAG PP_FLAG_PAGE_FRAG is not really needed after pp_frag_count handling is unified and page_pool_alloc_frag() is supported in 32-bit arch with 64-bit DMA, so remove it. Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> CC: Lorenzo Bianconi <lorenzo@kernel.org> CC: Alexander Duyck <alexander.duyck@gmail.com> CC: Liang Chen <liangchen.linux@gmail.com> CC: Alexander Lobakin <aleksander.lobakin@intel.com> Link: https://lore.kernel.org/r/20231020095952.11055-3-linyunsheng@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-23 19:14:48 -07:00
Edwin Peer	5d4e1bf606	bnxt_en: extend media types to supported and autoneg modes The current driver code does not accurately report the supported and advertised link modes. It basically always assumes the media type is copper for any particular speed. Utilize the recently added link mode mappings to accurately report fully qualified ethtool link modes for advertised and supported speeds. If the media type is known, we will report the supported link modes for that media only. If the media is not known, we will report all possible supported link modes. The user can now specify any supported link modes (including NRZ and PAM4) to advertise for autoneg. It used to only accept copper NRZ modes. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Edwin Peer	64d20aea6e	bnxt_en: convert to linkmode_set_bit() API Barring the BNXT_FW_TO_ETHTOOL speed macros, which will be removed in the next patch, update code to use the newer API. No functional change. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Michael Chan	5802e30317	bnxt_en: Refactor NRZ/PAM4 link speed related logic Refactor some NRZ/PAM4 link speed related logic into helper functions. The NRZ and PAM4 link parameters are stored in separate structure fields. The driver logic has to check whether it is in NRZ or PAM4 mode and then use the appropriate field. Refactor this logic into helper functions for better readability. Reviewed-by: Damodharam Ammepalli <damodharam.ammepalli@broadcom.com> Reviewed-by: Ajit Khaparde <ajit.khaparde@broadcom.com> Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Edwin Peer	94c89e73d3	bnxt_en: refactor speed independent ethtool modes A future patch in this series will change the algorithm used to determine ethtool speed and media modes. Extract the handling of the unrelated pause, autoneg modes into an independent function. Also separate FEC handling out of bnxt_fw_to_ethtool_*_spds(). No functional change. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Edwin Peer	d6263677bb	bnxt_en: support lane configuration via ethtool Recent kernels support changing the number of link lanes via ethtool. This is useful for determining the appropriate signal mode to use when a given link speed can be achieved using different lane configurations. Accept the ethtool lanes parameter when configuring forced speed. If there is no lanes parameter, select a default. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Edwin Peer	ecdad2a692	bnxt_en: add infrastructure to lookup ethtool link mode Add infrastructure to look up the enum ethtool_link_mode_bit_indices from link information provided by the firmware. The link speed, signal mode, and media type returned by firmware will be used to look up the ethtool link mode. The immediate benefit is that once the link mode is determined, we can now use ethtool_params_from_link_mode() to fill the basic ethtool parameters including the number of lanes. Lanes will be fully supported in the next patch. Signed-off-by: Edwin Peer <edwin.peer@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Kalesh AP	fd78ec3fbc	bnxt_en: Fix invoking hwmon_notify_event FW sends the async event to the driver when the device temperature goes above or below the threshold values. Only notify hwmon if the temperature is increasing to the next alert level, not when it is decreasing. Cc: Guenter Roeck <linux@roeck-us.net> Reviewed-by: Kashyap Desai <kashyap.desai@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:46 +01:00
Kalesh AP	55862094a9	bnxt_en: Do not call sleeping hwmon_notify_event() from NAPI Defer hwmon_notify_event() to bnxt_sp_task() workqueue because hwmon_notify_event() can try to acquire a mutex shown in the stack trace below. Modify bnxt_event_error_report() to return true if we need to schedule bnxt_sp_task() to notify hwmon. __schedule+0x68/0x520 hwmon_notify_event+0xe8/0x114 schedule+0x60/0xe0 schedule_preempt_disabled+0x28/0x40 __mutex_lock.constprop.0+0x534/0x550 __mutex_lock_slowpath+0x18/0x20 mutex_lock+0x5c/0x70 kobject_uevent_env+0x2f4/0x3d0 kobject_uevent+0x10/0x20 hwmon_notify_event+0x94/0x114 bnxt_hwmon_notify_event+0x40/0x70 [bnxt_en] bnxt_event_error_report+0x260/0x290 [bnxt_en] bnxt_async_event_process.isra.0+0x250/0x850 [bnxt_en] bnxt_hwrm_handler.isra.0+0xc8/0x120 [bnxt_en] bnxt_poll_p5+0x150/0x350 [bnxt_en] __napi_poll+0x3c/0x210 net_rx_action+0x308/0x3b0 __do_softirq+0x120/0x3e0 Cc: Guenter Roeck <linux@roeck-us.net> Fixes: `a19b480145` ("bnxt_en: Event handler for Thermal event") Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-22 11:41:45 +01:00
Przemek Kitszel	074e1b4221	bnxt_en: devlink health: use retained error fmsg API Drop unneeded error checking. devlink_fmsg_*() family of functions is now retaining errors, so there is no need to check for them after each call. Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Signed-off-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-20 11:34:49 +01:00
Jakub Kicinski	73b24e7ce8	eth: bnxt: fix backward compatibility with older devices Recent FW interface update bumped the size of struct hwrm_func_cfg_input above 128B which is the max some devices support. Probe on Stratus (BCM957452) with FW 20.8.3.11 fails with: bnxt_en ...: Unable to reserve tx rings bnxt_en ...: 2nd rings reservation failed. bnxt_en ...: Not enough rings available. Once probe is fixed other errors pop up: bnxt_en ...: Failed to set async event completion ring. This is because __hwrm_send() rejects requests larger than bp->hwrm_max_ext_req_len with -E2BIG. Since the driver doesn't actually access any of the new fields, yet, trim the length. It should be safe. Similar workaround exists for backing_store_cfg_input. Although that one mins() to a constant of 256, not 128 we'll effectively use here. Michael explains: "the backing store cfg command is supported by relatively newer firmware that will accept 256 bytes at least." To make debugging easier in the future add a warning for oversized requests. Fixes: `754fbf604f` ("bnxt_en: Update firmware interface to 1.10.2.171") Reviewed-by: Michael Chan <michael.chan@broadcom.com> Link: https://lore.kernel.org/r/20231016171640.1481493-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2023-10-17 17:50:55 -07:00
Pavan Chebbi	b22f21f7a5	tg3: Improve PTP TX timestamping logic When we are trying to timestamp a TX packet, there may be occasions when the TX timestamp register is still not updated with the latest timestamp even if the timestamp packet descriptor is marked as complete. This usually happens in cases where the system is under stress or flow control is affecting the transmit side. We will solve this problem by saving the snapshot of the timestamp register when we are posting the TX descriptor. At this time, the register contains previously timestamped packet's value and valid timestamp of the current packet must be different than this. Upon completion of the current descriptor, we will check if the timestamp register is updated or not before timestamping the skb. If not updated, we will schedule the ptp worker to fetch the updated time later and timestamp the skb. Also now we restrict number of outstanding PTP TX packet requests to 1. Reported-by: Simon White <Simon.White@viavisolutions.com> Link: https://lore.kernel.org/netdev/CACKFLikGdN9XPtWk-fdrzxdcD=+bv-GHBvfVfSpJzHY7hrW39g@mail.gmail.com/ Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Andy Gospodarek <gospo@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2023-10-15 14:30:24 +01:00

... 4 5 6 7 8 ...

4368 Commits