All WQ types moved to using the fragmented allocation API
for coherent memory. Contiguous API is not used anymore.
Remove it and reduce the number of exported functions.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
On a tuple offload del command, the driver tries to remove tuples from
the hashtable twice: once directly via mlx5_tc_ct_entry_remove_from_tuples(),
and a second time in the subsequent mlx5_tc_ct_entry_put()->
mlx5_tc_ct_entry_del()->mlx5_tc_ct_entry_remove_from_tuples() call.
This doesn't cause any issue since rhashtable first checks if the
removed object exists in the hashtable.
Remove the extra mlx5_tc_ct_entry_remove_from_tuples().
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
It can be calculated via the mlx5dr_ste_get_hw_ste() function, which is
very simple and lightweight, so there is no need for a dedicated member.
This reduces struct mlx5dr_ste by 8 bytes; its size is now 48 bytes.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove chunk_size in struct mlx5dr_icm_chunk and use
chunk->size instead.
Remove ste_arr/hw_ste_arr/miss_list since they can be accessed via the
htbl->chunk pointer; there is no need to keep a copy.
This commit reduces struct mlx5dr_ste_htbl by 28 bytes; its size is
now 32 bytes.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The goal is to reduce memory consumption with a large number of flow rules.
These values can be calculated quickly from the buddy memory pool:
1. num_of_entries is obtained via dr_icm_pool_get_chunk_num_of_entries().
2. byte_size is obtained via dr_icm_pool_get_chunk_byte_size().
Use the chunk size in dr_icm_chunk to speed this up; the one in dr_ste_htbl
will be removed in the upcoming commit.
This commit reduces struct mlx5_dr_icm_chunk by 8 bytes; its current
size is 56 bytes.
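As a rough illustration of the pattern (simplified, hypothetical
signatures; the real helpers operate on struct mlx5dr_icm_chunk), the
values are derived from the power-of-two buddy order instead of being
cached in every chunk:

    /* Hypothetical sketch: derive chunk geometry from the buddy order. */
    static unsigned int dr_chunk_num_of_entries(unsigned int order)
    {
            return 1U << order;     /* buddy chunks are power-of-two sized */
    }

    static unsigned int dr_chunk_byte_size(unsigned int order,
                                           unsigned int entry_size)
    {
            return dr_chunk_num_of_entries(order) * entry_size;
    }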
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
It can be calculated quickly from the buddy memory pool via
mlx5dr_icm_pool_get_chunk_icm_addr(). This function is very lightweight
and straightforward.
This reduces struct mlx5_dr_icm_chunk by 8 bytes; its current size
is 64 bytes.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reduce memory footprint by removing mr_addr and rkey from
mlx5_dr_icm_chunk.
1. mr_addr is calculated by mlx5dr_icm_pool_get_chunk_mr_addr()
2. rkey is calculated by mlx5dr_icm_pool_get_chunk_rkey()
The two new functions are very lightweight and straightforward.
This reduces struct mlx5_dr_icm_chunk by 8 bytes; its current size is
72 bytes.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
According to profiling, mlx5dr_ste and mlx5dr_icm_chunk are the two
hot structures. Their memory layout can be optimized by reordering
their members.
Struct mlx5dr_ste size changes from 64 bytes to 56 bytes.
In the upcoming commits, struct mlx5dr_icm_chunk memory layout
will change automatically after removing some members.
Keep it untouched here.
Signed-off-by: Rongwei Liu <rongweil@nvidia.com>
Reviewed-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The packet size in mlx5e_skb_from_cqe_mpwrq_linear can't overflow u16,
since the maximum packet size in linear striding RQ is 2^13 bytes. Drop
the unneeded u32 variable.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The len parameter of mlx5e_xdp_handle is used to output the new packet
length after XDP has processed the packet and returned XDP_PASS.
However, this value can be calculated at the call site, as the caller
knows whether it was an XDP_PASS.
This commit drops the len parameter and moves the calculation to the
caller, reducing the number of parameters passed to the function and
preparing for XDP support in non-linear legacy RQ.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Instead of early return inside mlx5e_xdp_handle(), let the caller check
if an XDP program is loaded. This saves a few unnecessary
function calls and calculations in the !prog case.
Performance test: single core, drop packets in iptables
Before: 3,872,504 pps
After: 3,975,628 pps (+2.66%)
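A rough sketch of the caller-side check (simplified, not the exact
driver signature):

    /* Only enter the XDP path when a program is attached; the common
     * !prog case now pays no extra function call.
     */
    prog = rcu_dereference(rq->xdp_prog);
    if (prog && mlx5e_xdp_handle(rq, prog, &xdp))
            return NULL;    /* packet consumed by XDP (TX/REDIRECT/DROP) */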
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
As a performance optimization and preparation to enabling XDP multi
buffer on non-linear legacy RQ, build the linear part of the SKB over
the first fragment, instead of allocating a new buffer and copying the
first 256 bytes there.
To achieve this, add headroom and tailroom to the first fragment.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Currently, rq->buff.headroom is applied to all fragments in legacy RQ.
In the linear mode, there is a non-zero headroom, but there is only one
fragment per packet. In the non-linear mode, the headroom is zero.
This commit changes the logic to apply the headroom only to the first
fragment. The current behavior remains the same for both linear and
non-linear modes. However, it allows the next commit to enable headroom
for the non-linear mode, which will be applied only to the first
fragment.
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
mlx5e_build_rq_frags_info() assumes that MTU is not bigger than
PAGE_SIZE * MLX5E_MAX_RX_FRAGS, which is 16K for 4K pages. Currently,
the firmware limits MTU to 10K, so the assumption doesn't lead to a bug.
This commit adds an additional driver check for reliability, since the
firmware boundary might change.
The calculation is moved to a separate function with a comment
explaining it, as a preparation for the following patches that
introduce XDP multi buffer support.
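A minimal sketch of the kind of check being added (illustrative names,
not the exact helper in the driver):

    static int validate_rx_frags(u32 byte_count, u32 first_frag_size)
    {
            /* The first fragment holds less because of headroom/tailroom;
             * the remaining fragments are full pages.
             */
            u32 max = first_frag_size + (MLX5E_MAX_RX_FRAGS - 1) * PAGE_SIZE;

            return byte_count > max ? -EINVAL : 0;
    }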
Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Recent commit 974578017f ("iavf: Add waiting so the port is
initialized in remove") adds a wait-loop at the beginning of
iavf_remove() to ensure that port initialization is finished
prior to unregistering the net device. This causes a regression
in the reboot/shutdown scenario, because in this case the
iavf_shutdown() callback is called; it detaches the device,
brings it down if it is running and sets its state to __IAVF_REMOVE.
Later the shutdown callback of the associated PF driver
(e.g. ice_shutdown) is called. That callback calls, among other things,
sriov_disable(), which indirectly calls iavf_remove() (see the stack
trace below). As the adapter state is already __IAVF_REMOVE, the
mentioned loop is endless and the shutdown process hangs.
Fix this by checking the adapter's state at the beginning
of iavf_remove() and skipping the rest of the function if the adapter
is already in the remove state (i.e. shutdown is in progress).
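The fix boils down to an early exit at the top of iavf_remove(),
roughly (sketch):

    /* iavf_shutdown() already detached the device and set __IAVF_REMOVE;
     * don't wait for an initialization that will never finish.
     */
    if (adapter->state == __IAVF_REMOVE)
            return;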
Reproducer:
1. Create VF on PF driven by ice or i40e driver
2. Ensure that the VF is bound to iavf driver
3. Reboot
[52625.981294] sysrq: SysRq : Show Blocked State
[52625.988377] task:reboot state:D stack: 0 pid:17359 ppid: 1 f2
[52625.996732] Call Trace:
[52625.999187] __schedule+0x2d1/0x830
[52626.007400] schedule+0x35/0xa0
[52626.010545] schedule_hrtimeout_range_clock+0x83/0x100
[52626.020046] usleep_range+0x5b/0x80
[52626.023540] iavf_remove+0x63/0x5b0 [iavf]
[52626.027645] pci_device_remove+0x3b/0xc0
[52626.031572] device_release_driver_internal+0x103/0x1f0
[52626.036805] pci_stop_bus_device+0x72/0xa0
[52626.040904] pci_stop_and_remove_bus_device+0xe/0x20
[52626.045870] pci_iov_remove_virtfn+0xba/0x120
[52626.050232] sriov_disable+0x2f/0xe0
[52626.053813] ice_free_vfs+0x7c/0x340 [ice]
[52626.057946] ice_remove+0x220/0x240 [ice]
[52626.061967] ice_shutdown+0x16/0x50 [ice]
[52626.065987] pci_device_shutdown+0x34/0x60
[52626.070086] device_shutdown+0x165/0x1c5
[52626.074011] kernel_restart+0xe/0x30
[52626.077593] __do_sys_reboot+0x1d2/0x210
[52626.093815] do_syscall_64+0x5b/0x1a0
[52626.097483] entry_SYSCALL_64_after_hwframe+0x65/0xca
Fixes: 974578017f ("iavf: Add waiting so the port is initialized in remove")
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Link: https://lore.kernel.org/r/20220317104524.2802848-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ACL rules can be offloaded to VCAP IS2 either through chain 0, or, since
the blamed commit, through a chain index whose number encodes a specific
PAG (Policy Action Group) and lookup number.
The chain number is translated through ocelot_chain_to_pag() into a PAG,
and through ocelot_chain_to_lookup() into a lookup number.
The problem with the blamed commit is that the above 2 functions don't
have special treatment for chain 0. So ocelot_chain_to_pag(0) returns
filter->pag = 224, which is in fact -32, but the "pag" field is a u8.
So we end up programming the hardware with VCAP IS2 entries having a PAG
of 224. But the way in which the PAG works is that it defines a subset
of VCAP IS2 filters which should match on a packet. The default PAG is
0, and previous VCAP IS1 rules (which we offload using 'goto') can
modify it. So basically, we are installing filters with a PAG on which
no packet will ever match. This is the hardware equivalent of adding
filters to a chain which has no 'goto' to it.
Restore the previous functionality by making ACL filters offloaded to
chain 0 go to PAG 0 and lookup number 0. The choice of PAG is clearly
correct, but the choice of lookup number isn't "as before" (which was to
leave the lookup a "don't care"). However, lookup 0 should be fine,
since even though there are ACL actions (policers) which have a
requirement to be used in a specific lookup, that lookup is 0.
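Conceptually, the restored special case looks like this (sketch; the
helper used for the non-zero decode is hypothetical):

    static int ocelot_chain_to_pag(int chain)
    {
            if (!chain)
                    return 0;       /* default PAG, matched without a 'goto' */

            return decode_pag_from_chain(chain);    /* hypothetical helper */
    }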
Fixes: 226e9cd82a ("net: mscc: ocelot: only install TCAM entries into a specific lookup and PAG")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220316192117.2568261-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The RXCHK block will return a partial checksum of 0 if it encounters
a problem while receiving a packet. Since a 1's complement sum can
only produce this result if no bits are set in the received data
stream it is fair to treat it as an invalid partial checksum and
not pass it up the stack.
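In other words, the RX path should only report CHECKSUM_COMPLETE for a
non-zero sum, roughly (sketch with simplified names):

    if (rx_csum_enabled && hw_csum != 0) {
            skb->csum = (__force __wsum)hw_csum;
            skb->ip_summed = CHECKSUM_COMPLETE;
    }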
Fixes: 8101553978 ("net: bcmgenet: use CHECKSUM_COMPLETE for NETIF_F_RXCSUM")
Signed-off-by: Doug Berger <opendmb@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220317012812.1313196-1-opendmb@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0")
added request_firmware() logic in probe(), which caused a
load failure when the firmware file is not present in the initrd (below),
as access to the firmware file is not feasible during probe.
Direct firmware load for bnx2x/bnx2x-e2-7.13.15.0.fw failed with error -2
Direct firmware load for bnx2x/bnx2x-e2-7.13.21.0.fw failed with error -2
This patch fixes the issue by:
1. Removing the request_firmware() logic from probe(),
so that .ndo_open() handles it as it did before.
2. Since request_firmware() is removed from probe(), relaxing the FW
version comparisons a bit against the FW version already loaded
(by other PFs of the same adapter), to allow different but
compatible/close-enough firmware versions with which multiple PFs
may run (in different environments). The PF in the probe flow no
longer knows which firmware file version it will use to initialize
the device in ndo_open().
Link: https://lore.kernel.org/all/46f2d9d9-ae7f-b332-ddeb-b59802be2bab@molgen.mpg.de/
Reported-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Paul Menzel <pmenzel@molgen.mpg.de>
Fixes: b7a49f7305 ("bnx2x: Utilize firmware 7.13.21.0")
Signed-off-by: Manish Chopra <manishc@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
Link: https://lore.kernel.org/r/20220316214613.6884-1-manishc@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Clang static analysis reports this representative issue
igb_ptp.c:997:3: warning: The left operand of '+' is a
garbage value
ktime_add_ns(shhwtstamps.hwtstamp, adjust);
^ ~~~~~~~~~~~~~~~~~~~~
shhwtstamps.hwtstamp is set by a call to
igb_ptp_systim_to_hwtstamp(). In the switch statement on the hw type,
the hwtstamp is zeroed for the matching cases but not in the default
case. Move the memset out of the switch statement; this removes the
garbage value from the default case and reduces the code size.
Also clean up some whitespace (empty lines).
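The shape of the change is roughly (sketch, abbreviated; hwtstamps is
the output parameter of igb_ptp_systim_to_hwtstamp()):

    memset(hwtstamps, 0, sizeof(*hwtstamps));       /* covers all cases now */
    switch (adapter->hw.mac.type) {
    case e1000_82576:
            /* per-type conversion of the raw systim value */
            break;
    default:
            /* hwtstamps stays zeroed instead of holding garbage */
            break;
    }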
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The calculation of the checksum can fail.
So move the conversion of the checksum to little endian
inside the return status check.
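A minimal sketch of the pattern (hypothetical function and variable
names):

    status = calc_nvm_checksum(hw, &checksum);
    if (!status)
            checksum_le = cpu_to_le16(checksum);    /* convert only on success */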
Signed-off-by: Tom Rix <trix@redhat.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
This code works but it has a static checker warning:
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c:1687 init_dma_rx_desc_rings()
warn: always true condition '(queue >= 0) => (0-u32max >= 0)'
Obviously, it makes no sense to check if an unsigned int is >= 0. What
prevents this code from being a forever loop is that later there is a
separate check for if (queue == 0).
The "queue" variable is less than MTL_MAX_RX_QUEUES (8) so it can easily
fit in an int type. Any larger value for "queue" would lead to an array
overflow when we assign "rx_q = &priv->rx_queue[queue]".
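As an illustration of why the signed type matters (generic sketch, not
the exact driver code):

    int queue;      /* signed, so 'queue >= 0' can actually become false */

    for (queue = rx_count - 1; queue >= 0; queue--)
            cleanup_rx_queue(priv, queue);  /* hypothetical helper */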
Fixes: de0b90e52a ("net: stmmac: rearrange RX and TX desc init into per-queue basis")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220316083744.GB30941@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The 98DX2530 SoC is similar to the Armada 3700 except it needs a
different MBUS window configuration. Add a new compatible string to
identify this device and the required MBUS window configuration.
Signed-off-by: Chris Packham <chris.packham@alliedtelesis.co.nz>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Currently the MPLSoUDP encap offload does the L2 pop implicitly, while
adding such an action explicitly (vlan pop_eth) causes the rule to not
be offloaded.
Solve this by adding offload support for vlan pop_eth in the MPLSoUDP
encap case.
Flow example:
filter root protocol ip pref 1 flower chain 0
filter root protocol ip pref 1 flower chain 0 handle 0x1
eth_type ipv4
dst_ip 2.2.2.22
src_ip 2.2.2.21
in_hw in_hw_count 1
action order 1: vlan pop_eth pipe
index 1 ref 1 bind 1
used_hw_stats delayed
action order 2: mpls push protocol mpls_uc label 555 tc 3 ttl 255 pipe
index 1 ref 1 bind 1
used_hw_stats delayed
action order 3: tunnel_key set
src_ip 8.8.8.21
dst_ip 8.8.8.22
dst_port 6635
csum
tos 0x4
ttl 6 pipe
index 1 ref 1 bind 1
used_hw_stats delayed
action order 4: mirred (Egress Redirect to device bareudp0) stolen
index 1 ref 1 bind 1
used_hw_stats delayed
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, a pedit action on the source and destination MACs is used to
fill in the MACs in the L2 push step of the MPLSoUDP decap offload. This
isn't aligned with tc software, which uses the vlan eth_push action to
do this.
To fix that, add offload support for the vlan eth_push action together
with the mpls pop action, and deprecate the use of pedit on the MACs.
Flow example:
filter protocol mpls_uc pref 1 flower chain 0
filter protocol mpls_uc pref 1 flower chain 0 handle 0x1
eth_type 8847
mpls_label 555
enc_dst_port 6635
in_hw in_hw_count 1
action order 1: tunnel_key unset pipe
index 2 ref 1 bind 1
used_hw_stats delayed
action order 2: mpls pop protocol ip pipe
index 2 ref 1 bind 1
used_hw_stats delayed
action order 3: vlan push_eth dst_mac de:a2:ec:d6:69:c8 src_mac de:a2:ec:d6:69:c8 pipe
index 2 ref 1 bind 1
used_hw_stats delayed
action order 4: mirred (Egress Redirect to device enp8s0f0_0) stolen
index 2 ref 1 bind 1
used_hw_stats delayed
Signed-off-by: Maor Dickman <maord@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The irq value returned by platform_get_irq() cannot be returned
directly; there are operations that need to be undone first.
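The fix follows the usual goto-unwind pattern, roughly (sketch):

    irq = platform_get_irq(pdev, 0);
    if (irq < 0) {
            err = irq;
            goto out;       /* undo what probe has already set up */
    }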
Fixes: bf2b83425b ("net: mv643xx_eth: use platform_get_irq() instead of platform_get_resource()")
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220316012444.2126070-1-chi.minghao@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now that devlink ports are protected by the instance lock
it seems natural to pass devlink_port as an argument to
the port_split / port_unsplit callbacks.
This should save the drivers from doing a lookup.
In theory drivers may have supported unsplitting ports
which were not registered prior to this change.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Let the core take the devlink instance lock around port splitting
and remove the now redundant locking in the drivers.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Explicitly lock the devlink instance and use devl_ API.
This will be used by the subsequent patch to invoke
.port_split / .port_unsplit callbacks with devlink
instance lock held.
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The whole reason for existence of the pf mutex is that we could
not lock the devlink instance around port splitting. There are
more types of reconfig which can make ports appear or disappear.
Now that the devlink instance lock is exposed to drivers and
"locked" helpers exist we can switch to using the devlink lock
directly.
Next patches will move the locking inside .port_(un)split to
the core.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We can replace the PF lock with devlink instance lock in subsequent
changes. To make the patches easier to comprehend and limit line
lengths - factor out the existing locking assertions.
No functional changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We've previously run into many issues related to the latency of a Tx
timestamp completion with the ice hardware. It can be difficult to
determine the root cause of a slow Tx timestamp. To aid in this,
introduce new trace events which capture timing data about when the
driver reaches certain points while processing a transmit timestamp:
* ice_tx_tstamp_request: Trace when the stack initiates a new timestamp
request.
* ice_tx_tstamp_fw_req: Trace when the driver begins a read of the
timestamp register in the work thread.
* ice_tx_tstamp_fw_done: Trace when the driver finishes reading a
timestamp register in the work thread.
* ice_tx_tstamp_complete: Trace when the driver submits the skb back to
the stack with a completed Tx timestamp.
These trace events can be enabled using the standard trace event
subsystem exposed by the ice driver. If they are disabled, they become
no-ops with no run time cost.
The following is a simple GNU AWK script which can highlight one
potential way to use the trace events to capture latency data from the
trace buffer about how long the driver takes to process a timestamp:
-----
BEGIN {
	PREC=256
}

# Detect requests
/tx_tstamp_request/ {
	time=strtonum($4)
	skb=$7

	# Store the time of request for this skb
	requests[skb] = time
	printf("skb %s: idx %d at %.6f\n", skb, idx, time)
}

# Detect completions
/tx_tstamp_complete/ {
	time=strtonum($4)
	skb=$7
	idx=$9

	if (skb in requests) {
		latency = (time - requests[skb]) * 1000
		printf("skb %s: %.3f to complete\n", skb, latency)
		if (latency > 4) {
			printf(">>> HIGH LATENCY <<<\n")
		}
		printf("\n")
	} else {
		printf("!!! skb %s (idx %d) at %.6f\n", skb, idx, time)
	}
}
-----
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
kthread_create_worker() and tty_alloc_driver() return ERR_PTR()
and never return NULL. The NULL test in the return value check
should be replaced with IS_ERR().
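The corrected checks look roughly like this (sketch; arguments are
abbreviated/illustrative):

    kworker = kthread_create_worker(0, "ice-gnss");
    if (IS_ERR(kworker))
            return PTR_ERR(kworker);

    tty_driver = tty_alloc_driver(1, TTY_DRIVER_REAL_RAW);
    if (IS_ERR(tty_driver))
            return PTR_ERR(tty_driver);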
Fixes: 43113ff734 ("ice: add TTY for GNSS module for E810T device")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Tested-by: Gurucharan <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix the following warning as reported by smatch:
smatch warnings:
drivers/net/ethernet/intel/ice/ice_switch.c:5568 ice_find_dummy_packet() warn: inconsistent indenting
Fixes: 9a225f81f5 ("ice: Support GTP-U and GTP-C offload in switchdev")
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Add Ethernet support for MediaTek SoCs from the mt8195 family.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The rmii_internal clock is needed only when the PHY
interface is RMII and the reference clock comes from the MAC.
Re-arrange the clock setup as follows:
1. the optional "rmii_internal" clock is handled by devm_clk_get(),
2. the other clocks are still configured by devm_clk_bulk_get().
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch makes dwmac-mediatek reuse more features
supported by stmmac_platform.c.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch implements the clks_config callback for the dwmac-mediatek
platform, to support platform-level clock management.
Signed-off-by: Biao Huang <biao.huang@mediatek.com>
Acked-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-03-15
Jacob Keller says:
The ice_sriov.c file now houses almost all of the virtualization code in the
ice driver. This includes both Single Root specific implementation as well
as generic functionality such as the virtchnl interface.
We are planning to implement support for Scalable IOV in the ice driver in
the future. This implementation will want to use the generic functionality
in ice_sriov.c
Rather than dump the Scalable IOV code into ice_sriov.c, we will want to
implement it in a separate file, ice_siov.c
To help with this, refactor the code in ice_sriov.c and split the generic
functionality out into separate files.
Reorganize code to make the non-implementation specific bits into new files
with the following general guidelines:
* ice_vf_lib.[ch]
Basic VF structures and accessors. This is where scheme-independent
code will reside.
* ice_virtchnl.[ch]
Virtchnl message handling. This is where the bulk of the logic for
processing messages from VFs using the virtchnl messaging scheme will
reside. This is separated from ice_vf_lib.c because it is somewhat
distinct and standalone.
* ice_sriov.[ch]
Single Root IOV implementation, including initialization and the
routines for interacting with SR-IOV based netdev operations.
* (future) ice_siov.[ch]
Scalable IOV implementation.
The end goal is to make it easier to re-use the generic parts of the
virtualization logic while keeping separate the concerns of the Single Root
implementation.
In addition to the pure code moves, this series has a reset refactor which
cleans up the functionality to make it easier to reuse the reset code. A new
ops table is introduced to make the VF reset logic more generic. The Single
Root specific details are implemented in ice_sriov.c. A future series
implementing Scalable IOV support will use this ops table to allow re-use of
the reset logic which is now in ice_vf_lib.c
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the handling of FDB entries to use switchdev events,
instead of the previous "sync_bridge" and "sync_port", which
only ran when adding or removing VLANs on the bridge.
Signed-off-by: Casper Andersson <casper.casan@gmail.com>
Link: https://lore.kernel.org/r/20220314160918.4rfrrfgmbsf2pxl3@wse-c0155
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix a possible double free in iavf_disable_vf, as crit_lock is
freed in the caller, iavf_reset_task. Add kernel-doc for iavf_disable_vf.
Remove the mutex_unlock in iavf_disable_vf.
Without this patch there is a double free scenario when calling
iavf_reset_task.
Fixes: e85ff9c631 ("iavf: Fix deadlock in iavf_reset_task")
Signed-off-by: Przemyslaw Patynowski <przemyslawx.patynowski@intel.com>
Suggested-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Mateusz Palczewski <mateusz.palczewski@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Currently fdir_fltr_lock is accessed in the ice_vsi_release_all() function
after it is destroyed. Instead, destroy the mutex after ice_vsi_release_all().
Fixes: 40319796b7 ("ice: Add flow director support for channel mode")
Signed-off-by: Sudheer Mogilappagari <sudheer.mogilappagari@intel.com>
Tested-by: Bharathi Sreenivas <bharathi.sreenivas@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
It is possible to hit a NULL pointer dereference in the routine that
updates Tx ring stats. Currently, stats and bytes are only updated when
the ring pointer is valid, but later on the ring is accessed to propagate
the gathered Tx stats onto the VSI stats.
Change the existing logic to move on to the next ring when the ring is NULL.
Fixes: e72bba2135 ("ice: split ice_ring onto Tx/Rx separate structs")
Reported-by: kernel test robot <lkp@intel.com>
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Acked-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_check_vf_init function takes both a PF and a VF pointer. Every
caller looks up the PF pointer from the VF structure. For some callers,
the only use of the PF pointer is to call this function. Move the lookup
inside ice_check_vf_init and drop the unnecessary argument.
Clean up the callers to drop the now-unnecessary local variables. In
particular, replace the local PF pointer with a HW structure pointer in
ice_vc_get_vf_res_msg which simplifies a few accesses to the HW
structure in that function.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Just as we moved the generic virtualization library logic into
ice_vf_lib.c, move the virtchnl message handling into ice_virtchnl.c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Before we move the virtchnl message handling from ice_sriov.c into
ice_virtchnl.c, clean up some long-line warnings to avoid checkpatch.pl
complaints.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_reset_vf function performs actions which must be taken only
while holding the VF configuration lock. Some flows already acquired the
lock, while other flows must acquire it just for the reset function. Add
the ICE_VF_RESET_LOCK flag to the function so that it can take and
release the lock itself at the appropriate scope.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
In some cases of resetting a VF, the PF would like to first notify the
VF that a reset is impending. This is currently done via
ice_vc_notify_vf_reset. A wrapper to ice_reset_vf, ice_vc_reset_vf, is
used to call this function first before calling ice_reset_vf.
In fact, every single call to ice_vc_notify_vf_reset occurs just prior
to a call to ice_vc_reset_vf.
Now that ice_reset_vf has flags, replace this separate call with an
ICE_VF_RESET_NOTIFY flag. This removes an unnecessary exported function,
ice_vc_notify_vf_reset, and also leaves a single function for resetting
VFs (ice_reset_vf).
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_reset_vf function takes a boolean parameter which indicates
whether or not the reset is due to a VFLR event.
This is somewhat confusing to read because readers must interpret what
"true" and "false" mean when seeing a line of code like
"ice_reset_vf(vf, false)".
We will want to add another toggle to the ice_reset_vf in a following
change. To avoid proliferating many arguments, convert this function to
take flags instead. ICE_VF_RESET_VFLR will indicate if this is a VFLR
reset. A value of 0 indicates no flags.
One could argue that "ice_reset_vf(vf, 0)" is no more readable than
"ice_reset_vf(vf, false)". However, this type of flags interface is
somewhat common, and using 0 to mean "no flags" makes sense in this
context. We could add a define such as "ICE_VF_RESET_PLAIN", but this
can be confusing since it's not an actual bit flag.
This paves the way to add another flag to the function in a following
change.
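For illustration, such a flags interface typically looks like this
(sketch; the exact bit values and names in the driver may differ, and
the NOTIFY/LOCK flags come from the related patches in this series):

    enum ice_vf_reset_flags {
            ICE_VF_RESET_VFLR       = BIT(0),  /* reset was due to a VFLR event */
            ICE_VF_RESET_NOTIFY     = BIT(1),  /* notify the VF before resetting */
            ICE_VF_RESET_LOCK       = BIT(2),  /* acquire the VF config lock */
    };

    int ice_reset_vf(struct ice_vf *vf, u32 flags);

    ice_reset_vf(vf, 0);                    /* plain reset, no flags */
    ice_reset_vf(vf, ICE_VF_RESET_VFLR);    /* reset due to VFLR */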
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_reset_vf function returns a boolean value indicating whether or
not the VF reset. This is a bit confusing since it means that callers
need to know how to interpret the return value when needing to indicate
an error.
Refactor the function and call sites to report a regular error code. We
still report success (i.e. return 0) in cases where the reset is in
progress or is disabled.
Existing callers don't care because they do not check the return value.
We keep the error code anyway, instead of a void return, because we
expect future code which may care about, or at least report, the error
value.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_reset_all_vfs function returns true if any VFs were reset, and
false otherwise. However, no callers check the return value.
Drop this return value and make the function void since the callers do
not care about this.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_reset_all_vfs function takes a parameter to handle whether its
operating after a VFLR event or not. This is not necessary as every
caller always passes true. Simplify the interface by removing the
parameter.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Now that the reset functions do not rely on Single Root specific
behavior, move the ice_reset_vf, ice_reset_all_vfs, and
ice_vf_rebuild_host_cfg functions and their dependent helper functions
out of ice_sriov.c and into ice_vf_lib.c
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
We're about to move ice_reset_vf out of ice_sriov.c and into
ice_vf_lib.c
One of the dev_err statements has a checkpatch.pl violation due to
putting the vf->vf_id on the same line as the dev_err. Fix this style
issue first before moving the code.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice driver currently supports virtualization using Single Root IOV,
with code in the ice_sriov.c file. In the future, we plan to also
implement support for Scalable IOV, which uses slightly different
hardware implementations for some functionality.
To eventually allow this, we introduce a new ice_vf_ops structure which
will contain the basic operations that are different between the two IOV
implementations. This primarily includes logic for how to handle the VF
reset registers, as well as what to do before and after rebuilding the
VF's VSI.
Implement these ops structures and call the ops table instead of
directly calling the SR-IOV specific function. This will allow us to
easily add the Scalable IOV implementation in the future. Additionally,
it helps separate the generalized VF logic from SR-IOV specifics. This
change allows us to move the reset logic out of ice_sriov.c and into
ice_vf_lib.c without placing any Single Root specific details into the
generic file.
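A rough sketch of what such an ops table looks like (member and
implementation names are illustrative, not necessarily the exact ones
used by the driver):

    struct ice_vf_ops {
            void (*trigger_reset_register)(struct ice_vf *vf, bool is_vflr);
            bool (*poll_reset_status)(struct ice_vf *vf);
            void (*post_vsi_rebuild)(struct ice_vf *vf);
    };

    static const struct ice_vf_ops ice_sriov_vf_ops = {
            .trigger_reset_register = ice_sriov_trigger_reset_register,
            .poll_reset_status      = ice_sriov_poll_reset_status,
            .post_vsi_rebuild       = ice_sriov_post_vsi_rebuild,
    };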
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
If we fail to clear the malicious VF indication after a VF reset, the
dev_dbg message which is printed uses the local variable 'i' when it
meant to use vf->vf_id. Fix this.
Fixes: 0891c89674 ("ice: warn about potentially malicious VFs")
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Introduce the ice_vf_lib.c file along with the ice_vf_lib.h and
ice_vf_lib_private.h header files.
These files will house the generic VF structures and access functions.
Move struct ice_vf and its dependent definitions into this new header
file.
The ice_vf_lib.c is compiled conditionally on CONFIG_PCI_IOV. Some of
its functionality is required by all driver files. However, some of its
functionality will only be required by other files also conditionally
compiled based on CONFIG_PCI_IOV.
Declaring these functions used only in CONFIG_PCI_IOV files in
ice_vf_lib.h is verbose. This is because we must provide a fallback
implementation for each function in this header since it is included in
files which may not be compiled with CONFIG_PCI_IOV.
Instead, introduce a new ice_vf_lib_private.h header which verifies that
CONFIG_PCI_IOV is enabled. This header is intended to be directly
included in .c files which are CONFIG_PCI_IOV only. Add a #error
indication that will complain if the file ever gets included by another
C file on a kernel with CONFIG_PCI_IOV disabled. Add a comment
indicating the nature of the file and why it is useful.
This makes it so that we can easily define functions exposed from
ice_vf_lib.c into other virtualization files without needing to add
fallback implementations for every single function.
This begins the path to separate out generic code which will be reused
by other virtualization implementations from ice_sriov.h and ice_sriov.c
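A minimal sketch of the guarded private header described above (the
exact wording and guard names may differ):

    /* ice_vf_lib_private.h - only for CONFIG_PCI_IOV virtualization files */
    #ifndef _ICE_VF_LIB_PRIVATE_H_
    #define _ICE_VF_LIB_PRIVATE_H_

    #include "ice_vf_lib.h"

    #ifndef CONFIG_PCI_IOV
    #error "Only include ice_vf_lib_private.h in CONFIG_PCI_IOV virtualization files"
    #endif

    /* declarations shared between CONFIG_PCI_IOV-only .c files go here */

    #endif /* _ICE_VF_LIB_PRIVATE_H_ */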
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-03-14
Jacob Keller says:
The ice_virtchnl_pf.c file has become a single place for a lot of
virtualization functionality. This includes most of the virtchnl message
handling, integration with kernel hooks like the .ndo operations, reset
logic, and more.
We are planning in the future to implement and support Scalable IOV in the
ice driver. To do this, much (but not all) of the code in ice_virtchnl_pf.c
will want to be reused.
Rather than dump all of the Scalable IOV implementation into
ice_virtchnl_pf.c it makes sense to house it in a separate file. But that
still leaves all of the Single Root IOV code littered among more generic
logic.
The long term goal is to re-organize the code such that generic re-usable
code is split into separate files. The ice_sriov.c file would end up
containing all of the Single Root IOV implementation specific details, while
ice_vf_lib.[ch] and ice_virtchnl.[ch] contain the generic pieces.
As a first step, notice that ice_sriov.c currently does not contain much of
the SR-IOV implementation. This is housed primarily in ice_virtchnl_pf.c
The code in ice_sriov.c is really generic and relates to the VF mailbox,
including mailbox overflow detection.
Rename ice_sriov.c to ice_vf_mbx.c, and then rename ice_virtchnl_pf.c to
ice_sriov.c
A later series will finish the refactor by splitting ice_sriov.c into
multiple files, moving the generic code into ice_vf_lib.c and ice_virtchnl.c
To prepare for that series, perform some basic cleanup and other refactors
that we've accumulated during this development cycle.
This series builds on top of the recent hash table refactor work.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: use ice_is_vf_trusted helper function
ice: log an error message when eswitch fails to configure
ice: cleanup error logging for ice_ena_vfs
ice: move ice_set_vf_port_vlan near other .ndo ops
ice: refactor spoofchk control code in ice_sriov.c
ice: rename ICE_MAX_VF_COUNT to avoid confusion
ice: remove unused definitions from ice_sriov.h
ice: convert vf->vc_ops to a const pointer
ice: remove circular header dependencies on ice.h
ice: rename ice_virtchnl_pf.c to ice_sriov.c
ice: rename ice_sriov.c to ice_vf_mbx.c
====================
Link: https://lore.kernel.org/r/20220315011155.2166817-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Pablo Neira Ayuso says:
====================
Netfilter updates for net-next
1) Revert CHECKSUM_UNNECESSARY for UDP packet from conntrack.
2) Reject unsupported families when creating tables, from Phil Sutter.
3) GRE support for the flowtable, from Toshiaki Makita.
4) Add GRE offload support for act_ct, also from Toshiaki.
5) Update mlx5 driver to support for GRE flowtable offload,
from Toshiaki Makita.
6) Oneliner to clean up incorrect indentation in nf_conntrack_bridge,
from Jiapeng Chong.
* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
netfilter: bridge: clean up some inconsistent indenting
net/mlx5: Support GRE conntrack offload
act_ct: Support GRE offload
netfilter: flowtable: Support GRE
netfilter: nf_tables: Reject tables of unsupported family
Revert "netfilter: conntrack: mark UDP zero checksum as CHECKSUM_UNNECESSARY"
====================
Link: https://lore.kernel.org/r/20220315091513.66544-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
IEEE_8021QAZ_MAX_TCS is defined in include/uapi/linux/dcbnl.h, which is
included by net/dcbnl.h. Then, linux/netdevice.h conditionally includes
net/dcbnl.h if CONFIG_DCB is enabled.
Therefore, when CONFIG_DCB is disabled, this indirect dependency is
broken.
There isn't a good reason to include net/dcbnl.h headers into the ocelot
switch library which exports low-level hardware API, so replace
IEEE_8021QAZ_MAX_TCS with OCELOT_NUM_TC which has the same value.
Fixes: 978777d0fb ("net: dsa: felix: configure default-prio and dscp priorities")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/20220315131215.273450-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The WARN_ON() macro takes a condition, not a warning message.
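For reference, WARN_ON() takes just the condition, while WARN() takes a
condition plus a printf-style message:

    WARN_ON(err);                                           /* condition only */
    WARN(err, "failed to register ptp clock: %d\n", err);   /* condition + message */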
Fixes: 0933bd0404 ("net: sparx5: Add support for ptp clocks")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/r/20220314140327.GB30883@kili
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The ice_vc_cfg_promiscuous_mode_msg function directly checks
ICE_VIRTCHNL_VF_CAP_PRIVILEGE, instead of using the existing helper
function ice_is_vf_trusted. Switch this to use the helper function so
that all trusted checks are consistent. This aids in any potential
future refactor by ensuring consistent code.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When ice_eswitch_configure fails, print an error message to make it more
obvious why VF initialization did not succeed.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_ena_vfs function and some of its sub-functions like
ice_set_per_vf_res use an "if (<function>) { <print error>; <exit> }"
flow. This flow discards specialized errors reported by the called
function.
This style is generally not preferred as it makes tracing error sources
more difficult. It also means we cannot log the actual error received
properly.
Refactor several calls in the ice_ena_vfs function that do this to catch
the error in the 'ret' variable. Report this in the messages, and then
return the more precise error value.
Doing this reveals that ice_set_per_vf_res returns -EINVAL or -EIO in
places where -ENOSPC makes more sense. Fix these calls up to return the
more appropriate value.
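The resulting pattern is roughly (sketch, simplified signatures):

    ret = ice_set_per_vf_res(pf);
    if (ret) {
            dev_err(dev, "Not enough resources for %d VFs, err %d\n",
                    num_vfs, ret);
            goto err_unroll;        /* propagate the precise error, e.g. -ENOSPC */
    }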
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_set_vf_port_vlan function is located in ice_sriov.c very far
away from the other .ndo operations that it is similar to. Move it so
that it is located near the other .ndo operation definitions.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The API to control the VSI spoof checking for a VF VSI has three
functions: enable, disable, and set. The set function takes the VSI and
the VF and decides whether to call enable or disable based on the
vf->spoofchk field.
In some flows, vf->spoofchk is not yet set, such as in the function used
to control the setting for a VF (vf->spoofchk is only updated after
success).
Simplify this API by refactoring ice_vf_set_spoofchk_cfg to be
"ice_vsi_apply_spoofchk" which takes the boolean and allows all callers
to avoid having to determine whether to call enable or disable
themselves.
This matches the expected callers better, and will prevent the need to
export more than one function when this code must be called from another
file.
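Roughly, the simplified API looks like this (sketch; the enable/disable
helpers named here are hypothetical stand-ins for the pre-existing
internal ones):

    static int ice_vsi_apply_spoofchk(struct ice_vsi *vsi, bool enable)
    {
            if (enable)
                    return ice_vsi_ena_spoofchk(vsi);       /* hypothetical */

            return ice_vsi_dis_spoofchk(vsi);               /* hypothetical */
    }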
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ICE_MAX_VF_COUNT field is defined in ice_sriov.h. This count is true
for SR-IOV but will not be true for all VF implementations, such as when
the ice driver supports Scalable IOV.
Rename this definition to ICE_MAX_SRIOV_VFS to make its scope clear.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
A few more macros exist in ice_sriov.h which are not used anywhere.
These can be safely removed. Note that ICE_VIRTCHNL_VF_CAP_L2 capability
is set but never checked anywhere in the driver. Thus it is also safe to
remove.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The vc_ops structure is used to allow different handlers for virtchnl
commands when the driver is in representor mode. The current
implementation uses a copy of the ops table in each VF, and modifies
this copy dynamically.
The usual practice in kernel code is to store the ops table in a
constant structure and point to different versions. This has a number of
advantages:
1. Reduced memory usage. Each VF merely points to the correct table,
so they're able to re-use the same constant lookup table in memory.
2. Consistency. It becomes more difficult to accidentally update or
edit only one op call. Instead, the code switches to the correct
table with a single pointer write. In general this is atomic: either
the pointer is updated or it is not.
3. Code Layout. The VF structure can store a pointer to the table
without needing to have the full structure definition defined prior
to the VF structure definition. This will aid in future refactoring
of code by allowing the VF pointer to be kept in ice_vf_lib.h while
the virtchnl ops table can be maintained in ice_virtchnl.h
There is one major downside in the case of the vc_ops structure. Most of
the operations in the table are the same between the two current
implementations. This can appear to lead to duplication since each
implementation must now fill in the complete table. It could make
spotting the differences in the representor mode more challenging.
Unfortunately, methods to make this less error prone either add
complexity overhead (macros using CPP token concatenation) or don't work
on all compilers we support (constant initializer from another constant
structure).
The cost of maintaining two structures does not outweigh the benefits
of the constant table model.
While we're making these changes, go ahead and rename the structure and
implementations with "virtchnl" instead of "vc_vf_". This will more
closely align with the planned file renaming, and avoid similar names when
we later introduce a "vf ops" table for separating Scalable IOV and
Single Root IOV implementations.
Leave the accessor/assignment functions in order to avoid issues with
compiling with options disabled. The interface makes it easier to handle
when CONFIG_PCI_IOV is disabled in the kernel.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Several headers in the ice driver include ice.h even though they are
themselves included by that header. The most notable of these is
ice_common.h, but several other headers also do this.
Such a recursive inclusion is problematic as it forces headers to be
included in a strict order, otherwise compilation errors can result. The
circular inclusions do not trigger an endless loop due to standard
header inclusion guards, however other errors can occur.
For example, ice_flow.h defines ice_rss_hash_cfg, which is used by
ice_sriov.h as part of the definition of ice_vf_hash_ip_ctx.
ice_flow.h includes ice_acl.h, which includes ice_common.h, and which
finally includes ice.h. Since ice.h itself includes ice_sriov.h, this
creates a circular dependency.
The definition in ice_sriov.h requires things from ice_flow.h, but
ice_flow.h itself will lead to trying to load ice_sriov.h as part of its
process for expanding ice.h. The current code avoids this issue by
having an implicit dependency without the include of ice_flow.h.
If we were to fix that so that ice_sriov.h explicitly depends on
ice_flow.h the following pattern would occur:
ice_flow.h -> ice_acl.h -> ice_common.h -> ice.h -> ice_sriov.h
At this point, during the expansion of ice.h, the header guard for ice_flow.h
is already set, so when ice_sriov.h attempts to load the ice_flow.h
header it is skipped. Then, we go on to begin including the rest of
ice_sriov.h, including structure definitions which depend on
ice_rss_hash_cfg. This produces a compiler warning because
ice_rss_hash_cfg hasn't been defined yet. Remember, we're just at the
start of ice_flow.h!
If the order of headers is incorrect (ice_flow.h is not implicitly
loaded first in all files which include ice_sriov.h) then we get the
same failure.
Removing this recursive inclusion requires fixing a few cases where some
headers depended on the header inclusions from ice.h. In addition, a few
other changes are also required.
Most notably, ice_hw_to_dev is implemented as a macro in ice_osdep.h,
which is the likely reason that ice_common.h includes ice.h at all. This
macro implementation requires the full definition of ice_pf in order to
properly compile.
Fix this by moving it to a function declared in ice_main.c, so that we
do not require all files to depend on the layout of the ice_pf
structure.
Note that this change only fixes circular dependencies, but it does not
fully resolve all implicit dependencies where one header may depend on
the inclusion of another. I tried to fix as many of the implicit
dependencies as I noticed, but fixing them all requires a somewhat
tedious analysis of each header and attempting to compile it separately.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Gurucharan G <gurucharanx.g@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_virtchnl_pf.c and ice_virtchnl_pf.h files are where most of the
code for implementing Single Root IOV virtualization resides. This code
includes support for bringing up and tearing down VFs, hooks into the
kernel SR-IOV netdev operations, and for handling virtchnl messages from
VFs.
In the future, we plan to support Scalable IOV in addition to Single
Root IOV as an alternative virtualization scheme. This implementation
will re-use some but not all of the code in ice_virtchnl_pf.c
To prepare for this future, we want to refactor and split up the code in
ice_virtchnl_pf.c into the following scheme:
* ice_vf_lib.[ch]
Basic VF structures and accessors. This is where scheme-independent
code will reside.
* ice_virtchnl.[ch]
Virtchnl message handling. This is where the bulk of the logic for
processing messages from VFs using the virtchnl messaging scheme will
reside. This is separated from ice_vf_lib.c because it is distinct
and has a bulk of the processing code.
* ice_sriov.[ch]
Single Root IOV implementation, including initialization and the
routines for interacting with SR-IOV based netdev operations.
* (future) ice_siov.[ch]
Scalable IOV implementation.
As a first step, lets assume that all of the code in
ice_virtchnl_pf.[ch] is for Single Root IOV. Rename this file to
ice_sriov.c and its header to ice_sriov.h
Future changes will further split out the code in these files following
the plan outlined here.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
The ice_sriov.c file primarily contains code which handles the logic for
mailbox overflow detection and some other utility functions related to
the virtualization mailbox.
The bulk of the SR-IOV implementation is actually found in
ice_virtchnl_pf.c, and this file isn't strictly SR-IOV specific.
In the future, the ice driver will support an additional virtualization
scheme known as Scalable IOV, and the code in this file will be used
for this alternative implementation.
Rename this file (and its associated header) to ice_vf_mbx.c, so that we
can later re-use the ice_sriov.c file as the SR-IOV specific file.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Konrad Jankowski <konrad0.jankowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Fix the following coccicheck warning:
drivers/net/ethernet/netronome/nfp/flower/action.c:959:7-69: WARNING avoid newline at end of message in NL_SET_ERR_MSG_MOD
Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220312095823.2425775-1-niklas.soderlund@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We need to sync page pool stats only for active channels. Reading ethtool
stats on a down netdev, or on a netdev with a modified number of
channels, will result in a use-after-free, trying to access page pools
that are already freed.
BUG: KASAN: use-after-free in mlx5e_stats_grp_sw_update_stats+0x465/0xf80
Read of size 8 at addr ffff888004835e40 by task ethtool/720
Fixes: cc10e84b2e ("mlx5: add support for page_pool_get_stats")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reported-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Joe Damato <jdamato@fastly.com>
Link: https://lore.kernel.org/r/20220312005353.786255-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use kzalloc instead of kmalloc + memset.
The semantic patch that makes this change is:
(https://coccinelle.gitlabpages.inria.fr/website/)
//<smpl>
@@
expression res, size, flag;
@@
- res = kmalloc(size, flag);
+ res = kzalloc(size, flag);
...
- memset(res, 0, size);
//</smpl>
Signed-off-by: Julia Lawall <Julia.Lawall@inria.fr>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20220312102705.71413-3-Julia.Lawall@inria.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch integrates the dpaa2-eth driver with the generic PHY
infrastructure in order to search, find and reconfigure the SerDes lanes
in case of a protocol change.
On the .mac_config() callback, the phy_set_mode_ext() API is called so
that the Lynx 28G SerDes PHY driver can change the lane's configuration.
In the same phylink callback the MC firmware is called so that it
reconfigures the MAC side to run using the new protocol.
The consumer drivers - dpaa2-eth and dpaa2-switch - are updated to call
the newly added dpaa2_mac_start/stop functions, which power on/off the
associated SerDes lane.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The logic to set up the supported interfaces will get annotated based on
what the configuration of the SerDes PLLs supports. Move the current
setup into a separate function just to try to keep it clean.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Retrieve the API version running on the firmware and based on it detect
which features are available for usage.
The first one to be listed is the capability to change the MAC protocol
at runtime.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The MC firmware recently gained a new command which can reconfigure the
running protocol on the underlying MAC. Add this new command which will
be used in the next patches in order to do a major reconfig on the
interface.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The dpmac_get_api_version command will be used in the next patches to
determine if the current firmware is capable or not to change the
Ethernet protocol running on the MAC.
Signed-off-by: Ioana Ciornei <ioana.ciornei@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Follow the established programming model for this driver and provide
shims in the felix DSA driver which call the implementations from the
ocelot switch lib. The ocelot switchdev driver wasn't integrated with
dcbnl due to lack of hardware availability.
The switch doesn't have any fancy QoS classification enabled by default.
The provided getters will create a default-prio app table entry of 0,
and no dscp entry. However, the getters have been made to actually
retrieve the hardware configuration rather than static values, to be
future proof in case DSA will need this information from more call paths.
For default-prio, there is a single field per port, in ANA_PORT_QOS_CFG,
called QOS_DEFAULT_VAL.
DSCP classification is enabled per-port, again via ANA_PORT_QOS_CFG
(field QOS_DSCP_ENA), and individual DSCP values are configured as
trusted or not through register ANA_DSCP_CFG (replicated 64 times).
An untrusted DSCP value falls back to other QoS classification methods.
If trusted, the selected ANA_DSCP_CFG register also holds the QoS class
in the QOS_DSCP_VAL field.
The hardware also supports DSCP remapping (DSCP value X is translated to
DSCP value Y before the QoS class is determined based on the app table
entry for Y) and DSCP packet rewriting. The dcbnl framework, for being
so flexible in other useless areas, doesn't appear to support this.
So this functionality has been left out.
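A minimal sketch of a default-prio getter along these lines; the register helper comes from the ocelot switch lib headers, while the assumed bit position of QOS_DEFAULT_VAL (bits 10:8 here) is an approximation, not the exact field macro:
```c
#include <linux/bits.h>
#include <soc/mscc/ocelot.h>

/* Field position assumed for illustration; the real macro lives in the
 * ocelot register headers.
 */
#define EXAMPLE_QOS_DEFAULT_VAL_M	GENMASK(10, 8)

static int example_port_get_default_prio(struct ocelot *ocelot, int port)
{
	u32 val = ocelot_read_gix(ocelot, ANA_PORT_QOS_CFG, port);

	return (val & EXAMPLE_QOS_DEFAULT_VAL_M) >> 8;	/* QOS_DEFAULT_VAL */
}
```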
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Tony Nguyen says:
====================
ice: GTP support in switchdev
Marcin Szycik says:
Add support for adding GTP-C and GTP-U filters in switchdev mode.
To create a filter for GTP, create a GTP-type netdev with the ip tool, enable
hardware offload, add a qdisc and add a filter in tc:
ip link add $GTP0 type gtp role <sgsn/ggsn> hsize <hsize>
ethtool -K $PF0 hw-tc-offload on
tc qdisc add dev $GTP0 ingress
tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \
action mirred egress redirect dev $VF1_PR
By default, a filter for GTP-U will be added. To add a filter for GTP-C,
specify enc_dst_port = 2123, e.g.:
tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \
enc_dst_port 2123 action mirred egress redirect dev $VF1_PR
Note: outer IPv6 offload is not supported yet.
Note: GTP-U with no payload offload is not supported yet.
ICE COMMS package is required to create a filter as it contains GTP
profiles.
Changes in iproute2 [1] are required to be able to add a GTP netdev and use
GTP-specific options (QFI and PDU type).
[1] https://lore.kernel.org/netdev/20220211182902.11542-1-wojciech.drewek@intel.com/T
---
v2: Add more CC
v3: Fix mail thread, sorry for spam
v4: Add GTP echo response in gtp module
v5: Change patch order
v6: Add GTP echo request in gtp module
v7: Fix kernel-docs in ice
v8: Remove handling of GTP Echo Response
v9: Add sending of multicast message on GTP Echo Response, fix GTP-C dummy
packet selection
v10: Rebase, fixed most 80 char line limits
v11: Rebase, collect Harald's Reviewed-by on patch 3
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Before adding yet another possibly contended atomic_long_t,
it is time to add per-cpu storage for existing ones:
dev->tx_dropped, dev->rx_dropped, and dev->rx_nohandler
Because many devices do not have to increment such counters,
allocate the per-cpu storage on demand, so that dev_get_stats()
does not have to spend considerable time folding zero counters.
Note that some drivers have abused these counters, which
were supposed to be used only by the core networking stack.
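A rough sketch of the on-demand per-cpu scheme described above, using made-up example_* names rather than the exact net_device fields and helpers:
```c
#include <linux/atomic.h>
#include <linux/gfp.h>
#include <linux/percpu.h>

/* Counters live in per-cpu memory that is only allocated the first time a
 * device actually increments one of them.
 */
struct example_core_stats {
	unsigned long rx_dropped;
	unsigned long tx_dropped;
	unsigned long rx_nohandler;
};

struct example_dev {
	struct example_core_stats __percpu *core_stats;	/* NULL until first use */
};

static void example_rx_dropped_inc(struct example_dev *dev)
{
	struct example_core_stats __percpu *p = READ_ONCE(dev->core_stats);

	if (!p) {
		/* Lazy allocation: devices that never drop packets pay
		 * nothing, and the stats-folding loop never walks counters
		 * that are all zero.
		 */
		p = alloc_percpu_gfp(struct example_core_stats,
				     GFP_ATOMIC | __GFP_NOWARN);
		if (!p)
			return;
		if (cmpxchg(&dev->core_stats, NULL, p)) {
			free_percpu(p);		/* another CPU won the race */
			p = READ_ONCE(dev->core_stats);
		}
	}
	this_cpu_inc(p->rx_dropped);
}
```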
v4: should use per_cpu_ptr() in dev_get_stats() (Jakub)
v3: added a READ_ONCE() in netdev_core_stats_alloc() (Paolo)
v2: add a missing include (reported by kernel test robot <lkp@intel.com>)
Change in netdev_core_stats_alloc() (Jakub)
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: jeffreyji <jeffreyji@google.com>
Reviewed-by: Brian Vazquez <brianvv@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20220311051420.2608812-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Enable binding the nfp driver to NFP3800 and NFP3803 devices.
The PCIE_SRAM offset is different for the NFP3800 device, which also
only supports a single explicit group.
Changes to Dirk's work:
* 48-bit dma addressing is not ready yet. Keep 40-bit dma addressing
for NFP3800.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
NFP3800 has slightly different queue controller range bounds.
Use the static chip data instead of defines. This commit
still assumes unchanged descriptor format. Later datapath
changes will allow adjusting for descriptor accounting.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The queue controller (QCP) is accessed based on a device specific
offset. The NFP3800 device also supports more queues.
Furthermore, the NFP3800 VFs also access the QCP differently from how the
NFP6000 VFs access it, though still indirectly. Fortunately, we can
remove the offset altogether for both VF types. This is safe for
NFP6000 VFs since the offset was effectively a wrap around and only used
for convenience to have it set the same as the NFP6000 PF.
Use nfp_dev_info to store queue controller parameters.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for new chips, use dev_info constants instead of defines
to store the DMA mask length.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
NFP3800 uses a different PCIe configuration to CPP expansion BAR offsets.
We don't need to differentiate between the NFP4000, NFP5000 and NFP6000
since they all use the same offsets.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In preparation for supporting a new chip, add a driver data structure
which will hold per-chip-version information such as register
offsets.
Plumb it through to the relevant functions (nfpcore and nfp_net).
For now only a very simple member holding chip names is added,
following commits will add more.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make sure the device ID tables are in ascending order.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The model number for NFP3800 and newer devices can be completely
derived from PluDevice register without subtracting 0x10.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The PCI_DEVICE_ID_NETRONOME_NFP6000_VF define is available for use and
should be used instead of PCI_DEVICE_NFP6000VF. Meanwhile, the
PCI_DEVICE_NFP6000VF PCI ID is removed as it is no longer used.
Signed-off-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Multiple writes cause intermediate pointer values that do not
end on complete TX descriptors.
The QCP peripheral on the NFP provides a number of access
modes. In some access modes, the maximum amount to add must
be restricted to a 6bit value. The particular access mode
used by _nfp_qcp_ptr_add() has no such restrictions, so the
"< NFP_QCP_MAX_ADD" test is unnecessary.
Note that trying to add more than the configured ring size
in a single add will cause a QCP overflow, caught and handled
by the QCP peripheral.
Signed-off-by: Christo du Toit <christo.du.toit@netronome.com>
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
NFP driver ABI contains a bit for ring prioritization which
was never implemented in the initially envisioned form.
Remove it, and open up the possibility of reclaiming it for other uses.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com>
Signed-off-by: Fei Qin <fei.qin@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The print function dev_err() is redundant because platform_get_irq()
already prints an error.
Eliminate the following coccicheck warning:
./drivers/net/ethernet/8390/mcf8390.c:414:2-9: line 414 is redundant
because platform_get_irq() already prints an error
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Link: https://lore.kernel.org/r/20220311001756.12234-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
alx_reinit has a lockdep assertion that the alx->mtx mutex must be held.
alx_reinit is called from two places: alx_reset and alx_change_mtu.
alx_reset does acquire alx->mtx before calling alx_reinit.
alx_change_mtu does not acquire this mutex, nor do its callers or any
path towards alx_change_mtu.
Acquire the mutex in alx_change_mtu.
The issue was introduced when the fine-grained locking was introduced
to the code to replace the RTNL. The same commit also introduced the
lockdep assertion.
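A minimal sketch of the fix, using hypothetical example_* stand-ins for the alx structures and functions:
```c
#include <linux/mutex.h>
#include <linux/netdevice.h>

struct example_priv {
	struct mutex mtx;	/* stands in for alx->mtx */
};

static void example_reinit(struct example_priv *priv)
{
	lockdep_assert_held(&priv->mtx);	/* the assertion the fix satisfies */
	/* ... reset and reconfigure the hardware ... */
}

static int example_change_mtu(struct net_device *netdev, int new_mtu)
{
	struct example_priv *priv = netdev_priv(netdev);

	mutex_lock(&priv->mtx);		/* previously missing on this path */
	netdev->mtu = new_mtu;
	if (netif_running(netdev))
		example_reinit(priv);
	mutex_unlock(&priv->mtx);

	return 0;
}
```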
Fixes: 4a5fe57e77 ("alx: use fine-grained locking instead of RTNL")
Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Link: https://lore.kernel.org/r/20220310232707.44251-1-dossche.niels@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for creating filters for GTP-U and GTP-C in switchdev mode. Add
support for parsing GTP-specific options (QFI and PDU type) and TEID.
By default, a filter for GTP-U will be added. To add a filter for GTP-C,
specify enc_dst_port = 2123, e.g.:
tc filter add dev $GTP0 ingress prio 1 flower enc_key_id 1337 \
enc_dst_port 2123 action mirred egress redirect dev $VF1_PR
Note: GTP-U with outer IPv6 offload is not supported yet.
Note: GTP-U with no payload offload is not supported yet.
Signed-off-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Checking only protocol ids while searching for correct FVs can lead to a
situation where an incorrect FV is added to the list. Incorrect means
that the FV has a correct protocol id but an incorrect offset.
Call ice_get_sw_fv_list with the ice_prot_lkup_ext struct which contains
all protocol ids with offsets.
With this modification, allocating and collecting the protocol ids list is
no longer needed.
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Tested-by: Sandeep Penigalapati <sandeep.penigalapati@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
When doing manual injection of a frame, it is required to check if the
TX FIFO is ready to accept the next word of the frame. For this we use
'readx_poll_timeout_atomic'; the only problem is that before it
actually checks the status, it determines the time at which to stop polling
the status, which seems to be an expensive operation.
Therefore check the status of the TX FIFO before calling
'readx_poll_timeout_atomic'.
Doing this will improve the TX bitrate by ~70%, because 99% of the time
the FIFO is already ready. The measurements were done using iperf3.
Before:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.03 sec 55.2 MBytes 46.2 Mbits/sec 0 sender
[ 5] 0.00-10.04 sec 53.8 MBytes 45.0 Mbits/sec receiver
After:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-10.10 sec 95.0 MBytes 78.9 Mbits/sec 0 sender
[ 5] 0.00-10.11 sec 95.0 MBytes 78.8 Mbits/sec receiver
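A sketch of the fast-path check described above, with made-up example_* helpers standing in for the lan966x register accessors and an assumed "ready" bit; readx_poll_timeout_atomic() is the generic iopoll helper:
```c
#include <linux/bits.h>
#include <linux/io.h>
#include <linux/iopoll.h>
#include <linux/types.h>

/* Made-up port type; inj_status_reg stands in for the injection status
 * register of the device.
 */
struct example_port {
	void __iomem *inj_status_reg;
};

static u32 example_read_inj_status(struct example_port *port)
{
	return readl(port->inj_status_reg);
}

static bool example_inj_ready(u32 val)
{
	return val & BIT(0);	/* assumed "FIFO ready" bit */
}

static int example_wait_tx_fifo_ready(struct example_port *port)
{
	u32 val;

	/* Fast path: the FIFO is almost always ready, so check once before
	 * paying for readx_poll_timeout_atomic()'s timeout bookkeeping.
	 */
	val = example_read_inj_status(port);
	if (example_inj_ready(val))
		return 0;

	/* Slow path: poll with a timeout, as before. */
	return readx_poll_timeout_atomic(example_read_inj_status, port, val,
					 example_inj_ready(val), 10, 50000);
}
```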
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
Remove dev_err() messages after platform_get_irq*() failures.
platform_get_irq() already prints an error.
Generated by: scripts/coccinelle/api/platform_get_irq.cocci
Signed-off-by: Yihao Han <hanyihao@vivo.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
It is not recommended to use platform_get_resource(pdev, IORESOURCE_IRQ)
for requesting IRQ resources any more, as they may not be ready yet in
the case of DT booting.
platform_get_irq() is instead the recommended way of getting an IRQ, even if
it was not retrieved earlier.
It also makes the code simpler because we get an "int" value right away
and no conversion from resource to int is required.
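An illustrative before/after of the conversion, in a hypothetical probe function:
```c
#include <linux/platform_device.h>

/* Hypothetical probe showing the shape of the conversion. */
static int example_probe(struct platform_device *pdev)
{
	int irq;

	/* Before the change, the driver did roughly:
	 *	res = platform_get_resource(pdev, IORESOURCE_IRQ, 0);
	 *	irq = res->start;
	 * which may fail on DT systems where the IRQ resource is not
	 * populated at this point.
	 */
	irq = platform_get_irq(pdev, 0);
	if (irq < 0)
		return irq;	/* platform_get_irq() already logs the error */

	/* ... request_irq(irq, ...) etc. ... */
	return 0;
}
```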
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static
allocation of IRQ resources in DT core code, this causes an issue
when using hierarchical interrupt domains using "interrupts" property
in the node as this bypasses the hierarchical setup and messes up the
irq chaining.
In preparation for removal of static setup of IRQ resource from DT core
code use platform_get_irq() for DT users only.
While at it propagate error code in emac_dev_stop() in case
platform_get_irq_optional() fails.
Signed-off-by: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Convert am65-cpsw driver and am65-cpsw ethtool to use Phylink APIs
as described at Documentation/networking/sfp-phylink.rst. All calls
to Phy APIs are replaced with their equivalent Phylink APIs.
No functional change intended. Use Phylink instead of conventional
Phylib, in preparation to add support for SGMII/QSGMII modes.
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Unlike the legacy EEPROM callbacks, when using the netlink EEPROM query
(get_module_eeprom_by_page) the driver should not try to validate the
query parameters, but just perform the read requested by the userspace.
Recent discussion in the mailing list:
https://lore.kernel.org/netdev/20220120093051.70845141@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net/
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The assumption that the first byte in the module mapping dword is the
module number shouldn't be hard-coded in the driver, but come from
mlx5_ifc structs.
While at it, fix the incorrect width for the 'rx_lane' and 'tx_lane'
fields.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The MCIA register supports either 12 or 32 dwords, use the correct value
by querying the capability from the MCAM register.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
SMFS dr matchers are processed sequentially in hardware according to
their priorities, and not skipped if empty.
Currently, smfs ct fs creates four predefined dr matchers per ct
table (ct/ct nat) with hardcoded priority. Compared to dmfs ct fs
using autogroups, this might cause additional hops in the fastpath for
traffic patterns that match later priorities, even if previous
priorities are empty, e.g. a user only using ipv6 UDP traffic will
have 3 additional hops.
Create the matchers dynamically, using the highest priority available,
on first rule usage, and remove them on last usage.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The fs_core layer adds extra bookkeeping that is either unneeded for CT, or
unused by the underlying software steering, such as allocating FTEs and
FTE ids, saving the match key and mask, and autogroups management.
On top of that, direct steering has a translation layer (fs_dr) from PRM
commands to direct steering objects, for example, creating temporary
dr_action objects. This has a performance impact when dealing
with CT high insertion rate.
To use direct steering (smfs) directly for ct, add a tc ct fs smfs
implementation. Instead of dmfs autogroups, smfs ct fs uses one of 4
predefined dr matchers in CT and CT-NAT tables, for each combination
of tuple ethertype (ipv4/ipv6), and tuple ip_proto (udp/tcp) that
is currently used by nf flow table flow offload.
At rule insertion, validate that the flow rule fits one of the predefined
matchers, and insert into it.
To fill the dr_actions of the rule efficiently, create the fwd to post_ct
tbl dr_action at fs init, the count dr_action at counter creation,
and re-use the already pre-allocated modify header dr_action.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add a thin layer that exports selected direct steering (dr) API
which will be used by a ct fs implementation in a following
patch.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
If sw steering was used to create the table, dr steering fs creates
a backing dr table for the mlx5 flow table.
Add helper to return this table so it can be used to create matchers and
add rules on it directly instead of passing via eswitch_offloads/fs_core
insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Currently, the fs_core layer provides flow steering services to the driver
including: autogroups, allocating FTEs (flow table entries) and FTE ids,
and support for fte action modification. If software steering is then
configured, rule insertion will go through a translation layer from
firmware buffers to software steering objects (see fs_dr.c).
The connection tracking table is a system table that is not directly
controlled by the user and is a very high scale table. These fs_core
services introduce an overhead that may be optimized by using the software
steering API directly.
Introduce ct flow steering interface to allow multiple flow steering
providers. Use the new interface to implement the current dmfs (device
managed flow steering) provider which uses fs_core insertion.
Signed-off-by: Paul Blakey <paulb@nvidia.com>
Reviewed-by: Oz Shlomo <ozsh@nvidia.com>
Reviewed-by: Roi Dayan <roid@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The function is node-aware and gets the node as an argument.
Use a node-aware allocation for the doorbell pgdir structure.
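A minimal sketch of a node-aware allocation of this kind, with a made-up example_pgdir type rather than the real mlx5 structure:
```c
#include <linux/slab.h>

/* Made-up structure standing in for the doorbell pgdir. */
struct example_pgdir {
	unsigned long *bitmap;
	void *db_page;
};

static struct example_pgdir *example_alloc_pgdir(int node)
{
	/* allocate on the NUMA node the caller already knows about */
	return kzalloc_node(sizeof(struct example_pgdir), GFP_KERNEL, node);
}
```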
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
There is no need to include module.h in the following files.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Previous commits introduced AF_XDP zero-copy support, in which
we need to register a different mem model for xdp_rxq depending on
whether AF_XDP zero-copy is enabled or not. This should be done after the
xdp_rxq info is registered, which is not needed for the ctrl port;
otherwise the warning "Missing register, driver bug" is emitted.
Fix this by not registering the mem model for the ctrl port, just like we
don't register xdp_rxq info for the ctrl port.
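A sketch of the rule described above, with a made-up example_* ring type; xdp_rxq_info_reg_mem_model() and the memory-model types are the generic XDP API:
```c
#include <linux/types.h>
#include <net/xdp.h>

/* Made-up ring type; only the xdp_rxq member matters for the sketch. */
struct example_rx_ring {
	struct xdp_rxq_info xdp_rxq;
};

static int example_reg_mem_model(struct example_rx_ring *rx_ring,
				 bool is_ctrl_port, bool xsk)
{
	/* The ctrl port never registered xdp_rxq info, so it must not
	 * register a memory model either.
	 */
	if (is_ctrl_port)
		return 0;

	return xdp_rxq_info_reg_mem_model(&rx_ring->xdp_rxq,
					  xsk ? MEM_TYPE_XSK_BUFF_POOL :
						MEM_TYPE_PAGE_ORDER0,
					  NULL);
}
```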
Fixes: 6402528b7a ("nfp: xsk: add AF_XDP zero-copy Rx and Tx support")
Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Reviewed-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220309135533.10162-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
100GbE Intel Wired LAN Driver Updates 2022-03-09
This series contains updates to ice driver only.
Martyna implements switchdev filtering on inner EtherType field for
tunnels.
Marcin adds reporting of slowpath statistics for port representors.
Jonathan Toppins changes a non-fatal link error message from warning to
debug.
Maciej removes unnecessary checks in ice_clean_tx_irq().
Amritha adds support for ADQ to match outer destination MAC for tunnels.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
ice: Add support for outer dest MAC for ADQ tunnels
ice: avoid XDP checks in ice_clean_tx_irq()
ice: change "can't set link" message to dbg level
ice: Add slow path offload stats on port representor in switchdev
ice: Add support for inner etype in switchdev
====================
Link: https://lore.kernel.org/r/20220309190315.1380414-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 5dbbbd01cb ("ice: Avoid RTNL lock when re-creating
auxiliary device") changes a process of re-creation of aux device
so ice_plug_aux_dev() is called from ice_service_task() context.
This unfortunately opens a race window that can result in dead-lock
when interface has left LAG and immediately enters LAG again.
Reproducer:
```
#!/bin/sh
ip link add lag0 type bond mode 1 miimon 100
ip link set lag0 up
for n in {1..10}; do
echo Cycle: $n
ip link set ens7f0 master lag0
sleep 1
ip link set ens7f0 nomaster
done
```
This results in:
[20976.208697] Workqueue: ice ice_service_task [ice]
[20976.213422] Call Trace:
[20976.215871] __schedule+0x2d1/0x830
[20976.219364] schedule+0x35/0xa0
[20976.222510] schedule_preempt_disabled+0xa/0x10
[20976.227043] __mutex_lock.isra.7+0x310/0x420
[20976.235071] enum_all_gids_of_dev_cb+0x1c/0x100 [ib_core]
[20976.251215] ib_enum_roce_netdev+0xa4/0xe0 [ib_core]
[20976.256192] ib_cache_setup_one+0x33/0xa0 [ib_core]
[20976.261079] ib_register_device+0x40d/0x580 [ib_core]
[20976.266139] irdma_ib_register_device+0x129/0x250 [irdma]
[20976.281409] irdma_probe+0x2c1/0x360 [irdma]
[20976.285691] auxiliary_bus_probe+0x45/0x70
[20976.289790] really_probe+0x1f2/0x480
[20976.298509] driver_probe_device+0x49/0xc0
[20976.302609] bus_for_each_drv+0x79/0xc0
[20976.306448] __device_attach+0xdc/0x160
[20976.310286] bus_probe_device+0x9d/0xb0
[20976.314128] device_add+0x43c/0x890
[20976.321287] __auxiliary_device_add+0x43/0x60
[20976.325644] ice_plug_aux_dev+0xb2/0x100 [ice]
[20976.330109] ice_service_task+0xd0c/0xed0 [ice]
[20976.342591] process_one_work+0x1a7/0x360
[20976.350536] worker_thread+0x30/0x390
[20976.358128] kthread+0x10a/0x120
[20976.365547] ret_from_fork+0x1f/0x40
...
[20976.438030] task:ip state:D stack: 0 pid:213658 ppid:213627 flags:0x00004084
[20976.446469] Call Trace:
[20976.448921] __schedule+0x2d1/0x830
[20976.452414] schedule+0x35/0xa0
[20976.455559] schedule_preempt_disabled+0xa/0x10
[20976.460090] __mutex_lock.isra.7+0x310/0x420
[20976.464364] device_del+0x36/0x3c0
[20976.467772] ice_unplug_aux_dev+0x1a/0x40 [ice]
[20976.472313] ice_lag_event_handler+0x2a2/0x520 [ice]
[20976.477288] notifier_call_chain+0x47/0x70
[20976.481386] __netdev_upper_dev_link+0x18b/0x280
[20976.489845] bond_enslave+0xe05/0x1790 [bonding]
[20976.494475] do_setlink+0x336/0xf50
[20976.502517] __rtnl_newlink+0x529/0x8b0
[20976.543441] rtnl_newlink+0x43/0x60
[20976.546934] rtnetlink_rcv_msg+0x2b1/0x360
[20976.559238] netlink_rcv_skb+0x4c/0x120
[20976.563079] netlink_unicast+0x196/0x230
[20976.567005] netlink_sendmsg+0x204/0x3d0
[20976.570930] sock_sendmsg+0x4c/0x50
[20976.574423] ____sys_sendmsg+0x1eb/0x250
[20976.586807] ___sys_sendmsg+0x7c/0xc0
[20976.606353] __sys_sendmsg+0x57/0xa0
[20976.609930] do_syscall_64+0x5b/0x1a0
[20976.613598] entry_SYSCALL_64_after_hwframe+0x65/0xca
1. Command 'ip link ... set nomaster' causes ice_plug_aux_dev() to be
called from ice_service_task() context, the aux device is created
and the associated device->lock is taken.
2. Command 'ip link ... set master...' calls ice's notifier under
RTNL lock and that notifier calls ice_unplug_aux_dev(). That
function tries to take aux device->lock but this is already taken
by ice_plug_aux_dev() in step 1
3. Later ice_plug_aux_dev() tries to take RTNL lock but this is already
taken in step 2
4. Dead-lock
The patch fixes this issue by the following changes:
- Bit ICE_FLAG_PLUG_AUX_DEV is kept to be set during ice_plug_aux_dev()
call in ice_service_task()
- The bit is checked in ice_clear_rdma_cap() and only if it is not set
then ice_unplug_aux_dev() is called. If it is set (in other words
plugging of aux device was requested and ice_plug_aux_dev() is
potentially running) then the function only clears the bit
- Once ice_plug_aux_dev() call (in ice_service_task) is finished
the bit ICE_FLAG_PLUG_AUX_DEV is cleared but it is also checked
whether it was already cleared by ice_clear_rdma_cap(). If so then
aux device is unplugged.
Signed-off-by: Ivan Vecera <ivecera@redhat.com>
Co-developed-by: Petr Oros <poros@redhat.com>
Signed-off-by: Petr Oros <poros@redhat.com>
Reviewed-by: Dave Ertman <david.m.ertman@intel.com>
Link: https://lore.kernel.org/r/20220310171641.3863659-1-ivecera@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some of the bcmgenet platforms don't correctly support WOL, yet
ethtool returns:
"Supports Wake-on: gsf"
which is false.
Ideally, if there isn't a wol_irq, or there is something else that
keeps the device from being able to wake up, it should display:
"Supports Wake-on: d"
This patch checks whether the device can wake up before using the
hard-coded supported flags. This corrects the ethtool reporting, as
well as the WOL configuration because ethtool verifies that the mode
is supported before attempting it.
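A simplified sketch of the .get_wol check described above (not the exact bcmgenet code); device_can_wakeup() and the WAKE_* flags are the generic kernel/ethtool APIs:
```c
#include <linux/ethtool.h>
#include <linux/netdevice.h>
#include <linux/pm_wakeup.h>

/* If the underlying device cannot wake the system, advertise no WoL modes. */
static void example_get_wol(struct net_device *dev, struct ethtool_wolinfo *wol)
{
	struct device *kdev = dev->dev.parent;

	if (!device_can_wakeup(kdev)) {
		wol->supported = 0;
		wol->wolopts = 0;
		return;		/* ethtool then reports "Supports Wake-on: d" */
	}

	wol->supported = WAKE_MAGIC | WAKE_MAGICSECURE | WAKE_FILTER;
	/* ... read back the currently enabled modes as before ... */
}
```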
Fixes: c51de7f397 ("net: bcmgenet: add Wake-on-LAN support code")
Signed-off-by: Jeremy Linton <jeremy.linton@arm.com>
Tested-by: Peter Robinson <pbrobinson@gmail.com>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Link: https://lore.kernel.org/r/20220310045535.224450-1-jeremy.linton@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If bus->state is equal to MDIOBUS_ALLOCATED, mdiobus_free(bus) will free
the "bus". But bus->name is still used in the next line, which will lead
to a use-after-free.
We can fix it by putting the name in a local variable and making
bus->name point to the rodata string "name", then using the local name in
the error message without referring to bus, to avoid the use-after-free.
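A simplified sketch of the pattern (not the exact arc_emac code, which registers through of_mdiobus_register()): the bus name stays reachable through a local variable, so the error message never dereferences the freed bus:
```c
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/phy.h>

static int example_mdio_probe(struct device *dev)
{
	const char *name = "Synopsys MII Bus";	/* rodata string */
	struct mii_bus *bus;
	int error;

	bus = mdiobus_alloc();
	if (!bus)
		return -ENOMEM;
	bus->name = name;

	error = mdiobus_register(bus);
	if (error) {
		mdiobus_free(bus);	/* "bus" (and bus->name) is gone now */
		/* use the local copy of the name, not bus->name */
		return dev_err_probe(dev, error,
				     "cannot register MDIO bus %s\n", name);
	}

	return 0;
}
```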
Fixes: 95b5fc03c1 ("net: arc_emac: Make use of the helper function dev_err_probe()")
Signed-off-by: Jianglei Nie <niejianglei2021@163.com>
Link: https://lore.kernel.org/r/20220309121824.36529-1-niejianglei2021@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Merge tag 'mlx5-updates-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
====================
mlx5-updates-2022-03-09
1) Remove kernel log prints on FW events regarding FW pages management
and replace that with debugfs entries to track FW pages management commands
failures and general stats, we do that for all FW commands in general since
it's the same effort to do so under the already existing debugfs entry for
FW commands.
2) Add support for ConnectX-7 Software managed steering, in other words STEv2
which shares a lot in common with STE V1, the difference is in specific
offsets in the devices, the logic is almost the same, thus we implement
STEv1 and STEv2 in the same file.
* tag 'mlx5-updates-2022-03-09' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
net/mlx5: DR, Add support for ConnectX-7 steering
net/mlx5: DR, Refactor ste_ctx handling for STE v0/1
net/mlx5: DR, Rename action modify fields to reflect naming in HW spec
net/mlx5: DR, Fix handling of different actions on the same STE in STEv1
net/mlx5: DR, Remove unneeded comments
net/mlx5: DR, Add support for matching on Internet Header Length (IHL)
net/mlx5: DR, Align mlx5dv_dr API vport action with FW behavior
net/mlx5: Add debugfs counters for page commands failures
net/mlx5: Add pages debugfs
net/mlx5: Move debugfs entries to separate struct
net/mlx5: Change release_all_pages cap bit location
net/mlx5: Remove redundant error on reclaim pages
net/mlx5: Remove redundant error on give pages
net/mlx5: Remove redundant notify fail on give pages
net/mlx5: Add command failures data to debugfs
net/mlx5e: TC, Fix use after free in mlx5e_clone_flow_attr_for_post_act()
====================
Link: https://lore.kernel.org/r/20220309213755.610202-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The of_find_compatible_node() function returns a node pointer with
its refcount incremented. We should use of_node_put() on it when done.
Add the missing of_node_put() to release the refcount.
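A minimal sketch of the pattern; the lookup target matches the gianfar PTP node, but the body is simplified compared to gianfar_ethtool:
```c
#include <linux/errno.h>
#include <linux/of.h>

/* of_find_compatible_node() returns the node with its refcount elevated,
 * so it must be balanced with of_node_put() once we are done with it.
 */
static int example_get_phc_index(void)
{
	struct device_node *ptp_node;

	ptp_node = of_find_compatible_node(NULL, NULL, "fsl,etsec-ptp");
	if (!ptp_node)
		return -ENODEV;

	/* ... look up the PTP platform device / PHC index here ... */

	of_node_put(ptp_node);	/* release the reference taken by the lookup */
	return 0;
}
```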
Fixes: 7349a74ea7 ("net: ethernet: gianfar_ethtool: get phc index through drvdata")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Link: https://lore.kernel.org/r/20220310015313.14938-1-linmq006@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use napi_alloc_skb to allocate memory when refilling the RX ring
in axienet_poll for more efficiency. napi_alloc_skb() can reuse the
softirq-local cache of freed skbs, which may still be cache-warm,
skipping allocator calls.
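A minimal sketch of the refill change, with a made-up helper name; napi_alloc_skb() is the generic networking API:
```c
#include <linux/netdevice.h>
#include <linux/skbuff.h>

/* Allocate from the NAPI per-CPU skb cache while still in softirq context,
 * which can reuse recently freed, still cache-warm skbs.
 */
static struct sk_buff *example_rx_refill(struct napi_struct *napi,
					 unsigned int len)
{
	return napi_alloc_skb(napi, len);
}
```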
Signed-off-by: Robert Hancock <robert.hancock@calian.com>
Link: https://lore.kernel.org/r/20220308211013.1530955-1-robert.hancock@calian.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Include the TLS headers unconditionally and define driver TLS symbols
used in code compiled also when CONFIG_TLS_DEVICE=n to fix the
following errors:
../drivers/net/ethernet/fungible/funeth/funeth_tx.c: In function ‘write_pkt_desc’:
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:244:13: error: implicit declaration of function ‘tls_driver_ctx’ [-Werror=implicit-function-declaration]
244 | tls_ctx = tls_driver_ctx(skb->sk, TLS_OFFLOAD_CTX_DIR_TX);
| ^~~~~~~~~~~~~~
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:244:37: error: ‘TLS_OFFLOAD_CTX_DIR_TX’ undeclared (first use in this function)
244 | tls_ctx = tls_driver_ctx(skb->sk, TLS_OFFLOAD_CTX_DIR_TX);
| ^~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:244:37: note: each undeclared identifier is reported only once for each function it appears in
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:245:23: error: dereferencing pointer to incomplete type ‘struct fun_ktls_tx_ctx’
245 | tls->tlsid = tls_ctx->tlsid;
| ^~
../drivers/net/ethernet/fungible/funeth/funeth_tx.c: In function ‘fun_start_xmit’:
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:310:6: error: implicit declaration of function ‘tls_is_sk_tx_device_offloaded’ [-Werror=implicit-function-declaration]
310 | tls_is_sk_tx_device_offloaded(skb->sk)) {
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:311:9: error: implicit declaration of function ‘fun_tls_tx’; did you mean ‘fun_xdp_tx’? [-Werror=implicit-function-declaration]
311 | skb = fun_tls_tx(skb, q, &tls_len);
| ^~~~~~~~~~
| fun_xdp_tx
../drivers/net/ethernet/fungible/funeth/funeth_tx.c:311:7: warning: assignment to ‘struct sk_buff *’ from ‘int’ makes pointer from integer without a cast [-Wint-conversion]
311 | skb = fun_tls_tx(skb, q, &tls_len);
| ^
Fixes: db37bc177d ("net/funeth: add the data path")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Dimitris Michailidis <dmichail@fungible.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This reverts:
commit 02acd39953 ("bnxt_en: parse result field when NVRAM package install fails")
commit 22f5dba506 ("bnxt_en: add an nvm test for hw diagnose")
commit bafed3f231 ("bnxt_en: implement hw health reporter")
These patches are still under discussion / I don't think they
are right, and since the authors don't reply promptly let me
lessen my load of "things I need to resolve before next release"
and revert them.
Acked-by: Michael Chan <michael.chan@broadcom.com>
Link: https://lore.kernel.org/r/20220308173659.304915-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
If HW doesn't support PTP, then it doesn't support it. This is neither
a problem nor can the user do something about it. Therefore change the
message level to info.
Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Link: https://lore.kernel.org/r/ee685745-f1ab-e9bf-f20e-077d55dff441@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is an occasional suspend error from e1000e which blocks the
system from further suspending. And the issue was found on
a WhiskeyLake-U platform with I219-V:
[ 20.078957] PM: pci_pm_suspend(): e1000e_pm_suspend+0x0/0x780 [e1000e] returns -2
[ 20.078970] PM: dpm_run_callback(): pci_pm_suspend+0x0/0x170 returns -2
[ 20.078974] e1000e 0000:00:1f.6: PM: pci_pm_suspend+0x0/0x170 returned -2 after 371012 usecs
[ 20.078978] e1000e 0000:00:1f.6: PM: failed to suspend async: error -2
According to the code flow, this might be caused by broken MDI read/write
to PHY registers. However currently the code does not tell us which
register is broken. Thus enhance the debug information to print the
offending PHY register. So the next time the issue is reproduced, this
information can be used to narrow it down.
Acked-by: Paul Menzel <pmenzel@molgen.mpg.de>
Reported-by: Todd Brandt <todd.e.brandt@intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Tested-by: Naama Meir <naamax.meir@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://lore.kernel.org/r/20220308172030.451566-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for a new SW format version that is implemented by
ConnectX-7.
Except for several differences, the STEv2 is identical to STEv1, so for
most callbacks the STEv2 context struct will call STEv1 functions.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
As preparation for supporting ConnectX-7, this patch changes the
handling of ste_ctx for the existing STE v0 and v1:
- each context is now a static struct, and it has a corresponding getter
- v0 and v1 were extended to contain the fields that are required for
integrating STEv2.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
As preparation for supporting ConnectX-7, rename the action modify fields'
steering registers from arbitrary names to names that reflect the
corresponding naming and location of the steering registers in HW.
The mapping of these registers has changed in ConnectX-7, so the renaming
makes it easier to keep track of their mapping.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Fix handling of various conditions in set_actions_rx/tx that check
whether different actions can be on the same STE.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove two comments that were erroneously left in the code.
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add support for matching on new field - Internet Header Length (IHL).
Signed-off-by: Muhammad Sammar <muhammads@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
This aligns the behavior with FW when creating an FDB rule with wire
vport destination but no source port matching. Until now such rules
would fail on internal DR RX rule creation since the source and
destination are the wire vport.
The new behavior is the same as done on FW steering, if destination is
wire, we will create both TX and RX rules, but the RX packet coming from
wire will be dropped due to loopback not supported.
Signed-off-by: Shun Hao <shunh@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add the following new debugfs counters for debug and verbosity:
fw_pages_alloc_failed - number of pages FW requested but driver failed
to allocate.
give_pages_dropped - number of pages given to FW, but command give pages
failed by FW.
reclaim_pages_discard - number of pages which were about to reclaim back
and FW failed the command.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add pages debugfs to expose the following counters for debuggability:
fw_pages_total - How many pages were given to FW and not returned yet.
vfs_pages - For SRIOV, how many pages were given to FW for virtual
functions usage.
host_pf_pages - For ECPF, how many pages were given to FW for external
hosts physical functions usage.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Move the debugfs entry pointers under priv to their own struct.
Add get function for device debugfs root.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
If reclaim pages was triggered by an FW event and FW failed the command,
the driver should ignore it, as FW is aware and will handle it.
The downstream patch will add a debugfs counter on this flow for
debuggability.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
If give pages was triggered by an FW event and FW failed the command,
the driver should ignore it, as FW is aware and will handle it.
The downstream patch will add a debugfs counter on this flow for
debuggability.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
If the give pages command was failed by FW, there is no need to notify the FW of
the failure. FW is aware and will handle it.
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add new counters to command interface debugfs to count command failures.
The following counters added:
total_failed - number of times command failed (any kind of failure).
failed_mbox_status - number of times command failed on bad status
returned by FW.
In addition, add data about last command failure to command interface
debugfs:
last_failed_errno - last command failed returned errno.
last_failed_mbox_status - last bad status returned by FW.
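A sketch of how counters like these might be wired up with the standard debugfs helpers; the struct and directory layout here are illustrative, not the actual mlx5 layout:
```c
#include <linux/debugfs.h>
#include <linux/types.h>

struct example_cmd_stats {
	u64 total_failed;
	u64 failed_mbox_status;
	u32 last_failed_errno;
	u8  last_failed_mbox_status;
};

/* Expose the counters under an existing command-interface debugfs dir. */
static void example_cmd_stats_debugfs(struct dentry *cmd_dir,
				      struct example_cmd_stats *stats)
{
	debugfs_create_u64("total_failed", 0400, cmd_dir,
			   &stats->total_failed);
	debugfs_create_u64("failed_mbox_status", 0400, cmd_dir,
			   &stats->failed_mbox_status);
	debugfs_create_u32("last_failed_errno", 0400, cmd_dir,
			   &stats->last_failed_errno);
	debugfs_create_u8("last_failed_mbox_status", 0400, cmd_dir,
			  &stats->last_failed_mbox_status);
}
```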
Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>