Commit Graph

1473 Commits

Author SHA1 Message Date
Edward Cree
befcc1fce5 sfc: fix use-after-free in efx_tc_flower_record_encap_match()
When writing error messages to extack for pseudo collisions, we can't
 use encap->type as encap has already been freed.  Fortunately the
 same value is stored in local variable em_type, so use that instead.

Fixes: 3c9561c0a5 ("sfc: support TC decap rules matching on enc_ip_tos")
Reported-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-13 20:58:17 +01:00
Edward Cree
b6583d5e9e sfc: support TC decap rules matching on enc_src_port
Allow efx_tc_encap_match entries to include a udp_sport and a
 udp_sport_mask.  As with enc_ip_tos, use pseudos to enforce that all
 encap matches within a given <src_ip,dst_ip,udp_dport> tuple have
 the same udp_sport_mask.
Note that since we use a single layer of pseudos for both fields, two
 matches that differ in (say) udp_sport value aren't permitted to have
 different ip_tos_mask, even though this would technically be safe.
Current userland TC does not support setting enc_src_port; this patch
 was tested with an iproute2 patched to support it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12 10:37:02 +01:00
Edward Cree
3c9561c0a5 sfc: support TC decap rules matching on enc_ip_tos
Allow efx_tc_encap_match entries to include an ip_tos and ip_tos_mask.
To avoid partially-overlapping Outer Rules (which can lead to undefined
 behaviour in the hardware), store extra "pseudo" entries in our
 encap_match hashtable, which are used to enforce that all Outer Rule
 entries within a given <src_ip,dst_ip,udp_dport> tuple (or IPv6
 equivalent) have the same ip_tos_mask.
The "direct" encap_match entry takes a reference on the "pseudo",
 allowing it to be destroyed when all "direct" entries using it are
 removed.
efx_tc_em_pseudo_type is an enum rather than just a bool because in
 future an additional pseudo-type will be added to support Conntrack
 offload.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12 10:37:02 +01:00
Edward Cree
56beb35d85 sfc: populate enc_ip_tos matches in MAE outer rules
Currently tc.c will block them before they get here, but following
 patch will change that.
Use the extack message from efx_mae_check_encap_match_caps() instead
 of writing a new one, since there's now more being fed in than just
 an IP version.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12 10:37:02 +01:00
Edward Cree
28fa3ac487 sfc: release encap match in efx_tc_flow_free()
When force-freeing leftover entries from our match_action_ht, call
 efx_tc_delete_rule(), which releases all the rule's resources, rather
 than open-coding it.  The open-coded version was missing a call to
 release the rule's encap match (if any).
It probably doesn't matter as everything's being torn down anyway, but
 it's cleaner this way and prevents further error messages potentially
 being logged by efx_tc_encap_match_free() later on.
Move efx_tc_flow_free() further down the file to avoid introducing a
 forward declaration of efx_tc_delete_rule().

Fixes: 17654d84b4 ("sfc: add offloading of 'foreign' TC (decap) rules")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12 10:37:02 +01:00
Pieter Jansen van Vuuren
134120b066 sfc: disable RXFCS and RXALL features by default
By default we would not want RXFCS and RXALL features enabled as they are
mainly intended for debugging purposes. This does not stop users from
enabling them later on as needed.

Fixes: 8e57daf706 ("sfc_ef100: RX path for EF100")
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-12 10:00:48 +01:00
Andy Moreton
281900a923 sfc: Fix module EEPROM reporting for QSFP modules
The sfc driver does not report QSFP module EEPROM contents correctly
as only the first page is fetched from hardware.

Commit 0e1a2a3e6e ("ethtool: Add SFF-8436 and SFF-8636 max EEPROM
length definitions") added ETH_MODULE_SFF_8436_MAX_LEN for the overall
size of the EEPROM info, so use that to report the full EEPROM contents.

Fixes: 9b17010da5 ("sfc: Add ethtool -m support for QSFP modules")
Signed-off-by: Andy Moreton <andy.moreton@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-01 07:34:04 +01:00
Vladimir Oltean
1f5020acb3 net: vlan: introduce skb_vlan_eth_hdr()
Similar to skb_eth_hdr() introduced in commit 96cc4b6958 ("macvlan: do
not assume mac_header is set in macvlan_broadcast()"), let's introduce a
skb_vlan_eth_hdr() helper which can be used in TX-only code paths to get
to the VLAN header based on skb->data rather than based on the
skb_mac_header(skb).

We also consolidate the drivers that dereference skb->data to go through
this helper.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-23 14:16:44 +01:00
Jakub Kicinski
681c5b51dc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Adjacent changes:

net/mptcp/protocol.h
  63740448a3 ("mptcp: fix accept vs worker race")
  2a6a870e44 ("mptcp: stops worker on unaccepted sockets at listener close")
  ddb1a072f8 ("mptcp: move first subflow allocation at mpc access time")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-20 16:29:51 -07:00
Ding Hui
a80bb8e723 sfc: Fix use-after-free due to selftest_work
There is a use-after-free scenario that is:

When the NIC is down, user set mac address or vlan tag to VF,
the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
and efx_net_open(), since netif_running() is false, the port will not
start and keep port_enabled false, but selftest_work is scheduled
in efx_net_open().

If we remove the device before selftest_work run, the efx_stop_port()
will not be called since the NIC is down, and then efx is freed,
we will soon get a UAF in run_timer_softirq() like this:

[ 1178.907941] ==================================================================
[ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
[ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
[ 1178.907950]
[ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
[ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
[ 1178.907955] Call Trace:
[ 1178.907956]  <IRQ>
[ 1178.907960]  dump_stack+0x71/0xab
[ 1178.907963]  print_address_description+0x6b/0x290
[ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
[ 1178.907967]  kasan_report+0x14a/0x2b0
[ 1178.907968]  run_timer_softirq+0xdea/0xe90
[ 1178.907971]  ? init_timer_key+0x170/0x170
[ 1178.907973]  ? hrtimer_cancel+0x20/0x20
[ 1178.907976]  ? sched_clock+0x5/0x10
[ 1178.907978]  ? sched_clock_cpu+0x18/0x170
[ 1178.907981]  __do_softirq+0x1c8/0x5fa
[ 1178.907985]  irq_exit+0x213/0x240
[ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
[ 1178.907989]  apic_timer_interrupt+0xf/0x20
[ 1178.907990]  </IRQ>
[ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370

If the NIC is not actually brought up, there is no need to schedule
selftest_work, so let's move invoking efx_selftest_async_start()
into efx_start_all(), and it will be canceled by broughting down.

Fixes: dd40781e3a ("sfc: Run event/IRQ self-test asynchronously when interface is brought up")
Fixes: e340be9230 ("sfc: add ndo_set_vf_mac() function for EF10")
Debugged-by: Huang Cun <huangcun@sangfor.com.cn>
Cc: Donglin Peng <pengdonglin@sangfor.com.cn>
Suggested-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-04-17 08:22:26 +01:00
Íñigo Huguet
ad47655ead sfc: remove expired unicast PTP filters
Filters inserted to support unicast PTP mode might become unused after
some time, so we need to remove them to avoid accumulating many of them.

Refresh the expiration time of a filter each time it's used. Then check
periodically if any filter hasn't been used for a long time (30s) and
remove it.

Reported-by: Yalin Li <yalli@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-03 19:02:51 -07:00
Íñigo Huguet
49ed35a0b6 sfc: support unicast PTP
When sending a PTP event packet, add the correct filters that will make
that future incoming unicast PTP event packets will be timestamped.
The unicast address for the filter is gotten from the outgoing skb
before sending it.

Until now they were not timestamped because only filters that match with
the PTP multicast addressed were being configured into the NIC for the
PTP special channel. Packets received through different channels are not
timestamped, getting "received SYNC without timestamp" error in ptp4l.

Note that the inserted filters are never removed unless the NIC is stopped
or reconfigured, so efx_ptp_stop is called. Removal of old filters will
be handled by the next patch.

Additionally, cleanup a bit efx_ptp_xmit_skb_mc to use the reverse xmas
tree convention and remove an unnecessary assignment to rc variable in
void function.

Reported-by: Yalin Li <yalli@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-03 19:02:51 -07:00
Íñigo Huguet
75687cd066 sfc: allow insertion of filters for unicast PTP
Add a second list for unicast filters and generalize the
efx_ptp_insert/remove_filters functions to allow acting in any of the 2
lists.

No filters for unicast are inserted yet. That will be done in the next
patch.

The reason to use 2 different lists instead of a single one is that, in
next patches, we will want to check if unicast filters are already added
and if they're expired. We don't need that for multicast filters.

Reported-by: Yalin Li <yalli@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-03 19:02:51 -07:00
Íñigo Huguet
e790fc15bf sfc: store PTP filters in a list
Instead of using a fixed sized array for the PTP filters, use a list.

This is not actually necessary at this point because the filters for
multicast PTP are a fixed number, but this is a preparation for the
following patches adding support for unicast PTP.

To avoid confusion with the new struct type efx_ptp_rxfilter, change the
name of some local variables from rxfilter to spec, given they're of the
type efx_filter_spec.

Reported-by: Yalin Li <yalli@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-03 19:02:51 -07:00
Jakub Kicinski
79548b7984 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Conflicts:

drivers/net/ethernet/mediatek/mtk_ppe.c
  3fbe4d8c0e ("net: ethernet: mtk_eth_soc: ppe: add support for flow accounting")
  924531326e ("net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-30 14:43:03 -07:00
Edward Cree
17654d84b4 sfc: add offloading of 'foreign' TC (decap) rules
A 'foreign' rule is one for which the net_dev is not the sfc netdevice
 or any of its representors.  The driver registers indirect flow blocks
 for tunnel netdevs so that it can offload decap rules.  For example:

    tc filter add dev vxlan0 parent ffff: protocol ipv4 flower \
        enc_src_ip 10.1.0.2 enc_dst_ip 10.1.0.1 \
        enc_key_id 1000 enc_dst_port 4789 \
        action tunnel_key unset \
        action mirred egress redirect dev $REPRESENTOR

When notified of a rule like this, register an encap match on the IP
 and dport tuple (creating an Outer Rule table entry) and insert an MAE
 action rule to perform the decapsulation and deliver to the representee.

Moved efx_tc_delete_rule() below efx_tc_flower_release_encap_match() to
 avoid the need for a forward declaration.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
746224cdef sfc: add code to register and unregister encap matches
Add a hashtable to detect duplicate and conflicting matches.  If match
 is not a duplicate, call MAE functions to add/remove it from OR table.
Calling code not added yet, so mark the new functions as unused.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
2245eb0086 sfc: add functions to insert encap matches into the MAE
An encap match corresponds to an entry in the exact-match Outer Rule
 table; the lookup response includes the encap type (protocol) allowing
 the hardware to continue parsing into the inner headers.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
b7f5e17b3b sfc: handle enc keys in efx_tc_flower_parse_match()
Translate the fields from flow dissector into struct efx_tc_match.
In efx_tc_flower_replace(), reject filters that match on them, because
 only 'foreign' filters (i.e. those for which the ingress dev is not
 the sfc netdev or any of its representors, e.g. a tunnel netdev) can
 use them.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
b9d5c9b7d8 sfc: add notion of match on enc keys to MAE machinery
Extend the MAE caps check to validate that the hardware supports these
 outer-header matches where used by the driver.
Extend efx_mae_populate_match_criteria() to fill in the outer rule ID
 and VNI match fields.
Nothing yet populates these match fields, nor creates outer rules.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Edward Cree
edd025ca08 sfc: document TC-to-EF100-MAE action translation concepts
Includes an explanation of the lifetime of the 'cursor' action-set `act`.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-03-29 09:06:08 +01:00
Íñigo Huguet
ca4a80e4bb sfc: ef10: don't overwrite offload features at NIC reset
At NIC reset, some offload features related to encapsulated traffic
might have changed (this mainly happens if the firmware-variant is
changed with the sfboot userspace tool). Because of this, features are
checked and set again at reset time.

However, this was not done right, and some features were improperly
overwritten at NIC reset:
- Tunneled IPv6 segmentation was always disabled
- Features disabled with ethtool were reenabled
- Features that becomes unsupported after the reset were not disabled

Also, checking if the device supports IPV6_CSUM to enable TSO6 is no
longer necessary because all currently supported devices support it.
Additionally, move the assignment of some other features to the
EF10_OFFLOAD_FEATURES macro, like it is done in ef100, leaving the
selection of features in efx_pci_probe_post_io a bit cleaner.

Fixes: ffffd2454a ("sfc: correctly advertise tunneled IPv6 segmentation")
Fixes: 24b2c3751a ("sfc: advertise encapsulated offloads on EF10")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Suggested-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Tested-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230323083417.7345-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-24 14:48:17 -07:00
Edward Cree
05ccd8d8a1 sfc: support offloading TC VLAN push/pop actions to the MAE
EF100 can pop and/or push up to two VLAN tags.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230309115904.56442-1-edward.cree@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-10 21:28:59 -08:00
Bjorn Helgaas
c39abdd396 sfc_ef100: Drop redundant pci_disable_pcie_error_reporting()
51b35a454e ("sfc: skeleton EF100 PF driver") added a call to
pci_disable_pcie_error_reporting() in ef100_pci_remove().

Remove this call since there's no apparent reason to disable error
reporting when it was not previously enabled.

Note that since f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core enables PCIe error reporting for all devices during
enumeration, so the driver doesn't need to do it itself.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 23:34:39 -08:00
Bjorn Helgaas
ecded61ceb sfc/siena: Drop redundant pci_enable_pcie_error_reporting()
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 23:34:39 -08:00
Bjorn Helgaas
4ac9272691 sfc: falcon: Drop redundant pci_enable_pcie_error_reporting()
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 23:34:39 -08:00
Bjorn Helgaas
bdedf70568 sfc: Drop redundant pci_enable_pcie_error_reporting()
pci_enable_pcie_error_reporting() enables the device to send ERR_*
Messages.  Since f26e58bf6f ("PCI/AER: Enable error reporting when AER is
native"), the PCI core does this for all devices during enumeration, so the
driver doesn't need to do it itself.

Remove the redundant pci_enable_pcie_error_reporting() call from the
driver.  Also remove the corresponding pci_disable_pcie_error_reporting()
from the driver .remove() path.

Note that this only controls ERR_* Messages from the device.  An ERR_*
Message may cause the Root Port to generate an interrupt, depending on the
AER Root Error Command register managed by the AER service driver.

Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-03-08 23:34:39 -08:00
Alejandro Lucero
5f22c3b621 sfc: fix builds without CONFIG_RTC_LIB
Add an embarrassingly missed semicolon plus and embarrassingly missed
parenthesis breaking kernel building when CONFIG_RTC_LIB is not set
like the one reported with ia64 config.

Reported-by: kernel test robot <lkp@intel.com>
Link: https://lore.kernel.org/oe-kbuild-all/202302170047.EjCPizu3-lkp@intel.com/
Fixes: 14743ddd24 ("sfc: add devlink info support for ef100")
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Link: https://lore.kernel.org/r/20230220110133.29645-1-alejandro.lucero-palau@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20 16:39:50 -08:00
Yang Li
5feeaba106 sfc: clean up some inconsistent indentings
Fix some indentngs and remove the warning below:
drivers/net/ethernet/sfc/mae.c:657 efx_mae_enumerate_mports() warn: inconsistent indenting

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Link: https://bugzilla.openanolis.cn/show_bug.cgi?id=4117
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230220065958.52941-1-yang.lee@linux.alibaba.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-20 16:39:00 -08:00
Arnd Bergmann
a59f832a71 sfc: use IS_ENABLED() checks for CONFIG_SFC_SRIOV
One local variable has become unused after a recent change:

drivers/net/ethernet/sfc/ef100_nic.c: In function 'ef100_probe_netdev_pf':
drivers/net/ethernet/sfc/ef100_nic.c:1155:21: error: unused variable 'net_dev' [-Werror=unused-variable]
  struct net_device *net_dev = efx->net_dev;
                     ^~~~~~~

The variable is still used in an #ifdef. Replace the #ifdef with
an if(IS_ENABLED()) check that lets the compiler see where it is
used, rather than adding another #ifdef.

This also fixes an uninitialized return value in ef100_probe_netdev_pf()
that gcc did not spot.

Fixes: 7e056e2360 ("sfc: obtain device mac address based on firmware handle for ef100")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 10:49:30 +00:00
Colin Ian King
0d39ad3e1b sfc: Fix spelling mistake "creationg" -> "creating"
There is a spelling mistake in a pci_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by:  Alejandro Lucero <alejandro.lucero-palau@amd.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-02-20 10:44:33 +00:00
Alejandro Lucero
3b6096c9b3 sfc: add support for devlink port_function_hw_addr_set in ef100
Using the builtin client handle id infrastructure, add support for
setting the mac address linked to mports in ef100. This implies to
execute an MCDI command for giving the address to the firmware for
the specific devlink port.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:13 +01:00
Alejandro Lucero
fa78b01718 sfc: add support for devlink port_function_hw_addr_get in ef100
Using the builtin client handle id infrastructure, add support for
obtaining the mac address linked to mports in ef100. This implies
to execute an MCDI command for getting the data from the firmware
for each devlink port.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:13 +01:00
Alejandro Lucero
7e056e2360 sfc: obtain device mac address based on firmware handle for ef100
Getting device mac address is currently based on a specific MCDI command
only available for the PF. This patch changes the MCDI command to a
generic one for PFs and VFs based on a client handle. This allows both
PFs and VFs to ask for their mac address during initialization using the
CLIENT_HANDLE_SELF.

Moreover, the patch allows other client handles which will be used by
the PF to ask for mac addresses linked to VFs. This is necessary for
suporting the port_function_hw_addr_get devlink function in further
patches.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
25414b2a64 sfc: add devlink port support for ef100
Using the data when enumerating mports, create devlink ports just before
netdevs are registered and remove those devlink ports after netdev has
been unregistered.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
5227adff37 sfc: add mport lookup based on driver's mport data
Obtaining mport id is based on asking the firmware about it. This is
still needed for mport initialization itself, but once the mport data is
now kept by the driver, further mport id request can be satisfied
internally without firmware interaction.

Previous function is just modified in name making clear the firmware
interaction. The new function uses the old name and looks for the data
in the mport data structure.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
a6a15aca42 sfc: enumerate mports in ef100
MAE ports (mports) are the ports on the EF100 embedded switch such
as networking PCIe functions, the physical port, and potentially
others.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
14743ddd24 sfc: add devlink info support for ef100
Add devlink info support for ef100. The information reported is obtained
through the MCDI interface with the specific meaning defined in new
documentation file.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Alejandro Lucero
fa34a5140a sfc: add devlink support for ef100
Add devlink infrastructure support. Further patches add devlink
info and devlink port support.

Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-02-16 12:03:12 +01:00
Jakub Kicinski
de42873367 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCY+bZrwAKCRDbK58LschI
 gzi4AP4+TYo0jnSwwkrOoN9l4f5VO9X8osmj3CXfHBv7BGWVxAD/WnvA3TDZyaUd
 agIZTkRs6BHF9He8oROypARZxTeMLwM=
 =nO1C
 -----END PGP SIGNATURE-----

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-02-11

We've added 96 non-merge commits during the last 14 day(s) which contain
a total of 152 files changed, 4884 insertions(+), 962 deletions(-).

There is a minor conflict in drivers/net/ethernet/intel/ice/ice_main.c
between commit 5b246e533d ("ice: split probe into smaller functions")
from the net-next tree and commit 66c0e13ad2 ("drivers: net: turn on
XDP features") from the bpf-next tree. Remove the hunk given ice_cfg_netdev()
is otherwise there a 2nd time, and add XDP features to the existing
ice_cfg_netdev() one:

        [...]
        ice_set_netdev_features(netdev);
        netdev->xdp_features = NETDEV_XDP_ACT_BASIC | NETDEV_XDP_ACT_REDIRECT |
                               NETDEV_XDP_ACT_XSK_ZEROCOPY;
        ice_set_ops(netdev);
        [...]

Stephen's merge conflict mail:
https://lore.kernel.org/bpf/20230207101951.21a114fa@canb.auug.org.au/

The main changes are:

1) Add support for BPF trampoline on s390x which finally allows to remove many
   test cases from the BPF CI's DENYLIST.s390x, from Ilya Leoshkevich.

2) Add multi-buffer XDP support to ice driver, from Maciej Fijalkowski.

3) Add capability to export the XDP features supported by the NIC.
   Along with that, add a XDP compliance test tool,
   from Lorenzo Bianconi & Marek Majtyka.

4) Add __bpf_kfunc tag for marking kernel functions as kfuncs,
   from David Vernet.

5) Add a deep dive documentation about the verifier's register
   liveness tracking algorithm, from Eduard Zingerman.

6) Fix and follow-up cleanups for resolve_btfids to be compiled
   as a host program to avoid cross compile issues,
   from Jiri Olsa & Ian Rogers.

7) Batch of fixes to the BPF selftest for xdp_hw_metadata which resulted
   when testing on different NICs, from Jesper Dangaard Brouer.

8) Fix libbpf to better detect kernel version code on Debian, from Hao Xiang.

9) Extend libbpf to add an option for when the perf buffer should
   wake up, from Jon Doron.

10) Follow-up fix on xdp_metadata selftest to just consume on TX
    completion, from Stanislav Fomichev.

11) Extend the kfuncs.rst document with description on kfunc
    lifecycle & stability expectations, from David Vernet.

12) Fix bpftool prog profile to skip attaching to offline CPUs,
    from Tonghao Zhang.

====================

Link: https://lore.kernel.org/r/20230211002037.8489-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-02-10 17:51:27 -08:00
Lorenzo Bianconi
9b0651e429 sfc: move xdp_features configuration in efx_pci_probe_post_io()
Move xdp_features configuration from efx_pci_probe() to
efx_pci_probe_post_io() since it is where all the other basic netdev
features are initialised.

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Link: https://lore.kernel.org/r/9bd31c9a29bcf406ab90a249a28fc328e5578fd1.1675875404.git.lorenzo@kernel.org
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
2023-02-08 10:18:46 -08:00
Marek Majtyka
66c0e13ad2 drivers: net: turn on XDP features
A summary of the flags being set for various drivers is given below.
Note that XDP_F_REDIRECT_TARGET and XDP_F_FRAG_TARGET are features
that can be turned off and on at runtime. This means that these flags
may be set and unset under RTNL lock protection by the driver. Hence,
READ_ONCE must be used by code loading the flag value.

Also, these flags are not used for synchronization against the availability
of XDP resources on a device. It is merely a hint, and hence the read
may race with the actual teardown of XDP resources on the device. This
may change in the future, e.g. operations taking a reference on the XDP
resources of the driver, and in turn inhibiting turning off this flag.
However, for now, it can only be used as a hint to check whether device
supports becoming a redirection target.

Turn 'hw-offload' feature flag on for:
 - netronome (nfp)
 - netdevsim.

Turn 'native' and 'zerocopy' features flags on for:
 - intel (i40e, ice, ixgbe, igc)
 - mellanox (mlx5).
 - stmmac
 - netronome (nfp)

Turn 'native' features flags on for:
 - amazon (ena)
 - broadcom (bnxt)
 - freescale (dpaa, dpaa2, enetc)
 - funeth
 - intel (igb)
 - marvell (mvneta, mvpp2, octeontx2)
 - mellanox (mlx4)
 - mtk_eth_soc
 - qlogic (qede)
 - sfc
 - socionext (netsec)
 - ti (cpsw)
 - tap
 - tsnep
 - veth
 - xen
 - virtio_net.

Turn 'basic' (tx, pass, aborted and drop) features flags on for:
 - netronome (nfp)
 - cavium (thunder)
 - hyperv.

Turn 'redirect_target' feature flag on for:
 - amanzon (ena)
 - broadcom (bnxt)
 - freescale (dpaa, dpaa2)
 - intel (i40e, ice, igb, ixgbe)
 - ti (cpsw)
 - marvell (mvneta, mvpp2)
 - sfc
 - socionext (netsec)
 - qlogic (qede)
 - mellanox (mlx5)
 - tap
 - veth
 - virtio_net
 - xen

Reviewed-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Acked-by: Stanislav Fomichev <sdf@google.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Marek Majtyka <alardam@gmail.com>
Link: https://lore.kernel.org/r/3eca9fafb308462f7edb1f58e451d59209aa07eb.1675245258.git.lorenzo@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2023-02-02 20:48:23 -08:00
Íñigo Huguet
ffffd2454a sfc: correctly advertise tunneled IPv6 segmentation
Recent sfc NICs are TSO capable for some tunnel protocols. However, it
was not working properly because the feature was not advertised in
hw_enc_features, but in hw_features only.

Setting up a GENEVE tunnel and using iperf3 to send IPv4 and IPv6 traffic
to the tunnel show, with tcpdump, that the IPv4 packets still had ~64k
size but the IPv6 ones had only ~1500 bytes (they had been segmented by
software, not offloaded). With this patch segmentation is offloaded as
expected and the traffic is correctly received at the other end.

Fixes: 24b2c3751a ("sfc: advertise encapsulated offloads on EF10")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20230125143513.25841-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-01-28 00:19:48 -08:00
ye xingchen
1ab586f517 sfc: use sysfs_emit() to instead of scnprintf()
Follow the advice of the Documentation/filesystems/sysfs.rst and show()
should only use sysfs_emit() or sysfs_emit_at() when formatting the
value to be returned to user space.

Signed-off-by: ye xingchen <ye.xingchen@zte.com.cn>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/202212051021451139126@zte.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-12-06 20:37:38 -08:00
Jakub Kicinski
f2bb566f5c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
tools/lib/bpf/ringbuf.c
  927cbb478a ("libbpf: Handle size overflow for ringbuf mmap")
  b486d19a0a ("libbpf: checkpatch: Fixed code alignments in ringbuf.c")
https://lore.kernel.org/all/20221121122707.44d1446a@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-29 13:04:52 -08:00
Anirudh Venkataramanan
f61e6d3ca4 sfc: Use kmap_local_page() instead of kmap_atomic()
kmap_atomic() is being deprecated in favor of kmap_local_page(). Replace
kmap_atomic() and kunmap_atomic() with kmap_local_page() and kunmap_local()
respectively.

Note that kmap_atomic() disables preemption and page-fault processing, but
kmap_local_page() doesn't. When converting uses of kmap_atomic(), one has
to check if the code being executed between the map/unmap implicitly
depends on page-faults and/or preemption being disabled. If yes, then code
to disable page-faults and/or preemption should also be added for
functional correctness. That however doesn't appear to be the case here,
so just kmap_local_page() is used.

Also note that the page being mapped is not allocated by the driver, and so
the driver doesn't know if the page is in normal memory. This is the reason
kmap_local_page() is used as opposed to page_address().

I don't have hardware, so this change has only been compile tested.

Cc: Ira Weiny <ira.weiny@intel.com>
Cc: Fabio M. De Francesco <fmdefrancesco@gmail.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Anirudh Venkataramanan <anirudh.venkataramanan@intel.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-25 10:44:01 +00:00
Edward Cree
e80bd08fd7 sfc: ensure type is valid before updating seen_gen
In the case of invalid or corrupted v2 counter update packets,
 efx_tc_rx_version_2() returns EFX_TC_COUNTER_TYPE_MAX.  In this case
 we should not attempt to update generation counts as this will write
 beyond the end of the seen_gen array.

Reported-by: coverity-bot <keescook+coverity-bot@chromium.org>
Addresses-Coverity-ID: 1527356 ("Memory - illegal accesses")
Fixes: 25730d8be5 ("sfc: add extra RX channel to receive MAE counter updates on ef100")
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-23 13:47:51 +00:00
Zhang Changzhong
aad98abd5c sfc: fix potential memleak in __ef100_hard_start_xmit()
The __ef100_hard_start_xmit() returns NETDEV_TX_OK without freeing skb
in error handling case, add dev_kfree_skb_any() to fix it.

Fixes: 51b35a454e ("sfc: skeleton EF100 PF driver")
Signed-off-by: Zhang Changzhong <zhangchangzhong@huawei.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Link: https://lore.kernel.org/r/1668671409-10909-1-git-send-email-zhangchangzhong@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-11-22 11:28:16 +01:00
Edward Cree
50f8f2f7fb sfc: implement counters readout to TC stats
On FLOW_CLS_STATS, look up the MAE counter by TC cookie, and report the
 change in packet and byte count since the last time FLOW_CLS_STATS read
 them.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:03 +00:00
Edward Cree
83a187a4eb sfc: validate MAE action order
Currently the only actions supported are COUNT and DELIVER, which can only
 happen in the right order; but when more actions are added, it will be
 necessary to check that they are only used in the same order in which the
 hardware performs them (since the hardware API takes an action *set* in
 which the order is implicit).  For instance, a VLAN pop must not follow a
 VLAN push.  Most practical use-cases should be unaffected by these
 restrictions.
Add a function efx_tc_flower_action_order_ok() that checks whether it is
 appropriate to add a specified action to the existing action-set.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:03 +00:00
Edward Cree
2e0f1eb056 sfc: attach an MAE counter to TC actions that need it
The only actions that expect stats (that sfc HW supports) are gact shot
 (drop), mirred redirect and mirred mirror.  Since these are 'deliverish'
 actions that end an action-set, we only require at most one counter per
 action-set.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:03 +00:00
Edward Cree
c4bad432b9 sfc: accumulate MAE counter values from update packets
Add the packet and byte counts to the software running total, and store
 the latest jiffies every time the counter is bumped.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
0363aa2957 sfc: add functions to allocate/free MAE counters
efx_tc_flower_get_counter_index() will create an MAE counter mapped to
 the passed (TC filter) cookie, or increment the reference if one already
 exists for that cookie.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
19a0c98910 sfc: add hashtables for MAE counters and counter ID mappings
Nothing populates them yet.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
25730d8be5 sfc: add extra RX channel to receive MAE counter updates on ef100
Currently there is no counter-allocating machinery to connect the
 resulting counter update values to; that will be added in a
 subsequent patch.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
e5731274cd sfc: add ef100 MAE counter support functions
Start and stop MAE counter streaming, and grant credits.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
36df6136a7 sfc: add ability for extra channels to receive raw RX buffers
The TC extra channel will need its own special RX handling, which must
 operate before any code that expects the RX buffer to contain a network
 packet; buffers on this RX queue contain MAE counter packets in a
 special format that does not resemble an Ethernet frame, and many fields
 of the RX packet prefix are not populated.
The USER_MARK field, however, is populated with the generation count from
 the counter subsystem, which needs to be passed on to the RX handler.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
85697f97fd sfc: add start and stop methods to channels
The TC extra channel needs to do extra work in efx_{start,stop}_channels()
 to start/stop MAE counter streaming from the hardware.  Add callbacks for
 it to implement.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
e395153984 sfc: add ability for an RXQ to grant credits on refill
EF100 hardware streams MAE counter updates to the driver over a dedicated
 RX queue; however, the MCPU is not able to detect when RX buffers have
 been posted to the ring.  Thus, the driver must call
 MC_CMD_MAE_COUNTERS_STREAM_GIVE_CREDITS; this patch adds the
 infrastructure to support that to the core RXQ handling code.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Edward Cree
5ae0c22634 sfc: fix ef100 RX prefix macro
Macro PREFIX_WIDTH_MASK uses unsigned long arithmetic for a shift of up
 to 32 bits, which breaks on 32-bit systems.  This did not previously
 show up as we weren't using any fields of width 32, but we now need to
 access ESF_GZ_RX_PREFIX_USER_MARK.
Change it to unsigned long long.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-16 09:07:02 +00:00
Jacob Keller
e2bd9c76c8 ptp: convert remaining drivers to adjfine interface
Convert all remaining drivers that still use .adjfreq to the newer .adjfine
implementation. These drivers are not straightforward, as they use
non-standard methods of programming their hardware. They are all converted
to use scaled_ppm_to_ppb to get the parts per billion value that their
logic depends on.

Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Cc: Ariel Elior <aelior@marvell.com>
Cc: Sudarsana Kalluru <skalluru@marvell.com>
Cc: Manish Chopra <manishc@marvell.com>
Cc: Derek Chickles <dchickles@marvell.com>
Cc: Satanand Burla <sburla@marvell.com>
Cc: Felix Manlunas <fmanlunas@marvell.com>
Cc: Raju Rangoju <rajur@chelsio.com>
Cc: Joakim Zhang <qiangqing.zhang@nxp.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Cc: Richard Cochran <richardcochran@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-11-11 10:58:39 +00:00
Edward Cree
5d1d24da00 sfc: add Layer 4 matches to ef100 TC offload
Support matching on UDP/TCP source and destination ports and TCP flags,
 with masking if supported by the hardware.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04 19:54:24 -07:00
Edward Cree
5ca7ef2938 sfc: add Layer 3 flag matches to ef100 TC offload
Support matching on ip_frag and ip_firstfrag.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04 19:54:24 -07:00
Edward Cree
c178dff3f9 sfc: add Layer 3 matches to ef100 TC offload
Support matching on IP protocol, Type of Service, Time To Live, source
 and destination addresses, with masking if supported by the hardware.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04 19:54:23 -07:00
Edward Cree
6d1c604d10 sfc: add Layer 2 matches to ef100 TC offload
Support matching on EtherType, VLANs and ethernet source/destination
 addresses, with masking if supported by the hardware.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04 19:54:23 -07:00
Edward Cree
f0b59ad11e sfc: check recirc_id match caps before MAE offload
Offloaded TC rules always match on recirc_id in the MAE, so we should
 check that the MAE reported support for this match before attempting
 to insert the rule.
These checks allow us to fail early, avoiding the PCIe round-trip to
 firmware for an MC_CMD_MAE_ACTION_RULE_INSERT that will only fail,
 and more importantly providing a more informative error message that
 identifies which match field is unsupported.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-04 19:54:23 -07:00
Jakub Kicinski
fbeb229a66 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-03 13:21:54 -07:00
Jiri Slaby (SUSE)
3319dbb3e7 sfc (gcc13): synchronize ef100_enqueue_skb()'s return type
ef100_enqueue_skb() generates a valid warning with gcc-13:
  drivers/net/ethernet/sfc/ef100_tx.c:370:5: error: conflicting types for 'ef100_enqueue_skb' due to enum/integer mismatch; have 'int(struct efx_tx_queue *, struct sk_buff *)'
  drivers/net/ethernet/sfc/ef100_tx.h:25:13: note: previous declaration of 'ef100_enqueue_skb' with type 'netdev_tx_t(struct efx_tx_queue *, struct sk_buff *)'

I.e. the type of the ef100_enqueue_skb()'s return value in the declaration is
int, while the definition spells enum netdev_tx_t. Synchronize them to the
latter.

Cc: Martin Liska <mliska@suse.cz>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Cc: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Link: https://lore.kernel.org/r/20221031114440.10461-1-jirislaby@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-02 20:38:17 -07:00
Christophe JAILLET
6c412da54c sfc: Fix an error handling path in efx_pci_probe()
If an error occurs after the first kzalloc() the corresponding memory
allocation is never freed.

Add the missing kfree() in the error handling path, as already done in the
remove() function.

Fixes: 7e773594da ("sfc: Separate efx_nic memory from net_device memory")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/dc114193121c52c8fa3779e49bdd99d4b41344a9.1667077009.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-11-01 09:09:47 -07:00
Jakub Kicinski
94adb5e29e Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-20 17:49:10 -07:00
Pieter Jansen van Vuuren
c2bf23e4a5 sfc: include vport_id in filter spec hash and equal()
Filters on different vports are qualified by different implicit MACs and/or
VLANs, so shouldn't be considered equal even if their other match fields
are identical.

Fixes: 7c460d9be6 ("sfc: Extend and abstract efx_filter_spec to cover Huntington/EF10")
Co-developed-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Reviewed-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20221018092841.32206-1-pieter.jansen-van-vuuren@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-19 17:08:43 -07:00
Edward Cree
b799f052a9 sfc: remove 'log-tc-errors' ethtool private flag
It no longer does anything now that we're using formatted extacks instead.
So we can remove the driver's whole get/set priv_flags implementation.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-19 13:00:05 -07:00
Edward Cree
ad1c80d5f7 sfc: use formatted extacks instead of efx_tc_err()
Since we can now get a formatted message back to the user with
 NL_SET_ERR_MSG_FMT_MOD(), there's no need for our special logging.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-10-19 13:00:05 -07:00
Jonathan Cooper
a8aed7b35b sfc: Change VF mac via PF as first preference if available.
Changing a VF's mac address through the VF (rather than via the PF)
fails with EPERM because the latter part of efx_ef10_set_mac_address
attempts to change the vport mac address list as the VF.
Even with this fixed it still fails with EBUSY because the vadaptor
is still assigned on the VF - the vadaptor reassignment must be within
a section where the VF has torn down its state.

A major reason this has broken is because we have two functions that
ostensibly do the same thing - have a PF and VF cooperate to change a
VF mac address. Rather than do this, if we are changing the mac of a VF
that has a link to the PF in the same VM then simply call
sriov_set_vf_mac instead, which is a proven working function that does
that.

If there is no PF available, or that fails non-fatally, then attempt to
change the VF's mac address as we would a PF, without updating the PF's
data.

Test case:
Create a VF:
  echo 1 > /sys/class/net/<if>/device/sriov_numvfs
Set the mac address of the VF directly:
  ip link set <vf> addr 00:11:22:33:44:55
Set the MAC address of the VF via the PF:
  ip link set <pf> vf 0 mac 00:11:22:33:44:66
Without this patch the last command will fail with ENOENT.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Reported-by: Íñigo Huguet <ihuguet@redhat.com>
Fixes: 910c8789a7 ("set the MAC address using MC_CMD_VADAPTOR_SET_MAC")
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-10-15 11:06:44 +01:00
Jakub Kicinski
accc3b4a57 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-29 14:30:51 -07:00
Jakub Kicinski
b48b89f9c1 net: drop the weight argument from netif_napi_add
We tell driver developers to always pass NAPI_POLL_WEIGHT
as the weight to netif_napi_add(). This may be confusing
to newcomers, drop the weight argument, those who really
need to tweak the weight can use netif_napi_add_weight().

Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for CAN
Link: https://lore.kernel.org/r/20220927132753.750069-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-28 18:57:14 -07:00
Edward Cree
d902e1a737 sfc: bare bones TC offload on EF100
This is the absolute minimum viable TC implementation to get traffic to
 VFs and allow them to be tested; it supports no match fields besides
 ingress port, no actions besides mirred and drop, and no stats.
Example usage:
    tc filter add dev $PF parent ffff: flower skip_sw \
        action mirred egress mirror dev $VFREP
    tc filter add dev $VFREP parent ffff: flower skip_sw \
        action mirred egress redirect dev $PF
 gives a VF unfiltered access to the network out the physical port ($PF
 acts here as a physical port representor).
More matches, actions, and counters will be added in subsequent patches.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree
7ce3e235f2 sfc: interrogate MAE capabilities at probe time
Different versions of EF100 firmware and FPGA bitstreams support different
 matching capabilities in the Match-Action Engine.  Probe for these at
 start of day; subsequent patches will validate TC offload requests
 against the reported capabilities.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree
f54a28a211 sfc: add a hashtable for offloaded TC rules
Nothing inserts into this table yet, but we have code to remove rules
 on FLOW_CLS_DESTROY or at driver teardown time, in both cases also
 attempting to remove the corresponding hardware rules.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree
7c9d266d8f sfc: optional logging of TC offload errors
TC offload support will involve complex limitations on what matches and
 actions a rule can do, in some cases potentially depending on rules
 already offloaded.  So add an ethtool private flag "log-tc-errors" which
 controls reporting the reasons for un-offloadable TC rules at NETIF_INFO.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree
5b2e12d51b sfc: bind indirect blocks for TC offload on EF100
Bind indirect blocks for recognised tunnel netdevices.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Edward Cree
9dc0cad203 sfc: bind blocks for TC offload on EF100
Bind direct blocks for the MAE-admin PF and each VF representor.
Currently these connect to a stub efx_tc_flower() that only returns
 -EOPNOTSUPP; subsequent patches will implement flower offloads to the
 Match-Action Engine.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-28 09:43:22 +01:00
Andy Moreton
b7ca8d5f56 sfc: correct filter_table_remove method for EF10 PFs
A previous patch added a wrapper function to take a lock around
 efx_mcdi_filter_table_remove(), but only changed EF10 VFs' method table
 to call it.  Change it in the PF method table too.

Fixes: 77eb40749d ("sfc: move table locking into filter_table_{probe,remove} methods")
Signed-off-by: Andy Moreton <andy.moreton@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220922211218.814-1-ecree@xilinx.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-23 20:57:14 -07:00
Jakub Kicinski
0140a7168f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/freescale/fec.h
  7b15515fc1 ("Revert "fec: Restart PPS after link state change"")
  40c79ce13b ("net: fec: add stop mode support for imx8 platform")
https://lore.kernel.org/all/20220921105337.62b41047@canb.auug.org.au/

drivers/pinctrl/pinctrl-ocelot.c
  c297561bc9 ("pinctrl: ocelot: Fix interrupt controller")
  181f604b33 ("pinctrl: ocelot: add ability to be used in a non-mmio configuration")
https://lore.kernel.org/all/20220921110032.7cd28114@canb.auug.org.au/

tools/testing/selftests/drivers/net/bonding/Makefile
  bbb774d921 ("net: Add tests for bonding and team address list management")
  152e8ec776 ("selftests/bonding: add a test for bonding lladdr target")
https://lore.kernel.org/all/20220921110437.5b7dbd82@canb.auug.org.au/

drivers/net/can/usb/gs_usb.c
  5440428b3d ("can: gs_usb: gs_can_open(): fix race dev->can.state condition")
  45dfa45f52 ("can: gs_usb: add RX and TX hardware timestamp support")
https://lore.kernel.org/all/84f45a7d-92b6-4dc5-d7a1-072152fab6ff@tessares.net/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-22 13:02:10 -07:00
Íñigo Huguet
589c6eded1 sfc/siena: fix null pointer dereference in efx_hard_start_xmit
Like in previous patch for sfc, prevent potential (but unlikely) NULL
pointer dereference.

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141958.16458-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:28 -07:00
Íñigo Huguet
974bb793ad sfc/siena: fix TX channel offset when using legacy interrupts
As in previous commit for sfc, fix TX channels offset when
efx_siena_separate_tx_channels is false (the default)

Fixes: 25bde571b4 ("sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/20220915141653.15504-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-20 11:21:13 -07:00
Íñigo Huguet
0a242eb291 sfc: fix null pointer dereference in efx_hard_start_xmit
Trying to get the channel from the tx_queue variable here is wrong
because we can only be here if tx_queue is NULL, so we shouldn't
dereference it. As the above comment in the code says, this is very
unlikely to happen, but it's wrong anyway so let's fix it.

I hit this issue because of a different bug that caused tx_queue to be
NULL. If that happens, this is the error message that we get here:
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]

Fixes: 12804793b1 ("sfc: decouple TXQ type from label")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914111135.21038-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:09:55 -07:00
Íñigo Huguet
f232af4295 sfc: fix TX channel offset when using legacy interrupts
In legacy interrupt mode the tx_channel_offset was hardcoded to 1, but
that's not correct if efx_sepparate_tx_channels is false. In that case,
the offset is 0 because the tx queues are in the single existing channel
at index 0, together with the rx queue.

Without this fix, as soon as you try to send any traffic, it tries to
get the tx queues from an uninitialized channel getting these errors:
  WARNING: CPU: 1 PID: 0 at drivers/net/ethernet/sfc/tx.c:540 efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  RIP: 0010:efx_hard_start_xmit+0x12e/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]
  BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
  [...]
  RIP: 0010:efx_hard_start_xmit+0x153/0x170 [sfc]
  [...]
  Call Trace:
   <IRQ>
   dev_hard_start_xmit+0xd7/0x230
   sch_direct_xmit+0x9f/0x360
   __dev_queue_xmit+0x890/0xa40
  [...]

Fixes: c308dfd1b4 ("sfc: fix wrong tx channel offset with efx_separate_tx_channels")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220914103648.16902-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-09-19 18:09:43 -07:00
Pieter Jansen van Vuuren
41e3b0722f sfc: introduce shutdown entry point in efx pci driver
Make the device inactive when the system shutdown callback has been
invoked. This is achieved by freezing the driver and disabling the
PCI bus mastering.

Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220906105620.26179-1-pieter.jansen-van-vuuren@amd.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-09-13 09:45:11 +02:00
Íñigo Huguet
e4616f6472 sfc: support PTP over Ethernet
The previous patch add support for PTP over IPv6/UDP (only for 8000
series and newer) and this one add support for PTP over 802.3.

Tested: sync as master and as slave is correct with ptp4l. PTP over IPv4
and IPv6 still works fine.

Suggested-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07 12:23:25 +01:00
Íñigo Huguet
621918c45f sfc: support PTP over IPv6/UDP
commit bd4a2697e5 ("sfc: use hardware tx timestamps for more than
PTP") added support for hardware timestamping on TX for cards of the
8000 series and newer, in an effort to provide support for other
transports other than IPv4/UDP.

However, timestamping was still not working on RX for these other
transports. This patch add support for PTP over IPv6/UDP.

Tested: sync as master and as slave is correct using ptp4l from linuxptp
package, both with IPv4 and IPv6.

Suggested-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07 12:23:25 +01:00
Íñigo Huguet
313aa13a07 sfc: allow more flexible way of adding filters for PTP
In preparation for the support of PTP over IPv6/UDP and Ethernet in next
patches, allow a more flexible way of adding and removing RX filters for
PTP. Right now, only 2 filters are allowed, which are the ones needed
for PTP over IPv4/UDP.

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-09-07 12:23:25 +01:00
Wolfram Sang
f029c781dd net: ethernet: move from strlcpy with unused retval to strscpy
Follow the advice of the below link and prefer 'strscpy' in this
subsystem. Conversion is 1:1 because the return value is not used.
Generated by a coccinelle script.

Link: https://lore.kernel.org/r/CAHk-=wgfRnXz0W3D37d01q3JFkr_i_uTL=V6A6G1oUZcprmknw@mail.gmail.com/
Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
Reviewed-by: Petr Machata <petrm@nvidia.com> # For drivers/net/ethernet/mellanox/mlxsw
Acked-by: Geoff Levand <geoff@infradead.org> # For ps3_gelic_net and spider_net_ethtool
Acked-by: Tom Lendacky <thomas.lendacky@amd.com> # For drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c
Acked-by: Marcin Wojtas <mw@semihalf.com> # For drivers/net/ethernet/marvell/mvpp2
Reviewed-by: Leon Romanovsky <leonro@nvidia.com> # For drivers/net/ethernet/mellanox/mlx{4|5}
Reviewed-by: Shay Agroskin <shayagr@amazon.com> # For drivers/net/ethernet/amazon/ena
Acked-by: Krzysztof Hałasa <khalasa@piap.pl> # For IXP4xx Ethernet
Link: https://lore.kernel.org/r/20220830201457.7984-3-wsa+renesas@sang-engineering.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-08-31 14:11:26 -07:00
Edward Cree
7267aa6d99 sfc: implement ethtool get/set RX ring size for EF100 reps
It's not truly a ring, but the maximum length of the list of queued RX
 SKBs is analogous to an RX ring size, so use that API to configure it.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:07 -07:00
Edward Cree
e37f3b1561 sfc: use a dynamic m-port for representor RX and set it promisc
Representors do not want to be subject to the PF's Ethernet address
 filters, since traffic from VFs will typically have a destination
 either elsewhere on the link segment or on an overlay network.
So, create a dynamic m-port with promiscuous and all-multicast
 filters, and set it as the egress port of representor default rules.
 Since the m-port is an alias of the calling PF's own m-port, traffic
 will still be delivered to the PF's RXQs, but it will be subject to
 the VNRX filter rules installed on the dynamic m-port (specified by
 the v-port ID field of the filter spec).

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:07 -07:00
Edward Cree
77eb40749d sfc: move table locking into filter_table_{probe,remove} methods
We need to be able to drop the efx->filter_sem in ef100_filter_table_up()
 so that we can call functions that insert filters (and thus take that
 rwsem for read), which means the efx->type->filter_table_probe method
 needs to be responsible for taking the lock in the first place.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree
67ab160ed0 sfc: insert default MAE rules to connect VFs to representors
Default rules are low-priority switching rules which the hardware uses
 in the absence of higher-priority rules.  Each representor requires a
 corresponding rule matching traffic from its representee VF and
 delivering to the PF (where a check on INGRESS_MPORT in
 __ef100_rx_packet() will direct it to the representor).  No rule is
 required in the reverse direction, because representor TX uses a TX
 override descriptor to bypass the MAE and deliver directly to the VF.
Since inserting any rule into the MAE disables the firmware's own
 default rules, also insert a pair of rules to connect the PF to the
 physical network port and vice-versa.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree
f50e8fcda6 sfc: receive packets from EF100 VFs into representors
If the source m-port of a packet in __ef100_rx_packet() is a VF,
 hand off the packet to the corresponding representor with
 efx_ef100_rep_rx_packet().

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree
08d0b16ecb sfc: check ef100 RX packets are from the wire
If not, for now drop them and warn.  A subsequent patch will look up
 the source m-port to try and find a representor to deliver them to.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree
6f6838aabf sfc: determine wire m-port at EF100 PF probe time
Traffic delivered to the (MAE admin) PF could be from either the wire
 or a VF.  The INGRESS_MPORT field of the RX prefix distinguishes these;
 base_mport is the value this field will have for traffic from the wire
 (which should be delivered to the PF's netdevice, not a representor).

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:06 -07:00
Edward Cree
9fe00c800e sfc: ef100 representor RX top half
Representor RX uses a NAPI context driven by a 'fake interrupt': when
 the parent PF receives a packet destined for the representor, it adds
 it to an SKB list (efv->rx_list), and schedules NAPI if the 'fake
 interrupt' is primed.  The NAPI poll then pulls packets off this list
 and feeds them to the stack with netif_receive_skb_list().
This scheme allows us to decouple representor RX from the parent PF's
 RX fast-path.
This patch implements the 'top half', which builds an SKB, copies data
 into it from the RX buffer (which can then be released), adds it to
 the queue and fires the 'fake interrupt' if necessary.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Edward Cree
69bb5fa73d sfc: ef100 representor RX NAPI poll
This patch adds the 'bottom half' napi->poll routine for representor RX.
See the next patch (with the top half) for an explanation of the 'fake
 interrupt' scheme used to drive this NAPI context.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Edward Cree
a95115c407 sfc: plumb ef100 representor stats
Implement .ndo_get_stats64() method to read values out of struct
 efx_rep_sw_stats.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-29 21:22:05 -07:00
Jakub Kicinski
272ac32f56 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-28 18:21:16 -07:00
Alejandro Lucero
67c3b611d9 sfc: disable softirqs for ptp TX
Sending a PTP packet can imply to use the normal TX driver datapath but
invoked from the driver's ptp worker. The kernel generic TX code
disables softirqs and preemption before calling specific driver TX code,
but the ptp worker does not. Although current ptp driver functionality
does not require it, there are several reasons for doing so:

   1) The invoked code is always executed with softirqs disabled for non
      PTP packets.
   2) Better if a ptp packet transmission is not interrupted by softirq
      handling which could lead to high latencies.
   3) netdev_xmit_more used by the TX code requires preemption to be
      disabled.

Indeed a solution for dealing with kernel preemption state based on static
kernel configuration is not possible since the introduction of dynamic
preemption level configuration at boot time using the static calls
functionality.

Fixes: f79c957a0b ("drivers: net: sfc: use netdev_xmit_more helper")
Signed-off-by: Alejandro Lucero <alejandro.lucero-palau@amd.com>
Link: https://lore.kernel.org/r/20220726064504.49613-1-alejandro.lucero-palau@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-27 18:20:43 -07:00
wangjianli
58d8bcd47e sfc/siena: fix repeated words in comments
Delete the redundant word 'in'.

Signed-off-by: wangjianli <wangjianli@cdjrlc.com>
Link: https://lore.kernel.org/r/20220724075207.21080-1-wangjianli@cdjrlc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:35:47 +02:00
wangjianli
63f1b471a0 sfc/falcon: fix repeated words in comments
Delete the redundant word 'in'.

Signed-off-by: wangjianli <wangjianli@cdjrlc.com>
Link: https://lore.kernel.org/r/20220724074746.19550-1-wangjianli@cdjrlc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-26 12:35:37 +02:00
Edward Cree
84e7fc2591 sfc: attach/detach EF100 representors along with their owning PF
Since representors piggy-back on the PF's queues for TX, they can
 only accept new TXes while the PF is up.  Thus, any operation which
 detaches the PF must first detach all its VFreps.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
f72c38fad2 sfc: hook up ef100 representor TX
Implement .ndo_start_xmit() by calling into the parent PF's TX path.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
02443ab8c9 sfc: support passing a representor to the EF100 TX path
A non-null efv in __ef100_enqueue_skb() indicates that the packet is
 from that representor, should be transmitted with a suitable option
 descriptor (to instruct the switch to deliver it to the representee),
 and should not be accounted to the parent PF's stats or BQL.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
da56552d04 sfc: determine representee m-port for EF100 representors
An MAE port, or m-port, is a port (source/destination for traffic) on
 the Match-Action Engine (the internal switch on EF100).
Representors will use their representee's m-port for two purposes: as
 a destination override on TX from the representor, and as a source
 match in 'default rules' to steer representee traffic (when not
 matched by e.g. a TC flower rule) to representor RX via the parent
 PF's receive queue.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
e1479556f8 sfc: phys port/switch identification for ef100 reps
Requires storing VF index in struct efx_rep.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
5687eb3466 sfc: add basic ethtool ops to ef100 reps
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
08135eecd0 sfc: add skeleton ef100 VF representors
No net_device_ops yet, just a placeholder netdev created per VF.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
95287e1b4e sfc: detect ef100 MAE admin privilege/capability at probe time
One PCIe function per network port (more precisely, per m-port group) is
 responsible for configuring the Match-Action Engine which performs
 switching and packet modification in the slice to support flower/OVS
 offload.  The GRP_MAE bit in the privilege mask indicates whether a
 given function has this capability.
At probe time, call MCDIs to read the calling function's privilege mask,
 and store the GRP_MAE bit in a new ef100_nic_data member.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
8ca353da9c sfc: update EF100 register descriptions
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-07-22 12:50:06 +01:00
Edward Cree
1f17708b47 sfc: update MCDI protocol headers
Link: https://lore.kernel.org/r/cover.1657878101.git.ecree.xilinx@gmail.com
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-19 13:37:04 -07:00
Jakub Kicinski
816cd16883 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  310731e2f1 ("net: Fix data-races around sysctl_mem.")
  e70f3c7012 ("Revert "net: set SK_MEM_QUANTUM to 4096"")
https://lore.kernel.org/all/20220711120211.7c8b7cba@canb.auug.org.au/

net/ipv4/fib_semantics.c
  747c143072 ("ip: fix dflt addr selection for connected nexthop")
  d62607c3fe ("net: rename reference+tracking helpers")

net/tls/tls.h
include/net/tls.h
  3d8c51b25a ("net/tls: Check for errors in tls_device_init")
  5879031423 ("tls: create an internal header")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-14 15:27:35 -07:00
Íñigo Huguet
ada74c5539 sfc: fix kernel panic when creating VF
When creating VFs a kernel panic can happen when calling to
efx_ef10_try_update_nic_stats_vf.

When releasing a DMA coherent buffer, sometimes, I don't know in what
specific circumstances, it has to unmap memory with vunmap. It is
disallowed to do that in IRQ context or with BH disabled. Otherwise, we
hit this line in vunmap, causing the crash:
  BUG_ON(in_interrupt());

This patch reenables BH to release the buffer.

Log messages when the bug is hit:
 kernel BUG at mm/vmalloc.c:2727!
 invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
 CPU: 6 PID: 1462 Comm: NetworkManager Kdump: loaded Tainted: G          I      --------- ---  5.14.0-119.el9.x86_64 #1
 Hardware name: Dell Inc. PowerEdge R740/06WXJT, BIOS 2.8.2 08/27/2020
 RIP: 0010:vunmap+0x2e/0x30
 ...skip...
 Call Trace:
  __iommu_dma_free+0x96/0x100
  efx_nic_free_buffer+0x2b/0x40 [sfc]
  efx_ef10_try_update_nic_stats_vf+0x14a/0x1c0 [sfc]
  efx_ef10_update_stats_vf+0x18/0x40 [sfc]
  efx_start_all+0x15e/0x1d0 [sfc]
  efx_net_open+0x5a/0xe0 [sfc]
  __dev_open+0xe7/0x1a0
  __dev_change_flags+0x1d7/0x240
  dev_change_flags+0x21/0x60
  ...skip...

Fixes: d778819609 ("sfc: DMA the VF stats only when requested")
Reported-by: Ma Yuying <yuma@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220713092116.21238-1-ihuguet@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-07-14 12:46:56 +02:00
Íñigo Huguet
ebe41da5d4 sfc: fix use after free when disabling sriov
Use after free is detected by kfence when disabling sriov. What was read
after being freed was vf->pci_dev: it was freed from pci_disable_sriov
and later read in efx_ef10_sriov_free_vf_vports, called from
efx_ef10_sriov_free_vf_vswitching.

Set the pointer to NULL at release time to not trying to read it later.

Reproducer and dmesg log (note that kfence doesn't detect it every time):
$ echo 1 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs
$ echo 0 > /sys/class/net/enp65s0f0np0/device/sriov_numvfs

 BUG: KFENCE: use-after-free read in efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]

 Use-after-free read at 0x00000000ff3c1ba5 (in kfence-#224):
  efx_ef10_sriov_free_vf_vswitching+0x82/0x170 [sfc]
  efx_ef10_pci_sriov_disable+0x38/0x70 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xfe/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 kfence-#224: 0x00000000edb8ef95-0x00000000671f5ce1, size=2792, cache=kmalloc-4k

 allocated by task 6771 on cpu 10 at 3137.860196s:
  pci_alloc_dev+0x21/0x60
  pci_iov_add_virtfn+0x2a2/0x320
  sriov_enable+0x212/0x3e0
  efx_ef10_sriov_configure+0x67/0x80 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xba/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

 freed by task 6771 on cpu 12 at 3170.991309s:
  device_release+0x34/0x90
  kobject_cleanup+0x3a/0x130
  pci_iov_remove_virtfn+0xd9/0x120
  sriov_disable+0x30/0xe0
  efx_ef10_pci_sriov_disable+0x57/0x70 [sfc]
  efx_pci_sriov_configure+0x24/0x40 [sfc]
  sriov_numvfs_store+0xfe/0x140
  kernfs_fop_write_iter+0x11c/0x1b0
  new_sync_write+0x11f/0x1b0
  vfs_write+0x1eb/0x280
  ksys_write+0x5f/0xe0
  do_syscall_64+0x5c/0x80
  entry_SYSCALL_64_after_hwframe+0x44/0xae

Fixes: 3c5eb87605 ("sfc: create vports for VFs and assign random MAC addresses")
Reported-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220712062642.6915-1-ihuguet@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-13 19:42:09 -07:00
Christophe JAILLET
ee4c0c5d25 sfc: falcon: Use the bitmap API to allocate bitmaps
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them.

It is less verbose and it improves the semantic.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/c62c1774e6a34bc64323ce526b385aa87c1ca575.1657049799.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-06 19:54:42 -07:00
Christophe JAILLET
820aceb53c sfc/siena: Use the bitmap API to allocate bitmaps
Use bitmap_zalloc()/bitmap_free() instead of hand-writing them.

It is less verbose and it improves the semantic.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/717ba530215f4d7ce9fedcc73d98dba1f70d7f71.1657049636.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-07-06 19:54:32 -07:00
Jonathan Cooper
98ff4c7c8a sfc: Separate netdev probe/remove from PCI probe/remove
The netdev probe will be used when moving from vDPA to EF100 BAR config.
The netdev remove will be used when moving from EF100 to vDPA BAR config.

In the process, change several log messages to pci_ instead of netif_
to remove the "(unregistered net_device)" text.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
7592d754c0 sfc: replace function name in string with __func__
Minor fix to existing code to make later patch checkpatch clean.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
bba84bf4c1 sfc: Unsplit literal string.
Minor fix to existing code to later patch checkpatch clean.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
3e341d84bd sfc: Move EF100 efx_nic_type structs to the end of the file
This avoids a forward declaration in a subsequent patch.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
7e773594da sfc: Separate efx_nic memory from net_device memory
As we have a lot of common code this applies to all NIC architectures.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
8cb03f4e08 sfc: Encapsulate access to netdev_priv()
Once we separate struct efx_nic memory from net_device memory the
existing usage will have to change.
Apart from the new function efx_netdev_priv() accesses have been
changed using:
  sed -i 's/netdev_priv/efx_netdev_priv/'

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
b3fd0a86da sfc: Change BUG_ON to WARN_ON and recovery code.
Pre-emptively fix a checkpatch warning in a subsequent patch.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:47 +01:00
Jonathan Cooper
62ac3ce542 sfc: Remove netdev init from efx_init_struct
Move functionality involving the struct net_device out of
efx_init_struct so that we can initialise without a net dev
for VDPA operation.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:46 +01:00
Jonathan Cooper
8b39db19b2 sfc: Add a PROBED state for EF100 VDPA use.
For VDPA we need to tear down the driver to the point where it
has various control channels like MCDI, but it no longer has
a network device. This adds a state corresponding to
that mode that will be used when VDPA support is added.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Co-developed-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:46 +01:00
Jonathan Cooper
813cf9d1e7 sfc: Split STATE_READY in to STATE_NET_DOWN and STATE_NET_UP.
This patch splits the READY state in to NET_UP and NET_DOWN. This
is to prepare for future work to delay resource allocation until
interface up so that we can use resources more efficiently in
SRIOV environments, and also to lay the ground work for an extra
PROBED state where we don't create a network interface,
for VDPA operation.

Signed-off-by: Jonathan Cooper <jonathan.s.cooper@amd.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-29 13:40:46 +01:00
Jilin Yuan
bb4a0c88ae sfc:falcon: fix repeated words in comments
Delete the redundant word 'in'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:59:54 +01:00
Jilin Yuan
982a2b5ffd sfc: fix repeated words in comments
Delete the redundant word 'set'.
Delete the redundant word 'a'.
Delete the redundant word 'in'.
Found the same error as before.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-27 11:59:54 +01:00
Jilin Yuan
85a1c6536f sfc: siena: fix repeated words in comments
Delete the redundant word 'set'.
Delete the redundant word 'a'.
Delete the redundant word 'in'.

Signed-off-by: Jilin Yuan <yuanjilin@cdjrlc.com>
Link: https://lore.kernel.org/r/20220623043115.60482-1-yuanjilin@cdjrlc.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-24 16:25:33 -07:00
Xiang wangx
9776fe0f42 sfc/siena: Fix typo in comment
Delete the redundant word 'and'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-19 11:29:28 +01:00
Xiang wangx
dd33c5932e sfc: Fix typo in comment
Delete the redundant word 'and'.

Signed-off-by: Xiang wangx <wangxiang@cdjrlc.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-06-19 11:29:28 +01:00
Íñigo Huguet
25bde571b4 sfc/siena: fix wrong tx channel offset with efx_separate_tx_channels
tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
also calculated again in efx_set_channels because it was originally done
there, and when efx_allocate_msix_channels was introduced it was
forgotten to be removed from efx_set_channels.

Moreover, the old calculation is wrong when using
efx_separate_tx_channels because now we can have XDP channels after the
TX channels, so n_channels - n_tx_channels doesn't point to the first TX
channel.

Remove the old calculation from efx_set_channels, and add the
initialization of this variable if MSI or legacy interrupts are used,
next to the initialization of the rest of the related variables, where
it was missing.

This has been already done for sfc, do it also for sfc_siena.

Fixes: 3990a8fffb ("sfc: allocate channels for XDP tx queues")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01 17:47:17 -07:00
Martin Habets
183614bff5 sfc/siena: fix considering that all channels have TX queues
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).

Fix efx_channel_has_tx_queues to return the correct value for this case
too.

This has been already done for sfc, do it also for sfc_siena.

Messages shown at probe time before the fix:
 sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
 ------------[ cut here ]------------
 netdevice: ens6f0np0: failed to initialise TXQ -1
 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 Call Trace:
  efx_init_tx_queue+0xaa/0xf0 [sfc]
  efx_start_channels+0x49/0x120 [sfc]
  efx_start_all+0x1f8/0x430 [sfc]
  efx_net_open+0x5a/0xe0 [sfc]
  __dev_open+0xd0/0x190
  __dev_change_flags+0x1b3/0x220
  dev_change_flags+0x21/0x60
 [...] stripped

Messages shown at remove time before the fix:
 sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
 sfc 0000:03:00.0 ens6f0np0: failed to flush queues

Fixes: 8700aff089 ("sfc: fix channel allocation with brute force")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Tested-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-06-01 17:47:16 -07:00
Íñigo Huguet
c308dfd1b4 sfc: fix wrong tx channel offset with efx_separate_tx_channels
tx_channel_offset is calculated in efx_allocate_msix_channels, but it is
also calculated again in efx_set_channels because it was originally done
there, and when efx_allocate_msix_channels was introduced it was
forgotten to be removed from efx_set_channels.

Moreover, the old calculation is wrong when using
efx_separate_tx_channels because now we can have XDP channels after the
TX channels, so n_channels - n_tx_channels doesn't point to the first TX
channel.

Remove the old calculation from efx_set_channels, and add the
initialization of this variable if MSI or legacy interrupts are used,
next to the initialization of the rest of the related variables, where
it was missing.

Fixes: 3990a8fffb ("sfc: allocate channels for XDP tx queues")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-29 12:12:53 +01:00
Martin Habets
2e102b53f8 sfc: fix considering that all channels have TX queues
Normally, all channels have RX and TX queues, but this is not true if
modparam efx_separate_tx_channels=1 is used. In that cases, some
channels only have RX queues and others only TX queues (or more
preciselly, they have them allocated, but not initialized).

Fix efx_channel_has_tx_queues to return the correct value for this case
too.

Messages shown at probe time before the fix:
 sfc 0000:03:00.0 ens6f0np0: MC command 0x82 inlen 544 failed rc=-22 (raw=0) arg=0
 ------------[ cut here ]------------
 netdevice: ens6f0np0: failed to initialise TXQ -1
 WARNING: CPU: 1 PID: 626 at drivers/net/ethernet/sfc/ef10.c:2393 efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 RIP: 0010:efx_ef10_tx_init+0x201/0x300 [sfc]
 [...] stripped
 Call Trace:
  efx_init_tx_queue+0xaa/0xf0 [sfc]
  efx_start_channels+0x49/0x120 [sfc]
  efx_start_all+0x1f8/0x430 [sfc]
  efx_net_open+0x5a/0xe0 [sfc]
  __dev_open+0xd0/0x190
  __dev_change_flags+0x1b3/0x220
  dev_change_flags+0x21/0x60
 [...] stripped

Messages shown at remove time before the fix:
 sfc 0000:03:00.0 ens6f0np0: failed to flush 10 queues
 sfc 0000:03:00.0 ens6f0np0: failed to flush queues

Fixes: 8700aff089 ("sfc: fix channel allocation with brute force")
Reported-by: Tianhao Zhao <tizhao@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Tested-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-29 12:12:53 +01:00
Martin Habets
cc398a34d1 sfc/siena: Remove duplicate check on segments
Siena only supports software TSO. This means more code can be deleted,
as pointed out by the Smatch static checker warning:
	drivers/net/ethernet/sfc/siena/tx.c:184 __efx_siena_enqueue_skb()
	warn: duplicate check 'segments' (previous on line 158)

Fixes: 956f2d86cb ("sfc/siena: Remove build references to missing functionality")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Link: https://lore.kernel.org/kernel-janitors/YoH5tJMnwuGTrn1Z@kili/
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/165294463549.23865.4557617334650441347.stgit@palantir17.mph.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-20 16:46:49 -07:00
Saeed Mahameed
309ec44307 sfc: siena: Have a unique wrapper ifndef for efx channels header
Both sfc/efx_channels.h and sfc/siena/efx_channels.h used the same
wrapper #ifndef EFX_CHANNELS_H, this patch changes the siena define to be
EFX_SIENA_CHANNELS_H to avoid build system confusion.

This fixes the following build break:
drivers/net/ethernet/sfc/ptp.c:2191:28:
error: ‘efx_copy_channel’ undeclared here (not in a function); did you mean ‘efx_ptp_channel’?
  2191 |  .copy                   = efx_copy_channel,

Fixes: 6e173d3b4a ("sfc: Copy shared files needed for Siena (part 1)")
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Cc: Edward Cree <ecree.xilinx@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220518065820.131611-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-18 20:07:58 -07:00
Alexander Duyck
7c4e983c4f net: allow gso_max_size to exceed 65536
The code for gso_max_size was added originally to allow for debugging and
workaround of buggy devices that couldn't support TSO with blocks 64K in
size. The original reason for limiting it to 64K was because that was the
existing limits of IPv4 and non-jumbogram IPv6 length fields.

With the addition of Big TCP we can remove this limit and allow the value
to potentially go up to UINT_MAX and instead be limited by the tso_max_size
value.

So in order to support this we need to go through and clean up the
remaining users of the gso_max_size value so that the values will cap at
64K for non-TCPv6 flows. In addition we can clean up the GSO_MAX_SIZE value
so that 64K becomes GSO_LEGACY_MAX_SIZE and UINT_MAX will now be the upper
limit for GSO_MAX_SIZE.

v6: (edumazet) fixed a compile error if CONFIG_IPV6=n,
               in a new sk_trim_gso_size() helper.
               netif_set_tso_max_size() caps the requested TSO size
               with GSO_MAX_SIZE.

Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-16 10:18:55 +01:00
Jakub Kicinski
c28678162b eth: sfc: remove remnants of the out-of-tree napi_weight module param
Remove napi_weight statics which are set to 64 and never modified,
remnants of the out-of-tree napi_weight module param.

Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220512205603.1536771-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-13 17:03:09 -07:00
Ren Zhijie
f9a210c72d sfc: siena: Fix Kconfig dependencies
If CONFIG_PTP_1588_CLOCK=m and CONFIG_SFC_SIENA=y, the siena driver will fail to link:

drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_remove_channel':
ptp.c:(.text+0xa28): undefined reference to `ptp_clock_unregister'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_probe_channel':
ptp.c:(.text+0x13a0): undefined reference to `ptp_clock_register'
ptp.c:(.text+0x1470): undefined reference to `ptp_clock_unregister'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_ptp_pps_worker':
ptp.c:(.text+0x1d29): undefined reference to `ptp_clock_event'
drivers/net/ethernet/sfc/siena/ptp.o: In function `efx_siena_ptp_get_ts_info':
ptp.c:(.text+0x301b): undefined reference to `ptp_clock_index'

To fix this build error, make SFC_SIENA depends on PTP_1588_CLOCK.

Reported-by: Hulk Robot <hulkci@huawei.com>
Fixes: d48523cb88 ("sfc: Copy shared files needed for Siena (part 2)")
Signed-off-by: Ren Zhijie <renzhijie2@huawei.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220513012721.140871-1-renzhijie2@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-13 10:18:57 -07:00
Martin Habets
c374303969 sfc/siena: Reinstate SRIOV init/fini function calls
They were removed in the first series since they were not used for EF10.
Put that code back for Siena, with the prototypes in siena_sriov.h
since that file is a more applicable place for it.

Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:30 -07:00
Martin Habets
ef9b577094 sfc/siena: Make PTP and reset support specific for Siena
Change the clock name and work queue names to differentiate them from
the names used in sfc.ko.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:30 -07:00
Martin Habets
58b6b3d537 sfc/siena: Make MCDI logging support specific for Siena
Add a Siena Kconfig option and use it in stead of the sfc one.
Rename the internal variable for the 'mcdi_logging_default' module
parameter to avoid a naming conflict with the one in sfc.ko.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:30 -07:00
Martin Habets
f62a074525 siena: Make HWMON support specific for Siena
Add a Siena Kconfig option and use it in stead of the sfc one.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:29 -07:00
Martin Habets
dfb1cfbd49 siena: Make SRIOV support specific for Siena
Add a Siena Kconfig option and use it in stead of the sfc one.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:29 -07:00
Martin Habets
65d4b471b3 siena: Make MTD support specific for Siena
Add a Siena Kconfig option and use it in stead of the sfc one.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:49:29 -07:00
Jakub Kicinski
9b19e57a3c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Build issue in drivers/net/ethernet/sfc/ptp.c
  54fccfdd7c ("sfc: efx_default_channel_type APIs can be static")
  49e6123c65 ("net: sfc: fix memory leak due to ptp channel")
https://lore.kernel.org/all/20220510130556.52598fe2@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 16:15:30 -07:00
Taehee Yoo
1fa89ffbc0 net: sfc: ef10: fix memory leak in efx_ef10_mtd_probe()
In the NIC ->probe() callback, ->mtd_probe() callback is called.
If NIC has 2 ports, ->probe() is called twice and ->mtd_probe() too.
In the ->mtd_probe(), which is efx_ef10_mtd_probe() it allocates and
initializes mtd partiion.
But mtd partition for sfc is shared data.
So that allocated mtd partition data from last called
efx_ef10_mtd_probe() will not be used.
Therefore it must be freed.
But it doesn't free a not used mtd partition data in efx_ef10_mtd_probe().

kmemleak reports:
unreferenced object 0xffff88811ddb0000 (size 63168):
  comm "systemd-udevd", pid 265, jiffies 4294681048 (age 348.586s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffffa3767749>] kmalloc_order_trace+0x19/0x120
    [<ffffffffa3873f0e>] __kmalloc+0x20e/0x250
    [<ffffffffc041389f>] efx_ef10_mtd_probe+0x11f/0x270 [sfc]
    [<ffffffffc0484c8a>] efx_pci_probe.cold.17+0x3df/0x53d [sfc]
    [<ffffffffa414192c>] local_pci_probe+0xdc/0x170
    [<ffffffffa4145df5>] pci_device_probe+0x235/0x680
    [<ffffffffa443dd52>] really_probe+0x1c2/0x8f0
    [<ffffffffa443e72b>] __driver_probe_device+0x2ab/0x460
    [<ffffffffa443e92a>] driver_probe_device+0x4a/0x120
    [<ffffffffa443f2ae>] __driver_attach+0x16e/0x320
    [<ffffffffa4437a90>] bus_for_each_dev+0x110/0x190
    [<ffffffffa443b75e>] bus_add_driver+0x39e/0x560
    [<ffffffffa4440b1e>] driver_register+0x18e/0x310
    [<ffffffffc02e2055>] 0xffffffffc02e2055
    [<ffffffffa3001af3>] do_one_initcall+0xc3/0x450
    [<ffffffffa33ca574>] do_init_module+0x1b4/0x700

Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Fixes: 8127d661e7 ("sfc: Add support for Solarflare SFC9100 family")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://lore.kernel.org/r/20220512054709.12513-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-12 10:01:36 -07:00
Martin Habets
c5a13c319e sfc: Add a basic Siena module
Make the (un)load message more specific to differentiate it from
the sfc.ko messages.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:15 -07:00
Martin Habets
782f713084 sfc/siena: Inline functions in sriov.h to avoid conflicts with sfc
The implementation of each is quite short. This means sriov.c is
not needed any more.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:15 -07:00
Martin Habets
c8443b6982 sfc/siena: Rename functions in nic_common.h to avoid conflicts with sfc
For siena use efx_siena_ as the function prefix.
efx_nic_update_stats_atomic is only used in efx_common.c, so move
it there.
efx_nic_copy_stats is not used in Siena, so it is removed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:15 -07:00
Martin Habets
4d49e5cd4b sfc/siena: Rename functions in mcdi headers to avoid conflicts with sfc
For siena use efx_siena_ as the function prefix.
Several functions are not used in Siena, so they are removed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:15 -07:00
Martin Habets
95e96f7788 sfc/siena: Rename peripheral functions to avoid conflicts with sfc
For siena use efx_siena_ as the function prefix.
This patch covers selftest.h, ptp.h, net_driver.h and ethtool_common.h.
efx_ethtool_fill_self_tests() can become static.
Some functions in ptp.c can also become static.
Rename loopback_mode in net_driver.h.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:15 -07:00
Martin Habets
7f9e4b2a61 sfc/siena: Rename RX/TX functions to avoid conflicts with sfc
For siena use efx_siena_ as the function prefix.
Several functions are not used in Siena, so they are removed.

Use a Siena specific variable name for module parameter
efx_separate_tx_channels.
Move efx_fini_tx_queue() to avoid a forward declaration of
efx_dequeue_buffer().

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
71ad88f661 sfc/siena: Rename functions in efx headers to avoid conflicts with sfc
When building with allyesconfig there are many identical
symbol names.
For siena use efx_siena_ as the function and variable prefix
to avoid build errors.

efx_mtd_remove_partition can become static as it is no longer called
from other files.
efx_ticks_to_usecs and efx_xmit_done_single are not used in Siena, so
they are removed.
Several functions are only used inside efx_channels.c for Siena so
they can become static.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
956f2d86cb sfc/siena: Remove build references to missing functionality
Functionality not supported or needed on Siena includes:
- Anything for EF100
- EF10 specifics such as register access, PIO and TSO offload.
Also only bind to Siena NICs.

Remove EF10 specifics from nic.h.
The functions that start with efx_farch_ will be removed from sfc.ko
with a subsequent patch.
Add the efx_ prefix to siena_prepare_flush() to make it consistent
with the other APIs.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
d48523cb88 sfc: Copy shared files needed for Siena (part 2)
These are the files starting with m through w.
No changes are done, those will be done with subsequent commits.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
6e173d3b4a sfc: Copy shared files needed for Siena (part 1)
These are the files starting with b through i.
No changes are done, those will be done with subsequent commits.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:14 -07:00
Martin Habets
36ff639329 sfc: Move Siena specific files
Files are only moved, no changes are made.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-05-10 15:38:13 -07:00
Taehee Yoo
49e6123c65 net: sfc: fix memory leak due to ptp channel
It fixes memory leak in ring buffer change logic.

When ring buffer size is changed(ethtool -G eth0 rx 4096), sfc driver
works like below.
1. stop all channels and remove ring buffers.
2. allocates new buffer array.
3. allocates rx buffers.
4. start channels.

While the above steps are working, it skips some steps if the channel
doesn't have a ->copy callback function.
Due to ptp channel doesn't have ->copy callback, these above steps are
skipped for ptp channel.
It eventually makes some problems.
a. ptp channel's ring buffer size is not changed, it works only
   1024(default).
b. memory leak.

The reason for memory leak is to use the wrong ring buffer values.
There are some values, which is related to ring buffer size.
a. efx->rxq_entries
 - This is global value of rx queue size.
b. rx_queue->ptr_mask
 - used for access ring buffer as circular ring.
 - roundup_pow_of_two(efx->rxq_entries) - 1
c. rx_queue->max_fill
 - efx->rxq_entries - EFX_RXD_HEAD_ROOM

These all values should be based on ring buffer size consistently.
But ptp channel's values are not.
a. efx->rxq_entries
 - This is global(for sfc) value, always new ring buffer size.
b. rx_queue->ptr_mask
 - This is always 1023(default).
c. rx_queue->max_fill
 - This is new ring buffer size - EFX_RXD_HEAD_ROOM.

Let's assume we set 4096 for rx ring buffer,

                      normal channel     ptp channel
efx->rxq_entries      4096               4096
rx_queue->ptr_mask    4095               1023
rx_queue->max_fill    4086               4086

sfc driver allocates rx ring buffers based on these values.
When it allocates ptp channel's ring buffer, 4086 ring buffers are
allocated then, these buffers are attached to the allocated array.
But ptp channel's ring buffer array size is still 1024(default)
and ptr_mask is still 1023 too.
So, 3062 ring buffers will be overwritten to the array.
This is the reason for memory leak.

Test commands:
   ethtool -G <interface name> rx 4096
   while :
   do
       ip link set <interface name> up
       ip link set <interface name> down
   done

In order to avoid this problem, it adds ->copy callback to ptp channel
type.
So that rx_queue->ptr_mask value will be updated correctly.

Fixes: 7c236c43b8 ("sfc: Add support for IEEE-1588 PTP")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-09 10:01:48 +01:00
Jakub Kicinski
b707b89f7b eth: switch to netif_napi_add_weight()
Switch all Ethernet drivers which use custom napi weights
to the new API.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-08 11:33:57 +01:00
Jakub Kicinski
ee8b7a1156 net: make drivers set the TSO limit not the GSO limit
Drivers should call the TSO setting helper, GSO is controllable
by user space.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-06 12:07:56 +01:00
Martin Habets
6b73f20ab6 sfc: Copy a subset of mcdi_pcol.h to siena
For Siena we do not need new messages that were defined
for the EF100 architecture. Several debug messages have
also been removed.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04 11:18:08 +01:00
Martin Habets
0c38a5bd60 sfc: Disable Siena support
Disable the build of Siena code until later in this patch series.
Prevent sfc.ko from binding to Siena NICs.

efx_init_sriov/efx_fini_sriov is only used for Siena. Remove calls
to those.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-05-04 11:18:08 +01:00
Pieter Jansen van Vuuren
78a9b3c47b sfc: add EF100 VF support via a write to sriov_numvfs
This patch extends the EF100 PF driver by adding .sriov_configure()
which would allow users to enable and disable virtual functions
using the sriov sysfs.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/75e74d9e-14ce-0524-9668-5ab735a7cf62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 18:43:01 -07:00
Martin Habets
d78eaf06b5 sfc: Remove global definition of efx_reset_type_names
The strings are only used in efx_common.c so the definitions
can be static in there.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 12:13:30 +02:00
Martin Habets
cc42e4e3f1 sfc: Remove duplicate definition of efx_xmit_done
It is defined both in efx.h and tx_common.h.
Remove the definition in efx.h.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 12:13:30 +02:00
Martin Habets
54fccfdd7c sfc: efx_default_channel_type APIs can be static
This means we can remove them from efx_channel.h and avoid
naming conflicts later.
efx_channel_dummy_op_void() cannot be static as it is
used in ef100_nic.c.

Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 12:13:30 +02:00
Lv Ruyi
ac6bef064f sfc: Fix spelling mistake "writting" -> "writing"
There are some spelling mistakes in the comment. Fix it.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Link: https://lore.kernel.org/r/20220411032546.2517628-1-lv.ruyi@zte.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 09:34:05 +02:00
Haowen Bai
b8ff3395fb sfc: ef10: Fix assigning negative value to unsigned variable
fix warning reported by smatch:
251 drivers/net/ethernet/sfc/ef10.c:2259 efx_ef10_tx_tso_desc()
warn: assigning (-208) to unsigned variable 'ip_tot_len'

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/1649640757-30041-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-11 20:53:09 -07:00
Bert Kenward
bd4a2697e5 sfc: use hardware tx timestamps for more than PTP
The 8000 series and newer NICs all get hardware timestamps from the MAC
 and can provide timestamps on a normal TX queue, rather than via a slow
 path through the MC. As such we can use this path for any packet where a
 hardware timestamp is requested.
This also enables support for PTP over transports other than IPv4+UDP.

Signed-off-by: Bert Kenward <bkenward@solarflare.com>
Signed-off-by: Edward Cree <ecree@xilinx.com>
Link: https://lore.kernel.org/r/510652dc-54b4-0e11-657e-e37ee3ca26a9@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 14:43:10 -07:00
Jakub Kicinski
dc2e0617f1 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07 23:24:23 -07:00
Robin Murphy
6a62924c0a sfc: Stop using iommu_present()
Even if an IOMMU might be present for some PCI segment in the system,
that doesn't necessarily mean it provides translation for the device
we care about. It appears that what we care about here is specifically
whether DMA mapping ops involve any IOMMU overhead or not, so check for
translation actually being active for our device.

Signed-off-by: Robin Murphy <robin.murphy@arm.com>
Acked-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/7350f957944ecfce6cce90f422e3992a1f428775.1649166055.git.robin.murphy@arm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-07 21:04:10 -07:00
Taehee Yoo
fb5833d81e net: sfc: fix using uninitialized xdp tx_queue
In some cases, xdp tx_queue can get used before initialization.
1. interface up/down
2. ring buffer size change

When CPU cores are lower than maximum number of channels of sfc driver,
it creates new channels only for XDP.

When an interface is up or ring buffer size is changed, all channels
are initialized.
But xdp channels are always initialized later.
So, the below scenario is possible.
Packets are received to rx queue of normal channels and it is acted
XDP_TX and tx_queue of xdp channels get used.
But these tx_queues are not initialized yet.
If so, TX DMA or queue error occurs.

In order to avoid this problem.
1. initializes xdp tx_queues earlier than other rx_queue in
efx_start_channels().
2. checks whether tx_queue is initialized or not in efx_xdp_tx_buffers().

Splat looks like:
   sfc 0000:08:00.1 enp8s0f1np1: TX queue 10 spurious TX completion id 250
   sfc 0000:08:00.1 enp8s0f1np1: resetting (RECOVER_OR_ALL)
   sfc 0000:08:00.1 enp8s0f1np1: MC command 0x80 inlen 100 failed rc=-22
   (raw=22) arg=789
   sfc 0000:08:00.1 enp8s0f1np1: has been disabled

Fixes: f28100cb9c ("sfc: fix lack of XDP TX queues - error XDP TX failed (-22)")
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-06 13:50:17 +01:00
Martin Habets
458f5d92df sfc: Do not free an empty page_ring
When the page_ring is not used page_ptr_mask is 0.
Do not dereference page_ring[0] in this case.

Fixes: 2768935a46 ("sfc: reuse pages to avoid DMA mapping/unmapping costs")
Reported-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-04 12:53:52 +01:00
Taehee Yoo
059a47f1da net: sfc: add missing xdp queue reinitialization
After rx/tx ring buffer size is changed, kernel panic occurs when
it acts XDP_TX or XDP_REDIRECT.

When tx/rx ring buffer size is changed(ethtool -G), sfc driver
reallocates and reinitializes rx and tx queues and their buffer
(tx_queue->buffer).
But it misses reinitializing xdp queues(efx->xdp_tx_queues).
So, while it is acting XDP_TX or XDP_REDIRECT, it uses the uninitialized
tx_queue->buffer.

A new function efx_set_xdp_channels() is separated from efx_set_channels()
to handle only xdp queues.

Splat looks like:
   BUG: kernel NULL pointer dereference, address: 000000000000002a
   #PF: supervisor write access in kernel mode
   #PF: error_code(0x0002) - not-present page
   PGD 0 P4D 0
   Oops: 0002 [#4] PREEMPT SMP NOPTI
   RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc]
   CPU: 2 PID: 0 Comm: swapper/2 Tainted: G      D           5.17.0+ #55 e8beeee8289528f11357029357cf
   Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80
   RSP: 0018:ffff92f121e45c60 EFLAGS: 00010297
   RIP: 0010:efx_tx_map_chunk+0x54/0x90 [sfc]
   RAX: 0000000000000040 RBX: ffff92ea506895c0 RCX: ffffffffc0330870
   RDX: 0000000000000001 RSI: 00000001139b10ce RDI: ffff92ea506895c0
   RBP: ffffffffc0358a80 R08: 00000001139b110d R09: 0000000000000000
   R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040
   R13: 0000000000000018 R14: 00000001139b10ce R15: ffff92ea506895c0
   FS:  0000000000000000(0000) GS:ffff92f121ec0000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   Code: 48 8b 8d a8 01 00 00 48 8d 14 52 4c 8d 2c d0 44 89 e0 48 85 c9 74 0e 44 89 e2 4c 89 f6 48 80
   CR2: 000000000000002a CR3: 00000003e6810004 CR4: 00000000007706e0
   RSP: 0018:ffff92f121e85c60 EFLAGS: 00010297
   PKRU: 55555554
   RAX: 0000000000000040 RBX: ffff92ea50689700 RCX: ffffffffc0330870
   RDX: 0000000000000001 RSI: 00000001145a90ce RDI: ffff92ea50689700
   RBP: ffffffffc0358a80 R08: 00000001145a910d R09: 0000000000000000
   R10: 0000000000000001 R11: ffff92ea414c0088 R12: 0000000000000040
   R13: 0000000000000018 R14: 00000001145a90ce R15: ffff92ea50689700
   FS:  0000000000000000(0000) GS:ffff92f121e80000(0000) knlGS:0000000000000000
   CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
   CR2: 000000000000002a CR3: 00000003e6810005 CR4: 00000000007706e0
   PKRU: 55555554
   Call Trace:
    <IRQ>
    efx_xdp_tx_buffers+0x12b/0x3d0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
    __efx_rx_packet+0x5c3/0x930 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
    efx_rx_packet+0x28c/0x2e0 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
    efx_ef10_ev_process+0x5f8/0xf40 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]
    ? enqueue_task_fair+0x95/0x550
    efx_poll+0xc4/0x360 [sfc 84c94b8e32d44d296c17e10a634d3ad454de4ba5]

Fixes: 3990a8fffb ("sfc: allocate channels for XDP tx queues")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01 11:52:21 +01:00
Martin Habets
c9ad266bbe sfc: Avoid NULL pointer dereference on systems without numa awareness
On such systems cpumask_of_node() returns NULL, which bitmap
operations are not happy with.

Fixes: c265b569a4 ("sfc: default config to 1 channel/core in local NUMA node only")
Fixes: 09a99ab16c ("sfc: set affinity hints in local NUMA node only")
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Reviewed-by: Íñigo Huguet <ihuguet@redhat.com>
Link: https://lore.kernel.org/r/164857006953.8140.3265568858101821256.stgit@palantir17.mph.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-30 12:08:58 -07:00
Jakub Kicinski
80901bff81 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/batman-adv/hard-interface.c
  commit 690bb6fb64 ("batman-adv: Request iflink once in batadv-on-batadv check")
  commit 6ee3c393ee ("batman-adv: Demote batadv-on-batadv skip error message")
https://lore.kernel.org/all/20220302163049.101957-1-sw@simonwunderlich.de/

net/smc/af_smc.c
  commit 4d08b7b57e ("net/smc: Fix cleanup when register ULP fails")
  commit 462791bbfa ("net/smc: add sysctl interface for SMC")
https://lore.kernel.org/all/20220302112209.355def40@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-03 11:55:12 -08:00
Niels Dossche
f1fb205efb sfc: extend the locking on mcdi->seqno
seqno could be read as a stale value outside of the lock. The lock is
already acquired to protect the modification of seqno against a possible
race condition. Place the reading of this value also inside this locking
to protect it against a possible race condition.

Signed-off-by: Niels Dossche <dossche.niels@gmail.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-03 14:11:58 +00:00
Íñigo Huguet
09a99ab16c sfc: set affinity hints in local NUMA node only
Affinity hints were being set to CPUs in local NUMA node first, and then
in other CPUs. This was creating 2 unintended issues:
1. Channels created to be assigned each to a different physical core
   were assigned to hyperthreading siblings because of being in same
   NUMA node.
   Since the patch previous to this one, this did not longer happen
   with default rss_cpus modparam because less channels are created.
2. XDP channels could be assigned to CPUs in different NUMA nodes,
   decreasing performance too much (to less than half in some of my
   tests).

This patch sets the affinity hints spreading the channels only in local
NUMA node's CPUs. A fallback for the case that no CPU in local NUMA node
is online has been added too.

Example of CPUs being assigned in a non optimal way before this and the
previous patch (note: in this system, xdp-8 to xdp-15 are created
because num_possible_cpus == 64, but num_present_cpus == 32 so they're
never used):

$ lscpu | grep -i numa
NUMA node(s):                    2
NUMA node0 CPU(s):               0-7,16-23
NUMA node1 CPU(s):               8-15,24-31

$ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list
/proc/irq/141/0000:07:00.0-0/../smp_affinity_list:0
/proc/irq/142/0000:07:00.0-1/../smp_affinity_list:1
/proc/irq/143/0000:07:00.0-2/../smp_affinity_list:2
/proc/irq/144/0000:07:00.0-3/../smp_affinity_list:3
/proc/irq/145/0000:07:00.0-4/../smp_affinity_list:4
/proc/irq/146/0000:07:00.0-5/../smp_affinity_list:5
/proc/irq/147/0000:07:00.0-6/../smp_affinity_list:6
/proc/irq/148/0000:07:00.0-7/../smp_affinity_list:7
/proc/irq/149/0000:07:00.0-8/../smp_affinity_list:16
/proc/irq/150/0000:07:00.0-9/../smp_affinity_list:17
/proc/irq/151/0000:07:00.0-10/../smp_affinity_list:18
/proc/irq/152/0000:07:00.0-11/../smp_affinity_list:19
/proc/irq/153/0000:07:00.0-12/../smp_affinity_list:20
/proc/irq/154/0000:07:00.0-13/../smp_affinity_list:21
/proc/irq/155/0000:07:00.0-14/../smp_affinity_list:22
/proc/irq/156/0000:07:00.0-15/../smp_affinity_list:23
/proc/irq/157/0000:07:00.0-xdp-0/../smp_affinity_list:8
/proc/irq/158/0000:07:00.0-xdp-1/../smp_affinity_list:9
/proc/irq/159/0000:07:00.0-xdp-2/../smp_affinity_list:10
/proc/irq/160/0000:07:00.0-xdp-3/../smp_affinity_list:11
/proc/irq/161/0000:07:00.0-xdp-4/../smp_affinity_list:12
/proc/irq/162/0000:07:00.0-xdp-5/../smp_affinity_list:13
/proc/irq/163/0000:07:00.0-xdp-6/../smp_affinity_list:14
/proc/irq/164/0000:07:00.0-xdp-7/../smp_affinity_list:15
/proc/irq/165/0000:07:00.0-xdp-8/../smp_affinity_list:24
/proc/irq/166/0000:07:00.0-xdp-9/../smp_affinity_list:25
/proc/irq/167/0000:07:00.0-xdp-10/../smp_affinity_list:26
/proc/irq/168/0000:07:00.0-xdp-11/../smp_affinity_list:27
/proc/irq/169/0000:07:00.0-xdp-12/../smp_affinity_list:28
/proc/irq/170/0000:07:00.0-xdp-13/../smp_affinity_list:29
/proc/irq/171/0000:07:00.0-xdp-14/../smp_affinity_list:30
/proc/irq/172/0000:07:00.0-xdp-15/../smp_affinity_list:31

CPUs assignments after this and previous patch, so normal channels
created only one per core in NUMA node and affinities set only to local
NUMA node:

$ grep -H . /proc/irq/*/0000:07:00.0*/../smp_affinity_list
/proc/irq/116/0000:07:00.0-0/../smp_affinity_list:0
/proc/irq/117/0000:07:00.0-1/../smp_affinity_list:1
/proc/irq/118/0000:07:00.0-2/../smp_affinity_list:2
/proc/irq/119/0000:07:00.0-3/../smp_affinity_list:3
/proc/irq/120/0000:07:00.0-4/../smp_affinity_list:4
/proc/irq/121/0000:07:00.0-5/../smp_affinity_list:5
/proc/irq/122/0000:07:00.0-6/../smp_affinity_list:6
/proc/irq/123/0000:07:00.0-7/../smp_affinity_list:7
/proc/irq/124/0000:07:00.0-xdp-0/../smp_affinity_list:16
/proc/irq/125/0000:07:00.0-xdp-1/../smp_affinity_list:17
/proc/irq/126/0000:07:00.0-xdp-2/../smp_affinity_list:18
/proc/irq/127/0000:07:00.0-xdp-3/../smp_affinity_list:19
/proc/irq/128/0000:07:00.0-xdp-4/../smp_affinity_list:20
/proc/irq/129/0000:07:00.0-xdp-5/../smp_affinity_list:21
/proc/irq/130/0000:07:00.0-xdp-6/../smp_affinity_list:22
/proc/irq/131/0000:07:00.0-xdp-7/../smp_affinity_list:23
/proc/irq/132/0000:07:00.0-xdp-8/../smp_affinity_list:0
/proc/irq/133/0000:07:00.0-xdp-9/../smp_affinity_list:1
/proc/irq/134/0000:07:00.0-xdp-10/../smp_affinity_list:2
/proc/irq/135/0000:07:00.0-xdp-11/../smp_affinity_list:3
/proc/irq/136/0000:07:00.0-xdp-12/../smp_affinity_list:4
/proc/irq/137/0000:07:00.0-xdp-13/../smp_affinity_list:5
/proc/irq/138/0000:07:00.0-xdp-14/../smp_affinity_list:6
/proc/irq/139/0000:07:00.0-xdp-15/../smp_affinity_list:7

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 17:12:38 -08:00
Íñigo Huguet
c265b569a4 sfc: default config to 1 channel/core in local NUMA node only
Handling channels from CPUs in different NUMA node can penalize
performance, so better configure only one channel per core in the same
NUMA node than the NIC, and not per each core in the system.

Fallback to all other online cores if there are not online CPUs in local
NUMA node.

Signed-off-by: Íñigo Huguet <ihuguet@redhat.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-01 17:12:38 -08:00
Martin Habets
000fe940e5 sfc: The size of the RX recycle ring should be more flexible
Ideally the size would depend on the link speed, but the recycle
ring is created when the interface is brought up before the driver
knows the link speed. So size it for the maximum speed of a given NIC.
PowerPC is only supported on SFN7xxx and SFN8xxx NICs.

With this patch on a 40G NIC the number of calls to alloc_pages and
friends went down from about 18% to under 2%.
On a 10G NIC the number of calls to alloc_pages and friends went down
from about 15% to 0 (perf did not capture any calls during the 60
second test).
On a 100G NIC the number of calls to alloc_pages and friends went down
from about 23% to 4%.

Reported-by: Íñigo Huguet <ihuguet@redhat.com>
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20220131111054.cp4f6foyinaarwbn@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-02-01 20:34:59 -08:00
Jiapeng Chong
0cf765fb00 sfc: Use swap() instead of open coding it
Clean the following coccicheck warning:

./drivers/net/ethernet/sfc/efx_channels.c:870:36-37: WARNING opportunity
for swap().

./drivers/net/ethernet/sfc/efx_channels.c:824:36-37: WARNING opportunity
for swap().

Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-06 12:39:36 +00:00
Jakub Kicinski
b9adba350a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-05 14:36:10 -08:00
Martin Habets
1d5a474240 sfc: The RX page_ring is optional
The RX page_ring is an optional feature that improves
performance. When allocation fails the driver can still
function, but possibly with a lower bandwidth.
Guard against dereferencing a NULL page_ring.

Fixes: 2768935a46 ("sfc: reuse pages to avoid DMA mapping/unmapping costs")
Signed-off-by: Martin Habets <habetsm.xilinx@gmail.com>
Reported-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Link: https://lore.kernel.org/r/164111288276.5798.10330502993729113868.stgit@palantir17.mph.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-01-04 18:14:21 -08:00
Minghao Chi
416b27439d ethernet/sfc: remove redundant rc variable
Return value from efx_mcdi_rpc() directly instead
of taking this in another redundant variable.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-01-04 12:41:41 +00:00
David S. Miller
e63a023489 Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2021-12-30

The following pull-request contains BPF updates for your *net-next* tree.

We've added 72 non-merge commits during the last 20 day(s) which contain
a total of 223 files changed, 3510 insertions(+), 1591 deletions(-).

The main changes are:

1) Automatic setrlimit in libbpf when bpf is memcg's in the kernel, from Andrii.

2) Beautify and de-verbose verifier logs, from Christy.

3) Composable verifier types, from Hao.

4) bpf_strncmp helper, from Hou.

5) bpf.h header dependency cleanup, from Jakub.

6) get_func_[arg|ret|arg_cnt] helpers, from Jiri.

7) Sleepable local storage, from KP.

8) Extend kfunc with PTR_TO_CTX, PTR_TO_MEM argument support, from Kumar.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-31 14:35:40 +00:00
Jakub Kicinski
b6459415b3 net: Don't include filter.h from net/sock.h
sock.h is pretty heavily used (5k objects rebuilt on x86 after
it's touched). We can drop the include of filter.h from it and
add a forward declaration of struct sk_filter instead.
This decreases the number of rebuilt objects when bpf.h
is touched from ~5k to ~1k.

There's a lot of missing includes this was masking. Primarily
in networking tho, this time.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Florian Fainelli <f.fainelli@gmail.com>
Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
Acked-by: Stefano Garzarella <sgarzare@redhat.com>
Link: https://lore.kernel.org/bpf/20211229004913.513372-1-kuba@kernel.org
2021-12-29 08:48:14 -08:00
Jakub Kicinski
8b3f913322 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/net/sock.h
  commit 8f905c0e73 ("inet: fully convert sk->sk_rx_dst to RCU rules")
  commit 43f51df417 ("net: move early demux fields close to sk_refcnt")
  https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-23 16:09:58 -08:00
Jiasheng Jiang
9b8bdd1eb5 sfc: falcon: Check null pointer of rx_queue->page_ring
Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220140344.978408-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 12:25:18 -08:00
Jiasheng Jiang
bdf1b5c388 sfc: Check null pointer of rx_queue->page_ring
Because of the possible failure of the kcalloc, it should be better to
set rx_queue->page_ptr_mask to 0 when it happens in order to maintain
the consistency.

Fixes: 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new sfc-falcon driver")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://lore.kernel.org/r/20211220135603.954944-1-jiasheng@iscas.ac.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-22 12:23:18 -08:00
Jakub Kicinski
7cd2802d74 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-12-16 16:13:19 -08:00
Jiasheng Jiang
407ecd1bd7 sfc_ef100: potential dereference of null pointer
The return value of kmalloc() needs to be checked.
To avoid use in efx_nic_update_stats() in case of the failure of alloc.

Fixes: b593b6f1b4 ("sfc_ef100: statistics gathering")
Signed-off-by: Jiasheng Jiang <jiasheng@iscas.ac.cn>
Reported-by: kernel test robot <lkp@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-16 10:55:32 +00:00
Hangbin Liu
9c9211a3fc net_tstamp: add new flag HWTSTAMP_FLAG_BONDED_PHC_INDEX
Since commit 94dd016ae5 ("bond: pass get_ts_info and SIOC[SG]HWTSTAMP
ioctl to active device") the user could get bond active interface's
PHC index directly. But when there is a failover, the bond active
interface will change, thus the PHC index is also changed. This may
break the user's program if they did not update the PHC timely.

This patch adds a new hwtstamp_config flag HWTSTAMP_FLAG_BONDED_PHC_INDEX.
When the user wants to get the bond active interface's PHC, they need to
add this flag and be aware the PHC index may be changed.

With the new flag. All flag checks in current drivers are removed. Only
the checking in net_hwtstamp_validate() is kept.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2021-12-14 12:28:24 +00:00