Commit Graph

2959 Commits

Author SHA1 Message Date
Jakub Kicinski
097f171f98 net: convert dev->rtnl_link_state to a bool
netdevice reg_state was split into two 16 bit enums back in 2010
in commit a2835763e1 ("rtnetlink: handle rtnl_link netlink
notifications manually"). Since the split the fields have been
moved apart, and last year we converted reg_state to a normal
u8 in commit 4d42b37def ("net: convert dev->reg_state to u8").

rtnl_link_state being a 16 bitfield makes no sense. Convert it
to a single bool, it seems very unlikely after 15 years that
we'll need more values in it.

We could drop dev->rtnl_link_ops from the conditions but feels
like having it there more clearly points at the reason for this
hack.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250410014246.780885-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-14 14:29:54 -07:00
Kuniyuki Iwashima
2a63dd0edf net: Retire DCCP socket.
DCCP was orphaned in 2021 by commit 054c4610bd ("MAINTAINERS: dccp:
move Gerrit Renker to CREDITS"), which noted that the last maintainer
had been inactive for five years.

In recent years, it has become a playground for syzbot, and most changes
to DCCP have been odd bug fixes triggered by syzbot.  Apart from that,
the only changes have been driven by treewide or networking API updates
or adjustments related to TCP.

Thus, in 2023, we announced we would remove DCCP in 2025 via commit
b144fcaf46 ("dccp: Print deprecation notice.").

Since then, only one individual has contacted the netdev mailing list. [0]

There is ongoing research for Multipath DCCP.  The repository is hosted
on GitHub [1], and development is not taking place through the upstream
community.  While the repository is published under the GPLv2 license,
the scheduling part remains proprietary, with a LICENSE file [2] stating:

  "This is not Open Source software."

The researcher mentioned a plan to address the licensing issue, upstream
the patches, and step up as a maintainer, but there has been no further
communication since then.

Maintaining DCCP for a decade without any real users has become a burden.

Therefore, it's time to remove it.

Removing DCCP will also provide significant benefits to TCP.  It allows
us to freely reorganize the layout of struct inet_connection_sock, which
is currently shared with DCCP, and optimize it to reduce the number of
cachelines accessed in the TCP fast path.

Note that we keep DCCP netfilter modules as requested.  [3]

Link: https://lore.kernel.org/netdev/20230710182253.81446-1-kuniyu@amazon.com/T/#u #[0]
Link: https://github.com/telekom/mp-dccp #[1]
Link: https://github.com/telekom/mp-dccp/blob/mpdccp_v03_k5.10/net/dccp/non_gpl_scheduler/LICENSE #[2]
Link: https://lore.kernel.org/netdev/Z_VQ0KlCRkqYWXa-@calendula/ #[3]
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Paul Moore <paul@paul-moore.com> (LSM and SELinux)
Acked-by: Casey Schaufler <casey@schaufler-ca.com>
Link: https://patch.msgid.link/20250410023921.11307-3-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-11 18:58:10 -07:00
Jiayuan Chen
c449d5f3a3 tcp: add LINUX_MIB_PAWS_TW_REJECTED counter
When TCP is in TIME_WAIT state, PAWS verification uses
LINUX_PAWSESTABREJECTED, which is ambiguous and cannot be distinguished
from other PAWS verification processes.

We added a new counter, like the existing PAWS_OLD_ACK one.

Also we update the doc with previously missing PAWS_OLD_ACK.

usage:
'''
nstat -az | grep PAWSTimewait
TcpExtPAWSTimewait              1                  0.0
'''

Suggested-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250409112614.16153-3-jiayuan.chen@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 18:29:26 -07:00
Jakub Kicinski
cb7103298d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.15-rc2).

Conflict:

Documentation/networking/netdevices.rst
net/core/lock_debug.c
  04efcee6ef ("net: hold instance lock during NETDEV_CHANGE")
  03df156dd3 ("xdp: double protect netdev->xdp_flags with netdev->lock")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-10 16:51:07 -07:00
Jakub Kicinski
ce7b149474 netdev: depend on netdev->lock for qstats in ops locked drivers
We mostly needed rtnl_lock in qstat to make sure the queue count
is stable while we work. For "ops locked" drivers the instance
lock protects the queue count, so we don't have to take rtnl_lock.

For currently ops-locked drivers: netdevsim and bnxt need
the protection from netdev going down while we dump, which
instance lock provides. gve doesn't care.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250408195956.412733-9-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-09 17:01:52 -07:00
Jakub Kicinski
87eba404f2 docs: netdev: break down the instance locking info per ops struct
Explicitly list all the ops structs and what locking they provide.
Use "ops locked" as a term for drivers which have ops called under
the instance lock.

Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Reviewed-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20250408195956.412733-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-09 17:01:52 -07:00
Jakub Kicinski
03df156dd3 xdp: double protect netdev->xdp_flags with netdev->lock
Protect xdp_features with netdev->lock. This way pure readers
no longer have to take rtnl_lock to access the field.

This includes calling NETDEV_XDP_FEAT_CHANGE under the lock.
Looks like that's fine for bonding, the only "real" listener,
it's the same as ethtool feature change.

In terms of normal drivers - only GVE need special consideration
(other drivers don't use instance lock or don't support XDP).
It calls xdp_set_features_flag() helper from gve_init_priv() which
in turn is called from gve_reset_recovery() (locked), or prior
to netdev registration. So switch to _locked.

Reviewed-by: Joe Damato <jdamato@fastly.com>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Harshitha Ramamurthy <hramamurthy@google.com>
Acked-by: Martin KaFai Lau <martin.lau@kernel.org>
Link: https://patch.msgid.link/20250408195956.412733-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-09 17:01:52 -07:00
Stanislav Fomichev
04efcee6ef net: hold instance lock during NETDEV_CHANGE
Cosmin reports an issue with ipv6_add_dev being called from
NETDEV_CHANGE notifier:

[ 3455.008776]  ? ipv6_add_dev+0x370/0x620
[ 3455.010097]  ipv6_find_idev+0x96/0xe0
[ 3455.010725]  addrconf_add_dev+0x1e/0xa0
[ 3455.011382]  addrconf_init_auto_addrs+0xb0/0x720
[ 3455.013537]  addrconf_notify+0x35f/0x8d0
[ 3455.014214]  notifier_call_chain+0x38/0xf0
[ 3455.014903]  netdev_state_change+0x65/0x90
[ 3455.015586]  linkwatch_do_dev+0x5a/0x70
[ 3455.016238]  rtnl_getlink+0x241/0x3e0
[ 3455.019046]  rtnetlink_rcv_msg+0x177/0x5e0

Similarly, linkwatch might get to ipv6_add_dev without ops lock:
[ 3456.656261]  ? ipv6_add_dev+0x370/0x620
[ 3456.660039]  ipv6_find_idev+0x96/0xe0
[ 3456.660445]  addrconf_add_dev+0x1e/0xa0
[ 3456.660861]  addrconf_init_auto_addrs+0xb0/0x720
[ 3456.661803]  addrconf_notify+0x35f/0x8d0
[ 3456.662236]  notifier_call_chain+0x38/0xf0
[ 3456.662676]  netdev_state_change+0x65/0x90
[ 3456.663112]  linkwatch_do_dev+0x5a/0x70
[ 3456.663529]  __linkwatch_run_queue+0xeb/0x200
[ 3456.663990]  linkwatch_event+0x21/0x30
[ 3456.664399]  process_one_work+0x211/0x610
[ 3456.664828]  worker_thread+0x1cc/0x380
[ 3456.665691]  kthread+0xf4/0x210

Reclassify NETDEV_CHANGE as a notifier that consistently runs under the
instance lock.

Link: https://lore.kernel.org/netdev/aac073de8beec3e531c86c101b274d434741c28e.camel@nvidia.com/
Reported-by: Cosmin Ratiu <cratiu@nvidia.com>
Tested-by: Cosmin Ratiu <cratiu@nvidia.com>
Fixes: ad7c7b2172 ("net: hold netdev instance lock during sysfs operations")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250404161122.3907628-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-07 11:13:39 -07:00
Stanislav Fomichev
ee705fa21f docs: net: document netdev notifier expectations
We don't have a consistent state yet, but document where we think
we are and where we wanna be.

Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250401163452.622454-8-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-04-03 15:32:08 -07:00
Linus Torvalds
ca0b04ba0b for-6.15/io_uring-rx-zc-20250325
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAmfjTP8QHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpm6oEACnpGL52FAKTVj14GDqFo6Pq0Jmnh07x8qj
 mpHFPwxfWAzRiuNyji2iS9ecS2cnlkixNyMWZipXRi4KJAUjJH6YDd7IofUI3Glf
 6v7b6srFSvsWJIJ8LdkJHLHAJuzYnJvFZ8apwgQczEDqgHq7BAunM1sVQ+mydjYk
 EXT4kN6DSBOPzwr9GAay52f8nXhbqdHfT+YTGHPHg+QToojL6gD7vvW57w/QqD/x
 91hJef1z01cSIsDZOxA0EUeD+9bBAHpoamr/e3IOOCVYCN6hy0dGa9g0QGbbpVyE
 AeU4FGZLV9J8OOfvHVraDt5Wn3IXxYaMu22dSn1S6tVinwjXhJR2LAA+t4fGHAkt
 i38LjOsIbopSQn/cNhzwC8UZcHLqnVsdDolHlHzsVFVfcpck2/4JFpUeP8QhWgrk
 f9tY12QUf/oEaWm0/sUCHZNFxpIGeFA5FFXf0Z92clnzBuiuWoesBNvxqY/2DeZn
 IDNXiv+Trxr6kFEjTpzPeuxbWrn4PJ7afQSAFcEmOCguk5riM+zJZNIKg0TxUHSS
 tt6sfxmTP1DhgDKad5kT3MLyzOcx47Kbjf4dj6KmRnD+3DGwwN2F7X7R1GJylPSp
 RLOzJ+Ouuy9UmBN6JMsT4BmR9+FJTVirADU926d/ZqCTtRV8Tnq/6HHmKmmr4CR0
 THJ0PJqQjg==
 =MOve
 -----END PGP SIGNATURE-----

Merge tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linux

Pull io_uring zero-copy receive support from Jens Axboe:
 "This adds support for zero-copy receive with io_uring, enabling fast
  bulk receive of data directly into application memory, rather than
  needing to copy the data out of kernel memory.

  While this version only supports host memory as that was the initial
  target, other memory types are planned as well, with notably GPU
  memory coming next.

  This work depends on some networking components which were queued up
  on the networking side, but have now landed in your tree.

  This is the work of Pavel Begunkov and David Wei. From the v14 posting:

    'We configure a page pool that a driver uses to fill a hw rx queue
     to hand out user pages instead of kernel pages. Any data that ends
     up hitting this hw rx queue will thus be dma'd into userspace
     memory directly, without needing to be bounced through kernel
     memory. 'Reading' data out of a socket instead becomes a
     _notification_ mechanism, where the kernel tells userspace where
     the data is. The overall approach is similar to the devmem TCP
     proposal

     This relies on hw header/data split, flow steering and RSS to
     ensure packet headers remain in kernel memory and only desired
     flows hit a hw rx queue configured for zero copy. Configuring this
     is outside of the scope of this patchset.

     We share netdev core infra with devmem TCP. The main difference is
     that io_uring is used for the uAPI and the lifetime of all objects
     are bound to an io_uring instance. Data is 'read' using a new
     io_uring request type. When done, data is returned via a new shared
     refill queue. A zero copy page pool refills a hw rx queue from this
     refill queue directly. Of course, the lifetime of these data
     buffers are managed by io_uring rather than the networking stack,
     with different refcounting rules.

     This patchset is the first step adding basic zero copy support. We
     will extend this iteratively with new features e.g. dynamically
     allocated zero copy areas, THP support, dmabuf support, improved
     copy fallback, general optimisations and more'

  In a local setup, I was able to saturate a 200G link with a single CPU
  core, and at netdev conf 0x19 earlier this month, Jamal reported
  188Gbit of bandwidth using a single core (no HT, including soft-irq).

  Safe to say the efficiency is there, as bigger links would be needed
  to find the per-core limit, and it's considerably more efficient and
  faster than the existing devmem solution"

* tag 'for-6.15/io_uring-rx-zc-20250325' of git://git.kernel.dk/linux:
  io_uring/zcrx: add selftest case for recvzc with read limit
  io_uring/zcrx: add a read limit to recvzc requests
  io_uring: add missing IORING_MAP_OFF_ZCRX_REGION in io_uring_mmap
  io_uring: Rename KConfig to Kconfig
  io_uring/zcrx: fix leaks on failed registration
  io_uring/zcrx: recheck ifq on shutdown
  io_uring/zcrx: add selftest
  net: add documentation for io_uring zcrx
  io_uring/zcrx: add copy fallback
  io_uring/zcrx: throttle receive requests
  io_uring/zcrx: set pp memory provider for an rx queue
  io_uring/zcrx: add io_recvzc request
  io_uring/zcrx: dma-map area for the device
  io_uring/zcrx: implement zerocopy receive pp memory provider
  io_uring/zcrx: grab a net device
  io_uring/zcrx: add io_zcrx_area
  io_uring/zcrx: add interface queue and refill queue
2025-03-28 13:45:52 -07:00
Linus Torvalds
7b667acd69 powerpc updates for 6.15
- Removal of support for IBM Cell Blades
 
  - SMP support for microwatt platform
 
  - Support for inline static calls on PPC32
 
  - Enable pmu selftests for power11 platform
 
  - Enable hardware trace macro (HTM) hcall support
 
  - Support for limited address mode capability
 
  - Changes to RMA size from 512 MB to 768 MB to handle fadump
 
  - Misc fixes and cleanups
 
 Thanks to: Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
 Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav
 Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar,
 Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy,
 Segher Boessenkool, Sourabh Jain, Vaibhav Jain, Venkat Rao Bagalkote.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEqX2DNAOgU8sBX3pRpnEsdPSHZJQFAmfjZnYACgkQpnEsdPSH
 ZJTmKg/+NGW7wyFY9d8Iai9ncYY7GSzsMSDTaan7qg0QWOd5gjHbsdbava7TM/DW
 8p9XsC+17kSeftRNUjtc52bSN8Ei2gBdsXIagQG1alfB2X2e6wkNauifK+dz3Su6
 usMEZZTO5R/jFDotFXNM1nsUj+8dvjnPgOUrji/P8k7PT5295wpza0hz1fy5SrOA
 hM5cliBP36UgFe5Efvgm4OUX2gQIhbc3stt9MVfymW/k0Mit5f41UIPuVGiTWowY
 s0cUJGkhxUlGXT3VfOVKuZfn4u9KMha7UCl9afSceJzXOdnUIKIbskui1VEv6cD/
 iSIxi839uErAobFHlsLYprgYFciYLII3xe2qNZCA/ZxeIMS/Mm6xokESeWLhBnfa
 P7ke6l0z3GDtTvgI2eSeU9BdrVveF1NgbP9GYSKgT6gtw/kRRnxgHF8tzmLON5PT
 KXpQlzz8VuSBRtF2jnLFU89+FFwSA1bRUhDrp89HyYFqw1B5g4N7kFFTUJWHOuKS
 fwPGy+cveKehmCUBedeTRFqHvvqdwpD/WnPlQzCly3WxqdL8U/eTXYftMiAwuK28
 ovLuSs3vRThKRQ8DnUa5oB0UGsjMpRV5LdvYkhw+x8mZKUR59oj4fx2ae4TtPakg
 dbAYuPPkCORdaSga/nV6vQgsLprFpcGX3dq6E19+BVBAY5D+1PE=
 =GFcj
 -----END PGP SIGNATURE-----

Merge tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Madhavan Srinivasan:

 - Remove support for IBM Cell Blades

 - SMP support for microwatt platform

 - Support for inline static calls on PPC32

 - Enable pmu selftests for power11 platform

 - Enable hardware trace macro (HTM) hcall support

 - Support for limited address mode capability

 - Changes to RMA size from 512 MB to 768 MB to handle fadump

 - Misc fixes and cleanups

Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann,
Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom,
Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook,
Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani
(IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav
Jain, and Venkat Rao Bagalkote.

* tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits)
  powerpc/kexec: fix physical address calculation in clear_utlb_entry()
  crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD
  powerpc: Fix 'intra_function_call not a direct call' warning
  powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'
  KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests
  powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7
  powerpc/microwatt: Add SMP support
  powerpc: Define config option for processors with broadcast TLBIE
  powerpc/microwatt: Define an idle power-save function
  powerpc/microwatt: Device-tree updates
  powerpc/microwatt: Select COMMON_CLK in order to get the clock framework
  net: toshiba: Remove reference to PPC_IBM_CELL_BLADE
  net: spider_net: Remove powerpc Cell driver
  cpufreq: ppc_cbe: Remove powerpc Cell driver
  genirq: Remove IRQ_EDGE_EOI_HANDLER
  docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR
  powerpc: Remove UDBG_RTAS_CONSOLE
  powerpc/io: Use standard barrier macros in io.c
  powerpc/io: Rename _insw_ns() etc.
  powerpc/io: Use generic raw accessors
  ...
2025-03-27 19:39:08 -07:00
Linus Torvalds
1a9239bb42 Networking changes for 6.15.
Core & protocols
 ----------------
 
  - Continue Netlink conversions to per-namespace RTNL lock
    (IPv4 routing, routing rules, routing next hops, ARP ioctls).
 
  - Continue extending the use of netdev instance locks. As a driver
    opt-in protect queue operations and (in due course) ethtool
    operations with the instance lock and not RTNL lock.
 
  - Support collecting TCP timestamps (data submitted, sent, acked)
    in BPF, allowing for transparent (to the application) and lower
    overhead tracking of TCP RPC performance.
 
  - Tweak existing networking Rx zero-copy infra to support zero-copy
    Rx via io_uring.
 
  - Optimize MPTCP performance in single subflow mode by 29%.
 
  - Enable GRO on packets which went thru XDP CPU redirect (were queued
    for processing on a different CPU). Improving TCP stream performance
    up to 2x.
 
  - Improve performance of contended connect() by 200% by searching
    for an available 4-tuple under RCU rather than a spin lock.
    Bring an additional 229% improvement by tweaking hash distribution.
 
  - Avoid unconditionally touching sk_tsflags on RX, improving
    performance under UDP flood by as much as 10%.
 
  - Avoid skb_clone() dance in ping_rcv() to improve performance under
    ping flood.
 
  - Avoid FIB lookup in netfilter if socket is available, 20% perf win.
 
  - Rework network device creation (in-kernel) API to more clearly
    identify network namespaces and their roles.
    There are up to 4 namespace roles but we used to have just 2 netns
    pointer arguments, interpreted differently based on context.
 
  - Use sysfs_break_active_protection() instead of trylock to avoid
    deadlocks between unregistering objects and sysfs access.
 
  - Add a new sysctl and sockopt for capping max retransmit timeout
    in TCP.
 
  - Support masking port and DSCP in routing rule matches.
 
  - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.
 
  - Support specifying at what time packet should be sent on AF_XDP
    sockets.
 
  - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin users.
 
  - Add Netlink YAML spec for WiFi (nl80211) and conntrack.
 
  - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
    which only need to be exported when IPv6 support is built as a module.
 
  - Age FDB entries based on Rx not Tx traffic in VxLAN, similar
    to normal bridging.
 
  - Allow users to specify source port range for GENEVE tunnels.
 
  - netconsole: allow attaching kernel release, CPU ID and task name
    to messages as metadata
 
 Driver API
 ----------
 
  - Continue rework / fixing of Energy Efficient Ethernet (EEE) across
    the SW layers. Delegate the responsibilities to phylink where possible.
    Improve its handling in phylib.
 
  - Support symmetric OR-XOR RSS hashing algorithm.
 
  - Support tracking and preserving IRQ affinity by NAPI itself.
 
  - Support loopback mode speed selection for interface selftests.
 
 Device drivers
 --------------
 
  - Remove the IBM LCS driver for s390.
 
  - Remove the sb1000 cable modem driver.
 
  - Add support for SFP module access over SMBus.
 
  - Add MCTP transport driver for MCTP-over-USB.
 
  - Enable XDP metadata support in multiple drivers.
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - add PCIe TLP Processing Hints (TPH) support for new AMD platforms
      - support dumping RoCE queue state for debug
      - opt into instance locking
    - Intel (100G, ice, idpf):
      - ice: rework MSI-X IRQ management and distribution
      - ice: support for E830 devices
      - iavf: add support for Rx timestamping
      - iavf: opt into instance locking
    - nVidia/Mellanox:
      - mlx4: use page pool memory allocator for Rx
      - mlx5: support for one PTP device per hardware clock
      - mlx5: support for 200Gbps per-lane link modes
      - mlx5: move IPSec policy check after decryption
    - AMD/Solarflare:
      - support FW flashing via devlink
    - Cisco (enic):
      - use page pool memory allocator for Rx
      - enable 32, 64 byte CQEs
      - get max rx/tx ring size from the device
    - Meta (fbnic):
      - support flow steering and RSS configuration
      - report queue stats
      - support TCP segmentation
      - support IRQ coalescing
      - support ring size configuration
    - Marvell/Cavium:
      - support AF_XDP
    - Wangxun:
      - support for PTP clock and timestamping
    - Huawei (hibmcge):
      - checksum offload
      - add more statistics
 
  - Ethernet virtual:
    - VirtIO net:
      - aggressively suppress Tx completions, improve perf by 96% with
        1 CPU and 55% with 2 CPUs
      - expose NAPI to IRQ mapping and persist NAPI settings
    - Google (gve):
      - support XDP in DQO RDA Queue Format
      - opt into instance locking
    - Microsoft vNIC:
      - support BIG TCP
 
  - Ethernet NICs consumer, and embedded:
    - Synopsys (stmmac):
      - cleanup Tx and Tx clock setting and other link-focused cleanups
      - enable SGMII and 2500BASEX mode switching for Intel platforms
      - support Sophgo SG2044
    - Broadcom switches (b53):
      - support for BCM53101
    - TI:
      - iep: add perout configuration support
      - icssg: support XDP
    - Cadence (macb):
      - implement BQL
    - Xilinx (axinet):
      - support dynamic IRQ moderation and changing coalescing at runtime
      - implement BQL
      - report standard stats
    - MediaTek:
      - support phylink managed EEE
    - Intel:
      - igc: don't restart the interface on every XDP program change
    - RealTek (r8169):
      - support reading registers of internal PHYs directly
      - increase max jumbo packet size on RTL8125/RTL8126
    - Airoha:
      - support for RISC-V NPU packet processing unit
      - enable scatter-gather and support MTU up to 9kB
    - Tehuti (tn40xx):
      - support cards with TN4010 MAC and an Aquantia AQR105 PHY
 
  - Ethernet PHYs:
    - support for TJA1102S, TJA1121
    - dp83tg720: add randomized polling intervals for link detection
    - dp83822: support changing the transmit amplitude voltage
    - support for LEDs on 88q2xxx
 
  - CAN:
    - canxl: support Remote Request Substitution bit access
    - flexcan: add S32G2/S32G3 SoC
 
  - WiFi:
    - remove cooked monitor support
    - strict mode for better AP testing
    - basic EPCS support
    - OMI RX bandwidth reduction support
    - batman-adv: add support for jumbo frames
 
  - WiFi drivers:
    - RealTek (rtw88):
      - support RTL8814AE and RTL8814AU
    - RealTek (rtw89):
      - switch using wiphy_lock and wiphy_work
      - add BB context to manipulate two PHY as preparation of MLO
      - improve BT-coexistence mechanism to play A2DP smoothly
    - Intel (iwlwifi):
      - add new iwlmld sub-driver for latest HW/FW combinations
    - MediaTek (mt76):
      - preparation for mt7996 Multi-Link Operation (MLO) support
    - Qualcomm/Atheros (ath12k):
      - continued work on MLO
    - Silabs (wfx):
      - Wake-on-WLAN support
 
  - Bluetooth:
    - add support for skb TX SND/COMPLETION timestamping
    - hci_core: enable buffer flow control for SCO/eSCO
    - coredump: log devcd dumps into the monitor
 
  - Bluetooth drivers:
    - intel: add support to configure TX power
    - nxp: handle bootloader error during cmd5 and cmd7
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmfkLC8ACgkQMUZtbf5S
 Irsb5g/+L7oKOf0ALbaV9kxFsoz8AymZfAW9i/27F07omGJGpks8oX6j6rQLgIRO
 OQOFcp7XEdDh1+jh82gHVuPrw2/6lchLtW8ARtzdiQKFr5DRjrsbtua6GRc8iBqA
 DIRCBFoV2HuMkF39Vr09HMa9AZAT7QR2RLsRGpSq8E8Z8xxKz0X7oujs10PFpMTE
 IVKhTrVrk+NDot/IU2hzVpnpup+0ld+T2/ZaBklJGcU8uDffImsqNepHRyCG5UC3
 xz74Ju23MAj24Gct+og0yFUooF+lUltKyVm0FYCDCY3bASTwgY01NR3kEH/0NQvM
 cywLzd/ngHm/SMD2ggVAHkjZUieiIVHdaZ53dgjDeBOQoVP6p0dgUK7EumXX8Mx4
 8ReR2UiGoYRPaq9c4o+IjG4K027MwVK2p+mF1a6MLa+20XcyMbev8FIRbbHtC/V4
 z5/FsOAxcuICWkA1hU9bODrrGzIqemmdRgKG8sGuTJCt/kYGAn72/TCATGNSaCJ0
 00n2jN1aepa7wtywHJ5MhVzxN9iQX7+geUHXz0BI+lK4e1Pmk+vjGksymb9ai2fk
 eQAUV9ekub6q68/J16scD7XeOUM37bTLiMBQeIF8UtZBOJscKiS71zn9QP9Twwxv
 P2pm01RDZUI+z5ZX3hc12Pm1vjRHaAh9S1JpAw/pTOVlQ+mAJEM=
 =XY0S
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Core & protocols:

   - Continue Netlink conversions to per-namespace RTNL lock
     (IPv4 routing, routing rules, routing next hops, ARP ioctls)

   - Continue extending the use of netdev instance locks. As a driver
     opt-in protect queue operations and (in due course) ethtool
     operations with the instance lock and not RTNL lock.

   - Support collecting TCP timestamps (data submitted, sent, acked) in
     BPF, allowing for transparent (to the application) and lower
     overhead tracking of TCP RPC performance.

   - Tweak existing networking Rx zero-copy infra to support zero-copy
     Rx via io_uring.

   - Optimize MPTCP performance in single subflow mode by 29%.

   - Enable GRO on packets which went thru XDP CPU redirect (were queued
     for processing on a different CPU). Improving TCP stream
     performance up to 2x.

   - Improve performance of contended connect() by 200% by searching for
     an available 4-tuple under RCU rather than a spin lock. Bring an
     additional 229% improvement by tweaking hash distribution.

   - Avoid unconditionally touching sk_tsflags on RX, improving
     performance under UDP flood by as much as 10%.

   - Avoid skb_clone() dance in ping_rcv() to improve performance under
     ping flood.

   - Avoid FIB lookup in netfilter if socket is available, 20% perf win.

   - Rework network device creation (in-kernel) API to more clearly
     identify network namespaces and their roles. There are up to 4
     namespace roles but we used to have just 2 netns pointer arguments,
     interpreted differently based on context.

   - Use sysfs_break_active_protection() instead of trylock to avoid
     deadlocks between unregistering objects and sysfs access.

   - Add a new sysctl and sockopt for capping max retransmit timeout in
     TCP.

   - Support masking port and DSCP in routing rule matches.

   - Support dumping IPv4 multicast addresses with RTM_GETMULTICAST.

   - Support specifying at what time packet should be sent on AF_XDP
     sockets.

   - Expose TCP ULP diagnostic info (for TLS and MPTCP) to non-admin
     users.

   - Add Netlink YAML spec for WiFi (nl80211) and conntrack.

   - Introduce EXPORT_IPV6_MOD() and EXPORT_IPV6_MOD_GPL() for symbols
     which only need to be exported when IPv6 support is built as a
     module.

   - Age FDB entries based on Rx not Tx traffic in VxLAN, similar to
     normal bridging.

   - Allow users to specify source port range for GENEVE tunnels.

   - netconsole: allow attaching kernel release, CPU ID and task name to
     messages as metadata

  Driver API:

   - Continue rework / fixing of Energy Efficient Ethernet (EEE) across
     the SW layers. Delegate the responsibilities to phylink where
     possible. Improve its handling in phylib.

   - Support symmetric OR-XOR RSS hashing algorithm.

   - Support tracking and preserving IRQ affinity by NAPI itself.

   - Support loopback mode speed selection for interface selftests.

  Device drivers:

   - Remove the IBM LCS driver for s390

   - Remove the sb1000 cable modem driver

   - Add support for SFP module access over SMBus

   - Add MCTP transport driver for MCTP-over-USB

   - Enable XDP metadata support in multiple drivers

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - add PCIe TLP Processing Hints (TPH) support for new AMD
           platforms
         - support dumping RoCE queue state for debug
         - opt into instance locking
      - Intel (100G, ice, idpf):
         - ice: rework MSI-X IRQ management and distribution
         - ice: support for E830 devices
         - iavf: add support for Rx timestamping
         - iavf: opt into instance locking
      - nVidia/Mellanox:
         - mlx4: use page pool memory allocator for Rx
         - mlx5: support for one PTP device per hardware clock
         - mlx5: support for 200Gbps per-lane link modes
         - mlx5: move IPSec policy check after decryption
      - AMD/Solarflare:
         - support FW flashing via devlink
      - Cisco (enic):
         - use page pool memory allocator for Rx
         - enable 32, 64 byte CQEs
         - get max rx/tx ring size from the device
      - Meta (fbnic):
         - support flow steering and RSS configuration
         - report queue stats
         - support TCP segmentation
         - support IRQ coalescing
         - support ring size configuration
      - Marvell/Cavium:
         - support AF_XDP
      - Wangxun:
         - support for PTP clock and timestamping
      - Huawei (hibmcge):
         - checksum offload
         - add more statistics

   - Ethernet virtual:
      - VirtIO net:
         - aggressively suppress Tx completions, improve perf by 96%
           with 1 CPU and 55% with 2 CPUs
         - expose NAPI to IRQ mapping and persist NAPI settings
      - Google (gve):
         - support XDP in DQO RDA Queue Format
         - opt into instance locking
      - Microsoft vNIC:
         - support BIG TCP

   - Ethernet NICs consumer, and embedded:
      - Synopsys (stmmac):
         - cleanup Tx and Tx clock setting and other link-focused
           cleanups
         - enable SGMII and 2500BASEX mode switching for Intel platforms
         - support Sophgo SG2044
      - Broadcom switches (b53):
         - support for BCM53101
      - TI:
         - iep: add perout configuration support
         - icssg: support XDP
      - Cadence (macb):
         - implement BQL
      - Xilinx (axinet):
         - support dynamic IRQ moderation and changing coalescing at
           runtime
         - implement BQL
         - report standard stats
      - MediaTek:
         - support phylink managed EEE
      - Intel:
         - igc: don't restart the interface on every XDP program change
      - RealTek (r8169):
         - support reading registers of internal PHYs directly
         - increase max jumbo packet size on RTL8125/RTL8126
      - Airoha:
         - support for RISC-V NPU packet processing unit
         - enable scatter-gather and support MTU up to 9kB
      - Tehuti (tn40xx):
         - support cards with TN4010 MAC and an Aquantia AQR105 PHY

   - Ethernet PHYs:
      - support for TJA1102S, TJA1121
      - dp83tg720: add randomized polling intervals for link detection
      - dp83822: support changing the transmit amplitude voltage
      - support for LEDs on 88q2xxx

   - CAN:
      - canxl: support Remote Request Substitution bit access
      - flexcan: add S32G2/S32G3 SoC

   - WiFi:
      - remove cooked monitor support
      - strict mode for better AP testing
      - basic EPCS support
      - OMI RX bandwidth reduction support
      - batman-adv: add support for jumbo frames

   - WiFi drivers:
      - RealTek (rtw88):
         - support RTL8814AE and RTL8814AU
      - RealTek (rtw89):
         - switch using wiphy_lock and wiphy_work
         - add BB context to manipulate two PHY as preparation of MLO
         - improve BT-coexistence mechanism to play A2DP smoothly
      - Intel (iwlwifi):
         - add new iwlmld sub-driver for latest HW/FW combinations
      - MediaTek (mt76):
         - preparation for mt7996 Multi-Link Operation (MLO) support
      - Qualcomm/Atheros (ath12k):
         - continued work on MLO
      - Silabs (wfx):
         - Wake-on-WLAN support

   - Bluetooth:
      - add support for skb TX SND/COMPLETION timestamping
      - hci_core: enable buffer flow control for SCO/eSCO
      - coredump: log devcd dumps into the monitor

   - Bluetooth drivers:
      - intel: add support to configure TX power
      - nxp: handle bootloader error during cmd5 and cmd7"

* tag 'net-next-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1681 commits)
  unix: fix up for "apparmor: add fine grained af_unix mediation"
  mctp: Fix incorrect tx flow invalidation condition in mctp-i2c
  net: usb: asix: ax88772: Increase phy_name size
  net: phy: Introduce PHY_ID_SIZE — minimum size for PHY ID string
  net: libwx: fix Tx L4 checksum
  net: libwx: fix Tx descriptor content for some tunnel packets
  atm: Fix NULL pointer dereference
  net: tn40xx: add pci-id of the aqr105-based Tehuti TN4010 cards
  net: tn40xx: prepare tn40xx driver to find phy of the TN9510 card
  net: tn40xx: create swnode for mdio and aqr105 phy and add to mdiobus
  net: phy: aquantia: add essential functions to aqr105 driver
  net: phy: aquantia: search for firmware-name in fwnode
  net: phy: aquantia: add probe function to aqr105 for firmware loading
  net: phy: Add swnode support to mdiobus_scan
  gve: add XDP DROP and PASS support for DQ
  gve: update XDP allocation path support RX buffer posting
  gve: merge packet buffer size fields
  gve: update GQ RX to use buf_size
  gve: introduce config-based allocation for XDP
  gve: remove xdp_xsk_done and xdp_xsk_wakeup statistics
  ...
2025-03-26 21:48:21 -07:00
Jakub Kicinski
4f74a45c6b bluetooth-next pull request for net-next:
core:
 
  - Add support for skb TX SND/COMPLETION timestamping
  - hci_core: Enable buffer flow control for SCO/eSCO
  - coredump: Log devcd dumps into the monitor
 
  drivers:
 
  - btusb: Add 2 HWIDs for MT7922
  - btusb: Fix regression in the initialization of fake Bluetooth controllers
  - btusb: Add 14 USB device IDs for Qualcomm WCN785x
  - btintel: Add support for Intel Scorpius Peak
  - btintel: Add support to configure TX power
  - btintel: Add DSBR support for ScP
  - btintel_pcie: Add device id of Whale Peak
  - btintel_pcie: Setup buffers for firmware traces
  - btintel_pcie: Read hardware exception data
  - btintel_pcie: Add support for device coredump
  - btintel_pcie: Trigger device coredump on hardware exception
  - btnxpuart: Support for controller wakeup gpio config
  - btnxpuart: Add support to set BD address
  - btnxpuart: Add correct bootloader error codes
  - btnxpuart: Handle bootloader error during cmd5 and cmd7
  - btnxpuart: Fix kernel panic during FW release
  - qca: add WCN3950 support
  - hci_qca: use the power sequencer for wcn6750
  - btmtksdio: Prevent enabling interrupts after IRQ handler removal
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmfjA3AZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKeWWD/0QZYBrNP9QSTyYTeNlYhCC
 Lw/n7n3+LhxOqIu+tOGS7UplTqR3p4WQGu2g+XV9wSu4dvLplZGn/40XtiXJA0+r
 VsXivQ4IR/Vjd8sNLLixfmuH4g4CbMSblQvECD3/5wFTeSH8T6/gyts/WV/LDYOJ
 jp5kG6HCFHMv7RaiaHdZ0Pe4c1xJ/Ek8TnrW4G/kPBaLzm+lhjRkfzxCx8cO0k0H
 mpoheUohLtUSgfLf49826t7rp3HDuX9db2hiGXQfKSrL2milwufKNaMFTmbuU2Uq
 IyAwR1CEdSsKlcpbnVNF05r94sjf8NjuBD3YWxB9OfVXq9aymJYZdGh54XT5nF3f
 ccD1WNnOoTsXDbEAnuVx+EYrJAI70e7vE4m8XbpxJcxhFoSIQCmtDoXUY199LiEG
 El7DlwouNFESmOIj6gSK7ogUEhmcQA0AJxBVpdxnGBAcQMn1hrGSb75VGg7rul8K
 HcBtf5j4gxZum2fzrufeY1+eBxfSRjNFIsnutAr53gEtidPZYlmiRyAuqQJMo2sO
 0hlpC6gQGQJ5HiTR5Jbo9u+OC1oeHHXNN5IpNrd6J65n0pqCRldDsoXCerEJsOou
 cqWzb9lfQCrjRnWRxUWOLukdYRgBH15wuORU+Z7/qVhqOr49RvARnzq7AjGSenTS
 YKb2N+NGlqpxG+vQscFJUA==
 =XCdn
 -----END PGP SIGNATURE-----

Merge tag 'for-net-next-2025-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next

Luiz Augusto von Dentz says:

====================
bluetooth-next pull request for net-next:

core:

 - Add support for skb TX SND/COMPLETION timestamping
 - hci_core: Enable buffer flow control for SCO/eSCO
 - coredump: Log devcd dumps into the monitor

 drivers:

 - btusb: Add 2 HWIDs for MT7922
 - btusb: Fix regression in the initialization of fake Bluetooth controllers
 - btusb: Add 14 USB device IDs for Qualcomm WCN785x
 - btintel: Add support for Intel Scorpius Peak
 - btintel: Add support to configure TX power
 - btintel: Add DSBR support for ScP
 - btintel_pcie: Add device id of Whale Peak
 - btintel_pcie: Setup buffers for firmware traces
 - btintel_pcie: Read hardware exception data
 - btintel_pcie: Add support for device coredump
 - btintel_pcie: Trigger device coredump on hardware exception
 - btnxpuart: Support for controller wakeup gpio config
 - btnxpuart: Add support to set BD address
 - btnxpuart: Add correct bootloader error codes
 - btnxpuart: Handle bootloader error during cmd5 and cmd7
 - btnxpuart: Fix kernel panic during FW release
 - qca: add WCN3950 support
 - hci_qca: use the power sequencer for wcn6750
 - btmtksdio: Prevent enabling interrupts after IRQ handler removal

* tag 'for-net-next-2025-03-25' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next: (53 commits)
  Bluetooth: MGMT: Add LL Privacy Setting
  Bluetooth: hci_event: Fix handling of HCI_EV_LE_DIRECT_ADV_REPORT
  Bluetooth: btnxpuart: Fix kernel panic during FW release
  Bluetooth: btnxpuart: Handle bootloader error during cmd5 and cmd7
  Bluetooth: btnxpuart: Add correct bootloader error codes
  t blameBluetooth: btintel: Fix leading white space
  Bluetooth: btintel: Add support to configure TX power
  Bluetooth: btmtksdio: Prevent enabling interrupts after IRQ handler removal
  Bluetooth: btmtk: Remove the resetting step before downloading the fw
  Bluetooth: SCO: add TX timestamping
  Bluetooth: L2CAP: add TX timestamping
  Bluetooth: ISO: add TX timestamping
  Bluetooth: add support for skb TX SND/COMPLETION timestamping
  net-timestamp: COMPLETION timestamp on packet tx completion
  HCI: coredump: Log devcd dumps into the monitor
  Bluetooth: HCI: Add definition of hci_rp_remote_name_req_cancel
  Bluetooth: hci_vhci: Mark Sync Flow Control as supported
  Bluetooth: hci_core: Enable buffer flow control for SCO/eSCO
  Bluetooth: btintel_pci: Fix build warning
  Bluetooth: btintel_pcie: Trigger device coredump on hardware exception
  ...
====================

Link: https://patch.msgid.link/20250325192925.2497890-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25 14:00:48 -07:00
Eric Dumazet
f1e30061e8 tcp/dccp: remove icsk->icsk_ack.timeout
icsk->icsk_ack.timeout can be replaced by icsk->csk_delack_timer.expires

This saves 8 bytes in TCP/DCCP sockets and helps for better cache locality.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250324203607.703850-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25 10:34:33 -07:00
Eric Dumazet
a7c428ee8f tcp/dccp: remove icsk->icsk_timeout
icsk->icsk_timeout can be replaced by icsk->icsk_retransmit_timer.expires

This saves 8 bytes in TCP/DCCP sockets and helps for better cache locality.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250324203607.703850-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25 10:34:33 -07:00
Pauli Virtanen
983e0e4e87 net-timestamp: COMPLETION timestamp on packet tx completion
Add SOF_TIMESTAMPING_TX_COMPLETION, for requesting a software timestamp
when hardware reports a packet completed.

Completion tstamp is useful for Bluetooth, as hardware timestamps do not
exist in the HCI specification except for ISO packets, and the hardware
has a queue where packets may wait.  In this case the software SND
timestamp only reflects the kernel-side part of the total latency
(usually small) and queue length (usually 0 unless HW buffers
congested), whereas the completion report time is more informative of
the true latency.

It may also be useful in other cases where HW TX timestamps cannot be
obtained and user wants to estimate an upper bound to when the TX
probably happened.

Signed-off-by: Pauli Virtanen <pav@iki.fi>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2025-03-25 12:48:05 -04:00
Jakub Kicinski
586b7b3ebb ipsec-next-2025-03-24
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmfg9oUACgkQrB3Eaf9P
 W7d8eRAAhUdQztUoVjNfRBScD34EHEBq8ruFMgbHcXkkdhg223CUaTYKqz3dYE1E
 04OCSwe7yxFFSs7CKR4dRMSzcx7uH/ISPprk45g+wwoIzBCYOWjLBzS5qmamtSYu
 E3sI/QZaVUU7mFDg5n3sr5nkCNz+LtTL2dUbOgi5AOqY9h7LZRzLBEB1RCVOioVO
 14vDRBJ5RGIvmNx3pd0sRhB4BySUw1GQVTYLSFaL/V2JZcRu3yGhKT8q637fdKcS
 blvdf5q4CYyu5i3p9LHhsg33D8fYUotBVFtzw3QaM1G9MdyhMF4WUbeZCa2kkk7J
 N7j2N0WDjP7crKQkPbm9ATQiruF2zvPcYJqgIHKuNhWY35KNIFARWxbP8KGJJFUI
 EqUUKuT0lUi39hULmm5YEqCGWLNQldTBOOZL1fccRj2DZelx4uhQsnt5B2yimUCD
 7QJGH0Vx6NJAutuCgTXQtmD0qO2XYi50XLmh4K+HZ1M3DkSGohcYSz+aj8IIY/wT
 +yT55uRoONqDbh9kOLHab0WWE6R+XJyY5rKnz6XpzyTPh0OwoxrbVXJF045NYIpB
 tSl9ykVpymvWNT56nsfXEOm5THZhdvL+uDS/QV4rtF9En4TEJU0dGbhESvriuu2j
 g6EAQpYJqL9OwA5aKVZCGlHf7siji7Z/uc57DADZ8mtmDXVMqco=
 =9RMg
 -----END PGP SIGNATURE-----

Merge tag 'ipsec-next-2025-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2025-03-24

1) Prevent setting high order sequence number bits input in
   non-ESN mode. From Leon Romanovsky.

2) Support PMTU handling in tunnel mode for packet offload.
   From Leon Romanovsky.

3) Make xfrm_state_lookup_byaddr lockless.
   From Florian Westphal.

4) Remove unnecessary NULL check in xfrm_lookup_with_ifid().
   From Dan Carpenter.

* tag 'ipsec-next-2025-03-24' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next:
  xfrm: Remove unnecessary NULL check in xfrm_lookup_with_ifid()
  xfrm: state: make xfrm_state_lookup_byaddr lockless
  xfrm: check for PMTU in tunnel mode for packet offload
  xfrm: provide common xdo_dev_offload_ok callback implementation
  xfrm: rely on XFRM offload
  xfrm: simplify SA initialization routine
  xfrm: delay initialization of offload path till its actually requested
  xfrm: prevent high SEQ input in non-ESN mode
====================

Link: https://patch.msgid.link/20250324061855.4116819-1-steffen.klassert@secunet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25 08:50:10 -07:00
Jason Xing
f38805c5d2 tcp: support TCP_RTO_MIN_US for set/getsockopt use
Support adjusting/reading RTO MIN for socket level by using set/getsockopt().

This new option has the same effect as TCP_BPF_RTO_MIN, which means it
doesn't affect RTAX_RTO_MIN usage (by using ip route...). Considering that
bpf option was implemented before this patch, so we need to use a standalone
new option for pure tcp set/getsockopt() use.

When the socket is created, its icsk_rto_min is set to the default
value that is controlled by sysctl_tcp_rto_min_us. Then if application
calls setsockopt() with TCP_RTO_MIN_US flag to pass a valid value, then
icsk_rto_min will be overridden in jiffies unit.

This patch adds WRITE_ONCE/READ_ONCE to avoid data-race around
icsk_rto_min.

Signed-off-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20250317120314.41404-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-25 04:27:19 -07:00
Linus Torvalds
f81c2b8150 It has been a reasonably busy cycle for docs...
- Significant changes throughout the tree to bring Python code up to
   current standards and raise the minimum Python required to 3.9.  Much of
   this is preparatory to replacing the ancient Perl scripts/kernel-doc
   horror with a slightly less horrifying Python implementation, expected
   for 6.16.
 
 - Update the minimum Sphinx required to 3.4.3, allowing us to remove a
   bunch of older compatibility code.
 
 - Rework and improve the generation of the ABI documentation.
 
   (All of the above done by Mauro)
 
 - Lots of translation updates.  Alex Shi and Yanteng Si are taking on
   responsibility for the Chinese translations going forward; that work will
   still get to you via docs-next
 
 - Try to standardize the format for indicating a developer's affiliation in
   commit tags.
 
 - Clarify the TAB's role in CoC enforcement actions.
 
 - Try to spell out the rules for when a commit tag can name another
   developer without their explicit permission.
 
 Plus lots of other typo fixes and updates.
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEIw+MvkEiF49krdp9F0NaE2wMflgFAmfccxwPHGNvcmJldEBs
 d24ubmV0AAoJEBdDWhNsDH5YoYcH/jL/nS8YAiJ3awF5PH5tR3m5ddt9l+fKXWJx
 PB3KcHtDORbWltTA+Tvo2aP1jxGY9wqsIIvl+nvjJyUcfd72g4HNfTDUDXwP3OFU
 wTkaEAQp3n/hqnLXtJ2AzV3Ir5cIfEL2d7F6QsN1Gnof8iu2OuMk5iMeb0iexUX6
 FYjJq+jknh30VdAp2hxHy8q17R7h7PySh5OsjeAYJJroLv60n3DwQgnzHjXC/FT2
 Qq1UuEzlSpRoso2o2NwVTND6OVW081umo6YrioqD7ZC2G2fhRgLFJJtJGXDNcyUl
 gQv9xLSaTD97V4zaWPm28ObNBpY/GnAd4hMjB17wAH5xUfVS5Aw=
 =Gvdp
 -----END PGP SIGNATURE-----

Merge tag 'docs-6.15' of git://git.lwn.net/linux

Pull documentation updates from Jonathan Corbet:
 "It has been a reasonably busy cycle for docs...

   - Significant changes throughout the tree to bring Python code up to
     current standards and raise the minimum Python required to 3.9

     Much of this is preparatory to replacing the ancient Perl
     scripts/kernel-doc horror with a slightly less horrifying Python
     implementation, expected for 6.16

   - Update the minimum Sphinx required to 3.4.3, allowing us to remove
     a bunch of older compatibility code

   - Rework and improve the generation of the ABI documentation

  (All of the above done by Mauro)

   - Lots of translation updates. Alex Shi and Yanteng Si are taking on
     responsibility for the Chinese translations going forward; that
     work will still get to you via docs-next

   - Try to standardize the format for indicating a developer's
     affiliation in commit tags

   - Clarify the TAB's role in CoC enforcement actions

   - Try to spell out the rules for when a commit tag can name another
     developer without their explicit permission

  Plus lots of other typo fixes and updates"

* tag 'docs-6.15' of git://git.lwn.net/linux: (98 commits)
  docs/zh_CN: fix spelling mistake
  docs/Chinese: change the disclaimer words
  docs/zh_CN: Add snp-tdx-threat-model index Chinese translation
  docs: driver-api: firmware: clarify userspace requirements
  docs: clarify rules wrt tagging other people
  docs: Remove outdated highuid.rst documentation
  Documentation: dma-buf: heaps: Add heap name definitions
  docs/.../submit-checklist: Use Documentation/admin-guide/abi.rst for cross-ref of README
  docs: Correct installation instruction
  Documentation: kcsan: fix "Plain Accesses and Data Races" URL in kcsan.rst
  Documentation/CoC: Spell out the TAB role in enforcement decisions
  Documentation: ocxl.rst: Update consortium site
  scripts: get_feat.pl: substitute s390x with s390
  scripts/kernel-doc: drop dead code for Wcontents_before_sections
  scripts/kernel-doc: don't add not needed new lines
  docs: driver-api/infiniband.rst: fix Kerneldoc markup
  drivers: firewire: firewire-cdev.h: fix identation on a kernel-doc markup
  drivers: media: intel-ipu3.h: fix identation on a kernel-doc markup
  include/asm-generic/io.h: fix kerneldoc markup
  Docs/arch/arm64: Fix spelling in amu.rst
  ...
2025-03-24 18:42:27 -07:00
Ryohei Kinugawa
4f34c2b779 docs/kcm: Fix typo "BFP"
'BFP' should be 'BPF'.

Signed-off-by: Ryohei Kinugawa <ryohei.kinugawa@gmail.com>
Link: https://patch.msgid.link/20250318095154.4187952-1-ryohei.kinugawa@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-24 12:20:42 -07:00
WangYuli
6d1929475e docs: networking: strparser: Fix a typo
The context indicates that 'than' is the correct word instead of 'then',
as a comparison is being performed.

Given that 'then' is also a valid English word, checkpatch.pl wouldn't
have picked up on this spelling error.

This typo was caught by AI during code review.

Suggested-by: Wentao Guan <guanwentao@uniontech.com>
Signed-off-by: WangYuli <wangyuli@uniontech.com>
Reviewed-by: Yanteng Si <si.yanteng@linux.dev>
Link: https://patch.msgid.link/A43BEA49ED5CC6E5+20250318074656.644391-1-wangyuli@uniontech.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-24 12:14:16 -07:00
Yui Washizu
f8e1bcec62 docs: fix the path of example code and example commands for device memory TCP
This updates the old path and fixes the description of unavailable options.

Signed-off-by: Yui Washizu <yui.washidu@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20250318061251.775191-1-yui.washidu@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-24 12:12:09 -07:00
Eric Dumazet
1937a0be28 tcp: move icsk_clean_acked to a better location
As a followup of my presentation in Zagreb for netdev 0x19:

icsk_clean_acked is only used by TCP when/if CONFIG_TLS_DEVICE
is enabled from tcp_ack().

Rename it to tcp_clean_acked, move it to tcp_sock structure
in the tcp_sock_read_rx for better cache locality in TCP
fast path.

Define this field only when CONFIG_TLS_DEVICE is enabled
saving 8 bytes on configs not using it.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250317085313.2023214-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-24 09:55:18 -07:00
Yael Chemla
c3b999cad7 net/mlx5e: Expose port reset cycle recovery counter via ethtool
Display recovery event of PPCNT recovery counters group. Counts (per
link) the number of total successful recovery events of any recovery
types during port reset cycle.

Signed-off-by: Yael Chemla <ychemla@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/1742112876-2890-5-git-send-email-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21 21:23:47 +01:00
Breno Leitao
56ad890de2 docs: netconsole: document release feature
Add documentation explaining the kernel release auto-population feature
in netconsole.

This feature appends kernel version information to the userdata
dictionary in every message sent when enabled via the `release_enabled`
file in the configfs hierarchy.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250314-netcons_release-v1-6-07979c4b86af@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21 18:59:25 +01:00
Arnd Bergmann
3fed9fda15 net: remove sb1000 cable modem driver
This one is hilariously outdated, it provided a faster downlink over
TV cable for users of analog modems in the 1990s, through an ISA card.

The web page for the userspace tools has been broken for 25 years, and
the driver has only ever seen mechanical updates.

Link: http://web.archive.org/web/20000611165545/http://home.adelphia.net:80/~siglercm/sb1000.html
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250312085236.2531870-1-arnd@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-21 17:11:54 +01:00
Geliang Tang
fa3ee9dd80 mptcp: sysctl: add available_path_managers
Similarly to net.mptcp.available_schedulers, this patch adds a new one
net.mptcp.available_path_managers to list the available path managers.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-11-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 10:14:49 +01:00
Geliang Tang
595c26d122 mptcp: sysctl: set path manager by name
Similar to net.mptcp.scheduler, a new net.mptcp.path_manager sysctl knob
is added to determine which path manager will be used by each newly
created MPTCP socket by setting the name of it.

Dealing with an explicit name is easier than with a number, especially
when more PMs will be introduced.

This sysctl knob makes the old one "pm_type" deprecated.

Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250313-net-next-mptcp-pm-ops-intro-v1-8-f4e4a88efc50@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-20 10:14:48 +01:00
Uday Shankar
f8a10bed32 netconsole: allow selection of egress interface via MAC address
Currently, netconsole has two methods of configuration - module
parameter and configfs. The former interface allows for netconsole
activation earlier during boot (by specifying the module parameter on
the kernel command line), so it is preferred for debugging issues which
arise before userspace is up/the configfs interface can be used. The
module parameter syntax requires specifying the egress interface name.
This requirement makes it hard to use for a couple reasons:
- The egress interface name can be hard or impossible to predict. For
  example, installing a new network card in a system can change the
  interface names assigned by the kernel.
- When constructing the module parameter, one may have trouble
  determining the original (kernel-assigned) name of the interface
  (which is the name that should be given to netconsole) if some stable
  interface naming scheme is in effect. A human can usually look at
  kernel logs to determine the original name, but this is very painful
  if automation is constructing the parameter.

For these reasons, allow selection of the egress interface via MAC
address when configuring netconsole using the module parameter. Update
the netconsole documentation with an example of the new syntax.
Selection of egress interface by MAC address via configfs is far less
interesting (since when this interface can be used, one should be able
to easily convert between MAC address and interface name), so it is left
unimplemented.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Tested-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250312-netconsole-v6-2-3437933e79b8@purestorage.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-19 19:17:58 +01:00
Paolo Abeni
ed6bcbe39e This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - drop batadv_priv_debug_log struct, by Sven Eckelmann
 
  - adopt netdev_hold() / netdev_put(), by Eric Dumazet
 
  - add support for jumbo frames, by Sven Eckelmann
 
  - use consistent name for mesh interface, by Sven Eckelmann
 
  - cleanup B.A.T.M.A.N. IV OGM aggregation handling,
    by Sven Eckelmann (4 patches)
 
  - add missing newlines for log macros, by Sven Eckelmann
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmfTCE0WHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoea+EACQ3fJp7kf6iWxhJLiwon3Aoo4u
 dg84xX5nc02KHT/zUC/ADSSN3FDdc3y8jgjpLfZv5TkWmRUqMNP9449PaoXpiIsW
 oI6HKrVaEKIEPpMAaBrv0dPtG5fBZPku4tZBLP7ekU/2N0Ri3eKTCph7IznO3Ma5
 cM4FN6H9Nxwnht2qN7Gx3nWoBn9LeaSyKfAFLR+hI3QxTh9PYLV26rGCLrUhbbvT
 St8TtgdeULSH/50rhMqC4wpT8verk+BMJdShNPLyse4QSRky6DD6YOj7zS/c7akP
 z8S77gvp0mcgzb/t49K4AFjUFEvK9l0rvPDx6DMrfJwQAZUPjFY93y1XI5QUOPRZ
 +kVj0nr2XRc7c1/wDNfn4131T9Twi41+TX6BI8Fan8WpWKVTo2a/O5VWk5pHKjy+
 a8aZqL41444wWyFhyCL+Io1y1mDc9TDa7644+fJjR5H4h47M9rGnLVEX3F9mVtBl
 LIOwpHpHXg9z3JDgpCwLWykUzW4lEuF1hQFUD3DPIbVKENjvibiiB+qPNZJG1icL
 n9Pt6843vF8gcw3aGbp0VGsESILiKr7EN7CMkhfcO1CaqAErJG4DinubH6DLQIjO
 uW5Ss2751BfqWPOakWVzvvWafnleVRdw4rysbil4ko/xONg0qdfTMWe5fS42bkbk
 CMmGPNeSiqVTXp5bIg==
 =hmiI
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-pullrequest-20250313' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This feature/cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - drop batadv_priv_debug_log struct, by Sven Eckelmann

 - adopt netdev_hold() / netdev_put(), by Eric Dumazet

 - add support for jumbo frames, by Sven Eckelmann

 - use consistent name for mesh interface, by Sven Eckelmann

 - cleanup B.A.T.M.A.N. IV OGM aggregation handling,
   by Sven Eckelmann (4 patches)

 - add missing newlines for log macros, by Sven Eckelmann

* tag 'batadv-next-pullrequest-20250313' of git://git.open-mesh.org/linux-merge:
  batman-adv: add missing newlines for log macros
  batman-adv: Limit aggregation size to outgoing MTU
  batman-adv: Use actual packet count for aggregated packets
  batman-adv: Switch to bitmap helper for aggregation handling
  batman-adv: Limit number of aggregated packets directly
  batman-adv: Use consistent name for mesh interface
  batman-adv: Add support for jumbo frames
  batman-adv: adopt netdev_hold() / netdev_put()
  batman-adv: Drop batadv_priv_debug_log struct
  batman-adv: Start new development cycle
====================

Link: https://patch.msgid.link/20250313164519.72808-1-sw@simonwunderlich.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18 12:10:20 +01:00
Pavan Chebbi
2c4d376c3a bnxt_en: Add devlink support for ENABLE_ROCE nvm parameter
Add set/show support for the ENABLE_ROCE NVM parameter to
enable/disable RoCE for a PF.

Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Co-developed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20250310183129.3154117-4-michael.chan@broadcom.com
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-18 10:25:22 +01:00
Stanislav Fomichev
0a13c1e0a4 net: revert to lockless TC_SETUP_BLOCK and TC_SETUP_FT
There is a couple of places from which we can arrive to ndo_setup_tc
with TC_SETUP_BLOCK/TC_SETUP_FT:
- netlink
- netlink notifier
- netdev notifier

Locking netdev too deep in this call chain seems to be problematic
(especially assuming some/all of the call_netdevice_notifiers
NETDEV_UNREGISTER) might soon be running with the instance lock).
Revert to lockless ndo_setup_tc for TC_SETUP_BLOCK/TC_SETUP_FT. NFT
framework already takes care of most of the locking. Document
the assumptions.

ndo_setup_tc TC_SETUP_BLOCK
  nft_block_offload_cmd
    nft_chain_offload_cmd
      nft_flow_block_chain
        nft_flow_offload_chain
	  nft_flow_rule_offload_abort
	    nft_flow_rule_offload_commit
	  nft_flow_rule_offload_commit
	    nf_tables_commit
	      nfnetlink_rcv_batch
	        nfnetlink_rcv_skb_batch
		  nfnetlink_rcv
	nft_offload_netdev_event
	  NETDEV_UNREGISTER notifier

ndo_setup_tc TC_SETUP_FT
  nf_flow_table_offload_cmd
    nf_flow_table_offload_setup
      nft_unregister_flowtable_hook
        nft_register_flowtable_net_hooks
	  nft_flowtable_update
	  nf_tables_newflowtable
	    nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
	nft_flowtable_update
	  nf_tables_newflowtable
	nft_flowtable_event
	  nf_tables_flowtable_event
	    NETDEV_UNREGISTER notifier
      __nft_unregister_flowtable_net_hooks
        nft_unregister_flowtable_net_hooks
	  nf_tables_commit
	    nfnetlink_rcv_batch (.call NFNL_CB_BATCH)
	  __nf_tables_abort
	    nf_tables_abort
	      nfnetlink_rcv_batch
	__nft_release_hook
	  __nft_release_hooks
	    nf_tables_pre_exit_net -> module unload
	  nft_rcv_nl_event
	    netlink_register_notifier (oh boy)
      nft_register_flowtable_net_hooks
      	nft_flowtable_update
	  nf_tables_newflowtable
        nf_tables_newflowtable

Fixes: c4f0f30b42 ("net: hold netdev instance lock during nft ndo_setup_tc")
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Reported-by: syzbot+0afb4bcf91e5a1afdcad@syzkaller.appspotmail.com
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250308044726.1193222-1-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-12 13:02:00 -07:00
Stanislav Fomichev
cc34acd577 docs: net: document new locking reality
Also clarify ndo_get_stats (that read and write paths can run
concurrently) and mention only RCU.

Cc: Saeed Mahameed <saeed@kernel.org>
Signed-off-by: Stanislav Fomichev <sdf@fomichev.me>
Link: https://patch.msgid.link/20250305163732.2766420-14-sdf@fomichev.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-03-06 12:59:45 -08:00
Breno Leitao
7010b61983 netconsole: docs: document the task name feature
Add documentation for the netconsole task name feature in
Documentation/networking/netconsole.rst. This explains how to enable
task name via configfs and demonstrates the output format.

The documentation includes:
- How to enable/disable the feature via taskname_enabled
- The format of the task name in the output
- An example showing the task name appearing in messages

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 15:28:28 +01:00
Nicolas Dichtel
0c493da863 net: rename netns_local to netns_immutable
The name 'netns_local' is confusing. A following commit will export it via
netlink, so let's use a more explicit name.

Reported-by: Eric Dumazet <edumazet@google.com>
Suggested-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-03-04 12:44:48 +01:00
Jakub Kicinski
357660d759 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc5).

Conflicts:

drivers/net/ethernet/cadence/macb_main.c
  fa52f15c74 ("net: cadence: macb: Synchronize stats calculations")
  75696dd0fd ("net: cadence: macb: Convert to get_stats64")
https://lore.kernel.org/20250224125848.68ee63e5@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/intel/ice/ice_sriov.c
  79990cf5e7 ("ice: Fix deinitializing VF in error path")
  a203163274 ("ice: simplify VF MSI-X managing")

net/ipv4/tcp.c
  18912c5206 ("tcp: devmem: don't write truncated dmabuf CMSGs to userspace")
  297d389e9e ("net: prefix devmem specific helpers")

net/mptcp/subflow.c
  8668860b0a ("mptcp: reset when MPTCP opts are dropped after join")
  c3349a22c2 ("mptcp: consolidate subflow cleanup")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-27 10:20:58 -08:00
Ahmed Zaki
bd7c00605e net: move aRFS rmap management and CPU affinity to core
A common task for most drivers is to remember the user-set CPU affinity
to its IRQs. On each netdev reset, the driver should re-assign the user's
settings to the IRQs. Unify this task across all drivers by moving the CPU
affinity to napi->config.

However, to move the CPU affinity to core, we also need to move aRFS
rmap management since aRFS uses its own IRQ notifiers.

For the aRFS, add a new netdev flag "rx_cpu_rmap_auto". Drivers supporting
aRFS should set the flag via netif_enable_cpu_rmap() and core will allocate
and manage the aRFS rmaps. Freeing the rmap is also done by core when the
netdev is freed. For better IRQ affinity management, move the IRQ rmap
notifier inside the napi_struct and add new notify.notify and
notify.release functions: netif_irq_cpu_rmap_notify() and
netif_napi_affinity_release().

Now we have the aRFS rmap management in core, add CPU affinity mask to
napi_config. To delegate the CPU affinity management to the core, drivers
must:
 1 - set the new netdev flag "irq_affinity_auto":
                                       netif_enable_irq_affinity(netdev)
 2 - create the napi with persistent config:
                                       netif_napi_add_config()
 3 - bind an IRQ to the napi instance: netif_napi_set_irq()

the core will then make sure to use re-assign affinity to the napi's
IRQ.

The default IRQ mask is set to one cpu starting from the closest NUMA.

Signed-off-by: Ahmed Zaki <ahmed.zaki@intel.com>
Link: https://patch.msgid.link/20250224232228.990783-2-ahmed.zaki@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26 19:51:37 -08:00
Eric Dumazet
3ba075278c tcp: be less liberal in TSEcr received while in SYN_RECV state
Yong-Hao Zou mentioned that linux was not strict as other OS in 3WHS,
for flows using TCP TS option (RFC 7323)

As hinted by an old comment in tcp_check_req(),
we can check the TSEcr value in the incoming packet corresponds
to one of the SYNACK TSval values we have sent.

In this patch, I record the oldest and most recent values
that SYNACK packets have used.

Send a challenge ACK if we receive a TSEcr outside
of this range, and increase a new SNMP counter.

nstat -az | grep TSEcrRejected
TcpExtTSEcrRejected            0                  0.0

Due to TCP fastopen implementation, do not apply yet these checks
for fastopen flows.

v2: No longer use req->num_timeout, but treq->snt_tsval_first
    to detect when first SYNACK is prepared. This means
    we make sure to not send an initial zero TSval.
    Make sure MPTCP and TCP selftests are passing.
    Change MIB name to TcpExtTSEcrRejected

v1: https://lore.kernel.org/netdev/CADVnQykD8i4ArpSZaPKaoNxLJ2if2ts9m4As+=Jvdkrgx1qMHw@mail.gmail.com/T/

Reported-by: Yong-Hao Zou <yonghaoz1994@gmail.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20250225171048.3105061-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-26 18:11:17 -08:00
Michael Ellerman
d9fefcea81 net: spider_net: Remove powerpc Cell driver
This driver can no longer be built since support for IBM Cell Blades was
removed, in particular PPC_IBM_CELL_BLADE.

Remove the driver and the documentation.
Remove the MAINTAINERS entry, and add Ishizaki and Geoff to CREDITS.

Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: https://patch.msgid.link/20241218105523.416573-24-mpe@ellerman.id.au
2025-02-26 21:15:18 +05:30
Gal Pressman
ecdff89338 ethtool: Symmetric OR-XOR RSS hash
Add an additional type of symmetric RSS hash type: OR-XOR.
The "Symmetric-OR-XOR" algorithm transforms the input as follows:

(SRC_IP | DST_IP, SRC_IP ^ DST_IP, SRC_PORT | DST_PORT, SRC_PORT ^ DST_PORT)

Change 'cap_rss_sym_xor_supported' to 'supported_input_xfrm', a bitmap
of supported RXH_XFRM_* types.

Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/20250224174416.499070-2-gal@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-25 18:31:04 -08:00
Sven Eckelmann
9443335502 batman-adv: Use consistent name for mesh interface
The way how the virtual interface is called inside the batman-adv source
code is not consistent. The genl headers call it meshif and the rest of the
code calls is (mostly) softif.

The genl definitions cannot be touched because they are part of the UAPI.
But the rest of the batman-adv code can be touched to have a consistent
name again.

The bulk of the renaming was done using

  sed -i -e 's/soft\(-\|\_\| \|\)i\([nf]\)/mesh\1i\2/g' \
         -e 's/SOFT\(-\|\_\| \|\)I\([NF]\)/MESH\1I\2/g'

and then it was adjusted slightly when proofreading the changes.

Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2025-02-22 11:36:22 +01:00
Jakub Kicinski
e87700965a bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQQ6NaUOruQGUkvPdG4raS+Z+3y5EwUCZ7ffOQAKCRAraS+Z+3y5
 EzVHAP9h/QkeYoOZW9gul08I8vFiZsFe/lbOSLJWxeVfxb9JhgD/cMqby3qAxQK6
 lsdNQ9jYG2232Wym89ag7fvTBK15Wg4=
 =gkN2
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Martin KaFai Lau says:

====================
pull-request: bpf-next 2025-02-20

We've added 19 non-merge commits during the last 8 day(s) which contain
a total of 35 files changed, 1126 insertions(+), 53 deletions(-).

The main changes are:

1) Add TCP_RTO_MAX_MS support to bpf_set/getsockopt, from Jason Xing

2) Add network TX timestamping support to BPF sock_ops, from Jason Xing

3) Add TX metadata Launch Time support, from Song Yoong Siang

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next:
  igc: Add launch time support to XDP ZC
  igc: Refactor empty frame insertion for launch time support
  net: stmmac: Add launch time support to XDP ZC
  selftests/bpf: Add launch time request to xdp_hw_metadata
  xsk: Add launch time hardware offload support to XDP Tx metadata
  selftests/bpf: Add simple bpf tests in the tx path for timestamping feature
  bpf: Support selective sampling for bpf timestamping
  bpf: Add BPF_SOCK_OPS_TSTAMP_SENDMSG_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_ACK_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SND_HW_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SND_SW_CB callback
  bpf: Add BPF_SOCK_OPS_TSTAMP_SCHED_CB callback
  net-timestamp: Prepare for isolating two modes of SO_TIMESTAMPING
  bpf: Disable unsafe helpers in TX timestamping callbacks
  bpf: Prevent unsafe access to the sock fields in the BPF timestamping callback
  bpf: Prepare the sock_ops ctx and call bpf prog for TX timestamping
  bpf: Add networking timestamping support to bpf_get/setsockopt()
  selftests/bpf: Add rto max for bpf_setsockopt test
  bpf: Support TCP_RTO_MAX_MS for bpf_setsockopt
====================

Link: https://patch.msgid.link/20250221022104.386462-1-martin.lau@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-21 15:59:47 -08:00
Leon Romanovsky
cc18f482e8 xfrm: provide common xdo_dev_offload_ok callback implementation
Almost all drivers except bond and nsim had same check if device
can perform XFRM offload on that specific packet. The check was that
packet doesn't have IPv4 options and IPv6 extensions.

In NIC drivers, the IPv4 HELEN comparison was slightly different, but
the intent was to check for the same conditions. So let's chose more
strict variant as a common base.

Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-02-21 08:08:15 +01:00
Linus Torvalds
319fc77f8f BPF fixes:
- Fix a soft-lockup in BPF arena_map_free on 64k page size
   kernels (Alan Maguire)
 
 - Fix a missing allocation failure check in BPF verifier's
   acquire_lock_state (Kumar Kartikeya Dwivedi)
 
 - Fix a NULL-pointer dereference in trace_kfree_skb by adding
   kfree_skb to the raw_tp_null_args set (Kuniyuki Iwashima)
 
 - Fix a deadlock when freeing BPF cgroup storage (Abel Wu)
 
 - Fix a syzbot-reported deadlock when holding BPF map's
   freeze_mutex (Andrii Nakryiko)
 
 - Fix a use-after-free issue in bpf_test_init when
   eth_skb_pkt_type is accessing skb data not containing an
   Ethernet header (Shigeru Yoshida)
 
 - Fix skipping non-existing keys in generic_map_lookup_batch
   (Yan Zhai)
 
 - Several BPF sockmap fixes to address incorrect TCP copied_seq
   calculations, which prevented correct data reads from recv(2)
   in user space (Jiayuan Chen)
 
 - Two fixes for BPF map lookup nullness elision (Daniel Xu)
 
 - Fix a NULL-pointer dereference from vmlinux BTF lookup in
   bpf_sk_storage_tracing_allowed (Jared Kangas)
 
 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
 -----BEGIN PGP SIGNATURE-----
 
 iIsEABYKADMWIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZ7evlRUcZGFuaWVsQGlv
 Z2VhcmJveC5uZXQACgkQ2yufC7HISIPwHgD/dTvM00Ck4Q73fPivyT7tcqxeXJlD
 D6ggzWl/SG9LAbwA/2/cSgAM9Jm1g7ddvn/S9QaDYOs5GmFl6urq6krs+tYD
 =FCs9
 -----END PGP SIGNATURE-----

Merge tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Pull BPF fixes from Daniel Borkmann:

 - Fix a soft-lockup in BPF arena_map_free on 64k page size kernels
   (Alan Maguire)

 - Fix a missing allocation failure check in BPF verifier's
   acquire_lock_state (Kumar Kartikeya Dwivedi)

 - Fix a NULL-pointer dereference in trace_kfree_skb by adding kfree_skb
   to the raw_tp_null_args set (Kuniyuki Iwashima)

 - Fix a deadlock when freeing BPF cgroup storage (Abel Wu)

 - Fix a syzbot-reported deadlock when holding BPF map's freeze_mutex
   (Andrii Nakryiko)

 - Fix a use-after-free issue in bpf_test_init when eth_skb_pkt_type is
   accessing skb data not containing an Ethernet header (Shigeru
   Yoshida)

 - Fix skipping non-existing keys in generic_map_lookup_batch (Yan Zhai)

 - Several BPF sockmap fixes to address incorrect TCP copied_seq
   calculations, which prevented correct data reads from recv(2) in user
   space (Jiayuan Chen)

 - Two fixes for BPF map lookup nullness elision (Daniel Xu)

 - Fix a NULL-pointer dereference from vmlinux BTF lookup in
   bpf_sk_storage_tracing_allowed (Jared Kangas)

* tag 'bpf-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  selftests: bpf: test batch lookup on array of maps with holes
  bpf: skip non exist keys in generic_map_lookup_batch
  bpf: Handle allocation failure in acquire_lock_state
  bpf: verifier: Disambiguate get_constant_map_key() errors
  bpf: selftests: Test constant key extraction on irrelevant maps
  bpf: verifier: Do not extract constant map keys for irrelevant maps
  bpf: Fix softlockup in arena_map_free on 64k page kernel
  net: Add rx_skb of kfree_skb to raw_tp_null_args[].
  bpf: Fix deadlock when freeing cgroup storage
  selftests/bpf: Add strparser test for bpf
  selftests/bpf: Fix invalid flag of recv()
  bpf: Disable non stream socket for strparser
  bpf: Fix wrong copied_seq calculation
  strparser: Add read_sock callback
  bpf: avoid holding freeze_mutex during mmap operation
  bpf: unify VM_WRITE vs VM_MAYWRITE use in BPF map mmaping logic
  selftests/bpf: Adjust data size to have ETH_HLEN
  bpf, test_run: Fix use-after-free issue in eth_skb_pkt_type()
  bpf: Remove unnecessary BTF lookups in bpf_sk_storage_tracing_allowed
2025-02-20 15:37:17 -08:00
Song Yoong Siang
ca4419f15a xsk: Add launch time hardware offload support to XDP Tx metadata
Extend the XDP Tx metadata framework so that user can requests launch time
hardware offload, where the Ethernet device will schedule the packet for
transmission at a pre-determined time called launch time. The value of
launch time is communicated from user space to Ethernet driver via
launch_time field of struct xsk_tx_metadata.

Suggested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Song Yoong Siang <yoong.siang.song@intel.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Acked-by: Stanislav Fomichev <sdf@fomichev.me>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250216093430.957880-2-yoong.siang.song@intel.com
2025-02-20 15:13:45 -08:00
Oleksij Rempel
6b89d89f21 can: j1939: Extend stack documentation with buffer size behavior
Extend the J1939 stack documentation to include information about how
buffer sizes influence stack behavior, detailing handling of simple
sessions, TP, and ETP transfers.

Additionally, describe various setsockopt(2) options, including their
usage  and potential error values that can be returned by the stack.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241013181715.3488980-1-o.rempel@pengutronix.de
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-19 11:10:53 +01:00
Mauro Carvalho Chehab
f12583fd2b docs: networking: Allow creating cross-references statistics ABI
Now that Documentation/ABI is processed by automarkup, let it
generate cross-references for the corresponding ABI file.

Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/a34ab9bef8f4e6b89dcb15098557fd3a7a9aa353.1739254867.git.mchehab+huawei@kernel.org
2025-02-18 13:42:46 -07:00
David Wei
d9ac1d5fc9 net: add documentation for io_uring zcrx
Add documentation for io_uring zero copy Rx that explains requirements
and the user API.

Signed-off-by: David Wei <dw@davidwei.uk>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20250215000947.789731-11-dw@davidwei.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2025-02-17 05:41:09 -07:00
Joe Damato
bf1b8e0abc documentation: networking: Add NAPI config
Document the existence of persistent per-NAPI configuration space and
the API that drivers can opt into.

Update stale documentation which suggested that NAPI IDs cannot be
queried from userspace.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250213191535.38792-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-14 17:09:19 -08:00
Jakub Kicinski
7a7e019713 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.14-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-13 12:43:30 -08:00
Ritvik Gupta
c3a97ccaed Documentation: dpaa2 ethernet switch driver: Fix spelling
Corrected spelling mistake

Signed-off-by: Ritvik Gupta <ritvikfoss@gmail.com>
Link: https://patch.msgid.link/20250212021311.13257-1-ritvikfoss@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-13 16:13:00 +01:00
Akiva Goldberger
b820864335 net/mlx5: Expose ICM consumption per function
ICM is a portion of the host's memory assigned to a function by the OS
through requests made by the NIC's firmware.

PF ICM consumption can be accessed directly, while VF/SF ICM consumption
can be accessed through their representors in switchdev mode.

The value is exposed to the user in granularity of 4KB through the vnic
health reporter as follows:

$ devlink health diagnose pci/0000:08:00.0 reporter vnic
 vNIC env counters:
     total_error_queues: 0 send_queue_priority_update_flow: 0
     comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
     invalid_command: 0 quota_exceeded_command: 0
     nic_receive_steering_discard: 0 icm_consumption: 1032

Signed-off-by: Akiva Goldberger <agoldberger@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250209101716.112774-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-12 10:46:17 -08:00
Edward Cree
5ea73bf3c4 sfc: document devlink flash support
Update the information in sfc's devlink documentation including
 support for firmware update with devlink flash.
Also update the help text for CONFIG_SFC_MTD, as it is no longer
 strictly required for firmware updates.

Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://patch.msgid.link/3476b0ef04a0944f03e0b771ec8ed1a9c70db4dc.1739186253.git.ecree.xilinx@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-02-11 17:12:17 -08:00
Eric Dumazet
1280c26228 tcp: add tcp_rto_max_ms sysctl
Previous patch added a TCP_RTO_MAX_MS socket option
to tune a TCP socket max RTO value.

Many setups prefer to change a per netns sysctl.

This patch adds /proc/sys/net/ipv4/tcp_rto_max_ms

Its initial value is 120000 (120 seconds).

Keep in mind that a decrease of tcp_rto_max_ms
means shorter overall timeouts, unless tcp_retries2
sysctl is increased.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11 13:08:00 +01:00
Eric Dumazet
54a378f434 tcp: add the ability to control max RTO
Currently, TCP stack uses a constant (120 seconds)
to limit the RTO value exponential growth.

Some applications want to set a lower value.

Add TCP_RTO_MAX_MS socket option to set a value (in ms)
between 1 and 120 seconds.

It is discouraged to change the socket rto max on a live
socket, as it might lead to unexpected disconnects.

Following patch is adding a netns sysctl to control the
default value at socket creation time.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-02-11 13:08:00 +01:00
Breno Leitao
a7aec70a90 netconsole: docs: Add documentation for CPU number auto-population
Update the netconsole documentation to explain the new feature that
allows automatic population of the CPU number.

The key changes include introducing a new section titled "CPU number
auto population in userdata", explaining how to enable the CPU number
auto-population feature by writing to the "populate_cpu_nr" file in the
netconsole configfs hierarchy.

This documentation update ensures users are aware of the new CPU number
auto-population functionality and how to leverage it for better
demultiplexing and visibility of parallel netconsole output.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-02-10 15:04:18 +00:00
Reyders Morales
d0b197b650 Documentation/networking: fix basic node example document ISO 15765-2
In the current struct sockaddr_can tp is member of can_addr. tp is not
member of struct sockaddr_can.

Signed-off-by: Reyders Morales <reyders1@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Oliver Hartkopp <socketcan@hartkopp.net>
Link: https://patch.msgid.link/20250203224720.42530-1-reyders1@gmail.com
Fixes: 67711e0425 ("Documentation: networking: document ISO 15765-2")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2025-02-08 11:28:11 +01:00
Michal Swiatkowski
b2657259fc ice: devlink PF MSI-X max and min parameter
Use generic devlink PF MSI-X parameter to allow user to change MSI-X
range.

Add notes about this parameters into ice devlink documentation.

Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com>
Signed-off-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2025-02-05 09:02:44 -08:00
Matthieu Baerts (NGI0)
18da4b5d12 doc: mptcp: sysctl: blackhole_timeout is per-netns
All other sysctl entries mention it, and it is a per-namespace sysctl.

So mention it as well.

Fixes: 27069e7cb3 ("mptcp: disable active MPTCP in case of blackhole")
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-30 14:02:19 +01:00
Jiayuan Chen
0532a79efd strparser: Add read_sock callback
Added a new read_sock handler, allowing users to customize read operations
instead of relying on the native socket's read_sock.

Signed-off-by: Jiayuan Chen <mrpre@163.com>
Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org>
Reviewed-by: Jakub Sitnicki <jakub@cloudflare.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://patch.msgid.link/20250122100917.49845-2-mrpre@163.com
2025-01-29 13:32:08 -08:00
Khaled Elnaggar
cde5959913 documentation: networking: fix spelling mistakes
Fix a couple of typos/spelling mistakes in the documentation.

Signed-off-by: Khaled Elnaggar <khaledelnaggarlinux@gmail.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20250123082521.59997-1-khaledelnaggarlinux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-27 14:26:06 -08:00
Jakub Kicinski
59372af69d This cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
 
  - Reorder includes for distributed-arp-table.c, by Sven Eckelmann
 
  - Fix translation table change handling, by Remi Pommarel (2 patches)
 
  - Map VID 0 to untagged TT VLAN, by Sven Eckelmann
 
  - Update MAINTAINERS/mailmap e-mail addresses, by the respective authors
    (4 patches)
 
  - netlink: reduce duplicate code by returning interfaces,
    by Linus Lüssing
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAmeKTuUWHHN3QHNpbW9u
 d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoREQEACKKLbl6IGJ9xXTlaRhtUwr55KD
 JmvJT+y25IOmJBlRpZ6zHEjDxlsF/W+Se63k4tWrQSEscqBLXL9K0nY171cU+sJ0
 Nx6AF2F6GobfVuzJGoJWjOZ3eCk4+stiCsBPz0ZJmu5cN24nZRkQZWpOJCSl1KtY
 huRG+bum58hw4kvaK5IU+9iLxtN5ec3STVZjEe0GKVqB6vo19+XGnPJqNSlVDDiQ
 iJSl+6VjaYYG2zclyQDefo4CuILVTAxGi5vaVFhoLVkUj2QPo41BEpkfQxZ8P1HR
 bFXfnOITuhCeDgzF2uwocaeW0P06kyKa+cyH4PO5BvQOp5up667aDdXtgKPBvKzV
 5Fn+WN0A/SsM+tLdGkVdbU1ceNgsNYLyROJgorDNkiGUmRvrFOmhXVzhJMlcppkR
 O+qh7K7Kpoyuz3a2ZiktvG5sSG0aFhQ3GRJyZlqKGV5bLzJ8e2Z+0aCFKtVw0RlZ
 3lg2/iEb5//+aJKZToHH97U3PG2ahO76WPSiaTPVVfvGYS2gE7fihHWtI9Uf9FzB
 AHFhiOP5oPet5jSjH0jrybOEHcSvuxsOi+vIC0hCiPV4O/fN7tjeEj4XM3uPF5LQ
 FI26vvS5izhZxOWVIUPnz5pSWzTmfiRPNbckcESHt8CHjlSTUw3TPlC0jua0iXMF
 shhpONEPSjZvExybLg==
 =cv/9
 -----END PGP SIGNATURE-----

Merge tag 'batadv-next-pullrequest-20250117' of git://git.open-mesh.org/linux-merge

Simon Wunderlich says:

====================
This cleanup patchset includes the following patches:

 - bump version strings, by Simon Wunderlich

 - Reorder includes for distributed-arp-table.c, by Sven Eckelmann

 - Fix translation table change handling, by Remi Pommarel (2 patches)

 - Map VID 0 to untagged TT VLAN, by Sven Eckelmann

 - Update MAINTAINERS/mailmap e-mail addresses, by the respective authors
   (4 patches)

 - netlink: reduce duplicate code by returning interfaces,
   by Linus Lüssing

* tag 'batadv-next-pullrequest-20250117' of git://git.open-mesh.org/linux-merge:
  batman-adv: netlink: reduce duplicate code by returning interfaces
  MAINTAINERS: mailmap: add entries for Antonio Quartulli
  mailmap: add entries for Sven Eckelmann
  mailmap: add entries for Simon Wunderlich
  MAINTAINERS: update email address of Marek Linder
  batman-adv: Map VID 0 to untagged TT VLAN
  batman-adv: Don't keep redundant TT change events
  batman-adv: Remove atomic usage for tt.local_changes
  batman-adv: Reorder includes for distributed-arp-table.c
  batman-adv: Start new development cycle
====================

Link: https://patch.msgid.link/20250117123910.219278-1-sw@simonwunderlich.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:57:31 -08:00
Matthieu Baerts (NGI0)
5b4fd35343 mptcp: sysctl: add syn_retrans_before_tcp_fallback
The number of SYN + MPC retransmissions before falling back to TCP was
fixed to 2. This is certainly a good default value, but having a fixed
number can be a problem in some environments.

The current behaviour means that if all packets are dropped, there will
be:

- The initial SYN + MPC

- 2 retransmissions with MPC

- The next ones will be without MPTCP.

So typically ~3 seconds before falling back to TCP. In some networks
where some temporally blackholes are unfortunately frequent, or when a
client tries to initiate connections while the network is not ready yet,
this can cause new connections not to have MPTCP connections.

In such environments, it is now possible to increase the number of SYN
retransmissions with MPTCP options to make sure MPTCP is used.

Interesting values are:

- 0: the first retransmission will be done without MPTCP options: quite
     aggressive, but also a higher risk of detecting false-positive
     MPTCP blackholes.

- >= 128: all SYN retransmissions will keep the MPTCP options: back to
          the < 6.12 behaviour.

The default behaviour is not changed here.

Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20250117-net-next-mptcp-syn_retrans_before_tcp_fallback-v1-1-ab4b187099b0@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-18 17:32:01 -08:00
Vladimir Oltean
6a128cdf19 net: ethtool: ts: add separate counter for unconfirmed one-step TX timestamps
For packets with two-step timestamp requests, the hardware timestamp
comes back to the driver through a confirmation mechanism of sorts,
which allows the driver to confidently bump the successful "pkts"
counter.

For one-step PTP, the NIC is supposed to autonomously insert its
hardware TX timestamp in the packet headers while simultaneously
transmitting it. There may be a confirmation that this was done
successfully, or there may not.

None of the current drivers which implement ethtool_ops :: get_ts_stats()
also support HWTSTAMP_TX_ONESTEP_SYNC or HWTSTAMP_TX_ONESTEP_SYNC, so it
is a bit unclear which model to follow. But there are NICs, such as DSA,
where there is no transmit confirmation at all. Here, it would be wrong /
misleading to increment the successful "pkts" counter, because one-step
PTP packets can be dropped on TX just like any other packets.

So introduce a special counter which signifies "yes, an attempt was made,
but we don't know whether it also exited the port or not". I expect that
for one-step PTP packets where a confirmation is available, the "pkts"
counter would be bumped.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20250116104628.123555-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-17 20:01:09 -08:00
Taehee Yoo
eec8359f07 net: ethtool: add support for configuring hds-thresh
The hds-thresh option configures the threshold value of
the header-data-split.
If a received packet size is larger than this threshold value, a packet
will be split into header and payload.
The header indicates TCP and UDP header, but it depends on driver spec.
The bnxt_en driver supports HDS(Header-Data-Split) configuration at
FW level, affecting TCP and UDP too.
So, If hds-thresh is set, it affects UDP and TCP packets.

Example:
   # ethtool -G <interface name> hds-thresh <value>

   # ethtool -G enp14s0f0np0 tcp-data-split on hds-thresh 256
   # ethtool -g enp14s0f0np0
   Ring parameters for enp14s0f0np0:
   Pre-set maximums:
   ...
   HDS thresh:  1023
   Current hardware settings:
   ...
   TCP data split:         on
   HDS thresh:  256

The default/min/max values are not defined in the ethtool so the drivers
should define themself.
The 0 value means that all TCP/UDP packets' header and payload
will be split.

Tested-by: Stanislav Fomichev <sdf@fomichev.me>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Link: https://patch.msgid.link/20250114142852.3364986-3-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-15 14:42:11 -08:00
Oleksij Rempel
7d66c74a17 Documentation: networking: update PHY error counter diagnostics in twisted pair guide
Replace generic instructions for monitoring error counters with a
procedure using the unified PHY statistics interface (`--all-groups`).

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 11:44:19 +01:00
Jakub Kicinski
6167c0b6e8 net: ethtool: add support for structured PHY statistics
Introduce a new way to report PHY statistics in a structured and
standardized format using the netlink API. This new method does not
replace the old driver-specific stats, which can still be accessed with
`ethtool -S <eth name>`. The structured stats are available with
`ethtool -S <eth name> --all-groups`.

This new method makes it easier to diagnose problems by organizing stats
in a consistent and documented way.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-14 11:44:19 +01:00
Moshe Shemesh
9fc43b5e39 net/mlx5: fs, add HWS to steering mode options
Add HW Steering mode to mlx5 devlink param of steering mode options.

Signed-off-by: Moshe Shemesh <moshe@nvidia.com>
Reviewed-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20250109160546.1733647-14-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-13 19:21:09 -08:00
David S. Miller
7b24f164cf ipsec-next-2025-01-09
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEH7ZpcWbFyOOp6OJbrB3Eaf9PW7cFAmd/mNkACgkQrB3Eaf9P
 W7cQww/9Hnv7+wosuBxFW2o2ptgZThiwj/Ovz0oiWGWvctpN7VlpwmrDOsXQ+XMn
 xF6JBFEjnJanYoDBb78D0dMffJcMcUKZopTUU/ZMitSNr8aIHYiuB4SWPG1tqxl4
 Ete7Mr2m3tS96YePQNnAaRZzEuGsx3BQb28VLTWl9So81MByD2OK4fsAbYz22Gg8
 7A6tDHn1mUd9b2VG+LeeBZaDDFG8C0O2x4E/8Z3DX3z1N8y3LABPwZ38jcgTviKO
 1ZldGrJT+PownBydu23bWDassKE2TuVvGH9e/SOPeQj8DJ4Lmd0bafMZTY6xwfNT
 RJCwhlzZUpYRXFzvcf3+U3egsqEWEemV7/LzAapdT0V9OqfLWMUh3b1jMA4KcblZ
 qmlm/MhZyXutDoeuASwtM4jgM3wGwovOofrKKsb13hD9VLBs5jFZmFSw5TlbmwE3
 sjv7V4pFwNyfJnwQtyMmfuuHiy8w+fzqAA2GCg8mF3OosHABH/FOvEBP8xg1Vqu1
 iKlLkByfyaCFD+GxTPzqSDvSB8nDzeZBgBM/ILGuwH0OWr+0gxBzl1sxrhAkw9hC
 Gf+4J3wg7EknTxfrJk4LyqfyS50GvUIzpLSkHYxDtQRd4zv75bxRzMvoZvuapTtN
 GGpWYftKO8n2kgLORQ0dTtFee3c4w/No+KXWsFdtRVNpyk3N0Yw=
 =U79q
 -----END PGP SIGNATURE-----

Merge tag 'ipsec-next-2025-01-09' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next

Steffen Klassert says:

====================
ipsec-next-2025-01-09

1) Implement the AGGFRAG protocol and basic IP-TFS (RFC9347) functionality.
   From Christian Hopps.

2) Support ESN context update to hardware for TX.
   From Jianbo Liu.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-10 09:15:17 +00:00
Jan Stancek
ab88c2b373 tools: ynl: move python code to separate sub-directory
Move python code to a separate directory so it can be
packaged as a python module. Updates existing references
in selftests and docs.

Also rename ynl-gen-[c|rst] to ynl_gen_[c|rst], avoid
dashes as these prevent easy imports for entrypoints.

Signed-off-by: Jan Stancek <jstancek@redhat.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/a4151bad0e6984e7164d395125ce87fd2e048bf1.1736343575.git.jstancek@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-09 12:53:27 -08:00
Jakub Kicinski
0b7bdc7fab netdev: define NETDEV_INTERNAL
Linus suggested during one of past maintainer summits (in context of
a DMA_BUF discussion) that symbol namespaces can be used to prevent
unwelcome but in-tree code from using all exported functions.
Create a namespace for netdev.

Export netdev_rx_queue_restart(), drivers may want to use it since
it gives them a simple and safe way to restart a queue to apply
config changes. But it's both too low level and too actively developed
to be used outside netdev.

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-01-09 15:33:08 +01:00
Aleksandr Loktionov
07af482e64 i40e: add ability to reset VF for Tx and Rx MDD events
Implement "mdd-auto-reset-vf" priv-flag to handle Tx and Rx MDD events for VFs.
This flag is also used in other network adapters like ICE.

Usage:
- "on"  - The problematic VF will be automatically reset
	  if a malformed descriptor is detected.
- "off" - The problematic VF will be disabled.

In cases where a VF sends malformed packets classified as malicious, it can
cause the Tx queue to freeze, rendering it unusable for several minutes. When
an MDD event occurs, this new implementation allows for a graceful VF reset to
quickly restore operational state.

Currently, VF queues are disabled if an MDD event occurs. This patch adds the
ability to reset the VF if a Tx or Rx MDD event occurs. It also includes MDD
event logging throttling to avoid dmesg pollution and unifies the format of
Tx and Rx MDD messages.

Note: Standard message rate limiting functions like dev_info_ratelimited()
do not meet our requirements. Custom rate limiting is implemented,
please see the code for details.

Co-developed-by: Jan Sokolowski <jan.sokolowski@intel.com>
Signed-off-by: Jan Sokolowski <jan.sokolowski@intel.com>
Co-developed-by: Padraig J Connolly <padraig.j.connolly@intel.com>
Signed-off-by: Padraig J Connolly <padraig.j.connolly@intel.com>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Michal Schmidt <mschmidt@redhat.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20250106221929.956999-13-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-07 18:16:02 -08:00
Jianbo Liu
373b79af3a xfrm: Support ESN context update to hardware for TX
Previously xfrm_dev_state_advance_esn() was added for RX only. But
it's possible that ESN context also need to be synced to hardware for
TX, so call it for outbound in this patch.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
2025-01-07 13:12:11 +01:00
Jakub Kicinski
3e5908172c Merge tag 'ieee802154-for-net-next-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next
Stefan Schmidt says:

====================
pull-request: ieee802154-next 2025-01-03

Leo Stone provided a documatation fix to improve the grammar.

David Gilbert spotted a non-used fucntion we can safely remove.

* tag 'ieee802154-for-net-next-2025-01-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wpan/wpan-next:
  net: mac802154: Remove unused ieee802154_mlme_tx_one
  Documentation: ieee802154: fix grammar
====================

Link: https://patch.msgid.link/20250103154605.440478-1-stefan@datenfreihafen.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-04 17:02:55 -08:00
Marek Lindner
7bce3f7518 MAINTAINERS: update email address of Marek Linder
Signed-off-by: Marek Lindner <marek.lindner@mailbox.org>
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
2024-12-31 09:20:19 +01:00
Mina Almasry
f6038d913b net: Document netmem driver support
Document expectations from drivers looking to add support for device
memory tcp or other netmem based features.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Link: https://patch.msgid.link/20241217201206.2360389-1-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-19 19:12:36 -08:00
shunlizhou
65c233d8e3 docs: net: bonding: fix typos
The bonding documentation had several "insure" which is not
properly used in the context. Suggest to change to "ensure"
to improve readability.

Signed-off-by: shunlizhou <shunlizhou@aliyun.com>
Link: https://patch.msgid.link/20241216135447.57681-1-shunlizhou@aliyun.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-18 17:27:35 -08:00
Kory Maincent
6e9e2eed4f net: ethtool: Add support for tsconfig command to get/set hwtstamp config
Introduce support for ETHTOOL_MSG_TSCONFIG_GET/SET ethtool netlink socket
to read and configure hwtstamp configuration of a PHC provider. Note that
simultaneous hwtstamp isn't supported; configuring a new one disables the
previous setting.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16 12:51:41 +00:00
Kory Maincent
b9e3f7dc9e net: ethtool: tsinfo: Enhance tsinfo to support several hwtstamp by net topology
Either the MAC or the PHY can provide hwtstamp, so we should be able to
read the tsinfo for any hwtstamp provider.

Enhance 'get' command to retrieve tsinfo of hwtstamp providers within a
network topology.

Add support for a specific dump command to retrieve all hwtstamp
providers within the network topology, with added functionality for
filtered dump to target a single interface.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16 12:51:41 +00:00
Sabrina Dubroca
5aa97a43d0 docs: tls: document TLS1.3 key updates
Document the kernel's behavior and userspace expectations.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Sabrina Dubroca <sd@queasysnail.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-12-16 12:47:30 +00:00
Jakub Kicinski
5098462fba Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc3).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-12 14:19:05 -08:00
Jakub Sitnicki
ca6a6f9386 tcp: Add sysctl to configure TIME-WAIT reuse delay
Today we have a hardcoded delay of 1 sec before a TIME-WAIT socket can be
reused by reopening a connection. This is a safe choice based on an
assumption that the other TCP timestamp clock frequency, which is unknown
to us, may be as low as 1 Hz (RFC 7323, section 5.4).

However, this means that in the presence of short lived connections with an
RTT of couple of milliseconds, the time during which a 4-tuple is blocked
from reuse can be orders of magnitude longer that the connection lifetime.
Combined with a reduced pool of ephemeral ports, when using
IP_LOCAL_PORT_RANGE to share an egress IP address between hosts [1], the
long TIME-WAIT reuse delay can lead to port exhaustion, where all available
4-tuples are tied up in TIME-WAIT state.

Turn the reuse delay into a per-netns setting so that sysadmins can make
more aggressive assumptions about remote TCP timestamp clock frequency and
shorten the delay in order to allow connections to reincarnate faster.

Note that applications can completely bypass the TIME-WAIT delay protection
already today by locking the local port with bind() before connecting. Such
immediate connection reuse may result in PAWS failing to detect old
duplicate segments, leaving us with just the sequence number check as a
safety net.

This new configurable offers a trade off where the sysadmin can balance
between the risk of PAWS detection failing to act versus exhausting ports
by having sockets tied up in TIME-WAIT state for too long.

[1] https://lpc.events/event/16/contributions/1349/

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20241209-jakub-krn-909-poc-msec-tw-tstamp-v2-2-66aca0eed03e@cloudflare.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-11 20:17:33 -08:00
Petr Machata
bbe4b41259 Documentation: networking: Add a caveat to nexthop_compat_mode sysctl
net.ipv4.nexthop_compat_mode was added when nexthop objects were added to
provide the view of nexthop objects through the usual lens of the route
UAPI. As nexthop objects evolved, the information provided through this
lens became incomplete. For example, details of resilient nexthop groups
are obviously omitted.

Now that 16-bit nexthop group weights are a thing, the 8-bit UAPI cannot
convey the >8-bit weight accurately. Instead of inventing workarounds for
an obsolete interface, just document the expectations of inaccuracy.

Fixes: b72a6a7ab9 ("net: nexthop: Increase weight to u16")
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/b575e32399ccacd09079b2a218255164535123bd.1733740749.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-10 18:26:24 -08:00
Jakub Kicinski
302cc446cb Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.13-rc2).

No conflicts or adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-05 11:50:14 -08:00
Maksym Kutsevol
36de47bfd0 netcons: Add udp send fail statistics to netconsole
Enhance observability of netconsole. Packet sends can fail.
Start tracking at least two failure possibilities: ENOMEM and
NET_XMIT_DROP for every target. Stats are exposed via an additional
attribute in CONFIGFS.

The exposed statistics allows easier debugging of cases when netconsole
messages were not seen by receivers, eliminating the guesswork if the
sender thinks that messages in question were sent out.

Stats are not reset on enable/disable/change remote ip/etc, they
belong to the netcons target itself.

Reported-by: Breno Leitao <leitao@debian.org>
Closes: https://lore.kernel.org/all/ZsWoUzyK5du9Ffl+@gmail.com/
Signed-off-by: Maksym Kutsevol <max@kutsevol.com>
Link: https://patch.msgid.link/20241202-netcons-add-udp-send-fail-statistics-to-netconsole-v5-2-70e82239f922@kutsevol.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-04 19:16:59 -08:00
Vyshnav Ajith
16ed454515 docs: net: bareudp: fix spelling and grammar mistakes
The BareUDP documentation had several grammar and spelling mistakes,
making it harder to read. This patch fixes those errors to improve
clarity and readability for developers.

Signed-off-by: Vyshnav Ajith <puthen1977@gmail.com>
Link: https://patch.msgid.link/20241121224827.12293-1-puthen1977@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-30 13:54:28 -08:00
Leo Stone
04f5cb4899 Documentation: tls_offload: fix typos and grammar
Fix typos and grammar where it improves readability.

Signed-off-by: Leo Stone <leocstone@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20241124230002.56058-1-leocstone@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-28 12:09:06 +01:00
Vyshnav Ajith
6e33123a18 Fix spelling mistake
Changed from reequires to require. A minute typo.

Signed-off-by: Vyshnav Ajith <puthen1977@gmail.com>
Link: https://patch.msgid.link/20241121221852.10754-1-puthen1977@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-28 12:08:32 +01:00
tuqiang
a202e60147 Documentation: tipc: fix formatting issue in tipc.rst
The hyphen is removed to have the same style as the others.

Fixes: 09ef17863f ("Documentation: add more details in tipc.rst")
Signed-off-by: tuqiang <tu.qiang35@zte.com.cn>
Signed-off-by: Jiang Kun <jiang.kun2@zte.com.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: David S. Miller <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Xin Long <lucien.xin@gmail.com>
Link: https://patch.msgid.link/20241115104731435HuFjZOmkaqumuFhB7mrwe@zte.com.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-19 11:17:13 +01:00
Sanman Pradhan
79da2aaa08 eth: fbnic: add RPC hardware statistics
Report Rx parser statistics via ethtool -S.

The parser stats are 32b, so we need to add refresh to the service
task to make sure we don't miss overflows.

Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241115015344.757567-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18 18:43:41 -08:00
Sanman Pradhan
25ba596d13 eth: fbnic: add PCIe hardware statistics
Add PCIe hardware statistics support to the fbnic driver. These stats
provide insight into PCIe transaction performance and error conditions.

Which includes, read/write and completion TLP counts and DWORD counts and
debug counters for tag, completion credit and NP credit exhaustion

The stats are exposed via debugfs and can be used to monitor PCIe
performance and debug PCIe issues.

Signed-off-by: Sanman Pradhan <sanman.p211993@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241115015344.757567-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-18 18:43:41 -08:00
Jakub Kicinski
a79993b5fc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc8).

Conflicts:

tools/testing/selftests/net/.gitignore
  252e01e682 ("selftests: net: add netlink-dumps to .gitignore")
  be43a6b238 ("selftests: ncdevmem: Move ncdevmem under drivers/net/hw")
https://lore.kernel.org/all/20241113122359.1b95180a@canb.auug.org.au/

drivers/net/phy/phylink.c
  671154f174 ("net: phylink: ensure PHY momentary link-fails are handled")
  7530ea26c8 ("net: phylink: remove "using_mac_select_pcs"")

Adjacent changes:

drivers/net/ethernet/stmicro/stmmac/dwmac-intel-plat.c
  5b366eae71 ("stmmac: dwmac-intel-plat: fix call balance of tx_clk handling routines")
  e96321fad3 ("net: ethernet: Switch back to struct platform_driver::remove()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-14 11:29:15 -08:00
Geetha sowjanya
6050b04dca Documentation: octeontx2: Add Documentation for RVU representors
Adds documentation for creating and configuring rvu port representors

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:12 +00:00
Joe Damato
a90a91e24b docs: networking: Describe irq suspension
Describe irq suspension, the epoll ioctls, and the tradeoffs of using
different gro_flush_timeout values.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Reviewed-by: Sridhar Samudrala <sridhar.samudrala@intel.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://patch.msgid.link/20241109050245.191288-7-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11 18:45:06 -08:00
Mina Almasry
102d1404c3 net: clarify SO_DEVMEM_DONTNEED behavior in documentation
Document new behavior when the number of frags passed is too big.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Link: https://patch.msgid.link/20241107210331.3044434-2-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11 18:11:46 -08:00
Linu Cherian
46799a41d2 devlink: Add documentation for OcteonTx2 AF
Add documentation for the following devlink params
- npc_mcam_high_zone_percent
- npc_def_rule_cntr
- nix_maxlf

Signed-off-by: Linu Cherian <lcherian@marvell.com>
Link: https://patch.msgid.link/20241105125620.2114301-4-lcherian@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-11 14:15:58 -08:00
Gilad Naaman
f7f5273863 neighbour: Create netdev->neighbour association
Create a mapping between a netdev and its neighoburs,
allowing for much cheaper flushes.

Signed-off-by: Gilad Naaman <gnaaman@drivenets.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241107160444.2913124-7-gnaaman@drivenets.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-09 13:22:57 -08:00
Jakub Kicinski
2696e451df Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc7).

Conflicts:

drivers/net/ethernet/freescale/enetc/enetc_pf.c
  e15c5506dd ("net: enetc: allocate vf_state during PF probes")
  3774409fd4 ("net: enetc: build enetc_pf_common.c as a separate module")
https://lore.kernel.org/20241105114100.118bd35e@canb.auug.org.au

Adjacent changes:

drivers/net/ethernet/ti/am65-cpsw-nuss.c
  de794169cf ("net: ethernet: ti: am65-cpsw: Fix multi queue Rx on J7")
  4a7b2ba94a ("net: ethernet: ti: am65-cpsw: Use tstats instead of open coded version")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-07 13:44:16 -08:00
Jinjian Song
238f2ca1e6 net: wwan: t7xx: Unify documentation column width
Unify the column width of the document to comply with specifications.

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07 13:33:45 +01:00
Jinjian Song
61329a1152 net: wwan: t7xx: Add debug ports
Add support for userspace to enable/disable the debug ports(ADB,MIPC).
 - ADB port: /dev/wwan0adb0
 - MIPC port: /dev/wwan0mipc0

Application can use ADB (Android Debug Bridge) port to implement
functions (shell, pull, push ...) by ADB protocol commands.
E.g., ADB commands:
 - A_OPEN: OPEN(local-id, 0, "destination")
 - A_WRTE: WRITE(local-id, remote-id, "data")
 - A_OKEY: READY(local-id, remote-id, "")
 - A_CLSE: CLOSE(local-id, remote-id, "")

Link: https://android.googlesource.com/platform/packages/modules/adb/+/refs/heads/main/README.md

Application can use MIPC (Modem Information Process Center) port
to debug antenna tuner or noise profiling through this MTK modem
diagnostic interface.

By default, debug ports are not exposed, so using the command
to enable or disable debug ports.

Enable debug ports:
 - enable: 'echo 1 > /sys/bus/pci/devices/${bdf}/t7xx_debug_ports

Disable debug ports:
 - disable: 'echo 0 > /sys/bus/pci/devices/${bdf}/t7xx_debug_ports

Signed-off-by: Jinjian Song <jinjian.song@fibocom.com>
Reviewed-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-07 13:33:45 +01:00
Alexander Hölzl
b6ec62e01a can: j1939: fix error in J1939 documentation.
The description of PDU1 format usage mistakenly referred to PDU2 format.

Signed-off-by: Alexander Hölzl <alexander.hoelzl@gmx.net>
Acked-by: Oleksij Rempel <o.rempel@pengutronix.de>
Acked-by: Vincent Mailhol <mailhol.vincent@wanadoo.fr>
Link: https://patch.msgid.link/20241023145257.82709-1-alexander.hoelzl@gmx.net
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
2024-11-04 17:45:55 +01:00
Caleb Sander Mateos
61bf0009a7 dim: pass dim_sample to net_dim() by reference
net_dim() is currently passed a struct dim_sample argument by value.
struct dim_sample is 24 bytes. Since this is greater 16 bytes, x86-64
passes it on the stack. All callers have already initialized dim_sample
on the stack, so passing it by value requires pushing a duplicated copy
to the stack. Either witing to the stack and immediately reading it, or
perhaps dereferencing addresses relative to the stack pointer in a chain
of push instructions, seems to perform quite poorly.

In a heavy TCP workload, mlx5e_handle_rx_dim() consumes 3% of CPU time,
94% of which is attributed to the first push instruction to copy
dim_sample on the stack for the call to net_dim():
// Call ktime_get()
  0.26 |4ead2:   call   4ead7 <mlx5e_handle_rx_dim+0x47>
// Pass the address of struct dim in %rdi
       |4ead7:   lea    0x3d0(%rbx),%rdi
// Set dim_sample.pkt_ctr
       |4eade:   mov    %r13d,0x8(%rsp)
// Set dim_sample.byte_ctr
       |4eae3:   mov    %r12d,0xc(%rsp)
// Set dim_sample.event_ctr
  0.15 |4eae8:   mov    %bp,0x10(%rsp)
// Duplicate dim_sample on the stack
 94.16 |4eaed:   push   0x10(%rsp)
  2.79 |4eaf1:   push   0x10(%rsp)
  0.07 |4eaf5:   push   %rax
// Call net_dim()
  0.21 |4eaf6:   call   4eafb <mlx5e_handle_rx_dim+0x6b>

To allow the caller to reuse the struct dim_sample already on the stack,
pass the struct dim_sample by reference to net_dim().

Signed-off-by: Caleb Sander Mateos <csander@purestorage.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Shannon Nelson <shannon.nelson@amd.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Arthur Kiyanovski <akiyano@amazon.com>
Reviewed-by: Louis Peens <louis.peens@corigine.com>
Link: https://patch.msgid.link/20241031002326.3426181-2-csander@purestorage.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 12:36:54 -08:00
Jakub Kicinski
5b1c965956 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc6).

Conflicts:

drivers/net/wireless/intel/iwlwifi/mvm/mld-mac80211.c
  cbe84e9ad5 ("wifi: iwlwifi: mvm: really send iwl_txpower_constraints_cmd")
  188a1bf894 ("wifi: mac80211: re-order assigning channel in activate links")
https://lore.kernel.org/all/20241028123621.7bbb131b@canb.auug.org.au/

net/mac80211/cfg.c
  c4382d5ca1 ("wifi: mac80211: update the right link for tx power")
  8dd0498983 ("wifi: mac80211: Fix setting txpower with emulate_chanctx")

drivers/net/ethernet/intel/ice/ice_ptp_hw.h
  6e58c33106 ("ice: fix crash on probe for DPLL enabled E810 LOM")
  e4291b64e1 ("ice: Align E810T GPIO to other products")
  ebb2693f8f ("ice: Read SDP section from NVM for pin definitions")
  ac532f4f42 ("ice: Cleanup unused declarations")
https://lore.kernel.org/all/20241030120524.1ee1af18@canb.auug.org.au/

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-31 18:10:07 -07:00
Kory Maincent
2b1d193a5a Documentation: networking: Add missing PHY_GET command in the message list
ETHTOOL_MSG_PHY_GET/GET_REPLY/NTF is missing in the ethtool message list.
Add it to the ethool netlink documentation.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Link: https://patch.msgid.link/20241028132351.75922-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29 20:31:03 -07:00
Jakub Kicinski
71e0ad3451 wireless-next patches for v6.13
The first -next "new features" pull request for v6.13. This is a big
 one as we have not been able to send one earlier. We have also some
 patches affecting other subsystems: in staging we deleted the rtl8192e
 driver and in debugfs added a new interface to save struct
 file_operations memory; both were acked by GregKH.
 
 Because of the lib80211/libipw move there were quite a lot of
 conflicts and to solve those we decided to merge net-next into
 wireless-next.
 
 Currently there's one conflict in
 Documentation/networking/net_cachelines/net_device.rst. To fix that
 just remove the iw_public_data line:
 
 https://lore.kernel.org/all/20241011121014.674661a0@canb.auug.org.au/
 
 And when net is merged to net-next there will be another simple
 conflict in in net/mac80211/cfg.c:
 
 https://lore.kernel.org/all/20241024115523.4cd35dde@canb.auug.org.au/
 
 Major changes:
 
 cfg80211/mac80211
 
 * stop exporting wext symbols
 
 * new mac80211 op to indicate that a new interface is to be added
 
 * support radio separation of multi-band devices
 
 Wireless Extensions
 
 * move wext spy implementation to libiw
 
 * remove iw_public_data from struct net_device
 
 brcmfmac
 
 * optional LPO clock support
 
 ipw2x00
 
 * move remaining lib80211 code into libiw
 
 wilc1000
 
 * WILC3000 support
 
 rtw89
 
 * RTL8852BE and RTL8852BE-VT BT-coexistence improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmcbz9YRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZsabQf8CWJ/kyonw/Z8hRxgfE/7D6Jiqoq7R+ML
 8W8lbc6F5wra4eCBq/oo6UVV36Ss6mxQYcRcmLq+nCkXa4qdMpg/z55QECMHxx5Z
 YnIBbD2vBrIj7W21gfCKH1WJ+b5IQFZl3zuxuCgXjxD9TJM2CjUfOkvrhrqqzrPn
 clfUx5f01vfv2jdvClPR5977gFE5One/ANeRQNs7uDS0TeeD2P+61DEB1//htIJo
 7GwwCyUJCeOcfWRMzQwhpoppWKcPAV70kSVJrl/fRstS68vQGSQbcx9yiNeWkSFw
 JXjQGdc8eYLPzLqECwS0KwFkta6AXbafAYYXe1wdlAzr+kmJ9x5oqA==
 =x+mr
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2024-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.13

The first -next "new features" pull request for v6.13. This is a big
one as we have not been able to send one earlier. We have also some
patches affecting other subsystems: in staging we deleted the rtl8192e
driver and in debugfs added a new interface to save struct
file_operations memory; both were acked by GregKH.

Because of the lib80211/libipw move there were quite a lot of
conflicts and to solve those we decided to merge net-next into
wireless-next.

Major changes:

cfg80211/mac80211
 * stop exporting wext symbols
 * new mac80211 op to indicate that a new interface is to be added
 * support radio separation of multi-band devices

Wireless Extensions
 * move wext spy implementation to libiw
 * remove iw_public_data from struct net_device

brcmfmac
 * optional LPO clock support

ipw2x00
 * move remaining lib80211 code into libiw

wilc1000
 * WILC3000 support

rtw89
 * RTL8852BE and RTL8852BE-VT BT-coexistence improvements

* tag 'wireless-next-2024-10-25' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (126 commits)
  mac80211: Remove NOP call to ieee80211_hw_config
  wifi: iwlwifi: work around -Wenum-compare-conditional warning
  wifi: mac80211: re-order assigning channel in activate links
  wifi: mac80211: convert debugfs files to short fops
  debugfs: add small file operations for most files
  wifi: mac80211: remove misleading j_0 construction parts
  wifi: mac80211_hwsim: use hrtimer_active()
  wifi: mac80211: refactor BW limitation check for CSA parsing
  wifi: mac80211: filter on monitor interfaces based on configured channel
  wifi: mac80211: refactor ieee80211_rx_monitor
  wifi: mac80211: add support for the monitor SKIP_TX flag
  wifi: cfg80211: add monitor SKIP_TX flag
  wifi: mac80211: add flag to opt out of virtual monitor support
  wifi: cfg80211: pass net_device to .set_monitor_channel
  wifi: mac80211: remove status->ampdu_delimiter_crc
  wifi: cfg80211: report per wiphy radio antenna mask
  wifi: mac80211: use vif radio mask to limit creating chanctx
  wifi: mac80211: use vif radio mask to limit ibss scan frequencies
  wifi: cfg80211: add option for vif allowed radios
  wifi: iwlwifi: allow IWL_FW_CHECK() with just a string
  ...

====================

Link: https://patch.msgid.link/20241025170705.5F6B2C4CEC3@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-29 18:50:58 -07:00
Hangbin Liu
9f59eccd9d Documentation: bonding: add XDP support explanation
Add document about which modes have native XDP support.

Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Link: https://patch.msgid.link/20241021031211.814-3-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-28 16:09:42 -07:00
Levi Zim
b935252cc2 docs: networking: packet_mmap: replace dead links with archive.org links
The original link returns 404 now. This commit replaces the dead google
site link with archive.org link.

Signed-off-by: Levi Zim <rsworktech@outlook.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241021-packet_mmap_fix_link-v1-1-dffae4a174c0@outlook.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-28 15:47:10 -07:00
Leo Stone
aad4b41a34 Documentation: ieee802154: fix grammar
Fix grammar where it improves readability.

Signed-off-by: Leo Stone <leocstone@gmail.com>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Link: https://lore.kernel.org/20241023041203.35313-1-leocstone@gmail.com
Signed-off-by: Stefan Schmidt <stefan@datenfreihafen.org>
2024-10-28 21:14:27 +01:00
Joe Damato
86e25f40aa net: napi: Add napi_config
Add a persistent NAPI config area for NAPI configuration to the core.
Drivers opt-in to setting the persistent config for a NAPI by passing an
index when calling netif_napi_add_config.

napi_config is allocated in alloc_netdev_mqs, freed in free_netdev
(after the NAPIs are deleted).

Drivers which call netif_napi_add_config will have persistent per-NAPI
settings: NAPI IDs, gro_flush_timeout, and defer_hard_irq settings.

Per-NAPI settings are saved in napi_disable and restored in napi_enable.

Co-developed-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Martin Karsten <mkarsten@uwaterloo.ca>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241011184527.16393-6-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14 17:54:29 -07:00
Joe Damato
acb8d4ed56 net: napi: Make gro_flush_timeout per-NAPI
Allow per-NAPI gro_flush_timeout setting.

The existing sysfs parameter is respected; writes to sysfs will write to
all NAPI structs for the device and the net_device gro_flush_timeout
field. Reads from sysfs will read from the net_device field.

The ability to set gro_flush_timeout on specific NAPI instances will be
added in a later commit, via netdev-genl.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241011184527.16393-4-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14 17:54:29 -07:00
Joe Damato
f15e3b3ddb net: napi: Make napi_defer_hard_irqs per-NAPI
Add defer_hard_irqs to napi_struct in preparation for per-NAPI
settings.

The existing sysfs parameter is respected; writes to sysfs will write to
all NAPI structs for the device and the net_device defer_hard_irq field.
Reads from sysfs show the net_device field.

The ability to set defer_hard_irqs on specific NAPI instances will be
added in a later commit, via netdev-genl.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20241011184527.16393-2-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-14 17:54:28 -07:00
Eric Dumazet
d677aebd66 tcp: move sysctl_tcp_l3mdev_accept to netns_ipv4_read_rx
sysctl_tcp_l3mdev_accept is read from TCP receive fast path from
tcp_v6_early_demux(),
 __inet6_lookup_established,
  inet_request_bound_dev_if().

Move it to netns_ipv4_read_rx.

Remove the '#ifdef CONFIG_NET_L3_MASTER_DEV' that was guarding
its definition.

Note this adds a hole of three bytes that could be filled later.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Cc: Wei Wang <weiwan@google.com>
Cc: Coco Li <lixiaoyan@google.com>
Link: https://patch.msgid.link/20241010034100.320832-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-11 08:45:24 -07:00
Jakub Kicinski
9c0fc36ec4 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR (net-6.12-rc3).

No conflicts and no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 13:13:33 -07:00
Jakub Kicinski
bdb5d2481a Merge branch 'net-introduce-tx-h-w-shaping-api'
Paolo Abeni says:

====================
net: introduce TX H/W shaping API

We have a plurality of shaping-related drivers API, but none flexible
enough to meet existing demand from vendors[1].

This series introduces new device APIs to configure in a flexible way
TX H/W shaping. The new functionalities are exposed via a newly
defined generic netlink interface and include introspection
capabilities. Some self-tests are included, on top of a dummy
netdevsim implementation. Finally a basic implementation for the iavf
driver is provided.

Some usage examples:

* Configure shaping on a given queue:

./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \
	--do set --json '{"ifindex": '$IFINDEX',
			  "shaper": {"handle":
				     {"scope": "queue", "id":'$QUEUEID'},
			  "bw-max": 2000000}}'

* Container B/W sharing

The orchestration infrastructure wants to group the
container-related queues under a RR scheduling and limit the aggregate
bandwidth:

./tools/net/ynl/cli.py --spec Documentation/netlink/specs/shaper.yaml \
	--do group --json '{"ifindex": '$IFINDEX',
			"leaves": [
			  {"handle": {"scope": "queue", "id":'$QID1'},
			   "weight": '$W1'},
			  {"handle": {"scope": "queue", "id":'$QID2'},
			   "weight": '$W2'}],
			  {"handle": {"scope": "queue", "id":'$QID3'},
			   "weight": '$W3'}],
			"handle": {"scope":"node"},
			"bw-max": 10000000}'
{'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 0}}

Q1 \
    \
Q2 -- node 0 -------  netdev
    / (bw-max: 10M)
Q3 /

* Delegation

A containers wants to limit the aggregate B/W bandwidth of 2 of the 3
queues it owns - the starting configuration is the one from the
previous point:

SPEC=Documentation/netlink/specs/net_shaper.yaml
./tools/net/ynl/cli.py --spec $SPEC \
	--do group --json '{"ifindex": '$IFINDEX',
			"leaves": [
			  {"handle": {"scope": "queue", "id":'$QID1'},
			   "weight": '$W1'},
			  {"handle": {"scope": "queue", "id":'$QID2'},
			   "weight": '$W2'}],
			"handle": {"scope": "node"},
			"bw-max": 5000000 }'
{'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 1}}

Q1 -- node 1 --------\
    / (bw-max: 5M)    \
Q2 /                   node 0 -------  netdev
                      /(bw-max: 10M)
Q3 ------------------/

In a group operation, when prior to the op itself, the leaves have
different parents, the user must specify the parent handle for the
group. I.e., starting from the previous config:

./tools/net/ynl/cli.py --spec $SPEC \
	--do group --json '{"ifindex": '$IFINDEX',
			"leaves": [
			  {"handle": {"scope": "queue", "id":'$QID1'},
			   "weight": '$W1'},
			  {"handle": {"scope": "queue", "id":'$QID3'},
			   "weight": '$W3'}],
			"handle": {"scope": "node"},
			"bw-max": 3000000 }'
Netlink error: Invalid argument
nl_len = 96 (80) nl_flags = 0x300 nl_type = 2
	error: -22
	extack: {'msg': 'All the leaves shapers must have the same old parent'}

./tools/net/ynl/cli.py --spec $SPEC \
	--do group --json '{"ifindex": '$IFINDEX',
			"leaves": [
			  {"handle": {"scope": "queue", "id":'$QID1'},
			   "weight": '$W1'},
			  {"handle": {"scope": "queue", "id":'$QID3'},
			   "weight": '$W3'}],
			"handle": {"scope": "node"},
			"parent": {"scope": "node", "id": 1},
			"bw-max": 3000000 }
{'ifindex': $IFINDEX, 'handle': {'scope': 'node', 'id': 2}}

Q1 -- node 2 ---
    /(bw-max:3M)\
Q3 /             \
         ---- node 1 \
        / (bw-max: 5M)\
      Q2              node 0 -------  netdev
                      (bw-max: 10M)

* Cleanup:

Still starting from config 1To delete a single queue shaper

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "queue", "id":'$QID3'}}'

Q1 -- node 2 ---
     (bw-max:3M)\
                 \
         ---- node 1 \
        / (bw-max: 5M)\
      Q2              node 0 -------  netdev
                      (bw-max: 10M)

Deleting a node shaper relinks all its leaves to the node's parent:

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "node", "id":2}}'

Q1 ---\
       \
        node 1----- \
       / (bw-max: 5M)\
Q2----/              node 0 -------  netdev
                     (bw-max: 10M)

Deleting the last shaper under a node shaper deletes the node, too:

./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "queue", "id":'$QID1'}}'
./tools/net/ynl/cli.py --spec $SPEC --do delete --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "queue", "id":'$QID2'}}'
./tools/net/ynl/cli.py --spec $SPEC --do get --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "node", "id": 1}}'
Netlink error: No such file or directory
nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -2
	extack: {'bad-attr': '.handle'}

Such delete recurses on parents that are left over with no leaves:

./tools/net/ynl/cli.py --spec $SPEC --do get --json \
	'{"ifindex": '$IFINDEX',
	  "handle": {"scope": "node", "id": 0}}'
Netlink error: No such file or directory
nl_len = 44 (28) nl_flags = 0x300 nl_type = 2
	error: -2
	extack: {'bad-attr': '.handle'}

v8: https://lore.kernel.org/cover.1727704215.git.pabeni@redhat.com
v7: https://lore.kernel.org/cover.1725919039.git.pabeni@redhat.com
v6: https://lore.kernel.org/cover.1725457317.git.pabeni@redhat.com
v5: https://lore.kernel.org/cover.1724944116.git.pabeni@redhat.com
v4: https://lore.kernel.org/cover.1724165948.git.pabeni@redhat.com
v3: https://lore.kernel.org/cover.1722357745.git.pabeni@redhat.com
RFC v2: https://lore.kernel.org/cover.1721851988.git.pabeni@redhat.com
RFC v1: https://lore.kernel.org/cover.1719518113.git.pabeni@redhat.com
====================

Link: https://patch.msgid.link/cover.1728460186.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 08:32:46 -07:00
Paolo Abeni
4b623f9f0f net-shapers: implement NL get operation
Introduce the basic infrastructure to implement the net-shaper
core functionality. Each network devices carries a net-shaper cache,
the NL get() operation fetches the data from such cache.

The cache is initially empty, will be fill by the set()/group()
operation implemented later and is destroyed at device cleanup time.

The net_shaper_fill_handle(), net_shaper_ctx_init(), and
net_shaper_generic_pre() implementations handle generic index type
attributes, despite the current caller always pass a constant value
to avoid more noise in later patches using them with different
attributes.

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Link: https://patch.msgid.link/ddd10fd645a9367803ad02fca4a5664ea5ace170.1728460186.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-10 08:30:22 -07:00
Jakub Kicinski
22ee378eb6 Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-10-08 (ice, iavf, igb, e1000e, e1000)

This series contains updates to ice, iavf, igb, e1000e, and e1000
drivers.

For ice:

Wojciech adds support for ethtool reset.

Paul adds support for hardware based VF mailbox limits for E830 devices.

Jake adjusts to a common iterator in ice_vc_cfg_qs_msg() and moves
storing of max_frame and rx_buf_len from VSI struct to the ring
structure.

Hongbo Li uses assign_bit() to replace an open-coded instance.

Markus Elfring adjusts a couple of PTP error paths to use a common,
shared exit point.

Yue Haibing removes unused declarations.

For iavf:

Yue Haibing removes unused declarations.

For igb:

Yue Haibing removes unused declarations.

For e1000e:

Takamitsu Iwai removes unneccessary writel() calls.

Joe Damato adds support for netdev-genl support to query IRQ, NAPI,
and queue information.

For e1000:

Joe Damato adds support for netdev-genl support to query IRQ, NAPI,
and queue information.

* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue:
  e1000: Link NAPI instances to queues and IRQs
  e1000e: Link NAPI instances to queues and IRQs
  e1000e: Remove duplicated writel() in e1000_configure_tx/rx()
  igb: Cleanup unused declarations
  iavf: Remove unused declarations
  ice: Cleanup unused declarations
  ice: Use common error handling code in two functions
  ice: Make use of assign_bit() API
  ice: store max_frame and rx_buf_len only in ice_rx_ring
  ice: consistently use q_idx in ice_vc_cfg_qs_msg()
  ice: add E830 HW VF mailbox message limit support
  ice: Implement ethtool reset support
====================

Link: https://patch.msgid.link/20241008233441.928802-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09 20:04:43 -07:00
Donald Hunter
54b771e6c6 doc: net: Fix .rst rendering of net_cachelines pages
The doc pages under /networking/net_cachelines are unreadable because
they lack .rst formatting for the tabular text.

Add simple table markup and tidy up the table contents:

- remove dashes that represent empty cells because they render
  as bullets and are not needed
- replace 'struct_*' with 'struct *' in the first column so that
  sphinx can render links for any structs that appear in the docs

Signed-off-by: Donald Hunter <donald.hunter@gmail.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20241008165329.45647-1-donald.hunter@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-09 17:34:49 -07:00
Johannes Berg
a0efa2f362 Merge net-next/main to resolve conflicts
The wireless-next tree was based on something older, and there
are now conflicts between -rc2 and work here. Merge net-next,
which has enough of -rc2 for the conflicts to happen, resolving
them in the process.

Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-09 08:59:22 +02:00
Wojciech Drewek
b699c81af0 ice: Implement ethtool reset support
Enable ethtool reset support. Ethtool reset flags are mapped to the
E810 reset type:
PF reset:
  $ ethtool --reset <ethX> irq dma filter offload
CORE reset:
  $ ethtool --reset <ethX> irq-shared dma-shared filter-shared \
    offload-shared ram-shared
GLOBAL reset:
  $ ethtool --reset <ethX> irq-shared dma-shared filter-shared \
    offload-shared mac-shared phy-shared ram-shared

Calling the same set of flags as in PF reset case on port representor
triggers VF reset.

Reviewed-by: Michal Swiatkowski <michal.swiatkowski@linux.intel.com>
Reviewed-by: Marcin Szycik <marcin.szycik@linux.intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Wojciech Drewek <wojciech.drewek@intel.com>
Tested-by: Pucha Himasekhar Reddy <himasekharx.reddy.pucha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
2024-10-08 14:37:15 -07:00
Johannes Berg
836265d316 wifi: remove iw_public_data from struct net_device
Given the previous patches, we no longer need the
struct iw_public_data etc., it's only used by the
old Intel drivers (and ps3_gelic creates it but
then doesn't use it). Remove all of that, including
the pointer in struct net_device.

Link: https://patch.msgid.link/20241007213525.8b2d52b60531.I6a27aaf30bded9a0977f07f47fba2bd31a3b3330@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-10-08 21:53:40 +02:00
Oleksij Rempel
e793b86ae4 Documentation: networking: add Twisted Pair Ethernet diagnostics at OSI Layer 1
This patch introduces a diagnostic guide for troubleshooting Twisted
Pair  Ethernet variants at OSI Layer 1. It provides detailed steps for
detecting  and resolving common link issues, such as incorrect wiring,
cable damage,  and power delivery problems. The guide also includes
interface verification  steps and PHY-specific diagnostics.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://patch.msgid.link/20241004121824.1716303-1-o.rempel@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-08 11:45:35 +02:00
Eric Dumazet
a3f5f4c2f9 ipv4: remove fib_info_devhash[]
Upcoming per-netns RTNL conversion needs to get rid
of shared hash tables.

fib_info_devhash[] is one of them.

It is unclear why we used a hash table, because
a single hlist_head per net device was cheaper and scalable.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://patch.msgid.link/20241004134720.579244-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-07 16:46:27 -07:00
Eric Dumazet
f858cc9eed net: add IFLA_MAX_PACING_OFFLOAD_HORIZON device attribute
Some network devices have the ability to offload EDT (Earliest
Departure Time) which is the model used for TCP pacing and FQ
packet scheduler.

Some of them implement the timing wheel mechanism described in
https://saeed.github.io/files/carousel-sigcomm17.pdf
with an associated 'timing wheel horizon'.

This patch adds dev->max_pacing_offload_horizon expressing
this timing wheel horizon in nsec units.

This is a read-only attribute.

Unless a driver sets it, dev->max_pacing_offload_horizon
is zero.

v2: addressed Jakub feedback ( https://lore.kernel.org/netdev/20240930152304.472767-2-edumazet@google.com/T/#mf6294d714c41cc459962154cc2580ce3c9693663 )
v3: added yaml doc (also per Jakub feedback)

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20241003121219.2396589-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04 15:37:53 -07:00
Vadim Fedorenko
4aecca4c76 net_tstamp: add SCM_TS_OPT_ID to provide OPT_ID in control message
SOF_TIMESTAMPING_OPT_ID socket option flag gives a way to correlate TX
timestamps and packets sent via socket. Unfortunately, there is no way
to reliably predict socket timestamp ID value in case of error returned
by sendmsg. For UDP sockets it's impossible because of lockless
nature of UDP transmit, several threads may send packets in parallel. In
case of RAW sockets MSG_MORE option makes things complicated. More
details are in the conversation [1].
This patch adds new control message type to give user-space
software an opportunity to control the mapping between packets and
values by providing ID with each sendmsg for UDP sockets.
The documentation is also added in this patch.

[1] https://lore.kernel.org/netdev/CALCETrU0jB+kg0mhV6A8mrHfTE1D1pr1SD_B9Eaa9aDPfgHdtA@mail.gmail.com/

Reviewed-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241001125716.2832769-2-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-04 11:52:19 -07:00
Leo Stone
2d7a098b9d Documentation: networking/tcp_ao: typo and grammar fixes
Fix multiple grammatical issues and add a missing period to improve
readability.

Signed-off-by: Leo Stone <leocstone@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240929005001.370991-1-leocstone@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-03 16:38:48 -07:00
Sean Anderson
b63ad06ddd doc: net: napi: Update documentation for napi_schedule_irqoff
Since commit 8380c81d5c ("net: Treat __napi_schedule_irqoff() as
__napi_schedule() on PREEMPT_RT"), napi_schedule_irqoff will do the
right thing if IRQs are threaded. Therefore, there is no need to use
IRQF_NO_THREAD.

Signed-off-by: Sean Anderson <sean.anderson@linux.dev>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://patch.msgid.link/20240930153955.971657-1-sean.anderson@linux.dev
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-10-03 12:07:29 +02:00
谢致邦 (XIE Zhibang)
aa758763be docs: tproxy: ignore non-transparent sockets in iptables
The iptables example was added in commit d2f26037a3 (netfilter: Add
documentation for tproxy, 2008-10-08), but xt_socket 'transparent'
option was added in commit a31e1ffd22 (netfilter: xt_socket: added new
revision of the 'socket' match supporting flags, 2009-06-09).

Now add the 'transparent' option to the iptables example to ignore
non-transparent sockets, which is also consistent with the nft example.

Signed-off-by: 谢致邦 (XIE Zhibang) <Yeking@Red54.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-09-26 13:03:01 +02:00
Kory Maincent
9297886f9f Documentation: networking: Fix missing PSE documentation and grammar issues
Fix a missing end of phrase in the documentation. It describes the
ETHTOOL_A_C33_PSE_ACTUAL_PW attribute, which was not fully explained.

Also, fix grammar issues by using simple present tense instead of
present continuous.

Reviewed-by: Oleksij Rempel <o.rempel@pengutronix.de>
Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240912090550.743174-1-kory.maincent@bootlin.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-13 19:59:53 -07:00
Dragos Tatulea
909fc8d107 net/mlx5e: SHAMPO, Add no-split ethtool counters for header/data split
When SHAMPO can't identify the protocol/header of a packet, it will
yield a packet that is not split - all the packet is in the data part.
Count this value in packets and bytes.

Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Link: https://patch.msgid.link/20240911201757.1505453-15-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-12 20:50:30 -07:00
David Arinzon
49f66e1216 net: ena: Add ENA Express metrics support
ENA Express metrics, called `ena_srd` are exposed to
customers via `ethtool`.
The metrics allow customers to check the configuration
(mode), tx/rx counters as well as resource utilization.

The documentation is also updated to provide a general
explanation about ENA Express as well as links for further
information about metrics and configurations.

Signed-off-by: Igor Chauskin <igorch@amazon.com>
Signed-off-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20240909084704.13856-2-darinzon@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-12 18:01:17 -07:00
Parthiban Veerasooran
b3e33f2c54 Documentation: networking: add OPEN Alliance 10BASE-T1x MAC-PHY serial interface
The IEEE 802.3cg project defines two 10 Mbit/s PHYs operating over a
single pair of conductors. The 10BASE-T1L (Clause 146) is a long reach
PHY supporting full duplex point-to-point operation over 1 km of single
balanced pair of conductors. The 10BASE-T1S (Clause 147) is a short reach
PHY supporting full / half duplex point-to-point operation over 15 m of
single balanced pair of conductors, or half duplex multidrop bus
operation over 25 m of single balanced pair of conductors.

Furthermore, the IEEE 802.3cg project defines the new Physical Layer
Collision Avoidance (PLCA) Reconciliation Sublayer (Clause 148) meant to
provide improved determinism to the CSMA/CD media access method. PLCA
works in conjunction with the 10BASE-T1S PHY operating in multidrop mode.

The aforementioned PHYs are intended to cover the low-speed / low-cost
applications in industrial and automotive environment. The large number
of pins (16) required by the MII interface, which is specified by the
IEEE 802.3 in Clause 22, is one of the major cost factors that need to be
addressed to fulfil this objective.

The MAC-PHY solution integrates an IEEE Clause 4 MAC and a 10BASE-T1x PHY
exposing a low pin count Serial Peripheral Interface (SPI) to the host
microcontroller. This also enables the addition of Ethernet functionality
to existing low-end microcontrollers which do not integrate a MAC
controller.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Parthiban Veerasooran <Parthiban.Veerasooran@microchip.com>
Link: https://patch.msgid.link/20240909082514.262942-2-Parthiban.Veerasooran@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 20:53:42 -07:00
Mina Almasry
09d1db26b5 net: add devmem TCP documentation
Add documentation outlining the usage and details of devmem TCP.

Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Link: https://patch.msgid.link/20240910171458.219195-12-almasrymina@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 20:44:32 -07:00
Matthieu Baerts (NGI0)
27069e7cb3 mptcp: disable active MPTCP in case of blackhole
An MPTCP firewall blackhole can be detected if the following SYN
retransmission after a fallback to "plain" TCP is accepted.

In case of blackhole, a similar technique to the one in place with TFO
is now used: MPTCP can be disabled for a certain period of time, 1h by
default. This time period will grow exponentially when more blackhole
issues get detected right after MPTCP is re-enabled and will reset to
the initial value when the blackhole issue goes away.

The blackhole period can be modified thanks to a new sysctl knob:
blackhole_timeout. Two new MIB counters help understanding what's
happening:

- 'Blackhole', incremented when a blackhole is detected.
- 'MPCapableSYNTXDisabled', incremented when an MPTCP connection
  directly falls back to TCP during the blackhole period.

Because the technique is inspired by the one used by TFO, an important
part of the new code is similar to what can find in tcp_fastopen.c, with
some adaptations to the MPTCP case.

Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/57
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20240909-net-next-mptcp-fallback-x-mpc-v1-3-da7ebb4cd2a3@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-11 15:57:50 -07:00
Jakub Kicinski
474bb1aa45 mlx5-updates-2024-08-29
HW-Managed Flow Steering in mlx5 driver
 
 Yevgeny Kliteynik says:
 =======================
 
 1. Overview
 -----------
 
 ConnectX devices support packet matching, modification, and redirection.
 This functionality is referred as Flow Steering.
 To configure a steering rule, the rule is written to the device-owned
 memory. This memory is accessed and cached by the device when processing
 a packet.
 
 The first implementation of Flow Steering was done in FW, and it is
 referred in the mlx5 driver as Device-Managed Flow Steering (DMFS).
 Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
 driver is writing directly to the device's configuration memory (ICM)
 through RC QP using RDMA operations (RDMA-read and RDAM-write), thus
 achieving higher rates of rule insertion/deletion.
 
 Now we introduce a new flow steering implementation: HW-Managed Flow
 Steering (HWS or HMFS).
 
 In this new approach, the driver is configuring steering rules directly
 to the HW using the WQs with a special new type of WQE. This way we can
 reach higher rule insertion/deletion rate with much lower CPU utilization
 compared to SWS.
 
 The key benefits of HWS as opposed to SWS:
 + HW manages the steering decision tree
    - HW calculates CRC for each entry
    - HW handles tree hash collisions
    - HW & FW manage objects refcount
 + HW keeps cache coherency:
    - HW provides tree access locking and synchronization
    - HW provides notification on completion
 + Insertion rate isn’t affected by background traffic
    - Dedicated HW components that handle insertion
 
 2. Performance
 --------------
 
 Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
 are able to get ~5 times more flows offloaded per second using HWS.
 
 3. Configuration
 ----------------
 
 The enablement of HWS mode in eswitch manager is done using the same
 devlink param that is already used for switching between FW-managed
 steering and SW-managed steering modes:
 
   # devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs
 
 4. Upstream Submission
 ----------------------
 
 HWS support consists of 3 main components:
 + Steering:
    - The lower layer that exposes HWS API to upper layers and implements
      all the management of flow steering building blocks
 + FS-Core
    - Implementation of fs_hws layer to enable fs_core to use HWS instead
      of FW or SW steering
    - Create HW steering action pools to utilize the ability of HWS to
      share steering actions among different rules
    - Add support for configuring HWS mode through devlink command,
      similar to configuring SWS mode
 + Connection Tracking
    - Implementation of CT support for HW steering
    - Hooks up the CT ops for the new steering mode and uses the HWS API
      to implement connection tracking.
 
 Because of the large number of patches, we need to perform the submission
 in several separate patch series. This series is the first submission that
 lays the ground work for the next submissions, where an actual user of HWS
 will be added.
 
 5. Patches in this series
 -------------------------
 
 This patch series contains implementation of the first bullet from above.
 The patches are:
 
 [patch 01/15] net/mlx5: Added missing mlx5_ifc definition for HW Steering
 [patch 02/15] net/mlx5: Added missing definitions in preparation for HW Steering
 [patch 03/15] net/mlx5: HWS, added actions handling
 [patch 04/15] net/mlx5: HWS, added tables handling
 [patch 05/15] net/mlx5: HWS, added rules handling
 [patch 06/15] net/mlx5: HWS, added definers handling
 [patch 07/15] net/mlx5: HWS, added matchers functionality
 [patch 08/15] net/mlx5: HWS, added FW commands handling
 [patch 09/15] net/mlx5: HWS, added modify header pattern and args handling
 [patch 10/15] net/mlx5: HWS, added vport handling
 [patch 11/15] net/mlx5: HWS, added memory management handling
 [patch 12/15] net/mlx5: HWS, added backward-compatible API handling
 [patch 13/15] net/mlx5: HWS, added debug dump and internal headers
 [patch 14/15] net/mlx5: HWS, added send engine and context handling
 [patch 15/15] net/mlx5: HWS, added API and enabled HWS support
 
 =======================
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmbfOf4ACgkQSD+KveBX
 +j7hWgf/UzlKp8uyqb+7MpMWP6EgT8WUwWdpDfAr1jubIFz7e+VGaA/7QCThe89u
 alcgYvIDQCGrpB/0qXY+kaKPvqOej2wPCiLU7K2JB5y20pZ/RATlFuFBZjsMzufJ
 7NxzZgTPUDz+8OWK0mm0LxEJRJYoJ69gAnR0jvLGx9uSjv/f9lNICvWBaI58hkzb
 HJa6sJNBiFj4EnkipxWCP0GQ4dddMkgCIVYb91FtlBA4SGZtmPS35NqQJKtGnKF3
 ZhZuaTeRdw8bFDJnhbu0ur9cs4EUorZE5QBWhoHYN0zFZF4JmqCCC1HvCS7LEKaU
 PgtREk20H2jPIRFwQuX05D6M4zSizg==
 =AsB3
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2024-08-29

HW-Managed Flow Steering in mlx5 driver

Yevgeny Kliteynik says:
=======================

1. Overview
-----------

ConnectX devices support packet matching, modification, and redirection.
This functionality is referred as Flow Steering.
To configure a steering rule, the rule is written to the device-owned
memory. This memory is accessed and cached by the device when processing
a packet.

The first implementation of Flow Steering was done in FW, and it is
referred in the mlx5 driver as Device-Managed Flow Steering (DMFS).
Later we introduced SW-managed Flow Steering (SWS or SMFS), where the
driver is writing directly to the device's configuration memory (ICM)
through RC QP using RDMA operations (RDMA-read and RDAM-write), thus
achieving higher rates of rule insertion/deletion.

Now we introduce a new flow steering implementation: HW-Managed Flow
Steering (HWS or HMFS).

In this new approach, the driver is configuring steering rules directly
to the HW using the WQs with a special new type of WQE. This way we can
reach higher rule insertion/deletion rate with much lower CPU utilization
compared to SWS.

The key benefits of HWS as opposed to SWS:
+ HW manages the steering decision tree
   - HW calculates CRC for each entry
   - HW handles tree hash collisions
   - HW & FW manage objects refcount
+ HW keeps cache coherency:
   - HW provides tree access locking and synchronization
   - HW provides notification on completion
+ Insertion rate isn’t affected by background traffic
   - Dedicated HW components that handle insertion

2. Performance
--------------

Measuring Connection Tracking with simple IPv4 flows w/o NAT, we
are able to get ~5 times more flows offloaded per second using HWS.

3. Configuration
----------------

The enablement of HWS mode in eswitch manager is done using the same
devlink param that is already used for switching between FW-managed
steering and SW-managed steering modes:

  # devlink dev param set pci/<PCI_ID> name flow_steering_mode cmod runtime value hmfs

4. Upstream Submission
----------------------

HWS support consists of 3 main components:
+ Steering:
   - The lower layer that exposes HWS API to upper layers and implements
     all the management of flow steering building blocks
+ FS-Core
   - Implementation of fs_hws layer to enable fs_core to use HWS instead
     of FW or SW steering
   - Create HW steering action pools to utilize the ability of HWS to
     share steering actions among different rules
   - Add support for configuring HWS mode through devlink command,
     similar to configuring SWS mode
+ Connection Tracking
   - Implementation of CT support for HW steering
   - Hooks up the CT ops for the new steering mode and uses the HWS API
     to implement connection tracking.

Because of the large number of patches, we need to perform the submission
in several separate patch series. This series is the first submission that
lays the ground work for the next submissions, where an actual user of HWS
will be added.

5. Patches in this series
-------------------------

This patch series contains implementation of the first bullet from above.

=======================

* tag 'mlx5-updates-2024-09-02' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5: HWS, added API and enabled HWS support
  net/mlx5: HWS, added send engine and context handling
  net/mlx5: HWS, added debug dump and internal headers
  net/mlx5: HWS, added backward-compatible API handling
  net/mlx5: HWS, added memory management handling
  net/mlx5: HWS, added vport handling
  net/mlx5: HWS, added modify header pattern and args handling
  net/mlx5: HWS, added FW commands handling
  net/mlx5: HWS, added matchers functionality
  net/mlx5: HWS, added definers handling
  net/mlx5: HWS, added rules handling
  net/mlx5: HWS, added tables handling
  net/mlx5: HWS, added actions handling
  net/mlx5: Added missing definitions in preparation for HW Steering
  net/mlx5: Added missing mlx5_ifc definition for HW Steering
====================

Link: https://patch.msgid.link/20240909181250.41596-1-saeed@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 20:01:15 -07:00
Jason Xing
be8e9eb375 net-timestamp: introduce SOF_TIMESTAMPING_OPT_RX_FILTER flag
introduce a new flag SOF_TIMESTAMPING_OPT_RX_FILTER in the receive
path. User can set it with SOF_TIMESTAMPING_SOFTWARE to filter
out rx software timestamp report, especially after a process turns on
netstamp_needed_key which can time stamp every incoming skb.

Previously, we found out if an application starts first which turns on
netstamp_needed_key, then another one only passing SOF_TIMESTAMPING_SOFTWARE
could also get rx timestamp. Now we handle this case by introducing this
new flag without breaking users.

Quoting Willem to explain why we need the flag:
"why a process would want to request software timestamp reporting, but
not receive software timestamp generation. The only use I see is when
the application does request
SOF_TIMESTAMPING_SOFTWARE | SOF_TIMESTAMPING_TX_SOFTWARE."

Similarly, this new flag could also be used for hardware case where we
can set it with SOF_TIMESTAMPING_RAW_HARDWARE, then we won't receive
hardware receive timestamp.

Another thing about errqueue in this patch I have a few words to say:
In this case, we need to handle the egress path carefully, or else
reporting the tx timestamp will fail. Egress path and ingress path will
finally call sock_recv_timestamp(). We have to distinguish them.
Errqueue is a good indicator to reflect the flow direction.

Suggested-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20240909015612.3856-2-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:55:23 -07:00
Jason Xing
e503f82e30 net-timestamp: correct the use of SOF_TIMESTAMPING_RAW_HARDWARE
SOF_TIMESTAMPING_RAW_HARDWARE is a report flag which passes the
timestamps generated by either SOF_TIMESTAMPING_TX_HARDWARE or
SOF_TIMESTAMPING_RX_HARDWARE to the userspace all the time.

So let us revise the doc here.

Link: https://lore.kernel.org/all/66d8c21d3042a_163d93294cb@willemb.c.googlers.com.notmuch/
Suggested-by: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Jason Xing <kernelxing@tencent.com>
Link: https://patch.msgid.link/20240908124141.39628-1-kerneljasonxing@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-10 16:55:08 -07:00
Lee Trager
0246388b9b eth: fbnic: Add devlink firmware version info
This adds support to show firmware version information for both stored and
running firmware versions. The version and commit is displayed separately
to aid monitoring tools which only care about the version.

Example output:
  # devlink dev info
  pci/0000:01:00.0:
    driver fbnic
    serial_number 88-25-08-ff-ff-01-50-92
    versions:
        running:
          fw 24.07.15-017
          fw.commit h999784ae9df0
          fw.bootloader 24.07.10-000
          fw.bootloader.commit hfef3ac835ce7
        stored:
          fw 24.07.24-002
          fw.commit hc9d14a68b3f2
          fw.bootloader 24.07.22-000
          fw.bootloader.commit h922f8493eb96
          fw.undi 01.00.03-000

Signed-off-by: Lee Trager <lee@trager.us>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://patch.msgid.link/20240905233820.1713043-1-lee@trager.us
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-10 12:00:19 +02:00
Yevgeny Kliteynik
510f9f61a1 net/mlx5: HWS, added API and enabled HWS support
Enabling HWS support in the mlx5 driver:
 - added HWS API header
 - added HWS files in the mlx5 driver makefile
 - added kconfig flag that enables HWS compilation

Reviewed-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
2024-09-09 11:10:06 -07:00
Joe Damato
08062af0a5 net: napi: Prevent overflow of napi_defer_hard_irqs
In commit 6f8b12d661 ("net: napi: add hard irqs deferral feature")
napi_defer_irqs was added to net_device and napi_defer_irqs_count was
added to napi_struct, both as type int.

This value never goes below zero, so there is not reason for it to be a
signed int. Change the type for both from int to u32, and add an
overflow check to sysfs to limit the value to S32_MAX.

The limit of S32_MAX was chosen because the practical limit before this
patch was S32_MAX (anything larger was an overflow) and thus there are
no behavioral changes introduced. If the extra bit is needed in the
future, the limit can be raised.

Before this patch:

$ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
$ cat /sys/class/net/eth4/napi_defer_hard_irqs
-2147483647

After this patch:

$ sudo bash -c 'echo 2147483649 > /sys/class/net/eth4/napi_defer_hard_irqs'
bash: line 0: echo: write error: Numerical result out of range

Similarly, /sys/class/net/XXXXX/tx_queue_len is defined as unsigned:

include/linux/netdevice.h:      unsigned int            tx_queue_len;

And has an overflow check:

dev_change_tx_queue_len(..., unsigned long new_len):

  if (new_len != (unsigned int)new_len)
          return -ERANGE;

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Joe Damato <jdamato@fastly.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240904153431.307932-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-09-05 18:42:56 -07:00
Alexander Lobakin
782dbbf589 netdev_features: convert NETIF_F_FCOE_MTU to dev->fcoe_mtu
Ability to handle maximum FCoE frames of 2158 bytes can never be changed
and thus more of an attribute, not a toggleable feature.
Move it from netdev_features_t to "cold" priv flags (bitfield bool) and
free yet another feature bit.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Alexander Lobakin
05c1280a2b netdev_features: convert NETIF_F_NETNS_LOCAL to dev->netns_local
"Interface can't change network namespaces" is rather an attribute,
not a feature, and it can't be changed via Ethtool.
Make it a "cold" private flag instead of a netdev_feature and free
one more bit.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Alexander Lobakin
00d066a4d4 netdev_features: convert NETIF_F_LLTX to dev->lltx
NETIF_F_LLTX can't be changed via Ethtool and is not a feature,
rather an attribute, very similar to IFF_NO_QUEUE (and hot).
Free one netdev_features_t bit and make it a "hot" private flag.

Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Alexander Lobakin
beb5a9bea8 netdevice: convert private flags > BIT(31) to bitfields
Make dev->priv_flags `u32` back and define bits higher than 31 as
bitfield booleans as per Jakub's suggestion. This simplifies code
which accesses these bits with no optimization loss (testb both
before/after), allows to not extend &netdev_priv_flags each time,
but also scales better as bits > 63 in the future would only add
a new u64 to the structure with no complications, comparing to
that extending ::priv_flags would require converting it to a bitmap.
Note that I picked `unsigned long :1` to not lose any potential
optimizations comparing to `bool :1` etc.

Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-09-03 11:36:43 +02:00
Joe Damato
6af91e3d2c Documentation: Add missing fields to net_cachelines
Two fields, page_pools and *irq_moder, were added to struct net_device
in commit 083772c9f9 ("net: page_pool: record pools per netdev") and
commit f750dfe825 ("ethtool: provide customized dim profile
management"), respectively.

Add both to the net_cachelines documentation, as well.

Signed-off-by: Joe Damato <jdamato@fastly.com>
Link: https://patch.msgid.link/20240829155742.366584-1-jdamato@fastly.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-30 11:14:58 -07:00
Oleksij Rempel
abcd3026dd ethtool: Extend cable testing interface with result source information
Extend the ethtool netlink cable testing interface by adding support for
specifying the source of cable testing results. This allows users to
differentiate between results obtained through different diagnostic
methods.

For example, some TI 10BaseT1L PHYs provide two variants of cable
diagnostics: Time Domain Reflectometry (TDR) and Active Link Cable
Diagnostic (ALCD). By introducing `ETHTOOL_A_CABLE_RESULT_SRC` and
`ETHTOOL_A_CABLE_FAULT_LENGTH_SRC` attributes, this update enables
drivers to indicate whether the result was derived from TDR or ALCD,
improving the clarity and utility of diagnostic information.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20240822120703.1393130-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-26 09:33:58 -07:00
Maxime Chevallier
db31e09d51 Documentation: networking: document phy_link_topology
The newly introduced phy_link_topology tracks all ethernet PHYs that are
attached to a netdevice. Document the base principle, internal and
external APIs. As the phy_link_topology is expected to be extended, this
documentation will hold any further improvements and additions made
relative to topology handling.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:35 +01:00
Maxime Chevallier
17194be4c8 net: ethtool: Introduce a command to list PHYs on an interface
As we have the ability to track the PHYs connected to a net_device
through the link_topology, we can expose this list to userspace. This
allows userspace to use these identifiers for phy-specific commands and
take the decision of which PHY to target by knowing the link topology.

Add PHY_GET and PHY_DUMP, which can be a filtered DUMP operation to list
devices on only one interface.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:34 +01:00
Maxime Chevallier
c15e065b46 net: ethtool: Allow passing a phy index for some commands
Some netlink commands are target towards ethernet PHYs, to control some
of their features. As there's several such commands, add the ability to
pass a PHY index in the ethnl request, which will populate the generic
ethnl_req_info with the passed phy_index.

Add a helper that netlink command handlers need to use to grab the
targeted PHY from the req_info. This helper needs to hold rtnl_lock()
while interacting with the PHY, as it may be removed at any point.

Signed-off-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Tested-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-08-23 13:04:34 +01:00
Tariq Toukan
9480fd0cd8 docs: networking: Align documentation with behavior change
Following commit 9f7e8fbb91 ("net/mlx5: offset comp irq index in name by one"),
which fixed the index in IRQ name to start once again from 0, we change
the documentation accordingly.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://patch.msgid.link/20240815142343.2254247-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-16 14:29:44 -07:00
Jing-Ping Jan
baae8b0ba8 Documentation: networking: correct spelling
Correct spelling problems for Documentation/networking/ as reported
by ispell.

Signed-off-by: Jing-Ping Jan <zoo868e@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240812170910.5760-1-zoo868e@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-08-13 15:57:43 -07:00