mirror_ubuntu-kernels/net/ipv4
Eric Dumazet 124c4c32e9 tcp: add TCP_RFC7323_PAWS_ACK drop reason
XPS can cause reorders because of the relaxed OOO
conditions for pure ACK packets.

For hosts not using RFS, what can happpen is that ACK
packets are sent on behalf of the cpu processing NIC
interrupts, selecting TX queue A for ACK packet P1.

Then a subsequent sendmsg() can run on another cpu.
TX queue selection uses the socket hash and can choose
another queue B for packets P2 (with payload).

If queue A is more congested than queue B,
the ACK packet P1 could be sent on the wire after
P2.

A linux receiver when processing P1 (after P2) currently increments
LINUX_MIB_PAWSESTABREJECTED (TcpExtPAWSEstab)
and use TCP_RFC7323_PAWS drop reason.
It might also send a DUPACK if not rate limited.

In order to better understand this pattern, this
patch adds a new drop_reason : TCP_RFC7323_PAWS_ACK.

For old ACKS like these, we no longer increment
LINUX_MIB_PAWSESTABREJECTED and no longer sends a DUPACK,
keeping credit for other more interesting DUPACK.

perf record -e skb:kfree_skb -a
perf script
...
         swapper       0 [148] 27475.438637: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438706: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.438908: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439010: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [148] 27475.439214: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
         swapper       0 [208] 27475.439286: skb:kfree_skb: ... location=tcp_validate_incoming+0x4f0 reason: TCP_RFC7323_PAWS_ACK
...

Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Neal Cardwell <ncardwell@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Link: https://patch.msgid.link/20250113135558.3180360-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-01-14 13:28:13 -08:00
..
netfilter netfilter: nf_dup4: Convert nf_dup_ipv4_route() to dscp_t. 2024-11-15 11:00:29 +01:00
af_inet.c ipv4: Define inet_sk_init_flowi4() and use it in inet_sk_rebuild_header(). 2024-12-20 13:50:09 -08:00
ah4.c
arp.c neighbour: Remove bare neighbour::next pointer 2024-11-09 13:22:57 -08:00
bpf_tcp_ca.c bpf: Check unsupported ops from the bpf_struct_ops's cfi_stubs 2024-07-29 12:54:13 -07:00
cipso_ipv4.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
datagram.c ipv4: Use inet_sk_init_flowi4() in ip4_datagram_release_cb(). 2024-12-20 13:50:09 -08:00
devinet.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
esp4_offload.c xfrm: Add an inbound percpu state cache. 2024-10-29 11:56:18 +01:00
esp4.c xfrm: add generic iptfs defines and functionality 2024-12-05 10:01:28 +01:00
fib_frontend.c net: ip: fix unexpected return in fib_validate_source() 2024-11-18 18:57:00 -08:00
fib_lookup.h
fib_notifier.c net: do not acquire rtnl in fib_seq_sum() 2024-10-11 15:35:05 -07:00
fib_rules.c ipv4: fib_rules: Reject flow label attributes 2024-12-19 16:02:21 +01:00
fib_semantics.c ipv4: remove fib_info_devhash[] 2024-10-07 16:46:27 -07:00
fib_trie.c ipv4: output metric as unsigned int 2024-12-15 13:13:40 -08:00
fou_bpf.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
fou_core.c fou: fix initialization of grc 2024-09-09 17:21:47 -07:00
fou_nl.c tools: ynl-gen: use big-endian netlink attribute types 2024-10-22 15:33:24 +02:00
fou_nl.h
gre_demux.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
gre_offload.c
icmp.c inetpeer: do not get a refcount in inet_getpeer() 2024-12-17 19:37:48 -08:00
igmp.c netlink: correct nlmsg size for multicast notifications 2024-12-23 10:26:43 -08:00
inet_connection_sock.c ipv4: Use inet_sk_init_flowi4() in inet_csk_rebuild_route(). 2024-12-20 13:50:09 -08:00
inet_diag.c tcp: annotate data-races around icsk->icsk_pending 2024-10-04 15:34:39 -07:00
inet_fragment.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2024-05-23 14:14:23 -07:00
inet_hashtables.c inet: constify 'struct net' parameter of various lookup helpers 2024-08-05 16:22:45 -07:00
inet_timewait_sock.c tcp: move inet_twsk_schedule helper out of header 2024-06-10 11:54:18 +01:00
inetpeer.c inetpeer: avoid false sharing in inet_peer_xrlim_allow() 2024-12-20 13:04:40 -08:00
ip_forward.c
ip_fragment.c inetpeer: do not get a refcount in inet_getpeer() 2024-12-17 19:37:48 -08:00
ip_gre.c gre: Drop ip_route_output_gre(). 2024-12-19 19:24:47 -08:00
ip_input.c ipv4: remove useless arg 2025-01-02 17:17:40 -08:00
ip_options.c net: ip: make ip_route_input() return drop reasons 2024-11-12 11:24:51 +01:00
ip_output.c ipv4: Use inet_sk_init_flowi4() in __ip_queue_xmit(). 2024-12-20 13:50:09 -08:00
ip_sockglue.c sock: support SO_PRIORITY cmsg 2024-12-16 18:13:44 -08:00
ip_tunnel_core.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
ip_tunnel.c net: Fix netns for ip_tunnel_init_flow() 2024-12-23 10:04:41 -08:00
ip_vti.c netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
ipcomp.c
ipconfig.c
ipip.c netdev_features: convert NETIF_F_LLTX to dev->lltx 2024-09-03 11:36:43 +02:00
ipmr_base.c ipmr: Fix access to mfc_cache_list without lock held 2024-11-13 19:09:42 -08:00
ipmr.c ipmr: tune the ipmr_can_free_table() checks. 2024-12-04 18:49:16 -08:00
Kconfig net/tcp: Expand goo.gl link 2024-07-30 18:35:12 -07:00
Makefile
metrics.c net: remove NULL-pointer net parameter in ip_metrics_convert 2024-06-05 10:06:00 +01:00
netfilter.c netfilter: ipv4: Convert ip_route_me_harder() to dscp_t. 2024-11-15 11:00:29 +01:00
netlink.c
nexthop.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
ping.c ping: use sk_skb_reason_drop to free rx packets 2024-06-19 12:44:22 +01:00
proc.c minmax: add a few more MIN_T/MAX_T users 2024-07-28 13:41:14 -07:00
protocol.c
raw_diag.c
raw.c sock: support SO_PRIORITY cmsg 2024-12-16 18:13:44 -08:00
route.c inetpeer: do not get a refcount in inet_getpeer() 2024-12-17 19:37:48 -08:00
syncookies.c tcp: use sk_skb_reason_drop to free rx packets 2024-06-19 12:44:22 +01:00
sysctl_net_ipv4.c tcp: Add sysctl to configure TIME-WAIT reuse delay 2024-12-11 20:17:33 -08:00
tcp_ao.c net/tcp: Add missing lockdep annotations for TCP-AO hlist traversals 2024-11-03 12:10:11 -08:00
tcp_bbr.c tcp: Add new args for cong_control in tcp_congestion_ops 2024-05-02 16:26:56 -07:00
tcp_bic.c
tcp_bpf.c tcp_bpf: Fix copied value in tcp_bpf_sendmsg 2024-12-20 22:53:36 +01:00
tcp_cdg.c
tcp_cong.c tcp: only release congestion control if it has been initialized 2024-10-31 18:22:48 -07:00
tcp_cubic.c
tcp_dctcp.c tcp: Fix shift-out-of-bounds in dctcp_update_alpha(). 2024-05-21 13:34:50 +02:00
tcp_dctcp.h
tcp_diag.c
tcp_fastopen.c net: use unrcu_pointer() helper 2024-06-06 11:52:52 +02:00
tcp_highspeed.c
tcp_htcp.c tcp: Use clamp() in htcp_alpha_update() 2024-08-06 12:16:25 -07:00
tcp_hybla.c
tcp_illinois.c
tcp_input.c tcp: add TCP_RFC7323_PAWS_ACK drop reason 2025-01-14 13:28:13 -08:00
tcp_ipv4.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-09 16:11:47 -08:00
tcp_lp.c
tcp_metrics.c tcp_metrics: use netlink policy for IPv6 addr len validation 2024-08-19 17:42:57 -07:00
tcp_minisocks.c tcp: Measure TIME-WAIT reuse delay with millisecond precision 2024-12-11 20:17:33 -08:00
tcp_nv.c
tcp_offload.c net: gso: fix tcp fraglist segmentation after pull from frag_list 2024-10-02 17:21:47 -07:00
tcp_output.c tcp: check space before adding MPTCP SYN options 2024-12-10 18:26:52 -08:00
tcp_plb.c
tcp_rate.c
tcp_recovery.c
tcp_scalable.c
tcp_sigpool.c net/tcp_sigpool: Use nested-BH locking for sigpool_scratch. 2024-06-24 16:41:22 -07:00
tcp_timer.c net: tcp: refresh tcp_mstamp for compressed ack in timer 2024-10-07 16:01:39 -07:00
tcp_ulp.c
tcp_vegas.c
tcp_vegas.h
tcp_veno.c
tcp_westwood.c
tcp_yeah.c
tcp.c tcp: only release congestion control if it has been initialized 2024-10-31 18:22:48 -07:00
tunnel4.c
udp_bpf.c
udp_diag.c
udp_impl.h
udp_offload.c gso: fix udp gso fraglist segmentation after pull from frag_list 2024-10-02 17:29:31 -07:00
udp_tunnel_core.c ipv4: udp_tunnel: Unmask upper DSCP bits in udp_tunnel_dst_lookup() 2024-09-09 14:14:53 +01:00
udp_tunnel_nic.c
udp_tunnel_stub.c
udp.c udp: Deal with race between UDP socket address change and rehash 2024-12-23 11:39:55 +00:00
udplite.c
xfrm4_input.c ipv4: Convert ip_route_input_noref() to dscp_t. 2024-10-03 16:21:21 -07:00
xfrm4_output.c
xfrm4_policy.c xfrm: Convert struct xfrm_dst_lookup_params -> tos to dscp_t. 2024-11-06 12:42:51 +01:00
xfrm4_protocol.c ipv4: Convert ip_route_input_noref() to dscp_t. 2024-10-03 16:21:21 -07:00
xfrm4_state.c
xfrm4_tunnel.c