mirror_ubuntu-kernels/net/sched
Xin Long a12c76a033 net: sched: refine software bypass handling in tc_run
This patch addresses issues with filter counting in block (tcf_block),
particularly for software bypass scenarios, by introducing a more
accurate mechanism using useswcnt.

Previously, filtercnt and skipswcnt were introduced by:

  Commit 2081fd3445 ("net: sched: cls_api: add filter counter") and
  Commit f631ef39d8 ("net: sched: cls_api: add skip_sw counter")

  filtercnt tracked all tp (tcf_proto) objects added to a block, and
  skipswcnt counted tp objects with the skipsw attribute set.

The problem is: a single tp can contain multiple filters, some with skipsw
and others without. The current implementation fails in the case:

  When the first filter in a tp has skipsw, both skipswcnt and filtercnt
  are incremented, then adding a second filter without skipsw to the same
  tp does not modify these counters because tp->counted is already set.

  This results in bypass software behavior based solely on skipswcnt
  equaling filtercnt, even when the block includes filters without
  skipsw. Consequently, filters without skipsw are inadvertently bypassed.

To address this, the patch introduces useswcnt in block to explicitly count
tp objects containing at least one filter without skipsw. Key changes
include:

  Whenever a filter without skipsw is added, its tp is marked with usesw
  and counted in useswcnt. tc_run() now uses useswcnt to determine software
  bypass, eliminating reliance on filtercnt and skipswcnt.

  This refined approach prevents software bypass for blocks containing
  mixed filters, ensuring correct behavior in tc_run().

Additionally, as atomic operations on useswcnt ensure thread safety and
tp->lock guards access to tp->usesw and tp->counted, the broader lock
down_write(&block->cb_lock) is no longer required in tc_new_tfilter(),
and this resolves a performance regression caused by the filter counting
mechanism during parallel filter insertions.

  The improvement can be demonstrated using the following script:

  # cat insert_tc_rules.sh

    tc qdisc add dev ens1f0np0 ingress
    for i in $(seq 16); do
        taskset -c $i tc -b rules_$i.txt &
    done
    wait

  Each of rules_$i.txt files above includes 100000 tc filter rules to a
  mlx5 driver NIC ens1f0np0.

  Without this patch:

  # time sh insert_tc_rules.sh

    real    0m50.780s
    user    0m23.556s
    sys	    4m13.032s

  With this patch:

  # time sh insert_tc_rules.sh

    real    0m17.718s
    user    0m7.807s
    sys     3m45.050s

Fixes: 047f340b36 ("net: sched: make skip_sw actually skip software")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Tested-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2025-01-20 09:21:27 +00:00
..
act_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-25 09:08:22 +02:00
act_bpf.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2024-05-23 14:14:23 -07:00
act_connmark.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_csum.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_ct.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_ctinfo.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_gact.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_gate.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_ife.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_meta_mark.c
act_meta_skbprio.c
act_meta_skbtcindex.c
act_mirred.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-22 15:29:26 -08:00
act_mpls.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_nat.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_pedit.c net: sched: Annotate struct tc_pedit with __counted_by 2024-02-19 10:58:24 +00:00
act_police.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_sample.c net: sched: act_sample: add action cookie to sample 2024-07-05 17:45:47 -07:00
act_simple.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_skbedit.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_skbmod.c net/sched: act_skbmod: convert comma to semicolon 2024-07-11 17:12:15 -07:00
act_tunnel_key.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
act_vlan.c tc: adjust network header after 2nd vlan push 2024-08-27 11:37:42 +02:00
cls_api.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
cls_basic.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_bpf.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
cls_cgroup.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_flow.c net_sched: cls_flow: validate TCA_FLOW_RSHIFT attribute 2025-01-04 08:49:36 -08:00
cls_flower.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
cls_fw.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_matchall.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
cls_route.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_u32.c net: sched: refine software bypass handling in tc_run 2025-01-20 09:21:27 +00:00
em_canid.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_cmp.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
em_ipset.c
em_ipt.c
em_meta.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_nbyte.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_text.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_u32.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
ematch.c net_sched: reject TCF_EM_SIMPLE case for complex ematch module 2022-12-19 09:43:18 +00:00
Kconfig net: sched: Remove NET_ACT_IPT from Kconfig 2024-02-13 11:24:35 +01:00
Makefile net/sched: Retire ipt action 2024-01-02 12:41:16 +00:00
sch_api.c net: tc: improve qdisc error messages 2025-01-17 19:52:06 -08:00
sch_blackhole.c Revert "net: sched: Pass root lock to Qdisc_ops.enqueue" 2020-07-16 16:48:34 -07:00
sch_cake.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2025-01-09 16:11:47 -08:00
sch_cbs.c net/sched: cbs: Fix integer overflow in cbs_set_port_rate() 2024-10-15 18:25:47 -07:00
sch_choke.c net: sched: fix ordering of qlen adjustment 2024-12-04 12:54:22 +00:00
sch_codel.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_drr.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_etf.c net_sched: sch_tfs: implement lockless etf_dump() 2024-04-19 11:34:07 +01:00
sch_ets.c net_sched: sch_ets: implement lockless ets_dump() 2024-04-19 11:34:07 +01:00
sch_fifo.c net_sched: sch_fifo: implement lockless __fifo_dump() 2024-04-19 11:34:07 +01:00
sch_fq_codel.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_fq_pie.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_fq.c net_sched: sch_fq: add three drop_reason 2024-12-05 17:39:04 -08:00
sch_frag.c net: dst: remove unnecessary input parameter in dst_alloc and dst_init 2023-09-12 11:42:25 +02:00
sch_generic.c net: sched: calls synchronize_net() only when needed 2025-01-14 10:17:53 +01:00
sch_gred.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_hfsc.c net_sched: sch_hfsc: implement lockless accesses to q->defcls 2024-04-19 11:34:08 +01:00
sch_hhf.c net_sched: sch_hhf: implement lockless hhf_dump() 2024-04-19 11:34:08 +01:00
sch_htb.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_ingress.c bpf: Fix too early release of tcx_entry 2024-07-08 14:07:31 -07:00
sch_mq.c net: sched: add rcu annotations around qdisc->qdisc_sleeping 2023-06-07 10:25:39 +01:00
sch_mqprio_lib.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_mqprio_lib.h net/sched: mqprio: allow per-TC user input of FP adminStatus 2023-04-13 22:22:10 -07:00
sch_mqprio.c netlink: introduce type-checking attribute iteration 2024-03-29 15:06:02 -07:00
sch_multiq.c net: sched: sch_multiq: fix possible OOB write in multiq_tune() 2024-06-05 10:50:19 +01:00
sch_netem.c net/sched: netem: account for backlog updates from child qdisc 2024-12-11 20:26:37 -08:00
sch_pie.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_plug.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_prio.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_qfq.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_red.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_sfb.c net/sched: Add drop reasons for AQM-based qdiscs 2024-12-17 13:27:29 +01:00
sch_sfq.c net_sched: sch_sfq: don't allow 1 packet limit 2024-12-05 18:02:10 -08:00
sch_skbprio.c net_sched: sch_skbprio: implement lockless skbprio_dump() 2024-04-19 11:34:08 +01:00
sch_taprio.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_tbf.c net/sched: tbf: correct backlog statistic for GSO packets 2024-11-30 13:02:43 -08:00
sch_teql.c net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00