mirror_ubuntu-kernels/kernel/trace
Andrii Nakryiko cdf355cc60 uprobes: add speculative lockless system-wide uprobe filter check
It's very common with BPF-based uprobe/uretprobe use cases to have
a system-wide (not PID specific) probes used. In this case uprobe's
trace_uprobe_filter->nr_systemwide counter is bumped at registration
time, and actual filtering is short circuited at the time when
uprobe/uretprobe is triggered.

This is a great optimization, and the only issue with it is that to even
get to checking this counter uprobe subsystem is taking
read-side trace_uprobe_filter->rwlock. This is actually noticeable in
profiles and is just another point of contention when uprobe is
triggered on multiple CPUs simultaneously.

This patch moves this nr_systemwide check outside of filter list's
rwlock scope, as rwlock is meant to protect list modification, while
nr_systemwide-based check is speculative and racy already, despite the
lock (as discussed in [0]). trace_uprobe_filter_remove() and
trace_uprobe_filter_add() already check for filter->nr_systewide
explicitly outside of __uprobe_perf_filter, so no modifications are
required there.

Confirming with BPF selftests's based benchmarks.

BEFORE (based on changes in previous patch)
===========================================
uprobe-nop     :    2.732 ± 0.022M/s
uprobe-push    :    2.621 ± 0.016M/s
uprobe-ret     :    1.105 ± 0.007M/s
uretprobe-nop  :    1.396 ± 0.007M/s
uretprobe-push :    1.347 ± 0.008M/s
uretprobe-ret  :    0.800 ± 0.006M/s

AFTER
=====
uprobe-nop     :    2.878 ± 0.017M/s (+5.5%, total +8.3%)
uprobe-push    :    2.753 ± 0.013M/s (+5.3%, total +10.2%)
uprobe-ret     :    1.142 ± 0.010M/s (+3.8%, total +3.8%)
uretprobe-nop  :    1.444 ± 0.008M/s (+3.5%, total +6.5%)
uretprobe-push :    1.410 ± 0.010M/s (+4.8%, total +7.1%)
uretprobe-ret  :    0.816 ± 0.002M/s (+2.0%, total +3.9%)

In the above, first percentage value is based on top of previous patch
(lazy uprobe buffer optimization), while the "total" percentage is
based on kernel without any of the changes in this patch set.

As can be seen, we get about 4% - 10% speed up, in total, with both lazy
uprobe buffer and speculative filter check optimizations.

  [0] https://lore.kernel.org/bpf/20240313131926.GA19986@redhat.com/

Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/all/20240318181728.2795838-4-andrii@kernel.org/

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
2024-05-01 23:18:46 +09:00
..
rv tracing/tools: Updates for 6.4 2023-04-28 16:11:26 -07:00
blktrace.c block: remove more NULL checks after bdev_get_queue() 2023-02-21 09:23:22 -07:00
bpf_trace.c bpf: support deferring bpf_link dealloc to after RCU grace period 2024-03-28 18:47:45 -07:00
bpf_trace.h
error_report-traces.c
fgraph.c tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
fprobe.c fprobe: Fix to allocate entry_data_size buffer with rethook instances 2024-03-01 09:18:24 +09:00
ftrace_internal.h tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
ftrace.c ftrace: Fix most kernel-doc warnings 2024-03-18 10:33:05 -04:00
Kconfig tracing: Fix FTRACE_RECORD_RECURSION_SIZE Kconfig entry 2024-04-11 17:45:18 -04:00
kprobe_event_gen_test.c tracing: Fix wrong return in kprobe_event_gen_test.c 2023-03-19 12:20:48 -04:00
Makefile tracing/probes: Move finding func-proto API and getting func-param API to trace_btf 2023-08-23 09:39:45 +09:00
pid_list.c
pid_list.h
power-traces.c
preemptirq_delay_test.c
rethook.c rethook: Use __rcu pointer for rethook::handler 2023-12-01 14:53:56 +09:00
ring_buffer_benchmark.c ring-buffer: Read and write to ring buffers with custom sub buffer size 2023-12-20 07:54:56 -05:00
ring_buffer.c ring-buffer: Only update pages_touched when a new page is touched 2024-04-11 17:49:57 -04:00
rpm-traces.c
synth_event_gen_test.c tracing / synthetic: Disable events after testing in synth_event_gen_test_init() 2023-12-21 10:04:45 -05:00
trace_benchmark.c tracing: Use div64_u64() instead of do_div() 2024-03-18 10:33:06 -04:00
trace_benchmark.h
trace_boot.c tracing: Allow creating instances with specified system events 2023-12-18 23:14:16 -05:00
trace_branch.c
trace_btf.c tracing/probes: Fix to search structure fields correctly 2024-02-17 21:25:42 +09:00
trace_btf.h tracing/probes: Add a function to search a member of a struct/union 2023-08-23 09:40:16 +09:00
trace_clock.c
trace_dynevent.c tracing: Free buffers when a used dynamic event is removed 2022-11-23 19:07:12 -05:00
trace_dynevent.h
trace_entries.h tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure 2023-07-30 18:11:44 -04:00
trace_eprobe.c tracing/probes: Support $argN in return probe (kprobe and fprobe) 2024-03-07 00:27:34 +09:00
trace_event_perf.c tracing/perf: Use strndup_user instead of kzalloc/strncpy_from_user 2022-11-23 19:08:31 -05:00
trace_events_filter_test.h
trace_events_filter.c tracing: Have trace_event_file have ref counters 2023-11-01 23:44:44 -04:00
trace_events_hist.c tracing histograms: Simplify parse_actions() function 2024-01-08 13:24:56 -05:00
trace_events_inject.c tracing: Have event inject files inc the trace array ref count 2023-09-07 16:38:54 -04:00
trace_events_synth.c tracing/synthetic: Fix trace_string() return value 2024-02-15 11:40:01 -05:00
trace_events_trigger.c tracing: Decrement the snapshot if the snapshot trigger fails to register 2024-03-18 10:33:05 -04:00
trace_events_user.c tracing/user_events: Introduce multi-format events 2024-03-18 10:13:03 -04:00
trace_events.c tracing: hide unused ftrace_event_id_fops 2024-04-11 17:46:55 -04:00
trace_export.c tracing: Add back FORTIFY_SOURCE logic to kernel_stack event structure 2023-07-30 18:11:44 -04:00
trace_fprobe.c tracing/probes: Support $argN in return probe (kprobe and fprobe) 2024-03-07 00:27:34 +09:00
trace_functions_graph.c function_graph: Support recording and printing the return value of function 2023-06-20 18:38:37 -04:00
trace_functions.c
trace_hwlat.c tracing: Remove extra space at the end of hwlat_detector/mode 2023-09-01 21:00:00 -04:00
trace_irqsoff.c tracing: Fix memleak due to race between current_tracer and trace 2023-08-17 13:49:37 -04:00
trace_kdb.c
trace_kprobe_selftest.c tracing: arm64: Avoid missing-prototype warnings 2023-07-12 12:06:04 -04:00
trace_kprobe_selftest.h
trace_kprobe.c tracing/probes: Support $argN in return probe (kprobe and fprobe) 2024-03-07 00:27:34 +09:00
trace_mmiotrace.c
trace_nop.c
trace_osnoise.c tracing/timerlat: Move hrtimer_init to timerlat_fd open() 2024-02-01 11:50:13 -05:00
trace_output.c tracing: Remove precision vsnprintf() check from print event 2024-03-06 13:26:26 -05:00
trace_output.h tracing: Add "fields" option to show raw trace event fields 2023-03-29 06:52:08 -04:00
trace_preemptirq.c cpuidle: tracing, preempt: Squash _rcuidle tracing 2023-01-31 15:01:46 +01:00
trace_printk.c
trace_probe_kernel.h tracing/probes: Fix to record 0-length data_loc in fetch_store_string*() if fails 2023-07-14 17:04:58 +09:00
trace_probe_tmpl.h tracing/probes: Support $argN in return probe (kprobe and fprobe) 2024-03-07 00:27:34 +09:00
trace_probe.c tracing: probes: Fix to zero initialize a local variable 2024-03-25 16:24:31 +09:00
trace_probe.h tracing/probes: Support $argN in return probe (kprobe and fprobe) 2024-03-07 00:27:34 +09:00
trace_recursion_record.c
trace_sched_switch.c tracing: Move saved_cmdline code into trace_sched_switch.c 2024-03-17 07:58:53 -04:00
trace_sched_wakeup.c tracing: Fix memleak due to race between current_tracer and trace 2023-08-17 13:49:37 -04:00
trace_selftest_dynamic.c
trace_selftest.c tracing: Support to dump instance traces by ftrace_dump_on_oops 2024-03-18 10:33:06 -04:00
trace_seq.c trace_seq: Increase the buffer size to almost two pages 2023-12-18 23:14:16 -05:00
trace_stack.c
trace_stat.c
trace_stat.h
trace_synth.h tracing: Allow synthetic events to pass around stacktraces 2023-01-25 10:31:24 -05:00
trace_syscalls.c bpf: Change syscall_nr type to int in struct syscall_tp_t 2023-10-13 12:39:36 -07:00
trace_uprobe.c uprobes: add speculative lockless system-wide uprobe filter check 2024-05-01 23:18:46 +09:00
trace.c tracing: Support to dump instance traces by ftrace_dump_on_oops 2024-03-18 10:33:06 -04:00
trace.h tracing: Add snapshot refcount 2024-03-18 10:12:47 -04:00
tracing_map.c tracing: Ensure visibility when inserting an element into tracing_map 2024-01-22 17:15:40 -05:00
tracing_map.h tracing: Remove unused extern declaration tracing_map_set_field_descr() 2023-07-23 11:08:14 -04:00