Commit Graph

17720 Commits

Author SHA1 Message Date
Ian Rogers
fa9c4977fb perf symbol-minimal: Fix double free in filename__read_build_id
Running the "perf script task-analyzer tests" with address sanitizer
showed a double free:
```
FAIL: "test_csv_extended_times" Error message: "Failed to find required string:'Out-Out;'."
=================================================================
==19190==ERROR: AddressSanitizer: attempting double-free on 0x50b000017b10 in thread T0:
    #0 0x55da9601c78a in free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da96640c63 in filename__read_build_id tools/perf/util/symbol-minimal.c:221:2

0x50b000017b10 is located 0 bytes inside of 112-byte region [0x50b000017b10,0x50b000017b80)
freed by thread T0 here:
    #0 0x55da9601ce40 in realloc (perf+0x260e40) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da96640ad6 in filename__read_build_id tools/perf/util/symbol-minimal.c:204:10

previously allocated by thread T0 here:
    #0 0x55da9601ca23 in malloc (perf+0x260a23) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a)
    #1 0x55da966407e7 in filename__read_build_id tools/perf/util/symbol-minimal.c:181:9

SUMMARY: AddressSanitizer: double-free (perf+0x26078a) (BuildId: e7ef50e08970f017a96fde6101c5e2491acc674a) in free
==19190==ABORTING
FAIL: "invocation of perf script report task-analyzer --csv-summary csvsummary --summary-extended command failed" Error message: ""
FAIL: "test_csvsummary_extended" Error message: "Failed to find required string:'Out-Out;'."
---- end(-1) ----
132: perf script task-analyzer tests                                 : FAILED!
```

The buf_size if always set to phdr->p_filesz, but that may be 0
causing a free and realloc to return NULL. This is treated in
filename__read_build_id like a failure and the buffer is freed again.

To avoid this problem only grow buf, meaning the buf_size will never
be 0. This also reduces the number of memory (re)allocations.

Fixes: b691f64360 ("perf symbols: Implement poor man's ELF parser")
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung.kim@lge.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250501070003.22251-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
f7458176a7 perf mem: Add 'dtlb' output field
This is a breakdown of perf_mem_data_src.mem_dtlb values.  It assumes
PMU drivers would set PERF_MEM_TLB_HIT bit with an appropriate level.

And having PERF_MEM_TLB_MISS means that it failed to find one in any
levels of TLB.  For now, it doesn't use PERF_MEM_TLB_{WK,OS} bits.

Also it seems Intel machines don't distinguish L1 or L2 precisely.  So I
added ANY_HIT (printed as "L?-Hit") to handle the case.

  $ perf mem report -F overhead,dtlb,dso --stdio
  ...
  #           --- D-TLB ----
  # Overhead   L?-Hit   Miss  Shared Object
  # ........  ..............  .................
  #
      67.03%    99.5%   0.5%  [unknown]
      31.23%    99.2%   0.8%  [kernel.kallsyms]
       1.08%    97.8%   2.2%  [i915]
       0.36%   100.0%   0.0%  [JIT] tid 6853
       0.12%   100.0%   0.0%  [drm]
       0.05%   100.0%   0.0%  [drm_kms_helper]
       0.05%   100.0%   0.0%  [ext4]
       0.02%   100.0%   0.0%  [aesni_intel]
       0.02%   100.0%   0.0%  [crc32c_intel]
       0.02%   100.0%   0.0%  [dm_crypt]
       ...

Committer testing:

  # perf report --header | grep cpudesc
  # cpudesc : AMD Ryzen 9 9950X3D 16-Core Processor
  # perf mem report -F overhead,dtlb,dso --stdio | head -20
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ---------- D-TLB -----------
  # Overhead   L1-Hit L2-Hit   Miss  Other  Shared Object
  # ........  ............................  .................................
  #
      77.47%    18.4%   0.1%   0.6%  80.9%  [kernel.kallsyms]
       5.61%    36.5%   0.7%   1.4%  61.5%  libxul.so
       2.77%    39.7%   0.0%  12.3%  47.9%  libc.so.6
       2.01%    34.0%   1.9%   1.9%  62.3%  libglib-2.0.so.0.8400.1
       1.93%    31.4%   2.0%   2.0%  64.7%  [amdgpu]
       1.63%    48.8%   0.0%   0.0%  51.2%  [JIT] tid 60168
       1.14%     3.3%   0.0%   0.0%  96.7%  [vdso]
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-12-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
5e424a0178 perf mem: Add 'snoop' output field
This is a breakdown of perf_mem_data_src.mem_snoop values.  For now, it
doesn't use mem_snoopx values like FWD and PEER.

  $ perf mem report -F overhead,snoop,comm --stdio
  ...
  #           ---------- Snoop -----------
  # Overhead      Hit   HitM   Miss  Other  Command
  # ........  ............................  ...............
  #
      34.24%     0.6%   0.0%   0.0%  99.4%  gnome-shell
      12.02%     1.0%   0.0%   0.0%  99.0%  chrome
       9.32%     1.0%   0.0%   0.3%  98.7%  Isolated Web Co
       6.85%     1.0%   0.3%   0.0%  98.6%  swapper
       6.30%     0.8%   0.8%   0.0%  98.5%  Xorg
       3.02%     2.4%   0.0%   0.0%  97.6%  VizCompositorTh
       2.35%     0.0%   0.0%   0.0% 100.0%  firefox-esr
       2.04%     0.0%   0.0%   0.0% 100.0%  JS Helper
       1.51%     3.2%   0.0%   0.0%  96.8%  threaded-ml
       1.44%     0.0%   0.0%   0.0% 100.0%  AudioIP~allback
       ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-11-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
abe4dc24a8 perf mem: Add 'cache' and 'memory' output fields
This is a breakdown of perf_mem_data_src.mem_lvl_num.  But it's also
divided into two parts because the combination is bigger than 8.

Since there are many entries for different cache levels, 'cache' field
focuses on them.  I generalized buffers like LFB, MAB and MHB to L1-buf
and L2-buf.

The rest goes to 'memory' field which can be RAM, CXL, PMEM, IO, etc.

  $ perf mem report -F cache,mem,dso --stdio
  ...
  #
  # -------------- Cache --------------  --- Memory ---
  #      L1     L2     L3 L1-buf  Other      RAM  Other  Shared Object
  # ...................................  ..............  ....................................
  #
      53.9%   3.6%  16.2%  21.6%   4.8%     4.8%  95.2%  [kernel.kallsyms]
      64.7%   1.7%   3.5%  17.4%  12.8%    12.8%  87.2%  chrome (deleted)
      78.3%   2.8%   0.0%   1.0%  17.9%    17.9%  82.1%  libc.so.6
      39.6%   1.5%   0.0%   5.7%  53.2%    53.2%  46.8%  libxul.so
      26.2%   0.0%   0.0%   0.0%  73.8%    73.8%  26.2%  [unknown]
      85.5%   0.0%   0.0%  14.5%   0.0%     0.0% 100.0%  libspa-audioconvert.so
      66.3%   4.4%   0.0%  29.4%   0.0%     0.0% 100.0%  libglib-2.0.so.0.8200.1 (deleted)
       1.9%   0.0%   0.0%   0.0%  98.1%    98.1%   1.9%  libmutter-cogl-15.so.0.0.0 (deleted)
      10.6%   0.0%   0.0%  89.4%   0.0%     0.0% 100.0%  libpulsecommon-16.1.so
       0.0%   0.0%   0.0% 100.0%   0.0%     0.0% 100.0%  libfreeblpriv3.so (deleted)
       ...

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-10-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
225772c17c perf hist: Hide unused mem stat columns
Some mem_stat types don't use all 8 columns.  And there are cases only
samples in certain kinds of mem_stat types are available only.  For that
case hide columns which has no samples.

The new output for the previous data would be:

  $ perf mem report -F overhead,op,comm --stdio
  ...
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      44.85%    21.1%  30.7%  48.3%  swapper
      26.82%    98.8%   0.3%   0.9%  netsli-prober
       7.19%    51.7%  13.7%  34.6%  perf
       5.81%    89.7%   2.2%   8.1%  qemu-system-ppc
       4.77%   100.0%   0.0%   0.0%  notifications_c
       1.77%    95.9%   1.2%   3.0%  MemoryReleaser
       0.77%    71.6%   4.1%  24.3%  DefaultEventMan
       0.19%    66.7%  22.2%  11.1%  gnome-shell
       ...

On Intel machines, the event is only for loads or stores so it'll have
only one column:

  #            Mem Op
  # Overhead     Load  Command
  # ........  .......  ...............
  #
      20.55%   100.0%  swapper
      17.13%   100.0%  chrome
       9.02%   100.0%  data-loop.0
       6.26%   100.0%  pipewire-pulse
       5.63%   100.0%  threaded-ml
       5.47%   100.0%  GraphRunner
       5.37%   100.0%  AudioIP~allback
       5.30%   100.0%  Chrome_ChildIOT
       3.17%   100.0%  Isolated Web Co
       ...

Committer testing:

  # grep "model name" -m1 /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processo
  # perf mem report -F overhead,op,comm --stdio
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:P'
  # Total weight : 2637
  # Sort order   : local_weight,mem,sym,dso,symbol_daddr,dso_daddr,snoop,tlb,locked,blocked,local_ins_lat,local_p_stage_cyc
  #
  #           ------ Mem Op -------
  # Overhead     Load  Store  Other  Command
  # ........  .....................  ...............
  #
      61.02%    14.4%  25.5%  60.1%  swapper
       5.61%    26.4%  13.5%  60.1%  Isolated Web Co
       5.50%    21.4%  29.7%  49.0%  perf
       4.74%    27.2%  15.2%  57.6%  gnome-shell
       4.63%    33.6%  11.5%  54.9%  mdns_service
       4.29%    28.3%  12.4%  59.3%  ptyxis
       2.16%    24.6%  19.3%  56.1%  DOM Worker
       0.99%    23.1%  34.6%  42.3%  firefox
       0.72%    26.3%  15.8%  57.9%  IPC I/O Parent
       0.61%    12.5%  12.5%  75.0%  kworker/u130:20
       0.61%    37.5%  18.8%  43.8%  podman
       0.57%    33.3%   6.7%  60.0%  Timer
       0.53%    14.3%   7.1%  78.6%  KMS thread
       0.49%    30.8%   7.7%  61.5%  kworker/u130:3-
       0.46%    41.7%  33.3%  25.0%  IPDL Background

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
1e6569dca5 perf mem: Add 'op' output field
This is an actual example of the he_mem_stat based sample breakdown.  It
uses 'mem_op' field of union perf_mem_data_src which means memory
operations.

It'd have basically 'load' or 'store' which can be useful if PMU doesn't
have separate events for them like IBS or SPE.  In addition, there's an
entry in case load and store happen at the same time.  Also adds entries
for prefetching and execution.

  $ perf mem report -F +op -s comm --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 4K of event 'ibs_op//'
  # Total weight : 9559
  # Sort order   : comm
  #
  #                   --------------------- Mem Op ----------------------
  # Overhead  Samples   Load  Store Ld+St Pfetch  Exec  Other   N/A   N/A  Command
  # ........  ....... ...................................................  ...............
  #
      44.85%     4077  21.1%  30.7%  0.0%   0.0%  0.0%  48.3%  0.0%  0.0%  swapper
      26.82%       45  98.8%   0.3%  0.0%   0.0%  0.0%   0.9%  0.0%  0.0%  netsli-prober
       7.19%      442  51.7%  13.7%  0.0%   0.0%  0.0%  34.6%  0.0%  0.0%  perf
       5.81%       75  89.7%   2.2%  0.0%   0.0%  0.0%   8.1%  0.0%  0.0%  qemu-system-ppc
       4.77%        1 100.0%   0.0%  0.0%   0.0%  0.0%   0.0%  0.0%  0.0%  notifications_c
       1.77%       10  95.9%   1.2%  0.0%   0.0%  0.0%   3.0%  0.0%  0.0%  MemoryReleaser
       0.77%       32  71.6%   4.1%  0.0%   0.0%  0.0%  24.3%  0.0%  0.0%  DefaultEventMan
       0.19%       10  66.7%  22.2%  0.0%   0.0%  0.0%  11.1%  0.0%  0.0%  gnome-shell

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
b1fc83ca43 perf hist: Implement output fields for mem stats
This is a preparation for later changes to support mem_stat output.  The
new fields will need two lines for the header - the first line will show
type of mem stat and the second line will show the name of each item
which is returned by mem_stat_name().

Each element in the mem_stat array will be printed in percentage for the
hist_entry and their sum would be 100%.

Add new output field dimension only for SORT_MODE__MEM using mem_stat.

To handle possible name conflict with existing sort keys, move the order
of checking output field dimensions after the sort dimensions when it
looks for sort keys.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
9fcb43e27c perf hist: Basic support for mem_stat accounting
Add a logic to account he->mem_stat based on mem_stat_type in hists.

Each mem_stat entry will have different meaning based on the type so the
index in the array is calculated at runtime using the corresponding
value in the sample.data_src.

Still hists has no mem_stat_types yet so this code won't work for now.

Later hists->mem_stat_types will be allocated based on what users want
in the output actually.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
930d4c45c6 perf hist: Add struct he_mem_stat
The 'struct he_mem_stat' is to save detailed information about memory
instruction.  It'll be used to show breakdown of various data from
PERF_SAMPLE_DATA_SRC.  Note that this structure is generic and the
contents will be different depending on actual data it'll use later.

The information about the actual data will be saved in 'struct hists'
and its length is in nr_mem_stats.  This commit just adds ground works
and does nothing since hists->nr_mem_stats is 0 for now.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
29e6392ec3 perf hist: Support multi-line header
This is a preparation to support multi-line headers in 'perf mem report'.

Normal sort keys and output fields that don't have contents for multi-
line will print the header string at the last line only.

As we don't use multi-line headers normally, it should not have any
changes in the output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
43a6446998 perf record: Add --sample-mem-info option
There's no way to enable PERF_SAMPLE_DATA_SRC without PERF_SAMPLE_ADDR
which brings a lot of overhead due to the number of MMAP[2] records.

Let's add a new option to enable this information separately.

Committer testing:

  # perf record -a --sample-mem-info
  ^C[ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.815 MB perf.data (2637 samples) ]
  #
  # perf evlist -v
  cycles:P: type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|CPU|PERIOD|IDENTIFIER|DATA_SRC, read_format: ID|LOST, disabled: 1, freq: 1, precise_ip: 2, sample_id_all: 1
  dummy:u: type: 1 (PERF_TYPE_SOFTWARE), size: 136, config: 0x9 (PERF_COUNT_SW_DUMMY), { sample_period, sample_freq }: 1, sample_type: IP|TID|TIME|CPU|IDENTIFIER|DATA_SRC, read_format: ID|LOST, exclude_kernel: 1, exclude_hv: 1, mmap: 1, comm: 1, task: 1, sample_id_all: 1, exclude_guest: 1, mmap2: 1, comm_exec: 1, ksymbol: 1, bpf_event: 1
  #
  # perf report -D |& grep -w PERF_RECORD_SAMPLE -A3 -m1
  0 44675164447282 0x1a7590 [0x40]: PERF_RECORD_SAMPLE(IP, 0x4001): 107299/107299: 0xffffffffac4a5e11 period: 144 addr: 0
   . data_src: 0x229080142
   ... thread: perf:107299
   ...... dso: /lib/modules/6.15.0-rc4+/build/vmlinux
  #

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250430205548.789750-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Namhyung Kim
3761e7fe98 perf hist: Remove output field from sort-list properly
When it removes an output format for cancelled children or latency, it
should delete itself from the sort list as well.  Otherwise assertion
in fmt_free() will fire.

  $ perf report -H --stdio
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  Aborted (core dumped)

Also convert to perf_hpp__column_unregister() for the same open codes.

Committer notes:

Before this patch:

  # perf test hierarchy
   83: perf report --hierarchy                                         : FAILED!
  # perf test -v hierarchy
  --- start ---
  test child forked, pid 102242
  perf report --hierarchy
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted                 (core dumped) perf report --hierarchy > /dev/null
  --- Cleaning up ---
  ---- end(-1) ----
   83: perf report --hierarchy                                         : FAILED!
  #

After:

  # perf test hierarchy
   83: perf report --hierarchy                                         : Ok
  #

Fixes: dbd11b6bda ("perf hist: Remove formats in hierarchy when cancel children")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:14 -03:00
Arnaldo Carvalho de Melo
bb5ae52e53 perf test perf-report-hierarchy: Add new test
Super simple test to check that at least we're not segfaulting when
trying to use 'perf report --hierarchy', more subtests should be added
to make sure the output is the expected one.

This is being merged right before a fix for that that this test detects:

  # perf test hierarchy
   83: perf report --hierarchy                                         : FAILED!
  # perf test -v hierarchy
  --- start ---
  test child forked, pid 102242
  perf report --hierarchy
  Linux
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB /tmp/perf-test-report.HX0N85TlPq/perf-report-hierarchy-perf.data (6 samples) ]
  perf: ui/hist.c:603: fmt_free: Assertion `!(!list_empty(&fmt->sort_list))' failed.
  /home/acme/libexec/perf-core/tests/shell/perf-report-hierarchy.sh: line 34: 102250 Aborted                 (core dumped) perf report --hierarchy > /dev/null
  --- Cleaning up ---
  ---- end(-1) ----
   83: perf report --hierarchy                                         : FAILED!
  #

Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/lkml/20250430180321.736939-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-02 15:36:01 -03:00
Ravi Bangoria
35db59fa8e perf test amd ibs: Add sample period unit test
IBS Fetch and IBS Op PMUs has various constraints on supported sample
periods. Add perf unit tests to test those.

Running it in parallel with other tests causes intermittent failures.
Mark it exclusive to force it to run sequentially. Sample output on a
Zen5 machine:

Without kernel fixes:

  $ sudo ./perf test -vv 112
  112: AMD IBS sample period:
  --- start ---
  test child forked, pid 8774
  Using CPUID AuthenticAMD-26-2-1

  IBS config tests:
  -----------------
  Fetch PMU tests:
  0xffff            : Ok   (nr samples: 1078)
  0x1000            : Ok   (nr samples: 17030)
  0xff              : Ok   (nr samples: 41068)
  0x1               : Ok   (nr samples: 40543)
  0x0               : Ok
  0x10000           : Ok
  Op PMU tests:
  0x0               : Ok
  0x1               : Fail
  0x8               : Fail
  0x9               : Ok   (nr samples: 40543)
  0xf               : Ok   (nr samples: 40543)
  0x1000            : Ok   (nr samples: 18736)
  0xffff            : Ok   (nr samples: 1168)
  0x10000           : Ok
  0x100000          : Fail (nr samples: 14)
  0xf00000          : Fail (nr samples: 1)
  0xf0ffff          : Fail (nr samples: 1)
  0x1f0ffff         : Fail (nr samples: 1)
  0x7f0ffff         : Fail (nr samples: 0)
  0x8f0ffff         : Ok
  0x17f0ffff        : Ok

  IBS sample period constraint tests:
  -----------------------------------
  Fetch PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Fail
  freq 0, sample_freq        15: Fail
  freq 0, sample_freq        16: Ok   (nr samples: 1604)
  freq 0, sample_freq        17: Ok   (nr samples: 1604)
  freq 0, sample_freq       143: Ok   (nr samples: 1604)
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1566)
  freq 0, sample_freq      4103: Ok   (nr samples: 1119)
  freq 0, sample_freq     65520: Ok   (nr samples: 2264)
  freq 0, sample_freq     65535: Ok   (nr samples: 2263)
  freq 0, sample_freq     65552: Ok   (nr samples: 1166)
  freq 0, sample_freq   8388607: Ok   (nr samples: 268)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 5)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 5)
  freq 1, sample_freq      1234: Ok   (nr samples: 7)
  freq 1, sample_freq      4103: Ok   (nr samples: 35)
  freq 1, sample_freq     65520: Ok   (nr samples: 642)
  freq 1, sample_freq     65535: Ok   (nr samples: 636)
  freq 1, sample_freq     65552: Ok   (nr samples: 651)
  freq 1, sample_freq   8388607: Ok
  Op PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Fail
  freq 0, sample_freq        15: Fail
  freq 0, sample_freq        16: Fail
  freq 0, sample_freq        17: Fail
  freq 0, sample_freq       143: Fail
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1604)
  freq 0, sample_freq     65520: Ok   (nr samples: 2227)
  freq 0, sample_freq     65535: Ok   (nr samples: 2296)
  freq 0, sample_freq     65552: Ok   (nr samples: 2213)
  freq 0, sample_freq   8388607: Ok   (nr samples: 250)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Fail (nr samples: 4)
  freq 1, sample_freq        15: Fail (nr samples: 4)
  freq 1, sample_freq        16: Fail (nr samples: 4)
  freq 1, sample_freq        17: Fail (nr samples: 4)
  freq 1, sample_freq       143: Fail (nr samples: 5)
  freq 1, sample_freq       144: Fail (nr samples: 5)
  freq 1, sample_freq       145: Fail (nr samples: 5)
  freq 1, sample_freq      1234: Fail (nr samples: 8)
  freq 1, sample_freq      4103: Fail (nr samples: 33)
  freq 1, sample_freq     65520: Fail (nr samples: 546)
  freq 1, sample_freq     65535: Fail (nr samples: 544)
  freq 1, sample_freq     65552: Fail (nr samples: 555)
  freq 1, sample_freq   8388607: Ok

  IBS ioctl() tests:
  ------------------
  Fetch PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Fail
  ioctl(period = 0xf      ): Fail
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Fail
  ioctl(period = 0x1f     ): Fail
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Fail
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Fail
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Fail
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Fail
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok
  Op PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Fail
  ioctl(period = 0xf      ): Fail
  ioctl(period = 0x10     ): Fail
  ioctl(period = 0x11     ): Fail
  ioctl(period = 0x1f     ): Fail
  ioctl(period = 0x20     ): Fail
  ioctl(period = 0x80     ): Fail
  ioctl(period = 0x8f     ): Fail
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Fail
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Fail
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Fail
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok

  IBS freq (negative) tests:
  --------------------------
  freq 1, sample_freq 200000: Fail

  IBS L3MissOnly test: (takes a while)
  --------------------
  Fetch L3MissOnly: Fail (nr_samples: 1213)
  Op L3MissOnly:    Ok   (nr_samples: 1193)
  ---- end(-1) ----
  112: AMD IBS sample period                                           : FAILED!

With kernel fixes:

  $ sudo ./perf test -vv 112
  112: AMD IBS sample period:
  --- start ---
  test child forked, pid 6939
  Using CPUID AuthenticAMD-26-2-1

  IBS config tests:
  -----------------
  Fetch PMU tests:
  0xffff            : Ok   (nr samples: 969)
  0x1000            : Ok   (nr samples: 15540)
  0xff              : Ok   (nr samples: 40555)
  0x1               : Ok   (nr samples: 40543)
  0x0               : Ok
  0x10000           : Ok
  Op PMU tests:
  0x0               : Ok
  0x1               : Ok
  0x8               : Ok
  0x9               : Ok   (nr samples: 40543)
  0xf               : Ok   (nr samples: 40543)
  0x1000            : Ok   (nr samples: 19156)
  0xffff            : Ok   (nr samples: 1169)
  0x10000           : Ok
  0x100000          : Ok   (nr samples: 1151)
  0xf00000          : Ok   (nr samples: 76)
  0xf0ffff          : Ok   (nr samples: 73)
  0x1f0ffff         : Ok   (nr samples: 33)
  0x7f0ffff         : Ok   (nr samples: 10)
  0x8f0ffff         : Ok
  0x17f0ffff        : Ok

  IBS sample period constraint tests:
  -----------------------------------
  Fetch PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Ok
  freq 0, sample_freq        15: Ok
  freq 0, sample_freq        16: Ok   (nr samples: 1203)
  freq 0, sample_freq        17: Ok   (nr samples: 1604)
  freq 0, sample_freq       143: Ok   (nr samples: 1604)
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1343)
  freq 0, sample_freq     65520: Ok   (nr samples: 2254)
  freq 0, sample_freq     65535: Ok   (nr samples: 2136)
  freq 0, sample_freq     65552: Ok   (nr samples: 1158)
  freq 0, sample_freq   8388607: Ok   (nr samples: 257)
  freq 0, sample_freq 268435455: Ok   (nr samples: 8)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 5)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 5)
  freq 1, sample_freq      1234: Ok   (nr samples: 8)
  freq 1, sample_freq      4103: Ok   (nr samples: 34)
  freq 1, sample_freq     65520: Ok   (nr samples: 458)
  freq 1, sample_freq     65535: Ok   (nr samples: 628)
  freq 1, sample_freq     65552: Ok   (nr samples: 396)
  freq 1, sample_freq   8388607: Ok
  Op PMU test:
  freq 0, sample_freq         0: Ok
  freq 0, sample_freq         1: Ok
  freq 0, sample_freq        15: Ok
  freq 0, sample_freq        16: Ok
  freq 0, sample_freq        17: Ok
  freq 0, sample_freq       143: Ok
  freq 0, sample_freq       144: Ok   (nr samples: 1604)
  freq 0, sample_freq       145: Ok   (nr samples: 1604)
  freq 0, sample_freq      1234: Ok   (nr samples: 1604)
  freq 0, sample_freq      4103: Ok   (nr samples: 1604)
  freq 0, sample_freq     65520: Ok   (nr samples: 2250)
  freq 0, sample_freq     65535: Ok   (nr samples: 2158)
  freq 0, sample_freq     65552: Ok   (nr samples: 2296)
  freq 0, sample_freq   8388607: Ok   (nr samples: 243)
  freq 0, sample_freq 268435455: Ok   (nr samples: 6)
  freq 1, sample_freq         0: Ok
  freq 1, sample_freq         1: Ok   (nr samples: 4)
  freq 1, sample_freq        15: Ok   (nr samples: 4)
  freq 1, sample_freq        16: Ok   (nr samples: 4)
  freq 1, sample_freq        17: Ok   (nr samples: 4)
  freq 1, sample_freq       143: Ok   (nr samples: 4)
  freq 1, sample_freq       144: Ok   (nr samples: 5)
  freq 1, sample_freq       145: Ok   (nr samples: 4)
  freq 1, sample_freq      1234: Ok   (nr samples: 6)
  freq 1, sample_freq      4103: Ok   (nr samples: 27)
  freq 1, sample_freq     65520: Ok   (nr samples: 542)
  freq 1, sample_freq     65535: Ok   (nr samples: 550)
  freq 1, sample_freq     65552: Ok   (nr samples: 552)
  freq 1, sample_freq   8388607: Ok

  IBS ioctl() tests:
  ------------------
  Fetch PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Ok
  ioctl(period = 0xf      ): Ok
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Ok
  ioctl(period = 0x1f     ): Ok
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Ok
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Ok
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Ok
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Ok
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok
  Op PMU tests
  ioctl(period = 0x0      ): Ok
  ioctl(period = 0x1      ): Ok
  ioctl(period = 0xf      ): Ok
  ioctl(period = 0x10     ): Ok
  ioctl(period = 0x11     ): Ok
  ioctl(period = 0x1f     ): Ok
  ioctl(period = 0x20     ): Ok
  ioctl(period = 0x80     ): Ok
  ioctl(period = 0x8f     ): Ok
  ioctl(period = 0x90     ): Ok
  ioctl(period = 0x91     ): Ok
  ioctl(period = 0x100    ): Ok
  ioctl(period = 0xfff0   ): Ok
  ioctl(period = 0xffff   ): Ok
  ioctl(period = 0x10000  ): Ok
  ioctl(period = 0x1fff0  ): Ok
  ioctl(period = 0x1fff5  ): Ok
  ioctl(freq   = 0x0      ): Ok
  ioctl(freq   = 0x1      ): Ok
  ioctl(freq   = 0xf      ): Ok
  ioctl(freq   = 0x10     ): Ok
  ioctl(freq   = 0x11     ): Ok
  ioctl(freq   = 0x1f     ): Ok
  ioctl(freq   = 0x20     ): Ok
  ioctl(freq   = 0x80     ): Ok
  ioctl(freq   = 0x8f     ): Ok
  ioctl(freq   = 0x90     ): Ok
  ioctl(freq   = 0x91     ): Ok
  ioctl(freq   = 0x100    ): Ok

  IBS freq (negative) tests:
  --------------------------
  freq 1, sample_freq 200000: Ok

  IBS L3MissOnly test: (takes a while)
  --------------------
  Fetch L3MissOnly: Ok   (nr_samples: 1301)
  Op L3MissOnly:    Ok   (nr_samples: 1590)
  ---- end(0) ----
  112: AMD IBS sample period                                           : Ok

Committer notes:

Avoid using PAGE_SIZE as that define is also in sys/user.h

Make it a variable not to call sysconf() multiple times.

Also cast func to void * when passing it as the first arg to memcpy to
avoid this with some versions of clang:

  arch/x86/tests/amd-ibs-period.c:81:3: error: no matching function for call to 'memcpy'
                  memcpy(func, insn1, sizeof(insn1));
                  ^~~~~~
  /usr/include/string.h:27:7: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *' for 1st argument
  void *memcpy (void *__restrict, const void *__restrict, size_t);
        ^
  /usr/include/fortify/string.h:40:27: note: candidate function not viable: no known conversion from 'int (*)(void)' to 'void *const' for 1st argument
  _FORTIFY_FN(memcpy) void *memcpy(void * _FORTIFY_POS0 __od,
                            ^
  arch/x86/tests/amd-ibs-period.c:87:3: error: no matching function for call to 'memcpy'

This one, for instance:

  Alpine clang version 19.1.4
  Target: x86_64-alpine-linux-musl
  Thread model: posix
  InstalledDir: /usr/lib/llvm19/bin
  Configuration file: /etc/clang19/x86_64-alpine-linux-musl.cfg
  System configuration file directory: /etc/clang19

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-5-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
fa1332a801 perf mem/c2c amd: Add ldlat support
'perf mem/c2c' uses IBS Op PMU on AMD platforms.

IBS Op PMU on Zen5 uarch has added support for Load Latency filtering.

Implement 'perf mem/c2c' --ldlat using IBS Op Load Latency filtering
capability.

Some subtle differences between AMD and other arch:

o --ldlat is disabled by default on AMD

o Supported values are 128 to 2048.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-4-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
fc481adc97 perf amd ibs: Incorporate Zen5 DTLB and PageSize information
IBS Op PMU on Zen5 reports DTLB and page size information differently
compared to prior generation.

  IBS_OP_DATA3     Zen3/4                 Zen5
  ----------------------------------------------------------------
  19               IbsDcL2TlbHit1G        Reserved
  ----------------------------------------------------------------
   6               IbsDcL2tlbHit2M        Reserved
  ----------------------------------------------------------------
   5               IbsDcL1TlbHit1G        PageSize:
   4               IbsDcL1TlbHit2M          0 - 4K
                                            1 - 2M
                                            2 - 1G
                                            3 - Reserved
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------
   3               IbsDcL2TlbMiss         IbsDcL2TlbMiss
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------
   2               IbsDcL1tlbMiss         IbsDcL1tlbMiss
                                          Valid only if
                                            IbsDcPhyAddrValid = 1
  ----------------------------------------------------------------

Kernel expose this change as "dtlb_pgsize" capability in PMU sysfs.

Change IBS register raw-dump logic according to new bit definitions.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-3-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Ravi Bangoria
eeefc13c71 perf amd ibs: Add Load Latency bits in raw dump
IBS OP PMU on Zen5 supports Load Latency filtering. Decode and dump Load
Latency filtering related bits into perf script raw dump.

Also add oneliner example in the perf-amd-ibs man page.

Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20250429035938.1301-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:46 -03:00
Arnaldo Carvalho de Melo
4d728bb93b perf symbols: Handle 'u' and 'l' symbols in /proc/kallsyms
I started seeing this in recent Fedora 42 kernels:

  # uname -a
  Linux number 6.14.3-300.fc42.x86_64 #1 SMP PREEMPT_DYNAMIC Sun Apr 20 16:08:39 UTC 2025 x86_64 GNU/Linux
  #

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : FAILED!
  #

Where we have Rust enabled:

  # grep CONFIG_RUST /boot/config-6.14.3-300.fc42.x86_64
  CONFIG_RUSTC_VERSION=108600
  CONFIG_RUST_IS_AVAILABLE=y
  CONFIG_RUSTC_LLVM_VERSION=200101
  CONFIG_RUSTC_HAS_COERCE_POINTEE=y
  CONFIG_RUST=y
  CONFIG_RUSTC_VERSION_TEXT="rustc 1.86.0 (05f9846f8 2025-03-31) (Fedora 1.86.0-1.fc42)"
  CONFIG_RUST_FW_LOADER_ABSTRACTIONS=y
  CONFIG_RUST_PHYLIB_ABSTRACTIONS=y
  # CONFIG_RUST_DEBUG_ASSERTIONS is not set
  CONFIG_RUST_OVERFLOW_CHECKS=y
  # CONFIG_RUST_BUILD_ASSERT_ALLOW is not set
  #

Looking at the reason for the failure:

  # perf test -v vmlinux |& grep ^ERR
  ERR : 0xffffffff99efc7d0: __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  ERR : 0xffffffff99efc7e0: _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_ not on kallsyms
  #

But:

  # grep -w u /proc/kallsyms
  ffffffff99efc7d0 u __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  ffffffff99efc7e0 u _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

The test checks that "vmlinux symtab matches kallsyms", so it finds those two
symbols in vmlinux:

  # pahole --running_kernel_vmlinux
  /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux
  #

  # readelf -sW /usr/lib/debug/lib/modules/6.14.3-300.fc42.x86_64/vmlinux | grep -Ew '(__pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_|_RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_)'
 81844: ffffffff81efc7e0   524 FUNC    LOCAL  DEFAULT    1 _RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
144259: ffffffff81efc7d0    16 FUNC    LOCAL  DEFAULT    1 __pfx__RNCINvNtNtNtCsf5tcb0XGUW4_4core4iter8adapters3map12map_try_foldjNtCsagR6JbSOIa9_12drm_panic_qr7VersionuINtNtNtBa_3ops12control_flow11ControlFlowB10_ENcB10_0NCINvNvNtNtNtB8_6traits8iterator8Iterator4find5checkB10_NCNvMB12_B10_13from_segments0E0E0B12_
  #

It is there.

From the nm documentation we can see that:

           "U" The symbol is undefined.

           "u" The symbol is a unique global symbol.  This is a GNU extension to the
	       standard set of ELF symbol bindings.  For such a symbol the dynamic
	       linker will make sure that in the entire process there is just one
	       symbol with this name and type in use.

So lets consider 'u' symbols in /proc/kallsyms when loading it to cover this case.

Fedora:40 shows this as a 'l' symbol, so consider that as well.

With this patch 'perf test 1' is happy again:

  # perf test vmlinux
    1: vmlinux symtab matches kallsyms                                 : Ok
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/aBE_n0PGl3g6h-cS@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 22:30:44 -03:00
James Clark
8988c4b919 perf tools: Fix in-source libperf build
When libperf is built alone in-source, $(OUTPUT) isn't set. This causes
the generated uapi path to resolve to '/../arch' which results in a
permissions error:

  mkdir: cannot create directory '/../arch': Permission denied

Fix it by removing the preceding '/..' which means that it gets
generated either in the tools/lib/perf part of the tree or the OUTPUT
folder. Some other rules that rely on OUTPUT further refine this
conditionally depending on whether it's an in-source or out-of-source
build, but I don't think we need the extra complexity here. And this
rule is slightly different to others because the header is needed by
both libperf and Perf. This is further complicated by the fact that Perf
always passes O=... to libperf even for in source builds, meaning that
OUTPUT isn't set consistently between projects.

Because we're no longer going one level up to try to generate the file
in the tools/ folder, Perf's include rule needs to descend into libperf.
Also fix the clean rule while we're here.

Reported-by: Thorsten Leemhuis <linux@leemhuis.info>
Closes: https://lore.kernel.org/linux-perf-users/7703f88e-ccb7-4c98-9da4-8aad224e780f@leemhuis.info/
Fixes: bfb713ea53 ("perf tools: Fix arm64 build by generating unistd_64.h")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Thorsten Leemhuis <linux@leemhuis.info>
Link: https://lore.kernel.org/r/20250429-james-perf-fix-libperf-in-source-build-v1-1-a1a827ac15e5@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-29 12:32:31 -07:00
Jakub Brnak
92b664dcef perf test probe_vfs_getname: Skip if no suitable line detected
In some cases when calling function add_probe_vfs_getname, line number
can't be detected by 'perf probe -L getname_flags':

  78         atomic_set(&result->refcnt, 1);

	     // one of the following lines should have line number
	     // but sometimes it does not because of optimization
	     result->uptr = filename;
             result->aname = NULL;

  81         audit_getname(result);

To prevent false failures, skip the affected tests if no suitable line
numbers can be detected.

Signed-off-by: Jakub Brnak <jbrnak@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tomas Glozar <tglozar@redhat.com>
Link: https://lore.kernel.org/r/20250324144523.597557-1-jbrnak@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 12:28:12 -03:00
Namhyung Kim
13f35928a4 perf lock contention: Symbolize zone->lock using BTF
The struct zone is embedded in struct pglist_data which can be allocated
for each NUMA node early in the boot process.  As it's not a slab object
nor a global lock, this was not symbolized.

Since the zone->lock is often contended, it'd be nice if we can
symbolize it.  On NUMA systems, node_data array will have pointers for
struct pglist_data.  By following the pointer, it can calculate the
address of each zone and its lock using BTF.  On UMA, it can just use
contig_page_data and its zones.

The following example shows the zone lock contention at the end.

  $ sudo ./perf lock con -abl -E 5 -- ./perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 10 groups == 400 processes run

       Total time: 0.038 [sec]
   contended   total wait     max wait     avg wait            address   symbol

        5167     18.17 ms     10.27 us      3.52 us   ffff953340052d00   &kmem_cache_node (spinlock)
          38     11.75 ms    465.49 us    309.13 us   ffff95334060c480   &sock_inode_cache (spinlock)
        3916     10.13 ms     10.43 us      2.59 us   ffff953342aecb40   &kmem_cache_node (spinlock)
        2963     10.02 ms     13.75 us      3.38 us   ffff9533d2344098   &kmalloc-rnd-08-2k (spinlock)
         216      5.05 ms     99.49 us     23.39 us   ffff9542bf7d65d0   zone_lock (spinlock)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Cc: linux-mm@kvack.org
Link: https://lore.kernel.org/r/20250401063055.7431-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-29 12:23:53 -03:00
Namhyung Kim
2d099ccaad perf test: Add perf trace summary test
$ sudo ./perf test -vv 'trace summary'
  109: perf trace summary:
  --- start ---
  test child forked, pid 3501572
  testing: perf trace -s -- true
  testing: perf trace -S -- true
  testing: perf trace -s --summary-mode=thread -- true
  testing: perf trace -S --summary-mode=total -- true
  testing: perf trace -as --summary-mode=thread --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=total --no-bpf-summary -- true
  testing: perf trace -as --summary-mode=thread --bpf-summary -- true
  testing: perf trace -as --summary-mode=total --bpf-summary -- true
  testing: perf trace -aS --summary-mode=total --bpf-summary -- true
  ---- end(0) ----
  109: perf trace summary                                              : Ok

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20250326044001.3503432-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28 16:53:13 -03:00
Namhyung Kim
1bec43f523 perf trace: Implement syscall summary in BPF
When -s/--summary option is used, it doesn't need (augmented) arguments
of syscalls.  Let's skip the augmentation and load another small BPF
program to collect the statistics in the kernel instead of copying the
data to the ring-buffer to calculate the stats in userspace.  This will
be much more light-weight than the existing approach and remove any lost
events.

Let's add a new option --bpf-summary to control this behavior.  I cannot
make it default because there's no way to get e_machine in the BPF which
is needed for detecting different ABIs like 32-bit compat mode.

No functional changes intended except for no more LOST events. :)

  $ sudo ./perf trace -as --summary-mode=total --bpf-summary sleep 1

   Summary of events:

   total, 6194 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     epoll_wait           561      0  4530.843     0.000     8.076   520.941     18.75%
     futex                693     45  4317.231     0.000     6.230   500.077     21.98%
     poll                 300      0  1040.109     0.000     3.467   120.928     17.02%
     clock_nanosleep        1      0  1000.172  1000.172  1000.172  1000.172      0.00%
     ppoll                360      0   872.386     0.001     2.423   253.275     41.91%
     epoll_pwait           14      0   384.349     0.001    27.453   380.002     98.79%
     pselect6              14      0   108.130     7.198     7.724     8.206      0.85%
     nanosleep             39      0    43.378     0.069     1.112    10.084     44.23%
     ...

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250326044001.3503432-1-namhyung@kernel.org
[ Added fixup sent from Namhyung in response to my report to make it also dependent on CONFIG_TRACE ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-28 16:53:11 -03:00
Junhao He
c756441c35 perf vendor events arm64: Drop hip08 PublicDescription if same as BriefDescription
If BriefDescription and PublicDescription are the same, only
BriefDescription is needed. It will be used for both long and short
format outputs.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-3-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:19 -03:00
Junhao He
43fff3e948 perf vendor events arm64: Fill up Desc field for Hisi hip08 hha pmu
In the same PMU, when some JSON events have the "BriefDescription" field
populated while others do not, the cmp_sevent() function will split these
two types of events into separate groups. As a result, when using perf
list to display events, the two types of events cannot be grouped together
in the output.

before patch:
 $ perf list pmu
 ...
 uncore hha:
   hisi_sccl1_hha2/sdir-hit/
   hisi_sccl1_hha2/sdir-lookup/
 ...
 uncore hha:
   edir-hit
      [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
 ...

after patch:
 $ perf list pmu
 ...
 uncore hha:
   edir-hit
      [Count of The number of HHA E-Dir hit operations. Unit: hisi_sccl1_hha2]
   sdir-hit
      [Count of The number of HHA S-Dir hit operations. Unit: hisi_sccl1_hha2]
   sdir-lookup
      [Count of the number of HHA S-Dir lookup operations. Unit: hisi_sccl1_hha2]
 ...

Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Yicong Yang <yangyicong@hisilicon.com>
Signed-off-by: Junhao He <hejunhao3@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20250418070812.3771441-2-hejunhao3@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:16 -03:00
Ian Rogers
022d270bb6 perf bench evlist-open-close: Reduce scope of 2 variables
Make 2 global variables local. Reduces ELF binary size by removing
relocations. For a no flags build, the perf binary size is reduced by
4,144 bytes on x86-64.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:13 -03:00
Ian Rogers
be8aefad33 perf tests record: Cleanup improvements
Remove the script output file. Add a trap debug message. Minor style
consistency changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250410173631.1713627-2-irogers@google.com
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Levi Yun <yeoreum.yun@arm.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Tengda Wu <wutengda@huaweicloud.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Hao Ge <gehao@kylinos.cn>
Cc: James Clark <james.clark@linaro.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: bpf@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:10 -03:00
Thomas Richter
ccd4b5cdf0 perf tests metric-only perf stat: Fix tests 84 and 86 s390
On s390x KVM and z/VM machines the CPU Measurement Facility is not
available. Events cycles and instructions do not exist.  Running above
tests on s390 KVM and z/VM guests always fail with this error:

  # ./perf test 84 86
  84: perf stat JSON output linter          : FAILED!
  86: perf stat STD output linter           : FAILED!
  #

Root cause is command:

  # perf stat -j --metric-only -e instructions,cycles -- true
  {"metric-value" : "none"}
  #

Which fails due to unsupported events and returns "none".
Do not execute this test case on s390 KVM and z/VM machines.

Output after:
  # ./perf test 84 86
  84: perf stat JSON output linter          : Ok
  86: perf stat STD output linter           : Ok
  #

Fixes: 45a86d017a ("perf test: Add --metric-only to perf stat output tests")
Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Suggested-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20250424133310.37452-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:07 -03:00
Ian Rogers
68cb156743 perf tool_pmu: Fix aggregation on duration_time
evsel__count_has_error() fails counters when the enabled or running time
are 0. The duration_time event reads 0 when the cpu_map_idx != 0 to
avoid aggregating time over CPUs. Change the enable and running time
to always have a ratio of 100% so that evsel__count_has_error won't
fail.

Before:
```
$ sudo /tmp/perf/perf stat --per-core -a -M UNCORE_FREQ sleep 1

 Performance counter stats for 'system wide':

S0-D0-C0              1      2,615,819,485      UNC_CLOCK.SOCKET                 #     2.61 UNCORE_FREQ
S0-D0-C0              2      <not counted>      duration_time

       1.002111784 seconds time elapsed
```

After:
```
$ perf stat --per-core -a -M UNCORE_FREQ sleep 1

 Performance counter stats for 'system wide':

S0-D0-C0              1        758,160,296      UNC_CLOCK.SOCKET                 #     0.76 UNCORE_FREQ
S0-D0-C0              2      1,003,438,246      duration_time

       1.002486017 seconds time elapsed
```

Note: the metric reads the value a different way and isn't impacted.

Fixes: 240505b2d0 ("perf tool_pmu: Factor tool events into their own PMU")
Reported-by: Stephane Eranian <eranian@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250423050358.94310-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:04 -03:00
Chun-Tse Shao
b1b26ce8bb perf session: Skip unsupported new event types
`perf report` currently halts with an error when encountering
unsupported new event types (`event.type >= PERF_RECORD_HEADER_MAX`).

This patch modifies the behavior to skip these samples and continue
processing the remaining events.

Additionally, stops reporting if the new event size is not 8-byte
aligned.

Suggested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250414173921.2905822-1-ctshao@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:32:01 -03:00
Namhyung Kim
0ef8091f17 perf hist: Allow custom output fields in hierarchy mode
Now it can handle multiple output fields and sort keys in separate
levels, so it should be ok to use it in the hierarchy mode.  This
allows fully customized output format.

  $ perf report -F latency,comm,parallelism -H --stdio
  ...
  #     Latency  Command / Parallelism
  # ...........  .....................
  #
      31.84%     cc1
         29.96%     5
          1.24%     4
          0.37%     6
          0.26%     3
          0.02%     2
      24.68%     as
         22.39%     5
          1.12%     2
          0.98%     4
          0.12%     3
          0.07%     6
          ...

Committer testing:

Before:

  $ perf report -F latency,comm,parallelism -H --stdio
  Error: --hierarchy and --fields options cannot be used together

   Usage: perf report [<options>]

      -F, --fields <key[,keys...]>
                            output field(s): overhead latency overhead_sys overhead_us
  			  overhead_guest_sys overhead_guest_us overhead_children
  			  latency_children sample period weight1 weight2 weight3
  <SNIP>
      -H, --hierarchy       Show entries in a hierarchy
  $

After:

  $ perf report -F latency,comm,parallelism -H --stdio
  # Total Lost Samples: 0
  #
  # Samples: 1K of event 'cycles:Pu'
  # Event count (approx.): 1581450138
  #
  #     Latency  Command / Parallelism
  # ...........  .....................
  #
      97.66%     git
         96.95%     1
          0.55%     2
          0.04%     5
          0.03%     8
          0.03%     4
          0.02%     3
          0.01%     9
          0.01%     7
          0.01%     6
          0.01%     10
          0.00%     12
       2.34%     git-remote-http
          2.24%     1
          0.07%     5
          0.02%     2
          0.00%     4

  #
  # (Tip: To analyze particular parallelism levels, try: perf report --latency --parallelism=32-64)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:58 -03:00
Namhyung Kim
390627dda7 perf hist: Set levels in output_field_add()
It turns out that the output fields didn't consider the hierarchy mode
and put all the fields in the same level.  To support hierarchy, each
non-output field should be in a separate level.

Pass a pointer to level to output_field_add() and make it increase the
level when it sees non-output fields.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:54 -03:00
Namhyung Kim
b09124e2e1 perf hist: Remove formats in hierarchy when cancel latency
Likewise, it should remove latency output fields in hierarchy list.
Pass evlist to perf_hpp__cancel_latency() to handle them properly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:51 -03:00
Namhyung Kim
dbd11b6bda perf hist: Remove formats in hierarchy when cancel children
This is to support hierarchy options with custom output fields.
Currently perf_hpp__cancel_cumulate() only removes accumulated
overhead and latency fields from the global perf_hpp_list.

This is not used in the hierarchy mode because each evsel's hist
has its own separate hpp_list.  So it needs to remove the fields
from the lists too.  Pass evlist to the function so that it can
iterate the evsels.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250331073722.4695-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:47 -03:00
Ian Rogers
92504d927d perf record: Retirement latency cleanup in evsel__config
'perf record' will fail with retirement latency events as the open
doesn't do a perf_event_open system call.

Use evsel__config() to set up such events for recording by removing the
flag and enabling sample weights - the sample weights containing the
retirement latency.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:44 -03:00
Ian Rogers
fc807b6bde perf pmu-events: Add retirement latency to JSON events inside of perf
The updated Intel vendor events add retirement latency for
graniterapids:

  https://lore.kernel.org/lkml/20250322063403.364981-14-irogers@google.com/

This change makes those values available within an alias/event within a
PMU and saves them into the evsel at event parse time.

When no TPEBS data is available the default values are substituted in
for TMA metrics that are using retirement latency events - currently
just those on graniterapids.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:39 -03:00
Ian Rogers
f19306f065 perf stat: Add mean, min, max and last --tpebs-mode options
Add command line configuration option for how retirement latency
events are combined.

The default "mean" gives the average of retirement latency.

"min" or "max" give the smallest or largest retirment latency times
respectively.

"last" uses the last retirment latency sample's time.

Committer notes:

Enclose parse_tpebs_mode() under HAVE_ARCH_X86_64_SUPPORT to match the
ifdef block where it is used, fixing the build in systems like:

  20     5.60 debian:experimental-x-mips    : FAIL gcc version 14.2.0 (Debian 14.2.0-1)
    builtin-stat.c:2330:12: error: 'parse_tpebs_mode' defined but not used [-Werror=unused-function]
     2330 | static int parse_tpebs_mode(const struct option *opt, const char *str,
          |            ^~~~~~~~~~~~~~~~

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:36 -03:00
Ian Rogers
3533b56d22 perf intel-tpebs: Use stats for retirement latency statistics
struct stats provides access to mean, min and max.

It also provides uniformity with statistics code used elsewhere in perf.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:32 -03:00
Ian Rogers
1ddf95f6d8 perf intel-tpebs: Don't close record on read
Factor sending record control fd code into its own function.

Rather than killing the record process send it a ping when reading.

Timeouts were witnessed if done too frequently, so only ping for the
first tpebs events.

Don't kill the record command send it a stop command.

As close isn't reliably called also close on evsel__exit.

Add extra checks on the record being terminated to avoid warnings.

Adjust the locking as needed and incorporate extra -Wthread-safety
checks.

Check to do six 500ms poll timeouts when sending commands, rather than
the larger 3000ms, to allow the record process terminating to be better
witnessed.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:28 -03:00
Ian Rogers
8174392049 perf intel-tpebs: Add mutex for tpebs_results
Ensure sample reader isn't racing with events being added/removed.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:26 -03:00
Ian Rogers
ea61db61d9 perf intel-tpebs: Add support for updating counts in evsel__tpebs_read
Rename to reflect evsel argument and for consistency with other tpebs
functions.

Update count from prev_raw_counts when available.

Eventually this will allow inteval mode support.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:22 -03:00
Ian Rogers
bb1c0f1b43 perf intel-tpebs: Refactor tpebs_results list
evsel names and metric-ids are used for matching but this can be
problematic, for example, multiple occurrences of the same retirement
latency event become a single event for the record.

Change the name of the record events so they are unique and reflect the
evsel of the retirement latency event that opens them (the retirement
latency event's evsel address is embedded within them).

This allows an evsel based close to close the event when the retirement
latency event is closed.

This is important as 'perf stat' has an evlist and the session listen to
the record events has an evlist, knowing which event should remove the
tpebs_retire_lat can't be tied to an evlist list as there is more than
1, so closing which evlist should cause the tpebs to stop?

Using the evsel and the last one out doing the tpebs_stop is cleaner.

Committer notes:

Fix the build on 32-bit systems by using unsigned long when converting
pointers to integers instead of uint64_t. Fixes:

  20     4.97 debian:experimental-x-mips    : FAIL gcc version 14.2.0 (Debian 14.2.0-13)
    util/intel-tpebs.c: In function 'tpebs_retire_lat__find':
    util/intel-tpebs.c:377:21: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]
      377 |                 if ((uint64_t)t->evsel == num)
          |                     ^
    cc1: all warnings being treated as errors

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:19 -03:00
Ian Rogers
07c3532033 perf intel-tpebs: Ensure events are opened, factor out finding
Factor out finding an tpebs_retire_lat from an evsel.

Don't blindly return when ignoring an open request, which happens after
the first open request, ensure the event was started on a fork of perf
record.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:15 -03:00
Ian Rogers
84e629143b perf intel-tpebs: Inline get_perf_record_args
Code is short enough to be inlined and there are no error cases when
made inline.

Make the implicit NULL pointer at the end of the argv explicit.

Move the fixed number of arguments before the variable number of
arguments.

Correctly size the argv allocation and zero when feeing to avoid a
dangling pointer.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:10 -03:00
Ian Rogers
728756fffb perf intel-tpebs: Reduce scope of the tpebs_events_size variable
Moved to record argument computation rather than being global.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:07 -03:00
Ian Rogers
24fead56eb perf intel-tpebs: Move the cpumap_buf variable out of evsel__tpebs_open()
The buffer holds the cpumap to pass to the 'perf record' command, so
move it down to the 'perf record' function.

Make this function an evsel function given the need for the evsel for
the cpumap.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:04 -03:00
Ian Rogers
b009b51eea perf intel-tpebs: Separate evsel__tpebs_prepare() out of evsel__tpebs_open()
Separate the creation of the tpebs_retire_lat result out of the opening
step.

This is in preparation for adding a prepare operation for evlists.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:31:01 -03:00
Ian Rogers
2332f68254 perf intel-tpebs: Rename tpebs_start to evsel__tpebs_open
Try to add more consistency to evsel by having tpebs_start renamed to
evsel__tpebs_open, passing the evsel that is being opened. The unusual
behavior of evsel__tpebs_open opening all events on the evlist is kept
and will be cleaned up further in later patches. The comments are
cleaned up as tpebs_start isn't called from evlist.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:57 -03:00
Ian Rogers
9e0ef3ec62 perf intel-tpebs: Simplify tpebs_cmd
No need to dynamically allocate when there is 1. tpebs_pid duplicates
tpebs_cmd.pid, so remove. Use 0 as the uninitialized value (PID == 0
is reserved for the kernel) rather than -1.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:54 -03:00
Ian Rogers
eb493c28e9 perf intel-tpebs: Cleanup header
Remove arch conditional compilation. Arch conditional compilation
belongs in the arch/ directory.

Tidy header guards to match other files. Remove unneeded includes and
switch to forward declarations when necesary.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250414174134.3095492-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:51 -03:00
Ian Rogers
389048775a perf vendor events: Update westmereep-dp events
Update event topic moving other topic events to cache and virtual
memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-36-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:46 -03:00
Ian Rogers
545a04dd76 perf vendor events: Update westmereep-dp events
Update event topic moving other topic events to cache and virtual
memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-35-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:41 -03:00
Ian Rogers
2c8e1c3526 perf vendor events: Update westmereep-dp events
Update event topic moving other topic events to cache and virtual
memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-34-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:37 -03:00
Ian Rogers
f4c0f4e338 perf vendor events: Update tigerlake metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-33-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:32 -03:00
Ian Rogers
3166129624 perf vendor events: Update snowridgex events
Update event topic moving other topic events to cache and memory. Add
PDIST counter into descriptions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-32-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:28 -03:00
Ian Rogers
b8b16293ce perf vendor events: Update skylakex events/metrics
Update event topics, metrics to be generated from the TMA spreadsheet
and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-31-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:23 -03:00
Ian Rogers
60bcad5592 perf vendor events: Update skylake metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-30-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:18 -03:00
Ian Rogers
b27d90f587 perf vendor events: Update sierraforest events/metrics
Update events from v1.08 to v1.09.

Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups. The use of the spreadsheet for conversion has added thresholds.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-29-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:14 -03:00
Ian Rogers
73c66d36d0 perf vendor events: Update sapphirerapids events/metrics
Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-28-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:09 -03:00
Ian Rogers
9873746f47 perf vendor events: Update sandybridge metrics
Update TMA metrics from 4.8 to 5.02. Move INSTS_WRITTEN_TO_IQ.INSTS to
the frontend topic.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-27-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:30:04 -03:00
Ian Rogers
e311e8a2d7 perf vendor events: Update rocketlake events/metrics
Update event topics, metrics to be generated from the TMA spreadsheet
and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-26-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:59 -03:00
Ian Rogers
bce986466f perf vendor events: Update nehalemex events
Update event topic moving other topic events to cache and virtual
memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:54 -03:00
Ian Rogers
c7453cb57b perf vendor events: Update nehalemep events
Update event topic moving other topic events to cache and virtual
memory.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-24-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:50 -03:00
Ian Rogers
8ff6e2626f perf vendor events: Update meteorlake events/metrics
Update events from v1.12 to v1.13.
Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-23-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:45 -03:00
Ian Rogers
3af9e6879d perf vendor events: Update lunarlake events/metrics
Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-22-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:41 -03:00
Ian Rogers
e7972827fc perf vendor events: Update jaketown metrics
Update TMA metrics from 4.8 to 5.02. Move INSTS_WRITTEN_TO_IQ.INSTS to
the frontend topic.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-21-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:36 -03:00
Ian Rogers
61077e5e92 perf vendor events: Update ivytown metrics
Update TMA metrics from 4.8 to 5.02.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-20-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:32 -03:00
Ian Rogers
49fb6e0afd perf vendor events: Update ivybridge metrics
Update TMA metrics from 4.8 to 5.02.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-19-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:27 -03:00
Ian Rogers
4fdd931244 perf vendor events: Update icelakex events/metrics
Update event topics, metrics to be generated from the TMA spreadsheet
and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-18-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:23 -03:00
Ian Rogers
c9208b9c33 perf vendor events: Update icelake events/metrics
Update event topics, metrics to be generated from the TMA spreadsheet
and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-17-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:18 -03:00
Ian Rogers
ed23ac434e perf vendor events: Update haswellx metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-16-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:14 -03:00
Ian Rogers
569ab2e020 perf vendor events: Update haswell metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-15-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:09 -03:00
Ian Rogers
82acba742d perf vendor events: Add graniterapids retirement latencies
Add retirement latencies for use in place of retirement latency events.
Update events from v1.06 to v1.08.
Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:04 -03:00
Ian Rogers
d1ed58570e perf vendor events: Update grandridge events/metrics
Update events from v1.05 to v1.07. Update event topics, addition of
PDIST counter into descriptions, metrics to be generated from the TMA
spreadsheet and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:29:00 -03:00
Ian Rogers
4ecf9eab4a perf vendor events: Update emeraldrapids events/metrics
Update event topics, addition of PDIST counter into descriptions,
metrics to be generated from the TMA spreadsheet and other small clean
ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:55 -03:00
Ian Rogers
fa3498cb86 perf vendor events: Update elkhartlake events
Update event topic moving other topic events to cache and memory. Add
PDIST counter into descriptions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:45 -03:00
Ian Rogers
c4ba122a7e perf vendor events: Update clearwaterforest events
Update event topic of OCR.DEMAND_DATA_RD.ANY_RESPONSE and
OCR.DEMAND_RFO.ANY_RESPONSE to be cache. Add PDIST counter into
descriptions.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:41 -03:00
Ian Rogers
48660e9cc9 perf vendor events: Update cascadelakex events/metrics
Update event topics, metrics to be generated from the TMA spreadsheet
and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:36 -03:00
Ian Rogers
29c35b735a perf vendor events: Update broadwellx metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:31 -03:00
Ian Rogers
307cf0cc72 perf vendor events: Update broadwellde metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:27 -03:00
Ian Rogers
3040656ed7 perf vendor events: Update broadwell metrics
Switch to metrics generated from the TMA spreadsheet. Minor threshold
simplification.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:22 -03:00
Ian Rogers
0b84b6fc35 perf vendor events: Update bonnell events
Move DISPATCH_BLOCKED.ANY to the pipeline topic.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:17 -03:00
Ian Rogers
fd3dfa4b82 perf vendor events: Update arrowlake events/metrics
Update events from v1.07 to v1.08. Update event topics, addition of
PIST counter into descriptions, metrics to be generated from the TMA
spreadsheet and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:12 -03:00
Ian Rogers
4ab1fef5dc perf vendor events: Update AlderlakeN events/metrics
Update events from v1.28 to v1.29. Update event topics, addition of
PDIST counter into descriptions, metrics to be generated from the TMA
spreadsheet and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:28:05 -03:00
Ian Rogers
30f2a75e7e perf vendor events: Update alderlake events/metrics
Update events from v1.28 to v1.29. Update event topics, addition of
PDIST counter into descriptions, metrics to be generated from the TMA
spreadsheet and other small clean ups.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andreas Färber <afaerber@suse.de>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20250328175006.43110-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-25 12:27:59 -03:00
James Clark
bfb713ea53 perf tools: Fix arm64 build by generating unistd_64.h
Since pulling in the kernel changes in commit 22f72088ff ("tools
headers: Update the syscall table with the kernel sources"), arm64 is
no longer using a generic syscall header and generates one from the
syscall table. Therefore we must also generate the syscall header for
arm64 before building Perf.

Add it as a dependency to libperf which uses one syscall number. Perf
uses more, but as libperf is a dependency of Perf it will be generated
for both.

Future platforms that need this will have to add their own syscall-y
targets in libperf manually. Unfortunately the arch specific files that
do this (e.g. arch/arm64/include/asm/Kbuild) can't easily be imported
into the Perf build. But Perf only needs a subset of the generated files
anyway, so redefining them is probably the correct thing to do.

Fixes: 22f72088ff ("tools headers: Update the syscall table with the kernel sources")
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
Link: https://lore.kernel.org/r/20250417-james-perf-fix-gen-syscall-v1-1-1d268c923901@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-23 08:57:12 -07:00
Arnaldo Carvalho de Melo
3a320eada5 Merge remote-tracking branch 'torvalds/master' into perf-tools-next
Sync with upstream to pick up the perf-tools patches that updates the
header files copies to address the check_header.sh warnings. There are
also some libbpf updates, better pick those to be on the same page with
libbpf since perf uses it in various places.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-16 10:17:04 -03:00
Ingo Molnar
3846389c03 x86/platform/amd: Move the <asm/amd-ibs.h> header to <asm/amd/ibs.h>
Collect AMD specific platform header files in <asm/amd/*.h>.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mario Limonciello <superm1@kernel.org>
Link: https://lore.kernel.org/r/20250413084144.3746608-2-mingo@kernel.org
2025-04-14 09:31:47 +02:00
Namhyung Kim
2b70702917 perf tools: Remove evsel__handle_error_quirks()
The evsel__handle_error_quirks() is to fixup invalid event attributes on
some architecture based on the error code.  Currently it's only used for
AMD to disable precise_ip not to use IBS which has more restrictions.

But the commit c33aea446b changed call evsel__precise_ip_fallback
for any errors so there's no difference with the above function.  To
make matter worse, it caused a problem with branch stack on Zen3.

The IBS doesn't support branch stack so it should use a regular core
PMU event.  The default event is set precise_max and it starts with 3.
And evsel__precise_ip_fallback() tries with it and reduces the level one
by one.  At last it tries with 0 but it also failed on Zen3 since the
branch stack is not supported for the cycles event.

At this point, evsel__precise_ip_fallback() restores the original
precise_ip value (3) in the hope that it can succeed with other modifier
(like exclude_kernel).  Then evsel__handle_error_quirks() see it has
precise_ip != 0 and make it retry with 0.  This created an infinite
loop.

Before:

  $ perf record -b -vv |& grep removing
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  removing precise_ip on AMD
  ...

After:

  $ perf record -b true
  Error:
  Failure to open event 'cycles:P' on PMU 'cpu' which will be removed.
  Invalid event (cycles:P) in per-thread mode, enable system wide with '-a'.
  Error:
  Failure to open any events for recording.

Fixes: c33aea446b ("perf tools: Fix precise_ip fallback logic")
Tested-by: Chun-Tse Shao <ctshao@google.com>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250410010252.402221-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-11 09:22:49 -07:00
Arnaldo Carvalho de Melo
1293dacbbd perf libunwind arm64: Fix missing close parens in an if statement
While testing building with libunwind (using LIBUNWIND=1) in various
arches I noticed a problem on arm64, on an rpi5 system, a missing close
parens in a change related to dso__data_get_fd() usage, fix it.

Fixes: 5ac22c35aa ("perf dso: Use lock annotations to fix asan deadlock")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/Z_Z3o8KvB2i5c6ab@x1
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 17:51:31 -07:00
Namhyung Kim
7f56978e58 tools headers: Update the arch/x86/lib/memset_64.S copy with the kernel sources
To pick up the changes in:

  2981557cb0 x86,kcfi: Fix EXPORT_SYMBOL vs kCFI

That required adding a copy of include/linux/cfi_types.h and its checking
in tools/perf/check-headers.h.

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20250410001125.391820-11-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:25 -07:00
Namhyung Kim
7470998187 tools headers: Update the linux/unaligned.h copy with the kernel sources
To pick up the changes in:

  3846699217 ALSA: rawmidi: Make tied_device=0 as default / unknown
  7bb49d2e8b ALSA: rawmidi: Bump protocol version to 2.0.5
  b8fefed73a ALSA: rawmidi: Show substream activity in info ioctl
  bdf46443f3 ALSA: rawmidi: Expose the tied device number in info ioctl

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/sound/asound.h include/uapi/sound/asound.h

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: linux-sound@vger.kernel.org
Link: https://lore.kernel.org/r/20250410001125.391820-9-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:25 -07:00
Namhyung Kim
df4bd8c76d tools headers: Update the uapi/linux/prctl.h copy with the kernel sources
To pick up the changes in:

  ec2d0c0462 posix-timers: Provide a mechanism to allocate a given timer ID

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/perf/trace/beauty/include/uapi/linux/prctl.h include/uapi/linux/prctl.h

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20250410001125.391820-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:24 -07:00
Namhyung Kim
22f72088ff tools headers: Update the syscall table with the kernel sources
To pick up the changes in:

  c4a16820d9 fs: add open_tree_attr()
  2df1ad0d25 x86/arch_prctl: Simplify sys_arch_prctl()
  e632bca07c arm64: generate 64-bit syscall.tbl

This is basically to support the new open_tree_attr syscall.  But it
also needs to update asm-generic unistd.h header to get the new syscall
number.  And arm64 unistd.h header was converted to use the generic
64-bit header.

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/scripts/syscall.tbl scripts/syscall.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
    diff -u tools/perf/arch/arm/entry/syscalls/syscall.tbl arch/arm/tools/syscall.tbl
    diff -u tools/perf/arch/sh/entry/syscalls/syscall.tbl arch/sh/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/sparc/entry/syscalls/syscall.tbl arch/sparc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/xtensa/entry/syscalls/syscall.tbl arch/xtensa/kernel/syscalls/syscall.tbl
    diff -u tools/arch/arm64/include/uapi/asm/unistd.h arch/arm64/include/uapi/asm/unistd.h
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: linux-arch@vger.kernel.org
Link: https://lore.kernel.org/r/20250410001125.391820-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:24 -07:00
Namhyung Kim
af74e5fe74 tools headers: Update the VFS headers with the kernel sources
To pick up the changes in:

  7ed6cbe0f8 fs: add STATX_DIO_READ_ALIGN
  8fc7e23a9b fs: reformat the statx definition
  a5874fde3c exec: Add a new AT_EXECVE_CHECK flag to execveat(2)
  1ebd4a3c09 blk-crypto: add ioctls to create and prepare hardware-wrapped keys
  af6505e574 fs: add RWF_DONTCACHE iocb and FOP_DONTCACHE file_operations flag
  10783d0ba0 fs, iov_iter: define meta io descriptor
  8f6116b5b7 statmount: add a new supported_mask field
  37c4a9590e statmount: allow to retrieve idmappings

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h
    diff -u tools/perf/trace/beauty/include/uapi/linux/stat.h include/uapi/linux/stat.h
    diff -u tools/perf/trace/beauty/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
    diff -u tools/perf/trace/beauty/include/uapi/linux/fs.h include/uapi/linux/fs.h
    diff -u tools/perf/trace/beauty/include/uapi/linux/mount.h include/uapi/linux/mount.h

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/r/20250410001125.391820-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:24 -07:00
Namhyung Kim
9dbe66640f tools headers: Update the socket headers with the kernel sources
To pick up the changes in:

  64e844505b include: uapi: protocol number and packet structs for AGGFRAG in ESP
  18912c5206 tcp: devmem: don't write truncated dmabuf CMSGs to userspace

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/in.h include/uapi/linux/in.h
    diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Please see tools/include/uapi/README for further details.

Acked-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20250410001125.391820-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-04-10 09:28:24 -07:00
Arnaldo Carvalho de Melo
1741189d84 perf ui browser hists: Set actions->thread before calling do_zoom_thread()
In 7cecb7fe83 ("perf hists: Move sort__has_comm into struct
perf_hpp_list") it assumes that act->thread is set prior to calling
do_zoom_thread().

This doesn't happen when we use ESC or the Left arrow key to Zoom out of
a specific thread, making this operation not to work and we get stuck
into the thread zoom.

In 6422184b08 ("perf hists browser: Simplify zooming code using
pstack_peek()") it says no need to set actions->thread, and at that
point that was true, but in 7cecb7fe83 a actions->thread == NULL
check was added before the zoom out of thread could kick in.

We can zoom out using the alternative 't' thread zoom toggle hotkey to
finally set actions->thread before calling do_zoom_thread() and zoom
out, but lets also fix the ESC/Zoom out of thread case.

Fixes: 7cecb7fe83 ("perf hists: Move sort__has_comm into struct perf_hpp_list")
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:47:06 -03:00
Arnaldo Carvalho de Melo
fd889776df perf ui browser hists: Simplify the routines that add entries to the popup menu
Some don't need some args, ditch them, also struct popup_actions->evsel
isn't needed as it is always obtainable from hists_to_evsel(browser->hists).

This way we simplify debugging by reducing this needless complexity.

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_dkNDj9EPFwPqq1@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:47 -03:00
Arnaldo Carvalho de Melo
92c48ec231 perf ui browser: Accept the left arrow key as a Zoom out if done on the first column
In the past (like before 2015) we had the familiar workflow of using
ENTER to get the menu and then zoom on a pid or DSO, kernel, etc and
then use UP and DOWN to navigate further, etc, and then when wanting to
go to the previous level, i.e. to Zoom out, use the LEFT arrow key.

This way the right hand stays in the arrow keys block that is near the
enter and we can go around real quickly.

But then, when we started supporting horizontal scrolling by columns to
support things like 'perf c2c report' that has lots of columns, we
switched to using the LEFT key exclusively for that, horizontal
scrolling, requiring the user to press 'm' to get a "context menu" that
then would allow users to select the Zoom out operation.

Ingo recently reported this as not intuitive, which is true, so lets
overload the LEFT key with both meanings, by doing a Zoom out operation
if the LEFT key is pressed when we're on the first column, but use it
also for horizontal scrolling if it is pressed when the cursor is on
column > 1.

Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:41 -03:00
Arnaldo Carvalho de Melo
bf5ea13bae perf ui browser annotate: Don't show the source code view status initially
To avoid initial clutter, and not to change the view users that are not
interested in toggling the source code view, just show it when the user
does the first toggle keypress (pressing 's').

I know that there are users that really disable the source code view by
using:

  # perf config annotate.hide_src_code=yes

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:32 -03:00
Arnaldo Carvalho de Melo
c6043d35c0 perf ui browser annotate: Show in the title the source code view toggle
Ingo reported that having a visual cue if the source code view is
enabled will help in noticing a bug when no source is presented.

Change the title scnprintf routine for the annotation browser to do
that.

More work is needed to have the capabilities of the existing
disassemblers listed somehow and start using the fastest one but switch
to another that provides features only made available by some particular
one, like the first one, the objdump output parsing one.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:29 -03:00
Arnaldo Carvalho de Melo
5e74ec5ccf perf ui browser map: Provide feedback on unhandled hotkeys
Don't just eat unknown keys without providing visual feedback and
instructions on how to see which ones are assigned.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:26 -03:00
Arnaldo Carvalho de Melo
3ceca92c28 perf ui browser hists: Provide feedback on unhandled hotkeys
Don't just eat unknown keys without providing visual feedback and
instructions on how to see which ones are assigned.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:46:19 -03:00
Arnaldo Carvalho de Melo
62787c90b8 perf ui browser header: Provide feedback on unhandled hotkeys
Don't just eat unknown keys without providing visual feedback and
instructions on how to see which ones are assigned.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:57 -03:00
Arnaldo Carvalho de Melo
6b76a42cc5 perf ui browser annotate: Provide feedback on unhandled hotkeys
Don't just eat unknown keys without providing visual feedback and
instructions on how to see which ones are assigned.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:52 -03:00
Arnaldo Carvalho de Melo
340376534d perf ui browser annotate-data: Provide feedback on unhandled hotkeys
Don't just eat unknown keys without providing visual feedback and
instructions on how to see which ones are assigned.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:48 -03:00
Arnaldo Carvalho de Melo
e2fbc15124 perf ui browser: Add a warn on unhandled hotkey helper
Will be used by the various browsers, looks like:

                                        ┌─Warning!───────────────────────────────────────────┐
  29.15 │ 3c:  cmpl    $0x3,(%rbx)      │                                                    │
   3.77 │    ↓ je      51               │'Ctrl+V' key not associated, use 'h' to see actions!│
  16.59 │      movq    0x3b0(%rbx),%rdx │                                                    │
  30.65 │      movl    0x8(%rdx),%eax   │                                                    │
   3.77 │      cmpl    %eax,%edi        │Press any key...                                    │
   0.25 │    ↓ jb      82               └────────────────────────────────────────────────────┘

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:43 -03:00
Arnaldo Carvalho de Melo
f165523931 perf ui browser: Add key_name() helper
We'll use it to show unhandled keys in the various TUI browsers.

Suggested-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Z_TYux5fUg2pW-pF@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:37 -03:00
Arnaldo Carvalho de Melo
a059373a9a perf check: Add tip about building with libbfd using BUILD_NONDISTRO=1
There are still some cases, for instance to disassembly BPF code that
needs this, so until someone works on supporting that, we're keeping the
code but it is opt-in.

Make that clear on the 'perf version --build-options' and other ways to
query if a feature is present:

  $ perf check feature libbfd
                libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT ( tip: Deprecated, license incompatibility, use BUILD_NONDISTRO=1 and install binutils-dev[el] )
  $ perf -vv | grep bfd
                libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT ( tip: Deprecated, license incompatibility, use BUILD_NONDISTRO=1 and install binutils-dev[el] )
  $

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/Z_dkNDj9EPFwPqq1@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:15 -03:00
Arnaldo Carvalho de Melo
4fce4b91fd perf build: Warn when libdebuginfod devel files are not available
While working on 'perf version --build-options' I noticed that:

  $ perf version --build-options
  perf version 6.15.rc1.g312a07a00d31
                   aio: [ on  ]  # HAVE_AIO_SUPPORT
                   bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
         bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL
            debuginfod: [ OFF ]  # HAVE_DEBUGINFOD_SUPPORT
  <SNIP>

And looking at tools/perf/Makefile.config I also noticed that it is not
opt-in, meaning we will attempt to build with it in all normal cases.

So add the usual warning at build time to let the user know that
something recommended is missing, now we see:

  Makefile.config:563: No elfutils/debuginfod.h found, no debuginfo server support, please install elfutils-debuginfod-client-devel or equivalent

And after following the recommendation:

  $ perf check feature debuginfod
            debuginfod: [ on  ]  # HAVE_DEBUGINFOD_SUPPORT
  $ ldd ~/bin/perf | grep debuginfo
	libdebuginfod.so.1 => /lib64/libdebuginfod.so.1 (0x00007fee5cf5f000)
  $

With this feature on several perf tools will fetch what is needed and
not require all the contents of the debuginfo packages, for instance:

  # rpm -qa | grep kernel-debuginfo
  # pahole --running_kernel_vmlinux
  pahole: couldn't find a vmlinux that matches the running kernel
  HINT: Maybe you're inside a container or missing a debuginfo package?
  #
  # perf trace -e open* perf probe --vars icmp_rcv
      0.000 ( 0.005 ms): perf/97391 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) = 3
      0.014 ( 0.004 ms): perf/97391 openat(dfd: CWD, filename: "/lib64/libm.so.6", flags: RDONLY|CLOEXEC) = 3
  <SNIP>
  32130.100 ( 0.008 ms): perf/97391 openat(dfd: CWD, filename: "/root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo") = 3
  <SNIP>
  Available variables at icmp_rcv
        @<icmp_rcv+0>
                struct sk_buff* skb
  <SNIP>
  #
  # pahole --running_kernel_vmlinux
  /root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo
  # file /root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo
  /root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=aa3c82b4a13f9c0e0301bebb20fe958c4db6f362, with debug_info, not stripped
  # ls -la /root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo
  -r--------. 1 root root 475401512 Mar 27 21:00 /root/.cache/debuginfod_client/aa3c82b4a13f9c0e0301bebb20fe958c4db6f362/debuginfo
  #

Then, cached:

  # perf stat --null perf probe --vars icmp_rcv
  Available variables at icmp_rcv
        @<icmp_rcv+0>
                struct sk_buff* skb

   Performance counter stats for 'perf probe --vars icmp_rcv':

       0.671389041 seconds time elapsed

       0.519176000 seconds user
       0.150860000 seconds sys

Fixes: c7a14fdcb3 ("perf build-ids: Fall back to debuginfod query if debuginfo not found")
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Link: https://lore.kernel.org/r/Z_dkNDj9EPFwPqq1@gmail.com
[ Folded patch from Ingo to have the debian/ubuntu devel package added build warning message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:45:02 -03:00
Arnaldo Carvalho de Melo
31d7e60063 perf check: Allow showing a tip for opt-in features not built into perf
Ingo reported that it was difficult to understand why libunwind support
didn't link even when he had the usual libunwind-dev files installed in
his machine.

This is because libunwind became opt-in, the user has to use
LIBUNWIND=1, as it was deemed stalled in its development/unsuitable for
use with perf, IIRC, and so we better use the elfutils equivalent
routine that we also supported for ages.

But the build message still printed:

  Auto-detecting system features:
  ...                                   libdw: [ on  ]
  ...                                   glibc: [ on  ]
  <SNIP>
  ...                               libcrypto: [ on  ]
  ...                               libunwind: [ OFF ]
  <SNIP>

Which is confusing, so allow for having a tip when 'perf version
--build-options' is used, and variants with 'perf check feature':

  $ perf version --build-options | grep libunwind
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
  $
  $ perf check feature libunwind
             libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT ( tip: Deprecated, use LIBUNWIND=1 and install libunwind-dev[el] to build with it )
  $

The next patches will remove the opt-in libunwind FEATURES_DISPLAY.

Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/Z--pWmTHGb62_83e@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:44:42 -03:00
Arnaldo Carvalho de Melo
0afeacbc8d perf check: Move the FEATURE_STATUS() macro to its only user source file
It is just 'perf check' that uses that macro, to initialize the list of
features built into perf, so move it there.

This also avoids depending on things that are not included from
builtin.h, like is_builtin(), the CONFIG_ macros, etc, that are all
included in 'builtin-check.c' and before where this macro was moved to.

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/Z_dkNDj9EPFwPqq1@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:44:12 -03:00
Arnaldo Carvalho de Melo
6994c6374a perf check: Share the feature status printing routine with 'perf version'
Both had the same open coded functions, share them so that we can add
tips for opt-in features such as libunwind, coresight, etc.

Examples of use:

  $ perf check feature libcapstone
             libcapstone: [ on  ]  # HAVE_LIBCAPSTONE_SUPPORT
  $ perf check feature libunwind
               libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT
  $ perf version --build-options
  perf version 6.15.rc1.g113e3df8ccc5
                     aio: [ on  ]  # HAVE_AIO_SUPPORT
                     bpf: [ on  ]  # HAVE_LIBBPF_SUPPORT
           bpf_skeletons: [ on  ]  # HAVE_BPF_SKEL
              debuginfod: [ OFF ]  # HAVE_DEBUGINFOD_SUPPORT
                   dwarf: [ on  ]  # HAVE_LIBDW_SUPPORT
      dwarf_getlocations: [ on  ]  # HAVE_LIBDW_SUPPORT
            dwarf-unwind: [ on  ]  # HAVE_DWARF_UNWIND_SUPPORT
                auxtrace: [ on  ]  # HAVE_AUXTRACE_SUPPORT
                  libbfd: [ OFF ]  # HAVE_LIBBFD_SUPPORT
             libcapstone: [ on  ]  # HAVE_LIBCAPSTONE_SUPPORT
               libcrypto: [ on  ]  # HAVE_LIBCRYPTO_SUPPORT
      libdw-dwarf-unwind: [ on  ]  # HAVE_LIBDW_SUPPORT
                  libelf: [ on  ]  # HAVE_LIBELF_SUPPORT
                 libnuma: [ on  ]  # HAVE_LIBNUMA_SUPPORT
              libopencsd: [ on  ]  # HAVE_CSTRACE_SUPPORT
                 libperl: [ on  ]  # HAVE_LIBPERL_SUPPORT
                 libpfm4: [ on  ]  # HAVE_LIBPFM
               libpython: [ on  ]  # HAVE_LIBPYTHON_SUPPORT
                libslang: [ on  ]  # HAVE_SLANG_SUPPORT
           libtraceevent: [ on  ]  # HAVE_LIBTRACEEVENT
               libunwind: [ OFF ]  # HAVE_LIBUNWIND_SUPPORT
                    lzma: [ on  ]  # HAVE_LZMA_SUPPORT
  numa_num_possible_cpus: [ on  ]  # HAVE_LIBNUMA_SUPPORT
                    zlib: [ on  ]  # HAVE_ZLIB_SUPPORT
                    zstd: [ on  ]  # HAVE_ZSTD_SUPPORT
  $

Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/Z_Rz10stoLzBocIO@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:44:04 -03:00
Arnaldo Carvalho de Melo
6559b83e4e tools build: Don't set libunwind as available if test-all.c build succeeds
The tools/build/feature/test-all.c file tries to detect the expected,
most common set of libraries/features we expect to have available to
build perf with.

At some point libunwind was deemed not to be part of that set of
libries, but the patches making it to be opt-in ended up forgetting some
details, fix one more.

Testing it:

  $ rm -rf /tmp/build/$(basename $PWD)/ ; mkdir -p /tmp/build/$(basename $PWD)/
  $ rpm -q libunwind-devel
  libunwind-devel-1.8.0-3.fc40.x86_64
  $ make -k LIBUNWIND=1 CORESIGHT=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin |& grep unwind && ldd ~/bin/perf | grep unwind
  ...                               libunwind: [ on  ]
    CC      /tmp/build/perf-tools-next/arch/x86/tests/dwarf-unwind.o
    CC      /tmp/build/perf-tools-next/arch/x86/util/unwind-libunwind.o
    CC      /tmp/build/perf-tools-next/util/arm64-frame-pointer-unwind-support.o
    CC      /tmp/build/perf-tools-next/tests/dwarf-unwind.o
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind-local.o
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind.o
	  libunwind-x86_64.so.8 => /lib64/libunwind-x86_64.so.8 (0x00007f615a549000)
	  libunwind.so.8 => /lib64/libunwind.so.8 (0x00007f615a52f000)
  $ sudo rpm -e libunwind-devel
  $ rm -rf /tmp/build/$(basename $PWD)/ ; mkdir -p /tmp/build/$(basename $PWD)/
  $ make -k LIBUNWIND=1 CORESIGHT=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin |& grep unwind && ldd ~/bin/perf | grep unwind
  Makefile.config:653: No libunwind found. Please install libunwind-dev[el] >= 1.1 and/or set LIBUNWIND_DIR
  ...                               libunwind: [ OFF ]
    CC      /tmp/build/perf-tools-next/arch/x86/tests/dwarf-unwind.o
    CC      /tmp/build/perf-tools-next/arch/x86/util/unwind-libdw.o
    CC      /tmp/build/perf-tools-next/util/arm64-frame-pointer-unwind-support.o
    CC      /tmp/build/perf-tools-next/tests/dwarf-unwind.o
    CC      /tmp/build/perf-tools-next/util/unwind-libdw.o
  $

Should be in a separate patch, but tired now, so also adding a message
about the need to use LIBUNWIND=1 in the output when its not available,
so done here as well.

So, now when the devel files are not available we get:

  $ make -k LIBUNWIND=1 CORESIGHT=1 O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin |& grep unwind && ldd ~/bin/perf | grep unwind
  Makefile.config:653: No libunwind found. Please install libunwind-dev[el] >= 1.1 and/or set LIBUNWIND_DIR and set LIBUNWIND=1 in the make command line as it is opt-in now
  ...                               libunwind: [ OFF ]
  $

Fixes: 13e17c9ff4 ("perf build: Make libunwind opt-in rather than opt-out")
Reported-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Ingo Molnar <mingo@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/Z_AnsW9oJzFbhIFC@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-04-10 10:43:43 -03:00
Nam Cao
244132c4e5 tracing/timers: Rename the hrtimer_init event to hrtimer_setup
The function hrtimer_init() doesn't exist anymore. It was replaced by
hrtimer_setup().

Thus, rename the hrtimer_init trace event to hrtimer_setup to keep it
consistent.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/all/cba84c3d853c5258aa3a262363a6eac08e2c7afc.1738746927.git.namcao@linutronix.de
2025-04-05 10:30:17 +02:00
Linus Torvalds
802f0d58d5 perf tools changes for v6.15
perf record
 -----------
 * Introduce latency profiling using scheduler information.  The latency
   profiling is to show impacts on wall-time rather than cpu-time.  By
   tracking context switches, it can weight samples and find which part
   of the code contributed more to the execution latency.
 
   The value (period) of the sample is weighted by dividing it by the
   number of parallel execution at the moment.  The parallelism is
   tracked in perf report with sched-switch records.  This will reduce
   the portion that are run in parallel and in turn increase the portion
   of serial executions.
 
   For now, it's limited to profile processes, IOW system-wide profiling
   is not supported.  You can add --latency option to enable this.
 
     $ perf record --latency -- make -C tools/perf
 
   I've run the above command for perf build which adds -j option to
   make with the number of CPUs in the system internally.  Normally
   it'd show something like below:
 
     $ perf report -F overhead,comm
     ...
     #
     # Overhead  Command
     # ........  ...............
     #
         78.97%  cc1
          6.54%  python3
          4.21%  shellcheck
          3.28%  ld
          1.80%  as
          1.37%  cc1plus
          0.80%  sh
          0.62%  clang
          0.56%  gcc
          0.44%  perl
          0.39%  make
 	 ...
 
   The cc1 takes around 80% of the overhead as it's the actual compiler.
   However it runs in parallel so its contribution to latency may be less
   than that.  Now, perf report will show both overhead and latency (if
   --latency was given at record time) like below:
 
     $ perf report -s comm
     ...
     #
     # Overhead   Latency  Command
     # ........  ........  ...............
     #
         78.97%    48.66%  cc1
          6.54%    25.68%  python3
          4.21%     0.39%  shellcheck
          3.28%    13.70%  ld
          1.80%     2.56%  as
          1.37%     3.08%  cc1plus
          0.80%     0.98%  sh
          0.62%     0.61%  clang
          0.56%     0.33%  gcc
          0.44%     1.71%  perl
          0.39%     0.83%  make
 	 ...
 
   You can see latency of cc1 goes down to around 50% and python3 and ld
   contribute a lot more than their overhead.  You can use --latency
   option in perf report to get the same result but ordered by latency.
 
     $ perf report --latency -s comm
 
 perf report
 -----------
 * As a side effect of the latency profiling work, it adds a new output
   field 'latency' and a sort key 'parallelism'.  The below is a result
   from my system with 64 CPUs.  The build was well-parallelized but
   contained some serial portions.
 
     $ perf report -s parallelism
     ...
     #
     # Overhead   Latency  Parallelism
     # ........  ........  ...........
     #
         16.95%     1.54%           62
         13.38%     1.24%           61
         12.50%    70.47%            1
         11.81%     1.06%           63
          7.59%     0.71%           60
          4.33%    12.20%            2
          3.41%     0.33%           59
          2.05%     0.18%           64
          1.75%     1.09%            9
          1.64%     1.85%            5
          ...
 
 * Support Feodra mini-debuginfo which is a LZMA compressed symbol table
   inside ".gnu_debugdata" ELF section.
 
 perf annotate
 -------------
 * Add --code-with-type option to enable data-type profiling with the
   usual annotate output.  Instead of focusing on data structure, it
   shows code annotation together with data type it accesses in case the
   instruction refers to a memory location (and it was able to resolve
   the target data type).  Currently it only works with --stdio.
 
     $ perf annotate --stdio --code-with-type
     ...
      Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads,ldlat=30/pp (18 samples, percent: local period)
     ----------------------------------------------------------------------------------------------------------------------
              : 0                0xffffffff81050610 <__fdget>:
         0.00 :   ffffffff81050610:        callq   0xffffffff81c01b80 <__fentry__>           # data-type: (stack operation)
         0.00 :   ffffffff81050615:        pushq   %rbp              # data-type: (stack operation)
         0.00 :   ffffffff81050616:        movq    %rsp, %rbp
         0.00 :   ffffffff81050619:        pushq   %r15              # data-type: (stack operation)
         0.00 :   ffffffff8105061b:        pushq   %r14              # data-type: (stack operation)
         0.00 :   ffffffff8105061d:        pushq   %rbx              # data-type: (stack operation)
         0.00 :   ffffffff8105061e:        subq    $0x10, %rsp
         0.00 :   ffffffff81050622:        movl    %edi, %ebx
         0.00 :   ffffffff81050624:        movq    %gs:0x7efc4814(%rip), %rax  # 0x14e40 <current_task>              # data-type: struct task_struct* +0
         0.00 :   ffffffff8105062c:        movq    0x8d0(%rax), %r14         # data-type: struct task_struct +0x8d0 (files)
         0.00 :   ffffffff81050633:        movl    (%r14), %eax              # data-type: struct files_struct +0 (count.counter)
         0.00 :   ffffffff81050636:        cmpl    $0x1, %eax
         0.00 :   ffffffff81050639:        je      0xffffffff810506a9 <__fdget+0x99>
         0.00 :   ffffffff8105063b:        movq    0x20(%r14), %rcx          # data-type: struct files_struct +0x20 (fdt)
         0.00 :   ffffffff8105063f:        movl    (%rcx), %eax              # data-type: struct fdtable +0 (max_fds)
         0.00 :   ffffffff81050641:        cmpl    %ebx, %eax
         0.00 :   ffffffff81050643:        jbe     0xffffffff810506ef <__fdget+0xdf>
         0.00 :   ffffffff81050649:        movl    %ebx, %r15d
         5.56 :   ffffffff8105064c:        movq    0x8(%rcx), %rdx           # data-type: struct fdtable +0x8 (fd)
 	...
 
   The "# data-type:" part was added with this change.  The first few
   entries are not very interesting.  But later you can it accesses
   a couple of fields in the task_struct, files_struct and fdtable.
 
 perf trace
 ----------
 * Support syscall tracing for different ABI.  For example it can trace
   system calls for 32-bit applications on 64-bit kernel transparently.
 
 * Add --summary-mode=total option to show global syscall summary.  The
   default is 'thread' to show per-thread syscall summary.
 
 Python support
 --------------
 * Add more interfaces to 'perf' module to parse events, and config,
   enable or disable the event list properly so that it can implement
   basic functionalities purely in Python.  There is an example code
   for these new interfaces in python/tracepoint.py.
 
 * Add mypy and pylint support to enable build time checking.  Fix
   some code based on the findings from these tools.
 
 Internals
 ---------
 * Introduce io_dir__readdir() API to make directory traveral (usually
   for proc or sysfs) efficient with less memory footprint.
 
 JSON vendor events
 ------------------
 * Add events and metrics for ARM Neoverse N3 and V3
 * Update events and metrics on various Intel CPUs
 * Add/update events for a number of SiFive processors
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ+Y/YQAKCRCMstVUGiXM
 g64CAP9sdUgKCe66JilZRAEy9d3Cp835v7KjHSlCXLXkRSoy6wD+KH96Y0OqVtGw
 GYON4s9ZoXAoH+J0lKIFSffVL9MhYgY=
 =7L7B
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Namhyung Kim:
 "perf record:

   - Introduce latency profiling using scheduler information.

     The latency profiling is to show impacts on wall-time rather than
     cpu-time. By tracking context switches, it can weight samples and
     find which part of the code contributed more to the execution
     latency.

     The value (period) of the sample is weighted by dividing it by the
     number of parallel execution at the moment. The parallelism is
     tracked in perf report with sched-switch records. This will reduce
     the portion that are run in parallel and in turn increase the
     portion of serial executions.

     For now, it's limited to profile processes, IOW system-wide
     profiling is not supported. You can add --latency option to enable
     this.

       $ perf record --latency -- make -C tools/perf

     I've run the above command for perf build which adds -j option to
     make with the number of CPUs in the system internally. Normally
     it'd show something like below:

       $ perf report -F overhead,comm
       ...
       #
       # Overhead  Command
       # ........  ...............
       #
           78.97%  cc1
            6.54%  python3
            4.21%  shellcheck
            3.28%  ld
            1.80%  as
            1.37%  cc1plus
            0.80%  sh
            0.62%  clang
            0.56%  gcc
            0.44%  perl
            0.39%  make
  	 ...

     The cc1 takes around 80% of the overhead as it's the actual
     compiler. However it runs in parallel so its contribution to
     latency may be less than that. Now, perf report will show both
     overhead and latency (if --latency was given at record time) like
     below:

       $ perf report -s comm
       ...
       #
       # Overhead   Latency  Command
       # ........  ........  ...............
       #
           78.97%    48.66%  cc1
            6.54%    25.68%  python3
            4.21%     0.39%  shellcheck
            3.28%    13.70%  ld
            1.80%     2.56%  as
            1.37%     3.08%  cc1plus
            0.80%     0.98%  sh
            0.62%     0.61%  clang
            0.56%     0.33%  gcc
            0.44%     1.71%  perl
            0.39%     0.83%  make
  	 ...

     You can see latency of cc1 goes down to around 50% and python3 and
     ld contribute a lot more than their overhead. You can use --latency
     option in perf report to get the same result but ordered by
     latency.

       $ perf report --latency -s comm

  perf report:

   - As a side effect of the latency profiling work, it adds a new
     output field 'latency' and a sort key 'parallelism'. The below is a
     result from my system with 64 CPUs. The build was well-parallelized
     but contained some serial portions.

       $ perf report -s parallelism
       ...
       #
       # Overhead   Latency  Parallelism
       # ........  ........  ...........
       #
           16.95%     1.54%           62
           13.38%     1.24%           61
           12.50%    70.47%            1
           11.81%     1.06%           63
            7.59%     0.71%           60
            4.33%    12.20%            2
            3.41%     0.33%           59
            2.05%     0.18%           64
            1.75%     1.09%            9
            1.64%     1.85%            5
            ...

   - Support Feodra mini-debuginfo which is a LZMA compressed symbol
     table inside ".gnu_debugdata" ELF section.

  perf annotate:

   - Add --code-with-type option to enable data-type profiling with the
     usual annotate output.

     Instead of focusing on data structure, it shows code annotation
     together with data type it accesses in case the instruction refers
     to a memory location (and it was able to resolve the target data
     type). Currently it only works with --stdio.

       $ perf annotate --stdio --code-with-type
       ...
        Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads,ldlat=30/pp (18 samples, percent: local period)
       ----------------------------------------------------------------------------------------------------------------------
                : 0                0xffffffff81050610 <__fdget>:
           0.00 :   ffffffff81050610:        callq   0xffffffff81c01b80 <__fentry__>           # data-type: (stack operation)
           0.00 :   ffffffff81050615:        pushq   %rbp              # data-type: (stack operation)
           0.00 :   ffffffff81050616:        movq    %rsp, %rbp
           0.00 :   ffffffff81050619:        pushq   %r15              # data-type: (stack operation)
           0.00 :   ffffffff8105061b:        pushq   %r14              # data-type: (stack operation)
           0.00 :   ffffffff8105061d:        pushq   %rbx              # data-type: (stack operation)
           0.00 :   ffffffff8105061e:        subq    $0x10, %rsp
           0.00 :   ffffffff81050622:        movl    %edi, %ebx
           0.00 :   ffffffff81050624:        movq    %gs:0x7efc4814(%rip), %rax  # 0x14e40 <current_task>              # data-type: struct task_struct* +0
           0.00 :   ffffffff8105062c:        movq    0x8d0(%rax), %r14         # data-type: struct task_struct +0x8d0 (files)
           0.00 :   ffffffff81050633:        movl    (%r14), %eax              # data-type: struct files_struct +0 (count.counter)
           0.00 :   ffffffff81050636:        cmpl    $0x1, %eax
           0.00 :   ffffffff81050639:        je      0xffffffff810506a9 <__fdget+0x99>
           0.00 :   ffffffff8105063b:        movq    0x20(%r14), %rcx          # data-type: struct files_struct +0x20 (fdt)
           0.00 :   ffffffff8105063f:        movl    (%rcx), %eax              # data-type: struct fdtable +0 (max_fds)
           0.00 :   ffffffff81050641:        cmpl    %ebx, %eax
           0.00 :   ffffffff81050643:        jbe     0xffffffff810506ef <__fdget+0xdf>
           0.00 :   ffffffff81050649:        movl    %ebx, %r15d
           5.56 :   ffffffff8105064c:        movq    0x8(%rcx), %rdx           # data-type: struct fdtable +0x8 (fd)
  	...

     The "# data-type:" part was added with this change. The first few
     entries are not very interesting. But later you can it accesses a
     couple of fields in the task_struct, files_struct and fdtable.

  perf trace:

   - Support syscall tracing for different ABI. For example it can trace
     system calls for 32-bit applications on 64-bit kernel
     transparently.

   - Add --summary-mode=total option to show global syscall summary. The
     default is 'thread' to show per-thread syscall summary.

  Python support:

   - Add more interfaces to 'perf' module to parse events, and config,
     enable or disable the event list properly so that it can implement
     basic functionalities purely in Python. There is an example code
     for these new interfaces in python/tracepoint.py.

   - Add mypy and pylint support to enable build time checking. Fix some
     code based on the findings from these tools.

  Internals:

   - Introduce io_dir__readdir() API to make directory traveral (usually
     for proc or sysfs) efficient with less memory footprint.

  JSON vendor events:

   - Add events and metrics for ARM Neoverse N3 and V3

   - Update events and metrics on various Intel CPUs

   - Add/update events for a number of SiFive processors"

* tag 'perf-tools-for-v6.15-2025-03-27' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (229 commits)
  perf bpf-filter: Fix a parsing error with comma
  perf report: Fix a memory leak for perf_env on AMD
  perf trace: Fix wrong size to bpf_map__update_elem call
  perf tools: annotate asm_pure_loop.S
  perf python: Fix setup.py mypy errors
  perf test: Address attr.py mypy error
  perf build: Add pylint build tests
  perf build: Add mypy build tests
  perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
  tools/build: Don't pass test log files to linker
  perf bench sched pipe: fix enforced blocking reads in worker_thread
  perf tools: Fix is_compat_mode build break in ppc64
  perf build: filter all combinations of -flto for libperl
  perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
  perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
  perf trace: Fix evlist memory leak
  perf trace: Fix BTF memory leak
  perf trace: Make syscall table stable
  perf syscalltbl: Mask off ABI type for MIPS system calls
  perf build: Remove Makefile.syscalls
  ...
2025-03-31 08:52:33 -07:00
Namhyung Kim
35d13f841a perf bpf-filter: Fix a parsing error with comma
The previous change to support cgroup filters introduced a bug that
pathname can include commas.  It confused the lexer to treat an item and
the trailing comma as a single token.  And it resulted in a parse error:

  $ sudo perf record -e cycles:P --filter 'period > 0, ip > 64' -- true
  perf_bpf_filter: Error: Unexpected item: 0,
  perf_bpf_filter: syntax error, unexpected BFT_ERROR, expecting BFT_NUM

   Usage: perf record [<options>] [<command>]
      or: perf record [<options>] -- <command> [<options>]

          --filter <filter>
                            event filter

It should get "0" and "," separately.

An easiest fix would be to remove "," from the possible pathname
characters.  As it's for cgroup names, probably ok to assume it won't
have commas in the pathname.

I found that the existing BPF filtering test didn't have any complex
filter condition with commas.  Let's update the group filter test which
is supposed to test filter combinations like this.

Link: https://lore.kernel.org/r/20250307220922.434319-1-namhyung@kernel.org
Fixes: 91e88437d5 ("perf bpf-filter: Support filtering on cgroups")
Reported-by: Sally Shi <sshii@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 17:29:58 -07:00
Namhyung Kim
9daa05c84a perf report: Fix a memory leak for perf_env on AMD
The env.pmu_mapping can be leaked when it reads data from a pipe on AMD.
For a pipe data, it reads the header data including pmu_mapping from
PERF_RECORD_HEADER_FEATURE runtime.  But it's already set in:

  perf_session__new()
    __perf_session__new()
      evlist__init_trace_event_sample_raw()
        evlist__has_amd_ibs()
          perf_env__nr_pmu_mappings()

Then it'll overwrite that when it processes the HEADER_FEATURE record.
Here's a report from address sanitizer.

  Direct leak of 2689 byte(s) in 1 object(s) allocated from:
    #0 0x7fed8f814596 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
    #1 0x5595a7d416b1 in strbuf_grow util/strbuf.c:64
    #2 0x5595a7d414ef in strbuf_init util/strbuf.c:25
    #3 0x5595a7d0f4b7 in perf_env__read_pmu_mappings util/env.c:362
    #4 0x5595a7d12ab7 in perf_env__nr_pmu_mappings util/env.c:517
    #5 0x5595a7d89d2f in evlist__has_amd_ibs util/amd-sample-raw.c:315
    #6 0x5595a7d87fb2 in evlist__init_trace_event_sample_raw util/sample-raw.c:23
    #7 0x5595a7d7f893 in __perf_session__new util/session.c:179
    #8 0x5595a7b79572 in perf_session__new util/session.h:115
    #9 0x5595a7b7e9dc in cmd_report builtin-report.c:1603
    #10 0x5595a7c019eb in run_builtin perf.c:351
    #11 0x5595a7c01c92 in handle_internal_command perf.c:404
    #12 0x5595a7c01deb in run_argv perf.c:448
    #13 0x5595a7c02134 in main perf.c:556
    #14 0x7fed85833d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58

Let's free the existing pmu_mapping data if any.

Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250311000416.817631-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 16:22:06 -07:00
Thomas Richter
216d567610 perf trace: Fix wrong size to bpf_map__update_elem call
In linux-next
commit c760174401 ("perf cpumap: Reduce cpu size from int to int16_t")
causes the perf tests 100 126 to fail on s390:

Output before:
 # ./perf test 100
 100: perf trace BTF general tests         : FAILED!
 #

The root cause is the change from int to int16_t for the
cpu maps. The size of the CPU key value pair changes from
four bytes to two bytes. However a two byte key size is
not supported for bpf_map__update_elem().
Note: validate_map_op() in libbpf.c emits warning
 libbpf: map '__augmented_syscalls__': \
	 unexpected key size 2 provided, expected 4
when key size is set to int16_t.

Therefore change to variable size back to 4 bytes for
invocation of bpf_map__update_elem().

Output after:
 # ./perf test 100
 100: perf trace BTF general tests         : Ok
 #

Fixes: c760174401 ("perf cpumap: Reduce cpu size from int to int16_t")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250324152756.3879571-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 13:55:26 -07:00
Marcus Meissner
9a352a90e8 perf tools: annotate asm_pure_loop.S
Annotate so it is built with non-executable stack.

Fixes: 8b97519711 ("perf test: Add asm pureloop test tool")
Signed-off-by: Marcus Meissner <meissner@suse.de>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250323085410.23751-1-meissner@suse.de
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:39:54 -07:00
Ian Rogers
ba3b0861ed perf python: Fix setup.py mypy errors
getenv may return None, so assert it isn't None for CC and srctree
environmental variables required for the script.
Disable an optional warning related to Popen.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Ian Rogers
21944462d5 perf test: Address attr.py mypy error
ConfigParser existed in python2 but not in python3 causing mypy to
fail.
Whilst removing a python2 workaround remove reference to __future__.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Ian Rogers
8a54784e70 perf build: Add pylint build tests
If PYLINT=1 is passed to the build then run pylint over python code in
perf. Unlike shellcheck this isn't default on as there are currently
too many errors.

An example of an error:
```
************* Module setup
util/setup.py:19:0: C0301: Line too long (127/100) (line-too-long)
util/setup.py:20:0: C0301: Line too long (138/100) (line-too-long)
util/setup.py:63:0: C0301: Line too long (106/100) (line-too-long)
util/setup.py:1:0: C0114: Missing module docstring (missing-module-docstring)
util/setup.py:24:4: W0622: Redefining built-in 'vars' (redefined-builtin)
util/setup.py:11:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name)
util/setup.py:13:4: C0103: Constant name "cc_options" doesn't conform to UPPER_CASE naming style (invalid-name)
util/setup.py:15:34: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
util/setup.py:18:0: C0116: Missing function or method docstring (missing-function-docstring)
util/setup.py:19:16: R1732: Consider using 'with' for resource-allocating operations (consider-using-with)
util/setup.py:44:0: C0413: Import "from setuptools import setup, Extension" should be placed at the top of the module (wrong-import-position)
util/setup.py:46:0: C0413: Import "from setuptools.command.build_ext import build_ext as _build_ext" should be placed at the top of the module (wrong-import-position)
util/setup.py:47:0: C0413: Import "from setuptools.command.install_lib import install_lib as _install_lib" should be placed at the top of the module (wrong-import-position)
util/setup.py:49:0: C0115: Missing class docstring (missing-class-docstring)
util/setup.py:49:0: C0103: Class name "build_ext" doesn't conform to PascalCase naming style (invalid-name)
util/setup.py:52:8: W0201: Attribute 'build_lib' defined outside __init__ (attribute-defined-outside-init)
util/setup.py:53:8: W0201: Attribute 'build_temp' defined outside __init__ (attribute-defined-outside-init)
util/setup.py:55:0: C0115: Missing class docstring (missing-class-docstring)
util/setup.py:55:0: C0103: Class name "install_lib" doesn't conform to PascalCase naming style (invalid-name)
util/setup.py:58:8: W0201: Attribute 'build_dir' defined outside __init__ (attribute-defined-outside-init)

*-----------------------------------------------------------------
Your code has been rated at 6.67/10 (previous run: 6.51/10, +0.16)

make[4]: *** [util/Build:442: util/setup.py.pylint_log] Error 1
```

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Ian Rogers
168910d0f9 perf build: Add mypy build tests
If MYPY=1 is passed to the build then run mypy over python code in
perf. Unlike shellcheck this isn't default on as there are currently
too many errors.

An example of an error:
```
util/setup.py:8: error: Item "None" of "str | None" has no attribute "split"  [union-attr]
util/setup.py:15: error: Item "None" of "IO[bytes] | None" has no attribute "readline"  [union-attr]
util/setup.py:15: error: List item 0 has incompatible type "str | None"; expected "str | bytes | PathLike[str] | PathLike[bytes]"  [list-item]
util/setup.py:16: error: Unsupported left operand type for + ("None")  [operator]
util/setup.py:16: note: Left operand is of type "str | None"
util/setup.py:74: error: Unsupported left operand type for + ("None")  [operator]
util/setup.py:74: note: Left operand is of type "str | None"
Found 5 errors in 1 file (checked 1 source file)
make[4]: *** [util/Build:430: util/setup.py.mypy_log] Error 1
```

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Ian Rogers
ef238109a3 perf build: Rename TEST_LOGS to SHELL_TEST_LOGS
Rename TEST_LOGS to SHELL_TEST_LOGS as later changes will add more
kinds of test logs.
Minor comment tweak in Makefile.perf as more than just test shell
tests are checked.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311213628.569562-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-24 09:38:20 -07:00
Dirk Gouders
99476fa085 perf bench sched pipe: fix enforced blocking reads in worker_thread
The function worker_thread() is programmed in a way that roughly
doubles the number of expectable context switches, because it enforces
blocking reads:

 Performance counter stats for 'perf bench sched pipe':

         2,000,004      context-switches

      11.859548321 seconds time elapsed

       0.674871000 seconds user
       8.076890000 seconds sys

The result of this behavior is that the blocking reads by far dominate
the performance analysis of 'perf bench sched pipe':

Samples: 78K of event 'cycles:P', Event count (approx.): 27964965844
Overhead  Command     Shared Object         Symbol
  25.28%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
   8.11%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
   2.82%  sched-pipe  [kernel.kallsyms]     [k] pipe_write

From the code, it is unclear if that behavior is wanted but the log
says that at least Ingo Molnar aims to mimic lmbench's lat_ctx, that
doesn't handle the pipe ends that way
(https://sourceforge.net/p/lmbench/code/HEAD/tree/trunk/lmbench2/src/lat_ctx.c)

Fix worker_thread() by always first feeding the write ends of the pipes
and then trying to read.

This roughly halves the context switches and runtime of pure
'perf bench sched pipe':

 Performance counter stats for 'perf bench sched pipe':

         1,005,770      context-switches

       6.033448041 seconds time elapsed

       0.423142000 seconds user
       4.519829000 seconds sys

And the blocking reads do no longer dominate the analysis at the above
extreme:

Samples: 40K of event 'cycles:P', Event count (approx.): 14309364879
Overhead  Command     Shared Object         Symbol
  12.20%  sched-pipe  [kernel.kallsyms]     [k] read_hpet
   9.23%  sched-pipe  [kernel.kallsyms]     [k] retbleed_untrain_ret
   3.68%  sched-pipe  [kernel.kallsyms]     [k] pipe_write

Signed-off-by: Dirk Gouders <dirk@gouders.net>
Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250323140316.19027-2-dirk@gouders.net
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-23 23:20:37 -07:00
Likhitha Korrapati
7e442be701 perf tools: Fix is_compat_mode build break in ppc64
Commit 54f9aa1092 ("tools/perf/powerpc/util: Add support to
handle compatible mode PVR for perf json events") introduced
to select proper JSON events in case of compat mode using
auxiliary vector. But this caused a compilation error in ppc64
Big Endian.

arch/powerpc/util/header.c: In function 'is_compat_mode':
arch/powerpc/util/header.c:20:21: error: cast to pointer from
integer of different size [-Werror=int-to-pointer-cast]
   20 |         if (!strcmp((char *)platform, (char *)base_platform))
      |                     ^
arch/powerpc/util/header.c:20:39: error: cast to pointer from
integer of different size [-Werror=int-to-pointer-cast]
   20 |         if (!strcmp((char *)platform, (char *)base_platform))
      |

Commit saved the getauxval(AT_BASE_PLATFORM) and getauxval(AT_PLATFORM)
return values in u64 which causes the compilation error.

Patch fixes this issue by changing u64 to "unsigned long".

Fixes: 54f9aa1092 ("tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf json events")
Signed-off-by: Likhitha Korrapati <likhitha@linux.ibm.com>
Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250321100726.699956-1-likhitha@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-23 23:14:19 -07:00
Holger Hoffstätte
9480cc14a9 perf build: filter all combinations of -flto for libperl
When enabling the libperl feature the build uses perl's build flags
(ccopts) but filters out various flags, e.g. for LTO.
While this is conceptually correct, it is insufficient in practice,
since only "-flto=auto" is filtered out. When perl itself is built with
"-flto" this can cause parts of perf being built with LTO and others
without, giving exciting build errors like e.g.:

   ../tools/perf/pmu-events/pmu-events.c:72851:(.text+0xb79): undefined
   reference to `strcmp_cpuid_str' collect2: error: ld returned 1 exit status

Fix this by filtering all matching flag values of -flto{=n,auto,..}.

Signed-off-by: Holger Hoffstätte <holger@applied-asynchrony.com>
Link: https://lore.kernel.org/r/20250321082038.27901-2-holger@applied-asynchrony.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-23 23:12:10 -07:00
Ilkka Koskinen
182f12f319 perf vendor events arm64 AmpereOneX: Fix frontend_bound calculation
frontend_bound metrics was miscalculated due to different scaling in
a couple of metrics it depends on. Change the scaling to match with
AmpereOne.

Fixes: 16438b652b ("perf vendor events arm64 AmpereOneX: Add core PMU events and metrics")
Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-3-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:57 -07:00
Ilkka Koskinen
c0b60ce461 perf vendor events arm64: AmpereOne/AmpereOneX: Mark LD_RETIRED impacted by errata
Atomic instructions are both memory-reading and memory-writing
instructions and so should be counted by both LD_RETIRED and ST_RETIRED
performance monitoring events. However LD_RETIRED does not count atomic
instructions.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250313201559.11332-2-ilkka@os.amperecomputing.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:57 -07:00
Ian Rogers
7b172b92c1 perf trace: Fix evlist memory leak
Leak sanitizer was reporting a memory leak in the "perf record and
replay" test. Add evlist__delete to trace__exit, also ensure
trace__exit is called after trace__record.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:35 -07:00
Ian Rogers
874fa827df perf trace: Fix BTF memory leak
Add missing btf__free in trace__exit.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:31 -07:00
Ian Rogers
ccc60dce3e perf trace: Make syscall table stable
Namhyung fixed the syscall table being reallocated and moving by
reloading the system call pointer after a move:
https://lore.kernel.org/lkml/Z9YHCzINiu4uBQ8B@google.com/
This could be brittle so this patch changes the syscall table to be an
array of pointers of "struct syscall" that don't move. Remove
unnecessary copies and searches with this change.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:27 -07:00
Ian Rogers
95b802ca9d perf syscalltbl: Mask off ABI type for MIPS system calls
Arnd Bergmann described that MIPS system calls don't necessarily start
from 0 as an ABI prefix is applied:
https://lore.kernel.org/lkml/8ed7dfb2-1e4d-4aa4-a04b-0397a89365d1@app.fastmail.com/
When decoding the "id" (aka system call number) for MIPS ignore values
greater-than 1000.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:23 -07:00
Ian Rogers
16ab5c708d perf build: Remove Makefile.syscalls
Now a single beauty file is generated and used by all architectures,
remove the per-architecture Makefiles, Kbuild files and previous
generator script.

Note: there was conversation with Charlie Jenkins
<charlie@rivosinc.com> and they'd written an alternate approach to
support multiple architectures:
https://lore.kernel.org/all/20250114-perf_syscall_arch_runtime-v1-1-5b304e408e11@rivosinc.com/
It would have been better to have helped Charlie fix their series (my
apologies) but they agreed that the approach taken here was likely
best for longer term maintainability:
https://lore.kernel.org/lkml/Z6Jk_UN9i69QGqUj@ghost/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:20 -07:00
Ian Rogers
1470eaa574 perf syscalltbl: Use lookup table containing multiple architectures
Switch to use the lookup table containing all architectures rather
than tables matching the perf binary.

This fixes perf trace when executed on a 32-bit i386 binary on an
x86-64 machine. Note in the following the system call names of the
32-bit i386 binary as seen by an x86-64 perf.

Before:
```
         ? (         ): a.out/447296  ... [continued]: munmap())                                           = 0
     0.024 ( 0.001 ms): a.out/447296 recvfrom(ubuf: 0x2, size: 4160585708, flags: DONTROUTE|CTRUNC|TRUNC|DONTWAIT|EOR|WAITALL|FIN|SYN|CONFIRM|RST|ERRQUEUE|NOSIGNAL|WAITFORONE|BATCH|SOCK_DEVMEM|ZEROCOPY|FASTOPEN|CMSG_CLOEXEC|0x91f80000, addr: 0xe30, addr_len: 0xffce438c) = 1475198976
     0.042 ( 0.003 ms): a.out/447296 lgetxattr(name: "", value: 0x3, size: 34)                             = 4160344064
     0.054 ( 0.003 ms): a.out/447296 dup2(oldfd: -134422744, newfd: 4)                                     = -1 ENOENT (No such file or directory)
     0.060 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x2e646c2f6374652f,.iov_len = (__kernel_size_t)7307199665335594867,}, vlen: 557056, pos_h: 4160585708) = 3
     0.074 ( 0.004 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2)                              = 4160237568
     0.080 ( 0.001 ms): a.out/447296 lstat(filename: "", statbuf: 0x193f6)                                 = 0
     0.089 ( 0.007 ms): a.out/447296 preadv(fd: 4294967196, vec: (struct iovec){.iov_base = (void *)0x3833692f62696c2f,.iov_len = (__kernel_size_t)3276497845987585334,}, vlen: 557056, pos_h: 4160585708) = 3
     0.097 ( 0.002 ms): a.out/447296 close(fd: 3</proc/447296/status>)                                     = 512
     0.103 ( 0.002 ms): a.out/447296 lgetxattr(name: "", value: 0x1, size: 2050)                           = 4157935616
     0.107 ( 0.007 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x5, size: 2066)             = 4158078976
     0.116 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x1, size: 2066)             = 4159639552
     0.121 ( 0.003 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 2066)             = 4160184320
     0.129 ( 0.002 ms): a.out/447296 lgetxattr(pathname: "", name: "", value: 0x3, size: 50)               = 4160196608
     0.138 ( 0.001 ms): a.out/447296 lstat(filename: "")                                                   = 0
     0.145 ( 0.002 ms): a.out/447296 mq_timedreceive(mqdes: 4291706800, u_msg_ptr: 0xf7f9ea48, msg_len: 134616640, u_msg_prio: 0xf7fd7fec, u_abs_timeout: (struct __kernel_timespec){.tv_sec = (__kernel_time64_t)-578174027777317696,.tv_nsec = (long long int)4160349376,}) = 0
     0.148 ( 0.001 ms): a.out/447296 mkdirat(dfd: -134617816, pathname: " ��� ���▒���▒���", mode: IFREG|ISUID|IRUSR|IWGRP|0xf7fd0000) = 447296
     0.150 ( 0.001 ms): a.out/447296 process_vm_writev(pid: -134617812, lvec: (struct iovec){.iov_base = (void *)0xf7f9e9c8f7f9e4c0,.iov_len = (__kernel_size_t)4160349376,}, liovcnt: 4160588048, rvec: (struct iovec){}, riovcnt: 4160585708, flags: 4291707352) = 0
     0.197 ( 0.004 ms): a.out/447296 capget(header: 4160184320, dataptr: 8192)                             = 0
     0.202 ( 0.002 ms): a.out/447296 capget(header: 1448669184, dataptr: 4096)                             = 0
     0.208 ( 0.002 ms): a.out/447296 capget(header: 4160577536, dataptr: 8192)                             = 0
     0.220 ( 0.001 ms): a.out/447296 getxattr(pathname: "", name: "c������", value: 0xf7f77e34, size: 1)  = 0
     0.228 ( 0.005 ms): a.out/447296 fchmod(fd: -134729728, mode: IRUGO|IWUGO|IFREG|IFIFO|ISVTX|IXUSR|0x10000) = 0
     0.240 ( 0.009 ms): a.out/447296 preadv(fd: 4294967196, vec: 0x5658e008, pos_h: 4160192052)            = 3
     0.250 ( 0.008 ms): a.out/447296 close(fd: 3</proc/447296/status>)                                     = 1436
     0.260 ( 0.018 ms): a.out/447296 stat(filename: "", statbuf: 0xffce32ac)                               = 1436
     0.288 (1000.213 ms): a.out/447296 readlinkat(buf: 0xffce31d4, bufsiz: 4291703244)                       = 0
```

After:
```
         ? (         ): a.out/442930  ... [continued]: execve())                                           = 0
     0.023 ( 0.002 ms): a.out/442930 brk()                                                                 = 0x57760000
     0.052 ( 0.003 ms): a.out/442930 access(filename: 0xf7f5af28, mode: R)                                 = -1 ENOENT (No such file or directory)
     0.059 ( 0.009 ms): a.out/442930 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
     0.078 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>)                                     = 0
     0.087 ( 0.007 ms): a.out/442930 openat(dfd: CWD, filename: "/lib/i386-linux-", flags: RDONLY|CLOEXEC|LARGEFILE) = 3
     0.095 ( 0.002 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdbb70, count: 512)         = 512
     0.135 ( 0.001 ms): a.out/442930 close(fd: 3</proc/442930/status>)                                     = 0
     0.148 ( 0.001 ms): a.out/442930 set_tid_address(tidptr: 0xf7f2b528)                                   = 442930 (a.out)
     0.150 ( 0.001 ms): a.out/442930 set_robust_list(head: 0xf7f2b52c, len: 12)                            =
     0.196 ( 0.004 ms): a.out/442930 mprotect(start: 0xf7f03000, len: 8192, prot: READ)                    = 0
     0.202 ( 0.002 ms): a.out/442930 mprotect(start: 0x5658e000, len: 4096, prot: READ)                    = 0
     0.207 ( 0.002 ms): a.out/442930 mprotect(start: 0xf7f63000, len: 8192, prot: READ)                    = 0
     0.230 ( 0.005 ms): a.out/442930 munmap(addr: 0xf7f10000, len: 103414)                                 = 0
     0.244 ( 0.010 ms): a.out/442930 openat(dfd: CWD, filename: 0x5658d008)                                = 3
     0.255 ( 0.007 ms): a.out/442930 read(fd: 3</proc/442930/status>, buf: 0xffbdb67c, count: 4096)        = 1436
     0.264 ( 0.018 ms): a.out/442930 write(fd: 1</dev/pts/4>, buf: , count: 1436)                          = 1436
     0.292 (1000.173 ms): a.out/442930 clock_nanosleep(rqtp: { .tv_sec: 17866546940376776704, .tv_nsec: 4159878336 }, rmtp: 0xffbdb59c) = 0
  1000.478 (         ): a.out/442930 exit_group()                                                          = ?
```

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:13 -07:00
Ian Rogers
0fb641f0a1 perf trace beauty: Add syscalltbl.sh generating all system call tables
Rather than generating individual syscall header files generate a
single trace/beauty/generated/syscalltbl.c. In a syscalltbls array
have references to each architectures tables along with the
corresponding e_machine. When the 32-bit or 64-bit table is ambiguous,
match the perf binary's type. For ARM32 don't use the arm64 32-bit
table which is smaller. EM_NONE is present for is no machine matches.

Conditionally compile the tables, only having the appropriate 32 and
64-bit table. If ALL_SYSCALLTBL is defined all tables can be
compiled.

Add comment for noreturn column suggested by Arnd Bergmann:
https://lore.kernel.org/lkml/d47c35dd-9c52-48e7-a00d-135572f11fbb@app.fastmail.com/
and added in commit 9142be9e64 ("x86/syscall: Mark exit[_group]
syscall handlers __noreturn").

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:08 -07:00
Ian Rogers
70351029b5 perf thread: Add support for reading the e_machine type for a thread
First try to read the e_machine from the dsos associated with the
thread's maps. If live use the executable from /proc/pid/exe and read
the e_machine from the ELF header. On failure use EM_HOST. Change
builtin-trace syscall functions to pass e_machine from the thread
rather than EM_HOST, so that in later patches when syscalltbl can use
the e_machine the system calls are specific to the architecture.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:05 -07:00
Ian Rogers
afffec6f03 perf dso: Add support for reading the e_machine type for a dso
For ELF file dsos read the e_machine from the ELF header. For kernel
types assume the e_machine matches the perf tool. In other cases
return EM_NONE.

When reading from the ELF header use DSO__SWAP that may need
dso->needs_swap initializing. Factor out dso__swap_init to allow this.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:58:02 -07:00
Ian Rogers
5c2938fe78 perf syscalltbl: Remove struct syscalltbl
The syscalltbl held entries of system call name and number pairs,
generated from a native syscalltbl at start up. As there are gaps in
the system call number there is a notion of index into the
table. Going forward we want the system call table to be identifiable
by a machine type, for example, i386 vs x86-64. Change the interface
to the syscalltbl so (1) a (currently unused machine type of EM_HOST)
is passed (2) the index to syscall number and system call name mapping
is computed at build time.

Two tables are used for this, an array of system call number to name,
an array of system call numbers sorted by the system call name. The
sorted array doesn't store strings in part to save memory and
relocations. The index notion is carried forward and is an index into
the sorted array of system call numbers, the data structures are
opaque (held only in syscalltbl.c), and so the number of indices for a
machine type is exposed as a new API.

The arrays are computed in the syscalltbl.sh script and so no start-up
time computation and storage is necessary.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:57 -07:00
Ian Rogers
3d94b8441c perf trace: Reorganize syscalls
Identify struct syscall information in the syscalls table by a machine
type and syscall number, not just system call number. Having the
machine type means that 32-bit system calls can be differentiated from
64-bit ones on a machine capable of both. Having a table for all
machine types and all system call numbers would be too large, so
maintain a sorted array of system calls as they are encountered.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:53 -07:00
Ian Rogers
af472d3c44 perf syscalltbl: Remove syscall_table.h
The definition of "static const char *const syscalltbl[] = {" is done
in a generated syscalls_32.h or syscalls_64.h that is architecture
dependent. In order to include the appropriate file a syscall_table.h
is found via the perf include path and it includes the syscalls_32.h
or syscalls_64.h as appropriate.

To support having multiple syscall tables, one for 32-bit and one for
64-bit, or for different architectures, an include path cannot be
used. Remove syscall_table.h because of this and inline what it does
into syscalltbl.c.

For architectures without a syscall_table.h this will cause a failure
to include either syscalls_32.h or syscalls_64.h rather than a failure
to include syscall_table.h. For architectures that only included one
or other, the behavior matches BITS_PER_LONG as previously done on
architectures supporting both syscalls_32.h and syscalls_64.h.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:35 -07:00
Ian Rogers
4773175c9d perf dso: kernel-doc for enum dso_binary_type
There are many and non-obvious meanings to the dso_binary_type enum
values. Add kernel-doc to speed interpretting their meanings.

Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:57:25 -07:00
Ian Rogers
f1794ecb0c perf dso: Move libunwind dso_data variables into ifdef
The variables elf_base_addr, debug_frame_offset, eh_frame_hdr_addr and
eh_frame_hdr_offset are only accessed in unwind-libunwind-local.c
which is conditionally built on having libunwind support. Make the
variables conditional on libunwind support too.

Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Link: https://lore.kernel.org/r/20250319050741.269828-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 22:56:29 -07:00
Namhyung Kim
d10a7aaaf8 perf report: Disable children column for data type profiling
I've realized that it doesn't make sense to accumulate the samples to
parent in the callchain when data type profiling is enabled.  Because it
won't have the same data type access in the parent.  Otherwise it'd see
something like this:

  $ perf report -s type --stdio -g none
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  # Children  Latency      Self   Latency  Data Type
  # ........  .......  ........  ........  .........
  #
     698.97%   697.72%    99.80%    99.61%  (unknown)
       0.09%    0.18%     0.09%     0.18%  Elf64_Rela
       0.05%    0.10%     0.05%     0.10%  unsigned char
       0.05%    0.10%     0.05%     0.10%  struct exit_function_list
       0.00%    0.01%     0.00%     0.01%  struct rtld_global

Link: https://lore.kernel.org/r/20250307080829.354947-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
6df71c7237 perf report: Allow hierarchy mode for --children
It was prohibited because the output fields in the children mode were
not handled properly with hierarchy.  But we can have the output fields
in the same level, it can allow them together.

For example, latency mode adds more output fields by default and now
they are displayed properly.

  $ perf record --latency -g -- perf test -w thloop

  $ perf report -H --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 2K of event 'cycles:Pu'
  # Event count (approx.): 8266456478
  #
  #       Children  Latency  Overhead   Latency  Command / Shared Object / Symbol
  # ...........................................  ........................................................
  #
       0.08%    0.16%   100.00%   100.00%        perf
          0.08%    0.16%     0.24%     0.47%        ld-linux-x86-64.so.2
             0.12%    0.24%     0.12%     0.24%        [.] _dl_relocate_object
             0.08%    0.16%     0.08%     0.16%        [.] _dl_lookup_symbol_x
             0.03%    0.06%     0.03%     0.06%        [.] strcmp
             0.00%    0.01%     0.00%     0.01%        [.] _dl_start
             0.00%    0.00%     0.00%     0.00%        [.] _dl_start_user
             0.00%    0.00%     0.00%     0.00%        [.] _dl_sysdep_start
             0.00%    0.00%     0.00%     0.00%        [.] _start
             0.00%    0.00%     0.00%     0.00%        [.] dl_main
          0.03%    0.06%     0.03%     0.06%        libLLVM-16.so.1
             0.03%    0.06%     0.03%     0.06%        [.] llvm::StringMapImpl::RehashTable(unsigned int)
             0.00%    0.00%     0.00%     0.00%        [.] 0x00007f137ccd18e8
          0.00%    0.00%    99.66%    99.31%        perf
            99.66%   99.31%    99.66%    99.31%        [.] test_loop
              |
              |--49.86%--0x7f137b633d68
              |          0x55dbdbbb7d2c
              ...

Link: https://lore.kernel.org/r/20250307080829.354947-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Namhyung Kim
a1bbd66627 perf sort: Keep output fields in the same level
This is useful for hierarchy output mode where the first level is
considered as output fields.  We want them in the same level so that it
can show only the remaining groups in the hierarchy.

Before:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #          Overhead  Samples / Period / Command / Shared Object
  # .................  ..........................................
  #
     100.00%           4035
        100.00%           3835883066
           100.00%           perf
               99.37%           perf
                0.50%           ld-linux-x86-64.so.2
                0.06%           [unknown]
                0.04%           libc.so.6
                0.02%           libLLVM-16.so.1

After:
  $ perf report -s overhead,sample,period,comm,dso -H --stdio
  ...
  #    Overhead       Samples        Period  Command / Shared Object
  # .......................................  .......................
  #
     100.00%          4035    3835883066     perf
         99.37%          4005    3811826223     perf
          0.50%            19      19210014     ld-linux-x86-64.so.2
          0.06%             8       2367089     [unknown]
          0.04%             2       1720336     libc.so.6
          0.02%             1        759404     libLLVM-16.so.1

Acked-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20250307080829.354947-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-20 09:17:56 -07:00
Thomas Richter
431db90a73 perf pmu: Handle memory failure in tool_pmu__new()
On linux-next
commit 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
allocated PMU named "tool" dynamicly. However that allocation
can fail and a NULL pointer is returned. That case is currently
not handled and would result in an invalid address reference.
Add a check for NULL pointer.

Fixes: 72c6f57a41 ("perf pmu: Dynamically allocate tool PMU")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319122820.2898333-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 17:00:16 -07:00
James Clark
6d2dcd6352 perf: intel-tpebs: Fix incorrect usage of zfree()
zfree() requires an address otherwise it frees what's in name, rather
than name itself. Pass the address of name to fix it.

This was the only incorrect occurrence in Perf found using a search.

Fixes: 8db5cabcf1 ("perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.")
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250319101614.190922-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:56 -07:00
Ian Rogers
58b8b5d142 perf cpumap: Increment reference count for online cpumap
Thomas Richter <tmricht@linux.ibm.com> reported a double put on the
cpumap for the placeholder core PMU:
https://lore.kernel.org/lkml/20250318095132.1502654-3-tmricht@linux.ibm.com/
Requiring the caller to get the cpumap is not how these things are
usually done, switch cpu_map__online to do the get and then fix up any
use cases where a put is needed.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Link: https://lore.kernel.org/r/20250318171914.145616-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:33 -07:00
Stephen Brennan
ebf0b33273 perf dso: fix dso__is_kallsyms() check
Kernel modules for which we cannot find a file on-disk will have a
dso->long_name that looks like "[module_name]". Prior to the commit
listed in the fixes, the dso->kernel field would be zero (for user
space), so dso__is_kallsyms() would return false. After the commit,
kernel module DSOs are correctly labeled, but the result is that
dso__is_kallsyms() erroneously returns true for those modules without a
filesystem path.

Later, build_id_cache__add() consults this value of is_kallsyms, and
when true, it copies /proc/kallsyms into the cache. Users with many
kernel modules without a filesystem path (e.g. ksplice or possibly
kernel live patch modules) have reported excessive disk space usage in
the build ID cache directory due to this behavior.

To reproduce the issue, it's enough to build a trivial out-of-tree hello
world kernel module, load it using insmod, and then use:

   perf record -ag -- sleep 1

In the build ID directory, there will be a directory for your module
name containing a kallsyms file.

Fix this up by changing dso__is_kallsyms() to consult the
dso_binary_type enumeration, which is also symmetric to the above checks
for dso__is_vmlinux() and dso__is_kcore(). With this change, kallsyms is
not cached in the build-id cache for out-of-tree modules.

Fixes: 02213cec64 ("perf maps: Mark module DSOs with kernel type")
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Link: https://lore.kernel.org/r/20250318230012.2038790-1-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-19 16:56:05 -07:00
Xin Li (Intel)
8f97566c8a x86/cpufeatures: Remove {disabled,required}-features.h
The functionalities of {disabled,required}-features.h have been replaced with
the auto-generated generated/<asm/cpufeaturemasks.h> header.

Thus they are no longer needed and can be removed.

None of the macros defined in {disabled,required}-features.h is used in tools,
delete them too.

Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20250305184725.3341760-4-xin@zytor.com
2025-03-19 11:15:12 +01:00
Feng Yang
2b5b834cc3 perf kwork: Remove unreachable judgments
When s2[i] = '\0', if s1[i] != '\0', it will be judged by ret,
and if s1[i] = '\0', it will be judegd by !s1[i].
So in reality, s2 [i] will never make a judgment

Signed-off-by: Feng Yang <yangfeng@kylinos.cn>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250314031013.94480-1-yangfeng59949@163.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:55:30 -07:00
Arnaldo Carvalho de Melo
89aaeaf842 perf python: Check if there is space to copy all the event
The pyrf_event__new() method copies the event obtained from the perf
ring buffer to a structure that will then be turned into a python object
for further consumption, so it copies perf_event.header.size bytes to
its 'event' member:

  $ pahole -C pyrf_event /tmp/build/perf-tools-next/python/perf.cpython-312-x86_64-linux-gnu.so
  struct pyrf_event {
  	PyObject                   ob_base;              /*     0    16 */
  	struct evsel *             evsel;                /*    16     8 */
  	struct perf_sample         sample;               /*    24   312 */

  	/* XXX last struct has 7 bytes of padding, 2 holes */

  	/* --- cacheline 5 boundary (320 bytes) was 16 bytes ago --- */
  	union perf_event           event;                /*   336  4168 */

  	/* size: 4504, cachelines: 71, members: 4 */
  	/* member types with holes: 1, total: 2 */
  	/* paddings: 1, sum paddings: 7 */
  	/* last cacheline: 24 bytes */
  };

  $

It was doing so without checking if the event just obtained has more
than that space, fix it.

This isn't a proper, final solution, as we need to support larger
events, but for the time being we at least bounds check and document it.

Fixes: 877108e42b ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-7-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:45 -07:00
Arnaldo Carvalho de Melo
f3fed3ae34 perf python: Don't keep a raw_data pointer to consumed ring buffer space
When processing tracepoints the perf python binding was parsing the
event before calling perf_mmap__consume(&md->core) in
pyrf_evlist__read_on_cpu().

But part of this event parsing was to set the perf_sample->raw_data
pointer to the payload of the event, which then could be overwritten by
other event before tracepoint fields were asked for via event.prev_comm
in a python program, for instance.

This also happened with other fields, but strings were were problems
were surfacing, as there is UTF-8 validation for the potentially garbled
data.

This ended up showing up as (with some added debugging messages):

  ( field 'prev_comm' ret=0x7f7c31f65110, raw_size=68 )  ( field 'prev_pid' ret=0x7f7c23b1bed0, raw_size=68 )  ( field 'prev_prio' ret=0x7f7c239c0030, raw_size=68 )  ( field 'prev_state' ret=0x7f7c239c0250, raw_size=68 ) time 14771421785867 prev_comm= prev_pid=1919907691 prev_prio=796026219 prev_state=0x303a32313175 ==>
  ( XXX '��' len=16, raw_size=68)  ( field 'next_comm' ret=(nil), raw_size=68 ) Traceback (most recent call last):
   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 51, in <module>
     main()
   File "/home/acme/git/perf-tools-next/tools/perf/python/tracepoint.py", line 46, in main
     event.next_comm,
     ^^^^^^^^^^^^^^^
  AttributeError: 'perf.sample_event' object has no attribute 'next_comm'

When event.next_comm was asked for, the PyUnicode_FromString() python
API would fail and that tracepoint field wouldn't be available, stopping
the tools/perf/python/tracepoint.py test tool.

But, since we already do a copy of the whole event in pyrf_event__new,
just use it and while at it remove what was done in in e8968e6541
("perf python: Fix pyrf_evlist__read_on_cpu event consuming") because we
don't really need to wait for parsing the sample before declaring the
event as consumed.

This copy is questionable as is now, as it limits the maximum event +
sample_type and tracepoint payload to sizeof(union perf_event), this all
has been "working" because 'struct perf_event_mmap2', the largest entry
in 'union perf_event' is:

  $ pahole -C perf_event ~/bin/perf | grep mmap2
	struct perf_record_mmap2   mmap2;              /*     0  4168 */
  $

Fixes: bae57e3825 ("perf python: Add support to resolve tracepoint fields")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-6-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:35 -07:00
Arnaldo Carvalho de Melo
3de5a2bf5b perf python: Decrement the refcount of just created event on failure
To avoid a leak if we have the python object but then something happens
and we need to return the operation, decrement the offset of the newly
created object.

Fixes: 377f698db1 ("perf python: Add struct evsel into struct pyrf_event")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-5-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:29 -07:00
Arnaldo Carvalho de Melo
a570da2148 perf python tracepoint.py: Change the COMM using setproctitle if available
Otherwise when debugging we see just "python" in perf, top, etc.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:22 -07:00
Arnaldo Carvalho de Melo
1882625c91 perf python: Remove some unused macros (_PyUnicode_FromString(arg), etc)
When python2 support was removed in e7e9943c87 ("perf python:
Remove python 2 scripting support"), all use of the
_PyUnicode_FromString(arg), _PyUnicode_FromFormat(...), and
_PyLong_FromLong(arg) macros was removed as well, so remove it.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:14 -07:00
Arnaldo Carvalho de Melo
1376c195e8 perf python: Fixup description of sample.id event member
Some old cut'n'paste error, its "ip", so the description should be
"event ip", not "event type".

Fixes: 877108e42b ("perf tools: Initial python binding")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312203141.285263-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-18 16:08:05 -07:00
Ian Rogers
ca2182097e perf test dso-data: Correctly free test file in read test
The DSO data read test opens a file but as dsos__exit is used the test
file isn't closed. This causes the subsequent subtests in don't fork
(-F) mode to fail as one more than expected file descriptor is open.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
5ac22c35aa perf dso: Use lock annotations to fix asan deadlock
dso__list_del with address sanitizer and/or reference count checking
will call dso__put that can call dso__data_close reentrantly trying to
lock the dso__data_open_lock and deadlocking. Switch from pthread
mutexes to perf's mutex so that lock checking is performed in debug
builds. Add lock annotations that diagnosed the problem. Release the
dso__data_open_lock around the dso__put to avoid the deadlock.

Change the declaration of dso__data_get_fd to return a boolean,
indicating the fd is valid and the lock is held, to make it compatible
with the thread safety annotations as a try lock.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
c5ebf3a266 perf mutex: Add annotations for LOCKS_EXCLUDED and LOCKS_RETURNED
Used to annotate when locks shouldn't be held for a function or if a
function returns a lock that's used by later mutex lock unlock
operations.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250318043151.137973-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-17 22:07:18 -07:00
Ian Rogers
658b34cc9f perf test: Add pipe output testing for annotate
Parameterize the basic testing to generate directly a perf.data file
or to generate/use one from pipe input or output.  To simplify the
refactor move some of the head/grep logic around. Use "-q" with grep
to make the test output cleaner.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250311211635.541090-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:16:33 -07:00
Ian Rogers
3a86d63e6f perf test: Fixes to variable expansion and stdout for diff test
When make_data fails its error message needs to go to stderr rather
than stdout and the stdout value is captured in a variable.  Quote the
$err value so that it is always a valid input for test.  This error is
commonly encountered if no sample data is gathered by the test.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250312001841.1515779-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:15:13 -07:00
Arnaldo Carvalho de Melo
4e82c88a90 perf libunwind: Fixup conversion perf_sample->user_regs to a pointer
The dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional") misses
the changes to a file, resulting in this problem:

  $ make LIBUNWIND=1 -C tools/perf O=/tmp/build/perf-tools-next install-bin
  <SNIP>
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind-local.o
    CC      /tmp/build/perf-tools-next/util/unwind-libunwind.o
  <SNIP>
  util/unwind-libunwind-local.c: In function ‘access_mem’:
  util/unwind-libunwind-local.c:582:56: error: ‘ui->sample->user_regs’ is a pointer; did you mean to use ‘->’?
    582 |         if (__write || !stack || !ui->sample->user_regs.regs) {
        |                                                        ^
        |                                                        ->
  util/unwind-libunwind-local.c:587:38: error: passing argument 2 of ‘perf_reg_value’ from incompatible pointer type [-Wincompatible-pointer-types]
    587 |         ret = perf_reg_value(&start, &ui->sample->user_regs,
        |                                      ^~~~~~~~~~~~~~~~~~~~~~
        |                                      |
        |                                      struct regs_dump **
<SNIP>
  ⬢ [acme@toolbox perf-tools-next]$ git bisect bad
  dc6d2bc2d8 is the first bad commit
  commit dc6d2bc2d8 (HEAD)
  Author: Ian Rogers <irogers@google.com>
  Date:   Mon Jan 13 11:43:45 2025 -0800

      perf sample: Make user_regs and intr_regs optional

Detected using:

  make -C tools/perf build-test

Fixes: dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250313033121.758978-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 18:12:33 -07:00
Veronika Molnarova
02ba09c8ab perf test stat_all_pmu.sh: Correctly check 'perf stat' result
Test case "stat_all_pmu.sh" is not correctly checking 'perf stat' output
due to a poor design. Firstly, having the 'set -e' option with a trap
catching the sigexit causes the shell to exit immediately if 'perf stat' ends
with any non-zero value, which is then caught by the trap reporting an
unexpected signal. This causes events that should be parsed by the if-else
statement to be caught by the trap handler and are reported as errors:

    $ perf test -vv "perf all pmu"
    Testing i915/actual-frequency/
    Unexpected signal in main
    Error:
    Access to performance monitoring and observability operations is limited.

Secondly, the if-else branches are not exclusive as the checking if the
event is present in the output log covers also the "<not supported>"
events, which should be accepted, and also the "Bad name events", which
should be rejected.

Remove the "set -e" option from the test case, correctly parse the
"perf stat" output log and check its return value. Add the missing
outputs for the 'perf stat' result and also add logs messages to
report the branch that parsed the event for more info.

Fixes: 7e73ea4029 ("perf test: Ignore security failures in all PMU test")
Signed-off-by: Veronika Molnarova <vmolnaro@redhat.com>
Tested-by: Qiao Zhao <qzhao@redhat.com>
Link: https://lore.kernel.org/r/20241122231233.79509-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 10:41:34 -07:00
Yujie Liu
fa9bc517af perf script: Update brstack syntax documentation
The following commits added new fields/flags to the branch stack field
list:

commit 1f48989cdc ("perf script: Output branch sample type")
commit 6ade6c6460 ("perf script: Show branch speculation info")
commit 1e66dcff7b ("perf script: Add not taken event for branch stack")

Update brstack syntax documentation to be consistent with the latest
branch stack field list. Improve the descriptions to help users
interpret the fields accurately.

Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Sandipan Das <sandipan.das@amd.com>
Link: https://lore.kernel.org/r/20250312072329.419020-1-yujie.liu@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-14 10:41:08 -07:00
Yujie Liu
2f39edece1 perf script: Fix typo in branch event mask
BRACH -> BRANCH

Fixes: 88b1473135 ("perf script: Separate events from branch types")
Signed-off-by: Yujie Liu <yujie.liu@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250312075636.429127-1-yujie.liu@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 13:19:27 -07:00
Arnaldo Carvalho de Melo
2333cfa9f8 perf hist stdio: Do bounds check when printing callchains to avoid UB with new gcc versions
Do a simple bounds check to avoid this on new gcc versions:

  31    15.81 fedora:rawhide                : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
    In function 'callchain__fprintf_left_margin',
        inlined from 'callchain__fprintf_graph.constprop' at ui/stdio/hist.c:246:12:
    ui/stdio/hist.c:27:39: error: iteration 2147483647 invokes undefined behavior [-Werror=aggressive-loop-optimizations]
       27 |         for (i = 0; i < left_margin; i++)
          |                                      ~^~
    ui/stdio/hist.c:27:23: note: within this loop
       27 |         for (i = 0; i < left_margin; i++)
          |                     ~~^~~~~~~~~~~~~
    cc1: all warnings being treated as errors

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-4-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:30:14 -07:00
Arnaldo Carvalho de Melo
cf67629f7f perf units: Fix insufficient array space
No need to specify the array size, let the compiler figure that out.

This addresses this compiler warning that was noticed while build
testing on fedora rawhide:

  31    15.81 fedora:rawhide                : FAIL gcc version 15.0.1 20250225 (Red Hat 15.0.1-0) (GCC)
    util/units.c: In function 'unit_number__scnprintf':
    util/units.c:67:24: error: initializer-string for array of 'char' is too long [-Werror=unterminated-string-initialization]
       67 |         char unit[4] = "BKMG";
          |                        ^~~~~~
    cc1: all warnings being treated as errors

Fixes: 9808143ba2 ("perf tools: Add unit_number__scnprintf function")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250310194534.265487-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:30:08 -07:00
Namhyung Kim
bbf006d6d1 perf annotate: Add --code-with-type option.
This option is to show data type info in the regular (code) annotation.
It tries to find data type for each (memory) instruction in the
function.  It'd be useful to see function-level memory access pattern
and also to debug the data type profiling result.

The output would be added at the end of the line and have "# data-type:"
prefix.

For now, it only works with --stdio mode for simplicity.  I can work on
enabling it for TUI later.

  $ perf annotate --stdio --code-with-type
   Percent |      Source code & Disassembly of vmlinux for cpu/mem-loads/ppk (253 samples, percent: local period)
  ---------------------------------------------------------------------------------------------------------------
           : 0                0xffffffff81baa000 <check_preemption_disabled>:
      0.00 :   ffffffff81baa000:        pushq   %r12              # data-type: (stack operation)
      0.00 :   ffffffff81baa002:        pushq   %rbp              # data-type: (stack operation)
      0.00 :   ffffffff81baa003:        pushq   %rbx              # data-type: (stack operation)
      0.00 :   ffffffff81baa004:        subq    $0x8, %rsp
     18.00 :   ffffffff81baa008:        movl    %gs:0x7e48893d(%rip), %ebx  # 0x3294c <pcpu_hot+0xc>              # data-type: struct pcpu_hot +0xc (cpu_number)
     12.58 :   ffffffff81baa00f:        movl    %gs:0x7e488932(%rip), %eax  # 0x32948 <pcpu_hot+0x8>              # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa016:        testl   $0x7fffffff, %eax
      0.00 :   ffffffff81baa01b:        je      0xffffffff81baa02c <check_preemption_disabled+0x2c>
      0.00 :   ffffffff81baa01d:        addq    $0x8, %rsp
      0.00 :   ffffffff81baa021:        movl    %ebx, %eax
     14.19 :   ffffffff81baa023:        popq    %rbx              # data-type: (stack operation)
     18.86 :   ffffffff81baa024:        popq    %rbp              # data-type: (stack operation)
     12.10 :   ffffffff81baa025:        popq    %r12              # data-type: (stack operation)
     17.78 :   ffffffff81baa027:        jmp     0xffffffff81bc1170 <__x86_return_thunk>
      6.49 :   ffffffff81baa02c:        callq   *0xc9139e(%rip)  # 0xffffffff8283b3d0 <pv_ops+0xf0>               # data-type: (stack operation)
      0.00 :   ffffffff81baa032:        testb   $0x2, %ah
      0.00 :   ffffffff81baa035:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa037:        movq    %rdi, %rbp
      0.00 :   ffffffff81baa03a:        movq    %gs:0x32940, %rax         # data-type: struct pcpu_hot +0 (current_task)
      0.00 :   ffffffff81baa043:        testb   $0x4, 0x2f(%rax)          # data-type: struct task_struct +0x2f (flags)
      0.00 :   ffffffff81baa047:        je      0xffffffff81baa052 <check_preemption_disabled+0x52>
      0.00 :   ffffffff81baa049:        cmpl    $0x1, 0x3d0(%rax)         # data-type: struct task_struct +0x3d0 (nr_cpus_allowed)
      0.00 :   ffffffff81baa050:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa052:        movq    %gs:0x32940, %r12         # data-type: struct pcpu_hot +0 (current_task)
      0.00 :   ffffffff81baa05b:        cmpw    $0x0, 0x7f0(%r12)         # data-type: struct task_struct +0x7f0 (migration_disabled)
      0.00 :   ffffffff81baa065:        movq    %rsi, (%rsp)
      0.00 :   ffffffff81baa069:        jne     0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa06b:        movl    0xe8dd13(%rip), %eax  # 0xffffffff82a37d84 <system_state>         # data-type: enum system_states +0
      0.00 :   ffffffff81baa071:        testl   %eax, %eax
      0.00 :   ffffffff81baa073:        je      0xffffffff81baa01d <check_preemption_disabled+0x1d>
      0.00 :   ffffffff81baa075:        incl    %gs:0x7e4888cc(%rip)  # 0x32948 <pcpu_hot+0x8>            # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa07c:        movq    $-0x7e14a100, %rdi
      0.00 :   ffffffff81baa083:        callq   0xffffffff81148c40 <__printk_ratelimit>           # data-type: (stack operation)
      0.00 :   ffffffff81baa088:        testl   %eax, %eax
      0.00 :   ffffffff81baa08a:        je      0xffffffff81baa0d5 <check_preemption_disabled+0xd5>
      0.00 :   ffffffff81baa08c:        movl    0x958(%r12), %r9d         # data-type: struct task_struct +0x958 (pid)
      0.00 :   ffffffff81baa094:        movq    (%rsp), %rdx              # data-type: char* +0
      0.00 :   ffffffff81baa098:        movq    %rbp, %rsi
      0.00 :   ffffffff81baa09b:        leaq    0xb88(%r12), %r8          # data-type: struct task_struct +0xb88 (comm)
      0.00 :   ffffffff81baa0a3:        movl    %gs:0x7e48889e(%rip), %ecx  # 0x32948 <pcpu_hot+0x8>              # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa0aa:        andl    $0x7fffffff, %ecx
      0.00 :   ffffffff81baa0b0:        movq    $-0x7dd3cdf0, %rdi
      0.00 :   ffffffff81baa0b7:        subl    $0x1, %ecx
      0.00 :   ffffffff81baa0ba:        callq   0xffffffff81149340 <_printk>              # data-type: (stack operation)
      0.00 :   ffffffff81baa0bf:        movq    0x20(%rsp), %rsi
      0.00 :   ffffffff81baa0c4:        movq    $-0x7ddb8c7e, %rdi
      0.00 :   ffffffff81baa0cb:        callq   0xffffffff81149340 <_printk>              # data-type: (stack operation)
      0.00 :   ffffffff81baa0d0:        callq   0xffffffff81b7ab60 <dump_stack>           # data-type: (stack operation)
      0.00 :   ffffffff81baa0d5:        decl    %gs:0x7e48886c(%rip)  # 0x32948 <pcpu_hot+0x8>            # data-type: struct pcpu_hot +0x8 (preempt_count)
      0.00 :   ffffffff81baa0dc:        jmp     0xffffffff81baa01d <check_preemption_disabled+0x1d>

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-8-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
30c5a3941d perf annotate: Implement code + data type annotation
Sometimes it's useful to see both instructions and their data type
together.  Let's extend the annotate code to use data type profiling
functions.

To make it easy to pass more argument, introduce a struct to carry
necessary information together.  Also add a new annotation_option called
'code_with_type' to control the behavior.  This is not enabled yet but
it'll be set later from the command line.

For simplicity, this is implemented for --stdio only.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
236ee2569a perf annotate: Factor out __hist_entry__get_data_type()
So that it can only handle a single disasm_linme and hopefully make the
code simpler.  This is also a preparation to be called from different
places later.

The NO_TYPE macro was added to distinguish when it failed or needs retry.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
fe8da6692a perf annotate: Pass hist_entry to annotate functions
It's a prepartion to support code annotation and data type
annotation at the same time.  Data type annotation needs more
information in the hist_entry so it needs to be passed deeper.

Also rename a function with the same name in the builtin-annotate.c
to hist_entry__stdio_annotate since it matches better to the command
line option.  And change the condition inside to be simpler.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
9aa3cbbffb perf annotate: Pass annotation_options to annotation_line__print()
The annotation_line__print() has many arguments.  But min_percent,
max_lines and percent_type are from struct annotaion_options.  So let's
pass a pointer to the option instead of passing them separately to
reduce the number of function arguments.

Actually it has a recursive call if 'queue' is set.  Add a new option
instance to pass different values for the case.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
1f284082b1 perf annotate: Remove unused len parameter from annotation_line__print()
It's not used anywhere, let's get rid of it.

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
ce2289ad0a perf annotate-data: Add annotated_data_type__get_member_name()
Factor out a function to get the name of member field at the given
offset.  This will be used in other places.

Also update the output of typeoff sort key a little bit.  As we know
that some special types like (stack operation), (stack canary) and
(unknown) won't have fields, skip printing the offset and field.

For example, the following change is expected.

  "(stack operation) +0 (no field)"   ==>   "(stack operation)"

Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250310224925.799005-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:19:51 -07:00
Namhyung Kim
e1cde2d5e9 perf ftrace: Use atomic inc to update histogram in BPF
It should use an atomic instruction to update even if the histogram is
keyed by delta as it's also used for stats.

Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Namhyung Kim
79056b3fe8 perf ftrace: Remove an unnecessary condition check in BPF
The bucket_num is set based on the {max,min}_latency already in
cmd_ftrace(), so no need to check it again in BPF.  Also I found
that it didn't pass the max_latency to BPF. :)

No functional changes intended.

Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Namhyung Kim
9c33441418 perf ftrace: Fix latency stats with BPF
When BPF collects the stats for the latency in usec, it first divides
the time by 1000.  But that means it would have 0 if the delta is small
and won't update the total time properly.

Let's keep the stats in nsec always and adjust to usec before printing.

Before:

  $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    1 us |        765 | #############################################  |
       1 -    2 us |         10 |                                                |
       2 -    4 us |          2 |                                                |
       4 -    8 us |          5 |                                                |

  # statistics  (in usec)
    total time:                    0    <<<--- (here)
      avg time:                    0
      max time:                    6
      min time:                    0
         count:                  782

After:

  $ sudo ./perf ftrace latency -ab -T mutex_lock --hide-empty -- sleep 0.1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 -    1 us |        880 | ############################################   |
       1 -    2 us |         13 |                                                |
       2 -    4 us |          8 |                                                |
       4 -    8 us |          3 |                                                |

  # statistics  (in usec)
    total time:                  268    <<<--- (here)
      avg time:                    0
      max time:                    6
      min time:                    0
         count:                  904

Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250227191223.1288473-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-13 00:18:10 -07:00
Ian Rogers
5b562763d7 perf test stat: Additional topdown grouping tests
Add a loop and helper function to avoid repetition, the loop uses
arrays so switch the shell to bash. Add additional topdown group tests
where a topdown event needs to be moved beyond others and the slots
event isn't first in the target group. This replicates issues that
occur on hybrid systems where the other events are for the cpu_atom
PMU. Test with both PMU and software events. Place the slots event
later in the event list.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:05:04 -07:00
Dapeng Mi
16dd43dfd6 perf x86 evlist: Update comments on topdown regrouping
Update to remove comments about groupings not working and with the:
```
perf stat -e "{instructions,slots},{cycles,topdown-retiring}"
```
case that now works.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:04:56 -07:00
Ian Rogers
9a1c57fe26 perf parse-events: Corrections to topdown sorting
In the case of '{instructions,slots},faults,topdown-retiring' the
first event that must be grouped, slots, is ignored causing the
topdown-retiring event not to be adjacent to the group it needs to be
inserted into. Don't ignore the group members when computing the
force_grouped_index.

Make the force_grouped_index be for the leader of the group it is
within and always use it first rather than a group leader index so
that topdown events may be sorted from one group into another.

As the PMU name comparison applies to moving events in the same group
ensure the name ordering is always respected.

Change the group splitting logic to not group if there are no other
topdown events and to fix cases where the force group leader wasn't
being grouped with the other members of its group.

Reported-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Closes: https://lore.kernel.org/lkml/20250224083306.71813-2-dapeng1.mi@linux.intel.com/
Closes: https://lore.kernel.org/lkml/f7e4f7e8-748c-4ec7-9088-0e844392c11a@linux.intel.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Dapeng Mi
b74683b3bb perf x86/topdown: Fix topdown leader sampling test error on hybrid
When running topdown leader smapling test on Intel hybrid platforms,
such as LNL/ARL, we see the below error.

Topdown leader sampling test
Topdown leader sampling [Failed topdown events not reordered correctly]

It indciates the below command fails.

perf record -o "${perfdata}" -e "{instructions,slots,topdown-retiring}:S" true

The root cause is that perf tool creats a perf event for each PMU type
if it can create.

As for this command, there would be 5 perf events created,
cpu_atom/instructions/,cpu_atom/topdown_retiring/,
cpu_core/slots/,cpu_core/instructions/,cpu_core/topdown-retiring/

For these 5 events, the 2 cpu_atom events are in a group and the other 3
cpu_core events are in another group.

When arch_topdown_sample_read() traverses all these 5 events, events
cpu_atom/instructions/ and cpu_core/slots/ don't have a same group
leade, and then return false directly and lead to cpu_core/slots/ event
is used to sample and this is not allowed by PMU driver.

It's a overkill to return false directly if "evsel->core.leader !=
 leader->core.leader" since there could be multiple groups in the event
list.

Just "continue" instead of "return false" to fix this issue.

Fixes: 1e53e9d178 ("perf x86/topdown: Correct leader selection with sample_read enabled")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307023906.1135613-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Ian Rogers
fd5de637a4 perf tools: Improve handling of hybrid PMUs in perf_event_attr__fprintf
Support the PMU name from the legacy hardware and hw_cache PMU
extended types.  Remove some macros and make variables more intention
revealing, rather than just being called "value".

Before:
```
$ perf stat -vv -e instructions true
...
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0xa00000001
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 181636  cpu -1  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000001
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid 181636  cpu -1  group_fd -1  flags 0x8 = 6
...
```

After:
```
$ perf stat -vv -e instructions true
...
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0xa00000001 (cpu_atom/PERF_COUNT_HW_INSTRUCTIONS/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
------------------------------------------------------------
sys_perf_event_open: pid 181724  cpu -1  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             136
  config                           0x400000001 (cpu_core/PERF_COUNT_HW_INSTRUCTIONS/)
  sample_type                      IDENTIFIER
  read_format                      TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING
  disabled                         1
  inherit                          1
  enable_on_exec                   1
------------------------------------------------------------
sys_perf_event_open: pid 181724  cpu -1  group_fd -1  flags 0x8 = 6
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Tested-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250307023906.1135613-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 19:00:50 -07:00
Ian Rogers
f7cffbabf7 perf python tracepoint: Switch to using parse_events
Rather than manually configuring an evsel, switch to using
parse_events for greater commonality with the rest of the perf code.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
0dfcc7c86c perf python: Add evlist.config to set up record options
Add access to evlist__config that is used to configure an evlist with
record options.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
1a8356fbf8 perf python: Add evlist all_cpus accessor
Add a means to get the reference counted all_cpus CPU map from an
evlist in its python form.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
9e9472c148 perf python: Avoid duplicated code in get_tracepoint_field
The code replicates computations done in evsel__tp_format, reuse
evsel__tp_format to simplify the python C code.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:38 -07:00
Ian Rogers
07fc231617 perf python: Update ungrouped evsel leader in clone
evsels are cloned in the python code as they form part of the Python
object pyrf_evsel. The cloning doesn't update the evsel's leader, do
this for the case of an evsel being ungrouped.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
6c62403b5a perf python: Add optional cpus and threads arguments to parse_events
Used for the evlist initialization.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
cc8bf352dd perf python: Add member access to a number of evsel variables
Most variables are part of the perf_event_attr, so that they may be
queried and modified.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
d8e1767779 perf python: Add evlist enable and disable methods
By default the evsels from parse_events will be disabled. Add access
to the evlist functions so they can be enabled/disabled.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
eb7e83a7ca perf evsel: tp_format accessing improvements
Ensure evsel__clone copies the tp_sys and tp_name variables.
In evsel__tp_format, if tp_sys isn't set, use the config value to find
the tp_format. This succeeds in python code where pyrf__tracepoint has
already found the format.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-4-irogers@google.com
Fixes: 6c8310e838 ("perf evsel: Allow evsel__newtp without libtraceevent")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
fe0ce8a9d8 perf evlist: Add success path to evlist__create_syswide_maps
Over various refactorings evlist__create_syswide_maps has been made to
only ever return with -ENOMEM. Fix this so that when
perf_evlist__set_maps is successfully called, 0 is returned.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-3-irogers@google.com
Fixes: 8c0498b689 ("perf evlist: Fix create_syswide_maps() not propagating maps")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Ian Rogers
bda840191d perf debug: Avoid stack overflow in recursive error message
In debug_file, pr_warning_once is called on error. As that function
calls debug_file the function will yield a stack overflow. Switch the
location of the call so the recursion is avoided.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250228222308.626803-2-irogers@google.com
Fixes: ec49230cf6 ("perf debug: Expose debug file")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-11 18:55:37 -07:00
Stephen Brennan
b10f74308e perf symbol: Support .gnu_debugdata for symbols
Fedora introduced a "MiniDebuginfo" feature, in which an LZMA-compressed
ELF file is placed inside a section named ".gnu_debugdata". This file
contains nothing but a symbol table, which can be used to supplement the
.dynsym section which only contains required symbols for runtime.

It is supported by GDB for stack traces, but it should be useful for
tracing as well. Implement support for loading symbols from
.gnu_debugdata.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250307232206.2102440-4-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:37:06 -07:00
Stephen Brennan
71fa411fe8 perf tools: Add LZMA decompression from FILE
Internally lzma_decompress_to_file() creates a FILE from the filename.
Add an API that takes an existing FILE directly. This allows
decompressing already-open files and even buffers opened by fmemopen().
It is necessary for supporting .gnu_debugdata in the next patch.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250307232206.2102440-3-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:37:02 -07:00
Stephen Brennan
20ef723113 perf tools: Add dummy functions for !HAVE_LZMA_SUPPORT
This allows us to use them without needing to ifdef the calling code.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250307232206.2102440-2-stephen.s.brennan@oracle.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:36:58 -07:00
Ian Rogers
db5af2e4a0 perf mem: Don't leak mem event names
When preparing the mem events for the argv copies are intentionally
made. These copies are leaked and cause runs of perf using address
sanitizer to fail. Rather than leak the memory allocate a chunk of
memory for the mem event names upfront and build the strings in this -
the storage is sized larger than the previous buffer size. The caller
is then responsible for clearing up this memory. As part of this
change, remove the mem_loads_name and mem_stores_name global buffers
then change the perf_pmu__mem_events_name to write to an out argument
buffer.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250308012853.1384762-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:26:45 -07:00
Eric Lin
6dad43bb11 perf vendor events riscv: Add SiFive P650 events
The SiFive Performance P650 core (including the vector-enabled P670 and
area-optimized P450/P470 variants) updates the P550 microarchitecture.
It brings in the debug, trace, and counter events from newer Bullet
cores, and adds new events for iTLB and dTLB multi-hits.

All other PMU events are unchanged from the P550 core.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-8-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Eric Lin
2e3a13d6b7 perf vendor events riscv: Add SiFive P550 events
The SiFive Performance P550 core features an out-of-order
microarchitecture which exposes the same PMU events as Bullet,
plus events for UTLB hits and PTE cache misses/hits.

Add support for specifying these events using symbolic names.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-7-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Eric Lin
8866a33815 perf vendor events riscv: Add SiFive Bullet version 0x0d events
SiFive Bullet microarchitecture cores with mimpid values starting with
0x0d or greater add new PMU events to count TLB miss stall cycles.

All other PMU events are unchanged from earlier Bullet cores.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-6-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Eric Lin
acaefd6049 perf vendor events riscv: Add SiFive Bullet version 0x07 events
SiFive Bullet microarchitecture cores with mimpid values starting with
0x07 or greater add new PMU events to support debug, trace, and counter
sampling and filtering (Sscofpmf).

All other PMU events are unchanged from earlier Bullet cores.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-5-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Eric Lin
4f762cb409 perf vendor events riscv: Update SiFive Bullet events
Regenerate the event lists from the original hardware description. This
makes them consistent with the event lists for newer versions of the
hardware, allowing most files to be reused across hardware versions.

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Co-developed-by: Samuel Holland <samuel.holland@sifive.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-4-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Samuel Holland
0d042fa514 perf vendor events riscv: Remove leading zeroes
The EventCode field (as stored in the mhpmeventN CSRs) is actually 56
bits wide, but there is no need to keep leading zeroes in the JSON
files. Remove them to simplify review of the following change, which
regenerates the files in a way that does not include leading zeroes.

This change was performed automatically with `sed -i "s/0x0*/0x/"`.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-3-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:38 -07:00
Samuel Holland
d35ad7e881 perf vendor events riscv: Rename U74 to Bullet
This set of PMU event descriptions applies not only to the SiFive U74
core configuration, but also to other SiFive cores that implement the
Bullet microarchitecture (such as U64, P270, and X280). Rename the
directory to be more generic.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250213220341.3215660-2-samuel.holland@sifive.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 14:15:37 -07:00
Dr. David Alan Gilbert
c1a37db3cf perf util: Remove unused perf_config__refresh
perf_config__refresh() was added in 2016 by
commit 8a0a9c7e91 ("perf config: Introduce new init() and exit()")
but has remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305023120.155420-7-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 11:31:24 -07:00
Dr. David Alan Gilbert
e032e7a775 perf util: Remove unused perf_pmus__default_pmu_name
perf_pmus__default_pmu_name() last use was removed by 2023's
commit e3edd6cf63 ("perf pmu-events: Reduce processed events by passing
PMU")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305023120.155420-6-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 11:31:24 -07:00
Dr. David Alan Gilbert
f986468641 perf util: Remove unused perf_data__update_dir
perf_data__update_dir() was added in 2019's
commit e8be135751 ("perf data: Add perf_data__update_dir() function")
but has never been used.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305023120.155420-5-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 11:31:24 -07:00
Dr. David Alan Gilbert
cf99ec1525 perf util: Remove unused pstack__pop
The last use of pstack__pop() was removed in 2015 by
commit 6422184b08 ("perf hists browser: Simplify zooming code using
pstack_peek()")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305023120.155420-4-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 11:31:24 -07:00
Dr. David Alan Gilbert
a9b496f420 perf util: Remove unused perf_color_default_config
perf_color_default_config() was added in 2009 by
commit 8fc0321f1a ("perf_counter tools: Add color terminal output
support")
but has remained unused.

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305023120.155420-3-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-10 11:31:24 -07:00
Ian Rogers
36e7748d33 perf tests: Fix data symbol test with LTO builds
With LTO builds, although regular builds could also see this as
all the code is in one file, the datasym workload can realize the
buf1.reserved data is never accessed. The compiler moves the
variable to bss and only keeps the data1 and data2 parts as
separate variables. This causes the symbol check to fail in the
test. Make the variable volatile to disable the more aggressive
optimization. Rename the variable to make which buf1 in perf is
being referred to.

Before:

  $ perf test -vv "data symbol"
  126: Test data symbol:
  --- start ---
  test child forked, pid 299808
  perf does not have symbol 'buf1'
  perf is missing symbols - skipping test
  ---- end(-2) ----
  126: Test data symbol                                                : Skip
  $ nm perf|grep buf1
  0000000000a5fa40 b buf1.0
  0000000000a5fa48 b buf1.1

After:

  $ nm perf|grep buf1
  0000000000a53a00 d buf1
  $ perf test -vv "data symbol"126: Test data symbol:
  --- start ---
  test child forked, pid 302166
   a53a00-a53a39 l buf1
  perf does have symbol 'buf1'
  Recording workload...
  Waiting for "perf record has started" message
  OK
  Cleaning up files...
  ---- end(0) ----
  126: Test data symbol                                                : Ok

Fixes: 3dfc01fe9d ("perf test: Add 'datasym' test workload")
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250226230109.314580-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-07 14:07:07 -08:00
Namhyung Kim
e1f5bb18a7 perf report: Fix memory leaks in the hierarchy mode
Ian told me that there are many memory leaks in the hierarchy mode.  I
can easily reproduce it with the follwing command.

  $ make DEBUG=1 EXTRA_CFLAGS=-fsanitize=leak

  $ perf record --latency -g -- ./perf test -w thloop

  $ perf report -H --stdio
  ...
  Indirect leak of 168 byte(s) in 21 object(s) allocated from:
      #0 0x7f3414c16c65 in malloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:75
      #1 0x55ed3602346e in map__get util/map.h:189
      #2 0x55ed36024cc4 in hist_entry__init util/hist.c:476
      #3 0x55ed36025208 in hist_entry__new util/hist.c:588
      #4 0x55ed36027c05 in hierarchy_insert_entry util/hist.c:1587
      #5 0x55ed36027e2e in hists__hierarchy_insert_entry util/hist.c:1638
      #6 0x55ed36027fa4 in hists__collapse_insert_entry util/hist.c:1685
      #7 0x55ed360283e8 in hists__collapse_resort util/hist.c:1776
      #8 0x55ed35de0323 in report__collapse_hists /home/namhyung/project/linux/tools/perf/builtin-report.c:735
      #9 0x55ed35de15b4 in __cmd_report /home/namhyung/project/linux/tools/perf/builtin-report.c:1119
      #10 0x55ed35de43dc in cmd_report /home/namhyung/project/linux/tools/perf/builtin-report.c:1867
      #11 0x55ed35e66767 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:351
      #12 0x55ed35e66a0e in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:404
      #13 0x55ed35e66b67 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:448
      #14 0x55ed35e66eb0 in main /home/namhyung/project/linux/tools/perf/perf.c:556
      #15 0x7f340ac33d67 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
  ...

  $ perf report -H --stdio 2>&1 | grep -c '^Indirect leak'
  93

I found that hist_entry__delete() missed to release child entries in the
hierarchy tree (hroot_{in,out}).  It needs to iterate the child entries
and call hist_entry__delete() recursively.

After this change:

  $ perf report -H --stdio 2>&1 | grep -c '^Indirect leak'
  0

Reported-by: Ian Rogers <irogers@google.com>
Tested-by Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250307061250.320849-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-07 14:07:07 -08:00
Namhyung Kim
e242df05ee perf report: Use map_symbol__copy() when copying callchains
It seems there are places to miss updating refcount of maps.
Let's use map_symbol__copy() helper to properly copy them with
refcounts updated.

Link: https://lore.kernel.org/r/20250307061250.320849-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-07 14:06:56 -08:00
Athira Rajeev
4c3f09e35c perf annotate: Return errors from disasm_line__parse_powerpc()
In disasm_line__parse_powerpc() , return code from function
disasm_line__parse() is ignored. This will result in bad results
if the disasm_line__parse() fails to disasm the line. Use
the return code to fix this.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-By: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20250304154114.62093-2-atrajeev@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-06 16:52:24 -08:00
Athira Rajeev
dab8c32ece perf annotate: Add annotation_options.disassembler_used
When doing "perf annotate", perf tool provides option to
use specific disassembler like llvm/objdump/capstone. The
order picked is to use llvm first and if that fails fallback
to objdump ie to use PERF_DISASM_LLVM, PERF_DISASM_CAPSTONE
and PERF_DISASM_OBJDUMP

In powerpc, when using "data type" sort keys, first preferred
approach is to read the raw instruction from the DSO. In objdump
is specified in "--objdump" option, it picks the symbol disassemble
using objdump. Currently disasm_line__parse_powerpc() function
uses length of the "line" to determine if objdump is used.
But there are few cases, where if objdump doesn't recognise the
instruction, the disassembled string will be empty.

Example:

     134cdc:	c4 05 82 41 	beq     1352a0 <getcwd+0x6e0>
     134ce0:	ac 00 8e 40 	bne     cr3,134d8c <getcwd+0x1cc>
     134ce4:	0f 00 10 04 	pld     r9,1028308
====>134ce8:	d4 b0 20 e5
     134cec:	16 00 40 39 	li      r10,22
     134cf0:	48 01 21 ea 	ld      r17,328(r1)

So depending on length of line will give bad results.

Add a new filed to annotation options structure,
"struct annotation_options" to save the disassembler used.
Use this info to determine if disassembly is done while
parsing the disasm line.

Reported-by: Tejas Manhas <Tejas.Manhas1@ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Tested-By: Venkat Rao Bagalkote <venkat88@linux.ibm.com>
Link: https://lore.kernel.org/r/20250304154114.62093-1-atrajeev@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-06 16:51:22 -08:00
Namhyung Kim
b0920abe0d perf report: Do not process non-JIT BPF ksymbol events
The length of PERF_RECORD_KSYMBOL for BPF is a size of JITed code so
it'd be 0 when it's not JITed.  The ksymbol is needed to symbolize the
code when it gets samples in the region but non-JITed code cannot get
samples.  Thus it'd be ok to ignore them.

Actually it caused a performance issue in the perf tools on old ARM
kernels where it can refuse to JIT some BPF codes.  It ended up
splitting the existing kernel map (kallsyms).  And later lookup for a
kernel symbol would create a new kernel map from kallsyms and then
split it again and again. :(

Probably there's a bug in the kernel map/symbol handling in perf tools.
But I think we need to fix this anyway.

Reported-by: Kevin Nomura <nomurak@google.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250305232838.128692-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-06 16:00:25 -08:00
Ian Rogers
2c744f38da perf test: Fix leak in "Synthesize attr update" test
The own_cpus map variable may be non-NULL and hold a reference, in
particular on hybrid machines. Do a put before overwriting the
variable to avoid a memory leak.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250305191931.604764-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-06 15:59:41 -08:00
Namhyung Kim
41453107bf perf machine: Fix insertion of PERF_RECORD_KSYMBOL related kernel maps
This was detected at the end of a 'perf record' session when build-id
collection was enabled and thus the BPF programs put in place while the
session was running, some even put in place by perf itself were
processed and inserted, with some overlaps related to BPF trampolines
and programs took place.

Using maps__fixup_overlap_and_insert() instead of maps__insert() "fixes"
the problem, in the sense that overlaps will be dealt with and then the
consistency will be kept, but it would be interesting to fully
understand why such overlaps take place and how to deal with them when
doing symbol resolution.

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Suggested-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/CAP-5=fXEEMFgPF2aZhKsfrY_En+qoqX20dWfuE_ad73Uxf0ZHQ@mail.gmail.com
Link: https://lore.kernel.org/r/20250228211734.33781-7-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:04:07 -08:00
Arnaldo Carvalho de Melo
e0e4e0b8b7 perf maps: Add missing map__set_kmap_maps() when replacing a kernel map
Since in this case __maps__insert_sorted() is not called and thus
doesn't have the opportunity to do the needed map__set_kmap_maps() calls on
the new map.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/Z7-May5w9VQd5QD0@x1
Link: https://lore.kernel.org/r/20250228211734.33781-6-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:03:43 -08:00
Namhyung Kim
0d11fab327 perf maps: Fixup maps_by_name when modifying maps_by_address
We can't just replacing the map in the maps_by_address and not touching
on the maps_by_name, that would leave the refcount as 1 and thus trip
another consistency check, this one:

  perf: util/maps.c:110: check_invariants:
  	Assertion `refcount_read(map__refcnt(map)) > 1' failed.

  106         /*
  107          * Maps by name maps should be in maps_by_address, so
  108          * the reference count should be higher.
  109          */
  110         assert(refcount_read(map__refcnt(map)) > 1);

Committer notice:

Initialize the newly added 'ni' variable, that really can't be
accessed unitialized trips some gcc versions, like:

  12    20.00 archlinux:base                : FAIL gcc version 13.2.1 20230801 (GCC)
    util/maps.c: In function ‘__maps__fixup_overlap_and_insert’:
    util/maps.c:896:54: error: ‘ni’ may be used uninitialized [-Werror=maybe-uninitialized]
      896 |                                 map__put(maps_by_name[ni]);
          |                                                      ^
    util/maps.c:816:25: note: ‘ni’ was declared here
      816 |         unsigned int i, ni;
          |                         ^~
    cc1: all warnings being treated as errors
    make[3]: *** [/git/perf-6.14.0-rc1/tools/build/Makefile.build:138: util] Error 2

Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/Z79std66tPq-nqsD@google.com
Link: https://lore.kernel.org/r/20250228211734.33781-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:03:33 -08:00
Namhyung Kim
f7a46e028c perf machine: Fixup kernel maps ends after adding extra maps
I just noticed it would add extra kernel maps after modules.  I think it
should fixup end address of the kernel maps after adding all maps first.

Fixes: 876e80cf83 ("perf tools: Fixup end address of modules")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/Z7TvZGjVix2asYWI@x1
Link: https://lore.kernel.org/lkml/Z712hzvv22Ni63f1@google.com
Link: https://lore.kernel.org/r/20250228211734.33781-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:03:15 -08:00
Arnaldo Carvalho de Melo
25d9c0301d perf maps: Set the kmaps for newly created/added kernel maps
When using __maps__insert_sorted() the map kmaps field needs to be
initialized, as we need kernel maps to work with map__kmap().

Fix it by using the newly introduced map__set_kmap() method.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/Z74V0hZXrTLM6VIJ@x1
Link: https://lore.kernel.org/r/20250228211734.33781-3-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:03:11 -08:00
Arnaldo Carvalho de Melo
99deaf5578 perf maps: Introduce map__set_kmap_maps() for kernel maps
We need to set it in other places than __maps__insert(), so that we can
have access to the 'struct maps' from a kernel 'struct map'.

When building perf with 'DEBUG=1' we can notice it failing a consistency
check done in the check_invariants() function:

  root@number:~# perf record -- perf test -w offcpu
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.040 MB perf.data (23 samples) ]
  perf: util/maps.c:95: check_invariants: Assertion `map__end(prev) <= map__end(map)' failed.
  Aborted (core dumped)
  root@number:~#

The investigation on that was happening bisected to 876e80cf83
("perf tools: Fixup end address of modules"), and the following patches
will plug the problems found, this patch is just legwork on that
direction.

Use the map__set_kmap_maps() name as per a review comment from Ian
Rogers, later there are further suggestions from him on getting rid of
the kmaps variable, see the thread referenced in the Link below.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/Z74V0hZXrTLM6VIJ@x1
Link: https://lore.kernel.org/r/20250228211734.33781-2-acme@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 23:03:04 -08:00
Thomas Falcon
74fb903b21 perf script: Fix output type for dynamically allocated core PMU's
This patch was originally posted here:

https://lore.kernel.org/all/20241213215421.661139-1-thomas.falcon@intel.com/

I have rebased on top of Arnaldo's patch here:

https://lore.kernel.org/all/Z2XCi3PgstSrV0SE@x1/

The original commit message:
"
perf script output may show different fields on different core PMU's
that exist on heterogeneous platforms. For example,

perf record -e "{cpu_core/mem-loads-aux/,cpu_core/event=0xcd,\
umask=0x01,ldlat=3,name=MEM_UOPS_RETIRED.LOAD_LATENCY/}:upp"\
-c10000 -W -d -a -- sleep 1

perf script:

chromium-browse   46572 [002] 544966.882384:      10000 	cpu_core/MEM_UOPS_RETIRED.LOAD_LATENCY/: 7ffdf1391b0c     10268100142 \
 |OP LOAD|LVL L1 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK    N/A    5   7    0   7fad7c47425d [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3)

perf record -e cpu_atom/event=0xd0,umask=0x05,ldlat=3,\
name=MEM_UOPS_RETIRED.LOAD_LATENCY/upp -c10000 -W -d -a -- sleep 1

perf script:

gnome-control-c  534224 [023] 544951.816227:      10000 cpu_atom/MEM_UOPS_RETIRED.LOAD_LATENCY/:   7f0aaaa0aae0  [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3)

Some fields, such as data_src, are not included by default.

The cause is that while one PMU may be assigned a type such as
PERF_TYPE_RAW, other core PMU's are dynamically allocated at boot time.
If this value does not match an existing PERF_TYPE_X value,
output_type(perf_event_attr.type) will return OUTPUT_TYPE_OTHER.

Instead search for a core PMU with a matching perf_event_attr type
and, if one is found, return PERF_TYPE_RAW to match output of other
core PMU's.
"

Suggested-by: Kan Liang <kan.liang@intel.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250305163935.1605312-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 11:46:00 -08:00
Thomas Richter
957d194163 perf bench: Fix perf bench syscall loop count
Command 'perf bench syscall fork -l 100000' offers option -l to run for
a specified number of iterations. However this option is not always
observed. The number is silently limited to 10000 iterations as can be
seen:

Output before:
 # perf bench syscall fork -l 100000
 # Running 'syscall/fork' benchmark:
 # Executed 10,000 fork() calls
     Total time: 23.388 [sec]

    2338.809800 usecs/op
            427 ops/sec
 #

When explicitly specified with option -l or --loops, also observe
higher number of iterations:

Output after:
 # perf bench syscall fork -l 100000
 # Running 'syscall/fork' benchmark:
 # Executed 100,000 fork() calls
     Total time: 716.982 [sec]

    7169.829510 usecs/op
            139 ops/sec
 #

This patch fixes the issue for basic execve fork and getpgid.

Fixes: ece7f7c050 ("perf bench syscall: Add fork syscall benchmark")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Link: https://lore.kernel.org/r/20250304092349.2618082-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:19:23 -08:00
Namhyung Kim
b627b443cc perf test: Simplify data symbol test
Now the workload will end after 1 second.  Just run it with perf instead
of waiting for the background process.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Namhyung Kim
f04c7ef352 perf test: Add timeout to datasym workload
Unlike others it has an infinite loop that make it annoying to call.
Make it finish after 1 second and handle command-line argument to change
the setting.

Reviewed-by: Leo Yan <leo.yan@arm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-6-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Namhyung Kim
15bcfb96d0 perf test: Add trace record and replay test
It just check trace record and replay could display correct output.
It uses 'sleep' process and sees there's a clock_nanosleep syscall.

  $ sudo perf test -vv replay
  108: perf trace record and replay:
  --- start ---
  test child forked, pid 1563219
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.077 MB /tmp/temporary_file.w1ApA (242 samples) ]
       0.686 (1000.068 ms): sleep/1563226 clock_nanosleep(rqtp: 0x7ffc20ffee10, rmtp: 0x7ffc20ffee50)           = 0
  ---- end(0) ----
  108: perf trace record and replay                                    : Ok

Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Namhyung Kim
38672c5033 perf test: Skip perf trace tests when running as non-root
perf trace requires root because it needs to use tracepoints and BPF.
Skip those test when it's not run as root.

Before:
  $ perf test trace
   15: Parse sched tracepoints fields                                  : Skip (permissions)
   80: perf ftrace tests                                               : Skip
  105: perf trace enum augmentation tests                              : FAILED!
  106: perf trace BTF general tests                                    : FAILED!
  107: perf trace exit race                                            : FAILED!
  118: probe libc's inet_pton & backtrace it with ping                 : Skip
  125: Check Arm CoreSight trace data recording and synthesized samples: Skip
  127: Check Arm SPE trace data recording and synthesized samples      : Skip
  132: Check open filename arg using perf trace + vfs_getname          : FAILED!

After:
  $ perf test trace
   15: Parse sched tracepoints fields                                  : Skip (permissions)
   80: perf ftrace tests                                               : Skip
  105: perf trace enum augmentation tests                              : Skip
  106: perf trace BTF general tests                                    : Skip
  107: perf trace exit race                                            : Skip
  118: probe libc's inet_pton & backtrace it with ping                 : Skip
  125: Check Arm CoreSight trace data recording and synthesized samples: Skip
  127: Check Arm SPE trace data recording and synthesized samples      : Skip
  132: Check open filename arg using perf trace + vfs_getname          : Skip

Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Namhyung Kim
3fb29a7514 perf test: Skip perf probe tests when running as non-root
perf trace requires root because it needs to use [ku]probes.
Skip those test when it's not run as root.

Before:
  $ perf test probe
   47: Probe SDT events                                                : Ok
  104: test perf probe of function from different CU                   : FAILED!
  115: perftool-testsuite_probe                                        : FAILED!
  117: Add vfs_getname probe to get syscall args filenames             : FAILED!
  118: probe libc's inet_pton & backtrace it with ping                 : FAILED!
  119: Use vfs_getname probe to get syscall args filenames             : FAILED!

After:
  $ perf test probe
   47: Probe SDT events                                                : Ok
  104: test perf probe of function from different CU                   : Skip
  115: perftool-testsuite_probe                                        : Skip
  117: Add vfs_getname probe to get syscall args filenames             : Skip
  118: probe libc's inet_pton & backtrace it with ping                 : Skip
  119: Use vfs_getname probe to get syscall args filenames             : Skip

Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250304022837.1877845-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Namhyung Kim
45a86d017a perf test: Add --metric-only to perf stat output tests
Add a test case for --metric-only for std, csv, json output mode using
shadow IPC metric from instructions and cycles events.  It should
produce 'insn per cycle' metric.

But currently JSON output has (none) 'GHz' as well.  It looks like a bug
but I don't have enough time to debug it for now so I made it pass. :(

  $ perf stat --metric-only -e instructions,cycles true

   Performance counter stats for 'true':

                    0.56

         0.002127319 seconds time elapsed

         0.002077000 seconds user
         0.000000000 seconds sys

  $ perf stat -x, --metric-only -e instructions,cycles true

  0.55,,

  $ perf stat -j --metric-only -e instructions,cycles true
  {"insn per cycle" : "0.53", "GHz" : "none"}

  $ perf test output -v
    5: Test data source output                                         : Ok
   31: Sort output of hist entries                                     : Ok
   88: perf stat CSV output linter                                     : Ok
   90: perf stat JSON output linter                                    : Ok
   92: perf stat STD output linter                                     : Ok

Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250304022837.1877845-2-namhyung@kernel.org
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:17:01 -08:00
Leo Yan
2cc2f258a9 perf arm-spe: Support previous branch target (PBT) address
When FEAT_SPE_PBT is implemented, the previous branch target address
(named as PBT) before the sampled operation, will be recorded.

This commit first introduces a 'prev_br_tgt' field in the record for
saving the PBT address in the decoder.

If the current operation is a branch instruction, by combining with PBT,
it can create a chain with two consecutive branches.  As the branch
stack stores branches in descending order, meaning a newer branch is
stored in a lower entry in the stack.  Arm SPE stores the latest branch
in the first entry of branch stack, and the previous branch coming from
PBT is stored into the second entry.

Otherwise, if current operation is not a branch, the last branch will be
saved for PBT only.  PBT lacks associated information such as branch
source address, branch type, and events.  The branch entry fills zeros
for the corresponding fields and only set its target address.

After:

  perf script -f --itrace=bl -F flags,addr,brstack
  jcc                   ffff800080187914 0xffff8000801878fc/0xffff800080187914/P/-/-/1/COND/-  0x0/0xffff8000801878f8/-/-/-/0//-
  jcc                   ffff8000802d12d8 0xffff8000802d12f8/0xffff8000802d12d8/P/-/-/1/COND/-  0x0/0xffff8000802d12ec/-/-/-/0//-
  jcc                   ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/-  0x0/0xffff8000813fe200/-/-/-/0//-
  jcc                   ffff8000813fe200 0xffff8000813fe20c/0xffff8000813fe200/P/-/-/1/COND/-  0x0/0xffff8000813fe200/-/-/-/0//-
  jmp                   ffff800081410980 0xffff800081419108/0xffff800081410980/P/-/-/1//-  0x0/0xffff800081419104/-/-/-/0//-
  return                ffff80008036e064 0xffff80008141ba84/0xffff80008036e064/P/-/-/1/RET/-  0x0/0xffff80008141ba60/-/-/-/0//-
  jcc                   ffff8000803d54f0 0xffff8000803d54e8/0xffff8000803d54f0/P/-/-/1/COND/-  0x0/0xffff8000803d54e0/-/-/-/0//-
  jmp                   ffff80008015e468 0xffff8000803d46dc/0xffff80008015e468/P/-/-/1//-  0x0/0xffff8000803d46c8/-/-/-/0//-
  jmp                   ffff8000806e2d50 0xffff80008040f710/0xffff8000806e2d50/P/-/-/1//-  0x0/0xffff80008040f6e8/-/-/-/0//-
  jcc                   ffff800080721704 0xffff8000807216b4/0xffff800080721704/P/-/-/1/COND/-  0x0/0xffff8000807216ac/-/-/-/0//-

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-13-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:20 -08:00
Leo Yan
73cb57f56f perf arm-spe: Add branch stack
Although Arm SPE cannot generate continuous branch records, this commit
creates a branch stack with only one branch entry.  A single branch info
can be used for performance optimization.

A branch stack structure is dynamically allocated in the decode queue.
The branch stack and stack flags are synthesized based on branch types
and associated events.

After:

  # perf script --itrace=bl1 -F flags,addr,brstack

  jcc                   ffffc0fad9c6b214 0xffffc0fad9c6b234/0xffffc0fad9c6b214/P/-/-/7/COND/-
  jcc/miss,not_taken/   ffffc0fadaaebb30 0xffffc0fadaaebb2c/0xffffc0fadaaebb30/MN/-/-/7/COND/-
  jmp                   ffffc0fadaaea358 0xffffc0fadaaea5ec/0xffffc0fadaaea358/P/-/-/5//-
  jcc/not_taken/        ffffc0fadaae6494 0xffffc0fadaae6490/0xffffc0fadaae6494/PN/-/-/11/COND/-
  jcc/not_taken/            ffff7f83ab54 0xffff7f83ab50/0xffff7f83ab54/PN/-/-/13/COND/-
  jcc/not_taken/            ffff7f83ab08 0xffff7f83ab04/0xffff7f83ab08/PN/-/-/8/COND/-
  jcc                       ffff7f83aa80 0xffff7f83aa58/0xffff7f83aa80/P/-/-/10/COND/-
  jcc                       ffff7f9a45d0 0xffff7f9a43f0/0xffff7f9a45d0/P/-/-/29/COND/-
  jcc/not_taken/        ffffc0fad9ba6db4 0xffffc0fad9ba6db0/0xffffc0fad9ba6db4/PN/-/-/44/COND/-
  jcc                   ffffc0fadaac2964 0xffffc0fadaac2970/0xffffc0fadaac2964/P/-/-/6/COND/-
  jcc                   ffffc0fad99ddc10 0xffffc0fad99ddc04/0xffffc0fad99ddc10/P/-/-/72/COND/-
  jcc/not_taken/        ffffc0fad9b3f21c 0xffffc0fad9b3f218/0xffffc0fad9b3f21c/PN/-/-/64/COND/-
  jcc                   ffffc0fad9c3b604 0xffffc0fad9c3b5f8/0xffffc0fad9c3b604/P/-/-/13/COND/-
  jcc                   ffffc0fadaad6048 0xffffc0fadaad5f8c/0xffffc0fadaad6048/P/-/-/5/COND/-
  return/miss/              ffff7f84e614 0xffffc0fad98a2274/0xffff7f84e614/M/-/-/13/RET/-
  jcc/not_taken/        ffffc0fadaac4eb4 0xffffc0fadaac4eb0/0xffffc0fadaac4eb4/PN/-/-/5/COND/-
  jmp                       ffff7f8e3130 0xffff7f87555c/0xffff7f8e3130/P/-/-/5//-
  jcc/not_taken/        ffffc0fad9b3d9b0 0xffffc0fad9b3d9ac/0xffffc0fad9b3d9b0/PN/-/-/14/COND/-
  return                ffffc0fad9b91950 0xffffc0fad98c3e28/0xffffc0fad9b91950/P/-/-/12/RET/-

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-12-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:20 -08:00
Leo Yan
4a53a67e0e perf arm-spe: Set sample flags with supplement info
Based on the supplement information in the record, this commit sets the
sample flags for conditional branch, function call, return.  It also
sets events in flags, such as mispredict, not taken, and in transaction.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-11-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:20 -08:00
Leo Yan
5c1b158396 perf arm-spe: Fill branch operations and events to record
The new added branch operations and events are filled into record, the
information will be consumed when synthesizing samples.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-10-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:20 -08:00
Leo Yan
faf2260542 perf arm-spe: Decode transactional event
The bit[16] in an event payload indicates an operation is in
transactional state.  Decode the bit.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-9-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:20 -08:00
Leo Yan
64d86c03e1 perf arm-spe: Extend branch operations
In Arm ARM (ARM DDI 0487, L.a), the section "D18.2.7 Operation Type
packet", the branch subclass is extended for Call Return (CR), Guarded
control stack data access (GCS).

This commit adds support CR and GCS operations.  The IND (indirect)
operation is defined only in bit [1], its macro is updated accordingly.

Move the COND (Conditional) macro into the same group with other
operations for better maintenance.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-8-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
e1d47850bb perf arm-spe: Fix load-store operation checking
The ARM_SPE_OP_LD and ARM_SPE_OP_ST operations are secondary operation
type, they are overlapping with other second level's operation types
belonging to SVE and branch operations.  As a result, a non load-store
operation can be parsed for data source and memory sample.

To fix the issue, this commit introduces a is_ldst_op() macro for
checking LDST operation, and apply the checking when synthesize data
source and memory samples.

Fixes: a89dbc9b98 ("perf arm-spe: Set sample's data source field")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250304111240.3378214-7-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
1e66dcff7b perf script: Add not taken event for branch stack
The branch stack has an existed field for printing mispredict, extend
the field for printing events and add support not-taken event.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-6-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
4caa971050 perf script: Add not taken event for branches
Some hardware (e.g., Arm SPE) can trace the not taken event for
branches.  Add a flag for this event and support printing it.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-5-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
88b1473135 perf script: Separate events from branch types
Branch types and events are two different things.  A branch type can be
a conditional branch, an indirect branch, a procedure call, a return, or
an exception taken, etc.  The extra event information is provided for
what happens during a branch, e.g. if a branch is mispredicted or not
taken (specific to conditional branches).

To deliver information about branches, this commit separates events from
branch types.  It parses branch types first, then appends event strings
embraced by the '/' character.  If multiple events occur, the events is
separated with a comma (,).

Also add a minor improvement by adding char 'm' in char array for branch
mispredict event.

Below are extracted sample flags.

Before:
        branch:   br miss
  instructions:   br miss

After:
        branch:   jmp/miss/
  instructions:   jmp/miss/

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-4-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
4d59818897 perf script: Refactor sample_flags_to_name() function
When generating a string for sample flags, the sample_flags_to_name()
function lacks the ability to parse the trace start bit or trace end bit.
Therefore, the function is invoked multiple times after clearing its
unsupported bits.

This commit improves the sample_flags_to_name() function to parse sample
flags in one go for three kinds of information:
- The prefix info for trace start, trace end, etc.
- Branch types.
- Extra info for transaction and interrupt related info.

As a result, the code is simplified to call the sample_flags_to_name()
only once.  No expectation for any changes in the perf script output.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
Leo Yan
2b747a86d8 perf script: Make printing flags reliable
Add a check for the generated string of flags.  Print out the raw number
if the string generation fails.

Use the SAMPLE_FLAGS_STR_ALIGNED_SIZE macro to replace the value '21'.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20250304111240.3378214-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05 09:13:19 -08:00
James Clark
be9f3e95a9 perf stat: Fix non-uniquified hybrid legacy events
Legacy hybrid events have attr.type == PERF_TYPE_HARDWARE, so they look
like plain legacy events if we only look at attr.type. But legacy events
should still be uniquified if they were opened on a non-legacy PMU. Fix
it by checking if the evsel is hybrid and forcing needs_uniquify
before looking at the attr.type.

This restores PMU names on hybrid systems and also changes "perf stat
metrics (shadow stat) test" from a FAIL back to a SKIP (on hybrid). The
test was gated on "cycles" appearing alone which doesn't happen on
here.

Before:
  $ perf stat -- true
  ...
     <not counted>      instructions:u                           (0.00%)
           162,536      instructions:u            # 0.58  insn per cycle
  ...

After:
  $ perf stat -- true
  ...
      <not counted>      cpu_atom/instructions/u                  (0.00%)
            162,541      cpu_core/instructions/u   # 0.62  insn per cycle
  ...

Fixes: 357b965deb ("perf stat: Changes to event name uniquification")
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250226145526.632380-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-03 12:48:17 -08:00
Namhyung Kim
7788ad59d1 perf tools: Skip BPF sideband event for userspace profiling
The BPF sideband information is tracked using a separate thread and
evlist.  But it's only useful for profiling kernel and we can skip it
when users profile their application only.

It seems it already fails to open the sideband event in that case.
Let's remove the noise in the verbose output anyway.

Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250226203039.1099131-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-02 09:47:24 -08:00
Colin Ian King
7e55bc0110 perf test: Fix spelling mistake "sythesizing" -> "synthesizing"
There are spelling mistakes in TEST_ASSERT_VAL messages. Fix them.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20250228090941.680226-1-colin.i.king@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 16:17:53 -08:00
Luca Ceresoli
75100d848e perf build: Fix in-tree build due to symbolic link
Building perf in-tree is broken after commit 890a1961c8 ("perf tools:
Create source symlink in perf object dir") which added a 'source' symlink
in the output dir pointing to the source dir.

With in-tree builds, the added 'SOURCE = ...' line is executed multiple
times (I observed 2 during the build plus 2 during installation). This is a
minor inefficiency, in theory not harmful because symlink creation is
assumed to be idempotent. But it is not.

Considering with in-tree builds:

  srctree=/absolute/path/to/linux
   OUTPUT=/absolute/path/to/linux/tools/perf

here's what happens:

 1. ln -sf $(srctree)/tools/perf $(OUTPUT)/source
    -> creates /absolute/path/to/linux/tools/perf/source
       link to /absolute/path/to/linux/tools/perf
    => OK, that's what was intended
 2. ln -sf $(srctree)/tools/perf $(OUTPUT)/source   # same command as 1
    -> creates /absolute/path/to/linux/tools/perf/perf
       link to /absolute/path/to/linux/tools/perf
    => Not what was intended, not idempotent
 3. Now the build _should_ create the 'perf' executable, but it fails

The reason is the tricky 'ln' command line. At the first invocation 'ln'
uses the 1st form:

       ln [OPTION]... [-T] TARGET LINK_NAME

and creates a link to TARGET *called LINK_NAME*.

At the second invocation $(OUTPUT)/source exists, so 'ln' uses the 3rd
form:

       ln [OPTION]... TARGET... DIRECTORY

and creates a link to TARGET *called TARGET* inside DIRECTORY.

Fix by adding -n/--no-dereference to "treat LINK_NAME as a normal file
if it is a symbolic link to a directory", as the manpage says.

Closes: https://lore.kernel.org/all/20241125182506.38af9907@booty/
Fixes: 890a1961c8 ("perf tools: Create source symlink in perf object dir")
Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Tested-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20250124-perf-fix-intree-build-v1-1-485dd7a855e4@bootlin.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 16:17:41 -08:00
Leo Yan
e50b291fbb perf arm-spe: Report error if set frequency
When users set the parameter '-F' to specify frequency for Arm SPE, the
tool reports error:

  perf record -F 1000 -e arm_spe_0// -- sleep 1
  Error:
  Invalid event (arm_spe_0//) in per-thread mode, enable system wide with '-a'.

The output logs are confused and it does not give the correct reminding.
Arm SPE does not support frequency setting given it adopts a statistical
based approach.

Alternatively, Arm SPE supports setting period.  This commit adds a
for frequency setting.  It reports error and reminds users to set period
instead.

After:

  perf record -F 1000 -e arm_spe_0// -- sleep 1
  Arm SPE: Frequency is not supported. Set period with -c option or PMU parameter (-e arm_spe_0/period=NUM/).

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250227085544.2154136-1-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 10:09:03 -08:00
Chun-Tse Shao
3c97e7b991 perf lock: Report owner stack in usermode
This patch parses `owner_lock_stat` into a RB tree, enabling ordered
reporting of owner lock statistics with stack traces. It also updates
the documentation for the `-o` option in contention mode, decouples `-o`
from `-t`, and issues a warning to inform users about the new behavior
of `-ov`.

Example output:
  $ sudo ~/linux/tools/perf/perf lock con -abvo -Y mutex-spin -E3 perf bench sched pipe
  ...
   contended   total wait     max wait     avg wait         type   caller

         171      1.55 ms     20.26 us      9.06 us        mutex   pipe_read+0x57
                          0xffffffffac6318e7  pipe_read+0x57
                          0xffffffffac623862  vfs_read+0x332
                          0xffffffffac62434b  ksys_read+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
          36    193.71 us     15.27 us      5.38 us        mutex   pipe_write+0x50
                          0xffffffffac631ee0  pipe_write+0x50
                          0xffffffffac6241db  vfs_write+0x3bb
                          0xffffffffac6244ab  ksys_write+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
           4     51.22 us     16.47 us     12.80 us        mutex   do_epoll_wait+0x24d
                          0xffffffffac691f0d  do_epoll_wait+0x24d
                          0xffffffffac69249b  do_epoll_pwait.part.0+0xb
                          0xffffffffac693ba5  __x64_sys_epoll_pwait+0x95
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76

  === owner stack trace ===

           3     31.24 us     15.27 us     10.41 us        mutex   pipe_read+0x348
                          0xffffffffac631bd8  pipe_read+0x348
                          0xffffffffac623862  vfs_read+0x332
                          0xffffffffac62434b  ksys_read+0xbb
                          0xfffffffface604b2  do_syscall_64+0x82
                          0xffffffffad00012f  entry_SYSCALL_64_after_hwframe+0x76
  ...

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250227003359.732948-5-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 10:09:02 -08:00
Chun-Tse Shao
a40ccb7d98 perf lock: Make rb_tree helper functions generic
The rb_tree helper functions can be reused for parsing `owner_lock_stat`
into rb tree for sorting.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250227003359.732948-4-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 00:29:37 -08:00
Chun-Tse Shao
425bc88352 perf lock: Retrieve owner callstack in bpf program
This implements per-callstack aggregation of lock owners in addition to
per-thread.  The owner callstack is captured using `bpf_get_task_stack()`
at `contention_begin()` and it also adds a custom stackid function for the
owner stacks to be compared easily.

The owner info is kept in a hash map using lock addr as a key to handle
multiple waiters for the same lock.  At `contention_end()`, it updates the
owner lock stat based on the info that was saved at `contention_begin()`.
If there are more waiters, it'd update the owner pid to itself as
`contention_end()` means it gets the lock now.  But it also needs to check
the return value of the lock function in case task was killed by a signal
or something.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250227003359.732948-3-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 00:29:37 -08:00
Chun-Tse Shao
17ae7f9049 perf lock: Add bpf maps for owner stack tracing
Add a struct and few bpf maps in order to tracing owner stack.
`struct owner_tracing_data`: Contains owner's pid, stack id, timestamp for
  when the owner acquires lock, and the count of lock waiters.
`stack_buf`: Percpu buffer for retrieving owner stacktrace.
`owner_stacks`: For tracing owner stacktrace to customized owner stack id.
`owner_data`: For tracing lock_address to `struct owner_tracing_data` in
  bpf program.
`owner_stat`: For reporting owner stacktrace in usermode.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Tested-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250227003359.732948-2-ctshao@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-28 00:29:36 -08:00
Ian Rogers
c760174401 perf cpumap: Reduce cpu size from int to int16_t
Fewer than 32k logical CPUs are currently supported by perf. A cpumap
is indexed by an integer (see perf_cpu_map__cpu) yielding a perf_cpu
that wraps a 4-byte int for the logical CPU - the wrapping is done
deliberately to avoid confusing a logical CPU with an index into a
cpumap. Using a 4-byte int within the perf_cpu is larger than required
so this patch reduces it to the 2-byte int16_t. For a cpumap
containing 16 entries this will reduce the array size from 64 to 32
bytes. For very large servers with lots of logical CPUs the size
savings will be greater.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250210191231.156294-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-27 08:47:25 -08:00
Athira Rajeev
2337b7251d perf trace: Add missing perf_tool__init()
Perf trace on perf.data fails as below:

	./perf trace record -- sleep 1
	./perf trace -i perf.data
	perf: Segmentation fault
	Segmentation fault (core dumped)

Backtrace pointed to :
	?? ()
	perf_session.process_user_event ()
	reader.read_event ()
	perf_session.process_events ()
	cmd_trace ()
	run_builtin ()
	handle_internal_command ()
	main ()

Further debug pointed that, segmentation fault happens when
trying to access id_index. Code snippet:

	case PERF_RECORD_ID_INDEX:
		err = tool->id_index(session, event);

Since 'commit 15d4a6f41d ("perf tool: Remove
perf_tool__fill_defaults()")', perf_tool__fill_defaults is
removed. All tools are initialized using perf_tool__init()
prior to use. But in builtin-trace, perf_tool__init is not
used and hence the defaults are not initialized. Use
perf_tool__init() in perf trace to handle the initialization.

Reported-by: Tejas Manhas <Tejas.Manhas1@ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Link: https://lore.kernel.org/r/20250225113157.28836-1-atrajeev@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-27 08:46:45 -08:00
James Clark
5c496f1d67 perf list: Document -v option deduplication feature
-v disables deduplication of similarly suffixed PMUs so add it to the
help and doc strings.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250226104111.564443-4-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 16:23:47 -08:00
James Clark
c9d699e10f perf pmu: Don't double count common sysfs and json events
After pmu_add_cpu_aliases() is called, perf_pmu__num_events() returns an
incorrect value that double counts common events and doesn't match the
actual count of events in the alias list. This is because after
'cpu_aliases_added == true', the number of events returned is
'sysfs_aliases + cpu_json_aliases'. But when adding 'case
EVENT_SRC_SYSFS' events, 'sysfs_aliases' and 'cpu_json_aliases' are both
incremented together, failing to account that these ones overlap and
only add a single item to the list. Fix it by adding another counter for
overlapping events which doesn't influence 'cpu_json_aliases'.

There doesn't seem to be a current issue because it's used in perf list
before pmu_add_cpu_aliases() so the correct value is returned. Other
uses in tests may also miss it for other reasons like only looking at
uncore events. However it's marked as a fixes commit in case any new fix
with new uses of perf_pmu__num_events() is backported.

Fixes: d9c5f5f94c ("perf pmu: Count sys and cpuid JSON events separately")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250226104111.564443-3-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 16:23:47 -08:00
James Clark
72c6f57a41 perf pmu: Dynamically allocate tool PMU
perf_pmus__destroy() treats all PMUs as allocated and free's them so we
can't have any static PMUs that are added to the PMU lists. Fix it by
allocating the tool PMU in the same way as the others. Current users of
the tool PMU already use find_pmu() and not perf_pmus__tool_pmu(), so
rename the function to add 'new' to avoid it being misused in the
future.

perf_pmus__fake_pmu() can remain as static as it's not added to the
PMU lists.

Fixes the following error:

  $ perf bench internals pmu-scan

  # Running 'internals/pmu-scan' benchmark:
  Computing performance of sysfs PMU event scan for 100 times
  munmap_chunk(): invalid pointer
  Aborted (core dumped)

Fixes: 240505b2d0 ("perf tool_pmu: Factor tool events into their own PMU")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250226104111.564443-2-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 16:23:47 -08:00
Athira Rajeev
556b58c191 perf probe: Pick the correct dwarf die while adding probe points
Perf probe on vfs_fstatat fails as below on a powerpc system

  $ ./perf probe -nf --max-probes=512 -a 'vfs_fstatat $params'
  Segmentation fault (core dumped)

This is observed while running perftool-testsuite_probe testcase.

While running with verbose, its observed that segfault happens
at:

   synthesize_probe_trace_arg ()
   synthesize_probe_trace_command ()
   probe_file.add_event ()
   apply_perf_probe_events ()
   __cmd_probe ()
   cmd_probe ()
   run_builtin ()
   handle_internal_command ()
   main ()

Code in synthesize_probe_trace_arg() access a null value and results in
segfault. Data structure which is null:
struct probe_trace_arg arg->value

We are hitting a case where arg->value is null in probe point:
"vfs_fstatat $params". This is happening since 'commit e896474fe4
("getname_maybe_null() - the third variant of pathname copy-in")'
Before the commit, probe point for vfs_fstatat was getting added only
for one location:

  Writing event: p:probe/vfs_fstatat _text+6345404 dfd=%gpr3:s32 filename=%gpr4:x64 stat=%gpr5:x64 flags=%gpr6:s32

With this change, vfs_fstatat code is inlined for other locations in the
code:
  Probe point found: __do_sys_lstat64+48
  Probe point found: __do_sys_stat64+48
  Probe point found: __do_sys_newlstat+48
  Probe point found: __do_sys_newstat+48
  Probe point found: vfs_fstatat+0

When trying to find matching dwarf information entry (DIE)
from the debuginfo, the code incorrectly picks DIE which is
not referring to vfs_fstatat. Snippet from dwarf entry in vmlinux
debuginfo file.

The main abstract die is:
 <1><4214883>: Abbrev Number: 147 (DW_TAG_subprogram)
    <4214885>   DW_AT_external    : 1
    <4214885>   DW_AT_name        : (indirect string, offset: 0x17b9f3): vfs_fstatat

With formal parameters:
 <2><4214896>: Abbrev Number: 51 (DW_TAG_formal_parameter)
    <4214897>   DW_AT_name        : dfd
 <2><42148a3>: Abbrev Number: 23 (DW_TAG_formal_parameter)
    <42148a4>   DW_AT_name        : (indirect string, offset: 0x8fda9): filename
 <2><42148b0>: Abbrev Number: 23 (DW_TAG_formal_parameter)
    <42148b1>   DW_AT_name        : (indirect string, offset: 0x16bd9c): stat
 <2><42148bd>: Abbrev Number: 23 (DW_TAG_formal_parameter)
    <42148be>   DW_AT_name        : (indirect string, offset: 0x39832b): flags

While collecting variables/parameters for a probe point, the function
copy_variables_cb() also looks at dwarf debug entries based on the
instruction address. Snippet

        if (dwarf_haspc(die_mem, vf->pf->addr))
                return DIE_FIND_CB_CONTINUE;
        else
                return DIE_FIND_CB_SIBLING;

But incase of inlined function instance for vfs_fstatat, there are two
entries which has the instruction address entry point as same.

Instance 1: which is for vfs_fstatat and DW_AT_abstract_origin points to
0x4214883 (reference above for main abstract die)

 <3><42131fa>: Abbrev Number: 59 (DW_TAG_inlined_subroutine)
    <42131fb>   DW_AT_abstract_origin: <0x4214883>
    <42131ff>   DW_AT_entry_pc    : 0xc00000000062b1e0

Instance 2: which is not for vfs_fstatat but for getname

 <5><4213270>: Abbrev Number: 39 (DW_TAG_inlined_subroutine)
    <4213271>   DW_AT_abstract_origin: <0x4215b6b>
    <4213275>   DW_AT_entry_pc    : 0xc00000000062b1e0

But the copy_variables_cb() continues to add parameters from second
instance also based on the dwarf_haspc() check. This results in
formal parameters for getname also appended to params. But while
filling in the args->value for these parameters, since these args
are not part of dwarf with offset "42131fa". Hence value will be
null. This incorrect args results in segfault when value field is
accessed.

Save the dwarf dieoffset of the actual DW_TAG_subprogram as part of
"struct probe_finder". In copy_variables_cb(), include check to make
sure the DW_AT_abstract_origin points to the correct entry if the
dwarf_haspc() matches the instruction address.

Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250225123042.37263-1-atrajeev@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 14:25:14 -08:00
Gabriele Monaco
833d025239 perf ftrace latency: allow to hide empty buckets
Especially while using several buckets, it isn't uncommon to have some
of them empty and reading the histogram may be a bit more complex:

  # perf ftrace latency -a -T mutex_lock --bucket-range 5 --max-latency 200
  #   DURATION     |  COUNT | GRAPH                                  |
       0 -    5 us |  14816 | ###################################### |
       5 -   10 us |   1228 | ###                                    |
      10 -   15 us |    438 | #                                      |
      15 -   20 us |    106 |                                        |
      20 -   25 us |     21 |                                        |
      25 -   30 us |     11 |                                        |
      30 -   35 us |      1 |                                        |
      35 -   40 us |      2 |                                        |
      40 -   45 us |      4 |                                        |
      45 -   50 us |      0 |                                        |
      50 -   55 us |      1 |                                        |
      55 -   60 us |      0 |                                        |
      60 -   65 us |      1 |                                        |
      65 -   70 us |      1 |                                        |
      70 -   75 us |      1 |                                        |
      75 -   80 us |      2 |                                        |
      80 -   85 us |      0 |                                        |
      85 -   90 us |      1 |                                        |
      90 -   95 us |      0 |                                        |
      95 -  100 us |      1 |                                        |
     100 -  105 us |      0 |                                        |
     105 -  110 us |      0 |                                        |
     110 -  115 us |      0 |                                        |
     115 -  120 us |      0 |                                        |
     120 -  125 us |      1 |                                        |
     125 -  130 us |      0 |                                        |
     130 -  135 us |      0 |                                        |
     135 -  140 us |      1 |                                        |
     140 -  145 us |      0 |                                        |
     145 -  150 us |      0 |                                        |
     150 -  155 us |      0 |                                        |
     155 -  160 us |      0 |                                        |
     160 -  165 us |      0 |                                        |
     165 -  170 us |      0 |                                        |
     170 -  175 us |      0 |                                        |
     175 -  180 us |      0 |                                        |
     180 -  185 us |      0 |                                        |
     185 -  190 us |      0 |                                        |
     190 -  195 us |      0 |                                        |
     195 -  200 us |      0 |                                        |
     200 -  ... us |      2 |                                        |

Allow the optional flag --hide-empty to remove buckets with no element
and produce a more compact graph. This feature could be misleading since
there is no clear indication for missing buckets, for this reason it's
disabled by default.

  # perf ftrace latency -a -T mutex_lock --bucket-range 5 --max-latency --hide-empty 200
  #   DURATION     |  COUNT | GRAPH                                  |
       0 -    5 us |  14816 | ###################################### |
       5 -   10 us |   1228 | ###                                    |
      10 -   15 us |    438 | #                                      |
      15 -   20 us |    106 |                                        |
      20 -   25 us |     21 |                                        |
      25 -   30 us |     11 |                                        |
      30 -   35 us |      1 |                                        |
      35 -   40 us |      2 |                                        |
      40 -   45 us |      4 |                                        |
      50 -   55 us |      1 |                                        |
      60 -   65 us |      1 |                                        |
      65 -   70 us |      1 |                                        |
      70 -   75 us |      1 |                                        |
      75 -   80 us |      2 |                                        |
      85 -   90 us |      1 |                                        |
      95 -  100 us |      1 |                                        |
     120 -  125 us |      1 |                                        |
     135 -  140 us |      1 |                                        |
     200 -  ... us |      2 |                                        |

Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250207080446.77630-2-gmonaco@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 13:48:02 -08:00
Gabriele Monaco
4a75e8c3b2 perf ftrace latency: variable histogram buckets
The max-latency value can make the histogram smaller, but not larger, we
have a maximum of 22 buckets and specifying a max-latency that would
require more buckets has no effect.

Dynamically allocate the buckets and compute the bucket number from the
max latency as (max-min) / range + 2

If the maximum is not specified, we still set the bucket number to 22
and compute the maximum accordingly.

Fail if the maximum is smaller than min+range, this way we make sure we
always have 3 buckets: those below min, those above max and one in the
middle.

Since max-latency is not available in log2 mode, always use 22 buckets.

Signed-off-by: Gabriele Monaco <gmonaco@redhat.com>
Link: https://lore.kernel.org/r/20250207080446.77630-1-gmonaco@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 13:48:02 -08:00
Namhyung Kim
f4dc5a3355 perf annotate-data: Handle direct use of stack pointer without fbreg
Sometimes compiler generates code to use the stack pointer register
without frame pointer.  As we know RSP is the stack register on x86,
let's treat it as same as fbreg.  But the offset would be opposite
direction so update the debug message accordingly.

Reported-by: Blake Jones <blakejones@google.com>
Link: https://lore.kernel.org/r/20250126210242.1181225-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-26 13:42:49 -08:00
Linus Torvalds
9f5270d758 perf tools fixes for v6.14: 2nd batch
- Fix tools/ quiet build Makefile infrastructure that was broken when
   working on tools/perf/ without testing on other tools/ living
   utilities.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZ74OJwAKCRCyPKLppCJ+
 J30mAPsHCA8A+CNq/5yW2VhFLV1GgCSL5oWqxXRn7QjhSrCQBQEAot2u4O5zXs7M
 sg+mPlYiS1oT+zmvTLlXrN+bVyWP9A4=
 =jH1N
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix tools/ quiet build Makefile infrastructure that was broken when
   working on tools/perf/ without testing on other tools/ living
   utilities.

* tag 'perf-tools-fixes-for-v6.14-2-2025-02-25' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  tools: Remove redundant quiet setup
  tools: Unify top-level quiet infrastructure
2025-02-25 13:32:32 -08:00
Thomas Falcon
c40aa8d98d perf report: Fix sample number stats for branch entry mode
Currently, stats->nr_samples is incremented per entry in the branch stack
instead of per sample taken. As a result, statistics of samples taken
during perf record in --branch-filter or --branch-any mode does not
seem correct. Instead call hists__inc_nr_samples() for each sample taken
instead of for each entry in the branch stack.

Before:

$ ./perf record -e cycles:u -b -c 10000000000 ./tchain_edit
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.005 MB perf.data (2 samples) ]
$ perf report -D | tail -n 16
Aggregated stats:
               TOTAL events:         16
                COMM events:          2  (12.5%)
                EXIT events:          1  ( 6.2%)
              SAMPLE events:          2  (12.5%)
               MMAP2 events:          2  (12.5%)
             KSYMBOL events:          1  ( 6.2%)
      FINISHED_ROUND events:          1  ( 6.2%)
            ID_INDEX events:          1  ( 6.2%)
          THREAD_MAP events:          1  ( 6.2%)
             CPU_MAP events:          1  ( 6.2%)
        EVENT_UPDATE events:          2  (12.5%)
           TIME_CONV events:          1  ( 6.2%)
       FINISHED_INIT events:          1  ( 6.2%)
cpu_core/cycles/u stats:
              SAMPLE events:         64

After:

$ ./perf report -D | tail -n 16
Aggregated stats:
               TOTAL events:         16
                COMM events:          2  (12.5%)
                EXIT events:          1  ( 6.2%)
              SAMPLE events:          2  (12.5%)
               MMAP2 events:          2  (12.5%)
             KSYMBOL events:          1  ( 6.2%)
      FINISHED_ROUND events:          1  ( 6.2%)
            ID_INDEX events:          1  ( 6.2%)
          THREAD_MAP events:          1  ( 6.2%)
             CPU_MAP events:          1  ( 6.2%)
        EVENT_UPDATE events:          2  (12.5%)
           TIME_CONV events:          1  ( 6.2%)
       FINISHED_INIT events:          1  ( 6.2%)
cpu_core/cycles/u stats:
              SAMPLE events:          2

Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250220045942.114965-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 16:02:28 -08:00
Ian Rogers
e7af194681 perf machine: Reuse module path buffer
Rather than copying the path and appending the directory entry in a
fresh path buffer, append to the path at the end of where it is for
the recursion level. This saves a PATH_MAX buffer per recursion level
and some unnecessary copying.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
d996c726a5 perf hwmon_pmu: Switch event discovery to io_dir__readdir
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
bb327140f5 perf parse-events: Switch tracepoints to io_dir__readdir
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
56406bd557 perf events: Remove scandir in thread synthesis
This avoids scanddir reading the directory into memory that's
allocated and instead allocates on the stack.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Song Liu <song@kernel.org>
Link: https://lore.kernel.org/r/20250222061015.303622-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
d6cd7c9f02 perf header: Switch mem topology to io_dir__readdir
Switch memory_node__read and build_mem_topology from opendir/readdir
to io_dir__readdir, with smaller stack allocations. Reduces peak
memory consumption of perf record by 10kb.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
6a81a3fd9e perf pmu: Switch to io_dir__readdir
Avoid DIR allocations when scanning sysfs by using io_dir for the
readdir implementation, that allocates about 1kb on the stack.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
f7cada5f7e perf maps: Switch modules tree walk to io_dir__readdir
Compared to glibc's opendir/readdir this lowers the max RSS of perf
record by 1.8MB on a Debian machine.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250222061015.303622-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-24 15:46:33 -08:00
Ian Rogers
7e05269ba8 perf parse-events: Tidy name token matching
Prior to commit 70c90e4a6b ("perf parse-events: Avoid scanning PMUs
before parsing") names (generally event names) excluded hyphen (minus)
symbols as the formation of legacy names with hyphens was handled in
the yacc code. That commit allowed hyphens supposedly making
name_minus unnecessary. However, changing name_minus to name has
issues in the term config tokens as then name ends up having priority
over numbers and name allows matching numbers since commit
5ceb57990b ("perf parse: Allow tracepoint names to start with digits
"). It is also permissable for a name to match with a colon (':') in
it when its in a config term list. To address this rename name_minus
to term_name, make the pattern match name's except for the colon, add
number matching into the config term region with a higher priority
than name matching. This addresses an inconsistency and allows greater
matching for names inside of term lists, for example, they may start
with a number.

Rename name_tag to quoted_name and update comments and helper
functions to avoid str detecting quoted strings which was already done
by the lexer.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250109175401.161340-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-20 22:35:10 -08:00
Krzysztof Łopatowski
4bac7fb586 perf tools: Improve startup time by reducing unnecessary stat() calls
When testing perf trace on NixOS, I noticed significant startup delays:
- `ls`: ~2ms
- `strace ls`: ~10ms
- `perf trace ls`: ~550ms

Profiling showed that 51% of the time is spent reading files,
26% in loading BPF programs, and 11% in `newfstatat`.

This patch optimizes module path exploration by avoiding `stat()` calls
unless necessary. For filesystems that do not implement `d_type`
(DT_UNKNOWN), it falls back to the old behavior.
See `readdir(3)` for details.

This reduces `perf trace ls` time to ~500ms.

A more thorough startup optimization based on command parameters would
be ideal, but that is a larger effort.

Signed-off-by: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Acked-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250206113314.335376-2-krzysztof.m.lopatowski@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:55:59 -08:00
Dmitry Vyukov
6353255e7c perf report: Fix input reload/switch with symbol sort key
Currently the code checks that there is no "ipc" in the sort order
and add an ipc string. This will always error out on the second pass
after input reload/switch, since the sort order already contains "ipc".
Do the ipc check/fixup only on the first pass.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Link: https://lore.kernel.org/r/20250108063628.215577-1-dvyukov@google.com
Fixes: ec6ae74fe8 ("perf report: Display average IPC and IPC coverage per symbol")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:27:59 -08:00
Namhyung Kim
acda4c2001 perf report: Support switching data w/ and w/o callchains
The symbol_conf.use_callchain should be reset when switching to new data
file, otherwise report__setup_sample_type() will show an error message
that it enabled callchains but no callchain data.  The function also
will turn on the callchains if the data has PERF_SAMPLE_CALLCHAIN so I
think it's ok to reset symbol_conf.use_callchain here.

Link: https://lore.kernel.org/r/20250211060745.294289-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:23:58 -08:00
Namhyung Kim
43c2b6139b perf report: Switch data file correctly in TUI
The 's' key is to switch to a new data file and load the data in the
same window.  The switch_data_file() will show a popup menu to select
which data file user wants and update the 'input_name' global variable.

But in the cmd_report(), it didn't update the data.path using the new
'input_name' and keep usng the old file.  This is fairly an old bug and
I assume people don't use this feature much. :)

Link: https://lore.kernel.org/r/20250211060745.294289-1-namhyung@kernel.org
Closes: https://lore.kernel.org/linux-perf-users/89e678bc-f0af-4929-a8a6-a2666f1294a4@linaro.org
Fixes: f5fc14124c ("perf tools: Add data object to handle perf data file")
Reported-by: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:23:58 -08:00
Greg Kroah-Hartman
0cced76a02 perf tools: Fix up some comments and code to properly use the event_source bus
In sysfs, the perf events are all located in
/sys/bus/event_source/devices/ but some places ended up hard-coding the
location to be at the root of /sys/devices/ which could be very risky as
you do not exactly know what type of device you are accessing in sysfs
at that location.

So fix this all up by properly pointing everything at the bus device
list instead of the root of the sysfs devices/ tree.

Cc: stable <stable@kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/2025021955-implant-excavator-179d@gregkh
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:23:43 -08:00
James Clark
687b8c3938 perf list: Also append PMU name in verbose mode
When listing in verbose mode, the long description is used but the PMU
name isn't appended. There doesn't seem to be a reason to exclude it
when asking for more information, so use the same print block for both
long and short descriptions.

Before:
  $ perf list -v
  ...
  inst_retired
       [Instruction architecturally executed]

After:
  $ perf list -v
  ...
   inst_retired
       [Instruction architecturally executed. Unit: armv8_cortex_a57]

Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250219151622.1097289-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:23:43 -08:00
Yangyu Chen
2ed0e3ea8a perf vendor events arm64: Fix incorrect CPU_CYCLE in metrics expr
Some existing metrics for Neoverse N3 and V3 expressions use CPU_CYCLE
to represent the number of cycles, but this is incorrect. The correct
event to use is CPU_CYCLES.

I encountered this issue while working on a patch to add pmu events for
Cortex A720 and A520 by reusing the existing patch for Neoverse N3 and
V3 by James Clark [1] and my check script [2] reported this issue.

[1] https://lore.kernel.org/lkml/20250122163504.2061472-1-james.clark@linaro.org/
[2] https://github.com/cyyself/arm-pmu-check

Signed-off-by: Yangyu Chen <cyy@cyyself.name>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/tencent_D4ED18476ADCE818E31084C60E3E72C14907@qq.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-19 13:23:43 -08:00
Namhyung Kim
29bab85418 perf script: Fix hangup in offline flamegraph report
A recent change in the flamegraph script fixed an issue with live mode
but it created another for offline mode.  It needs to pass "-" to -i
option to read from stdin in the live mode.  Actually there's a logic
to pass the option in the perf script code, but the script was written
with "-- $@" which prevented the option to go to the perf script.  So
the previous commit added the hard-coded "-i -" to the report command.

But it's a problem for the offline mode which expects input from a file
and now it's stuck on reading from stdin.  Let's remove the "-i - --"
part and let it pass the options properly to perf script.

Closes: https://lore.kernel.org/linux-perf-users/c41e4b04-e1fd-45ab-80b0-ec2ac6e94310@linux.ibm.com
Fixes: 23e0a63c6d ("perf script: force stdin for flamegraph in live mode")
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 16:12:19 -08:00
Dmitry Vyukov
5e838165d0 perf hist: Shrink struct hist_entry size
Reorder the struct fields by size to reduce paddings and reduce
struct simd_flags size from 8 to 1 byte.

This reduces struct hist_entry size by 8 bytes (592->584),
and leaves a single more usable 6 byte padding hole.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/7c1cb1c8f9901e945162701ba7269d0f9c70be89.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
257facfaf5 perf test: Add tests for latency and parallelism profiling
Ensure basic operation of latency/parallelism profiling and that
main latency/parallelism record/report invocations don't fail/crash.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/c129c8f02f328f68e1e9ef2cdc582f8a9786a97d.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
32ecca8d7a perf report: Add latency and parallelism profiling documentation
Describe latency and parallelism profiling, related flags, and differences
with the currently only supported CPU-consumption-centric profiling.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/a13f270ed33cedb03ce9ebf9ddbd064854ca0f19.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
2570c02c3a perf report: Add --latency flag
Add record/report --latency flag that allows to capture and show
latency-centric profiles rather than the default CPU-consumption-centric
profiles. For latency profiles record captures context switch events,
and report shows Latency as the first column.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/e9640464bcbc47dde2cb557003f421052ebc9eec.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
ee1cffbe24 perf report: Add latency output field
Latency output field is similar to overhead, but represents overhead for
latency rather than CPU consumption. It's re-scaled from overhead by dividing
weight by the current parallelism level at the time of the sample.
It effectively models profiling with 1 sample taken per unit of wall-clock
time rather than unit of CPU time.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/b6269518758c2166e6ffdc2f0e24cfdecc8ef9c1.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
61b6b31c2f perf report: Add parallelism filter
Add parallelism filter that can be used to look at specific parallelism
levels only. The format is the same as cpu lists. For example:

Only single-threaded samples: --parallelism=1
Low parallelism only: --parallelism=1-4
High parallelism only: --parallelism=64-128

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/e61348985ff0a6a14b07c39e880edbd60a8f8635.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:32 -08:00
Dmitry Vyukov
216f8a970c perf report: Switch filtered from u8 to u16
We already have all u8 bits taken, adding one more filter leads to unpleasant
failure mode, where code compiles w/o warnings, but the last filters silently
don't work. Add a typedef and switch to u16.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/32b4ce1731126c88a2d9e191dc87e39ae4651cb7.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-18 14:04:31 -08:00
Charlie Jenkins
293f324ce9 tools: Unify top-level quiet infrastructure
Commit f2868b1a66 ("perf tools: Expose quiet/verbose variables in
Makefile.perf") moved the quiet infrastructure out of
tools/build/Makefile.build and into the top-level Makefile.perf file so
that the quiet infrastructure could be used throughout perf and not just
in Makefile.build.

Extract out the quiet infrastructure into Makefile.include so that it
can be leveraged outside of perf.

Fixes: f2868b1a66 ("perf tools: Expose quiet/verbose variables in Makefile.perf")
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Benjamin Tissoires <bentiss@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Daniel Lezcano <daniel.lezcano@linaro.org>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Kosina <jikos@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Lukasz Luba <lukasz.luba@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Mykola Lysenko <mykolal@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <qmo@kernel.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Yonghong Song <yonghong.song@linux.dev>
Cc: Zhang Rui <rui.zhang@intel.com>
Link: https://lore.kernel.org/r/20250213-quiet_tools-v3-1-07de4482a581@rivosinc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-02-18 15:31:45 -03:00
Dmitry Vyukov
7ae1972e74 perf report: Add parallelism sort key
Show parallelism level in profiles if requested by user.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/7f7bb87cbaa51bf1fb008a0d68b687423ce4bad4.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-17 22:00:50 -08:00
Dmitry Vyukov
f13bc61b2e perf report: Add machine parallelism
Add calculation of the current parallelism level (number of threads actively
running on CPUs). The parallelism level can be shown in reports on its own,
and to calculate latency overheads.

Signed-off-by: Dmitry Vyukov <dvyukov@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Link: https://lore.kernel.org/r/0f8c1b8eb12619029e31b3d5c0346f4616a5aeda.1739437531.git.dvyukov@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-17 22:00:50 -08:00
Namhyung Kim
20600b8aab perf tools: Fix compile error on sample->user_regs
It's recently changed to allocate dynamically but misses to update some
arch-dependent codes to use perf_sample__user_regs().

Fixes: dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250214191641.756664-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-14 12:33:41 -08:00
Leo Yan
d18c882f85 perf tools: Fix compilation error on arm64
Since the commit dc6d2bc2d8 ("perf sample: Make user_regs and
intr_regs optional"), the building for Arm64 reports error:

arch/arm64/util/unwind-libdw.c: In function ‘libdw__arch_set_initial_registers’:
arch/arm64/util/unwind-libdw.c:11:32: error: initialization of ‘struct regs_dump *’ from incompatible pointer type ‘struct regs_dump **’ [-Werror=incompatible-pointer-types]
   11 |  struct regs_dump *user_regs = &ui->sample->user_regs;
      |                                ^
cc1: all warnings being treated as errors
make[6]: *** [/home/niayan01/linux/tools/build/Makefile.build:85: arch/arm64/util/unwind-libdw.o] Error 1
make[5]: *** [/home/niayan01/linux/tools/build/Makefile.build:138: util] Error 2
arch/arm64/tests/dwarf-unwind.c: In function ‘test__arch_unwind_sample’:
arch/arm64/tests/dwarf-unwind.c:48:27: error: initialization of ‘struct regs_dump *’ from incompatible pointer type ‘struct regs_dump **’ [-Werror=incompatible-pointer-types]
   48 |  struct regs_dump *regs = &sample->user_regs;
      |                           ^

To fix the issue, use the helper perf_sample__user_regs() to retrieve
the user_regs.

Fixes: dc6d2bc2d8 ("perf sample: Make user_regs and intr_regs optional")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250214111025.14478-1-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-14 11:12:12 -08:00
Ian Rogers
dc6d2bc2d8 perf sample: Make user_regs and intr_regs optional
The struct dump_regs contains 512 bytes of cache_regs, meaning the two
values in perf_sample contribute 1088 bytes of its total 1384 bytes
size. Initializing this much memory has a cost reported by Tavian
Barnes <tavianator@tavianator.com> as about 2.5% when running `perf
script --itrace=i0`:
https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/

Adrian Hunter <adrian.hunter@intel.com> replied that the zero
initialization was necessary and couldn't simply be removed.

This patch aims to strike a middle ground of still zeroing the
perf_sample, but removing 79% of its size by make user_regs and
intr_regs optional pointers to zalloc-ed memory. To support the
allocation accessors are created for user_regs and intr_regs. To
support correct cleanup perf_sample__init and perf_sample__exit
functions are created and added throughout the code base.

Signed-off-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 20:06:11 -08:00
Ian Rogers
08d9e88348 perf test stat_all_metrics: Ensure missing events fail test
Issue reported by Thomas Falcon and diagnosed by Kan Liang here:
https://lore.kernel.org/lkml/d44036481022c27d83ce0faf8c7f77042baedb34.camel@intel.com/
Metrics with missing events can be erroneously skipped if they contain
FP, AMX or PMM events.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-25-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:40 -08:00
Ian Rogers
8a6dcb26af perf vendor events: Update Tigerlake events/metrics
Update events from v1.16 to v1.17.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.17:
e1d5ac3412

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-24-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
f2f3a4afdd perf vendor events: Update SkylakeX events/metrics
Update events from v1.35 to v1.36.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.36:
f6801e5c14

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-23-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
228c556a63 perf vendor events: Update Skylake metrics
Update TMA metrics from 4.8 to 5.02.

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-22-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
86f5536004 perf vendor events: Update Sierraforest events/metrics
Update events from v1.04 to v1.07.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.08:
7ae9c45ccf
903b3d0a0a
825c436147
bafe6a7b5c

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Update uncore IIO events umask with the change:
d78e8a1665
which should address an issue originally raised by Michael Petlan:
Reported-by: Michael Petlan <mpetlan@redhat.com>
Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-21-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
830ee133a5 perf vendor events: Update Sapphirerapids events/metrics
Update events from v1.23 to v1.25.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.25:
78d6273c54
f069ed9d0b

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Update uncore IIO events umask with the change:
d78e8a1665
which should address an issue originally raised by Michael Petlan:
Reported-by: Michael Petlan <mpetlan@redhat.com>
Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-20-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
870b92024e perf vendor events: Update Rocketlake events/metrics
Update events from v1.03 to v1.04.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.04:
015d5a5eab

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-19-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
b4152015a9 perf vendor events: Update Meteorlake events/metrics
Update events from v1.10 to v1.12.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.12:
d8fe70c91b
b9dabd05ff
This updates the mapfile.csv for the 0xB5 CPUID variant of meteorlake.
c3094bc9bb

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-18-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
23878069de perf vendor events: Update/add Lunarlake events/metrics
Update events from v1.01 to v1.10.
Add TMA metrics 5.02.

Bring in the event updates v1.11:
af329039e8
4a1cff8ceb
cbc3b0dc19
28f4b24f91
172900e962
dab0308f7a

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-17-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
c49b050915 perf vendor events: Update IcelakeX events/metrics
Update events from v1.26 to v1.27.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.27:
6ee80d0532

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-16-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
094b233575 perf vendor events: Update Icelake events/metrics
Update events from v1.22 to v1.24.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.24:
d4f10746cf

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
be67d89f79 perf vendor events: Update HaswellX events/metrics
Update events from v28 to v29.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v29:
71dbf03aba

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
55bf5d0792 perf vendor events: Update Haswell events/metrics
Update events from v35 to v36.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v36:
616ec6fc03

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Remove duplicate event UNC_CLOCK.SOCKET that was erroneously left in
uncore-other.json.

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:39 -08:00
Ian Rogers
aaa73d778b perf vendor events: Update/add Graniterapids events/metrics
Update events from v1.02 to v1.06.
Add TMA metrics 5.02.

Bring in the event updates v1.06:
de5502e51a
79b9e512ea
bc74a895e4

The TMA 5.02 addition is from (with subsequent fixes):
1d72913b2d

Update uncore IIO events umask with the change:
d78e8a1665
which should address an issue originally raised by Michael Petlan:
Reported-by: Michael Petlan <mpetlan@redhat.com>
Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
b52c4123a5 perf vendor events: Update GrandRidge events/metrics
Update events from v1.03 to v1.05.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.05:
3b2e3528fb
9bc1815536

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Update uncore IIO events umask with the change:
d78e8a1665
which should address an issue originally raised by Michael Petlan:
Reported-by: Michael Petlan <mpetlan@redhat.com>
Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
5ee60fbf73 perf vendor events: Update EmeraldRapids events/metrics
Update events from v1.09 to v1.11.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.11:
bffcec00a1
a63da6de48

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Update uncore IIO events umask with the change:
d78e8a1665
which should address an issue originally raised by Michael Petlan:
Reported-by: Michael Petlan <mpetlan@redhat.com>
Closes: https://lore.kernel.org/all/alpine.LRH.2.20.2401300733310.11354@Diego/

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
e415c1493f perf vendor events: Add Clearwaterforest events
Add events v1.00.

Bring in the events from:
https://github.com/intel/perfmon/tree/main/CWF/events

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
7487e4fce9 perf vendor events: Update CascadelakeX events/metrics
Update events from v1.22 to v1.23.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.23:
8f3665f6be

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
a75d905d64 perf vendor events: Update BroadwellX events/metrics
Update events from v22 to v23.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v23:
679982113f

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-7-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
11e644eb46 perf vendor events: Update BroadwellDE events/metrics
Update events from v11 to v12.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v12:
e0b83388d5

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
240411b048 perf vendor events: Update Broadwell events/metrics
Update events from v29 to v30.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v30:
9a1827b2ac

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
ba56a91063 perf vendor events: Add Arrowlake events/metrics
Add events v1.07.
Add TMA metrics based on v5.02.

Bring in the events from:
https://github.com/intel/perfmon/tree/main/ARL/events

TMA 5.02 is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
b04fe42f6e perf vendor events: Update AlderlakeN events/metrics
Update events from v1.27 to v1.28.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.28:
801f43f22e

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-developed-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Ian Rogers
54169b4663 perf vendor events: Update Alderlake events/metrics
Update events from v1.27 to v1.28.
Update TMA metrics from 4.8 to 5.02.

Bring in the event updates v1.28:
801f43f22e

The TMA 5.02 update is from (with subsequent fixes):
1d72913b2d

Co-authored-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Link: https://lore.kernel.org/r/20250211213031.114209-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:54:38 -08:00
Namhyung Kim
70f127c716 perf tools: Use symfs when opening debuginfo by path
I found that it failed to load a binary using --symfs option.  Say I
have a binary in /home/user/prog/xxx and a perf data file with it.  If I
move them to a different machine and use --symfs, it tries to find the
binary in some locations under symfs using dso__read_binary_type_filename(),
but not the last one.

  ${symfs}/usr/lib/debug/home/user/prog/xxx.debug
  ${symfs}/usr/lib/debug/home/user/prog/xxx
  ${symfs}/home/user/prog/.debug/xxx
  /home/user/prog/xxx

It should check ${symfs}/home/usr/prog/xxx.  Let's fix it.

Reviewed-by: Ian Rogers <irogers@google.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20250212221445.437481-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:44:16 -08:00
Namhyung Kim
fc00897c8a perf trace: Add --summary-mode option
The --summary-mode option will select how to show the syscall summary at
the end.  By default, it'll show the summary for each thread and it's
the same as if --summary-mode=thread is passed.

The other option is to show total summary, which is --summary-mode=total.
I'd like to have this instead of a separate option like --total-summary
because we may want to add a new summary mode (by cgroup) later.

  $ sudo ./perf trace -as --summary-mode=total sleep 1

   Summary of events:

   total, 21580 events

     syscall            calls  errors  total       min       avg       max       stddev
                                       (msec)    (msec)    (msec)    (msec)        (%)
     --------------- --------  ------ -------- --------- --------- ---------     ------
     epoll_wait          1305      0 14716.712     0.000    11.277   551.529      8.87%
     futex               1256     89 13331.197     0.000    10.614   733.722     15.49%
     poll                 669      0  6806.618     0.000    10.174   459.316     11.77%
     ppoll                220      0  3968.797     0.000    18.040   516.775     25.35%
     clock_nanosleep        1      0  1000.027  1000.027  1000.027  1000.027      0.00%
     epoll_pwait           21      0   592.783     0.000    28.228   522.293     88.29%
     nanosleep             16      0    60.515     0.000     3.782    10.123     33.33%
     ioctl                510      0     4.284     0.001     0.008     0.182      8.84%
     recvmsg             1434    775     3.497     0.001     0.002     0.174      6.37%
     write               1393      0     2.854     0.001     0.002     0.017      1.79%
     read                1063    100     2.236     0.000     0.002     0.083      5.11%
     ...

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250205205443.1986408-5-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:44:16 -08:00
Namhyung Kim
bd50a26c9a perf tools: Get rid of now-unused rb_resort.h
It was only used in perf trace and it switched to use hashmap instead.
Let's delete the code.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250205205443.1986408-4-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:44:15 -08:00
Namhyung Kim
ef2da619b1 perf trace: Convert syscall_stats to hashmap
It was using a RBtree-based int-list as a hash and a custom resort
logic for that.  As we have hashmap, let's convert to it and add a
custom sort function for the hashmap entries using an array.  It
should be faster and more light-weighted.  It's also to prepare
supporting system-wide syscall stats.

No functional changes intended.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250205205443.1986408-3-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:44:15 -08:00
Namhyung Kim
c7f821b876 perf trace: Allocate syscall stats only if summary is on
The syscall stats are used only when summary is requested.  Let's avoid
unnecessary operations.  While at it, let's pass 'trace' pointer
directly instead of passing 'output' file pointer and 'summary' option
in the 'trace' separately.

Reviewed-by: Howard Chu <howardchu95@gmail.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20250205205443.1986408-2-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:44:10 -08:00
James Clark
615ec00b06 perf tests: Fix Tool PMU test segfault
tool_pmu__event_to_str() now handles skipped events by returning NULL,
so it's wrong to re-check for a skip on the resulting string. Calling
tool_pmu__skip_event() with a NULL string results in a segfault so
remove the unnecessary skip to fix it:

  $ perf test -vv "parsing with PMU name"

  12.2: Parsing with PMU name:
  ...
  ---- unexpected signal (11) ----
  12.2: Parsing with PMU name         : FAILED!

Fixes: ee8aef2d23 ("perf tools: Add skip check in tool_pmu__event_to_str()")
Signed-off-by: James Clark <james.clark@linaro.org>
Reported-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250212163859.1489916-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12 19:34:56 -08:00
Kan Liang
ee8aef2d23 perf tools: Add skip check in tool_pmu__event_to_str()
Some topdown related metrics may fail on hybrid machines.

 $ perf stat -M tma_frontend_bound
 Cannot resolve IDs for tma_frontend_bound:
 cpu_atom@TOPDOWN_FE_BOUND.ALL@ / (8 * cpu_atom@CPU_CLK_UNHALTED.CORE@)

In the find_tool_events(), the tool_pmu__event_to_str() is used to
compare the tool_events. It only checks the event name, no PMU or arch.
So the tool_events[TOOL_PMU__EVENT_SLOTS] is set to true, because the
p-core Topdown metrics has "slots" event.
The tool_events is shared. So when parsing the e-core metrics, the
"slots" is automatically added.

The "slots" event as a tool event should only be available on arm64. It
has a different meaning on X86. The tool_pmu__skip_event() intends
handle the case. Apply it for tool_pmu__event_to_str() as well.

There is a lack of sanity check in the expr__get_id(). Add the check.

Closes: https://lore.kernel.org/lkml/608077bc-4139-4a97-8dc4-7997177d95c4@linux.intel.com/
Fixes: 069057239a ("perf tool_pmu: Move expr literals to tool_pmu")
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: thomas.falcon@intel.com
Link: https://lore.kernel.org/r/20250207152844.302167-1-kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-10 11:46:30 -08:00
Dr. David Alan Gilbert
1df4b33f62 perf tools: Deadcode removal
The last use of machine__fprintf_vmlinux_path() was removed in 2011 by
commit ab81f3fd35 ("perf top: Reuse the 'report' hist_entry/hists
classes")

mmap_cpu_mask__duplicate() was added in 2021 by
commit 6bd006c6eb ("perf mmap: Introduce mmap_cpu_mask__duplicate()")
but hasn't been used since.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Tested-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250204220545.456435-1-linux@treblig.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-10 11:46:02 -08:00
Namhyung Kim
9e676a024f Linux 6.14-rc1
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmegAi4eHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiG+cMH/jFx5lmvzVObuStc
 OdqfdMJVF238cX3iovDF6hLMDCuSgYY9CX5FYmd7pGtxGuUEecSLxin+WbJcxfin
 WBHzgPP+hmcjqpU0yCd3azITi8BHJeFCgT86OM/1Rsv82M4T/xWxBIET79izQJ0E
 5L9KzlmPMLTLbLPVa+wookXfoJOycWRDCN6p/jxTLzeM/szqDlokAsSf19iodkl/
 59Gnk5oEYneqyt4FdTgxWcq1fteTlzZJgC6heN5XIjZuSN1ME11N4QO0xu+ld3UA
 nzbpnNwCRIl50yO5+pvYpkoRrHDwxjJ7an9sliWAHxDt/etVngTaSsl8uGht/9QK
 +4Vi48I=
 =TI43
 -----END PGP SIGNATURE-----

Merge tag 'v6.14-rc1' into perf-tools-next

To get the various fixes in the current master.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-05 14:57:18 -08:00
Ian Rogers
357b965deb perf stat: Changes to event name uniquification
The existing logic would disable uniquification on an evlist or enable
it per evsel, this is unfortunate as uniquification is most needed
when events have the same name and so the whole evlist must be
considered.  Change the initial disable uniquify on an evlist
processing to also set a needs_uniquify flag, for cases like the
matching event names. This must be done as an initial pass as
uniquification of an event name will change the behavior of the
check. Keep the per counter uniquification but now only uniquify event
names when the needs_uniquify flag is set.

Before this change a hwmon like temp1 wouldn't be uniquified and
afterwards it will (ie the PMU is added to the temp1 event's name).

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 21:29:13 -08:00
Ian Rogers
2d9961c690 perf stat: Don't merge counters purely on name
Counter merging was added in commit 942c559339 ("perf stat: Add
perf_stat_merge_counters()"), however, it merges events with the same
name on different PMUs regardless of whether the different PMUs are
actually of the same type (ie they differ only in the suffix on the
PMU). For hwmon events there may be a temp1 event on every PMU, but
the PMU names are all unique and don't differ just by a suffix. The
merging is over eager and will merge all the hwmon counters together
meaning an aggregated and very large temp1 value is shown. The same
would be true for say cache events and memory controller events where
the PMUs differ but the event names are the same.

Fix the problem by correctly saying two PMUs alias when they differ
only by suffix.

Note, there is an overlap with evsel's merged_stat with aggregation
and the evsel's metric_leader where aggregation happens for metrics.

Fixes: 942c559339 ("perf stat: Add perf_stat_merge_counters()")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 21:29:05 -08:00
Ian Rogers
63e287131c perf pmu: Rename name matching for no suffix or wildcard variants
Wildcard PMU naming will match a name like pmu_1 to a PMU name like
pmu_10 but not to a PMU name like pmu_2 as the suffix forms part of
the match. No suffix matching will match pmu_10 to either pmu_1 or
pmu_2. Add or rename matching functions on PMU to make it clearer what
kind of matching is being performed.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-4-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 21:28:46 -08:00
Ian Rogers
57e13264dc perf pmus: Restructure pmu_read_sysfs to scan fewer PMUs
Rather than scanning core or all PMUs, allow pmu_read_sysfs to read
some combination of core, other, hwmon and tool PMUs. The PMUs that
should be read and are already read are held as bitmaps. It is known
that a "hwmon_" prefix is necessary for a hwmon PMU's name, similarly
with "tool", so only scan those PMUs in situations the PMU name or the
PMU's type number make sense to.

The number of openat system calls reduces from 276 to 98 for a hwmon
event. The number of openats for regular perf events isn't changed.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 21:28:37 -08:00
Ian Rogers
340c345e58 perf evsel: Reduce scanning core PMUs in is_hybrid
evsel__is_hybrid returns true if there are multiple core PMUs and the
evsel is for a core PMU. Determining the number of core PMUs can
require loading/scanning PMUs. There's no point doing the scanning if
evsel for the is_hybrid test isn't core so reorder the tests to reduce
PMU scanning.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250201074320.746259-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 21:28:25 -08:00
Thomas Richter
888751e4d0 perf test: Fix Hwmon PMU test endianess issue
perf test 11 hwmon fails on s390 with this error

 # ./perf test -Fv 11
 --- start ---
 ---- end ----
 11.1: Basic parsing test             : Ok
 --- start ---
 Testing 'temp_test_hwmon_event1'
 Using CPUID IBM,3931,704,A01,3.7,002f
 temp_test_hwmon_event1 -> hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/
 FAILED tests/hwmon_pmu.c:189 Unexpected config for
    'temp_test_hwmon_event1', 292470092988416 != 655361
 ---- end ----
 11.2: Parsing without PMU name       : FAILED!
 --- start ---
 Testing 'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/'
 FAILED tests/hwmon_pmu.c:189 Unexpected config for
    'hwmon_a_test_hwmon_pmu/temp_test_hwmon_event1/',
    292470092988416 != 655361
 ---- end ----
 11.3: Parsing with PMU name          : FAILED!
 #

The root cause is in member test_event::config which is initialized
to 0xA0001 or 655361. During event parsing a long list event parsing
functions are called and end up with this gdb call stack:

 #0  hwmon_pmu__config_term (hwm=0x168dfd0, attr=0x3ffffff5ee8,
	term=0x168db60, err=0x3ffffff81c8) at util/hwmon_pmu.c:623
 #1  hwmon_pmu__config_terms (pmu=0x168dfd0, attr=0x3ffffff5ee8,
	terms=0x3ffffff5ea8, err=0x3ffffff81c8) at util/hwmon_pmu.c:662
 #2  0x00000000012f870c in perf_pmu__config_terms (pmu=0x168dfd0,
	attr=0x3ffffff5ee8, terms=0x3ffffff5ea8, zero=false,
	apply_hardcoded=false, err=0x3ffffff81c8) at util/pmu.c:1519
 #3  0x00000000012f88a4 in perf_pmu__config (pmu=0x168dfd0, attr=0x3ffffff5ee8,
	head_terms=0x3ffffff5ea8, apply_hardcoded=false, err=0x3ffffff81c8)
	at util/pmu.c:1545
 #4  0x00000000012680c4 in parse_events_add_pmu (parse_state=0x3ffffff7fb8,
	list=0x168dc00, pmu=0x168dfd0, const_parsed_terms=0x3ffffff6090,
	auto_merge_stats=true, alternate_hw_config=10)
	at util/parse-events.c:1508
 #5  0x00000000012684c6 in parse_events_multi_pmu_add (parse_state=0x3ffffff7fb8,
	event_name=0x168ec10 "temp_test_hwmon_event1", hw_config=10,
	const_parsed_terms=0x0, listp=0x3ffffff6230, loc_=0x3ffffff70e0)
	at util/parse-events.c:1592
 #6  0x00000000012f0e4e in parse_events_parse (_parse_state=0x3ffffff7fb8,
	scanner=0x16878c0) at util/parse-events.y:293
 #7  0x00000000012695a0 in parse_events__scanner (str=0x3ffffff81d8
	"temp_test_hwmon_event1", input=0x0, parse_state=0x3ffffff7fb8)
	at util/parse-events.c:1867
 #8  0x000000000126a1e8 in __parse_events (evlist=0x168b580,
	str=0x3ffffff81d8 "temp_test_hwmon_event1", pmu_filter=0x0,
	err=0x3ffffff81c8, fake_pmu=false, warn_if_reordered=true,
	fake_tp=false) at util/parse-events.c:2136
 #9  0x00000000011e36aa in parse_events (evlist=0x168b580,
	str=0x3ffffff81d8 "temp_test_hwmon_event1", err=0x3ffffff81c8)
	at /root/linux/tools/perf/util/parse-events.h:41
 #10 0x00000000011e3e64 in do_test (i=0, with_pmu=false, with_alias=false)
	at tests/hwmon_pmu.c:164
 #11 0x00000000011e422c in test__hwmon_pmu (with_pmu=false)
	at tests/hwmon_pmu.c:219
 #12 0x00000000011e431c in test__hwmon_pmu_without_pmu (test=0x1610368
	<suite.hwmon_pmu>, subtest=1) at tests/hwmon_pmu.c:23

where the attr::config is set to value 292470092988416 or 0x10a0000000000
in line 625 of file ./util/hwmon_pmu.c:

   attr->config = key.type_and_num;

However member key::type_and_num is defined as union and bit field:

   union hwmon_pmu_event_key {
        long type_and_num;
        struct {
                int num :16;
                enum hwmon_type type :8;
        };
   };

s390 is big endian and Intel is little endian architecture.
The events for the hwmon dummy pmu have num = 1 or num = 2 and
type is set to HWMON_TYPE_TEMP (which is 10).
On s390 this assignes member key::type_and_num the value of
0x10a0000000000 (which is 292470092988416) as shown in above
trace output.

Fix this and export the structure/union hwmon_pmu_event_key
so the test shares the same implementation as the event parsing
functions for union and bit fields. This should avoid
endianess issues on all platforms.

Output after:
 # ./perf test -F 11
 11.1: Basic parsing test         : Ok
 11.2: Parsing without PMU name   : Ok
 11.3: Parsing with PMU name      : Ok
 #

Fixes: 531ee0fd48 ("perf test: Add hwmon "PMU" test")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250131112400.568975-1-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 17:22:40 -08:00
Thomas Richter
90d97674d4 perf test: Use cycles event in perf record test for leader_sampling
On s390 the event instructions can not be used for recording.
This event is only supported by perf stat.

Change the event from instructions to cycles in subtest
test_leader_sampling.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250131102756.4185235-3-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 11:36:14 -08:00
Thomas Richter
859199431d perf test: Fix perf record test for precise_max
On s390 the event instructions can not be used for recording.
This event is only supported by perf stat.

Test that each event cycles and instructions supports sampling.
If the event can not be sampled, skip it.

Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Suggested-by: James Clark <james.clark@linaro.org>
Reviewed-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250131102756.4185235-2-tmricht@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-04 11:34:25 -08:00
Anubhav Shelat
23e0a63c6d perf script: force stdin for flamegraph in live mode
Currently, running "perf script flamegraph -a -F 99 sleep 1" should
produce flamegraph.html containing the flamegraph. Howevever, it gives a
segmentation fault.

This is caused because the flamegraph.py script is
supposed to take as input the output of "perf record", which should be
in stdin. This would require passing "-i -" to flamegraph.py. However,
the "flamegraph-report" script causes "perf script" command to take the
"-i -" option instead of flamegraph.py, which causes no problem for
"perf script", but causes a seg fault since flamegraph.py has no input
file. To fix this I added the "-i -" option directly to the
flamegraph-report script to ensure flamegraph.py gets input from stdin.

Signed-off-by: Anubhav Shelat <ashelat@redhat.com>
Tested-by: Michael Petlan <mpetlan@redhat.com>
Link: https://lore.kernel.org/r/20250131145704.3164542-2-ashelat@redhat.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-03 19:49:10 -08:00
Ian Rogers
bb4b8f9697 perf test: Extra verbosity and hypervisor skip for tpebs test
When not running as root and with higher perf event paranoia values
the perf record forked by TPEBS can fail to attach to the process. Skip
the test in these scenarios.

Intel TPEBS test skips on non-Intel CPUs. On Intel CPUs under a
hypervisor the cache-misses event may not be present or precise. Skip
the test under this condition.

Refactor the output code to be placed in a file so that on a signal
the file can be dumped. This was necessary to catch the issue above as
the failing perf record command would fail without output.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Falcon <thomas.falcon@intel.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250130170135.5817-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-03 19:45:50 -08:00
James Clark
4c4c0724d6 perf: Always feature test reallocarray
This is also used in util/comm.c now, so instead of selectively doing
the feature test, always do it. If it's ever used anywhere else it's
less likely to cause another build failure.

This doesn't remove the need to manually include libc_compat.h, and
missing that will still cause an error for glibc < 2.26. There isn't a
way to fix that without poisoning reallocarray like libbpf did, but that
has other downsides like making memory debugging tools less useful. So
for Perf keep it like this and we'll have to fix up any missed includes.

Fixes the following build error:

  util/comm.c:152:31: error: implicit declaration of function
                      'reallocarray' [-Wimplicit-function-declaration]
  152 |                         tmp = reallocarray(comm_strs->strs,
      |                               ^~~~~~~~~~~~

Fixes: 13ca628716 ("perf comm: Add reference count checking to 'struct comm_str'")
Reported-by: Ali Utku Selen <ali.utku.selen@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20250129154405.777533-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-31 14:45:19 -08:00
Linus Torvalds
c06310fd6b perf-tools fixes for 6.14
An early round of random fixes in perf tools for this cycle.
 
 perf trace
 ----------
 * Fix loading of BPF program on certain clang versions
 * Fix out-of-bound access in syscalls with 6 arguments
 * Skip syscall enum test if landlock syscall is not available
 
 perf annotate
 -------------
 * Fix segfaults due to invalid access in disasm arrays
 
 perf stat
 ---------
 * Fix error handling in topology parsing
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ5vx+gAKCRCMstVUGiXM
 g8PpAP9fNWvkxEiylqO9GGqMJWnIwWwlz4NCqqOZWyPspcECrgD9Eu0lZlna4tOL
 3I8giYN2m7ogNt+ZXP2b0y2np7hOGQc=
 =lVVJ
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-fixes-for-v6.14-2025-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools fixes from Namhyung Kim:
 "An early round of random fixes in perf tools for this cycle.

  perf trace:
   - Fix loading of BPF program on certain clang versions
   - Fix out-of-bound access in syscalls with 6 arguments
   - Skip syscall enum test if landlock syscall is not available

  perf annotate:
   - Fix segfaults due to invalid access in disasm arrays

  perf stat:
   - Fix error handling in topology parsing"

* tag 'perf-tools-fixes-for-v6.14-2025-01-30' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools:
  perf cpumap: Fix die and cluster IDs
  perf test: Skip syscall enum test if no landlock syscall
  perf trace: Fix runtime error of index out of bounds
  perf annotate: Use an array for the disassembler preference
  perf trace: Fix BPF loading failure (-E2BIG)
2025-01-30 17:38:20 -08:00
Ian Rogers
8ce0d2da14 perf stat: Fix find_stat for mixed legacy/non-legacy events
Legacy events typically don't have a PMU when added leading to
mismatched legacy/non-legacy cases in find_stat. Use evsel__find_pmu
to make sure the evsel PMU is looked up. Update the evsel__find_pmu
code to look for the PMU using the extended config type or, for legacy
hardware/hw_cache events on non-hybrid systems, just use the core PMU.

Before:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1
 Performance counter stats for 'system wide':

       215,309,764      cycles
        44,326,491      cpu/instructions/

       1.002555314 seconds time elapsed
```
After:
```
$ perf stat -e cycles,cpu/instructions/ -a sleep 1

 Performance counter stats for 'system wide':

       990,676,332      cycles
     1,235,762,487      cpu/instructions/                #    1.25  insn per cycle

       1.002667198 seconds time elapsed
```

Fixes: 3612ca8e29 ("perf stat: Fix the hard-coded metrics calculation on the hybrid")
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-29 14:06:25 -08:00
Ian Rogers
6ab89b7fc2 perf evsel: Add pmu_name helper
Add helper to get the name of the evsel's PMU. This handles the case
where there's no sysfs PMU via parse_events event_type helper.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20250109222109.567031-2-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-29 14:05:57 -08:00
James Clark
9fae5884bb perf cpumap: Fix die and cluster IDs
Now that filename__read_int() returns -errno instead of -1 these
statements need to be updated otherwise error values will be used as
die IDs.

This appears as a -2 die ID when the platform doesn't export one:

  $ perf stat --per-core -a -- true

  S36-D-2-C0            1               9.45 msec cpu-clock

And the session topology test fails:

  $ perf test -vvv topology

  CPU 0, core 0, socket 36
  CPU 1, core 1, socket 36
  CPU 2, core 2, socket 36
  CPU 3, core 3, socket 36
  FAILED tests/topology.c:137 Cpu map - Die ID doesn't match
  ---- end(-1) ----
  38: Session topology                                                : FAILED!

Fixes: 05be17eed7 ("tool api fs: Correctly encode errno for read/write open failures")
Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241218115552.912517-1-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-28 10:03:26 -08:00
Namhyung Kim
72d81e1062 perf test: Skip syscall enum test if no landlock syscall
The perf trace enum augmentation test specifically targets landlock_
add_rule syscall but IIUC it's an optional and can be opt-out by a
kernel config.

Currently trace_landlock() runs `perf test -w landlock` before the
actual testing to check the availability but it's not enough since the
workload always returns 0.  Instead it could check if perf trace output
has 'landlock' string.

Fixes: d66763fed3 ("perf test trace_btf_enum: Add regression test for the BTF augmentation of enums in 'perf trace'")
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250128170629.1251574-1-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-28 09:29:39 -08:00
Howard Chu
c7b87ce0dd perf trace: Fix runtime error of index out of bounds
libtraceevent parses and returns an array of argument fields, sometimes
larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr",
idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6
elements max, creating an out-of-bounds access. This runtime error is
found by UBsan. The error message:

  $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1
  builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]'
    #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966
    #1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110
    #2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436
    #3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897
    #4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335
    #5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502
    #6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351
    #7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404
    #8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448
    #9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556
    #10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360
    #12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6)

     0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1)                                      = 1

Fixes: 5e58fcfaf4 ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint")
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250122025519.361873-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-28 09:27:27 -08:00
Ian Rogers
bde4ccfd5a perf annotate: Use an array for the disassembler preference
Prior to this change a string was used which could cause issues with
an unrecognized disassembler in symbol__disassembler. Change to
initializing an array of perf_disassembler enum values. If a value
already exists then adding it a second time is ignored to avoid array
out of bounds problems present in the previous code, it also allows a
statically sized array and removes memory allocation needs. Errors in
the disassembler string are reported when the config is parsed during
perf annotate or perf top start up. If the array is uninitialized
after processing the config file the default llvm, capstone then
objdump values are added but without a need to parse a string.

Fixes: a6e8a58de6 ("perf disasm: Allow configuring what disassemblers to use")
Closes: https://lore.kernel.org/lkml/CAP-5=fUdfCyxmEiTpzS2uumUp3-SyQOseX2xZo81-dQtWXj6vA@mail.gmail.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20250124043856.1177264-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-27 15:58:01 -08:00
James Clark
66e99fd5a1 perf vendor events arm64: Add V3 events/metrics
Using the scripts at:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/

Generate perf json for neoverse-v3 using the following command:
```
$ telemetry-solution/tools/perf_json_generator/generate.py \
  tools/perf/ --telemetry-files \
  telemetry-solution/data/pmu/cpu/neoverse/neoverse-v3.json
```

Signed-off-by: Ian Rogers <irogers@google.com>
[Re-generate after updating script]
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250122163504.2061472-3-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-24 15:14:15 -08:00
James Clark
994256a798 perf vendor events arm64: Add N3 events/metrics
Using the scripts at:
https://gitlab.arm.com/telemetry-solution/telemetry-solution/

Generate perf json for neoverse-n3 using the following command:
```
$ telemetry-solution/tools/perf_json_generator/generate.py \
  tools/perf/ --telemetry-files \
  telemetry-solution/data/pmu/cpu/neoverse/neoverse-n3.json
```

Signed-off-by: Ian Rogers <irogers@google.com>
[Re-generate after updating script]
Signed-off-by: James Clark <james.clark@linaro.org>
Link: https://lore.kernel.org/r/20250122163504.2061472-2-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-24 15:14:06 -08:00
Benjamin Peterson
0aefb3df8b perf trace: Fix return value of trace__fprintf_tp_fields
This function formerly returned twice the number of bytes printed.

Signed-off-by: Benjamin Peterson <benjamin@engflow.com>
Reviewed-by: Howard Chu <howardchu95@gmail.com>
Link: https://lore.kernel.org/r/20250123-void-fprintf_tp_fields-v2-1-6038f8224987@engflow.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-24 13:21:49 -08:00
Linus Torvalds
7685b334d1 perf-tools changes for v6.14
There are a lot of changes in the perf tools in this cycle.
 
 build
 -----
 * Use generic syscall table to generate syscall numbers on supported archs.
 * This also enables to get rid of libaudit which was used for syscall numbers.
 * Remove python2 support as it's deprecated for years.
 * Fix issues on static build with libzstd.
 
 perf record
 -----------
 * Intel-PT supports "aux-action" config term to pause or resume tracing in
   the aux-buffer.  Users can start the intel_pt event as "started-paused" and
   configure other events to control the Intel-PT tracing.
 
     # perf record --kcore -e intel_pt/aux-action=start-paused/   \
         -e syscalls:sys_enter_newuname/aux-action=resume/        \
         -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname
 
   This requires the kernel support (which was added in v6.13).
 
 perf lock
 ---------
 * 'perf lock contention' command has an ability to symbolize locks in
   dynamically allocated objects using slab cache name when it runs with BPF.
   Those dynamic locks would have "&" prefix in the name to distinguish them
   from ordinary (static) locks.
 
     # perf lock con -abl -E 5 sleep 1
        contended   total wait     max wait     avg wait            address   symbol
 
                2      1.95 us      1.77 us       975 ns   ffff9d5e852d3498   &task_struct (mutex)
                1      1.18 us      1.18 us      1.18 us   ffff9d5e852d3538   &task_struct (mutex)
                4      1.12 us       354 ns       279 ns   ffff9d5e841ca800   &kmalloc-cg-512 (mutex)
                2       859 ns       617 ns       429 ns   ffffffffa41c3620   delayed_uprobe_lock (mutex)
                3       691 ns       388 ns       230 ns   ffffffffa41c0940   pack_mutex (mutex)
 
   This also requires the kernel/BPF support (which was added in v6.13).
 
 perf ftrace
 -----------
 * 'perf ftrace latency' command gets a couple of options to support linear
   buckets instead of exponential.  Also it's possible to specify max and
   min latency for the linear buckets.
 
     # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100   \
         --min-latency=200 --max-latency=800 -- sleep 1
     #   DURATION     |      COUNT | GRAPH                                  |
          0 -  200 ns |        186 | ###                                    |
        200 -  300 ns |        256 | #####                                  |
        300 -  400 ns |        364 | #######                                |
        400 -  500 ns |        223 | ####                                   |
        500 -  600 ns |        111 | ##                                     |
        600 -  700 ns |         41 |                                        |
        700 -  800 ns |        141 | ##                                     |
        800 -  ... ns |        169 | ###                                    |
 
     # statistics  (in nsec)
       total time:              2162212
         avg time:                  967
         max time:                16817
         min time:                  132
            count:                 2236
 
 * As you can see in the above example, it nows shows the statistics at the
   end so that users can see the avg/max/min latencies easily.
 
 * 'perf ftrace profile' command has --graph-opts option like 'perf ftrace
   trace' so that it can control the tracing behaviors in the same way.
   For example, it can limit the function call depth or threshold.
 
 perf script
 -----------
 * Improve physical memory resolution in 'mem-phys-addr' script by parsing
   /proc/iomem file.
 
     # perf script mem-phys-addr -- find /
     ...
     Event: mem_inst_retired.all_loads:P
     Memory type                                    count  percentage
     ----------------------------------------  ----------  ----------
     100000000-85f7fffff : System RAM                8929        69.7
       547600000-54785d23f : Kernel data             1240         9.7
       546a00000-5474bdfff : Kernel rodata            490         3.8
       5480ce000-5485fffff : Kernel bss               121         0.9
     0-fff : Reserved                                3860        30.1
     100000-89c01fff : System RAM                      18         0.1
     8a22c000-8df6efff : System RAM                     5         0.0
 
 Others
 ------
 * 'perf test' gets --runs-per-test option to run the test cases repeatedly.
   This would be helpful to see if it's flaky.
 
 * Add 'parse_events' method to Python perf extension module, so that users
   can use the same event parsing logic in the python code.  One more step
   towards implementing perf tools in Python. :)
 
 * Support opening tracepoint events without libtraceevent.  This will be
   helpful if it won't use the tracing data like in 'perf stat'.
 
 * Update ARM Neoverse N2/V2 JSON events and metrics
 
 Signed-off-by: Namhyung Kim <namhyung@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSo2x5BnqMqsoHtzsmMstVUGiXMgwUCZ5AgiQAKCRCMstVUGiXM
 g0WhAP43Dpfatrm1jicTyAogk5D/JrIMOgjGtrJJi5RXG/r0gwD8DSWFzLppS9xy
 KGtjLHrN6v6BqR4DCubdlZmRfh9Qjgg=
 =M0Kz
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf-tools updates from Namhyung Kim:
 "There are a lot of changes in the perf tools in this cycle.

  build:

   - Use generic syscall table to generate syscall numbers on supported
     archs

   - This also enables to get rid of libaudit which was used for syscall
     numbers

   - Remove python2 support as it's deprecated for years

   - Fix issues on static build with libzstd

  perf record:

   - Intel-PT supports "aux-action" config term to pause or resume
     tracing in the aux-buffer. Users can start the intel_pt event as
     "started-paused" and configure other events to control the Intel-PT
     tracing:

         # perf record --kcore -e intel_pt/aux-action=start-paused/   \
             -e syscalls:sys_enter_newuname/aux-action=resume/        \
             -e syscalls:sys_exit_newuname/aux-action=pause/ -- uname

     This requires kernel support (which was added in v6.13)

  perf lock:

   - 'perf lock contention' command has an ability to symbolize locks in
     dynamically allocated objects using slab cache name when it runs
     with BPF. Those dynamic locks would have "&" prefix in the name to
     distinguish them from ordinary (static) locks

        # perf lock con -abl -E 5 sleep 1
           contended   total wait     max wait     avg wait            address   symbol

                   2      1.95 us      1.77 us       975 ns   ffff9d5e852d3498   &task_struct (mutex)
                   1      1.18 us      1.18 us      1.18 us   ffff9d5e852d3538   &task_struct (mutex)
                   4      1.12 us       354 ns       279 ns   ffff9d5e841ca800   &kmalloc-cg-512 (mutex)
                   2       859 ns       617 ns       429 ns   ffffffffa41c3620   delayed_uprobe_lock (mutex)
                   3       691 ns       388 ns       230 ns   ffffffffa41c0940   pack_mutex (mutex)

     This also requires kernel/BPF support (which was added in v6.13)

  perf ftrace:

   - 'perf ftrace latency' command gets a couple of options to support
     linear buckets instead of exponential. Also it's possible to
     specify max and min latency for the linear buckets:

        # perf ftrace latency -abn -T switch_mm_irqs_off --bucket-range=100   \
            --min-latency=200 --max-latency=800 -- sleep 1
        #   DURATION     |      COUNT | GRAPH                                  |
             0 -  200 ns |        186 | ###                                    |
           200 -  300 ns |        256 | #####                                  |
           300 -  400 ns |        364 | #######                                |
           400 -  500 ns |        223 | ####                                   |
           500 -  600 ns |        111 | ##                                     |
           600 -  700 ns |         41 |                                        |
           700 -  800 ns |        141 | ##                                     |
           800 -  ... ns |        169 | ###                                    |

        # statistics  (in nsec)
          total time:              2162212
            avg time:                  967
            max time:                16817
            min time:                  132
               count:                 2236

   - As you can see in the above example, it nows shows the statistics
     at the end so that users can see the avg/max/min latencies easily

   - 'perf ftrace profile' command has --graph-opts option like 'perf
     ftrace trace' so that it can control the tracing behaviors in the
     same way. For example, it can limit the function call depth or
     threshold

  perf script:

   - Improve physical memory resolution in 'mem-phys-addr' script by
     parsing /proc/iomem file

        # perf script mem-phys-addr -- find /
        ...
        Event: mem_inst_retired.all_loads:P
        Memory type                                    count  percentage
        ----------------------------------------  ----------  ----------
        100000000-85f7fffff : System RAM                8929        69.7
          547600000-54785d23f : Kernel data             1240         9.7
          546a00000-5474bdfff : Kernel rodata            490         3.8
          5480ce000-5485fffff : Kernel bss               121         0.9
        0-fff : Reserved                                3860        30.1
        100000-89c01fff : System RAM                      18         0.1
        8a22c000-8df6efff : System RAM                     5         0.0

  Others:

   - 'perf test' gets --runs-per-test option to run the test cases
     repeatedly. This would be helpful to see if it's flaky

   - Add 'parse_events' method to Python perf extension module, so that
     users can use the same event parsing logic in the python code. One
     more step towards implementing perf tools in Python. :)

   - Support opening tracepoint events without libtraceevent. This will
     be helpful if it won't use the tracing data like in 'perf stat'

   - Update ARM Neoverse N2/V2 JSON events and metrics"

* tag 'perf-tools-for-v6.14-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (176 commits)
  perf test: Update event_groups test to use instructions
  perf bench: Fix undefined behavior in cmpworker()
  perf annotate: Prefer passing evsel to evsel->core.idx
  perf lock: Rename fields in lock_type_table
  perf lock: Add percpu-rwsem for type filter
  perf lock: Fix parse_lock_type which only retrieve one lock flag
  perf lock: Fix return code for functions in __cmd_contention
  perf hist: Fix width calculation in hpp__fmt()
  perf hist: Fix bogus profiles when filters are enabled
  perf hist: Deduplicate cmp/sort/collapse code
  perf test: Improve verbose documentation
  perf test: Add a runs-per-test flag
  perf test: Fix parallel/sequential option documentation
  perf test: Send list output to stdout rather than stderr
  perf test: Rename functions and variables for better clarity
  perf tools: Expose quiet/verbose variables in Makefile.perf
  perf config: Add a function to set one variable in .perfconfig
  perf test perftool_testsuite: Return correct value for skipping
  perf test perftool_testsuite: Add missing description
  perf test record+probe_libc_inet_pton: Make test resilient
  ...
2025-01-24 05:45:40 -08:00
Howard Chu
013eb043f3 perf trace: Fix BPF loading failure (-E2BIG)
As reported by Namhyung Kim and acknowledged by Qiao Zhao (link:
https://lore.kernel.org/linux-perf-users/20241206001436.1947528-1-namhyung@kernel.org/),
on certain machines, perf trace failed to load the BPF program into the
kernel. The verifier runs perf trace's BPF program for up to 1 million
instructions, returning an E2BIG error, whereas the perf trace BPF
program should be much less complex than that. This patch aims to fix
the issue described above.

The E2BIG problem from clang-15 to clang-16 is cause by this line:
 } else if (size < 0 && size >= -6) { /* buffer */

Specifically this check: size < 0. seems like clang generates a cool
optimization to this sign check that breaks things.

Making 'size' s64, and use
 } else if ((int)size < 0 && size >= -6) { /* buffer */

Solves the problem. This is some Hogwarts magic.

And the unbounded access of clang-12 and clang-14 (clang-13 works this
time) is fixed by making variable 'aug_size' s64.

As for this:
-if (aug_size > TRACE_AUG_MAX_BUF)
-	aug_size = TRACE_AUG_MAX_BUF;
+aug_size = args->args[index] > TRACE_AUG_MAX_BUF ? TRACE_AUG_MAX_BUF : args->args[index];

This makes the BPF skel generated by clang-18 work. Yes, new clangs
introduce problems too.

Sorry, I only know that it works, but I don't know how it works. I'm not
an expert in the BPF verifier. I really hope this is not a kernel
version issue, as that would make the test case (kernel_nr) *
(clang_nr), a true horror story. I will test it on more kernel versions
in the future.

Fixes: 395d38419f: ("perf trace augmented_raw_syscalls: Add more check s to pass the verifier")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241213023047.541218-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-01-23 15:55:52 -08:00