Commit Graph

7301 Commits

Author SHA1 Message Date
Adrian Hunter
3af1dfdd51 perf build: Move perf_dlfilters.h in the source tree
Move perf_dlfilters.h in the source tree so that it will be found when
building dlfilters as part of the perf build.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https //lore.kernel.org/r/20210811101036.17986-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-11 09:35:24 -03:00
John Garry
e199f47f15 perf pmu: Make pmu_add_sys_aliases() public
Function pmu_add_sys_aliases() will be required for the PMU events test
for system events aliases, so make it public.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-10-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:46:39 -03:00
John Garry
5806099a2e perf pmu: Check .is_uncore field in pmu_add_cpu_aliases_map()
Calling pmu_is_uncore() for fake PMUs does not work, as it checks sysfs
for the PMU details (which won't exist).

Check .is_uncore field instead, which makes sense anyway.

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https //lore.kernel.org/r/1627566986-30605-7-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-10 14:44:00 -03:00
Leo Yan
7c0223e1dd perf env: Track kernel 64-bit mode in environment
It's useful to know that the kernel is running in 32-bit or 64-bit mode.
E.g. We can decide if perf tool is running in compat mode based on the
info.

This patch adds an item "kernel_is_64_bit" into session's environment
structure perf_env, its value is initialized based on the architecture
string.

Suggested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: russell king <linux@armlinux.org.uk>
Link: http://lore.kernel.org/lkml/20210809112727.596876-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:11:18 -03:00
Leo Yan
65c45afb14 perf: Cleanup for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT
Since the __sync functions have been dropped, This patch removes unused
build and checking for HAVE_SYNC_COMPARE_AND_SWAP_SUPPORT in perf tool.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-9-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:01:51 -03:00
Leo Yan
9d64503308 perf auxtrace: Remove auxtrace_mmap__read_snapshot_head()
Since the function auxtrace_mmap__read_snapshot_head() is exactly same
with auxtrace_mmap__read_head(), whether the session is in snapshot mode
or not, it's unified to use function auxtrace_mmap__read_head() for
reading AUX buffer head.

And the function auxtrace_mmap__read_snapshot_head() is unused so this
patch removes it.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-8-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:01:23 -03:00
Leo Yan
1fc7e593e2 perf auxtrace: Drop legacy __sync functions
The main purpose for using __sync built-in functions is to support
compat mode for 32-bit perf with 64-bit kernel.  But using these
built-in functions might cause potential issues.

__sync functions originally support Intel Itanium processoer [1] but it
cannot promise to support all 32-bit archs.  Now these functions have
become the legacy functions.

Considering __sync functions cannot really fix the 64-bit value
atomicity on 32-bit archs, thus this patch drops __sync functions.

Credits to Peter for detailed analysis.

[1] https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210809111407.596077-7-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 17:00:06 -03:00
Leo Yan
1ea3cb159e perf auxtrace: Use WRITE_ONCE() for updating aux_tail
Use WRITE_ONCE() for updating aux_tail, so can avoid unexpected memory
behaviour.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andriin@fb.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Daniel Díaz <daniel.diaz@linaro.org>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: x86@kernel.org
Link: http //lore.kernel.org/lkml/20210809111407.596077-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:58:29 -03:00
James Clark
9c38b671eb perf cs-etm: Add warnings for missing DSOs
Currently decode will silently fail if no binary data is available for
the decode. This is made worse if only partial data is available because
the decode will appear to work, but any trace from that missing DSO will
silently not be generated.

Add a UI popup once if there is any data missing, and then warn in the
bottom left for each individual DSO that's missing.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http //lore.kernel.org/lkml/20210805130354.878120-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-09 16:30:24 -03:00
Jakub Kicinski
0ca8d3ca45 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Build failure in drivers/net/wwan/mhi_wwan_mbim.c:
add missing parameter (0, assuming we don't want buffer pre-alloc).

Conflict in drivers/net/dsa/sja1105/sja1105_main.c between:
  589918df93 ("net: dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
  0fac6aa098 ("net: dsa: sja1105: delete the best_effort_vlan_filtering mode")

Follow the instructions from the commit message of the former commit
- removed the if conditions. When looking at commit 589918df93 ("net:
dsa: sja1105: be stateless with FDB entries on SJA1105P/Q/R/S/SJA1110 too")
note that the mask_iotag fields get removed by the following patch.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-08-05 15:08:47 -07:00
James Clark
f3c33cbd92 perf cs-etm: Improve Coresight zero timestamp warning
Only show the warning if the user hasn't already set timeless mode and
improve the text because there was ambiguity around the meaning of '...'

Change the warning to a UI warning instead of printing straight to
stderr because this corrupts the UI when perf report TUI is used. The UI
warning function also handles printing to stderr when in perf script
mode.

Suggested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:19 -03:00
James Clark
1155204950 perf tools: Add flag for tracking warnings of missing DSOs
Auxtrace support may need DSOs for decoding (for example Arm Coresight).
If one of these is missing it would make sense to warn once for each one
that's missing, but not flood the output with every address as there
could be thousands of lookups.

This flag will allow tracking whether a warning was shown for each DSO.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:08 -03:00
James Clark
243c3a3eb4 perf annotate: Add disassembly warnings for annotate --stdio
Currently 'perf annotate --stdio' (and --stdio2) will exit without
printing anything if there are disassembly errors. Apply the same error
handler that's used for TUI and GTK modes. This makes comparing
disassembly across the different modes more consistent.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:04:06 -03:00
James Clark
1094795eb9 perf tools: Add WARN_ONCE equivalent for UI warnings
Currently WARN_ONCE prints to stderr and corrupts the TUI. Add
equivalent methods for UI warnings.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210729155805.2830-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-03 17:03:18 -03:00
Namhyung Kim
c3a057dc3a perf inject: Fix output from a file to a pipe
When the input is a regular file but the output is a pipe, it should
write a pipe header.  But just repiping would write a portion of the
existing header which is different in 'size' value.  So we need to
prevent it and write a new pipe header along with other information
like event attributes and features.

This can handle something like this:

  # perf record -a -B sleep 1

  # perf inject -b -i perf.data | perf report -i -

Factor out perf_event__synthesize_for_pipe() to be shared between perf
record and inject.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:14:34 -03:00
Namhyung Kim
0ae0389362 perf tools: Pass a fd to perf_file_header__read_pipe()
Currently it unconditionally writes to stdout for repipe.  But perf
inject can direct its output to a regular file.  Then it needs to
write the header to the file as well.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:09:05 -03:00
Namhyung Kim
2681bd85a4 perf tools: Remove repipe argument from perf_session__new()
The repipe argument is only used by perf inject and the all others
passes 'false'.  Let's remove it from the function signature and add
__perf_session__new() to be called from perf inject directly.

This is a preparation of the change the pipe input/output.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210719223153.1618812-2-namhyung@kernel.org
[ Fixed up some trivial conflicts as this patchset fell thru the cracks ;-( ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 10:06:51 -03:00
Li Huafei
c4db54be9b perf annotate: Add error log in symbol__annotate()
When users use 'perf annotate' on unsupported machines, error logs
should be printed for user feedback.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dengcheng Zhu <dzhu@wavecomp.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210726123854.13463-2-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
Li Huafei
4502da0efb perf env: Normalize aarch64.* and arm64.* to arm64 in normalize_arch()
On my aarch64 big endian machine, the perf annotate does not work.

 # perf annotate
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (253 samples, percent: local period)
 --------------------------------------------------------------------------------------------------------------
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (1 samples, percent: local period)
 ------------------------------------------------------------------------------------------------------------
  Percent |      Source code & Disassembly of [kernel.kallsyms] for cycles (47 samples, percent: local period)
 -------------------------------------------------------------------------------------------------------------
 ...

This is because the arch_find() function uses the normalized architecture
name provided by normalize_arch(), and my machine's architecture name
aarch64_be is not normalized to arm64.  Like other architectures such as
arm and powerpc, we can fuzzy match the architecture names associated with
aarch64.* and normalize them.

It seems that there is also arm64_be architecture name, which we also
normalize to arm64.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dengcheng Zhu <dzhu@wavecomp.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http //lore.kernel.org/lkml/20210726123854.13463-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:18 -03:00
James Clark
9182f04a85 perf cs-etm: Pass unformatted flag to decoder
The TRBE (Trace Buffer Extension) feature allows a separate trace buffer
for each trace source, therefore the trace wouldn't need to be
formatted. The driver was introduced in commit 3fbf7f011f
("coresight: sink: Add TRBE driver").

The formatted/unformatted mode is encoded in one of the flags of the
AUX record. The first AUX record encountered for each event is used to
determine the mode, and this will persist for the remaining trace that
is either decoded or dumped.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-7-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
04aaad262c perf cs-etm: Use existing decoder instead of resetting it
When dumping trace, the decoder is continually deleted and recreated to
decode each buffer. To support both formatted and unformatted trace in
a later commit, the decoder will be configured in advance.

This commit removes the deletion of the decoder and allows the
formatted/unformatted setting to persist.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-6-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
b8324f490b perf cs-etm: Suppress printing when resetting decoder
The decoder is quite noisy when being reset. In a future commit,
dump-raw-trace will use a code path that resets the decoder rather than
creating a new one, so printing has to be suppressed to not flood the
output.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-5-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
ca50db5917 perf cs-etm: Only setup queues when they are modified
Continually creating queues in cs_etm__process_event() is unnecessary.
They only need to be created when a buffer for a new CPU or thread is
encountered. This can be in two places, when building the queues in
advance in cs_etm__process_auxtrace_info(), or in
cs_etm__process_auxtrace_event() when data_queued is false and the
index wasn't available (pipe mode).

This change will allow the 'formatted' decoder setting to applied when
iterating over aux records in a later commit.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
9ac8afd500 perf cs-etm: Split setup and timestamp search functions
This refactoring has some benefits:

 * Decoding is done to find the timestamp. If we want to print errors
   when maps aren't available, then doing it from cs_etm__setup_queue()
   may cause warnings to be printed.

 * The cs_etm__setup_queue() flow is shared between timed and timeless
   modes, so it needs to be guarded by an if statement which can now
   be removed.

 * Allows moving the setup queues function earlier.

 * If data was piped in, then not all queues would be filled so it
   wouldn't have worked properly anyway. Now it waits for flush so
   data in all queues will be available.

The motivation for this is to decouple setup functions with ones that
involve decoding. That way we can move the setup function earlier when
the formatted/unformatted trace information is available.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https //lore.kernel.org/r/20210721150202.32065-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
James Clark
6f38e1158b perf cs-etm: Refactor initialisation of kernel start address
The kernel start address is already cached in the machine struct once it
is initialised, so storing it in the cs_etm struct is unnecessary.

It also depends on kernel maps being available to be initialised.
Therefore cs_etm__setup_queues() isn't an appropriate place to call it
because it could be called before processing starts. It would be better
to initialise it at the point when it is needed, then we can be sure
that all the necessary maps are available. Also by calling
machine__kernel_start() multiple times it can be initialised at some
point, even if it failed to initialise previously due to missing maps.

In a later commit cs_etm__setup_queues() will be moved which is the
motivation for this change.

Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <al.grant@arm.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20210721150202.32065-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-08-02 09:56:17 -03:00
Jakub Kicinski
d39e8b92c3 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Andrii Nakryiko says:

====================
bpf-next 2021-07-30

We've added 64 non-merge commits during the last 15 day(s) which contain
a total of 83 files changed, 5027 insertions(+), 1808 deletions(-).

The main changes are:

1) BTF-guided binary data dumping libbpf API, from Alan.

2) Internal factoring out of libbpf CO-RE relocation logic, from Alexei.

3) Ambient BPF run context and cgroup storage cleanup, from Andrii.

4) Few small API additions for libbpf 1.0 effort, from Evgeniy and Hengqi.

5) bpf_program__attach_kprobe_opts() fixes in libbpf, from Jiri.

6) bpf_{get,set}sockopt() support in BPF iterators, from Martin.

7) BPF map pinning improvements in libbpf, from Martynas.

8) Improved module BTF support in libbpf and bpftool, from Quentin.

9) Bpftool cleanups and documentation improvements, from Quentin.

10) Libbpf improvements for supporting CO-RE on old kernels, from Shuyi.

11) Increased maximum cgroup storage size, from Stanislav.

12) Small fixes and improvements to BPF tests and samples, from various folks.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (64 commits)
  tools: bpftool: Complete metrics list in "bpftool prog profile" doc
  tools: bpftool: Document and add bash completion for -L, -B options
  selftests/bpf: Update bpftool's consistency script for checking options
  tools: bpftool: Update and synchronise option list in doc and help msg
  tools: bpftool: Complete and synchronise attach or map types
  selftests/bpf: Check consistency between bpftool source, doc, completion
  tools: bpftool: Slightly ease bash completion updates
  unix_bpf: Fix a potential deadlock in unix_dgram_bpf_recvmsg()
  libbpf: Add btf__load_vmlinux_btf/btf__load_module_btf
  tools: bpftool: Support dumping split BTF by id
  libbpf: Add split BTF support for btf__load_from_kernel_by_id()
  tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
  tools: Free BTF objects at various locations
  libbpf: Rename btf__get_from_id() as btf__load_from_kernel_by_id()
  libbpf: Rename btf__load() as btf__load_into_kernel()
  libbpf: Return non-null error on failures in libbpf_find_prog_btf_id()
  bpf: Emit better log message if bpf_iter ctx arg btf_id == 0
  tools/resolve_btfids: Emit warnings and patch zero id for missing symbols
  bpf: Increase supported cgroup storage value size
  libbpf: Fix race when pinning maps in parallel
  ...
====================

Link: https://lore.kernel.org/r/20210730225606.1897330-1-andrii@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2021-07-31 11:23:26 -07:00
Arnaldo Carvalho de Melo
9bac1bd6e6 Revert "perf map: Fix dso->nsinfo refcounting"
This makes 'perf top' abort in some cases, and the right fix will
involve surgery that is too much to do at this stage, so revert for now
and fix it in the next merge window.

This reverts commit 2d6b74baa7.

Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-30 18:26:22 -03:00
Quentin Monnet
86f4b7f257 tools: Replace btf__get_from_id() with btf__load_from_kernel_by_id()
Replace the calls to function btf__get_from_id(), which we plan to
deprecate before the library reaches v1.0, with calls to
btf__load_from_kernel_by_id() in tools/ (bpftool, perf, selftests).
Update the surrounding code accordingly (instead of passing a pointer to
the btf struct, get it as a return value from the function).

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20210729162028.29512-6-quentin@isovalent.com
2021-07-29 17:23:52 -07:00
Quentin Monnet
369e955b3d tools: Free BTF objects at various locations
Make sure to call btf__free() (and not simply free(), which does not
free all pointers stored in the struct) on pointers to struct btf
objects retrieved at various locations.

These were found while updating the calls to btf__get_from_id().

Fixes: 999d82cbc0 ("tools/bpf: enhance test_btf file testing to test func info")
Fixes: 254471e57a ("tools/bpf: bpftool: add support for func types")
Fixes: 7b612e291a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF programs")
Fixes: d56354dc49 ("perf tools: Save bpf_prog_info and BTF of new BPF programs")
Fixes: 47c09d6a9f ("bpftool: Introduce "prog profile" command")
Fixes: fa853c4b83 ("perf stat: Enable counting events for BPF programs")
Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20210729162028.29512-5-quentin@isovalent.com
2021-07-29 17:09:28 -07:00
John Garry
c07d5c9226 perf pmu: Fix alias matching
Commit c47a5599ed ("perf tools: Fix pattern matching for same
substring in different PMU type"), may have fixed some alias matching,
but has broken some others.

Firstly it cannot handle the simple scenario of PMU name in form
pmu_name{digits} - it can only handle pmu_name_{digits}.

Secondly it cannot handle more complex matching in the case where we
have multiple tokens. In this scenario, the code failed to realise that
we may examine multiple substrings in the PMU name.

Fix in two ways:

- Change perf_pmu__valid_suffix() to accept a PMU name without '_' in the
  suffix

- Only pay attention to perf_pmu__valid_suffix() for the final token

Also add const qualifiers as necessary to avoid casting.

Fixes: c47a5599ed ("perf tools: Fix pattern matching for same substring in different PMU type")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Jin Yao <yao.jin@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1626793819-79090-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-27 13:21:46 -03:00
James Clark
48e8a7b5a5 perf cs-etm: Split --dump-raw-trace by AUX records
Currently --dump-raw-trace skips queueing and splitting buffers because
of an early exit condition in cs_etm__process_auxtrace_info(). Once
that is removed we can print the split data by using the queues
and searching for split buffers with the same reference as the
one that is currently being processed.

This keeps the same behaviour of dumping in file order when an AUXTRACE
event appears, rather than moving trace dump to where AUX records are in
the file.

There will be a newline and size printout for each fragment. For example
this buffer is comprised of two AUX records, but was printed as one:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 160 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

But is now printed as two fragments:

  0 0 0x8098 [0x30]: PERF_RECORD_AUXTRACE size: 0xa0  offset: 0  ref: 0x491a4dfc52fc0e6e  idx: 0  t

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:0; ID:10;   I_ASYNC : Alignment Synchronisation.
          Idx:12; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:17; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000000000;

  . ... CoreSight ETM Trace data: size 80 bytes
          Idx:80; ID:10;  I_ASYNC : Alignment Synchronisation.
          Idx:92; ID:10;  I_TRACE_INFO : Trace Info.; INFO=0x0 { CC.0 }
          Idx:97; ID:10;  I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0xFFFFDE2AD3FD76D4;

Decoding errors that appeared in problematic files are now not present,
for example:

        Idx:808; ID:1c; I_BAD_SEQUENCE : Invalid Sequence in packet.[I_ASYNC]
        ...
        PKTP_ETMV4I_0016 : 0x0014 (OCSD_ERR_INVALID_PCKT_HDR) [Invalid packet header]; TrcIdx=822

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210624164303.28632-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-27 12:50:56 -03:00
Yang Jihong
22a665513b perf probe: Fix add event failure when running 32-bit perf in a 64-bit kernel
The "address" member of "struct probe_trace_point" uses long data type.
If kernel is 64-bit and perf program is 32-bit, size of "address"
variable is 32 bits.

As a result, upper 32 bits of address read from kernel are truncated, an
error occurs during address comparison in kprobe_warn_out_range().

Before:

  # perf probe -a schedule
  schedule is out of .text, skip it.
    Error: Failed to add events.

Solution:
  Change data type of "address" variable to u64 and change corresponding
address printing and value assignment.

After:

  # perf.new.new probe -a schedule
  Added new event:
    probe:schedule       (on schedule)

  You can now use it in all perf tools, such as:

          perf record -e probe:schedule -aR sleep 1

  # perf probe -l
    probe:schedule       (on schedule@kernel/sched/core.c)
  # perf record -e probe:schedule -aR sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.156 MB perf.data (1366 samples) ]
  # perf report --stdio
  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 1K of event 'probe:schedule'
  # Event count (approx.): 1366
  #
  # Overhead  Command          Shared Object      Symbol
  # ........  ...............  .................  ............
  #
       6.22%  migration/0      [kernel.kallsyms]  [k] schedule
       6.22%  migration/1      [kernel.kallsyms]  [k] schedule
       6.22%  migration/2      [kernel.kallsyms]  [k] schedule
       6.22%  migration/3      [kernel.kallsyms]  [k] schedule
       6.15%  migration/10     [kernel.kallsyms]  [k] schedule
       6.15%  migration/11     [kernel.kallsyms]  [k] schedule
       6.15%  migration/12     [kernel.kallsyms]  [k] schedule
       6.15%  migration/13     [kernel.kallsyms]  [k] schedule
       6.15%  migration/14     [kernel.kallsyms]  [k] schedule
       6.15%  migration/15     [kernel.kallsyms]  [k] schedule
       6.15%  migration/4      [kernel.kallsyms]  [k] schedule
       6.15%  migration/5      [kernel.kallsyms]  [k] schedule
       6.15%  migration/6      [kernel.kallsyms]  [k] schedule
       6.15%  migration/7      [kernel.kallsyms]  [k] schedule
       6.15%  migration/8      [kernel.kallsyms]  [k] schedule
       6.15%  migration/9      [kernel.kallsyms]  [k] schedule
       0.22%  rcu_sched        [kernel.kallsyms]  [k] schedule
  ...
  #
  # (Cannot load tips.txt file, please install perf!)
  #

Signed-off-by: Yang Jihong <yangjihong1@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jianlin Lv <jianlin.lv@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Link: http://lore.kernel.org/lkml/20210715063723.11926-1-yangjihong1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:31:15 -03:00
Riccardo Mancini
d4b3eedce1 perf data: Close all files in close_dir()
When using 'perf report' in directory mode, the first file is not closed
on exit, causing a memory leak.

The problem is caused by the iterating variable never reaching 0.

Fixes: 1455206311 ("perf data: Add perf_data__(create_dir|close_dir) functions")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhen Lei <thunder.leizhen@huawei.com>
Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:27:49 -03:00
Riccardo Mancini
e0fa7ab422 perf probe-file: Delete namelist in del_events() on the error path
ASan reports some memory leaks when running:

  # perf test "42: BPF filter"

This second leak is caused by a strlist not being dellocated on error
inside probe_file__del_events.

This patch adds a goto label before the deallocation and makes the error
path jump to it.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: e7895e422e ("perf probe: Split del_perf_probe_events()")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-18 09:27:37 -03:00
Riccardo Mancini
f8cbb0f926 perf lzma: Close lzma stream on exit
ASan reports memory leaks when running:

  # perf test "88: Check open filename arg using perf trace + vfs_getname"

One of these is caused by the lzma stream never being closed inside
lzma_decompress_to_file().

This patch adds the missing lzma_end().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 80a32e5b49 ("perf tools: Add lzma decompression support for kernel module")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/aaf50bdce7afe996cfc06e1bbb36e4a2a9b9db93.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:30:22 -03:00
Riccardo Mancini
423b9174f5 perf session: Cleanup trace_event
ASan reports several memory leaks when running:

  # perf test "82: Use vfs_getname probe to get syscall args filenames"

many of which are related to session->tevent.

This patch will solve this problem, then next patch will fix the
remaining memory leaks in 'perf script'.

This bug is due to a missing deallocation of the trace_event data
strutures.

This patch adds the missing trace_event__cleanup() in
perf_session__delete().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/fa2a3f221d90e47ce4e5b7e2d6e64c3509ddc96a.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
a37338aad8 perf report: Free generated help strings for sort option
ASan reports the memory leak of the strings allocated by sort_help() when
running perf report.

This patch changes the returned pointer to char* (instead of const
char*), saves it in a temporary variable, and finally deallocates it at
function exit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 702fb9b415 ("perf report: Show all sort keys in help output")
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/a38b13f02812a8a6759200b9063c6191337f44d4.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
da6b7c6c06 perf env: Fix memory leak of cpu_pmu_caps
ASan reports memory leaks while running:

 # perf test "83: Zstd perf.data compression/decompression"

The first of the leaks is caused by env->cpu_pmu_caps not being freed.

This patch adds the missing (z)free inside perf_env__exit.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 6f91ea283a ("perf header: Support CPU PMU capabilities")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/6ba036a8220156ec1f3d6be3e5d25920f6145028.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
581e295a0f perf dso: Fix memory leak in dso__new_map()
ASan reports a memory leak when running:

  # perf test "65: maps__merge_in".

The causes of the leaks are two, this patch addresses only the first
one, which is related to dso__new_map().

The bug is that dso__new_map() creates a new dso but never decreases the
refcount it gets from creating it.

This patch adds the missing dso__put().

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: d3a7c489c7 ("perf tools: Reference count struct dso")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/60bfe0cd06e89e2ca33646eb8468d7f5de2ee597.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:52 -03:00
Riccardo Mancini
42db3d9ded perf env: Fix sibling_dies memory leak
ASan reports a memory leak in perf_env while running:

  # perf test "41: Session topology"

Caused by sibling_dies not being freed.

This patch adds the required free.

Fixes: acae8b36cd ("perf header: Add die information in CPU topology")
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2140d0b57656e4eb9021ca9772250c24c032924b.1626343282.git.rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:27:49 -03:00
Riccardo Mancini
dedeb4be20 perf probe: Fix dso->nsinfo refcounting
ASan reports a memory leak of nsinfo during the execution of:

 # perf test "31: Lookup mmap thread".

The leak is caused by a refcounted variable being replaced without
dropping the refcount.

This patch makes sure that the refcnt of nsinfo is decreased whenever
a refcounted variable is replaced with a new value.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: 544abd44c7 ("perf probe: Allow placing uprobes in alternate namespaces.")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:25:28 -03:00
Riccardo Mancini
2d6b74baa7 perf map: Fix dso->nsinfo refcounting
ASan reports a memory leak of nsinfo during the execution of

  # perf test "31: Lookup mmap thread"

The leak is caused by a refcounted variable being replaced without
dropping the refcount.

This patch makes sure that the refcnt of nsinfo is decreased whenever a
refcounted variable is replaced with a new value.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Fixes: bf2e710b3c ("perf maps: Lookup maps in both intitial mountns and inner mountns.")
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Krister Johansen <kjlx@templeofstupid.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343282.git.rickyman7@gmail.com
[ Split from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-15 17:25:27 -03:00
James Clark
83d1fc92d4 perf cs-etm: Split Coresight decode by aux records
Populate the auxtrace queues using AUX records rather than whole
auxtrace buffers so that the decoder is reset between each aux record.

This is similar to the auxtrace_queues__process_index() ->
auxtrace_queues__add_indexed_event() flow where
perf_session__peek_event() is used to read AUXTRACE events out of random
positions in the file based on the auxtrace index.

But now we loop over all PERF_RECORD_AUX events instead of AUXTRACE
buffers. For each PERF_RECORD_AUX event, we find the corresponding
AUXTRACE buffer using the index, and add a fragment of that buffer to
the auxtrace queues.

No other changes to decoding were made, apart from populating the
auxtrace queues. The result of decoding is identical to before, except
in cases where decoding failed completely, due to not resetting the
decoder.

The reason for this change is because AUX records are emitted any time
tracing is disabled, for example when the process is scheduled out.
Because ETM was disabled and enabled again, the decoder also needs to be
reset to force the search for a sync packet. Otherwise there would be
fatal decoding errors.

Testing
=======

Testing was done with the following script, to diff the decoding results
between the patched and un-patched versions of perf:

	#!/bin/bash
	set -ex

	$1 script -i $3 $4 > split.script
	$2 script -i $3 $4 > default.script

	diff split.script default.script | head -n 20

And it was run like this, with various itrace options depending on the
quantity of synthesised events:

	compare.sh ./perf-patched ./perf-default perf-per-cpu-2-threads.data --itrace=i100000ns

No changes in output were observed in the following scenarios:

* Simple per-cpu
	perf record -e cs_etm/@tmc_etr0/u top

* Per-thread, single thread
	perf record -e cs_etm/@tmc_etr0/u --per-thread ./threads_C

* Per-thread multiple threads (but only one thread collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-thread multiple threads (both threads collected data):
	perf record -e cs_etm/@tmc_etr0/u --per-thread --pid 4596,4597

* Per-cpu explicit threads:
	perf record -e cs_etm/@tmc_etr0/u --pid 853,854

* System-wide (per-cpu):
    perf record -e cs_etm/@tmc_etr0/u -a

* No data collected (no aux buffers)
	Can happen with any command when run for a short period

* Containing truncated records
	Can happen with any command

* Containing aux records with 0 size
	Can happen with any command

* Snapshot mode (various files with and without buffer wrap)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Some differences were observed in the following scenario:

* Snapshot mode (with duplicate buffers)
	perf record -e cs_etm/@tmc_etr0/u -a --snapshot

Fewer samples are generated in snapshot mode if duplicate buffers
were gathered because buffers with the same offset are now only added
once. This gives different, but more correct results and no duplicate
data is decoded any more.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210624164303.28632-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 14:42:36 -03:00
Heiko Carstens
50e98924d7 libperf: Fix build error with LIBPFM4=1
Fix build error with LIBPFM4=1:

    CC      util/pfm.o
  util/pfm.c: In function ‘parse_libpfm_events_option’:
  util/pfm.c:102:30: error: ‘struct evsel’ has no member named ‘leader’
    102 |                         evsel->leader = grp_leader;
        |                              ^~

Committer notes:

There is this entry in 'make -C tools/perf build-test' to test the build
with libpfm:

  $ grep libpfm tools/perf/tests/make
  make_with_libpfm4   := LIBPFM4=1
  run += make_with_libpfm4
  $

But the test machine lacked libpfm-devel, now its installed and further
cases like this shouldn't happen.

Committer testing:

Before this patch this fails, after applying it:

  $ make -C tools/perf build-test
  make: Entering directory '/var/home/acme/git/perf/tools/perf'
  - tarpkg: ./tests/perf-targz-src-pkg .
                   make_static: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1 -j24  DESTDIR=/tmp/tmp.KzFSfvGRQa
  <SNIP>
             make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
           make_with_libpfm4_O: make LIBPFM4=1
         make_install_prefix_O: make install prefix=/tmp/krava
            make_no_auxtrace_O: make NO_AUXTRACE=1
  <SNIP>
  $ rpm -q libpfm-devel
  libpfm-devel-4.11.0-4.fc34.x86_64
  $

FIXME:

This shows a need for 'build-test' to bail out when a build option is
specified that has no required library devel files installed.

Fixes: fba7c86601 ("libperf: Move 'leader' from tools/perf to perf_evsel::leader")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210713091907.1555560-1-hca@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
e0a7ef2a62 perf stat: Merge uncore events by default for hybrid platform
On a hybrid platform, by default 'perf stat' aggregates and reports the
event counts per PMU. For example,

  # perf stat -e cycles -a true

   Performance counter stats for 'system wide':

           1,400,445      cpu_core/cycles/
             680,881      cpu_atom/cycles/

         0.001770773 seconds time elapsed

But for uncore events that's not a suitable method. Uncore has nothing
to do with hybrid. So for uncore events, we aggregate event counts from
all PMUs and report the counts without PMUs.

Before:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               2,058      uncore_arb_0/event=0x81,umask=0x1/
               2,028      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000614498 seconds time elapsed

After:

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ -a true

   Performance counter stats for 'system wide':

               3,996      arb/event=0x81,umask=0x1/
                   0      arb/event=0x84,umask=0x1/

         0.000630046 seconds time elapsed

Of course, we also keep the '--no-merge' working for uncore events.

  # perf stat -e arb/event=0x81,umask=0x1/,arb/event=0x84,umask=0x1/ --no-merge true

   Performance counter stats for 'system wide':

               1,952      uncore_arb_0/event=0x81,umask=0x1/
               1,921      uncore_arb_1/event=0x81,umask=0x1/
                   0      uncore_arb_0/event=0x84,umask=0x1/
                   0      uncore_arb_1/event=0x84,umask=0x1/

         0.000575536 seconds time elapsed

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210707055652.962-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
49afa7f6c7 perf pmu: Skip invalid hybrid pmu
On hybrid platform, such as Alderlake, if atom CPUs are offlined,
the kernel still exports the sysfs path '/sys/devices/cpu_atom/' for
'cpu_atom' pmu but the file '/sys/devices/cpu_atom/cpus' is empty,
which indicates this is an invalid pmu.

Need to check and skip the invalid hybrid pmu.

Before:

  # perf list
  ...
  branch-instructions OR cpu_atom/branch-instructions/ [Kernel PMU event]
  branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
  branch-misses OR cpu_atom/branch-misses/           [Kernel PMU event]
  branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
  bus-cycles OR cpu_atom/bus-cycles/                 [Kernel PMU event]
  bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
  ...

The cpu_atom events are still displayed even if atom CPUs are offlined.

After:

  # perf list
  ...
  branch-instructions OR cpu_core/branch-instructions/ [Kernel PMU event]
  branch-misses OR cpu_core/branch-misses/           [Kernel PMU event]
  bus-cycles OR cpu_core/bus-cycles/                 [Kernel PMU event]
  ...

Now only cpu_core events are displayed.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210708013701.20347-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-14 10:05:35 -03:00
Jin Yao
c47a5599ed perf tools: Fix pattern matching for same substring in different PMU type
Some different PMU types may have the same substring. For example, on
Icelake server we have PMU types "uncore_imc" and
"uncore_imc_free_running". Both PMU types have the substring
"uncore_imc".  But the parser wrongly thinks they are the same PMU type.

We enable an imc event,
perf stat -e uncore_imc/event=0xe3/ -a -- sleep 1

Perf actually expands the event to:

  uncore_imc_0/event=0xe3/
  uncore_imc_1/event=0xe3/
  uncore_imc_2/event=0xe3/
  uncore_imc_3/event=0xe3/
  uncore_imc_4/event=0xe3/
  uncore_imc_5/event=0xe3/
  uncore_imc_6/event=0xe3/
  uncore_imc_7/event=0xe3/
  uncore_imc_free_running_0/event=0xe3/
  uncore_imc_free_running_1/event=0xe3/
  uncore_imc_free_running_3/event=0xe3/
  uncore_imc_free_running_4/event=0xe3/

That's because the "uncore_imc_free_running" matches the
pattern "uncore_imc*".

Now we check that the last characters of PMU name is '_<digit>'.

For example, for pattern "uncore_imc*", "uncore_imc_0" is parsed ok, but
"uncore_imc_free_running_0" fails.

Fixes: b2b9d3a3f0 ("perf pmu: Support wildcards on pmu name in dynamic pmu events")
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Agustin Vega-Frias <agustinv@codeaurora.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210701064253.1175-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
2e6263ab54 libperf: Adopt evlist__set_leader() from tools/perf as perf_evlist__set_leader()
Move the implementation of evlist__set_leader() to a new libperf
perf_evlist__set_leader() function with the same functionality make it a
libperf exported API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
3a683120d8 libperf: Move 'nr_groups' from tools/perf to evlist::nr_groups
Move evsel::nr_groups to perf_evsel::nr_groups, so we can move the group
interface to libperf.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:32 -03:00
Jiri Olsa
fba7c86601 libperf: Move 'leader' from tools/perf to perf_evsel::leader
Move evsel::leader to perf_evsel::leader, so we can move the group
interface to libperf.

Also add several evsel helpers to ease up the transition:

  struct evsel *evsel__leader(struct evsel *evsel);
  - get leader evsel

  bool evsel__has_leader(struct evsel *evsel, struct evsel *leader);
  - true if evsel has leader as leader

  bool evsel__is_leader(struct evsel *evsel);
  - true if evsel is itw own leader

  void evsel__set_leader(struct evsel *evsel, struct evsel *leader);
  - set leader for evsel

Committer notes:

Fix this when building with 'make BUILD_BPF_SKEL=1'

  tools/perf/util/bpf_counter.c

  -       if (evsel->leader->core.nr_members > 1) {
  +       if (evsel->core.leader->nr_members > 1) {

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:31 -03:00
Jiri Olsa
38fe0e0156 libperf: Move 'idx' from tools/perf to perf_evsel::idx
Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.

Committer notes:

Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.

Also fixed up these:

$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c:                      evsel->idx + i);
tools/perf/ui/gtk/annotate.c:                   evsel->idx);
$

That running 'make -C tools/perf build-test' caught.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-09 14:04:28 -03:00
Adrian Hunter
b4b046ff9e perf intel-pt: Add a config for max loops without consuming a packet
The Intel PT decoder limits the number of unconditional branches (e.g.
jmps) decoded without consuming any trace packets. Generally, a loop
needs a conditional branch which generates a TNT packet, whereas a "ret"
instruction will generate a TIP or TNT packet. So exceeding the limit is
assumed to be a never-ending loop, which can happen if there has been a
decoding error putting the decoder at the wrong place in the code.

Up until now, the limit of 10000 has been enough but some analytic
purposes have been reported to exceed that.

Increase the limit to 100000, and make it configurable via perf config
intel-pt.max-loops. Also amend the "Never-ending loop" message to
mention the configuration entry.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20210701175132.3977-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:40:56 -03:00
Jin Yao
493be70ac3 perf stat: Disable the NMI watchdog message on hybrid
If we run a single workload that only runs on big core, there is always
a ugly message about disabling the NMI watchdog because the atom is not
counted.

Before:

  # ./perf stat true

   Performance counter stats for 'true':

                0.43 msec task-clock                #    0.396 CPUs utilized
                   0      context-switches          #    0.000 /sec
                   0      cpu-migrations            #    0.000 /sec
                  45      page-faults               #  103.918 K/sec
             639,634      cpu_core/cycles/          #    1.477 G/sec
       <not counted>      cpu_atom/cycles/                                              (0.00%)
             643,498      cpu_core/instructions/    #    1.486 G/sec
       <not counted>      cpu_atom/instructions/                                        (0.00%)
             123,715      cpu_core/branches/        #  285.694 M/sec
       <not counted>      cpu_atom/branches/                                            (0.00%)
               4,094      cpu_core/branch-misses/   #    9.454 M/sec
       <not counted>      cpu_atom/branch-misses/                                       (0.00%)

         0.001092407 seconds time elapsed

         0.001144000 seconds user
         0.000000000 seconds sys

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog

  # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true

   Performance counter stats for 'true':

       <not counted>      cpu_atom/cycles/                                              (0.00%)
       <not counted>      msr/tsc/                                                      (0.00%)

         0.001904106 seconds time elapsed

         0.001947000 seconds user
         0.000000000 seconds sys

  Some events weren't counted. Try disabling the NMI watchdog:
          echo 0 > /proc/sys/kernel/nmi_watchdog
          perf stat ...
          echo 1 > /proc/sys/kernel/nmi_watchdog
  The events in group usually have to be from the same PMU. Try reorganizing the group.

Now we disable the NMI watchdog message on hybrid, otherwise there
are too many false positives.

After:

  # ./perf stat true

   Performance counter stats for 'true':

                0.79 msec task-clock                #    0.419 CPUs utilized
                   0      context-switches          #    0.000 /sec
                   0      cpu-migrations            #    0.000 /sec
                  48      page-faults               #   60.889 K/sec
             777,692      cpu_core/cycles/          #  986.519 M/sec
       <not counted>      cpu_atom/cycles/                                              (0.00%)
             669,147      cpu_core/instructions/    #  848.828 M/sec
       <not counted>      cpu_atom/instructions/                                        (0.00%)
             128,635      cpu_core/branches/        #  163.176 M/sec
       <not counted>      cpu_atom/branches/                                            (0.00%)
               4,089      cpu_core/branch-misses/   #    5.187 M/sec
       <not counted>      cpu_atom/branch-misses/                                       (0.00%)

         0.001880649 seconds time elapsed

         0.001935000 seconds user
         0.000000000 seconds sys

  # ./perf stat -e '{cpu_atom/cycles/,msr/tsc/}' true

   Performance counter stats for 'true':

       <not counted>      cpu_atom/cycles/                                              (0.00%)
       <not counted>      msr/tsc/                                                      (0.00%)

         0.000963319 seconds time elapsed

         0.000999000 seconds user
         0.000000000 seconds sys

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210610034557.29766-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:37:23 -03:00
Kajol Jain
dea8cfcc33 perf script python: Fix buffer size to report iregs in perf script
Commit 48a1f56526 ("perf script python: Add more PMU fields to
event handler dict") added functionality to report fields like weight,
iregs, uregs etc via perf report.  That commit predefined buffer size to
512 bytes to print those fields.

But in PowerPC, since we added extended regs support in:

  068aeea377 ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs")
  d735599a06 ("powerpc/perf: Add extended regs support for power10 platform")

Now iregs can carry more bytes of data and this predefined buffer size
can result to data loss in perf script output.

This patch resolves this issue by making the buffer size dynamic, based
on the number of registers needed to print. It also changes the
regs_map() return type from int to void, as it is not being used by the
set_regs_in_dict(), its only caller.

Fixes: 068aeea377 ("perf powerpc: Support exposing Performance Monitor Counter SPRs as part of extended regs")
Signed-off-by: Kajol Jain <kjain@linux.ibm.com>
Tested-by: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20210628062341.155839-1-kjain@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 11:15:44 -03:00
Riccardo Mancini
83952286f2 perf top: Fix overflow in elf_sec__is_text()
ASan reports a heap-buffer-overflow in elf_sec__is_text when using perf-top.

The bug is caused by the fact that secstrs is built from runtime_ss, while
shdr is built from syms_ss if shdr.sh_type != SHT_NOBITS. Therefore, they
point to two different ELF files.

This patch renames secstrs to secstrs_run and adds secstrs_sym, so that
the correct secstrs is chosen depending on shdr.sh_type.

  $ ASAN_OPTIONS=abort_on_error=1:disable_coredump=0:unmap_shadow_on_exit=1 ./perf top
  =================================================================
  ==363148==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61300009add6 at pc 0x00000049875c bp 0x7f4f56446440 sp 0x7f4f56445bf0
  READ of size 1 at 0x61300009add6 thread T6
    #0 0x49875b in StrstrCheck(void*, char*, char const*, char const*) (/home/user/linux/tools/perf/perf+0x49875b)
    #1 0x4d13a2 in strstr (/home/user/linux/tools/perf/perf+0x4d13a2)
    #2 0xacae36 in elf_sec__is_text /home/user/linux/tools/perf/util/symbol-elf.c:176:9
    #3 0xac3ec9 in elf_sec__filter /home/user/linux/tools/perf/util/symbol-elf.c:187:9
    #4 0xac2c3d in dso__load_sym /home/user/linux/tools/perf/util/symbol-elf.c:1254:20
    #5 0x883981 in dso__load /home/user/linux/tools/perf/util/symbol.c:1897:9
    #6 0x8e6248 in map__load /home/user/linux/tools/perf/util/map.c:332:7
    #7 0x8e66e5 in map__find_symbol /home/user/linux/tools/perf/util/map.c:366:6
    #8 0x7f8278 in machine__resolve /home/user/linux/tools/perf/util/event.c:707:13
    #9 0x5f3d1a in perf_event__process_sample /home/user/linux/tools/perf/builtin-top.c:773:6
    #10 0x5f30e4 in deliver_event /home/user/linux/tools/perf/builtin-top.c:1197:3
    #11 0x908a72 in do_flush /home/user/linux/tools/perf/util/ordered-events.c:244:9
    #12 0x905fae in __ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:323:8
    #13 0x9058db in ordered_events__flush /home/user/linux/tools/perf/util/ordered-events.c:341:9
    #14 0x5f19b1 in process_thread /home/user/linux/tools/perf/builtin-top.c:1109:7
    #15 0x7f4f6a21a298 in start_thread /usr/src/debug/glibc-2.33-16.fc34.x86_64/nptl/pthread_create.c:481:8
    #16 0x7f4f697d0352 in clone ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

0x61300009add6 is located 10 bytes to the right of 332-byte region [0x61300009ac80,0x61300009adcc)
allocated by thread T6 here:

    #0 0x4f3f7f in malloc (/home/user/linux/tools/perf/perf+0x4f3f7f)
    #1 0x7f4f6a0a88d9  (/lib64/libelf.so.1+0xa8d9)

Thread T6 created by T0 here:

    #0 0x464856 in pthread_create (/home/user/linux/tools/perf/perf+0x464856)
    #1 0x5f06e0 in __cmd_top /home/user/linux/tools/perf/builtin-top.c:1309:6
    #2 0x5ef19f in cmd_top /home/user/linux/tools/perf/builtin-top.c:1762:11
    #3 0x7b28c0 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #4 0x7b119f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #5 0x7b2423 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #6 0x7b0c19 in main /home/user/linux/tools/perf/perf.c:539:3
    #7 0x7f4f696f7b74 in __libc_start_main /usr/src/debug/glibc-2.33-16.fc34.x86_64/csu/../csu/libc-start.c:332:16

  SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/user/linux/tools/perf/perf+0x49875b) in StrstrCheck(void*, char*, char const*, char const*)
  Shadow bytes around the buggy address:
    0x0c268000b560: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b570: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b580: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  =>0x0c268000b5b0: 00 00 00 00 00 00 00 00 00 04[fa]fa fa fa fa fa
    0x0c268000b5c0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
    0x0c268000b5d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0c268000b5f0: 07 fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x0c268000b600: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
  ==363148==ABORTING

Suggested-by: Jiri Slaby <jirislaby@kernel.org>
Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Link: http://lore.kernel.org/lkml/20210621222108.196219-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Masami Hiramatsu
87704345cc perf symbol-elf: Decode dynsym even if symtab exists
In Fedora34, libc-2.33.so has both .dynsym and .symtab sections and
most of (not all) symbols moved to .dynsym. In this case, perf only
decode the symbols in .symtab, and perf probe can not list up the
functions in the library.

To fix this issue, decode both .symtab and .dynsym sections.

Without this fix,
  -----
  $ ./perf probe -x /usr/lib64/libc-2.33.so -F
  @plt
  @plt
  calloc@plt
  free@plt
  malloc@plt
  memalign@plt
  realloc@plt
  -----

With this fix.

  -----
  $ ./perf probe -x /usr/lib64/libc-2.33.so -F
  @plt
  @plt
  a64l
  abort
  abs
  accept
  accept4
  access
  acct
  addmntent
  -----

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532652681.393143.10163733179955267999.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:10 -03:00
Masami Hiramatsu
eb4717f733 perf probe: Fix debuginfo__new() to enable build-id based debuginfo
Fix debuginfo__new() to set the build-id to dso before
dso__read_binary_type_filename() so that it can find
DSO_BINARY_TYPE__BUILDID_DEBUGINFO debuginfo correctly.

However, this may not change the result, because elfutils (libdwfl) has
its own debuginfo finder. With/without this patch, the perf probe
correctly find the debuginfo file.

This is just a failsafe and keep code's sanity (if you use
dso__read_binary_type_filename(), you must set the build-id to the dso.)

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Masami Hiramatsu <mhriamat@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Stefan Liebler <stli@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162532651863.393143.11692691321219235810.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-07 10:28:07 -03:00
Namhyung Kim
944138f048 perf stat: Enable BPF counter with --for-each-cgroup
Recently bperf was added to use BPF to count perf events for various
purposes.  This is an extension for the approach and targetting to
cgroup usages.

Unlike the other bperf, it doesn't share the events with other
processes but it'd reduce unnecessary events (and the overhead of
multiplexing) for each monitored cgroup within the perf session.

When --for-each-cgroup is used with --bpf-counters, it will open
cgroup-switches event per cpu internally and attach the new BPF
program to read given perf_events and to aggregate the results for
cgroups.  It's only called when task is switched to a task in a
different cgroup.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210701211227.1403788-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-05 14:16:57 -03:00
Riccardo Mancini
cf96b8e45a perf session: Add missing evlist__delete when deleting a session
ASan reports a memory leak caused by evlist not being deleted on exit in
perf-report, perf-script and perf-data.
The problem is caused by evlist->session not being deleted, which is
allocated in perf_session__read_header, called in perf_session__new if
perf_data is in read mode.
In case of write mode, the session->evlist is filled by the caller.
This patch solves the problem by calling evlist__delete in
perf_session__delete if perf_data is in read mode.

Changes in v2:
 - call evlist__delete from within perf_session__delete

v1: https://lore.kernel.org/lkml/20210621234317.235545-1-rickyman7@gmail.com/

ASan report follows:

$ ./perf script report flamegraph
=================================================================
==227640==ERROR: LeakSanitizer: detected memory leaks

<SNIP unrelated>

Indirect leak of 2704 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x7f999e in evlist__new /home/user/linux/tools/perf/util/evlist.c:77:26
    #3 0x8ad938 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3797:20
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 568 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0x80ce88 in evsel__new_idx /home/user/linux/tools/perf/util/evsel.c:268:24
    #3 0x8aed93 in evsel__new /home/user/linux/tools/perf/util/evsel.h:210:9
    #4 0x8ae07e in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3853:11
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 264 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbe3e70 in xyarray__new /home/user/linux/tools/lib/perf/xyarray.c:10:23
    #3 0xbd7754 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:361:21
    #4 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #5 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #6 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #7 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #8 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #9 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #10 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #11 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #12 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x4f4137 in calloc (/home/user/linux/tools/perf/perf+0x4f4137)
    #1 0xbe3d56 in zalloc /home/user/linux/tools/lib/perf/../../lib/zalloc.c:8:9
    #2 0xbd77e0 in perf_evsel__alloc_id /home/user/linux/tools/lib/perf/evsel.c:365:14
    #3 0x8ae201 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3871:7
    #4 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #5 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #6 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #7 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #8 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #9 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #10 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #11 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

Indirect leak of 7 byte(s) in 1 object(s) allocated from:
    #0 0x4b8207 in strdup (/home/user/linux/tools/perf/perf+0x4b8207)
    #1 0x8b4459 in evlist__set_event_name /home/user/linux/tools/perf/util/header.c:2292:16
    #2 0x89d862 in process_event_desc /home/user/linux/tools/perf/util/header.c:2313:3
    #3 0x8af319 in perf_file_section__process /home/user/linux/tools/perf/util/header.c:3651:9
    #4 0x8aa6e9 in perf_header__process_sections /home/user/linux/tools/perf/util/header.c:3427:9
    #5 0x8ae3e7 in perf_session__read_header /home/user/linux/tools/perf/util/header.c:3886:2
    #6 0x8ec714 in perf_session__open /home/user/linux/tools/perf/util/session.c:109:6
    #7 0x8ebe83 in perf_session__new /home/user/linux/tools/perf/util/session.c:213:10
    #8 0x60c6de in cmd_script /home/user/linux/tools/perf/builtin-script.c:3856:12
    #9 0x7b2930 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
    #10 0x7b120f in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
    #11 0x7b2493 in run_argv /home/user/linux/tools/perf/perf.c:409:2
    #12 0x7b0c89 in main /home/user/linux/tools/perf/perf.c:539:3
    #13 0x7f5260654b74  (/lib64/libc.so.6+0x27b74)

SUMMARY: AddressSanitizer: 3728 byte(s) leaked in 7 allocation(s).

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210624231926.212208-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
ec4c00fedb perf dlfilter: Add object_code() to perf_dlfilter_fns
Add a function, for use by dlfilters, to read object code.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
6495e76252 perf dlfilter: Add attr() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return the perf_event_attr
structure.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
244afc0c93 perf dlfilter: Add srcline() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return source code file name and
line number.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
e35995effd perf dlfilter: Add insn() to perf_dlfilter_fns
Add a function, for use by dlfilters, to return instruction bytes.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
f645744c50 perf dlfilter: Add resolve_address() to perf_dlfilter_fns
Add a function, for use by dlfilters, to resolve addresses from branch
stacks or callchains.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:38 -03:00
Adrian Hunter
3d032a2516 perf script: Add option to pass arguments to dlfilters
Add option --dlarg to pass arguments to dlfilters. The --dlarg option can
be repeated to pass more than 1 argument.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
638e2b9984 perf script: Add option to list dlfilters
Add option --list-dlfilters to list dlfilters in the current directory or
the exec-path e.g. ~/libexec/perf-core/dlfilters. Use with option -v (must
come before option --list-dlfilters) to show long descriptions.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
9bde93a79a perf script: Add dlfilter__filter_event_early()
filter_event_early() can be more than 30% faster than filter_event()
because it is called before internal filtering. In other respects it
is the same as filter_event(), except that it will be passed events
that have yet to be filtered out.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Adrian Hunter
291961fc3c perf script: Add API for filtering via dynamically loaded shared object
In some cases, users want to filter very large amounts of data (e.g.
from AUX area tracing like Intel PT) looking for something specific.
While scripting such as Python can be used, Python is 10 to 20 times
slower than C. So define a C API so that custom filters can be written
and loaded.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210627131818.810-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Arnaldo Carvalho de Melo
c435c166dc perf llvm: Return -ENOMEM when asprintf() fails
Zhihao sent a patch but it made llvm__compile_bpf() return what
asprintf() returns on error, which is just -1, but since this function
returns -errno, fix it by returning -ENOMEM for this case instead.

Fixes: cb76371441 ("perf llvm: Allow passing options to llc ...")
Fixes: 5eab5a7ee0 ("perf llvm: Display eBPF compiling command ...")
Reported-by: Hulk Robot <hulkci@huawei.com>
Reported-by: Zhihao Cheng <chengzhihao1@huawei.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yu Kuai <yukuai3@huawei.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20210609115945.2193194-1-chengzhihao1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
James Clark
0323dea318 perf cs-etm: Delay decode of non-timeless data until cs_etm__flush_events()
Currently, timeless mode starts the decode on PERF_RECORD_EXIT, and
non-timeless mode starts decoding on the fist PERF_RECORD_AUX record.

This can cause the "data has no samples!" error if the first
PERF_RECORD_AUX record comes before the first (or any relevant)
PERF_RECORD_MMAP2 record because the mmaps are required by the decoder
to access the binary data.

This change pushes the start of non-timeless decoding to the very end of
parsing the file. The PERF_RECORD_EXIT event can't be used because it
might not exist in system-wide or snapshot modes.

I have not been able to find the exact cause for the events to be
intermittently in the wrong order in the basic scenario:

	perf record -e cs_etm/@tmc_etr0/u top

But it can be made to happen every time with the --delay option. This is
because "enable_on_exec" is disabled, which causes tracing to start
before the process to be launched is exec'd. For example:

	perf record -e cs_etm/@tmc_etr0/u --delay=1 top
	perf report -D | grep 'AUX\|MAP'

	0 16714475632740 0x520 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []
	0 16714476494960 0x5d0 [0x40]: PERF_RECORD_AUX offset: 0x30 size: 0x30 flags: 0 []
	0 16714478208900 0x660 [0x40]: PERF_RECORD_AUX offset: 0x60 size: 0x30 flags: 0 []
	4294967295 16714478293340 0x700 [0x70]: PERF_RECORD_MMAP2 8712/8712: [0x557a460000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16714478353020 0x770 [0x88]: PERF_RECORD_MMAP2 8712/8712: [0x7f86f72000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so

Another scenario in which decoding from the first aux record fails is a
workload that forks. Although the aux record comes after 'bash', it
comes before 'top', which is what we are interested in. For example:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf report -D | grep 'AUX\|MAP'

	4294967295 16853946421300 0x510 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x558f280000(0x142000) @ 0 00:17 5213953 0]: r-xp /usr/bin/bash
	4294967295 16853946543560 0x580 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba6e000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853946628420 0x608 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7fbba9e000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]
	0 16853947067300 0x690 [0x40]: PERF_RECORD_AUX offset: 0 size: 0x3a60 flags: 0 []
	...
	0 16853966602580 0x1758 [0x40]: PERF_RECORD_AUX offset: 0xc2470 size: 0x30 flags: 0 []
	4294967295 16853967119860 0x1818 [0x70]: PERF_RECORD_MMAP2 8723/8723: [0x5559e70000(0x54000) @ 0 00:17 5329258 0]: r-xp /usr/bin/top
	4294967295 16853967181620 0x1888 [0x88]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed06000(0x34000) @ 0 00:17 5214354 0]: r-xp /usr/lib/aarch64-linux-gnu/ld-2.31.so
	4294967295 16853967237180 0x1910 [0x68]: PERF_RECORD_MMAP2 8723/8723: [0x7f9ed36000(0x1000) @ 0 00:00 0 0]: r-xp [vdso]

A third scenario is when the majority of time is spent in a shared
library that is not loaded at startup. For example a dynamically loaded
plugin.

Testing
=======

Testing was done by checking if any samples that are present in the
old output are missing from the new output. Timestamps must be
stripped out with awk because now they are set to the last AUX sample,
rather than the first:

	./perf script $4 | awk '!($4="")' > new.script
	./perf-default script $4 | awk '!($4="")' > default.script
	comm -13 <(sort -u new.script) <(sort -u default.script)

Testing showed that the new output is a superset of the old. When lines
appear in the comm output, it is not because they are missing but
because [unknown] is now resolved to sensible locations. For example
last putp branch here now resolves to libtinfo, so it's not missing
from the output, but is actually improved:

Old:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 0 [unknown] ([unknown])
New:
	top 305 [001]  1 branches:uH: 402830 _init+0x30 (/usr/bin/top.procps) => 404a1c [unknown] (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 404a20 [unknown] (/usr/bin/top.procps) => 402970 putp@plt+0x0 (/usr/bin/top.procps)
	top 305 [001]  1 branches:uH: 40297c putp@plt+0xc (/usr/bin/top.procps) => 7f8ab39208 putp+0x0 (/lib/libtinfo.so.5.9)

In the following two modes, decoding now works and the "data has no
samples!" error is not displayed any more:

	perf record -e cs_etm/@tmc_etr0/u -- bash -c top
	perf record -e cs_etm/@tmc_etr0/u --delay=1 top

In snapshot mode, there is also an improvement to decoding. Previously
samples for the 'kill' process that was used to send SIGUSR2 were
completely missing, because the process hadn't started yet. But now
there are additional samples present:

	perf record -e cs_etm/@tmc_etr0/u --snapshot -a
	perf script

		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      aaaabb612fb4 [unknown] (/usr/bin/stress)
		  kill 19644 [000] 161627.938153:    1000000    instructions:uH:      ffffae0ef210 [unknown] (/lib/aarch64-linux-gnu/ld-2.27.so)
		stress 19380 [003] 161627.938153:    1000000    instructions:uH:      ffff9e754d40 random_r+0x20 (/lib/aarch64-linux-gnu/libc-2.27.so)

Also tested was the round trip of 'perf inject' followed by 'perf
report' which has the same differences and improvements.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Al Grant <al.grant@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Branislav Rankov <branislav.rankov@arm.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20210609130421.13934-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:37 -03:00
Leo Yan
8941ba502f perf arm-spe: Don't wait for PERF_RECORD_EXIT event
When decode Arm SPE trace, it waits for PERF_RECORD_EXIT event (the last
perf event) for processing trace data, which is needless and even might
cause logic error, e.g. it might fail to correlate perf events with Arm
SPE events correctly.

So this patch removes the condition checking for PERF_RECORD_EXIT event.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20210519071939.1598923-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
afb5e9e47f perf arm-spe: Bail out if the trace is later than perf event
It's possible that record in Arm SPE trace is later than perf event and
vice versa.  This asks to correlate the perf events and Arm SPE
synthesized events to be processed in the manner of correct timing.

To achieve the time ordering, this patch reverses the flow, it firstly
calls arm_spe_sample() and then calls arm_spe_decode().  By comparing
the timestamp value and detect the perf event is coming earlier than Arm
SPE trace data, it bails out from the decoding loop, the last record is
pushed into auxtrace stack and is deferred to generate sample.  To track
the timestamp, everytime it updates timestamp for the latest record.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-5-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
85498f756f perf arm-spe: Assign kernel time to synthesized event
In current code, it assigns the arch timer counter to the synthesized
samples Arm SPE trace, thus the samples don't contain the kernel time
but only contain the raw counter value.

To fix the issue, this patch converts the timer counter to kernel time
and assigns it to sample timestamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
630519014c perf arm-spe: Convert event kernel time to counter value
When handle a perf event, Arm SPE decoder needs to decide if this perf
event is earlier or later than the samples from Arm SPE trace data; to
do comparision, it needs to use the same unit for the time.

This patch converts the event kernel time to arch timer's counter value,
thus it can be used to compare with counter value contained in Arm SPE
Timestamp packet.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Leo Yan
c210c30696 perf arm-spe: Save clock parameters from TIME_CONV event
During the recording phase, "perf record" tool synthesizes event
PERF_RECORD_TIME_CONV for the hardware clock parameters and saves the
event into the data file.

Afterwards, when processing the data file, the event TIME_CONV will be
processed at the very early time and is stored into session context.

This patch extracts these parameters from the session context and saves
into the structure "spe->tc" with the type perf_tsc_conversion, so that
the parameters are ready for conversion between clock counter and time
stamp.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Grant <Al.Grant@arm.com>
Cc: Dave Martin <Dave.Martin@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20210519071939.1598923-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:36 -03:00
Namhyung Kim
d6a735ef32 perf bpf_counter: Move common functions to bpf_counter.h
Some helper functions will be used for cgroup counting too.  Move them
to a header file for sharing.

Committer notes:

Fix the build on older systems with:

  -       struct bpf_map_info map_info = {0};
  +       struct bpf_map_info map_info = { .id = 0, };

This wasn't breaking the build in such systems as bpf_counter.c isn't
built due to:

tools/perf/util/Build:

  perf-$(CONFIG_PERF_BPF_SKEL) += bpf_counter.o

The bpf_counter.h file on the other hand is included from places that
are built everywhere.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 16:14:19 -03:00
Namhyung Kim
21bcc72661 perf tools: Add cgroup_is_v2() helper
The cgroup_is_v2() is to check if the given subsystem is mounted on
cgroup v2 or not.  It'll be used by BPF cgroup code later.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 15:00:33 -03:00
Namhyung Kim
69e874db4d perf tools: Add read_cgroup_id() function
The read_cgroup_id() is to read a cgroup id from a file handle using
name_to_handle_at(2) for the given cgroup.  It'll be used by bperf
cgroup stat later.

Committer notes:

  -int read_cgroup_id(struct cgroup *cgrp)
  +static inline int read_cgroup_id(struct cgroup *cgrp __maybe_unused)

To fix the build when HAVE_FILE_HANDLE is not defined.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210625071826.608504-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-07-01 15:00:03 -03:00
Arnaldo Carvalho de Melo
ce09673636 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes, since perf/urgent is already upstream.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-22 13:56:50 -03:00
Riccardo Mancini
c087e9480c perf machine: Fix refcount usage when processing PERF_RECORD_KSYMBOL
ASan reported a memory leak of BPF-related ksymbols map and dso. The
leak is caused by refount never reaching 0, due to missing __put calls
in the function machine__process_ksymbol_register.

Once the dso is inserted in the map, dso__put() should be called
(map__new2() increases the refcount to 2).

The same thing applies for the map when it's inserted into maps
(maps__insert() increases the refcount to 2).

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 6992 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8e4e53 in map__new2 /home/user/linux/tools/perf/util/map.c:216:20
      #2 0x8cf68c in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:778:10
      [...]

  Indirect leak of 8702 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x8728d7 in dso__new_id /home/user/linux/tools/perf/util/dso.c:1256:20
      #2 0x872015 in dso__new /home/user/linux/tools/perf/util/dso.c:1295:9
      #3 0x8cf623 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:774:21
      [...]

  Indirect leak of 1520 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x888954 in map__process_kallsym_symbol /home/user/linux/tools/perf/util/symbol.c:710:8
      [...]

  Indirect leak of 1406 byte(s) in 19 object(s) allocated from:
      #0 0x4f43c7 in calloc (/home/user/linux/tools/perf/perf+0x4f43c7)
      #1 0x87b3da in symbol__new /home/user/linux/tools/perf/util/symbol.c:269:23
      #2 0x8cfbd8 in machine__process_ksymbol_register /home/user/linux/tools/perf/util/machine.c:803:8
      [...]

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tommi Rantala <tommi.t.rantala@nokia.com>
Link: http://lore.kernel.org/lkml/20210612173751.188582-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fe7a98b9d9 perf metricgroup: Return error code from metricgroup__add_metric_sys_event_iter()
The error code is not set at all in the sys event iter function.

This may lead to an uninitialized value of "ret" in
metricgroup__add_metric() when no CPU metric is added.

Fix by properly setting the error code.

It is not necessary to init "ret" to 0 in metricgroup__add_metric(), as
if we have no CPU or sys event metric matching, then "has_match" should
be 0 and "ret" is set to -EINVAL.

However gcc cannot detect that it may not have been set after the
map_for_each_metric() loop for CPU metrics, which is strange.

Fixes: be335ec28e ("perf metricgroup: Support adding metrics for system PMUs")
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
John Garry
fc96ec4d5d perf metricgroup: Fix find_evsel_group() event selector
The following command segfaults on my x86 broadwell:

  $ ./perf stat  -M frontend_bound,retiring,backend_bound,bad_speculation sleep 1
  WARNING: grouped events cpus do not match, disabling group:
    anon group { raw 0x10e }
    anon group { raw 0x10e }
  perf: util/evsel.c:1596: get_group_fd: Assertion `!(!leader->core.fd)' failed.
  Aborted (core dumped)

The issue shows itself as a use-after-free in evlist__check_cpu_maps(),
whereby the leader of an event selector (evsel) has been deleted (yet we
still attempt to verify for an evsel).

Fundamentally the problem comes from metricgroup__setup_events() ->
find_evsel_group(), and has developed from the previous fix attempt in
commit 9c880c24cb ("perf metricgroup: Fix for metrics containing
duration_time").

The problem now is that the logic in checking if an evsel is in the same
group is subtly broken for the "cycles" event. For the "cycles" event,
the pmu_name is NULL; however the logic in find_evsel_group() may set an
event matched against "cycles" as used, when it should not be.

This leads to a condition where an evsel is set, yet its leader is not.

Fix the check for evsel pmu_name by not matching evsels when either has a
NULL pmu_name.

There is still a pre-existing metric issue whereby the ordering of the
metrics may break the 'stat' function, as discussed at:
https://lore.kernel.org/lkml/49c6fccb-b716-1bf0-18a6-cace1cdb66b9@huawei.com/

Fixes: 9c880c24cb ("perf metricgroup: Fix for metrics containing duration_time")
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> # On a Thinkpad T450S
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623335580-187317-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-19 10:06:46 -03:00
Masami Hiramatsu
45237f9898 perf probe: Add --bootconfig to output definition in bootconfig format
Now the boot-time tracing supports kprobes events and that must be
written in bootconfig file in the following format.

  ftrace.event.kprobes.<EVENT_NAME>.probes = <PROBE-DEF>

'perf probe' already supports --definition (-D) action to show probe
definitions, but the format is for tracefs:

  [p|r][:EVENT_NAME] <PROBE-DEF>

This patch adds the --bootconfig option for -D action so that it outputs
the probe definitions in bootconfig format. E.g.

  $ perf probe --bootconfig -D "path_lookupat:7 err:s32 s:string"
  ftrace.event.kprobes.path_lookupat_L7.probe = 'path_lookupat.isra.0+309 err_s32=%ax:s32 s_string=+0(%r13):string'

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282412351.452340.14871995440005640114.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Masami Hiramatsu
d26ea48144 perf probe: Cleanup synthesize_probe_trace_command()
Cleanup synthesize_probe_trace_command() to simplify the code path.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282411361.452340.16886399333622147122.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Masami Hiramatsu
f338de2219 perf probe: Support probes on init functions for offline kernel
'perf probe' internally checks the probe target is in the text area in
post-process (after analyzing debuginfo). But it fails if the probe
target is in the "inittext".

This is a good limitation for the online kernel because such functions
have gone after booting. However, for using it for boot-time tracing,
user may want to put a probe on init functions.

This skips the post checking process if the target is offline kenrel so
that user can get the probe definition on the init functions.

Without this patch:

  $ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
  Probe point 'do_mount_root:10' not found.
    Error: Failed to add events.

With this patch:

  $ perf probe -k ./build-x86_64/vmlinux -D do_mount_root:10
  p:probe/do_mount_root_L10 mount_block_root+300

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/162282410293.452340.13347006295826431632.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-18 13:50:05 -03:00
Li Huafei
28b8e87abf perf mem-events: Remove duplicate #undef
Remove duplicate '#undef E'.

Signed-off-by: Li Huafei <lihuafei1@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zhang Jinhao <zhangjinhao2@huawei.com>
Link: http://lore.kernel.org/lkml/20210616120339.219807-1-lihuafei1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-16 15:05:24 -03:00
Leo Yan
197eecb6ec perf session: Correct buffer copying when peeking events
When peeking an event, it has a short path and a long path.  The short
path uses the session pointer "one_mmap_addr" to directly fetch the
event; and the long path needs to read out the event header and the
following event data from file and fill into the buffer pointer passed
through the argument "buf".

The issue is in the long path that it copies the event header and event
data into the same destination address which pointer "buf", this means
the event header is overwritten.  We are just lucky to run into the
short path in most cases, so we don't hit the issue in the long path.

This patch adds the offset "hdr_sz" to the pointer "buf" when copying
the event data, so that it can reserve the event header which can be
used properly by its caller.

Fixes: 5a52f33adf ("perf session: Add perf_session__peek_event()")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210605052957.1070720-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-11 12:54:24 -03:00
Jin Yao
1fcc57b7e5 perf evsel: Adjust hybrid event and global event mixed group
A group mixed with hybrid event and global event is allowed. For
example, group leader is 'intel_pt//' and the group member is
'cpu_atom/cycles/'.

e.g.:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u'

The challenge is that their available cpus are not fully matched. For
example, 'intel_pt//' is available on CPU0-CPU23, but 'cpu_atom/cycles/'
is available on CPU16-CPU23.

When getting the group id for group member, we must be very careful.
Because the cpu for 'intel_pt//' is not equal to the cpu for
'cpu_atom/cycles/'. Actually the cpu here is the index of evsel->core.cpus,
not the real CPU ID.

e.g. cpu0 for 'intel_pt//' is CPU0, but cpu0 for 'cpu_atom/cycles/' is CPU16.

Before:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 4084  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 4084  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 4084  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 4084  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 4084  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 4084  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 4084  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 4084  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 4084  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 4084  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 4084  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 4084  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 4084  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 4084  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 4084  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 4084  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 4084  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 4084  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 4084  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 4084  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 4084  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 4084  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 4084  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 4084  cpu 16  group_fd 5  flags 0x8
  sys_perf_event_open failed, error -22

The group_fd 5 is not correct. It should be 22 (the fd of
'intel_pt' on CPU16).

After:

  # perf record --aux-sample -e '{intel_pt//,cpu_atom/cycles/}:u' -vv uname
  ...
  ------------------------------------------------------------
  perf_event_attr:
    type                             10
    size                             128
    config                           0xe601
    { sample_period, sample_freq }   1
    sample_type                      IP|TID|TIME|CPU|IDENTIFIER
    read_format                      ID
    disabled                         1
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    enable_on_exec                   1
    sample_id_all                    1
    exclude_guest                    1
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 0  group_fd -1  flags 0x8 = 5
  sys_perf_event_open: pid 5162  cpu 1  group_fd -1  flags 0x8 = 6
  sys_perf_event_open: pid 5162  cpu 2  group_fd -1  flags 0x8 = 7
  sys_perf_event_open: pid 5162  cpu 3  group_fd -1  flags 0x8 = 9
  sys_perf_event_open: pid 5162  cpu 4  group_fd -1  flags 0x8 = 10
  sys_perf_event_open: pid 5162  cpu 5  group_fd -1  flags 0x8 = 11
  sys_perf_event_open: pid 5162  cpu 6  group_fd -1  flags 0x8 = 12
  sys_perf_event_open: pid 5162  cpu 7  group_fd -1  flags 0x8 = 13
  sys_perf_event_open: pid 5162  cpu 8  group_fd -1  flags 0x8 = 14
  sys_perf_event_open: pid 5162  cpu 9  group_fd -1  flags 0x8 = 15
  sys_perf_event_open: pid 5162  cpu 10  group_fd -1  flags 0x8 = 16
  sys_perf_event_open: pid 5162  cpu 11  group_fd -1  flags 0x8 = 17
  sys_perf_event_open: pid 5162  cpu 12  group_fd -1  flags 0x8 = 18
  sys_perf_event_open: pid 5162  cpu 13  group_fd -1  flags 0x8 = 19
  sys_perf_event_open: pid 5162  cpu 14  group_fd -1  flags 0x8 = 20
  sys_perf_event_open: pid 5162  cpu 15  group_fd -1  flags 0x8 = 21
  sys_perf_event_open: pid 5162  cpu 16  group_fd -1  flags 0x8 = 22
  sys_perf_event_open: pid 5162  cpu 17  group_fd -1  flags 0x8 = 23
  sys_perf_event_open: pid 5162  cpu 18  group_fd -1  flags 0x8 = 24
  sys_perf_event_open: pid 5162  cpu 19  group_fd -1  flags 0x8 = 25
  sys_perf_event_open: pid 5162  cpu 20  group_fd -1  flags 0x8 = 26
  sys_perf_event_open: pid 5162  cpu 21  group_fd -1  flags 0x8 = 27
  sys_perf_event_open: pid 5162  cpu 22  group_fd -1  flags 0x8 = 28
  sys_perf_event_open: pid 5162  cpu 23  group_fd -1  flags 0x8 = 29
  ------------------------------------------------------------
  perf_event_attr:
    size                             128
    config                           0x800000000
    { sample_period, sample_freq }   4000
    sample_type                      IP|TID|TIME|PERIOD|IDENTIFIER|AUX
    read_format                      ID
    inherit                          1
    exclude_kernel                   1
    exclude_hv                       1
    freq                             1
    sample_id_all                    1
    exclude_guest                    1
    aux_sample_size                  4096
  ------------------------------------------------------------
  sys_perf_event_open: pid 5162  cpu 16  group_fd 22  flags 0x8 = 30
  sys_perf_event_open: pid 5162  cpu 17  group_fd 23  flags 0x8 = 31
  sys_perf_event_open: pid 5162  cpu 18  group_fd 24  flags 0x8 = 32
  sys_perf_event_open: pid 5162  cpu 19  group_fd 25  flags 0x8 = 33
  sys_perf_event_open: pid 5162  cpu 20  group_fd 26  flags 0x8 = 34
  sys_perf_event_open: pid 5162  cpu 21  group_fd 27  flags 0x8 = 35
  sys_perf_event_open: pid 5162  cpu 22  group_fd 28  flags 0x8 = 36
  sys_perf_event_open: pid 5162  cpu 23  group_fd 29  flags 0x8 = 37
  ------------------------------------------------------------
  ...

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210609044555.27180-1-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-10 13:41:50 -03:00
Masami Hiramatsu
0808b3d5b7 perf probe: Provide clearer message permission error for tracefs access
Report permission error for the tracefs open and rewrite whole the error
message code around it.

You'll see a hint according to what you want to do with perf probe as
below.

  $ perf probe -l
  No permission to read tracefs.
  Please try 'sudo mount -o remount,mode=755 /sys/kernel/tracing/'
    Error: Failed to show event list.

  $ perf probe -d \*
  No permission to write tracefs.
  Please run this command again with sudo.
    Error: Failed to delete events.

This also fixes -ENOTSUP checking for mounting tracefs/debugfs.
Actually open returns -ENOENT in that case and we have to check it with
current mount point list. If we unmount debugfs and tracefs perf probe
shows correct message as below.

  $ perf probe -l
  Debugfs or tracefs is not mounted
  Please try 'sudo mount -t tracefs nodev /sys/kernel/tracing/'
    Error: Failed to show event list.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162299456839.503471.13863002017089255222.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 14:12:14 -03:00
Leo Yan
bde1e7d934 perf auxtrace: Change to use SMP memory barriers
The kernel and the userspace tool can access the AUX ring buffer head
and tail from different CPUs, thus SMP class of barriers are required
on SMP system.

This patch changes to use SMP barriers to replace mb() and rmb()
barriers.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20210602103007.184993-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 13:45:04 -03:00
Zou Wei
f54cad25a1 perf srccode: Use list_move() instead of equivalent list_del() + list_add() sequence
Using list_move() instead of list_del() + list_add(), shorter,
equivalent.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623113566-49455-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 09:36:36 -03:00
Masami Hiramatsu
f4f1c42953 perf probe: Report possible permission error for map__load() failure
Report possible permission error including kptr_restrict setting
for map__load() failure. This can happen when non-superuser runs
perf probe.

With this patch, perf probe shows the following message.

 $ perf probe vfs_read
 Failed to load symbols from /proc/kallsyms
 Please ensure you can read the /proc/kallsyms symbol addresses.
 If the /proc/sys/kernel/kptr_restrict is '2', you can not read
 kernel symbol address even if you are a superuser. Please change
 it to '1'. If kptr_restrict is '1', the superuser can read the
 symbol addresses.
 In that case, please run this command again with sudo.
   Error: Failed to add events.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: http://lore.kernel.org/lkml/162282065877.448336.10047912688119745151.stgit@devnote2
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 15:43:37 -03:00
Riccardo Mancini
67069a1f0f perf env: Fix memory leak of bpf_prog_info_linear member
ASan reported a memory leak caused by info_linear not being deallocated.

The info_linear was allocated during in perf_event__synthesize_one_bpf_prog().

This patch adds the corresponding free() when bpf_prog_info_node
is freed in perf_env__purge_bpf().

  $ sudo ./perf record -- sleep 5
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ]

  =================================================================
  ==297735==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 7688 byte(s) in 19 object(s) allocated from:
      #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f)
      #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16
      #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16
      #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9
      #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8
      #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8
      #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8
      #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11
      #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8
      #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2
      #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3
      #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: http://lore.kernel.org/lkml/20210602224024.300485-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:26:20 -03:00
Riccardo Mancini
69c9ffed6c perf symbol-elf: Fix memory leak by freeing sdt_note.args
Reported by ASan.

Signed-off-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Fabian Hemmer <copy@copy.sh>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Jiri Slaby <jirislaby@kernel.org>
Link: http://lore.kernel.org/lkml/20210602220833.285226-1-rickyman7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:06:27 -03:00
Namhyung Kim
3cc84399e9 perf stat: Honor event config name on --no-merge
If user gave an event name explicitly, it should be displayed in the
output as is.  But with --no-merge option it adds a pmu name at the
end so might confuse users.

Actually this is true for hybrid pmus, I think we should do the same
for others.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:05:23 -03:00
Namhyung Kim
2dc065eae5 perf evsel: Add missing cloning of evsel->use_config_name
The evsel__clone() should copy all fields in the evsel which are set
during the event parsing.  But it missed the use_config_name field.

Fixes: 12279429d8 ("perf stat: Uniquify hybrid event name")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210602212241.2175005-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-04 10:04:20 -03:00
Arnaldo Carvalho de Melo
0ab8009b3e Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent to allow perf/core to be used for new
development.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 14:58:44 -03:00
Jin Yao
d5a8bd0fcd perf mem: Disable 'mem-loads-aux' group before reporting
For some platforms, such as Alderlake, the 'mem-loads' event is required
to use together with 'mem-loads-aux' within a group and 'mem-loads-aux'
must be the group leader. Now we disable this group before reporting
because 'mem-loads-aux' is just an auxiliary event. It doesn't carry
any valid memory load result. If we show the 'mem-loads-aux' +
'mem-loads' as a group in report, it needs many of changes but they
are totally unnecessary.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-8-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:06:01 -03:00
Jin Yao
4a9086adc3 perf mem: Support record for hybrid platform
Support 'perf mem record' for hybrid platform. On hybrid platform,
such as Alderlake, when executing 'perf mem record', it actually calls:

record -e {cpu_core/mem-loads-aux/,cpu_core/mem-loads,ldlat=30/}:P
       -e cpu_atom/mem-loads,ldlat=30/P
       -e cpu_core/mem-stores/P
       -e cpu_atom/mem-stores/P

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-6-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:59 -03:00
Jin Yao
e7ce8d11bf perf tools: Check if mem_events is supported for hybrid platform
Check if the mem_events ('mem-loads' and 'mem-stores') exist
in the sysfs path.

For Alderlake, the hybrid cpu pmu are "cpu_core" and "cpu_atom".
Check the existing of following paths:

/sys/devices/cpu_atom/events/mem-loads
/sys/devices/cpu_atom/events/mem-stores
/sys/devices/cpu_core/events/mem-loads
/sys/devices/cpu_core/events/mem-stores

If the patch exists, the mem_event is supported.

Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-5-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-01 11:04:33 -03:00