When there are no events and on Intel, the topdown events will be
added by default if present. To display the metrics associated with
these request special handling in stat-shadow.c. To more easily update
these metrics use the json metric version via the TopdownL1
group. This makes the handling less platform specific.
Modify the metricgroup__has_metric code to also cover metric groups.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-stm32@st-md-mailman.stormreply.com
Link: https://lore.kernel.org/r/20230219092848.639226-40-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In arch_perf_synthesize_sample_weight(), the retire_lat was mistakenly
missed, add it.
perf test -v "x86 sample parsing"
74: x86 Sample parsing :
--- start ---
test child forked, pid 72526
Samples differ at 'retire_lat'
parsing failed for sample_type 0x1000000
test child finished with -1
---- end ----
x86 Sample parsing: FAILED!
Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230206162100.3329395-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add test for the new field for Retire Latency in the X86 specific test.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20230202192209.1795329-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Retire Latency field is added in the var3_w of the
PERF_SAMPLE_WEIGHT_STRUCT. The Retire Latency reports pipeline stall of
this instruction compared to the previous instruction in cycles. That's
quite useful to display the information with perf mem report.
The p_stage_cyc for Power is also from the var3_w. Union the p_stage_cyc
and retire_lat to share the code.
Implement X86 specific codes to display the X86 specific header.
Add a new sort key retire_lat for the Retire Latency.
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20230104201349.1451191-8-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The pattern for accessing EVENT_SOURCE_DEVICE_PATH is duplicated in a
few places, so add two utility functions to cover it. Also just use
perf_pmu__scan_file() instead of pmu_type() which already does the same
thing.
No functional changes.
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Suzuki Poulouse <suzuki.poulose@arm.com>
Tested-by: Tanmay Jagdale <tanmay@marvell.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bharat Bhushan <bbhushan2@marvell.com>
Cc: George Cherian <gcherian@marvell.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Linu Cherian <lcherian@marvell.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sunil Kovvuri Goutham <sgoutham@marvell.com>
Cc: Will Deacon <will@kernel.org>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230120143702.4035046-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.
If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.
This also disables CONFIG_TRACE that controls "perf trace".
CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.
Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed. The
majority of commands continue to work including "perf test".
Committer notes:
Fixed up a tools/perf/util/Build reject and added:
#include <traceevent/event-parse.h>
to tools/perf/util/scripting-engines/trace-event-perl.c.
Committer testing:
$ rpm -qi libtraceevent-devel
Name : libtraceevent-devel
Version : 1.5.3
Release : 2.fc36
Architecture: x86_64
Install Date: Mon 25 Jul 2022 03:20:19 PM -03
Group : Unspecified
Size : 27728
License : LGPLv2+ and GPLv2+
Signature : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
Source RPM : libtraceevent-1.5.3-2.fc36.src.rpm
Build Date : Fri 15 Apr 2022 10:57:01 AM -03
Build Host : buildvm-x86-05.iad2.fedoraproject.org
Packager : Fedora Project
Vendor : Fedora Project
URL : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
Bug URL : https://bugz.fedoraproject.org/libtraceevent
Summary : Development headers of libtraceevent
Description :
Development headers of libtraceevent-libs
$
Default build:
$ ldd ~/bin/perf | grep tracee
libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
$
# perf trace -e sched:* --max-events 10
0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
#
Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.
Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:
- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y
- perf-$(CONFIG_LIBTRACEEVENT) += scripts/
- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y
- The python binding needed some fixups and util/trace-event.c can't be
built and linked with the python binding shared object, so remove it
in tools/perf/util/setup.py and exclude it from the list of
dependencies in the python/perf.so Makefile.perf target.
Building without libtraceevent-devel installed uncovered more build
failures:
- The python binding tools/perf/util/python.c was assuming that
traceevent/parse-events.h was always available, which was the case
when we defaulted to using the in-kernel tools/lib/traceevent/ files,
now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
the other parts of it that deal with tracepoints.
- We have to ifdef the rules in the Build files with
CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
detect libtraceevent-devel installed in the system. Simplification here
to avoid these two ways of disabling builtin-trace.c and not having
CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
way.
From Athira:
<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>
Then, ditto for arm64 and s390, detected by container cross build tests.
- s/390 uses test__checkevent_tracepoint() that is now only available if
HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.
Also from Athira:
<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>
Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now most of the print functions take a pointer to the struct outstate.
We have one in the evlist__print_counters() and pass it through the
child functions.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20221123180208.2068936-13-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel driver assumes hybrid CPUs will have Intel PT capabilities
that are compatible with the boot CPU. Add a test to check that is the
case.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for adding more Intel PT testing, redefine the test_suite
to allow for adding more subtests.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation for adding more Intel PT testing, rename
intel-pt-pkt-decoder-test.c to intel-pt-test.c.
Subtests will later be added to intel-pt-test.c.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20221104121805.5264-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some places were including event.h just to get 'struct perf_sample',
move it to a separate place so that we speed up a bit the build.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
User space tasks can migrate between CPUs, so when tracing selected CPUs,
system-wide sideband is still needed, however evlist->core.has_user_cpus
is not set in the hybrid case, so check the target cpu_list instead.
Fixes: 7d189cadbe ("perf intel-pt: Track sideband system-wide when needed")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20221012082259.22394-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The 'perf mem' and 'perf c2c' tools are wrappers around 'perf record'
with mem load/ store events. IBS tagged load/store sample provides most
of the information needed for these tools. Wire in the "ibs_op//" event
as mem-ldst event for AMD.
There are some limitations though: Only load/store micro-ops provide
mem/c2c information. Whereas, IBS does not have a way to choose a
particular type of micro-op to tag. This results in many non-LS
micro-ops being tagged which appear as N/A in the perf report. IBS,
being an uncore pmu from kernel point of view[1], does not support per
process monitoring. Thus, perf mem/c2c on AMD are currently supported in
per-cpu mode only.
Example:
$ sudo perf mem record -- -c 10000
^C[ perf record: Woken up 227 times to write data ]
[ perf record: Captured and wrote 58.760 MB perf.data (836978 samples) ]
$ sudo perf mem report -F mem,sample,snoop
Samples: 836K of event 'ibs_op//', Event count (approx.): 8418762
Memory access Samples Snoop
N/A 700620 N/A
L1 hit 126675 N/A
L2 hit 424 N/A
L3 hit 664 HitM
L3 hit 10 N/A
Local RAM hit 2 N/A
Remote RAM (1 hop) hit 8558 N/A
Remote Cache (1 hop) hit 3 N/A
Remote Cache (1 hop) hit 2 HitM
Remote Cache (2 hops) hit 10 HitM
Remote Cache (2 hops) hit 6 N/A
Uncached hit 4 N/A
$
[1]: https://lore.kernel.org/lkml/20220829113347.295-1-ravi.bangoria@amd.com
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: https://lore.kernel.org/r/20221006153946.7816-6-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a help to create a system-wide sched_switch event. One merit is
that it sets the system-wide bit before adding it to evlist so that
the libperf can handle the cpu and thread maps correctly.
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221003204647.1481128-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Many cases do not use the extra error information provided by
parse_events and instead pass NULL as the struct parse_events_error
pointer. Add a wrapper for those cases so that the pointer is never
NULL.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220809080702.6921-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This test has been superseded by test_stat_user_read in:
tools/lib/perf/tests/test-evsel.c
The updated test doesn't divide-by-0 when running time of a counter is
0. It also supports ARM64.
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Rob Herring <robh@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220719223946.176299-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Current perf stat uses the evlist__add_default_attrs() to add the
generic default attrs, and uses arch_evlist__add_default_attrs() to add
the Arch specific default attrs, e.g., Topdown for x86.
It works well for the non-hybrid platforms. However, for a hybrid
platform, the hard code generic default attrs don't work.
Uses arch_evlist__add_default_attrs() to replace the
evlist__add_default_attrs(). The arch_evlist__add_default_attrs() is
modified to invoke the same __evlist__add_default_attrs() for the
generic default attrs. No functional change.
Add default_null_attrs[] to indicate the arch specific attrs.
No functional change for the arch specific default attrs either.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-4-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 55bcf6ef31 ("perf: Extend PERF_TYPE_HARDWARE and
PERF_TYPE_HW_CACHE") extends the two types to become PMU aware types for
a hybrid system. However, current evsel__hw_name doesn't take the PMU
type into account. It mistakenly returns the "unknown-hardware" for the
hardware event with a specific PMU type.
Add an arch specific arch_evsel__hw_name() to specially handle the PMU
aware hardware event.
Currently, the extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE is only
supported by X86. Only implement the specific arch_evsel__hw_name() for
X86 in the patch.
Nothing is changed for the other archs.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220721065706.2886112-3-zhengjun.xing@linux.intel.com
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The CPUID method of arch_get_tsc_freq fails for older Intel processors,
such as Skylake. Compute using /proc/cpuinfo.
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The TSC frequency information is required for the event metrics with the
literal, system_tsc_freq. For the newer Intel platform, the TSC
frequency information can be retrieved from the CPUID leaf 0x15. If the
TSC frequency information isn't present the /proc/cpuinfo approach is
used.
Refactor cpuid() for this use. Note, the previous stack pushing/popping
approach was broken on x86-64 that has stack red zones that would be
clobbered.
Committer testing:
Before:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
After the patch:
$ perf record sleep 0.0001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.002 MB perf.data (8 samples) ]
$ perf report --header-only |& grep cpuid
# cpuid : AuthenticAMD,25,33,0
$
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220718164312.3994191-2-irogers@google.com
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Samples without an L3 miss are discarded and counter is reset with
random value (between 1-15 for fetch PMU and 1-127 for op PMU) when IBS
L3 miss filtering is enabled. This causes a sampling period skew but
there is no way to reconstruct aggregated sampling period. So print a
warning at perf record if user sets l3missonly=1.
Ex:
# perf record -c 10000 -C 0 -e ibs_op/l3missonly=1/
WARNING: Hw internally resets sampling period when L3 Miss Filtering is enabled
and tagged operation does not cause L3 Miss. This causes sampling period skew.
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Ananth Narayan <ananth.narayan@amd.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rrichter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Santosh Shukla <santosh.shukla@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: like.xu.linux@gmail.com
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20220604044519.594-2-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hardware TopDown metrics feature, the sample-read feature should
be supported for a TopDown group, e.g., sample a non-topdown event and read
a Topdown metric group. But the current perf record code errors are out.
For a TopDown metric group,the slots event must be the leader of the group,
but the leader slots event doesn't support sampling. To support sample-read
the TopDown metric group, uses the 2nd event of the group as the "leader"
for the purposes of sampling.
Only the platform with the TopDown metric feature supports sample-read the
topdown group. In commit acb65150a4 ("perf record: Support sample-read
topdown metric group"), it adds arch_topdown_sample_read() to indicate
whether the TopDown group supports sample-read, it should only work on the
non-hybrid systems, this patch extends the support for hybrid platforms.
Before:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (cpu_core/topdown-retiring/).
/bin/dmesg | grep -i perf may provide additional information.
After:
# ./perf record -e "{cpu_core/slots/,cpu_core/cycles/,cpu_core/topdown-retiring/}:S" -a sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.238 MB perf.data (369 samples) ]
Fixes: acb65150a4 ("perf record: Support sample-read topdown metric group")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220602153603.1884710-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For the hybrid system, the "slots" event changes to "cpu_core/slots/", need
extend API arch_evsel__must_be_in_group() to support hybrid systems.
In the origin code, for hybrid system event "cpu_core/slots/", the output
of the API arch_evsel__must_be_in_group() is "false" (in fact,it should be
"true"). Currently only one API evsel__remove_from_group() calls it. In
evsel__remove_from_group(), it adds the second condition to check, so the
output of evsel__remove_from_group() still is correct. That's the reason
why there isn't an instant error. I'd like to fix the issue found in API
arch_evsel__must_be_in_group() in case someone else using the function in
the other place.
Fixes: d98079c05b ("perf evlist: Keep topdown counters in weak group")
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220601152544.1842447-1-zhengjun.xing@linux.intel.com
Cc: peterz@infradead.org
Cc: adrian.hunter@intel.com
Cc: alexander.shishkin@intel.com
Cc: acme@kernel.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: mingo@redhat.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
User space tasks can migrate between CPUs, so when tracing selected CPUs,
sideband for all CPUs is still needed. This is in preparation for allowing
system-wide events on all CPUs while the user requested events are on only
user requested CPUs.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use evlist__add_dummy_on_all_cpus() for switch tracking in preparation for
allowing system-wide events on all CPUs while the user requested events are
on only user requested CPUs.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Link: https://lore.kernel.org/r/20220524075436.29144-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The X86 specific arch__intr_reg_mask() is to check whether the kernel
and hardware can collect XMM registers. But it doesn't work on some
hybrid platform.
Without the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15
The config of the test event doesn't contain the PMU information. The
kernel may fail to initialize it on the correct hybrid PMU and return
the wrong non-supported information.
Add the PMU information into the config for the hybrid platform. The
same register set is supported among different hybrid PMUs. Checking
the first available one is good enough.
With the patch on ADL-N:
$ perf record -I?
available registers: AX BX CX DX SI DI BP SP IP FLAGS CS SS R8 R9 R10
R11 R12 R13 R14 R15 XMM0 XMM1 XMM2 XMM3 XMM4 XMM5 XMM6 XMM7 XMM8 XMM9
XMM10 XMM11 XMM12 XMM13 XMM14 XMM15
Fixes: 6466ec14aa ("perf regs x86: Add X86 specific arch__intr_reg_mask()")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518145125.1494156-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The commit 94dbfd6781 ("perf parse-events: Architecture specific
leader override") introduced a feature to reorder the slots event to
fulfill the restriction of the perf metrics topdown group. But the
feature doesn't work on the hybrid machine.
$ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1
Performance counter stats for 'system wide':
<not counted> cpu_core/instructions/
<not counted> cpu_core/slots/
<not supported> cpu_core/topdown-retiring/
1.002871801 seconds time elapsed
A hybrid platform has a different PMU name for the core PMUs, while
current perf hard code the PMU name "cpu".
Introduce a new function to check whether the system supports the perf
metrics feature. The result is cached for the future usage.
For X86, the core PMU name always has "cpu" prefix.
With the patch:
$ perf stat -e "{cpu_core/instructions/,cpu_core/slots/,cpu_core/topdown-retiring/}" -a sleep 1
Performance counter stats for 'system wide':
76,337,010 cpu_core/slots/
10,416,809 cpu_core/instructions/
11,692,372 cpu_core/topdown-retiring/
1.002805453 seconds time elapsed
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-5-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The evsel->name may have a different format for a topdown event, a pure
topdown name (e.g., topdown-fe-bound), or a PMU name + a topdown name
(e.g., cpu/topdown-fe-bound/). The cpu/topdown-fe-bound/ kind format
isn't supported by the arch_evlist__leader(). This format is a very
common format for a hybrid platform, which requires specifying the PMU
name for each event.
Without the patch,
$ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1
Performance counter stats for 'system wide':
<not counted> instructions
<not counted> slots
<not supported> cpu/topdown-fe-bound/
1.003482041 seconds time elapsed
Some events weren't counted. Try disabling the NMI watchdog:
echo 0 > /proc/sys/kernel/nmi_watchdog
perf stat ...
echo 1 > /proc/sys/kernel/nmi_watchdog
The events in group usually have to be from the same PMU. Try reorganizing the group.
With the patch,
$ perf stat -e '{instructions,slots,cpu/topdown-fe-bound/}' -a sleep 1
Performance counter stats for 'system wide':
157,383,996 slots
25,011,711 instructions
27,441,686 cpu/topdown-fe-bound/
1.003530890 seconds time elapsed
Fixes: bc355822f0 ("perf parse-events: Move slots only with topdown")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-4-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The patch ("perf evlist: Keep topdown counters in weak group") fixes the
perf metrics topdown event issue when the topdown events are in a weak
group on a non-hybrid platform. However, it doesn't work for the hybrid
platform.
$./perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
Performance counter stats for 'system wide':
751,765,068 cpu_core/slots/ (84.07%)
<not supported> cpu_core/topdown-bad-spec/
<not supported> cpu_core/topdown-be-bound/
<not supported> cpu_core/topdown-fe-bound/
<not supported> cpu_core/topdown-retiring/
12,398,197 cpu_core/branch-instructions/ (84.07%)
1,054,218 cpu_core/branch-misses/ (84.24%)
539,764,637 cpu_core/bus-cycles/ (84.64%)
14,683 cpu_core/cache-misses/ (84.87%)
7,277,809 cpu_core/cache-references/ (77.30%)
222,299,439 cpu_core/cpu-cycles/ (77.28%)
63,661,714 cpu_core/instructions/ (84.85%)
0 cpu_core/mem-loads/ (77.29%)
12,271,725 cpu_core/mem-stores/ (77.30%)
542,241,102 cpu_core/ref-cycles/ (84.85%)
8,854 cpu_core/cache-misses/ (76.71%)
7,179,013 cpu_core/cache-references/ (76.31%)
1.003245250 seconds time elapsed
A hybrid platform has a different PMU name for the core PMUs, while
the current perf hard code the PMU name "cpu".
The evsel->pmu_name can be used to replace the "cpu" to fix the issue.
For a hybrid platform, the pmu_name must be non-NULL. Because there are
at least two core PMUs. The PMU has to be specified.
For a non-hybrid platform, the pmu_name may be NULL. Because there is
only one core PMU, "cpu". For a NULL pmu_name, we can safely assume that
it is a "cpu" PMU.
In case other PMUs also define the "slots" event, checking the PMU type
as well.
With the patch,
$ perf stat -e '{cpu_core/slots/,cpu_core/topdown-bad-spec/,
cpu_core/topdown-be-bound/,cpu_core/topdown-fe-bound/,
cpu_core/topdown-retiring/,cpu_core/branch-instructions/,
cpu_core/branch-misses/,cpu_core/bus-cycles/,cpu_core/cache-misses/,
cpu_core/cache-references/,cpu_core/cpu-cycles/,cpu_core/instructions/,
cpu_core/mem-loads/,cpu_core/mem-stores/,cpu_core/ref-cycles/,
cpu_core/cache-misses/,cpu_core/cache-references/}:W' -a sleep 1
Performance counter stats for 'system wide':
766,620,266 cpu_core/slots/ (84.06%)
73,172,129 cpu_core/topdown-bad-spec/ # 9.5% bad speculation (84.06%)
193,443,341 cpu_core/topdown-be-bound/ # 25.0% backend bound (84.06%)
403,940,929 cpu_core/topdown-fe-bound/ # 52.3% frontend bound (84.06%)
102,070,237 cpu_core/topdown-retiring/ # 13.2% retiring (84.06%)
12,364,429 cpu_core/branch-instructions/ (84.03%)
1,080,124 cpu_core/branch-misses/ (84.24%)
564,120,383 cpu_core/bus-cycles/ (84.65%)
36,979 cpu_core/cache-misses/ (84.86%)
7,298,094 cpu_core/cache-references/ (77.30%)
227,174,372 cpu_core/cpu-cycles/ (77.31%)
63,886,523 cpu_core/instructions/ (84.87%)
0 cpu_core/mem-loads/ (77.31%)
12,208,782 cpu_core/mem-stores/ (77.31%)
566,409,738 cpu_core/ref-cycles/ (84.87%)
23,118 cpu_core/cache-misses/ (76.71%)
7,212,602 cpu_core/cache-references/ (76.29%)
1.003228667 seconds time elapsed
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220518143900.1493980-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add a flag needs_auxtrace_mmap to record whether an auxtrace mmap is
needed, in preparation for correctly determining whether or not an
auxtrace mmap is needed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220506122601.367589-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
of all evsels.
For non-task targets, cpus is set to be cpus requested from the command
line, defaulting to all online cpus if no cpus are specified.
For an uncore event, all_cpus may be just CPU 0 or every online CPU.
This causes all_cpus to have fewer values than the cpus variable which
is confusing given the 'all' in the name.
To try to make the behavior clearer, rename cpus to user_requested_cpus
and add comments on the two struct variables.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If slots isn't with a topdown event then moving it is unnecessary. For
example {instructions, slots} is re-ordered:
$ perf stat -e '{instructions,slots}' -a sleep 1
Performance counter stats for 'system wide':
936,600,825 slots
144,440,968 instructions
1.006061423 seconds time elapsed
Which can break tools expecting the command line order to match the
printed order. It is necessary to move the slots event first when it
appears with topdown events. Add extra checking so that the slots event
is only moved in the case of there being a topdown event like:
$ perf stat -e '{instructions,slots,topdown-fe-bound}' -a sleep 1
Performance counter stats for 'system wide':
2427568570 slots
300927614 instructions
551021649 topdown-fe-bound
1.001771803 seconds time elapsed
Fixes: 94dbfd6781 ("perf parse-events: Architecture specific leader override")
Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220321223344.1034479-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
AMX, other misc insns.
- Update VMware-specific MAINTAINERS entries
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmI4URIACgkQEsHwGGHe
VUob3A/9GFyqt9bBKrSaq9Rt1UVkq6dQhG3kO7dW5d0YDvy8JmR9is4rNDV9GGx6
A1OAue/gDlZFIz/829oS1qwjB7GZ4Rfb0gRo33bytDLLmd0BRXW7ioZ54jBRnWvy
8dZ2WruMmazK6uJxoHvtOA+Pt3ukb074CZZ1SfW344clWK6FJZeptyRclWaT1Py2
QOIJOxMraCdNAay/1ZvOdIqqdIPx5+JyzbHIYOWUFzwT4y+Q8kFNbigrJnqxe5Ij
aqRjzMIvt6MeLwbq9CfLsPFA3gaSzYeOkuXQPcqRgd5LU5ZyXBLStUrGEv1fsMvd
9Kh7VFycZPS7MKzxoEcbuJTTOR4cBsINOlbo9iWr7UD5pm5h7c3vc+nCyia+U+Xo
5XRpf8nitt4a3r1f6HxwXJS0OlBkS4CqexE2OejY4yhWRlxhMcIvRyquU+Z0J4Bp
mgDJuXSzfJfFcBzp4jjOBxGPNEjXXOdy/qc/1jR97eMmTKrk3gk/74NWUx9hw4oN
5RGeC+khAD13TL0yVQfKBe5HuLK5tHppAzXAnT2xi6qUn+VJjLxNWgg3iV9tbShM
4q5vJp3BmvNOY8HQv1R3IDFfN0IAL09Q9v6EzEroNuVUhEOzBdH7JSzWkvBBveZb
FVgD3I+wNBE1nQD3cP/6DGbRe1JG3ULDF95WJshB8gNJwavlZGs=
=f7VZ
-----END PGP SIGNATURE-----
Merge tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 updates from Borislav Petkov:
- Add support for a couple new insn sets to the insn decoder:
AVX512-FP16, AMX, other misc insns.
- Update VMware-specific MAINTAINERS entries
* tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Mark VMware mailing list entries as email aliases
MAINTAINERS: Add Zack as maintainer of vmmouse driver
MAINTAINERS: Update maintainers for paravirt ops and VMware hypervisor interface
x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoder
perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder test
x86/insn: Add misc instructions to x86 instruction decoder
perf/tests: Add misc instructions to the x86 instruction decoder test
x86/insn: Add AMX instructions to the x86 instruction decoder
perf/tests: Add AMX instructions to x86 instruction decoder test
An issue with icelakex metrics:
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json?h=perf/core&id=65eab2bc7dab326ee892ec5a4c749470b368b51a#n48
That causes the slots not to be first.
Fixes: 94dbfd6781 ("perf parse-events: Architecture specific leader override")
Reported-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220317224309.543736-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The change to the MODE.Exec packet means processing must distinguish
between the old and new cases. Record the Event Trace capability flag to
make that possible.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which adds a bit to the existing
MODE.Exec packet to record the interrupt flag. Amend the packet decoder and
packet decoder test accordingly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As of Intel SDM (https://www.intel.com/sdm) version 076, there is a new
Intel PT feature called Event Trace which requires 2 new packets CFE and
EVD. Add them to the packet decoder and packet decoder test.
Committer notes:
I got the "Intel® 64 and IA-32 architectures software developer’s manual
combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:
https://cdrdv2.intel.com/v1/dl/getContent/671200
And these new packets are described in page 3951:
<quote>
32.2.4
Event Trace is a capability that exposes details about the asynchronous
events, when they are generated, and when their corresponding software
event handler completes execution. These include:
o Interrupts, including NMI and SMI, including the interrupt vector when
defined.
o Faults, exceptions including the fault vector.
— Page faults additionally include the page fault address, when in context.
o Event handler returns, including IRET and RSM.
o VM exits and VM entries.¹
— VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields.
INIT and SIPI events.
o TSX aborts, including the abort status returned for the RTM instructions.
o Shutdown.
Additionally, it provides indication of the status of the Interrupt Flag
(IF), to indicate when interrupts are masked.
</quote>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Make test_data 'static' otherwise it will conflict with any global
variable of the same name.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20220124084201.2699795-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
struct maps is reference counted, using a pointer is more idiomatic.
Committer notes:
Delay:
maps = machine__kernel_maps(&vmlinux);
To after:
machine__init(&vmlinux, "", HOST_KERNEL_ID);
To avoid this on f34:
In file included from /var/home/acme/git/perf/tools/perf/util/build-id.h:10,
from /var/home/acme/git/perf/tools/perf/util/dso.h:13,
from tests/vmlinux-kallsyms.c:8:
In function ‘machine__kernel_maps’,
inlined from ‘test__vmlinux_matches_kallsyms’ at tests/vmlinux-kallsyms.c:122:22:
/var/home/acme/git/perf/tools/perf/util/machine.h:86:23: error: ‘vmlinux.kmaps’ is used uninitialized [-Werror=uninitialized]
86 | return machine->kmaps;
| ~~~~~~~^~~~~~~
tests/vmlinux-kallsyms.c: In function ‘test__vmlinux_matches_kallsyms’:
tests/vmlinux-kallsyms.c:121:34: note: ‘vmlinux’ declared here
121 | struct machine kallsyms, vmlinux;
| ^~~~~~~
cc1: all warnings being treated as errors
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220211103415.2737789-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The x86 instruction decoder is used for both kernel instructions and
user space instructions (e.g. uprobes, perf tools Intel PT), so it is
good to update it with new instructions.
Add AVX512-FP16 instructions to x86 instruction decoder test.
A subsequent patch adds the instructions to the instruction decoder.
Reference:
Intel AVX512-FP16 Architecture Specification
June 2021
Revision 1.0
Document Number: 347407-001US
Example:
$ perf test -v "x86 instruction decoder" |& grep vfcmaddcph | head -2
Failed to decode: 62 f6 6f 48 56 cb vfcmaddcph %zmm3,%zmm2,%zmm1
Failed to decode: 62 f6 6f 48 56 8c c8 78 56 34 12 vfcmaddcph 0x12345678(%eax,%ecx,8),%zmm2,%zmm1
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-6-adrian.hunter@intel.com
The x86 instruction decoder is used for both kernel instructions and
user space instructions (e.g. uprobes, perf tools Intel PT), so it is
good to update it with new instructions.
Add the following instructions to the x86 instruction decoder test:
User Interrupt
clui
senduipi
stui
testui
uiret
Prediction history reset
hreset
Serialize instruction execution
serialize
TSX suspend load address tracking
xresldtrk
xsusldtrk
A subsequent patch adds the instructions to the instruction decoder.
Reference:
Intel Architecture Instruction Set Extensions and Future Features
Programming Reference
May 2021
Document Number: 319433-044
Example:
$ perf test -v "x86 instruction decoder" |& grep -i hreset
Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0
Failed to decode length (4 vs expected 6): f3 0f 3a f0 c0 00 hreset $0x0
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20211202095029.2165714-4-adrian.hunter@intel.com
Currently topdown events must appear after a slots event:
$ perf stat -e '{slots,topdown-fe-bound}' /bin/true
Performance counter stats for '/bin/true':
3,183,090 slots
986,133 topdown-fe-bound
Reversing the events yields:
$ perf stat -e '{topdown-fe-bound,slots}' /bin/true
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (topdown-fe-bound).
For metrics the order of events is determined by iterating over a
hashmap, and so slots isn't guaranteed to be first which can yield this
error.
Change the set_leader in parse-events, called when a group is closed, so
that rather than always making the first event the leader, if the slots
event exists then it is made the leader. It is then moved to the head of
the evlist otherwise it won't be opened in the correct order.
The result is:
$ perf stat -e '{topdown-fe-bound,slots}' /bin/true
Performance counter stats for '/bin/true':
3,274,795 slots
1,001,702 topdown-fe-bound
A problem with this approach is the slots event is identified by name,
names can be overwritten like 'cpu/slots,name=foo/' and this causes the
leader change to fail.
The change also modifies and fixes mixed groups like, with the change:
$ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2
Performance counter stats for 'system wide':
5574985410 slots
971981616 instructions
1348461887 topdown-fe-bound
2.001263120 seconds time elapsed
Without the change:
$ perf stat -e '{instructions,slots,topdown-fe-bound}' -a -- sleep 2
Performance counter stats for 'system wide':
<not counted> instructions
<not counted> slots
<not supported> topdown-fe-bound
2.006247990 seconds time elapsed
Something that may be undesirable here is that the events are reordered
in the output.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Link: http://lore.kernel.org/lkml/20211130174945.247604-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is to align with kunit's terminology.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rather than export test functions, export the test struct. Rename with a
suite__ prefix to avoid name collisions.
Committer notes:
Its '&suite__vectors_page', not '&suite__vectors_pages', noticed when
cross building to arm (32-bit).
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By switching to an array of pointers to tests (later to be suites)
the definition of the tests can be moved to the file containing the
tests.
Committer notes:
It's "&vectors_page", not "&vectors_pages", noticed when cross building
to 32-bit ARM.
Also the DEFINE_SUITE(vectors_page) should be done where its function is
implemented, in tools/perf/arch/arm/tests/vectors-page.c, so that we can
make it static, as we don't have anymore its declaration in tests.h.
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Sohaib Mohamed <sohaib.amhmd@gmail.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: Daniel Latypov <dlatypov@google.com>
Cc: David Gow <davidgow@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211104064208.3156807-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Perf tool sets exclude_guest by default while calling perf_event_open().
Because IBS does not have filtering capability, it always gets rejected
by IBS PMU driver and thus perf falls back to non-precise sampling. Fix
it by not setting exclude_guest by default on AMD.
Before:
$ sudo ./perf record -C 0 -vvv true |& grep precise
precise_ip 3
decreasing precise_ip by one (2)
precise_ip 2
decreasing precise_ip by one (1)
precise_ip 1
decreasing precise_ip by one (0)
After:
$ sudo ./perf record -C 0 -vvv true |& grep precise
precise_ip 3
decreasing precise_ip by one (2)
precise_ip 2
Committer notes:
Fixup init to zero for perf_env in older compilers:
arch/x86/util/evsel.c:15:26: error: missing field 'os_release' initializer [-Werror,-Wmissing-field-initializers]
struct perf_env env = {0};
^
Committer notes:
Namhyung remarked:
It'd be nice if it can cover explicit "-e cycles:pp" as well.
Ravi clarified:
For explicit :pp modifier, evsel->precise_max does not get set and thus perf
does not try with different attr->precise_ip values while exclude_guest set.
So no issue with explicit :pp:
$ sudo ./perf record -C 0 -e cycles:pp -vvv |& grep "precise_ip\|exclude_guest"
precise_ip 2
exclude_guest 1
precise_ip 2
exclude_guest 1
switching off exclude_guest, exclude_host
precise_ip 2
^C
Also, with :P modifier, evsel->precise_max gets set but exclude_guest does
not and thus :P also works fine:
$ sudo ./perf record -C 0 -e cycles:P -vvv |& grep "precise_ip\|exclude_guest"
precise_ip 3
decreasing precise_ip by one (2)
precise_ip 2
^C
Reported-by: Kim Phillips <kim.phillips@amd.com>
Signed-off-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211103072112.32312-1-ravi.bangoria@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
If the 'perf iostat' user specifies two or more iio_root_ports and also
specifies the cpu(s) by -C which is not *connected to all* the above iio
ports, the iostat_print_metric() will run into trouble:
For example:
$ perf iostat list
S0-uncore_iio_0<0000:16>
S1-uncore_iio_0<0000:97> # <--- CPU 1 is located in the socket S0
$ perf iostat 0000:16,0000:97 -C 1 -- ls
port Inbound Read(MB) Inbound Write(MB) Outbound Read(MB) Outbound
Write(MB) ../perf-iostat: line 12: 104418 Segmentation fault
(core dumped) perf stat --iostat$DELIMITER$*
The core-dump stack says, in the above corner case, the returned
(struct perf_counts_values *) count will be NULL, and the caller
iostat_print_metric() apparently doesn't not handle this case.
433 struct perf_counts_values *count = perf_counts(evsel->counts, die, 0);
434
435 if (count->run && count->ena) {
(gdb) p count
$1 = (struct perf_counts_values *) 0x0
The deeper reason is that there are actually no statistics from the user
specified pair "iostat 0000:X, -C (disconnected) Y ", but let's fix it with
minimum cost by adding a NULL check in the user space.
Fixes: f9ed693e8b ("perf stat: Enable iostat mode for x86 platforms")
Signed-off-by: Like Xu <likexu@tencent.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210927081115.39568-2-likexu@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
59ab844eed ("compat: remove some compat entry points")
dce4910396 ("mm: wire up syscall process_mrelease")
b48c7236b1 ("exit/bdflush: Remove the deprecated bdflush system call")
That add support for this new syscall in tools such as 'perf trace'.
For instance, this is now possible:
# perf trace -v -e process_mrelease
event qualifier tracepoint filter: (common_pid != 19351 && common_pid != 9112) && (id == 448)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep process_mrelease tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
448 common process_mrelease sys_process_mrelease
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
A perf uncore PMU may have two PMU names, a real name and an alias. The
alias is exported at /sys/bus/event_source/devices/uncore_*/alias.
The perf tool should support the alias as well.
Add alias_name in the struct perf_pmu to store the alias. For the PMU
which doesn't have an alias. It's NULL.
Introduce two X86 specific functions to retrieve the real name and the
alias separately.
Only go through the sysfs to retrieve the mapping between the real name
and the alias once. The result is cached in a list, uncore_pmu_list.
Nothing changed for the other ARCHs.
With the patch, the perf tool can monitor the PMU with either the real
name or the alias.
Use the real name,
$ perf stat -e uncore_cha_2/event=1/ -x,
4044879584,,uncore_cha_2/event=1/,2528059205,100.00,,
Use the alias,
$ perf stat -e uncore_type_0_2/event=1/ -x,
3659675336,,uncore_type_0_2/event=1/,2287306455,100.00,,
Committer notes:
Rename 'struct perf_pmu_alias_name' to 'pmu_alias', the 'perf_' prefix
should be used for libperf, things inside just tools/perf/ are being
moved away from that prefix.
Also 'pmu_alias' is shorter and reflects the abstraction.
Also don't use 'pmu' as the name for variables for that type, we should
use that for the 'struct perf_pmu' variables, avoiding confusion. Use
'pmu_alias' for 'struct pmu_alias' variables.
Co-developed-by: Jin Yao <yao.jin@linux.intel.com>
Co-developed-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: http://lore.kernel.org/lkml/20210902065955.1299-2-yao.jin@linux.intel.com
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in this cset:
7bb7f2ac24 ("arch, mm: wire up memfd_secret system call where relevant")
That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# perf trace -v -e memfd_secret
event qualifier tracepoint filter: (common_pid != 13375 && common_pid != 3713) && (id == 447)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep memfd_secret tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
447 common memfd_secret sys_memfd_secret
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/unistd.h' differs from latest version at 'arch/arm64/include/uapi/asm/unistd.h'
diff -u tools/arch/arm64/include/uapi/asm/unistd.h arch/arm64/include/uapi/asm/unistd.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors.
The Topdown metrics L1 event was added as default in 42641d6f4d
("perf stat: Add Topdown metrics events as default events")
From the Sapphire Rapids server and later platforms, the same dedicated
"metrics" register is extended to support both L1 and L2 events.
Add both L1 and L2 Topdown metrics events as default to enrich the
default measuring information if the new measurement register is
available.
On legacy systems there is no change to avoid extra multiplexing.
The topdown_level indicates the max metrics level for the top-down
statistics. Set it to 2 to display all L1 and L2 Topdown metrics events.
With the patch:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.59 msec task-clock # 0.001 CPUs utilized
1 context-switches # 1.687 K/sec
0 cpu-migrations # 0.000 /sec
76 page-faults # 128.198 K/sec
1,405,318 cycles # 2.371 GHz
1,471,136 instructions # 1.05 insn per cycle
310,132 branches # 523.136 M/sec
10,435 branch-misses # 3.36% of all branches
8,431,908 slots # 14.223 G/sec
1,554,116 topdown-retiring # 18.4% retiring
1,289,585 topdown-bad-spec # 15.2% bad speculation
2,810,636 topdown-fe-bound # 33.2% frontend bound
2,810,636 topdown-be-bound # 33.2% backend bound
231,464 topdown-heavy-ops # 2.7% heavy operations # 15.6% light operations
1,223,453 topdown-br-mispredict # 14.5% branch mispredict # 0.8% machine clears
1,884,779 topdown-fetch-lat # 22.3% fetch latency # 10.9% fetch bandwidth
1,454,917 topdown-mem-bound # 17.2% memory bound # 16.0% Core bound
1.001179699 seconds time elapsed
0.000000000 seconds user
0.001238000 seconds sys
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/1625760169-18396-1-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move evsel::idx to perf_evsel::idx, so we can move the group interface
to libperf.
Committer notes:
Fixup evsel->idx usage in tools/perf/util/bpf_counter_cgroup.c, that
appeared in my tree in my local tree.
Also fixed up these:
$ find tools/perf/ -name "*.[ch]" | xargs grep 'evsel->idx'
tools/perf/ui/gtk/annotate.c: evsel->idx + i);
tools/perf/ui/gtk/annotate.c: evsel->idx);
$
That running 'make -C tools/perf build-test' caught.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Requested-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20210706151704.73662-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
64c2c2c62f ("quota: Change quotactl_path() systcall to an fd-based one")
65ffb3d69e ("quota: Wire up quotactl_fd syscall")
That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# perf trace -v -e quota*
event qualifier tracepoint filter: (common_pid != 158365 && common_pid != 2512) && (id == 179 || id == 443)
^C#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep quota tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
179 common quotactl sys_quotactl
443 common quotactl_fd sys_quotactl_fd
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For enabling mem-store event, it doesn't need an auxiliary event.
So just build an event name string with the pmu prefix.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-4-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_mem_events__name() can generate the mem-load event name.
It uses a variable 'mem_loads_name__init' to avoid generating the
event name every time (because perf_pmu__scan takes some time).
The perf_mem_events__name() assumes the pmu is "cpu" but it's not
correct for hybrid platform. For Alderlake, the pmu is "cpu_core" or
"cpu_atom"
Introduce a new parameter 'pmu_name' in perf_mem_events__name
to let the caller specify a pmu name.
Considering such event name is x86 specific, so move
perf_mem_events[] to arch/x86/util/mem-events.c.
We still keep the variable 'mem_loads_name__init' but it's only
used when pmu_name is NULL (compatible for original behavior). When
pmu_name is not NULL (e.g. "cpu_core"), this patch doesn't have
optimization. That can be implemented in follow up patch.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-3-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For some platforms, an auxiliary event has to be enabled
simultaneously with the load latency event.
For Alderlake, the auxiliary event is created in "cpu_core" pmu.
So first we need to check the existing of "cpu_core" pmu
and then check if this pmu has auxiliary event.
Signed-off-by: Jin Yao <yao.jin@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210527001610.10553-2-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Function declarations don't need __maybe_unused annotations, only the
implementations do. Drop them on the perf x86 tests.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: masayoshi mizuma <msys.mizuma@gmail.com>
Link: http://lore.kernel.org/lkml/20210513174614.2242210-2-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There's no reason for making the test__arch_unwind_sample declaration per
arch. Currently that's done 2 different ways either with a declaration in
arch-tests.h or with an arch define. Unify all this with an unconditional
declaration in tests.h.
Signed-off-by: Rob Herring <robh@kernel.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Matt Fleming <matt.fleming@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: masayoshi mizuma <msys.mizuma@gmail.com>
Link: http://lore.kernel.org/lkml/20210513174614.2242210-1-robh@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in this csets:
5b9fedb31e ("quota: Disable quotactl_path syscall")
That silences these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Jan Kara <jack@suse.cz>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
usage:
- kvm stat
run a command and gather performance counter statistics
- show the result:
perf kvm stat report --event=msr
See the msr events:
Analyze events for all VMs, all VCPUs:
MSR Access Samples Samples% Time% Min Time Max Time Avg time
0x6e0:W 67007 98.17% 98.31% 0.59us 10.69us 0.90us ( +- 0.10% )
0x830:W 1186 1.74% 1.60% 0.53us 108.34us 0.82us ( +- 11.02% )
0x3b:R 66 0.10% 0.09% 0.56us 1.26us 0.80us ( +- 3.24% )
Total Samples:68259, Total events handled time:61150.95us.
Signed-off-by: Lei Zhao <zhaolei27@baidu.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1618470001-7239-1-git-send-email-lirongqing@baidu.com
Signed-off-by: Li RongQing <lirongqing@baidu.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes in these csets:
a49f4f81cb ("arch: Wire up Landlock syscalls")
2a1867219c ("fs: add mount_setattr()")
fa8b90070a ("quota: wire up quotactl_path")
That silences these perf build warnings and add support for those new
syscalls in tools such as 'perf trace'.
For instance, this is now possible:
# ~acme/bin/perf trace -v -e landlock*
event qualifier tracepoint filter: (common_pid != 129365 && common_pid != 3502) && (id == 444 || id == 445 || id == 446)
^C#
That is tha filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ grep landlock tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
444 common landlock_create_ruleset sys_landlock_create_ruleset
445 common landlock_add_rule sys_landlock_add_rule
446 common landlock_restrict_self sys_landlock_restrict_self
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: James Morris <jamorris@linux.microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mickaël Salaün <mic@linux.microsoft.com>
Cc: Sascha Hauer <s.hauer@pengutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf stat:
- Add support for hybrid PMUs to support systems such as Intel Alderlake
and its BIG/little core/atom cpus.
- Introduce 'bperf' to share hardware PMCs with BPF.
- New --iostat option to collect and present IO stats on Intel hardware.
This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP):
commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
It is intended to provide four I/O performance metrics in MB per each
PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
- Align CSV output for summary.
- Clarify --null use cases: Assess raw overhead of 'perf stat' or
measure just wall clock time.
- Improve readability of shadow stats.
perf record:
- Change the COMM when starting tha workload so that --exclude-perf
doesn't seem to be not honoured.
- Improve 'Workload failed' message printing events + what was exec'ed.
- Fix cross-arch support for TIME_CONV.
perf report:
- Add option to disable raw event ordering.
- Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
- Improvements to --stat output, that shows information about PERF_RECORD_ events.
- Preserve identifier id in OCaml demangler.
perf annotate:
- Show full source location with 'l' hotkey in the 'perf annotate' TUI.
- Add line number like in TUI and source location at EOL to the 'perf annotate' --stdio mode.
- Add --demangle and --demangle-kernel to 'perf annotate'.
- Allow configuring annotate.demangle{,_kernel} in 'perf config'.
- Fix sample events lost in stdio mode.
perf data:
- Allow converting a perf.data file to JSON.
libperf:
- Add support for user space counter access.
- Update topdown documentation to permit rdpmc calls.
perf test:
- Add 'perf test' for 'perf stat' CSV output.
- Add 'perf test' entries to test the hybrid PMU support.
- Cleanup 'perf test daemon' if its 'perf test' is interrupted.
- Handle metric reuse in pmu-events parsing 'perf test' entry.
- Add test for PE executable support.
- Add timeout for wait for daemon start in its 'perf test' entries.
Build:
- Enable libtraceevent dynamic linking.
- Improve feature detection output.
- Fix caching of feature checks caching.
- First round of updates for tools copies of kernel headers.
- Enable warnings when compiling BPF programs.
Vendor specific events:
Intel:
- Add missing skylake & icelake model numbers.
arm64:
- Add Hisi hip08 L1, L2 and L3 metrics.
- Add Fujitsu A64FX PMU events.
PowerPC:
- Initial JSON/events list for power10 platform.
- Remove unsupported power9 metrics.
AMD:
- Add Zen3 events.
- Fix broken L2 Cache Hits from L2 HWPF metric.
- Use lowercases for all the eventcodes and umasks.
Hardware tracing:
arm64:
- Update CoreSight ETM metadata format.
- Fix bitmap for CS-ETM option.
- Support PID tracing in config.
- Detect pid in VMID for kernel running at EL2.
Arch specific:
MIPS:
- Support MIPS unwinding and dwarf-regs.
- Generate mips syscalls_n64.c syscall table.
PowerPC:
- Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC.
- Support pipeline stage cycles for powerpc.
libbeauty:
- Fix fsconfig generator.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYIshAwAKCRCyPKLppCJ+
J8oWAP9c1POclDQ7AZDe5/t/InZYSQKJFIku1sE1SNCSOupy7wEAuPBtaN7wDaRj
BFBibfUGd4MNzLPvMMHneIhSY3DgJwg=
=FLLr
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"perf stat:
- Add support for hybrid PMUs to support systems such as Intel
Alderlake and its BIG/little core/atom cpus.
- Introduce 'bperf' to share hardware PMCs with BPF.
- New --iostat option to collect and present IO stats on Intel
hardware.
This functionality is based on recently introduced sysfs attributes
for Intel® Xeon® Scalable processor family (code name Skylake-SP)
in commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore
unit to IIO PMON mapping")
It is intended to provide four I/O performance metrics in MB per
each PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
- Align CSV output for summary.
- Clarify --null use cases: Assess raw overhead of 'perf stat' or
measure just wall clock time.
- Improve readability of shadow stats.
perf record:
- Change the COMM when starting tha workload so that --exclude-perf
doesn't seem to be not honoured.
- Improve 'Workload failed' message printing events + what was
exec'ed.
- Fix cross-arch support for TIME_CONV.
perf report:
- Add option to disable raw event ordering.
- Dump the contents of PERF_RECORD_TIME_CONV in 'perf report -D'.
- Improvements to --stat output, that shows information about
PERF_RECORD_ events.
- Preserve identifier id in OCaml demangler.
perf annotate:
- Show full source location with 'l' hotkey in the 'perf annotate'
TUI.
- Add line number like in TUI and source location at EOL to the 'perf
annotate' --stdio mode.
- Add --demangle and --demangle-kernel to 'perf annotate'.
- Allow configuring annotate.demangle{,_kernel} in 'perf config'.
- Fix sample events lost in stdio mode.
perf data:
- Allow converting a perf.data file to JSON.
libperf:
- Add support for user space counter access.
- Update topdown documentation to permit rdpmc calls.
perf test:
- Add 'perf test' for 'perf stat' CSV output.
- Add 'perf test' entries to test the hybrid PMU support.
- Cleanup 'perf test daemon' if its 'perf test' is interrupted.
- Handle metric reuse in pmu-events parsing 'perf test' entry.
- Add test for PE executable support.
- Add timeout for wait for daemon start in its 'perf test' entries.
Build:
- Enable libtraceevent dynamic linking.
- Improve feature detection output.
- Fix caching of feature checks caching.
- First round of updates for tools copies of kernel headers.
- Enable warnings when compiling BPF programs.
Vendor specific events:
- Intel:
- Add missing skylake & icelake model numbers.
- arm64:
- Add Hisi hip08 L1, L2 and L3 metrics.
- Add Fujitsu A64FX PMU events.
- PowerPC:
- Initial JSON/events list for power10 platform.
- Remove unsupported power9 metrics.
- AMD:
- Add Zen3 events.
- Fix broken L2 Cache Hits from L2 HWPF metric.
- Use lowercases for all the eventcodes and umasks.
Hardware tracing:
- arm64:
- Update CoreSight ETM metadata format.
- Fix bitmap for CS-ETM option.
- Support PID tracing in config.
- Detect pid in VMID for kernel running at EL2.
Arch specific updates:
- MIPS:
- Support MIPS unwinding and dwarf-regs.
- Generate mips syscalls_n64.c syscall table.
- PowerPC:
- Add support for PERF_SAMPLE_WEIGH_STRUCT on PowerPC.
- Support pipeline stage cycles for powerpc.
libbeauty:
- Fix fsconfig generator"
* tag 'perf-tools-for-v5.13-2021-04-29' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (132 commits)
perf build: Defer printing detected features to the end of all feature checks
tools build: Allow deferring printing the results of feature detection
perf build: Regenerate the FEATURE_DUMP file after extra feature checks
perf session: Dump PERF_RECORD_TIME_CONV event
perf session: Add swap operation for event TIME_CONV
perf jit: Let convert_timestamp() to be backwards-compatible
perf tools: Change fields type in perf_record_time_conv
perf tools: Enable libtraceevent dynamic linking
perf Documentation: Document intel-hybrid support
perf tests: Skip 'perf stat metrics (shadow stat) test' for hybrid
perf tests: Support 'Convert perf time to TSC' test for hybrid
perf tests: Support 'Session topology' test for hybrid
perf tests: Support 'Parse and process metrics' test for hybrid
perf tests: Support 'Track with sched_switch' test for hybrid
perf tests: Skip 'Setup struct perf_event_attr' test for hybrid
perf tests: Add hybrid cases for 'Roundtrip evsel->name' test
perf tests: Add hybrid cases for 'Parse event definition strings' test
perf record: Uniquify hybrid event name
perf stat: Warn group events from different hybrid PMU
perf stat: Filter out unmatched aggregation for hybrid event
...
- Improve Intel uncore PMU support:
- Parse uncore 'discovery tables' - a new hardware capability enumeration method
introduced on the latest Intel platforms. This table is in a well-defined PCI
namespace location and is read via MMIO. It is organized in an rbtree.
These uncore tables will allow the discovery of standard counter blocks, but
fancier counters still need to be enumerated explicitly.
- Add Alder Lake support
- Improve IIO stacks to PMON mapping support on Skylake servers
- Add Intel Alder Lake PMU support - which requires the introduction of 'hybrid' CPUs
and PMUs. Alder Lake is a mix of Golden Cove ('big') and Gracemont ('small' - Atom derived)
cores.
The CPU-side feature set is entirely symmetrical - but on the PMU side there's
core type dependent PMU functionality.
- Reduce data loss with CPU level hardware tracing on Intel PT / AUX profiling, by
fixing the AUX allocation watermark logic.
- Improve ring buffer allocation on NUMA systems
- Put 'struct perf_event' into their separate kmem_cache pool
- Add support for synchronous signals for select perf events. The immediate motivation
is to support low-overhead sampling-based race detection for user-space code. The
feature consists of the following main changes:
- Add thread-only event inheritance via perf_event_attr::inherit_thread, which limits
inheritance of events to CLONE_THREAD.
- Add the ability for events to not leak through exec(), via perf_event_attr::remove_on_exec.
- Allow the generation of SIGTRAP via perf_event_attr::sigtrap, extend siginfo with an u64
::si_perf, and add the breakpoint information to ::si_addr and ::si_perf if the event is
PERF_TYPE_BREAKPOINT.
The siginfo support is adequate for breakpoints right now - but the new field can be used
to introduce support for other types of metadata passed over siginfo as well.
- Misc fixes, cleanups and smaller updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmCJGpERHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1j9zBAAuVbG2snV6SBSdXLhQcM66N3NckOXvSY5
QjjhQcuwJQEK/NJB3266K5d8qSmdyRBsWf3GCsrmyBT67P1V28K44Pu7oCV0UDtf
mpVRjEP0oR7hNsANSSgo8Fa4ZD7H5waX7dK7925Tvw8By3mMoZoddiD/84WJHhxO
NDF+GRFaRj+/dpbhV8cdCoXTjYdkC36vYuZs3b9lu0tS9D/AJgsNy7TinLvO02Cs
5peP+2y29dgvCXiGBiuJtEA6JyGnX3nUJCvfOZZ/DWDc3fdduARlRrc5Aiq4n/wY
UdSkw1VTZBlZ1wMSdmHQVeC5RIH3uWUtRoNqy0Yc90lBm55AQ0EENwIfWDUDC5zy
USdBqWTNWKMBxlEilUIyqKPQK8LW/31TRzqy8BWKPNcZt5yP5YS1SjAJRDDjSwL/
I+OBw1vjLJamYh8oNiD5b+VLqNQba81jFASfv+HVWcULumnY6ImECCpkg289Fkpi
BVR065boifJDlyENXFbvTxyMBXQsZfA+EhtxG7ju2Ni+TokBbogyCb3L2injPt9g
7jjtTOqmfad4gX1WSc+215iYZMkgECcUd9E+BfOseEjBohqlo7yNKIfYnT8mE/Xq
nb7eHjyvLiE8tRtZ+7SjsujOMHv9LhWFAbSaxU/kEVzpkp0zyd6mnnslDKaaHLhz
goUMOL/D0lg=
=NhQ7
-----END PGP SIGNATURE-----
Merge tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull perf event updates from Ingo Molnar:
- Improve Intel uncore PMU support:
- Parse uncore 'discovery tables' - a new hardware capability
enumeration method introduced on the latest Intel platforms. This
table is in a well-defined PCI namespace location and is read via
MMIO. It is organized in an rbtree.
These uncore tables will allow the discovery of standard counter
blocks, but fancier counters still need to be enumerated
explicitly.
- Add Alder Lake support
- Improve IIO stacks to PMON mapping support on Skylake servers
- Add Intel Alder Lake PMU support - which requires the introduction of
'hybrid' CPUs and PMUs. Alder Lake is a mix of Golden Cove ('big')
and Gracemont ('small' - Atom derived) cores.
The CPU-side feature set is entirely symmetrical - but on the PMU
side there's core type dependent PMU functionality.
- Reduce data loss with CPU level hardware tracing on Intel PT / AUX
profiling, by fixing the AUX allocation watermark logic.
- Improve ring buffer allocation on NUMA systems
- Put 'struct perf_event' into their separate kmem_cache pool
- Add support for synchronous signals for select perf events. The
immediate motivation is to support low-overhead sampling-based race
detection for user-space code. The feature consists of the following
main changes:
- Add thread-only event inheritance via
perf_event_attr::inherit_thread, which limits inheritance of
events to CLONE_THREAD.
- Add the ability for events to not leak through exec(), via
perf_event_attr::remove_on_exec.
- Allow the generation of SIGTRAP via perf_event_attr::sigtrap,
extend siginfo with an u64 ::si_perf, and add the breakpoint
information to ::si_addr and ::si_perf if the event is
PERF_TYPE_BREAKPOINT.
The siginfo support is adequate for breakpoints right now - but the
new field can be used to introduce support for other types of
metadata passed over siginfo as well.
- Misc fixes, cleanups and smaller updates.
* tag 'perf-core-2021-04-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
signal, perf: Add missing TRAP_PERF case in siginfo_layout()
signal, perf: Fix siginfo_t by avoiding u64 on 32-bit architectures
perf/x86: Allow for 8<num_fixed_counters<16
perf/x86/rapl: Add support for Intel Alder Lake
perf/x86/cstate: Add Alder Lake CPU support
perf/x86/msr: Add Alder Lake CPU support
perf/x86/intel/uncore: Add Alder Lake support
perf: Extend PERF_TYPE_HARDWARE and PERF_TYPE_HW_CACHE
perf/x86/intel: Add Alder Lake Hybrid support
perf/x86: Support filter_match callback
perf/x86/intel: Add attr_update for Hybrid PMUs
perf/x86: Add structures for the attributes of Hybrid PMUs
perf/x86: Register hybrid PMUs
perf/x86: Factor out x86_pmu_show_pmu_cap
perf/x86: Remove temporary pmu assignment in event_init
perf/x86/intel: Factor out intel_pmu_check_extra_regs
perf/x86/intel: Factor out intel_pmu_check_event_constraints
perf/x86/intel: Factor out intel_pmu_check_num_counters
perf/x86: Hybrid PMU support for extra_regs
perf/x86: Hybrid PMU support for event constraints
...
This functionality is based on recently introduced sysfs attributes for
Intel® Xeon® Scalable processor family (code name Skylake-SP):
Commit bb42b3d397 ("perf/x86/intel/uncore: Expose an Uncore unit to IIO PMON mapping")
Mode is intended to provide four I/O performance metrics in MB per each
PCIe root port:
- Inbound Read: I/O devices below root port read from the host memory
- Inbound Write: I/O devices below root port write to the host memory
- Outbound Read: CPU reads from I/O devices below root port
- Outbound Write: CPU writes to I/O devices below root port
Each metric requiries only one uncore event which increments at every 4B
transfer in corresponding direction. The formulas to compute metrics
are generic:
#EventCount * 4B / (1024 * 1024)
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-4-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Introduce helper functions to control PCIe root ports list.
These helpers will be used in the follow-up patch.
Signed-off-by: Alexander Antonov <alexander.antonov@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey V Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210419094147.15909-3-alexander.antonov@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Turns out, the default setting of attr.aux_watermark to half of the total
buffer size is not very useful, especially with smaller buffers. The
problem is that, after half of the buffer is filled up, the kernel updates
->aux_head and sets up the next "transaction", while observing that
->aux_tail is still zero (as userspace haven't had the chance to update
it), meaning that the trace will have to stop at the end of this second
"transaction". This means, for example, that the second PERF_RECORD_AUX in
every trace comes with TRUNCATED flag set.
Setting attr.aux_watermark to quarter of the buffer gives enough space for
the ->aux_tail update to be observed and prevents the data loss.
The obligatory before/after showcase:
> # perf_before record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 6 times to write data ]
> Warning:
> AUX data lost 4 times out of 10!
>
> [ perf record: Captured and wrote 0.099 MB perf.data ]
> # perf record -e intel_pt//u -m,8 uname
> Linux
> [ perf record: Woken up 4 times to write data ]
> [ perf record: Captured and wrote 0.039 MB perf.data ]
The effect is still visible with large workloads and large buffers,
although less pronounced.
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20210414154955.49603-3-alexander.shishkin@linux.intel.com
In some versions of alpine Linux the perf build is broken since commit
1d509f2a6e ("x86/insn: Support big endian cross-compiles"):
In file included from /usr/include/linux/byteorder/little_endian.h:13,
from /usr/include/asm/byteorder.h:5,
from arch/x86/util/../../../../arch/x86/include/asm/insn.h:10,
from arch/x86/util/archinsn.c:2:
/usr/include/linux/swab.h:161:8: error: unknown type name '__always_inline'
static __always_inline __u16 __swab16p(const __u16 *p)
So move the inclusion of arch/x86/include/asm/insn.h to later in the
places where linux/stddef.h (that conditionally defines
__always_inline) to workaround this problem on Alpine Linux 3.9 to 3.11,
3.12 onwards works.
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The ins_lat of PERF_SAMPLE_WEIGHT_STRUCT stands for the instruction
latency, which is only available for X86. Add a X86 specific test for
the ins_lat and PERF_SAMPLE_WEIGHT_STRUCT type.
The test__x86_sample_parsing() uses the same way as the
test__sample_parsing() to verify a sample type. Since the ins_lat and
PERF_SAMPLE_WEIGHT_STRUCT are the only X86 specific sample type for now,
the test__x86_sample_parsing() only verify the PERF_SAMPLE_WEIGHT_STRUCT
type. Other sample types are still verified in the generic test.
$ perf test 77 -v
77: x86 Sample parsing :
--- start ---
test child forked, pid 102370
test child finished with 0
---- end ----
x86 Sample parsing: Ok
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/1614787285-104151-2-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
9caccd4154 ("fs: introduce MOUNT_ATTR_IDMAP")
This adds this new syscall to the tables used by tools such as 'perf
trace', so that one can specify it by name and have it filtered, etc.
Addressing these perf build warnings:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YD6Wsxr9ByUbab/a@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Retain the PIP packet payload as is, instead of just the CR3, because it
contains also the VMX NR flag which is needed to track VM-Entry.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In preparation to support Intel PT decoding of virtual machine traces, add
vmlaunch and vmresume as branch instructions.
Note, sample flags will show "VMentry" even if the VM-Entry fails.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For X86, the var2_w field of PERF_SAMPLE_WEIGHT_STRUCT stands for the
instruction latency. Current perf forces the var2_w to the data->ins_lat
in the generic code. It works well for now because X86 is the only
architecture that supports the PERF_SAMPLE_WEIGHT_STRUCT, but it may
bring problems once other architectures support the sample type. For
example, the var2_w may be used to capture something else on PowerPC.
Create two architecture specific functions to parse and synthesize the
weight related samples. Move the X86 specific codes to the X86 version
functions. Other architectures can implement their own functions later
separately.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/1612540912-6562-1-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. Users can apply either the
PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample
type to retrieve the sample weight, but they cannot apply both sample
types simultaneously.
The new sample type shares the same space as the PERF_SAMPLE_WEIGHT
sample type. The lower 32 bits are exactly the same for both sample
type. The higher 32 bits may be different for different architecture.
Add arch specific arch_evsel__set_sample_weight() to set the new sample
type for X86. Only store the lower 32 bits for the sample->weight if the
new sample type is applied. In practice, no memory access could last
than 4G cycles. No data will be lost.
If the kernel doesn't support the new sample type. Fall back to the
PERF_SAMPLE_WEIGHT sample type.
There is no impact for other architectures.
Committer notes:
Fixup related to PERF_SAMPLE_CODE_PAGE_SIZE, present in acme/perf/core
but not upstream yet.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1612296553-21962-6-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On the Intel Sapphire Rapids server, an auxiliary event has to be
enabled simultaneously with the load latency event to retrieve complete
Memory Info.
Add X86 specific perf_mem_events__name() to handle the auxiliary event.
- Users are only interested in the samples of the mem-loads event.
Sample read the auxiliary event.
- The auxiliary event must be in front of the load latency event in a
group. Assume the second event to sample if the auxiliary event is the
leader.
- Add a weak is_mem_loads_aux_event() to check the auxiliary event for
X86. For other ARCHs, it always return false.
Parse the unique event name, mem-loads-aux, for the auxiliary event.
Committer notes:
According to 61b985e3e7 ("perf/x86/intel: Add perf core PMU
support for Sapphire Rapids"), ENODATA is only returned by
sys_perf_event_open() when used with these auxiliary events, with this
in evsel__open_strerror():
case ENODATA:
return scnprintf(msg, size, "Cannot collect data source with the load latency event alone. "
"Please add an auxiliary event in front of the load latency event.");
This is Ok at this point in time, but fragile long term, I pointed this
out in the e-mail thread, requesting a follow up patch to check if
ENODATA is really for this specific case.
Fixed up sizeof(MEM_LOADS_AUX_NAME) bug pointed out by Namhyung.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20210205152648.GC920417@kernel.org
Link: http://lore.kernel.org/lkml/1612296553-21962-3-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The Topdown Microarchitecture Analysis (TMA) Method is a structured
analysis methodology to identify critical performance bottlenecks in
out-of-order processors. From the Ice Lake and later platforms, the
Topdown information can be retrieved from the dedicated "metrics"
register, which isn't impacted by other events. Also, the Topdown
metrics support both per thread/process and per core measuring. Adding
Topdown metrics events as default events can enrich the default
measuring information, and would not cost any extra multiplexing.
Introduce arch_evlist__add_default_attrs() to allow architecture
specific default events. Add the Topdown metrics events in the X86
specific arch_evlist__add_default_attrs(). Other architectures can add
their own default events later separately.
With the patch:
$ perf stat sleep 1
Performance counter stats for 'sleep 1':
0.82 msec task-clock:u # 0.001 CPUs utilized
0 context-switches:u # 0.000 K/sec
0 cpu-migrations:u # 0.000 K/sec
61 page-faults:u # 0.074 M/sec
319,941 cycles:u # 0.388 GHz
242,802 instructions:u # 0.76 insn per cycle
54,380 branches:u # 66.028 M/sec
4,043 branch-misses:u # 7.43% of all branches
1,585,555 slots:u # 1925.189 M/sec
238,941 topdown-retiring:u # 15.0% retiring
410,378 topdown-bad-spec:u # 25.8% bad speculation
634,222 topdown-fe-bound:u # 39.9% frontend bound
304,675 topdown-be-bound:u # 19.2% backend bound
1.001791625 seconds time elapsed
0.000000000 seconds user
0.001572000 seconds sys
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20210121133752.118327-1-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
b0a0c2615f ("epoll: wire up syscall epoll_pwait2")
That addresses these perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf_evlist__ is for 'struct perf_evlist' methods, in tools/lib/perf/,
go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This fix is for a failure that occurred in the DWARF unwind perf test.
Stack unwinders may probe memory when looking for frames.
Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.
This can lead to false memory sanitizer failures for the use of an
uninitialized value.
Avoid this problem by removing the poison on the copied stack.
The full msan failure with track origins looks like:
==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
#0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
#1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
#2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#24 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
#1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#23 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
#1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
#2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
#3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
#4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
#5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
#6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
#7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
#8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
#9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
#10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
#11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
#12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#25 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was stored to memory at
#0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
#1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
#2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
#3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
#4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
#5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
#6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
#7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
#8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
#9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
#10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
#11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
#12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
#13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
#14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
#15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
#16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
#17 0x559cea95fbce in main tools/perf/perf.c:539:3
Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
#0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445
SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
x86 arch provides the testing for conversion between tsc and perf time,
the testing is located in x86 arch folder. Move this testing out from
x86 arch folder and place it into the common testing folder, so allows
to execute tsc testing on other architectures (e.g. Arm64).
This patch removes the inclusion of "arch-tests.h" from the testing
code, this can avoid building failure if any arch has no this header
file.
Committer testing:
$ perf test -v tsc
Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
70: Convert perf time to TSC :
--- start ---
test child forked, pid 4032834
mmap size 528384B
1st event perf time 165409788843605 tsc 336578703793868
rdtsc time 165409788854986 tsc 336578703837038
2nd event perf time 165409788855487 tsc 336578703838935
test child finished with 0
---- end ----
Convert perf time to TSC: Ok
$
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20201019100236.23675-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
ecb8ac8b1f ("mm/madvise: introduce process_madvise() syscall: an external memory hinting API")
That addresses these perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
- cgroup improvements for 'perf stat', allowing for compact specification of events
and cgroups in the command line.
- Support per thread topdown metrics in 'perf stat'.
- Support sample-read topdown metric group in 'perf record'
- Show start of latency in addition to its start in 'perf sched latency'.
- Add min, max to 'perf script' futex-contention output, in addition to avg.
- Allow usage of 'perf_event_attr->exclusive' attribute via the new ':e' event
modifier.
- Add 'snapshot' command to 'perf record --control', using it with Intel PT.
- Support FIFO file names as alternative options to 'perf record --control'.
- Introduce branch history "streams", to compare 'perf record' runs with
'perf diff' based on branch records and report hot streams.
- Support PE executable symbol tables using libbfd, to profile, for instance, wine binaries.
- Add filter support for option 'perf ftrace -F/--funcs'.
- Allow configuring the 'disassembler_style' 'perf annotate' knob via 'perf config'
- Update CascadelakeX and SkylakeX JSON vendor events files.
- Add support for parsing perchip/percore JSON vendor events.
- Add power9 hv_24x7 core level metric events.
- Add L2 prefetch, ITLB instruction fetch hits JSON events for AMD zen1.
- Enable Family 19h users by matching Zen2 AMD vendor events.
- Use debuginfod in 'perf probe' when required debug files not found locally.
- Display negative tid in non-sample events in 'perf script'.
- Make GTK2 support opt-in
- Add build test with GTK+
- Add missing -lzstd to the fast path feature detection
- Add scripts to auto generate 'mmap', 'mremap' string<->id tables for use in 'perf trace'.
- Show python test script in verbose mode.
- Fix uncore metric expressions
- Msan uninitialized use fixes.
- Use condition variables in 'perf bench numa'
- Autodetect python3 binary in systems without python2.
- Support md5 build ids in addition to sha1.
- Add build id 'perf test' regression test.
- Fix printable strings in python3 scripts.
- Fix off by ones in 'perf trace' in arches using libaudit.
- Fix JSON event code for events referencing std arch events.
- Introduce 'perf test' shell script for Arm CoreSight testing.
- Add rdtsc() for Arm64 for used in the PERF_RECORD_TIME_CONV metadata
event and in 'perf test tsc'.
- 'perf c2c' improvements: Add "RMT Load Hit" metric, "Total Stores", fixes
and documentation update.
- Fix usage of reloc_sym in 'perf probe' when using both kallsyms and debuginfo files.
- Do not print 'Metric Groups:' unnecessarily in 'perf list'
- Refcounting fixes in the event parsing code.
- Add expand cgroup event 'perf test' entry.
- Fix out of bounds CPU map access when handling armv8_pmu events in 'perf stat'.
- Add build-id injection 'perf bench' benchmark.
- Enter namespace when reading build-id in 'perf inject'.
- Do not load map/dso when injecting build-id speeding up the 'perf inject' process.
- Add --buildid-all option to avoid processing all samples, just the mmap metadata events.
- Add feature test to check if libbfd has buildid support
- Add 'perf test' entry for PE binary format support.
- Fix typos in power8 PMU vendor events JSON files.
- Hide libtraceevent non API functions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Test results:
The first ones are container based builds of tools/perf with and without libelf
support. Where clang is available, it is also used to build perf with/without
libelf, and building with LIBCLANGLLVM=1 (built-in clang) with gcc and clang
when clang and its devel libraries are installed.
The objtool and samples/bpf/ builds are disabled now that I'm switching from
using the sources in a local volume to fetching them from a http server to
build it inside the container, to make it easier to build in a container cluster.
Those will come back later.
Several are cross builds, the ones with -x-ARCH and the android one, and those
may not have all the features built, due to lack of multi-arch devel packages,
available and being used so far on just a few, like
debian:experimental-x-{arm64,mipsel}.
The 'perf test' one will perform a variety of tests exercising
tools/perf/util/, tools/lib/{bpf,traceevent,etc}, as well as run perf commands
with a variety of command line event specifications to then intercept the
sys_perf_event syscall to check that the perf_event_attr fields are set up as
expected, among a variety of other unit tests.
Then there is the 'make -C tools/perf build-test' ones, that build tools/perf/
with a variety of feature sets, exercising the build with an incomplete set of
features as well as with a complete one. It is planned to have it run on each
of the containers mentioned above, using some container orchestration
infrastructure. Get in contact if interested in helping having this in place.
$ grep "model name" -m1 /proc/cpuinfo
model name: AMD Ryzen 9 3900X 12-Core Processor
$ export PERF_TARBALL=http://192.168.122.1/perf/perf-5.9.0-rc7.tar.xz
$ dm
Thu 15 Oct 2020 01:10:56 PM -03
1 67.40 alpine:3.4 : Ok gcc (Alpine 5.3.0) 5.3.0, clang version 3.8.0 (tags/RELEASE_380/final)
2 69.01 alpine:3.5 : Ok gcc (Alpine 6.2.1) 6.2.1 20160822, clang version 3.8.1 (tags/RELEASE_381/final)
3 70.79 alpine:3.6 : Ok gcc (Alpine 6.3.0) 6.3.0, clang version 4.0.0 (tags/RELEASE_400/final)
4 79.89 alpine:3.7 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.0 (tags/RELEASE_500/final) (based on LLVM 5.0.0)
5 80.88 alpine:3.8 : Ok gcc (Alpine 6.4.0) 6.4.0, Alpine clang version 5.0.1 (tags/RELEASE_501/final) (based on LLVM 5.0.1)
6 83.88 alpine:3.9 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 5.0.1 (tags/RELEASE_502/final) (based on LLVM 5.0.1)
7 107.87 alpine:3.10 : Ok gcc (Alpine 8.3.0) 8.3.0, Alpine clang version 8.0.0 (tags/RELEASE_800/final) (based on LLVM 8.0.0)
8 115.43 alpine:3.11 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 9.0.0 (https://git.alpinelinux.org/aports f7f0d2c2b8bcd6a5843401a9a702029556492689) (based on LLVM 9.0.0)
9 106.80 alpine:3.12 : Ok gcc (Alpine 9.3.0) 9.3.0, Alpine clang version 10.0.0 (https://gitlab.alpinelinux.org/alpine/aports.git 7445adce501f8473efdb93b17b5eaf2f1445ed4c)
10 114.06 alpine:edge : Ok gcc (Alpine 10.2.0) 10.2.0, Alpine clang version 10.0.1
11 70.42 alt:p8 : Ok x86_64-alt-linux-gcc (GCC) 5.3.1 20151207 (ALT p8 5.3.1-alt3.M80P.1), clang version 3.8.0 (tags/RELEASE_380/final)
12 98.70 alt:p9 : Ok x86_64-alt-linux-gcc (GCC) 8.4.1 20200305 (ALT p9 8.4.1-alt0.p9.1), clang version 10.0.0
13 80.37 alt:sisyphus : Ok x86_64-alt-linux-gcc (GCC) 9.3.1 20200518 (ALT Sisyphus 9.3.1-alt1), clang version 10.0.1
14 64.12 amazonlinux:1 : Ok gcc (GCC) 7.2.1 20170915 (Red Hat 7.2.1-2), clang version 3.6.2 (tags/RELEASE_362/final)
15 97.64 amazonlinux:2 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-9), clang version 7.0.1 (Amazon Linux 2 7.0.1-1.amzn2.0.2)
16 22.70 android-ndk:r12b-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
17 22.72 android-ndk:r15c-arm : Ok arm-linux-androideabi-gcc (GCC) 4.9.x 20150123 (prerelease)
18 26.70 centos:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23)
19 31.86 centos:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39)
20 113.19 centos:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5), clang version 9.0.1 (Red Hat 9.0.1-2.module_el8.2.0+309+0c7b6b03)
21 57.23 clearlinux:latest : Ok gcc (Clear Linux OS for Intel Architecture) 10.2.1 20200908 releases/gcc-10.2.0-203-g127d693955, clang version 10.0.1
22 64.98 debian:8 : Ok gcc (Debian 4.9.2-10+deb8u2) 4.9.2, Debian clang version 3.5.0-10 (tags/RELEASE_350/final) (based on LLVM 3.5.0)
23 76.08 debian:9 : Ok gcc (Debian 6.3.0-18+deb9u1) 6.3.0 20170516, clang version 3.8.1-24 (tags/RELEASE_381/final)
24 74.49 debian:10 : Ok gcc (Debian 8.3.0-6) 8.3.0, clang version 7.0.1-8+deb10u2 (tags/RELEASE_701/final)
25 78.50 debian:experimental : Ok gcc (Debian 10.2.0-15) 10.2.0, Debian clang version 11.0.0-2
26 33.30 debian:experimental-x-arm64 : Ok aarch64-linux-gnu-gcc (Debian 10.2.0-3) 10.2.0
27 30.96 debian:experimental-x-mips64 : Ok mips64-linux-gnuabi64-gcc (Debian 9.3.0-8) 9.3.0
28 32.63 debian:experimental-x-mipsel : Ok mipsel-linux-gnu-gcc (Debian 9.3.0-8) 9.3.0
29 30.12 fedora:20 : Ok gcc (GCC) 4.8.3 20140911 (Red Hat 4.8.3-7)
30 30.99 fedora:22 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.5.0 (tags/RELEASE_350/final)
31 68.60 fedora:23 : Ok gcc (GCC) 5.3.1 20160406 (Red Hat 5.3.1-6), clang version 3.7.0 (tags/RELEASE_370/final)
32 78.92 fedora:24 : Ok gcc (GCC) 6.3.1 20161221 (Red Hat 6.3.1-1), clang version 3.8.1 (tags/RELEASE_381/final)
33 26.15 fedora:24-x-ARC-uClibc : Ok arc-linux-gcc (ARCompact ISA Linux uClibc toolchain 2017.09-rc2) 7.1.1 20170710
34 80.13 fedora:25 : Ok gcc (GCC) 6.4.1 20170727 (Red Hat 6.4.1-1), clang version 3.9.1 (tags/RELEASE_391/final)
35 90.68 fedora:26 : Ok gcc (GCC) 7.3.1 20180130 (Red Hat 7.3.1-2), clang version 4.0.1 (tags/RELEASE_401/final)
36 90.45 fedora:27 : Ok gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6), clang version 5.0.2 (tags/RELEASE_502/final)
37 100.88 fedora:28 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 6.0.1 (tags/RELEASE_601/final)
38 105.99 fedora:29 : Ok gcc (GCC) 8.3.1 20190223 (Red Hat 8.3.1-2), clang version 7.0.1 (Fedora 7.0.1-6.fc29)
39 111.05 fedora:30 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 8.0.0 (Fedora 8.0.0-3.fc30)
40 29.96 fedora:30-x-ARC-glibc : Ok arc-linux-gcc (ARC HS GNU/Linux glibc toolchain 2019.03-rc1) 8.3.1 20190225
41 27.02 fedora:30-x-ARC-uClibc : Ok arc-linux-gcc (ARCv2 ISA Linux uClibc toolchain 2019.03-rc1) 8.3.1 20190225
42 110.47 fedora:31 : Ok gcc (GCC) 9.3.1 20200408 (Red Hat 9.3.1-2), clang version 9.0.1 (Fedora 9.0.1-2.fc31)
43 88.78 fedora:32 : Ok gcc (GCC) 10.2.1 20200723 (Red Hat 10.2.1-1), clang version 10.0.0 (Fedora 10.0.0-2.fc32)
44 15.92 fedora:rawhide : FAIL gcc (GCC) 10.2.1 20200916 (Red Hat 10.2.1-4), clang version 11.0.0 (Fedora 11.0.0-0.4.rc3.fc34)
45 33.58 gentoo-stage3-amd64:latest : Ok gcc (Gentoo 9.3.0-r1 p3) 9.3.0
46 65.32 mageia:5 : Ok gcc (GCC) 4.9.2, clang version 3.5.2 (tags/RELEASE_352/final)
47 81.35 mageia:6 : Ok gcc (Mageia 5.5.0-1.mga6) 5.5.0, clang version 3.9.1 (tags/RELEASE_391/final)
48 103.94 mageia:7 : Ok gcc (Mageia 8.4.0-1.mga7) 8.4.0, clang version 8.0.0 (Mageia 8.0.0-1.mga7)
49 91.62 manjaro:latest : Ok gcc (GCC) 10.2.0, clang version 10.0.1
50 219.87 openmandriva:cooker : Ok gcc (GCC) 10.2.0 20200723 (OpenMandriva), OpenMandriva 11.0.0-0.20200909.1 clang version 11.0.0 (/builddir/build/BUILD/llvm-project-release-11.x/clang 5cb8ffbab42358a7cdb0a67acfadb84df0779579)
51 111.76 opensuse:15.0 : Ok gcc (SUSE Linux) 7.4.1 20190905 [gcc-7-branch revision 275407], clang version 5.0.1 (tags/RELEASE_501/final 312548)
52 118.03 opensuse:15.1 : Ok gcc (SUSE Linux) 7.5.0, clang version 7.0.1 (tags/RELEASE_701/final 349238)
53 107.91 opensuse:15.2 : Ok gcc (SUSE Linux) 7.5.0, clang version 9.0.1
54 102.34 opensuse:tumbleweed : Ok gcc (SUSE Linux) 10.2.1 20200825 [revision c0746a1beb1ba073c7981eb09f55b3d993b32e5c], clang version 10.0.1
55 25.33 oraclelinux:6 : Ok gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-23.0.1)
56 30.45 oraclelinux:7 : Ok gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44.0.3)
57 104.65 oraclelinux:8 : Ok gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5.0.3), clang version 9.0.1 (Red Hat 9.0.1-2.0.1.module+el8.2.0+5599+9ed9ef6d)
58 26.04 ubuntu:12.04 : Ok gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3, Ubuntu clang version 3.0-6ubuntu3 (tags/RELEASE_30/final) (based on LLVM 3.0)
59 29.49 ubuntu:14.04 : Ok gcc (Ubuntu 4.8.4-2ubuntu1~14.04.4) 4.8.4
60 72.95 ubuntu:16.04 : Ok gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
61 26.03 ubuntu:16.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
62 25.15 ubuntu:16.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
63 24.88 ubuntu:16.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
64 25.72 ubuntu:16.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
65 25.39 ubuntu:16.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu/IBM 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
66 25.34 ubuntu:16.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609
67 84.84 ubuntu:18.04 : Ok gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, clang version 6.0.0-1ubuntu2 (tags/RELEASE_600/final)
68 27.15 ubuntu:18.04-x-arm : Ok arm-linux-gnueabihf-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
69 26.68 ubuntu:18.04-x-arm64 : Ok aarch64-linux-gnu-gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
70 22.38 ubuntu:18.04-x-m68k : Ok m68k-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
71 26.35 ubuntu:18.04-x-powerpc : Ok powerpc-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
72 28.58 ubuntu:18.04-x-powerpc64 : Ok powerpc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
73 28.18 ubuntu:18.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
74 178.55 ubuntu:18.04-x-riscv64 : Ok riscv64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
75 24.58 ubuntu:18.04-x-s390 : Ok s390x-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
76 26.89 ubuntu:18.04-x-sh4 : Ok sh4-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
77 24.81 ubuntu:18.04-x-sparc64 : Ok sparc64-linux-gnu-gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
78 68.90 ubuntu:19.10 : Ok gcc (Ubuntu 9.2.1-9ubuntu2) 9.2.1 20191008, clang version 8.0.1-3build1 (tags/RELEASE_801/final)
79 69.31 ubuntu:20.04 : Ok gcc (Ubuntu 9.3.0-10ubuntu2) 9.3.0, clang version 10.0.0-4ubuntu1
80 30.00 ubuntu:20.04-x-powerpc64el : Ok powerpc64le-linux-gnu-gcc (Ubuntu 10-20200411-0ubuntu1) 10.0.1 20200411 (experimental) [master revision bb87d5cc77d:75961caccb7:f883c46b4877f637e0fa5025b4d6b5c9040ec566]
81 70.34 ubuntu:20.10 : Ok gcc (Ubuntu 10.2.0-5ubuntu2) 10.2.0, Ubuntu clang version 10.0.1-1
$
# uname -a
Linux five 5.9.0+ #1 SMP Thu Oct 15 09:06:41 -03 2020 x86_64 x86_64 x86_64 GNU/Linux
# git log --oneline -1
744aec4df2 perf c2c: Update documentation for metrics reorganization
# perf version --build-options
perf version 5.9.rc7.g744aec4df2c5
dwarf: [ on ] # HAVE_DWARF_SUPPORT
dwarf_getlocations: [ on ] # HAVE_DWARF_GETLOCATIONS_SUPPORT
glibc: [ on ] # HAVE_GLIBC_SUPPORT
syscall_table: [ on ] # HAVE_SYSCALL_TABLE_SUPPORT
libbfd: [ on ] # HAVE_LIBBFD_SUPPORT
libelf: [ on ] # HAVE_LIBELF_SUPPORT
libnuma: [ on ] # HAVE_LIBNUMA_SUPPORT
numa_num_possible_cpus: [ on ] # HAVE_LIBNUMA_SUPPORT
libperl: [ on ] # HAVE_LIBPERL_SUPPORT
libpython: [ on ] # HAVE_LIBPYTHON_SUPPORT
libslang: [ on ] # HAVE_SLANG_SUPPORT
libcrypto: [ on ] # HAVE_LIBCRYPTO_SUPPORT
libunwind: [ on ] # HAVE_LIBUNWIND_SUPPORT
libdw-dwarf-unwind: [ on ] # HAVE_DWARF_SUPPORT
zlib: [ on ] # HAVE_ZLIB_SUPPORT
lzma: [ on ] # HAVE_LZMA_SUPPORT
get_cpuid: [ on ] # HAVE_AUXTRACE_SUPPORT
bpf: [ on ] # HAVE_LIBBPF_SUPPORT
aio: [ on ] # HAVE_AIO_SUPPORT
zstd: [ on ] # HAVE_ZSTD_SUPPORT
# perf test
1: vmlinux symtab matches kallsyms : Ok
2: Detect openat syscall event : Ok
3: Detect openat syscall event on all cpus : Ok
4: Read samples using the mmap interface : Ok
5: Test data source output : Ok
6: Parse event definition strings : Ok
7: Simple expression parser : Ok
8: PERF_RECORD_* events & perf_sample fields : Ok
9: Parse perf pmu format : Ok
10: PMU events :
10.1: PMU event table sanity : Ok
10.2: PMU event map aliases : Ok
10.3: Parsing of PMU event table metrics : Ok
10.4: Parsing of PMU event table metrics with fake PMUs : Ok
11: DSO data read : Ok
12: DSO data cache : Ok
13: DSO data reopen : Ok
14: Roundtrip evsel->name : Ok
15: Parse sched tracepoints fields : Ok
16: syscalls:sys_enter_openat event fields : Ok
17: Setup struct perf_event_attr : Ok
18: Match and link multiple hists : Ok
19: 'import perf' in python : Ok
20: Breakpoint overflow signal handler : Ok
21: Breakpoint overflow sampling : Ok
22: Breakpoint accounting : Ok
23: Watchpoint :
23.1: Read Only Watchpoint : Skip
23.2: Write Only Watchpoint : Ok
23.3: Read / Write Watchpoint : Ok
23.4: Modify Watchpoint : Ok
24: Number of exit events of a simple workload : Ok
25: Software clock events period values : Ok
26: Object code reading : Ok
27: Sample parsing : Ok
28: Use a dummy software event to keep tracking : Ok
29: Parse with no sample_id_all bit set : Ok
30: Filter hist entries : Ok
31: Lookup mmap thread : Ok
32: Share thread maps : Ok
33: Sort output of hist entries : Ok
34: Cumulate child hist entries : Ok
35: Track with sched_switch : Ok
36: Filter fds with revents mask in a fdarray : Ok
37: Add fd to a fdarray, making it autogrow : Ok
38: kmod_path__parse : Ok
39: Thread map : Ok
40: LLVM search and compile :
40.1: Basic BPF llvm compile : Ok
40.2: kbuild searching : Ok
40.3: Compile source for BPF prologue generation : Ok
40.4: Compile source for BPF relocation : Ok
41: Session topology : Ok
42: BPF filter :
42.1: Basic BPF filtering : Ok
42.2: BPF pinning : Ok
42.3: BPF prologue generation : Ok
42.4: BPF relocation checker : Ok
43: Synthesize thread map : Ok
44: Remove thread map : Ok
45: Synthesize cpu map : Ok
46: Synthesize stat config : Ok
47: Synthesize stat : Ok
48: Synthesize stat round : Ok
49: Synthesize attr update : Ok
50: Event times : Ok
51: Read backward ring buffer : Ok
52: Print cpu map : Ok
53: Merge cpu map : Ok
54: Probe SDT events : Ok
55: is_printable_array : Ok
56: Print bitmap : Ok
57: perf hooks : Ok
58: builtin clang support : Skip (not compiled in)
59: unit_number__scnprintf : Ok
60: mem2node : Ok
61: time utils : Ok
62: Test jit_write_elf : Ok
63: Test libpfm4 support : Skip (not compiled in)
64: Test api io : Ok
65: maps__merge_in : Ok
66: Demangle Java : Ok
67: Parse and process metrics : Ok
68: PE file support : Ok
69: Event expansion for cgroups : Ok
70: x86 rdpmc : Ok
71: Convert perf time to TSC : Ok
72: DWARF unwind : Ok
73: x86 instruction decoder - new instructions : Ok
74: Intel PT packet decoder : Ok
75: x86 bp modify : Ok
76: probe libc's inet_pton & backtrace it with ping : Ok
77: Check Arm CoreSight trace data recording and synthesized samples: Skip
78: Use vfs_getname probe to get syscall args filenames : Ok
79: Check open filename arg using perf trace + vfs_getname : Ok
80: Zstd perf.data compression/decompression : Ok
81: Add vfs_getname probe to get syscall args filenames : Ok
82: build id cache operations : Ok
#
$ git log --oneline -1
744aec4df2 (HEAD -> perf/core, quaco/perf/core) perf c2c: Update documentation for metrics reorganization
$ make -C tools/perf build-test
make: Entering directory '/home/acme/git/perf/tools/perf'
- tarpkg: ./tests/perf-targz-src-pkg .
make_install_bin_O: make install-bin
make_static_O: make LDFLAGS=-static NO_PERF_READ_VDSO32=1 NO_PERF_READ_VDSOX32=1 NO_JVMTI=1
make_no_libdw_dwarf_unwind_O: make NO_LIBDW_DWARF_UNWIND=1
make_no_newt_O: make NO_NEWT=1
make_no_libbionic_O: make NO_LIBBIONIC=1
make_no_sdt_O: make NO_SDT=1
make_debug_O: make DEBUG=1
make_perf_o_O: make perf.o
make_no_libbpf_O: make NO_LIBBPF=1
make_no_libbpf_DEBUG_O: make NO_LIBBPF=1 DEBUG=1
make_clean_all_O: make clean all
make_tags_O: make tags
make_with_babeltrace_O: make LIBBABELTRACE=1
make_with_clangllvm_O: make LIBCLANGLLVM=1
make_no_scripts_O: make NO_LIBPYTHON=1 NO_LIBPERL=1
make_no_libelf_O: make NO_LIBELF=1
make_no_libcrypto_O: make NO_LIBCRYPTO=1
make_with_libpfm4_O: make LIBPFM4=1
make_no_libunwind_O: make NO_LIBUNWIND=1
make_util_map_o_O: make util/map.o
make_no_slang_O: make NO_SLANG=1
make_with_gtk2_O: make GTK2=1
make_no_ui_O: make NO_NEWT=1 NO_SLANG=1 NO_GTK2=1
make_util_pmu_bison_o_O: make util/pmu-bison.o
make_no_backtrace_O: make NO_BACKTRACE=1
make_no_demangle_O: make NO_DEMANGLE=1
make_help_O: make help
make_pure_O: make
make_no_gtk2_O: make NO_GTK2=1
make_install_prefix_O: make install prefix=/tmp/krava
make_no_libnuma_O: make NO_LIBNUMA=1
make_no_libpython_O: make NO_LIBPYTHON=1
make_install_prefix_slash_O: make install prefix=/tmp/krava/
make_no_libaudit_O: make NO_LIBAUDIT=1
make_no_auxtrace_O: make NO_AUXTRACE=1
make_minimal_O: make NO_LIBPERL=1 NO_LIBPYTHON=1 NO_NEWT=1 NO_GTK2=1 NO_DEMANGLE=1 NO_LIBELF=1 NO_LIBUNWIND=1 NO_BACKTRACE=1 NO_LIBNUMA=1 NO_LIBAUDIT=1 NO_LIBBIONIC=1 NO_LIBDW_DWARF_UNWIND=1 NO_AUXTRACE=1 NO_LIBBPF=1 NO_LIBCRYPTO=1 NO_SDT=1 NO_JVMTI=1 NO_LIBZSTD=1 NO_LIBCAP=1 NO_SYSCALL_TABLE=1
make_install_O: make install
make_doc_O: make doc
make_no_libperl_O: make NO_LIBPERL=1
make_no_syscall_tbl_O: make NO_SYSCALL_TABLE=1
OK
make: Leaving directory '/home/acme/git/perf/tools/perf'
$
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCX4iuzgAKCRCyPKLppCJ+
J1khAP4iMQMFCMpNsBaL6KLtj3aTOhrooYuhbNL3kajqYVyW/QD8Dws35k6m2+tB
tcOMJykFjPkQ4I13zsxKyugeJuUzSQw=
=KdSj
-----END PGP SIGNATURE-----
Merge tag 'perf-tools-for-v5.10-2020-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tools updates from Arnaldo Carvalho de Melo:
- cgroup improvements for 'perf stat', allowing for compact
specification of events and cgroups in the command line.
- Support per thread topdown metrics in 'perf stat'.
- Support sample-read topdown metric group in 'perf record'
- Show start of latency in addition to its start in 'perf sched
latency'.
- Add min, max to 'perf script' futex-contention output, in addition to
avg.
- Allow usage of 'perf_event_attr->exclusive' attribute via the new
':e' event modifier.
- Add 'snapshot' command to 'perf record --control', using it with
Intel PT.
- Support FIFO file names as alternative options to 'perf record
--control'.
- Introduce branch history "streams", to compare 'perf record' runs
with 'perf diff' based on branch records and report hot streams.
- Support PE executable symbol tables using libbfd, to profile, for
instance, wine binaries.
- Add filter support for option 'perf ftrace -F/--funcs'.
- Allow configuring the 'disassembler_style' 'perf annotate' knob via
'perf config'
- Update CascadelakeX and SkylakeX JSON vendor events files.
- Add support for parsing perchip/percore JSON vendor events.
- Add power9 hv_24x7 core level metric events.
- Add L2 prefetch, ITLB instruction fetch hits JSON events for AMD
zen1.
- Enable Family 19h users by matching Zen2 AMD vendor events.
- Use debuginfod in 'perf probe' when required debug files not found
locally.
- Display negative tid in non-sample events in 'perf script'.
- Make GTK2 support opt-in
- Add build test with GTK+
- Add missing -lzstd to the fast path feature detection
- Add scripts to auto generate 'mmap', 'mremap' string<->id tables for
use in 'perf trace'.
- Show python test script in verbose mode.
- Fix uncore metric expressions
- Msan uninitialized use fixes.
- Use condition variables in 'perf bench numa'
- Autodetect python3 binary in systems without python2.
- Support md5 build ids in addition to sha1.
- Add build id 'perf test' regression test.
- Fix printable strings in python3 scripts.
- Fix off by ones in 'perf trace' in arches using libaudit.
- Fix JSON event code for events referencing std arch events.
- Introduce 'perf test' shell script for Arm CoreSight testing.
- Add rdtsc() for Arm64 for used in the PERF_RECORD_TIME_CONV metadata
event and in 'perf test tsc'.
- 'perf c2c' improvements: Add "RMT Load Hit" metric, "Total Stores",
fixes and documentation update.
- Fix usage of reloc_sym in 'perf probe' when using both kallsyms and
debuginfo files.
- Do not print 'Metric Groups:' unnecessarily in 'perf list'
- Refcounting fixes in the event parsing code.
- Add expand cgroup event 'perf test' entry.
- Fix out of bounds CPU map access when handling armv8_pmu events in
'perf stat'.
- Add build-id injection 'perf bench' benchmark.
- Enter namespace when reading build-id in 'perf inject'.
- Do not load map/dso when injecting build-id speeding up the 'perf
inject' process.
- Add --buildid-all option to avoid processing all samples, just the
mmap metadata events.
- Add feature test to check if libbfd has buildid support
- Add 'perf test' entry for PE binary format support.
- Fix typos in power8 PMU vendor events JSON files.
- Hide libtraceevent non API functions.
* tag 'perf-tools-for-v5.10-2020-10-15' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (113 commits)
perf c2c: Update documentation for metrics reorganization
perf c2c: Add metrics "RMT Load Hit"
perf c2c: Correct LLC load hit metrics
perf c2c: Change header for LLC local hit
perf c2c: Use more explicit headers for HITM
perf c2c: Change header from "LLC Load Hitm" to "Load Hitm"
perf c2c: Organize metrics based on memory hierarchy
perf c2c: Display "Total Stores" as a standalone metrics
perf c2c: Display the total numbers continuously
perf bench: Use condition variables in numa.
perf jevents: Fix event code for events referencing std arch events
perf diff: Support hot streams comparison
perf streams: Report hot streams
perf streams: Calculate the sum of total streams hits
perf streams: Link stream pair
perf streams: Compare two streams
perf streams: Get the evsel_streams by evsel_idx
perf streams: Introduce branch history "streams"
perf intel-pt: Improve PT documentation slightly
perf tools: Add support for exclusive groups/events
...
Now that import_iovec handles compat iovecs, the native syscalls
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that import_iovec handles compat iovecs, the native vmsplice syscall
can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Now that import_iovec handles compat iovecs, the native readv and writev
syscalls can be used for the compat case as well.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Functions perf_read_tsc_conversion() and perf_event__synth_time_conv()
should work as common functions rather than x86 specific, so move these
two functions out from arch/x86 folder and place them into util/tsc.c.
Since the function perf_event__synth_time_conv() will be linked in
util/tsc.c, remove its weak version.
Committer testing:
Before/after:
# perf test tsc
70: Convert perf time to TSC : Ok
#
# perf test -v tsc
70: Convert perf time to TSC :
--- start ---
test child forked, pid 8520
mmap size 528384B
1st event perf time 592110439891 tsc 2317172044331
rdtsc time 592110441915 tsc 2317172052010
2nd event perf time 592110442336 tsc 2317172053605
test child finished with 0
---- end ----
Convert perf time to TSC: Ok
#
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kemeng Shi <shikemeng@huawei.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Gasson <nick.gasson@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Steve Maclean <steve.maclean@microsoft.com>
Cc: Will Deacon <will@kernel.org>
Cc: Zou Wei <zou_wei@huawei.com>
Link: http://lore.kernel.org/lkml/20200914115311.2201-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the hardware TopDown metrics feature, sample-read feature should be
supported for a topdown group, e.g., sample a non-topdown event and read
a topdown metric group. But the current perf record code errors out.
For a topdown metric group, the slots event must be the leader of the
group, but the leader slots event doesn't support sampling.
To support sample-read the topdown metric group, use the 2nd event of
the group as the "leader" for the purposes of sampling.
Only the platform with Topdown metic feature supports sample-read the
topdown group. Add arch_topdown_sample_read() to indicate whether the
topdown group supports sample-read.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-3-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The group.h/c only include TopDown group related functions. The name
"group" is too generic and inaccurate. Use the name "topdown" to replace
it.
Move topdown related functions to a dedicated file, topdown.c.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200911144808.27603-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When we use the 'intel' disassembler style we get 'ret' instead of
'retq', so add that as an alias.
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > before
Apply this patch and then:
# perf annotate --disassembler-style=intel --stdio2 acpi_processor_ffh_cstate_enter > after
# diff -u before after
--- before 2020-09-04 14:10:47.768414634 -0300
+++ after 2020-09-04 14:10:59.116681039 -0300
@@ -33,7 +33,7 @@
test al,0x8
↓ je 97
and DWORD PTR gs:[rip+0x7e548509],0x7fffffff
- 97: ret
+ 97: ← ret
mov rax,QWORD PTR gs:0x17bc0
lock or BYTE PTR [rax+0x2],0x20
mov rax,QWORD PTR [rax]
#
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Martin Liška <mliska@suse.cz>
Cc: Matt P. Dziubinski <matdzb@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.vnet.ibm.com>
Cc: Thomas Richter <tmricht@linux.vnet.ibm.com>
Cc: Wang Nan <wangnan0@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Since commit 61a47c1ad3 ("sysctl: Remove the sysctl system call"),
sys_sysctl is actually unavailable: any input can only return an error.
We have been warning about people using the sysctl system call for years
and believe there are no more users. Even if there are users of this
interface if they have not complained or fixed their code by now they
probably are not going to, so there is no point in warning them any
longer.
So completely remove sys_sysctl on all architectures.
[nixiaoming@huawei.com: s390: fix build error for sys_call_table_emu]
Link: http://lkml.kernel.org/r/20200618141426.16884-1-nixiaoming@huawei.com
Signed-off-by: Xiaoming Ni <nixiaoming@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Acked-by: Will Deacon <will@kernel.org> [arm/arm64]
Acked-by: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Bin Meng <bin.meng@windriver.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: chenzefeng <chenzefeng2@huawei.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christian Brauner <christian@brauner.io>
Cc: Chris Zankel <chris@zankel.net>
Cc: David Howells <dhowells@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Diego Elio Pettenò <flameeyes@flameeyes.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Dominik Brodowski <linux@dominikbrodowski.net>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Iurii Zaikin <yzaikin@google.com>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kars de Jong <jongk@linux-m68k.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin K. Petersen <martin.petersen@oracle.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Piggin <npiggin@gmail.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Paul Burton <paulburton@kernel.org>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Sven Schnelle <svens@stackframe.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Zhou Yanjie <zhouyanjie@wanyeetech.com>
Link: http://lkml.kernel.org/r/20200616030734.87257-1-nixiaoming@huawei.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
To pick the changes from:
55db9c0e85 ("net: remove compat_sys_{get,set}sockopt")
9b4feb630e ("arch: wire-up close_range()")
That automagically add the 'close_range' syscall to tools such as 'perf
trace'.
Before:
# perf trace -e close_range
event syntax error: 'close_range'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
#
After, system wide strace like tracing for this syscall:
# perf trace -e close_range
^C#
No calls, I need some test proggie :-)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: David S. Miller <davem@davemloft.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To sync headers, for instance, in this case tools/perf was ahead of
upstream till Linus merged tip/perf/core to get the
PERF_RECORD_TEXT_POKE changes:
Warning: Kernel ABI header at 'tools/include/uapi/linux/perf_event.h' differs from latest version at 'include/uapi/linux/perf_event.h'
diff -u tools/include/uapi/linux/perf_event.h include/uapi/linux/perf_event.h
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that the ->compat_{get,set}sockopt proto_ops methods are gone
there is no good reason left to keep the compat syscalls separate.
This fixes the odd use of unsigned int for the compat_setsockopt
optlen and the missing sock_use_custom_sol_socket.
It would also easily allow running the eBPF hooks for the compat
syscalls, but such a large change in behavior does not belong into
a consolidation patch like this one.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
Select text poke events when available and the kernel is being traced.
Process text poke events to invalidate entries in Intel PT's instruction
cache.
Example:
The example requires kernel config:
CONFIG_PROC_SYSCTL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
Before:
# perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.341 MB perf.data.before ]
[1]+ Terminated perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.before --itrace=e >/dev/null
Warning:
474 instruction trace errors
After:
# perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.646 MB perf.data.after ]
[1]+ Terminated perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.after --itrace=e >/dev/null
Example:
The example requires kernel config:
# CONFIG_FUNCTION_TRACER is not set
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (68 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.268 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
207 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (107 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 39.978 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
6 565303693547 0x291f18 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027a000 len 4096 type 2 flags 0x0 name kprobe_insn_page
6 565303697010 0x291f68 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 0 new len 6
6 565303838278 0x291fa8 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027c000 len 4096 type 2 flags 0x0 name kprobe_optinsn_page
6 565303848286 0x291ff8 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 0 new len 106
6 565369336743 0x292af8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
7 566434327704 0x217c208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
6 566456313475 0x293198 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 106 new len 0
6 566456314935 0x293238 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 6 new len 0
Example:
The example requires kernel config:
CONFIG_FUNCTION_TRACER=y
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (6 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 43.850 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
8 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (206 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.442 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
5 312216133258 0x8bafe0 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc0360000 len 415 type 2 flags 0x0 name ftrace_trampoline
5 312216133494 0x8bb030 [0x1d8]: PERF_RECORD_TEXT_POKE addr 0xffffffffc0360000 old len 0 new len 415
5 312216229563 0x8bb208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216239063 0x8bb248 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
5 312216727230 0x8bb288 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
5 312216739322 0x8bb2c8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216748321 0x8bb308 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287163462 0x2817430 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287174890 0x2817470 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287818979 0x28174b0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
7 313287829357 0x28174f0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287841246 0x2817530 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When recording PEBS-via-PT, the kernel will not accept the intel_pt
event with register sampling e.g.
# perf record --kcore -c 10000 -e '{intel_pt/branch=0/,branch-loads/aux-output/ppp}' -I -- ls -l
Error:
intel_pt/branch=0/: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'
Fix by suppressing register sampling on the intel_pt evsel.
Committer notes:
Adrian informed that this is only available from Tremont onwards, so on
older processors the error continues the same as before.
Fixes: 9e64cefe43 ("perf intel-pt: Process options for PEBS event synthesis")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Update the copies of files affected by:
c8ffd8bcdd ("vfs: add faccessat2 syscall")
To address this perf build warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/fcntl.h' differs from latest version at 'include/uapi/linux/fcntl.h'
diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Which results in 'perf trace' gaining support for the 'faccessat2'
syscall, now one can use:
# perf trace -e faccessat2
And have system wide tracing of this syscall. And this also will include
it;
# perf trace -e faccess*
Together with the other variants.
How it affects building/usage (on an x86_64 system):
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c.before
$
[root@five ~]# perf trace -e faccessat2
event syntax error: 'faccessat2'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf trace [<options>] [<command>]
or: perf trace [<options>] -- <command> [<options>]
or: perf trace record [<options>] [<command>]
or: perf trace record [<options>] -- <command> [<options>]
-e, --event <event> event/syscall selector. use 'perf list' to list available events
[root@five ~]#
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ git diff
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index 37b844f839bc..78847b32e137 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -359,6 +359,7 @@
435 common clone3 sys_clone3
437 common openat2 sys_openat2
438 common pidfd_getfd sys_pidfd_getfd
+439 common faccessat2 sys_faccessat2
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
$ make -C tools/perf O=/tmp/build/perf/ install-bin
<SNIP>
CC /tmp/build/perf/util/syscalltbl.o
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
<SNIP>
[root@five ~]# perf trace -e faccessat2
^C[root@five ~]#
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
This is currently working due to extra include paths in the build.
Before:
$ cd tools/perf/arch/arm64/util
$ ls -la ../../util/unwind-libdw.h
ls: cannot access '../../util/unwind-libdw.h': No such file or directory
After:
$ ls -la ../../../util/unwind-libdw.h
-rw-r----- 1 irogers irogers 553 Apr 17 14:31 ../../../util/unwind-libdw.h
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200529225232.207532-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Avoid a false positive caused by assembly code in arch/x86.
In tests, zero the perf_event to avoid uninitialized memory uses.
Warnings were caught using clang with -fsanitize=memory.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quentin Monnet <quentin@isovalent.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: clang-built-linux@googlegroups.com
Link: http://lore.kernel.org/lkml/20200530082015.39162-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Stop the message displaying when user space is not being traced.
Example:
Prerequisites:
sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
sudo chmod +r /proc/kcore
Before:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
Warning:
Intel Processor Trace decoding will not be possible except for kernel tracing!
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.838 MB perf.data ]
After:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 1.068 MB perf.data ]
$ sudo chmod go-r /proc/kcore
$ sudo setcap -r ~/bin/perf
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Context switch events are added automatically by Intel PT and Coresight.
Make it possible to suppress them. That is useful for tracing the
scheduler without the disturbance that the switch event processing
creates.
Example:
Prerequisites:
$ which perf
~/bin/perf
$ sudo setcap "cap_sys_rawio,cap_sys_admin,cap_sys_ptrace,cap_syslog,cap_ipc_lock=ep" ~/bin/perf
$ sudo chmod +r /proc/kcore
Before:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.938 MB perf.data ]
$ perf script -D | grep PERF_RECORD_SWITCH | wc -l
572
After:
$ perf record --no-switch-events --kcore -a -e intel_pt//k -- sleep 0.001
Warning:
Intel Processor Trace decoding will not be possible except for kernel tracing!
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.838 MB perf.data ]
$ perf script -D | grep PERF_RECORD_SWITCH | wc -l
0
$ sudo chmod go-r /proc/kcore
$ sudo setcap -r ~/bin/perf
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Link: http://lore.kernel.org/lkml/20200528120859.21604-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
On a CPU like skylakex an uncore_iio_0 PMU may alias with
uncore_iio_free_running_0. The latter PMU doesn't support fc_mask as a
parameter and so pmu_config_term fails. Typically parse_events_add_pmu
is called in a loop where if one alias succeeds errors are ignored,
however, if multiple errors occur parse_events__handle_error will
currently give a WARN_ONCE.
This change removes the WARN_ONCE in parse_events__handle_error and
makes it a pr_debug. It adds verbose messages to parse_events_add_pmu
warning that non-fatal errors may occur, while giving details on the pmu
and config terms for useful context. pmu_config_term is altered so the
failing term and pmu are present in the case of the 'unknown term' error
which makes spotting the free_running case more straightforward.
Before:
$ perf --debug verbose=3 stat -M llc_misses.pcie_read sleep 1
Using CPUID GenuineIntel-6-55-4
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
adding {unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W,{unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
WARNING: multiple event parsing errors
...
Invalid event/parameter 'fc_mask'
...
After:
$ perf --debug verbose=3 stat -M llc_misses.pcie_read sleep 1
Using CPUID GenuineIntel-6-55-4
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
metric expr unc_iio_data_req_of_cpu.mem_read.part0 + unc_iio_data_req_of_cpu.mem_read.part1 + unc_iio_data_req_of_cpu.mem_read.part2 + unc_iio_data_req_of_cpu.mem_read.part3 for LLC_MISSES.PCIE_READ
found event unc_iio_data_req_of_cpu.mem_read.part0
found event unc_iio_data_req_of_cpu.mem_read.part1
found event unc_iio_data_req_of_cpu.mem_read.part2
found event unc_iio_data_req_of_cpu.mem_read.part3
adding {unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W,{unc_iio_data_req_of_cpu.mem_read.part0,unc_iio_data_req_of_cpu.mem_read.part1,unc_iio_data_req_of_cpu.mem_read.part2,unc_iio_data_req_of_cpu.mem_read.part3}:W
intel_pt default config: tsc,mtc,mtc_period=3,psb_period=3,pt,branch
Attempting to add event pmu 'uncore_iio_free_running_5' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_5' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Attempting to add event pmu 'uncore_iio_free_running_3' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_3' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Attempting to add event pmu 'uncore_iio_free_running_1' with 'unc_iio_data_req_of_cpu.mem_read.part0,' that may result in non-fatal errors
After aliases, add event pmu 'uncore_iio_free_running_1' with 'fc_mask,ch_mask,umask,event,' that may result in non-fatal errors
Multiple errors dropping message: unknown term 'fc_mask' for pmu 'uncore_iio_free_running_3' (valid terms: event,umask,config,config1,config2,name,period,percore)
...
So before you see a 'WARNING: multiple event parsing errors' and
'Invalid event/parameter'. After you see 'Attempting... that may result
in non-fatal errors' then 'Multiple errors...' with details that
'fc_mask' wasn't known to a free running counter. While not completely
clean, this makes it clearer that an error hasn't really occurred.
v2. addresses review feedback from Jiri Olsa <jolsa@redhat.com>.
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lore.kernel.org/lkml/20200513220635.54700-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As it is a 'struct evsel' method, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As these are 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As those are not 'struct evsel' methods, not part of tools/lib/perf/,
aka libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As they are not 'struct evsel' methods, not part of tools/lib/perf/, aka
libperf, to whom the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trying to disentangle this a bit further, unfortunately it uses
parse_events(), its interesting to have it separated anyway, so do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To pick the changes from:
d3b1b776ee ("x86/entry/64: Remove ptregs qualifier from syscall table")
cab56d3484 ("x86/entry: Remove ABI prefixes from functions in syscall tables")
27dd84fafc ("x86/entry/64: Use syscall wrappers for x32_rt_sigreturn")
Addressing this tools/perf build warning:
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
That didn't result in any tooling changes, as what is extracted are just
the first two columns, and these patches touched only the third.
$ cp /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
$ make -C tools/perf O=/tmp/build/perf install-bin
make: Entering directory '/home/acme/git/perf/tools/perf'
BUILD: Doing 'make -j12' parallel build
DESCEND plugins
CC /tmp/build/perf/util/syscalltbl.o
INSTALL trace_plugins
LD /tmp/build/perf/util/perf-in.o
LD /tmp/build/perf/perf-in.o
LINK /tmp/build/perf/perf
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c /tmp/syscalls_64.c
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
incsspd
incsspq
rdsspd
rdsspq
saveprevssp
rstorssp
wrssd
wrssq
wrussd
wrussq
setssbsy
clrssbsy
endbr32
endbr64
And the notrack prefix for indirect calls and jumps.
For information about the instructions, refer Intel Control-flow
Enforcement Technology Specification May 2019 (334525-003).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Link: https://lkml.kernel.org/r/20200204171425.28073-3-yu-cheng.yu@intel.com
This is currently working due to extra include paths in the build.
Committer testing:
$ cd tools/include/uapi/asm/
Before this patch:
$ ls -la ../../arch/x86/include/uapi/asm/errno.h
ls: cannot access '../../arch/x86/include/uapi/asm/errno.h': No such file or directory
$
After this patch;
$ ls -la ../../../arch/x86/include/uapi/asm/errno.h
-rw-rw-r--. 1 acme acme 31 Feb 20 12:42 ../../../arch/x86/include/uapi/asm/errno.h
$
Check that that is still under tools/, i.e. hasn't escaped into the main
kernel sources:
$ cd ../../../arch/x86/include/uapi/asm/
$ pwd
/home/acme/git/perf/tools/arch/x86/include/uapi/asm
$
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexios Zavras <alexios.zavras@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Igor Lubashev <ilubashe@akamai.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Wei Li <liwei391@huawei.com>
Link: http://lore.kernel.org/lkml/20200306071110.130202-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All ->read_finish() implementations are doing the same thing. Add a
helper function so that they can share the same implementation.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Leo Yan <leo.yan@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Wei Li <liwei391@huawei.com>
Link: http://lore.kernel.org/lkml/20200217082300.6301-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be
endless.
If the intel_bts event is disabled, we don't enable it again here.
Note: This patch is NOT tested since i don't have such a machine with
intel_bts feature, but the code seems buggy same as arm-spe and
intel-pt.
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-3-adrian.hunter@intel.com
[ahunter: removed redundant 'else' after 'return']
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In __cmd_record(), when receiving SIGINT(ctrl + c), a 'done' flag will
be set and the event list will be disabled by evlist__disable() once.
While in auxtrace_record.read_finish(), the related events will be
enabled again, if they are continuous, the recording seems to be endless.
If the intel_pt event is disabled, we don't enable it again here.
Before the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 46803
^C^C^C^C^C^C
After the patch:
huawei@huawei-2288H-V5:~/linux-5.5-rc4/tools/perf$ ./perf record -e \
intel_pt//u -p 48591
^C[ perf record: Woken up 0 times to write data ]
Warning:
AUX data lost 504 times out of 4816!
[ perf record: Captured and wrote 2024.405 MB perf.data ]
Signed-off-by: Wei Li <liwei391@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Tan Xiaojun <tanxiaojun@huawei.com>
Cc: stable@vger.kernel.org # 5.4+
Link: http://lore.kernel.org/lkml/20200214132654.20395-2-adrian.hunter@intel.com
[ ahunter: removed redundant 'else' after 'return' ]
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
fddb5d430a ("open: introduce openat2(2) syscall")
9a2cef09c8 ("arch: wire up pidfd_getfd syscall")
We also need to grab a copy of uapi/linux/openat2.h since it is now
needed by fcntl.h, add it to tools/perf/check_headers.h.
$ diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
--- tools/perf/arch/x86/entry/syscalls/syscall_64.tbl 2019-12-20 16:43:57.662429958 -0300
+++ arch/x86/entry/syscalls/syscall_64.tbl 2020-02-10 16:36:22.070012468 -0300
@@ -357,6 +357,8 @@
433 common fspick __x64_sys_fspick
434 common pidfd_open __x64_sys_pidfd_open
435 common clone3 __x64_sys_clone3/ptregs
+437 common openat2 __x64_sys_openat2
+438 common pidfd_getfd __x64_sys_pidfd_getfd
#
# x32-specific system call numbers start at 512 to avoid cache impact
$
Update tools/'s copy of that file:
$ cp arch/x86/entry/syscalls/syscall_64.tbl tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
See the result:
$ diff -u /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
--- /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c.before 2020-02-10 16:42:59.010636041 -0300
+++ /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c 2020-02-10 16:43:24.149958337 -0300
@@ -346,5 +346,7 @@
[433] = "fspick",
[434] = "pidfd_open",
[435] = "clone3",
+ [437] = "openat2",
+ [438] = "pidfd_getfd",
};
-#define SYSCALLTBL_x86_64_MAX_ID 435
+#define SYSCALLTBL_x86_64_MAX_ID 438
$
Now one can use:
perf trace -e openat2,pidfd_getfd
To get just those syscalls or use in things like:
perf trace -e open*
To get all the open variant (open, openat, openat2, etc) or:
perf trace pidfd*
To get the pidfd syscalls.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Aleksa Sarai <cyphar@cyphar.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sargun Dhillon <sargun@sargun.me>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And update linux/linkage.h, which requires in turn that we make these
files switch from ENTRY()/ENDPROC() to SYM_FUNC_START()/SYM_FUNC_END():
tools/perf/arch/arm64/tests/regs_load.S
tools/perf/arch/arm/tests/regs_load.S
tools/perf/arch/powerpc/tests/regs_load.S
tools/perf/arch/x86/tests/regs_load.S
We also need to switch SYM_FUNC_START_LOCAL() to SYM_FUNC_START() for
the functions used directly by 'perf bench', and update
tools/perf/check_headers.sh to ignore those changes when checking if the
kernel original files drifted from the copies we carry.
This is to get the changes from:
6dcc5627f6 ("x86/asm: Change all ENTRY+ENDPROC to SYM_FUNC_*")
ef1e03152c ("x86/asm: Make some functions local")
e9b9d020c4 ("x86/asm: Annotate aliases")
And address these tools/perf build warnings:
Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tay3l8x8k11p7y3qcpqh9qh5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step in the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ibtn3vua76f934t7woyf26w@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
One more step on the merge of 'struct maps' with 'struct map_groups'.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-69vcr8pubpym90skxhmbwhiw@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
v4fmaddps
v4fmaddss
v4fnmaddps
v4fnmaddss
vaesdec
vaesdeclast
vaesenc
vaesenclast
vcvtne2ps2bf16
vcvtneps2bf16
vdpbf16ps
gf2p8affineinvqb
vgf2p8affineinvqb
gf2p8affineqb
vgf2p8affineqb
gf2p8mulb
vgf2p8mulb
vp2intersectd
vp2intersectq
vp4dpwssd
vp4dpwssds
vpclmulqdq
vpcompressb
vpcompressw
vpdpbusd
vpdpbusds
vpdpwssd
vpdpwssds
vpexpandb
vpexpandw
vpopcntb
vpopcntd
vpopcntq
vpopcntw
vpshldd
vpshldq
vpshldvd
vpshldvq
vpshldvw
vpshldw
vpshrdd
vpshrdq
vpshrdvd
vpshrdvq
vpshrdvw
vpshrdw
vpshufbitqmb
For information about the instructions, refer Intel SDM May 2019
(325462-070US) and Intel Architecture Instruction Set Extensions May
2019 (319433-037).
Committer testing:
$ perf test x86
61: x86 rdpmc : Ok
64: x86 instruction decoder - new instructions : Ok
66: x86 bp modify : Ok
$
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Yu-cheng Yu <yu-cheng.yu@intel.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191125125044.31879-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add an error message because Intel BTS does not support AUX area
sampling.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Set up the default number of mmap pages, default sample size and default
psb_period for AUX area sampling. Add documentation also.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add to the "x86 instruction decoder - new instructions" test the following
instructions:
cldemote
tpause
umonitor
umwait
movdiri
movdir64b
enqcmd
enqcmds
encls
enclu
enclv
pconfig
wbnoinvd
For information about the instructions, refer Intel SDM May 2019
(325462-070US) and Intel Architecture Instruction Set Extensions
May 2019 (319433-037).
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20191115135447.6519-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To reduce boilerplate, provide a more compact form using an idiom
present in other trees of data structures.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-59gmq4kg1r68ou1wknyjl78x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_event() from tools/perf to libperf and export it in
the perf/mmap.h header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-13-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_init() from tools/perf to libperf and export it in
the perf/mmap.h header.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-12-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__read_init() from tools/perf to libperf and export it in
perf/mmap.h header.
And add pr_debug2()/pr_debug3() macros support, because the code is
using them.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move perf_mmap__consume() vrom tools/perf to libperf and export it in
the perf/mmap.h header.
Move also the needed helpers perf_mmap__write_tail(),
perf_mmap__read_head() and perf_mmap__empty().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20191007125344.14268-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
They are called from symbol__annotate() and to propagate errors that can
help understand the problem make them return what
symbol__strerror_disassemble() known, i.e. errno codes and other
annotation specific errors in a special, out of errnos, range.
Reported-by: Russell King - ARM Linux admin <linux@armlinux.org.uk>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Will Deacon <will@kernel.org>
Link: https://lkml.kernel.org/n/tip-pqx7srcv7tixgid251aeboj6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
For consistency, propagate the exact cause for get_cpuid() to have
failed.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9ig269f7ktnhh99g4l15vpu2@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Naresh Kamboju reported, that on the i386 build pr_err()
doesn't get defined properly due to header ordering:
perf-in.o: In function `libunwind__x86_reg_id':
tools/perf/util/libunwind/../../arch/x86/util/unwind-libunwind.c:109:
undefined reference to `pr_err'
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
We need the 'page_size' variable in libperf, so move it there.
Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.
Committer notes:
Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add perf_evlist__first()/last() functions to libperf, as internal
functions and rename perf's origins to evlist__first/last.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-29-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving 'nr_mmaps' from 'struct evlist' to 'struct perf_evlist', it will
be used in following patches.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-21-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'system_wide 'member from perf's evsel to libperf's perf_evsel.
Committer notes:
Added stdbool.h as we now use bool here.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add the perf_mmap struct to libperf.
The definition is added into:
include/internal/mmap.h
which is not to be included by users, but shared within perf and
libperf.
Committer notes:
Remove unnecessary includes from tools/perf/lib/include/internal/mmap.h,
those will be readded as they become necessary, later in the series.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-11-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As this isn't used at all in mmap.h but in evlist.h, so to cut down the
header dependency tree, move it to where it is used.
Also add mmap.h to the places using it but previously getting it
indirectly via evlist.h.
Add missing pthread.h to evlist.h, as it has a pthread_t struct member
and was getting the header via mmap.h.
Noticed while processing a Jiri's libperf batch touching mmap.h, where
almost everything gets rebuilt because evlist.h is so popular, so cut
down't this rebuild the world party.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Link: https://lkml.kernel.org/n/tip-he0uljeftl0xfveh3d6vtode@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__mmap() to evlist__mmap(), so we don't have a name
clash when we add perf_evlist__mmap() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename 'struct perf_evlist' to 'struct evlist', so we don't have a name
clash when we add 'struct perf_mmap' to libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those are the only routines using the perf_event__handler_t typedef and
are all related, so move to a separate header to reduce the header
dependency tree, lots of places were getting event.h and even stdio.h,
limits.h indirectly, so fix those as well.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Only a 'struct perf_cmp_map' forward allocation is necessary, fix the
places that need the header but were getting it indirectly, by luck,
from env.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-3sj3n534zghxhk7ygzeaqlx9@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Check that it is not needed and remove, fixing up some fallout for
places where it was only serving to get something else.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Now that there's a common version of the decoder for all tools, use it
instead of the local copy.
Also use perf's check-headers.sh script to diff the decoder files to
make sure they remain in sync with the kernel version. Objtool has a
similar check.
Committer notes:
Had to keep this all pointing explicitely to x86 headers/files, i.e.
instead of asm/isnn.h we had to use ../include/asm/insn.h when the files
were in differemt dirs, or just replace "<asm/foo.h>" with "foo.h".
This way we continue to be able to process perf.data files with Intel PT
traces in distros other than x86.
Also fixed up the awk script paths to use $(srcdir)/tools/arch instead
or relative directories so that we keep detached tarballs (make help |
grep perf) working.
For now the include lines in these headers are being ignored so as not
to flag false reports of kernel/tools out of sync.
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/8a37e615d2880f039505d693d1e068a009358a2b.1567118001.git.jpoimboe@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So that we don't carry the session.h include directive in auxtrace.h,
which in turn opens a can of worms of files that were getting all sorts
of things via that include, fix them all.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-d2d83aovpgri2z75wlitquni@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
bpf.h and build-id.h are not needed at all in event.h, remove them.
And fixup the fallout of files that were getting needed stuff from this
now pruned include.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-rdm3dgtlrndmmnlc4bafsg3b@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And fix the fallout, adding it to places that must have it since they
use its definitions.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-1s3jel4i26chq2g0lydoz7i3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the movement of lots of stuff out of perf.h to other headers we
ended up not needing it in lots of places, remove it from those places.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c718m0sxxwp73lp9d8vpihb4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And remove unneeded include directives from perf-sys.h to prune the
header dependency tree.
Fixup the fallout in places where definitions were being used without
the needed include directives that were being satisfied because they
were in perf-sys.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7b1zvugiwak4ibfa3j6ott7f@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Even more, to have a "perf_record_" prefix, so that they match the
PERF_RECORD_ enum they map to.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the PERF_RECORD_AUXTRACE_INFO event definition to libperf's
event.h.
In order to keep libperf simple, we switch 'u64/u32/u16/u8' types used
events to their generic '__u*' versions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-9-jolsa@kernel.org
[ Fix cs_etm__print_auxtrace_info() arg to be __u64 too to fix the CORESIGHT=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is no need for that util/util.h include there and, remove it,
pruning the include tree, fix the fallout by adding necessary headers to
places that were getting needed includes indirectly from evlist.h ->
util.h.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-s9f7uve8wvykr5itcm7m7d8q@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The kernel is using CAP_SYS_ADMIN instead of euid==0 to override
perf_event_paranoid check. Make perf do the same.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org> # coresight part
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: James Morris <jmorris@namei.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1566869956-7154-3-git-send-email-ilubashe@akamai.com
Signed-off-by: Igor Lubashev <ilubashe@akamai.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And into a separate util/record.h, to better isolate things and make
sure that those who use record_opts and the other moved declarations
are explicitly including the necessary header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-31q8mei1qkh74qvkl9nwidfq@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The util/cpumap.h file doesn't use anything in refcount.h not in
debug.h, it needs just a forward reference to 'struct cpu_map_data',
that is defined in util/event.h and cpumap.h was getting indirectly via,
of all things, debug.h
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-mtjww98yptt4ppo6g2blavg5@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It uses strcmp(), strstr() and was getting the required string.h header
by luck, from evsel.h -> cpumap.h -> debug.h -> string.h, add the
missing header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-qrz8hhvrhwnmt5ocfwk4br5d@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
So it's part of the libperf library as one of basic functions operating
on the perf_cpu_map class.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190822111141.25823-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Process synth_opts.other_events and attr.aux_output to set up for
synthesizing PEBs via Intel PT events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190806084606.4021-6-alexander.shishkin@linux.intel.com
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
[ Fixed up libbperf clashes, i.e. some places using perf_evsel (now in libperf)
need to use instead 'evsel' (a tools/perf only abstraction) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
These paths point to the wrong location but still work because they get
picked up by a -I flag that happens to direct to the correct file. Fix
paths to lead to the actual file location without help from include
flags.
Signed-off-by: Luke Mujica <lukemujica@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20190719202253.220261-1-lukemujica@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the evlist__set_maps() function from tools/perf to libperf.
Committer notes:
Fix up reject due to earlier inversion in calling perf_evlist__init().
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-57-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Moving the following functions from tools/perf:
cpu_map__new()
cpu_map__read()
to libperf with the following names:
perf_cpu_map__new()
perf_cpu_map__read()
Committer notes:
Fixed up this one:
tools/perf/arch/arm/util/cs-etm.c
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-44-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the perf_event_attr struct fron 'struct evsel' to 'struct perf_evsel'.
Committer notes:
Fixed up these:
tools/perf/arch/arm/util/auxtrace.c
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/arm-spe.c
tools/perf/arch/s390/util/auxtrace.c
tools/perf/util/cs-etm.c
Also
cc1: warnings being treated as errors
tests/sample-parsing.c: In function 'do_test':
tests/sample-parsing.c:162: error: missing initializer
tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
struct evsel evsel = {
.needs_swap = false,
- .core.attr = {
- .sample_type = sample_type,
- .read_format = read_format,
+ .core = {
+ . attr = {
+ .sample_type = sample_type,
+ .read_format = read_format,
+ },
[perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
gcc (GCC) 4.4.7
Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__disable() to evlist__disable(), so we don't have a
name clash when we add perf_evlist__disable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__enable() to evlist__enable(), so we don't have a
name clash when we add perf_evlist__enable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-22-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__open() to evlist__open(), so we don't have a name
clash when we add perf_evlist__open() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-20-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Renaming perf_evsel__disable() to evsel__disable(), so we don't have a
name clash when we add perf_evsel__disable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-17-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evsel__enable() to evsel__enable(), so we don't have a name
clash when we add perf_evsel__enable() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-16-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__delete() to evlist__delete(), so we don't have a
name clash when we add perf_evlist__delete() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename perf_evlist__new() to evlist__new(), so we don't have a name
clash when we add perf_evlist__new() in libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.
Committer notes:
Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct thread_map to struct perf_thread_map, so it could be part
of libperf.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct cpu_map to struct perf_cpu_map, so it could be part of
libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Eroding a bit more the tools/perf/util/util.h hodpodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We got the sane_ctype.h headers from git and kept using it so far, but
since that code originally came from the kernel sources to the git
sources, perhaps its better to just use the one in the kernel, so that
we can leverage tools/perf/check_headers.sh to be notified when our copy
gets out of sync, i.e. when fixes or goodies are added to the code we've
copied.
This will help with things like tools/lib/string.c where we want to have
more things in common with the kernel, such as strim(), skip_spaces(),
etc so as to go on removing the things that we have in tools/perf/util/
and instead using the code in the kernel, indirectly and removing things
like EXPORT_SYMBOL(), etc, getting notified when fixes and improvements
are made to the original code.
Hopefully this also should help with reducing the difference of code
hosted in tools/ to the one in the kernel proper.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7k9868l713wqtgo01xxygn12@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Not to depend of getting it indirectly.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-tirjsmvu4ektw0k7lm8k9lhu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
We're getting it by sheer luck, add that util.h to get the 'page_size'
definition.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-347078mgj3d2jfygtxs4ntti@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the macro defined in kernel ABI header to replace the local name.
No functional change.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Link: https://lkml.kernel.org/r/1559081314-9714-5-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>