Correlating virtual machine TSC packets is not supported at present, so
instead support the Z itrace option.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move synth_opts initialization earlier, so it can be used earlier.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210430070309.17624-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Events record a single cpumode so the tools cannot handle a branch from
the host machine to a virtual machine, or vice versa. Split it in two so
that each branch can have a different cpumode.
E.g.     host ip -> guest ip
becomes: host ip -> 0
               0 -> guest ip
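A minimal standalone sketch of the idea (the emit() plumbing is
hypothetical; the real change lives in the sample synthesis code):

#include <linux/perf_event.h>

struct half_branch {
	unsigned long long from, to;
	unsigned int cpumode;
};

/* Sketch: a branch crossing the host/guest boundary becomes two
 * one-sided branches, the unknown side zeroed, each with its own
 * cpumode. */
static void emit_host_to_guest(unsigned long long host_ip,
			       unsigned long long guest_ip,
			       void (*emit)(const struct half_branch *))
{
	struct half_branch host = {
		.from = host_ip,
		.cpumode = PERF_RECORD_MISC_KERNEL,
	};
	struct half_branch guest = {
		.to = guest_ip,
		.cpumode = PERF_RECORD_MISC_GUEST_KERNEL,
	};

	emit(&host);
	emit(&guest);
}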
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the change of NR to detect whether an asynchronous branch is a VM-Exit.
Note VM-Entry is determined from the vmlaunch or vmresume instruction,
in which case sample flags will show "VMentry" even if the VM-Entry fails.
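A minimal sketch of the test (field meanings per the description above;
NR = non-root, i.e. guest operation):

#include <stdbool.h>

/* An asynchronous branch going from non-root (guest) to root (host)
 * operation is a VM-Exit. */
static bool is_vm_exit(bool from_nr, bool to_nr)
{
	return from_nr && !to_nr;
}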
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Handling TIP.PGD for an address filter for a guest kernel is the same as a
host kernel, but user space decoding, and hence address filters, are not
supported.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The guest kernel can be found from any guest thread belonging to the guest
machine. The guest machine is associated with the current host process pid.
An idle thread (pid=tid=0) is created as a vehicle from which to find the
guest kernel map.
Decoding guest user space is not supported.
Synthesized samples just need the cpumode set for the guest.
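In terms of perf's machine/thread helpers, the lookup is roughly this
(a sketch, not the exact call sites):

/* Sketch: the guest machine is keyed by the host process pid, and the
 * idle thread (pid = tid = 0) is the vehicle for finding the guest
 * kernel map. */
static struct thread *guest_idle_thread(struct machines *machines,
					pid_t machine_pid)
{
	struct machine *gm = machines__findnew(machines, machine_pid);

	return gm ? machine__findnew_thread(gm, 0, 0) : NULL;
}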
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210218095801.19576-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code assumed every CYC-eligible packet has a CYC packet, which is not
the case when CYC thresholds are used. Fix by checking if a CYC packet is
actually present in that case.
Fixes: 5b1dc0fd1d ("perf intel-pt: Add support for samples to contain IPC ratio")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: https://lore.kernel.org/r/20210205175350.23817-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The code assumed a change in cycle count means accurate IPC. That is not
correct, for example when sampling both branches and instructions, or at
a FUP packet (which is not CYC-eligible) address. Fix by using an explicit
flag to indicate when IPC can be sampled.
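A minimal sketch of the flag-based approach (field names assumed):

#include <stdbool.h>
#include <stdint.h>

struct dec_state {
	uint64_t cycle_cnt, insn_cnt;
	bool sample_ipc;	/* set only where IPC is known accurate */
};

/* At a CYC-eligible packet, IPC may be sampled only if a CYC packet
 * actually updated the cycle count, which is not guaranteed when CYC
 * thresholds are used. */
static void update_sample_ipc(struct dec_state *d, bool have_cyc)
{
	d->sample_ipc = have_cyc;
}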
Fixes: 5b1dc0fd1d ("perf intel-pt: Add support for samples to contain IPC ratio")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/20210205175350.23817-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The instruction latency information can be recorded on some platforms,
e.g., the Intel Sapphire Rapids server. With both memory latency
(weight) and the new instruction latency information, users can easily
locate the expensive load instructions and understand the time spent
in the different pipeline stages, and then optimize their applications
accordingly.
The 'weight' field is shared among different architectures. Reusing the
'weight' field may impact other architectures. Add a new field to store
the instruction latency.
Like the 'weight' support, introduce a 'ins_lat' for the global
instruction latency, and a 'local_ins_lat' for the local instruction
latency version.
Add new sort functions, INSTR Latency and Local INSTR Latency,
accordingly.
Add local_ins_lat to the default_mem_sort_order[].
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1612296553-21962-7-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The new sample type, PERF_SAMPLE_WEIGHT_STRUCT, is an alternative of the
PERF_SAMPLE_WEIGHT sample type. Users can apply either the
PERF_SAMPLE_WEIGHT sample type or the PERF_SAMPLE_WEIGHT_STRUCT sample
type to retrieve the sample weight, but they cannot apply both sample
types simultaneously.
The new sample type shares the same space as the PERF_SAMPLE_WEIGHT
sample type. The lower 32 bits are exactly the same for both sample
types. The higher 32 bits may be different for different architectures.
Add arch specific arch_evsel__set_sample_weight() to set the new sample
type for X86. Only store the lower 32 bits for the sample->weight if the
new sample type is applied. In practice, no memory access could last
longer than 4G cycles, so no data will be lost.
If the kernel doesn't support the new sample type, fall back to the
PERF_SAMPLE_WEIGHT sample type.
There is no impact for other architectures.
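For reference, the shared layout is roughly this (a sketch of the
little-endian case; the UAPI header also defines a big-endian variant):

#include <linux/types.h>

union perf_sample_weight {
	__u64 full;		/* PERF_SAMPLE_WEIGHT */
	struct {		/* PERF_SAMPLE_WEIGHT_STRUCT, little-endian */
		__u32 var1_dw;	/* lower 32 bits, same as PERF_SAMPLE_WEIGHT */
		__u16 var2_w;
		__u16 var3_w;
	};
};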
Committer notes:
Fixup related to PERF_SAMPLE_CODE_PAGE_SIZE, present in acme/perf/core
but not upstream yet.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/1612296553-21962-6-git-send-email-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The perf_evlist__ prefix is for 'struct perf_evlist' methods in
tools/lib/perf/; go on completing this split.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 42bbabed09 ("perf tools: Add hw_idx in struct branch_stack")
changed the format of branch stacks in perf samples. When samples use
this new format, a flag must be set in the corresponding event.
Synthesized branch stacks generated from Intel PT were using the new
format, but not setting the event attribute, leading to consumers
seeing corrupt data. This patch fixes the issue by setting the event
attribute to indicate use of the new format.
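The essence of the fix is one attribute bit on the synthesized event,
roughly:

#include <linux/perf_event.h>

static void mark_new_brstack_format(struct perf_event_attr *attr)
{
	/* Tell consumers the branch stack records carry a hw_idx word. */
	attr->branch_sample_type |= PERF_SAMPLE_BRANCH_HW_INDEX;
}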
Fixes: 42bbabed09 ("perf tools: Add hw_idx in struct branch_stack")
Signed-off-by: Al Grant <al.grant@arm.com>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20200819084751.17686-2-leo.yan@linaro.org
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Use the new itrace 'q' option to add support for a mode of decoding that
ignores TNT, does not walk object code, but gets the ip from FUP and TIP
packets.
Example:
$ perf record -e intel_pt//u grep -rI pudding drivers
[ perf record: Woken up 52 times to write data ]
[ perf record: Captured and wrote 57.870 MB perf.data ]
$ time perf script --itrace=bi | wc -l
58948289
real 1m23.863s
user 1m23.251s
sys 0m7.452s
$ time perf script --itrace=biq | wc -l
3385694
real 0m4.453s
user 0m4.455s
sys 0m0.328s
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Change the debug logging (when used with the --time option) to time
filter logged perf events, but allow that to be overridden by using
"d+a" instead of plain "d".
That can reduce the size of the log file.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The "d" option may be followed by flags which affect what debug messages
will or will not be logged. Each flag must be preceded by either '+' or
'-'. The flags supported by Intel PT are:
-a Suppress logging of perf events
Suppressing perf events is useful for decreasing the size of the log.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The itrace "e" option may be followed by flags which affect what errors
will or will not be reported. Each flag must be preceded by either '+' or '-'.
The flags supported by Intel PT are:
-o Suppress overflow errors
-l Suppress trace data lost errors
For example, for errors but not overflow or data lost errors:
--itrace=e-o-l
Suppressing those errors can be useful for testing and debugging because
they are not due to decoding.
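A standalone sketch of the flag grammar (names assumed):

#include <stdbool.h>

/* Each flag must be preceded by '+' or '-', e.g. "-o-l" after 'e'. */
static int parse_error_flags(const char *p, bool *no_overflow,
			     bool *no_data_lost)
{
	while (*p == '+' || *p == '-') {
		bool suppress = (*p++ == '-');

		switch (*p++) {
		case 'o': *no_overflow  = suppress; break;
		case 'l': *no_data_lost = suppress; break;
		default:  return -1;	/* unknown flag */
		}
	}
	return *p ? -1 : 0;
}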
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200710151104.15137-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It is generally more useful to show the symbol with an address. In this
case, the print function requires the 'machine' which means changing
callers to provide it as a parameter. It is optional because most events
do not need it and the callers that matter can provide it.
Committer notes:
Made 'union perf_event' continue to be the first parameter to the
perf_event__fprintf() and perf_event__fprintf_text_poke() events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-16-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Select text poke events when available and the kernel is being traced.
Process text poke events to invalidate entries in Intel PT's instruction
cache.
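Conceptually, the handler looks like this (a sketch; the insn_cache API
is hypothetical, the real code walks perf's dso data caches):

#include <stdint.h>

struct insn_cache;	/* hypothetical decoded-instruction cache */
void insn_cache_invalidate(struct insn_cache *c, uint64_t addr, uint64_t len);

/* On PERF_RECORD_TEXT_POKE, drop cached instructions covering the
 * patched bytes, using the larger of the old/new lengths. */
static void on_text_poke(struct insn_cache *c, uint64_t addr,
			 uint16_t old_len, uint16_t new_len)
{
	insn_cache_invalidate(c, addr, old_len > new_len ? old_len : new_len);
}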
Example:
The example requires kernel config:
CONFIG_PROC_SYSCTL=y
CONFIG_SCHED_DEBUG=y
CONFIG_SCHEDSTATS=y
Before:
# perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 3.341 MB perf.data.before ]
[1]+ Terminated perf record -o perf.data.before --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.before --itrace=e >/dev/null
Warning:
474 instruction trace errors
After:
# perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M &
# cat /proc/sys/kernel/sched_schedstats
0
# echo 1 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
1
# echo 0 > /proc/sys/kernel/sched_schedstats
# cat /proc/sys/kernel/sched_schedstats
0
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 2.646 MB perf.data.after ]
[1]+ Terminated perf record -o perf.data.after --kcore -a -e intel_pt//k -m,64M
# perf script -i perf.data.after --itrace=e >/dev/null
Example:
The example requires kernel config:
# CONFIG_FUNCTION_TRACER is not set
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.026 MB perf.data (68 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.268 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
207 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __schedule
Added new event:
probe:__schedule (on __schedule)
You can now use it in all perf tools, such as:
perf record -e probe:__schedule -aR sleep 1
# perf record -e probe:__schedule -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.028 MB perf.data (107 samples) ]
# perf probe -d probe:__schedule
Removed event: probe:__schedule
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 39.978 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
6 565303693547 0x291f18 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027a000 len 4096 type 2 flags 0x0 name kprobe_insn_page
6 565303697010 0x291f68 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 0 new len 6
6 565303838278 0x291fa8 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc027c000 len 4096 type 2 flags 0x0 name kprobe_optinsn_page
6 565303848286 0x291ff8 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 0 new len 106
6 565369336743 0x292af8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
7 566434327704 0x217c208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffff88ab8890 old len 5 new len 5
6 566456313475 0x293198 [0xa0]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027c000 old len 106 new len 0
6 566456314935 0x293238 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffc027a000 old len 6 new len 0
Example:
The example requires kernel config:
CONFIG_FUNCTION_TRACER=y
Before:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.022 MB perf.data (6 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 43.850 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
Warning:
8 instruction trace errors
After:
# perf record --kcore -m,64M -o t1 -a -e intel_pt//k &
# perf probe __kmalloc
Added new event:
probe:__kmalloc (on __kmalloc)
You can now use it in all perf tools, such as:
perf record -e probe:__kmalloc -aR sleep 1
# perf record -e probe:__kmalloc -aR sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.037 MB perf.data (206 samples) ]
# perf probe -d probe:__kmalloc
Removed event: probe:__kmalloc
# kill %1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 41.442 MB t1 ]
[1]+ Terminated perf record --kcore -m,64M -o t1 -a -e intel_pt//k
# perf script -i t1 --itrace=e >/dev/null
# perf script -i t1 --no-itrace -D | grep 'POKE\|KSYMBOL'
5 312216133258 0x8bafe0 [0x50]: PERF_RECORD_KSYMBOL addr ffffffffc0360000 len 415 type 2 flags 0x0 name ftrace_trampoline
5 312216133494 0x8bb030 [0x1d8]: PERF_RECORD_TEXT_POKE addr 0xffffffffc0360000 old len 0 new len 415
5 312216229563 0x8bb208 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216239063 0x8bb248 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
5 312216727230 0x8bb288 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
5 312216739322 0x8bb2c8 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
5 312216748321 0x8bb308 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287163462 0x2817430 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287174890 0x2817470 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
7 313287818979 0x28174b0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffabbea190 old len 5 new len 5
7 313287829357 0x28174f0 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac6016f5 old len 5 new len 5
7 313287841246 0x2817530 [0x40]: PERF_RECORD_TEXT_POKE addr 0xffffffffac601803 old len 5 new len 5
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: x86@kernel.org
Link: http://lore.kernel.org/lkml/20200512121922.8997-14-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The condition to add XMM registers was missing, the regs array needed to
be in the outer scope, and the size of the regs array was too small.
Fixes: 143d34a6b3 ("perf intel-pt: Add XMM registers to synthesized PEBS sample")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Luwei Kang <luwei.kang@intel.com>
Link: http://lore.kernel.org/lkml/20200630133935.11150-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
To avoid having struct branch_stack as a non-last structure member,
use an allocated branch stack for the PEBS sample.
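i.e. roughly this allocation pattern (a sketch, assuming perf's struct
branch_stack and struct branch_entry are in scope):

#include <stdlib.h>

/* One block holds the header and its entries, so struct branch_stack
 * never needs to sit as a non-last member of another struct. */
static struct branch_stack *alloc_br_stack(size_t nr)
{
	return calloc(1, sizeof(struct branch_stack) +
			 nr * sizeof(struct branch_entry));
}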
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/2540ed9a-89f1-6d59-10c9-a66cc90db5d2@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As those are not 'struct evsel' methods, not part of tools/lib/perf/,
aka libperf, to which the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
As they are 'struct evsel' methods or related routines, not part of
tools/lib/perf/, aka libperf, to which the perf_ prefix belongs.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Change Intel PT's branch stack support to use thread stacks. The
advantages of using branch stack support from the thread-stack are:
1. the branches are accumulated separately for each thread
2. the branch stack is cleared only in between continuous traces
This helps pave the way for adding branch stacks to regular events, not
just synthesized events as at present.
While the 2 approaches are not identical, in simple cases the results
can be identical e.g.
Before:
# perf record --kcore -e intel_pt// uname
# perf script --itrace=i10usl -F+brstacksym,+addr,+flags > cmp1.txt
After:
# perf script --itrace=i10usl -F+brstacksym,+addr,+flags > cmp2.txt
# diff -s cmp1.txt cmp2.txt
Files cmp1.txt and cmp2.txt are identical
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200429150751.12570-4-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The components of the condition do not change, so consolidate them in
one variable.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200429150751.12570-3-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT already has support for creating branch stacks for each context
(per-cpu or per-thread). In the more common per-cpu case, the branch stack
is not separated for different threads, instead being cleared in between
each sample.
That approach will not work very well for adding branch stacks to
regular events. The branch stacks really need to be accumulated
separately for each thread.
As a start to accomplishing that, this patch adds support for putting
branch stack support into the thread-stack. The advantages are:
1. the branches are accumulated separately for each thread
2. the branch stack is cleared only in between continuous traces
This helps pave the way for adding branch stacks to regular events, not
just synthesized events as at present.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20200429150751.12570-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Trying to disentangle this a bit further; unfortunately it uses
parse_events(), but it's interesting to have it separated anyway, so do it.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In LBR call stack mode, the depth of the reconstructed LBR call stack
is limited to the number of LBR registers.
For example, on Skylake the depth of the reconstructed LBR call stack
is always <= 32.
# To display the perf.data header info, please use
# --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 6K of event 'cycles'
# Event count (approx.): 6487119731
#
# Children Self Command Shared Object Symbol
# ........ ........ ............... ..................
# ................................
99.97% 99.97% tchain_edit tchain_edit [.] f43
|
--99.64%--f11
f12
f13
f14
f15
f16
f17
f18
f19
f20
f21
f22
f23
f24
f25
f26
f27
f28
f29
f30
f31
f32
f33
f34
f35
f36
f37
f38
f39
f40
f41
f42
f43
For a call stack deeper than the LBR limit, HW will overwrite the LBR
registers with the oldest branches, so only partial call stacks can be
reconstructed.
However, the overwritten LBRs may still be retrieved from the previous
sample, taken at a moment when HW had not yet overwritten those
registers. Perf tools can stitch those overwritten LBRs onto the
current call stack to get a more complete call stack.
To determine if LBRs can be stitched, perf tools need to compare the
current sample with the previous one:
- They should have identical LBR records (same from, to and flags
values, and the same physical index of LBR registers).
- The search starts from the base-of-stack of the current sample.
Once perf decides to stitch the previous LBRs, the corresponding LBR
cursor nodes are copied to 'lists', which tracks the LBR cursor nodes
that are going to be stitched.
When the stitching is over, the nodes are not freed immediately; they
are moved to 'free_lists' so the next stitching can reuse the space.
Both 'lists' and 'free_lists' are freed when all samples have been
processed.
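The reuse pattern is roughly this (a sketch using kernel-style list
helpers; 'struct stitch_node' is a stand-in name):

#include <stdlib.h>
#include <linux/list.h>

struct stitch_node {
	struct list_head node;
	/* stitched LBR cursor entry would live here */
};

/* Take a node from free_lists when one is available, else allocate;
 * after stitching, nodes go back onto free_lists instead of free(). */
static struct stitch_node *get_stitch_node(struct list_head *free_lists)
{
	struct stitch_node *n;

	if (!list_empty(free_lists)) {
		n = list_first_entry(free_lists, struct stitch_node, node);
		list_del(&n->node);
		return n;
	}
	return malloc(sizeof(*n));
}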
Committer notes:
Fix the intel-pt.c initialization of the union with 'struct
branch_flags', that breaks the build with its unnamed union on older gcc
versions.
Uninline thread__free_stitch_list(), as it grew big and started dragging
includes to thread.h, so move it to thread.c where what it needs in
terms of headers are already there.
This fixes the build in several systems such as debian:experimental when
cross building to the MIPS32 architecture, i.e. in the other cases what
was needed was being included by sheer luck.
In file included from builtin-sched.c:11:
util/thread.h: In function 'thread__free_stitch_list':
util/thread.h:169:3: error: implicit declaration of function 'free' [-Werror=implicit-function-declaration]
169 | free(pos);
| ^~~~
util/thread.h:169:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
util/thread.h:19:1: note: include '<stdlib.h>' or provide a declaration of 'free'
18 | #include "callchain.h"
+++ |+#include <stdlib.h>
19 |
util/thread.h:174:3: error: incompatible implicit declaration of built-in function 'free' [-Werror]
174 | free(pos);
| ^~~~
util/thread.h:174:3: note: include '<stdlib.h>' or provide a declaration of 'free'
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200319202517.23423-13-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The low level index of raw branch records for the most recent branch can
be recorded in a sample with PERF_SAMPLE_BRANCH_HW_INDEX
branch_sample_type. Extend struct branch_stack to support it.
However, if PERF_SAMPLE_BRANCH_HW_INDEX is not applied, only nr and
entries[] are output by the kernel, and the pointer to entries[] could
be wrong, since the output format differs from the new struct
branch_stack. Add a variable no_hw_idx in struct perf_sample to
indicate whether the hw_idx is output, and add get_branch_entry() to
return the corresponding pointer to entries[0].
To keep the dummy branch sample consistent with the new branch sample,
add hw_idx to struct dummy_branch_stack for cs-etm and intel-pt.
Apply the new struct branch_stack for synthetic events as well.
Extend test case sample-parsing to support new struct branch_stack.
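The entries pointer then depends on whether the hw_idx word is present;
the helper is roughly:

static inline struct branch_entry *
perf_sample__branch_entries(struct perf_sample *sample)
{
	u64 *entry = (u64 *)sample->branch_stack;

	entry++;			/* skip nr */
	if (!sample->no_hw_idx)
		entry++;		/* skip hw_idx when present */
	return (struct branch_entry *)entry;
}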
Committer notes:
Renamed get_branch_entries() to perf_sample__branch_entries() to have
proper namespacing and pave the way for this to be moved to libperf,
eventually.
Add 'static' to that inline as it is in a header.
Add 'hw_idx' to 'struct dummy_branch_stack' in cs-etm.c to fix the build
on arm64.
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Pavel Gerasimov <pavel.gerasimov@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vitaly Slobodskoy <vitaly.slobodskoy@intel.com>
Link: http://lore.kernel.org/lkml/20200228163011.19358-2-kan.liang@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
And pick the shortest name: 'struct maps'.
The split existed because we used to have two groups of maps, one for
functions and one for variables, but that only complicated things,
sometimes we needed to figure out what was at some address and then had
to first try it on the functions group and if that failed, fall back to
the variables one.
That split is long gone, so for quite a while we had only one struct
maps per struct map_groups, simplify things by combining those structs.
First patch is the minimum needed to merge both, follow up patches will
rename 'thread->mg' to 'thread->maps', etc.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-hom6639ro7020o708trhxh59@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add support for dumping, queuing and decoding AUX area samples. Decoding
samples is the same as regular decoding, except in the case where there
are no timestamps, in which case buffers are decoded immediately before
the sample event.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lore.kernel.org/lkml/20191115124225.5247-15-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move 'ids' from 'struct evsel' to libperf's 'struct perf_evsel'.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-26-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the 'id' array from 'struct evsel' to libperf's 'struct perf_evsel'.
Committer note:
Fix the tools/perf/util/cs-etm.c build, i.e. aarch64's CoreSight.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-25-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Those are the only routines using the perf_event__handler_t typedef and
are all related, so move to a separate header to reduce the header
dependency tree, lots of places were getting event.h and even stdio.h,
limits.h indirectly, so fix those as well.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-yvx9u1mf7baq6cu1abfhbqgs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All we need there is a forward declaration for 'union perf_event', so
remove it from there and add missing header directives in places using
things from this indirect include.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-7ftk0ztstqub1tirjj8o8xbl@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the movement of lots of stuff out of perf.h to other headers we
ended up not needing it in lots of places, remove it from those places.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-c718m0sxxwp73lp9d8vpihb4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Even more, to have a "perf_record_" prefix, so that they match the
PERF_RECORD_ enum they map to.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-23-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the PERF_RECORD_AUXTRACE_INFO event definition to libperf's
event.h.
In order to keep libperf simple, we switch the 'u64/u32/u16/u8' types
used in events to their generic '__u*' versions.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190828135717.7245-9-jolsa@kernel.org
[ Fix cs_etm__print_auxtrace_info() arg to be __u64 too to fix the CORESIGHT=1 build ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Process synth_opts.other_events and attr.aux_output to set up for
synthesizing PEBS via Intel PT events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190806084606.4021-6-alexander.shishkin@linux.intel.com
Signed-off-by: Alexander Shishkin <alexander.shishkin@linux.intel.com>
[ Fixed up libperf clashes, i.e. some places using perf_evsel (now in libperf)
need to use instead 'evsel' (a tools/perf only abstraction) ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Move the perf_event_attr struct from 'struct evsel' to 'struct perf_evsel'.
Committer notes:
Fixed up these:
tools/perf/arch/arm/util/auxtrace.c
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/arm-spe.c
tools/perf/arch/s390/util/auxtrace.c
tools/perf/util/cs-etm.c
Also
cc1: warnings being treated as errors
tests/sample-parsing.c: In function 'do_test':
tests/sample-parsing.c:162: error: missing initializer
tests/sample-parsing.c:162: error: (near initialization for 'evsel.core.cpus')
struct evsel evsel = {
.needs_swap = false,
- .core.attr = {
- .sample_type = sample_type,
- .read_format = read_format,
+ .core = {
+ . attr = {
+ .sample_type = sample_type,
+ .read_format = read_format,
+ },
[perfbuilder@a70e4eeb5549 /]$ gcc --version |& head -1
gcc (GCC) 4.4.7
Also we don't need to include perf_event.h in
tools/perf/lib/include/perf/evsel.h, forward declaring 'struct
perf_event_attr' is enough. And this even fixes the build in some
systems where things are used somewhere down the include path from
perf_event.h without defining __always_inline.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-43-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evlist to struct evlist, so we don't have a name
clash when we add struct perf_evlist in libperf.
Committer notes:
Added fixes to build on arm64, from Jiri and from me
(tools/perf/util/cs-etm.c)
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_evsel to struct evsel, so we don't have a name clash
when we add struct perf_evsel in libperf.
Committer notes:
Added fixes for arm64, provided by Jiri.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20190721112506.12306-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on the following report from Smatch, fix the potential NULL
pointer dereference.
tools/perf/util/intel-pt.c:3200
intel_pt_process_auxtrace_info() error: we previously assumed
'session->itrace_synth_opts' could be null (see line 3196)
tools/perf/util/intel-pt.c:3206
intel_pt_process_auxtrace_info() warn: variable dereferenced before
check 'session->itrace_synth_opts' (see line 3200)
tools/perf/util/intel-pt.c
3196 if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
3197 pt->synth_opts = *session->itrace_synth_opts;
3198 } else {
3199 itrace_synth_opts__set_default(&pt->synth_opts,
3200 session->itrace_synth_opts->default_no_sample);
^^^^^^^^^^^^^^^^^^^^^^^^^^
3201 if (!session->itrace_synth_opts->default_no_sample &&
3202 !session->itrace_synth_opts->inject) {
3203 pt->synth_opts.branches = false;
3204 pt->synth_opts.callchain = true;
3205 }
3206 if (session->itrace_synth_opts)
^^^^^^^^^^^^^^^^^^^^^^^^^^
3207 pt->synth_opts.thread_stack =
3208 session->itrace_synth_opts->thread_stack;
3209 }
'session->itrace_synth_opts' cannot be a NULL pointer in
intel_pt_process_auxtrace_info(), thus this patch removes the NULL test
for 'session->itrace_synth_opts'.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/20190708143937.7722-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Eroding a bit more the tools/perf/util/util.h hodgepodge header.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-natazosyn9rwjka25tvcnyi0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The first core-to-bus ratio (CBR) event will not be shown if the
--itrace 's' option (skip initial number of events) is used, nor if
time intervals are specified that do not include the start of tracing.
Change
the logic to record the last CBR value seen by the user, and synthesize
CBR events whenever that changes.
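i.e. roughly (a sketch; field and helper names are assumptions):

#include <stdint.h>

struct cbr_state {
	uint32_t cbr_seen;	/* last CBR value shown to the user */
};

void synth_cbr_event(struct cbr_state *s, uint32_t cbr);	/* hypothetical */

static void maybe_synth_cbr(struct cbr_state *s, uint32_t cbr)
{
	if (cbr != s->cbr_seen) {	/* emit only when the ratio changes */
		s->cbr_seen = cbr;
		synth_cbr_event(s, cbr);
	}
}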
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190622093248.581-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Like other synthesized events, if there is also an Intel PT branch
trace, then a call stack can also be synthesized. Add that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add memory information from PEBS data in the Intel PT trace to the
synthesized PEBS sample. This provides sample types PERF_SAMPLE_ADDR,
PERF_SAMPLE_WEIGHT, and PERF_SAMPLE_TRANSACTION, but not
PERF_SAMPLE_DATA_SRC.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add LBR information from PEBS data in the Intel PT trace to the
synthesized PEBS sample.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add XMM register information from PEBS data in the Intel PT trace to the
synthesized PEBS sample.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add general purpose register information from PEBS data in the Intel PT
trace to the synthesized PEBS sample.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize a PEBS sample using basic information (ip, timestamp) only.
Other PEBS information will be added in later patches.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-7-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out common sample preparation for re-use when synthesizing PEBS
samples.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add infrastructure to prepare for synthesizing PEBS samples but leave
the actual synthesis to later patches.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-5-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Add 3 new packets to support PEBS via PT, namely Block Begin Packet
(BBP), Block Item Packet (BIP) and Block End Packet (BEP). PEBS data is
encoded into multiple BIP packets that come between BBP and BEP. The BEP
packet might be associated with a FUP packet. That is indicated by using
a separate packet type (INTEL_PT_BEP_IP) similar to other packets types
with the _IP suffix.
Refer to the Intel SDM for more information about PEBS via PT:
https://software.intel.com/en-us/articles/intel-sdm
May 2019 version: Vol. 3B 18.5.5.2 PEBS output to Intel® Processor Trace
Decoding of BIP packets conflicts with single-byte TNT packets. Since
BIP packets only occur in the context of a block (i.e. between BBP and
BEP), that context must be recorded and passed to the packet decoder.
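A sketch of the context the packet getter needs (the real decoder also
distinguishes 4-byte and 8-byte block items):

/* Whether a byte decodes as single-byte TNT or as BIP depends on
 * whether the decoder is inside a BBP..BEP block. */
enum pkt_ctx {
	PKT_CTX_NORMAL,		/* outside a block: byte may be TNT */
	PKT_CTX_BLOCK,		/* between BBP and BEP: byte may be BIP */
};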
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190610072803.10456-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Merge tag 'perf-core-for-mingo-5.3-20190611' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux into perf/core
Pull perf/core improvements and fixes from Arnaldo Carvalho de Melo:
perf record:
Alexey Budankov:
- Allow mixing --user-regs with --call-graph=dwarf, making sure that
the minimal set of registers for DWARF unwinding is present in the
set of user registers requested to be present in each sample, while
warning the user that this may make callchains unreliable if more
than the minimal set of registers is needed to unwind.
yuzhoujian:
- Add support to collect callchains from kernel or user space only,
IOW allow setting the perf_event_attr.exclude_callchain_{kernel,user}
bits from the command line.
perf trace:
Arnaldo Carvalho de Melo:
- Remove x86_64 specific syscall numbers from the augmented_raw_syscalls
BPF in-kernel collector of augmented raw_syscalls:sys_{enter,exit}
payloads, use instead the syscall numbers obtained either by the
arch specific syscalltbl generators or from audit-libs.
- Allow 'perf trace' to ask for the number of bytes to collect for
string arguments, for now ask for PATH_MAX, i.e. the whole
pathnames, which ends up being just a way to specify which syscall
args are pathnames and thus should be read using bpf_probe_read_str().
- Skip unknown syscalls when expanding strace like syscall groups.
This helps make the 'string' group of syscalls work on arm64,
where some of the syscalls present in x86_64 that deal with
strings, for instance 'access', are deprecated and thus should not
be asked for when tracing.
Leo Yan:
- Exit when failing to build eBPF program.
perf config:
Arnaldo Carvalho de Melo:
- Bail out when a handler returns failure for a key-value pair. This
helps with cases where processing a key-value pair is not just a
matter of setting some tool specific knob, involving, for instance
building a BPF program to then attach to the list of events 'perf
trace' will use, e.g. augmented_raw_syscalls.c.
perf.data:
Kan Liang:
- Read and store die ID information available in new Intel processors
in CPUID.1F in the CPU topology written in the perf.data header.
perf stat:
Kan Liang:
- Support per-die aggregation.
Documentation:
Arnaldo Carvalho de Melo:
- Update perf.data documentation about the CPU_TOPOLOGY, MEM_TOPOLOGY,
CLOCKID and DIR_FORMAT headers.
Song Liu:
- Add description of headers HEADER_BPF_PROG_INFO and HEADER_BPF_BTF.
Leo Yan:
- Update default value for llvm.clang-bpf-cmd-template in 'man perf-config'.
JVMTI:
Jiri Olsa:
- Address gcc string overflow warning for strncpy()
core:
- Remove superfluous nthreads system_wide setup in perf_evsel__alloc_fd().
Intel PT:
Adrian Hunter:
- Add support for samples to contain IPC ratio, collecting cycles
information from CYC packets, showing the IPC info periodically, because
Intel PT does not update the cycle count on every branch or instruction,
the incremental values will often be zero. When there are values, they
will be the number of instructions and number of cycles since the last
update, and thus represent the average IPC since the last IPC value.
E.g.:
# perf record --cpu 1 -m200000 -a -e intel_pt/cyc/u sleep 0.0001
rounding mmap pages size to 1024M (262144 pages)
[ perf record: Woken up 0 times to write data ]
[ perf record: Captured and wrote 2.208 MB perf.data ]
# perf script --insn-trace --xed -F+ipc,-dso,-cpu,-tid
#
<SNIP + add line numbering to make sense of IPC counts e.g.: (18/3)>
1 cc1 63501.650479626: 7f5219ac27bf _int_free+0x3f jnz 0x7f5219ac2af0 IPC: 0.81 (36/44)
2 cc1 63501.650479626: 7f5219ac27c5 _int_free+0x45 cmp $0x1f, %rbp
3 cc1 63501.650479626: 7f5219ac27c9 _int_free+0x49 jbe 0x7f5219ac2b00
4 cc1 63501.650479626: 7f5219ac27cf _int_free+0x4f test $0x8, %al
5 cc1 63501.650479626: 7f5219ac27d1 _int_free+0x51 jnz 0x7f5219ac2b00
6 cc1 63501.650479626: 7f5219ac27d7 _int_free+0x57 movq 0x13c58a(%rip), %rcx
7 cc1 63501.650479626: 7f5219ac27de _int_free+0x5e mov %rdi, %r12
8 cc1 63501.650479626: 7f5219ac27e1 _int_free+0x61 movq %fs:(%rcx), %rax
9 cc1 63501.650479626: 7f5219ac27e5 _int_free+0x65 test %rax, %rax
10 cc1 63501.650479626: 7f5219ac27e8 _int_free+0x68 jz 0x7f5219ac2821
11 cc1 63501.650479626: 7f5219ac27ea _int_free+0x6a leaq -0x11(%rbp), %rdi
12 cc1 63501.650479626: 7f5219ac27ee _int_free+0x6e mov %rdi, %rsi
13 cc1 63501.650479626: 7f5219ac27f1 _int_free+0x71 shr $0x4, %rsi
14 cc1 63501.650479626: 7f5219ac27f5 _int_free+0x75 cmpq %rsi, 0x13caf4(%rip)
15 cc1 63501.650479626: 7f5219ac27fc _int_free+0x7c jbe 0x7f5219ac2821
16 cc1 63501.650479626: 7f5219ac2821 _int_free+0xa1 cmpq 0x13f138(%rip), %rbp
17 cc1 63501.650479626: 7f5219ac2828 _int_free+0xa8 jnbe 0x7f5219ac28d8
18 cc1 63501.650479626: 7f5219ac28d8 _int_free+0x158 testb $0x2, 0x8(%rbx)
19 cc1 63501.650479628: 7f5219ac28dc _int_free+0x15c jnz 0x7f5219ac2ab0 IPC: 6.00 (18/3)
<SNIP>
- Allow using time ranges with Intel PT, i.e. these features, already
present but not optimally usable with Intel PT, should be now:
Select the second 10% time slice:
$ perf script --time 10%/2
Select from 0% to 10% time slice:
$ perf script --time 0%-10%
Select the first and second 10% time slices:
$ perf script --time 10%/1,10%/2
Select from 0% to 10% and 30% to 40% slices:
$ perf script --time 0%-10%,30%-40%
cs-etm (ARM):
Mathieu Poirier:
- Add support for CPU-wide trace scenarios.
s390:
Thomas Richter:
- Fix missing kvm module load for s390.
- Fix OOM error in TUI mode on s390.
- Support s390 diag event display when doing analysis on !s390
architectures.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Set up time ranges for efficient time interval filtering using the new
"fast forward" facility.
Because decoding is done in time order, intel_pt_time_filter() needs to
look only at the next start or end timestamp - refer to intel_pt_next_time().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190604130017.31207-12-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Implement the lookahead callback to let the decoder access subsequent
buffers. intel_pt_lookahead() manages the buffer lifetime and calls the
decoder for each buffer until the decoder returns a non-zero value.
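The contract is roughly this loop (a sketch with hypothetical buffer
helpers):

typedef int (*lookahead_cb_t)(void *buf, void *data);

struct buf_queue;				/* hypothetical */
void *queue_next_buffer(struct buf_queue *q);	/* NULL when exhausted */
void queue_put_buffer(struct buf_queue *q, void *buf);

/* Feed subsequent buffers to the callback until it returns non-zero,
 * managing each buffer's lifetime around the call. */
static int lookahead(struct buf_queue *q, lookahead_cb_t cb, void *data)
{
	void *buf;
	int err = 0;

	while (!err && (buf = queue_next_buffer(q))) {
		err = cb(buf, data);
		queue_put_buffer(q, buf);
	}
	return err;
}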
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190604130017.31207-11-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out intel_pt_get_buffer() so it can be reused.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190604130017.31207-10-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Based on 1 normalized pattern(s):
this program is free software you can redistribute it and or modify
it under the terms and conditions of the gnu general public license
version 2 as published by the free software foundation this program
is distributed in the hope it will be useful but without any
warranty without even the implied warranty of merchantability or
fitness for a particular purpose see the gnu general public license
for more details
extracted by the scancode license scanner the SPDX license identifier
GPL-2.0-only
has been chosen to replace the boilerplate/reference in 263 file(s).
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Copy the incremental instruction count and cycle count onto 'instructions'
and 'branches' samples.
Because Intel PT does not update the cycle count on every branch or
instruction, the incremental values will often be zero.
When there are values, they will be the number of instructions and
number of cycles since the last update, and thus represent the average
IPC since the last IPC value.
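For illustration, a minimal sketch of deriving IPC from the running
counts (names are hypothetical; the output format mimics the trace
shown earlier, e.g. "IPC: 6.00 (18/3)"):

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  /* Hypothetical per-queue state: running totals from the decoder. */
  struct ipc_state {
      uint64_t last_insn_cnt;
      uint64_t last_cyc_cnt;
  };

  /* Report deltas since the last time the cycle count advanced; zero
   * deltas mean there is no new IPC information for this sample. */
  static void sample_ipc(struct ipc_state *s, uint64_t insn_cnt,
                         uint64_t cyc_cnt)
  {
      uint64_t d_insn = insn_cnt - s->last_insn_cnt;
      uint64_t d_cyc = cyc_cnt - s->last_cyc_cnt;

      if (!d_cyc)
          return;  /* cycle count unchanged: nothing to report */

      /* Average IPC over the interval since the last update. */
      printf("IPC: %.2f (%" PRIu64 "/%" PRIu64 ")\n",
             (double)d_insn / (double)d_cyc, d_insn, d_cyc);
      s->last_insn_cnt = insn_cnt;
      s->last_cyc_cnt = cyc_cnt;
  }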
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190520113728.14389-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Returning 1 from intel_pt_sync_switch() causes the current tid to be
set. That removes the need to keep next_tid. Rationalize the code to
that effect.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190412113830.4126-9-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
Improve it by processing "context switch in" events.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190412113830.4126-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Commit 4eb0681571 ("perf script: Make itrace script default to all
calls") does not work because 'use_browser' is being used to determine
whether to default to periodic sampling (i.e. better for perf report).
The result is that nothing but CBR events display for perf script when
no --itrace option is specified.
Fix by using 'default_no_sample' and 'inject' instead.
Example:
Before:
$ perf record -e intel_pt/cyc/u ls
$ perf script > cmp1.txt
$ perf script --itrace=cepwx > cmp2.txt
$ diff -sq cmp1.txt cmp2.txt
Files cmp1.txt and cmp2.txt differ
After:
$ perf script > cmp1.txt
$ perf script --itrace=cepwx > cmp2.txt
$ diff -sq cmp1.txt cmp2.txt
Files cmp1.txt and cmp2.txt are identical
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.20+
Fixes: 90e457f7be ("perf tools: Add Intel PT support")
Link: http://lkml.kernel.org/r/20190520113728.14389-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When TSC is not available, "timeless" decoding is used but a divide by
zero occurs if perf_time_to_tsc() is called.
Ensure the divisor is not zero.
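A simplified sketch of the guarded conversion (the structure mirrors
perf's time conversion parameters, but the names here are
illustrative):

  #include <stdint.h>

  /* Simplified copy of perf's time conversion parameters. */
  struct tsc_conv {
      uint16_t time_shift;
      uint32_t time_mult;
      uint64_t time_zero;
  };

  /* With "timeless" decoding the parameters are all zero, so guard
   * the division before converting a perf time to a TSC value. */
  static uint64_t time_to_tsc(uint64_t ns, const struct tsc_conv *tc)
  {
      uint64_t t, quot, rem;

      if (!tc->time_mult)  /* no TSC: avoid dividing by zero */
          return 0;

      t = ns - tc->time_zero;
      quot = t / tc->time_mult;
      rem = t % tc->time_mult;
      return (quot << tc->time_shift) +
             (rem << tc->time_shift) / tc->time_mult;
  }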
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org # v4.9+
Link: https://lkml.kernel.org/n/tip-1i4j0wqoc8vlbkcizqqxpsf4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
The timestamp can be useful for finding the part of a trace that has an
error without outputting all of the trace, e.g. by using the itrace 's'
option to skip an initial number of events.
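For example (the event count here is illustrative), once the error's
timestamp is known, earlier events can be skipped with something like:
$ perf script --itrace=es10000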
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20190206103947.15750-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
perf creates a single 'struct thread' to represent the idle task. That
is because threads are identified by PID and TID, and the idle task
always has PID == TID == 0.
However, there are actually separate idle tasks for each CPU. That
creates a problem for thread stack processing which assumes that each
thread has a single stack, not one stack per CPU.
Fix that by passing through the CPU number, and in the case of the idle
"thread", pick the thread stack from an array based on the CPU number.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181221120620.9659-8-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the absence of a fallback, samples must provide a correct cpumode
for the 'ip'. Do that now that there is no fallback.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/20181031091043.23465-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
In the absence of a fallback, callchains must also encode the callchain
context. Do that now that there is no fallback.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Reviewed-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: stable@vger.kernel.org # 4.19
Link: http://lkml.kernel.org/r/100ea2ec-ed14-b56d-d810-e0a6d2f4b069@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
By default 'perf script' for itrace outputs sampled instructions or
branches. In my experience this is confusing to users because it's hard
to correlate with real program behavior. The sampling makes sense for
tools like 'perf report' that actually sample to reduce the run time,
but run time is normally not a problem for 'perf script'. It's better
to give an accurate representation of the program flow.
Default 'perf script' to output all calls for itrace. That's a much
saner default. The old behavior can still be requested with
'perf script --itrace=ibxwpe100000'.
v2: Fix ETM build failure
v3: Really fix ETM build failure (Kim Phillips)
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Kim Phillips <kim.phillips@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Link: http://lkml.kernel.org/r/20180920180540.14039-3-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Previously, the decoder would indicate begin / end by a branch from / to
zero. That hides useful information, in particular when a trace ends
with a call. To prepare for remedying that, add Intel PT decoder flags
for trace begin / end and map them to the existing sample flags.
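A sketch of the mapping (flag names and bit values are illustrative,
not perf's actual definitions):

  #include <stdint.h>

  #define DEC_FLAG_TRACE_BEGIN  (1u << 0)  /* tracing starts here */
  #define DEC_FLAG_TRACE_END    (1u << 1)  /* tracing stops here */
  #define DEC_FLAG_CALL         (1u << 2)

  #define SAMPLE_TRACE_BEGIN    (1u << 8)
  #define SAMPLE_TRACE_END      (1u << 9)
  #define SAMPLE_CALL           (1u << 10)

  /* Map decoder flags to sample flags instead of faking a branch
   * from/to address zero, so a trace that ends with e.g. a call keeps
   * both the "call" and the "trace end" information. */
  static uint32_t map_decoder_flags(uint32_t dec)
  {
      uint32_t sample = 0;

      if (dec & DEC_FLAG_TRACE_BEGIN)
          sample |= SAMPLE_TRACE_BEGIN;
      if (dec & DEC_FLAG_TRACE_END)
          sample |= SAMPLE_TRACE_END;
      if (dec & DEC_FLAG_CALL)
          sample |= SAMPLE_CALL;
      return sample;
  }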
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/20180920130048.31432-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Some Atom CPUs can produce FUP packets that contain NLIP (next linear
instruction pointer) instead of CLIP (current linear instruction
pointer). That will result in "Unexpected indirect branch" errors. Fix
by comparing IP to NLIP in that case.
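The shape of the check, as a hedged sketch (names are illustrative):

  #include <stdbool.h>
  #include <stdint.h>

  /* Accept a FUP payload matching either the current IP (CLIP) or the
   * next IP (NLIP = current IP + current instruction length). */
  static bool fup_ip_matches(uint64_t fup_ip, uint64_t ip,
                             unsigned int insn_len)
  {
      return fup_ip == ip || fup_ip == ip + insn_len;
  }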
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1527762225-26024-5-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
In one case, INTEL_PT_SS_NOT_TRACING state was not correctly
transitioning to INTEL_PT_SS_TRACING state due to a missing case clause.
Add it.
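The shape of the fix, sketched with shortened, illustrative state names
(not the real enum):

  enum ss_state { SS_UNKNOWN, SS_NOT_TRACING, SS_TRACING };

  static enum ss_state on_switch(enum ss_state state)
  {
      switch (state) {
      case SS_NOT_TRACING:  /* the previously missing case clause */
      case SS_UNKNOWN:
          return SS_TRACING;
      default:
          return state;
      }
  }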
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1527762225-26024-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
All users want MAP__FUNCTION, and this split is going away.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-sm72zwt1f03ma5uw78l6zze0@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
It was returning the map it found only via the addr_location passed in,
with the function itself returning void.
Make it return the map so that we can make the code more compact.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-tzlrrzdeoof4i6ktyqv1t6ks@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Out of thread__find_add_map(..., MAP__FUNCTION, ...), the idea here is
to continue removing references to MAP__{FUNCTION,VARIABLE} ahead of
getting both types of symbols in the same rbtree, as various places do
two lookups, looking first at MAP__FUNCTION, then at MAP__VARIABLE.
So thread__find_map() will eventually do just that, and 'struct symbol'
will have the symbol type, for code that cares about that.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: https://lkml.kernel.org/n/tip-q27xee34l4izpfau49w103s6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Intel PT code already has some preparation for AUX area sampling mode.
However the implementation has changed from the first proposal and one
of the side-effects is that it will not be possible to support snapshot
mode and sampling mode at the same time.
Although there are no plans to support that, let validation (not yet
implemented) rather than low-level functions control whether it is
allowed.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1520431349-30689-9-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
intel_pt_get_trace() fixes overlaps between the current buffer and the
previous buffer ('old_buffer').
However the previous buffer might not have had usable data (no PSB) so
the comparison must be made against the previous buffer that had usable
data.
Tidy that by keeping a pointer for that purpose in struct intel_pt_queue.
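A minimal sketch of the bookkeeping (hypothetical simplified types):

  #include <stdbool.h>
  #include <stddef.h>

  struct pt_buffer {
      bool usable;  /* contains a PSB, i.e. decodable data */
      /* trace bytes elided */
  };

  struct pt_queue {
      struct pt_buffer *old_buffer;  /* last buffer with usable data */
  };

  static void pt_fix_overlap(struct pt_buffer *cur,
                             struct pt_buffer *prev)
  {
      (void)cur;
      (void)prev;  /* overlap trimming elided */
  }

  /* Compare the new buffer against the last buffer that had usable
   * data, not simply the immediately preceding one (which may have
   * had no PSB). */
  static void pt_queue_buffer(struct pt_queue *q, struct pt_buffer *buf)
  {
      if (q->old_buffer)
          pt_fix_overlap(buf, q->old_buffer);
      if (buf->usable)
          q->old_buffer = buf;  /* remember only usable buffers */
  }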
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1520431349-30689-8-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
With the new way that sampling support will be implemented,
intel_pt_use_buffer_pid_tid() will not be needed. Get rid of it.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1520431349-30689-7-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
sync_switch is a facility to synchronize decoding more closely with the
point in the kernel when the context actually switched.
The flag when sync_switch is enabled was global to the decoding, whereas
it is really specific to the CPU.
The trace data for different CPUs is put on different queues, so add
sync_switch to the intel_pt_queue structure and use that in preference
to the global setting in the intel_pt structure.
That fixes problems decoding one CPU's trace because sync_switch was
disabled on a different CPU's queue.
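Sketched with illustrative names, the flag simply moves into the
per-queue structure:

  #include <stdbool.h>

  struct pt_queue {
      int cpu;
      bool sync_switch;  /* per-queue now, instead of one global flag */
  };

  static void pt_disable_sync_switch(struct pt_queue *q)
  {
      q->sync_switch = false;  /* only this CPU's queue is affected */
  }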
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-3-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Overlap detection was not updating the buffer's 'consecutive' flag.
Marking buffers consecutive has the advantage that decoding begins from
the start of the buffer instead of the first PSB. Fix overlap detection
to identify consecutive buffers correctly.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1520431349-30689-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
There is never a need to synthesize a 'swapped' sample, so all callers
of perf_event__synthesize_sample() pass 'false' as the value of
'swapped'. Get rid of the unused 'swapped' parameter.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1516108492-21401-4-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Both 'perf inject' and internal tools consume cpu endian samples, so
there is never a need to do any swapping when synthesizing samples.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Link: http://lkml.kernel.org/r/1516108492-21401-2-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Rename struct perf_data_file to perf_data, because we will add the
possibility to have multiple files under perf.data, so the 'perf_data'
name fits better.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Changbin Du <changbin.du@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-39wn4d77phel3dgkzo3lyan0@git.kernel.org
[ Fixup recent changes in 'perf script --per-event-dump' ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Synthesize new power and ptwrite events.
Power events report changes to C-state, but I have also added support
for the existing CBR (core-to-bus ratio) packet and included that when
outputting power events.
The PTWRITE packet is associated with the new "ptwrite" instruction,
which is essentially just a way to stuff a 32- or 64-bit value into the
PT trace.
More details can be found in the patches that add documentation and in
the Intel SDM.
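As a hedged illustration (requires hardware and toolchain support for
the instruction; compilers also offer intrinsics), a value can be
stuffed into the trace like this:

  #include <stdint.h>

  static inline void ptwrite64(uint64_t value)
  {
      asm volatile("ptwrite %0" : : "r" (value));
  }

  int main(void)
  {
      /* Appears as a PTWRITE packet in the Intel PT trace. */
      ptwrite64(0x1234);
      return 0;
  }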
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1498811805-2335-1-git-send-email-adrian.hunter@intel.com
[ Copy the description of such packet from the patchkit cover message ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
intel_pt_synth_events() uses the same attr structure to create each event.
Move the code around a bit to simplify that.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-33-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Factor out common code in functions synthesizing event samples i.e.
intel_pt_synth_branch_sample(), intel_pt_synth_instruction_sample() and
intel_pt_synth_transaction_sample().
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-27-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
'transactions_sample_type' is needed to correctly inject transactions
samples but it was not being set. Set it from the event sample type.
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/1495786658-18063-18-git-send-email-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>