mirror of
https://git.proxmox.com/git/mirror_ubuntu-kernels.git
synced 2025-12-22 19:35:11 +00:00
cd0b6f67c4
15 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
cd0b6f67c4 |
perf annotate: Add some of the arithmetic instructions to support instruction tracking in powerpc
Data-type profiling has the concept of instruction tracking. Example sequence in powerpc: ld r10,264(r3) mr r31,r3 <<after some sequence> ld r9,312(r31) or differently lwz r10,264(r3) add r31, r3, RB lwz r9, 0(r31) If a sample is hit at "lwz r9, 0(r31)", data type of r31 depends on previous instruction sequence here. So to track the previous instructions, patch adds changes to identify some of the arithmetic instructions which are having opcode as 31. Since memory instructions also has cases with opcode 31, use the bits 22:30 to filter the arithmetic instructions here. Also there are instructions with just two operands like "addme", "addze". This patch adds new instructions ops "arithmetic_ops" to handle this Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-10-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
ace7d681d8 |
perf annotate: Add support to identify memory instructions of opcode 31 in powerpc
There are memory instructions in powerpc with opcode as 31. Example: "ldx RT,RA,RB" , Its X form is as below: ______________________________________ | 31 | RT | RA | RB | 21 |/| -------------------------------------- 0 6 11 16 21 30 31 The opcode for "ldx" is 31. There are other instructions also with opcode 31 which are memory insn like ldux, stbx, lwzx, lhaux But all instructions with opcode 31 are not memory. Example is add instruction: "add RT,RA,RB" The value in bit 21-30 [ 21 for ldx ] is different for these instructions. Patch uses this value to assign instruction ops for these cases. The naming convention and value to identify these are picked from defines in "arch/powerpc/include/asm/ppc-opcode.h" Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-9-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1acdad6818 |
perf annotate: Add parse function for memory instructions in powerpc
Use the raw instruction code and macros to identify memory instructions, extract register fields and also offset. The implementation addresses the D-form, X-form, DS-form instructions. Two main functions are added. New parse function "load_store__parse" as instruction ops parser for memory instructions. Unlike other parsers (like mov__parse), this one fills in the "multi_regs" field for source/target and new added "mem_ref" field. No other fields are set because, here there is no need to parse the disassembled code and arch specific macros will take care of extracting offset and regs which is easier and will be precise. In powerpc, all instructions with a primary opcode from 32 to 63 are memory instructions. Update "ins__find" function to have "raw_insn" also as a parameter. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-8-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1b4406d2a8 |
perf annotate: Update parameters for reg extract functions to use raw instruction on powerpc
Use the raw instruction code and macros to identify memory instructions, extract register fields and also offset. The implementation addresses the D-form, X-form, DS-form instructions. Adds "mem_ref" field to check whether source/target has memory reference. Add function "get_powerpc_regs" which will set these fields: reg1, reg2, offset depending of where it is source or target ops. Update "parse" callback for "struct ins_ops" to also pass "struct disasm_line" as argument. This is needed in parse functions where opcode is used to determine whether to set multi_regs and other fields Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
0b971e6bf1 |
perf annotate: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility
Add support to capture and parse raw instruction in powerpc.
Currently, the perf tool infrastructure uses two ways to disassemble
and understand the instruction. One is objdump and other option is
via libcapstone.
Currently, the perf tool infrastructure uses "--no-show-raw-insn" option
with "objdump" while disassemble. Example from powerpc with this option
for an instruction address is:
Snippet from:
objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux>
c0000000010224b4: lwz r10,0(r9)
This line "lwz r10,0(r9)" is parsed to extract instruction name,
registers names and offset. Also to find whether there is a memory
reference in the operands, "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov (%rax), %rcx".
In case of powerpc, not all instructions using "(" are the only memory
instructions. Example, above instruction can also be of extended form (X
form) "lwzx r10,0,r19". Inorder to easy identify the instruction category
and extract the source/target registers, patch adds support to use raw
instruction for powerpc. Approach used is to read the raw instruction
directly from the DSO file using "dso__data_read_offset" utility which
is already implemented in perf infrastructure in "util/dso.c".
Example:
38 01 81 e8 ld r4,312(r1)
Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:
| 58 | RT | RA | DS | |
-------------------------------------
0 6 11 16 30 31
Function "symbol__disassemble_dso" is updated to read raw instruction
directly from DSO using dso__data_read_offset utility. In case of
above example, this captures:
line: 38 01 81 e8
The above works well when 'perf report' is invoked with only sort keys
for data type ie type and typeoff.
Because there is no instruction level annotation needed if only data
type information is requested for.
For annotating sample, along with type and typeoff sort key, "sym" sort
key is also needed. And by default invoking just "perf report" uses sort
key "sym" that displays the symbol information.
With approach changes in powerpc which first reads DSO for raw
instruction, "perf annotate" and "perf report" + a key breaks since
it doesn't do the instruction level disassembly.
Snippet of result from 'perf report':
Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ea230010
│ 3a550010
│ 3a600000
│ 38f60001
│ 39490008
│ 42400438
51.44 │ 81290008
│ 7d485378
Here, raw instruction is displayed in the output instead of human
readable annotated form.
One way to get the appropriate data is to specify "--objdump path", by
which code annotation will be done. But the default behaviour will be
changed. To fix this breakage, check if "sym" sort key is set. If so
fallback and use the libcapstone/objdump way of disassmbling the sample.
With the changes and "perf report"
Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work /usr/bin/pmlogger [Percent: local period]
Percent│ ld r17,16(r3)
│ addi r18,r21,16
│ li r19,0
│ 8b0: rldicl r10,r10,63,33
│ addi r10,r10,1
│ mtctr r10
│ ↓ b 8e4
│ 8c0: addi r7,r22,1
│ addi r10,r9,8
│ ↓ bdz d00
51.44 │ lwz r9,8(r9)
│ mr r8,r10
│ cmpw r20,r9
Committer notes:
Just add the extern for 'sort_order' in disasm.c so that we don't end up
breaking the build due to this type colision with capstone and libbpf:
In file included from /usr/include/capstone/capstone.h:325,
from /git/perf-6.10.0/tools/perf/util/print_insn.h:23,
from builtin-script.c:38:
/usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag
94 | typedef enum bpf_insn {
I reported this to the bpf mailing list, see one of the links below.
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-6-atrajeev@linux.vnet.ibm.com
Link: https://lore.kernel.org/bpf/ZqOltPk9VQGgJZAA@x1/T/#u
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
06dd4c5a56 |
perf annotate: Add disasm_line__parse() to parse raw instruction for powerpc
Currently, the perf tool infrastructure uses the disasm_line__parse function to parse disassembled line. Example snippet from objdump: objdump --start-address=<address> --stop-address=<address> -d --no-show-raw-insn -C <vmlinux> c0000000010224b4: lwz r10,0(r9) This line "lwz r10,0(r9)" is parsed to extract instruction name, registers names and offset. In powerpc, the approach for data type profiling uses raw instruction instead of result from objdump to identify the instruction category and extract the source/target registers. Example: 38 01 81 e8 ld r4,312(r1) Here "38 01 81 e8" is the raw instruction representation. Add function "disasm_line__parse_powerpc" to handle parsing of raw instruction. Also update "struct disasm_line" to save the binary code/ With the change, function captures: line -> "38 01 81 e8 ld r4,312(r1)" raw instruction "38 01 81 e8" Raw instruction is used later to extract the reg/offset fields. Macros are added to extract opcode and register fields. "struct disasm_line" is updated to carry union of "bytes" and "raw_insn" of 32 bit to carry raw code (raw). Function "disasm_line__parse_powerpc fills the raw instruction hex value and can use macros to get opcode. There is no changes in existing code paths, which parses the disassembled code. The size of raw instruction depends on architecture. In case of powerpc, the parsing the disasm line needs to handle cases for reading binary code directly from DSO as well as parsing the objdump result. Hence adding the logic into separate function instead of updating "disasm_line__parse". The architecture using the instruction name and present approach is not altered. Since this approach targets powerpc, the macro implementation is added for powerpc as of now. Since the disasm_line__parse is used in other cases (perf annotate) and not only data tye profiling, the powerpc callback includes changes to work with binary code as well as mnemonic representation. Also in case if the DSO read fails and libcapstone is not supported, the approach fallback to use objdump as option. Hence as option, patch has changes to ensure objdump option also works well. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com [ Add check for strndup() result ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
782959ac24 |
perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking
Add "update_insn_state" callback to "struct arch" to handle instruction tracking. Currently updating instruction state is handled by static function "update_insn_state_x86" which is defined in "annotate-data.c". Make this as a callback for specific arch and move to archs specific file "arch/x86/annotate/instructions.c" . This will help to add helper function for other platforms in file: "arch/<platform>/annotate/instructions.c" and make changes/updates easier. Define callback "update_insn_state" as part of "struct arch", also make some of the debug functions non-static so that it can be referenced from other places. Reviewed-by: Kajol Jain <kjain@linux.ibm.com> Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Kajol Jain <kjain@linux.ibm.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Akanksha J N <akanksha@linux.ibm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Disha Goel <disgoel@linux.vnet.ibm.com> Cc: Hari Bathini <hbathini@linux.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Segher Boessenkool <segher@kernel.crashing.org> Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
1553419c3c |
perf dso: Fix address sanitizer build
Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.
Fixes:
|
||
|
|
ee756ef749 |
perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.
The majority of the change is mechanical in nature and not easy to
split up.
Committer testing:
'perf test' up to this patch shows no regressions.
But:
util/symbol.c: In function ‘dso__load_bfd_symbols’:
util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
1683 | dso__set_adjust_symbols(dso);
| ^~~~~~~~~~~~~~~~~~~~~~~
In file included from util/symbol.c:21:
util/dso.h:268:20: note: declared here
268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
| ^~~~~~~~~~~~~~~~~~~~~~~
make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/
make[6]: *** Waiting for unfinished jobs....
This was updated:
- symbols__fixup_end(&dso->symbols, false);
- symbols__fixup_duplicate(&dso->symbols);
- dso->adjust_symbols = 1;
+ symbols__fixup_end(dso__symbols(dso), false);
+ symbols__fixup_duplicate(dso__symbols(dso));
+ dso__set_adjust_symbols(dso);
But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).
Add the missing argument:
symbols__fixup_end(dso__symbols(dso), false);
symbols__fixup_duplicate(dso__symbols(dso));
- dso__set_adjust_symbols(dso);
+ dso__set_adjust_symbols(dso, true);
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
8f3ec810bb |
perf annotate: Update DSO binary type when trying build-id
dso__disassemble_filename() tries to get the filename for objdump (or
capstone) using build-id. But I found sometimes it didn't disassemble
some functions.
It turned out that those functions belong to a DSO which has no binary
type set. It seems it sets the binary type for some special files only
- like kernel (kallsyms or kcore) or BPF images. And there's a logic to
skip dso with DSO_BINARY_TYPE__NOT_FOUND.
As it's checked the build-id cache link, it should set the binary type
as DSO_BINARY_TYPE__BUILD_ID_CACHE.
Fixes:
|
||
|
|
f35847de2a |
perf annotate: Fallback disassemble to objdump when capstone fails
I found some cases that capstone failed to disassemble. Probably my capstone is an old version but anyway there's a chance it can fail. And then it silently stopped in the middle. In my case, it didn't understand "RDPKRU" instruction. Let's check if the capstone disassemble reached the end of the function and fallback to objdump if not. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
873a83731f |
perf annotate: Skip DSOs not found
In some data file, I see the following messages repeated. It seems it doesn't have DSOs in the system and the dso->binary_type is set to DSO_BINARY_TYPE__NOT_FOUND. Let's skip them to avoid the followings. No output from objdump --start-address=0x0000000000000000 --stop-address=0x00000000000000d4 -d --no-show-raw-insn -C "$1" Error running objdump --start-address=0x0000000000000000 --stop-address=0x0000000000000631 -d --no-show-raw-insn -C "$1" ... Closes: https://lore.kernel.org/linux-perf-users/15e1a2847b8cebab4de57fc68e033086aa6980ce.camel@yandex.ru/ Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240410185117.1987239-1-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
92dfc59463 |
perf annotate: Add symbol name when using capstone
This is to keep the existing behavior with objdump. It needs to show
symbol information of global variables like below:
Percent | Source code & Disassembly of elf for cycles:P (1 samples, percent: local period)
------------------------------------------------------------------------------------------------
: 0 0xffffffff81338f70 <vm_normal_page>:
0.00 : ffffffff81338f70: endbr64
0.00 : ffffffff81338f74: callq 0xffffffff81083a40
0.00 : ffffffff81338f79: movq %rdi, %r8
0.00 : ffffffff81338f7c: movq %rdx, %rdi
0.00 : ffffffff81338f7f: callq *0x17021c3(%rip) # ffffffff82a3b148 <pv_ops+0x1e8>
0.00 : ffffffff81338f85: movq 0xffbf3c(%rip), %rdx # ffffffff82334ec8 <physical_mask>
0.00 : ffffffff81338f8c: testq %rax, %rax ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
0.00 : ffffffff81338f8f: je 0xffffffff81338fd0 here
0.00 : ffffffff81338f91: movq %rax, %rcx
0.00 : ffffffff81338f94: andl $1, %ecx
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240329215812.537846-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
||
|
|
6d17edc113 |
perf annotate: Use libcapstone to disassemble
Now it can use the capstone library to disassemble the instructions. Let's use that (if available) for perf annotate to speed up. Currently it only supports x86 architecture. With this change I can see ~3x speed up in data type profiling. But note that capstone cannot give the source file and line number info. For now, users should use the external objdump for that by specifying the --objdump option explicitly. Signed-off-by: Namhyung Kim <namhyung@kernel.org> Tested-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> |
||
|
|
98f69a573c |
perf annotate: Split out util/disasm.c
The util/annotate.c code has both disassembly and sample annotation
related codes. Factor out the disasm part so that it can be handled
more easily.
No functional changes intended.
Committer notes:
Add missing include env.h, util.h, bpf-event.h and bpf-util.h to
disasm.c, to fix things like:
util/disasm.c: In function ‘symbol__disassemble_bpf’:
util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration]
1203 | perf_exe(tpath, sizeof(tpath));
| ^~~~~~~~
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|