Commit Graph

15 Commits

Author SHA1 Message Date
Athira Rajeev
cd0b6f67c4 perf annotate: Add some of the arithmetic instructions to support instruction tracking in powerpc
Data-type profiling has the concept of instruction tracking.

Example sequence in powerpc:

	ld      r10,264(r3)
	mr      r31,r3
	<<after some sequence>
	ld      r9,312(r31)

or differently

	lwz	r10,264(r3)
	add	r31, r3, RB
	lwz	r9, 0(r31)

If a sample is hit at "lwz r9, 0(r31)", data type of r31 depends
on previous instruction sequence here. So to track the previous
instructions, patch adds changes to identify some of the arithmetic
instructions which are having opcode as 31.

Since memory instructions also has cases with opcode 31, use the bits
22:30 to filter the arithmetic instructions here.

Also there are instructions with just two operands like "addme", "addze".

This patch adds new instructions ops "arithmetic_ops" to handle this

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-10-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
ace7d681d8 perf annotate: Add support to identify memory instructions of opcode 31 in powerpc
There are memory instructions in powerpc with opcode as 31.
Example: "ldx RT,RA,RB" , Its X form is as below:

  ______________________________________
  | 31 |  RT  |  RA |  RB |   21     |/|
  --------------------------------------
  0    6     11    16    21         30 31

The opcode for "ldx" is 31. There are other instructions also with
opcode 31 which are memory insn like ldux, stbx, lwzx, lhaux
But all instructions with opcode 31 are not memory. Example is add
instruction: "add RT,RA,RB"

The value in bit 21-30 [ 21 for ldx ] is different for these
instructions. Patch uses this value to assign instruction ops for these
cases. The naming convention and value to identify these are picked from
defines in "arch/powerpc/include/asm/ppc-opcode.h"

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-9-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
1acdad6818 perf annotate: Add parse function for memory instructions in powerpc
Use the raw instruction code and macros to identify memory instructions,
extract register fields and also offset.

The implementation addresses the D-form, X-form, DS-form instructions.
Two main functions are added.

New parse function "load_store__parse" as instruction ops parser for
memory instructions.

Unlike other parsers (like mov__parse), this one fills in the
"multi_regs" field for source/target and new added "mem_ref" field. No
other fields are set because, here there is no need to parse the
disassembled code and arch specific macros will take care of extracting
offset and regs which is easier and will be precise.

In powerpc, all instructions with a primary opcode from 32 to 63
are memory instructions. Update "ins__find" function to have "raw_insn"
also as a parameter.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-8-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
1b4406d2a8 perf annotate: Update parameters for reg extract functions to use raw instruction on powerpc
Use the raw instruction code and macros to identify memory instructions,
extract register fields and also offset.

The implementation addresses the D-form, X-form, DS-form instructions.

Adds "mem_ref" field to check whether source/target has memory
reference.

Add function "get_powerpc_regs" which will set these fields: reg1, reg2,
offset depending of where it is source or target ops.

Update "parse" callback for "struct ins_ops" to also pass "struct
disasm_line" as argument. This is needed in parse functions where opcode
is used to determine whether to set multi_regs and other fields

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
0b971e6bf1 perf annotate: Add support to capture and parse raw instruction in powerpc using dso__data_read_offset utility
Add support to capture and parse raw instruction in powerpc.
Currently, the perf tool infrastructure uses two ways to disassemble
and understand the instruction. One is objdump and other option is
via libcapstone.

Currently, the perf tool infrastructure uses "--no-show-raw-insn" option
with "objdump" while disassemble. Example from powerpc with this option
for an instruction address is:

Snippet from:

  objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>

  c0000000010224b4:	lwz     r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract instruction name,
registers names and offset. Also to find whether there is a memory
reference in the operands, "memory_ref_char" field of objdump is used.
For x86, "(" is used as memory_ref_char to tackle instructions of the
form "mov  (%rax), %rcx".

In case of powerpc, not all instructions using "(" are the only memory
instructions. Example, above instruction can also be of extended form (X
form) "lwzx r10,0,r19". Inorder to easy identify the instruction category
and extract the source/target registers, patch adds support to use raw
instruction for powerpc. Approach used is to read the raw instruction
directly from the DSO file using "dso__data_read_offset" utility which
is already implemented in perf infrastructure in "util/dso.c".

Example:

38 01 81 e8     ld      r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. In powerpc,
this translates to instruction form: "ld RT,DS(RA)" and binary code
as:

   | 58 |  RT  |  RA |      DS       | |
   -------------------------------------
   0    6     11    16              30 31

Function "symbol__disassemble_dso" is updated to read raw instruction
directly from DSO using dso__data_read_offset utility. In case of
above example, this captures:
line:    38 01 81 e8

The above works well when 'perf report' is invoked with only sort keys
for data type ie type and typeoff.

Because there is no instruction level annotation needed if only data
type information is requested for.

For annotating sample, along with type and typeoff sort key, "sym" sort
key is also needed. And by default invoking just "perf report" uses sort
key "sym" that displays the symbol information.

With approach changes in powerpc which first reads DSO for raw
instruction, "perf annotate" and "perf report" + a key breaks since
it doesn't do the instruction level disassembly.

Snippet of result from 'perf report':

  Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
  do_work  /usr/bin/pmlogger [Percent: local period]
  Percent│        ea230010
         │        3a550010
         │        3a600000

         │        38f60001
         │        39490008
         │        42400438
   51.44 │        81290008
         │        7d485378

Here, raw instruction is displayed in the output instead of human
readable annotated form.

One way to get the appropriate data is to specify "--objdump path", by
which code annotation will be done. But the default behaviour will be
changed. To fix this breakage, check if "sym" sort key is set. If so
fallback and use the libcapstone/objdump way of disassmbling the sample.

With the changes and "perf report"

Samples: 1K of event 'mem-loads', 4000 Hz, Event count (approx.): 937238
do_work  /usr/bin/pmlogger [Percent: local period]
Percent│        ld        r17,16(r3)
       │        addi      r18,r21,16
       │        li        r19,0

       │ 8b0:   rldicl    r10,r10,63,33
       │        addi      r10,r10,1
       │        mtctr     r10
       │      ↓ b         8e4
       │ 8c0:   addi      r7,r22,1
       │        addi      r10,r9,8
       │      ↓ bdz       d00
 51.44 │        lwz       r9,8(r9)
       │        mr        r8,r10
       │        cmpw      r20,r9

Committer notes:

Just add the extern for 'sort_order' in disasm.c so that we don't end up
breaking the build due to this type colision with capstone and libbpf:

  In file included from /usr/include/capstone/capstone.h:325,
                   from /git/perf-6.10.0/tools/perf/util/print_insn.h:23,
                   from builtin-script.c:38:
  /usr/include/capstone/bpf.h:94:14: error: 'bpf_insn' defined as wrong kind of tag
     94 | typedef enum bpf_insn {

I reported this to the bpf mailing list, see one of the links below.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-6-atrajeev@linux.vnet.ibm.com
Link: https://lore.kernel.org/bpf/ZqOltPk9VQGgJZAA@x1/T/#u
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
06dd4c5a56 perf annotate: Add disasm_line__parse() to parse raw instruction for powerpc
Currently, the perf tool infrastructure uses the disasm_line__parse
function to parse disassembled line.

Example snippet from objdump:

  objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>

  c0000000010224b4:	lwz     r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract instruction name,
registers names and offset.

In powerpc, the approach for data type profiling uses raw instruction
instead of result from objdump to identify the instruction category and
extract the source/target registers.

Example: 38 01 81 e8     ld      r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. Add function
"disasm_line__parse_powerpc" to handle parsing of raw instruction.
Also update "struct disasm_line" to save the binary code/
With the change, function captures:

line -> "38 01 81 e8     ld      r4,312(r1)"
raw instruction "38 01 81 e8"

Raw instruction is used later to extract the reg/offset fields. Macros
are added to extract opcode and register fields. "struct disasm_line"
is updated to carry union of "bytes" and "raw_insn" of 32 bit to carry raw
code (raw).

Function "disasm_line__parse_powerpc fills the raw instruction hex value
and can use macros to get opcode. There is no changes in existing code
paths, which parses the disassembled code.  The size of raw instruction
depends on architecture.

In case of powerpc, the parsing the disasm line needs to handle cases
for reading binary code directly from DSO as well as parsing the objdump
result. Hence adding the logic into separate function instead of
updating "disasm_line__parse".  The architecture using the instruction
name and present approach is not altered. Since this approach targets
powerpc, the macro implementation is added for powerpc as of now.

Since the disasm_line__parse is used in other cases (perf annotate) and
not only data tye profiling, the powerpc callback includes changes to
work with binary code as well as mnemonic representation.

Also in case if the DSO read fails and libcapstone is not supported, the
approach fallback to use objdump as option. Hence as option, patch has
changes to ensure objdump option also works well.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com
[ Add check for strndup() result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
782959ac24 perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking
Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently updating instruction state is handled by static
function "update_insn_state_x86" which is defined in "annotate-data.c".

Make this as a callback for specific arch and move to archs specific
file "arch/x86/annotate/instructions.c" . This will help to add helper
function for other platforms in file:
"arch/<platform>/annotate/instructions.c" and make changes/updates
easier.

Define callback "update_insn_state" as part of "struct arch", also make
some of the debug functions non-static so that it can be referenced from
other places.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Ian Rogers
1553419c3c perf dso: Fix address sanitizer build
Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.

Fixes: ee756ef749 ("perf dso: Add reference count checking and accessor functions")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:41 -07:00
Ian Rogers
ee756ef749 perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&dso->symbols, false);
  -       symbols__fixup_duplicate(&dso->symbols);
  -       dso->adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06 15:28:49 -03:00
Namhyung Kim
8f3ec810bb perf annotate: Update DSO binary type when trying build-id
dso__disassemble_filename() tries to get the filename for objdump (or
capstone) using build-id.  But I found sometimes it didn't disassemble
some functions.

It turned out that those functions belong to a DSO which has no binary
type set.  It seems it sets the binary type for some special files only
- like kernel (kallsyms or kcore) or BPF images.  And there's a logic to
skip dso with DSO_BINARY_TYPE__NOT_FOUND.

As it's checked the build-id cache link, it should set the binary type
as DSO_BINARY_TYPE__BUILD_ID_CACHE.

Fixes: 873a83731f ("perf annotate: Skip DSOs not found")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240425005157.1104789-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 22:12:55 -03:00
Namhyung Kim
f35847de2a perf annotate: Fallback disassemble to objdump when capstone fails
I found some cases that capstone failed to disassemble.  Probably my
capstone is an old version but anyway there's a chance it can fail.  And
then it silently stopped in the middle.  In my case, it didn't
understand "RDPKRU" instruction.

Let's check if the capstone disassemble reached the end of the function
and fallback to objdump if not.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240425005157.1104789-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-26 22:07:21 -03:00
Namhyung Kim
873a83731f perf annotate: Skip DSOs not found
In some data file, I see the following messages repeated.  It seems it
doesn't have DSOs in the system and the dso->binary_type is set to
DSO_BINARY_TYPE__NOT_FOUND.  Let's skip them to avoid the followings.

  No output from objdump  --start-address=0x0000000000000000 --stop-address=0x00000000000000d4  -d --no-show-raw-insn       -C "$1"
  Error running objdump  --start-address=0x0000000000000000 --stop-address=0x0000000000000631  -d --no-show-raw-insn       -C "$1"
  ...

Closes: https://lore.kernel.org/linux-perf-users/15e1a2847b8cebab4de57fc68e033086aa6980ce.camel@yandex.ru/
Reported-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Konstantin Kharlamov <Hi-Angel@yandex.ru>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240410185117.1987239-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12 12:02:06 -03:00
Namhyung Kim
92dfc59463 perf annotate: Add symbol name when using capstone
This is to keep the existing behavior with objdump.  It needs to show
symbol information of global variables like below:

   Percent |      Source code & Disassembly of elf for cycles:P (1 samples, percent: local period)
  ------------------------------------------------------------------------------------------------
           : 0                0xffffffff81338f70 <vm_normal_page>:
      0.00 :   ffffffff81338f70:       endbr64
      0.00 :   ffffffff81338f74:       callq   0xffffffff81083a40
      0.00 :   ffffffff81338f79:       movq    %rdi, %r8
      0.00 :   ffffffff81338f7c:       movq    %rdx, %rdi
      0.00 :   ffffffff81338f7f:       callq   *0x17021c3(%rip)   # ffffffff82a3b148 <pv_ops+0x1e8>
      0.00 :   ffffffff81338f85:       movq    0xffbf3c(%rip), %rdx       # ffffffff82334ec8 <physical_mask>
      0.00 :   ffffffff81338f8c:       testq   %rax, %rax                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      0.00 :   ffffffff81338f8f:       je      0xffffffff81338fd0                         here
      0.00 :   ffffffff81338f91:       movq    %rax, %rcx
      0.00 :   ffffffff81338f94:       andl    $1, %ecx

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240329215812.537846-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03 11:48:57 -03:00
Namhyung Kim
6d17edc113 perf annotate: Use libcapstone to disassemble
Now it can use the capstone library to disassemble the instructions.
Let's use that (if available) for perf annotate to speed up.  Currently
it only supports x86 architecture.  With this change I can see ~3x speed
up in data type profiling.

But note that capstone cannot give the source file and line number info.
For now, users should use the external objdump for that by specifying
the --objdump option explicitly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240329215812.537846-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03 11:48:57 -03:00
Namhyung Kim
98f69a573c perf annotate: Split out util/disasm.c
The util/annotate.c code has both disassembly and sample annotation
related codes.  Factor out the disasm part so that it can be handled
more easily.

No functional changes intended.

Committer notes:

Add missing include env.h, util.h, bpf-event.h and bpf-util.h to
disasm.c, to fix things like:

  util/disasm.c: In function ‘symbol__disassemble_bpf’:
  util/disasm.c:1203:9: error: implicit declaration of function ‘perf_exe’ [-Werror=implicit-function-declaration]
   1203 |         perf_exe(tpath, sizeof(tpath));
        |         ^~~~~~~~

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240329215812.537846-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03 11:48:57 -03:00