Commit Graph

977 Commits

Author SHA1 Message Date
Ian Rogers
e7bb49e3f6 perf x86: Define arch_fetch_insn in NO_AUXTRACE builds
archinsn.c containing arch_fetch_insn was only enabled with
CONFIG_AUXTRACE, but this meant that a NO_AUXTRACE build on x86 would
use the empty weak version of arch_fetch_insn - weak symbols are a
frequent source of errors like this and are outside of the C
specification. Change it so that archinsn.c is always built on x86 and
make the weak symbol empty version of arch_fetch_insn a strong one
guarded by ifdefs.

arch_fetch_insn on x86 depends on insn_decode which is a function
included then built into intel-pt-insn-decoder.c.
intel-pt-insn-decoder.c isn't built in a NO_AUXTRACE=1 build. Separate
the insn_decode function from intel-pt-insn-decoder.c by just directly
compiling the relevant file. Guard this compilation to be for either
always on x86 (because of the use in arch_fetch_insn) or when auxtrace
is enabled. Apply the CFLAGS overrides as necessary, reducing the amount
of code where warnings are disabled.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@redhat.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Veronika Molnarova <vmolnaro@redhat.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Link: https://lore.kernel.org/r/20241119011644.971342-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:33 -03:00
Athira Rajeev
2aad2130c2 perf tools arch powerpc: Add register mask for power11 PVR in extended regs
Perf tools side uses extended mask to display the platform supported
register names (with -I? option) to the user and also send this mask to
the kernel to capture the extended registers as part of each sample.
This mask value is decided based on the processor version ( from PVR ).

Add PVR value for power11 to enable capturing the extended regs as part
of sample in power11.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20241206135637.36166-1-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18 16:24:31 -03:00
Namhyung Kim
81b483f722 tools headers: Sync *xattrat syscall changes with the kernel sources
To pick up the changes in this cset:

  6140be90ec ("fs/xattr: add *at family syscalls")

This addresses these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_32.tbl arch/x86/entry/syscalls/syscall_32.tbl
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

The arm64 changes are not included as it requires more changes in the
tools.  It'll be worked for the later cycle.

Please see tools/include/uapi/README for further details.

Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christian Brauner <brauner@kernel.org>
CC: x86@kernel.org
CC: linux-mips@vger.kernel.org
CC: linuxppc-dev@lists.ozlabs.org
CC: linux-s390@vger.kernel.org
Link: https://lore.kernel.org/r/20241203035349.1901262-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-04 14:34:50 -08:00
Ian Rogers
8f997865ee perf pmu: Move pmu_metrics_table__find and remove ARM override
Move pmu_metrics_table__find() to the jevents.py generated pmu-events.c
and remove indirection override for ARM.

The movement removes perf_pmu__find_metrics_table that exists to enable
the ARM override.

The ARM override isn't necessary as just the CPUID, not PMU, is used in
the metric table lookup.

On non-ARM the CPU argument is just ignored for the CPUID, for ARM -1 is
passed so that the CPUID for the first logical CPU is read.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:42:36 -03:00
Ian Rogers
494c403ff1 perf header: Pass a perf_cpu rather than a PMU to get_cpuid_str
On ARM the cpuid is dependent on the core type of the CPU in
question. The PMU was passed for the sake of the CPU map but this
means in places a temporary PMU is created just to pass a CPU
value. Just pass the CPU and fix up the callers.

As there are no longer PMU users in header.h, shuffle forward
declarations earlier to work around build failures.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:40:30 -03:00
Ian Rogers
538737da96 perf arm64 header: Use cpu argument in get_cpuid
Use the cpu to read the MIDR file requested. If the "any" value (-1) is
passed that keep the behavior of returning the first MIDR file that can
be read.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:39:04 -03:00
Ian Rogers
cec0d6572a perf header: Refactor get_cpuid to take a CPU for ARM
ARM BIG.little has no notion of a constant CPUID for both core
types. To reflect this reality, change the get_cpuid function to also
pass in a possibly unused logical cpu.

If the dummy value (-1) is passed in then ARM can, as currently happens,
select the first logical CPU's "CPUID".

The changes to ARM getcpuid happen in a follow up change.

Reviewed-by: James Clark <james.clark@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Xu Yang <xu.yang_2@nxp.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Zong-You Xie <ben717@andestech.com>
Cc: Benjamin Gray <bgray@linux.ibm.com>
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: Clément Le Goffic <clement.legoffic@foss.st.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Link: https://lore.kernel.org/r/20241107162035.52206-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-16 16:37:54 -03:00
Ian Rogers
ddbfb6f20c perf build: Remove PERF_HAVE_DWARF_REGS
PERF_HAVE_DWARF_REGS was true when an architecture had a dwarf-regs.c
file. There are no more architecture dwarf-regs.c files, selection is
done using constants from the ELF file rather than conditional
compilation. When removing PERF_HAVE_DWARF_REGS was the only variable
in the Makefile, remove the Makefile.

Add missing SPDX for RISC-V Makefile.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-21-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:14 -08:00
Ian Rogers
a4747c0950 perf xtensa: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. As this is
the only file in the arch/xtensa/util clean up Build files. Tidy up the
EM_NONE cases for xtensa in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-19-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:14 -08:00
Ian Rogers
85567a2a8d perf sparc: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. As this is
the only file in the arch/sparc/util clean up Build files. Tidy up the
EM_NONE cases for sparc in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-18-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:14 -08:00
Ian Rogers
04150f29e2 perf sh: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. As this is
the only file in the arch/sh/util clean up Build files. Tidy up the
EM_NONE cases for sh in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-17-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:14 -08:00
Ian Rogers
b232b704a7 perf s390: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. Tidy up the
EM_NONE cases for s390 in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-16-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:14 -08:00
Ian Rogers
a90c451918 perf riscv: Remove dwarf-regs.c and add dwarf-regs-table.h
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case,
and the register table is provided in a header file, the function can
never be called. So remove as dead code. Tidy up the EM_NONE cases for
riscv in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-15-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
285b523c2d perf dwarf-regs: Move powerpc dwarf-regs out of arch
Move arch/powerpc/util/dwarf-regs.c to util/dwarf-regs-powerpc.c and
compile in unconditionally. get_arch_regstr is redundant when EM_NONE
is treated as EM_HOST so remove and update dwarf-regs.c conditions.
Make get_powerpc_regs unconditionally available whwn libdw is.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-14-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
8a768a2f65 perf mips: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. Tidy up the
EM_NONE cases for mips in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-13-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
1d37bd8366 perf loongarch: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. Tidy up the
EM_NONE cases for loongarch in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
d4a0c4f221 perf dwarf-regs: Move csky dwarf-regs out of arch
Move arch/csky/util/dwarf-regs.c to util/dwarf-regs-csky.c and compile
in unconditionally. To avoid get_arch_regstr being duplicated, rename
to get_csky_regstr and add to get_dwarf_regstr switch.

Update #ifdefs to allow ABI V1 and V2 tables at the same
time. Determine the table from the ELF flags.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
0c0a20ecdf perf arm: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. Tidy up the
EM_NONE cases for arm in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-10-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
6f8e8add5a perf arm64: Remove dwarf-regs.c
The file just provides the function get_arch_regstr, however, if in
the only caller get_dwarf_regstr EM_HOST is used for the EM_NONE case
the function can never be called. So remove as dead code. Tidy up the
EM_NONE cases for arm64 in dwarf-regs.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-9-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
bf4e799a0a perf dwarf-regs: Move x86 dwarf-regs out of arch
Move arch/x86/util/dwarf-regs.c to util/dwarf-regs-x86.c and compile
in unconditionally. To avoid get_arch_regnum being duplicated, rename
to get_x86_regnum and add to get_dwarf_regnum switch.

For get_arch_regstr, this was unused on x86 unless the machine type
was EM_NONE. Map that case to EM_HOST and remove get_arch_regstr from
dwarf-regs-x86.c.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
cd6c9dca9d perf disasm: Add e_machine/e_flags to struct arch
Currently functions like get_dwarf_regnum only work with the host
architecture. Carry the elf machine and flags in struct arch so that
in disassembly these can be used to allow cross platform disassembly.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-5-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:13 -08:00
Ian Rogers
6ac75289b2 perf dwarf-regs: Remove PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET
PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET was used for BPF prologue
support which was removed in Commit 3d6dfae889 ("perf parse-events:
Remove BPF event support"). The code is no longer used so remove.

Remove the offset from various dwarf-regs.c tables and the dependence
on ptrace.h. Rename structs starting pt_ as the ptrace derived offset is
now removed.

Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241108234606.429459-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-11-09 08:39:12 -08:00
Björn Töpel
8c0d1202ba perf, riscv: Wire up perf trace support for RISC-V
RISC-V does not currently support perf trace, since the system call
table is not generated.

Perform the copy/paste exercise, wiring up RISC-V system call table
generation.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Link: https://lore.kernel.org/r/20241024190353.46737-1-bjorn@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-30 23:39:34 -07:00
Namhyung Kim
28398ce172 perf tools: Move x86__is_amd_cpu() to util/env.c
It can be called from non-x86 platform so let's move it to the general
util directory.  Also add a new helper perf_env__is_x86_amd_cpu() so
that it can be called with an existing perf_env as well.

Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Atish Patra <atishp@atishpatra.org>
Cc: Mingwei Zhang <mizhang@google.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Palmer Dabbelt <palmer@rivosinc.com>
Link: https://lore.kernel.org/r/20241016062359.264929-7-namhyung@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-22 09:55:07 -07:00
Ian Rogers
5455d89bf3 perf build: Rename CONFIG_DWARF to CONFIG_LIBDW
In Makefile.config for unwinding the name dwarf implies either
libunwind or libdw. Make it clearer that CONFIG_DWARF is really just
defined when libdw is present by renaming to CONFIG_LIBDW.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-12-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
8838abf626 perf build: Rename HAVE_DWARF_SUPPORT to HAVE_LIBDW_SUPPORT
In Makefile.config for unwinding the name dwarf implies either
libunwind or libdw. Make it clearer that HAVE_DWARF_SUPPORT is really
just defined when libdw is present by renaming to HAVE_LIBDW_SUPPORT.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-11-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
54a1368567 perf build: Rename NO_DWARF to NO_LIBDW
NO_DWARF could mean more than NO_LIBDW support, in particular no
libunwind support. Rename to be more intention revealing.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Shenlin Liang <liangshenlin@eswincomputing.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Guilherme Amadio <amadio@gentoo.org>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Alexander Lobakin <aleksander.lobakin@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Chen Pei <cp0613@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Aditya Gupta <adityag@linux.ibm.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-riscv@lists.infradead.org
Cc: Bibo Mao <maobibo@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: linux-csky@vger.kernel.org
Link: https://lore.kernel.org/r/20241017001354.56973-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-18 10:17:40 -07:00
Ian Rogers
37b77ae954 perf stat: Change color to threshold in print_metric
Colors don't mean things in CSV and JSON output, switch to a threshold
enum value that the standard output can convert to a color. Updating
the CSV and JSON output will be later changes.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241017175356.783793-6-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17 12:44:26 -07:00
Athira Rajeev
54f9aa1092 tools/perf/powerpc/util: Add support to handle compatible mode PVR for perf json events
perf list picks the events supported for specific platform
from pmu-events/arch/powerpc/<platform>. Example power10 events
are in pmu-events/arch/powerpc/power10, power9 events are part
of pmu-events/arch/powerpc/power9. The decision of which
platform to pick is determined based on PVR value in powerpc.
The PVR value is matched from pmu-events/arch/powerpc/mapfile.csv

Example:

Format:
	PVR,Version,JSON/file/pathname,Type

0x004[bcd][[:xdigit:]]{4},1,power8,core
0x0066[[:xdigit:]]{4},1,power8,core
0x004e[[:xdigit:]]{4},1,power9,core
0x0080[[:xdigit:]]{4},1,power10,core
0x0082[[:xdigit:]]{4},1,power10,core

The code gets the PVR from system using get_cpuid_str function
in arch/powerpc/util/headers.c ( from SPRN_PVR ) and compares
with value from mapfile.csv
In case of compat mode, say when partition is booted in a power9
mode when the system is a power10, this picks incorrectly. Because
PVR will point to power10 where as it should pick events from power9
folder. To support generic events, add new folder
pmu-events/arch/powerpc/compat to contain the ISA architected events
which is supported in compat mode. Also return 0x00ffffff as pvr
when booted in compat mode. Based on this pvr value, json will
pick events from pmu-events/arch/powerpc/compat

Suggested-by: Madhavan Srinivasan <maddy@linux.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel<disgoel@linux.ibm.com>
Cc: akanksha@linux.ibm.com
Cc: hbathini@linux.ibm.com
Cc: kjain@linux.ibm.com
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20241010145107.51211-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17 11:25:00 -07:00
Dapeng Mi
fbc798316b perf x86/topdown: Refine helper arch_is_topdown_metrics()
Leverage the existed function perf_pmu__name_from_config() to check if
an event is topdown metrics event. perf_pmu__name_from_config() goes
through the defined formats and figures out the config of pre-defined
topdown events.

This avoids to figure out the config of topdown pre-defined events with
hard-coded format strings "event=" and "umask=" and provides more
flexibility.

Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20241011110207.1032235-2-dapeng1.mi@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-16 13:36:47 -07:00
Dapeng Mi
b68b5b36c7 perf x86/topdown: Make topdown metrics comparators be symmetric
The commit "3b5edc0421e2 (perf x86/topdown: Don't move topdown metric
 events in group)" modifies topdown metrics comparator to move topdown
metrics events which are not in same group with previous event. But it
just modifies the 2nd comparator and causes the comparators become
asymmetric.

Thus modify the 1st topdown metrics comparator and make the two
comparators be symmetric, and refine the comments as well.

Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20241011110207.1032235-1-dapeng1.mi@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-16 13:36:41 -07:00
Howard Chu
0c383c0827 perf test: Delete unused Intel CQM test
As Ian Rogers <irogers@google.com> pointed out, intel-cqm.c is neither
used nor built. It was deleted in the following commit:

commit b24413180f ("License cleanup: add SPDX GPL-2.0 license identifier to files with no license")

However, it resurfaced soon after in the following commit:

commit 5c9295bfe6 ("perf tests: Remove Intel CQM perf test")

It should be deleted once and for all.

Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Howard Chu <howardchu95@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Matt Fleming <mfleming@cloudflare.com>
Link: https://lore.kernel.org/r/20241011055700.4142694-1-howardchu95@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-14 12:04:31 -07:00
Ian Rogers
069057239a perf tool_pmu: Move expr literals to tool_pmu
Add the expr literals like "#smt_on" as tool events, this allows stat
events to give the values. On my laptop with hyperthreading enabled:

```
$ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true

 Performance counter stats for 'true':

                 0      has_pmem
                 8      num_cores
                16      num_cpus
                16      num_cpus_online
                 1      num_dies
                 1      num_packages
                 1      smt_on
     2,496,000,000      system_tsc_freq

       0.001113637 seconds time elapsed

       0.001218000 seconds user
       0.000000000 seconds sys
```

And with hyperthreading disabled:
```
$ perf stat -e "has_pmem,num_cores,num_cpus,num_cpus_online,num_dies,num_packages,smt_on,system_tsc_freq" true

 Performance counter stats for 'true':

                 0      has_pmem
                 8      num_cores
                16      num_cpus
                 8      num_cpus_online
                 1      num_dies
                 1      num_packages
                 0      smt_on
     2,496,000,000      system_tsc_freq

       0.000802115 seconds time elapsed

       0.000000000 seconds user
       0.000806000 seconds sys
```

As zero matters for these values, in stat-display
should_skip_zero_counter only skip the zero value if it is not the
first aggregation index.

The tool event implementations are used in expr but not evaluated as
events for simplicity. Also core_wide isn't made a tool event as it
requires command line parameters.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241002032016.333748-8-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 23:40:32 -07:00
Ian Rogers
c798f72c7a perf pmu: Allow hardcoded terms to be applied to attributes
Hard coded terms like "config=10" are skipped by perf_pmu__config
assuming they were already applied to a perf_event_attr by parse
event's config_attr function. When doing a reverse number to name
lookup in perf_pmu__name_from_config, as the hardcoded terms aren't
applied the config value is incorrect leading to misses or false
matches. Fix this by adding a parameter to have perf_pmu__config apply
hardcoded terms too (not just in parse event's config_term_common).

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20241002032016.333748-3-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-10 23:40:32 -07:00
Thomas Falcon
9f759d41b3 perf test x86: Fix typo in intel-pt-test
Change function name "is_hydrid" to "is_hybrid".

Signed-off-by: Thomas Falcon <thomas.falcon@intel.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20241007194758.78659-1-thomas.falcon@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-09 10:52:08 -07:00
Leo Yan
703f344d0c perf arm-spe: Save per CPU information in metadata
Save the Arm SPE information on a per-CPU basis. This approach is easier
in the decoding phase for retrieving metadata based on the CPU number of
every Arm SPE record.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Besar Wicaksono <bwicaksono@nvidia.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241003184302.190806-4-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03 15:23:24 -07:00
Leo Yan
59715b1908 perf arm-spe: Calculate meta data size
The metadata is designed to contain a header and per CPU information.

The arm_spe_find_cpus() function is introduced to identify how many CPUs
support ARM SPE. Based on the CPU number, calculates the metadata size.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Besar Wicaksono <bwicaksono@nvidia.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241003184302.190806-3-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03 15:23:20 -07:00
Leo Yan
0ca2c45404 perf arm-spe: Define metadata header version 2
The first version's metadata header structure doesn't include a field to
indicate a header version, which is not friendly for extension.

Define the metadata version 2 format with a new header structure and
extend per CPU's metadata. In the meantime, the old metadata header will
still be supported for backward compatibility.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Besar Wicaksono <bwicaksono@nvidia.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20241003184302.190806-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-03 15:23:09 -07:00
Dapeng Mi
3b5edc0421 perf x86/topdown: Don't move topdown metric events in group
when running below perf command, we say error is reported.

perf record -e "{slots,instructions,topdown-retiring}:S" -vv -C0 sleep 1

------------------------------------------------------------
perf_event_attr:
  type                             4 (cpu)
  size                             168
  config                           0x400 (slots)
  sample_type                      IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER
  read_format                      ID|GROUP|LOST
  disabled                         1
  sample_id_all                    1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             4 (cpu)
  size                             168
  config                           0x8000 (topdown-retiring)
  { sample_period, sample_freq }   4000
  sample_type                      IP|TID|TIME|READ|CPU|PERIOD|IDENTIFIER
  read_format                      ID|GROUP|LOST
  freq                             1
  sample_id_all                    1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd 5  flags 0x8
sys_perf_event_open failed, error -22

Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument) for
event (topdown-retiring).

The reason of error is that the events are regrouped and
topdown-retiring event is moved to closely after the slots event and
topdown-retiring event needs to do the sampling, but Intel PMU driver
doesn't support to sample topdown metrics events.

For topdown metrics events, it just requires to be in a group which has
slots event as leader. It doesn't require topdown metrics event must be
closely after slots event. Thus it's a overkill to move topdown metrics
event closely after slots event in events regrouping and furtherly cause
the above issue.

Thus don't move topdown metrics events forward if they are already in a
group.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Yongwei Ma <yongwei.ma@intel.com>
Link: https://lore.kernel.org/r/20240913084712.13861-4-dapeng1.mi@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-30 15:23:44 -07:00
Dapeng Mi
1e53e9d178 perf x86/topdown: Correct leader selection with sample_read enabled
Addresses an issue where, in the absence of a topdown metrics event
within a sampling group, the slots event was incorrectly bypassed as
the sampling leader when sample_read was enabled.

perf record -e '{slots,branches}:S' -c 10000 -vv sleep 1

In this case, the slots event should be sampled as leader but the
branches event is sampled in fact like the verbose output shows.

perf_event_attr:
  type                             4 (cpu)
  size                             168
  config                           0x400 (slots)
  sample_type                      IP|TID|TIME|READ|CPU|IDENTIFIER
  read_format                      ID|GROUP|LOST
  disabled                         1
  sample_id_all                    1
  exclude_guest                    1
------------------------------------------------------------
sys_perf_event_open: pid -1  cpu 0  group_fd -1  flags 0x8 = 5
------------------------------------------------------------
perf_event_attr:
  type                             0 (PERF_TYPE_HARDWARE)
  size                             168
  config                           0x4 (PERF_COUNT_HW_BRANCH_INSTRUCTIONS)
  { sample_period, sample_freq }   10000
  sample_type                      IP|TID|TIME|READ|CPU|IDENTIFIER
  read_format                      ID|GROUP|LOST
  sample_id_all                    1
  exclude_guest                    1

The sample period of slots event instead of branches event is reset to
0.

This fix ensures the slots event remains the leader under these
conditions.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Yongwei Ma <yongwei.ma@intel.com>
Link: https://lore.kernel.org/r/20240913084712.13861-3-dapeng1.mi@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-30 15:23:44 -07:00
Dapeng Mi
39820ced2a perf x86/topdown: Complete topdown slots/metrics events check
It's not complete to check whether an event is a topdown slots or
topdown metrics event by only comparing the event name since user
may assign the event by RAW format, e.g.

perf stat -e '{instructions,cpu/r400/,cpu/r8300/}' sleep 1

 Performance counter stats for 'sleep 1':

     <not counted>      instructions
     <not counted>      cpu/r400/
   <not supported>      cpu/r8300/

       1.002917796 seconds time elapsed

       0.002955000 seconds user
       0.000000000 seconds sys

The RAW format slots and topdown-be-bound events are not recognized and
not regroup the events, and eventually cause error.

Thus add two helpers arch_is_topdown_slots()/arch_is_topdown_metrics()
to detect whether an event is topdown slots/metrics event by comparing
the event config directly, and use these two helpers to replace the
original event name comparisons.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Yongwei Ma <yongwei.ma@intel.com>
Link: https://lore.kernel.org/r/20240913084712.13861-2-dapeng1.mi@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-30 15:23:43 -07:00
Ian Rogers
d7d156fc5e perf evsel: Remove pmu_name
"evsel->pmu_name" is only ever assigned a strdup of "pmu->name", a
strdup of "evsel->pmu_name" or NULL. As such, prefer to use
"pmu->name" directly and even to directly compare PMUs than PMU
names. For safety, add some additional NULL tests.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
[ Fix arm-spe.c usage of pmu_name and empty PMU name ]
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: ak@linux.intel.com
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240926144851.245903-6-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-26 13:26:11 -07:00
Ian Rogers
e2216fac1e perf evsel x86: Make evsel__has_perf_metrics work for legacy events
Use PMU interface to better detect core PMU for legacy events. Look
for slots event on core PMU if it is appropriate for the event.

Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: ak@linux.intel.com
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240926144851.245903-5-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-26 13:26:11 -07:00
Ian Rogers
d38461e977 perf stat: Remove evlist__add_default_attrs use strings
add_default_atttributes would add evsels by having pre-created
perf_event_attr, however, this needed fixing for hybrid as the
extended PMU type was necessary for each core PMU. The logic for this
was in an arch specific x86 function and wasn't present for ARM,
meaning that default events weren't being opened on all PMUs on
ARM. Change the creation of the default events to use parse_events and
strings as that will open the events on all PMUs.

Rather than try to detect events on PMUs before parsing, parse the
event but skip its output in stat-display.

The previous order of hardware events was: cycles,
stalled-cycles-frontend, stalled-cycles-backend, instructions. As
instructions is a more fundamental concept the order is changed to:
instructions, cycles, stalled-cycles-frontend, stalled-cycles-backend.

Closes: https://lore.kernel.org/lkml/CAP-5=fVABSBZnsmtRn1uF-k-G1GWM-L5SgiinhPTfHbQsKXb_g@mail.gmail.com/
Acked-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
[Don't display unsupported default events except 'cycles']
Acked-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Yang Jihong <yangjihong@bytedance.com>
Cc: Dominique Martinet <asmadeus@codewreck.org>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: ak@linux.intel.com
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240926144851.245903-4-james.clark@linaro.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-26 13:26:11 -07:00
Arnaldo Carvalho de Melo
0e7eb23668 perf tools: Build x86 32-bit syscall table from arch/x86/entry/syscalls/syscall_32.tbl
To remove one more use of the audit libs and address a problem reported
with a recent change where a function isn't available when using the
audit libs method, that should really go away, this being one step in
that direction.

The script used to generate the 64-bit syscall table was already
parametrized to generate for both 64-bit and 32-bit, so just use it and
wire the generated table to the syscalltbl.c routines.

Reported-by: Jiri Slaby <jirislaby@kernel.org>
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/6fe63fa3-6c63-4b75-ac09-884d26f6fb95@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-02 11:13:40 -03:00
James Clark
940007cee5 perf: cs-etm: Only save valid trace IDs into files
This isn't a bug because Perf always masks with
CORESIGHT_TRACE_ID_VAL_MASK before using these values, but to avoid it
looking like it could be, make an effort to not save bad values.

Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Signed-off-by: James Clark <james.clark@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-6-james.clark@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29 15:55:52 -03:00
James Clark
19c3e4db38 perf: cs-etm: Create decoders based on the trace ID mappings
Now that each queue has a unique set of trace ID mappings, use this
list to create the decoders. In unformatted mode just add a single
mapping so only one decoder is made.

Previously each queue would have a decoder created for each traced CPU
on the system but this won't work anymore because CPUs can have
overlapping trace IDs.

This also means that the CORESIGHT_TRACE_ID_UNUSED_FLAG isn't needed
any more. If mappings aren't added then decoders aren't created, rather
than needing a flag to suppress creation.

Reviewed-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240722101202.26915-5-james.clark@linaro.org
Signed-off-by: James Clark <james.clark@linaro.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-29 15:55:24 -03:00
Leo Yan
d5726f1c8d perf auxtrace: Remove unused 'pmu' pointer from struct auxtrace_record
The 'pmu' pointer in the auxtrace_record structure is not used after
support multiple AUX events, remove it.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240806204130.720977-3-leo.yan@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-28 18:15:16 -03:00
Namhyung Kim
1cfd01eb60 perf annotate-data: Copy back variable types after move
In some cases, compilers don't set the location expression in DWARF
precisely.  For instance, it may assign a variable to a register after
copying it from a different register.  Then it should use the register
for the new type but still uses the old register.  This makes hard to
track the type information properly.

This is an example I found in __tcp_transmit_skb().  The first argument
(sk) of this function is a pointer to sock and there's a variable (tp)
for tcp_sock.

  static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
  				int clone_it, gfp_t gfp_mask, u32 rcv_nxt)
  {
  	...
  	struct tcp_sock *tp;

  	BUG_ON(!skb || !tcp_skb_pcount(skb));
  	tp = tcp_sk(sk);
  	prior_wstamp = tp->tcp_wstamp_ns;
  	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
  	...

So it basically calls tcp_sk(sk) to get the tcp_sock pointer from sk.
But it turned out to be the same value because tcp_sock embeds sock as
the first member.  The sk is located in reg5 (RDI) and tp is in reg3
(RBX).  The offset of tcp_wstamp_ns is 0x748 and tcp_clock_cache is
0x750.  So you need to use RBX (reg3) to access the fields in the
tcp_sock.  But the code used RDI (reg5) as it has the same value.

  $ pahole --hex -C tcp_sock vmlinux | grep -e 748 -e 750
	u64                tcp_wstamp_ns;        /* 0x748   0x8 */
	u64                tcp_clock_cache;      /* 0x750   0x8 */

And this is the disassembly of the part of the function.

  <__tcp_transmit_skb>:
  ...
  44:  mov    %rdi, %rbx
  47:  mov    0x748(%rdi), %rsi
  4e:  mov    0x750(%rdi), %rax
  55:  cmp    %rax, %rsi

Because compiler put the debug info to RBX, it only knows RDI is a
pointer to sock and accessing those two fields resulted in error
due to offset being beyond the type size.

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  CU for net/ipv4/tcp_output.c (die:0x817f543)
  frame base: cfa=0 fbreg=6
  scope: [1/1] (die:81aac3e)
  bb: [0 - 30]
  var [0] -0x98(stack) type='struct tcp_out_options' size=0x28 (die:0x81af3df)
  var [5] reg8 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg2 type='unsigned int' size=0x4 (die:0x8180ed6)
  var [5] reg1 type='int' size=0x4 (die:0x818059e)
  var [5] reg4 type='struct sk_buff*' size=0x8 (die:0x8181360)
  var [5] reg5 type='struct sock*' size=0x8 (die:0x8181a0c)                   <<<--- the first argument ('sk' at %RDI)
  mov [19] reg8 -> -0xa8(stack) type='unsigned int' size=0x4 (die:0x8180ed6)
  mov [20] stack canary -> reg0
  mov [29] reg0 -> -0x30(stack) stack canary
  bb: [36 - 3e]
  mov [36] reg4 -> reg15 type='struct sk_buff*' size=0x8 (die:0x8181360)
  bb: [44 - 63]
  mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)          <<<--- calling tcp_sk()
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)              <<<--- new variable ('tp' at %RBX)
  var [4e] reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct sock*) : offset bigger than size    <<<--- access with old variable
  final result: offset bigger than size

While it's a fault in the compiler, we could work around this issue by
using the type of new variable when it's copied directly.  So I've added
copied_from field in the register state to track those direct register
to register copies.  After that new register gets a new type and the old
register still has the same type, it'll update (copy it back) the type
of the old register.

For example, if we can update type of reg5 at __tcp_transmit_skb+0x47,
we can find the target type of the instruction at 0x63 like below:

  -----------------------------------------------------------
  find data type for 0x748(reg5) at __tcp_transmit_skb+0x63
  ...
  bb: [44 - 63]
  mov [44] reg5 -> reg3 type='struct sock*' size=0x8 (die:0x8181a0c)
  var [47] reg3 type='struct tcp_sock*' size=0x8 (die:0x819eead)
  var [47] copyback reg5 type='struct tcp_sock*' size=0x8 (die:0x819eead)     <<<--- here
  mov [47] 0x748(reg5) -> reg4 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [4e] 0x750(reg5) -> reg0 type='unsigned long long' size=0x8 (die:0x8180edd)
  mov [58] reg4 -> -0xc0(stack) type='unsigned long long' size=0x8 (die:0x8180edd)
  chk [63] reg5 offset=0x748 ok=1 kind=1 (struct tcp_sock*) : Good!           <<<--- new type
  found by insn track: 0x748(reg5) type-offset=0x748
  final result:  type='struct tcp_sock' size=0xa98 (die:0x819eeb2)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821232628.353177-5-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-22 12:38:18 -03:00
Namhyung Kim
4d6d6e0f61 perf annotate-data: Fix percpu pointer check
In check_matching_type(), it checks the type state of the register in a
wrong order.  When it's the percpu pointer, it should check the type for
the pointer, but it checks the CFA bit first and thought it has no type
in the stack slot.  This resulted in no type info.

  -----------------------------------------------------------
  find data type for 0x28(reg1) at hrtimer_reprogram+0x88
  CU for kernel/time/hrtimer.c (die:0x18f219f)
  frame base: cfa=1 fbreg=7
  ...
  add [72] percpu 0x24500 -> reg1 pointer type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)
  bb: [7a - 7e]
  bb: [80 - 86]                        (here)
  bb: [88 - 88]                         vvv
  chk [88] reg1 offset=0x28 ok=1 kind=4 cfa : no type information
  no type information

Here, instruction at 0x72 found reg1 has a (percpu) pointer and got the
correct type.  But when it checks the final result, it wrongly thought
it was stack variable because it checks the cfa bit first.

After changing the order of state check:
  -----------------------------------------------------------
  find data type for 0x28(reg1) at hrtimer_reprogram+0x88
  CU for kernel/time/hrtimer.c (die:0x18f219f)
  frame base: cfa=1 fbreg=7
  ...                                     (here)
                                        vvvvvvvvvv
  chk [88] reg1 offset=0x28 ok=1 kind=4 percpu ptr : Good!
  found by insn track: 0x28(reg1) type-offset=0x28
  final type: type='struct hrtimer_cpu_base' size=0x240 (die:0x18f6d46)

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21 11:30:38 -03:00
Namhyung Kim
922ec313f0 perf annotate-data: Fix missing constant copy
I found it missed to copy the immediate constant when it moves the
register value.  This could result in a wrong type inference since the
address for the per-cpu variable would be 0 always.

Fixes: eb9190afae ("perf annotate-data: Handle ADD instructions")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240821065408.285548-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-21 11:27:18 -03:00
Arnaldo Carvalho de Melo
3bce87eb74 Merge remote-tracking branch 'torvalds/master' into perf-tools-next
To pick up the latest perf-tools merge for 6.11, i.e. to have the
current perf tools branch that is getting into 6.11 with the
perf-tools-next that is geared towards 6.12.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-16 19:43:16 -03:00
Weilin Wang
8db5cabcf1 perf stat: Fork and launch 'perf record' when 'perf stat' needs to get retire latency value for a metric.
When retire_latency value is used in a metric formula, evsel would fork
a 'perf record' process with "-e" and "-W" options. 'perf record' will
collect required retire_latency values in parallel while 'perf stat' is
collecting counting values.

At the point of time that 'perf stat' stops counting, evsel would stop
'perf record' by sending sigterm signal to 'perf record' process.
Sampled data will be processed to get retire latency value. Another
thread is required to synchronize between 'perf stat' and 'perf record'
when we pass data through pipe.

Retire_latency evsel is not opened for 'perf stat' so that there is no
counter wasted on it. This commit includes code suggested by Namhyung to
adjust reading size for groups that include retire_latency evsels.

In current :R parsing implementation, the parser would recognize events
with retire_latency modifier and insert them into the evlist like a
normal event.  Ideally, we need to avoid counting these events.

In this commit, at the time when a retire_latency evsel is read, set the
retire latency value processed from the sampled data to count value.
This sampled retire latency value will be used for metric calculation
and final event count print out. No special metric calculation and event
print out code required for retire_latency events.

Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Weilin Wang <weilin.wang@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Link: https://lore.kernel.org/r/20240720062102.444578-4-weilin.wang@intel.com
[ Squashed the 3rd and 4th commit in the series to keep it building patch by patch ]
[ Constified the 'struct perf_tool' pointer in process_sample_event() ]
[ Use perf_tool__init(&tool, false) to address a segfault I reported and Ian/Weilin diagnosed ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-13 15:24:48 -03:00
Ian Rogers
30f29bae91 perf tool: Constify tool pointers
The tool pointer (to a struct largely of function pointers) is passed
around but is unchanged except at initialization. Change parameter and
variable types to be const to lower the possibilities of what could
happen with a tool.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Adrian Hunter <adrian.hunter@intel.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12 18:05:14 -03:00
Namhyung Kim
568901e709 tools/include: Sync uapi/asm-generic/unistd.h with the kernel sources
And arch syscall tables to pick up changes from:

  b1e31c134a powerpc: restore some missing spu syscalls
  d3882564a7 syscalls: fix compat_sys_io_pgetevents_time64 usage
  54233a4254 uretprobe: change syscall number, again
  63ded11097 uprobe: Change uretprobe syscall scope and number
  9142be9e64 x86/syscall: Mark exit[_group] syscall handlers __noreturn
  9aae1baa1c x86, arm: Add missing license tag to syscall tables files
  5c28424e9a syscalls: Fix to add sys_uretprobe to syscall.tbl
  190fec72df uprobe: Wire up uretprobe system call

This should be used to beautify syscall arguments and it addresses
these tools/perf build warnings:

  Warning: Kernel ABI header differences:
  diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
  diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
  diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
  diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl

Please see tools/include/uapi/README for details (it's in the first patch
of this series).

Cc: Arnd Bergmann <arnd@arndb.de>
Cc: linux-arch@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-08-07 10:58:51 -07:00
Leo Yan
1635bdca4b perf arm-spe: Support multiple Arm SPE events
As the flag 'auxtrace' has been set for Arm SPE events, now it is ready
to use evsel__is_aux_event() to check if an event is AUX trace event or
not. Use this function to replace the old checking for only the first
Arm SPE event.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc:  <coresight@lists.linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc:  <linux-perf-users@vger.kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-01 12:11:32 -03:00
Leo Yan
ccd6fcda25 perf arm-spe: Extract evsel setting up
The evsel for Arm SPE PMU needs to be set up. Extract the setting up
into a function arm_spe_setup_evsel().

Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc:  <coresight@lists.linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc:  <linux-perf-users@vger.kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-01 12:11:32 -03:00
Adrian Hunter
c91928a8d5 perf tools: Enable evsel__is_aux_event() to work for ARM/ARM64
Set pmu->auxtrace on ARM/ARM64 AUX area PMUs. evsel__is_aux_event() needs
the setting to identify AUX area tracing selected events.

Currently, the features that use evsel__is_aux_event() are used only by
Intel PT, but that may change in the future.

Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Hendrik Brueckner <brueckner@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20240715160712.127117-6-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:58:18 -03:00
Athira Rajeev
2c9db7475e perf annotate: Set instruction name to be used with insn-stat when using raw instruction
Since the "ins.name" is not set while using raw instruction,
'perf annotate' with insn-stat gives wrong data:

Result from "./perf annotate --data-type --insn-stat":

  Annotate Instruction stats
  total 615, ok 419 (68.1%), bad 196 (31.9%)

    Name      :  Good   Bad
    -----------------------------------------------------------
              :   419   196

This patch sets "dl->ins.name" in arch specific function
"check_ppc_insn" while initialising "struct disasm_line".

Also update "ins_find" function to pass "struct disasm_line" as a
parameter so as to set its name field in arch specific call.

With the patch changes:

  Annotate Instruction stats
  total 609, ok 446 (73.2%), bad 163 (26.8%)

  Name/opcode         :  Good   Bad
  -----------------------------------------------------------
  58                  :   323    80
  32                  :    49    43
  34                  :    33    11
  OP_31_XOP_LDX       :     8    20
  40                  :    23     0
  OP_31_XOP_LWARX     :     5     1
  OP_31_XOP_LWZX      :     2     3
  OP_31_XOP_LDARX     :     3     0
  33                  :     0     2
  OP_31_XOP_LBZX      :     0     1
  OP_31_XOP_LWAX      :     0     1
  OP_31_XOP_LHZX      :     0     1

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-16-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
88444952bd perf annotate: Update instruction tracking for powerpc
Add instruction tracking function "update_insn_state_powerpc" for
powerpc. Example sequence in powerpc:

  ld      r10,264(r3)
  mr      r31,r3
  <<after some sequence>
  ld      r9,312(r31)

Consider ithe sample is pointing to: "ld r9,312(r31)".

Here the memory reference is hit at "312(r31)" where 312 is the offset
and r31 is the source register.

Previous instruction sequence shows that register state of r3 is moved
to r31.

So to identify the data type for r31 access, the previous instruction
("mr") needs to be tracked and the state type entry has to be updated.

Current instruction tracking support in perf tools infrastructure is
specific to x86. Patch adds this support for powerpc as well.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-12-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
539bfea3e0 perf annotate: Add more instructions for instruction tracking
Add few more instructions and use opcode as search key
to find if it is supported by the architecture.

The added ones are: addi, addic, addic., addis, subfic and mulli

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-11-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
cd0b6f67c4 perf annotate: Add some of the arithmetic instructions to support instruction tracking in powerpc
Data-type profiling has the concept of instruction tracking.

Example sequence in powerpc:

	ld      r10,264(r3)
	mr      r31,r3
	<<after some sequence>
	ld      r9,312(r31)

or differently

	lwz	r10,264(r3)
	add	r31, r3, RB
	lwz	r9, 0(r31)

If a sample is hit at "lwz r9, 0(r31)", data type of r31 depends
on previous instruction sequence here. So to track the previous
instructions, patch adds changes to identify some of the arithmetic
instructions which are having opcode as 31.

Since memory instructions also has cases with opcode 31, use the bits
22:30 to filter the arithmetic instructions here.

Also there are instructions with just two operands like "addme", "addze".

This patch adds new instructions ops "arithmetic_ops" to handle this

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-10-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
ace7d681d8 perf annotate: Add support to identify memory instructions of opcode 31 in powerpc
There are memory instructions in powerpc with opcode as 31.
Example: "ldx RT,RA,RB" , Its X form is as below:

  ______________________________________
  | 31 |  RT  |  RA |  RB |   21     |/|
  --------------------------------------
  0    6     11    16    21         30 31

The opcode for "ldx" is 31. There are other instructions also with
opcode 31 which are memory insn like ldux, stbx, lwzx, lhaux
But all instructions with opcode 31 are not memory. Example is add
instruction: "add RT,RA,RB"

The value in bit 21-30 [ 21 for ldx ] is different for these
instructions. Patch uses this value to assign instruction ops for these
cases. The naming convention and value to identify these are picked from
defines in "arch/powerpc/include/asm/ppc-opcode.h"

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-9-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
1acdad6818 perf annotate: Add parse function for memory instructions in powerpc
Use the raw instruction code and macros to identify memory instructions,
extract register fields and also offset.

The implementation addresses the D-form, X-form, DS-form instructions.
Two main functions are added.

New parse function "load_store__parse" as instruction ops parser for
memory instructions.

Unlike other parsers (like mov__parse), this one fills in the
"multi_regs" field for source/target and new added "mem_ref" field. No
other fields are set because, here there is no need to parse the
disassembled code and arch specific macros will take care of extracting
offset and regs which is easier and will be precise.

In powerpc, all instructions with a primary opcode from 32 to 63
are memory instructions. Update "ins__find" function to have "raw_insn"
also as a parameter.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-8-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
1b4406d2a8 perf annotate: Update parameters for reg extract functions to use raw instruction on powerpc
Use the raw instruction code and macros to identify memory instructions,
extract register fields and also offset.

The implementation addresses the D-form, X-form, DS-form instructions.

Adds "mem_ref" field to check whether source/target has memory
reference.

Add function "get_powerpc_regs" which will set these fields: reg1, reg2,
offset depending of where it is source or target ops.

Update "parse" callback for "struct ins_ops" to also pass "struct
disasm_line" as argument. This is needed in parse functions where opcode
is used to determine whether to set multi_regs and other fields

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-7-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
06dd4c5a56 perf annotate: Add disasm_line__parse() to parse raw instruction for powerpc
Currently, the perf tool infrastructure uses the disasm_line__parse
function to parse disassembled line.

Example snippet from objdump:

  objdump  --start-address=<address> --stop-address=<address>  -d --no-show-raw-insn -C <vmlinux>

  c0000000010224b4:	lwz     r10,0(r9)

This line "lwz r10,0(r9)" is parsed to extract instruction name,
registers names and offset.

In powerpc, the approach for data type profiling uses raw instruction
instead of result from objdump to identify the instruction category and
extract the source/target registers.

Example: 38 01 81 e8     ld      r4,312(r1)

Here "38 01 81 e8" is the raw instruction representation. Add function
"disasm_line__parse_powerpc" to handle parsing of raw instruction.
Also update "struct disasm_line" to save the binary code/
With the change, function captures:

line -> "38 01 81 e8     ld      r4,312(r1)"
raw instruction "38 01 81 e8"

Raw instruction is used later to extract the reg/offset fields. Macros
are added to extract opcode and register fields. "struct disasm_line"
is updated to carry union of "bytes" and "raw_insn" of 32 bit to carry raw
code (raw).

Function "disasm_line__parse_powerpc fills the raw instruction hex value
and can use macros to get opcode. There is no changes in existing code
paths, which parses the disassembled code.  The size of raw instruction
depends on architecture.

In case of powerpc, the parsing the disasm line needs to handle cases
for reading binary code directly from DSO as well as parsing the objdump
result. Hence adding the logic into separate function instead of
updating "disasm_line__parse".  The architecture using the instruction
name and present approach is not altered. Since this approach targets
powerpc, the macro implementation is added for powerpc as of now.

Since the disasm_line__parse is used in other cases (perf annotate) and
not only data tye profiling, the powerpc callback includes changes to
work with binary code as well as mnemonic representation.

Also in case if the DSO read fails and libcapstone is not supported, the
approach fallback to use objdump as option. Hence as option, patch has
changes to ensure objdump option also works well.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-5-atrajeev@linux.vnet.ibm.com
[ Add check for strndup() result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Athira Rajeev
782959ac24 perf annotate: Add "update_insn_state" callback function to handle arch specific instruction tracking
Add "update_insn_state" callback to "struct arch" to handle instruction
tracking. Currently updating instruction state is handled by static
function "update_insn_state_x86" which is defined in "annotate-data.c".

Make this as a callback for specific arch and move to archs specific
file "arch/x86/annotate/instructions.c" . This will help to add helper
function for other platforms in file:
"arch/<platform>/annotate/instructions.c" and make changes/updates
easier.

Define callback "update_insn_state" as part of "struct arch", also make
some of the debug functions non-static so that it can be referenced from
other places.

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Kajol Jain <kjain@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Akanksha J N <akanksha@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Hari Bathini <hbathini@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Segher Boessenkool <segher@kernel.crashing.org>
Link: https://lore.kernel.org/lkml/20240718084358.72242-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-07-31 16:12:59 -03:00
Linus Torvalds
2c9b351240 ARM:
* Initial infrastructure for shadow stage-2 MMUs, as part of nested
   virtualization enablement
 
 * Support for userspace changes to the guest CTR_EL0 value, enabling
   (in part) migration of VMs between heterogenous hardware
 
 * Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
   the protocol
 
 * FPSIMD/SVE support for nested, including merged trap configuration
   and exception routing
 
 * New command-line parameter to control the WFx trap behavior under KVM
 
 * Introduce kCFI hardening in the EL2 hypervisor
 
 * Fixes + cleanups for handling presence/absence of FEAT_TCRX
 
 * Miscellaneous fixes + documentation updates
 
 LoongArch:
 
 * Add paravirt steal time support.
 
 * Add support for KVM_DIRTY_LOG_INITIALLY_SET.
 
 * Add perf kvm-stat support for loongarch.
 
 RISC-V:
 
 * Redirect AMO load/store access fault traps to guest
 
 * perf kvm stat support
 
 * Use guest files for IMSIC virtualization, when available
 
 ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
 extensions is coming through the RISC-V tree.
 
 s390:
 
 * Assortment of tiny fixes which are not time critical
 
 x86:
 
 * Fixes for Xen emulation.
 
 * Add a global struct to consolidate tracking of host values, e.g. EFER
 
 * Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
   bus frequency, because TDX.
 
 * Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
 
 * Clean up KVM's handling of vendor specific emulation to consistently act on
   "compatible with Intel/AMD", versus checking for a specific vendor.
 
 * Drop MTRR virtualization, and instead always honor guest PAT on CPUs
   that support self-snoop.
 
 * Update to the newfangled Intel CPU FMS infrastructure.
 
 * Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads
   '0' and writes from userspace are ignored.
 
 * Misc cleanups
 
 x86 - MMU:
 
 * Small cleanups, renames and refactoring extracted from the upcoming
   Intel TDX support.
 
 * Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't
   hold leafs SPTEs.
 
 * Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager
   page splitting, to avoid stalling vCPUs when splitting huge pages.
 
 * Bug the VM instead of simply warning if KVM tries to split a SPTE that is
   non-present or not-huge.  KVM is guaranteed to end up in a broken state
   because the callers fully expect a valid SPTE, it's all but dangerous
   to let more MMU changes happen afterwards.
 
 x86 - AMD:
 
 * Make per-CPU save_area allocations NUMA-aware.
 
 * Force sev_es_host_save_area() to be inlined to avoid calling into an
   instrumentable function from noinstr code.
 
 * Base support for running SEV-SNP guests.  API-wise, this includes
   a new KVM_X86_SNP_VM type, encrypting/measure the initial image into
   guest memory, and finalizing it before launching it.  Internally,
   there are some gmem/mmu hooks needed to prepare gmem-allocated pages
   before mapping them into guest private memory ranges.
 
   This includes basic support for attestation guest requests, enough to
   say that KVM supports the GHCB 2.0 specification.
 
   There is no support yet for loading into the firmware those signing
   keys to be used for attestation requests, and therefore no need yet
   for the host to provide certificate data for those keys.  To support
   fetching certificate data from userspace, a new KVM exit type will be
   needed to handle fetching the certificate from userspace. An attempt to
   define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle
   this was introduced in v1 of this patchset, but is still being discussed
   by community, so for now this patchset only implements a stub version
   of SNP Extended Guest Requests that does not provide certificate data.
 
 x86 - Intel:
 
 * Remove an unnecessary EPT TLB flush when enabling hardware.
 
 * Fix a series of bugs that cause KVM to fail to detect nested pending posted
   interrupts as valid wake eents for a vCPU executing HLT in L2 (with
   HLT-exiting disable by L1).
 
 * KVM: x86: Suppress MMIO that is triggered during task switch emulation
 
   Explicitly suppress userspace emulated MMIO exits that are triggered when
   emulating a task switch as KVM doesn't support userspace MMIO during
   complex (multi-step) emulation.  Silently ignoring the exit request can
   result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to
   userspace for some other reason prior to purging mmio_needed.
 
   See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write exits if
   emulator detects exception") for more details on KVM's limitations with
   respect to emulated MMIO during complex emulator flows.
 
 Generic:
 
 * Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE,
   because the special casing needed by these pages is not due to just
   unmovability (and in fact they are only unmovable because the CPU cannot
   access them).
 
 * New ioctl to populate the KVM page tables in advance, which is useful to
   mitigate KVM page faults during guest boot or after live migration.
   The code will also be used by TDX, but (probably) not through the ioctl.
 
 * Enable halt poll shrinking by default, as Intel found it to be a clear win.
 
 * Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.
 
 * Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().
 
 * Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.
 
 * Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.
 
 Selftests:
 
 * Remove dead code in the memslot modification stress test.
 
 * Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
 
 * Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
   log for tests that create lots of VMs.
 
 * Make the PMU counters test less flaky when counting LLC cache misses by
   doing CLFLUSH{OPT} in every loop iteration.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaZQB0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNkZwf/bv2jiENaLFNGPe/VqTKMQ6PHQLMG
 +sNHx6fJPP35gTM8Jqf0/7/ummZXcSuC1mWrzYbecZm7Oeg3vwNXHZ4LquwwX6Dv
 8dKcUzLbWDAC4WA3SKhi8C8RV2v6E7ohy69NtAJmFWTc7H95dtIQm6cduV2osTC3
 OEuHe1i8d9umk6couL9Qhm8hk3i9v2KgCsrfyNrQgLtS3hu7q6yOTR8nT0iH6sJR
 KE5A8prBQgLmF34CuvYDw4Hu6E4j+0QmIqodovg2884W1gZQ9LmcVqYPaRZGsG8S
 iDdbkualLKwiR1TpRr3HJGKWSFdc7RblbsnHRvHIZgFsMQiimh4HrBSCyQ==
 =zepX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Initial infrastructure for shadow stage-2 MMUs, as part of nested
     virtualization enablement

   - Support for userspace changes to the guest CTR_EL0 value, enabling
     (in part) migration of VMs between heterogenous hardware

   - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
     of the protocol

   - FPSIMD/SVE support for nested, including merged trap configuration
     and exception routing

   - New command-line parameter to control the WFx trap behavior under
     KVM

   - Introduce kCFI hardening in the EL2 hypervisor

   - Fixes + cleanups for handling presence/absence of FEAT_TCRX

   - Miscellaneous fixes + documentation updates

  LoongArch:

   - Add paravirt steal time support

   - Add support for KVM_DIRTY_LOG_INITIALLY_SET

   - Add perf kvm-stat support for loongarch

  RISC-V:

   - Redirect AMO load/store access fault traps to guest

   - perf kvm stat support

   - Use guest files for IMSIC virtualization, when available

  s390:

   - Assortment of tiny fixes which are not time critical

  x86:

   - Fixes for Xen emulation

   - Add a global struct to consolidate tracking of host values, e.g.
     EFER

   - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
     effective APIC bus frequency, because TDX

   - Print the name of the APICv/AVIC inhibits in the relevant
     tracepoint

   - Clean up KVM's handling of vendor specific emulation to
     consistently act on "compatible with Intel/AMD", versus checking
     for a specific vendor

   - Drop MTRR virtualization, and instead always honor guest PAT on
     CPUs that support self-snoop

   - Update to the newfangled Intel CPU FMS infrastructure

   - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
     it reads '0' and writes from userspace are ignored

   - Misc cleanups

  x86 - MMU:

   - Small cleanups, renames and refactoring extracted from the upcoming
     Intel TDX support

   - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
     that can't hold leafs SPTEs

   - Unconditionally drop mmu_lock when allocating TDP MMU page tables
     for eager page splitting, to avoid stalling vCPUs when splitting
     huge pages

   - Bug the VM instead of simply warning if KVM tries to split a SPTE
     that is non-present or not-huge. KVM is guaranteed to end up in a
     broken state because the callers fully expect a valid SPTE, it's
     all but dangerous to let more MMU changes happen afterwards

  x86 - AMD:

   - Make per-CPU save_area allocations NUMA-aware

   - Force sev_es_host_save_area() to be inlined to avoid calling into
     an instrumentable function from noinstr code

   - Base support for running SEV-SNP guests. API-wise, this includes a
     new KVM_X86_SNP_VM type, encrypting/measure the initial image into
     guest memory, and finalizing it before launching it. Internally,
     there are some gmem/mmu hooks needed to prepare gmem-allocated
     pages before mapping them into guest private memory ranges

     This includes basic support for attestation guest requests, enough
     to say that KVM supports the GHCB 2.0 specification

     There is no support yet for loading into the firmware those signing
     keys to be used for attestation requests, and therefore no need yet
     for the host to provide certificate data for those keys.

     To support fetching certificate data from userspace, a new KVM exit
     type will be needed to handle fetching the certificate from
     userspace.

     An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
     exit type to handle this was introduced in v1 of this patchset, but
     is still being discussed by community, so for now this patchset
     only implements a stub version of SNP Extended Guest Requests that
     does not provide certificate data

  x86 - Intel:

   - Remove an unnecessary EPT TLB flush when enabling hardware

   - Fix a series of bugs that cause KVM to fail to detect nested
     pending posted interrupts as valid wake eents for a vCPU executing
     HLT in L2 (with HLT-exiting disable by L1)

   - KVM: x86: Suppress MMIO that is triggered during task switch
     emulation

     Explicitly suppress userspace emulated MMIO exits that are
     triggered when emulating a task switch as KVM doesn't support
     userspace MMIO during complex (multi-step) emulation

     Silently ignoring the exit request can result in the
     WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
     for some other reason prior to purging mmio_needed

     See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write
     exits if emulator detects exception") for more details on KVM's
     limitations with respect to emulated MMIO during complex emulator
     flows

  Generic:

   - Rename the AS_UNMOVABLE flag that was introduced for KVM to
     AS_INACCESSIBLE, because the special casing needed by these pages
     is not due to just unmovability (and in fact they are only
     unmovable because the CPU cannot access them)

   - New ioctl to populate the KVM page tables in advance, which is
     useful to mitigate KVM page faults during guest boot or after live
     migration. The code will also be used by TDX, but (probably) not
     through the ioctl

   - Enable halt poll shrinking by default, as Intel found it to be a
     clear win

   - Setup empty IRQ routing when creating a VM to avoid having to
     synchronize SRCU when creating a split IRQCHIP on x86

   - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
     a flag that arch code can use for hooking both sched_in() and
     sched_out()

   - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
     truncating a bogus value from userspace, e.g. to help userspace
     detect bugs

   - Mark a vCPU as preempted if and only if it's scheduled out while in
     the KVM_RUN loop, e.g. to avoid marking it preempted and thus
     writing guest memory when retrieving guest state during live
     migration blackout

  Selftests:

   - Remove dead code in the memslot modification stress test

   - Treat "branch instructions retired" as supported on all AMD Family
     17h+ CPUs

   - Print the guest pseudo-RNG seed only when it changes, to avoid
     spamming the log for tests that create lots of VMs

   - Make the PMU counters test less flaky when counting LLC cache
     misses by doing CLFLUSH{OPT} in every loop iteration"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-07-20 12:41:03 -07:00
Ian Rogers
1553419c3c perf dso: Fix address sanitizer build
Various files had been missed from having accessor functions added for
the sake of dso reference count checking. Add the function calls and
missing dso accessor functions.

Fixes: ee756ef749 ("perf dso: Add reference count checking and accessor functions")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Yunseong Kim <yskelg@gmail.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Link: https://lore.kernel.org/r/20240704011745.1021288-1-irogers@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:41 -07:00
Leo Yan
e6b4da6759 perf arm-spe: Support multiple Arm SPE PMUs
A platform can have more than one Arm SPE PMU. For example, a system
with multiple clusters may have each cluster enabled with its own Arm
SPE instance. In such case, the PMU devices will be named 'arm_spe_0',
'arm_spe_1', and so on.

Currently, the tool only supports 'arm_spe_0'. This commit extends
support to multiple Arm SPE PMUs by detecting the substring 'arm_spe_'.

Signed-off-by: Leo Yan <leo.yan@arm.com>
Reviewed-by: James Clark <james.clark@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20240706152035.86983-2-leo.yan@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:40 -07:00
Haoze Xie
759ce73cf7 perf build x86: Fix SC2034 error in syscalltbl.sh
Change the unused var in 'arch/x86/entry/syscalls/syscalltbl.sh' to '_'
when reading from '$sorted_table'. This change allows the script to pass
tests of ShellCheck before and after version 0.7.2 at the same time.

When building in arch x86, syscalltbl.sh got a ShellCheck warning, which
makes compilation error:

    In arch/x86/entry/syscalls/syscalltbl.sh line 27:
    while read nr _abi name entry _compat; do
                  ^-^ SC2034: abi appears unused.
                  Verify use (or export if used externally).
                                  ^----^ SC2034: compat appears unused.
                               Verify use (or export if used externally).

The script reads unused param abi and compat. It uses format '_xxx' to
indicate dummy vars, which won't work properly when ShellCheck <= 0.7.2.

According to SC2034, the more general way of writing is to use directly
'_' to indicate discarding vars. 'entry' is also replaced by '_' because
it just happens to be defined in emit function, otherwise it will lead
to some misunderstandings.

Link: https://www.shellcheck.net/wiki/SC2034
Signed-off-by: Haoze Xie <royenheart@gmail.com>
Signed-off-by: Yuan Tan <tanyuan@tinylab.org>
Link: https://lore.kernel.org/r/2143cab4cd8468c88860f4e5e382d0e6b4d89ac9.1720372178.git.royenheart@gmail.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-12 09:38:40 -07:00
Paolo Bonzini
c8b8b8190a LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
 2. Add some VM migration enhancement.
 3. Add perf kvm-stat support for loongarch.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmaOS6UWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImehejD/9pACGe3h3krXLcFVWXOFIu5Hpc
 5kQLP0lSPJ/o5Xs8t/oPLrnDX70z90wXI1LOmltc7h32MSwFa2l8COQh+sN5eJBQ
 PNyt7u7bMipp0yJS4Gl3LQQ5vklcGOSpQc/gbeXnVx8J/tz+Mo9YGGLIXVRXRM6W
 Ri8D2VVFiwzQQYeTpPo1u1Ob8C6mA4KOppwvhscMTM3vj4NMbsinBzRnR0lG0Tdw
 meFhxDPly1Ksxsbnj9UGO6UnEY0A2SLONs6MiO4y4DtoqoDlw/lbqFJuYo4vvbx1
 pxtjyirD/PX/wjslQFWUOuU0hMfAodera+JupZ5BZWfcG8FltA4DQfDsm/U9RjK/
 7gGNnr8Xk2/tp6+4AVV+HU2iTgRvq+mXCL72zSy2Y4r7ElBAANDfk4n+Zn/PWisn
 U9wwV8Ue7tVB15BRpRsg77NzBidiCFEe/6flWYiX2y24ke71gwDJBGUy8hMdKt6t
 4Cq8atsU0MvDAzfYMsK9JjskJp4UFq6wb1tXbbuADM4TDhnzlK6s6h3vM+pFlh/f
 my7fDH8/2qsCWhBDM4pmsJskVp+I1GOk/80RjTQISwx7iHktJWvxNYTaisK2fvD5
 Qs1IUWfNFbDX0Lr0QpN6j6X4rZkghR4R6XoFkd4nkicwi+UHVn3oK9GSqv24QJn9
 7+Ev3dfRTUYLd6mC4Q==
 =DpIK
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.11

1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
2024-07-12 11:24:12 -04:00
Bibo Mao
492ac37fa3 perf kvm: Add kvm-stat for loongarch64
Add support for 'perf kvm stat' on loongarch64 platform, now only kvm
exit event is supported.

Here is example output about "perf kvm --host stat report" command

   Event name   Samples   Sample%     Time (ns)   Time%   Mean Time (ns)
    Mem Store     83969    51.00%     625697070   8.00%             7451
     Mem Read     37641    22.00%     112485730   1.00%             2988
    Interrupt     15542     9.00%      20620190   0.00%             1326
        IOCSR     15207     9.00%      94296190   1.00%             6200
    Hypercall      4873     2.00%      12265280   0.00%             2516
         Idle      3713     2.00%    6322055860  87.00%          1702681
          FPU      1819     1.00%       2750300   0.00%             1511
   Inst Fetch       502     0.00%       1341740   0.00%             2672
   Mem Modify       324     0.00%        602240   0.00%             1858
       CPUCFG        55     0.00%         77610   0.00%             1411
          CSR        12     0.00%         19690   0.00%             1640
         LASX         3     0.00%          4870   0.00%             1623
          LSX         2     0.00%          2100   0.00%             1050

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2024-07-10 16:50:27 +08:00
Adrian Hunter
b40934ae32 perf intel-pt: Fix exclude_guest setting
In the past, the exclude_guest setting has had no effect on Intel PT
tracing, but that may not be the case in the future.

Set the flag correctly based upon whether KVM is using Intel PT
"Host/Guest" mode, which is determined by the kvm_intel module
parameter pt_mode:

 pt_mode=0	System-wide mode : host and guest output to host buffer
 pt_mode=1	Host/Guest mode : host/guest output to host/guest
                buffers respectively

Fixes: 6e86bfdc4a ("perf intel-pt: Support decoding of guest kernel")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240625104532.11990-3-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-02 15:24:07 -07:00
Adrian Hunter
36b4cd990a perf intel-pt: Fix aux_watermark calculation for 64-bit size
aux_watermark is a u32. For a 64-bit size, cap the aux_watermark
calculation at UINT_MAX instead of truncating it to 32-bits.

Fixes: 874fc35cdd ("perf intel-pt: Use aux_watermark")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Link: https://lore.kernel.org/r/20240625104532.11990-2-adrian.hunter@intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-02 15:24:01 -07:00
Namhyung Kim
74ad3cb08b Merge remote-tracking branch 'perf-tools' into perf-tools-next
Merge fixes and updates in v6.10 into perf-tools-next to resolve changes
in synthesizing the LOST_SAMPLES records and build fixes.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-07-02 11:51:32 -07:00
Ian Rogers
e467705a9f perf util: Make util its own library
Make the util directory into its own library. This is done to avoid
compiling code twice, once for the perf tool and once for the perf
python module. For convenience:
  arch/common.c
  scripts/perl/Perf-Trace-Util/Context.c
  scripts/python/Perf-Trace-Util/Context.c
are made part of this library.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-7-irogers@google.com
2024-06-26 11:07:42 -07:00
Ian Rogers
1dad99af1a perf test: Make tests its own library
Make the tests code its own library. This is done to avoid compiling
code twice, once for the perf tool and once for the perf python
module.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Nick Terrell <terrelln@fb.com>
Cc: Gary Guo <gary@garyguo.net>
Cc: Alex Gaynor <alex.gaynor@gmail.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Wedson Almeida Filho <wedsonaf@gmail.com>
Cc: Ze Gao <zegao2021@gmail.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Andrei Vagin <avagin@google.com>
Cc: Yicong Yang <yangyicong@hisilicon.com>
Cc: Jonathan Cameron <jonathan.cameron@huawei.com>
Cc: Guo Ren <guoren@kernel.org>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Benno Lossin <benno.lossin@proton.me>
Cc: Björn Roy Baron <bjorn3_gh@protonmail.com>
Cc: Andreas Hindborg <a.hindborg@samsung.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240625214117.953777-5-irogers@google.com
2024-06-26 11:07:11 -07:00
Shenlin Liang
da7b1b525e perf kvm/riscv: Port perf kvm stat to RISC-V
'perf kvm stat report/record' generates a statistical analysis of KVM
events and can be used to analyze guest exit reasons.

"report" reports statistical analysis of guest exit events.

To record kvm events on the host:
 # perf kvm stat record -a

To report kvm VM EXIT events:
 # perf kvm stat report --event=vmexit

Signed-off-by: Shenlin Liang <liangshenlin@eswincomputing.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240422080833.8745-3-liangshenlin@eswincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:38 +05:30
Adrian Hunter
fcd094e52b perf tests: Add APX and other new instructions to x86 instruction decoder test
Add samples of APX and other new instructions to the 'x86 instruction
decoder - new instructions' test.

Note the test is only available if the perf tool has been built with
EXTRA_TESTS=1.

Example:

  $ make EXTRA_TESTS=1 -C tools/perf
  $ tools/perf/perf test -F -v 'new ins' |& grep -i 'jmpabs\|popp\|pushp'
  Decoded ok: d5 00 a1 ef cd ab 90 78 56 34 12    jmpabs $0x1234567890abcdef
  Decoded ok: d5 08 53                    pushp  %rbx
  Decoded ok: d5 18 50                    pushp  %r16
  Decoded ok: d5 19 57                    pushp  %r31
  Decoded ok: d5 19 5f                    popp   %r31
  Decoded ok: d5 18 58                    popp   %r16
  Decoded ok: d5 08 5b                    popp   %rbx

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Nikolay Borisov <nik.borisov@suse.com>
Cc: H. Peter Anvin <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240502105853.5338-11-adrian.hunter@intel.com
2024-06-25 11:06:19 -07:00
Ian Rogers
5518063fcb perf arm: Workaround ARM PMUs cpu maps having offline cpus
When PMUs have a cpu map in the 'cpus' or 'cpumask' file, perf will
try to open events on those CPUs. ARM doesn't remove offline CPUs
meaning taking a CPU offline will cause perf commands to fail unless a
CPU map is passed on the command line.

More context in:
https://lore.kernel.org/lkml/20240603092812.46616-1-yangyicong@huawei.com/

Reported-by: Yicong Yang <yangyicong@huawei.com>
Closes: https://lore.kernel.org/lkml/20240603092812.46616-2-yangyicong@huawei.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Yicong Yang <yangyicong@hisilicon.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: linux-arm-kernel@lists.infradead.org
Cc: coresight@lists.linaro.org
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240607065343.695369-1-irogers@google.com
2024-06-21 15:50:29 -07:00
Arnaldo Carvalho de Melo
da42b5229b tools headers: Update the syscall tables and unistd.h, mostly to support the new 'mseal' syscall
But also to wire up shadow stacks on 32-bit x86, picking up those
changes from these csets:

  ff388fe5c4 ("mseal: wire up mseal syscall")
  2883f01ec3 ("x86/shstk: Enable shadow stacks for x32")

This makes 'perf trace' support it, now its possible, for instance to
do:

  # perf trace -e mseal --max-stack=16

Here is an example with the 'sendmmsg' syscall:

  root@x1:~# perf trace -e sendmmsg --max-stack 16 --max-events=1
       0.000 ( 0.062 ms): dbus-broker/1012 sendmmsg(fd: 150, mmsg: 0x7ffef57cca50, vlen: 1, flags: DONTWAIT|NOSIGNAL) = 1
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode_prepare ([kernel.kallsyms])
                                         syscall_exit_to_user_mode ([kernel.kallsyms])
                                         do_syscall_64 ([kernel.kallsyms])
                                         entry_SYSCALL_64 ([kernel.kallsyms])
                                         [0x117ce7] (/usr/lib64/libc.so.6 (deleted))
  root@x1:~#

To do a system wide tracing of the new 'mseal' syscall with a backtrace
of at most 16 entries.

This addresses these perf tools build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
    diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
    diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
    diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: H J Lu <hjl.tools@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZlXlo4TNcba4wnVZ@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-28 11:10:00 -03:00
Linus Torvalds
29c73fc794 perf tools fixes and improvements for v6.10:
- Add Kan Liang to MAINTAINERS as a perf tools reviewer.
 
 - Add support for using the 'capstone' disassembler library in various tools,
   such as 'perf script' and 'perf annotate'. This is an alternative for the
   use of the 'xed' and 'objdump' disassemblers.
 
 - Data-type profiling improvements:
 
   Resolve types for a->b->c by backtracking the assignments until it finds
   DWARF info for one of those members
 
   Support for global variables, keeping a cache to speed up lookups.
 
   Handle the 'call' instruction, dealing with effects on registers and handling
   its return when tracking register data types.
 
   Handle x86's segment based addressing like %gs:0x28, to support things like
   per CPU variables, the stack canary, etc.
 
   Data-type profiling got big speedups when using capstone for disassembling.
   The objdump outoput parsing method is left as a fallback when capstone fails or
   isn't available. There are patches posted for 6.11 that to use a LLVM
   disassembler.
 
   Support event group display in the TUI when annotating types with --data-type,
   for instance to show memory load and store events for the data type fields.
 
   Optimize the 'perf annotate' data structures, reducing memory usage.
 
   Add a initial 'perf test' for 'perf annotate', checking that a target symbol
   appears on the output, specifying objdump via the command line, etc.
 
 - Integrate the shellcheck utility with the build of perf to allow catching
   shell problems early in areas such as 'perf test', 'perf trace' scrape
   scripts, etc.
 
 - Add 'uretprobe' variant in the 'perf bench uprobe' tool.
 
 - Add script to run instances of 'perf script' in parallel.
 
 - Allow parsing tracepoint names that start with digits, such as
   9p/9p_client_req, etc. Make sure 'perf test' tests it even on systems
   where those tracepoints aren't available.
 
 Vendor Events:
 
 - Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand Ridge, Ice
   Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra Forest, Sky Lake X,
   Sky Lake and Snow Ridge X.  Remove info metrics erroneously in TopdownL1.
 
 - Add AMD's Zen 5 core and uncore events and metrics. Those come from the
   "Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh Processors"
   document, with events that capture information on op dispatch, execution and
   retirement, branch prediction, L1 and L2 cache activity, TLB activity, etc.
 
 - Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/AmpereOneX.
 
 Miscellaneous:
 
 - Sync header copies with the kernel sources.
 
 - Move some header copies used only for generating translation string tables
   for ioctl cmds and other syscall integer arguments to a new directory under
   tools/perf/beauty/, to separate from copies in tools/include/ that are used
   to build the tools.
 
 - Introduce scrape script for several syscall 'flags'/'mask' arguments.
 
 - Improve cpumap utilization, fixing up pairing of refcounts, using the right
   iterators (perf_cpu_map__for_each_cpu), etc.
 
 - Give more details about raw event encodings in 'perf list', show tracepoint
   encoding in the detailed output.
 
 - Refactor the DSOs handling code, reducing memory usage.
 
 - Document the BPF event modifier and add a 'perf test' for it.
 
 - Improve the event parser, better error messages and add further 'perf test's
   for it.
 
 - Add reference count checking to 'struct comm_str' and 'struct mem_info'.
 
 - Make ARM64's 'perf test' entries for the Neoverse N1 more robust.
 
 - Tweak the ARM64's Coresight 'perf test's.
 
 - Improve ARM64's CoreSight ETM version detection and error reporting.
 
 - Fix handling of symbols when using kcore.
 
 - Fix PAI (Processor Activity Instrumentation) counter names for s390 virtual
   machines in 'perf report'.
 
 - Fix -g/--call-graph option failure in 'perf sched timehist'.
 
 - Add LIBTRACEEVENT_DIR build option to allow building with libtraceevent
   installed in non-standard directories, such as when doing cross builds.
 
 - Various 'perf test' and 'perf bench' fixes.
 
 - Improve 'perf probe' error message for long C++ probe names.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCZkzdjgAKCRCyPKLppCJ+
 J8ZYAP46rcbOuDWol6tjD9FDXd+spkWc40bnqeSnOR+TWlmJXwEA87XU4+3LAh6p
 HQxKXehJRh90I90yn954mK2NuN+58Q0=
 =sie+
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "General:

   - Integrate the shellcheck utility with the build of perf to allow
     catching shell problems early in areas such as 'perf test', 'perf
     trace' scrape scripts, etc

   - Add 'uretprobe' variant in the 'perf bench uprobe' tool

   - Add script to run instances of 'perf script' in parallel

   - Allow parsing tracepoint names that start with digits, such as
     9p/9p_client_req, etc. Make sure 'perf test' tests it even on
     systems where those tracepoints aren't available

   - Add Kan Liang to MAINTAINERS as a perf tools reviewer

   - Add support for using the 'capstone' disassembler library in
     various tools, such as 'perf script' and 'perf annotate'. This is
     an alternative for the use of the 'xed' and 'objdump' disassemblers

  Data-type profiling improvements:

   - Resolve types for a->b->c by backtracking the assignments until it
     finds DWARF info for one of those members

   - Support for global variables, keeping a cache to speed up lookups

   - Handle the 'call' instruction, dealing with effects on registers
     and handling its return when tracking register data types

   - Handle x86's segment based addressing like %gs:0x28, to support
     things like per CPU variables, the stack canary, etc

   - Data-type profiling got big speedups when using capstone for
     disassembling. The objdump outoput parsing method is left as a
     fallback when capstone fails or isn't available. There are patches
     posted for 6.11 that to use a LLVM disassembler

   - Support event group display in the TUI when annotating types with
     --data-type, for instance to show memory load and store events for
     the data type fields

   - Optimize the 'perf annotate' data structures, reducing memory usage

   - Add a initial 'perf test' for 'perf annotate', checking that a
     target symbol appears on the output, specifying objdump via the
     command line, etc

  Vendor Events:

   - Update Intel JSON files for Cascade Lake X, Emerald Rapids, Grand
     Ridge, Ice Lake X, Lunar Lake, Meteor Lake, Sapphire Rapids, Sierra
     Forest, Sky Lake X, Sky Lake and Snow Ridge X. Remove info metrics
     erroneously in TopdownL1

   - Add AMD's Zen 5 core and uncore events and metrics. Those come from
     the "Performance Monitor Counters for AMD Family 1Ah Model 00h- 0Fh
     Processors" document, with events that capture information on op
     dispatch, execution and retirement, branch prediction, L1 and L2
     cache activity, TLB activity, etc

   - Mark L1D_CACHE_INVAL impacted by errata for ARM64's AmpereOne/
     AmpereOneX

  Miscellaneous:

   - Sync header copies with the kernel sources

   - Move some header copies used only for generating translation string
     tables for ioctl cmds and other syscall integer arguments to a new
     directory under tools/perf/beauty/, to separate from copies in
     tools/include/ that are used to build the tools

   - Introduce scrape script for several syscall 'flags'/'mask'
     arguments

   - Improve cpumap utilization, fixing up pairing of refcounts, using
     the right iterators (perf_cpu_map__for_each_cpu), etc

   - Give more details about raw event encodings in 'perf list', show
     tracepoint encoding in the detailed output

   - Refactor the DSOs handling code, reducing memory usage

   - Document the BPF event modifier and add a 'perf test' for it

   - Improve the event parser, better error messages and add further
     'perf test's for it

   - Add reference count checking to 'struct comm_str' and 'struct
     mem_info'

   - Make ARM64's 'perf test' entries for the Neoverse N1 more robust

   - Tweak the ARM64's Coresight 'perf test's

   - Improve ARM64's CoreSight ETM version detection and error reporting

   - Fix handling of symbols when using kcore

   - Fix PAI (Processor Activity Instrumentation) counter names for s390
     virtual machines in 'perf report'

   - Fix -g/--call-graph option failure in 'perf sched timehist'

   - Add LIBTRACEEVENT_DIR build option to allow building with
     libtraceevent installed in non-standard directories, such as when
     doing cross builds

   - Various 'perf test' and 'perf bench' fixes

   - Improve 'perf probe' error message for long C++ probe names"

* tag 'perf-tools-for-v6.10-1-2024-05-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (260 commits)
  tools lib subcmd: Show parent options in help
  perf pmu: Count sys and cpuid JSON events separately
  perf stat: Don't display metric header for non-leader uncore events
  perf annotate-data: Ensure the number of type histograms
  perf annotate: Fix segfault on sample histogram
  perf daemon: Fix file leak in daemon_session__control
  libsubcmd: Fix parse-options memory leak
  perf lock: Avoid memory leaks from strdup()
  perf sched: Rename 'switches' column header to 'count' and add usage description, options for latency
  perf tools: Ignore deleted cgroups
  perf parse: Allow tracepoint names to start with digits
  perf parse-events: Add new 'fake_tp' parameter for tests
  perf parse-events: pass parse_state to add_tracepoint
  perf symbols: Fix ownership of string in dso__load_vmlinux()
  perf symbols: Update kcore map before merging in remaining symbols
  perf maps: Re-use __maps__free_maps_by_name()
  perf symbols: Remove map from list before updating addresses
  perf tracepoint: Don't scan all tracepoints to test if one exists
  perf dwarf-aux: Fix build with HAVE_DWARF_CFI_SUPPORT
  perf thread: Fixes to thread__new() related to initializing comm
  ...
2024-05-21 15:45:14 -07:00
James Clark
e3123079b9 perf cs-etm: Improve version detection and error reporting
When the config validation functions are warning about ETMv3, they do it
based on "not ETMv4". If the drivers aren't all loaded or the hardware
doesn't support Coresight it will appear as "not ETMv4" and then Perf
will print the error message "... not supported in ETMv3 ..." which is
wrong and confusing.

cs_etm_is_etmv4() is also misnamed because it also returns true for
ETE because ETE has a superset of the ETMv4 metadata files. Although
this was always done in the correct order so it wasn't a bug.

Improve all this by making a single get version function which also
handles not present as a separate case. Change the ETMv3 error message
to only print when ETMv3 is detected, and add a new error message for
the not present case.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Leo Yan <leo.yan@linux.dev>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-4-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02 11:34:41 -03:00
James Clark
bc5e0e1b93 perf cs-etm: Remove repeated fetches of the ETM PMU
Most functions already have cs_etm_pmu, so it's a bit neater to pass
it through rather than itr only to convert it again.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-3-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02 11:33:46 -03:00
James Clark
cbaf2c4f93 perf cs-etm: Use struct perf_cpu as much as possible
The perf_cpu struct makes some iterators simpler and avoids some
mistakes with interchanging CPU IDs with indexes etc. At the moment in
this file the conversion to an integer is done somewhere in the middle
of the call tree. Change it to delay the conversion to an int until the
leaf functions.

Some of the usage patterns are duplicated, so instead of changing them
all, make cs_etm_get_ro() more reusable and use that everywhere.
cs_etm_get_ro() didn't return an error before, but return one now so
that it can also be used where an error is needed. Continue to ignore
the error where it was already ignored.

Use cs_etm_pmu_path_exists() instead of cs_etm_get_ro() in
cs_etm_is_etmv4() because cs_etm_get_ro() prints a warning, but path
exists is sufficient for this use case.

Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20240501135753.508022-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-02 11:26:59 -03:00
Ben Zong-You Xie
9c49085d69
perf riscv: Fix the warning due to the incompatible type
In the 32-bit platform, the second argument of getline is expectd to be
'size_t *'(aka 'unsigned int *'), but line_sz is of type
'unsigned long *'. Therefore, declare line_sz as size_t.

Signed-off-by: Ben Zong-You Xie <ben717@andestech.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240305120501.1785084-3-ben717@andestech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-04-26 10:21:55 -07:00
Ian Rogers
ec440763bb perf arch x86: Add shellcheck to build
Add shellcheck for:

  tools/perf/arch/x86/tests/gen-insn-x86-dat.sh
  tools/perf/arch/x86/entry/syscalls/syscalltbl.sh

Address a minor quoting issue.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20240409023216.2342032-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-12 17:54:02 -03:00
Ian Rogers
71bc3ac8e8 perf cpumap: Use perf_cpu_map__for_each_cpu when possible
Rather than manually iterating the CPU map, use
perf_cpu_map__for_each_cpu(). When possible tidy local variables.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240202234057.2085863-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Ian Rogers
4ddccd0048 perf arm64 header: Remove unnecessary CPU map get and put
In both cases the CPU map is known owned by either the caller or a
PMU.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240202234057.2085863-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Ian Rogers
291dcd774b perf intel-pt/intel-bts: Switch perf_cpu_map__has_any_cpu_or_is_empty use
Switch perf_cpu_map__has_any_cpu_or_is_empty() to
perf_cpu_map__is_any_cpu_or_is_empty() as a CPU map may contain CPUs as
well as the dummy event and perf_cpu_map__is_any_cpu_or_is_empty() is a
more correct alternative.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240202234057.2085863-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Ian Rogers
e28ee1239d perf arm-spe/cs-etm: Directly iterate CPU maps
Rather than iterate all CPUs and see if they are in CPU maps, directly
iterate the CPU map. Similarly make use of the intersect function
taking care for when "any" CPU is specified. Switch
perf_cpu_map__has_any_cpu_or_is_empty() to more appropriate
alternatives.

Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alexghiti@rivosinc.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Atish Patra <atishp@rivosinc.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240202234057.2085863-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:28 -03:00
Masahiro Yamada
c2bd08ba20 treewide: remove meaningless assignments in Makefiles
In Makefiles, $(error ), $(warning ), and $(info ) expand to the empty
string, as explained in the GNU Make manual [1]:
 "The result of the expansion of this function is the empty string."

Therefore, they are no-op except for logging purposes.

$(shell ...) expands to the output of the command. It expands to the
empty string when the command does not print anything to stdout.
Hence, $(shell mkdir ...) is no-op except for creating the directory.

Remove meaningless assignments.

[1]: https://www.gnu.org/software/make/manual/make.html#Make-Control-Functions

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240221134201.2656908-1-masahiroy@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
2024-02-23 14:19:07 -08:00
Leo Yan
9a4e47ef98 perf parse-regs: Introduce a weak function arch__sample_reg_masks()
Every architecture can provide a register list for sampling. If an
architecture doesn't support register sampling, it won't define the data
structure 'sample_reg_masks'. Consequently, any code using this
structure must be protected by the macro 'HAVE_PERF_REGS_SUPPORT'.

This patch defines a weak function, arch__sample_reg_masks(), which will
be replaced by an architecture-defined function for returning the
architecture's register list. With this refactoring, the function always
exists, the condition checking for 'HAVE_PERF_REGS_SUPPORT' is not
needed anymore, so remove it.

Signed-off-by: Leo Yan <leo.yan@linux.dev>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Ming Wang <wangming01@loongson.cn>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240214113947.240957-4-leo.yan@linux.dev
2024-02-15 13:48:36 -08:00
Ian Rogers
42fd623b58 perf maps: Get map before returning in maps__find
Finding a map is done under a lock, returning the map without a
reference count means it can be removed without notice and causing
uses after free. Grab a reference count to the map within the lock
region and return this. Fix up locations that need a map__put
following this.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: James Clark <james.clark@arm.com>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Song Liu <song@kernel.org>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Artem Savkov <asavkov@redhat.com>
Cc: bpf@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240210031746.4057262-3-irogers@google.com
2024-02-12 12:35:26 -08:00
Ian Rogers
8ce5fa4d68 perf kvm powerpc: Fix build
Updates to struct parse_events_error needed to be carried through to
PowerPC specific event parsing.

Fixes: fd7b8e8fb2 ("perf parse-events: Print all errors")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung <namhyung@kernel.org>
Cc: James Clark <james.clark@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240206235902.2917395-1-irogers@google.com
2024-02-07 08:55:11 -08:00
Ian Rogers
fd7b8e8fb2 perf parse-events: Print all errors
Prior to this patch the first and the last error encountered during
parsing are printed. To see other errors verbose needs
enabling. Unfortunately this can drop useful errors, in particular on
terms. This patch changes the errors so that instead of the first and
last all errors are recorded and printed, the underlying data
structure is changed to a list.

Before:
```
$ perf stat -e 'slots/edge=2/' true
event syntax error: 'slots/edge=2/'
                                \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'slots'

Initial error:
event syntax error: 'slots/edge=2/'
                     \___ Cannot find PMU `slots'. Missing kernel support?
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
```

After:
```
$ perf stat -e 'slots/edge=2/' true
event syntax error: 'slots/edge=2/'
                     \___ Bad event or PMU

Unable to find PMU or event on a PMU of 'slots'

event syntax error: 'slots/edge=2/'
                                \___ value too big for format (edge), maximum is 1

event syntax error: 'slots/edge=2/'
                     \___ Cannot find PMU `slots'. Missing kernel support?
Run 'perf list' for a list of valid events

 Usage: perf stat [<options>] [<command>]

    -e, --event <event>   event selector. use 'perf list' to list available events
```

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: tchen168@asu.edu
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131134940.593788-3-irogers@google.com
2024-02-02 13:08:05 -08:00
Ian Rogers
2882358b8b perf tsc: Add missing newlines to debug statements
It is assumed that debug statements always print a newline, fix two
missing ones.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: James Clark <james.clark@arm.com>
Cc: tchen168@asu.edu
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240131134940.593788-1-irogers@google.com
2024-02-02 13:07:18 -08:00
Kan Liang
821aca20be perf mem: Clean up perf_pmus__num_mem_pmus()
The number of mem PMUs can be calculated by searching the
perf_pmus__scan_mem().

Remove the ARCH specific perf_pmus__num_mem_pmus()

Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ravi.bangoria@amd.com
Cc: james.clark@arm.com
Cc: will@kernel.org
Cc: mike.leach@linaro.org
Cc: renyu.zj@linux.alibaba.com
Cc: yuhaixin.yhx@linux.alibaba.com
Cc: tmricht@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: john.g.garry@oracle.com
Link: https://lore.kernel.org/r/20240123185036.3461837-8-kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-24 14:05:22 -08:00
Kan Liang
8ea9dfb916 perf mem: Clean up is_mem_loads_aux_event()
The aux_event can be retrieved from the perf_pmu now. Implement a
generic support.

Reviewed-by: Ian Rogers <irogers@google.com>
Tested-by: Ravi Bangoria <ravi.bangoria@amd.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: james.clark@arm.com
Cc: will@kernel.org
Cc: mike.leach@linaro.org
Cc: renyu.zj@linux.alibaba.com
Cc: yuhaixin.yhx@linux.alibaba.com
Cc: tmricht@linux.ibm.com
Cc: atrajeev@linux.vnet.ibm.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: john.g.garry@oracle.com
Link: https://lore.kernel.org/r/20240123185036.3461837-6-kan.liang@linux.intel.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-01-24 14:05:00 -08:00