Commit Graph

9 Commits

Author SHA1 Message Date
Ian Rogers
375368a961 perf fncache: Switch to using hashmap
The existing fncache can get large in testing situations. As the
bucket array is a fixed size this leads to it degrading to O(n)
performance. Use a regular hashmap that can dynamically reallocate its
array.

Before:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m14.132s
user    0m17.806s
sys     0m0.557s
```

After:
```
$ time perf test "Parsing of PMU event table metrics"
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

real    0m13.287s
user    0m13.026s
sys     0m0.532s
```

Committer notes:

  root@number:~# grep -m1 'model name' /proc/cpuinfo
  model name	: AMD Ryzen 9 9950X3D 16-Core Processor
  root@number:~#

Before:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.277s
  user	0m9.979s
  sys	0m0.055s
  root@number:~#

After:

  root@number:~# time perf test "Parsing of PMU event table metrics"
   10.3: Parsing of PMU event table metrics                            : Ok
   10.4: Parsing of PMU event table metrics with fake PMUs             : Ok

  real	0m9.296s
  user	0m9.361s
  sys	0m0.063s
  root@number:~#

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Xu Yang <xu.yang_2@nxp.com>
Link: https://lore.kernel.org/r/20250512194622.33258-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-05-13 16:36:22 -03:00
Zou Wei
f54cad25a1 perf srccode: Use list_move() instead of equivalent list_del() + list_add() sequence
Using list_move() instead of list_del() + list_add(), shorter,
equivalent.

Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zou Wei <zou_wei@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1623113566-49455-1-git-send-email-zou_wei@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2021-06-08 09:36:36 -03:00
Andi Kleen
d96645821e perf pmu: Use file system cache to optimize sysfs access
pmu.c does a lot of redundant /sys accesses while parsing aliases
and probing for PMUs. On large systems with a lot of PMUs this
can get expensive (>2s):

  % time     seconds  usecs/call     calls    errors syscall
  ------ ----------- ----------- --------- --------- ----------------
   27.25    1.227847           8    160888     16976 openat
   26.42    1.190481           7    164224    164077 stat

Add a cache to remember if specific file names exist or don't
exist, which eliminates most of this overhead.

Also optimize some stat() calls to be slightly cheaper access()

Resulting in:

    0.18    0.004166           2      1851       305 open
    0.08    0.001970           2       829       622 access

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20191121001522.180827-2-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-11-28 08:08:38 -03:00
Jiri Olsa
20f2be1d48 libperf: Move 'page_size' global variable to libperf
We need the 'page_size' variable in libperf, so move it there.

Add a libperf_init() as a global libperf init function to obtain this
value via sysconf() at tool start.

Committer notes:

Add internal/lib.h to tools/perf/ files using 'page_size', sometimes
replacing util.h with it if that was the only reason for having util.h
included.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lore.kernel.org/lkml/20190913132355.21634-33-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-25 09:51:48 -03:00
Arnaldo Carvalho de Melo
fb71c86cc8 perf tools: Remove util.h from where it is not needed
Check that it is not needed and remove, fixing up some fallout for
places where it was only serving to get something else.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-9h6dg6lsqe2usyqjh5rrues4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-09-20 09:19:20 -03:00
Arnaldo Carvalho de Melo
e56fbc9dc7 perf tools: Use list_del_init() more thorougly
To allow for destructors to check if they're operating on a object still
in a list, and to avoid going from use after free list entries into
still valid, or even also other already removed from list entries.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-deh17ub44atyox3j90e6rksu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:27 -03:00
Arnaldo Carvalho de Melo
d8f9da2404 perf tools: Use zfree() where applicable
In places where the equivalent was already being done, i.e.:

   free(a);
   a = NULL;

And in placs where struct members are being freed so that if we have
some erroneous reference to its struct, then accesses to freed members
will result in segfaults, which we can detect faster than use after free
to areas that may still have something seemingly valid.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lkml.kernel.org/n/tip-jatyoofo5boc1bsvoig6bb6i@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2019-07-09 10:13:27 -03:00
Thomas Gleixner
2025cf9e19 treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 288
Based on 1 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms and conditions of the gnu general public license
  version 2 as published by the free software foundation this program
  is distributed in the hope it will be useful but without any
  warranty without even the implied warranty of merchantability or
  fitness for a particular purpose see the gnu general public license
  for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-only

has been chosen to replace the boilerplate/reference in 263 file(s).

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Allison Randal <allison@lohutok.net>
Reviewed-by: Alexios Zavras <alexios.zavras@intel.com>
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190529141901.208660670@linutronix.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2019-06-05 17:36:37 +02:00
Andi Kleen
dd2e18e9ac perf tools: Support 'srccode' output
When looking at PT or brstackinsn traces with 'perf script' it can be
very useful to see the source code. This adds a simple facility to print
them with 'perf script', if the information is available through dwarf

  % perf record ...
  % perf script -F insn,ip,sym,srccode
  ...

            4004c6 main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004c6 main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004cd main
  5               for (i = 0; i < 10000000; i++)
             4004b3 main
  6                       v++;

  % perf record -b ...
  % perf script -F insn,ip,sym,srccode,brstackinsn

  ...
         main+22:
          0000000000400543        insn: e8 ca ff ff ff            # PRED
  |18                     f1();
          f1:
          0000000000400512        insn: 55
  |10       {
          0000000000400513        insn: 48 89 e5
          0000000000400516        insn: b8 00 00 00 00
  |11             f2();
          000000000040051b        insn: e8 d6 ff ff ff            # PRED
          f2:
          00000000004004f6        insn: 55
  |5        {
          00000000004004f7        insn: 48 89 e5
          00000000004004fa        insn: 8b 05 2c 0b 20 00
  |6              c = a / b;
          0000000000400500        insn: 8b 0d 2a 0b 20 00
          0000000000400506        insn: 99
          0000000000400507        insn: f7 f9
          0000000000400509        insn: 89 05 29 0b 20 00
          000000000040050f        insn: 90
  |7        }
          0000000000400510        insn: 5d
          0000000000400511        insn: c3                        # PRED
          f1+14:
          0000000000400520        insn: b8 00 00 00 00
  |12             f2();
          0000000000400525        insn: e8 cc ff ff ff            # PRED
          f2:
          00000000004004f6        insn: 55
  |5        {
          00000000004004f7        insn: 48 89 e5
          00000000004004fa        insn: 8b 05 2c 0b 20 00
  |6              c = a / b;

Not supported for callchains currently, would need some layout changes
there.

Committer notes:

Fixed the build on Alpine Linux (3.4 .. 3.8) by addressing this
warning:

  In file included from util/srccode.c:19:0:
  /usr/include/sys/fcntl.h:1:2: error: #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h> [-Werror=cpp]
   #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
    ^~~~~~~
  cc1: all warnings being treated as errors

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20181204001848.24769-1-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2018-12-17 14:57:07 -03:00