Commit Graph

30456 Commits

Author SHA1 Message Date
Mark Brown
3f374d7972 kselftest/arm64: Handle more kselftest result codes in MTE helpers
The MTE selftests have a helper evaluate_test() which translates a return
code into a call to ksft_test_result_*(). Currently this only handles pass
and fail, silently ignoring any other code. Update the helper to support
skipped tests and log any unknown return codes as an error so we get at
least some diagnostic if anything goes wrong.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220419103243.24774-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-28 17:57:10 +01:00
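To illustrate the dispatch described in the commit above, here is a minimal sketch in Python. The real helper is written in C and calls ksft_test_result_*(). The numeric codes are the standard kselftest exit codes; the result strings and the exact wording of the diagnostic are assumptions made for illustration only.

```python
# Exit codes as defined by the kernel's kselftest framework.
KSFT_PASS = 0
KSFT_FAIL = 1
KSFT_SKIP = 4


def evaluate_test(err: int, msg: str) -> str:
    """Hypothetical model of the helper: turn a return code into a result line."""
    if err == KSFT_PASS:
        return f"ok {msg}"
    if err == KSFT_FAIL:
        return f"not ok {msg}"
    if err == KSFT_SKIP:
        return f"ok {msg} # SKIP"
    # Unknown codes are reported instead of being silently ignored.
    return f"# Unknown return code {err} from {msg}"
```

The point of the final branch is that an unexpected code still leaves a trace in the output, which is the diagnostic the commit message asks for.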
Mark Brown
82f97bcd87 kselftest/arm64: Validate setting via FPSIMD and read via SVE regsets
Currently we validate that we can set the floating point state via the SVE
regset and read the data via the FPSIMD regset but we do not validate that
the opposite case works as expected. Add a test that covers this case,
noting that when reading via SVE regset the kernel has the option of
returning either SVE or FPSIMD data so we need to accept both formats.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-4-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-28 17:57:10 +01:00
Mark Brown
1fb1e285b4 kselftest/arm64: Remove assumption that tasks start FPSIMD only
Currently the sve-ptrace test for setting and reading FPSIMD data assumes
that the child will start off in FPSIMD only mode and that it can use this
to read some FPSIMD mode SVE ptrace data, skipping the test if it can't.
This isn't an assumption guaranteed by the ABI and also limits how we can
use this testcase within the program. Instead skip the initial read and
just generate a FPSIMD format buffer for the write part of the test, making
the coverage more robust in the face of future kernel and test program
changes.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-3-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-28 17:57:10 +01:00
Mark Brown
854f856f7e kselftest/arm64: Fix comment for ptrace_sve_get_fpsimd_data()
The comment for ptrace_sve_get_fpsimd_data() doesn't describe what the test
does at all, fix that.

Signed-off-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Link: https://lore.kernel.org/r/20220404090613.181272-2-broonie@kernel.org
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2022-04-28 17:57:10 +01:00
Namhyung Kim
a5d20d42a2 perf symbol: Remove arch__symbols__fixup_end()
Now the generic code can handle kallsyms fixup properly so no need to
keep the arch-functions anymore.

Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-4-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:51:40 -03:00
Namhyung Kim
8799ebce84 perf symbol: Update symbols__fixup_end()
Now arch-specific functions all do the same thing.  When it fixes the
symbol address it needs to check the boundary between the kernel image
and modules.  For the last symbol in the previous region, it cannot
know the exact size as it's discarded already.  Thus it just uses a
small page size (4096) and rounds it up like the last symbol.

Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:51:33 -03:00
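The fixup rule described in the commit above can be sketched roughly as follows. This is a simplified Python model, not the perf source: the dictionary layout and function names are hypothetical, and only the 4096-byte page size and the round-up behaviour come from the commit message.

```python
PAGE_SIZE = 4096  # the "small page size" named in the commit message


def round_up(value: int, align: int) -> int:
    """Round value up to the next multiple of align."""
    return (value + align - 1) // align * align


def fixup_end(symbols):
    """Fill in missing sizes for symbols sorted by start address.

    Each symbol is a dict with 'start', 'end' and 'is_module'; a symbol
    whose end equals its start is treated as having no size information.
    """
    for prev, curr in zip(symbols, symbols[1:]):
        if prev["end"] != prev["start"]:
            continue  # size already known
        if not prev["is_module"] and curr["is_module"]:
            # Last kernel-image symbol before the module area: its real
            # neighbour is unknown, so assume one page and round up.
            prev["end"] = round_up(prev["start"] + PAGE_SIZE, PAGE_SIZE)
        else:
            prev["end"] = curr["start"]
    if symbols and symbols[-1]["end"] == symbols[-1]["start"]:
        last = symbols[-1]
        last["end"] = round_up(last["start"], PAGE_SIZE) + PAGE_SIZE
    return symbols
```

Without the boundary check, the last kernel symbol would appear to extend all the way to the first module symbol, which is the case the commit sets out to avoid.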
Namhyung Kim
838425f2de perf symbol: Pass is_kallsyms to symbols__fixup_end()
The symbol fixup is necessary for symbols in kallsyms since they don't
have size info.  So we use the next symbol's address to calculate the
size.  Now it's also used for user binaries because sometimes they miss
size for hand-written asm functions.

There's an arch-specific function to handle kallsyms differently but
currently it cannot distinguish kallsyms from others.  Pass this
information explicitly to handle it properly.  Note that those arch
functions will be moved to the generic function so I didn't add it to
the arch-functions.

Fixes: 3cf6a32f3f ("perf symbols: Fix symbol size calculation condition")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-s390@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220416004048.1514900-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:51:20 -03:00
Timothy Hayes
3b9a8c8b9a perf test: Add perf_event_attr test for Arm SPE
Adds a perf_event_attr test for Arm SPE in which the presence of
physical addresses is checked when the SPE unit is run with pa_enable=1.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-4-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:40:49 -03:00
Timothy Hayes
7599b70a3c perf arm-spe: Fix SPE events with phys addresses
This patch corrects a bug whereby SPE collection is invoked with
pa_enable=1 but synthesized events fail to show physical addresses.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220421165205.117662-3-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:39:28 -03:00
Timothy Hayes
4e13f6706d perf arm-spe: Fix addresses of synthesized SPE events
This patch corrects a bug whereby synthesized events from SPE
samples are missing virtual addresses.

Fixes: 54f7815efe ("perf arm-spe: Fill address info for samples")
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: Timothy Hayes <timothy.hayes@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: netdev@vger.kernel.org
Cc: Song Liu <songliubraving@fb.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20220421165205.117662-2-timothy.hayes@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:39:14 -03:00
Ian Rogers
36c84190dc perf vendor events intel: Update WSM-EX events to v3
Events are generated for Westmere EX v3 with events from:

  https://download.01.org/perfmon/WSM-EX/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change updates descriptions.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:30:36 -03:00
Ian Rogers
a0cb448978 perf vendor events intel: Update WSM-EP-SP events to v3
Events are generated for Westmere EP-SP v3 with events from:

  https://download.01.org/perfmon/WSM-EP-SP/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change updates descriptions.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:30:16 -03:00
Ian Rogers
e14fd2ee6d perf vendor events intel: Update SKX events to v1.27
Events are generated for Skylake Server v1.27 with
events from:

  https://download.01.org/perfmon/SKX/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change updates descriptions, adds INST_DECODED.DECODERS and
corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:29:56 -03:00
Ian Rogers
02c758d2aa perf vendor events intel: Update SKL events to v53
Events are generated for Skylake v53 with
events from:

  https://download.01.org/perfmon/SKL/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change updates descriptions, adds INST_DECODED.DECODERS and
corrects a counter mask in UOPS_RETIRED.TOTAL_CYCLES.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:29:38 -03:00
Ian Rogers
8ce185d496 perf vendor events intel: Update IVT events to v21
Events are generated for Ivytown v21 with events from:

  https://download.01.org/perfmon/IVT/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change fixes a spelling mistake in a description.

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:29:12 -03:00
Ian Rogers
a5043ed963 perf vendor events intel: Update ICL events to v1.13
Events are generated for Icelake v1.13 with events from:

  https://download.01.org/perfmon/ICL/

Using the scripts at:

  https://github.com/intel/event-converter-for-linux-perf/

This change updates descriptions and adds INST_DECODED.DECODERS.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220428075730.797727-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:28:40 -03:00
Thomas Richter
44900ce975 perf test: Fix test case 81 ("perf record tests") on s390x
perf test -F 81 ("perf record tests") -v fails on s390x on the
linux-next branch.

The test case is x86 specific and cannot be executed on s390x.  The test
case depends on x86 register names such as:

  ... | egrep -q 'available registers: AX BX CX DX ....'

Skip this test case on s390x.

Output before:

  # perf test -F 81
  81: perf record tests                       : FAILED!
  #

Output after:

  # perf test -F 81
  81: perf record tests                       : Skip
  #

Fixes: 24f378e660 ("perf test: Add basic perf record tests")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220428122821.3652015-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-28 10:24:24 -03:00
Adrian Hunter
de8fd13843 perf intel-pt: Fix timeless decoding with perf.data directory
Intel PT does not capture data in separate directories, so do not
use separate directory processing because it doesn't work for
timeless decoding. It also looks like it doesn't support one_mmap
handling.

Example:

  Before:

    # perf record --kcore -a -e intel_pt/tsc=0/k sleep 0.1
    [ perf record: Woken up 1 times to write data ]
    [ perf record: Captured and wrote 1.799 MB perf.data ]
    # perf script --itrace=bep | head
    #

  After:

    # perf script --itrace=bep | head
    perf 21073 [000]              psb:  psb offs: 0                       ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
    perf 21073 [000]              cbr:  cbr: 45 freq: 4505 MHz (161%)     ffffffffaa68faf4 native_write_msr+0x4 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:                 0 [unknown] ([unknown]) => ffffffffaa68faf6 native_write_msr+0x6 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa68faf8 native_write_msr+0x8 ([kernel.kallsyms]) => ffffffffaa61aab0 pt_config_start+0x60 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa61aabd pt_config_start+0x6d ([kernel.kallsyms]) => ffffffffaa61b8ad pt_event_start+0x27d ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa61b8bb pt_event_start+0x28b ([kernel.kallsyms]) => ffffffffaa61ba60 pt_event_add+0x40 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa61ba76 pt_event_add+0x56 ([kernel.kallsyms]) => ffffffffaa880e86 event_sched_in+0xc6 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa880e9b event_sched_in+0xdb ([kernel.kallsyms]) => ffffffffaa880ea5 event_sched_in+0xe5 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa880eba event_sched_in+0xfa ([kernel.kallsyms]) => ffffffffaa880f96 event_sched_in+0x1d6 ([kernel.kallsyms])
    perf 21073 [000]          1       branches:k:  ffffffffaa880fc8 event_sched_in+0x208 ([kernel.kallsyms]) => ffffffffaa880ec0 event_sched_in+0x100 ([kernel.kallsyms])

Fixes: bb6be405c4 ("perf session: Load data directory files for analysis")
Cc: stable@vger.kernel.org
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220428093109.274641-1-adrian.hunter@intel.com
Cc: Ian Rogers <irogers@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: linux-kernel@vger.kernel.org
2022-04-28 10:20:52 -03:00
Mykola Lysenko
0925225956 bpf/selftests: Add granular subtest output for prog_test
Implement per subtest log collection for both parallel
and sequential test execution. This allows granular
per-subtest error output in the 'All error logs' section.
Add subtest log transfer into the protocol during the
parallel test execution.

Move all test log printing logic into dump_test_log
function. One exception is the output of test names when
verbose printing is enabled. Move test name/result
printing into separate functions to avoid repetition.

Print all successful subtest results in the log. Print
only failed test logs when test does not have subtests.
Or only failed subtests' logs when test has subtests.

Disable 'All error logs' output when verbose mode is
enabled. This functionality was already broken and is
causing confusion.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220427041353.246007-1-mykolal@fb.com
2022-04-27 19:03:58 -07:00
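The log selection rule in the commit above (failed test logs when a test has no subtests, otherwise only the failed subtests' logs) can be modelled like this. It is a Python sketch under assumed data structures; the real test runner is C and the field names here are hypothetical.

```python
def logs_to_print(test):
    """Return the (name, log) pairs that belong in the error-log summary.

    `test` is a dict with 'name', 'failed', 'log' and an optional
    'subtests' list whose items carry 'name', 'failed' and 'log'.
    """
    subtests = test.get("subtests", [])
    if not subtests:
        # No subtests: print the test's own log, but only on failure.
        return [(test["name"], test["log"])] if test["failed"] else []
    # With subtests: print only the logs of the subtests that failed.
    return [
        (f"{test['name']}/{sub['name']}", sub["log"])
        for sub in subtests
        if sub["failed"]
    ]
```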
Jakub Kicinski
50c6afabfd Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-04-27

We've added 85 non-merge commits during the last 18 day(s) which contain
a total of 163 files changed, 4499 insertions(+), 1521 deletions(-).

The main changes are:

1) Teach libbpf to enhance BPF verifier log with human-readable and relevant
   information about failed CO-RE relocations, from Andrii Nakryiko.

2) Add typed pointer support in BPF maps and enable it for unreferenced pointers
   (via probe read) and referenced ones that can be passed to in-kernel helpers,
   from Kumar Kartikeya Dwivedi.

3) Improve xsk to break NAPI loop when rx queue gets full to allow for forward
   progress to consume descriptors, from Maciej Fijalkowski & Björn Töpel.

4) Fix a small RCU read-side race in BPF_PROG_RUN routines which dereferenced
   the effective prog array before the rcu_read_lock, from Stanislav Fomichev.

5) Implement BPF atomic operations for RV64 JIT, and add libbpf parsing logic
   for USDT arguments under riscv{32,64}, from Pu Lehui.

6) Implement libbpf parsing of USDT arguments under aarch64, from Alan Maguire.

7) Enable bpftool build for musl and remove nftw with FTW_ACTIONRETVAL usage
   so it can be shipped under Alpine which is musl-based, from Dominique Martinet.

8) Clean up {sk,task,inode} local storage trace RCU handling as they do not
   need to use call_rcu_tasks_trace() barrier, from KP Singh.

9) Improve libbpf API documentation and fix error return handling of various
   API functions, from Grant Seltzer.

10) Enlarge offset check for bpf_skb_{load,store}_bytes() helpers given data
    length of frags + frag_list may surpass old offset limit, from Liu Jian.

11) Various improvements to prog_tests in area of logging, test execution
    and by-name subtest selection, from Mykola Lysenko.

12) Simplify map_btf_id generation for all map types by moving this process
    to build time with help of resolve_btfids infra, from Menglong Dong.

13) Fix a libbpf bug in probing when falling back to legacy bpf_probe_read*()
    helpers; the probing always caused the old helpers to be used, from Runqing Yang.

14) Add support for ARCompact and ARCv2 platforms for libbpf's PT_REGS
    tracing macros, from Vladimir Isaev.

15) Clean up BPF selftests to remove old & unneeded rlimit code given kernel
    switched to memcg-based memory accounting a while ago, from Yafang Shao.

16) Refactor of BPF sysctl handlers to move them to BPF core, from Yan Zhu.

17) Fix BPF selftests on two occasions to work around regressions caused by latest
    LLVM to unblock CI until their fixes are worked out, from Yonghong Song.

18) Misc cleanups all over the place, from various others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Add libbpf's log fixup logic selftests
  libbpf: Fix up verifier log for unguarded failed CO-RE relos
  libbpf: Simplify bpf_core_parse_spec() signature
  libbpf: Refactor CO-RE relo human description formatting routine
  libbpf: Record subprog-resolved CO-RE relocations unconditionally
  selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
  libbpf: Avoid joining .BTF.ext data with BPF programs by section name
  libbpf: Fix logic for finding matching program for CO-RE relocation
  libbpf: Drop unhelpful "program too large" guess
  libbpf: Fix anonymous type check in CO-RE logic
  bpf: Compute map_btf_id during build time
  selftests/bpf: Add test for strict BTF type check
  selftests/bpf: Add verifier tests for kptr
  selftests/bpf: Add C tests for kptr
  libbpf: Add kptr type tag macros to bpf_helpers.h
  bpf: Make BTF type match stricter for release arguments
  bpf: Teach verifier about kptr_get kfunc helpers
  bpf: Wire up freeing of referenced kptr
  bpf: Populate pairs of btf_id and destructor kfunc in btf
  bpf: Adapt copy_map_value for multiple offset case
  ...
====================

Link: https://lore.kernel.org/r/20220427224758.20976-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-27 17:09:32 -07:00
Adrian Hunter
52cc784244 perf tools: Delete perf-with-kcore.sh script
It has been obsolete since the introduction of the 'perf record --kcore'
option.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220427141946.269523-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-27 20:11:26 -03:00
Geliang Tang
53f368bfff selftests: mptcp: print extra msg in chk_csum_nr
When multiple checksum errors occur in chk_csum_nr(), print the
number of errors as an extra message.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
Geliang Tang
1f7d325f7d selftests: mptcp: check MP_FAIL response mibs
This patch extends chk_fail_nr to check the MP_FAIL response mibs.

Add a new argument invert for chk_fail_nr so that it can check the
MP_FAIL TX and RX mibs from the opposite direction.

When the infinite map is received before the MP_FAIL response, the
response will be lost. A '-' can be added into fail_tx or fail_rx to
represent that MP_FAIL response TX or RX can be lost when doing the
checks.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:54 +01:00
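One way to read the '-' convention from the commit above is sketched below in Python. The real check lives in a shell selftest; the function names are hypothetical, and the tolerance rule (any count up to the expected one passes when loss is allowed) is an assumption rather than something the commit message spells out.

```python
def parse_expected(value: str):
    """Split an expectation such as "-1" into (count, may_be_lost)."""
    may_be_lost = value.startswith("-")
    return int(value.lstrip("-")), may_be_lost


def mib_ok(observed: int, expected: str) -> bool:
    """Check an observed MIB counter against an expectation string."""
    count, may_be_lost = parse_expected(expected)
    if may_be_lost:
        # The response may never arrive, so anything up to `count` passes.
        return observed <= count
    return observed == count
```

For example, `mib_ok(0, "-1")` accepts a missing response, while `mib_ok(0, "1")` treats it as a failure.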
Geliang Tang
b6e074e171 selftests: mptcp: add infinite map testcase
Add the single subflow test case for MP_FAIL, to test the infinite
mapping case. Use the test_linkfail value to make 128KB test files.

Add a new function reset_with_fail(), which uses 'iptables' and 'tc
action pedit' rules to produce the bit flips that trigger the checksum
failures. Set validate_checksum to enable checksums for the MP_FAIL
tests without passing the '-C' argument. Set check_invert flag to
enable the invert bytes check for the output data in check_transfer().
Instead of the file mismatch error, this test prints out the inverted
bytes.

Add a new function pedit_action_pkts() to get the number of packets
edited by the tc pedit actions. Print these numbers to the output.

Also add the needed kernel config options to the selftests config file.

Suggested-by: Davide Caratti <dcaratti@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-27 10:45:53 +01:00
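The inverted-bytes check mentioned in the commit above can be pictured with the following Python sketch. It assumes that a byte hit by the injected bit flips arrives as the bitwise complement of the byte that was sent; the function name and return values are hypothetical and do not mirror the shell implementation in check_transfer().

```python
def classify_diff(sent: bytes, received: bytes):
    """Compare two transfers.

    Returns a (status, offsets) pair where status is 'ok', 'inverted'
    or 'mismatch' and offsets lists the positions that differ.
    """
    if len(sent) != len(received):
        return "mismatch", []
    diffs = [i for i, (a, b) in enumerate(zip(sent, received)) if a != b]
    if not diffs:
        return "ok", diffs
    # Every differing byte must be the complement of the original.
    if all((sent[i] ^ 0xFF) == received[i] for i in diffs):
        return "inverted", diffs
    return "mismatch", diffs
```

Reporting the offsets of the inverted bytes, rather than a bare file mismatch, matches the intent stated in the commit message.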
Alistair Popple
3527e1ab9a selftests/powerpc: Add matrix multiply assist (MMA) test
Adds a simple test of some basic matrix multiply assist (MMA)
instructions.

Signed-off-by: Alistair Popple <alistair@popple.id.au>
Tested-by: Joel Stanley <joel@jms.id.au>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200622021832.15870-1-alistair@popple.id.au
2022-04-27 16:32:42 +10:00
Andrii Nakryiko
ea4128eb43 selftests/bpf: Add libbpf's log fixup logic selftests
Add tests validating that libbpf is indeed patching up BPF verifier log
with CO-RE relocation details. Also test partial and full truncation
scenarios.

This test might be a bit fragile due to changing BPF verifier log
format. If that proves to be frequently breaking, we can simplify tests
or remove the truncation subtests. But for now it seems useful to test
it in those conditions that are otherwise rarely occurring in practice.

Also test CO-RE relo failure in a subprog as that exercises subprogram CO-RE
relocation mapping logic which doesn't work out of the box without extra
relo storage previously done only for gen_loader case.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-11-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
9fdc4273b8 libbpf: Fix up verifier log for unguarded failed CO-RE relos
Teach libbpf to post-process BPF verifier log on BPF program load
failure and detect known error patterns to provide user with more
context.

Currently there is one such common situation: an "unguarded" failed BPF
CO-RE relocation. While failing CO-RE relocation is expected, it is
expected to be properly guarded in BPF code such that BPF verifier
always eliminates BPF instructions corresponding to such failed CO-RE
relos as dead code. In cases when user failed to take such precautions,
BPF verifier provides the best log it can:

  123: (85) call unknown#195896080
  invalid func unknown#195896080

Such incomprehensible log error is due to libbpf "poisoning" BPF
instruction that corresponds to failed CO-RE relocation by replacing it
with invalid `call 0xbad2310` instruction (195896080 == 0xbad2310 reads
"bad relo" if you squint hard enough).

Luckily, libbpf has all the necessary information to look up CO-RE
relocation that failed and provide more human-readable description of
what's going on:

  5: <invalid CO-RE relocation>
  failed to resolve CO-RE relocation <byte_off> [6] struct task_struct___bad.fake_field_subprog (0:2 @ offset 8)

This hopefully makes it much easier to understand what's wrong with
user's BPF program without googling magic constants.

This BPF verifier log fixup is set up to be extensible and is going to be
used for at least one other upcoming feature of libbpf in follow-up patches.
Libbpf is parsing lines of BPF verifier log starting from the very end.
Currently it processes up to 10 lines of code looking for familiar
patterns. This avoids wasting lots of CPU processing huge verifier logs
(especially for log_level=2 verbosity level). Actual verification error
should normally be found in last few lines, so this should work
reliably.

If libbpf needs to expand log beyond available log_buf_size, it
truncates the end of the verifier log. Given verifier log normally ends
with something like:

  processed 2 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

... truncating this on program load error isn't too bad (end user can
always increase log size, if it needs to get complete log).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-10-andrii@kernel.org
2022-04-26 15:41:46 -07:00
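The tail-scanning idea from the commit above can be sketched as follows. This is a Python model under stated assumptions, not libbpf's C implementation: the function name is hypothetical, and the matching is reduced to a substring search for the poisoned call. The constant and the 10-line limit come from the commit message.

```python
POISON_CALL_IMM = 0xBAD2310  # 195896080, the constant quoted in the commit message
MAX_LINES_SCANNED = 10       # only the tail of the log is inspected


def find_poisoned_insn(verifier_log: str):
    """Return the index of a poisoned call found in the log tail, else None."""
    lines = verifier_log.rstrip("\n").split("\n")
    needle = f"call unknown#{POISON_CALL_IMM}"
    # Walk backwards from the end, since the actual error is normally
    # reported within the last few lines.
    for line in reversed(lines[-MAX_LINES_SCANNED:]):
        if needle in line:
            # Lines look like "123: (85) call unknown#195896080".
            return int(line.split(":", 1)[0])
    return None
```

Once the instruction index is known, it can be matched against the recorded CO-RE relocations to produce the human-readable message shown in the commit.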
Andrii Nakryiko
14032f2644 libbpf: Simplify bpf_core_parse_spec() signature
Simplify bpf_core_parse_spec() signature to take struct bpf_core_relo as
an input instead of requiring callers to decompose them into type_id,
relo, spec_str, etc. This makes using and reusing this helper easier.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-9-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
b58af63aab libbpf: Refactor CO-RE relo human description formatting routine
Refactor how CO-RE relocation is formatted. Now it dumps human-readable
representation, currently used by libbpf in either debug or error
message output during CO-RE relocation resolution process, into provided
buffer. This approach allows for better reuse of this functionality
outside of CO-RE relocation resolution, which we'll use in next patch
for providing better error message for BPF verifier rejecting BPF
program due to unguarded failed CO-RE relocation.

It also gets rid of annoying "stitching" of libbpf_print() calls, which
was the only place where we did this.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-8-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
185cfe837f libbpf: Record subprog-resolved CO-RE relocations unconditionally
Previously, libbpf recorded CO-RE relocations with insns_idx resolved
according to finalized subprog locations (which are appended at the end
of entry BPF program) to simplify the job of light skeleton generator.

This is necessary because once subprogs' instructions are appended to
main entry BPF program all the subprog instruction indices are shifted
and that shift is different for each entry (main) BPF program, so it's
generally impossible to map final absolute insn_idx of the finalized BPF
program to their original locations inside subprograms.

This information is now going to be used not only during light skeleton
generation, but also to map absolute instruction index to subprog's
instruction and its corresponding CO-RE relocation. So start recording
these relocations always, not just when obj->gen_loader is set.

This information is going to be freed at the end of bpf_object__load()
step, as before (but this can change in the future if there will be
a need for this information post load step).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-7-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
b82bb1ffbb selftests/bpf: Add CO-RE relos and SEC("?...") to linked_funcs selftests
Enhance linked_funcs selftest with two tricky features that might not
obviously work correctly together. We add CO-RE relocations to entry BPF
programs and mark those programs as non-autoloadable with SEC("?...")
annotation. This makes sure that libbpf itself handles .BTF.ext CO-RE
relocation data matching correctly for SEC("?...") programs, as well as
ensures that BPF static linker handles this correctly (this was the case
before, no changes are necessary, but it wasn't explicitly tested).

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-6-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
11d5daa892 libbpf: Avoid joining .BTF.ext data with BPF programs by section name
Instead of using ELF section names as a joining key between .BTF.ext and
corresponding BPF programs, pre-build .BTF.ext section number to ELF
section index mapping during bpf_object__open() and use it later for
matching .BTF.ext information (func/line info or CO-RE relocations) to
their respective BPF programs and subprograms.

This simplifies corresponding joining logic and lets libbpf do
manipulations with BPF program's ELF sections like dropping leading '?'
character for non-autoloaded programs. Original joining logic in
bpf_object__relocate_core() (see relevant comment that's now removed)
was never elegant, so it's a good improvement regardless. But it also
avoids unnecessary internal assumptions about preserving original ELF
section name as BPF program's section name (which was broken when
SEC("?abc") support was added).

Fixes: a3820c4811 ("libbpf: Support opting out from autoloading BPF programs declaratively")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-5-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
966a750932 libbpf: Fix logic for finding matching program for CO-RE relocation
Fix the bug in bpf_object__relocate_core() which can lead to finding
invalid matching BPF program when processing CO-RE relocation. If
matching program is not found, last encountered program will be assumed
to be correct program and thus error detection won't detect the problem.
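
The class of bug described above can be illustrated with a small stand-alone
sketch. It is not the actual bpf_object__relocate_core() code; struct prog
and both find_prog_*() helpers are hypothetical names used only to show why
a loop variable that doubles as the result hides the "not found" case.

```c
#include <stddef.h>

/* Hypothetical stand-in for a BPF program keyed by ELF section index. */
struct prog {
	int sec_idx;
};

/*
 * Buggy shape: the cursor is also the result, so when nothing matches
 * the last examined element is returned as if it were a match.
 */
static const struct prog *find_prog_buggy(const struct prog *progs, size_t n,
					  int sec_idx)
{
	const struct prog *prog = NULL;
	size_t i;

	for (i = 0; i < n; i++) {
		prog = &progs[i];
		if (prog->sec_idx == sec_idx)
			break;
	}
	return prog;
}

/* Fixed shape: only a real match is returned, otherwise NULL. */
static const struct prog *find_prog_fixed(const struct prog *progs, size_t n,
					  int sec_idx)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (progs[i].sec_idx == sec_idx)
			return &progs[i];
	}
	return NULL;
}
```

With the fixed shape the caller's existing NULL check is enough to report
the error.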

Fixes: 9c82a63cf3 ("libbpf: Fix CO-RE relocs against .text section")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-4-andrii@kernel.org
2022-04-26 15:41:46 -07:00
Andrii Nakryiko
0994a54c52 libbpf: Drop unhelpful "program too large" guess
libbpf pretends it knows actual limit of BPF program instructions based
on UAPI headers it compiled with. There is neither any guarantee that
UAPI headers match host kernel, nor BPF verifier actually uses
BPF_MAXINSNS constant anymore. Just drop unhelpful "guess", BPF verifier
will emit actual reason for failure in its logs anyways.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-3-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Andrii Nakryiko
afe98d46ba libbpf: Fix anonymous type check in CO-RE logic
Use type name for checking whether CO-RE relocation is referring to
anonymous type. Using spec string makes no sense.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220426004511.2691730-2-andrii@kernel.org
2022-04-26 15:41:45 -07:00
Guo Ren
c86d2cad19 syscalls: compat: Fix the missing part for __SYSCALL_COMPAT
Make "uapi asm unistd.h" usable for architectures' COMPAT
mode. __SYSCALL_COMPAT is first used in riscv.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-8-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:36:01 -07:00
Christoph Hellwig
306f7cc1e9 uapi: always define F_GETLK64/F_SETLK64/F_SETLKW64 in fcntl.h
The F_GETLK64/F_SETLK64/F_SETLKW64 fcntl opcodes are only implemented
for the 32-bit syscall APIs, but are also needed for compat handling
on 64-bit kernels.

Consolidate them in unistd.h instead of defining the internal compat
definitions in compat.h, which is rather error prone (e.g. parisc
gets the values wrong currently).

Note that before this change they were never visible to userspace due
to the fact that CONFIG_64BIT is only set for kernel builds.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-3-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:35:20 -07:00
Christoph Hellwig
9f79b8b723 uapi: simplify __ARCH_FLOCK{,64}_PAD a little
Don't bother to define the symbols empty, just don't use them.
That makes the intent a little more clear.

Remove the unused HAVE_ARCH_STRUCT_FLOCK64 define and merge the
32-bit mips struct flock into the generic one.

Add a new __ARCH_FLOCK_EXTRA_SYSID macro following the style of
__ARCH_FLOCK_PAD to avoid having a separate definition just for
one architecture.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Guo Ren <guoren@kernel.org>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Tested-by: Heiko Stuebner <heiko@sntech.de>
Link: https://lore.kernel.org/r/20220405071314.3225832-2-guoren@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-04-26 13:35:14 -07:00
Adrian Hunter
9e5e641045 perf intel-pt: Add link to the perf wiki's Intel PT page
Add an EXAMPLE section and link to the perf wiki's Intel PT page.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220426133213.248475-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-26 14:32:29 -03:00
Colin Ian King
c7b607fa93 selftests/resctrl: Fix null pointer dereference on open failed
Currently if opening /dev/null fails then file pointer fp
is null and further access to fp via fprintf will cause a null
pointer dereference. Fix this by returning a negative error value
when a null fp is detected.

Detected using cppcheck static analysis:
tools/testing/selftests/resctrl/fill_buf.c:124:6: note: Assuming
that condition '!fp' is not redundant
 if (!fp)
     ^
tools/testing/selftests/resctrl/fill_buf.c:126:10: note: Null
pointer dereference
 fprintf(fp, "Sum: %d ", ret);
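
The shape of the fix can be sketched as below. This is not the fill_buf.c
code itself: write_sum() is a hypothetical helper that only mirrors the
"check fp, return a negative error" pattern described above.

```c
#include <stdio.h>

/*
 * Hypothetical helper: write the sum to 'path', returning a negative
 * value instead of dereferencing a NULL FILE pointer when fopen() fails.
 */
static int write_sum(const char *path, int sum)
{
	FILE *fp = fopen(path, "w");

	if (!fp)
		return -1;

	fprintf(fp, "Sum: %d ", sum);
	fclose(fp);
	return 0;
}
```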

Fixes: a2561b12fe ("selftests/resctrl: Add built in benchmark")
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-26 09:20:00 -06:00
Haowen Bai
ef94b2664a testusb: Fix warning comparing pointer to 0
Avoid comparing a pointer value with 0, to make the code clearer.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Link: https://lore.kernel.org/r/1648088171-30912-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-26 13:32:49 +02:00
ran jianping
dbbf16895a tools/testing/nvdimm: remove unneeded flush_workqueue
All work currently pending will be done first by calling destroy_workqueue,
so there is no need to flush it explicitly.

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: ran jianping <ran.jianping@zte.com.cn>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://lore.kernel.org/r/20220424062655.3221152-1-ran.jianping@zte.com.cn
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-25 22:00:35 -07:00
Kumar Kartikeya Dwivedi
792c0a345f selftests/bpf: Add test for strict BTF type check
Ensure that the edge case where first member type was matched
successfully even if it didn't match BTF type of register is caught and
rejected by the verifier.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-14-memxor@gmail.com
2022-04-25 20:26:45 -07:00
Kumar Kartikeya Dwivedi
05a945deef selftests/bpf: Add verifier tests for kptr
Reuse bpf_prog_test functions to test the support for PTR_TO_BTF_ID in
BPF map case, including some tests that verify implementation sanity and
corner cases.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-13-memxor@gmail.com
2022-04-25 20:26:44 -07:00
Kumar Kartikeya Dwivedi
2cbc469a6f selftests/bpf: Add C tests for kptr
This uses the __kptr and __kptr_ref macros as well, and tries to test
the stuff that is supposed to work, since we have negative tests in
test_verifier suite. Also include some code to test map-in-map support,
such that the inner_map_meta matches the kptr_off_tab of map added as
element.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-12-memxor@gmail.com
2022-04-25 20:26:44 -07:00
Kumar Kartikeya Dwivedi
ef89654f2b libbpf: Add kptr type tag macros to bpf_helpers.h
Include convenience definitions:
__kptr:	Unreferenced kptr
__kptr_ref: Referenced kptr

Users can use them to tag the pointer type meant to be used with the new
support directly in the map value definition.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-11-memxor@gmail.com
2022-04-25 20:26:44 -07:00
Kumar Kartikeya Dwivedi
c0a5a21c25 bpf: Allow storing referenced kptr in map
Extending the code in previous commits, introduce referenced kptr
support, which needs to be tagged using 'kptr_ref' tag instead. Unlike
unreferenced kptr, referenced kptr have a lot more restrictions. In
addition to the type matching, only a newly introduced bpf_kptr_xchg
helper is allowed to modify the map value at that offset. This transfers
the referenced pointer being stored into the map, releasing the
references state for the program, and returning the old value and
creating new reference state for the returned pointer.

Similar to unreferenced pointer case, return value for this case will
also be PTR_TO_BTF_ID_OR_NULL. The reference for the returned pointer
must either be eventually released by calling the corresponding release
function, or be transferred into another map.

It is also allowed to call bpf_kptr_xchg with a NULL pointer, to clear
the value, and obtain the old value if any.

BPF_LDX, BPF_STX, and BPF_ST cannot access referenced kptr. A future
commit will permit using BPF_LDX for such pointers, but will attempt to
make it safe, since the lifetime of the object won't be guaranteed.

There are valid reasons to enforce the restriction of permitting only
bpf_kptr_xchg to operate on referenced kptr. The pointer value must be
consistent in face of concurrent modification, and any prior values
contained in the map must also be released before a new one is moved
into the map. To ensure proper transfer of this ownership, bpf_kptr_xchg
returns the old value, which the verifier would require the user to
either free or move into another map, and releases the reference held
for the pointer being moved in.
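
As a loose user-space analogy for the ownership transfer described above
(and only an analogy: this is plain C11 atomics, not BPF code, and it does
not model the verifier's reference tracking), an exchange-only slot looks
like this. struct obj, struct map_value and slot_xchg() are hypothetical
names.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical object whose ownership is being passed around. */
struct obj {
	int id;
};

/* Hypothetical map value holding one exchange-only pointer slot. */
struct map_value {
	_Atomic(struct obj *) slot;
};

/*
 * Move new_obj into the slot and hand the previous occupant back to the
 * caller, who now owns it and must release it or store it elsewhere.
 * Passing NULL clears the slot.
 */
static struct obj *slot_xchg(struct map_value *v, struct obj *new_obj)
{
	return atomic_exchange(&v->slot, new_obj);
}
```

Because the slot is only ever changed through the exchange, a concurrent
writer can never overwrite a pointer without some caller receiving it.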

In the future, direct BPF_XCHG instruction may also be permitted to work
like bpf_kptr_xchg helper.

Note that process_kptr_func doesn't have to call
check_helper_mem_access, since we already disallow rdonly/wronly flags
for map, which is what check_map_access_type checks, and we already
ensure the PTR_TO_MAP_VALUE refers to kptr by obtaining its off_desc,
so check_map_access is also not required.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-4-memxor@gmail.com
2022-04-25 20:26:05 -07:00
Haowen Bai
a84ca704d8 selftests/powerpc/pmu: Fix unsigned function returning negative constant
The function __perf_reg_mask has an unsigned return type, but returns a
negative constant to indicate an error condition. So we change unsigned
to int.
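
A minimal sketch of why the return type matters, using hypothetical
functions rather than the real __perf_reg_mask(): with an unsigned return
type the -1 wraps to a large positive value, so a "less than zero" error
check in the caller can never fire.

```c
/*
 * Hypothetical example: returning -1 through an unsigned type wraps it
 * to UINT_MAX, so the caller cannot detect the error with "< 0".
 */
static unsigned int reg_mask_unsigned(int valid)
{
	if (!valid)
		return -1;
	return 0x1f;
}

/* With a signed return type the error value stays negative. */
static int reg_mask_int(int valid)
{
	if (!valid)
		return -1;
	return 0x1f;
}
```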

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/1650788802-14402-1-git-send-email-baihaowen@meizu.com
2022-04-26 13:17:00 +10:00
Kumar Kartikeya Dwivedi
8f14852e89 bpf: Tag argument to be released in bpf_func_proto
Add a new type flag for bpf_arg_type that when set tells verifier that
for a release function, that argument's register will be the one for
which meta.ref_obj_id will be set, and which will then be released
using release_reference. To capture the regno, introduce a new field
release_regno in bpf_call_arg_meta.

This would be required in the next patch, where we may either pass NULL
or a refcounted pointer as an argument to the release function
bpf_kptr_xchg. Just releasing only when meta.ref_obj_id is set is not
enough, as there is a case where the type of argument needed matches,
but the ref_obj_id is set to 0. Hence, we must enforce that whenever
meta.ref_obj_id is zero, the register that is to be released can only
be NULL for a release function.

Since we now indicate whether an argument is to be released in
bpf_func_proto itself, is_release_function helper has lost its utility,
hence refactor code to work without it, and just rely on
meta.release_regno to know when to release state for a ref_obj_id.
Still, the restriction of one release argument and only one ref_obj_id
passed to BPF helper or kfunc remains. This may be lifted in the future.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220424214901.2743946-3-memxor@gmail.com
2022-04-25 17:31:35 -07:00
Shaopeng Tan
68c4844985 selftests/resctrl: Add missing SPDX license to Makefile
Add the missing SPDX(SPDX-License-Identifier) license header to
tools/testing/selftests/resctrl/Makefile.

Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:11:48 -06:00
Shaopeng Tan
42e2f21451 selftests/resctrl: Update README about using kselftest framework to build/run resctrl_tests
resctrl_tests can be built or run using kselftests framework.
Add description on how to do so in README.

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:11:41 -06:00
Shaopeng Tan
b733143cc4 selftests/resctrl: Make resctrl_tests run using kselftest framework
In kselftest framework, all tests can be built/run at a time,
and a sub test also can be built/run individually. As follows:
$ make kselftest-all TARGETS=resctrl
$ make -C tools/testing/selftests run_tests
$ make -C tools/testing/selftests TARGETS=resctrl run_tests

However, resctrl_tests cannot be run using kselftest framework,
users have to change directory to tools/testing/selftests/resctrl/,
run "make" to build executable file "resctrl_tests",
and run "sudo ./resctrl_tests" to execute the test.

To build/run resctrl_tests using kselftest framework,
modify tools/testing/selftests/Makefile
and tools/testing/selftests/resctrl/Makefile.

Even after this change, users can still build/run resctrl_tests
without using framework as before.

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> # resctrl changes
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:11:34 -06:00
Shaopeng Tan
3531d930c3 selftests/resctrl: Fix resctrl_tests' return code to work with selftest framework
In kselftest framework, if a sub test cannot run for some reason,
the test result should be marked as SKIP rather than FAIL.
Return KSFT_SKIP(4) instead of KSFT_FAIL(1) if resctrl_tests is not run
as root or it is run on a test environment which does not support resctrl.

 - ksft_exit_fail_msg(): returns KSFT_FAIL(1)
 - ksft_exit_skip(): returns KSFT_SKIP(4)
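
The decision can be sketched as follows. The enum values for fail and skip
are the ones quoted above; the pass value of 0, the SKETCH_* names and
precheck_exit_code() are assumptions made for this illustration, and the
real definitions live in kselftest.h.

```c
#include <stdbool.h>

/* Exit codes as quoted in the commit message (names are hypothetical). */
enum {
	SKETCH_KSFT_PASS = 0,
	SKETCH_KSFT_FAIL = 1,
	SKETCH_KSFT_SKIP = 4,
};

/*
 * Hypothetical pre-check: a test that cannot run at all (not root, or no
 * resctrl support) reports SKIP rather than FAIL.
 */
static int precheck_exit_code(bool is_root, bool resctrl_supported)
{
	if (!is_root || !resctrl_supported)
		return SKETCH_KSFT_SKIP;
	return SKETCH_KSFT_PASS;
}
```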

Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:10:21 -06:00
Shaopeng Tan
e2e3fb6ef0 selftests/resctrl: Change the default limited time to 120 seconds
When testing on an Intel(R) Xeon(R) Gold 6254 CPU @ 3.10GHz the resctrl
selftests fail due to timeout after exceeding the default time limit of
45 seconds. On this system the test takes about 68 seconds.
Since the failing test by default accesses a fixed size of memory, the
execution time should not vary significantly between different environments.
A new default of 120 seconds should be sufficient yet easy to customize
with the introduction of the "settings" file for reference.

Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:06:53 -06:00
Shaopeng Tan
f54b327816 selftests/resctrl: Kill child process before parent process terminates if SIGTERM is received
In kselftest framework, a sub test is run using the timeout utility
and it will send SIGTERM to the test upon timeout.

In resctrl_tests, a child process is created by fork() to
run benchmark but SIGTERM is not set in sigaction().
If SIGTERM signal is received, the parent process will be killed,
but the child process still exists.

Kill child process before the parent process terminates
if SIGTERM signal is received.

Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 17:06:41 -06:00
Shaopeng Tan
d577380da0 selftests/resctrl: Print a message if the result of MBM&CMT tests is failed on Intel CPU
According to "Intel Resource Director Technology (Intel RDT) on
2nd Generation Intel Xeon Scalable Processors Reference Manual",
when the Intel Sub-NUMA Clustering (SNC) feature is enabled,
Intel CMT and MBM counters may not be accurate.

However, there does not seem to be an architectural way to detect
if SNC is enabled.

If the result of MBM&CMT test fails on Intel CPU,
print a message to let users know a possible cause of failure.

Acked-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 16:58:01 -06:00
Shaopeng Tan
6220f69e72 selftests/resctrl: Extend CPU vendor detection
Currently, the resctrl_tests only has a function to detect AMD vendor.
Since Intel CMT and MBM counters may not be accurate when the
Intel Sub-NUMA Clustering feature is enabled,
resctrl_tests also needs a function to detect Intel vendor.
And in the future, resctrl_tests will need a function to detect different
vendors, such as Arm.

Extend the function to detect Intel vendor as well. Also,
this function can be easily extended to detect other vendors.
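
One way such an extensible lookup could look is a small table keyed by the
CPUID vendor string. This is a sketch under assumptions, not the
resctrl_tests code: enum cpu_vendor and vendor_from_id() are hypothetical
names, and only the well-known "GenuineIntel" and "AuthenticAMD" strings
are listed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical vendor identifiers for this sketch. */
enum cpu_vendor {
	VENDOR_UNKNOWN = 0,
	VENDOR_INTEL,
	VENDOR_AMD,
};

/* Map a vendor ID string to an enum value; unknown strings map to 0. */
static enum cpu_vendor vendor_from_id(const char *vendor_id)
{
	static const struct {
		const char *id;
		enum cpu_vendor vendor;
	} table[] = {
		{ "GenuineIntel", VENDOR_INTEL },
		{ "AuthenticAMD", VENDOR_AMD },
	};
	size_t i;

	for (i = 0; i < sizeof(table) / sizeof(table[0]); i++) {
		if (strcmp(vendor_id, table[i].id) == 0)
			return table[i].vendor;
	}
	return VENDOR_UNKNOWN;
}
```

Supporting another vendor then only means adding one enum value and one
table row.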

Signed-off-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com>
Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 16:57:50 -06:00
Dominique Martinet
246bdfa52f bpftool, musl compat: Replace sys/fcntl.h by fcntl.h
musl does not like including sys/fcntl.h directly:

    [...]
    1 | #warning redirecting incorrect #include <sys/fcntl.h> to <fcntl.h>
    [...]

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220424051022.2619648-5-asmadeus@codewreck.org
2022-04-25 23:24:28 +02:00
Dominique Martinet
93bc2e9e94 bpftool, musl compat: Replace nftw with FTW_ACTIONRETVAL
musl nftw implementation does not support FTW_ACTIONRETVAL. There have been
multiple attempts at pushing the feature in musl upstream, but it has been
refused or ignored every time:

  https://www.openwall.com/lists/musl/2021/03/26/1
  https://www.openwall.com/lists/musl/2022/01/22/1

In this case we only care about /proc/<pid>/fd/<fd>, so it's not too difficult
to reimplement directly instead, and the new implementation makes 'bpftool perf'
slightly faster because it doesn't needlessly stat/readdir unneeded directories
(54ms -> 13ms on my machine).
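
A direct walk over /proc/<pid>/fd can be sketched with opendir()/readdir()
alone. This is not the bpftool implementation; count_open_fds() is a
hypothetical helper that only counts the numeric entries, where real code
would inspect each descriptor.

```c
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: count the numeric entries of /proc/<pid>/fd.
 * Returns -1 if the directory cannot be opened.
 */
static int count_open_fds(pid_t pid)
{
	char path[64];
	struct dirent *ent;
	int count = 0;
	DIR *dir;

	snprintf(path, sizeof(path), "/proc/%d/fd", (int)pid);
	dir = opendir(path);
	if (!dir)
		return -1;

	while ((ent = readdir(dir)) != NULL) {
		char *end;

		/* Skip "." and ".."; every other entry is a decimal fd number. */
		if (ent->d_name[0] == '.')
			continue;
		(void)strtol(ent->d_name, &end, 10);
		if (*end == '\0')
			count++;
	}
	closedir(dir);
	return count;
}
```

Only one directory level is read per process, which is where the saving
over a generic tree walk comes from.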

Signed-off-by: Dominique Martinet <asmadeus@codewreck.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220424051022.2619648-4-asmadeus@codewreck.org
2022-04-25 23:24:16 +02:00
Reinette Chatre
170d1c23f2 selftests/x86/corrupt_xstate_header: Use provided __cpuid_count() macro
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.

Remove the local CPUID wrapper and use __cpuid_count()
from kselftest.h instead.

__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be supported with gcc v3.2,
the minimal required version for stable kernels.

Cc: Andy Lutomirski <luto@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 15:13:03 -06:00
Reinette Chatre
2ba8a7abb5 selftests/x86/amx: Use provided __cpuid_count() macro
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.

Remove the local CPUID wrapper and use __cpuid_count()
from kselftest.h instead.

__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be supported with gcc v3.2,
the minimal required version for stable kernels.

Cc: Chang S. Bae <chang.seok.bae@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: x86@kernel.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 15:12:58 -06:00
Reinette Chatre
0dba8dae6b selftests/vm/pkeys: Use provided __cpuid_count() macro
kselftest.h makes the __cpuid_count() macro available
to conveniently call the CPUID instruction.

Remove the local CPUID wrapper and use __cpuid_count()
from already included kselftest.h instead.

__cpuid_count() from kselftest.h is used instead of the
macro provided by the compiler since gcc v4.4 (via cpuid.h)
because the selftest needs to be compiled with gcc v3.2,
the minimal required version for stable kernels.

Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Sandipan Das <sandipan@linux.ibm.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: "Desnes A. Nunes do Rosario" <desnesn@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Thiago Jung Bauermann <bauerman@linux.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Suchanek <msuchanek@suse.de>
Cc: linux-mm@kvack.org
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 15:12:52 -06:00
Reinette Chatre
a23039c730 selftests: Provide local define of __cpuid_count()
Some selftests depend on information provided by the CPUID instruction.
To support this dependency the selftests implement private wrappers for
CPUID.

Duplication of the CPUID wrappers should be avoided.

Both gcc and clang/LLVM provide __cpuid_count() macros but neither
the macro nor its header file are available in all the compiler
versions that need to be supported by the selftests. __cpuid_count()
as provided by gcc is available starting with gcc v4.4, so it is
not available if the latest tests need to be run in all the
environments required to support kernels v4.9 and v4.14 that
have the minimal required gcc v3.2.

Duplicate gcc's __cpuid_count() macro to provide a centrally defined
macro for __cpuid_count() to help eliminate the duplicate CPUID wrappers
while continuing to compile in older environments.
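
For illustration, a macro in the style of gcc's __cpuid_count() and one use
of it might look like the sketch below. The names sketch_cpuid_count() and
cpu_vendor_string() are hypothetical, chosen to avoid clashing with
cpuid.h, and the non-x86 fallback is an addition made only so the example
builds everywhere.

```c
#include <string.h>

#if defined(__i386__) || defined(__x86_64__)
/* Same shape as gcc's __cpuid_count(): leaf in EAX, subleaf in ECX. */
#define sketch_cpuid_count(level, count, a, b, c, d)			\
	__asm__ __volatile__ ("cpuid\n\t"				\
			      : "=a" (a), "=b" (b), "=c" (c), "=d" (d)	\
			      : "0" (level), "2" (count))
#endif

/*
 * Fill 'out' with the 12-character vendor string from CPUID leaf 0.
 * Returns 0 on x86, -1 (and an empty string) on other architectures.
 */
static int cpu_vendor_string(char out[13])
{
#if defined(__i386__) || defined(__x86_64__)
	unsigned int eax, ebx, ecx, edx;

	sketch_cpuid_count(0, 0, eax, ebx, ecx, edx);
	(void)eax;
	/* Leaf 0 returns the vendor string in EBX, EDX, ECX order. */
	memcpy(out, &ebx, 4);
	memcpy(out + 4, &edx, 4);
	memcpy(out + 8, &ecx, 4);
	out[12] = '\0';
	return 0;
#else
	out[0] = '\0';
	return -1;
#endif
}
```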

Suggested-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 15:12:36 -06:00
Yuanchu Xie
678f0cdc57 selftests/damon: add damon to selftests root Makefile
Currently the damon selftests are not built with the rest of the
selftests. We add damon to the list of targets.

Fixes: b348eb7abd ("mm/damon: add user space selftests")
Reviewed-by: SeongJae Park <sj@kernel.org>
Signed-off-by: Yuanchu Xie <yuanchu@google.com>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 13:36:17 -06:00
David Vernet
5c26993c31 cgroup: Add config file to cgroup selftest suite
Most of the test suites in tools/testing/selftests contain a config file
that specifies which kernel config options need to be present in order for
the test suite to be able to run and perform meaningful validation. There
is no config file for the tools/testing/selftests/cgroup test suite, so
this patch adds one.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
a79906570f cgroup: Add test_cpucg_max_nested() testcase
The cgroup cpu controller selftests have a test_cpucg_max() testcase
that validates the behavior of the cpu.max knob. Let's also add a
testcase that verifies that the behavior works correctly when set on a
nested cgroup.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
889ab8113e cgroup: Add test_cpucg_max() testcase
The cgroup cpu controller test suite has a number of testcases that
validate the expected behavior of the cpu.weight knob, but none for
cpu.max. This patch fixes that by adding a testcase for cpu.max as well.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
89ca0efa84 cgroup: Add test_cpucg_nested_weight_underprovisioned() testcase
The cgroup cpu controller test suite currently contains a testcase called
test_cpucg_nested_weight_overprovisioned() which verifies the expected
behavior of cpu.weight when applied to nested cgroups. That first testcase
validated the expected behavior when the processes in the leaf cgroups
overcommitted the system. This patch adds a complementary
test_cpucg_nested_weight_underprovisioned() testcase which validates
behavior when those leaf cgroups undercommit the system.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
David Vernet
b76ee4f576 cgroup: Adding test_cpucg_nested_weight_overprovisioned() testcase
The cgroup cpu controller tests in
tools/testing/selftests/cgroup/test_cpu.c have some testcases that validate
the expected behavior of setting cpu.weight on cgroups, and then hogging
CPUs. What is still missing from the suite is a testcase that validates
nested cgroups. This patch adds test_cpucg_nested_weight_overprovisioned(),
which validates that a parent's cpu.weight will override its children if
they overcommit a host, and properly protect any sibling groups of that
parent.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-25 07:27:31 -10:00
Karthik Alapati
ea1d15a067 selftests/binderfs: Improve message to provide more info
Currently the binderfs test says what failure it encountered
without saying why it may have occurred when it fails to mount
binderfs. So, warn about enabling CONFIG_ANDROID_BINDERFS in the
running kernel.

Signed-off-by: Karthik Alapati <mail@karthek.com>
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-25 10:04:42 -06:00
Yuntao Wang
003fed595c libbpf: Remove unnecessary type cast
The link variable is already of type 'struct bpf_link *', casting it to
'struct bpf_link *' is redundant, drop it.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220424143420.457082-1-ytcoode@gmail.com
2022-04-25 17:39:16 +02:00
Jiri Pirko
002defd576 selftests: mlxsw: Check device info on activated line card
Once a line card is activated, check that the device FW version is exposed.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:42:29 +01:00
Jiri Pirko
08682c9e58 selftests: mlxsw: Check line card info on provisioned line card
Once line card is provisioned, check if HW revision and INI version
are exposed.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:42:28 +01:00
Jiri Pirko
5e22298918 selftests: mlxsw: Check devices on provisioned line card
Once line card is provisioned, check the count of devices on it and
print them out.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-25 10:42:28 +01:00
Mark Brown
c92b576a13 selftests: alsa: Start validating control names
Not much of a test but we keep on getting problems with boolean controls
not being called Switches so let's add a few basic checks to help people
spot problems.
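
The kind of check described here can be sketched as follows. This is only an illustration: it assumes the convention that a boolean control's name ends in " Switch", and the function name and sample control names are hypothetical rather than taken from the selftest.

```python
def boolean_control_name_ok(name: str) -> bool:
    """Return True if a boolean control follows the assumed '... Switch' naming convention."""
    return name.endswith(" Switch")

# Hypothetical control names, for illustration only.
names = ["Headphone Playback Switch", "Mic Boost Enable"]
misnamed = [n for n in names if not boolean_control_name_ok(n)]
```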

Signed-off-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/r/20220421115020.14118-1-broonie@kernel.org
Signed-off-by: Takashi Iwai <tiwai@suse.de>
2022-04-25 07:52:05 +02:00
Arnaldo Carvalho de Melo
e0c1b8f9eb Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes, such as the llvm one for ubuntu:22.04.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24 07:50:49 -03:00
Adrian Hunter
4bbac9a1f5 libperf evsel: Factor out perf_evsel__ioctl()
Factor out perf_evsel__ioctl() so it can be reused.

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220422162402.147958-2-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24 07:50:38 -03:00
Zhengjun Xing
d7e3c39708 perf stat: Support hybrid --topdown option
cpu_core and cpu_atom have different topdown event groups.

For cpu_core, --topdown is equivalent to:

"{slots,cpu_core/topdown-retiring/,cpu_core/topdown-bad-spec/,
  cpu_core/topdown-fe-bound/,cpu_core/topdown-be-bound/,
  cpu_core/topdown-heavy-ops/,cpu_core/topdown-br-mispredict/,
  cpu_core/topdown-fetch-lat/,cpu_core/topdown-mem-bound/}"

For cpu_atom, --topdown is equivalent to:

"{cpu_atom/topdown-retiring/,cpu_atom/topdown-bad-spec/,
 cpu_atom/topdown-fe-bound/,cpu_atom/topdown-be-bound/}"

To simplify the implementation, on hybrid systems --topdown is used
together with --cputype. Without --cputype, it uses the cpu_core
topdown events by default.

  # ./perf stat --topdown -a  sleep 1
  WARNING: default to use cpu_core topdown events

   Performance counter stats for 'system wide':

              retiring      bad speculation       frontend bound        backend bound     heavy operations     light operations    branch mispredict       machine clears        fetch latency      fetch bandwidth         memory bound           Core bound
                  4.1%                 0.0%                 5.1%                90.8%                 2.3%                 1.8%                 0.0%                 0.0%                 4.2%                 0.9%                 9.9%                81.0%

         1.002624229 seconds time elapsed

  # ./perf stat --topdown -a --cputype atom  sleep 1

   Performance counter stats for 'system wide':

              retiring      bad speculation       frontend bound        backend bound
                 13.5%                 0.1%                31.2%                55.2%

         1.002366987 seconds time elapsed

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-3-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-24 07:50:18 -03:00
Linus Torvalds
45ab9400e7 perf tools fixes for v5.18: 3rd batch
- Fix header include for LLVM >= 14 when building with libclang.
 
 - Allow access to 'data_src' for auxtrace in 'perf script' with ARM SPE perf.data
   files, fixing processing data with such attributes.
 
 - Fix error message for test case 71 ("Convert perf time to TSC") on s390, where
   it is not supported.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-tools-fixes-for-v5.18-2022-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix header include for LLVM >= 14 when building with libclang.

 - Allow access to 'data_src' for auxtrace in 'perf script' with ARM SPE
   perf.data files, fixing processing data with such attributes.

 - Fix error message for test case 71 ("Convert perf time to TSC") on
   s390, where it is not supported.

* tag 'perf-tools-fixes-for-v5.18-2022-04-22' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf test: Fix error message for test case 71 on s390, where it is not supported
  perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event
  perf script: Always allow field 'data_src' for auxtrace
  perf clang: Fix header include for LLVM >= 14
2022-04-23 09:36:23 -07:00
Vladimir Oltean
07c8a2dd69 selftests: drivers: dsa: add a subset of forwarding selftests
This adds an initial subset of forwarding selftests which I considered
to be relevant for DSA drivers, along with a forwarding.config that
makes it easier to run them (disables veth pair creation, makes sure MAC
addresses are unique and stable).

The intention is to request driver writers to run these selftests during
review and make sure that the tests pass, or at least that the problems
are known.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Vladimir Oltean
90b9566aa5 selftests: forwarding: add a test for local_termination.sh
This tests the capability of switch ports to filter out undesired
traffic. Different drivers are expected to have different capabilities
here (so some may fail and some may pass), yet the test still has some
value, for example to check for regressions.

There are 2 kinds of failures, one is when a packet which should have
been accepted isn't (and that should be fixed), and the other "failure"
(as reported by the test) is when a packet could have been filtered out
(for being unnecessary) yet it was received.

The bridge driver fares particularly badly at this test:

TEST: br0: Unicast IPv4 to primary MAC address                      [ OK ]
TEST: br0: Unicast IPv4 to macvlan MAC address                      [ OK ]
TEST: br0: Unicast IPv4 to unknown MAC address                      [FAIL]
        reception succeeded, but should have failed
TEST: br0: Unicast IPv4 to unknown MAC address, promisc             [ OK ]
TEST: br0: Unicast IPv4 to unknown MAC address, allmulti            [FAIL]
        reception succeeded, but should have failed
TEST: br0: Multicast IPv4 to joined group                           [ OK ]
TEST: br0: Multicast IPv4 to unknown group                          [FAIL]
        reception succeeded, but should have failed
TEST: br0: Multicast IPv4 to unknown group, promisc                 [ OK ]
TEST: br0: Multicast IPv4 to unknown group, allmulti                [ OK ]
TEST: br0: Multicast IPv6 to joined group                           [ OK ]
TEST: br0: Multicast IPv6 to unknown group                          [FAIL]
        reception succeeded, but should have failed
TEST: br0: Multicast IPv6 to unknown group, promisc                 [ OK ]
TEST: br0: Multicast IPv6 to unknown group, allmulti                [ OK ]

mainly because it does not implement IFF_UNICAST_FLT. Yet I still think
having the test (with the failures) is useful in case somebody wants to
tackle that problem in the future, to make an easy before-and-after
comparison.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Vladimir Oltean
476a4f05d9 selftests: forwarding: add a no_forwarding.sh test
Bombard a standalone switch port with various kinds of traffic to ensure
it is really standalone and doesn't leak packets to other switch ports.
Also check for switch ports in different bridges, and switch ports in a
VLAN-aware bridge but having different pvids. No forwarding should take
place in either case.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Vladimir Oltean
a5114df6c6 selftests: forwarding: add helper for retrieving IPv6 link-local address of interface
Pinging an IPv6 link-local multicast address selects the link-local
unicast address of the interface as source, and we'd like to monitor for
that in tcpdump.

Add a helper to the forwarding library which retrieves the link-local
IPv6 address of an interface, to make that task easier.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Vladimir Oltean
f23cddc722 selftests: forwarding: add helpers for IP multicast group joins/leaves
Extend the forwarding library with calls to some small C programs which
join an IP multicast group and send some packets to it. Both IPv4 and
IPv6 groups are supported. Use cases range from testing IGMP/MLD
snooping, to RX filtering, to multicast routing.

Testing multicast traffic using msend/mreceive is intended to be done
using tcpdump.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Joachim Wiberg
6182c5c509 selftests: forwarding: multiple instances in tcpdump helper
Extend tcpdump_start() & Co. to handle multiple instances.  Useful when
observing bridge operation, e.g., unicast learning/flooding, and any
case of multicast distribution (to these ports but not that one ...).

This means the interface argument is now a mandatory argument to all
tcpdump_*() functions, hence the changes to the ocelot flower test.

Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Joachim Wiberg
fe32dffdcd selftests: forwarding: add TCPDUMP_EXTRA_FLAGS to lib.sh
For some use-cases we may want to change the tcpdump flags used in
tcpdump_start().  For instance, observing interfaces without the PROMISC
flag, e.g. to see what's really being forwarded to the bridge interface.

Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Vladimir Oltean
b343734ee2 selftests: forwarding: add option to run tests with stable MAC addresses
By default, DSA switch ports inherit their MAC address from the DSA
master.

This works well for practical situations, but some selftests like
bridge_vlan_unaware.sh loop back 2 standalone DSA ports with 2 bridged
DSA ports, and require the bridge to forward packets between the
standalone ports.

Due to the bridge seeing that the MAC DA it needs to forward is present
as a local FDB entry (it coincides with the MAC address of the bridge
ports), the test packets are not forwarded, but terminated locally on
br0. In turn, this makes the ping and ping6 tests fail.

Address this by introducing an option to have stable MAC addresses.
When mac_addr_prepare is called, the current addresses of the netifs are
saved and replaced with 00:01:02:03:04:${netif number}. Then when
mac_addr_restore is called at the end of the test, the original MAC
addresses are restored. This ensures that the MAC addresses are unique,
which makes the test pass even for DSA ports.

The usage model is for the behavior to be opt-in via STABLE_MAC_ADDRS,
which DSA should set to true, all others behave as before. By hooking
the calls to mac_addr_prepare and mac_addr_restore within the forwarding
lib itself, we do not need to patch each individual selftest, the only
requirement is that pre_cleanup is called.
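
The save/replace/restore cycle described above can be modeled roughly as below. This is a Python sketch of the idea, not the lib.sh code: it assumes interfaces are numbered from 1 and that the last octet is the number formatted as two hex digits, and the interface names and inherited address are hypothetical.

```python
def stable_mac(netif_number: int) -> str:
    """Build the per-interface address 00:01:02:03:04:<netif number>."""
    return f"00:01:02:03:04:{netif_number:02x}"

def mac_addr_prepare(current):
    """Save the current addresses and return (saved, replacement) mappings."""
    saved = dict(current)
    replaced = {ifname: stable_mac(i) for i, ifname in enumerate(current, start=1)}
    return saved, replaced

def mac_addr_restore(saved):
    """Return the original addresses recorded by mac_addr_prepare()."""
    return dict(saved)

# Hypothetical DSA ports that all inherited the same address from the DSA master.
ports = {"swp1": "de:ad:be:ef:00:01", "swp2": "de:ad:be:ef:00:01"}
saved, during_test = mac_addr_prepare(ports)
after_test = mac_addr_restore(saved)
```

During the test every port has a distinct address, so the bridge no longer finds the destination MAC among its local FDB entries; afterwards the inherited addresses are back in place.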

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 12:18:16 +01:00
Geliang Tang
8bd03be341 selftests: mptcp: add infinite map mibs check
This patch adds a function chk_infi_nr() to check the mibs for the
infinite mapping. Invoke it in chk_join_nr() when validate_checksum
is set.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-23 11:51:05 +01:00
Linus Torvalds
bb4ce2c658 RISC-V:
* Remove 's' & 'u' as valid ISA extension
 
 * Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
 
 x86:
 
 * Fix NMI watchdog in guests on AMD
 
 * Fix for SEV cache incoherency issues
 
 * Don't re-acquire SRCU lock in complete_emulated_io()
 
 * Avoid NULL pointer deref if VM creation fails
 
 * Fix race conditions between APICv disabling and vCPU creation
 
 * Bugfixes for disabling of APICv
 
 * Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
 
 selftests:
 
 * Do not use bitfields larger than 32-bits, they differ between GCC and clang

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "The main and larger change here is a workaround for AMD's lack of
  cache coherency for encrypted-memory guests.

  I have another patch pending, but it's waiting for review from the
  architecture maintainers.

  RISC-V:

   - Remove 's' & 'u' as valid ISA extension

   - Do not allow disabling the base extensions 'i'/'m'/'a'/'c'

  x86:

   - Fix NMI watchdog in guests on AMD

   - Fix for SEV cache incoherency issues

   - Don't re-acquire SRCU lock in complete_emulated_io()

   - Avoid NULL pointer deref if VM creation fails

   - Fix race conditions between APICv disabling and vCPU creation

   - Bugfixes for disabling of APICv

   - Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume

  selftests:

   - Do not use bitfields larger than 32-bits, they differ between GCC
     and clang"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  kvm: selftests: introduce and use more page size-related constants
  kvm: selftests: do not use bitfields larger than 32-bits for PTEs
  KVM: SEV: add cache flush to solve SEV cache incoherency issues
  KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
  KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
  KVM: selftests: Silence compiler warning in the kvm_page_table_test
  KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
  x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
  KVM: SPDX style and spelling fixes
  KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
  KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
  KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
  KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
  KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
  KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
  KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
  KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
  RISC-V: KVM: Restrict the extensions that can be disabled
  RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
2022-04-22 17:58:36 -07:00
Jason A. Donenfeld
00f3d2ed9d wireguard: selftests: enable ACPI for SMP
It turns out that by having CONFIG_ACPI=n, we've been failing to boot
additional CPUs, and so these systems were functionally UP. The code
bloat is unfortunate for build times, but I don't see an alternative. So
this commit sets CONFIG_ACPI=y for x86_64 and i686 configs.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-22 15:59:05 -07:00
Andrii Nakryiko
fd0493a1e4 selftests/bpf: Switch fexit_stress to bpf_link_create() API
Use bpf_link_create() API in fexit_stress test to attach FEXIT programs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-4-andrii@kernel.org
2022-04-23 00:37:02 +02:00
Andrii Nakryiko
8462e0b46f libbpf: Teach bpf_link_create() to fallback to bpf_raw_tracepoint_open()
Teach bpf_link_create() to fall back to bpf_raw_tracepoint_open() on
older kernels for programs that are attachable through
BPF_RAW_TRACEPOINT_OPEN. This makes bpf_link_create() a more unified and
convenient interface for creating bpf_link-based attachments.

With this approach end users can just use bpf_link_create() for
tp_btf/fentry/fexit/fmod_ret/lsm program attachments without needing to
care about kernel support, as libbpf will handle this transparently. On
the other hand, as newer features (like BPF cookie) are added to
LINK_CREATE interface, they will be readily usable through the same
bpf_link_create() API without any major refactoring from user's
standpoint.

bpf_program__attach_btf_id() is now using bpf_link_create() internally
as well and will take advantage of this unified interface when BPF
cookie is added for fentry/fexit.

Doing proactive feature detection of LINK_CREATE support for
fentry/tp_btf/etc is quite involved. It requires parsing vmlinux BTF,
determining some stable target BTF type that is guaranteed to be in all
kernel versions (either raw tracepoint or fentry target function),
actually attaching this program and thus potentially affecting the
performance of the host kernel briefly, etc. So instead we are taking a
much simpler "lazy" approach of falling back to the
bpf_raw_tracepoint_open() call only if the initial LINK_CREATE command
fails. For modern kernels this will mean zero added overhead, while
older kernels will incur minimal overhead with a single fast-failing
LINK_CREATE call.
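
The "lazy" approach can be sketched as a try-then-fall-back pattern. The Python below is a model of the control flow only, not libbpf code: the two callables are hypothetical stand-ins for the LINK_CREATE and BPF_RAW_TRACEPOINT_OPEN commands, and treating EINVAL as the "old kernel" signal is an assumption made for the example.

```python
import errno

def link_create_with_fallback(link_create, raw_tracepoint_open, prog_fd):
    """Try LINK_CREATE first; fall back to the legacy call only if it fails."""
    try:
        return link_create(prog_fd)
    except OSError as err:
        # Assumption: an old kernel rejects the command with EINVAL.
        if err.errno != errno.EINVAL:
            raise
        return raw_tracepoint_open(prog_fd)

# Hypothetical stubs standing in for an old and a modern kernel.
def old_kernel_link_create(prog_fd):
    raise OSError(errno.EINVAL, "LINK_CREATE not supported for this program type")

def modern_link_create(prog_fd):
    return 100 + prog_fd

def legacy_open(prog_fd):
    return 200 + prog_fd
```

On the modern path the fallback is never invoked, which corresponds to the "zero added overhead" case; on the old path the cost is a single failed call before the legacy one.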

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Kui-Feng Lee <kuifeng@fb.com>
Link: https://lore.kernel.org/bpf/20220421033945.3602803-3-andrii@kernel.org
2022-04-23 00:37:02 +02:00
Thomas Richter
5bb017d4b9 perf test: Fix error message for test case 71 on s390, where it is not supported
Test case 71 'Convert perf time to TSC' is not supported on s390.

Subtest 71.1 is skipped with the correct message, but subtest 71.2 is
not skipped and fails.

The root cause is function evlist__open() called from
test__perf_time_to_tsc().  evlist__open() returns -ENOENT because the
event cycles:u is not supported by the selected PMU, for example
platform s390 on z/VM or an x86_64 virtual machine.

The PMU driver returns -ENOENT in this case. This error leads to the
failure.

Fix this by returning TEST_SKIP on -ENOENT.
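
A minimal sketch of the mapping, written in Python for illustration. The numeric result codes are assumed to mirror the perf test harness values, and result_from_open() is a hypothetical helper, not a function in the perf sources.

```python
import errno

# Result codes assumed to mirror perf's test harness values.
TEST_OK, TEST_FAIL, TEST_SKIP = 0, -1, -2

def result_from_open(ret: int) -> int:
    """Translate a negative-errno return from opening the event into a test result."""
    if ret == -errno.ENOENT:
        return TEST_SKIP  # event not supported by the selected PMU
    if ret < 0:
        return TEST_FAIL
    return TEST_OK
```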

Output before:
 71: Convert perf time to TSC:
 71.1: TSC support:             Skip (This architecture does not support)
 71.2: Perf time to TSC:        FAILED!

Output after:
 71: Convert perf time to TSC:
 71.1: TSC support:             Skip (This architecture does not support)
 71.2: Perf time to TSC:        Skip (perf_read_tsc_conversion is not supported)

This also happens on an x86_64 virtual machine:
   # uname -m
   x86_64
   $ ./perf test -F 71
    71: Convert perf time to TSC  :
    71.1: TSC support             : Ok
    71.2: Perf time to TSC        : FAILED!
   $

Committer testing:

Continues to work on x86_64:

  $ perf test 71
   71: Convert perf time to TSC    :
   71.1: TSC support               : Ok
   71.2: Perf time to TSC          : Ok
  $

Fixes: 290fa68bdc ("perf test tsc: Fix error message when not supported")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chengdong Li <chengdongli@tencent.com>
Cc: chengdongli@tencent.com
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220420062921.1211825-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 18:39:34 -03:00
Leo Yan
ccb17caecf perf report: Set PERF_SAMPLE_DATA_SRC bit for Arm SPE event
Since commit bb30acae4c ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't report results if the PERF_SAMPLE_DATA_SRC bit is missing from the
sample type.

Commit ffab487052 ("perf: arm-spe: Fix perf report
--mem-mode") partially fixes the issue.  It adds the PERF_SAMPLE_DATA_SRC
bit for the Arm SPE event, which allows a perf data file generated by
kernel v5.18-rc1 or later to be reported properly.

On the other hand, the perf tool is still not backward compatible with
a data file recorded by an older version of perf which contains Arm
SPE trace data.  This patch is a workaround in the reporting phase: when
it detects an Arm SPE PMU event without the PERF_SAMPLE_DATA_SRC bit, it
forces the bit to be set in the sample type and prints a warning.

Fixes: bb30acae4c ("perf report: Bail out --mem-mode if mem info is not available")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Link: https://lore.kernel.org/r/20220414123201.842754-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 18:39:34 -03:00
Leo Yan
c6d8df0106 perf script: Always allow field 'data_src' for auxtrace
Using the command 'perf script -F,+data_src' to dump memory samples with
Arm SPE trace data reports an error:

  # perf script -F,+data_src
  Samples for 'dummy:u' event do not have DATA_SRC attribute set. Cannot print 'data_src' field.

This is because the 'dummy:u' event lacks the DATA_SRC bit in its sample
type, so if a file contains AUX area tracing data then always allow
field 'data_src' to be selected as an option for perf script.

Fixes: e55ed3423c ("perf arm-spe: Synthesize memory event")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220417114837.839896-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 18:39:34 -03:00
Guilherme Amadio
d22588d73b perf clang: Fix header include for LLVM >= 14
The header TargetRegistry.h has moved in LLVM/clang 14.

Committer notes:

The problem was noticed when building in ubuntu:22.04:

    90    98.61 ubuntu:22.04                  : FAIL gcc version 11.2.0 (Ubuntu 11.2.0-19ubuntu1)
      util/c++/clang.cpp:23:10: fatal error: llvm/Support/TargetRegistry.h: No such file or directory
         23 | #include "llvm/Support/TargetRegistry.h"
            |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      compilation terminated.

Fixed after applying this patch.

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Signed-off-by: Guilherme Amadio <amadio@gentoo.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://twitter.com/GuilhermeAmadio/status/1514970524232921088
Link: http://lore.kernel.org/lkml/Ylp0M/VYgHOxtcnF@gentoo.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 18:39:34 -03:00
David Vernet
4ab93063c8 cgroup: Add test_cpucg_weight_underprovisioned() testcase
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_underprovisioned()
that verifies that processes with different cpu.weight that are all running
on an underprovisioned system still get roughly the same amount of cpu
time.

Because test_cpucg_weight_underprovisioned() is very similar to
test_cpucg_weight_overprovisioned(), this patch also pulls the common logic
into a separate helper function that is invoked from both testcases, and
which uses function pointers to invoke the unique portions of the
testcases.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
6376b22cd0 cgroup: Add test_cpucg_weight_overprovisioned() testcase
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_weight_overprovisioned()
that verifies the expected behavior of creating multiple processes with
different cpu.weight, on a system that is overprovisioned.

So as to avoid code duplication, this patch also updates cpu_hog_func_param
to take a new hog_clock_type enum which informs how time is counted in
hog_cpus_timed() (either process time or wall clock time).
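
The distinction between the two clock types can be illustrated with the Python sketch below. It is an analogy rather than the C helper itself: HogClockType and hog_cpu_timed() are hypothetical names, and Python's time.process_time() and time.monotonic() stand in for whatever clocks the selftest uses.

```python
import enum
import time

class HogClockType(enum.Enum):
    PROCESS = "process"  # CPU time consumed by this process
    WALL = "wall"        # elapsed wall clock time

def hog_cpu_timed(seconds: float, clock_type: HogClockType) -> float:
    """Spin until `seconds` have passed on the chosen clock; return the time counted."""
    clock = time.process_time if clock_type is HogClockType.PROCESS else time.monotonic
    start = clock()
    while clock() - start < seconds:
        pass  # busy loop: burn CPU
    return clock() - start
```

With the process clock, a hog that is throttled or preempted keeps running until it has actually consumed the requested CPU time; with the wall clock it stops after the requested real time no matter how much CPU it received.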

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
3c879a1bb8 cgroup: Add test_cpucg_stats() testcase to cgroup cpu selftests
test_cpu.c includes testcases that validate the cgroup cpu controller.
This patch adds a new testcase called test_cpucg_stats() that verifies the
expected behavior of the cpu.stat interface. In doing so, we define a
new hog_cpus_timed() function which takes a cpu_hog_func_param struct
that configures how many CPUs it uses, and how long it runs. Future
patches will also spawn threads that hog CPUs, so this function will
eventually serve those use-cases as well.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
David Vernet
820a4f88ee cgroup: Add new test_cpu.c test suite in cgroup selftests
The cgroup selftests suite currently contains tests that validate various
aspects of cgroup, such as validating the expected behavior for memory
controllers, the expected behavior of cgroup.procs, etc. There are no tests
that validate the expected behavior of the cgroup cpu controller.

This patch therefore adds a new test_cpu.c file that will contain cpu
controller testcases. The file currently only contains a single testcase
that validates creating nested cgroups with cgroup.subtree_control
including cpu. Future patches will add more sophisticated testcases that
validate functional aspects of the cpu controller.

Signed-off-by: David Vernet <void@manifault.com>
Signed-off-by: Tejun Heo <tj@kernel.org>
2022-04-22 08:39:32 -10:00
Matthew Wilcox (Oracle)
b9663a6ff8 tools: Add kmem_cache_alloc_lru()
Turn kmem_cache_alloc() into a wrapper around kmem_cache_alloc_lru().

Fixes: 9bbdc0f324 ("xarray: use kmem_cache_alloc_lru to allocate xa_node")
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Reported-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Reported-by: Li Wang <liwang@redhat.com>
2022-04-22 14:24:28 -04:00
Zhengjun Xing
2c8e64514a perf stat: Merge event counts from all hybrid PMUs
For hybrid events, by default stat aggregates and reports the event counts
per pmu.

  # ./perf stat -e cycles -a  sleep 1

   Performance counter stats for 'system wide':

      14,066,877,268      cpu_core/cycles/
       6,814,443,147      cpu_atom/cycles/

         1.002760625 seconds time elapsed

Sometimes, it's also useful to aggregate event counts from all PMUs.
Create a new option '--hybrid-merge' to enable that behavior and report
the counts without PMUs.

  # ./perf stat -e cycles -a --hybrid-merge  sleep 1

   Performance counter stats for 'system wide':

      20,732,982,512      cycles

         1.002776793 seconds time elapsed
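
Conceptually, the merge sums the counts of the same event across PMUs and drops the PMU prefix. The Python sketch below illustrates that idea with the per-PMU counts from the first example; merge_hybrid_counts() is a hypothetical helper, not perf code.

```python
def merge_hybrid_counts(counts):
    """Sum counts of the same event across PMUs, dropping the PMU prefix."""
    merged = {}
    for name, value in counts.items():
        # "cpu_core/cycles/" -> "cycles"
        event = name.strip("/").split("/", 1)[1]
        merged[event] = merged.get(event, 0) + value
    return merged

per_pmu = {"cpu_core/cycles/": 14_066_877_268, "cpu_atom/cycles/": 6_814_443_147}
merged = merge_hybrid_counts(per_pmu)
```

The merged total here differs from the 20,732,982,512 shown above because the two perf stat outputs come from separate runs.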

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-2-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 14:23:35 -03:00
Zhengjun Xing
60344f1a9a perf stat: Support metrics with hybrid events
One metric such as 'Kernel_Utilization' may be from different PMUs and
consists of different events.

For core,
Kernel_Utilization = cpu_clk_unhalted.thread:k / cpu_clk_unhalted.thread

For atom,
Kernel_Utilization = cpu_clk_unhalted.core:k / cpu_clk_unhalted.core

The metric group string for core is:
'{cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.thread_p/metric-id=cpu_clk_unhalted.thread_p:k/k,cpu_clk_unhalted.thread/metric-id=cpu_clk_unhalted.thread/}:W#cpu_core'

The metric group string for atom is:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W'
It's internally expanded to:
'{cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core:k/k,cpu_clk_unhalted.core/metric-id=cpu_clk_unhalted.core/}:W#cpu_atom'

That means the group "{cpu_clk_unhalted.thread:k,cpu_clk_unhalted.thread}:W"
is from cpu_core PMU and the group "{cpu_clk_unhalted.core:k,cpu_clk_unhalted.core}"
is from cpu_atom PMU. And then next, check if the events in the group are
valid on that PMU. If one event is not valid on that PMU, the associated
group would be removed internally.

In this example, cpu_clk_unhalted.thread is valid on cpu_core and
cpu_clk_unhalted.core is valid on cpu_atom. So the checks for these two
groups are passed.

Before:

  # ./perf stat -M Kernel_Utilization -a sleep 1
WARNING: events in group from different hybrid PMUs!
WARNING: grouped events cpus do not match, disabling group:
  anon group { CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD_P:k, CPU_CLK_UNHALTED.THREAD, CPU_CLK_UNHALTED.THREAD }

 Performance counter stats for 'system wide':

        17,639,501      cpu_atom/CPU_CLK_UNHALTED.CORE/ #     1.00 Kernel_Utilization
        17,578,757      cpu_atom/CPU_CLK_UNHALTED.CORE:k/
     1,005,350,226 ns   duration_time
        43,012,352      cpu_core/CPU_CLK_UNHALTED.THREAD_P:k/ #     0.99 Kernel_Utilization
        17,608,010      cpu_atom/CPU_CLK_UNHALTED.THREAD_P:k/
        43,608,755      cpu_core/CPU_CLK_UNHALTED.THREAD/
        17,630,838      cpu_atom/CPU_CLK_UNHALTED.THREAD/
     1,005,350,226 ns   duration_time

       1.005350226 seconds time elapsed

After:

  # ./perf stat -M Kernel_Utilization -a sleep 1

 Performance counter stats for 'system wide':

        17,981,895      CPU_CLK_UNHALTED.CORE [cpu_atom] #     1.00 Kernel_Utilization
        17,925,405      CPU_CLK_UNHALTED.CORE:k [cpu_atom]
     1,004,811,366 ns   duration_time
        41,246,425      CPU_CLK_UNHALTED.THREAD_P:k [cpu_core] #     0.99 Kernel_Utilization
        41,819,129      CPU_CLK_UNHALTED.THREAD [cpu_core]
     1,004,811,366 ns   duration_time

       1.004811366 seconds time elapsed
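
As a sanity check on the "After" output, each Kernel_Utilization value is
the kernel-mode cycle count divided by the total cycle count on the same
PMU. A minimal sketch of that arithmetic follows; the helper name
kernel_utilization is hypothetical and is not part of perf:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of perf: ratio of kernel-mode cycles (the
 * ":k" event) to all cycles counted on the same PMU.
 */
static double kernel_utilization(uint64_t kernel_cycles, uint64_t total_cycles)
{
	/* A zero denominator would mean the event never counted. */
	assert(total_cycles != 0);

	return (double)kernel_cycles / (double)total_cycles;
}
```

With the counts printed above, kernel_utilization(41246425, 41819129) is
about 0.986 for cpu_core and kernel_utilization(17925405, 17981895) is about
0.997 for cpu_atom, which agree with the reported 0.99 and 1.00 after
rounding to two decimals.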

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220422065635.767648-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 14:23:17 -03:00
Zhengjun Xing
17408e5904 perf vendor events intel: Add metrics for Alderlake
Add JSON metrics for Alderlake to perf.

It includes both P-core and E-core metrics.

P-core metrics are based on TMA 4.3-full (TMA_Metrics-full.csv).
E-core metrics are based on E-core TMA 2.0 (E-core_TMA_Metrics.xlsx).

They are all downloaded from:
  https://download.01.org/perfmon/

Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/r/20220422065336.767582-1-zhengjun.xing@linux.intel.com
Cc: irogers@google.com
Cc: peterz@infradead.org
Cc: adrian.hunter@intel.com
Cc: alexander.shishkin@intel.com
Cc: acme@kernel.org
Cc: ak@linux.intel.com
Cc: jolsa@redhat.com
Cc: mingo@redhat.com
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
2022-04-22 14:22:24 -03:00
Linus Torvalds
281b9d9a4b Merge branch 'akpm' (patches from Andrew)
Merge misc fixes from Andrew Morton:
 "13 patches.

  Subsystems affected by this patch series: mm (memory-failure, memcg,
  userfaultfd, hugetlbfs, mremap, oom-kill, kasan, hmm), and kcov"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/mmu_notifier.c: fix race in mmu_interval_notifier_remove()
  kcov: don't generate a warning on vm_insert_page()'s failure
  MAINTAINERS: add Vincenzo Frascino to KASAN reviewers
  oom_kill.c: futex: delay the OOM reaper to allow time for proper futex cleanup
  selftest/vm: add skip support to mremap_test
  selftest/vm: support xfail in mremap_test
  selftest/vm: verify remap destination address in mremap_test
  selftest/vm: verify mmap addr in mremap_test
  mm, hugetlb: allow for "high" userspace addresses
  userfaultfd: mark uffd_wp regardless of VM_WRITE flag
  memcg: sync flush only if periodic flush is delayed
  mm/memory-failure.c: skip huge_zero_page in memory_failure()
  mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
2022-04-22 10:10:43 -07:00
Jiri Olsa
3a7ab60597 perf tools: Move libbpf init in libbpf_init function
Move the libbpf init code into a single function, so that we have a single
place doing that.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: https://lore.kernel.org/r/20220422100025.1469207-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-22 14:02:15 -03:00
Josh Poimboeuf
a8e35fece4 objtool: Update documentation
The objtool documentation is very stack validation centric.  Broaden the
documentation and describe all the features objtool supports.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/b6a84d301d9f73ec6725752654097f4e31fa1b69.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:05 +02:00
Josh Poimboeuf
753da4179d objtool: Remove --lto and --vmlinux in favor of --link
The '--lto' option is a confusing way of telling objtool to do stack
validation despite it being a linked object.  It's no longer needed now
that an explicit '--stackval' option exists.  The '--vmlinux' option is
also redundant.

Remove both options in favor of a straightforward '--link' option which
identifies a linked object.

Also, implicitly set '--link' with a warning if the user forgets to do
so and we can tell that it's a linked object.  This makes it easier for
manual vmlinux runs.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/dcd3ceffd15a54822c6183e5766d21ad06082b45.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:05 +02:00
Josh Poimboeuf
22102f4559 objtool: Make noinstr hacks optional
Objtool has some hacks in place to work around toolchain limitations
which otherwise would break no-instrumentation rules.  Make the hacks
explicit (and optional for other arches) by turning them into a cmdline
option and kernel config option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/b326eeb9c33231b9dfbb925f194ed7ee40edcd7c.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:04 +02:00
Josh Poimboeuf
4ab7674f59 objtool: Make jump label hack optional
Objtool secretly does a jump label hack to overcome the limitations of
the toolchain.  Make the hack explicit (and optional for other arches)
by turning it into a cmdline option and kernel config option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/3bdcbfdd27ecb01ddec13c04bdf756a583b13d24.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:04 +02:00
Josh Poimboeuf
26e176896a objtool: Make static call annotation optional
As part of making objtool more modular, put the existing static call
code behind a new '--static-call' option.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/d59ac57ef3d6d8380cdce20322314c9e2e556750.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
7206447496 objtool: Make stack validation frame-pointer-specific
Now that CONFIG_STACK_VALIDATION is frame-pointer specific, do the same
for the '--stackval' option.  Now the '--no-fp' option is redundant and
can be removed.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f563fa064b3b63d528de250c72012d49e14742a3.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
03f16cd020 objtool: Add CONFIG_OBJTOOL
Now that stack validation is an optional feature of objtool, add
CONFIG_OBJTOOL and replace most usages of CONFIG_STACK_VALIDATION with
it.

CONFIG_STACK_VALIDATION can now be considered to be frame-pointer
specific.  CONFIG_UNWINDER_ORC is already inherently valid for live
patching, so no need to "validate" it.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/939bf3d85604b2a126412bf11af6e3bd3b872bcb.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
c2bdd61c98 objtool: Extricate sls from stack validation
Extricate sls functionality from validate_branch() so they can be
executed (or ported) independently from each other.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/2545c86ffa5f27497f0d0c542540ad4a4be3c5a5.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:03 +02:00
Josh Poimboeuf
3c6f9f77e6 objtool: Rework ibt and extricate from stack validation
Extricate ibt from validate_branch() so they can be executed (or ported)
independently from each other.

While shuffling code around, simplify and improve the ibt logic:

- Ignore an explicit list of known sections which reference functions
  for reasons other than indirect branching to them.  This helps prevent
  unnecessary sealing.

- Warn on missing !ENDBR for all other sections, not just .data and
  .rodata.  This finds additional warnings, because there are sections
  other than .[ro]data which reference function pointers.  For example,
  the ksymtab sections which are used for exporting symbols.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/fd1435e46bb95f81031b8fb1fa360f5f787e4316.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
7dce62041a objtool: Make stack validation optional
Make stack validation an explicit cmdline option so that individual
objtool features can be enabled individually by other arches.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/52da143699574d756e65ca4c9d4acaffe9b0fe5f.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
99c0beb547 objtool: Add option to print section addresses
To help prevent objtool users from having to do math to convert function
addresses to section addresses, and to help out with finding data
addresses reported by IBT validation, add an option to print the section
address in addition to the function address.

Normal:

  vmlinux.o: warning: objtool: fixup_exception()+0x2d1: unreachable instruction

With '--sec-address':

  vmlinux.o: warning: objtool: fixup_exception()+0x2d1 (.text+0x76c51): unreachable instruction
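
The math this option saves is a single addition: the section offset is the
function's start within its section plus the offset within the function. A
minimal sketch, assuming a hypothetical formatter (format_location is not
objtool's code) and a function start of 0x76980, which is derived from the
example above (0x76c51 - 0x2d1):

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical formatter, not objtool's code: print "func()+0xOFF" and,
 * when sec_address is set, append " (SECTION+0xSECOFF)".
 * func_start is the function's offset within its section.
 */
static int format_location(char *buf, size_t len, const char *func,
			   unsigned long func_start, unsigned long off,
			   const char *sec, int sec_address)
{
	assert(buf && func && sec);

	if (!sec_address)
		return snprintf(buf, len, "%s()+0x%lx", func, off);

	/* Section offset = function start within the section + offset. */
	return snprintf(buf, len, "%s()+0x%lx (%s+0x%lx)",
			func, off, sec, func_start + off);
}
```

Called with ("fixup_exception", 0x76980, 0x2d1, ".text") this reproduces the
two location strings shown above, without and with the section address.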

Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/2cea4d5299d53d1a4c09212a6ad7820aa46fda7a.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
2bc3dec705 objtool: Don't print parentheses in function addresses
The parentheses in the "func()+off" address output are inconsistent with
how the kernel prints function addresses, breaking Peter's scripts.
Remove them.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/f2bec70312f62ef4f1ea21c134d9def627182ad3.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:02 +02:00
Josh Poimboeuf
b51277eb97 objtool: Ditch subcommands
Objtool has a fairly singular focus.  It runs on object files and does
validations and transformations which can be combined in various ways.
The subcommand model has never been a good fit, making it awkward to
combine and remove options.

Remove the "check" and "orc" subcommands in favor of a more traditional
cmdline option model.  This makes it much more flexible to use, and
easier to port individual features to other arches.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/5c61ebf805e90aefc5fa62bc63468ffae53b9df6.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Josh Poimboeuf
2daf7faba7 objtool: Reorganize cmdline options
Split the existing options into two groups: actions, which actually do
something; and options, which modify the actions in some way.

Also there's no need to have short flags for all the non-action options.
Reserve short flags for the more important actions.

While at it:

- change a few of the short flags to be more intuitive

- make option descriptions more consistently descriptive

- sort options in the source like they are when printed

- move options to a global struct

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/9dcaa752f83aca24b1b21f0b0eeb28a0c181c0b0.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Josh Poimboeuf
aa3d60e050 libsubcmd: Fix OPTION_GROUP sorting
The OPTION_GROUP option type is a way of grouping certain options
together in the printed usage text.  It happens to be completely broken,
thanks to the fact that the subcmd option sorting just sorts everything,
without regard for grouping.  Luckily, nobody uses this option anyway,
though that will change shortly.

Fix it by sorting each group individually.
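
The idea can be sketched as follows. This is an illustration under stated
assumptions, not the libsubcmd implementation: struct opt, enum opt_type and
sort_options_by_group are hypothetical, simplified stand-ins for the real
option table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified option record; not libsubcmd's struct option. */
enum opt_type { OPT_GROUP, OPT_FLAG };

struct opt {
	enum opt_type type;
	const char *name;	/* group title for OPT_GROUP, long name otherwise */
};

static int cmp_opt_name(const void *a, const void *b)
{
	const struct opt *oa = a, *ob = b;

	return strcmp(oa->name, ob->name);
}

/*
 * Sort the options between consecutive group headers, leaving each header
 * in place so that no option migrates into another group.
 */
static void sort_options_by_group(struct opt *opts, size_t n)
{
	size_t start = 0;

	assert(opts || n == 0);

	while (start < n) {
		size_t end;

		/* A header stays where it is; sorting starts after it. */
		if (opts[start].type == OPT_GROUP) {
			start++;
			continue;
		}

		/* Find the end of this run of non-header options. */
		for (end = start; end < n && opts[end].type != OPT_GROUP; end++)
			;

		qsort(&opts[start], end - start, sizeof(*opts), cmp_opt_name);
		start = end;
	}
}
```

Sorting the whole array in one qsort() call would instead interleave the
headers with the options, which is the breakage described above.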

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Link: https://lkml.kernel.org/r/e167ea3a11e2a9800eb062c1fd0f13e9cd05140c.1650300597.git.jpoimboe@redhat.com
2022-04-22 12:32:01 +02:00
Peter Zijlstra
3398b12d10 Merge branch 'tip/x86/urgent'
Merge the x86/urgent objtool/IBT changes as a base

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-04-22 12:32:01 +02:00
Peter Zijlstra
4abff6d48d objtool: Fix code relocs vs weak symbols
Occasionally objtool driven code patching (think .static_call_sites
.retpoline_sites etc..) goes sideways and it tries to patch an
instruction that doesn't match.

Much head-scratching and cursing later the problem is as outlined below
and affects every section that objtool generates for us, very much
including the ORC data. The below uses .static_call_sites because it's
convenient for demonstration purposes, but as mentioned the ORC
sections, .retpoline_sites and __mcount_loc are all similarly affected.

Consider:

foo-weak.c:

  extern void __SCT__foo(void);

  __attribute__((weak)) void foo(void)
  {
	  return __SCT__foo();
  }

foo.c:

  extern void __SCT__foo(void);
  extern void my_foo(void);

  void foo(void)
  {
	  my_foo();
	  return __SCT__foo();
  }

These generate the obvious code
(gcc -O2 -fcf-protection=none -fno-asynchronous-unwind-tables -c foo*.c):

foo-weak.o:
0000000000000000 <foo>:
   0:   e9 00 00 00 00          jmpq   5 <foo+0x5>      1: R_X86_64_PLT32       __SCT__foo-0x4

foo.o:
0000000000000000 <foo>:
   0:   48 83 ec 08             sub    $0x8,%rsp
   4:   e8 00 00 00 00          callq  9 <foo+0x9>      5: R_X86_64_PLT32       my_foo-0x4
   9:   48 83 c4 08             add    $0x8,%rsp
   d:   e9 00 00 00 00          jmpq   12 <foo+0x12>    e: R_X86_64_PLT32       __SCT__foo-0x4

Now, when we link these two files together, you get something like
(ld -r -o foos.o foo-weak.o foo.o):

foos.o:
0000000000000000 <foo-0x10>:
   0:   e9 00 00 00 00          jmpq   5 <foo-0xb>      1: R_X86_64_PLT32       __SCT__foo-0x4
   5:   66 2e 0f 1f 84 00 00 00 00 00   nopw   %cs:0x0(%rax,%rax,1)
   f:   90                      nop

0000000000000010 <foo>:
  10:   48 83 ec 08             sub    $0x8,%rsp
  14:   e8 00 00 00 00          callq  19 <foo+0x9>     15: R_X86_64_PLT32      my_foo-0x4
  19:   48 83 c4 08             add    $0x8,%rsp
  1d:   e9 00 00 00 00          jmpq   22 <foo+0x12>    1e: R_X86_64_PLT32      __SCT__foo-0x4

Noting that ld preserves the weak function text, but strips the symbol
off of it (hence objdump doing that funny negative offset thing). This
does lead to 'interesting' unused code issues with objtool when run on
linked objects, but that seems to be working (fingers crossed).

So far so good.. Now let's consider the objtool static_call output
section (readelf output, old binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 .text + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 .text + 1d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

So we have two patch sites, one in the dead code of the weak foo and one
in the real foo. All is well.

*HOWEVER*, when the toolchain strips unused section symbols it
generates things like this (using new enough binutils):

foo-weak.o:

Relocation section '.rela.static_call_sites' at offset 0x2c8 contains 1 entry:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foo.o:

Relocation section '.rela.static_call_sites' at offset 0x310 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000200000002 R_X86_64_PC32          0000000000000000 foo + d
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

foos.o:

Relocation section '.rela.static_call_sites' at offset 0x430 contains 4 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000000  0000000100000002 R_X86_64_PC32          0000000000000000 foo + 0
0000000000000004  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1
0000000000000008  0000000100000002 R_X86_64_PC32          0000000000000000 foo + d
000000000000000c  0000000d00000002 R_X86_64_PC32          0000000000000000 __SCT__foo + 1

And now we can see how that foos.o .static_call_sites goes sideways: we
now have _two_ patch sites in foo. One for the weak symbol at foo+0
(which is no longer a static_call site!) and one at foo+d which is in
fact the right location.

This seems to happen when objtool cannot find a section symbol, in which
case it falls back to any other symbol to key off of, however in this
case that goes terribly wrong!

As such, teach objtool to create a section symbol when there isn't
one.

Fixes: 44f6a7c075 ("objtool: Fix seg fault with Clang non-section symbols")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.655552918@infradead.org
2022-04-22 12:13:55 +02:00
Peter Zijlstra
c087c6e7b5 objtool: Fix type of reloc::addend
Elf{32,64}_Rela::r_addend is of type: Elf{32,64}_Sword, that means
that our reloc::addend needs to be long or face truncation issues when
we do elf_rebuild_reloc_section():

  - 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt+0x80000067
  + 107:  48 b8 00 00 00 00 00 00 00 00   movabs $0x0,%rax        109: R_X86_64_64        level4_kernel_pgt-0x7fffff99
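
The sign flip in the diff above is what a 32-bit signed field does to the
addend 0x80000067. A minimal model of it, assuming fixed-width types in
place of objtool's struct fields (the function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the bug: a 64-bit ELF addend stored in a 32-bit int field and
 * read back. The narrowing conversion is implementation-defined in C; on
 * GCC and Clang it wraps modulo 2^32.
 */
static int64_t addend_roundtrip_int(int64_t r_addend)
{
	int32_t narrow = (int32_t)r_addend;

	return narrow;
}

/* Model of the fix: the field is wide enough, so the value survives. */
static int64_t addend_roundtrip_wide(int64_t r_addend)
{
	int64_t wide = r_addend;

	assert(wide == r_addend);
	return wide;
}
```

On GCC and Clang, addend_roundtrip_int(0x80000067) yields -0x7fffff99, the
value on the "+" line of the diff, while the wide variant returns 0x80000067
unchanged.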

Fixes: 627fce1480 ("objtool: Add ORC unwind table generation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20220419203807.596871927@infradead.org
2022-04-22 12:13:55 +02:00
Paolo Abeni
f70925bf99 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
drivers/net/ethernet/microchip/lan966x/lan966x_main.c
  d08ed85256 ("net: lan966x: Make sure to release ptp interrupt")
  c834963932 ("net: lan966x: Add FDMA functionality")

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-22 09:56:00 +02:00
Takashi Iwai
bc67cac103 selftests: firmware: Add ZSTD compressed file tests
It's similar to the XZ compressed file tests.  For simplicity, both XZ
and ZSTD tests are done in a single function.  The format is specified
via $COMPRESS_FORMAT and the compression function is pre-defined.

Link: https://lore.kernel.org/r/20210127154939.13288-5-tiwai@suse.de
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-6-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
f18b45ff9a selftests: firmware: Simplify test patterns
The test patterns are almost the same in three sequential tests.
Add a unified helper function to improve readability.

Link: https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-5-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
04c826d072 selftests: firmware: Fix the request_firmware_into_buf() test for XZ format
The test uses a different firmware name, and we forgot to adapt for
the XZ compressed file tests.

https://lore.kernel.org/all/20210127154939.13288-1-tiwai@suse.de/

Fixes: 1798045900 ("selftests: firmware: Add request_firmware_into_buf tests")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-4-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:17 +02:00
Takashi Iwai
b3625b1324 selftests: firmware: Use smaller dictionary for XZ compression
The xz -9 option leads to an unnecessarily large dictionary that
isn't really suitable for the kernel firmware loader.  Pass the
dictionary size explicitly, instead.

While we're at it, make the xz command call defined in $RUN_XZ for
simplicity.

Fixes: 108ae07c50 ("selftests: firmware: Add compressed firmware tests")
Signed-off-by: Takashi Iwai <tiwai@suse.de>
Link: https://lore.kernel.org/r/20220421152908.4718-3-tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-22 08:51:16 +02:00
Sidhartha Kumar
80df2fb95d selftest/vm: add skip support to mremap_test
Allow the mremap test to be skipped due to errors such as failing to
parse the mmap_min_addr sysctl.

Link: https://lkml.kernel.org/r/20220420215721.4868-4-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
e5508fc52c selftest/vm: support xfail in mremap_test
Use ksft_test_result_xfail for the tests which are expected to fail.

Link: https://lkml.kernel.org/r/20220420215721.4868-3-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
18d609daa5 selftest/vm: verify remap destination address in mremap_test
Because mremap does not have a MAP_FIXED_NOREPLACE flag, it can destroy
existing mappings.  This causes a segfault when regions such as text are
remapped and the permissions are changed.

Verify the requested mremap destination address does not overlap any
existing mappings by using mmap's MAP_FIXED_NOREPLACE flag.  Keep
incrementing the destination address until a valid mapping is found or
fail the current test once the max address is reached.
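
The probing described above can be sketched as follows. This is not the
code from mremap_test; range_is_unmapped and next_unmapped_range are
hypothetical names, and the fallback definition of MAP_FIXED_NOREPLACE is an
assumption for older headers.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MAP_FIXED_NOREPLACE
#define MAP_FIXED_NOREPLACE 0x100000	/* Linux value on most architectures */
#endif

/* Probe [addr, addr + len) without clobbering whatever already lives there. */
static bool range_is_unmapped(uintptr_t addr, size_t len)
{
	void *p;

	assert(len > 0);

	p = mmap((void *)addr, len, PROT_NONE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
	if (p == MAP_FAILED)
		return false;	/* e.g. EEXIST: something is already mapped there */

	munmap(p, len);

	/* Kernels that predate the flag treat addr as a hint and may move it. */
	return (uintptr_t)p == addr;
}

/* Step forward until a free range is found; 0 means none below max_addr. */
static uintptr_t next_unmapped_range(uintptr_t addr, size_t len, size_t step,
				     uintptr_t max_addr)
{
	assert(step > 0);

	for (; addr < max_addr && max_addr - addr >= len; addr += step) {
		if (range_is_unmapped(addr, len))
			return addr;
	}
	return 0;
}
```

A caller would pass the returned address to mremap() as the destination and
fail the current test when 0 comes back.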

Link: https://lkml.kernel.org/r/20220420215721.4868-2-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:10 -07:00
Sidhartha Kumar
9c85a9bae2 selftest/vm: verify mmap addr in mremap_test
Avoid calling mmap with requested addresses that are less than the
system's mmap_min_addr.  When run as root, mmap returns EACCES when
trying to map addresses < mmap_min_addr.  This is not one of the error
codes for the condition to retry the mmap in the test.

Rather than arbitrarily retrying on EACCES, don't attempt an mmap until
addr > vm.mmap_min_addr.
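
A minimal sketch of that check, assuming the sysctl is read from
/proc/sys/vm/mmap_min_addr; the helper names read_mmap_min_addr and
first_allowed_addr are hypothetical and are not taken from mremap_test:

```c
#include <assert.h>
#include <stdio.h>

/* Returns 0 and fills *min_addr on success, -1 if the file is unreadable. */
static int read_mmap_min_addr(const char *path, unsigned long long *min_addr)
{
	FILE *f;
	int ok;

	assert(path && min_addr);

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = fscanf(f, "%llu", min_addr) == 1;
	fclose(f);

	return ok ? 0 : -1;
}

/* First page-aligned address strictly above min_addr. */
static unsigned long long first_allowed_addr(unsigned long long min_addr,
					     unsigned long long page_size)
{
	assert(page_size != 0);

	return (min_addr / page_size + 1) * page_size;
}
```

With a min_addr of 65536 and 4 KiB pages, first_allowed_addr() returns
69632, so the test never asks for an address at or below the limit.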

Add a munmap call after an alignment check as the mappings are retained
after the retry and can reach the vm.max_map_count sysctl.

Link: https://lkml.kernel.org/r/20220420215721.4868-1-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-21 20:01:09 -07:00
Paolo Bonzini
e852be8b14 kvm: selftests: introduce and use more page size-related constants
Clean up code that was hardcoding masks for various fields,
now that the masks are included in processor.h.

For more cleanup, define PAGE_SIZE and PAGE_MASK just like in Linux.
PAGE_SIZE in particular was defined by several tests.

Suggested-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 15:41:01 -04:00
Paolo Bonzini
f18b4aebe1 kvm: selftests: do not use bitfields larger than 32-bits for PTEs
Red Hat's QE team reported test failure on access_tracking_perf_test:

Testing guest mode: PA-bits:ANY, VA-bits:48,  4K pages
guest physical test memory offset: 0x3fffbffff000

Populating memory             : 0.684014577s
Writing to populated memory   : 0.006230175s
Reading from populated memory : 0.004557805s
==== Test Assertion Failure ====
  lib/kvm_util.c:1411: false
  pid=125806 tid=125809 errno=4 - Interrupted system call
     1  0x0000000000402f7c: addr_gpa2hva at kvm_util.c:1411
     2   (inlined by) addr_gpa2hva at kvm_util.c:1405
     3  0x0000000000401f52: lookup_pfn at access_tracking_perf_test.c:98
     4   (inlined by) mark_vcpu_memory_idle at access_tracking_perf_test.c:152
     5   (inlined by) vcpu_thread_main at access_tracking_perf_test.c:232
     6  0x00007fefe9ff81ce: ?? ??:0
     7  0x00007fefe9c64d82: ?? ??:0
  No vm physical memory at 0xffbffff000

I can easily reproduce it with an Intel(R) Xeon(R) CPU E5-2630 with 46 bits
PA.

It turns out that the address translation for clearing idle page tracking
returned a wrong result; addr_gva2gpa()'s last step, which is based on
"pte[index[0]].pfn", did the calculation with 40 bits length and the
high 12 bits got truncated.  In the above case the GPA to be returned
should be 0x3fffbffff000 for GVA 0xc0000000, but it got truncated into
0xffbffff000 and the subsequent gpa2hva lookup failed.

The width of operations on bit fields greater than 32-bit is
implementation defined, and differs between GCC (which uses the bitfield
precision) and clang (which uses 64-bit arithmetic), so this is a
potential minefield.  Remove the bit fields and use manual masking
instead.
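
The truncation can be modelled without bit fields. The sketch below is not
the selftest code: the SKETCH_* macros and both function names are
hypothetical, and the PTE layout (frame address in bits 12..51) is an
assumption made for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_MASK	(~((1ULL << SKETCH_PAGE_SHIFT) - 1))
/* Assumed layout: physical frame address in PTE bits 12..51. */
#define SKETCH_PHYS_MASK	(((1ULL << 52) - 1) & SKETCH_PAGE_MASK)

/* Manual masking: all arithmetic is done on a full 64-bit value. */
static uint64_t pte_to_gpa(uint64_t pte, uint64_t gva)
{
	return (pte & SKETCH_PHYS_MASK) | (gva & ~SKETCH_PAGE_MASK);
}

/*
 * Model of the failure: the shifted frame number is kept to 40 bits of
 * precision, so address bits 40 and above are lost.
 */
static uint64_t pte_to_gpa_40bit(uint64_t pte, uint64_t gva)
{
	uint64_t pfn = (pte & SKETCH_PHYS_MASK) >> SKETCH_PAGE_SHIFT;
	uint64_t base = (pfn << SKETCH_PAGE_SHIFT) & ((1ULL << 40) - 1);

	assert(pfn < (1ULL << 40));
	return base | (gva & ~SKETCH_PAGE_MASK);
}
```

For a PTE whose frame address is 0x3fffbffff000 and GVA 0xc0000000, the
masked version returns 0x3fffbffff000 while the 40-bit model returns
0xffbffff000, the address from the failed assertion above.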

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2075036
Reported-by: Nana Liu <nanliu@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 15:41:01 -04:00
Linus Torvalds
59f0c2447e Networking fixes for 5.18-rc4, including fixes from xfrm and can.
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from xfrm and can.

  Current release - regressions:

   - rxrpc: restore removed timer deletion

  Current release - new code bugs:

   - gre: fix device lookup for l3mdev use-case

   - xfrm: fix egress device lookup for l3mdev use-case

  Previous releases - regressions:

   - sched: cls_u32: fix netns refcount changes in u32_change()

   - smc: fix sock leak when release after smc_shutdown()

   - xfrm: limit skb_page_frag_refill use to a single page

   - eth: atlantic: invert deep par in pm functions, preventing null
     derefs

   - eth: stmmac: use readl_poll_timeout_atomic() in atomic state

  Previous releases - always broken:

   - gre: fix skb_under_panic on xmit

   - openvswitch: fix OOB access in reserve_sfa_size()

   - dsa: hellcreek: calculate checksums in tagger

   - eth: ice: fix crash in switchdev mode

   - eth: igc:
      - fix infinite loop in release_swfw_sync
      - fix scheduling while atomic"

* tag 'net-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (37 commits)
  drivers: net: hippi: Fix deadlock in rr_close()
  selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets
  selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets
  nfc: MAINTAINERS: add Bug entry
  net: stmmac: Use readl_poll_timeout_atomic() in atomic state
  doc/ip-sysctl: add bc_forwarding
  netlink: reset network and mac headers in netlink_dump()
  net: mscc: ocelot: fix broken IP multicast flooding
  net: dsa: hellcreek: Calculate checksums in tagger
  net: atlantic: invert deep par in pm functions, preventing null derefs
  can: isotp: stop timeout monitoring when no first frame was sent
  bonding: do not discard lowest hash bit for non layer3+4 hashing
  net: lan966x: Make sure to release ptp interrupt
  ipv6: make ip6_rt_gc_expire an atomic_t
  net: Handle l3mdev in ip_tunnel_init_flow
  l3mdev: l3mdev_master_upper_ifindex_by_index_rcu should be using netdev_master_upper_dev_get_rcu
  net/sched: cls_u32: fix possible leak in u32_init_knode()
  net/sched: cls_u32: fix netns refcount changes in u32_change()
  powerpc: Update MAINTAINERS for ibmvnic and VAS
  net: restore alpha order to Ethernet devices in config
  ...
2022-04-21 12:29:08 -07:00
Thomas Huth
266a19a0bc KVM: selftests: Silence compiler warning in the kvm_page_table_test
When compiling kvm_page_table_test.c, I get this compiler warning
with gcc 11.2:

kvm_page_table_test.c: In function 'pre_init_before_test':
../../../../tools/include/linux/kernel.h:44:24: warning: comparison of
 distinct pointer types lacks a cast
   44 |         (void) (&_max1 == &_max2);              \
      |                        ^~
kvm_page_table_test.c:281:21: note: in expansion of macro 'max'
  281 |         alignment = max(0x100000, alignment);
      |                     ^~~

Fix it by adjusting the type of the absolute value.
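
A minimal sketch of why the warning appears, assuming a max() macro
with the same pointer-comparison type check as the one quoted above.
The names sketch_max() and sketch_align() are hypothetical, and the cast
is only one possible way to make both operand types agree:

```c
#include <stdint.h>

/*
 * Type-checking max(): comparing the addresses of the two temporaries
 * makes the compiler warn when x and y have different types.
 * Relies on GNU C statement expressions and __typeof__.
 */
#define sketch_max(x, y) ({			\
	__typeof__(x) _max1 = (x);		\
	__typeof__(y) _max2 = (y);		\
	(void) (&_max1 == &_max2);		\
	_max1 > _max2 ? _max1 : _max2; })

/* Both operands are uint64_t here, so the check stays silent. */
static uint64_t sketch_align(uint64_t alignment)
{
	return sketch_max((uint64_t)0x100000, alignment);
}
```

Passing a plain 0x100000 (an int) together with a uint64_t is what
triggers the "distinct pointer types" diagnostic.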

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20220414103031.565037-1-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-21 13:16:14 -04:00
Gaosheng Cui
b71a2ebf74 libbpf: Remove redundant non-null checks on obj_elf
obj_elf is already checked for NULL at the function entry, so remove
the redundant non-null checks on obj_elf.

Signed-off-by: Gaosheng Cui <cuigaosheng1@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220421031803.2283974-1-cuigaosheng1@huawei.com
2022-04-21 09:56:26 -07:00
Artem Savkov
c14766a8a8 selftests/bpf: Fix map tests errno checks
Switching to libbpf 1.0 API broke test_lpm_map and test_lru_map as error
reporting changed. Instead of setting errno and returning -1, bpf calls
now return -Exxx directly.
Drop errno checks and look at return code directly.
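
A minimal sketch of the two checking styles. fake_map_lookup() is a
hypothetical stand-in that follows the "return -Exxx directly"
convention, it is not a libbpf function:

```c
#include <errno.h>

/* Hypothetical call using the libbpf 1.0 convention: 0 or -Exxx. */
static int fake_map_lookup(int key)
{
	return key == 1 ? 0 : -ENOENT;
}

/* Returns 1 when the lookup failed because the key is missing. */
static int sketch_key_missing(int key)
{
	int err = fake_map_lookup(key);

	/* Old style would have been: err == -1 && errno == ENOENT */
	return err == -ENOENT;
}
```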

Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421094320.1563570-1-asavkov@redhat.com
2022-04-21 09:51:57 -07:00
Artem Savkov
6a12b8e20d selftests/bpf: Fix prog_tests uprobe_autoattach compilation error
I am getting the following compilation error for prog_tests/uprobe_autoattach.c:

  tools/testing/selftests/bpf/prog_tests/uprobe_autoattach.c: In function ‘test_uprobe_autoattach’:
  ./test_progs.h:209:26: error: pointer ‘mem’ may be used after ‘free’ [-Werror=use-after-free]

The value of mem is now used in one of the asserts, which is why it may be
confusing compilers. However, it is not dereferenced. Silence this by moving
free(mem) after the assert block.

Fixes: 1717e24801 ("selftests/bpf: Uprobe tests should verify param/return values")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220421132317.1583867-1-asavkov@redhat.com
2022-04-21 18:48:04 +02:00
Artem Savkov
920fd5e177 selftests/bpf: Fix attach tests retcode checks
Switching to libbpf 1.0 API broke test_sock and test_sysctl as they
check for return of bpf_prog_attach to be exactly -1. Switch the check
to '< 0' instead.

Fixes: b858ba8c52 ("selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK")
Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Yafang Shao <laoar.shao@gmail.com>
Link: https://lore.kernel.org/bpf/20220421130104.1582053-1-asavkov@redhat.com
2022-04-21 16:34:55 +02:00
Grant Seltzer
a66ab9a9e6 libbpf: Add documentation to API functions
This adds documentation for the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()
- bpf_program__set_attach_target()
- bpf_program__attach()
- bpf_program__pin()
- bpf_program__unpin()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-3-grantseltzer@gmail.com
2022-04-21 16:31:07 +02:00
Grant Seltzer
df28671632 libbpf: Update API functions usage to check error
This updates usage of the following API functions within
libbpf so their newly added error return is checked:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-2-grantseltzer@gmail.com
2022-04-21 16:28:25 +02:00
Grant Seltzer
93442f132b libbpf: Add error returns to two API functions
This adds an error return to the following API functions:

- bpf_program__set_expected_attach_type()
- bpf_program__set_type()

In both cases, the error occurs when the BPF object has
already been loaded when the function is called. In this
case -EBUSY is returned.
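
A rough sketch of the pattern, not the libbpf implementation. The
struct and function below are hypothetical and only model "refuse to
change the type once the object is loaded":

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for a BPF program handle. */
struct sketch_prog {
	bool obj_loaded;	/* true once the owning object is loaded */
	int type;
};

static int sketch_prog_set_type(struct sketch_prog *prog, int type)
{
	if (prog->obj_loaded)
		return -EBUSY;	/* too late to change it */

	prog->type = type;
	return 0;
}
```

Callers are then expected to check the return value instead of assuming
the setter always succeeds.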

Signed-off-by: Grant Seltzer <grantseltzer@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220420161226.86803-1-grantseltzer@gmail.com
2022-04-21 16:28:11 +02:00
Ammar Faizi
11dbdaeff4 tools/nolibc/string: Implement strdup() and strndup()
These functions are currently only available on architectures that
implement the my_syscall6() macro, because they use malloc(), malloc()
uses mmap(), and mmap() depends on my_syscall6().

On architectures that don't support my_syscall6(), these functions will
always return NULL with errno set to ENOSYS.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
b26823c19a tools/nolibc/string: Implement strnlen()
size_t strnlen(const char *str, size_t maxlen);

The strnlen() function returns the number of bytes in the string
pointed to by str, excluding the terminating null byte ('\0'), but at
most maxlen. In doing this, strnlen() looks only at the first maxlen
characters in the string pointed to by str and never beyond str[maxlen-1].

The first use case of this function is for determining the memory
allocation size in the strndup() function.
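
The behaviour described above can be sketched as follows. The name
sketch_strnlen() is hypothetical and this is not necessarily how the
nolibc version is written:

```c
#include <stddef.h>

static size_t sketch_strnlen(const char *str, size_t maxlen)
{
	size_t len;

	/* Stop at the first NUL or after maxlen bytes, whichever is first. */
	for (len = 0; len < maxlen && str[len]; len++)
		;
	return len;
}
```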

Link: https://lore.kernel.org/lkml/CAOG64qMpEMh+EkOfjNdAoueC+uQyT2Uv3689_sOr37-JxdJf4g@mail.gmail.com
Suggested-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
0e0ff63840 tools/nolibc/stdlib: Implement malloc(), calloc(), realloc() and free()
Implement basic dynamic allocator functions. These functions are
currently only available on architectures for which nolibc implements
the mmap() syscall. This is not a super-fast memory allocator, but it
can at least satisfy the basic need for a heap without libc.

Cc: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
5a18d07ce3 tools/nolibc/types: Implement offsetof() and container_of() macro
Implement `offsetof()` and `container_of()` macros. The first use case
of these macros is for `malloc()`, `realloc()` and `free()`.
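
A minimal sketch of how an allocator can use `container_of()` to get
from a user pointer back to its block header. All sketch_* names and
the header layout are assumptions, and the host malloc() is used as
backing store only to keep the example short:

```c
#include <stddef.h>
#include <stdlib.h>

#define sketch_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical block header placed right before the user data. */
struct sketch_heap {
	size_t len;
	char user_p[];
};

static void *sketch_alloc(size_t len)
{
	struct sketch_heap *heap = malloc(sizeof(*heap) + len);

	if (!heap)
		return NULL;
	heap->len = len;
	return heap->user_p;
}

/* Recover the header from the pointer handed out to the user. */
static size_t sketch_alloc_len(void *ptr)
{
	return sketch_container_of(ptr, struct sketch_heap, user_p)->len;
}

static void sketch_free(void *ptr)
{
	if (ptr)
		free(sketch_container_of(ptr, struct sketch_heap, user_p));
}
```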

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
544fa1a2d3 tools/nolibc/sys: Implement mmap() and munmap()
Implement mmap() and munmap(). Currently, they are only available on
architectures that have the my_syscall6 macro. On architectures that
don't have it, these functions will return -1 with errno set to ENOSYS
(Function not implemented).

This has been tested on x86 and i386.

Notes for i386:
 1) The common mmap() syscall implementation uses __NR_mmap2 instead
    of __NR_mmap.

 2) The offset must be shifted right by 12 bits.
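
Note 2 can be sketched as below. The helper name is hypothetical, it
only shows the conversion from a byte offset to the 4096-byte unit
implied by the 12-bit shift:

```c
#include <stdint.h>

/* Convert a byte offset into the page-sized unit passed to mmap2. */
static uint32_t sketch_mmap2_pgoff(uint64_t byte_offset)
{
	return (uint32_t)(byte_offset >> 12);
}
```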

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
f4738ff74c tools/nolibc: i386: Implement syscall with 6 arguments
On i386, the 6th argument of syscall goes in %ebp. However, both Clang
and GCC cannot use %ebp in the clobber list and in the "r" constraint
without using -fomit-frame-pointer. To make it always available for
any kind of compilation, the below workaround is implemented.

  1) Push the 6-th argument.
  2) Push %ebp.
  3) Load the 6-th argument from 4(%esp) to %ebp.
  4) Do the syscall (int $0x80).
  5) Pop %ebp (restore the old value of %ebp).
  6) Add 4 to %esp (restore the stack pointer).

Cc: x86@kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/lkml/2e335ac54db44f1d8496583d97f9dab0@AcuMS.aculab.com
Suggested-by: David Laight <David.Laight@ACULAB.COM>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
1590c59836 tools/nolibc: Remove .global _start from the entry point code
Building with clang yields the following error:
```
  <inline asm>:3:1: error: _start changed binding to STB_GLOBAL
  .global _start
  ^
  1 error generated.
```
Make sure to specify only one of `.global _start` and `.weak _start`.
Remove `.global _start`.

Cc: llvm@lists.linux.dev
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
37d62758e7 tools/nolibc: Replace asm with __asm__
Replace `asm` with `__asm__` to support compilation with -std flag.
Using `asm` with -std flag makes GCC think `asm()` is a function call
instead of an inline assembly.

GCC doc says:

  For the C language, the `asm` keyword is a GNU extension. When
  writing C code that can be compiled with `-ansi` and the `-std`
  options that select C dialects without GNU extensions, use
  `__asm__` instead of `asm`.

Link: https://gcc.gnu.org/onlinedocs/gcc/Basic-Asm.html
Reported-by: Alviro Iskandar Setiawan <alviro.iskandar@gnuweeb.org>
Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Ammar Faizi
5312aaa5d5 tools/nolibc: x86-64: Update System V ABI document link
The old link no longer works, update it.

Acked-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
2475d37ac3 tools/nolibc/stdlib: only reference the external environ when inlined
When building with gcc at -O0 we're seeing link errors due to the
"environ" variable being referenced by getenv(). The problem is that
at -O0 gcc will not inline getenv() and will not drop the external
reference. One solution would be to locally declare the variable as
weak, but then it would appear in all programs even those not using
it, and would be confusing to users of getenv() who would forget to
set environ to envp.

An alternate approach used in this patch consists in always inlining
the outer part of getenv() that references this extern so that it's
always dropped when not used. The biggest part of the function has
now been moved to a new function called _getenv() that's still not inlined
by default.

Reported-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Tested-by: Ammar Faizi <ammarfaizi2@gnuweeb.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
96980b833a tools/nolibc/string: do not use __builtin_strlen() at -O0
clang wants to use strlen() for __builtin_strlen() at -O0. We don't
really care about -O0 but it at least ought to build, so let's make
sure we don't choke on this, by dropping the optimization for
constant strings in this case.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
0b37dff10b tools/nolibc: add the nolibc subdir to the common Makefile
The Makefile in tools/ is used to forward options to the makefiles
in the various subdirs. Let's add nolibc there so that it becomes
possible to make tools/nolibc_headers_standalone from the main tree
to simply create a completely usable sysroot.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
2432616468 tools/nolibc: add a makefile to install headers
This provides a target "headers_standalone" which installs the nolibc's
arch-specific headers with "arch.h" taken from the current arch (or a
concatenation of both i386 and x86_64 for arch=x86), then installs
kernel headers. This creates a convenient sysroot which is directly
usable by a bare-metal compiler to create any executable.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:46 -07:00
Willy Tarreau
96d2a1313f tools/nolibc/types: add poll() and waitpid() flag definitions
- POLLIN etc were missing, so poll() could only be used with timeouts.
- WNOHANG was not defined and is convenient to check if a child is still
  running

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
54abe3590f tools/nolibc/sys: add syscall definition for getppid()
This is essentially for completeness as it's not the most often used
in regtests.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
0e7b492943 tools/nolibc/string: add strcmp() and strncmp()
We need these functions all the time, including when checking environment
variables and parsing command-line arguments. These implementations were
optimized to show optimal code size on a wide range of compilers (22 bytes
return included for strcmp(), 33 for strncmp()).

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
bd845a193a tools/nolibc/stdio: add support for '%p' to vfprintf()
%p remains quite useful in test code, and the code path can easily be
merged with the existing "%x", so it only adds ~50 bytes. Let's
add it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
077d0a3924 tools/nolibc/stdlib: add a simple getenv() implementation
This implementation relies on an extern definition of the environ
variable, that the caller must declare and initialize from envp.
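
A minimal sketch of such a lookup, assuming the environment is passed
explicitly rather than through an extern. sketch_getenv() is a
hypothetical name, not the nolibc function:

```c
#include <stddef.h>

/* Look up "name" in a NULL-terminated array of "NAME=value" strings. */
static char *sketch_getenv(char **env, const char *name)
{
	int idx, i;

	if (!env)
		return NULL;

	for (idx = 0; env[idx]; idx++) {
		for (i = 0; name[i] && name[i] == env[idx][i]; i++)
			;
		/* Whole name matched and is followed by '=' */
		if (!name[i] && env[idx][i] == '=')
			return &env[idx][i + 1];
	}
	return NULL;
}
```

A caller would typically pass the envp argument received by main().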

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
170b230d22 tools/nolibc/stdio: make printf(%s) accept NULL
It's often convenient to support this, especially in test programs where
a NULL may correspond to an allocation error or a non-existing value.
Let's make printf("%s") support being passed a NULL. In this case it
prints "(null)" like glibc's printf().

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
f0f04f28d5 tools/nolibc/stdlib: implement abort()
libgcc uses it for certain divide functions, so it must be exported. Like
for memset() we do that in its own section so that the linker can strip
it when not needed.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
c4486e9728 tools/nolibc: also mention how to build by just setting the include path
Now that a few basic include files are provided, some simple portable
programs may build, which will save them from having to surround their
includes with #ifndef NOLIBC. This patch mentions how to proceed, and
enumerates the list of files that are covered.

A comprehensive list of required include files is available here:

  https://en.cppreference.com/w/c/header

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
cec1505321 tools/nolibc/time: create time.h with time()
The time() syscall is used by a few simple applications, and is trivial
to implement based on gettimeofday() that we already have. Let's create
the file to ease porting and provide the function. It never returns any
error, though it may segfault in case of invalid pointer, like other
implementations relying on gettimeofday().
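
The idea can be sketched like this against a regular libc.
sketch_time() is a hypothetical name and the nolibc code may differ in
detail:

```c
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

static time_t sketch_time(time_t *tptr)
{
	struct timeval tv;

	/* The return value is ignored, matching "never returns any error". */
	gettimeofday(&tv, NULL);

	if (tptr)
		*tptr = tv.tv_sec;
	return tv.tv_sec;
}
```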

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
99cb50ab94 tools/nolibc/signal: move raise() to signal.h
This function is normally found in signal.h, and providing the file
eases porting of existing programs. Let's move it there.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
180a9797b0 tools/nolibc/unistd: add usleep()
This call is trivial to implement based on select() to complete sleep()
and msleep(), let's add it.
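
A minimal sketch of a select()-based delay, written against a regular
libc. sketch_usleep() is a hypothetical name:

```c
#include <stddef.h>
#include <sys/select.h>

static int sketch_usleep(unsigned int usecs)
{
	struct timeval tv = { usecs / 1000000, usecs % 1000000 };

	/* No descriptors are watched, so select() only waits for the timeout. */
	return select(0, NULL, NULL, NULL, &tv);
}
```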

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
4619de3446 tools/nolibc/unistd: extract msleep(), sleep(), tcsetpgrp() to unistd.h
These functions are normally provided by unistd.h. For ease of porting,
let's create the file and move them there.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
45a794bf7c tools/nolibc/errno: extract errno.h from sys.h
This allows us to provide a minimal errno.h to ease porting applications
that use it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
8d304a3740 tools/nolibc/string: export memset() and memmove()
"clang -Os" and "gcc -Ofast" without -ffreestanding may ignore memset()
and memmove(), hoping to provide their builtin equivalents, and finally
not find them. Thus we must export these functions for these rare cases.
Note that as they're set in their own sections, they will be eliminated
by the linker if not used. In addition, they do not prevent gcc from
identifying them and replacing them with the shorter "rep movsb" or
"rep stosb" when relevant.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
023033fe34 tools/nolibc/types: define PATH_MAX and MAXPATHLEN
These ones are often used and commonly set by applications to fallback
values. Let's fix them both to agree on PATH_MAX=4096 by default, as is
already present in linux/limits.h.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
dffeb81af5 tools/nolibc/arch: mark the _start symbol as weak
By doing so we can link together multiple C files that have been compiled
with nolibc and which each have a _start symbol.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
07f47ea06f tools/nolibc: move exported functions to their own section
Some functions like raise() and memcpy() are permanently exported because
they're needed by libgcc on certain platforms. However most of the time
they are not needed and needlessly take space.

Let's move them to their own sub-section, called .text.nolibc_<function>.
This allows ld to get rid of them if unused when passed --gc-sections.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:45 -07:00
Willy Tarreau
d9390de638 tools/nolibc/string: add tiny versions of strncat() and strlcat()
While these functions are often dangerous, forcing the user to work
around their absence is often much worse. Let's provide small versions
of each of them. The respective sizes in bytes on a few architectures
are:

  strncat(): x86:0x33 mips:0x68 arm:0x3c
  strlcat(): x86:0x25 mips:0x4c arm:0x2c

The two are quite different, and strncat() is even different from
strncpy() in that it limits the amount of data it copies and will always
terminate the output by one zero, while strlcat() will always limit the
total output to the specified size and will put a zero if possible.
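
The difference can be sketched as follows. Both functions are
hypothetical illustrations with BSD-like strlcat() semantics assumed,
not the size-optimized versions from this patch:

```c
#include <stddef.h>

/* Appends at most n bytes of src, then always writes a terminator. */
static char *sketch_strncat(char *dst, const char *src, size_t n)
{
	char *d = dst;

	while (*d)
		d++;
	while (n-- && *src)
		*d++ = *src++;
	*d = '\0';
	return dst;
}

/* Never lets dst grow past size bytes, terminator included. */
static size_t sketch_strlcat(char *dst, const char *src, size_t size)
{
	size_t dlen = 0, i;

	while (dlen < size && dst[dlen])
		dlen++;
	for (i = 0; src[i]; i++)
		if (dlen + i + 1 < size)
			dst[dlen + i] = src[i];
	if (dlen < size)
		dst[dlen + i < size ? dlen + i : size - 1] = '\0';
	return dlen + i;	/* length the full result would have had */
}
```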

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
b312eb0b87 tools/nolibc/string: add strncpy() and strlcpy()
These are minimal variants. strncpy() always fills the destination for
<size> chars, while strlcpy() copies no more than <size> including the
zero and returns the source's length. The respective sizes on various
archs are:

  strncpy(): x86:0x1f mips:0x30 arm:0x20
  strlcpy(): x86:0x17 mips:0x34 arm:0x1a
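
The two behaviours can be sketched as below. These are hypothetical
illustrations, not the compact implementations measured above:

```c
#include <stddef.h>

/* Copies src, then pads the rest of the size bytes with zeroes. */
static char *sketch_strncpy(char *dst, const char *src, size_t size)
{
	size_t i;

	for (i = 0; i < size && src[i]; i++)
		dst[i] = src[i];
	for (; i < size; i++)
		dst[i] = '\0';
	return dst;
}

/* Copies at most size - 1 bytes plus the zero, returns strlen(src). */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
	size_t len;

	for (len = 0; src[len]; len++)
		if (len + 1 < size)
			dst[len] = src[len];
	if (size)
		dst[len < size ? len : size - 1] = '\0';
	return len;
}
```

A return value of size or more from the strlcpy() variant tells the
caller that the copy was truncated.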

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
d76232ff8b tools/nolibc/string: slightly simplify memmove()
The direction test inside the loop was not always completely optimized,
resulting in a larger than necessary function. This change adds a
direction variable that is set out of the loop. Now the function is down
to 48 bytes on x86, 32 on ARM and 68 on mips. It's worth noting that other
approaches were attempted (including relying on the up and down functions)
but they were only slightly beneficial on x86 and cost more on others.
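
A sketch of the "direction variable set out of the loop" idea. The
function name is hypothetical and the code is an illustration, not a
copy of the patch:

```c
#include <stddef.h>
#include <stdint.h>

static void *sketch_memmove(void *dst, const void *src, size_t len)
{
	size_t pos = len;		/* default: copy from the end down */
	size_t dir = (size_t)-1;	/* adding this steps one byte back */

	if ((uintptr_t)dst < (uintptr_t)src) {
		pos = (size_t)-1;	/* first "pos += dir" yields index 0 */
		dir = 1;
	}

	/* The loop body no longer tests the direction. */
	while (len--) {
		pos += dir;
		((char *)dst)[pos] = ((const char *)src)[pos];
	}
	return dst;
}
```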

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
d8dcc2d8d9 tools/nolibc/string: use unidirectional variants for memcpy()
Till now memcpy() relies on memmove(), but it's always included for libgcc,
so we have a larger than needed function. Let's implement two unidirectional
variants to copy from bottom to top and from top to bottom, and use the
former for memcpy(). The variants are optimized to be compact, and at the
same time the compiler is sometimes able to detect the loop and to replace
it with a "rep movsb". The new function is 24 bytes instead of 52 on x86_64.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
830acd088e tools/nolibc/sys: make getpgrp(), getpid(), gettid() not set errno
These syscalls never fail so there is no need to extract and set errno
for them.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
6e277371a5 tools/nolibc/stdlib: make raise() use the lower level syscalls only
raise() doesn't set errno, so there's no point calling kill(), better
call sys_kill(), which also reduces the function's size.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
ac90226d53 tools/nolibc/stdlib: avoid a 64-bit shift in u64toh_r()
The build of printf() on mips requires libgcc for functions __ashldi3 and
__lshrdi3 due to 64-bit shifts when scanning the input number. These are
not really needed in fact since we scan the number 4 bits at a time. Let's
arrange the loop to perform two 32-bit shifts instead on 32-bit platforms.
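
One way to picture this, as a sketch only: keep the value in two 32-bit
halves and pull one nibble off the top per iteration. The function name
and the exact loop shape are assumptions:

```c
#include <stdint.h>

/* Writes the hex form of "in" to buffer (17 bytes max), returns its length. */
static int sketch_u64toh_r(uint64_t in, char *buffer)
{
	uint32_t hi = (uint32_t)(in >> 32);
	uint32_t lo = (uint32_t)in;
	int pos = 0, i;

	for (i = 0; i < 16; i++) {
		unsigned int dig = hi >> 28;	/* top 4 bits */

		/* Two 32-bit shifts replace one 64-bit shift. */
		hi = (hi << 4) | (lo >> 28);
		lo <<= 4;

		/* Skip leading zeroes but always emit the last digit. */
		if (dig || pos || i == 15)
			buffer[pos++] = "0123456789abcdef"[dig];
	}
	buffer[pos] = '\0';
	return pos;
}
```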

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
a7604ba149 tools/nolibc/sys: make open() take a vararg on the 3rd argument
Let's pass a vararg to open() so that it remains compatible with existing
code. The arg is only dereferenced when flags contain O_CREAT. The function
is generally not inlined anymore, causing an extra call (total 16 extra
bytes) but it's still optimized for constant propagation, limiting the
excess to no more than 16 bytes in practice when open() is called without
O_CREAT, and ~40 with O_CREAT, which remains reasonable.
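
The vararg handling can be sketched like this. sketch_open_mode() is a
hypothetical helper that only extracts the mode and does not call the
open syscall:

```c
#include <fcntl.h>
#include <stdarg.h>
#include <sys/types.h>

/* Returns the mode argument, which is only read when O_CREAT is set. */
static mode_t sketch_open_mode(int flags, ...)
{
	mode_t mode = 0;

	if (flags & O_CREAT) {
		va_list args;

		va_start(args, flags);
		mode = va_arg(args, int);
		va_end(args);
	}
	return mode;
}
```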

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
acab7bcdb1 tools/nolibc/stdio: add perror() to report the errno value
It doesn't contain the text for the error codes, but instead displays
"errno=" followed by the errno value. Just like the regular perror(), if
a non-empty message is passed, it's emitted first, followed by ": ",
before the errno code. The message is emitted on stderr.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
51469d5ab3 tools/nolibc/types: define EXIT_SUCCESS and EXIT_FAILURE
These are used in some man page examples and ease portability
tests.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
7e4346f4a3 tools/nolibc/stdio: add a minimal [vf]printf() implementation
This adds a minimal vfprintf() implementation as well as the commonly
used fprintf() and printf() that rely on it.

For now the function supports:
  - formats: %s, %c, %u, %d, %x
  - modifiers: %l and %ll
  - unknown chars are considered as modifiers and are ignored

It is designed to remain minimalist; despite this, printf() is 549 bytes
on x86_64. It would be wise not to add too many formats.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
e3e19052d5 tools/nolibc/stdio: add fwrite() to stdio
We'll use it to write substrings. It relies on a simpler _fwrite() that
only takes one size. fputs() was also modified to rely on it.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
99b037cbd5 tools/nolibc/stdio: add stdin/stdout/stderr and fget*/fput* functions
The standard puts() function always emits the trailing LF which makes it
inconvenient for small string concatenation. fputs() ought to be used
instead but it requires a FILE*.

This adds 3 dummy FILE* values (stdin, stdout, stderr) which are in fact
pointers to struct FILE of one byte. We reserve 3 pointer values for them,
-3, -2 and -1, so that they are ordered, easing the tests and mapping to
integer.
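
A sketch of how such dummy values can map to file descriptors. The
SKETCH_* names are hypothetical and the "add 3" mapping is an
assumption drawn from the -3, -2, -1 ordering described above:

```c
#include <stdint.h>

typedef struct SKETCH_FILE {
	char dummy[1];
} SKETCH_FILE;

/* Never dereferenced, only compared and converted. */
#define sketch_stdin  ((SKETCH_FILE *)(uintptr_t)-3)
#define sketch_stdout ((SKETCH_FILE *)(uintptr_t)-2)
#define sketch_stderr ((SKETCH_FILE *)(uintptr_t)-1)

/* Returns 0, 1 or 2 for the three streams, -1 for anything else. */
static int sketch_fileno(SKETCH_FILE *stream)
{
	uintptr_t v = (uintptr_t)stream;

	if (v < (uintptr_t)-3)
		return -1;
	return (int)(v + 3);
}
```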

From this, fgetc(), fputc(), fgets() and fputs() were implemented, and
the previous putchar() and getchar() now remap to these. The standard
getc() and putc() macros were also implemented as pointing to these
ones.

There is absolutely no buffering, fgetc() and fgets() read one byte at
a time, fputc() writes one byte at a time, and only fputs() which knows
the string's length writes all of it at once.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
4e383a66ac tools/nolibc/stdio: add a minimal set of stdio functions
This only provides getchar(), putchar(), and puts().

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
5f493178ef tools/nolibc/stdlib: add utoh() and u64toh()
This adds a pair of functions to emit hex values.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
b1c21e7d99 tools/nolibc/stdlib: add i64toa() and u64toa()
These are 64-bit variants of the itoa() and utoa() functions. They also
come in reentrant variants, and use the same itoa_buffer. The functions are
a bit larger than the previous ones in 32-bit mode (86 and 98 bytes on
x86_64 and armv7 respectively), which is why we continue to provide them
as separate functions.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:44 -07:00
Willy Tarreau
66c397c4d2 tools/nolibc/stdlib: replace the ltoa() function with more efficient ones
The original ltoa() function and the reentrant one ltoa_r() present a
number of drawbacks. The divide by 10 generates calls to external code
from libgcc_s, and the number does not necessarily start at the beginning
of the buffer.

Let's rewrite these functions so that they do not involve a divide and
only use loops on powers of 10, and implement both signed and unsigned
variants, always starting from the buffer's first character. Instead of
using a static buffer for each function, we're now using a common one.

In order to avoid confusion with the ltoa() name, the new functions are
called itoa_r() and utoa_r() to distinguish the signed and unsigned
versions, and for convenience for their callers, these functions now
return the number of characters emitted. The ltoa_r() function is just
an inline mapping to the signed one that returns the buffer.
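
A sketch of the divide-free conversion for the unsigned case, assuming
repeated subtraction against decreasing powers of 10. The name
sketch_utoa_r() is hypothetical and the details may differ from the
patch:

```c
/* Writes the decimal form of "in" to buffer (21 bytes max), returns its length. */
static int sketch_utoa_r(unsigned long in, char *buffer)
{
	unsigned long lim;
	int digits = 0;
	/* Highest power of 10 that fits: 10^19 on 64-bit, 10^9 on 32-bit. */
	int pos = (~0UL > 0xfffffffful) ? 19 : 9;
	int dig;

	do {
		/* lim = 10^pos, computed with multiplications only */
		for (dig = 0, lim = 1; dig < pos; dig++)
			lim *= 10;

		/* Skip leading zeroes but always emit the units digit. */
		if (digits || in >= lim || !pos) {
			for (dig = 0; in >= lim; dig++)
				in -= lim;
			buffer[digits++] = '0' + dig;
		}
	} while (pos--);

	buffer[digits] = '\0';
	return digits;
}
```

The output always starts at the first byte of the buffer, and the
return value gives the caller the string length for free.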

The functions are quite small (86 bytes on x86_64, 68 on armv7) and
do not depend anymore on external code.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
56d68a3c1f tools/nolibc/stdlib: move ltoa() to stdlib.h
This function is not standard and performs the opposite of atol(). Let's
move it with atol(). It's been split between a reentrant function and one
using a static buffer.

There are no definitions left in nolibc.h now.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
eba6d00d38 tools/nolibc/types: move makedev to types.h and make it a macro
The makedev() man page says it's supposed to be a macro and that some
OSes have it with the other ones in sys/types.h so it now makes sense
to move it to types.h as a macro. Let's also define major() and
minor() that perform the reverse operation.
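
A sketch of what such macros can look like. The sketch_* names and the
12-bit major / 8-bit minor layout are assumptions made for the example:

```c
typedef unsigned int sketch_dev_t;

/* Pack major and minor numbers into one device number, and back. */
#define sketch_makedev(major, minor) \
	((sketch_dev_t)((((major) & 0xfff) << 8) | ((minor) & 0xff)))
#define sketch_major(dev) ((unsigned int)(((dev) >> 8) & 0xfff))
#define sketch_minor(dev) ((unsigned int)((dev) & 0xff))
```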

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
306c9fd4c6 tools/nolibc/types: make FD_SETSIZE configurable
The macro was hard-coded to 256 but it's common to see it redefined.
Let's support this and make sure we always allocate enough entries for
the cases where it wouldn't be a multiple of 32.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
8cb98b3fce tools/nolibc/types: move the FD_* functions to macros in types.h
FD_SET, FD_CLR, FD_ISSET, FD_ZERO are often expected to be macros and
not functions. In addition we already have a file dedicated to such
macros and types used by syscalls, it's types.h, so let's move them
there and turn them to macros. FD_CLR() and FD_ISSET() were missing,
so they were added. FD_ZERO() now deals with its own loop so that it
doesn't rely on memset() that sets one byte at a time.
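
The macro approach can be sketched as below. All SKETCH_* names, the 32-bit word storage and the default size of 256 are assumptions made for illustration rather than the nolibc definitions, and the macros do no bounds checking on fd.

```c
#include <stdint.h>

/* Hypothetical names throughout; this is a sketch, not the nolibc code. */
#ifndef SKETCH_FD_SETSIZE
#define SKETCH_FD_SETSIZE 256
#endif

/* Round up so a size that is not a multiple of 32 still gets a full word. */
#define SKETCH_FD_WORDS ((SKETCH_FD_SETSIZE + 31) / 32)

typedef struct {
	uint32_t fds[SKETCH_FD_WORDS];
} sketch_fd_set;

#define SKETCH_FD_SET(fd, set) \
	((set)->fds[(fd) / 32] |= (uint32_t)1 << ((fd) & 31))
#define SKETCH_FD_CLR(fd, set) \
	((set)->fds[(fd) / 32] &= ~((uint32_t)1 << ((fd) & 31)))
#define SKETCH_FD_ISSET(fd, set) \
	(((set)->fds[(fd) / 32] >> ((fd) & 31)) & 1U)

/* Clear one 32-bit word per iteration instead of calling memset(). */
#define SKETCH_FD_ZERO(set) \
	do { \
		int idx_; \
		for (idx_ = 0; idx_ < SKETCH_FD_WORDS; idx_++) \
			(set)->fds[idx_] = 0; \
	} while (0)
```

Descriptor 40, for example, lands in word 1 at bit 8.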

Cc: David Laight <David.Laight@aculab.com>
Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
50850c38b2 tools/nolibc/ctype: add the missing is* functions
There was only isdigit, this commit adds the other ones.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
62a2af0774 tools/nolibc/ctype: split the is* functions to ctype.h
In fact there's only isdigit() for now. More should definitely be added.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
c91eb03389 tools/nolibc/string: split the string functions into string.h
The string manipulation functions (mem*, str*) are now found in
string.h. The file depends on almost nothing and will be
usable from other includes if needed. Maybe more functions could
be added.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
06fdba53e0 tools/nolibc/stdlib: extract the stdlib-specific functions to their own file
The new file stdlib.h contains the definitions of functions that
are usually found in stdlib.h. Many more could certainly be added.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
bd8c8fbb86 tools/nolibc/sys: split the syscall definitions into their own file
The syscall definitions were moved to sys.h. They were arranged
in a more easily maintainable order, whereby the sys_xxx() and xxx()
functions were grouped together, which also highlights the occasional
mappings such as wait relying on wait4().

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
271661c1cd tools/nolibc/arch: split arch-specific code into individual files
In order to ease maintenance, this splits the arch-specific code into
one file per architecture. A common file "arch.h" is used to include the
right file among arch-* based on the detected architecture. Projects
which are already split per architecture could simply rename these
files to $arch/arch.h and get rid of the common arch.h. For this
reason, include guards were placed into each arch-specific file.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
cc7a492ad0 tools/nolibc/types: split syscall-specific definitions into their own files
The macros and type definitions used by a number of syscalls were moved
to types.h where they will be easier to maintain. A few of them
are arch-specific and must not be moved there (e.g. O_*, sys_stat_struct).
A warning about them was placed at the top of the file.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:43 -07:00
Willy Tarreau
967cce191f tools/nolibc/std: move the standard type definitions to std.h
The ordering of includes and definitions for now is a bit of a mess, as
for example asm/signal.h is included after int definitions, but plenty of
structures are defined later as they rely on other includes.

Let's move the standard type definitions to a dedicated file that is
included first. We also move NULL there. This way all other includes
are aware of it, and we can bring asm/signal.h back to the top of the
file.

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 17:05:33 -07:00
Paul E. McKenney
fb036ad7db rcutorture: Make torture.sh allow for --kasan
The torture.sh script provides extra memory for scftorture and rcuscale.
However, the total memory provided is only 1G, which is less than the
2G that is required for KASAN testing.  This commit therefore ups the
torture.sh script's 1G to 2G.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
d69e048b27 rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU
Now that the Tasks RCU flavors are selected by their users rather than
by the rcutorture scenarios, torture.sh fails when attempting to run
NOPREEMPT scenarios for refscale and rcuscale.  This commit therefore
makes torture.sh specify CONFIG_TASKS_TRACE_RCU=y to avoid such failure.

Why not also CONFIG_TASKS_RCU?  Because tracing selects this one.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
3101562576 rcutorture: Make kvm.sh allow more memory for --kasan runs
KASAN allots significant memory to track allocation state, and the amount
of memory has increased recently, which results in frequent OOMs on a
few of the rcutorture scenarios.  This commit therefore provides 2G of
memory for --kasan runs, up from the 512M default.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
c7756fff4f torture: Save "make allmodconfig" .config file
Currently, torture.sh saves only the build output and exit code from the
"make allmodconfig" test.  This commit also saves the .config file.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
f877e3993b scftorture: Remove extraneous "scf" from per_version_boot_params
There is an extraneous "scf" in the per_version_boot_params shell function
used by scftorture.  No harm done in that it is just passed as an argument
to the /init program in initrd, but this commit nevertheless removes it.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
eec52c7fb5 rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC
Now that CONFIG_PREEMPT_DYNAMIC=y is the default, kernels that are
ostensibly built with CONFIG_PREEMPT_NONE=y or CONFIG_PREEMPT_VOLUNTARY=y
are now actually built with CONFIG_PREEMPT=y, but are by default booted
so as to disable preemption.  Although this allows much more flexibility
from a single kernel binary, it means that the current rcutorture
scenarios won't find build errors that happen only when preemption is
fully disabled at build time.

This commit therefore adds CONFIG_PREEMPT_DYNAMIC=n to several scenarios,
and while in the area switches one from CONFIG_PREEMPT_NONE=y to
CONFIG_PREEMPT_VOLUNTARY=y to add coverage of this Kconfig option.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
3e112a39f7 torture: Enable CSD-lock stall reports for scftorture
This commit passes the csdlock_debug=1 kernel parameter in order to
enable CSD-lock stall reports for torture.sh scftorture runs.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:55:03 -07:00
Paul E. McKenney
00f3133b7f torture: Skip vmlinux check for kvm-again.sh runs
The kvm-again.sh script reruns a previously built set of kernels, so
the vmlinux files are associated with that previous run, not this one.
This results in kvm-find_errors.sh reporting spurious failed-build errors.
This commit therefore omits the vmlinux check for kvm-again.sh runs.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:54:56 -07:00
Paul E. McKenney
bf5e7a2f46 scftorture: Adjust for TASKS_RCU Kconfig option being selected
This commit adjusts the scftorture PREEMPT and NOPREEMPT scenarios to
account for the TASKS_RCU Kconfig option being explicitly selected rather
than computed in isolation.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
5ce027f4cd rcuscale: Allow rcuscale without RCU Tasks Rude/Trace
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks Rude and RCU Tasks Trace.  Unless that kernel builds rcuscale,
whether built-in or as a module, in which case these RCU Tasks flavors are
(unnecessarily) built in.  This both increases kernel size and increases
the complexity of certain tracing operations.  This commit therefore
decouples the presence of rcuscale from the presence of RCU Tasks Rude
and RCU Tasks Trace.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
4df002d908 rcuscale: Allow rcuscale without RCU Tasks
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks.  Unless that kernel builds rcuscale, whether built-in or as
a module, in which case RCU Tasks is (unnecessarily) built.  This both
increases kernel size and increases the complexity of certain tracing
operations.  This commit therefore decouples the presence of rcuscale
from the presence of RCU Tasks.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
dec86781a5 refscale: Allow refscale without RCU Tasks Rude/Trace
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks Rude and RCU Tasks Trace.  Unless that kernel builds refscale,
whether built-in or as a module, in which case these RCU Tasks flavors are
(unnecessarily) built in.  This both increases kernel size and increases
the complexity of certain tracing operations.  This commit therefore
decouples the presence of refscale from the presence of RCU Tasks Rude
and RCU Tasks Trace.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
5f654af150 refscale: Allow refscale without RCU Tasks
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks.  Unless that kernel builds refscale, whether built-in or as a
module, in which case RCU Tasks is (unnecessarily) built in.  This both
increases kernel size and increases the complexity of certain tracing
operations.  This commit therefore decouples the presence of refscale
from the presence of RCU Tasks.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
58524e0fed rcutorture: Allow specifying per-scenario stat_interval
The rcutorture test suite makes double use of the rcutorture.stat_interval
module parameter.  As its name suggests, it controls the frequency
of statistics printing, but it also controls the rcu_torture_writer()
stall timeout.  The current setting of 15 seconds works surprisingly well.
However, given that the RCU tasks stall-warning timeout is ten -minutes-,
15 seconds is too short for TASKS02, which runs a non-preemptible kernel
on a single CPU.

This commit therefore adds checks for per-scenario specification of the
rcutorture.stat_interval module parameter.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
3831fc02f4 rcutorture: Add CONFIG_PREEMPT_DYNAMIC=n to TASKS02 scenario
Now that CONFIG_PREEMPT_DYNAMIC=y is the default, TASKS02 no longer
builds a pure non-preemptible kernel that uses Tiny RCU.  This commit
therefore fixes this new hole in rcutorture testing by adding
CONFIG_PREEMPT_DYNAMIC=n to the TASKS02 rcutorture scenario.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
4c3f7b0e1e rcutorture: Allow rcutorture without RCU Tasks Rude
Unless a kernel builds rcutorture, whether built-in or as a module, that
kernel is also built with CONFIG_TASKS_RUDE_RCU, whether anything else
needs Tasks Rude RCU or not.  This unnecessarily increases kernel size.
This commit therefore decouples the presence of rcutorture from the
presence of RCU Tasks Rude.

However, there is a need to select CONFIG_TASKS_RUDE_RCU for testing
purposes.  Except that casual users must not be bothered with
questions -- for them, this needs to be fully automated.  There is
thus a CONFIG_FORCE_TASKS_RUDE_RCU that selects CONFIG_TASKS_RUDE_RCU,
is user-selectable, but which depends on CONFIG_RCU_EXPERT.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
3b6e1dd423 rcutorture: Allow rcutorture without RCU Tasks
Currently, a CONFIG_PREEMPT_NONE=y kernel substitutes normal RCU for
RCU Tasks.  Unless that kernel builds rcutorture, whether built-in or as
a module, in which case RCU Tasks is (unnecessarily) used.  This both
increases kernel size and increases the complexity of certain tracing
operations.  This commit therefore decouples the presence of rcutorture
from the presence of RCU Tasks.

However, there is a need to select CONFIG_TASKS_RCU for testing purposes.
Except that casual users must not be bothered with questions -- for them,
this needs to be fully automated.  There is thus a CONFIG_FORCE_TASKS_RCU
that selects CONFIG_TASKS_RCU, is user-selectable, but which depends
on CONFIG_RCU_EXPERT.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
40c1278aa7 rcutorture: Allow rcutorture without RCU Tasks Trace
Unless a kernel builds rcutorture, whether built-in or as a module, that
kernel is also built with CONFIG_TASKS_TRACE_RCU, whether anything else
needs Tasks Trace RCU or not.  This unnecessarily increases kernel size.
This commit therefore decouples the presence of rcutorture from the
presence of RCU Tasks Trace.

However, there is a need to select CONFIG_TASKS_TRACE_RCU for
testing purposes.  Except that casual users must not be bothered with
questions -- for them, this needs to be fully automated.  There is thus
a CONFIG_FORCE_TASKS_TRACE_RCU that selects CONFIG_TASKS_TRACE_RCU,
is user-selectable, but which depends on CONFIG_RCU_EXPERT.

[ paulmck: Apply kernel test robot feedback. ]

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:53:19 -07:00
Paul E. McKenney
835f14ed53 rcu: Make the TASKS_RCU Kconfig option be selected
Currently, any kernel built with CONFIG_PREEMPTION=y also gets
CONFIG_TASKS_RCU=y, which is not helpful to people trying to build
preemptible kernels of minimal size.

Because CONFIG_TASKS_RCU=y is needed only in kernels doing tracing of
one form or another, this commit moves from TASKS_RCU deciding when it
should be enabled to the tracing Kconfig options explicitly selecting it.
This allows building preemptible kernels without TASKS_RCU, if desired.

This commit also updates the SRCU-N and TREE09 rcutorture scenarios
in order to avoid Kconfig errors that would otherwise result from
CONFIG_TASKS_RCU being selected without its CONFIG_RCU_EXPERT dependency
being met.

[ paulmck: Apply BPF_SYSCALL feedback from Andrii Nakryiko. ]

Reported-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Tested-by: Zhouyi Zhou <zhouzhouyi@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-20 16:52:58 -07:00
Liu Jian
127e7dca42 selftests/bpf: Add test for skb_load_bytes
Use bpf_prog_test_run_opts to test the skb_load_bytes function. Tests
the behavior when offset is greater than INT_MAX or a normal value.

Signed-off-by: Liu Jian <liujian56@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220416105801.88708-4-liujian56@huawei.com
2022-04-20 23:48:34 +02:00
Florian Fischer
75eafc970b perf list: Print all available tool events
Introduce names for the new tool events 'user_time' and 'system_time'.

  $ perf list
  ...
  duration_time                                      [Tool event]
  user_time                                          [Tool event]
  system_time                                        [Tool event]
  ...

Committer testing:

Before:

  $ perf list | grep Tool
  duration_time                                      [Tool event]
  $

After:

  $ perf list | grep Tool
    duration_time                                    [Tool event]
    user_time                                        [Tool event]
    system_time                                      [Tool event]
  $

Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220420174244.1741958-2-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20 15:05:00 -03:00
Florian Fischer
b03b89b350 perf stat: Add user_time and system_time events
It bothered me that during benchmarking using 'perf stat' (to collect
for example CPU cache events) I could not simultaneously retrieve the
times spent in user or kernel mode in a machine-readable format.

When running 'perf stat' the output for humans contains the times
reported by rusage and wait4.

  $ perf stat -e cache-misses:u -- true

   Performance counter stats for 'true':

             4,206      cache-misses:u

       0.001113619 seconds time elapsed

       0.001175000 seconds user
       0.000000000 seconds sys

But 'perf stat's machine-readable format does not provide this information.

  $ perf stat -x, -e cache-misses:u -- true
  4282,,cache-misses:u,492859,100.00,,

I found no way to retrieve this information using the available events
while using machine-readable output.

This patch adds two new tool internal events 'user_time' and
'system_time', similarly to the already present 'duration_time' event.

Both events use the already collected rusage information obtained by
wait4 and tracked in the global ru_stats.

Examples presenting cache-misses and rusage information in both human
and machine-readable form:

  $ perf stat -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .

   Performance counter stats for 'grep -q -r duration_time .':

        67,422,542 ns   duration_time:u
        50,517,000 ns   user_time:u
        16,839,000 ns   system_time:u
            30,937      cache-misses:u

       0.067422542 seconds time elapsed

       0.050517000 seconds user
       0.016839000 seconds sys

  $ perf stat -x, -e duration_time,user_time,system_time,cache-misses -- grep -q -r duration_time .
  72134524,ns,duration_time:u,72134524,100.00,,
  65225000,ns,user_time:u,65225000,100.00,,
  6865000,ns,system_time:u,6865000,100.00,,
  38705,,cache-misses:u,71189328,100.00,,

Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220420102354.468173-3-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20 13:44:56 -03:00
Florian Fischer
c735b0a521 perf stat: Introduce stats for the user and system rusage times
This is preparation for exporting rusage values as tool events.

Add new global stats tracking the values obtained via rusage.

For now only ru_utime and ru_stime are part of the tracked stats.

Both are stored as nanoseconds to be consistent with 'duration_time',
although the finest resolution the struct timeval data in rusage
provides are microseconds.
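
A minimal sketch of that unit conversion, assuming a hypothetical helper name (this is not the perf code itself):

```c
#include <stdint.h>
#include <sys/time.h>

/* Hypothetical helper: convert a struct timeval, whose finest unit is the
 * microsecond, into nanoseconds so it can sit next to a nanosecond value. */
static uint64_t sketch_timeval_to_ns(const struct timeval *tv)
{
	return (uint64_t)tv->tv_sec * 1000000000ULL +
	       (uint64_t)tv->tv_usec * 1000ULL;
}
```

Because the source resolution is microseconds, the last three decimal digits of the result are always zero.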

Signed-off-by: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220420102354.468173-2-florian.fischer@muhq.space
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20 13:38:41 -03:00
Martin Liška
c60664dea7 perf tools: Print warning when HAVE_DEBUGINFOD_SUPPORT is not set and user tries to use debuginfod support
When one requests debuginfod, either via --debuginfod option, or with a
perf-config value, complain when perf is not built with it.

Signed-off-by: Martin Liška <mliska@suse.cz>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/35bae747-3951-dc3d-a66b-abf4cebcd9cb@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20 13:36:36 -03:00
Martin Liška
b8836c2a4d perf version: Add HAVE_DEBUGINFOD_SUPPORT to built-in features
The change adds debuginfod to ./perf -vv:

  ...
  debuginfod: [ OFF ]  # HAVE_DEBUGINFOD_SUPPORT
  ...

Signed-off-by: Martin Liška <mliska@suse.cz>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: http://lore.kernel.org/lkml/0d1c5ace-88e8-7102-1565-7c143f01a966@suse.cz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-20 13:32:09 -03:00
Ido Schimmel
5e6242151d selftests: mlxsw: vxlan_flooding_ipv6: Prevent flooding of unwanted packets
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.

Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.

In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.

Fixes: d01724dd2a ("selftests: mlxsw: spectrum-2: Add a test for VxLAN flooding with IPv6")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:04:27 +01:00
Ido Schimmel
044011fdf1 selftests: mlxsw: vxlan_flooding: Prevent flooding of unwanted packets
The test verifies that packets are correctly flooded by the bridge and
the VXLAN device by matching on the encapsulated packets at the other
end. However, if packets other than those generated by the test also
ingress the bridge (e.g., MLD packets), they will be flooded as well and
interfere with the expected count.

Make the test more robust by making sure that only the packets generated
by the test can ingress the bridge. Drop all the rest using tc filters
on the egress of 'br0' and 'h1'.

In the software data path, the problem can be solved by matching on the
inner destination MAC or dropping unwanted packets at the egress of the
VXLAN device, but this is not currently supported by mlxsw.

Fixes: 94d302deae ("selftests: mlxsw: Add a test for VxLAN flooding")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Amit Cohen <amcohen@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-20 15:04:27 +01:00
Pu Lehui
58ca8b0572 libbpf: Support riscv USDT argument parsing logic
Add riscv-specific USDT argument specification parsing logic.
riscv USDT argument format is shown below:
- Memory dereference case:
  "size@off(reg)", e.g. "-8@-88(s0)"
- Constant value case:
  "size@val", e.g. "4@5"
- Register read case:
  "size@reg", e.g. "-8@a1"

s8 is marked as poison even though it is a riscv register name, so we
need to alias it in advance. Both RV32 and RV64 have been tested.
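
The three argument formats listed above can be told apart with sscanf(), as in the following sketch. The sketch_* types and function are hypothetical and simplified; this is not the libbpf parser, and register names are assumed to consist of lowercase letters and digits.

```c
#include <stdio.h>

enum sketch_arg_kind {
	SKETCH_ARG_INVALID = 0,
	SKETCH_ARG_REG_DEREF,	/* "size@off(reg)" */
	SKETCH_ARG_CONST,	/* "size@val" */
	SKETCH_ARG_REG		/* "size@reg" */
};

struct sketch_arg {
	int size;	/* a negative size denotes a signed argument */
	long value;	/* offset or constant, depending on the kind */
	char reg[16];	/* register name, empty for constants */
};

/* Hypothetical parser: try each format in turn. %n records how many
 * characters were consumed, so trailing garbage is rejected. */
static enum sketch_arg_kind sketch_parse_arg(const char *str,
					     struct sketch_arg *arg)
{
	int len = 0;

	/* Memory dereference case, e.g. "-8@-88(s0)". */
	if (sscanf(str, "%d@%ld(%15[a-z0-9])%n",
		   &arg->size, &arg->value, arg->reg, &len) == 3 &&
	    len > 0 && str[len] == '\0')
		return SKETCH_ARG_REG_DEREF;

	/* Constant value case, e.g. "4@5". Tried before the register case
	 * because a bare number would also match the register pattern. */
	len = 0;
	arg->reg[0] = '\0';
	if (sscanf(str, "%d@%ld%n", &arg->size, &arg->value, &len) == 2 &&
	    len > 0 && str[len] == '\0')
		return SKETCH_ARG_CONST;

	/* Register read case, e.g. "-8@a1". */
	len = 0;
	arg->value = 0;
	if (sscanf(str, "%d@%15[a-z0-9]%n", &arg->size, arg->reg, &len) == 2 &&
	    len > 0 && str[len] == '\0')
		return SKETCH_ARG_REG;

	return SKETCH_ARG_INVALID;
}
```

The order of the attempts matters: the most specific pattern is tried first, and each attempt must consume the whole string to count as a match.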

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-3-pulehui@huawei.com
2022-04-19 21:59:35 -07:00
Pu Lehui
5af25a410a libbpf: Fix usdt_cookie being cast to 32 bits
The usdt_cookie is defined as __u64, which should not be
used as a long type because it will be cast to 32 bits
in 32-bit platforms.
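
The truncation can be illustrated with the sketch below. The function names are hypothetical, and uint32_t stands in for a 32-bit long so that the effect is visible on any build host.

```c
#include <stdint.h>

/* Hypothetical illustration: uint32_t stands in for a 32-bit "long". */
static uint64_t sketch_cookie_via_long32(uint64_t cookie)
{
	uint32_t as_long32 = (uint32_t)cookie;	/* upper 32 bits are lost */

	return as_long32;
}

/* Keeping the 64-bit type end to end preserves the full value. */
static uint64_t sketch_cookie_via_u64(uint64_t cookie)
{
	return cookie;
}
```

A cookie of 0x1122334455667788 comes back as 0x55667788 from the first function and unchanged from the second.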

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220419145238.482134-2-pulehui@huawei.com
2022-04-19 21:59:35 -07:00
Geliang Tang
abd26d348b selftests: mqueue: drop duplicate min definition
Drop duplicate macro min() definition in mq_perf_tests.c, use MIN() in
sys/param.h instead.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-19 19:28:47 -06:00
Ze Zhang
d490527d30 selftests/ftrace: add mips support for kprobe args syntax tests
This is the mips variant of commit <3990b5baf225> ("selftests/ftrace:
Add s390 support for kprobe args tests").

Signed-off-by: Ze Zhang <zhangze@loongson.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-19 19:23:23 -06:00
Ze Zhang
2238a1f490 selftests/ftrace: add mips support for kprobe args string tests
This is the mips variant of commit <3990b5baf225> ("selftests/ftrace:
Add s390 support for kprobe args tests").

Signed-off-by: Ze Zhang <zhangze@loongson.cn>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-19 19:23:10 -06:00
Kumar Kartikeya Dwivedi
24fe983abe selftests/bpf: Add tests for type tag order validation
Add a few test cases that ensure we catch cases of badly ordered type
tags in modifier chains.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220419164608.1990559-3-memxor@gmail.com
2022-04-19 14:02:49 -07:00
Andrii Nakryiko
0d7fefebea selftests/bpf: Use non-autoloaded programs in few tests
Take advantage of the new libbpf feature for declarative non-autoloaded
BPF program SEC() definitions in a few tests that test a single program
at a time out of many available programs within a single BPF object.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-2-andrii@kernel.org
2022-04-19 13:48:20 -07:00
Andrii Nakryiko
a3820c4811 libbpf: Support opting out from autoloading BPF programs declaratively
Establish a SEC("?abc") naming convention (i.e., adding a question mark
in front of an otherwise normal section name) that allows setting the
corresponding program's autoload property to false. This is effectively
just a declarative way to do bpf_program__set_autoload(prog, false).

Having a way to do this declaratively in BPF code itself is useful and
convenient for various scenarios. E.g., for testing, when BPF object
consists of multiple independent BPF programs that each needs to be
tested separately. Opting out all of them by default and then setting
autoload to true for just one of them at a time simplifies testing code
(see next patch for few conversions in BPF selftests taking advantage of
this new feature).

Another real-world use case is in libbpf-tools for cases when different
BPF programs have to be picked depending on particulars of the host
kernel due to various incompatible changes (like kernel function renames
or signature change, or to pick kprobe vs fentry depending on
corresponding kernel support for the latter). Marking all the different
BPF program candidates as non-autoloaded declaratively makes this more
obvious in BPF source code and allows simpler code in user-space code.

When a BPF program is marked as SEC("?abc"), it is otherwise treated just
like SEC("abc"), and bpf_program__section_name() will report "abc".
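
The convention can be sketched as below. The struct and helper are hypothetical and only illustrate the leading question mark being stripped while autoload is cleared; this is not libbpf's internal code.

```c
#include <stdbool.h>

struct sketch_prog {
	const char *sec_name;	/* section name as reported to users */
	bool autoload;
};

/* Hypothetical helper, not libbpf code: apply the "?" convention to a raw
 * section name taken from the object file. */
static void sketch_init_prog(struct sketch_prog *prog, const char *raw_sec_name)
{
	prog->autoload = true;
	if (raw_sec_name[0] == '?') {
		prog->autoload = false;
		raw_sec_name++;	/* report "abc", not "?abc" */
	}
	prog->sec_name = raw_sec_name;
}
```

User-space code can then turn autoload back on for the one program it wants, for example with bpf_program__set_autoload(prog, true).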

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419002452.632125-1-andrii@kernel.org
2022-04-19 13:48:20 -07:00
Josh Poimboeuf
08feafe8d1 objtool: Fix function fallthrough detection for vmlinux
Objtool's function fallthrough detection only works on C objects.
The distinction between C and assembly objects no longer makes sense
with objtool running on vmlinux.o.

Now that copy_user_64.S has been fixed up, and an objtool sibling call
detection bug has been fixed, the asm code is in "compliance" and this
hack is no longer needed.  Remove it.

Fixes: ed53a0d971 ("x86/alternative: Use .ibt_endbr_seal to seal indirect calls")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/b434cff98eca3a60dcc64c620d7d5d405a0f441c.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
34c861e806 objtool: Fix sibling call detection in alternatives
In add_jump_destinations(), sibling call detection requires 'insn->func'
to be valid.  But alternative instructions get their 'func' set in
handle_group_alt(), which runs *after* add_jump_destinations().  So
sibling calls in alternatives code don't get properly detected.

Fix that by changing the initialization order: call
add_special_section_alts() *before* add_jump_destinations().

This also means the special case for a missing 'jump_dest' in
add_jump_destinations() can be removed, as it has already been dealt
with.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/c02e0a0a2a4286b5f848d17c77fdcb7e0caf709c.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
26ff604102 objtool: Don't set 'jump_dest' for sibling calls
For most sibling calls, 'jump_dest' is NULL because objtool treats the
jump like a call and sets 'call_dest'.  But there are a few edge cases
where that's not true.  Make it consistent to avoid unexpected behavior.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/8737d6b9d1691831aed73375f444f0f42da3e2c9.1649718562.git.jpoimboe@redhat.com
2022-04-19 21:58:53 +02:00
Josh Poimboeuf
1d08b92fa2 objtool: Use offstr() to print address of missing ENDBR
Fixes: 89bc853eae ("objtool: Find unused ENDBR instructions")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/95d12e800c736a3f7d08d61dabb760b2d5251a8e.1650300597.git.jpoimboe@redhat.com
2022-04-19 21:58:50 +02:00
Josh Poimboeuf
4baae989e6 objtool: Print data address for "!ENDBR" data warnings
When a "!ENDBR" warning is reported for a data section, objtool just
prints the text address of the relocation target twice, without giving
any clues about the location of the original data reference:

  vmlinux.o: warning: objtool: dcbnl_netdevice_event()+0x0: .text+0xb64680: data relocation to !ENDBR: dcbnl_netdevice_event+0x0

Instead, print the address of the data reference, in addition to the
address of the relocation target.

  vmlinux.o: warning: objtool: dcbnl_nb+0x0: .data..read_mostly+0xe260: data relocation to !ENDBR: dcbnl_netdevice_event+0x0

Fixes: 89bc853eae ("objtool: Find unused ENDBR instructions")
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/762e88d51300e8eaf0f933a5b0feae20ac033bea.1650300597.git.jpoimboe@redhat.com
2022-04-19 21:58:50 +02:00
Peter Zijlstra
d4e5268a08 x86,objtool: Mark cpu_startup_entry() __noreturn
GCC-8 isn't clever enough to figure out that cpu_startup_entry() is a
noreturn while objtool is. This results in code after the call in
start_secondary(). Give GCC a hand so that they all agree on things.

  vmlinux.o: warning: objtool: start_secondary()+0x10e: unreachable

Reported-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220408094718.383658532@infradead.org
2022-04-19 21:58:48 +02:00
Yonghong Song
44df171a10 selftests/bpf: Workaround a verifier issue for test exhandler
The llvm patch [1] enabled opaque pointers, which caused the 'exhandler'
selftest to fail:
  ...
  ; work = task->task_works;
  7: (79) r1 = *(u64 *)(r6 +2120)       ; R1_w=ptr_callback_head(off=0,imm=0) R6_w=ptr_task_struct(off=0,imm=0)
  ; func = work->func;
  8: (79) r2 = *(u64 *)(r1 +8)          ; R1_w=ptr_callback_head(off=0,imm=0) R2_w=scalar()
  ; if (!work && !func)
  9: (4f) r1 |= r2
  math between ptr_ pointer and register with unbounded min value is not allowed

  below is insn 10 and 11
  10: (55) if r1 != 0 goto +5
  11: (18) r1 = 0 ll
  ...

In llvm, 'r1 |= r2' is generated in the codegen selectiondag phase because
of differences between opaque and non-opaque pointers.
Without [1], the related code looks like:
  r2 = *(u64 *)(r6 + 2120)
  r1 = *(u64 *)(r2 + 8)
  if r2 != 0 goto +6 <LBB0_4>
  if r1 != 0 goto +5 <LBB0_4>
  r1 = 0 ll
  ...

I haven't found a good way to fix this issue in llvm, so let us work around
the problem first so that bpf CI won't be blocked.
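
One general way to keep a compiler from folding two NULL checks into a
single OR is to put a compiler barrier in front of each test. The sketch
below is plain user-space C and is not taken from the selftest; the actual
change is not shown in this message. The barrier_var() macro is an
assumption modelled on the helper of the same name in libbpf, and
callback_head_like and both_null() are hypothetical.

```c
#include <stddef.h>

/*
 * Assumption: an empty asm statement that makes the compiler treat
 * var as modified, so the two tests below cannot be merged.
 */
#define barrier_var(var) __asm__ volatile("" : "+r"(var))

/* Hypothetical stand-in for the structure being inspected. */
struct callback_head_like {
	struct callback_head_like *next;
	void (*func)(void);
};

/*
 * Returns 1 only when both pointers are NULL, testing each one with
 * its own branch instead of OR-ing the two values together.
 */
int both_null(struct callback_head_like *work, void (*func)(void))
{
	barrier_var(work);
	if (work)
		return 0;
	barrier_var(func);
	if (func)
		return 0;
	return 1;
}
```

The point is only that each pointer gets its own compare-and-branch, which
matches the shape of the code generated without [1].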

  [1] https://reviews.llvm.org/D123300

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419050900.3136024-1-yhs@fb.com
2022-04-19 10:22:19 -07:00
Yonghong Song
8c89b5db7a selftests/bpf: Limit unroll_count for pyperf600 test
LLVM commit [1] changed loop pragma behavior such that
a full loop unroll requested by a user pragma is always honored.
Previously, the unroll count also depended on the unrolled
code size. For pyperf600, without [1], the loop unroll
count is 150. With [1], the loop unroll count is 600.

The unroll count of 600 brought the program size close to
298k, which caused the following code to be generated:
         0:       7b 1a 00 ff 00 00 00 00 *(u64 *)(r10 - 256) = r1
  ;       uint64_t pid_tgid = bpf_get_current_pid_tgid();
         1:       85 00 00 00 0e 00 00 00 call 14
         2:       bf 06 00 00 00 00 00 00 r6 = r0
  ;       pid_t pid = (pid_t)(pid_tgid >> 32);
         3:       bf 61 00 00 00 00 00 00 r1 = r6
         4:       77 01 00 00 20 00 00 00 r1 >>= 32
         5:       63 1a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r1
         6:       bf a2 00 00 00 00 00 00 r2 = r10
         7:       07 02 00 00 fc ff ff ff r2 += -4
  ;       PidData* pidData = bpf_map_lookup_elem(&pidmap, &pid);
         8:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
        10:       85 00 00 00 01 00 00 00 call 1
        11:       bf 08 00 00 00 00 00 00 r8 = r0
  ;       if (!pidData)
        12:       15 08 15 e8 00 00 00 00 if r8 == 0 goto -6123 <LBB0_27588+0xffffffffffdae100>

Note that insn 12 has a branch offset of -6123, which is clearly illegal
and will be rejected by the verifier. The offset is negative because
the branch range is greater than INT16_MAX.
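
The wrap-around can be reproduced in a few lines of C. The BPF branch
offset is a signed 16-bit field, and the offset bytes "15 e8" in insn 12
are 0xe815 in little-endian order. The helper names below are
hypothetical, and 59413 is only the smallest forward distance that
truncates to this value; the message does not state the real distance.

```c
#include <stdint.h>

/*
 * Hypothetical helper: keep only the low 16 bits of a jump distance,
 * the way a 16-bit signed offset field would store it. The final
 * conversion is implementation-defined in C; GCC and Clang wrap.
 */
int16_t encode_branch_off(int32_t distance)
{
	return (int16_t)(uint16_t)distance;
}

/* A distance is encodable only if it lies within the int16_t range. */
int branch_off_fits(int32_t distance)
{
	return distance >= INT16_MIN && distance <= INT16_MAX;
}
```

With GCC or Clang, encode_branch_off(59413) yields -6123, the offset shown
for insn 12, while branch_off_fits(59413) is false.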

This patch changes the unroll count to 150 to avoid the above
out-of-range branch target. llvm has also been enhanced ([2])
to assert if the branch target insn is out of INT16 range.
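
For reference, an explicit unroll count is requested from clang with a
loop pragma. This is a minimal sketch rather than the selftest code:
sum_first() is a hypothetical function, and the count of 150 simply
mirrors the number above. The pragma is guarded so that other compilers
ignore it.

```c
/*
 * Hypothetical loop: with clang, the pragma asks for the body to be
 * unrolled 150 times instead of a full unroll.
 */
int sum_first(const int *vals, int n)
{
	int total = 0;
	int i;

#if defined(__clang__)
#pragma clang loop unroll_count(150)
#endif
	for (i = 0; i < n; i++)
		total += vals[i];

	return total;
}
```

The pragma only affects code generation; the result of the loop is the
same with or without it.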

  [1] https://reviews.llvm.org/D119148
  [2] https://reviews.llvm.org/D123877

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220419043230.2928530-1-yhs@fb.com
2022-04-19 10:18:56 -07:00
Rafael J. Wysocki
9765fa2566 Merge branch 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux
Pull turbostat changes for 5.19 from Len Brown:

"Chen Yu (1):
      tools/power turbostat: Support thermal throttle count print

Dan Merillat (1):
      tools/power turbostat: fix dump for AMD cpus

Len Brown (5):
      tools/power turbostat: tweak --show and --hide capability
      tools/power turbostat: fix ICX DRAM power numbers
      tools/power turbostat: be more useful as non-root
      tools/power turbostat: No build warnings with -Wextra
      tools/power turbostat: version 2022.04.16

Sumeet Pawnikar (2):
      tools/power turbostat: Add Power Limit4 support
      tools/power turbostat: print power values upto three decimal

Zephaniah E. Loss-Cutler-Hull (2):
      tools/power turbostat: Allow -e for all names.
      tools/power turbostat: Allow printing header every N iterations"

* 'turbostat' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux:
  tools/power turbostat: version 2022.04.16
  tools/power turbostat: No build warnings with -Wextra
  tools/power turbostat: be more useful as non-root
  tools/power turbostat: fix ICX DRAM power numbers
  tools/power turbostat: Support thermal throttle count print
  tools/power turbostat: Allow printing header every N iterations
  tools/power turbostat: Allow -e for all names.
  tools/power turbostat: print power values upto three decimal
  tools/power turbostat: Add Power Limit4 support
  tools/power turbostat: fix dump for AMD cpus
  tools/power turbostat: tweak --show and --hide capability
2022-04-19 17:43:25 +02:00
Mykola Lysenko
2324257dbd selftests/bpf: Refactor prog_tests logging and test execution
This is a pre-req to add separate logging for each subtest in
test_progs.

Move all the mutable test data to the test_result struct.
Move per-test init/de-init into the run_one_test function.
Consolidate data aggregation and final log output in
calculate_and_print_summary function.
As a side effect, this patch fixes double counting of errors
for subtests and possible duplicate output of subtest log
on failures.

Also, add prog_tests_framework.c test to verify some of the
counting logic.

As part of verification, confirmed that number of reported
tests is the same before and after the change for both parallel
and sequential test execution.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220418222507.1726259-1-mykolal@fb.com
2022-04-18 21:22:13 -07:00
Ian Rogers
87e0a30e9a perf vendor events intel: Update goldmont event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-14-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:38:28 -03:00
Ian Rogers
f51c401f11 perf vendor events intel: Update goldmontplus event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:38:18 -03:00
Ian Rogers
8f1a69825f perf vendor events intel: Update elkhartlake event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-12-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:38:08 -03:00
Ian Rogers
44a4b9ad8e perf vendor events intel: Update westmereex event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-11-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:37:56 -03:00
Ian Rogers
7f2c72fa69 perf vendor events intel: Update westmereep-sp event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-10-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:37:43 -03:00
Ian Rogers
a01174fc9e perf vendor events intel: Update westmereep-dp event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:37:32 -03:00
Ian Rogers
55ae1b759e perf vendor events intel: Update tremontx uncore and topics
Update the topic of BTCLEAR.ANY and add additional uncore event names
as per:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:37:22 -03:00
Ian Rogers
45d97cdd2f perf vendor events intel: Update tigerlake topic
Update the topic of ASSISTS.ANY as per:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:37:10 -03:00
Ian Rogers
da578feb70 perf vendor events intel: Update nehalemep event topics
Apply topic updates from:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:36:59 -03:00
Ian Rogers
339ec95167 perf vendor events intel: Update SKX uncore
JSON uncore events are generated for Skylake Server for v1.26
with events from:

https://download.01.org/perfmon/SKX/

New event names are added, that match the original JSON names,
due to an update to:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:36:35 -03:00
Ian Rogers
dd498d0804 perf vendor events intel: Update CLX uncore to v1.14
JSON uncore events are generated for CascadeLake Server for v1.14 with
events from:

https://download.01.org/perfmon/CLX/

New event names are added, that match the original JSON names,
due to an update to:

https://github.com/intel/event-converter-for-linux-perf/

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:36:15 -03:00
Ian Rogers
12c6385eeb perf vendor events intel: Add sapphirerapids events
Events were generated from 01.org using:

https://github.com/intel/event-converter-for-linux-perf

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:36:07 -03:00
Ian Rogers
cbeee6caa4 perf vendor events intel: Fix icelakex cstate metrics
Apply cstate fix from:

https://github.com/intel/event-converter-for-linux-perf/

so that metrics for cstates that exist on the particular architecture
are generated. This corrects issues with metric testing.

Also correct topic of ASSISTS.ANY event.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:35:54 -03:00
Ian Rogers
2c77f36a9a perf vendor events intel: Fix icelake cstate metrics
Apply cstate fix from:

https://github.com/intel/event-converter-for-linux-perf/

so that metrics for cstates that exist on the particular architecture
are generated. This corrects issues with metric testing.

Also correct topic of ASSISTS.ANY event.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220413210503.3256922-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 12:35:03 -03:00
Leo Yan
fdefc3750e perf mem: Print memory operation type
Memory operation types are not limited to load and store. To make the
operation type easier to review, this patch prints it.

Before:
  ls 14753 [011]  3678.072400:  1    l1d-miss:  88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1  l1d-access:  88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1  tlb-access:  88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1      memory:  88000182 L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])

After:

  ls 14753 [011]  3678.072400:  1    l1d-miss:  88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1  l1d-access:  88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1  tlb-access:  88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])
  ls 14753 [011]  3678.072400:  1      memory:  88000182 |OP LOAD|LVL L1 miss|SNP N/A|TLB Walker hit|LCK No|BLK  N/A ffffa7c22b4b2a00 [unknown] ([kernel.kallsyms])

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220417124524.901148-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-18 11:44:06 -03:00
Jiri Pirko
e1fad9517f selftests: mlxsw: Introduce devlink line card provision/unprovision/activation tests
Introduce basic line card manipulation which consists of provisioning,
unprovisioning and activation of a line card.

Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-18 11:00:19 +01:00
Linus Torvalds
3a69a44278 Two x86 fixes related to TSX:
 - Use either MSR_TSX_FORCE_ABORT or MSR_IA32_TSX_CTRL to disable TSX to
   cover all CPUs which allow to disable it.

 - Disable TSX development mode at boot so that a microcode update which
   provides TSX development mode does not suddenly make the system
   vulnerable to TSX Asynchronous Abort.

Merge tag 'x86-urgent-2022-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "Two x86 fixes related to TSX:

   - Use either MSR_TSX_FORCE_ABORT or MSR_IA32_TSX_CTRL to disable TSX
     to cover all CPUs which allow to disable it.

   - Disable TSX development mode at boot so that a microcode update
     which provides TSX development mode does not suddenly make the
     system vulnerable to TSX Asynchronous Abort"

* tag 'x86-urgent-2022-04-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/tsx: Disable TSX development mode at boot
  x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits
2022-04-17 09:55:59 -07:00
Arun Ajith S
f9a2fb7331 net/ipv6: Introduce accept_unsolicited_na knob to implement router-side changes for RFC9131
Add a new neighbour cache entry in STALE state for routers on receiving
an unsolicited (gratuitous) neighbour advertisement with
target link-layer-address option specified.
This is similar to the arp_accept configuration for IPv4.
A new sysctl endpoint is created to turn on this behaviour:
/proc/sys/net/ipv6/conf/interface/accept_unsolicited_na.

Signed-off-by: Arun Ajith S <aajith@arista.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-17 13:23:49 +01:00
Len Brown
58990892ca tools/power turbostat: version 2022.04.16
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-17 00:05:25 -04:00
Len Brown
9878bf7a9f tools/power turbostat: No build warnings with -Wextra
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 23:45:18 -04:00
Len Brown
164d7a965b tools/power turbostat: be more useful as non-root
Don't exit if used this way:

sudo setcap cap_sys_nice,cap_sys_rawio=+ep ./turbostat
sudo chmod +r /dev/cpu/*/msr
./turbostat

note: cap_sys_admin is now also needed for the perf IPC counter:
sudo setcap cap_sys_admin,cap_sys_nice,cap_sys_rawio=+ep ./turbostat

Reported-by: Artem S. Tashkinov <aros@gmx.com>
Reported-by: Toby Broom <tbroom@outlook.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 23:07:05 -04:00
Len Brown
6397b64189 tools/power turbostat: fix ICX DRAM power numbers
ICX (and its duplicates) require special hard-coded DRAM RAPL units,
rather than using the generic RAPL energy units.

Reported-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:15 -04:00
Chen Yu
eae97e053f tools/power turbostat: Support thermal throttle count print
The turbostat data is collected by end users for power evaluation. However,
it looks like we are missing sufficient thermal context there. A couple of
times already, power management developers have had to ask for something
like this:
grep -r . /sys/devices/system/cpu/cpu*/thermal_throttle/*

Print the per-core thermal throttle count to provide sufficient thermal
context.

turbostat -i 5 -s Core,CPU,CoreThr
Core	CPU	CoreThr
-	-	104
0	0	61
0	4
1	1	0
1	5
2	2	104
2	6
3	3	7
3	7

Suggested-by: Artem Bityutskiy <artem.bityutskiy@linux.intel.com>
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:15 -04:00
Zephaniah E. Loss-Cutler-Hull
c7e399f839 tools/power turbostat: Allow printing header every N iterations
This gives the ability to reprint the header every N iterations, so you
can ensure that a scrolling display always has the header visible
somewhere on the screen.

Signed-off-by: Zephaniah E. Loss-Cutler-Hull <zephaniah@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:15 -04:00
Zephaniah E. Loss-Cutler-Hull
0fc521bc33 tools/power turbostat: Allow -e for all names.
Currently, there are a number of variables which are displayed by
default, enabled with -e all, and listed by --list, but which you can
not give to --enable/-e.

So you can enable CPU0c1 (in the bic array), but you can't enable C1 or
C1% (not in the bic array, but exists in sysfs).

This runs counter to both the documentation and user expectations, and
it's just not very user friendly.

As such, the mechanism used by --hide has been duplicated, and is now
also used by --enable, so we can handle unknown names gracefully.

Note: One impact of this is that truly unknown fields given to --enable
will no longer generate errors; they will be silently ignored, as --hide
does.

Signed-off-by: Zephaniah E. Loss-Cutler-Hull <zephaniah@gmail.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:15 -04:00
Sumeet Pawnikar
6b398625ae tools/power turbostat: print power values upto three decimal
Print power values up to three decimal places in watts.

Suggested-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:15 -04:00
Sumeet Pawnikar
f52ba93190 tools/power turbostat: Add Power Limit4 support
Add Power Limit4 support.

Signed-off-by: Sumeet Pawnikar <sumeet.r.pawnikar@intel.com>
Acked-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:14 -04:00
Dan Merillat
6799ba84ca tools/power turbostat: fix dump for AMD cpus
turbostat --Dump exits early with status 243 (-13)

get_counters() calls get_msr_sum() on Zen CPUs
for MSR_PKG_ENERGY_STAT, but per_cpu_msr_sum
has not been initialized.

Signed-off-by: Dan Merillat <git@dan.eginity.com>
Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:58:14 -04:00
Len Brown
5dc241f2b2 tools/power turbostat: tweak --show and --hide capability
Allow invocations such as # turbostat --show power,Busy%

Previously, "Busy%" was ignored.

Signed-off-by: Len Brown <len.brown@intel.com>
2022-04-16 21:17:18 -04:00
Kees Cook
2e53b877dc lkdtm: Add CFI_BACKWARD to test ROP mitigations
In order to test various backward-edge control flow integrity methods,
add a test that manipulates the return address on the stack. Currently
only arm64 Pointer Authentication and Shadow Call Stack are supported.

 $ echo CFI_BACKWARD | cat >/sys/kernel/debug/provoke-crash/DIRECT

Under SCS, successful test of the mitigation is reported as:

 lkdtm: Performing direct entry CFI_BACKWARD
 lkdtm: Attempting unchecked stack return address redirection ...
 lkdtm: ok: redirected stack return address.
 lkdtm: Attempting checked stack return address redirection ...
 lkdtm: ok: control flow unchanged.

Under PAC, successful test of the mitigation is reported by the PAC
exception handler:

 lkdtm: Performing direct entry CFI_BACKWARD
 lkdtm: Attempting unchecked stack return address redirection ...
 lkdtm: ok: redirected stack return address.
 lkdtm: Attempting checked stack return address redirection ...
 Unable to handle kernel paging request at virtual address bfffffc0088d0514
 Mem abort info:
   ESR = 0x86000004
   EC = 0x21: IABT (current EL), IL = 32 bits
   SET = 0, FnV = 0
   EA = 0, S1PTW = 0
   FSC = 0x04: level 0 translation fault
 [bfffffc0088d0514] address between user and kernel address ranges
 ...

If the CONFIGs are missing (or the mitigation isn't working), failure
is reported as:

 lkdtm: Performing direct entry CFI_BACKWARD
 lkdtm: Attempting unchecked stack return address redirection ...
 lkdtm: ok: redirected stack return address.
 lkdtm: Attempting checked stack return address redirection ...
 lkdtm: FAIL: stack return address was redirected!
 lkdtm: This is probably expected, since this kernel was built *without* CONFIG_ARM64_PTR_AUTH_KERNEL=y nor CONFIG_SHADOW_CALL_STACK=y

Co-developed-by: Dan Li <ashimida@linux.alibaba.com>
Signed-off-by: Dan Li <ashimida@linux.alibaba.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/lkml/20220416001103.1524653-1-keescook@chromium.org
2022-04-16 13:57:23 -07:00
Linus Torvalds
bb34e0dba3 linux-kselftest-fixes-5.18-rc3
This Kselftest fixes update consists of a mqueue perf test memory leak
bug fix. mq_perf_tests failed to call CPU_FREE to free memory allocated
by CPU_SET.

Merge tag 'linux-kselftest-fixes-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "A mqueue perf test memory leak bug fix.

  mq_perf_tests failed to call CPU_FREE to free memory allocated by
  CPU_ALLOC"

* tag 'linux-kselftest-fixes-5.18-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
2022-04-15 11:24:32 -07:00
Paolo Abeni
edf45f007a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2022-04-15 09:26:00 +02:00
Athira Rajeev
f58faed7fb perf bench: Fix numa bench to fix usage of affinity for machines with #CPUs > 1K
The 'perf bench numa' testcase fails on systems with more than 1K CPUs.

Testcase: perf bench numa mem -p 1 -t 3 -P 512 -s 100 -zZ0qcm --thp  1

Failure snippet:

  <<>>
  perf: bench/numa.c:302: bind_to_node: Assertion `!(ret)' failed.
  Aborted (core dumped)
  <<>>

bind_to_node() uses "sched_getaffinity" to save the original cpumask and
this call is returning EINVAL (invalid argument).

This happens because the default mask size in glibc is 1024 CPUs. To
overcome this limitation of cpu_set_t, change the mask size using the
CPU_*_S macros, i.e., use CPU_ALLOC to allocate the cpumask and
CPU_ALLOC_SIZE to get its size.
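
The CPU_*_S approach can be illustrated with a minimal sketch. This is
not the code from bench/numa.c: the helper name count_affinity_cpus and
its nr_cpus parameter are assumptions made for this example. It sizes
the mask for nr_cpus possible CPUs instead of relying on the fixed-size
cpu_set_t.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: count the CPUs in the calling thread's affinity
 * mask, using a mask sized for nr_cpus possible CPUs.
 * Returns the count, or -1 on allocation or syscall failure.
 */
static int count_affinity_cpus(int nr_cpus)
{
	cpu_set_t *mask;
	size_t size;
	int count;

	mask = CPU_ALLOC(nr_cpus);		/* dynamically sized mask */
	if (!mask)
		return -1;

	size = CPU_ALLOC_SIZE(nr_cpus);		/* size in bytes of that mask */
	CPU_ZERO_S(size, mask);

	if (sched_getaffinity(0, size, mask) != 0) {
		CPU_FREE(mask);
		return -1;
	}

	count = CPU_COUNT_S(size, mask);
	CPU_FREE(mask);				/* always pair with CPU_ALLOC */
	return count;
}
```

Calling count_affinity_cpus(4096) should work on machines with more than
1024 CPUs as well as on smaller ones, since the buffer handed to
sched_getaffinity is large enough for the kernel's cpumask.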

Apart from fixing this for "orig_mask", apply the same logic to "mask"
as well, which is used for setaffinity, so that the mask size is large
enough to represent the number of possible CPUs in the system.

sched_getaffinity is used in one more place in perf numa bench: the
"bind_to_cpu" function. Apply the same logic there as well. Although no
failure is currently reported from there, it is better to change
getaffinity to work on systems that have more CPUs than the default
mask size supported by glibc.

Also fix "sched_setaffinity" to use a mask size which is large enough to
represent the number of possible CPUs in the system.

Fix all places where "bind_cpumask", which is part of "struct
thread_data", is used so that bind_cpumask works in all configurations.

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14 09:15:10 -03:00
Athira Rajeev
8cb7a188ac perf bench: Fix numa testcase to check if CPU used to bind task is online
Perf numa bench test fails with error:

Testcase:

  ./perf bench numa mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp  1 --no-data_rand_walk

Failure snippet:

<<>>
  Running 'numa/mem' benchmark:

  # Running main, "perf bench numa numa-mem -p 2 -t 1 -P 1024 -C 0,8 -M 1,0 -s 20 -zZq --thp 1 --no-data_rand_walk"

  perf: bench/numa.c:333: bind_to_cpumask: Assertion `!(ret)' failed.
<<>>

The testcase uses CPUs 0 and 8. The function "parse_setup_cpu_list"
checks whether a CPU number is greater than or equal to the maximum
number of CPUs possible in the system, i.e. via "if (bind_cpu_0 >=
g->p.nr_cpus || bind_cpu_1 >= g->p.nr_cpus) {".

But a system could have, say, 48 CPUs of which only CPUs 0-7 are
online, with the others offlined. Since "g->p.nr_cpus" is 48, the
function will go ahead and set the bit for CPU 8 in the cpumask
(td->bind_cpumask) as well.

The bind_to_cpumask function is then called to set affinity using
sched_setaffinity and the cpumask. Since CPU 8 is offline,
sched_setaffinity fails here with EINVAL.

Fix this issue by adding a check to make sure that the CPUs provided in
the input arguments are online before proceeding further, and skip the
test otherwise. For this, add a new helper function "is_cpu_online" in
"tools/perf/util/header.c".

Since the "BIT(x)" definition will get included from header.h, remove
it from bench/numa.c.
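
An online check of this kind can be sketched as follows. This is not
the helper added to tools/perf/util/header.c: the function name
cpu_online_at and its cpu_root parameter are hypothetical, and the
sketch assumes the usual sysfs layout, where cpu_root would be
"/sys/devices/system/cpu" and a missing "online" file means the CPU
cannot be offlined.

```c
#include <stdio.h>
#include <sys/stat.h>

/*
 * Hypothetical helper: report whether CPU number 'cpu' is online,
 * looking under 'cpu_root' (e.g. "/sys/devices/system/cpu").
 * Returns 1 if online, 0 if offline, -1 if the CPU does not exist.
 */
static int cpu_online_at(const char *cpu_root, int cpu)
{
	char path[256];
	struct stat st;
	FILE *f;
	int c;

	snprintf(path, sizeof(path), "%s/cpu%d", cpu_root, cpu);
	if (stat(path, &st) != 0)
		return -1;		/* no directory: no such CPU */

	snprintf(path, sizeof(path), "%s/cpu%d/online", cpu_root, cpu);
	f = fopen(path, "r");
	if (!f)
		return 1;		/* assumption: not hot-pluggable, so online */

	c = fgetc(f);			/* file holds "1" or "0" */
	fclose(f);
	return c == '1';
}
```

A caller would run this for every CPU named on the command line and skip
the benchmark unless each call returns 1.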

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220412164059.42654-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14 09:13:41 -03:00
Ian Rogers
24f378e660 perf test: Add basic perf record tests
Test the --per-thread flag.

Test Intel machine state capturing.

Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.bayduraev@gmail.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220414014642.3308206-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14 09:10:12 -03:00
Alexey Bayduraev
23380e4d53 perf record: Fix per-thread option
Per-thread mode doesn't have specific CPUs for events, add checks for
this case.

Minor fix to a pr_debug by Ian Rogers <irogers@google.com> to avoid an
out of bound array access.

Fixes: 7954f71689 ("perf record: Introduce thread affinity and mmap masks")
Reported-by: Ian Rogers <irogers@google.com>
Signed-off-by: Alexey Bayduraev <alexey.bayduraev@gmail.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220414014642.3308206-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14 09:05:11 -03:00
James Clark
2adacd7f0a perf docs: Add man page entry for Arm SPE
The SPE integration in Perf has quite a few usability quirks that
can't be found by just reading the reference manual. So document this
and at the same time add a summary of the feature that is also hard to
find elsewhere.

Reviewed-by: Leo Yan <leo.yan@linaro.org>
Signed-off-by: James Clark <james.clark@arm.com>
Co-authored-by: Al Grant <al.grant@arm.com>
Co-authored-by: Luke Dare <Luke.Dare@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220413084021.2556142-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-14 08:54:03 -03:00
Adrian Hunter
a668cc07f9 perf tools: Fix segfault accessing sample_id xyarray
perf_evsel::sample_id is an xyarray which can cause a segfault when
accessed beyond its size. e.g.

  # perf record -e intel_pt// -C 1 sleep 1
  Segmentation fault (core dumped)
  #

That is happening because a dummy event is opened to capture text poke
events across all CPUs, however the mmap logic is allocating according
to the number of user_requested_cpus.

In general, perf sometimes uses the evsel cpus to open events, and
sometimes the evlist user_requested_cpus. However, it is not necessary
to determine which case is which because the opened event file
descriptors are also in an xyarray, the size of which can be used
to correctly allocate the size of the sample_id xyarray, because there
is one ID per file descriptor.

Note, in the affected code path, perf_evsel fd array is subsequently
used to get the file descriptor for the mmap, so it makes sense for the
xyarrays to be the same size there.

Fixes: d1a177595b ("libperf: Adopt perf_evlist__mmap()/munmap() from tools/perf")
Fixes: 246eba8e90 ("perf tools: Add support for PERF_RECORD_TEXT_POKE")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: stable@vger.kernel.org # 5.5+
Link: https://lore.kernel.org/r/20220413114232.26914-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13 22:23:02 -03:00
Lv Ruyi
d73f5d14e0 perf stat: Fix error check return value of hashmap__new(), must use IS_ERR()
hashmap__new() returns ERR_PTR(-ENOMEM) when it fails, so we should use
IS_ERR() to check it in error handling path.
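
To show why a NULL check misses this kind of failure, here is a small
userspace re-creation of the ERR_PTR convention. It is a sketch under
assumptions: the macros mirror the kernel-style helpers in spirit, and
make_table() and use_table() are hypothetical stand-ins, not
hashmap__new() or perf code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the top MAX_ERRNO addresses. */
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical constructor that reports failure via ERR_PTR(). */
static void *make_table(int simulate_oom)
{
	static int table;

	return simulate_oom ? ERR_PTR(-ENOMEM) : &table;
}

/* Returns 0 on success or a negative errno value on failure. */
static int use_table(int simulate_oom)
{
	void *t = make_table(simulate_oom);

	if (IS_ERR(t))		/* "if (!t)" would not catch this */
		return (int)PTR_ERR(t);
	return 0;
}
```

Because the failing constructor returns a non-NULL pointer, only the
IS_ERR() test sees the error.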

Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220413093302.2538128-1-lv.ruyi@zte.com.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13 22:20:15 -03:00
Bob Moore
487ea80a28 ACPICA: Update copyright notices to the year 2022
ACPICA commit 738d7b0726e6c0458ef93c0a01c0377490888d1e

Affects all source modules and utility signons.

Link: https://github.com/acpica/acpica/commit/738d7b07
Signed-off-by: Bob Moore <robert.moore@intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2022-04-13 20:24:57 +02:00
Like Xu
42c35fdc34 selftests: kvm/x86/xen: Replace a comma in the xen_shinfo_test with semicolon
+WARNING: Possible comma where semicolon could be used
+#397: FILE: tools/testing/selftests/kvm/x86_64/xen_shinfo_test.c:700:
++				tmr.type = KVM_XEN_VCPU_ATTR_TYPE_TIMER,
++				vcpu_ioctl(vm, VCPU_ID, KVM_XEN_VCPU_GET_ATTR, &tmr);
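
A short sketch of why the comma is worth replacing, using hypothetical
stand-ins (struct attr, fetch()) rather than the KVM selftest code: with
a comma the two expressions form one statement, so the behaviour changes
as soon as the pair sits under an unbraced "if".

```c
/* Hypothetical stand-ins for the attribute struct and the ioctl call. */
struct attr {
	int type;
	int value;
};

static void fetch(struct attr *a)
{
	a->value = a->type * 2;
}

static int guarded_with_comma(int cond)
{
	struct attr a = { 1, 0 };

	if (cond)
		a.type = 21,	/* comma: fetch() belongs to the same statement */
		fetch(&a);
	return a.value;
}

static int guarded_with_semicolon(int cond)
{
	struct attr a = { 1, 0 };

	if (cond)
		a.type = 21;	/* semicolon: the if body ends here */
	fetch(&a);		/* always runs */
	return a.value;
}
```

With cond set to 0, the comma version never calls fetch(), while the
semicolon version does.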

Fixes: 25eaeebe71 ("KVM: x86/xen: Add self tests for KVM_XEN_HVM_CONFIG_EVTCHN_SEND")
Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220406063715.55625-4-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-13 13:37:18 -04:00
Paolo Bonzini
a4cfff3f0f Merge branch 'kvm-older-features' into HEAD
Merge branch for features that did not make it into 5.18:

* New ioctls to get/set TSC frequency for a whole VM

* Allow userspace to opt out of hypercall patching

Nested virtualization improvements for AMD:

* Support for "nested nested" optimizations (nested vVMLOAD/VMSAVE,
  nested vGIF)

* Allow AVIC to co-exist with a nested guest running

* Fixes for LBR virtualizations when a nested guest is running,
  and nested LBR virtualization support

* PAUSE filtering for nested hypervisors

Guest support:

* Decoupling of vcpu_is_preempted from PV spinlocks

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-13 13:37:17 -04:00
Herton R. Krzesinski
b2dd71f9f7 tools/power/x86/intel-speed-select: fix build failure when using -Wl,--as-needed
Build of intel-speed-select will fail if you run:

$ LDFLAGS="-Wl,--as-needed" /usr/bin/make V=1
...
gcc -O2 -Wall -g -D_GNU_SOURCE -Iinclude -I/usr/include/libnl3 -Wl,--as-needed -lnl-genl-3 -lnl-3 intel-speed-select-in.o -o intel-speed-select
/usr/bin/ld: intel-speed-select-in.o: in function `handle_event':
(...)/linux/tools/power/x86/intel-speed-select/hfi-events.c:189: undefined reference to `nlmsg_hdr'
...

The problem is that link order matters when using the flag
-Wl,--as-needed: libraries whose symbols are not yet referenced at the
point where they appear on the command line are discarded. Since
intel-speed-select-in.o comes after the libraries, they have already
been discarded by the time it is processed, and undefined references
are reported.

To fix this, make sure we specify LDFLAGS after the object file.

Acked-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Signed-off-by: Herton R. Krzesinski <herton@redhat.com>
Link: https://lore.kernel.org/r/20220404210525.725611-1-herton@redhat.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2022-04-13 13:49:48 +02:00
Alaa Mohamed
816cda9ae5 selftests: net: fib_rule_tests: add support to select a test to run
Add boilerplate test loop in test to run all tests
in fib_rule_tests.sh

Signed-off-by: Alaa Mohamed <eng.alaamohamedsoliman.am@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-13 12:19:48 +01:00
Adrian Hunter
f034fc50d3 perf tools: Fix misleading add event PMU debug message
Fix incorrect debug message:

   Attempting to add event pmu 'intel_pt' with '' that may result in
   non-fatal errors

which always appears with perf record -vv and intel_pt e.g.

    perf record -vv -e intel_pt//u uname

The message is incorrect because there will never be non-fatal errors.

Suppress the message if the PMU is 'selectable' i.e. meant to be
selected directly as an event.

Fixes: 4ac22b484d ("perf parse-events: Make add PMU verbose output clearer")
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-13 07:00:31 -03:00
Dan Williams
31e624a77e cxl/mem: Rename cxl_dvsec_decode_init() to cxl_hdm_decode_init()
cxl_dvsec_decode_init() is tasked with checking whether legacy DVSEC
range based decode is in effect, or whether HDM can be enabled / already
is enabled. As such it either succeeds or fails and that result is the
return value. The @do_hdm_init variable is misleading in the case where
HDM operation is already found to be active, so just call it @retval.

Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Link: https://lore.kernel.org/r/164730736435.3806189.2537160791687837469.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-12 19:11:58 -07:00
Linus Torvalds
453096eb04 x86:
* Miscellaneous bugfixes
 
 * A small cleanup for the new workqueue code
 
 * Documentation syntax fix
 
 RISC-V:
 
 * Remove hgatp zeroing in kvm_arch_vcpu_put()
 
 * Fix alignment of the guest_hang() in KVM selftest
 
 * Fix PTE A and D bits in KVM selftest
 
 * Missing #include in vcpu_fp.c
 
 ARM:
 
 * Some PSCI fixes after introducing PSCIv1.1 and SYSTEM_RESET2
 
 * Fix the MMU write-lock not being taken on THP split
 
 * Fix mixed-width VM handling
 
 * Fix potential UAF when debugfs registration fails
 
 * Various selftest updates for all of the above

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86:

   - Miscellaneous bugfixes

   - A small cleanup for the new workqueue code

   - Documentation syntax fix

  RISC-V:

   - Remove hgatp zeroing in kvm_arch_vcpu_put()

   - Fix alignment of the guest_hang() in KVM selftest

   - Fix PTE A and D bits in KVM selftest

   - Missing #include in vcpu_fp.c

  ARM:

   - Some PSCI fixes after introducing PSCIv1.1 and SYSTEM_RESET2

   - Fix the MMU write-lock not being taken on THP split

   - Fix mixed-width VM handling

   - Fix potential UAF when debugfs registration fails

   - Various selftest updates for all of the above"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (24 commits)
  KVM: x86: hyper-v: Avoid writing to TSC page without an active vCPU
  KVM: SVM: Do not activate AVIC for SEV-enabled guest
  Documentation: KVM: Add SPDX-License-Identifier tag
  selftests: kvm: add tsc_scaling_sync to .gitignore
  RISC-V: KVM: include missing hwcap.h into vcpu_fp
  KVM: selftests: riscv: Fix alignment of the guest_hang() function
  KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table
  RISC-V: KVM: Don't clear hgatp CSR in kvm_arch_vcpu_put()
  selftests: KVM: Free the GIC FD when cleaning up in arch_timer
  selftests: KVM: Don't leak GIC FD across dirty log test iterations
  KVM: Don't create VM debugfs files outside of the VM directory
  KVM: selftests: get-reg-list: Add KVM_REG_ARM_FW_REG(3)
  KVM: avoid NULL pointer dereference in kvm_dirty_ring_push
  KVM: arm64: selftests: Introduce vcpu_width_config
  KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
  KVM: arm64: vgic: Remove unnecessary type castings
  KVM: arm64: Don't split hugepages outside of MMU write lock
  KVM: arm64: Drop unneeded minor version check from PSCI v1.x handler
  KVM: arm64: Actually prevent SMC64 SYSTEM_RESET2 from AArch32
  KVM: arm64: Generally disallow SMC64 for AArch32 guests
  ...
2022-04-12 14:16:33 -10:00
Kees Cook
42db2594e4 lkdtm/heap: Note conditions for SLAB_LINEAR_OVERFLOW
It wasn't clear when SLAB_LINEAR_OVERFLOW would be expected to trip.
Explicitly describe it and include the CONFIGs in the kselftest.

Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
2022-04-12 16:11:49 -07:00
Athira Rajeev
ce64763c63 testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
The selftest "mqueue/mq_perf_tests.c" uses CPU_ALLOC to allocate a
CPU set. This CPU set is then used by pthread_attr_setaffinity_np
and by pthread_create in the code. But in the current code, the
allocated CPU set is not freed.

Fix this issue by adding CPU_FREE in the "shutdown" function, which
is called in most of the error/exit paths for cleanup. There are a
few error paths which exit without using shutdown. Add a common goto
error path with CPU_FREE for these cases.

Fixes: 7820b0715b ("tools/selftests: add mq_perf_tests")
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-12 13:54:49 -06:00
Joachim Wiberg
50fe062c80 selftests: forwarding: new test, verify host mdb entries
Boiler plate for testing static mdb entries.  This first test verifies
adding and removing host mdb entries for all supported types: IPv4,
IPv6, and MAC multicast.

Signed-off-by: Joachim Wiberg <troglobit@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-04-12 10:06:53 +02:00
Willy Tarreau
930c4acc06 tools/nolibc: guard the main file against multiple inclusion
Including nolibc.h multiple times results in build errors due to multiple
definitions. Let's add a guard against multiple inclusions.
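
The effect of such a guard can be sketched in a single file by
simulating two inclusions of a hypothetical header (DEMO_H and
demo_answer() are made-up names, not nolibc identifiers):

```c
/* First "inclusion" of a hypothetical header demo.h. */
#ifndef DEMO_H
#define DEMO_H
static int demo_answer(void)
{
	return 42;
}
#endif /* DEMO_H */

/*
 * Second "inclusion": DEMO_H is already defined, so the body is
 * skipped and demo_answer() is not defined twice.
 */
#ifndef DEMO_H
#define DEMO_H
static int demo_answer(void)
{
	return 42;
}
#endif /* DEMO_H */
```

Without the #ifndef/#define pair, the second copy would be a
redefinition and the build would fail.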

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:14:33 -07:00
Willy Tarreau
9c2970fbb4 tools/nolibc: use pselect6 on RISCV
This arch doesn't provide the old-style select() syscall, so we have to
use pselect6().
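
A minimal sketch of the idea, not the nolibc implementation: the wrapper
name select_via_pselect6 is hypothetical, and the code assumes a 64-bit
Linux target where SYS_pselect6 is defined and struct timespec matches
the kernel's layout. The timeval is converted to a timespec and no
signal mask is passed.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical select()-like wrapper built on the pselect6 syscall.
 * Returns the number of ready descriptors, 0 on timeout, -1 on error.
 */
static int select_via_pselect6(int nfds, fd_set *rfds, fd_set *wfds,
			       fd_set *efds, struct timeval *tv)
{
	struct timespec ts;

	if (tv) {
		ts.tv_sec = tv->tv_sec;
		ts.tv_nsec = tv->tv_usec * 1000;	/* microseconds to nanoseconds */
	}

	/* Last argument is the signal mask descriptor; NULL keeps the mask. */
	return (int)syscall(SYS_pselect6, nfds, rfds, wfds, efds,
			    tv ? &ts : NULL, NULL);
}
```

Unlike select(), the raw syscall may update the timespec with the
remaining time; the sketch uses a local copy, so the caller's timeval is
left untouched.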

Signed-off-by: Willy Tarreau <w@1wt.eu>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:14:33 -07:00
Paul Menzel
8e82c28ea2 torture: Make thread detection more robust by using lscpu
For consecutive numbers the lscpu command collapses the output and just
shows the range with start and end. The processors are numbered that
way on POWER8.

    $ sudo ppc64_cpu --smt=8
    $ lscpu | grep '^NUMA node'
    NUMA node(s):                    2
    NUMA node0 CPU(s):               0-79
    NUMA node8 CPU(s):               80-159

This causes the heuristic that detects the number of threads per core,
which looks for the number after the first comma, to fail, and QEMU
aborts because of invalid arguments.

    $ lscpu | grep '^NUMA node0' | sed -e 's/^[^,-]*(,|\-)\([0-9]*\),.*$/\1/'
    NUMA node0 CPU(s):               0-79

But the lscpu command shows the number of threads per core:

    $ sudo ppc64_cpu --smt=8
    $ lscpu | grep 'Thread(s) per core'
    Thread(s) per core:              8
    $ sudo ppc64_cpu --smt=off
    $ lscpu | grep 'Thread(s) per core'
    Thread(s) per core:              1

This commit therefore directly uses that value and replaces use of grep
with "sed -n" and its "p" command.

Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:08:59 -07:00
Paul E. McKenney
98bb264bdb torture: Permit running of experimental torture types
This commit weakens the checks of the kvm.sh script's --torture parameter
and the kvm-recheck.sh script's parsing so that experimental torture tests
may be created without updating these two scripts.  The changes required
are to the appropriate Makefile and Kconfig file, plus a directory
whose name begins with "X" must be added to the rcutorture/configs file.
This new directory's name can then be passed in via the kvm.sh script's
--torture parameter.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:08:59 -07:00
Paul E. McKenney
b20842baf8 torture: Use "-o Batchmode=yes" to disable ssh password requests
The torture.sh script normally runs unattended, so there is not much
point in the "ssh" command asking for a password.  This commit therefore
adds the "-o Batchmode=yes" argument to each "ssh" command to cause it
to fail rather than ask for a password.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:08:59 -07:00
Paul E. McKenney
ab3ecd0bce torture: Reposition so that $? collects ssh code in torture.sh
An "echo" slipped in between an "ssh" and the "ret=$?" that was intended
to collect its exit code, which prevents torture.sh from detecting
"ssh" failure.  This commit therefore reassociates the two.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:08:58 -07:00
Paul E. McKenney
b6f3c6a2b1 torture: Add rcu_normal and rcu_expedited runs to torture.sh
Currently, the rcupdate.rcu_normal and rcupdate.rcu_expedited kernel
boot parameters are not regularly tested.  The potential addition of
polled expedited grace-period APIs increases the amount of code that is
affected by these kernel boot parameters.  This commit therefore adds a
"--do-rt" argument to torture.sh to exercise these kernel-boot options.

Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-04-11 17:07:28 -07:00
Alan Maguire
0f8619929c libbpf: Usdt aarch64 arg parsing support
Parsing of USDT arguments is architecture-specific. On aarch64 it is
relatively easy since registers used are x[0-31] and sp. Format is
slightly different compared to x86_64. Possible forms are:

- "size@[reg[,offset]]" for dereferences, e.g. "-8@[sp,76]" and "-4@[sp]";
- "size@reg" for register values, e.g. "-4@x0";
- "size@value" for raw values, e.g. "-8@1".
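
The three forms listed above can be told apart with a few sscanf()
attempts. The following is a sketch only, not libbpf's parser: struct
arg_spec, enum arg_kind and parse_arg() are hypothetical names, and
register names are not validated against the aarch64 register set.

```c
#include <stdio.h>
#include <string.h>

enum arg_kind { ARG_CONST, ARG_REG, ARG_REG_DEREF };

struct arg_spec {
	enum arg_kind kind;
	int size;	/* negative means signed */
	char reg[16];	/* register name, empty for constants */
	long val;	/* constant value or dereference offset */
};

/* Returns 0 and fills *a on success, -1 if no form matches. */
static int parse_arg(const char *s, struct arg_spec *a)
{
	int len;

	memset(a, 0, sizeof(*a));

	/* "size@[reg,offset]" */
	len = 0;
	if (sscanf(s, " %d @ [ %15[a-z0-9] , %ld ] %n",
		   &a->size, a->reg, &a->val, &len) == 3 &&
	    len > 0 && s[len] == '\0') {
		a->kind = ARG_REG_DEREF;
		return 0;
	}

	/* "size@[reg]", i.e. a dereference with offset 0 */
	len = 0;
	if (sscanf(s, " %d @ [ %15[a-z0-9] ] %n",
		   &a->size, a->reg, &len) == 2 &&
	    len > 0 && s[len] == '\0') {
		a->kind = ARG_REG_DEREF;
		a->val = 0;
		return 0;
	}

	/* "size@value": tried before the register form, since "1" would
	 * also match the register character set */
	len = 0;
	if (sscanf(s, " %d @ %ld %n", &a->size, &a->val, &len) == 2 &&
	    len > 0 && s[len] == '\0') {
		a->kind = ARG_CONST;
		a->reg[0] = '\0';
		return 0;
	}

	/* "size@reg" */
	len = 0;
	if (sscanf(s, " %d @ %15[a-z0-9] %n", &a->size, a->reg, &len) == 2 &&
	    len > 0 && s[len] == '\0') {
		a->kind = ARG_REG;
		a->val = 0;
		return 0;
	}

	return -1;
}
```

The %n conversion records how much input was consumed, which lets the
sketch reject strings with a missing closing bracket or trailing
garbage.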

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649690496-1902-2-git-send-email-alan.maguire@oracle.com
2022-04-11 15:32:28 -07:00
Carsten Haitzler
41204da4c1 perf test: Shell - Limit to only run executable scripts in tests
The 'perf test' shell runner will run everything in the tests
directory (as long as it's not another directory and does not begin
with a dot), but sometimes there are files in there that are not shell
scripts, for example perf.data output left over from some testing, and
the next time you run perf test it tries to run these.

Check that the files are executable so that they are actually intended
to be test scripts and not just some "random junk" files there.
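
The filtering rule described here can be sketched as follows. The
function name is_test_candidate is hypothetical and this is not the
perf source; it simply combines the three conditions from the text
(no leading dot, not a directory, executable).

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical filter: return 1 if dir/name should be run as a test
 * script, 0 otherwise.
 */
static int is_test_candidate(const char *dir, const char *name)
{
	char path[4096];
	struct stat st;

	if (name[0] == '.')
		return 0;			/* hidden file, "." or ".." */

	if (snprintf(path, sizeof(path), "%s/%s", dir, name) >=
	    (int)sizeof(path))
		return 0;			/* path too long */

	if (stat(path, &st) != 0 || S_ISDIR(st.st_mode))
		return 0;			/* missing, or a directory */

	return access(path, X_OK) == 0;		/* must be executable */
}
```

A leftover perf.data file is normally created without execute
permission, so it would be rejected by the last check.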

Signed-off-by: Carsten Haitzler <carsten.haitzler@arm.com>
Reviewed-by: Leo Yan <leo.yan@linaro.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: coresight@lists.linaro.org
Link: http://lore.kernel.org/lkml/20220309122859.31487-1-carsten.haitzler@foss.arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11 16:39:49 -03:00
Eelco Chaudron
ae24e9b53d perf scripting python: Expose symbol offset and source information
This change adds the symbol offset to the data exported for each
call-chain entry. The script cannot calculate this itself, since it has
only the ip value and no related mmap information.

In addition, also export the source file and line information, if
available, to avoid an external lookup if this information is needed.

Signed-off-by: Eelco Chaudron <echaudro@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/164554263724.752731.14651017093796049736.stgit@wsfd-netdev64.ntdv.lab.eng.bos.redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11 16:39:49 -03:00
Eric Lin
335f70faa2 perf jitdump: Add riscv64 support
This patch enables perf jitdump for riscv64 and was tested with V8 on
qemu rv64.

Qemu rv64:

  $ perf record -e cpu-clock -c 1000 -g -k mono ./d8_rv64 --perf-prof --no-write-protect-code-memory test.js
  $ perf inject -j -i perf.data -o perf.data.jitted
  $ perf report -i perf.data.jitted

Output:

  To display the perf.data header info, please use --header/--header-only options.

  Total Lost Samples: 0

  Samples: 87K of event 'cpu-clock'
  Event count (approx.): 87974000

  Children  Self   Command   Shared Object      Symbol

  ....
   0.28%    0.06%  d8_rv64   d8_rv64            [.] _ZN2v88internal6WasmJs7InstallEPNS0_7IsolateEb
   0.28%    0.00%  d8_rv64   d8_rv64            [.] _ZN2v88internal10ParserBaseINS0_6ParserEE22ParseLogicalExpressionEv
   0.28%    0.03%  d8_rv64   jitted-112-76.so   [.] Builtin:InterpreterEntryTrampoline
   0.12%    0.00%  d8_rv64   d8_rv64            [.] _ZN2v88internal19ContextDeserializer11DeserializeEPNS0_7IsolateENS0_6HandleINS0_13JSGlobalProxyEEENS_33DeserializeInternalFieldsCallbackE
   0.12%    0.01%  d8_rv64   jitted-112-651.so  [.] Builtin:CEntry_Return1_DontSaveFPRegs_ArgvOnStack_NoBuiltinExit
  ....

Signed-off-by: Eric Lin <eric.lin@sifive.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: greentime.hu@sifive.com
Cc: linux-riscv@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220406142606.18464-2-eric.lin@sifive.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-11 16:37:26 -03:00
Like Xu
0c8b6641c8 selftests: kvm: add tsc_scaling_sync to .gitignore
The tsc_scaling_sync binary should be listed in the .gitignore
file so that git ignores it.

Signed-off-by: Like Xu <likexu@tencent.com>
Message-Id: <20220406063715.55625-3-likexu@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-11 13:28:56 -04:00
Geliang Tang
f4fd706f73 selftests/bpf: Drop duplicate max/min definitions
Drop the duplicate min() and MAX() macro definitions in prog_tests and
use MIN() or MAX() from sys/param.h instead.
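
For illustration, a minimal sketch that relies on the sys/param.h
macros instead of local definitions (the clamp() helper is a
hypothetical example, not selftest code):

```c
#include <sys/param.h>	/* MIN() and MAX(), as provided by glibc */

/* Hypothetical helper: limit v to the range [lo, hi]. */
static int clamp(int v, int lo, int hi)
{
	return MAX(lo, MIN(v, hi));
}
```

Note that these macros evaluate their arguments more than once, so
arguments with side effects should be avoided.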

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/1ae276da9925c2de59b5bdc93b693b4c243e692e.1649462033.git.geliang.tang@suse.com
2022-04-11 17:18:09 +02:00
Florian Westphal
f2ae0fa68e selftests/mptcp: add diag listen tests
Check dumping of mptcp listener sockets:
1. filter by dport should not return any results
2. filter by sport should return listen sk
3. filter by saddr+sport should return listen sk
4. no filter should return listen sk

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:55:54 +01:00
David S. Miller
4696ad36d7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next
Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

1) Replace unnecessary list_for_each_entry_continue() in nf_tables,
   from Jakob Koschel.

2) Add struct nf_conntrack_net_ecache to conntrack event cache and
   use it, from Florian Westphal.

3) Refactor ctnetlink_dump_list(), also from Florian.

4) Bump module reference counter on cttimeout object addition/removal,
   from Florian.

5) Consolidate nf_log MAC printer, from Phil Sutter.

6) Add basic logging support for unknown ethertype, from Phil Sutter.

7) Consolidate check for sysctl nf_log_all_netns toggle, also from Phil.

8) Replace hardcoded value in nft_bitwise, from Jeremy Sowden.

9) Rename BASIC-like goto tags in nft_bitwise to more meaningful names,
   also from Jeremy.

10) nft_fib support for reverse path filtering with policy-based routing
    on iif. Extend selftests to cover for this new usecase, from Florian.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-11 11:47:58 +01:00
Florian Westphal
0c7b27616f selftests: netfilter: add fib expression forward test case
It's now possible to use the fib expression in the forward chain (where
both the input and output interfaces are known).

Add a simple test case for this.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2022-04-11 12:10:09 +02:00
Pawan Gupta
400331f8ff x86/tsx: Disable TSX development mode at boot
A microcode update on some Intel processors causes all TSX transactions
to always abort by default[*]. Microcode also added functionality to
re-enable TSX for development purposes. With this microcode loaded, if
tsx=on was passed on the cmdline, and TSX development mode was already
enabled before the kernel boot, it may make the system vulnerable to TSX
Asynchronous Abort (TAA).

To be on the safe side, unconditionally disable TSX development mode
during boot. If a viable use case appears, this can be revisited later.

  [*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557

  [ bp: Drop unstable web link, massage heavily. ]

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Pawan Gupta <pawan.kumar.gupta@linux.intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Tested-by: Neelima Krishnan <neelima.krishnan@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.1646943780.git.pawan.kumar.gupta@linux.intel.com
2022-04-11 09:58:40 +02:00
Yafang Shao
451b5fbc2c tools/runqslower: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK
Explicitly set libbpf 1.0 API mode, then we can avoid using the deprecated
RLIMIT_MEMLOCK.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-5-laoar.shao@gmail.com
2022-04-10 20:17:16 -07:00
Yafang Shao
a777e18f1b bpftool: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK
We have switched to memcg-based memory accounting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now.

libbpf_set_strict_mode always returns 0, so we don't need to check whether
the return value is 0 or not.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-4-laoar.shao@gmail.com
2022-04-10 20:17:16 -07:00
Yafang Shao
b858ba8c52 selftests/bpf: Use libbpf 1.0 API mode instead of RLIMIT_MEMLOCK
We have switched to memcg-based memory accounting and thus the rlimit is
not needed any more. LIBBPF_STRICT_AUTO_RLIMIT_MEMLOCK was introduced in
libbpf for backward compatibility, so we can use it instead now. After
this change, the header tools/testing/selftests/bpf/bpf_rlimit.h can be
removed.

This patch also removes the useless header sys/resource.h from many files
in tools/testing/selftests/bpf/.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409125958.92629-3-laoar.shao@gmail.com
2022-04-10 20:17:16 -07:00
Runqing Yang
d252a4a499 libbpf: Fix a bug with checking bpf_probe_read_kernel() support in old kernels
Background:
Libbpf automatically replaces calls to the BPF
bpf_probe_read_{kernel,user}[_str]() helpers with
bpf_probe_read[_str]() if it detects that the kernel doesn't support
the new APIs. Specifically, libbpf invokes the
probe_kern_probe_read_kernel function to load a small eBPF program into
the kernel in which the bpf_probe_read_kernel API is invoked, and lets
the kernel check whether the new API is valid. If the loading fails,
libbpf considers the new API invalid and replaces it with the old API.

static int probe_kern_probe_read_kernel(void)
{
	struct bpf_insn insns[] = {
		BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),	/* r1 = r10 (fp) */
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),	/* r1 += -8 */
		BPF_MOV64_IMM(BPF_REG_2, 8),		/* r2 = 8 */
		BPF_MOV64_IMM(BPF_REG_3, 0),		/* r3 = 0 */
		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_probe_read_kernel),
		BPF_EXIT_INSN(),
	};
	int fd, insn_cnt = ARRAY_SIZE(insns);

	fd = bpf_prog_load(BPF_PROG_TYPE_KPROBE, NULL,
			   "GPL", insns, insn_cnt, NULL);
	return probe_fd(fd);
}

Bug:
On older kernel versions [0], the kernel checks whether the version
number provided in the bpf syscall matches LINUX_VERSION_CODE. If it
does not match, the bpf syscall fails. However, the
probe_kern_probe_read_kernel code does not set the kernel version
number provided to the bpf syscall, which causes the loading process
to always fail on old versions. This means that libbpf will replace the
new API with the old one even if the kernel supports the new one.

Solution:
After a discussion in [1], the solution is to use the
BPF_PROG_TYPE_TRACEPOINT program type instead of BPF_PROG_TYPE_KPROBE,
because the kernel does not enforce the version check for tracepoint
programs. I tested the patch on old kernels (4.18 and 4.19) and it
works well.

  [0] https://elixir.bootlin.com/linux/v4.19/source/kernel/bpf/syscall.c#L1360
  [1] Closes: https://github.com/libbpf/libbpf/issues/473

Signed-off-by: Runqing Yang <rainkin1993@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409144928.27499-1-rainkin1993@gmail.com
2022-04-10 20:15:22 -07:00
Mykola Lysenko
61ddff373f selftests/bpf: Improve by-name subtest selection logic in prog_tests
Improve subtest selection logic when using -t/-a/-d parameters.
In particular, more than one subtest can be specified or a
combination of tests / subtests.

-a send_signal -d send_signal/send_signal_nmi* - runs send_signal
test without nmi tests

-a send_signal/send_signal_nmi*,find_vma - runs two send_signal
subtests and find_vma test

-a 'send_signal*' -a find_vma -d send_signal/send_signal_nmi* -
runs 2 send_signal tests and find_vma test. Disables two send_signal
nmi subtests

-t send_signal -t find_vma - runs two *send_signal* tests and one
*find_vma* test

This will allow us to have granular control over which subtests
to disable in the CI system instead of disabling whole tests.

Also, add new selftest to avoid possible regression when
changing prog_test test name selection logic.

Signed-off-by: Mykola Lysenko <mykolal@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220409001750.529930-1-mykolal@fb.com
2022-04-10 20:08:20 -07:00
Vladimir Isaev
0738599856 libbpf: Add ARC support to bpf_tracing.h
Add PT_REGS macros suitable for ARCompact and ARCv2.

Signed-off-by: Vladimir Isaev <isaev@synopsys.com>
Signed-off-by: Sergey Matyukevich <geomatsi@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Song Liu <songliubraving@fb.com>
Link: https://lore.kernel.org/bpf/20220408224442.599566-1-geomatsi@gmail.com
2022-04-10 18:53:37 -07:00
Linus Torvalds
9c6913b749 - Fix the MSI message data struct definition
- Use local labels in the exception table macros to avoid symbol
 conflicts with clang LTO builds
 
 - A couple of fixes to objtool checking of the relatively newly added
 SLS and IBT code
 
 - Rename a local var in the WARN* macro machinery to prevent shadowing

Merge tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Borislav Petkov:

 - Fix the MSI message data struct definition

 - Use local labels in the exception table macros to avoid symbol
   conflicts with clang LTO builds

 - A couple of fixes to objtool checking of the relatively newly added
   SLS and IBT code

 - Rename a local var in the WARN* macro machinery to prevent shadowing

* tag 'x86_urgent_for_v5.18_rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/msi: Fix msi message data shadow struct
  x86/extable: Prefer local labels in .set directives
  x86,bpf: Avoid IBT objtool warning
  objtool: Fix SLS validation for kcov tail-call replacement
  objtool: Fix IBT tail-call detection
  x86/bug: Prevent shadowing in __WARN_FLAGS
  x86/mm/tlb: Revert retpoline avoidance approach
2022-04-10 07:12:27 -10:00
Linus Torvalds
1862a69c91 perf tools fixes for v5.18: 1st batch
- Fix the clang command line option probing and remove some options to filter
   out, fixing the build with the latest clang versions.
 
 - Fix 'perf bench' futex and epoll benchmarks to deal with machines with more
   than 1K CPUs.
 
 - Fix 'perf test tsc' error message when not supported.
 
 - Remap perf ring buffer if there is no space for event, fixing perf usage
   in 32-bit ChromeOS.
 
 - Drop objdump stderr to avoid getting stuck waiting for stdout output in
   'perf annotate'.
 
 - Fix up garbled output by not showing unwind error messages when augmenting
   frame in best effort mode.
 
 - Fix perf's libperf_print callback, use the va_args eprintf() variant.
 
 - Sync vhost and arm64 cputype headers with the kernel sources.
 
 - Fix 'perf report --mem-mode' with ARM SPE.
 
 - Add missing external commands ('perf iiostat', etc) to 'perf --list-cmds'.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools fixes from Arnaldo Carvalho de Melo:

 - Fix the clang command line option probing and remove some options to
   filter out, fixing the build with the latest clang versions

 - Fix 'perf bench' futex and epoll benchmarks to deal with machines
   with more than 1K CPUs

 - Fix 'perf test tsc' error message when not supported

 - Remap perf ring buffer if there is no space for event, fixing perf
   usage in 32-bit ChromeOS

 - Drop objdump stderr to avoid getting stuck waiting for stdout output
   in 'perf annotate'

 - Fix up garbled output by not showing unwind error messages when
   augmenting frame in best effort mode

 - Fix perf's libperf_print callback, use the va_args eprintf() variant

 - Sync vhost and arm64 cputype headers with the kernel sources

 - Fix 'perf report --mem-mode' with ARM SPE

 - Add missing external commands ('iiostat', etc) to 'perf --list-cmds'

* tag 'perf-tools-fixes-for-v5.18-2022-04-09' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux:
  perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
  perf tools: Add external commands to list-cmds
  perf docs: Add perf-iostat link to manpages
  perf session: Remap buf if there is no space for event
  perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
  perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
  perf tools: Fix perf's libperf_print callback
  perf: arm-spe: Fix perf report --mem-mode
  perf unwind: Don't show unwind error messages when augmenting frame pointer stack
  tools headers arm64: Sync arm64's cputype.h with the kernel sources
  perf test tsc: Fix error message when not supported
  perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
  perf python: Fix probing for some clang command line options
  tools build: Filter out options and warnings not supported by clang
  tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
  tools include UAPI: Sync linux/vhost.h with the kernel sources
2022-04-09 18:45:10 -10:00
Linus Torvalds
94a4c2bb7a cxl + nvdimm fixes for v5.18-rc2
- Fix a compile error in the nvdimm unit tests
 
 - Fix a shadowed variable warning in the CXL PCI driver

Merge tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull cxl and nvdimm fixes from Dan Williams:

 - Fix a compile error in the nvdimm unit tests

 - Fix a shadowed variable warning in the CXL PCI driver

* tag 'cxl+nvdimm-for-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  cxl/pci: Drop shadowed variable
  tools/testing/nvdimm: Fix security_init() symbol collision
2022-04-09 18:31:59 -10:00
Ian Rogers
940a445a90 perf annotate: Drop objdump stderr to avoid getting stuck waiting for stdout output
If objdump writes to stderr it can block waiting for it to be read. As
perf doesn't read stderr, progress then stops with perf waiting for
stdout output.
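
The general pattern can be sketched in stand-alone C as follows. This is
only an illustration, not necessarily the mechanism the patch uses inside
perf: read_stdout_only() is a hypothetical helper, and redirecting stderr
through the shell is an assumption made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: run cmd through the shell with its stderr sent to
 * /dev/null, and copy up to len - 1 bytes of its stdout into out.
 * Because nothing is ever written to an unread stderr pipe, the child
 * cannot block on it while we wait for stdout.
 * Returns the number of bytes read, or -1 on error.
 */
static long read_stdout_only(const char *cmd, char *out, size_t len)
{
	char full[512];
	FILE *fp;
	size_t n;
	int ret;

	assert(cmd != NULL && out != NULL && len > 0);

	/* The brace group makes the redirection cover the whole command. */
	ret = snprintf(full, sizeof(full), "{ %s ; } 2>/dev/null", cmd);
	if (ret < 0 || (size_t)ret >= sizeof(full))
		return -1;

	fp = popen(full, "r");
	if (!fp)
		return -1;

	n = fread(out, 1, len - 1, fp);
	out[n] = '\0';
	pclose(fp);
	return (long)n;
}
```

With a command that writes to both streams, for example
"echo out; echo err >&2", only the stdout text reaches the caller.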

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Dave Marchevsky <davemarchevsky@fb.com>
Cc: Denis Nikitin <denik@chromium.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Lexi Shao <shaolexi@huawei.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin Liška <mliska@suse.cz>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Remi Bernon <rbernon@codeweavers.com>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: William Cohen <wcohen@redhat.com>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/20220407230503.1265036-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:21:00 -03:00
Michael Petlan
3e6b43beb7 perf tools: Add external commands to list-cmds
The `perf --list-cmds` output prints only internal commands, although
there is no reason for that from users' perspective.

Adding the external commands to commands array with NULL function
pointer allows printing all perf commands while not changing the logic
of command handler selection.

Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-2-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:21:00 -03:00
Michael Petlan
0ff26efe92 perf docs: Add perf-iostat link to manpages
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220404221541.30312-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:20:59 -03:00
Denis Nikitin
bc21e74d47 perf session: Remap buf if there is no space for event
If a perf event doesn't fit into the remaining buffer space, return NULL
to remap buf and fetch the event again.

Keep the logic to error out on inadequate input from fuzzing.

This fixes perf failing on ChromeOS (with 32b userspace):

  $ perf report -v -i perf.data
  ...
  prefetch_event: head=0x1fffff8 event->header_size=0x30, mmap_size=0x2000000: fuzzed or compressed perf.data?
  Error:
  failed to process sample

Fixes: 57fc032ad6 ("perf session: Avoid infinite loop when seeing invalid header.size")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Denis Nikitin <denik@chromium.org>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Budankov <alexey.budankov@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220330031130.2152327-1-denik@chromium.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 14:20:59 -03:00
Athira Rajeev
299687e18a perf bench: Fix epoll bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench epoll' testcase fails on systems with more than 1K CPUs.

Testcase: perf bench epoll all

Result snippet:
<<>>
Run summary [PID 106497]: 1399 threads monitoring on 64 file-descriptors for 8 secs.

perf: pthread_create: No such file or directory
<<>>

In the epoll benchmarks (ctl, wait) pthread_create is invoked in do_threads
from the respective bench_epoll_* function. Though the logs show a direct
failure from pthread_create, the actual failure is from
"sched_setaffinity" returning EINVAL (invalid argument).

This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.

Patch addresses this by fixing all the epoll benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.
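
A minimal sketch of the CPU_ALLOC pattern described above, using the
glibc macros from <sched.h>. The helper name single_cpu_mask() and its
parameters are hypothetical and are not taken from the benchmark code.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>

/*
 * Hypothetical helper: build a dynamically sized CPU mask that can
 * represent nr_cpus CPUs and has only 'cpu' set. The byte size of the
 * mask is stored in *size_out. Returns NULL on bad input or allocation
 * failure; the caller releases the mask with CPU_FREE().
 */
static cpu_set_t *single_cpu_mask(int cpu, int nr_cpus, size_t *size_out)
{
	cpu_set_t *mask;
	size_t size;

	assert(size_out != NULL);

	if (nr_cpus <= 0 || cpu < 0 || cpu >= nr_cpus)
		return NULL;

	mask = CPU_ALLOC(nr_cpus);		/* not limited to 1024 CPUs */
	if (!mask)
		return NULL;

	size = CPU_ALLOC_SIZE(nr_cpus);		/* size of the mask in bytes */
	CPU_ZERO_S(size, mask);
	CPU_SET_S(cpu, size, mask);

	*size_out = size;
	return mask;
}
```

A caller would then pass the size and mask on, for example to
sched_setaffinity(0, size, mask) or
pthread_attr_setaffinity_np(&attr, size, mask), and free the mask with
CPU_FREE(mask) afterwards.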

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-3-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Athira Rajeev
c9c2a427dd perf bench: Fix futex bench to correct usage of affinity for machines with #CPUs > 1K
The 'perf bench futex' testcase fails on systems with more than 1K CPUs.

Testcase: perf bench futex all

Failure snippet:
<<>>Running futex/hash benchmark...

perf: pthread_create: No such file or directory
<<>>

In all the futex benchmarks (i.e. hash, lock-api, requeue, wake,
wake-parallel), pthread_create is invoked in the respective bench_futex_*
function. Though the logs show a direct failure from pthread_create,
strace logs showed that the actual failure is from "sched_setaffinity"
returning EINVAL (invalid argument).

This happens because the default mask size in glibc is 1024. To overcome
this 1024 CPUs mask size limitation of cpu_set_t, change the mask size
using the CPU_*_S macros.

Patch addresses this by fixing all the futex benchmarks to use CPU_ALLOC
to allocate cpumask, CPU_ALLOC_SIZE for size, and CPU_SET_S to set the
mask.

Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Disha Goel <disgoel@linux.vnet.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nageswara R Sastry <rnsastry@linux.ibm.com>
Cc: linuxppc-dev@lists.ozlabs.org
Link: https://lore.kernel.org/r/20220406175113.87881-2-atrajeev@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Adrian Hunter
aeee9dc53c perf tools: Fix perf's libperf_print callback
eprintf() does not expect va_list as the type of the 4th parameter.

Use veprintf() because it does.
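
The distinction can be sketched in plain C. All demo_* names and the
simplified signatures below are hypothetical stand-ins, not perf's or
libperf's real functions: a callback that is handed a va_list has to
forward it to a v-style function, since a variadic function would treat
the va_list object itself as a single format argument.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char last_msg[128];	/* captures the most recent message */

/* v-variant: accepts an already started va_list. */
static int demo_veprintf(int level, const char *fmt, va_list ap)
{
	assert(fmt != NULL);
	(void)level;
	return vsnprintf(last_msg, sizeof(last_msg), fmt, ap);
}

/* Variadic front end: packs "..." into a va_list and forwards it. */
static int demo_eprintf(int level, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = demo_veprintf(level, fmt, ap);
	va_end(ap);
	return ret;
}

/*
 * Print callback that receives a va_list. It must call the v-variant;
 * demo_eprintf(level, fmt, ap) would compile but would not format the
 * caller's arguments correctly.
 */
static int demo_print_cb(int level, const char *fmt, va_list ap)
{
	return demo_veprintf(level, fmt, ap);
}

/* Plays the role of a library invoking the registered callback. */
static int demo_lib_log(int level, const char *fmt, ...)
{
	va_list ap;
	int ret;

	va_start(ap, fmt);
	ret = demo_print_cb(level, fmt, ap);
	va_end(ap);
	return ret;
}
```

Calling demo_lib_log(1, "cpu %d: %s", 3, "ok") leaves "cpu 3: ok" in
last_msg.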

Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Fixes: 428dab813a ("libperf: Merge libperf_set_print() into libperf_init()")
Cc: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20220408132625.2451452-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
James Clark
ffab487052 perf: arm-spe: Fix perf report --mem-mode
Since commit bb30acae4c ("perf report: Bail out --mem-mode if mem
info is not available") "perf mem report" and "perf report --mem-mode"
don't allow opening the file unless one of the events has
PERF_SAMPLE_DATA_SRC set.

SPE doesn't have this set even though synthetic memory data is generated
after it is decoded. Fix this issue by setting DATA_SRC on SPE events.
This has no effect on the data collected because the SPE driver doesn't
do anything with that flag and doesn't generate samples.

Fixes: bb30acae4c ("perf report: Bail out --mem-mode if mem info is not available")
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: linux-arm-kernel@lists.infradead.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20220408144056.1955535-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
James Clark
fa7095c5c3 perf unwind: Don't show unwind error messages when augmenting frame pointer stack
Commit b9f6fbb3b2 ("perf arm64: Inject missing frames when
using 'perf record --call-graph=fp'") intended to add a 'best effort'
DWARF unwind that improved the frame pointer stack in most scenarios.

It's expected that the unwind will fail sometimes, but this shouldn't be
reported as an error. It only works when the return address can be
determined from the contents of the link register alone.

Fix the error shown when the unwinder requires extra registers by adding
a new flag that suppresses error messages. This flag is not set in the
normal --call-graph=dwarf unwind mode so that behavior is not changed.

Fixes: b9f6fbb3b2 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Reported-by: John Garry <john.garry@huawei.com>
Signed-off-by: James Clark <james.clark@arm.com>
Tested-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220406145651.1392529-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Arnaldo Carvalho de Melo
278aaba2c5 tools headers arm64: Sync arm64's cputype.h with the kernel sources
To get the changes in:

  83bea32ac7 ("arm64: Add part number for Arm Cortex-A78AE")

That addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/asm/cputype.h' differs from latest version at 'arch/arm64/include/asm/cputype.h'
  diff -u tools/arch/arm64/include/asm/cputype.h arch/arm64/include/asm/cputype.h

Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Chanho Park <chanho61.park@samsung.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Will Deacon <will@kernel.org>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Chengdong Li
290fa68bdc perf test tsc: Fix error message when not supported
By default `perf test tsc` does not report an error message when the
child process detects that the kernel does not support it. Instead, the
child process prints an error message to stderr, but stderr is
redirected to /dev/null when verbose <= 0.

This patch:

- Returns TEST_SKIP to the parent process instead of TEST_OK when
  perf_read_tsc_conversion() is not supported.

- Adds a new subtest that checks whether TSC is supported on the current
  architecture, by moving existing code to a separate function. This
  avoids having two places in test__perf_time_to_tsc() that return
  TEST_SKIP.

- Extends the test suite definition to contain the above two subtests.
  The current test_suite and test_case structs do not support printing a
  skip reason when the number of subtests is less than 1. To print the
  skip reason, it is necessary to extend the current test suite
  definition.

Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Chengdong Li <chengdongli@tencent.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: likexu@tencent.com
Link: https://lore.kernel.org/r/20220408084748.43707-1-chengdongli@tencent.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Arnaldo Carvalho de Melo
3a8a047586 perf build: Don't use -ffat-lto-objects in the python feature test when building with clang-13
Using -ffat-lto-objects in the python feature test when building with
clang-13 results in:

  clang-13: error: optimization flag '-ffat-lto-objects' is not supported [-Werror,-Wignored-optimization-argument]
  error: command '/usr/sbin/clang' failed with exit code 1
  cp: cannot stat '/tmp/build/perf/python_ext_build/lib/perf*.so': No such file or directory
  make[2]: *** [Makefile.perf:639: /tmp/build/perf/python/perf.so] Error 1

Noticed when building on a docker.io/library/archlinux:base container.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Arnaldo Carvalho de Melo
dd6e1fe91c perf python: Fix probing for some clang command line options
The clang compiler complains about some options even without a source
file being available, while others require one, so use the simple
tools/build/feature/test-hello.c file.

Then check for the "is not supported" string in its output, in addition
to the "unknown argument" already being looked for.

This was noticed when building with clang-13, where -ffat-lto-objects
isn't supported. Since we were looking just for "unknown argument" and
not providing a source file to clang, the option was mistakenly assumed
to be available and was not filtered out of the set of command line
options provided to clang, leading to a build failure.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Sedat Dilek <sedat.dilek@gmail.com>
Link: http://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:29 -03:00
Arnaldo Carvalho de Melo
41caff459a tools build: Filter out options and warnings not supported by clang
These make the feature check fail when using clang, so remove them just
like is done in tools/perf/Makefile.config to build perf itself.

Adding -Wno-compound-token-split-by-macro to tools/perf/Makefile.config
when building with clang is also necessary to avoid these warnings
turned into errors (-Werror):

    CC      /tmp/build/perf/util/scripting-engines/trace-event-perl.o
  In file included from util/scripting-engines/trace-event-perl.c:35:
  In file included from /usr/lib64/perl5/CORE/perl.h:4085:
  In file included from /usr/lib64/perl5/CORE/hv.h:659:
  In file included from /usr/lib64/perl5/CORE/hv_func.h:34:
  In file included from /usr/lib64/perl5/CORE/sbox32_hash.h:4:
  /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '(' and '{' tokens introducing statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
      ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /usr/lib64/perl5/CORE/zaphod32_hash.h:80:38: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
  #define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START {  \
                                       ^~~~~~~~~~
  /usr/lib64/perl5/CORE/perl.h:737:29: note: expanded from macro 'STMT_START'
  #   define STMT_START   (void)( /* gcc supports "({ STATEMENTS; })" */
                                ^
  /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: '{' token is here
      ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /usr/lib64/perl5/CORE/zaphod32_hash.h:80:49: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
  #define ZAPHOD32_SCRAMBLE32(v,prime) STMT_START {  \
                                                  ^
  /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: error: '}' and ')' tokens terminating statement expression appear in different macro expansion contexts [-Werror,-Wcompound-token-split-by-macro]
      ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /usr/lib64/perl5/CORE/zaphod32_hash.h:87:41: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
      v ^= (v>>23);                       \
                                          ^
  /usr/lib64/perl5/CORE/zaphod32_hash.h:150:5: note: ')' token is here
      ZAPHOD32_SCRAMBLE32(state[0],0x9fade23b);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  /usr/lib64/perl5/CORE/zaphod32_hash.h:88:3: note: expanded from macro 'ZAPHOD32_SCRAMBLE32'
  } STMT_END
    ^~~~~~~~
  /usr/lib64/perl5/CORE/perl.h:738:21: note: expanded from macro 'STMT_END'
  #   define STMT_END     )
                          ^

Please refer to the discussion on the Link: tag below, where Nathan
clarifies the situation:

<quote>
acme> And then get to the problems at the end of this message, which seem
acme> similar to the problem described here:
acme>
acme> From  Nathan Chancellor <>
acme> Subject	[PATCH] mwifiex: Remove unnecessary braces from HostCmd_SET_SEQ_NO_BSS_INFO
acme>
acme> https://lkml.org/lkml/2020/9/1/135
acme>
acme> So perhaps in this case its better to disable that
acme> -Werror,-Wcompound-token-split-by-macro when building with clang?

Yes, I think that is probably the best solution. As far as I can tell,
at least in this file and context, the warning appears harmless, as the
"create a GNU C statement expression from two different macros" is very
much intentional, based on the presence of PERL_USE_GCC_BRACE_GROUPS.
The warning is fixed in upstream Perl by just avoiding creating GNU C
statement expressions using STMT_START and STMT_END:

  https://github.com/Perl/perl5/issues/18780
  https://github.com/Perl/perl5/pull/18984

If I am reading the source code correctly, an alternative to disabling
the warning would be specifying -DPERL_GCC_BRACE_GROUPS_FORBIDDEN but it
seems like that might end up impacting more than just this site,
according to the issue discussion above.
</quote>

Based-on-a-patch-by: Sedat Dilek <sedat.dilek@gmail.com>
Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YkxWcYzph5pC1EK8@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:34:16 -03:00
Arnaldo Carvalho de Melo
541f695cbc tools build: Use $(shell ) instead of `` to get embedded libperl's ccopts
Just like it's done for ldopts and for both in tools/perf/Makefile.config.

Using `` to initialize PERL_EMBED_CCOPTS somehow precludes using:

  $(filter-out SOMETHING_TO_FILTER,$(PERL_EMBED_CCOPTS))

And we need to do it to allow for building with versions of clang where
some gcc options selected by distros are not available.

Tested-by: Sedat Dilek <sedat.dilek@gmail.com> # Debian/Selfmade LLVM-14 (x86-64)
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Fangrui Song <maskray@google.com>
Cc: Florian Fainelli <f.fainelli@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Keeping <john@metanate.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Link: http://lore.kernel.org/lkml/YktYX2OnLtyobRYD@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 12:33:38 -03:00
Arnaldo Carvalho de Melo
940442deea tools include UAPI: Sync linux/vhost.h with the kernel sources
To get the changes in:

  b04d910af3 ("vdpa: support exposing the count of vqs to userspace")
  a61280dddd ("vdpa: support exposing the config size to userspace")

Silencing this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/vhost.h' differs from latest version at 'include/uapi/linux/vhost.h'
  diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h

  $ diff -u tools/include/uapi/linux/vhost.h include/uapi/linux/vhost.h
  --- tools/include/uapi/linux/vhost.h	2021-07-15 16:17:01.840818309 -0300
  +++ include/uapi/linux/vhost.h	2022-04-02 18:55:05.702522387 -0300
  @@ -150,4 +150,11 @@
   /* Get the valid iova range */
   #define VHOST_VDPA_GET_IOVA_RANGE	_IOR(VHOST_VIRTIO, 0x78, \
   					     struct vhost_vdpa_iova_range)
  +
  +/* Get the config size */
  +#define VHOST_VDPA_GET_CONFIG_SIZE	_IOR(VHOST_VIRTIO, 0x79, __u32)
  +
  +/* Get the count of all virtqueues */
  +#define VHOST_VDPA_GET_VQS_COUNT	_IOR(VHOST_VIRTIO, 0x80, __u32)
  +
   #endif
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > before
  $ cp include/uapi/linux/vhost.h tools/include/uapi/linux/vhost.h
  $ tools/perf/trace/beauty/vhost_virtio_ioctl.sh > after
  $ diff -u before after
  --- before	2022-04-04 14:52:25.036375145 -0300
  +++ after	2022-04-04 14:52:31.906549976 -0300
  @@ -38,4 +38,6 @@
   	[0x73] = "VDPA_GET_CONFIG",
   	[0x76] = "VDPA_GET_VRING_NUM",
   	[0x78] = "VDPA_GET_IOVA_RANGE",
  +	[0x79] = "VDPA_GET_CONFIG_SIZE",
  +	[0x80] = "VDPA_GET_VQS_COUNT",
   };
  $
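
For reference, the two new commands can be decoded with a small sketch of the asm-generic ioctl number layout (8 bits of command number, 8 bits of type, 14 bits of size, 2 bits of direction). This assumes the generic layout; a few architectures use different direction/size fields. The lower-case helper names are hypothetical.

```python
# asm-generic ioctl encoding: | dir (2) | size (14) | type (8) | nr (8) |
_IOC_NRSHIFT = 0
_IOC_TYPESHIFT = 8
_IOC_SIZESHIFT = 16
_IOC_DIRSHIFT = 30
_IOC_READ = 2

VHOST_VIRTIO = 0xAF   # ioctl "type" used by the vhost UAPI
SIZEOF_U32 = 4        # size of the __u32 argument


def ior(ioc_type: int, nr: int, size: int) -> int:
    """Model of the _IOR() macro for the generic layout."""
    return ((_IOC_READ << _IOC_DIRSHIFT) | (size << _IOC_SIZESHIFT)
            | (ioc_type << _IOC_TYPESHIFT) | (nr << _IOC_NRSHIFT))


def ioc_nr(cmd: int) -> int:
    """Extract the command number, which indexes the beautifier table."""
    return cmd & 0xFF


VHOST_VDPA_GET_CONFIG_SIZE = ior(VHOST_VIRTIO, 0x79, SIZEOF_U32)
VHOST_VDPA_GET_VQS_COUNT = ior(VHOST_VIRTIO, 0x80, SIZEOF_U32)

print(hex(VHOST_VDPA_GET_CONFIG_SIZE), hex(ioc_nr(VHOST_VDPA_GET_CONFIG_SIZE)))
print(hex(VHOST_VDPA_GET_VQS_COUNT), hex(ioc_nr(VHOST_VDPA_GET_VQS_COUNT)))
```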

Cc: Longpeng <longpeng2@huawei.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Link: https://lore.kernel.org/lkml/YksxoFcOARk%2Fldev@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-09 11:42:33 -03:00
Anup Patel
ebdef0de2d KVM: selftests: riscv: Fix alignment of the guest_hang() function
The guest_hang() function is used as the default exception handler
for various KVM selftests applications by setting its address in
the vstvec CSR. The vstvec CSR requires exception handler base address
to be at least 4-byte aligned so this patch fixes alignment of the
guest_hang() function.
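
The constraint can be sketched as follows: the low two bits of the trap vector CSR hold the MODE field, so a handler address written there needs those bits clear. The helper names and the example address are hypothetical and only model the arithmetic, not the selftest code.

```python
TVEC_MODE_MASK = 0x3  # bits [1:0] of the trap vector CSR select the mode


def is_valid_handler_base(addr: int) -> bool:
    """True if addr can be used as a trap handler base (4-byte aligned)."""
    return (addr & TVEC_MODE_MASK) == 0


def align_up(addr: int, alignment: int = 4) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


unaligned = 0x80001002          # illustrative, 2-byte aligned address
print(is_valid_handler_base(unaligned))
print(hex(align_up(unaligned)))
```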

Fixes: 3e06cdf105 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-04-09 09:15:51 +05:30
Anup Patel
fac3725364 KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table
Supporting hardware updates of PTE A and D bits is optional for any
RISC-V implementation so current software strategy is to always set
these bits in both G-stage (hypervisor) and VS-stage (guest kernel).

If PTE A and D bits are not set by software (hypervisor or guest)
then RISC-V implementations not supporting hardware updates of these
bits will cause traps even for perfectly valid PTEs.

Based on above explanation, the VS-stage page table created by various
KVM selftest applications is not correct because PTE A and D bits are
not set. This patch fixes VS-stage page table programming of PTE A and
D bits for KVM selftests.
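
A minimal sketch of the idea, assuming the standard RISC-V PTE flag layout (V, R, W, X, U, G, A, D in bits 0 to 7, PPN starting at bit 10) and 4 KiB pages. make_leaf_pte() is a hypothetical helper, not the selftest's page table code.

```python
PTE_V = 1 << 0  # valid
PTE_R = 1 << 1  # readable
PTE_W = 1 << 2  # writable
PTE_X = 1 << 3  # executable
PTE_A = 1 << 6  # accessed
PTE_D = 1 << 7  # dirty

PAGE_SHIFT = 12     # 4 KiB pages
PTE_PPN_SHIFT = 10  # PPN field starts at bit 10


def make_leaf_pte(phys_addr: int, perms: int) -> int:
    """Build a leaf PTE with A and D pre-set by software.

    Pre-setting A and D avoids traps on implementations that do not
    update these bits in hardware.
    """
    ppn = phys_addr >> PAGE_SHIFT
    return (ppn << PTE_PPN_SHIFT) | perms | PTE_A | PTE_D | PTE_V


pte = make_leaf_pte(0x80200000, PTE_R | PTE_W)
print(hex(pte))
```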

Fixes: 3e06cdf105 ("KVM: selftests: Add initial support for RISC-V 64-bit")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Tested-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <anup@brainfault.org>
2022-04-09 09:15:44 +05:30
Linus Torvalds
9abb16bad5 linux-kselftest-fixes-5.18-rc2
This Kselftest fixes update for Linux 5.18-rc2 consists of build,
 run-times fixes to tests:
 
 - header dependencies
 - missing tear-downs to release allocated resources in assert paths
 - missing error messages when build fails
 - coccicheck and unused variable warnings

Merge tag 'linux-kselftest-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest fixes from Shuah Khan:
 "Build and run-times fixes to tests:

   - header dependencies

   - missing tear-downs to release allocated resources in assert paths

   - missing error messages when build fails

   - coccicheck and unused variable warnings"

* tag 'linux-kselftest-fixes-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests/harness: Pass variant to teardown
  selftests/harness: Run TEARDOWN for ASSERT failures
  selftests: fix an unused variable warning in pidfd selftest
  selftests: fix header dependency for pid_namespace selftests
  selftests: x86: add 32bit build warnings for SUSE
  selftests/proc: fix array_size.cocci warning
  selftests/vDSO: fix array_size.cocci warning
2022-04-08 14:48:35 -10:00
Jakub Kicinski
34ba23b44c Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Daniel Borkmann says:

====================
pull-request: bpf-next 2022-04-09

We've added 63 non-merge commits during the last 9 day(s) which contain
a total of 68 files changed, 4852 insertions(+), 619 deletions(-).

The main changes are:

1) Add libbpf support for USDT (User Statically-Defined Tracing) probes.
   USDTs are an abstraction built on top of uprobes, critical for tracing
   and BPF, and widely used in production applications, from Andrii Nakryiko.

2) While Andrii was adding support for x86{-64}-specific logic of parsing
   USDT argument specification, Ilya followed-up with USDT support for s390
   architecture, from Ilya Leoshkevich.

3) Support name-based attaching for uprobe BPF programs in libbpf. The format
   supported is `u[ret]probe/binary_path:[raw_offset|function[+offset]]`, e.g.
   attaching to libc malloc can be done in BPF via SEC("uprobe/libc.so.6:malloc")
   now, from Alan Maguire.

4) Various load/store optimizations for the arm64 JIT to shrink the image
   size by using arm64 str/ldr immediate instructions. Also enable pointer
   authentication to verify return address for JITed code, from Xu Kuohai.

5) BPF verifier fixes for write access checks to helper functions, e.g.
   rd-only memory from bpf_*_cpu_ptr() must not be passed to helpers that
   write into passed buffers, from Kumar Kartikeya Dwivedi.

6) Fix overly excessive stack map allocation for its base map structure and
   buckets which slipped-in from cleanups during the rlimit accounting removal
   back then, from Yuntao Wang.

7) Extend the unstable CT lookup helpers for XDP and tc/BPF to report netfilter
   connection tracking tuple direction, from Lorenzo Bianconi.

8) Improve bpftool dump to show BPF program/link type names, Milan Landaverde.

9) Minor cleanups all over the place from various others.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (63 commits)
  bpf: Fix excessive memory allocation in stack_map_alloc()
  selftests/bpf: Fix return value checks in perf_event_stackmap test
  selftests/bpf: Add CO-RE relos into linked_funcs selftests
  libbpf: Use weak hidden modifier for USDT BPF-side API functions
  libbpf: Don't error out on CO-RE relos for overriden weak subprogs
  samples, bpf: Move routes monitor in xdp_router_ipv4 in a dedicated thread
  libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
  libbpf: Use strlcpy() in path resolution fallback logic
  libbpf: Add s390-specific USDT arg spec parsing logic
  libbpf: Make BPF-side of USDT support work on big-endian machines
  libbpf: Minor style improvements in USDT code
  libbpf: Fix use #ifdef instead of #if to avoid compiler warning
  libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
  selftests/bpf: Uprobe tests should verify param/return values
  libbpf: Improve string parsing for uprobe auto-attach
  libbpf: Improve library identification for uprobe binary path resolution
  selftests/bpf: Test for writes to map key from BPF helpers
  selftests/bpf: Test passing rdonly mem to global func
  bpf: Reject writes for PTR_TO_MAP_KEY in check_helper_mem_access
  bpf: Check PTR_TO_MEM | MEM_RDONLY in check_helper_mem_access
  ...
====================

Link: https://lore.kernel.org/r/20220408231741.19116-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-08 17:07:29 -07:00
Yuntao Wang
658d87687c selftests/bpf: Fix return value checks in perf_event_stackmap test
The bpf_get_stackid() function may also return 0 on success as per UAPI BPF
helper documentation. Therefore, correct checks from 'val > 0' to 'val >= 0'
to ensure that they cover all possible success return values.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408041452.933944-1-ytcoode@gmail.com
2022-04-08 22:38:17 +02:00
Andrii Nakryiko
8555defe48 selftests/bpf: Add CO-RE relos into linked_funcs selftests
Add CO-RE relocations into __weak subprogs for multi-file linked_funcs
selftest to make sure libbpf handles such combination well.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-4-andrii@kernel.org
2022-04-08 22:24:15 +02:00
Andrii Nakryiko
2fa5b0f290 libbpf: Use weak hidden modifier for USDT BPF-side API functions
Use __weak __hidden for bpf_usdt_xxx() APIs instead of much more
confusing `static inline __noinline`. This was previously impossible due
to libbpf erroring out on CO-RE relocations pointing to eliminated weak
subprogs. Now that previous patch fixed this issue, switch back to
__weak __hidden as it's a more direct way of specifying the desired
behavior.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-3-andrii@kernel.org
2022-04-08 22:24:15 +02:00
Andrii Nakryiko
e89d57d938 libbpf: Don't error out on CO-RE relos for overriden weak subprogs
During BPF static linking, all the ELF relocations and .BTF.ext
information (including CO-RE relocations) are preserved for __weak
subprograms that were logically overridden either by a previous weak
subprogram instance or by the corresponding "strong" (non-weak) subprogram.
This is just how native user-space linkers work, nothing new.

But libbpf is overzealous when processing CO-RE relocations and errors
out when it encounters a CO-RE relocation belonging to such an eliminated
weak subprogram. Instead of erroring out in this expected situation, log
a debug-level message and skip the relocation.

Fixes: db2b8b0642 ("libbpf: Support CO-RE relocations for multi-prog sections")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220408181425.2287230-2-andrii@kernel.org
2022-04-08 22:24:15 +02:00
Dan Williams
e8cf229ebe tools/testing/nvdimm: Fix security_init() symbol collision
Starting with the new perf-event support in the nvdimm core, the
nfit_test mock module stops compiling. Rename its security_init() to
nfit_security_init().

tools/testing/nvdimm/test/nfit.c:1845:13: error: conflicting types for ‘security_init’; have ‘void(struct nfit_test *)’
 1845 | static void security_init(struct nfit_test *t)
      |             ^~~~~~~~~~~~~
In file included from ./include/linux/perf_event.h:61,
                 from ./include/linux/nd.h:11,
                 from ./drivers/nvdimm/nd-core.h:11,
                 from tools/testing/nvdimm/test/nfit.c:19:

Fixes: 9a61d0838c ("drivers/nvdimm: Add nvdimm pmu structure")
Cc: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Reviewed-by: Vishal Verma <vishal.l.verma@intel.com>
Link: https://lore.kernel.org/r/164904238610.1330275.1889212115373993727.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-04-08 12:59:25 -07:00
Andrii Nakryiko
3a06ec0a99 libbpf: Allow WEAK and GLOBAL bindings during BTF fixup
During BTF fixup for global variables, a global variable can be weak,
in which case it has STB_WEAK binding in ELF. Support such global
variables in addition to non-weak ones.

This is not a problem when using BPF static linking, as BPF static
linker "fixes up" BTF during generation so that libbpf doesn't have to
do it anymore during bpf_object__open(), which led to this not being
noticed for a while, along with a pretty rare (currently) use of __weak
variables and maps.
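
The binding check can be sketched as below. The STB_* values and the split of st_info into binding (upper four bits) and type (lower four bits) come from the ELF specification; is_global_var_binding() is a hypothetical helper, not libbpf's function.

```python
STB_LOCAL = 0
STB_GLOBAL = 1
STB_WEAK = 2

STT_OBJECT = 1  # symbol type used for data objects (variables)


def st_bind(st_info: int) -> int:
    """Equivalent of ELF64_ST_BIND(): binding lives in the upper 4 bits."""
    return st_info >> 4


def st_info(bind: int, sym_type: int) -> int:
    """Equivalent of ELF64_ST_INFO(): pack binding and type into one byte."""
    return (bind << 4) | (sym_type & 0xF)


def is_global_var_binding(info: int) -> bool:
    """Accept both strong (STB_GLOBAL) and weak (STB_WEAK) globals."""
    return st_bind(info) in (STB_GLOBAL, STB_WEAK)


print(is_global_var_binding(st_info(STB_WEAK, STT_OBJECT)))
print(is_global_var_binding(st_info(STB_LOCAL, STT_OBJECT)))
```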

Reported-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-2-andrii@kernel.org
2022-04-08 09:16:09 -07:00
Andrii Nakryiko
3c0dfe6e4c libbpf: Use strlcpy() in path resolution fallback logic
Coverity static analyzer complains that strcpy() can cause buffer
overflow. Use libbpf_strlcpy() instead to be 100% sure this doesn't
happen.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220407230446.3980075-1-andrii@kernel.org
2022-04-08 09:16:09 -07:00
Ilya Leoshkevich
bd022685bd libbpf: Add s390-specific USDT arg spec parsing logic
The logic is superficially similar to that of x86, but the small
differences (no need for register table and dynamic allocation of
register names, no $ sign before constants) make maintaining a common
implementation too burdensome. Therefore simply add an s390x-specific
version of parse_usdt_arg().

Note that while bcc supports index registers, this patch does not. This
should not be a problem in most cases, since s390 uses a default value
"nor" for STAP_SDT_ARG_CONSTRAINT.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-4-iii@linux.ibm.com
2022-04-08 07:04:20 -07:00
Linus Torvalds
73b193f265 Networking fixes for 5.18-rc2, including fixes from bpf and netfilter
Current release - new code bugs:
   - mctp: correct mctp_i2c_header_create result
 
   - eth: fungible: fix reference to __udivdi3 on 32b builds
 
   - eth: micrel: remove latencies support lan8814
 
 Previous releases - regressions:
   - bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
 
   - vrf: fix packet sniffing for traffic originating from ip tunnels
 
   - rxrpc: fix a race in rxrpc_exit_net()
 
   - dsa: revert "net: dsa: stop updating master MTU from master.c"
 
   - eth: ice: fix MAC address setting
 
 Previous releases - always broken:
   - tls: fix slab-out-of-bounds bug in decrypt_internal
 
   - bpf: support dual-stack sockets in bpf_tcp_check_syncookie
 
   - xdp: fix coalescing for page_pool fragment recycling
 
   - ovs: fix leak of nested actions
 
   - eth: sfc:
     - add missing xdp queue reinitialization
     - fix using uninitialized xdp tx_queue
 
   - eth: ice:
     - clear default forwarding VSI during VSI release
     - fix broken IFF_ALLMULTI handling
     - synchronize_rcu() when terminating rings
 
   - eth: qede: confirm skb is allocated before using
 
   - eth: aqc111: fix out-of-bounds accesses in RX fixup
 
   - eth: slip: fix NPD bug in sl_tx_timeout()
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Merge tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - new code bugs:

   - mctp: correct mctp_i2c_header_create result

   - eth: fungible: fix reference to __udivdi3 on 32b builds

   - eth: micrel: remove latencies support lan8814

  Previous releases - regressions:

   - bpf: resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT

   - vrf: fix packet sniffing for traffic originating from ip tunnels

   - rxrpc: fix a race in rxrpc_exit_net()

   - dsa: revert "net: dsa: stop updating master MTU from master.c"

   - eth: ice: fix MAC address setting

  Previous releases - always broken:

   - tls: fix slab-out-of-bounds bug in decrypt_internal

   - bpf: support dual-stack sockets in bpf_tcp_check_syncookie

   - xdp: fix coalescing for page_pool fragment recycling

   - ovs: fix leak of nested actions

   - eth: sfc:
      - add missing xdp queue reinitialization
      - fix using uninitialized xdp tx_queue

   - eth: ice:
      - clear default forwarding VSI during VSI release
      - fix broken IFF_ALLMULTI handling
      - synchronize_rcu() when terminating rings

   - eth: qede: confirm skb is allocated before using

   - eth: aqc111: fix out-of-bounds accesses in RX fixup

   - eth: slip: fix NPD bug in sl_tx_timeout()"

* tag 'net-5.18-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (61 commits)
  drivers: net: slip: fix NPD bug in sl_tx_timeout()
  bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
  bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
  myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
  net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
  qede: confirm skb is allocated before using
  net: ipv6mr: fix unused variable warning with CONFIG_IPV6_PIMSM_V2=n
  net: phy: mscc-miim: reject clause 45 register accesses
  net: axiemac: use a phandle to reference pcs_phy
  dt-bindings: net: add pcs-handle attribute
  net: axienet: factor out phy_node in struct axienet_local
  net: axienet: setup mdio unconditionally
  net: sfc: fix using uninitialized xdp tx_queue
  rxrpc: fix a race in rxrpc_exit_net()
  net: openvswitch: fix leak of nested actions
  net: ethernet: mv643xx: Fix over zealous checking of_get_mac_address()
  net: openvswitch: don't send internal clone attribute to the userspace.
  net: micrel: Fix KS8851 Kconfig
  ice: clear cmd_type_offset_bsz for TX rings
  ice: xsk: fix VSI state check in ice_xsk_wakeup()
  ...
2022-04-07 19:01:47 -10:00
Ilya Leoshkevich
6f403d9d53 libbpf: Make BPF-side of USDT support work on big-endian machines
BPF_USDT_ARG_REG_DEREF handling always reads 8 bytes, regardless of
the actual argument size. On little-endian the relevant argument bits
end up in the lower bits of val, and later on the code that handles
all the argument types expects them to be there.

On big-endian they end up in the upper bits of val, breaking that
expectation. Fix by right-shifting val on big-endian.
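
The effect can be reproduced with a small model: an argument of arg_size bytes sits at the start of an 8-byte read, followed by unrelated bytes. The function name and the final masking step are assumptions standing in for the shared argument handling, not the actual BPF-side code.

```python
def read_usdt_arg(mem: bytes, arg_size: int, byteorder: str) -> int:
    """Model an 8-byte read of an argument that is only arg_size bytes wide."""
    val = int.from_bytes(mem[:8], byteorder)
    if byteorder == "big":
        # On big-endian the argument lands in the upper bits of val,
        # so shift it down to where the common code expects it.
        val >>= 64 - arg_size * 8
    # Stand-in for the common handling: keep only the low arg_size bytes.
    return val & ((1 << (arg_size * 8)) - 1)


value = 0x11223344
junk = b"\xaa" * 4  # whatever happens to follow the argument in memory
little = value.to_bytes(4, "little") + junk
big = value.to_bytes(4, "big") + junk

print(hex(read_usdt_arg(little, 4, "little")))
print(hex(read_usdt_arg(big, 4, "big")))
```

Without the shift, the big-endian case would return the trailing junk bytes instead of the argument.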

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-3-iii@linux.ibm.com
2022-04-07 20:59:10 -07:00
Ilya Leoshkevich
e1b6df598a libbpf: Minor style improvements in USDT code
Fix several typos and references to non-existing headers.
Also use __BYTE_ORDER__ instead of __BYTE_ORDER for consistency with
the rest of the bpf code (see commit 45f2bebc80 ("libbpf: Fix
endianness detection in BPF_CORE_READ_BITFIELD_PROBED()") for
rationale).

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220407214411.257260-2-iii@linux.ibm.com
2022-04-07 20:59:10 -07:00
Andrii Nakryiko
ded6dffaed libbpf: Fix use #ifdef instead of #if to avoid compiler warning
As reported by Naresh:

  perf build errors on i386 [1] on Linux next-20220407 [2]

  usdt.c:1181:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1181 | #if __x86_64__
        |     ^~~~~~~~~~
  usdt.c:1196:5: error: "__x86_64__" is not defined, evaluates to 0
  [-Werror=undef]
   1196 | #if __x86_64__
        |     ^~~~~~~~~~
  cc1: all warnings being treated as errors

Use #ifdef instead of #if to avoid this.

Fixes: 4c59e584d1 ("libbpf: Add x86-specific USDT arg spec parsing logic")
Reported-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220407203842.3019904-1-andrii@kernel.org
2022-04-07 23:34:15 +02:00
Haowen Bai
e58c5c9717 libbpf: Potential NULL dereference in usdt_manager_attach_usdt()
link could be NULL here, yet it is still dereferenced in
bpf_link__destroy(&link->link), which will lead to a NULL pointer access.

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649299098-2069-1-git-send-email-baihaowen@meizu.com
2022-04-07 11:46:33 -07:00
Alan Maguire
1717e24801 selftests/bpf: Uprobe tests should verify param/return values
uprobe/uretprobe tests don't do any validation of arguments/return values,
and without this we can't be sure we are attached to the right function,
or that we are indeed attached to a uprobe or uretprobe.  To fix this,
record the argument and return value for auto-attached functions and ensure
these match expectations.  We also need to filter by pid to ensure we do
not pick up stray malloc()s, since auto-attach traces libc system-wide.

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-4-git-send-email-alan.maguire@oracle.com
2022-04-07 11:42:51 -07:00
Alan Maguire
90db26e6be libbpf: Improve string parsing for uprobe auto-attach
For uprobe auto-attach, the parsing of the SEC() name can be simplified
to a single sscanf(); the return value of the sscanf() can then
be used to distinguish between sections that simply specify
"u[ret]probe" (and thus cannot auto-attach) and those that specify
"u[ret]probe/binary_path:function+offset" and similar forms.

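The single-pass parsing idea can be sketched with one regular expression in place of sscanf(). This is only a model of the section name format quoted above; the pattern and the parse_uprobe_sec() helper are assumptions and do not reproduce libbpf's actual format string.

```python
import re

# type[/binary_path:function[+offset]]
_SEC_RE = re.compile(
    r"^(?P<type>u(?:ret)?probe)"
    r"(?:/(?P<path>[^:]+):(?P<func>[A-Za-z0-9_.]+)"
    r"(?:\+(?P<off>0[xX][0-9a-fA-F]+|[0-9]+))?)?$"
)


def parse_uprobe_sec(sec: str):
    """Return (type, path, func, offset), or None if sec is not a uprobe name.

    path and func are None for a bare "u[ret]probe" section, which
    therefore cannot be auto-attached.
    """
    m = _SEC_RE.match(sec)
    if m is None:
        return None
    off = m.group("off")
    if off is None:
        offset = 0
    elif off.lower().startswith("0x"):
        offset = int(off, 16)
    else:
        offset = int(off)
    return m.group("type"), m.group("path"), m.group("func"), offset


print(parse_uprobe_sec("uprobe"))
print(parse_uprobe_sec("uretprobe/libc.so.6:malloc"))
print(parse_uprobe_sec("uprobe//usr/bin/bash:readline+0x10"))
```
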
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-3-git-send-email-alan.maguire@oracle.com
2022-04-07 11:42:50 -07:00
Alan Maguire
a1c9d61b19 libbpf: Improve library identification for uprobe binary path resolution
In the process of doing path resolution for uprobe attach, libraries are
identified by matching a ".so" substring in the binary_path.
This matches a lot of patterns that do not conform to library.so[.version]
format, so instead match a ".so" _suffix_, and if that fails match a
".so." substring for the versioned library case.

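The difference between the old and new matching rules can be shown with a short sketch. Both functions are hypothetical models of the rule described above, and the example paths are made up.

```python
def old_looks_like_lib(binary_path: str) -> bool:
    """Previous rule: any ".so" substring counts as a library."""
    return ".so" in binary_path


def looks_like_lib(binary_path: str) -> bool:
    """New rule: a ".so" suffix, or ".so." for versioned libraries."""
    return binary_path.endswith(".so") or ".so." in binary_path


for path in ("libc.so.6", "libfoo.so", "/opt/my.sources/tool"):
    print(path, old_looks_like_lib(path), looks_like_lib(path))
```

The last path contains ".so" only as part of a directory name, so the old rule misclassifies it while the new one does not.
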
Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1649245431-29956-2-git-send-email-alan.maguire@oracle.com
2022-04-07 11:42:50 -07:00
Oliver Upton
21db838466 selftests: KVM: Free the GIC FD when cleaning up in arch_timer
In order to correctly destroy a VM, all references to the VM must be
freed. The arch_timer selftest creates a VGIC for the guest, which
itself holds a reference to the VM.

Close the GIC FD when cleaning up a VM.

Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220406235615.1447180-4-oupton@google.com
2022-04-07 08:46:13 +01:00
Oliver Upton
386ba265a8 selftests: KVM: Don't leak GIC FD across dirty log test iterations
dirty_log_perf_test instantiates a VGICv3 for the guest (if supported by
hardware) to reduce the overhead of guest exits. However, the test does
not actually close the GIC fd when cleaning up the VM between test
iterations, meaning that the VM is never actually destroyed in the
kernel.

While this is generally a bad idea, the bug was detected from the kernel
spewing about duplicate debugfs entries as subsequent VMs happen to
reuse the same FD even though the debugfs directory is still present.

Abstract away the notion of setup/cleanup of the GIC FD from the test
by creating arch-specific helpers for test setup/cleanup. Close the GIC
FD on VM cleanup and do nothing for the other architectures.

Fixes: c340f7899a ("KVM: selftests: Add vgic initialization for dirty log perf test for ARM")
Reviewed-by: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220406235615.1447180-3-oupton@google.com
2022-04-07 08:46:13 +01:00
Andrew Jones
02de9331c4 KVM: selftests: get-reg-list: Add KVM_REG_ARM_FW_REG(3)
When testing a kernel with commit a5905d6af4 ("KVM: arm64:
Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")
get-reg-list output

vregs: Number blessed registers:   234
vregs: Number registers:           238

vregs: There are 1 new registers.
Consider adding them to the blessed reg list with the following lines:

	KVM_REG_ARM_FW_REG(3),

vregs: PASS
...

That output inspired two changes: 1) add the new register to the
blessed list and 2) explain why "Number registers" is actually four
larger than "Number blessed registers" (on the system used for
testing), even though only one register is being stated as new.
The reason is that some registers are host dependent and they get
filtered out when comparing with the blessed list. The system
used for the test apparently had three filtered registers.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220316125129.392128-1-drjones@redhat.com
2022-04-07 08:45:01 +01:00
Jakub Kicinski
8e9d0d7a76 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Alexei Starovoitov says:

====================
pull-request: bpf 2022-04-06

We've added 8 non-merge commits during the last 8 day(s) which contain
a total of 9 files changed, 139 insertions(+), 36 deletions(-).

The main changes are:

1) rethook related fixes, from Jiri and Masami.

2) Fix the case when tracing bpf prog is attached to struct_ops, from Martin.

3) Support dual-stack sockets in bpf_tcp_check_syncookie, from Maxim.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
  bpf: Support dual-stack sockets in bpf_tcp_check_syncookie
  bpf: selftests: Test fentry tracing a struct_ops program
  bpf: Resolve to prog->aux->dst_prog->type only for BPF_PROG_TYPE_EXT
  rethook: Fix to use WRITE_ONCE() for rethook:: Handler
  selftests/bpf: Fix warning comparing pointer to 0
  bpf: Fix sparse warnings in kprobe_multi_resolve_syms
  bpftool: Explicit errno handling in skeletons
====================

Link: https://lore.kernel.org/r/20220407031245.73026-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-06 21:58:50 -07:00
Kumar Kartikeya Dwivedi
9fc4476a08 selftests/bpf: Test for writes to map key from BPF helpers
When invoking the bpf_for_each_map_elem callback, we are passed a
PTR_TO_MAP_KEY. Previously, writes to it through a helper may have been
allowed, but the fix in the previous patches is meant to prevent that. The
test case tries to pass it as writable memory to a helper, and the test
fails if this gets past the verifier.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220319080827.73251-6-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-04-06 10:32:12 -07:00
Kumar Kartikeya Dwivedi
7cb29b1c99 selftests/bpf: Test passing rdonly mem to global func
Add two test cases. The first passes a read-only map value pointer to a
global func, which should be rejected. The same code checks it for kfunc,
so that is covered as well. The second tries to exploit the missing check
for PTR_TO_MEM's MEM_RDONLY flag by writing to a read-only memory
pointer. Without the prior patches, both of these tests fail.

Reviewed-by: Hao Luo <haoluo@google.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Link: https://lore.kernel.org/r/20220319080827.73251-5-memxor@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-04-06 10:32:12 -07:00
Artem Savkov
ebaf24c589 selftests/bpf: Use bpf_num_possible_cpus() in per-cpu map allocations
bpf_map_value_size() uses num_possible_cpus() to determine map size, but
some of the tests only allocate enough memory for online cpus. This
results in out-of-bound writes in userspace during bpf(BPF_MAP_LOOKUP_ELEM)
syscalls in cases when number of online cpus is lower than the number of
possible cpus. Fix by switching from get_nprocs_conf() to
bpf_num_possible_cpus() when determining the number of processors in
these tests (test_progs/netcnt and test_cgroup_storage).
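
The size mismatch can be sketched by counting CPUs from sysfs-style CPU lists such as those in /sys/devices/system/cpu/possible and /sys/devices/system/cpu/online. The helper name, the example lists and the simplified size calculation are assumptions for illustration only.

```python
def count_cpus(cpu_list: str) -> int:
    """Count CPUs in a sysfs-style list such as "0-3,8-11"."""
    count = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            count += int(hi) - int(lo) + 1
        else:
            int(part)  # raises ValueError on malformed input
            count += 1
    return count


# Hypothetical machine: 8 possible CPUs, only 4 of them online.
n_possible = count_cpus("0-7\n")
n_online = count_cpus("0-3\n")

value_size = 8                            # bytes per CPU, illustrative
kernel_writes = value_size * n_possible   # what a per-CPU lookup fills in
buffer_size = value_size * n_online       # what the buggy tests allocated
print(kernel_writes, buffer_size, kernel_writes > buffer_size)
```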

Signed-off-by: Artem Savkov <asavkov@redhat.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406085408.339336-1-asavkov@redhat.com
2022-04-06 10:15:53 -07:00
Colin Ian King
a8d600f6bc libbpf: Fix spelling mistake "libaries" -> "libraries"
There is a spelling mistake in a pr_warn message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406080835.14879-1-colin.i.king@gmail.com
2022-04-06 10:14:27 -07:00
Yuntao Wang
958ddfd75d selftests/bpf: Fix issues in parse_num_list()
The function does not check that parsing_end is false after parsing the
argument. Thus, if the final part of the argument is something like '4-',
which is invalid, parse_num_list() will discard it instead of returning
-EINVAL.

Before:

 $ ./test_progs -n 2,4-
 #2 atomic_bounds:OK
 Summary: 1/0 PASSED, 0 SKIPPED, 0 FAILED

After:

 $ ./test_progs -n 2,4-
 Failed to parse test numbers.
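
The intended behaviour can be modelled with a short sketch that rejects a dangling range such as '4-' instead of silently dropping it. This is a simplified stand-in written for illustration, not a port of the selftest's parse_num_list(); it raises ValueError where the C code returns -EINVAL.

```python
def parse_num_list(spec: str) -> list:
    """Parse "2,4-6" into [2, 4, 5, 6]; raise ValueError on bad input."""
    selected = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-", 1)
            # An unfinished range like "4-" leaves hi empty: reject it.
            if not lo.isdigit() or not hi.isdigit():
                raise ValueError("invalid range: %r" % part)
            lo, hi = int(lo), int(hi)
            if lo > hi:
                raise ValueError("descending range: %r" % part)
            selected.update(range(lo, hi + 1))
        else:
            if not part.isdigit():
                raise ValueError("invalid number: %r" % part)
            selected.add(int(part))
    return sorted(selected)


print(parse_num_list("2,4-6"))
try:
    parse_num_list("2,4-")
except ValueError:
    print("Failed to parse test numbers.")
```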

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220406003622.73539-1-ytcoode@gmail.com
2022-04-06 10:10:03 -07:00
Maxim Mikityanskiy
53968dafc4 bpf: Adjust bpf_tcp_check_syncookie selftest to test dual-stack sockets
The previous commit fixed support for dual-stack sockets in
bpf_tcp_check_syncookie. This commit adjusts the selftest to verify the
fixed functionality.

Signed-off-by: Maxim Mikityanskiy <maximmi@nvidia.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Arthur Fabre <afabre@cloudflare.com>
Link: https://lore.kernel.org/bpf/20220406124113.2795730-2-maximmi@nvidia.com
2022-04-06 09:44:45 -07:00
Reiji Watanabe
2f5d27e6cf KVM: arm64: selftests: Introduce vcpu_width_config
Introduce a test for aarch64 that ensures non-mixed-width vCPUs
(all 64bit vCPUs or all 32bit vCPUs) can be configured, and
mixed-width vCPUs cannot be configured.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Reiji Watanabe <reijiw@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220329031924.619453-3-reijiw@google.com
2022-04-06 12:29:45 +01:00
Yuntao Wang
2d0df01974 selftests/bpf: Fix file descriptor leak in load_kallsyms()
Currently, if sym_cnt > 0, load_kallsyms() returns without closing the file. Fix it.

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220405145711.49543-1-ytcoode@gmail.com
2022-04-05 16:49:32 -07:00
Andrii Nakryiko
00a0fa2d7d selftests/bpf: Add urandom_read shared lib and USDTs
Extend urandom_read helper binary to include USDTs of 4 combinations:
semaphore/semaphoreless (refcounted and non-refcounted) and based in
executable or shared library. We also extend urandom_read with ability
to report its own PID to parent process and wait for parent process to
ready itself up for tracing urandom_read. We utilize popen() and
underlying pipe properties for proper signaling.

Once urandom_read is ready, we add a few tests to validate that libbpf's
USDT attachment handles all the above combinations of semaphore (or lack
of it) and static or shared library USDTs. Also, we validate that libbpf
handles shared libraries both with PID filter and without one (i.e., -1
for PID argument).

Having the shared library case tested with and without PID is important
because internal logic differs on kernels that don't support BPF
cookies. On such older kernels, attaching to USDTs in shared libraries
without specifying concrete PID doesn't work in principle, because it's
impossible to determine shared library's load address to derive absolute
IPs for uprobe attachments. Without absolute IPs, it's impossible to
perform correct look up of USDT spec based on uprobe's absolute IP (the
only kind available from BPF at runtime). This is not the problem on
newer kernels with BPF cookie as we don't need IP-to-ID lookup because
BPF cookie value *is* spec ID.

So having those two situations as separate subtests is good because
libbpf CI is able to test latest selftests against old kernels (e.g.,
4.9 and 5.5), so we'll be able to disable PID-less shared lib attachment
for old kernels, but will still leave PID-specific one enabled to validate
this legacy logic is working correctly.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-8-andrii@kernel.org
2022-04-05 13:16:08 -07:00
Andrii Nakryiko
630301b0d5 selftests/bpf: Add basic USDT selftests
Add semaphore-based USDT to test_progs itself and write basic tests to
validate both auto-attachment and manual attachment logic, as well as
BPF-side functionality.

Also add subtests to validate that libbpf properly deduplicates USDT
specs and handles spec overflow situations correctly, as well as proper
"rollback" of partially-attached multi-spec USDT.

BPF-side of selftest intentionally consists of two files to validate
that usdt.bpf.h header can be included from multiple source code files
that are subsequently linked into final BPF object file without causing
any symbol duplication or other issues. We are validating that __weak
maps and bpf_usdt_xxx() API functions defined in usdt.bpf.h do work as
intended.

USDT selftests utilize sys/sdt.h header that on Ubuntu systems comes
from systemtap-sdt-devel package. But to simplify everyone's life,
including CI but especially casual contributors to bpf/bpf-next that
are trying to build selftests, I've checked in sys/sdt.h header from [0]
directly. This way it will work on all architectures and distros without
having to figure it out for every relevant combination and adding any
extra implicit package dependencies.

  [0] https://sourceware.org/git?p=systemtap.git;a=blob_plain;f=includes/sys/sdt.h;h=ca0162b4dc57520b96638c8ae79ad547eb1dd3a1;hb=HEAD

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Acked-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-7-andrii@kernel.org
2022-04-05 13:16:08 -07:00
Andrii Nakryiko
4c59e584d1 libbpf: Add x86-specific USDT arg spec parsing logic
Add x86/x86_64-specific USDT argument specification parsing. Each
architecture will require its own logic, as all this is arch-specific
assembly-based notation. Architectures that libbpf doesn't support for
USDTs will pr_warn() with specific error and return -ENOTSUP.

We use sscanf() as a very powerful and easy to use string parser. Those
spaces in sscanf's format string mean "skip any whitespace", which is
a pretty nifty (and somewhat little-known) feature.
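
A minimal sketch of the whitespace-skipping behaviour, assuming only the simplest "<size>@%<register>" argument form. parse_reg_arg() is a hypothetical helper, and the format string here is far simpler than whatever libbpf actually uses.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical, simplified parser for the "<size>@%<register>" form only.
 * reg must point to a buffer of at least 16 bytes.
 */
static int parse_reg_arg(const char *arg, int *arg_sz, char *reg)
{
	/* Each space in the format matches zero or more whitespace
	 * characters in the input; "%%" matches a literal '%'.
	 */
	if (sscanf(arg, " %d @ %%%15s", arg_sz, reg) != 2)
		return -EINVAL;
	return 0;
}
```

With this format, "-4@%eax" and "  8  @  %rax" both parse, while "8@rax" is rejected because the literal '%' is missing.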

All this was tested on little-endian architecture, so bit shifts are
probably off on big-endian, which our CI will hopefully prove.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-6-andrii@kernel.org
2022-04-05 13:16:08 -07:00
Andrii Nakryiko
999783c8bb libbpf: Wire up spec management and other arch-independent USDT logic
Last part of architecture-agnostic user-space USDT handling logic is to
set up BPF spec and, optionally, IP-to-ID maps from user-space.
usdt_manager performs a compact spec ID allocation to utilize
fixed-sized BPF maps as efficiently as possible. We also use hashmap to
deduplicate USDT arg spec strings and map identical strings to single
USDT spec, minimizing the necessary BPF map size. usdt_manager supports
arbitrary sequences of attachment and detachment, both of the same USDT
and multiple different USDTs and internally maintains a free list of
unused spec IDs. bpf_link_usdt's logic is extended with proper setup and
teardown of this spec ID free list and supporting BPF maps.
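
The compact ID allocation with a free list can be sketched as follows. This is an illustration under assumptions, not libbpf's usdt_manager: the struct, the function names and the fixed capacity of 256 are all hypothetical.

```c
#include <errno.h>

#define SKETCH_MAX_SPECS 256	/* assumed capacity of the fixed-size map */

/* Hypothetical allocator that hands out the smallest set of IDs it can. */
struct spec_id_pool {
	int free_ids[SKETCH_MAX_SPECS];	/* stack of released IDs */
	int free_cnt;
	int next_id;			/* lowest never-used ID */
};

/* Reuse a released ID if one exists, otherwise take the next unused one. */
static int spec_id_alloc(struct spec_id_pool *pool)
{
	if (pool->free_cnt > 0)
		return pool->free_ids[--pool->free_cnt];
	if (pool->next_id >= SKETCH_MAX_SPECS)
		return -E2BIG;
	return pool->next_id++;
}

/* Return an ID to the pool. The caller must not free the same ID twice. */
static void spec_id_free(struct spec_id_pool *pool, int id)
{
	pool->free_ids[pool->free_cnt++] = id;
}
```

Reusing released IDs before growing next_id keeps the IDs dense, which is what lets a fixed-size map serve arbitrary attach and detach sequences.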

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-5-andrii@kernel.org
2022-04-05 13:16:07 -07:00
Andrii Nakryiko
74cc6311ce libbpf: Add USDT notes parsing and resolution logic
Implement architecture-agnostic parts of USDT parsing logic. The code is
the documentation in this case; it's futile to try to succinctly
describe how USDT parsing is done in any sort of concreteness. But
still, USDTs are recorded in special ELF notes section (.note.stapsdt),
where each USDT call site is described separately. Along with USDT
provider and USDT name, each such note contains USDT argument
specification, which uses assembly-like syntax to describe how to fetch
value of USDT argument. USDT arg spec could be just a constant, or
a register, or a register dereference (most common cases in x86_64), but
it can technically be much more complicated, like an offset relative
to a global symbol and so on. One of the later patches will
implement the most common subset of this for x86 and x86-64 architectures,
which seems to handle a lot of real-world production applications.

USDT arg spec contains a compact encoding allowing usdt.bpf.h from
previous patch to handle the above 3 cases. Instead of recording which
register might be needed, we encode register's offset within struct
pt_regs to simplify BPF-side implementation. USDT argument can be of
different byte sizes (1, 2, 4, and 8) and signed or unsigned. To handle
this, libbpf pre-calculates necessary bit shifts to do proper casting
and sign-extension in a short sequence of left and right shifts.
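
A minimal sketch of the register-offset and shift idea, assuming a register-held argument only. struct sketch_regs stands in for struct pt_regs, and sketch_arg_spec, make_spec() and fetch_arg() are hypothetical names, not libbpf's.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct pt_regs; the real layout is architecture-specific. */
struct sketch_regs {
	uint64_t rax, rbx, rcx, rdx;
};

/* Hypothetical pre-computed description of one register-held argument. */
struct sketch_arg_spec {
	size_t reg_off;		/* offset of the register within sketch_regs */
	int shift;		/* 64 - 8 * argument size in bytes */
	bool is_signed;
};

static struct sketch_arg_spec make_spec(size_t reg_off, int arg_sz,
					bool is_signed)
{
	struct sketch_arg_spec spec = { reg_off, 64 - arg_sz * 8, is_signed };

	return spec;
}

static int64_t fetch_arg(const struct sketch_regs *regs,
			 const struct sketch_arg_spec *spec)
{
	uint64_t val;

	/* Read the full 64-bit register by offset, not by name. */
	memcpy(&val, (const char *)regs + spec->reg_off, sizeof(val));

	/* The left shift drops the unused high bytes; the right shift then
	 * either sign-extends or zero-extends. Right-shifting a negative
	 * signed value is implementation-defined in C but arithmetic on
	 * mainstream compilers.
	 */
	val <<= spec->shift;
	if (spec->is_signed)
		return (int64_t)val >> spec->shift;
	return (int64_t)(val >> spec->shift);
}
```

Precomputing the shift in user space means the fetch path needs no branching on argument size.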

The rest is in the code with sometimes extensive comments and references
to external "documentation" for USDTs.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Reviewed-by: Dave Marchevsky <davemarchevsky@fb.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-4-andrii@kernel.org
2022-04-05 13:16:07 -07:00
Andrii Nakryiko
2e4913e025 libbpf: Wire up USDT API and bpf_link integration
Wire up libbpf USDT support APIs without yet implementing all the
nitty-gritty details of USDT discovery, spec parsing, and BPF map
initialization.

User-visible user-space API is simple and is conceptually very similar
to uprobe API.

bpf_program__attach_usdt() API allows to programmatically attach given
BPF program to a USDT, specified through binary path (executable or
shared lib), USDT provider and name. Also, just like in uprobe case, PID
filter is specified (0 - self, -1 - any process, or specific PID).
Optionally, USDT cookie value can be specified. Such single API
invocation will try to discover given USDT in specified binary and will
use (potentially many) BPF uprobes to attach this program in correct
locations.

Just like any bpf_program__attach_xxx() APIs, bpf_link is returned that
represents this attachment. It is a virtual BPF link that doesn't have
direct kernel object, as it can consist of multiple underlying BPF
uprobe links. As such, attachment is not atomic operation and there can
be brief moment when some USDT call sites are attached while others are
still in the process of attaching. This should be taken into
consideration by user. But bpf_program__attach_usdt() guarantees that
in the case of success all USDT call sites are successfully attached, or
all the successful attachments will be detached as soon as some USDT
call sites failed to be attached. So, in theory, there could be cases of
failed bpf_program__attach_usdt() call which did trigger few USDT
program invocations. This is unavoidable due to multi-uprobe nature of
USDT and has to be handled by user, if it's important to create an
illusion of atomicity.
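
The all-or-nothing behaviour can be sketched with stand-in types. Nothing below is libbpf API: sketch_site and the attach/detach helpers are hypothetical and only simulate uprobe attachment.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for one USDT call site and its uprobe. */
struct sketch_site {
	bool should_fail;	/* simulate an attach failure */
	bool attached;
};

static int site_attach(struct sketch_site *site)
{
	if (site->should_fail)
		return -ENOENT;
	site->attached = true;
	return 0;
}

static void site_detach(struct sketch_site *site)
{
	site->attached = false;
}

/* Attach every site or none: on the first failure, undo earlier attachments. */
static int attach_all_sites(struct sketch_site *sites, int cnt)
{
	for (int i = 0; i < cnt; i++) {
		int err = site_attach(&sites[i]);

		if (err) {
			while (--i >= 0)
				site_detach(&sites[i]);
			return err;
		}
	}
	return 0;
}
```

Between the first successful site_attach() and the rollback, already-attached sites are live, which is the window in which a failed call may still have triggered a few program invocations.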

USDT BPF programs themselves are marked in BPF source code as either
SEC("usdt"), in which case they won't be auto-attached through
skeleton's <skel>__attach() method, or it can have a full definition,
which follows the spirit of fully-specified uprobes:
SEC("usdt/<path>:<provider>:<name>"). In the latter case skeleton's
attach method will attempt auto-attachment. Similarly, generic
bpf_program__attach() will have enough information to go off of for
parameterless attachment.

USDT BPF programs are actually uprobes, and as such for kernel they are
marked as BPF_PROG_TYPE_KPROBE.

Another part of this patch is USDT-related feature probing:
  - BPF cookie support detection from user-space;
  - detection of kernel support for auto-refcounting of USDT semaphore.

The latter is optional. If kernel doesn't support such feature and USDT
doesn't rely on USDT semaphores, no error is returned. But if libbpf
detects that USDT requires setting semaphores and kernel doesn't support
this, libbpf errors out with explicit pr_warn() message. Libbpf doesn't
support poking process's memory directly to increment semaphore value,
like BCC does on legacy kernels, due to inherent raciness and danger of
such process memory manipulation. Libbpf lets kernel take care of this
properly or gives up.

Logistically, all the extra USDT-related infrastructure of libbpf is put
into a separate usdt.c file and abstracted behind struct usdt_manager.
Each bpf_object has lazily-initialized usdt_manager pointer, which is
only instantiated if USDT programs are attempted to be attached. Closing
BPF object frees up usdt_manager resources. usdt_manager keeps track of
USDT spec ID assignment and few other small things.

Subsequent patches will fill out remaining missing pieces of USDT
initialization and setup logic.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-3-andrii@kernel.org
2022-04-05 13:16:07 -07:00
Andrii Nakryiko
d72e2968fb libbpf: Add BPF-side of USDT support
Add BPF-side implementation of libbpf-provided USDT support. This
consists of single header library, usdt.bpf.h, which is meant to be used
from user's BPF-side source code. This header is added to the list of
installed libbpf headers, alongside bpf_helpers.h and others.

BPF-side implementation consists of two BPF maps:
  - spec map, which contains "a USDT spec" which encodes information
    necessary to be able to fetch USDT arguments and other information
    (argument count, user-provided cookie value, etc) at runtime;
  - IP-to-spec-ID map, which is only used on kernels that don't support
    BPF cookie feature. It allows to lookup spec ID based on the place
    in user application that triggers USDT program.
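
How the two maps cooperate can be sketched like this, assuming (as a later patch in this series states) that the BPF cookie value is the spec ID when cookies are supported. The table type and sketch_spec_id() are hypothetical, and a linear scan stands in for the BPF hash map lookup.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the IP-to-spec-ID map. */
struct sketch_ip_entry {
	uint64_t ip;
	int spec_id;
};

/* Returns the spec ID for the current invocation, or -1 if none is found. */
static int sketch_spec_id(bool has_cookie, uint64_t cookie, uint64_t ip,
			  const struct sketch_ip_entry *tbl, size_t cnt)
{
	if (has_cookie)
		return (int)cookie;	/* the cookie carries the spec ID */

	/* No cookie support: fall back to the absolute IP of the uprobe. */
	for (size_t i = 0; i < cnt; i++)
		if (tbl[i].ip == ip)
			return tbl[i].spec_id;
	return -1;
}
```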

These maps have default sizes, 256 and 1024, which are chosen
conservatively to not waste a lot of space, but handling a lot of common
cases. But there could be cases when user application needs to either
trace a lot of different USDTs, or USDTs are heavily inlined and their
arguments are located in a lot of differing locations. For such cases it
might be necessary to size those maps up, which libbpf allows to do by
overriding BPF_USDT_MAX_SPEC_CNT and BPF_USDT_MAX_IP_CNT macros.

It is an important aspect to keep in mind. Single USDT (user-space
equivalent of kernel tracepoint) can have multiple USDT "call sites".
That is, single logical USDT is triggered from multiple places in user
application. This can happen due to function inlining. Each such inlined
instance of USDT invocation can have its own unique USDT argument
specification (instructions about the location of the value of each of
USDT arguments). So while USDT looks very similar to usual uprobe or
kernel tracepoint, under the hood it's actually a collection of uprobes,
each potentially needing different spec to know how to fetch arguments.

User-visible API consists of three helper functions:
  - bpf_usdt_arg_cnt(), which returns number of arguments of current USDT;
  - bpf_usdt_arg(), which reads value of specified USDT argument (by
    its zero-indexed position) and returns it as 64-bit value;
  - bpf_usdt_cookie(), which functions like BPF cookie for USDT
    programs; this is necessary as libbpf doesn't allow specifying actual
    BPF cookie and utilizes it internally for USDT support implementation.

Each bpf_usdt_xxx() API expects struct pt_regs * context, passed into
BPF program. On kernels that don't support BPF cookie it is used to
fetch absolute IP address of the underlying uprobe.

usdt.bpf.h also provides BPF_USDT() macro, which functions like
BPF_PROG() and BPF_KPROBE() and allows much more user-friendly way to
get access to USDT arguments, if USDT definition is static and known to
the user. It is expected that majority of use cases won't have to use
bpf_usdt_arg_cnt() and bpf_usdt_arg() directly and BPF_USDT() will cover
all their needs.

Last, usdt.bpf.h is utilizing BPF CO-RE for one single purpose: to
detect kernel support for BPF cookie. If BPF CO-RE dependency is
undesirable, user application can redefine BPF_USDT_HAS_BPF_COOKIE to
either a boolean constant (or equivalently zero and non-zero), or even
point it to its own .rodata variable that can be specified from user's
application user-space code. It is important that
BPF_USDT_HAS_BPF_COOKIE is known to BPF verifier as static value (thus
.rodata and not just .data), as otherwise BPF code will still contain
bpf_get_attach_cookie() BPF helper call and will fail validation at
runtime, if not dead-code eliminated.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404234202.331384-2-andrii@kernel.org
2022-04-05 13:16:07 -07:00
Peter Zijlstra
7a53f40890 objtool: Fix SLS validation for kcov tail-call replacement
Since not all compilers have a function attribute to disable KCOV
instrumentation, objtool can rewrite KCOV instrumentation in noinstr
functions as per commit:

  f56dae88a8 ("objtool: Handle __sanitize_cov*() tail calls")

However, this has subtle interaction with the SLS validation from
commit:

  1cc1e4c8aa ("objtool: Add straight-line-speculation validation")

In that when a tail-call instruction is replaced with a RET an
additional INT3 instruction is also written, but is not represented in
the decoded instruction stream.

This then leads to false positive missing INT3 objtool warnings in
noinstr code.

Instead of adding additional struct instruction objects, mark the RET
instruction with retpoline_safe to suppress the warning (since we know
there really is an INT3).

Fixes: 1cc1e4c8aa ("objtool: Add straight-line-speculation validation")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220323230712.GA8939@worktop.programming.kicks-ass.net
2022-04-05 10:24:40 +02:00
Peter Zijlstra
d139bca4b8 objtool: Fix IBT tail-call detection
Objtool reports:

  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx() falls through to next function poly1305_blocks_x86_64()
  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_emit_avx() falls through to next function poly1305_emit_x86_64()
  arch/x86/crypto/poly1305-x86_64.o: warning: objtool: poly1305_blocks_avx2() falls through to next function poly1305_blocks_x86_64()

Which reads like:

0000000000000040 <poly1305_blocks_x86_64>:
	 40:       f3 0f 1e fa             endbr64
	...

0000000000000400 <poly1305_blocks_avx>:
	400:       f3 0f 1e fa             endbr64
	404:       44 8b 47 14             mov    0x14(%rdi),%r8d
	408:       48 81 fa 80 00 00 00    cmp    $0x80,%rdx
	40f:       73 09                   jae    41a <poly1305_blocks_avx+0x1a>
	411:       45 85 c0                test   %r8d,%r8d
	414:       0f 84 2a fc ff ff       je     44 <poly1305_blocks_x86_64+0x4>
	...

These are simple conditional tail-calls and *should* be recognised as
such by objtool, however due to a mistake in commit 08f87a93c8
("objtool: Validate IBT assumptions") this is failing.

Specifically, the jump_dest is +4, this means the instruction pointed
at will not be ENDBR and as such it will fail the second clause of
is_first_func_insn() that was supposed to capture this exact case.

Instead, have is_first_func_insn() look at the previous instruction.

Fixes: 08f87a93c8 ("objtool: Validate IBT assumptions")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20220322115125.811582125@infradead.org
2022-04-05 10:24:40 +02:00
Chun-Tse Shao
d5ea4fece4 kbuild: Allow kernel installation packaging to override pkg-config
Add HOSTPKG_CONFIG to allow tooling that builds the kernel to override
what pkg-config and parameters are used.

Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-04-05 17:03:31 +09:00
Ilya Leoshkevich
568189310c libbpf: Support Debian in resolve_full_path()
attach_probe selftest fails on Debian-based distros with `failed to
resolve full path for 'libc.so.6'`. The reason is that these distros
embraced multiarch to the point where even for the "main" architecture
they store libc in /lib/<triple>.

This is configured in /etc/ld.so.conf and in theory it's possible to
replicate the loader's parsing and processing logic in libbpf, however
a much simpler solution is to just enumerate the known library paths.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404225020.51029-1-iii@linux.ibm.com
2022-04-04 16:47:16 -07:00
Daniel Latypov
baa3331503 kunit: tool: more descriptive metavars/--help output
Before, our help output contained lines like
  --kconfig_add KCONFIG_ADD
  --qemu_config qemu_config
  --jobs jobs

They're not very helpful.

The former kind come from the automatic 'metavar' we get from argparse,
the uppercase version of the flag name.
The latter are where we manually specified metavar as the flag name.

After:
  --build_dir DIR
  --make_options X=Y
  --kunitconfig PATH
  --kconfig_add CONFIG_X=Y
  --arch ARCH
  --cross_compile PREFIX
  --qemu_config FILE
  --jobs N
  --timeout SECONDS
  --raw_output [{all,kunit}]
  --json [FILE]

This patch tries to make the code more clear by specifying the _type_ of
input we expect, e.g. --build_dir is a DIR, --qemu_config is a FILE.
I also switched it to uppercase since it looked more clearly like
placeholder text that way.

This patch also changes --raw_output to specify `choices` to make it
more clear what the options are, and this way argparse can validate it
for us, as shown by the added test case.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 16:22:14 -06:00
Ilya Leoshkevich
d298761746 selftests/bpf: Define SYS_NANOSLEEP_KPROBE_NAME for aarch64
attach_probe selftest fails on aarch64 with `failed to create kprobe
'sys_nanosleep+0x0' perf event: No such file or directory`. This is
because, like on several other architectures, nanosleep has a prefix.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Tested-by: Alan Maguire <alan.maguire@oracle.com>
Link: https://lore.kernel.org/bpf/20220404142101.27900-1-iii@linux.ibm.com
2022-04-04 14:57:29 -07:00
Milan Landaverde
7b53eaa656 bpftool: Handle libbpf_probe_prog_type errors
Previously [1], we were using bpf_probe_prog_type which returned a
bool, but the new libbpf_probe_bpf_prog_type can return a negative
error code on failure. With this change, bpftool reports a program
type as unavailable when the probe fails.
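
A tiny sketch of that decision, assuming the probe follows the usual libbpf convention of a positive value for "supported", zero for "not supported" and a negative error code on failure. prog_type_available() is a hypothetical name, not bpftool's.

```c
#include <stdbool.h>

/* Hypothetical helper: collapse the three-way probe result into a bool.
 * Both "not supported" (0) and "probe failed" (< 0) count as unavailable.
 */
static bool prog_type_available(int probe_ret)
{
	return probe_ret > 0;
}
```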

[1] https://lore.kernel.org/bpf/20220202225916.3313522-3-andrii@kernel.org/

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220331154555.422506-4-milan@mdaverde.com
2022-04-04 14:54:44 -07:00
Milan Landaverde
fff3dfab17 bpftool: Add missing link types
Display the missing link type names in bpftool link show output.

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220331154555.422506-3-milan@mdaverde.com
2022-04-04 14:54:34 -07:00
Milan Landaverde
380341637e bpftool: Add syscall prog type
In addition to displaying the program type in bpftool prog show,
this lets us query bpf_prog_type_syscall availability through
feature probe and see which helpers are available in those
programs (such as bpf_sys_bpf and bpf_sys_close).

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220331154555.422506-2-milan@mdaverde.com
2022-04-04 14:52:54 -07:00
Quentin Monnet
4eeebce6ac selftests/bpf: Fix parsing of prog types in UAPI hdr for bpftool sync
The script for checking that various lists of types in bpftool remain in
sync with the UAPI BPF header uses a regex to parse enum bpf_prog_type.
If this enum contains a set of values different from the list of program
types in bpftool, it complains.

This script should have reported the addition, some time ago, of the new
BPF_PROG_TYPE_SYSCALL, which had not been added to bpftool's program types
list. It failed to do so, because it failed to parse that new type from
the enum. This is because the new value, in the BPF header, has an
explanatory comment on the same line, and the regex does not support
that.

Let's update the script to support parsing enum values when they have
comments on the same line.

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404140944.64744-1-quentin@isovalent.com
2022-04-04 14:46:15 -07:00
Kees Cook
d34f82d67d kunit: tool: Do not colorize output when redirected
Filling log files with color codes makes diffs and other comparisons
difficult. Only emit vt100 codes when the stdout is a TTY.

Cc: Brendan Higgins <brendanhiggins@google.com>
Cc: linux-kselftest@vger.kernel.org
Cc: kunit-dev@googlegroups.com
Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 15:23:50 -06:00
Daniel Latypov
885210d348 kunit: tool: properly report the used arch for --json, or '' if not known
Before, kunit.py always printed "arch": "UM" in its json output, but...
1. With `kunit.py parse`, we could be parsing output from anywhere, so
    we can't say that.
2. Capitalizing it is probably wrong, as it's `ARCH=um`
3. Commit 87c9c16317 ("kunit: tool: add support for QEMU") made it so
   kunit.py could knowingly run a different arch, yet we'd still always
   claim "UM".

This patch addresses all of those. E.g.

1.
$ ./tools/testing/kunit/kunit.py parse .kunit/test.log --json | grep -o '"arch.*' | sort -u
"arch": "",

2.
$ ./tools/testing/kunit/kunit.py run --json | ...
"arch": "um",

3.
$ ./tools/testing/kunit/kunit.py run --json --arch=x86_64 | ...
"arch": "x86_64",

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 15:22:30 -06:00
Daniel Latypov
ee96d25f2f kunit: tool: refactor how we plumb metadata into JSON
When using --json, kunit.py run/exec/parse will produce results in
KernelCI json format.
As part of that, we include the build_dir that was used, and we
(incorrectly) hardcode in the arch, etc.

We'll want a way to plumb more values (as well as the correct `arch`),
so this patch groups those fields into kunit_json.Metadata type.
This patch should have no user visible changes.

And since we only used build_dir in KunitParseRequest for json, we can
now move it out of that struct and add it into KunitExecRequest, which
needs it and used to get it via inheritance.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 15:22:23 -06:00
Daniel Latypov
6bd0f52ee8 kunit: tool: readability tweaks in KernelCI json generation logic
Use a more idiomatic check that a list is non-empty (`if mylist:`) and
simplify the function body by dedenting and using a dict to map between
the kunit TestStatus enum => KernelCI json status string.

The dict hopefully makes it less likely to have bugs like commit
9a6bb30a88 ("kunit: tool: fix --json output for skipped tests").

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 15:22:15 -06:00
Daniel Latypov
aa1c05558e kunit: tool: simplify code since build_dir can't be None
--build_dir is set to a default of '.kunit' since commit ddbd60c779
("kunit: use --build_dir=.kunit as default"), but even before then it
was explicitly set to ''.

So outside of one unit test, there was no way for the build_dir to
ever be None, and we can simplify code by fixing the unit test and
enforcing that via updated type annotations.

E.g. this lets us drop `get_file_path()` since it's now exactly
equivalent to os.path.join().

Note: there are some `if build_dir` checks that also fail if build_dir is
explicitly set to '' that just guard against passing "O=" to make.
But running `make O=` works just fine, so drop these checks.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 14:25:58 -06:00
Daniel Latypov
e6f6192065 kunit: tool: drop last uses of collections.namedtuple
Since we formally require python3.7+ since commit df4b0807ca
("kunit: tool: Assert the version requirement"), we can just use
@dataclasses.dataclass instead.

In kunit_config.py, we used namedtuple to create a hashable type that
had `name` and `value` fields and had to subclass it to define a custom
`__str__()`.
@dataclass lets us just define one type instead.

In qemu_config.py, we use namedtuple to allow modules to define various
parameters. Using @dataclass, we can add type-annotations for all these
fields, making our code more typesafe and making it easier for users to
figure out how to define new configs.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 14:25:53 -06:00
Daniel Latypov
89aa72cd30 kunit: tool: drop unused KernelDirectoryPath var
Commit be886ba90c ("kunit: run kunit_tool from any directory")
introduced this variable, but it was unused even in that commit.

Since it's still unused now and callers can instead use
get_kernel_root_path(), delete this var.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 14:25:47 -06:00
Daniel Latypov
00f75043e4 kunit: tool: make --json handling a bit clearer
Currently kunit_json.get_json_result() will output the JSON-ified test
output to json_path, but iff it's not "stdout".

Instead, move the responsibility entirely over to the one caller.

Signed-off-by: Daniel Latypov <dlatypov@google.com>
Reviewed-by: David Gow <davidgow@google.com>
Reviewed-by: Brendan Higgins <brendanhiggins@google.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 14:25:37 -06:00
Willem de Bruijn
79ee8aa31d selftests/harness: Pass variant to teardown
FIXTURE_VARIANT data is passed to FIXTURE_SETUP and TEST_F as "variant".

In some cases, the variant will change the setup, such that expectations
also change on teardown. Also pass variant to FIXTURE_TEARDOWN.

The new FIXTURE_TEARDOWN logic is identical to that in FIXTURE_SETUP,
right above.

Signed-off-by: Willem de Bruijn <willemb@google.com>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20201210231010.420298-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:37:48 -06:00
Kees Cook
63e6b2a423 selftests/harness: Run TEARDOWN for ASSERT failures
The kselftest test harness has traditionally not run the registered
TEARDOWN handler when a test encountered an ASSERT. This creates
unexpected situations and tests need to be very careful about using
ASSERT, which seems a needless hurdle for test writers.

Because of the harness's design for optional failure handlers, the
original implementation of ASSERT used an abort() to immediately
stop execution, but that meant the context for running teardown was
lost. Instead, use setjmp/longjmp so that teardown can be done.

Failed SETUP routines continue to not be followed by TEARDOWN, though.
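
The control flow described above can be sketched in a few lines. This is a minimal illustration, not the kselftest harness code: struct mini_test, MINI_ASSERT, run_test and the test bodies are hypothetical names invented for the example. It only shows how setjmp/longjmp lets a failed assertion return to the runner so that teardown still runs.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical miniature harness state; not the kselftest structures. */
struct mini_test {
	jmp_buf env;        /* where a failed assertion jumps back to */
	bool passed;
	bool teardown_ran;
};

/* Illustrative stand-in for ASSERT: record the failure, unwind to the runner. */
#define MINI_ASSERT(t, cond)                  \
	do {                                  \
		if (!(cond)) {                \
			(t)->passed = false;  \
			longjmp((t)->env, 1); \
		}                             \
	} while (0)

static void body_ok(struct mini_test *t)   { MINI_ASSERT(t, 1 + 1 == 2); }
static void body_fail(struct mini_test *t) { MINI_ASSERT(t, 1 + 1 == 3); }

static void teardown(struct mini_test *t) { t->teardown_ran = true; }

/* Run one test body; teardown runs whether or not an assertion fired. */
static bool run_test(struct mini_test *t, void (*body)(struct mini_test *))
{
	t->passed = true;
	t->teardown_ran = false;
	if (setjmp(t->env) == 0)
		body(t);        /* a failed MINI_ASSERT lands back at setjmp */
	teardown(t);
	return t->passed;
}
```

With abort() in place of longjmp(), the call to teardown() would never be reached for body_fail.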

Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Will Drewry <wad@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: linux-kselftest@vger.kernel.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:37:37 -06:00
Axel Rasmussen
187816d077 selftests: fix an unused variable warning in pidfd selftest
I fixed a few warnings like this in commit e2aa5e650b
("selftests: fixup build warnings in pidfd / clone3 tests"), but I
missed this one by mistake. Since this variable is unused, remove it.

Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:32:53 -06:00
Axel Rasmussen
52035628fa selftests: fix header dependency for pid_namespace selftests
The way the test target was defined before, when building with clang we
get a command line like this:

clang -Wall -Werror -g -I../../../../usr/include/ \
	regression_enomem.c ../pidfd/pidfd.h  -o regression_enomem

This yields an error, because clang thinks we want to produce both a *.o
file, as well as a precompiled header:

clang: error: cannot specify -o when generating multiple output files

gcc, for whatever reason, doesn't exhibit the same behavior which I
suspect is why the problem wasn't noticed before.

This can be fixed simply by using the LOCAL_HDRS infrastructure the
selftests lib.mk provides. It does the right thing and marks the target
as depending on the header (so if the header changes, we rebuild), but
it filters the header out of the compiler command line, so we don't get
the error described above.

Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
Reviewed-by: Christian Brauner <brauner@kernel.org>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:32:31 -06:00
Geliang Tang
aa8ce29931 selftests: x86: add 32bit build warnings for SUSE
In order to successfully build all these 32bit tests, these 32bit gcc
and glibc packages, named gcc-32bit and glibc-devel-static-32bit on SUSE,
need to be installed.

This patch adds this information to warn_32bit_failure.

Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:29:43 -06:00
Guo Zhengkui
1585b1b55a selftests/proc: fix array_size.cocci warning
Fix the following coccicheck warning:

tools/testing/selftests/proc/proc-pid-vm.c:371:26-27:
WARNING: Use ARRAY_SIZE
tools/testing/selftests/proc/proc-pid-vm.c:420:26-27:
WARNING: Use ARRAY_SIZE

It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.
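
For readers unfamiliar with the warning, the pattern it asks for looks roughly like this. The sketch is illustrative only: the samples array and sum_samples() are made up, and the macro is the common plain definition (the in-kernel one additionally checks at compile time that its argument is an array).

```c
#include <stddef.h>

/* Common plain definition of the macro. */
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))

/* Hypothetical data, only here to have something to iterate over. */
static const int samples[] = { 3, 1, 4, 1, 5 };

static int sum_samples(void)
{
	int sum = 0;

	/* Before the cleanup: i < sizeof(samples) / sizeof(samples[0]) */
	for (size_t i = 0; i < ARRAY_SIZE(samples); i++)
		sum += samples[i];
	return sum;
}
```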

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:27:21 -06:00
Guo Zhengkui
8ff88bec6f selftests/vDSO: fix array_size.cocci warning
Fix the following coccicheck warning:

tools/testing/selftests/vDSO/vdso_test_correctness.c:309:46-47:
WARNING: Use ARRAY_SIZE
tools/testing/selftests/vDSO/vdso_test_correctness.c:373:46-47:
WARNING: Use ARRAY_SIZE

It has been tested with gcc (Debian 8.3.0-6) 8.3.0 on x86_64.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
2022-04-04 13:27:11 -06:00
Arnd Bergmann
fba2689ee7 Merge branch 'remove-h8300' of git://git.infradead.org/users/hch/misc into asm-generic
* 'remove-h8300' of git://git.infradead.org/users/hch/misc:
  remove the h8300 architecture

This is clearly the least actively maintained architecture we have at
the moment, and probably the least useful. It is now the only one that
does not support MMUs at all, and most of the boards only support 4MB
of RAM, out of which the defconfig kernel needs more than half just
for .text/.data.

Guenter Roeck did the original patch to remove the architecture in 2013
after it had already been obsolete for a while, and Yoshinori Sato brought
it back in a much more modern form in 2015. Looking at the git history
since the reinstantiation, it's clear that almost all commits in the tree
are build fixes or cross-architecture cleanups:

$ git log --no-merges --format=%an v4.5.. arch/h8300/ | sort | uniq -c | sort -rn | head -n 12
     25 Masahiro Yamada
     18 Christoph Hellwig
     14 Mike Rapoport
      9 Arnd Bergmann
      8 Mark Rutland
      7 Peter Zijlstra
      6 Kees Cook
      6 Ingo Molnar
      6 Al Viro
      5 Randy Dunlap
      4 Yury Norov

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2022-04-04 14:42:49 +02:00
Borislav Petkov
dbae0a934f x86/cpu: Remove CONFIG_X86_SMAP and "nosmap"
Those were added as part of the SMAP enablement but SMAP is currently
an integral part of kernel proper and there's no need to disable it
anymore.

Rip out that functionality. Leave --uaccess default on for objtool as
this is what objtool should do by default anyway.

If still needed - clearcpuid=smap.

Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Lai Jiangshan <jiangshanlai@gmail.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20220127115626.14179-4-bp@alien8.de
2022-04-04 10:16:57 +02:00
Yuntao Wang
e93f39998d libbpf: Don't return -EINVAL if hdr_len < offsetofend(core_relo_len)
Since core relos is an optional part of the .BTF.ext ELF section, we should
skip parsing it instead of returning -EINVAL if header size is less than
offsetofend(struct btf_ext_header, core_relo_len).
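
The check can be pictured with the sketch below. It is an assumption-laden illustration rather than the libbpf code: struct ext_header_sketch only mimics the general shape of a header with an optional tail, and parse_core_relos_sketch() with its PARSED/SKIPPED return values is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of the first byte past MEMBER within TYPE. */
#define offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Illustrative header layout with an optional tail (assumed for the sketch). */
struct ext_header_sketch {
	uint16_t magic;
	uint8_t  version;
	uint8_t  flags;
	uint32_t hdr_len;
	uint32_t func_info_off;
	uint32_t func_info_len;
	uint32_t line_info_off;
	uint32_t line_info_len;
	/* optional part: only present in longer headers */
	uint32_t core_relo_off;
	uint32_t core_relo_len;
};

enum { SKIPPED = 0, PARSED = 1 };

static int parse_core_relos_sketch(const struct ext_header_sketch *hdr)
{
	/* A short header means the optional data is absent, not malformed. */
	if (hdr->hdr_len < offsetofend(struct ext_header_sketch, core_relo_len))
		return SKIPPED;
	/* ... a real parser would walk the relocation records here ... */
	return PARSED;
}
```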

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220404005320.1723055-1-ytcoode@gmail.com
2022-04-03 19:56:01 -07:00
Alan Maguire
579c3196b2 selftests/bpf: Add tests for uprobe auto-attach via skeleton
Add tests that verify auto-attach works for function entry/return for
local functions in a program and library functions in a library.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-6-git-send-email-alan.maguire@oracle.com
2022-04-03 19:56:01 -07:00
Alan Maguire
ba7499bc9d selftests/bpf: Add tests for u[ret]probe attach by name
Add tests that verify attaching by name for

1. local functions in a program
2. library functions in a shared object

...succeed for uprobe and uretprobes using new "func_name"
option for bpf_program__attach_uprobe_opts().  Also verify
auto-attach works where uprobe, path to binary and function
name are specified, but fails with -EOPNOTSUPP with a SEC
name that does not specify binary path/function.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-5-git-send-email-alan.maguire@oracle.com
2022-04-03 19:56:00 -07:00
Alan Maguire
39f8dc43b7 libbpf: Add auto-attach for uprobes based on section name
Now that u[ret]probes can use name-based specification, it makes
sense to add support for auto-attach based on SEC() definition.
The format proposed is

        SEC("u[ret]probe/binary:[raw_offset|[function_name[+offset]]")

For example, to trace malloc() in libc:

        SEC("uprobe/libc.so.6:malloc")

...or to trace function foo2 in /usr/bin/foo:

        SEC("uprobe//usr/bin/foo:foo2")

Auto-attach is done for all tasks (pid -1).  prog can be an absolute
path or simply a program/library name; in the latter case, we use
PATH/LD_LIBRARY_PATH to resolve the full path, falling back to
standard locations (/usr/bin:/usr/sbin or /usr/lib64:/usr/lib) if
the file is not found via environment-variable specified locations.
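
To make the section-name format concrete, here is a small parser sketch. It is not libbpf's parser: struct uprobe_spec and parse_uprobe_sec() are hypothetical, and the sketch handles only the function_name[+offset] form, not a raw offset.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type for the sketch. */
struct uprobe_spec {
	int  is_ret;           /* 1 for uretprobe, 0 for uprobe */
	char binary[256];
	char func[128];
	long offset;           /* value after '+', or 0 */
};

/* Returns 0 on success, -1 if the string does not match the format. */
static int parse_uprobe_sec(const char *sec, struct uprobe_spec *out)
{
	const char *rest, *colon, *name, *plus;
	size_t len;

	if (strncmp(sec, "uretprobe/", 10) == 0) {
		out->is_ret = 1;
		rest = sec + 10;
	} else if (strncmp(sec, "uprobe/", 7) == 0) {
		out->is_ret = 0;
		rest = sec + 7;
	} else {
		return -1;
	}

	/* The last ':' separates the binary from the function part. */
	colon = strrchr(rest, ':');
	if (!colon || colon == rest || colon[1] == '\0')
		return -1;
	len = (size_t)(colon - rest);
	if (len >= sizeof(out->binary))
		return -1;
	memcpy(out->binary, rest, len);
	out->binary[len] = '\0';

	/* Optional "+offset" suffix after the function name. */
	name = colon + 1;
	plus = strchr(name, '+');
	len = plus ? (size_t)(plus - name) : strlen(name);
	if (len == 0 || len >= sizeof(out->func))
		return -1;
	memcpy(out->func, name, len);
	out->func[len] = '\0';
	out->offset = plus ? strtol(plus + 1, NULL, 0) : 0;
	return 0;
}
```

Note that "uprobe//usr/bin/foo:foo2" parses naturally: the first slash ends the probe type and the second begins the absolute path.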

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-4-git-send-email-alan.maguire@oracle.com
2022-04-03 19:55:57 -07:00
Alan Maguire
433966e3ae libbpf: Support function name-based attach uprobes
kprobe attach is name-based, using lookups of kallsyms to translate
a function name to an address.  Currently uprobe attach is done
via an offset value as described in [1].  Extend uprobe opts
for attach to include a function name which can then be converted
into a uprobe-friendly offset.  The calculation is done in
several steps:

1. First, determine the symbol address using libelf; this gives us
   the offset as reported by objdump
2. If the function is a shared library function - and the binary
   provided is a shared library - no further work is required;
   the address found is the required address
3. Finally, if the function is local, subtract the base address
   associated with the object, retrieved from ELF program headers.

The resultant value is then added to the func_offset value passed
in to specify the uprobe attach address.  So specifying a func_offset
of 0 along with a function name "printf" will attach to printf entry.

The modes of operation supported are then

1. to attach to a local function in a binary; function "foo1" in
   "/usr/bin/foo"
2. to attach to a shared library function in a shared library -
   function "malloc" in libc.

[1] https://www.kernel.org/doc/html/latest/trace/uprobetracer.html

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-3-git-send-email-alan.maguire@oracle.com
2022-04-03 18:12:05 -07:00
Alan Maguire
1ce3a60e3c libbpf: auto-resolve programs/libraries when necessary for uprobes
bpf_program__attach_uprobe_opts() requires a binary_path argument
specifying binary to instrument.  Supporting simply specifying
"libc.so.6" or "foo" should be possible too.

Library search checks LD_LIBRARY_PATH, then /usr/lib64, /usr/lib.
This allows users to run BPF programs prefixed with
LD_LIBRARY_PATH=/path2/lib while still searching standard locations.
Similarly for non .so files, we check PATH and /usr/bin, /usr/sbin.
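
The search order can be sketched as follows. This is an illustration under stated assumptions, not the libbpf implementation: resolve_in_dirs() is a hypothetical helper that takes the environment-provided list and the fallback list as plain colon-separated strings.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Search each directory in two colon-separated lists for `name`.
 * `env_dirs` may be NULL (variable unset); `fallback_dirs` is tried afterwards.
 * Returns 0 and leaves the full path in `out` on success, -1 otherwise
 * (`out` is only meaningful on success).
 */
static int resolve_in_dirs(const char *name, const char *env_dirs,
			   const char *fallback_dirs, char *out, size_t out_sz)
{
	const char *lists[2] = { env_dirs, fallback_dirs };

	for (int i = 0; i < 2; i++) {
		const char *p = lists[i];

		while (p && *p) {
			size_t seg = strcspn(p, ":");   /* length of this entry */

			if (seg > 0) {
				int n = snprintf(out, out_sz, "%.*s/%s",
						 (int)seg, p, name);

				if (n > 0 && (size_t)n < out_sz &&
				    access(out, R_OK) == 0)
					return 0;
			}
			p += seg;
			if (*p == ':')
				p++;
		}
	}
	return -1;
}
```

A caller would pass getenv("LD_LIBRARY_PATH") with "/usr/lib64:/usr/lib" for a .so name, or getenv("PATH") with "/usr/bin:/usr/sbin" otherwise.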

Path determination will be useful for auto-attach of BPF uprobe programs
using SEC() definition.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1648654000-21758-2-git-send-email-alan.maguire@oracle.com
2022-04-03 18:11:47 -07:00
Haiyue Wang
66df0fdb59 bpf: Correct the comment for BTF kind bitfield
The commit 8fd886911a ("bpf: Add BTF_KIND_FLOAT to uapi") has extended
the BTF kind bitfield from 4 to 5 bits, correct the comment.
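
A short sketch shows why the width matters. The macro and function names below are local to the example, and the bit position (kind starting at bit 24) and the FLOAT kind value of 16 are stated here as assumptions about the uapi layout rather than quoted from it.

```c
#include <stdint.h>

/* Local helpers for illustration; not the uapi macro names. */
#define KIND_4BIT(info) (((info) >> 24) & 0x0fu)   /* old width, too narrow */
#define KIND_5BIT(info) (((info) >> 24) & 0x1fu)   /* bits 24-28 */

#define SKETCH_KIND_FLOAT 16u   /* assumed value of BTF_KIND_FLOAT */

/* Pack a kind and a vlen into one 32-bit info word. */
static uint32_t make_info(uint32_t kind, uint32_t vlen)
{
	return (kind << 24) | (vlen & 0xffffu);
}
```

A kind of 16 does not fit in four bits, so a 4-bit mask would read it back as 0, while the 5-bit mask recovers it.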

Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220403115327.205964-1-haiyue.wang@intel.com
2022-04-03 17:06:52 -07:00
Yuntao Wang
9bbad6dab8 selftests/bpf: Fix cd_flavor_subdir() of test_progs
Currently, when we run test_progs with just executable file name, for
example 'PATH=. test_progs-no_alu32', cd_flavor_subdir() will not check
if test_progs is running as a flavored test runner and switch into
corresponding sub-directory.

This will cause test_progs-no_alu32 executed by the
'PATH=. test_progs-no_alu32' command to run in the wrong directory and
load the wrong BPF objects.
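
The intended behavior can be sketched as a small helper. runner_flavor() is a hypothetical name and this is not the test_progs code; it only shows that the flavor lookup has to work whether or not the executable name contains a directory component.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the flavor suffix of a runner name such as "test_progs-no_alu32",
 * or NULL if there is none. Works with or without a directory component.
 */
static const char *runner_flavor(const char *exec_name)
{
	const char *base = strrchr(exec_name, '/');

	/* No '/' at all when the binary was found via PATH. */
	base = base ? base + 1 : exec_name;
	base = strrchr(base, '-');
	return base ? base + 1 : NULL;
}
```

Taking the basename first also keeps a '-' in a directory name from being mistaken for a flavor separator.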

Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220403135245.1713283-1-ytcoode@gmail.com
2022-04-03 17:01:48 -07:00
Haowen Bai
f6d60facd9 selftests/bpf: Return true/false (not 1/0) from bool functions
Return boolean values ("true" or "false") instead of 1 or 0 from bool
functions.  This fixes the following warnings from coccicheck:

./tools/testing/selftests/bpf/progs/test_xdp_noinline.c:567:9-10: WARNING:
return of 0/1 in function 'get_packet_dst' with return type bool
./tools/testing/selftests/bpf/progs/test_l4lb_noinline.c:221:9-10: WARNING:
return of 0/1 in function 'get_packet_dst' with return type bool

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/1648779354-14700-1-git-send-email-baihaowen@meizu.com
2022-04-03 16:42:43 -07:00
Nikolay Borisov
e299bcd4d1 selftests/bpf: Fix vfs_link kprobe definition
Commit 6521f89170 ("namei: prepare for idmapped mounts") changed
vfs_link's prototype, but the kprobe definition in the profiler
selftest was not updated to match. As a result, all arguments after
the first are now stored in different registers, which means the
selftest has been broken ever since. Fix it by updating the kprobe
definition accordingly.

Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220331140949.1410056-1-nborisov@suse.com
2022-04-03 16:41:24 -07:00
Linus Torvalds
8b5656bc4e A set of x86 fixes and updates:
   - Make the prctl() for enabling dynamic XSTATE components correct so it
     adds the newly requested feature to the permission bitmap instead of
     overwriting it. Add a selftest which validates that.
 
   - Unroll string MMIO for encrypted SEV guests as the hypervisor cannot
     emulate it.
 
   - Handle supervisor states correctly in the FPU/XSTATE code so it takes
     the feature set of the fpstate buffer into account. The feature sets
     can differ between host and guest buffers. Guest buffers do not contain
     supervisor states. So far this was not an issue, but with enabling
     PASID it needs to be handled in the buffer offset calculation and in
     the permission bitmaps.
 
   - Avoid a gazillion of repeated CPUID invocations by caching the values
     early in the FPU/XSTATE code.
 
   - Enable CONFIG_WERROR for X86.
 
   - Make the X86 defconfigs more useful by adapting them to Y2022 reality.

Merge tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 fixes from Thomas Gleixner:
 "A set of x86 fixes and updates:

   - Make the prctl() for enabling dynamic XSTATE components correct so
     it adds the newly requested feature to the permission bitmap
     instead of overwriting it. Add a selftest which validates that.

   - Unroll string MMIO for encrypted SEV guests as the hypervisor
     cannot emulate it.

   - Handle supervisor states correctly in the FPU/XSTATE code so it
     takes the feature set of the fpstate buffer into account. The
     feature sets can differ between host and guest buffers. Guest
     buffers do not contain supervisor states. So far this was not an
     issue, but with enabling PASID it needs to be handled in the buffer
     offset calculation and in the permission bitmaps.

   - Avoid a gazillion of repeated CPUID invocations by caching the
     values early in the FPU/XSTATE code.

   - Enable CONFIG_WERROR in x86 defconfig.

   - Make the X86 defconfigs more useful by adapting them to Y2022
     reality"

* tag 'x86-urgent-2022-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86/fpu/xstate: Consolidate size calculations
  x86/fpu/xstate: Handle supervisor states in XSTATE permissions
  x86/fpu/xsave: Handle compacted offsets correctly with supervisor states
  x86/fpu: Cache xfeature flags from CPUID
  x86/fpu/xsave: Initialize offset/size cache early
  x86/fpu: Remove unused supervisor only offsets
  x86/fpu: Remove redundant XCOMP_BV initialization
  x86/sev: Unroll string mmio with CC_ATTR_GUEST_UNROLL_STRING_IO
  x86/config: Make the x86 defconfigs a bit more usable
  x86/defconfig: Enable WERROR
  selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
  x86/fpu/xstate: Fix the ARCH_REQ_XCOMP_PERM implementation
2022-04-03 12:15:47 -07:00
Nikolay Aleksandrov
692930cc43 selftests: net: fix nexthop warning cleanup double ip typo
I made a stupid typo when adding the nexthop route warning selftest and
added both $IP and ip after it (double ip) on the cleanup path. The
error doesn't show up when running the test, but obviously it doesn't
clean up properly after it.

Fixes: 392baa339c ("selftests: net: add delete nexthop route warning test")
Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-03 13:09:05 +01:00
David Woodhouse
e467b0de82 KVM: x86: Test case for TSC scaling and offset sync
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20220225145304.36166-4-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02 05:41:20 -04:00
David Woodhouse
a29833e36b KVM: x86/xen: Update self test for Xen PV timers
Add test cases for timers in the past, and reading the status of a timer
which has already fired.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20220309143835.253911-3-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02 05:41:18 -04:00
David Woodhouse
25eaeebe71 KVM: x86/xen: Add self tests for KVM_XEN_HVM_CONFIG_EVTCHN_SEND
Test a combination of event channel send, poll and timer operations.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Message-Id: <20220303154127.202856-18-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02 05:41:18 -04:00
Boris Ostrovsky
1a65105a5a KVM: x86/xen: handle PV spinlocks slowpath
Add support for SCHEDOP_poll hypercall.

This implementation is optimized for polling for a single channel, which
is what Linux does. Polling for multiple channels is not especially
efficient (and has not been tested).

The PV spinlocks slow path uses this hypercall, and explicitly crashes if
it's not supported.

[ dwmw2: Rework to use kvm_vcpu_halt(), not supported for 32-bit guests ]

Signed-off-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20220303154127.202856-17-dwmw2@infradead.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02 05:41:17 -04:00
Oliver Upton
6c2fa8b20d selftests: KVM: Test KVM_X86_QUIRK_FIX_HYPERCALL_INSN
Add a test that asserts KVM rewrites guest hypercall instructions to
match the running architecture (VMCALL on VMX, VMMCALL on SVM).
Additionally, test that with the quirk disabled, KVM no longer rewrites
guest instructions and instead injects a #UD.

Signed-off-by: Oliver Upton <oupton@google.com>
Message-Id: <20220316005538.2282772-3-oupton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-04-02 05:41:10 -04:00
Yauheni Kaliuta
891663ace7 bpf, test_offload.py: Skip base maps without names
The test fails:

  # ./test_offload.py
  [...]
  Test bpftool bound info reporting (own ns)...
  FAIL: 3 BPF maps loaded, expected 2
    File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 1177, in <module>
      check_dev_info(False, "")
    File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 645, in check_dev_info
      maps = bpftool_map_list(expected=2, ns=ns)
    File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 190, in bpftool_map_list
      fail(True, "%d BPF maps loaded, expected %d" %
    File "/root/bpf-next/tools/testing/selftests/bpf/./test_offload.py", line 86, in fail
      tb = "".join(traceback.extract_stack().format())

Some base maps do not have names and they cannot be added due to compatibility
with older kernels, see [0]. So, just skip the unnamed maps.

  [0] https://lore.kernel.org/bpf/CAEf4BzY66WPKQbDe74AKZ6nFtZjq5e+G3Ji2egcVytB9R6_sGQ@mail.gmail.com/

Signed-off-by: Yauheni Kaliuta <ykaliuta@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220329081100.9705-1-ykaliuta@redhat.com
2022-04-01 23:06:19 +02:00
Eyal Birger
fe4625d8b0 selftests/bpf: Remove unused variable from bpf_sk_assign test
The variable was never used in bpf_sk_assign_test(), and was removed from handle_{tcp,udp}()
in commit 0b9ad56b1e ("selftests/bpf: Use SOCKMAP for server sockets in
bpf_sk_assign test").

Fixes: 0b9ad56b1e ("selftests/bpf: Use SOCKMAP for server sockets in bpf_sk_assign test")
Signed-off-by: Eyal Birger <eyal.birger@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220329154914.3718658-1-eyal.birger@gmail.com
2022-04-01 22:35:29 +02:00
Tanu M
7e2022af79 perf python: Convert tracepoint.py example to python3
Convert the tracepoint.py file to python3 as many of the files in
tools/perf are already written in python3.

Committer testing:

  # export PYTHONPATH=/tmp/build/perf/python/
  # python3 ~acme/git/perf/tools/perf/python/tracepoint.py | head
  time 67394457376909 prev_comm=swapper/12 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=gnome-terminal- next_pid=3313 next_prio=120
  time 67394457807669 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120
  time 67394457811859 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120
  time 67394457824929 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120
  time 67394457831899 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120
  time 67394457842299 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120
  time 67394457844179 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120
  time 67394457853879 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120
  time 67394457856339 prev_comm=swapper/13 prev_pid=0 prev_prio=120 prev_state=0x0 ==> next_comm=python3 next_pid=1485930 next_prio=120
  time 67394457865659 prev_comm=python3 prev_pid=1485930 prev_prio=120 prev_state=0x1 ==> next_comm=swapper/13 next_pid=0 next_prio=120
  Traceback (most recent call last):
    File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 48, in <module>
      main()
    File "/var/home/acme/git/perf/tools/perf/python/tracepoint.py", line 37, in main
      print("time %u prev_comm=%s prev_pid=%d prev_prio=%d prev_state=0x%x ==> next_comm=%s next_pid=%d next_prio=%d" % (
  BrokenPipeError: [Errno 32] Broken pipe
  #

Signed-off-by: Tanu M <tanu235m@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/linux-perf-users/CAPS78prawYzRZnyhWjgOnGw4EwoswNwztvfZFdCOPOydFzVwzQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Haowen Bai
f717d89a2b perf evlist: Directly return instead of using local ret variable
Addresses this coccinelle warning:

  ./tools/perf/util/evlist.c:1333:5-8: Unneeded variable: "err". Return
  "- ENOMEM" on line 1358

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/1648432532-23151-1-git-send-email-baihaowen@meizu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Ian Rogers
da0bfb9fdf perf cpumap: More cpu map reuse by merge.
perf_cpu_map__merge() will reuse one of its arguments if they are equal or
the other argument is NULL.

The arguments could be reused if it is known one set of values is a
subset of the other.

For example, a map of 0-1 and a map of just 0 when merged yields the map
of 0-1.

Currently a new map is created rather than adding a reference count to
the original 0-1 map.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220328232648.2127340-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Ian Rogers
c3ad8d23bc perf cpumap: Add is_subset function
Returns true if the second argument is a subset of the first.
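
One way to implement such a check is a single pass over two sorted arrays. The sketch below is an assumption, not the perf code: struct cpu_set_sketch and is_subset_sketch() are hypothetical, and the CPU lists are assumed to be sorted ascending without duplicates.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flat representation: CPU numbers sorted ascending, no duplicates. */
struct cpu_set_sketch {
	const int *cpus;
	size_t nr;
};

/* True if every CPU in b also appears in a (b is a subset of a). */
static bool is_subset_sketch(const struct cpu_set_sketch *a,
			     const struct cpu_set_sketch *b)
{
	size_t i = 0, j = 0;

	if (b->nr > a->nr)
		return false;
	while (i < a->nr && j < b->nr) {
		if (a->cpus[i] > b->cpus[j])
			return false;   /* b[j] cannot appear later in sorted a */
		if (a->cpus[i] == b->cpus[j])
			j++;
		i++;
	}
	return j == b->nr;
}
```

With a helper like this, a merge of 0-1 with just 0 can detect that the second map adds nothing and hand back the first map instead of building a new one.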

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220328232648.2127340-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Ian Rogers
0df6ade711 perf evlist: Rename cpus to user_requested_cpus
evlist contains cpus and all_cpus. all_cpus is the union of the cpu maps
of all evsels.

For non-task targets, cpus is set to be cpus requested from the command
line, defaulting to all online cpus if no cpus are specified.

For an uncore event, all_cpus may be just CPU 0 or every online CPU.

This causes all_cpus to have fewer values than the cpus variable which
is confusing given the 'all' in the name.

To try to make the behavior clearer, rename cpus to user_requested_cpus
and add comments on the two struct variables.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220328232648.2127340-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
John Garry
d4ff926592 perf tools: Stop depending on .git files for building PERF-VERSION-FILE
This essentially reverts commit c72e3f04b4 ("tools/perf/build:
Speed up git-version test on re-make") and commit 4e666cdb06
("perf tools: Fix dependency for version file creation")

In commit c72e3f04b4 ("tools/perf/build: Speed up git-version test
on re-make"), a makefile dependency on .git/HEAD was added. The
background is that running PERF-VERSION-FILE is relatively slow, and
commands like "git describe" are particularly slow.

In commit 4e666cdb06 ("perf tools: Fix dependency for version file
creation"), an additional dependency on .git/ORIG_HEAD was added, as
.git/HEAD may not change for "git reset --hard HEAD^" command. However,
depending on whether we're on a branch or not, a "git cherry-pick" may
not lead to the version being updated.

As discussed with the git community in [0], using git internal files for
dependencies is not reliable. Commit 4e666cdb06 also breaks some build
scenarios [1].

As mentioned, c72e3f04b4 ("tools/perf/build: Speed up git-version
test on re-make") was added to speed up the build. However in commit
7572733b84 ("perf tools: Fix version kernel tag") we removed the
call to "git describe", so just revert Makefile.perf back to same as pre
c72e3f04b4 ("tools/perf/build: Speed up git-version test on
re-make") and the build should not be so slow, as below:

Pre 7572733b84:

  $> time util/PERF-VERSION-GEN
    PERF_VERSION = 5.17.rc8.g4e666cdb06ee

  real    0m0.110s
  user    0m0.091s
  sys     0m0.019s

Post 7572733b84:

  $> time util/PERF-VERSION-GEN
    PERF_VERSION = 5.17.rc8.g7572733b8499

  real    0m0.039s
  user    0m0.036s
  sys     0m0.007s

[0] https://lore.kernel.org/git/87wngkpddp.fsf@igel.home/T/#m4a4dd6de52fdbe21179306cd57b3761eb07f45f8
[1] https://lore.kernel.org/linux-perf-users/20220329093120.4173283-1-matthieu.baerts@tessares.net/T/#u

Committer testing:

After a fresh rebuild using 'make -C tools/perf O=/tmp/build/perf install-bin':

  $ perf -v
  perf version 5.17.g162f9db407b6
  $ git log --oneline -1
  162f9db407b6a6e5 (HEAD -> perf/core) perf tools: Stop depending on .git files for building PERF-VERSION-FILE
  $

Now using a detached tarball, i.e. outside the kernel source tree:

  $ ls -la perf*tar
  ls: cannot access 'perf*tar': No such file or directory
  $ make perf-tar-src-pkg
    TAR
    PERF_VERSION = 5.17.g31d10b3ef133
  $ ls -la perf*tar
  -rw-r--r--. 1 acme acme 22241280 Mar 30 13:26 perf-5.17.0.tar
  $ mv perf-5.17.0.tar /tmp
  $ cd /tmp
  $ tar xf perf-5.17.0.tar
  $ cd perf-5.17.0/
  $ make -C tools/perf |& tail
    CC      util/pmu.o
    CC      util/pmu-flex.o
    CC      util/expr-flex.o
    CC      util/expr.o
    LD      util/scripting-engines/perf-in.o
    LD      util/intel-pt-decoder/perf-in.o
    LD      util/perf-in.o
    LD      perf-in.o
    LINK    perf
  make: Leaving directory '/tmp/perf-5.17.0/tools/perf'
  $ tools/perf/perf -v
  perf version 5.17.g31d10b3ef133
  $ pwd
  /tmp/perf-5.17.0
  $ cat PERF-VERSION-FILE
  #define PERF_VERSION "5.17.g31d10b3ef133"
  $

Fixes: 4e666cdb06 ("perf tools: Fix dependency for version file creation")
Reported-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/1648635774-14581-1-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
5ced812435 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  991625f3dd ("x86/ibt: Add IBT feature, MSR and #CP handling")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YkSCx2kr4ambH+Qe@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
f444b2d15f tools headers UAPI: Sync drm/i915_drm.h with the kernel sources
To pick up the changes in:

  caa574ffc4 ("drm/i915/uapi: document behaviour for DG2 64K support")

That don't add any new ioctl, so no changes in tooling.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/drm/i915_drm.h' differs from latest version at 'include/uapi/drm/i915_drm.h'
  diff -u tools/include/uapi/drm/i915_drm.h include/uapi/drm/i915_drm.h

Cc: Lucas De Marchi <lucas.demarchi@intel.com>
Cc: Matthew Auld <matthew.auld@intel.com>
Link: http://lore.kernel.org/lkml/YkSChHqaOApscFQ0@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
7ceda0cfca tools headers UAPI: Sync linux/kvm.h with the kernel sources
To pick the changes in:

  6d8491910f ("KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2")
  ef11c9463a ("KVM: s390: Add vm IOCTL for key checked guest absolute memory access")
  e9e9feebcb ("KVM: s390: Add optional storage key checking to MEMOP IOCTL")

That just rebuilds perf, as these patches don't add any new KVM ioctl to
be harvested for the 'perf trace' ioctl syscall argument
beautifiers.

This is also by now used by tools/testing/selftests/kvm/, a simple test
build succeeded.

This silences this perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/linux/kvm.h' differs from latest version at 'include/uapi/linux/kvm.h'
  diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h

Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Cc: Oliver Upton <oupton@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Link: http://lore.kernel.org/lkml/YkSCOWHQdir1lhdJ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
8db38afd12 tools kvm headers arm64: Update KVM headers from the kernel sources
To pick the changes from:

  34739fd95f ("KVM: arm64: Indicate SYSTEM_RESET2 in kvm_run::system_event flags field")
  583cda1b0e ("KVM: arm64: Refuse to run VCPU if the PMU doesn't match the physical CPU")

That doesn't cause any changes in tooling (when built on x86), it only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: Marc Zyngier <maz@kernel.org>
Cc: Will Deacon <will@kernel.org>
Cc: Alexandru Elisei <alexandru.elisei@arm.com>
Link: https://lore.kernel.org/lkml/YkSB4Q7kWmnaqeZU@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
672b259fed tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes in:

  991625f3dd ("x86/ibt: Add IBT feature, MSR and #CP handling")

Addressing these tools/perf build warnings:

    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h
    Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'

That makes the beautification scripts pick up some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2022-03-29 16:23:07.678740040 -0300
  +++ after	2022-03-29 16:23:16.960978524 -0300
  @@ -220,6 +220,13 @@
   	[0x00000669] = "MC6_DEMOTION_POLICY_CONFIG",
   	[0x00000680] = "LBR_NHM_FROM",
   	[0x00000690] = "CORE_PERF_LIMIT_REASONS",
  +	[0x000006a0] = "IA32_U_CET",
  +	[0x000006a2] = "IA32_S_CET",
  +	[0x000006a4] = "IA32_PL0_SSP",
  +	[0x000006a5] = "IA32_PL1_SSP",
  +	[0x000006a6] = "IA32_PL2_SSP",
  +	[0x000006a7] = "IA32_PL3_SSP",
  +	[0x000006a8] = "IA32_INT_SSP_TAB",
   	[0x000006B0] = "GFX_PERF_LIMIT_REASONS",
   	[0x000006B1] = "RING_PERF_LIMIT_REASONS",
   	[0x000006c0] = "LBR_NHM_TO",
  $

And this gets rebuilt:

  CC      /tmp/build/perf/trace/beauty/tracepoints/x86_msr.o
  LD      /tmp/build/perf/trace/beauty/tracepoints/perf-in.o
  LD      /tmp/build/perf/trace/beauty/perf-in.o
  CC      /tmp/build/perf/util/amd-sample-raw.o
  LD      /tmp/build/perf/util/perf-in.o
  LD      /tmp/build/perf/perf-in.o
  LINK    /tmp/build/perf/perf

Now one can trace systemwide asking to see backtraces to where those
MSRs are being read/written with:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr>=IA32_U_CET && msr<=IA32_INT_SSP_TAB"
  Using CPUID AuthenticAMD-25-21-0
  0x6a0
  0x6a8
  New filter for msr:read_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  0x6a0
  0x6a8
  New filter for msr:write_msr: (msr>=0x6a0 && msr<=0x6a8) && (common_pid != 597499 && common_pid != 3313)
  mmap size 528384B
  ^C#
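
The name to number translation visible in the 'New filter' lines can be
sketched as below. This is an illustration only, not perf's
implementation: the table is a hand-copied subset of the values shown in
this message and expand_filter() is a hypothetical helper.

```python
import re

# Subset of the MSR name table, copied from the values shown in this message.
MSR_NAMES = {
    "IA32_SPEC_CTRL": 0x48,
    "IA32_U_CET": 0x6A0,
    "IA32_S_CET": 0x6A2,
    "IA32_INT_SSP_TAB": 0x6A8,
}

IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")


def expand_filter(expr, table=MSR_NAMES):
    """Replace known symbolic names in a filter expression with hex numbers."""

    def substitute(match):
        name = match.group(0)
        # Field names such as 'msr' are not in the table and pass through.
        return hex(table[name]) if name in table else name

    return IDENTIFIER.sub(substitute, expr)


print(expand_filter("msr>=IA32_U_CET"))
print(expand_filter("msr==IA32_SPEC_CTRL"))
```

Names that are not in the table are left alone, so the field name and any
already numeric operands survive the substitution unchanged.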

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
     0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule ([kernel.kallsyms])
                                       futex_wait_queue_me ([kernel.kallsyms])
                                       futex_wait ([kernel.kallsyms])
                                       do_futex ([kernel.kallsyms])
                                       __x64_sys_futex ([kernel.kallsyms])
                                       do_syscall_64 ([kernel.kallsyms])
                                       entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
                                       __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
     0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
                                       do_trace_write_msr ([kernel.kallsyms])
                                       do_trace_write_msr ([kernel.kallsyms])
                                       __switch_to_xtra ([kernel.kallsyms])
                                       __switch_to ([kernel.kallsyms])
                                       __schedule ([kernel.kallsyms])
                                       schedule_idle ([kernel.kallsyms])
                                       do_idle ([kernel.kallsyms])
                                       cpu_startup_entry ([kernel.kallsyms])
                                       secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/YkNd7Ky+vi7H2Zl2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
6d05e13985 tools headers UAPI: Sync asm-generic/mman-common.h with the kernel
To pick the changes from:

  9457056ac4 ("mm: madvise: MADV_DONTNEED_LOCKED")

That result in these changes in the tools:

  $ diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h
  --- tools/include/uapi/asm-generic/mman-common.h	2022-03-29 16:17:50.461694991 -0300
  +++ include/uapi/asm-generic/mman-common.h	2022-03-27 19:12:48.923250468 -0300
  @@ -75,6 +75,8 @@
   #define MADV_POPULATE_READ	22	/* populate (prefault) page tables readable */
   #define MADV_POPULATE_WRITE	23	/* populate (prefault) page tables writable */

  +#define MADV_DONTNEED_LOCKED	24	/* like DONTNEED, but drop locked pages too */
  +
   /* compatibility flags */
   #define MAP_FILE	0

  $ tools/perf/trace/beauty/madvise_behavior.sh > before
  $ cp include/uapi/asm-generic/mman-common.h tools/include/uapi/asm-generic/mman-common.h
  $ tools/perf/trace/beauty/madvise_behavior.sh > after
  $ diff -u before after
  --- before	2022-03-29 16:18:04.091044244 -0300
  +++ after	2022-03-29 16:18:11.692238906 -0300
  @@ -20,6 +20,7 @@
   	[21] = "PAGEOUT",
   	[22] = "POPULATE_READ",
   	[23] = "POPULATE_WRITE",
  +	[24] = "DONTNEED_LOCKED",
   	[100] = "HWPOISON",
   	[101] = "SOFT_OFFLINE",
   };
  $

I.e. now when madvise gets those behaviours as args, 'perf trace' will
be able to translate from the number to a human readable string and to
use the strings in tracepoint filter expressions.
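
What the generated table amounts to can be sketched as follows. This is
not the madvise_behavior.sh script itself, just a minimal Python
illustration under the assumption that the table is derived from the
MADV_ defines in the header; the excerpt and the helper names are made up
for the example.

```python
import re

# Excerpt of the header after the sync, taken from the diff above.
HEADER = """\
#define MADV_POPULATE_READ   22  /* populate (prefault) page tables readable */
#define MADV_POPULATE_WRITE  23  /* populate (prefault) page tables writable */

#define MADV_DONTNEED_LOCKED 24  /* like DONTNEED, but drop locked pages too */

/* compatibility flags */
#define MAP_FILE 0
"""

# Only MADV_ defines with a decimal value are of interest here.
DEFINE_RE = re.compile(r"^#define\s+MADV_(\w+)\s+(\d+)\b", re.MULTILINE)


def madvise_table(header_text):
    """Build {number: name} with the MADV_ prefix stripped."""
    return {int(num): name for name, num in DEFINE_RE.findall(header_text)}


def beautify(behavior, table):
    """Number to string, falling back to the raw number when unknown."""
    return table.get(behavior, str(behavior))


table = madvise_table(HEADER)
print(beautify(24, table))
print(beautify(99, table))
```

Falling back to the raw number keeps the output usable when the tools
copy of the header is older than the running kernel.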

This addresses the following perf build warning:

  Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/mman-common.h' differs from latest version at 'include/uapi/asm-generic/mman-common.h'
  diff -u tools/include/uapi/asm-generic/mman-common.h include/uapi/asm-generic/mman-common.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YkNcUfeh795yqGMV@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:35 -03:00
Arnaldo Carvalho de Melo
9a195da42f perf beauty: Update copy of linux/socket.h with the kernel sources
To pick the changes in:

  a6a6fe27ba ("net/smc: Dynamic control handshake limitation by socket options")

This automagically adds support for the SOL_SMC socket level:

  $ diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h
  --- tools/perf/trace/beauty/include/linux/socket.h	2022-03-14 17:55:22.277148656 -0300
  +++ include/linux/socket.h	2022-03-27 19:12:48.908250063 -0300
  @@ -366,6 +366,7 @@
   #define SOL_XDP		283
   #define SOL_MPTCP	284
   #define SOL_MCTP	285
  +#define SOL_SMC		286

   /* IPX options */
   #define IPX_TYPE	1
  $ tools/perf/trace/beauty/socket.sh > before
  $ cp include/linux/socket.h tools/perf/trace/beauty/include/linux/socket.h
  $ tools/perf/trace/beauty/socket.sh > after
  $ diff -u before after
  --- before	2022-03-29 11:47:56.390258780 -0300
  +++ after	2022-03-29 11:48:03.158436189 -0300
  @@ -67,6 +67,7 @@
   	[283] = "XDP",
   	[284] = "MPTCP",
   	[285] = "MCTP",
  +	[286] = "SMC",
   };

   DEFINE_STRARRAY(socket_level, "SOL_");
  $

This will allow 'perf trace' to translate 286 into "SMC" as is done with
the other socket levels:

  # perf trace -e setsockopt --max-events 4
   344.916 ( 0.003 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 5, optval: 0x7f5797b9c4f8, optlen: 4) = 0
   344.920 ( 0.002 ms): Socket Thread/3816 setsockopt(fd: 168, level: TCP, optname: 6, optval: 0x7f5797b9c4f4, optlen: 4) = 0
  1246.974 ( 0.010 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 11, optval: 0x7ffc96cd7244, optlen: 4) = 0
  1246.986 ( 0.002 ms): systemd-resolv/1128 setsockopt(fd: 22, level: IP, optname: 8, optval: 0x7ffc96cd7264, optlen: 4) = 0

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/trace/beauty/include/linux/socket.h' differs from latest version at 'include/linux/socket.h'
  diff -u tools/perf/trace/beauty/include/linux/socket.h include/linux/socket.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: D. Wythe <alibuda@linux.alibaba.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YkMdpzzjPu5VZtW3@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:34 -03:00
Arnaldo Carvalho de Melo
4d4d00dd32 perf tools: Update copy of libbpf's hashmap.c
To pick the changes in:

  fba60b171a ("libbpf: Use IS_ERR_OR_NULL() in hashmap__free()")

That don't entail any changes in tools/perf.

This addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/perf/util/hashmap.h' differs from latest version at 'tools/lib/bpf/hashmap.h'
  diff -u tools/perf/util/hashmap.h tools/lib/bpf/hashmap.h

Not a kernel ABI, it's just that this uses the mechanism in place for
checking kernel ABI files drift.

Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mauricio Vásquez <mauricio@kinvolk.io>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/YkMb2SAIai2VeuUD@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:34 -03:00
Ian Rogers
8a96f454f5 perf stat: Avoid SEGV if core.cpus isn't set
Passing NULL to perf_cpu_map__max doesn't make sense as there is no
valid max. Avoid this problem by null checking in
perf_stat_init_aggr_mode.
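
The shape of the check, as a sketch only: cpu_map_max() and
init_aggr_mode() are hypothetical stand-ins written in Python, a list
stands in for the CPU map, and falling back to 0 when no map is set is an
assumption of this example rather than something stated above.

```python
def cpu_map_max(cpus):
    """Highest CPU number in a map; there is no valid answer without a map."""
    if not cpus:
        raise ValueError("no CPU map, so there is no valid max")
    return max(cpus)


def init_aggr_mode(cpus):
    """Size a per-CPU aggregation table, tolerating an unset CPU map."""
    # Check in the caller instead of handing a missing map to cpu_map_max().
    nr = cpu_map_max(cpus) if cpus else 0
    return [None] * (nr + 1)  # one slot per CPU number from 0 to nr


print(len(init_aggr_mode([0, 1, 2, 3])))
print(len(init_aggr_mode(None)))
```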

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: John Garry <john.garry@huawei.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulose <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220328062414.1893550-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-04-01 16:19:34 -03:00
Yinan Zhang
d8b7b3fa9f tools/vm/page_owner_sort.c: remove -c option
The -c option is used to cull by stacktrace.  Now that the --cull option
has been added to page_owner_sort.c, culling by stacktrace is one of the
functions of "--cull", so there is no need for an extra parameter.  Remove
the -c option.

Remove parsing of -c when parsing parameters and remove "-c" from the usage.

This work is coauthored by
        Shenghong Han
        Yixuan Cao
        Chongxi Zhao
        Jiajian Ye
        Yuhong Feng
        Yongqiang Liu

Link: https://lkml.kernel.org/r/20220326085920.1470081-1-zhangyinan2019@email.szu.edu.cn
Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Cc: Georgi Djakov <georgi.djakov@linaro.org>
Cc: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Yongqiang Liu <liuyongqiang13@huawei.com>
Cc: Yuhong Feng <yuhongf@szu.edu.cn>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-04-01 11:46:09 -07:00
Nikolay Aleksandrov
392baa339c selftests: net: add delete nexthop route warning test
Add a test which causes a WARNING on kernels which treat a
nexthop route like a normal route when comparing for deletion and a
device is specified. That is, a route is found but we hit a warning while
matching it. The warning is from fib_info_nh() in include/net/nexthop.h
because we run it on a fib_info with nexthop object. The call chain is:
 inet_rtm_delroute -> fib_table_delete -> fib_nh_match (called with a
nexthop fib_info and also with fc_oif set thus calling fib_info_nh on
the fib_info and triggering the warning).

Repro steps:
 $ ip nexthop add id 12 via 172.16.1.3 dev veth1
 $ ip route add 172.16.101.1/32 nhid 12
 $ ip route delete 172.16.101.1/32 dev veth1

Signed-off-by: Nikolay Aleksandrov <razor@blackwall.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-01 12:09:17 +01:00
Linus Torvalds
f4f5d7cfb2 virtio: features, fixes
vdpa generic device type support
 More virtio hardening for broken devices
 On the same theme, revert some virtio hotplug hardening patches -
 they were misusing some interrupt flags, will have to be reverted.
 RSS support in virtio-net
 max device MTU support in mlx5 vdpa
 akcipher support in virtio-crypto
 shared IRQ support in ifcvf vdpa
 a minor performance improvement in vhost
 Enable virtio mem for ARM64
 beginnings of advance dma support
 
 Cleanups, fixes all over the place.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio updates from Michael Tsirkin:

 - vdpa generic device type support

 - more virtio hardening for broken devices (but on the same theme,
   revert some virtio hotplug hardening patches - they were misusing
   some interrupt flags and had to be reverted)

 - RSS support in virtio-net

 - max device MTU support in mlx5 vdpa

 - akcipher support in virtio-crypto

 - shared IRQ support in ifcvf vdpa

 - a minor performance improvement in vhost

 - enable virtio mem for ARM64

 - beginnings of advance dma support

 - cleanups, fixes all over the place

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: (33 commits)
  vdpa/mlx5: Avoid processing works if workqueue was destroyed
  vhost: handle error while adding split ranges to iotlb
  vdpa: support exposing the count of vqs to userspace
  vdpa: change the type of nvqs to u32
  vdpa: support exposing the config size to userspace
  vdpa/mlx5: re-create forwarding rules after mac modified
  virtio: pci: check bar values read from virtio config space
  Revert "virtio_pci: harden MSI-X interrupts"
  Revert "virtio-pci: harden INTX interrupts"
  drivers/net/virtio_net: Added RSS hash report control.
  drivers/net/virtio_net: Added RSS hash report.
  drivers/net/virtio_net: Added basic RSS support.
  drivers/net/virtio_net: Fixed padded vheader to use v1 with hash.
  virtio: use virtio_device_ready() in virtio_device_restore()
  tools/virtio: compile with -pthread
  tools/virtio: fix after premapped buf support
  virtio_ring: remove flags check for unmap packed indirect desc
  virtio_ring: remove flags check for unmap split indirect desc
  virtio_ring: rename vring_unmap_state_packed() to vring_unmap_extra_packed()
  net/mlx5: Add support for configuring max device MTU
  ...
2022-03-31 13:57:15 -07:00
Linus Torvalds
b8321ed4a4 Kbuild updates for v5.18
- Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
    additional flags to be passed to user-space programs.
 
  - Fix missing fflush() bugs in Kconfig and fixdep
 
  - Fix a minor bug in the comment format of the .config file
 
  - Make kallsyms ignore llvm's local labels, .L*
 
  - Fix UAPI compile-test for cross-compiling with Clang
 
  - Extend the LLVM= syntax to support LLVM=<suffix> form for using a
    particular version of LLVM, and LLVM=<prefix> form for using custom
    LLVM in a particular directory path.
 
  - Clean up Makefiles

Merge tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Add new environment variables, USERCFLAGS and USERLDFLAGS to allow
   additional flags to be passed to user-space programs.

 - Fix missing fflush() bugs in Kconfig and fixdep

 - Fix a minor bug in the comment format of the .config file

 - Make kallsyms ignore llvm's local labels, .L*

 - Fix UAPI compile-test for cross-compiling with Clang

 - Extend the LLVM= syntax to support LLVM=<suffix> form for using a
   particular version of LLVM, and LLVM=<prefix> form for using custom
   LLVM in a particular directory path.

 - Clean up Makefiles

* tag 'kbuild-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild:
  kbuild: Make $(LLVM) more flexible
  kbuild: add --target to correctly cross-compile UAPI headers with Clang
  fixdep: use fflush() and ferror() to ensure successful write to files
  arch: syscalls: simplify uapi/kapi directory creation
  usr/include: replace extra-y with always-y
  certs: simplify empty certs creation in certs/Makefile
  certs: include certs/signing_key.x509 unconditionally
  kallsyms: ignore all local labels prefixed by '.L'
  kconfig: fix missing '# end of' for empty menu
  kconfig: add fflush() before ferror() check
  kbuild: replace $(if A,A,B) with $(or A,B)
  kbuild: Add environment variables for userprogs flags
  kbuild: unify cmd_copy and cmd_shipped
2022-03-31 11:59:03 -07:00
Linus Torvalds
2975dbdc39 Networking fixes for 5.18-rc1 and rethook patches.
Features:
 
  - kprobes: rethook: x86: replace kretprobe trampoline with rethook
 
 Current release - regressions:
 
  - sfc: avoid null-deref on systems without NUMA awareness
    in the new queue sizing code
 
 Current release - new code bugs:
 
  - vxlan: do not feed vxlan_vnifilter_dump_dev with non-vxlan devices
 
  - eth: lan966x: fix null-deref on PHY pointer in timestamp ioctl
    when interface is down
 
 Previous releases - always broken:
 
  - openvswitch: correct neighbor discovery target mask field
    in the flow dump
 
  - wireguard: ignore v6 endpoints when ipv6 is disabled and fix a leak
 
  - rxrpc: fix call timer start racing with call destruction
 
  - rxrpc: fix null-deref when security type is rxrpc_no_security
 
  - can: fix UAF bugs around echo skbs in multiple drivers
 
 Misc:
 
  - docs: move netdev-FAQ to the "process" section of the documentation
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull more networking updates from Jakub Kicinski:
 "Networking fixes and rethook patches.

  Features:

   - kprobes: rethook: x86: replace kretprobe trampoline with rethook

  Current release - regressions:

   - sfc: avoid null-deref on systems without NUMA awareness in the new
     queue sizing code

  Current release - new code bugs:

   - vxlan: do not feed vxlan_vnifilter_dump_dev with non-vxlan devices

   - eth: lan966x: fix null-deref on PHY pointer in timestamp ioctl when
     interface is down

  Previous releases - always broken:

   - openvswitch: correct neighbor discovery target mask field in the
     flow dump

   - wireguard: ignore v6 endpoints when ipv6 is disabled and fix a leak

   - rxrpc: fix call timer start racing with call destruction

   - rxrpc: fix null-deref when security type is rxrpc_no_security

   - can: fix UAF bugs around echo skbs in multiple drivers

  Misc:

   - docs: move netdev-FAQ to the 'process' section of the
     documentation"

* tag 'net-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (57 commits)
  vxlan: do not feed vxlan_vnifilter_dump_dev with non vxlan devices
  openvswitch: Add recirc_id to recirc warning
  rxrpc: fix some null-ptr-deref bugs in server_key.c
  rxrpc: Fix call timer start racing with call destruction
  net: hns3: fix software vlan talbe of vlan 0 inconsistent with hardware
  net: hns3: fix the concurrency between functions reading debugfs
  docs: netdev: move the netdev-FAQ to the process pages
  docs: netdev: broaden the new vs old code formatting guidelines
  docs: netdev: call out the merge window in tag checking
  docs: netdev: add missing back ticks
  docs: netdev: make the testing requirement more stringent
  docs: netdev: add a question about re-posting frequency
  docs: netdev: rephrase the 'should I update patchwork' question
  docs: netdev: rephrase the 'Under review' question
  docs: netdev: shorten the name and mention msgid for patch status
  docs: netdev: note that RFC postings are allowed any time
  docs: netdev: turn the net-next closed into a Warning
  docs: netdev: move the patch marking section up
  docs: netdev: minor reword
  docs: netdev: replace references to old archives
  ...
2022-03-31 11:23:31 -07:00
Nathan Chancellor
e9c281928c kbuild: Make $(LLVM) more flexible
The LLVM make variable allows a developer to quickly switch between the
GNU and LLVM tools. However, it does not handle versioned binaries, such
as the ones shipped by Debian, as LLVM=1 just defines the tool variables
with the unversioned binaries.

There was some discussion during the review of the patch that introduces
LLVM=1 around versioned binaries, ultimately coming to the conclusion
that developers can just add the folder that contains the unversioned
binaries to their PATH, as Debian's versioned suffixed binaries are
really just symlinks to the unversioned binaries in /usr/lib/llvm-#/bin:

$ realpath /usr/bin/clang-14
/usr/lib/llvm-14/bin/clang

$ PATH=/usr/lib/llvm-14/bin:$PATH make ... LLVM=1

However, that can be cumbersome to developers who are constantly testing
series with different toolchains and versions. It is simple enough to
support these versioned binaries directly in the Kbuild system by
allowing the developer to specify the version suffix with LLVM=, which
is shorter than the above suggestion:

$ make ... LLVM=-14

It does not change the meaning of LLVM=1 (which will continue to use
unversioned binaries) and it does not add too much additional complexity
to the existing $(LLVM) code, while allowing developers to quickly test
their series with different versions of the whole LLVM suite of tools.

Some developers may build LLVM from source but not add the binaries to
their PATH, as they may not want to use that toolchain systemwide.
Support those developers by allowing them to supply the directory that
the LLVM tools are available in, as it is no more complex to support
than the version suffix change above.

$ make ... LLVM=/path/to/llvm/

Update and reorder the documentation to reflect these new additions.
At the same time, note that LLVM=0 is not the same as just omitting it
altogether, which has confused people in the past.

Link: https://lore.kernel.org/r/20200317215515.226917-1-ndesaulniers@google.com/
Link: https://lore.kernel.org/r/20220224151322.072632223@infradead.org/
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2022-03-31 12:03:46 +09:00
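The three forms of LLVM= described in e9c281928c above can be sketched as a small name-resolution function. This is an illustrative Python model of the behaviour stated in the commit message, not the Kbuild makefile code. The helper name resolve_llvm_tool is hypothetical. Treating LLVM=0 like LLVM=1 is an assumption: the message only says that LLVM=0 differs from omitting the variable.

```python
from typing import Optional


def resolve_llvm_tool(llvm: str, tool: str) -> Optional[str]:
    """Return the command used for `tool` given the value of LLVM=.

    Hypothetical helper for illustration only; returns None when LLVM is
    unset/empty, meaning the LLVM tools are not selected at all.
    """
    if llvm == "":
        return None
    if llvm.endswith("/"):
        # LLVM=/path/to/llvm/ : take the tools from that directory.
        return llvm + tool
    if llvm.startswith("-"):
        # LLVM=-14 : append the version suffix to each tool name.
        return tool + llvm
    # LLVM=1 (and, by assumption, any other non-empty value such as 0):
    # unversioned tools, looked up through PATH.
    return tool
```

For example, resolve_llvm_tool("-14", "clang") yields clang-14, and resolve_llvm_tool("/path/to/llvm/", "ld.lld") yields /path/to/llvm/ld.lld.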
Martin KaFai Lau
0a210af6d0 bpf: selftests: Test fentry tracing a struct_ops program
This patch tests attaching an fentry prog to a struct_ops prog.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220330011502.2985292-1-kafai@fb.com
2022-03-30 19:31:30 -07:00
Jason A. Donenfeld
ca93ca2340 wireguard: selftests: simplify RNG seeding
The seed_rng() function was written to work across lots of old kernels,
back when WireGuard used a big compatibility layer. Now that things have
evolved, we can vastly simplify this, by just marking the RNG as seeded.

Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-30 19:14:08 -07:00
Linus Torvalds
ee96dd9614 libnvdimm for 5.18
- Add perf support for nvdimm events, initially only for 'papr_scm'
   devices.
 
 - Deprecate the 'block aperture' support in libnvdimm, it only ever
   existed in the specification, not in shipping product.

Merge tag 'libnvdimm-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm

Pull libnvdimm updates from Dan Williams:
 "The update for this cycle includes the deprecation of block-aperture
  mode and a new perf events interface for the papr_scm nvdimm driver.

  The perf events approach was acked by PeterZ.

   - Add perf support for nvdimm events, initially only for 'papr_scm'
     devices.

   - Deprecate the 'block aperture' support in libnvdimm, it only ever
     existed in the specification, not in shipping product"

* tag 'libnvdimm-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm/blk: Fix title level
  MAINTAINERS: remove section LIBNVDIMM BLK: MMIO-APERTURE DRIVER
  powerpc/papr_scm: Fix build failure when
  drivers/nvdimm: Fix build failure when CONFIG_PERF_EVENTS is not set
  nvdimm/region: Delete nd_blk_region infrastructure
  ACPI: NFIT: Remove block aperture support
  nvdimm/namespace: Delete nd_namespace_blk
  nvdimm/namespace: Delete blk namespace consideration in shared paths
  nvdimm/blk: Delete the block-aperture window driver
  nvdimm/region: Fix default alignment for small regions
  docs: ABI: sysfs-bus-nvdimm: Document sysfs event format entries for nvdimm pmu
  powerpc/papr_scm: Add perf interface support
  drivers/nvdimm: Add perf interface to expose nvdimm performance stats
  drivers/nvdimm: Add nvdimm pmu structure
2022-03-30 10:04:11 -07:00
Haowen Bai
2609f635a2 selftests/bpf: Fix warning comparing pointer to 0
Avoid comparing a pointer-typed value with 0, to make the code clearer.
Reported by coccicheck:

  tools/testing/selftests/bpf/progs/map_ptr_kern.c:370:21-22:
  WARNING comparing pointer to 0
  tools/testing/selftests/bpf/progs/map_ptr_kern.c:397:21-22:
  WARNING comparing pointer to 0

Signed-off-by: Haowen Bai <baihaowen@meizu.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/1648605588-19269-1-git-send-email-baihaowen@meizu.com
2022-03-30 14:17:25 +02:00
Delyan Kratunov
522574fd78 bpftool: Explicit errno handling in skeletons
Andrii noticed that since f97b8b9bd6 ("bpftool: Fix a bug in subskeleton
code generation") the subskeleton code allows bpf_object__destroy_subskeleton
to overwrite the errno that subskeleton__open would return with. While this
is not currently an issue, let's make it future-proof.

This patch explicitly tracks err in subskeleton__open and skeleton__create
(i.e. calloc failure is explicitly ENOMEM) and ensures that errno is -err on
the error return path. The skeleton code had to be changed since maps and
progs codegen is shared with subskeletons.

Fixes: f97b8b9bd6 ("bpftool: Fix a bug in subskeleton code generation")
Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/3b6bfbb770c79ae64d8de26c1c1bd9d53a4b85f8.camel@fb.com
2022-03-30 14:06:59 +02:00
Linus Torvalds
9ae2a14308 dma-mapping updates for Linux 5.18
- do not zero buffer in set_memory_decrypted (Kirill A. Shutemov)
  - fix return value of dma-debug __setup handlers (Randy Dunlap)
  - swiotlb cleanups (Robin Murphy)
  - remove most remaining users of the pci-dma-compat.h API
    (Christophe JAILLET)
  - share the ABI header for the DMA map_benchmark with userspace
    (Tian Tao)
  - update the maintainer for DMA MAPPING BENCHMARK (Xiang Chen)
  - remove CONFIG_DMA_REMAP (me)

Merge tag 'dma-mapping-5.18' of git://git.infradead.org/users/hch/dma-mapping

Pull dma-mapping updates from Christoph Hellwig:

 - do not zero buffer in set_memory_decrypted (Kirill A. Shutemov)

 - fix return value of dma-debug __setup handlers (Randy Dunlap)

 - swiotlb cleanups (Robin Murphy)

 - remove most remaining users of the pci-dma-compat.h API
   (Christophe JAILLET)

 - share the ABI header for the DMA map_benchmark with userspace
   (Tian Tao)

 - update the maintainer for DMA MAPPING BENCHMARK (Xiang Chen)

 - remove CONFIG_DMA_REMAP (me)

* tag 'dma-mapping-5.18' of git://git.infradead.org/users/hch/dma-mapping:
  dma-mapping: benchmark: extract a common header file for map_benchmark definition
  dma-debug: fix return value of __setup handlers
  dma-mapping: remove CONFIG_DMA_REMAP
  media: v4l2-pci-skeleton: Remove usage of the deprecated "pci-dma-compat.h" API
  rapidio/tsi721: Remove usage of the deprecated "pci-dma-compat.h" API
  sparc: Remove usage of the deprecated "pci-dma-compat.h" API
  agp/intel: Remove usage of the deprecated "pci-dma-compat.h" API
  alpha: Remove usage of the deprecated "pci-dma-compat.h" API
  MAINTAINERS: update maintainer list of DMA MAPPING BENCHMARK
  swiotlb: simplify array allocation
  swiotlb: tidy up includes
  swiotlb: simplify debugfs setup
  swiotlb: do not zero buffer in set_memory_decrypted()
2022-03-29 08:50:14 -07:00
Yonghong Song
ccaff3d56a selftests/bpf: Fix clang compilation errors
An upstream LLVM patch ([1]) makes clang issue a warning for code like
  void test() {
    int j = 0;
    for (int i = 0; i < 1000; i++)
            j++;
    return;
  }

This triggered several errors in selftests/bpf build since
compilation flag -Werror is used.
  ...
  test_lpm_map.c:212:15: error: variable 'n_matches' set but not used [-Werror,-Wunused-but-set-variable]
        size_t i, j, n_matches, n_matches_after_delete, n_nodes, n_lookups;
                     ^
  test_lpm_map.c:212:26: error: variable 'n_matches_after_delete' set but not used [-Werror,-Wunused-but-set-variable]
        size_t i, j, n_matches, n_matches_after_delete, n_nodes, n_lookups;
                                ^
  ...
  prog_tests/get_stack_raw_tp.c:32:15: error: variable 'cnt' set but not used [-Werror,-Wunused-but-set-variable]
        static __u64 cnt;
                     ^
  ...

  For test_lpm_map.c, 'n_matches'/'n_matches_after_delete' are changed to be volatile
  in order to silence the warning. I didn't remove these two declarations since
  they are referenced in commented-out code which might be used by people in certain
  cases. For get_stack_raw_tp.c, the variable 'cnt' is removed.

  [1] https://reviews.llvm.org/D122271

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220325200304.2915588-1-yhs@fb.com
2022-03-28 20:00:11 -07:00
Jiri Olsa
ef8a257b4e bpftool: Fix generated code in codegen_asserts
Arnaldo reported that perf compilation fails with:

  $ make -k BUILD_BPF_SKEL=1 CORESIGHT=1 PYTHON=python3
  ...
  In file included from util/bpf_counter.c:28:
  /tmp/build/perf//util/bpf_skel/bperf_leader.skel.h: In function ‘bperf_leader_bpf__assert’:
  /tmp/build/perf//util/bpf_skel/bperf_leader.skel.h:351:51: error: unused parameter ‘s’ [-Werror=unused-parameter]
    351 | bperf_leader_bpf__assert(struct bperf_leader_bpf *s)
        |                          ~~~~~~~~~~~~~~~~~~~~~~~~~^
  cc1: all warnings being treated as errors

If there's nothing to generate in the new assert function, we get an
unused-parameter warning/error for 's'. Fix it by adding the 'unused'
attribute to the parameter.

Fixes: 08d4dba6ae ("bpftool: Bpf skeletons assert type sizes")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Link: https://lore.kernel.org/bpf/20220328083703.2880079-1-jolsa@kernel.org
2022-03-28 19:10:25 -07:00
Andrii Nakryiko
99dea2c664 selftests/bpf: fix selftest after random:urandom_read tracepoint removal
14c174633f ("random: remove unused tracepoints") removed all the
tracepoints from drivers/char/random.c, one of which,
random:urandom_read, was used by stacktrace_build_id selftest to trigger
stack trace capture.

Fix breakage by switching to kprobing urandom_read() function.

Suggested-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220325225643.2606-1-andrii@kernel.org
2022-03-28 19:09:23 -07:00
Geliang Tang
98870605b3 bpf: Sync comments for bpf_get_stack
Commit ee2a098851 missed updating the comments for helper bpf_get_stack
in tools/include/uapi/linux/bpf.h. Sync it.

Fixes: ee2a098851 ("bpf: Adjust BPF stack helper functions to accommodate skip > 0")
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/ce54617746b7ed5e9ba3b844e55e74cb8a60e0b5.1648110794.git.geliang.tang@suse.com
2022-03-28 19:06:35 -07:00
Milan Landaverde
8c1b211985 bpf/bpftool: Add unprivileged_bpf_disabled check against value of 2
In [1], we added a kconfig knob that can set
/proc/sys/kernel/unprivileged_bpf_disabled to 2.

We now check against this value in bpftool feature probe.

[1] https://lore.kernel.org/bpf/74ec548079189e4e4dffaeb42b8987bb3c852eee.1620765074.git.daniel@iogearbox.net

Signed-off-by: Milan Landaverde <milan@mdaverde.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Acked-by: KP Singh <kpsingh@kernel.org>
Link: https://lore.kernel.org/bpf/20220322145012.1315376-1-milan@mdaverde.com
2022-03-28 19:01:54 -07:00
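For reference, the values that 8c1b211985 above distinguishes can be modelled as a small lookup. This is a hedged Python sketch, not bpftool's code: the function names are hypothetical, and the descriptions paraphrase the kernel sysctl documentation (0 enabled, 1 disabled until reboot, 2 disabled but changeable by an admin).

```python
# Paraphrased meanings of /proc/sys/kernel/unprivileged_bpf_disabled.
UNPRIV_BPF_STATES = {
    0: "unprivileged bpf() is enabled",
    1: "unprivileged bpf() is disabled without recovery",
    2: "unprivileged bpf() is disabled, an admin may change it later",
}


def describe_unprivileged_bpf(raw: str) -> str:
    """Map the text read from the sysctl file to a description (hypothetical helper)."""
    try:
        value = int(raw.strip())
    except ValueError:
        return "unprivileged_bpf_disabled has an unparsable value"
    return UNPRIV_BPF_STATES.get(
        value, f"unprivileged_bpf_disabled has unknown value {value}"
    )


def probe_unprivileged_bpf(
    path: str = "/proc/sys/kernel/unprivileged_bpf_disabled",
) -> str:
    """Read the sysctl file and describe its state; errors are reported as text."""
    try:
        with open(path) as f:
            return describe_unprivileged_bpf(f.read())
    except OSError as exc:
        return f"unable to read {path}: {exc}"
```

Before this change only 0 and 1 were recognised by the probe; the sketch shows why a third branch is needed once the kconfig knob can set the value to 2.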
Linus Torvalds
d717e4cae0 Networking fixes, including fixes from netfilter.
Current release - regressions:
 
  - llc: only change llc->dev when bind() succeeds, fix null-deref
 
 Current release - new code bugs:
 
  - smc: fix a memory leak in smc_sysctl_net_exit()
 
  - dsa: realtek: make interface drivers depend on OF
 
 Previous releases - regressions:
 
  - sched: act_ct: fix ref leak when switching zones
 
 Previous releases - always broken:
 
  - netfilter: egress: report interface as outgoing
 
  - vsock/virtio: enable VQs early on probe and finish the setup
    before using them
 
 Misc:
 
  - memcg: enable accounting for nft objects
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.18-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter.

  Current release - regressions:

   - llc: only change llc->dev when bind() succeeds, fix null-deref

  Current release - new code bugs:

   - smc: fix a memory leak in smc_sysctl_net_exit()

   - dsa: realtek: make interface drivers depend on OF

  Previous releases - regressions:

   - sched: act_ct: fix ref leak when switching zones

  Previous releases - always broken:

   - netfilter: egress: report interface as outgoing

   - vsock/virtio: enable VQs early on probe and finish the setup before
     using them

  Misc:

   - memcg: enable accounting for nft objects"

* tag 'net-5.18-rc0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (39 commits)
  Revert "selftests: net: Add tls config dependency for tls selftests"
  net/smc: Send out the remaining data in sndbuf before close
  net: move net_unlink_todo() out of the header
  net: dsa: bcm_sf2_cfp: fix an incorrect NULL check on list iterator
  net: bnxt_ptp: fix compilation error
  selftests: net: Add tls config dependency for tls selftests
  memcg: enable accounting for nft objects
  net/sched: act_ct: fix ref leak when switching zones
  net/smc: fix a memory leak in smc_sysctl_net_exit()
  selftests: tls: skip cmsg_to_pipe tests with TLS=n
  octeontx2-af: initialize action variable
  net: sparx5: switchdev: fix possible NULL pointer dereference
  net/x25: Fix null-ptr-deref caused by x25_disconnect
  qlcnic: dcb: default to returning -EOPNOTSUPP
  net: sparx5: depends on PTP_1588_CLOCK_OPTIONAL
  net: hns3: fix phy can not link up when autoneg off and reset
  net: hns3: add NULL pointer check for hns3_set/get_ringparam()
  net: hns3: add netdev reset check for hns3_set_tunable()
  net: hns3: clean residual vf config after disable sriov
  net: hns3: add max order judgement for tx spare buffer
  ...
2022-03-28 17:02:04 -07:00
Jakub Kicinski
20695e9a9f Revert "selftests: net: Add tls config dependency for tls selftests"
This reverts commit d9142e1cf3.

The test is supposed to run cleanly when TLS is disabled,
to test compatibility with TCP behavior. I can't repro
the failure [1]; the problem should be debugged rather
than papered over.

Link: https://lore.kernel.org/all/20220325161203.7000698c@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com/ [1]
Fixes: d9142e1cf3 ("selftests: net: Add tls config dependency for tls selftests")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20220328212904.2685395-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-28 16:15:54 -07:00
Linus Torvalds
d111c9f034 Livepatching changes for 5.18

Merge tag 'livepatching-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching updates from Petr Mladek:

 - Forced transitions block only to-be-removed livepatches [Chengming]

 - Detect when ftrace handler could not be disabled in self-tests [David]

 - Calm down warning from a static analyzer [Tom]

* tag 'livepatching-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  livepatch: Reorder to use before freeing a pointer
  livepatch: Don't block removal of patches that are safe to unload
  livepatch: Skip livepatch tests if ftrace cannot be configured
2022-03-28 14:38:31 -07:00
Michael S. Tsirkin
f03560a57c tools/virtio: compile with -pthread
When using pthreads, one has to compile and link with -pthread,
otherwise e.g. glibc is not guaranteed to be reentrant.

This replaces -lpthread.

Reported-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:59 -04:00
Michael S. Tsirkin
06f05bc522 tools/virtio: fix after premapped buf support
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-03-28 16:52:59 -04:00
Linus Torvalds
02e2af20f4 Char/Misc and other driver updates for 5.18-rc1
Here is the big set of char/misc and other small driver subsystem
 updates for 5.18-rc1.
 
 Included in here are merges from driver subsystems which contain:
 	- iio driver updates and new drivers
 	- fsi driver updates
 	- fpga driver updates
 	- habanalabs driver updates and support for new hardware
 	- soundwire driver updates and new drivers
 	- phy driver updates and new drivers
 	- coresight driver updates
 	- icc driver updates
 
 Individual changes include:
 	- mei driver updates
 	- interconnect driver updates
 	- new PECI driver subsystem added
 	- vmci driver updates
 	- lots of tiny misc/char driver updates
 
 There will be two merge conflicts with your tree, one in MAINTAINERS
 which is obvious to fix up, and one in drivers/phy/freescale/Kconfig
 which also should be easy to resolve.
 
 All of these have been in linux-next for a while with no reported
 problems.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Merge tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc

Pull char/misc and other driver updates from Greg KH:
 "Here is the big set of char/misc and other small driver subsystem
  updates for 5.18-rc1.

  Included in here are merges from driver subsystems which contain:

   - iio driver updates and new drivers

   - fsi driver updates

   - fpga driver updates

   - habanalabs driver updates and support for new hardware

   - soundwire driver updates and new drivers

   - phy driver updates and new drivers

   - coresight driver updates

   - icc driver updates

  Individual changes include:

   - mei driver updates

   - interconnect driver updates

   - new PECI driver subsystem added

   - vmci driver updates

   - lots of tiny misc/char driver updates

  All of these have been in linux-next for a while with no reported
  problems"

* tag 'char-misc-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc: (556 commits)
  firmware: google: Properly state IOMEM dependency
  kgdbts: fix return value of __setup handler
  firmware: sysfb: fix platform-device leak in error path
  firmware: stratix10-svc: add missing callback parameter on RSU
  arm64: dts: qcom: add non-secure domain property to fastrpc nodes
  misc: fastrpc: Add dma handle implementation
  misc: fastrpc: Add fdlist implementation
  misc: fastrpc: Add helper function to get list and page
  misc: fastrpc: Add support to secure memory map
  dt-bindings: misc: add fastrpc domain vmid property
  misc: fastrpc: check before loading process to the DSP
  misc: fastrpc: add secure domain support
  dt-bindings: misc: add property to support non-secure DSP
  misc: fastrpc: Add support to get DSP capabilities
  misc: fastrpc: add support for FASTRPC_IOCTL_MEM_MAP/UNMAP
  misc: fastrpc: separate fastrpc device from channel context
  dt-bindings: nvmem: brcm,nvram: add basic NVMEM cells
  dt-bindings: nvmem: make "reg" property optional
  nvmem: brcm_nvram: parse NVRAM content into NVMEM cells
  nvmem: dt-bindings: Fix the error of dt-bindings check
  ...
2022-03-28 12:27:35 -07:00
Naresh Kamboju
d9142e1cf3 selftests: net: Add tls config dependency for tls selftests
The selftests net tls test cases need TLS=m; without it the test hangs.
Enabling config TLS solves this problem and lets the test run to completion.
  - CONFIG_TLS=m

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-28 09:00:35 -07:00
Linus Torvalds
7b58b82b86 perf tools changes for v5.18: 1st batch
New features:
 
 perf ftrace:
 
 - Add -n/--use-nsec option to the 'latency' subcommand.
 
   Default: usecs:
 
   $ sudo perf ftrace latency -T dput -a sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |    2098375 | #############################  |
        1 - 2    us |         61 |                                |
        2 - 4    us |         33 |                                |
        4 - 8    us |         13 |                                |
        8 - 16   us |        124 |                                |
       16 - 32   us |        123 |                                |
       32 - 64   us |          1 |                                |
       64 - 128  us |          0 |                                |
      128 - 256  us |          1 |                                |
      256 - 512  us |          0 |                                |
 
   Better granularity with nsec:
 
   $ sudo perf ftrace latency -T dput -a -n sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |          0 |                                |
        1 - 2    ns |          0 |                                |
        2 - 4    ns |          0 |                                |
        4 - 8    ns |          0 |                                |
        8 - 16   ns |          0 |                                |
       16 - 32   ns |          0 |                                |
       32 - 64   ns |          0 |                                |
       64 - 128  ns |    1163434 | ##############                 |
      128 - 256  ns |     914102 | #############                  |
      256 - 512  ns |        884 |                                |
      512 - 1024 ns |        613 |                                |
        1 - 2    us |         31 |                                |
        2 - 4    us |         17 |                                |
        4 - 8    us |          7 |                                |
        8 - 16   us |        123 |                                |
       16 - 32   us |         83 |                                |
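The rows in the two tables above double in width (0 - 1, 1 - 2, 2 - 4, ...), so each duration falls into a power-of-two bucket. A minimal Python sketch of that bucketing follows. It is an illustration of the table layout, not perf's implementation, and the function names are hypothetical. Durations are integers in whatever unit is selected (usecs by default, nsecs with -n).

```python
from collections import Counter
from typing import Iterable, Tuple


def bucket_index(duration: int) -> int:
    """Index of the power-of-two bucket holding `duration`.

    Bucket 0 covers [0, 1); bucket i >= 1 covers [2**(i-1), 2**i).
    """
    if duration < 0:
        raise ValueError("duration must be non-negative")
    # For integers, bit_length() is 0 for 0 and floor(log2(d)) + 1 otherwise.
    return duration.bit_length()


def bucket_bounds(index: int) -> Tuple[int, int]:
    """Lower (inclusive) and upper (exclusive) bound of a bucket."""
    if index == 0:
        return (0, 1)
    return (1 << (index - 1), 1 << index)


def latency_histogram(durations: Iterable[int]) -> Counter:
    """Count how many durations land in each bucket."""
    return Counter(bucket_index(d) for d in durations)
```

With usec resolution almost every dput() call rounds down into the first bucket; measuring the same calls in nsecs spreads them over the 64 - 128 and 128 - 256 rows, which is the finer granularity the -n option provides.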
 
 perf lock:
 
 - Add -c/--combine-locks option to merge lock instances in the same class into
   a single entry.
 
   # perf lock report -c
                  Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)
 
         rcu_read_lock   251225         0            0              0            0            0
    hrtimer_bases.lock    39450         0            0              0            0            0
   &sb->s_type->i_l...    10301         1          662            662          662          662
      ptlock_ptr(page)    10173         2          701           1402          760          642
   &(ei->i_block_re...     8732         0            0              0            0            0
          &xa->xa_lock     8088         0            0              0            0            0
           &base->lock     6705         0            0              0            0            0
           &p->pi_lock     5549         0            0              0            0            0
   &dentry->d_lockr...     5010         4         1274           5097         1844          789
             &ep->lock     3958         0            0              0            0            0
 
 - Add -F/--field option to customize the list of fields to output:
 
   $ perf lock report -F contended,wait_max -k avg_wait
                   Name contended max wait(ns) avg wait(ns)
 
         slock-AF_INET6         1        23543        23543
      &lruvec->lru_lock         5        18317        11254
         slock-AF_INET6         1        10379        10379
             rcu_node_1         1         2104         2104
    &dentry->d_lockr...         1         1844         1844
    &dentry->d_lockr...         1         1672         1672
       &newf->file_lock        15         2279         1025
    &dentry->d_lockr...         1          792          792
 
 - Add --synth=no option for record, as there is no need to symbolize,
   lock names come from the tracepoints.
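The merging done by -c/--combine-locks can be pictured as a group-by over per-instance statistics. The Python sketch below is an assumption-laden illustration, not perf's code: the LockStat type and combine_locks function are hypothetical, instances are grouped by their displayed name, and the average is taken as total wait divided by contended count, which is consistent with the rows shown above (for example 1402 / 2 = 701).

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable


@dataclass
class LockStat:
    name: str
    acquired: int
    contended: int
    total_wait_ns: int
    max_wait_ns: int
    min_wait_ns: int

    @property
    def avg_wait_ns(self) -> int:
        # Integer average over contended acquisitions; 0 if never contended.
        return self.total_wait_ns // self.contended if self.contended else 0


def combine_locks(stats: Iterable[LockStat]) -> Dict[str, LockStat]:
    """Merge lock instances that share a class name into one entry each."""
    combined: Dict[str, LockStat] = {}
    for s in stats:
        cur = combined.get(s.name)
        if cur is None:
            combined[s.name] = replace(s)  # copy, keep the input untouched
            continue
        cur.acquired += s.acquired
        cur.total_wait_ns += s.total_wait_ns
        cur.max_wait_ns = max(cur.max_wait_ns, s.max_wait_ns)
        if s.contended:
            # Only contended instances carry a meaningful minimum wait.
            if cur.contended:
                cur.min_wait_ns = min(cur.min_wait_ns, s.min_wait_ns)
            else:
                cur.min_wait_ns = s.min_wait_ns
        cur.contended += s.contended
    return combined
```

Feeding two hypothetical ptlock_ptr(page) instances with waits of 760 ns and 642 ns produces a single entry with contended 2, total 1402, avg 701, max 760 and min 642.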
 
 perf record:
 
 - Threaded recording, opt-in, via the new --threads command line option.
 
 - Improve AMD IBS (Instruction-Based Sampling) error handling messages.
 
 perf script:
 
 - Add 'brstackinsnlen' field (use it with -F) for branch stacks.
 
 - Output branch sample type in 'perf script'.
 
 perf report:
 
 - Add "addr_from" and "addr_to" sort dimensions.
 
 - Print branch stack entry type in 'perf report --dump-raw-trace'
 
 - Fix symbolization for chrooted workloads.
 
 Hardware tracing:
 
 Intel PT:
 
 - Add CFE (Control Flow Event) and EVD (Event Data) packets support.
 
 - Add MODE.Exec IFLAG bit support.
 
 Explanation about these features from the "Intel® 64 and IA-32 architectures
 software developer’s manual combined volumes: 1, 2A, 2B, 2C, 2D, 3A, 3B, 3C,
 3D, and 4" PDF at:
 
   https://cdrdv2.intel.com/v1/dl/getContent/671200
 
 At page 3951:
 
 <quote>
 32.2.4
 
 Event Trace is a capability that exposes details about the asynchronous
 events, when they are generated, and when their corresponding software
 event handler completes execution. These include:
 
 o Interrupts, including NMI and SMI, including the interrupt vector when
 defined.
 
 o Faults, exceptions including the fault vector.
 
 — Page faults additionally include the page fault address, when in context.
 
 o Event handler returns, including IRET and RSM.
 
 o VM exits and VM entries.
 
 — VM exits include the values written to the “exit reason” and “exit qualification” VMCS fields.
 
 o INIT and SIPI events.
 
 o TSX aborts, including the abort status returned for the RTM instructions.
 
 o Shutdown.
 
 Additionally, it provides indication of the status of the Interrupt Flag
 (IF), to indicate when interrupts are masked.
 </quote>
 
 ARM CoreSight:
 
 - Use advertised caps/min_interval as default sample_period on ARM SPE.
 
 - Update deduction of TRCCONFIGR register for branch broadcast on ARM's CoreSight ETM.
 
 Vendor Events (JSON):
 
 Intel:
 
 - Update events and metrics for:
 
     Alderlake, Broadwell, Broadwell DE, BroadwellX, CascadelakeX, Elkhartlake,
     Bonnell, Goldmont, GoldmontPlus, Westmere EP-DP, Haswell, HaswellX,
     Icelake, IcelakeX, Ivybridge, Ivytown, Jaketown, Knights Landing,
     Nehalem EP, Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
     Tigerlake, TremontX, Westmere EP-SP, Westmere EX.
 
 ARM:
 
 - Add support for HiSilicon CPA PMU aliasing.
 
 perf stat:
 
 - Fix forked applications enablement of counters.
 
 - The 'slots' event should only be printed in a different order than the one specified
   on the command line when 'topdown' events are present; fix it.
 
 Miscellaneous:
 
 - Sync msr-index, cpufeatures header files with the kernel sources.
 
 - Stop using some deprecated libbpf APIs in 'perf trace'.
 
 - Fix some spelling mistakes.
 
 - Refactor the maps pointers usage to pave the way for using refcount debugging.
 
 - Only offer the --tui option on perf top, report and annotate when perf was
   built with libslang.
 
 - Don't mention --to-ctf in 'perf data --help' when not linking with the required
   library, libbabeltrace.
 
 - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by array_size.cocci.
 
 - Enhance the matching of sub-commands abbreviations:
 	'perf c2c rec' -> 'perf c2c record'
 	'perf c2c recport' -> error
 
 - Set build-id using build-id header on new mmap records.
 
 - Fix generation of 'perf --version' string.
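The sub-command abbreviation rule listed above ('rec' is accepted for 'record', 'recport' is rejected) amounts to prefix matching. A short Python sketch under stated assumptions: match_subcommand is a hypothetical name, not a perf function, and rejecting ambiguous prefixes is a choice made for this illustration rather than something the message specifies.

```python
from typing import Optional, Sequence


def match_subcommand(arg: str, subcommands: Sequence[str]) -> Optional[str]:
    """Return the sub-command that `arg` abbreviates, or None on error.

    `arg` must be a non-empty prefix of exactly one known sub-command.
    """
    if not arg:
        return None
    matches = [cmd for cmd in subcommands if cmd.startswith(arg)]
    # 'recport' is not a prefix of anything, so it yields no match;
    # an ambiguous prefix such as 're' is rejected by assumption.
    return matches[0] if len(matches) == 1 else None
```

With the sub-commands ["record", "report"], 'rec' resolves to 'record' while 'recport' resolves to nothing, matching the two cases in the list above.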
 
 perf test:
 
 - Add test for the arm_spe event.
 
 - Add test to check unwinding using frame-pointer (fp) mode on arm64.
 
 - Make metric testing more robust in 'perf test'.
 
 - Add error message for unsupported branch stack cases.
 
 libperf:
 
 - Add API for allocating new thread map array.
 
 - Fix typo in perf_evlist__open() failure error messages in libperf tests.
 
 perf c2c:
 
 - Replace bitmap_weight() with bitmap_empty() where appropriate.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Merge tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tools updates from Arnaldo Carvalho de Melo:
 "New features:

  perf ftrace:

   - Add -n/--use-nsec option to the 'latency' subcommand.

     Default: usecs:

     $ sudo perf ftrace latency -T dput -a sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |    2098375 | #############################  |
          1 - 2    us |         61 |                                |
          2 - 4    us |         33 |                                |
          4 - 8    us |         13 |                                |
          8 - 16   us |        124 |                                |
         16 - 32   us |        123 |                                |
         32 - 64   us |          1 |                                |
         64 - 128  us |          0 |                                |
        128 - 256  us |          1 |                                |
        256 - 512  us |          0 |                                |

     Better granularity with nsec:

     $ sudo perf ftrace latency -T dput -a -n sleep 1
     #   DURATION     |      COUNT | GRAPH                          |
          0 - 1    us |          0 |                                |
          1 - 2    ns |          0 |                                |
          2 - 4    ns |          0 |                                |
          4 - 8    ns |          0 |                                |
          8 - 16   ns |          0 |                                |
         16 - 32   ns |          0 |                                |
         32 - 64   ns |          0 |                                |
         64 - 128  ns |    1163434 | ##############                 |
        128 - 256  ns |     914102 | #############                  |
        256 - 512  ns |        884 |                                |
        512 - 1024 ns |        613 |                                |
          1 - 2    us |         31 |                                |
          2 - 4    us |         17 |                                |
          4 - 8    us |          7 |                                |
          8 - 16   us |        123 |                                |
         16 - 32   us |         83 |                                |

  perf lock:

   - Add -c/--combine-locks option to merge lock instances in the same
     class into a single entry.

     # perf lock report -c
                    Name acquired contended avg wait(ns) total wait(ns) max wait(ns) min wait(ns)

           rcu_read_lock   251225         0            0              0            0            0
      hrtimer_bases.lock    39450         0            0              0            0            0
     &sb->s_type->i_l...    10301         1          662            662          662          662
        ptlock_ptr(page)    10173         2          701           1402          760          642
     &(ei->i_block_re...     8732         0            0              0            0            0
            &xa->xa_lock     8088         0            0              0            0            0
             &base->lock     6705         0            0              0            0            0
             &p->pi_lock     5549         0            0              0            0            0
     &dentry->d_lockr...     5010         4         1274           5097         1844          789
               &ep->lock     3958         0            0              0            0            0

   - Add -F/--field option to customize the list of fields to output:

     $ perf lock report -F contended,wait_max -k avg_wait
                     Name contended max wait(ns) avg wait(ns)

           slock-AF_INET6         1        23543        23543
        &lruvec->lru_lock         5        18317        11254
           slock-AF_INET6         1        10379        10379
               rcu_node_1         1         2104         2104
      &dentry->d_lockr...         1         1844         1844
      &dentry->d_lockr...         1         1672         1672
         &newf->file_lock        15         2279         1025
      &dentry->d_lockr...         1          792          792

   - Add --synth=no option for record, as there is no need to symbolize:
     lock names come from the tracepoints.

  perf record:

   - Threaded recording, opt-in, via the new --threads command line
     option.

   - Improve AMD IBS (Instruction-Based Sampling) error handling
     messages.

  perf script:

   - Add 'brstackinsnlen' field (use it with -F) for branch stacks.

   - Output branch sample type in 'perf script'.

  perf report:

   - Add "addr_from" and "addr_to" sort dimensions.

   - Print branch stack entry type in 'perf report --dump-raw-trace'.

   - Fix symbolization for chrooted workloads.

  Hardware tracing:

  Intel PT:

   - Add CFE (Control Flow Event) and EVD (Event Data) packets support.

   - Add MODE.Exec IFLAG bit support.

     Explanation about these features from the "Intel® 64 and IA-32
     architectures software developer’s manual combined volumes: 1, 2A,
     2B, 2C, 2D, 3A, 3B, 3C, 3D, and 4" PDF at:

        https://cdrdv2.intel.com/v1/dl/getContent/671200

     At page 3951:
      "32.2.4

       Event Trace is a capability that exposes details about the
       asynchronous events, when they are generated, and when their
       corresponding software event handler completes execution. These
       include:

        o Interrupts, including NMI and SMI, including the interrupt
          vector when defined.

        o Faults, exceptions including the fault vector.

           - Page faults additionally include the page fault address,
             when in context.

        o Event handler returns, including IRET and RSM.

        o VM exits and VM entries.

           - VM exits include the values written to the “exit reason”
             and “exit qualification” VMCS fields.

        o INIT and SIPI events.

        o TSX aborts, including the abort status returned for the RTM
          instructions.

        o Shutdown.

       Additionally, it provides indication of the status of the
       Interrupt Flag (IF), to indicate when interrupts are masked"

  ARM CoreSight:

   - Use advertised caps/min_interval as default sample_period on ARM
     spe.

   - Update deduction of TRCCONFIGR register for branch broadcast on
     ARM's CoreSight ETM.

  Vendor Events (JSON):

  Intel:

   - Update events and metrics for: Alderlake, Broadwell, Broadwell DE,
     BroadwellX, CascadelakeX, Elkhartlake, Bonnell, Goldmont,
     GoldmontPlus, Westmere EP-DP, Haswell, HaswellX, Icelake, IcelakeX,
     Ivybridge, Ivytown, Jaketown, Knights Landing, Nehalem EP,
     Sandybridge, Silvermont, Skylake, Skylake Server, SkylakeX,
     Tigerlake, TremontX, Westmere EP-SP, and Westmere EX.

  ARM:

   - Add support for HiSilicon CPA PMU aliasing.

  perf stat:

   - Fix forked applications enablement of counters.

   - Only print 'slots' in a different order than the one specified on
     the command line when 'topdown' events are present.

  Miscellaneous:

   - Sync msr-index, cpufeatures header files with the kernel sources.

   - Stop using some deprecated libbpf APIs in 'perf trace'.

   - Fix some spelling mistakes.

   - Refactor the maps pointers usage to pave the way for using refcount
     debugging.

   - Only offer the --tui option on perf top, report and annotate when
     perf was built with libslang.

   - Don't mention --to-ctf in 'perf data --help' when not linking with
     the required library, libbabeltrace.

   - Use ARRAY_SIZE() instead of ad hoc equivalent, spotted by
     array_size.cocci.

   - Enhance the matching of sub-commands abbreviations:
        'perf c2c rec' -> 'perf c2c record'
        'perf c2c recport' -> error

   - Set build-id using build-id header on new mmap records.

   - Fix generation of 'perf --version' string.

  perf test:

   - Add test for the arm_spe event.

   - Add test to check unwinding using frame-pointer (fp) mode on arm64.

   - Make metric testing more robust in 'perf test'.

   - Add error message for unsupported branch stack cases.

  libperf:

   - Add API for allocating new thread map array.

   - Fix typo in perf_evlist__open() failure error messages in libperf
     tests.

  perf c2c:

   - Replace bitmap_weight() with bitmap_empty() where appropriate"

* tag 'perf-tools-for-v5.18-2022-03-26' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (143 commits)
  perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
  perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
  perf tools: Enhance the matching of sub-commands abbreviations
  libperf tests: Fix typo in perf_evlist__open() failure error messages
  tools arm64: Import cputype.h
  perf lock: Add -F/--field option to control output
  perf lock: Extend struct lock_key to have print function
  perf lock: Add --synth=no option for record
  tools headers cpufeatures: Sync with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  perf stat: Fix forked applications enablement of counters
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  perf evsel: Make evsel__env() always return a valid env
  perf build-id: Fix spelling mistake "Cant" -> "Can't"
  perf header: Fix spelling mistake "could't" -> "couldn't"
  perf script: Add 'brstackinsnlen' for branch stacks
  perf parse-events: Move slots only with topdown
  perf ftrace latency: Update documentation
  perf ftrace latency: Add -n/--use-nsec option
  perf tools: Fix version kernel tag
  ...
2022-03-27 13:42:32 -07:00
Linus Torvalds
02f9a04d76 memblock: test suite and a small cleanup
* A small cleanup of unused variable in __next_mem_pfn_range_in_zone
 * Initial test suite to simulate memblock behaviour in userspace

Merge tag 'memblock-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock

Pull memblock updates from Mike Rapoport:
 "Test suite and a small cleanup:

   - A small cleanup of unused variable in __next_mem_pfn_range_in_zone

   - Initial test suite to simulate memblock behaviour in userspace"

* tag 'memblock-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rppt/memblock: (27 commits)
  memblock tests: Add TODO and README files
  memblock tests: Add memblock_alloc_try_nid tests for bottom up
  memblock tests: Add memblock_alloc_try_nid tests for top down
  memblock tests: Add memblock_alloc_from tests for bottom up
  memblock tests: Add memblock_alloc_from tests for top down
  memblock tests: Add memblock_alloc tests for bottom up
  memblock tests: Add memblock_alloc tests for top down
  memblock tests: Add simulation of physical memory
  memblock tests: Split up reset_memblock function
  memblock tests: Fix testing with 32-bit physical addresses
  memblock: __next_mem_pfn_range_in_zone: remove unneeded local variable nid
  memblock tests: Add memblock_free tests
  memblock tests: Add memblock_add_node test
  memblock tests: Add memblock_remove tests
  memblock tests: Add memblock_reserve tests
  memblock tests: Add memblock_add tests
  memblock tests: Add memblock reset function
  memblock tests: Add skeleton of the memblock simulator
  tools/include: Add debugfs.h stub
  tools/include: Add pfn.h stub
  ...
2022-03-27 13:36:06 -07:00
Linus Torvalds
7001052160 Add support for Intel CET-IBT, available since Tigerlake (11th gen), which is a
coarse grained, hardware based, forward edge Control-Flow-Integrity mechanism
 where any indirect CALL/JMP must target an ENDBR instruction or suffer #CP.
 
 Additionally, since Alderlake (12th gen)/Sapphire-Rapids, speculation is
 limited to 2 instructions (and typically fewer) on branch targets not starting
 with ENDBR. CET-IBT also limits speculation of the next sequential instruction
 after the indirect CALL/JMP [1].
 
 CET-IBT is fundamentally incompatible with retpolines, but provides, as
 described above, speculation limits itself.
 
 [1] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html

Merge tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 CET-IBT (Control-Flow-Integrity) support from Peter Zijlstra:
 "Add support for Intel CET-IBT, available since Tigerlake (11th gen),
  which is a coarse grained, hardware based, forward edge
  Control-Flow-Integrity mechanism where any indirect CALL/JMP must
  target an ENDBR instruction or suffer #CP.

  Additionally, since Alderlake (12th gen)/Sapphire-Rapids, speculation
  is limited to 2 instructions (and typically fewer) on branch targets
  not starting with ENDBR. CET-IBT also limits speculation of the next
  sequential instruction after the indirect CALL/JMP [1].

  CET-IBT is fundamentally incompatible with retpolines, but provides,
  as described above, speculation limits itself"

[1] https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/branch-history-injection.html

* tag 'x86_core_for_5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
  kvm/emulate: Fix SETcc emulation for ENDBR
  x86/Kconfig: Only allow CONFIG_X86_KERNEL_IBT with ld.lld >= 14.0.0
  x86/Kconfig: Only enable CONFIG_CC_HAS_IBT for clang >= 14.0.0
  kbuild: Fixup the IBT kbuild changes
  x86/Kconfig: Do not allow CONFIG_X86_X32_ABI=y with llvm-objcopy
  x86: Remove toolchain check for X32 ABI capability
  x86/alternative: Use .ibt_endbr_seal to seal indirect calls
  objtool: Find unused ENDBR instructions
  objtool: Validate IBT assumptions
  objtool: Add IBT/ENDBR decoding
  objtool: Read the NOENDBR annotation
  x86: Annotate idtentry_df()
  x86,objtool: Move the ASM_REACHABLE annotation to objtool.h
  x86: Annotate call_on_stack()
  objtool: Rework ASM_REACHABLE
  x86: Mark __invalid_creds() __noreturn
  exit: Mark do_group_exit() __noreturn
  x86: Mark stop_this_cpu() __noreturn
  objtool: Ignore extra-symbol code
  objtool: Rename --duplicate to --lto
  ...
2022-03-27 10:17:23 -07:00
Jakub Kicinski
5c7e49be96 selftests: tls: skip cmsg_to_pipe tests with TLS=n
These are negative tests, checking that the TLS code rejects certain
operations. They won't pass without TLS enabled, because pure TCP
accepts those operations.

Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Fixes: d87d67fd61 ("selftests: tls: test splicing cmsgs")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-26 14:15:16 -07:00
Kim Phillips
ab0809af0b perf evsel: Improve AMD IBS (Instruction-Based Sampling) error handling messages
Improve the error message returned on failed perf_event_open() on AMD
systems when using IBS (Instruction-Based Sampling).

Output of executing 'perf record -e ibs_op// true' as a non root user
BEFORE this patch (perf will add the 'u' modifier at the end to exclude
kernel/hypervisor sampling):

  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//u).
  /bin/dmesg | grep -i perf may provide additional information.

Output after:

  AMD IBS can't exclude kernel events.  Try running at a higher privilege level.

Output of executing 'sudo perf record -e ibs_op// true' BEFORE this patch:

  Error:
  The sys_perf_event_open() syscall returned with 22 (Invalid argument) for event (ibs_op//).
  /bin/dmesg | grep -i perf may provide additional information.

Output after:

  Error:
  Invalid event (ibs_op//) in per-thread mode, enable system wide with '-a'.

Following the suggestion:

  $ sudo perf record -a -e ibs_op// true
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 1.664 MB perf.data (194 samples) ]
  $
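
The two "Output after" cases above can be read as a small decision
table. The following is only an illustrative Python sketch, not the perf
implementation (which is C, in evsel__open_strerror()); the function
name ibs_open_error() and its parameters are hypothetical, and the
conditions are inferred from the examples in this message rather than
taken from the patch itself.

```python
import errno


def ibs_open_error(err, event, exclude_kernel, system_wide):
    """Return a friendlier message for a failed IBS event open, or None.

    Hypothetical helper: the conditions are inferred from the example
    outputs in the commit message, not copied from the perf sources.
    """
    # Only EINVAL (22) on an IBS event is handled here.
    if err != errno.EINVAL or not event.startswith("ibs_"):
        return None
    # Non-root case: perf added the 'u' modifier, i.e. kernel excluded.
    if exclude_kernel:
        return ("AMD IBS can't exclude kernel events.  "
                "Try running at a higher privilege level.")
    # Root case without '-a': the event was opened in per-thread mode.
    if not system_wide:
        return (f"Invalid event ({event}) in per-thread mode, "
                "enable system wide with '-a'.")
    return None


if __name__ == "__main__":
    print(ibs_open_error(errno.EINVAL, "ibs_op//u", True, False))
    print(ibs_open_error(errno.EINVAL, "ibs_op//", False, False))
```

Returning None for anything else leaves the generic
sys_perf_event_open() message in place for errors this sketch does not
cover.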

Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: João Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Sandipan Das <sandipan.das@amd.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220322221517.2510440-12-eranian@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:55:58 -03:00
Arnaldo Carvalho de Melo
b58230de3c perf python: Add perf_env stubs that will be needed in evsel__open_strerror()
The AMD IBS error message enhancements will use these, but we're not
using evsel__open_strerror() in the python binding so far.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:55:57 -03:00
Wei Li
ae0f4eb34f perf tools: Enhance the matching of sub-commands abbreviations
We support the short commands 'rec*' for 'record' and 'rep*' for 'report'
in lots of sub-commands, but the matching is currently not strict enough.

This can be puzzling: if we mistype 'report' as 'recport', perf silently
performs 'record' instead.

To fix this, add a check to ensure that the short command is a valid
prefix of the real command.

Committer testing:

  [root@quaco ~]# perf c2c re sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  # perf c2c rec sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.038 MB perf.data (16 samples) ]
  # perf c2c recport sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  # perf c2c record sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.038 MB perf.data (15 samples) ]
  # perf c2c records sleep 1

   Usage: perf c2c {record|report}

      -v, --verbose         be more verbose (show counter open errors, etc)

  #
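
The behaviour shown in the committer testing can be sketched as follows.
This is an illustrative Python model only; perf itself is written in C
and the helper names old_match() and match_subcommand() are
hypothetical. The minimum length of three characters is an assumption
inferred from the transcript above, where 're' is rejected and 'rec' is
accepted.

```python
def old_match(arg):
    """Model of the loose matching described above: only the leading
    'rec' or 'rep' is checked, so trailing junk is silently accepted."""
    if arg.startswith("rec"):
        return "record"
    if arg.startswith("rep"):
        return "report"
    return None


def match_subcommand(arg, commands=("record", "report"), min_len=3):
    """Return the command that `arg` abbreviates, or None.

    `arg` must be at least `min_len` characters long (an assumption)
    and must itself be a prefix of exactly one full command name.
    """
    if len(arg) < min_len:
        return None
    matches = [cmd for cmd in commands if cmd.startswith(arg)]
    return matches[0] if len(matches) == 1 else None


if __name__ == "__main__":
    for arg in ("re", "rec", "recport", "record", "records"):
        print(arg, "->", match_subcommand(arg))
```

The key change is the direction of the prefix test: the full command
must start with what the user typed, rather than the user's input
starting with a fixed three-letter stem.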

Signed-off-by: Wei Li <liwei391@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Rui Xiang <rui.xiang@huawei.com>
Link: http://lore.kernel.org/lkml/20220325092032.2956161-1-liwei391@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:55:57 -03:00
Shunsuke Nakamura
c2eeac9856 libperf tests: Fix typo in perf_evlist__open() failure error messages
This patch corrects typos in error messages. It should be "evlist", not
"evsel" as the function that fails is perf_evlist__open().

Fixes: 3ce311afb5 ("libperf: Move to tools/lib/perf")
Fixes: a7f3713f6b ("libperf tests: Add test_stat_multiplexing test")
Signed-off-by: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220325043829.224045-2-nakamura.shun@fujitsu.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:55:57 -03:00
Ali Saidi
1314376d49 tools arm64: Import cputype.h
Bring in the kernel's arch/arm64/include/asm/cputype.h into tools/
for arm64 to make use of all the core-type definitions in perf.

Replace sysreg.h with the version already imported into tools/.

Committer notes:

Added an entry to tools/perf/check-headers.sh, so that we get notified
when the original file in the kernel sources gets modified.

Tester notes:

LGTM. I did the testing on both my x86 and Arm64 platforms, thanks for
the fix-up.

Signed-off-by: Ali Saidi <alisaidi@amazon.com>
Tested-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrew Kilroy <andrew.kilroy@arm.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Li Huafei <lihuafei1@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick.Forrington@arm.com
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20220324183323.31414-2-alisaidi@amazon.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-26 10:53:45 -03:00
Ido Schimmel
b50d3b46f8 selftests: test_vxlan_under_vrf: Fix broken test case
The purpose of the last test case is to test VXLAN encapsulation and
decapsulation when the underlay lookup takes place in a non-default VRF.
This is achieved by enslaving the physical device of the tunnel to a
VRF.

The binding of the VXLAN UDP socket to the VRF happens when the VXLAN
device itself is opened, not when its physical device is opened. This
was also mentioned in the cited commit ("tests that moving the underlay
from a VRF to another works when down/up the VXLAN interface"), but the
test did something else.

Fix it by reopening the VXLAN device instead of its physical device.

Before:

 # ./test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
 Check VM connectivity through VXLAN (underlay in a VRF)            [FAIL]

After:

 # ./test_vxlan_under_vrf.sh
 Checking HV connectivity                                           [ OK ]
 Check VM connectivity through VXLAN (underlay in the default VRF)  [ OK ]
 Check VM connectivity through VXLAN (underlay in a VRF)            [ OK ]

Fixes: 03f1c26b1c ("test/net: Add script for VXLAN underlay in a VRF")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Link: https://lore.kernel.org/r/20220324200514.1638326-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-25 17:00:11 -07:00
Linus Torvalds
1464677662 platform-drivers-x86 for v5.18-1
Highlights:
 - new drivers:
   - AMD Host System Management Port (HSMP)
   - Intel Software Defined Silicon
 - removed drivers (functionality folded into other drivers):
   - intel_cht_int33fe_microb
   - surface3_button
 - amd-pmc:
   - s2idle bug-fixes
   - Support for AMD Spill to DRAM STB feature
 - hp-wmi:
   - Fix SW_TABLET_MODE detection method (and other fixes)
   - Support omen thermal profile policy v1
 - serial-multi-instantiate:
   - Add SPI device support
   - Add support for CS35L41 amplifiers used in new laptops
 - think-lmi:
   - sysfs-class-firmware-attributes Certificate authentication support
 - thinkpad_acpi:
   - Fixes + quirks
   - Add platform_profile support on AMD based ThinkPads
 - x86-android-tablets
   - Improve Asus ME176C / TF103C support
   - Support Nextbook Ares 8, Lenovo Tab 2 830 and 1050 tablets
 - Lots of various other small fixes and hardware-id additions
 
 The following is an automated git shortlog grouped by driver:
 
 ACPI / scan:
  -  Create platform device for CS35L41
 
 ACPI / x86:
  -  Add support for LPS0 callback handler
 
 ALSA:
  -  hda/realtek: Add support for HP Laptops
 
 Add AMD system management interface:
  - Add AMD system management interface
 
 Add Intel Software Defined Silicon driver:
  - Add Intel Software Defined Silicon driver
 
 Documentation:
  -  syfs-class-firmware-attributes: Lenovo Certificate support
  -  Add x86/amd_hsmp driver
 
 ISST:
  -  Fix possible circular locking dependency detected
 
 Input:
  -  soc_button_array - add support for Microsoft Surface 3 (MSHW0028) buttons
 
 Merge remote-tracking branch 'pdx86/platform-drivers-x86-pinctrl-pmu_clk' into review-hans-gcc12:
  - Merge remote-tracking branch 'pdx86/platform-drivers-x86-pinctrl-pmu_clk' into review-hans-gcc12
 
 Merge tag 'platform-drivers-x86-serial-multi-instantiate-1' into review-hans:
  - Merge tag 'platform-drivers-x86-serial-multi-instantiate-1' into review-hans
 
 Replace acpi_bus_get_device():
  - Replace acpi_bus_get_device()
 
 amd-pmc:
  -  Only report STB errors when STB enabled
  -  Drop CPU QoS workaround
  -  Output error codes in messages
  -  Move to later in the suspend process
  -  Validate entry into the deepest state on resume
  -  uninitialized variable in amd_pmc_s2d_init()
  -  Set QOS during suspend on CZN w/ timer wakeup
  -  Add support for AMD Spill to DRAM STB feature
  -  Correct usage of SMU version
  -  Make amd_pmc_stb_debugfs_fops static
 
 asus-tf103c-dock:
  -  Make 2 global structs static
 
 asus-wmi:
  -  Fix regression when probing for fan curve control
 
 hp-wmi:
  -  support omen thermal profile policy v1
  -  Changing bios_args.data to be dynamically allocated
  -  Fix 0x05 error code reported by several WMI calls
  -  Fix SW_TABLET_MODE detection method
  -  Fix hp_wmi_read_int() reporting error (0x05)
 
 huawei-wmi:
  -  check the return value of device_create_file()
 
 i2c-multi-instantiate:
  -  Rename it for a generic serial driver name
 
 int3472:
  -  Add terminator to gpiod_lookup_table
 
 intel-uncore-freq:
  -  fix uncore_freq_common_init() error codes
 
 intel_cht_int33fe:
  -  Move to intel directory
  -  Drop Lenovo Yogabook YB1-X9x code
  -  Switch to DMI modalias based loading
 
 intel_crystal_cove_charger:
  -  Fix IRQ masking / unmasking
 
 lg-laptop:
  -  Move setting of battery charge limit to common location
 
 pinctrl:
  -  baytrail: Add pinconf group + function for the pmu_clk
 
 platform/dcdbas:
  -  move EXPORT_SYMBOL after function
 
 platform/surface:
  -  Remove Surface 3 Button driver
  -  surface3-wmi: Simplify resource management
  -  Replace acpi_bus_get_device()
  -  Reinstate platform dependency
 
 platform/x86/intel-uncore-freq:
  -  Split common and enumeration part
 
 platform/x86/intel/uncore-freq:
  -  Display uncore current frequency
  -  Use sysfs API to create attributes
  -  Move to uncore-frequency folder
 
 selftests:
  -  sdsi: test sysfs setup
 
 serial-multi-instantiate:
  -  Add SPI support
  -  Reorganize I2C functions
 
 spi:
  -  Add API to count spi acpi resources
  -  Support selection of the index of the ACPI Spi Resource before alloc
  -  Create helper API to lookup ACPI info for spi device
  -  Make spi_alloc_device and spi_add_device public again
 
 surface:
  -  surface3_power: Fix battery readings on batteries without a serial number
 
 think-lmi:
  -  Certificate authentication support
 
 thinkpad_acpi:
  -  consistently check fan_get_status return.
  -  Don't use test_bit on an integer
  -  Fix compiler warning about uninitialized err variable
  -  clean up dytc profile convert
  -  Add PSC mode support
  -  Add dual fan probe
  -  Add dual-fan quirk for T15g (2nd gen)
  -  Fix incorrect use of platform profile on AMD platforms
  -  Add quirk for ThinkPads without a fan
 
 tools arch x86:
  -  Add Intel SDSi provisiong tool
 
 touchscreen_dmi:
  -  Add info for the RWC NANOTE P8 AY07J 2-in-1
 
 x86-android-tablets:
  -  Depend on EFI and SPI
  -  Lenovo Yoga Tablet 2 830/1050 sound support
  -  Workaround Lenovo Yoga Tablet 2 830/1050 poweroff hang
  -  Add Lenovo Yoga Tablet 2 830 / 1050 data
  -  Fix EBUSY error when requesting IOAPIC IRQs
  -  Minor charger / fuel-gauge improvements
  -  Add Nextbook Ares 8 data
  -  Add IRQ to Asus ME176C accelerometer info
  -  Add lid-switch gpio-keys pdev to Asus ME176C + TF103C
  -  Add x86_android_tablet_get_gpiod() helper
  -  Add Asus ME176C/TF103C charger and fuelgauge props
  -  Add battery swnode support
  -  Trivial typo fix for MODULE_AUTHOR
  -  Fix the buttons on CZC P10T tablet
  -  Constify the gpiod_lookup_tables arrays
  -  Add an init() callback to struct x86_dev_info
  -  Add support for disabling ACPI _AEI handlers
  -  Correct crystal_cove_charger module name

Merge tag 'platform-drivers-x86-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver updates from Hans de Goede:
  "New drivers:
    - AMD Host System Management Port (HSMP)
    - Intel Software Defined Silicon

  Removed drivers (functionality folded into other drivers):
    - intel_cht_int33fe_microb
    - surface3_button

  amd-pmc:
    - s2idle bug-fixes
    - Support for AMD Spill to DRAM STB feature

  hp-wmi:
    - Fix SW_TABLET_MODE detection method (and other fixes)
    - Support omen thermal profile policy v1

  serial-multi-instantiate:
    - Add SPI device support
    - Add support for CS35L41 amplifiers used in new laptops

  think-lmi:
    - sysfs-class-firmware-attributes Certificate authentication support

  thinkpad_acpi:
    - Fixes + quirks
    - Add platform_profile support on AMD based ThinkPads

  x86-android-tablets:
    - Improve Asus ME176C / TF103C support
    - Support Nextbook Ares 8, Lenovo Tab 2 830 and 1050 tablets

  Lots of various other small fixes and hardware-id additions"

* tag 'platform-drivers-x86-v5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (60 commits)
  platform/x86: think-lmi: Certificate authentication support
  Documentation: syfs-class-firmware-attributes: Lenovo Certificate support
  platform/x86: amd-pmc: Only report STB errors when STB enabled
  platform/x86: amd-pmc: Drop CPU QoS workaround
  platform/x86: amd-pmc: Output error codes in messages
  platform/x86: amd-pmc: Move to later in the suspend process
  ACPI / x86: Add support for LPS0 callback handler
  platform/x86: thinkpad_acpi: consistently check fan_get_status return.
  platform/x86: hp-wmi: support omen thermal profile policy v1
  platform/x86: hp-wmi: Changing bios_args.data to be dynamically allocated
  platform/x86: hp-wmi: Fix 0x05 error code reported by several WMI calls
  platform/x86: hp-wmi: Fix SW_TABLET_MODE detection method
  platform/x86: hp-wmi: Fix hp_wmi_read_int() reporting error (0x05)
  platform/x86: amd-pmc: Validate entry into the deepest state on resume
  platform/x86: thinkpad_acpi: Don't use test_bit on an integer
  platform/x86: thinkpad_acpi: Fix compiler warning about uninitialized err variable
  platform/x86: thinkpad_acpi: clean up dytc profile convert
  platform/x86: x86-android-tablets: Depend on EFI and SPI
  platform/x86: amd-pmc: uninitialized variable in amd_pmc_s2d_init()
  platform/x86: intel-uncore-freq: fix uncore_freq_common_init() error codes
  ...
2022-03-25 12:14:39 -07:00
Namhyung Kim
4bd9cab59f perf lock: Add -F/--field option to control output
The -F/--field option is to customize the list of fields to output:

  $ perf lock report -F contended,wait_max -k avg_wait
                  Name  contended   max wait (ns)   avg wait (ns)

        slock-AF_INET6          1           23543           23543
     &lruvec->lru_lock          5           18317           11254
        slock-AF_INET6          1           10379           10379
            rcu_node_1          1            2104            2104
   &dentry->d_lockr...          1            1844            1844
   &dentry->d_lockr...          1            1672            1672
      &newf->file_lock         15            2279            1025
   &dentry->d_lockr...          1             792             792

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220323230259.288494-3-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-25 15:24:50 -03:00
Namhyung Kim
64999e4402 perf lock: Extend struct lock_key to have print function
And use it to print output for each key field.  No functional change
intended and the output should be identical.
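
A minimal sketch of the idea, not the code from this patch: each key carries its own print callback, and a row is produced by walking the key table. All names here (lock_stat_demo, lock_key_demo, format_row) and the column widths are hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical per-lock statistics; field names are illustrative only. */
struct lock_stat_demo {
	const char *name;
	unsigned int nr_contended;
	unsigned long long wait_time_max;
};

/* A key describes one output column: its name, header and print callback. */
struct lock_key_demo {
	const char *name;
	const char *header;
	int (*print)(const struct lock_stat_demo *st, char *buf, size_t len);
};

static int print_contended(const struct lock_stat_demo *st, char *buf, size_t len)
{
	return snprintf(buf, len, "%10u", st->nr_contended);
}

static int print_wait_max(const struct lock_stat_demo *st, char *buf, size_t len)
{
	return snprintf(buf, len, "%15llu", st->wait_time_max);
}

static const struct lock_key_demo demo_keys[] = {
	{ "contended", "contended", print_contended },
	{ "wait_max", "max wait (ns)", print_wait_max },
};

/* Format one row by calling each key's print callback in turn. */
static int format_row(const struct lock_stat_demo *st, char *buf, size_t len)
{
	size_t used = 0;
	size_t i;

	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < sizeof(demo_keys) / sizeof(demo_keys[0]); i++) {
		int n = demo_keys[i].print(st, buf + used, len - used);

		if (n < 0 || (size_t)n >= len - used)
			return -1;	/* output would not fit */
		used += (size_t)n;
	}
	return (int)used;
}
```

With a table like this, selecting or reordering columns only means choosing which entries of the key table to walk.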

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220323230259.288494-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-25 15:24:16 -03:00
Namhyung Kim
67b61f59a6 perf lock: Add --synth=no option for record
The perf lock command has nothing to symbolize and lock names come
from the tracepoint.  Moreover, kernel symbols are available even when the
--synth=no option is given.

This will reduce the startup time by avoiding unnecessary synthesis.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220323230259.288494-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-25 15:24:03 -03:00
Linus Torvalds
29c8c18363 Merge branch 'akpm' (patches from Andrew)
Merge yet more updates from Andrew Morton:
 "This is the material which was staged after willystuff in linux-next.

  Subsystems affected by this patch series: mm (debug, selftests,
  pagecache, thp, rmap, migration, kasan, hugetlb, pagemap, madvise),
  and selftests"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (113 commits)
  selftests: kselftest framework: provide "finished" helper
  mm: madvise: MADV_DONTNEED_LOCKED
  mm: fix race between MADV_FREE reclaim and blkdev direct IO read
  mm: generalize ARCH_HAS_FILTER_PGPROT
  mm: unmap_mapping_range_tree() with i_mmap_rwsem shared
  mm: warn on deleting redirtied only if accounted
  mm/huge_memory: remove stale locking logic from __split_huge_pmd()
  mm/huge_memory: remove stale page_trans_huge_mapcount()
  mm/swapfile: remove stale reuse_swap_page()
  mm/khugepaged: remove reuse_swap_page() usage
  mm/huge_memory: streamline COW logic in do_huge_pmd_wp_page()
  mm: streamline COW logic in do_swap_page()
  mm: slightly clarify KSM logic in do_swap_page()
  mm: optimize do_wp_page() for fresh pages in local LRU pagevecs
  mm: optimize do_wp_page() for exclusive pages in the swapcache
  mm/huge_memory: make is_transparent_hugepage() static
  userfaultfd/selftests: enable hugetlb remap and remove event testing
  selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test
  mm: enable MADV_DONTNEED for hugetlb mappings
  kasan: disable LOCKDEP when printing reports
  ...
2022-03-25 10:21:20 -07:00
Linus Torvalds
aa5b537b0e RISC-V Patches for the 5.18 Merge Window, Part 1
 * Support for Sv57-based virtual memory.
 * Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.
 * An improved memmove() implementation.
 * Support for the new Sscofpmf and SBI PMU extensions, which allows for
   a much more useful perf implementation on RISC-V systems.
 * Support for restartable sequences.

Merge tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for Sv57-based virtual memory.

 - Various improvements for the MicroChip PolarFire SOC and the
   associated Icicle dev board, which should allow upstream kernels to
   boot without any additional modifications.

 - An improved memmove() implementation.

 - Support for the new Sscofpmf and SBI PMU extensions, which allows
   for a much more useful perf implementation on RISC-V systems.

 - Support for restartable sequences.

* tag 'riscv-for-linus-5.18-mw0' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (36 commits)
  rseq/selftests: Add support for RISC-V
  RISC-V: Add support for restartable sequence
  MAINTAINERS: Add entry for RISC-V PMU drivers
  Documentation: riscv: Remove the old documentation
  RISC-V: Add sscofpmf extension support
  RISC-V: Add perf platform driver based on SBI PMU extension
  RISC-V: Add RISC-V SBI PMU extension definitions
  RISC-V: Add a simple platform driver for RISC-V legacy perf
  RISC-V: Add a perf core library for pmu drivers
  RISC-V: Add CSR encodings for all HPMCOUNTERS
  RISC-V: Remove the current perf implementation
  RISC-V: Improve /proc/cpuinfo output for ISA extensions
  RISC-V: Do no continue isa string parsing without correct XLEN
  RISC-V: Implement multi-letter ISA extension probing framework
  RISC-V: Extract multi-letter extension names from "riscv, isa"
  RISC-V: Minimal parser for "riscv, isa" strings
  RISC-V: Correctly print supported extensions
  riscv: Fixed misaligned memory access. Fixed pointer comparison.
  MAINTAINERS: update riscv/microchip entry
  riscv: dts: microchip: add new peripherals to icicle kit device tree
  ...
2022-03-25 10:11:38 -07:00
Linus Torvalds
d710d370c4 s390 updates for the 5.18 merge window
 - Raise minimum supported machine generation to z10, which comes with
   various cleanups and code simplifications (usercopy/spectre
   mitigation/etc).
 
 - Rework extables and get rid of anonymous out-of-line fixups.
 
 - Page table helpers cleanup. Add set_pXd()/set_pte() helper
   functions. Convert pte_val()/pXd_val() macros to functions.
 
 - Optimize kretprobe handling by avoiding extra kprobe on
   __kretprobe_trampoline.
 
 - Add support for CEX8 crypto cards.
 
 - Allow to trigger AP bus rescan via writing to /sys/bus/ap/scans.
 
 - Add CONFIG_EXPOLINE_EXTERN option to build the kernel without COMDAT
   group sections which simplifies kpatch support.
 
 - Always use the packed stack layout and extend kernel unwinder tests.
 
 - Add sanity checks for ftrace code patching.
 
 - Add s390dbf debug log for the vfio_ap device driver.
 
 - Various virtual vs physical address confusion fixes.
 
 - Various small fixes and improvements all over the code.

Merge tag 's390-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Raise minimum supported machine generation to z10, which comes with
   various cleanups and code simplifications (usercopy/spectre
   mitigation/etc).

 - Rework extables and get rid of anonymous out-of-line fixups.

 - Page table helpers cleanup. Add set_pXd()/set_pte() helper functions.
   Convert pte_val()/pXd_val() macros to functions.

 - Optimize kretprobe handling by avoiding extra kprobe on
   __kretprobe_trampoline.

 - Add support for CEX8 crypto cards.

 - Allow to trigger AP bus rescan via writing to /sys/bus/ap/scans.

 - Add CONFIG_EXPOLINE_EXTERN option to build the kernel without COMDAT
   group sections which simplifies kpatch support.

 - Always use the packed stack layout and extend kernel unwinder tests.

 - Add sanity checks for ftrace code patching.

 - Add s390dbf debug log for the vfio_ap device driver.

 - Various virtual vs physical address confusion fixes.

 - Various small fixes and improvements all over the code.

* tag 's390-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (69 commits)
  s390/test_unwind: add kretprobe tests
  s390/kprobes: Avoid additional kprobe in kretprobe handling
  s390: convert ".insn" encoding to instruction names
  s390: assume stckf is always present
  s390/nospec: move to single register thunks
  s390: raise minimum supported machine generation to z10
  s390/uaccess: Add copy_from/to_user_key functions
  s390/nospec: align and size extern thunks
  s390/nospec: add an option to use thunk-extern
  s390/nospec: generate single register thunks if possible
  s390/pci: make zpci_set_irq()/zpci_clear_irq() static
  s390: remove unused expoline to BC instructions
  s390/irq: use assignment instead of cast
  s390/traps: get rid of magic cast for per code
  s390/traps: get rid of magic cast for program interruption code
  s390/signal: fix typo in comments
  s390/asm-offsets: remove unused defines
  s390/test_unwind: avoid build warning with W=1
  s390: remove .fixup section
  s390/bpf: encode register within extable entry
  ...
2022-03-25 10:01:34 -07:00
Linus Torvalds
1f1c153e40 powerpc updates for 5.18
  - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.
 
  - Add support for livepatch to 32-bit.
 
  - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
 
  - Merge vdso64 and vdso32 into a single directory.
 
  - Fix build errors with newer binutils.
 
  - Add support for UADDR64 relocations, which are emitted by some toolchains. This allows
    powerpc to build with the latest lld.
 
  - Fix (another) potential userspace r13 corruption in transactional memory handling.
 
  - Cleanups of function descriptor handling & related fixes to LKDTM.
 
 Thanks to: Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh Kumar K.V, Anton
 Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar Chowdhury, Cédric Le Goater, Chen
 Jingwen, Christophe JAILLET, Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel
 Henrique Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu Hua, Haren
 Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason Wang, Jeremy Kerr, Joachim
 Wiberg, Jordan Niethe, Julia Lawall, Kajol Jain, Kees Cook, Laurent Dufour, Madhavan
 Srinivasan, Mamatha Inamdar, Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal
 Suchanek, Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin, Nour-eddine
 Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy Dunlap, Ritesh Harjani, Rohan
 McLure, Russell Currey, Sachin Sant, Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain,
 Thierry Reding, Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean, Wedson
 Almeida Filho, YueHaibing.

Merge tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux

Pull powerpc updates from Michael Ellerman:
 "Livepatch support for 32-bit is probably the standout new feature,
  otherwise mostly just lots of bits and pieces all over the board.

  There's a series of commits cleaning up function descriptor handling,
  which touches a few other arches as well as LKDTM. It has acks from
  Arnd, Kees and Helge.

  Summary:

   - Enforce kernel RO, and implement STRICT_MODULE_RWX for 603.

   - Add support for livepatch to 32-bit.

   - Implement CONFIG_DYNAMIC_FTRACE_WITH_ARGS.

   - Merge vdso64 and vdso32 into a single directory.

   - Fix build errors with newer binutils.

   - Add support for UADDR64 relocations, which are emitted by some
     toolchains. This allows powerpc to build with the latest lld.

   - Fix (another) potential userspace r13 corruption in transactional
     memory handling.

   - Cleanups of function descriptor handling & related fixes to LKDTM.

  Thanks to Abdul Haleem, Alexey Kardashevskiy, Anders Roxell, Aneesh
  Kumar K.V, Anton Blanchard, Arnd Bergmann, Athira Rajeev, Bhaskar
  Chowdhury, Cédric Le Goater, Chen Jingwen, Christophe JAILLET,
  Christophe Leroy, Corentin Labbe, Daniel Axtens, Daniel Henrique
  Barboza, David Dai, Fabiano Rosas, Ganesh Goudar, Guo Zhengkui, Hangyu
  Hua, Haren Myneni, Hari Bathini, Igor Zhbanov, Jakob Koschel, Jason
  Wang, Jeremy Kerr, Joachim Wiberg, Jordan Niethe, Julia Lawall, Kajol
  Jain, Kees Cook, Laurent Dufour, Madhavan Srinivasan, Mamatha Inamdar,
  Maxime Bizon, Maxim Kiselev, Maxim Kochetkov, Michal Suchanek,
  Nageswara R Sastry, Nathan Lynch, Naveen N. Rao, Nicholas Piggin,
  Nour-eddine Taleb, Paul Menzel, Ping Fang, Pratik R. Sampat, Randy
  Dunlap, Ritesh Harjani, Rohan McLure, Russell Currey, Sachin Sant,
  Segher Boessenkool, Shivaprasad G Bhat, Sourabh Jain, Thierry Reding,
  Tobias Waldekranz, Tyrel Datwyler, Vaibhav Jain, Vladimir Oltean,
  Wedson Almeida Filho, and YueHaibing"

* tag 'powerpc-5.18-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (179 commits)
  powerpc/pseries: Fix use after free in remove_phb_dynamic()
  powerpc/time: improve decrementer clockevent processing
  powerpc/time: Fix KVM host re-arming a timer beyond decrementer range
  powerpc/tm: Fix more userspace r13 corruption
  powerpc/xive: fix return value of __setup handler
  powerpc/64: Add UADDR64 relocation support
  powerpc: 8xx: fix a return value error in mpc8xx_pic_init
  powerpc/ps3: remove unneeded semicolons
  powerpc/64: Force inlining of prevent_user_access() and set_kuap()
  powerpc/bitops: Force inlining of fls()
  powerpc: declare unmodified attribute_group usages const
  powerpc/spufs: Fix build warning when CONFIG_PROC_FS=n
  powerpc/secvar: fix refcount leak in format_show()
  powerpc/64e: Tie PPC_BOOK3E_64 to PPC_FSL_BOOK3E
  powerpc: Move C prototypes out of asm-prototypes.h
  powerpc/kexec: Declare kexec_paca static
  powerpc/smp: Declare current_set static
  powerpc: Cleanup asm-prototypes.c
  powerpc/ftrace: Use STK_GOT in ftrace_mprofile.S
  powerpc/ftrace: Regroup PPC64 specific operations in ftrace_mprofile.S
  ...
2022-03-25 09:39:36 -07:00
Kees Cook
25fd2d41b5 selftests: kselftest framework: provide "finished" helper
Instead of having each test that wants to use ksft_exit() figure out the
internals of kselftest.h, add the helper ksft_finished(), which checks that
the passes, xfails, and skips add up to the test plan count.
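
The check described above can be sketched in isolation as follows. This is an illustration under stated assumptions, not the kselftest.h implementation: the struct, the function name and the exit-code constants are all hypothetical stand-ins.

```c
#include <assert.h>

/* Hypothetical counters mirroring the kinds of results a test run tracks. */
struct demo_counts {
	unsigned int pass, fail, xfail, xpass, skip, error;
};

/* Exit codes assumed for this sketch only. */
enum { DEMO_EXIT_PASS = 0, DEMO_EXIT_FAIL = 1 };

/*
 * The run counts as finished successfully only if every planned test
 * ended as a pass, an expected failure, or a skip.
 */
static int demo_finished_code(const struct demo_counts *c, unsigned int plan)
{
	assert(c);
	if (c->pass + c->xfail + c->skip == plan)
		return DEMO_EXIT_PASS;
	return DEMO_EXIT_FAIL;
}
```

A test program would call such a helper once at the end, instead of comparing the counters against the plan by hand.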

Link: https://lkml.kernel.org/r/20220201013717.2464392-1-keescook@chromium.org
Signed-off-by: Kees Cook <keescook@chromium.org>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:51 -07:00
Mike Kravetz
9ae8f2b849 userfaultfd/selftests: enable hugetlb remap and remove event testing
With MADV_DONTNEED support added to hugetlb mappings, mremap testing can
also be enabled for hugetlb.

Modify the tests to use madvise MADV_DONTNEED and MADV_REMOVE instead of
fallocate hole punch for releasing hugetlb pages.

Link: https://lkml.kernel.org/r/20220215002348.128823-4-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:50 -07:00
Mike Kravetz
c4b6cb8840 selftests/vm: add hugetlb madvise MADV_DONTNEED MADV_REMOVE test
Now that MADV_DONTNEED support for hugetlb is enabled, add corresponding
tests.  MADV_REMOVE has been enabled for some time, but no tests exist so
add them as well.

Link: https://lkml.kernel.org/r/20220215002348.128823-3-mike.kravetz@oracle.com
Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Mina Almasry <almasrymina@google.com>
Cc: Naoya Horiguchi <naoya.horiguchi@linux.dev>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:50 -07:00
Mike Rapoport
6f6a841fb7 selftest/vm: add helpers to detect PAGE_SIZE and PAGE_SHIFT
PAGE_SIZE is not 4096 in many configurations; in particular, ppc64 uses 64K
pages in the majority of cases.

Add helpers to detect PAGE_SIZE and PAGE_SHIFT dynamically.

Without this, tests that read /proc/self/pagemap are broken:

    if (pread(pagemap_fd, ent, sizeof(ent),
              (uintptr_t)ptr >> (PAGE_SHIFT - 3)) != sizeof(ent))
              err(2, "read pagemap");
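
One way such helpers could look is sketched below. This assumes sysconf(_SC_PAGESIZE) as the source of the page size; the function names and the 4096 fallback are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <unistd.h>

/* Derive the shift from a page size; page sizes are powers of two. */
static unsigned int demo_page_shift(unsigned long page_size)
{
	unsigned int shift = 0;

	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	while ((1UL << shift) < page_size)
		shift++;
	return shift;
}

/* Ask the system for the page size at run time. */
static unsigned long demo_page_size(void)
{
	long sz = sysconf(_SC_PAGESIZE);

	/* The 4096 fallback is an assumption made for this sketch. */
	return sz > 0 ? (unsigned long)sz : 4096UL;
}

/* Byte offset in /proc/self/pagemap of the 8-byte entry covering addr. */
static unsigned long long demo_pagemap_offset(unsigned long addr,
					      unsigned int shift)
{
	return ((unsigned long long)addr >> shift) * 8ULL;
}
```

With a hard-coded shift of 12, an address on a 64K-page system maps to the wrong pagemap entry, which is the breakage described above.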

Link: https://lkml.kernel.org/r/20220307054355.149820-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Aneesh Kumar K.V
90647d9d72 selftest/vm: add util.h and move helper functions there
Avoid code duplication by adding util.h.  No functional change in this
patch.

Link: https://lkml.kernel.org/r/20220307054355.149820-1-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
9c8a0a8e59 tools/vm/page_owner_sort.c: support for user-defined culling rules
When viewing page owner information, we may want to cull blocks of
information with our own rules.  The culling function is therefore
enhanced to support user-defined culling rules.  The following adjustments
are made:

1. Add --cull option to support the culling of blocks of information
   with user-defined culling rules.

	./page_owner_sort <input> <output> --cull=<rules>
	./page_owner_sort <input> <output> --cull <rules>

  <rules> is a single argument in the form of a comma-separated list to
  specify individual culling rules, by the sequence of keys k1,k2, ....
  Mixed use of abbreviated and complete-form of keys is allowed.

  For reference, please see the document(Documentation/vm/page_owner.rst).

Now, assuming two blocks in the input file are as follows:

	Page allocated via order 0, mask xxxx, pid 1, tgid 1 (task_name_demo)
	PFN xxxx
	 prep_new_page+0xd0/0xf8
	 get_page_from_freelist+0x4a0/0x1290
	 __alloc_pages+0x168/0x340
	 alloc_pages+0xb0/0x158

	Page allocated via order 0, mask xxxx, pid 32, tgid 32 (task_name_demo)
	PFN xxxx
	 prep_new_page+0xd0/0xf8
	 get_page_from_freelist+0x4a0/0x1290
	 __alloc_pages+0x168/0x340
	 alloc_pages+0xb0/0x158

If we want to cull the blocks by stacktrace and task command name, we can
use this command:

	./page_owner_sort <input> <output> --cull=stacktrace,name

The output would be like:

	2 times, 2 pages, task_comm_name: task_name_demo
	 prep_new_page+0xd0/0xf8
	 get_page_from_freelist+0x4a0/0x1290
	 __alloc_pages+0x168/0x340
	 alloc_pages+0xb0/0x158

As we can see, these two blocks are culled into one, because they share
the same stacktrace and task command name.

However, if we want to cull the blocks by pid, stacktrace and task command
name, we can use this command:

	./page_owner_sort <input> <output> --cull=stacktrace,name,pid

The output would be like:

	1 times, 1 pages, PID 1, task_comm_name: task_name_demo
	 prep_new_page+0xd0/0xf8
	 get_page_from_freelist+0x4a0/0x1290
	 __alloc_pages+0x168/0x340
	 alloc_pages+0xb0/0x158

	1 times, 1 pages, PID 32, task_comm_name: task_name_demo
	 prep_new_page+0xd0/0xf8
	 get_page_from_freelist+0x4a0/0x1290
	 __alloc_pages+0x168/0x340
	 alloc_pages+0xb0/0x158

As we can see, these two blocks are not culled, because their PIDs are
different.

2. Add explanations of --cull options to the document.
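
As an illustration of how a comma-separated <rules> argument might be turned into a set of culling keys, here is a minimal sketch. It handles only the three full-form keys used in the examples above; the flag names and the function are hypothetical, and the abbreviated key forms are left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical flag bits, one per culling key. */
enum {
	DEMO_CULL_STACKTRACE = 1 << 0,
	DEMO_CULL_NAME       = 1 << 1,
	DEMO_CULL_PID        = 1 << 2,
};

/* Parse "k1,k2,..." into a bitmask; returns -1 on an unrecognised key. */
static int demo_parse_cull(const char *rules)
{
	int mask = 0;

	assert(rules);
	while (*rules) {
		/* Length of the current key, up to the next comma or the end. */
		size_t len = strcspn(rules, ",");

		if (len == 10 && !strncmp(rules, "stacktrace", len))
			mask |= DEMO_CULL_STACKTRACE;
		else if (len == 4 && !strncmp(rules, "name", len))
			mask |= DEMO_CULL_NAME;
		else if (len == 3 && !strncmp(rules, "pid", len))
			mask |= DEMO_CULL_PID;
		else
			return -1;

		rules += len;
		if (*rules == ',')
			rules++;
	}
	return mask;
}
```

Two blocks are then merged only when they agree on every field whose bit is set, which is why adding pid to the rules keeps the PID 1 and PID 32 blocks apart.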

This work is coauthored by
	Yixuan Cao
	Shenghong Han
	Yinan Zhang
	Chongxi Zhao
	Yuhong Feng

Link: https://lkml.kernel.org/r/20220312145834.624-1-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Shenghong Han <hanshenghong2019@email.szu.edu.cn>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Cc: Yuhong Feng <yuhongf@szu.edu.cn>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
8ea8613a61 tools/vm/page_owner_sort.c: support for selecting by PID, TGID or task command name
When viewing page owner information, we may also need to select the blocks
by PID, TGID or task command name, which helps to get more accurate page
allocation information as needed.

Therefore, the following adjustments are made:

1. Add three new options, --pid, --tgid and --name, to support
   the selection of information blocks by a specific pid, tgid or task
   command name.  In addition, multiple options may be used at
   the same time.

	./page_owner_sort [input] [output] --pid <PID>
	./page_owner_sort [input] [output] --tgid <TGID>
	./page_owner_sort [input] [output] --name <TASK_COMMAND_NAME>

   Assume a scenario in which a multi-threaded program, ./demo (PID =
   5280), is running and has created a child process (PID = 5281).

	$ ps
	PID   TTY        TIME   CMD
	5215  pts/0    00:00:00  bash
	5280  pts/0    00:00:00  ./demo
	5281  pts/0    00:00:00  ./demo
	5282  pts/0    00:00:00  ps

   When debugging the parent process, it would be better to select only
   the records with tgid=5280 and the task name "demo".  The specific
   usage is

	./page_owner_sort [input] [output] --tgid 5280 --name demo

2. Add explanations of three new options, including --pid, --tgid and
   --name, to the document.
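
A minimal sketch of how several selection options could be combined: a block is kept only if it matches every option that was given. The struct layouts, the function name and the "negative or NULL means unset" convention are assumptions made for illustration.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical subset of the fields recorded for each block. */
struct demo_sel_block {
	int pid;
	int tgid;
	const char *comm;
};

/* A negative pid/tgid or a NULL comm means "do not filter on this field". */
struct demo_sel_filter {
	int pid;
	int tgid;
	const char *comm;
};

/* Return 1 if the block passes every filter that is set, 0 otherwise. */
static int demo_block_selected(const struct demo_sel_block *b,
			       const struct demo_sel_filter *f)
{
	assert(b && f);
	if (f->pid >= 0 && b->pid != f->pid)
		return 0;
	if (f->tgid >= 0 && b->tgid != f->tgid)
		return 0;
	if (f->comm && strcmp(b->comm, f->comm) != 0)
		return 0;
	return 1;
}
```

For the scenario above, a filter with tgid 5280 and name "demo" keeps the parent's records and drops those of the child with tgid 5281.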

This work is coauthored by
	Shenghong Han <hanshenghong2019@email.szu.edu.cn>,
	Yixuan Cao <caoyixuan2019@email.szu.edu.cn>,
	Yinan Zhang <zhangyinan2019@email.szu.edu.cn>,
	Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>,
	Yuhong Feng <yuhongf@szu.edu.cn>.

Link: https://lkml.kernel.org/r/1646835223-7584-1-git-send-email-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
194d52d771 tools/vm/page_owner_sort: support for sorting by task command name
When viewing page owner information, we may also need the blocks to be
sorted by task command name.  Therefore, the following adjustments are
made:

1. Add a member variable to record task command name of block.

2. Add a new -n option to sort the information of blocks by task command
   name.

3. Add -n option explanation in the document.

Link: https://lkml.kernel.org/r/20220306030640.43054-2-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: <zhaochongxi2019@email.szu.edu.cn>
Cc: <hanshenghong2019@email.szu.edu.cn>
Cc: <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
578d8f2761 tools/vm/page_owner_sort: fix three trivial places
The following adjustments are made:

1. Instead of using another array to cull the blocks after sorting,
   reuse the old array.  So there is no need to malloc a new array.

2. When the '-f' option is enabled to filter out the blocks which have
   been released, only add those that have not been released to the list,
   rather than adding all of the blocks to the list and then doing the
   filtering when printing the result.

3. When the '-c' option is enabled to cull the blocks by comparing
   stacktrace, print the stacktrace rather than the whole block.

Link: https://lkml.kernel.org/r/20220306030640.43054-1-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: <hanshenghong2019@email.szu.edu.cn>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: <zhangyinan2019@email.szu.edu.cn>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
cf3c2c8678 tools/vm/page_owner_sort.c: support sorting by tgid and update documentation
When reading the "page owner" information, it is useful to have it sorted
by TGID.

As a result, the following adjustments have been made:

1. Add a new -P option to sort the information of blocks by TGID in
   ascending order.

2. Adjust the order of member variables in the block_list struct to avoid
   a 4-byte hole.

3. Add -P option explanation in the document.
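
To illustrate the kind of hole that member reordering removes, here is a small sketch. The two structs are hypothetical and do not reproduce the actual block_list layout; the point is only that interleaving pointers and ints can leave padding on typical 64-bit ABIs, while grouping members of the same size avoids it.

```c
#include <assert.h>
#include <stddef.h>

/* An int between two pointers may be followed by padding on 64-bit ABIs. */
struct demo_holey {
	char *txt;
	int len;
	char *stacktrace;
	int pid;
};

/* Grouping the pointers first lets the two ints share one aligned slot. */
struct demo_packed {
	char *txt;
	char *stacktrace;
	int len;
	int pid;
};

/* Reordering must never make the struct larger. */
static_assert(sizeof(struct demo_packed) <= sizeof(struct demo_holey),
	      "reordering must not grow the struct");

/* Number of bytes saved per element by the reordered layout. */
static size_t demo_bytes_saved(void)
{
	return sizeof(struct demo_holey) - sizeof(struct demo_packed);
}
```

The saving is ABI-dependent: it may be zero where pointers and ints have the same size, so the sketch only asserts that the reordered struct is not larger.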

Link: https://lkml.kernel.org/r/20220301151438.166118-3-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Jiajian Ye
56465a3830 tools/vm/page_owner_sort.c: add a security check
Add a security check after using malloc() to allocate memory.

Link: https://lkml.kernel.org/r/20220301151438.166118-2-yejiajian2018@email.szu.edu.cn
Signed-off-by: Jiajian Ye <yejiajian2018@email.szu.edu.cn>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:45 -07:00
Yixuan Cao
49e495a015 tools/vm/page_owner_sort.c: fix the instructions for use
I noticed a discrepancy between the usage method and the code logic.

If we enable the -f option, it should be "Filter out the information of
blocks whose memory has been released".

Link: https://lkml.kernel.org/r/20220219143106.2805-1-caoyixuan2019@email.szu.edu.cn
Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Yixuan Cao
41ed64347b tools/vm/page_owner_sort.c: delete invalid duplicate code
I noticed that there are two invalid lines of duplicate code.  It's better
to delete them.

Link: https://lkml.kernel.org/r/20211213095743.3630-1-caoyixuan2019@email.szu.edu.cn
Signed-off-by: Yixuan Cao <caoyixuan2019@email.szu.edu.cn>
Cc: Mark Brown <broonie@kernel.org>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Shenghong Han
e7a3f67769 tools/vm/page_owner_sort.c: two trivial fixes
1) There is an unused variable. It's better to delete it.
2) One case is missing in the usage().

Link: https://lkml.kernel.org/r/20211213164518.2461-1-hanshenghong2019@email.szu.edu.cn
Signed-off-by: Shenghong Han <hanshenghong2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Chongxi Zhao
8f9c447e2e tools/vm/page_owner_sort.c: support sorting pid and time
When viewing the page owner information, we expect that the information
can be sorted by PID, so that we can quickly combine PID with the program
to check the information together.

We also expect that the information can be sorted by time.  Time sorting
helps to view the running status of the program according to the time
interval when the program hangs up.

Finally, we hope that page_owner_sort.c can reduce part of the output and
only output the information of blocks whose memory has not been released,
which lets us locate the problem in the program faster.
Therefore, the following adjustments have been made:

1. Add the static functions search_pattern and check_regcomp to
   improve the cleanliness.

2. Add member attributes and their corresponding sorting methods.  When
   comparing times, returning the difference as an int would overflow
   because the unsigned long long values are too large, so the ternary
   operator is used instead.

3. Add the -f parameter to filter out the information of blocks whose
   memory has not been released
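
The overflow concern in item 2 can be illustrated with a minimal sketch.
The struct and function names below are hypothetical and are not taken from
page_owner_sort.c; the sketch only assumes that timestamps are stored as
unsigned long long and compared through a qsort()-style callback.  Returning
the subtraction result as an int would truncate large differences, so the
comparator returns -1, 0 or 1 through ternary expressions.

```c
#include <stdlib.h>

/* Hypothetical record type; the real tool's structure differs. */
struct block {
	unsigned long long ts_nsec;	/* allocation timestamp in ns */
};

/*
 * qsort() comparator: never subtracts the two 64-bit values, so the
 * result cannot be truncated when it is narrowed to int.
 */
int compare_ts(const void *p1, const void *p2)
{
	const struct block *l1 = p1;
	const struct block *l2 = p2;

	return l1->ts_nsec < l2->ts_nsec ? -1 :
	       (l1->ts_nsec > l2->ts_nsec ? 1 : 0);
}

/* Sort an array of blocks by ascending timestamp. */
void sort_by_ts(struct block *v, size_t n)
{
	qsort(v, n, sizeof(*v), compare_ts);
}
```

With the two timestamps quoted elsewhere in this log (1578829190639010 and
1354113726046100), the difference is far larger than INT_MAX, which is the
case the ternary form is meant to handle.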

Link: https://lkml.kernel.org/r/20211206165653.5093-1-zhaochongxi2019@email.szu.edu.cn
Signed-off-by: Chongxi Zhao <zhaochongxi2019@email.szu.edu.cn>
Reviewed-by: Sean Anderson <seanga2@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Yinan Zhang
cd75ea0e32 tools/vm/page_owner_sort.c: add switch between culling by stacktrace and txt
Culling by comparing stacktrace would cause loss of some information.  For
example, if there exist 2 blocks which have the same stacktrace but
different head info

  Page allocated via order 0, mask 0x108c48(...), pid 73696,
    ts 1578829190639010 ns, free_ts 1576583851324450 ns
    prep_new_page+0x80/0xb8
    get_page_from_freelist+0x924/0xee8
    __alloc_pages+0x138/0xc18
    alloc_pages+0x80/0xf0
    __page_cache_alloc+0x90/0xc8

  Page allocated via order 0, mask 0x108c48(...), pid 61806,
    ts 1354113726046100 ns, free_ts 1354104926841400 ns
    prep_new_page+0x80/0xb8
    get_page_from_freelist+0x924/0xee8
    __alloc_pages+0x138/0xc18
    alloc_pages+0x80/0xf0
    __page_cache_alloc+0x90/0xc8

After culling, it would be like this

  2 times, 2 pages:
  Page allocated via order 0, mask 0x108c48(...), pid 73696,
    ts 1578829190639010 ns, free_ts 1576583851324450 ns
    prep_new_page+0x80/0xb8
    get_page_from_freelist+0x924/0xee8
    __alloc_pages+0x138/0xc18
    alloc_pages+0x80/0xf0
    __page_cache_alloc+0x90/0xc8

The info of the second block is lost.  So, add -c to turn on culling by
stacktrace.  By default, it will cull by txt.
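
The difference between the two culling modes can be sketched as follows.
This is an illustration under stated assumptions, not the tool's code: the
names cmp_txt, cmp_stack and cull_blocks are hypothetical, blocks are plain
C strings whose first line is the head info, and the input is assumed to be
already sorted with the same comparator.

```c
#include <stddef.h>
#include <string.h>

typedef int (*block_cmp)(const char *, const char *);

/* Compare the whole block text, head info included. */
int cmp_txt(const char *a, const char *b)
{
	return strcmp(a, b);
}

/* Compare only what follows the first line (assumed to be the stack). */
int cmp_stack(const char *a, const char *b)
{
	const char *sa = strchr(a, '\n');
	const char *sb = strchr(b, '\n');

	return strcmp(sa ? sa + 1 : "", sb ? sb + 1 : "");
}

/*
 * Collapse runs of adjacent blocks that compare equal under cmp.
 * The first block of each run is kept and counts[] records how many
 * blocks were merged into it.  Returns the number of remaining groups.
 */
size_t cull_blocks(const char **blocks, unsigned *counts, size_t n,
		   block_cmp cmp)
{
	size_t out = 0;

	for (size_t i = 0; i < n; i++) {
		if (out > 0 && cmp(blocks[out - 1], blocks[i]) == 0) {
			counts[out - 1]++;
		} else {
			blocks[out] = blocks[i];
			counts[out] = 1;
			out++;
		}
	}
	return out;
}
```

Passing cmp_stack merges the two example blocks above into one group and
keeps only the first head line, which is the information loss described;
passing cmp_txt keeps them as two separate groups.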

Link: https://lkml.kernel.org/r/20211129145658.2491-1-zhangyinan2019@email.szu.edu.cn
Signed-off-by: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Cc: Changhee Han <ch0.han@lge.com>
Cc: Sean Anderson <seanga2@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Sean Anderson
82f5ebc2be tools/vm/page_owner_sort.c: support sorting by stack trace
This adds the ability to sort by stacktraces.  This is helpful when
comparing multiple dumps of page_owner taken at different times, since
blocks will not be reordered if they were allocated/free'd.

Link: https://lkml.kernel.org/r/20211124193709.1805776-2-seanga2@gmail.com
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Changhee Han <ch0.han@lge.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Sean Anderson
ba5a396be5 tools/vm/page_owner_sort.c: sort by stacktrace before culling
The contents of page_owner have changed to include more information than
the stack trace.  On a modern kernel, the blocks look like

  Page allocated via order 0, mask 0x0(), pid 1, ts 165564237 ns, free_ts 0 ns
    register_early_stack+0x4b/0x90
    init_page_owner+0x39/0x250
    kernel_init_freeable+0x11e/0x242
    kernel_init+0x16/0x130

Sorting by the contents of .txt will result in almost no repeated pages,
as the pid, ts, and free_ts will almost never be the same.  Instead,
sort by the contents of the stack trace, which we assume to be whatever
is after the first line.
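
A minimal sketch of "whatever is after the first line", assuming each block
is a NUL-terminated string.  The function names are hypothetical.  A block
without a newline is treated as having an empty stack trace so that the
comparison never dereferences a NULL pointer.

```c
#include <string.h>

/* Return a pointer to the text following the first line of a block. */
const char *stacktrace_of(const char *txt)
{
	const char *nl = strchr(txt, '\n');

	/* No newline means no stack trace: fall back to an empty string. */
	return nl ? nl + 1 : "";
}

/* Order two blocks by stack trace only, ignoring pid, ts and free_ts. */
int compare_stacktrace(const char *a, const char *b)
{
	return strcmp(stacktrace_of(a), stacktrace_of(b));
}
```

Two blocks that differ only in their first line compare equal under this
ordering, so they end up adjacent after sorting and can be counted together.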

[seanga2@gmail.com: fix NULL-pointer dereference when comparing stack traces]
  Link: https://lkml.kernel.org/r/20211125162653.1855958-1-seanga2@gmail.com

Link: https://lkml.kernel.org/r/20211124193709.1805776-1-seanga2@gmail.com
Signed-off-by: Sean Anderson <seanga2@gmail.com>
Cc: Changhee Han <ch0.han@lge.com>
Cc: Tang Bin <tangbin@cmss.chinamobile.com>
Cc: Zhang Shengju <zhangshengju@cmss.chinamobile.com>
Cc: Zhenliang Wei <weizhenliang@huawei.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Yinan Zhang <zhangyinan2019@email.szu.edu.cn>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-24 19:06:44 -07:00
Linus Torvalds
b9132c32e0 cxl for 5.18
- Add a driver for 'struct cxl_memdev' objects responsible for CXL.mem
   operation as distinct from 'cxl_pci' mailbox operations. Its primary
   responsibility is enumerating an endpoint 'struct cxl_port' and all the
   'struct cxl_port' instances between an endpoint and the CXL platform
   root.
 
 - Add a driver for 'struct cxl_port' objects responsible for enumerating
   and operating all Host-managed Device Memory (HDM) decoder resources
   between the platform-level CXL memory description, all intervening host
   bridges / switches, and the HDM resources in endpoints.
 
 - Update the cxl_pci driver to validate CXL.mem operation precursors to
   HDM decoder operation like ready-polling, and legacy CXL 1.1 DVSEC
   based CXL.mem configuration.
 
 - Add basic lockdep coverage for usage of device_lock() on CXL subsystem
   objects similar to what exists for LIBNVDIMM. Include a compile-time
   switch for which subsystem to validate at run-time.
 
 - Update cxl_test to emulate a one level switch topology.
 
 - Document a "Theory of Operation" for the subsystem.
 
 - Add 'numa_node' and 'serial' attributes to cxl_memdev sysfs
 
 - Include miscellaneous fixes for spec / QEMU CXL emulation
   compatibility and static analysis reports.

Merge tag 'cxl-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull CXL (Compute Express Link) updates from Dan Williams:
 "This development cycle extends the subsystem to discover CXL resources
  throughout a CXL/PCIe switch topology and respond to hot add/remove
  events anywhere in that topology.

  This is more foundational infrastructure in preparation for dynamic
  memory region provisioning support. Recall that CXL memory regions, as
  the new "Theory of Operation" section of
  Documentation/driver-api/cxl/memory-devices.rst describes, bring
  storage volume striping semantics to memory.

  The hot add/remove behavior is validated with extensions to the
  cxl_test unit test environment and this test in the cxl-cli test
  suite:

      https://github.com/pmem/ndctl/blob/djbw/for-74/cxl/test/cxl-topology.sh

  Summary:

   - Add a driver for 'struct cxl_memdev' objects responsible for
     CXL.mem operation as distinct from 'cxl_pci' mailbox operations.

     Its primary responsibility is enumerating an endpoint 'struct
     cxl_port' and all the 'struct cxl_port' instances between an
     endpoint and the CXL platform root.

   - Add a driver for 'struct cxl_port' objects responsible for
     enumerating and operating all Host-managed Device Memory (HDM)
     decoder resources between the platform-level CXL memory
     description, all intervening host bridges / switches, and the HDM
     resources in endpoints.

   - Update the cxl_pci driver to validate CXL.mem operation precursors
     to HDM decoder operation like ready-polling, and legacy CXL 1.1
     DVSEC based CXL.mem configuration.

   - Add basic lockdep coverage for usage of device_lock() on CXL
     subsystem objects similar to what exists for LIBNVDIMM. Include a
     compile-time switch for which subsystem to validate at run-time.

   - Update cxl_test to emulate a one level switch topology.

   - Document a "Theory of Operation" for the subsystem.

   - Add 'numa_node' and 'serial' attributes to cxl_memdev sysfs

   - Include miscellaneous fixes for spec / QEMU CXL emulation
     compatibility and static analysis reports"

* tag 'cxl-for-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (48 commits)
  cxl/core/port: Fix NULL but dereferenced coccicheck error
  cxl/port: Hold port reference until decoder release
  cxl/port: Fix endpoint refcount leak
  cxl/core: Fix cxl_device_lock() class detection
  cxl/core/port: Fix unregister_port() lock assertion
  cxl/regs: Fix size of CXL Capability Header Register
  cxl/core/port: Handle invalid decoders
  cxl/core/port: Fix / relax decoder target enumeration
  tools/testing/cxl: Add a physical_node link
  tools/testing/cxl: Enumerate mock decoders
  tools/testing/cxl: Mock one level of switches
  tools/testing/cxl: Fix root port to host bridge assignment
  tools/testing/cxl: Mock dvsec_ranges()
  cxl/core/port: Add endpoint decoders
  cxl/core: Move target_list out of base decoder attributes
  cxl/mem: Add the cxl_mem driver
  cxl/core/port: Add switch port enumeration
  cxl/memdev: Add numa_node attribute
  cxl/pci: Emit device serial number
  cxl/pci: Implement wait for media active
  ...
2022-03-24 18:07:03 -07:00
Linus Torvalds
52deda9551 Merge branch 'akpm' (patches from Andrew)
Merge more updates from Andrew Morton:
 "Various misc subsystems, before getting into the post-linux-next
  material.

  41 patches.

  Subsystems affected by this patch series: procfs, misc, core-kernel,
  lib, checkpatch, init, pipe, minix, fat, cgroups, kexec, kdump,
  taskstats, panic, kcov, resource, and ubsan"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (41 commits)
  Revert "ubsan, kcsan: Don't combine sanitizer with kcov on clang"
  kernel/resource: fix kfree() of bootmem memory again
  kcov: properly handle subsequent mmap calls
  kcov: split ioctl handling into locked and unlocked parts
  panic: move panic_print before kmsg dumpers
  panic: add option to dump all CPUs backtraces in panic_print
  docs: sysctl/kernel: add missing bit to panic_print
  taskstats: remove unneeded dead assignment
  kasan: no need to unset panic_on_warn in end_report()
  ubsan: no need to unset panic_on_warn in ubsan_epilogue()
  panic: unset panic_on_warn inside panic()
  docs: kdump: add scp example to write out the dump file
  docs: kdump: update description about sysfs file system support
  arm64: mm: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  x86/setup: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  riscv: mm: init: use IS_ENABLED(CONFIG_KEXEC_CORE) instead of #ifdef
  kexec: make crashk_res, crashk_low_res and crash_notes symbols always visible
  cgroup: use irqsave in cgroup_rstat_flush_locked().
  fat: use pointer to simple type in put_user()
  minix: fix bug when opening a file with O_DIRECT
  ...
2022-03-24 14:14:07 -07:00
Arnaldo Carvalho de Melo
d16d30f48c tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  fa31a4d669 ("x86/cpufeatures: Put the AMX macros in the word 18 block")
  7b8f40b3de ("x86/cpu: Add definitions for the Intel Hardware Feedback Interface")

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Jim Mattson <jmattson@google.com>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/lkml/YjzZPxdyLjf76gM+@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-24 17:49:35 -03:00
Arnaldo Carvalho de Melo
1efe4cbd7a tools headers cpufeatures: Sync with the kernel sources
To pick the changes in:

  7c1ef59145 ("x86/cpufeatures: Re-enable ENQCMD")

That causes only these 'perf bench' objects to rebuild:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses these perf build warnings:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/disabled-features.h' differs from latest version at 'arch/x86/include/asm/disabled-features.h'
  diff -u tools/arch/x86/include/asm/disabled-features.h arch/x86/include/asm/disabled-features.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Fenghua Yu <fenghua.yu@intel.com>
Link: https://lore.kernel.org/lkml/YjzX+PknzGoKaGMX@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-24 17:44:27 -03:00
Thomas Richter
d0a0a51149 perf stat: Fix forked applications enablement of counters
I have run into the following issue:

 # perf stat -a -e new_pmu/INSTRUCTION_7/ --  mytest -c1 7

 Performance counter stats for 'system wide':

                 0      new_pmu/INSTRUCTION_7/

       0.000366428 seconds time elapsed
 #

The new PMU for s390 counts the execution of certain CPU instructions.
The root cause is the extremely small run time of the mytest program. It
just executes some assembly instructions and then exits.

In the above invocation the instruction is executed exactly one time (-c1
option). The PMU is expected to report this one time execution by a
counter value of one, but fails to do so in some cases, not all.

Debugging reveals the invocation of the child process is done
*before* the counter events are installed and enabled.

Tracing reveals that sometimes the child process starts and exits before
the event is installed on all CPUs. The more CPUs the machine has, the
more often this miscount happens.

Fix this by starting the workload only after the events have been
installed on the specified CPUs. Now the comment also matches the
code.

Output after:

 # perf stat -a -e new_pmu/INSTRUCTION_7/ --  mytest -c1 7

 Performance counter stats for 'system wide':

                 1      new_pmu/INSTRUCTION_7/

       0.000366428 seconds time elapsed
 #

Now the correct result is reported reliably, regardless of how many CPUs
are online.
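
The ordering described above (prepare the child, enable the counters, only
then let the child run) can be sketched with a pipe used as a cork.  This is
an illustration only and does not reproduce perf's code: the function names
and the counters_enabled flag are hypothetical stand-ins, and the child
exits instead of executing a real workload.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the state of the real counter events. */
static int counters_enabled;

static void enable_counters_stub(void)
{
	counters_enabled = 1;
}

/*
 * Fork a child that blocks on a pipe until the parent releases it.
 * Returns the child's exit status: 0 if it was released after the
 * counters were enabled, non-zero otherwise, or -1 on error.
 */
int run_corked_workload(void)
{
	int go[2];
	int status;
	char token;
	pid_t pid;

	if (pipe(go) < 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		close(go[0]);
		close(go[1]);
		return -1;
	}

	if (pid == 0) {
		/* Child: prepared, but corked until the parent writes. */
		close(go[1]);
		if (read(go[0], &token, 1) != 1)
			_exit(2);
		/* A real workload would be executed here. */
		_exit(token == 1 ? 0 : 1);
	}

	/* Parent: enable the counters first, then uncork the child. */
	close(go[0]);
	enable_counters_stub();
	token = (char)counters_enabled;
	if (write(go[1], &token, 1) != 1) {
		close(go[1]);
		waitpid(pid, &status, 0);
		return -1;
	}
	close(go[1]);

	if (waitpid(pid, &status, 0) < 0)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child cannot proceed past read() until the parent writes, a
very short-lived workload cannot start and exit before the enable step.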

Reviewers' notes:

Jiri:

Right, without -a the event has enable_on_exec so the race does not
matter, but it's a problem for system wide with fork.

Namhyung:

Agreed. Also we may move the enable_counters() and the clock code out of
the if block to be shared with the else block.

Fixes: acf2892270 ("perf stat: Use perf_evlist__prepare/start_workload()")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20220317155346.577384-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-24 17:36:54 -03:00
Arnaldo Carvalho de Melo
61726144c9 tools arch x86: Sync the msr-index.h copy with the kernel sources
To pick up the changes from these csets:

  7b8f40b3de ("x86/cpu: Add definitions for the Intel Hardware Feedback Interface")

That causes no changes to tooling:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  $

Just silences this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/msr-index.h' differs from latest version at 'arch/x86/include/asm/msr-index.h'
  diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
Link: https://lore.kernel.org/lkml/YjzVt8CjAORAsTCo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-24 17:35:31 -03:00
Linus Torvalds
169e77764a Networking changes for 5.18.
Core
 ----
 
  - Introduce XDP multi-buffer support, allowing the use of XDP with
    jumbo frame MTUs and combination with Rx coalescing offloads (LRO).
 
  - Speed up netns dismantling (5x) and lower the memory cost a little.
    Remove unnecessary per-netns sockets. Scope some lists to a netns.
    Cut down RCU syncing. Use batch methods. Allow netdev registration
    to complete out of order.
 
  - Support distinguishing timestamp types (ingress vs egress) and
    maintaining them across packet scrubbing points (e.g. redirect).
 
  - Continue the work of annotating packet drop reasons throughout
    the stack.
 
  - Switch netdev error counters from an atomic to dynamically
    allocated per-CPU counters.
 
  - Rework a few preempt_disable(), local_irq_save() and busy waiting
    sections problematic on PREEMPT_RT.
 
  - Extend the ref_tracker to allow catching use-after-free bugs.
 
 BPF
 ---
 
  - Introduce "packing allocator" for BPF JIT images. JITed code is
    marked read only, and used to be allocated at page granularity.
    Custom allocator allows for more efficient memory use, lower
    iTLB pressure and prevents identity mapping huge pages from
    getting split.
 
  - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
    the correct probe read access method, add appropriate helpers.
 
  - Convert the BPF preload to use light skeleton and drop
    the user-mode-driver dependency.
 
  - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
    its use as a packet generator.
 
  - Allow local storage memory to be allocated with GFP_KERNEL if called
    from a hook allowed to sleep.
 
  - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
    bits to come later).
 
  - Add unstable conntrack lookup helpers for BPF by using the BPF
    kfunc infra.
 
  - Allow cgroup BPF progs to return custom errors to user space.
 
  - Add support for AF_UNIX iterator batching.
 
  - Allow iterator programs to use sleepable helpers.
 
  - Support JIT of add, and, or, xor and xchg atomic ops on arm64.
 
  - Add BTFGen support to bpftool which allows using CO-RE in kernels
    without BTF info.
 
  - Large number of libbpf API improvements, cleanups and deprecations.
 
 Protocols
 ---------
 
  - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.
 
  - Adjust TSO packet sizes based on min_rtt, allowing very low latency
    links (data centers) to always send full-sized TSO super-frames.
 
  - Make IPv6 flow label changes (AKA hash rethink) more configurable,
    via sysctl and setsockopt. Distinguish between server and client
    behavior.
 
  - Add VxLAN support for "collect metadata" devices to terminate only
    configured VNIs. This is similar to VLAN filtering in the bridge.
 
  - Support inserting IPv6 IOAM information to a fraction of frames.
 
  - Add protocol attribute to IP addresses to allow identifying where
    given address comes from (kernel-generated, DHCP etc.)
 
  - Support setting socket and IPv6 options via cmsg on ping6 sockets.
 
  - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
    Define dscp_t and stop taking ECN bits into account in fib-rules.
 
  - Add support for locked bridge ports (for 802.1X).
 
  - tun: support NAPI for packets received from batched XDP buffs,
    doubling the performance in some scenarios.
 
  - IPv6 extension header handling in Open vSwitch.
 
  - Support IPv6 control message load balancing in bonding, prevent
    neighbor solicitation and advertisement from using the wrong port.
    Support NS/NA monitor selection similar to existing ARP monitor.
 
  - SMC
    - improve performance with TCP_CORK and sendfile()
    - support auto-corking
    - support TCP_NODELAY
 
  - MCTP (Management Component Transport Protocol)
    - add user space tag control interface
    - I2C binding driver (as specified by DMTF DSP0237)
 
  - Multi-BSSID beacon handling in AP mode for WiFi.
 
  - Bluetooth:
    - handle MSFT Monitor Device Event
    - add MGMT Adv Monitor Device Found/Lost events
 
  - Multi-Path TCP:
    - add support for the SO_SNDTIMEO socket option
    - lots of selftest cleanups and improvements
 
  - Increase the max PDU size in CAN ISOTP to 64 kB.
 
 Driver API
 ----------
 
  - Add HW counters for SW netdevs, a mechanism for devices which
    offload packet forwarding to report packet statistics back to
    software interfaces such as tunnels.
 
  - Select the default NIC queue count as a fraction of number of
    physical CPU cores, instead of hard-coding to 8.
 
  - Expose devlink instance locks to drivers. Allow device layer of
    drivers to use that lock directly instead of creating their own
    which always runs into ordering issues in devlink callbacks.
 
  - Add header/data split indication to guide user space enabling
    of TCP zero-copy Rx.
 
  - Allow configuring completion queue event size.
 
  - Refactor page_pool to enable fragmenting after allocation.
 
  - Add allocation and page reuse statistics to page_pool.
 
  - Improve Multiple Spanning Trees support in the bridge to allow
    reuse of topologies across VLANs, saving HW resources in switches.
 
  - DSA (Distributed Switch Architecture):
    - replay and offload of host VLAN entries
    - offload of static and local FDB entries on LAG interfaces
    - FDB isolation and unicast filtering
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - LAN937x T1 PHYs
    - Davicom DM9051 SPI NIC driver
    - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
    - Microchip ksz8563 switches
    - Netronome NFP3800 SmartNICs
    - Fungible SmartNICs
    - MediaTek MT8195 switches
 
  - WiFi:
    - mt76: MediaTek mt7916
    - mt76: MediaTek mt7921u USB adapters
    - brcmfmac: Broadcom BCM43454/6
 
  - Mobile:
    - iosm: Intel M.2 7360 WWAN card
 
 Drivers
 -------
 
  - Convert many drivers to the new phylink API built for split PCS
    designs but also simplifying other cases.
 
  - Intel Ethernet NICs:
    - add TTY for GNSS module for E810T device
    - improve AF_XDP performance
    - GTP-C and GTP-U filter offload
    - QinQ VLAN support
 
  - Mellanox Ethernet NICs (mlx5):
    - support xdp->data_meta
    - multi-buffer XDP
    - offload tc push_eth and pop_eth actions
 
  - Netronome Ethernet NICs (nfp):
    - flow-independent tc action hardware offload (police / meter)
    - AF_XDP
 
  - Other Ethernet NICs:
    - at803x: fiber and SFP support
    - xgmac: mdio: preamble suppression and custom MDC frequencies
    - r8169: enable ASPM L1.2 if system vendor flags it as safe
    - macb/gem: ZynqMP SGMII
    - hns3: add TX push mode
    - dpaa2-eth: software TSO
    - lan743x: multi-queue, mdio, SGMII, PTP
    - axienet: NAPI and GRO support
 
  - Mellanox Ethernet switches (mlxsw):
    - source and dest IP address rewrites
    - RJ45 ports
 
  - Marvell Ethernet switches (prestera):
    - basic routing offload
    - multi-chain TC ACL offload
 
  - NXP embedded Ethernet switches (ocelot & felix):
    - PTP over UDP with the ocelot-8021q DSA tagging protocol
    - basic QoS classification on Felix DSA switch using dcbnl
    - port mirroring for ocelot switches
 
  - Microchip high-speed industrial Ethernet (sparx5):
    - offloading of bridge port flooding flags
    - PTP Hardware Clock
 
  - Other embedded switches:
    - lan966x: PTP Hardware Clock
    - qca8k: mdio read/write operations via crafted Ethernet packets
 
  - Qualcomm 802.11ax WiFi (ath11k):
    - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
    - enable RX PPDU stats in monitor co-exist mode
 
  - Intel WiFi (iwlwifi):
    - UHB TAS enablement via BIOS
    - band disablement via BIOS
    - channel switch offload
    - 32 Rx AMPDU sessions in newer devices
 
  - MediaTek WiFi (mt76):
    - background radar detection
    - thermal management improvements on mt7915
    - SAR support for more mt76 platforms
    - MBSSID and 6 GHz band on mt7915
 
  - RealTek WiFi:
    - rtw89: AP mode
    - rtw89: 160 MHz channels and 6 GHz band
    - rtw89: hardware scan
 
  - Bluetooth:
    - mt7921s: wake on Bluetooth, SCO over I2S, wide-band-speed (WBS)
 
  - Microchip CAN (mcp251xfd):
    - multiple RX-FIFOs and runtime configurable RX/TX rings
    - internal PLL, runtime PM handling simplification
    - improve chip detection and error handling after wakeup
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "The sprinkling of SPI drivers is because we added a new one and Mark
  sent us a SPI driver interface conversion pull request.

  Core
  ----

   - Introduce XDP multi-buffer support, allowing the use of XDP with
     jumbo frame MTUs and combination with Rx coalescing offloads (LRO).

   - Speed up netns dismantling (5x) and lower the memory cost a little.
     Remove unnecessary per-netns sockets. Scope some lists to a netns.
     Cut down RCU syncing. Use batch methods. Allow netdev registration
     to complete out of order.

   - Support distinguishing timestamp types (ingress vs egress) and
     maintaining them across packet scrubbing points (e.g. redirect).

   - Continue the work of annotating packet drop reasons throughout the
     stack.

   - Switch netdev error counters from an atomic to dynamically
     allocated per-CPU counters.

   - Rework a few preempt_disable(), local_irq_save() and busy waiting
     sections problematic on PREEMPT_RT.

   - Extend the ref_tracker to allow catching use-after-free bugs.

  BPF
  ---

   - Introduce "packing allocator" for BPF JIT images. JITed code is
     marked read only, and used to be allocated at page granularity.
     Custom allocator allows for more efficient memory use, lower iTLB
     pressure and prevents identity mapping huge pages from getting
     split.

   - Make use of BTF type annotations (e.g. __user, __percpu) to enforce
     the correct probe read access method, add appropriate helpers.

   - Convert the BPF preload to use light skeleton and drop the
     user-mode-driver dependency.

   - Allow XDP BPF_PROG_RUN test infra to send real packets, enabling
     its use as a packet generator.

   - Allow local storage memory to be allocated with GFP_KERNEL if
     called from a hook allowed to sleep.

   - Introduce fprobe (multi kprobe) to speed up mass attachment (arch
     bits to come later).

   - Add unstable conntrack lookup helpers for BPF by using the BPF
     kfunc infra.

   - Allow cgroup BPF progs to return custom errors to user space.

   - Add support for AF_UNIX iterator batching.

   - Allow iterator programs to use sleepable helpers.

   - Support JIT of add, and, or, xor and xchg atomic ops on arm64.

   - Add BTFGen support to bpftool which allows using CO-RE in kernels
     without BTF info.

   - Large number of libbpf API improvements, cleanups and deprecations.

  Protocols
  ---------

   - Micro-optimize UDPv6 Tx, gaining up to 5% in test on dummy netdev.

   - Adjust TSO packet sizes based on min_rtt, allowing very low latency
     links (data centers) to always send full-sized TSO super-frames.

   - Make IPv6 flow label changes (AKA hash rethink) more configurable,
     via sysctl and setsockopt. Distinguish between server and client
     behavior.

   - Add VxLAN support for "collect metadata" devices to terminate only
     configured VNIs. This is similar to VLAN filtering in the bridge.

   - Support inserting IPv6 IOAM information to a fraction of frames.

   - Add protocol attribute to IP addresses to allow identifying where
     given address comes from (kernel-generated, DHCP etc.)

   - Support setting socket and IPv6 options via cmsg on ping6 sockets.

   - Reject mis-use of ECN bits in IP headers as part of DSCP/TOS.
     Define dscp_t and stop taking ECN bits into account in fib-rules.

   - Add support for locked bridge ports (for 802.1X).

   - tun: support NAPI for packets received from batched XDP buffs,
     doubling the performance in some scenarios.

   - IPv6 extension header handling in Open vSwitch.

   - Support IPv6 control message load balancing in bonding, prevent
     neighbor solicitation and advertisement from using the wrong port.
     Support NS/NA monitor selection similar to existing ARP monitor.

   - SMC
      - improve performance with TCP_CORK and sendfile()
      - support auto-corking
      - support TCP_NODELAY

   - MCTP (Management Component Transport Protocol)
      - add user space tag control interface
      - I2C binding driver (as specified by DMTF DSP0237)

   - Multi-BSSID beacon handling in AP mode for WiFi.

   - Bluetooth:
      - handle MSFT Monitor Device Event
      - add MGMT Adv Monitor Device Found/Lost events

   - Multi-Path TCP:
      - add support for the SO_SNDTIMEO socket option
      - lots of selftest cleanups and improvements

   - Increase the max PDU size in CAN ISOTP to 64 kB.

  Driver API
  ----------

   - Add HW counters for SW netdevs, a mechanism for devices which
     offload packet forwarding to report packet statistics back to
     software interfaces such as tunnels.

   - Select the default NIC queue count as a fraction of number of
     physical CPU cores, instead of hard-coding to 8.

   - Expose devlink instance locks to drivers. Allow device layer of
     drivers to use that lock directly instead of creating their own
     which always runs into ordering issues in devlink callbacks.

   - Add header/data split indication to guide user space enabling of
     TCP zero-copy Rx.

   - Allow configuring completion queue event size.

   - Refactor page_pool to enable fragmenting after allocation.

   - Add allocation and page reuse statistics to page_pool.

   - Improve Multiple Spanning Trees support in the bridge to allow
     reuse of topologies across VLANs, saving HW resources in switches.

   - DSA (Distributed Switch Architecture):
      - replay and offload of host VLAN entries
      - offload of static and local FDB entries on LAG interfaces
      - FDB isolation and unicast filtering

  New hardware / drivers
  ----------------------

   - Ethernet:
      - LAN937x T1 PHYs
      - Davicom DM9051 SPI NIC driver
      - Realtek RTL8367S, RTL8367RB-VB switch and MDIO
      - Microchip ksz8563 switches
      - Netronome NFP3800 SmartNICs
      - Fungible SmartNICs
      - MediaTek MT8195 switches

   - WiFi:
      - mt76: MediaTek mt7916
      - mt76: MediaTek mt7921u USB adapters
      - brcmfmac: Broadcom BCM43454/6

   - Mobile:
      - iosm: Intel M.2 7360 WWAN card

  Drivers
  -------

   - Convert many drivers to the new phylink API built for split PCS
     designs but also simplifying other cases.

   - Intel Ethernet NICs:
      - add TTY for GNSS module for E810T device
      - improve AF_XDP performance
      - GTP-C and GTP-U filter offload
      - QinQ VLAN support

   - Mellanox Ethernet NICs (mlx5):
      - support xdp->data_meta
      - multi-buffer XDP
      - offload tc push_eth and pop_eth actions

   - Netronome Ethernet NICs (nfp):
      - flow-independent tc action hardware offload (police / meter)
      - AF_XDP

   - Other Ethernet NICs:
      - at803x: fiber and SFP support
      - xgmac: mdio: preamble suppression and custom MDC frequencies
      - r8169: enable ASPM L1.2 if system vendor flags it as safe
      - macb/gem: ZynqMP SGMII
      - hns3: add TX push mode
      - dpaa2-eth: software TSO
      - lan743x: multi-queue, mdio, SGMII, PTP
      - axienet: NAPI and GRO support

   - Mellanox Ethernet switches (mlxsw):
      - source and dest IP address rewrites
      - RJ45 ports

   - Marvell Ethernet switches (prestera):
      - basic routing offload
      - multi-chain TC ACL offload

   - NXP embedded Ethernet switches (ocelot & felix):
      - PTP over UDP with the ocelot-8021q DSA tagging protocol
      - basic QoS classification on Felix DSA switch using dcbnl
      - port mirroring for ocelot switches

   - Microchip high-speed industrial Ethernet (sparx5):
      - offloading of bridge port flooding flags
      - PTP Hardware Clock

   - Other embedded switches:
      - lan966x: PTP Hardware Clock
      - qca8k: mdio read/write operations via crafted Ethernet packets

   - Qualcomm 802.11ax WiFi (ath11k):
      - add LDPC FEC type and 802.11ax High Efficiency data in radiotap
      - enable RX PPDU stats in monitor co-exist mode

   - Intel WiFi (iwlwifi):
      - UHB TAS enablement via BIOS
      - band disablement via BIOS
      - channel switch offload
      - 32 Rx AMPDU sessions in newer devices

   - MediaTek WiFi (mt76):
      - background radar detection
      - thermal management improvements on mt7915
      - SAR support for more mt76 platforms
      - MBSSID and 6 GHz band on mt7915

   - RealTek WiFi:
      - rtw89: AP mode
      - rtw89: 160 MHz channels and 6 GHz band
      - rtw89: hardware scan

   - Bluetooth:
      - mt7921s: wake on Bluetooth, SCO over I2S, wide-band speech (WBS)

   - Microchip CAN (mcp251xfd):
      - multiple RX-FIFOs and runtime configurable RX/TX rings
      - internal PLL, runtime PM handling simplification
      - improve chip detection and error handling after wakeup"

* tag 'net-next-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2521 commits)
  llc: fix netdevice reference leaks in llc_ui_bind()
  drivers: ethernet: cpsw: fix panic when interrupt coaleceing is set via ethtool
  ice: don't allow to run ice_send_event_to_aux() in atomic ctx
  ice: fix 'scheduling while atomic' on aux critical err interrupt
  net/sched: fix incorrect vlan_push_eth dest field
  net: bridge: mst: Restrict info size queries to bridge ports
  net: marvell: prestera: add missing destroy_workqueue() in prestera_module_init()
  drivers: net: xgene: Fix regression in CRC stripping
  net: geneve: add missing netlink policy and size for IFLA_GENEVE_INNER_PROTO_INHERIT
  net: dsa: fix missing host-filtered multicast addresses
  net/mlx5e: Fix build warning, detected write beyond size of field
  iwlwifi: mvm: Don't fail if PPAG isn't supported
  selftests/bpf: Fix kprobe_multi test.
  Revert "rethook: x86: Add rethook x86 implementation"
  Revert "arm64: rethook: Add arm64 rethook implementation"
  Revert "powerpc: Add rethook support"
  Revert "ARM: rethook: Add rethook arm implementation"
  netdevice: add missing dm_private kdoc
  net: bridge: mst: prevent NULL deref in br_mst_info_size()
  selftests: forwarding: Use same VRF for port and VLAN upper
  ...
2022-03-24 13:13:26 -07:00
Linus Torvalds
1ebdbeb03e ARM:
- Proper emulation of the OSLock feature of the debug architecture
 
 - Scalability improvements for the MMU lock when dirty logging is on
 
 - New VMID allocator, which will eventually help with SVA in VMs
 
 - Better support for PMUs in heterogeneous systems
 
 - PSCI 1.1 support, enabling support for SYSTEM_RESET2
 
 - Implement CONFIG_DEBUG_LIST at EL2
 
 - Make CONFIG_ARM64_ERRATUM_2077057 default y
 
 - Reduce the overhead of VM exit when no interrupt is pending
 
 - Remove traces of 32bit ARM host support from the documentation
 
 - Updated vgic selftests
 
 - Various cleanups, doc updates and spelling fixes
 
 RISC-V:
 
 - Prevent KVM_COMPAT from being selected
 
 - Optimize __kvm_riscv_switch_to() implementation
 
 - RISC-V SBI v0.3 support
 
 s390:
 
 - memop selftest
 
 - fix SCK locking
 
 - adapter interruptions virtualization for secure guests
 
 - add Claudio Imbrenda as maintainer
 
 - first step to do proper storage key checking
 
 x86:
 
 - Continue switching kvm_x86_ops to static_call(); introduce
   static_call_cond() and __static_call_ret0 when applicable.
 
 - Cleanup unused arguments in several functions
 
 - Synthesize AMD 0x80000021 leaf
 
 - Fixes and optimization for Hyper-V sparse-bank hypercalls
 
 - Implement Hyper-V's enlightened MSR bitmap for nested SVM
 
 - Remove MMU auditing
 
 - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
   page tracking is enabled
 
 - Cleanup the implementation of the guest PGD cache
 
 - Preparation for the implementation of Intel IPI virtualization
 
 - Fix some segment descriptor checks in the emulator
 
 - Allow AMD AVIC support on systems with physical APIC ID above 255
 
 - Better API to disable virtualization quirks
 
 - Fixes and optimizations for the zapping of page tables:
 
   - Zap roots in two passes, avoiding RCU read-side critical sections
     that last too long for very large guests backed by 4 KiB SPTEs.
 
   - Zap invalid and defunct roots asynchronously via concurrency-managed
     work queue.
 
   - Allow yielding when zapping TDP MMU roots in response to the root's
     last reference being put.
 
   - Batch more TLB flushes with an RCU trick.  Whoever frees the paging
     structure now holds RCU as a proxy for all vCPUs running in the guest,
     i.e. to prolong the grace period on their behalf.  It then kicks the
     vCPUs out of guest mode before doing rcu_read_unlock().
 
 Generic:
 
 - Introduce __vcalloc and use it for very large allocations that
   need memcg accounting

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - Proper emulation of the OSLock feature of the debug architecture

   - Scalability improvements for the MMU lock when dirty logging is on

   - New VMID allocator, which will eventually help with SVA in VMs

   - Better support for PMUs in heterogeneous systems

   - PSCI 1.1 support, enabling support for SYSTEM_RESET2

   - Implement CONFIG_DEBUG_LIST at EL2

   - Make CONFIG_ARM64_ERRATUM_2077057 default y

   - Reduce the overhead of VM exit when no interrupt is pending

   - Remove traces of 32bit ARM host support from the documentation

   - Updated vgic selftests

   - Various cleanups, doc updates and spelling fixes

  RISC-V:
   - Prevent KVM_COMPAT from being selected

   - Optimize __kvm_riscv_switch_to() implementation

   - RISC-V SBI v0.3 support

  s390:
   - memop selftest

   - fix SCK locking

   - adapter interruptions virtualization for secure guests

   - add Claudio Imbrenda as maintainer

   - first step to do proper storage key checking

  x86:
   - Continue switching kvm_x86_ops to static_call(); introduce
     static_call_cond() and __static_call_ret0 when applicable.

   - Cleanup unused arguments in several functions

   - Synthesize AMD 0x80000021 leaf

   - Fixes and optimization for Hyper-V sparse-bank hypercalls

   - Implement Hyper-V's enlightened MSR bitmap for nested SVM

   - Remove MMU auditing

   - Eager splitting of page tables (new aka "TDP" MMU only) when dirty
     page tracking is enabled

   - Cleanup the implementation of the guest PGD cache

   - Preparation for the implementation of Intel IPI virtualization

   - Fix some segment descriptor checks in the emulator

   - Allow AMD AVIC support on systems with physical APIC ID above 255

   - Better API to disable virtualization quirks

   - Fixes and optimizations for the zapping of page tables:

      - Zap roots in two passes, avoiding RCU read-side critical
        sections that last too long for very large guests backed by 4
        KiB SPTEs.

      - Zap invalid and defunct roots asynchronously via
        concurrency-managed work queue.

      - Allow yielding when zapping TDP MMU roots in response to the
        root's last reference being put.

      - Batch more TLB flushes with an RCU trick. Whoever frees the
        paging structure now holds RCU as a proxy for all vCPUs running
        in the guest, i.e. to prolong the grace period on their behalf.
        It then kicks the vCPUs out of guest mode before doing
        rcu_read_unlock().

  Generic:
   - Introduce __vcalloc and use it for very large allocations that need
     memcg accounting"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (246 commits)
  KVM: use kvcalloc for array allocations
  KVM: x86: Introduce KVM_CAP_DISABLE_QUIRKS2
  kvm: x86: Require const tsc for RT
  KVM: x86: synthesize CPUID leaf 0x80000021h if useful
  KVM: x86: add support for CPUID leaf 0x80000021
  KVM: x86: do not use KVM_X86_OP_OPTIONAL_RET0 for get_mt_mask
  Revert "KVM: x86/mmu: Zap only TDP MMU leafs in kvm_zap_gfn_range()"
  kvm: x86/mmu: Flush TLB before zap_gfn_range releases RCU
  KVM: arm64: fix typos in comments
  KVM: arm64: Generalise VM features into a set of flags
  KVM: s390: selftests: Add error memop tests
  KVM: s390: selftests: Add more copy memop tests
  KVM: s390: selftests: Add named stages for memop test
  KVM: s390: selftests: Add macro as abstraction for MEM_OP
  KVM: s390: selftests: Split memop tests
  KVM: s390x: fix SCK locking
  RISC-V: KVM: Implement SBI HSM suspend call
  RISC-V: KVM: Add common kvm_riscv_vcpu_wfi() function
  RISC-V: Add SBI HSM suspend related defines
  RISC-V: KVM: Implement SBI v0.3 SRST extension
  ...
2022-03-24 11:58:57 -07:00
Linus Torvalds
3ce62cf4dc flexible-array transformations for 5.18-rc1
Hi Linus,
 
 Please, pull the following treewide patch that replaces zero-length arrays with
 flexible-array members. This patch has been baking in linux-next for a
 whole development cycle.
 
 Thanks
 --
 Gustavo

Merge tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux

Pull flexible-array transformations from Gustavo Silva:
 "Treewide patch that replaces zero-length arrays with flexible-array
  members.

  This has been baking in linux-next for a whole development cycle"

* tag 'flexible-array-transformations-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux:
  treewide: Replace zero-length arrays with flexible-array members
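
The transformation described above can be illustrated with a minimal
user-space sketch. The struct and helper names (packet, packet_alloc) are
hypothetical and do not come from the patch; the point is only the change
from a zero-length trailing array to a C99 flexible-array member.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Old GNU-extension style that the treewide patch replaces:
 *
 *     struct packet { size_t len; unsigned char data[0]; };
 */

/* C99 flexible-array member: it must be the last member of the struct. */
struct packet {
	size_t len;
	unsigned char data[];
};

/* Hypothetical helper: allocate a packet holding a copy of len bytes. */
struct packet *packet_alloc(const void *src, size_t len)
{
	struct packet *p;

	assert(src != NULL);
	/* Header plus len trailing elements of the flexible array. */
	p = malloc(sizeof(*p) + len * sizeof(p->data[0]));
	if (!p)
		return NULL;
	p->len = len;
	memcpy(p->data, src, len);
	return p;
}
```

With a flexible-array member the compiler can reject a declaration where
the array is not the last member, which is one reason the standard form is
preferred over the zero-length extension.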
2022-03-24 11:39:32 -07:00
Bjorn Helgaas
c724c866bb linux/types.h: remove unnecessary __bitwise__
There are no users of "__bitwise__" except the definition of
"__bitwise".  Remove __bitwise__ and define __bitwise directly.
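
A rough sketch of the resulting definition, assuming the earlier layout
defined __bitwise__ under __CHECKER__ and then aliased __bitwise to it.
The typedef le32_demo and the helper le32_demo_raw are illustrative names
only and are not part of the kernel headers.

```c
#include <stdint.h>

/*
 * __bitwise defined directly, with no intermediate __bitwise__ macro.
 * __CHECKER__ is defined when the source is processed by sparse.
 */
#ifdef __CHECKER__
#define __bitwise __attribute__((bitwise))
#else
#define __bitwise
#endif

/* Illustrative endian-annotated type, modelled on the kernel's __le32. */
typedef uint32_t __bitwise le32_demo;

/*
 * Return the raw bits. Under sparse this cast would also need __force;
 * a regular compiler sees a plain uint32_t.
 */
uint32_t le32_demo_raw(le32_demo v)
{
	return (uint32_t)v;
}
```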

This is a follow-up to 05de97003c ("linux/types.h: enable endian
checks for all sparse builds").

[akpm@linux-foundation.org: change the tools/include/linux/types.h definition also]

Link: https://lkml.kernel.org/r/20220310220927.245704-2-helgaas@kernel.org
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-23 19:00:33 -07:00
Linus Torvalds
194dfe88d6 asm-generic updates for 5.18
There are three sets of updates for 5.18 in the asm-generic tree:
 
  - The set_fs()/get_fs() infrastructure gets removed for good. This
    was already gone from all major architectures, but now we can
    finally remove it everywhere, which loses some particularly
    tricky and error-prone code.
    There is a small merge conflict against a parisc cleanup, the
    solution is to use their new version.
 
  - The nds32 architecture ends its tenure in the Linux kernel. The
    hardware is still used and the code is in reasonable shape, but
    the mainline port is not actively maintained any more, as all
    remaining users are thought to run vendor kernels that would never
    be updated to a future release.
    There are some obvious conflicts against changes to the removed
    files.
 
  - A series from Masahiro Yamada cleans up some of the uapi header
    files to pass the compile-time checks.

Merge tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "There are three sets of updates for 5.18 in the asm-generic tree:

   - The set_fs()/get_fs() infrastructure gets removed for good.

     This was already gone from all major architectures, but now we can
     finally remove it everywhere, which loses some particularly tricky
     and error-prone code. There is a small merge conflict against a
     parisc cleanup, the solution is to use their new version.

   - The nds32 architecture ends its tenure in the Linux kernel.

     The hardware is still used and the code is in reasonable shape, but
     the mainline port is not actively maintained any more, as all
     remaining users are thought to run vendor kernels that would never
     be updated to a future release.

   - A series from Masahiro Yamada cleans up some of the uapi header
     files to pass the compile-time checks"

* tag 'asm-generic-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic: (27 commits)
  nds32: Remove the architecture
  uaccess: remove CONFIG_SET_FS
  ia64: remove CONFIG_SET_FS support
  sh: remove CONFIG_SET_FS support
  sparc64: remove CONFIG_SET_FS support
  lib/test_lockup: fix kernel pointer check for separate address spaces
  uaccess: generalize access_ok()
  uaccess: fix type mismatch warnings from access_ok()
  arm64: simplify access_ok()
  m68k: fix access_ok for coldfire
  MIPS: use simpler access_ok()
  MIPS: Handle address errors for accesses above CPU max virtual user address
  uaccess: add generic __{get,put}_kernel_nofault
  nios2: drop access_ok() check from __put_user()
  x86: use more conventional access_ok() definition
  x86: remove __range_not_ok()
  sparc64: add __{get,put}_kernel_nofault()
  nds32: fix access_ok() checks in get/put_user
  uaccess: fix nios2 and microblaze get_user_8()
  sparc64: fix building assembly files
  ...
2022-03-23 18:03:08 -07:00
Linus Torvalds
40037e4f8b sound updates for 5.18
It's been a fairly calm development cycle.  There are a few
 last-minute ALSA core fixes, most notably for covering PCM ioctl
 races, but most of the rest are device-specific changes.
 
 Below are some highlights:
 
 * ALSA core:
 - Fixes for PCM ioctl races that may lead to UAF
 - Fix for oversized allocations in PCM OSS layer
 
 * ASoC:
 - Start of moving SoF to support multiple IPC mechanisms
 - Use of NHLT ACPI table to reduce the amount of quirking required for
   Intel systems
 - Preliminary work for the forthcoming Intel AVS driver for legacy Intel DSP
   firmware
 - Support for AMD PDM, Atmel PDMC, Awinic AW8738, i.MX cards with
   TLV320AIC31xx, Intel machines with CS35L41 and ESSX8336, Mediatek
   MT8181 wideband bluetooth, nVidia Tegra234, Qualcomm SC7280, Renesas
   RZ/V2L, Texas Instruments TAS585M
 
 * HD-audio:
 - Driver re-binding fix for HD-audio
 - Updates for Intel ADL and Tegra234, various platform quirks for
   Dell, HP, Lenovo, ASUS, Samsung and Clevo machines
 
 * USB-audio:
 - Quirk updates for Scarlett2, RODE, Corsair devices

Merge tag 'sound-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound

Pull sound updates from Takashi Iwai:
 "It's been a fairly calm development cycle. There are a few last-minute
  ALSA core fixes, most notably for covering PCM ioctl races, but most
  of the rest are device-specific changes.

  Below are some highlights:

  ALSA core:

   - Fixes for PCM ioctl races that may lead to UAF

   - Fix for oversized allocations in PCM OSS layer

  ASoC:

   - Start of moving SoF to support multiple IPC mechanisms

   - Use of NHLT ACPI table to reduce the amount of quirking required
     for Intel systems

   - Preliminary work for the forthcoming Intel AVS driver for legacy
     Intel DSP firmware

   - Support for AMD PDM, Atmel PDMC, Awinic AW8738, i.MX cards with
     TLV320AIC31xx, Intel machines with CS35L41 and ESSX8336, Mediatek
     MT8181 wideband bluetooth, nVidia Tegra234, Qualcomm SC7280,
     Renesas RZ/V2L, Texas Instruments TAS585M

  HD-audio:

   - Driver re-binding fix for HD-audio

   - Updates for Intel ADL and Tegra234, various platform quirks for
     Dell, HP, Lenovo, ASUS, Samsung and Clevo machines

  USB-audio:

   - Quirk updates for Scarlett2, RODE, Corsair devices"

* tag 'sound-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (486 commits)
  ALSA: hda/realtek: Add alc256-samsung-headphone fixup
  ALSA: pci: fix reading of swapped values from pcmreg in AC97 codec
  ALSA: pcm: Add stream lock during PCM reset ioctl operations
  ALSA: pcm: Fix races among concurrent prealloc proc writes
  ALSA: pcm: Fix races among concurrent prepare and hw_params/hw_free calls
  ALSA: pcm: Fix races among concurrent read/write and buffer changes
  ALSA: pcm: Fix races among concurrent hw_params and hw_free calls
  ASoC: atmel: mchp-pdmc: print the correct property name
  MAINTAINERS: Add Shengjiu to maintainer list of sound/soc/fsl
  ASoC: SOF: Add a new dai_get_clk topology IPC op
  ASoC: SOF: topology: Add ops for setting up and tearing down pipelines
  ASoC: SOF: expose sof_route_setup()
  ASoC: SOF: Add dai_link_fixup PCM op for IPC3
  ASoC: SOF: Add trigger PCM op for IPC3
  ASoC: SOF: Define hw_params PCM op for IPC3
  ASoC: SOF: Introduce IPC3 PCM hw_free op
  ASoC: SOF: pcm: expose the sof_pcm_setup_connected_widgets() function
  ASoC: SOF: Introduce IPC-specific PCM ops
  ASoC: SOF: Add bytes_ext control IPC ops for IPC3
  ASoC: SOF: Add bytes_get/put control IPC ops for IPC3
  ...
2022-03-23 15:11:12 -07:00
Chang S. Bae
20df737561 selftests/x86/amx: Update the ARCH_REQ_XCOMP_PERM test
Update the arch_prctl test to check whether the requested feature is added
to the permission bitmap as expected.

Every non-dynamic feature that is enabled is already permitted for use.
TILECFG is not a dynamic feature, so ensure its bit is always set in the
bitmap returned by ARCH_GET_XCOMP_PERM.
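
The bitmap check described above might look roughly like the following
sketch. It assumes the x86 XSAVE feature numbers 17 (XTILECFG) and 18
(XTILEDATA); the helper perm_bitmap_ok is a hypothetical name and not the
selftest's actual code. It only inspects a bitmap value and does not issue
the arch_prctl call itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* XSAVE feature numbers for the AMX state components (assumed values). */
#define XFEATURE_XTILECFG	17
#define XFEATURE_XTILEDATA	18

#define XFEATURE_MASK_XTILECFG	(1ULL << XFEATURE_XTILECFG)
#define XFEATURE_MASK_XTILEDATA	(1ULL << XFEATURE_XTILEDATA)

/*
 * Hypothetical check on a bitmap as returned by ARCH_GET_XCOMP_PERM:
 * TILECFG is non-dynamic, so its bit must always be set; TILEDATA must
 * be set once it has been requested via ARCH_REQ_XCOMP_PERM.
 */
bool perm_bitmap_ok(uint64_t bitmap, bool tiledata_requested)
{
	if (!(bitmap & XFEATURE_MASK_XTILECFG))
		return false;
	if (tiledata_requested && !(bitmap & XFEATURE_MASK_XTILEDATA))
		return false;
	return true;
}
```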

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20220129173647.27981-3-chang.seok.bae@intel.com
2022-03-23 21:28:34 +01:00
Linus Torvalds
d51b1b33c5 linux-kselftest-kunit-5.18-rc1
This KUnit update for Linux 5.18-rc1 consists of:
 
 - changes to decrease macro layering for string, integer and EQ/NE asserts
 - remove unused macros
 - several cleanups and fixes
 - new list tests for list_del_init_careful(), list_is_head() and
   list_entry_is_head()

Merge tag 'linux-kselftest-kunit-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull KUnit updates from Shuah Khan:

 - changes to decrease macro layering for string, integer and EQ/NE asserts

 - remove unused macros

 - several cleanups and fixes

 - new list tests for list_del_init_careful(), list_is_head() and
   list_entry_is_head()

* tag 'linux-kselftest-kunit-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  list: test: Add a test for list_entry_is_head()
  list: test: Add a test for list_is_head()
  list: test: Add test for list_del_init_careful()
  kunit: cleanup assertion macro internal variables
  kunit: factor out str constants from binary assertion structs
  kunit: consolidate KUNIT_INIT_BINARY_ASSERT_STRUCT macros
  kunit: remove va_format from kunit_assert
  kunit: tool: drop mostly unused KunitResult.result field
  kunit: decrease macro layering for EQ/NE asserts
  kunit: decrease macro layering for integer asserts
  kunit: reduce layering in string assertion macros
  kunit: drop unused intermediate macros for ptr inequality checks
  kunit: make KUNIT_EXPECT_EQ() use KUNIT_EXPECT_EQ_MSG(), etc.
  kunit: drop unused assert_type from kunit_assert and clean up macros
  kunit: split out part of kunit_assert into a static const
  kunit: factor out kunit_base_assert_format() call into kunit_fail()
  kunit: drop unused kunit* field in kunit_assert
  kunit: move check if assertion passed into the macros
  kunit: add example test case showing off all the expect macros
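
For readers unfamiliar with the list helpers covered by the new tests, the
following is a user-space sketch of a circular list with a head check.
All demo_* names are hypothetical; the pointer comparison reflects what
list_is_head() is understood to do, not a copy of the kernel source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, modelled on the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

void demo_list_init(struct list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert node just before head, i.e. at the tail of the list. */
void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

/* A node is the head exactly when the two pointers are equal. */
bool demo_list_is_head(const struct list_head *node,
		       const struct list_head *head)
{
	return node == head;
}

/* Count entries by walking until the cursor wraps back to the head. */
size_t demo_list_count(const struct list_head *head)
{
	const struct list_head *pos;
	size_t n = 0;

	assert(head != NULL);
	for (pos = head->next; !demo_list_is_head(pos, head); pos = pos->next)
		n++;
	return n;
}
```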
2022-03-23 12:56:39 -07:00
Linus Torvalds
23d1dea555 linux-kselftest-next-5.18-rc1
This Kselftest update for Linux 5.18-rc1 consists of several build and
 cleanup fixes.
 
 - removing obsolete config options
 - removing dependency on internal kernel macros
 - adding config options
 - several build fixes related to headers and install paths

Merge tag 'linux-kselftest-next-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull Kselftest updates from Shuah Khan:
 "Several build and cleanup fixes:

   - removing obsolete config options

   - removing dependency on internal kernel macros

   - adding config options

   - several build fixes related to headers and install paths"

* tag 'linux-kselftest-next-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: (22 commits)
  selftests: Fix build when $(O) points to a relative path
  selftests: netfilter: fix a build error on openSUSE
  selftests: kvm: add generated file to the .gitignore
  selftests/exec: add generated files to .gitignore
  selftests: add kselftest_install to .gitignore
  selftests/rtc: continuously read RTC in a loop for 30s
  selftests/lkdtm: Add UBSAN config
  selftests/lkdtm: Remove dead config option
  selftests/exec: Rename file binfmt_script to binfmt_script.py
  selftests: Use -isystem instead of -I to include headers
  selftests: vm: remove dependecy from internal kernel macros
  selftests: vm: Add the uapi headers include variable
  selftests: mptcp: Add the uapi headers include variable
  selftests: net: Add the uapi headers include variable
  selftests: landlock: Add the uapi headers include variable
  selftests: kvm: Add the uapi headers include variable
  selftests: futex: Add the uapi headers include variable
  selftests: Correct the headers install path
  selftests: Add and export a kernel uapi headers path
  selftests: set the BUILD variable to absolute path
  ...
2022-03-23 12:53:00 -07:00
Linus Torvalds
1bc191051d Tracing updates for 5.18:
- New user_events interface. User space can register an event with the kernel
   describing the format of the event. Then it will receive a byte in a page
   mapping that it can check against. A privileged task can then enable that
   event like any other event, which will change the mapped byte to true,
   telling the user space application to start writing the event to the
   tracing buffer.
 
 - Add new "ftrace_boot_snapshot" kernel command line parameter. When set,
   the tracing buffer will be saved in the snapshot buffer at boot up when
   the kernel hands things over to user space. This will keep the traces that
   happened at boot up available even if user space boot up has tracing as
   well.
 
 - Have TRACE_EVENT_ENUM() also update trace event field type descriptions.
   Thus if a static array defines its size with an enum, the user space trace
   event parsers can still know how to parse that array.
 
 - Add new TRACE_CUSTOM_EVENT() macro. This acts the same as the
   TRACE_EVENT() macro, but will attach to an existing tracepoint. This will
   make one tracepoint be able to trace different content and not be stuck at
   only what the original TRACE_EVENT() macro exports.
 
 - Fixes to tracing error logging.
 
 - Better saving of cmdlines to PIDs when tracing (use the wakeup events for
   mapping).

Merge tag 'trace-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - New user_events interface. User space can register an event with the
   kernel describing the format of the event. Then it will receive a
   byte in a page mapping that it can check against. A privileged task
   can then enable that event like any other event, which will change
   the mapped byte to true, telling the user space application to start
   writing the event to the tracing buffer.

 - Add new "ftrace_boot_snapshot" kernel command line parameter. When
   set, the tracing buffer will be saved in the snapshot buffer at boot
   up when the kernel hands things over to user space. This will keep
   the traces that happened at boot up available even if user space boot
   up has tracing as well.

 - Have TRACE_EVENT_ENUM() also update trace event field type
   descriptions. Thus if a static array defines its size with an enum,
   the user space trace event parsers can still know how to parse that
   array.

 - Add new TRACE_CUSTOM_EVENT() macro. This acts the same as the
   TRACE_EVENT() macro, but will attach to an existing tracepoint. This
   will make one tracepoint be able to trace different content and not
   be stuck at only what the original TRACE_EVENT() macro exports.

 - Fixes to tracing error logging.

 - Better saving of cmdlines to PIDs when tracing (use the wakeup events
   for mapping).
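
The user_events flow in the first item above (check a mapped status byte,
write only when it is set) can be sketched as a pure simulation. Nothing
here uses the real user_events ABI: the struct, the page size and the
function names are assumptions made for illustration, and the "page" is an
ordinary buffer supplied by the caller rather than a kernel mapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in size for the status page mapped into the process (assumed). */
#define STATUS_PAGE_SIZE 4096

struct user_event_sim {
	const volatile uint8_t *status_page;	/* simulated mapped page */
	size_t index;				/* byte assigned at registration */
	size_t written;				/* events actually emitted */
};

/* True once a privileged task has enabled the event. */
bool event_enabled(const struct user_event_sim *ev)
{
	assert(ev->index < STATUS_PAGE_SIZE);
	return ev->status_page[ev->index] != 0;
}

/* Emit the event only when it is enabled; report whether it was written. */
bool event_write(struct user_event_sim *ev)
{
	if (!event_enabled(ev))
		return false;	/* cheap path: tracing is off */
	ev->written++;		/* a real program would write to the trace here */
	return true;
}
```

The design keeps the disabled case down to a single byte load, so an
application can leave its instrumentation in place at negligible cost.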

* tag 'trace-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (30 commits)
  tracing: Have type enum modifications copy the strings
  user_events: Add trace event call as root for low permission cases
  tracing/user_events: Use alloc_pages instead of kzalloc() for register pages
  tracing: Add snapshot at end of kernel boot up
  tracing: Have TRACE_DEFINE_ENUM affect trace event types as well
  tracing: Fix strncpy warning in trace_events_synth.c
  user_events: Prevent dyn_event delete racing with ioctl add/delete
  tracing: Add TRACE_CUSTOM_EVENT() macro
  tracing: Move the defines to create TRACE_EVENTS into their own files
  tracing: Add sample code for custom trace events
  tracing: Allow custom events to be added to the tracefs directory
  tracing: Fix last_cmd_set() string management in histogram code
  user_events: Fix potential uninitialized pointer while parsing field
  tracing: Fix allocation of last_cmd in last_cmd_set()
  user_events: Add documentation file
  user_events: Add sample code for typical usage
  user_events: Add self-test for validator boundaries
  user_events: Add self-test for perf_event integration
  user_events: Add self-test for dynamic_events integration
  user_events: Add self-test for ftrace integration
  ...
2022-03-23 11:40:25 -07:00
Linus Torvalds
20f463fb38 Real Time Analysis Tool updates for 5.18
Changes to RTLA:
 
  - Support for adjusting tracing_thresh
 
  - Add -a (auto) option to make it easier for users to debug in the field
 
  - Add -e option to add more events to the trace
 
  - Add --trigger option to add triggers to events
 
  - Add --filter option to filter events
 
  - Add support to save histograms to a file
 
  - Add --dma-latency to set /dev/cpu_dma_latency
 
  - Other fixes and cleanups

Merge tag 'trace-rtla-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull RTLA tracing tool updates from Steven Rostedt:
 "Real Time Analysis Tool updates for 5.18:

   - Support for adjusting tracing_thresh

   - Add -a (auto) option to make it easier for users to debug in the field

   - Add -e option to add more events to the trace

   - Add --trigger option to add triggers to events

   - Add --filter option to filter events

   - Add support to save histograms to a file

   - Add --dma-latency to set /dev/cpu_dma_latency

   - Other fixes and cleanups"

* tag 'trace-rtla-v5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  rtla: Tools main loop cleanup
  rtla/timerlat: Add --dma-latency option
  rtla/osnoise: Fix osnoise hist stop tracing message
  rtla: Check for trace off also in the trace instance
  rtla/trace: Save event histogram output to a file
  rtla: Add --filter support
  rtla/trace: Add trace event filter helpers
  rtla: Add --trigger support
  rtla/trace: Add trace event trigger helpers
  rtla: Add -e/--event support
  rtla/trace: Add trace events helpers
  rtla/timerlat: Add the automatic trace option
  rtla/osnoise: Add the automatic trace option
  rtla/osnoise: Add an option to set the threshold
  rtla/osnoise: Add support to adjust the tracing_thresh
2022-03-23 11:08:10 -07:00
Jakub Kicinski
89695196f0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Merge in overtime fixes, no conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-23 10:53:49 -07:00
Linus Torvalds
9030fb0bb9 Folio changes for 5.18
  - Rewrite how munlock works to massively reduce the contention
    on i_mmap_rwsem (Hugh Dickins):
    https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/
  - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph Hellwig):
    https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/
  - Convert GUP to use folios and make pincount available for order-1
    pages. (Matthew Wilcox)
  - Convert a few more truncation functions to use folios (Matthew Wilcox)
  - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew Wilcox)
  - Convert rmap_walk to use folios (Matthew Wilcox)
  - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)
  - Add support for creating large folios in readahead (Matthew Wilcox)

Merge tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache

Pull folio updates from Matthew Wilcox:

 - Rewrite how munlock works to massively reduce the contention on
   i_mmap_rwsem (Hugh Dickins):

     https://lore.kernel.org/linux-mm/8e4356d-9622-a7f0-b2c-f116b5f2efea@google.com/

 - Sort out the page refcount mess for ZONE_DEVICE pages (Christoph
   Hellwig):

     https://lore.kernel.org/linux-mm/20220210072828.2930359-1-hch@lst.de/

 - Convert GUP to use folios and make pincount available for order-1
   pages. (Matthew Wilcox)

 - Convert a few more truncation functions to use folios (Matthew
   Wilcox)

 - Convert page_vma_mapped_walk to use PFNs instead of pages (Matthew
   Wilcox)

 - Convert rmap_walk to use folios (Matthew Wilcox)

 - Convert most of shrink_page_list() to use a folio (Matthew Wilcox)

 - Add support for creating large folios in readahead (Matthew Wilcox)

* tag 'folio-5.18c' of git://git.infradead.org/users/willy/pagecache: (114 commits)
  mm/damon: minor cleanup for damon_pa_young
  selftests/vm/transhuge-stress: Support file-backed PMD folios
  mm/filemap: Support VM_HUGEPAGE for file mappings
  mm/readahead: Switch to page_cache_ra_order
  mm/readahead: Align file mappings for non-DAX
  mm/readahead: Add large folio readahead
  mm: Support arbitrary THP sizes
  mm: Make large folios depend on THP
  mm: Fix READ_ONLY_THP warning
  mm/filemap: Allow large folios to be added to the page cache
  mm: Turn can_split_huge_page() into can_split_folio()
  mm/vmscan: Convert pageout() to take a folio
  mm/vmscan: Turn page_check_references() into folio_check_references()
  mm/vmscan: Account large folios correctly
  mm/vmscan: Optimise shrink_page_list for non-PMD-sized folios
  mm/vmscan: Free non-shmem folios without splitting them
  mm/rmap: Constify the rmap_walk_control argument
  mm/rmap: Convert rmap_walk() to take a folio
  mm: Turn page_anon_vma() into folio_anon_vma()
  mm/rmap: Turn page_lock_anon_vma_read() into folio_lock_anon_vma_read()
  ...
2022-03-22 17:03:12 -07:00
Linus Torvalds
3bf03b9a08 Merge branch 'akpm' (patches from Andrew)
Merge updates from Andrew Morton:

 - A few misc subsystems: kthread, scripts, ntfs, ocfs2, block, and vfs

 - Most of the MM patches which precede the patches in Willy's tree: kasan,
   pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
   sparsemem, vmalloc, pagealloc, memory-failure, mlock, hugetlb,
   userfaultfd, vmscan, compaction, mempolicy, oom-kill, migration, thp,
   cma, autonuma, psi, ksm, page-poison, madvise, memory-hotplug, rmap,
   zswap, uaccess, ioremap, highmem, cleanups, kfence, hmm, and damon.

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (227 commits)
  mm/damon/sysfs: remove repeat container_of() in damon_sysfs_kdamond_release()
  Docs/ABI/testing: add DAMON sysfs interface ABI document
  Docs/admin-guide/mm/damon/usage: document DAMON sysfs interface
  selftests/damon: add a test for DAMON sysfs interface
  mm/damon/sysfs: support DAMOS stats
  mm/damon/sysfs: support DAMOS watermarks
  mm/damon/sysfs: support schemes prioritization
  mm/damon/sysfs: support DAMOS quotas
  mm/damon/sysfs: support DAMON-based Operation Schemes
  mm/damon/sysfs: support the physical address space monitoring
  mm/damon/sysfs: link DAMON for virtual address spaces monitoring
  mm/damon: implement a minimal stub for sysfs-based DAMON interface
  mm/damon/core: add number of each enum type values
  mm/damon/core: allow non-exclusive DAMON start/stop
  Docs/damon: update outdated term 'regions update interval'
  Docs/vm/damon/design: update DAMON-Idle Page Tracking interference handling
  Docs/vm/damon: call low level monitoring primitives the operations
  mm/damon: remove unnecessary CONFIG_DAMON option
  mm/damon/paddr,vaddr: remove damon_{p,v}a_{target_valid,set_operations}()
  mm/damon/dbgfs-test: fix is_target_id() change
  ...
2022-03-22 16:11:53 -07:00
SeongJae Park
40184e484d selftests/damon: add a test for DAMON sysfs interface
This commit adds a selftest for DAMON sysfs interface.  It tests the
functionality of 'nr' files and existence of files in each directory of
the hierarchy.

Link: https://lkml.kernel.org/r/20220228081314.5770-12-sj@kernel.org
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Xin Hao <xhao@linux.alibaba.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:13 -07:00
Guo Zhengkui
d794103d52 userfaultfd/selftests: fix uninitialized_var.cocci warning
Fix the following coccicheck warning:
tools/testing/selftests/vm/userfaultfd.c:556:23-24:
WARNING this kind of initialization is deprecated

`unsigned long page_nr = *(&page_nr)` has the same form as the
uninitialized_var() macro. Remove the redundant assignment. It has
been tested with gcc (Debian 8.3.0-6) 8.3.0.

The patch which removed uninitialized_var() is:
https://lore.kernel.org/all/20121028102007.GA7547@gmail.com/ and there are
very few "/* GCC */" comments in the Linux kernel code now.

Link: https://lkml.kernel.org/r/20220304082333.9252-1-guozhengkui@vivo.com
Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Cc: Shuah Khan <shuah@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:08 -07:00
Muchun Song
b147c89cd4 selftests: vm: add a hugetlb test case
Since the head vmemmap page frame associated with each HugeTLB page is
reused, we should hide the PG_head flag of tail struct page from the
user.  Add a test case to check whether it works properly.  The test
steps are as follows.

  1) alloc 2MB hugeTLB
  2) get each page frame
  3) apply those APIs in each page frame
  4) Those APIs work completely the same as before.

Reading the flags of a page via /proc/kpageflags is done in
stable_page_flags(), which invokes PageHead(), PageTail(),
PageCompound() and compound_head().

If those APIs work properly, the head page must have bits 15 and 17 set,
and tail pages must have bits 16 and 17 set but bit 15 unset.  Those
flags are checked in check_page_flags().
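
The flag check described above can be sketched in C as follows. This is a
minimal illustration, not the selftest's actual check_page_flags(): the helper
names is_hugetlb_head() and is_hugetlb_tail() are hypothetical, and the bit
numbers are assumed to be the KPF_COMPOUND_HEAD, KPF_COMPOUND_TAIL and KPF_HUGE
positions from include/uapi/linux/kernel-page-flags.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions of the 64-bit per-page entries in /proc/kpageflags. */
#define KPF_COMPOUND_HEAD 15
#define KPF_COMPOUND_TAIL 16
#define KPF_HUGE          17

#define KPF_BIT(n) (UINT64_C(1) << (n))

/* Hypothetical helper: head page must have bits 15 and 17 set. */
bool is_hugetlb_head(uint64_t flags)
{
	uint64_t want = KPF_BIT(KPF_COMPOUND_HEAD) | KPF_BIT(KPF_HUGE);

	return (flags & want) == want;
}

/* Hypothetical helper: tail page must have bits 16 and 17 set, bit 15 clear. */
bool is_hugetlb_tail(uint64_t flags)
{
	uint64_t want = KPF_BIT(KPF_COMPOUND_TAIL) | KPF_BIT(KPF_HUGE);

	if (flags & KPF_BIT(KPF_COMPOUND_HEAD))
		return false;
	return (flags & want) == want;
}
```

A caller would read one 64-bit entry per page frame number from
/proc/kpageflags and pass it to these helpers; reading that file typically
requires root.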

Link: https://lkml.kernel.org/r/20211101031651.75851-5-songmuchun@bytedance.com
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Barry Song <song.bao.hua@hisilicon.com>
Cc: Bodeddula Balasubramaniam <bodeddub@amazon.com>
Cc: Chen Huang <chenhuang5@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Fam Zheng <fam.zheng@bytedance.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: Xiongchun Duan <duanxiongchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:08 -07:00
Guillaume Tucker
ef696f93ed selftests, x86: fix how check_cc.sh is being invoked
The $(CC) variable used in Makefiles could contain several arguments
such as "ccache gcc".  These need to be passed as a single string to
check_cc.sh, otherwise only the first argument will be used as the
compiler command.  Without quotes, the $(CC) variable is passed as
distinct arguments which causes the script to fail to build trivial
programs.

Fix this by adding quotes around $(CC) when calling check_cc.sh to pass
the whole string as a single argument to the script even if it has
several words such as "ccache gcc".

Link: https://lkml.kernel.org/r/d0d460d7be0107a69e3c52477761a6fe694c1840.1646991629.git.guillaume.tucker@collabora.com
Fixes: e9886ace22 ("selftests, x86: Rework x86 target architecture detection")
Signed-off-by: Guillaume Tucker <guillaume.tucker@collabora.com>
Tested-by: "kernelci.org bot" <bot@kernelci.org>
Reviewed-by: Guenter Roeck <groeck@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Borislav Petkov <bp@suse.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:04 -07:00
Shakeel Butt
6323ec54b4 selftests: memcg: test high limit for single entry allocation
Test the enforcement of memory.high limit for large amount of memory
allocation within a single kernel entry.  There are valid use-cases
where the application can trigger large amount of memory allocation
within a single syscall e.g.  mlock() or mmap(MAP_POPULATE).

Make sure memory.high limit enforcement works for such use-cases.

Link: https://lkml.kernel.org/r/20220211064917.2028469-4-shakeelb@google.com
Signed-off-by: Shakeel Butt <shakeelb@google.com>
Reviewed-by: Roman Gushchin <guro@fb.com>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Chris Down <chris@chrisdown.name>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-22 15:57:02 -07:00
Vincent Chen
6d1a6f464e rseq/selftests: Add support for RISC-V
Add support for RISC-V in the rseq selftests, which covers both
64-bit and 32-bit ISA with little endian mode.

Signed-off-by: Vincent Chen <vincent.chen@sifive.com>
Tested-by: Eric Lin <eric.lin@sifive.com>
Reviewed-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2022-03-22 14:45:19 -07:00
Linus Torvalds
3fe2f7446f Changes in this cycle were:
  - Cleanups for SCHED_DEADLINE
  - Tracing updates/fixes
  - CPU Accounting fixes
  - First wave of changes to optimize the overhead of the scheduler build,
    from the fast-headers tree - including placeholder *_api.h headers for
    later header split-ups.
  - Preempt-dynamic using static_branch() for ARM64
  - Isolation housekeeping mask rework; preparatory for further changes
  - NUMA-balancing: deal with CPU-less nodes
  - NUMA-balancing: tune systems that have multiple LLC cache domains per node (e.g. AMD)
  - Updates to RSEQ UAPI in preparation for glibc usage
  - Lots of RSEQ/selftests, for same
  - Add Suren as PSI co-maintainer
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull scheduler updates from Ingo Molnar:

 - Cleanups for SCHED_DEADLINE

 - Tracing updates/fixes

 - CPU Accounting fixes

 - First wave of changes to optimize the overhead of the scheduler
   build, from the fast-headers tree - including placeholder *_api.h
   headers for later header split-ups.

 - Preempt-dynamic using static_branch() for ARM64

 - Isolation housekeeping mask rework; preparatory for further changes

 - NUMA-balancing: deal with CPU-less nodes

 - NUMA-balancing: tune systems that have multiple LLC cache domains per
   node (e.g. AMD)

 - Updates to RSEQ UAPI in preparation for glibc usage

 - Lots of RSEQ/selftests, for same

 - Add Suren as PSI co-maintainer

* tag 'sched-core-2022-03-22' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (81 commits)
  sched/headers: ARM needs asm/paravirt_api_clock.h too
  sched/numa: Fix boot crash on arm64 systems
  headers/prep: Fix header to build standalone: <linux/psi.h>
  sched/headers: Only include <linux/entry-common.h> when CONFIG_GENERIC_ENTRY=y
  cgroup: Fix suspicious rcu_dereference_check() usage warning
  sched/preempt: Tell about PREEMPT_DYNAMIC on kernel headers
  sched/topology: Remove redundant variable and fix incorrect type in build_sched_domains
  sched/deadline,rt: Remove unused parameter from pick_next_[rt|dl]_entity()
  sched/deadline,rt: Remove unused functions for !CONFIG_SMP
  sched/deadline: Use __node_2_[pdl|dle]() and rb_first_cached() consistently
  sched/deadline: Merge dl_task_can_attach() and dl_cpu_busy()
  sched/deadline: Move bandwidth mgmt and reclaim functions into sched class source file
  sched/deadline: Remove unused def_dl_bandwidth
  sched/tracing: Report TASK_RTLOCK_WAIT tasks as TASK_UNINTERRUPTIBLE
  sched/tracing: Don't re-read p->state when emitting sched_switch event
  sched/rt: Plug rt_mutex_setprio() vs push_rt_task() race
  sched/cpuacct: Remove redundant RCU read lock
  sched/cpuacct: Optimize away RCU read lock
  sched/cpuacct: Fix charge percpu cpuusage
  sched/headers: Reorganize, clean up and optimize kernel/sched/sched.h dependencies
  ...
2022-03-22 14:39:12 -07:00
Kim Phillips
7b830875d2 perf evsel: Make evsel__env() always return a valid env
It's possible to have an evsel and evsel->evlist populated without
an evsel->evlist->env, when, e.g., cmd_record is in its error path.

Future patches will add support for evsel__open_strerror to be able
to customize error messaging based on perf_env__{arch,cpuid}, so
let's have evsel__env return &perf_env instead of NULL in that case.
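
The fallback described above can be sketched as follows. This is an
illustration under stated assumptions, not the perf source: the three structs
are hypothetical stand-ins that keep only the fields needed here, default_env
plays the role of perf's global perf_env, and evsel_env_or_default() is a
made-up name for the accessor.

```c
#include <stddef.h>

/* Hypothetical stand-ins for perf's structures; only the fields used here. */
struct perf_env { const char *arch; };
struct evlist   { struct perf_env *env; };
struct evsel    { struct evlist *evlist; };

/* Global fallback, standing in for perf's global perf_env. */
struct perf_env default_env = { .arch = "unknown" };

/*
 * Return the env attached to the evsel's evlist when the whole chain is
 * populated; otherwise fall back to the global one so that callers never
 * have to handle NULL.
 */
struct perf_env *evsel_env_or_default(const struct evsel *evsel)
{
	if (evsel && evsel->evlist && evsel->evlist->env)
		return evsel->evlist->env;
	return &default_env;
}
```

With this shape, an error path that runs before the evlist env is set up can
still query the architecture or CPU id without a NULL check at every call site.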

Reviewed-by: Kajol Jain <kjain@linux.ibm.com>
Signed-off-by: Kim Phillips <kim.phillips@amd.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <robert.richter@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211004214114.188477-1-kim.phillips@amd.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 18:36:02 -03:00
Colin Ian King
011899cc00 perf build-id: Fix spelling mistake "Cant" -> "Can't"
There is a spelling mistake in a pr_err message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220316232452.53062-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 18:27:56 -03:00
Colin Ian King
ccbc9df9ae perf header: Fix spelling mistake "could't" -> "couldn't"
There is a spelling mistake in a pr_debug2 message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: kernel-janitors@vger.kernel.org
Link: https://lore.kernel.org/r/20220316232212.52820-1-colin.i.king@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 18:27:13 -03:00
Kan Liang
6f680c6aa2 perf script: Add 'brstackinsnlen' for branch stacks
When analyzing with 'perf script', it's useful to understand the
captured instruction and the next sequential instruction.

To calculate the address of the next sequential instruction, the length
of the captured instruction is required.

For example, you can’t know the next sequential instruction after an
unconditional branch unless you calculate that based on its length.

For branch stacks, 'perf script' only prints the instruction bytes with
'brstackinsn', but lacks the instruction length.

Add 'brstackinsnlen' to print the instruction length.

  $ perf script -F ip,brstackinsn,brstackinsnlen --xed
     7fa555be8f75
        _start:
        00007fa555be8090    mov %rsp, %rdi              ilen: 3
        00007fa555be8093    callq  0x7fa555be8ea0       ilen: 5 # PRED 102 cycles [102] 0.02 IPC
        _dl_start+38:
        00007fa555be8ec6    movq  %rdx,0x227853(%rip)   ilen: 7
        00007fa555be8ecd    leaq  0x227f94(%rip),%rdx   ilen: 7
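
The calculation that the new field enables is a single addition. A minimal
sketch, using a hypothetical helper name and the addresses from the sample
output above:

```c
#include <stdint.h>

/*
 * Hypothetical helper: the next sequential instruction starts right after
 * the current one, i.e. at its address plus its length in bytes.
 */
uint64_t next_sequential_insn(uint64_t ip, unsigned int ilen)
{
	return ip + ilen;
}
```

For the first line of the sample, 0x7fa555be8090 with ilen 3 gives
0x7fa555be8093, which is the address of the callq on the following line.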

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/1647871212-184070-1-git-send-email-kan.liang@linux.intel.com
[ Added the new field to tools/perf/Documentation/perf-script.txt ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 18:00:45 -03:00
Ian Rogers
bc355822f0 perf parse-events: Move slots only with topdown
If slots isn't grouped with a topdown event then moving it is unnecessary. For
example {instructions, slots} is re-ordered:

  $ perf stat -e '{instructions,slots}' -a sleep 1

   Performance counter stats for 'system wide':

         936,600,825      slots
         144,440,968      instructions

         1.006061423 seconds time elapsed

Which can break tools expecting the command line order to match the
printed order. It is necessary to move the slots event first when it
appears with topdown events. Add extra checking so that the slots event
is only moved in the case of there being a topdown event like:

  $ perf stat -e '{instructions,slots,topdown-fe-bound}' -a sleep 1

   Performance counter stats for 'system wide':

          2427568570      slots
           300927614      instructions
           551021649      topdown-fe-bound

         1.001771803 seconds time elapsed
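
The reordering rule can be sketched on plain event-name strings. This is only
an illustration of the rule: perf itself works on its parsed event list in
architecture-specific code, and move_slots_if_topdown() is a hypothetical name.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Move "slots" to the front of the group only when the group also contains
 * a topdown event; otherwise keep the command-line order untouched.
 */
void move_slots_if_topdown(const char **events, size_t n)
{
	size_t slots = n;          /* n means "slots not found" */
	bool has_topdown = false;

	for (size_t i = 0; i < n; i++) {
		if (slots == n && strcmp(events[i], "slots") == 0)
			slots = i;
		else if (strncmp(events[i], "topdown", 7) == 0)
			has_topdown = true;
	}
	if (slots == n || !has_topdown)
		return;

	/* Shift everything before "slots" one place right, then lead with it. */
	const char *s = events[slots];
	memmove(&events[1], &events[0], slots * sizeof(events[0]));
	events[0] = s;
}
```

Applied to {instructions, slots} the order is left alone, while
{instructions, slots, topdown-fe-bound} becomes
{slots, instructions, topdown-fe-bound}, matching the second listing above.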

Fixes: 94dbfd6781 ("perf parse-events: Architecture specific leader override")
Reported-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220321223344.1034479-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 17:52:58 -03:00
Arnaldo Carvalho de Melo
34fe4ccb77 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes that went thru perf/urgent and now are fixed by an
upcoming patch.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 17:52:10 -03:00
Namhyung Kim
feff08395b perf ftrace latency: Update documentation
Add description of 'perf ftrace latency' subcommand.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220321234609.90455-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 17:45:39 -03:00
Namhyung Kim
84005bb614 perf ftrace latency: Add -n/--use-nsec option
Sometimes we want to see nanosecond granularity.

  $ sudo perf ftrace latency -T dput -a sleep 1
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |    2098375 | #############################  |
       1 - 2    us |         61 |                                |
       2 - 4    us |         33 |                                |
       4 - 8    us |         13 |                                |
       8 - 16   us |        124 |                                |
      16 - 32   us |        123 |                                |
      32 - 64   us |          1 |                                |
      64 - 128  us |          0 |                                |
     128 - 256  us |          1 |                                |
     256 - 512  us |          0 |                                |
     512 - 1024 us |          0 |                                |
       1 - 2    ms |          0 |                                |
       2 - 4    ms |          0 |                                |
       4 - 8    ms |          0 |                                |
       8 - 16   ms |          0 |                                |
      16 - 32   ms |          0 |                                |
      32 - 64   ms |          0 |                                |
      64 - 128  ms |          0 |                                |
     128 - 256  ms |          0 |                                |
     256 - 512  ms |          0 |                                |
     512 - 1024 ms |          0 |                                |
       1 - ...   s |          0 |                                |

  $ sudo perf ftrace latency -T dput -a -n sleep 1
  #   DURATION     |      COUNT | GRAPH                          |
       0 - 1    us |          0 |                                |
       1 - 2    ns |          0 |                                |
       2 - 4    ns |          0 |                                |
       4 - 8    ns |          0 |                                |
       8 - 16   ns |          0 |                                |
      16 - 32   ns |          0 |                                |
      32 - 64   ns |          0 |                                |
      64 - 128  ns |    1163434 | ##############                 |
     128 - 256  ns |     914102 | #############                  |
     256 - 512  ns |        884 |                                |
     512 - 1024 ns |        613 |                                |
       1 - 2    us |         31 |                                |
       2 - 4    us |         17 |                                |
       4 - 8    us |          7 |                                |
       8 - 16   us |        123 |                                |
      16 - 32   us |         83 |                                |
      32 - 64   us |          0 |                                |
      64 - 128  us |          0 |                                |
     128 - 256  us |          0 |                                |
     256 - 512  us |          0 |                                |
     512 - 1024 us |          0 |                                |
       1 - ...  ms |          0 |                                |
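
The row layout of the tables above (one row for values below 1, then
power-of-two ranges, 22 rows in total) can be reproduced with a small
bucketing function. This sketch is inferred from the printed output only; the
names NUM_BUCKET and latency_bucket() are assumptions here, and the tool's own
code may compute the index differently.

```c
#include <stdint.h>

#define NUM_BUCKET 22	/* number of rows in the tables above */

/*
 * Map a duration, expressed in the base unit (us by default, ns with -n),
 * to a row index: row 0 is [0, 1), row i >= 1 is [2^(i-1), 2^i), and the
 * last row collects everything larger.
 */
int latency_bucket(uint64_t duration)
{
	int i = 0;

	while (duration > 0 && i < NUM_BUCKET - 1) {
		duration >>= 1;
		i++;
	}
	return i;
}
```

Switching the base unit from microseconds to nanoseconds moves a typical dput
call from row 0 into the 64 - 256 ns rows, which is why the second table shows
a distribution where the first shows a single spike.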

Committer testing:

Testing it with BPF:

  # perf ftrace latency -b -n -T dput -a sleep 1
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |          0 |                                                |
       1 - 2    ns |          0 |                                                |
       2 - 4    ns |          0 |                                                |
       4 - 8    ns |          0 |                                                |
       8 - 16   ns |          0 |                                                |
      16 - 32   ns |          0 |                                                |
      32 - 64   ns |          0 |                                                |
      64 - 128  ns |          0 |                                                |
     128 - 256  ns |     823489 | #############################################  |
     256 - 512  ns |       3232 |                                                |
     512 - 1024 ns |         51 |                                                |
       1 - 2    us |        172 |                                                |
       2 - 4    us |          9 |                                                |
       4 - 8    us |          0 |                                                |
       8 - 16   us |          2 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - ...  ms |          0 |                                                |
  [root@quaco ~]# strace -e bpf perf ftrace latency -b -n -T dput -a sleep 1
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffe2bd574f0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\20\0\0\0\20\0\0\0\5\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=45, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\08\0\0\08\0\0\0\t\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=89, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0\f\0\0\0\f\0\0\0\7\0\0\0\1\0\0\0\0\0\0\20"..., btf_log_buf=NULL, btf_size=43, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\t\0\0\0\1\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=81, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0000\0\0\0000\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=77, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0(\0\0\0(\0\0\0\5\0\0\0\0\0\0\0\0\0\0\1"..., btf_log_buf=NULL, btf_size=69, btf_log_size=0, btf_log_level=0}, 28) = -1 EINVAL (Invalid argument)
  bpf(BPF_BTF_LOAD, {btf="\237\353\1\0\30\0\0\0\0\0\0\0<\3\0\0<\3\0\0\362\3\0\0\0\0\0\0\0\0\0\2"..., btf_log_buf=NULL, btf_size=1862, btf_log_size=0, btf_log_level=0}, 28) = 3
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=4, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 4
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=2, insns=0x7ffe2bd571c0, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="test", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=8, value_size=8, max_entries=10000, map_flags=0, inner_map_fd=0, map_name="functime", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 4
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="cpu_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 5
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_HASH, key_size=4, value_size=1, max_entries=1, map_flags=0, inner_map_fd=0, map_name="task_filter", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 7
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_PERCPU_ARRAY, key_size=4, value_size=8, max_entries=22, map_flags=0, inner_map_fd=0, map_name="latency", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 8
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=32, max_entries=1, map_flags=0, inner_map_fd=0, map_name="", map_ifindex=0, btf_fd=0, btf_key_type_id=0, btf_value_type_id=0, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 9
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_SOCKET_FILTER, insn_cnt=5, insns=0x7ffe2bd57220, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 10
  bpf(BPF_MAP_CREATE, {map_type=BPF_MAP_TYPE_ARRAY, key_size=4, value_size=16, max_entries=1, map_flags=BPF_F_MMAPABLE, inner_map_fd=0, map_name="func_lat.bss", map_ifindex=0, btf_fd=3, btf_key_type_id=0, btf_value_type_id=33, btf_vmlinux_value_type_id=0, map_extra=0}, 72) = 9
  bpf(BPF_MAP_UPDATE_ELEM, {map_fd=9, key=0x7ffe2bd57330, value=0x7f9a5fc39000, flags=BPF_ANY}, 144) = 0
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=42, insns=0x113daf0, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 16, 13), prog_flags=0, prog_name="func_begin", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x113fb70, func_info_cnt=1, line_info_rec_size=16, line_info=0x113fb90, line_info_cnt=21, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 10
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_KPROBE, insn_cnt=124, insns=0x113d360, license="", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(5, 16, 13), prog_flags=0, prog_name="func_end", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=3, func_info_rec_size=8, func_info=0x113fcf0, func_info_cnt=1, line_info_rec_size=16, line_info=0x1139770, line_info_cnt=60, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 11
  bpf(BPF_PROG_LOAD, {prog_type=BPF_PROG_TYPE_TRACEPOINT, insn_cnt=2, insns=0x7ffe2bd57150, license="GPL", log_level=0, log_size=0, log_buf=NULL, kern_version=KERNEL_VERSION(0, 0, 0), prog_flags=0, prog_name="", prog_ifindex=0, expected_attach_type=BPF_CGROUP_INET_INGRESS, prog_btf_fd=0, func_info_rec_size=0, func_info=NULL, func_info_cnt=0, line_info_rec_size=0, line_info=NULL, line_info_cnt=0, attach_btf_id=0, attach_prog_fd=0, fd_array=NULL}, 144) = 13
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=13, target_fd=-1, attach_type=BPF_PERF_EVENT, flags=0}}, 144) = -1 EBADF (Bad file descriptor)
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=10, target_fd=12, attach_type=BPF_PERF_EVENT, flags=0}}, 144) = 13
  bpf(BPF_LINK_CREATE, {link_create={prog_fd=11, target_fd=14, attach_type=BPF_PERF_EVENT, flags=0}}, 144) = 15
  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=130075, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  bpf(BPF_MAP_LOOKUP_ELEM, {map_fd=8, key=0x7ffe2bd57624, value=0x113fdd0, flags=BPF_ANY}, 144) = 0
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |          0 |                                                |
       1 - 2    ns |          0 |                                                |
       2 - 4    ns |          0 |                                                |
       4 - 8    ns |          0 |                                                |
       8 - 16   ns |          0 |                                                |
      16 - 32   ns |          0 |                                                |
      32 - 64   ns |          0 |                                                |
      64 - 128  ns |          0 |                                                |
     128 - 256  ns |      42519 | ###########################################    |
     256 - 512  ns |       2140 | ##                                             |
     512 - 1024 ns |         54 |                                                |
       1 - 2    us |         16 |                                                |
       2 - 4    us |         10 |                                                |
       4 - 8    us |          0 |                                                |
       8 - 16   us |          0 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - ...  ms |          0 |                                                |
  +++ exited with 0 +++
  #
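The histogram above has 22 rows, matching max_entries=22 of the "latency" map in the trace, and every row after the first doubles in width. A minimal sketch of power-of-two bucketing that is consistent with those rows follows. The function name latency_bucket() and the loop are illustrative assumptions, not the code added by this commit.

```c
#include <stdint.h>

#define NUM_BUCKET 22	/* matches max_entries=22 of the "latency" map above */

/*
 * Illustrative only: return the histogram row for a latency 'delta'
 * expressed in the base unit. Row 0 holds delta < 1, row k holds
 * 2^(k-1) <= delta < 2^k, and the last row collects everything larger.
 */
int latency_bucket(uint64_t delta)
{
	int key;

	for (key = 0; key < NUM_BUCKET - 1; key++) {
		if (delta < (UINT64_C(1) << key))
			break;
	}
	return key;
}
```

Under this scheme a 200 ns sample lands in row 8, which is the "128 - 256 ns" row that dominates the output above.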

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Changbin Du <changbin.du@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220321234609.90455-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 17:43:46 -03:00
John Garry
7572733b84 perf tools: Fix version kernel tag
Generating the version kernel tag relies on the "git describe" command to
get the latest Linus kernel tag.

However, when working from clones of Linus' git we may not have the latest
tag. For example, when working on Arnaldo's acme.git, we can have this:

  $ git branch
  perf/core
  $ head -n 5 ../../Makefile  | tail -n 4
  VERSION = 5
  PATCHLEVEL = 17
  SUBLEVEL = 0
  EXTRAVERSION = -rc3
  $ git describe --abbrev=0 --match "v[0-9].[0-9]*"
  v4.13-rc5

Indeed using tags is a problem as it relies on tags being pulled from
Linus' git (and pushed to the clone).

In commit a4147f0f91 ("perf tools: Fix perf version generation")
Robert introduced a change to use the kernelversion rule to generate the
kernel tag when no git tags are available.

However, as mentioned above, the tag we generate may be incorrect, so
just always use kernelversion to get the tag (apart from building perf
out of tree).

Signed-off-by: John Garry <john.garry@huawei.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Link: https://lore.kernel.org/r/1645449409-158238-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 17:12:40 -03:00
Peter Zijlstra
b9067cd80f Merge branch 'kvm/kvm-sls-fix'
Sync with the last minute SLS fix to extend it for IBT.

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-22 21:12:14 +01:00
Linus Torvalds
95ab0e8768 Changes for this cycle were:
- Fix address filtering for Intel/PT,ARM/CoreSight
  - Enable Intel/PEBS format 5
  - Allow more fixed-function counters for x86
  - Intel/PT: Enable not recording Taken-Not-Taken packets
  - Add a few branch-types
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'perf-core-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 perf event updates from Ingo Molnar:

 - Fix address filtering for Intel/PT,ARM/CoreSight

 - Enable Intel/PEBS format 5

 - Allow more fixed-function counters for x86

 - Intel/PT: Enable not recording Taken-Not-Taken packets

 - Add a few branch-types

* tag 'perf-core-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel/uncore: Fix the build on !CONFIG_PHYS_ADDR_T_64BIT
  perf: Add irq and exception return branch types
  perf/x86/intel/uncore: Make uncore_discovery clean for 64 bit addresses
  perf/x86/intel/pt: Add a capability and config bit for disabling TNTs
  perf/x86/intel/pt: Add a capability and config bit for event tracing
  perf/x86/intel: Increase max number of the fixed counters
  KVM: x86: use the KVM side max supported fixed counter
  perf/x86/intel: Enable PEBS format 5
  perf/core: Allow kernel address filter when not filtering the kernel
  perf/x86/intel/pt: Fix address filter config for 32-bit kernel
  perf/core: Fix address filter parser for multiple filters
  x86: Share definition of __is_canonical_address()
  perf/x86/intel/pt: Relax address filter validation
2022-03-22 13:06:49 -07:00
Jakub Kicinski
0db8640df5 Merge https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next
Alexei Starovoitov says:

====================
pull-request: bpf-next 2022-03-21 v2

We've added 137 non-merge commits during the last 17 day(s) which contain
a total of 143 files changed, 7123 insertions(+), 1092 deletions(-).

The main changes are:

1) Custom SEC() handling in libbpf, from Andrii.

2) subskeleton support, from Delyan.

3) Use btf_tag to recognize __percpu pointers in the verifier, from Hao.

4) Fix net.core.bpf_jit_harden race, from Hou.

5) Fix bpf_sk_lookup remote_port on big-endian, from Jakub.

6) Introduce fprobe (multi kprobe) _without_ arch bits, from Masami.
The arch specific bits will come later.

7) Introduce multi_kprobe bpf programs on top of fprobe, from Jiri.

8) Enable non-atomic allocations in local storage, from Joanne.

9) Various var_off ptr_to_btf_id fixes, from Kumar.

10) bpf_ima_file_hash helper, from Roberto.

11) Add "live packet" mode for XDP in BPF_PROG_RUN, from Toke.

* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (137 commits)
  selftests/bpf: Fix kprobe_multi test.
  Revert "rethook: x86: Add rethook x86 implementation"
  Revert "arm64: rethook: Add arm64 rethook implementation"
  Revert "powerpc: Add rethook support"
  Revert "ARM: rethook: Add rethook arm implementation"
  bpftool: Fix a bug in subskeleton code generation
  bpf: Fix bpf_prog_pack when PMU_SIZE is not defined
  bpf: Fix bpf_prog_pack for multi-node setup
  bpf: Fix warning for cast from restricted gfp_t in verifier
  bpf, arm: Fix various typos in comments
  libbpf: Close fd in bpf_object__reuse_map
  bpftool: Fix print error when show bpf map
  bpf: Fix kprobe_multi return probe backtrace
  Revert "bpf: Add support to inline bpf_get_func_ip helper on x86"
  bpf: Simplify check in btf_parse_hdr()
  selftests/bpf/test_lirc_mode2.sh: Exit with proper code
  bpf: Check for NULL return from bpf_get_btf_vmlinux
  selftests/bpf: Test skipping stacktrace
  bpf: Adjust BPF stack helper functions to accommodate skip > 0
  bpf: Select proper size for bpf_prog_pack
  ...
====================

Link: https://lore.kernel.org/r/20220322050159.5507-1-alexei.starovoitov@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-22 11:18:49 -07:00
Alexei Starovoitov
7f0059b58f selftests/bpf: Fix kprobe_multi test.
When the compiler emits an endbr instruction, the function address can
differ from what bpf_get_func_ip() reports.
This is a short-term workaround.
bpf_get_func_ip() will be fixed later.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-03-22 11:09:13 -07:00
John Garry
4e666cdb06 perf tools: Fix dependency for version file creation
The version reported by perf may be stale after merely changing the
head commit, like this:

  $ git log --pretty=format:"%H" -n 1
  b5d9d4708a24ac1889a30e9aedf8af8d73102139
  $ perf -v
  perf version 5.16.gb5d9d4708a24
  $ git reset --hard HEAD^
  HEAD is now at 629f520b265f
  $ make
  ...
  $ ./perf -v
  perf version 5.16.gb5d9d4708a24

The dependencies for building PERF-VERSION-FILE should also include ORIG_HEAD,
as this changes when the head commit changes (while HEAD does not).

Signed-off-by: John Garry <john.garry@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Robert Richter <rric@kernel.org>
Link: https://lore.kernel.org/r/1645449409-158238-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-22 10:18:29 -03:00
Ido Schimmel
f70f5f1a8f selftests: forwarding: Use same VRF for port and VLAN upper
The test creates a separate VRF for the VLAN upper, but does not destroy
it during cleanup, resulting in "RTNETLINK answers: File exists" errors.

Fix by using the same VRF for the port and its VLAN upper. This is OK
since their IP addresses do not overlap.

Before:

 # ./bridge_locked_port.sh
 TEST: Locked port ipv4                                              [ OK ]
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

 # ./bridge_locked_port.sh
 RTNETLINK answers: File exists
 RTNETLINK answers: File exists
 RTNETLINK answers: File exists
 RTNETLINK answers: File exists
 RTNETLINK answers: File exists
 RTNETLINK answers: File exists
 TEST: Locked port ipv4                                              [ OK ]
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

After:

 # ./bridge_locked_port.sh
 TEST: Locked port ipv4                                              [ OK ]
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

 # ./bridge_locked_port.sh
 TEST: Locked port ipv4                                              [ OK ]
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

Fixes: b2b681a412 ("selftests: forwarding: tests of locked port feature")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-22 11:09:24 +01:00
Ido Schimmel
917b149ac3 selftests: forwarding: Disable learning before link up
Disable learning before bringing the bridge port up in order to avoid
the FDB being populated and the test failing.

Before:

 # ./bridge_locked_port.sh
 RTNETLINK answers: File exists
 TEST: Locked port ipv4                                              [FAIL]
         Ping worked after locking port, but before adding FDB entry
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

After:

 # ./bridge_locked_port.sh
 TEST: Locked port ipv4                                              [ OK ]
 TEST: Locked port ipv6                                              [ OK ]
 TEST: Locked port vlan                                              [ OK ]

Fixes: b2b681a412 ("selftests: forwarding: tests of locked port feature")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-22 11:09:24 +01:00
Linus Torvalds
b7a801f395 execve updates for v5.18-rc1
- Handle unusual AT_PHDR offsets (Akira Kawata)
 - Fix initial mapping size when PT_LOADs are not ordered (Alexey Dobriyan)
 - Move more code under CONFIG_COREDUMP (Alexey Dobriyan)
 - Fix missing mmap_lock in fill_files_note (Eric W. Biederman)
 - Remove a.out support for alpha and m68k (Eric W. Biederman)
 - Include first pages of non-exec ELF libraries in coredump (Jann Horn)
 - Don't write past end of notes for regset gap in coredump (Rick Edgecombe)
 - Comment clean-ups (Tom Rix)
 - Force single empty string when argv is empty (Kees Cook)
 - Add NULL argv selftest (Kees Cook)
 - Properly redefine PT_GNU_* in terms of PT_LOOS (Kees Cook)
 - MAINTAINERS: Update execve entry with tree (Kees Cook)
 - Introduce initial KUnit testing for binfmt_elf (Kees Cook)

Merge tag 'execve-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull execve updates from Kees Cook:
 "Execve and binfmt updates.

  Eric and I have stepped up to be the active maintainers of this area,
  so here's our first collection. The bulk of the work was in coredump
  handling fixes; additional details are noted below:

   - Handle unusual AT_PHDR offsets (Akira Kawata)

   - Fix initial mapping size when PT_LOADs are not ordered (Alexey
     Dobriyan)

   - Move more code under CONFIG_COREDUMP (Alexey Dobriyan)

   - Fix missing mmap_lock in fill_files_note (Eric W. Biederman)

   - Remove a.out support for alpha and m68k (Eric W. Biederman)

   - Include first pages of non-exec ELF libraries in coredump (Jann
     Horn)

   - Don't write past end of notes for regset gap in coredump (Rick
     Edgecombe)

   - Comment clean-ups (Tom Rix)

   - Force single empty string when argv is empty (Kees Cook)

   - Add NULL argv selftest (Kees Cook)

   - Properly redefine PT_GNU_* in terms of PT_LOOS (Kees Cook)

   - MAINTAINERS: Update execve entry with tree (Kees Cook)

   - Introduce initial KUnit testing for binfmt_elf (Kees Cook)"

* tag 'execve-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  binfmt_elf: Don't write past end of notes for regset gap
  a.out: Stop building a.out/osf1 support on alpha and m68k
  coredump: Don't compile flat_core_dump when coredumps are disabled
  coredump: Use the vma snapshot in fill_files_note
  coredump/elf: Pass coredump_params into fill_note_info
  coredump: Remove the WARN_ON in dump_vma_snapshot
  coredump: Snapshot the vmas in do_coredump
  coredump: Move definition of struct coredump_params into coredump.h
  binfmt_elf: Introduce KUnit test
  ELF: Properly redefine PT_GNU_* in terms of PT_LOOS
  MAINTAINERS: Update execve entry with more details
  exec: cleanup comments
  fs/binfmt_elf: Refactor load_elf_binary function
  fs/binfmt_elf: Fix AT_PHDR for unusual ELF files
  binfmt: move more stuff undef CONFIG_COREDUMP
  selftests/exec: Test for empty string on NULL argv
  exec: Force single empty string when argv is empty
  coredump: Also dump first pages of non-executable ELF libraries
  ELF: fix overflow in total mapping size calculation
2022-03-21 19:16:02 -07:00
Guo Zhengkui
94f19e1ec3 selftests: net: change fprintf format specifiers
`cur64`, `start64` and `ts_delta` are int64_t. Change format
specifiers in fprintf from `"%lu"` to `"%" PRId64` to adapt
to 32-bit and 64-bit systems.
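
The portability point can be sketched as follows. This is a minimal illustration, not the selftest code; the helper name format_ts_delta() and the output text are assumptions made for the example.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not from the selftest): format a signed 64-bit
 * timestamp delta into buf. PRId64 expands to the right conversion
 * specifier for int64_t on both 32-bit and 64-bit targets, whereas
 * "%lu" expects an unsigned long, which is only 32 bits wide on
 * typical 32-bit targets.
 */
int format_ts_delta(char *buf, size_t len, int64_t ts_delta)
{
	return snprintf(buf, len, "delta: %" PRId64 " us", ts_delta);
}
```

Because PRId64 is a string literal macro, it is concatenated with the neighbouring format string pieces at compile time, so no runtime cost is added.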

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Link: https://lore.kernel.org/r/20220319073730.5235-1-guozhengkui@vivo.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-21 16:37:54 -07:00
Yonghong Song
f97b8b9bd6 bpftool: Fix a bug in subskeleton code generation
When building both the kernel and selftests/bpf with clang (by adding
LLVM=1), I hit the following compilation error:

In file included from /.../tools/testing/selftests/bpf/prog_tests/subskeleton.c:6:
  ./test_subskeleton_lib.subskel.h:168:6: error: variable 'err' is used uninitialized whenever
      'if' condition is true [-Werror,-Wsometimes-uninitialized]
          if (!s->progs)
              ^~~~~~~~~
  ./test_subskeleton_lib.subskel.h:181:11: note: uninitialized use occurs here
          errno = -err;
                   ^~~
  ./test_subskeleton_lib.subskel.h:168:2: note: remove the 'if' if its condition is always false
          if (!s->progs)
          ^~~~~~~~~~~~~~

The compilation error is triggered by the following code
        ...
        int err;

        obj = (struct test_subskeleton_lib *)calloc(1, sizeof(*obj));
        if (!obj) {
                errno = ENOMEM;
                goto err;
        }
        ...

  err:
        test_subskeleton_lib__destroy(obj);
        errno = -err;
        ...
in test_subskeleton_lib__open(). The 'err' is not initialized, yet it
is used in 'errno = -err' later.

The fix is to remove 'errno = -err' since errno has been set properly
in all incoming branches.
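
The resulting error-handling shape can be sketched as below. This is a simplified stand-in, not the generated subskeleton code: struct demo_subskel, demo_subskel_open(), demo_subskel_destroy() and the fail_progs knob are all hypothetical names introduced for illustration.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for a generated subskeleton object. */
struct demo_subskel {
	int *progs;
};

void demo_subskel_destroy(struct demo_subskel *obj)
{
	if (!obj)
		return;
	free(obj->progs);
	free(obj);
}

/*
 * Sketch of the fixed shape: every failing branch sets errno itself
 * before jumping to the cleanup label, so the label does not need
 * (and must not read) a separate 'err' variable.
 * 'fail_progs' is an illustrative knob that forces the second failure.
 */
struct demo_subskel *demo_subskel_open(int fail_progs)
{
	struct demo_subskel *obj;

	obj = calloc(1, sizeof(*obj));
	if (!obj) {
		errno = ENOMEM;
		goto err;
	}

	obj->progs = fail_progs ? NULL : calloc(1, sizeof(*obj->progs));
	if (!obj->progs) {
		errno = ENOMEM;
		goto err;
	}

	return obj;
err:
	demo_subskel_destroy(obj);
	return NULL;
}
```

A caller would check for a NULL return and then read errno, which is the contract that removing 'errno = -err' preserves.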

Fixes: 00389c58ff ("bpftool: Add support for subskeletons")
Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220320032009.3106133-1-yhs@fb.com
2022-03-21 14:46:10 -07:00
Linus Torvalds
f648372dfe Thermal control updates for 5.18-rc1
- Add a new thermal driver for the Intel Hardware Feedback Interface
    (HFI) including the HFI initialization, HFI notification interrupt
    handling and sending CPU capabilities change messages to user
    space via the thermal netlink interface (Ricardo Neri, Srinivas
    Pandruvada, Nathan Chancellor, Randy Dunlap).
 
  - Extend the intel-speed-select utility to handle out-of-band CPU
    configuration changes and add support for the CPU capabilities
    change messages sent over the thermal netlink interface by the new
    HFI thermal driver to it (Srinivas Pandruvada).
 
  - Convert the DT bindings to yaml format for the Exynos platform
    and fix and update the MAINTAINERS file for this driver (Krzysztof
    Kozlowski).
 
  - Register the thermal zones as HWmon sensors for the QCom's
    Tsens driver and TI thermal platforms (Dmitry Baryshkov, Romain
    Naour).
 
  - Add the msm8953 compatible documentation in the bindings (Luca
    Weiss).
 
  - Add the sm8150 platform support to the QCom LMh driver's DT
    binding (Thara Gopinath).
 
  - Check the command result from the IPC command to the BPMP in the
    Tegra driver (Mikko Perttunen).
 
  - Silence the error for normal configuration where the interrupt
    is optional in the Broadcom thermal driver (Florian Fainelli).
 
  - Remove remaining dead code from the TI thermal driver (Yue
    Haibing).
 
  - Don't use bitmap_weight() in end_power_clamp() in the powerclamp
    driver (Yury Norov).
 
  - Update the OS policy capabilities handshake in the int340x thermal
    driver (Srinivas Pandruvada).
 
  - Increase the policies bitmap size in int340x (Srinivas Pandruvada).
 
  - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
    int340x thermal driver (Rafael Wysocki).
 
  - Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang).
 
  - Add Intel Dynamic Power and Thermal Framework (DPTF) kernel interface
    documentation (Srinivas Pandruvada).
 
  - Fix bullet list warning in the thermal documentation (Randy Dunlap).

Merge tag 'thermal-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control updates from Rafael Wysocki:
 "As far as new functionality is concerned, there is a new thermal
  driver for the Intel Hardware Feedback Interface (HFI) along with some
  intel-speed-select utility changes to support it. There are also new
  DT compatible strings for a couple of platforms, and thermal zones on
  some platforms will be registered as HWmon sensors now.

  Apart from the above, some drivers are updated (fixes mostly) and
  there is a new piece of documentation for the Intel DPTF (Dynamic
  Power and Thermal Framework) sysfs interface.

  Specifics:

   - Add a new thermal driver for the Intel Hardware Feedback Interface
     (HFI) including the HFI initialization, HFI notification interrupt
     handling and sending CPU capabilities change messages to user space
     via the thermal netlink interface (Ricardo Neri, Srinivas
     Pandruvada, Nathan Chancellor, Randy Dunlap).

   - Extend the intel-speed-select utility to handle out-of-band CPU
     configuration changes and add support for the CPU capabilities
     change messages sent over the thermal netlink interface by the new
     HFI thermal driver to it (Srinivas Pandruvada).

   - Convert the DT bindings to yaml format for the Exynos platform and
     fix and update the MAINTAINERS file for this driver (Krzysztof
     Kozlowski).

   - Register the thermal zones as HWmon sensors for the QCom's Tsens
     driver and TI thermal platforms (Dmitry Baryshkov, Romain Naour).

   - Add the msm8953 compatible documentation in the bindings (Luca
     Weiss).

   - Add the sm8150 platform support to the QCom LMh driver's DT binding
     (Thara Gopinath).

   - Check the command result from the IPC command to the BPMP in the
     Tegra driver (Mikko Perttunen).

   - Silence the error for normal configuration where the interrupt is
     optional in the Broadcom thermal driver (Florian Fainelli).

   - Remove remaining dead code from the TI thermal driver (Yue
     Haibing).

   - Don't use bitmap_weight() in end_power_clamp() in the powerclamp
     driver (Yury Norov).

   - Update the OS policy capabilities handshake in the int340x thermal
     driver (Srinivas Pandruvada).

   - Increase the policies bitmap size in int340x (Srinivas Pandruvada).

   - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
     int340x thermal driver (Rafael Wysocki).

   - Check for NULL after calling kmemdup() in int340x (Jiasheng Jiang).

   - Add Intel Dynamic Power and Thermal Framework (DPTF) kernel
     interface documentation (Srinivas Pandruvada).

   - Fix bullet list warning in the thermal documentation (Randy
     Dunlap)"

* tag 'thermal-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (30 commits)
  thermal: int340x: Update OS policy capability handshake
  thermal: int340x: Increase bitmap size
  Documentation: thermal: DPTF Documentation
  MAINTAINERS: thermal: samsung: update Krzysztof Kozlowski's email
  thermal/drivers/ti-soc-thermal: Remove unused function ti_thermal_get_temp()
  thermal/drivers/brcmstb_thermal: Interrupt is optional
  thermal: tegra-bpmp: Handle errors in BPMP response
  drivers/thermal/ti-soc-thermal: Add hwmon support
  dt-bindings: thermal: tsens: Add msm8953 compatible
  dt-bindings: thermal: Add sm8150 compatible string for LMh
  thermal/drivers/qcom/lmh: Add support for sm8150
  thermal/drivers/tsens: register thermal zones as hwmon sensors
  MAINTAINERS: thermal: samsung: Drop obsolete properties
  dt-bindings: thermal: samsung: Convert to dtschema
  tools/power/x86/intel-speed-select: v1.12 release
  tools/power/x86/intel-speed-select: HFI support
  tools/power/x86/intel-speed-select: OOB daemon mode
  thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
  thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
  thermal: Replace acpi_bus_get_device()
  ...
2022-03-21 14:35:11 -07:00
Linus Torvalds
02b82b02c3 Power management updates for 5.18-rc1
- Allow device_pm_check_callbacks() to be called from interrupt
    context without issues (Dmitry Baryshkov).
 
  - Modify devm_pm_runtime_enable() to automatically handle
    pm_runtime_dont_use_autosuspend() at driver exit time (Douglas
    Anderson).
 
  - Make the schedutil cpufreq governor use to_gov_attr_set() instead
    of open coding it (Kevin Hao).
 
  - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
    cpufreq longhaul driver (Rafael Wysocki).
 
  - Unify show() and store() naming in cpufreq and make it use
    __ATTR_XX (Lianjie Zhang).
 
  - Make the intel_pstate driver use the EPP value set by the firmware
    by default (Srinivas Pandruvada).
 
  - Re-order the init checks in the powernow-k8 cpufreq driver (Mario
    Limonciello).
 
  - Make the ACPI processor idle driver check for architectural
    support for LPI to avoid using it on x86 by mistake (Mario
    Limonciello).
 
  - Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
    Bityutskiy).
 
  - Add 'preferred_cstates' module argument to the intel_idle driver
    to work around C1 and C1E handling issue on Sapphire Rapids (Artem
    Bityutskiy).
 
  - Add core C6 optimization on Sapphire Rapids to the intel_idle
    driver (Artem Bityutskiy).
 
  - Optimize the haltpoll cpuidle driver a bit (Li RongQing).
 
  - Remove leftover text from intel_idle() kerneldoc comment and fix
    up white space in intel_idle (Rafael Wysocki).
 
  - Fix load_image_and_restore() error path (Ye Bin).
 
  - Fix typos in comments in the system wakeup handling code (Tom Rix).
 
  - Clean up non-kernel-doc comments in hibernation code (Jiapeng
    Chong).
 
  - Fix __setup handler error handling in system-wide suspend and
    hibernation core code (Randy Dunlap).
 
  - Add device name to suspend_report_result() (Youngjin Jang).
 
  - Make virtual guests honour ACPI S4 hardware signature by
    default (David Woodhouse).
 
  - Block power off of a parent PM domain unless child is in deepest
    state (Ulf Hansson).
 
  - Use dev_err_probe() to simplify error handling for generic PM
    domains (Ahmad Fatoum).
 
  - Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).
 
  - Document Intel uncore frequency scaling (Srinivas Pandruvada).
 
  - Add DTPM hierarchy description (Daniel Lezcano).
 
  - Change the locking scheme in DTPM (Daniel Lezcano).
 
  - Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
    release (Daniel Lezcano).
 
  - Make dtpm_node_callback[] static (kernel test robot).
 
  - Fix spelling mistake "initialze" -> "initialize" in
    dtpm_create_hierarchy() (Colin Ian King).
 
  - Add tracer tool for the amd-pstate driver (Jinzhou Su).
 
  - Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).
 
  - Add AMD P-State support to the cpupower utility (Huang Rui).

Merge tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These are mostly fixes and cleanups all over the code and a new piece
  of documentation for Intel uncore frequency scaling.

  Functionality-wise, the intel_idle driver will support Sapphire Rapids
  Xeons natively now (with some extra facilities for controlling
  C-states more precisely on those systems), virtual guests will take
  the ACPI S4 hardware signature into account by default, the
  intel_pstate driver will take the default EPP value from the firmware,
  cpupower utility will support the AMD P-state driver added in the
  previous cycle, and there is a new tracer utility for that driver.

  Specifics:

   - Allow device_pm_check_callbacks() to be called from interrupt
     context without issues (Dmitry Baryshkov).

   - Modify devm_pm_runtime_enable() to automatically handle
     pm_runtime_dont_use_autosuspend() at driver exit time (Douglas
     Anderson).

   - Make the schedutil cpufreq governor use to_gov_attr_set() instead
     of open coding it (Kevin Hao).

   - Replace acpi_bus_get_device() with acpi_fetch_acpi_dev() in the
     cpufreq longhaul driver (Rafael Wysocki).

   - Unify show() and store() naming in cpufreq and make it use
     __ATTR_XX (Lianjie Zhang).

   - Make the intel_pstate driver use the EPP value set by the firmware
     by default (Srinivas Pandruvada).

   - Re-order the init checks in the powernow-k8 cpufreq driver (Mario
     Limonciello).

   - Make the ACPI processor idle driver check for architectural support
     for LPI to avoid using it on x86 by mistake (Mario Limonciello).

   - Add Sapphire Rapids Xeon support to the intel_idle driver (Artem
     Bityutskiy).

   - Add 'preferred_cstates' module argument to the intel_idle driver to
     work around C1 and C1E handling issue on Sapphire Rapids (Artem
     Bityutskiy).

   - Add core C6 optimization on Sapphire Rapids to the intel_idle
     driver (Artem Bityutskiy).

   - Optimize the haltpoll cpuidle driver a bit (Li RongQing).

   - Remove leftover text from intel_idle() kerneldoc comment and fix up
     white space in intel_idle (Rafael Wysocki).

   - Fix load_image_and_restore() error path (Ye Bin).

   - Fix typos in comments in the system wakeup handling code (Tom Rix).

   - Clean up non-kernel-doc comments in hibernation code (Jiapeng
     Chong).

   - Fix __setup handler error handling in system-wide suspend and
     hibernation core code (Randy Dunlap).

   - Add device name to suspend_report_result() (Youngjin Jang).

   - Make virtual guests honour ACPI S4 hardware signature by default
     (David Woodhouse).

   - Block power off of a parent PM domain unless child is in deepest
     state (Ulf Hansson).

   - Use dev_err_probe() to simplify error handling for generic PM
     domains (Ahmad Fatoum).

   - Fix sleep-in-atomic bug caused by genpd_debug_remove() (Shawn Guo).

   - Document Intel uncore frequency scaling (Srinivas Pandruvada).

   - Add DTPM hierarchy description (Daniel Lezcano).

   - Change the locking scheme in DTPM (Daniel Lezcano).

   - Fix dtpm_cpu cleanup at exit time and missing virtual DTPM pointer
     release (Daniel Lezcano).

   - Make dtpm_node_callback[] static (kernel test robot).

   - Fix spelling mistake "initialze" -> "initialize" in
     dtpm_create_hierarchy() (Colin Ian King).

   - Add tracer tool for the amd-pstate driver (Jinzhou Su).

   - Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).

   - Add AMD P-State support to the cpupower utility (Huang Rui)"

* tag 'pm-5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (58 commits)
  cpufreq: powernow-k8: Re-order the init checks
  cpuidle: intel_idle: Drop redundant backslash at line end
  cpuidle: intel_idle: Update intel_idle() kerneldoc comment
  PM: hibernate: Honour ACPI hardware signature by default for virtual guests
  cpufreq: intel_pstate: Use firmware default EPP
  cpufreq: unify show() and store() naming and use __ATTR_XX
  PM: core: keep irq flags in device_pm_check_callbacks()
  cpuidle: haltpoll: Call cpuidle_poll_state_init() later
  Documentation: amd-pstate: add tracer tool introduction
  tools/power/x86/amd_pstate_tracer: Add tracer tool for AMD P-state
  tools/power/x86/intel_pstate_tracer: make tracer as a module
  cpufreq: amd-pstate: Add more tracepoint for AMD P-State module
  PM: sleep: Add device name to suspend_report_result()
  turbostat: fix PC6 displaying on some systems
  intel_idle: add core C6 optimization for SPR
  intel_idle: add 'preferred_cstates' module argument
  intel_idle: add SPR support
  PM: runtime: Have devm_pm_runtime_enable() handle pm_runtime_dont_use_autosuspend()
  ACPI: processor idle: Check for architectural support for LPI
  cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
  ...
2022-03-21 14:26:28 -07:00
Linus Torvalds
d2eb5500f1 LKMM pull request for v5.18
This series contains an improved explanation of syntactic and semantic
 dependencies from Alan Stern.

Merge tag 'lkmm.2022.03.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull memory model doc update from Paul McKenney:
 "An improved explanation of syntactic and semantic dependencies from
  Alan Stern"

* tag 'lkmm.2022.03.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
  tools/memory-model: Explain syntactic and semantic dependencies
2022-03-21 14:09:34 -07:00
German Gomez
cd6382d827 perf test arm64: Test unwinding using frame-pointer (fp) mode
Add a shell script to check that the call-graphs generated using frame
pointers (--call-graph fp) are complete and not missing leaf functions:

  | $ perf test 88 -v
  |  88: Check Arm64 callgraphs are complete in fp mode                  :
  | --- start ---
  | test child forked, pid 8734
  |  + Compiling test program (/tmp/test_program.Cz3yL)...
  |  + Recording (PID=8749)...
  |  + Stopping perf-record...
  | test_program.Cz
  |                  728 leaf
  |                  753 parent
  |                  76c main
  | test child finished with 0
  | ---- end ----
  | Check Arm SPE callgraphs are complete in fp mode: Ok

It's supposed to work with both unwinders:

  | $ make                # for libunwind (default)
  | $ make NO_LIBUNWIND=1 # for libdw

Tester notes:

Ran it on N1SDP and it passes, and it fails if b9f6fbb3b2 ("perf
arm64: Inject missing frames when using 'perf record --call-graph=fp'")
isn't applied.

Fixes: b9f6fbb3b2 ("perf arm64: Inject missing frames when using 'perf record --call-graph=fp'")
Suggested-by: Jiri Olsa <jolsa@kernel.org>
Reviewed-by: James Clark <james.clark@arm.com>
Tested-by: James Clark <james.clark@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220316172015.98000-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-21 18:05:23 -03:00
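As a rough illustration of what "complete" means in the perf test commit above, the following Python sketch checks that an unwound call chain is exactly leaf, parent, main. The helper name callgraph_is_complete and the parsing rule are assumptions made for illustration; the actual test is a shell script and may inspect the perf output differently.

```python
def callgraph_is_complete(perf_output, expected=("leaf", "parent", "main")):
    """Return True if the call chain is exactly the expected symbols, in order.

    Hypothetical helper: assumes each call-chain line looks like
    '<hex address> <symbol>', as in the sample output of the commit message.
    """
    symbols = []
    for line in perf_output.splitlines():
        fields = line.split()
        # Keep only '<hex address> <symbol>' lines; skip headers and noise.
        if len(fields) != 2:
            continue
        try:
            int(fields[0], 16)
        except ValueError:
            continue
        symbols.append(fields[1])
    return tuple(symbols) == tuple(expected)


# Call chain taken from the commit message.
sample = """test_program.Cz
                 728 leaf
                 753 parent
                 76c main
"""

# Same chain with the leaf frame missing, the failure the test guards against.
missing_leaf = """test_program.Cz
                 753 parent
                 76c main
"""
```

With these inputs, callgraph_is_complete(sample) is true and callgraph_is_complete(missing_leaf) is false.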
Linus Torvalds
35dc0352bb RCU pull request for v5.18
This pull request contains the following branches:
 
 exp.2022.02.24a: Contains a fix for idle detection from Neeraj Upadhyay
 	and missing access marking detected by KCSAN.
 
 fixes.2022.02.14a: Miscellaneous fixes.
 
 rcu_barrier.2022.02.08a: Reduces coupling between rcu_barrier() and
 	CPU-hotplug operations, so that rcu_barrier() no longer needs
 	to do cpus_read_lock().  This may also someday allow system
 	boot to bring CPUs online concurrently.
 
 rcu-tasks.2022.02.08a: Enable more aggressive movement to per-CPU
 	queueing when reacting to excessive lock contention due
 	to workloads placing heavy update-side stress on RCU tasks.
 
 rt.2022.02.01b: Improvements to RCU priority boosting, including
 	changes from Neeraj Upadhyay, Zqiang, and Alison Chaiken.
 
 torture.2022.02.01b: Various fixes improving test robustness and
 	debug information.
 
 torturescript.2022.02.08a: Add tests for SRCU size transitions, further
 	compress torture.sh build products, and improve debug output.

Merge tag 'rcu.2022.03.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

Pull RCU updates from Paul McKenney:

 - Fix idle detection (Neeraj Upadhyay) and missing access marking
   detected by KCSAN.

 - Reduce coupling between rcu_barrier() and CPU-hotplug operations, so
   that rcu_barrier() no longer needs to do cpus_read_lock(). This may
   also someday allow system boot to bring CPUs online concurrently.

 - Enable more aggressive movement to per-CPU queueing when reacting to
   excessive lock contention due to workloads placing heavy update-side
   stress on RCU tasks.

 - Improvements to RCU priority boosting, including changes from Neeraj
   Upadhyay, Zqiang, and Alison Chaiken.

 - Various fixes improving test robustness and debug information.

 - Add tests for SRCU size transitions, further compress torture.sh
   build products, and improve debug output.

 - Miscellaneous fixes.

* tag 'rcu.2022.03.13a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (49 commits)
  rcu: Replace cpumask_weight with cpumask_empty where appropriate
  rcu: Remove __read_mostly annotations from rcu_scheduler_active externs
  rcu: Uninline multi-use function: finish_rcuwait()
  rcu: Mark writes to the rcu_segcblist structure's ->flags field
  kasan: Record work creation stack trace with interrupts enabled
  rcu: Inline __call_rcu() into call_rcu()
  rcu: Add mutex for rcu boost kthread spawning and affinity setting
  rcu: Fix description of kvfree_rcu()
  MAINTAINERS:  Add Frederic and Neeraj to their RCU files
  rcutorture: Provide non-power-of-two Tasks RCU scenarios
  rcutorture: Test SRCU size transitions
  torture: Make torture.sh help message match reality
  rcu-tasks: Set ->percpu_enqueue_shift to zero upon contention
  rcu-tasks: Use order_base_2() instead of ilog2()
  rcu: Create and use an rcu_rdp_cpu_online()
  rcu: Make rcu_barrier() no longer block CPU-hotplug operations
  rcu: Rework rcu_barrier() and callback-migration logic
  rcu: Refactor rcu_barrier() empty-list handling
  rcu: Kill rnp->ofl_seq and use only rcu_state.ofl_lock for exclusion
  torture: Change KVM environment variable to RCUTORTURE
  ...
2022-03-21 14:00:56 -07:00
Linus Torvalds
3fd33273a4 Reenable ENQCMD/PASID support:
- Simplify the PASID handling to allocate the PASID once, associate it to
    the mm of a process and free it on mm_exit(). The previous attempt of
    refcounted PASIDs and dynamic alloc()/free() turned out to be error
    prone and too complex. The PASID space is 20bits, so the case of
    resource exhaustion is a pure academic concern.
 
  - Populate the PASID MSR on demand via #GP to avoid racy updates via IPIs.
 
  - Reenable ENQCMD and let objtool check for the forbidden usage of ENQCMD
    in the kernel.
 
  - Update the documentation for Shared Virtual Addressing accordingly.

Merge tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 PASID support from Thomas Gleixner:
 "Reenable ENQCMD/PASID support:

   - Simplify the PASID handling to allocate the PASID once, associate
     it to the mm of a process and free it on mm_exit().

     The previous attempt of refcounted PASIDs and dynamic
     alloc()/free() turned out to be error prone and too complex. The
     PASID space is 20bits, so the case of resource exhaustion is a pure
     academic concern.

   - Populate the PASID MSR on demand via #GP to avoid racy updates via
     IPIs.

   - Reenable ENQCMD and let objtool check for the forbidden usage of
     ENQCMD in the kernel.

   - Update the documentation for Shared Virtual Addressing accordingly"

* tag 'x86-pasid-2022-03-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation/x86: Update documentation for SVA (Shared Virtual Addressing)
  tools/objtool: Check for use of the ENQCMD instruction in the kernel
  x86/cpufeatures: Re-enable ENQCMD
  x86/traps: Demand-populate PASID MSR via #GP
  sched: Define and initialize a flag to identify valid PASID in the task
  x86/fpu: Clear PASID when copying fpstate
  iommu/sva: Assign a PASID to mm on PASID allocation and free it on mm exit
  kernel/fork: Initialize mm's PASID
  iommu/ioasid: Introduce a helper to check for valid PASIDs
  mm: Change CONFIG option for mm->pasid field
  iommu/sva: Rename CONFIG_IOMMU_SVA_LIB to CONFIG_IOMMU_SVA
2022-03-21 12:28:13 -07:00
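The PASID lifetime described in the merge above (allocate once, tie to the mm, free on mm exit) can be pictured with a toy Python model. This is only a sketch: the classes Mm and PasidAllocator, the free list, and the choice to start numbering at 1 are assumptions for illustration and do not mirror the kernel's data structures.

```python
PASID_BITS = 20
PASID_SPACE = 1 << PASID_BITS  # number of distinct 20-bit values


class Mm:
    """Hypothetical stand-in for a process address space."""

    def __init__(self):
        self.pasid = None  # no PASID until one is first needed


class PasidAllocator:
    """Toy allocator: one PASID per mm, released only when the mm exits."""

    def __init__(self):
        self._free = []  # PASIDs handed back by exited mms
        self._next = 1   # assumption: numbering starts at 1

    def bind(self, mm):
        """Allocate on first use; later calls reuse the same PASID."""
        if mm.pasid is None:
            if self._free:
                mm.pasid = self._free.pop()
            elif self._next < PASID_SPACE:
                mm.pasid = self._next
                self._next += 1
            else:
                raise RuntimeError("PASID space exhausted")
        return mm.pasid

    def mm_exit(self, mm):
        """Release the PASID exactly once, when the mm goes away."""
        if mm.pasid is not None:
            self._free.append(mm.pasid)
            mm.pasid = None
```

Because there is no reference counting, repeated bind() calls on the same Mm return the same value, and the only release point is mm_exit(). With 20 bits the space holds 1,048,576 values, which is why the message calls exhaustion an academic concern.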
Linus Torvalds
61e2658e37 - A couple of fixes and improvements to the SGX selftests

Merge tag 'x86_sgx_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 SGX updates from Borislav Petkov:

 - A couple of fixes and improvements to the SGX selftests

* tag 'x86_sgx_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  selftests/sgx: Treat CC as one argument
  selftests/x86: Add validity check and allow field splitting
  selftests/sgx: Remove extra newlines in test output
  selftests/sgx: Ensure enclave data available during debug print
  selftests/sgx: Do not attempt enclave build without valid enclave
  selftests/sgx: Fix NULL-pointer-dereference upon early test failure
2022-03-21 11:37:17 -07:00
Linus Torvalds
2268735045 - Add support for a couple new insn sets to the insn decoder: AVX512-FP16,
AMX, other misc insns.
 
 - Update VMware-specific MAINTAINERS entries

Merge tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull misc x86 updates from Borislav Petkov:

 - Add support for a couple new insn sets to the insn decoder:
   AVX512-FP16, AMX, other misc insns.

 - Update VMware-specific MAINTAINERS entries

* tag 'x86_misc_for_v5.18_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  MAINTAINERS: Mark VMware mailing list entries as email aliases
  MAINTAINERS: Add Zack as maintainer of vmmouse driver
  MAINTAINERS: Update maintainers for paravirt ops and VMware hypervisor interface
  x86/insn: Add AVX512-FP16 instructions to the x86 instruction decoder
  perf/tests: Add AVX512-FP16 instructions to x86 instruction decoder test
  x86/insn: Add misc instructions to x86 instruction decoder
  perf/tests: Add misc instructions to the x86 instruction decoder test
  x86/insn: Add AMX instructions to the x86 instruction decoder
  perf/tests: Add AMX instructions to x86 instruction decoder test
2022-03-21 11:19:00 -07:00
Linus Torvalds
356a1adca8 arm64 updates for 5.18
- Support for including MTE tags in ELF coredumps
 
 - Instruction encoder updates, including fixes to 64-bit immediate
   generation and support for the LSE atomic instructions
 
 - Improvements to kselftests for MTE and fpsimd
 
 - Symbol aliasing and linker script cleanups
 
 - Reduce instruction cache maintenance performed for user mappings
   created using contiguous PTEs
 
 - Support for the new "asymmetric" MTE mode, where stores are checked
   asynchronously but loads are checked synchronously
 
 - Support for the latest pointer authentication algorithm ("QARMA3")
 
 - Support for the DDR PMU present in the Marvell CN10K platform
 
 - Support for the CPU PMU present in the Apple M1 platform
 
 - Use the RNDR instruction for arch_get_random_{int,long}()
 
 - Update our copy of the Arm optimised string routines for str{n}cmp()
 
 - Fix signal frame generation for CPUs which have foolishly elected to
   avoid building in support for the fpsimd instructions
 
 - Workaround for Marvell GICv3 erratum #38545
 
 - Clarification to our Documentation (booting reqs. and MTE prctl())
 
 - Miscellaneous cleanups and minor fixes

Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux

Pull arm64 updates from Will Deacon:

 - Support for including MTE tags in ELF coredumps

 - Instruction encoder updates, including fixes to 64-bit immediate
   generation and support for the LSE atomic instructions

 - Improvements to kselftests for MTE and fpsimd

 - Symbol aliasing and linker script cleanups

 - Reduce instruction cache maintenance performed for user mappings
   created using contiguous PTEs

 - Support for the new "asymmetric" MTE mode, where stores are checked
   asynchronously but loads are checked synchronously

 - Support for the latest pointer authentication algorithm ("QARMA3")

 - Support for the DDR PMU present in the Marvell CN10K platform

 - Support for the CPU PMU present in the Apple M1 platform

 - Use the RNDR instruction for arch_get_random_{int,long}()

 - Update our copy of the Arm optimised string routines for str{n}cmp()

 - Fix signal frame generation for CPUs which have foolishly elected to
   avoid building in support for the fpsimd instructions

 - Workaround for Marvell GICv3 erratum #38545

 - Clarification to our Documentation (booting reqs. and MTE prctl())

 - Miscellaneous cleanups and minor fixes

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (90 commits)
  docs: sysfs-devices-system-cpu: document "asymm" value for mte_tcf_preferred
  arm64/mte: Remove asymmetric mode from the prctl() interface
  arm64: Add cavium_erratum_23154_cpus missing sentinel
  perf/marvell: Fix !CONFIG_OF build for CN10K DDR PMU driver
  arm64: mm: Drop 'const' from conditional arm64_dma_phys_limit definition
  Documentation: vmcoreinfo: Fix htmldocs warning
  kasan: fix a missing header include of static_keys.h
  drivers/perf: Add Apple icestorm/firestorm CPU PMU driver
  drivers/perf: arm_pmu: Handle 47 bit counters
  arm64: perf: Consistently make all event numbers as 16-bits
  arm64: perf: Expose some Armv9 common events under sysfs
  perf/marvell: cn10k DDR perf event core ownership
  perf/marvell: cn10k DDR perfmon event overflow handling
  perf/marvell: CN10k DDR performance monitor support
  dt-bindings: perf: marvell: cn10k ddr performance monitor
  arm64: clean up tools Makefile
  perf/arm-cmn: Update watchpoint format
  perf/arm-cmn: Hide XP PUB events for CMN-600
  arm64: drop unused includes of <linux/personality.h>
  arm64: Do not defer reserve_crashkernel() for platforms with no DMA memory zones
  ...
2022-03-21 10:46:39 -07:00
Linus Torvalds
9d8e7007dc tpmdd updates for Linux v5.18

Merge tag 'tpmdd-next-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:
 "In order to split the work a bit we've aligned with David Howells more
  or less that I take more hardware/firmware aligned keyring patches,
  and he takes care more of the framework aligned patches.

  For TPM the patches worth highlighting are the fixes for
  refcounting provided by Lino Sanfilippo and James Bottomley.

  Eric B. has done a bunch of obvious (but important) fixes but there's
  one that is a bit controversial: removal of asym_tpm. It was added in
  2018 when TPM1 was already declared insecure and the world had moved
  on to TPM2. I don't know how this has passed all the filters but I did
  not have a chance to see the patches when they were out. I simply
  cannot commit to maintaining this because it was from all angles just
  wrong to take it into the mainline kernel in the first place. Nobody
  should really use this module for anything.

  Finally, there is a new keyring '.machine' to hold MOK keys ('Machine
  Owner Keys'). On the MOK side the MokListTrustedRT UEFI variable can
  be set, from which the kernel knows that MOK keys are kernel trusted
  keys and they are populated to the machine keyring. This keyring is
  linked to the secondary trusted keyring, which means that its keys can
  be used like any kernel trusted keys. This keyring can of course be
  used to hold other MOK'ish keys on other platforms in future"

* tag 'tpmdd-next-v5.18-v2' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: (24 commits)
  tpm: use try_get_ops() in tpm-space.c
  KEYS: asymmetric: properly validate hash_algo and encoding
  KEYS: asymmetric: enforce that sig algo matches key algo
  KEYS: remove support for asym_tpm keys
  tpm: fix reference counting for struct tpm_chip
  integrity: Only use machine keyring when uefi_check_trust_mok_keys is true
  integrity: Trust MOK keys if MokListTrustedRT found
  efi/mokvar: move up init order
  KEYS: Introduce link restriction for machine keys
  KEYS: store reference to machine keyring
  integrity: add new keyring handler for mok keys
  integrity: Introduce a Linux keyring called machine
  integrity: Fix warning about missing prototypes
  KEYS: trusted: Avoid calling null function trusted_key_exit
  KEYS: trusted: Fix trusted key backends when building as module
  tpm: xen-tpmfront: Use struct_size() helper
  KEYS: x509: remove dead code that set ->unsupported_sig
  KEYS: x509: remove never-set ->unsupported_key flag
  KEYS: x509: remove unused fields
  KEYS: x509: clearly distinguish between key and signature algorithms
  ...
2022-03-21 10:26:29 -07:00
Matthew Wilcox (Oracle)
72e7258874 selftests/vm/transhuge-stress: Support file-backed PMD folios
Add a -f <filename> option to test PMD folios on files

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
2022-03-21 13:01:36 -04:00
Hengqi Chen
d0f325c34c libbpf: Close fd in bpf_object__reuse_map
pin_fd is dup-ed and assigned in bpf_map__reuse_fd. Close it
in bpf_object__reuse_map after reuse.

Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220319030533.3132250-1-hengqi.chen@gmail.com
2022-03-21 15:36:52 +01:00
Yafang Shao
1824d8ea75 bpftool: Fix print error when show bpf map
If there is no btf_id or frozen, it will not show the pids, but the pids don't
depend on any one of them.

Below is the result after this change:

  $ ./bpftool map show
  2: lpm_trie  flags 0x1
	key 8B  value 8B  max_entries 1  memlock 4096B
	pids systemd(1)
  3: lpm_trie  flags 0x1
	key 20B  value 8B  max_entries 1  memlock 4096B
	pids systemd(1)

Before this change, 'pids systemd(1)' was not displayed.

Fixes: 9330986c03 ("bpf: Add bloom filter map implementation")
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220320060815.7716-1-laoar.shao@gmail.com
2022-03-21 14:58:06 +01:00
Hangbin Liu
ec80906b0f selftests/bpf/test_lirc_mode2.sh: Exit with proper code
When executing test_lirc_mode2_user fails, the test reports a failure but
still exits with 0. Fix it by exiting with an error code.

Another issue is the LIRCDEV check. With the shell's -n test, we need to
quote the variable, or the test will always be true. So if
test_lirc_mode2_user was not run, just exit with the skip code.

Fixes: 6bdd533cee ("bpf: add selftest for lirc_mode2 type program")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220321024149.157861-1-liuhangbin@gmail.com
2022-03-21 14:48:06 +01:00
Namhyung Kim
e1cc1f3998 selftests/bpf: Test skipping stacktrace
Add a test case for stacktrace with skip > 0 using a small sized
buffer.  It didn't support skipping entries greater than or equal to
the size of buffer and filled the skipped part with 0.
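
As a rough illustration of the behaviour the test expects (a toy model, not the kernel helper), skipping should drop the first `skip` frames and then fill the buffer from the remaining ones, even when `skip` is at least as large as the buffer. The function name and frame values below are made up.

```python
def expected_stack(frames, skip, buf_entries):
    """Frames a buffer of buf_entries slots should hold after skipping."""
    return frames[skip:skip + buf_entries]


# Hypothetical call chain, innermost frame first.
frames = [0x1000, 0x2000, 0x3000, 0x4000, 0x5000, 0x6000]

# skip (3) is larger than the buffer (2 entries): the buffer should still be
# filled with real frames, with no zeroed slots standing in for skipped ones.
result = expected_stack(frames, skip=3, buf_entries=2)
print([hex(f) for f in result])
```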

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220314182042.71025-2-namhyung@kernel.org
2022-03-20 19:16:50 -07:00
Jakub Sitnicki
ce52368001 selftests/bpf: Fix test for 4-byte load from remote_port on big-endian
The context access converter rewrites the 4-byte load from
bpf_sk_lookup->remote_port to a 2-byte load from bpf_sk_lookup_kern
structure.

It means that we cannot treat the destination register contents as a 32-bit
value, or the code will not be portable across big- and little-endian
architectures.

This is exactly the same case as with 4-byte loads from bpf_sock->dst_port
so follow the approach outlined in [1] and treat the register contents as a
16-bit value in the test.

[1]: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com/

Fixes: 2ed0dc5937 ("selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-4-jakub@cloudflare.com
2022-03-20 18:59:00 -07:00
Jakub Sitnicki
3c69611b89 selftests/bpf: Fix u8 narrow load checks for bpf_sk_lookup remote_port
In commit 9a69e2b385 ("bpf: Make remote_port field in struct
bpf_sk_lookup 16-bit wide") ->remote_port field changed from __u32 to
__be16.

However, narrow load tests which exercise 1-byte sized loads from
offsetof(struct bpf_sk_lookup, remote_port) were not adapted to reflect the
change.

As a result, on little-endian we continue testing loads from addresses:

 - (__u8 *)&ctx->remote_port + 3
 - (__u8 *)&ctx->remote_port + 4

which map to the zero padding following the remote_port field, and don't
break the tests because there is no observable change.

While on big-endian, we observe breakage because tests expect to see zeros
for values loaded from:

 - (__u8 *)&ctx->remote_port - 1
 - (__u8 *)&ctx->remote_port - 2

Above addresses map to ->remote_ip6 field, which precedes ->remote_port,
and are populated during the bpf_sk_lookup IPv6 tests.

Unsurprisingly, on s390x we observe:

  #136/38 sk_lookup/narrow access to ctx v4:OK
  #136/39 sk_lookup/narrow access to ctx v6:FAIL

Fix it by removing the checks for 1-byte loads from offsets outside of the
->remote_port field.

Fixes: 9a69e2b385 ("bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide")
Suggested-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220319183356.233666-3-jakub@cloudflare.com
2022-03-20 18:58:59 -07:00
Joanne Koong
0e790cbb1a selftests/bpf: Test for associating multiple elements with the local storage
This patch adds a few calls to the existing local storage selftest to
test that we can associate multiple elements with the local storage.

The sleepable program's call to bpf_sk_storage_get with sk_storage_map2
will lead to an allocation of a new selem under the GFP_KERNEL flag.

Signed-off-by: Joanne Koong <joannelkoong@gmail.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220318045553.3091807-3-joannekoong@fb.com
2022-03-20 18:55:05 -07:00
Andrii Nakryiko
a8fee96202 libbpf: Avoid NULL deref when initializing map BTF info
If the BPF object doesn't have BTF info, don't attempt to search for BTF
types describing BPF map key or value layout.

Fixes: 262cfb74ff ("libbpf: Init btf_{key,value}_type_id on internal map open")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220320001911.3640917-1-andrii@kernel.org
2022-03-20 18:53:04 -07:00
Ian Rogers
7bd1da15d2 perf parse-events: Ignore case in topdown.slots check
An issue with icelakex metrics:

  https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/tree/tools/perf/pmu-events/arch/x86/icelakex/icx-metrics.json?h=perf/core&id=65eab2bc7dab326ee892ec5a4c749470b368b51a#n48

That causes the slots not to be first.

Fixes: 94dbfd6781 ("perf parse-events: Architecture specific leader override")
Reported-by: Caleb Biggers <caleb.biggers@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220317224309.543736-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Ian Rogers
8b464eac97 perf evlist: Avoid iteration for empty evlist.
As seen with 'perf stat --null ..' and reported in:
https://lore.kernel.org/lkml/YjCLcpcX2peeQVCH@kernel.org/

v2. Avoids setting evsel in the empty list case as suggested by Jiri Olsa.

    Committer testing:

Before:

  $  perf stat --null sleep 1
  Segmentation fault (core dumped)
  $

After:

  $  perf stat --null sleep 1

   Performance counter stats for 'sleep 1':

         1.010340646 seconds time elapsed

         0.001420000 seconds user
         0.000000000 seconds sys
  $

Fixes: 472832d2c0 ("perf evlist: Refactor evlist__for_each_cpu()")
Reported-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220317231643.550902-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Michael Petlan
3cf6a32f3f perf symbols: Fix symbol size calculation condition
Before this patch, two conditions had to be met for the symbol end address
fixup to be called:

  if (prev->end == prev->start && prev->end != curr->start)

Where
  "prev->end == prev->start" means that prev is zero-long
                             (and thus needs a fixup)
and
  "prev->end != curr->start" means that fixup hasn't been applied yet

However, this logic is incorrect in the following situation:

*curr  = {rb_node = {__rb_parent_color = 278218928,
  rb_right = 0x0, rb_left = 0x0},
  start = 0xc000000000062354,
  end = 0xc000000000062354, namelen = 40, type = 2 '\002',
  binding = 0 '\000', idle = 0 '\000', ignore = 0 '\000',
  inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
  name = 0x1159739e "kprobe_optinsn_page\t[__builtin__kprobes]"}

*prev = {rb_node = {__rb_parent_color = 278219041,
  rb_right = 0x109548b0, rb_left = 0x109547c0},
  start = 0xc000000000062354,
  end = 0xc000000000062354, namelen = 12, type = 2 '\002',
  binding = 1 '\001', idle = 0 '\000', ignore = 0 '\000',
  inlined = 0 '\000', arch_sym = 0 '\000', annotate2 = false,
  name = 0x1095486e "optinsn_slot"}

In this case, prev->start == prev->end == curr->start == curr->end,
so the condition above concludes that "we need a fixup due to the zero
length of the prev symbol, but it has probably been done already, since
prev->end == curr->start", which is wrong.

After the patch, the execution path proceeds to arch__symbols__fixup_end
function which fixes up the size of prev symbol by adding page_size to
its end offset.
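
The failing case can be checked with a small model. This is a sketch under stated assumptions: `Sym`, `old_needs_fixup` and `relaxed_needs_fixup` are hypothetical names, only the start/end values come from the dumps above, and the relaxed condition is one possible fix shown for contrast, not a quote from the patch.

```python
from dataclasses import dataclass


@dataclass
class Sym:
    name: str
    start: int
    end: int


def old_needs_fixup(prev, curr):
    # The condition quoted in the commit message.
    return prev.end == prev.start and prev.end != curr.start


def relaxed_needs_fixup(prev, curr):
    # Assumption: treat a zero-length prev as needing a fixup on its own.
    return prev.end == prev.start or prev.end != curr.start


prev = Sym("optinsn_slot", 0xC000000000062354, 0xC000000000062354)
curr = Sym("kprobe_optinsn_page", 0xC000000000062354, 0xC000000000062354)

# prev is zero-length, yet the old condition skips the fixup because
# prev.end happens to equal curr.start.
print("old condition:", old_needs_fixup(prev, curr))
print("relaxed condition:", relaxed_needs_fixup(prev, curr))
```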

Fixes: 3b01a413c1 ("perf symbols: Improve kallsyms symbol end addr calculation")
Signed-off-by: Michael Petlan <mpetlan@redhat.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20220317135536.805-1-mpetlan@redhat.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 18:39:09 -03:00
Guillaume Nault
ec730c3e1f selftest: net: Test IPv4 PMTU exceptions with DSCP and ECN
Add two tests to pmtu.sh, for verifying that PMTU exceptions get
properly created for routes that don't belong to the main table.

A fib-rule based on the packet's DSCP field is used to jump to the
correct table. ECN shouldn't interfere with this process, so each test
has two components: one that only sets DSCP and one that sets both DSCP
and ECN.
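
For reference, the dsfield byte holds DSCP in its upper six bits and ECN in its lower two, so a rule keyed on DSCP should match regardless of the ECN bits. The sketch below only illustrates that bit layout; the helper names and the DSCP value are arbitrary and are not taken from pmtu.sh.

```python
def make_dsfield(dscp, ecn=0):
    """Combine a 6-bit DSCP and a 2-bit ECN value into one dsfield byte."""
    if not 0 <= dscp < 64 or not 0 <= ecn < 4:
        raise ValueError("dscp must fit in 6 bits and ecn in 2 bits")
    return (dscp << 2) | ecn


def dscp_of(dsfield):
    """Extract the DSCP bits, ignoring ECN."""
    return dsfield >> 2


dscp_only = make_dsfield(0x04)            # DSCP set, ECN clear
dscp_and_ecn = make_dsfield(0x04, ecn=1)  # same DSCP, ECN bits set

# Both packets carry the same DSCP, so they should hit the same fib-rule.
print(hex(dscp_only), hex(dscp_and_ecn))
print(dscp_of(dscp_only) == dscp_of(dscp_and_ecn))
```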

One of the tests triggers PMTU exceptions using ICMP Echo Requests, the
other using UDP packets (to test different handlers in the kernel).

A few adjustments are necessary in the rest of the script to allow
policy routing scenarios:

  * Add global variable rt_table that allows setup_routing_*() to
    add routes to a specific routing table. By default rt_table is set
    to "main", so existing tests don't need to be modified.

  * Another global variable, policy_mark, is used to define which
    dsfield value is used for policy routing. This variable has no
    effect on tests that don't use policy routing.

  * The UDP version of the test uses socat, so cleanup() now also needs
    to kill socat PIDs.

  * route_get_dst_pmtu_from_exception() and route_get_dst_exception()
    now take an optional third argument specifying the dsfield. If
    not specified, 0 is used, so existing users don't need to be
    modified.

Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-18 14:06:45 -07:00
Rafael J. Wysocki
31035f3e20 Merge branch 'thermal-hfi'
Merge Intel Hardware Feedback Interface (HFI) thermal driver for
5.18-rc1 and update the intel-speed-select utility to support that
driver.

* thermal-hfi:
  tools/power/x86/intel-speed-select: v1.12 release
  tools/power/x86/intel-speed-select: HFI support
  tools/power/x86/intel-speed-select: OOB daemon mode
  thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
  thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
  thermal: intel: hfi: Notify user space for HFI events
  thermal: netlink: Add a new event to notify CPU capabilities change
  thermal: intel: hfi: Enable notification interrupt
  thermal: intel: hfi: Handle CPU hotplug events
  thermal: intel: hfi: Minimally initialize the Hardware Feedback Interface
  x86/cpu: Add definitions for the Intel Hardware Feedback Interface
  x86/Documentation: Describe the Intel Hardware Feedback Interface
2022-03-18 19:00:26 +01:00
Rafael J. Wysocki
ec3d8b8365 Merge branch 'pm-tools'
Merge power management utilities changes for 5.18-rc1:

 - Add tracer tool for the amd-pstate driver (Jinzhou Su).

 - Fix PC6 displaying in turbostat on some systems (Artem Bityutskiy).

 - Add AMD P-State support to the cpupower utility (Huang Rui).

* pm-tools:
  Documentation: amd-pstate: add tracer tool introduction
  tools/power/x86/amd_pstate_tracer: Add tracer tool for AMD P-state
  tools/power/x86/intel_pstate_tracer: make tracer as a module
  cpufreq: amd-pstate: Add more tracepoint for AMD P-State module
  turbostat: fix PC6 displaying on some systems
  cpupower: Add "perf" option to print AMD P-State information
  cpupower: Add function to print AMD P-State performance capabilities
  cpupower: Move print_speed function into misc helper
  cpupower: Enable boost state support for AMD P-State module
  cpupower: Add AMD P-State sysfs definition and access helper
  cpupower: Introduce ACPI CPPC library
  cpupower: Add the function to get the sysfs value from specific table
  cpupower: Initial AMD P-State capability
  cpupower: Add the function to check AMD P-State enabled
  cpupower: Add AMD P-State capability flag
  tools/power/cpupower/{ToDo => TODO}: Rename the todo file
  tools: cpupower: fix typo in cpupower-idle-set(1) manpage
2022-03-18 18:46:15 +01:00
Andrii Nakryiko
08063b4bc1 bpftool: Add BPF_TRACE_KPROBE_MULTI to attach type names table
BPF_TRACE_KPROBE_MULTI is a new attach type name, add it to bpftool's
table. This fixes a currently failing CI bpftool check.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220318150106.2933343-1-andrii@kernel.org
2022-03-18 17:56:00 +01:00
Paolo Bonzini
714797c98e KVM/arm64 updates for 5.18
- Proper emulation of the OSLock feature of the debug architecture

- Scalability improvements for the MMU lock when dirty logging is on

- New VMID allocator, which will eventually help with SVA in VMs

- Better support for PMUs in heterogeneous systems

- PSCI 1.1 support, enabling support for SYSTEM_RESET2

- Implement CONFIG_DEBUG_LIST at EL2

- Make CONFIG_ARM64_ERRATUM_2077057 default y

- Reduce the overhead of VM exit when no interrupt is pending

- Remove traces of 32bit ARM host support from the documentation

- Updated vgic selftests

- Various cleanups, doc updates and spelling fixes

Merge tag 'kvmarm-5.18' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

2022-03-18 12:43:24 -04:00
Krasnov Arseniy Vladimirovich
e89600ebee af_vsock: SOCK_SEQPACKET broken buffer test
Add a test where the sender sends two messages, each with its own
data pattern. The reader tries to read the first into a broken buffer:
it is three pages in size, but the middle page is unmapped. Then the
reader tries to read the second message into a valid buffer. The test
checks that the uncopied part of the first message was dropped
and thus not copied as part of the second message.

Signed-off-by: Krasnov Arseniy Vladimirovich <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-18 15:13:19 +00:00
Krasnov Arseniy Vladimirovich
efb3719f4a af_vsock: SOCK_SEQPACKET receive timeout test
Test the receive timeout: a connection is established and the
receiver sets a timeout, but the sender does nothing. The receiver's
'read()' call must fail with EAGAIN.

Signed-off-by: Krasnov Arseniy Vladimirovich <AVKrasnov@sberdevices.ru>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-18 15:13:18 +00:00
Jakub Sitnicki
deb5940046 selftests/bpf: Fix test for 4-byte load from dst_port on big-endian
The check for 4-byte load from dst_port offset into bpf_sock is failing on
big-endian architecture - s390. The bpf access converter rewrites the
4-byte load to a 2-byte load from sock_common at skc_dport offset, as shown
below.

  * s390 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 51966 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * s390 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    35: (69) r1 = *(u16 *)(r1 +12)
    36: (bc) w1 = w1
    37: (b4) w0 = 1
    38: (16) if w1 == 0xcafe goto pc+1
    39: (b4) w0 = 0
    40: (95) exit

  * x86_64 / llvm-objdump -S --no-show-raw-insn

  00000000000002a0 <sk_dst_port__load_word>:
        84:       r1 = *(u32 *)(r1 + 48)
        85:       w0 = 1
        86:       if w1 == 65226 goto +1 <LBB5_2>
        87:       w0 = 0
  00000000000002c0 <LBB5_2>:
        88:       exit

  * x86_64 / bpftool prog dump xlated

  _Bool sk_dst_port__load_word(struct bpf_sock * sk):
    33: (69) r1 = *(u16 *)(r1 +12)
    34: (b4) w0 = 1
    35: (16) if w1 == 0xfeca goto pc+1
    36: (b4) w0 = 0
    37: (95) exit

This leads to surprises if we treat the destination register contents as a
32-bit value, ignoring the fact that in reality it contains a 16-bit value.

On little-endian the register contents reflect the bpf_sock struct
definition, where the lower 16-bits contain the port number:

	struct bpf_sock {
		...
		__be16 dst_port;	/* offset 48 */
		__u16 :16;
		...
	};

However, on big-endian the register contents suggest that the field layout
of the bpf_sock struct is as follows:

	struct bpf_sock {
		...
		__u16 :16;		/* offset 48 */
		__be16 dst_port;
		...
	};

Account for this quirky access conversion in the test case exercising the
4-byte load by treating the result as 16-bit wide.
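
The effect can be modelled in a few lines. This is an illustrative sketch, not the selftest itself: `load` is a hypothetical helper that mimics reading the first bytes of the field on a host of the given byte order, and 0xcafe is the port value seen in the dumps above.

```python
import struct

PORT = 0xCAFE  # port value seen in the dumps above

# dst_port is stored in network byte order, followed by 16 bits of padding.
field = struct.pack("!H", PORT) + b"\x00\x00"


def load(buf, size, byteorder):
    """Model a `size`-byte load from the start of buf on the given host."""
    return int.from_bytes(buf[:size], byteorder)


for order in ("little", "big"):
    word = load(field, 4, order)  # what a genuine 4-byte load would read
    half = load(field, 2, order)  # what the rewritten 2-byte load reads
    print(f"{order:>6}: 4-byte=0x{word:08x} 2-byte=0x{half:04x} "
          f"same={word == half}")
```

In this model the two loads agree on the little-endian host and disagree on the big-endian one, which is why comparing only 16 bits of the register is the portable choice.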

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-5-jakub@cloudflare.com
2022-03-18 15:46:59 +01:00
Jakub Sitnicki
e06b5bbcf3 selftests/bpf: Use constants for socket states in sock_fields test
Replace magic numbers in BPF code with constants from bpf.h, so that they
don't require an explanation in the comments.

Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-4-jakub@cloudflare.com
2022-03-18 15:46:59 +01:00
Jakub Sitnicki
2d2202ba85 selftests/bpf: Check dst_port only on the client socket
cgroup_skb/egress programs which sock_fields test installs process packets
flying in both directions, from the client to the server, and in reverse
direction.

Recently added dst_port check relies on the fact that destination
port (remote peer port) of the socket which sends the packet is known ahead
of time. This holds true only for the client socket, which connects to the
known server port.

Filter out any traffic that is not egressing from the client socket in the
BPF program that tests reading the dst_port.

Fixes: 8f50f16ff3 ("selftests/bpf: Extend verifier and bpf_sock tests for dst_port loads")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-3-jakub@cloudflare.com
2022-03-18 15:46:59 +01:00
Jakub Sitnicki
a4c9fe0ed4 selftests/bpf: Fix error reporting from sock_fields programs
The helper macro that records an error in BPF programs that exercise sock
fields access has been inadvertently broken by adaptation work that
happened in commit b18c1f0aa4 ("bpf: selftest: Adapt sock_fields test to
use skel and global variables").

BPF_NOEXIST flag cannot be used to update BPF_MAP_TYPE_ARRAY. The operation
always fails with -EEXIST, which in turn means the error never gets
recorded, and the checks for errors always pass.

Revert the change in update flags.
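
A toy model of the update semantics shows why the error was never recorded. This is a sketch, not kernel code: `ToyArrayMap` is a hypothetical stand-in for BPF_MAP_TYPE_ARRAY, whose slots all exist from creation; the flag values match the uapi bpf.h definitions.

```python
import errno

# Update flags as defined in the uapi bpf.h header.
BPF_ANY, BPF_NOEXIST, BPF_EXIST = 0, 1, 2


class ToyArrayMap:
    """Fixed-size map in which every index exists from the start."""

    def __init__(self, max_entries):
        self.values = [0] * max_entries

    def update(self, index, value, flags):
        if index >= len(self.values):
            return -errno.E2BIG
        if flags == BPF_NOEXIST:
            # Every array element already exists, so "create only" can
            # never succeed.
            return -errno.EEXIST
        self.values[index] = value
        return 0


errors = ToyArrayMap(4)
rc_noexist = errors.update(0, 1, BPF_NOEXIST)  # fails, nothing is stored
recorded_after_noexist = errors.values[0]
rc_any = errors.update(0, 1, BPF_ANY)          # succeeds, error is stored
print(rc_noexist, recorded_after_noexist, rc_any, errors.values[0])
```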

Fixes: b18c1f0aa4 ("bpf: selftest: Adapt sock_fields test to use skel and global variables")
Signed-off-by: Jakub Sitnicki <jakub@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220317113920.1068535-2-jakub@cloudflare.com
2022-03-18 15:46:58 +01:00
Ian Rogers
5edc3c618b perf vendor events intel: Update events for TremontX
Move from v1.17 to v1.19.

The change:

  fc68041040

moved certain "other" type events into the cache, memory and
pipeline topics. Update the perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-8-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:51 -03:00
Ian Rogers
42e80e1ac3 perf vendor events intel: Update events for Tigerlake
The change:

  fc68041040

moved certain "other" type events into the cache and pipeline topics.
Update the perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:51 -03:00
Ian Rogers
299d5dca77 perf vendor events intel: Update events for SkylakeX
The change:

  fc68041040

moved certain "other" type events into the cache topic. Update the
perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:50 -03:00
Ian Rogers
fd14311829 perf vendor events intel: Update events for Skylake
The change:

  fc68041040

moved certain "other" type events into the cache topic. Update the
perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:47 -03:00
Ian Rogers
f25db21bbf perf vendor events intel: Update events for IcelakeX
Move from v1.11 to v1.12.

The change:

  fc68041040

moved certain "other" type events into the cache, memory and
pipeline topics. Update the perf JSON files for this change.

Tested:
```
...
  6: Parse event definition strings                                  : Ok
...
 91: perf all PMU test                                               : Ok
...
```

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:44 -03:00
Ian Rogers
fb76811a8f perf vendor events intel: Update events for Icelake
The change:

  fc68041040

moved certain "other" type events into the cache and pipeline topics.
Update the perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:39 -03:00
Ian Rogers
3e75e95e80 perf vendor events intel: Update events for Elkhartlake
The change:

  fc68041040

moved certain "other" type events into the pipeline topic. Update the
perf JSON files for this change.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220317182858.484474-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:35 -03:00
Ian Rogers
2c4d33b87c perf vendor events intel: Update events for CascadelakeX
The change:

  fc68041040

moved certain "other" type events into the cache topic. Update the
perf JSON files for this change.

Tested:
```
...
  6: Parse event definition strings                                  : Ok
  7: Simple expression parser                                        : Ok
  8: PERF_RECORD_* events & perf_sample fields                       : Ok
  9: Parse perf pmu format                                           : Ok
 10: PMU events                                                      :
 10.1: PMU event table sanity                                        : Ok
 10.2: PMU event map aliases                                         : Ok
 10.3: Parsing of PMU event table metrics                            : Ok
 10.4: Parsing of PMU event table metrics with fake PMUs             : Ok
...
 68: Parse and process metrics                                       : Ok
...
 89: perf all metricgroups test                                      : Ok
 90: perf all metrics test                                           : FAILED!
 91: perf all PMU test                                               : Ok
...
```

Test 90 failed due to MEM_PMM_Read_Latency as the test machine
lacks optane memory, and the divide by 0 causes the metric not to
print - which is intended behavior.

Reviewed-by: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Link: http://lore.kernel.org/lkml/20220317182858.484474-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-18 11:43:21 -03:00
Kuniyuki Iwashima
d9a232d435 af_unix: Support POLLPRI for OOB.
The commit 314001f0bf ("af_unix: Add OOB support") introduced OOB for
AF_UNIX, but it lacks some changes for POLLPRI.  Let's add the missing
piece.

In the selftest, normal datagrams are sent followed by OOB data, so this
commit replaces `POLLIN | POLLPRI` with just `POLLPRI` in the first test
case.

Fixes: 314001f0bf ("af_unix: Add OOB support")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-18 13:30:52 +00:00
Greg Kroah-Hartman
cc6ce5ac2c First set of new device support, fixes, cleanups and features for IIO in 5.18

Merge tag 'iio-for-5.18a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into char-misc-next

Jonathan writes:

First set of new device support, fixes, cleanups and features for IIO in 5.18

This cycle we had quite a few series that applied similar changes
to lots of drivers. To keep this description manageable I have
called those out in their own section rather than per driver.

Particularly pleased to see the long running AFE precision series
going in this cycle.

Series includes some late breaking fixes.

New device support
* adi,ada4250 amplifier
  - New driver and dt bindings for this programmable gain amplifier.
* adi,admv1014 microwave down-converter
  - New driver, dt bindings and some device specific ABI that
    may be generalized as more drivers for devices similar to this
    are added.
* adi,admv4420 K Band down-converter.
  - New driver and dt bindings.
* adi,adxl367 accelerometer driver.
  - New driver, dt-bindings + some new IIO ABI definitions to support
    reference magnitude events where an estimate of the acceleration
    due to gravity has been removed.
  - A few fixes as follow up patches.
* adi,ltc2688 DAC with toggle and dither modes.
  - New driver and bindings. Includes some new driver specific (for now)
    ABI for handling toggle mode and the addition of a dither waveform to
    the DAC output.
* AFE (analog front end) add support for additional types of analog device
  in front of an ADC.
  - RTD temperature sensors with dt bindings.
  - Temperature transducers with dt bindings.
  - Related cleanup and features listed in other sections below.
* maxim,ds3502 potentiometer.
  - Add support to ds1803 driver which required significant rework.
* mediatek,mt2701-auxadc driver
  - Add mediatek,mt8186-auxadc - id table and chip specific info only.
* semtech,sx9324, semtech,sx9360
  - Substantial refactoring of sx9310 to extract core logic for reuse
    into a separate module
  - New driver using this supporting sx9324 proximity sensors.
  - New driver using this supporting sx9360 proximity sensors.
* silan,sc7a20
  - Compatible with the st,lis2dh (or nearly anyway) so add ID and
    chip specific info to enable support. Also silan vendor ID added
    for dt-bindings.

Staging graduation
* adi,ad7280a monitoring ADC for stacked lithium-ion batteries in
  electric cars and similar.
  - Substantial rework of driver required to bring it in line with current
    IIO best practice. An unusual device in IIO so some interesting features
    we may see more of in future.

Multiple driver/core cleanup
- Use sysfs_emit() in simple locations where there is no path to change
  to various core created attributes.
- Trivial white space fixes around inconsistency between space after { and
  before } in id tables.
- Introduce new handling for fractional types to avoid repeated similar
  implementations. Use this in 3 drivers. Note this is also targeted
  at future use in the AFE driver and was motivated by discussions
  around the precision related work on that driver.
- of related header cleanups - drop of*.h and add mod_devicetable.h as
  appropriate.
- Move a number of symbol exports into IIO_* namespaces.  Two categories,
  1) Library used by multiple drivers e.g. st_sensors
  2) Core driver module exporting functions used by bus specific modules.
  A few related cleanups in this set.
- Switch from CONFIG_PM_* guards to new DEFINE_SIMPLE_DEV_PM_OPS() and
  similar to simplify drivers and take advantage of these new macros
  allowing the compiler to do the job of removing unused code without
  the need for __maybe_unused markings. Conversion of other drivers to
  these new macros ongoing.

Features
* adi,adf4350
  - Switch from of specific to generic device properties enabling use with
    other firmware types.
* adi,adxl345
  - Switch from of specific to generic device properties.
  - Add ACPI ID ADS0345
  - Related driver cleanup.
* adi,hmc425a
  - Switch from of specific to generic device properties.
* afe analog rescaler driver
  - Wider range of types supported for scale.
  - Support offset.
  - Kunit tests.
* atlas,ezo-sensor
  - Convert from of to device properties.
* fsl,mma8452
  - Support mount matrix.
* infineon,dps310:
  - Add ACPI ID IFX3100.
* invensense,mpu6050
  - Convert to generic device properties.
* maxim,ds1803
  - Add out_raw_available before supporting more devices.
  - Convert from of specific to device properties.
* samsung,ssp_sensors
  - Convert from of specific to device properties.
* st,stm32-timer trigger
  - Convert from of specific to device properties.
* ti,hdc101x
  - Add ACPI ID TXNW1010.
* ti,tsc2046:
  - Add read_raw support to enable use of iio_hwmon and similar.

Fixes / cleanup.
* mailmap
  - Update for Cai Huoqing
* MAINTAINERS
  - Fix Analog Devices related links.
  - Add entry for ADRF6780
  - Add entry for ADMV1013
  - Add entry for AD7293
  - Add entry for ADMV8818
  - Update files listed for adis-lib
* iio core:
  - Fix wrong comment about current_mode being something a driver should
    ever access.
  - Use struct_size() rather than open coding in industrialio-hw-consumer
* adi,adxl355
  - Use units.h definitions instead of local versions.
* adi,adis-lib
  - Simplify *updated_bits() macro
  - Whitespace cleanup.
* afe - Note many of these fixes only apply to particular configurations
  so the problems have probably not been seen in the wild, but will be
  visible with new usecases enabled this cycle.
  - Fix application of consumer scale for IIO_VAL_INT.
  - Apply a scale of 1 when no scale is provided.
  - Make best effort to establish a valid offset value for fractional
    cases.
  - Use s64 for scale calculations where parameters may be signed.
  - Tidy up include order.
  - Improve accuracy for small fractional scales
  - Reduce risk of integer overflow.
* ams,as3935
  - Use devm_delayed_work_autocancel() to replace open coded equivalent.
* aspeed,adc
  - Fix wrong use of divider flag.
* atmel,sama5d2-adc
  - Relax atmel,trigger-edge-type to optional.
  - Drop Ludovic Desroches from listed maintainers of the dt-binding
    in line with previous MAINTAINERS entry update.
* fsl,mma8452
  - Fix probing when i2c_device_id used.
  - dev_get_drvdata() on the iio_dev->dev, no longer returns iio_dev.
    Use dev_to_iio_dev() instead. Note the original path in here
    worked more by luck than design.
* invensense,mpu6050
  - Drop ACPI_PTR() protection to avoid an unused warning.
  - Use fact ACPI_COMPANION() returns null when ACPI_HANDLE() does to
    simplify handling.
* motorola,cpcap-adc
  - Drop unused assignment.
* qcom,spmi-adc
  - Fix wrong example of 'reg' in binding document.
* renesas,rzg2l-adc
  - Trivial typo fix.
* semtech,sx9360
  - Fix wrong register handling for event generation.
* st_sensors
  - Allow manual disabling of I2C or SPI module if not needed for a particular
    board. Default is still to enable the bus specific module if
    appropriate bus is supported.
* st,lsm6dsx
  - dev_get_drvdata() on the iio_dev->dev, no longer returns iio_dev.
    Use dev_to_iio_dev() instead.
* ti,palmas-gpadc
  - Split the interrupt fields in the dt-binding example
* ti,tsc2046
  - Rework state machine to improve readability after recent debugging of
    an issue fixed elsewhere.
  - Add a sanity check to avoid very large memory allocations if a crazy
    delay is specified.
* ti,twl6030
  - Add error handling if devm_request_threaded_irq() fails.
* xilinx,ams
  - Use devm_delayed_work_autocancel() instead of open coding equivalent.
  - Fix missing required clock entry in dt-binding.
  - Fix miscounting of channels resulting in ps channels not
    being enabled.
  - Fix incorrect values written to sequencer registers.
  - Fix sequence for single channel reading.
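
The afe fixes above mention using s64 for scale calculations and reducing
the risk of integer overflow. A minimal userspace sketch of that idea
follows. The helper name rescale_s64() and its parameters are hypothetical
and are not taken from the iio-rescale driver; the point is only that the
multiplication is done in 64 bits before dividing.

```c
#include <stdint.h>

/* Hypothetical helper, not the driver's code: scale a raw 32-bit reading
 * by numerator/denominator using a 64-bit intermediate, so that
 * raw * numerator cannot overflow. The caller must pass a non-zero
 * denominator and a result that fits in 32 bits. */
static int32_t rescale_s64(int32_t raw, int32_t numerator, int32_t denominator)
{
	int64_t tmp = (int64_t)raw * numerator;

	return (int32_t)(tmp / denominator);
}
```

With raw = 100000 and numerator = 100000 the product is 10^10, which does
not fit in 32 bits, so a plain int multiplication would overflow before
the division.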

* tag 'iio-for-5.18a' of https://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio: (245 commits)
  iio: adc: xilinx-ams: Fix single channel switching sequence
  iio: adc: xilinx-ams: Fixed wrong sequencer register settings
  iio: adc: xilinx-ams: Fixed missing PS channels
  dt-bindings: iio: adc: zynqmp_ams: Add clock entry
  iio: accel: mma8452: use the correct logic to get mma8452_data
  iio: adc: aspeed: Add divider flag to fix incorrect voltage reading.
  iio: imu: st_lsm6dsx: use dev_to_iio_dev() to get iio_dev struct
  dt-bindings: iio: Add ltc2688 documentation
  iio: ABI: add ABI file for the LTC2688 DAC
  iio: dac: add support for ltc2688
  dt-bindings: iio: afe: add bindings for temperature transducers
  dt-bindings: iio: afe: add bindings for temperature-sense-rtd
  iio: afe: rescale: add temperature transducers
  iio: afe: rescale: add RTD temperature sensor support
  iio: test: add basic tests for the iio-rescale driver
  iio: afe: rescale: reduce risk of integer overflow
  iio: afe: rescale: fix accuracy for small fractional scales
  iio: afe: rescale: add offset support
  iio: afe: rescale: add INT_PLUS_{MICRO,NANO} support
  iio: afe: rescale: expose scale processing function
  ...
2022-03-18 12:41:32 +01:00
Delyan Kratunov
3cccbaa033 selftests/bpf: Test subskeleton functionality
This patch changes the selftests/bpf Makefile to also generate
a subskel.h for every skel.h it would have normally generated.

Separately, it also introduces a new subskeleton test which tests
library objects, externs, weak symbols, kconfigs, and user maps.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/1bd24956940bbbfe169bb34f7f87b11df52ef011.1647473511.git.delyank@fb.com
2022-03-17 23:12:48 -07:00
Delyan Kratunov
00389c58ff bpftool: Add support for subskeletons
Subskeletons are headers which require an already loaded program to
operate.

For example, when a BPF library is linked into a larger BPF object file,
the library userspace needs a way to access its own global variables
without requiring knowledge about the larger program at build time.

As a result, subskeletons require a loaded bpf_object to open().
Further, they find their own symbols in the larger program by
walking BTF type data at run time.

At this time, programs, maps, and globals are supported through
non-owning pointers.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/ca8a48b4841c72d285ecce82371bef4a899756cb.1647473511.git.delyank@fb.com
2022-03-17 23:12:39 -07:00
Delyan Kratunov
430025e5dc libbpf: Add subskeleton scaffolding
In symmetry with bpf_object__open_skeleton(),
bpf_object__open_subskeleton() performs the actual walking and linking
of maps, progs, and globals described by bpf_*_skeleton objects.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/6942a46fbe20e7ebf970affcca307ba616985b15.1647473511.git.delyank@fb.com
2022-03-17 23:11:16 -07:00
Delyan Kratunov
262cfb74ff libbpf: Init btf_{key,value}_type_id on internal map open
For internal and user maps, look up the key and value btf
types on open() and not load(), so that `bpf_map_btf_value_type_id`
is usable in `bpftool gen`.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/78dbe4e457b4a05e098fc6c8f50014b680c86e4e.1647473511.git.delyank@fb.com
2022-03-17 23:11:15 -07:00
Delyan Kratunov
bc380eb9d0 libbpf: .text routines are subprograms in strict mode
Currently, libbpf considers a single routine in .text to be a program. This
is particularly confusing when it comes to library objects - a single routine
meant to be used as an extern will instead be considered a bpf_program.

This patch hides this compatibility behavior behind the pre-existing
SEC_NAME strict mode flag.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/018de8d0d67c04bf436055270d35d394ba393505.1647473511.git.delyank@fb.com
2022-03-17 23:11:15 -07:00
Jiri Olsa
318c812ceb selftests/bpf: Add cookie test for bpf_program__attach_kprobe_multi_opts
Adding bpf_cookie test for programs attached by
bpf_program__attach_kprobe_multi_opts API.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-14-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
9271a0c7ae selftests/bpf: Add attach test for bpf_program__attach_kprobe_multi_opts
Adding tests for bpf_program__attach_kprobe_multi_opts function,
that test attach with pattern, symbols and addrs.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-13-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
2c6401c966 selftests/bpf: Add kprobe_multi bpf_cookie test
Adding bpf_cookie test for programs attached by kprobe_multi links.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-12-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
f7a11eeccb selftests/bpf: Add kprobe_multi attach test
Adding kprobe_multi attach test that uses new fprobe interface to
attach kprobe program to multiple functions.

The test is attaching programs to bpf_fentry_test* functions and
uses single trampoline program bpf_prog_test_run to trigger
bpf_fentry_test* functions.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-11-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
ddc6b04989 libbpf: Add bpf_program__attach_kprobe_multi_opts function
Adding bpf_program__attach_kprobe_multi_opts function for attaching
kprobe program to multiple functions.

  struct bpf_link *
  bpf_program__attach_kprobe_multi_opts(const struct bpf_program *prog,
                                        const char *pattern,
                                        const struct bpf_kprobe_multi_opts *opts);

The user can specify the functions to attach to with the 'pattern'
argument, which allows wildcards ('*' and '?' are supported), or provide
symbols or addresses directly through the opts argument. These 3 options
are mutually exclusive.

When using symbols or addresses, user can also provide cookie value
for each symbol/address that can be retrieved later in bpf program
with bpf_get_attach_cookie helper.

  struct bpf_kprobe_multi_opts {
          size_t sz;
          const char **syms;
          const unsigned long *addrs;
          const __u64 *cookies;
          size_t cnt;
          bool retprobe;
          size_t :0;
  };

Symbols, addresses and cookies are provided through opts object
(syms/addrs/cookies) as array pointers with specified count (cnt).

Each cookie value is paired with provided function address or symbol
with the same array index.

The program can also be attached as a return probe if 'retprobe' is set.

For quick usage with NULL opts argument, like:

  bpf_program__attach_kprobe_multi_opts(prog, "ksys_*", NULL)

the 'prog' will be attached as kprobe to 'ksys_*' functions.

Also adding new program sections for automatic attachment:

  kprobe.multi/<symbol_pattern>
  kretprobe.multi/<symbol_pattern>

The symbol_pattern is used as 'pattern' argument in
bpf_program__attach_kprobe_multi_opts function.
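
As a rough illustration of how a pattern with wildcards selects symbols,
here is a minimal matcher sketch. The name glob_match_sketch() is
hypothetical and this is not libbpf's implementation; it only assumes the
usual semantics, where '*' matches any run of characters (including none)
and '?' matches exactly one character.

```c
#include <stdbool.h>

/* Hypothetical helper, not part of libbpf: returns true when 'str'
 * matches 'pat', where '*' matches any (possibly empty) run of
 * characters and '?' matches exactly one character. */
static bool glob_match_sketch(const char *str, const char *pat)
{
	/* Walk the literal and '?' parts until the first '*'. */
	while (*str && *pat && *pat != '*') {
		if (*pat != '?' && *str != *pat)
			return false;
		str++;
		pat++;
	}
	if (*pat == '*') {
		/* Collapse consecutive '*' into one. */
		while (*pat == '*')
			pat++;
		if (!*pat)
			return true;	/* trailing '*' matches the rest */
		/* Try every suffix of 'str' against the remaining pattern. */
		while (*str) {
			if (glob_match_sketch(str, pat))
				return true;
			str++;
		}
	}
	/* Match only if both string and pattern are fully consumed. */
	return !*str && !*pat;
}
```

Under these assumptions, "ksys_*" would select ksys_read but not vfs_read.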

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-10-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
5117c26e87 libbpf: Add bpf_link_create support for multi kprobes
Adding new kprobe_multi struct to bpf_link_create_opts object
to pass multiple kprobe data to link_create attr uapi.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-9-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
85153ac062 libbpf: Add libbpf_kallsyms_parse function
Move the kallsyms parsing into the internal libbpf_kallsyms_parse
function, so it can be used from other places.

It will be used in following changes.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-8-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
ca74823c6e bpf: Add cookie support to programs attached with kprobe multi link
Adding support to call bpf_get_attach_cookie helper from
kprobe programs attached with kprobe multi link.

The cookie is provided by array of u64 values, where each
value is paired with provided function address or symbol
with the same array index.

When cookie array is provided it's sorted together with
addresses (check bpf_kprobe_multi_cookie_swap). This way
we can find cookie based on the address in
bpf_get_attach_cookie helper.
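
The pairing described above can be sketched in plain C. This is an
illustration only: the function names are hypothetical, and an insertion
sort stands in for whatever sort and swap callbacks the kernel uses. What
it shows is that moving each cookie together with its address keeps
cookies[i] paired with addrs[i], so a later lookup by address can use a
binary search.

```c
#include <stddef.h>
#include <stdint.h>

/* Sort 'addrs' ascending and move each cookie along with its address,
 * so cookies[i] still belongs to addrs[i] afterwards. Insertion sort
 * keeps the sketch short; it is not what the kernel uses. */
static void sort_addrs_with_cookies(unsigned long *addrs, uint64_t *cookies,
				    size_t cnt)
{
	for (size_t i = 1; i < cnt; i++) {
		unsigned long a = addrs[i];
		uint64_t c = cookies[i];
		size_t j = i;

		while (j > 0 && addrs[j - 1] > a) {
			addrs[j] = addrs[j - 1];
			cookies[j] = cookies[j - 1];
			j--;
		}
		addrs[j] = a;
		cookies[j] = c;
	}
}

/* Binary-search the sorted 'addrs' for 'ip' and return the paired
 * cookie, or 0 when the address is not present. */
static uint64_t lookup_cookie(const unsigned long *addrs,
			      const uint64_t *cookies, size_t cnt,
			      unsigned long ip)
{
	size_t lo = 0, hi = cnt;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (addrs[mid] == ip)
			return cookies[mid];
		if (addrs[mid] < ip)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}
```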

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-7-jolsa@kernel.org
2022-03-17 20:17:19 -07:00
Jiri Olsa
0dcac27254 bpf: Add multi kprobe link
Adding new link type BPF_LINK_TYPE_KPROBE_MULTI that attaches kprobe
program through fprobe API.

The fprobe API allows attaching a probe to multiple functions at once
very quickly, because it works on top of ftrace. On the other hand, this
limits the probe point to the function entry or return.

The kprobe program gets the same pt_regs input ctx as when it's attached
through the perf API.

Adding new attach type BPF_TRACE_KPROBE_MULTI that allows attaching a
kprobe to multiple functions with the new link.

User provides array of addresses or symbols with count to attach the
kprobe program to. The new link_create uapi interface looks like:

  struct {
          __u32           flags;
          __u32           cnt;
          __aligned_u64   syms;
          __aligned_u64   addrs;
  } kprobe_multi;

The flags field accepts a single BPF_F_KPROBE_MULTI_RETURN bit to create
a return multi kprobe.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220316122419.933957-4-jolsa@kernel.org
2022-03-17 20:17:18 -07:00
Kaixi Fan
e0999c8e59 selftests/bpf: Fix tunnel remote IP comments
In namespace at_ns0, the IP address of tnl dev is 10.1.1.100 which is the
overlay IP, and the IP address of veth0 is 172.16.1.100 which is the vtep
IP. When doing 'ping 10.1.1.100' from root namespace, the remote_ip should
be 172.16.1.100.

Fixes: 933a741e3b ("selftests/bpf: bpf tunnel test.")
Signed-off-by: Kaixi Fan <fankaixi.li@bytedance.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220313164116.5889-1-fankaixi.li@bytedance.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2022-03-17 16:08:02 -07:00
Jakub Kicinski
e243f39685 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
No conflicts.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-17 13:56:58 -07:00
Linus Torvalds
551acdc3c3 Networking fixes for 5.17-final, including fixes from netfilter, ipsec,
and wireless.
 
 Current release - regressions:
 
  - Revert "netfilter: nat: force port remap to prevent shadowing
    well-known ports", restore working conntrack on asymmetric paths
 
  - Revert "ath10k: drop beacon and probe response which leak from
    other channel", restore working AP and mesh mode on QCA9984
 
  - eth: intel: fix hang during reboot/shutdown
 
 Current release - new code bugs:
 
  - netfilter: nf_tables: disable register tracking, it needs more
    work to cover all corner cases
 
 Previous releases - regressions:
 
  - ipv6: fix skb_over_panic in __ip6_append_data when (admin-only)
    extension headers get specified
 
  - esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return
    value more selectively
 
  - bnx2x: fix driver load failure when FW not present in initrd
 
 Previous releases - always broken:
 
  - vsock: stop destroying unrelated sockets in nested virtualization
 
  - packet: fix slab-out-of-bounds access in packet_recvmsg()
 
 Misc:
 
  - add Paolo Abeni to networking maintainers!
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from netfilter, ipsec, and wireless.

  A few last minute revert / disable and fix patches came down from our
  sub-trees. We're not waiting for any fixes at this point.

  Current release - regressions:

   - Revert "netfilter: nat: force port remap to prevent shadowing
     well-known ports", restore working conntrack on asymmetric paths

   - Revert "ath10k: drop beacon and probe response which leak from
     other channel", restore working AP and mesh mode on QCA9984

   - eth: intel: fix hang during reboot/shutdown

  Current release - new code bugs:

   - netfilter: nf_tables: disable register tracking, it needs more work
     to cover all corner cases

  Previous releases - regressions:

   - ipv6: fix skb_over_panic in __ip6_append_data when (admin-only)
     extension headers get specified

   - esp6: fix ESP over TCP/UDP, interpret ipv6_skip_exthdr's return
     value more selectively

   - bnx2x: fix driver load failure when FW not present in initrd

  Previous releases - always broken:

   - vsock: stop destroying unrelated sockets in nested virtualization

   - packet: fix slab-out-of-bounds access in packet_recvmsg()

  Misc:

   - add Paolo Abeni to networking maintainers!"

* tag 'net-5.17-final' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (26 commits)
  iavf: Fix hang during reboot/shutdown
  net: mscc: ocelot: fix backwards compatibility with single-chain tc-flower offload
  net: bcmgenet: skip invalid partial checksums
  bnx2x: fix built-in kernel driver load failure
  net: phy: mscc: Add MODULE_FIRMWARE macros
  net: dsa: Add missing of_node_put() in dsa_port_parse_of
  net: handle ARPHRD_PIMREG in dev_is_mac_header_xmit()
  Revert "ath10k: drop beacon and probe response which leak from other channel"
  hv_netvsc: Add check for kvmalloc_array
  iavf: Fix double free in iavf_reset_task
  ice: destroy flow director filter mutex after releasing VSIs
  ice: fix NULL pointer dereference in ice_update_vsi_tx_ring_stats()
  Add Paolo Abeni to networking maintainers
  atm: eni: Add check for dma_map_single
  net/packet: fix slab-out-of-bounds access in packet_recvmsg()
  net: mdio: mscc-miim: fix duplicate debugfs entry
  net: phy: marvell: Fix invalid comparison in the resume and suspend functions
  esp6: fix check on ipv6_skip_exthdr's return value
  net: dsa: microchip: add spi_device_id tables
  netfilter: nf_tables: disable register tracking
  ...
2022-03-17 12:55:26 -07:00
Yosry Ahmed
1c4debc443 selftests: vm: fix clang build error multiple output files
When building the vm selftests using clang, some errors are seen due to
having headers in the compilation command:

  clang -Wall -I ../../../../usr/include  -no-pie    gup_test.c ../../../../mm/gup_test.h -lrt -lpthread -o .../tools/testing/selftests/vm/gup_test
  clang: error: cannot specify -o when generating multiple output files
  make[1]: *** [../lib.mk:146: .../tools/testing/selftests/vm/gup_test] Error 1

Rework to add the header files to LOCAL_HDRS before including ../lib.mk,
since the dependency is evaluated in '$(OUTPUT)/%:%.c $(LOCAL_HDRS)' in
file lib.mk.

Link: https://lkml.kernel.org/r/20220304000645.1888133-1-yosryahmed@google.com
Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2022-03-17 11:02:13 -07:00
Guo Zhengkui
1abea24af4 selftests: net: fix array_size.cocci warning
Fix array_size.cocci warning in tools/testing/selftests/net.

Use `ARRAY_SIZE(arr)` instead of forms like `sizeof(arr)/sizeof(arr[0])`.

It has been tested with gcc (Debian 8.3.0-6) 8.3.0.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Link: https://lore.kernel.org/r/20220316092858.9398-1-guozhengkui@vivo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-17 15:21:16 +01:00
Hou Tao
ad13baf456 selftests/bpf: Test subprog jit when toggle bpf_jit_harden repeatedly
When bpf_jit_harden is toggled between 0 and 2, subprog jit may fail
because bpf_jit_harden is read twice during jit and the two reads can
return inconsistent values. So add a test to ensure the problem is fixed.

Signed-off-by: Hou Tao <houtao1@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220309123321.2400262-5-houtao1@huawei.com
2022-03-16 15:13:36 -07:00
Martin KaFai Lau
82cb2b3077 bpf: selftests: Remove libcap usage from test_progs
This patch removes the libcap usage from test_progs.
bind_perm.c is the only user.  cap_*_effective() helpers added in the
earlier patch are directly used instead.

No other selftest binary is using libcap, so '-lcap' is also removed
from the Makefile.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Reviewed-by: Stanislav Fomichev <sdf@google.com>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220316173835.2039334-1-kafai@fb.com
2022-03-16 15:07:49 -07:00
Martin KaFai Lau
b1c2768a82 bpf: selftests: Remove libcap usage from test_verifier
This patch removes the libcap usage from test_verifier.
The cap_*_effective() helpers added in the earlier patch are
used instead.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220316173829.2038682-1-kafai@fb.com
2022-03-16 15:07:49 -07:00
Martin KaFai Lau
663af70aab bpf: selftests: Add helpers to directly use the capget and capset syscall
In newer libcap (>= 2.60), the libcap commit aca076443591
("Make cap_t operations thread safe.") added a "__u8 mutex;" to the
"struct _cap_struct".  This shifts the layout by a few bytes, which
breaks the assumption made in the "struct libcap" definition
in test_verifier.c.

The bpf selftests only need to enable and disable the effective
caps of the running task.  It is easier to call the capget and
capset syscalls directly instead.  This also removes the libcap
library dependency.

cap_helpers.{c,h} are added.  One __u64 is used for all CAP_*
bits instead of two __u32.
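
As a rough illustration of the "one __u64 instead of two __u32" point:
the capget/capset ABI carries each capability set as two 32-bit words,
with caps 0-31 in the first word. The Python sketch below only models
that packing. The function names are hypothetical and are not the
helpers from cap_helpers.c.

```python
# Illustrative model only: join/split the two 32-bit capability words
# used by the capget/capset ABI into a single 64-bit mask.
U32_MASK = 0xFFFFFFFF

CAP_NET_ADMIN = 12  # value from <linux/capability.h>
CAP_BPF = 39        # value from <linux/capability.h>


def cap_bit(cap):
    """Return the mask bit for one capability number."""
    return 1 << cap


def join_caps(low, high):
    """Combine the low (caps 0-31) and high (caps 32-63) words."""
    return ((high & U32_MASK) << 32) | (low & U32_MASK)


def split_caps(caps):
    """Split a 64-bit mask back into (low, high) 32-bit words."""
    return caps & U32_MASK, (caps >> 32) & U32_MASK
```

With a single 64-bit mask, enabling or disabling a set of caps is a
plain OR / AND-NOT, and the split into two words only happens at the
syscall boundary.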

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: John Fastabend <john.fastabend@gmail.com>
Link: https://lore.kernel.org/bpf/20220316173823.2036955-1-kafai@fb.com
2022-03-16 15:07:49 -07:00
David Ahern
40867d74c3 net: Add l3mdev index to flow struct and avoid oif reset for port devices
The fundamental premise of VRF and l3mdev core code is binding a socket
to a device (l3mdev or netdev with an L3 domain) to indicate L3 scope.
Legacy code resets flowi_oif to the l3mdev losing any original port
device binding. Ben (among others) has demonstrated use cases where the
original port device binding is important and needs to be retained.
This patch handles that by adding a new entry to the common flow struct
that can indicate the l3mdev index for later rule and table matching
avoiding the need to reset flowi_oif.

In addition to allowing more use cases that require port device binds,
this patch brings a few datapath simplifications:

1. l3mdev_fib_rule_match is only called when walking fib rules and
   always after l3mdev_update_flow. That allows an optimization to bail
   early for non-VRF type use cases when flowi_l3mdev is not set. Also,
   only that index needs to be checked for the FIB table id.

2. l3mdev_update_flow can be called with flowi_oif set to a l3mdev
   (e.g., VRF) device. By resetting flowi_oif only for this case the
   FLOWI_FLAG_SKIP_NH_OIF flag is no longer needed and can be removed,
   removing several checks in the datapath. The flowi_iif path can be
   simplified to only be called if it is not loopback (loopback can
   not be assigned to an L3 domain) and the l3mdev index is not already
   set.

3. Avoid another device lookup in the output path when the fib lookup
   returns a reject failure.

Note: 2 functional tests for local traffic with reject fib rules are
updated to reflect the new direct failure at FIB lookup time for ping
rather than the failure on packet path. The current code fails like this:

    HINT: Fails since address on vrf device is out of device scope
    COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
    ping: Warning: source address might be selected on device other than: eth1
    PING 172.16.3.1 (172.16.3.1) from 172.16.3.1 eth1: 56(84) bytes of data.

    --- 172.16.3.1 ping statistics ---
    1 packets transmitted, 0 received, 100% packet loss, time 0ms

where the test now directly fails:

    HINT: Fails since address on vrf device is out of device scope
    COMMAND: ip netns exec ns-A ping -c1 -w1 -I eth1 172.16.3.1
    ping: connect: No route to host

Signed-off-by: David Ahern <dsahern@kernel.org>
Tested-by: Ben Greear <greearb@candelatech.com>
Link: https://lore.kernel.org/r/20220314204551.16369-1-dsahern@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-15 20:20:02 -07:00
Daniel Xu
6585abea98 bpftool: man: Add missing top level docs
The top-level (bpftool.8) man page was missing docs for a few
subcommands and their respective sub-sub-commands.

This commit brings the top level man page up to date. Note that I've
kept the ordering of the subcommands the same as in `bpftool help`.

Signed-off-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/3049ef5dc509c0d1832f0a8b2dba2ccaad0af688.1647213551.git.dxu@dxuuu.xyz
2022-03-15 15:51:41 -07:00
Dmitrii Dolgov
cbdaf71f7e bpftool: Add bpf_cookie to link output
Commit 82e6b1eee6 ("bpf: Allow to specify user-provided bpf_cookie for
BPF perf links") introduced the concept of user specified bpf_cookie,
which could be accessed by BPF programs using bpf_get_attach_cookie().
For troubleshooting purposes it is convenient to expose bpf_cookie via
bpftool as well, so there is no need to meddle with the target BPF
program itself.

Implemented using the pid iterator BPF program to actually fetch
bpf_cookies, which allows constraining code changes only to bpftool.

$ bpftool link
1: type 7  prog 5
        bpf_cookie 123
        pids bootstrap(81)

Signed-off-by: Dmitrii Dolgov <9erthalion6@gmail.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220309163112.24141-1-9erthalion6@gmail.com
2022-03-15 15:07:27 -07:00
Paolo Bonzini
3b53f5535d KVM: s390: Fix, test and feature for 5.18 part 2
- memop selftest
 - fix SCK locking
 - adapter interruptions virtualization for secure guests

Merge tag 'kvm-s390-next-5.18-2' of https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

2022-03-15 17:19:02 -04:00
Daniel Bristot de Oliveira
75016ca3ac rtla: Tools main loop cleanup
I probably started using "do {} while();", but changed all but osnoise_top
to "while(){};" leaving the ; behind.

Cleanup the main loop code, making all tools use "while() {}"

Changcheng Deng reported this problem, which coccicheck flags as follows:
./tools/tracing/rtla/src/timerlat_hist.c: 800: 2-3: Unneeded semicolon
./tools/tracing/rtla/src/osnoise_hist.c:  776: 2-3: Unneeded semicolon
./tools/tracing/rtla/src/timerlat_top.c:  596: 2-3: Unneeded semicolon

Link: https://lkml.kernel.org/r/3c1642110aa87c396f5da4a037dabc72dbb9c601.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reported-by: Changcheng Deng <deng.changcheng@zte.com.cn>
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
7d0dc9576d rtla/timerlat: Add --dma-latency option
Add the --dma-latency option to set /dev/cpu_dma_latency to the
specified value. This aims to avoid exit-from-idle-state latencies
that could influence the analysis.
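
For readers unfamiliar with /dev/cpu_dma_latency: the PM QoS interface
takes a 32-bit value written to the device node, and the request lasts
only while the file stays open. A minimal Python sketch of that
behaviour follows. The function name is hypothetical and this is not
the rtla implementation. Writing to the real device needs sufficient
privileges.

```python
import struct


def request_cpu_dma_latency(latency_us, path="/dev/cpu_dma_latency"):
    """Write latency_us as a native signed 32-bit value and return the file.

    The caller must keep the returned file object open for as long as the
    latency request should stay in effect; closing it drops the request.
    """
    f = open(path, "wb", buffering=0)
    try:
        f.write(struct.pack("i", latency_us))
    except Exception:
        f.close()
        raise
    return f
```

A caller would hold the returned object for the duration of the
measurement and close it afterwards.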

Link: https://lkml.kernel.org/r/72ddb0d913459f13217086dadafad88a7c46dd28.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
7d38c35167 rtla/osnoise: Fix osnoise hist stop tracing message
rtla osnoise hist is printing the following message when hitting stop
tracing:

  printf("rtla timelat hit stop tracing\n");

which is obviously wrong.

s/timerlat/osnoise/ fixing the printf.

Link: https://lkml.kernel.org/r/2b8f090556fe37b81d183b74ce271421f131c77b.1646247211.git.bristot@kernel.org

Fixes: 829a6c0b56 ("rtla/osnoise: Add the hist mode")
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
28d2160cb1 rtla: Check for trace off also in the trace instance
With the addition of --trigger option, it is also possible to stop
the trace from the -t tracing instance using the traceoff trigger.

Make the rtla tools also check whether tracing was stopped in the trace
instance and, if so, stop the execution of the tool.

Link: https://lkml.kernel.org/r/59fc7c6f23dddd5c8b7ef1782cf3da51ea2ce0f5.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
761916fd02 rtla/trace: Save event histogram output to a file
The hist: trigger generates a histogram in the file sys/event/hist.
If the hist: trigger is used, automatically save the histogram output of
the event sys:event in the sys_event_hist.txt file.

Link: https://lkml.kernel.org/r/b5c906af31d4e022ffe87fb0848fac5c089087c8.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
44f3a37d1d rtla: Add --filter support
Add --filter option. This option enables trace event filtering for the
previous -e sys:event argument.

This option is available for all current tools.

Link: https://lkml.kernel.org/r/509d70b6348d3e5bcbf1f07ab725ce08d063149a.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
5487b6ce26 rtla/trace: Add trace event filter helpers
Add a set of helper functions to allow rtla tools to filter events
in the trace instance.

Link: https://lkml.kernel.org/r/12623b1684684549d53b90f4bf66fae44584fd14.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:50 -04:00
Daniel Bristot de Oliveira
1a75489365 rtla: Add --trigger support
Add --trigger option. This option attaches a trace event trigger to the
previous -e sys:event argument, allowing some advanced tracing options.

For instance, in a system with CPUs 2:23 isolated, it is possible to get
a stack trace of thread wakeup targeting those CPUs while running
osnoise with the following command line:

 # osnoise top -c 2-23 -a 50 -e sched:sched_wakeup --trigger="stacktrace if target_cpu >= 2"

This option is available for all current tools.

Link: https://lkml.kernel.org/r/07d2983d5f71261d4da89dbaf02efcad100ab8ee.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
336c92b26c rtla/trace: Add trace event trigger helpers
Add a set of helper functions to allow rtla tools to trigger event
actions in the trace instance.

Link: https://lkml.kernel.org/r/e0d31abe879a78a5600b64f904d0dfa8f76e4fbb.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
51d64c3a18 rtla: Add -e/--event support
Add -e/--event option. This option enables an event in the trace (-t)
session. The argument can be a specific event, e.g., -e sched:sched_switch,
or all events of a system group, e.g., -e sched. Multiple -e are allowed.
It is only active when -t or -a are set.

This option is available for all current tools.
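
The two accepted argument forms can be sketched as follows. This is a
Python model of the "system:event" versus bare "system" split described
above. The function name is hypothetical and the error handling is an
assumption, not the rtla parser.

```python
def parse_event_spec(spec):
    """Split an -e argument into (system, event).

    "sched:sched_switch" -> ("sched", "sched_switch")
    "sched"              -> ("sched", None), meaning all events of the system
    """
    system, sep, event = spec.partition(":")
    if not system or (sep and not event):
        raise ValueError(f"invalid event spec: {spec!r}")
    return system, (event if sep else None)
```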

Link: https://lkml.kernel.org/r/6a3b753be9b1e811953995f7f21a86918ad13390.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
b5aa0be25c rtla/trace: Add trace events helpers
Add a set of helper functions to allow the rtla tools to enable
additional tracepoints in the trace instance.

Link: https://lkml.kernel.org/r/932398b36c1bbaa22c7810d7a40ca9b8c5595b94.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
173a3b0148 rtla/timerlat: Add the automatic trace option
Add the -a/--auto <arg in us> option. This option sets some commonly
used options while debugging the system. It aims to help users produce
reports in the field, reducing the number of arguments passed to the
tool in the first approach to a problem.

It is equivalent to setting osnoise/stop_tracing_total_us and print_stack
with the argument, and saving the trace to timerlat_trace.txt file if the
trace is stopped automatically.

Link: https://lkml.kernel.org/r/92438f7ef132c731f538cebdf77850300afe04a5.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
2b622edd5e rtla/osnoise: Add the automatic trace option
Add the -a/--auto <arg in us> option. This option sets some commonly
used options while debugging the system. It aims to help users produce
reports in the field, reducing the number of arguments passed to the
tool in the first approach to a problem.

It is equivalent to setting osnoise/stop_tracing_us with the argument,
setting tracing_thresh to 1 us, and saving the trace to osnoise_trace.txt
file if the trace is stopped automatically.

Link: https://lkml.kernel.org/r/ef04c961b227eb93a83cd0b54bfca45e1a381b77.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
d635316ae9 rtla/osnoise: Add an option to set the threshold
Add the -T/--threshold option to the osnoise top and hist commands to
set the minimum delta that is considered noise. Also update the man
pages.

Link: https://lkml.kernel.org/r/031861200ffdb24a1df4aa72c458706889a20d5d.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:49 -04:00
Daniel Bristot de Oliveira
61c57d578b rtla/osnoise: Add support to adjust the tracing_thresh
osnoise uses the tracing_thresh parameter to define the minimum delta
between two reads of the time for it to be considered noise.

Add support to get and set the tracing_thresh from osnoise tools.
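
The role of the threshold can be modelled in a few lines of Python: two
consecutive time reads that are further apart than the threshold count
as noise. This is a simplified sketch, not the osnoise tracer. The
function name is hypothetical, and whether the comparison is inclusive
is an assumption.

```python
def find_noise(timestamps_us, thresh_us):
    """Return (start, delta) pairs for gaps between consecutive time reads.

    A gap is reported when the delta reaches thresh_us (the inclusive
    comparison is an assumption of this sketch).
    """
    noise = []
    for prev, cur in zip(timestamps_us, timestamps_us[1:]):
        delta = cur - prev
        if delta >= thresh_us:
            noise.append((prev, delta))
    return noise
```

Raising the threshold hides short interference, while lowering it makes
the tool report smaller gaps.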

Link: https://lkml.kernel.org/r/715ad2a53fd40e41bab8c3f1214c1a94e12fb595.1646247211.git.bristot@kernel.org

Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Clark Williams <williams@redhat.com>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Signed-off-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2022-03-15 14:36:48 -04:00
Guo Zhengkui
f98d6dd1e7 selftests/bpf: Clean up array_size.cocci warnings
Clean up the array_size.cocci warnings under tools/testing/selftests/bpf/:

Use `ARRAY_SIZE(arr)` instead of forms like `sizeof(arr)/sizeof(arr[0])`.

tools/testing/selftests/bpf/test_cgroup_storage.c uses ARRAY_SIZE() defined
in tools/include/linux/kernel.h (sys/sysinfo.h -> linux/kernel.h), while
others use ARRAY_SIZE() in bpf_util.h.

Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220315130143.2403-1-guozhengkui@vivo.com
2022-03-15 17:03:10 +01:00
Petr Machata
ed2ae69c40 selftests: mlxsw: hw_stats_l3: Add a new test
Add a test that verifies that UAPI notifications are emitted, as mlxsw
installs and deinstalls HW counters for the L3 offload xstats.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-15 14:00:51 +01:00
Petr Machata
9b18942e99 selftests: netdevsim: hw_stats_l3: Add a new test
Add a test that verifies basic UAPI contracts, netdevsim operation,
rollbacks after partial enablement in core, and UAPI notifications.

Signed-off-by: Petr Machata <petrm@nvidia.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2022-03-15 14:00:51 +01:00
Peter Zijlstra
89bc853eae objtool: Find unused ENDBR instructions
Find all ENDBR instructions which are never referenced and stick them
in a section such that the kernel can poison them, sealing the
functions from ever being an indirect call target.

This removes about 1-in-4 ENDBR instructions.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.763643193@infradead.org
2022-03-15 10:32:47 +01:00
Peter Zijlstra
08f87a93c8 objtool: Validate IBT assumptions
Intel IBT requires that every indirect JMP/CALL target an ENDBR
instruction; failing this, a #CP happens and we die. Similarly, all
exception entries should be ENDBR.

Find all code relocations and ensure they're either an ENDBR
instruction or ANNOTATE_NOENDBR. For the exceptions look for
UNWIND_HINT_IRET_REGS at sym+0 not being ENDBR.
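
The relocation rule can be sketched as a set check. The Python below is
a toy model over plain addresses. The function name and the inputs are
hypothetical, and the real objtool works on ELF relocations and decoded
instructions rather than address lists.

```python
def check_code_relocs(reloc_targets, endbr_addrs, noendbr_addrs):
    """Return relocation targets that are neither ENDBR nor ANNOTATE_NOENDBR.

    Every returned address is a code reference that would deserve a warning
    under the rule described above.
    """
    endbr = set(endbr_addrs)
    noendbr = set(noendbr_addrs)
    return [t for t in reloc_targets if t not in endbr and t not in noendbr]
```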

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.705110141@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
7d209d13e7 objtool: Add IBT/ENDBR decoding
Intel IBT requires the target of any indirect CALL or JMP instruction
to be the ENDBR instruction; optionally it allows those two
instructions to have a NOTRACK prefix in order to avoid this
requirement.

The kernel will not enable the use of NOTRACK, as such any occurrence
of it in compiler generated code should be flagged.

Teach objtool to decode ENDBR instructions and WARN about NOTRACK
prefixes.
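
For reference, the encodings involved are ENDBR64 (F3 0F 1E FA),
ENDBR32 (F3 0F 1E FB) and the NOTRACK prefix (3E) in front of a near
indirect CALL (FF /2) or JMP (FF /4). The Python sketch below
recognises only these simple byte patterns. The function names are
hypothetical and it does not handle other prefixes. objtool itself
relies on the kernel's x86 instruction decoder.

```python
ENDBR64 = bytes([0xF3, 0x0F, 0x1E, 0xFA])
ENDBR32 = bytes([0xF3, 0x0F, 0x1E, 0xFB])
NOTRACK_PREFIX = 0x3E


def is_endbr(code, offset=0):
    """True if the four bytes at offset encode ENDBR64 or ENDBR32."""
    return code[offset:offset + 4] in (ENDBR64, ENDBR32)


def is_notrack_branch(code, offset=0):
    """True for a NOTRACK-prefixed near indirect CALL (FF /2) or JMP (FF /4)."""
    if len(code) - offset < 3 or code[offset] != NOTRACK_PREFIX:
        return False
    i = offset + 1
    if 0x40 <= code[i] <= 0x4F:       # optional REX prefix (64-bit mode)
        i += 1
    if i + 1 >= len(code) or code[i] != 0xFF:
        return False
    reg = (code[i + 1] >> 3) & 0x7    # ModRM.reg selects the FF-group opcode
    return reg in (2, 4)
```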

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.645963517@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
96db4a988d objtool: Read the NOENDBR annotation
Read the new NOENDBR annotation. While there, attempt to not bloat
struct instruction.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.586815435@infradead.org
2022-03-15 10:32:46 +01:00
Peter Zijlstra
dca5da2abe x86,objtool: Move the ASM_REACHABLE annotation to objtool.h
Because we need a variant for .S files too.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
2022-03-15 10:32:45 +01:00
Peter Zijlstra
0e5b613b4d objtool: Rework ASM_REACHABLE
Currently ASM_REACHABLE only works for UD2 instructions; reorder
things to also allow over-riding dead_end_function().

To that end:

 - Mark INSN_BUG instructions in decode_instructions(), this saves
   having to iterate all instructions yet again.

 - Have add_call_destinations() set insn->dead_end for
   dead_end_function() calls.

 - Move add_dead_ends() *after* add_call_destinations() such that
   ASM_REACHABLE can clear the ->dead_end mark.

 - Have validate_branch() only check ->dead_end.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.410010807@infradead.org
2022-03-15 10:32:44 +01:00
Peter Zijlstra
105cd68596 x86: Mark __invalid_creds() __noreturn
vmlinux.o: warning: objtool: ksys_unshare()+0x36c: unreachable instruction

0000 0000000000067040 <ksys_unshare>:
...
0364    673a4:	4c 89 ef             	mov    %r13,%rdi
0367    673a7:	e8 00 00 00 00       	call   673ac <ksys_unshare+0x36c>	673a8: R_X86_64_PLT32	__invalid_creds-0x4
036c    673ac:	e9 28 ff ff ff       	jmp    672d9 <ksys_unshare+0x299>
0371    673b1:	41 bc f4 ff ff ff    	mov    $0xfffffff4,%r12d
0377    673b7:	e9 80 fd ff ff       	jmp    6713c <ksys_unshare+0xfc>

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/Yi9gOW9f1GGwwUD6@hirez.programming.kicks-ass.net
2022-03-15 10:32:44 +01:00
Peter Zijlstra
eae654f1c2 exit: Mark do_group_exit() __noreturn
vmlinux.o: warning: objtool: get_signal()+0x108: unreachable instruction

0000 000000000007f930 <get_signal>:
...
0103    7fa33:  e8 00 00 00 00          call   7fa38 <get_signal+0x108> 7fa34: R_X86_64_PLT32   do_group_exit-0x4
0108    7fa38:  41 8b 45 74             mov    0x74(%r13),%eax

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.351270711@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
f9cdf7ca57 x86: Mark stop_this_cpu() __noreturn
vmlinux.o: warning: objtool: smp_stop_nmi_callback()+0x2b: unreachable instruction

0000 0000000000047cf0 <smp_stop_nmi_callback>:
...
0026    47d16:  e8 00 00 00 00          call   47d1b <smp_stop_nmi_callback+0x2b>       47d17: R_X86_64_PLT32   stop_this_cpu-0x4
002b    47d1b:  b8 01 00 00 00          mov    $0x1,%eax

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.290905453@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
4adb236867 objtool: Ignore extra-symbol code
There's a fun implementation detail on linking STB_WEAK symbols. When
the linker combines two translation units, where one contains a weak
function and the other an override for it, it simply strips the
STB_WEAK symbol from the symbol table, but doesn't actually remove the
code.

The result is that when objtool is run in a whole-archive kind of way,
it will encounter *heaps* of unused (and unreferenced) code. All
rudiments of weak functions.

Additionally, when a weak implementation is split into a .cold
subfunction that .cold symbol is left in place, even though completely
unused.

Teach objtool to ignore such rudiments by searching for symbol holes;
that is, code ranges that fall outside the given symbol bounds.
Specifically, ignore a sequence of unreachable instructions iff they
occupy a single hole, additionally ignore any .cold subfunctions
referenced.

Both ld.bfd and ld.lld behave like this. LTO builds otoh can (and do)
properly DCE weak functions.
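
The notion of a symbol hole can be sketched in Python: given the
(start, size) ranges of the symbols in a section, an offset lies in a
hole when no symbol covers it. This is a toy model. The function name
and the data layout are hypothetical and do not reflect objtool's
internal data structures.

```python
def find_symbol_hole(symbols, offset, section_size):
    """Return the (start, end) hole containing offset, or None if covered.

    symbols is an iterable of (start, size) pairs within one section.
    """
    hole_start = 0
    for start, size in sorted(symbols):
        if offset < start:
            # offset sits in the gap before this symbol
            return (hole_start, start)
        if offset < start + size:
            # offset is inside this symbol, so it is not in a hole
            return None
        hole_start = max(hole_start, start + size)
    if offset < section_size:
        return (hole_start, section_size)
    return None
```

A run of unreachable instructions would then be ignored only if all of
its offsets map to the same returned hole.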

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.232019347@infradead.org
2022-03-15 10:32:43 +01:00
Peter Zijlstra
53f7109ef9 objtool: Rename --duplicate to --lto
In order to prepare for LTO like objtool runs for modules, rename the
duplicate argument to lto.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154319.172584233@infradead.org
2022-03-15 10:32:42 +01:00
Peter Zijlstra
c8c301abea x86/ibt: Add ANNOTATE_NOENDBR
In order to have objtool warn about code references to !ENDBR
instructions, we need an annotation to allow this for non-control-flow
instances -- consider text range checks, text patching, or return
trampolines etc.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.578968224@infradead.org
2022-03-15 10:32:33 +01:00
Peter Zijlstra
5cff2086b0 objtool: Have WARN_FUNC fall back to sym+off
Currently WARN_FUNC() prints func+off and, failing that, sec+off. Add
an intermediate sym+off fallback. This is useful when playing around
with entry code.
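
The fallback order can be sketched as follows. This Python model only
illustrates the func, then sym, then sec preference. The function name
is hypothetical and the exact output strings are assumptions, not
objtool's format.

```python
def describe_location(sec_name, offset, func=None, sym=None):
    """Format a section offset as func+off, else sym+off, else sec+off.

    func and sym are optional (name, start_offset) pairs that contain offset.
    """
    if func is not None:
        name, start = func
        return f"{name}()+0x{offset - start:x}"
    if sym is not None:
        name, start = sym
        return f"{name}+0x{offset - start:x}"
    return f"{sec_name}+0x{offset:x}"
```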

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.461283840@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
1ffbe4e935 objtool: Default ignore INT3 for unreachable
Ignore all INT3 instructions for unreachable code warnings, similar to NOP.
This allows using INT3 for various paddings instead of NOPs.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.343312938@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
f2d3a25089 objtool: Add --dry-run
Add a --dry-run argument to skip writing the modifications. This is
convenient for debugging.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Miroslav Benes <mbenes@suse.cz>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220308154317.282720146@infradead.org
2022-03-15 10:32:32 +01:00
Peter Zijlstra
599d66b847 Merge branch 'arm64/for-next/linkage'
Enjoy the cleanups and avoid conflicts vs linkage

Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-15 10:32:31 +01:00
Fenghua Yu
227a06553f tools/objtool: Check for use of the ENQCMD instruction in the kernel
The ENQCMD instruction implicitly accesses the PASID_MSR to fill in the
pasid field of the descriptor being submitted to an accelerator. But
there is no precise (and stable across kernel changes) point at which
the PASID_MSR is updated from the value for one task to the next.

Kernel code that uses accelerators must always use the ENQCMDS instruction
which does not access the PASID_MSR.

Check for use of the ENQCMD instruction in the kernel and warn on its
usage.

Signed-off-by: Fenghua Yu <fenghua.yu@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lore.kernel.org/r/20220207230254.3342514-11-fenghua.yu@intel.com
Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2022-03-15 10:32:30 +01:00
Ingo Molnar
ccdbf33c23 Linux 5.17-rc8
Merge tag 'v5.17-rc8' into sched/core, to pick up fixes

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2022-03-15 10:28:12 +01:00
Jakub Kicinski
15d703921f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net coming late
in the 5.17-rc process:

1) Revert port remap to mitigate shadowing service ports, this is causing
   problems in existing setups and this mitigation can be achieved with
   explicit ruleset, eg.

	... tcp sport < 16386 tcp dport >= 32768 masquerade random

  This patch provided a built-in policy similar to the one described above.

2) Disable register tracking infrastructure in nf_tables. Florian reported
   two issues:

   - Existing expressions with no implemented .reduce interface
     that causes data-store on register should cancel the tracking.
   - Register clobbering might be possible storing data on registers that
     are larger than 32-bits.

   This might lead to generating incorrect ruleset bytecode. These two
   issues are scheduled to be addressed in the next release cycle.

* git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: disable register tracking
  Revert "netfilter: conntrack: tag conntracks picked up in local out hook"
  Revert "netfilter: nat: force port remap to prevent shadowing well-known ports"
====================

Link: https://lore.kernel.org/r/20220312220315.64531-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-14 15:51:10 -07:00
Arnaldo Carvalho de Melo
65eab2bc7d Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes that went thru perf/urgent.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-14 19:15:16 -03:00
Will Deacon
563c463595 Merge branch 'for-next/linkage' into for-next/core
* for-next/linkage:
  arm64: module: remove (NOLOAD) from linker script
  linkage: remove SYM_FUNC_{START,END}_ALIAS()
  x86: clean up symbol aliasing
  arm64: clean up symbol aliasing
  linkage: add SYM_FUNC_ALIAS{,_LOCAL,_WEAK}()
2022-03-14 19:01:05 +00:00
Janis Schoetterl-Glausch
3bcc372c98 KVM: s390: selftests: Add error memop tests
Test that errors occur if key protection disallows access, including
tests for storage and fetch protection override. Perform tests for both
logical vcpu and absolute vm ioctls.
Also extend the existing tests to the vm ioctl.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220308125841.3271721-6-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-03-14 16:12:27 +01:00
Janis Schoetterl-Glausch
1bb873495a KVM: s390: selftests: Add more copy memop tests
Do not just test the actual copy, but also that success is indicated
when using the check only flag.
Add copy test with storage key checking enabled, including tests for
storage and fetch protection override.
These tests cover both logical vcpu ioctls as well as absolute vm ioctls.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220308125841.3271721-5-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-03-14 16:12:27 +01:00
Janis Schoetterl-Glausch
c4816a1b7f KVM: s390: selftests: Add named stages for memop test
The stages synchronize guest and host execution.
This helps the reader and constrains the execution of the test -- if the
observed staging differs from the expected staging, the test fails.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220308125841.3271721-4-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-03-14 16:12:27 +01:00
Janis Schoetterl-Glausch
4eb562ab99 KVM: s390: selftests: Add macro as abstraction for MEM_OP
In order to achieve good test coverage we need to be able to invoke the
MEM_OP ioctl with all possible parametrizations.
However, for a given test, we want to be concise and not specify a long
list of default values for parameters not relevant for the test, so the
reader's attention is not needlessly diverted.
Add a macro that enables this and convert the existing test to use it.
The macro emulates named arguments and hides some of the ioctl's
redundancy, e.g. sets the key flag if an access key is specified.
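
A minimal sketch of the general technique, not the macro added by this
patch: a variadic macro wraps a compound literal, so callers name only the
fields they care about and designated initializers zero the rest. The
struct, field names, and flag values below are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request descriptor; the field names are illustrative only. */
struct op_desc {
	uint64_t addr;		/* target address, defaults to 0 */
	uint32_t size;		/* number of bytes, defaults to 0 */
	uint8_t key;		/* access key, 0 means "no key given" */
	int check_only;		/* nonzero: validate without copying */
};

#define FLAG_CHECK_ONLY	0x1u
#define FLAG_USE_KEY	0x2u

/* Derive the redundant flag word from the descriptor. */
static uint32_t op_flags(const struct op_desc *d)
{
	uint32_t flags = 0;

	assert(d != NULL);
	if (d->check_only)
		flags |= FLAG_CHECK_ONLY;
	if (d->key)
		flags |= FLAG_USE_KEY;	/* implied by specifying a key */
	return flags;
}

/*
 * Emulate named arguments: unspecified fields are zero-initialized.
 * At least one field must be given, since an empty initializer list
 * is not valid before C23.
 */
#define OP_FLAGS(...) op_flags(&(struct op_desc){ __VA_ARGS__ })
```

With this, `OP_FLAGS(.size = 8, .key = 9)` sets the key flag without the
caller mentioning it, while `OP_FLAGS(.size = 8)` leaves it clear.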

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220308125841.3271721-3-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-03-14 16:12:27 +01:00
Janis Schoetterl-Glausch
70e2f9f039 KVM: s390: selftests: Split memop tests
Split success case/copy test from error test, making them independent.
This means they do not share state and are easier to understand.
Also, new tests can be added in the same manner without affecting the old
ones. In order to make that simpler, introduce functionality for the
setup of commonly used variables.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/r/20220308125841.3271721-2-scgl@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@linux.ibm.com>
2022-03-14 16:12:27 +01:00
Victor Nogueira
102e4a8e12 selftests: tc-testing: Increase timeout in tdc config file
Some tests, such as Test d052: Add 1M filters with the same action, may
not work with a small timeout value.

Increase timeout to 24 seconds.

Signed-off-by: Victor Nogueira <victor@mojatatu.com>
Acked-by: Davide Caratti <dcaratti@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-03-14 10:27:52 +00:00
Zhengjun Xing
91c9923a47 perf parse: Fix event parser error for hybrid systems
This bug happens on hybrid systems when both cpu_core and cpu_atom
have the same event name, such as "UOPS_RETIRED.MS", while their event
terms are different. During perf stat, parsing the event for cpu_atom
then fails and there is no output for cpu_atom.

UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/

This happens because the event terms in the "head" of
parse_events_multi_pmu_add are changed to the event terms for cpu_core
after parsing UOPS_RETIRED.MS for cpu_core. When the same event is then
parsed for cpu_atom, it still uses the event terms for cpu_core, but the
event terms for cpu_atom differ from those for cpu_core, so parsing the
event for cpu_atom fails. This patch fixes it: the event terms are parsed
from the original event.

This patch can work for the hybrid systems that have the same event
in more than 2 PMUs. It also can work in non-hybrid systems.
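
The idea can be sketched as follows, assuming a much simplified model in
which the terms are a plain string and resolving them for a PMU rewrites
that string in place. All names here are hypothetical and are not the perf
data structures; only the umask values are taken from the output above:

```c
#include <assert.h>
#include <string.h>

#define MAX_TERMS 64

/* Hypothetical stand-in for a parsed list of event terms. */
struct terms {
	char text[MAX_TERMS];
};

/*
 * Hypothetical resolver: rewrites the generic terms in place into the
 * PMU-specific form. It fails if the terms were already rewritten.
 */
static int resolve_for_pmu(struct terms *t, const char *pmu)
{
	if (strcmp(t->text, "UOPS_RETIRED.MS") != 0)
		return -1;	/* not the original event any more */
	if (strcmp(pmu, "cpu_core") == 0)
		strcpy(t->text, "umask=0x4,event=0xc2");
	else if (strcmp(pmu, "cpu_atom") == 0)
		strcpy(t->text, "umask=0x1,event=0xc2");
	else
		return -1;
	return 0;
}

/* Resolve one event for every PMU, starting from a fresh copy each time. */
static int resolve_all(const struct terms *orig, const char *const *pmus,
		       size_t n, struct terms *out)
{
	size_t i;

	assert(orig != NULL && pmus != NULL && out != NULL);
	for (i = 0; i < n; i++) {
		out[i] = *orig;	/* copy, so earlier rewrites cannot leak in */
		if (resolve_for_pmu(&out[i], pmus[i]))
			return -1;
	}
	return 0;
}
```

Reusing a single struct terms across the loop instead of copying *orig
would make the second resolve_for_pmu() call fail, which mirrors the
cpu_atom failure described above.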

Before:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 2737845 16068518485 16068518485

 Performance counter stats for 'system wide':

         2,737,845      cpu_core/UOPS_RETIRED.MS/

       1.002553850 seconds time elapsed

After:

  # perf stat -v  -e  UOPS_RETIRED.MS  -a sleep 1

  Using CPUID GenuineIntel-6-97-1
  UOPS_RETIRED.MS -> cpu_core/period=0x1e8483,umask=0x4,event=0xc2,frontend=0x8/
  UOPS_RETIRED.MS -> cpu_atom/period=0x1e8483,umask=0x1,event=0xc2/
  Control descriptor is not initialized
  UOPS_RETIRED.MS: 1977555 16076950711 16076950711
  UOPS_RETIRED.MS: 568684 8038694234 8038694234

 Performance counter stats for 'system wide':

         1,977,555      cpu_core/UOPS_RETIRED.MS/
           568,684      cpu_atom/UOPS_RETIRED.MS/

       1.004758259 seconds time elapsed

Fixes: fb0811535e ("perf parse-events: Allow config on kernel PMU events")
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220307151627.30049-1-zhengjun.xing@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 11:04:55 -03:00
James Clark
f693dac479 perf tools: Set build-id using build-id header on new mmap records
MMAP records that occur after the build-id header is parsed do not have
their build-id set even if the filename matches an entry from the
header. Set the build-id on these dsos as long as the MMAP record
doesn't have its own build-id set.

This fixes an issue with off target analysis where the local version of
a dso is loaded rather than one from ~/.debug via a build-id.

Reported-by: Denis Nikitin <denik@chromium.org>
Signed-off-by: James Clark <james.clark@arm.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: coresight@lists.linaro.org
Link: https://lore.kernel.org/r/20220304090956.2048712-2-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 11:01:12 -03:00
Rasmus Villemoes
7177a47926 tools compiler.h: Remove duplicate #ifndef noinline block
The same three lines also appear a bit earlier in the same file.

Reviewed-by: Miguel Ojeda <ojeda@kernel.org>
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Cc: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20211015083144.2767725-1-linux@rasmusvillemoes.dk
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 11:00:57 -03:00
Weiguo Li
073a15c351 perf bench: Fix NULL check against wrong variable
We did a NULL check after "epollfdp = calloc(...)", but we checked
"epollfd" instead of "epollfdp".

Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_B5D64530EB9C7DBB8D2C88A0C790F1489D0A@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:49:13 -03:00
Weiguo Li
a7a72631f6 perf parse-events: Fix NULL check against wrong variable
We did a NULL check after "tmp->symbol = strdup(...)", but we checked
"list->symbol" rather than "tmp->symbol".

Reviewed-by: John Garry <john.garry@huawei.com>
Signed-off-by: Weiguo Li <liwg06@foxmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/tencent_DF39269807EC9425E24787E6DB632441A405@qq.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:47:52 -03:00
Arnaldo Carvalho de Melo
ec9d50ace3 tools headers cpufeatures: Sync with the kernel sources
To pick the changes from:

  d45476d983 ("x86/speculation: Rename RETPOLINE_AMD to RETPOLINE_LFENCE")

It's just a comment fixup.

This only causes these perf files to be rebuilt:

  CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
  CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/x86/include/asm/cpufeatures.h' differs from latest version at 'arch/x86/include/asm/cpufeatures.h'
  diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Borislav Petkov <bp@suse.de>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/lkml/YiyiHatGaJQM7l/Y@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:38:05 -03:00
Arnaldo Carvalho de Melo
3ec94eeaff tools kvm headers arm64: Update KVM headers from the kernel sources
To pick the changes from:

  a5905d6af4 ("KVM: arm64: Allow SMCCC_ARCH_WORKAROUND_3 to be discovered and migrated")

That doesn't cause any changes in tooling (when built on x86); it only
addresses this perf build warning:

  Warning: Kernel ABI header at 'tools/arch/arm64/include/uapi/asm/kvm.h' differs from latest version at 'arch/arm64/include/uapi/asm/kvm.h'
  diff -u tools/arch/arm64/include/uapi/asm/kvm.h arch/arm64/include/uapi/asm/kvm.h

Cc: James Morse <james.morse@arm.com>
Link: https://lore.kernel.org/lkml/YiyhAK6sVPc83FaI@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-03-12 10:33:13 -03:00
Dan Williams
3b6c6c0397 nvdimm/region: Delete nd_blk_region infrastructure
Now that the nd_namespace_blk infrastructure is removed, delete all the
region machinery to coordinate provisioning aliased capacity between
PMEM and BLK.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/164688418803.2879318.1302315202397235855.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-11 15:53:13 -08:00
Dan Williams
a4b96046a8 ACPI: NFIT: Remove block aperture support
Delete the code to parse interleave-descriptor-tables and coordinate I/O
through a BLK aperture.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/164688418240.2879318.400185926874596938.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-11 15:53:13 -08:00
Dan Williams
f8669f1d6a nvdimm/blk: Delete the block-aperture window driver
Block Aperture Window support was an attempt to layer an error model
over PMEM for platforms that did not support machine-check-recovery.
However, it was abandoned before it ever shipped, and only ever existed
in the ACPI specification. Meanwhile Linux has carried a large pile of
dead code for non-shipping infrastructure. For years it has been off to
the side out of the way, but now CXL and recent directions with DAX
support have the potential to collide with this code.

In preparation for adding discontiguous namespace support, a
pre-requisite for the nvdimm subsystem to replace device-mapper for
striping + concatenation use cases, delete BLK aperture support.

On the obscure chance that some hardware vendor shipped support for this
mode, note that the driver will still keep BLK space reserved in the
label area. So an end user in this case would still have the opportunity
to report the regression to get BLK-mode support restored without
risking the data they have on that device.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/164688416668.2879318.16903178375774275120.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
2022-03-11 15:53:12 -08:00
Yonghong Song
d3b351f65b selftests/bpf: Fix a clang compilation error for send_signal.c
Building selftests/bpf with latest clang compiler (clang15 built
from source), I hit the following compilation error:

  /.../prog_tests/send_signal.c:43:16: error: variable 'j' set but not used [-Werror,-Wunused-but-set-variable]
                  volatile int j = 0;
                               ^
  1 error generated.

The problem also exists with clang13 and clang14. clang12 is okay.

In send_signal.c, we have the following code ...

  volatile int j = 0;
  [...]
  for (int i = 0; i < 100000000 && !sigusr1_received; i++)
    j /= i + 1;

... to burn CPU cycles so bpf_send_signal() helper can be tested
in NMI mode.

Slightly changing 'j /= i + 1' to 'j /= i + j + 1' or 'j++' can
fix the problem. Further investigation indicated this should be
a clang bug ([1]). The upstream fix will be proposed later. But it
is a good idea to work around the issue to unblock people who build
kernel/selftests with clang.

  [1] https://discourse.llvm.org/t/strange-clang-unused-but-set-variable-error-with-volatile-variables/60841
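
A standalone sketch of the busy loop using the 'j++' form mentioned above.
It is not the selftest code: the function name and the stop flag are
hypothetical, and returning j additionally makes the variable visibly used:

```c
#include <assert.h>
#include <signal.h>

/* Set asynchronously in a real test; here it is only ever read. */
static volatile sig_atomic_t stop_requested;

/*
 * Burn CPU until stop_requested is set or max_iter is reached.
 * The volatile qualifier keeps the compiler from dropping the loop.
 */
static long burn_cycles(long max_iter)
{
	volatile long j = 0;
	long i;

	assert(max_iter >= 0);
	for (i = 0; i < max_iter && !stop_requested; i++)
		j++;
	return j;	/* number of iterations actually executed */
}
```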

Signed-off-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220311003721.2177170-1-yhs@fb.com
2022-03-11 22:18:13 +01:00
Toke Høiland-Jørgensen
c09df4bd3a selftests/bpf: Add a test for maximum packet size in xdp_do_redirect
This adds an extra test to the xdp_do_redirect selftest for XDP live packet
mode, which verifies that the maximum permissible packet size is accepted
without any errors, and that a too big packet is correctly rejected.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Link: https://lore.kernel.org/bpf/20220310225621.53374-2-toke@redhat.com
2022-03-11 22:01:26 +01:00
Roberto Sassu
7bae42b68d selftests/bpf: Check that bpf_kernel_read_file() denies reading IMA policy
Check that bpf_kernel_read_file() denies the reading of an IMA policy, by
ensuring that ima_setup.sh exits with an error.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-10-roberto.sassu@huawei.com
2022-03-10 18:57:55 -08:00
Roberto Sassu
e6dcf7bbf3 selftests/bpf: Add test for bpf_lsm_kernel_read_file()
Test the ability of bpf_lsm_kernel_read_file() to call the sleepable
functions bpf_ima_inode_hash() or bpf_ima_file_hash() to obtain a
measurement of a loaded IMA policy.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-9-roberto.sassu@huawei.com
2022-03-10 18:57:55 -08:00
Roberto Sassu
91e8fa254d selftests/bpf: Check if the digest is refreshed after a file write
Verify that bpf_ima_inode_hash() returns a non-fresh digest after a file
write, and that bpf_ima_file_hash() returns a fresh digest. Verification is
done by requesting the digest from the bprm_creds_for_exec hook, called
before ima_bprm_check().

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-7-roberto.sassu@huawei.com
2022-03-10 18:57:54 -08:00
Roberto Sassu
27a77d0d46 selftests/bpf: Add test for bpf_ima_file_hash()
Add new test to ensure that bpf_ima_file_hash() returns the digest of the
executed files.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-6-roberto.sassu@huawei.com
2022-03-10 18:57:54 -08:00
Roberto Sassu
2746de3c53 selftests/bpf: Move sample generation code to ima_test_common()
Move sample generator code to ima_test_common() so that the new function
can be called by multiple LSM hooks.

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-5-roberto.sassu@huawei.com
2022-03-10 18:57:54 -08:00
Roberto Sassu
174b16946e bpf-lsm: Introduce new helper bpf_ima_file_hash()
ima_file_hash() has been modified to calculate the measurement of a file on
demand, if it has not been already performed by IMA or the measurement is
not fresh. For compatibility reasons, ima_inode_hash() remains unchanged.

Keep the same approach in eBPF and introduce the new helper
bpf_ima_file_hash() to take advantage of the modified behavior of
ima_file_hash().

Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20220302111404.193900-4-roberto.sassu@huawei.com
2022-03-10 18:57:54 -08:00
Jakub Kicinski
1e8a3f0d2a Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
net/dsa/dsa2.c
  commit afb3cc1a39 ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
  commit e83d565378 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/

drivers/net/ethernet/intel/ice/ice.h
  commit 97b0129146 ("ice: Fix error with handling of bonding MTU")
  commit 43113ff734 ("ice: add TTY for GNSS module for E810T device")
https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/

drivers/staging/gdm724x/gdm_lte.c
  commit fc7f750dc9 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
  commit 4bcc4249b4 ("staging: Use netif_rx().")
https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 17:16:56 -08:00
Linus Torvalds
186d32bbf0 Networking fixes for 5.17-rc8/final, including fixes from bluetooth,
and ipsec.
 
 Current release - regressions:
 
  - Bluetooth: fix unbalanced unlock in set_device_flags()
 
  - Bluetooth: fix not processing all entries on cmd_sync_work,
    make connect with qualcomm and intel adapters reliable
 
  - Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"
 
  - xdp: xdp_mem_allocator can be NULL in trace_mem_connect()
 
  - eth: ice: fix race condition and deadlock during interface enslave
 
 Current release - new code bugs:
 
  - tipc: fix incorrect order of state message data sanity check
 
 Previous releases - regressions:
 
  - esp: fix possible buffer overflow in ESP transformation
 
  - dsa: unlock the rtnl_mutex when dsa_master_setup() fails
 
  - phy: meson-gxl: fix interrupt handling in forced mode
 
  - smsc95xx: ignore -ENODEV errors when device is unplugged
 
 Previous releases - always broken:
 
  - xfrm: fix tunnel mode fragmentation behavior
 
  - esp: fix inter address family tunneling on GSO
 
  - tipc: fix null-deref due to race when enabling bearer
 
  - sctp: fix kernel-infoleak for SCTP sockets
 
  - eth: macb: fix lost RX packet wakeup race in NAPI receive
 
  - eth: intel stop disabling VFs due to PF error responses
 
  - eth: bcmgenet: don't claim WOL when its not available
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>

Merge tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, and ipsec.

  Current release - regressions:

   - Bluetooth: fix unbalanced unlock in set_device_flags()

   - Bluetooth: fix not processing all entries on cmd_sync_work, make
     connect with qualcomm and intel adapters reliable

   - Revert "xfrm: state and policy should fail if XFRMA_IF_ID 0"

   - xdp: xdp_mem_allocator can be NULL in trace_mem_connect()

   - eth: ice: fix race condition and deadlock during interface enslave

  Current release - new code bugs:

   - tipc: fix incorrect order of state message data sanity check

  Previous releases - regressions:

   - esp: fix possible buffer overflow in ESP transformation

   - dsa: unlock the rtnl_mutex when dsa_master_setup() fails

   - phy: meson-gxl: fix interrupt handling in forced mode

   - smsc95xx: ignore -ENODEV errors when device is unplugged

  Previous releases - always broken:

   - xfrm: fix tunnel mode fragmentation behavior

   - esp: fix inter address family tunneling on GSO

   - tipc: fix null-deref due to race when enabling bearer

   - sctp: fix kernel-infoleak for SCTP sockets

   - eth: macb: fix lost RX packet wakeup race in NAPI receive

   - eth: intel stop disabling VFs due to PF error responses

   - eth: bcmgenet: don't claim WOL when its not available"

* tag 'net-5.17-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (50 commits)
  xdp: xdp_mem_allocator can be NULL in trace_mem_connect().
  ice: Fix race condition during interface enslave
  net: phy: meson-gxl: improve link-up behavior
  net: bcmgenet: Don't claim WOL when its not available
  net: arc_emac: Fix use after free in arc_mdio_probe()
  sctp: fix kernel-infoleak for SCTP sockets
  net: phy: correct spelling error of media in documentation
  net: phy: DP83822: clear MISR2 register to disable interrupts
  gianfar: ethtool: Fix refcount leak in gfar_get_ts_info
  selftests: pmtu.sh: Kill nettest processes launched in subshell.
  selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
  NFC: port100: fix use-after-free in port100_send_complete
  net/mlx5e: SHAMPO, reduce TIR indication
  net/mlx5e: Lag, Only handle events from highest priority multipath entry
  net/mlx5: Fix offloading with ESWITCH_IPV4_TTL_MODIFY_ENABLE
  net/mlx5: Fix a race on command flush flow
  net/mlx5: Fix size field in bufferx_reg struct
  ax25: Fix NULL pointer dereference in ax25_kill_by_device
  net: marvell: prestera: Add missing of_node_put() in prestera_switch_set_base_mac_addr
  net: ethernet: lpc_eth: Handle error for clk_enable
  ...
2022-03-10 16:47:58 -08:00
Chris J Arges
357b3cc3c0 bpftool: Ensure bytes_memlock json output is correct
If a BPF map larger than 2^32 bytes is created, the memlock value displayed
in JSON format will be incorrect. Use atoll instead of atoi so that the
correct number is displayed.

  ```
  $ bpftool map create /sys/fs/bpf/test_bpfmap type hash key 4 \
    value 1024 entries 4194304 name test_bpfmap
  $ bpftool map list
  1: hash  name test_bpfmap  flags 0x0
          key 4B  value 1024B  max_entries 4194304  memlock 4328521728B
  $ sudo bpftool map list -j | jq .[].bytes_memlock
  33554432
  ```
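
The numbers above are consistent with the value being cut down to its low
32 bits: 4328521728 is 2^32 + 33554432. A small sketch of the difference,
using strtoll() and an explicit cast so the truncation is well defined
(atoi() on an out-of-range value is undefined behaviour in C). The helper
names are hypothetical and not part of bpftool:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a decimal byte count into a 64-bit value. */
static long long parse_bytes(const char *s)
{
	assert(s != NULL);
	return strtoll(s, NULL, 10);
}

/* What remains of a value once only its low 32 bits are kept. */
static uint32_t low32(long long v)
{
	return (uint32_t)v;
}
```

Here parse_bytes("4328521728") keeps the full value, while low32() of the
same number yields the 33554432 seen in the JSON output.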

Signed-off-by: Chris J Arges <carges@cloudflare.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/b6601087-0b11-33cc-904a-1133d1500a10@cloudflare.com
2022-03-11 00:06:11 +01:00
Hengqi Chen
5861701440 bpf: Fix comment for helper bpf_current_task_under_cgroup()
Fix the descriptions of the return values of helper bpf_current_task_under_cgroup().

Fixes: c6b5fb8690 ("bpf: add documentation for eBPF helpers (42-50)")
Signed-off-by: Hengqi Chen <hengqi.chen@gmail.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/bpf/20220310155335.1278783-1-hengqi.chen@gmail.com
2022-03-10 23:00:43 +01:00
Martin KaFai Lau
3daf0896f3 bpf: selftests: Update tests after s/delivery_time/tstamp/ change in bpf.h
The previous patch made the following changes:
- s/delivery_time_type/tstamp_type/
- s/bpf_skb_set_delivery_time/bpf_skb_set_tstamp/
- BPF_SKB_DELIVERY_TIME_* to BPF_SKB_TSTAMP_*

This patch is to change the test_tc_dtime.c to reflect the above.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220309090515.3712742-1-kafai@fb.com
2022-03-10 22:57:05 +01:00
Martin KaFai Lau
9bb984f28d bpf: Remove BPF_SKB_DELIVERY_TIME_NONE and rename s/delivery_time_/tstamp_/
This patch is to simplify the uapi bpf.h regarding the tstamp type
and use a similar way as the kernel to describe the value stored
in __sk_buff->tstamp.

My earlier thought was to avoid describing the semantic and
clock base for the rcv timestamp until there is more clarity
on the use case, hence the __sk_buff->delivery_time_type naming instead
of __sk_buff->tstamp_type.

After some more thought, it can reuse the UNSPEC naming.  This patch first
removes BPF_SKB_DELIVERY_TIME_NONE and also renames

BPF_SKB_DELIVERY_TIME_UNSPEC to BPF_SKB_TSTAMP_UNSPEC
and BPF_SKB_DELIVERY_TIME_MONO   to BPF_SKB_TSTAMP_DELIVERY_MONO.

The semantic of BPF_SKB_TSTAMP_DELIVERY_MONO is the same:
__sk_buff->tstamp has delivery time in mono clock base.

BPF_SKB_TSTAMP_UNSPEC means __sk_buff->tstamp has the (rcv)
tstamp at ingress and the delivery time at egress.  At egress,
the clock base could be found from skb->sk->sk_clockid.
__sk_buff->tstamp == 0 naturally means NONE, so NONE is not needed.

With BPF_SKB_TSTAMP_UNSPEC for the rcv tstamp at ingress,
the __sk_buff->delivery_time_type is also renamed to __sk_buff->tstamp_type
which was also suggested in the earlier discussion:
https://lore.kernel.org/bpf/b181acbe-caf8-502d-4b7b-7d96b9fc5d55@iogearbox.net/

The above will then make __sk_buff->tstamp and __sk_buff->tstamp_type
the same as their kernel skb->tstamp and skb->mono_delivery_time
counterparts.

The internal kernel function bpf_skb_convert_dtime_type_read() is then
renamed to bpf_skb_convert_tstamp_type_read() and it can be simplified
with the BPF_SKB_DELIVERY_TIME_NONE gone.  A BPF_ALU32_IMM(BPF_AND)
insn is also saved by using BPF_JMP32_IMM(BPF_JSET).

The bpf helper bpf_skb_set_delivery_time() is also renamed to
bpf_skb_set_tstamp().  The arg name is also changed from dtime
to tstamp.  It only allows setting tstamp 0 for
BPF_SKB_TSTAMP_UNSPEC and it could be relaxed later
if there is a use case to change mono delivery time to
non mono.

prog->delivery_time_access is also renamed to prog->tstamp_type_access.

Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20220309090509.3712315-1-kafai@fb.com
2022-03-10 22:57:05 +01:00
Matthieu Baerts
d8d0830205 selftests: mptcp: join: make it shellcheck compliant
This fixes a few issues reported by ShellCheck:

- SC2068: Double quote array expansions to avoid re-splitting elements.
- SC2206: Quote to prevent word splitting/globbing, or split robustly
          with mapfile or read -a.
- SC2166: Prefer [ p ] && [ q ] as [ p -a q ] is not well defined.
- SC2155: Declare and assign separately to avoid masking return values.
- SC2162: read without -r will mangle backslashes.
- SC2219: Instead of 'let expr', prefer (( expr )) .
- SC2181: Check exit code directly with e.g. 'if mycmd;', not indirectly
          with $?.
- SC2236: Use -n instead of ! -z.
- SC2004: $/${} is unnecessary on arithmetic variables.
- SC2012: Use find instead of ls to better handle non-alphanumeric
          filenames.
- SC2002: Useless cat. Consider 'cmd < file | ..' or 'cmd file | ..'
          instead.

SC2086 (Double quotes to prevent globbing and word splitting) is ignored
because it is controlled for the moment and there are too many to
change.

While at it, also fixed the alignment in one comment.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:59 -08:00
Matthieu Baerts
4bfadd7120 selftests: mptcp: join: avoid backquotes
As explained on ShellCheck's wiki [1], it is recommended to avoid
backquotes `...` in favour of parentheses $(...):

> Backtick command substitution `...` is legacy syntax with several
> issues.
>
> - It has a series of undefined behaviors related to quoting in POSIX.
> - It imposes a custom escaping mode with surprising results.
> - It's exceptionally hard to nest.
>
> $(...) command substitution has none of these problems, and is
> therefore strongly encouraged.

[1] https://www.shellcheck.net/wiki/SC2006

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:59 -08:00
Matthieu Baerts
1e777bd818 selftests: mptcp: join: clarify local/global vars
Some vars are redefined in different places. Best to avoid this
classical Bash pitfall where variables are accidentally overridden by
other functions because the proper scope has not been defined.

Most issues are with loops: typically 'i' is used in for-loops, but if it
is not declared local, calling a function from a for-loop that also runs
a for-loop with the same non-local 'i' variable causes trouble because
the caller's 'i' is overwritten with another value. To prevent such
issues, the iterator variable is now declared as local just before the
loop. If it is always done like this, issues are avoided.

To distinguish between local and non-local variables, all non-local ones
are defined at the beginning of the script. The others are now defined
with the "local" keyword.
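
The pitfall described above can be sketched as follows. The function
names are hypothetical and only serve the illustration:

```shell
# Hypothetical helper: its loop reuses the caller's 'i' because the
# iterator is not declared local here.
inner_global() {
	for i in 1 2 3; do :; done
}

# Same helper with the iterator declared local just before the loop.
inner_local() {
	local i
	for i in 1 2 3; do :; done
}

# Calls the given helper from a for-loop and records the value of 'i'
# seen after each call.
outer() {
	local helper="$1" seen=""
	local i
	for i in a b c; do
		"${helper}"
		seen="${seen}${i}"
	done
	echo "${seen}"
}

outer inner_global  # prints 333: 'i' was overwritten by the helper
outer inner_local   # prints abc
```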

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:59 -08:00
Matthieu Baerts
3469d72f13 selftests: mptcp: join: helper to filter TCP
This is more readable and reduces duplicated commands.

This might also be useful to add v6 support and switch to nftables.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:59 -08:00
Matthieu Baerts
39aab88242 selftests: mptcp: join: list failure at the end
With ~100 tests, it helps to have this summary at the end to avoid
scrolling to find which one has failed.

It is especially interesting when looking at the output produced by the
CI where the kernel logs from the serial are mixed together.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:58 -08:00
Matthieu Baerts
c7d49c033d selftests: mptcp: join: alt. to exec specific tests
Running a specific test by giving the ID is often what we want: the CI
reports an issue with the Nth test, it is reproducible with:

  ./mptcp_join.sh N

But this might not work when there is a need to find which commit has
introduced a regression making a test unstable: failing from time to
time. Indeed, a specific test is not attached to one ID: the ID is in
fact a counter. It means the same test can have a different ID if other
tests have been added/removed before this unstable one.

Remembering the current test can also help listing failed tests at the
end.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:58 -08:00
Matthieu Baerts
ae7bd9ccec selftests: mptcp: join: option to execute specific tests
It is often necessary to run one specific test.

There are options to run subgroups of tests, but when only one fails,
there is no need to run the whole subgroup. So far, the solution was to
edit the script to comment out the tests that are not needed, but that's
not ideal.

Now, it is possible to run one specific test by giving the ID of the
tests that are going to be validated, e.g.

  ./mptcp_join.sh 36 37

This is cleaner and saves time.

Technically, the reset* functions now return 0 if the test can be
executed. This naturally creates sections per test in the code which is
also helpful to understand what a test is exactly doing.
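
A rough sketch of the pattern described above, under the assumption
that the requested IDs are kept in a space-separated list. The names
only_tests, test_count and executed, and the body of reset(), are
hypothetical and not the script's actual code:

```shell
# Hypothetical state: IDs requested on the command line, and a counter.
only_tests="2 3"
test_count=0
executed=""

# Return 0 when the next test should run, 1 when it should be skipped.
reset() {
	test_count=$((test_count + 1))
	[ -z "${only_tests}" ] && return 0
	local id
	for id in ${only_tests}; do
		[ "${id}" = "${test_count}" ] && return 0
	done
	return 1
}

# Each test becomes its own section guarded by the reset call.
if reset; then executed="${executed} first"; fi
if reset; then executed="${executed} second"; fi
if reset; then executed="${executed} third"; fi

echo "executed:${executed}"
```

With only_tests set to "2 3" as above, only the second and third
sections run.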

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:58 -08:00
Matthieu Baerts
e59300ce3f selftests: mptcp: join: reset failing links
Best to always reset this env var before each test to avoid surprising
behaviour depending on the order tests are running.

Also explicitly set it for the last failing links test: this is needed
when only this test is executed.

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:58 -08:00
Matthieu Baerts
3afd0280e7 selftests: mptcp: join: define tests groups once
When adding a new tests group, it has to be defined in multiple places:

- in the all_tests() function
- in the 'usage()' function
- in the getopts: short option + what to do when the option is used

Because it is easy to forget one of them, it is useful to define them
only once.

Note: only using an associative array would simplify the code but the
entries are stored in a hashtable and iterating over the different items
doesn't give the same order as the one used in the declaration of this
array. Because we want to run these tests in the same order as before, a
"simple" array is used first.
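
One way to combine the two kinds of array, sketched under the
assumption of Bash 4 or later (associative arrays). The entry format
and all names below are hypothetical, not necessarily what the script
uses:

```shell
# A "simple" array keeps the declaration order.
# Hypothetical entry format: "function@option@description".
all_tests_sorted=(
	"subflows_tests@f@flags / subflows"
	"signal_address_tests@s@signal addresses"
	"link_failure_tests@l@link failure"
)

# The associative array gives direct lookup by option letter, while the
# indexed array of names preserves the order used for running the groups.
declare -A all_tests_args=()
all_tests_names=()
for entry in "${all_tests_sorted[@]}"; do
	IFS='@' read -r name opt _desc <<< "${entry}"
	all_tests_args["${opt}"]="${name}"
	all_tests_names+=("${name}")
done

echo "${all_tests_names[*]}"
echo "${all_tests_args[l]}"
```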

Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:57 -08:00
Geliang Tang
3c082695e7 selftests: mptcp: drop msg argument of chk_csum_nr
This patch dropped the msg argument of chk_csum_nr, to unify chk_csum_nr
with other chk_*_nr functions.

Reviewed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Geliang Tang <geliang.tang@suse.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-10 12:29:57 -08:00
Niklas Söderlund
f655c088e7 bpftool: Restore support for BPF offload-enabled feature probing
Commit 1a56c18e6c ("bpftool: Stop supporting BPF offload-enabled
feature probing") removed the support to probe for BPF offload features.
This is still something that is useful for NFP NICs, which can support
offloading of BPF programs.

The reason for the dropped support was that libbpf starting with v1.0
would drop support for passing the ifindex to the BPF prog/map/helper
feature probing APIs. In order to keep this useful feature for NFP,
restore the functionality by moving it directly into bpftool.

The code restored is a simplified version of the code that existed in
libbpf which supported passing the ifindex. The simplification is that it
only targets the cases where ifindex is given and calls into libbpf for
the cases where it's not.

Before restoring support for probing offload features:

  # bpftool feature probe dev ens4np0
  Scanning system call availability...
  bpf() syscall is available

  Scanning eBPF program types...

  Scanning eBPF map types...

  Scanning eBPF helper functions...
  eBPF helpers supported for program type sched_cls:
  eBPF helpers supported for program type xdp:

  Scanning miscellaneous eBPF features...
  Large program size limit is NOT available
  Bounded loop support is NOT available
  ISA extension v2 is NOT available
  ISA extension v3 is NOT available

With support for probing offload features restored:

  # bpftool feature probe dev ens4np0
  Scanning system call availability...
  bpf() syscall is available

  Scanning eBPF program types...
  eBPF program_type sched_cls is available
  eBPF program_type xdp is available

  Scanning eBPF map types...
  eBPF map_type hash is available
  eBPF map_type array is available

  Scanning eBPF helper functions...
  eBPF helpers supported for program type sched_cls:
  	- bpf_map_lookup_elem
  	- bpf_get_prandom_u32
  	- bpf_perf_event_output
  eBPF helpers supported for program type xdp:
  	- bpf_map_lookup_elem
  	- bpf_get_prandom_u32
  	- bpf_perf_event_output
  	- bpf_xdp_adjust_head
  	- bpf_xdp_adjust_tail

  Scanning miscellaneous eBPF features...
  Large program size limit is NOT available
  Bounded loop support is NOT available
  ISA extension v2 is NOT available
  ISA extension v3 is NOT available

Signed-off-by: Niklas Söderlund <niklas.soderlund@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Quentin Monnet <quentin@isovalent.com>
Link: https://lore.kernel.org/bpf/20220310121846.921256-1-niklas.soderlund@corigine.com
2022-03-10 16:09:47 +01:00
Karolina Drobnik
58ffc34896 memblock tests: Add TODO and README files
Add description of the project, its structure and how to run it.
List what is left to implement and what the known issues are.

Signed-off-by: Karolina Drobnik <karolinadrobnik@gmail.com>
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Link: https://lore.kernel.org/r/d5e39b9f7dcef177ebc14282727447bc21e3b38f.1646055639.git.karolinadrobnik@gmail.com
2022-03-10 12:19:44 +02:00
Tian Tao
8ddde07a3d dma-mapping: benchmark: extract a common header file for map_benchmark definition
kernel/dma/map_benchmark.c and selftests/dma/dma_map_benchmark.c
have duplicate map_benchmark definitions, which tends to lead to
inconsistent changes to map_benchmark on the two sides. Extract a
common header file to avoid this problem.

Signed-off-by: Tian Tao <tiantao6@hisilicon.com>
Acked-by: Barry Song <song.bao.hua@hisilicon.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Christoph Hellwig <hch@lst.de>
2022-03-10 07:41:14 +01:00
Guillaume Nault
94a4a4fe4c selftests: pmtu.sh: Kill nettest processes launched in subshell.
When using "run_cmd <command> &", then "$!" refers to the PID of the
subshell used to run <command>, not the command itself. Therefore
nettest_pids actually doesn't contain the list of the nettest commands
running in the background. So cleanup() can't kill them and the nettest
processes run until completion (fortunately they have a 5s timeout).

Fix this by defining a new command for running processes in the
background, for which "$!" really refers to the PID of the command run.

Also, double quote variables on the modified lines, to avoid shellcheck
warnings.

Fixes: ece1278a9b ("selftests: net: add ESP-in-UDP PMTU test")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:23:32 -08:00
Guillaume Nault
18dfc66755 selftests: pmtu.sh: Kill tcpdump processes launched by subshell.
The cleanup() function takes care of killing processes launched by the
test functions. It relies on variables like ${tcpdump_pids} to get the
relevant PIDs. But tests are run in their own subshell, so updated
*_pids values are invisible to other shells. Therefore cleanup() never
sees any process to kill:

$ ./tools/testing/selftests/net/pmtu.sh -t pmtu_ipv4_exception
TEST: ipv4: PMTU exceptions                                         [ OK ]
TEST: ipv4: PMTU exceptions - nexthop objects                       [ OK ]

$ pgrep -af tcpdump
6084 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6085 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6086 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6087 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6088 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6089 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6090 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6091 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap
6228 tcpdump -s 0 -i veth_A-R1 -w pmtu_ipv4_exception_veth_A-R1.pcap
6229 tcpdump -s 0 -i veth_R1-A -w pmtu_ipv4_exception_veth_R1-A.pcap
6230 tcpdump -s 0 -i veth_R1-B -w pmtu_ipv4_exception_veth_R1-B.pcap
6231 tcpdump -s 0 -i veth_B-R1 -w pmtu_ipv4_exception_veth_B-R1.pcap
6232 tcpdump -s 0 -i veth_A-R2 -w pmtu_ipv4_exception_veth_A-R2.pcap
6233 tcpdump -s 0 -i veth_R2-A -w pmtu_ipv4_exception_veth_R2-A.pcap
6234 tcpdump -s 0 -i veth_R2-B -w pmtu_ipv4_exception_veth_R2-B.pcap
6235 tcpdump -s 0 -i veth_B-R2 -w pmtu_ipv4_exception_veth_B-R2.pcap

Fix this by running cleanup() in the context of the test subshell.
Now that each test cleans the environment after completion, there's no
need for calling cleanup() again when the next test starts. So let's
drop it from the setup() function. This is okay because cleanup() is
also called when pmtu.sh starts, so even the first test starts in a
clean environment.

Also, use tcpdump's immediate mode. Otherwise it might not have time to
process buffered packets, resulting in missing packets or even empty
pcap files for short tests.

Note: PAUSE_ON_FAIL is still evaluated before cleanup(), so one can
still inspect the test environment upon failure when using -p.
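
The subshell behaviour behind this bug can be reproduced with a small
sketch. The variable and function names are hypothetical, and the PID
is a placeholder value rather than a real process:

```shell
tracked_pids=""

# Hypothetical test body: records a PID the way a test function would.
fake_test() {
	tracked_pids="${tracked_pids} 1234"
}

# Running the test in a subshell: the update is lost when it exits.
( fake_test )
after_subshell="${tracked_pids}"

# Reading the variable inside the same subshell sees the update, which
# is why the cleanup has to run there.
inside_subshell=$( fake_test; echo "${tracked_pids}" )

echo "after='${after_subshell}' inside='${inside_subshell}'"
```

The parent shell ends up with an empty list, while code running in the
test's own subshell sees the recorded PID.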

Fixes: a92a0a7b8e ("selftests: pmtu: Simplify cleanup and namespace names")
Signed-off-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-03-09 20:23:15 -08:00