Commit Graph

33 Commits

Ingo Molnar
023f42dd59 x86/alternatives: Rename 'apply_relocation()' to 'text_poke_apply_relocation()'
Join the text_poke_*() API namespace.

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Juergen Gross <jgross@suse.com>
Cc: "H . Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250411054105.2341982-52-mingo@kernel.org
2025-04-11 11:01:35 +02:00
Linus Torvalds
01d5b167dc Modules changes for 6.15-rc1
 The changes have been on linux-next for at least 2 weeks, with the RCU
 cleanup present for 2 months. One performance problem was reported with the
 RCU change when KASAN + lockdep were enabled, but it was effectively
 addressed by the already merged ee57ab5a32 ("locking/lockdep: Disable
 KASAN instrumentation of lockdep.c").

Merge tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux

Pull modules updates from Petr Pavlu:

 - Use RCU instead of RCU-sched

   The mix of rcu_read_lock(), rcu_read_lock_sched() and
   preempt_disable() in the module code and its users has
   been replaced with just rcu_read_lock()

 - The rest of the changes are smaller fixes and updates

* tag 'modules-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/modules/linux: (32 commits)
  MAINTAINERS: Update the MODULE SUPPORT section
  module: Remove unnecessary size argument when calling strscpy()
  module: Replace deprecated strncpy() with strscpy()
  params: Annotate struct module_param_attrs with __counted_by()
  bug: Use RCU instead of RCU-sched to protect module_bug_list.
  static_call: Use RCU in all users of __module_text_address().
  kprobes: Use RCU in all users of __module_text_address().
  bpf: Use RCU in all users of __module_text_address().
  jump_label: Use RCU in all users of __module_text_address().
  jump_label: Use RCU in all users of __module_address().
  x86: Use RCU in all users of __module_address().
  cfi: Use RCU while invoking __module_address().
  powerpc/ftrace: Use RCU in all users of __module_text_address().
  LoongArch: ftrace: Use RCU in all users of __module_text_address().
  LoongArch/orc: Use RCU in all users of __module_address().
  arm64: module: Use RCU in all users of __module_text_address().
  ARM: module: Use RCU in all users of __module_text_address().
  module: Use RCU in all users of __module_text_address().
  module: Use RCU in all users of __module_address().
  module: Use RCU in search_module_extables().
  ...
2025-03-30 15:44:36 -07:00
Sebastian Andrzej Siewior
14daa3bca2 x86: Use RCU in all users of __module_address().
__module_address() can be invoked within an RCU section; there is no
requirement to have preemption disabled.

Replace the preempt_disable() section around __module_address() with
RCU.
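
A minimal before/after sketch of the pattern this series applies
(illustrative only, not the exact hunk):

  /* Before: preemption disabled across the lookup. */
  preempt_disable();
  mod = __module_address(addr);
  /* ... use mod ... */
  preempt_enable();

  /* After: a plain RCU read-side critical section suffices. */
  rcu_read_lock();
  mod = __module_address(addr);
  /* ... use mod ... */
  rcu_read_unlock();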

Cc: H. Peter Anvin <hpa@zytor.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: x86@kernel.org
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250108090457.512198-23-bigeasy@linutronix.de
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
2025-03-10 11:54:45 +01:00
Peter Zijlstra
ab9fea5948 x86/alternative: Simplify callthunk patching
Now that paravirt call patching is implemented using alternatives, it
is possible to avoid having to patch the alternative sites by
including the altinstr_replacement calls in the call_sites list.

This means we're now stacking relative adjustments like so:

  callthunks_patch_builtin_calls():
    patches all function calls to target: func() -> func()-10
    since the CALL accounting lives in the CALL_PADDING.

    This explicitly includes .altinstr_replacement

  alt_replace_call():
    patches: x86_BUG() -> target()

    this patching is done in a relative manner, and will preserve
    the above adjustment, meaning that with calldepth patching it
    will do: x86_BUG()-10 -> target()-10

  apply_relocation():
    does code relocation, and adjusts all RIP-relative instructions
    to the new location, also in a relative manner.
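
For reference, the displacement arithmetic that makes this stacking work
(a sketch, not code from the patch): a 5-byte CALL encodes its target as
a displacement relative to the next instruction, so re-emitting it keeps
any accounting bias already baked into the destination:

  /* rel32 of a 5-byte CALL at addr: disp = dest - (addr + 5). A dest
   * already biased to func()-10 stays biased after relocation. */
  s32 disp = (s32)(dest - (addr + 5));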

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Link: https://lore.kernel.org/r/20250207122546.617187089@infradead.org
2025-02-14 10:32:06 +01:00
Linus Torvalds
5b7f7234ff x86/boot changes for v6.14:

Merge tag 'x86-boot-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull x86 boot updates from Ingo Molnar:

 - A large and involved preparatory series to pave the way to add
   exception handling for relocate_kernel - which will be a debugging
   facility that has already aided field debugging of an exceptionally
   hard-to-debug early boot bug. Plus assorted cleanups and fixes that
   were discovered along the way, by David Woodhouse:

      - Clean up and document register use in relocate_kernel_64.S
      - Use named labels in swap_pages in relocate_kernel_64.S
      - Only swap pages for ::preserve_context mode
      - Allocate PGD for x86_64 transition page tables separately
      - Copy control page into place in machine_kexec_prepare()
      - Invoke copy of relocate_kernel() instead of the original
      - Move relocate_kernel to kernel .data section
      - Add data section to relocate_kernel
      - Drop page_list argument from relocate_kernel()
      - Eliminate writes through kernel mapping of relocate_kernel page
      - Clean up register usage in relocate_kernel()
      - Mark relocate_kernel page as ROX instead of RWX
      - Disable global pages before writing to control page
      - Ensure preserve_context flag is set on return to kernel
      - Use correct swap page in swap_pages function
      - Fix stack and handling of re-entry point for ::preserve_context
      - Mark machine_kexec() with __nocfi
      - Cope with relocate_kernel() not being at the start of the page
      - Use typedef for relocate_kernel_fn function prototype
      - Fix location of relocate_kernel with -ffunction-sections (fix by Nathan Chancellor)

 - A series to remove the last remaining absolute symbol references from
   .head.text, and enforce this at build time, by Ard Biesheuvel:

      - Avoid WARN()s and panic()s in early boot code
      - Don't hang but terminate on failure to remap SVSM CA
      - Determine VA/PA offset before entering C code
      - Avoid intentional absolute symbol references in .head.text
      - Disable UBSAN in early boot code
      - Move ENTRY_TEXT to the start of the image
      - Move .head.text into its own output section
      - Reject absolute references in .head.text

 - The above build-time enforcement uncovered a handful of bugs in
   essentially non-working code, and a workaround for a toolchain bug,
   fixed by Ard Biesheuvel as well:

      - Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
      - Disable UBSAN on SEV code that may execute very early
      - Disable ftrace branch profiling in SEV startup code

 - And miscellaneous cleanups:

      - kexec_core: Add and update comments regarding the KEXEC_JUMP flow (Rafael J. Wysocki)
      - x86/sysfs: Constify 'struct bin_attribute' (Thomas Weißschuh)"

* tag 'x86-boot-2025-01-21' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (33 commits)
  x86/sev: Disable ftrace branch profiling in SEV startup code
  x86/kexec: Use typedef for relocate_kernel_fn function prototype
  x86/kexec: Cope with relocate_kernel() not being at the start of the page
  kexec_core: Add and update comments regarding the KEXEC_JUMP flow
  x86/kexec: Mark machine_kexec() with __nocfi
  x86/kexec: Fix location of relocate_kernel with -ffunction-sections
  x86/kexec: Fix stack and handling of re-entry point for ::preserve_context
  x86/kexec: Use correct swap page in swap_pages function
  x86/kexec: Ensure preserve_context flag is set on return to kernel
  x86/kexec: Disable global pages before writing to control page
  x86/sev: Don't hang but terminate on failure to remap SVSM CA
  x86/sev: Disable UBSAN on SEV code that may execute very early
  x86/boot/64: Fix spurious undefined reference when CONFIG_X86_5LEVEL=n, on GCC-12
  x86/sysfs: Constify 'struct bin_attribute'
  x86/kexec: Mark relocate_kernel page as ROX instead of RWX
  x86/kexec: Clean up register usage in relocate_kernel()
  x86/kexec: Eliminate writes through kernel mapping of relocate_kernel page
  x86/kexec: Drop page_list argument from relocate_kernel()
  x86/kexec: Add data section to relocate_kernel
  x86/kexec: Move relocate_kernel to kernel .data section
  ...
2025-01-24 05:54:26 -08:00
Juergen Gross
7fa0da5373 x86/xen: remove hypercall page
The hypercall page is no longer needed. It can be removed, as from the
Xen perspective it is optional.

But, from Linux's perspective, it removes naked RET instructions that
escape the speculative protections that Call Depth Tracking and/or
Untrain Ret are trying to achieve.

This is part of XSA-466 / CVE-2024-53241.

Reported-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
2024-12-17 08:23:42 +01:00
David Woodhouse
cb33ff9e06 x86/kexec: Move relocate_kernel to kernel .data section
Now that the copy is executed instead of the original, the relocate_kernel
page can live in the kernel's .data section. This will allow subsequent
commits to actually add real data to it and clean up the code somewhat as
well as making the control page ROX.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Baoquan He <bhe@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: https://lore.kernel.org/r/20241205153343.3275139-9-dwmw2@infradead.org
2024-12-06 10:41:59 +01:00
Borislav Petkov (AMD)
f796c75837 x86/alternatives: Use a temporary buffer when optimizing NOPs
Instead of optimizing NOPs in-place, use a temporary buffer like the
usual alternatives patching flow does. This obviates the need to grab
locks when patching, see

  6778977590da ("x86/alternatives: Disable interrupts and sync when optimizing NOPs in place")

While at it, add nomenclature definitions clarifying and simplifying the
naming of function-local variables in the alternatives code.
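
A rough sketch of the resulting flow (simplified from the real code):

  u8 buf[MAX_PATCH_LEN];

  memcpy(buf, instr, len);          /* snapshot the live instructions   */
  optimize_nops(buf, len);          /* rewrite NOPs in the scratch copy */
  text_poke_early(instr, buf, len); /* publish the result in one step   */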

Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20240130105941.19707-2-bp@alien8.de
2024-04-09 18:08:11 +02:00
Joan Bruguera Micó
6a53745300 x86/bpf: Fix IP for relocating call depth accounting
The commit:

  59bec00ace ("x86/percpu: Introduce %rip-relative addressing to PER_CPU_VAR()")

made PER_CPU_VAR() use %rip-relative addressing, hence the
INCREMENT_CALL_DEPTH macro and skl_call_thunk_template got %rip-relative
asm code inside of them. A follow-up commit:

  17bce3b2ae ("x86/callthunks: Handle %rip-relative relocations in call thunk template")

changed x86_call_depth_emit_accounting() to use apply_relocation(),
but mistakenly assumed that the code is being patched in-place (where
the destination of the relocation matches the address of the code),
using *pprog as the destination ip. This is not true for the call depth
accounting emitted by the BPF JIT, so the calculated address was wrong
and JIT-ed BPF programs on kernels with call depth tracking got broken,
usually causing a page fault.

Pass the destination IP when the BPF JIT emits call depth accounting.
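
The underlying invariant, as a one-line sketch (variable names assumed):
a %rip-relative operand encodes its target relative to the address the
instruction will execute at, so the relocation must use the runtime ip,
not the JIT buffer cursor:

  /* disp is wrong unless ip is where the insn will actually run. */
  s32 disp = (s32)((u8 *)target - (ip + insn_len));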

Fixes: 17bce3b2ae ("x86/callthunks: Handle %rip-relative relocations in call thunk template")
Signed-off-by: Joan Bruguera Micó <joanbrugueram@gmail.com>
Reviewed-by: Uros Bizjak <ubizjak@gmail.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/r/20240401185821.224068-3-ubizjak@gmail.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-04-01 20:37:56 -07:00
Linus Torvalds
685d982112 Core x86 changes for v6.9:
 [ Please note that there's a higher number of merge commits in
   this branch (three) than is usual in x86 topic trees. This happened
   due to the long testing lifecycle of the percpu changes, which
   involved three merge windows and generated a longer history and
   various interactions with other core x86 changes that we felt
   it better to carry in a single branch. ]
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>

Merge tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull core x86 updates from Ingo Molnar:

 - The biggest change is the rework of the percpu code, to support the
   'Named Address Spaces' GCC feature, by Uros Bizjak:

      - This allows C code to access GS and FS segment relative memory
        via variables declared with such attributes, which allows the
        compiler to better optimize those accesses than the previous
        inline assembly code (see the sketch after the list below).

      - The series also includes a number of micro-optimizations for
        various percpu access methods, plus a number of cleanups of %gs
        accesses in assembly code.

      - These changes have been exposed to linux-next testing for the
        last ~5 months, with no known regressions in this area.

 - Fix/clean up __switch_to()'s broken but accidentally working handling
   of FPU switching - which also generates better code

 - Propagate more RIP-relative addressing in assembly code, to generate
   slightly better code

 - Rework the CPU mitigations Kconfig space to be less idiosyncratic, to
   make it easier for distros to follow & maintain these options

 - Rework the x86 idle code to cure RCU violations and to clean up the
   logic

 - Clean up the vDSO Makefile logic

 - Misc cleanups and fixes
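
A minimal illustration of the GCC 'Named Address Spaces' feature
referenced above (plain userspace C, not kernel code; x86-64, GCC 6+):

  /* __seg_gs qualifies an address as %gs-relative, so a plain C load
   * compiles to "mov %gs:(%rdi), %rax" instead of inline assembly. */
  typedef unsigned long __seg_gs *gs_ulong_ptr;

  unsigned long read_gs_slot(gs_ulong_ptr p)
  {
          return *p;
  }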

* tag 'x86-core-2024-03-11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (52 commits)
  x86/idle: Select idle routine only once
  x86/idle: Let prefer_mwait_c1_over_halt() return bool
  x86/idle: Cleanup idle_setup()
  x86/idle: Clean up idle selection
  x86/idle: Sanitize X86_BUG_AMD_E400 handling
  sched/idle: Conditionally handle tick broadcast in default_idle_call()
  x86: Increase brk randomness entropy for 64-bit systems
  x86/vdso: Move vDSO to mmap region
  x86/vdso/kbuild: Group non-standard build attributes and primary object file rules together
  x86/vdso: Fix rethunk patching for vdso-image-{32,64}.o
  x86/retpoline: Ensure default return thunk isn't used at runtime
  x86/vdso: Use CONFIG_COMPAT_32 to specify vdso32
  x86/vdso: Use $(addprefix ) instead of $(foreach )
  x86/vdso: Simplify obj-y addition
  x86/vdso: Consolidate targets and clean-files
  x86/bugs: Rename CONFIG_RETHUNK              => CONFIG_MITIGATION_RETHUNK
  x86/bugs: Rename CONFIG_CPU_SRSO             => CONFIG_MITIGATION_SRSO
  x86/bugs: Rename CONFIG_CPU_IBRS_ENTRY       => CONFIG_MITIGATION_IBRS_ENTRY
  x86/bugs: Rename CONFIG_CPU_UNRET_ENTRY      => CONFIG_MITIGATION_UNRET_ENTRY
  x86/bugs: Rename CONFIG_SLS                  => CONFIG_MITIGATION_SLS
  ...
2024-03-11 19:53:15 -07:00
Thomas Gleixner
cad860b595 x86/callthunks: Use EXPORT_PER_CPU_SYMBOL_GPL() for per CPU variables
Sparse complains rightfully about the usage of EXPORT_SYMBOL_GPL() for per
CPU variables:

  callthunks.c:346:20: sparse: warning: incorrect type in initializer (different address spaces)
  callthunks.c:346:20: sparse:    expected void const [noderef] __percpu *__vpp_verify
  callthunks.c:346:20: sparse:    got unsigned long long *

Use EXPORT_PER_CPU_SYMBOL_GPL() instead.
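
Shape of the fix (the per-CPU variable name is assumed for illustration):

  DEFINE_PER_CPU(u64, __x86_call_count);
  EXPORT_PER_CPU_SYMBOL_GPL(__x86_call_count);  /* not EXPORT_SYMBOL_GPL() */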

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20240304005104.841915535@linutronix.de
2024-03-04 12:09:13 +01:00
Ingo Molnar
03c11eb3b1 Linux 6.8-rc4

Merge tag 'v6.8-rc4' into x86/percpu, to resolve conflicts and refresh the branch

Conflicts:
	arch/x86/include/asm/percpu.h
	arch/x86/include/asm/text-patching.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2024-02-14 10:45:07 +01:00
Juergen Gross
60bc276b12 x86/paravirt: Switch mixed paravirt/alternative calls to alternatives
Instead of stacking alternative and paravirt patching, use the new
ALT_FLAG_CALL flag to switch those mixed calls to pure alternative
handling.

Eliminate the need to be careful regarding the sequence of alternative
and paravirt patching.

  [ bp: Touch up commit message. ]

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231210062138.2417-5-jgross@suse.com
2023-12-10 23:33:09 +01:00
Uros Bizjak
fc50065325 x86/callthunks: Correct calculation of dest address in is_callthunk()
GCC didn't warn on the invalid use of the relocation destination
pointer, so the calculated destination value was erroneously applied to
the uninitialized pointer location.

Fixes: 17bce3b2ae ("x86/callthunks: Handle %rip-relative relocations in call thunk template")
Reported-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/lkml/20231201035457.GA321497@dev-arch.thelio-3990X/
Link: https://lore.kernel.org/r/20231201085727.3647051-1-ubizjak@gmail.com
2023-12-02 10:51:28 +01:00
Uros Bizjak
17bce3b2ae x86/callthunks: Handle %rip-relative relocations in call thunk template
Contrary to alternatives, relocations are currently not supported in
call thunk templates.  Re-use the existing infrastructure from
alternative.c to allow %rip-relative relocations when copying call
thunk template from its storage location.

The patch allows unification of ASM_INCREMENT_CALL_DEPTH, which already
uses the PER_CPU_VAR macro, with INCREMENT_CALL_DEPTH, used in the call
thunk template, which is currently limited to using an absolute address.

Reuse the existing relocation infrastructure from alternative.c,
as suggested by Peter Zijlstra.
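
The core of such a relocation, sketched (disp_off is an assumed,
precomputed offset of the disp32 within the copied template): when code
moves from src to dst, each %rip-relative disp32 must be rebiased by
(src - dst) to keep pointing at the same absolute target:

  memcpy(dst, src, len);
  s32 *disp = (s32 *)(dst + disp_off);
  *disp += (s32)(src - dst);    /* same absolute target as before */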

Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20231105213731.1878100-3-ubizjak@gmail.com
2023-11-30 20:06:17 +01:00
Hou Wenlong
5c22c4726e x86/paravirt: Use relative reference for the original instruction offset
Similar to the alternative patching, use a relative reference for the
original instruction offset rather than an absolute one, which saves 8
bytes for one PARA_SITE entry on x86_64.  As a result, an R_X86_64_PC32
relocation is generated instead of an R_X86_64_64 one, which also reduces
relocation metadata on relocatable builds. Hardcode the alignment to 4 now.
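
The relative-reference pattern in a nutshell (a sketch of the idea, not
the actual PARA_SITE layout):

  struct para_site {
          s32 instr_off;          /* .long instr - .  (R_X86_64_PC32) */
  };                              /* was: .quad instr (R_X86_64_64)   */

  static inline void *site_instr(struct para_site *s)
  {
          return (void *)&s->instr_off + s->instr_off;
  }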

  [ bp: Massage commit message. ]

Signed-off-by: Hou Wenlong <houwenlong.hwl@antgroup.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Juergen Gross <jgross@suse.com>
Link: https://lore.kernel.org/r/9e6053107fbaabc0d33e5d2865c5af2c67ec9925.1686301237.git.houwenlong.hwl@antgroup.com
2023-11-13 12:23:27 +01:00
Alexey Dobriyan
321a145137 x86/callthunks: Delete unused "struct thunk_desc"
It looks like it was never used.

Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-kernel@vger.kernel.org
Link: https://lore.kernel.org/r/843bf596-db67-4b33-a865-2bae4a4418e5@p183
2023-10-20 12:58:48 +02:00
Peter Zijlstra
aee9d30b97 x86,static_call: Fix static-call vs return-thunk
Commit

  7825451fa4 ("static_call: Add call depth tracking support")

failed to realize the problem fixed there is not specific to call depth
tracking but applies to all return-thunk uses.

Move the fix to the appropriate place and condition.

Fixes: ee88d363d1 ("x86,static_call: Use alternative RET encoding")
Reported-by: David Kaplan <David.Kaplan@amd.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@kernel.org>
2023-09-22 18:58:24 +02:00
Linus Torvalds
6f612579be objtool changes for v6.5:
- Build footprint & performance improvements:
 
     - Reduce memory usage with CONFIG_DEBUG_INFO=y
 
       In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel, DWARF
       creates almost 200 million relocations, ballooning objtool's peak heap
       usage to 53GB.  These patches reduce that to 25GB.
 
       On a distro-type kernel with kernel IBT enabled, they reduce objtool's
       peak heap usage from 4.2GB to 2.8GB.
 
       These changes also improve the runtime significantly.
 
 - Debuggability improvements:
 
     - Add the unwind_debug command-line option, for more extend unwinding
       debugging output.
     - Limit unreachable warnings to once per function
     - Add verbose option for disassembling affected functions
     - Include backtrace in verbose mode
     - Detect missing __noreturn annotations
     - Ignore exc_double_fault() __noreturn warnings
     - Remove superfluous global_noreturns entries
     - Move noreturn function list to separate file
     - Add __kunit_abort() to noreturns
 
 - Unwinder improvements:
 
     - Allow stack operations in UNWIND_HINT_UNDEFINED regions
     - drm/vmwgfx: Add unwind hints around RBP clobber
 
 - Cleanups:
 
     - Move the x86 entry thunk restore code into thunk functions
     - x86/unwind/orc: Use swap() instead of open coding it
     - Remove unnecessary/unused variables
 
 - Fixes for modern stack canary handling
 
 Signed-off-by: Ingo Molnar <mingo@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmSaxcoRHG1pbmdvQGtl
 cm5lbC5vcmcACgkQEnMQ0APhK1ht5w//f8mBoABct29pS4ib6pDwRZQDoG8fCA7M
 +KWjFD1AhX7RsJVEbM4uBUXdSWZD61xxIa8p8LO2jjzE5RyhM+EuNaisKujKqmfj
 uQTSnRhIRHMPqqVGK/gQxy1v4+3+12O32XFIJhAPYCp/dpbZJ2yKDsiHjapzZTDy
 BM+86hbIyHFmSl5uJcBFHEv6EGhoxwdrrrOxhpao1CqfAUi+uVgamHGwVqx+NtTY
 MvOmcy3/0ukHwDLON0MIMu9MSwvnXorD7+RSkYstwAM/k6ao/k78iJ31sOcynpRn
 ri0gmfygJsh2bxL4JUlY4ZeTs7PLWkj3i60deePc5u6EyV4JDJ2borUibs5oGoF6
 pN0AwbtubLHHhUI/v74B3E6K6ZGvLiEn9dsNTuXsJffD+qU2REb+WLhr4ut+E1Wi
 IKWrYh811yBLyOqFEW3XudZTiXSJlgi3eYiCxspEsKw2RIFFt2g6vYcwrIb0Hatw
 8R4/jCWk1nc6Wa3RQYsVnhkglAECSKQdDfS7p2e1hNUTjZuess4EEJjSLs8upIQ9
 D1bmuUxEzRxVwAZtXYNh0NKe7OtyOrqgsVTQuqxvWXq2CpC7Hqj8piVJWHdBWgHO
 0o2OQqjwSrzAtevpAIaYQv9zhPs1hV7CpBgzzqWGXrwJ3vM6YoSRLf0bg+5OkN8I
 O4U2xq2OVa8=
 =uNnc
 -----END PGP SIGNATURE-----

Merge tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip

Pull objtool updates from Ingo Molnar:
 "Build footprint & performance improvements:

   - Reduce memory usage with CONFIG_DEBUG_INFO=y

     In the worst case of an allyesconfig+CONFIG_DEBUG_INFO=y kernel,
     DWARF creates almost 200 million relocations, ballooning objtool's
     peak heap usage to 53GB. These patches reduce that to 25GB.

     On a distro-type kernel with kernel IBT enabled, they reduce
     objtool's peak heap usage from 4.2GB to 2.8GB.

     These changes also improve the runtime significantly.

  Debuggability improvements:

   - Add the unwind_debug command-line option, for more extended
     unwinding debugging output
   - Limit unreachable warnings to once per function
   - Add verbose option for disassembling affected functions
   - Include backtrace in verbose mode
   - Detect missing __noreturn annotations
   - Ignore exc_double_fault() __noreturn warnings
   - Remove superfluous global_noreturns entries
   - Move noreturn function list to separate file
   - Add __kunit_abort() to noreturns

  Unwinder improvements:

   - Allow stack operations in UNWIND_HINT_UNDEFINED regions
   - drm/vmwgfx: Add unwind hints around RBP clobber

  Cleanups:

   - Move the x86 entry thunk restore code into thunk functions
   - x86/unwind/orc: Use swap() instead of open coding it
   - Remove unnecessary/unused variables

  Fixes for modern stack canary handling"

* tag 'objtool-core-2023-06-27' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (42 commits)
  x86/orc: Make the is_callthunk() definition depend on CONFIG_BPF_JIT=y
  objtool: Skip reading DWARF section data
  objtool: Free insns when done
  objtool: Get rid of reloc->rel[a]
  objtool: Shrink elf hash nodes
  objtool: Shrink reloc->sym_reloc_entry
  objtool: Get rid of reloc->jump_table_start
  objtool: Get rid of reloc->addend
  objtool: Get rid of reloc->type
  objtool: Get rid of reloc->offset
  objtool: Get rid of reloc->idx
  objtool: Get rid of reloc->list
  objtool: Allocate relocs in advance for new rela sections
  objtool: Add for_each_reloc()
  objtool: Don't free memory in elf_close()
  objtool: Keep GElf_Rel[a] structs synced
  objtool: Add elf_create_section_pair()
  objtool: Add mark_sec_changed()
  objtool: Fix reloc_hash size
  objtool: Consolidate rel/rela handling
  ...
2023-06-27 15:05:41 -07:00
Ingo Molnar
301cf77e21 x86/orc: Make the is_callthunk() definition depend on CONFIG_BPF_JIT=y
Recent commit:

  020126239b Revert "x86/orc: Make it callthunk aware"

Made the only user of is_callthunk() depend on CONFIG_BPF_JIT=y, while
the definition of the helper function is unconditional.

Move is_callthunk() inside the #ifdef block.

Addresses this build failure:

   arch/x86/kernel/callthunks.c:296:13: error: ‘is_callthunk’ defined but not used [-Werror=unused-function]
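
Shape of the fix (simplified):

  #ifdef CONFIG_BPF_JIT
  static bool is_callthunk(void *addr)
  {
          /* ... body unchanged ... */
  }
  #endif /* CONFIG_BPF_JIT */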

Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: Peter Zijlstra <peterz@infradead.org>
2023-06-09 11:09:04 +02:00
Josh Poimboeuf
020126239b Revert "x86/orc: Make it callthunk aware"
Commit 396e0b8e09 ("x86/orc: Make it callthunk aware") attempted to
deal with the fact that function prefix code didn't have ORC coverage.
However, it didn't work as advertised.  Use of the "null" ORC entry just
caused affected unwinds to end early.

The root cause has now been fixed with commit 5743654f5e ("objtool:
Generate ORC data for __pfx code").

Revert most of commit 396e0b8e09 ("x86/orc: Make it callthunk aware").
The is_callthunk() function remains as it's now used by other code.

Link: https://lore.kernel.org/r/a05b916ef941da872cbece1ab3593eceabd05a79.1684245404.git.jpoimboe@kernel.org
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
2023-06-07 09:48:57 -07:00
Thomas Gleixner
cded367976 x86/smpboot: Restrict soft_restart_cpu() to SEV
Now that the CPU0 hotplug cruft is gone, the only user is AMD SEV.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.822234014@linutronix.de
2023-05-15 13:44:50 +02:00
Thomas Gleixner
666e1156b2 x86/smpboot: Rename start_cpu0() to soft_restart_cpu()
This is used in the SEV play_dead() implementation to re-online CPUs. But
that has nothing to do with CPU0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Tested-by: Michael Kelley <mikelley@microsoft.com>
Tested-by: Oleksandr Natalenko <oleksandr@natalenko.name>
Tested-by: Helge Deller <deller@gmx.de> # parisc
Tested-by: Guilherme G. Piccoli <gpiccoli@igalia.com> # Steam Deck
Link: https://lore.kernel.org/r/20230512205255.662319599@linutronix.de
2023-05-15 13:44:48 +02:00
Song Liu
ac3b432839 module: replace module_layout with module_memory
module_layout manages different types of memory (text, data, rodata, etc.)
in one allocation, which is problematic for several reasons:

1. It is hard to enable CONFIG_STRICT_MODULE_RWX.
2. It is hard to use huge pages in modules (and not break strict rwx).
3. Many archs use module_layout for arch-specific data, but it is not
   obvious how these data are used (are they RO, RX, or RW?)

Improve the scenario by replacing 2 (or 3) module_layout per module with
up to 7 module_memory per module:

        MOD_TEXT,
        MOD_DATA,
        MOD_RODATA,
        MOD_RO_AFTER_INIT,
        MOD_INIT_TEXT,
        MOD_INIT_DATA,
        MOD_INIT_RODATA,

and allocating them separately. This adds slightly more entries to
mod_tree (from up to 3 entries per module, to up to 7 entries per
module). However, this at most adds a small constant overhead to
__module_address(), which is expected to be fast.

Various archs use module_layout for different data. These data are put
into different module_memory based on their location in module_layout.
IOW, data that used to go with text is allocated with MOD_MEM_TYPE_TEXT;
data that used to go with data is allocated with MOD_MEM_TYPE_DATA, etc.

module_memory simplifies quite a bit of the module code. For example,
ARCH_WANTS_MODULES_DATA_IN_VMALLOC is a lot cleaner, as it just uses a
different allocator for the data. kernel/module/strict_rwx.c is also
much cleaner with module_memory.
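
The resulting data structures, roughly (a simplified sketch; fields
elided):

  struct module_memory {
          void *base;
          unsigned int size;
          /* ... */
  };

  struct module {
          /* ... */
          struct module_memory mem[MOD_MEM_NUM_TYPES];
          /* ... */
  };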

Signed-off-by: Song Liu <song@kernel.org>
Cc: Luis Chamberlain <mcgrof@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-03-09 12:55:15 -08:00
Arnd Bergmann
ade8c20847 x86/calldepth: Fix incorrect init section references
The addition of callthunks_translate_call_dest means that
skip_addr() and patch_dest() can no longer be discarded
as part of the __init section freeing:

WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: callthunks_translate_call_dest.cold (section: .text.unlikely) -> patch_dest (section: .init.text)
WARNING: modpost: vmlinux.o: section mismatch in reference: is_callthunk.cold (section: .text.unlikely) -> skip_addr (section: .init.text)
ERROR: modpost: Section mismatches detected.
Set CONFIG_SECTION_MISMATCH_WARN_ONLY=y to allow them.
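
The fix, in diff form (a sketch; the exact signatures are assumed):

  -static __init void skip_addr(void **addr)
  +static void skip_addr(void **addr)
  ...
  -static __init void patch_dest(void *dest, bool direct)
  +static void patch_dest(void *dest, bool direct)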

Fixes: b2e9dfe54b ("x86/bpf: Emit call depth accounting if required")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lkml.kernel.org/r/20221215164334.968863-1-arnd@kernel.org
2022-12-27 12:51:58 +01:00
Peter Zijlstra
ee3e2469b3 x86/ftrace: Make it call depth tracking aware
Since ftrace has trampolines, don't use thunks for the __fentry__ site
but instead require that every function called from there includes
accounting. This very much includes all the direct-call functions.

Additionally, ftrace uses ROP tricks in two places:

 - return_to_handler(), and
 - ftrace_regs_caller() when pt_regs->orig_ax is set by a direct-call.

return_to_handler() already uses a retpoline to replace an
indirect-jump to defeat IBT, since this is a jump-type retpoline, make
sure there is no accounting done and ALTERNATIVE the RET into a ret.

ftrace_regs_caller() does much the same and gets the same treatment.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.927545073@infradead.org
2022-10-17 16:41:19 +02:00
Thomas Gleixner
b2e9dfe54b x86/bpf: Emit call depth accounting if required
Ensure that calls in BPF jitted programs are emitting call depth accounting
when enabled to keep the call/return balanced. The return thunk jump is
already injected due to the earlier retbleed mitigations.
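
Conceptually, in the JIT's emit path (simplified; the helper names are
real, but the exact signatures may differ):

  /* Emit the depth-accounting sequence just before the CALL. */
  x86_call_depth_emit_accounting(&prog, func);
  emit_call(&prog, func, ip);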

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.615413406@infradead.org
2022-10-17 16:41:18 +02:00
Peter Zijlstra
396e0b8e09 x86/orc: Make it callthunk aware
Callthunk addresses on the stack would confuse the ORC unwinder. Handle
them correctly and tell ORC to proceed further down the stack.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.511637628@infradead.org
2022-10-17 16:41:17 +02:00
Peter Zijlstra
7825451fa4 static_call: Add call depth tracking support
When indirect calls are switched to direct calls, it has to be ensured
that the call target is not the function, but the call thunk when call
depth tracking is enabled. But static calls are available before call
thunks have been set up.

Ensure a second run through the static call patching code after call thunks
have been created. When call thunks are not enabled this has no side
effects.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.306100465@infradead.org
2022-10-17 16:41:16 +02:00
Thomas Gleixner
f5c1bb2afe x86/calldepth: Add ret/call counting for debug
Add a debugfs mechanism to validate the accounting, e.g. vs. call/ret
balance, and to gather statistics about the stuffing-to-call ratio.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.204285506@infradead.org
2022-10-17 16:41:16 +02:00
Thomas Gleixner
bbaceb189a x86/retbleed: Add SKL call thunk
Add the actual SKL call thunk for call depth accounting.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111148.101125588@infradead.org
2022-10-17 16:41:15 +02:00
Thomas Gleixner
eaf44c816e x86/modules: Add call patching
As with the builtins, create call thunks and patch the call sites to call
the thunk on Intel SKL CPUs for retbleed mitigation.

Note that module init functions are ignored for the sake of simplicity,
because loading modules is not something which is done in high-frequency
loops and the attacker does not really have a handle on when this happens
in order to launch a matching attack. The depth tracking will still work
for calls into the builtins, and because the call is not accounted it will
underflow faster and overstuff, but that's mitigated by the saturating
counter and the side effect is only temporary.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111147.575673066@infradead.org
2022-10-17 16:41:13 +02:00
Thomas Gleixner
e81dc127ef x86/callthunks: Add call patching for call depth tracking
Mitigating the Intel SKL RSB underflow issue in software requires tracking
the call depth. That is, every CALL and every RET need to be intercepted
and additional code injected.

The existing retbleed mitigations already include means of redirecting
RET to __x86_return_thunk; this can be re-purposed and RET can be
redirected to another function doing RET accounting.

CALL accounting will use the function padding introduced in prior
patches. For each CALL instruction, the destination symbol's padding
is rewritten to do the accounting and the CALL instruction is adjusted
to call into the padding.
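
Schematically, assuming a 10-byte accounting pad (illustrative only):

  /* Retarget the CALL from func() into its accounting padding,
   * which falls through into func() proper. */
  s32 *rel = (s32 *)(call_insn + 1);   /* disp32 of a 5-byte CALL */
  *rel -= 10;                          /* now calls func() - 10   */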

This ensures only affected CPUs pay the overhead of this accounting.
Unaffected CPUs will leave the padding unused and have their 'JMP
__x86_return_thunk' replaced with an actual 'RET' instruction.

Objtool has been modified to supply a .call_sites section that lists
all the 'CALL' instructions. Additionally the paravirt instruction
sites are iterated since they will have been patched from an indirect
call to direct calls (or direct instructions in which case it'll be
ignored).

Module handling and the actual thunk code for SKL will be added in
subsequent steps.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111147.470877038@infradead.org
2022-10-17 16:41:13 +02:00