Check the IO permission bitmap (if present) before emulating IOIO #VC
exceptions for user-space. These permissions are checked by hardware
already before the #VC is raised, but due to the VC-handler decoding
race it needs to be checked again in software.
Fixes: 25189d08e5 ("x86/sev-es: Add support for handling IOIO exceptions")
Reported-by: Tom Dohrmann <erbse.13@gmx.de>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Tom Dohrmann <erbse.13@gmx.de>
Cc: <stable@kernel.org>
boot_ghcb_page is not used by any other file, so make it static.
This also resolves sparse warning:
arch/x86/boot/compressed/sev.c:28:13: warning: symbol 'boot_ghcb_page' was not declared. Should it be static?
Signed-off-by: GUO Zihua <guozihua@huawei.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org
Align x86 with other EFI architectures, and increase the section
alignment to the EFI page size (4k), so that firmware is able to honour
the section permission attributes and map code read-only and data
non-executable.
There are a number of requirements that have to be taken into account:
- the sign tools get cranky when there are gaps between sections in the
file view of the image
- the virtual offset of each section must be aligned to the image's
section alignment
- the file offset *and size* of each section must be aligned to the
image's file alignment
- the image size must be aligned to the section alignment
- each section's virtual offset must be greater than or equal to the
size of the headers.
In order to meet all these requirements, while avoiding the need for
lots of padding to accommodate the .compat section, the latter is placed
at an arbitrary offset towards the end of the image, but aligned to the
minimum file alignment (512 bytes). The space before the .text section
is therefore distributed between the PE header, the .setup section and
the .compat section, leaving no gaps in the file coverage, making the
signing tools happy.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-18-ardb@google.com
Describe the code and data of the decompressor binary using separate
.text and .data PE/COFF sections, so that we will be able to map them
using restricted permissions once we increase the section and file
alignment sufficiently. This avoids the need for memory mappings that
are writable and executable at the same time, which is something that
is best avoided for security reasons.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-17-ardb@google.com
Ancient buggy EFI loaders may have required a .reloc section to be
present at some point in time, but this has not been true for a long
time so the .reloc section can just be dropped.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-16-ardb@google.com
Now that the size of the setup block is visible to the assembler, it is
possible to populate the PE/COFF header fields from the asm code
directly, instead of poking the values into the binary using the build
tool. This will make it easier to reorganize the section layout without
having to tweak the build tool in lockstep.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-15-ardb@google.com
Tweak the linker script so that the value of _edata represents the
decompressor binary's file size rounded up to the appropriate alignment.
This removes the need to calculate it in the build tool, and will make
it easier to refer to the file size from the header directly in
subsequent changes to the PE header layout.
While adding _edata to the sed regex that parses the compressed
vmlinux's symbol list, tweak the regex a bit for conciseness.
This change has no impact on the resulting bzImage binary when
configured with CONFIG_EFI_STUB=y.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-14-ardb@google.com
The setup block contains the real mode startup code that is used when
booting from a legacy BIOS, along with the boot_params/setup_data that
is used by legacy x86 bootloaders to pass the command line and initial
ramdisk parameters, among other things.
The setup block also contains the PE/COFF header of the entire combined
image, which includes the compressed kernel image, the decompressor and
the EFI stub.
This PE header describes the layout of the executable image in memory,
and currently, the fact that the setup block precedes it makes it rather
fiddly to get the right values into the right place in the final image.
Let's make things a bit easier by defining the setup_size in the linker
script so it can be referenced from the asm code directly, rather than
having to rely on the build tool to calculate it. For the time being,
add 64 bytes of fixed padding for the .reloc and .compat sections - this
will be removed in a subsequent patch after the PE/COFF header has been
reorganized.
This change has no impact on the resulting bzImage binary when
configured with CONFIG_EFI_MIXED=y.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-13-ardb@google.com
The offsets of the EFI handover entrypoints are available to the
assembler when constructing the header, so there is no need to set them
from the build tool afterwards.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-12-ardb@google.com
Instead of parsing zoffset.h and poking the kernel_info offset value
into the header from the build tool, just grab the value directly in the
asm file that describes this header.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915171623.655440-11-ardb@google.com
The decompressor has a hard limit on the number of page tables it can
allocate. This limit is defined at compile-time and will cause boot
failure if it is reached.
The kernel is very strict and calculates the limit precisely for the
worst-case scenario based on the current configuration. However, it is
easy to forget to adjust the limit when a new use-case arises. The
worst-case scenario is rarely encountered during sanity checks.
In the case of enabling 5-level paging, a use-case was overlooked. The
limit needs to be increased by one to accommodate the additional level.
This oversight went unnoticed until Aaron attempted to run the kernel
via kexec with 5-level paging and unaccepted memory enabled.
Update wost-case calculations to include 5-level paging.
To address this issue, let's allocate some extra space for page tables.
128K should be sufficient for any use-case. The logic can be simplified
by using a single value for all kernel configurations.
[ Also add a warning, should this memory run low - by Dave Hansen. ]
Fixes: 34bbb0009f ("x86/boot/compressed: Enable 5-level paging during decompression stage")
Reported-by: Aaron Lu <aaron.lu@intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel.com
The x86 boot image generation tool assign a default value to startup_64
and subsequently parses the actual value from zoffset.h but it never
actually uses the value anywhere. So remove this code.
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-25-ardb@google.com
The root device defaults to 0,0 and is no longer configurable at build
time [0], so there is no need for the build tool to ever write to this
field.
[0] 079f85e624 ("x86, build: Do not set the root_dev field in bzImage")
This change has no impact on the resulting bzImage binary.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-23-ardb@google.com
Now that the EFI stub decompresses the kernel and hands over to the
decompressed image directly, there is no longer a need to provide a
decompression buffer as part of the .BSS allocation of the PE/COFF
image. It also means the PE/COFF image can be loaded anywhere in memory,
and setting the preferred image base is unnecessary. So drop the
handling of this from the header and from the build tool.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-22-ardb@google.com
Ancient (pre-2003) x86 kernels could boot from a floppy disk straight from
the BIOS, using a small real mode boot stub at the start of the image
where the BIOS would expect the boot record (or boot block) to appear.
Due to its limitations (kernel size < 1 MiB, no support for IDE, USB or
El Torito floppy emulation), this support was dropped, and a Linux aware
bootloader is now always required to boot the kernel from a legacy BIOS.
To smoothen this transition, the boot stub was not removed entirely, but
replaced with one that just prints an error message telling the user to
install a bootloader.
As it is unlikely that anyone doing direct floppy boot with such an
ancient kernel is going to upgrade to v6.5+ and expect that this boot
method still works, printing this message is kind of pointless, and so
it should be possible to remove the logic that emits it.
Let's free up this space so it can be used to expand the PE header in a
subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Link: https://lore.kernel.org/r/20230912090051.4014114-21-ardb@google.com
The section header flags for alignment are documented in the PE/COFF
spec as being applicable to PE object files only, not to PE executables
such as the Linux bzImage, so let's drop them from the PE header.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-20-ardb@google.com
Now that the EFI stub always zero inits its BSS section upon entry,
there is no longer a need to place the BSS symbols carried by the stub
into the .data section.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20230912090051.4014114-18-ardb@google.com
Now 'struct tdx_hypercall_args' is basically 'struct tdx_module_args'
minus RCX. Although from __tdx_hypercall()'s perspective RCX isn't
used as shared register thus not part of input/output registers, it's
not worth to have a separate structure just due to one register.
Remove the 'struct tdx_hypercall_args' and use 'struct tdx_module_args'
instead in __tdx_hypercall() related code. This also saves the memory
copy between the two structures within __tdx_hypercall().
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/798dad5ce24e9d745cf0e16825b75ccc433ad065.1692096753.git.kai.huang%40intel.com
Now the 'struct tdx_hypercall_args' and 'struct tdx_module_args' are
almost the same, and the TDX_HYPERCALL and TDX_MODULE_CALL asm macro
share similar code pattern too. The __tdx_hypercall() and __tdcall()
should be unified to use the same assembly code.
As a preparation to unify them, simplify the TDX_HYPERCALL to make it
more like the TDX_MODULE_CALL.
The TDX_HYPERCALL takes the pointer of 'struct tdx_hypercall_args' as
function call argument, and does below extra things comparing to the
TDX_MODULE_CALL:
1) It sets RAX to 0 (TDG.VP.VMCALL leaf) internally;
2) It sets RCX to the (fixed) bitmap of shared registers internally;
3) It calls __tdx_hypercall_failed() internally (and panics) when the
TDCALL instruction itself fails;
4) After TDCALL, it moves R10 to RAX to return the return code of the
VMCALL leaf, regardless the '\ret' asm macro argument;
Firstly, change the TDX_HYPERCALL to take the same function call
arguments as the TDX_MODULE_CALL does: TDCALL leaf ID, and the pointer
to 'struct tdx_module_args'. Then 1) and 2) can be moved to the
caller:
- TDG.VP.VMCALL leaf ID can be passed via the function call argument;
- 'struct tdx_module_args' is 'struct tdx_hypercall_args' + RCX, thus
the bitmap of shared registers can be passed via RCX in the
structure.
Secondly, to move 3) and 4) out of assembly, make the TDX_HYPERCALL
always save output registers to the structure. The caller then can:
- Call __tdx_hypercall_failed() when TDX_HYPERCALL returns error;
- Return R10 in the structure as the return code of the VMCALL leaf;
With above changes, change the asm function from __tdx_hypercall() to
__tdcall_hypercall(), and reimplement __tdx_hypercall() as the C wrapper
of it. This avoids having to add another wrapper of __tdx_hypercall()
(_tdx_hypercall() is already taken).
The __tdcall_hypercall() will be replaced with a __tdcall() variant
using TDX_MODULE_CALL in a later commit as the final goal is to have one
assembly to handle both TDCALL and TDVMCALL.
Currently, the __tdx_hypercall() asm is in '.noinstr.text'. To keep
this unchanged, annotate __tdx_hypercall(), which is a C function now,
as 'noinstr'.
Remove the __tdx_hypercall_ret() as __tdx_hypercall() already does so.
Implement __tdx_hypercall() in tdx-shared.c so it can be shared with the
compressed code.
Opportunistically fix a checkpatch error complaining using space around
parenthesis '(' and ')' while moving the bitmap of shared registers to
<asm/shared/tdx.h>.
[ dhansen: quash new calls of __tdx_hypercall_ret() that showed up ]
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/0cbf25e7aee3256288045023a31f65f0cef90af4.1692096753.git.kai.huang%40intel.com
The following commit deserves special mention:
22dc02f81c Revert "sched/fair: Move unused stub functions to header"
This is in x86/cleanups, because the revert is a re-application of a
number of cleanups that got removed inadvertedly.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmTtDkoRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1jCMw//UvQGM8yxsTa57r0/ZpJHS2++P5pJxOsz
45kBb3aBiDV6idArce4EHpthp3MvF3Pycibp9w0qg//NOtIHTKeagXv52abxsu1W
hmS6gXJZDXZvjO1BFaUlmv97iYtzGfKnQppj32C4tMr9SaP49h3KvOHH1Z8CR3mP
1nZaJJwYIi2qBh7msnmLGG+F0drb85O/dfHdoLX6iVJw9UP4n5nu9u8u1E0iC7J7
2GC6AwP60A0EBRTK9EHQQEYwy9uvdS/TG5f2Qk1VP87KA9TTocs8MyapMG4DQu79
hZKVEGuVQAlV3rYe9cJBNpDx1mTu3rmuMH0G71KEe3T6UcG5QRUiAPm8UfA9prPD
uWjY4zm5o0W3tUio4V1MqqiLFIaBU63WmTY9RyM0QH8Ms8r8GugWKmnrTIuHfEC3
9D+Uhyb5d8ID6qFGLTOvPm0g+v64lnH71qq83PcVJgsmZvUb2XvFA3d/A0h9JzLT
2In/yfU10UsLUFTiNRyAgcLccjaGhliDB2oke9Kp0OyOTSQRcWmiq8kByVxCPImP
auOWWcNXjcuOgjlnziEkMTDuRY12MgUB2If4zhELvdEFibIaaNW5sNCbY2msWaN1
CUD7fcj0L3HZvzujUm72l5hxL2brJMuPwVNJfuOe4T8wzy569d6VJULrd1URBM1B
vfaPs1Dz46Q=
=kiAA
-----END PGP SIGNATURE-----
Merge tag 'x86-cleanups-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 cleanups from Ingo Molnar:
"The following commit deserves special mention:
22dc02f81c Revert "sched/fair: Move unused stub functions to header"
This is in x86/cleanups, because the revert is a re-application of a
number of cleanups that got removed inadvertedly"
[ This also effectively undoes the amd_check_microcode() microcode
declaration change I had done in my microcode loader merge in commit
42a7f6e3ff ("Merge tag 'x86_microcode_for_v6.6_rc1' [...]").
I picked the declaration change by Arnd from this branch instead,
which put it in <asm/processor.h> instead of <asm/microcode.h> like I
had done in my merge resolution - Linus ]
* tag 'x86-cleanups-2023-08-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/platform/uv: Refactor code using deprecated strncpy() interface to use strscpy()
x86/hpet: Refactor code using deprecated strncpy() interface to use strscpy()
x86/platform/uv: Refactor code using deprecated strcpy()/strncpy() interfaces to use strscpy()
x86/qspinlock-paravirt: Fix missing-prototype warning
x86/paravirt: Silence unused native_pv_lock_init() function warning
x86/alternative: Add a __alt_reloc_selftest() prototype
x86/purgatory: Include header for warn() declaration
x86/asm: Avoid unneeded __div64_32 function definition
Revert "sched/fair: Move unused stub functions to header"
x86/apic: Hide unused safe_smp_processor_id() on 32-bit UP
x86/cpu: Fix amd_check_microcode() declaration
range whose SEV encryption status needs to change, is not page aligned
so that callers which round up the number of pages to be decrypted,
would mark a trailing page as decrypted and thus cause corruption
during live migration.
- Return an error from the #VC handler on AMD SEV-* guests when the debug
registers swapping is enabled as a DR7 access should not happen then
- that register is guest/host switched.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmTsje4ACgkQEsHwGGHe
VUrKWxAAqiTpQSjJCB32ReioSsLv3kl7vtLO3xtE42VpF0F7pAAPzRsh+bgDjGSM
uqcEgbX1YtPlb8wK6yh5dyNLLvtzxhaAQkUfbfuEN2oqbvIEcJmhWAm/xw1yCsh2
GDphFPtvqgT4KUCkEHj8tC9eQzG+L0bwymPzqXooVDnm4rL0ulEl6ONffhHfJFVg
bmL8UjmJNodFcO6YBfosQIDDfc4ayuwm9f/rGltNFl+jwCi62kMJaVdU1112agsV
LE73DRoRpfHKLslj9o9ubRcvaHKS24y2Amflnj1tas0h8I2uXBRwIgxjQXl5vtXV
pu5/5VHM9X13x8XKpKVkEohXkBzFRigs8yfHq+JlpyWXXB/ymW8Acbqqnvll12r4
JSy+XfBNa6V5Y/NDS/1faJiX6XSi5ZyZHZG70sf52XVoBYhzoms5kxqTJnHHisnY
X50677/tQF3V9WsmKD0aj0Um2ztiq0/TNMI7FT3lzYRDNJb1ln3ZK9f04i8L5jA4
bsrSV5oCVpLkW4eQaAJwxttTB+dRb5MwwkeS7D/eTuJ1pgUmJMIbZp2YbJH7NP2F
6FShQdwHi8KYN7mxUM+WwOk7goaBm5L61w5UtRlt6aDE7LdEbMAeSSdmD3HlEZHR
ntBqcEx4SkAT+Ru0izVXjsoWmtkn8+DY44oUC2X6eZxUSAT4Cm4=
=td9F
-----END PGP SIGNATURE-----
Merge tag 'x86_sev_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 SEV updates from Borislav Petkov:
- Handle the case where the beginning virtual address of the address
range whose SEV encryption status needs to change, is not page
aligned so that callers which round up the number of pages to be
decrypted, would mark a trailing page as decrypted and thus cause
corruption during live migration.
- Return an error from the #VC handler on AMD SEV-* guests when the
debug registers swapping is enabled as a DR7 access should not happen
then - that register is guest/host switched.
* tag 'x86_sev_for_v6.6_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev: Make enc_dec_hypercall() accept a size instead of npages
x86/sev: Do not handle #VC for DR7 read/write
With MSR_AMD64_SEV_DEBUG_SWAP enabled, the guest is not expected to
receive a #VC for reads or writes of DR7.
Update the SNP_FEATURES_PRESENT mask with MSR_AMD64_SNP_DEBUG_SWAP so
an SNP guest doesn't gracefully terminate during SNP feature negotiation
if MSR_AMD64_SEV_DEBUG_SWAP is enabled.
Since a guest is not expected to receive a #VC on DR7 accesses when
MSR_AMD64_SEV_DEBUG_SWAP is enabled, return an error from the #VC
handler in this situation.
Signed-off-by: Alexey Kardashevskiy <aik@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Carlos Bilbao <carlos.bilbao@amd.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Reviewed-by: Pankaj Gupta <pankaj.gupta@amd.com>
Link: https://lore.kernel.org/r/20230816022122.981998-1-aik@amd.com
The bare metal decompressor code was never really intended to run in a
hosted environment such as the EFI boot services, and does a few things
that are becoming problematic in the context of EFI boot now that the
logo requirements are getting tighter: EFI executables will no longer be
allowed to consist of a single executable section that is mapped with
read, write and execute permissions if they are intended for use in a
context where Secure Boot is enabled (and where Microsoft's set of
certificates is used, i.e., every x86 PC built to run Windows).
To avoid stepping on reserved memory before having inspected the E820
tables, and to ensure the correct placement when running a kernel build
that is non-relocatable, the bare metal decompressor moves its own
executable image to the end of the allocation that was reserved for it,
in order to perform the decompression in place. This means the region in
question requires both write and execute permissions, which either need
to be given upfront (which EFI will no longer permit), or need to be
applied on demand using the existing page fault handling framework.
However, the physical placement of the kernel is usually randomized
anyway, and even if it isn't, a dedicated decompression output buffer
can be allocated anywhere in memory using EFI APIs when still running in
the boot services, given that EFI support already implies a relocatable
kernel. This means that decompression in place is never necessary, nor
is moving the compressed image from one end to the other.
Since EFI already maps all of memory 1:1, it is also unnecessary to
create new page tables or handle page faults when decompressing the
kernel. That means there is also no need to replace the special
exception handlers for SEV. Generally, there is little need to do
any of the things that the decompressor does beyond
- initialize SEV encryption, if needed,
- perform the 4/5 level paging switch, if needed,
- decompress the kernel
- relocate the kernel
So do all of this from the EFI stub code, and avoid the bare metal
decompressor altogether.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-24-ardb@kernel.org
Before refactoring the EFI stub boot flow to avoid the legacy bare metal
decompressor, duplicate the SNP feature check in the EFI stub before
handing over to the kernel proper.
The SNP feature check can be performed while running under the EFI boot
services, which means it can force the boot to fail gracefully and
return an error to the bootloader if the loaded kernel does not
implement support for all the features that the hypervisor enabled.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-23-ardb@kernel.org
Factor out the decompressor sequence that invokes the decompressor,
parses the ELF and applies the relocations so that it can be called
directly from the EFI stub.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-21-ardb@kernel.org
It is no longer necessary to be cautious when referring to global
variables in the position independent decompressor code, now that it is
built using PIE codegen and makes an assertion in the linker script that
no GOT entries exist (which would require adjustment for the actual
runtime load address of the decompressor binary).
This means global variables can be referenced directly from C code,
instead of having to pass their runtime addresses into C routines from
asm code, which needs to happen at each call site. Do so for the code
that will be called directly from the EFI stub after a subsequent patch,
and avoid the need to duplicate this logic a third time.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-20-ardb@kernel.org
Now that the trampoline setup code and the actual invocation of it are
all done from the C routine, the trampoline cleanup can be merged into
it as well, instead of returning to asm just to call another C function.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-16-ardb@kernel.org
The only remaining use of the trampoline address by the trampoline
itself is deriving the page table address from it, and this involves
adding an offset of 0x0. So simplify this, and pass the new CR3 value
directly.
This makes the fact that the page table happens to be at the start of
the trampoline allocation an implementation detail of the caller.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-15-ardb@kernel.org
Since the current and desired number of paging levels are known when the
trampoline is being prepared, avoid calling the trampoline at all if it
is clear that calling it is not going to result in a change to the
number of paging levels.
Given that the CPU is already running in long mode, the PAE and LA57
settings are necessarily consistent with the currently active page
tables, and other fields in CR4 will be initialized by the startup code
in the kernel proper. So limit the manipulation of CR4 to toggling the
LA57 bit, which is the only thing that really needs doing at this point
in the boot. This also means that there is no need to pass the value of
l5_required to toggle_la57(), as it will not be called unless CR4.LA57
needs to toggle.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-14-ardb@kernel.org
Instead of returning to the asm calling code to invoke the trampoline,
call it straight from the C code that sets it up. That way, the struct
return type is no longer needed for returning two values, and the call
can be made conditional more cleanly in a subsequent patch.
This means that all callee save 64-bit registers need to be preserved
and restored, as their contents may not survive the legacy mode switch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-13-ardb@kernel.org
The 32-bit trampoline no longer uses the stack for anything except
performing a far return back to long mode, and preserving the caller's
stack pointer value. Currently, the trampoline stack is placed in the
same page that carries the trampoline code, which means this page must
be mapped writable and executable, and the stack is therefore executable
as well.
Replace the far return with a far jump, so that the return address can
be pre-calculated and patched into the code before it is called. This
removes the need for a 32-bit addressable stack entirely, and in a later
patch, this will be taken advantage of by removing writable permissions
from (and adding executable permissions to) the trampoline code page
when booting via the EFI stub.
Note that the value of RSP still needs to be preserved explicitly across
the switch into 32-bit mode, as the register may get truncated to 32
bits.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-12-ardb@kernel.org
Update the trampoline code so its arguments are passed via RDI and RSI,
which matches the ordinary SysV calling convention for x86_64. This will
allow this code to be called directly from C.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-11-ardb@kernel.org
Move the long return to switch to 32-bit mode into the trampoline code
so it can be called as an ordinary function. This will allow it to be
called directly from C code in a subsequent patch.
While at it, reorganize the code somewhat to keep the prologue and
epilogue of the function together, making the code a bit easier to
follow. Also, given that the trampoline is now entered in 64-bit mode, a
simple RIP-relative reference can be used to take the address of the
exit point.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/r/20230807162720.545787-10-ardb@kernel.org
There is no need to defer the assignment of the paging related global
variables 'pgdir_shift' and 'ptrs_per_p4d' until after the trampoline is
cleaned up, so assign them as soon as it is clear that 5-level paging
will be enabled.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-9-ardb@kernel.org
Instead of pushing and popping %RSI several times to preserve the struct
boot_params pointer across the execution of the startup code, move it
into a callee save register before the first call into C, and copy it
back when needed.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-8-ardb@kernel.org
The so-called EFI handover protocol is value-add from the distros that
permits a loader to simply copy a PE kernel image into memory and call
an alternative entrypoint that is described by an embedded boot_params
structure.
Most implementations of this protocol do not bother to check the PE
header for minimum alignment, section placement, etc, and therefore also
don't clear the image's BSS, or even allocate enough memory for it.
Allocating more memory on the fly is rather difficult, but at least
clear the BSS region explicitly when entering in this manner, so that
the EFI stub code does not get confused by global variables that were
not zero-initialized correctly.
When booting in mixed mode, this BSS clearing must occur before any
global state is created, so clear it in the 32-bit asm entry point.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-7-ardb@kernel.org
The native 32-bit or 64-bit EFI handover protocol entrypoint offset
relative to the respective startup_32/64 address is described in
boot_params as handover_offset, so that the special Linux/x86 aware EFI
loader can find it there.
When mixed mode is enabled, this single field has to describe this
offset for both the 32-bit and 64-bit entrypoints, so their respective
relative offsets have to be identical. Given that startup_32 and
startup_64 are 0x200 bytes apart, and the EFI handover entrypoint
resides at a fixed offset, the 32-bit and 64-bit versions of those
entrypoints must be exactly 0x200 bytes apart as well.
Currently, hard-coded fixed offsets are used to ensure this, but it is
sufficient to emit the 64-bit entrypoint 0x200 bytes after the 32-bit
one, wherever it happens to reside. This allows this code (which is now
EFI mixed mode specific) to be moved into efi_mixed.S and out of the
startup code in head_64.S.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-6-ardb@kernel.org
Now that the EFI entry code in assembler is only used by the optional
and deprecated EFI handover protocol, and given that the EFI stub C code
no longer returns to it, most of it can simply be dropped.
While at it, clarify the symbol naming, by merging efi_main() and
efi_stub_entry(), making the latter the shared entry point for all
different boot modes that enter via the EFI stub.
The efi32_stub_entry() and efi64_stub_entry() names are referenced
explicitly by the tooling that populates the setup header, so these must
be retained, but can be emitted as aliases of efi_stub_entry() where
appropriate.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-5-ardb@kernel.org
The 4-to-5 level mode switch trampoline disables long mode and paging in
order to be able to flick the LA57 bit. According to section 3.4.1.1 of
the x86 architecture manual [0], 64-bit GPRs might not retain the upper
32 bits of their contents across such a mode switch.
Given that RBP, RBX and RSI are live at this point, preserve them on the
stack, along with the return address that might be above 4G as well.
[0] Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture
"Because the upper 32 bits of 64-bit general-purpose registers are
undefined in 32-bit modes, the upper 32 bits of any general-purpose
register are not preserved when switching from 64-bit mode to a 32-bit
mode (to protected mode or compatibility mode). Software must not
depend on these bits to maintain a value after a 64-bit to 32-bit
mode switch."
Fixes: 194a9749c7 ("x86/boot/compressed/64: Handle 5-level paging boot if kernel is above 4G")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230807162720.545787-2-ardb@kernel.org
Tao Liu reported a boot hang on an Intel Atom machine due to an unmapped
EFI config table. The reason being that the CC blob which contains the
CPUID page for AMD SNP guests is parsed for before even checking
whether the machine runs on AMD hardware.
Usually that's not a problem on !AMD hw - it simply won't find the CC
blob's GUID and return. However, if any parts of the config table
pointers array is not mapped, the kernel will #PF very early in the
decompressor stage without any opportunity to recover.
Therefore, do a superficial CPUID check before poking for the CC blob.
This will fix the current issue on real hardware. It would also work as
a guest on a non-lying hypervisor.
For the lying hypervisor, the check is done again, *after* parsing the
CC blob as the real CPUID page will be present then.
Clear the #VC handler in case SEV-{ES,SNP} hasn't been detected, as
a precaution.
Fixes: c01fce9cef ("x86/compressed: Add SEV-SNP feature detection/setup")
Reported-by: Tao Liu <ltao@redhat.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Tested-by: Tao Liu <ltao@redhat.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230601072043.24439-1-ltao@redhat.com
The purgatory code uses parts of the decompressor and provides its own
warn() function, but has to include the corresponding header file to
avoid a -Wmissing-prototypes warning.
It turns out that this function prototype actually differs from the
declaration, so change it to get a constant pointer in the declaration
and the other definition as well.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230803082619.1369127-5-arnd@kernel.org
a fatal shutdown during TDX private<=>shared conversion
- Annotate sites where VM "exit reasons" are reused as hypercall
numbers.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmSZ50cACgkQaDWVMHDJ
krAPEg//bAR0SzrjIir2eiQ7p3ktr4L7iae0odzFkW/XHam5ZJP+v9cCMLzY6zNO
44x9Z85jZ9w34GcZ4D7D0OmTHbcDpcxckPXSFco/dK4IyYeLzUImYXEYo41YJEx9
O4sSQMBqIyjMXej/oKBhgKHSWaV60XimvQvTvhpjXGD/45bt9sx4ZNVTi8+xVMbw
jpktMsQcsjHctcIY2D2eUR61Ma/Vg9t6Qih51YMtbq6Nqcyhw2IKvDwIx3kQuW34
qSW7wsyn+RfHQDpwjPgDG/6OE815Pbtzlxz+y6tB9pN88IWkA1H5Jh2CQRlMBud2
2nVQRpqPgr9uOIeNnNI7FFd1LgTIc/v7lDPfpUH9KelOs7cGWvaRymkuhPSvWxRI
tmjlMdFq8XcjrOPieA9WpxYKXinqj4wNXtnYGyaM+Ur/P3qWaj18PMCYMbeN6pJC
eNYEJVk2Mt8GmiPL55aYG5+Z1F8sciLKbz8TFq5ya2z0EnSbyVvR+DReqd7zRzh6
Bmbmx9isAzN6wWNszNt7f8XSgRPV2Ri1tvb1vixk3JLxyx2iVCUL6KJ0cZOUNy0x
nQqy7/zMtBsFGZ/Ca8f2kpVaGgxkUFy7n1rI4psXTGBOVlnJyMz3WSi9N8F/uJg0
Ca5W4493+txdyHSAmWBQAQuZp3RJOlhTkXe5dfjukv6Rnnw1MZU=
=Hr3D
-----END PGP SIGNATURE-----
Merge tag 'x86_tdx_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 tdx updates from Dave Hansen:
- Fix a race window where load_unaligned_zeropad() could cause a fatal
shutdown during TDX private<=>shared conversion
The race has never been observed in practice but might allow
load_unaligned_zeropad() to catch a TDX page in the middle of its
conversion process which would lead to a fatal and unrecoverable
guest shutdown.
- Annotate sites where VM "exit reasons" are reused as hypercall
numbers.
* tag 'x86_tdx_for_6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/mm: Fix enc_status_change_finish_noop()
x86/tdx: Fix race between set_memory_encrypted() and load_unaligned_zeropad()
x86/mm: Allow guest.enc_status_change_prepare() to fail
x86/tdx: Wrap exit reason with hcall_func()
and assert __x86_return_thunk's alignment so that future changes to
the symbol macros do not accidentally break them.
- Remove CONFIG_X86_FEATURE_NAMES Kconfig option as its existence is
pointless
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZ1wgACgkQEsHwGGHe
VUrXlRAAhIonFM1suIHo6w085jY5YA1XnsziJr/bT3e16FdHrF1i3RBEX4ml0m3O
ADwa9dMsC9UJIa+/TKRNFfQvfRcLE/rsUKlS1Rluf/IRIxuSt/Oa4bFQHGXFRwnV
eSlnWTNiaWrRs/vJEYAnMOe98oRyElHWa9kZ7K5FC+Ksfn/WO1U1RQ2NWg2A2wkN
8MHJiS41w2piOrLU/nfUoI7+esHgHNlib222LoptDGHuaY8V2kBugFooxAEnTwS3
PCzWUqCTgahs393vbx6JimoIqgJDa7bVdUMB0kOUHxtpbBiNdYYVy6e7UKnV1yjB
qP3v9jQW4+xIyRmlFiErJXEZx7DjAIP5nulGRrUMzRfWEGF8mdRZ+ugGqFMHCeC8
vXI+Ixp2vvsfhG3N/algsJUdkjlpt3hBpElRZCfR08M253KAbAmUNMOr4sx4RPi5
ymC+pLIHd1K0G9jiZaFnOMaY71gAzWizwxwjFKLQMo44q+lpNJvsVO00cr+9RBYj
LQL2APkONVEzHPMYR/LrXCslYaW//DrfLdRQjNbzUTonxFadkTO2Eu8J90B/5SFZ
CqC1NYKMQPVFeg4XuGWCgZEH+jokCGhl8vvmXClAMcOEOZt0/s4H89EKFkmziyon
L1ZrA/U72gWV8EwD7GLtuFJmnV4Ayl/hlek2j0qNKaj6UUgTFg8=
=LcUq
-----END PGP SIGNATURE-----
Merge tag 'x86_cpu_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cpu updates from Borislav Petkov:
- Compute the purposeful misalignment of zen_untrain_ret automatically
and assert __x86_return_thunk's alignment so that future changes to
the symbol macros do not accidentally break them.
- Remove CONFIG_X86_FEATURE_NAMES Kconfig option as its existence is
pointless
* tag 'x86_cpu_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/retbleed: Add __x86_return_thunk alignment checks
x86/cpu: Remove X86_FEATURE_NAMES
x86/Kconfig: Make X86_FEATURE_NAMES non-configurable in prompt
The gist of it all is that Intel TDX and AMD SEV-SNP confidential
computing guests define the notion of accepting memory before using it
and thus preventing a whole set of attacks against such guests like
memory replay and the like.
There are a couple of strategies of how memory should be accepted
- the current implementation does an on-demand way of accepting.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmSZ0f4ACgkQEsHwGGHe
VUpasw//RKoNW9HSU1csY+XnG9uuaT6QKgji+gIEZWWIGPO9iibvbBj6P5WxJE8T
fe7yb6CGa6d6thoU0v+mQGVVvCd7OjCFwPD5wAo4mXToD7Ig+4mI6jMkaKifqa2f
N1Uuy8u/zQnGyWrP5Y//WH5bJYfsmds4UGwXI2nLvKlhE7MG90/ePjt7iqnnwZsy
waLp6a0Q1VeOvnfRszFLHZw/SoER5RSJ4qeVqttkFNmPPEKMK1Kirrl2poR56OQJ
nMr6LqVtD7erlSJ36VRXOKzLI443A4iIEIg/wBjIOU6L5ZEWJGNqtCDnIqFJ6+TM
XatsejfRYkkMZH0qXtX9+M0u+HJHbZPCH5rEcA21P3Nbd7od/ANq91qCGoMjtUZ4
7pZohMG8M6IDvkLiOb8fQVkR5k/9Jbk8UvdN/8jdPx1ERxYMFO3BDvJpV2gzrW4B
KYtFTPR7j2nY3eKfDpe3flanqYzKUBsKoTlLnlH7UHaiMZ2idwG8AQjlrhC/erCq
/Lq1LXt4Mq46FyHABc+PSHytu0WWj1nBUftRt+lviY/Uv7TlkBldOTT7wm7itsfF
HUCTfLWl0CJXKPq8rbbZhAG/exN6Ay6MO3E3OcNq8A72E5y4cXenuG3ic/0tUuOu
FfjpiMk35qE2Qb4hnj1YtF3XINtd1MpKcuwzGSzEdv9s3J7hrS0=
=FS95
-----END PGP SIGNATURE-----
Merge tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 confidential computing update from Borislav Petkov:
- Add support for unaccepted memory as specified in the UEFI spec v2.9.
The gist of it all is that Intel TDX and AMD SEV-SNP confidential
computing guests define the notion of accepting memory before using
it and thus preventing a whole set of attacks against such guests
like memory replay and the like.
There are a couple of strategies of how memory should be accepted -
the current implementation does an on-demand way of accepting.
* tag 'x86_cc_for_v6.5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
virt: sevguest: Add CONFIG_CRYPTO dependency
x86/efi: Safely enable unaccepted memory in UEFI
x86/sev: Add SNP-specific unaccepted memory support
x86/sev: Use large PSC requests if applicable
x86/sev: Allow for use of the early boot GHCB for PSC requests
x86/sev: Put PSC struct on the stack in prep for unaccepted memory support
x86/sev: Fix calculation of end address based on number of pages
x86/tdx: Add unaccepted memory support
x86/tdx: Refactor try_accept_one()
x86/tdx: Make _tdx_hypercall() and __tdx_module_call() available in boot stub
efi/unaccepted: Avoid load_unaligned_zeropad() stepping into unaccepted memory
efi: Add unaccepted memory support
x86/boot/compressed: Handle unaccepted memory
efi/libstub: Implement support for unaccepted memory
efi/x86: Get full memory map in allocate_e820()
mm: Add support for unaccepted memory
The Linux build process on x86 roughly consists of compiling all input
files, statically linking them into a vmlinux ELF file, and then taking
and turning this file into an actual bzImage bootable file.
vmlinux has in this process two main purposes:
1) It is an intermediate build target on the way to produce the final
bootable image.
2) It is a file that is expected to be used by debuggers and standard
ELF tooling to work with the built kernel.
For the second purpose, a vmlinux file is typically collected by various
package build recipes, such as distribution spec files, including the
kernel's own tar-pkg target.
When building a kernel supporting KASLR with CONFIG_X86_NEED_RELOCS,
vmlinux contains also relocation information produced by using the
--emit-relocs linker option. This is utilized by subsequent build steps
to create vmlinux.relocs and produce a relocatable image. However, the
information is not needed by debuggers and other standard ELF tooling.
The issue is then that the collected vmlinux file and hence distribution
packages end up unnecessarily large because of this extra data. The
following is a size comparison of vmlinux v6.0 with and without the
relocation information:
| Configuration | With relocs | Stripped relocs |
| x86_64_defconfig | 70 MB | 43 MB |
| +CONFIG_DEBUG_INFO | 818 MB | 367 MB |
Optimize a resulting vmlinux by adding a postlink step that splits the
relocation information into vmlinux.relocs and then strips it from the
vmlinux binary.
Signed-off-by: Petr Pavlu <petr.pavlu@suse.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Link: https://lore.kernel.org/r/20220927084632.14531-1-petr.pavlu@suse.com
Add SNP-specific hooks to the unaccepted memory support in the boot
path (__accept_memory()) and the core kernel (accept_memory()) in order
to support booting SNP guests when unaccepted memory is present. Without
this support, SNP guests will fail to boot and/or panic() when unaccepted
memory is present in the EFI memory map.
The process of accepting memory under SNP involves invoking the hypervisor
to perform a page state change for the page to private memory and then
issuing a PVALIDATE instruction to accept the page.
Since the boot path and the core kernel paths perform similar operations,
move the pvalidate_pages() and vmgexit_psc() functions into sev-shared.c
to avoid code duplication.
Create the new header file arch/x86/boot/compressed/sev.h because adding
the function declaration to any of the existing SEV related header files
pulls in too many other header files, causing the build to fail.
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/a52fa69f460fd1876d70074b20ad68210dfc31dd.1686063086.git.thomas.lendacky@amd.com
Hookup TDX-specific code to accept memory.
Accepting the memory is done with ACCEPT_PAGE module call on every page
in the range. MAP_GPA hypercall is not required as the unaccepted memory
is considered private already.
Extract the part of tdx_enc_status_changed() that does memory acceptance
in a new helper. Move the helper tdx-shared.c. It is going to be used by
both main kernel and decompressor.
[ bp: Fix the INTEL_TDX_GUEST=y, KVM_GUEST=n build. ]
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/20230606142637.5171-10-kirill.shutemov@linux.intel.com
The firmware will pre-accept the memory used to run the stub. But, the
stub is responsible for accepting the memory into which it decompresses
the main kernel. Accept memory just before decompression starts.
The stub is also responsible for choosing a physical address in which to
place the decompressed kernel image. The KASLR mechanism will randomize
this physical address. Since the accepted memory region is relatively
small, KASLR would be quite ineffective if it only used the pre-accepted
area (EFI_CONVENTIONAL_MEMORY). Ensure that KASLR randomizes among the
entire physical address space by also including EFI_UNACCEPTED_MEMORY.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lore.kernel.org/r/20230606142637.5171-5-kirill.shutemov@linux.intel.com
UEFI Specification version 2.9 introduces the concept of memory
acceptance: Some Virtual Machine platforms, such as Intel TDX or AMD
SEV-SNP, requiring memory to be accepted before it can be used by the
guest. Accepting happens via a protocol specific for the Virtual
Machine platform.
Accepting memory is costly and it makes VMM allocate memory for the
accepted guest physical address range. It's better to postpone memory
acceptance until memory is needed. It lowers boot time and reduces
memory overhead.
The kernel needs to know what memory has been accepted. Firmware
communicates this information via memory map: a new memory type --
EFI_UNACCEPTED_MEMORY -- indicates such memory.
Range-based tracking works fine for firmware, but it gets bulky for
the kernel: e820 (or whatever the arch uses) has to be modified on every
page acceptance. It leads to table fragmentation and there's a limited
number of entries in the e820 table.
Another option is to mark such memory as usable in e820 and track if the
range has been accepted in a bitmap. One bit in the bitmap represents a
naturally aligned power-2-sized region of address space -- unit.
For x86, unit size is 2MiB: 4k of the bitmap is enough to track 64GiB or
physical address space.
In the worst-case scenario -- a huge hole in the middle of the
address space -- It needs 256MiB to handle 4PiB of the address
space.
Any unaccepted memory that is not aligned to unit_size gets accepted
upfront.
The bitmap is allocated and constructed in the EFI stub and passed down
to the kernel via EFI configuration table. allocate_e820() allocates the
bitmap if unaccepted memory is present, according to the size of
unaccepted region.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lore.kernel.org/r/20230606142637.5171-4-kirill.shutemov@linux.intel.com
TDX reuses VMEXIT "reasons" in its guest->host hypercall ABI. This is
confusing because there might not be a VMEXIT involved at *all*.
These instances are supposed to document situation and reduce confusion
by wrapping VMEXIT reasons with hcall_func().
The decompression code does not follow this convention.
Unify the TDX decompression code with the other TDX use of VMEXIT reasons.
No functional changes.
Signed-off-by: Nikolay Borisov <nik.borisov@suse.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/all/20230505120332.1429957-1-nik.borisov%40suse.com
While discussing to change the visibility of X86_FEATURE_NAMES (see Link)
in order to remove CONFIG_EMBEDDED, Boris suggested to simply make the
X86_FEATURE_NAMES functionality unconditional.
As the need for really tiny kernel images has gone away and kernel images
with !X86_FEATURE_NAMES are hardly tested, remove this config and the whole
ifdeffery in the source code.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/all/20230509084007.24373-1-lukas.bulwahn@gmail.com/
Link: https://lore.kernel.org/r/20230510065713.10996-3-lukas.bulwahn@gmail.com
assembly macro argument rather than a runtime register.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEV76QKkVc4xCGURexaDWVMHDJkrAFAmRKvncACgkQaDWVMHDJ
krDQsA/+OlDCexITqWgmd3rPN2ZUuyB+aV4MfoKppQuAuWD4I7vxD5qeqjRS2XTh
E6SSzp43zEVhVo6Kv3UvPR/Tr9edUGn2KzIWmqd1bOwhgbEfd898gzbWuRmK6i8t
qqweR1RMAL/COgPAlcrdpTLl2PCc9tLYpDnQ8WcAUqH4uoePpQyN3Za0J/dcKX7l
8XexOAaco4Wz3ylD9npPcLo9ytvohg+exJtCNldN1l2j5xXdA2fTqEJYaUMp/+Nd
Z1TTQ43QcT7dRknFojxdYfAkCqBfr8ccBAwV1mriahKWY/3xl35BqSeJVlma1tkm
UzkTY1CFwKYRk24C/oQK7OQMYnyJ7Q1RhSrd91lQWVjaTcI/3DPUKiKKdwFXDv4C
FUYvuJkanPVk3PyCZRvltdNvsXsifzx0RKZWLZ+3TQ2jtaMEDOzPgChq7a6WfpkQ
HQPuVoENHvyHdUycQhtELUsaJ3AdnOM87XiQDcbNNiaPiOLB9C8dhSWMKoPsMehO
oAiUQ7lW6po0lcELVSKib2ASVpXhOmlAxdRyZ50mhjrbpcxfBBGD3+KdFqZ4Gs1c
8UyrQbjVq07Lx2fvdizvDpIcr4M7z0xBAhJeIegC6z86XpJq5uvin+vOLzFAfe16
WGy6FiZtVpXp4fyqUY7GgQNqhk1b8h6EHKd9d/zCSPuH8/wT/6g=
=hvDm
-----END PGP SIGNATURE-----
Merge tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 tdx update from Dave Hansen:
"The original tdx hypercall assembly code took two flags in %RSI to
tweak its behavior at runtime. PeterZ recently axed one flag in commit
e80a48bade ("x86/tdx: Remove TDX_HCALL_ISSUE_STI").
Kill the other flag too and tweak the 'output' mode with an assembly
macro instead. This results in elimination of one push/pop pair and
overall easier to read assembly.
- Do conditional __tdx_hypercall() 'output' processing via an
assembly macro argument rather than a runtime register"
* tag 'x86_tdx_for_6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tdx: Drop flags from __tdx_hypercall()
Replace multiple __pa()/__va() definitions with a single one in misc.h.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/all/20230330114956.20342-2-kirill.shutemov%40linux.intel.com
Move the x86 documentation under Documentation/arch/ as a way of cleaning
up the top-level directory and making the structure of our docs more
closely match the structure of the source directories it describes.
All in-kernel references to the old paths have been updated.
Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: linux-arch@vger.kernel.org
Cc: x86@kernel.org
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/lkml/20230315211523.108836-1-corbet@lwn.net/
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
After TDX_HCALL_ISSUE_STI got dropped, the only flag left is
TDX_HCALL_HAS_OUTPUT. The flag indicates if the caller wants to see
tdx_hypercall_args updated based on the hypercall output.
Drop the flags and provide __tdx_hypercall_ret() that matches
TDX_HCALL_HAS_OUTPUT semantics.
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/all/20230321003511.9469-1-kirill.shutemov%40linux.intel.com
- Change V=1 option to print both short log and full command log.
- Allow V=1 and V=2 to be combined as V=12.
- Make W=1 detect wrong .gitignore files.
- Tree-wide cleanups for unused command line arguments passed to Clang.
- Stop using -Qunused-arguments with Clang.
- Make scripts/setlocalversion handle only correct release tags instead
of any arbitrary annotated tag.
- Create Debian and RPM source packages without cleaning the source tree.
- Various cleanups for packaging.
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmP7iHoVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGL/cQAK9q5rsNL5a2LgTbm89ORA+UV+ST
hrAoGo5DkJHUbVH53oPzyLynFBZPvUzLK8yjApjXkyAzy2hXYnj+vbTs0s+JVCFL
owS4NB0YP+tpHGuy8bGpWI0GMZSMwmspUteqxk86zuH8uQVAhnCaeV1/Cr6Aqj1h
2jk1FZid3/h7qEkEgu5U8soeyFnV6VhAT6Ie5yfZ2O2RdsSqPUh6vfKrgdyW4RWz
gito0SOUwvjIDfSmTnIIacUibisPRv2OW29OvmDp1aXj5rMhe3UfOznVE3NR86yl
ZbWDAIm6KYT8V1ASOoAUR80qent9IPKytThLK9BVEQCT6bsujCZMvhYhhEvO30TF
Lzsdr+FrES//xag3+hgc63FEied2xxWGQG1cRtzAhfRL9tJ03+mY1omoW6SyKqW/
Gc9PIcTgQbCIrkeL0HuAI1q3I1vkvHXInJKtGkoHh1J9aJ8v5gQpwGA+DDRUnA+A
LQSeEbT2Hf3MoF4CqZRnConvfhlMuLI+j5v54YPrhokxXmv7u807kjfwMFTiZ/+m
CJFlEMf9YRv3pi8g/AYyGAg5ZQigCwzOCRUC5kguFqzZdgnjiI907GEL804lm1Mg
lpx/HtYPyxwWEd2XyU6/C9AEIl3gm7MBd6b1tD54Tb/VmE+AvjS/O9jFYXZqnAnM
Llv4BfK/cQKwHb6o
=HpFZ
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Change V=1 option to print both short log and full command log
- Allow V=1 and V=2 to be combined as V=12
- Make W=1 detect wrong .gitignore files
- Tree-wide cleanups for unused command line arguments passed to Clang
- Stop using -Qunused-arguments with Clang
- Make scripts/setlocalversion handle only correct release tags instead
of any arbitrary annotated tag
- Create Debian and RPM source packages without cleaning the source
tree
- Various cleanups for packaging
* tag 'kbuild-v6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (74 commits)
kbuild: rpm-pkg: remove unneeded KERNELRELEASE from modules/headers_install
docs: kbuild: remove description of KBUILD_LDS_MODULE
.gitattributes: use 'dts' diff driver for *.dtso files
kbuild: deb-pkg: improve the usability of source package
kbuild: deb-pkg: fix binary-arch and clean in debian/rules
kbuild: tar-pkg: use tar rules in scripts/Makefile.package
kbuild: make perf-tar*-src-pkg work without relying on git
kbuild: deb-pkg: switch over to source format 3.0 (quilt)
kbuild: deb-pkg: make .orig tarball a hard link if possible
kbuild: deb-pkg: hide KDEB_SOURCENAME from Makefile
kbuild: srcrpm-pkg: create source package without cleaning
kbuild: rpm-pkg: build binary packages from source rpm
kbuild: deb-pkg: create source package without cleaning
kbuild: add a tool to list files ignored by git
Documentation/llvm: add Chimera Linux, Google and Meta datacenters
setlocalversion: use only the correct release tag for git-describe
setlocalversion: clean up the construction of version output
.gitignore: ignore *.cover and *.mbx
kbuild: remove --include-dir MAKEFLAG from top Makefile
kbuild: fix trivial typo in comment
...
- Robustify/fix calling startup_{32,64}() from the decompressor code,
and removing x86 quirk from scripts/head-object-list.txt as
a result.
- Do not register processors that cannot be onlined for x2APIC
Signed-off-by: Ingo Molnar <mingo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iQJFBAABCgAvFiEEBpT5eoXrXCwVQwEKEnMQ0APhK1gFAmPzcNsRHG1pbmdvQGtl
cm5lbC5vcmcACgkQEnMQ0APhK1gFjhAAqxnVl1X413IK9sd4C56wWdoLlRo9uGvO
HtAYK1SRzibTOrFn+ByBhugYzYPyxIx634rM6hyp4nkVEnbgCXQ+Qc1xOwjyW8fh
gxR3FCxsQiqajg7/1DOOSoMc/rc3adU73RHCWTjcHV/Zo7KEtvVa6AFvMTd1xzt9
eMPqi7wsPflbdUV9wvf6cKkFPe3Nm3P1hOlUDHGmYZkDw30N8UlZmxvegwrBFDdV
SpiJ0ZLV90NGJ6k6O3XSd4pVDxMn9DlYd0v/0r+YAT56hiRhefSKR2/jQntutZqp
YlyZYjvwUjwEgOdUWPPRbndWWEfFsE2XQQclr4L+ZLQ/Gm8jTsT2b/IvXBmF4FzV
0kzjNdhkPObx3X6UQZ47r6J3x8SWA9qZ6JH+uqCd6w/UW1KIiMBZ2kuIXvJn6eSr
xFLabjPPeOeRXFpiQJjIZ31m7i3JlQbIsfb8IIxI1D55nEkNywjk9VqlLEVw23qD
p93l0+ehpnZ2YCjV0kts/EXMikSmVZorA5wkTzEmG5ER+2BuIDin+wuGPawXrKew
QCa2X7GoVmxf81Rcz7f/E+JnYcSTQ6AQzFkOxe3zb97bnRsckM/87buC0GktcPjW
C8iy3yZzEIhRj2ilKEZLl7jIK59B4jReUKJx+vsxk2k2p5fuRdMkMtPfIZDBwVHQ
PzRZGSDY4FI=
=p3z1
-----END PGP SIGNATURE-----
Merge tag 'x86-boot-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Ingo Molnar:
- Robustify/fix calling startup_{32,64}() from the decompressor code,
and removing x86 quirk from scripts/head-object-list.txt as a result.
- Do not register processors that cannot be onlined for x2APIC
* tag 'x86-boot-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/acpi/boot: Do not register processors that cannot be onlined for x2APIC
scripts/head-object-list: Remove x86 from the list
x86/boot: Robustify calling startup_{32,64}() from the decompressor code
-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmPW7E8eHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGf7MIAI0JnHN9WvtEukSZ
E6j6+cEGWxsvD6q0g3GPolaKOCw7hlv0pWcFJFcUAt0jebspMdxV2oUGJ8RYW7Lg
nCcHvEVswGKLAQtQSWw52qotW6fUfMPsNYYB5l31sm1sKH4Cgss0W7l2HxO/1LvG
TSeNHX53vNAZ8pVnFYEWCSXC9bzrmU/VALF2EV00cdICmfvjlgkELGXoLKJJWzUp
s63fBHYGGURSgwIWOKStoO6HNo0j/F/wcSMx8leY8qDUtVKHj4v24EvSgxUSDBER
ch3LiSQ6qf4sw/z7pqruKFthKOrlNmcc0phjiES0xwwGiNhLv0z3rAhc4OM2cgYh
SDc/Y/c=
=zpaD
-----END PGP SIGNATURE-----
Merge tag 'v6.2-rc6' into sched/core, to pick up fixes
Pick up fixes before merging another batch of cpuidle updates.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
as-option tests new options using KBUILD_CFLAGS, which causes problems
when using as-option to update KBUILD_AFLAGS because many compiler
options are not valid assembler options.
This will be fixed in a follow up patch. Before doing so, move the
assembler test for -Wa,-mrelax-relocations=no from using as-option to
cc-option.
Link: https://lore.kernel.org/llvm/CAK7LNATcHt7GcXZ=jMszyH=+M_LC9Qr6yeAGRCBbE6xriLxtUQ@mail.gmail.com/
Suggested-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Linux Kernel Functional Testing <lkft@linaro.org>
Tested-by: Anders Roxell <anders.roxell@linaro.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
The hypervisor can enable various new features (SEV_FEATURES[1:63]) and start a
SNP guest. Some of these features need guest side implementation. If any of
these features are enabled without it, the behavior of the SNP guest will be
undefined. It may fail booting in a non-obvious way making it difficult to
debug.
Instead of allowing the guest to continue and have it fail randomly later,
detect this early and fail gracefully.
The SEV_STATUS MSR indicates features which the hypervisor has enabled. While
booting, SNP guests should ascertain that all the enabled features have guest
side implementation. In case a feature is not implemented in the guest, the
guest terminates booting with GHCB protocol Non-Automatic Exit(NAE) termination
request event, see "SEV-ES Guest-Hypervisor Communication Block Standardization"
document (currently at https://developer.amd.com/wp-content/resources/56421.pdf),
section "Termination Request".
Populate SW_EXITINFO2 with mask of unsupported features that the hypervisor can
easily report to the user.
More details in the AMD64 APM Vol 2, Section "SEV_STATUS MSR".
[ bp:
- Massage.
- Move snp_check_features() call to C code.
Note: the CC:stable@ aspect here is to be able to protect older, stable
kernels when running on newer hypervisors. Or not "running" but fail
reliably and in a well-defined manner instead of randomly. ]
Fixes: cbd3d4f7c4 ("x86/sev: Check SEV-SNP features support")
Signed-off-by: Nikunj A Dadhania <nikunj@amd.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
Cc: <stable@kernel.org>
Link: https://lore.kernel.org/r/20230118061943.534309-1-nikunj@amd.com
objtool found a few cases where this code called out into instrumented
code:
vmlinux.o: warning: objtool: __halt+0x2c: call to hcall_func.constprop.0() leaves .noinstr.text section
vmlinux.o: warning: objtool: __halt+0x3f: call to __tdx_hypercall() leaves .noinstr.text section
vmlinux.o: warning: objtool: __tdx_hypercall+0x66: call to __tdx_hypercall_failed() leaves .noinstr.text section
Fix it by:
- moving TDX tdcall assembly methods into .noinstr.text (they are already noistr-clean)
- marking __tdx_hypercall_failed() as 'noinstr'
- annotating hcall_func() as __always_inline
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Tony Lindgren <tony@atomide.com>
Tested-by: Ulf Hansson <ulf.hansson@linaro.org>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Frederic Weisbecker <frederic@kernel.org>
Link: https://lore.kernel.org/r/20230112195541.111485720@infradead.org
With 'GNU assembler (GNU Binutils for Debian) 2.39.90.20221231' the
build now reports:
arch/x86/realmode/rm/../../boot/bioscall.S: Assembler messages:
arch/x86/realmode/rm/../../boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
arch/x86/realmode/rm/../../boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
arch/x86/boot/bioscall.S: Assembler messages:
arch/x86/boot/bioscall.S:35: Warning: found `movsd'; assuming `movsl' was meant
arch/x86/boot/bioscall.S:70: Warning: found `movsd'; assuming `movsl' was meant
Which is due to:
PR gas/29525
Note that with the dropped CMPSD and MOVSD Intel Syntax string insn
templates taking operands, mixed IsString/non-IsString template groups
(with memory operands) cannot occur anymore. With that
maybe_adjust_templates() becomes unnecessary (and is hence being
removed).
More details: https://sourceware.org/bugzilla/show_bug.cgi?id=29525
Borislav Petkov further explains:
" the particular problem here is is that the 'd' suffix is
"conflicting" in the sense that you can have SSE mnemonics like movsD %xmm...
and the same thing also for string ops (which is the case here) so apparently
the agreement in binutils land is to use the always accepted suffixes 'l' or 'q'
and phase out 'd' slowly... "
Fixes: 7a734e7dd9 ("x86, setup: "glove box" BIOS calls -- infrastructure")
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://lore.kernel.org/r/Y71I3Ex2pvIxMpsP@hirez.programming.kicks-ass.net
After commit ce697ccee1 ("kbuild: remove head-y syntax"), I
started digging whether x86 is ready for removing this old cruft.
Removing its objects from the list makes the kernel unbootable.
This applies only to bzImage, vmlinux still works correctly.
The reason is that with no strict object order determined by the
linker arguments, not the linker script, startup_64 can be placed
not right at the beginning of the kernel.
Here's vmlinux.map's beginning before removing:
ffffffff81000000 vmlinux.o:(.head.text)
ffffffff81000000 startup_64
ffffffff81000070 secondary_startup_64
ffffffff81000075 secondary_startup_64_no_verify
ffffffff81000160 verify_cpu
and after:
ffffffff81000000 vmlinux.o:(.head.text)
ffffffff81000000 pvh_start_xen
ffffffff81000080 startup_64
ffffffff810000f0 secondary_startup_64
ffffffff810000f5 secondary_startup_64_no_verify
Not a problem itself, but the self-extractor code has the address of
that function hardcoded the beginning, not looking onto the ELF
header, which always contains the address of startup_{32,64}().
So, instead of doing an "act of blind faith", just take the address
from the ELF header and extract a relative offset to the entry
point. The decompressor function already returns a pointer to the
beginning of the kernel to the Asm code, which then jumps to it,
so add that offset to the return value.
This doesn't change anything for now, but allows to resign from the
"head object list" for x86 and makes sure valid Kbuild or any other
improvements won't break anything here in general.
Signed-off-by: Alexander Lobakin <alexandr.lobakin@intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Jiri Slaby <jirislaby@kernel.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: https://lore.kernel.org/r/20230109170403.4117105-2-alexandr.lobakin@intel.com
been long in the making. It is a lighterweight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is, it aligns all kernel functions to 16 bytes
boundary and adds a 16-byte padding before the function, objtool
collects all functions' locations and when the mitigation gets applied,
it patches a call accounting thunk which is used to track the call depth
of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific value
for the Return Stack Buffer, the code stuffs that RSB and avoids its
underflow which could otherwise lead to the Intel variant of Retbleed.
This software-only solution brings a lot of the lost performance back,
as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT support
where present to annotate and track indirect branches using a hash to
validate them
- Other misc fixes and cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOZp5EACgkQEsHwGGHe
VUrZFxAAvi/+8L0IYSK4mKJvixGbTFjxN/Swo2JVOfs34LqGUT6JaBc+VUMwZxdb
VMTFIZ3ttkKEodjhxGI7oGev6V8UfhI37SmO2lYKXpQVjXXnMlv/M+Vw3teE38CN
gopi+xtGnT1IeWQ3tc/Tv18pleJ0mh5HKWiW+9KoqgXj0wgF9x4eRYDz1TDCDA/A
iaBzs56j8m/FSykZHnrWZ/MvjKNPdGlfJASUCPeTM2dcrXQGJ93+X2hJctzDte0y
Nuiw6Y0htfFBE7xoJn+sqm5Okr+McoUM18/CCprbgSKYk18iMYm3ZtAi6FUQZS1A
ua4wQCf49loGp15PO61AS5d3OBf5D3q/WihQRbCaJvTVgPp9sWYnWwtcVUuhMllh
ZQtBU9REcVJ/22bH09Q9CjBW0VpKpXHveqQdqRDViLJ6v/iI6EFGmD24SW/VxyRd
73k9MBGrL/dOf1SbEzdsnvcSB3LGzp0Om8o/KzJWOomrVKjBCJy16bwTEsCZEJmP
i406m92GPXeaN1GhTko7vmF0GnkEdJs1GVCZPluCAxxbhHukyxHnrjlQjI4vC80n
Ylc0B3Kvitw7LGJsPqu+/jfNHADC/zhx1qz/30wb5cFmFbN1aRdp3pm8JYUkn+l/
zri2Y6+O89gvE/9/xUhMohzHsWUO7xITiBavewKeTP9GSWybWUs=
=cRy1
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Add the call depth tracking mitigation for Retbleed which has been
long in the making. It is a lighterweight software-only fix for
Skylake-based cores where enabling IBRS is a big hammer and causes a
significant performance impact.
What it basically does is, it aligns all kernel functions to 16 bytes
boundary and adds a 16-byte padding before the function, objtool
collects all functions' locations and when the mitigation gets
applied, it patches a call accounting thunk which is used to track
the call depth of the stack at any time.
When that call depth reaches a magical, microarchitecture-specific
value for the Return Stack Buffer, the code stuffs that RSB and
avoids its underflow which could otherwise lead to the Intel variant
of Retbleed.
This software-only solution brings a lot of the lost performance
back, as benchmarks suggest:
https://lore.kernel.org/all/20220915111039.092790446@infradead.org/
That page above also contains a lot more detailed explanation of the
whole mechanism
- Implement a new control flow integrity scheme called FineIBT which is
based on the software kCFI implementation and uses hardware IBT
support where present to annotate and track indirect branches using a
hash to validate them
- Other misc fixes and cleanups
* tag 'x86_core_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (80 commits)
x86/paravirt: Use common macro for creating simple asm paravirt functions
x86/paravirt: Remove clobber bitmask from .parainstructions
x86/debug: Include percpu.h in debugreg.h to get DECLARE_PER_CPU() et al
x86/cpufeatures: Move X86_FEATURE_CALL_DEPTH from bit 18 to bit 19 of word 11, to leave space for WIP X86_FEATURE_SGX_EDECCSSA bit
x86/Kconfig: Enable kernel IBT by default
x86,pm: Force out-of-line memcpy()
objtool: Fix weak hole vs prefix symbol
objtool: Optimize elf_dirty_reloc_sym()
x86/cfi: Add boot time hash randomization
x86/cfi: Boot time selection of CFI scheme
x86/ibt: Implement FineIBT
objtool: Add --cfi to generate the .cfi_sites section
x86: Add prefix symbols for function padding
objtool: Add option to generate prefix symbols
objtool: Avoid O(bloody terrible) behaviour -- an ode to libelf
objtool: Slice up elf_create_section_symbol()
kallsyms: Revert "Take callthunks into account"
x86: Unconfuse CONFIG_ and X86_FEATURE_ namespaces
x86/retpoline: Fix crash printing warning
x86/paravirt: Fix a !PARAVIRT build warning
...
- Convert flexible array members, fix -Wstringop-overflow warnings,
and fix KCFI function type mismatches that went ignored by
maintainers (Gustavo A. R. Silva, Nathan Chancellor, Kees Cook).
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(),
add more __alloc_size attributes, and introduce full testing
of all allocator functions. Finally remove the ksize() side-effect
so that each allocation-aware checker can finally behave without
exceptions.
- Introduce oops_limit (default 10,000) and warn_limit (default off)
to provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook).
- Introduce overflows_type() and castable_to_type() helpers for
cleaner overflow checking.
- Improve code generation for strscpy() and update str*() kern-doc.
- Convert strscpy and sigphash tests to KUnit, and expand memcpy
tests.
- Always use a non-NULL argument for prepare_kernel_cred().
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell).
- Adjust orphan linker section checking to respect CONFIG_WERROR
(Xin Li).
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu).
- Fix um vs FORTIFY warnings for always-NULL arguments.
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmOZSOoWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJjAAD/0YkvpU7f03f8hcQMJK6wv//24K
AW41hEaBikq9RcmkuvkLLrJRibGgZ5O2xUkUkxRs/HxhkhrZ0kEw8sbwZe8MoWls
F4Y9+TDjsrdHmjhfcBZdLnVxwcKK5wlaEcpjZXtbsfcdhx3TbgcDA23YELl5t0K+
I11j4kYmf9SLl4CwIrSP5iACml8CBHARDh8oIMF7FT/LrjNbM8XkvBcVVT6hTbOV
yjgA8WP2e9GXvj9GzKgqvd0uE/kwPkVAeXLNFWopPi4FQ8AWjlxbBZR0gamA6/EB
d7TIs0ifpVU2JGQaTav4xO6SsFMj3ntoUI0qIrFaTxZAvV4KYGrPT/Kwz1O4SFaG
rN5lcxseQbPQSBTFNG4zFjpywTkVCgD2tZqDwz5Rrmiraz0RyIokCN+i4CD9S0Ds
oEd8JSyLBk1sRALczkuEKo0an5AyC9YWRcBXuRdIHpLo08PsbeUUSe//4pe303cw
0ApQxYOXnrIk26MLElTzSMImlSvlzW6/5XXzL9ME16leSHOIfDeerPnc9FU9Eb3z
ODv22z6tJZ9H/apSUIHZbMciMbbVTZ8zgpkfydr08o87b342N/ncYHZ5cSvQ6DWb
jS5YOIuvl46/IhMPT16qWC8p0bP5YhxoPv5l6Xr0zq0ooEj0E7keiD/SzoLvW+Qs
AHXcibguPRQBPAdiPQ==
=yaaN
-----END PGP SIGNATURE-----
Merge tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull kernel hardening updates from Kees Cook:
- Convert flexible array members, fix -Wstringop-overflow warnings, and
fix KCFI function type mismatches that went ignored by maintainers
(Gustavo A. R. Silva, Nathan Chancellor, Kees Cook)
- Remove the remaining side-effect users of ksize() by converting
dma-buf, btrfs, and coredump to using kmalloc_size_roundup(), add
more __alloc_size attributes, and introduce full testing of all
allocator functions. Finally remove the ksize() side-effect so that
each allocation-aware checker can finally behave without exceptions
- Introduce oops_limit (default 10,000) and warn_limit (default off) to
provide greater granularity of control for panic_on_oops and
panic_on_warn (Jann Horn, Kees Cook)
- Introduce overflows_type() and castable_to_type() helpers for cleaner
overflow checking
- Improve code generation for strscpy() and update str*() kern-doc
- Convert strscpy and sigphash tests to KUnit, and expand memcpy tests
- Always use a non-NULL argument for prepare_kernel_cred()
- Disable structleak plugin in FORTIFY KUnit test (Anders Roxell)
- Adjust orphan linker section checking to respect CONFIG_WERROR (Xin
Li)
- Make sure siginfo is cleared for forced SIGKILL (haifeng.xu)
- Fix um vs FORTIFY warnings for always-NULL arguments
* tag 'hardening-v6.2-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (31 commits)
ksmbd: replace one-element arrays with flexible-array members
hpet: Replace one-element array with flexible-array member
um: virt-pci: Avoid GCC non-NULL warning
signal: Initialize the info in ksignal
lib: fortify_kunit: build without structleak plugin
panic: Expose "warn_count" to sysfs
panic: Introduce warn_limit
panic: Consolidate open-coded panic_on_warn checks
exit: Allow oops_limit to be disabled
exit: Expose "oops_count" to sysfs
exit: Put an upper limit on how often we can oops
panic: Separate sysctl logic from CONFIG_SMP
mm/pgtable: Fix multiple -Wstringop-overflow warnings
mm: Make ksize() a reporting-only function
kunit/fortify: Validate __alloc_size attribute results
drm/sti: Fix return type of sti_{dvo,hda,hdmi}_connector_mode_valid()
drm/fsl-dcu: Fix return type of fsl_dcu_drm_connector_mode_valid()
driver core: Add __alloc_size hint to devm allocators
overflow: Introduce overflows_type() and castable_to_type()
coredump: Proactively round up to kmalloc bucket size
...
EFI mixed-mode code to a separate compilation unit, the AMD memory
encryption early code where it belongs and fixing up build dependencies.
Make the deprecated EFI handover protocol optional with the goal of
removing it at some point (Ard Biesheuvel)
- Skip realmode init code on Xen PV guests as it is not needed there
- Remove an old 32-bit PIC code compiler workaround
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmOYaiMACgkQEsHwGGHe
VUrNVhAAk3lLagEsrBcQ24SnMMAyQvdKfRucn9fbs72jBCyWbDqXcE59qNgdbMS1
3rIL+EJdF8jlm5K28GjRS1WSvwUyYbyFEfUcYfqZl9L/5PAl7PlG7nNQw7/gXnw+
xS57w/Q3cONlo5LC0K2Zkbj/59RvDoBEs3nkhozkKR0npTDW/LK3Vl0zgKTkvqsV
DzRIHhWsqSEvpdowbQmQCyqFh/pOoQlZkQwjYVA9+SaQYdH3Yo1dpLd5i9I9eVmJ
dci/HDU+plwYYuZ1XhxwXr82PcdCUVYjJ/DTt9GkTVYq7u5EWx62puxTl+c+wbG2
H1WBXuZHBGdzNMFdnb1k9RuLCaYdaxKTNlZh3FPMMDtkjtjKTl/olXTlFUYFgI6E
FPv4hi15g6pMveS3K6YUAd0uGvpsjvLUZHPqMDVS2trhxLENQALc6Id/PwqzrQ1T
FzfPYcDyFFwMM3MDuWc8ClwEDD9wr0Z4m4Aek/ca2r85AKEX8ZtTTlWZoI4E9A4B
hEjUFnRhT/d6XLWwZqcOIKfwtbpKAjdsCN3ElFst8ogRFAXqW8luDoI4BRCkBC4p
T4RHdij4afkuFjSAxBacazpaavtcCsDqXwBpeL4YN+4fA7+NokVZGiQVh/3S8BPn
LlgIf6awFq6yQq7JyEGPdk+dWn5sknldixZ55m666ZLzSvQhvE8=
=VGZx
-----END PGP SIGNATURE-----
Merge tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Borislav Petkov:
"A of early boot cleanups and fixes.
- Do some spring cleaning to the compressed boot code by moving the
EFI mixed-mode code to a separate compilation unit, the AMD memory
encryption early code where it belongs and fixing up build
dependencies. Make the deprecated EFI handover protocol optional
with the goal of removing it at some point (Ard Biesheuvel)
- Skip realmode init code on Xen PV guests as it is not needed there
- Remove an old 32-bit PIC code compiler workaround"
* tag 'x86_boot_for_v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Remove x86_32 PIC using %ebx workaround
x86/boot: Skip realmode init code when running as Xen PV guest
x86/efi: Make the deprecated EFI handover protocol optional
x86/boot/compressed: Only build mem_encrypt.S if AMD_MEM_ENCRYPT=y
x86/boot/compressed: Adhere to calling convention in get_sev_encryption_bit()
x86/boot/compressed: Move startup32_check_sev_cbit() out of head_64.S
x86/boot/compressed: Move startup32_check_sev_cbit() into .text
x86/boot/compressed: Move startup32_load_idt() out of head_64.S
x86/boot/compressed: Move startup32_load_idt() into .text section
x86/boot/compressed: Pull global variable reference into startup32_load_idt()
x86/boot/compressed: Avoid touching ECX in startup32_set_idt_entry()
x86/boot/compressed: Simplify IDT/GDT preserve/restore in the EFI thunk
x86/boot/compressed, efi: Merge multiple definitions of image_offset into one
x86/boot/compressed: Move efi32_pe_entry() out of head_64.S
x86/boot/compressed: Move efi32_entry out of head_64.S
x86/boot/compressed: Move efi32_pe_entry into .text section
x86/boot/compressed: Move bootargs parsing out of 32-bit startup code
x86/boot/compressed: Move 32-bit entrypoint code into .text section
x86/boot/compressed: Rename efi_thunk_64.S to efi-mixed.S
- Refactor the zboot code so that it incorporates all the EFI stub
logic, rather than calling the decompressed kernel as a EFI app.
- Add support for initrd= command line option to x86 mixed mode.
- Allow initrd= to be used with arbitrary EFI accessible file systems
instead of just the one the kernel itself was loaded from.
- Move some x86-only handling and manipulation of the EFI memory map
into arch/x86, as it is not used anywhere else.
- More flexible handling of any random seeds provided by the boot
environment (i.e., systemd-boot) so that it becomes available much
earlier during the boot.
- Allow improved arch-agnostic EFI support in loaders, by setting a
uniform baseline of supported features, and adding a generic magic
number to the DOS/PE header. This should allow loaders such as GRUB or
systemd-boot to reduce the amount of arch-specific handling
substantially.
- (arm64) Run EFI runtime services from a dedicated stack, and use it to
recover from synchronous exceptions that might occur in the firmware
code.
- (arm64) Ensure that we don't allocate memory outside of the 48-bit
addressable physical range.
- Make EFI pstore record size configurable
- Add support for decoding CXL specific CPER records
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmOTQ1cACgkQw08iOZLZ
jyQRkAv+LqaZFWeVwhAQHiw/N3RnRM0nZHea6++D2p1y/ZbCpwv3pdLl2YHQ1KmW
wDG9Nr4C1ITLtfy1YZKeYpwloQtq9S1GZDWnFpVv/hdo7L924eRAwIlxowWn1OnP
ruxv2PaYXyb0plh1YD1f6E1BqrfUOtajET55Kxs9ZsxmnMtDpIX3NiYy4LKMBIZC
+Eywt41M3uBX+wgmSujFBMVVJjhOX60WhUYXqy0RXwDKOyrz/oW5td+eotSCreB6
FVbjvwQvUdtzn4s1FayOMlTrkxxLw4vLhsaUGAdDOHd3rg3sZT9Xh1HqFFD6nss6
ZAzAYQ6BzdiV/5WSB9meJe+BeG1hjTNKjJI6JPO2lctzYJqlnJJzI6JzBuH9vzQ0
dffLB8NITeEW2rphIh+q+PAKFFNbXWkJtV4BMRpqmzZ/w7HwupZbUXAzbWE8/5km
qlFpr0kmq8GlVcbXNOFjmnQVrJ8jPYn+O3AwmEiVAXKZJOsMH0sjlXHKsonme9oV
Sk71c6Em
=JEXz
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"Another fairly sizable pull request, by EFI subsystem standards.
Most of the work was done by me, some of it in collaboration with the
distro and bootloader folks (GRUB, systemd-boot), where the main focus
has been on removing pointless per-arch differences in the way EFI
boots a Linux kernel.
- Refactor the zboot code so that it incorporates all the EFI stub
logic, rather than calling the decompressed kernel as a EFI app.
- Add support for initrd= command line option to x86 mixed mode.
- Allow initrd= to be used with arbitrary EFI accessible file systems
instead of just the one the kernel itself was loaded from.
- Move some x86-only handling and manipulation of the EFI memory map
into arch/x86, as it is not used anywhere else.
- More flexible handling of any random seeds provided by the boot
environment (i.e., systemd-boot) so that it becomes available much
earlier during the boot.
- Allow improved arch-agnostic EFI support in loaders, by setting a
uniform baseline of supported features, and adding a generic magic
number to the DOS/PE header. This should allow loaders such as GRUB
or systemd-boot to reduce the amount of arch-specific handling
substantially.
- (arm64) Run EFI runtime services from a dedicated stack, and use it
to recover from synchronous exceptions that might occur in the
firmware code.
- (arm64) Ensure that we don't allocate memory outside of the 48-bit
addressable physical range.
- Make EFI pstore record size configurable
- Add support for decoding CXL specific CPER records"
* tag 'efi-next-for-v6.2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (43 commits)
arm64: efi: Recover from synchronous exceptions occurring in firmware
arm64: efi: Execute runtime services from a dedicated stack
arm64: efi: Limit allocations to 48-bit addressable physical region
efi: Put Linux specific magic number in the DOS header
efi: libstub: Always enable initrd command line loader and bump version
efi: stub: use random seed from EFI variable
efi: vars: prohibit reading random seed variables
efi: random: combine bootloader provided RNG seed with RNG protocol output
efi/cper, cxl: Decode CXL Error Log
efi/cper, cxl: Decode CXL Protocol Error Section
efi: libstub: fix efi_load_initrd_dev_path() kernel-doc comment
efi: x86: Move EFI runtime map sysfs code to arch/x86
efi: runtime-maps: Clarify purpose and enable by default for kexec
efi: pstore: Add module parameter for setting the record size
efi: xen: Set EFI_PARAVIRT for Xen dom0 boot on all architectures
efi: memmap: Move manipulation routines into x86 arch tree
efi: memmap: Move EFI fake memmap support into x86 arch tree
efi: libstub: Undeprecate the command line initrd loader
efi: libstub: Add mixed mode support to command line initrd loader
efi: libstub: Permit mixed mode return types other than efi_status_t
...
- Rework the handling of x86_regset for 32 and 64 bit. The original
implementation tried to minimize the allocation size with quite some
hard to understand and fragile tricks. Make it robust and straight
forward by separating the register enumerations for 32 and 64 bit
completely.
- Add a few missing static annotations
- Remove the stale unused setup_once() assembly function
- Address a few minor static analysis and kernel-doc warnings
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAmOUu0ATHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUNzEACNn5XbRqxPQZak5XHeJ46/VNVTqTE0
Z7euwF8oP+aAybyDevvm18D2hB9Atn4vU9QJYhnTxBXbCLUNErKrH8FcXdNOBbeC
YdAX7nO5WH8IM+drCMySeK6Tv6rvhnDUtgBzdBSl4NdPXUSOnGo+jHqHfN/Q+/n0
yvbwSoVAjD01sxVZQqKQOrzDgDuR/zlISCVudfS+tR4Rm/CYj0cl+MQS9Z1VM3Z6
7pqyypd5+CyNAD6vTDY/q+ZK0ShfNnU9TIIoGmOB/pc0kLctwIu3MY76Uo2DUgGn
n/ItR9mvYu/QelCwX02VG3aRYJPLRfBa+DjQfZUwZapRz3rsjKtfa8ogpPZTLrSO
o4ht/jxlKKDyNOQKYeL2yy054JR4DkKziilEzw5GZHeH2y66XWudRuWfMwbTdrGc
esP5fSNfZ9uluYl6GCCw6S83RJzQ8aZXRcAy7CJgw2Qb4XE7IOA2jf18x5AYaDUp
4a6HCjbxYkEmKCkzkh9+w5koYruyizMBKMBBh5QsMzH4xp20s/vffHwbZ1tls9Za
eTDC/E+wW9Om3qynRynm0EmcHpa0j+RcmkHOhFcXj6SRLnhzktk4Rrr3vlhardS3
Pc8h3GnE5mFXqS8t3r6/hvMk+6svhSu3RbICiLNU72F/tVLU628ux/WoCKfXZloE
7HxWoVhkTF7eOw==
=DTBQ
-----END PGP SIGNATURE-----
Merge tag 'x86-cleanups-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Thomas Gleixner:
"A set of x86 cleanups:
- Rework the handling of x86_regset for 32 and 64 bit.
The original implementation tried to minimize the allocation size
with quite some hard to understand and fragile tricks. Make it
robust and straight forward by separating the register enumerations
for 32 and 64 bit completely.
- Add a few missing static annotations
- Remove the stale unused setup_once() assembly function
- Address a few minor static analysis and kernel-doc warnings"
* tag 'x86-cleanups-2022-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/asm/32: Remove setup_once()
x86/kaslr: Fix process_mem_region()'s return value
x86: Fix misc small issues
x86/boot: Repair kernel-doc for boot_kstrtoul()
x86: Improve formatting of user_regset arrays
x86: Separate out x86_regset for 32 and 64 bit
x86/i8259: Make default_legacy_pic static
x86/tsc: Make art_related_clocksource static
GRUB currently relies on the magic number in the image header of ARM and
arm64 EFI kernel images to decide whether or not the image in question
is a bootable kernel.
However, the purpose of the magic number is to identify the image as one
that implements the bare metal boot protocol, and so GRUB, which only
does EFI boot, is limited unnecessarily to booting images that could
potentially be booted in a non-EFI manner as well.
This is problematic for the new zboot decompressor image format, as it
can only boot in EFI mode, and must therefore not use the bare metal
boot magic number in its header.
For this reason, the strict magic number was dropped from GRUB, to
permit essentially any kind of EFI executable to be booted via the
'linux' command, blurring the line between the linux loader and the
chainloader.
So let's use the same field in the DOS header that RISC-V and arm64
already use for their 'bare metal' magic numbers to store a 'generic
Linux kernel' magic number, which can be used to identify bootable
kernel images in PE format which don't necessarily implement a bare
metal boot protocol in the same binary. Note that, in the context of
EFI, the MS-DOS header is only described in terms of the fields that it
shares with the hybrid PE/COFF image format, (i.e., the MS-DOS EXE magic
number at offset #0 and the PE header offset at byte offset #0x3c).
Since we aim for compatibility with EFI only, and not with MS-DOS or
MS-Windows, we can use the remaining space in the MS-DOS header however
we want.
Let's set the generic magic number for x86 images as well: existing
bootloaders already have their own methods to identify x86 Linux images
that can be booted in a non-EFI manner, and having the magic number in
place there will ease any future transitions in loader implementations
to merge the x86 and non-x86 EFI boot paths.
Note that 32-bit ARM already uses the same location in the header for a
different purpose, but the ARM support is already widely implemented and
the EFI zboot decompressor is not available on ARM anyway, so we just
disregard it here.
Acked-by: Leif Lindholm <quic_llindhol@quicinc.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The currently supported minimum gcc version is 5.1. Before that, the
PIC register, when generating Position Independent Code, was considered
"fixed" in the sense that it wasn't in the set of registers available to
the compiler's register allocator. Which, on x86-32, is already a very
small set.
What is more, the register allocator was unable to satisfy extended asm
"=b" constraints. (Yes, PIC code uses %ebx on 32-bit as the base reg.)
With gcc 5.1:
"Reuse of the PIC hard register, instead of using a fixed register,
was implemented on x86/x86-64 targets. This improves generated PIC
code performance as more hard registers can be used. Shared libraries
can significantly benefit from this optimization. Currently it is
switched on only for x86/x86-64 targets. As RA infrastructure is already
implemented for PIC register reuse, other targets might follow this in
the future."
(from: https://gcc.gnu.org/gcc-5/changes.html)
which basically means that the register allocator has a higher degree
of freedom when handling %ebx, including reloading it with the correct
value before a PIC access.
Furthermore:
arch/x86/Makefile:
# Never want PIC in a 32-bit kernel, prevent breakage with GCC built
# with nonstandard options
KBUILD_CFLAGS += -fno-pic
$ gcc -Wp,-MMD,arch/x86/boot/.cpuflags.o.d ... -fno-pic ... -D__KBUILD_MODNAME=kmod_cpuflags -c -o arch/x86/boot/cpuflags.o arch/x86/boot/cpuflags.c
so the 32-bit workaround in cpuid_count() is fixing exactly nothing
because 32-bit configs don't even allow PIC builds.
As to 64-bit builds: they're done using -mcmodel=kernel which produces
RIP-relative addressing for PIC builds and thus does not apply here
either.
So get rid of the thing and make cpuid_count() nice and simple.
There should be no functional changes resulting from this.
[ bp: Expand commit message. ]
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221104124546.196077-1-ubizjak@gmail.com
The EFI handover protocol permits a bootloader to invoke the kernel as a
EFI PE/COFF application, while passing a bootparams struct as a third
argument to the entrypoint function call.
This has no basis in the UEFI specification, and there are better ways
to pass additional data to a UEFI application (UEFI configuration
tables, UEFI variables, UEFI protocols) than going around the
StartImage() boot service and jumping to a fixed offset in the loaded
image, just to call a different function that takes a third parameter.
The reason for handling struct bootparams in the bootloader was that the
EFI stub could only load initrd images from the EFI system partition,
and so passing it via struct bootparams was needed for loaders like
GRUB, which pass the initrd in memory, and may load it from anywhere,
including from the network. Another motivation was EFI mixed mode, which
could not use the initrd loader in the EFI stub at all due to 32/64 bit
incompatibilities (which will be fixed shortly [0]), and could not
invoke the ordinary PE/COFF entry point either, for the same reasons.
Given that loaders such as GRUB already carried the bootparams handling
in order to implement non-EFI boot, retaining that code and just passing
bootparams to the EFI stub was a reasonable choice (although defining an
alternate entrypoint could have been avoided.) However, the GRUB side
changes never made it upstream, and are only shipped by some of the
distros in their downstream versions.
In the meantime, EFI support has been added to other Linux architecture
ports, as well as to U-boot and systemd, including arch-agnostic methods
for passing initrd images in memory [1], and for doing mixed mode boot
[2], none of them requiring anything like the EFI handover protocol. So
given that only out-of-tree distro GRUB relies on this, let's permit it
to be omitted from the build, in preparation for retiring it completely
at a later date. (Note that systemd-boot does have an implementation as
well, but only uses it as a fallback for booting images that do not
implement the LoadFile2 based initrd loading method, i.e., v5.8 or older)
[0] https://lore.kernel.org/all/20220927085842.2860715-1-ardb@kernel.org/
[1] ec93fc371f ("efi/libstub: Add support for loading the initrd from a device path")
[2] 97aa276579 ("efi/x86: Add true mixed mode entry point into .compat section")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-18-ardb@kernel.org
Make get_sev_encryption_bit() follow the ordinary i386 calling
convention, and only call it if CONFIG_AMD_MEM_ENCRYPT is actually
enabled. This clarifies the calling code, and makes it more
maintainable.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-16-ardb@kernel.org
Now that the startup32_check_sev_cbit() routine can execute from
anywhere and behaves like an ordinary function, it can be moved where it
belongs.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-15-ardb@kernel.org
Move startup32_check_sev_cbit() into the .text section and turn it into
an ordinary function using the ordinary 32-bit calling convention,
instead of saving/restoring the registers that are known to be live at
the only call site. This improves maintainability, and makes it possible
to move this function out of head_64.S and into a separate compilation
unit that is specific to memory encryption.
Note that this requires the call site to be moved before the mixed mode
check, as %eax will be live otherwise.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-14-ardb@kernel.org
Now that startup32_load_idt() has been refactored into an ordinary
callable function, move it into mem-encrypt.S where it belongs.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-13-ardb@kernel.org
Convert startup32_load_idt() into an ordinary function and move it into
the .text section. This involves turning the rva() immediates into ones
derived from a local label, and preserving/restoring the %ebp and %ebx
as per the calling convention.
Also move the #ifdef to the only existing call site. This makes it clear
that the function call does nothing if support for memory encryption is
not compiled in.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-12-ardb@kernel.org
In preparation for moving startup32_load_idt() out of head_64.S and
turning it into an ordinary function using the ordinary 32-bit calling
convention, pull the global variable reference to boot32_idt up into
startup32_load_idt() so that startup32_set_idt_entry() does not need to
discover its own runtime physical address, which will no longer be
correlated with startup_32 once this code is moved into .text.
While at it, give startup32_set_idt_entry() static linkage.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-11-ardb@kernel.org
Avoid touching register %ecx in startup32_set_idt_entry(), by folding
the MOV, SHL and ORL instructions into a single ORL which no longer
requires a temp register.
This permits ECX to be used as a function argument in a subsequent
patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-10-ardb@kernel.org
Tweak the asm and remove some redundant instructions. While at it,
fix the associated comment for style and correctness.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-9-ardb@kernel.org
There is no need for head_32.S and head_64.S both declaring a copy of
the global 'image_offset' variable, so drop those and make the extern C
declaration the definition.
When image_offset is moved to the .c file, it needs to be placed
particularly in the .data section because it lands by default in the
.bss section which is cleared too late, in .Lrelocated, before the first
access to it and thus garbage gets read, leading to SEV guests exploding
in early boot.
This happens only when the SEV guest kernel is loaded through grub. If
supplied with qemu's -kernel command line option, that memory is always
cleared upfront by qemu and all is fine there.
[ bp: Expand commit message with SEV aspect. ]
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-8-ardb@kernel.org
Since commit 2df8220cc5 ("kbuild: build init/built-in.a just once"),
the .version file is not touched at all when KBUILD_BUILD_VERSION is
given.
If KBUILD_BUILD_VERSION is specified and the .version file is missing
(for example right after 'make mrproper'), "No such file or director"
is shown. Even if the .version exists, it is irrelevant to the version
of the current build.
$ make -j$(nproc) KBUILD_BUILD_VERSION=100 mrproper defconfig all
[ snip ]
BUILD arch/x86/boot/bzImage
cat: .version: No such file or directory
Kernel: arch/x86/boot/bzImage is ready (#)
Show KBUILD_BUILD_VERSION if it is given.
Fixes: 2df8220cc5 ("kbuild: build init/built-in.a just once")
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <nicolas@fjasle.eu>
Move the implementation of efi32_pe_entry() into efi-mixed.S, which is a
more suitable location that only gets built if EFI mixed mode is
actually enabled.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-7-ardb@kernel.org
Move the efi32_entry() routine out of head_64.S and into efi-mixed.S,
which reduces clutter in the complicated startup routines. It also
permits linkage of some symbols used by code to be made local.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-6-ardb@kernel.org
Move efi32_pe_entry() into the .text section, so that it can be moved
out of head_64.S and into a separate compilation unit in a subsequent
patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-5-ardb@kernel.org
Move the logic that chooses between the different EFI entrypoints out of
the 32-bit boot path, and into a 64-bit helper that can perform the same
task much more cleanly. While at it, document the mixed mode boot flow
in a code comment.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-4-ardb@kernel.org
Move the code that stores the arguments passed to the EFI entrypoint
into the .text section, so that it can be moved into a separate
compilation unit in a subsequent patch.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-3-ardb@kernel.org
In preparation for moving the mixed mode specific code out of head_64.S,
rename the existing file to clarify that it contains more than just the
mixed mode thunk.
While at it, clean up the Makefile rules that add it to the build.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20221122161017.2426828-2-ardb@kernel.org
Fix the following coccicheck warning:
./arch/x86/boot/compressed/kaslr.c:670:8-9: WARNING: return of 0/1 in
function 'process_mem_region' with return type bool.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220421202556.129799-1-jiapeng.chong@linux.alibaba.com
Rework the EFI stub macro wrappers around protocol method calls and
other indirect calls in order to allow return types other than
efi_status_t. This means the widening should be conditional on whether
or not the return type is efi_status_t, and should be omitted otherwise.
Also, switch to _Generic() to implement the type based compile time
conditionals, which is more concise, and distinguishes between
efi_status_t and u64 properly.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Andrew Cooper suggested upgrading the orphan section warning to a hard link
error. However Nathan Chancellor said outright turning the warning into an
error with no escape hatch might be too aggressive, as we have had these
warnings triggered by new compiler generated sections, and suggested turning
orphan sections into an error only if CONFIG_WERROR is set. Kees Cook echoed
and emphasized that the mandate from Linus is that we should avoid breaking
builds. It wrecks bisection, it causes problems across compiler versions, etc.
Thus upgrade the orphan section warning to a hard link error only if
CONFIG_WERROR is set.
Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Xin Li <xin3.li@intel.com>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20221025073023.16137-2-xin3.li@intel.com
Generic function-alignment infrastructure.
Architectures can select FUNCTION_ALIGNMENT_xxB symbols; the
FUNCTION_ALIGNMENT symbol is then set to the largest such selected
size, 0 otherwise.
From this the -falign-functions compiler argument and __ALIGN macro
are set.
This incorporates the DEBUG_FORCE_FUNCTION_ALIGN_64B knob and future
alignment requirements for x86_64 (later in this series) into a single
place.
NOTE: also removes the 0x90 filler byte from the generic __ALIGN
primitive, that value makes no sense outside of x86.
NOTE: .balign 0 reverts to a no-op.
Requested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220915111143.719248727@infradead.org
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It it apparently slight more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
-----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCY0HaPgAKCRDdBJ7gKXxA
joPjAQDZ5LlRCMWZ1oxLP2NOTp6nm63q9PWcGnmY50FjD/dNlwEAnx7OejCLWGWf
bbTuk6U2+TKgJa4X7+pbbejeoqnt5QU=
=xfWx
-----END PGP SIGNATURE-----
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It it apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially onverted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-unintialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added aditional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
- Remove potentially incomplete targets when Kbuid is interrupted by
SIGINT etc. in case GNU Make may miss to do that when stderr is piped
to another program.
- Rewrite the single target build so it works more correctly.
- Fix rpm-pkg builds with V=1.
- List top-level subdirectories in ./Kbuild.
- Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in kallsyms.
- Avoid two different modules in lib/zstd/ having shared code, which
potentially causes building the common code as build-in and modular
back-and-forth.
- Unify two modpost invocations to optimize the build process.
- Remove head-y syntax in favor of linker scripts for placing particular
sections in the head of vmlinux.
- Bump the minimal GNU Make version to 3.82.
- Clean up misc Makefiles and scripts.
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmM+4vcVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGY2IQAInr0JUNnkkxwUSXtOcQuA3IK8RJ
FbU9HXJRoV9H+7+l3SMlN7mIbrs5eE5fTY3iwQ3CVe139d1+1q7nvTMRv8owywJx
GBgzswncuu1lk7iQQ//CxiqMwSCG8GJdYn1uDVy4I5jg3o+DtFZJtyq2Wb7pqsMm
ZhZ4PozRN+idYQJSF6Vx/zEVLHI7quMBwfe4CME8/0Kg2+hnYzbXV/aUf0ED2emq
zdCMDQgIOK5AhY+8qgMXKYnBUJMTqBp6LoR4p3ApfUkwRFY0sGa0/LK3U/B22OE7
uWyR4fCUExGyerlcHEVev+9eBfmsLLPyqlchNwpSDOPf5OSdnKmgqJEBR/Cvx0eh
URerPk7EHxyH3G8yi+cU2GtofNTGc5RHPRgJE2ADsQEi5TAUKGmbXMlsFRL/51Vn
lTANZObBNa1d4enljF6TfTL5nuccOa+DKvXnH9fQ49t0QdtSikv6J/lGwilwm1Sr
BctmCsySPuURZfkpI9OQnLuouloMXl9f7Q/+S39haS/tSgvPpyITyO71nxDnXn/s
BbFObZJUk9QkqOACjBP1hNErTLt83uBxQ9z+rDCw/SbLIe4nw0wyneuygfHI5rI8
3RZB2DbGauuJHX2Zs6YGS14SLSY33IsLqKR1/Vy3LrPvOHuEvNiOR8LITq5E0YCK
OffK2Y5cIlXR0QWf
=DHiN
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Remove potentially incomplete targets when Kbuid is interrupted by
SIGINT etc in case GNU Make may miss to do that when stderr is piped
to another program.
- Rewrite the single target build so it works more correctly.
- Fix rpm-pkg builds with V=1.
- List top-level subdirectories in ./Kbuild.
- Ignore auto-generated __kstrtab_* and __kstrtabns_* symbols in
kallsyms.
- Avoid two different modules in lib/zstd/ having shared code, which
potentially causes building the common code as build-in and modular
back-and-forth.
- Unify two modpost invocations to optimize the build process.
- Remove head-y syntax in favor of linker scripts for placing
particular sections in the head of vmlinux.
- Bump the minimal GNU Make version to 3.82.
- Clean up misc Makefiles and scripts.
* tag 'kbuild-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (41 commits)
docs: bump minimal GNU Make version to 3.82
ia64: simplify esi object addition in Makefile
Revert "kbuild: Check if linker supports the -X option"
kbuild: rebuild .vmlinux.export.o when its prerequisite is updated
kbuild: move modules.builtin(.modinfo) rules to Makefile.vmlinux_o
zstd: Fixing mixed module-builtin objects
kallsyms: ignore __kstrtab_* and __kstrtabns_* symbols
kallsyms: take the input file instead of reading stdin
kallsyms: drop duplicated ignore patterns from kallsyms.c
kbuild: reuse mksysmap output for kallsyms
mksysmap: update comment about __crc_*
kbuild: remove head-y syntax
kbuild: use obj-y instead extra-y for objects placed at the head
kbuild: hide error checker logs for V=1 builds
kbuild: re-run modpost when it is updated
kbuild: unify two modpost invocations
kbuild: move vmlinux.o rule to the top Makefile
kbuild: move .vmlinux.objs rule to Makefile.modpost
kbuild: list sub-directories in ./Kbuild
Makefile.compiler: replace cc-ifversion with compiler-specific macros
...
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmM8QskACgkQEsHwGGHe
VUq2ZQ/+KwgmCojK54P05UOClpvf96CLDJA7r4m6ydKiM7GDWFg9wCZdews4JRk1
/5hqfkFZsEAUlloRjRk3Qvd6PWRzDX8X/jjtHn3JyzRHT6ra31tyiZmD2LEb4eb6
D0jIHfZQRYjZP39p3rYSuSMFrdWWE8gCETJLZEflR96ACwHXlm1fH/wSRI2RUG4c
sH7nT/hGqtKiDsmOcb314yjmjraYEW1mKnLKRLfjUwksBET4mOiLTjH175MQ5Yv7
cXZs0LsYvdfCqWSH5uefv32TX/yLsIi8ygaALpXawkoyXTmLr5MwJJykrm60AogV
74gvxc3s3ItO0aKVM0J4ABTUWmU+wg+sjPcJD1MolafnJpsgGdfEKlWfTY4hjMV5
onjtgr7byEdgZU25JtuI0BzPoggahnHvK6LiIvGy9vw8LRdKziKPXsyxuRF4rvXw
0n9ofVRmBCuzUsRS8vbL65K2PcIS4oUmUUSEDmALtGQ9vG8j50k6vM3Fu6HayyJx
7qgjVRpREemqRO21wS7SmR6z1RkT5J+zWv4TdacyyrA9QRqyM6ny/yZGCsfOZA77
+LxBFzITwIXlTgfTDVYnLIi1ZPP2MCK74Gq0Buqsjxz8IOpV6yjB+PSajbJzZv35
gIdbWKc5oHgmcDkrpBCoZ6KQ5ZNvDy6glSdnegkDFjRfVm5eCu0=
=RqjF
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
- The usual round of smaller fixes and cleanups all over the tree
* tag 'x86_cleanups_for_v6.1_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/cpu: Include the header of init_ia32_feat_ctl()'s prototype
x86/uaccess: Improve __try_cmpxchg64_user_asm() for x86_32
x86: Fix various duplicate-word comment typos
x86/boot: Remove superfluous type casting from arch/x86/boot/bitops.h
Instrumenting some files with KMSAN will result in kernel being unable to
link, boot or crashing at runtime for various reasons (e.g. infinite
recursion caused by instrumentation hooks calling instrumented code
again).
Completely omit KMSAN instrumentation in the following places:
- arch/x86/boot and arch/x86/realmode/rm, as KMSAN doesn't work for i386;
- arch/x86/entry/vdso, which isn't linked with KMSAN runtime;
- three files in arch/x86/kernel - boot problems;
- arch/x86/mm/cpu_entry_area.c - recursion.
Link: https://lkml.kernel.org/r/20220915150417.722975-33-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrey Konovalov <andreyknvl@gmail.com>
Cc: Andrey Konovalov <andreyknvl@google.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Eric Biggers <ebiggers@google.com>
Cc: Eric Biggers <ebiggers@kernel.org>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Ilya Leoshkevich <iii@linux.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Marco Elver <elver@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Kbuild builds init/built-in.a twice; first during the ordinary
directory descending, second from scripts/link-vmlinux.sh.
We do this because UTS_VERSION contains the build version and the
timestamp. We cannot update it during the normal directory traversal
since we do not yet know if we need to update vmlinux. UTS_VERSION is
temporarily calculated, but omitted from the update check. Otherwise,
vmlinux would be rebuilt every time.
When Kbuild results in running link-vmlinux.sh, it increments the
version number in the .version file and takes the timestamp at that
time to really fix UTS_VERSION.
However, updating the same file twice is a footgun. To avoid nasty
timestamp issues, all build artifacts that depend on init/built-in.a
are atomically generated in link-vmlinux.sh, where some of them do not
need rebuilding.
To fix this issue, this commit changes as follows:
[1] Split UTS_VERSION out to include/generated/utsversion.h from
include/generated/compile.h
include/generated/utsversion.h is generated just before the
vmlinux link. It is generated under include/generated/ because
some decompressors (s390, x86) use UTS_VERSION.
[2] Split init_uts_ns and linux_banner out to init/version-timestamp.c
from init/version.c
init_uts_ns and linux_banner contain UTS_VERSION. During the ordinary
directory descending, they are compiled with __weak and used to
determine if vmlinux needs relinking. Just before the vmlinux link,
they are compiled without __weak to embed the real version and
timestamp.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
In some cases, bootloaders will leave boot_params->cc_blob_address
uninitialized rather than zeroing it out. This field is only meant to be
set by the boot/compressed kernel in order to pass information to the
uncompressed kernel when SEV-SNP support is enabled.
Therefore, there are no cases where the bootloader-provided values
should be treated as anything other than garbage. Otherwise, the
uncompressed kernel may attempt to access this bogus address, leading to
a crash during early boot.
Normally, sanitize_boot_params() would be used to clear out such fields
but that happens too late: sev_enable() may have already initialized
it to a valid value that should not be zeroed out. Instead, have
sev_enable() zero it out unconditionally beforehand.
Also ensure this happens for !CONFIG_AMD_MEM_ENCRYPT as well by also
including this handling in the sev_enable() stub function.
[ bp: Massage commit message and comments. ]
Fixes: b190a043c4 ("x86/sev: Add SEV-SNP feature detection/setup")
Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Reported-by: watnuss@gmx.de
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=216387
Link: https://lore.kernel.org/r/20220823160734.89036-1-michael.roth@amd.com
'const void *' will auto-type-convert to just about any other const
pointer type, no need to force it.
[ mingo: Rewrote the changelog. ]
Signed-off-by: Li kunyu <kunyu@nfschina.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220725042358.3377-1-kunyu@nfschina.com
Users of GNU ld (BFD) from binutils 2.39+ will observe multiple
instances of a new warning when linking kernels in the form:
ld: warning: arch/x86/boot/pmjump.o: missing .note.GNU-stack section implies executable stack
ld: NOTE: This behaviour is deprecated and will be removed in a future version of the linker
ld: warning: arch/x86/boot/compressed/vmlinux has a LOAD segment with RWX permissions
Generally, we would like to avoid the stack being executable. Because
there could be a need for the stack to be executable, assembler sources
have to opt-in to this security feature via explicit creation of the
.note.GNU-stack feature (which compilers create by default) or command
line flag --noexecstack. Or we can simply tell the linker the
production of such sections is irrelevant and to link the stack as
--noexecstack.
LLVM's LLD linker defaults to -z noexecstack, so this flag isn't
strictly necessary when linking with LLD, only BFD, but it doesn't hurt
to be explicit here for all linkers IMO. --no-warn-rwx-segments is
currently BFD specific and only available in the current latest release,
so it's wrapped in an ld-option check.
While the kernel makes extensive usage of ELF sections, it doesn't use
permissions from ELF segments.
Link: https://lore.kernel.org/linux-block/3af4127a-f453-4cf7-f133-a181cce06f73@kernel.dk/
Link: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=ba951afb99912da01a6e8434126b8fac7aa75107
Link: https://github.com/llvm/llvm-project/issues/57009
Reported-and-tested-by: Jens Axboe <axboe@kernel.dk>
Suggested-by: Fangrui Song <maskray@google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The decompressed kernel initially relies on the identity map set up by
the boot/compressed kernel for accessing things like boot_params. With
the recent introduction of SEV-SNP support, the decompressed kernel
also needs to access the setup_data entries pointed to by
boot_params->hdr.setup_data.
This can lead to a crash in the kexec kernel during early boot due to
these entries not currently being included in the initial identity map,
see thread at Link below.
Include mappings for the setup_data entries in the initial identity map.
[ bp: Massage commit message and use a helper var for better readability. ]
Fixes: b190a043c4 ("x86/sev: Add SEV-SNP feature detection/setup")
Reported-by: Jun'ichi Nomura <junichi.nomura@nec.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/TYCPR01MB694815CD815E98945F63C99183B49@TYCPR01MB6948.jpnprd01.prod.outlook.com
- fix new DXE service invocations for mixed mode
- use correct Kconfig symbol when setting PE header flag
- clean up the drivers/firmware/efi Kconfig dependencies so that
features that depend on CONFIG_EFI are hidden from the UI when the
symbol is not enabled.
Also included is a RISC-V bugfix from Heinrich to avoid read-write
mappings of read-only firmware regions in the EFI page tables.
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmKXk5YACgkQw08iOZLZ
jyQQ0wv/cB9Z9kJur3wJqj75HFEly7bwSk5oxJ+txRytApSaRYnqm7l4WeP3QQ8c
o9GzZZNwoRQSx1mCBJefaO4s8fA24QkIeD8Oy4MeucKaPX1UbNc6Z84srOynjpSj
mOyIYB+kurxsCBKmzQQBy8txIWld4EkrMhEoc1h2L4d2+OVRvIlsu1PMv03eCiww
4Sop0yO5CydEpjxJDCfwol0L/PBiXc2PfRs2FdHFwOSQaisQLxhNruCnovyS9Zwk
zLkhYC5dS+OZctknl6XMzOAi3x7sNYzVwNf4+yhFeU2cTuj3kJWnEAqs3CU/tiPO
DOobLg/r/j7H44Nsc/8aJGT4GPNrbUrb6aOcfrBAkxvsu1Sp/k/UfSMZLS9fU1gC
XUUl46NXG1yFpCntruQm5SMytVKdtlyUu7pPa+Ijmr+vc6UWl1XJq26J3UpiiFYT
mjrer5gvzrnhuvUjIb4ulKoNMdoOQQMtofLxUGuc1u/53jWHxbiKt7/QvyFepJVe
zi/ikD/v
=7wiT
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull more EFI updates from Ard Biesheuvel:
"Follow-up tweaks for EFI changes - they mostly address issues
introduced this merge window, except for Heinrich's patch:
- fix new DXE service invocations for mixed mode
- use correct Kconfig symbol when setting PE header flag
- clean up the drivers/firmware/efi Kconfig dependencies so that
features that depend on CONFIG_EFI are hidden from the UI when the
symbol is not enabled.
Also included is a RISC-V bugfix from Heinrich to avoid read-write
mappings of read-only firmware regions in the EFI page tables"
* tag 'efi-next-for-v5.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: clean up Kconfig dependencies on CONFIG_EFI
efi/x86: libstub: Make DXE calls mixed mode safe
efi: x86: Fix config name for setting the NX-compatibility flag in the PE header
riscv: read-only pages should not be writable
Commit 21b68da7bf4a ("efi: x86: Set the NX-compatibility flag in the PE
header") intends to set the compatibility flag, i.e.,
IMAGE_DLL_CHARACTERISTICS_NX_COMPAT, but this ifdef is actually dead as
the CONFIG_DXE_MEM_ATTRIBUTES Kconfig option does not exist.
The config is actually called EFI_DXE_MEM_ATTRIBUTES. Adjust the ifdef
to use the intended config name.
The issue was identified with ./scripts/checkkconfigsymbols.py.
Fixes: 21b68da7bf4a ("efi: x86: Set the NX-compatibility flag in the PE header")
Signed-off-by: Lukas Bulwahn <lukas.bulwahn@gmail.com>
Link: https://lore.kernel.org/r/20220601115043.7678-1-lukas.bulwahn@gmail.com
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config
- Support W=e as a shorthand for KCFLAGS=-Werror
- Fix CONFIG_IKHEADERS build to support toybox cpio
- Add scripts/dummy-tools/pahole to ease distro packagers' life
- Suppress false-positive warnings from checksyscalls.sh for W=2 build
- Factor out the common code of arch/*/boot/install.sh into
scripts/install.sh
- Support 'kernel-install' tool in scripts/prune-kernel
- Refactor module-versioning to link the symbol versions at the final
link of vmlinux and modules
- Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
an arch-agnostic way
- Refactor modpost, Makefiles
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmKOO2oVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGG54P/3/U5FIP5EoPAVu9HqSUKeeUiBYc
z1B8d7Wt1xU0xHImPWNjoacfye4MrDMUv8mEWKgHCVusJxbUoS+3Z/kd64NU75Fg
Cpj+9fP1N8m02IJzraxn6fw0bmfx4zp9Zsa9l2fjwL0emq4qhB7BA9/Nl6Png1IW
p0TPR6gV0Wgp6ikf/eJ3b1decFSqM7QzDlbo860nPMG164gNpDZmFVf2G4HCRQoY
GtgoQLEy2pBeOdU7+nJTKl2f5JOhDjRKX8equ7BHW9l7nbUvHd6ys3DGqYO3nvwV
hZZdHwDtxxO6bJtzClKPREyfL2H9R2AGxq94HzSwdvwdLLoFxrTN+mg88xBg17Rm
tKHy8jpZT36qh218h5lX5n9ZWcovTA38giZ+S/tkwOvvYGivKHDS23QwzB0HrG8/
VRd+0rhfIvuIpu0OQaTpTkZr2QVci2Zn6PPnxpyPEsGvWVFRjyx0WyZh4fFXnkQT
n+GS7j5g1LVMra0qu0y+yp4zy/DVFKIcfry0xU8S5SaSEBBcWUxLS2nnoBVB4vb2
RpiVD2vaOlvu/Zs2SOgtuMOnTw+Qqrvh7OYm/WyxWrB3JQGa/r+vipMKiFEDi2NN
pwR8wJT/CW1ycte93m3oO83jiitFqzXtAqo24wKlp4SOqnR/TQ/dx743ku2xvONe
uynJVW/gZVm4KEUl
=Y2TB
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add HOSTPKG_CONFIG env variable to allow users to override pkg-config
- Support W=e as a shorthand for KCFLAGS=-Werror
- Fix CONFIG_IKHEADERS build to support toybox cpio
- Add scripts/dummy-tools/pahole to ease distro packagers' life
- Suppress false-positive warnings from checksyscalls.sh for W=2 build
- Factor out the common code of arch/*/boot/install.sh into
scripts/install.sh
- Support 'kernel-install' tool in scripts/prune-kernel
- Refactor module-versioning to link the symbol versions at the final
link of vmlinux and modules
- Remove CONFIG_MODULE_REL_CRCS because module-versioning now works in
an arch-agnostic way
- Refactor modpost, Makefiles
* tag 'kbuild-v5.19' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (56 commits)
genksyms: adjust the output format to modpost
kbuild: stop merging *.symversions
kbuild: link symbol CRCs at final link, removing CONFIG_MODULE_REL_CRCS
modpost: extract symbol versions from *.cmd files
modpost: add sym_find_with_module() helper
modpost: change the license of EXPORT_SYMBOL to bool type
modpost: remove left-over cross_compile declaration
kbuild: record symbol versions in *.cmd files
kbuild: generate a list of objects in vmlinux
modpost: move *.mod.c generation to write_mod_c_files()
modpost: merge add_{intree_flag,retpoline,staging_flag} to add_header
scripts/prune-kernel: Use kernel-install if available
kbuild: factor out the common installation code into scripts/install.sh
modpost: split new_symbol() to symbol allocation and hash table addition
modpost: make sym_add_exported() always allocate a new symbol
modpost: make multiple export error
modpost: dump Module.symvers in the same order of modules.order
modpost: traverse the namespace_list in order
modpost: use doubly linked list for dump_lists
modpost: traverse unresolved symbols in order
...
config debug options when trying to debug an issue
- A gcc12 build warnings fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLfcsACgkQEsHwGGHe
VUqfPQ/+JAQ1UxXFNWqr0LEYwo58d5p4QSGrHrNfzOtoxQfuK6aYnpOicKcjmKyo
HZAujMzlby8nworbNDo/wGBBFqCsJ8pj9v30BdClbGT671wN25y9WmK367RLtRam
dk+nOpTvIWbydDXP6tuOdqPpFdT+XPljVxLuO215kOAZmQtqmQ2cOrVprbn/OMoo
qqFZXjpazpoQButHBh8sI2nl5Y06JCZX5S5FRFTH+tfzfcEKXcbO2yOksU+L7oUc
TyfJmtytT1O/uschAH0lNExIBQKUUtnXzzLNRE+ix9k9RTFQAOKNPrFTWqeJPEZe
ZLuXZgBjdLO6IEgtaKFlpQml3uM5DSr3A6nBg9h+6xbwL1+GujoY3nblqD8W59wK
GUjUmKC2xRXSLEpRGCVnDmYIOIzYWlw04DSNNApij8/H2mzm/noCAQmEgfy7dh6n
N4duLyliqWl0bZQlhou19Hw9yGNqphVMRWCYRsEt+NQVqmpcOvM4A9r9RlaJoGaA
bgk4sUCmO2bQ3PHfcv+833+GCCpobutYOsWQw7tborPsOh4p9GN/9IdxaCCqpChW
ddXkKSTGezeUB+pe7Cixfkb5tHcQAVzCeHIFrsYho8gesiL/LXKJX8hQuo10cmVa
qOSJAvlTBeW84+mK93kKfcig/iiyZfDkXEq0SJ8oeD1idNDaRUY=
=oO1t
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
- Add a "make x86_debug.config" target which enables a bunch of useful
config debug options when trying to debug an issue
- A gcc-12 build warnings fix
* tag 'x86_build_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot: Wrap literal addresses in absolute_pointer()
x86/configs: Add x86 debugging Kconfig fragment plus docs
This is the Intel version of a confidential computing solution called
Trust Domain Extensions (TDX). This series adds support to run the
kernel as part of a TDX guest. It provides similar guest protections to
AMD's SEV-SNP like guest memory and register state encryption, memory
integrity protection and a lot more.
Design-wise, it differs from AMD's solution considerably: it uses
a software module which runs in a special CPU mode called (Secure
Arbitration Mode) SEAM. As the name suggests, this module serves as sort
of an arbiter which the confidential guest calls for services it needs
during its lifetime.
Just like AMD's SNP set, this series reworks and streamlines certain
parts of x86 arch code so that this feature can be properly accomodated.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLbisACgkQEsHwGGHe
VUqZLg/7B55iygCwzz0W/KLcXL2cISatUpzGbFs1XTbE9DMz06BPkOsEjF2k8ckv
kfZjgqhSx3GvUI80gK0Tn2M2DfIj3nKuNSXd1pfextP7AxEf68FFJsQz1Ju7bHpT
pZaG+g8IK4+mnEHEKTCO9ANg/Zw8yqJLdtsCaCNE9SUGUfQ6m/ujTEfsambXDHNm
khyCAgpIGSOt51/4apoR9ebyrNCaeVbDawpIPjTy+iyFRc/WyaLFV9CQ8klw4gbw
r/90x2JYxvAf0/z/ifT9Wa+TnYiQ0d4VjFbfr0iJ4GcPn5L3EIoIKPE8vPGMpoSX
fLSzoNmAOT3ja57ytUUQ3o0edoRUIPEdixOebf9qWvE/aj7W37YRzrlJ8Ej/x9Jy
HcI4WZF6Dr1bh6FnI/xX2eVZRzLOL4j9gNyPCwIbvgr1NjDqQnxU7nhxVMmQhJrs
IdiEcP5WYerLKfka/uF//QfWUg5mDBgFa1/3xK57Z3j0iKWmgjaPpR0SWlOKjj8G
tr0gGN9ejikZTqXKGsHn8fv/R3bjXvbVD8z0IEcx+MIrRmZPnX2QBlg7UA1AXV5n
HoVwPFdH1QAtjZq1MRcL4hTOjz3FkS68rg7ZH0f2GWJAzWmEGytBIhECRnN/PFFq
VwRB4dCCt0bzqRxkiH5lzdgR+xqRe61juQQsMzg+Flv/trpXDqM=
=ac9K
-----END PGP SIGNATURE-----
Merge tag 'x86_tdx_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull Intel TDX support from Borislav Petkov:
"Intel Trust Domain Extensions (TDX) support.
This is the Intel version of a confidential computing solution called
Trust Domain Extensions (TDX). This series adds support to run the
kernel as part of a TDX guest. It provides similar guest protections
to AMD's SEV-SNP like guest memory and register state encryption,
memory integrity protection and a lot more.
Design-wise, it differs from AMD's solution considerably: it uses a
software module which runs in a special CPU mode called (Secure
Arbitration Mode) SEAM. As the name suggests, this module serves as
sort of an arbiter which the confidential guest calls for services it
needs during its lifetime.
Just like AMD's SNP set, this series reworks and streamlines certain
parts of x86 arch code so that this feature can be properly
accomodated"
* tag 'x86_tdx_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (34 commits)
x86/tdx: Fix RETs in TDX asm
x86/tdx: Annotate a noreturn function
x86/mm: Fix spacing within memory encryption features message
x86/kaslr: Fix build warning in KASLR code in boot stub
Documentation/x86: Document TDX kernel architecture
ACPICA: Avoid cache flush inside virtual machines
x86/tdx/ioapic: Add shared bit for IOAPIC base address
x86/mm: Make DMA memory shared for TD guest
x86/mm/cpa: Add support for TDX shared memory
x86/tdx: Make pages shared in ioremap()
x86/topology: Disable CPU online/offline control for TDX guests
x86/boot: Avoid #VE during boot for TDX platforms
x86/boot: Set CR0.NE early and keep it set during the boot
x86/acpi/x86/boot: Add multiprocessor wake-up support
x86/boot: Add a trampoline for booting APs via firmware handoff
x86/tdx: Wire up KVM hypercalls
x86/tdx: Port I/O: Add early boot support
x86/tdx: Port I/O: Add runtime hypercalls
x86/boot: Port I/O: Add decompression-time support for TDX
x86/boot: Port I/O: Allow to hook up alternative helpers
...
Add to confidential guests the necessary memory integrity protection
against malicious hypervisor-based attacks like data replay, memory
remapping and others, thus achieving a stronger isolation from the
hypervisor.
At the core of the functionality is a new structure called a reverse
map table (RMP) with which the guest has a say in which pages get
assigned to it and gets notified when a page which it owns, gets
accessed/modified under the covers so that the guest can take an
appropriate action.
In addition, add support for the whole machinery needed to launch a SNP
guest, details of which is properly explained in each patch.
And last but not least, the series refactors and improves parts of the
previous SEV support so that the new code is accomodated properly and
not just bolted on.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmKLU2AACgkQEsHwGGHe
VUpb/Q//f4LGiJf4nw1flzpe90uIsHNwAafng3NOjeXmhI/EcOlqPf23WHPCgg3Z
2umfa4sRZyj4aZubDd7tYAoq4qWrQ7pO7viWCNTh0InxBAILOoMPMuq2jSAbq0zV
ASUJXeQ2bqjYxX4JV4N5f3HT2l+k68M0mpGLN0H+O+LV9pFS7dz7Jnsg+gW4ZP25
PMPLf6FNzO/1tU1aoYu80YDP1ne4eReLrNzA7Y/rx+S2NAetNwPn21AALVgoD4Nu
vFdKh4MHgtVbwaQuh0csb/+4vD+tDXAhc8lbIl+Abl9ZxJaDWtAJW5D9e2CnsHk1
NOkHwnrzizzhtGK1g56YPUVRFAWhZYMOI1hR0zGPLQaVqBnN4b+iahPeRiV0XnGE
PSbIHSfJdeiCkvLMCdIAmpE5mRshhRSUfl1CXTCdetMn8xV/qz/vG6bXssf8yhTV
cfLGPHU7gfVmsbR9nk5a8KZ78PaytxOxfIDXvCy8JfQwlIWtieaCcjncrj+sdMJy
0fdOuwvi4jma0cyYuPolKiS1Hn4ldeibvxXT7CZQlIx6jZShMbpfpTTJs11XdtHm
PdDAc1TY3AqI33mpy9DhDQmx/+EhOGxY3HNLT7evRhv4CfdQeK3cPVUWgo4bGNVv
ZnFz7nvmwpyufltW9K8mhEZV267174jXGl6/idxybnlVE7ESr2Y=
=Y8kW
-----END PGP SIGNATURE-----
Merge tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull AMD SEV-SNP support from Borislav Petkov:
"The third AMD confidential computing feature called Secure Nested
Paging.
Add to confidential guests the necessary memory integrity protection
against malicious hypervisor-based attacks like data replay, memory
remapping and others, thus achieving a stronger isolation from the
hypervisor.
At the core of the functionality is a new structure called a reverse
map table (RMP) with which the guest has a say in which pages get
assigned to it and gets notified when a page which it owns, gets
accessed/modified under the covers so that the guest can take an
appropriate action.
In addition, add support for the whole machinery needed to launch a
SNP guest, details of which is properly explained in each patch.
And last but not least, the series refactors and improves parts of the
previous SEV support so that the new code is accomodated properly and
not just bolted on"
* tag 'x86_sev_for_v5.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
x86/entry: Fixup objtool/ibt validation
x86/sev: Mark the code returning to user space as syscall gap
x86/sev: Annotate stack change in the #VC handler
x86/sev: Remove duplicated assignment to variable info
x86/sev: Fix address space sparse warning
x86/sev: Get the AP jump table address from secrets page
x86/sev: Add missing __init annotations to SEV init routines
virt: sevguest: Rename the sevguest dir and files to sev-guest
virt: sevguest: Change driver name to reflect generic SEV support
x86/boot: Put globals that are accessed early into the .data section
x86/boot: Add an efi.h header for the decompressor
virt: sevguest: Fix bool function returning negative value
virt: sevguest: Fix return value check in alloc_shared_pages()
x86/sev-es: Replace open-coded hlt-loop with sev_es_terminate()
virt: sevguest: Add documentation for SEV-SNP CPUID Enforcement
virt: sevguest: Add support to get extended report
virt: sevguest: Add support to derive key
virt: Add SEV-SNP guest driver
x86/sev: Register SEV-SNP guest request platform device
x86/sev: Provide support for SNP guest request NAEs
...
GCC 11 (incorrectly[1]) assumes that literal values cast to (void *)
should be treated like a NULL pointer with an offset, and raises
diagnostics when doing bounds checking under -Warray-bounds. GCC 12
got "smarter" about finding these:
In function 'rdfs8',
inlined from 'vga_recalc_vertical' at /srv/code/arch/x86/boot/video-mode.c:124:29,
inlined from 'set_mode' at /srv/code/arch/x86/boot/video-mode.c:163:3:
/srv/code/arch/x86/boot/boot.h:114:9: warning: array subscript 0 is outside array bounds of 'u8[0]' {aka 'unsigned char[]'} [-Warray-bounds]
114 | asm volatile("movb %%fs:%1,%0" : "=q" (v) : "m" (*(u8 *)addr));
| ^~~
This has been solved in other places[2] already by using the recently
added absolute_pointer() macro. Do the same here.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99578
[2] https://lore.kernel.org/all/20210912160149.2227137-1-linux@roeck-us.net/
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Link: https://lore.kernel.org/r/20220227195918.705219-1-keescook@chromium.org
Many architectures have similar install.sh scripts.
The first half is really generic; it verifies that the kernel image
and System.map exist, then executes ~/bin/${INSTALLKERNEL} or
/sbin/${INSTALLKERNEL} if available.
The second half is kind of arch-specific; it copies the kernel image
and System.map to the destination, but the code is slightly different.
Factor out the generic part into scripts/install.sh.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
Following Baskov Evgeniy's "Handle UEFI NX-restricted page tables"
patches, it's safe to set this compatibility flag to let loaders know
they don't need to make special accommodations for kernel to load if
pre-boot NX is enabled.
Signed-off-by: Peter Jones <pjones@redhat.com>
Link: https://lore.kernel.org/all/20220329184743.798513-1-pjones@redhat.com/
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
The helpers in arch/x86/boot/compressed/efi.c might be used during
early boot to access the EFI system/config tables, and in some cases
these EFI helpers might attempt to print debug/error messages, before
console_init() has been called.
__putstr() checks some variables to avoid printing anything before
the console has been initialized, but this isn't enough since those
variables live in .bss, which may not have been cleared yet. This can
lead to a triple-fault occurring, primarily when booting in legacy/CSM
mode (where EFI helpers will attempt to print some debug messages).
Fix this by declaring these globals in .data section instead so there
is no dependency on .bss being cleared before accessing them.
Fixes: c01fce9cef ("x86/compressed: Add SEV-SNP feature detection/setup")
Reported-by: Borislav Petkov <bp@suse.de>
Suggested-by: Thomas Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220420152613.145077-1-michael.roth@amd.com
There are a few MSRs and control register bits that the kernel
normally needs to modify during boot. But, TDX disallows
modification of these registers to help provide consistent security
guarantees. Fortunately, TDX ensures that these are all in the correct
state before the kernel loads, which means the kernel does not need to
modify them.
The conditions to avoid are:
* Any writes to the EFER MSR
* Clearing CR4.MCE
This theoretically makes the guest boot more fragile. If, for instance,
EFER was set up incorrectly and a WRMSR was performed, it will trigger
early exception panic or a triple fault, if it's before early
exceptions are set up. However, this is likely to trip up the guest
BIOS long before control reaches the kernel. In any case, these kinds
of problems are unlikely to occur in production environments, and
developers have good debug tools to fix them quickly.
Change the common boot code to work on TDX and non-TDX systems.
This should have no functional effect on non-TDX systems.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-24-kirill.shutemov@linux.intel.com
TDX guest requires CR0.NE to be set. Clearing the bit triggers #GP(0).
If CR0.NE is 0, the MS-DOS compatibility mode for handling floating-point
exceptions is selected. In this mode, the software exception handler for
floating-point exceptions is invoked externally using the processor’s
FERR#, INTR, and IGNNE# pins.
Using FERR# and IGNNE# to handle floating-point exception is deprecated.
CR0.NE=0 also limits newer processors to operate with one logical
processor active.
Kernel uses CR0_STATE constant to initialize CR0. It has NE bit set.
But during early boot kernel has more ad-hoc approach to setting bit
in the register. During some of this ad-hoc manipulation, CR0.NE is
cleared. This causes a #GP in TDX guests and makes it die in early boot.
Make CR0 initialization consistent, deriving the initial value of CR0
from CR0_STATE. Since CR0_STATE always has CR0.NE=1, this ensures that
CR0.NE is never 0 and avoids the #GP.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-23-kirill.shutemov@linux.intel.com
Port I/O instructions trigger #VE in the TDX environment. In response to
the exception, kernel emulates these instructions using hypercalls.
But during early boot, on the decompression stage, it is cumbersome to
deal with #VE. It is cleaner to go to hypercalls directly, bypassing #VE
handling.
Hook up TDX-specific port I/O helpers if booting in TDX environment.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220405232939.73860-17-kirill.shutemov@linux.intel.com
Port I/O instructions trigger #VE in the TDX environment. In response to
the exception, kernel emulates these instructions using hypercalls.
But during early boot, on the decompression stage, it is cumbersome to
deal with #VE. It is cleaner to go to hypercalls directly, bypassing #VE
handling.
Add a way to hook up alternative port I/O helpers in the boot stub with
a new pio_ops structure. For now, set the ops structure to just call
the normal I/O operation functions.
out*()/in*() macros redefined to use pio_ops callbacks. It eliminates
need in changing call sites. io_delay() changed to use port I/O helper
instead of inline assembly.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220405232939.73860-16-kirill.shutemov@linux.intel.com
There are two implementations of port I/O helpers: one in the kernel and
one in the boot stub.
Move the helpers required for both to <asm/shared/io.h> and use the one
implementation everywhere.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20220405232939.73860-15-kirill.shutemov@linux.intel.com
The early decompression code does port I/O for its console output. But,
handling the decompression-time port I/O demands a different approach
from normal runtime because the IDT required to support #VE based port
I/O emulation is not yet set up. Paravirtualizing I/O calls during
the decompression step is acceptable because the decompression code
doesn't have a lot of call sites to IO instruction.
To support port I/O in decompression code, TDX must be detected before
the decompression code might do port I/O. Detect whether the kernel runs
in a TDX guest.
Add an early_is_tdx_guest() interface to query the cached TDX guest
status in the decompression code.
TDX is detected with CPUID. Make cpuid_count() accessible outside
boot/cpuflags.c.
TDX detection in the main kernel is very similar. Move common bits
into <asm/shared/tdx.h>.
The actual port I/O paravirtualization will come later in the series.
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lkml.kernel.org/r/20220405232939.73860-13-kirill.shutemov@linux.intel.com
SEV-SNP guests will be provided the location of special 'secrets' and
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the run-time kernel either through a boot_params field that
was initialized by the boot/compressed kernel, or via a setup_data
structure as defined by the Linux Boot Protocol.
Locate the Confidential Computing blob from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
run-time kernel will use when servicing CPUID instructions via a #VC
handler.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-40-brijesh.singh@amd.com
Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-39-brijesh.singh@amd.com
The run-time kernel will need to access the Confidential Computing blob
very early during boot to access the CPUID table it points to. At that
stage, it will be relying on the identity-mapped page table set up by
the boot/compressed kernel, so make sure the blob and the CPUID table it
points to are mapped in advance.
[ bp: Massage. ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-38-brijesh.singh@amd.com
SEV-specific code will need to add some additional mappings, but doing
this within ident_map_64.c requires some SEV-specific helpers to be
exported and some SEV-specific struct definitions to be pulled into
ident_map_64.c. Instead, export add_identity_map() so SEV-specific (and
other subsystem-specific) code can be better contained outside of
ident_map_64.c.
While at it, rename the function to kernel_add_identity_map(), similar
to the kernel_ident_mapping_init() function it relies upon.
No functional changes.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-37-brijesh.singh@amd.com
SEV-SNP guests will be provided the location of special 'secrets'
'CPUID' pages via the Confidential Computing blob. This blob is
provided to the boot kernel either through an EFI config table entry,
or via a setup_data structure as defined by the Linux Boot Protocol.
Locate the Confidential Computing from these sources and, if found,
use the provided CPUID page/table address to create a copy that the
boot kernel will use when servicing CPUID instructions via a #VC CPUID
handler.
[ bp: s/cpuid/CPUID/ ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-36-brijesh.singh@amd.com
Initial/preliminary detection of SEV-SNP is done via the Confidential
Computing blob. Check for it prior to the normal SEV/SME feature
initialization, and add some sanity checks to confirm it agrees with
SEV-SNP CPUID/MSR bits.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-35-brijesh.singh@amd.com
This code will also be used later for SEV-SNP-validated CPUID code in
some cases, so move it to a common helper.
While here, also add a check to terminate in cases where the CPUID
function/subfunction is indexed and the subfunction is non-zero, since
the GHCB MSR protocol does not support non-zero subfunctions.
Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-32-brijesh.singh@amd.com
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.
In this instance, the current acpi.c kexec handling is mainly used to
get the alternative EFI config table address provided by kexec via a
setup_data entry of type SETUP_EFI. If not present, the code then falls
back to normal EFI config table address provided by EFI system table.
This would need to be done by all call-sites attempting to access the
EFI config table, so just have efi_get_conf_table() handle that
automatically.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-29-brijesh.singh@amd.com
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.
[ bp: Unbreak unnecessarily broken lines. ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-28-brijesh.singh@amd.com
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.
[ bp: Remove superfluous zeroing of a stack variable. ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-27-brijesh.singh@amd.com
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related
code into a set of helpers that can be re-used for that purpose.
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-26-brijesh.singh@amd.com
Future patches for SEV-SNP-validated CPUID will also require early
parsing of the EFI configuration. Incrementally move the related code
into a set of helpers that can be re-used for that purpose.
First, carve out the functionality which determines the EFI environment
type the machine is booting on.
[ bp: Massage commit message. ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-25-brijesh.singh@amd.com
The SEV-SNP guest is required by the GHCB spec to register the GHCB's
Guest Physical Address (GPA). This is because the hypervisor may prefer
that a guest use a consistent and/or specific GPA for the GHCB associated
with a vCPU. For more information, see the GHCB specification section
"GHCB GPA Registration".
If hypervisor can not work with the guest provided GPA then terminate the
guest boot.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Link: https://lore.kernel.org/r/20220307213356.2797205-17-brijesh.singh@amd.com
Many of the integrity guarantees of SEV-SNP are enforced through the
Reverse Map Table (RMP). Each RMP entry contains the GPA at which a
particular page of DRAM should be mapped. The VMs can request the
hypervisor to add pages in the RMP table via the Page State Change
VMGEXIT defined in the GHCB specification.
Inside each RMP entry is a Validated flag; this flag is automatically
cleared to 0 by the CPU hardware when a new RMP entry is created for a
guest. Each VM page can be either validated or invalidated, as indicated
by the Validated flag in the RMP entry. Memory access to a private page
that is not validated generates a #VC. A VM must use the PVALIDATE
instruction to validate a private page before using it.
To maintain the security guarantee of SEV-SNP guests, when transitioning
pages from private to shared, the guest must invalidate the pages before
asking the hypervisor to change the page state to shared in the RMP table.
After the pages are mapped private in the page table, the guest must
issue a page state change VMGEXIT to mark the pages private in the RMP
table and validate them.
Upon boot, BIOS should have validated the entire system memory.
During the kernel decompression stage, early_setup_ghcb() uses
set_page_decrypted() to make the GHCB page shared (i.e. clear encryption
attribute). And while exiting from the decompression, it calls
set_page_encrypted() to make the page private.
Add snp_set_page_{private,shared}() helpers that are used by
set_page_{decrypted,encrypted}() to change the page state in the RMP
table.
[ bp: Massage commit message and comments. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-16-brijesh.singh@amd.com
The Virtual Machine Privilege Level (VMPL) feature in the SEV-SNP
architecture allows a guest VM to divide its address space into four
levels. The level can be used to provide hardware isolated abstraction
layers within a VM. VMPL0 is the highest privilege level, and VMPL3 is
the least privilege level. Certain operations must be done by the VMPL0
software, such as:
* Validate or invalidate memory range (PVALIDATE instruction)
* Allocate VMSA page (RMPADJUST instruction when VMSA=1)
The initial SNP support requires that the guest kernel is running at
VMPL0. Add such a check to verify the guest is running at level 0 before
continuing the boot. There is no easy method to query the current VMPL
level, so use the RMPADJUST instruction to determine whether the guest
is running at the VMPL0.
[ bp: Massage commit message. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-15-brijesh.singh@amd.com
Version 2 of the GHCB specification added the advertisement of features
that are supported by the hypervisor. If the hypervisor supports SEV-SNP
then it must set the SEV-SNP features bit to indicate that the base
functionality is supported.
Check that feature bit while establishing the GHCB; if failed, terminate
the guest.
Version 2 of the GHCB specification adds several new Non-Automatic Exits
(NAEs), most of them are optional except the hypervisor feature. Now
that the hypervisor feature NAE is implemented, bump the GHCB maximum
supported protocol version.
While at it, move the GHCB protocol negotiation check from the #VC
exception handler to sev_enable() so that all feature detection happens
before the first #VC exception.
While at it, document why the GHCB page cannot be setup from
load_stage2_idt().
[ bp: Massage commit message. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-13-brijesh.singh@amd.com
The GHCB specification defines the reason code for reason set 0. The
reason codes defined in the set 0 do not cover all possible causes for a
guest to request termination.
The reason sets 1 to 255 are reserved for the vendor-specific codes.
Reserve the reason set 1 for the Linux guest. Define the error codes for
reason set 1 so that one can have meaningful termination reasons and thus
better guest failure diagnosis.
While at it, change sev_es_terminate() to accept a reason set parameter.
[ bp: Massage commit message. ]
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Venu Busireddy <venu.busireddy@oracle.com>
Link: https://lore.kernel.org/r/20220307213356.2797205-11-brijesh.singh@amd.com
With upcoming SEV-SNP support, SEV-related features need to be
initialized earlier during boot, at the same point the initial #VC
handler is set up, so that the SEV-SNP CPUID table can be utilized
during the initial feature checks. Also, SEV-SNP feature detection
will rely on EFI helper functions to scan the EFI config table for the
Confidential Computing blob, and so would need to be implemented at
least partially in C.
Currently set_sev_encryption_mask() is used to initialize the
sev_status and sme_me_mask globals that advertise what SEV/SME features
are available in a guest. Rename it to sev_enable() to better reflect
that (SME is only enabled in the case of SEV guests in the
boot/compressed kernel), and move it to just after the stage1 #VC
handler is set up so that it can be used to initialize SEV-SNP as well
in future patches.
While at it, re-implement it as C code so that all SEV feature
detection can be better consolidated with upcoming SEV-SNP feature
detection, which will also be in C.
The 32-bit entry path remains unchanged, as it never relied on the
set_sev_encryption_mask() initialization to begin with.
[ bp: Massage commit message. ]
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-8-brijesh.singh@amd.com
Update all C code to use the new boot_rdmsr()/boot_wrmsr() helpers
instead of relying on inline assembly.
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-7-brijesh.singh@amd.com
The current set of helpers used throughout the run-time kernel have
dependencies on code/facilities outside of the boot kernel, so there
are a number of call-sites throughout the boot kernel where inline
assembly is used instead. More will be added with subsequent patches
that add support for SEV-SNP, so take the opportunity to provide a basic
set of helpers that can be used by the boot kernel to reduce reliance on
inline assembly.
Use boot_* prefix so that it's clear these are helpers specific to the
boot kernel to avoid any confusion with the various other MSR read/write
helpers.
[ bp: Disambiguate parameter names and trim comment. ]
Suggested-by: Borislav Petkov <bp@alien8.de>
Signed-off-by: Michael Roth <michael.roth@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20220307213356.2797205-6-brijesh.singh@amd.com
- Enable strict FORTIFY_SOURCE compile-time validation of memcpy buffers
- Add Clang features needed for FORTIFY_SOURCE support
- Enable FORTIFY_SOURCE for Clang where possible
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmI+NxwWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhnPEACI1AUB9OHzL+VbLhX6zzvPuFRm
7MC11PWyPTa4tkhKGTlVvYbHKwrfcJyAG85rKpz5euWVlzVFkifouT4YAG959CYK
OGUj9WXPRpQ3IIPXXazZOtds4T5sP/m6dSts2NaRIX4w0NKOo3p2mlxUaYoagH1Z
j178epRJ+lbUwPdBmGsSGceb5qDKqubz/sXh51lY3YoLdMZGiom6FLva4STenzZq
SBEJqD2AM0tPWSkrue4OCRig7IsiLhzLvP8jC303suLLHn3eVTvoIT+RRBvwFqXo
MX9B6i3DdCjbWoOg9gA0Jhc6+2+kP7MU1MO6WfWP6IVZh2V1pk4Avmgxy6ypxfwU
fMNqH7CrFmojKOWqF55/1zfrQNNLqnHD3HiDAHpCtATN8kpcZGZXMUb3kT4FIij1
2Mcf6mBQOSqZTg4OvgKzPWGZYJe3KJp5lup5zhWmcOSV0o2gNhFCwXHEmhlNRLzw
idnbghjqBE74UcThQQjyWNBldzdPWVAjgaD696CnziRDCtHiTsrQaIrRsjx9P8NX
3GpoIp0vqDFG4SjFkuGishmlyMWXb3B2Ij7s2WCCSYRHLgOUJQgkhkw5wNZ7F2zD
qjEXaRZXecG5W/gwA4Ak9I2o6oKaK5HPMhNxYp7mlbceYcnuw9gSqeqRAgqX9LJA
kg7orn733jgfMrGhHw==
=8qRJ
-----END PGP SIGNATURE-----
Merge tag 'memcpy-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull FORTIFY_SOURCE updates from Kees Cook:
"This series consists of two halves:
- strict compile-time buffer size checking under FORTIFY_SOURCE for
the memcpy()-family of functions (for extensive details and
rationale, see the first commit)
- enabling FORTIFY_SOURCE for Clang, which has had many overlapping
bugs that we've finally worked past"
* tag 'memcpy-v5.18-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
fortify: Add Clang support
fortify: Make sure strlen() may still be used as a constant expression
fortify: Use __diagnose_as() for better diagnostic coverage
fortify: Make pointer arguments const
Compiler Attributes: Add __diagnose_as for Clang
Compiler Attributes: Add __overloadable for Clang
Compiler Attributes: Add __pass_object_size for Clang
fortify: Replace open-coded __gnu_inline attribute
fortify: Update compile-time tests for Clang 14
fortify: Detect struct member overflows in memset() at compile-time
fortify: Detect struct member overflows in memmove() at compile-time
fortify: Detect struct member overflows in memcpy() at compile-time
Since switch to simplefb/simpledrm VESA graphic mode selection with vga=
kernel parameter is no longer available with legacy BIOS.
The x86 realmode boot code enables the VESA graphic modes when option
FB_BOOT_VESA_SUPPORT is enabled.
This option is selected by vesafb but not simplefb/simpledrm.
To enable use of VESA modes with simplefb in legacy BIOS boot mode drop
dependency of BOOT_VESA_SUPPORT on FB, also drop the FB_ prefix. Select
the option from sysfb rather than the drivers that depend on it.
The BOOT_VESA_SUPPORT is not specific to framebuffer but rather to x86
platform, move it from fbdev to x86 Kconfig.
Fixes: e3263ab389 ("x86: provide platform-devices for boot-framebuffers")
Signed-off-by: Michal Suchanek <msuchanek@suse.de>
Reviewed-by: Javier Martinez Canillas <javierm@redhat.com>
Acked-by: Borislav Petkov <bp@suse.de>
Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patchwork.freedesktop.org/patch/msgid/948c39940a4e99f5b43bdbcbe537faae71a43e1d.1645822213.git.msuchanek@suse.de
Now that we have SYM_FUNC_ALIAS() and SYM_FUNC_ALIAS_WEAK(), use those
to simplify the definition of function aliases across arch/x86.
For clarity, where there are multiple annotations such as
EXPORT_SYMBOL(), I've tried to keep annotations grouped by symbol. For
example, where a function has a name and an alias which are both
exported, this is organised as:
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
EXPORT_SYMBOL(func)
SYM_FUNC_ALIAS(alias, func)
EXPORT_SYMBOL(alias)
Where there are only aliases and no exports or other annotations, I have
not bothered with line spacing, e.g.
SYM_FUNC_START(func)
... asm insns ...
SYM_FUNC_END(func)
SYM_FUNC_ALIAS(alias, func)
The tools/perf/ copies of memset_64.S and memset_64.S are updated
likewise to avoid the build system complaining these are mismatched:
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memcpy_64.S' differs from latest version at 'arch/x86/lib/memcpy_64.S'
| diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
| Warning: Kernel ABI header at 'tools/arch/x86/lib/memset_64.S' differs from latest version at 'arch/x86/lib/memset_64.S'
| diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S
There should be no functional change as a result of this patch.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
Acked-by: Mark Brown <broonie@kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Slaby <jslaby@suse.cz>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220216162229.1076788-4-mark.rutland@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
As done for memcpy(), also update memmove() to use the same tightened
compile-time checks under CONFIG_FORTIFY_SOURCE.
Signed-off-by: Kees Cook <keescook@chromium.org>
- Add new kconfig target 'make mod2noconfig', which will be useful to
speed up the build and test iteration.
- Raise the minimum supported version of LLVM to 11.0.0
- Refactor certs/Makefile
- Change the format of include/config/auto.conf to stop double-quoting
string type CONFIG options.
- Fix ARCH=sh builds in dash
- Separate compression macros for general purposes (cmd_bzip2 etc.) and
the ones for decompressors (cmd_bzip2_with_size etc.)
- Misc Makefile cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmHnFNIVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGiQEP/1tkt9IHP7vFvkN9xChQI8HQ7HOC
mPIxBAUzHIp1V2IALb0lfojjnpkzcMNpJZVlmqjgyYShLEPPBFwKVXs1War6GViX
aprUMz7w1zR/vZJ2fplFmrkNwSxNp3+LSE6sHVmsliS4Vfzh7CjHb8DnaKjBvQLZ
M+eQugjHsWI3d3E81/qtRG5EaVs6q8osF3b0Km59mrESWVYKqwlUP3aUyQCCUGFK
mI+zC4SrHH6EAIZd//VpaleXxVtDcjjadb7Iru5MFhFdCBIRoSC3d1IWPUNUKNnK
i0ocDXuIoAulA/mROgrpyAzLXg10qYMwwTmX+tplkHA055gKcY/v4aHym6ypH+TX
6zd34UMTLM32LSjs8hssiQT8BiZU0uZoa/m2E9IBaiExA2sTsRZxgQMKXFFaPQJl
jn4cRiG0K1NDeRKtq4xh2WO46OS4sPlR6zW9EXDEsS/bI05Y7LpUz7Flt6iA2Mq3
0g8uYIYr/9drl96X83tFgTkxxB6lpB29tbsmsrKJRGxvrCDnAhXlXhPCkMajkm2Q
PjJfNtMFzwemSZWq09+F+X5BgCjzZtroOdFI9FTMNhGWyaUJZXCtcXQ6UTIKnTHO
cDjcURvh+l56eNEQ5SMTNtAkxB+pX8gPUmyO1wLwRUT4YodxylkTUXGyBBR9tgTn
Yks1TnPD06ld364l
=8BQf
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add new kconfig target 'make mod2noconfig', which will be useful to
speed up the build and test iteration.
- Raise the minimum supported version of LLVM to 11.0.0
- Refactor certs/Makefile
- Change the format of include/config/auto.conf to stop double-quoting
string type CONFIG options.
- Fix ARCH=sh builds in dash
- Separate compression macros for general purposes (cmd_bzip2 etc.) and
the ones for decompressors (cmd_bzip2_with_size etc.)
- Misc Makefile cleanups
* tag 'kbuild-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (34 commits)
kbuild: add cmd_file_size
arch: decompressor: remove useless vmlinux.bin.all-y
kbuild: rename cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}
kbuild: drop $(size_append) from cmd_zstd
sh: rename suffix-y to suffix_y
doc: kbuild: fix default in `imply` table
microblaze: use built-in function to get CPU_{MAJOR,MINOR,REV}
certs: move scripts/extract-cert to certs/
kbuild: do not quote string values in include/config/auto.conf
kbuild: do not include include/config/auto.conf from shell scripts
certs: simplify $(srctree)/ handling and remove config_filename macro
kbuild: stop using config_filename in scripts/Makefile.modsign
certs: remove misleading comments about GCC PR
certs: refactor file cleaning
certs: remove unneeded -I$(srctree) option for system_certificates.o
certs: unify duplicated cmd_extract_certs and improve the log
certs: use $< and $@ to simplify the key generation rule
kbuild: remove headers_check stub
kbuild: move headers_check.pl to usr/include/
certs: use if_changed to re-generate the key when the key type is changed
...
GZIP-compressed files end with 4 byte data that represents the size
of the original input. The decompressors (the self-extracting kernel)
exploit it to know the vmlinux size beforehand. To mimic the GZIP's
trailer, Kbuild provides cmd_{bzip2,lzma,lzo,lz4,xzkern,zstd22}.
Unfortunately these macros are used everywhere despite the appended
size data is only useful for the decompressors.
There is no guarantee that such hand-crafted trailers are safely ignored.
In fact, the kernel refuses compressed initramdfs with the garbage data.
That is why usr/Makefile overrides size_append to make it no-op.
To limit the use of such broken compressed files, this commit renames
the existing macros as follows:
cmd_bzip2 --> cmd_bzip2_with_size
cmd_lzma --> cmd_lzma_with_size
cmd_lzo --> cmd_lzo_with_size
cmd_lz4 --> cmd_lz4_with_size
cmd_xzkern --> cmd_xzkern_with_size
cmd_zstd22 --> cmd_zstd22_with_size
To keep the decompressors working, I updated the following Makefiles
accordingly:
arch/arm/boot/compressed/Makefile
arch/h8300/boot/compressed/Makefile
arch/mips/boot/compressed/Makefile
arch/parisc/boot/compressed/Makefile
arch/s390/boot/compressed/Makefile
arch/sh/boot/compressed/Makefile
arch/x86/boot/compressed/Makefile
I reused the current macro names for the normal usecases; they produce
the compressed data in the proper format.
I did not touch the following:
arch/arc/boot/Makefile
arch/arm64/boot/Makefile
arch/csky/boot/Makefile
arch/mips/boot/Makefile
arch/riscv/boot/Makefile
arch/sh/boot/Makefile
kernel/Makefile
This means those Makefiles will stop appending the size data.
I dropped the 'override size_append' hack from usr/Makefile.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Nicolas Schier <n.schier@avm.de>
misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
LIVEPATCH as the backtrace misses the function which is being fixed up.
- Add Straight Light Speculation mitigation support which uses a new
compiler switch -mharden-sls= which sticks an INT3 after a RET or an
indirect branch in order to block speculation after them. Reportedly,
CPUs do speculate behind such insns.
- The usual set of cleanups and improvements
-----BEGIN PGP SIGNATURE-----
iQIyBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHfKA0ACgkQEsHwGGHe
VUqLJg/2I2X2xXr5filJVaK+sQgmvDzk67DKnbxRBW2xcPF+B5sSW5yhe3G5UPW7
SJVdhQ3gHcTiliGGlBf/VE7KXbqxFN0vO4/VFHZm78r43g7OrXTxz6WXXQRJ1n67
U3YwRH3b6cqXZNFMs+X4bJt6qsGJM1kdTTZ2as4aERnaFr5AOAfQvfKbyhxLe/XA
3SakfYISVKCBQ2RkTfpMpwmqlsatGFhTC5IrvuDQ83dDsM7O+Dx1J6Gu3fwjKmie
iVzPOjCh+xTpZQp/SIZmt7MzoduZvpSym4YVyHvEnMiexQT4AmyaRthWqrhnEXY/
qOvj8/XIqxmix8EaooGqRIK0Y2ZegxkPckNFzaeC3lsWohwMIGIhNXwHNEeuhNyH
yvNGAW9Cq6NeDRgz5MRUXcimYw4P4oQKYLObS1WqFZhNMqm4sNtoEAYpai/lPYfs
zUDckgXF2AoPOsSqy3hFAVaGovAgzfDaJVzkt0Lk4kzzjX2WQiNLhmiior460w+K
0l2Iej58IajSp3MkWmFH368Jo8YfUVmkjbbpsmjsBppA08e1xamJB7RmswI/Ezj6
s5re6UioCD+UYdjWx41kgbvYdvIkkZ2RLrktoZd/hqHrOLWEIiwEbyFO2nRFJIAh
YjvPkB1p7iNuAeYcP1x9Ft9GNYVIsUlJ+hK86wtFCqy+abV+zQ==
=R52z
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Get rid of all the .fixup sections because this generates
misleading/wrong stacktraces and confuse RELIABLE_STACKTRACE and
LIVEPATCH as the backtrace misses the function which is being fixed
up.
- Add Straight Line Speculation mitigation support which uses a new
compiler switch -mharden-sls= which sticks an INT3 after a RET or an
indirect branch in order to block speculation after them. Reportedly,
CPUs do speculate behind such insns.
- The usual set of cleanups and improvements
* tag 'x86_core_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (32 commits)
x86/entry_32: Fix segment exceptions
objtool: Remove .fixup handling
x86: Remove .fixup section
x86/word-at-a-time: Remove .fixup usage
x86/usercopy: Remove .fixup usage
x86/usercopy_32: Simplify __copy_user_intel_nocache()
x86/sgx: Remove .fixup usage
x86/checksum_32: Remove .fixup usage
x86/vmx: Remove .fixup usage
x86/kvm: Remove .fixup usage
x86/segment: Remove .fixup usage
x86/fpu: Remove .fixup usage
x86/xen: Remove .fixup usage
x86/uaccess: Remove .fixup usage
x86/futex: Remove .fixup usage
x86/msr: Remove .fixup usage
x86/extable: Extend extable functionality
x86/entry_32: Remove .fixup usage
x86/entry_64: Remove .fixup usage
x86/copy_mc_64: Remove .fixup usage
...
- support taking the measurement of the initrd when loaded via the
LoadFile2 protocol
- kobject API cleanup from Greg
- some header file whitespace fixes
-----BEGIN PGP SIGNATURE-----
iQGzBAABCgAdFiEE+9lifEBpyUIVN1cpw08iOZLZjyQFAmHcnSAACgkQw08iOZLZ
jyQgagv/b41O2jok20vO9vXwWqxsjru9aOsFKMeXiITudObWaXvRmvbEeUhZIRc3
FefCemUEGUkQz1Alf23t8daJRezL3kE+Lt1R525o384INxHPiieZ2Vu+Kp1zdu6C
4cAuJu/iLbNbn1glPOAkMRRXjfVrlIaS1pC431jYk0WncEKpoE467ljP0k8jQuTq
X+S4W8dyxOObsGuRJJpmX9zsFKJ+R8dh0lc/KoyNcP5LSXIK8xrXwBitM0CF3YKD
sA8dYCrWiiA8KmY851fFQknyPBpfN+30m5DZ52uGWuuqAZ4CwN3ODSfEInhyhMqf
PhY/mXRTIPe5jAmKlHdIAe11ACB/fDtRxyMla0u1yjQgjY7CbjTlEBLNUtU+N2vs
zbemDACEL2S9NfUuF407B9gztx4j7LmaSui3qtBaGO4fn9cbsmnwM2M2j8bCLAt0
WOQSir/1gemyhFAKe4yDPjMjwpC+gMX8nYY2kmvm354Oseqt9l91VrnNDEUAROAE
zBUbds2U
=BY66
-----END PGP SIGNATURE-----
Merge tag 'efi-next-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
- support taking the measurement of the initrd when loaded via the
LoadFile2 protocol
- kobject API cleanup from Greg
- some header file whitespace fixes
* tag 'efi-next-for-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi:
efi: use default_groups in kobj_type
efi/libstub: measure loaded initrd info into the TPM
efi/libstub: consolidate initrd handling across architectures
efi/libstub: x86/mixed: increase supported argument count
efi/libstub: add prototype of efi_tcg2_protocol::hash_log_extend_event()
include/linux/efi.h: Remove unneeded whitespaces before tabs
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcFgkACgkQEsHwGGHe
VUp4VxAAuPLLNi+NKFzO+MZsJfnfQ+3w5zaB/fCEbiBFeDXU1AgMYQe5ClgjNwrG
rtAsugbF+W1MyU2CILqsrMqvuB/H1Ra4o4IeQJzLuMnZJBRpRdG7sH1RarUYdG3W
C4WkJTjmsPW3QFJUPs1EoH18zGIxHg4qrIGv9t6kAjWJi+ZnJrkHLjSrnvYSZMAM
sk0jhe1d0NupLh2gmQiTKAd7ULaWeL0RQLRW7ZhTJpaWAMIkL0z2B4azzQb11dgW
m6zRsPtOa+MRBI01OBrEEwoCexAkK8tuFrqV1QdylKhDUsDBTQCyM8zY3i/Fu40U
OVERtDCR+kcjLJ9cWeJdysNGAT91R2QLPrxv6wFod6jnRdn8yRrOz92cwnpN1/Fg
YZD1b0q9l7ar3Qfy0WgA+SMLU1fln/NMVDsZ8fFm7UFhnBNChjHuQ2DAtOBuobhI
Rdu8gI6hHJztGDoRXKaXSDpJgdVP5e9Uvqckh8abxJjy6mTIZy7GUQJeIs2Xk8MQ
U6m0WRsT3clsMC/eIgqrLYlA1QYh99RzcXmmJcVuPbcMDdMtBDm5gfWrWyI+L94c
vUrgVondrwTP7Rz7E3d3nBsV/Alr3gUMZrgjkLIZLmFVrPSfaZMzc6grtbtPxZf8
bvZWi15V05b8jF3pAwUMrG2pbNQMCdTxkIK6Fk7baefQBvh1dOw=
=nqWC
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build fix from Borislav Petkov:
"A fix for cross-compiling the compressed stub on arm64 with clang"
* tag 'x86_build_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/boot/compressed: Move CLANG_FLAGS to beginning of KBUILD_CFLAGS
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmHcEx8ACgkQEsHwGGHe
VUr7sw/9HcBuobgSgF8RJTsuaITMYZEIEGCQX9V8i8QLWAmWfw4gWsun3x/QrF0+
/CmuGpXVckTQgYo8Y/knhPfvVRDWLIsF9p18gV7sPxpQvDZ0ZgPQVSsfXx3cl84y
4D11zdYHDPDnFRY6AX4GCh5Ozz4K2WPwv41iHgi+vGoFjfSMeq1+ilAEXCBY8pdZ
S/Txd7oAm5tCVyH0VUXmNbQKTNjDbFXxr/FnWpMtVgXFiDlzbOK91H998ZJP13WL
1owCxLKW8BgvoRXtUBt8jJqHDqzjwBuxrGBCYwiFpVl0PBkbwLLYxElkef6rtpWf
gHAMtUeqgTOprNpvtK3ynaf9g8cgXd5QyBL5nszaD1WCzpVstg+yZExgKFWtTI6F
U0UJkjmGmVGsD9XkV7nbkcO/IGEgY9cOZ3tHqmyEaHSVahgoZRS8SDQKJA5Ivziu
mUN60dXEdX2xtCwGR3thacblTWHZrSr4k7Cb93thCb+Het67/DtOSkVp9U4Eutqu
O2La5Lw1pwsFaTy7WvBUvcuI+gc/zFrbs92r1gT5cGvSJ8+0Pf1Pr4nuhPL9dkKa
rFQ1ubRHpQQ5u4hVDiEVm+PD7I9hSSPEUWiqZamUEkYGF7XYec60FOTgFERhgfez
HExAhjHlH6lo4lSxwuWMXRcJMI7uehEm0fUduX8USvYUQn66zGM=
=u68B
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
"The mandatory set of random minor cleanups all over tip"
* tag 'x86_cleanups_for_v5.17_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/events/amd/iommu: Remove redundant assignment to variable shift
x86/boot/string: Add missing function prototypes
x86/fpu: Remove duplicate copy_fpstate_to_sigframe() prototype
x86/uaccess: Move variable into switch case statement
When cross compiling i386_defconfig on an arm64 host with clang, there
are a few instances of '-Waddress-of-packed-member' and
'-Wgnu-variable-sized-type-not-at-end' in arch/x86/boot/compressed/,
which should both be disabled with the cc-disable-warning calls in that
directory's Makefile, which indicates that cc-disable-warning is failing
at the point of testing these flags.
The cc-disable-warning calls fail because at the point that the flags
are tested, KBUILD_CFLAGS has '-march=i386' without $(CLANG_FLAGS),
which has the '--target=' flag to tell clang what architecture it is
targeting. Without the '--target=' flag, the host architecture (arm64)
is used and i386 is not a valid value for '-march=' in that case. This
error can be seen by adding some logging to try-run:
clang-14: error: the clang compiler does not support '-march=i386'
Invoking the compiler has to succeed prior to calling cc-option or
cc-disable-warning in order to accurately test whether or not the flag
is supported; if it doesn't, the requested flag can never be added to
the compiler flags. Move $(CLANG_FLAGS) to the beginning of KBUILD_FLAGS
so that any new flags that might be added in the future can be
accurately tested.
Fixes: d5cbd80e30 ("x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS")
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211222163040.1961481-1-nathan@kernel.org
Replace all ret/retq instructions with RET in preparation of making
RET a macro. Since AS is case insensitive it's a big no-op without
RET defined.
find arch/x86/ -name \*.S | while read file
do
sed -i 's/\<ret[q]*\>/RET/' $file
done
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lore.kernel.org/r/20211204134907.905503893@infradead.org
Increase the number of arguments supported by mixed mode calls, so that
we will be able to call into the TCG2 protocol to measure the initrd
and extend the associated PCR. This involves the TCG2 protocol's
hash_log_extend_event() method, which takes five arguments, three of
which are u64 and need to be split, producing a total of 8 outgoing
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://lore.kernel.org/r/20211119114745.1560453-3-ilias.apalodimas@linaro.org
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
keep old userspace from breaking. Adjust the corresponding iopl selftest
to that.
- Improve stack overflow warnings to say which stack got overflowed and
raise the exception stack sizes to 2 pages since overflowing the single
page of exception stack is very easy to do nowadays with all the tracing
machinery enabled. With that, rip out the custom mapping of AMD SEV's
too.
- A bunch of changes in preparation for FGKASLR like supporting more
than 64K section headers in the relocs tool, correct ORC lookup table
size to cover the whole kernel .text and other adjustments.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmF/uugACgkQEsHwGGHe
VUroKw//e8BJ3Aun8bg00FHxfiMGbPYcozjLGDkaoMtMDZ8WlfCUrvtqYICEr8eB
UU0eRyygAPI167dre1O9JvAcbilkNTKntaU6qbu/ZVyUwS3+Jkjwsotbqn3xKtkd
QDDTDNiCU+beCJ2ZbspbrPgEh13+H0MwMHUfRxZB9Scpmo6aGSEaU3g295f6GX57
VFGJ/LNov5MV1dTD7Pp/h6/Nb+R6WmflKcBzJmQxYuKyKX+g1xsSv0VSga+t+uf3
M9pUkizqTiUxzC2eLgtcEZTqqBHu810E8M76FmhKBUMilsFJT5YAJTiqyahwHXds
HYarOFRgcnFuJPd29vn8UHjqeeoi6ru8GtcZYzccEc7U3ku/gXPaDJ9ffmvhs7vU
pJX5Um3GiiFm0w/ZZOKDqh78wRAsCKLN+jIoyszuhkkNchZSj/jKfOgdd3EmcZst
6L6rxBA4oRHwNOgM7uVMp+jFeRe1/prR280OWWH0D4QmmuqybThOdO23Iuh/Deth
W3qPUH3UQtfSWxGy2yODzJ1ciuGAr/AzJZ9zjg04e3Vl0DkEpyWtLKJiG3ClXZag
Nj+3xc4xYH2Aw+M0HRaONk5XVKLpqVjuAfgU5iLQa0YSUbtrR+wCWvY8KgQNbAqK
xZmzYzQ89stwVCuGKx10gPsL3jSJ3VCylMfqdHD2Ajmld1yApr0=
=DOZU
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 core updates from Borislav Petkov:
- Do not #GP on userspace use of CLI/STI but pretend it was a NOP to
keep old userspace from breaking. Adjust the corresponding iopl
selftest to that.
- Improve stack overflow warnings to say which stack got overflowed and
raise the exception stack sizes to 2 pages since overflowing the
single page of exception stack is very easy to do nowadays with all
the tracing machinery enabled. With that, rip out the custom mapping
of AMD SEV's too.
- A bunch of changes in preparation for FGKASLR like supporting more
than 64K section headers in the relocs tool, correct ORC lookup table
size to cover the whole kernel .text and other adjustments.
* tag 'x86_core_for_v5.16_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
selftests/x86/iopl: Adjust to the faked iopl CLI/STI usage
vmlinux.lds.h: Have ORC lookup cover entire _etext - _stext
x86/boot/compressed: Avoid duplicate malloc() implementations
x86/boot: Allow a "silent" kaslr random byte fetch
x86/tools/relocs: Support >64K section headers
x86/sev: Make the #VC exception stacks part of the default stacks storage
x86: Increase exception stack sizes
x86/mm/64: Improve stack overflow warnings
x86/iopl: Fake iopl(3) CLI/STI usage
The end goal of the current buffer overflow detection work[0] is to gain
full compile-time and run-time coverage of all detectable buffer overflows
seen via array indexing or memcpy(), memmove(), and memset(). The str*()
family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this series
contains the foundational elements of several related buffer overflow
detection improvements by providing new common helpers and FORTIFY_SOURCE
changes needed to gain the introspection required for compiler visibility
into array sizes. Also included are a handful of already Acked instances
using the helpers (or related clean-ups), with many more waiting at the
ready to be taken via subsystem-specific trees[2]. The new helpers are:
- struct_group() for gaining struct member range introspection.
- memset_after() and memset_startat() for clearing to the end of structures.
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in structs.
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage under
GCC, and to prepare to fix the currently broken Clang support. Finishing
this work is part of the larger series[0], but depends on all the false
positives and buffer overflow bug fixes to have landed already and those
that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a compile-time
and run-time tests are added for FORTIFY_SOURCE and the mem*()-family
functions respectively. The compile time tests have found a legitimate
(though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage that
result in no known object code differences.
After this series (and the changes that have now landed via netdev
and usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds. However, due corner cases in
GCC[3] and Clang[4], I have not included the last two patches that turn
on these options, as I don't want to introduce any known warnings to
the build. Hopefully these can be solved soon.
[0] https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE
[2] https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/
[3] https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/
[4] https://bugs.llvm.org/show_bug.cgi?id=51682
[5] https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/
[6] https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/
-----BEGIN PGP SIGNATURE-----
iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAmGAFWcWHGtlZXNjb29r
QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJmKFD/45MJdnvW5MhIEeW5tc5UjfcIPS
ae+YvlEX/2ZwgSlTxocFVocE6hz7b6eCiX3dSAChPkPxsSfgeiuhjxsU+4ROnELR
04RqTA/rwT6JXfJcXbDPXfxDL4huUkgktAW3m1sT771AZspeap2GrSwFyttlTqKA
+kTiZ3lXJVFcw10uyhfp3Lk6eFJxdf5iOjuEou5kBOQfpNKEOduRL2K15hSowOwB
lARiAC+HbmN+E+npvDE7YqK4V7ZQ0/dtB0BlfqgTkn1spQz8N21kBAMpegV5vvIk
A+qGHc7q2oyk4M14TRTidQHGQ4juW1Kkvq3NV6KzwQIVD+mIfz0ESn3d4tnp28Hk
Y+OXTI1BRFlApQU9qGWv33gkNEozeyqMLDRLKhDYRSFPA9UKkpgXQRzeTzoLKyrQ
4B6n5NnUGcu7I6WWhpyZQcZLDsHGyy0vHzjQGs/NXtb1PzXJ5XIGuPdmx9pVMykk
IVKnqRcWyGWahfh3asOnoXvdhi1No4NSHQ/ZHfUM+SrIGYjBMaUisw66qm3Fe8ZU
lbO2CFkCsfGSoKNPHf0lUEGlkyxAiDolazOfflDNxdzzlZo2X1l/a7O/yoO4Pqul
cdL0eDjiNoQ2YR2TSYPnXq5KSL1RI0tlfS8pH8k1hVhZsQx0wpAQ+qki0S+fLePV
PdA9XB82G2tmqKc9cQ==
=9xbT
-----END PGP SIGNATURE-----
Merge tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull overflow updates from Kees Cook:
"The end goal of the current buffer overflow detection work[0] is to
gain full compile-time and run-time coverage of all detectable buffer
overflows seen via array indexing or memcpy(), memmove(), and
memset(). The str*() family of functions already have full coverage.
While much of the work for these changes have been on-going for many
releases (i.e. 0-element and 1-element array replacements, as well as
avoiding false positives and fixing discovered overflows[1]), this
series contains the foundational elements of several related buffer
overflow detection improvements by providing new common helpers and
FORTIFY_SOURCE changes needed to gain the introspection required for
compiler visibility into array sizes. Also included are a handful of
already Acked instances using the helpers (or related clean-ups), with
many more waiting at the ready to be taken via subsystem-specific
trees[2].
The new helpers are:
- struct_group() for gaining struct member range introspection
- memset_after() and memset_startat() for clearing to the end of
structures
- DECLARE_FLEX_ARRAY() for using flex arrays in unions or alone in
structs
Also included is the beginning of the refactoring of FORTIFY_SOURCE to
support memcpy() introspection, fix missing and regressed coverage
under GCC, and to prepare to fix the currently broken Clang support.
Finishing this work is part of the larger series[0], but depends on
all the false positives and buffer overflow bug fixes to have landed
already and those that depend on this series to land.
As part of the FORTIFY_SOURCE refactoring, a set of both a
compile-time and run-time tests are added for FORTIFY_SOURCE and the
mem*()-family functions respectively. The compile time tests have
found a legitimate (though corner-case) bug[6] already.
Please note that the appearance of "panic" and "BUG" in the
FORTIFY_SOURCE refactoring are the result of relocating existing code,
and no new use of those code-paths are expected nor desired.
Finally, there are two tree-wide conversions for 0-element arrays and
flexible array unions to gain sane compiler introspection coverage
that result in no known object code differences.
After this series (and the changes that have now landed via netdev and
usb), we are very close to finally being able to build with
-Warray-bounds and -Wzero-length-bounds.
However, due corner cases in GCC[3] and Clang[4], I have not included
the last two patches that turn on these options, as I don't want to
introduce any known warnings to the build. Hopefully these can be
solved soon"
Link: https://lore.kernel.org/lkml/20210818060533.3569517-1-keescook@chromium.org/ [0]
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/log/?qt=grep&q=FORTIFY_SOURCE [1]
Link: https://lore.kernel.org/lkml/202108220107.3E26FE6C9C@keescook/ [2]
Link: https://lore.kernel.org/lkml/3ab153ec-2798-da4c-f7b1-81b0ac8b0c5b@roeck-us.net/ [3]
Link: https://bugs.llvm.org/show_bug.cgi?id=51682 [4]
Link: https://lore.kernel.org/lkml/202109051257.29B29745C0@keescook/ [5]
Link: https://lore.kernel.org/lkml/20211020200039.170424-1-keescook@chromium.org/ [6]
* tag 'overflow-v5.16-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (30 commits)
fortify: strlen: Avoid shadowing previous locals
compiler-gcc.h: Define __SANITIZE_ADDRESS__ under hwaddress sanitizer
treewide: Replace 0-element memcpy() destinations with flexible arrays
treewide: Replace open-coded flex arrays in unions
stddef: Introduce DECLARE_FLEX_ARRAY() helper
btrfs: Use memset_startat() to clear end of struct
string.h: Introduce memset_startat() for wiping trailing members and padding
xfrm: Use memset_after() to clear padding
string.h: Introduce memset_after() for wiping trailing members/padding
lib: Introduce CONFIG_MEMCPY_KUNIT_TEST
fortify: Add compile-time FORTIFY_SOURCE tests
fortify: Allow strlen() and strnlen() to pass compile-time known lengths
fortify: Prepare to improve strnlen() and strlen() warnings
fortify: Fix dropped strcpy() compile-time write overflow check
fortify: Explicitly disable Clang support
fortify: Move remaining fortify helpers into fortify-string.h
lib/string: Move helper functions out of string.c
compiler_types.h: Remove __compiletime_object_size()
cm4000_cs: Use struct_group() to zero struct cm4000_dev region
can: flexcan: Use struct_group() to zero struct flexcan_regs regions
...
The early malloc() and free() implementation in include/linux/decompress/mm.h
(which is also included by the static decompressors) is static. This is
fine when the only thing interested in using malloc() is the decompression
code, but the x86 early boot environment may use malloc() in a couple places,
leading to a potential collision when the static copies of the available
memory region ("malloc_ptr") gets reset to the global "free_mem_ptr" value.
As it happened, the existing usage pattern was accidentally safe because each
user did 1 malloc() and 1 free() before returning and were not nested:
extract_kernel() (misc.c)
choose_random_location() (kaslr.c)
mem_avoid_init()
handle_mem_options()
malloc()
...
free()
...
parse_elf() (misc.c)
malloc()
...
free()
Once the future FGKASLR series is added, however, it will insert
additional malloc() calls local to fgkaslr.c in the middle of
parse_elf()'s malloc()/free() pair:
parse_elf() (misc.c)
malloc()
if (...) {
layout_randomized_image(output, &ehdr, phdrs);
malloc() <- boom
...
else
layout_image(output, &ehdr, phdrs);
free()
To avoid collisions, there must be a single implementation of malloc().
Adjust include/linux/decompress/mm.h so that visibility can be
controlled, provide prototypes in misc.h, and implement the functions in
misc.c. This also results in a small size savings:
$ size vmlinux.before vmlinux.after
text data bss dec hex filename
8842314 468 178320 9021102 89a6ae vmlinux.before
8842240 468 178320 9021028 89a664 vmlinux.after
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211013175742.1197608-4-keescook@chromium.org
Some versions of mtools (fixed somewhere between 4.0.31 and 4.0.35)
generate bad output for mformat when used with the partition= option.
Use the offset= option instead. An mtools.conf entry is *also* needed
with partition= to support mpartition; combining them in one entry does
not work either.
Don't specify the -t option to mpartition; it is unnecessary and seems
to confuse mpartition under some circumstances.
Also do a few minor optimizations:
Use a larger cluster size; there is no reason for the typical 4K
clusters when we are dealing mainly with comparatively huge files.
Start the partition at 32K. There is no reason to align it more than
that, since the internal FAT filesystem structures will at best be
cluster-aligned, and 32K is the maximum FAT cluster size.
[ bp: Remove "we". ]
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210911003906.2700218-1-hpa@zytor.com
The core functions of string.c are those that may be implemented by
per-architecture functions, or overloaded by FORTIFY_SOURCE. As a
result, it needs to be built with __NO_FORTIFY. Without this, macros
will collide with function declarations. This was accidentally working
due to -ffreestanding (on some architectures). Make this deterministic
by explicitly setting __NO_FORTIFY and move all the helper functions
into string_helpers.c so that they gain the fortification coverage they
had been missing.
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Andy Lavr <andy.lavr@gmail.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Bartosz Golaszewski <bgolaszewski@baylibre.com>
Acked-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
-----BEGIN PGP SIGNATURE-----
iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmExXHoVHG1hc2FoaXJv
eUBrZXJuZWwub3JnAAoJED2LAQed4NsGAZwP/iHdEZzuQ4cz2uXUaV0fevj9jjPU
zJ8wrrNabAiT6f5x861DsARQSR4OSt3zN0tyBNgZwUdotbe7ED5GegrgIUBMWlML
QskhTEIZj7TexAX/20vx671gtzI3JzFg4c9BuriXCFRBvychSevdJPr65gMDOesL
vOJnXe+SGXG2+fPWi/PxrcOItNRcveqo2GiWHT3g0Cv/DJUulu81gEkz3hrufnMR
cjMeSkV0nJJcvI755OQBOUnEuigW64k4m2WxHPG24tU8cQOCqV6lqwOfNQBAn4+F
OoaCMyPQT9gvGYwGExQMCXGg0wbUt1qnxzOVoA2qFCwbo+MFhqjBvPXab6VJm7CE
mY3RrTtvxSqBdHI6EGcYeLjhycK9b+LLoJ1qc3S9FK8It6NoFFp4XV0R6ItPBls7
mWi9VSpyI6k0AwLq+bGXEHvaX/bnnf/vfqn8H+w6mRZdXjFV8EB2DiOSRX/OqjVG
RnvTtXzWWThLyXvWR3Jox4+7X6728oL7akLemoeZI6oTbJDm7dQgwpz5HbSyHXLh
d+gUF3Y/6lqxT5N9GSVDxpD1bEMh2I7nGQ4M7WGbGas/3yUemF8wbBqGQo4a+YeD
d9vGAUxDp2PQTtL2sjFo5Gd4PZEM9g7vwWzRvHe0o5NxKEXcBg25b8cD1hxrN9Y4
Y1AAnc0kLO+My3PC
=lw3M
-----END PGP SIGNATURE-----
Merge tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild
Pull Kbuild updates from Masahiro Yamada:
- Add -s option (strict mode) to merge_config.sh to make it fail when
any symbol is redefined.
- Show a warning if a different compiler is used for building external
modules.
- Infer --target from ARCH for CC=clang to let you cross-compile the
kernel without CROSS_COMPILE.
- Make the integrated assembler default (LLVM_IAS=1) for CC=clang.
- Add <linux/stdarg.h> to the kernel source instead of borrowing
<stdarg.h> from the compiler.
- Add Nick Desaulniers as a Kbuild reviewer.
- Drop stale cc-option tests.
- Fix the combination of CONFIG_TRIM_UNUSED_KSYMS and CONFIG_LTO_CLANG
to handle symbols in inline assembly.
- Show a warning if 'FORCE' is missing for if_changed rules.
- Various cleanups
* tag 'kbuild-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (39 commits)
kbuild: redo fake deps at include/ksym/*.h
kbuild: clean up objtool_args slightly
modpost: get the *.mod file path more simply
checkkconfigsymbols.py: Fix the '--ignore' option
kbuild: merge vmlinux_link() between ARCH=um and other architectures
kbuild: do not remove 'linux' link in scripts/link-vmlinux.sh
kbuild: merge vmlinux_link() between the ordinary link and Clang LTO
kbuild: remove stale *.symversions
kbuild: remove unused quiet_cmd_update_lto_symversions
gen_compile_commands: extract compiler command from a series of commands
x86: remove cc-option-yn test for -mtune=
arc: replace cc-option-yn uses with cc-option
s390: replace cc-option-yn uses with cc-option
ia64: move core-y in arch/ia64/Makefile to arch/ia64/Kbuild
sparc: move the install rule to arch/sparc/Makefile
security: remove unneeded subdir-$(CONFIG_...)
kbuild: sh: remove unused install script
kbuild: Fix 'no symbols' warning when CONFIG_TRIM_UNUSD_KSYMS=y
kbuild: Switch to 'f' variants of integrated assembler flag
kbuild: Shuffle blank line to improve comment meaning
...
minimal compiler version the kernel uses and thus avoid the need to
invoke the compiler unnecessarily.
- Cleanups
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmEsqfIACgkQEsHwGGHe
VUqkBw/9Hji2LIYrC0vVtsqK9OOgz0dNVPu0njC+bcxwz28on+HilJzJAnqTxE0D
KwYqOFsnl6bbfWdvML498gEiSpL+QQnCcNce28OEWLyaMvZoie1gH77qwQvJEJj8
HLN0wc2RaPai9T3cACekxaGvMt9HZrv8NvayQoGjX1GKCKMAkOTqYV5SQXIumQQQ
Ak50wGiaBNreripzq4QBRdGLh3G8W18w4BIOqmEceriNEjCOQAkwDlq0wXp5fM0x
SVFRH+dchgPnv90GamToY3uDr6qhqoCpQcGLn3MNVY8BxLxC9MQAzcf3xmIXu0u5
RwLZISgZ3Wh7iKyir0BTLKwmIoIaxz0Mb3dpSOzqTgBRD0USU1JjarHkOiYppX8A
yX/mRkcekuCykn1hhi05HeWwKkto/5pn36sILzYX9OWxuJxdDXgQSJ9MNRWZ1B+k
dcfhPC4prFieTDFa5jnZC6hwu6sleRvdw+Uw2U8pwDkYVmueER1ozz+3hgojby7g
L8S1Hd/5B4aAZnDSv5qJSI5hbrFuORBEwO3MHvw9GdPBJTv1P0WHdoJqSdUQW7cW
hyeILbpchgnMLF8pmKKIjg9pdFgVMn2Z5ISG+dxFv9pRz1NrA8bSvTtPxIsP/jKX
+WO85aCnJTFulDdIwoL77PRXi4PVP+ir8KXs7o4IZQnVfLnPlQY=
=fVgI
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
- Remove cc-option checks which are old and already supported by the
minimal compiler version the kernel uses and thus avoid the need to
invoke the compiler unnecessarily.
- Cleanups
* tag 'x86_build_for_v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Move the install rule to arch/x86/Makefile
x86/build: Remove the left-over bzlilo target
x86/tools/relocs: Mark die() with the printf function attr format
x86/build: Remove stale cc-option checks
Currently, the install target in arch/x86/Makefile descends into
arch/x86/boot/Makefile to invoke the shell script, but there is no
good reason to do so.
arch/x86/Makefile can run the shell script directly.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210729140023.442101-2-masahiroy@kernel.org
Fix the following coccicheck warning:
./arch/x86/boot/compressed/kaslr.c:671:10-11:WARNING:return of 0/1
in function 'process_mem_region' with return type bool
Generated by: scripts/coccinelle/misc/boolreturn.cocci
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Jing Yangyang <jing.yangyang@zte.com.cn>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210824070515.61065-1-deng.changcheng@zte.com.cn
Commit
79419e13e8 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")
introduced an IDT into the 32-bit boot path of the decompressor stub.
But the IDT is set up before ExitBootServices() is called, and some UEFI
firmwares rely on their own IDT.
Save the firmware IDT on boot and restore it before calling into EFI
functions to fix boot failures introduced by above commit.
Fixes: 79419e13e8 ("x86/boot/compressed/64: Setup IDT in startup_32 boot path")
Reported-by: Fabio Aiuto <fabioaiuto83@gmail.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Cc: stable@vger.kernel.org # 5.13+
Link: https://lkml.kernel.org/r/20210820125703.32410-1-joro@8bytes.org
Ship minimal stdarg.h (1 type, 4 macros) as <linux/stdarg.h>.
stdarg.h is the only userspace header commonly used in the kernel.
GPL 2 version of <stdarg.h> can be extracted from
http://archive.debian.org/debian/pool/main/g/gcc-4.2/gcc-4.2_4.2.4.orig.tar.gz
Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
A discussion around -Wundef showed that there were still a few boolean
Kconfigs where #if was used rather than #ifdef to guard different code.
Kconfig doesn't define boolean configs, which can result in -Wundef
warnings.
arch/x86/boot/compressed/Makefile resets the CFLAGS used for this
directory, and doesn't re-enable -Wundef as the top level Makefile does.
If re-added, with RANDOMIZE_BASE and X86_NEED_RELOCS disabled, the
following warnings are visible.
arch/x86/boot/compressed/misc.h:82:5: warning: 'CONFIG_RANDOMIZE_BASE'
is not defined, evaluates to 0 [-Wundef]
^
arch/x86/boot/compressed/misc.c:175:5: warning: 'CONFIG_X86_NEED_RELOCS'
is not defined, evaluates to 0 [-Wundef]
^
Simply fix these and re-enable this warning for this directory.
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Joe Perches <joe@perches.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Link: https://lore.kernel.org/r/20210422190450.3903999-1-ndesaulniers@google.com
The image generation scripts in arch/x86/boot are pretty out of date,
except for the isoimage target. Update and clean up the
genimage.sh script, and make it support an arbitrary number of
initramfs files in the image.
Add a "hdimage" target, which can be booted by either BIOS or
EFI (if the kernel is compiled with the EFI stub.) For EFI to be able
to pass the command line to the kernel, we need the EFI shell, but the
firmware builtin EFI shell, if it even exists, is pretty much always
the last resort boot option, so search for OVMF or EDK2 and explicitly
include a copy of the EFI shell.
To make this all work, use bash features in the script. Furthermore,
this version of the script makes use of some mtools features,
especially mpartition, that might not exist in very old version of
mtools, but given all the other dependencies on this script this
doesn't seem such a big deal.
Finally, put a volume label ("LINUX_BOOT") on all generated images.
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://lore.kernel.org/r/20210510082840.628372-1-hpa@zytor.com
SEV-SNP builds upon the SEV-ES functionality while adding new hardware
protection. Version 2 of the GHCB specification adds new NAE events that
are SEV-SNP specific. Rename the sev-es.{ch} to sev.{ch} so that all
SEV* functionality can be consolidated in one place.
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Joerg Roedel <jroedel@suse.de>
Link: https://lkml.kernel.org/r/20210427111636.1207-2-brijesh.singh@amd.com
gets rid of the LAZY_GS stuff and a lot of code.
- Add an insn_decode() API which all users of the instruction decoder
should preferrably use. Its goal is to keep the details of the
instruction decoder away from its users and simplify and streamline how
one decodes insns in the kernel. Convert its users to it.
- kprobes improvements and fixes
- Set the maximum DIE per package variable on Hygon
- Rip out the dynamic NOP selection and simplify all the machinery around
selecting NOPs. Use the simplified NOPs in objtool now too.
- Add Xeon Sapphire Rapids to list of CPUs that support PPIN
- Simplify the retpolines by folding the entire thing into an
alternative now that objtool can handle alternatives with stack
ops. Then, have objtool rewrite the call to the retpoline with the
alternative which then will get patched at boot time.
- Document Intel uarch per models in intel-family.h
- Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
exception on Intel.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCHyJQACgkQEsHwGGHe
VUpjiRAAwPZdwwp08ypZuMHR4EhLNru6gYhbAoALGgtYnQjLtn5onQhIeieK+R4L
cmZpxHT9OFp5dXHk4kwygaQBsD4pPOiIpm60kye1dN3cSbOORRdkwEoQMpKMZ+5Y
kvVsmn7lrwRbp600KdE4G6L5+N6gEgr0r6fMFWWGK3mgVAyCzPexVHgydcp131ch
iYMo6/pPDcNkcV/hboVKgx7GISdQ7L356L1MAIW/Sxtw6uD/X4qGYW+kV2OQg9+t
nQDaAo7a8Jqlop5W5TQUdMLKQZ1xK8SFOSX/nTS15DZIOBQOGgXR7Xjywn1chBH/
PHLwM5s4XF6NT5VlIA8tXNZjWIZTiBdldr1kJAmdDYacrtZVs2LWSOC0ilXsd08Z
EWtvcpHfHEqcuYJlcdALuXY8xDWqf6Q2F7BeadEBAxwnnBg+pAEoLXI/1UwWcmsj
wpaZTCorhJpYo2pxXckVdHz2z0LldDCNOXOjjaWU8tyaOBKEK6MgAaYU7e0yyENv
mVc9n5+WuvXuivC6EdZ94Pcr/KQsd09ezpJYcVfMDGv58YZrb6XIEELAJIBTu2/B
Ua8QApgRgetx+1FKb8X6eGjPl0p40qjD381TADb4rgETPb1AgKaQflmrSTIik+7p
O+Eo/4x/GdIi9jFk3K+j4mIznRbUX0cheTJgXoiI4zXML9Jv94w=
=bm4S
-----END PGP SIGNATURE-----
Merge tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 updates from Borislav Petkov:
- Turn the stack canary into a normal __percpu variable on 32-bit which
gets rid of the LAZY_GS stuff and a lot of code.
- Add an insn_decode() API which all users of the instruction decoder
should preferrably use. Its goal is to keep the details of the
instruction decoder away from its users and simplify and streamline
how one decodes insns in the kernel. Convert its users to it.
- kprobes improvements and fixes
- Set the maximum DIE per package variable on Hygon
- Rip out the dynamic NOP selection and simplify all the machinery
around selecting NOPs. Use the simplified NOPs in objtool now too.
- Add Xeon Sapphire Rapids to list of CPUs that support PPIN
- Simplify the retpolines by folding the entire thing into an
alternative now that objtool can handle alternatives with stack ops.
Then, have objtool rewrite the call to the retpoline with the
alternative which then will get patched at boot time.
- Document Intel uarch per models in intel-family.h
- Make Sub-NUMA Clustering topology the default and Cluster-on-Die the
exception on Intel.
* tag 'x86_core_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (53 commits)
x86, sched: Treat Intel SNC topology as default, COD as exception
x86/cpu: Comment Skylake server stepping too
x86/cpu: Resort and comment Intel models
objtool/x86: Rewrite retpoline thunk calls
objtool: Skip magical retpoline .altinstr_replacement
objtool: Cache instruction relocs
objtool: Keep track of retpoline call sites
objtool: Add elf_create_undef_symbol()
objtool: Extract elf_symbol_add()
objtool: Extract elf_strtab_concat()
objtool: Create reloc sections implicitly
objtool: Add elf_create_reloc() helper
objtool: Rework the elf_rebuild_reloc_section() logic
objtool: Fix static_call list generation
objtool: Handle per arch retpoline naming
objtool: Correctly handle retpoline thunk calls
x86/retpoline: Simplify retpolines
x86/alternatives: Optimize optimize_nops()
x86: Add insn_decode_kernel()
x86/kprobes: Move 'inline' to the beginning of the kprobe_is_ss() declaration
...
486SX.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCGm0QACgkQEsHwGGHe
VUpWgQ//X5ArCvi1KvVSA0TpxO1u6V1y3VARWsWvL3xKos57u9NyeRBUqqvAUcdW
bGqoddOneBsMNwnLuj4grQVYfRtXBPKbsgnhvYKD7X0NNcULABL/h3GRGM6QHLCw
0n6xpXzr6x6s80McUYnQIEcJmzEnsXKXmPWWOerjd37U79ruxAcJCLV7wIHPQG4A
hKIXCMF7dmY9wRWkZ9yNN/F+bOXqbLO80wx59u4l8AgLLVASYOLdicutltE6CiRH
KU4p8trViujtswK2d4q2RO66pwAqFqRmGT1HXJvQE4b3YUqJbI4O2iZGOJTen6N/
F9yywdjXPGA466id5PoZJVRm5QpzFctfdjXUA1BGBmYu8TsqJecXstLXlMoqhaIj
DBttl0/0MId9+UqVLBY6P1LWiWUUgIt0uwC7WltiVf2gPKqLNkS7dEZpVadESQTb
imnEUNNfzh9JMX+e8jjFq3cl3igY1My39/edUoQIWdPuFnFs/Ni+Qu/PztFunEIT
8nRAr9Hxbvj5tK0OeOTod5i7ZEPyG2OcmEPZnhDUHgz0oaeLKLVfXRBz6lle6Z3N
WoF/qbPm0nqMOd20H2NWIBdCs9+8sHvp+tlY9hta8lVYzY27qEa21s5xyIZRU3Ia
/BperJ+J8qyuNCvnaai3pUur+NM7ck/EBTRkxCtwgi6xFxeaFp4=
=Ic77
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
"A bunch of clang build fixes and a Kconfig highmem selection fix for
486SX"
* tag 'x86_build_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Disable HIGHMEM64G selection for M486SX
efi/libstub: Add $(CLANG_FLAGS) to x86 flags
x86/boot: Add $(CLANG_FLAGS) to compressed KBUILD_CFLAGS
x86/build: Propagate $(CLANG_FLAGS) to $(REALMODE_FLAGS)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCGmYIACgkQEsHwGGHe
VUr45w/8CSXr7MXaFBj4To0hTWJXSZyF6YGqlZOSJXFcFh4cWTNwfVOoFaV47aDo
+HsCNTkGENcKhLrDUWDRiG/Uo46jxtOtl1vhq7U4pGemSYH871XWOKfb5k5XNMwn
/uhaHMI4aEfd6bUFnF518NeyRIsD0BdqFj4tB7RbAiyFwdETDX9Tkj/uBKnQ4zon
4tEDoXgThuK5YKK9zVQg5pa7aFp2zg1CAdX/WzBkS8BHVBPXSV0CF97AJYQOM/V+
lUHv+BN3wp97GYHPQMPsbkNr8IuFoe2mIvikwjxg8iOFpzEU1G1u09XV9R+PXByX
LclFTRqK/2uU5hJlcsBiKfUuidyErYMRYImbMAOREt2w0ogWVu2zQ7HkjVve25h1
sQPwPudbAt6STbqRxvpmB3yoV4TCYwnF91FcWgEy+rcEK2BDsHCnScA45TsK5I1C
kGR1K17pHXprgMZFPveH+LgxewB6smDv+HllxQdSG67LhMJXcs2Epz0TsN8VsXw8
dlD3lGReK+5qy9FTgO7mY0xhiXGz1IbEdAPU4eRBgih13puu03+jqgMaMabvBWKD
wax+BWJUrPtetwD5fBPhlS/XdJDnd8Mkv2xsf//+wT0s4p+g++l1APYxeB8QEehm
Pd7Mvxm4GvQkfE13QEVIPYQRIXCMH/e9qixtY5SHUZDBVkUyFM0=
=bO1i
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull misc x86 cleanups from Borislav Petkov:
"Trivial cleanups and fixes all over the place"
* tag 'x86_cleanups_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
MAINTAINERS: Remove me from IDE/ATAPI section
x86/pat: Do not compile stubbed functions when X86_PAT is off
x86/asm: Ensure asm/proto.h can be included stand-alone
x86/platform/intel/quark: Fix incorrect kernel-doc comment syntax in files
x86/msr: Make locally used functions static
x86/cacheinfo: Remove unneeded dead-store initialization
x86/process/64: Move cpu_current_top_of_stack out of TSS
tools/turbostat: Unmark non-kernel-doc comment
x86/syscalls: Fix -Wmissing-prototypes warnings from COND_SYSCALL()
x86/fpu/math-emu: Fix function cast warning
x86/msr: Fix wr/rdmsr_safe_regs_on_cpu() prototypes
x86: Fix various typos in comments, take #2
x86: Remove unusual Unicode characters from comments
x86/kaslr: Return boolean values from a function returning bool
x86: Fix various typos in comments
x86/setup: Remove unused RESERVE_BRK_ARRAY()
stacktrace: Move documentation for arch_stack_walk_reliable() to header
x86: Remove duplicate TSC DEADLINE MSR definitions
couple of gcc11 warning fixes.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmCGmBgACgkQEsHwGGHe
VUox6xAAus7u9Bpyu4UCr93j4PmkfLf0du7A7mfuxfATFFNTy+lQWq+tuJJsFMSI
ShbRNKxE1clDtCpWHI9hi9B0GmrMlgjii2YtNfM7pkZYom3aA6IeXDedE3Ot1KwI
Ox7DsUjgdwwF2O/pYHL4Jg6Vra5daNHYOSlAe7Rk78kcECFlXj77CJYiPtvtkYHD
JH2tu2vaNcbp11vrWbbx7St4w+vDB37Y3NczatbqXMS4Uiwoyfjzyi4qmf97p92u
9aDNq+hj+90b/PYUzd9wyCWc0S6TcQo3OYfZq1/hHdS8UE8kq4AY3FFnzFGIKi7k
IcQDJivkKjXOURD8Btjgbp9dkcbZtiuKS7RcjDuBbmH/q8iBIRYK8GfMxyna0TpE
VKC9Wdn/LvNPS8t0vyB6fK+vt7uxvBXscRA0GtCva3WWSORdI3bFV9n998ArSVZa
Itj0GBQXx4zNIjfg4U+aDsqICKmxGZqoKHm8pDVJUDrZi9A1kWxmhivMSQg58+as
pDKPArtXN2NzN+DCU+UWyFk9qvMSVQh+t3204w4PM0PiHpOyFh7jRXCvzn3ulVJP
LBm3L/Bj7m7qwfmB0iWOGvhwGFIOG0jUk2abudBn864TFuMqEPRadQUwMNC+ezOT
1bp5LWh2s71n610I5LPBYF1diwwxwmx5jhfhXjjfejzCcEy/Xp0=
=PLgK
-----END PGP SIGNATURE-----
Merge tag 'x86_boot_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 boot updates from Borislav Petkov:
"Consolidation and cleanup of the early memory reservations, along with
a couple of gcc11 warning fixes"
* tag 'x86_boot_for_v5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/setup: Move trim_snb_memory() later in setup_arch() to fix boot hangs
x86/setup: Merge several reservations of start of memory
x86/setup: Consolidate early memory reservations
x86/boot/compressed: Avoid gcc-11 -Wstringop-overread warning
x86/boot/tboot: Avoid Wstringop-overread-warning
When cross compiling x86 on an ARM machine with clang, there are several
errors along the lines of:
arch/x86/include/asm/string_64.h:27:10: error: invalid output constraint '=&c' in asm
This happens because the compressed boot Makefile reassigns KBUILD_CFLAGS
and drops the clang flags that set the target architecture ('--target=')
and the path to the GNU cross tools ('--prefix='), meaning that the host
architecture is targeted.
These flags are available as $(CLANG_FLAGS) from the main Makefile so
add them to the compressed boot folder's KBUILD_CFLAGS so that cross
compiling works as expected.
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Ard Biesheuvel <ardb@kernel.org>
Link: https://lkml.kernel.org/r/20210326000435.4785-3-nathan@kernel.org
GCC gets confused by the comparison of a pointer to an integer literal,
with the assumption that this is an offset from a NULL pointer and that
dereferencing it is invalid:
In file included from arch/x86/boot/compressed/misc.c:18:
In function ‘parse_elf’,
inlined from ‘extract_kernel’ at arch/x86/boot/compressed/misc.c:442:2:
arch/x86/boot/compressed/../string.h:15:23: error: ‘__builtin_memcpy’ reading 64 bytes from a region of size 0 [-Werror=stringop-overread]
15 | #define memcpy(d,s,l) __builtin_memcpy(d,s,l)
| ^~~~~~~~~~~~~~~~~~~~~~~
arch/x86/boot/compressed/misc.c:283:9: note: in expansion of macro ‘memcpy’
283 | memcpy(&ehdr, output, sizeof(ehdr));
| ^~~~~~
I could not find any good workaround for this, but as this is only
a warning for a failure during early boot, removing the line entirely
works around the warning.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Martin Sebor <msebor@gmail.com>
Link: https://lore.kernel.org/r/20210322160253.4032422-2-arnd@kernel.org
Fix another ~42 single-word typos in arch/x86/ code comments,
missed a few in the first pass, in particular in .S files.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: linux-kernel@vger.kernel.org
Fix the following coccicheck warnings:
./arch/x86/boot/compressed/kaslr.c:642:10-11: WARNING: return of 0/1 in
function 'process_mem_region' with return type bool.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/1615283963-67277-1-git-send-email-jiapeng.chong@linux.alibaba.com
There are a few places left in the SEV-ES C code where hlt loops and/or
terminate requests are implemented. Replace them all with calls to
sev_es_terminate().
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-9-joro@8bytes.org
Check whether the hypervisor reported the correct C-bit when running
as an SEV guest. Using a wrong C-bit position could be used to leak
sensitive data from the guest to the hypervisor.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-8-joro@8bytes.org
The 32-bit #VC handler has no GHCB and can only handle CPUID exit codes.
It is needed by the early boot code to handle #VC exceptions raised in
verify_cpu() and to get the position of the C-bit.
But the CPUID information comes from the hypervisor which is untrusted
and might return results which trick the guest into the no-SEV boot path
with no C-bit set in the page-tables. All data written to memory would
then be unencrypted and could leak sensitive data to the hypervisor.
Add sanity checks to the 32-bit boot #VC handler to make sure the
hypervisor does not pretend that SEV is not enabled.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-7-joro@8bytes.org
Add a #VC exception handler which is used when the kernel still executes
in protected mode. This boot-path already uses CPUID, which will cause #VC
exceptions in an SEV-ES guest.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-6-joro@8bytes.org
This boot path needs exception handling when it is used with SEV-ES.
Setup an IDT and provide a helper function to write IDT entries for
use in 32-bit protected mode.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-5-joro@8bytes.org
Exception handling in the startup_32 boot path requires the CS
selector to be correctly set up. Reload it from the current GDT.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-4-joro@8bytes.org
A malicious hypervisor could disable the CPUID intercept for an SEV or
SEV-ES guest and trick it into the no-SEV boot path, where it could
potentially reveal secrets. This is not an issue for SEV-SNP guests,
as the CPUID intercept can't be disabled for those.
Remove the Hypervisor CPUID bit check from the SEV detection code to
protect against this kind of attack and add a Hypervisor bit equals zero
check to the SME detection path to prevent non-encrypted guests from
trying to enable SME.
This handles the following cases:
1) SEV(-ES) guest where CPUID intercept is disabled. The guest
will still see leaf 0x8000001f and the SEV bit. It can
retrieve the C-bit and boot normally.
2) Non-encrypted guests with intercepted CPUID will check
the SEV_STATUS MSR and find it 0 and will try to enable SME.
This will fail when the guest finds MSR_K8_SYSCFG to be zero,
as it is emulated by KVM. But we can't rely on that, as there
might be other hypervisors which return this MSR with bit
23 set. The Hypervisor bit check will prevent that the guest
tries to enable SME in this case.
3) Non-encrypted guests on SEV capable hosts with CPUID intercept
disabled (by a malicious hypervisor) will try to boot into
the SME path. This will fail, but it is also not considered
a problem because non-encrypted guests have no protection
against the hypervisor anyway.
[ bp: s/non-SEV/non-encrypted/g ]
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Link: https://lkml.kernel.org/r/20210312123824.306-3-joro@8bytes.org
Disable the exception handling before booting the kernel to make sure
any exceptions that happen during early kernel boot are not directed to
the pre-decompression code.
Signed-off-by: Joerg Roedel <jroedel@suse.de>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20210312123824.306-2-joro@8bytes.org
- Don't move BSS section around pointlessly in the x86 decompressor
- Refactor helper for discovering the EFI secure boot mode
- Wire up EFI secure boot to IMA for arm64
- Some fixes for the capsule loader
- Expose the RT_PROP table via the EFI test module
- Relax DT and kernel placement restrictions on ARM
+ followup fixes:
- fix the build breakage on IA64 caused by recent capsule loader changes
- suppress a type mismatch build warning in the expansion of
EFI_PHYS_ALIGN on ARM
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/kWCMACgkQEsHwGGHe
VUqVlxAAg3jSS5w5fuaXON2xYZmgKdlRB0fjbklo1ZrWS6sEHrP+gmVmrJSWGZP+
qFleQ6AxaYK57UiBXxS6Xfn7hHRToqdOAGnaSYzIg1aQIofRoLxvm3YHBMKllb+g
x73IBS/Hu9/kiH8EVDrJSkBpVdbPwDnw+FeW4ZWUMF9GVmV8oA6Zx23BVSVsbFda
jat/cEsJQS3GfECJ/Fg5ae+c/2zn5NgbaVtLxVnMnJfAwEpoPz3ogKoANSskdZg3
z6pA1aMFoHr+lnlzcsM5zdboQlwZRKPHvFpsXPexESBy5dPkYhxFnHqgK4hSZglC
c3QoO9Gn+KOJl4KAKJWNzCrd3G9kKY5RXkoei4bH9wGMjW2c68WrbFyXgNsO3vYR
v5CKpq3+jlwGo03GiLJgWQFdgqX0EgTVHHPTcwFpt8qAMi9/JIPSIeTE41p2+AjZ
cW5F0IlikaR+N8vxc2TDvQTuSsroMiLcocvRWR61oV/48pFlEjqiUjV31myDsASg
gGkOxZOOz2iBJfK8lCrKp5p9JwGp0M0/GSHTxlYQFy+p4SrcOiPX4wYYdLsWxioK
AbVhvOClgB3kN7y7TpLvdjND00ciy4nKEC0QZ5p5G59jSLnpSBM/g6av24LsSQwo
S1HJKhQPbzcI1lhaPjo91HQoOOMZHWLes0SqK4FGNIH+0imHliA=
=n7Gc
-----END PGP SIGNATURE-----
Merge tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull EFI updates from Borislav Petkov:
"These got delayed due to a last minute ia64 build issue which got
fixed in the meantime.
EFI updates collected by Ard Biesheuvel:
- Don't move BSS section around pointlessly in the x86 decompressor
- Refactor helper for discovering the EFI secure boot mode
- Wire up EFI secure boot to IMA for arm64
- Some fixes for the capsule loader
- Expose the RT_PROP table via the EFI test module
- Relax DT and kernel placement restrictions on ARM
with a few followup fixes:
- fix the build breakage on IA64 caused by recent capsule loader
changes
- suppress a type mismatch build warning in the expansion of
EFI_PHYS_ALIGN on ARM"
* tag 'efi_updates_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
efi: arm: force use of unsigned type for EFI_PHYS_ALIGN
efi: ia64: disable the capsule loader
efi: stub: get rid of efi_get_max_fdt_addr()
efi/efi_test: read RuntimeServicesSupported
efi: arm: reduce minimum alignment of uncompressed kernel
efi: capsule: clean scatter-gather entries from the D-cache
efi: capsule: use atomic kmap for transient sglist mappings
efi: x86/xen: switch to efi_get_secureboot_mode helper
arm64/ima: add ima_arch support
ima: generalize x86/EFI arch glue for other EFI architectures
efi: generalize efi_get_secureboot
efi/libstub: EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER should not default to yes
efi/x86: Only copy the compressed kernel image in efi_relocate_kernel()
efi/libstub/x86: simplify efi_is_native()
With the intoduction of hardware tag-based KASAN some kernel checks of
this kind:
ifdef CONFIG_KASAN
will be updated to:
if defined(CONFIG_KASAN_GENERIC) || defined(CONFIG_KASAN_SW_TAGS)
x86 and s390 use a trick to #undef CONFIG_KASAN for some of the code
that isn't linked with KASAN runtime and shouldn't have any KASAN
annotations.
Also #undef CONFIG_KASAN_GENERIC with CONFIG_KASAN.
Link: https://lkml.kernel.org/r/9d84bfaaf8fabe0fc89f913c9e420a30bd31a260.1606161801.git.andreyknvl@google.com
Signed-off-by: Andrey Konovalov <andreyknvl@google.com>
Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Reviewed-by: Marco Elver <elver@google.com>
Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Potapenko <glider@google.com>
Tested-by: Vincenzo Frascino <vincenzo.frascino@arm.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Branislav Rankov <Branislav.Rankov@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Evgenii Stepanov <eugenis@google.com>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
(Arvind Sankar)
- Remove -m16 workaround now that the GCC versions that need it are unsupported
(Nick Desaulniers)
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/XhwYACgkQEsHwGGHe
VUqYPQ/9ECDO6Ms8rkvZf4f7+LfKO8XsRYf0h411h5zuVssgPybIbZCvuiQ5Jj9r
IzTe0pO0AoVoX5/JE+mxbvroBb5hRokNehJayPmdVjZSrzsYWZR3RVW6iSRo/TQ0
bjzY1cHC4EPmtyYA4FDr+b0JAEEbyPuPOAcqSzBBadsx5FURvwH7s1hUILCZ1nmy
i8DADZDo7Puk1W8VQkGkjUeo3QVY8KG2VzZegU0iUR/M7Hoek3WBPObYFg0bsKN4
6hvRWPMt2lFXOSNZOFhG+LjtWXTJ22g7ENU4SHXhHBdxmM8H/OZvh/3UZTes7rdo
u/LZJ580GfOa3TYHOToGQq9iBxoi9Se3vLXUtt2I9Z7sbBVUGTKx2Llezvt9/QRh
kJkpvMc2r2b9n7Cu52gHKQ89OrqFzs2scNyjQ2KRU61bPwLjWoGiobugnl8T9q8Y
Li8LKS2tCD6yo7L1eYt0NDriWVF3/Tv1rebWMNIf3sGSHFkxgHxFviTx3L3SWEBS
8JMuKLPh20WC9ombXP/tDN2rtTmmL9hEoqZ9uSTG1tRNLOLLNv790ucUh+lp6RFp
7y8dzl+L3hXsFdcCYooOIGO1phmdThwSLMqZHmbpRWpKNr+8P0wCqZP16twnoXNC
kO7OiNd6XHVpJg4WnN3C3X70n7TPXSZ7dkpDNFBJaH2QilNEHa8=
=Qk/S
-----END PGP SIGNATURE-----
Merge tag 'x86_build_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 build updates from Borislav Petkov:
"Two x86 build fixes:
- Fix the vmlinux size check on 64-bit along with adding useful
clarifications on the topic (Arvind Sankar)
- Remove -m16 workaround now that the GCC versions that need it are
unsupported (Nick Desaulniers)"
* tag 'x86_build_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/build: Remove -m16 workaround for unsupported versions of GCC
x86/build: Fix vmlinux size check on 64-bit
(Gabriel Krisman Bertazi)
- All kinds of minor cleanups all over the tree.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAl/XgtoACgkQEsHwGGHe
VUqGuA/9GqN2zNQdhgRvAQ+FLZiOYK9MfXcoayfMq8T61VRPDBWaQRfVYKmfmEjS
0l5OnYgZQ9n6vzqFy6pmgc/ix8Jr553dZp5NCamcOqjCTcuO/LwRRh+ZBeFSBTPi
r2qFYKKRYvM7nbyUMm4WqvAakxJ18xsjNbIslr9Aqe8WtHBKKX3MOu8SOpFtGyXz
aEc4rhsS45iZa5gTXhvOn73tr3yHGWU1rzyyAAAmDGTgAxRwsTna8v16C4+v+Bua
Zg18Wiutj8ZjtFpzKJtGWGZoSBap3Jw2Ys64g42MBQUE56KY/99tQVo/SvbYvvlf
PHWLH0f3rPNJ6J2qeKwhtNzPlEAH/6e416A1/6TVwsK+8pdfGmkfaQh2iDHLhJ5i
CSwF61H44ZaE3pc1tHHbC5ALvydPlup7D4MKgztfq0mZ3OoV2Vg7dtyyr+Ybz72b
G+Kl/tmyacQTXo0FiYbZKETo3/VfTdBXGyVax1rHkx3pt8zvhFg3kxb1TT/l/CoM
eSTx53PtTdVtbGOq1CjnUm0FKlbh4+kLoNuo9DYKeXUQBs8PWOCZmL3wXmm4cqlZ
mDZVWvll7CjToY8izzcE/AG279cWkgcL5Tcg7W7CR66+egfDdpuqOZ4tv4TyzoWq
0J7WeNj+TAo98b7RA0Ux8LOlszRxS2ykuI6uB2MgwCaRMbbaQao=
=lLiH
-----END PGP SIGNATURE-----
Merge tag 'x86_cleanups_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 cleanups from Borislav Petkov:
"Another branch with a nicely negative diffstat, just the way I
like 'em:
- Remove all uses of TIF_IA32 and TIF_X32 and reclaim the two bits in
the end (Gabriel Krisman Bertazi)
- All kinds of minor cleanups all over the tree"
* tag 'x86_cleanups_for_v5.11' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (24 commits)
x86/ia32_signal: Propagate __user annotation properly
x86/alternative: Update text_poke_bp() kernel-doc comment
x86/PCI: Make a kernel-doc comment a normal one
x86/asm: Drop unused RDPID macro
x86/boot/compressed/64: Use TEST %reg,%reg instead of CMP $0,%reg
x86/head64: Remove duplicate include
x86/mm: Declare 'start' variable where it is used
x86/head/64: Remove unused GET_CR2_INTO() macro
x86/boot: Remove unused finalize_identity_maps()
x86/uaccess: Document copy_from_user_nmi()
x86/dumpstack: Make show_trace_log_lvl() static
x86/mtrr: Fix a kernel-doc markup
x86/setup: Remove unused MCA variables
x86, libnvdimm/test: Remove COPY_MC_TEST
x86: Reclaim TIF_IA32 and TIF_X32
x86/mm: Convert mmu context ia32_compat into a proper flags field
x86/elf: Use e_machine to check for x32/ia32 in setup_additional_pages()
elf: Expose ELF header on arch_setup_additional_pages()
x86/elf: Use e_machine to select start_thread for x32
elf: Expose ELF header in compat_start_thread()
...
- Make the AMD L3 QoS code and data priorization enable/disable mechanism
work correctly. The control bit was only set/cleared on one of the CPUs
in a L3 domain, but it has to be modified on all CPUs in the domain. The
initial documentation was not clear about this, but the updated one from
Oct 2020 spells it out.
- Fix an off by one in the UV platform detection code which causes the UV
hubs to be identified wrongly. The chip revisions start at 1 not at 0.
- Fix a long standing bug in the evaluation of prefixes in the uprobes
code which fails to handle repeated prefixes properly. The aggregate
size of the prefixes can be larger than the bytes array but the code
blindly iterated over the aggregate size beyond the array boundary.
Add a macro to handle this case properly and use it at the affected
places.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCgAxFiEEQp8+kY+LLUocC4bMphj1TA10mKEFAl/M2GoTHHRnbHhAbGlu
dXRyb25peC5kZQAKCRCmGPVMDXSYoUGeD/0TxQyVIZTJjY/ExDYvQZeSWgvFVon9
A/QcwkgtkWzYKI3YThgxBtSNKhHPP8eQ9QK8ErcaQKEHU5TdXVEOwHgTm1A6OGat
0+M9/EWEe7tTu+cLpu7esQ+1VbSvEcFXZbljbhCKrShlBlsIjFG4eCPAprSw9yjI
RSgfXKZu4NmHVS6nfJTwmIIaTeLQ6U3b7b1D5s/66slBFScqnLbRNhABVbHbos4F
pl/lxDCFOddy2YbEojHjjGqMA7oxPav7c0nYFOM/zG+wAqfEjbqOxReT31bGQPi2
XT9K4JEqDqILo0KnhV4GsYoWAhes3BtmsJ9IoZ7IijsMriYl80mD9URAORidJ4PX
28Ckk9V/DlE8uDrAnBDcWDSoKlg78mhVV7V9L6v43teg/gJfSZNROtNDBmqRmwG4
Op2NJfzJITtaxVQuSZRkSs8rzGv+QUfaM1sBUQ+Oz4KYeIjjA7G2MAOECrzIAWKB
GWc5toYRVS6oGT+RbZhSxZYoh8ASoGJ2MrL8K4OV4RqEqHHcXcih0WmmljtsDIFI
td4FHHH6fghIb9S6iYKiApd6k2qKa33mwJwa/xZOoIrv0w5xT0WDJnxT60gu/Mec
YDkqhmA009CNSD2G4oNRNF5MH7gp34UII+25jOGatbVh+5DDPYs+5Jnh/DR7jssR
PryAG9ER7UUb6w==
=rNBq
-----END PGP SIGNATURE-----
Merge tag 'x86-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Thomas Gleixner:
"A set of fixes for x86:
- Make the AMD L3 QoS code and data priorization enable/disable
mechanism work correctly.
The control bit was only set/cleared on one of the CPUs in a L3
domain, but it has to be modified on all CPUs in the domain. The
initial documentation was not clear about this, but the updated one
from Oct 2020 spells it out.
- Fix an off by one in the UV platform detection code which causes
the UV hubs to be identified wrongly.
The chip revisions start at 1 not at 0.
- Fix a long standing bug in the evaluation of prefixes in the
uprobes code which fails to handle repeated prefixes properly.
The aggregate size of the prefixes can be larger than the bytes
array but the code blindly iterated over the aggregate size beyond
the array boundary. Add a macro to handle this case properly and
use it at the affected places"
* tag 'x86-urgent-2020-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sev-es: Use new for_each_insn_prefix() macro to loop over prefixes bytes
x86/insn-eval: Use new for_each_insn_prefix() macro to loop over prefixes bytes
x86/uprobes: Do not use prefixes.nbytes when looping over prefixes.bytes
x86/platform/uv: Fix UV4 hub revision adjustment
x86/resctrl: Fix AMD L3 QOS CDP enable/disable
Since insn.prefixes.nbytes can be bigger than the size of
insn.prefixes.bytes[] when a prefix is repeated, the proper
check must be:
insn.prefixes.bytes[i] != 0 and i < 4
instead of using insn.prefixes.nbytes. Use the new
for_each_insn_prefix() macro which does it correctly.
Debugged by Kees Cook <keescook@chromium.org>.
[ bp: Massage commit message. ]
Fixes: 25189d08e5 ("x86/sev-es: Add support for handling IOIO exceptions")
Reported-by: syzbot+9b64b619f10f19d19a7c@syzkaller.appspotmail.com
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/160697106089.3146288.2052422845039649176.stgit@devnote2
Revert the following two commits:
de3accdaec ("x86, build: Build 16-bit code with -m16 where possible")
a9cfccee66 ("x86, build: Change code16gcc.h from a C header to an assembly header")
Since
0bddd227f3 ("Documentation: update for gcc 4.9 requirement")
the minimum supported version of GCC is gcc-4.9. It's now safe to remove
this code.
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Nathan Chancellor <natechancellor@gmail.com>
Reviewed-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://lkml.kernel.org/r/20201201011307.3676986-1-ndesaulniers@google.com
Currently, '--orphan-handling=warn' is spread out across four different
architectures in their respective Makefiles, which makes it a little
unruly to deal with in case it needs to be disabled for a specific
linker version (in this case, ld.lld 10.0.1).
To make it easier to control this, hoist this warning into Kconfig and
the main Makefile so that disabling it is simpler, as the warning will
only be enabled in a couple places (main Makefile and a couple of
compressed boot folders that blow away LDFLAGS_vmlinx) and making it
conditional is easier due to Kconfig syntax. One small additional
benefit of this is saving a call to ld-option on incremental builds
because we will have already evaluated it for CONFIG_LD_ORPHAN_WARN.
To keep the list of supported architectures the same, introduce
CONFIG_ARCH_WANT_LD_ORPHAN_WARN, which an architecture can select to
gain this automatically after all of the sections are specified and size
asserted. A special thanks to Kees Cook for the help text on this
config.
Link: https://github.com/ClangBuiltLinux/linux/issues/1187
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Nathan Chancellor <natechancellor@gmail.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Use TEST %reg,%reg which sets the zero flag in the same way as CMP
$0,%reg, but the encoding uses one byte less.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Andy Lutomirski <luto@kernel.org>
Link: https://lkml.kernel.org/r/20201029160258.139216-1-ubizjak@gmail.com