The FRED RSP0 MSR points to the top of the kernel stack for user level
event delivery. As this is the task stack it needs to be updated when a
task is scheduled in.
The update is done at context switch. That means it's also done when
switching to kernel threads, which is pointless as those never go out to
user space. For KVM threads this means there are two writes to FRED_RSP0 as
KVM has to switch to the guest value before VMENTER.
Defer the update to the exit to user space path and cache the per CPU
FRED_RSP0 value, so redundant writes can be avoided.
Provide fred_sync_rsp0() for KVM to keep the cache in sync with the actual
MSR value after returning from guest to host mode.
[ tglx: Massage change log ]
Suggested-by: Sean Christopherson <seanjc@google.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240822073906.2176342-4-xin@zytor.com
SS is initialized to NULL during boot time and not explicitly set to
__KERNEL_DS.
With FRED enabled, if a kernel event is delivered before a CPU goes to
user level for the first time, its SS is NULL thus NULL is pushed into
the SS field of the FRED stack frame. But before ERETS is executed,
the CPU may context switch to another task and go to user level. Then
when the CPU comes back to kernel mode, SS is changed to __KERNEL_DS.
Later when ERETS is executed to return from the kernel event handler,
a #GP fault is generated because SS doesn't match the SS saved in the
FRED stack frame.
Initialize SS to __KERNEL_DS when enabling FRED to prevent that.
Note, IRET doesn't check if SS matches the SS saved in its stack frame,
thus IDT doesn't have this problem. For IDT it doesn't matter whether
SS is set to __KERNEL_DS or not, because it's set to NULL upon interrupt
or exception delivery and __KERNEL_DS upon SYSCALL. Thus it's pointless
to initialize SS for IDT.
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240816104316.2276968-1-xin@zytor.com
To enable FRED earlier, move the RSP initialization out of
cpu_init_fred_exceptions() into cpu_init_fred_rsps().
This is required as the FRED RSP initialization depends on the availability
of the CPU entry areas which are set up late in trap_init(),
No functional change intended. Marked with Fixes as it's a depedency for
the real fix.
Fixes: 14619d912b ("x86/fred: FRED entry/exit and dispatch code")
Signed-off-by: Xin Li (Intel) <xin@zytor.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20240709154048.3543361-3-xin@zytor.com
Add cpu_init_fred_exceptions() to:
- Set FRED entrypoints for events happening in ring 0 and 3.
- Specify the stack level for IRQs occurred ring 0.
- Specify dedicated event stacks for #DB/NMI/#MCE/#DF.
- Enable FRED and invalidtes IDT.
- Force 32-bit system calls to use "int $0x80" only.
Add fred_complete_exception_setup() to:
- Initialize system_vectors as done for IDT systems.
- Set unused sysvec_table entries to fred_handle_spurious_interrupt().
Co-developed-by: Xin Li <xin3.li@intel.com>
Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Tested-by: Shan Kang <shan.kang@intel.com>
Link: https://lore.kernel.org/r/20231205105030.8698-35-xin3.li@intel.com