mirror_ubuntu-kernels/arch/x86/kvm
David Matlack ada51a9de7 KVM: x86/mmu: Extend Eager Page Splitting to nested MMUs
Add support for Eager Page Splitting pages that are mapped by nested
MMUs. Walk through the rmap first splitting all 1GiB pages to 2MiB
pages, and then splitting all 2MiB pages to 4KiB pages.

Note, Eager Page Splitting is limited to nested MMUs as a policy rather
than due to any technical reason (the sp->role.guest_mode check could
just be deleted and Eager Page Splitting would work correctly for all
shadow MMU pages). There is really no reason to support Eager Page
Splitting for tdp_mmu=N, since such support will eventually be phased
out, and there is no current use case supporting Eager Page Splitting on
hosts where TDP is either disabled or unavailable in hardware.
Furthermore, future improvements to nested MMU scalability may diverge
the code from the legacy shadow paging implementation. These
improvements will be simpler to make if Eager Page Splitting does not
have to worry about legacy shadow paging.

Splitting huge pages mapped by nested MMUs requires dealing with some
extra complexity beyond that of the TDP MMU:

(1) The shadow MMU has a limit on the number of shadow pages that are
    allowed to be allocated. So, as a policy, Eager Page Splitting
    refuses to split if there are KVM_MIN_FREE_MMU_PAGES or fewer
    pages available.

(2) Splitting a huge page may end up re-using an existing lower level
    shadow page tables. This is unlike the TDP MMU which always allocates
    new shadow page tables when splitting.

(3) When installing the lower level SPTEs, they must be added to the
    rmap which may require allocating additional pte_list_desc structs.

Case (2) is especially interesting since it may require a TLB flush,
unlike the TDP MMU which can fully split huge pages without any TLB
flushes. Specifically, an existing lower level page table may point to
even lower level page tables that are not fully populated, effectively
unmapping a portion of the huge page, which requires a flush.  As of
this commit, a flush is always done always after dropping the huge page
and before installing the lower level page table.

This TLB flush could instead be delayed until the MMU lock is about to be
dropped, which would batch flushes for multiple splits.  However these
flushes should be rare in practice (a huge page must be aliased in
multiple SPTEs and have been split for NX Huge Pages in only some of
them). Flushing immediately is simpler to plumb and also reduces the
chances of tripping over a CPU bug (e.g. see iTLB multihit).

[ This commit is based off of the original implementation of Eager Page
  Splitting from Peter in Google's kernel from 2016. ]

Suggested-by: Peter Feiner <pfeiner@google.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20220516232138.1783324-23-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-06-24 04:52:00 -04:00
..
mmu KVM: x86/mmu: Extend Eager Page Splitting to nested MMUs 2022-06-24 04:52:00 -04:00
svm Revert "KVM: x86: always allow host-initiated writes to PMU MSRs" 2022-06-20 11:49:46 -04:00
vmx KVM: VMX: Use vcpu_get_perf_capabilities() to get guest-visible value 2022-06-20 11:49:54 -04:00
cpuid.c KVM: x86: Introduce "struct kvm_caps" to track misc caps/settings 2022-06-08 05:21:16 -04:00
cpuid.h KVM: x86/cpuid: Refactor host/guest CPU model consistency check 2022-06-08 04:48:19 -04:00
debugfs.c KVM: x86: Introduce "struct kvm_caps" to track misc caps/settings 2022-06-08 05:21:16 -04:00
emulate.c KVM: x86: Bug the VM on an out-of-bounds data read 2022-06-10 10:01:34 -04:00
fpu.h KVM: x86: Move FPU register accessors into fpu.h 2021-06-17 13:09:24 -04:00
hyperv.c Bitmap patches for 5.19-rc1 2022-06-04 14:04:27 -07:00
hyperv.h KVM: x86: hyper-v: Avoid writing to TSC page without an active vCPU 2022-04-11 13:29:51 -04:00
i8254.c KVM: x86: PIT: Preserve state of speaker port data bit 2022-06-08 13:06:20 -04:00
i8254.h KVM: x86: PIT: Preserve state of speaker port data bit 2022-06-08 13:06:20 -04:00
i8259.c KVM: x86/i8259: Remove a dead store of irq in a conditional block 2022-04-02 05:41:19 -04:00
ioapic.c KVM: x86/ioapic: Remove unused "addr" and "length" of ioapic_read_indirect() 2022-02-10 13:47:13 -05:00
ioapic.h x86/kvm: remove unused ack_notifier callbacks 2021-11-18 07:05:57 -05:00
irq_comm.c KVM: x86/xen: Make kvm_xen_set_evtchn() reusable from other places 2022-04-02 05:41:14 -04:00
irq.c KVM: x86/xen: handle PV timers oneshot mode 2022-04-02 05:41:16 -04:00
irq.h x86/kvm: remove unused ack_notifier callbacks 2021-11-18 07:05:57 -05:00
Kconfig KVM: x86/mmu: Remove MMU auditing 2022-02-18 13:46:23 -05:00
kvm_cache_regs.h KVM: X86: Remove kvm_register_clear_available() 2021-12-08 04:25:03 -05:00
kvm_emulate.h KVM: x86: Bug the VM if the emulator accesses a non-existent GPR 2022-06-10 10:01:33 -04:00
kvm_onhyperv.c KVM: x86: Uninline and export hv_track_root_tdp() 2022-02-10 13:47:19 -05:00
kvm_onhyperv.h KVM: x86: Uninline and export hv_track_root_tdp() 2022-02-10 13:47:19 -05:00
lapic.c KVM: x86: Move "apicv_active" into "struct kvm_lapic" 2022-06-20 06:21:24 -04:00
lapic.h KVM: x86: Use lapic_in_kernel() to query in-kernel APIC in APICv helper 2022-06-20 06:21:25 -04:00
Makefile KVM: Add Makefile.kvm for common files, use it for x86 2021-12-09 12:56:02 -05:00
mmu.h KVM: x86/mmu: Use separate namespaces for guest PTEs and shadow PTEs 2022-06-20 06:21:28 -04:00
mtrr.c KVM: x86: Add helper to consolidate "raw" reserved GPA mask calculations 2021-02-04 09:27:30 -05:00
pmu.c Revert "KVM: x86: always allow host-initiated writes to PMU MSRs" 2022-06-20 11:49:46 -04:00
pmu.h Revert "KVM: x86: always allow host-initiated writes to PMU MSRs" 2022-06-20 11:49:46 -04:00
reverse_cpuid.h KVM: SEV: Mask CPUID[0x8000001F].eax according to supported features 2021-04-26 05:27:15 -04:00
trace.h KVM: x86: Differentiate Soft vs. Hard IRQs vs. reinjected in tracepoint 2022-06-08 04:47:01 -04:00
tss.h
x86.c KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis 2022-06-24 04:51:49 -04:00
x86.h KVM: VMX: Enable Notify VM exit 2022-06-08 05:56:24 -04:00
xen.c KVM: x86/xen: Remove the redundantly included header file lapic.h 2022-04-13 13:37:18 -04:00
xen.h KVM: x86: do not set st->preempted when going back to user space 2022-06-08 04:21:06 -04:00