vgic_get_irq() has an awkward signature, as it takes both a kvm
*and* a vcpu, where the vcpu is allowed to be NULL if the INTID
being looked up is a global interrupt (SPI or LPI).
This leads to potentially problematic situations where the INTID
passed is a private interrupt, but that there is no vcpu.
In order to make things less ambiguous, let have *two* helpers
instead:
- vgic_get_irq(struct kvm *kvm, u32 intid), which is only concerned
with *global* interrupts, as indicated by the lack of vcpu.
- vgic_get_vcpu_irq(struct kvm_vcpu *vcpu, u32 intid), which can
return *any* interrupt class, but must have of course a non-NULL
vcpu.
Most of the code nicely falls under one or the other situations,
except for a couple of cases (close to the UABI or in the debug code)
where we have to distinguish between the two cases.
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20241117165757.247686-3-maz@kernel.org
Signed-off-by: Oliver Upton <oliver.upton@linux.dev>
Consolidate the GICv3 VMCR accessor hypercalls into the APR save/restore
hypercalls so that all of the EL2 GICv3 state is covered by a single pair
of hypercalls.
Signed-off-by: Fuad Tabba <tabba@google.com>
Acked-by: Oliver Upton <oliver.upton@linux.dev>
Link: https://lore.kernel.org/r/20240423150538.2103045-17-tabba@google.com
Signed-off-by: Marc Zyngier <maz@kernel.org>
Lockdep reports a circular lock dependency between the srcu and the
config_lock:
[ 262.179917] -> #1 (&kvm->srcu){.+.+}-{0:0}:
[ 262.182010] __synchronize_srcu+0xb0/0x224
[ 262.183422] synchronize_srcu_expedited+0x24/0x34
[ 262.184554] kvm_io_bus_register_dev+0x324/0x50c
[ 262.185650] vgic_register_redist_iodev+0x254/0x398
[ 262.186740] vgic_v3_set_redist_base+0x3b0/0x724
[ 262.188087] kvm_vgic_addr+0x364/0x600
[ 262.189189] vgic_set_common_attr+0x90/0x544
[ 262.190278] vgic_v3_set_attr+0x74/0x9c
[ 262.191432] kvm_device_ioctl+0x2a0/0x4e4
[ 262.192515] __arm64_sys_ioctl+0x7ac/0x1ba8
[ 262.193612] invoke_syscall.constprop.0+0x70/0x1e0
[ 262.195006] do_el0_svc+0xe4/0x2d4
[ 262.195929] el0_svc+0x44/0x8c
[ 262.196917] el0t_64_sync_handler+0xf4/0x120
[ 262.198238] el0t_64_sync+0x190/0x194
[ 262.199224]
[ 262.199224] -> #0 (&kvm->arch.config_lock){+.+.}-{3:3}:
[ 262.201094] __lock_acquire+0x2b70/0x626c
[ 262.202245] lock_acquire+0x454/0x778
[ 262.203132] __mutex_lock+0x190/0x8b4
[ 262.204023] mutex_lock_nested+0x24/0x30
[ 262.205100] vgic_mmio_write_v3_misc+0x5c/0x2a0
[ 262.206178] dispatch_mmio_write+0xd8/0x258
[ 262.207498] __kvm_io_bus_write+0x1e0/0x350
[ 262.208582] kvm_io_bus_write+0xe0/0x1cc
[ 262.209653] io_mem_abort+0x2ac/0x6d8
[ 262.210569] kvm_handle_guest_abort+0x9b8/0x1f88
[ 262.211937] handle_exit+0xc4/0x39c
[ 262.212971] kvm_arch_vcpu_ioctl_run+0x90c/0x1c04
[ 262.214154] kvm_vcpu_ioctl+0x450/0x12f8
[ 262.215233] __arm64_sys_ioctl+0x7ac/0x1ba8
[ 262.216402] invoke_syscall.constprop.0+0x70/0x1e0
[ 262.217774] do_el0_svc+0xe4/0x2d4
[ 262.218758] el0_svc+0x44/0x8c
[ 262.219941] el0t_64_sync_handler+0xf4/0x120
[ 262.221110] el0t_64_sync+0x190/0x194
Note that the current report, which can be triggered by the vgic_irq
kselftest, is a triple chain that includes slots_lock, but after
inverting the slots_lock/config_lock dependency, the actual problem
reported above remains.
In several places, the vgic code calls kvm_io_bus_register_dev(), which
synchronizes the srcu, while holding config_lock (#1). And the MMIO
handler takes the config_lock while holding the srcu read lock (#0).
Break dependency #1, by registering the distributor and redistributors
without holding config_lock. The ITS also uses kvm_io_bus_register_dev()
but already relies on slots_lock to serialize calls.
The distributor iodev is created on the first KVM_RUN call. Multiple
threads will race for vgic initialization, and only the first one will
see !vgic_ready() under the lock. To serialize those threads, rely on
slots_lock rather than config_lock.
Redistributors are created earlier, through KVM_DEV_ARM_VGIC_GRP_ADDR
ioctls and vCPU creation. Similarly, serialize the iodev creation with
slots_lock, and the rest with config_lock.
Fixes: f003277311 ("KVM: arm64: Use config_lock to protect vgic state")
Signed-off-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Reviewed-by: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20230518100914.2837292-2-jean-philippe@linaro.org
* kvm-arm64/vgic-fixes-5.17:
: .
: A few vgic fixes:
: - Harden vgic-v3 error handling paths against signed vs unsigned
: comparison that will happen once the xarray-based vcpus are in
: - Demote userspace-triggered console output to kvm_debug()
: .
KVM: arm64: vgic: Demote userspace-triggered console prints to kvm_debug()
KVM: arm64: vgic-v3: Fix vcpu index comparison
Signed-off-by: Marc Zyngier <maz@kernel.org>
Running the KVM selftests results in these messages being dumped
in the kernel console:
[ 188.051073] kvm [469]: VGIC redist and dist frames overlap
[ 188.056820] kvm [469]: VGIC redist and dist frames overlap
[ 188.076199] kvm [469]: VGIC redist and dist frames overlap
Being amle to trigger this from userspace is definitely not on,
so demote these warnings to kvm_debug().
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20211216104507.1482017-1-maz@kernel.org
GICv2 requires having device mappings in guests and the hypervisor,
which is incompatible with the current pKVM EL2 page ownership model
which only covers memory. While it would be desirable to support pKVM
with GICv2, this will require a lot more work, so let's make the
current assumption clear until then.
Co-developed-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Quentin Perret <qperret@google.com>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20211208152300.2478542-3-qperret@google.com
When a mapped level interrupt (a timer, for example) is deactivated
by the guest, the corresponding host interrupt is equally deactivated.
However, the fate of the pending state still needs to be dealt
with in SW.
This is specially true when the interrupt was in the active+pending
state in the virtual distributor at the point where the guest
was entered. On exit, the pending state is potentially stale
(the guest may have put the interrupt in a non-pending state).
If we don't do anything, the interrupt will be spuriously injected
in the guest. Although this shouldn't have any ill effect (spurious
interrupts are always possible), we can improve the emulation by
detecting the deactivation-while-pending case and resample the
interrupt.
While we're at it, move the logic into a common helper that can
be shared between the two GIC implementations.
Fixes: e40cc57bac ("KVM: arm/arm64: vgic: Support level-triggered mapped interrupts")
Reported-by: Raghavendra Rao Ananta <rananta@google.com>
Tested-by: Raghavendra Rao Ananta <rananta@google.com>
Reviewed-by: Oliver Upton <oupton@google.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210819180305.1670525-1-maz@kernel.org
In order to deal with these systems that do not offer HW-based
deactivation of interrupts, let implement a SW-based approach:
- When the irq is queued into a LR, treat it as a pure virtual
interrupt and set the EOI flag in the LR.
- When the interrupt state is read back from the LR, force a
deactivation when the state is invalid (neither active nor
pending)
Interrupts requiring such treatment get the VGIC_SW_RESAMPLE flag.
Signed-off-by: Marc Zyngier <maz@kernel.org>
dist->ready setting is pointlessly spread across the two vgic
backends, while it could be consolidated in kvm_vgic_map_resources().
Move it there, and slightly simplify the flows in both backends.
Suggested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
kvm_vgic_map_resources() is called when a VCPU if first run and it maps all
the VGIC MMIO regions. To prevent double-initialization, the VGIC uses the
ready variable to keep track of the state of resources and the global KVM
mutex to protect against concurrent accesses. After the lock is taken, the
variable is checked again in case another VCPU took the lock between the
current VCPU reading ready equals false and taking the lock.
The double-checked lock pattern is spread across four different functions:
in kvm_vcpu_first_run_init(), in kvm_vgic_map_resource() and in
vgic_{v2,v3}_map_resources(), which makes it hard to reason about and
introduces minor code duplication. Consolidate the checks in
kvm_vgic_map_resources(), where the lock is taken.
No functional change intended.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20201201150157.223625-4-alexandru.elisei@arm.com
If we move the used_lrs field to the version-specific cpu interface
structure, the following functions only operate on the struct
vgic_v3_cpu_if and not the full vcpu:
__vgic_v3_save_state
__vgic_v3_restore_state
__vgic_v3_activate_traps
__vgic_v3_deactivate_traps
__vgic_v3_save_aprs
__vgic_v3_restore_aprs
This is going to be very useful for nested virt, so move the used_lrs
field and change the prototypes and implementations of these functions to
take the cpu_if parameter directly.
No functional change.
Reviewed-by: James Morse <james.morse@arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Now that the 32bit KVM/arm host is a distant memory, let's move the
whole of the KVM/arm64 code into the arm64 tree.
As they said in the song: Welcome Home (Sanitarium).
Signed-off-by: Marc Zyngier <maz@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Link: https://lore.kernel.org/r/20200513104034.74741-1-maz@kernel.org