Commit Graph

840 Commits

Author SHA1 Message Date
Maciej S. Szmigiero
a54d806688 KVM: Keep memslots in tree-based structures instead of array-based ones
The current memslot code uses a (reverse gfn-ordered) memslot array for
keeping track of them.

Because the memslot array that is currently in use cannot be modified
every memslot management operation (create, delete, move, change flags)
has to make a copy of the whole array so it has a scratch copy to work on.

Strictly speaking, however, it is only necessary to make copy of the
memslot that is being modified, copying all the memslots currently present
is just a limitation of the array-based memslot implementation.

Two memslot sets, however, are still needed so the VM continues to run
on the currently active set while the requested operation is being
performed on the second, currently inactive one.

In order to have two memslot sets, but only one copy of actual memslots
it is necessary to split out the memslot data from the memslot sets.

The memslots themselves should be also kept independent of each other
so they can be individually added or deleted.

These two memslot sets should normally point to the same set of
memslots. They can, however, be desynchronized when performing a
memslot management operation by replacing the memslot to be modified
by its copy.  After the operation is complete, both memslot sets once
again point to the same, common set of memslot data.

This commit implements the aforementioned idea.

For tracking of gfns an ordinary rbtree is used since memslots cannot
overlap in the guest address space and so this data structure is
sufficient for ensuring that lookups are done quickly.

The "last used slot" mini-caches (both per-slot set one and per-vCPU one),
that keep track of the last found-by-gfn memslot, are still present in the
new code.

Co-developed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <17c0cf3663b760a0d3753d4ac08c0753e941b811.1638817641.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:34 -05:00
Maciej S. Szmigiero
6a656832aa KVM: s390: Introduce kvm_s390_get_gfn_end()
And use it where s390 code would just access the memslot with the highest
gfn directly.

No functional change intended.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <42496041d6af1c23b1cbba2636b344ca8d5fc3af.1638817641.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:33 -05:00
Maciej S. Szmigiero
c928bfc263 KVM: Integrate gfn_to_memslot_approx() into search_memslots()
s390 arch has gfn_to_memslot_approx() which is almost identical to
search_memslots(), differing only in that in case the gfn falls in a hole
one of the memslots bordering the hole is returned.

Add this lookup mode as an option to search_memslots() so we don't have two
almost identical functions for looking up a memslot by its gfn.

Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
[sean: tweaked helper names to keep gfn_to_memslot_approx() in s390]
Reviewed-by: Sean Christopherson <seanjc@google.com>
Message-Id: <171cd89b52c718dbe180ecd909b4437a64a7e2ec.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:30 -05:00
Sean Christopherson
ec5c869766 KVM: s390: Skip gfn/size sanity checks on memslot DELETE or FLAGS_ONLY
Sanity check the hva, gfn, and size of a userspace memory region only if
any of those properties can change, i.e. skip the checks for DELETE and
FLAGS_ONLY.  KVM doesn't allow moving the hva or changing the size, a gfn
change shows up as a MOVE even if flags are being modified, and the
checks are pointless for the DELETE case as userspace_addr and gfn_base
are zeroed by common KVM.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <05430738437ac2c9c7371ac4e11f4a533e1677da.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:27 -05:00
Sean Christopherson
6a99c6e3f5 KVM: Stop passing kvm_userspace_memory_region to arch memslot hooks
Drop the @mem param from kvm_arch_{prepare,commit}_memory_region() now
that its use has been removed in all architectures.

No functional change intended.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <aa5ed3e62c27e881d0d8bc0acbc1572bc336dc19.1638817640.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:25 -05:00
Sean Christopherson
cf5b486922 KVM: s390: Use "new" memslot instead of userspace memory region
Get the gfn, size, and hva from the new memslot instead of the userspace
memory region when preparing/committing memory region changes.  This will
allow a future commit to drop the @mem param.

Note, this has a subtle functional change as KVM would previously reject
DELETE if userspace provided a garbage userspace_addr or guest_phys_addr,
whereas KVM zeros those fields in the "new" memslot when deleting an
existing memslot.  Arguably the old behavior is more correct, but there's
zero benefit into requiring userspace to provide sane values for hva and
gfn.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <917ed131c06a4c7b35dd7fb7ed7955be899ad8cc.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:23 -05:00
Sean Christopherson
537a17b314 KVM: Let/force architectures to deal with arch specific memslot data
Pass the "old" slot to kvm_arch_prepare_memory_region() and force arch
code to handle propagating arch specific data from "new" to "old" when
necessary.  This is a baby step towards dynamically allocating "new" from
the get go, and is a (very) minor performance boost on x86 due to not
unnecessarily copying arch data.

For PPC HV, copy the rmap in the !CREATE and !DELETE paths, i.e. for MOVE
and FLAGS_ONLY.  This is functionally a nop as the previous behavior
would overwrite the pointer for CREATE, and eventually discard/ignore it
for DELETE.

For x86, copy the arch data only for FLAGS_ONLY changes.  Unlike PPC HV,
x86 needs to reallocate arch data in the MOVE case as the size of x86's
allocations depend on the alignment of the memslot's gfn.

Opportunistically tweak kvm_arch_prepare_memory_region()'s param order to
match the "commit" prototype.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
[mss: add missing RISCV kvm_arch_prepare_memory_region() change]
Signed-off-by: Maciej S. Szmigiero <maciej.szmigiero@oracle.com>
Message-Id: <67dea5f11bbcfd71e3da5986f11e87f5dd4013f9.1638817639.git.maciej.szmigiero@oracle.com>
2021-12-08 04:24:20 -05:00
Marc Zyngier
46808a4cb8 KVM: Use 'unsigned long' as kvm_for_each_vcpu()'s index
Everywhere we use kvm_for_each_vpcu(), we use an int as the vcpu
index. Unfortunately, we're about to move rework the iterator,
which requires this to be upgrade to an unsigned long.

Let's bite the bullet and repaint all of it in one go.

Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-7-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:15 -05:00
Marc Zyngier
113d10bca2 KVM: s390: Use kvm_get_vcpu() instead of open-coded access
As we are about to change the way vcpus are allocated, mandate
the use of kvm_get_vcpu() instead of open-coding the access.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-4-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:14 -05:00
Marc Zyngier
27592ae8db KVM: Move wiping of the kvm->vcpus array to common code
All architectures have similar loops iterating over the vcpus,
freeing one vcpu at a time, and eventually wiping the reference
off the vcpus array. They are also inconsistently taking
the kvm->lock mutex when wiping the references from the array.

Make this code common, which will simplify further changes.
The locking is dropped altogether, as this should only be called
when there is no further references on the kvm structure.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Message-Id: <20211116160403.4074052-2-maz@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-12-08 04:24:13 -05:00
Vitaly Kuznetsov
82cc27eff4 KVM: s390: Cap KVM_CAP_NR_VCPUS by num_online_cpus()
KVM_CAP_NR_VCPUS is a legacy advisory value which on other architectures
return num_online_cpus() caped by KVM_CAP_NR_VCPUS or something else
(ppc and arm64 are special cases). On s390, KVM_CAP_NR_VCPUS returns
the same as KVM_CAP_MAX_VCPUS and this may turn out to be a bad
'advice'. Switch s390 to returning caped num_online_cpus() too.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@linux.ibm.com>
Message-Id: <20211116163443.88707-6-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-11-18 02:12:15 -05:00
Collin Walling
3fd8417f2c KVM: s390: add debug statement for diag 318 CPNC data
The diag 318 data contains values that denote information regarding the
guest's environment. Currently, it is unecessarily difficult to observe
this value (either manually-inserted debug statements, gdb stepping, mem
dumping etc). It's useful to observe this information to obtain an
at-a-glance view of the guest's environment, so lets add a simple VCPU
event that prints the CPNC to the s390dbf logs.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20211027025451.290124-1-walling@linux.ibm.com
[borntraeger@de.ibm.com]: change debug level to 3
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-27 07:55:53 +02:00
Eric Farman
67cf68b6a5 KVM: s390: Add a routine for setting userspace CPU state
This capability exists, but we don't record anything when userspace
enables it. Let's refactor that code so that a note can be made in
the debug logs that it was enabled.

Signed-off-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20211008203112.1979843-7-farman@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-25 09:20:39 +02:00
Halil Pasic
9b57e9d501 KVM: s390: clear kicked_mask before sleeping again
The idea behind kicked mask is that we should not re-kick a vcpu that
is already in the "kick" process, i.e. that was kicked and is
is about to be dispatched if certain conditions are met.

The problem with the current implementation is, that it assumes the
kicked vcpu is going to enter SIE shortly. But under certain
circumstances, the vcpu we just kicked will be deemed non-runnable and
will remain in wait state. This can happen, if the interrupt(s) this
vcpu got kicked to deal with got already cleared (because the interrupts
got delivered to another vcpu). In this case kvm_arch_vcpu_runnable()
would return false, and the vcpu would remain in kvm_vcpu_block(),
but this time with its kicked_mask bit set. So next time around we
wouldn't kick the vcpu form __airqs_kick_single_vcpu(), but would assume
that we just kicked it.

Let us make sure the kicked_mask is cleared before we give up on
re-dispatching the vcpu.

Fixes: 9f30f62163 ("KVM: s390: add gib_alert_irq_handler()")
Reported-by: Matthew Rosato <mjrosato@linux.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20211019175401.3757927-2-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-20 13:03:04 +02:00
Sean Christopherson
4eeef24241 KVM: x86: Query vcpu->vcpu_idx directly and drop its accessor
Read vcpu->vcpu_idx directly instead of bouncing through the one-line
wrapper, kvm_vcpu_get_idx(), and drop the wrapper.  The wrapper is a
remnant of the original implementation and serves no purpose; remove it
before it gains more users.

Back when kvm_vcpu_get_idx() was added by commit 497d72d80a ("KVM: Add
kvm_vcpu_get_idx to get vcpu index in kvm->vcpus"), the implementation
was more than just a simple wrapper as vcpu->vcpu_idx did not exist and
retrieving the index meant walking over the vCPU array to find the given
vCPU.

When vcpu_idx was introduced by commit 8750e72a79 ("KVM: remember
position in kvm->vcpus array"), the helper was left behind, likely to
avoid extra thrash (but even then there were only two users, the original
arm usage having been removed at some point in the past).

No functional change intended.

Suggested-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20210910183220.2397812-2-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-09-22 10:33:11 -04:00
Linus Torvalds
192ad3c27a ARM:
- Page ownership tracking between host EL1 and EL2
 
 - Rely on userspace page tables to create large stage-2 mappings
 
 - Fix incompatibility between pKVM and kmemleak
 
 - Fix the PMU reset state, and improve the performance of the virtual PMU
 
 - Move over to the generic KVM entry code
 
 - Address PSCI reset issues w.r.t. save/restore
 
 - Preliminary rework for the upcoming pKVM fixed feature
 
 - A bunch of MM cleanups
 
 - a vGIC fix for timer spurious interrupts
 
 - Various cleanups
 
 s390:
 
 - enable interpretation of specification exceptions
 
 - fix a vcpu_idx vs vcpu_id mixup
 
 x86:
 
 - fast (lockless) page fault support for the new MMU
 
 - new MMU now the default
 
 - increased maximum allowed VCPU count
 
 - allow inhibit IRQs on KVM_RUN while debugging guests
 
 - let Hyper-V-enabled guests run with virtualized LAPIC as long as they
   do not enable the Hyper-V "AutoEOI" feature
 
 - fixes and optimizations for the toggling of AMD AVIC (virtualized LAPIC)
 
 - tuning for the case when two-dimensional paging (EPT/NPT) is disabled
 
 - bugfixes and cleanups, especially with respect to 1) vCPU reset and
   2) choosing a paging mode based on CR0/CR4/EFER
 
 - support for 5-level page table on AMD processors
 
 Generic:
 
 - MMU notifier invalidation callbacks do not take mmu_lock unless necessary
 
 - improved caching of LRU kvm_memory_slot
 
 - support for histogram statistics
 
 - add statistics for halt polling and remote TLB flush requests
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmE2CIAUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMyqwf+Ky2WoThuQ9Ra0r/m8pUTAx5+gsAf
 MmG24rNLE+26X0xuBT9Q5+etYYRLrRTWJvo5cgHooz7muAYW6scR+ho5xzvLTAxi
 DAuoijkXsSdGoFCp0OMUHiwG3cgY5N7feTEwLPAb2i6xr/l6SZyCP4zcwiiQbJ2s
 UUD0i3rEoNQ02/hOEveud/ENxzUli9cmmgHKXR3kNgsJClSf1fcuLnhg+7EGMhK9
 +c2V+hde5y0gmEairQWm22MLMRolNZ5NL4kjykiNh2M5q9YvbHe5+f/JmENlNZMT
 bsUQT6Ry1ukuJ0V59rZvUw71KknPFzZ3d6HgW4pwytMq6EJKiISHzRbVnQ==
 =FCAB
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - Page ownership tracking between host EL1 and EL2
   - Rely on userspace page tables to create large stage-2 mappings
   - Fix incompatibility between pKVM and kmemleak
   - Fix the PMU reset state, and improve the performance of the virtual
     PMU
   - Move over to the generic KVM entry code
   - Address PSCI reset issues w.r.t. save/restore
   - Preliminary rework for the upcoming pKVM fixed feature
   - A bunch of MM cleanups
   - a vGIC fix for timer spurious interrupts
   - Various cleanups

  s390:
   - enable interpretation of specification exceptions
   - fix a vcpu_idx vs vcpu_id mixup

  x86:
   - fast (lockless) page fault support for the new MMU
   - new MMU now the default
   - increased maximum allowed VCPU count
   - allow inhibit IRQs on KVM_RUN while debugging guests
   - let Hyper-V-enabled guests run with virtualized LAPIC as long as
     they do not enable the Hyper-V "AutoEOI" feature
   - fixes and optimizations for the toggling of AMD AVIC (virtualized
     LAPIC)
   - tuning for the case when two-dimensional paging (EPT/NPT) is
     disabled
   - bugfixes and cleanups, especially with respect to vCPU reset and
     choosing a paging mode based on CR0/CR4/EFER
   - support for 5-level page table on AMD processors

  Generic:
   - MMU notifier invalidation callbacks do not take mmu_lock unless
     necessary
   - improved caching of LRU kvm_memory_slot
   - support for histogram statistics
   - add statistics for halt polling and remote TLB flush requests"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (210 commits)
  KVM: Drop unused kvm_dirty_gfn_invalid()
  KVM: x86: Update vCPU's hv_clock before back to guest when tsc_offset is adjusted
  KVM: MMU: mark role_regs and role accessors as maybe unused
  KVM: MIPS: Remove a "set but not used" variable
  x86/kvm: Don't enable IRQ when IRQ enabled in kvm_wait
  KVM: stats: Add VM stat for remote tlb flush requests
  KVM: Remove unnecessary export of kvm_{inc,dec}_notifier_count()
  KVM: x86/mmu: Move lpage_disallowed_link further "down" in kvm_mmu_page
  KVM: x86/mmu: Relocate kvm_mmu_page.tdp_mmu_page for better cache locality
  Revert "KVM: x86: mmu: Add guest physical address check in translate_gpa()"
  KVM: x86/mmu: Remove unused field mmio_cached in struct kvm_mmu_page
  kvm: x86: Increase KVM_SOFT_MAX_VCPUS to 710
  kvm: x86: Increase MAX_VCPUS to 1024
  kvm: x86: Set KVM_MAX_VCPU_ID to 4*KVM_MAX_VCPUS
  KVM: VMX: avoid running vmx_handle_exit_irqoff in case of emulation
  KVM: x86/mmu: Don't freak out if pml5_root is NULL on 4-level host
  KVM: s390: index kvm->arch.idle_mask by vcpu_idx
  KVM: s390: Enable specification exception interpretation
  KVM: arm64: Trim guest debug exception handling
  KVM: SVM: Add 5-level page table support for SVM
  ...
2021-09-07 13:40:51 -07:00
Paolo Bonzini
0d0a19395b KVM: s390: Fix and feature for 5.15
- enable interpretion of specification exceptions
 - fix a vcpu_idx vs vcpu_id mixup
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJhKxuYAAoJEBF7vIC1phx8w80P/i3mOFrNWyS9eVuFnFc34Bb5
 soueWQKv3HR3pdjPhN6GnfkZsZlHjcflTQzvB3QNi6xlfhJSkwh+hoYZvhjZUJCa
 25/vRYJfkkp6xWu+tGsfrBC+2e6muFMLE5PRBVvjEcrCb/IM3rfzBazcXwahBuTW
 oT2j9+eYvlD6hy2kQSHaAkhIW5ldJj6RJ9cVE1lcWs+6YjP6e17a3SHMUXoBUjSv
 RAIv3iCgK0IuVALGzvC4QeKblTv2l+TrcUyIiQdFjnzv7IIcfPzzJFdYRg6g6pbR
 vsz5a309MyOaiVERUNqmNhDC0j6TLwS2eSvdqksJXhfmoT1kbK+rkum10IM7Q4ii
 rmKw3Wmvb+Dyde4SILcpQt2zsg5KjGAOFOqIb0f8lq7cIvrqdA1FbY8XxjJgkmSo
 EM9GkRWqS6K5nwNV7bSeLmFmdzYeCTFfzeYIPWI/cjZB4KPd9K0aU8LtiLJjZn2y
 xjfvZE+iRzPZL2sBZLtovhfOrqh8LLTd6o+i6KIgCC98CPOL0Y3ShNxPsCgN1sXH
 REK9dPKymoXEbFyUabUAerrUrjWZc80AnFZMWTz9wybwTuSWyIGxIUBuq2W1aLSV
 AKN8BEVdiB2PGsTlzio8JUtDb8dzHbEHyNrzKrYYdPb681b5EYRHZE5Eh9YejiaI
 I4d6bO1OmJD8uSaF+qRV
 =Usbs
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fix and feature for 5.15

- enable interpretion of specification exceptions
- fix a vcpu_idx vs vcpu_id mixup
2021-09-06 06:33:40 -04:00
Halil Pasic
a3e03bc136 KVM: s390: index kvm->arch.idle_mask by vcpu_idx
While in practice vcpu->vcpu_idx ==  vcpu->vcp_id is often true, it may
not always be, and we must not rely on this. Reason is that KVM decides
the vcpu_idx, userspace decides the vcpu_id, thus the two might not
match.

Currently kvm->arch.idle_mask is indexed by vcpu_id, which implies
that code like
for_each_set_bit(vcpu_id, kvm->arch.idle_mask, online_vcpus) {
                vcpu = kvm_get_vcpu(kvm, vcpu_id);
		do_stuff(vcpu);
}
is not legit. Reason is that kvm_get_vcpu expects an vcpu_idx, not an
vcpu_id.  The trouble is, we do actually use kvm->arch.idle_mask like
this. To fix this problem we have two options. Either use
kvm_get_vcpu_by_id(vcpu_id), which would loop to find the right vcpu_id,
or switch to indexing via vcpu_idx. The latter is preferable for obvious
reasons.

Let us make switch from indexing kvm->arch.idle_mask by vcpu_id to
indexing it by vcpu_idx.  To keep gisa_int.kicked_mask indexed by the
same index as idle_mask lets make the same change for it as well.

Fixes: 1ee0bc559d ("KVM: s390: get rid of local_int array")
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Christian Bornträger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Cc: <stable@vger.kernel.org> # 3.15+
Link: https://lore.kernel.org/r/20210827125429.1912577-1-pasic@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-08-27 18:35:41 +02:00
Janis Schoetterl-Glausch
7119decf47 KVM: s390: Enable specification exception interpretation
When this feature is enabled the hardware is free to interpret
specification exceptions generated by the guest, instead of causing
program interruption interceptions.

This benefits (test) programs that generate a lot of specification
exceptions (roughly 4x increase in exceptions/sec).

Interceptions will occur as before if ICTL_PINT is set,
i.e. if guest debug is enabled.

There is no indication if this feature is available or not and the
hardware is free to interpret or not. So we can simply set this bit and
if the hardware ignores it we fall back to intercept 8 handling.

Signed-off-by: Janis Schoetterl-Glausch <scgl@linux.ibm.com>
Link: https://lore.kernel.org/linux-s390/20210706114714.3936825-1-scgl@linux.ibm.com/
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-08-27 18:35:20 +02:00
Tony Krowiak
86956e7076 s390/vfio-ap: replace open coded locks for VFIO_GROUP_NOTIFY_SET_KVM notification
It was pointed out during an unrelated patch review that locks should not
be open coded - i.e., writing the algorithm of a standard lock in a
function instead of using a lock from the standard library. The setting and
testing of a busy flag and sleeping on a wait_event is the same thing
a lock does. The open coded locks are invisible to lockdep, so potential
locking problems are not detected.

This patch removes the open coded locks used during
VFIO_GROUP_NOTIFY_SET_KVM notification. The busy flag
and wait queue were introduced to resolve a possible circular locking
dependency reported by lockdep when starting a secure execution guest
configured with AP adapters and domains. Reversing the order in which
the kvm->lock mutex and matrix_dev->lock mutex are locked resolves the
issue reported by lockdep, thus enabling the removal of the open coded
locks.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-3-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-24 12:14:05 -06:00
Tony Krowiak
1e753732bd s390/vfio-ap: r/w lock for PQAP interception handler function pointer
The function pointer to the interception handler for the PQAP instruction
can get changed during the interception process. Let's add a
semaphore to struct kvm_s390_crypto to control read/write access to the
function pointer contained therein.

The semaphore must be locked for write access by the vfio_ap device driver
when notified that the KVM pointer has been set or cleared. It must be
locked for read access by the interception framework when the PQAP
instruction is intercepted.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210823212047.1476436-2-akrowiak@linux.ibm.com
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2021-08-24 12:14:05 -06:00
Jing Zhang
f95937ccf5 KVM: stats: Support linear and logarithmic histogram statistics
Add new types of KVM stats, linear and logarithmic histogram.
Histogram are very useful for observing the value distribution
of time or size related stats.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210802165633.1866976-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-20 16:06:32 -04:00
David Matlack
87689270b1 KVM: Rename lru_slot to last_used_slot
lru_slot is used to keep track of the index of the most-recently used
memslot. The correct acronym would be "mru" but that is not a common
acronym. So call it last_used_slot which is a bit more obvious.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Matlack <dmatlack@google.com>
Message-Id: <20210804222844.1419481-2-dmatlack@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-08-06 07:52:28 -04:00
Christian Borntraeger
bb000f640e KVM: s390: restore old debugfs names
commit bc9e9e672d ("KVM: debugfs: Reuse binary stats descriptors")
did replace the old definitions with the binary ones. While doing that
it missed that some files are names different than the counters. This
is especially important for kvm_stat which does have special handling
for counters named instruction_*.

Fixes: commit bc9e9e672d ("KVM: debugfs: Reuse binary stats descriptors")
CC: Jing Zhang <jingzhangos@google.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20210726150108.5603-1-borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-07-27 16:59:00 -04:00
Linus Torvalds
2bb919b62f s390 updates for the 5.14 merge window
- Rework inline asm to get rid of error prone "register asm" constructs,
   which are problematic especially when code instrumentation is enabled. In
   particular introduce and use register pair union to allocate even/odd
   register pairs. Unfortunately this breaks compatibility with older
   clang compilers and minimum clang version for s390 has been raised to 13.
   https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/
 
 - Fix gcc 11 warnings, which triggered various minor reworks all over
   the code.
 
 - Add zstd kernel image compression support.
 
 - Rework boot CPU lowcore handling.
 
 - De-duplicate and move kernel memory layout setup logic earlier.
 
 - Few fixes in preparation for FORTIFY_SOURCE performing compile-time
   and run-time field bounds checking for mem functions.
 
 - Remove broken and unused power management support leftovers in s390
   drivers.
 
 - Disable stack-protector for decompressor and purgatory to fix buildroot
   build.
 
 - Fix vt220 sclp console name to match the char device name.
 
 - Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in
   zPCI code.
 
 - Remove some implausible WARN_ON_ONCEs and remove arch specific counter
   transaction call backs in favour of default transaction handling in
   perf code.
 
 - Extend/add new uevents for online/config/mode state changes of
   AP card / queue device in zcrypt.
 
 - Minor entry and ccwgroup code improvements.
 
 - Other small various fixes and improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAmDhuTEACgkQjYWKoQLX
 FBjVlggAgDFBkDjlyfvrm4xzmHi7BJMmhrTJIONsSz+3tcA4/u5kE+Hrdrqxm0Uh
 ZH4MXBxn4q4Fmoomhu5w5ZDe8o2ip0aN9fFNdsBoP8hurmQbL/IbdTnBETKMrKpV
 XpogU2G7p+2nQ0+9+o6PS/vWlZhI88NVh8dWyRd2+5/XdMycgLv2Qm7NpQoACVw1
 CbUvxP2PlpZ0wltLvNBKPg1xXMZa3GS0wbVUsS2jiWcr/3VzCqfTHenZJ/RadoE6
 axG99QXCbLDMsJgVQcXtlI8K6Z461fAwbNtWZWC+Uq7o5pYuUFW1dovMg9WWF+7T
 lFNqXyyNy5wwITRkvuzjlVTE8yzYYg==
 =ADZ4
 -----END PGP SIGNATURE-----

Merge tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Rework inline asm to get rid of error prone "register asm"
   constructs, which are problematic especially when code
   instrumentation is enabled.

   In particular introduce and use register pair union to allocate
   even/odd register pairs. Unfortunately this breaks compatibility with
   older clang compilers and minimum clang version for s390 has been
   raised to 13.

     https://lore.kernel.org/linux-next/CAK7LNARuSmPCEy-ak0erPrPTgZdGVypBROFhtw+=3spoGoYsyw@mail.gmail.com/

 - Fix gcc 11 warnings, which triggered various minor reworks all over
   the code.

 - Add zstd kernel image compression support.

 - Rework boot CPU lowcore handling.

 - De-duplicate and move kernel memory layout setup logic earlier.

 - Few fixes in preparation for FORTIFY_SOURCE performing compile-time
   and run-time field bounds checking for mem functions.

 - Remove broken and unused power management support leftovers in s390
   drivers.

 - Disable stack-protector for decompressor and purgatory to fix
   buildroot build.

 - Fix vt220 sclp console name to match the char device name.

 - Enable HAVE_IOREMAP_PROT and add zpci_set_irq()/zpci_clear_irq() in
   zPCI code.

 - Remove some implausible WARN_ON_ONCEs and remove arch specific
   counter transaction call backs in favour of default transaction
   handling in perf code.

 - Extend/add new uevents for online/config/mode state changes of AP
   card / queue device in zcrypt.

 - Minor entry and ccwgroup code improvements.

 - Other small various fixes and improvements all over the code.

* tag 's390-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (91 commits)
  s390/dasd: use register pair instead of register asm
  s390/qdio: get rid of register asm
  s390/ioasm: use symbolic names for asm operands
  s390/ioasm: get rid of register asm
  s390/cmf: get rid of register asm
  s390/lib,string: get rid of register asm
  s390/lib,uaccess: get rid of register asm
  s390/string: get rid of register asm
  s390/cmpxchg: use register pair instead of register asm
  s390/mm,pages-states: get rid of register asm
  s390/lib,xor: get rid of register asm
  s390/timex: get rid of register asm
  s390/hypfs: use register pair instead of register asm
  s390/zcrypt: Switch to flexible array member
  s390/speculation: Use statically initialized const for instructions
  virtio/s390: get rid of open-coded kvm hypercall
  s390/pci: add zpci_set_irq()/zpci_clear_irq()
  scripts/min-tool-version.sh: Raise minimum clang version to 13.0.0 for s390
  s390/ipl: use register pair instead of register asm
  s390/mem_detect: fix tprot() program check new psw handling
  ...
2021-07-04 12:17:38 -07:00
Paolo Bonzini
79b1e56509 KVM: s390: Features for 5.14
- new HW facilities for guests
 - make inline assembly more robust with KASAN and co
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJg1ZfMAAoJEBF7vIC1phx8uv0P/0glFasUp3GEUWUzjcTycFFf
 SAPiyrk4ucU/8eOJQVWMLPL0pESTgGZkxaa5+rChJA4K00Pf+KWEDRMqNpZ5/eOY
 SVq4XqHUZtRKHWH1z6B7Sfx3GliIAqsEmJz1dOcXp11CxIzumBD9gAHNaYLBqkKt
 5b5UFn/GkyutnL+CEBYVIOXvd1QBrEKOtiIfnIPDZAJCpjUh68lFBjW4SOsd7fz7
 9VDUjZrIRN+CWb/AfWEnInzlBoyjgIbwfxQIKXcpeZsKWpYzQJ+Oti0ZFoKPtfdV
 G7zzgwyPG5vXbJETxBg58M8NddW0Ft+jttz/GJ7NtzWi2a046Mp02Udk47vpL1AW
 DzZgatOQasFP5PBOBpOn460BhuUdYkSrHOXZbRO3/rlrFd7UbJiTBIaV7lYaeZ6T
 nImP5/Rd8NPFPfJB990inFjqyburfA7rCWv8oB2a2n3YduV4bI4t5d71Giz9ibaH
 gm/zWJdIZYHaMvE7sCWiXnStXEs1DEOeMvZpkBpOUf2/DEvfCUIOiaepOnQrl7GW
 jMFACO471PCh7xDvohNxo0tbs59+Ctfglo/gy12yZMtsfpgq4iHP1BrnfOTB6Xig
 rzJT2rSWgPw1nViuZQOqypd8ZhkrfoHZg1xnjwJ7tiWnlFNhpWJkZpqYmz7qtiUn
 W5Svlo06FWoGKfGwvZf6
 =HiDo
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.14-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Features for 5.14

- new HW facilities for guests
- make inline assembly more robust with KASAN and co
2021-06-25 10:50:11 -04:00
Jing Zhang
bc9e9e672d KVM: debugfs: Reuse binary stats descriptors
To remove code duplication, use the binary stats descriptors in the
implementation of the debugfs interface for statistics. This unifies
the definition of statistics for the binary and debugfs interfaces.

Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-8-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:29 -04:00
Jing Zhang
ce55c04945 KVM: stats: Support binary stats retrieval for a VCPU
Add a VCPU ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VCPU stats header,
descriptors and data.
Define VCPU statistics descriptors and header for all architectures.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-5-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:19 -04:00
Jing Zhang
fcfe1baedd KVM: stats: Support binary stats retrieval for a VM
Add a VM ioctl to get a statistics file descriptor by which a read
functionality is provided for userspace to read out VM stats header,
descriptors and data.
Define VM statistics descriptors and header for all architectures.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Reviewed-by: Fuad Tabba <tabba@google.com>
Tested-by: Fuad Tabba <tabba@google.com> #arm64
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-4-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 18:00:10 -04:00
Jing Zhang
0193cc908b KVM: stats: Separate generic stats from architecture specific ones
Generic KVM stats are those collected in architecture independent code
or those supported by all architectures; put all generic statistics in
a separate structure.  This ensures that they are defined the same way
in the statistics API which is being added, removing duplication among
different architectures in the declaration of the descriptors.

No functional change intended.

Reviewed-by: David Matlack <dmatlack@google.com>
Reviewed-by: Ricardo Koller <ricarkol@google.com>
Reviewed-by: Krish Sadhukhan <krish.sadhukhan@oracle.com>
Signed-off-by: Jing Zhang <jingzhangos@google.com>
Message-Id: <20210618222709.1858088-2-jingzhangos@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-06-24 11:47:56 -04:00
Christian Borntraeger
1f703d2cf2 KVM: s390: allow facility 192 (vector-packed-decimal-enhancement facility 2)
pass through newer vector instructions if vector support is enabled.

Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-06-23 09:35:20 +02:00
Heiko Carstens
4fa3b91bde KVM: s390: get rid of register asm usage
Using register asm statements has been proven to be very error prone,
especially when using code instrumentation where gcc may add function
calls, which clobbers register contents in an unexpected way.

Therefore get rid of register asm statements in kvm code, even though
there is currently nothing wrong with them. This way we know for sure
that this bug class won't be introduced here.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Link: https://lore.kernel.org/r/20210621140356.1210771-1-hca@linux.ibm.com
[borntraeger@de.ibm.com: checkpatch strict fix]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-06-23 09:22:37 +02:00
Sven Schnelle
17e89e1340 s390/facilities: move stfl information from lowcore to global data
With gcc-11, there are a lot of warnings because the facility functions
are accessing lowcore through a null pointer. Fix this by moving the
facility arrays away from lowcore.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-06-07 17:06:58 +02:00
Maxim Levitsky
a43b80b782 KVM: s390x: implement KVM_CAP_SET_GUEST_DEBUG2
Define KVM_GUESTDBG_VALID_MASK and use it to implement this capabiity.
Compile tested only.

Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20210401135451.1004564-6-mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2021-04-17 08:31:03 -04:00
Heiko Carstens
44bada2821 KVM: s390: fix guarded storage control register handling
store_regs_fmt2() has an ordering problem: first the guarded storage
facility is enabled on the local cpu, then preemption disabled, and
then the STGSC (store guarded storage controls) instruction is
executed.

If the process gets scheduled away between enabling the guarded
storage facility and before preemption is disabled, this might lead to
a special operation exception and therefore kernel crash as soon as
the process is scheduled back and the STGSC instruction is executed.

Fixes: 4e0b1ab72b ("KVM: s390: gs support for kvm guests")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Cc: <stable@vger.kernel.org> # 4.12
Link: https://lore.kernel.org/r/20210415080127.1061275-1-hca@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-04-15 15:35:38 +02:00
Pierre Morel
87e28a15c4 KVM: s390: diag9c (directed yield) forwarding
When we intercept a DIAG_9C from the guest we verify that the
target real CPU associated with the virtual CPU designated by
the guest is running and if not we forward the DIAG_9C to the
target real CPU.

To avoid a diag9c storm we allow a maximal rate of diag9c forwarding.

The rate is calculated as a count per second defined as a new
parameter of the s390 kvm module: diag9c_forwarding_hz .

The default value of 0 is to not forward diag9c.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Link: https://lore.kernel.org/r/1613997661-22525-2-git-send-email-pmorel@linux.ibm.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-03-09 10:16:26 +01:00
Bhaskar Chowdhury
38860756a1 KVM: s390: Fix comment spelling in kvm_s390_vcpu_start()
s/oustanding/outstanding/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20210213153227.1640682-1-unixbhaskar@gmail.com
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-03-09 10:06:48 +01:00
Heiko Carstens
2cfd7b73f5 s390/kvm: use union tod_clock
Use union tod_clock and get rid of the kvm specific struct
kvm_s390_tod_clock_ext which apparently was introduced for the same
purpose.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-02-13 17:17:54 +01:00
Sven Schnelle
56e62a7370 s390: convert to generic entry
This patch converts s390 to use the generic entry infrastructure from
kernel/entry/*.

There are a few special things on s390:

- PIF_PER_TRAP is moved to TIF_PER_TRAP as the generic code doesn't
  know about our PIF flags in exit_to_user_mode_loop().

- The old code had several ways to restart syscalls:

  a) PIF_SYSCALL_RESTART, which was only set during execve to force a
     restart after upgrading a process (usually qemu-kvm) to pgste page
     table extensions.

  b) PIF_SYSCALL, which is set by do_signal() to indicate that the
     current syscall should be restarted. This is changed so that
     do_signal() now also uses PIF_SYSCALL_RESTART. Continuing to use
     PIF_SYSCALL doesn't work with the generic code, and changing it
     to PIF_SYSCALL_RESTART makes PIF_SYSCALL and PIF_SYSCALL_RESTART
     more unique.

- On s390 calling sys_sigreturn or sys_rt_sigreturn is implemented by
executing a svc instruction on the process stack which causes a fault.
While handling that fault the fault code sets PIF_SYSCALL to hand over
processing to the syscall code on exit to usermode.

The patch introduces PIF_SYSCALL_RET_SET, which is set if ptrace sets
a return value for a syscall. The s390x ptrace ABI uses r2 both for the
syscall number and return value, so ptrace cannot set the syscall number +
return value at the same time. The flag makes handling that a bit easier.
do_syscall() will just skip executing the syscall if PIF_SYSCALL_RET_SET
is set.

CONFIG_DEBUG_ASCE was removd in favour of the generic CONFIG_DEBUG_ENTRY.
CR1/7/13 will be checked both on kernel entry and exit to contain the
correct asces.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-01-19 12:29:26 +01:00
Linus Torvalds
6a447b0e31 ARM:
* PSCI relay at EL2 when "protected KVM" is enabled
 * New exception injection code
 * Simplification of AArch32 system register handling
 * Fix PMU accesses when no PMU is enabled
 * Expose CSV3 on non-Meltdown hosts
 * Cache hierarchy discovery fixes
 * PV steal-time cleanups
 * Allow function pointers at EL2
 * Various host EL2 entry cleanups
 * Simplification of the EL2 vector allocation
 
 s390:
 * memcg accouting for s390 specific parts of kvm and gmap
 * selftest for diag318
 * new kvm_stat for when async_pf falls back to sync
 
 x86:
 * Tracepoints for the new pagetable code from 5.10
 * Catch VFIO and KVM irqfd events before userspace
 * Reporting dirty pages to userspace with a ring buffer
 * SEV-ES host support
 * Nested VMX support for wait-for-SIPI activity state
 * New feature flag (AVX512 FP16)
 * New system ioctl to report Hyper-V-compatible paravirtualization features
 
 Generic:
 * Selftest improvements
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl/bdL4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNgQQgAnTH6rhXa++Zd5F0EM2NwXwz3iEGb
 lOq1DZSGjs6Eekjn8AnrWbmVQr+CBCuGU9MrxpSSzNDK/awryo3NwepOWAZw9eqk
 BBCVwGBbJQx5YrdgkGC0pDq2sNzcpW/VVB3vFsmOxd9eHblnuKSIxEsCCXTtyqIt
 XrLpQ1UhvI4yu102fDNhuFw2EfpzXm+K0Lc0x6idSkdM/p7SyeOxiv8hD4aMr6+G
 bGUQuMl4edKZFOWFigzr8NovQAvDHZGrwfihu2cLRYKLhV97QuWVmafv/yYfXcz2
 drr+wQCDNzDOXyANnssmviazrhOX0QmTAhbIXGGX/kTxYKcfPi83ZLoI3A==
 =ISud
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "Much x86 work was pushed out to 5.12, but ARM more than made up for it.

  ARM:
   - PSCI relay at EL2 when "protected KVM" is enabled
   - New exception injection code
   - Simplification of AArch32 system register handling
   - Fix PMU accesses when no PMU is enabled
   - Expose CSV3 on non-Meltdown hosts
   - Cache hierarchy discovery fixes
   - PV steal-time cleanups
   - Allow function pointers at EL2
   - Various host EL2 entry cleanups
   - Simplification of the EL2 vector allocation

  s390:
   - memcg accouting for s390 specific parts of kvm and gmap
   - selftest for diag318
   - new kvm_stat for when async_pf falls back to sync

  x86:
   - Tracepoints for the new pagetable code from 5.10
   - Catch VFIO and KVM irqfd events before userspace
   - Reporting dirty pages to userspace with a ring buffer
   - SEV-ES host support
   - Nested VMX support for wait-for-SIPI activity state
   - New feature flag (AVX512 FP16)
   - New system ioctl to report Hyper-V-compatible paravirtualization features

  Generic:
   - Selftest improvements"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (171 commits)
  KVM: SVM: fix 32-bit compilation
  KVM: SVM: Add AP_JUMP_TABLE support in prep for AP booting
  KVM: SVM: Provide support to launch and run an SEV-ES guest
  KVM: SVM: Provide an updated VMRUN invocation for SEV-ES guests
  KVM: SVM: Provide support for SEV-ES vCPU loading
  KVM: SVM: Provide support for SEV-ES vCPU creation/loading
  KVM: SVM: Update ASID allocation to support SEV-ES guests
  KVM: SVM: Set the encryption mask for the SVM host save area
  KVM: SVM: Add NMI support for an SEV-ES guest
  KVM: SVM: Guest FPU state save/restore not needed for SEV-ES guest
  KVM: SVM: Do not report support for SMM for an SEV-ES guest
  KVM: x86: Update __get_sregs() / __set_sregs() to support SEV-ES
  KVM: SVM: Add support for CR8 write traps for an SEV-ES guest
  KVM: SVM: Add support for CR4 write traps for an SEV-ES guest
  KVM: SVM: Add support for CR0 write traps for an SEV-ES guest
  KVM: SVM: Add support for EFER write traps for an SEV-ES guest
  KVM: SVM: Support string IO operations for an SEV-ES guest
  KVM: SVM: Support MMIO for an SEV-ES guest
  KVM: SVM: Create trace events for VMGEXIT MSR protocol processing
  KVM: SVM: Create trace events for VMGEXIT processing
  ...
2020-12-20 10:44:05 -08:00
Christian Borntraeger
50a05be484 KVM: s390: track synchronous pfault events in kvm_stat
Right now we do count pfault (pseudo page faults aka async page faults
start and completion events). What we do not count is, if an async page
fault would have been possible by the host, but it was disabled by the
guest (e.g. interrupts off, pfault disabled, secure execution....).  Let
us count those as well in the pfault_sync counter.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Link: https://lore.kernel.org/r/20201125090658.38463-1-borntraeger@de.ibm.com
2020-12-10 14:20:26 +01:00
Christian Borntraeger
c419621873 KVM: s390: Add memcg accounting to KVM allocations
Almost all kvm allocations in the s390x KVM code can be attributed to
the process that triggers the allocation (in other words, no global
allocation for other guests). This will help the memcg controller to
make the right decisions.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
2020-12-10 13:36:05 +01:00
Collin Walling
6cbf1e960f KVM: s390: remove diag318 reset code
The diag318 data must be set to 0 by VM-wide reset events
triggered by diag308. As such, KVM should not handle
resetting this data via the VCPU ioctls.

Fixes: 23a60f8344 ("s390/kvm: diagnose 0x318 sync and reset")
Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20201104181032.109800-1-walling@linux.ibm.com
2020-11-11 09:31:52 +01:00
Janosch Frank
1ed576a20c KVM: s390: pv: Mark mm as protected after the set secure parameters and improve cleanup
We can only have protected guest pages after a successful set secure
parameters call as only then the UV allows imports and unpacks.

By moving the test we can now also check for it in s390_reset_acc()
and do an early return if it is 0.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: 29b40f105e ("KVM: s390: protvirt: Add initial vm and cpu lifecycle handling")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-11-11 09:31:48 +01:00
Peter Xu
64019a2e46 mm/gup: remove task_struct pointer for all gup code
After the cleanup of page fault accounting, gup does not need to pass
task_struct around any more.  Remove that parameter in the whole gup
stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: John Hubbard <jhubbard@nvidia.com>
Link: http://lkml.kernel.org/r/20200707225021.200906-26-peterx@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-08-12 10:58:04 -07:00
Paolo Bonzini
f3633c2683 KVM: s390: Enhancement for 5.9
- implement diagnose 318
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJfIpNzAAoJEBF7vIC1phx8l0cP/AvZ6oT5dlAGeBhtPeM/3rqp
 g7RCukN445LQfxWeWXuzckAYE4AAAtFqMS6PujfKBc+Lf7t+d6Iuod7wFlJTDImP
 wIGcCV1pTSpIHaFiSM1rpqRjnzFGeWrqWg6gBSjm0aSMqB8KAjv+PdyQ1rcfyiIj
 r+sD+Vt9DNGop12TY2YxUlXaxzPccGMAniDXesFgKb9IoTdMLdEt45Evkx9D6UAx
 eetWMwZTwqB8iWJx6xU41LxDA4ERlS+8TsE+SC0r8n6yCmhQ98hgb4i2O1gx9JIl
 K5TqpXMWVBKFyeSbJBw9bXtWa5F/gXDuD6zrzRiMjZR4Og6TXqL2NoXgr9LHN/g7
 WpBlF/eDr7TNxF1VutvSiLvV5XI/t8yjbwSvAt2+QtIIrJK+fPAdTRSH1Q8TRUMj
 cIRdCw2H10neseAPhbdn9nSJhuQ5E/hGrMzubiYQeTXsA3TLfLWniuejfRufMOXB
 kgepl+8H60D8o1l459+81NBV6rM5RdRRzWkWIIYD2/+yWRtclb1K2CF2HrN51saC
 3SQI90Rr7Vx4yjS0p84/aasAAy7WxfumnoLwBsRwIE0X9R4e4plC12igwsmPK8oM
 V/SO4w+LAJnW1bQpXuqRGMPI29gpGDHVEcfOtuerHE1pZya6VRIWTEkSsdXt1eZI
 trxY3c6Xruor8DQSDsjv
 =hr9t
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.9-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-next-5.6

KVM: s390: Enhancement for 5.9
- implement diagnose 318
2020-08-03 14:19:13 -04:00
Tianjia Zhang
2f0a83bece KVM: s390: clean up redundant 'kvm_run' parameters
In the current kvm version, 'kvm_run' has been included in the 'kvm_vcpu'
structure. For historical reasons, many kvm-related function parameters
retain the 'kvm_run' and 'kvm_vcpu' parameters at the same time. This
patch does a unified cleanup of these remaining redundant parameters.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200623131418.31473-2-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-10 04:26:39 -04:00
Vitaly Kuznetsov
e8c22266e6 KVM: async_pf: change kvm_setup_async_pf()/kvm_arch_setup_async_pf() return type to bool
Unlike normal 'int' functions returning '0' on success, kvm_setup_async_pf()/
kvm_arch_setup_async_pf() return '1' when a job to handle page fault
asynchronously was scheduled and '0' otherwise. To avoid the confusion
change return type to 'bool'.

No functional change intended.

Suggested-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200615121334.91300-1-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-07-08 16:21:36 -04:00
Collin Walling
23a60f8344 s390/kvm: diagnose 0x318 sync and reset
DIAGNOSE 0x318 (diag318) sets information regarding the environment
the VM is running in (Linux, z/VM, etc) and is observed via
firmware/service events.

This is a privileged s390x instruction that must be intercepted by
SIE. Userspace handles the instruction as well as migration. Data
is communicated via VCPU register synchronization.

The Control Program Name Code (CPNC) is stored in the SIE block. The
CPNC along with the Control Program Version Code (CPVC) are stored
in the kvm_vcpu_arch struct.

This data is reset on load normal and clear resets.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/20200622154636.5499-3-walling@linux.ibm.com
[borntraeger@de.ibm.com: fix sync_reg position]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-06-23 10:55:33 +02:00
Linus Torvalds
52cd0d972f MIPS:
- Loongson port
 
 PPC:
 - Fixes
 
 ARM:
 - Fixes
 
 x86:
 - KVM_SET_USER_MEMORY_REGION optimizations
 - Fixes
 - Selftest fixes
 
 The guest side of the asynchronous page fault work has been delayed to 5.9
 in order to sync with Thomas's interrupt entry rework.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl7icj4UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroPHGQgAj9+5j+f5v06iMP/+ponWwsVfh+5/
 UR1gPbpMSFMKF0U+BCFxsBeGKWPDiz9QXaLfy6UGfOFYBI475Su5SoZ8/i/o6a2V
 QjcKIJxBRNs66IG/774pIpONY8/mm/3b6vxmQktyBTqjb6XMGlOwoGZixj/RTp85
 +uwSICxMlrijg+fhFMwC4Bo/8SFg+FeBVbwR07my88JaLj+3cV/NPolG900qLSa6
 uPqJ289EQ86LrHIHXCEWRKYvwy77GFsmBYjKZH8yXpdzUlSGNexV8eIMAz50figu
 wYRJGmHrRqwuzFwEGknv8SA3s2HVggXO4WVkWWCeJyO8nIVfYFUhME5l6Q==
 =+Hh0
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull more KVM updates from Paolo Bonzini:
 "The guest side of the asynchronous page fault work has been delayed to
  5.9 in order to sync with Thomas's interrupt entry rework, but here's
  the rest of the KVM updates for this merge window.

  MIPS:
   - Loongson port

  PPC:
   - Fixes

  ARM:
   - Fixes

  x86:
   - KVM_SET_USER_MEMORY_REGION optimizations
   - Fixes
   - Selftest fixes"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (62 commits)
  KVM: x86: do not pass poisoned hva to __kvm_set_memory_region
  KVM: selftests: fix sync_with_host() in smm_test
  KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected
  KVM: async_pf: Cleanup kvm_setup_async_pf()
  kvm: i8254: remove redundant assignment to pointer s
  KVM: x86: respect singlestep when emulating instruction
  KVM: selftests: Don't probe KVM_CAP_HYPERV_ENLIGHTENED_VMCS when nested VMX is unsupported
  KVM: selftests: do not substitute SVM/VMX check with KVM_CAP_NESTED_STATE check
  KVM: nVMX: Consult only the "basic" exit reason when routing nested exit
  KVM: arm64: Move hyp_symbol_addr() to kvm_asm.h
  KVM: arm64: Synchronize sysreg state on injecting an AArch32 exception
  KVM: arm64: Make vcpu_cp1x() work on Big Endian hosts
  KVM: arm64: Remove host_cpu_context member from vcpu structure
  KVM: arm64: Stop sparse from moaning at __hyp_this_cpu_ptr
  KVM: arm64: Handle PtrAuth traps early
  KVM: x86: Unexport x86_fpu_cache and make it static
  KVM: selftests: Ignore KVM 5-level paging support for VM_MODE_PXXV48_4K
  KVM: arm64: Save the host's PtrAuth keys in non-preemptible context
  KVM: arm64: Stop save/restoring ACTLR_EL1
  KVM: arm64: Add emulation for 32bit guests accessing ACTLR2
  ...
2020-06-12 11:05:52 -07:00
Vitaly Kuznetsov
2a18b7e7cd KVM: async_pf: Inject 'page ready' event only if 'page not present' was previously injected
'Page not present' event may or may not get injected depending on
guest's state. If the event wasn't injected, there is no need to
inject the corresponding 'page ready' event as the guest may get
confused. E.g. Linux thinks that the corresponding 'page not present'
event wasn't delivered *yet* and allocates a 'dummy entry' for it.
This entry is never freed.

Note, 'wakeup all' events have no corresponding 'page not present'
event and always get injected.

s390 seems to always be able to inject 'page not present', the
change is effectively a nop.

Suggested-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200610175532.779793-2-vkuznets@redhat.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208081
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-11 12:35:19 -04:00
Michel Lespinasse
d8ed45c5dc mmap locking API: use coccinelle to convert mmap_sem rwsem call sites
This change converts the existing mmap_sem rwsem calls to use the new mmap
locking API instead.

The change is generated using coccinelle with the following rule:

// spatch --sp-file mmap_lock_api.cocci --in-place --include-headers --dir .

@@
expression mm;
@@
(
-init_rwsem
+mmap_init_lock
|
-down_write
+mmap_write_lock
|
-down_write_killable
+mmap_write_lock_killable
|
-down_write_trylock
+mmap_write_trylock
|
-up_write
+mmap_write_unlock
|
-downgrade_write
+mmap_write_downgrade
|
-down_read
+mmap_read_lock
|
-down_read_killable
+mmap_read_lock_killable
|
-down_read_trylock
+mmap_read_trylock
|
-up_read
+mmap_read_unlock
)
-(&mm->mmap_sem)
+(mm)

Signed-off-by: Michel Lespinasse <walken@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Daniel Jordan <daniel.m.jordan@oracle.com>
Reviewed-by: Laurent Dufour <ldufour@linux.ibm.com>
Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Davidlohr Bueso <dbueso@suse.de>
Cc: David Rientjes <rientjes@google.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Jerome Glisse <jglisse@redhat.com>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam Howlett <Liam.Howlett@oracle.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ying Han <yinghan@google.com>
Link: http://lkml.kernel.org/r/20200520052908.204642-5-walken@google.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:14 -07:00
Mike Rapoport
65fddcfca8 mm: reorder includes after introduction of linux/pgtable.h
The replacement of <asm/pgrable.h> with <linux/pgtable.h> made the include
of the latter in the middle of asm includes.  Fix this up with the aid of
the below script and manual adjustments here and there.

	import sys
	import re

	if len(sys.argv) is not 3:
	    print "USAGE: %s <file> <header>" % (sys.argv[0])
	    sys.exit(1)

	hdr_to_move="#include <linux/%s>" % sys.argv[2]
	moved = False
	in_hdrs = False

	with open(sys.argv[1], "r") as f:
	    lines = f.readlines()
	    for _line in lines:
		line = _line.rstrip('
')
		if line == hdr_to_move:
		    continue
		if line.startswith("#include <linux/"):
		    in_hdrs = True
		elif not moved and in_hdrs:
		    moved = True
		    print hdr_to_move
		print line

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-4-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Mike Rapoport
ca5999fde0 mm: introduce include/linux/pgtable.h
The include/linux/pgtable.h is going to be the home of generic page table
manipulation functions.

Start with moving asm-generic/pgtable.h to include/linux/pgtable.h and
make the latter include asm/pgtable.h.

Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Brian Cain <bcain@codeaurora.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Greg Ungerer <gerg@linux-m68k.org>
Cc: Guan Xuetao <gxt@pku.edu.cn>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ley Foon Tan <ley.foon.tan@intel.com>
Cc: Mark Salter <msalter@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Nick Hu <nickhu@andestech.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Richard Weinberger <richard@nod.at>
Cc: Rich Felker <dalias@libc.org>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Vineet Gupta <vgupta@synopsys.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Link: http://lkml.kernel.org/r/20200514170327.31389-3-rppt@kernel.org
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-06-09 09:39:13 -07:00
Linus Torvalds
23fc02e36e s390 updates for the 5.8 merge window
- Add support for multi-function devices in pci code.
 
 - Enable PF-VF linking for architectures using the
   pdev->no_vf_scan flag (currently just s390).
 
 - Add reipl from NVMe support.
 
 - Get rid of critical section cleanup in entry.S.
 
 - Refactor PNSO CHSC (perform network subchannel operation) in cio
   and qeth.
 
 - QDIO interrupts and error handling fixes and improvements, more
   refactoring changes.
 
 - Align ioremap() with generic code.
 
 - Accept requests without the prefetch bit set in vfio-ccw.
 
 - Enable path handling via two new regions in vfio-ccw.
 
 - Other small fixes and improvements all over the code.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEE3QHqV+H2a8xAv27vjYWKoQLXFBgFAl7eVGcACgkQjYWKoQLX
 FBhweQgAkicvx31x230rdfG+jQkQkl0UqF99vvWrJHEll77SqadfjzKAGIjUB+K0
 EoeHVD5Wcj7BogDGcyHeQ0bZpu4WzE+y1nmnrsvu7TEEvcBmkJH0rF2jF+y0sb/O
 3qvwFkX/CB5OqaMzKC/AEeRpcCKR+ZUXkWu1irbYth7CBXaycD9EAPc4cj8CfYGZ
 r5njUdYOVk77TaO4aV+t5pCYc5TCRJaWXSsWaAv/nuLcIqsFBYOy2q+L47zITGXp
 utZVanIDjzx+ikpaKicOIfC3hJsRuNX9MnlZKsQFwpVEZAUZmIUm29XdhGJTWSxU
 RV7m1ORINbFP1nGAqWqkOvGo/LC0ZA==
 =VhXR
 -----END PGP SIGNATURE-----

Merge tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Vasily Gorbik:

 - Add support for multi-function devices in pci code.

 - Enable PF-VF linking for architectures using the pdev->no_vf_scan
   flag (currently just s390).

 - Add reipl from NVMe support.

 - Get rid of critical section cleanup in entry.S.

 - Refactor PNSO CHSC (perform network subchannel operation) in cio and
   qeth.

 - QDIO interrupts and error handling fixes and improvements, more
   refactoring changes.

 - Align ioremap() with generic code.

 - Accept requests without the prefetch bit set in vfio-ccw.

 - Enable path handling via two new regions in vfio-ccw.

 - Other small fixes and improvements all over the code.

* tag 's390-5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (52 commits)
  vfio-ccw: make vfio_ccw_regops variables declarations static
  vfio-ccw: Add trace for CRW event
  vfio-ccw: Wire up the CRW irq and CRW region
  vfio-ccw: Introduce a new CRW region
  vfio-ccw: Refactor IRQ handlers
  vfio-ccw: Introduce a new schib region
  vfio-ccw: Refactor the unregister of the async regions
  vfio-ccw: Register a chp_event callback for vfio-ccw
  vfio-ccw: Introduce new helper functions to free/destroy regions
  vfio-ccw: document possible errors
  vfio-ccw: Enable transparent CCW IPL from DASD
  s390/pci: Log new handle in clp_disable_fh()
  s390/cio, s390/qeth: cleanup PNSO CHSC
  s390/qdio: remove q->first_to_kick
  s390/qdio: fix up qdio_start_irq() kerneldoc
  s390: remove critical section cleanup from entry.S
  s390: add machine check SIGP
  s390/pci: ioremap() align with generic code
  s390/ap: introduce new ap function ap_get_qdev()
  Documentation/s390: Update / remove developerWorks web links
  ...
2020-06-08 12:05:31 -07:00
Vitaly Kuznetsov
7c0ade6c90 KVM: rename kvm_arch_can_inject_async_page_present() to kvm_arch_can_dequeue_async_page_present()
An innocent reader of the following x86 KVM code:

bool kvm_arch_can_inject_async_page_present(struct kvm_vcpu *vcpu)
{
        if (!(vcpu->arch.apf.msr_val & KVM_ASYNC_PF_ENABLED))
                return true;
...

may get very confused: if APF mechanism is not enabled, why do we report
that we 'can inject async page present'? In reality, upon injection
kvm_arch_async_page_present() will check the same condition again and,
in case APF is disabled, will just drop the item. This is fine as the
guest which deliberately disabled APF doesn't expect to get any APF
notifications.

Rename kvm_arch_can_inject_async_page_present() to
kvm_arch_can_dequeue_async_page_present() to make it clear what we are
checking: if the item can be dequeued (meaning either injected or just
dropped).

On s390 kvm_arch_can_inject_async_page_present() always returns 'true' so
the rename doesn't matter much.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Message-Id: <20200525144125.143875-4-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-06-01 04:26:07 -04:00
Sven Schnelle
0b0ed657fe s390: remove critical section cleanup from entry.S
The current code is rather complex and caused a lot of subtle
and hard to debug bugs in the past. Simplify the code by calling
the system_call handler with interrupts disabled, save
machine state, and re-enable them later.

This requires significant changes to the machine check handling code
as well. When the machine check interrupt arrived while being in kernel
mode the new code will signal pending machine checks with a SIGP external
call. When userspace was interrupted, the handler will switch to the
kernel stack and directly execute s390_handle_mcck().

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2020-05-28 12:21:54 +02:00
David Matlack
cb953129bf kvm: add halt-polling cpu usage stats
Two new stats for exposing halt-polling cpu usage:
halt_poll_success_ns
halt_poll_fail_ns

Thus sum of these 2 stats is the total cpu time spent polling. "success"
means the VCPU polled until a virtual interrupt was delivered. "fail"
means the VCPU had to schedule out (either because the maximum poll time
was reached or it needed to yield the CPU).

To avoid touching every arch's kvm_vcpu_stat struct, only update and
export halt-polling cpu usage stats if we're on x86.

Exporting cpu usage as a u64 and in nanoseconds means we will overflow at
~500 years, which seems reasonably large.

Signed-off-by: David Matlack <dmatlack@google.com>
Signed-off-by: Jon Cargille <jcargill@google.com>
Reviewed-by: Jim Mattson <jmattson@google.com>

Message-Id: <20200508182240.68440-1-jcargill@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-15 12:26:26 -04:00
Paolo Bonzini
4aef2ec902 Merge branch 'kvm-amd-fixes' into HEAD 2020-05-13 12:14:05 -04:00
Peter Xu
b9b2782cd5 KVM: X86: Declare KVM_CAP_SET_GUEST_DEBUG properly
KVM_CAP_SET_GUEST_DEBUG should be supported for x86 however it's not declared
as supported.  My wild guess is that userspaces like QEMU are using "#ifdef
KVM_CAP_SET_GUEST_DEBUG" to check for the capability instead, but that could be
wrong because the compilation host may not be the runtime host.

The userspace might still want to keep the old "#ifdef" though to not break the
guest debug on old kernels.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20200505154750.126300-1-peterx@redhat.com>
[Do the same for PPC and s390. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-05-07 06:13:40 -04:00
Tianjia Zhang
1b94f6f810 KVM: Remove redundant argument to kvm_arch_vcpu_ioctl_run
In earlier versions of kvm, 'kvm_run' was an independent structure
and was not included in the vcpu structure. At present, 'kvm_run'
is already included in the vcpu structure, so the parameter
'kvm_run' is redundant.

This patch simplifies the function definition, removes the extra
'kvm_run' parameter, and extracts it from the 'kvm_vcpu' structure
if necessary.

Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
Message-Id: <20200416051057.26526-1-tianjia.zhang@linux.alibaba.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21 09:13:11 -04:00
Emanuele Giuseppe Esposito
812756a82e kvm_host: unify VM_STAT and VCPU_STAT definitions in a single place
The macros VM_STAT and VCPU_STAT are redundantly implemented in multiple
files, each used by a different architecure to initialize the debugfs
entries for statistics. Since they all have the same purpose, they can be
unified in a single common definition in include/linux/kvm_host.h

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
Message-Id: <20200414155625.20559-1-eesposit@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-21 09:13:01 -04:00
Sean Christopherson
97daa028f3 KVM: s390: Return last valid slot if approx index is out-of-bounds
Return the index of the last valid slot from gfn_to_memslot_approx() if
its binary search loop yielded an out-of-bounds index.  The index can
be out-of-bounds if the specified gfn is less than the base of the
lowest memslot (which is also the last valid memslot).

Note, the sole caller, kvm_s390_get_cmma(), ensures used_slots is
non-zero.

Fixes: afdad61615 ("KVM: s390: Fix storage attributes migration with memory slots")
Cc: stable@vger.kernel.org # 4.19.x: 0774a964ef: KVM: Fix out of range accesses to memslots
Cc: stable@vger.kernel.org # 4.19.x
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200408064059.8957-3-sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-04-14 10:39:57 -04:00
Linus Torvalds
8c1b724ddb ARM:
* GICv4.1 support
 * 32bit host removal
 
 PPC:
 * secure (encrypted) using under the Protected Execution Framework
 ultravisor
 
 s390:
 * allow disabling GISA (hardware interrupt injection) and protected
 VMs/ultravisor support.
 
 x86:
 * New dirty bitmap flag that sets all bits in the bitmap when dirty
 page logging is enabled; this is faster because it doesn't require bulk
 modification of the page tables.
 * Initial work on making nested SVM event injection more similar to VMX,
 and less buggy.
 * Various cleanups to MMU code (though the big ones and related
 optimizations were delayed to 5.8).  Instead of using cr3 in function
 names which occasionally means eptp, KVM too has standardized on "pgd".
 * A large refactoring of CPUID features, which now use an array that
 parallels the core x86_features.
 * Some removal of pointer chasing from kvm_x86_ops, which will also be
 switched to static calls as soon as they are available.
 * New Tigerlake CPUID features.
 * More bugfixes, optimizations and cleanups.
 
 Generic:
 * selftests: cleanups, new MMU notifier stress test, steal-time test
 * CSV output for kvm_stat.
 
 KVM/MIPS has been broken since 5.5, it does not compile due to a patch committed
 by MIPS maintainers.  I had already prepared a fix, but the MIPS maintainers
 prefer to fix it in generic code rather than KVM so they are taking care of it.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAl6GOnIUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroMfxwf/ZKLZiRoaovXCOG71M/eHtQb8ZIqU
 3MPy+On3eC5Sk/aBxWUL9EFZsbYG6kYdbZ1VOvG9XPBoLlnkDSm/IR0kaELHtnjj
 oGVda/tvGn46Ne39y8xBptmb91WDcWH0vFthT/CwlMxAw3xjr+gG7Qyo+8F2CW6m
 SSSuLiHSBnyO1cQKruBTHZ8qnR8LlnfXEqtd6Y4LFLic0LbLIoIdRcT3wjQrcZrm
 Djd7wbTEYZjUfoqZ72ekwEDUsONcDLDSKcguDO9pSMSCGhpxCVT5Vy68KRpoIMs2
 nzNWDKjvqQo5zb2+GWxJgkd12Hv+n7PCXZMbVrWBu1pQsewUns9m4mkpGw==
 =6fGt
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:
   - GICv4.1 support

   - 32bit host removal

  PPC:
   - secure (encrypted) using under the Protected Execution Framework
     ultravisor

  s390:
   - allow disabling GISA (hardware interrupt injection) and protected
     VMs/ultravisor support.

  x86:
   - New dirty bitmap flag that sets all bits in the bitmap when dirty
     page logging is enabled; this is faster because it doesn't require
     bulk modification of the page tables.

   - Initial work on making nested SVM event injection more similar to
     VMX, and less buggy.

   - Various cleanups to MMU code (though the big ones and related
     optimizations were delayed to 5.8). Instead of using cr3 in
     function names which occasionally means eptp, KVM too has
     standardized on "pgd".

   - A large refactoring of CPUID features, which now use an array that
     parallels the core x86_features.

   - Some removal of pointer chasing from kvm_x86_ops, which will also
     be switched to static calls as soon as they are available.

   - New Tigerlake CPUID features.

   - More bugfixes, optimizations and cleanups.

  Generic:
   - selftests: cleanups, new MMU notifier stress test, steal-time test

   - CSV output for kvm_stat"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (277 commits)
  x86/kvm: fix a missing-prototypes "vmread_error"
  KVM: x86: Fix BUILD_BUG() in __cpuid_entry_get_reg() w/ CONFIG_UBSAN=y
  KVM: VMX: Add a trampoline to fix VMREAD error handling
  KVM: SVM: Annotate svm_x86_ops as __initdata
  KVM: VMX: Annotate vmx_x86_ops as __initdata
  KVM: x86: Drop __exit from kvm_x86_ops' hardware_unsetup()
  KVM: x86: Copy kvm_x86_ops by value to eliminate layer of indirection
  KVM: x86: Set kvm_x86_ops only after ->hardware_setup() completes
  KVM: VMX: Configure runtime hooks using vmx_x86_ops
  KVM: VMX: Move hardware_setup() definition below vmx_x86_ops
  KVM: x86: Move init-only kvm_x86_ops to separate struct
  KVM: Pass kvm_init()'s opaque param to additional arch funcs
  s390/gmap: return proper error code on ksm unsharing
  KVM: selftests: Fix cosmetic copy-paste error in vm_mem_region_move()
  KVM: Fix out of range accesses to memslots
  KVM: X86: Micro-optimize IPI fastpath delay
  KVM: X86: Delay read msr data iff writes ICR MSR
  KVM: PPC: Book3S HV: Add a capability for enabling secure guests
  KVM: arm64: GICv4.1: Expose HW-based SGIs in debugfs
  KVM: arm64: GICv4.1: Allow non-trapping WFI when using HW SGIs
  ...
2020-04-02 15:13:15 -07:00
Sean Christopherson
b990408537 KVM: Pass kvm_init()'s opaque param to additional arch funcs
Pass @opaque to kvm_arch_hardware_setup() and
kvm_arch_check_processor_compat() to allow architecture specific code to
reference @opaque without having to stash it away in a temporary global
variable.  This will enable x86 to separate its vendor specific callback
ops, which are passed via @opaque, into "init" and "runtime" ops without
having to stash away the "init" ops.

No functional change intended.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com> #s390
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200321202603.19355-2-sean.j.christopherson@intel.com>
Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-31 10:48:03 -04:00
Paolo Bonzini
8bf8961332 KVM: s390: cleanups for 5.7
- mark sie control block as 512 byte aligned
 - use fallthrough;
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJeegAeAAoJEBF7vIC1phx8PgMP/jcN57GRXU0yML+r8izA1JBZ
 0hnV9Mesec5sUDyDLTLuAJBS7tDvfLKi8A6QLdyTuPkUihPpWp0VD/QW4tJe1WDD
 MOPA56xH3yN7ll/q+IX7QurwY4jDCReV8DIsHjrOvDYKSBHxZSsgrR/G5ivtqzRJ
 VWl1qpB3/FgPtsdTW4x1nciaAYulWp7K75F2zWFpoRe0EZUXY8IyLJMZjScMVRnk
 DbdA8+jsFqeYCJC1JSnE6TShP2RvDNu5NjE2pInyWbXAA1PwcFM+bML+ZZ4kJlii
 9+cDqctXNC4MdxOJVMKEOEdoQ1M3yYzKn/J7an7Zly7kY54G8uJkQGCaBmlg1K0k
 r57WP0DpFu5kuNFFJg52bBcAOUPEgFiyIitICG+nMn9BjLA3zY7bjUSCGHviIzLm
 pNPL+t7KVWyyn5t8X25CuFUkCQskDSALpC3SFdL4iPo+fpBBw/GPqngjmx4zdrHg
 jkOImPWt2n2KK9I4kfxw/NIx8Q06wHHzIpY2uutHA++TNF8WMgXR1RvJgZvOAO0G
 JLGaOUZNV5BcTsK38ftEMh9awqBG9J4l8DPwPoRG/9r394W25O1KdjiNnTVH9qYU
 INUArhtfeeAYoSteDCflD7cgORbGQLOXBePtCO8+Ayt2lTJLW98CDfCrCAKawQLl
 O+RMxuDshobeH/AGT7yr
 =nsOM
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.7-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: cleanups for 5.7

- mark sie control block as 512 byte aligned
- use fallthrough;
2020-03-26 05:58:49 -04:00
Sean Christopherson
0774a964ef KVM: Fix out of range accesses to memslots
Reset the LRU slot if it becomes invalid when deleting a memslot to fix
an out-of-bounds/use-after-free access when searching through memslots.

Explicitly check for there being no used slots in search_memslots(), and
in the caller of s390's approximation variant.

Fixes: 36947254e5 ("KVM: Dynamically size memslot array based on number of used slots")
Reported-by: Qian Cai <cai@lca.pw>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Message-Id: <20200320205546.2396-2-sean.j.christopherson@intel.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-26 05:58:27 -04:00
Joe Perches
3b684a420b KVM: s390: Use fallthrough;
Convert the various uses of fallthrough comments to fallthrough;

Done via script
Link: https://lore.kernel.org/lkml/b56602fcf79f849e733e7b521bb0e17895d390fa.1582230379.git.joe@perches.com

Signed-off-by: Joe Perches <joe@perches.com>
Link: https://lore.kernel.org/r/d63c86429f3e5aa806aa3e185c97d213904924a5.1583896348.git.joe@perches.com
[borntrager@de.ibm.com: Fix link to tool and subject]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-03-23 18:30:07 +01:00
Paolo Bonzini
1c482452d5 KVM: s390: Features and Enhancements for 5.7 part1
1. Allow to disable gisa
 2. protected virtual machines
   Protected VMs (PVM) are KVM VMs, where KVM can't access the VM's
   state like guest memory and guest registers anymore. Instead the
   PVMs are mostly managed by a new entity called Ultravisor (UV),
   which provides an API, so KVM and the PV can request management
   actions.
 
   PVMs are encrypted at rest and protected from hypervisor access
   while running.  They switch from a normal operation into protected
   mode, so we can still use the standard boot process to load a
   encrypted blob and then move it into protected mode.
 
   Rebooting is only possible by passing through the unprotected/normal
   mode and switching to protected again.
 
   One mm related patch will go via Andrews mm tree ( mm/gup/writeback:
   add callbacks for inaccessible pages)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJeZf9tAAoJEBF7vIC1phx89J0P/iv3wCoMNDqAttnHa/UQFF04
 njUadNYkAADDrsabIEOs9O+BE1/4BVspnIunE4+xw76p5M/7/g5eIhXWcLudhlnL
 +XtvuEwz/2ffA9JWAAYNKB7cGqBM9BCC+iYzAF9ah6sPLmlDCoF+hRe0g+0tXSON
 cklUJFril9bOcxd/MxrzFLcmipbxT/Z4/10eBY+FHcm6SQGOKAtJH0xL7X3PfPI5
 L/6ZhML9exsj1Iplkrl8BomMRoYOrvfq/jMaZp9SwmfXaOKYmNU3a19MhzfZ593h
 bfR92H8kZRy/TpBd7EnpxYGQ/n53HkUhFMhtqkkkeHW1rCo8ccwC4VfnXb+KqQp+
 nJ8KieWG+OlKKFDuZPl5Gq+jQqjJfzchbyMTYnBNe+GPT5zg76tJXmQyDn5X9p3R
 mfg+9ZEeEonMu7px93Ht1gLdPiC2gjRckjuBDPqMGEhG2z2SQ/MLri+WnproIQRa
 TcE7rZBtuyrGFTq4M4dEcsUW02xnOaav6H57kkl8EwqYwgDHlqoUbt0AvLFyW07a
 RlH7drmhKDwTJkcOhOLeLNM8Un6NvnsLZ8Lbcr9rRf9Z9Lpc+zW88BSwJ7MM/GH8
 FEQM8Omnn8KAJTENpIm3bHHyvsi0kJEhl+c3Ila3QnYzXZbJ3ZDaJZngMAbUUnVl
 YNeFyyALzOgVVBx4kvTm
 =x6Hn
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.7-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Features and Enhancements for 5.7 part1

1. Allow to disable gisa
2. protected virtual machines
  Protected VMs (PVM) are KVM VMs, where KVM can't access the VM's
  state like guest memory and guest registers anymore. Instead the
  PVMs are mostly managed by a new entity called Ultravisor (UV),
  which provides an API, so KVM and the PV can request management
  actions.

  PVMs are encrypted at rest and protected from hypervisor access
  while running.  They switch from a normal operation into protected
  mode, so we can still use the standard boot process to load a
  encrypted blob and then move it into protected mode.

  Rebooting is only possible by passing through the unprotected/normal
  mode and switching to protected again.

  One mm related patch will go via Andrews mm tree ( mm/gup/writeback:
  add callbacks for inaccessible pages)
2020-03-16 18:19:34 +01:00
Sean Christopherson
2a49f61dfc KVM: Ensure validity of memslot with respect to kvm_get_dirty_log()
Rework kvm_get_dirty_log() so that it "returns" the associated memslot
on success.  A future patch will rework memslot handling such that
id_to_memslot() can return NULL, returning the memslot makes it more
obvious that the validity of the memslot has been verified, i.e.
precludes the need to add validity checks in the arch code that are
technically unnecessary.

To maintain ordering in s390, move the call to kvm_arch_sync_dirty_log()
from s390's kvm_vm_ioctl_get_dirty_log() to the new kvm_get_dirty_log().
This is a nop for PPC, the only other arch that doesn't select
KVM_GENERIC_DIRTYLOG_READ_PROTECT, as its sync_dirty_log() is empty.

Ideally, moving the sync_dirty_log() call would be done in a separate
patch, but it can't be done in a follow-on patch because that would
temporarily break s390's ordering.  Making the move in a preparatory
patch would be functionally correct, but would create an odd scenario
where the moved sync_dirty_log() would operate on a "different" memslot
due to consuming the result of a different id_to_memslot().  The
memslot couldn't actually be different as slots_lock is held, but the
code is confusing enough as it is, i.e. moving sync_dirty_log() in this
patch is the lesser of all evils.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:25 +01:00
Sean Christopherson
0dff084607 KVM: Provide common implementation for generic dirty log functions
Move the implementations of KVM_GET_DIRTY_LOG and KVM_CLEAR_DIRTY_LOG
for CONFIG_KVM_GENERIC_DIRTYLOG_READ_PROTECT into common KVM code.
The arch specific implemenations are extremely similar, differing
only in whether the dirty log needs to be sync'd from hardware (x86)
and how the TLBs are flushed.  Add new arch hooks to handle sync
and TLB flush; the sync will also be used for non-generic dirty log
support in a future patch (s390).

The ulterior motive for providing a common implementation is to
eliminate the dependency between arch and common code with respect to
the memslot referenced by the dirty log, i.e. to make it obvious in the
code that the validity of the memslot is guaranteed, as a future patch
will rework memslot handling such that id_to_memslot() can return NULL.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:24 +01:00
Sean Christopherson
9d4c197c0e KVM: Drop "const" attribute from old memslot in commit_memory_region()
Drop the "const" attribute from @old in kvm_arch_commit_memory_region()
to allow arch specific code to free arch specific resources in the old
memslot without having to cast away the attribute.  Freeing resources in
kvm_arch_commit_memory_region() paves the way for simplifying
kvm_free_memslot() by eliminating the last usage of its @dont param.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:20 +01:00
Sean Christopherson
414de7abbf KVM: Drop kvm_arch_create_memslot()
Remove kvm_arch_create_memslot() now that all arch implementations are
effectively nops.  Removing kvm_arch_create_memslot() eliminates the
possibility for arch specific code to allocate memory prior to setting
a memslot, which sets the stage for simplifying kvm_free_memslot().

Cc: Janosch Frank <frankja@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-03-16 17:57:17 +01:00
Christian Borntraeger
e93fc7b454 KVM: s390: Also reset registers in sync regs for initial cpu reset
When we do the initial CPU reset we must not only clear the registers
in the internal data structures but also in kvm_run sync_regs. For
modern userspace sync_regs is the only place that it looks at.

Fixes: 7de3f1423f ("KVM: s390: Add new reset vcpu API")
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-03-11 08:25:26 +01:00
Michael Mueller
cc674ef252 KVM: s390: introduce module parameter kvm.use_gisa
The boolean module parameter "kvm.use_gisa" controls if newly
created guests will use the GISA facility if provided by the
host system. The default is yes.

  # cat /sys/module/kvm/parameters/use_gisa
  Y

The parameter can be changed on the fly.

  # echo N > /sys/module/kvm/parameters/use_gisa

Already running guests are not affected by this change.

The kvm s390 debug feature shows if a guest is running with GISA.

  # grep gisa /sys/kernel/debug/s390dbf/kvm-$pid/sprintf
  00 01582725059:843303 3 - 08 00000000e119bc01  gisa 0x00000000c9ac2642 initialized
  00 01582725059:903840 3 - 11 000000004391ee22  00[0000000000000000-0000000000000000]: AIV gisa format-1 enabled for cpu 000
  ...
  00 01582725059:916847 3 - 08 0000000094fff572  gisa 0x00000000c9ac2642 cleared

In general, that value should not be changed as the GISA facility
enhances interruption delivery performance.

A reason to switch the GISA facility off might be a performance
comparison run or debugging.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Link: https://lore.kernel.org/r/20200227091031.102993-1-mimu@linux.ibm.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:13 +01:00
Christian Borntraeger
13da9ae1cd KVM: s390: protvirt: introduce and enable KVM_CAP_S390_PROTECTED
Now that everything is in place, we can announce the feature.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2020-02-27 19:47:13 +01:00
Janosch Frank
8a8378fa61 KVM: s390: protvirt: Add UV cpu reset calls
For protected VMs, the VCPU resets are done by the Ultravisor, as KVM
has no access to the VCPU registers.

Note that the ultravisor will only accept a call for the exact reset
that has been requested.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Christian Borntraeger
72f218208f KVM: s390: protvirt: do not inject interrupts after start
As PSW restart is handled by the ultravisor (and we only get a start
notification) we must re-check the PSW after a start before injecting
interrupts.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
3adae0b4ca KVM: s390: protvirt: Mask PSW interrupt bits for interception 104 and 112
We're not allowed to inject interrupts on intercepts that leave the
guest state in an "in-between" state where the next SIE entry will do a
continuation, namely secure instruction interception (104) and secure
prefix interception (112).
As our PSW is just a copy of the real one that will be replaced on the
next exit, we can mask out the interrupt bits in the PSW to make sure
that we do not inject anything.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
7c36a3fcf4 KVM: s390: protvirt: Support cmd 5 operation state
Code 5 for the set cpu state UV call tells the UV to load a PSW from
the SE header (first IPL) or from guest location 0x0 (diag 308 subcode
0/1). Also it sets the cpu into operating state afterwards, so we can
start it.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
fe28c7868f KVM: s390: protvirt: Report CPU state to Ultravisor
VCPU states have to be reported to the ultravisor for SIGP
interpretation, kdump, kexec and reboot.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
e0d2773d48 KVM: s390: protvirt: UV calls in support of diag308 0, 1
diag 308 subcode 0 and 1 require several KVM and Ultravisor interactions.
Specific to these "soft" reboots are

* The "unshare all" UVC
* The "prepare for reset" UVC

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
811ea79711 KVM: s390: protvirt: Only sync fmt4 registers
A lot of the registers are controlled by the Ultravisor and never
visible to KVM. Also some registers are overlayed, like gbea is with
sidad, which might leak data to userspace.

Hence we sync a minimal set of registers for both SIE formats and then
check and sync format 2 registers if necessary.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
0f30350471 KVM: s390: protvirt: Do only reset registers that are accessible
For protected VMs the hypervisor can not access guest breaking event
address, program parameter, bpbc and todpr. Do not reset those fields
as the control block does not provide access to these fields.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
68cf7b1f13 KVM: s390: protvirt: disallow one_reg
A lot of the registers are controlled by the Ultravisor and never
visible to KVM. Some fields in the sie control block are overlayed, like
gbea. As no known userspace uses the ONE_REG interface on s390 if sync
regs are available, no functionality is lost if it is disabled for
protected guests.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:12 +01:00
Janosch Frank
19e1227768 KVM: S390: protvirt: Introduce instruction data area bounce buffer
Now that we can't access guest memory anymore, we have a dedicated
satellite block that's a bounce buffer for instruction data.

We re-use the memop interface to copy the instruction data to / from
userspace. This lets us re-use a lot of QEMU code which used that
interface to make logical guest memory accesses which are not possible
anymore in protected mode anyway.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:11 +01:00
Janosch Frank
c8aac2344d KVM: s390: protvirt: Add new gprs location handling
Guest registers for protected guests are stored at offset 0x380.  We
will copy those to the usual places.  Long term we could refactor this
or use register access functions.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:11 +01:00
Christian Borntraeger
0890ddea1a KVM: s390: protvirt: Add SCLP interrupt handling
The sclp interrupt is kind of special. The ultravisor polices that we
do not inject an sclp interrupt with payload if no sccb is outstanding.
On the other hand we have "asynchronous" event interrupts, e.g. for
console input.
We separate both variants into sclp interrupt and sclp event interrupt.
The sclp interrupt is masked until a previous servc instruction has
finished (sie exit 108).

[frankja@linux.ibm.com: factoring out write_sclp]
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:11 +01:00
Janosch Frank
fa0c5eabbd KVM: s390: protvirt: Secure memory is not mergeable
KSM will not work on secure pages, because when the kernel reads a
secure page, it will be encrypted and hence no two pages will look the
same.

Let's mark the guest pages as unmergeable when we transition to secure
mode.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:11 +01:00
Janosch Frank
29b40f105e KVM: s390: protvirt: Add initial vm and cpu lifecycle handling
This contains 3 main changes:
1. changes in SIE control block handling for secure guests
2. helper functions for create/destroy/unpack secure guests
3. KVM_S390_PV_COMMAND ioctl to allow userspace dealing with secure
machines

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:11 +01:00
Janosch Frank
3e6c556899 KVM: s390: protvirt: Add UV debug trace
Let's have some debug traces which stay around for longer than the
guest.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[borntraeger@de.ibm.com: patch merging, splitting, fixing]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-02-27 19:47:10 +01:00
Paolo Bonzini
ef09f4f463 KVM: s390: Fixes and cleanups for 5.6
- fix register corruption
 - ENOTSUPP/EOPNOTSUPP mixed
 - reset cleanups/fixes
 - selftests
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJeNDcAAAoJEBF7vIC1phx8NkcP/2JWMr/9v44LJJ8BfZVFqdP4
 i41pVFIgtI8Ieqjgp+Fuiu/8ELPxfohzBZ1Rm60TPcZlJ+uREmHklG1ZD2iXEJix
 0YqzICadQ4OvJxiFpi/s5+9bzczoxCIEx7CfJ4PTM2V3qtefauFgNtoSMevF9CtK
 6UuPNNjBi6cJuG3uAyqoOZ3vbMNeZ337ffEgBwukR01UxGImXwJ9odPFEwz31hji
 WKEEbnPaXFZUKy2vMSZVcndJKkhb043QFkZBY98D8m5VTSO5UFwpdYuht6QdMSKx
 IrxDN7788e/p4IPOGBWAXuhjYcmAYZh2Ayt7DM53b49XhWifsc6fw4khly2fjr3+
 Wg5Ol13ls2WaeDTGd5c4XQRWpQD27Wnum0yXLaVf2gaTRbTqrrsisWLHL6k/gqyb
 CXqJIr11/sb4zLwlwXPSrOrIz3CRz4DqawF/F0q47rHC7xyGsRzpGU4gP5Aqj8op
 qAMVORoQQjMtH4fVv6/NhIG6srVeonNA5GjI6hkYZ85mEJhy5Nl9lNuyEh4W094D
 fkNSnlWcCG8fyoLih1SHVa7cROVI8G0tfwhk4uSjRCXXtA5B5Rve2LQl3nCP9gUX
 m7Y6Qzm/yusVtaTu+YE8MyXVE2bpvGMR/xeztIR8eYw/LqbodOzxkRLdfeH2cfaD
 VCmFaVuUjTXx5q4xYmIl
 =ZgeW
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.6-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes and cleanups for 5.6
- fix register corruption
- ENOTSUPP/EOPNOTSUPP mixed
- reset cleanups/fixes
- selftests
2020-02-05 16:15:05 +01:00
Janosch Frank
7de3f1423f KVM: s390: Add new reset vcpu API
The architecture states that we need to reset local IRQs for all CPU
resets. Because the old reset interface did not support the normal CPU
reset we never did that on a normal reset.

Let's implement an interface for the missing normal and clear resets
and reset all local IRQs, registers and control structures as stated
in the architecture.

Userspace might already reset the registers via the vcpu run struct,
but as we need the interface for the interrupt clearing part anyway,
we implement the resets fully and don't rely on userspace to reset the
rest.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20200131100205.74720-4-frankja@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-01-31 12:50:04 +01:00
Janosch Frank
cca00ebb8a KVM: s390: Cleanup initial cpu reset
The code seems to be quite old and uses lots of unneeded spaces for
alignment, which doesn't really help with readability.

Let's:
* Get rid of the extra spaces
* Remove the ULs as they are not needed on 0s
* Define constants for the CR 0 and 14 initial values
* Use the sizeof of the gcr array to memset it to 0

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/r/20200131100205.74720-3-frankja@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-01-31 12:49:54 +01:00
Christian Borntraeger
55680890ea KVM: s390: do not clobber registers during guest reset/store status
The initial CPU reset clobbers the userspace fpc and the store status
ioctl clobbers the guest acrs + fpr.  As these calls are only done via
ioctl (and not via vcpu_run), no CPU context is loaded, so we can (and
must) act directly on the sync regs, not on the thread context.

Cc: stable@kernel.org
Fixes: e1788bb995 ("KVM: s390: handle floating point registers in the run ioctl not in vcpu_put/load")
Fixes: 31d8b8d41a ("KVM: s390: handle access registers in the run ioctl not in vcpu_put/load")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/r/20200131100205.74720-2-frankja@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2020-01-31 12:49:24 +01:00
Sean Christopherson
ddd259c9aa KVM: Drop kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit()
Remove kvm_arch_vcpu_init() and kvm_arch_vcpu_uninit() now that all
arch specific implementations are nops.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:33 +01:00
Sean Christopherson
afede96df5 KVM: Drop kvm_arch_vcpu_setup()
Remove kvm_arch_vcpu_setup() now that all arch specific implementations
are nops.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:28 +01:00
Sean Christopherson
ff72bb55cb KVM: s390: Manually invoke vcpu setup during kvm_arch_vcpu_create()
Rename kvm_arch_vcpu_setup() to kvm_s390_vcpu_setup() and manually call
the new function during kvm_arch_vcpu_create().  Define an empty
kvm_arch_vcpu_setup() as it's still required for compilation.  This
is effectively a nop as kvm_arch_vcpu_create() and kvm_arch_vcpu_setup()
are called back-to-back by common KVM code.  Obsoleting
kvm_arch_vcpu_setup() paves the way for its removal.

Note, gmap_remove() is now called if setup fails, as s390 was previously
freeing it via kvm_arch_vcpu_destroy(), which is called by common KVM
code if kvm_arch_vcpu_setup() fails.

No functional change intended.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:27 +01:00
Sean Christopherson
e529ef66e6 KVM: Move vcpu alloc and init invocation to common code
Now that all architectures tightly couple vcpu allocation/free with the
mandatory calls to kvm_{un}init_vcpu(), move the sequences verbatim to
common KVM code.

Move both allocation and initialization in a single patch to eliminate
thrash in arch specific code.  The bisection benefits of moving the two
pieces in separate patches is marginal at best, whereas the odds of
introducing a transient arch specific bug are non-zero.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-27 19:59:20 +01:00
Sean Christopherson
4543bdc088 KVM: Introduce kvm_vcpu_destroy()
Add kvm_vcpu_destroy() and wire up all architectures to call the common
function instead of their arch specific implementation.  The common
destruction function will be used by future patches to move allocation
and initialization of vCPUs to common KVM code, i.e. to free resources
that are allocated by arch agnostic code.

No functional change intended.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:11 +01:00
Sean Christopherson
a2017f17fa KVM: s390: Invoke kvm_vcpu_init() before allocating sie_page
Now that s390's implementation of kvm_arch_vcpu_init() is empty, move
the call to kvm_vcpu_init() above the allocation of the sie_page.  This
paves the way for moving vcpu allocation and initialization into common
KVM code without any associated functional change.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:08 +01:00
Sean Christopherson
321f8ee559 KVM: s390: Move guts of kvm_arch_vcpu_init() into kvm_arch_vcpu_create()
Move all of kvm_arch_vcpu_init(), which is invoked at the very end of
kvm_vcpu_init(), into kvm_arch_vcpu_create() in preparation of moving
the call to kvm_vcpu_init().  Moving kvm_vcpu_init() is itself a
preparatory step for moving allocation and initialization to common KVM
code.

No functional change inteded.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:08 +01:00
Sean Christopherson
897cc38eaa KVM: Add kvm_arch_vcpu_precreate() to handle pre-allocation issues
Add a pre-allocation arch hook to handle checks that are currently done
by arch specific code prior to allocating the vCPU object.  This paves
the way for moving the allocation to common KVM code.

Acked-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2020-01-24 09:19:07 +01:00
Paolo Bonzini
fe289ebb65 KVM: s390: small fixes and enhancements
- selftest improvements
 - yield improvements
 - cleanups
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJd0k9KAAoJEBF7vIC1phx8jecP/15y4vJABaNMCb/zzNYEncxr
 lJf8ZeW+257eiEhsmmju4eM8l9/3RzsJM9WXSj91MBRu+xlkt+cyla/TC+CEKMxW
 Z8yd3AkaIPTMDBY/n6QSqDusrUwfR01iM02mr/IKguG/HeCKgLksN03ZU00mc09q
 Ogo+Cl3AdNnIds+5vkIOQAc+CHM3SGjEfyZCqoTwjn46jsKNQeDrq3hHX9RMG4FF
 BxVcSx5rCFCYyb9eruCCK4OHrEEwdJ4l0udkblRjIl+T9Y8LgoXO1/KGIggVL5UJ
 +Smoc/soXMdkOAhefn/2fB1dBRNBaUpvB5xtAd4BHyRjPomw93sftScW06qfiZuo
 0nBiDgTyilpi8dpojyu2vUpYj7NQXTI4ZoHOMTsXOhk6cqGqm4loLb4xdJ8FCoc9
 04Yf1GCfbyEovoyLq1BkL1qD5ZUBecUfYWQGS1xf0+U6/hvn5lQOGeINNe/ho2Zl
 jU1lsFuGGyKs3G5qpk0Dz8UgbRqOYC58VlGQ1eOcNVksTf7qG+MZ3c6kall7CfXg
 MFcK/PuSxyTfrr5CApyK3Gpqu32aMV0rComd6Bv28DlsTRA9F1TJ5WQTO3HUhV9R
 iiqbMAx0s1xHZp6K/VsCvYRjdVyKU7/sQ6OxRmRTybjjKajKijQjMlE2f1Nr0liD
 PKsQjv2kTvrtMDzOhWFu
 =zHPF
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.5-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: small fixes and enhancements

- selftest improvements
- yield improvements
- cleanups
2019-11-18 13:16:46 +01:00
Christian Borntraeger
8474e5cac0 KVM: s390: count invalid yields
To analyze some performance issues with lock contention and scheduling
it is nice to know when diag9c did not result in any action or when
no action was tried.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-10-10 13:18:38 +02:00
Heiko Carstens
d0dea733f6 KVM: s390: mark __insn32_query() as __always_inline
__insn32_query() will not compile if the compiler decides to not
inline it, since it contains an inline assembly with an "i" constraint
with variable contents.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-05 13:51:22 +02:00
Heiko Carstens
b1c41ac3ce KVM: s390: fix __insn32_query() inline assembly
The inline assembly constraints of __insn32_query() tell the compiler
that only the first byte of "query" is being written to. Intended was
probably that 32 bytes are written to.

Fix and simplify the code and just use a "memory" clobber.

Fixes: d668139718 ("KVM: s390: provide query function for instructions returning 32 byte")
Cc: stable@vger.kernel.org # v5.2+
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-10-05 13:51:18 +02:00
Janosch Frank
f76f637164 KVM: s390: Cleanup kvm_arch_init error path
Both kvm_s390_gib_destroy and debug_unregister test if the needed
pointers are not NULL and hence can be called unconditionally.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Link: https://lore.kernel.org/kvm/20191002075627.3582-1-frankja@linux.ibm.com
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-10-04 15:37:53 +02:00
Linus Torvalds
fe38bd6862 * s390: ioctl hardening, selftests
* ARM: ITS translation cache; support for 512 vCPUs, various cleanups
 and bugfixes
 
 * PPC: various minor fixes and preparation
 
 * x86: bugfixes all over the place (posted interrupts, SVM, emulation
 corner cases, blocked INIT), some IPI optimizations
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdf7fdAAoJEL/70l94x66DJzkIAKDcuWXJB4Qtoto6yUvPiHZm
 LYkY/Dn1zulb/DhzrBoXFey/jZXwl9kxMYkVTefnrAl0fRwFGX+G1UYnQrtAL6Gr
 ifdTYdy3kZhXCnnp99QAantWDswJHo1THwbmHrlmkxS4MdisEaTHwgjaHrDRZ4/d
 FAEwW2isSonP3YJfTtsKFFjL9k2D4iMnwZ/R2B7UOaWvgnerZ1GLmOkilvnzGGEV
 IQ89IIkWlkKd4SKgq8RkDKlfW5JrLrSdTK2Uf0DvAxV+J0EFkEaR+WlLsqumra0z
 Eg3KwNScfQj0DyT0TzurcOxObcQPoMNSFYXLRbUu1+i0CGgm90XpF1IosiuihgU=
 =w6I3
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "s390:
   - ioctl hardening
   - selftests

  ARM:
   - ITS translation cache
   - support for 512 vCPUs
   - various cleanups and bugfixes

  PPC:
   - various minor fixes and preparation

  x86:
   - bugfixes all over the place (posted interrupts, SVM, emulation
     corner cases, blocked INIT)
   - some IPI optimizations"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (75 commits)
  KVM: X86: Use IPI shorthands in kvm guest when support
  KVM: x86: Fix INIT signal handling in various CPU states
  KVM: VMX: Introduce exit reason for receiving INIT signal on guest-mode
  KVM: VMX: Stop the preemption timer during vCPU reset
  KVM: LAPIC: Micro optimize IPI latency
  kvm: Nested KVM MMUs need PAE root too
  KVM: x86: set ctxt->have_exception in x86_decode_insn()
  KVM: x86: always stop emulation on page fault
  KVM: nVMX: trace nested VM-Enter failures detected by H/W
  KVM: nVMX: add tracepoint for failed nested VM-Enter
  x86: KVM: svm: Fix a check in nested_svm_vmrun()
  KVM: x86: Return to userspace with internal error on unexpected exit reason
  KVM: x86: Add kvm_emulate_{rd,wr}msr() to consolidate VXM/SVM code
  KVM: x86: Refactor up kvm_{g,s}et_msr() to simplify callers
  doc: kvm: Fix return description of KVM_SET_MSRS
  KVM: X86: Tune PLE Window tracepoint
  KVM: VMX: Change ple_window type to unsigned int
  KVM: X86: Remove tailing newline for tracepoints
  KVM: X86: Trace vcpu_id for vmexit
  KVM: x86: Manually calculate reserved bits when loading PDPTRS
  ...
2019-09-18 09:49:13 -07:00
Paolo Bonzini
a9c20bb020 KVM: s390: Fixes for 5.3
- prevent a user triggerable oops in the migration code
 - do not leak kernel stack content
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJdejosAAoJEBF7vIC1phx8ZcYP/09WMmcbOexGvopqyMzIWgAv
 xpSHAW0+mGriu9b41OwkxBsMG3MxUzk86b3zL0r5eaigWXSuE2NU0OhScqF9ehMX
 pTtoeSzFJsPFwGQrOKIhpgcNzOJ+YfVqTDlf5dxq9uSNYF32suuz0Dw4P9PdFJOg
 k8prJXiKu+bL21TcbhWsAAP7Gb5/DA26p4d5KM3wJe351Af9lrLrDF2z+pKe9fbY
 v0vMcH3tJoBOOTYUSJeptEWU9OlYljMrJN7kkmXCEC8yklwoXPDNgAC8Yg2SfqYM
 xNKVkX/rY97cn1Dq0LpAvEjMDYvu7KbOM1qQE9A67gRLIjuGJnDyEa+j/iB/tOrz
 BMmTdut44XRaVZVdDL+d2pg3LKI+1+UV4XTwpD4g1tSpYLar3dJVb9mq00OzdCAg
 TsK+pQYTSZig+H4ubtikgm9pFGKOB2Jsp2+FoC7jYxhYQWyj4syBkSoaaUdY0LvE
 /Du3NY3RaG4yi2K2XV0yjBVAjpXxYMWqvzJYTC9XlrEQJ5nAmiefTgxZmcg4ZCMw
 0YVRigG7vz8oKpVRl/6smGd/U+qTNZN4cXnFgUr71yONiIxsSndUZ/Yledtf+KQR
 uzPfvIwYpRzwqVnXkkFb+PNxvJVftCbe2rRI4D549VsbmEJmSadjiB5aW1Rj3fMN
 47ZjXZmmGETR8BtQEM37
 =LxGy
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-5.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvm-master

KVM: s390: Fixes for 5.3

- prevent a user triggerable oops in the migration code
- do not leak kernel stack content
2019-09-14 09:25:30 +02:00
Thomas Huth
53936b5bf3 KVM: s390: Do not leak kernel stack data in the KVM_S390_INTERRUPT ioctl
When the userspace program runs the KVM_S390_INTERRUPT ioctl to inject
an interrupt, we convert them from the legacy struct kvm_s390_interrupt
to the new struct kvm_s390_irq via the s390int_to_s390irq() function.
However, this function does not take care of all types of interrupts
that we can inject into the guest later (see do_inject_vcpu()). Since we
do not clear out the s390irq values before calling s390int_to_s390irq(),
there is a chance that we copy random data from the kernel stack which
could be leaked to the userspace later.

Specifically, the problem exists with the KVM_S390_INT_PFAULT_INIT
interrupt: s390int_to_s390irq() does not handle it, and the function
__inject_pfault_init() later copies irq->u.ext which contains the
random kernel stack data. This data can then be leaked either to
the guest memory in __deliver_pfault_init(), or the userspace might
retrieve it directly with the KVM_S390_GET_IRQ_STATE ioctl.

Fix it by handling that interrupt type in s390int_to_s390irq(), too,
and by making sure that the s390irq struct is properly pre-initialized.
And while we're at it, make sure that s390int_to_s390irq() now
directly returns -EINVAL for unknown interrupt types, so that we
immediately get a proper error code in case we add more interrupt
types to do_inject_vcpu() without updating s390int_to_s390irq()
sometime in the future.

Cc: stable@vger.kernel.org
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/kvm/20190912115438.25761-1-thuth@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-09-12 14:12:21 +02:00
Igor Mammedov
13a17cc052 KVM: s390: kvm_s390_vm_start_migration: check dirty_bitmap before using it as target for memset()
If userspace doesn't set KVM_MEM_LOG_DIRTY_PAGES on memslot before calling
kvm_s390_vm_start_migration(), kernel will oops with:

  Unable to handle kernel pointer dereference in virtual kernel address space
  Failing address: 0000000000000000 TEID: 0000000000000483
  Fault in home space mode while using kernel ASCE.
  AS:0000000002a2000b R2:00000001bff8c00b R3:00000001bff88007 S:00000001bff91000 P:000000000000003d
  Oops: 0004 ilc:2 [#1] SMP
  ...
  Call Trace:
  ([<001fffff804ec552>] kvm_s390_vm_set_attr+0x347a/0x3828 [kvm])
   [<001fffff804ecfc0>] kvm_arch_vm_ioctl+0x6c0/0x1998 [kvm]
   [<001fffff804b67e4>] kvm_vm_ioctl+0x51c/0x11a8 [kvm]
   [<00000000008ba572>] do_vfs_ioctl+0x1d2/0xe58
   [<00000000008bb284>] ksys_ioctl+0x8c/0xb8
   [<00000000008bb2e2>] sys_ioctl+0x32/0x40
   [<000000000175552c>] system_call+0x2b8/0x2d8
  INFO: lockdep is turned off.
  Last Breaking-Event-Address:
   [<0000000000dbaf60>] __memset+0xc/0xa0

due to ms->dirty_bitmap being NULL, which might crash the host.

Make sure that ms->dirty_bitmap is set before using it or
return -EINVAL otherwise.

Cc: <stable@vger.kernel.org>
Fixes: afdad61615 ("KVM: s390: Fix storage attributes migration with memory slots")
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Link: https://lore.kernel.org/kvm/20190911075218.29153-1-imammedo@redhat.com/
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-09-12 13:09:17 +02:00
Paolo Bonzini
17a81bdb4e * More selftests
* Improved KVM_S390_MEM_OP ioctl input checking
 * Add kvm_valid_regs and kvm_dirty_regs invalid bit checking
 -----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJdb8MuAAoJEONU5rjiOLn4w80P/0oFvdohxQuk2KAVxs9u4I2A
 lMcoer637WukI8K5r9oBacofzG+6ODlv75VOrm4DXVmluaLMD8X5XbKmIXKK2k9Q
 YrkdUo/h+g+O9e6oLcawhkDr+BrTnAoBt9ox1W2SEKQjMe1hbgacrnogktYc7WPY
 diPSovQ3g53BX0W/OXw4ym5C0Qeyseegewl1Vc110fXKPH0eMlnXbWdkHpe9tNxV
 DjtikIC6/NNHL4shwDFZtxao0jUpjlOMASdfTJpNk6g+16XFpUJwm0Frca8qplzt
 4HJyuWPeZeyMKzCPOqJbqvwzxMmAoft+fcBeX4YhtqMerOVIZ0wM7bcf1zm99jbq
 PYMW9KXIdYEdljnQBgrK7vdZ91z0KUKUa1QkxXbPPfzD2nDo3f/hOiBcpyP8cGHO
 DZ10rkv6sNG6Y5COVDD0HMxsFh3fxDPjvHvpsU/77bS/JNHBzvcRNhafzr20en6g
 PAuBqkjWFbGbPwdINN01v0LDiHTzsZ8Z2mzv02+1UYGTOxDopbDZyB6l5Nbi51lE
 fxJKHiyqHjEO4eGzhL7vc+Cl1w/k6yvIoprM2sV+gTXdHgwh8GxzNomhRwkunXlp
 2hvCFS9XyD7M89T09hhHkDaSDP0hWcCaAp00ZuBFLRKmXJYz+Im7wqmEwRuZwOhV
 P/MiQjOnCDQ/+qW5VPgp
 =gYMG
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.4-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

* More selftests
* Improved KVM_S390_MEM_OP ioctl input checking
* Add kvm_valid_regs and kvm_dirty_regs invalid bit checking
2019-09-11 18:06:15 +02:00
Thomas Huth
200824f55e KVM: s390: Disallow invalid bits in kvm_valid_regs and kvm_dirty_regs
If unknown bits are set in kvm_valid_regs or kvm_dirty_regs, this
clearly indicates that something went wrong in the KVM userspace
application. The x86 variant of KVM already contains a check for
bad bits, so let's do the same on s390x now, too.

Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lore.kernel.org/lkml/20190904085200.29021-2-thuth@redhat.com/
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2019-09-04 15:38:05 +02:00
Thomas Huth
a13b03bbb4 KVM: s390: Test for bad access register and size at the start of S390_MEM_OP
If the KVM_S390_MEM_OP ioctl is called with an access register >= 16,
then there is certainly a bug in the calling userspace application.
We check for wrong access registers, but only if the vCPU was already
in the access register mode before (i.e. the SIE block has recorded
it). The check is also buried somewhere deep in the calling chain (in
the function ar_translation()), so this is somewhat hard to find.

It's better to always report an error to the userspace in case this
field is set wrong, and it's safer in the KVM code if we block wrong
values here early instead of relying on a check somewhere deep down
the calling chain, so let's add another check to kvm_s390_guest_mem_op()
directly.

We also should check that the "size" is non-zero here (thanks to Janosch
Frank for the hint!). If we do not check the size, we could call vmalloc()
with this 0 value, and this will cause a kernel warning.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Link: https://lkml.kernel.org/r/20190829122517.31042-1-thuth@redhat.com
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-08-29 16:08:38 +02:00
Paolo Bonzini
741cbbae07 KVM: remove kvm_arch_has_vcpu_debugfs()
There is no need for this function as all arches have to implement
kvm_arch_create_vcpu_debugfs() no matter what.  A #define symbol
let us actually simplify the code.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-08-05 12:55:48 +02:00
Linus Torvalds
39d7530d74 ARM:
* support for chained PMU counters in guests
 * improved SError handling
 * handle Neoverse N1 erratum #1349291
 * allow side-channel mitigation status to be migrated
 * standardise most AArch64 system register accesses to msr_s/mrs_s
 * fix host MPIDR corruption on 32bit
 * selftests ckleanups
 
 x86:
 * PMU event {white,black}listing
 * ability for the guest to disable host-side interrupt polling
 * fixes for enlightened VMCS (Hyper-V pv nested virtualization),
 * new hypercall to yield to IPI target
 * support for passing cstate MSRs through to the guest
 * lots of cleanups and optimizations
 
 Generic:
 * Some txt->rST conversions for the documentation
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQEcBAABAgAGBQJdJzdIAAoJEL/70l94x66DQDoH/i83/8kX4I8AWDlushPru4ts
 Q4lCE5VAPha+o4pLb1dtfFL3gTmSbsB1N++JSlqK3JOo6LphIOy6b0wBjQBbAa6U
 3CT1dJaHJoScLLj09vyBlvClGUH2ZKEQTWOiquCCf7JfPofxwPUA6vJ7TYsdkckx
 zR3ygbADWmnfS7hFfiqN3JzuYh9eoooGNWSU+Giq6VF41SiL3IqhBGZhWS0zE9c2
 2c5lpqqdeHmAYNBqsyzNiDRKp7+zLFSmZ7Z5/0L755L8KYwR6F5beTnmBMHvb4lA
 PWH/SWOC8EYR+PEowfrH+TxKZwp0gMn1kcAKjilHk0uCRwG1IzuHAr2jlNxICCk=
 =t/Oq
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull KVM updates from Paolo Bonzini:
 "ARM:
   - support for chained PMU counters in guests
   - improved SError handling
   - handle Neoverse N1 erratum #1349291
   - allow side-channel mitigation status to be migrated
   - standardise most AArch64 system register accesses to msr_s/mrs_s
   - fix host MPIDR corruption on 32bit
   - selftests ckleanups

  x86:
   - PMU event {white,black}listing
   - ability for the guest to disable host-side interrupt polling
   - fixes for enlightened VMCS (Hyper-V pv nested virtualization),
   - new hypercall to yield to IPI target
   - support for passing cstate MSRs through to the guest
   - lots of cleanups and optimizations

  Generic:
   - Some txt->rST conversions for the documentation"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (128 commits)
  Documentation: virtual: Add toctree hooks
  Documentation: kvm: Convert cpuid.txt to .rst
  Documentation: virtual: Convert paravirt_ops.txt to .rst
  KVM: x86: Unconditionally enable irqs in guest context
  KVM: x86: PMU Event Filter
  kvm: x86: Fix -Wmissing-prototypes warnings
  KVM: Properly check if "page" is valid in kvm_vcpu_unmap
  KVM: arm/arm64: Initialise host's MPIDRs by reading the actual register
  KVM: LAPIC: Retry tune per-vCPU timer_advance_ns if adaptive tuning goes insane
  kvm: LAPIC: write down valid APIC registers
  KVM: arm64: Migrate _elx sysreg accessors to msr_s/mrs_s
  KVM: doc: Add API documentation on the KVM_REG_ARM_WORKAROUNDS register
  KVM: arm/arm64: Add save/restore support for firmware workaround state
  arm64: KVM: Propagate full Spectre v2 workaround state to KVM guests
  KVM: arm/arm64: Support chained PMU counters
  KVM: arm/arm64: Remove pmc->bitmask
  KVM: arm/arm64: Re-create event when setting counter value
  KVM: arm/arm64: Extract duplicated code to own function
  KVM: arm/arm64: Rename kvm_pmu_{enable/disable}_counter functions
  KVM: LAPIC: ARBPRI is a reserved register for x2APIC
  ...
2019-07-12 15:35:14 -07:00
Pierre Morel
05f31e3bf6 s390: ap: kvm: Enable PQAP/AQIC facility for the guest
AP Queue Interruption Control (AQIC) facility gives
the guest the possibility to control interruption for
the Cryptographic Adjunct Processor queues.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Harald Freudenberger <freude@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.ibm.com>
[ Modified while picking: we may not expose STFLE facility 65
unconditionally because AIV is a pre-requirement.]
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2019-07-02 16:00:28 +02:00
Junaid Shahid
0d9ce162cf kvm: Convert kvm_lock to a mutex
It doesn't seem as if there is any particular need for kvm_lock to be a
spinlock, so convert the lock to a mutex so that sleepable functions (in
particular cond_resched()) can be called while holding it.

Signed-off-by: Junaid Shahid <junaids@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-05 14:14:50 +02:00
Sean Christopherson
f257d6dcda KVM: Directly return result from kvm_arch_check_processor_compat()
Add a wrapper to invoke kvm_arch_check_processor_compat() so that the
boilerplate ugliness of checking virtualization support on all CPUs is
hidden from the arch specific code.  x86's implementation in particular
is quite heinous, as it unnecessarily propagates the out-param pattern
into kvm_x86_ops.

While the x86 specific issue could be resolved solely by changing
kvm_x86_ops, make the change for all architectures as returning a value
directly is prettier and technically more robust, e.g. s390 doesn't set
the out param, which could lead to subtle breakage in the (highly
unlikely) scenario where the out-param was not pre-initialized by the
caller.

Opportunistically annotate svm_check_processor_compat() with __init.

Signed-off-by: Sean Christopherson <sean.j.christopherson@intel.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-06-04 19:27:32 +02:00
Thomas Huth
a86cb413f4 KVM: s390: Do not report unusabled IDs via KVM_CAP_MAX_VCPU_ID
KVM_CAP_MAX_VCPU_ID is currently always reporting KVM_MAX_VCPU_ID on all
architectures. However, on s390x, the amount of usable CPUs is determined
during runtime - it is depending on the features of the machine the code
is running on. Since we are using the vcpu_id as an index into the SCA
structures that are defined by the hardware (see e.g. the sca_add_vcpu()
function), it is not only the amount of CPUs that is limited by the hard-
ware, but also the range of IDs that we can use.
Thus KVM_CAP_MAX_VCPU_ID must be determined during runtime on s390x, too.
So the handling of KVM_CAP_MAX_VCPU_ID has to be moved from the common
code into the architecture specific code, and on s390x we have to return
the same value here as for KVM_CAP_MAX_VCPUS.
This problem has been discovered with the kvm_create_max_vcpus selftest.
With this change applied, the selftest now passes on s390x, too.

Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20190523164309.13345-9-thuth@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-05-28 15:52:19 +02:00
Christian Borntraeger
19ec166c3f KVM: s390: fix memory slot handling for KVM_SET_USER_MEMORY_REGION
kselftests exposed a problem in the s390 handling for memory slots.
Right now we only do proper memory slot handling for creation of new
memory slots. Neither MOVE, nor DELETION are handled properly. Let us
implement those.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-05-24 21:27:14 +02:00
Wei Yongjun
b41fb528dd KVM: s390: fix typo in parameter description
Fix typo in parameter description.

Fixes: 8b905d28ee ("KVM: s390: provide kvm_arch_no_poll function")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
Message-Id: <20190504065145.53665-1-weiyongjun1@huawei.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-05-20 09:40:38 +02:00
Paolo Bonzini
da8f0d97b2 KVM: s390: Features and fixes for 5.2
- VSIE crypto fixes
 - new guest features for gen15
 - disable halt polling for nested virtualization with overcommit
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJcxrJmAAoJEBF7vIC1phx8EsEP/2mIUbtY9OmVCZNHX43ds5Jr
 WR51UA/cXQGzP1cqLrqIchjJ40J7KGYBqS+9MeOyUxX85HUvb5dGgUiIfDOmh8R7
 YIHe3nkM0dcIRbeuSp48sA8rl817TNGSBg7GnUN+eaEvJ/U+WbLb1sry/0uZN6Tm
 2iFkff+XgSeEfBmrlxiPVl5PGUxi6FtKQWDwhn+MRkvs4sdQBh1SBITMIrzMgDmQ
 GMd5olfLp3AZZV2yniFvZM9TSWvKobCCH6IVF0/mBchxkqmdjQaKdSCRO6a1pLDh
 8PVBN7i+yipLURUMBuDCMxGDBINJgvvXkThB8N9K6+CanUc8KCc7l0EimS93s3DB
 FsutI/2mSFy/xJ4nk98VVp8WCbVftQLtyKUSytBiqCTSpg1gtFMMntCPAqlON4TV
 xHOaAnJjF4Lhvfm0QrxQ22bAmuju6WIh5WKG8D+s7yqcn7GZeDUYdeftWiGNteaf
 sJwX1Vq8H6iUac1mfp7UbfT+60UuiCkj/d9sY9eRBNlPPIX6V4UgZU4Xh8/rSMf3
 qnN4RCBGIQqndUzRzaw7ZtAfNy5jBE1BABems49fy07kuPCzrg9tQqXlWxf/60Ad
 QKqZ3Q/hb4ixYQJ7TAqQZmq1D3NL8w+V9MthcILmEGfMYF4BZKJV39ZigbttRIcN
 ZuiS+8IfOWN1IXZ2zXL0
 =mZyZ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-5.2-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Features and fixes for 5.2

- VSIE crypto fixes
- new guest features for gen15
- disable halt polling for nested virtualization with overcommit
2019-04-30 21:29:14 +02:00
Christian Borntraeger
8b905d28ee KVM: s390: provide kvm_arch_no_poll function
We do track the current steal time of the host CPUs. Let us use
this value to disable halt polling if the steal time goes beyond
a configured value.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-04-26 09:08:17 +02:00
Christian Borntraeger
8ec2fa52ea KVM: s390: enable MSA9 keywrapping functions depending on cpu model
Instead of adding a new machine option to disable/enable the keywrapping
options of pckmo (like for AES and DEA) we can now use the CPU model to
decide. As ECC is also wrapped with the AES key we need that to be
enabled.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-25 02:26:21 -04:00
Christian Borntraeger
4f45b90e1c KVM: s390: add deflate conversion facilty to cpu model
This enables stfle.151 and adds the subfunctions for DFLTCC. Bit 151 is
added to the list of facilities that will be enabled when there is no
cpu model involved as DFLTCC requires no additional handling from
userspace, e.g. for migration.

Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-25 02:24:17 -04:00
Christian Borntraeger
173aec2d5a KVM: s390: add enhanced sort facilty to cpu model
This enables stfle.150 and adds the subfunctions for SORTL. Bit 150 is
added to the list of facilities that will be enabled when there is no
cpu model involved as sortl requires no additional handling from
userspace, e.g. for migration.

Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-18 12:57:53 +02:00
Christian Borntraeger
d668139718 KVM: s390: provide query function for instructions returning 32 byte
Some of the new features have a 32byte response for the query function.
Provide a new wrapper similar to __cpacf_query. We might want to factor
this out if other users come up, as of today there is none. So let us
keep the function within KVM.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-18 12:57:53 +02:00
Christian Borntraeger
13209ad039 KVM: s390: add MSA9 to cpumodel
This enables stfle.155 and adds the subfunctions for KDSA. Bit 155 is
added to the list of facilities that will be enabled when there is no
cpu model involved as MSA9 requires no additional handling from
userspace, e.g. for migration.

Please note that a cpu model enabled user space can and will have the
final decision on the facility bits for a guests.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-18 10:14:11 +02:00
Christian Borntraeger
d5cb6ab1e3 KVM: s390: add vector BCD enhancements facility to cpumodel
If vector support is enabled, the vector BCD enhancements facility
might also be enabled.
We can directly forward this facility to the guest if available
and VX is requested by user space.

Please note that user space can and will have the final decision
on the facility bits for a guests.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-18 10:14:11 +02:00
Christian Borntraeger
7832e91cd3 KVM: s390: add vector enhancements facility 2 to cpumodel
If vector support is enabled, the vector enhancements facility 2
might also be enabled.
We can directly forward this facility to the guest if available
and VX is requested by user space.

Please note that user space can and will have the final decision
on the facility bits for a guests.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2019-04-18 10:14:10 +02:00
Paolo Bonzini
c110ae578c kvm: move KVM_CAP_NR_MEMSLOTS to common code
All architectures except MIPS were defining it in the same way,
and memory slots are handled entirely by common code so there
is no point in keeping the definition per-architecture.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2019-04-16 15:39:08 +02:00
Christian Borntraeger
11ba5961a2 KVM: s390: add debug logging for cpu model subfunctions
As userspace can now get/set the subfunctions we want to trace those.
This will allow to also check QEMUs cpu model vs. what the real
hardware provides.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
2019-02-22 11:04:35 +01:00
Christian Borntraeger
346fa2f891 KVM: s390: implement subfunction processor calls
While we will not implement interception for query functions yet, we can
and should disable functions that have a control bit based on the given
CPU model.

Let us start with enabling the subfunction interface.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2019-02-22 11:04:35 +01:00
Michael Mueller
b1d1e76ed9 KVM: s390: start using the GIB
By initializing the GIB, it will be used by the kvm host.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-15-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:24 +01:00
Michael Mueller
9f30f62163 KVM: s390: add gib_alert_irq_handler()
The patch implements a handler for GIB alert interruptions
on the host. Its task is to alert guests that interrupts are
pending for them.

A GIB alert interrupt statistic counter is added as well:

$ cat /proc/interrupts
          CPU0       CPU1
  ...
  GAL:      23         37   [I/O] GIB Alert
  ...

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <20190131085247.13826-14-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
25c84dbaec KVM: s390: add kvm reference to struct sie_page2
Adding the kvm reference to struct sie_page2 will allow to
determine the kvm a given gisa belongs to:

  container_of(gisa, struct sie_page2, gisa)->kvm

This functionality will be required to process a gisa in
gib alert interruption context.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-11-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
1282c21eb3 KVM: s390: add the GIB and its related life-cyle functions
The Guest Information Block (GIB) links the GISA of all guests
that have adapter interrupts pending. These interrupts cannot be
delivered because all vcpus of these guests are currently in WAIT
state or have masked the respective Interruption Sub Class (ISC).
If enabled, a GIB alert is issued on the host to schedule these
guests to run suitable vcpus to consume the pending interruptions.

This mechanism allows to process adapter interrupts for currently
not running guests.

The GIB is created during host initialization and associated with
the Adapter Interruption Facility in case an Adapter Interruption
Virtualization Facility is available.

The GIB initialization and thus the activation of the related code
will be done in an upcoming patch of this series.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-10-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:23 +01:00
Michael Mueller
982cff4259 KVM: s390: introduce struct kvm_s390_gisa_interrupt
Use this struct analog to the kvm interruption structs
for kvm emulated floating and local interruptions.

GIB handling will add further fields to this structure as
required.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Message-Id: <20190131085247.13826-8-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:29:22 +01:00
Michael Mueller
8d43d57036 KVM: s390: clarify kvm related kernel message
As suggested by our ID dept. here are some kernel message
updates.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2019-02-05 14:28:35 +01:00
Paolo Bonzini
e9f2e05a5f KVM: s390: Fixes for 4.21
Just two small fixes.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJcGgnhAAoJEBF7vIC1phx8kfgP/iiXHJo94IK/rqXbYFGnw259
 ehRaWmzXJAdU6G7RgaAqyNkEudOjIoPx9QDe0WRl/vRkAiQ2iejoovGFfa/wDkV4
 N4uCKdKJ8U39ixonC7/b90798p+Fgc1MfNHtrsvgjj9d4kjzx6L0Qq9G+8t9EcU+
 BJZNuK6L2+AY/o/yysVTCp5yI/Pqf0vtKrtglsGe7Eg1FES8MWR3A0OIeOar5Bcq
 uGFIUhEy2tHDNFYSdrmKCF4DGkJ+RmBgAEq/Lp2RqChD00CfVE/pHNZfQHGXmPFA
 MuWvUohuhhF7Ly3OrQKNdILqxQkqUov3pNeWSzTb4Awy/GY3F1j9K4ysF4/uQLFr
 97kjySVUpK1qhDVVS2lGZp1gOAmjByVfw9j7/Jq+MPDsHmNRISTfbCjdkzyhHxcd
 joPS9/StC1r/kFN9pyfDr+S+8KgG4jx5Jk6Jjwt+BOUi2pummP9UcrAfxQd+6QKZ
 3s2qrgAbkaJfYXpTqEw0WkxncYsNC+WVL3tmL7IQdBo6C+rPtUPpiSgT4Mbwy9Tk
 s7KGX9u33mDuw4vvz3LFZcgcXdM+hItzsHsE/l8PFOea5jqKIvyyuaK9zjGS25b1
 VTP/2RckdopTHEy+iFz3tmRzHB2n36U3cEeOCow3/wDzEbJKy2qK7SeIUTuQloyG
 ZZChydpdoc3I/5m6ecCc
 =GVCX
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.21-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Fixes for 4.21

Just two small fixes.
2018-12-19 22:17:09 +01:00
Michael Mueller
7aedd9d48f KVM: s390: fix kmsg component kvm-s390
Relocate #define statement for kvm related kernel messages
before the include of printk to become effective.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-12-18 10:18:27 +01:00
Michael Mueller
308c3e6673 KVM: s390: unregister debug feature on failing arch init
Make sure the debug feature and its allocated resources get
released upon unsuccessful architecture initialization.

A related indication of the issue will be reported as kernel
message.

Signed-off-by: Michael Mueller <mimu@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20181130143215.69496-2-mimu@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-12-18 10:18:27 +01:00
Paolo Bonzini
e5d83c74a5 kvm: make KVM_CAP_ENABLE_CAP_VM architecture agnostic
The first such capability to be handled in virt/kvm/ will be manual
dirty page reprotection.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-12-14 12:34:18 +01:00
Paolo Bonzini
3d0d0d9b1d KVM: s390/vfio-ap: Fixes and enhancements for vfio-ap
- add tracing
 - fix a locking bug
 - make local functions and data static
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbwKzMAAoJEBF7vIC1phx8DR0QALWLdmVtMioQeeoas9LYurI0
 VuFjM5QsH9hVkjDZIP7Y0titQz1L4WWqIwZVffHmQGL8saRr+fd/7gBwAReKgZVU
 OrZDtUpqS1PBsrKQx36MjWrZ5n4R5tzvW38xiPevEIBLq+rtQ1GbiCs3rtRwKlur
 uABquv9uYr8GHrmYa9bUvUpbVvHQvWz/h8T4cjgwAuN0NER6PzqMUzcZBt2Q9s29
 26ZQ+r7CZ2qklJvoB8UOsrWdsZhM58BaY+CJzrAsxD3OAnPJGILpTFW2dXIoVfkh
 LMuuuzl8Tl0ntwJKjifhut7/f9VX9ipTvmA53e2moq52UEA2mJoOzu1Ku/KAivLe
 4efycTIvYRV1UKE1JLWlS/5z6fNg9eG2CykSqRlrznEPiGNTPMY5JemtvrPINkcZ
 QrdbI6ou+grFXlfaG+KcS2iFOgrMqL1UWABiq1jJVW2RAK1ZeUBFHKVeJwKVKSeW
 p9xbh7jl7yIvQ8bfsO8P3LVFWK0EmxJt6oA7ln4X7O1Pbx1QBeH5ZMBsdqLJZFsT
 AQIT/p51JjIF6H4V8/jBCRYuR91IcD9CQlRR96y8zfHiaEYlvS1YFHuZdLZCJ7Ef
 LFLVIHXGfQLHoSN2r/8us7OmmVDJctKwPGLQuh2RcMFJPySm/iBBX9m57S7VWjAb
 87wLYmntUMcFHIWFgixT
 =hZdt
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.20-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390/vfio-ap: Fixes and enhancements for vfio-ap

- add tracing
- fix a locking bug
- make local functions and data static
2018-10-13 12:00:26 +02:00
Christian Borntraeger
ed3054a302 Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext 2018-10-08 12:14:54 +02:00
Pierre Morel
0e237e4469 KVM: s390: Tracing APCB changes
kvm_arch_crypto_set_masks is a new function to centralize
the setup the APCB masks inside the CRYCB SIE satellite.

To trace APCB mask changes, we add KVM_EVENT() tracing to
both kvm_arch_crypto_set_masks and kvm_arch_crypto_clear_masks.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Message-Id: <1538728270-10340-2-git-send-email-pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-05 13:10:18 +02:00
Christian Borntraeger
8e41bd5431 KVM: s390: fix locking for crypto setting error path
We need to unlock the kvm->lock mutex in the error case.

Reported-by: smatch
Fixes: 37940fb0b6 ("KVM: s390: device attrs to enable/disable AP interpretation")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-05 10:04:03 +02:00
Paolo Bonzini
dd5bd0a65f KVM: s390: Features for 4.20
- Initial version of AP crypto virtualization via vfio-mdev
 - Set the host program identifier
 - Optimize page table locking
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbsxPQAAoJEBF7vIC1phx8TDoP/2zJTTf6s4Kc+jltNsFaaZyO
 rg5N6ZhL+YRpdtPB/H5Y07zt8MSAOfMMqFwzSJo2B+C/xs4BjVtTx6H7M/5AS4Rl
 /JC2xcjoVi11FzJ1EflfLlqOtPrenJmB+c7RrLy61xIYCY8VhM55u4epIjY/FWwA
 VlLVHIP7+9MBgDG6TNEuvAiFwwpM2axITzXw6vkjC/8CbRQz3cY+zvBqhVDq3KOO
 MLHSmBKLbrA940XhUlPQ1wDplGlZ5lobG6+pXnynCs8YBj12zEivNe4y9Z1v0XsM
 nKQZxkDK+q9LG7WyRU5uIA00+msFopGrUCsQd/S/HQA8wyJ6xYeLALQpNHgMR7ts
 Qiv4oj/2nd7qW8X0Fs25no0G5MtOSvHqNGKQ5pY09q8JAxmU1vnSNFR+KZuS+fX7
 YyUf+SeBAZqkSzXgI11nD4hyxyFX1SQiO5FPjPyE93fPdJ9fKaQv4A/wdsrt6+ca
 5GaE2RJIxhKfkr9dHWJXQBGkAuYS8PnJiNYUdati5aemTht71KCYuafRzYL/T0YG
 omuDHbsS0L0EniMIWaWqmwu7M1BLsnMLA8nLsMrCANBG1PWaebobP7HXeK1jK90b
 ODhzldX5r3wQcj0nVLfdA6UOiY0wyvHYyRNiq+EBO9FXHtrNpxjz2X2MmK2fhkE6
 EaDLlgLSpB8ZT6MZHsWA
 =XI83
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.20-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD

KVM: s390: Features for 4.20
- Initial version of AP crypto virtualization via vfio-mdev
- Set the host program identifier
- Optimize page table locking
2018-10-04 17:12:45 +02:00
Christian Borntraeger
55d09dd4c8 Merge branch 'apv11' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kernelorgnext 2018-10-01 08:53:23 +02:00
Collin Walling
67d49d52ae KVM: s390: set host program identifier
A host program identifier (HPID) provides information regarding the
underlying host environment. A level-2 (VM) guest will have an HPID
denoting Linux/KVM, which is set during VCPU setup. A level-3 (VM on a
VM) and beyond guest will have an HPID denoting KVM vSIE, which is set
for all shadow control blocks, overriding the original value of the
HPID.

Signed-off-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Message-Id: <1535734279-10204-4-git-send-email-walling@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-10-01 08:51:42 +02:00
Tony Krowiak
37940fb0b6 KVM: s390: device attrs to enable/disable AP interpretation
Introduces two new VM crypto device attributes (KVM_S390_VM_CRYPTO)
to enable or disable AP instruction interpretation from userspace
via the KVM_SET_DEVICE_ATTR ioctl:

* The KVM_S390_VM_CRYPTO_ENABLE_APIE attribute enables hardware
  interpretation of AP instructions executed on the guest.

* The KVM_S390_VM_CRYPTO_DISABLE_APIE attribute disables hardware
  interpretation of AP instructions executed on the guest. In this
  case the instructions will be intercepted and pass through to
  the guest.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-25-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Pierre Morel
6cc571b1b1 KVM: s390: Clear Crypto Control Block when using vSIE
When we clear the Crypto Control Block (CRYCB) used by a guest
level 2, the vSIE shadow CRYCB for guest level 3 must be updated
before the guest uses it.

We achieve this by using the KVM_REQ_VSIE_RESTART synchronous
request for each vCPU belonging to the guest to force the reload
of the shadow CRYCB before rerunning the guest level 3.

Signed-off-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Message-Id: <20180925231641.4954-16-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-28 15:50:11 +02:00
Tony Krowiak
42104598ef KVM: s390: interface to clear CRYCB masks
Introduces a new KVM function to clear the APCB0 and APCB1 in the guest's
CRYCB. This effectively clears all bits of the APM, AQM and ADM masks
configured for the guest. The VCPUs are taken out of SIE to ensure the
VCPUs do not get out of sync.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Acked-by: Halil Pasic <pasic@linux.ibm.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20180925231641.4954-11-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 21:02:59 +02:00
Tony Krowiak
e585b24aeb KVM: s390: refactor crypto initialization
This patch refactors the code that initializes and sets up the
crypto configuration for a guest. The following changes are
implemented via this patch:

1. Introduces a flag indicating AP instructions executed on
   the guest shall be interpreted by the firmware. This flag
   is used to set a bit in the guest's state description
   indicating AP instructions are to be interpreted.

2. Replace code implementing AP interfaces with code supplied
   by the AP bus to query the AP configuration.

Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Michael Mueller <mimu@linux.ibm.com>
Tested-by: Farhan Ali <alifm@linux.ibm.com>
Message-Id: <20180925231641.4954-4-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 20:45:20 +02:00
David Hildenbrand
3194cdb711 KVM: s390: introduce and use KVM_REQ_VSIE_RESTART
When we change the crycb (or execution controls), we also have to make sure
that the vSIE shadow datastructures properly consider the changed
values before rerunning the vSIE. We can achieve that by simply using a
VCPU request now.

This has to be a synchronous request (== handled before entering the
(v)SIE again).

The request will make sure that the vSIE handler is left, and that the
request will be processed (NOP), therefore forcing a reload of all
vSIE data (including rebuilding the crycb) when re-entering the vSIE
interception handler the next time.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180925231641.4954-3-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 09:13:20 +02:00
David Hildenbrand
9ea5972865 KVM: s390: vsie: simulate VCPU SIE entry/exit
VCPU requests and VCPU blocking right now don't take care of the vSIE
(as it was not necessary until now). But we want to have synchronous VCPU
requests that will also be handled before running the vSIE again.

So let's simulate a SIE entry of the VCPU when calling the sie during
vSIE handling and check for PROG_ flags. The existing infrastructure
(e.g. exit_sie()) will then detect that the SIE (in form of the vSIE) is
running and properly kick the vSIE CPU, resulting in it leaving the vSIE
loop and therefore the vSIE interception handler, allowing it to handle
VCPU requests.

E.g. if we want to modify the crycb of the VCPU and make sure that any
masks also get applied to the VSIE crycb shadow (which uses masks from the
VCPU crycb), we will need a way to hinder the vSIE from running and make
sure to process the updated crycb before reentering the vSIE again.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180925231641.4954-2-akrowiak@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-26 09:13:20 +02:00
Janosch Frank
40ebdb8e59 KVM: s390: Make huge pages unavailable in ucontrol VMs
We currently do not notify all gmaps when using gmap_pmdp_xchg(), due
to locking constraints. This makes ucontrol VMs, which is the only VM
type that creates multiple gmaps, incompatible with huge pages. Also
we would need to hold the guest_table_lock of all gmaps that have this
vmaddr maped to synchronize access to the pmd.

ucontrol VMs are rather exotic and creating a new locking concept is
no easy task. Hence we return EINVAL when trying to active
KVM_CAP_S390_HPAGE_1M and report it as being not available when
checking for it.

Fixes: a4499382 ("KVM: s390: Add huge page enablement control")
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Message-Id: <20180801112508.138159-1-frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-09-12 14:46:37 +02:00
Janosch Frank
df88f3181f KVM: s390: Properly lock mm context allow_gmap_hpage_1m setting
We have to do down_write on the mm semaphore to set a bitfield in the
mm context.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Fixes: a4499382 ("KVM: s390: Add huge page enablement control")
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-09-04 11:40:26 +02:00
Janosch Frank
2375846193 KVM: s390: initial host large page support
- must be enabled via module parameter hpage=1
 - cannot be used together with nested
 - does support migration
 - does support hugetlbfs
 - no THP yet
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJbX4AwAAoJEBF7vIC1phx85eMP/ifsNHwqfAOrZBdlJuLVPla5
 47J8iY4i4DOKGhKI4YOTcJQhn1izKZhECXS8d8hghB/sQUCE2CLVr1X/r1Udy2Pq
 bpKG4apYtcJZBF6qn7yDMjBGkIRK4OCBD1pkuKEq2NyvUgPsHUVUgpuq2gngMTBk
 ZN9MIfRQMdIEJsT389D6T9as0lwABJ0MJap5AudkQwguN2dDhQGeZv8l0QYV8C2I
 WqRI2VsI1QEo3cJr1lJ5li/F9fC7q0l6QwlvPVocIHJAnq01zJvOekeAgQ4hzz16
 JIoQckJq8m4d4PqZ7aWmAaMEemoQ9llmCavovspJNtFT79jho6cWWtBEvq+t0GLQ
 qTsG9Yi20hONZMWAw+JIdSdOuFMD0HCpOWdUtSMjENFRbr8LLHUr91dGIxRLjF8Z
 gv3vDJrbGzCQ+b9qPA8SrAN7U3VNCZG384MEmobwTuv5hxOopWp6chcK7RCriV/m
 7cFDfO7+2pZymdW7D4DWlFiZl4mWpwOxip32C9tCt0CQveqeYSZsb5Qb9Pe+50vr
 JhpB74UL79Wffvd65InGlu5jx1SdGG0QAzmBOkdOsAhX+0WMmXRB1ddn4whu7HPU
 ssNtdKgLt9KkM/kIsB9RC/YLvUFK1lBVHrfnzUmLw3CBHP3QeO+V+arLwdVLVDjV
 PA/LPECBWtGtQtxGWb2H
 =Y0Wl
 -----END PGP SIGNATURE-----

Merge tag 'hlp_stage1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into kvms390/next

KVM: s390: initial host large page support

- must be enabled via module parameter hpage=1
- cannot be used together with nested
- does support migration
- does support hugetlbfs
- no THP yet
2018-07-30 23:20:48 +02:00
Janosch Frank
a449938297 KVM: s390: Add huge page enablement control
General KVM huge page support on s390 has to be enabled via the
kvm.hpage module parameter. Either nested or hpage can be enabled, as
we currently do not support vSIE for huge backed guests. Once the vSIE
support is added we will either drop the parameter or enable it as
default.

For a guest the feature has to be enabled through the new
KVM_CAP_S390_HPAGE_1M capability and the hpage module
parameter. Enabling it means that cmm can't be enabled for the vm and
disables pfmf and storage key interpretation.

This is due to the fact that in some cases, in upcoming patches, we
have to split huge pages in the guest mapping to be able to set more
granular memory protection on 4k pages. These split pages have fake
page tables that are not visible to the Linux memory management which
subsequently will not manage its PGSTEs, while the SIE will. Disabling
these features lets us manage PGSTE data in a consistent matter and
solve that problem.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-07-30 23:13:38 +02:00
Janosch Frank
bd096f6443 KVM: s390: Add skey emulation fault handling
When doing skey emulation for huge guests, we now need to fault in
pmds, as we don't have PGSTES anymore to store them when we do not
have valid table entries.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2018-07-30 11:20:18 +01:00
Janosch Frank
0959e16867 s390/mm: Add huge page dirty sync support
To do dirty loging with huge pages, we protect huge pmds in the
gmap. When they are written to, we unprotect them and mark them dirty.

We introduce the function gmap_test_and_clear_dirty_pmd which handles
dirty sync for huge pages.

Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
2018-07-30 11:20:18 +01:00
Christian Borntraeger
a3da7b4a3b KVM: s390: add etoken support for guests
We want to provide facility 156 (etoken facility) to our
guests. This includes migration support (via sync regs) and
VSIE changes. The tokens are being reset on clear reset. This
has to be implemented by userspace (via sync regs).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
2018-07-19 12:59:36 +02:00
Claudio Imbrenda
afdad61615 KVM: s390: Fix storage attributes migration with memory slots
This is a fix for several issues that were found in the original code
for storage attributes migration.

Now no bitmap is allocated to keep track of dirty storage attributes;
the extra bits of the per-memslot bitmap that are always present anyway
are now used for this purpose.

The code has also been refactored a little to improve readability.

Fixes: 190df4a212 ("KVM: s390: CMMA tracking, ESSA emulation, migration mode")
Fixes: 4036e3874a ("KVM: s390: ioctls to get and set guest storage attributes")
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1525106005-13931-3-git-send-email-imbrenda@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-07-13 09:48:57 +02:00
Linus Torvalds
b08fc5277a - Error path bug fix for overflow tests (Dan)
- Additional struct_size() conversions (Matthew, Kees)
 - Explicitly reported overflow fixes (Silvio, Kees)
 - Add missing kvcalloc() function (Kees)
 - Treewide conversions of allocators to use either 2-factor argument
   variant when available, or array_size() and array3_size() as needed (Kees)
 -----BEGIN PGP SIGNATURE-----
 Comment: Kees Cook <kees@outflux.net>
 
 iQJKBAABCgA0FiEEpcP2jyKd1g9yPm4TiXL039xtwCYFAlsgVtMWHGtlZXNjb29r
 QGNocm9taXVtLm9yZwAKCRCJcvTf3G3AJhsJEACLYe2EbwLFJz7emOT1KUGK5R1b
 oVxJog0893WyMqgk9XBlA2lvTBRBYzR3tzsadfYo87L3VOBzazUv0YZaweJb65sF
 bAvxW3nY06brhKKwTRed1PrMa1iG9R63WISnNAuZAq7+79mN6YgW4G6YSAEF9lW7
 oPJoPw93YxcI8JcG+dA8BC9w7pJFKooZH4gvLUSUNl5XKr8Ru5YnWcV8F+8M4vZI
 EJtXFmdlmxAledUPxTSCIojO8m/tNOjYTreBJt9K1DXKY6UcgAdhk75TRLEsp38P
 fPvMigYQpBDnYz2pi9ourTgvZLkffK1OBZ46PPt8BgUZVf70D6CBg10vK47KO6N2
 zreloxkMTrz5XohyjfNjYFRkyyuwV2sSVrRJqF4dpyJ4NJQRjvyywxIP4Myifwlb
 ONipCM1EjvQjaEUbdcqKgvlooMdhcyxfshqJWjHzXB6BL22uPzq5jHXXugz8/ol8
 tOSM2FuJ2sBLQso+szhisxtMd11PihzIZK9BfxEG3du+/hlI+2XgN7hnmlXuA2k3
 BUW6BSDhab41HNd6pp50bDJnL0uKPWyFC6hqSNZw+GOIb46jfFcQqnCB3VZGCwj3
 LH53Be1XlUrttc/NrtkvVhm4bdxtfsp4F7nsPFNDuHvYNkalAVoC3An0BzOibtkh
 AtfvEeaPHaOyD8/h2Q==
 =zUUp
 -----END PGP SIGNATURE-----

Merge tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull more overflow updates from Kees Cook:
 "The rest of the overflow changes for v4.18-rc1.

  This includes the explicit overflow fixes from Silvio, further
  struct_size() conversions from Matthew, and a bug fix from Dan.

  But the bulk of it is the treewide conversions to use either the
  2-factor argument allocators (e.g. kmalloc(a * b, ...) into
  kmalloc_array(a, b, ...) or the array_size() macros (e.g. vmalloc(a *
  b) into vmalloc(array_size(a, b)).

  Coccinelle was fighting me on several fronts, so I've done a bunch of
  manual whitespace updates in the patches as well.

  Summary:

   - Error path bug fix for overflow tests (Dan)

   - Additional struct_size() conversions (Matthew, Kees)

   - Explicitly reported overflow fixes (Silvio, Kees)

   - Add missing kvcalloc() function (Kees)

   - Treewide conversions of allocators to use either 2-factor argument
     variant when available, or array_size() and array3_size() as needed
     (Kees)"

* tag 'overflow-v4.18-rc1-part2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
  treewide: Use array_size in f2fs_kvzalloc()
  treewide: Use array_size() in f2fs_kzalloc()
  treewide: Use array_size() in f2fs_kmalloc()
  treewide: Use array_size() in sock_kmalloc()
  treewide: Use array_size() in kvzalloc_node()
  treewide: Use array_size() in vzalloc_node()
  treewide: Use array_size() in vzalloc()
  treewide: Use array_size() in vmalloc()
  treewide: devm_kzalloc() -> devm_kcalloc()
  treewide: devm_kmalloc() -> devm_kmalloc_array()
  treewide: kvzalloc() -> kvcalloc()
  treewide: kvmalloc() -> kvmalloc_array()
  treewide: kzalloc_node() -> kcalloc_node()
  treewide: kzalloc() -> kcalloc()
  treewide: kmalloc() -> kmalloc_array()
  mm: Introduce kvcalloc()
  video: uvesafb: Fix integer overflow in allocation
  UBIFS: Fix potential integer overflow in allocation
  leds: Use struct_size() in allocation
  Convert intel uncore to struct_size
  ...
2018-06-12 18:28:00 -07:00
Kees Cook
42bc47b353 treewide: Use array_size() in vmalloc()
The vmalloc() function has no 2-factor argument form, so multiplication
factors need to be wrapped in array_size(). This patch replaces cases of:

        vmalloc(a * b)

with:
        vmalloc(array_size(a, b))

as well as handling cases of:

        vmalloc(a * b * c)

with:

        vmalloc(array3_size(a, b, c))

This does, however, attempt to ignore constant size factors like:

        vmalloc(4 * 1024)

though any constants defined via macros get caught up in the conversion.

Any factors with a sizeof() of "unsigned char", "char", and "u8" were
dropped, since they're redundant.

The Coccinelle script used for this was:

// Fix redundant parens around sizeof().
@@
type TYPE;
expression THING, E;
@@

(
  vmalloc(
-	(sizeof(TYPE)) * E
+	sizeof(TYPE) * E
  , ...)
|
  vmalloc(
-	(sizeof(THING)) * E
+	sizeof(THING) * E
  , ...)
)

// Drop single-byte sizes and redundant parens.
@@
expression COUNT;
typedef u8;
typedef __u8;
@@

(
  vmalloc(
-	sizeof(u8) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(__u8) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(char) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(unsigned char) * (COUNT)
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(u8) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(__u8) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(char) * COUNT
+	COUNT
  , ...)
|
  vmalloc(
-	sizeof(unsigned char) * COUNT
+	COUNT
  , ...)
)

// 2-factor product with sizeof(type/expression) and identifier or constant.
@@
type TYPE;
expression THING;
identifier COUNT_ID;
constant COUNT_CONST;
@@

(
  vmalloc(
-	sizeof(TYPE) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT_ID
+	array_size(COUNT_ID, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT_ID)
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT_ID
+	array_size(COUNT_ID, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT_CONST)
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT_CONST
+	array_size(COUNT_CONST, sizeof(THING))
  , ...)
)

// 2-factor product, only identifiers.
@@
identifier SIZE, COUNT;
@@

  vmalloc(
-	SIZE * COUNT
+	array_size(COUNT, SIZE)
  , ...)

// 3-factor product with 1 sizeof(type) or sizeof(expression), with
// redundant parens removed.
@@
expression THING;
identifier STRIDE, COUNT;
type TYPE;
@@

(
  vmalloc(
-	sizeof(TYPE) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(TYPE) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(TYPE))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT) * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * (COUNT) * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT * (STRIDE)
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
|
  vmalloc(
-	sizeof(THING) * COUNT * STRIDE
+	array3_size(COUNT, STRIDE, sizeof(THING))
  , ...)
)

// 3-factor product with 2 sizeof(variable), with redundant parens removed.
@@
expression THING1, THING2;
identifier COUNT;
type TYPE1, TYPE2;
@@

(
  vmalloc(
-	sizeof(TYPE1) * sizeof(TYPE2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(TYPE2))
  , ...)
|
  vmalloc(
-	sizeof(THING1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(THING1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(THING1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * COUNT
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
|
  vmalloc(
-	sizeof(TYPE1) * sizeof(THING2) * (COUNT)
+	array3_size(COUNT, sizeof(TYPE1), sizeof(THING2))
  , ...)
)

// 3-factor product, only identifiers, with redundant parens removed.
@@
identifier STRIDE, SIZE, COUNT;
@@

(
  vmalloc(
-	(COUNT) * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * (STRIDE) * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * STRIDE * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	(COUNT) * (STRIDE) * (SIZE)
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
|
  vmalloc(
-	COUNT * STRIDE * SIZE
+	array3_size(COUNT, STRIDE, SIZE)
  , ...)
)

// Any remaining multi-factor products, first at least 3-factor products
// when they're not all constants...
@@
expression E1, E2, E3;
constant C1, C2, C3;
@@

(
  vmalloc(C1 * C2 * C3, ...)
|
  vmalloc(
-	E1 * E2 * E3
+	array3_size(E1, E2, E3)
  , ...)
)

// And then all remaining 2 factors products when they're not all constants.
@@
expression E1, E2;
constant C1, C2;
@@

(
  vmalloc(C1 * C2, ...)
|
  vmalloc(
-	E1 * E2
+	array_size(E1, E2)
  , ...)
)

Signed-off-by: Kees Cook <keescook@chromium.org>
2018-06-12 16:19:22 -07:00
Souptick Joarder
1499fa809e kvm: Change return type to vm_fault_t
Use new return type vm_fault_t for fault handler. For
now, this is just documenting that the function returns
a VM_FAULT value rather than an errno. Once all instances
are converted, vm_fault_t will become a distinct type.

commit 1c8f422059 ("mm: change return type to vm_fault_t")

Signed-off-by: Souptick Joarder <jrdr.linux@gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-06-01 19:18:25 +02:00
David Hildenbrand
33d1b2729e KVM: s390: generalize kvm_s390_get_tod_clock_ext()
Move the Multiple-epoch facility handling into it and rename it to
kvm_s390_get_tod_clock().

This leaves us with:
- kvm_s390_set_tod_clock()
- kvm_s390_get_tod_clock()
- kvm_s390_get_tod_clock_fast()

So all Multiple-epoch facility is hidden in these functions.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-05-17 09:02:37 +02:00
David Hildenbrand
9ac96d759f KVM: s390: no need to inititalize kvm->arch members to 0
KVM is allocated with kzalloc(), so these members are already 0.

Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-05-17 09:02:32 +02:00
David Hildenbrand
b9224cd738 KVM: s390: introduce defines for control registers
In KVM code we use masks to test/set control registers.

Let's define the ones we use in arch/s390/include/asm/ctl_reg.h and
replace all occurrences in KVM code.

As we will be needing the define for Clock-comparator sign control soon,
let's also add it.

Suggested-by: Collin L. Walling <walling@linux.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Collin Walling <walling@linux.ibm.com>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-05-17 09:02:27 +02:00
Tony Krowiak
20c922f04b KVM: s390: reset crypto attributes for all vcpus
Introduces a new function to reset the crypto attributes for all
vcpus whether they are running or not. Each vcpu in KVM will
be removed from SIE prior to resetting the crypto attributes in its
SIE state description. After all vcpus have had their crypto attributes
reset the vcpus will be restored to SIE.

This function is incorporated into the kvm_s390_vm_set_crypto(kvm)
function to fix a reported issue whereby the crypto key wrapping
attributes could potentially get out of synch for running vcpus.

Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Tony Krowiak <akrowiak@linux.vnet.ibm.com>
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-05-17 09:02:10 +02:00
Janosch Frank
55531b7431 KVM: s390: Add storage key facility interpretation control
Up to now we always expected to have the storage key facility
available for our (non-VSIE) KVM guests. For huge page support, we
need to be able to disable it, so let's introduce that now.

We add the use_skf variable to manage KVM storage key facility
usage. Also we rename use_skey in the mm context struct to uses_skeys
to make it more clear that it is an indication that the vm actively
uses storage keys.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-05-17 09:00:41 +02:00
Christian Borntraeger
ccc40c53c0 KVM: s390: provide counters for all interrupt injects/delivery
For testing the exitless interrupt support it turned out useful to
have separate counters for inject and delivery of I/O interrupt.
While at it do the same for all interrupt types. For timer
related interrupts (clock comparator and cpu timer) we even had
no delivery counters. Fix this as well. On this way some counters
are being renamed to have a similar name.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2018-03-14 19:21:18 +00:00
QingFeng Hao
32de074909 KVM: add machine check counter to kvm_stat
This counter can be used for administration, debug or test purposes.

Suggested-by: Vladislav Mironov <mironov@de.ibm.com>
Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-14 19:21:17 +00:00
Christian Borntraeger
a5e0acea9e KVM: s390: add exit io request stats and simplify code
We want to count IO exit requests in kvm_stat. At the same time
we can get rid of the handle_noop function.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-14 19:21:11 +00:00
Janosch Frank
c9f0a2b87f KVM: s390: Refactor host cmma and pfmfi interpretation controls
use_cmma in kvm_arch means that the KVM hypervisor is allowed to use
cmma, whereas use_cmma in the mm context means cmm has been used before.
Let's rename the context one to uses_cmm, as the vm does use
collaborative memory management but the host uses the cmm assist
(interpretation facility).

Also let's introduce use_pfmfi, so we can remove the pfmfi disablement
when we activate cmma and rather not activate it in the first place.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Message-Id: <1518779775-256056-2-git-send-email-frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-09 09:44:17 +00:00
Christian Borntraeger
c3b9e3e1ea KVM: s390: implement CPU model only facilities
Some facilities should only be provided to the guest, if they are
enabled by a CPU model. This allows us to avoid capabilities and
to simply fall back to the cpumodel for deciding about a facility
without enabling it for older QEMUs or QEMUs without a CPU
model.

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-09 09:44:17 +00:00
David Hildenbrand
f07afa0462 KVM: s390: fix memory overwrites when not using SCA entries
Even if we don't have extended SCA support, we can have more than 64 CPUs
if we don't enable any HW features that might use the SCA entries.

Now, this works just fine, but we missed a return, which is why we
would actually store the SCA entries. If we have more than 64 CPUs, this
means writing outside of the basic SCA - bad.

Let's fix this. This allows > 64 CPUs when running nested (under vSIE)
without random crashes.

Fixes: a6940674c3 ("KVM: s390: allow 255 VCPUs when sca entries aren't used")
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180306132758.21034-1-david@redhat.com>
Cc: stable@vger.kernel.org
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-06 13:33:02 +00:00
Christian Borntraeger
09a0fb6753 KVM: s390: provide io interrupt kvm_stat
We already count io interrupts, but we forgot to print them.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: d8346b7d9b ("KVM: s390: Support for I/O interrupts.")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-03-01 12:10:32 +00:00
David Hildenbrand
0e7def5fb0 KVM: s390: provide only a single function for setting the tod (fix SCK)
Right now, SET CLOCK called in the guest does not properly take care of
the epoch index, as the call goes via the old kvm_s390_set_tod_clock()
interface. So the epoch index is neither reset to 0, if required, nor
properly set to e.g. 0xff on negative values.

Fix this by providing a single kvm_s390_set_tod_clock() function. Move
Multiple-epoch facility handling into it.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-3-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-02-20 20:51:28 +00:00
David Hildenbrand
1575767ef3 KVM: s390: consider epoch index on TOD clock syncs
For now, we don't take care of over/underflows. Especially underflows
are critical:

Assume the epoch is currently 0 and we get a sync request for delta=1,
meaning the TOD is moved forward by 1 and we have to fix it up by
subtracting 1 from the epoch. Right now, this will leave the epoch
index untouched, resulting in epoch=-1, epoch_idx=0, which is wrong.

We have to take care of over and underflows, also for the VSIE case. So
let's factor out calculation into a separate function.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-5-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[use u8 for idx]
2018-02-20 20:51:21 +00:00
David Hildenbrand
d16b52cb9c KVM: s390: consider epoch index on hotplugged CPUs
We must copy both, the epoch and the epoch_idx.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180207114647.6220-4-david@redhat.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: 8fa1696ea7 ("KVM: s390: Multiple Epoch Facility support")
Cc: stable@vger.kernel.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-02-20 20:50:51 +00:00
Radim Krčmář
7bf14c28ee Merge branch 'x86/hyperv' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Topic branch for stable KVM clockource under Hyper-V.

Thanks to Christoffer Dall for resolving the ARM conflict.
2018-02-01 15:04:17 +01:00
Radim Krčmář
92ea2b3381 KVM: s390: Fixes and features for 4.16 part 2
- exitless interrupts for emulated devices (Michael Mueller)
 - cleanup of cpuflag handling (David Hildenbrand)
 - kvm stat counter improvements (Christian Borntraeger)
 - vsie improvements (David Hildenbrand)
 - mm cleanup (Janosch Frank)
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJaayoSAAoJEBF7vIC1phx8T8EP/1xGy6ZZBdVCAT00u3GWZ9eh
 M5m1thSUuhKuHdJHaN1ORrBPNhR5l+Bvf5VMi5LkWORUpQc4jOYz1BX1qrryvWXk
 Bt6363v1rhbInk7uKv4E0q9i3Ei6dSfoT0YByihqiPjkaeJyG830Ez2IUFFdGUQQ
 ulyRtKXJ8Sk9L3LhO9uLHFtU9CZ+2CpiEEM6q3TCNzduU7LC9NHdl/bx4uKqgz4h
 9l/i+P3CXFglBpFDL+JTD72myBbbm78bQAXDoJWSTm9EKolpUZaTP2xpCrrG+A7f
 RRPzJoYOtxEgDTnNcjH16OX2TGXpPL0Q2cTl4vJihaZW9KrTPjYHQY5BIf4EQzhU
 Kd2p1yN/aYQFsLA5cofpkC4HPBKeiocg0HAu6byo9N+uLTjnxoZUkekfq03UXVan
 xwENUYknFmuI0SLnz3f2fHqWKXsdrron+gzldtBUBtTd8EOvh7c7/TNE+IrpTgRo
 HvC4KddOp30filp8GBxXU79i2qFzTyvjGwxzuiMo29n0R0i8tzdCokywsvAgWHvX
 KC8ukcmwsGq1lJB32rZ5RkacBCDS/iSaD0xg7iseHoYYpUPRuWxzoTVV134LGh5Q
 IX42NGvS8mA2I5byd3R9DtDY5eQYSN7bRGsNWSKJCKRSX9zEHpIKTQXT8N/aXvxe
 PQoxcEau3AuzbQs9KjFj
 =YJ0E
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.16-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: Fixes and features for 4.16 part 2

- exitless interrupts for emulated devices (Michael Mueller)
- cleanup of cpuflag handling (David Hildenbrand)
- kvm stat counter improvements (Christian Borntraeger)
- vsie improvements (David Hildenbrand)
- mm cleanup (Janosch Frank)
2018-01-30 17:42:40 +01:00
Michael Mueller
4b9f952577 KVM: s390: introduce the format-1 GISA
The patch modifies the previously defined GISA data structure to be
able to store two GISA formats, format-0 and format-1. Additionally,
it verifies the availability of the GISA format facility and enables
the use of a format-1 GISA in the SIE control block accordingly.

A format-1 can do everything that format-0 can and we will need it
for real HW passthrough. As there are systems with only format-0
we keep both variants.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-26 14:13:58 +01:00
Michael Mueller
d7c5cb0105 KVM: s390: exploit GISA and AIV for emulated interrupts
The adapter interruption virtualization (AIV) facility is an
optional facility that comes with functionality expected to increase
the performance of adapter interrupt handling for both emulated and
passed-through adapter interrupts. With AIV, adapter interrupts can be
delivered to the guest without exiting SIE.

This patch provides some preparations for using AIV for emulated adapter
interrupts (including virtio) if it's available. When using AIV, the
interrupts are delivered at the so called GISA by setting the bit
corresponding to its Interruption Subclass (ISC) in the Interruption
Pending Mask (IPM) instead of inserting a node into the floating interrupt
list.

To keep the change reasonably small, the handling of this new state is
deferred in get_all_floating_irqs and handle_tpi. This patch concentrates
on the code handling enqueuement of emulated adapter interrupts, and their
delivery to the guest.

Note that care is still required for adapter interrupts using AIV,
because there is no guarantee that AIV is going to deliver the adapter
interrupts pending at the GISA (consider all vcpus idle). When delivering
GISA adapter interrupts by the host (usual mechanism) special attention
is required to honor interrupt priorities.

Empirical results show that the time window between making an interrupt
pending at the GISA and doing kvm_s390_deliver_pending_interrupts is
sufficient for a guest with at least moderate cpu activity to get adapter
interrupts delivered within the SIE, and potentially save some SIE exits
(if not other deliverable interrupts).

The code will be activated with a follow-up patch.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-26 11:11:39 +01:00
Michael Mueller
19114beb73 KVM: s390: define GISA format-0 data structure
In preperation to support pass-through adapter interrupts, the Guest
Interruption State Area (GISA) and the Adapter Interruption Virtualization
(AIV) features will be introduced here.

This patch introduces format-0 GISA (that is defines the struct describing
the GISA, allocates storage for it, and introduces fields for the
GISA address in kvm_s390_sie_block and kvm_s390_vsie).

As the GISA requires storage below 2GB, it is put in sie_page2, which is
already allocated in ZONE_DMA. In addition, The GISA requires alignment to
its integral boundary. This is already naturally aligned via the
padding in the sie_page2.

Signed-off-by: Michael Mueller <mimu@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-26 10:47:29 +01:00
David Hildenbrand
8d5fb0dc4e KVM: s390: introduce and use kvm_s390_test_cpuflags()
Use it just like kvm_s390_set_cpuflags() and kvm_s390_clear_cpuflags().

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-5-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24 17:46:42 +01:00
David Hildenbrand
9daecfc660 KVM: s390: introduce and use kvm_s390_clear_cpuflags()
Use it just like kvm_s390_set_cpuflags().

Suggested-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24 17:46:42 +01:00
David Hildenbrand
ef8f4f49fc KVM: s390: reuse kvm_s390_set_cpuflags()
Use it in all places where we set cpuflags.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180123170531.13687-3-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-24 17:46:41 +01:00
Christian Borntraeger
a37cb07a30 KVM: s390: add vcpu stat counters for many instruction
The overall instruction counter is larger than the sum of the
single counters. We should try to catch all instruction handlers
to make this match the summary counter.
Let us add sck,tb,sske,iske,rrbe,tb,tpi,tsch,lpsw,pswe....
and remove other unused ones.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-01-24 16:49:02 +01:00
Radim Krčmář
bda646dd18 KVM: s390: another fix for cmma migration
This fixes races and potential use after free in the
 cmma migration code.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJaaJimAAoJEBF7vIC1phx8fIsP/1ShgCjg1Bcgc9FI/Rgo8diG
 oycjpv7BXp2dcrgW2FRB1LFoRpMoWAwOy4YNoMls/9gqemupYYa97F3viP3oVH56
 wQaZjgRdpnvflmti+3S9Ca3aws7WUPuAONTld44TJZaVJXM8g52VtjPD5APMWFqL
 agawGopcmoxUX5qE12BAxCx1lx5/hToF5bcVULCmNogW9Cl9QOUw+8W3ezmaoJ0w
 37FVyK5Xv+JvF8gXJlhLwXkcFyaEfLNUsoJdOCgi89/aULmp1z3/EmOTos3vTQwT
 83lNxdjXuS+Ps4qGUnNwNTThoXhdC+rVbAZd/tbLi/8CtjvSeKqB0T8sMgvAC1Aa
 HXYgNbEtIxjdJYbWGbTy+8uCIHLgVMiYRnPymqa6j0Y8Tk07ji2g/r0kZNotykD6
 Hb4CQ41MKAXljpLEXV1L1a/LKrfQBhlTIR4VCPy9GBN+4AgV+KzjF0l1N0B9r/RD
 sqSsttkbGNyNkPpCT+EGQa1d82on7T9lm40Ag6AAaBaw5x6lPwLAmDCslst5c3Fx
 pGmoJk6fe64saQcs49zOiPiBYrlawW69IIVJaHWWOG3z+EdgpMjvtn1XWJpkyPZq
 N0MDPn5+VwPRPQ3qKpaCRuVaUNn2gGuZSpx2/DsaY8gF65BLIIRjLt/SpLGxw3wU
 AUY13gafv82jM6xlpc0p
 =aAYd
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-master-4.15-3' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: another fix for cmma migration

This fixes races and potential use after free in the
cmma migration code.
2018-01-24 16:25:53 +01:00
Christian Borntraeger
866c138c32 KVM: s390: diagnoses are instructions as well
Make the diagnose counters also appear as instruction counters.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2018-01-24 16:19:22 +01:00
Christian Borntraeger
1de1ea7efe KVM: s390: add proper locking for CMMA migration bitmap
Some parts of the cmma migration bitmap is already protected
with the kvm->lock (e.g. the migration start). On the other
hand the read of the cmma bits is not protected against a
concurrent free, neither is the emulation of the ESSA instruction.
Let's extend the locking to all related ioctls by using
the slots lock for
- kvm_s390_vm_start_migration
- kvm_s390_vm_stop_migration
- kvm_s390_set_cmma_bits
- kvm_s390_get_cmma_bits

In addition to that, we use synchronize_srcu before freeing
the migration structure as all users hold kvm->srcu for read.
(e.g. the ESSA handler).

Reported-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: stable@vger.kernel.org # 4.13+
Fixes: 190df4a212 (KVM: s390: CMMA tracking, ESSA emulation, migration mode)
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
2018-01-24 15:22:51 +01:00
Christian Borntraeger
35b3fde620 KVM: s390: wire up bpb feature
The new firmware interfaces for branch prediction behaviour changes
are transparently available for the guest. Nevertheless, there is
new state attached that should be migrated and properly resetted.
Provide a mechanism for handling reset, migration and VSIE.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
[Changed capability number to 152. - Radim]
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
2018-01-20 17:30:47 +01:00
Radim Krčmář
7cd918047a KVM: s390: Fixes and features for 4.16
- add the virtio-ccw transport for kvmconfig
 - more debug tracing for cpu model
 - cleanups and fixes
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJaXhcpAAoJEBF7vIC1phx8YdwP/1FYC24FZVqKZ3NO4ItSh7xc
 QdithL2dqfeudmwc/nU6AilMbvgTdR6QmWOICh7fc2HklrIxqkFcjZeHDe2mp5NB
 aI1WVtt3EpqZWsimXUkWYUY0Az3DF36Yc/vYw7ubUvPzb5aN9c7G666ADfUwgIjP
 IgFgqyEKeT7uP5KVF5Ysz/WaYSGY1BsbwfNfWWjWYQgcj77cA4FkBrM4Krq7GYsO
 sGI/IeI9RjtNyExLljpV/eg1rfO6iV+9k8QR4DOYccHooG3tZNhRTbOWTIbvDQir
 ryoDeAe2ndDa6BpWDPWRjsricq53+hXuDhx344hro15Uiv949cNMS5d6UFsAnuHR
 JYoX/TLmqaETTEC2krn0OgviEU7RcEUAaiEbdegHRTgCNVsYnxoqO91OMudaiyml
 zyzUKQYt73t2rBsciRPi3p+nSe6i56uE2yvAi1HtKSM5JMJweVp0VYsQB/0MTFnz
 8VIrQjWhj/GEbUufHwWTTwPvEy1Aj9yr4xM6Jxe+C0hnFnB9n2BQQr89QWLkLt2L
 0YGviq17Xbk3dgvhp28wY6kPTYipY3VJy2MiyH5DZDY9+5MsMo2VY/y6GyXEe4HZ
 ycGyRdvyyNxwiAOI7NVHQYufiVjcdX4kV9uKC6VcfB2tcJF16l3s3u60EE324+t5
 lf1LrFVP0xgBrKfAA8SV
 =Cc57
 -----END PGP SIGNATURE-----

Merge tag 'kvm-s390-next-4.16-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux

KVM: s390: Fixes and features for 4.16

- add the virtio-ccw transport for kvmconfig
- more debug tracing for cpu model
- cleanups and fixes
2018-01-16 16:41:27 +01:00
David Hildenbrand
a9f6c9a92f KVM: s390: cleanup struct kvm_s390_float_interrupt
"wq" is not used at all. "cpuflags" can be access directly via the vcpu,
just as "float_int" via vcpu->kvm.
While at it, reuse _set_cpuflag() to make the code look nicer.

Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180108193747.10818-1-david@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2018-01-16 16:15:18 +01:00
Christian Borntraeger
2f8311c912 KVM: s390: add debug tracing for cpu features of CPU model
The cpu model already traces the cpu facilities, the ibc and
guest CPU ids. We should do the same for the cpu features (on
success only).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
2018-01-16 16:15:17 +01:00