In XIVE Gen 2 there were some minor changes to the TIMA header that were
updated when printed.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Michael Kowal <kowal@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
It will help us model the END triggers on the PowerNV machine, which
can be rerouted to another interrupt controller.
Reviewed-by: Frederic Barrat <fbarrat@linux.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Add the CPU target in the trace when reading/writing the TIMA
space. It was already done for other TIMA ops (notify, accept, ...),
only missing for those 2. Useful for debug and even more now that we
experiment with SMT.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230705110039.231148-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
We currently only allow 64-bit operations on the ESB CI pages. There's
no real reason for that limitation, skiboot/linux didn't need
more. However the hardware supports any size, so this patch relaxes
that restriction. It impacts both the ESB pages for "normal"
interrupts as well as the ESB pages for escalation interrupts defined
for the ENDs.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-ID: <20230704144848.164287-1-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The PQ state of a xive interrupt is always initialized to Q=1, which
means the interrupt is disabled. Since a xive source can be embedded
in many objects, this patch adds a property to allow that behavior to
be refined if needed.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20230703081215.55252-2-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Accessing the TIMA from some specific ring/offset combination can
trigger a special operation, with or without side effects. It is
implemented in qemu with an array of special operations to compare
accesses against. Since the presenter on P10 is pretty similar to P9,
we had the full array defined for P9 and we just had a special case
for P10 to treat one access differently. With a recent change,
6f2cbd133d ("pnv/xive2: Handle TIMA access through all ports"), we
now ignore some of the bits of the TIMA address, but that patch
managed to botch the detection of the special case for P10.
To clean that up, this patch introduces a full array of special ops to
be used for P10. The code to detect a special access is common with
P9, only the array of operations differs. The presenter can pick the
correct array of special ops based on its configuration introduced in
a previous patch.
Fixes: Coverity CID 1512997, 1512998
Fixes: 6f2cbd133d ("pnv/xive2: Handle TIMA access through all ports")
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The presenters for xive on P9 and P10 are mostly similar but the
behavior can be tuned through a few CQ registers. This patch adds a
"get_config" method, which will allow to access that config from the
presenter in a later patch.
For now, just define the config for the TIMA version.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The Thread Interrupt Management Area (TIMA) can be accessed through 4
ports, targeted by the address. The base address of a TIMA
is using port 0 and the other ports are 0x80 apart. Using one port or
another can be useful to balance the load on the snoop buses. With
skiboot and linux, we currently use port 0, but as it tends to be
busy, another hypervisor is using port 1 for TIMA access.
The port address bits fall in between the special op indication
bits (the 2 MSBs) and the register offset bits (the 6 LSBs). They are
"don't care" for the hardware when processing a TIMA operation. This
patch filters out those port address bits so that a TIMA operation can
be triggered using any port.
It is also true for indirect access (through the IC BAR) and it's
actually nothing new, it was already the case on P9. Which helps here,
as the TIMA handling code is common between P9 (xive) and P10 (xive2).
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-6-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
TIMA addresses are somewhat special and are split in several bit
fields with different meanings. This patch describes it and introduce
macros to more easily access the various fields.
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20230601121331.487207-5-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
This replaces the IRQ array 'irq_inputs' with GPIO lines, the goal
being to remove 'irq_inputs' when all CPUs have been converted.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220705145814.461723-2-clg@kaod.org>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
When pulling or pushing an OS context from/to a CPU, we should
re-evaluate the state of the External interrupt signal. Otherwise, we
can end up catching the External interrupt exception in hypervisor
mode, which is unexpected.
The problem is best illustrated with the following scenario:
1. an External interrupt is raised while the guest is on the CPU.
2. before the guest can ack the External interrupt, an hypervisor
interrupt is raised, for example the Hypervisor Decrementer or
Hypervisor Virtualization interrupt. The hypervisor interrupt forces
the guest to exit while the External interrupt is still pending.
3. the hypervisor handles the hypervisor interrupt. At this point, the
External interrupt is still pending. So it's very likely to be
delivered while the hypervisor is running. That's unexpected and can
result in an infinite loop where the hypervisor catches the External
interrupt, looks for an interrupt in its hypervisor queue, doesn't
find any, exits the interrupt handler with the External interrupt
still raised, repeat...
The fix is simply to always lower the External interrupt signal when
pulling an OS context. It means it needs to be raised again when
re-pushing the OS context. Fortunately, it's already the case, as we
now always call xive_tctx_ipb_update(), which will raise the signal if
needed.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20220429071620.177142-3-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The Post Interrupt Priority Register (PIPR) is not restored like the
other OS-context related fields of the TIMA when pushing an OS context
on the CPU. It's not needed because it can be calculated from the
Interrupt Pending Buffer (IPB), which is saved and restored. The PIPR
must therefore always be recomputed when pushing an OS context.
This patch fixes a path on P9 and P10 where it was not done. If there
was a pending interrupt when the OS context was pulled, the IPB was
saved correctly. When pushing back the context, the code in
xive_tctx_need_resend() was checking for a interrupt raised while the
context was not on the CPU, saved in the NVT. If one was found, then
it was merged with the saved IPB and the PIPR updated and everything
was fine. However, if there was no interrupt found in the NVT, then
xive_tctx_ipb_update() was not being called and the PIPR was not
updated. This patch fixes it by always calling xive_tctx_ipb_update().
Note that on P10 (xive2.c) and because of the above, there's no longer
any need to check the CPPR value so it can go away.
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
Message-Id: <20220429071620.177142-2-fbarrat@linux.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
The PQ_disable configuration bit disables the check done on the PQ
state bits when processing new MSI interrupts. When bit 9 is enabled,
the PHB forwards any MSI trigger to the XIVE interrupt controller
without checking the PQ state bits. The XIVE IC knows from the trigger
message that the PQ bits have not been checked and performs the check
locally.
This configuration bit only applies to MSIs and LSIs are still checked
on the PHB to handle the assertion level.
PQ_disable enablement is a requirement for StoreEOI.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
The trigger message coming from a HW source contains a special bit
informing the XIVE interrupt controller that the PQ bits have been
checked at the source or not. Depending on the value, the IC can
perform the check and the state transition locally using its own PQ
state bits.
The following changes add new accessors to the XiveRouter required to
query and update the PQ state bits. This only applies to the PowerNV
machine. sPAPR accessors are provided but the pSeries machine should
not be concerned by such complex configuration for the moment.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
This is an internal offset used to inject triggers when the PQ state
bits are not controlled locally. Such as for LSIs when the PHB5 are
using the Address-Based Interrupt Trigger mode and on the END.
Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
and use them to set and test the ASSERTED bit of LSI sources.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20211004212141.432954-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It's generic enough to be used from the XIVE2 router and avoid more
duplication.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210809134547.689560-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These will be shared with the XIVE2 router.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210809134547.689560-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ENDs allocated by OPAL for the HW thread VPs are tagged as owned by FW.
Dump the state in 'info pic'.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20210126171059.307867-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
I have been keeping those logging messages in an ugly form for
while. Make them clean !
Beware not to activate all of them, this is really verbose.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20201123163717.1368450-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_cpu_connect() returns a negative errno on failure,
use that and get rid of the local_err boilerplate.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707852234.1489912.16410314514265848075.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that kvmppc_xive_cpu_get_state() and kvmppc_xive_cpu_set_state()
return negative errnos on failures, use that instead local_err because
it is the recommended practice. Also return that instead of -1 since
vmstate expects negative errnos.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159707850840.1489912.14912810818646455474.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Calls to the KVM XIVE device are guarded by kvm_irqchip_in_kernel(). This
ensures that QEMU won't try to use the device if KVM is disabled or if
an in-kernel irqchip isn't required.
When using ic-mode=dual with the pseries machine, we have two possible
interrupt controllers: XIVE and XICS. The kvm_irqchip_in_kernel() helper
will return true as soon as any of the KVM device is created. It might
lure QEMU to think that the other one is also around, while it is not.
This is exactly what happens with ic-mode=dual at machine init when
claiming IRQ numbers, which must be done on all possible IRQ backends,
eg. RTAS event sources or the PHB0 LSI table : only the KVM XICS device
is active but we end up calling kvmppc_xive_source_reset_one() anyway,
which fails. This doesn't cause any trouble because of another bug :
kvmppc_xive_source_reset_one() lacks an error_setg() and callers don't
see the failure.
Most of the other kvmppc_xive_* functions have similar xive->fd
checks to filter out the case when KVM XIVE isn't active. It
might look safer to have idempotent functions but it doesn't
really help to understand what's going on when debugging.
Since we already have all the kvm_irqchip_in_kernel() in place,
also have the callers to check xive->fd as well before calling
KVM XIVE specific code. This is straight-forward for the spapr
specific XIVE code. Some more care is needed for the platform
agnostic XIVE code since it cannot access xive->fd directly.
Introduce new in_kernel() methods in some base XIVE classes
for this purpose and implement them only in spapr.
In all cases, we still need to call kvm_irqchip_in_kernel() so that
compilers can optimize the kvmppc_xive_* calls away when CONFIG_KVM
isn't defined, thus avoiding the need for stubs.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679993438.876294.7285654331498605426.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Depending on whether XIVE is emultated or backed with a KVM XIVE device,
the ESB MMIOs of a XIVE source point to an I/O memory region or a mapped
memory region.
This is currently handled by checking kvm_irqchip_in_kernel() returns
false in xive_source_realize(). This is a bit awkward as we usually
need to do extra things when we're using the in-kernel backend, not
less. But most important, we can do better: turn the existing "xive.esb"
memory region into a plain container, introduce an "xive.esb-emulated"
I/O subregion and rename the existing "xive.esb" subregion in the KVM
code to "xive.esb-kvm". Since "xive.esb-kvm" is added with overlap
and a higher priority, it prevails over "xive.esb-emulated" (ie.
a guest using KVM XIVE will interact with "xive.esb-kvm" instead of
the default "xive.esb-emulated" region.
While here, consolidate the computation of the MMIO region size in
a common helper.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <159679992680.876294.7520540158586170894.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Fix some typos in comments about code modeling coalescing points in the
XIVE routing engine (IVRE).
Signed-off-by: Gustavo Romero <gromero@linux.ibm.com>
Message-Id: <1595461434-27725-1-git-send-email-gromero@linux.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When all we do with an Error we receive into a local variable is
propagating to somewhere else, we can just as well receive it there
right away. The previous two commits did that for sufficiently simple
cases with Coccinelle. Do it for several more manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20200707160613.848843-37-armbru@redhat.com>
The object_property_set_FOO() setters take property name and value in
an unusual order:
void object_property_set_FOO(Object *obj, FOO_TYPE value,
const char *name, Error **errp)
Having to pass value before name feels grating. Swap them.
Same for object_property_set(), object_property_get(), and
object_property_parse().
Convert callers with this Coccinelle script:
@@
identifier fun = {
object_property_get, object_property_parse, object_property_set_str,
object_property_set_link, object_property_set_bool,
object_property_set_int, object_property_set_uint, object_property_set,
object_property_set_qobject
};
expression obj, v, name, errp;
@@
- fun(obj, v, name, errp)
+ fun(obj, name, v, errp)
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Convert that one manually.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Convert manually.
Fails to convert hw/rx/rx-gdbsim.c, because Coccinelle gets confused
by RXCPU being used both as typedef and function-like macro there.
Convert manually. The other files using RXCPU that way don't need
conversion.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20200707160613.848843-27-armbru@redhat.com>
[Straightforwad conflict with commit 2336172d9b "audio: set default
value for pcspk.iobase property" resolved]
Convert
foo(..., &err);
if (err) {
...
}
to
if (!foo(..., &err)) {
...
}
for qdev_realize(), qdev_realize_and_unref(), qbus_realize() and their
wrappers isa_realize_and_unref(), pci_realize_and_unref(),
sysbus_realize(), sysbus_realize_and_unref(), usb_realize_and_unref().
Coccinelle script:
@@
identifier fun = {
isa_realize_and_unref, pci_realize_and_unref, qbus_realize,
qdev_realize, qdev_realize_and_unref, sysbus_realize,
sysbus_realize_and_unref, usb_realize_and_unref
};
expression list args, args2;
typedef Error;
Error *err;
@@
- fun(args, &err, args2);
- if (err)
+ if (!fun(args, &err, args2))
{
...
}
Chokes on hw/arm/musicpal.c's lcd_refresh() with the unhelpful error
message "no position information". Nothing to convert there; skipped.
Fails to convert hw/arm/armsse.c, because Coccinelle gets confused by
ARMSSE being used both as typedef and function-like macro there.
Converted manually.
A few line breaks tidied up manually.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20200707160613.848843-5-armbru@redhat.com>
All remaining conversions to qdev_realize() are for bus-less devices.
Coccinelle script:
// only correct for bus-less @dev!
@@
expression errp;
expression dev;
@@
- qdev_init_nofail(dev);
+ qdev_realize(dev, NULL, &error_fatal);
@ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
expression errp;
expression dev;
symbol true;
@@
- object_property_set_bool(OBJECT(dev), true, "realized", errp);
+ qdev_realize(DEVICE(dev), NULL, errp);
@ depends on !(file in "hw/core/qdev.c") && !(file in "hw/core/bus.c")@
expression errp;
expression dev;
symbol true;
@@
- object_property_set_bool(dev, true, "realized", errp);
+ qdev_realize(DEVICE(dev), NULL, errp);
Note that Coccinelle chokes on ARMSSE typedef vs. macro in
hw/arm/armsse.c. Worked around by temporarily renaming the macro for
the spatch run.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200610053247.1583243-57-armbru@redhat.com>
The only way object_property_add() can fail is when a property with
the same name already exists. Since our property names are all
hardcoded, failure is a programming error, and the appropriate way to
handle it is passing &error_abort.
Same for its variants, except for object_property_add_child(), which
additionally fails when the child already has a parent. Parentage is
also under program control, so this is a programming error, too.
We have a bit over 500 callers. Almost half of them pass
&error_abort, slightly fewer ignore errors, one test case handles
errors, and the remaining few callers pass them to their own callers.
The previous few commits demonstrated once again that ignoring
programming errors is a bad idea.
Of the few ones that pass on errors, several violate the Error API.
The Error ** argument must be NULL, &error_abort, &error_fatal, or a
pointer to a variable containing NULL. Passing an argument of the
latter kind twice without clearing it in between is wrong: if the
first call sets an error, it no longer points to NULL for the second
call. ich9_pm_add_properties(), sparc32_ledma_realize(),
sparc32_dma_realize(), xilinx_axidma_realize(), xilinx_enet_realize()
are wrong that way.
When the one appropriate choice of argument is &error_abort, letting
users pick the argument is a bad idea.
Drop parameter @errp and assert the preconditions instead.
There's one exception to "duplicate property name is a programming
error": the way object_property_add() implements the magic (and
undocumented) "automatic arrayification". Don't drop @errp there.
Instead, rename object_property_add() to object_property_try_add(),
and add the obvious wrapper object_property_add().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20200505152926.18877-15-armbru@redhat.com>
[Two semantic rebase conflicts resolved]
This will be used in subsequent patches to access the XIVE associated to
a TCTX without reaching out to the machine through qdev_get_machine().
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[ groug: - split patch
- write subject and changelog ]
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200106145645.4539-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the spapr and pnv machines do set the "xive-fabric" link, the
use of the XIVE fabric pointer becomes mandatory. This is checked with
an assert() in a new realize hook. Since the XIVE router is realized at
machine init for the all the machine's life time, no risk to abort an
already running guest (ie. not a hotplug path).
This gets rid of a qdev_get_machine() call.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200106145645.4539-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In order to get rid of qdev_get_machine(), first add a pointer to the
XIVE fabric under the XIVE router and make it configurable through a
QOM link property.
Configure it in the spapr and pnv machine. In the case of pnv, the XIVE
routers are under the chip, so this is done with a QOM alias property of
the POWER9 pnv chip.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20200106145645.4539-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When doing CAM line compares, fetch the block id from the interrupt
controller which can have set the PC_TCTXT_CHIPID field.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-20-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a vCPU is dispatched on a HW thread, its context is pushed in the
thread registers and it is activated by setting the VO bit in the CAM
line word2. The HW grabs the associated NVT, pulls the IPB bits and
merges them with the IPB of the new context. If interrupts were missed
while the vCPU was not dispatched, these are synthesized in this
sequence.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-18-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We will use it to resend missed interrupts when a vCPU context is
pushed on a HW thread.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-17-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It is now unused.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-16-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On the P9 Processor, the thread interrupt context registers of a CPU
can be accessed "directly" when by load/store from the CPU or
"indirectly" by the IC through an indirect TIMA page. This requires to
configure first the PC_TCTXT_INDIRx registers.
Today, we rely on the get_tctx() handler to deduce from the CPU PIR
the chip from which the TIMA access is being done. By handling the
TIMA memory ops under the interrupt controller model of each machine,
we can uniformize the TIMA direct and indirect ops under PowerNV. We
can also check that the CPUs have been enabled in the XIVE controller.
This prepares ground for the future versions of XIVE.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The TIMA operations are performed on behalf of the XIVE IVPE sub-engine
(Presenter) on the thread interrupt context registers. The current
operations supported by the model are simple and do not require access
to the controller but more complex operations will need access to the
controller NVT table and to its configuration.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-13-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the machines have handlers implementing the XiveFabric and
XivePresenter interfaces, remove xive_presenter_match() and make use
of the 'match_nvt' handler of the machine.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-12-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XiveFabric QOM interface acts as the PowerBUS interface between
the interrupt controller and the system and should be implemented by
the QEMU machine. On HW, the XIVE sub-engine is responsible for the
communication with the other chip is the Common Queue (CQ) bridge
unit.
This interface offers a 'match_nvt' handler to perform the CAM line
matching when looking for a XIVE Presenter with a dispatched NVT.
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Each XIVE Router model, sPAPR and PowerNV, now implements the 'match_nvt'
handler of the XivePresenter QOM interface. This is simply moving code
and taking into account the new API.
To be noted that the xive_router_get_tctx() helper is not used anymore
when doing CAM matching and will be removed later on after other changes.
The XIVE presenter model is still too simple for the PowerNV machine
and the CAM matching algo is not correct on multichip system. Subsequent
patches will introduce more changes to scan all chips of the system.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When the XIVE IVRE sub-engine (XiveRouter) looks for a Notification
Virtual Target (NVT) to notify, it broadcasts a message on the
PowerBUS to find an XIVE IVPE sub-engine (Presenter) with the NVT
dispatched on one of its HW threads, and then forwards the
notification if any response was received.
The current XIVE presenter model is sufficient for the pseries machine
because it has a single interrupt controller device, but the PowerNV
machine can have multiple chips each having its own interrupt
controller. In this case, the XIVE presenter model is too simple and
the CAM line matching should scan all chips of the system.
To start fixing this issue, we first extend the XIVE Router model with
a new XivePresenter QOM interface representing the XIVE IVPE
sub-engine. This interface exposes a 'match_nvt' handler which the
sPAPR and PowerNV XIVE Router models will need to implement to perform
the CAM line matching.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191125065820.927-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A context should be 'valid' when pulled from the thread interrupt
context registers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191115162436.30548-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OS CAM line has a special encoding exploited by the HW. Provide
helper routines to hide the details to the TIMA command handlers. This
also clarifies the endianness of different variables : 'qw1w2' is
big-endian and 'cam' is native.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191115162436.30548-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When an interrupt can not be presented to a vCPU, because it is not
running on any of the HW treads, the XIVE presenter updates the
Interrupt Pending Buffer register of the associated XIVE NVT
structure. This is only done if backlog is activated in the END but
this is generally the case.
The current code assumes that the fields of the NVT structure is
architected with the same layout of the thread interrupt context
registers. Fix this assumption and define an offset for the IPB
register backup value in the NVT.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191115162436.30548-2-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The END source object has both a pointer and a "xive" property pointing to
the router object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
The property isn't optional : not being able to set the link is a bug
and QEMU should rather abort than exit in this case.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383333784.165747.5298512574054268786.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The source object has both a pointer and a "xive" property pointing to the
notifier object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
The property isn't optional : not being able to set the link is a bug
and QEMU should rather abort than exit in this case.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383333227.165747.12901571295951957951.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The TCTX object has both a pointer and a "cpu" property pointing to the
vCPU object. Confusing bugs could arise if these ever go out of sync.
Change the property definition so that it explicitely sets the pointer.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157383332669.165747.2484056603605646820.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
CPU_FOREACH() can race with vCPU hotplug/unplug on sPAPR machines, ie.
we may try to print out info about a vCPU with a NULL presenter pointer.
Check that in order to prevent QEMU from crashing.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192725327.3146912.12047076483178652551.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
When a VCPU gets connected to the XIVE interrupt controller, we add a
const link targetting the CPU object to the TCTX object. Similar links
are added to the ICP object when using the XICS interrupt controller.
As explained in <qom/object.h>:
* The caller must ensure that @target stays alive as long as
* this property exists. In the case @target is a child of @obj,
* this will be the case. Otherwise, the caller is responsible for
* taking a reference.
We're in the latter case for both XICS and XIVE. Add the missing
calls to object_ref() and object_unref().
This doesn't fix any known issue because the life cycle of the TCTX or
ICP happens to be shorter than the one of the CPU or XICS fabric, but
better safe than sorry.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <157192724770.3146912.15400869269097231255.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
SpaprInterruptControllerClass and PnvChipClass have an intc_create() method
that calls the appropriate routine, ie. icp_create() or xive_tctx_create(),
to establish the link between the VCPU and the presenter component of the
interrupt controller during realize.
There aren't any symmetrical call to be called when the VCPU gets unrealized
though. It is assumed that object_unparent() is the only thing to do.
This is questionable because the parenting logic around the CPU and
presenter objects is really an implementation detail of the interrupt
controller. It shouldn't be open-coded in the machine code.
Fix this by adding an intc_destroy() method that undoes what was done in
intc_create(). Also NULLify the presenter pointers to avoid having
stale pointers around. This will allow to reliably check if a vCPU has
a valid presenter.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157192724208.3146912.7254684777515287626.stgit@bahia.lan>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
On the sPAPR machine and PowerNV machine, the interrupt presenters are
created by a machine handler at the core level and are reset
independently. This is not consistent and it raises issues when it
comes to handle hot-plugged CPUs. In that case, the presenters are not
reset. This is less of an issue in XICS, although a zero MFFR could
be a concern, but in XIVE, the OS CAM line is not set and this breaks
the presenting algorithm. The current code has workarounds which need
a global cleanup.
Extend the sPAPR IRQ backend and the PowerNV Chip class with a new
cpu_intc_reset() handler called by the CPU reset handler and remove
the XiveTCTX reset handler which is now redundant.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191022163812.330-6-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The trigger data is used for both triggers of a HW source interrupts,
PHB, PSI, and triggers for rerouting interrupts between interrupt
controllers.
When an interrupt is rerouted, the trigger data follows an "END
trigger" format. In that case, the remote IC needs EAS containing an
END index to perform a lookup of an END.
An END trigger, bit0 of word0 set to '1', is defined as :
|0123|4567|0123|4567|0123|4567|0123|4567|
W0 E=1 |1P--|BLOC| END IDX |
W1 E=1 |M | END DATA |
An EAS is defined as :
|0123|4567|0123|4567|0123|4567|0123|4567|
W0 |V---|BLOC| END IDX |
W1 |M | END DATA |
The END trigger adds an extra 'PQ' bit, bit1 of word0 set to '1',
signaling that the PQ bits have been checked. That bit is unused in
the initial EAS definition.
When a HW device performs the trigger, the trigger data follows an
"EAS trigger" format because the trigger data in that case contains an
EAS index which the IC needs to look for.
An EAS trigger, bit0 of word0 set to '0', is defined as :
|0123|4567|0123|4567|0123|4567|0123|4567|
W0 E=0 |0P--|---- ---- ---- ---- ---- ---- ----|
W1 E=0 |BLOC| EAS INDEX |
There is also a 'PQ' bit, bit1 of word0 to '1', signaling that the
PQ bits have been checked.
Introduce these new trigger bits and rename the XIVE_SRCNO macros in
XIVE_EAS to reflect better the nature of the data.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191007084102.29776-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some device types of the XIVE model are exposed to the QEMU command
line:
$ ppc64-softmmu/qemu-system-ppc64 -device help | grep xive
name "xive-end-source", desc "XIVE END Source"
name "xive-source", desc "XIVE Interrupt Source"
name "xive-tctx", desc "XIVE Interrupt Thread Context"
These are internal devices that shouldn't be instantiable by the
user. By the way, they can't be because their respective realize
functions expect link properties that can't be set from the command
line:
qemu-system-ppc64: -device xive-source: required link 'xive' not found:
Property '.xive' not found
qemu-system-ppc64: -device xive-end-source: required link 'xive' not found:
Property '.xive' not found
qemu-system-ppc64: -device xive-tctx: required link 'cpu' not found:
Property '.cpu' not found
Hide them by setting dc->user_creatable to false in their respective
class init functions.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <157017473006.331610.2983143972519884544.stgit@bahia.lan>
Message-Id: <157045578401.865784.6058183726552779559.stgit@bahia.lan>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Folded comment update into base patch]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When vCPUs are hotplugged, they are added to the QEMU CPU list before
being fully realized. This can crash the XIVE presenter because the
'tctx' pointer is not necessarily initialized when looking for a
matching target.
These vCPUs are not valid targets for the presenter. Skip them.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20191001085722.32755-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Provide a better output of the XIVE END structures including the
escalation information and extend the PowerNV machine 'info pic'
command with a dump of the END EAS table used for escalations.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-9-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When the 's' bit is set the escalation is said to be 'silent' or
'silent/gather'. In such configuration, the notification sequence is
skipped and only the escalation sequence is performed. This is used to
configure all the EQs of a vCPU to escalate on a single EQ which will
then target the hypervisor.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-8-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When the 'u' bit is set the escalation is said to be 'unconditional'
which means that the ESe PQ bits are not used. Introduce a
xive_router_end_es_notify() routine to share code with the ESn
notification.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the XIVE presenter can not find the NVT dispatched on any of the HW
threads, it can not deliver the interrupt. XIVE offers an escalation
mechanism to handle such scenarios and inform the hypervisor that an
action should be taken.
Escalation is configured by setting the 'e' bit and the EAS in word 4
& 5 to let the HW look for the escalation END on which to trigger a
new event.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-6-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If backlog is activated ('b' bit) on the END, the pending priority of
a missed event is recorded in the IPB field of the NVT for a later
resend.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a vCPU is not dispatched anymore on a HW thread, the Hypervisor
(KVM on Linux) invalidates the OS interrupt context of a vCPU with
this special command. It returns the OS CAM line value and resets the
VO bit.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190718115420.19919-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In my "build everything" tree, changing migration/vmstate.h triggers a
recompile of some 2700 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get VMStateDescription. The previous commit made
that unnecessary.
Include migration/vmstate.h only where it's still needed. Touching it
now recompiles only some 1600 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Message-Id: <20190812052359.30071-16-armbru@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
In my "build everything" tree, changing hw/irq.h triggers a recompile
of some 5400 out of 6600 objects (not counting tests and objects that
don't depend on qemu/osdep.h).
hw/hw.h supposedly includes it for convenience. Several other headers
include it just to get qemu_irq and.or qemu_irq_handler.
Move the qemu_irq and qemu_irq_handler typedefs from hw/irq.h to
qemu/typedefs.h, and then include hw/irq.h only where it's still
needed. Touching it now recompiles only some 500 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-13-armbru@redhat.com>
In my "build everything" tree, changing sysemu/reset.h triggers a
recompile of some 2600 out of 6600 objects (not counting tests and
objects that don't depend on qemu/osdep.h).
The main culprit is hw/hw.h, which supposedly includes it for
convenience.
Include sysemu/reset.h only where it's needed. Touching it now
recompiles less than 200 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190812052359.30071-9-armbru@redhat.com>
The migration sequence of a guest using the XIVE exploitation mode
relies on the fact that the states of all devices are restored before
the machine is. This is not true for hot-plug devices such as CPUs
which state come after the machine. This breaks migration because the
thread interrupt context registers are not correctly set.
Fix migration of hotplugged CPUs by restoring their context in the
'post_load' handler of the XiveTCTX model.
Fixes: 277dd3d771 ("spapr/xive: add migration support for KVM")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190813064853.29310-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is reseted, the hypervisor (Linux or OPAL) invalidates the
POOL interrupt context of a CPU with this special command. It returns
the POOL CAM line value and resets the VP bit.
Fixes: 4836b45510 ("ppc/xive: activate HV support")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When the hypervisor (KVM) dispatches a vCPU on a HW thread, it restores
its thread interrupt context. The Pending Interrupt Priority Register
(PIPR) is computed from the Interrupt Pending Buffer (IPB) and stores
should not be allowed to change its value.
Fixes: 207d9fe985 ("ppc/xive: introduce the XIVE interrupt thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When an interrupt needs to be delivered, the XIVE interrupt controller
presenter scans the CAM lines of the thread interrupt contexts of the
HW threads of the chip to find a matching vCPU. The interrupt context
is composed of 4 different sets of registers: Physical, HV, OS and
User.
The encoding of the Physical CAM line depends on the mode in which the
interrupt controller is operating: CAM mode or block group mode.
Block group mode being the default configuration today on POWER9 and
the only one available on the next POWER10 generation, enforce this
encoding in the Physical CAM line :
chip << 19 | 0000000 0 0001 thread (7Bit)
It fits the overall encoding of the NVT ids and simplifies the matching
algorithm in the presenter.
Fixes: d514c48d41 ("ppc/xive: hardwire the Physical CAM line of the thread context")
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190630204601.30574-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Next pull request against qemu-4.1. The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC). Other than that
there's just a handful of fixes and small enhancements.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEEdfRlhq5hpmzETofcbDjKyiDZs5IFAl0AkgwACgkQbDjKyiDZ
s5Jyug//cwxP+t1t2CNHtffKwiXFzuEKx9YSNE1V0wog6aB40EbPKU72FzCq6FfA
lev+pZWV9AwVMzFYe4VM/7Lqh7WFMYDT3DOXaZwfANs4471vYtgvPi21L2TBj80d
hMszlyLWMLY9ByOzCxIq3xnbivGpA94G2q9rKbwXdK4T/5i62Pe3SIfgG+gXiiwW
+YlHWCPX0I1cJz2bBs9ElXdl7ONWnn+7uDf7gNfWkTKuiUq6Ps7mxzy3GhJ1T7nz
OFKmQ5dKzLJsgOULSSun8kWpXBmnPffkM3+fCE07edrWZVor09fMCk4HvtfaRy2K
FFa2Kvzn/V/70TL+44dsSX4QcwdcHQztiaMO7UGPq9CMswx5L7gsNmfX6zvK1Nrb
1t7ORZKNJ72hMyvDPSMiGU2DpVjO3ZbBlSL4/xG8Qeal4An0kgkN5NcFlB/XEfnz
dsKu9XzuGSeD1bWz1Mgcf1x7lPDBoHIKLcX6notZ8epP/otu4ywNFvAkPu4fk8s0
4jQGajIT7328SmzpjXClsmiEskpKsEr7hQjPRhu0hFGrhVc+i9PjkmbDl0TYRAf6
N6k6gJQAi+StJde2rcua1iS7Ra+Tka6QRKy+EctLqfqOKPb2VmkZ6fswQ3nfRRlT
LgcTHt2iJcLeud2klVXs1e4pKXzXchkVyFL4ucvmyYG5VeimMzU=
=ERgu
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.1-20190612' into staging
ppc patch queue 2019-06-12
Next pull request against qemu-4.1. The big thing here is adding
support for hot plug of P2P bridges, and PCI devices under P2P bridges
on the "pseries" machine (which doesn't use SHPC). Other than that
there's just a handful of fixes and small enhancements.
# gpg: Signature made Wed 12 Jun 2019 06:47:56 BST
# gpg: using RSA key 75F46586AE61A66CC44E87DC6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" [full]
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" [full]
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" [full]
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>" [unknown]
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-4.1-20190612:
ppc/xive: Make XIVE generate the proper interrupt types
ppc/pnv: activate the "dumpdtb" option on the powernv machine
target/ppc: Use tcg_gen_gvec_bitsel
spapr: Allow hot plug/unplug of PCI bridges and devices under PCI bridges
spapr: Direct all PCI hotplug to host bridge, rather than P2P bridge
spapr: Don't use bus number for building DRC ids
spapr: Clean up DRC index construction
spapr: Clean up spapr_drc_populate_dt()
spapr: Clean up dt creation for PCI buses
spapr: Clean up device tree construction for PCI devices
spapr: Clean up device node name generation for PCI devices
target/ppc: Fix lxvw4x, lxvh8x and lxvb16x
spapr_pci: Improve error message
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It should be generic Hypervisor Virtualization interrupts for HV
directed rings and traditional External Interrupts for the OS directed
ring.
Don't generate anything for the user ring as it isn't actually
supported.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190606174409.12502-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The interrupt mode is chosen by the CAS negotiation process and
activated after a reset to take into account the required changes in
the machine. This brings new constraints on how the associated KVM IRQ
device is initialized.
Currently, each model takes care of the initialization of the KVM
device in their realize method but this is not possible anymore as the
initialization needs to be done globaly when the interrupt mode is
known, i.e. when machine is reseted. It also means that we need a way
to delete a KVM device when another mode is chosen.
Also, to support migration, the QEMU objects holding the state to
transfer should always be available but not necessarily activated.
The overall approach of this proposal is to initialize both interrupt
mode at the QEMU level to keep the IRQ number space in sync and to
allow switching from one mode to another. For the KVM side of things,
the whole initialization of the KVM device, sources and presenters, is
grouped in a single routine. The XICS and XIVE sPAPR IRQ reset
handlers are modified accordingly to handle the init and the delete
sequences of the KVM device.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-15-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When the VM is stopped, the VM state handler stabilizes the XIVE IC
and marks the EQ pages dirty. These are then transferred to destination
before the transfer of the device vmstates starts.
The SpaprXive interrupt controller model captures the XIVE internal
tables, EAT and ENDT and the XiveTCTX model does the same for the
thread interrupt context registers.
At restart, the SpaprXive 'post_load' method restores all the XIVE
states. It is called by the sPAPR machine 'post_load' method, when all
XIVE states have been transferred and loaded.
Finally, the source states are restored in the VM change state handler
when the machine reaches the running state.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-7-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This extends the KVM XIVE device backend with 'synchronize_state'
methods used to retrieve the state from KVM. The HW state of the
sources, the KVM device and the thread interrupt contexts are
collected for the monitor usage and also migration.
These get operations rely on their KVM counterpart in the host kernel
which acts as a proxy for OPAL, the host firmware. The set operations
will be added for migration support later.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190513084245.25755-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This introduces a set of helpers when KVM is in use, which create the
KVM XIVE device, initialize the interrupt sources at a KVM level and
connect the interrupt presenters to the vCPU.
They also handle the initialization of the TIMA and the source ESB
memory regions of the controller. These have a different type under
KVM. They are 'ram device' memory mappings, similarly to VFIO, exposed
to the guest and the associated VMAs on the host are populated
dynamically with the appropriate pages using a fault handler.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190513084245.25755-3-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The high order bits of the address of the OS event queue is stored in
bits [4-31] of word2 of the XIVE END internal structures and the low
order bits in word3. This structure is using Big Endian ordering and
computing the value requires some simple arithmetic which happens to
be wrong. The mask removing bits [0-3] of word2 is applied to the
wrong value and the resulting address is bogus when above 64GB.
Guests with more than 64GB of RAM will allocate pages for the OS event
queues which will reside above the 64GB limit. In this case, the XIVE
device model will wake up the CPUs in case of a notification, such as
IPIs, but the update of the event queue will be written at the wrong
place in memory. The result is uncertain as the guest memory is
trashed and IPI are not delivered.
Introduce a helper xive_end_qaddr() to compute this value correctly in
all places where it is used.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190508171946.657-3-clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The NSR register of the HV ring has a different, although similar, bit
layout. TM_QW3_NSR_HE_PHYS bit should now be raised when the
Hypervisor interrupt line is signaled. Other bits TM_QW3_NSR_HE_POOL
and TM_QW3_NSR_HE_LSI are not modeled. LSI are for special interrupts
reserved for HW bringup and the POOL bit is used when signaling a
group of VPs. This is not currently implemented in Linux but it is in
pHyp.
The most important special commands on the HV TIMA page are added to
let the core manage interrupts : acking and changing the CPU priority.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PowerNV machine with need to encode the block id in the source
interrupt number before forwarding the source event notification to
the Router.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-5-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PowerNV machine can perform indirect loads and stores on the TIMA
on behalf of another CPU. Give the controller the possibility to call
the TIMA memory accessors with a XiveTCTX of its choice.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-4-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
By default on P9, the HW CAM line (23bits) is hardwired to :
0x000||0b1||4Bit chip number||7Bit Thread number.
When the block group mode is enabled at the controller level (PowerNV),
the CAM line is changed for CAM compares to :
4Bit chip number||0x001||7Bit Thread number
This will require changes in xive_presenter_tctx_match() possibly.
This is a lowlevel functionality of the HW controller and it is not
strictly needed. Leave it for later.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-2-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Patch "target/ppc: Add POWER9 external interrupt model" should have
removed the section covering PPC_FLAGS_INPUT_POWER7.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190219142530.17807-1-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Adds support for the Hypervisor directed interrupts in addition to the
OS ones.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - modified the icp_realize() and xive_tctx_realize() to take
into account explicitely the POWER9 interrupt model
- introduced a specific power9_set_irq for POWER9 ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190215161648.9600-10-clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It provides a mean to retrieve the XiveTCTX of a CPU. This will become
necessary with future changes which move the interrupt presenter
object pointers under the PowerPCCPU machine_data.
The PowerNV machine has an extra requirement on TIMA accesses that
this new method addresses. The machine can perform indirect loads and
stores on the TIMA on behalf of another CPU. The PIR being defined in
the controller registers, we need a way to peek in the controller
model to find the PIR value.
The XiveTCTX is moved above the XiveRouter definition to avoid forward
typedef declarations.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The qemu_irq array is now allocated at the machine level using a sPAPR
IRQ set_irq handler depending on the chosen interrupt mode. The use of
this handler is slightly inefficient today but it will become necessary
when the 'dual' interrupt mode is introduced.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To support the 'dual' interrupt mode, XICS and XIVE, we plan to move
the qemu_irq array of each interrupt controller under the machine and
do the allocation under the sPAPR IRQ init method.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
which will be used by the machine only when the XIVE interrupt mode is
in use.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>