From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: "Cédric Le Goater" <clg@kaod.org>, qemu-ppc@nongnu.org
Cc: Frederic Barrat <fbarrat@linux.ibm.com>,
Timothy Pearson <tpearson@raptorengineering.com>,
Alex Williamson <alex.williamson@redhat.com>,
qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH qemu] spapr_pci: Disable IRQFD resampling on XIVE
Date: Thu, 28 Apr 2022 15:32:39 +1000
Message-ID: <5fa3c59b-de8e-c428-43f1-eb4d698e835a@ozlabs.ru>
In-Reply-To: <880cdd91-3a4a-8c35-1357-d3858950db44@kaod.org>
On 4/27/22 17:36, Cédric Le Goater wrote:
> Hello Alexey,
>
> On 4/27/22 06:36, Alexey Kardashevskiy wrote:
>> VFIO-PCI has a "KVM_IRQFD_FLAG_RESAMPLE" optimization for INTx EOI
>> handling, which lets KVM unmask PCI INTx (a level-triggered interrupt)
>> without switching to userspace (== QEMU).
>>
>> Unfortunately XIVE does not support level interrupts,
>
> That's not correctly phrased I think.
My bad, I meant "XIVE hardware".
>
> The QEMU XIVE device supports LSIs but the POWER9 kernel-irqchips,
> KVM XICS-on-XIVE and XIVE native devices, are broken with respect
> to passthrough adapters using INTx.
>
>
>> QEMU emulates them
>> and therefore there is no existing code path to kick the resamplefd.
>> The problem appears when passing through a PCI adapter with
>> the "pci=nomsi" kernel parameter - the adapter's interrupt
>> count in /proc/interrupts will be stuck at "1".
>>
>> This disables the resampler when the XIVE interrupt controller is
>> configured. This should not be very visible, though: KVM already exits
>> to QEMU for INTx, and XIVE-capable boxes (POWER9 and newer) do not seem
>> to have performance-critical INTx-only capable devices.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>
>>
>> Cédric, this is what I meant when I said that spapr_pci.c was unaware of
>> the interrupt controller type, neither xics nor xive was mentioned
>> in the file before.
>>
>>
>> ---
>> hw/ppc/spapr_pci.c | 14 +++++++++++---
>> 1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 5bfd4aa9e5aa..2675052601db 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -729,11 +729,19 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
>> static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
>> {
>> + SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>> SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
>> - PCIINTxRoute route;
>> + PCIINTxRoute route = { .mode = PCI_INTX_DISABLED };
>> - route.mode = PCI_INTX_ENABLED;
>> - route.irq = sphb->lsi_table[pin].irq;
>> + /*
>> + * Disable IRQFD resampler on XIVE as it does not support LSI and QEMU
>> + * emulates those so the KVM kernel resamplefd kick is skipped and EOI
>> + * is not delivered to VFIO-PCI.
>> + */
>> + if (!spapr->xive) {
>
> This is testing the availability of the XIVE interrupt mode, but not
> the active controller. See spapr_irq_init() which is called very
> early in the machine initialization.
>
> Is that what we want ? Is everything fine if we start the machine with
> ic-mode=xics ? On a POWER9 host, this would use the KVM XICS-on-XIVE
> device which is broken also AFAICT.
I should probably fix that in KVM; I am just not quite sure yet how to do
it for the realmode handlers. Alternatively, I could just drop those on P9,
and then the fix is trivial.
> You should extend the SpaprInterruptControllerClass (for a routine) or
> simply SpaprIrq (for a bool) if you need to handle IRQ matters from a
> device model.
It is a property of KVM rather than of the interrupt controller, so it
probably makes more sense to just stop advertising
KVM_CAP_IRQFD_RESAMPLE. Hmmm...
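
To illustrate the capability-based idea, here is a minimal, self-contained
sketch. It is not QEMU or KVM code: cap_query_fn, IntxEoiPath and
choose_intx_eoi_path() are hypothetical names made up for this example. The
only real interface it assumes is that userspace can ask KVM whether
KVM_CAP_IRQFD_RESAMPLE is available through the KVM_CHECK_EXTENSION ioctl,
which returns 0 when a capability is unsupported and a positive value when
it is supported. The query is passed in as a callback so the decision logic
can be exercised without /dev/kvm.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a capability query. Real code would issue
 * ioctl(fd, KVM_CHECK_EXTENSION, KVM_CAP_IRQFD_RESAMPLE) here.
 */
typedef int (*cap_query_fn)(void *opaque);

typedef enum {
    INTX_EOI_VIA_RESAMPLEFD, /* KVM kicks the resamplefd on EOI */
    INTX_EOI_VIA_USERSPACE,  /* userspace sees the EOI and unmasks INTx */
} IntxEoiPath;

/*
 * Pick the INTx EOI path from the capability alone: if the kernel does
 * not advertise resampling (0) or the query fails (negative), fall back
 * to the userspace path.
 */
static IntxEoiPath choose_intx_eoi_path(cap_query_fn query, void *opaque)
{
    return query(opaque) > 0 ? INTX_EOI_VIA_RESAMPLEFD
                             : INTX_EOI_VIA_USERSPACE;
}

/* Two fake queries, used only to exercise the function above. */
static int cap_absent(void *opaque)  { (void)opaque; return 0; }
static int cap_present(void *opaque) { (void)opaque; return 1; }
```

With this shape the device model would not need to know which interrupt
controller is active; it would only follow what the kernel advertises.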
>
> Thanks,
>
> C.
>
>
>> + route.mode = PCI_INTX_ENABLED;
>> + route.irq = sphb->lsi_table[pin].irq;
>> + }
>> return route;
>> }
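
For comparison, a minimal sketch of the "bool in SpaprIrq" variant suggested
in the review, written as standalone C rather than against the QEMU tree.
Every name in it (IrqBackend, intx_resample_ok, IntxRoute, route_intx_pin)
is a hypothetical stand-in and not an existing QEMU definition. It assumes,
as the patch does, that returning a disabled route is enough to keep the
caller off the KVM resample path.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_INTX_PINS 4 /* INTA..INTD */

typedef enum { INTX_ROUTE_DISABLED, INTX_ROUTE_ENABLED } IntxRouteMode;

typedef struct {
    IntxRouteMode mode;
    int irq;
} IntxRoute;

/* Hypothetical descriptor of the active interrupt backend. */
typedef struct {
    const char *name;
    bool intx_resample_ok; /* can the in-kernel path kick the resamplefd? */
} IrqBackend;

/*
 * Report an enabled route only when the active backend says INTx
 * resampling works; otherwise leave the route disabled.
 */
static IntxRoute route_intx_pin(const IrqBackend *active,
                                const int lsi_irq[NUM_INTX_PINS], int pin)
{
    IntxRoute route = { .mode = INTX_ROUTE_DISABLED, .irq = -1 };

    assert(pin >= 0 && pin < NUM_INTX_PINS);

    if (active->intx_resample_ok) {
        route.mode = INTX_ROUTE_ENABLED;
        route.irq = lsi_irq[pin];
    }
    return route;
}

/* Sample data, used only to exercise the function above. */
static const IrqBackend backend_resample_ok     = { "resample-ok", true };
static const IrqBackend backend_resample_broken = { "resample-broken", false };
static const int sample_lsi[NUM_INTX_PINS] = { 0x1200, 0x1201, 0x1202, 0x1203 };
```

The flag is looked up on the active backend at routing time, so it reflects
the controller in use rather than the mere availability of a mode, which is
the distinction raised in the review.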