qemu-devel.nongnu.org archive mirror
From: Alexey Kardashevskiy <aik@ozlabs.ru>
To: "Cédric Le Goater" <clg@kaod.org>, qemu-ppc@nongnu.org
Cc: Frederic Barrat <fbarrat@linux.ibm.com>,
	Timothy Pearson <tpearson@raptorengineering.com>,
	Alex Williamson <alex.williamson@redhat.com>,
	qemu-devel@nongnu.org, David Gibson <david@gibson.dropbear.id.au>
Subject: Re: [PATCH qemu] spapr_pci: Disable IRQFD resampling on XIVE
Date: Thu, 28 Apr 2022 15:32:39 +1000
Message-ID: <5fa3c59b-de8e-c428-43f1-eb4d698e835a@ozlabs.ru>
In-Reply-To: <880cdd91-3a4a-8c35-1357-d3858950db44@kaod.org>



On 4/27/22 17:36, Cédric Le Goater wrote:
> Hello Alexey,
> 
> On 4/27/22 06:36, Alexey Kardashevskiy wrote:
>> VFIO-PCI has a "KVM_IRQFD_FLAG_RESAMPLE" optimization for INTx EOI
>> handling when KVM can unmask PCI INTx (level triggered interrupt) without
>> switching to the userspace (==QEMU).
>>
>> Unfortunately XIVE does not support level interrupts, 
> 
> That's not correctly phrased I think.


My bad, I meant "XIVE hardware".

> 
> The QEMU XIVE device supports LSIs but the POWER9 kernel-irqchips,
> KVM XICS-on-XIVE and XIVE native devices, are broken with respect
> to passthrough adapters using INTx.
> 
> 
>> QEMU emulates them
>> and therefore there is no existing code path to kick the resamplefd.
>> The problem appears when passing through a PCI adapter with
>> the "pci=nomsi" kernel parameter - the adapter's interrupt
>> count in /proc/interrupts will be stuck at "1".
>>
>> This disables resampler when the XIVE interrupt controller is configured.
>> This should not be very visible, though, as KVM already exits to QEMU for INTx
>> and XIVE-capable boxes (POWER9 and newer) do not seem to have
>> performance-critical INTx-only capable devices.
>>
>> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
>> ---
>>
>>
>> Cédric, this is what I meant when I said that spapr_pci.c was unaware of
>> the interrupt controller type, neither xics nor xive was mentioned
>> in the file before.
>>
>>
>> ---
>>   hw/ppc/spapr_pci.c | 14 +++++++++++---
>>   1 file changed, 11 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
>> index 5bfd4aa9e5aa..2675052601db 100644
>> --- a/hw/ppc/spapr_pci.c
>> +++ b/hw/ppc/spapr_pci.c
>> @@ -729,11 +729,19 @@ static void pci_spapr_set_irq(void *opaque, int irq_num, int level)
>>   static PCIINTxRoute spapr_route_intx_pin_to_irq(void *opaque, int pin)
>>   {
>> +    SpaprMachineState *spapr = SPAPR_MACHINE(qdev_get_machine());
>>       SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(opaque);
>> -    PCIINTxRoute route;
>> +    PCIINTxRoute route = { .mode = PCI_INTX_DISABLED };
>> -    route.mode = PCI_INTX_ENABLED;
>> -    route.irq = sphb->lsi_table[pin].irq;
>> +    /*
>> +     * Disable IRQFD resampler on XIVE as it does not support LSI and QEMU
>> +     * emulates those so the KVM kernel resamplefd kick is skipped and EOI
>> +     * is not delivered to VFIO-PCI.
>> +     */
>> +    if (!spapr->xive) {
> 
> This is testing the availability of the XIVE interrupt mode, but not
> the active controller. See spapr_irq_init() which is called very
> early in the machine initialization.
> 
> Is that what we want? Is everything fine if we start the machine with
> ic-mode=xics ? On a POWER9 host, this would use the KVM XICS-on-XIVE
> device which is broken also AFAICT.

I should probably fix that in KVM, just not quite sure yet how for the 
realmode handlers, or just drop those on P9 and then the fix is trivial.
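
To make the distinction raised above concrete, here is a minimal sketch of
"XIVE is available" versus "XIVE is the active controller". The types and
function names (Intc, Machine, xive_available, xive_active) are stand-ins
invented for illustration and are not the QEMU definitions; the only thing
taken from the thread is that the patch tests a non-NULL xive pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types for illustration only; NOT the QEMU structures. */
typedef enum { INTC_XICS, INTC_XIVE } IntcKind;

typedef struct {
    IntcKind kind;
} Intc;

typedef struct {
    Intc *xics;    /* non-NULL when the XICS mode was set up */
    Intc *xive;    /* non-NULL when the XIVE mode was set up */
    Intc *active;  /* the controller actually in use (hypothetical field) */
} Machine;

/* Roughly what "!spapr->xive" in the patch answers: was XIVE set up at all? */
static bool xive_available(const Machine *m)
{
    assert(m != NULL);
    return m->xive != NULL;
}

/* What the review asks for: is XIVE the controller currently in use? */
static bool xive_active(const Machine *m)
{
    assert(m != NULL);
    return m->active != NULL && m->active->kind == INTC_XIVE;
}
```

With both modes set up and XICS selected, xive_available() is true while
xive_active() is false, so a check on availability alone would disable the
INTx route even though XIVE is not in use.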


> You should extend the SpaprInterruptControllerClass (for a routine) or
> simply SpaprIrq (for a bool) if you need to handle IRQ matters from a
> device model.

It is a property of KVM rather than the interrupt controller so it 
probably makes more sense to just stop advertising 
KVM_CAP_IRQFD_RESAMPLE. Hmmm...
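
A rough sketch of that idea, assuming the routing code only consults a single
boolean that says whether resampling can work, without knowing which
interrupt controller is behind it. The enum and struct below are local
stand-ins that reuse names from the patch; the Phb type, the resample_ok
flag and the -1 placeholder are hypothetical, and how the flag would be
derived from KVM is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins reusing names from the patch; not QEMU's headers. */
typedef enum {
    PCI_INTX_ENABLED,
    PCI_INTX_INVERTED,
    PCI_INTX_DISABLED
} PCIINTxRoutingMode;

typedef struct {
    PCIINTxRoutingMode mode;
    int irq;
} PCIINTxRoute;

#define NUM_PINS 4

typedef struct {
    int lsi_irq[NUM_PINS];  /* per-pin LSI number, like lsi_table[pin].irq */
    bool resample_ok;       /* hypothetical: result of a one-time capability probe */
} Phb;

/* Report a usable INTx route only when resampling is known to work. */
static PCIINTxRoute route_intx_pin_to_irq(const Phb *phb, int pin)
{
    /* -1 is a placeholder chosen here; the patch leaves irq zero-initialized. */
    PCIINTxRoute route = { .mode = PCI_INTX_DISABLED, .irq = -1 };

    assert(phb != NULL && pin >= 0 && pin < NUM_PINS);
    if (phb->resample_ok) {
        route.mode = PCI_INTX_ENABLED;
        route.irq = phb->lsi_irq[pin];
    }
    return route;
}
```

The point of the sketch is that the PHB code stays unaware of xics vs xive:
the decision lives in one flag that whoever knows about the KVM capability
sets.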


> 
> Thanks,
> 
> C.
> 
> 
>> +        route.mode = PCI_INTX_ENABLED;
>> +        route.irq = sphb->lsi_table[pin].irq;
>> +    }
>>       return route;
>>   }


Thread overview: 6+ messages
2022-04-27  4:36 [PATCH qemu] spapr_pci: Disable IRQFD resampling on XIVE Alexey Kardashevskiy
2022-04-27  7:36 ` Cédric Le Goater
2022-04-28  5:32   ` Alexey Kardashevskiy [this message]
2022-04-28  6:25     ` Cédric Le Goater
2022-04-28  7:26       ` Alexey Kardashevskiy
2022-04-28  7:31         ` Cédric Le Goater
