From: David Gibson <david@gibson.dropbear.id.au>
To: Peter Xu <peterx@redhat.com>
Cc: alex.williamson@redhat.com, qemu-devel@nongnu.org,
qemu-ppc@nongnu.org, gwshan@au1.ibm.com
Subject: Re: [Qemu-devel] [PATCH] vfio: Fix regression in MSI routing configuration
Date: Tue, 20 Sep 2016 15:57:19 +1000 [thread overview]
Message-ID: <20160920055719.GU20488@umbus> (raw)
In-Reply-To: <20160918053327.GB11536@pxdev.xzpeter.org>
[-- Attachment #1: Type: text/plain, Size: 3891 bytes --]
On Sun, Sep 18, 2016 at 01:33:27PM +0800, Peter Xu wrote:
> Good to know that we have a solution. :)
>
> I think this is a good fix, however I still do not understand why this
> is happening... Please see below comment.
>
> On Thu, Sep 15, 2016 at 04:11:48PM +1000, David Gibson wrote:
> > d1f6af6 "kvm-irqchip: simplify kvm_irqchip_add_msi_route" was a cleanup
> > of kvmchip routing configuration, that was mostly intended for x86.
> > However, it also contains a subtle change in behaviour which breaks EEH[1]
> > error recovery on certain VFIO passthrough devices on spapr guests. So far
> > it's only been seen on a BCM5719 NIC on a POWER8 server, but there may be
> > other hardware with the same problem. It's also possible there could be
> > circumstances where it causes a bug on x86 as well, though I don't know of
> > any obvious candidates.
> >
> > Prior to d1f6af6, both vfio_msix_vector_do_use() and
> > vfio_add_kvm_msi_virq() used msg == NULL as a special flag to mark this
> > as the "dummy" vector used to make the host hardware state sync with the
> > guest expected hardware state in terms of MSI configuration.
> >
> > Specifically that flag caused vfio_add_kvm_msi_virq() to become a no-op,
> > meaning the dummy irq would always be delivered via qemu.
>
> AFAICT, QEMU is not delivering that *dummy* IRQ as well. IIUC here the
> dummy IRQ refers to the one in vfio_msix_enable():
>
> vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL);
> vfio_msix_vector_release(&vdev->pdev, 0);
>
> Here we registered this dummy one to sync up the guest and host
> hardware state for some reason. However, the handler is NULL here,
> means that even QEMU won't handle it if it happens. And the only
> difference is that during this extremely small period, kvm will handle
> the interrupt with an uninitialized MSI message, but assuming that
> would never happen since no one should start to use MSI yet. After
> release(), interrupts will be dropped for all cases.
>
> So looks like it is not related to "whether QEMU or KVM will handle
> the interrupt", but something else.
>
> Generally speaking, the process of registering the IRQ should be
> something like (using QEMU with d1f6af6 change):
>
> 1. vfio_msix_enable(): when MSIX is enabled. this will trigger
> vfio_add_kvm_msi_virq(), but quickly it is unregistered (virq will
> be there, but VFIO will assign the VFIO handler to
> vector->interrupt, rather than vector->kvm_interrupt, so that virq
> won't take effect).
>
> 2. vfio_msix_vector_use(): when an MSIX entry is finally used and
> setup. this will trigger vfio_update_kvm_msi_virq(), to update the
> interrupt information to the "real one".
>
> IMHO we should always receive interrupts after step (2). And as we
> have traced (with David), step (2) was updating the correct interrupt
> information even with commit d1f6af6. But I think I might be wrong (or
> say, the above assumption is not correct), because commit d1f6af6
> indeed caused problem with EHH and this specific hardware. I am just
> do not know why it is triggering the issue. And that's why I want to
> know whether there is anything special with that NIC's driver.
It's quite possible there's something unusual about the driver - this
didn't happen for the other NIC I tried, or for an XHCI.
Unfortunately, I don't know what it would be, and I really have no
idea where one would start looking.
> Again, this is totally a discussion only, and I am totally agree with
> current change to fix the issue. Just in case someone knows the reason
> behind this.
>
> Thanks,
>
> -- peterx
>
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
prev parent reply other threads:[~2016-09-20 5:57 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-09-15 6:11 [Qemu-devel] [PATCH] vfio: Fix regression in MSI routing configuration David Gibson
2016-09-15 16:38 ` Alex Williamson
2016-09-18 5:33 ` Peter Xu
2016-09-20 5:57 ` David Gibson [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160920055719.GU20488@umbus \
--to=david@gibson.dropbear.id.au \
--cc=alex.williamson@redhat.com \
--cc=gwshan@au1.ibm.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.