From: Alex Williamson <alex.williamson@redhat.com>
To: Laine Stump <laine@redhat.com>
Cc: libvir-list@redhat.com, Michael Roth <mdroth@linux.vnet.ibm.com>,
qemu-devel@nongnu.org, qemu-ppc@nongnu.org, aik@ozlabs.ru,
abologna@redhat.com, pkrempa@redhat.com, berrange@redhat.com,
mprivozn@redhat.com, sbhat@linux.vnet.ibm.com
Subject: Re: [Qemu-devel] [RFC PATCH 0/5] hotplug: fix premature rebinding of VFIO devices to host
Date: Thu, 29 Jun 2017 13:44:18 -0600
Message-ID: <20170629134418.7b1faea4@w520.home>
In-Reply-To: <804c9b1a-6803-e920-9d3d-60fc33dd2fea@redhat.com>
On Thu, 29 Jun 2017 14:50:15 -0400
Laine Stump <laine@redhat.com> wrote:
> On 06/28/2017 08:24 PM, Michael Roth wrote:
> > Hi everyone. Hoping to get some feedback on this approach, or some
> > alternatives proposed below, to the following issue:
> >
> > Currently libvirt immediately attempts to rebind a managed device back to the
> > host driver when it receives a DEVICE_DELETED event from QEMU. This is
> > problematic for 2 reasons:
> >
> > 1) If multiple devices from a group are attached to a guest, this can move
> > the group into a "non-viable" state where some devices are assigned to
> > the host and some to the guest.
>
> Since we don't support hotplug with managed='yes' of individual (or
> multiple) functions of a multifunction host device, I don't know that
> it's very useful to support hot *un*plug of it - it would only be useful
> if the multi-function device were present in the guest when it was
> started, and then was hot-unplugged later. And this is all a lot of
> extra complexity, though, so it would be useful to know what are the
> scenarios where it would actually be used (i.e. is this a legitimate
> need, or just an interesting exercise?)
This doesn't make sense to me. Since when do we not support hotplug
with managed='yes', and how is it prevented? Also, let's just not talk
about multifunction: a multifunction device does not imply a shared
group, nor does a shared group imply multifunction. So is it hotplug
of a device in a shared group that is not supported, and if so,
how? I think libvirt tries to do the hot-add, but hits the
non-viable group when it hands the device to QEMU. On hot-remove, I'm
pretty sure libvirt just lets the host crash into the ground by
re-binding the device to the host driver. If we don't want to support
it, that's one thing, but the current model is more neglectful than
unsupported.
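To make "non-viable group" concrete, here is a minimal sketch of a viability check over the sysfs layout, where each /sys/kernel/iommu_groups/<N>/devices/ entry is a device with an optional "driver" symlink. The function names and the VFIO_SAFE_DRIVERS set are assumptions for illustration only; this is not libvirt's or the kernel's actual logic, and the kernel's real rules for which drivers are acceptable are more nuanced.

```python
import os

# Assumption: a device keeps the group viable if it is unbound or bound to
# one of these drivers. The real kernel policy is more involved.
VFIO_SAFE_DRIVERS = {"vfio-pci", "pci-stub"}


def bound_driver(device_dir):
    """Return the name of the driver bound to a device, or None if unbound."""
    link = os.path.join(device_dir, "driver")
    if not os.path.islink(link):
        return None
    return os.path.basename(os.readlink(link))


def group_drivers(group_dir):
    """Map every device in an IOMMU group directory to its bound driver."""
    devices_dir = os.path.join(group_dir, "devices")
    return {
        name: bound_driver(os.path.join(devices_dir, name))
        for name in sorted(os.listdir(devices_dir))
    }


def group_is_viable(group_dir):
    """True if no device in the group is bound to a regular host driver."""
    return all(
        driver is None or driver in VFIO_SAFE_DRIVERS
        for driver in group_drivers(group_dir).values()
    )
```

With something like this, group_is_viable("/sys/kernel/iommu_groups/<N>") would turn False as soon as one device of the group is rebound to its host driver while its siblings are still bound to vfio-pci, which is the state the hot-remove path can currently create.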
> > 2) When QEMU emits the DEVICE_DELETED event, there's still a "finalize" phase
> > where additional cleanup occurs. In most cases libvirt can ignore this
> > cleanup, but in the case of VFIO devices this is where closing of a VFIO
> > group FD occurs, and failing to wait before rebinding the device to the
> > host driver can result in unexpected behavior. In the case of powernv
> > hosts at least, this can lead to a host driver crashing due to the default
> > DMA windows not having been fully restored yet. The window between this
> > and the initial DEVICE_DELETED seems to be ~6 seconds in practice. We've
> > seen host dumps with Mellanox CX4 VFs being rebound to host driver during
> > this period (on powernv hosts).
>
> I agree with Dan that the situation described here should be considered
> a qemu bug - according to my understanding (from back at the time
> DEVICE_DELETED was added to qemu (I think at libvirt's request)), qemu
> should never emit the DEVICE_DELETED event until *everything* related to
> the device is finished - that was the whole point of adding the event in
> the first place. Covering up this bug with a bunch of extra libvirt
> complexity is just creating the potential for even more bugs in the more
> complex code.
Agree, but ISTR not everyone thinks that way. I don't remember the
opposing viewpoint though. Thanks,
Alex