* [Qemu-devel] [PATCH v2 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt
@ 2017-10-09 17:06 Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Michael Roth @ 2017-10-09 17:06 UTC (permalink / raw)
To: qemu-devel
Cc: david, groug, peter.maydell, armbru, alex.williamson, pbonzini,
imammedo, ehabkost
This series was motivated by the discussion in this thread:
https://www.redhat.com/archives/libvir-list/2017-June/msg01370.html
The issue this series addresses is that when libvirt unplugs a VFIO PCI device,
it may attempt to bind the host device back to the host driver when QEMU emits
the DEVICE_DELETED event for the corresponding vfio-pci device. However, the
VFIO group FD is not actually cleaned up until vfio-pci device is *finalized*
by QEMU, whereas the event is emitted earlier during device_unparent.
Depending on the host device and how long certain operations like resetting the
device might take, this can in result in libvirt trying to rebind the device
back to the host while it is still in use by VFIO, leading to host crashes or
other unexpected behavior.
In particular, Mellanox CX4 adapters on PowerNV hosts might not be fully
quiesced by vfio-pci's finalize() routine until up to 6s after the
DEVICE_DELETED was emitted, leading to detach-device on the libvirt side pretty
much always crashing the host.
Implementing this change requires 2 prereqs to ensure the same information is
available when the DEVICE_DELETED is finally emitted:
1) Storing the path in the composition patch, which is addressed by PATCH 1,
which was plucked from another pending series from Greg Kurz:
https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg07922.html
since we are now "disconnected" at the time the event is emitted, and
2) Deferring qemu_opts_del of the DeviceState->QemuOpts till finalize, since
that is where DeviceState->id is stored. This was actually how it was
done in the past, so PATCH 2 simply reverts the change which moved it to
device_unparent.
>From there it's just a mechanical move of the event from device_unparent to
device_finalize.
Since this was originally posted a kernel fix was merged to address the race
on the kernel side (6586b561), but it would still be good to fix this on the
QEMU side for older host kernel and for clearer semantics on the
libvirt/management side.
v2:
- rebased on master
- fixed up inaccurate comment in PATCH 1 (Eric)
hw/core/qdev.c | 31 ++++++++++++++++++++-----------
include/hw/qdev-core.h | 1 +
2 files changed, 21 insertions(+), 11 deletions(-)
^ permalink raw reply [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting
2017-10-09 17:06 [Qemu-devel] [PATCH v2 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt Michael Roth
@ 2017-10-09 17:06 ` Michael Roth
2017-10-09 18:11 ` Eric Blake
2017-10-10 1:41 ` David Gibson
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away" Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize() Michael Roth
2 siblings, 2 replies; 9+ messages in thread
From: Michael Roth @ 2017-10-09 17:06 UTC (permalink / raw)
To: qemu-devel
Cc: david, groug, peter.maydell, armbru, alex.williamson, pbonzini,
imammedo, ehabkost, Michael S . Tsirkin
device_unparent(dev, ...) is called when a device is unparented,
either directly, or as a result of a parent device being
finalized, and handles some final cleanup for the device. Part
of this includes emiting a DEVICE_DELETED QMP event to notify
management, which includes the device's path in the composition
tree as provided by object_get_canonical_path().
object_get_canonical_path() assumes the device is still connected
to the machine/root container, and will assert otherwise, but
in some situations this isn't the case:
If the parent is finalized as a result of object_unparent(), it
will still be attached to the composition tree at the time any
children are unparented as a result of that same call to
object_unparent(). However, in some cases, object_unparent()
will complete without finalizing the parent device, due to
lingering references that won't be released till some time later.
One such example is if the parent has MemoryRegion children (which
take a ref on their parent), who in turn have AddressSpace's (which
take a ref on their regions), since those AddressSpaces get cleaned
up asynchronously by the RCU thread.
In this case qdev:device_unparent() may be called for a child Device
that no longer has a path to the root/machine container, causing
object_get_canonical_path() to assert.
Fix this by storing the canonical path during realize() so the
information will still be available for device_unparent() in such
cases.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
hw/core/qdev.c | 16 +++++++++++++---
include/hw/qdev-core.h | 1 +
2 files changed, 14 insertions(+), 3 deletions(-)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 606ab53c42..0542e1879f 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -928,6 +928,13 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
goto post_realize_fail;
}
+ /*
+ * always free/re-initialize here since the value cannot be cleaned up
+ * in device_unrealize due to it's usage later on in the unplug path
+ */
+ g_free(dev->canonical_path);
+ dev->canonical_path = object_get_canonical_path(OBJECT(dev));
+
if (qdev_get_vmsd(dev)) {
if (vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), dev,
dev->instance_id_alias,
@@ -984,6 +991,7 @@ child_realize_fail:
}
post_realize_fail:
+ g_free(dev->canonical_path);
if (dc->unrealize) {
dc->unrealize(dev, NULL);
}
@@ -1102,10 +1110,12 @@ static void device_unparent(Object *obj)
/* Only send event if the device had been completely realized */
if (dev->pending_deleted_event) {
- gchar *path = object_get_canonical_path(OBJECT(dev));
+ g_assert(dev->canonical_path);
- qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
- g_free(path);
+ qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
+ &error_abort);
+ g_free(dev->canonical_path);
+ dev->canonical_path = NULL;
}
qemu_opts_del(dev->opts);
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 089146197f..0a71bf83f0 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -153,6 +153,7 @@ struct DeviceState {
/*< public >*/
const char *id;
+ char *canonical_path;
bool realized;
bool pending_deleted_event;
QemuOpts *opts;
--
2.11.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away"
2017-10-09 17:06 [Qemu-devel] [PATCH v2 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
@ 2017-10-09 17:06 ` Michael Roth
2017-10-10 1:42 ` David Gibson
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize() Michael Roth
2 siblings, 1 reply; 9+ messages in thread
From: Michael Roth @ 2017-10-09 17:06 UTC (permalink / raw)
To: qemu-devel
Cc: david, groug, peter.maydell, armbru, alex.williamson, pbonzini,
imammedo, ehabkost
This reverts commit abed886ec60cf239a03515cf0b30fb11fa964c44.
This patch originally addressed an issue where a DEVICE_DELETED
event could be emitted (in device_unparent()) before a Device's
QemuOpts were cleaned up (in device_finalize()), leading to a
"duplicate ID" error if management attempted to immediately add
a device with the same ID in response to the DEVICE_DELETED event.
An alternative will be implemented in a subsequent patch where we
defer the DEVICE_DELETED event until device_finalize(), which would
also prevent the race, so we revert the original fix in preparation.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
hw/core/qdev.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 0542e1879f..f7c66d9bd0 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1068,6 +1068,7 @@ static void device_finalize(Object *obj)
NamedGPIOList *ngl, *next;
DeviceState *dev = DEVICE(obj);
+ qemu_opts_del(dev->opts);
QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
QLIST_REMOVE(ngl, node);
@@ -1117,9 +1118,6 @@ static void device_unparent(Object *obj)
g_free(dev->canonical_path);
dev->canonical_path = NULL;
}
-
- qemu_opts_del(dev->opts);
- dev->opts = NULL;
}
static void device_class_init(ObjectClass *class, void *data)
--
2.11.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize()
2017-10-09 17:06 [Qemu-devel] [PATCH v2 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away" Michael Roth
@ 2017-10-09 17:06 ` Michael Roth
2017-10-10 1:44 ` David Gibson
2 siblings, 1 reply; 9+ messages in thread
From: Michael Roth @ 2017-10-09 17:06 UTC (permalink / raw)
To: qemu-devel
Cc: david, groug, peter.maydell, armbru, alex.williamson, pbonzini,
imammedo, ehabkost
DEVICE_DEL is currently emitted when a Device is unparented, as
opposed to when it is finalized. The main design motivation for this
seems to be that after unparent()/unrealize(), the Device is no
longer visible to the guest, and thus the operation is complete
from the perspective of management.
However, there are cases where remaining host-side cleanup is also
pertinent to management. The is generally handled by treating these
resources as aspects of the "backend", which can be managed via
separate interfaces/events, such as blockdev_add/del, netdev_add/del,
object_add/del, etc, but some devices do not have this level of
compartmentalization, namely vfio-pci, and possibly to lend themselves
well to it.
In the case of vfio-pci, the "backend" cleanup happens as part of
the finalization of the vfio-pci device itself, in particular the
cleanup of the VFIO group FD. Failing to wait for this cleanup can
result in tools like libvirt attempting to rebind the device to
the host while it's still being used by VFIO, which can result in
host crashes or other misbehavior depending on the host driver.
Deferring DEVICE_DEL still affords us the ability to manage backends
explicitly, while also addressing cases like vfio-pci's, so we
implement that approach here.
An alternative proposal involving having VFIO emit a separate event
to denote completion of host-side cleanup was discussed, but the
prevailing opinion seems to be that it is not worth the added
complexity, and leaves the issue open for other Device implementations
solve in the future.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
---
hw/core/qdev.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index f7c66d9bd0..8b7b8c3280 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -1068,7 +1068,6 @@ static void device_finalize(Object *obj)
NamedGPIOList *ngl, *next;
DeviceState *dev = DEVICE(obj);
- qemu_opts_del(dev->opts);
QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
QLIST_REMOVE(ngl, node);
@@ -1079,6 +1078,18 @@ static void device_finalize(Object *obj)
* here
*/
}
+
+ /* Only send event if the device had been completely realized */
+ if (dev->pending_deleted_event) {
+ g_assert(dev->canonical_path);
+
+ qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
+ &error_abort);
+ g_free(dev->canonical_path);
+ dev->canonical_path = NULL;
+ }
+
+ qemu_opts_del(dev->opts);
}
static void device_class_base_init(ObjectClass *class, void *data)
@@ -1108,16 +1119,6 @@ static void device_unparent(Object *obj)
object_unref(OBJECT(dev->parent_bus));
dev->parent_bus = NULL;
}
-
- /* Only send event if the device had been completely realized */
- if (dev->pending_deleted_event) {
- g_assert(dev->canonical_path);
-
- qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
- &error_abort);
- g_free(dev->canonical_path);
- dev->canonical_path = NULL;
- }
}
static void device_class_init(ObjectClass *class, void *data)
--
2.11.0
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
@ 2017-10-09 18:11 ` Eric Blake
2017-10-09 19:04 ` Michael Roth
2017-10-10 1:41 ` David Gibson
1 sibling, 1 reply; 9+ messages in thread
From: Eric Blake @ 2017-10-09 18:11 UTC (permalink / raw)
To: Michael Roth, qemu-devel
Cc: peter.maydell, ehabkost, Michael S . Tsirkin, groug, armbru,
alex.williamson, imammedo, pbonzini, david
[-- Attachment #1: Type: text/plain, Size: 1070 bytes --]
On 10/09/2017 12:06 PM, Michael Roth wrote:
> device_unparent(dev, ...) is called when a device is unparented,
> either directly, or as a result of a parent device being
> finalized, and handles some final cleanup for the device. Part
> of this includes emiting a DEVICE_DELETED QMP event to notify
> management, which includes the device's path in the composition
> tree as provided by object_get_canonical_path().
>
> +++ b/hw/core/qdev.c
> @@ -928,6 +928,13 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> goto post_realize_fail;
> }
>
> + /*
> + * always free/re-initialize here since the value cannot be cleaned up
> + * in device_unrealize due to it's usage later on in the unplug path
s/it's/its/ (remember, "it's" is only appropriate if "it is" or "it has"
can be substituted in its place; otherwise you meant the possessive "its")
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 619 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting
2017-10-09 18:11 ` Eric Blake
@ 2017-10-09 19:04 ` Michael Roth
0 siblings, 0 replies; 9+ messages in thread
From: Michael Roth @ 2017-10-09 19:04 UTC (permalink / raw)
To: Eric Blake, qemu-devel
Cc: peter.maydell, ehabkost, Michael S . Tsirkin, groug, armbru,
alex.williamson, imammedo, pbonzini, david
Quoting Eric Blake (2017-10-09 13:11:28)
> On 10/09/2017 12:06 PM, Michael Roth wrote:
> > device_unparent(dev, ...) is called when a device is unparented,
> > either directly, or as a result of a parent device being
> > finalized, and handles some final cleanup for the device. Part
> > of this includes emiting a DEVICE_DELETED QMP event to notify
> > management, which includes the device's path in the composition
> > tree as provided by object_get_canonical_path().
> >
>
> > +++ b/hw/core/qdev.c
> > @@ -928,6 +928,13 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> > goto post_realize_fail;
> > }
> >
> > + /*
> > + * always free/re-initialize here since the value cannot be cleaned up
> > + * in device_unrealize due to it's usage later on in the unplug path
>
> s/it's/its/ (remember, "it's" is only appropriate if "it is" or "it has"
> can be substituted in its place; otherwise you meant the possessive "its")
Thanks, just inattentiveness on my part. Will send a v3 soon if there are
no other comments.
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc. +1-919-301-3266
> Virtualization: qemu.org | libvirt.org
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
2017-10-09 18:11 ` Eric Blake
@ 2017-10-10 1:41 ` David Gibson
1 sibling, 0 replies; 9+ messages in thread
From: David Gibson @ 2017-10-10 1:41 UTC (permalink / raw)
To: Michael Roth
Cc: qemu-devel, groug, peter.maydell, armbru, alex.williamson,
pbonzini, imammedo, ehabkost, Michael S . Tsirkin
[-- Attachment #1: Type: text/plain, Size: 4448 bytes --]
On Mon, Oct 09, 2017 at 12:06:05PM -0500, Michael Roth wrote:
> device_unparent(dev, ...) is called when a device is unparented,
> either directly, or as a result of a parent device being
> finalized, and handles some final cleanup for the device. Part
> of this includes emiting a DEVICE_DELETED QMP event to notify
> management, which includes the device's path in the composition
> tree as provided by object_get_canonical_path().
>
> object_get_canonical_path() assumes the device is still connected
> to the machine/root container, and will assert otherwise, but
> in some situations this isn't the case:
>
> If the parent is finalized as a result of object_unparent(), it
> will still be attached to the composition tree at the time any
> children are unparented as a result of that same call to
> object_unparent(). However, in some cases, object_unparent()
> will complete without finalizing the parent device, due to
> lingering references that won't be released till some time later.
> One such example is if the parent has MemoryRegion children (which
> take a ref on their parent), who in turn have AddressSpace's (which
> take a ref on their regions), since those AddressSpaces get cleaned
> up asynchronously by the RCU thread.
>
> In this case qdev:device_unparent() may be called for a child Device
> that no longer has a path to the root/machine container, causing
> object_get_canonical_path() to assert.
>
> Fix this by storing the canonical path during realize() so the
> information will still be available for device_unparent() in such
> cases.
>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Signed-off-by: Greg Kurz <groug@kaod.org>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
With the exception of the trivial comment error that's already been
mentioned.
> ---
> hw/core/qdev.c | 16 +++++++++++++---
> include/hw/qdev-core.h | 1 +
> 2 files changed, 14 insertions(+), 3 deletions(-)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 606ab53c42..0542e1879f 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -928,6 +928,13 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
> goto post_realize_fail;
> }
>
> + /*
> + * always free/re-initialize here since the value cannot be cleaned up
> + * in device_unrealize due to it's usage later on in the unplug path
> + */
> + g_free(dev->canonical_path);
> + dev->canonical_path = object_get_canonical_path(OBJECT(dev));
> +
> if (qdev_get_vmsd(dev)) {
> if (vmstate_register_with_alias_id(dev, -1, qdev_get_vmsd(dev), dev,
> dev->instance_id_alias,
> @@ -984,6 +991,7 @@ child_realize_fail:
> }
>
> post_realize_fail:
> + g_free(dev->canonical_path);
> if (dc->unrealize) {
> dc->unrealize(dev, NULL);
> }
> @@ -1102,10 +1110,12 @@ static void device_unparent(Object *obj)
>
> /* Only send event if the device had been completely realized */
> if (dev->pending_deleted_event) {
> - gchar *path = object_get_canonical_path(OBJECT(dev));
> + g_assert(dev->canonical_path);
>
> - qapi_event_send_device_deleted(!!dev->id, dev->id, path, &error_abort);
> - g_free(path);
> + qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> + &error_abort);
> + g_free(dev->canonical_path);
> + dev->canonical_path = NULL;
> }
>
> qemu_opts_del(dev->opts);
> diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
> index 089146197f..0a71bf83f0 100644
> --- a/include/hw/qdev-core.h
> +++ b/include/hw/qdev-core.h
> @@ -153,6 +153,7 @@ struct DeviceState {
> /*< public >*/
>
> const char *id;
> + char *canonical_path;
> bool realized;
> bool pending_deleted_event;
> QemuOpts *opts;
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away"
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away" Michael Roth
@ 2017-10-10 1:42 ` David Gibson
0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2017-10-10 1:42 UTC (permalink / raw)
To: Michael Roth
Cc: qemu-devel, groug, peter.maydell, armbru, alex.williamson,
pbonzini, imammedo, ehabkost
[-- Attachment #1: Type: text/plain, Size: 1901 bytes --]
On Mon, Oct 09, 2017 at 12:06:06PM -0500, Michael Roth wrote:
> This reverts commit abed886ec60cf239a03515cf0b30fb11fa964c44.
>
> This patch originally addressed an issue where a DEVICE_DELETED
> event could be emitted (in device_unparent()) before a Device's
> QemuOpts were cleaned up (in device_finalize()), leading to a
> "duplicate ID" error if management attempted to immediately add
> a device with the same ID in response to the DEVICE_DELETED event.
>
> An alternative will be implemented in a subsequent patch where we
> defer the DEVICE_DELETED event until device_finalize(), which would
> also prevent the race, so we revert the original fix in preparation.
>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Reviewed-by: Greg Kurz <groug@kaod.org>
> Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/core/qdev.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 0542e1879f..f7c66d9bd0 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1068,6 +1068,7 @@ static void device_finalize(Object *obj)
> NamedGPIOList *ngl, *next;
>
> DeviceState *dev = DEVICE(obj);
> + qemu_opts_del(dev->opts);
>
> QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
> QLIST_REMOVE(ngl, node);
> @@ -1117,9 +1118,6 @@ static void device_unparent(Object *obj)
> g_free(dev->canonical_path);
> dev->canonical_path = NULL;
> }
> -
> - qemu_opts_del(dev->opts);
> - dev->opts = NULL;
> }
>
> static void device_class_init(ObjectClass *class, void *data)
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize()
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize() Michael Roth
@ 2017-10-10 1:44 ` David Gibson
0 siblings, 0 replies; 9+ messages in thread
From: David Gibson @ 2017-10-10 1:44 UTC (permalink / raw)
To: Michael Roth
Cc: qemu-devel, groug, peter.maydell, armbru, alex.williamson,
pbonzini, imammedo, ehabkost
[-- Attachment #1: Type: text/plain, Size: 3930 bytes --]
On Mon, Oct 09, 2017 at 12:06:07PM -0500, Michael Roth wrote:
> DEVICE_DEL is currently emitted when a Device is unparented, as
> opposed to when it is finalized. The main design motivation for this
> seems to be that after unparent()/unrealize(), the Device is no
> longer visible to the guest, and thus the operation is complete
> from the perspective of management.
>
> However, there are cases where remaining host-side cleanup is also
> pertinent to management. The is generally handled by treating these
> resources as aspects of the "backend", which can be managed via
> separate interfaces/events, such as blockdev_add/del, netdev_add/del,
> object_add/del, etc, but some devices do not have this level of
> compartmentalization, namely vfio-pci, and possibly to lend themselves
> well to it.
>
> In the case of vfio-pci, the "backend" cleanup happens as part of
> the finalization of the vfio-pci device itself, in particular the
> cleanup of the VFIO group FD. Failing to wait for this cleanup can
> result in tools like libvirt attempting to rebind the device to
> the host while it's still being used by VFIO, which can result in
> host crashes or other misbehavior depending on the host driver.
>
> Deferring DEVICE_DEL still affords us the ability to manage backends
> explicitly, while also addressing cases like vfio-pci's, so we
> implement that approach here.
>
> An alternative proposal involving having VFIO emit a separate event
> to denote completion of host-side cleanup was discussed, but the
> prevailing opinion seems to be that it is not worth the added
> complexity, and leaves the issue open for other Device implementations
> solve in the future.
>
> Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
> Reviewed-by: Greg Kurz <groug@kaod.org>
> Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/core/qdev.c | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index f7c66d9bd0..8b7b8c3280 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -1068,7 +1068,6 @@ static void device_finalize(Object *obj)
> NamedGPIOList *ngl, *next;
>
> DeviceState *dev = DEVICE(obj);
> - qemu_opts_del(dev->opts);
>
> QLIST_FOREACH_SAFE(ngl, &dev->gpios, node, next) {
> QLIST_REMOVE(ngl, node);
> @@ -1079,6 +1078,18 @@ static void device_finalize(Object *obj)
> * here
> */
> }
> +
> + /* Only send event if the device had been completely realized */
> + if (dev->pending_deleted_event) {
> + g_assert(dev->canonical_path);
> +
> + qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> + &error_abort);
> + g_free(dev->canonical_path);
> + dev->canonical_path = NULL;
> + }
> +
> + qemu_opts_del(dev->opts);
> }
>
> static void device_class_base_init(ObjectClass *class, void *data)
> @@ -1108,16 +1119,6 @@ static void device_unparent(Object *obj)
> object_unref(OBJECT(dev->parent_bus));
> dev->parent_bus = NULL;
> }
> -
> - /* Only send event if the device had been completely realized */
> - if (dev->pending_deleted_event) {
> - g_assert(dev->canonical_path);
> -
> - qapi_event_send_device_deleted(!!dev->id, dev->id, dev->canonical_path,
> - &error_abort);
> - g_free(dev->canonical_path);
> - dev->canonical_path = NULL;
> - }
> }
>
> static void device_class_init(ObjectClass *class, void *data)
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2017-10-10 2:41 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-10-09 17:06 [Qemu-devel] [PATCH v2 0/3] qdev/vfio: defer DEVICE_DEL to avoid races with libvirt Michael Roth
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 1/3] qdev: store DeviceState's canonical path to use when unparenting Michael Roth
2017-10-09 18:11 ` Eric Blake
2017-10-09 19:04 ` Michael Roth
2017-10-10 1:41 ` David Gibson
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 2/3] Revert "qdev: Free QemuOpts when the QOM path goes away" Michael Roth
2017-10-10 1:42 ` David Gibson
2017-10-09 17:06 ` [Qemu-devel] [PATCH v2 3/3] qdev: defer DEVICE_DEL event until instance_finalize() Michael Roth
2017-10-10 1:44 ` David Gibson
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).