* [PATCH RFC v2 0/7] QOM: Singleton interface
@ 2024-10-29 21:16 Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 1/7] qom: Track dynamic initiations of random object class Peter Xu
` (8 more replies)
0 siblings, 9 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
This patchset introduces the singleton interface for QOM. I didn't add a
changelog because there're quite a few changes here and there, plus new
patches. So it might just be easier to re-read, considering the patchset
isn't large.
I switched v2 into RFC, because we have reviewer concerns (Phil and Dan so
far) that it could be error prone to try to trap every attempts to create
an object. My argument is, if we already have abstract class, meanwhile we
do not allow instantiation of abstract class, so the complexity is already
there. I prepared patch 1 this time to collect and track all similar
random object creations; it might be helpful as a cleanup on its own to
deduplicate some similar error messages. Said that, I'm still always open
to rejections to this proposal.
I hope v2 looks slightly cleaner by having not only object_new_allowed()
but also object_new_or_fetch().
Patch layout:
Patch 1-2: The patches to introduce QOM singleton interface
Patch 3-5: Add support for vIOMMU singleton, some qdev change needed
Patch 6-7: Add support for migration singleton, fix dangle pointer
Below are copy-paste of the commit message of patch 2, that I could have
put into the cover letter too.
====8<====
The singleton interface declares an object class which can only create one
instance globally.
Backgrounds / Use Cases
=======================
There can be some existing classes that can start to benefit from it. One
example is vIOMMU implementations.
QEMU emulated vIOMMUs are normally implemented on top of a generic device,
however it's special enough to normally only allow one instance of it for
the whole system, attached to the root complex.
These vIOMMU classes can be singletons in this case, so that QEMU can fail
or detect yet another attempt of creating such devices for more than once,
which can be fatal errors to a system.
We used to have some special code guarding it from happening. In x86,
pc_machine_device_pre_plug_cb() has code to detect when vIOMMU is created
more than once, for instance. With singleton class, logically we could
consider dropping the special code, but start to rely on QOM to make sure
there's only one vIOMMU for the whole system emulation.
There is a similar demand raising recently (even if the problem existed
over years) in migration.
Firstly, the migration object can currently be created randomly, even
though not wanted, e.g. during qom-list-properties QMP commands. Ideally,
there can be cases where we want to have an object walking over the
properties, we could use the existing migration object instead of
dynamically create one.
Meanwhile, migration has a long standing issue on current_migration
pointer, where it can point to freed data after the migration object is
finalized. It is debatable that the pointer can be cleared after the main
thread (1) join() the migration thread first, then (2) release the last
refcount for the migration object and clear the pointer. However there's
still major challenges [1]. With singleton, we could have a slightly but
hopefully working workaround to clear the pointer during finalize().
Design
======
The idea behind is pretty simple: any object that can only be created once
can now declare the TYPE_SINGLETON interface. Then, QOM facilities will
make sure it won't be created more than once for the whole QEMU lifecycle.
Whenever possible (e.g., on object_new_allowed() checks), pretty error
message will be generated to report an error. QOM also guards at the core
of object_new() so that any further violation of trying to create a
singleton more than once will crash QEMU as a programming error.
For example, qom-list-properties, device-list-properties, etc., will be
smart enough to not try to create temporary singleton objects if the class
is a singleton class and if there's existing instance created. Such usages
should be rare, and this patch introduced object_new_or_fetch() just for
it, which either create a new temp object when available, or fetch the
instance if we found an existing singleton instance. There're only two
such use cases.
Meanwhile, we also guard device-add or similar paths using the singleton
check in object_new_allowed(), so that it'll fail properly if a singleton
class instantiate more than one object.
Glib Singleton implementation
=============================
One note here to mention the Glib implementation of singletons [1].
QEMU chose not to follow Glib's implementation because Glib's version is
not thread safe on the constructor, so that two concurrent g_object_new()
on a single can race. It's not ideal to QEMU, as QEMU has to not only
support the event-driven context which is normally lock-free, but also
the case where threads are heavily used.
It could be QEMU's desire to properly support multi-threading by default on
such new interface. The "bad" side effect of that is, QEMU's object_new()
on singletons can assert failures if the singleton existed, but that's also
part of the design so as to forbid such from happening, taking which as a
programming error. Meanwhile since we have pretty ways to fail elsewhere
on qdev creations, it should already guard us in a clean way, from anywhere
that the user could try to create the singleton more than once.
The current QEMU impl also guarantees object_new() always return a newly
allocated object as long as properly returned, rather than silently return
an existing object as what Glib's impl would do. I see it a benefit, so as
to avoid unknown caller manipulate a global object, wrongly assuming it was
temporarily created.
[1] https://lore.kernel.org/qemu-devel/20190228122822.GD4970@work-vm/
[2] https://lore.kernel.org/r/ZxtqGQbd4Hq4APtm@redhat.com
Thanks,
Peter Xu (7):
qom: Track dynamic initiations of random object class
qom: TYPE_SINGLETON interface
qdev: Make device_set_realized() be fully prepared with !machine
qdev: Make qdev_get_machine() safe before machine creates
x86/iommu: Make x86-iommu a singleton object
migration: Make migration object a singleton object
migration: Reset current_migration properly
qapi/qdev.json | 2 +-
qapi/qom.json | 2 +-
include/qom/object.h | 25 ++++++++++++++++
include/qom/object_interfaces.h | 51 +++++++++++++++++++++++++++++++++
chardev/char.c | 4 +--
hw/core/cpu-common.c | 13 ++++++---
hw/core/qdev.c | 20 +++++++++++--
hw/i386/x86-iommu.c | 26 +++++++++++++++--
migration/migration.c | 36 +++++++++++++++++++++--
qom/object.c | 50 ++++++++++++++++++++++++++++++--
qom/object_interfaces.c | 33 +++++++++++++++++++--
qom/qom-qmp-cmds.c | 4 +--
system/qdev-monitor.c | 4 +--
13 files changed, 243 insertions(+), 27 deletions(-)
--
2.45.0
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH RFC v2 1/7] qom: Track dynamic initiations of random object class
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 2/7] qom: TYPE_SINGLETON interface Peter Xu
` (7 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
Add a helper object_new_allowed(), use it to track all the places in QEMU
where a new (and especially, random) object can be created.
Currently, it is some form of a cleanup, just to put together all the
errors where QEMU wants to avoid instantiations of abstract classes. The
follow up patch will add more restriction on what object we can create.
A side effect of the cleanup: we could have reported the error message in
different ways even if the reason is always the same (attempts to create an
instance for an abstract class). Now we always report the same message,
could be different from before, but hopefully still worthwhile to change.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/qom/object.h | 13 +++++++++++++
chardev/char.c | 4 +---
hw/core/cpu-common.c | 13 +++++++++----
qom/object.c | 17 +++++++++++++++--
qom/object_interfaces.c | 3 +--
system/qdev-monitor.c | 4 +---
6 files changed, 40 insertions(+), 14 deletions(-)
diff --git a/include/qom/object.h b/include/qom/object.h
index 2af9854675..32f1af2986 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -627,6 +627,19 @@ Object *object_new_with_class(ObjectClass *klass);
*/
Object *object_new(const char *typename);
+/**
+ * object_new_allowed:
+ * @klass: The class to instantiate, or fetch instance from.
+ * @errp: The pointer to an Error* that might be filled
+ *
+ * This function detects whether creating a new object of specificed class
+ * is allowed. For example, we do not allow initiations of abstract class.
+ *
+ * Returns: True if new objects allowed, false otherwise. When false is
+ * returned, errp will be set with a proper error message.
+ */
+bool object_new_allowed(ObjectClass *klass, Error **errp);
+
/**
* object_new_with_props:
* @typename: The name of the type of the object to instantiate.
diff --git a/chardev/char.c b/chardev/char.c
index a1722aa076..7fa5b82585 100644
--- a/chardev/char.c
+++ b/chardev/char.c
@@ -533,9 +533,7 @@ static const ChardevClass *char_get_class(const char *driver, Error **errp)
return NULL;
}
- if (object_class_is_abstract(oc)) {
- error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "driver",
- "a non-abstract device type");
+ if (!object_new_allowed(oc, errp)) {
return NULL;
}
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 09c7903594..1815b08ba0 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -154,12 +154,17 @@ ObjectClass *cpu_class_by_name(const char *typename, const char *cpu_model)
assert(cc->class_by_name);
assert(cpu_model);
oc = cc->class_by_name(cpu_model);
- if (object_class_dynamic_cast(oc, typename) &&
- !object_class_is_abstract(oc)) {
- return oc;
+
+ if (!object_class_dynamic_cast(oc, typename)) {
+ return NULL;
}
- return NULL;
+ /* TODO: allow error message to be passed to the callers */
+ if (!object_new_allowed(oc, NULL)) {
+ return NULL;
+ }
+
+ return oc;
}
static void cpu_common_parse_features(const char *typename, char *features,
diff --git a/qom/object.c b/qom/object.c
index 11424cf471..4f3739fd85 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -797,6 +797,19 @@ Object *object_new(const char *typename)
return object_new_with_type(ti);
}
+bool object_new_allowed(ObjectClass *klass, Error **errp)
+{
+ ERRP_GUARD();
+
+ /* Abstract classes are not instantiable */
+ if (object_class_is_abstract(klass)) {
+ error_setg(errp, "Object type '%s' is abstract",
+ klass->type->name);
+ return false;
+ }
+
+ return true;
+}
Object *object_new_with_props(const char *typename,
Object *parent,
@@ -831,10 +844,10 @@ Object *object_new_with_propv(const char *typename,
return NULL;
}
- if (object_class_is_abstract(klass)) {
- error_setg(errp, "object type '%s' is abstract", typename);
+ if (!object_new_allowed(klass, errp)) {
return NULL;
}
+
obj = object_new_with_type(klass->type);
if (!object_set_propv(obj, errp, vargs)) {
diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
index e0833c8bfe..d68faf2234 100644
--- a/qom/object_interfaces.c
+++ b/qom/object_interfaces.c
@@ -102,8 +102,7 @@ Object *user_creatable_add_type(const char *type, const char *id,
return NULL;
}
- if (object_class_is_abstract(klass)) {
- error_setg(errp, "object type '%s' is abstract", type);
+ if (!object_new_allowed(klass, errp)) {
return NULL;
}
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 44994ea0e1..5609e73635 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -255,9 +255,7 @@ static DeviceClass *qdev_get_device_class(const char **driver, Error **errp)
return NULL;
}
- if (object_class_is_abstract(oc)) {
- error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "driver",
- "a non-abstract device type");
+ if (!object_new_allowed(oc, errp)) {
return NULL;
}
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 2/7] qom: TYPE_SINGLETON interface
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 1/7] qom: Track dynamic initiations of random object class Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 3/7] qdev: Make device_set_realized() be fully prepared with !machine Peter Xu
` (6 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
This patch introduces a new QOM interface called SINGLETON.
The singleton interface declares an object class which can only create one
instance globally.
Backgrounds / Use Cases
=======================
There can be some existing classes that can start to benefit from it. One
example is vIOMMU implementations.
QEMU emulated vIOMMUs are normally implemented on top of a generic device,
however it's special enough to normally only allow one instance of it for
the whole system, attached to the root complex.
These vIOMMU classes can be singletons in this case, so that QEMU can fail
or detect yet another attempt of creating such devices for more than once,
which can be fatal errors to a system.
We used to have some special code guarding it from happening. In x86,
pc_machine_device_pre_plug_cb() has code to detect when vIOMMU is created
more than once, for instance. With singleton class, logically we could
consider dropping the special code, but start to rely on QOM to make sure
there's only one vIOMMU for the whole system emulation.
There is a similar demand raising recently (even if the problem existed
over years) in migration.
Firstly, the migration object can currently be created randomly, even
though not wanted, e.g. during qom-list-properties QMP commands. Ideally,
there can be cases where we want to have an object walking over the
properties, we could use the existing migration object instead of
dynamically create one.
Meanwhile, migration has a long standing issue on current_migration
pointer, where it can point to freed data after the migration object is
finalized. It is debatable that the pointer can be cleared after the main
thread (1) join() the migration thread first, then (2) release the last
refcount for the migration object and clear the pointer. However there's
still major challenges [1]. With singleton, we could have a slightly but
hopefully working workaround to clear the pointer during finalize().
Design
======
The idea behind is pretty simple: any object that can only be created once
can now declare the TYPE_SINGLETON interface. Then, QOM facilities will
make sure it won't be created more than once for the whole QEMU lifecycle.
Whenever possible (e.g., on object_new_allowed() checks), pretty error
message will be generated to report an error. QOM also guards at the core
of object_new() so that any further violation of trying to create a
singleton more than once will crash QEMU as a programming error.
For example, qom-list-properties, device-list-properties, etc., will be
smart enough to not try to create temporary singleton objects if the class
is a singleton class and if there's existing instance created. Such usages
should be rare, and this patch introduced object_new_or_fetch() just for
it, which either create a new temp object when available, or fetch the
instance if we found an existing singleton instance. There're only two
such use cases.
Meanwhile, we also guard device-add or similar paths using the singleton
check in object_new_allowed(), so that it'll fail properly if a singleton
class instantiate more than one object.
Glib Singleton implementation
=============================
One note here to mention the Glib implementation of singletons [1].
QEMU chose not to follow Glib's implementation because Glib's version is
not thread safe on the constructor, so that two concurrent g_object_new()
on a single can race. It's not ideal to QEMU, as QEMU has to not only
support the event-driven context which is normally lock-free, but also
the case where threads are heavily used.
It could be QEMU's desire to properly support multi-threading by default on
such new interface. The "bad" side effect of that is, QEMU's object_new()
on singletons can assert failures if the singleton existed, but that's also
part of the design so as to forbid such from happening, taking which as a
programming error. Meanwhile since we have pretty ways to fail elsewhere
on qdev creations, it should already guard us in a clean way, from anywhere
that the user could try to create the singleton more than once.
The current QEMU impl also guarantees object_new() always return a newly
allocated object as long as properly returned, rather than silently return
an existing object as what Glib's impl would do. I see it a benefit, so as
to avoid unknown caller manipulate a global object, wrongly assuming it was
temporarily created.
[1] https://lore.kernel.org/qemu-devel/20190228122822.GD4970@work-vm/
[2] https://lore.kernel.org/r/ZxtqGQbd4Hq4APtm@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
---
qapi/qdev.json | 2 +-
qapi/qom.json | 2 +-
include/qom/object.h | 12 ++++++++
include/qom/object_interfaces.h | 51 +++++++++++++++++++++++++++++++++
qom/object.c | 33 +++++++++++++++++++++
qom/object_interfaces.c | 30 +++++++++++++++++++
qom/qom-qmp-cmds.c | 4 +--
7 files changed, 130 insertions(+), 4 deletions(-)
diff --git a/qapi/qdev.json b/qapi/qdev.json
index 53d147c7b4..f9ca1aa1df 100644
--- a/qapi/qdev.json
+++ b/qapi/qdev.json
@@ -22,7 +22,7 @@
#
# .. note:: Objects can create properties at runtime, for example to
# describe links between different devices and/or objects. These
-# properties are not included in the output of this command.
+# properties may or may not be included in the output of this command.
#
# Since: 1.2
##
diff --git a/qapi/qom.json b/qapi/qom.json
index 321ccd708a..00345b0715 100644
--- a/qapi/qom.json
+++ b/qapi/qom.json
@@ -197,7 +197,7 @@
#
# .. note:: Objects can create properties at runtime, for example to
# describe links between different devices and/or objects. These
-# properties are not included in the output of this command.
+# properties may or may not be included in the output of this command.
#
# Returns: a list of ObjectPropertyInfo describing object properties
#
diff --git a/include/qom/object.h b/include/qom/object.h
index 32f1af2986..e0d4173dfd 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -640,6 +640,18 @@ Object *object_new(const char *typename);
*/
bool object_new_allowed(ObjectClass *klass, Error **errp);
+/**
+ * object_new_or_fetch:
+ * @klass: The class to instantiate, or fetch instance from.
+ *
+ * This function will either initialize a new object using heap allocated
+ * memory, or fetch one object if the object is a singleton and the only
+ * instance has already been created.
+ *
+ * Returns: The fetched object (either newly initiated or existing).
+ */
+Object *object_new_or_fetch(ObjectClass *klass);
+
/**
* object_new_with_props:
* @typename: The name of the type of the object to instantiate.
diff --git a/include/qom/object_interfaces.h b/include/qom/object_interfaces.h
index 02b11a7ef0..48b77cf6c4 100644
--- a/include/qom/object_interfaces.h
+++ b/include/qom/object_interfaces.h
@@ -177,4 +177,55 @@ bool user_creatable_del(const char *id, Error **errp);
*/
void user_creatable_cleanup(void);
+#define TYPE_SINGLETON "singleton"
+
+typedef struct SingletonClass SingletonClass;
+DECLARE_CLASS_CHECKERS(SingletonClass, SINGLETON, TYPE_SINGLETON)
+
+/**
+ * SingletonClass:
+ *
+ * @parent_class: the base class
+ * @get_instance: fetch the singleton instance and elevate its refcount if
+ * it is created, NULL otherwise.
+ *
+ * Singleton class describes the type of object classes that can only
+ * provide one instance for the whole lifecycle of QEMU. It will fail the
+ * operation if one attemps to create more than one instance.
+ *
+ * One can fetch the single object using class's get_instance() callback if
+ * it was created before. This can be useful for operations like QMP
+ * qom-list-properties, where dynamically creating an object might not be
+ * feasible.
+ *
+ * Classes with TYPE_SINGLETON must provide get_instance() implementation,
+ * make sure the refcount is elevanted before returning, so that the
+ * instance (if existed) can never be freed concurrently once returned.
+ */
+struct SingletonClass {
+ /* <private> */
+ InterfaceClass parent_class;
+ /* <public> */
+ Object *(*get_instance)(void);
+};
+
+/**
+ * object_class_is_singleton:
+ *
+ * @class: the class to detect singleton
+ *
+ * Returns: true if it's a singleton class, false otherwise.
+ */
+bool object_class_is_singleton(ObjectClass *class);
+
+/**
+ * singleton_get_instance:
+ *
+ * @class: the class to fetch singleton instance
+ *
+ * Returns: the Object with elevated refcount if the class is a singleton
+ * class and the singleton object is created, NULL otherwise.
+ */
+Object *singleton_get_instance(ObjectClass *class);
+
#endif
diff --git a/qom/object.c b/qom/object.c
index 4f3739fd85..c7c6f3e397 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -553,6 +553,9 @@ static void object_initialize_with_type(Object *obj, size_t size, TypeImpl *type
g_assert(type->abstract == false);
g_assert(size >= type->instance_size);
+ /* Singleton class can only create one object */
+ g_assert(!singleton_get_instance(type->class));
+
memset(obj, 0, type->instance_size);
obj->class = type->class;
object_ref(obj);
@@ -808,9 +811,39 @@ bool object_new_allowed(ObjectClass *klass, Error **errp)
return false;
}
+ if (object_class_is_singleton(klass)) {
+ Object *obj = singleton_get_instance(klass);
+
+ if (obj) {
+ object_unref(obj);
+ /*
+ * TODO: Enhance the error message. E.g., the singleton class
+ * can provide a per-class error message in SingletonClass.
+ */
+ error_setg(errp, "Object type '%s' conflicts with "
+ "an existing singleton instance",
+ klass->type->name);
+ return false;
+ }
+ }
+
return true;
}
+Object *object_new_or_fetch(ObjectClass *klass)
+{
+ Object *obj;
+
+ if (object_class_is_singleton(klass)) {
+ obj = singleton_get_instance(klass);
+ if (obj) {
+ return obj;
+ }
+ }
+
+ return object_new_with_class(klass);
+}
+
Object *object_new_with_props(const char *typename,
Object *parent,
const char *id,
diff --git a/qom/object_interfaces.c b/qom/object_interfaces.c
index d68faf2234..548e1f344f 100644
--- a/qom/object_interfaces.c
+++ b/qom/object_interfaces.c
@@ -353,6 +353,29 @@ void user_creatable_cleanup(void)
object_unparent(object_get_objects_root());
}
+bool object_class_is_singleton(ObjectClass *class)
+{
+ return !!object_class_dynamic_cast(class, TYPE_SINGLETON);
+}
+
+Object *singleton_get_instance(ObjectClass *class)
+{
+ SingletonClass *singleton =
+ (SingletonClass *)object_class_dynamic_cast(class, TYPE_SINGLETON);
+
+ if (!singleton) {
+ return NULL;
+ }
+
+ /*
+ * NOTE: it's only safe to do object_ref() in the ->get_instance()
+ * callback, because logically speaking any object* returned here might
+ * get free()ed concurrently if without the elevated refcount. The
+ * hooks need to provide proper locking if a race condition is possible.
+ */
+ return singleton->get_instance();
+}
+
static void register_types(void)
{
static const TypeInfo uc_interface_info = {
@@ -361,7 +384,14 @@ static void register_types(void)
.class_size = sizeof(UserCreatableClass),
};
+ static const TypeInfo singleton_interface_info = {
+ .name = TYPE_SINGLETON,
+ .parent = TYPE_INTERFACE,
+ .class_size = sizeof(SingletonClass),
+ };
+
type_register_static(&uc_interface_info);
+ type_register_static(&singleton_interface_info);
}
type_init(register_types)
diff --git a/qom/qom-qmp-cmds.c b/qom/qom-qmp-cmds.c
index e91a235347..3aa51c4c95 100644
--- a/qom/qom-qmp-cmds.c
+++ b/qom/qom-qmp-cmds.c
@@ -141,7 +141,7 @@ ObjectPropertyInfoList *qmp_device_list_properties(const char *typename,
return NULL;
}
- obj = object_new(typename);
+ obj = object_new_or_fetch(klass);
object_property_iter_init(&iter, obj);
while ((prop = object_property_iter_next(&iter))) {
@@ -202,7 +202,7 @@ ObjectPropertyInfoList *qmp_qom_list_properties(const char *typename,
if (object_class_is_abstract(klass)) {
object_class_property_iter_init(&iter, klass);
} else {
- obj = object_new(typename);
+ obj = object_new_or_fetch(klass);
object_property_iter_init(&iter, obj);
}
while ((prop = object_property_iter_next(&iter))) {
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 3/7] qdev: Make device_set_realized() be fully prepared with !machine
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 1/7] qom: Track dynamic initiations of random object class Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 2/7] qom: TYPE_SINGLETON interface Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 4/7] qdev: Make qdev_get_machine() safe before machine creates Peter Xu
` (5 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
We're going to enable qdev_get_machine() to work even before machine is
created. Make device_set_realized() be prepared with it.
Currently, a device can be realized even before machine is created, but
only in one of QEMU's qtest, test-global-qdev-props.c.
Right now, the test_static_prop_subprocess() test (which creates one simple
object without machine created) will internally make "/machine" to be a
container. Now explicitly support that case when there's no real
"/machine" object around, then unattached devices will be put under
root ("/") rather than "/machine". Mostly only for this test case, or for
any future test cases only.
Note that this shouldn't affect anything else that relies on a real machine
being there but only unit tests like mentioned, because if "/machine" is
created as a container as of now, it'll fail QEMU very soon later on
qemu_create_machine() trying to create the real machine, seeing that
there's a conflict.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/core/qdev.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index db36f54d91..5c83f48b33 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -490,9 +490,17 @@ static void device_set_realized(Object *obj, bool value, Error **errp)
if (!obj->parent) {
gchar *name = g_strdup_printf("device[%d]", unattached_count++);
+ Object *root = qdev_get_machine();
- object_property_add_child(container_get(qdev_get_machine(),
- "/unattached"),
+ /*
+ * We could have qdev test cases trying to realize() a device
+ * without machine created. In that case we use the root.
+ */
+ if (!root) {
+ root = object_get_root();
+ }
+
+ object_property_add_child(container_get(root, "/unattached"),
name, obj);
unattached_parent = true;
g_free(name);
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 4/7] qdev: Make qdev_get_machine() safe before machine creates
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (2 preceding siblings ...)
2024-10-29 21:16 ` [PATCH RFC v2 3/7] qdev: Make device_set_realized() be fully prepared with !machine Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object Peter Xu
` (4 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
qdev_get_machine() is the helper that QEMU heavily uses in most places to
fetch the current machine object after it's created. It can only be called
after the machine is created as of now, otherwise a container can be
wrongly created at path "/machine", and that could crash QEMU later.
It's not an issue for now, because all code paths will currently make sure
this helper won't be called too early, e.g., before the machine object is
properly created and attached under the object root path.
This patch makes this behavior more predictable, by never trying to wrongly
create a container if the object is missing. This enables the helper to be
used even before the machine is created, as long as the caller can properly
handle a NULL return (which says, "machine is not yet created").
No functional change intended as of now, but will start to make use of it
in later patches, where qdev_get_machine() can start to be use before
machine creations.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/core/qdev.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 5c83f48b33..c867aed28a 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -840,7 +840,13 @@ Object *qdev_get_machine(void)
static Object *dev;
if (dev == NULL) {
- dev = container_get(object_get_root(), "/machine");
+ /*
+ * NOTE: dev can keep being NULL if machine is not yet created!
+ * In which case the function will properly return NULL.
+ *
+ * Whenever machine object is created and found once, we cache it.
+ */
+ dev = object_resolve_path_component(object_get_root(), "machine");
}
return dev;
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (3 preceding siblings ...)
2024-10-29 21:16 ` [PATCH RFC v2 4/7] qdev: Make qdev_get_machine() safe before machine creates Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-30 10:33 ` Daniel P. Berrangé
2024-10-29 21:16 ` [PATCH RFC v2 6/7] migration: Make migration object " Peter Xu
` (3 subsequent siblings)
8 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
X86 IOMMUs cannot be created more than one on a system yet. Make it a
singleton so it guards the system from accidentally create yet another
IOMMU object when one already presents.
Now if someone tries to create more than one, e.g., via:
./qemu -M q35 -device intel-iommu -device intel-iommu
The error will change from:
qemu-system-x86_64: -device intel-iommu: QEMU does not support multiple vIOMMUs for x86 yet.
To:
qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only supports one instance
Unfortunately, yet we can't remove the singleton check in the machine
hook (pc_machine_device_pre_plug_cb), because there can also be
virtio-iommu involved, which doesn't share a common parent class yet.
But with this, it should be closer to reach that goal to check singleton by
QOM one day.
Note: after this patch, x86_iommu_get_default() might be called very early,
even before machine is created (e.g., cmdline "-device intel-iommu,help").
In this case we need to be prepared with such, and that also means the
default iommu is not yet around.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
hw/i386/x86-iommu.c | 26 +++++++++++++++++++++++---
1 file changed, 23 insertions(+), 3 deletions(-)
diff --git a/hw/i386/x86-iommu.c b/hw/i386/x86-iommu.c
index 60af896225..ec45b80306 100644
--- a/hw/i386/x86-iommu.c
+++ b/hw/i386/x86-iommu.c
@@ -26,6 +26,7 @@
#include "qemu/error-report.h"
#include "trace.h"
#include "sysemu/kvm.h"
+#include "qom/object_interfaces.h"
void x86_iommu_iec_register_notifier(X86IOMMUState *iommu,
iec_notify_fn fn, void *data)
@@ -79,9 +80,15 @@ void x86_iommu_irq_to_msi_message(X86IOMMUIrq *irq, MSIMessage *msg_out)
X86IOMMUState *x86_iommu_get_default(void)
{
- MachineState *ms = MACHINE(qdev_get_machine());
- PCMachineState *pcms =
- PC_MACHINE(object_dynamic_cast(OBJECT(ms), TYPE_PC_MACHINE));
+ Object *machine = qdev_get_machine();
+ PCMachineState *pcms;
+
+ /* If machine has not been created, so is the vIOMMU */
+ if (!machine) {
+ return NULL;
+ }
+
+ pcms = PC_MACHINE(object_dynamic_cast(machine, TYPE_PC_MACHINE));
if (pcms &&
object_dynamic_cast(OBJECT(pcms->iommu), TYPE_X86_IOMMU_DEVICE)) {
@@ -133,10 +140,19 @@ static Property x86_iommu_properties[] = {
DEFINE_PROP_END_OF_LIST(),
};
+static Object *x86_iommu_get_instance(void)
+{
+ return object_ref(OBJECT(x86_iommu_get_default()));
+}
+
static void x86_iommu_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
+ SingletonClass *singleton = SINGLETON_CLASS(klass);
+
dc->realize = x86_iommu_realize;
+ singleton->get_instance = x86_iommu_get_instance;
+
device_class_set_props(dc, x86_iommu_properties);
}
@@ -152,6 +168,10 @@ static const TypeInfo x86_iommu_info = {
.class_init = x86_iommu_class_init,
.class_size = sizeof(X86IOMMUClass),
.abstract = true,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_SINGLETON },
+ { }
+ }
};
static void x86_iommu_register_types(void)
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 6/7] migration: Make migration object a singleton object
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (4 preceding siblings ...)
2024-10-29 21:16 ` [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 7/7] migration: Reset current_migration properly Peter Xu
` (2 subsequent siblings)
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
This makes the migration object a singleton unit. After this, we can do
something slightly tricky later on with the guarantee that nobody will be
able to create the object twice.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/migration/migration.c b/migration/migration.c
index 021faee2f3..f4456f7142 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -45,6 +45,7 @@
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qnull.h"
#include "qemu/rcu.h"
+#include "qom/object_interfaces.h"
#include "postcopy-ram.h"
#include "qemu/thread.h"
#include "trace.h"
@@ -3833,11 +3834,19 @@ fail:
migrate_fd_cleanup(s);
}
+static Object *migration_get_instance(void)
+{
+ return object_ref(OBJECT(current_migration));
+}
+
static void migration_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
+ SingletonClass *singleton = SINGLETON_CLASS(klass);
dc->user_creatable = false;
+ singleton->get_instance = migration_get_instance;
+
device_class_set_props(dc, migration_properties);
}
@@ -3910,6 +3919,10 @@ static const TypeInfo migration_type = {
.instance_size = sizeof(MigrationState),
.instance_init = migration_instance_init,
.instance_finalize = migration_instance_finalize,
+ .interfaces = (InterfaceInfo[]) {
+ { TYPE_SINGLETON },
+ { }
+ }
};
static void register_migration_types(void)
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH RFC v2 7/7] migration: Reset current_migration properly
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (5 preceding siblings ...)
2024-10-29 21:16 ` [PATCH RFC v2 6/7] migration: Make migration object " Peter Xu
@ 2024-10-29 21:16 ` Peter Xu
2024-10-30 9:48 ` [PATCH RFC v2 0/7] QOM: Singleton interface Daniel P. Berrangé
2024-10-30 18:07 ` Daniel P. Berrangé
8 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-29 21:16 UTC (permalink / raw)
To: qemu-devel
Cc: Markus Armbruster, Peter Maydell, Eduardo Habkost, Paolo Bonzini,
Mark Cave-Ayland, Igor Mammedov, Michael S . Tsirkin,
Alex Williamson, Dr . David Alan Gilbert, peterx,
Daniel P . Berrangé, Philippe Mathieu-Daudé,
Cédric Le Goater, Fabiano Rosas, Juraj Marcin
current_migration is never reset, even if the migration object is freed
already. It means anyone references that can trigger UAF and it'll be hard
to debug.
Properly clear the pointer now. So far the only place to do is via its own
finalize(), which means QEMU is releasing the last refcount and right
before freeing the object memory. Meanwhile, QEMU won't know who holds the
last refcount, so it can't reset the variable manually / explicitly.
To make it more readable, also initialize the variable in the
instance_init() so it's very well paired at least.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 23 ++++++++++++++++++++---
1 file changed, 20 insertions(+), 3 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index f4456f7142..70b9ef8228 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -233,9 +233,11 @@ static int migration_stop_vm(MigrationState *s, RunState state)
void migration_object_init(void)
{
- /* This can only be called once. */
- assert(!current_migration);
- current_migration = MIGRATION_OBJ(object_new(TYPE_MIGRATION));
+ /* This creates the singleton migration object */
+ object_new(TYPE_MIGRATION);
+
+ /* This should be set now when initialize the singleton object */
+ assert(current_migration);
/*
* Init the migrate incoming object as well no matter whether
@@ -3864,12 +3866,27 @@ static void migration_instance_finalize(Object *obj)
qemu_sem_destroy(&ms->rp_state.rp_pong_acks);
qemu_sem_destroy(&ms->postcopy_qemufile_src_sem);
error_free(ms->error);
+
+ /*
+ * We know we only have one instance of migration, and when reaching
+ * here it means migration object is going away. Clear the global
+ * reference to reflect that.
+ */
+ current_migration = NULL;
}
static void migration_instance_init(Object *obj)
{
MigrationState *ms = MIGRATION_OBJ(obj);
+ /*
+ * There can only be one migration object globally. Keep a record of
+ * the pointer in current_migration, which will be reset after the
+ * object finalize().
+ */
+ assert(!current_migration);
+ current_migration = ms;
+
ms->state = MIGRATION_STATUS_NONE;
ms->mbps = -1;
ms->pages_per_second = -1;
--
2.45.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (6 preceding siblings ...)
2024-10-29 21:16 ` [PATCH RFC v2 7/7] migration: Reset current_migration properly Peter Xu
@ 2024-10-30 9:48 ` Daniel P. Berrangé
2024-10-30 13:13 ` Peter Xu
2024-10-30 18:07 ` Daniel P. Berrangé
8 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 9:48 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> Meanwhile, migration has a long standing issue on current_migration
> pointer, where it can point to freed data after the migration object is
> finalized. It is debatable that the pointer can be cleared after the main
> thread (1) join() the migration thread first, then (2) release the last
> refcount for the migration object and clear the pointer. However there's
> still major challenges [1]. With singleton, we could have a slightly but
> hopefully working workaround to clear the pointer during finalize().
I'm still not entirely convinced that this singleton proposal is
fixing the migration problem correctly.
Based on discussions in v1, IIUC, the situation is that we have
migration_shutdown() being called from qemu_cleanup(). The former
will call object_unref(current_migration), but there may still
be background migration threads running that access 'current_migration',
and thus a potential use-after-free.
Based on what the 7th patch here does, the key difference is that
the finalize() method for MigrationState will set 'current_migration'
to NULL after free'ing it.
I don't believe that is safe.
Back to the current code, if there is a use-after-free today, that
implies that the background threads are *not* holding their own
reference on 'current_migration', allowing the object to be free'd
while they're still using it. If they held their own reference then
the object_unref in migration_shutdown would not have any use after
free risk.
The new code is not changing the ref counting done by any threads.
Therefore if there's a use-after-free in existing code, AFAICT, the
same use-after-free *must* still exist in the current code.
The 7th patch only fixes the use-after-free, *if and only if* the
background thread tries to access 'current_migration', /after/
finalize as completed. The use-after-free in this case, has been
turned into a NULL pointer reference.
A background thread could be accessing the 'current_migration' pointer
*concurrently* with the finalize method executing though. In this case
we still have a use after free problem, only the time window in which
it exists has been narrowed a little.
Shouldn't the problem with migration be solved by every migration thread
holding a reference on current_migration, that the thread releases when
it exits, such that MigrationState is only finalized once every thread
has exited ? That would not require any join() synchronization point.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object
2024-10-29 21:16 ` [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object Peter Xu
@ 2024-10-30 10:33 ` Daniel P. Berrangé
2024-10-30 13:01 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 10:33 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Tue, Oct 29, 2024 at 05:16:05PM -0400, Peter Xu wrote:
> X86 IOMMUs cannot be created more than one on a system yet. Make it a
> singleton so it guards the system from accidentally create yet another
> IOMMU object when one already presents.
>
> Now if someone tries to create more than one, e.g., via:
>
> ./qemu -M q35 -device intel-iommu -device intel-iommu
>
> The error will change from:
>
> qemu-system-x86_64: -device intel-iommu: QEMU does not support multiple vIOMMUs for x86 yet.
>
> To:
>
> qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only supports one instance
>
> Unfortunately, yet we can't remove the singleton check in the machine
> hook (pc_machine_device_pre_plug_cb), because there can also be
> virtio-iommu involved, which doesn't share a common parent class yet.
>
> But with this, it should be closer to reach that goal to check singleton by
> QOM one day.
Looking at the other iommu impls, I noticed that they all have something
in common, in that they call pci_setup_iommu from their realize()
function to register their set of callback functions.
This pci_setup_iommu can happily be called multiple times and just
over-writes previously registered callbacks. I wonder if this is a better
place to diagnose incorrect usage of multiple impls. If pci_setup_iommu
raised an error, it wouldn't matter that virtio-iommu doesn't share
a common parent with intel-iommu. This would also perhaps be better for
a future heterogeneous machine types, where it might be valid to have
multiple iommus concurrently. Checking at the resource setup/registration
point reflects where the physical constraint comes from.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object
2024-10-30 10:33 ` Daniel P. Berrangé
@ 2024-10-30 13:01 ` Peter Xu
2024-10-30 13:07 ` Daniel P. Berrangé
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-10-30 13:01 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 10:33:23AM +0000, Daniel P. Berrangé wrote:
> On Tue, Oct 29, 2024 at 05:16:05PM -0400, Peter Xu wrote:
> > X86 IOMMUs cannot be created more than one on a system yet. Make it a
> > singleton so it guards the system from accidentally create yet another
> > IOMMU object when one already presents.
> >
> > Now if someone tries to create more than one, e.g., via:
> >
> > ./qemu -M q35 -device intel-iommu -device intel-iommu
> >
> > The error will change from:
> >
> > qemu-system-x86_64: -device intel-iommu: QEMU does not support multiple vIOMMUs for x86 yet.
> >
> > To:
> >
> > qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only supports one instance
> >
> > Unfortunately, yet we can't remove the singleton check in the machine
> > hook (pc_machine_device_pre_plug_cb), because there can also be
> > virtio-iommu involved, which doesn't share a common parent class yet.
> >
> > But with this, it should be closer to reach that goal to check singleton by
> > QOM one day.
>
> Looking at the other iommu impls, I noticed that they all have something
> in common, in that they call pci_setup_iommu from their realize()
> function to register their set of callback functions.
>
> This pci_setup_iommu can happily be called multiple times and just
> over-writes previously registered callbacks. I wonder if this is a better
> place to diagnose incorrect usage of multiple impls. If pci_setup_iommu
> raised an error, it wouldn't matter that virtio-iommu doesn't share
> a common parent with intel-iommu. This would also perhaps be better for
> a future heterogeneous machine types, where it might be valid to have
> multiple iommus concurrently. Checking at the resource setup/registration
> point reflects where the physical constraint comes from.
There can still be side effects that vIOMMU code, at least so far, consider
it the only object even during init/realize. E.g. vtd_decide_config() has
kvm_enable_x2apic() calls which we definitely don't want to be triggered
during machine running. The pci_setup_iommu() idea could work indeed but
it might still need cleanups here and there all over the places.
In general, I was trying to have this as a show case, so in this case it's
indeed not required. But I wonder whether other devices / objects can also
benefit from it, knowing some of them can only have one instance. I used
to believe it could be pretty common in QEMU, but maybe I'm wrong. If
there's none of such elsewhere, then I agree the singleton idea may not be
that helpful.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object
2024-10-30 13:01 ` Peter Xu
@ 2024-10-30 13:07 ` Daniel P. Berrangé
2024-10-30 14:33 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 13:07 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 09:01:03AM -0400, Peter Xu wrote:
> On Wed, Oct 30, 2024 at 10:33:23AM +0000, Daniel P. Berrangé wrote:
> > On Tue, Oct 29, 2024 at 05:16:05PM -0400, Peter Xu wrote:
> > > X86 IOMMUs cannot be created more than one on a system yet. Make it a
> > > singleton so it guards the system from accidentally create yet another
> > > IOMMU object when one already presents.
> > >
> > > Now if someone tries to create more than one, e.g., via:
> > >
> > > ./qemu -M q35 -device intel-iommu -device intel-iommu
> > >
> > > The error will change from:
> > >
> > > qemu-system-x86_64: -device intel-iommu: QEMU does not support multiple vIOMMUs for x86 yet.
> > >
> > > To:
> > >
> > > qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only supports one instance
> > >
> > > Unfortunately, yet we can't remove the singleton check in the machine
> > > hook (pc_machine_device_pre_plug_cb), because there can also be
> > > virtio-iommu involved, which doesn't share a common parent class yet.
> > >
> > > But with this, it should be closer to reach that goal to check singleton by
> > > QOM one day.
> >
> > Looking at the other iommu impls, I noticed that they all have something
> > in common, in that they call pci_setup_iommu from their realize()
> > function to register their set of callback functions.
> >
> > This pci_setup_iommu can happily be called multiple times and just
> > over-writes previously registered callbacks. I wonder if this is a better
> > place to diagnose incorrect usage of multiple impls. If pci_setup_iommu
> > raised an error, it wouldn't matter that virtio-iommu doesn't share
> > a common parent with intel-iommu. This would also perhaps be better for
> > a future heterogeneous machine types, where it might be valid to have
> > multiple iommus concurrently. Checking at the resource setup/registration
> > point reflects where the physical constraint comes from.
>
> There can still be side effects that vIOMMU code, at least so far, consider
> it the only object even during init/realize. E.g. vtd_decide_config() has
> kvm_enable_x2apic() calls which we definitely don't want to be triggered
> during machine running. The pci_setup_iommu() idea could work indeed but
> it might still need cleanups here and there all over the places.
The side effects surely don't matter, because when we hit the error
scenario, we'll propagate that up the stack until something calls
exit(), since this is a cold boot path, rather than hotplug ?
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 9:48 ` [PATCH RFC v2 0/7] QOM: Singleton interface Daniel P. Berrangé
@ 2024-10-30 13:13 ` Peter Xu
2024-10-30 16:13 ` Daniel P. Berrangé
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-10-30 13:13 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 09:48:07AM +0000, Daniel P. Berrangé wrote:
> On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
>
> > Meanwhile, migration has a long standing issue on current_migration
> > pointer, where it can point to freed data after the migration object is
> > finalized. It is debatable that the pointer can be cleared after the main
> > thread (1) join() the migration thread first, then (2) release the last
> > refcount for the migration object and clear the pointer. However there's
> > still major challenges [1]. With singleton, we could have a slightly but
> > hopefully working workaround to clear the pointer during finalize().
>
> I'm still not entirely convinced that this singleton proposal is
> fixing the migration problem correctly.
>
> Based on discussions in v1, IIUC, the situation is that we have
> migration_shutdown() being called from qemu_cleanup(). The former
> will call object_unref(current_migration), but there may still
> be background migration threads running that access 'current_migration',
> and thus a potential use-after-free.
migration thread is fine, it takes a refcount at the entry.
And btw, taking it at the entry is racy, we've just fixed it, see (in my
next migration pull):
https://lore.kernel.org/qemu-devel/20241024213056.1395400-2-peterx@redhat.com/
The access reported was, IIUC, outside migration code, but after both
main/migration threads released the refcount, hence after finalize(). It
could be a random migration_is_running() call very late in device code, for
example.
>
> Based on what the 7th patch here does, the key difference is that
> the finalize() method for MigrationState will set 'current_migration'
> to NULL after free'ing it.
Yes. But this show case series isn't complete. We need a migration-side
lock finally to make it safe to access. For that, see:
https://lore.kernel.org/qemu-devel/20241024213056.1395400-9-peterx@redhat.com/
>
> I don't believe that is safe.
I hope after the other series applied it will be 100% safe, even though I
agree it's tricky. But hopefully QOM is very clean, the trickly part is
still within migration, and it should be less tricky than migration
implement a refcount on top of Object..
>
> Back to the current code, if there is a use-after-free today, that
> implies that the background threads are *not* holding their own
> reference on 'current_migration', allowing the object to be free'd
> while they're still using it. If they held their own reference then
> the object_unref in migration_shutdown would not have any use after
> free risk.
>
> The new code is not changing the ref counting done by any threads.
> Therefore if there's a use-after-free in existing code, AFAICT, the
> same use-after-free *must* still exist in the current code.
>
> The 7th patch only fixes the use-after-free, *if and only if* the
> background thread tries to access 'current_migration', /after/
> finalize as completed. The use-after-free in this case, has been
> turned into a NULL pointer reference.
>
> A background thread could be accessing the 'current_migration' pointer
> *concurrently* with the finalize method executing though. In this case
> we still have a use after free problem, only the time window in which
> it exists has been narrowed a little.
>
> Shouldn't the problem with migration be solved by every migration thread
> holding a reference on current_migration, that the thread releases when
> it exits, such that MigrationState is only finalized once every thread
> has exited ? That would not require any join() synchronization point.
I think the question is whether things like migration_is_running() is
allowed to be used anywhere, even after migration_shutdown(). My answer
is, it should be ok to be used anywhere, and we don't necessarilly need to
limit that. In that case the caller doesn't need to take a refcount
because it's an immediate query. It can simply check its existance with
the lock (after my patch 8 of the other series applied, which depends on
this qom series).
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object
2024-10-30 13:07 ` Daniel P. Berrangé
@ 2024-10-30 14:33 ` Peter Xu
0 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-30 14:33 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: QEMU Developers, Markus Armbruster, Peter Maydell,
Eduardo Habkost, Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
[-- Attachment #1: Type: text/plain, Size: 3288 bytes --]
On Wed, Oct 30, 2024, 9:08 a.m. Daniel P. Berrangé <berrange@redhat.com>
wrote:
> On Wed, Oct 30, 2024 at 09:01:03AM -0400, Peter Xu wrote:
> > On Wed, Oct 30, 2024 at 10:33:23AM +0000, Daniel P. Berrangé wrote:
> > > On Tue, Oct 29, 2024 at 05:16:05PM -0400, Peter Xu wrote:
> > > > X86 IOMMUs cannot be created more than one on a system yet. Make it
> a
> > > > singleton so it guards the system from accidentally create yet
> another
> > > > IOMMU object when one already presents.
> > > >
> > > > Now if someone tries to create more than one, e.g., via:
> > > >
> > > > ./qemu -M q35 -device intel-iommu -device intel-iommu
> > > >
> > > > The error will change from:
> > > >
> > > > qemu-system-x86_64: -device intel-iommu: QEMU does not support
> multiple vIOMMUs for x86 yet.
> > > >
> > > > To:
> > > >
> > > > qemu-system-x86_64: -device intel-iommu: Class 'intel-iommu' only
> supports one instance
> > > >
> > > > Unfortunately, yet we can't remove the singleton check in the machine
> > > > hook (pc_machine_device_pre_plug_cb), because there can also be
> > > > virtio-iommu involved, which doesn't share a common parent class yet.
> > > >
> > > > But with this, it should be closer to reach that goal to check
> singleton by
> > > > QOM one day.
> > >
> > > Looking at the other iommu impls, I noticed that they all have
> something
> > > in common, in that they call pci_setup_iommu from their realize()
> > > function to register their set of callback functions.
> > >
> > > This pci_setup_iommu can happily be called multiple times and just
> > > over-writes previously registered callbacks. I wonder if this is a
> better
> > > place to diagnose incorrect usage of multiple impls. If pci_setup_iommu
> > > raised an error, it wouldn't matter that virtio-iommu doesn't share
> > > a common parent with intel-iommu. This would also perhaps be better for
> > > a future heterogeneous machine types, where it might be valid to have
> > > multiple iommus concurrently. Checking at the resource
> setup/registration
> > > point reflects where the physical constraint comes from.
> >
> > There can still be side effects that vIOMMU code, at least so far,
> consider
> > it the only object even during init/realize. E.g. vtd_decide_config()
> has
> > kvm_enable_x2apic() calls which we definitely don't want to be triggered
> > during machine running. The pci_setup_iommu() idea could work indeed but
> > it might still need cleanups here and there all over the places.
>
> The side effects surely don't matter, because when we hit the error
> scenario, we'll propagate that up the stack until something calls
> exit(), since this is a cold boot path, rather than hotplug ?
>
Yes, intel iommus are not hot pluggable so it shouldn't be a major concern.
But my point is we could have similar devices that either operate on
globals or system wide behaviors. Singleton may properly protect it from
ever being created.
> With regards,
> Daniel
> --
> |: https://berrange.com -o-
> https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o-
> https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o-
> https://www.instagram.com/dberrange :|
>
>
[-- Attachment #2: Type: text/html, Size: 4797 bytes --]
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 13:13 ` Peter Xu
@ 2024-10-30 16:13 ` Daniel P. Berrangé
2024-10-30 17:51 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 16:13 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 09:13:13AM -0400, Peter Xu wrote:
> On Wed, Oct 30, 2024 at 09:48:07AM +0000, Daniel P. Berrangé wrote:
> > On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> >
> > > Meanwhile, migration has a long standing issue on current_migration
> > > pointer, where it can point to freed data after the migration object is
> > > finalized. It is debatable that the pointer can be cleared after the main
> > > thread (1) join() the migration thread first, then (2) release the last
> > > refcount for the migration object and clear the pointer. However there's
> > > still major challenges [1]. With singleton, we could have a slightly but
> > > hopefully working workaround to clear the pointer during finalize().
> >
> > I'm still not entirely convinced that this singleton proposal is
> > fixing the migration problem correctly.
> >
> > Based on discussions in v1, IIUC, the situation is that we have
> > migration_shutdown() being called from qemu_cleanup(). The former
> > will call object_unref(current_migration), but there may still
> > be background migration threads running that access 'current_migration',
> > and thus a potential use-after-free.
>
> migration thread is fine, it takes a refcount at the entry.
>
> And btw, taking it at the entry is racy, we've just fixed it, see (in my
> next migration pull):
>
> https://lore.kernel.org/qemu-devel/20241024213056.1395400-2-peterx@redhat.com/
Yep, acquiring the refcount immediately before thread-create
is what I meant.
> The access reported was, IIUC, outside migration code, but after both
> main/migration threads released the refcount, hence after finalize(). It
> could be a random migration_is_running() call very late in device code, for
> example.
>
> >
> > Based on what the 7th patch here does, the key difference is that
> > the finalize() method for MigrationState will set 'current_migration'
> > to NULL after free'ing it.
>
> Yes. But this show case series isn't complete. We need a migration-side
> lock finally to make it safe to access. For that, see:
>
> https://lore.kernel.org/qemu-devel/20241024213056.1395400-9-peterx@redhat.com/
>
> >
> > I don't believe that is safe.
>
> I hope after the other series applied it will be 100% safe, even though I
> agree it's tricky. But hopefully QOM is very clean, the trickly part is
> still within migration, and it should be less tricky than migration
> implement a refcount on top of Object..
Ok, so with the other series applied, this does look safe, but
it also doesn't seem to really have any dependancy on the
single interface code. Patch 7 here looks sufficient, in combo
with the other 2 series to avoid the use-after-free flaws.
> I think the question is whether things like migration_is_running() is
> allowed to be used anywhere, even after migration_shutdown(). My answer
> is, it should be ok to be used anywhere, and we don't necessarilly need to
> limit that. In that case the caller doesn't need to take a refcount
> because it's an immediate query. It can simply check its existance with
> the lock (after my patch 8 of the other series applied, which depends on
> this qom series).
Agree, and from a practical POV, I think it would be impossible to
require a ref count be held from other non-migration threads, so the
locking around 'current_migration' looks like the only practical
option.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 16:13 ` Daniel P. Berrangé
@ 2024-10-30 17:51 ` Peter Xu
2024-10-30 17:58 ` Daniel P. Berrangé
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-10-30 17:51 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 04:13:57PM +0000, Daniel P. Berrangé wrote:
> On Wed, Oct 30, 2024 at 09:13:13AM -0400, Peter Xu wrote:
> > On Wed, Oct 30, 2024 at 09:48:07AM +0000, Daniel P. Berrangé wrote:
> > > On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > > > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> > >
> > > > Meanwhile, migration has a long standing issue on current_migration
> > > > pointer, where it can point to freed data after the migration object is
> > > > finalized. It is debatable that the pointer can be cleared after the main
> > > > thread (1) join() the migration thread first, then (2) release the last
> > > > refcount for the migration object and clear the pointer. However there's
> > > > still major challenges [1]. With singleton, we could have a slightly but
> > > > hopefully working workaround to clear the pointer during finalize().
> > >
> > > I'm still not entirely convinced that this singleton proposal is
> > > fixing the migration problem correctly.
> > >
> > > Based on discussions in v1, IIUC, the situation is that we have
> > > migration_shutdown() being called from qemu_cleanup(). The former
> > > will call object_unref(current_migration), but there may still
> > > be background migration threads running that access 'current_migration',
> > > and thus a potential use-after-free.
> >
> > migration thread is fine, it takes a refcount at the entry.
> >
> > And btw, taking it at the entry is racy, we've just fixed it, see (in my
> > next migration pull):
> >
> > https://lore.kernel.org/qemu-devel/20241024213056.1395400-2-peterx@redhat.com/
>
> Yep, acquiring the refcount immediately before thread-create
> is what I meant.
>
> > The access reported was, IIUC, outside migration code, but after both
> > main/migration threads released the refcount, hence after finalize(). It
> > could be a random migration_is_running() call very late in device code, for
> > example.
>
>
>
> >
> > >
> > > Based on what the 7th patch here does, the key difference is that
> > > the finalize() method for MigrationState will set 'current_migration'
> > > to NULL after free'ing it.
> >
> > Yes. But this show case series isn't complete. We need a migration-side
> > lock finally to make it safe to access. For that, see:
> >
> > https://lore.kernel.org/qemu-devel/20241024213056.1395400-9-peterx@redhat.com/
> >
> > >
> > > I don't believe that is safe.
> >
> > I hope after the other series applied it will be 100% safe, even though I
> > agree it's tricky. But hopefully QOM is very clean, the trickly part is
> > still within migration, and it should be less tricky than migration
> > implement a refcount on top of Object..
>
> Ok, so with the other series applied, this does look safe, but
> it also doesn't seem to really have any dependancy on the
> single interface code. Patch 7 here looks sufficient, in combo
> with the other 2 series to avoid the use-after-free flaws.
Patch 7, when applied without patch 6 and prior, will crash in
device-introspect-test, trying to create yet another migration object when
processing the "device-list-properties" QMP command. And it turns out
that's also not the only way QEMU can crash by that.
Fundamentally it's because patch 7 has global operations within
init()/finalize() to fix the migration dangling pointer, hence it must not
be instanciated more than once.
It's also probably because I always think singleton can be useful in
general to QEMU's device model where can be special devices all over the
places that I'm not aware of. I didn't work on a lot of QEMU devices, but
with that limited experience I still stumbled upon two devices (if taking
migration object as one..) that might benefit from it.
That leads to this whole series, which is also the cleanest so far I can
think of to solve the immediate migration UAF.
Thanks,
>
> > I think the question is whether things like migration_is_running() is
> > allowed to be used anywhere, even after migration_shutdown(). My answer
> > is, it should be ok to be used anywhere, and we don't necessarilly need to
> > limit that. In that case the caller doesn't need to take a refcount
> > because it's an immediate query. It can simply check its existance with
> > the lock (after my patch 8 of the other series applied, which depends on
> > this qom series).
>
> Agree, and from a practical POV, I think it would be impossible to
> require a ref count be held from other non-migration threads, so the
> locking around 'current_migration' looks like the only practical
> option.
>
> With regards,
> Daniel
> --
> |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org -o- https://fstop138.berrange.com :|
> |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
>
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 17:51 ` Peter Xu
@ 2024-10-30 17:58 ` Daniel P. Berrangé
2024-10-30 18:55 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 17:58 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 01:51:54PM -0400, Peter Xu wrote:
> On Wed, Oct 30, 2024 at 04:13:57PM +0000, Daniel P. Berrangé wrote:
> > On Wed, Oct 30, 2024 at 09:13:13AM -0400, Peter Xu wrote:
> > > On Wed, Oct 30, 2024 at 09:48:07AM +0000, Daniel P. Berrangé wrote:
> > > > On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > > > > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> > > >
> > > > > Meanwhile, migration has a long standing issue on current_migration
> > > > > pointer, where it can point to freed data after the migration object is
> > > > > finalized. It is debatable that the pointer can be cleared after the main
> > > > > thread (1) join() the migration thread first, then (2) release the last
> > > > > refcount for the migration object and clear the pointer. However there's
> > > > > still major challenges [1]. With singleton, we could have a slightly but
> > > > > hopefully working workaround to clear the pointer during finalize().
> > > >
> > > > I'm still not entirely convinced that this singleton proposal is
> > > > fixing the migration problem correctly.
> > > >
> > > > Based on discussions in v1, IIUC, the situation is that we have
> > > > migration_shutdown() being called from qemu_cleanup(). The former
> > > > will call object_unref(current_migration), but there may still
> > > > be background migration threads running that access 'current_migration',
> > > > and thus a potential use-after-free.
> > >
> > > migration thread is fine, it takes a refcount at the entry.
> > >
> > > And btw, taking it at the entry is racy, we've just fixed it, see (in my
> > > next migration pull):
> > >
> > > https://lore.kernel.org/qemu-devel/20241024213056.1395400-2-peterx@redhat.com/
> >
> > Yep, acquiring the refcount immediately before thread-create
> > is what I meant.
> >
> > > The access reported was, IIUC, outside migration code, but after both
> > > main/migration threads released the refcount, hence after finalize(). It
> > > could be a random migration_is_running() call very late in device code, for
> > > example.
> >
> >
> >
> > >
> > > >
> > > > Based on what the 7th patch here does, the key difference is that
> > > > the finalize() method for MigrationState will set 'current_migration'
> > > > to NULL after free'ing it.
> > >
> > > Yes. But this show case series isn't complete. We need a migration-side
> > > lock finally to make it safe to access. For that, see:
> > >
> > > https://lore.kernel.org/qemu-devel/20241024213056.1395400-9-peterx@redhat.com/
> > >
> > > >
> > > > I don't believe that is safe.
> > >
> > > I hope after the other series applied it will be 100% safe, even though I
> > > agree it's tricky. But hopefully QOM is very clean, the trickly part is
> > > still within migration, and it should be less tricky than migration
> > > implement a refcount on top of Object..
> >
> > Ok, so with the other series applied, this does look safe, but
> > it also doesn't seem to really have any dependancy on the
> > single interface code. Patch 7 here looks sufficient, in combo
> > with the other 2 series to avoid the use-after-free flaws.
>
> Patch 7, when applied without patch 6 and prior, will crash in
> device-introspect-test, trying to create yet another migration object when
> processing the "device-list-properties" QMP command. And it turns out
> that's also not the only way QEMU can crash by that.
>
> Fundamentally it's because patch 7 has global operations within
> init()/finalize() to fix the migration dangling pointer, hence it must not
> be instanciated more than once.
That's a result from moving the "assert()" into the constructor.
The assert(!current_migration) can be kept in migration_object_init,
the constructor could conditionally set current_migration only if it
is NULL, and the finalizer could conditionally clear current_migration
only if it matches the current object. There's no conceptual dependancy
on having a singleton interface in the patch.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
` (7 preceding siblings ...)
2024-10-30 9:48 ` [PATCH RFC v2 0/7] QOM: Singleton interface Daniel P. Berrangé
@ 2024-10-30 18:07 ` Daniel P. Berrangé
2024-10-30 19:08 ` Peter Xu
8 siblings, 1 reply; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-30 18:07 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
>
> This patchset introduces the singleton interface for QOM. I didn't add a
> changelog because there're quite a few changes here and there, plus new
> patches. So it might just be easier to re-read, considering the patchset
> isn't large.
>
> I switched v2 into RFC, because we have reviewer concerns (Phil and Dan so
> far) that it could be error prone to try to trap every attempts to create
> an object. My argument is, if we already have abstract class, meanwhile we
> do not allow instantiation of abstract class, so the complexity is already
> there. I prepared patch 1 this time to collect and track all similar
> random object creations; it might be helpful as a cleanup on its own to
> deduplicate some similar error messages. Said that, I'm still always open
> to rejections to this proposal.
>
> I hope v2 looks slightly cleaner by having not only object_new_allowed()
> but also object_new_or_fetch().
For me, that doesn't really make it much more appealing. Yes, we already have
an abstract class, but that has narrower impact, as there are fewer places
in which which we can trigger instantiation of an abstract class, than
where we can trigger instantiation of arbitrary objects and devices.
The conversion of the iommu code results in worse error reporting, and
doesn't handle the virtio-iommu case, and the migration problems appear
solvable without inventing a singleton interface. So this doesn't feel
like it is worth the the trouble.
NB, my view point would have been different if 'object_new' had an
"Error *errp" parameter. That would have made handling failure a
standard part of the design pattern for object construction, thus
avoiding adding asserts in the 'object_new' codepath which could be
triggered by unexpected/badly validated user input.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 17:58 ` Daniel P. Berrangé
@ 2024-10-30 18:55 ` Peter Xu
0 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2024-10-30 18:55 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 05:58:15PM +0000, Daniel P. Berrangé wrote:
> That's a result from moving the "assert()" into the constructor.
> The assert(!current_migration) can be kept in migration_object_init,
> the constructor could conditionally set current_migration only if it
> is NULL, and the finalizer could conditionally clear current_migration
> only if it matches the current object. There's no conceptual dependancy
> on having a singleton interface in the patch.
Yes that could work, but that sounds more hackish from my POV, trying to
detect "which is the real migration object". We need to assume (1) the
1st migration object being created must be the global migration object,
then (2) we still allow concurrent creations of such object, which I
personally don't really like, even with a hack already.
If this series cannot get accepted, I can try go with that, or I'll
implement the refcount in migration.c, whichever I found better at last.
To me, both are less clean comparing to singleton.
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 18:07 ` Daniel P. Berrangé
@ 2024-10-30 19:08 ` Peter Xu
2024-10-31 15:57 ` Daniel P. Berrangé
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2024-10-30 19:08 UTC (permalink / raw)
To: Daniel P. Berrangé
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 06:07:08PM +0000, Daniel P. Berrangé wrote:
> On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> >
> > This patchset introduces the singleton interface for QOM. I didn't add a
> > changelog because there're quite a few changes here and there, plus new
> > patches. So it might just be easier to re-read, considering the patchset
> > isn't large.
> >
> > I switched v2 into RFC, because we have reviewer concerns (Phil and Dan so
> > far) that it could be error prone to try to trap every attempts to create
> > an object. My argument is, if we already have abstract class, meanwhile we
> > do not allow instantiation of abstract class, so the complexity is already
> > there. I prepared patch 1 this time to collect and track all similar
> > random object creations; it might be helpful as a cleanup on its own to
> > deduplicate some similar error messages. Said that, I'm still always open
> > to rejections to this proposal.
> >
> > I hope v2 looks slightly cleaner by having not only object_new_allowed()
> > but also object_new_or_fetch().
>
> For me, that doesn't really make it much more appealing. Yes, we already have
> an abstract class, but that has narrower impact, as there are fewer places
> in which which we can trigger instantiation of an abstract class, than
> where we can trigger instantiation of arbitrary objects and devices.
There should be exactly the same number of places that will need care for
either abstract or singleton. I tried to justify this with patch 1.
I still think patch 1 can be seen as a cleanup too on its own (dedups the
same "The class is abstract" error message), tracking random object
creations so logically we could have the idea on whether a class can be
instantiated at all, starting with abstract class.
The real extra "complexity" is object_new_or_fetch(), but I hope it's not a
major concern either. We only have two such use (aka, "please give me an
object of class XXX"), which is qom/device-list-properties. I don't expect
it to be common, I hope it's easy to maintain.
>
> The conversion of the iommu code results in worse error reporting, and
> doesn't handle the virtio-iommu case, and the migration problems appear
> solvable without inventing a singleton interface. So this doesn't feel
> like it is worth the the trouble.
IMHO that's not a major issue, I can drop patch 3-5 just to make it simple
as of now. Btw, I have a TODO in patch 2 where I mentioned we can provide
better error report if we want, so we can easily have exactly the same
error as before with maybe a few or 10+ LOCs on top. It's trivial.
object_new_allowed():
+ if (object_class_is_singleton(klass)) {
+ Object *obj = singleton_get_instance(klass);
+
+ if (obj) {
+ object_unref(obj);
+ /*
+ * TODO: Enhance the error message. E.g., the singleton class
+ * can provide a per-class error message in SingletonClass.
+ */
+ error_setg(errp, "Object type '%s' conflicts with "
+ "an existing singleton instance",
+ klass->type->name);
+ return false;
+ }
+ }
>
> NB, my view point would have been different if 'object_new' had an
> "Error *errp" parameter. That would have made handling failure a
> standard part of the design pattern for object construction, thus
> avoiding adding asserts in the 'object_new' codepath which could be
> triggered by unexpected/badly validated user input.
Yes I also wished object_new() can take an Error** when I started working
on it. It would make this much easier, indeed. I suppose we don't need
that by not allowing instance_init() to fail at all, postponing things to
realize(). I suppose that's a "tactic" QEMU chose explicitly to make it
easy that object_new() callers keep like before with zero error handling
needed. At least for TYPE_DEVICE it looks all fine if all such operations
can be offloaded into realize(). I'm not sure user creatable has those
steps also because of this limitation.
I was trying to do that with object_new_allowed() here instead, whenever it
could be triggered by an user input. We could have an extra layer before
reaching object_new() to guard any user input, and I think
object_new_allowed() could play that role. When / If we want to introduce
Error** to object_new() some day (or a variance of it), we could simply
move object_new_allowed() into it.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH RFC v2 0/7] QOM: Singleton interface
2024-10-30 19:08 ` Peter Xu
@ 2024-10-31 15:57 ` Daniel P. Berrangé
0 siblings, 0 replies; 21+ messages in thread
From: Daniel P. Berrangé @ 2024-10-31 15:57 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Markus Armbruster, Peter Maydell, Eduardo Habkost,
Paolo Bonzini, Mark Cave-Ayland, Igor Mammedov,
Michael S . Tsirkin, Alex Williamson, Dr . David Alan Gilbert,
Philippe Mathieu-Daudé, Cédric Le Goater, Fabiano Rosas,
Juraj Marcin
On Wed, Oct 30, 2024 at 03:08:08PM -0400, Peter Xu wrote:
> On Wed, Oct 30, 2024 at 06:07:08PM +0000, Daniel P. Berrangé wrote:
> > On Tue, Oct 29, 2024 at 05:16:00PM -0400, Peter Xu wrote:
> > > v1: https://lore.kernel.org/r/20241024165627.1372621-1-peterx@redhat.com
> > >
> > > This patchset introduces the singleton interface for QOM. I didn't add a
> > > changelog because there're quite a few changes here and there, plus new
> > > patches. So it might just be easier to re-read, considering the patchset
> > > isn't large.
> > >
> > > I switched v2 into RFC, because we have reviewer concerns (Phil and Dan so
> > > far) that it could be error prone to try to trap every attempts to create
> > > an object. My argument is, if we already have abstract class, meanwhile we
> > > do not allow instantiation of abstract class, so the complexity is already
> > > there. I prepared patch 1 this time to collect and track all similar
> > > random object creations; it might be helpful as a cleanup on its own to
> > > deduplicate some similar error messages. Said that, I'm still always open
> > > to rejections to this proposal.
> > >
> > > I hope v2 looks slightly cleaner by having not only object_new_allowed()
> > > but also object_new_or_fetch().
> >
> > For me, that doesn't really make it much more appealing. Yes, we already have
> > an abstract class, but that has narrower impact, as there are fewer places
> > in which which we can trigger instantiation of an abstract class, than
> > where we can trigger instantiation of arbitrary objects and devices.
>
> There should be exactly the same number of places that will need care for
> either abstract or singleton. I tried to justify this with patch 1.
>
> I still think patch 1 can be seen as a cleanup too on its own (dedups the
> same "The class is abstract" error message), tracking random object
> creations so logically we could have the idea on whether a class can be
> instantiated at all, starting with abstract class.
I think patch 1 might be incomplete, as I'm not seeing what checks
for abstract or singleton classes in the 'qdev_new' code paths, used
by -device / device_add QMP. This is an example of the risks of adding
more failure scenarios to object_add.
> > NB, my view point would have been different if 'object_new' had an
> > "Error *errp" parameter. That would have made handling failure a
> > standard part of the design pattern for object construction, thus
> > avoiding adding asserts in the 'object_new' codepath which could be
> > triggered by unexpected/badly validated user input.
>
> Yes I also wished object_new() can take an Error** when I started working
> on it. It would make this much easier, indeed. I suppose we don't need
> that by not allowing instance_init() to fail at all, postponing things to
> realize(). I suppose that's a "tactic" QEMU chose explicitly to make it
> easy that object_new() callers keep like before with zero error handling
> needed. At least for TYPE_DEVICE it looks all fine if all such operations
> can be offloaded into realize(). I'm not sure user creatable has those
> steps also because of this limitation.
>
> I was trying to do that with object_new_allowed() here instead, whenever it
> could be triggered by an user input. We could have an extra layer before
> reaching object_new() to guard any user input, and I think
> object_new_allowed() could play that role. When / If we want to introduce
> Error** to object_new() some day (or a variance of it), we could simply
> move object_new_allowed() into it.
Yes, having thought about this today, I came up with a way that we could
introduce a object_new_dynamic() variant with "Error *errp" instead of
asserts, and *crucially* force its use in the unsafe scenarios. ie any
place that is not passing a const,static string. I've CC'd you on an
RFC series that mocks up this idea. That would be sufficient to remove
my objections wrt the singleton concept.
With regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2024-10-31 15:58 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-29 21:16 [PATCH RFC v2 0/7] QOM: Singleton interface Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 1/7] qom: Track dynamic initiations of random object class Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 2/7] qom: TYPE_SINGLETON interface Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 3/7] qdev: Make device_set_realized() be fully prepared with !machine Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 4/7] qdev: Make qdev_get_machine() safe before machine creates Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 5/7] x86/iommu: Make x86-iommu a singleton object Peter Xu
2024-10-30 10:33 ` Daniel P. Berrangé
2024-10-30 13:01 ` Peter Xu
2024-10-30 13:07 ` Daniel P. Berrangé
2024-10-30 14:33 ` Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 6/7] migration: Make migration object " Peter Xu
2024-10-29 21:16 ` [PATCH RFC v2 7/7] migration: Reset current_migration properly Peter Xu
2024-10-30 9:48 ` [PATCH RFC v2 0/7] QOM: Singleton interface Daniel P. Berrangé
2024-10-30 13:13 ` Peter Xu
2024-10-30 16:13 ` Daniel P. Berrangé
2024-10-30 17:51 ` Peter Xu
2024-10-30 17:58 ` Daniel P. Berrangé
2024-10-30 18:55 ` Peter Xu
2024-10-30 18:07 ` Daniel P. Berrangé
2024-10-30 19:08 ` Peter Xu
2024-10-31 15:57 ` Daniel P. Berrangé
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).