* [PATCH v2 0/3] memory: Stop piggybacking on memory region owners
@ 2025-09-06 2:39 Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 1/3] qom: Do not finalize twice Akihiko Odaki
` (2 more replies)
0 siblings, 3 replies; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-06 2:39 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, Peter Xu,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Akihiko Odaki
Supersedes: <20250828-san-v9-0-c0dff4b8a487@rsg.ci.i.u-tokyo.ac.jp>
("[PATCH v9 0/2] Fix check-qtest-ppc64 sanitizer errors")
Based-on: <20250906-use-v1-0-c51caafd1eb7@rsg.ci.i.u-tokyo.ac.jp>
("[PATCH 00/22] Fix memory region leaks and use-after-finalization")
MemoryRegions used to "piggyback" on their owners instead of using their
reference counters due to the circular dependencies between them, which
caused memory leak.
I tried to fix it with "[PATCH v9 0/2] Fix check-qtest-ppc64 sanitizer
errors" but it resulted in a lengthy discussion; ultimately it is
attributed to the fact that "piggybacking" is hard to understand and
forces us design trade-offs. It was also insufficient because it only
deals with the container/subregion pattern and did not deal with
AddressSpace and DMA. Fixing all possible memory leaks require checking
the referrer at many places where memory_region_ref() is called.
With this series, I remove the "piggyback" hack altogather.
The key insight here is that the unparented devices have the finalizable
MemoryRegions and they do not need them. I code the fact by calling
object_unparent() in device_unparent(). This eliminates the entire class
of memory leaks caused by references from owners to their MemoryRegions.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
Changes in v2:
- Expanded the message of patch
"vfio/pci: Do not unparent in instance_finalize()".
- Changed to exploit the unparenting timing instead of the unrealization
timing.
- Link to v1: https://lore.kernel.org/qemu-devel/20250901-mr-v1-0-dd7cb6b1480b@rsg.ci.i.u-tokyo.ac.jp
---
Akihiko Odaki (3):
qom: Do not finalize twice
virtio-gpu-virgl: Add virtio-gpu-virgl-hostmem-region type
memory: Stop piggybacking on memory region owners
docs/devel/memory.rst | 41 +++++++++++++++++-----------------
include/qom/object.h | 1 +
include/system/memory.h | 51 +++++++++++++++++++++----------------------
hw/core/qdev.c | 16 ++++++++++++++
hw/display/virtio-gpu-virgl.c | 50 ++++++++++++++++++++++++++++++------------
qom/object.c | 5 +++++
system/memory.c | 33 ++++++++++++++++++++--------
7 files changed, 127 insertions(+), 70 deletions(-)
---
base-commit: e101d33792530093fa0b0a6e5f43e4d8cfe4581e
change-id: 20250831-mr-d0dc495bad11
prerequisite-message-id: <20250906-use-v1-0-c51caafd1eb7@rsg.ci.i.u-tokyo.ac.jp>
prerequisite-patch-id: d464fda86a3c79ff8e6d7a2e623d979b2a47019b
prerequisite-patch-id: 17b153237f69c898b9c5b93aad0d5116d0bfe49f
prerequisite-patch-id: a323f67e01c672ab2958a237ea54b77f1443e2d1
prerequisite-patch-id: 019969fe248bd57ddcda1ff5fc960b214ccffefe
prerequisite-patch-id: 74ded25b212b75b2f7d1859fedc601cf33d59107
prerequisite-patch-id: 43f841a1924749e2a5a3b74b35e54f89afb7e3c5
prerequisite-patch-id: 44300da5065efee0390be5d450225868e01cecfc
prerequisite-patch-id: 4af306d6f3d0a4585015c5907ca1e1dcfced77d3
prerequisite-patch-id: fff78c7af9b0a56190a1b4afbb122c460a6b0e7d
prerequisite-patch-id: 3d38803ce09ba9c93f2a876f54309e673b396ab1
prerequisite-patch-id: 822094864ad7a6a702fee098e4835621bd8092fe
prerequisite-patch-id: 5757efd81557b060257b5db6dec6fd189076ee77
prerequisite-patch-id: bd912830a326f13186bf38e916655ec980e11af8
prerequisite-patch-id: fe6b92112288829e60f10c305742a544f45e8984
prerequisite-patch-id: ac4ff0c11dcc1fc5d08b4fc480c14721fde574ad
prerequisite-patch-id: ff398fa97b5f2feee85372fdf108d82d8d5526b0
prerequisite-patch-id: 7ac446ae76e05dd267a63889ff775ac609712c31
prerequisite-patch-id: b49a74cd5f31348c3dc13dcfd1dad629e6b30387
prerequisite-patch-id: 8f61fe1b81cf3ec906ebbf61776573edd96c1e8c
prerequisite-patch-id: 01fb8ccbe7326021a94a8d7531189568d2e311a7
prerequisite-patch-id: 974b0fc6d7c8d6d56b8f44597260647e1a53cf38
prerequisite-patch-id: 55c4711a2a4e6b02b8b512e0283f8feaf7d3bfa3
Best regards,
--
Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH v2 1/3] qom: Do not finalize twice
2025-09-06 2:39 [PATCH v2 0/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
@ 2025-09-06 2:39 ` Akihiko Odaki
2025-09-23 9:27 ` Paolo Bonzini
2025-09-06 2:39 ` [PATCH v2 2/3] virtio-gpu-virgl: Add virtio-gpu-virgl-hostmem-region type Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 3/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2 siblings, 1 reply; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-06 2:39 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, Peter Xu,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Akihiko Odaki
The next change adds code to retain references from an object to the
parent when it is being unparented to ensure that the parent outlive
them. This change handles the following scenario with the code:
1. The parent starts being finalized without unparenting.
2. Unparenting happens during finalization.
3. The child retains the reference to the parent.
4. The child gets finalized, and releases the reference.
In this scenario, the reference counter of the parent reaches to zero,
gets incremented, and gets decremented to reach to zero again. This
change ensures that finalization will be triggered again in the
scenario.
Note that the reference counter needs to reach to zero again before
finalization ends; otherwise the object will be "resurrected", which
is not clearly defined and prohibited with an existing assertion.
One thing that looks concerning with this change is that it adds a bool
to Object. This is not a problem in the most situations where the host
uses 64-bit addressing because the member is added to a gap needed for
alignment, and possible double-free scenarios handled with this change
are more serious than the extra memory usage for 32-bit hosts.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
include/qom/object.h | 1 +
qom/object.c | 5 +++++
2 files changed, 6 insertions(+)
diff --git a/include/qom/object.h b/include/qom/object.h
index 26df6137b911..7f7b1ffea8fe 100644
--- a/include/qom/object.h
+++ b/include/qom/object.h
@@ -158,6 +158,7 @@ struct Object
ObjectFree *free;
GHashTable *properties;
uint32_t ref;
+ bool finalizing;
Object *parent;
};
diff --git a/qom/object.c b/qom/object.c
index 1856bb36c74c..b766b2e9baa7 100644
--- a/qom/object.c
+++ b/qom/object.c
@@ -725,6 +725,11 @@ static void object_finalize(void *data)
Object *obj = data;
TypeImpl *ti = obj->class->type;
+ if (obj->finalizing) {
+ return;
+ }
+
+ obj->finalizing = true;
object_property_del_all(obj);
object_deinit(obj, ti);
--
2.51.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 2/3] virtio-gpu-virgl: Add virtio-gpu-virgl-hostmem-region type
2025-09-06 2:39 [PATCH v2 0/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 1/3] qom: Do not finalize twice Akihiko Odaki
@ 2025-09-06 2:39 ` Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 3/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2 siblings, 0 replies; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-06 2:39 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, Peter Xu,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Akihiko Odaki
virtio-gpu-virgl used to set Object::free() of memory regions, but
memory regions are going to have its own implementation of the
function.
Add the virtio-gpu-virgl-hostmem-region type, which contains a
memory region and performs what Object::free() did during finalization.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
hw/display/virtio-gpu-virgl.c | 51 ++++++++++++++++++++++++++++++-------------
1 file changed, 36 insertions(+), 15 deletions(-)
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index 94ddc01f91c6..f5cba8a7fa66 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -52,11 +52,18 @@ virgl_get_egl_display(G_GNUC_UNUSED void *cookie)
#if VIRGL_VERSION_MAJOR >= 1
struct virtio_gpu_virgl_hostmem_region {
+ Object parent_obj;
MemoryRegion mr;
struct VirtIOGPU *g;
bool finish_unmapping;
};
+#define TYPE_VIRTIO_GPU_VIRGL_HOSTMEM_REGION "virtio-gpu-virgl-hostmem-region"
+
+DECLARE_INSTANCE_CHECKER(struct virtio_gpu_virgl_hostmem_region,
+ VIRTIO_GPU_VIRGL_HOSTMEM_REGION,
+ TYPE_VIRTIO_GPU_VIRGL_HOSTMEM_REGION)
+
static struct virtio_gpu_virgl_hostmem_region *
to_hostmem_region(MemoryRegion *mr)
{
@@ -70,14 +77,18 @@ static void virtio_gpu_virgl_resume_cmdq_bh(void *opaque)
virtio_gpu_process_cmdq(g);
}
-static void virtio_gpu_virgl_hostmem_region_free(void *obj)
+/*
+ * MR could outlive the resource if MR's reference is held outside of
+ * virtio-gpu. In order to prevent unmapping resource while MR is alive,
+ * and thus, making the data pointer invalid, we will block virtio-gpu
+ * command processing until MR is fully unreferenced and freed.
+ */
+static void virtio_gpu_virgl_hostmem_region_finalize(Object *obj)
{
- MemoryRegion *mr = MEMORY_REGION(obj);
- struct virtio_gpu_virgl_hostmem_region *vmr;
+ struct virtio_gpu_virgl_hostmem_region *vmr = VIRTIO_GPU_VIRGL_HOSTMEM_REGION(obj);
VirtIOGPUBase *b;
VirtIOGPUGL *gl;
- vmr = to_hostmem_region(mr);
vmr->finish_unmapping = true;
b = VIRTIO_GPU_BASE(vmr->g);
@@ -92,11 +103,26 @@ static void virtio_gpu_virgl_hostmem_region_free(void *obj)
qemu_bh_schedule(gl->cmdq_resume_bh);
}
+static const TypeInfo virtio_gpu_virgl_hostmem_region_info = {
+ .parent = TYPE_OBJECT,
+ .name = TYPE_VIRTIO_GPU_VIRGL_HOSTMEM_REGION,
+ .instance_size = sizeof(struct virtio_gpu_virgl_hostmem_region),
+ .instance_finalize = virtio_gpu_virgl_hostmem_region_finalize
+};
+
+static void virtio_gpu_virgl_types(void)
+{
+ type_register_static(&virtio_gpu_virgl_hostmem_region_info);
+}
+
+type_init(virtio_gpu_virgl_types)
+
static int
virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
struct virtio_gpu_virgl_resource *res,
uint64_t offset)
{
+ g_autofree char *name = NULL;
struct virtio_gpu_virgl_hostmem_region *vmr;
VirtIOGPUBase *b = VIRTIO_GPU_BASE(g);
MemoryRegion *mr;
@@ -117,21 +143,16 @@ virtio_gpu_virgl_map_resource_blob(VirtIOGPU *g,
}
vmr = g_new0(struct virtio_gpu_virgl_hostmem_region, 1);
+ name = g_strdup_printf("blob[%" PRIu32 "]", res->base.resource_id);
+ object_initialize_child(OBJECT(g), name, vmr,
+ TYPE_VIRTIO_GPU_VIRGL_HOSTMEM_REGION);
vmr->g = g;
mr = &vmr->mr;
- memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
+ memory_region_init_ram_ptr(mr, OBJECT(vmr), "mr", size, data);
memory_region_add_subregion(&b->hostmem, offset, mr);
memory_region_set_enabled(mr, true);
- /*
- * MR could outlive the resource if MR's reference is held outside of
- * virtio-gpu. In order to prevent unmapping resource while MR is alive,
- * and thus, making the data pointer invalid, we will block virtio-gpu
- * command processing until MR is fully unreferenced and freed.
- */
- OBJECT(mr)->free = virtio_gpu_virgl_hostmem_region_free;
-
res->mr = mr;
return 0;
@@ -159,7 +180,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
* 1. Begin async unmapping with memory_region_del_subregion()
* and suspend/block cmd processing.
* 2. Wait for res->mr to be freed and cmd processing resumed
- * asynchronously by virtio_gpu_virgl_hostmem_region_free().
+ * asynchronously by virtio_gpu_virgl_hostmem_region_finalize().
* 3. Finish the unmapping with final virgl_renderer_resource_unmap().
*/
if (vmr->finish_unmapping) {
@@ -182,7 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
/* memory region owns self res->mr object and frees it by itself */
memory_region_set_enabled(mr, false);
memory_region_del_subregion(&b->hostmem, mr);
- object_unparent(OBJECT(mr));
+ object_unparent(OBJECT(vmr));
}
return 0;
--
2.51.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-06 2:39 [PATCH v2 0/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 1/3] qom: Do not finalize twice Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 2/3] virtio-gpu-virgl: Add virtio-gpu-virgl-hostmem-region type Akihiko Odaki
@ 2025-09-06 2:39 ` Akihiko Odaki
2025-09-10 21:45 ` Peter Xu
2025-09-23 8:41 ` Paolo Bonzini
2 siblings, 2 replies; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-06 2:39 UTC (permalink / raw)
To: qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, Peter Xu,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Akihiko Odaki
MemoryRegions used to "piggyback" on their owners instead of using their
own reference counters due to the circular dependencies between
them, which caused new circular references. Stop piggybacking, showing
that the circular dependencies are actually broken at the finalization
time.
Circular references caused by piggybacking
==========================================
Piggybacking created another circular reference when an owner referred
to its MemoryRegion via memory_region_ref(). This could happen in three
scenarios:
- A subregion and its container had the same owner. In this case, the
container would use memory_region_ref() to take a reference to the
subregion.
- An AddressSpace and its root MemoryRegion had the same owner. The
situation is similar to the subregion/container scenario.
- An owner called address_space_map() with a guest-controlled address
that points back to its MemoryRegion. This scenario is similar to what
MemReentrancyGuard handles.
Avoiding such a circular reference required checking if the referrer is
the owner at many places where memory_region_ref() is called,
complicating the code further.
Insight
=======
This change challenges the underlying assumption that circular
dependencies exist and can be tolerated at the finalization time. The
deletable MemoryRegions are of hotpluggable devices and
virtio-gpu-virgl's hostmem. A device and its MemoryRegion have the
following dependencies:
QOM tree -> Device
Container -> MemoryRegion
Device <-> MemoryRegion
Techniques like piggybacking or a hypothetical garbage collector can
finalize the device and the MemoryRegion once the QOM tree and container
lose their dependencies. However, these methods are fundamentally
insufficient because the MemoryRegion and the device have finalizers,
and they do not respect the dependencies these objects may have during
finalization.
As long as the object model based on the device and the MemoryRegion is
correct, one of them should lose its dependency on the first,
establishing a valid finalization order. Understanding this allows using
standard reference counting, which is immune to the problems of
piggybacking.
Analysis reveals that the device no longer depends on the MemoryRegion
after being unparented. The device needs the MemoryRegion for the
following two purposes:
- To add the MemoryRegion to a container and expose it to an
AddressSpace.
- To expose it via the QOM tree.
Once unparented, the device is hidden from the AddresSpaces and the
QOM tree so it no longer needs the MemoryRegion and the circular
dependencies are broken. We only need to ensure that the device is
finalized after the MemoryRegion then.
It should also be noted that the reference from the MemoryRegion to the
device needs to be tracked only after the device is unparented because
the device will not be finalized as long as it is in the QOM tree.
This fact can be exploited to avoid having a circular reference between
the device and the MemoryRegion before the device gets unparented.
Solution
========
When devices get unparented, they also unparent their memory regions.
The unparented memory regions will then retain their references to the
devices. This ensures that:
1. the references from the devices to their memory regions are counted
until the devices get unparented
2. the references from their memory regions to the devices afterwards,
which in turn ensures that devices are finalized after their
MemoryRegions.
When virtio-gpu's hostmems get unparented, they also unparent their
memory regions in the same manner.
This enforces a valid finalization order, allows MemoryRegions to rely
on standard reference counting, and eliminates the entire class of
memory leaks caused by piggybacking.
Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
---
docs/devel/memory.rst | 41 +++++++++++++++++-----------------
include/system/memory.h | 51 +++++++++++++++++++++----------------------
hw/core/qdev.c | 16 ++++++++++++++
hw/display/virtio-gpu-virgl.c | 1 +
system/memory.c | 33 ++++++++++++++++++++--------
5 files changed, 86 insertions(+), 56 deletions(-)
diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
index 749f11d8a4dd..9634d8805740 100644
--- a/docs/devel/memory.rst
+++ b/docs/devel/memory.rst
@@ -143,11 +143,7 @@ Region lifecycle
----------------
A region is created by one of the memory_region_init*() functions and
-attached to an object, which acts as its owner or parent. QEMU ensures
-that the owner object remains alive as long as the region is visible to
-the guest, or as long as the region is in use by a virtual CPU or another
-device. For example, the owner object will not die between an
-address_space_map operation and the corresponding address_space_unmap.
+attached to an object, which acts as its owner or parent.
After creation, a region can be added to an address space or a
container with memory_region_add_subregion(), and removed using
@@ -158,30 +154,34 @@ ioeventfd) can be changed during the region lifecycle. They take effect
as soon as the region is made visible. This can be immediately, later,
or never.
-Destruction of a memory region happens automatically when the owner
-object dies.
+A region will retain references to its owner object when it is
+unparented. This ensures that the owner object remains alive as long as
+the region is in use by a virtual CPU or another device. For example,
+the owner object will not die between an address_space_map operation and
+the corresponding address_space_unmap. Devices automatically unparent
+their memory regions when they are unparented.
-You must not destroy a memory region as long as it may be in use by a
-device or CPU. In order to do this, as a general rule do not create or
-destroy memory regions dynamically during a device's lifetime.
+You must not free a memory region as long as it may be in use by a
+device or CPU. In order to do this, as a general rule do not allocate
+or free memory regions dynamically during a device's lifetime.
The dynamically allocated data structure that contains the
memory region should be freed in the instance_finalize callback.
If you break this rule, the following situation can happen:
-- the memory region's owner had a reference taken via memory_region_ref
+- the memory region had a reference taken via memory_region_ref
(for example by address_space_map)
-- the region is unparented, and has no owner anymore
+- the region is freed
-- when address_space_unmap is called, the reference to the memory region's
- owner is leaked.
+- when the mapped memory is used, the use of the memory region
+ results in use-after-free.
-There is an exception to the above rule: it is okay to call
-object_unparent at any time for an alias or a container region. It is
-therefore also okay to create or destroy alias and container regions
-dynamically during a device's lifetime.
+There is an exception to the above rule: it is okay to free an alias or
+a container region at any time. It is therefore also okay to allocate
+or free alias and container regions dynamically during a device's
+lifetime.
This exceptional usage is valid because aliases and containers only help
QEMU building the guest's memory map; they are never accessed directly.
@@ -191,9 +191,8 @@ this exception is rarely necessary, and therefore it is discouraged,
but nevertheless it is used in a few places.
For regions that "have no owner" (NULL is passed at creation time), the
-machine object is actually used as the owner. You must never call
-object_unparent on regions that have no owner, unless they are aliases
-or containers.
+machine object is actually used as the owner. You must never free
+regions that have no owner, unless they are aliases or containers.
Overlapping regions and priority
diff --git a/include/system/memory.h b/include/system/memory.h
index e2cd6ed12614..1fc773ca49e2 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -1307,7 +1307,7 @@ static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
* memory_region_add_subregion() to add subregions.
*
* @mr: the #MemoryRegion to be initialized
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: used for debugging; not visible to the user or ABI
* @size: size of the region; any subregions beyond this size will be clipped
*/
@@ -1320,14 +1320,14 @@ void memory_region_init(MemoryRegion *mr,
* memory_region_ref: Add 1 to a memory region's reference count
*
* Whenever memory regions are accessed outside the BQL, they need to be
- * preserved against hot-unplug. MemoryRegions actually do not have their
- * own reference count; they piggyback on a QOM object, their "owner".
- * This function adds a reference to the owner.
+ * preserved against hot-unplug. This function adds a reference to the
+ * memory region.
*
- * All MemoryRegions must have an owner if they can disappear, even if the
- * device they belong to operates exclusively under the BQL. This is because
- * the region could be returned at any time by memory_region_find, and this
- * is usually under guest control.
+ * We do not ref/unref memory regions without an owner because it slows
+ * down DMA sensibly. All MemoryRegions must have an owner if they can
+ * disappear, even if the device they belong to operates exclusively
+ * under the BQL. This is because the region could be returned at any
+ * time by memory_region_find, and this is usually under guest control.
*
* @mr: the #MemoryRegion
*/
@@ -1337,9 +1337,8 @@ void memory_region_ref(MemoryRegion *mr);
* memory_region_unref: Remove 1 to a memory region's reference count
*
* Whenever memory regions are accessed outside the BQL, they need to be
- * preserved against hot-unplug. MemoryRegions actually do not have their
- * own reference count; they piggyback on a QOM object, their "owner".
- * This function removes a reference to the owner and possibly destroys it.
+ * preserved against hot-unplug. This function removes a reference to
+ * the memory and possibly destroys it.
*
* @mr: the #MemoryRegion
*/
@@ -1352,7 +1351,7 @@ void memory_region_unref(MemoryRegion *mr);
* if @size is nonzero, subregions will be clipped to @size.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @ops: a structure containing read and write callbacks to be used when
* I/O is performed on the region.
* @opaque: passed to the read and write callbacks of the @ops structure.
@@ -1372,7 +1371,7 @@ void memory_region_init_io(MemoryRegion *mr,
* directly.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1395,7 +1394,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
* modify memory directly.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1425,7 +1424,7 @@ bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr,
* canceled.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: used size of the region.
@@ -1454,7 +1453,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
* mmap-ed backend.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1487,7 +1486,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
* mmap-ed backend.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: the name of the region.
* @size: size of the region.
* @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
@@ -1518,7 +1517,7 @@ bool memory_region_init_ram_from_fd(MemoryRegion *mr,
* region will modify memory directly.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1546,7 +1545,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
* skip_dump flag.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: the name of the region.
* @size: size of the region.
* @ptr: memory to be mapped; must contain at least @size bytes.
@@ -1566,7 +1565,7 @@ void memory_region_init_ram_device_ptr(MemoryRegion *mr,
* part of another memory region.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: used for debugging; not visible to the user or ABI
* @orig: the region to be referenced; @mr will be equivalent to
* @orig between @offset and @offset + @size - 1.
@@ -1592,7 +1591,7 @@ void memory_region_init_alias(MemoryRegion *mr,
* of the caller.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1615,7 +1614,7 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr,
* of the caller.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @ops: callbacks for write access handling (must not be NULL).
* @opaque: passed to the read and write callbacks of the @ops structure.
* @name: Region name, becomes part of RAMBlock name used in migration stream
@@ -1651,7 +1650,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr,
* @_iommu_mr: the #IOMMUMemoryRegion to be initialized
* @instance_size: the IOMMUMemoryRegion subclass instance size
* @mrtypename: the type name of the #IOMMUMemoryRegion
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: used for debugging; not visible to the user or ABI
* @size: size of the region.
*/
@@ -1667,7 +1666,7 @@ void memory_region_init_iommu(void *_iommu_mr,
* region will modify memory directly.
*
* @mr: the #MemoryRegion to be initialized
- * @owner: the object that tracks the region's reference count (must be
+ * @owner: the object that provides the region's storage (must be
* TYPE_DEVICE or a subclass of TYPE_DEVICE, or NULL)
* @name: name of the memory region
* @size: size of the region in bytes
@@ -1713,7 +1712,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
* If you pass a non-NULL non-device @owner then we will assert.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @name: Region name, becomes part of RAMBlock name used in migration stream
* must be unique within any device
* @size: size of the region.
@@ -1744,7 +1743,7 @@ bool memory_region_init_rom(MemoryRegion *mr,
* If you pass a non-NULL non-device @owner then we will assert.
*
* @mr: the #MemoryRegion to be initialized.
- * @owner: the object that tracks the region's reference count
+ * @owner: the object that provides the region's storage
* @ops: callbacks for write access handling (must not be NULL).
* @opaque: passed to the read and write callbacks of the @ops structure.
* @name: Region name, becomes part of RAMBlock name used in migration stream
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 8fdf6774f87e..b83c46615fba 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -718,8 +718,18 @@ static void device_class_base_init(ObjectClass *class, const void *data)
klass->props_count_ = 0;
}
+static int collect_memory_region(Object *child, void *opaque)
+{
+ if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
+ g_array_append_val(opaque, child);
+ }
+
+ return 0;
+}
+
static void device_unparent(Object *obj)
{
+ g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
DeviceState *dev = DEVICE(obj);
BusState *bus;
@@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
object_unref(OBJECT(dev->parent_bus));
dev->parent_bus = NULL;
}
+
+ object_child_foreach(OBJECT(dev), collect_memory_region, array);
+
+ for (gsize i = 0; i < array->len; i++) {
+ object_unparent(g_array_index(array, Object *, i));
+ }
}
static char *
diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
index f5cba8a7fa66..7afcaa4bfe2e 100644
--- a/hw/display/virtio-gpu-virgl.c
+++ b/hw/display/virtio-gpu-virgl.c
@@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
/* memory region owns self res->mr object and frees it by itself */
memory_region_set_enabled(mr, false);
memory_region_del_subregion(&b->hostmem, mr);
+ object_unparent(OBJECT(mr));
object_unparent(OBJECT(vmr));
}
diff --git a/system/memory.c b/system/memory.c
index 56465479406f..edaf039b0647 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1212,11 +1212,19 @@ static char *memory_region_escape_name(const char *name)
return escaped;
}
+static void memory_region_free(void *obj)
+{
+ MemoryRegion *mr = obj;
+
+ object_unref(mr->owner);
+}
+
static void memory_region_do_init(MemoryRegion *mr,
Object *owner,
const char *name,
uint64_t size)
{
+ OBJECT(mr)->free = memory_region_free;
mr->size = int128_make64(size);
if (size == UINT64_MAX) {
mr->size = int128_2_64();
@@ -1293,6 +1301,18 @@ static void memory_region_get_size(Object *obj, Visitor *v, const char *name,
visit_type_uint64(v, name, &value, errp);
}
+static void memory_region_unparent(Object *obj)
+{
+ MemoryRegion *mr = MEMORY_REGION(obj);
+
+ object_ref(mr->owner);
+}
+
+static void memory_region_class_init(ObjectClass *klass, const void *data)
+{
+ klass->unparent = memory_region_unparent;
+}
+
static void memory_region_initfn(Object *obj)
{
MemoryRegion *mr = MEMORY_REGION(obj);
@@ -1826,25 +1846,19 @@ Object *memory_region_owner(MemoryRegion *mr)
void memory_region_ref(MemoryRegion *mr)
{
- /* MMIO callbacks most likely will access data that belongs
- * to the owner, hence the need to ref/unref the owner whenever
- * the memory region is in use.
- *
- * The memory region is a child of its owner. As long as the
- * owner doesn't call unparent itself on the memory region,
- * ref-ing the owner will also keep the memory region alive.
+ /*
* Memory regions without an owner are supposed to never go away;
* we do not ref/unref them because it slows down DMA sensibly.
*/
if (mr && mr->owner) {
- object_ref(mr->owner);
+ object_ref(mr);
}
}
void memory_region_unref(MemoryRegion *mr)
{
if (mr && mr->owner) {
- object_unref(mr->owner);
+ object_unref(mr);
}
}
@@ -3777,6 +3791,7 @@ static const TypeInfo memory_region_info = {
.parent = TYPE_OBJECT,
.name = TYPE_MEMORY_REGION,
.class_size = sizeof(MemoryRegionClass),
+ .class_init = memory_region_class_init,
.instance_size = sizeof(MemoryRegion),
.instance_init = memory_region_initfn,
.instance_finalize = memory_region_finalize,
--
2.51.0
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-06 2:39 ` [PATCH v2 3/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
@ 2025-09-10 21:45 ` Peter Xu
2025-09-11 3:40 ` Akihiko Odaki
2025-09-23 8:41 ` Paolo Bonzini
1 sibling, 1 reply; 21+ messages in thread
From: Peter Xu @ 2025-09-10 21:45 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko
On Sat, Sep 06, 2025 at 04:39:06AM +0200, Akihiko Odaki wrote:
> MemoryRegions used to "piggyback" on their owners instead of using their
> own reference counters due to the circular dependencies between
> them, which caused new circular references. Stop piggybacking, showing
> that the circular dependencies are actually broken at the finalization
> time.
>
> Circular references caused by piggybacking
> ==========================================
>
> Piggybacking created another circular reference when an owner referred
> to its MemoryRegion via memory_region_ref(). This could happen in three
> scenarios:
>
> - A subregion and its container had the same owner. In this case, the
> container would use memory_region_ref() to take a reference to the
> subregion.
>
> - An AddressSpace and its root MemoryRegion had the same owner. The
> situation is similar to the subregion/container scenario.
>
> - An owner called address_space_map() with a guest-controlled address
> that points back to its MemoryRegion. This scenario is similar to what
> MemReentrancyGuard handles.
>
> Avoiding such a circular reference required checking if the referrer is
> the owner at many places where memory_region_ref() is called,
> complicating the code further.
I understand (1). Why (2)/(3) are causes of circular references?
>
> Insight
> =======
>
> This change challenges the underlying assumption that circular
> dependencies exist and can be tolerated at the finalization time. The
> deletable MemoryRegions are of hotpluggable devices and
> virtio-gpu-virgl's hostmem. A device and its MemoryRegion have the
> following dependencies:
>
> QOM tree -> Device
> Container -> MemoryRegion
> Device <-> MemoryRegion
Yes, they need to "reference" each other. That's also why IMHO the current
qemu mr refcounting design is still practical, in that it keeps the
reference graph to be acyclic, which is much simpler to understand.
>
> Techniques like piggybacking or a hypothetical garbage collector can
> finalize the device and the MemoryRegion once the QOM tree and container
> lose their dependencies. However, these methods are fundamentally
> insufficient because the MemoryRegion and the device have finalizers,
> and they do not respect the dependencies these objects may have during
> finalization.
>
> As long as the object model based on the device and the MemoryRegion is
> correct, one of them should lose its dependency on the first,
> establishing a valid finalization order. Understanding this allows using
> standard reference counting, which is immune to the problems of
> piggybacking.
>
> Analysis reveals that the device no longer depends on the MemoryRegion
> after being unparented. The device needs the MemoryRegion for the
> following two purposes:
> - To add the MemoryRegion to a container and expose it to an
> AddressSpace.
> - To expose it via the QOM tree.
>
> Once unparented, the device is hidden from the AddresSpaces and the
> QOM tree so it no longer needs the MemoryRegion and the circular
> dependencies are broken. We only need to ensure that the device is
> finalized after the MemoryRegion then.
>
> It should also be noted that the reference from the MemoryRegion to the
> device needs to be tracked only after the device is unparented because
> the device will not be finalized as long as it is in the QOM tree.
> This fact can be exploited to avoid having a circular reference between
> the device and the MemoryRegion before the device gets unparented.
>
> Solution
> ========
>
> When devices get unparented, they also unparent their memory regions.
> The unparented memory regions will then retain their references to the
> devices. This ensures that:
>
> 1. the references from the devices to their memory regions are counted
> until the devices get unparented
> 2. the references from their memory regions to the devices afterwards,
> which in turn ensures that devices are finalized after their
> MemoryRegions.
>
> When virtio-gpu's hostmems get unparented, they also unparent their
> memory regions in the same manner.
>
> This enforces a valid finalization order, allows MemoryRegions to rely
> on standard reference counting, and eliminates the entire class of
> memory leaks caused by piggybacking.
>
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> ---
> docs/devel/memory.rst | 41 +++++++++++++++++-----------------
> include/system/memory.h | 51 +++++++++++++++++++++----------------------
> hw/core/qdev.c | 16 ++++++++++++++
> hw/display/virtio-gpu-virgl.c | 1 +
> system/memory.c | 33 ++++++++++++++++++++--------
> 5 files changed, 86 insertions(+), 56 deletions(-)
>
> diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
> index 749f11d8a4dd..9634d8805740 100644
> --- a/docs/devel/memory.rst
> +++ b/docs/devel/memory.rst
> @@ -143,11 +143,7 @@ Region lifecycle
> ----------------
>
> A region is created by one of the memory_region_init*() functions and
> -attached to an object, which acts as its owner or parent. QEMU ensures
> -that the owner object remains alive as long as the region is visible to
> -the guest, or as long as the region is in use by a virtual CPU or another
> -device. For example, the owner object will not die between an
> -address_space_map operation and the corresponding address_space_unmap.
> +attached to an object, which acts as its owner or parent.
>
> After creation, a region can be added to an address space or a
> container with memory_region_add_subregion(), and removed using
> @@ -158,30 +154,34 @@ ioeventfd) can be changed during the region lifecycle. They take effect
> as soon as the region is made visible. This can be immediately, later,
> or never.
>
> -Destruction of a memory region happens automatically when the owner
> -object dies.
> +A region will retain references to its owner object when it is
> +unparented. This ensures that the owner object remains alive as long as
> +the region is in use by a virtual CPU or another device. For example,
> +the owner object will not die between an address_space_map operation and
> +the corresponding address_space_unmap. Devices automatically unparent
> +their memory regions when they are unparented.
>
> -You must not destroy a memory region as long as it may be in use by a
> -device or CPU. In order to do this, as a general rule do not create or
> -destroy memory regions dynamically during a device's lifetime.
> +You must not free a memory region as long as it may be in use by a
> +device or CPU. In order to do this, as a general rule do not allocate
> +or free memory regions dynamically during a device's lifetime.
> The dynamically allocated data structure that contains the
> memory region should be freed in the instance_finalize callback.
>
> If you break this rule, the following situation can happen:
>
> -- the memory region's owner had a reference taken via memory_region_ref
> +- the memory region had a reference taken via memory_region_ref
> (for example by address_space_map)
>
> -- the region is unparented, and has no owner anymore
> +- the region is freed
>
> -- when address_space_unmap is called, the reference to the memory region's
> - owner is leaked.
> +- when the mapped memory is used, the use of the memory region
> + results in use-after-free.
>
>
> -There is an exception to the above rule: it is okay to call
> -object_unparent at any time for an alias or a container region. It is
> -therefore also okay to create or destroy alias and container regions
> -dynamically during a device's lifetime.
> +There is an exception to the above rule: it is okay to free an alias or
> +a container region at any time. It is therefore also okay to allocate
> +or free alias and container regions dynamically during a device's
> +lifetime.
>
> This exceptional usage is valid because aliases and containers only help
> QEMU building the guest's memory map; they are never accessed directly.
> @@ -191,9 +191,8 @@ this exception is rarely necessary, and therefore it is discouraged,
> but nevertheless it is used in a few places.
>
> For regions that "have no owner" (NULL is passed at creation time), the
> -machine object is actually used as the owner. You must never call
> -object_unparent on regions that have no owner, unless they are aliases
> -or containers.
> +machine object is actually used as the owner. You must never free
> +regions that have no owner, unless they are aliases or containers.
>
>
> Overlapping regions and priority
> diff --git a/include/system/memory.h b/include/system/memory.h
> index e2cd6ed12614..1fc773ca49e2 100644
> --- a/include/system/memory.h
> +++ b/include/system/memory.h
> @@ -1307,7 +1307,7 @@ static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
> * memory_region_add_subregion() to add subregions.
> *
> * @mr: the #MemoryRegion to be initialized
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: used for debugging; not visible to the user or ABI
> * @size: size of the region; any subregions beyond this size will be clipped
> */
> @@ -1320,14 +1320,14 @@ void memory_region_init(MemoryRegion *mr,
> * memory_region_ref: Add 1 to a memory region's reference count
> *
> * Whenever memory regions are accessed outside the BQL, they need to be
> - * preserved against hot-unplug. MemoryRegions actually do not have their
> - * own reference count; they piggyback on a QOM object, their "owner".
> - * This function adds a reference to the owner.
> + * preserved against hot-unplug. This function adds a reference to the
> + * memory region.
> *
> - * All MemoryRegions must have an owner if they can disappear, even if the
> - * device they belong to operates exclusively under the BQL. This is because
> - * the region could be returned at any time by memory_region_find, and this
> - * is usually under guest control.
> + * We do not ref/unref memory regions without an owner because it slows
> + * down DMA sensibly. All MemoryRegions must have an owner if they can
> + * disappear, even if the device they belong to operates exclusively
> + * under the BQL. This is because the region could be returned at any
> + * time by memory_region_find, and this is usually under guest control.
> *
> * @mr: the #MemoryRegion
> */
> @@ -1337,9 +1337,8 @@ void memory_region_ref(MemoryRegion *mr);
> * memory_region_unref: Remove 1 to a memory region's reference count
> *
> * Whenever memory regions are accessed outside the BQL, they need to be
> - * preserved against hot-unplug. MemoryRegions actually do not have their
> - * own reference count; they piggyback on a QOM object, their "owner".
> - * This function removes a reference to the owner and possibly destroys it.
> + * preserved against hot-unplug. This function removes a reference to
> + * the memory and possibly destroys it.
> *
> * @mr: the #MemoryRegion
> */
> @@ -1352,7 +1351,7 @@ void memory_region_unref(MemoryRegion *mr);
> * if @size is nonzero, subregions will be clipped to @size.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @ops: a structure containing read and write callbacks to be used when
> * I/O is performed on the region.
> * @opaque: passed to the read and write callbacks of the @ops structure.
> @@ -1372,7 +1371,7 @@ void memory_region_init_io(MemoryRegion *mr,
> * directly.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1395,7 +1394,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
> * modify memory directly.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1425,7 +1424,7 @@ bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr,
> * canceled.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: used size of the region.
> @@ -1454,7 +1453,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
> * mmap-ed backend.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1487,7 +1486,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
> * mmap-ed backend.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: the name of the region.
> * @size: size of the region.
> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
> @@ -1518,7 +1517,7 @@ bool memory_region_init_ram_from_fd(MemoryRegion *mr,
> * region will modify memory directly.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1546,7 +1545,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
> * skip_dump flag.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: the name of the region.
> * @size: size of the region.
> * @ptr: memory to be mapped; must contain at least @size bytes.
> @@ -1566,7 +1565,7 @@ void memory_region_init_ram_device_ptr(MemoryRegion *mr,
> * part of another memory region.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: used for debugging; not visible to the user or ABI
> * @orig: the region to be referenced; @mr will be equivalent to
> * @orig between @offset and @offset + @size - 1.
> @@ -1592,7 +1591,7 @@ void memory_region_init_alias(MemoryRegion *mr,
> * of the caller.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1615,7 +1614,7 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr,
> * of the caller.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @ops: callbacks for write access handling (must not be NULL).
> * @opaque: passed to the read and write callbacks of the @ops structure.
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> @@ -1651,7 +1650,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr,
> * @_iommu_mr: the #IOMMUMemoryRegion to be initialized
> * @instance_size: the IOMMUMemoryRegion subclass instance size
> * @mrtypename: the type name of the #IOMMUMemoryRegion
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: used for debugging; not visible to the user or ABI
> * @size: size of the region.
> */
> @@ -1667,7 +1666,7 @@ void memory_region_init_iommu(void *_iommu_mr,
> * region will modify memory directly.
> *
> * @mr: the #MemoryRegion to be initialized
> - * @owner: the object that tracks the region's reference count (must be
> + * @owner: the object that provides the region's storage (must be
> * TYPE_DEVICE or a subclass of TYPE_DEVICE, or NULL)
> * @name: name of the memory region
> * @size: size of the region in bytes
> @@ -1713,7 +1712,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> * If you pass a non-NULL non-device @owner then we will assert.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> * must be unique within any device
> * @size: size of the region.
> @@ -1744,7 +1743,7 @@ bool memory_region_init_rom(MemoryRegion *mr,
> * If you pass a non-NULL non-device @owner then we will assert.
> *
> * @mr: the #MemoryRegion to be initialized.
> - * @owner: the object that tracks the region's reference count
> + * @owner: the object that provides the region's storage
> * @ops: callbacks for write access handling (must not be NULL).
> * @opaque: passed to the read and write callbacks of the @ops structure.
> * @name: Region name, becomes part of RAMBlock name used in migration stream
> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> index 8fdf6774f87e..b83c46615fba 100644
> --- a/hw/core/qdev.c
> +++ b/hw/core/qdev.c
> @@ -718,8 +718,18 @@ static void device_class_base_init(ObjectClass *class, const void *data)
> klass->props_count_ = 0;
> }
>
> +static int collect_memory_region(Object *child, void *opaque)
> +{
> + if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
> + g_array_append_val(opaque, child);
> + }
> +
> + return 0;
> +}
> +
> static void device_unparent(Object *obj)
> {
> + g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
> DeviceState *dev = DEVICE(obj);
> BusState *bus;
>
> @@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
> object_unref(OBJECT(dev->parent_bus));
> dev->parent_bus = NULL;
> }
> +
> + object_child_foreach(OBJECT(dev), collect_memory_region, array);
> +
> + for (gsize i = 0; i < array->len; i++) {
> + object_unparent(g_array_index(array, Object *, i));
> + }
What if the owner is not a device?
If we have a more self contained solution [1], which do not care about the
type of the owner object, why bother?
What is the benefit of your proposal here, comparing to [1]?
> }
>
> static char *
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index f5cba8a7fa66..7afcaa4bfe2e 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
> /* memory region owns self res->mr object and frees it by itself */
> memory_region_set_enabled(mr, false);
> memory_region_del_subregion(&b->hostmem, mr);
> + object_unparent(OBJECT(mr));
It's definitely not obvious why a memory core change will involve a GPU
change too.
> object_unparent(OBJECT(vmr));
> }
>
> diff --git a/system/memory.c b/system/memory.c
> index 56465479406f..edaf039b0647 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -1212,11 +1212,19 @@ static char *memory_region_escape_name(const char *name)
> return escaped;
> }
>
> +static void memory_region_free(void *obj)
> +{
> + MemoryRegion *mr = obj;
> +
> + object_unref(mr->owner);
> +}
> +
> static void memory_region_do_init(MemoryRegion *mr,
> Object *owner,
> const char *name,
> uint64_t size)
> {
> + OBJECT(mr)->free = memory_region_free;
I think this is wrong. This should break object_new(TYPE_MEMORY_REGION)
users.
> mr->size = int128_make64(size);
> if (size == UINT64_MAX) {
> mr->size = int128_2_64();
> @@ -1293,6 +1301,18 @@ static void memory_region_get_size(Object *obj, Visitor *v, const char *name,
> visit_type_uint64(v, name, &value, errp);
> }
>
> +static void memory_region_unparent(Object *obj)
> +{
> + MemoryRegion *mr = MEMORY_REGION(obj);
> +
> + object_ref(mr->owner);
> +}
> +
> +static void memory_region_class_init(ObjectClass *klass, const void *data)
> +{
> + klass->unparent = memory_region_unparent;
> +}
> +
> static void memory_region_initfn(Object *obj)
> {
> MemoryRegion *mr = MEMORY_REGION(obj);
> @@ -1826,25 +1846,19 @@ Object *memory_region_owner(MemoryRegion *mr)
>
> void memory_region_ref(MemoryRegion *mr)
> {
> - /* MMIO callbacks most likely will access data that belongs
> - * to the owner, hence the need to ref/unref the owner whenever
> - * the memory region is in use.
> - *
> - * The memory region is a child of its owner. As long as the
> - * owner doesn't call unparent itself on the memory region,
> - * ref-ing the owner will also keep the memory region alive.
> + /*
> * Memory regions without an owner are supposed to never go away;
> * we do not ref/unref them because it slows down DMA sensibly.
> */
> if (mr && mr->owner) {
> - object_ref(mr->owner);
> + object_ref(mr);
> }
> }
>
> void memory_region_unref(MemoryRegion *mr)
> {
> if (mr && mr->owner) {
> - object_unref(mr->owner);
> + object_unref(mr);
> }
> }
>
> @@ -3777,6 +3791,7 @@ static const TypeInfo memory_region_info = {
> .parent = TYPE_OBJECT,
> .name = TYPE_MEMORY_REGION,
> .class_size = sizeof(MemoryRegionClass),
> + .class_init = memory_region_class_init,
> .instance_size = sizeof(MemoryRegion),
> .instance_init = memory_region_initfn,
> .instance_finalize = memory_region_finalize,
>
> --
> 2.51.0
>
[1] https://lore.kernel.org/all/20250826221750.285242-1-peterx@redhat.com/
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-10 21:45 ` Peter Xu
@ 2025-09-11 3:40 ` Akihiko Odaki
2025-09-11 22:26 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-11 3:40 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko
On 2025/09/11 6:45, Peter Xu wrote:
> On Sat, Sep 06, 2025 at 04:39:06AM +0200, Akihiko Odaki wrote:
>> MemoryRegions used to "piggyback" on their owners instead of using their
>> own reference counters due to the circular dependencies between
>> them, which caused new circular references. Stop piggybacking, showing
>> that the circular dependencies are actually broken at the finalization
>> time.
>>
>> Circular references caused by piggybacking
>> ==========================================
>>
>> Piggybacking created another circular reference when an owner referred
>> to its MemoryRegion via memory_region_ref(). This could happen in three
>> scenarios:
>>
>> - A subregion and its container had the same owner. In this case, the
>> container would use memory_region_ref() to take a reference to the
>> subregion.
>>
>> - An AddressSpace and its root MemoryRegion had the same owner. The
>> situation is similar to the subregion/container scenario.
>>
>> - An owner called address_space_map() with a guest-controlled address
>> that points back to its MemoryRegion. This scenario is similar to what
>> MemReentrancyGuard handles.
>>
>> Avoiding such a circular reference required checking if the referrer is
>> the owner at many places where memory_region_ref() is called,
>> complicating the code further.
>
> I understand (1). Why (2)/(3) are causes of circular references?
I first explain why (1) results in a circular reference, and generalize
the argument to (2) and (3).
Semantically, (1) has the following dependencies:
Owner <-> Subregion
Owner <-> Container
Container -> Subregion
However, the bidirectional dependencies prevents finalization, so
piggybacking is done. It makes the following reference relationship:
Owner -> Subregion
Owner -> Container
Container -> Owner
In this case, we still have a circular reference:
Owner -> Container -> Owner
This could be generalized: if you replace Subregion with (any kind of)
MemoryRegion and Container with "X", it will look like follows:
Owner -> MemoryRegion
Owner -> X
X -> Owner
In case of (2), "X" is an AddressSpace.
In case of (3), "X" is a mapped address.
>
>>
>> Insight
>> =======
>>
>> This change challenges the underlying assumption that circular
>> dependencies exist and can be tolerated at the finalization time. The
>> deletable MemoryRegions are of hotpluggable devices and
>> virtio-gpu-virgl's hostmem. A device and its MemoryRegion have the
>> following dependencies:
>>
>> QOM tree -> Device
>> Container -> MemoryRegion
>> Device <-> MemoryRegion
>
> Yes, they need to "reference" each other. That's also why IMHO the current
> qemu mr refcounting design is still practical, in that it keeps the
> reference graph to be acyclic, which is much simpler to understand.
Piggybacking is indeed simpler but has the three possible circular
reference scenarios I mentioned earlier.
>
>>
>> Techniques like piggybacking or a hypothetical garbage collector can
>> finalize the device and the MemoryRegion once the QOM tree and container
>> lose their dependencies. However, these methods are fundamentally
>> insufficient because the MemoryRegion and the device have finalizers,
>> and they do not respect the dependencies these objects may have during
>> finalization.
>>
>> As long as the object model based on the device and the MemoryRegion is
>> correct, one of them should lose its dependency on the first,
>> establishing a valid finalization order. Understanding this allows using
>> standard reference counting, which is immune to the problems of
>> piggybacking.
>>
>> Analysis reveals that the device no longer depends on the MemoryRegion
>> after being unparented. The device needs the MemoryRegion for the
>> following two purposes:
>> - To add the MemoryRegion to a container and expose it to an
>> AddressSpace.
>> - To expose it via the QOM tree.
>>
>> Once unparented, the device is hidden from the AddresSpaces and the
>> QOM tree so it no longer needs the MemoryRegion and the circular
>> dependencies are broken. We only need to ensure that the device is
>> finalized after the MemoryRegion then.
>>
>> It should also be noted that the reference from the MemoryRegion to the
>> device needs to be tracked only after the device is unparented because
>> the device will not be finalized as long as it is in the QOM tree.
>> This fact can be exploited to avoid having a circular reference between
>> the device and the MemoryRegion before the device gets unparented.
>>
>> Solution
>> ========
>>
>> When devices get unparented, they also unparent their memory regions.
>> The unparented memory regions will then retain their references to the
>> devices. This ensures that:
>>
>> 1. the references from the devices to their memory regions are counted
>> until the devices get unparented
>> 2. the references from their memory regions to the devices afterwards,
>> which in turn ensures that devices are finalized after their
>> MemoryRegions.
>>
>> When virtio-gpu's hostmems get unparented, they also unparent their
>> memory regions in the same manner.
>>
>> This enforces a valid finalization order, allows MemoryRegions to rely
>> on standard reference counting, and eliminates the entire class of
>> memory leaks caused by piggybacking.
>>
>> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
>> ---
>> docs/devel/memory.rst | 41 +++++++++++++++++-----------------
>> include/system/memory.h | 51 +++++++++++++++++++++----------------------
>> hw/core/qdev.c | 16 ++++++++++++++
>> hw/display/virtio-gpu-virgl.c | 1 +
>> system/memory.c | 33 ++++++++++++++++++++--------
>> 5 files changed, 86 insertions(+), 56 deletions(-)
>>
>> diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
>> index 749f11d8a4dd..9634d8805740 100644
>> --- a/docs/devel/memory.rst
>> +++ b/docs/devel/memory.rst
>> @@ -143,11 +143,7 @@ Region lifecycle
>> ----------------
>>
>> A region is created by one of the memory_region_init*() functions and
>> -attached to an object, which acts as its owner or parent. QEMU ensures
>> -that the owner object remains alive as long as the region is visible to
>> -the guest, or as long as the region is in use by a virtual CPU or another
>> -device. For example, the owner object will not die between an
>> -address_space_map operation and the corresponding address_space_unmap.
>> +attached to an object, which acts as its owner or parent.
>>
>> After creation, a region can be added to an address space or a
>> container with memory_region_add_subregion(), and removed using
>> @@ -158,30 +154,34 @@ ioeventfd) can be changed during the region lifecycle. They take effect
>> as soon as the region is made visible. This can be immediately, later,
>> or never.
>>
>> -Destruction of a memory region happens automatically when the owner
>> -object dies.
>> +A region will retain references to its owner object when it is
>> +unparented. This ensures that the owner object remains alive as long as
>> +the region is in use by a virtual CPU or another device. For example,
>> +the owner object will not die between an address_space_map operation and
>> +the corresponding address_space_unmap. Devices automatically unparent
>> +their memory regions when they are unparented.
>>
>> -You must not destroy a memory region as long as it may be in use by a
>> -device or CPU. In order to do this, as a general rule do not create or
>> -destroy memory regions dynamically during a device's lifetime.
>> +You must not free a memory region as long as it may be in use by a
>> +device or CPU. In order to do this, as a general rule do not allocate
>> +or free memory regions dynamically during a device's lifetime.
>> The dynamically allocated data structure that contains the
>> memory region should be freed in the instance_finalize callback.
>>
>> If you break this rule, the following situation can happen:
>>
>> -- the memory region's owner had a reference taken via memory_region_ref
>> +- the memory region had a reference taken via memory_region_ref
>> (for example by address_space_map)
>>
>> -- the region is unparented, and has no owner anymore
>> +- the region is freed
>>
>> -- when address_space_unmap is called, the reference to the memory region's
>> - owner is leaked.
>> +- when the mapped memory is used, the use of the memory region
>> + results in use-after-free.
>>
>>
>> -There is an exception to the above rule: it is okay to call
>> -object_unparent at any time for an alias or a container region. It is
>> -therefore also okay to create or destroy alias and container regions
>> -dynamically during a device's lifetime.
>> +There is an exception to the above rule: it is okay to free an alias or
>> +a container region at any time. It is therefore also okay to allocate
>> +or free alias and container regions dynamically during a device's
>> +lifetime.
>>
>> This exceptional usage is valid because aliases and containers only help
>> QEMU building the guest's memory map; they are never accessed directly.
>> @@ -191,9 +191,8 @@ this exception is rarely necessary, and therefore it is discouraged,
>> but nevertheless it is used in a few places.
>>
>> For regions that "have no owner" (NULL is passed at creation time), the
>> -machine object is actually used as the owner. You must never call
>> -object_unparent on regions that have no owner, unless they are aliases
>> -or containers.
>> +machine object is actually used as the owner. You must never free
>> +regions that have no owner, unless they are aliases or containers.
>>
>>
>> Overlapping regions and priority
>> diff --git a/include/system/memory.h b/include/system/memory.h
>> index e2cd6ed12614..1fc773ca49e2 100644
>> --- a/include/system/memory.h
>> +++ b/include/system/memory.h
>> @@ -1307,7 +1307,7 @@ static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
>> * memory_region_add_subregion() to add subregions.
>> *
>> * @mr: the #MemoryRegion to be initialized
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: used for debugging; not visible to the user or ABI
>> * @size: size of the region; any subregions beyond this size will be clipped
>> */
>> @@ -1320,14 +1320,14 @@ void memory_region_init(MemoryRegion *mr,
>> * memory_region_ref: Add 1 to a memory region's reference count
>> *
>> * Whenever memory regions are accessed outside the BQL, they need to be
>> - * preserved against hot-unplug. MemoryRegions actually do not have their
>> - * own reference count; they piggyback on a QOM object, their "owner".
>> - * This function adds a reference to the owner.
>> + * preserved against hot-unplug. This function adds a reference to the
>> + * memory region.
>> *
>> - * All MemoryRegions must have an owner if they can disappear, even if the
>> - * device they belong to operates exclusively under the BQL. This is because
>> - * the region could be returned at any time by memory_region_find, and this
>> - * is usually under guest control.
>> + * We do not ref/unref memory regions without an owner because it slows
>> + * down DMA sensibly. All MemoryRegions must have an owner if they can
>> + * disappear, even if the device they belong to operates exclusively
>> + * under the BQL. This is because the region could be returned at any
>> + * time by memory_region_find, and this is usually under guest control.
>> *
>> * @mr: the #MemoryRegion
>> */
>> @@ -1337,9 +1337,8 @@ void memory_region_ref(MemoryRegion *mr);
>> * memory_region_unref: Remove 1 to a memory region's reference count
>> *
>> * Whenever memory regions are accessed outside the BQL, they need to be
>> - * preserved against hot-unplug. MemoryRegions actually do not have their
>> - * own reference count; they piggyback on a QOM object, their "owner".
>> - * This function removes a reference to the owner and possibly destroys it.
>> + * preserved against hot-unplug. This function removes a reference to
>> + * the memory and possibly destroys it.
>> *
>> * @mr: the #MemoryRegion
>> */
>> @@ -1352,7 +1351,7 @@ void memory_region_unref(MemoryRegion *mr);
>> * if @size is nonzero, subregions will be clipped to @size.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @ops: a structure containing read and write callbacks to be used when
>> * I/O is performed on the region.
>> * @opaque: passed to the read and write callbacks of the @ops structure.
>> @@ -1372,7 +1371,7 @@ void memory_region_init_io(MemoryRegion *mr,
>> * directly.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1395,7 +1394,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
>> * modify memory directly.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1425,7 +1424,7 @@ bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr,
>> * canceled.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: used size of the region.
>> @@ -1454,7 +1453,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
>> * mmap-ed backend.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1487,7 +1486,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
>> * mmap-ed backend.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: the name of the region.
>> * @size: size of the region.
>> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>> @@ -1518,7 +1517,7 @@ bool memory_region_init_ram_from_fd(MemoryRegion *mr,
>> * region will modify memory directly.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1546,7 +1545,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
>> * skip_dump flag.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: the name of the region.
>> * @size: size of the region.
>> * @ptr: memory to be mapped; must contain at least @size bytes.
>> @@ -1566,7 +1565,7 @@ void memory_region_init_ram_device_ptr(MemoryRegion *mr,
>> * part of another memory region.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: used for debugging; not visible to the user or ABI
>> * @orig: the region to be referenced; @mr will be equivalent to
>> * @orig between @offset and @offset + @size - 1.
>> @@ -1592,7 +1591,7 @@ void memory_region_init_alias(MemoryRegion *mr,
>> * of the caller.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1615,7 +1614,7 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr,
>> * of the caller.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @ops: callbacks for write access handling (must not be NULL).
>> * @opaque: passed to the read and write callbacks of the @ops structure.
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> @@ -1651,7 +1650,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr,
>> * @_iommu_mr: the #IOMMUMemoryRegion to be initialized
>> * @instance_size: the IOMMUMemoryRegion subclass instance size
>> * @mrtypename: the type name of the #IOMMUMemoryRegion
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: used for debugging; not visible to the user or ABI
>> * @size: size of the region.
>> */
>> @@ -1667,7 +1666,7 @@ void memory_region_init_iommu(void *_iommu_mr,
>> * region will modify memory directly.
>> *
>> * @mr: the #MemoryRegion to be initialized
>> - * @owner: the object that tracks the region's reference count (must be
>> + * @owner: the object that provides the region's storage (must be
>> * TYPE_DEVICE or a subclass of TYPE_DEVICE, or NULL)
>> * @name: name of the memory region
>> * @size: size of the region in bytes
>> @@ -1713,7 +1712,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
>> * If you pass a non-NULL non-device @owner then we will assert.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> * must be unique within any device
>> * @size: size of the region.
>> @@ -1744,7 +1743,7 @@ bool memory_region_init_rom(MemoryRegion *mr,
>> * If you pass a non-NULL non-device @owner then we will assert.
>> *
>> * @mr: the #MemoryRegion to be initialized.
>> - * @owner: the object that tracks the region's reference count
>> + * @owner: the object that provides the region's storage
>> * @ops: callbacks for write access handling (must not be NULL).
>> * @opaque: passed to the read and write callbacks of the @ops structure.
>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>> index 8fdf6774f87e..b83c46615fba 100644
>> --- a/hw/core/qdev.c
>> +++ b/hw/core/qdev.c
>> @@ -718,8 +718,18 @@ static void device_class_base_init(ObjectClass *class, const void *data)
>> klass->props_count_ = 0;
>> }
>>
>> +static int collect_memory_region(Object *child, void *opaque)
>> +{
>> + if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
>> + g_array_append_val(opaque, child);
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static void device_unparent(Object *obj)
>> {
>> + g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
>> DeviceState *dev = DEVICE(obj);
>> BusState *bus;
>>
>> @@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
>> object_unref(OBJECT(dev->parent_bus));
>> dev->parent_bus = NULL;
>> }
>> +
>> + object_child_foreach(OBJECT(dev), collect_memory_region, array);
>> +
>> + for (gsize i = 0; i < array->len; i++) {
>> + object_unparent(g_array_index(array, Object *, i));
>> + }
>
> What if the owner is not a device?
>
> If we have a more self contained solution [1], which do not care about the
> type of the owner object, why bother?
>
> What is the benefit of your proposal here, comparing to [1]?
[1] only handles (1) but this covers the entire class of memory leaks,
including (2) and (3).
>
>> }
>>
>> static char *
>> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
>> index f5cba8a7fa66..7afcaa4bfe2e 100644
>> --- a/hw/display/virtio-gpu-virgl.c
>> +++ b/hw/display/virtio-gpu-virgl.c
>> @@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
>> /* memory region owns self res->mr object and frees it by itself */
>> memory_region_set_enabled(mr, false);
>> memory_region_del_subregion(&b->hostmem, mr);
>> + object_unparent(OBJECT(mr));
>
> It's definitely not obvious why a memory core change will involve a GPU
> change too.
>
>> object_unparent(OBJECT(vmr));
>> }
>>
>> diff --git a/system/memory.c b/system/memory.c
>> index 56465479406f..edaf039b0647 100644
>> --- a/system/memory.c
>> +++ b/system/memory.c
>> @@ -1212,11 +1212,19 @@ static char *memory_region_escape_name(const char *name)
>> return escaped;
>> }
>>
>> +static void memory_region_free(void *obj)
>> +{
>> + MemoryRegion *mr = obj;
>> +
>> + object_unref(mr->owner);
>> +}
>> +
>> static void memory_region_do_init(MemoryRegion *mr,
>> Object *owner,
>> const char *name,
>> uint64_t size)
>> {
>> + OBJECT(mr)->free = memory_region_free;
>
> I think this is wrong. This should break object_new(TYPE_MEMORY_REGION)
> users.
This function will never be called with an object created with
object_new(TYPE_MEMORY_REGION). It is called by memory_region_init() and
memory_region_init_iommu() and they call object_initialize(). If an
object created with object_new(TYPE_MEMORY_REGION) is passed to them, it
will result in double object initialization, which doesn't make sense.
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-11 3:40 ` Akihiko Odaki
@ 2025-09-11 22:26 ` Peter Xu
2025-09-18 12:04 ` Akihiko Odaki
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2025-09-11 22:26 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko
On Thu, Sep 11, 2025 at 12:40:43PM +0900, Akihiko Odaki wrote:
> On 2025/09/11 6:45, Peter Xu wrote:
> > On Sat, Sep 06, 2025 at 04:39:06AM +0200, Akihiko Odaki wrote:
> > > MemoryRegions used to "piggyback" on their owners instead of using their
> > > own reference counters due to the circular dependencies between
> > > them, which caused new circular references. Stop piggybacking, showing
> > > that the circular dependencies are actually broken at the finalization
> > > time.
> > >
> > > Circular references caused by piggybacking
> > > ==========================================
> > >
> > > Piggybacking created another circular reference when an owner referred
> > > to its MemoryRegion via memory_region_ref(). This could happen in three
> > > scenarios:
> > >
> > > - A subregion and its container had the same owner. In this case, the
> > > container would use memory_region_ref() to take a reference to the
> > > subregion.
> > >
> > > - An AddressSpace and its root MemoryRegion had the same owner. The
> > > situation is similar to the subregion/container scenario.
> > >
> > > - An owner called address_space_map() with a guest-controlled address
> > > that points back to its MemoryRegion. This scenario is similar to what
> > > MemReentrancyGuard handles.
> > >
> > > Avoiding such a circular reference required checking if the referrer is
> > > the owner at many places where memory_region_ref() is called,
> > > complicating the code further.
> >
> > I understand (1). Why (2)/(3) are causes of circular references?
>
> I first explain why (1) results in a circular reference, and generalize the
> argument to (2) and (3).
>
> Semantically, (1) has the following dependencies:
>
> Owner <-> Subregion
> Owner <-> Container
> Container -> Subregion
>
> However, the bidirectional dependencies prevents finalization, so
> piggybacking is done. It makes the following reference relationship:
>
> Owner -> Subregion
> Owner -> Container
> Container -> Owner
>
> In this case, we still have a circular reference:
> Owner -> Container -> Owner
>
> This could be generalized: if you replace Subregion with (any kind of)
> MemoryRegion and Container with "X", it will look like follows:
>
> Owner -> MemoryRegion
> Owner -> X
> X -> Owner
>
> In case of (2), "X" is an AddressSpace.
> In case of (3), "X" is a mapped address.
>
> >
> > >
> > > Insight
> > > =======
> > >
> > > This change challenges the underlying assumption that circular
> > > dependencies exist and can be tolerated at the finalization time. The
> > > deletable MemoryRegions are of hotpluggable devices and
> > > virtio-gpu-virgl's hostmem. A device and its MemoryRegion have the
> > > following dependencies:
> > >
> > > QOM tree -> Device
> > > Container -> MemoryRegion
> > > Device <-> MemoryRegion
> >
> > Yes, they need to "reference" each other. That's also why IMHO the current
> > qemu mr refcounting design is still practical, in that it keeps the
> > reference graph to be acyclic, which is much simpler to understand.
>
> Piggybacking is indeed simpler but has the three possible circular reference
> scenarios I mentioned earlier.
>
> >
> > >
> > > Techniques like piggybacking or a hypothetical garbage collector can
> > > finalize the device and the MemoryRegion once the QOM tree and container
> > > lose their dependencies. However, these methods are fundamentally
> > > insufficient because the MemoryRegion and the device have finalizers,
> > > and they do not respect the dependencies these objects may have during
> > > finalization.
> > >
> > > As long as the object model based on the device and the MemoryRegion is
> > > correct, one of them should lose its dependency on the first,
> > > establishing a valid finalization order. Understanding this allows using
> > > standard reference counting, which is immune to the problems of
> > > piggybacking.
> > >
> > > Analysis reveals that the device no longer depends on the MemoryRegion
> > > after being unparented. The device needs the MemoryRegion for the
> > > following two purposes:
> > > - To add the MemoryRegion to a container and expose it to an
> > > AddressSpace.
> > > - To expose it via the QOM tree.
> > >
> > > Once unparented, the device is hidden from the AddresSpaces and the
> > > QOM tree so it no longer needs the MemoryRegion and the circular
> > > dependencies are broken. We only need to ensure that the device is
> > > finalized after the MemoryRegion then.
> > >
> > > It should also be noted that the reference from the MemoryRegion to the
> > > device needs to be tracked only after the device is unparented because
> > > the device will not be finalized as long as it is in the QOM tree.
> > > This fact can be exploited to avoid having a circular reference between
> > > the device and the MemoryRegion before the device gets unparented.
> > >
> > > Solution
> > > ========
> > >
> > > When devices get unparented, they also unparent their memory regions.
> > > The unparented memory regions will then retain their references to the
> > > devices. This ensures that:
> > >
> > > 1. the references from the devices to their memory regions are counted
> > > until the devices get unparented
> > > 2. the references from their memory regions to the devices afterwards,
> > > which in turn ensures that devices are finalized after their
> > > MemoryRegions.
> > >
> > > When virtio-gpu's hostmems get unparented, they also unparent their
> > > memory regions in the same manner.
> > >
> > > This enforces a valid finalization order, allows MemoryRegions to rely
> > > on standard reference counting, and eliminates the entire class of
> > > memory leaks caused by piggybacking.
> > >
> > > Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> > > ---
> > > docs/devel/memory.rst | 41 +++++++++++++++++-----------------
> > > include/system/memory.h | 51 +++++++++++++++++++++----------------------
> > > hw/core/qdev.c | 16 ++++++++++++++
> > > hw/display/virtio-gpu-virgl.c | 1 +
> > > system/memory.c | 33 ++++++++++++++++++++--------
> > > 5 files changed, 86 insertions(+), 56 deletions(-)
> > >
> > > diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
> > > index 749f11d8a4dd..9634d8805740 100644
> > > --- a/docs/devel/memory.rst
> > > +++ b/docs/devel/memory.rst
> > > @@ -143,11 +143,7 @@ Region lifecycle
> > > ----------------
> > > A region is created by one of the memory_region_init*() functions and
> > > -attached to an object, which acts as its owner or parent. QEMU ensures
> > > -that the owner object remains alive as long as the region is visible to
> > > -the guest, or as long as the region is in use by a virtual CPU or another
> > > -device. For example, the owner object will not die between an
> > > -address_space_map operation and the corresponding address_space_unmap.
> > > +attached to an object, which acts as its owner or parent.
> > > After creation, a region can be added to an address space or a
> > > container with memory_region_add_subregion(), and removed using
> > > @@ -158,30 +154,34 @@ ioeventfd) can be changed during the region lifecycle. They take effect
> > > as soon as the region is made visible. This can be immediately, later,
> > > or never.
> > > -Destruction of a memory region happens automatically when the owner
> > > -object dies.
> > > +A region will retain references to its owner object when it is
> > > +unparented. This ensures that the owner object remains alive as long as
> > > +the region is in use by a virtual CPU or another device. For example,
> > > +the owner object will not die between an address_space_map operation and
> > > +the corresponding address_space_unmap. Devices automatically unparent
> > > +their memory regions when they are unparented.
> > > -You must not destroy a memory region as long as it may be in use by a
> > > -device or CPU. In order to do this, as a general rule do not create or
> > > -destroy memory regions dynamically during a device's lifetime.
> > > +You must not free a memory region as long as it may be in use by a
> > > +device or CPU. In order to do this, as a general rule do not allocate
> > > +or free memory regions dynamically during a device's lifetime.
> > > The dynamically allocated data structure that contains the
> > > memory region should be freed in the instance_finalize callback.
> > > If you break this rule, the following situation can happen:
> > > -- the memory region's owner had a reference taken via memory_region_ref
> > > +- the memory region had a reference taken via memory_region_ref
> > > (for example by address_space_map)
> > > -- the region is unparented, and has no owner anymore
> > > +- the region is freed
> > > -- when address_space_unmap is called, the reference to the memory region's
> > > - owner is leaked.
> > > +- when the mapped memory is used, the use of the memory region
> > > + results in use-after-free.
> > > -There is an exception to the above rule: it is okay to call
> > > -object_unparent at any time for an alias or a container region. It is
> > > -therefore also okay to create or destroy alias and container regions
> > > -dynamically during a device's lifetime.
> > > +There is an exception to the above rule: it is okay to free an alias or
> > > +a container region at any time. It is therefore also okay to allocate
> > > +or free alias and container regions dynamically during a device's
> > > +lifetime.
> > > This exceptional usage is valid because aliases and containers only help
> > > QEMU building the guest's memory map; they are never accessed directly.
> > > @@ -191,9 +191,8 @@ this exception is rarely necessary, and therefore it is discouraged,
> > > but nevertheless it is used in a few places.
> > > For regions that "have no owner" (NULL is passed at creation time), the
> > > -machine object is actually used as the owner. You must never call
> > > -object_unparent on regions that have no owner, unless they are aliases
> > > -or containers.
> > > +machine object is actually used as the owner. You must never free
> > > +regions that have no owner, unless they are aliases or containers.
> > > Overlapping regions and priority
> > > diff --git a/include/system/memory.h b/include/system/memory.h
> > > index e2cd6ed12614..1fc773ca49e2 100644
> > > --- a/include/system/memory.h
> > > +++ b/include/system/memory.h
> > > @@ -1307,7 +1307,7 @@ static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
> > > * memory_region_add_subregion() to add subregions.
> > > *
> > > * @mr: the #MemoryRegion to be initialized
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: used for debugging; not visible to the user or ABI
> > > * @size: size of the region; any subregions beyond this size will be clipped
> > > */
> > > @@ -1320,14 +1320,14 @@ void memory_region_init(MemoryRegion *mr,
> > > * memory_region_ref: Add 1 to a memory region's reference count
> > > *
> > > * Whenever memory regions are accessed outside the BQL, they need to be
> > > - * preserved against hot-unplug. MemoryRegions actually do not have their
> > > - * own reference count; they piggyback on a QOM object, their "owner".
> > > - * This function adds a reference to the owner.
> > > + * preserved against hot-unplug. This function adds a reference to the
> > > + * memory region.
> > > *
> > > - * All MemoryRegions must have an owner if they can disappear, even if the
> > > - * device they belong to operates exclusively under the BQL. This is because
> > > - * the region could be returned at any time by memory_region_find, and this
> > > - * is usually under guest control.
> > > + * We do not ref/unref memory regions without an owner because it slows
> > > + * down DMA sensibly. All MemoryRegions must have an owner if they can
> > > + * disappear, even if the device they belong to operates exclusively
> > > + * under the BQL. This is because the region could be returned at any
> > > + * time by memory_region_find, and this is usually under guest control.
> > > *
> > > * @mr: the #MemoryRegion
> > > */
> > > @@ -1337,9 +1337,8 @@ void memory_region_ref(MemoryRegion *mr);
> > > * memory_region_unref: Remove 1 to a memory region's reference count
> > > *
> > > * Whenever memory regions are accessed outside the BQL, they need to be
> > > - * preserved against hot-unplug. MemoryRegions actually do not have their
> > > - * own reference count; they piggyback on a QOM object, their "owner".
> > > - * This function removes a reference to the owner and possibly destroys it.
> > > + * preserved against hot-unplug. This function removes a reference to
> > > + * the memory and possibly destroys it.
> > > *
> > > * @mr: the #MemoryRegion
> > > */
> > > @@ -1352,7 +1351,7 @@ void memory_region_unref(MemoryRegion *mr);
> > > * if @size is nonzero, subregions will be clipped to @size.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @ops: a structure containing read and write callbacks to be used when
> > > * I/O is performed on the region.
> > > * @opaque: passed to the read and write callbacks of the @ops structure.
> > > @@ -1372,7 +1371,7 @@ void memory_region_init_io(MemoryRegion *mr,
> > > * directly.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1395,7 +1394,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
> > > * modify memory directly.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1425,7 +1424,7 @@ bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr,
> > > * canceled.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: used size of the region.
> > > @@ -1454,7 +1453,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
> > > * mmap-ed backend.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1487,7 +1486,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
> > > * mmap-ed backend.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: the name of the region.
> > > * @size: size of the region.
> > > * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
> > > @@ -1518,7 +1517,7 @@ bool memory_region_init_ram_from_fd(MemoryRegion *mr,
> > > * region will modify memory directly.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1546,7 +1545,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
> > > * skip_dump flag.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: the name of the region.
> > > * @size: size of the region.
> > > * @ptr: memory to be mapped; must contain at least @size bytes.
> > > @@ -1566,7 +1565,7 @@ void memory_region_init_ram_device_ptr(MemoryRegion *mr,
> > > * part of another memory region.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: used for debugging; not visible to the user or ABI
> > > * @orig: the region to be referenced; @mr will be equivalent to
> > > * @orig between @offset and @offset + @size - 1.
> > > @@ -1592,7 +1591,7 @@ void memory_region_init_alias(MemoryRegion *mr,
> > > * of the caller.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1615,7 +1614,7 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr,
> > > * of the caller.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @ops: callbacks for write access handling (must not be NULL).
> > > * @opaque: passed to the read and write callbacks of the @ops structure.
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > @@ -1651,7 +1650,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr,
> > > * @_iommu_mr: the #IOMMUMemoryRegion to be initialized
> > > * @instance_size: the IOMMUMemoryRegion subclass instance size
> > > * @mrtypename: the type name of the #IOMMUMemoryRegion
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: used for debugging; not visible to the user or ABI
> > > * @size: size of the region.
> > > */
> > > @@ -1667,7 +1666,7 @@ void memory_region_init_iommu(void *_iommu_mr,
> > > * region will modify memory directly.
> > > *
> > > * @mr: the #MemoryRegion to be initialized
> > > - * @owner: the object that tracks the region's reference count (must be
> > > + * @owner: the object that provides the region's storage (must be
> > > * TYPE_DEVICE or a subclass of TYPE_DEVICE, or NULL)
> > > * @name: name of the memory region
> > > * @size: size of the region in bytes
> > > @@ -1713,7 +1712,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
> > > * If you pass a non-NULL non-device @owner then we will assert.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > * must be unique within any device
> > > * @size: size of the region.
> > > @@ -1744,7 +1743,7 @@ bool memory_region_init_rom(MemoryRegion *mr,
> > > * If you pass a non-NULL non-device @owner then we will assert.
> > > *
> > > * @mr: the #MemoryRegion to be initialized.
> > > - * @owner: the object that tracks the region's reference count
> > > + * @owner: the object that provides the region's storage
> > > * @ops: callbacks for write access handling (must not be NULL).
> > > * @opaque: passed to the read and write callbacks of the @ops structure.
> > > * @name: Region name, becomes part of RAMBlock name used in migration stream
> > > diff --git a/hw/core/qdev.c b/hw/core/qdev.c
> > > index 8fdf6774f87e..b83c46615fba 100644
> > > --- a/hw/core/qdev.c
> > > +++ b/hw/core/qdev.c
> > > @@ -718,8 +718,18 @@ static void device_class_base_init(ObjectClass *class, const void *data)
> > > klass->props_count_ = 0;
> > > }
> > > +static int collect_memory_region(Object *child, void *opaque)
> > > +{
> > > + if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
> > > + g_array_append_val(opaque, child);
> > > + }
> > > +
> > > + return 0;
> > > +}
> > > +
> > > static void device_unparent(Object *obj)
> > > {
> > > + g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
> > > DeviceState *dev = DEVICE(obj);
> > > BusState *bus;
> > > @@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
> > > object_unref(OBJECT(dev->parent_bus));
> > > dev->parent_bus = NULL;
> > > }
> > > +
> > > + object_child_foreach(OBJECT(dev), collect_memory_region, array);
> > > +
> > > + for (gsize i = 0; i < array->len; i++) {
> > > + object_unparent(g_array_index(array, Object *, i));
> > > + }
> >
> > What if the owner is not a device?
> >
> > If we have a more self contained solution [1], which do not care about the
> > type of the owner object, why bother?
> >
> > What is the benefit of your proposal here, comparing to [1]?
>
> [1] only handles (1) but this covers the entire class of memory leaks,
> including (2) and (3).
Can you provide some example memory leaks that can reproduce on current
master branch with (2) and (3)? How severe are they?
I plan to merge the one-patch fix first on circular ref. This issue has
been dangling for too long, and I highly doubt anyone would even start to
like patch 1 of your this series on "finalizing" stage.
I don't see why we need to be blocked forever on this, keeping PeterM and
others poking the known issues. It'll be great if you can work whatever on
top of that, and justifying whatever you propose is better is also easier.
Is that OK for you?
>
> >
> > > }
> > > static char *
> > > diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> > > index f5cba8a7fa66..7afcaa4bfe2e 100644
> > > --- a/hw/display/virtio-gpu-virgl.c
> > > +++ b/hw/display/virtio-gpu-virgl.c
> > > @@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
> > > /* memory region owns self res->mr object and frees it by itself */
> > > memory_region_set_enabled(mr, false);
> > > memory_region_del_subregion(&b->hostmem, mr);
> > > + object_unparent(OBJECT(mr));
> >
> > It's definitely not obvious why a memory core change will involve a GPU
> > change too.
> >
> > > object_unparent(OBJECT(vmr));
> > > }
> > > diff --git a/system/memory.c b/system/memory.c
> > > index 56465479406f..edaf039b0647 100644
> > > --- a/system/memory.c
> > > +++ b/system/memory.c
> > > @@ -1212,11 +1212,19 @@ static char *memory_region_escape_name(const char *name)
> > > return escaped;
> > > }
> > > +static void memory_region_free(void *obj)
> > > +{
> > > + MemoryRegion *mr = obj;
> > > +
> > > + object_unref(mr->owner);
> > > +}
> > > +
> > > static void memory_region_do_init(MemoryRegion *mr,
> > > Object *owner,
> > > const char *name,
> > > uint64_t size)
> > > {
> > > + OBJECT(mr)->free = memory_region_free;
> >
> > I think this is wrong. This should break object_new(TYPE_MEMORY_REGION)
> > users.
>
> This function will never be called with an object created with
> object_new(TYPE_MEMORY_REGION). It is called by memory_region_init() and
> memory_region_init_iommu() and they call object_initialize(). If an object
> created with object_new(TYPE_MEMORY_REGION) is passed to them, it will
> result in double object initialization, which doesn't make sense.
I actually didn't notice we don't have much users. I think it'll block new
users, then, by making free() not usable anymore. Not to mention it's also
definitely an abuse to reuse a free() for other things.
It can be relevant to the other email discussion too:
https://lore.kernel.org/all/aMHidDl1tdx-2G4e@x1.local/
I feel like object_new(TYPE_MEMORY_REGION) can be a good thing to simplify
allocated MRs whose lifecycle still is the same as the device for the long
term.
Some memory APIs will need touch up, but it shouldn't be much, basically we
want to skip object_initialize() for all memory_region_init_*() APIs on top
of object_new(TYPE_MEMORY_REGION) MRs.
I'm also not eager to change anything yet, because these MRs are minority,
and I don't immediately have a problem to solve myself here. So I would
like to know what exactly problem you're hitting with (2)/(3) above.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-11 22:26 ` Peter Xu
@ 2025-09-18 12:04 ` Akihiko Odaki
2025-09-24 21:14 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Akihiko Odaki @ 2025-09-18 12:04 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Manos Pitsidianakis
On 2025/09/12 7:26, Peter Xu wrote:
> On Thu, Sep 11, 2025 at 12:40:43PM +0900, Akihiko Odaki wrote:
>> On 2025/09/11 6:45, Peter Xu wrote:
>>> On Sat, Sep 06, 2025 at 04:39:06AM +0200, Akihiko Odaki wrote:
>>>> MemoryRegions used to "piggyback" on their owners instead of using their
>>>> own reference counters due to the circular dependencies between
>>>> them, which caused new circular references. Stop piggybacking, showing
>>>> that the circular dependencies are actually broken at the finalization
>>>> time.
>>>>
>>>> Circular references caused by piggybacking
>>>> ==========================================
>>>>
>>>> Piggybacking created another circular reference when an owner referred
>>>> to its MemoryRegion via memory_region_ref(). This could happen in three
>>>> scenarios:
>>>>
>>>> - A subregion and its container had the same owner. In this case, the
>>>> container would use memory_region_ref() to take a reference to the
>>>> subregion.
>>>>
>>>> - An AddressSpace and its root MemoryRegion had the same owner. The
>>>> situation is similar to the subregion/container scenario.
>>>>
>>>> - An owner called address_space_map() with a guest-controlled address
>>>> that points back to its MemoryRegion. This scenario is similar to what
>>>> MemReentrancyGuard handles.
>>>>
>>>> Avoiding such a circular reference required checking if the referrer is
>>>> the owner at many places where memory_region_ref() is called,
>>>> complicating the code further.
>>>
>>> I understand (1). Why (2)/(3) are causes of circular references?
>>
>> I first explain why (1) results in a circular reference, and generalize the
>> argument to (2) and (3).
>>
>> Semantically, (1) has the following dependencies:
>>
>> Owner <-> Subregion
>> Owner <-> Container
>> Container -> Subregion
>>
>> However, the bidirectional dependencies prevents finalization, so
>> piggybacking is done. It makes the following reference relationship:
>>
>> Owner -> Subregion
>> Owner -> Container
>> Container -> Owner
>>
>> In this case, we still have a circular reference:
>> Owner -> Container -> Owner
>>
>> This could be generalized: if you replace Subregion with (any kind of)
>> MemoryRegion and Container with "X", it will look like follows:
>>
>> Owner -> MemoryRegion
>> Owner -> X
>> X -> Owner
>>
>> In case of (2), "X" is an AddressSpace.
>> In case of (3), "X" is a mapped address.
>>
>>>
>>>>
>>>> Insight
>>>> =======
>>>>
>>>> This change challenges the underlying assumption that circular
>>>> dependencies exist and can be tolerated at the finalization time. The
>>>> deletable MemoryRegions are of hotpluggable devices and
>>>> virtio-gpu-virgl's hostmem. A device and its MemoryRegion have the
>>>> following dependencies:
>>>>
>>>> QOM tree -> Device
>>>> Container -> MemoryRegion
>>>> Device <-> MemoryRegion
>>>
>>> Yes, they need to "reference" each other. That's also why IMHO the current
>>> qemu mr refcounting design is still practical, in that it keeps the
>>> reference graph to be acyclic, which is much simpler to understand.
>>
>> Piggybacking is indeed simpler but has the three possible circular reference
>> scenarios I mentioned earlier.
>>
>>>
>>>>
>>>> Techniques like piggybacking or a hypothetical garbage collector can
>>>> finalize the device and the MemoryRegion once the QOM tree and container
>>>> lose their dependencies. However, these methods are fundamentally
>>>> insufficient because the MemoryRegion and the device have finalizers,
>>>> and they do not respect the dependencies these objects may have during
>>>> finalization.
>>>>
>>>> As long as the object model based on the device and the MemoryRegion is
>>>> correct, one of them should lose its dependency on the first,
>>>> establishing a valid finalization order. Understanding this allows using
>>>> standard reference counting, which is immune to the problems of
>>>> piggybacking.
>>>>
>>>> Analysis reveals that the device no longer depends on the MemoryRegion
>>>> after being unparented. The device needs the MemoryRegion for the
>>>> following two purposes:
>>>> - To add the MemoryRegion to a container and expose it to an
>>>> AddressSpace.
>>>> - To expose it via the QOM tree.
>>>>
>>>> Once unparented, the device is hidden from the AddresSpaces and the
>>>> QOM tree so it no longer needs the MemoryRegion and the circular
>>>> dependencies are broken. We only need to ensure that the device is
>>>> finalized after the MemoryRegion then.
>>>>
>>>> It should also be noted that the reference from the MemoryRegion to the
>>>> device needs to be tracked only after the device is unparented because
>>>> the device will not be finalized as long as it is in the QOM tree.
>>>> This fact can be exploited to avoid having a circular reference between
>>>> the device and the MemoryRegion before the device gets unparented.
>>>>
>>>> Solution
>>>> ========
>>>>
>>>> When devices get unparented, they also unparent their memory regions.
>>>> The unparented memory regions will then retain their references to the
>>>> devices. This ensures that:
>>>>
>>>> 1. the references from the devices to their memory regions are counted
>>>> until the devices get unparented
>>>> 2. the references from their memory regions to the devices afterwards,
>>>> which in turn ensures that devices are finalized after their
>>>> MemoryRegions.
>>>>
>>>> When virtio-gpu's hostmems get unparented, they also unparent their
>>>> memory regions in the same manner.
>>>>
>>>> This enforces a valid finalization order, allows MemoryRegions to rely
>>>> on standard reference counting, and eliminates the entire class of
>>>> memory leaks caused by piggybacking.
>>>>
>>>> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
>>>> ---
>>>> docs/devel/memory.rst | 41 +++++++++++++++++-----------------
>>>> include/system/memory.h | 51 +++++++++++++++++++++----------------------
>>>> hw/core/qdev.c | 16 ++++++++++++++
>>>> hw/display/virtio-gpu-virgl.c | 1 +
>>>> system/memory.c | 33 ++++++++++++++++++++--------
>>>> 5 files changed, 86 insertions(+), 56 deletions(-)
>>>>
>>>> diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
>>>> index 749f11d8a4dd..9634d8805740 100644
>>>> --- a/docs/devel/memory.rst
>>>> +++ b/docs/devel/memory.rst
>>>> @@ -143,11 +143,7 @@ Region lifecycle
>>>> ----------------
>>>> A region is created by one of the memory_region_init*() functions and
>>>> -attached to an object, which acts as its owner or parent. QEMU ensures
>>>> -that the owner object remains alive as long as the region is visible to
>>>> -the guest, or as long as the region is in use by a virtual CPU or another
>>>> -device. For example, the owner object will not die between an
>>>> -address_space_map operation and the corresponding address_space_unmap.
>>>> +attached to an object, which acts as its owner or parent.
>>>> After creation, a region can be added to an address space or a
>>>> container with memory_region_add_subregion(), and removed using
>>>> @@ -158,30 +154,34 @@ ioeventfd) can be changed during the region lifecycle. They take effect
>>>> as soon as the region is made visible. This can be immediately, later,
>>>> or never.
>>>> -Destruction of a memory region happens automatically when the owner
>>>> -object dies.
>>>> +A region will retain references to its owner object when it is
>>>> +unparented. This ensures that the owner object remains alive as long as
>>>> +the region is in use by a virtual CPU or another device. For example,
>>>> +the owner object will not die between an address_space_map operation and
>>>> +the corresponding address_space_unmap. Devices automatically unparent
>>>> +their memory regions when they are unparented.
>>>> -You must not destroy a memory region as long as it may be in use by a
>>>> -device or CPU. In order to do this, as a general rule do not create or
>>>> -destroy memory regions dynamically during a device's lifetime.
>>>> +You must not free a memory region as long as it may be in use by a
>>>> +device or CPU. In order to do this, as a general rule do not allocate
>>>> +or free memory regions dynamically during a device's lifetime.
>>>> The dynamically allocated data structure that contains the
>>>> memory region should be freed in the instance_finalize callback.
>>>> If you break this rule, the following situation can happen:
>>>> -- the memory region's owner had a reference taken via memory_region_ref
>>>> +- the memory region had a reference taken via memory_region_ref
>>>> (for example by address_space_map)
>>>> -- the region is unparented, and has no owner anymore
>>>> +- the region is freed
>>>> -- when address_space_unmap is called, the reference to the memory region's
>>>> - owner is leaked.
>>>> +- when the mapped memory is used, the use of the memory region
>>>> + results in use-after-free.
>>>> -There is an exception to the above rule: it is okay to call
>>>> -object_unparent at any time for an alias or a container region. It is
>>>> -therefore also okay to create or destroy alias and container regions
>>>> -dynamically during a device's lifetime.
>>>> +There is an exception to the above rule: it is okay to free an alias or
>>>> +a container region at any time. It is therefore also okay to allocate
>>>> +or free alias and container regions dynamically during a device's
>>>> +lifetime.
>>>> This exceptional usage is valid because aliases and containers only help
>>>> QEMU building the guest's memory map; they are never accessed directly.
>>>> @@ -191,9 +191,8 @@ this exception is rarely necessary, and therefore it is discouraged,
>>>> but nevertheless it is used in a few places.
>>>> For regions that "have no owner" (NULL is passed at creation time), the
>>>> -machine object is actually used as the owner. You must never call
>>>> -object_unparent on regions that have no owner, unless they are aliases
>>>> -or containers.
>>>> +machine object is actually used as the owner. You must never free
>>>> +regions that have no owner, unless they are aliases or containers.
>>>> Overlapping regions and priority
>>>> diff --git a/include/system/memory.h b/include/system/memory.h
>>>> index e2cd6ed12614..1fc773ca49e2 100644
>>>> --- a/include/system/memory.h
>>>> +++ b/include/system/memory.h
>>>> @@ -1307,7 +1307,7 @@ static inline bool memory_region_section_intersect_range(MemoryRegionSection *s,
>>>> * memory_region_add_subregion() to add subregions.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: used for debugging; not visible to the user or ABI
>>>> * @size: size of the region; any subregions beyond this size will be clipped
>>>> */
>>>> @@ -1320,14 +1320,14 @@ void memory_region_init(MemoryRegion *mr,
>>>> * memory_region_ref: Add 1 to a memory region's reference count
>>>> *
>>>> * Whenever memory regions are accessed outside the BQL, they need to be
>>>> - * preserved against hot-unplug. MemoryRegions actually do not have their
>>>> - * own reference count; they piggyback on a QOM object, their "owner".
>>>> - * This function adds a reference to the owner.
>>>> + * preserved against hot-unplug. This function adds a reference to the
>>>> + * memory region.
>>>> *
>>>> - * All MemoryRegions must have an owner if they can disappear, even if the
>>>> - * device they belong to operates exclusively under the BQL. This is because
>>>> - * the region could be returned at any time by memory_region_find, and this
>>>> - * is usually under guest control.
>>>> + * We do not ref/unref memory regions without an owner because it slows
>>>> + * down DMA sensibly. All MemoryRegions must have an owner if they can
>>>> + * disappear, even if the device they belong to operates exclusively
>>>> + * under the BQL. This is because the region could be returned at any
>>>> + * time by memory_region_find, and this is usually under guest control.
>>>> *
>>>> * @mr: the #MemoryRegion
>>>> */
>>>> @@ -1337,9 +1337,8 @@ void memory_region_ref(MemoryRegion *mr);
>>>> * memory_region_unref: Remove 1 to a memory region's reference count
>>>> *
>>>> * Whenever memory regions are accessed outside the BQL, they need to be
>>>> - * preserved against hot-unplug. MemoryRegions actually do not have their
>>>> - * own reference count; they piggyback on a QOM object, their "owner".
>>>> - * This function removes a reference to the owner and possibly destroys it.
>>>> + * preserved against hot-unplug. This function removes a reference to
>>>> + * the memory and possibly destroys it.
>>>> *
>>>> * @mr: the #MemoryRegion
>>>> */
>>>> @@ -1352,7 +1351,7 @@ void memory_region_unref(MemoryRegion *mr);
>>>> * if @size is nonzero, subregions will be clipped to @size.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @ops: a structure containing read and write callbacks to be used when
>>>> * I/O is performed on the region.
>>>> * @opaque: passed to the read and write callbacks of the @ops structure.
>>>> @@ -1372,7 +1371,7 @@ void memory_region_init_io(MemoryRegion *mr,
>>>> * directly.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1395,7 +1394,7 @@ bool memory_region_init_ram_nomigrate(MemoryRegion *mr,
>>>> * modify memory directly.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1425,7 +1424,7 @@ bool memory_region_init_ram_flags_nomigrate(MemoryRegion *mr,
>>>> * canceled.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: used size of the region.
>>>> @@ -1454,7 +1453,7 @@ bool memory_region_init_resizeable_ram(MemoryRegion *mr,
>>>> * mmap-ed backend.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1487,7 +1486,7 @@ bool memory_region_init_ram_from_file(MemoryRegion *mr,
>>>> * mmap-ed backend.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: the name of the region.
>>>> * @size: size of the region.
>>>> * @ram_flags: RamBlock flags. Supported flags: RAM_SHARED, RAM_PMEM,
>>>> @@ -1518,7 +1517,7 @@ bool memory_region_init_ram_from_fd(MemoryRegion *mr,
>>>> * region will modify memory directly.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1546,7 +1545,7 @@ void memory_region_init_ram_ptr(MemoryRegion *mr,
>>>> * skip_dump flag.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: the name of the region.
>>>> * @size: size of the region.
>>>> * @ptr: memory to be mapped; must contain at least @size bytes.
>>>> @@ -1566,7 +1565,7 @@ void memory_region_init_ram_device_ptr(MemoryRegion *mr,
>>>> * part of another memory region.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: used for debugging; not visible to the user or ABI
>>>> * @orig: the region to be referenced; @mr will be equivalent to
>>>> * @orig between @offset and @offset + @size - 1.
>>>> @@ -1592,7 +1591,7 @@ void memory_region_init_alias(MemoryRegion *mr,
>>>> * of the caller.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1615,7 +1614,7 @@ bool memory_region_init_rom_nomigrate(MemoryRegion *mr,
>>>> * of the caller.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @ops: callbacks for write access handling (must not be NULL).
>>>> * @opaque: passed to the read and write callbacks of the @ops structure.
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> @@ -1651,7 +1650,7 @@ bool memory_region_init_rom_device_nomigrate(MemoryRegion *mr,
>>>> * @_iommu_mr: the #IOMMUMemoryRegion to be initialized
>>>> * @instance_size: the IOMMUMemoryRegion subclass instance size
>>>> * @mrtypename: the type name of the #IOMMUMemoryRegion
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: used for debugging; not visible to the user or ABI
>>>> * @size: size of the region.
>>>> */
>>>> @@ -1667,7 +1666,7 @@ void memory_region_init_iommu(void *_iommu_mr,
>>>> * region will modify memory directly.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized
>>>> - * @owner: the object that tracks the region's reference count (must be
>>>> + * @owner: the object that provides the region's storage (must be
>>>> * TYPE_DEVICE or a subclass of TYPE_DEVICE, or NULL)
>>>> * @name: name of the memory region
>>>> * @size: size of the region in bytes
>>>> @@ -1713,7 +1712,7 @@ bool memory_region_init_ram_guest_memfd(MemoryRegion *mr,
>>>> * If you pass a non-NULL non-device @owner then we will assert.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> * must be unique within any device
>>>> * @size: size of the region.
>>>> @@ -1744,7 +1743,7 @@ bool memory_region_init_rom(MemoryRegion *mr,
>>>> * If you pass a non-NULL non-device @owner then we will assert.
>>>> *
>>>> * @mr: the #MemoryRegion to be initialized.
>>>> - * @owner: the object that tracks the region's reference count
>>>> + * @owner: the object that provides the region's storage
>>>> * @ops: callbacks for write access handling (must not be NULL).
>>>> * @opaque: passed to the read and write callbacks of the @ops structure.
>>>> * @name: Region name, becomes part of RAMBlock name used in migration stream
>>>> diff --git a/hw/core/qdev.c b/hw/core/qdev.c
>>>> index 8fdf6774f87e..b83c46615fba 100644
>>>> --- a/hw/core/qdev.c
>>>> +++ b/hw/core/qdev.c
>>>> @@ -718,8 +718,18 @@ static void device_class_base_init(ObjectClass *class, const void *data)
>>>> klass->props_count_ = 0;
>>>> }
>>>> +static int collect_memory_region(Object *child, void *opaque)
>>>> +{
>>>> + if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
>>>> + g_array_append_val(opaque, child);
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> static void device_unparent(Object *obj)
>>>> {
>>>> + g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
>>>> DeviceState *dev = DEVICE(obj);
>>>> BusState *bus;
>>>> @@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
>>>> object_unref(OBJECT(dev->parent_bus));
>>>> dev->parent_bus = NULL;
>>>> }
>>>> +
>>>> + object_child_foreach(OBJECT(dev), collect_memory_region, array);
>>>> +
>>>> + for (gsize i = 0; i < array->len; i++) {
>>>> + object_unparent(g_array_index(array, Object *, i));
>>>> + }
>>>
>>> What if the owner is not a device?
>>>
>>> If we have a more self contained solution [1], which do not care about the
>>> type of the owner object, why bother?
>>>
>>> What is the benefit of your proposal here, comparing to [1]?
>>
>> [1] only handles (1) but this covers the entire class of memory leaks,
>> including (2) and (3).
>
> Can you provide some example memory leaks that can reproduce on current
> master branch with (2) and (3)? How severe are they?
>
> I plan to merge the one-patch fix first on circular ref. This issue has
> been dangling for too long, and I highly doubt anyone would even start to
> like patch 1 of your this series on "finalizing" stage.
>
> I don't see why we need to be blocked forever on this, keeping PeterM and
> others poking the known issues. It'll be great if you can work whatever on
> top of that, and justifying whatever you propose is better is also easier.
>
> Is that OK for you?
>
>>
>>>
>>>> }
>>>> static char *
>>>> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
>>>> index f5cba8a7fa66..7afcaa4bfe2e 100644
>>>> --- a/hw/display/virtio-gpu-virgl.c
>>>> +++ b/hw/display/virtio-gpu-virgl.c
>>>> @@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
>>>> /* memory region owns self res->mr object and frees it by itself */
>>>> memory_region_set_enabled(mr, false);
>>>> memory_region_del_subregion(&b->hostmem, mr);
>>>> + object_unparent(OBJECT(mr));
>>>
>>> It's definitely not obvious why a memory core change will involve a GPU
>>> change too.
>>>
>>>> object_unparent(OBJECT(vmr));
>>>> }
>>>> diff --git a/system/memory.c b/system/memory.c
>>>> index 56465479406f..edaf039b0647 100644
>>>> --- a/system/memory.c
>>>> +++ b/system/memory.c
>>>> @@ -1212,11 +1212,19 @@ static char *memory_region_escape_name(const char *name)
>>>> return escaped;
>>>> }
>>>> +static void memory_region_free(void *obj)
>>>> +{
>>>> + MemoryRegion *mr = obj;
>>>> +
>>>> + object_unref(mr->owner);
>>>> +}
>>>> +
>>>> static void memory_region_do_init(MemoryRegion *mr,
>>>> Object *owner,
>>>> const char *name,
>>>> uint64_t size)
>>>> {
>>>> + OBJECT(mr)->free = memory_region_free;
>>>
>>> I think this is wrong. This should break object_new(TYPE_MEMORY_REGION)
>>> users.
>>
>> This function will never be called with an object created with
>> object_new(TYPE_MEMORY_REGION). It is called by memory_region_init() and
>> memory_region_init_iommu() and they call object_initialize(). If an object
>> created with object_new(TYPE_MEMORY_REGION) is passed to them, it will
>> result in double object initialization, which doesn't make sense.
>
> I actually didn't notice we don't have much users. I think it'll block new
> users, then, by making free() not usable anymore. Not to mention it's also
> definitely an abuse to reuse a free() for other things.
>
> It can be relevant to the other email discussion too:
>
> https://lore.kernel.org/all/aMHidDl1tdx-2G4e@x1.local/
>
> I feel like object_new(TYPE_MEMORY_REGION) can be a good thing to simplify
> allocated MRs whose lifecycle still is the same as the device for the long
> term.
>
> Some memory APIs will need touch up, but it shouldn't be much, basically we
> want to skip object_initialize() for all memory_region_init_*() APIs on top
> of object_new(TYPE_MEMORY_REGION) MRs.
>
> I'm also not eager to change anything yet, because these MRs are minority,
> and I don't immediately have a problem to solve myself here. So I would
> like to know what exactly problem you're hitting with (2)/(3) above.
(I'm ccing Manos Pitsidianakis as I mention Rust below.)
I couldn't reply this immediately as the topic is complicated and I
wanted to double-check the situation, but I'm confident enough to do so now.
------
First, I describe (2) and (3).
(2) is almost equivalent with (1). For example, xtensa-cpu leaks because
of the following code:
memory_region_init_io(env->system_er, obj, NULL, env, "er",
UINT64_C(0x100000000));
address_space_init(env->address_space_er, env->system_er, "ER");
The difference from (1) is that we have an AddresSpace instead of a
container MemoryRegion. You can reproduce the issue with the following
command:
meson test -C build qtest-xtensa/device-introspect-test
The code lacks address_space_destroy() so it is destined to leak anyway,
but even if you add address_space_destroy() to instance_finalize(), it
still leaks due to the exactly same reason with (1). "[PATCH 00/35]
memory: QOM-ify AddressSpace" QOM-ifies AddressSpaces just like
(container) MemoryRegions, which make the situation even more closer.
(3) is more complicated. The problem will occur when:
- flatview_translate() is called.
- The acquired memory region is owned by the parent.
I do not have a reproduction case for this, but it is hard to ensure
that the situation never occurs because there are so many places where
flatview_translate() is called; note that the function is called from
several abstracting functions such as address_space_translate(),
dma_memory_map(), pci_dma_map(), etc.
Your patch and my old one tried to solve (1) by checking if the owner is
trying to create a reference to its MemoryRegions. However, this
approach is not practical to cover all the memory leak hazards,
especially (3). In theory, we can change memory_region_ref() to take the
referrer as the argument and let it compare against the owner, but there
are too many code paths where memory_region_ref() is called.
To avoid the complication, this patch avoids circular references by
extending the code to unparent. Unparenting occurs far less frequently
than memory_region_ref(); this patch eventually reduces the number of
affected places to just 2; the qdev common code and virtio-gpu's
hostmem/blob code. Changing the two is much easier than auditing all the
DMA code that triggers flatview_translate().
------
Next, I explain that patch 1 "qom: Do not finalize twice" is useful,
independently with this patch.
The memory management code in qom/object.c has a number of assertions to
prevent misuses. They ensures that implemented operations have
well-defined semantics; memory leaks can happen with circular references
and misuses trigger abortion, but they do not result in an undefined
behavior at least.
However, there is one open hole in the code, which "qom: Do not finalize
twice" closes: calling object_ref() and object_unref() during
finalization results double-free. Defeinsively closing the hole ensures
that an undefined behavior will not result in a hard-to-debug issue or a
security vulnerability.
The ultimate way to prevent undefined behaviors is to use Rust; it
relies on programmers to avoid undefined behaviors only in a relatively
small number of unsafe wrappers, and when doing so, "safety" comments
are placed. However, a corner case that results in undefined behaviors
makes it hard to write unsafe wrappers.
To own an QOM object in safe Rust code, the Rust wrapper provides an
unsafe function: Owned::from(), which wraps an object into
Owned<T: ObjectType>. The caller must ensure that the object is not an
embedded object like a memory region, which is described in the "safety"
comment.
Once an object is wrapped into Owned<T: ObjectType>, you can freely
create and drop references using safe object_ref() and object_unref().
But the open hole in QOM breaks the safety assumption if a child decides
to take a reference to its parent during unparenting just like this
patch does. Avoiding this requires to ensure no new Owned<T: ObjectType>
is created by any child objects during unparenting and would require to
add "safety" comments anywhere Owned<T: ObjectType> is created.
Instead, we can simply change qom/object.c not to attempt double-free at
all, ensuring the safety in one place, which is what patch 1 does.
------
Lastly, I explain this patch will not prevent to add functions that use
object_new(TYPE_MEMORY_REGION).
The functions that call object_new(TYPE_MEMORY_REGION) can simply skip
setting OBJECT(mr)->free and ensure object_unref(mr->owner) in the
instance_finalize().
Regards,
Akihiko Odaki
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-06 2:39 ` [PATCH v2 3/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2025-09-10 21:45 ` Peter Xu
@ 2025-09-23 8:41 ` Paolo Bonzini
1 sibling, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2025-09-23 8:41 UTC (permalink / raw)
To: Akihiko Odaki, qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Daniel P. Berrangé,
Eduardo Habkost, Peter Xu, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko
On 9/6/25 04:39, Akihiko Odaki wrote:
> +static int collect_memory_region(Object *child, void *opaque)
> +{
> + if (object_dynamic_cast(child, TYPE_MEMORY_REGION)) {
> + g_array_append_val(opaque, child);
> + }
> +
> + return 0;
> +}
> +
> static void device_unparent(Object *obj)
> {
> + g_autoptr(GArray) array = g_array_new(FALSE, FALSE, sizeof(Object *));
We can use GPtrArray too, instead.
> DeviceState *dev = DEVICE(obj);
> BusState *bus;
>
> @@ -735,6 +745,12 @@ static void device_unparent(Object *obj)
> object_unref(OBJECT(dev->parent_bus));
> dev->parent_bus = NULL;
> }
> +
> + object_child_foreach(OBJECT(dev), collect_memory_region, array);
> +
> + for (gsize i = 0; i < array->len; i++) {
> + object_unparent(g_array_index(array, Object *, i));
> + }
> }
>
> static char *
Very good idea! The purpose of unparent is basically to break the
circular references, and it's a bit weird to use object_child_foreach
but it makes a lot of sense.
> diff --git a/hw/display/virtio-gpu-virgl.c b/hw/display/virtio-gpu-virgl.c
> index f5cba8a7fa66..7afcaa4bfe2e 100644
> --- a/hw/display/virtio-gpu-virgl.c
> +++ b/hw/display/virtio-gpu-virgl.c
> @@ -203,6 +203,7 @@ virtio_gpu_virgl_unmap_resource_blob(VirtIOGPU *g,
> /* memory region owns self res->mr object and frees it by itself */
> memory_region_set_enabled(mr, false);
> memory_region_del_subregion(&b->hostmem, mr);
> + object_unparent(OBJECT(mr));
> object_unparent(OBJECT(vmr));
> }g
>
Can you do this in vmr's unparent callback? Even the
memory_region_set_enabled and del_subregion calls could belong there.
> +static void memory_region_unparent(Object *obj)
> +{
> + MemoryRegion *mr = MEMORY_REGION(obj);
> +
> + object_ref(mr->owner);
> +}
> +
This probably should assert that OBJECT(mr)->free == memory_region_free.
This provides safety, and also clarifies where the matching unref is.
Paolo
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 1/3] qom: Do not finalize twice
2025-09-06 2:39 ` [PATCH v2 1/3] qom: Do not finalize twice Akihiko Odaki
@ 2025-09-23 9:27 ` Paolo Bonzini
0 siblings, 0 replies; 21+ messages in thread
From: Paolo Bonzini @ 2025-09-23 9:27 UTC (permalink / raw)
To: Akihiko Odaki, qemu-devel
Cc: Alex Williamson, Cédric Le Goater, Daniel P. Berrangé,
Eduardo Habkost, Peter Xu, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko
On 9/6/25 04:39, Akihiko Odaki wrote:
> The next change adds code to retain references from an object to the
> parent when it is being unparented to ensure that the parent outlive
> them. This change handles the following scenario with the code:
>
> 1. The parent starts being finalized without unparenting.
> 2. Unparenting happens during finalization.
> 3. The child retains the reference to the parent.
> 4. The child gets finalized, and releases the reference.
>
> In this scenario, the reference counter of the parent reaches to zero,
> gets incremented, and gets decremented to reach to zero again. This
> change ensures that finalization will be triggered again in the
> scenario.
>
> Note that the reference counter needs to reach to zero again before
> finalization ends; otherwise the object will be "resurrected", which
> is not clearly defined and prohibited with an existing assertion.
>
> One thing that looks concerning with this change is that it adds a bool
> to Object. This is not a problem in the most situations where the host
> uses 64-bit addressing because the member is added to a gap needed for
> alignment, and possible double-free scenarios handled with this change
> are more serious than the extra memory usage for 32-bit hosts.
If this is a problem, we could reserve a special value of ->ref for that
(such as bit 31) but I think this is okay as long as there is this
32-bit hole.
Paolo
> Signed-off-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp>
> ---
> include/qom/object.h | 1 +
> qom/object.c | 5 +++++
> 2 files changed, 6 insertions(+)
>
> diff --git a/include/qom/object.h b/include/qom/object.h
> index 26df6137b911..7f7b1ffea8fe 100644
> --- a/include/qom/object.h
> +++ b/include/qom/object.h
> @@ -158,6 +158,7 @@ struct Object
> ObjectFree *free;
> GHashTable *properties;
> uint32_t ref;
> + bool finalizing;
> Object *parent;
> };
>
> diff --git a/qom/object.c b/qom/object.c
> index 1856bb36c74c..b766b2e9baa7 100644
> --- a/qom/object.c
> +++ b/qom/object.c
> @@ -725,6 +725,11 @@ static void object_finalize(void *data)
> Object *obj = data;
> TypeImpl *ti = obj->class->type;
>
> + if (obj->finalizing) {
> + return;
> + }
> +
> + obj->finalizing = true;
> object_property_del_all(obj);
> object_deinit(obj, ti);
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-18 12:04 ` Akihiko Odaki
@ 2025-09-24 21:14 ` Peter Xu
2025-09-25 9:03 ` Peter Maydell
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2025-09-24 21:14 UTC (permalink / raw)
To: Akihiko Odaki
Cc: qemu-devel, Alex Williamson, Cédric Le Goater, Paolo Bonzini,
Daniel P. Berrangé, Eduardo Habkost, David Hildenbrand,
Philippe Mathieu-Daudé, Richard Henderson, Helge Deller,
Marc-André Lureau, Michael S. Tsirkin, Gerd Hoffmann,
John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Peter Maydell,
Dmitry Osipenko, Manos Pitsidianakis
On Thu, Sep 18, 2025 at 09:04:19PM +0900, Akihiko Odaki wrote:
> (I'm ccing Manos Pitsidianakis as I mention Rust below.)
>
> I couldn't reply this immediately as the topic is complicated and I wanted
> to double-check the situation, but I'm confident enough to do so now.
I apologize for my slow response, and thanks a lot for the long writeup.
Actually, even until now I still didn't get enough time to finish my own
investigation regarding to at least the hotplug issues I mentioned below,
but let me reply first on my current thoughts.
>
> ------
>
> First, I describe (2) and (3).
>
> (2) is almost equivalent with (1). For example, xtensa-cpu leaks because of
> the following code:
>
> memory_region_init_io(env->system_er, obj, NULL, env, "er",
> UINT64_C(0x100000000));
> address_space_init(env->address_space_er, env->system_er, "ER");
>
> The difference from (1) is that we have an AddresSpace instead of a
> container MemoryRegion. You can reproduce the issue with the following
> command:
> meson test -C build qtest-xtensa/device-introspect-test
I didn't see any warning/error running it, with --enable-asan. I
definitely missed something, please let me know if you have idea.
My gut feeling is CPU hotplugs logic should make sure proper eliminations
of all these objects, that should include the address space destructions.
To me, the address space may or may not even have an owner. It's unlike
MR, that either will not be freed if owner==NULL, or must have an owner.
Hence address spaces are at least not as special as MRs, who constantly do
refcounts against owner before your this patch.
AFAIU, when arch would care about any form of memory leaks in the CPU
object / address spaces, the arch should follow the trend of the major
archs that support hotplugs. I had a vague memory that at least i386 should
support CPU hotplugs properly.
From that, I saw we do have cpu_address_space_init(), that i386 and arm
use. There, AddressSpace is not embeded in the CPU object, instead we have
these relevant fields in CPUState:
struct CPUAddressSpace *cpu_ases;
int cpu_ases_count;
AddressSpace *as;
That looks like a more advanced way of allowing >1 address spaces for a
CPU, however one arch can still opt-in with 1 address space, setting
num_ases=1, and pass in asidx=0. After all, all these are fields of
CPUState that all CPU objects share, including xtensa's.
Should xtensa (when starts to care about CPU hotplugs and relevant memleaks
like what we're discussing now) move to the generic way of managing address
spaces like what x86 and arm do (rather than relying on its)? Then maybe we
should always be able to properly destroy the address spaces?
How important is such leak in xtensa CPU's case? E.g. does it support
CPU hotplugs?
Side note: when I was trying to test hotplugs with i386/q35, unfortunately
I didn't really see when the address space was destroyed, maybe there's a
bug somewhere; I put that info into appendix at the end.
Anyway, my confusion is if we should manage address spaces in the same way
across important archs, and if that would after all always involve proper
address space destructions.
OTOH, I also see things like tcg handling in cpu_address_space_init(). I'm
not sure if it means tcg_log_global_after_sync() / tcg_commit() might be
useful for xtensa tcg cpus too. So there might be other reasons too that
xtensa cpus should move over to a generic CPU plug/unplug API, which will
cover the address spaces management?
Take my above as pure asks, not review comments. After all, I don't know
the problem well, once more.
>
> The code lacks address_space_destroy() so it is destined to leak anyway, but
> even if you add address_space_destroy() to instance_finalize(), it still
> leaks due to the exactly same reason with (1). "[PATCH 00/35] memory:
> QOM-ify AddressSpace" QOM-ifies AddressSpaces just like (container)
> MemoryRegions, which make the situation even more closer.
>
> (3) is more complicated. The problem will occur when:
> - flatview_translate() is called.
> - The acquired memory region is owned by the parent.
It doesn't look like exactly the problem, or I definitely could have missed
something. After all, flatview_translate() is protected by rcu, afaiu, not
refcounts, right? Maybe you meant something like address_space_map()?
That'll definitely need to hold the MR longer after rcu_read_unlock(),
however then could you explain if it happens, why the paired unmap() won't
happen to properly release it?
>
> I do not have a reproduction case for this, but it is hard to ensure that
> the situation never occurs because there are so many places where
> flatview_translate() is called; note that the function is called from
> several abstracting functions such as address_space_translate(),
> dma_memory_map(), pci_dma_map(), etc.
>
> Your patch and my old one tried to solve (1) by checking if the owner is
> trying to create a reference to its MemoryRegions. However, this approach is
> not practical to cover all the memory leak hazards, especially (3). In
> theory, we can change memory_region_ref() to take the referrer as the
> argument and let it compare against the owner, but there are too many code
> paths where memory_region_ref() is called.
>
> To avoid the complication, this patch avoids circular references by
> extending the code to unparent. Unparenting occurs far less frequently than
> memory_region_ref(); this patch eventually reduces the number of affected
> places to just 2; the qdev common code and virtio-gpu's hostmem/blob code.
> Changing the two is much easier than auditing all the DMA code that triggers
> flatview_translate().
>
> ------
>
> Next, I explain that patch 1 "qom: Do not finalize twice" is useful,
> independently with this patch.
>
> The memory management code in qom/object.c has a number of assertions to
> prevent misuses. They ensures that implemented operations have well-defined
> semantics; memory leaks can happen with circular references and misuses
> trigger abortion, but they do not result in an undefined behavior at least.
>
> However, there is one open hole in the code, which "qom: Do not finalize
> twice" closes: calling object_ref() and object_unref() during finalization
> results double-free. Defeinsively closing the hole ensures that an undefined
> behavior will not result in a hard-to-debug issue or a security
> vulnerability.
>
> The ultimate way to prevent undefined behaviors is to use Rust; it relies on
> programmers to avoid undefined behaviors only in a relatively small number
> of unsafe wrappers, and when doing so, "safety" comments are placed.
> However, a corner case that results in undefined behaviors makes it hard to
> write unsafe wrappers.
>
> To own an QOM object in safe Rust code, the Rust wrapper provides an unsafe
> function: Owned::from(), which wraps an object into
> Owned<T: ObjectType>. The caller must ensure that the object is not an
> embedded object like a memory region, which is described in the "safety"
> comment.
>
> Once an object is wrapped into Owned<T: ObjectType>, you can freely create
> and drop references using safe object_ref() and object_unref().
>
> But the open hole in QOM breaks the safety assumption if a child decides to
> take a reference to its parent during unparenting just like this patch does.
> Avoiding this requires to ensure no new Owned<T: ObjectType> is created by
> any child objects during unparenting and would require to add "safety"
> comments anywhere Owned<T: ObjectType> is created.
>
> Instead, we can simply change qom/object.c not to attempt double-free at
> all, ensuring the safety in one place, which is what patch 1 does.
Please bare with me on my ignorance on Rust. Do you mean it might be
important to Rust, and this could be proposed separately copying whoever is
active on Rust?
It sounds like a better way to justify the first patch landing before the
rest, if it is acceptable whoever is fluent with QEMU/Rust.
My stupid question is, why a ref() to self could happen in a finalize(),
and why it's legal at all, causing double finalize? It sounds against the
whole idea refcounting. But as I know zero on Rust, feel free to ignore my
question if you think that's a well known problem by all others. Then I
also suggest you send patch 1 separately.
>
> ------
>
> Lastly, I explain this patch will not prevent to add functions that use
> object_new(TYPE_MEMORY_REGION).
>
> The functions that call object_new(TYPE_MEMORY_REGION) can simply skip
> setting OBJECT(mr)->free and ensure object_unref(mr->owner) in the
> instance_finalize().
We could, but then MR object will behave totally differently when it's
allocated by object_new(), or when it's not. Meanwhile, if we want to in
the future use more object_new(TYPE_MEMORY_REGION) we may also want to
reuse memory_region_do_init() on top of them, then that'll be a problem and
even if it's not a blocker it's hard to track all these design trivial
differences.
I confess I can only get your gut of your new design, however I personally
don't prefer reusing a free() function that does anything without memory
releases. It is really error-prone to me for the long term. That's also
the root cause of why it cannot be used mixture with object_new() because
that was the much-easy-to-understand way to properly set a object->free()..
Another thing is, after your new design, MRs (even fully embeded in a
Device object) will start to have real refcountings, concurrently from the
Device object.
And then, if we think about an Device object with N MR objects embeded, the
struct has N+1 refcounts... Totally confusing to me. I think that's where
I mentioned I liked the piggypack solution, and I wished the MR refcount is
simply a link or bool rather than called to be a refcount.. basically I
wished we can ignore the MR refcount which is easier, taking MR part of the
owner always. But I agree this opinion can be subjective.
Another thing I'm not 100% certain is, I believe your new design relies on
device_unparent() to happen first, during which auto-detached MRs start to
gain refcounts of the owner, and that will be released in MR's free().
Could it happen device_unparent() are not invoked on the owner? E.g., can
something like this happen, that one creates a device, maintain it
internally, map anything, then when not needed unref() the device?
dev = object_new(SOME_DEVICE_HAS_MRS);
... add dev's MRs as subregions ...
... after finish using the device ...
... remove dev's MRs from container MRs ...
object_unref(dev); <--- which finalize() and free() the device object
I really wish we don't go with this and find something clean, after
understanding better on the problems. Reusing object->free() to unref()
another object but without freeing the memory at all, just sounds too
tricky for a long-term maintained project.
Thanks,
============= APPENDIX =============
The CPU hotplug test I did on i386, after some sequences of (guest OS runs
very new version of Linux, with some random mm patches on top of v6.17-rc4):
- HMP: hotplug a cpu
- guest OS: online the cpu
- guest OS: offline the cpu
- HMP: hotunplug the cpu
- HMP: hotplug a cpu
- HMP: hotunplug the cpu
- ...
I was able to hit a KVM failure after above sequence (I won't hit it if I
keep online/offline steps in guest OS for each continuous plug/unplug):
2025-09-24T15:33:38.398617Z qemu-system-x86_64: kvm_init_vcpu: kvm_arch_init_vcpu failed (8): Invalid argument
There, it's KVM_SET_CPUID2 that failed.
When ASAN enabled, I'll see a splash:
Direct leak of 1248 byte(s) in 3 object(s) allocated from:
#0 0x7f966e0e68a3 in calloc (/lib64/libasan.so.8+0xe68a3) (BuildId: 10b8ccd49f75c21babf1d7abe51bb63589d8471f)
#1 0x7f966d0b6f81 in g_malloc0 (/lib64/libglib-2.0.so.0+0x47f81) (BuildId: 3adead5a77684e5d72b61c7e0db58031a35baf58)
#2 0x55da4a8c31d4 in cpu_address_space_init ../system/physmem.c:797
#3 0x55da4acbc340 in kvm_cpu_realizefn ../target/i386/kvm/kvm-cpu.c:102
#4 0x55da4a0cc385 in accel_cpu_common_realize ../accel/accel-common.c:101
#5 0x55da4a082f78 in cpu_exec_realizefn ../hw/core/cpu-common.c:232
#6 0x55da4adbc5a8 in x86_cpu_realizefn ../target/i386/cpu.c:9321
#7 0x55da4ada6e6e in max_x86_cpu_realize ../target/i386/cpu.c:6715
#8 0x55da4afeb556 in device_set_realized ../hw/core/qdev.c:494
#9 0x55da4afff2d2 in property_set_bool ../qom/object.c:2375
#10 0x55da4affa5e5 in object_property_set ../qom/object.c:1450
#11 0x55da4b0048e0 in object_property_set_qobject ../qom/qom-qobject.c:28
#12 0x55da4affab6d in object_property_set_bool ../qom/object.c:1520
#13 0x55da4afea405 in qdev_realize ../hw/core/qdev.c:276
#14 0x55da4a8e2d5d in qdev_device_add_from_qdict ../system/qdev-monitor.c:714
#15 0x55da4a8e2e69 in qdev_device_add ../system/qdev-monitor.c:732
#16 0x55da4a8e43c0 in hmp_device_add ../system/qdev-monitor.c:993
#17 0x55da4a9e4fee in handle_hmp_command_exec ../monitor/hmp.c:1106
#18 0x55da4a9e548a in handle_hmp_command ../monitor/hmp.c:1158
#19 0x55da4a9ddea0 in monitor_command_cb ../monitor/hmp.c:47
#20 0x55da4b5236fd in readline_handle_byte ../util/readline.c:427
#21 0x55da4a9e7060 in monitor_read ../monitor/hmp.c:1390
#22 0x55da4b2f86f8 in qemu_chr_be_write_impl ../chardev/char.c:214
#23 0x55da4b2f87a3 in qemu_chr_be_write ../chardev/char.c:226
#24 0x55da4b2ec890 in tcp_chr_read ../chardev/char-socket.c:511
#25 0x55da4b023d65 in qio_channel_fd_source_dispatch ../io/channel-watch.c:84
#26 0x7f966d0af862 in g_main_context_dispatch_unlocked.lto_priv.0 (/lib64/libglib-2.0.so.0+0x40862) (BuildId: 3adead5a77684e5d72b61c)
#27 0x7f966d0afae4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x40ae4) (BuildId: 3adead5a77684e5d72b61c7e0db58031a35baf58)
#28 0x55da4b4f4c35 in glib_pollfds_poll ../util/main-loop.c:290
#29 0x55da4b4f4dae in os_host_main_loop_wait ../util/main-loop.c:313
So indeed I at least didn't see a AccelState.cpu_common_unrealize()
provided (which I was trying to look for callers of the paired API called
cpu_address_space_destroy() but surprisingly I found it's not even used
anywhere..), when it is hot unplugged:
#4 0x0000560336361a8a in x86_cpu_unrealizefn
#5 0x000056033658fd37 in device_set_realized
#6 0x00005603365a32d3 in property_set_bool
#7 0x000056033659e5e6 in object_property_set
#8 0x00005603365a88e1 in object_property_set_qobject
#9 0x000056033659eb6e in object_property_set_bool
#10 0x000056033658e48a in qdev_unrealize
#11 0x00005603362b5ee6 in x86_cpu_unplug_cb
#12 0x00005603362facda in pc_machine_device_unplug_cb
#13 0x0000560335822c28 in hotplug_handler_unplug
#14 0x0000560335771af7 in cpu_hotplug_wr
#15 0x0000560335e45c46 in memory_region_write_accessor
#16 0x0000560335e462da in access_with_adjusted_size
#17 0x0000560335e4e74d in memory_region_dispatch_write
Cmdline I used for testing:
/usr/local/bin/qemu-system-x86_64 -M q35,kernel-irqchip=split -accel kvm -gdb tcp::1234 -m 6G,slots=4,maxmem=16G -name peter-vm,debug-threads=on -msg timestamp=on -nographic -cpu host -smp 8,maxcpus=32 -global migration.x-max-bandwidth=1M -global migration.x-events=on -global migration.x-postcopy-ram=on -global migration.x-postcopy-preempt=on -monitor telnet:localhost:6666,server,nowait -qmp unix:/tmp/peter.qmp1,server,nowait -device ioh3420,id=pcie.1,chassis=1 -netdev user,id=net0,hostfwd=tcp::5555-:22 -device virtio-net-pci,netdev=net0,bus=pcie.1 -device ioh3420,id=pcie.2,chassis=2 -drive file=/home/peterx/images/default.qcow2,id=drive0,if=none,aio=io_uring -device virtio-blk-pci,drive=drive0,bus=pcie.2 -device ioh3420,id=pcie.3,chassis=3 -device virtio-balloon,bus=pcie.3 -device ioh3420,id=pcie.4,chassis=4 -kernel /home/peterx/git/linux/arch/x86/boot/bzImage -append root=/dev/vda3 console=ttyS0 nokaslr hugepages=128
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-24 21:14 ` Peter Xu
@ 2025-09-25 9:03 ` Peter Maydell
2025-09-25 20:05 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Peter Maydell @ 2025-09-25 9:03 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> I didn't really see when the address space was destroyed, maybe there's a
> bug somewhere; I put that info into appendix at the end.
This is https://gitlab.com/qemu-project/qemu/-/issues/2517
I got blocked on that because I ran into a weird "I have some
memory that needs to be freed by the RCU callback, but only
after the callback has freed some other RCU stuff". I see
Paolo made a reply on that bug -- I would need to get back
to it and reproduce whatever it was I was doing.
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-25 9:03 ` Peter Maydell
@ 2025-09-25 20:05 ` Peter Xu
2025-09-26 9:09 ` Peter Maydell
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2025-09-25 20:05 UTC (permalink / raw)
To: Peter Maydell
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > I didn't really see when the address space was destroyed, maybe there's a
> > bug somewhere; I put that info into appendix at the end.
>
> This is https://gitlab.com/qemu-project/qemu/-/issues/2517
>
> I got blocked on that because I ran into a weird "I have some
> memory that needs to be freed by the RCU callback, but only
> after the callback has freed some other RCU stuff". I see
> Paolo made a reply on that bug -- I would need to get back
> to it and reproduce whatever it was I was doing.
Thanks for the link, right that looks exactly like what I hit.
I am curious if FIFO is guaranteed for RCU in general, or it is an impl
detail only specific to QEMU.
The other thing is I feel like it should be OK to reorder callbacks, if all
the call_rcu() users can make sure the rcu-freed object is completely
detached from the rest world, e.g. resetting all relevant pointers to NULL.
With that, it seems the order won't matter too, because nobody will be able
to reference the internal object anyway, so the two objects (after reseting
all referers to NULL pointer of the inner object) are completely standalone.
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-25 20:05 ` Peter Xu
@ 2025-09-26 9:09 ` Peter Maydell
2025-09-26 15:16 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Peter Maydell @ 2025-09-26 9:09 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Thu, 25 Sept 2025 at 21:06, Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > > I didn't really see when the address space was destroyed, maybe there's a
> > > bug somewhere; I put that info into appendix at the end.
> >
> > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> >
> > I got blocked on that because I ran into a weird "I have some
> > memory that needs to be freed by the RCU callback, but only
> > after the callback has freed some other RCU stuff". I see
> > Paolo made a reply on that bug -- I would need to get back
> > to it and reproduce whatever it was I was doing.
>
> Thanks for the link, right that looks exactly like what I hit.
>
> I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> detail only specific to QEMU.
>
> The other thing is I feel like it should be OK to reorder callbacks, if all
> the call_rcu() users can make sure the rcu-freed object is completely
> detached from the rest world, e.g. resetting all relevant pointers to NULL.
> With that, it seems the order won't matter too, because nobody will be able
> to reference the internal object anyway, so the two objects (after reseting
> all referers to NULL pointer of the inner object) are completely standalone.
The specific ordering problem for cpu_address_space is that
there's a g_new allocated array of memory which contains
the AddressSpace objects (not pointers to them). The ASes need
to be RCU-deallocated first so they can clean up their internal
data structures; only once that has happened can we free the
memory that holds the AddressSpace structs themselves.
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-26 9:09 ` Peter Maydell
@ 2025-09-26 15:16 ` Peter Xu
2025-09-26 15:59 ` Peter Maydell
2025-09-29 12:45 ` Peter Maydell
0 siblings, 2 replies; 21+ messages in thread
From: Peter Xu @ 2025-09-26 15:16 UTC (permalink / raw)
To: Peter Maydell
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> On Thu, 25 Sept 2025 at 21:06, Peter Xu <peterx@redhat.com> wrote:
> >
> > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > > > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > > > I didn't really see when the address space was destroyed, maybe there's a
> > > > bug somewhere; I put that info into appendix at the end.
> > >
> > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > >
> > > I got blocked on that because I ran into a weird "I have some
> > > memory that needs to be freed by the RCU callback, but only
> > > after the callback has freed some other RCU stuff". I see
> > > Paolo made a reply on that bug -- I would need to get back
> > > to it and reproduce whatever it was I was doing.
> >
> > Thanks for the link, right that looks exactly like what I hit.
> >
> > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > detail only specific to QEMU.
> >
> > The other thing is I feel like it should be OK to reorder callbacks, if all
> > the call_rcu() users can make sure the rcu-freed object is completely
> > detached from the rest world, e.g. resetting all relevant pointers to NULL.
> > With that, it seems the order won't matter too, because nobody will be able
> > to reference the internal object anyway, so the two objects (after reseting
> > all referers to NULL pointer of the inner object) are completely standalone.
>
> The specific ordering problem for cpu_address_space is that
> there's a g_new allocated array of memory which contains
> the AddressSpace objects (not pointers to them). The ASes need
> to be RCU-deallocated first so they can clean up their internal
> data structures; only once that has happened can we free the
> memory that holds the AddressSpace structs themselves.
If it's about cpu_address_space_destroy(), then IIUC it can also be done by
providing a destroy_free() function so that instead of trying to serialize
two rcu callbacks, we could easily serialize the operations in one
callback. One sample patch attached to avoid relying on order of rcu
enqueue.
Thanks,
===8<===
From 427ce0d2c7efe5771be859fa34c6f3ec18498a29 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Fri, 26 Sep 2025 10:55:26 -0400
Subject: [PATCH] memory: New AS helper to serialize destroy+free
Signed-off-by: Peter Xu <peterx@redhat.com>
---
include/system/memory.h | 10 ++++++++++
system/memory.c | 20 +++++++++++++++++++-
system/physmem.c | 3 +--
3 files changed, 30 insertions(+), 3 deletions(-)
diff --git a/include/system/memory.h b/include/system/memory.h
index aa85fc27a1..45f8c3d4aa 100644
--- a/include/system/memory.h
+++ b/include/system/memory.h
@@ -2735,6 +2735,16 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name);
*/
void address_space_destroy(AddressSpace *as);
+/**
+ * address_space_destroy_free: destroy an address space and free it
+ *
+ * Same to address_space_destroy(), only that it will also free the
+ * memory that AddressSpace pointer points to.
+ *
+ * @as: address space to be destroyed
+ */
+void address_space_destroy_free(AddressSpace *as);
+
/**
* address_space_remove_listeners: unregister all listeners of an address space
*
diff --git a/system/memory.c b/system/memory.c
index cf8cad6961..fe8b28a096 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -3278,7 +3278,14 @@ static void do_address_space_destroy(AddressSpace *as)
memory_region_unref(as->root);
}
-void address_space_destroy(AddressSpace *as)
+static void do_address_space_destroy_free(AddressSpace *as)
+{
+ do_address_space_destroy(as);
+ g_free(as);
+}
+
+/* Detach address space from global view, notify all listeners */
+static void address_space_detach(AddressSpace *as)
{
MemoryRegion *root = as->root;
@@ -3293,9 +3300,20 @@ void address_space_destroy(AddressSpace *as)
* values to expire before freeing the data.
*/
as->root = root;
+}
+
+void address_space_destroy(AddressSpace *as)
+{
+ address_space_detach(as);
call_rcu(as, do_address_space_destroy, rcu);
}
+void address_space_destroy_free(AddressSpace *as)
+{
+ address_space_detach(as);
+ call_rcu(as, do_address_space_destroy_free, rcu);
+}
+
static const char *memory_region_type(MemoryRegion *mr)
{
if (mr->alias) {
diff --git a/system/physmem.c b/system/physmem.c
index ae8ecd50ea..5afd6c67e6 100644
--- a/system/physmem.c
+++ b/system/physmem.c
@@ -821,8 +821,7 @@ void cpu_address_space_destroy(CPUState *cpu, int asidx)
memory_listener_unregister(&cpuas->tcg_as_listener);
}
- address_space_destroy(cpuas->as);
- g_free_rcu(cpuas->as, rcu);
+ g_clear_pointer(&cpuas->as, address_space_destroy_free);
if (asidx == 0) {
/* reset the convenience alias for address space 0 */
--
2.50.1
--
Peter Xu
^ permalink raw reply related [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-26 15:16 ` Peter Xu
@ 2025-09-26 15:59 ` Peter Maydell
2025-09-26 16:56 ` Peter Maydell
2025-09-29 12:45 ` Peter Maydell
1 sibling, 1 reply; 21+ messages in thread
From: Peter Maydell @ 2025-09-26 15:59 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Fri, 26 Sept 2025 at 16:16, Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> > On Thu, 25 Sept 2025 at 21:06, Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > > > > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > > > > I didn't really see when the address space was destroyed, maybe there's a
> > > > > bug somewhere; I put that info into appendix at the end.
> > > >
> > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > > >
> > > > I got blocked on that because I ran into a weird "I have some
> > > > memory that needs to be freed by the RCU callback, but only
> > > > after the callback has freed some other RCU stuff". I see
> > > > Paolo made a reply on that bug -- I would need to get back
> > > > to it and reproduce whatever it was I was doing.
> > >
> > > Thanks for the link, right that looks exactly like what I hit.
> > >
> > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > > detail only specific to QEMU.
> > >
> > > The other thing is I feel like it should be OK to reorder callbacks, if all
> > > the call_rcu() users can make sure the rcu-freed object is completely
> > > detached from the rest world, e.g. resetting all relevant pointers to NULL.
> > > With that, it seems the order won't matter too, because nobody will be able
> > > to reference the internal object anyway, so the two objects (after reseting
> > > all referers to NULL pointer of the inner object) are completely standalone.
> >
> > The specific ordering problem for cpu_address_space is that
> > there's a g_new allocated array of memory which contains
> > the AddressSpace objects (not pointers to them). The ASes need
> > to be RCU-deallocated first so they can clean up their internal
> > data structures; only once that has happened can we free the
> > memory that holds the AddressSpace structs themselves.
>
> If it's about cpu_address_space_destroy(), then IIUC it can also be done by
> providing a destroy_free() function so that instead of trying to serialize
> two rcu callbacks, we could easily serialize the operations in one
> callback. One sample patch attached to avoid relying on order of rcu
> enqueue.
The cpu_address_space_destroy() function is broken and never
called by anything. It needs rewriting to instead of trying
to destroy cpu_as, just destroy every AS the CPU has at once.
(I have some code for this.)
I'm trying to repro the setup I had last year, but I can't
figure out a setup where I can get hot-unplug to work:
the "device-del" command documented in system/cpu-hotplug.html
always fails with:
"desc": "acpi: device unplug request for not supported device type:
IvyBridge-IBRS-x86_64-cpu"
Do you know how to get this working?
thanks
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-26 15:59 ` Peter Maydell
@ 2025-09-26 16:56 ` Peter Maydell
2025-09-26 17:10 ` Peter Xu
0 siblings, 1 reply; 21+ messages in thread
From: Peter Maydell @ 2025-09-26 16:56 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Fri, 26 Sept 2025 at 16:59, Peter Maydell <peter.maydell@linaro.org> wrote:
> I'm trying to repro the setup I had last year, but I can't
> figure out a setup where I can get hot-unplug to work:
> the "device-del" command documented in system/cpu-hotplug.html
> always fails with:
>
> "desc": "acpi: device unplug request for not supported device type:
> IvyBridge-IBRS-x86_64-cpu"
>
> Do you know how to get this working?
Turns out I needed to also add enough command line arguments
to get a guest to boot (which is a bit tricky when you don't
want to use the gui frontend because fontconfig et al are
full of leaks that clutter up the asan output). The error
message from device-del is extremely unhelpful here.
The command line I ended up with was
./build/x86-tgts-asan/qemu-system-x86_64 -vga std -display curses
-no-user-config -m 2048 -nodefaults -machine q35,accel=kvm,usb=off
-smp 1,maxcpus=2 -cpu IvyBridge-IBRS -qmp
unix:/tmp/qmp-sock,server=on,wait=off -cdrom
~/test-images/ubuntu-server-2404/ubuntu-24.04-live-server-amd64.iso
thanks
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-26 16:56 ` Peter Maydell
@ 2025-09-26 17:10 ` Peter Xu
0 siblings, 0 replies; 21+ messages in thread
From: Peter Xu @ 2025-09-26 17:10 UTC (permalink / raw)
To: Peter Maydell
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Fri, Sep 26, 2025 at 05:56:59PM +0100, Peter Maydell wrote:
> On Fri, 26 Sept 2025 at 16:59, Peter Maydell <peter.maydell@linaro.org> wrote:
> > I'm trying to repro the setup I had last year, but I can't
> > figure out a setup where I can get hot-unplug to work:
> > the "device-del" command documented in system/cpu-hotplug.html
> > always fails with:
> >
> > "desc": "acpi: device unplug request for not supported device type:
> > IvyBridge-IBRS-x86_64-cpu"
> >
> > Do you know how to get this working?
>
> Turns out I needed to also add enough command line arguments
> to get a guest to boot (which is a bit tricky when you don't
> want to use the gui frontend because fontconfig et al are
> full of leaks that clutter up the asan output). The error
> message from device-del is extremely unhelpful here.
>
> The command line I ended up with was
>
> ./build/x86-tgts-asan/qemu-system-x86_64 -vga std -display curses
> -no-user-config -m 2048 -nodefaults -machine q35,accel=kvm,usb=off
> -smp 1,maxcpus=2 -cpu IvyBridge-IBRS -qmp
> unix:/tmp/qmp-sock,server=on,wait=off -cdrom
> ~/test-images/ubuntu-server-2404/ubuntu-24.04-live-server-amd64.iso
Right, I remember I hit similar condition, but after I switch to a Linux
guest it started to work (with/without leaks..). Looks like we need the
guest/firmware to respond to some of the ACPI requests.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-26 15:16 ` Peter Xu
2025-09-26 15:59 ` Peter Maydell
@ 2025-09-29 12:45 ` Peter Maydell
2025-09-29 14:37 ` Peter Xu
1 sibling, 1 reply; 21+ messages in thread
From: Peter Maydell @ 2025-09-29 12:45 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Fri, 26 Sept 2025 at 16:16, Peter Xu <peterx@redhat.com> wrote:
>
> On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> > On Thu, 25 Sept 2025 at 21:06, Peter Xu <peterx@redhat.com> wrote:
> > >
> > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > > > > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > > > > I didn't really see when the address space was destroyed, maybe there's a
> > > > > bug somewhere; I put that info into appendix at the end.
> > > >
> > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > > >
> > > > I got blocked on that because I ran into a weird "I have some
> > > > memory that needs to be freed by the RCU callback, but only
> > > > after the callback has freed some other RCU stuff". I see
> > > > Paolo made a reply on that bug -- I would need to get back
> > > > to it and reproduce whatever it was I was doing.
> > >
> > > Thanks for the link, right that looks exactly like what I hit.
> > >
> > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > > detail only specific to QEMU.
> > >
> > > The other thing is I feel like it should be OK to reorder callbacks, if all
> > > the call_rcu() users can make sure the rcu-freed object is completely
> > > detached from the rest world, e.g. resetting all relevant pointers to NULL.
> > > With that, it seems the order won't matter too, because nobody will be able
> > > to reference the internal object anyway, so the two objects (after reseting
> > > all referers to NULL pointer of the inner object) are completely standalone.
> >
> > The specific ordering problem for cpu_address_space is that
> > there's a g_new allocated array of memory which contains
> > the AddressSpace objects (not pointers to them). The ASes need
> > to be RCU-deallocated first so they can clean up their internal
> > data structures; only once that has happened can we free the
> > memory that holds the AddressSpace structs themselves.
>
> If it's about cpu_address_space_destroy(), then IIUC it can also be done by
> providing a destroy_free() function so that instead of trying to serialize
> two rcu callbacks, we could easily serialize the operations in one
> callback. One sample patch attached to avoid relying on order of rcu
> enqueue.
I figured out what my problem was here: like the existing
cpu_address_space_destroy(), it wants to first destroy the AS
and then free the memory the AS is using. So it does the
obvious thing:
address_space_destroy(cpuas->as);
g_free_rcu(cpuas->as, rcu);
This doesn't work, because address_space_destroy() sets
up an RCU callback using the 'rcu' node in the AddressSpace
struct. But then g_free_rcu() tries to do exactly the same
thing and overwrites the info in the 'rcu' node: so we never
call the do_address_space_destroy() hook.
(1) Is there some way we can make this "tried to use the RCU
node twice" assert?
(2) I think the simplest fix here is something like the
patch you propose that does the "destroy + free" in one
RCU callback.
thanks
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-29 12:45 ` Peter Maydell
@ 2025-09-29 14:37 ` Peter Xu
2025-09-29 14:43 ` Peter Maydell
0 siblings, 1 reply; 21+ messages in thread
From: Peter Xu @ 2025-09-29 14:37 UTC (permalink / raw)
To: Peter Maydell
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Mon, Sep 29, 2025 at 01:45:05PM +0100, Peter Maydell wrote:
> On Fri, 26 Sept 2025 at 16:16, Peter Xu <peterx@redhat.com> wrote:
> >
> > On Fri, Sep 26, 2025 at 10:09:29AM +0100, Peter Maydell wrote:
> > > On Thu, 25 Sept 2025 at 21:06, Peter Xu <peterx@redhat.com> wrote:
> > > >
> > > > On Thu, Sep 25, 2025 at 10:03:45AM +0100, Peter Maydell wrote:
> > > > > On Wed, 24 Sept 2025 at 22:14, Peter Xu <peterx@redhat.com> wrote:
> > > > > > Side note: when I was trying to test hotplugs with i386/q35, unfortunately
> > > > > > I didn't really see when the address space was destroyed, maybe there's a
> > > > > > bug somewhere; I put that info into appendix at the end.
> > > > >
> > > > > This is https://gitlab.com/qemu-project/qemu/-/issues/2517
> > > > >
> > > > > I got blocked on that because I ran into a weird "I have some
> > > > > memory that needs to be freed by the RCU callback, but only
> > > > > after the callback has freed some other RCU stuff". I see
> > > > > Paolo made a reply on that bug -- I would need to get back
> > > > > to it and reproduce whatever it was I was doing.
> > > >
> > > > Thanks for the link, right that looks exactly like what I hit.
> > > >
> > > > I am curious if FIFO is guaranteed for RCU in general, or it is an impl
> > > > detail only specific to QEMU.
> > > >
> > > > The other thing is I feel like it should be OK to reorder callbacks, if all
> > > > the call_rcu() users can make sure the rcu-freed object is completely
> > > > detached from the rest world, e.g. resetting all relevant pointers to NULL.
> > > > With that, it seems the order won't matter too, because nobody will be able
> > > > to reference the internal object anyway, so the two objects (after reseting
> > > > all referers to NULL pointer of the inner object) are completely standalone.
> > >
> > > The specific ordering problem for cpu_address_space is that
> > > there's a g_new allocated array of memory which contains
> > > the AddressSpace objects (not pointers to them). The ASes need
> > > to be RCU-deallocated first so they can clean up their internal
> > > data structures; only once that has happened can we free the
> > > memory that holds the AddressSpace structs themselves.
> >
> > If it's about cpu_address_space_destroy(), then IIUC it can also be done by
> > providing a destroy_free() function so that instead of trying to serialize
> > two rcu callbacks, we could easily serialize the operations in one
> > callback. One sample patch attached to avoid relying on order of rcu
> > enqueue.
>
> I figured out what my problem was here: like the existing
> cpu_address_space_destroy(), it wants to first destroy the AS
> and then free the memory the AS is using. So it does the
> obvious thing:
> address_space_destroy(cpuas->as);
> g_free_rcu(cpuas->as, rcu);
>
> This doesn't work, because address_space_destroy() sets
> up an RCU callback using the 'rcu' node in the AddressSpace
> struct. But then g_free_rcu() tries to do exactly the same
> thing and overwrites the info in the 'rcu' node: so we never
> call the do_address_space_destroy() hook.
>
> (1) Is there some way we can make this "tried to use the RCU
> node twice" assert?
Good point. Maybe we should assert node->func==NULL in call_rcu1().
>
> (2) I think the simplest fix here is something like the
> patch you propose that does the "destroy + free" in one
> RCU callback.
Yes, I agree.
Note that Akihiko has another series to QOMify Address space. This problem
should be relevant there too. Currently I believe it's similarly broken in
his series, but I think maybe we should fix this first on x86/arm hotplugs.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH v2 3/3] memory: Stop piggybacking on memory region owners
2025-09-29 14:37 ` Peter Xu
@ 2025-09-29 14:43 ` Peter Maydell
0 siblings, 0 replies; 21+ messages in thread
From: Peter Maydell @ 2025-09-29 14:43 UTC (permalink / raw)
To: Peter Xu
Cc: Akihiko Odaki, qemu-devel, Alex Williamson, Cédric Le Goater,
Paolo Bonzini, Daniel P. Berrangé, Eduardo Habkost,
David Hildenbrand, Philippe Mathieu-Daudé, Richard Henderson,
Helge Deller, Marc-André Lureau, Michael S. Tsirkin,
Gerd Hoffmann, John Snow, qemu-block, Keith Busch, Klaus Jensen,
Jesper Devantier, Marcel Apfelbaum, Nicholas Piggin, qemu-ppc,
John Levon, Thanos Makatos, Yanan Wang, BALATON Zoltan,
Jiaxun Yang, Daniel Henrique Barboza, David Gibson,
Harsh Prateek Bora, Alexey Kardashevskiy, Alex Bennée,
Fabiano Rosas, Thomas Huth, Laurent Vivier, Dmitry Osipenko,
Manos Pitsidianakis
On Mon, 29 Sept 2025 at 15:37, Peter Xu <peterx@redhat.com> wrote:
> On Mon, Sep 29, 2025 at 01:45:05PM +0100, Peter Maydell wrote:
> > I figured out what my problem was here: like the existing
> > cpu_address_space_destroy(), it wants to first destroy the AS
> > and then free the memory the AS is using. So it does the
> > obvious thing:
> > address_space_destroy(cpuas->as);
> > g_free_rcu(cpuas->as, rcu);
> >
> > This doesn't work, because address_space_destroy() sets
> > up an RCU callback using the 'rcu' node in the AddressSpace
> > struct. But then g_free_rcu() tries to do exactly the same
> > thing and overwrites the info in the 'rcu' node: so we never
> > call the do_address_space_destroy() hook.
> >
> > (1) Is there some way we can make this "tried to use the RCU
> > node twice" assert?
>
> Good point. Maybe we should assert node->func==NULL in call_rcu1().
>
> >
> > (2) I think the simplest fix here is something like the
> > patch you propose that does the "destroy + free" in one
> > RCU callback.
>
> Yes, I agree.
>
> Note that Akihiko has another series to QOMify Address space. This problem
> should be relevant there too. Currently I believe it's similarly broken in
> his series, but I think maybe we should fix this first on x86/arm hotplugs.
Yep; I've just sent out a patchset that tries to do that.
-- PMM
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2025-09-29 14:46 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-06 2:39 [PATCH v2 0/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 1/3] qom: Do not finalize twice Akihiko Odaki
2025-09-23 9:27 ` Paolo Bonzini
2025-09-06 2:39 ` [PATCH v2 2/3] virtio-gpu-virgl: Add virtio-gpu-virgl-hostmem-region type Akihiko Odaki
2025-09-06 2:39 ` [PATCH v2 3/3] memory: Stop piggybacking on memory region owners Akihiko Odaki
2025-09-10 21:45 ` Peter Xu
2025-09-11 3:40 ` Akihiko Odaki
2025-09-11 22:26 ` Peter Xu
2025-09-18 12:04 ` Akihiko Odaki
2025-09-24 21:14 ` Peter Xu
2025-09-25 9:03 ` Peter Maydell
2025-09-25 20:05 ` Peter Xu
2025-09-26 9:09 ` Peter Maydell
2025-09-26 15:16 ` Peter Xu
2025-09-26 15:59 ` Peter Maydell
2025-09-26 16:56 ` Peter Maydell
2025-09-26 17:10 ` Peter Xu
2025-09-29 12:45 ` Peter Maydell
2025-09-29 14:37 ` Peter Xu
2025-09-29 14:43 ` Peter Maydell
2025-09-23 8:41 ` Paolo Bonzini
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).