* [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
@ 2025-08-05 8:11 Albert Esteve
From: Albert Esteve @ 2025-08-05 8:11 UTC
To: qemu-devel
Cc: stefanha, Laurent Vivier, Fabiano Rosas, peterx, pbonzini,
David Hildenbrand, Philippe Mathieu-Daudé, Albert Esteve
v1->v2:
- Added documentation
- Explained the reasoning in the commit message
In the last version of the SHMEM MAP/UNMAP [1] Stefan
raised a concern [2] about dynamically creating and
destroying memory regions and their lifecycle [3].
After some discussion, David Hildenbrand proposed
to detect RAM regions and handle refcounting differently.
I tried to extend the reasoning in the commit message
below. If I wrote any inaccuracies, please keep me
honest. I hope we can gather some feedback with
this RFC patch before sending it for inclusion.
[1] https://patchwork.ozlabs.org/project/qemu-devel/list/?series=460121
[2] https://patchwork.ozlabs.org/comment/3528600/
[3] https://www.qemu.org/docs/master/devel/memory.html#region-lifecycle
---
Memory region lifecycle documentation discourages
creating or destroying memory regions dynamically
during a device's lifetime. The main concern here
is that "it may be in use by a device or CPU".
This means that a memory region could outlive its
owner.
However, this concern seems to revolve around
the fact that MMIO callbacks will most likely
access data that belongs to the owner. Dynamically
removing/destroying the owner will result in buggy
behaviour. RAM regions, on the other hand, do not
have this problem, as they are self-contained data
objects. Recognizing non-MMIO MRs and handling
their ref_count appropriately could relax (for RAM
regions specifically) the initial concern stated
in the documentation.
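To illustrate with a hypothetical device (an
example only, not part of this patch): a typical
MMIO callback dereferences its opaque pointer,
which points into the owner's state, so the owner
must stay alive while the region is in use.

typedef struct MyDeviceState {
    SysBusDevice parent_obj;
    MemoryRegion iomem;     /* MMIO region owned by this device */
    uint32_t regs[64];
} MyDeviceState;

static uint64_t mydev_mmio_read(void *opaque, hwaddr addr, unsigned size)
{
    MyDeviceState *s = opaque;  /* use-after-free if the owner is gone */
    return s->regs[(addr >> 2) & 63];
}

static const MemoryRegionOps mydev_mmio_ops = {
    .read = mydev_mmio_read,
    .endianness = DEVICE_NATIVE_ENDIAN,
};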
This patch enhances memory_region_ref() and memory_region_unref()
to handle RAM and MMIO memory regions differently, improving
reference counting semantics.
RAM regions now reference/unreference the memory region object
itself, while MMIO regions continue to reference/unreference the
owner device as before.
An additional qtest has been added to stress the memory
region lifecycle. All tests pass, as these changes remain
backward compatible (prior behaviour is preserved for MMIO
regions).
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Albert Esteve <aesteve@redhat.com>
---
docs/devel/memory.rst | 24 +++++++++++++-----
system/memory.c | 28 +++++++++++++++++----
tests/qtest/ivshmem-test.c | 50 ++++++++++++++++++++++++++++++++++++++
3 files changed, 91 insertions(+), 11 deletions(-)
diff --git a/docs/devel/memory.rst b/docs/devel/memory.rst
index 57fb2aec76..9c428e82ff 100644
--- a/docs/devel/memory.rst
+++ b/docs/devel/memory.rst
@@ -143,11 +143,15 @@ Region lifecycle
----------------
A region is created by one of the memory_region_init*() functions and
-attached to an object, which acts as its owner or parent. QEMU ensures
-that the owner object remains alive as long as the region is visible to
-the guest, or as long as the region is in use by a virtual CPU or another
-device. For example, the owner object will not die between an
-address_space_map operation and the corresponding address_space_unmap.
+attached to an object, which acts as its owner or parent.
+
+For MMIO regions, QEMU ensures that the owner object remains alive as
+long as the region is visible to the guest, or as long as the region is in
+use by a virtual CPU or another device. For example, the owner object will
+not die between an address_space_map operation and the corresponding
+address_space_unmap. For RAM regions, the region itself is kept alive
+rather than the owner, since RAM regions are self-contained data objects
+that manage their own lifecycle.
After creation, a region can be added to an address space or a
container with memory_region_add_subregion(), and removed using
@@ -174,7 +178,8 @@ callback. The dynamically allocated data structure that contains the
memory region then should obviously be freed in the instance_finalize
callback as well.
-If you break this rule, the following situation can happen:
+If you break this rule, the following situation can happen for MMIO
+regions:
- the memory region's owner had a reference taken via memory_region_ref
(for example by address_space_map)
@@ -184,6 +189,13 @@ If you break this rule, the following situation can happen:
- when address_space_unmap is called, the reference to the memory region's
owner is leaked.
+Note that memory_region_ref() and memory_region_unref() handle different
+region types appropriately: MMIO regions reference their owner device
+(since MMIO callbacks access owner data), while RAM regions reference
+themselves (since they are self-contained). Regions without owners, such
+as system memory containers and I/O address spaces, skip reference
+counting entirely as they exist for the machine's lifetime.
+
There is an exception to the above rule: it is okay to call
object_unparent at any time for an alias or a container region. It is
diff --git a/system/memory.c b/system/memory.c
index 5646547940..1592293be7 100644
--- a/system/memory.c
+++ b/system/memory.c
@@ -1826,6 +1826,17 @@ Object *memory_region_owner(MemoryRegion *mr)
void memory_region_ref(MemoryRegion *mr)
{
+ /* Regions without an owner are considered static. */
+ if (!mr || !mr->owner) {
+ return;
+ }
+ if (mr->ram) {
+ /* RAM regions are self-contained data objects, so we ref/unref
+ * the memory region itself rather than its owner.
+ */
+ object_ref(OBJECT(mr));
+ return;
+ }
/* MMIO callbacks most likely will access data that belongs
* to the owner, hence the need to ref/unref the owner whenever
* the memory region is in use.
@@ -1836,16 +1847,23 @@ void memory_region_ref(MemoryRegion *mr)
* Memory regions without an owner are supposed to never go away;
* we do not ref/unref them because it slows down DMA sensibly.
*/
- if (mr && mr->owner) {
- object_ref(mr->owner);
- }
+ object_ref(mr->owner);
}
void memory_region_unref(MemoryRegion *mr)
{
- if (mr && mr->owner) {
- object_unref(mr->owner);
+ /* Regions without an owner are considered static. */
+ if (!mr || !mr->owner) {
+ return;
+ }
+ if (mr->ram) {
+ /* RAM regions are self-contained data objects, so we ref/unref
+ * the memory region itself rather than its owner.
+ */
+ object_unref(OBJECT(mr));
+ return;
}
+ object_unref(mr->owner);
}
uint64_t memory_region_size(MemoryRegion *mr)
diff --git a/tests/qtest/ivshmem-test.c b/tests/qtest/ivshmem-test.c
index fb45fdeb07..44f712e9ae 100644
--- a/tests/qtest/ivshmem-test.c
+++ b/tests/qtest/ivshmem-test.c
@@ -194,6 +194,55 @@ static void test_ivshmem_single(void)
cleanup_vm(s);
}
+static void test_memory_region_lifecycle(void)
+{
+ /* Device creation triggers memory region mapping (calls ref) */
+ IVState state1, state2;
+ uint32_t test_data, read_data;
+ int i;
+
+ setup_vm(&state1);
+
+ /* Basic verification that device works */
+ test_data = 0x12345678;
+ write_mem(&state1, 0, &test_data, sizeof(test_data));
+ read_mem(&state1, 0, &read_data, sizeof(read_data));
+ g_assert_cmpuint(read_data, ==, test_data);
+
+ /* Multiple devices stress test memory region ref counting */
+ setup_vm(&state2);
+
+ /* Verify both devices work independently */
+ test_data = 0xDEADBEEF;
+ write_mem(&state2, 4, &test_data, sizeof(test_data));
+ read_mem(&state2, 4, &read_data, sizeof(read_data));
+ g_assert_cmpuint(read_data, ==, test_data);
+
+ /* Device destruction triggers memory region unmapping (calls unref) */
+ cleanup_vm(&state1);
+
+ /* Verify remaining device still works after first device cleanup */
+ read_mem(&state2, 4, &read_data, sizeof(read_data));
+ g_assert_cmpuint(read_data, ==, test_data);
+
+ /* Final cleanup */
+ cleanup_vm(&state2);
+
+ /* Quick lifecycle stress test - multiple create/destroy cycles */
+ for (i = 0; i < 5; i++) {
+ IVState temp_state;
+ setup_vm(&temp_state);
+
+ /* Quick test to ensure device works */
+ test_data = 0x1000 + i;
+ write_mem(&temp_state, 0, &test_data, sizeof(test_data));
+ read_mem(&temp_state, 0, &read_data, sizeof(read_data));
+ g_assert_cmpuint(read_data, ==, test_data);
+
+ cleanup_vm(&temp_state);
+ }
+}
+
static void test_ivshmem_pair(void)
{
IVState state1, state2, *s1, *s2;
@@ -503,6 +552,7 @@ int main(int argc, char **argv)
tmpserver = g_strconcat(tmpdir, "/server", NULL);
qtest_add_func("/ivshmem/single", test_ivshmem_single);
+ qtest_add_func("/ivshmem/memory-lifecycle", test_memory_region_lifecycle);
qtest_add_func("/ivshmem/hotplug", test_ivshmem_hotplug);
qtest_add_func("/ivshmem/memdev", test_ivshmem_memdev);
if (g_test_slow()) {
--
2.49.0
* Re: [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
From: Peter Xu @ 2025-08-06 20:15 UTC
To: Albert Esteve
Cc: qemu-devel, stefanha, Laurent Vivier, Fabiano Rosas, pbonzini,
David Hildenbrand, Philippe Mathieu-Daudé
On Tue, Aug 05, 2025 at 10:11:23AM +0200, Albert Esteve wrote:
> v1->v2:
> - Added documentation
> - Explained the reasoning in the commit message
>
> In the last version of the SHMEM MAP/UNMAP [1] Stefan
> raised a concern [2] about dynamically creating and
> destroying memory regions and their lifecycle [3].
>
> After some discussion, David Hildenbrand proposed
> to detect RAM regions and handle refcounting differently.
> I tried to extend the reasoning in the commit message
> below. If I wrote any inaccuracies, please keep me
> honest. I hope we can gather some feedback with
> this RFC patch before sending it for inclusion.
This seems to be working. Looks like so far all RAM MRs are fine with it,
but I'm not strongly confident that it's true or that it'll trivially stay
true in the future.
Besides, this still adds some trivial complexity to memory_region_ref() by
treating RAM/MMIO MRs differently.
It also sounds like a pure "accident" that the shmem objects to be mapped
from the vhost-user devices are RAMs. I wonder what happens if we want to
also support dynamic MMIO regions.
Would this work even without changing QEMU memory core?
For example, have you thought about creating a VhostUserShmemObject for
each VHOST_USER_BACKEND_SHMEM_MAP request?
AFAICT, QEMU has complete refcounting support for objects, so I thought
they should be totally fine being dynamically created or destroyed. Then MRs
will be children of those dynamic objects rather than the vhost-user
device.
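
Completely untested, but roughly what I have in mind (all names and
signatures below are hypothetical):

#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qom/object.h"
#include "system/memory.h"

#define TYPE_VHOST_USER_SHMEM "vhost-user-shmem"
OBJECT_DECLARE_SIMPLE_TYPE(VhostUserShmemObject, VHOST_USER_SHMEM)

struct VhostUserShmemObject {
    Object parent_obj;
    MemoryRegion mr;            /* owned by this object, not the device */
};

/* One of these would be created per VHOST_USER_BACKEND_SHMEM_MAP. */
static VhostUserShmemObject *vhost_user_shmem_new(int fd, uint64_t size,
                                                  Error **errp)
{
    VhostUserShmemObject *obj;

    obj = VHOST_USER_SHMEM(object_new(TYPE_VHOST_USER_SHMEM));
    /*
     * Passing the shmem object as owner means memory_region_ref()
     * pins this object -- not the vhost-user device -- while the
     * region is in use.
     */
    if (!memory_region_init_ram_from_fd(&obj->mr, OBJECT(obj),
                                        "vhost-user-shmem", size,
                                        RAM_SHARED, fd, 0, errp)) {
        object_unref(OBJECT(obj));
        return NULL;
    }
    return obj;
}

static const TypeInfo vhost_user_shmem_info = {
    .name          = TYPE_VHOST_USER_SHMEM,
    .parent        = TYPE_OBJECT,
    .instance_size = sizeof(VhostUserShmemObject),
};

Then unmapping is just unparenting/unrefing that object, with no changes
to the memory core.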
Thanks,
--
Peter Xu
* Re: [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
From: David Hildenbrand @ 2025-08-06 20:36 UTC
To: Peter Xu, Albert Esteve
Cc: qemu-devel, stefanha, Laurent Vivier, Fabiano Rosas, pbonzini,
Philippe Mathieu-Daudé
On 06.08.25 22:15, Peter Xu wrote:
> On Tue, Aug 05, 2025 at 10:11:23AM +0200, Albert Esteve wrote:
>> v1->v2:
>> - Added documentation
>> - Explained the reasoning in the commit message
>>
>> In the last version of the SHMEM MAP/UNMAP [1] Stefan
>> raised a concern [2] about dynamically creating and
>> destroying memory regions and their lifecycle [3].
>>
>> After some discussion, David Hildenbrand proposed
>> to detect RAM regions and handle refcounting differently.
>> I tried to extend the reasoning in the commit message
>> below. If I wrote any inaccuracies, please keep me
>> honest. I hope we can gather some feedback with
>> this RFC patch before sending it for inclusion.
>
> This seems to be working. Looks like so far all RAM MRs are fine with it,
> but I'm not strongly confident that it's true or that it'll trivially stay
> true in the future.
>
> Besides, this still adds some trivial complexity to memory_region_ref() by
> treating RAM/MMIO MRs differently.
> It also sounds like a pure "accident" that the shmem objects to be mapped
> from the vhost-user devices are RAMs. I wonder what happens if we want to
> also support dynamic MMIO regions.
Is this use case realistic?
If there is a reasonable way to prepare for such hypothetical use cases
while solving Albert's immediate use case, I'm all for it.
>
> Would this work even without changing QEMU memory core?
>
> For example, have you thought about creating a VhostUserShmemObject for
> each VHOST_USER_BACKEND_SHMEM_MAP request?
You mean, adding an intermediate object that remains the parent of these
MemoryRegions?
Could work. To free a MemoryRegion, I guess we would unparent that
intermediate object, and that object would then free the memory region
-- unless something still references that intermediate object. Not sure
if the memory region might keep the intermediate object still alive (no
idea).
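
Roughly (untested, names made up), the flow I am describing:

/* Assuming the VhostUserShmemObject sketch from Peter's mail. */
static void vhost_user_shmem_unmap(VhostUserShmemObject *obj,
                                   MemoryRegion *container)
{
    /* Make the region invisible to the guest. */
    memory_region_del_subregion(container, &obj->mr);
    /*
     * Drop the parent's reference.  The object -- and with it the
     * MemoryRegion -- is only finalized once any reference taken via
     * memory_region_ref() (e.g. by address_space_map()) has been
     * dropped as well.
     */
    object_unparent(OBJECT(obj));
}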
Certainly something to explore, Albert, can you look into that?
--
Cheers,
David / dhildenb
* Re: [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
From: Peter Xu @ 2025-08-06 20:56 UTC
To: David Hildenbrand
Cc: Albert Esteve, qemu-devel, stefanha, Laurent Vivier,
Fabiano Rosas, pbonzini, Philippe Mathieu-Daudé
On Wed, Aug 06, 2025 at 10:36:33PM +0200, David Hildenbrand wrote:
> On 06.08.25 22:15, Peter Xu wrote:
> > On Tue, Aug 05, 2025 at 10:11:23AM +0200, Albert Esteve wrote:
> > > v1->v2:
> > > - Added documentation
> > > - Explained the reasoning in the commit message
> > >
> > > In the last version of the SHMEM MAP/UNMAP [1] Stefan
> > > raised a concern [2] about dynamically creating and
> > > destroying memory regions and their lifecycle [3].
> > >
> > > After some discussion, David Hildenbrand proposed
> > > to detect RAM regions and handle refcounting differently.
> > > I tried to extend the reasoning in the commit message
> > > below. If I wrote any inaccuracies, please keep me
> > > honest. I hope we can gather some feedback with
> > > this RFC patch before sending it for inclusion.
> >
> > This seems to be working. Looks like so far all RAM MRs are fine with it,
> > but I'm not strongly confident that it's true or that it'll trivially stay
> > true in the future.
> >
> > Besides, this still adds some trivial complexity to memory_region_ref() by
> > treating RAM/MMIO MRs differently.
> > It also sounds like a pure "accident" that the shmem objects to be mapped
> > from the vhost-user devices are RAMs. I wonder what happens if we want to
> > also support dynamic MMIO regions.
>
> Is this use case realistic?
Nope. :) It's a sincere wish that if such a feature is to be introduced,
it could work for MMIOs too. Or better, that there is no need to introduce
it at all.
>
> If there is a reasonable way to prepare for such hypothetical use cases
> while solving Albert's immediate use case, I'm all for it.
>
> >
> > Would this work even without changing QEMU memory core?
> >
> > For example, have you thought about creating a VhostUserShmemObject for
> > each VHOST_USER_BACKEND_SHMEM_MAP request?
>
> You mean, adding an intermediate object that remains the parent of these
> MemoryRegions?
>
> Could work. To free a MemoryRegion, I guess we would unparent that
> intermediate object, and that object would then free the memory region --
> unless something still references that intermediate object. Not sure if the
> memory region might keep the intermediate object still alive (no idea).
It should, as long as memory_region_ref() will boost the tempobj's refcount
properly.
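
That is already what the current code does (quoting pre-patch memory.c):

void memory_region_ref(MemoryRegion *mr)
{
    /* ... */
    if (mr && mr->owner) {
        object_ref(mr->owner);   /* owner would be the tempobj here */
    }
}

So as long as the intermediate object is passed as the owner when the MR
is initialized, every map/ref keeps it alive until the matching unref.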
Thanks,
>
> Certainly something to explore, Albert, can you look into that?
>
> --
> Cheers,
>
> David / dhildenb
>
--
Peter Xu
* Re: [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
From: Albert Esteve @ 2025-08-07 7:22 UTC
To: David Hildenbrand
Cc: Peter Xu, qemu-devel, stefanha, Laurent Vivier, Fabiano Rosas,
pbonzini, Philippe Mathieu-Daudé
On Wed, Aug 6, 2025 at 10:36 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 06.08.25 22:15, Peter Xu wrote:
> > On Tue, Aug 05, 2025 at 10:11:23AM +0200, Albert Esteve wrote:
> >> v1->v2:
> >> - Added documentation
> >> - Explained the reasoning in the commit message
> >>
> >> In the last version of the SHMEM MAP/UNMAP [1] Stefan
> >> raised a concern [2] about dynamically creating and
> >> destroying memory regions and their lifecycle [3].
> >>
> >> After some discussion, David Hildenbrand proposed
> >> to detect RAM regions and handle refcounting differently.
> >> I tried to extend the reasoning in the commit message
> >> below. If I wrote any inaccuracies, please keep me
> >> honest. I hope we can gather some feedback with
> >> this RFC patch before sending it for inclusion.
> >
> > This seems to be working. Looks like so far all RAM MRs are fine with it,
> > but I'm not strongly confident that it's true or that it'll trivially stay
> > true in the future.
> >
> > Besides, this still adds some trivial complexity to memory_region_ref() by
> > treating RAM/MMIO MRs differently.
> > It also sounds like a pure "accident" that the shmem objects to be mapped
> > from the vhost-user devices are RAMs. I wonder what happens if we want to
> > also support dynamic MMIO regions.
>
> Is this use case realistic?
>
> If there is a reasonable way to prepare for such hypothetical use cases
> while solving Albert's immediate use case, I'm all for it.
>
> >
> > Would this work even without changing QEMU memory core?
> >
> > For example, have you thought about creating a VhostUserShmemObject for
> > each VHOST_USER_BACKEND_SHMEM_MAP request?
>
> You mean, adding an intermediate object that remains the parent of these
> MemoryRegions?
>
> Could work. To free a MemoryRegion, I guess we would unparent that
> intermediate object, and that object would then free the memory region
> -- unless something still references that intermediate object. Not sure
> if the memory region might keep the intermediate object still alive (no
> idea).
>
> Certainly something to explore, Albert, can you look into that?
Sure, I will try this. Thank you both for the time and help.
>
> --
> Cheers,
>
> David / dhildenb
>
* Re: [RFC v2] memory.c: improve refcounting for RAM vs MMIO regions
From: Albert Esteve @ 2025-08-14 8:47 UTC
To: David Hildenbrand
Cc: Peter Xu, qemu-devel, stefanha, Laurent Vivier, Fabiano Rosas,
pbonzini, Philippe Mathieu-Daudé
On Thu, Aug 7, 2025 at 9:22 AM Albert Esteve <aesteve@redhat.com> wrote:
>
> On Wed, Aug 6, 2025 at 10:36 PM David Hildenbrand <david@redhat.com> wrote:
> >
> > On 06.08.25 22:15, Peter Xu wrote:
> > > On Tue, Aug 05, 2025 at 10:11:23AM +0200, Albert Esteve wrote:
> > >> v1->v2:
> > >> - Added documentation
> > >> - Explained the reasoning in the commit message
> > >>
> > >> In the last version of the SHMEM MAP/UNMAP [1] Stefan
> > >> raised a concern [2] about dynamically creating and
> > >> destroying memory regions and their lifecycle [3].
> > >>
> > >> After some discussion, David Hildenbrand proposed
> > >> to detect RAM regions and handle refcounting differently.
> > >> I tried to extend the reasoning in the commit message
> > >> below. If I wrote any inaccuracies, please keep me
> > >> honest. I hope we can gather some feedback with
> > >> this RFC patch before sending it for inclusion.
> > >
> > > This seems to be working. Looks like so far all RAM MRs are fine with
> > > it, but I'm not strongly confident that it's true or that it'll
> > > trivially stay true in the future.
> > >
> > > Besides, this still adds some trivial complexity to memory_region_ref()
> > > by treating RAM/MMIO MRs differently.
> > > It also sounds like a pure "accident" that the shmem objects to be
> > > mapped from the vhost-user devices are RAMs. I wonder what happens if we
> > > want to also support dynamic MMIO regions.
> >
> > Is this use case realistic?
> >
> > If there is a reasonable way to prepare for such hypothetical use cases
> > while solving Albert's immediate use case, I'm all for it.
> >
> > >
> > > Would this work even without changing QEMU memory core?
> > >
> > > For example, have you thought about creating a VhostUserShmemObject for
> > > each VHOST_USER_BACKEND_SHMEM_MAP request?
> >
> > You mean, adding an intermediate object that remains the parent of these
> > MemoryRegions?
> >
> > Could work. To free a MemoryRegion, I guess we would unparent that
> > intermediate object, and that object would then free the memory region
> > -- unless something still references that intermediate object. Not sure
> > if the memory region might keep the intermediate object still alive (no
> > idea).
> >
> > Certainly something to explore, Albert, can you look into that?
>
> Sure, I will try this. Thank you both for the time and help.
I did test this approach with a modified rust-vmm backend,
and the buffers were created/destroyed correctly.
I just posted version 6 of the SHMEM_MAP/UNMAP patch.
So unless there are other issues with that patch series'
implementation, I think we can discard this RFC.
Either way, thanks for the feedback and suggestions!
>
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >