From: Akihiko Odaki <akihiko.odaki@daynix.com>
To: "Philippe Mathieu-Daudé" <philmd@linaro.org>,
"Peter Xu" <peterx@redhat.com>
Cc: "Eduardo Habkost" <eduardo@habkost.net>,
"Marcel Apfelbaum" <marcel.apfelbaum@gmail.com>,
"Yanan Wang" <wangyanan55@huawei.com>,
"John Snow" <jsnow@redhat.com>,
"BALATON Zoltan" <balaton@eik.bme.hu>,
"Jiaxun Yang" <jiaxun.yang@flygoat.com>,
"Nicholas Piggin" <npiggin@gmail.com>,
"Daniel Henrique Barboza" <danielhb413@gmail.com>,
"David Gibson" <david@gibson.dropbear.id.au>,
"Harsh Prateek Bora" <harshpb@linux.ibm.com>,
"Alexey Kardashevskiy" <aik@ozlabs.ru>,
"Michael S. Tsirkin" <mst@redhat.com>,
"Alex Bennée" <alex.bennee@linaro.org>,
"Fabiano Rosas" <farosas@suse.de>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"David Hildenbrand" <david@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Laurent Vivier" <lvivier@redhat.com>,
qemu-devel@nongnu.org, qemu-block@nongnu.org,
qemu-ppc@nongnu.org
Subject: Re: [PATCH v2 09/15] memory: Do not create circular reference with subregion
Date: Mon, 8 Jul 2024 17:41:24 +0900 [thread overview]
Message-ID: <8ac3dd7a-13fb-438b-a45e-afd7c4ec370b@daynix.com> (raw)
In-Reply-To: <f52c6660-3a1c-449e-93c0-07e4cefebef3@linaro.org>
On 2024/07/08 17:06, Philippe Mathieu-Daudé wrote:
> On 6/7/24 13:59, Akihiko Odaki wrote:
>> On 2024/07/03 2:44, Peter Xu wrote:
>>> On Thu, Jun 27, 2024 at 10:37:52PM +0900, Akihiko Odaki wrote:
>>>> A memory region does not use their own reference counters, but instead
>>>> piggybacks on another QOM object, "owner" (unless the owner is not the
>>>> memory region itself). When creating a subregion, a new reference to
>>>> the
>>>> owner of the container must be created. However, if the subregion is
>>>> owned by the same QOM object, this result in a self-reference, and make
>>>> the owner immortal. Avoid such a self-reference.
>>>>
>>>> Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
>>>> ---
>>>> system/memory.c | 11 +++++++++--
>>>> 1 file changed, 9 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/system/memory.c b/system/memory.c
>>>> index 74cd73ebc78b..949f5016a68d 100644
>>>> --- a/system/memory.c
>>>> +++ b/system/memory.c
>>>> @@ -2638,7 +2638,10 @@ static void
>>>> memory_region_update_container_subregions(MemoryRegion *subregion)
>>>> memory_region_transaction_begin();
>>>> - memory_region_ref(subregion);
>>>> + if (mr->owner != subregion->owner) {
>>>> + memory_region_ref(subregion);
>>>> + }
>>>> +
>>>> QTAILQ_FOREACH(other, &mr->subregions, subregions_link) {
>>>> if (subregion->priority >= other->priority) {
>>>> QTAILQ_INSERT_BEFORE(other, subregion, subregions_link);
>>>> @@ -2696,7 +2699,11 @@ void memory_region_del_subregion(MemoryRegion
>>>> *mr,
>>>> assert(alias->mapped_via_alias >= 0);
>>>> }
>>>> QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
>>>> - memory_region_unref(subregion);
>>>> +
>>>> + if (mr->owner != subregion->owner) {
>>>> + memory_region_unref(subregion);
>>>> + }
>>>> +
>>>> memory_region_update_pending |= mr->enabled &&
>>>> subregion->enabled;
>>>> memory_region_transaction_commit();
>>>> }
>>>
>>> This does look like a real issue.. the patch looks reasonable to me,
>>> but I
>>> wonder whether we should start to add some good comments in code to
>>> reflect
>>> that complexity starting from this one. The MR refcount isn't easy to
>>> understand to me.
>>>
>>> It also lets me start to wonder how MR refcount went through until it
>>> looks
>>> like today.. It's definitely not extremely intuitive to use
>>> mr->owner as
>>> the object to do refcounting if mr itself does has its own QObject,
>>> meanwhile it has other tricks around.
>>>
>>> E.g. the first thing I stumbled over when looking was the optimization
>>> where we will avoid refcounting the mr when there's no owner, and
>>> IIUC it
>>> was for the case when the "guest memory" (which will never be freed)
>>> used
>>> to have no owner so we can speedup DMA if we know it won't go away.
>>>
>>> https://lore.kernel.org/qemu-devel/1450263601-2828-5-git-send-email-pbonzini@redhat.com/
>>>
>>> commit 612263cf33062f7441a5d0e3b37c65991fdc3210
>>> Author: Paolo Bonzini <pbonzini@redhat.com>
>>> Date: Wed Dec 9 11:44:25 2015 +0100
>>>
>>> memory: avoid unnecessary object_ref/unref
>>> For the common case of DMA into non-hotplugged RAM, it is
>>> unnecessary
>>> but expensive to do object_ref/unref. Add back an owner field to
>>> MemoryRegion, so that these memory regions can skip the reference
>>> counting.
>>>
>>> If so, it looks like it will stop working with memory-backends get
>>> involved? As I think those MRs will have owner set always, and I wonder
>>> whether memory-backends should be the major way to specify guest
>>> memory now
>>> and in the future. So I'm not sure how important that optimization
>>> is as
>>> of now, and whether we could "simplify" it back to always do the
>>> refcount
>>> if the major scenarios will not adopt it.
>>>
>>> The other issue is we used owner refcount from the start of
>>> memory_region_ref() got introduced, since:
>>>
>>> commit 46637be269aaaceb9867ffdf176e906401138fff
>>> Author: Paolo Bonzini <pbonzini@redhat.com>
>>> Date: Tue May 7 09:06:00 2013 +0200
>>>
>>> memory: add ref/unref
>>>
>>> And we still have that in our document, even though I don't think
>>> it's true
>>> anymore:
>>>
>>> * ... MemoryRegions actually do not have their
>>> * own reference count; they piggyback on a QOM object, their "owner".
>>> * This function adds a reference to the owner.
>>>
>>> It looks like what happened is when introduced the change, MR is not
>>> a QOM
>>> object yet. But it later is..
>>>
>>> I mentioned all these only because I found that _if_ we can keep mr
>>> refcounting as simple as other objects:
>>>
>>> memory_region_ref(mr)
>>> {
>>> object_ref(OBJECT(mr));
>>> }
>>>
>>> Then looks like this "recursive refcount" problem can also go away. I'm
>>> curious whether you or anyone tried to explore that path, or whether
>>> above
>>> doesn't make sense at all.
>>
>> It unfortunately does not solve the problem.
>>
>> The underlying problem is that the whole device must be kept alive
>> while its memory region are. Indeed MemoryRegions do have refcounts,
>> but incrementing them do not extend the lifetime of the devices (i.e.,
>> the owners). The refcount of the owners must be incremented for
>> correctness.
>>
>> Referencing a subregion MemoryRegion from its container MemoryRegion
>> owned by the same device is an exceptional case. Incrementing the
>> refcount of the owner extends the owner's lifetime to forever.
>
> Is it really an exceptional case?
>
> What I'm seeing are a lot of devices creating MR and never bother
> to destroy them, so indeed owner (device) refcount never reaches 0.
>
> Most of the time when we create MR in .instance_init/.realize,
> we neglect to implement the undo path (.instance_finalize or
> .unrealize).
I meant memory_region_update_container_subregions() is the only code
that creates such a reference. The function itself is called in many
code paths.
next prev parent reply other threads:[~2024-07-08 8:42 UTC|newest]
Thread overview: 60+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-27 13:37 [PATCH v2 00/15] Fix check-qtest-ppc64 sanitizer errors Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 01/15] cpu: Free cpu_ases Akihiko Odaki
2024-06-28 15:12 ` Peter Maydell
2024-06-27 13:37 ` [PATCH v2 02/15] hw/ide: Convert macio ide_irq into GPIO line Akihiko Odaki
2024-06-28 7:19 ` Mark Cave-Ayland
2024-06-27 13:37 ` [PATCH v2 03/15] hw/ide: Remove internal DMA qemu_irq Akihiko Odaki
2024-06-28 7:23 ` Mark Cave-Ayland
2024-06-27 13:37 ` [PATCH v2 04/15] hw/isa/vt82c686: Define a GPIO line between vt82c686 and i8259 Akihiko Odaki
2024-06-28 7:27 ` Mark Cave-Ayland
2024-06-29 7:38 ` Akihiko Odaki
2024-06-29 8:04 ` Akihiko Odaki
2024-06-29 13:08 ` BALATON Zoltan
2024-07-01 10:32 ` Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 05/15] spapr: Free stdout path Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 06/15] ppc/vof: Fix unaligned FDT property access Akihiko Odaki
2024-06-28 15:20 ` Peter Maydell
2024-06-29 3:16 ` David Gibson
2024-07-04 11:54 ` Nicholas Piggin
2024-07-04 12:15 ` Peter Maydell
2024-07-05 1:18 ` Nicholas Piggin
2024-07-05 1:41 ` David Gibson
2024-07-05 4:40 ` Nicholas Piggin
2024-07-05 5:12 ` David Gibson
2024-07-05 7:50 ` Nicholas Piggin
2024-07-06 9:07 ` Akihiko Odaki
2024-07-06 10:37 ` Peter Maydell
2024-07-06 23:46 ` David Gibson
2024-07-08 7:49 ` Nicholas Piggin
2024-07-08 15:59 ` Peter Maydell
2024-07-09 7:46 ` David Gibson
2024-07-09 7:41 ` David Gibson
2024-07-05 1:33 ` David Gibson
2024-06-27 13:37 ` [PATCH v2 07/15] hw/virtio: Free vqs after vhost_dev_cleanup() Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 08/15] migration: Free removed SaveStateEntry Akihiko Odaki
2024-08-02 12:47 ` (subset) " Fabiano Rosas
2024-06-27 13:37 ` [PATCH v2 09/15] memory: Do not create circular reference with subregion Akihiko Odaki
2024-07-02 17:44 ` Peter Xu
2024-07-06 11:59 ` Akihiko Odaki
2024-07-08 8:06 ` Philippe Mathieu-Daudé
2024-07-08 8:41 ` Akihiko Odaki [this message]
2024-07-08 16:40 ` Peter Xu
2024-07-11 8:38 ` Akihiko Odaki
2024-08-22 17:10 ` Peter Maydell
2024-08-22 21:01 ` Peter Xu
2024-08-23 6:17 ` Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 10/15] tests/qtest: Use qtest_add_data_func_full() Akihiko Odaki
2024-07-02 6:14 ` Thomas Huth
2024-06-27 13:37 ` [PATCH v2 11/15] tests/qtest: Free unused QMP response Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 12/15] tests/qtest: Free old machine variable name Akihiko Odaki
2024-06-28 15:21 ` Peter Maydell
2024-06-27 13:37 ` [PATCH v2 13/15] tests/qtest: Delete previous boot file Akihiko Odaki
2024-07-02 7:31 ` Thomas Huth
2024-07-04 11:41 ` Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 14/15] tests/qtest: Free paths Akihiko Odaki
2024-06-27 13:37 ` [PATCH v2 15/15] tests/qtest: Free GThread Akihiko Odaki
2024-06-28 15:24 ` Peter Maydell
2024-07-01 20:11 ` [PATCH v2 00/15] Fix check-qtest-ppc64 sanitizer errors Michael S. Tsirkin
2024-07-01 22:23 ` BALATON Zoltan
2024-07-02 6:23 ` Thomas Huth
2024-07-04 11:37 ` Nicholas Piggin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8ac3dd7a-13fb-438b-a45e-afd7c4ec370b@daynix.com \
--to=akihiko.odaki@daynix.com \
--cc=aik@ozlabs.ru \
--cc=alex.bennee@linaro.org \
--cc=balaton@eik.bme.hu \
--cc=danielhb413@gmail.com \
--cc=david@gibson.dropbear.id.au \
--cc=david@redhat.com \
--cc=eduardo@habkost.net \
--cc=farosas@suse.de \
--cc=harshpb@linux.ibm.com \
--cc=jiaxun.yang@flygoat.com \
--cc=jsnow@redhat.com \
--cc=lvivier@redhat.com \
--cc=marcel.apfelbaum@gmail.com \
--cc=mst@redhat.com \
--cc=npiggin@gmail.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=qemu-ppc@nongnu.org \
--cc=thuth@redhat.com \
--cc=wangyanan55@huawei.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).