From: Steven Sistare <steven.sistare@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
David Hildenbrand <david@redhat.com>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] migration: ram block cpr blockers
Date: Wed, 29 Jan 2025 13:20:13 -0500
Message-ID: <20674b54-3c88-4d2c-a590-3b0ddaff86f9@oracle.com>
In-Reply-To: <Z4ruhpH28-GnnTq7@x1n>
On 1/17/2025 6:57 PM, Peter Xu wrote:
> On Fri, Jan 17, 2025 at 02:10:14PM -0500, Steven Sistare wrote:
>> On 1/17/2025 1:16 PM, Peter Xu wrote:
>>> On Fri, Jan 17, 2025 at 09:46:11AM -0800, Steve Sistare wrote:
>>>> +/*
>>>> + * Return true if ram contents would be lost during CPR.
>>>> + * Return false for ram_device because it is remapped in new QEMU. Do not
>>>> + * exclude rom, even though it is readonly, because the rom file could change
>>>> + * in new QEMU. Return false for non-migratable blocks. They are either
>>>> + * re-created in new QEMU, or are handled specially, or are covered by a
>>>> + * device-level CPR blocker. Return false for an fd, because it is visible and
>>>> + * can be remapped in new QEMU.
>>>> + */
>>>> +static bool ram_is_volatile(RAMBlock *rb)
>>>> +{
>>>> +    MemoryRegion *mr = rb->mr;
>>>> +
>>>> +    return mr &&
>>>> +        memory_region_is_ram(mr) &&
>>>> +        !memory_region_is_ram_device(mr) &&
>>>> +        (!qemu_ram_is_shared(rb) || !qemu_ram_is_named_file(rb)) &&
>>>> +        qemu_ram_is_migratable(rb) &&
>>>> +        rb->fd < 0;
>>>> +}
>>>
>>> Blocking guest_memfd looks ok, but comparing to add one more block
>>> notifier, can we check all ramblocks once in migrate_prepare(), and fail
>>> that command directly if it fails the check?
>>
>> In an upcoming patch, I will be adding an option analogous to only-migratable which
>> prevents QEMU from starting if anything would block cpr-transfer. That option
>> will be checked when blockers are added, like for only-migratable. migrate_prepare
>> is too late.
>>
>>> OTOH, is there any simpler way to simplify the check conditions? It'll be
>>> at least nice to break these checks into smaller if conditions for
>>> readability..
>>
>> I thought the function header comments made it clear, but I could move each
>> comment next to each condition:
>>
>> ...
>>     /*
>>      * Return false for an fd, because it is visible and can be remapped in
>>      * new QEMU.
>>      */
>>     if (rb->fd >= 0) {
>>         return false;
>>     }
>> ...
>>
>>> I wonder if we could stick with looping over all ramblocks, then make sure
>>> each of them is on the cpr saved fd list. It may need to make
>>> cpr_save_fd() always register with the name of ramblock to do such lookup,
>>> or maybe we could also cache the ramblock pointer in CprFd, then the lookup
>>> will be a pointer match check.
>>
>> Some ramblocks are not on the list, such as named files. Plus looping in
>> migrate_prepare is too late as noted above.
>>
>> IMO what I have already implemented using blockers is clean and elegant.
>
> OK if we need to fail it early at boot, then yes blockers are probably
> better.
>
> We'll need one more cmdline parameter. I've no objection, but I don't know
> how to judge when it's ok to add, when it's better not.. I'll leave others
> to comment on this.
>
> But still, could we check it when ramblocks are created? So in that way
> whatever is forbidden is clear in its own path, I feel like that could be
> clearer (like what you did with gmemfd).

When the ramblock is created, we don't yet know if it is migratable. A
ramblock that is not migratable does not block cpr. Migratable is not known
until vmstate_register_ram calls qemu_ram_set_migratable. Hence that is
where I evaluate conditions and install a blocker.

Because that is the only place where ram_block_add_cpr_blocker is called,
the test qemu_ram_is_migratable() inside ram_block_add_cpr_blocker is
redundant, and I should delete it.

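
To illustrate the ordering described above, here is a minimal standalone sketch. It is not the patch and does not use the real QEMU types: MockRAMBlock, its fields, mock_ram_is_volatile() and mock_set_migratable() are all hypothetical stand-ins. It only shows the idea that the volatility check runs at the point where the block becomes migratable, so the check itself no longer needs a migratable test, and that each condition can be written as its own early return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a RAM block; not QEMU's RAMBlock. */
typedef struct MockRAMBlock {
    const char *name;
    bool is_ram;          /* backed by ordinary RAM */
    bool is_ram_device;   /* device memory, remapped by the new process */
    bool is_shared;       /* mapped shared */
    bool is_named_file;   /* backed by a named file */
    bool migratable;      /* set only by mock_set_migratable() */
    int fd;               /* backing fd, or -1 if none */
    bool cpr_blocked;     /* models "a CPR blocker was installed" */
} MockRAMBlock;

/* Return true if the block's contents would be lost across CPR. */
static bool mock_ram_is_volatile(const MockRAMBlock *rb)
{
    if (!rb->is_ram) {
        return false;
    }
    /* Device RAM is remapped in the new process. */
    if (rb->is_ram_device) {
        return false;
    }
    /* A shared, named file can simply be reopened. */
    if (rb->is_shared && rb->is_named_file) {
        return false;
    }
    /* An fd is visible and can be remapped in the new process. */
    if (rb->fd >= 0) {
        return false;
    }
    return true;
}

/*
 * Models the point where a block is marked migratable. This is the only
 * caller of the volatility check, so the check needs no migratable test.
 */
static void mock_set_migratable(MockRAMBlock *rb)
{
    assert(rb != NULL);
    rb->migratable = true;
    rb->cpr_blocked = mock_ram_is_volatile(rb);
}
```

In this model, a block that is never passed to mock_set_migratable() never gets a blocker, regardless of how it was allocated.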
> For example, if I start to convert some of your requirements above, then
> memory_region_is_ram_device() implies RAM_PREALLOC. Actually, ram_device
> is not the only RAM_PREALLOC user.. Say, would it also not work with all
> memory_region_init_ram_ptr() users (even if they're not ram_device)? An
> example is, looks like virtio-gpu can create random ramblocks on the fly
> with prealloced buffers. I am not sure whether they can be pinned by VFIO
> too. You may know better.

That memory is not visible to the guest. It is not part of system_memory,
and is not marked migratable.

- Steve

> So, to me ram_is_volatile() is harder to follow, meanwhile it may miss
> something to me? IMO it's still better to explicitly add cpr blockers in
> the ram block add() path if possible, but maybe you still have good reasons
> to do it only until vmstate_register_ram() which I overlooked..