From: Steven Sistare <steven.sistare@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
David Hildenbrand <david@redhat.com>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>
Subject: Re: [PATCH] migration: ram block cpr blockers
Date: Fri, 14 Feb 2025 15:12:22 -0500 [thread overview]
Message-ID: <f1f1c640-367d-4966-93f5-a98e3b764e28@oracle.com> (raw)
In-Reply-To: <Z5uwUNtYhQfJJlg3@x1.local>
On 1/30/2025 12:01 PM, Peter Xu wrote:
> On Wed, Jan 29, 2025 at 01:20:13PM -0500, Steven Sistare wrote:
>> On 1/17/2025 6:57 PM, Peter Xu wrote:
>>> On Fri, Jan 17, 2025 at 02:10:14PM -0500, Steven Sistare wrote:
>>>> On 1/17/2025 1:16 PM, Peter Xu wrote:
>>>>> On Fri, Jan 17, 2025 at 09:46:11AM -0800, Steve Sistare wrote:
>>>>>> +/*
>>>>>> + * Return true if ram contents would be lost during CPR.
>>>>>> + * Return false for ram_device because it is remapped in new QEMU. Do not
>>>>>> + * exclude rom, even though it is readonly, because the rom file could change
>>>>>> + * in new QEMU. Return false for non-migratable blocks. They are either
>>>>>> + * re-created in new QEMU, or are handled specially, or are covered by a
>>>>>> + * device-level CPR blocker. Return false for an fd, because it is visible and
>>>>>> + * can be remapped in new QEMU.
>>>>>> + */
>>>>>> +static bool ram_is_volatile(RAMBlock *rb)
>>>>>> +{
>>>>>> + MemoryRegion *mr = rb->mr;
>>>>>> +
>>>>>> + return mr &&
>>>>>> + memory_region_is_ram(mr) &&
>>>>>> + !memory_region_is_ram_device(mr) &&
>>>>>> + (!qemu_ram_is_shared(rb) || !qemu_ram_is_named_file(rb)) &&
>>>>>> + qemu_ram_is_migratable(rb) &&
>>>>>> + rb->fd < 0;
>>>>>> +}
>>>>>
>>>>> Blocking guest_memfd looks ok, but comparing to add one more block
>>>>> notifier, can we check all ramblocks once in migrate_prepare(), and fail
>>>>> that command directly if it fails the check?
>>>>
>>>> In an upcoming patch, I will be adding an option analogous to only-migratable which
>>>> prevents QEMU from starting if anything would block cpr-transfer. That option
>>>> will be checked when blockers are added, like for only-migratable. migrate_prepare
>>>> is too late.
>>>>
>>>>> OTOH, is there any simpler way to simplify the check conditions? It'll be
>>>>> at least nice to break these checks into smaller if conditions for
>>>>> readability..
>>>>
>>>> I thought the function header comments made it clear, but I could move each
>>>> comment next to each condition:
>>>>
>>>> ...
>>>> /*
>>>> * Return false for an fd, because it is visible and can be remapped in
>>>> * new QEMU.
>>>> */
>>>> if (rb->fd >= 0) {
>>>> return false;
>>>> }
>>>> ...
>>>>
>>>>> I wonder if we could stick with looping over all ramblocks, then make sure
>>>>> each of them is on the cpr saved fd list. It may need to make
>>>>> cpr_save_fd() always register with the name of ramblock to do such lookup,
>>>>> or maybe we could also cache the ramblock pointer in CprFd, then the lookup
>>>>> will be a pointer match check.
>>>>
>>>> Some ramblocks are not on the list, such as named files. Plus looping in
>>>> migrate_prepare is too late as noted above.
>>>>
>>>> IMO what I have already implemented using blockers is clean and elegant.
>>>
>>> OK if we need to fail it early at boot, then yes blockers are probably
>>> better.
>>>
>>> We'll need one more cmdline parameter. I've no objection, but I don't know
>>> how to judge when it's ok to add, when it's better not.. I'll leave others
>>> to comment on this.
>>>
>>> But still, could we check it when ramblocks are created? So in that way
>>> whatever is forbidden is clear in its own path, I feel like that could be
>>> clearer (like what you did with gmemfd).
>>
>> When the ramblock is created, we don't yet know if it is migratable. A
>> ramblock that is not migratable does not block cpr. Migratable is not known
>> until vmstate_register_ram calls qemu_ram_set_migratable. Hence that is
>> where I evaluate conditions and install a blocker.
>>
>> Because that is the only place where ram_block_add_cpr_blocker is called,
>> the test qemu_ram_is_migratable() inside ram_block_add_cpr_blocker is
>> redundant, and I should delete it.
>
> Hmm.. sounds reasonable.
>
>>
>>> For example, if I start to convert some of your requirements above, then
>>> memory_region_is_ram_device() implies RAM_PREALLOC. Actually, ram_device
>>> is not the only RAM_PREALLOC user.. Say, would it also not work with all
>>> memory_region_init_ram_ptr() users (even if they're not ram_device)? An
>>> example is, looks like virtio-gpu can create random ramblocks on the fly
>>> with prealloced buffers. I am not sure whether they can be pinned by VFIO
>>> too. You may know better.
>>
>> That memory is not visible to the guest. It is not part of system_memory,
>> and is not marked migratable.
>
> I _think_ that can still be visible at least for the virtio-gpu use case,
> which hangs under VirtIOGPUBase.hostmem. Relevant code for reference:
>
> virtio_gpu_virgl_map_resource_blob:
> memory_region_init_ram_ptr(mr, OBJECT(mr), "blob", size, data);
> memory_region_add_subregion(&b->hostmem, offset, mr);
> memory_region_set_enabled(mr, true);
>
> virtio_gpu_pci_base_realize:
> memory_region_init(&g->hostmem, OBJECT(g), "virtio-gpu-hostmem",
> g->conf.hostmem);
> pci_register_bar(&vpci_dev->pci_dev, 4,
> PCI_BASE_ADDRESS_SPACE_MEMORY |
> PCI_BASE_ADDRESS_MEM_PREFETCH |
> PCI_BASE_ADDRESS_MEM_TYPE_64,
> &g->hostmem);
>
> pci_update_mappings:
> memory_region_add_subregion_overlap(r->address_space,
> r->addr, r->memory, 1);
>
> but indeed I'm not sure how it work with migration so far, because it
> doesn't have RAM_MIGRATABLE set. So at least cpr didn't make it more
> special. I assume this isn't something we must figure out as of now
> then.. but if you do, please kindly share.
AFAICT this memory cannot be pinned, because it is not attached to the
"system_memory" mr and the "address_space_memory" address space. The
listener than maps/pins is attached to address_space_memory.
I am submitting V2 of this patch with more embedded comments.
- Steve
next prev parent reply other threads:[~2025-02-14 20:13 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-01-17 17:46 [PATCH] migration: ram block cpr blockers Steve Sistare
2025-01-17 18:16 ` Peter Xu
2025-01-17 19:10 ` Steven Sistare
2025-01-17 23:57 ` Peter Xu
2025-01-29 18:20 ` Steven Sistare
2025-01-30 17:01 ` Peter Xu
2025-02-14 20:12 ` Steven Sistare [this message]
2025-02-18 16:10 ` Peter Xu
2025-02-25 15:46 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f1f1c640-367d-4966-93f5-a98e3b764e28@oracle.com \
--to=steven.sistare@oracle.com \
--cc=david@redhat.com \
--cc=farosas@suse.de \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=philmd@linaro.org \
--cc=qemu-devel@nongnu.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).