From: Steven Sistare <steven.sistare@oracle.com>
To: Peter Xu <peterx@redhat.com>, Igor Mammedov <imammedo@redhat.com>,
"Michael S. Tsirkin" <mst@redhat.com>
Cc: qemu-devel@nongnu.org, Fabiano Rosas <farosas@suse.de>,
David Hildenbrand <david@redhat.com>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Eduardo Habkost <eduardo@habkost.net>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>,
Markus Armbruster <armbru@redhat.com>
Subject: Re: [PATCH V2 05/13] physmem: preserve ram blocks for cpr
Date: Tue, 8 Oct 2024 11:17:46 -0400
Message-ID: <025423a6-8cf8-4300-91f2-13be32ec2c5c@oracle.com>
In-Reply-To: <ZwQMRlSSqP0i0ITb@x1n>
On 10/7/2024 12:28 PM, Peter Xu wrote:
> On Mon, Oct 07, 2024 at 11:49:25AM -0400, Peter Xu wrote:
>> On Mon, Sep 30, 2024 at 12:40:36PM -0700, Steve Sistare wrote:
>>> Save the memfd for anonymous ramblocks in CPR state, along with a name
>>> that uniquely identifies it. The block's idstr is not yet set, so it
>>> cannot be used for this purpose. Find the saved memfd in new QEMU when
>>> creating a block. QEMU hard-codes the length of some internally-created
>>> blocks, so to guard against that length changing, use lseek to get the
>>> actual length of an incoming memfd.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>> system/physmem.c | 25 ++++++++++++++++++++++++-
>>> 1 file changed, 24 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/system/physmem.c b/system/physmem.c
>>> index 174f7e0..ddbeec9 100644
>>> --- a/system/physmem.c
>>> +++ b/system/physmem.c
>>> @@ -72,6 +72,7 @@
>>>
>>> #include "qapi/qapi-types-migration.h"
>>> #include "migration/options.h"
>>> +#include "migration/cpr.h"
>>> #include "migration/vmstate.h"
>>>
>>> #include "qemu/range.h"
>>> @@ -1663,6 +1664,19 @@ void qemu_ram_unset_idstr(RAMBlock *block)
>>>      }
>>>  }
>>>
>>> +static char *cpr_name(RAMBlock *block)
>>> +{
>>> +    MemoryRegion *mr = block->mr;
>>> +    const char *mr_name = memory_region_name(mr);
>>> +    g_autofree char *id = mr->dev ? qdev_get_dev_path(mr->dev) : NULL;
>>> +
>>> +    if (id) {
>>> +        return g_strdup_printf("%s/%s", id, mr_name);
>>> +    } else {
>>> +        return g_strdup(mr_name);
>>> +    }
>>> +}
>>> +
>>>  size_t qemu_ram_pagesize(RAMBlock *rb)
>>>  {
>>>      return rb->page_size;
>>> @@ -1858,14 +1872,18 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>>> TYPE_MEMORY_BACKEND)) {
>>> size_t max_length = new_block->max_length;
>>> MemoryRegion *mr = new_block->mr;
>>> - const char *name = memory_region_name(mr);
>>> + g_autofree char *name = cpr_name(new_block);
>>>
>>> new_block->mr->align = QEMU_VMALLOC_ALIGN;
>>> new_block->flags |= RAM_SHARED;
>>> + new_block->fd = cpr_find_fd(name, 0);
>>>
>>> if (new_block->fd == -1) {
>>> new_block->fd = qemu_memfd_create(name, max_length + mr->align,
>>> 0, 0, 0, errp);
>>> + cpr_save_fd(name, 0, new_block->fd);
>>> + } else {
>>> + new_block->max_length = lseek(new_block->fd, 0, SEEK_END);
>>
>> So this can overwrite the max_length that the caller specified..
>>
>> I remember we used to have some tricks on specifying different max_length
>> for ROMs on dest QEMU (on which, qemu firmwares also upgraded on the dest
>> host so the size can be bigger than src qemu's old ramblocks), so that the
>> MR is always large enough to reload even the new firmwares, while migration
>> only migrates the smaller size (used_length) so it's fine as we keep the
>> extra sizes empty. I think that can be relevant to the qemu_ram_resize() call
>> of parse_ramblock().

Yes, the resizable RAM block for the firmware blob is the only case I know of
where the length has changed in the past. If a length changes in the future, we
will need to detect and accommodate that change here, and I believe the fix will
be to simply use the actual length, as in the code above. But if you prefer,
for now I can check for a length change and return an error. New QEMU will then
fail to start, and old QEMU will recover.
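
To make the "check and return an error" alternative concrete, here is a rough
standalone sketch. It is not QEMU code: fd_length() and
check_preserved_length() are hypothetical names, plain POSIX lseek() stands in
for whatever the final patch would use, and it assumes the expected length is
the max_length that the caller requested.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: actual length of the file behind fd, or -1 on error. */
static int64_t fd_length(int fd)
{
    off_t len;

    assert(fd >= 0);
    len = lseek(fd, 0, SEEK_END);
    return len < 0 ? -1 : (int64_t)len;
}

/*
 * Hypothetical check: return 0 if the preserved fd has the length the caller
 * expects, else report the mismatch on stderr and return -1.
 */
static int check_preserved_length(int fd, uint64_t expected)
{
    int64_t actual = fd_length(fd);

    if (actual < 0) {
        perror("lseek");
        return -1;
    }
    if ((uint64_t)actual != expected) {
        fprintf(stderr, "preserved block length 0x%" PRIx64
                " does not match expected length 0x%" PRIx64 "\n",
                (uint64_t)actual, expected);
        return -1;
    }
    return 0;
}
```

In ram_block_add, a failure from such a check would presumably be reported
through errp instead of stderr.
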
>> The reload will not happen until some point, perhaps system resets. I
>> wonder whether that is an issue in this case.

Firmware is only generated once, via this path on x86:

  qmp_x_exit_preconfig
    qemu_machine_creation_done
      qdev_machine_creation_done
        pc_machine_done
          acpi_setup
            acpi_add_rom_blob
              rom_add_blob
                rom_set_mr

After a system reset, the ramblock contents from memory are used as-is.
> PS: If this is needed by CPR-transfer only because mmap() later can fail
> due to a bigger max_length,
That is the reason. IMO adjusting max_length is more robust than fiddling
with truncate and pretending that max_length is larger, when qemu will never
be able to use the phantom space up to max_length.
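
For comparison, the two approaches look roughly like this when reduced to plain
POSIX calls. This is only an illustration, not QEMU code: both helper names are
made up, and ftruncate() here is my stand-in for what truncate=true would
amount to.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical option A: keep the requested max_length and grow the
 * preserved file to match it. The added tail is never used by the guest.
 */
static int grow_to_max_length(int fd, uint64_t max_length)
{
    return ftruncate(fd, (off_t)max_length);
}

/*
 * Hypothetical option B (what the patch does): leave the file alone and
 * set max_length to the file's actual length.
 */
static int adopt_actual_length(int fd, uint64_t *max_length)
{
    off_t len;

    assert(max_length != NULL);
    len = lseek(fd, 0, SEEK_END);
    if (len < 0) {
        return -1;
    }
    *max_length = (uint64_t)len;
    return 0;
}
```

With option B the block's max_length always describes memory that really
exists behind the fd.
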
- Steve
> I wonder whether it can be fixed by passing
> truncate=true in the upcoming file_ram_alloc(), rather than overwriting
> the max_length value itself.
>
>>
>>> }
>>>
>>> if (new_block->fd >= 0) {
>>> @@ -1875,6 +1893,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
>>> false, 0, errp);
>>> }
>>> if (!new_block->host) {
>>> + cpr_delete_fd(name, 0);
>>> qemu_mutex_unlock_ramlist();
>>> return;
>>> }
>>> @@ -2182,6 +2201,8 @@ static void reclaim_ramblock(RAMBlock *block)
>>>
>>>  void qemu_ram_free(RAMBlock *block)
>>>  {
>>> +    g_autofree char *name = NULL;
>>> +
>>>      if (!block) {
>>>          return;
>>>      }
>>> @@ -2192,6 +2213,8 @@ void qemu_ram_free(RAMBlock *block)
>>>      }
>>>
>>>      qemu_mutex_lock_ramlist();
>>> +    name = cpr_name(block);
>>> +    cpr_delete_fd(name, 0);
>>>      QLIST_REMOVE_RCU(block, next);
>>>      ram_list.mru_block = NULL;
>>>      /* Write list before version */
>>> --
>>> 1.8.3.1
>>>
>>
>> --
>> Peter Xu
>