From: Markus Armbruster <armbru@redhat.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, Peter Xu <peterx@redhat.com>,
Fabiano Rosas <farosas@suse.de>,
David Hildenbrand <david@redhat.com>,
Igor Mammedov <imammedo@redhat.com>,
Eduardo Habkost <eduardo@habkost.net>,
Marcel Apfelbaum <marcel.apfelbaum@gmail.com>,
Philippe Mathieu-Daude <philmd@linaro.org>,
Paolo Bonzini <pbonzini@redhat.com>,
"Daniel P. Berrange" <berrange@redhat.com>
Subject: Re: [PATCH V1 20/26] migration: cpr-exec mode
Date: Fri, 03 May 2024 08:26:50 +0200
Message-ID: <87edaj2yxh.fsf@pond.sub.org>
In-Reply-To: <39b6e4b1-1910-4411-a3f0-d96214bcd6d6@oracle.com> (Steven Sistare's message of "Thu, 2 May 2024 12:00:23 -0400")
Steven Sistare <steven.sistare@oracle.com> writes:
> On 5/2/2024 8:23 AM, Markus Armbruster wrote:
>> Steve Sistare <steven.sistare@oracle.com> writes:
>>
>>> Add the cpr-exec migration mode. Usage:
>>> qemu-system-$arch -machine memfd-alloc=on ...
>>> migrate_set_parameter mode cpr-exec
>>> migrate_set_parameter cpr-exec-args \
>>> <arg1> <arg2> ... -incoming <uri>
>>> migrate -d <uri>
>>>
>>> The migrate command stops the VM, saves state to the URI,
>>> directly exec's a new version of QEMU on the same host,
>>> replacing the original process while retaining its PID, and
>>> loads state from the URI. Guest RAM is preserved in place,
>>> albeit with new virtual addresses.
>>>
>>> Arguments for the new QEMU process are taken from the
>>> @cpr-exec-args parameter. The first argument should be the
>>> path of a new QEMU binary, or a prefix command that exec's the
>>> new QEMU binary.
>>>
>>> Because old QEMU terminates when new QEMU starts, one cannot
>>> stream data between the two, so the URI must be a type, such as
>>> a file, that reads all data before old QEMU exits.
>>>
>>> Memory backend objects must have the share=on attribute, and
>>> must be mmap'able in the new QEMU process. For example,
>>> memory-backend-file is acceptable, but memory-backend-ram is
>>> not.
>>>
>>> The VM must be started with the '-machine memfd-alloc=on'
>>> option. This causes implicit ram blocks (those not explicitly
>>> described by a memory-backend object) to be allocated by
>>> mmap'ing a memfd. Examples include VGA, ROM, and even guest
>>> RAM when it is specified without a memory-backend object.
>>>
>>> The implementation saves precreate vmstate at the end of normal
>>> migration in migrate_fd_cleanup, and tells the main loop to call
>>> cpr_exec. Incoming qemu loads precreate state early, before objects
>>> are created. The memfds are kept open across exec by clearing the
>>> close-on-exec flag, their values are saved in precreate vmstate,
>>> and they are mmap'd in new qemu.
>>>
>>> Note that the memfd-alloc option is not related to memory-backend-memfd.
>>> Later patches add support for memory-backend-memfd, and for additional
>>> devices, including vfio, chardev, and more.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>
>> [...]
>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 49710e7..7c5f45f 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -665,9 +665,37 @@
>>> # or COLO.
>>> #
>>> # (since 8.2)
>>> +#
>>> +# @cpr-exec: The migrate command stops the VM, saves state to the URI,
What URI? I know you mean the migration URI, but will readers know?
Elsewhere, we use "migration URI".

Hmm. That's no good, either: we may not *have* a migration URI since
commit 074dbce5fcce (migration: New migrate and migrate-incoming
argument 'channels') and its fixup commit 57fd4b4e1075 made command
migrate argument @uri optional and mutually exclusive with @channels.

I think we better use more generic terminology here. Let's have a look
at migrate's documentation for inspiration:

    ##
    # @migrate:
    #
    # Migrates the current running guest to another Virtual Machine.
    #
    # @uri: the Uniform Resource Identifier of the destination VM
    #
    # @channels: list of migration stream channels with each stream in the
    # list connected to a destination interface endpoint.
    #
    [...]
    # Notes:
    [...]
    # 4. The uri argument should have the Uniform Resource Identifier
    # of default destination VM. This connection will be bound to
    # default network.
    #
    # 5. For now, number of migration streams is restricted to one,
    # i.e. number of items in 'channels' list is just 1.
    #
    # 6. The 'uri' and 'channels' arguments are mutually exclusive;
    # exactly one of the two should be present.

Perhaps "saves the state to the migration destination"?
>>> +# directly exec's a new version of QEMU on the same host,
>>> +# replacing the original process while retaining its PID, and
>>> +# loads state from the URI. Guest RAM is preserved in place,
"loads the state from the migration destination"?
We should also fix up existing uses of "migration URI": @mapped-ram,
@cpr-reboot, @tls-hostname. Not this series' job. I'll report it
separately.
>>> +# albeit with new virtual addresses.
>>
>> Do you mean the virtual addresses of guest RAM may differ between old and
>> new QEMU process?
>
> The VA at which a guest RAM segment is mapped in the QEMU process
> changes. The end user would not notice or care, so I'll drop that
> detail here.
>
>>> +#
>>> +# Arguments for the new QEMU process are taken from the
>>> +# @cpr-exec-args parameter. The first argument should be the
>>> +# path of a new QEMU binary, or a prefix command that exec's the
>>> +# new QEMU binary.
>>
>> What's a "prefix command"? A wrapper script, perhaps?
>
> A prefix command is any command of the form:
> command1 command1-args command2 command2-args
> where command1 performs some set up before exec'ing command2.
> However, I will drop the word "prefix", it adds no meaning here.
Maybe "the command to start the new QEMU process"?

Hmm. @cpr-exec-args is documented like this:

    # @cpr-exec-args: Arguments passed to new QEMU for @cpr-exec mode.
    # See @cpr-exec for details. (Since 9.1)

Is it a good idea to keep the details with @cpr-exec? Let me try not
to. Replace the "Arguments for the new QEMU process..." paragraph by

    # The new QEMU process is started according to migration parameter
    # @cpr-exec-args.

Then document cpr-exec-args like

    # @cpr-exec-args: Command to start the new QEMU process for MigMode
    # @cpr-exec. The first list element is the program's filename, the
    # remainder its arguments.

What do you think?

Naming the thing "-args" feels questionable. It's program and
arguments.

For what it's worth, QGA command guest-exec has them separate:

    # @path: path or executable name to execute
    #
    # @arg: argument list to pass to executable

The name @path is poorly chosen.

qmp_guest_exec() then prepends @path to @arg to make the argv[] for the
execve() wrapper it uses.
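
To make the two shapes concrete, here is a minimal sketch in Python. It is
not QEMU code: the helper names, the binary path, and the file URI are all
made up for illustration. It only shows that "program plus separate
argument list" and "one list whose first element is the program" end up as
the same argv[].

```python
from typing import List, Optional


def argv_from_path_and_args(path: str, args: Optional[List[str]] = None) -> List[str]:
    """guest-exec style (hypothetical helper): program and arguments arrive separately."""
    # The caller has to prepend the program to build argv[].
    return [path] + list(args or [])


def argv_from_command(command: List[str]) -> List[str]:
    """Single-list style (hypothetical helper): element 0 is the program."""
    if not command:
        raise ValueError("command must name a program")
    # The list already is argv[]; just copy it.
    return list(command)


# Hypothetical binary path and migration URI, for illustration only.
separate = argv_from_path_and_args(
    "/usr/bin/qemu-system-x86_64", ["-incoming", "file:/tmp/state"]
)
combined = argv_from_command(
    ["/usr/bin/qemu-system-x86_64", "-incoming", "file:/tmp/state"]
)
assert separate == combined
```

Either list could be handed to an execv()-style call unchanged, which is
why a single "command" parameter loses nothing over the split form.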
I figure you'd rather not have them separate, to keep migration
parameters simpler. Name it @cpr-exec-command?
>>> +#
>>> +# Because old QEMU terminates when new QEMU starts, one cannot
>>> +# stream data between the two, so the URI must be a type, such as
>>> +# a file, that reads all data before old QEMU exits.
>>
>> What happens when you specify a URI that doesn't?
>
> Old QEMU will quietly block indefinitely writing to the URI.
Worth spelling that out in the doc comment?
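
For readers wondering why it blocks, a rough illustration in Python (not
QEMU code, and assuming the channel behaves like a pipe or socket with a
bounded kernel buffer): the writer can only fill the buffer, and since the
reader does not exist until the writer has finished and exec'd, nothing
ever drains it. The sketch uses a non-blocking pipe so that it reports the
stall instead of hanging.

```python
import os

# A pipe stands in for a streaming channel whose reader never runs.
read_fd, write_fd = os.pipe()
os.set_blocking(write_fd, False)  # report the stall instead of hanging

written = 0
stalled = False
chunk = b"\0" * 4096
try:
    while True:
        written += os.write(write_fd, chunk)
except BlockingIOError:
    # The kernel buffer is full. A blocking writer would sleep here
    # until somebody reads, which in this scenario never happens.
    stalled = True

print(f"buffered {written} bytes, then the writer would block: {stalled}")
os.close(read_fd)
os.close(write_fd)
```

A file destination does not have this problem, because writes complete
without a concurrent reader.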
>>> +#
>>> +# Memory backend objects must have the share=on attribute, and
>>> +# must be mmap'able in the new QEMU process. For example,
>>> +# memory-backend-file is acceptable, but memory-backend-ram is
>>> +# not.
>>> +#
>>> +# The VM must be started with the '-machine memfd-alloc=on'
>>
>> What happens when you don't?
>
> If '-only-migratable-modes cpr-exec' is specified, then QEMU will fail
> to start, and print a clear error message.
>
> Otherwise, a blocker is registered and any attempt to cpr-exec will fail
> with a clear error message.
With clear errors, no further documentation is needed. Good :)
> - Steve
>
>>> +# option. This causes implicit ram blocks -- those not explicitly
>>> +# described by a memory-backend object -- to be allocated by
>>> +# mmap'ing a memfd. Examples include VGA, ROM, and even guest
>>> +# RAM when it is specified without a memory-backend object.
>>> +#
>>> +# (since 9.1)
>>> ##
>>> { 'enum': 'MigMode',
>>> - 'data': [ 'normal', 'cpr-reboot' ] }
>>> + 'data': [ 'normal', 'cpr-reboot', 'cpr-exec' ] }
>>>
>>> ##
>>> # @ZeroPageDetection:
>>
>> [...]
>>