From: Markus Armbruster <armbru@redhat.com>
To: Jonah Palmer <jonah.palmer@oracle.com>
Cc: qemu-devel@nongnu.org, peterx@redhat.com, farosas@suse.de,
eblake@redhat.com, jasowang@redhat.com, mst@redhat.com,
si-wei.liu@oracle.com, eperezma@redhat.com,
boris.ostrovsky@oracle.com
Subject: Re: [RFC 1/6] migration: Add virtio-iterative capability
Date: Tue, 26 Aug 2025 08:11:06 +0200
Message-ID: <87ecsypq85.fsf@pond.sub.org>
In-Reply-To: <2764b188-a4cd-40b8-95a7-ccec775d7db9@oracle.com> (Jonah Palmer's message of "Mon, 25 Aug 2025 10:57:47 -0400")
Jonah Palmer <jonah.palmer@oracle.com> writes:
> On 8/25/25 8:44 AM, Markus Armbruster wrote:
[...]
>> Jonah Palmer <jonah.palmer@oracle.com> writes:
>>
>>> On 8/8/25 6:48 AM, Markus Armbruster wrote:
[...]
>>>> Jonah Palmer <jonah.palmer@oracle.com> writes:
>>>>> Adds a new migration capability 'virtio-iterative' that will allow
>>>>> virtio devices, where supported, to iteratively migrate configuration
>>>>> changes that occur during the migration process.
>>>>
>>>> Why is that desirable?
>>>
>>> To be frank, I wasn't sure if having a migration capability, or even
>>> have it toggleable at all, would be desirable or not. It appears though
>>> that this might be better off as a per-device feature set via
>>> --device virtio-net-pci,iterative-mig=on,..., for example.
>>
>> See below.
>>
>>> And by "iteratively migrate configuration changes" I meant more along
>>> the lines of the device's state as it continues running on the source.
>>
>> Isn't that what migration does always?
>
> Essentially yes, but today all of the state is only migrated at the end, once the source has been paused. So the final correct state is always sent to the destination.
As far as I understand (and ignoring lots of detail, including post
copy), we have three stages:

1. Source runs, migrate memory pages. Pages that get dirtied after they
   are migrated need to be migrated again.

2. Neither source nor destination runs, migrate remaining memory pages
   and device state.

3. Destination starts to run.

If the duration of stage 2 (downtime) was of no concern, we'd switch to
it immediately, i.e. without migrating anything in stage 1. This would
minimize I/O.

Of course, we actually care about limiting downtime. We switch to stage 2
when "little enough" is left for stage 2 to migrate.
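To make the three stages concrete, here is a toy model in Python. It is
only a sketch of my understanding, not how QEMU implements it: the
function name, its parameters, and the page-count "downtime budget" are
all made up for illustration.

```python
import random


def precopy_migrate(num_pages, dirty_rate, pages_per_round,
                    downtime_budget, max_rounds=1000, seed=0):
    """Toy model of the three stages (hypothetical, not QEMU code).

    Returns (pages sent while the source runs,
             pages sent while the source is paused).
    """
    rng = random.Random(seed)
    pending = set(range(num_pages))  # pages the destination lacks
    sent_live = 0
    rounds = 0

    # Stage 1: source runs.  Keep sending until "little enough" is left.
    # max_rounds guards against a guest that dirties pages too fast.
    while len(pending) > downtime_budget and rounds < max_rounds:
        batch = list(pending)[:pages_per_round]
        pending.difference_update(batch)
        sent_live += len(batch)
        # The running guest dirties pages; those must be sent again.
        for _ in range(dirty_rate):
            pending.add(rng.randrange(num_pages))
        rounds += 1

    # Stage 2: neither side runs.  Send what is left (device state
    # would go here, too).
    sent_paused = len(pending)
    pending.clear()

    # Stage 3: destination starts to run.
    return sent_live, sent_paused
```

With downtime_budget equal to num_pages, the loop never runs and
everything is sent in stage 2, which is the "downtime is of no concern"
case. A smaller budget trades extra stage 1 I/O for a shorter stage 2.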
> If we're no longer waiting until the source has been paused and the initial state is sent early, then we need to make sure that any changes that happen are still communicated to the destination.
So you're proposing to treat suitable parts of the device state more
like memory pages. Correct?
Cover letter and commit message of PATCH 4 provide the motivation: you
observe a shorter downtime. You speculate this is due to moving "heavy
allocations and page-fault latencies" from stage 2 to stage 1. Correct?
Is there anything that makes virtio-net particularly suitable?
I think this patch's commit message should at least hint at the
motivation at a high level. Details like measurements are best left to
PATCH 4.
> This RFC handles this by just re-sending the entire state again once the source has been paused. But of course this isn't optimal and I'm looking into how to better optimize this part.
How much is the entire state?
>>> But perhaps actual configuration changes (e.g. changing the number of
>>> queue pairs) could also be supported mid-migration like this?
>>
>> I don't know.
>>
>>>>> This capability is added to the validated capabilities list to ensure
>>>>> both the source and destination support it before enabling.
>>>>
>>>> What happens when only one side enables it?
>>>
>>> The migration stream breaks if only one side enables it.
>>
>> How does it break? Error message pointing out the misconfiguration?
>>
>
> The destination VM is torn down and the source just reports that migration failed.
Exact same failure as for other misconfigurations, like missing a device
on the destination?
> I don't believe the source/destination could be aware of the misconfiguration. IIUC the destination reads the migration stream and expects certain pieces of data in a certain order. If new data is added to the migration stream or the order has changed and the destination isn't expecting it, then the migration fails. It doesn't know exactly why, just that it read-in data that it wasn't expecting.
>
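If I read the above correctly, the failure mode resembles the toy
reader below. This is a deliberately simplified sketch, not QEMU's
actual stream format or loader; the section names and the handler table
are hypothetical.

```python
def load_stream(sections, handlers):
    """Toy destination loop (hypothetical, not QEMU code).

    sections: iterable of (name, payload) pairs in stream order.
    handlers: dict mapping section names the destination knows about
              to callables that consume the payload.
    """
    for name, payload in sections:
        if name not in handlers:
            # The reader only knows it got something it did not
            # expect; it cannot tell which knob was misconfigured.
            raise ValueError(f"unexpected section {name!r}")
        handlers[name](payload)


# Destination without the new feature knows only these sections.
loaded = []
handlers = {"ram": loaded.append, "virtio-net": loaded.append}

# Source with the feature enabled emits an extra, unknown section.
stream = [("ram", b"pages"), ("virtio-net-iterative", b"early state")]

try:
    load_stream(stream, handlers)
    failed = False
except ValueError:
    failed = True
```

The error names the section that surprised the reader, but nothing in
it points back at the capability or property that differs between the
two ends.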
>>> This is poor wording on my part, my apologies. I don't think it's even
>>> possible to know the capabilities between the source & destination.
>>>
>>>>> The capability defaults to off to maintain backward compatibility.
>>>>>
>>>>> To enable the capability via HMP:
>>>>> (qemu) migrate_set_capability virtio-iterative on
>>>>>
>>>>> To enable the capability via QMP:
>>>>> {"execute": "migrate-set-capabilities", "arguments": {
>>>>> "capabilities": [
>>>>> { "capability": "virtio-iterative", "state": true }
>>>>> ]
>>>>> }
>>>>> }
>>>>>
>>>>> Signed-off-by: Jonah Palmer <jonah.palmer@oracle.com>
[...]
>>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>>> index 4963f6ca12..8f042c3ba5 100644
>>>>> --- a/qapi/migration.json
>>>>> +++ b/qapi/migration.json
>>>>> @@ -479,6 +479,11 @@
>>>>> # each RAM page. Requires a migration URI that supports seeking,
>>>>> # such as a file. (since 9.0)
>>>>> #
>>>>> +# @virtio-iterative: Enable iterative migration for virtio devices, if
>>>>> +# the device supports it. When enabled, and where supported, virtio
>>>>> +# devices will track and migrate configuration changes that may
>>>>> +# occur during the migration process. (Since 10.1)
>>>>
>>>> When and why should the user enable this?
>>>
>>> Well if all goes according to plan, always (at least for virtio-net).
>>> This should improve the overall speed of live migration for a virtio-net
>>> device (and vhost-net/vhost-vdpa).
>>
>> So the only use for "disabled" would be when migrating to or from an
>> older version of QEMU that doesn't support this. Fair?
>
> Correct.
>
>> What's the default?
>
> Disabled.
Awkward for something that should always be enabled. But see below.
Please document defaults in the doc comment.
>>>> What exactly do you mean by "where supported"?
>>>
>>> I meant if both source's Qemu and destination's Qemu support it, as well
>>> as for other virtio devices in the future if they decide to implement
>>> iterative migration (e.g. a more general "enable iterative migration for
>>> virtio devices").
>>>
>>> But I think for now this is better left as a virtio-net configuration
>>> rather than as a migration capability (e.g. --device
>>> virtio-net-pci,iterative-mig=on/off,...)
>>
>> Makes sense to me (but I'm not a migration expert).
A device property's default can depend on the machine type via compat
properties. This is normally used to restrict a guest-visible change to
newer machine types. Here, it's not guest-visible. But it can get you
this:
* Migrate new machine type from new QEMU to new QEMU (old QEMU doesn't
  have the machine type): iterative is enabled by default. Good. User
  can disable it on both ends to not get the improvement. Enabling it
  on just one breaks migration.

  All other cases go away with time.

* Migrate old machine type from new QEMU to new QEMU: iterative is
  disabled by default, which is sad, but no worse than before. User can
  enable it on both ends to get the improvement. Enabling it on just
  one breaks migration.

* Migrate old machine type from new QEMU to old QEMU or vice versa:
  iterative is off by default. Good. Enabling it on the new one breaks
  migration.

* Migrate old machine type from old QEMU to old QEMU: iterative is off.

I figure almost all users could simply ignore this configuration knob
then.
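The cases above can be tabulated with a small Python sketch. It assumes
the default is "on" only for a new machine type running on a new QEMU,
and that migration works only when both ends agree, as discussed
earlier in this thread. The function names are made up; this models the
idea, not the compat property machinery itself.

```python
def default_iterative(machine_is_new, qemu_is_new):
    """Hypothetical default: on only for new machine type on new QEMU."""
    return machine_is_new and qemu_is_new


def migration_ok(src_iterative, dst_iterative):
    """Per this thread, the stream breaks unless both ends agree."""
    return src_iterative == dst_iterative


# (label, machine type is new, source QEMU is new, destination is new)
CASES = [
    ("new machine, new -> new", True, True, True),
    ("old machine, new -> new", False, True, True),
    ("old machine, new -> old", False, True, False),
    ("old machine, old -> new", False, False, True),
    ("old machine, old -> old", False, False, False),
]


def check_defaults():
    """Return {label: (iterative on at source, migration works)}."""
    results = {}
    for label, machine_new, src_new, dst_new in CASES:
        src = default_iterative(machine_new, src_new)
        dst = default_iterative(machine_new, dst_new)
        results[label] = (src, migration_ok(src, dst))
    return results
```

With untouched defaults, every case migrates, and only the first one
gets the improvement. A user has to act only to opt in on an old
machine type, or to opt out on a new one, on both ends either way.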
>> [...]