From: Juraj Marcin <jmarcin@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Zhiyi Guo" <zhguo@redhat.com>,
"Prasad Pandit" <ppandit@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
"Joao Martins" <joao.m.martins@oracle.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Alex Williamson" <alex@shazbot.org>
Subject: Re: [PATCH 06/14] migration: Introduce stopcopy_bytes in save_query_pending()
Date: Thu, 9 Apr 2026 19:36:51 +0200
Message-ID: <adfhtgEeWtsMxEmv@fedora>
In-Reply-To: <20260408165559.157108-7-peterx@redhat.com>
Hi Peter,

Actually, I do have one question; see inline.
On 2026-04-08 12:55, Peter Xu wrote:
> Allow modules to report data that can only be migrated after the VM is
> stopped.
>
> When this concept is introduced, we will need to account for the stopcopy
> size as part of pending_size, as before.
>
> However, when there is data that can only be migrated in the stopcopy
> phase, the old "pending_size" may not always be able to drop low enough to
> kick off a slow version of query sync.
>
> It used to be almost guaranteed to happen, as none of the prior iterative
> modules have stopcopy-only data. VFIO may change that fact by having some
> data that must be copied during the stop phase.
>
> So we need to make sure QEMU will kick off a synchronized version of query
> pending when all precopy data is migrated. This might be important for VFIO
> to keep making progress even if the downtime cannot yet be satisfied.
>
> So far, this patch should introduce no functional change, as no module yet
> reports a stopcopy size.
>
> This paves the way for VFIO to properly report its pending data sizes,
> which will start to include stop-only data.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> include/migration/register.h | 7 +++++
> migration/migration.c | 52 ++++++++++++++++++++++++++++++------
> migration/savevm.c | 7 +++--
> migration/trace-events | 2 +-
> 4 files changed, 57 insertions(+), 11 deletions(-)
>
> diff --git a/include/migration/register.h b/include/migration/register.h
> index aba3c9af2f..e822a2a59f 100644
> --- a/include/migration/register.h
> +++ b/include/migration/register.h
> @@ -21,6 +21,13 @@ typedef struct MigPendingData {
> uint64_t precopy_bytes;
> /* Amount of pending bytes can be transferred in postcopy */
> uint64_t postcopy_bytes;
> + /* Amount of pending bytes can be transferred only in stopcopy */
> + uint64_t stopcopy_bytes;
> + /*
> + * Total pending data, modules do not need to update this field, it
> + * will be automatically calculated by migration core API.
> + */
> + uint64_t total_bytes;
> } MigPendingData;
>
> /**
> diff --git a/migration/migration.c b/migration/migration.c
> index 68cfe2d3bf..bb17bd0e68 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3198,6 +3198,44 @@ typedef enum {
> MIG_ITERATE_BREAK, /* Break the loop */
> } MigIterateState;
>
> +/* Are we ready to move to the next iteration phase? */
> +static bool migration_iteration_next_ready(MigrationState *s,
> + MigPendingData *pending)
> +{
> + /*
> + * If the estimated values already suggest us to switchover, mark this
> + * iteration finished, time to do a slow sync.
> + */
> + if (pending->total_bytes <= s->threshold_size) {
> + return true;
> + }
> +
> + /*
> + * Since we may have modules reporting stop-only data, we also want to
> + * re-query with slow mode if all precopy data is moved over. This
> + * will also mark the current iteration done.
> + *
> + * This could happen when e.g. a module (like, VFIO) reports stopcopy
> + * size too large so it will never yet satisfy the downtime with the
> + * current setup (above check). Here, slow version of re-query helps
> + * because we keep trying the best to move whatever we have.
> + */
> + if (pending->precopy_bytes == 0) {
> + return true;
> + }
> +
> + return false;
> +}
> +
> +static void migration_iteration_go_next(MigPendingData *pending)
> +{
> + /*
> + * Do a slow sync will achieve this. TODO: move RAM iteration code
> + * into the core layer.
> + */
> + qemu_savevm_query_pending(pending, true);
> +}
> +
> /*
> * Return true if continue to the next iteration directly, false
> * otherwise.
> @@ -3209,12 +3247,10 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
> bool can_switchover = migration_can_switchover(s);
> MigPendingData pending = { };
> - uint64_t pending_size;
> bool complete_ready;
>
> /* Fast path - get the estimated amount of pending data */
> qemu_savevm_query_pending(&pending, false);
> - pending_size = pending.precopy_bytes + pending.postcopy_bytes;
>
> if (in_postcopy) {
> /*
> @@ -3222,7 +3258,7 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> * postcopy completion doesn't rely on can_switchover, because when
> * POSTCOPY_ACTIVE it means switchover already happened.
> */
> - complete_ready = !pending_size;
> + complete_ready = !pending.total_bytes;
> if (s->state == MIGRATION_STATUS_POSTCOPY_DEVICE &&
> (s->postcopy_package_loaded || complete_ready)) {
> /*
> @@ -3242,9 +3278,8 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> * postcopy started, so ESTIMATE should always match with EXACT
> * during postcopy phase.
> */
> - if (pending_size <= s->threshold_size) {
> - qemu_savevm_query_pending(&pending, true);
> - pending_size = pending.precopy_bytes + pending.postcopy_bytes;
> + if (migration_iteration_next_ready(s, &pending)) {
> + migration_iteration_go_next(&pending);
> }
>
> /* Should we switch to postcopy now? */
> @@ -3264,11 +3299,12 @@ static MigIterateState migration_iteration_run(MigrationState *s)
> * (2) Pending size is no more than the threshold specified
> * (which was calculated from expected downtime)
> */
> - complete_ready = can_switchover && (pending_size <= s->threshold_size);
> + complete_ready = can_switchover &&
> + (pending.total_bytes <= s->threshold_size);
Shouldn't the condition that triggers postcopy migration also be updated?
As total_bytes is calculated as the sum of all three
(precopy_bytes + stopcopy_bytes + postcopy_bytes), this implies to me
that stopcopy_bytes is not a subset of precopy_bytes and would also need
to be migrated during switchover, before postcopy starts.
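
To make the concern concrete, here is a minimal standalone sketch (not
QEMU code). It assumes the postcopy trigger compares only precopy_bytes
against the threshold; that check is not part of the quoted hunk, so
treat it as an assumption. PendingSketch and both helper names are
hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for MigPendingData from the patch (illustration only). */
typedef struct {
    uint64_t precopy_bytes;
    uint64_t postcopy_bytes;
    uint64_t stopcopy_bytes;
} PendingSketch;

/*
 * ASSUMPTION: models a postcopy trigger that looks only at precopy_bytes.
 * The real condition is not shown in the quoted diff.
 */
static bool postcopy_ready_precopy_only(const PendingSketch *p,
                                        uint64_t threshold_size)
{
    assert(p);
    return p->precopy_bytes <= threshold_size;
}

/*
 * Hypothetical alternative: everything that must be sent while the VM is
 * stopped (remaining precopy data plus stop-only data) has to fit within
 * the threshold before switching over to postcopy.
 */
static bool postcopy_ready_with_stopcopy(const PendingSketch *p,
                                         uint64_t threshold_size)
{
    assert(p);
    return p->precopy_bytes + p->stopcopy_bytes <= threshold_size;
}
```

With precopy_bytes = 0, stopcopy_bytes = 512 MiB and a 64 MiB threshold,
the first helper reports "ready" while the second does not, which is the
case I am asking about.
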
Once this is resolved, my Reviewed-by tag is valid; apart from this, the
patch looks good to me.
Thanks!
> }
>
> if (complete_ready) {
> - trace_migration_thread_low_pending(pending_size);
> + trace_migration_thread_low_pending(pending.total_bytes);
> migration_completion(s);
> return MIG_ITERATE_BREAK;
> }
> diff --git a/migration/savevm.c b/migration/savevm.c
> index 397f602257..b75c311a95 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1766,8 +1766,7 @@ void qemu_savevm_query_pending(MigPendingData *pending, bool exact)
> {
> SaveStateEntry *se;
>
> - pending->precopy_bytes = 0;
> - pending->postcopy_bytes = 0;
> + memset(pending, 0, sizeof(*pending));
>
> QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> if (!se->ops || !se->ops->save_query_pending) {
> @@ -1779,7 +1778,11 @@ void qemu_savevm_query_pending(MigPendingData *pending, bool exact)
> se->ops->save_query_pending(se->opaque, pending, exact);
> }
>
> + pending->total_bytes = pending->precopy_bytes +
> + pending->stopcopy_bytes + pending->postcopy_bytes;
> +
> trace_qemu_savevm_query_pending(exact, pending->precopy_bytes,
> + pending->stopcopy_bytes,
> pending->postcopy_bytes);
> }
>
> diff --git a/migration/trace-events b/migration/trace-events
> index f8995b8d0d..2f86ad448e 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -7,7 +7,7 @@ qemu_loadvm_state_section_partend(uint32_t section_id) "%u"
> qemu_loadvm_state_post_main(int ret) "%d"
> qemu_loadvm_state_section_startfull(uint32_t section_id, const char *idstr, uint32_t instance_id, uint32_t version_id) "%u(%s) %u %u"
> qemu_savevm_send_packaged(void) ""
> -qemu_savevm_query_pending(bool exact, uint64_t precopy, uint64_t postcopy) "exact=%d, precopy=%"PRIu64", postcopy=%"PRIu64
> +qemu_savevm_query_pending(bool exact, uint64_t precopy, uint64_t stopcopy, uint64_t postcopy) "exact=%d, precopy=%"PRIu64", stopcopy=%"PRIu64", postcopy=%"PRIu64
> loadvm_state_switchover_ack_needed(unsigned int switchover_ack_pending_num) "Switchover ack pending num=%u"
> loadvm_state_setup(void) ""
> loadvm_state_cleanup(void) ""
> --
> 2.53.0
>