From: Juraj Marcin <jmarcin@redhat.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org,
"Maciej S . Szmigiero" <mail@maciej.szmigiero.name>,
"Daniel P . Berrangé" <berrange@redhat.com>,
"Zhiyi Guo" <zhguo@redhat.com>,
"Prasad Pandit" <ppandit@redhat.com>,
"Avihai Horon" <avihaih@nvidia.com>,
"Kirti Wankhede" <kwankhede@nvidia.com>,
"Cédric Le Goater" <clg@redhat.com>,
"Fabiano Rosas" <farosas@suse.de>,
"Joao Martins" <joao.m.martins@oracle.com>,
"Markus Armbruster" <armbru@redhat.com>,
"Alex Williamson" <alex@shazbot.org>
Subject: Re: [PATCH 08/14] migration: Make qemu_savevm_query_pending() available anytime
Date: Thu, 9 Apr 2026 19:15:28 +0200
Message-ID: <adfegWEjLcGmDTPi@fedora>
In-Reply-To: <20260408165559.157108-9-peterx@redhat.com>

On 2026-04-08 12:55, Peter Xu wrote:
> Once qemu_savevm_query_pending() is exposed to more code paths, it can be
> used at a very early stage of migration, which may expose race conditions
> that we did not have before. This patch prepares it for such use cases so
> that the API is safe to use almost anytime.
>
> What matters here is that querying pending data for each module normally
> depends on save_setup() having run first; otherwise modules may not be
> ready for the query request.
>
> Consider an early cancellation of migration after the SETUP status but
> before the save_setup() hooks are invoked: source QEMU may go into the
> CANCELLING state directly from SETUP (not from ACTIVE, which is the normal
> case), in which case save_setup() may not have been invoked and modules
> are not ready. However, qemu_savevm_query_pending() may still be used by
> QMP commands like query-migrate, causing crashes.
>
> Guard against this by introducing a boolean that reflects the availability
> of vmstate save handlers after the save_setup() hooks complete
> successfully. So far, only qemu_savevm_query_pending() is protected by
> it. Logically, other hooks face a similar concern, but most of them
> should not be reachable from arbitrary code paths outside the migration
> thread, so it should be fine.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/migration.h |  8 ++++++++
>  migration/savevm.h    |  2 +-
>  migration/migration.c |  2 +-
>  migration/savevm.c    | 37 +++++++++++++++++++++++++++++++++----
>  4 files changed, 43 insertions(+), 6 deletions(-)
>
> diff --git a/migration/migration.h b/migration/migration.h
> index b6888daced..e504df6915 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -522,6 +522,14 @@ struct MigrationState {
>       * anything as input.
>       */
>      bool has_block_bitmap_mapping;
> +
> +    /*
> +     * This boolean reflects if the vmstate handlers have been properly
> +     * setup on source side. It is set after vmstate save_setup() hooks
> +     * are successfully invoked, and cleared after save_cleanup()s. It
> +     * reflects a general availability of vmstate hooks on the source side.
> +     */
> +    bool save_setup_ready;
>  };
>
>  void migrate_set_state(MigrationStatus *state, MigrationStatus old_state,
> diff --git a/migration/savevm.h b/migration/savevm.h
> index 96fdf96d4e..04ed09cec2 100644
> --- a/migration/savevm.h
> +++ b/migration/savevm.h
> @@ -42,7 +42,7 @@ int qemu_savevm_state_resume_prepare(MigrationState *s);
>  void qemu_savevm_send_header(QEMUFile *f);
>  void qemu_savevm_state_header(QEMUFile *f);
>  int qemu_savevm_state_iterate(QEMUFile *f, bool postcopy);
> -void qemu_savevm_state_cleanup(void);
> +void qemu_savevm_state_cleanup(MigrationState *s);
>  void qemu_savevm_state_complete_postcopy(QEMUFile *f);
>  int qemu_savevm_state_complete_precopy(MigrationState *s);
>  void qemu_savevm_query_pending(MigPendingData *pending, bool exact);
> diff --git a/migration/migration.c b/migration/migration.c
> index bb17bd0e68..a9ee3360e1 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1283,7 +1283,7 @@ static void migration_cleanup(MigrationState *s)
>      g_free(s->hostname);
>      s->hostname = NULL;
>
> -    qemu_savevm_state_cleanup();
> +    qemu_savevm_state_cleanup(s);
>      cpr_state_close();
>      cpr_transfer_source_destroy(s);
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index b75c311a95..1d3fce45b9 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -1387,7 +1387,8 @@ int qemu_savevm_state_non_iterable_early(QEMUFile *f,
>      return 0;
>  }
>
> -static int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
> +static int qemu_savevm_state_setup(MigrationState *s, QEMUFile *f,
> +                                   Error **errp)
>  {
>      SaveStateEntry *se;
>      int ret;
> @@ -1409,6 +1410,13 @@ static int qemu_savevm_state_setup(QEMUFile *f, Error **errp)
>          }
>      }
>
> +    /*
> +     * Logically, it should be paired with any hook being used who needs to
> +     * load_acquire() the flag first. So far, only save_query_pending()
> +     * uses it.
> +     */
> +    qatomic_store_release(&s->save_setup_ready, true);

What other savevm functions would benefit from this? Would it make sense
to include them in this patch/series?

> +
>      return 0;
>  }
>
> @@ -1429,7 +1437,7 @@ int qemu_savevm_state_do_setup(QEMUFile *f, Error **errp)
>          return ret;
>      }
>
> -    ret = qemu_savevm_state_setup(f, errp);
> +    ret = qemu_savevm_state_setup(ms, f, errp);
>      if (ret) {
>          return ret;
>      }
> @@ -1764,10 +1772,23 @@ int qemu_savevm_state_complete_precopy(MigrationState *s)
>
>  void qemu_savevm_query_pending(MigPendingData *pending, bool exact)
>  {
> +    MigrationState *s = migrate_get_current();
>      SaveStateEntry *se;
>
>      memset(pending, 0, sizeof(*pending));
>
> +    /*
> +     * This API can be invoked very early before SETUP is properly done, in
> +     * that case don't invoke module queries because they're not ready.
> +     * Just report all zeros.
> +     *
> +     * This is paired with save_setup_ready updates on save_setup() and
> +     * save_cleanup().
> +     */
> +    if (!s || !qatomic_load_acquire(&s->save_setup_ready)) {
> +        return;
> +    }
> +
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>          if (!se->ops || !se->ops->save_query_pending) {
>              continue;
> @@ -1786,7 +1807,7 @@ void qemu_savevm_query_pending(MigPendingData *pending, bool exact)
> pending->postcopy_bytes);
> }
>
> -void qemu_savevm_state_cleanup(void)
> +void qemu_savevm_state_cleanup(MigrationState *s)
>  {
>      SaveStateEntry *se;
>      Error *local_err = NULL;
> @@ -1795,6 +1816,14 @@ void qemu_savevm_state_cleanup(void)
>          error_report_err(local_err);
>      }
>
> +    s->save_setup_ready = false;
> +    /*
> +     * Make sure we clear the flag before invoking save_cleanup(), so any
> +     * racy QMP query-migrate won't try to invoke any save hooks. Just use
> +     * an explicit barrier to be simple.
> +     */
> +    smp_mb();
> +
>      trace_savevm_state_cleanup();
>      QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
>          if (se->ops && se->ops->save_cleanup) {
> @@ -1841,7 +1870,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp)
>          error_setg_errno(errp, -ret, "Error while writing VM state");
>      }
>  cleanup:
> -    qemu_savevm_state_cleanup();
> +    qemu_savevm_state_cleanup(ms);
>
>      if (ret != 0) {
>          status = MIGRATION_STATUS_FAILED;
> --
> 2.53.0
>
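
For reference, the flag pairing in the quoted patch can be sketched outside
QEMU roughly as below. This is only a sketch under stated assumptions: it
uses C11 <stdatomic.h> in place of QEMU's qatomic_store_release(),
qatomic_load_acquire() and smp_mb(), and DemoState, demo_setup(),
demo_query_pending() and demo_cleanup() are made-up names for illustration,
not anything in the tree.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the state that save_setup() prepares. */
typedef struct {
    atomic_bool ready;  /* plays the role of save_setup_ready */
    size_t pending;     /* only meaningful while ready is true */
} DemoState;

/* Setup side: initialise the data first, then publish it with release. */
static void demo_setup(DemoState *s, size_t pending)
{
    assert(s != NULL);
    s->pending = pending;
    atomic_store_explicit(&s->ready, true, memory_order_release);
}

/* Query side: report zero unless the acquire load sees the flag set. */
static size_t demo_query_pending(DemoState *s)
{
    if (!s || !atomic_load_explicit(&s->ready, memory_order_acquire)) {
        return 0;
    }
    return s->pending;
}

/*
 * Cleanup side: clear the flag before tearing the data down. The
 * sequentially consistent store is an assumption standing in for the
 * plain store followed by smp_mb() in the patch.
 */
static void demo_cleanup(DemoState *s)
{
    assert(s != NULL);
    atomic_store_explicit(&s->ready, false, memory_order_seq_cst);
    s->pending = 0;
}
```

In this sketch the flag only stops new queries from starting; a query that
has already passed the check is not waited for by demo_cleanup().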