qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Peter Xu <peterx@redhat.com>
To: Prasad Pandit <ppandit@redhat.com>
Cc: qemu-devel@nongnu.org, farosas@suse.de, berrange@redhat.com,
	Prasad Pandit <pjp@fedoraproject.org>
Subject: Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
Date: Thu, 13 Mar 2025 16:08:17 -0400	[thread overview]
Message-ID: <Z9M7MYUPqHFIQPuV@x1.local> (raw)
In-Reply-To: <CAE8KmOwdLk4oZg8TAt0z6rd27f0MpbSS54TWNDshZFU7WPxk-Q@mail.gmail.com>

On Thu, Mar 13, 2025 at 06:13:24PM +0530, Prasad Pandit wrote:
> +int qemu_savevm_state_postcopy_prepare(QEMUFile *f)
> +{
> +    int ret = 0;
> +    SaveStateEntry *se;
> +
> +    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
> +        if (strcmp(se->idstr, "ram")) {
> +            continue;
> +        }
> +
> +        save_section_header(f, se, QEMU_VM_SECTION_PART);
> +
> +        ram_save_zero_page(f, se->opaque);

I'll stop requesting a why here... but I think this is another example that
even if all the tests pass it may not be correct.

> +        if ((ret = multifd_ram_flush_and_sync(f)) < 0) {
> +            return ret;
> +        }
> +        qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +
> +        trace_savevm_section_end(se->idstr, se->section_id, ret);
> +        save_section_footer(f, se);
> +        if (ret < 0) {
> +            qemu_file_set_error(f, ret);
> +            return -1;
> +        }
> +    }
> +
> +    return ret;
> +}

[...]

> * Does this patch look okay?

I've written the relevant patches below, please review and take them if you
agree with the changes.

Thanks,

===8<===

From f9343dfc777ef04168443e86a1fa3922296ea563 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 13 Mar 2025 15:34:10 -0400
Subject: [PATCH 1/2] migration: Add save_postcopy_prepare() savevm handler

Add a savevm handler for a module to opt-in sending extra sections right
before postcopy starts, and before VM is stopped.

RAM will start to use this new savevm handler in the next patch to do flush
and sync for multifd pages.

Note that we choose to do it before VM stopped because the current only
potential user is not sensitive to VM status, so doing it before VM is
stopped is preferred to enlarge any postcopy downtime.

It is still a bit unfortunate that we need to introduce such a new savevm
handler just for the only use case, however it's so far the cleanest.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/migration/register.h | 15 +++++++++++++++
 migration/savevm.h           |  1 +
 migration/migration.c        |  4 ++++
 migration/savevm.c           | 33 +++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+)

diff --git a/include/migration/register.h b/include/migration/register.h
index c041ce32f2..b79dc81b8d 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -189,6 +189,21 @@ typedef struct SaveVMHandlers {
 
     /* This runs outside the BQL!  */
 
+    /**
+     * @save_postcopy_prepare
+     *
+     * This hook will be invoked on the source side right before switching
+     * to postcopy (before VM stopped).
+     *
+     * @f:      QEMUFile where to send the data
+     * @opaque: Data pointer passed to register_savevm_live()
+     * @errp:   Error** used to report error message
+     *
+     * Returns: true if succeeded, false if error occured.  When false is
+     * returned, @errp must be set.
+     */
+    bool (*save_postcopy_prepare)(QEMUFile *f, void *opaque, Error **errp);
+
     /**
      * @state_pending_estimate
      *
diff --git a/migration/savevm.h b/migration/savevm.h
index 138c39a7f9..2d5e9c7166 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -45,6 +45,7 @@ void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
 void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
                                         uint64_t *can_postcopy);
 int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy);
+bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
 int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
diff --git a/migration/migration.c b/migration/migration.c
index d46e776e24..212f6b4145 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2707,6 +2707,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         }
     }
 
+    if (!qemu_savevm_state_postcopy_prepare(ms->to_dst_file, errp)) {
+        return -1;
+    }
+
     trace_postcopy_start();
     bql_lock();
     trace_postcopy_start_set_run();
diff --git a/migration/savevm.c b/migration/savevm.c
index ce158c3512..23ef4c7dc9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1523,6 +1523,39 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
     qemu_fflush(f);
 }
 
+bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp)
+{
+    SaveStateEntry *se;
+    bool ret;
+
+    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (!se->ops || !se->ops->save_postcopy_prepare) {
+            continue;
+        }
+
+        if (se->ops->is_active) {
+            if (!se->ops->is_active(se->opaque)) {
+                continue;
+            }
+        }
+
+        trace_savevm_section_start(se->idstr, se->section_id);
+
+        save_section_header(f, se, QEMU_VM_SECTION_PART);
+        ret = se->ops->save_postcopy_prepare(f, se->opaque, errp);
+        save_section_footer(f, se);
+
+        trace_savevm_section_end(se->idstr, se->section_id, ret);
+
+        if (!ret) {
+            assert(*errp);
+            return false;
+        }
+    }
+
+    return true;
+}
+
 int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
 {
     int64_t start_ts_each, end_ts_each;
-- 
2.47.0


From 299e1cdd9b28802f361ed012673825685e30f965 Mon Sep 17 00:00:00 2001
From: Peter Xu <peterx@redhat.com>
Date: Thu, 13 Mar 2025 15:56:01 -0400
Subject: [PATCH 2/2] migration/ram: Implement save_postcopy_prepare()

Implement save_postcopy_prepare(), preparing for the enablement of both
multifd and postcopy.

Please see the rich comment for the rationals.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/ram.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

diff --git a/migration/ram.c b/migration/ram.c
index 424df6d9f1..119e7d3ac2 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4420,6 +4420,42 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
     return 0;
 }
 
+static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp)
+{
+    int ret;
+
+    if (migrate_multifd()) {
+        /*
+         * When multifd is enabled, source QEMU needs to make sure all the
+         * pages queued before postcopy starts to be flushed.
+         *
+         * Meanwhile, the load of these pages must happen before switching
+         * to postcopy.  It's because loading of guest pages (so far) in
+         * multifd recv threads is still non-atomic, so the load cannot
+         * happen with vCPUs running on destination side.
+         *
+         * This flush and sync will guarantee those pages loaded _before_
+         * postcopy starts on destination. The rational is, this happens
+         * before VM stops (and before source QEMU sends all the rest of
+         * the postcopy messages).  So when the destination QEMU received
+         * the postcopy messages, it must have received the sync message on
+         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
+         * RAM_SAVE_FLAG_EOS), and such message should have guaranteed all
+         * previous guest pages queued in the multifd channels to be
+         * completely loaded.
+         */
+        ret = multifd_ram_flush_and_sync(f);
+        if (ret < 0) {
+            error_setg(errp, "%s: multifd flush and sync failed", __func__);
+            return false;
+        }
+    }
+
+    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+
+    return true;
+}
+
 void postcopy_preempt_shutdown_file(MigrationState *s)
 {
     qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
@@ -4439,6 +4475,7 @@ static SaveVMHandlers savevm_ram_handlers = {
     .load_setup = ram_load_setup,
     .load_cleanup = ram_load_cleanup,
     .resume_prepare = ram_resume_prepare,
+    .save_postcopy_prepare = ram_save_postcopy_prepare,
 };
 
 static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
-- 
2.47.0


-- 
Peter Xu



  reply	other threads:[~2025-03-13 20:09 UTC|newest]

Thread overview: 37+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-02-28 12:17 ` [PATCH v7 1/5] migration/multifd: move macros to multifd header Prasad Pandit
2025-02-28 12:17 ` [PATCH v7 2/5] migration: enable multifd and postcopy together Prasad Pandit
2025-02-28 12:17 ` [PATCH v7 3/5] tests/qtest/migration: consolidate set capabilities Prasad Pandit
2025-02-28 12:17 ` [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit
2025-02-28 15:11   ` Fabiano Rosas
2025-03-03  9:33     ` Prasad Pandit
2025-02-28 12:17 ` [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command Prasad Pandit
2025-02-28 13:42   ` Peter Xu
2025-03-03 11:43     ` Prasad Pandit
2025-03-03 14:50       ` Peter Xu
2025-03-04  8:10         ` Prasad Pandit
2025-03-04 14:35           ` Peter Xu
2025-03-05 11:21             ` Prasad Pandit
2025-03-05 12:54               ` Peter Xu
2025-03-07 11:45                 ` Prasad Pandit
2025-03-07 22:48                   ` Peter Xu
2025-03-10  7:36                     ` Prasad Pandit
2025-03-13 12:43                     ` Prasad Pandit
2025-03-13 20:08                       ` Peter Xu [this message]
2025-03-17 12:30                         ` Prasad Pandit
2025-03-17 15:00                           ` Peter Xu
2025-03-07 22:51                   ` Peter Xu
2025-03-10 14:38                     ` Fabiano Rosas
2025-03-10 17:08                       ` Prasad Pandit
2025-03-10 19:58                         ` Fabiano Rosas
2025-03-11 10:01                           ` Prasad Pandit
2025-03-11 12:44                             ` Fabiano Rosas
2025-02-28 14:53 ` [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Fabiano Rosas
2025-03-03 10:47   ` Prasad Pandit
2025-03-03 14:12     ` Peter Xu
2025-03-04  9:47       ` Prasad Pandit
2025-03-04 14:42         ` Peter Xu
2025-03-05  7:41           ` Prasad Pandit
2025-03-05 13:56             ` Fabiano Rosas
2025-03-06  7:51               ` Prasad Pandit
2025-03-06 13:48                 ` Fabiano Rosas

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9M7MYUPqHFIQPuV@x1.local \
    --to=peterx@redhat.com \
    --cc=berrange@redhat.com \
    --cc=farosas@suse.de \
    --cc=pjp@fedoraproject.org \
    --cc=ppandit@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).