qemu-devel.nongnu.org archive mirror
* [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
@ 2025-03-18 12:38 Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 1/7] migration/multifd: move macros to multifd header Prasad Pandit
                   ` (8 more replies)
  0 siblings, 9 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

Hello,

* This series (v8) splits the earlier patch-2, which enabled the multifd
  and postcopy options together, into two separate patches. One modifies
  the channel discovery in the migration_ioc_process_incoming() function,
  and the second one enables multifd and postcopy migration together.

  It also adds the 'save_postcopy_prepare' savevm_state handler to
  allow different sections to take an action just before the Postcopy
  phase starts. Thank you, Peter, for these patches.
===
67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             152.66s   81 subtests passed
===


v7: https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t
* This series (v7) adds a 'MULTIFD_RECV_SYNC' migration command. It is used
  to notify the destination migration thread to synchronise with the Multifd
  threads. This allows the Multifd ('mig/dst/recv_x') threads on the
  destination to receive all their data before they are shut down.

  This series also updates the channel discovery function and qtests as
  suggested in the previous review comments.
===
67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             147.84s   81 subtests passed
===


v6: https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t
* This series (v6) shuts down Multifd threads before starting Postcopy
  migration. It helps to avoid an issue where multifd pages arrive late
  at the destination during the Postcopy phase and corrupt the vCPU
  state. It also reorders the qtest patches and makes some refactoring
  changes suggested in the previous review.
===
67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             161.35s   73 subtests passed
===


v5: https://lore.kernel.org/qemu-devel/20250205122712.229151-1-ppandit@redhat.com/T/#t
* This series (v5) consolidates the setting of migration capabilities into
  one 'set_migration_capabilities()' function, thus simplifying the test
  sources. It passes all migration tests.
===
66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             143.66s   71 subtests passed
===


v4: https://lore.kernel.org/qemu-devel/20250127120823.144949-1-ppandit@redhat.com/T/#t
* This series (v4) adds more 'multifd+postcopy' qtests, which test
  Precopy migration with the 'postcopy-ram' capability set and run
  Postcopy migrations with 'multifd' channels enabled.
===
$ ../qtest/migration-test --tap -k -r '/x86_64/migration/multifd+postcopy' | grep -i 'slow test'
# slow test /x86_64/migration/multifd+postcopy/plain executed in 1.29 secs
# slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk executed in 2.48 secs
# slow test /x86_64/migration/multifd+postcopy/preempt/plain executed in 1.49 secs
# slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk executed in 2.52 secs
# slow test /x86_64/migration/multifd+postcopy/tcp/tls/psk/match executed in 3.62 secs
# slow test /x86_64/migration/multifd+postcopy/tcp/plain/zstd executed in 1.34 secs
# slow test /x86_64/migration/multifd+postcopy/tcp/plain/cancel executed in 2.24 secs
...
66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             148.41s   71 subtests passed
===


v3: https://lore.kernel.org/qemu-devel/20250121131032.1611245-1-ppandit@redhat.com/T/#t
* This series (v3) passes all existing 'tests/qtest/migration/*' tests
  and adds a new one to enable multifd channels with postcopy migration.


v2: https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/T/#u
* This series (v2) further refactors the 'ram_save_target_page'
  function to make it independent of the multifd & postcopy change.


v1: https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u
* This series removes the 4-byte magic value introduced in the
  previous series for the Postcopy channel.


v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u
* Currently Multifd and Postcopy migration cannot be used together.
  QEMU shows the "Postcopy is not yet compatible with multifd" message.

  When migrating guests with large (hundreds of GB) RAM, Multifd threads
  help to accelerate the migration, but the inability to use them with
  Postcopy mode delays guest start-up on the destination side.

* This patch series allows enabling both Multifd and Postcopy
  migration together. Precopy and Multifd threads work during
  the initial guest (RAM) transfer. When migration moves to the
  Postcopy phase, the Multifd threads are restrained and the Postcopy
  threads start to request pages from the source side.

* This series introduces a 4-byte magic value to be sent on the
  Postcopy channel. It helps to differentiate the channels and properly
  set up incoming connections on the destination side.


Thank you.
---
Peter Xu (2):
  migration: Add save_postcopy_prepare() savevm handler
  migration/ram: Implement save_postcopy_prepare()

Prasad Pandit (5):
  migration/multifd: move macros to multifd header
  migration: Refactor channel discovery mechanism
  migration: enable multifd and postcopy together
  tests/qtest/migration: consolidate set capabilities
  tests/qtest/migration: add postcopy tests with multifd

 include/migration/register.h              |  15 +++
 migration/migration.c                     | 128 ++++++++++++----------
 migration/multifd-nocomp.c                |   3 +-
 migration/multifd.c                       |  12 +-
 migration/multifd.h                       |   5 +
 migration/options.c                       |   5 -
 migration/ram.c                           |  44 +++++++-
 migration/savevm.c                        |  33 ++++++
 migration/savevm.h                        |   1 +
 tests/qtest/migration/compression-tests.c |  38 ++++++-
 tests/qtest/migration/cpr-tests.c         |   6 +-
 tests/qtest/migration/file-tests.c        |  58 +++++-----
 tests/qtest/migration/framework.c         |  76 +++++++++----
 tests/qtest/migration/framework.h         |   9 +-
 tests/qtest/migration/misc-tests.c        |   4 +-
 tests/qtest/migration/postcopy-tests.c    |  35 +++++-
 tests/qtest/migration/precopy-tests.c     |  48 +++++---
 tests/qtest/migration/tls-tests.c         |  70 +++++++++++-
 18 files changed, 436 insertions(+), 154 deletions(-)

-- 
2.48.1




* [PATCH v8 1/7] migration/multifd: move macros to multifd header
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 2/7] migration: Refactor channel discovery mechanism Prasad Pandit
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

Move MULTIFD_ macros to the header file so that
they are accessible from other source files.

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 migration/multifd.c | 5 -----
 migration/multifd.h | 5 +++++
 2 files changed, 5 insertions(+), 5 deletions(-)

v7: no change
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/migration/multifd.c b/migration/multifd.c
index dfb5189f0e..6139cabe44 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -36,11 +36,6 @@
 #include "io/channel-socket.h"
 #include "yank_functions.h"
 
-/* Multiple fd's */
-
-#define MULTIFD_MAGIC 0x11223344U
-#define MULTIFD_VERSION 1
-
 typedef struct {
     uint32_t magic;
     uint32_t version;
diff --git a/migration/multifd.h b/migration/multifd.h
index 2d337e7b3b..9b6d81e7ed 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -49,6 +49,11 @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
 bool multifd_recv(void);
 MultiFDRecvData *multifd_get_recv_data(void);
 
+/* Multiple fd's */
+
+#define MULTIFD_MAGIC 0x11223344U
+#define MULTIFD_VERSION 1
+
 /* Multifd Compression flags */
 #define MULTIFD_FLAG_SYNC (1 << 0)
 
-- 
2.48.1




* [PATCH v8 2/7] migration: Refactor channel discovery mechanism
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 1/7] migration/multifd: move macros to multifd header Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-31 15:01   ` Fabiano Rosas
  2025-03-18 12:38 ` [PATCH v8 3/7] migration: enable multifd and postcopy together Prasad Pandit
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

The various logical migration channels don't have a
standardized way of advertising themselves, and their
connections may be seen out of order by the migration
destination. When a new connection arrives, the incoming
migration currently makes use of heuristics to determine
which channel it belongs to.

The next few patches will need to change how the multifd
and postcopy capabilities interact and that affects the
channel discovery heuristic.

Refactor the channel discovery heuristic to make it less
opaque and simplify the subsequent patches.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 migration/migration.c | 124 +++++++++++++++++++++++-------------------
 1 file changed, 69 insertions(+), 55 deletions(-)

v8:
 - Separate this patch out from earlier patch-2

v7:
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/migration/migration.c b/migration/migration.c
index d46e776e24..f97bb2777f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -95,6 +95,9 @@ enum mig_rp_message_type {
     MIG_RP_MSG_MAX
 };
 
+/* Migration channel types */
+enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY };
+
 /* When we add fault tolerance, we could have several
    migrations at once.  For now we don't need to add
    dynamic creation of migration */
@@ -985,28 +988,19 @@ void migration_fd_process_incoming(QEMUFile *f)
     migration_incoming_process();
 }
 
-/*
- * Returns true when we want to start a new incoming migration process,
- * false otherwise.
- */
-static bool migration_should_start_incoming(bool main_channel)
+static bool migration_has_main_and_multifd_channels(void)
 {
-    /* Multifd doesn't start unless all channels are established */
-    if (migrate_multifd()) {
-        return migration_has_all_channels();
+    MigrationIncomingState *mis = migration_incoming_get_current();
+    if (!mis->from_src_file) {
+        /* main channel not established */
+        return false;
     }
 
-    /* Preempt channel only starts when the main channel is created */
-    if (migrate_postcopy_preempt()) {
-        return main_channel;
+    if (migrate_multifd() && !multifd_recv_all_channels_created()) {
+        return false;
     }
 
-    /*
-     * For all the rest types of migration, we should only reach here when
-     * it's the main channel that's being created, and we should always
-     * proceed with this channel.
-     */
-    assert(main_channel);
+    /* main and all multifd channels are established */
     return true;
 }
 
@@ -1015,59 +1009,84 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
     MigrationIncomingState *mis = migration_incoming_get_current();
     Error *local_err = NULL;
     QEMUFile *f;
-    bool default_channel = true;
+    uint8_t channel;
     uint32_t channel_magic = 0;
     int ret = 0;
 
-    if (migrate_multifd() && !migrate_mapped_ram() &&
-        !migrate_postcopy_ram() &&
-        qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
-        /*
-         * With multiple channels, it is possible that we receive channels
-         * out of order on destination side, causing incorrect mapping of
-         * source channels on destination side. Check channel MAGIC to
-         * decide type of channel. Please note this is best effort, postcopy
-         * preempt channel does not send any magic number so avoid it for
-         * postcopy live migration. Also tls live migration already does
-         * tls handshake while initializing main channel so with tls this
-         * issue is not possible.
-         */
-        ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
-                                          sizeof(channel_magic), errp);
+    if (!migration_has_main_and_multifd_channels()) {
+        if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+            /*
+             * With multiple channels, it is possible that we receive channels
+             * out of order on destination side, causing incorrect mapping of
+             * source channels on destination side. Check channel MAGIC to
+             * decide type of channel. Please note this is best effort,
+             * postcopy preempt channel does not send any magic number so
+             * avoid it for postcopy live migration. Also tls live migration
+             * already does tls handshake while initializing main channel so
+             * with tls this issue is not possible.
+             */
+            ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
+                                              sizeof(channel_magic), errp);
+            if (ret != 0) {
+                return;
+            }
 
-        if (ret != 0) {
+            channel_magic = be32_to_cpu(channel_magic);
+            if (channel_magic == QEMU_VM_FILE_MAGIC) {
+                channel = CH_MAIN;
+            } else if (channel_magic == MULTIFD_MAGIC) {
+                channel = CH_MULTIFD;
+            } else if (!mis->from_src_file &&
+                        mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
+                /* reconnect main channel for postcopy recovery */
+                channel = CH_MAIN;
+            } else {
+                error_setg(errp, "unknown channel magic: %u", channel_magic);
+                return;
+            }
+        } else if (mis->from_src_file && migrate_multifd()) {
+            /*
+             * Non-peekable channels like tls/file are processed as
+             * multifd channels when multifd is enabled.
+             */
+            channel = CH_MULTIFD;
+        } else if (!mis->from_src_file) {
+            channel = CH_MAIN;
+        } else {
+            error_setg(errp, "non-peekable channel used without multifd");
             return;
         }
-
-        default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
+    } else if (mis->from_src_file) {
+        channel = CH_POSTCOPY;
     } else {
-        default_channel = !mis->from_src_file;
+        channel = CH_MAIN;
     }
 
     if (multifd_recv_setup(errp) != 0) {
         return;
     }
 
-    if (default_channel) {
+    if (channel == CH_MAIN) {
         f = qemu_file_new_input(ioc);
         migration_incoming_setup(f);
-    } else {
+    } else if (channel == CH_MULTIFD) {
         /* Multiple connections */
-        assert(migration_needs_multiple_sockets());
         if (migrate_multifd()) {
             multifd_recv_new_channel(ioc, &local_err);
-        } else {
-            assert(migrate_postcopy_preempt());
-            f = qemu_file_new_input(ioc);
-            postcopy_preempt_new_channel(mis, f);
         }
         if (local_err) {
             error_propagate(errp, local_err);
             return;
         }
+    } else if (channel == CH_POSTCOPY) {
+        assert(migrate_postcopy_preempt());
+        assert(!mis->postcopy_qemufile_dst);
+        f = qemu_file_new_input(ioc);
+        postcopy_preempt_new_channel(mis, f);
+        return;
     }
 
-    if (migration_should_start_incoming(default_channel)) {
+    if (migration_has_main_and_multifd_channels()) {
         /* If it's a recovery, we're done */
         if (postcopy_try_recover()) {
             return;
@@ -1084,20 +1103,15 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
  */
 bool migration_has_all_channels(void)
 {
+    if (!migration_has_main_and_multifd_channels()) {
+        return false;
+    }
+
     MigrationIncomingState *mis = migration_incoming_get_current();
-
-    if (!mis->from_src_file) {
+    if (migrate_postcopy_preempt() && !mis->postcopy_qemufile_dst) {
         return false;
     }
 
-    if (migrate_multifd()) {
-        return multifd_recv_all_channels_created();
-    }
-
-    if (migrate_postcopy_preempt()) {
-        return mis->postcopy_qemufile_dst != NULL;
-    }
-
     return true;
 }
 
-- 
2.48.1




* [PATCH v8 3/7] migration: enable multifd and postcopy together
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 1/7] migration/multifd: move macros to multifd header Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 2/7] migration: Refactor channel discovery mechanism Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-31 15:27   ` Fabiano Rosas
  2025-03-18 12:38 ` [PATCH v8 4/7] tests/qtest/migration: consolidate set capabilities Prasad Pandit
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

Enable Multifd and Postcopy migration together.
The migration_ioc_process_incoming() routine
checks the magic value sent on each channel and
helps to properly set up the multifd and postcopy
channels.

The Precopy and Multifd threads work during the
initial guest RAM transfer. When migration moves
to the Postcopy phase, the multifd threads cease
to send data on the multifd channels, and the
Postcopy threads on the destination request/pull
data from the source side.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 migration/multifd-nocomp.c | 3 ++-
 migration/multifd.c        | 7 +++++++
 migration/options.c        | 5 -----
 migration/ram.c            | 7 +++----
 4 files changed, 12 insertions(+), 10 deletions(-)

v8:
- Separate this patch out from earlier patch-2.

v7:
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index ffe75256c9..02f8bf8ce8 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -17,6 +17,7 @@
 #include "migration-stats.h"
 #include "multifd.h"
 #include "options.h"
+#include "migration.h"
 #include "qapi/error.h"
 #include "qemu/cutils.h"
 #include "qemu/error-report.h"
@@ -399,7 +400,7 @@ int multifd_ram_flush_and_sync(QEMUFile *f)
     MultiFDSyncReq req;
     int ret;
 
-    if (!migrate_multifd()) {
+    if (!migrate_multifd() || migration_in_postcopy()) {
         return 0;
     }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 6139cabe44..074d16d07d 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1379,6 +1379,13 @@ static void *multifd_recv_thread(void *opaque)
         }
 
         if (has_data) {
+            /*
+             * multifd thread should not be active and receive data
+             * when migration is in the Postcopy phase. Two threads
+             * writing the same memory area could easily corrupt
+             * the guest state.
+             */
+            assert(!migration_in_postcopy());
             if (is_device_state) {
                 assert(use_packets);
                 ret = multifd_device_state_recv(p, &local_err);
diff --git a/migration/options.c b/migration/options.c
index b0ac2ea408..48aa6076de 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -491,11 +491,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
             error_setg(errp, "Postcopy is not compatible with ignore-shared");
             return false;
         }
-
-        if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
-            error_setg(errp, "Postcopy is not yet compatible with multifd");
-            return false;
-        }
     }
 
     if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
diff --git a/migration/ram.c b/migration/ram.c
index 424df6d9f1..6fd88cbf2a 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1297,7 +1297,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
         pss->page = 0;
         pss->block = QLIST_NEXT_RCU(pss->block, next);
         if (!pss->block) {
-            if (multifd_ram_sync_per_round()) {
+            if (multifd_ram_sync_per_round() && !migration_in_postcopy()) {
                 QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
                 int ret = multifd_ram_flush_and_sync(f);
                 if (ret < 0) {
@@ -1976,9 +1976,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
         }
     }
 
-    if (migrate_multifd()) {
-        RAMBlock *block = pss->block;
-        return ram_save_multifd_page(block, offset);
+    if (migrate_multifd() && !migration_in_postcopy()) {
+        return ram_save_multifd_page(pss->block, offset);
     }
 
     return ram_save_page(rs, pss);
-- 
2.48.1




* [PATCH v8 4/7] tests/qtest/migration: consolidate set capabilities
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (2 preceding siblings ...)
  2025-03-18 12:38 ` [PATCH v8 3/7] migration: enable multifd and postcopy together Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 5/7] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

Migration capabilities are set in multiple '.start_hook'
functions for various tests. Instead, consolidate the
setting of capabilities in the 'migrate_start_set_capabilities()'
function, which is called from 'migrate_start()'.
Besides simplifying the capability setting, this helps
to declutter the qtest sources.

Suggested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 tests/qtest/migration/compression-tests.c | 22 +++++--
 tests/qtest/migration/cpr-tests.c         |  6 +-
 tests/qtest/migration/file-tests.c        | 58 ++++++++---------
 tests/qtest/migration/framework.c         | 76 ++++++++++++++++-------
 tests/qtest/migration/framework.h         |  9 ++-
 tests/qtest/migration/misc-tests.c        |  4 +-
 tests/qtest/migration/postcopy-tests.c    |  8 ++-
 tests/qtest/migration/precopy-tests.c     | 29 +++++----
 tests/qtest/migration/tls-tests.c         | 23 ++++++-
 9 files changed, 151 insertions(+), 84 deletions(-)

v7: no change
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c
index 8b58401b84..41e79f031b 100644
--- a/tests/qtest/migration/compression-tests.c
+++ b/tests/qtest/migration/compression-tests.c
@@ -35,6 +35,9 @@ static void test_multifd_tcp_zstd(void)
 {
     MigrateCommon args = {
         .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .start_hook = migrate_hook_start_precopy_tcp_multifd_zstd,
     };
     test_precopy_common(&args);
@@ -56,6 +59,9 @@ static void test_multifd_tcp_qatzip(void)
 {
     MigrateCommon args = {
         .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .start_hook = migrate_hook_start_precopy_tcp_multifd_qatzip,
     };
     test_precopy_common(&args);
@@ -74,6 +80,9 @@ static void test_multifd_tcp_qpl(void)
 {
     MigrateCommon args = {
         .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .start_hook = migrate_hook_start_precopy_tcp_multifd_qpl,
     };
     test_precopy_common(&args);
@@ -92,6 +101,9 @@ static void test_multifd_tcp_uadk(void)
 {
     MigrateCommon args = {
         .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .start_hook = migrate_hook_start_precopy_tcp_multifd_uadk,
     };
     test_precopy_common(&args);
@@ -103,10 +115,6 @@ migrate_hook_start_xbzrle(QTestState *from,
                           QTestState *to)
 {
     migrate_set_parameter_int(from, "xbzrle-cache-size", 33554432);
-
-    migrate_set_capability(from, "xbzrle", true);
-    migrate_set_capability(to, "xbzrle", true);
-
     return NULL;
 }
 
@@ -118,6 +126,9 @@ static void test_precopy_unix_xbzrle(void)
         .listen_uri = uri,
         .start_hook = migrate_hook_start_xbzrle,
         .iterations = 2,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_XBZRLE] = true,
+        },
         /*
          * XBZRLE needs pages to be modified when doing the 2nd+ round
          * iteration to have real data pushed to the stream.
@@ -146,6 +157,9 @@ static void test_multifd_tcp_zlib(void)
 {
     MigrateCommon args = {
         .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .start_hook = migrate_hook_start_precopy_tcp_multifd_zlib,
     };
     test_precopy_common(&args);
diff --git a/tests/qtest/migration/cpr-tests.c b/tests/qtest/migration/cpr-tests.c
index 4758841824..5536e14610 100644
--- a/tests/qtest/migration/cpr-tests.c
+++ b/tests/qtest/migration/cpr-tests.c
@@ -24,9 +24,6 @@ static void *migrate_hook_start_mode_reboot(QTestState *from, QTestState *to)
     migrate_set_parameter_str(from, "mode", "cpr-reboot");
     migrate_set_parameter_str(to, "mode", "cpr-reboot");
 
-    migrate_set_capability(from, "x-ignore-shared", true);
-    migrate_set_capability(to, "x-ignore-shared", true);
-
     return NULL;
 }
 
@@ -39,6 +36,9 @@ static void test_mode_reboot(void)
         .connect_uri = uri,
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_mode_reboot,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_X_IGNORE_SHARED] = true,
+        },
     };
 
     test_file_common(&args, true);
diff --git a/tests/qtest/migration/file-tests.c b/tests/qtest/migration/file-tests.c
index f260e2871d..4d78ce0855 100644
--- a/tests/qtest/migration/file-tests.c
+++ b/tests/qtest/migration/file-tests.c
@@ -107,15 +107,6 @@ static void test_precopy_file_offset_bad(void)
     test_file_common(&args, false);
 }
 
-static void *migrate_hook_start_mapped_ram(QTestState *from,
-                                           QTestState *to)
-{
-    migrate_set_capability(from, "mapped-ram", true);
-    migrate_set_capability(to, "mapped-ram", true);
-
-    return NULL;
-}
-
 static void test_precopy_file_mapped_ram_live(void)
 {
     g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
@@ -123,7 +114,9 @@ static void test_precopy_file_mapped_ram_live(void)
     MigrateCommon args = {
         .connect_uri = uri,
         .listen_uri = "defer",
-        .start_hook = migrate_hook_start_mapped_ram,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+        },
     };
 
     test_file_common(&args, false);
@@ -136,26 +129,14 @@ static void test_precopy_file_mapped_ram(void)
     MigrateCommon args = {
         .connect_uri = uri,
         .listen_uri = "defer",
-        .start_hook = migrate_hook_start_mapped_ram,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+        },
     };
 
     test_file_common(&args, true);
 }
 
-static void *migrate_hook_start_multifd_mapped_ram(QTestState *from,
-                                                   QTestState *to)
-{
-    migrate_hook_start_mapped_ram(from, to);
-
-    migrate_set_parameter_int(from, "multifd-channels", 4);
-    migrate_set_parameter_int(to, "multifd-channels", 4);
-
-    migrate_set_capability(from, "multifd", true);
-    migrate_set_capability(to, "multifd", true);
-
-    return NULL;
-}
-
 static void test_multifd_file_mapped_ram_live(void)
 {
     g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
@@ -163,7 +144,10 @@ static void test_multifd_file_mapped_ram_live(void)
     MigrateCommon args = {
         .connect_uri = uri,
         .listen_uri = "defer",
-        .start_hook = migrate_hook_start_multifd_mapped_ram,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+        },
     };
 
     test_file_common(&args, false);
@@ -176,7 +160,10 @@ static void test_multifd_file_mapped_ram(void)
     MigrateCommon args = {
         .connect_uri = uri,
         .listen_uri = "defer",
-        .start_hook = migrate_hook_start_multifd_mapped_ram,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+        },
     };
 
     test_file_common(&args, true);
@@ -185,8 +172,6 @@ static void test_multifd_file_mapped_ram(void)
 static void *migrate_hook_start_multifd_mapped_ram_dio(QTestState *from,
                                                        QTestState *to)
 {
-    migrate_hook_start_multifd_mapped_ram(from, to);
-
     migrate_set_parameter_bool(from, "direct-io", true);
     migrate_set_parameter_bool(to, "direct-io", true);
 
@@ -201,6 +186,10 @@ static void test_multifd_file_mapped_ram_dio(void)
         .connect_uri = uri,
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_mapped_ram_dio,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
 
     if (!probe_o_direct_support(tmpfs)) {
@@ -246,7 +235,6 @@ static void *migrate_hook_start_multifd_mapped_ram_fdset_dio(QTestState *from,
     fdset_add_fds(from, file, O_WRONLY, 2, true);
     fdset_add_fds(to, file, O_RDONLY, 2, true);
 
-    migrate_hook_start_multifd_mapped_ram(from, to);
     migrate_set_parameter_bool(from, "direct-io", true);
     migrate_set_parameter_bool(to, "direct-io", true);
 
@@ -261,8 +249,6 @@ static void *migrate_hook_start_multifd_mapped_ram_fdset(QTestState *from,
     fdset_add_fds(from, file, O_WRONLY, 2, false);
     fdset_add_fds(to, file, O_RDONLY, 2, false);
 
-    migrate_hook_start_multifd_mapped_ram(from, to);
-
     return NULL;
 }
 
@@ -275,6 +261,10 @@ static void test_multifd_file_mapped_ram_fdset(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_mapped_ram_fdset,
         .end_hook = migrate_hook_end_multifd_mapped_ram_fdset,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
 
     test_file_common(&args, true);
@@ -289,6 +279,10 @@ static void test_multifd_file_mapped_ram_fdset_dio(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_mapped_ram_fdset_dio,
         .end_hook = migrate_hook_end_multifd_mapped_ram_fdset,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
 
     if (!probe_o_direct_support(tmpfs)) {
diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
index 10e1d04b58..be6c245843 100644
--- a/tests/qtest/migration/framework.c
+++ b/tests/qtest/migration/framework.c
@@ -30,6 +30,7 @@
 #define QEMU_VM_FILE_MAGIC 0x5145564d
 #define QEMU_ENV_SRC "QTEST_QEMU_BINARY_SRC"
 #define QEMU_ENV_DST "QTEST_QEMU_BINARY_DST"
+#define MULTIFD_TEST_CHANNELS 4
 
 unsigned start_address;
 unsigned end_address;
@@ -207,6 +208,52 @@ static QList *migrate_start_get_qmp_capabilities(const MigrateStart *args)
     return capabilities;
 }
 
+static void migrate_start_set_capabilities(QTestState *from, QTestState *to,
+                                           MigrateStart *args)
+{
+    /*
+     * MigrationCapability_lookup and MIGRATION_CAPABILITY_ constants
+     * are from qapi-types-migration.h.
+     */
+    for (uint8_t i = 0; i < MIGRATION_CAPABILITY__MAX; i++)
+    {
+        if (!args->caps[i]) {
+            continue;
+        }
+        if (from) {
+            migrate_set_capability(from,
+                            MigrationCapability_lookup.array[i], true);
+        }
+        if (to) {
+            migrate_set_capability(to,
+                            MigrationCapability_lookup.array[i], true);
+        }
+    }
+
+    /*
+     * Always enable migration events.  Libvirt always enables them, so
+     * try to mimic that as closely as possible.
+     */
+    migrate_set_capability(from, "events", true);
+    if (!args->defer_target_connect) {
+        migrate_set_capability(to, "events", true);
+    }
+
+    /*
+     * Default number of channels should be fine for most
+     * tests. Individual tests can override by calling
+     * migrate_set_parameter() directly.
+     */
+    if (args->caps[MIGRATION_CAPABILITY_MULTIFD]) {
+        migrate_set_parameter_int(from, "multifd-channels",
+                                  MULTIFD_TEST_CHANNELS);
+        migrate_set_parameter_int(to, "multifd-channels",
+                                  MULTIFD_TEST_CHANNELS);
+    }
+
+    return;
+}
+
 int migrate_start(QTestState **from, QTestState **to, const char *uri,
                   MigrateStart *args)
 {
@@ -379,14 +426,7 @@ int migrate_start(QTestState **from, QTestState **to, const char *uri,
         unlink(shmem_path);
     }
 
-    /*
-     * Always enable migration events.  Libvirt always uses it, let's try
-     * to mimic as closer as that.
-     */
-    migrate_set_capability(*from, "events", true);
-    if (!args->defer_target_connect) {
-        migrate_set_capability(*to, "events", true);
-    }
+    migrate_start_set_capabilities(*from, *to, args);
 
     return 0;
 }
@@ -432,6 +472,10 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
 {
     QTestState *from, *to;
 
+    /* set postcopy capabilities */
+    args->start.caps[MIGRATION_CAPABILITY_POSTCOPY_BLOCKTIME] = true;
+    args->start.caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true;
+
     if (migrate_start(&from, &to, "defer", &args->start)) {
         return -1;
     }
@@ -440,17 +484,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
         args->postcopy_data = args->start_hook(from, to);
     }
 
-    migrate_set_capability(from, "postcopy-ram", true);
-    migrate_set_capability(to, "postcopy-ram", true);
-    migrate_set_capability(to, "postcopy-blocktime", true);
-
-    if (args->postcopy_preempt) {
-        migrate_set_capability(from, "postcopy-preempt", true);
-        migrate_set_capability(to, "postcopy-preempt", true);
-    }
-
     migrate_ensure_non_converge(from);
-
     migrate_prepare_for_dirty_mem(from);
     qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming',"
                              "  'arguments': { "
@@ -948,15 +982,9 @@ void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from,
                                                     QTestState *to,
                                                     const char *method)
 {
-    migrate_set_parameter_int(from, "multifd-channels", 16);
-    migrate_set_parameter_int(to, "multifd-channels", 16);
-
     migrate_set_parameter_str(from, "multifd-compression", method);
     migrate_set_parameter_str(to, "multifd-compression", method);
 
-    migrate_set_capability(from, "multifd", true);
-    migrate_set_capability(to, "multifd", true);
-
     /* Start incoming migration from the 1st socket */
     migrate_incoming_qmp(to, "tcp:127.0.0.1:0", NULL, "{}");
 
diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index e4a11870f6..01e425e64e 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -12,6 +12,7 @@
 #define TEST_FRAMEWORK_H
 
 #include "libqtest.h"
+#include <qapi/qapi-types-migration.h>
 
 #define FILE_TEST_FILENAME "migfile"
 #define FILE_TEST_OFFSET 0x1000
@@ -120,6 +121,13 @@ typedef struct {
 
     /* Do not connect to target monitor and qtest sockets in qtest_init */
     bool defer_target_connect;
+
+    /*
+     * Migration capabilities to be set in both source and
+     * destination. For unilateral capabilities, use
+     * migrate_set_capability() directly.
+     */
+    bool caps[MIGRATION_CAPABILITY__MAX];
 } MigrateStart;
 
 typedef enum PostcopyRecoveryFailStage {
@@ -207,7 +215,6 @@ typedef struct {
 
     /* Postcopy specific fields */
     void *postcopy_data;
-    bool postcopy_preempt;
     PostcopyRecoveryFailStage postcopy_recovery_fail_stage;
 } MigrateCommon;
 
diff --git a/tests/qtest/migration/misc-tests.c b/tests/qtest/migration/misc-tests.c
index 2e612d9e38..54995256d8 100644
--- a/tests/qtest/migration/misc-tests.c
+++ b/tests/qtest/migration/misc-tests.c
@@ -98,6 +98,7 @@ static void test_ignore_shared(void)
     QTestState *from, *to;
     MigrateStart args = {
         .use_shmem = true,
+        .caps[MIGRATION_CAPABILITY_X_IGNORE_SHARED] = true,
     };
 
     if (migrate_start(&from, &to, uri, &args)) {
@@ -107,9 +108,6 @@ static void test_ignore_shared(void)
     migrate_ensure_non_converge(from);
     migrate_prepare_for_dirty_mem(from);
 
-    migrate_set_capability(from, "x-ignore-shared", true);
-    migrate_set_capability(to, "x-ignore-shared", true);
-
     /* Wait for the first serial output from the source */
     wait_for_serial("src_serial");
 
diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c
index 982457bed1..483e3ff99f 100644
--- a/tests/qtest/migration/postcopy-tests.c
+++ b/tests/qtest/migration/postcopy-tests.c
@@ -39,7 +39,9 @@ static void test_postcopy_suspend(void)
 static void test_postcopy_preempt(void)
 {
     MigrateCommon args = {
-        .postcopy_preempt = true,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true,
+        },
     };
 
     test_postcopy_common(&args);
@@ -73,7 +75,9 @@ static void test_postcopy_recovery_fail_reconnect(void)
 static void test_postcopy_preempt_recovery(void)
 {
     MigrateCommon args = {
-        .postcopy_preempt = true,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true,
+        },
     };
 
     test_postcopy_recovery_common(&args);
diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index ba273d10b9..f8404793b8 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -108,23 +108,14 @@ static void test_precopy_tcp_plain(void)
     test_precopy_common(&args);
 }
 
-static void *migrate_hook_start_switchover_ack(QTestState *from, QTestState *to)
-{
-
-    migrate_set_capability(from, "return-path", true);
-    migrate_set_capability(to, "return-path", true);
-
-    migrate_set_capability(from, "switchover-ack", true);
-    migrate_set_capability(to, "switchover-ack", true);
-
-    return NULL;
-}
-
 static void test_precopy_tcp_switchover_ack(void)
 {
     MigrateCommon args = {
         .listen_uri = "tcp:127.0.0.1:0",
-        .start_hook = migrate_hook_start_switchover_ack,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_RETURN_PATH] = true,
+            .caps[MIGRATION_CAPABILITY_SWITCHOVER_ACK] = true,
+        },
         /*
          * Source VM must be running in order to consider the switchover ACK
          * when deciding to do switchover or not.
@@ -393,6 +384,9 @@ static void test_multifd_tcp_uri_none(void)
     MigrateCommon args = {
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_precopy_tcp_multifd,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         /*
          * Multifd is more complicated than most of the features, it
          * directly takes guest page buffers when sending, make sure
@@ -408,6 +402,9 @@ static void test_multifd_tcp_zero_page_legacy(void)
     MigrateCommon args = {
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_precopy_tcp_multifd_zero_page_legacy,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         /*
          * Multifd is more complicated than most of the features, it
          * directly takes guest page buffers when sending, make sure
@@ -423,6 +420,9 @@ static void test_multifd_tcp_no_zero_page(void)
     MigrateCommon args = {
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_precopy_tcp_multifd_no_zero_page,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         /*
          * Multifd is more complicated than most of the features, it
          * directly takes guest page buffers when sending, make sure
@@ -439,6 +439,9 @@ static void test_multifd_tcp_channels_none(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_precopy_tcp_multifd,
         .live = true,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
         .connect_channels = ("[ { 'channel-type': 'main',"
                              "    'addr': { 'transport': 'socket',"
                              "              'type': 'inet',"
diff --git a/tests/qtest/migration/tls-tests.c b/tests/qtest/migration/tls-tests.c
index 2cb4a44bcd..72f44defbb 100644
--- a/tests/qtest/migration/tls-tests.c
+++ b/tests/qtest/migration/tls-tests.c
@@ -375,9 +375,11 @@ static void test_postcopy_tls_psk(void)
 static void test_postcopy_preempt_tls_psk(void)
 {
     MigrateCommon args = {
-        .postcopy_preempt = true,
         .start_hook = migrate_hook_start_tls_psk_match,
         .end_hook = migrate_hook_end_tls_psk,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true,
+        },
     };
 
     test_postcopy_common(&args);
@@ -397,9 +399,11 @@ static void test_postcopy_recovery_tls_psk(void)
 static void test_postcopy_preempt_all(void)
 {
     MigrateCommon args = {
-        .postcopy_preempt = true,
         .start_hook = migrate_hook_start_tls_psk_match,
         .end_hook = migrate_hook_end_tls_psk,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true,
+        },
     };
 
     test_postcopy_recovery_common(&args);
@@ -631,6 +635,9 @@ static void test_multifd_tcp_tls_psk_match(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match,
         .end_hook = migrate_hook_end_tls_psk,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
     test_precopy_common(&args);
 }
@@ -640,6 +647,7 @@ static void test_multifd_tcp_tls_psk_mismatch(void)
     MigrateCommon args = {
         .start = {
             .hide_stderr = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
         },
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tcp_tls_psk_mismatch,
@@ -656,6 +664,9 @@ static void test_multifd_tcp_tls_x509_default_host(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tls_x509_default_host,
         .end_hook = migrate_hook_end_tls_x509,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
     test_precopy_common(&args);
 }
@@ -666,6 +677,9 @@ static void test_multifd_tcp_tls_x509_override_host(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tls_x509_override_host,
         .end_hook = migrate_hook_end_tls_x509,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
     test_precopy_common(&args);
 }
@@ -688,6 +702,7 @@ static void test_multifd_tcp_tls_x509_mismatch_host(void)
     MigrateCommon args = {
         .start = {
             .hide_stderr = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
         },
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tls_x509_mismatch_host,
@@ -703,6 +718,9 @@ static void test_multifd_tcp_tls_x509_allow_anon_client(void)
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tls_x509_allow_anon_client,
         .end_hook = migrate_hook_end_tls_x509,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
     };
     test_precopy_common(&args);
 }
@@ -712,6 +730,7 @@ static void test_multifd_tcp_tls_x509_reject_anon_client(void)
     MigrateCommon args = {
         .start = {
             .hide_stderr = true,
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
         },
         .listen_uri = "defer",
         .start_hook = migrate_hook_start_multifd_tls_x509_reject_anon_client,
-- 
2.48.1



^ permalink raw reply related	[flat|nested] 30+ messages in thread

* [PATCH v8 5/7] tests/qtest/migration: add postcopy tests with multifd
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (3 preceding siblings ...)
  2025-03-18 12:38 ` [PATCH v8 4/7] tests/qtest/migration: consolidate set capabilities Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-18 12:38 ` [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler Prasad Pandit
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit

From: Prasad Pandit <pjp@fedoraproject.org>

Add new qtests to run postcopy migration with multifd
channels enabled.

Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 tests/qtest/migration/compression-tests.c | 16 ++++++++
 tests/qtest/migration/postcopy-tests.c    | 27 +++++++++++++
 tests/qtest/migration/precopy-tests.c     | 19 +++++++++
 tests/qtest/migration/tls-tests.c         | 47 +++++++++++++++++++++++
 4 files changed, 109 insertions(+)

v8:
- Set missing 'postcopy-ram' and 'multifd' capabilities for a couple of tests.

v7:
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c
index 41e79f031b..a788a8d4a7 100644
--- a/tests/qtest/migration/compression-tests.c
+++ b/tests/qtest/migration/compression-tests.c
@@ -42,6 +42,20 @@ static void test_multifd_tcp_zstd(void)
     };
     test_precopy_common(&args);
 }
+
+static void test_multifd_postcopy_tcp_zstd(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true,
+        },
+        .start_hook = migrate_hook_start_precopy_tcp_multifd_zstd,
+    };
+
+    test_precopy_common(&args);
+}
 #endif /* CONFIG_ZSTD */
 
 #ifdef CONFIG_QATZIP
@@ -184,6 +198,8 @@ void migration_test_add_compression(MigrationTestEnv *env)
 #ifdef CONFIG_ZSTD
     migration_test_add("/migration/multifd/tcp/plain/zstd",
                        test_multifd_tcp_zstd);
+    migration_test_add("/migration/multifd+postcopy/tcp/plain/zstd",
+                       test_multifd_postcopy_tcp_zstd);
 #endif
 
 #ifdef CONFIG_QATZIP
diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c
index 483e3ff99f..eb637f94f7 100644
--- a/tests/qtest/migration/postcopy-tests.c
+++ b/tests/qtest/migration/postcopy-tests.c
@@ -94,6 +94,29 @@ static void migration_test_add_postcopy_smoke(MigrationTestEnv *env)
     }
 }
 
+static void test_multifd_postcopy(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
+    };
+
+    test_postcopy_common(&args);
+}
+
+static void test_multifd_postcopy_preempt(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true,
+        },
+    };
+
+    test_postcopy_common(&args);
+}
+
 void migration_test_add_postcopy(MigrationTestEnv *env)
 {
     migration_test_add_postcopy_smoke(env);
@@ -114,6 +137,10 @@ void migration_test_add_postcopy(MigrationTestEnv *env)
             "/migration/postcopy/recovery/double-failures/reconnect",
             test_postcopy_recovery_fail_reconnect);
 
+        migration_test_add("/migration/postcopy/multifd/plain",
+                           test_multifd_postcopy);
+        migration_test_add("/migration/postcopy/multifd/preempt/plain",
+                           test_multifd_postcopy_preempt);
         if (env->is_x86) {
             migration_test_add("/migration/postcopy/suspend",
                                test_postcopy_suspend);
diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c
index f8404793b8..b2b0db8076 100644
--- a/tests/qtest/migration/precopy-tests.c
+++ b/tests/qtest/migration/precopy-tests.c
@@ -34,6 +34,7 @@
 #define DIRTYLIMIT_TOLERANCE_RANGE  25  /* MB/s */
 
 static char *tmpfs;
+static bool postcopy_ram = false;
 
 static void test_precopy_unix_plain(void)
 {
@@ -476,6 +477,11 @@ static void test_multifd_tcp_cancel(void)
     migrate_ensure_non_converge(from);
     migrate_prepare_for_dirty_mem(from);
 
+    if (postcopy_ram) {
+        migrate_set_capability(from, "postcopy-ram", true);
+        migrate_set_capability(to, "postcopy-ram", true);
+    }
+
     migrate_set_parameter_int(from, "multifd-channels", 16);
     migrate_set_parameter_int(to, "multifd-channels", 16);
 
@@ -517,6 +523,10 @@ static void test_multifd_tcp_cancel(void)
         return;
     }
 
+    if (postcopy_ram) {
+        migrate_set_capability(to2, "postcopy-ram", true);
+    }
+
     migrate_set_parameter_int(to2, "multifd-channels", 16);
 
     migrate_set_capability(to2, "multifd", true);
@@ -540,6 +550,13 @@ static void test_multifd_tcp_cancel(void)
     migrate_end(from, to2, true);
 }
 
+static void test_multifd_postcopy_tcp_cancel(void)
+{
+    postcopy_ram = true;
+    test_multifd_tcp_cancel();
+    postcopy_ram = false;
+}
+
 static void test_cancel_src_after_failed(QTestState *from, QTestState *to,
                                          const char *uri, const char *phase)
 {
@@ -1127,6 +1144,8 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env)
                        test_multifd_tcp_uri_none);
     migration_test_add("/migration/multifd/tcp/plain/cancel",
                        test_multifd_tcp_cancel);
+    migration_test_add("/migration/multifd+postcopy/tcp/plain/cancel",
+                       test_multifd_postcopy_tcp_cancel);
 }
 
 void migration_test_add_precopy(MigrationTestEnv *env)
diff --git a/tests/qtest/migration/tls-tests.c b/tests/qtest/migration/tls-tests.c
index 72f44defbb..54581e182f 100644
--- a/tests/qtest/migration/tls-tests.c
+++ b/tests/qtest/migration/tls-tests.c
@@ -395,6 +395,19 @@ static void test_postcopy_recovery_tls_psk(void)
     test_postcopy_recovery_common(&args);
 }
 
+static void test_multifd_postcopy_recovery_tls_psk(void)
+{
+    MigrateCommon args = {
+        .start_hook = migrate_hook_start_tls_psk_match,
+        .end_hook = migrate_hook_end_tls_psk,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
+    };
+
+    test_postcopy_recovery_common(&args);
+}
+
 /* This contains preempt+recovery+tls test altogether */
 static void test_postcopy_preempt_all(void)
 {
@@ -409,6 +422,19 @@ static void test_postcopy_preempt_all(void)
     test_postcopy_recovery_common(&args);
 }
 
+static void test_multifd_postcopy_preempt_recovery_tls_psk(void)
+{
+    MigrateCommon args = {
+        .start_hook = migrate_hook_start_tls_psk_match,
+        .end_hook = migrate_hook_end_tls_psk,
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+        },
+    };
+
+    test_postcopy_recovery_common(&args);
+}
+
 static void test_precopy_unix_tls_psk(void)
 {
     g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -657,6 +683,21 @@ static void test_multifd_tcp_tls_psk_mismatch(void)
     test_precopy_common(&args);
 }
 
+static void test_multifd_postcopy_tcp_tls_psk_match(void)
+{
+    MigrateCommon args = {
+        .start = {
+            .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
+            .caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true,
+        },
+        .listen_uri = "defer",
+        .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match,
+        .end_hook = migrate_hook_end_tls_psk,
+    };
+
+    test_precopy_common(&args);
+}
+
 #ifdef CONFIG_TASN1
 static void test_multifd_tcp_tls_x509_default_host(void)
 {
@@ -774,6 +815,10 @@ void migration_test_add_tls(MigrationTestEnv *env)
                            test_postcopy_preempt_tls_psk);
         migration_test_add("/migration/postcopy/preempt/recovery/tls/psk",
                            test_postcopy_preempt_all);
+        migration_test_add("/migration/postcopy/multifd/recovery/tls/psk",
+                           test_multifd_postcopy_recovery_tls_psk);
+        migration_test_add("/migration/postcopy/multifd/preempt/recovery/tls/psk",
+                           test_multifd_postcopy_preempt_recovery_tls_psk);
     }
 #ifdef CONFIG_TASN1
     migration_test_add("/migration/precopy/unix/tls/x509/default-host",
@@ -805,6 +850,8 @@ void migration_test_add_tls(MigrationTestEnv *env)
                        test_multifd_tcp_tls_psk_match);
     migration_test_add("/migration/multifd/tcp/tls/psk/mismatch",
                        test_multifd_tcp_tls_psk_mismatch);
+    migration_test_add("/migration/multifd+postcopy/tcp/tls/psk/match",
+                       test_multifd_postcopy_tcp_tls_psk_match);
 #ifdef CONFIG_TASN1
     migration_test_add("/migration/multifd/tcp/tls/x509/default-host",
                        test_multifd_tcp_tls_x509_default_host);
-- 
2.48.1




* [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (4 preceding siblings ...)
  2025-03-18 12:38 ` [PATCH v8 5/7] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-31 15:08   ` Fabiano Rosas
  2025-03-18 12:38 ` [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare() Prasad Pandit
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange

From: Peter Xu <peterx@redhat.com>

Add a savevm handler for a module to opt-in sending extra sections right
before postcopy starts, and before VM is stopped.

RAM will start to use this new savevm handler in the next patch to do flush
and sync for multifd pages.

Note that we choose to do it before the VM is stopped because the only
potential user so far is not sensitive to VM status, and doing it before
the VM stops avoids enlarging the postcopy downtime.

It is still a bit unfortunate that we need to introduce such a new savevm
handler for just one use case; however, it is the cleanest approach so far.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 include/migration/register.h | 15 +++++++++++++++
 migration/migration.c        |  4 ++++
 migration/savevm.c           | 33 +++++++++++++++++++++++++++++++++
 migration/savevm.h           |  1 +
 4 files changed, 53 insertions(+)

v8:
- New patch

v7:
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/include/migration/register.h b/include/migration/register.h
index c041ce32f2..b79dc81b8d 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -189,6 +189,21 @@ typedef struct SaveVMHandlers {
 
     /* This runs outside the BQL!  */
 
+    /**
+     * @save_postcopy_prepare
+     *
+     * This hook will be invoked on the source side right before switching
+     * to postcopy (before VM stopped).
+     *
+     * @f:      QEMUFile where to send the data
+     * @opaque: Data pointer passed to register_savevm_live()
+     * @errp:   Error** used to report error message
+     *
+     * Returns: true if succeeded, false if error occurred.  When false is
+     * returned, @errp must be set.
+     */
+    bool (*save_postcopy_prepare)(QEMUFile *f, void *opaque, Error **errp);
+
     /**
      * @state_pending_estimate
      *
diff --git a/migration/migration.c b/migration/migration.c
index f97bb2777f..afb4dda19e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2721,6 +2721,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
         }
     }
 
+    if (!qemu_savevm_state_postcopy_prepare(ms->to_dst_file, errp)) {
+        return -1;
+    }
+
     trace_postcopy_start();
     bql_lock();
     trace_postcopy_start_set_run();
diff --git a/migration/savevm.c b/migration/savevm.c
index ce158c3512..23ef4c7dc9 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1523,6 +1523,39 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f)
     qemu_fflush(f);
 }
 
+bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp)
+{
+    SaveStateEntry *se;
+    bool ret;
+
+    QTAILQ_FOREACH(se, &savevm_state.handlers, entry) {
+        if (!se->ops || !se->ops->save_postcopy_prepare) {
+            continue;
+        }
+
+        if (se->ops->is_active) {
+            if (!se->ops->is_active(se->opaque)) {
+                continue;
+            }
+        }
+
+        trace_savevm_section_start(se->idstr, se->section_id);
+
+        save_section_header(f, se, QEMU_VM_SECTION_PART);
+        ret = se->ops->save_postcopy_prepare(f, se->opaque, errp);
+        save_section_footer(f, se);
+
+        trace_savevm_section_end(se->idstr, se->section_id, ret);
+
+        if (!ret) {
+            assert(*errp);
+            return false;
+        }
+    }
+
+    return true;
+}
+
 int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy)
 {
     int64_t start_ts_each, end_ts_each;
diff --git a/migration/savevm.h b/migration/savevm.h
index 138c39a7f9..2d5e9c7166 100644
--- a/migration/savevm.h
+++ b/migration/savevm.h
@@ -45,6 +45,7 @@ void qemu_savevm_state_pending_exact(uint64_t *must_precopy,
 void qemu_savevm_state_pending_estimate(uint64_t *must_precopy,
                                         uint64_t *can_postcopy);
 int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy);
+bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp);
 void qemu_savevm_send_ping(QEMUFile *f, uint32_t value);
 void qemu_savevm_send_open_return_path(QEMUFile *f);
 int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len);
-- 
2.48.1




* [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare()
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (5 preceding siblings ...)
  2025-03-18 12:38 ` [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler Prasad Pandit
@ 2025-03-18 12:38 ` Prasad Pandit
  2025-03-31 15:18   ` Fabiano Rosas
  2025-03-25  9:53 ` [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
  2025-03-31 20:54 ` Fabiano Rosas
  8 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-03-18 12:38 UTC (permalink / raw)
  To: qemu-devel; +Cc: peterx, farosas, berrange

From: Peter Xu <peterx@redhat.com>

Implement save_postcopy_prepare(), preparing for the enablement of both
multifd and postcopy.

Please see the rich comment in the code for the rationale.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
 migration/ram.c | 37 +++++++++++++++++++++++++++++++++++++
 1 file changed, 37 insertions(+)

v8:
- New patch

v7:
- https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t

diff --git a/migration/ram.c b/migration/ram.c
index 6fd88cbf2a..04fde7ba6b 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -4419,6 +4419,42 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
     return 0;
 }
 
+static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp)
+{
+    int ret;
+
+    if (migrate_multifd()) {
+        /*
+         * When multifd is enabled, source QEMU needs to make sure all the
+         * pages queued before postcopy starts to be flushed.
+         *
+         * Meanwhile, the load of these pages must happen before switching
+         * to postcopy.  It's because loading of guest pages (so far) in
+         * multifd recv threads is still non-atomic, so the load cannot
+         * happen with vCPUs running on destination side.
+         *
+         * This flush and sync will guarantee those pages loaded _before_
+         * postcopy starts on destination.  The rationale is that this happens
+         * before VM stops (and before source QEMU sends all the rest of
+         * the postcopy messages).  So when the destination QEMU received
+         * the postcopy messages, it must have received the sync message on
+         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
+         * RAM_SAVE_FLAG_EOS), and such message should have guaranteed all
+         * previous guest pages queued in the multifd channels to be
+         * completely loaded.
+         */
+        ret = multifd_ram_flush_and_sync(f);
+        if (ret < 0) {
+            error_setg(errp, "%s: multifd flush and sync failed", __func__);
+            return false;
+        }
+    }
+
+    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+
+    return true;
+}
+
 void postcopy_preempt_shutdown_file(MigrationState *s)
 {
     qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
@@ -4438,6 +4474,7 @@ static SaveVMHandlers savevm_ram_handlers = {
     .load_setup = ram_load_setup,
     .load_cleanup = ram_load_cleanup,
     .resume_prepare = ram_resume_prepare,
+    .save_postcopy_prepare = ram_save_postcopy_prepare,
 };
 
 static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
-- 
2.48.1




* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (6 preceding siblings ...)
  2025-03-18 12:38 ` [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare() Prasad Pandit
@ 2025-03-25  9:53 ` Prasad Pandit
  2025-03-27 14:35   ` Fabiano Rosas
  2025-03-31 20:54 ` Fabiano Rosas
  8 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-03-25  9:53 UTC (permalink / raw)
  To: qemu-devel, Fabiano Rosas; +Cc: peterx, berrange, Prasad Pandit

Hello Fabiano,

On Tue, 18 Mar 2025 at 18:10, Prasad Pandit <ppandit@redhat.com> wrote:
> * This series (v8) splits earlier patch-2 which enabled multifd and
>   postcopy options together into two separate patches. One modifies
>   the channel discovery in migration_ioc_process_incoming() function,
>   and second one enables the multifd and postcopy migration together.
>
>   It also adds the 'save_postcopy_prepare' savevm_state handler to
>   enable different sections to take an action just before the Postcopy
>   phase starts. Thank you Peter for these patches.
> ===
> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             152.66s   81 subtests passed
> ===
>

Ping! Did you have a chance to review this (v8) series? (Just checking.)

Thank you.
---
  - Prasad




* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-03-25  9:53 ` [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
@ 2025-03-27 14:35   ` Fabiano Rosas
  2025-03-27 16:01     ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-27 14:35 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> On Tue, 18 Mar 2025 at 18:10, Prasad Pandit <ppandit@redhat.com> wrote:
>> * This series (v8) splits earlier patch-2 which enabled multifd and
>>   postcopy options together into two separate patches. One modifies
>>   the channel discovery in migration_ioc_process_incoming() function,
>>   and second one enables the multifd and postcopy migration together.
>>
>>   It also adds the 'save_postcopy_prepare' savevm_state handler to
>>   enable different sections to take an action just before the Postcopy
>>   phase starts. Thank you Peter for these patches.
>> ===
>> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             152.66s   81 subtests passed
>> ===
>>
>
> Ping..! Did you have chance to review this (v8) series. (just checking)
>

I'll get to it soon. I need to send a PR for the recent SNP breakage and
also check Li Zhijian's RDMA series first.



* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-03-27 14:35   ` Fabiano Rosas
@ 2025-03-27 16:01     ` Prasad Pandit
  0 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-03-27 16:01 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

On Thu, 27 Mar 2025 at 20:05, Fabiano Rosas <farosas@suse.de> wrote:
> I'll get to it soon. I need to send a PR for the recent SNP breakage and
> also check Li Zhijian's RDMA series first.

* I see, okay. Thank you for the update, I appreciate it.

Thank you.
---
  - Prasad




* Re: [PATCH v8 2/7] migration: Refactor channel discovery mechanism
  2025-03-18 12:38 ` [PATCH v8 2/7] migration: Refactor channel discovery mechanism Prasad Pandit
@ 2025-03-31 15:01   ` Fabiano Rosas
  2025-04-03  7:01     ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-31 15:01 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> From: Prasad Pandit <pjp@fedoraproject.org>
>
> The various logical migration channels don't have a
> standardized way of advertising themselves and their
> connections may be seen out of order by the migration
> destination. When a new connection arrives, the incoming
> migration currently makes use of heuristics to determine
> which channel it belongs to.
>
> The next few patches will need to change how the multifd
> and postcopy capabilities interact and that affects the
> channel discovery heuristic.
>
> Refactor the channel discovery heuristic to make it less
> opaque and simplify the subsequent patches.
>
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  migration/migration.c | 124 +++++++++++++++++++++++-------------------
>  1 file changed, 69 insertions(+), 55 deletions(-)
>
> v8:
>  - Separate this patch out from earlier patch-2
>
> v7:
> - https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t
>
> diff --git a/migration/migration.c b/migration/migration.c
> index d46e776e24..f97bb2777f 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -95,6 +95,9 @@ enum mig_rp_message_type {
>      MIG_RP_MSG_MAX
>  };
>  
> +/* Migration channel types */
> +enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY };
> +
>  /* When we add fault tolerance, we could have several
>     migrations at once.  For now we don't need to add
>     dynamic creation of migration */
> @@ -985,28 +988,19 @@ void migration_fd_process_incoming(QEMUFile *f)
>      migration_incoming_process();
>  }
>  
> -/*
> - * Returns true when we want to start a new incoming migration process,
> - * false otherwise.
> - */
> -static bool migration_should_start_incoming(bool main_channel)
> +static bool migration_has_main_and_multifd_channels(void)
>  {
> -    /* Multifd doesn't start unless all channels are established */
> -    if (migrate_multifd()) {
> -        return migration_has_all_channels();
> +    MigrationIncomingState *mis = migration_incoming_get_current();
> +    if (!mis->from_src_file) {
> +        /* main channel not established */
> +        return false;
>      }
>  
> -    /* Preempt channel only starts when the main channel is created */
> -    if (migrate_postcopy_preempt()) {
> -        return main_channel;
> +    if (migrate_multifd() && !multifd_recv_all_channels_created()) {
> +        return false;
>      }
>  
> -    /*
> -     * For all the rest types of migration, we should only reach here when
> -     * it's the main channel that's being created, and we should always
> -     * proceed with this channel.
> -     */
> -    assert(main_channel);
> +    /* main and all multifd channels are established */
>      return true;
>  }
>  
> @@ -1015,59 +1009,84 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      Error *local_err = NULL;
>      QEMUFile *f;
> -    bool default_channel = true;
> +    uint8_t channel;
>      uint32_t channel_magic = 0;
>      int ret = 0;
>  
> -    if (migrate_multifd() && !migrate_mapped_ram() &&
> -        !migrate_postcopy_ram() &&
> -        qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
> -        /*
> -         * With multiple channels, it is possible that we receive channels
> -         * out of order on destination side, causing incorrect mapping of
> -         * source channels on destination side. Check channel MAGIC to
> -         * decide type of channel. Please note this is best effort, postcopy
> -         * preempt channel does not send any magic number so avoid it for
> -         * postcopy live migration. Also tls live migration already does
> -         * tls handshake while initializing main channel so with tls this
> -         * issue is not possible.
> -         */
> -        ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
> -                                          sizeof(channel_magic), errp);
> +    if (!migration_has_main_and_multifd_channels()) {
> +        if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
> +            /*
> +             * With multiple channels, it is possible that we receive channels
> +             * out of order on destination side, causing incorrect mapping of
> +             * source channels on destination side. Check channel MAGIC to
> +             * decide type of channel. Please note this is best effort,
> +             * postcopy preempt channel does not send any magic number so
> +             * avoid it for postcopy live migration. Also tls live migration
> +             * already does tls handshake while initializing main channel so
> +             * with tls this issue is not possible.
> +             */
> +            ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
> +                                              sizeof(channel_magic), errp);
> +            if (ret != 0) {
> +                return;
> +            }
>  
> -        if (ret != 0) {
> +            channel_magic = be32_to_cpu(channel_magic);
> +            if (channel_magic == QEMU_VM_FILE_MAGIC) {
> +                channel = CH_MAIN;
> +            } else if (channel_magic == MULTIFD_MAGIC) {
> +                channel = CH_MULTIFD;
> +            } else if (!mis->from_src_file &&
> +                        mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
> +                /* reconnect main channel for postcopy recovery */
> +                channel = CH_MAIN;
> +            } else {
> +                error_setg(errp, "unknown channel magic: %u", channel_magic);
> +                return;
> +            }
> +        } else if (mis->from_src_file && migrate_multifd()) {
> +            /*
> +             * Non-peekable channels like tls/file are processed as
> +             * multifd channels when multifd is enabled.
> +             */
> +            channel = CH_MULTIFD;
> +        } else if (!mis->from_src_file) {
> +            channel = CH_MAIN;
> +        } else {
> +            error_setg(errp, "non-peekable channel used without multifd");
>              return;
>          }
> -
> -        default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
> +    } else if (mis->from_src_file) {

This is redundant.

> +        channel = CH_POSTCOPY;
>      } else {
> -        default_channel = !mis->from_src_file;
> +        channel = CH_MAIN;

And this is impossible.

>      }
>  
>      if (multifd_recv_setup(errp) != 0) {
>          return;
>      }
>  
> -    if (default_channel) {
> +    if (channel == CH_MAIN) {
>          f = qemu_file_new_input(ioc);
>          migration_incoming_setup(f);

We should probably expand migration_incoming_setup() to make it clear
that mis->from_src_file is set at this point. And
assert(!mis->from_src_file). I can send a patch on top later.

> -    } else {
> +    } else if (channel == CH_MULTIFD) {
>          /* Multiple connections */
> -        assert(migration_needs_multiple_sockets());
>          if (migrate_multifd()) {

This should be an assert.

>              multifd_recv_new_channel(ioc, &local_err);
> -        } else {
> -            assert(migrate_postcopy_preempt());
> -            f = qemu_file_new_input(ioc);
> -            postcopy_preempt_new_channel(mis, f);
>          }
>          if (local_err) {
>              error_propagate(errp, local_err);
>              return;
>          }
> +    } else if (channel == CH_POSTCOPY) {
> +        assert(migrate_postcopy_preempt());
> +        assert(!mis->postcopy_qemufile_dst);
> +        f = qemu_file_new_input(ioc);
> +        postcopy_preempt_new_channel(mis, f);
> +        return;
>      }
>  
> -    if (migration_should_start_incoming(default_channel)) {
> +    if (migration_has_main_and_multifd_channels()) {

I think there's a bug here. Excluding multifd from the picture, if only
the main channel needs to be setup, then it's possible to start postcopy
recovery twice, once when the main channel appears and another time when
the preempt channel appears.

The previous code worked differently because it did:

if (migrate_postcopy_preempt()) {
    return main_channel;

which would return false when preempt arrived after main.

We could use migration_has_all_channels() instead, that would look more
logically correct, but it would also change the current behavior that
postcopy recovery can start before the preempt channel is in place. I'm
not even sure if that's actually part of the design of the feature.

>          /* If it's a recovery, we're done */
>          if (postcopy_try_recover()) {
>              return;
> @@ -1084,20 +1103,15 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
>   */
>  bool migration_has_all_channels(void)
>  {
> +    if (!migration_has_main_and_multifd_channels()) {
> +        return false;
> +    }
> +
>      MigrationIncomingState *mis = migration_incoming_get_current();
> -
> -    if (!mis->from_src_file) {
> +    if (migrate_postcopy_preempt() && !mis->postcopy_qemufile_dst) {
>          return false;
>      }
>  
> -    if (migrate_multifd()) {
> -        return multifd_recv_all_channels_created();
> -    }
> -
> -    if (migrate_postcopy_preempt()) {
> -        return mis->postcopy_qemufile_dst != NULL;
> -    }
> -
>      return true;
>  }


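The channel-discovery logic discussed in this review can be sketched as a standalone discriminator. This is a simplified, hypothetical model, not QEMU code: the real logic lives in migration_ioc_process_incoming(), and while the magic constants mirror QEMU's QEMU_VM_FILE_MAGIC (ASCII "QEVM") and MULTIFD_MAGIC, the function and its parameters are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY, CH_UNKNOWN };

#define QEMU_VM_FILE_MAGIC 0x5145564dU /* ASCII "QEVM" */
#define MULTIFD_MAGIC      0x11223344U

/*
 * Classify a peekable incoming connection from its magic value.
 * The postcopy preempt channel sends no magic, so an unrecognized
 * magic while the main channel is absent and migration is in the
 * POSTCOPY_PAUSED state is treated as the main channel reconnecting
 * for postcopy recovery.
 */
static int classify_channel(uint32_t magic, bool main_established,
                            bool postcopy_paused)
{
    if (magic == QEMU_VM_FILE_MAGIC) {
        return CH_MAIN;
    }
    if (magic == MULTIFD_MAGIC) {
        return CH_MULTIFD;
    }
    if (!main_established && postcopy_paused) {
        return CH_MAIN; /* main channel reconnecting for recovery */
    }
    return CH_UNKNOWN;
}
```

A non-peekable channel (tls/file) would bypass this and fall back to the from_src_file/migrate_multifd() checks shown in the patch.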

* Re: [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler
  2025-03-18 12:38 ` [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler Prasad Pandit
@ 2025-03-31 15:08   ` Fabiano Rosas
  2025-04-03  7:03     ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-31 15:08 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange

Prasad Pandit <ppandit@redhat.com> writes:

This patch and the next one need to come before 3/7.




* Re: [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare()
  2025-03-18 12:38 ` [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare() Prasad Pandit
@ 2025-03-31 15:18   ` Fabiano Rosas
  2025-04-03  7:21     ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-31 15:18 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange

Prasad Pandit <ppandit@redhat.com> writes:

> From: Peter Xu <peterx@redhat.com>
>
> Implement save_postcopy_prepare(), preparing for the enablement of both
> multifd and postcopy.
>
> Please see the rich comment for the rationale.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  migration/ram.c | 37 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 37 insertions(+)
>
> v8:
> - New patch
>
> v7:
> - https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t
>
> diff --git a/migration/ram.c b/migration/ram.c
> index 6fd88cbf2a..04fde7ba6b 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -4419,6 +4419,42 @@ static int ram_resume_prepare(MigrationState *s, void *opaque)
>      return 0;
>  }
>  
> +static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp)
> +{
> +    int ret;
> +
> +    if (migrate_multifd()) {
> +        /*
> +         * When multifd is enabled, source QEMU needs to make sure all the
> +         * pages queued before postcopy starts to be flushed.

s/to be/have been/

> +         *
> +         * Meanwhile, the load of these pages must happen before switching

s/Meanwhile,//

> +         * to postcopy.  It's because loading of guest pages (so far) in
> +         * multifd recv threads is still non-atomic, so the load cannot
> +         * happen with vCPUs running on destination side.
> +         *
> +         * This flush and sync will guarantee those pages loaded _before_

s/loaded/are loaded/

> +         * postcopy starts on destination. The rational is, this happens

s/rational/rationale/

> +         * before VM stops (and before source QEMU sends all the rest of
> +         * the postcopy messages).  So when the destination QEMU received
> +         * the postcopy messages, it must have received the sync message on
> +         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
> +         * RAM_SAVE_FLAG_EOS), and such message should have guaranteed all
> +         * previous guest pages queued in the multifd channels to be
> +         * completely loaded.
> +         */
> +        ret = multifd_ram_flush_and_sync(f);
> +        if (ret < 0) {
> +            error_setg(errp, "%s: multifd flush and sync failed", __func__);
> +            return false;
> +        }
> +    }
> +
> +    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
> +
> +    return true;
> +}
> +
>  void postcopy_preempt_shutdown_file(MigrationState *s)
>  {
>      qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
> @@ -4438,6 +4474,7 @@ static SaveVMHandlers savevm_ram_handlers = {
>      .load_setup = ram_load_setup,
>      .load_cleanup = ram_load_cleanup,
>      .resume_prepare = ram_resume_prepare,
> +    .save_postcopy_prepare = ram_save_postcopy_prepare,
>  };
>  
>  static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,


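The ordering guarantee that the comment in ram_save_postcopy_prepare() describes can be modelled in a few lines. This is a toy model, not QEMU code: events stand in for guest pages queued on multifd channels, the sync message on the main channel (RAM_SAVE_FLAG_MULTIFD_FLUSH or RAM_SAVE_FLAG_EOS), and the start of postcopy; all names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 16
enum ev { EV_PAGE, EV_SYNC, EV_POSTCOPY_START };

struct stream {
    enum ev ev[MAX_EVENTS];
    size_t n;
};

static void put(struct stream *s, enum ev e)
{
    assert(s->n < MAX_EVENTS);
    s->ev[s->n++] = e;
}

/* Drain queued pages, then emit the sync marker -- the shape of
 * multifd_ram_flush_and_sync() as described in the comment. */
static void flush_and_sync(struct stream *s, int queued)
{
    for (int i = 0; i < queued; i++) {
        put(s, EV_PAGE);
    }
    put(s, EV_SYNC);
}

/* Check the invariant: every page precedes the sync, and the sync
 * precedes the start of postcopy on the destination. */
static int ordering_ok(const struct stream *s)
{
    int seen_sync = 0, seen_start = 0;
    for (size_t i = 0; i < s->n; i++) {
        if (s->ev[i] == EV_PAGE && (seen_sync || seen_start)) {
            return 0; /* a page arrived after the sync: corruption risk */
        }
        if (s->ev[i] == EV_SYNC) {
            if (seen_start) {
                return 0;
            }
            seen_sync = 1;
        }
        if (s->ev[i] == EV_POSTCOPY_START) {
            seen_start = 1;
        }
    }
    return seen_sync && seen_start;
}
```

Calling flush_and_sync() before switching to postcopy keeps the invariant; skipping it is exactly the late-multifd-page hazard the series guards against.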

* Re: [PATCH v8 3/7] migration: enable multifd and postcopy together
  2025-03-18 12:38 ` [PATCH v8 3/7] migration: enable multifd and postcopy together Prasad Pandit
@ 2025-03-31 15:27   ` Fabiano Rosas
  2025-04-03 10:57     ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-31 15:27 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> From: Prasad Pandit <pjp@fedoraproject.org>
>
> Enable Multifd and Postcopy migration together.
> The migration_ioc_process_incoming() routine
> checks the magic value sent on each channel and
> helps to properly set up the multifd and postcopy
> channels.
>
> The Precopy and Multifd threads work during the
> initial guest RAM transfer. When migration moves
> to the Postcopy phase, the multifd threads cease
> to send data on multifd channels and Postcopy
> threads on the destination request/pull data from
> the source side.
>
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
>  migration/multifd-nocomp.c | 3 ++-
>  migration/multifd.c        | 7 +++++++
>  migration/options.c        | 5 -----
>  migration/ram.c            | 7 +++----
>  4 files changed, 12 insertions(+), 10 deletions(-)
>
> v8:
> - Separate this patch out from earlier patch-2.
>
> v7:
> - https://lore.kernel.org/qemu-devel/20250228121749.553184-1-ppandit@redhat.com/T/#t
>
> diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
> index ffe75256c9..02f8bf8ce8 100644
> --- a/migration/multifd-nocomp.c
> +++ b/migration/multifd-nocomp.c
> @@ -17,6 +17,7 @@
>  #include "migration-stats.h"
>  #include "multifd.h"
>  #include "options.h"
> +#include "migration.h"
>  #include "qapi/error.h"
>  #include "qemu/cutils.h"
>  #include "qemu/error-report.h"
> @@ -399,7 +400,7 @@ int multifd_ram_flush_and_sync(QEMUFile *f)
>      MultiFDSyncReq req;
>      int ret;
>  
> -    if (!migrate_multifd()) {
> +    if (!migrate_multifd() || migration_in_postcopy()) {
>          return 0;
>      }
>  
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 6139cabe44..074d16d07d 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -1379,6 +1379,13 @@ static void *multifd_recv_thread(void *opaque)
>          }
>  
>          if (has_data) {
> +            /*
> +             * multifd thread should not be active and receive data
> +             * when migration is in the Postcopy phase. Two threads
> +             * writing the same memory area could easily corrupt
> +             * the guest state.
> +             */
> +            assert(!migration_in_postcopy());
>              if (is_device_state) {
>                  assert(use_packets);
>                  ret = multifd_device_state_recv(p, &local_err);
> diff --git a/migration/options.c b/migration/options.c
> index b0ac2ea408..48aa6076de 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -491,11 +491,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
>              error_setg(errp, "Postcopy is not compatible with ignore-shared");
>              return false;
>          }
> -
> -        if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
> -            error_setg(errp, "Postcopy is not yet compatible with multifd");
> -            return false;
> -        }
>      }
>  
>      if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
> diff --git a/migration/ram.c b/migration/ram.c
> index 424df6d9f1..6fd88cbf2a 100644
> --- a/migration/ram.c
> +++ b/migration/ram.c
> @@ -1297,7 +1297,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>          pss->page = 0;
>          pss->block = QLIST_NEXT_RCU(pss->block, next);
>          if (!pss->block) {
> -            if (multifd_ram_sync_per_round()) {
> +            if (multifd_ram_sync_per_round() && !migration_in_postcopy()) {

I'd rather not put this check here. multifd_ram_flush_and_sync() will
already return 0 if in postcopy.

>                  QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
>                  int ret = multifd_ram_flush_and_sync(f);
>                  if (ret < 0) {
> @@ -1976,9 +1976,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
>          }
>      }
>  
> -    if (migrate_multifd()) {
> -        RAMBlock *block = pss->block;
> -        return ram_save_multifd_page(block, offset);
> +    if (migrate_multifd() && !migration_in_postcopy()) {
> +        return ram_save_multifd_page(pss->block, offset);
>      }
>  
>      return ram_save_page(rs, pss);


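The gating being discussed — multifd carries guest pages only during precopy, and the plain page path takes over once postcopy starts — amounts to a small decision table. A hypothetical sketch (function and enum names invented for illustration; in QEMU the checks are migrate_multifd() and migration_in_postcopy() inside ram_save_target_page()):

```c
#include <assert.h>
#include <stdbool.h>

enum path { PATH_MULTIFD, PATH_PRECOPY_PLAIN, PATH_POSTCOPY_PLAIN };

/* Which send path a dirty page takes: multifd only while multifd is
 * enabled and migration has not yet entered the postcopy phase. */
static enum path select_send_path(bool multifd_enabled, bool in_postcopy)
{
    if (multifd_enabled && !in_postcopy) {
        return PATH_MULTIFD;
    }
    return in_postcopy ? PATH_POSTCOPY_PLAIN : PATH_PRECOPY_PLAIN;
}
```

The assert(!migration_in_postcopy()) added to multifd_recv_thread() in the patch enforces the same table from the destination side: no multifd thread should ever touch guest memory once postcopy is running.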

* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (7 preceding siblings ...)
  2025-03-25  9:53 ` [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
@ 2025-03-31 20:54 ` Fabiano Rosas
  2025-04-03  7:24   ` Prasad Pandit
  8 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-03-31 20:54 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> From: Prasad Pandit <pjp@fedoraproject.org>
>
> Hello,
>
> * This series (v8) splits earlier patch-2 which enabled multifd and
>   postcopy options together into two separate patches. One modifies
>   the channel discovery in migration_ioc_process_incoming() function,
>   and second one enables the multifd and postcopy migration together.
>
>   It also adds the 'save_postcopy_prepare' savevm_state handler to
>   enable different sections to take an action just before the Postcopy
>   phase starts. Thank you Peter for these patches.
> ===
> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test                 OK             152.66s   81 subtests passed

The postcopy/multifd/plain test is still hanging from time to time. I
see a vmstate load function trying to access guest memory, and the
postcopy-listen thread has already finished and is waiting for that
qemu_loadvm_state() (frame #18) to return and set the
main_thread_load_event.

Thread 1 (Thread 0x7fbc4849df80 (LWP 7487) "qemu-system-x86"):
#0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:274
#1  0x0000560b135103aa in flatview_read_continue_step (attrs=..., buf=0x560b168a5930 "U\252\022\006\016\a1\300\271", len=9216, mr_addr=831488, l=0x7fbc465ff980, mr=0x560b166c5070) at ../system/physmem.c:3056
#2  0x0000560b1351042e in flatview_read_continue (fv=0x560b16c606a0, addr=831488, attrs=..., ptr=0x560b168a5930, len=9216, mr_addr=831488, l=9216, mr=0x560b166c5070) at ../system/physmem.c:3073
#3  0x0000560b13510533 in flatview_read (fv=0x560b16c606a0, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3103
#4  0x0000560b135105be in address_space_read_full (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3116
#5  0x0000560b135106e7 in address_space_rw (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3144
#6  0x0000560b13510848 in cpu_physical_memory_rw (addr=831488, buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3170
#7  0x0000560b1338f5a5 in cpu_physical_memory_read (addr=831488, buf=0x560b168a5930, len=9216) at qemu/include/exec/cpu-common.h:148
#8  0x0000560b1339063c in patch_hypercalls (s=0x560b168840c0) at ../hw/i386/vapic.c:547
#9  0x0000560b1339096d in vapic_prepare (s=0x560b168840c0) at ../hw/i386/vapic.c:629
#10 0x0000560b13390e8b in vapic_post_load (opaque=0x560b168840c0, version_id=1) at ../hw/i386/vapic.c:789
#11 0x0000560b135b4924 in vmstate_load_state (f=0x560b16c53400, vmsd=0x560b147c6cc0 <vmstate_vapic>, opaque=0x560b168840c0, version_id=1) at ../migration/vmstate.c:234
#12 0x0000560b132a15b8 in vmstate_load (f=0x560b16c53400, se=0x560b16893390) at ../migration/savevm.c:972
#13 0x0000560b132a4f28 in qemu_loadvm_section_start_full (f=0x560b16c53400, type=4 '\004') at ../migration/savevm.c:2746
#14 0x0000560b132a5ae8 in qemu_loadvm_state_main (f=0x560b16c53400, mis=0x560b16877f20) at ../migration/savevm.c:3058
#15 0x0000560b132a45d0 in loadvm_handle_cmd_packaged (mis=0x560b16877f20) at ../migration/savevm.c:2451
#16 0x0000560b132a4b36 in loadvm_process_command (f=0x560b168c3b60) at ../migration/savevm.c:2614
#17 0x0000560b132a5b96 in qemu_loadvm_state_main (f=0x560b168c3b60, mis=0x560b16877f20) at ../migration/savevm.c:3073
#18 0x0000560b132a5db7 in qemu_loadvm_state (f=0x560b168c3b60) at ../migration/savevm.c:3150
#19 0x0000560b13286271 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:892
#20 0x0000560b137cb6d4 in coroutine_trampoline (i0=377836416, i1=22027) at ../util/coroutine-ucontext.c:175
#21 0x00007fbc4786a79e in ??? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:103


Thread 10 (Thread 0x7fffce7fc700 (LWP 11778) "mig/dst/listen"):
#0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
#1  0x000055555614e33f in qemu_futex_wait (f=0x5555576f6fc0, val=4294967295) at qemu/include/qemu/futex.h:29
#2  0x000055555614e505 in qemu_event_wait (ev=0x5555576f6fc0) at ../util/qemu-thread-posix.c:464
#3  0x0000555555c44eb1 in postcopy_ram_listen_thread (opaque=0x5555576f6f20) at ../migration/savevm.c:2135
#4  0x000055555614e6b8 in qemu_thread_start (args=0x5555582c8480) at ../util/qemu-thread-posix.c:541
#5  0x00007ffff72626ea in start_thread (arg=0x7fffce7fc700) at pthread_create.c:477
#6  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Thread 9 (Thread 0x7fffceffd700 (LWP 11777) "mig/dst/fault"):
#0  0x00007ffff5314a89 in __GI___poll (fds=0x7fffc0000b60, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
#1  0x0000555555c3be3f in postcopy_ram_fault_thread (opaque=0x5555576f6f20) at ../migration/postcopy-ram.c:999
#2  0x000055555614e6b8 in qemu_thread_start (args=0x555557735be0) at ../util/qemu-thread-posix.c:541
#3  0x00007ffff72626ea in start_thread (arg=0x7fffceffd700) at pthread_create.c:477
#4  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Breaking with gdb and stepping through the memcpy code generates a
request for a page that's seemingly already in the receivedmap:

(gdb) x/i $pc
=> 0x7ffff5399d14 <__memcpy_evex_unaligned_erms+86>:    rep movsb %ds:(%rsi),%es:(%rdi)
(gdb) p/x $rsi
$1 = 0x7fffd68cc000
(gdb) si
postcopy_ram_fault_thread_request Request for HVA=0x7fffd68cc000 rb=pc.ram offset=0xcc000 pid=11754
// these are my printfs:
postcopy_request_page:
migrate_send_rp_req_pages: 
migrate_send_rp_req_pages: mutex
migrate_send_rp_req_pages: received

// gdb hangs here, it looks like the page wasn't populated?

I've had my share of postcopy for the day. Hopefully you'll be able to
figure out what the issue is.

- reproducer (2nd iter already hangs for me):

$ for i in $(seq 1 9999); do echo "$i ============="; \
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
--full -r /x86_64/migration/postcopy/multifd/plain || break ; done

- reproducer with traces and gdb:

$ for i in $(seq 1 9999); do echo "$i ============="; \
QTEST_TRACE="multifd_* -trace source_* -trace postcopy_* -trace savevm_* \
-trace loadvm_*" QTEST_QEMU_BINARY_DST='gdb --ex "handle SIGUSR1 \
noprint" --ex "run" --args ./qemu-system-x86_64' \
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
--full -r /x86_64/migration/postcopy/multifd/plain || break ; done



* Re: [PATCH v8 2/7] migration: Refactor channel discovery mechanism
  2025-03-31 15:01   ` Fabiano Rosas
@ 2025-04-03  7:01     ` Prasad Pandit
  2025-04-03 12:59       ` Fabiano Rosas
  0 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-04-03  7:01 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Hello Fabiano,

On Mon, 31 Mar 2025 at 20:31, Fabiano Rosas <farosas@suse.de> wrote:
> > +    } else if (mis->from_src_file) {
> This is redundant.

* This was to ensure (double check) that when the Postcopy connection
comes in, the main channel is established. Also, a couple of versions
back the migration qtest was failing without this check; the qtests do
work now without it. I'll remove it if we must.

> > +        channel = CH_POSTCOPY;
> >      } else {
> > -        default_channel = !mis->from_src_file;
> > +        channel = CH_MAIN;
>
> And this is impossible.

    -> https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#m18b6bf30e877f9eafaa67bba6a209b47782f6eac

* Yes, but a couple of revisions back you suggested adding it, saying
the CH_MAIN assignment at the top was doing some heavy lifting and
that it is clearer this way.

> We should probably expand migration_incoming_setup() to make it clear
> that mis->from_src_file is set at this point. And
> assert(!mis->from_src_file). I can send a patch on top later.

* migration_incoming_setup() uses the QEMUFile object only when
mis->from_src_file is not set. I'm wondering if we really need an
assert(!mis->from_src_file) check, because control reaches here only
when channel == CH_MAIN, and channel is set to CH_MAIN only when
mis->from_src_file is NULL.


> > -    } else {
> > +    } else if (channel == CH_MULTIFD) {
> >          /* Multiple connections */
> > -        assert(migration_needs_multiple_sockets());
> >          if (migrate_multifd()) {
>
> This should be an assert.

Same here: 'channel' is set to CH_MULTIFD only when migrate_multifd()
is enabled. Do we need another assert(migrate_multifd()) check?

> > +    } else if (channel == CH_POSTCOPY) {
> > +        assert(migrate_postcopy_preempt());
> > +        assert(!mis->postcopy_qemufile_dst);
> > +        f = qemu_file_new_input(ioc);
> > +        postcopy_preempt_new_channel(mis, f);
> > +        return;
> >      }
> >
> > -    if (migration_should_start_incoming(default_channel)) {
> > +    if (migration_has_main_and_multifd_channels()) {
>
> I think there's a bug here. Excluding multifd from the picture, if only
> the main channel needs to be setup, then it's possible to start postcopy
> recovery twice, once when the main channel appears and another time when
> the preempt channel appears.

* When the preempt channel appears, 'channel' is set to CH_POSTCOPY,
so it shall 'return' before reaching here, right?

===
        } else if (!mis->from_src_file &&
                        mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
                /* reconnect main channel for postcopy recovery */
                channel = CH_MAIN;
        } else {
===
* When 'main' channel connection arrives for postcopy recovery,
'channel' shall be set to CH_MAIN.

> The previous code worked differently because it did:
>
> if (migrate_postcopy_preempt()) {
>     return main_channel;
>
> which would return false when preempt arrived after main.

* Yes.

> We could use migration_has_all_channels() instead, that would look more
> logically correct, but it would also change the current behavior that
> postcopy recovery can start before the preempt channel is in place. I'm
> not even sure if that's actually part of the design of the feature.

* Not sure if we need this.
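Stepping back from the diff: the classify-then-dispatch shape being
debated can be modeled as a small self-contained sketch (the CH_* names
follow the patch; every flag and helper here is an illustrative
stand-in, not QEMU code):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Simplified model of the classify-then-dispatch shape under
 * discussion. The CH_* names follow the patch; the flags and helpers
 * are illustrative stand-ins, not QEMU code.
 */
enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY };

static bool multifd_enabled;  /* stand-in for migrate_multifd() */
static bool preempt_enabled;  /* stand-in for migrate_postcopy_preempt() */

/* Mirror of the channel discovery logic: CH_MULTIFD is only ever
 * chosen when multifd is enabled, CH_MAIN only without a main file. */
static int classify_channel(bool have_main)
{
    if (!have_main) {
        return CH_MAIN;
    } else if (multifd_enabled) {
        return CH_MULTIFD;
    }
    return CH_POSTCOPY;
}

/* The asserts below are provably redundant given classify_channel(),
 * which is the crux of the question; they survive as documentation of
 * the invariant and as a tripwire if the classifier ever changes. */
static void dispatch(int channel, bool have_main)
{
    if (channel == CH_MAIN) {
        assert(!have_main);
    } else if (channel == CH_MULTIFD) {
        assert(multifd_enabled);
    } else {
        assert(preempt_enabled);
    }
}
```

Given the classifier, the asserts can never fire today; keeping them is
a question of documenting invariants versus carrying checks that are
always true, which is exactly the trade-off above.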

Thank you.
---
  - Prasad



^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler
  2025-03-31 15:08   ` Fabiano Rosas
@ 2025-04-03  7:03     ` Prasad Pandit
  0 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-04-03  7:03 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange

On Mon, 31 Mar 2025 at 20:39, Fabiano Rosas <farosas@suse.de> wrote:
> This patch and the next one need to come before 3/7.

* Okay.

Thank you.
---
  - Prasad




* Re: [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare()
  2025-03-31 15:18   ` Fabiano Rosas
@ 2025-04-03  7:21     ` Prasad Pandit
  2025-04-03 13:07       ` Fabiano Rosas
  0 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-04-03  7:21 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange

Hi,

On Mon, 31 Mar 2025 at 20:49, Fabiano Rosas <farosas@suse.de> wrote:
> > +static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp)
> > +{
> > +    int ret;
> > +
> > +    if (migrate_multifd()) {
> > +        /*
> > +         * When multifd is enabled, source QEMU needs to make sure all the
> > +         * pages queued before postcopy starts to be flushed.
>
> s/to be/have been/
>
> > +         *
> > +         * Meanwhile, the load of these pages must happen before switching
>
> s/Meanwhile,//
>
> > +         * to postcopy.  It's because loading of guest pages (so far) in
> > +         * multifd recv threads is still non-atomic, so the load cannot
> > +         * happen with vCPUs running on destination side.
> > +         *
> > +         * This flush and sync will guarantee those pages loaded _before_
>
> s/loaded/are loaded/
>
> > +         * postcopy starts on destination. The rational is, this happens
>
> s/rational/rationale/
>
> > +         * before VM stops (and before source QEMU sends all the rest of
> > +         * the postcopy messages).  So when the destination QEMU received
> > +         * the postcopy messages, it must have received the sync message on
> > +         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
> > +         * RAM_SAVE_FLAG_EOS), and such message should have guaranteed all
> > +         * previous guest pages queued in the multifd channels to be
> > +         * completely loaded.
> > +         */

* I'll include the above suggested corrections. I'm thinking it might
help more to have such an explanatory comment at the definition of the
multifd_ram_flush_and_sync() routine, because looking at that function
it is not clear how 'MULTIFD_SYNC_ALL' is used. It sets
'->pending_sync' to MULTIFD_SYNC_ALL, and when '->pending_sync' is
set this way, multifd_send_thread() writes 'MULTIFD_FLAG_SYNC' on each
multifd channel. At the destination this 'MULTIFD_FLAG_SYNC' flag is
then used to sync the main and multifd_recv threads.

...wdyt?
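For readers without the code at hand, the handshake described above can
be sketched as a toy model (the names pending_sync and
MULTIFD_FLAG_SYNC follow the real code; the threads, semaphores and
wire format are elided, so treat this purely as an illustration):

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy model of the flush-and-sync handshake. Illustration only: the
 * real code spans multifd.c on both source and destination sides.
 */
#define N_CHANNELS 4
#define MULTIFD_FLAG_SYNC 1

struct channel {
    bool pending_sync;  /* set by multifd_ram_flush_and_sync() */
    int flags_on_wire;  /* what the send thread has emitted */
};

static struct channel ch[N_CHANNELS];

/* Source: mark every channel as needing a sync. */
static void flush_and_sync(void)
{
    for (int i = 0; i < N_CHANNELS; i++) {
        ch[i].pending_sync = true;
    }
}

/* Source: one iteration of a multifd send thread noticing the mark
 * and putting MULTIFD_FLAG_SYNC on its channel. */
static void send_thread_iteration(struct channel *c)
{
    if (c->pending_sync) {
        c->flags_on_wire |= MULTIFD_FLAG_SYNC;
        c->pending_sync = false;
    }
}

/* Destination view: the main thread may proceed only once every recv
 * thread has seen the sync flag, i.e. all earlier pages are loaded. */
static bool all_channels_synced(void)
{
    for (int i = 0; i < N_CHANNELS; i++) {
        if (!(ch[i].flags_on_wire & MULTIFD_FLAG_SYNC)) {
            return false;
        }
    }
    return true;
}
```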

Thank you.
---
  - Prasad




* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-03-31 20:54 ` Fabiano Rosas
@ 2025-04-03  7:24   ` Prasad Pandit
  2025-04-03 13:11     ` Fabiano Rosas
  0 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-04-03  7:24 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

On Tue, 1 Apr 2025 at 02:24, Fabiano Rosas <farosas@suse.de> wrote:
> The postcopy/multifd/plain test is still hanging from time to time. I
> see a vmstate load function trying to access guest memory and the
> postcopy-listen thread already finished, waiting for that
> qemu_loadvm_state() (frame #18) to return and set the
> main_thread_load_event.
>
> Thread 1 (Thread 0x7fbc4849df80 (LWP 7487) "qemu-system-x86"):
> #0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:274
> #1  0x0000560b135103aa in flatview_read_continue_step (attrs=..., buf=0x560b168a5930 "U\252\022\006\016\a1\300\271", len=9216, mr_addr=831488, l=0x7fbc465ff980, mr=0x560b166c5070) at ../system/physmem.c:3056
> #2  0x0000560b1351042e in flatview_read_continue (fv=0x560b16c606a0, addr=831488, attrs=..., ptr=0x560b168a5930, len=9216, mr_addr=831488, l=9216, mr=0x560b166c5070) at ../system/physmem.c:3073
> #3  0x0000560b13510533 in flatview_read (fv=0x560b16c606a0, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3103
> #4  0x0000560b135105be in address_space_read_full (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3116
> #5  0x0000560b135106e7 in address_space_rw (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3144
> #6  0x0000560b13510848 in cpu_physical_memory_rw (addr=831488, buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3170
> #7  0x0000560b1338f5a5 in cpu_physical_memory_read (addr=831488, buf=0x560b168a5930, len=9216) at qemu/include/exec/cpu-common.h:148
> #8  0x0000560b1339063c in patch_hypercalls (s=0x560b168840c0) at ../hw/i386/vapic.c:547
> #9  0x0000560b1339096d in vapic_prepare (s=0x560b168840c0) at ../hw/i386/vapic.c:629
> #10 0x0000560b13390e8b in vapic_post_load (opaque=0x560b168840c0, version_id=1) at ../hw/i386/vapic.c:789
> #11 0x0000560b135b4924 in vmstate_load_state (f=0x560b16c53400, vmsd=0x560b147c6cc0 <vmstate_vapic>, opaque=0x560b168840c0, version_id=1) at ../migration/vmstate.c:234
> #12 0x0000560b132a15b8 in vmstate_load (f=0x560b16c53400, se=0x560b16893390) at ../migration/savevm.c:972
> #13 0x0000560b132a4f28 in qemu_loadvm_section_start_full (f=0x560b16c53400, type=4 '\004') at ../migration/savevm.c:2746
> #14 0x0000560b132a5ae8 in qemu_loadvm_state_main (f=0x560b16c53400, mis=0x560b16877f20) at ../migration/savevm.c:3058
> #15 0x0000560b132a45d0 in loadvm_handle_cmd_packaged (mis=0x560b16877f20) at ../migration/savevm.c:2451
> #16 0x0000560b132a4b36 in loadvm_process_command (f=0x560b168c3b60) at ../migration/savevm.c:2614
> #17 0x0000560b132a5b96 in qemu_loadvm_state_main (f=0x560b168c3b60, mis=0x560b16877f20) at ../migration/savevm.c:3073
> #18 0x0000560b132a5db7 in qemu_loadvm_state (f=0x560b168c3b60) at ../migration/savevm.c:3150
> #19 0x0000560b13286271 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:892
> #20 0x0000560b137cb6d4 in coroutine_trampoline (i0=377836416, i1=22027) at ../util/coroutine-ucontext.c:175
> #21 0x00007fbc4786a79e in ??? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:103
>
>
> Thread 10 (Thread 0x7fffce7fc700 (LWP 11778) "mig/dst/listen"):
> #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
> #1  0x000055555614e33f in qemu_futex_wait (f=0x5555576f6fc0, val=4294967295) at qemu/include/qemu/futex.h:29
> #2  0x000055555614e505 in qemu_event_wait (ev=0x5555576f6fc0) at ../util/qemu-thread-posix.c:464
> #3  0x0000555555c44eb1 in postcopy_ram_listen_thread (opaque=0x5555576f6f20) at ../migration/savevm.c:2135
> #4  0x000055555614e6b8 in qemu_thread_start (args=0x5555582c8480) at ../util/qemu-thread-posix.c:541
> #5  0x00007ffff72626ea in start_thread (arg=0x7fffce7fc700) at pthread_create.c:477
> #6  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Thread 9 (Thread 0x7fffceffd700 (LWP 11777) "mig/dst/fault"):
> #0  0x00007ffff5314a89 in __GI___poll (fds=0x7fffc0000b60, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
> #1  0x0000555555c3be3f in postcopy_ram_fault_thread (opaque=0x5555576f6f20) at ../migration/postcopy-ram.c:999
> #2  0x000055555614e6b8 in qemu_thread_start (args=0x555557735be0) at ../util/qemu-thread-posix.c:541
> #3  0x00007ffff72626ea in start_thread (arg=0x7fffceffd700) at pthread_create.c:477
> #4  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>
> Breaking with gdb and stepping through the memcpy code generates a
> request for a page that's seemingly already in the receivedmap:
>
> (gdb) x/i $pc
> => 0x7ffff5399d14 <__memcpy_evex_unaligned_erms+86>:    rep movsb %ds:(%rsi),%es:(%rdi)
> (gdb) p/x $rsi
> $1 = 0x7fffd68cc000
> (gdb) si
> postcopy_ram_fault_thread_request Request for HVA=0x7fffd68cc000 rb=pc.ram offset=0xcc000 pid=11754
> // these are my printfs:
> postcopy_request_page:
> migrate_send_rp_req_pages:
> migrate_send_rp_req_pages: mutex
> migrate_send_rp_req_pages: received
>
> // gdb hangs here, it looks like the page wasn't populated?
>
> I've had my share of postcopy for the day. Hopefully you'll be able to
> figure out what the issue is.
>
> - reproducer (2nd iter already hangs for me):
>
> $ for i in $(seq 1 9999); do echo "$i ============="; \
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
> --full -r /x86_64/migration/postcopy/multifd/plain || break ; done
>
> - reproducer with traces and gdb:
>
> $ for i in $(seq 1 9999); do echo "$i ============="; \
> QTEST_TRACE="multifd_* -trace source_* -trace postcopy_* -trace savevm_* \
> -trace loadvm_*" QTEST_QEMU_BINARY_DST='gdb --ex "handle SIGUSR1 \
> noprint" --ex "run" --args ./qemu-system-x86_64' \
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
> --full -r /x86_64/migration/postcopy/multifd/plain || break ; done

* Thank you for the reproducer and traces. I'll try to check more and
see if I'm able to reproduce it on my side.

Thank you.
---
  - Prasad




* Re: [PATCH v8 3/7] migration: enable multifd and postcopy together
  2025-03-31 15:27   ` Fabiano Rosas
@ 2025-04-03 10:57     ` Prasad Pandit
  2025-04-03 13:03       ` Fabiano Rosas
  0 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-04-03 10:57 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

On Mon, 31 Mar 2025 at 20:57, Fabiano Rosas <farosas@suse.de> wrote:
> > --- a/migration/ram.c
> > +++ b/migration/ram.c
> > @@ -1297,7 +1297,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
> >          pss->page = 0;
> >          pss->block = QLIST_NEXT_RCU(pss->block, next);
> >          if (!pss->block) {
> > -            if (multifd_ram_sync_per_round()) {
> > +            if (multifd_ram_sync_per_round() && !migration_in_postcopy()) {
>
> I'd rather not put this check here. multifd_ram_flush_and_sync() will
> already return 0 if in postcopy.

* IIRC, without it migration did not finish, or crashed. I don't
recall whether it was on Fedora or RHEL systems.

Thank you.
---
  - Prasad




* Re: [PATCH v8 2/7] migration: Refactor channel discovery mechanism
  2025-04-03  7:01     ` Prasad Pandit
@ 2025-04-03 12:59       ` Fabiano Rosas
  2025-04-04  9:48         ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-04-03 12:59 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> On Mon, 31 Mar 2025 at 20:31, Fabiano Rosas <farosas@suse.de> wrote:
>> > +    } else if (mis->from_src_file) {
>> This is redundant.
>
> * This was to ensure (double check) that when the Postcopy connection
> comes in, the main channel is established. Also a couple of versions
> back migration qtest was failing without this check. Nonetheless,
> qtests do work now without this check. I'll remove it if we must.
>

Yes, there's no point. If we already have main and multifd channels,
what's left must be postcopy.

>> > +        channel = CH_POSTCOPY;
>> >      } else {
>> > -        default_channel = !mis->from_src_file;
>> > +        channel = CH_MAIN;
>>
>> And this is impossible.
>
>     -> https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#m18b6bf30e877f9eafaa67bba6a209b47782f6eac
>
> * Yes, but a couple of revisions back you suggested adding it, saying
> CH_MAIN assignment at the top was doing some heavy lifting and it's
> clearer this way.
>

Well, but don't add it blindly if it doesn't make sense. The point was
to not end the conditional at 'else if' because that makes the reader
have to go look around the code to see what was already assigned. Here
we want just a plain:

else {
    channel = CH_POSTCOPY; 
}

>> We should probably expand migration_incoming_setup() to make it clear
>> that mis->from_src_file is set at this point. And
>> assert(!mis->from_src_file). I can send a patch on top later.
>
> * migration_incoming_setup uses the QEMUFile object only when
> mis->from_src_file is not set. I'm wondering if we really need an
> assert(!mis->from_src_file) check? Because it'll reach here only when
> channel == CH_MAIN and channel is set to CH_MAIN only when
> mis->from_src_file is NULL.
>
>

Given the:

if (!mis->from_src_file) {

I think someone (back in 2017) thought it was possible to reach there
with from_src_file already set. I don't know whether that applied to
this path. In any case, for this function I believe the correct thing
is to assert, because we shouldn't have two channels arriving as main.

>> > -    } else {
>> > +    } else if (channel == CH_MULTIFD) {
>> >          /* Multiple connections */
>> > -        assert(migration_needs_multiple_sockets());
>> >          if (migrate_multifd()) {
>>
>> This should be an assert.
>
> Same, 'channel' is set to CH_MULTIFD only when migrate_multifd() is
> enabled. Do we need another assert(migrate_multifd()) check?
>

Maybe not, but we definitely cannot just ignore it if it happens, and
we also should not have an empty check that is always known to be
true. So either assert or remove the if entirely.

>> > +    } else if (channel == CH_POSTCOPY) {
>> > +        assert(migrate_postcopy_preempt());
>> > +        assert(!mis->postcopy_qemufile_dst);
>> > +        f = qemu_file_new_input(ioc);
>> > +        postcopy_preempt_new_channel(mis, f);
>> > +        return;
>> >      }
>> >
>> > -    if (migration_should_start_incoming(default_channel)) {
>> > +    if (migration_has_main_and_multifd_channels()) {
>>
>> I think there's a bug here. Excluding multifd from the picture, if only
>> the main channel needs to be setup, then it's possible to start postcopy
>> recovery twice, once when the main channel appears and another time when
>> the preempt channel appears.
>
> * When the preempt channel appears 'channel' is set to CH_POSTCOPY, so
> it shall 'return' before reaching here, right?
>

You're right, I missed the return statement.

> ===
>         } else if (!mis->from_src_file &&
>                         mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
>                 /* reconnect main channel for postcopy recovery */
>                 channel = CH_MAIN;
>         } else {
> ===
> * When 'main' channel connection arrives for postcopy recovery,
> 'channel' shall be set to CH_MAIN.
>
>> The previous code worked differently because it did:
>>
>> if (migrate_postcopy_preempt()) {
>>     return main_channel;
>>
>> which would return false when preempt arrived after main.
>
> * Yes.
>
>> We could use migration_has_all_channels() instead, that would look more
>> logically correct, but it would also change the current behavior that
>> postcopy recovery can start before the preempt channel is in place. I'm
>> not even sure if that's actually part of the design of the feature.
>
> * Not sure if we need this.
>
> Thank you.
> ---
>   - Prasad



* Re: [PATCH v8 3/7] migration: enable multifd and postcopy together
  2025-04-03 10:57     ` Prasad Pandit
@ 2025-04-03 13:03       ` Fabiano Rosas
  0 siblings, 0 replies; 30+ messages in thread
From: Fabiano Rosas @ 2025-04-03 13:03 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> On Mon, 31 Mar 2025 at 20:57, Fabiano Rosas <farosas@suse.de> wrote:
>> > --- a/migration/ram.c
>> > +++ b/migration/ram.c
>> > @@ -1297,7 +1297,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
>> >          pss->page = 0;
>> >          pss->block = QLIST_NEXT_RCU(pss->block, next);
>> >          if (!pss->block) {
>> > -            if (multifd_ram_sync_per_round()) {
>> > +            if (multifd_ram_sync_per_round() && !migration_in_postcopy()) {
>>
>> I'd rather not put this check here. multifd_ram_flush_and_sync() will
>> already return 0 if in postcopy.
>
> * IIRC, without it migration did not finish or was crashing. I don't
> recall if it was on Fedora or RHEL systems.
>

Maybe you hit the issue I reported? Let's not duplicate the in_postcopy
check unless it makes logical sense. If it crashes, we must figure out
why.

> Thank you.
> ---
>   - Prasad



* Re: [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare()
  2025-04-03  7:21     ` Prasad Pandit
@ 2025-04-03 13:07       ` Fabiano Rosas
  2025-04-04  9:50         ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-04-03 13:07 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, peterx, berrange

Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Mon, 31 Mar 2025 at 20:49, Fabiano Rosas <farosas@suse.de> wrote:
>> > +static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp)
>> > +{
>> > +    int ret;
>> > +
>> > +    if (migrate_multifd()) {
>> > +        /*
>> > +         * When multifd is enabled, source QEMU needs to make sure all the
>> > +         * pages queued before postcopy starts to be flushed.
>>
>> s/to be/have been/
>>
>> > +         *
>> > +         * Meanwhile, the load of these pages must happen before switching
>>
>> s/Meanwhile,//
>>
>> > +         * to postcopy.  It's because loading of guest pages (so far) in
>> > +         * multifd recv threads is still non-atomic, so the load cannot
>> > +         * happen with vCPUs running on destination side.
>> > +         *
>> > +         * This flush and sync will guarantee those pages loaded _before_
>>
>> s/loaded/are loaded/
>>
>> > +         * postcopy starts on destination. The rational is, this happens
>>
>> s/rational/rationale/
>>
>> > +         * before VM stops (and before source QEMU sends all the rest of
>> > +         * the postcopy messages).  So when the destination QEMU received
>> > +         * the postcopy messages, it must have received the sync message on
>> > +         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
>> > +         * RAM_SAVE_FLAG_EOS), and such message should have guaranteed all
>> > +         * previous guest pages queued in the multifd channels to be
>> > +         * completely loaded.
>> > +         */
>
> * I'll include the above suggested corrections. I'm thinking it might
> help more to have such an explanatory comment at the definition of the
> multifd_ram_flush_and_sync() routine. Because looking at that function
> it is not clear how 'MULTIFD_SYNC_ALL' is used. It sets the
> '->pending_sync' to MULTIFD_SYNC_ALL. And when '->pending_sync' is
> set this way, multifd_send_thread() writes 'MULTIFD_FLAG_SYNC' on each
> multifd channel. At the destination this 'MULTIFD_FLAG_SYNC' flag is
> then used to sync main and multifd_recv threads.
>
> ...wdyt?

The code assumes some understanding of the multifd sync in general. It
doesn't help that we don't have a high level documentation for that
(yet). If you think the comments at the MultiFDSyncReq are not enough,
feel free to propose a separate patch adding documentation to
multifd_ram_flush_and_sync().

>
> Thank you.
> ---
>   - Prasad



* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-04-03  7:24   ` Prasad Pandit
@ 2025-04-03 13:11     ` Fabiano Rosas
  2025-04-10 12:22       ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-04-03 13:11 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> On Tue, 1 Apr 2025 at 02:24, Fabiano Rosas <farosas@suse.de> wrote:
>> The postcopy/multifd/plain test is still hanging from time to time. I
>> see a vmstate load function trying to access guest memory and the
>> postcopy-listen thread already finished, waiting for that
>> qemu_loadvm_state() (frame #18) to return and set the
>> main_thread_load_event.
>>
>> Thread 1 (Thread 0x7fbc4849df80 (LWP 7487) "qemu-system-x86"):
>> #0  __memcpy_evex_unaligned_erms () at ../sysdeps/x86_64/multiarch/memmove-vec-unaligned-erms.S:274
>> #1  0x0000560b135103aa in flatview_read_continue_step (attrs=..., buf=0x560b168a5930 "U\252\022\006\016\a1\300\271", len=9216, mr_addr=831488, l=0x7fbc465ff980, mr=0x560b166c5070) at ../system/physmem.c:3056
>> #2  0x0000560b1351042e in flatview_read_continue (fv=0x560b16c606a0, addr=831488, attrs=..., ptr=0x560b168a5930, len=9216, mr_addr=831488, l=9216, mr=0x560b166c5070) at ../system/physmem.c:3073
>> #3  0x0000560b13510533 in flatview_read (fv=0x560b16c606a0, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3103
>> #4  0x0000560b135105be in address_space_read_full (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216) at ../system/physmem.c:3116
>> #5  0x0000560b135106e7 in address_space_rw (as=0x560b14970fc0 <address_space_memory>, addr=831488, attrs=..., buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3144
>> #6  0x0000560b13510848 in cpu_physical_memory_rw (addr=831488, buf=0x560b168a5930, len=9216, is_write=false) at ../system/physmem.c:3170
>> #7  0x0000560b1338f5a5 in cpu_physical_memory_read (addr=831488, buf=0x560b168a5930, len=9216) at qemu/include/exec/cpu-common.h:148
>> #8  0x0000560b1339063c in patch_hypercalls (s=0x560b168840c0) at ../hw/i386/vapic.c:547
>> #9  0x0000560b1339096d in vapic_prepare (s=0x560b168840c0) at ../hw/i386/vapic.c:629
>> #10 0x0000560b13390e8b in vapic_post_load (opaque=0x560b168840c0, version_id=1) at ../hw/i386/vapic.c:789
>> #11 0x0000560b135b4924 in vmstate_load_state (f=0x560b16c53400, vmsd=0x560b147c6cc0 <vmstate_vapic>, opaque=0x560b168840c0, version_id=1) at ../migration/vmstate.c:234
>> #12 0x0000560b132a15b8 in vmstate_load (f=0x560b16c53400, se=0x560b16893390) at ../migration/savevm.c:972
>> #13 0x0000560b132a4f28 in qemu_loadvm_section_start_full (f=0x560b16c53400, type=4 '\004') at ../migration/savevm.c:2746
>> #14 0x0000560b132a5ae8 in qemu_loadvm_state_main (f=0x560b16c53400, mis=0x560b16877f20) at ../migration/savevm.c:3058
>> #15 0x0000560b132a45d0 in loadvm_handle_cmd_packaged (mis=0x560b16877f20) at ../migration/savevm.c:2451
>> #16 0x0000560b132a4b36 in loadvm_process_command (f=0x560b168c3b60) at ../migration/savevm.c:2614
>> #17 0x0000560b132a5b96 in qemu_loadvm_state_main (f=0x560b168c3b60, mis=0x560b16877f20) at ../migration/savevm.c:3073
>> #18 0x0000560b132a5db7 in qemu_loadvm_state (f=0x560b168c3b60) at ../migration/savevm.c:3150
>> #19 0x0000560b13286271 in process_incoming_migration_co (opaque=0x0) at ../migration/migration.c:892
>> #20 0x0000560b137cb6d4 in coroutine_trampoline (i0=377836416, i1=22027) at ../util/coroutine-ucontext.c:175
>> #21 0x00007fbc4786a79e in ??? () at ../sysdeps/unix/sysv/linux/x86_64/__start_context.S:103
>>
>>
>> Thread 10 (Thread 0x7fffce7fc700 (LWP 11778) "mig/dst/listen"):
>> #0  syscall () at ../sysdeps/unix/sysv/linux/x86_64/syscall.S:38
>> #1  0x000055555614e33f in qemu_futex_wait (f=0x5555576f6fc0, val=4294967295) at qemu/include/qemu/futex.h:29
>> #2  0x000055555614e505 in qemu_event_wait (ev=0x5555576f6fc0) at ../util/qemu-thread-posix.c:464
>> #3  0x0000555555c44eb1 in postcopy_ram_listen_thread (opaque=0x5555576f6f20) at ../migration/savevm.c:2135
>> #4  0x000055555614e6b8 in qemu_thread_start (args=0x5555582c8480) at ../util/qemu-thread-posix.c:541
>> #5  0x00007ffff72626ea in start_thread (arg=0x7fffce7fc700) at pthread_create.c:477
>> #6  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>>
>> Thread 9 (Thread 0x7fffceffd700 (LWP 11777) "mig/dst/fault"):
>> #0  0x00007ffff5314a89 in __GI___poll (fds=0x7fffc0000b60, nfds=2, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
>> #1  0x0000555555c3be3f in postcopy_ram_fault_thread (opaque=0x5555576f6f20) at ../migration/postcopy-ram.c:999
>> #2  0x000055555614e6b8 in qemu_thread_start (args=0x555557735be0) at ../util/qemu-thread-posix.c:541
>> #3  0x00007ffff72626ea in start_thread (arg=0x7fffceffd700) at pthread_create.c:477
>> #4  0x00007ffff532158f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
>>
>> Breaking with gdb and stepping through the memcpy code generates a
>> request for a page that's seemingly already in the receivedmap:
>>
>> (gdb) x/i $pc
>> => 0x7ffff5399d14 <__memcpy_evex_unaligned_erms+86>:    rep movsb %ds:(%rsi),%es:(%rdi)
>> (gdb) p/x $rsi
>> $1 = 0x7fffd68cc000
>> (gdb) si
>> postcopy_ram_fault_thread_request Request for HVA=0x7fffd68cc000 rb=pc.ram offset=0xcc000 pid=11754
>> // these are my printfs:
>> postcopy_request_page:
>> migrate_send_rp_req_pages:
>> migrate_send_rp_req_pages: mutex
>> migrate_send_rp_req_pages: received
>>
>> // gdb hangs here, it looks like the page wasn't populated?
>>
>> I've had my share of postcopy for the day. Hopefully you'll be able to
>> figure out what the issue is.
>>
>> - reproducer (2nd iter already hangs for me):
>>
>> $ for i in $(seq 1 9999); do echo "$i ============="; \
>> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
>> --full -r /x86_64/migration/postcopy/multifd/plain || break ; done
>>
>> - reproducer with traces and gdb:
>>
>> $ for i in $(seq 1 9999); do echo "$i ============="; \
>> QTEST_TRACE="multifd_* -trace source_* -trace postcopy_* -trace savevm_* \
>> -trace loadvm_*" QTEST_QEMU_BINARY_DST='gdb --ex "handle SIGUSR1 \
>> noprint" --ex "run" --args ./qemu-system-x86_64' \
>> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test \
>> --full -r /x86_64/migration/postcopy/multifd/plain || break ; done
>
> * Thank you for the reproducer and traces. I'll try to check more and
> see if I'm able to reproduce it on my side.
>

Thanks. I cannot merge this series until that issue is resolved. If it
reproduces on my machine there's a high chance that it will break CI at
some point and then it'll be a nightmare to debug. This has happened
many times before with multifd.

> Thank you.
> ---
>   - Prasad



* Re: [PATCH v8 2/7] migration: Refactor channel discovery mechanism
  2025-04-03 12:59       ` Fabiano Rosas
@ 2025-04-04  9:48         ` Prasad Pandit
  0 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-04-04  9:48 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

On Thu, 3 Apr 2025 at 18:29, Fabiano Rosas <farosas@suse.de> wrote:
> Yes, there's no point. if we already have main and multifd channels,
> what's left must be postcopy.

* Okay.

> Well, but don't add it blindly if it doesn't make sense.

* Hmmn, okay. When I say/do things that seem reasonable to me, I'm
told it's much easier to just do things than to argue over them. When
I do something because it's a minor change, I'm admonished for it.
Life is tricky. :)

> The point was to not end the conditional at 'else if' because that makes the reader
> have to go look around the code to see what was already assigned. Here
> we want just a plain:
>
> else {
>     channel = CH_POSTCOPY;
> }

* Okay.

> > * migration_incoming_setup uses the QEMUFile object only when
> > mis->from_src_file is not set. I'm wondering if we really need an
> > assert(!mis->from_src_file) check? Because it'll reach here only when
> > channel == CH_MAIN and channel is set to CH_MAIN only when
> > mis->from_src_file is NULL.
>
> Given the:
> if (!mis->from_src_file) {
>
> I think someone (back in 2017) thought it was possible to reach there
> with from_src_file already set. I don't know whether that applied to
> this path. In any case, for this function I believe the correct thing
> is to assert, because we shouldn't have two channels arriving as main.

* Okay.

* If we add assert(!mis->from_src_file), then the
if (!mis->from_src_file) check in migration_incoming_setup() is not
needed.
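A minimal stand-in illustrates the point (the names below are
illustrative placeholders, not the real QEMU functions): once the
caller asserts the precondition, the callee's guard can become an
unconditional assignment.

```c
#include <assert.h>

/*
 * Illustrative stand-ins only, not the real QEMU functions. The point:
 * once the caller asserts !from_src_file, the callee's guard becomes
 * redundant and can be a plain assignment.
 */
static void *from_src_file;

/* Stand-in for migration_incoming_setup(): no guard needed here. */
static void incoming_setup(void *f)
{
    from_src_file = f;
}

/* Stand-in for the CH_MAIN path, with the assert proposed in this
 * thread enforcing that only one channel ever arrives as main. */
static void process_main_channel(void *f)
{
    assert(!from_src_file);
    incoming_setup(f);
}
```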

> Maybe not, but we definitely cannot just ignore if it happens and we
> also should not have an empty check that is always known to be true. So
> either assert or remove the if entirely.

* Okay. I'll add an assert(migrate_multifd()). That should be
consistent with the other assert(!mis->from_src_file),
assert(migrate_postcopy_preempt()) and
assert(!mis->postcopy_qemufile_dst) calls in there.


Thank you.
---
  - Prasad




* Re: [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare()
  2025-04-03 13:07       ` Fabiano Rosas
@ 2025-04-04  9:50         ` Prasad Pandit
  0 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-04-04  9:50 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange

On Thu, 3 Apr 2025 at 18:37, Fabiano Rosas <farosas@suse.de> wrote:
> The code assumes some understanding of the multifd sync in general. It
> doesn't help that we don't have a high level documentation for that
> (yet). If you think the comments at the MultiFDSyncReq are not enough,
> feel free to propose a separate patch adding documentation to
> multifd_ram_flush_and_sync().

Okay, will check. Thank you.
---
  - Prasad




* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-04-03 13:11     ` Fabiano Rosas
@ 2025-04-10 12:22       ` Prasad Pandit
  2025-04-10 20:18         ` Fabiano Rosas
  0 siblings, 1 reply; 30+ messages in thread
From: Prasad Pandit @ 2025-04-10 12:22 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Hello Fabiano,

On Thu, 3 Apr 2025 at 18:41, Fabiano Rosas <farosas@suse.de> wrote:
> Prasad Pandit <ppandit@redhat.com> writes:
> > * Thank you for the reproducer and traces. I'll try to check more and
> > see if I'm able to reproduce it on my side.
>
> Thanks. I cannot merge this series until that issue is resolved. If it
> reproduces on my machine there's a high chance that it will break CI at
> some point and then it'll be a nightmare to debug. This has happened
> many times before with multifd.

===
qemu/build)$ for i in $(seq 1 9999); do echo "$i ====";
QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test
--full -r '/x86_64/migration/postcopy/multifd/plain' || break; done |
tee /tmp/migration-test.out | awk -e '/====/ { printf ("%s ", $_) };
/slow test/ { printf("%s\n", $_); }'

Host-1]
...
9980 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9981 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9982 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.42 secs
9983 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.56 secs
9984 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9985 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.43 secs
9986 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9987 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9988 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.46 secs
9989 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9990 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9991 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9992 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9993 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9994 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.41 secs
9995 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.42 secs
9996 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.58 secs
9997 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9998 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9999 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
--------
Iter: 9999, low: 1.35, high: 1.73, avg: 1.47 secs


Host-2]
...
9980 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9981 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.69 secs
9982 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.41 secs
9983 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9984 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9985 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9986 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9987 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9988 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9989 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9990 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.37 secs
9991 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9992 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9993 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9994 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9995 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.45 secs
9996 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9997 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9998 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.47 secs
9999 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
--------
Iter: 9999, low: 1.34, high: 1.82, avg: 1.48 secs


Host-3]
...
9980 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.50 secs
9981 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.55 secs
9982 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9983 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9984 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9985 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.52 secs
9986 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9987 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.52 secs
9988 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9989 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9990 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.51 secs
9991 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.50 secs
9992 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9993 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.50 secs
9994 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.53 secs
9995 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.49 secs
9996 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.48 secs
9997 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
9998 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.44 secs
9999 ==== # slow test /x86_64/migration/postcopy/multifd/plain
executed in 1.54 secs
--------
Iter: 9999, low: 1.31, high: 2.49, avg: 1.48
===

* I tried to reproduce the hang issue with and without -traces across
3 different machines but am unable to reproduce it on my side.

* Going through the source and the back trace you provided, you said
gdb shows the postcopy_ram_fault_thread() function blocked in the
poll() call, waiting for a missing page.
   - But by this time, the postcopy_ram_listen thread is already
preparing to clean up and exit
       - That means postcopy migration is finished/ending
   - ie. postcopy migration is ending without (or before) migrating
all the RAM pages from the source side?

In postcopy mode:
    * Is there a way to log the pages (#numbers) that are sent from the
source side?
    * And log the pages (#numbers) that are received on the receive side?

* That way we might be able to check/confirm the pages which were not
received or not processed properly.
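
(For reference, QEMU's trace backend should be able to log these; a
possible invocation, assuming the postcopy event names in this build
match the patterns in migration/trace-events:)

```shell
# List the available postcopy trace points first (names vary by version):
#   qemu-system-x86_64 -trace help | grep postcopy
# Then run both source and destination with the matching events enabled:
qemu-system-x86_64 ... \
    -trace 'postcopy_*' \
    -D /tmp/qemu-migration-trace.log
```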

* Can we connect the faulting/missing (HVA=0x7fffd68cc000)
address/page in postcopy_ram_fault_thread() with the memcpy that the
main thread seems to be executing via vapic_post_load()? ie. the main
thread and postcopy_ram_fault_thread() above could be doing unrelated
things.

* Other than this, I've revised the patch-set as suggested. How do we
proceed further?

Thank you.
---
  - Prasad




* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-04-10 12:22       ` Prasad Pandit
@ 2025-04-10 20:18         ` Fabiano Rosas
  2025-04-11  7:25           ` Prasad Pandit
  0 siblings, 1 reply; 30+ messages in thread
From: Fabiano Rosas @ 2025-04-10 20:18 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
> On Thu, 3 Apr 2025 at 18:41, Fabiano Rosas <farosas@suse.de> wrote:
>> Prasad Pandit <ppandit@redhat.com> writes:
>> > * Thank you for the reproducer and traces. I'll try to check more and
>> > see if I'm able to reproduce it on my side.
>>
>> Thanks. I cannot merge this series until that issue is resolved. If it
>> reproduces on my machine there's a high chance that it will break CI at
>> some point and then it'll be a nightmare to debug. This has happened
>> many times before with multifd.
>
> ===
> qemu/build)$ for i in $(seq 1 9999); do echo "$i ====";
> QTEST_QEMU_BINARY=./qemu-system-x86_64 ./tests/qtest/migration-test
> --full -r '/x86_64/migration/postcopy/multifd/plain' || break; done |
> tee /tmp/migration-test.out | awk -e '/====/ { printf ("%s ", $_) };
> /slow test/ { printf("%s\n", $_); }'
>
> Host-1]
> [...per-iteration timings trimmed...]
> Iter: 9999, low: 1.35, high: 1.73, avg: 1.47 secs
>
> Host-2]
> [...per-iteration timings trimmed...]
> Iter: 9999, low: 1.34, high: 1.82, avg: 1.48 secs
>
> Host-3]
> [...per-iteration timings trimmed...]
> Iter: 9999, low: 1.31, high: 2.49, avg: 1.48
> ===
>
> * I tried to reproduce the hang issue with and without -traces across
> 3 different machines but am unable to reproduce it on my side.
>
> * Going through the source and the back trace you provided, you said
> gdb shows the postcopy_ram_fault_thread() function blocked in the
> poll() call, waiting for a missing page.
>    - But by this time, the postcopy_ram_listen thread is already
> preparing to clean up and exit
>        - That means postcopy migration is finished/ending
>    - ie. postcopy migration is ending without (or before) migrating
> all the RAM pages from the source side?
>

That's what it looks like. It could be some error condition that is not
being propagated properly. The thread hits an error and exits without
informing the rest of migration.

> In postcopy mode:
>     * Is there a way to log the pages (#numbers) that are sent from the
> source side?
>     * And log the pages (#numbers) that are received on the receive side?
>
> * That way we might be able to check/confirm the pages which were not
> received or not processed properly.

Some combination of the postcopy traces should give you that. Sorry,
Peter Xu really is the expert on postcopy, I just tag along.

>
> * Can we connect the faulting/missing (HVA=0x7fffd68cc000)
> address/page in postcopy_ram_fault_thread() with the memcpy that the
> main thread seems to be executing via vapic_post_load()? ie. the main
> thread and postcopy_ram_fault_thread() above could be doing unrelated
> things.
>

The snippet I posted shows that it's the same page:

(gdb) x/i $pc
=> 0x7ffff5399d14 <__memcpy_evex_unaligned_erms+86>:    rep movsb %ds:(%rsi),%es:(%rdi)
(gdb) p/x $rsi
$1 = 0x7fffd68cc000

> * Other than this, I've revised the patch-set as suggested. How do we
> proceed further?

Send your next version and I'll set some time aside to debug this.

heads-up: I'll be off from 2025/04/18 until 2025/05/05. Peter should be
already back in the meantime.

>
> Thank you.
> ---
>   - Prasad



* Re: [PATCH v8 0/7] Allow to enable multifd and postcopy migration together
  2025-04-10 20:18         ` Fabiano Rosas
@ 2025-04-11  7:25           ` Prasad Pandit
  0 siblings, 0 replies; 30+ messages in thread
From: Prasad Pandit @ 2025-04-11  7:25 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Hi,

On Fri, 11 Apr 2025 at 01:48, Fabiano Rosas <farosas@suse.de> wrote:
> That's what it looks like. It could be some error condition that is not
> being propagated properly. The thread hits an error and exits without
> informing the rest of migration.

* The gdb(1) hang in postcopy_ram_fault_thread() is not conclusive. I
tried to set the following break-points:

    (gdb) break postcopy-ram.c:998 - poll_result = poll(pfd, pfd_len,
-1 /* Wait forever */);
    (gdb) break postcopy-ram.c:1057 - rb = qemu_ram_block_from_host(...);

  gdb(1) hangs for both of them, so there might be another reason for
it. Live migration also stalls with it.
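
(When it does stall, a full non-interactive thread dump may tell more
than break-points; a sketch, where the pgrep pattern for the
destination QEMU is a guess:)

```shell
# Attach to the hung destination QEMU and dump every thread's backtrace
# without stopping at an interactive prompt.
gdb -p "$(pgrep -f 'qemu-system-x86_64 .* -incoming')" -batch \
    -ex 'set pagination off' \
    -ex 'thread apply all bt'
```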

> Some combination of the postcopy traces should give you that. Sorry,
> Peter Xu really is the expert on postcopy, I just tag along.

* I see. Maybe it could be logged with the --migration-debug=<level> option.

> The snippet I posted shows that it's the same page:
>
> (gdb) x/i $pc
> => 0x7ffff5399d14 <__memcpy_evex_unaligned_erms+86>:    rep movsb %ds:(%rsi),%es:(%rdi)
> (gdb) p/x $rsi
> $1 = 0x7fffd68cc000
>
===
>> Thread 1 (Thread 0x7fbc4849df80 (LWP 7487) "qemu-system-x86"):
...
>> Thread 10 (Thread 0x7fffce7fc700 (LWP 11778) "mig/dst/listen"):
...
>> Thread 9 (Thread 0x7fffceffd700 (LWP 11777) "mig/dst/fault"):
#0  0x00007ffff5314a89 in __GI___poll (fds=0x7fffc0000b60, nfds=2,
timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:29
...
postcopy_ram_fault_thread_request Request for HVA=0x7fffd68cc000
rb=pc.ram offset=0xcc000 pid=11754
===

* Looking at the above data, it seems the missing page fault occurred
in thread 11754; it may not be the memcpy(3) in
thread 1 (pid/tid=7487) that triggered the fault.
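
(Quick check that the trace values above are self-consistent; the
pc.ram host base below is derived from the trace, not observed:)

```python
hva = 0x7fffd68cc000           # faulting HVA from the trace above
offset = 0xcc000               # pc.ram offset from the same trace line
host_base = hva - offset       # implied host mapping address of pc.ram
page_index = offset // 0x1000  # 4 KiB target page size assumed
print(hex(host_base), page_index)  # → 0x7fffd6800000 204
```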

* Secondly, if the 'mig/dst/fault' thread is waiting at the poll(2)
call, then the fault notification has not yet arrived on the
mis->userfault_fd or mis->userfault_event_fd descriptors. So the
"Request for HVA=0x7fffd..." logged via
postcopy_ram_fault_thread_request() could be an already-served request.


> Send your next version and I'll set some time aside to debug this.
>
> heads-up: I'll be off from 2025/04/18 until 2025/05/05. Peter should be
> already back in the meantime.

* Okay, I'll send the next version.

Thank you.
---
  - Prasad




end of thread, other threads:[~2025-04-11  7:26 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-18 12:38 [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 1/7] migration/multifd: move macros to multifd header Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 2/7] migration: Refactor channel discovery mechanism Prasad Pandit
2025-03-31 15:01   ` Fabiano Rosas
2025-04-03  7:01     ` Prasad Pandit
2025-04-03 12:59       ` Fabiano Rosas
2025-04-04  9:48         ` Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 3/7] migration: enable multifd and postcopy together Prasad Pandit
2025-03-31 15:27   ` Fabiano Rosas
2025-04-03 10:57     ` Prasad Pandit
2025-04-03 13:03       ` Fabiano Rosas
2025-03-18 12:38 ` [PATCH v8 4/7] tests/qtest/migration: consolidate set capabilities Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 5/7] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 6/7] migration: Add save_postcopy_prepare() savevm handler Prasad Pandit
2025-03-31 15:08   ` Fabiano Rosas
2025-04-03  7:03     ` Prasad Pandit
2025-03-18 12:38 ` [PATCH v8 7/7] migration/ram: Implement save_postcopy_prepare() Prasad Pandit
2025-03-31 15:18   ` Fabiano Rosas
2025-04-03  7:21     ` Prasad Pandit
2025-04-03 13:07       ` Fabiano Rosas
2025-04-04  9:50         ` Prasad Pandit
2025-03-25  9:53 ` [PATCH v8 0/7] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-03-27 14:35   ` Fabiano Rosas
2025-03-27 16:01     ` Prasad Pandit
2025-03-31 20:54 ` Fabiano Rosas
2025-04-03  7:24   ` Prasad Pandit
2025-04-03 13:11     ` Fabiano Rosas
2025-04-10 12:22       ` Prasad Pandit
2025-04-10 20:18         ` Fabiano Rosas
2025-04-11  7:25           ` Prasad Pandit
