* [PATCH v3 0/4] Allow to enable multifd and postcopy migration together
@ 2025-01-21 13:10 Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 1/4] migration/multifd: move macros to multifd header Prasad Pandit
` (4 more replies)
0 siblings, 5 replies; 13+ messages in thread
From: Prasad Pandit @ 2025-01-21 13:10 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit
From: Prasad Pandit <pjp@fedoraproject.org>
Hello,
* Currently Multifd and Postcopy migration can not be used together.
QEMU shows "Postcopy is not yet compatible with multifd" message.
When migrating guests with large (100's GB) RAM, Multifd threads
help to accelerate migration, but inability to use it with the
Postcopy mode delays guest start up on the destination side.
* This patch series allows to enable both Multifd and Postcopy
migration together. Precopy and Multifd threads work during
the initial guest (RAM) transfer. When migration moves to the
Postcopy phase, Multifd threads are restrained and the Postcopy
threads start to request pages from the source side.
* This series removes magic value (4-bytes) introduced in the
previous series for the Postcopy channel. And refactoring of
the 'ram_save_target_page' function is made independent of
the multifd & postcopy change.
* This series passes all existing 'tests/qtest/migration/*' test
cases and adds a new one to enable multifd channels with postcopy
migration.
v2: https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/T/#u
v1: https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u
v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u
Thank you.
---
Prasad Pandit (4):
migration/multifd: move macros to multifd header
migration: refactor ram_save_target_page functions
migration: enable multifd and postcopy together
tests/qtest/migration: add postcopy test with multifd
migration/migration.c | 106 +++++++++++++++----------
migration/multifd-nocomp.c | 3 +-
migration/multifd.c | 5 --
migration/multifd.h | 5 ++
migration/options.c | 5 --
migration/ram.c | 69 +++++-----------
tests/qtest/migration/framework.c | 8 ++
tests/qtest/migration/framework.h | 3 +
tests/qtest/migration/postcopy-tests.c | 10 +++
9 files changed, 112 insertions(+), 102 deletions(-)
--
2.48.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v3 1/4] migration/multifd: move macros to multifd header
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
@ 2025-01-21 13:10 ` Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 2/4] migration: refactor ram_save_target_page functions Prasad Pandit
` (3 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Prasad Pandit @ 2025-01-21 13:10 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit
From: Prasad Pandit <pjp@fedoraproject.org>
Move MULTIFD_ macros to the header file so that
they are accessible from other source files.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
migration/multifd.c | 5 -----
migration/multifd.h | 5 +++++
2 files changed, 5 insertions(+), 5 deletions(-)
v2:
- https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/
diff --git a/migration/multifd.c b/migration/multifd.c
index ab73d6d984..97ce831775 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -33,11 +33,6 @@
#include "io/channel-socket.h"
#include "yank_functions.h"
-/* Multiple fd's */
-
-#define MULTIFD_MAGIC 0x11223344U
-#define MULTIFD_VERSION 1
-
typedef struct {
uint32_t magic;
uint32_t version;
diff --git a/migration/multifd.h b/migration/multifd.h
index bd785b9873..10149af654 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -49,6 +49,11 @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset);
bool multifd_recv(void);
MultiFDRecvData *multifd_get_recv_data(void);
+/* Multiple fd's */
+
+#define MULTIFD_MAGIC 0x11223344U
+#define MULTIFD_VERSION 1
+
/* Multifd Compression flags */
#define MULTIFD_FLAG_SYNC (1 << 0)
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 2/4] migration: refactor ram_save_target_page functions
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 1/4] migration/multifd: move macros to multifd header Prasad Pandit
@ 2025-01-21 13:10 ` Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 3/4] migration: enable multifd and postcopy together Prasad Pandit
` (2 subsequent siblings)
4 siblings, 0 replies; 13+ messages in thread
From: Prasad Pandit @ 2025-01-21 13:10 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit
From: Prasad Pandit <pjp@fedoraproject.org>
Refactor ram_save_target_page legacy and multifd
functions into one. Other than simplifying it,
it frees 'migration_ops' object from usage, so it
is expunged.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
migration/ram.c | 67 +++++++++++++------------------------------------
1 file changed, 17 insertions(+), 50 deletions(-)
v2:
- https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/
diff --git a/migration/ram.c b/migration/ram.c
index ce28328141..f2326788de 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -446,13 +446,6 @@ void ram_transferred_add(uint64_t bytes)
}
}
-struct MigrationOps {
- int (*ram_save_target_page)(RAMState *rs, PageSearchStatus *pss);
-};
-typedef struct MigrationOps MigrationOps;
-
-MigrationOps *migration_ops;
-
static int ram_save_host_page_urgent(PageSearchStatus *pss);
/* NOTE: page is the PFN not real ram_addr_t. */
@@ -1958,55 +1951,36 @@ int ram_save_queue_pages(const char *rbname, ram_addr_t start, ram_addr_t len,
}
/**
- * ram_save_target_page_legacy: save one target page
- *
- * Returns the number of pages written
+ * ram_save_target_page: save one target page to the precopy thread
+ * OR to multifd workers.
*
* @rs: current RAM state
* @pss: data about the page we want to send
*/
-static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss)
+static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
{
ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
int res;
+ if (!migrate_multifd()
+ || migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
+ if (save_zero_page(rs, pss, offset)) {
+ return 1;
+ }
+ }
+
+ if (migrate_multifd()) {
+ RAMBlock *block = pss->block;
+ return ram_save_multifd_page(block, offset);
+ }
+
if (control_save_page(pss, offset, &res)) {
return res;
}
- if (save_zero_page(rs, pss, offset)) {
- return 1;
- }
-
return ram_save_page(rs, pss);
}
-/**
- * ram_save_target_page_multifd: send one target page to multifd workers
- *
- * Returns 1 if the page was queued, -1 otherwise.
- *
- * @rs: current RAM state
- * @pss: data about the page we want to send
- */
-static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss)
-{
- RAMBlock *block = pss->block;
- ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;
-
- /*
- * While using multifd live migration, we still need to handle zero
- * page checking on the migration main thread.
- */
- if (migrate_zero_page_detection() == ZERO_PAGE_DETECTION_LEGACY) {
- if (save_zero_page(rs, pss, offset)) {
- return 1;
- }
- }
-
- return ram_save_multifd_page(block, offset);
-}
-
/* Should be called before sending a host page */
static void pss_host_page_prepare(PageSearchStatus *pss)
{
@@ -2093,7 +2067,7 @@ static int ram_save_host_page_urgent(PageSearchStatus *pss)
if (page_dirty) {
/* Be strict to return code; it must be 1, or what else? */
- if (migration_ops->ram_save_target_page(rs, pss) != 1) {
+ if (ram_save_target_page(rs, pss) != 1) {
error_report_once("%s: ram_save_target_page failed", __func__);
ret = -1;
goto out;
@@ -2162,7 +2136,7 @@ static int ram_save_host_page(RAMState *rs, PageSearchStatus *pss)
if (preempt_active) {
qemu_mutex_unlock(&rs->bitmap_mutex);
}
- tmppages = migration_ops->ram_save_target_page(rs, pss);
+ tmppages = ram_save_target_page(rs, pss);
if (tmppages >= 0) {
pages += tmppages;
/*
@@ -2360,8 +2334,6 @@ static void ram_save_cleanup(void *opaque)
xbzrle_cleanup();
multifd_ram_save_cleanup();
ram_state_cleanup(rsp);
- g_free(migration_ops);
- migration_ops = NULL;
}
static void ram_state_reset(RAMState *rs)
@@ -3027,13 +2999,8 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
return ret;
}
- migration_ops = g_malloc0(sizeof(MigrationOps));
-
if (migrate_multifd()) {
multifd_ram_save_setup();
- migration_ops->ram_save_target_page = ram_save_target_page_multifd;
- } else {
- migration_ops->ram_save_target_page = ram_save_target_page_legacy;
}
/*
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 3/4] migration: enable multifd and postcopy together
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 1/4] migration/multifd: move macros to multifd header Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 2/4] migration: refactor ram_save_target_page functions Prasad Pandit
@ 2025-01-21 13:10 ` Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd Prasad Pandit
2025-01-21 15:50 ` [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Peter Xu
4 siblings, 0 replies; 13+ messages in thread
From: Prasad Pandit @ 2025-01-21 13:10 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit
From: Prasad Pandit <pjp@fedoraproject.org>
Enable Multifd and Postcopy migration together.
The migration_ioc_process_incoming() routine
checks magic value sent on each channel and
helps to properly setup multifd and postcopy
channels.
The Precopy and Multifd threads work during the
initial guest RAM transfer. When migration moves
to the Postcopy phase, the multifd threads are
restrained and Postcopy threads on the destination
request/pull data from the source side.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
migration/migration.c | 106 +++++++++++++++++++++++--------------
migration/multifd-nocomp.c | 3 +-
migration/options.c | 5 --
migration/ram.c | 4 +-
4 files changed, 70 insertions(+), 48 deletions(-)
v2: Minor changes in migration_ioc_process_incoming() to pass test cases
- https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/
diff --git a/migration/migration.c b/migration/migration.c
index 2d1da917c7..a280722e9e 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -92,6 +92,9 @@ enum mig_rp_message_type {
MIG_RP_MSG_MAX
};
+/* Migration channel types */
+enum { CH_DEFAULT, CH_MULTIFD, CH_POSTCOPY };
+
/* When we add fault tolerance, we could have several
migrations at once. For now we don't need to add
dynamic creation of migration */
@@ -929,26 +932,33 @@ void migration_fd_process_incoming(QEMUFile *f)
/*
* Returns true when we want to start a new incoming migration process,
* false otherwise.
+ *
+ * All the required channels must be in place before a new incoming
+ * migration process starts.
+ * - Multifd enabled:
+ * The main channel and the multifd channels are required.
+ * - Multifd/Postcopy disabled:
+ * The main channel is required.
+ * - Postcopy enabled:
+ * We don't want to start a new incoming migration when
+ * the postcopy channel is created. Because it is created
+ * towards the end of the precopy migration.
+ *
*/
-static bool migration_should_start_incoming(bool main_channel)
+static bool migration_should_start_incoming(uint8_t channel)
{
- /* Multifd doesn't start unless all channels are established */
- if (migrate_multifd()) {
- return migration_has_all_channels();
- }
+ bool ret = false;
+
+ if (channel != CH_POSTCOPY) {
+ MigrationIncomingState *mis = migration_incoming_get_current();
+ ret = mis->from_src_file ? true : false;
- /* Preempt channel only starts when the main channel is created */
- if (migrate_postcopy_preempt()) {
- return main_channel;
+ if (ret && migrate_multifd()) {
+ ret = multifd_recv_all_channels_created();
+ }
}
- /*
- * For all the rest types of migration, we should only reach here when
- * it's the main channel that's being created, and we should always
- * proceed with this channel.
- */
- assert(main_channel);
- return true;
+ return ret;
}
void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
@@ -956,13 +966,12 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
MigrationIncomingState *mis = migration_incoming_get_current();
Error *local_err = NULL;
QEMUFile *f;
- bool default_channel = true;
uint32_t channel_magic = 0;
+ uint8_t channel = CH_DEFAULT;
int ret = 0;
- if (migrate_multifd() && !migrate_mapped_ram() &&
- !migrate_postcopy_ram() &&
- qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
+ if (!migration_should_start_incoming(channel)) {
+ if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) {
/*
* With multiple channels, it is possible that we receive channels
* out of order on destination side, causing incorrect mapping of
@@ -973,42 +982,58 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
* tls handshake while initializing main channel so with tls this
* issue is not possible.
*/
- ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
- sizeof(channel_magic), errp);
+ ret = migration_channel_read_peek(ioc, (void *)&channel_magic,
+ sizeof(channel_magic), errp);
+ if (ret != 0) {
+ return;
+ }
- if (ret != 0) {
- return;
+ if (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)) {
+ channel = CH_DEFAULT;
+ } else if (channel_magic == cpu_to_be32(MULTIFD_MAGIC)) {
+ channel = CH_MULTIFD;
+ } else if (!mis->from_src_file
+ && mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) {
+ /* reconnect default channel for postcopy recovery */
+ channel = CH_DEFAULT;
+ } else {
+ error_report("%s: could not identify channel, unknown magic: %u",
+ __func__, channel_magic);
+ return;
+ }
+ } else if (mis->from_src_file
+ && (!strcmp(ioc->name, "migration-tls-incoming")
+ || !strcmp(ioc->name, "migration-file-incoming"))) {
+ channel = CH_MULTIFD;
}
-
- default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC));
- } else {
- default_channel = !mis->from_src_file;
+ } else if (mis->from_src_file) { // && migrate_postcopy_preempt()
+ channel = CH_POSTCOPY;
}
if (multifd_recv_setup(errp) != 0) {
return;
}
- if (default_channel) {
+ if (channel == CH_DEFAULT) {
f = qemu_file_new_input(ioc);
migration_incoming_setup(f);
- } else {
+ } else if (channel == CH_MULTIFD) {
/* Multiple connections */
- assert(migration_needs_multiple_sockets());
if (migrate_multifd()) {
multifd_recv_new_channel(ioc, &local_err);
- } else {
- assert(migrate_postcopy_preempt());
- f = qemu_file_new_input(ioc);
- postcopy_preempt_new_channel(mis, f);
}
if (local_err) {
error_propagate(errp, local_err);
return;
}
+ } else if (channel == CH_POSTCOPY) {
+ assert(migrate_postcopy_preempt());
+ assert(!mis->postcopy_qemufile_dst);
+ f = qemu_file_new_input(ioc);
+ postcopy_preempt_new_channel(mis, f);
}
- if (migration_should_start_incoming(default_channel)) {
+ if (migration_should_start_incoming(channel)) {
/* If it's a recovery, we're done */
if (postcopy_try_recover()) {
return;
@@ -1025,21 +1050,22 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp)
*/
bool migration_has_all_channels(void)
{
+ bool ret = false;
MigrationIncomingState *mis = migration_incoming_get_current();
if (!mis->from_src_file) {
- return false;
+ return ret;
}
if (migrate_multifd()) {
- return multifd_recv_all_channels_created();
+ ret = multifd_recv_all_channels_created();
}
- if (migrate_postcopy_preempt()) {
- return mis->postcopy_qemufile_dst != NULL;
+ if (ret && migrate_postcopy_preempt()) {
+ ret = mis->postcopy_qemufile_dst != NULL;
}
- return true;
+ return ret;
}
int migrate_send_rp_switchover_ack(MigrationIncomingState *mis)
diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c
index 1325dba97c..d0edec7cd1 100644
--- a/migration/multifd-nocomp.c
+++ b/migration/multifd-nocomp.c
@@ -16,6 +16,7 @@
#include "file.h"
#include "multifd.h"
#include "options.h"
+#include "migration.h"
#include "qapi/error.h"
#include "qemu/cutils.h"
#include "qemu/error-report.h"
@@ -391,7 +392,7 @@ int multifd_ram_flush_and_sync(QEMUFile *f)
MultiFDSyncReq req;
int ret;
- if (!migrate_multifd()) {
+ if (!migrate_multifd() || migration_in_postcopy()) {
return 0;
}
diff --git a/migration/options.c b/migration/options.c
index b8d5300326..8c878dea49 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -479,11 +479,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp)
error_setg(errp, "Postcopy is not compatible with ignore-shared");
return false;
}
-
- if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) {
- error_setg(errp, "Postcopy is not yet compatible with multifd");
- return false;
- }
}
if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) {
diff --git a/migration/ram.c b/migration/ram.c
index f2326788de..bdba7abe73 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1295,7 +1295,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
pss->page = 0;
pss->block = QLIST_NEXT_RCU(pss->block, next);
if (!pss->block) {
- if (multifd_ram_sync_per_round()) {
+ if (multifd_ram_sync_per_round() && !migration_in_postcopy()) {
QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
int ret = multifd_ram_flush_and_sync(f);
if (ret < 0) {
@@ -1969,7 +1969,7 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
}
}
- if (migrate_multifd()) {
+ if (migrate_multifd() && !migration_in_postcopy()) {
RAMBlock *block = pss->block;
return ram_save_multifd_page(block, offset);
}
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
` (2 preceding siblings ...)
2025-01-21 13:10 ` [PATCH v3 3/4] migration: enable multifd and postcopy together Prasad Pandit
@ 2025-01-21 13:10 ` Prasad Pandit
2025-01-21 15:47 ` Peter Xu
2025-01-21 15:50 ` [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Peter Xu
4 siblings, 1 reply; 13+ messages in thread
From: Prasad Pandit @ 2025-01-21 13:10 UTC (permalink / raw)
To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit
From: Prasad Pandit <pjp@fedoraproject.org>
Add a new postcopy test 'migration/postcopy/multifd'
to run postcopy migration with multifd channels enabled.
Add a boolean field 'multifd' to the MigrateCommon struct.
It helps to enable multifd channels.
Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
---
tests/qtest/migration/framework.c | 8 ++++++++
tests/qtest/migration/framework.h | 3 +++
tests/qtest/migration/postcopy-tests.c | 10 ++++++++++
3 files changed, 21 insertions(+)
diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
index 4550cda129..7f5abd760e 100644
--- a/tests/qtest/migration/framework.c
+++ b/tests/qtest/migration/framework.c
@@ -427,6 +427,14 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
migrate_set_capability(to, "postcopy-preempt", true);
}
+ if (args->multifd) {
+ migrate_set_capability(from, "multifd", true);
+ migrate_set_capability(to, "multifd", true);
+
+ migrate_set_parameter_int(from, "multifd-channels", 8);
+ migrate_set_parameter_int(to, "multifd-channels", 8);
+ }
+
migrate_ensure_non_converge(from);
migrate_prepare_for_dirty_mem(from);
diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
index 7991ee56b6..1b2320ebef 100644
--- a/tests/qtest/migration/framework.h
+++ b/tests/qtest/migration/framework.h
@@ -193,6 +193,9 @@ typedef struct {
*/
bool live;
+ /* set multifd on */
+ bool multifd;
+
/* Postcopy specific fields */
void *postcopy_data;
bool postcopy_preempt;
diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c
index daf7449f2c..6eada6ccbc 100644
--- a/tests/qtest/migration/postcopy-tests.c
+++ b/tests/qtest/migration/postcopy-tests.c
@@ -27,6 +27,15 @@ static void test_postcopy(void)
test_postcopy_common(&args);
}
+static void test_postcopy_multifd(void)
+{
+ MigrateCommon args = {
+ .multifd = true,
+ };
+
+ test_postcopy_common(&args);
+}
+
static void test_postcopy_suspend(void)
{
MigrateCommon args = {
@@ -83,6 +92,7 @@ void migration_test_add_postcopy(MigrationTestEnv *env)
{
if (env->has_uffd) {
migration_test_add("/migration/postcopy/plain", test_postcopy);
+ migration_test_add("/migration/postcopy/multifd", test_postcopy_multifd);
migration_test_add("/migration/postcopy/recovery/plain",
test_postcopy_recovery);
migration_test_add("/migration/postcopy/preempt/plain",
--
2.48.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-21 13:10 ` [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd Prasad Pandit
@ 2025-01-21 15:47 ` Peter Xu
2025-01-22 7:56 ` Prasad Pandit
0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2025-01-21 15:47 UTC (permalink / raw)
To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Tue, Jan 21, 2025 at 06:40:32PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit <pjp@fedoraproject.org>
>
> Add a new postcopy test 'migration/postcopy/multifd'
> to run postcopy migration with multifd channels enabled.
> Add a boolean field 'multifd' to the MigrateCommon struct.
> It helps to enable multifd channels.
>
> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org>
> ---
> tests/qtest/migration/framework.c | 8 ++++++++
> tests/qtest/migration/framework.h | 3 +++
> tests/qtest/migration/postcopy-tests.c | 10 ++++++++++
> 3 files changed, 21 insertions(+)
>
> diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c
> index 4550cda129..7f5abd760e 100644
> --- a/tests/qtest/migration/framework.c
> +++ b/tests/qtest/migration/framework.c
> @@ -427,6 +427,14 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
> migrate_set_capability(to, "postcopy-preempt", true);
> }
>
> + if (args->multifd) {
> + migrate_set_capability(from, "multifd", true);
> + migrate_set_capability(to, "multifd", true);
> +
> + migrate_set_parameter_int(from, "multifd-channels", 8);
> + migrate_set_parameter_int(to, "multifd-channels", 8);
> + }
> +
> migrate_ensure_non_converge(from);
>
> migrate_prepare_for_dirty_mem(from);
> diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h
> index 7991ee56b6..1b2320ebef 100644
> --- a/tests/qtest/migration/framework.h
> +++ b/tests/qtest/migration/framework.h
> @@ -193,6 +193,9 @@ typedef struct {
> */
> bool live;
>
> + /* set multifd on */
> + bool multifd;
> +
> /* Postcopy specific fields */
> void *postcopy_data;
> bool postcopy_preempt;
> diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c
> index daf7449f2c..6eada6ccbc 100644
> --- a/tests/qtest/migration/postcopy-tests.c
> +++ b/tests/qtest/migration/postcopy-tests.c
> @@ -27,6 +27,15 @@ static void test_postcopy(void)
> test_postcopy_common(&args);
> }
>
> +static void test_postcopy_multifd(void)
> +{
> + MigrateCommon args = {
> + .multifd = true,
> + };
> +
> + test_postcopy_common(&args);
> +}
> +
> static void test_postcopy_suspend(void)
> {
> MigrateCommon args = {
> @@ -83,6 +92,7 @@ void migration_test_add_postcopy(MigrationTestEnv *env)
> {
> if (env->has_uffd) {
> migration_test_add("/migration/postcopy/plain", test_postcopy);
> + migration_test_add("/migration/postcopy/multifd", test_postcopy_multifd);
> migration_test_add("/migration/postcopy/recovery/plain",
> test_postcopy_recovery);
> migration_test_add("/migration/postcopy/preempt/plain",
> --
> 2.48.1
>
This is only one test out of many I listed in the previous email:
https://lore.kernel.org/qemu-devel/ZykJBq7ME5jgSzCA@x1n/
Would you please add all the tests mentioned there?
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 0/4] Allow to enable multifd and postcopy migration together
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
` (3 preceding siblings ...)
2025-01-21 13:10 ` [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd Prasad Pandit
@ 2025-01-21 15:50 ` Peter Xu
4 siblings, 0 replies; 13+ messages in thread
From: Peter Xu @ 2025-01-21 15:50 UTC (permalink / raw)
To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Tue, Jan 21, 2025 at 06:40:28PM +0530, Prasad Pandit wrote:
> From: Prasad Pandit <pjp@fedoraproject.org>
>
> Hello,
>
> * Currently Multifd and Postcopy migration can not be used together.
> QEMU shows "Postcopy is not yet compatible with multifd" message.
>
> When migrating guests with large (100's GB) RAM, Multifd threads
> help to accelerate migration, but inability to use it with the
> Postcopy mode delays guest start up on the destination side.
>
> * This patch series allows to enable both Multifd and Postcopy
> migration together. Precopy and Multifd threads work during
> the initial guest (RAM) transfer. When migration moves to the
> Postcopy phase, Multifd threads are restrained and the Postcopy
> threads start to request pages from the source side.
>
> * This series removes magic value (4-bytes) introduced in the
> previous series for the Postcopy channel. And refactoring of
> the 'ram_save_target_page' function is made independent of
> the multifd & postcopy change.
>
> * This series passes all existing 'tests/qtest/migration/*' test
> cases and adds a new one to enable multifd channels with postcopy
> migration.
>
> v2: https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/T/#u
> v1: https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u
> v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u
Another comment for the cover letter (besides the test request): please
consider adding some changelog into cover letter whenever possible.
Normally we don't suggest a changelog only if the wholeset has been
drastically rewritten. Otherwise having a changelog would help people
understand what has happened between the versions.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-21 15:47 ` Peter Xu
@ 2025-01-22 7:56 ` Prasad Pandit
2025-01-22 16:10 ` Peter Xu
0 siblings, 1 reply; 13+ messages in thread
From: Prasad Pandit @ 2025-01-22 7:56 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
Hi,
On Tue, 21 Jan 2025 at 21:17, Peter Xu <peterx@redhat.com> wrote:
> https://lore.kernel.org/qemu-devel/ZykJBq7ME5jgSzCA@x1n/
> Would you please add all the tests mentioned there?
/x86_64/migration/multifd/file/mapped-ram/
/x86_64/migration/multifd/tcp/uri/plain/none
/x86_64/migration/multifd/tcp/plain/cancel
/x86_64/migration/postcopy/plain
/x86_64/migration/postcopy/recovery/
/x86_64/migration/postcopy/preempt/
* Of the tests you suggested above, I'll try to enable multifd
channels for 'postcopy/recovery' and 'postcopy/preempt' tests. For the
'multifd' tests above, how do we want to modify them? Enable
'postcopy' mode for them?
Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-22 7:56 ` Prasad Pandit
@ 2025-01-22 16:10 ` Peter Xu
2025-01-23 11:09 ` Prasad Pandit
0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2025-01-22 16:10 UTC (permalink / raw)
To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Wed, Jan 22, 2025 at 01:26:21PM +0530, Prasad Pandit wrote:
> Hi,
> On Tue, 21 Jan 2025 at 21:17, Peter Xu <peterx@redhat.com> wrote:
> > https://lore.kernel.org/qemu-devel/ZykJBq7ME5jgSzCA@x1n/
> > Would you please add all the tests mentioned there?
>
> /x86_64/migration/multifd/file/mapped-ram/
> /x86_64/migration/multifd/tcp/uri/plain/none
> /x86_64/migration/multifd/tcp/plain/cancel
>
> /x86_64/migration/postcopy/plain
> /x86_64/migration/postcopy/recovery/
> /x86_64/migration/postcopy/preempt/
>
> * Of the tests you suggested above, I'll try to enable multifd
> channels for 'postcopy/recovery' and 'postcopy/preempt' tests. For the
> 'multifd' tests above, how do we want to modify them? Enable
> 'postcopy' mode for them?
Right, that implies a migration with both features enabled but finished
even before postcopy starts.
We have some tricky paths that may go differently when different feature
enabled, especially when it's relevant to postcopy and multifd. Adding
these tests could make sure those corner cases got covered.
And btw, some of my above lines are not a single test, but a set of tests.
E.g., "/x86_64/migration/postcopy/recovery/" is not a single test but:
# /x86_64/migration/postcopy/recovery/plain
# /x86_64/migration/postcopy/recovery/tls/psk
# /x86_64/migration/postcopy/recovery/double-failures/handshake
# /x86_64/migration/postcopy/recovery/double-failures/reconnect
Let me list all the tests that is relevant to the two features to be
explicit..
# /x86_64/migration/postcopy/plain
# /x86_64/migration/postcopy/suspend
# /x86_64/migration/postcopy/tls/psk
# /x86_64/migration/postcopy/recovery/plain
# /x86_64/migration/postcopy/recovery/tls/psk
# /x86_64/migration/postcopy/recovery/double-failures/handshake
# /x86_64/migration/postcopy/recovery/double-failures/reconnect
# /x86_64/migration/postcopy/preempt/plain
# /x86_64/migration/postcopy/preempt/tls/psk
# /x86_64/migration/postcopy/preempt/recovery/plain
# /x86_64/migration/postcopy/preempt/recovery/tls/psk
# /x86_64/migration/multifd/tcp/tls/psk/match
# /x86_64/migration/multifd/tcp/tls/psk/mismatch
# /x86_64/migration/multifd/tcp/tls/x509/default-host
# /x86_64/migration/multifd/tcp/tls/x509/override-host
# /x86_64/migration/multifd/tcp/tls/x509/mismatch-host
# /x86_64/migration/multifd/tcp/tls/x509/allow-anon-client
# /x86_64/migration/multifd/tcp/tls/x509/reject-anon-client
# /x86_64/migration/multifd/tcp/plain/zstd
# /x86_64/migration/multifd/tcp/plain/zlib
# /x86_64/migration/multifd/tcp/plain/cancel
# /x86_64/migration/multifd/tcp/plain/zero-page/legacy
# /x86_64/migration/multifd/tcp/plain/zero-page/none
# /x86_64/migration/multifd/tcp/uri/plain/none
# /x86_64/migration/multifd/tcp/channels/plain/none
(I used to reference mapped-ram for multifd, but I just remembered it can't
be enabled with postcopy, so I dropped them)
I believe many of the tests can be avoided, but still below is a list of
minimum tests that I think might still be good to add:
# /x86_64/migration/postcopy/plain
# /x86_64/migration/postcopy/recovery/tls/psk
# /x86_64/migration/postcopy/preempt/plain
# /x86_64/migration/postcopy/preempt/recovery/tls/psk
# /x86_64/migration/multifd/tcp/tls/psk/match
# /x86_64/migration/multifd/tcp/plain/zstd
# /x86_64/migration/multifd/tcp/plain/cancel
I kept almost tls relevant ones because it has the most code path coverage,
and I suppose tls should also be an emphasis in the future for migration in
CoCo environments. I removed most of the trivial test cases, like postcopy
double failures etc. which can be too hard to trigger in real life.
Feel free to comment on whether you think the list is suitable to you. If
you want to add some more into the list I'm also ok with.
IMHO we can have a specific path "/x86_64/migration/multifd+postcopy/*" for
all above new tests, that have both features enabled.
Fabiano, you're rethinking the test infra, please comment if you have any
thoughts on above too.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-22 16:10 ` Peter Xu
@ 2025-01-23 11:09 ` Prasad Pandit
2025-01-24 12:45 ` Prasad Pandit
0 siblings, 1 reply; 13+ messages in thread
From: Prasad Pandit @ 2025-01-23 11:09 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Wed, 22 Jan 2025 at 21:40, Peter Xu <peterx@redhat.com> wrote:
> I believe many of the tests can be avoided, but still below is a list of
> minimum tests that I think might still be good to add:
>
> # /x86_64/migration/postcopy/plain
> # /x86_64/migration/postcopy/recovery/tls/psk
> # /x86_64/migration/postcopy/preempt/plain
> # /x86_64/migration/postcopy/preempt/recovery/tls/psk
> # /x86_64/migration/multifd/tcp/tls/psk/match
> # /x86_64/migration/multifd/tcp/plain/zstd
> # /x86_64/migration/multifd/tcp/plain/cancel
>
Okay, I'll start with these for now. Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-23 11:09 ` Prasad Pandit
@ 2025-01-24 12:45 ` Prasad Pandit
2025-01-24 15:38 ` Peter Xu
0 siblings, 1 reply; 13+ messages in thread
From: Prasad Pandit @ 2025-01-24 12:45 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
Hello Peter,
On Thu, 23 Jan 2025 at 16:39, Prasad Pandit <ppandit@redhat.com> wrote:
> On Wed, 22 Jan 2025 at 21:40, Peter Xu <peterx@redhat.com> wrote:
> > I believe many of the tests can be avoided, but still below is a list of
> > minimum tests that I think might still be good to add:
> > # /x86_64/migration/postcopy/plain
> > # /x86_64/migration/postcopy/recovery/tls/psk
> > # /x86_64/migration/postcopy/preempt/plain
> > # /x86_64/migration/postcopy/preempt/recovery/tls/psk
---
$ ../qtest/migration-test --tap -k -r
'/x86_64/migration/multifd+postcopy' | grep -i 'slow test'
# slow test /x86_64/migration/multifd+postcopy/plain executed in 1.28 secs
# slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk
executed in 2.43 secs
# slow test /x86_64/migration/multifd+postcopy/preempt/plain executed
in 1.52 secs
# slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk
executed in 3.32 secs
---
* Postcopy tests are working well with setting 'multifd = true'.
> > # /x86_64/migration/multifd/tcp/tls/psk/match
> > # /x86_64/migration/multifd/tcp/plain/zstd
> > # /x86_64/migration/multifd/tcp/plain/cancel
* Above precopy tests already enable (16) multifd channels and they
seem to test scenarios like: resume after migrate_cancel() or precopy
with compression (zstd). Enabling 'postcopy' here is not the same as
setting 'postcopy=true'. Do we really need to redefine these tests for
postcopy migration? Does compression (zstd/zlib etc.) OR
migrate_cancle() work with 'postcopy' migration? Just trying to figure
out how a 'multifd+postcopy/tcp/zstd' or
'multifd+postcopy/tcp/cancel' tests should work.
Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-24 12:45 ` Prasad Pandit
@ 2025-01-24 15:38 ` Peter Xu
2025-01-25 12:15 ` Prasad Pandit
0 siblings, 1 reply; 13+ messages in thread
From: Peter Xu @ 2025-01-24 15:38 UTC (permalink / raw)
To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Fri, Jan 24, 2025 at 06:15:20PM +0530, Prasad Pandit wrote:
> Hello Peter,
>
> On Thu, 23 Jan 2025 at 16:39, Prasad Pandit <ppandit@redhat.com> wrote:
> > On Wed, 22 Jan 2025 at 21:40, Peter Xu <peterx@redhat.com> wrote:
> > > I believe many of the tests can be avoided, but still below is a list of
> > > minimum tests that I think might still be good to add:
> > > # /x86_64/migration/postcopy/plain
> > > # /x86_64/migration/postcopy/recovery/tls/psk
> > > # /x86_64/migration/postcopy/preempt/plain
> > > # /x86_64/migration/postcopy/preempt/recovery/tls/psk
> ---
> $ ../qtest/migration-test --tap -k -r
> '/x86_64/migration/multifd+postcopy' | grep -i 'slow test'
> # slow test /x86_64/migration/multifd+postcopy/plain executed in 1.28 secs
> # slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk
> executed in 2.43 secs
> # slow test /x86_64/migration/multifd+postcopy/preempt/plain executed
> in 1.52 secs
> # slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk
> executed in 3.32 secs
> ---
> * Postcopy tests are working well with setting 'multifd = true'.
Great.
>
> > > # /x86_64/migration/multifd/tcp/tls/psk/match
> > > # /x86_64/migration/multifd/tcp/plain/zstd
> > > # /x86_64/migration/multifd/tcp/plain/cancel
>
> * Above precopy tests already enable (16) multifd channels and they
> seem to test scenarios like: resume after migrate_cancel() or precopy
> with compression (zstd). Enabling 'postcopy' here is not the same as
> setting 'postcopy=true'. Do we really need to redefine these tests for
> postcopy migration? Does compression (zstd/zlib etc.) OR
> migrate_cancle() work with 'postcopy' migration?
Since multifd doesn't work with postcopy phase, compression so far cannot
happen in postcopy phase but only in precopy phase.
So the tests I suggested was trying to make sure multifd major features (in
this case, tls, compression, and cancellation) work like before even if we
set postcopy-ram=on in the feature list, because after your changes merged,
people may start always set postcopy-ram=on for all cases.
OTOH, these test cases do not test anything that would happen in postcopy
phase, they should be covered by the postcopy tests you added above.
Thanks,
--
Peter Xu
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd
2025-01-24 15:38 ` Peter Xu
@ 2025-01-25 12:15 ` Prasad Pandit
0 siblings, 0 replies; 13+ messages in thread
From: Prasad Pandit @ 2025-01-25 12:15 UTC (permalink / raw)
To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit
On Fri, 24 Jan 2025 at 22:39, Peter Xu <peterx@redhat.com> wrote:
> > > # /x86_64/migration/multifd/tcp/tls/psk/match
> > > # /x86_64/migration/multifd/tcp/plain/zstd
> > > # /x86_64/migration/multifd/tcp/plain/cancel
> Since multifd doesn't work with postcopy phase, compression so far cannot
> happen in postcopy phase but only in precopy phase.
* Right.
> So the tests I suggested was trying to make sure multifd major features (in
> this case, tls, compression, and cancellation) work like before even if we
> set postcopy-ram=on in the feature list, because after your changes merged,
> people may start always set postcopy-ram=on for all cases.
* Ie. we set 'postcopy-ram=true' for above precopy tests?
Thank you.
---
- Prasad
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2025-01-25 12:16 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-21 13:10 [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 1/4] migration/multifd: move macros to multifd header Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 2/4] migration: refactor ram_save_target_page functions Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 3/4] migration: enable multifd and postcopy together Prasad Pandit
2025-01-21 13:10 ` [PATCH v3 4/4] tests/qtest/migration: add postcopy test with multifd Prasad Pandit
2025-01-21 15:47 ` Peter Xu
2025-01-22 7:56 ` Prasad Pandit
2025-01-22 16:10 ` Peter Xu
2025-01-23 11:09 ` Prasad Pandit
2025-01-24 12:45 ` Prasad Pandit
2025-01-24 15:38 ` Peter Xu
2025-01-25 12:15 ` Prasad Pandit
2025-01-21 15:50 ` [PATCH v3 0/4] Allow to enable multifd and postcopy migration together Peter Xu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).