* [PATCH v7 0/5] Allow to enable multifd and postcopy migration together @ 2025-02-28 12:17 Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 1/5] migration/multifd: move macros to multifd header Prasad Pandit ` (5 more replies) 0 siblings, 6 replies; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Hello, * This series (v7) adds a 'MULTIFD_RECV_SYNC' migration command. It is used to notify the destination migration thread to synchronise with the Multifd threads. This allows the Multifd ('mig/dst/recv_x') threads on the destination to receive all their data before they are shut down. This series also updates the channel discovery function and the qtests as suggested in the previous review comments. === 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 147.84s 81 subtests passed === v6: https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t * This series (v6) shuts down the Multifd threads before starting Postcopy migration. This avoids an issue where multifd pages arrive late at the destination during the Postcopy phase and corrupt the vCPU state. It also reorders the qtest patches and makes some refactoring changes suggested in the previous review. === 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 161.35s 73 subtests passed === v5: https://lore.kernel.org/qemu-devel/20250205122712.229151-1-ppandit@redhat.com/T/#t * This series (v5) consolidates the setting of migration capabilities into one 'set_migration_capabilities()' function, thus simplifying the test sources. It passes all migration tests. === 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 143.66s 71 subtests passed === v4: https://lore.kernel.org/qemu-devel/20250127120823.144949-1-ppandit@redhat.com/T/#t * This series (v4) adds more 'multifd+postcopy' qtests, which test Precopy migration with the 'postcopy-ram' attribute set and run Postcopy migrations with 'multifd' channels enabled. === $ ../qtest/migration-test --tap -k -r '/x86_64/migration/multifd+postcopy' | grep -i 'slow test' # slow test /x86_64/migration/multifd+postcopy/plain executed in 1.29 secs # slow test /x86_64/migration/multifd+postcopy/recovery/tls/psk executed in 2.48 secs # slow test /x86_64/migration/multifd+postcopy/preempt/plain executed in 1.49 secs # slow test /x86_64/migration/multifd+postcopy/preempt/recovery/tls/psk executed in 2.52 secs # slow test /x86_64/migration/multifd+postcopy/tcp/tls/psk/match executed in 3.62 secs # slow test /x86_64/migration/multifd+postcopy/tcp/plain/zstd executed in 1.34 secs # slow test /x86_64/migration/multifd+postcopy/tcp/plain/cancel executed in 2.24 secs ... 66/66 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 148.41s 71 subtests passed === v3: https://lore.kernel.org/qemu-devel/20250121131032.1611245-1-ppandit@redhat.com/T/#t * This series (v3) passes all existing 'tests/qtest/migration/*' tests and adds a new one to enable multifd channels with postcopy migration. v2: https://lore.kernel.org/qemu-devel/20241129122256.96778-1-ppandit@redhat.com/T/#u * This series (v2) further refactors the 'ram_save_target_page' function to make it independent of the multifd & postcopy change. v1: https://lore.kernel.org/qemu-devel/20241126115748.118683-1-ppandit@redhat.com/T/#u * This series (v1) removes the 4-byte magic value introduced in the previous series for the Postcopy channel.
v0: https://lore.kernel.org/qemu-devel/20241029150908.1136894-1-ppandit@redhat.com/T/#u * Currently Multifd and Postcopy migration cannot be used together; QEMU shows the "Postcopy is not yet compatible with multifd" error message. When migrating guests with large (hundreds of GBs) RAM, Multifd threads help to accelerate migration, but the inability to use them with the Postcopy mode delays guest start-up on the destination side. * This patch series allows enabling both Multifd and Postcopy migration together. Precopy and Multifd threads work during the initial guest (RAM) transfer. When migration moves to the Postcopy phase, Multifd threads are restrained and the Postcopy threads start to request pages from the source side. * This series introduces a 4-byte magic value to be sent on the Postcopy channel. It helps to differentiate channels and properly set up incoming connections on the destination side. Thank you. --- Prasad Pandit (5): migration/multifd: move macros to multifd header migration: enable multifd and postcopy together tests/qtest/migration: consolidate set capabilities tests/qtest/migration: add postcopy tests with multifd migration: add MULTIFD_RECV_SYNC migration command migration/migration.c | 136 +++++++++++++--------- migration/multifd-nocomp.c | 28 +++-- migration/multifd.c | 17 +-- migration/multifd.h | 6 + migration/options.c | 5 - migration/ram.c | 7 +- migration/savevm.c | 13 +++ migration/savevm.h | 1 + tests/qtest/migration/compression-tests.c | 37 +++++- tests/qtest/migration/cpr-tests.c | 6 +- tests/qtest/migration/file-tests.c | 58 +++++---- tests/qtest/migration/framework.c | 76 ++++++++---- tests/qtest/migration/framework.h | 9 +- tests/qtest/migration/misc-tests.c | 4 +- tests/qtest/migration/postcopy-tests.c | 35 +++++- tests/qtest/migration/precopy-tests.c | 48 +++++--- tests/qtest/migration/tls-tests.c | 69 ++++++++++- 17 files changed, 388 insertions(+), 167 deletions(-) -- 2.48.1 ^ permalink raw reply [flat|nested] 37+ messages in thread
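For orientation, the switchover-time sequence this series converges on, distilled from patch 5/5 below (a sketch only; error handling and the surrounding iteration logic of migration_iteration_run() are omitted):

    /* source side, once postcopy switchover is triggered */
    if (migrate_multifd()) {
        multifd_send_flush();                                /* push out any queued pages */
        multifd_send_sync_main(MULTIFD_SYNC_LOCAL);          /* wait for local send threads */
        qemu_savevm_send_multifd_recv_sync(s->to_dst_file);  /* ask destination to sync recv threads */
        multifd_send_shutdown();                             /* stop the multifd threads */
    }
    postcopy_start(s, &local_err);                           /* enter the Postcopy phase */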
* [PATCH v7 1/5] migration/multifd: move macros to multifd header 2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit @ 2025-02-28 12:17 ` Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 2/5] migration: enable multifd and postcopy together Prasad Pandit ` (4 subsequent siblings) 5 siblings, 0 replies; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Move MULTIFD_ macros to the header file so that they are accessible from other source files. Reviewed-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/multifd.c | 5 ----- migration/multifd.h | 5 +++++ 2 files changed, 5 insertions(+), 5 deletions(-) v6: no change - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t diff --git a/migration/multifd.c b/migration/multifd.c index 215ad0414a..30e514785f 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -33,11 +33,6 @@ #include "io/channel-socket.h" #include "yank_functions.h" -/* Multiple fd's */ - -#define MULTIFD_MAGIC 0x11223344U -#define MULTIFD_VERSION 1 - typedef struct { uint32_t magic; uint32_t version; diff --git a/migration/multifd.h b/migration/multifd.h index cf408ff721..bff867ca6b 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -49,6 +49,11 @@ bool multifd_queue_page(RAMBlock *block, ram_addr_t offset); bool multifd_recv(void); MultiFDRecvData *multifd_get_recv_data(void); +/* Multiple fd's */ + +#define MULTIFD_MAGIC 0x11223344U +#define MULTIFD_VERSION 1 + /* Multifd Compression flags */ #define MULTIFD_FLAG_SYNC (1 << 0) -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
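Patch 2/5 is where the now-public macro gets used: the destination compares the magic value peeked from each new connection against MULTIFD_MAGIC to recognise multifd channels. A minimal sketch of that use (channel and channel_magic are locals introduced in the next patch, shown here out of context):

    channel_magic = be32_to_cpu(channel_magic);
    if (channel_magic == MULTIFD_MAGIC) {
        channel = CH_MULTIFD;   /* connection belongs to a multifd recv thread */
    }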
* [PATCH v7 2/5] migration: enable multifd and postcopy together 2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 1/5] migration/multifd: move macros to multifd header Prasad Pandit @ 2025-02-28 12:17 ` Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 3/5] tests/qtest/migration: consolidate set capabilities Prasad Pandit ` (3 subsequent siblings) 5 siblings, 0 replies; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Enable Multifd and Postcopy migration together. The migration_ioc_process_incoming() routine checks the magic value sent on each channel and helps to properly set up the multifd and postcopy channels. The Precopy and Multifd threads work during the initial guest RAM transfer. When migration moves to the Postcopy phase, the multifd threads are shut down and the Postcopy threads on the destination request/pull data from the source side. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/migration.c | 133 +++++++++++++++++++++---------------- migration/multifd-nocomp.c | 3 +- migration/multifd.c | 11 ++- migration/options.c | 5 -- migration/ram.c | 7 +- 5 files changed, 89 insertions(+), 70 deletions(-) v6: update channel discovery part in migration_ioc_process_incoming() - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t diff --git a/migration/migration.c b/migration/migration.c index c597aa707e..5db9e18272 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -95,6 +95,9 @@ enum mig_rp_message_type { MIG_RP_MSG_MAX }; +/* Migration channel types */ +enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY }; + /* When we add fault tolerance, we could have several migrations at once. For now we don't need to add dynamic creation of migration */ @@ -947,28 +950,19 @@ void migration_fd_process_incoming(QEMUFile *f) migration_incoming_process(); } -/* - * Returns true when we want to start a new incoming migration process, - * false otherwise. - */ -static bool migration_should_start_incoming(bool main_channel) +static bool migration_has_main_and_multifd_channels(void) { - /* Multifd doesn't start unless all channels are established */ - if (migrate_multifd()) { - return migration_has_all_channels(); + MigrationIncomingState *mis = migration_incoming_get_current(); + if (!mis->from_src_file) { + /* main channel not established */ + return false; } - /* Preempt channel only starts when the main channel is created */ - if (migrate_postcopy_preempt()) { - return main_channel; + if (migrate_multifd() && !multifd_recv_all_channels_created()) { + return false; } - /* - * For all the rest types of migration, we should only reach here when - * it's the main channel that's being created, and we should always - * proceed with this channel.
- */ - assert(main_channel); + /* main channel and all multifd channels are established */ return true; } @@ -977,59 +971,84 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; QEMUFile *f; - bool default_channel = true; + uint8_t channel; uint32_t channel_magic = 0; int ret = 0; - if (migrate_multifd() && !migrate_mapped_ram() && - !migrate_postcopy_ram() && - qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { - /* - * With multiple channels, it is possible that we receive channels - * out of order on destination side, causing incorrect mapping of - * source channels on destination side. Check channel MAGIC to - * decide type of channel. Please note this is best effort, postcopy - * preempt channel does not send any magic number so avoid it for - * postcopy live migration. Also tls live migration already does - * tls handshake while initializing main channel so with tls this - * issue is not possible. - */ - ret = migration_channel_read_peek(ioc, (void *)&channel_magic, - sizeof(channel_magic), errp); + if (!migration_has_main_and_multifd_channels()) { + if (qio_channel_has_feature(ioc, QIO_CHANNEL_FEATURE_READ_MSG_PEEK)) { + /* + * With multiple channels, it is possible that we receive channels + * out of order on destination side, causing incorrect mapping of + * source channels on destination side. Check channel MAGIC to + * decide type of channel. Please note this is best effort, + * postcopy preempt channel does not send any magic number so + * avoid it for postcopy live migration. Also tls live migration + * already does tls handshake while initializing main channel so + * with tls this issue is not possible. + */ + ret = migration_channel_read_peek(ioc, (void *)&channel_magic, + sizeof(channel_magic), errp); + if (ret != 0) { + return; + } - if (ret != 0) { + channel_magic = be32_to_cpu(channel_magic); + if (channel_magic == QEMU_VM_FILE_MAGIC) { + channel = CH_MAIN; + } else if (channel_magic == MULTIFD_MAGIC) { + channel = CH_MULTIFD; + } else if (!mis->from_src_file && + mis->state == MIGRATION_STATUS_POSTCOPY_PAUSED) { + /* reconnect main channel for postcopy recovery */ + channel = CH_MAIN; + } else { + error_setg(errp, "unknown channel magic: %u", channel_magic); + return; + } + } else if (mis->from_src_file && migrate_multifd()) { + /* + * Non-peekable channels like tls/file are processed as + * multifd channels when multifd is enabled. 
+ */ + channel = CH_MULTIFD; + } else if (!mis->from_src_file) { + channel = CH_MAIN; + } else { + error_setg(errp, "non-peekable channel used without multifd"); return; } - - default_channel = (channel_magic == cpu_to_be32(QEMU_VM_FILE_MAGIC)); + } else if (mis->from_src_file) { + channel = CH_POSTCOPY; } else { - default_channel = !mis->from_src_file; + channel = CH_MAIN; } if (multifd_recv_setup(errp) != 0) { return; } - if (default_channel) { + if (channel == CH_MAIN) { f = qemu_file_new_input(ioc); migration_incoming_setup(f); - } else { + } else if (channel == CH_MULTIFD) { /* Multiple connections */ - assert(migration_needs_multiple_sockets()); if (migrate_multifd()) { multifd_recv_new_channel(ioc, &local_err); - } else { - assert(migrate_postcopy_preempt()); - f = qemu_file_new_input(ioc); - postcopy_preempt_new_channel(mis, f); } if (local_err) { error_propagate(errp, local_err); return; } + } else if (channel == CH_POSTCOPY) { + assert(migrate_postcopy_preempt()); + assert(!mis->postcopy_qemufile_dst); + f = qemu_file_new_input(ioc); + postcopy_preempt_new_channel(mis, f); + return; } - if (migration_should_start_incoming(default_channel)) { + if (migration_has_main_and_multifd_channels()) { /* If it's a recovery, we're done */ if (postcopy_try_recover()) { return; @@ -1046,20 +1065,15 @@ void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) */ bool migration_has_all_channels(void) { + if (!migration_has_main_and_multifd_channels()) { + return false; + } + MigrationIncomingState *mis = migration_incoming_get_current(); - - if (!mis->from_src_file) { + if (migrate_postcopy_preempt() && !mis->postcopy_qemufile_dst) { return false; } - if (migrate_multifd()) { - return multifd_recv_all_channels_created(); - } - - if (migrate_postcopy_preempt()) { - return mis->postcopy_qemufile_dst != NULL; - } - return true; } @@ -1470,6 +1484,8 @@ static void migration_cleanup(MigrationState *s) assert(!migration_is_active()); + file_cleanup_outgoing_migration(); + socket_cleanup_outgoing_migration(); if (s->state == MIGRATION_STATUS_CANCELLING) { migrate_set_state(&s->state, MIGRATION_STATUS_CANCELLING, MIGRATION_STATUS_CANCELLED); @@ -3382,8 +3398,11 @@ static MigIterateState migration_iteration_run(MigrationState *s) } /* Still a significant amount to transfer */ - if (!in_postcopy && must_precopy <= s->threshold_size && can_switchover && - qatomic_read(&s->start_postcopy)) { + if (!in_postcopy && must_precopy <= s->threshold_size + && can_switchover && qatomic_read(&s->start_postcopy)) { + if (migrate_multifd()) { + multifd_send_shutdown(); + } if (postcopy_start(s, &local_err)) { migrate_set_error(s, local_err); error_report_err(local_err); diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 1325dba97c..d0edec7cd1 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -16,6 +16,7 @@ #include "file.h" #include "multifd.h" #include "options.h" +#include "migration.h" #include "qapi/error.h" #include "qemu/cutils.h" #include "qemu/error-report.h" @@ -391,7 +392,7 @@ int multifd_ram_flush_and_sync(QEMUFile *f) MultiFDSyncReq req; int ret; - if (!migrate_multifd()) { + if (!migrate_multifd() || migration_in_postcopy()) { return 0; } diff --git a/migration/multifd.c b/migration/multifd.c index 30e514785f..2bd8604ca1 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -467,8 +467,6 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) static void multifd_send_cleanup_state(void) { - 
file_cleanup_outgoing_migration(); - socket_cleanup_outgoing_migration(); qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); g_free(multifd_send_state->params); @@ -481,7 +479,7 @@ void multifd_send_shutdown(void) { int i; - if (!migrate_multifd()) { + if (!migrate_multifd() || !multifd_send_state) { return; } @@ -1231,6 +1229,13 @@ static void *multifd_recv_thread(void *opaque) } if (has_data) { + /* + * multifd thread should not be active and receive data + * when migration is in the Postcopy phase. Two threads + * writing the same memory area could easily corrupt + * the guest state. + */ + assert(!migration_in_postcopy()); ret = multifd_recv_state->ops->recv(p, &local_err); if (ret != 0) { break; diff --git a/migration/options.c b/migration/options.c index bb259d192a..5d56591f9e 100644 --- a/migration/options.c +++ b/migration/options.c @@ -482,11 +482,6 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not compatible with ignore-shared"); return false; } - - if (new_caps[MIGRATION_CAPABILITY_MULTIFD]) { - error_setg(errp, "Postcopy is not yet compatible with multifd"); - return false; - } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { diff --git a/migration/ram.c b/migration/ram.c index 589b6505eb..2d8e593c90 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1297,7 +1297,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss) pss->page = 0; pss->block = QLIST_NEXT_RCU(pss->block, next); if (!pss->block) { - if (multifd_ram_sync_per_round()) { + if (multifd_ram_sync_per_round() && !migration_in_postcopy()) { QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel; int ret = multifd_ram_flush_and_sync(f); if (ret < 0) { @@ -1971,9 +1971,8 @@ static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss) } } - if (migrate_multifd()) { - RAMBlock *block = pss->block; - return ram_save_multifd_page(block, offset); + if (migrate_multifd() && !migration_in_postcopy()) { + return ram_save_multifd_page(pss->block, offset); } if (control_save_page(pss, offset, &res)) { -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
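Condensed from the patch above, the channel-discovery rules that migration_ioc_process_incoming() now applies can be summarised as the following free-standing sketch (illustration only: the function name and parameter list are hypothetical; the magic constants mirror QEMU_VM_FILE_MAGIC and MULTIFD_MAGIC):

    #include <stdbool.h>
    #include <stdint.h>

    enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY, CH_ERROR };

    static int classify_channel(bool main_and_multifd_up, bool peekable,
                                uint32_t magic, bool have_main_channel,
                                bool postcopy_paused, bool multifd_enabled)
    {
        if (main_and_multifd_up) {
            /* only the postcopy preempt channel can arrive this late */
            return have_main_channel ? CH_POSTCOPY : CH_MAIN;
        }
        if (peekable) {
            if (magic == 0x5145564d) {      /* QEMU_VM_FILE_MAGIC */
                return CH_MAIN;
            }
            if (magic == 0x11223344) {      /* MULTIFD_MAGIC */
                return CH_MULTIFD;
            }
            if (!have_main_channel && postcopy_paused) {
                return CH_MAIN;             /* main channel reconnecting for recovery */
            }
            return CH_ERROR;                /* unknown channel magic */
        }
        /* non-peekable channels, e.g. tls or file */
        if (have_main_channel && multifd_enabled) {
            return CH_MULTIFD;
        }
        if (!have_main_channel) {
            return CH_MAIN;
        }
        return CH_ERROR;                    /* non-peekable channel without multifd */
    }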
* [PATCH v7 3/5] tests/qtest/migration: consolidate set capabilities 2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 1/5] migration/multifd: move macros to multifd header Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 2/5] migration: enable multifd and postcopy together Prasad Pandit @ 2025-02-28 12:17 ` Prasad Pandit 2025-02-28 12:17 ` [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit ` (2 subsequent siblings) 5 siblings, 0 replies; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Migration capabilities are set in multiple '.start_hook' functions for various tests. Instead, consolidate the setting of capabilities in the 'migrate_start_set_capabilities()' function, which is called from the 'migrate_start()' function. Besides simplifying the capability settings, this helps to declutter the qtest sources. Suggested-by: Fabiano Rosas <farosas@suse.de> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- tests/qtest/migration/compression-tests.c | 22 +++++-- tests/qtest/migration/cpr-tests.c | 6 +- tests/qtest/migration/file-tests.c | 58 ++++++++--------- tests/qtest/migration/framework.c | 76 ++++++++++++++++------- tests/qtest/migration/framework.h | 9 ++- tests/qtest/migration/misc-tests.c | 4 +- tests/qtest/migration/postcopy-tests.c | 8 ++- tests/qtest/migration/precopy-tests.c | 29 +++++---- tests/qtest/migration/tls-tests.c | 23 ++++++- 9 files changed, 151 insertions(+), 84 deletions(-) v6: Move .caps initialization to the .start object. Set default multifd threads to 4. - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c index 8b58401b84..41e79f031b 100644 --- a/tests/qtest/migration/compression-tests.c +++ b/tests/qtest/migration/compression-tests.c @@ -35,6 +35,9 @@ static void test_multifd_tcp_zstd(void) { MigrateCommon args = { .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .start_hook = migrate_hook_start_precopy_tcp_multifd_zstd, }; test_precopy_common(&args); } @@ -56,6 +59,9 @@ static void test_multifd_tcp_qatzip(void) { MigrateCommon args = { .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .start_hook = migrate_hook_start_precopy_tcp_multifd_qatzip, }; test_precopy_common(&args); } @@ -74,6 +80,9 @@ static void test_multifd_tcp_qpl(void) { MigrateCommon args = { .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .start_hook = migrate_hook_start_precopy_tcp_multifd_qpl, }; test_precopy_common(&args); } @@ -92,6 +101,9 @@ static void test_multifd_tcp_uadk(void) { MigrateCommon args = { .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .start_hook = migrate_hook_start_precopy_tcp_multifd_uadk, }; test_precopy_common(&args); } @@ -103,10 +115,6 @@ migrate_hook_start_xbzrle(QTestState *from, QTestState *to) { migrate_set_parameter_int(from, "xbzrle-cache-size", 33554432); - - migrate_set_capability(from, "xbzrle", true); - migrate_set_capability(to, "xbzrle", true); - return NULL; } @@ -118,6 +126,9 @@ static void test_precopy_unix_xbzrle(void) .listen_uri = uri, .start_hook = migrate_hook_start_xbzrle, .iterations = 2, + .start = { + .caps[MIGRATION_CAPABILITY_XBZRLE] =
true, + }, /* * XBZRLE needs pages to be modified when doing the 2nd+ round * iteration to have real data pushed to the stream. @@ -146,6 +157,9 @@ static void test_multifd_tcp_zlib(void) { MigrateCommon args = { .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .start_hook = migrate_hook_start_precopy_tcp_multifd_zlib, }; test_precopy_common(&args); diff --git a/tests/qtest/migration/cpr-tests.c b/tests/qtest/migration/cpr-tests.c index 4758841824..5536e14610 100644 --- a/tests/qtest/migration/cpr-tests.c +++ b/tests/qtest/migration/cpr-tests.c @@ -24,9 +24,6 @@ static void *migrate_hook_start_mode_reboot(QTestState *from, QTestState *to) migrate_set_parameter_str(from, "mode", "cpr-reboot"); migrate_set_parameter_str(to, "mode", "cpr-reboot"); - migrate_set_capability(from, "x-ignore-shared", true); - migrate_set_capability(to, "x-ignore-shared", true); - return NULL; } @@ -39,6 +36,9 @@ static void test_mode_reboot(void) .connect_uri = uri, .listen_uri = "defer", .start_hook = migrate_hook_start_mode_reboot, + .start = { + .caps[MIGRATION_CAPABILITY_X_IGNORE_SHARED] = true, + }, }; test_file_common(&args, true); diff --git a/tests/qtest/migration/file-tests.c b/tests/qtest/migration/file-tests.c index f260e2871d..4d78ce0855 100644 --- a/tests/qtest/migration/file-tests.c +++ b/tests/qtest/migration/file-tests.c @@ -107,15 +107,6 @@ static void test_precopy_file_offset_bad(void) test_file_common(&args, false); } -static void *migrate_hook_start_mapped_ram(QTestState *from, - QTestState *to) -{ - migrate_set_capability(from, "mapped-ram", true); - migrate_set_capability(to, "mapped-ram", true); - - return NULL; -} - static void test_precopy_file_mapped_ram_live(void) { g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs, @@ -123,7 +114,9 @@ static void test_precopy_file_mapped_ram_live(void) MigrateCommon args = { .connect_uri = uri, .listen_uri = "defer", - .start_hook = migrate_hook_start_mapped_ram, + .start = { + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + }, }; test_file_common(&args, false); @@ -136,26 +129,14 @@ static void test_precopy_file_mapped_ram(void) MigrateCommon args = { .connect_uri = uri, .listen_uri = "defer", - .start_hook = migrate_hook_start_mapped_ram, + .start = { + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + }, }; test_file_common(&args, true); } -static void *migrate_hook_start_multifd_mapped_ram(QTestState *from, - QTestState *to) -{ - migrate_hook_start_mapped_ram(from, to); - - migrate_set_parameter_int(from, "multifd-channels", 4); - migrate_set_parameter_int(to, "multifd-channels", 4); - - migrate_set_capability(from, "multifd", true); - migrate_set_capability(to, "multifd", true); - - return NULL; -} - static void test_multifd_file_mapped_ram_live(void) { g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs, @@ -163,7 +144,10 @@ static void test_multifd_file_mapped_ram_live(void) MigrateCommon args = { .connect_uri = uri, .listen_uri = "defer", - .start_hook = migrate_hook_start_multifd_mapped_ram, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + }, }; test_file_common(&args, false); @@ -176,7 +160,10 @@ static void test_multifd_file_mapped_ram(void) MigrateCommon args = { .connect_uri = uri, .listen_uri = "defer", - .start_hook = migrate_hook_start_multifd_mapped_ram, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + }, }; test_file_common(&args, true); @@ -185,8 +172,6 @@ 
static void test_multifd_file_mapped_ram(void) static void *migrate_hook_start_multifd_mapped_ram_dio(QTestState *from, QTestState *to) { - migrate_hook_start_multifd_mapped_ram(from, to); - migrate_set_parameter_bool(from, "direct-io", true); migrate_set_parameter_bool(to, "direct-io", true); @@ -201,6 +186,10 @@ static void test_multifd_file_mapped_ram_dio(void) .connect_uri = uri, .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_mapped_ram_dio, + .start = { + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; if (!probe_o_direct_support(tmpfs)) { @@ -246,7 +235,6 @@ static void *migrate_hook_start_multifd_mapped_ram_fdset_dio(QTestState *from, fdset_add_fds(from, file, O_WRONLY, 2, true); fdset_add_fds(to, file, O_RDONLY, 2, true); - migrate_hook_start_multifd_mapped_ram(from, to); migrate_set_parameter_bool(from, "direct-io", true); migrate_set_parameter_bool(to, "direct-io", true); @@ -261,8 +249,6 @@ static void *migrate_hook_start_multifd_mapped_ram_fdset(QTestState *from, fdset_add_fds(from, file, O_WRONLY, 2, false); fdset_add_fds(to, file, O_RDONLY, 2, false); - migrate_hook_start_multifd_mapped_ram(from, to); - return NULL; } @@ -275,6 +261,10 @@ static void test_multifd_file_mapped_ram_fdset(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_mapped_ram_fdset, .end_hook = migrate_hook_end_multifd_mapped_ram_fdset, + .start = { + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; test_file_common(&args, true); @@ -289,6 +279,10 @@ static void test_multifd_file_mapped_ram_fdset_dio(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_mapped_ram_fdset_dio, .end_hook = migrate_hook_end_multifd_mapped_ram_fdset, + .start = { + .caps[MIGRATION_CAPABILITY_MAPPED_RAM] = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; if (!probe_o_direct_support(tmpfs)) { diff --git a/tests/qtest/migration/framework.c b/tests/qtest/migration/framework.c index 10e1d04b58..be6c245843 100644 --- a/tests/qtest/migration/framework.c +++ b/tests/qtest/migration/framework.c @@ -30,6 +30,7 @@ #define QEMU_VM_FILE_MAGIC 0x5145564d #define QEMU_ENV_SRC "QTEST_QEMU_BINARY_SRC" #define QEMU_ENV_DST "QTEST_QEMU_BINARY_DST" +#define MULTIFD_TEST_CHANNELS 4 unsigned start_address; unsigned end_address; @@ -207,6 +208,52 @@ static QList *migrate_start_get_qmp_capabilities(const MigrateStart *args) return capabilities; } +static void migrate_start_set_capabilities(QTestState *from, QTestState *to, + MigrateStart *args) +{ + /* + * MigrationCapability_lookup and MIGRATION_CAPABILITY_ constants + * are from qapi-types-migration.h. + */ + for (uint8_t i = 0; i < MIGRATION_CAPABILITY__MAX; i++) + { + if (!args->caps[i]) { + continue; + } + if (from) { + migrate_set_capability(from, + MigrationCapability_lookup.array[i], true); + } + if (to) { + migrate_set_capability(to, + MigrationCapability_lookup.array[i], true); + } + } + + /* + * Always enable migration events. Libvirt always uses it, let's try + * to mimic as closer as that. + */ + migrate_set_capability(from, "events", true); + if (!args->defer_target_connect) { + migrate_set_capability(to, "events", true); + } + + /* + * Default number of channels should be fine for most + * tests. Individual tests can override by calling + * migrate_set_parameter() directly. 
+ */ + if (args->caps[MIGRATION_CAPABILITY_MULTIFD]) { + migrate_set_parameter_int(from, "multifd-channels", + MULTIFD_TEST_CHANNELS); + migrate_set_parameter_int(to, "multifd-channels", + MULTIFD_TEST_CHANNELS); + } + + return; +} + int migrate_start(QTestState **from, QTestState **to, const char *uri, MigrateStart *args) { @@ -379,14 +426,7 @@ int migrate_start(QTestState **from, QTestState **to, const char *uri, unlink(shmem_path); } - /* - * Always enable migration events. Libvirt always uses it, let's try - * to mimic as closer as that. - */ - migrate_set_capability(*from, "events", true); - if (!args->defer_target_connect) { - migrate_set_capability(*to, "events", true); - } + migrate_start_set_capabilities(*from, *to, args); return 0; } @@ -432,6 +472,10 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, { QTestState *from, *to; + /* set postcopy capabilities */ + args->start.caps[MIGRATION_CAPABILITY_POSTCOPY_BLOCKTIME] = true; + args->start.caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true; + if (migrate_start(&from, &to, "defer", &args->start)) { return -1; } @@ -440,17 +484,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr, args->postcopy_data = args->start_hook(from, to); } - migrate_set_capability(from, "postcopy-ram", true); - migrate_set_capability(to, "postcopy-ram", true); - migrate_set_capability(to, "postcopy-blocktime", true); - - if (args->postcopy_preempt) { - migrate_set_capability(from, "postcopy-preempt", true); - migrate_set_capability(to, "postcopy-preempt", true); - } - migrate_ensure_non_converge(from); - migrate_prepare_for_dirty_mem(from); qtest_qmp_assert_success(to, "{ 'execute': 'migrate-incoming'," " 'arguments': { " @@ -948,15 +982,9 @@ void *migrate_hook_start_precopy_tcp_multifd_common(QTestState *from, QTestState *to, const char *method) { - migrate_set_parameter_int(from, "multifd-channels", 16); - migrate_set_parameter_int(to, "multifd-channels", 16); - migrate_set_parameter_str(from, "multifd-compression", method); migrate_set_parameter_str(to, "multifd-compression", method); - migrate_set_capability(from, "multifd", true); - migrate_set_capability(to, "multifd", true); - /* Start incoming migration from the 1st socket */ migrate_incoming_qmp(to, "tcp:127.0.0.1:0", NULL, "{}"); diff --git a/tests/qtest/migration/framework.h b/tests/qtest/migration/framework.h index e4a11870f6..01e425e64e 100644 --- a/tests/qtest/migration/framework.h +++ b/tests/qtest/migration/framework.h @@ -12,6 +12,7 @@ #define TEST_FRAMEWORK_H #include "libqtest.h" +#include <qapi/qapi-types-migration.h> #define FILE_TEST_FILENAME "migfile" #define FILE_TEST_OFFSET 0x1000 @@ -120,6 +121,13 @@ typedef struct { /* Do not connect to target monitor and qtest sockets in qtest_init */ bool defer_target_connect; + + /* + * Migration capabilities to be set in both source and + * destination. For unilateral capabilities, use + * migration_set_capabilities(). 
+ */ + bool caps[MIGRATION_CAPABILITY__MAX]; } MigrateStart; typedef enum PostcopyRecoveryFailStage { @@ -207,7 +215,6 @@ typedef struct { /* Postcopy specific fields */ void *postcopy_data; - bool postcopy_preempt; PostcopyRecoveryFailStage postcopy_recovery_fail_stage; } MigrateCommon; diff --git a/tests/qtest/migration/misc-tests.c b/tests/qtest/migration/misc-tests.c index 2e612d9e38..54995256d8 100644 --- a/tests/qtest/migration/misc-tests.c +++ b/tests/qtest/migration/misc-tests.c @@ -98,6 +98,7 @@ static void test_ignore_shared(void) QTestState *from, *to; MigrateStart args = { .use_shmem = true, + .caps[MIGRATION_CAPABILITY_X_IGNORE_SHARED] = true, }; if (migrate_start(&from, &to, uri, &args)) { @@ -107,9 +108,6 @@ static void test_ignore_shared(void) migrate_ensure_non_converge(from); migrate_prepare_for_dirty_mem(from); - migrate_set_capability(from, "x-ignore-shared", true); - migrate_set_capability(to, "x-ignore-shared", true); - /* Wait for the first serial output from the source */ wait_for_serial("src_serial"); diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c index 982457bed1..483e3ff99f 100644 --- a/tests/qtest/migration/postcopy-tests.c +++ b/tests/qtest/migration/postcopy-tests.c @@ -39,7 +39,9 @@ static void test_postcopy_suspend(void) static void test_postcopy_preempt(void) { MigrateCommon args = { - .postcopy_preempt = true, + .start = { + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, + }, }; test_postcopy_common(&args); @@ -73,7 +75,9 @@ static void test_postcopy_recovery_fail_reconnect(void) static void test_postcopy_preempt_recovery(void) { MigrateCommon args = { - .postcopy_preempt = true, + .start = { + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, + }, }; test_postcopy_recovery_common(&args); diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c index ba273d10b9..f8404793b8 100644 --- a/tests/qtest/migration/precopy-tests.c +++ b/tests/qtest/migration/precopy-tests.c @@ -108,23 +108,14 @@ static void test_precopy_tcp_plain(void) test_precopy_common(&args); } -static void *migrate_hook_start_switchover_ack(QTestState *from, QTestState *to) -{ - - migrate_set_capability(from, "return-path", true); - migrate_set_capability(to, "return-path", true); - - migrate_set_capability(from, "switchover-ack", true); - migrate_set_capability(to, "switchover-ack", true); - - return NULL; -} - static void test_precopy_tcp_switchover_ack(void) { MigrateCommon args = { .listen_uri = "tcp:127.0.0.1:0", - .start_hook = migrate_hook_start_switchover_ack, + .start = { + .caps[MIGRATION_CAPABILITY_RETURN_PATH] = true, + .caps[MIGRATION_CAPABILITY_SWITCHOVER_ACK] = true, + }, /* * Source VM must be running in order to consider the switchover ACK * when deciding to do switchover or not. 
@@ -393,6 +384,9 @@ static void test_multifd_tcp_uri_none(void) MigrateCommon args = { .listen_uri = "defer", .start_hook = migrate_hook_start_precopy_tcp_multifd, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, /* * Multifd is more complicated than most of the features, it * directly takes guest page buffers when sending, make sure @@ -408,6 +402,9 @@ static void test_multifd_tcp_zero_page_legacy(void) MigrateCommon args = { .listen_uri = "defer", .start_hook = migrate_hook_start_precopy_tcp_multifd_zero_page_legacy, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, /* * Multifd is more complicated than most of the features, it * directly takes guest page buffers when sending, make sure @@ -423,6 +420,9 @@ static void test_multifd_tcp_no_zero_page(void) MigrateCommon args = { .listen_uri = "defer", .start_hook = migrate_hook_start_precopy_tcp_multifd_no_zero_page, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, /* * Multifd is more complicated than most of the features, it * directly takes guest page buffers when sending, make sure @@ -439,6 +439,9 @@ static void test_multifd_tcp_channels_none(void) .listen_uri = "defer", .start_hook = migrate_hook_start_precopy_tcp_multifd, .live = true, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, .connect_channels = ("[ { 'channel-type': 'main'," " 'addr': { 'transport': 'socket'," " 'type': 'inet'," diff --git a/tests/qtest/migration/tls-tests.c b/tests/qtest/migration/tls-tests.c index 2cb4a44bcd..72f44defbb 100644 --- a/tests/qtest/migration/tls-tests.c +++ b/tests/qtest/migration/tls-tests.c @@ -375,9 +375,11 @@ static void test_postcopy_tls_psk(void) static void test_postcopy_preempt_tls_psk(void) { MigrateCommon args = { - .postcopy_preempt = true, .start_hook = migrate_hook_start_tls_psk_match, .end_hook = migrate_hook_end_tls_psk, + .start = { + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, + }, }; test_postcopy_common(&args); @@ -397,9 +399,11 @@ static void test_postcopy_recovery_tls_psk(void) static void test_postcopy_preempt_all(void) { MigrateCommon args = { - .postcopy_preempt = true, .start_hook = migrate_hook_start_tls_psk_match, .end_hook = migrate_hook_end_tls_psk, + .start = { + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, + }, }; test_postcopy_recovery_common(&args); @@ -631,6 +635,9 @@ static void test_multifd_tcp_tls_psk_match(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match, .end_hook = migrate_hook_end_tls_psk, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; test_precopy_common(&args); } @@ -640,6 +647,7 @@ static void test_multifd_tcp_tls_psk_mismatch(void) MigrateCommon args = { .start = { .hide_stderr = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, }, .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tcp_tls_psk_mismatch, @@ -656,6 +664,9 @@ static void test_multifd_tcp_tls_x509_default_host(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tls_x509_default_host, .end_hook = migrate_hook_end_tls_x509, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; test_precopy_common(&args); } @@ -666,6 +677,9 @@ static void test_multifd_tcp_tls_x509_override_host(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tls_x509_override_host, .end_hook = migrate_hook_end_tls_x509, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; test_precopy_common(&args); } @@ -688,6 +702,7 @@ static void 
test_multifd_tcp_tls_x509_mismatch_host(void) MigrateCommon args = { .start = { .hide_stderr = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, }, .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tls_x509_mismatch_host, @@ -703,6 +718,9 @@ static void test_multifd_tcp_tls_x509_allow_anon_client(void) .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tls_x509_allow_anon_client, .end_hook = migrate_hook_end_tls_x509, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, }; test_precopy_common(&args); } @@ -712,6 +730,7 @@ static void test_multifd_tcp_tls_x509_reject_anon_client(void) MigrateCommon args = { .start = { .hide_stderr = true, + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, }, .listen_uri = "defer", .start_hook = migrate_hook_start_multifd_tls_x509_reject_anon_client, -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
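After this patch, a typical test declares its capabilities in the .start object and migrate_start() applies them to both sides. A sketch of the resulting shape (the test name is hypothetical; the fields and defaults match the framework changes above):

    static void test_example_multifd_tcp(void)
    {
        MigrateCommon args = {
            .listen_uri = "defer",
            .start = {
                /* applied to source and destination by migrate_start() */
                .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
            },
            /*
             * multifd-channels defaults to MULTIFD_TEST_CHANNELS (4);
             * tests needing a different value call
             * migrate_set_parameter_int() from a start hook.
             */
        };
        test_precopy_common(&args);
    }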
* [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd 2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit ` (2 preceding siblings ...) 2025-02-28 12:17 ` [PATCH v7 3/5] tests/qtest/migration: consolidate set capabilities Prasad Pandit @ 2025-02-28 12:17 ` Prasad Pandit 2025-02-28 15:11 ` Fabiano Rosas 2025-02-28 12:17 ` [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command Prasad Pandit 2025-02-28 14:53 ` [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Fabiano Rosas 5 siblings, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> Add new qtests to run postcopy migration with multifd channels enabled. Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- tests/qtest/migration/compression-tests.c | 15 ++++++++ tests/qtest/migration/postcopy-tests.c | 27 +++++++++++++ tests/qtest/migration/precopy-tests.c | 19 ++++++++++ tests/qtest/migration/tls-tests.c | 46 +++++++++++++++++++++++ 4 files changed, 107 insertions(+) v6: Move .caps initialisations to .start object. - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c index 41e79f031b..592e85d69d 100644 --- a/tests/qtest/migration/compression-tests.c +++ b/tests/qtest/migration/compression-tests.c @@ -42,6 +42,19 @@ static void test_multifd_tcp_zstd(void) }; test_precopy_common(&args); } + +static void test_multifd_postcopy_tcp_zstd(void) +{ + MigrateCommon args = { + .listen_uri = "defer", + .start = { + .caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true, + }, + .start_hook = migrate_hook_start_precopy_tcp_multifd_zstd, + }; + + test_precopy_common(&args); +} #endif /* CONFIG_ZSTD */ #ifdef CONFIG_QATZIP @@ -184,6 +197,8 @@ void migration_test_add_compression(MigrationTestEnv *env) #ifdef CONFIG_ZSTD migration_test_add("/migration/multifd/tcp/plain/zstd", test_multifd_tcp_zstd); + migration_test_add("/migration/multifd+postcopy/tcp/plain/zstd", + test_multifd_postcopy_tcp_zstd); #endif #ifdef CONFIG_QATZIP diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c index 483e3ff99f..eb637f94f7 100644 --- a/tests/qtest/migration/postcopy-tests.c +++ b/tests/qtest/migration/postcopy-tests.c @@ -94,6 +94,29 @@ static void migration_test_add_postcopy_smoke(MigrationTestEnv *env) } } +static void test_multifd_postcopy(void) +{ + MigrateCommon args = { + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, + }; + + test_postcopy_common(&args); +} + +static void test_multifd_postcopy_preempt(void) +{ + MigrateCommon args = { + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, + }, + }; + + test_postcopy_common(&args); +} + void migration_test_add_postcopy(MigrationTestEnv *env) { migration_test_add_postcopy_smoke(env); @@ -114,6 +137,10 @@ void migration_test_add_postcopy(MigrationTestEnv *env) "/migration/postcopy/recovery/double-failures/reconnect", test_postcopy_recovery_fail_reconnect); + migration_test_add("/migration/postcopy/multifd/plain", + test_multifd_postcopy); + migration_test_add("/migration/postcopy/multifd/preempt/plain", + test_multifd_postcopy_preempt); if (env->is_x86) { migration_test_add("/migration/postcopy/suspend", test_postcopy_suspend); diff --git 
a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c index f8404793b8..b2b0db8076 100644 --- a/tests/qtest/migration/precopy-tests.c +++ b/tests/qtest/migration/precopy-tests.c @@ -34,6 +34,7 @@ #define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ static char *tmpfs; +static bool postcopy_ram = false; static void test_precopy_unix_plain(void) { @@ -476,6 +477,11 @@ static void test_multifd_tcp_cancel(void) migrate_ensure_non_converge(from); migrate_prepare_for_dirty_mem(from); + if (postcopy_ram) { + migrate_set_capability(from, "postcopy-ram", true); + migrate_set_capability(to, "postcopy-ram", true); + } + migrate_set_parameter_int(from, "multifd-channels", 16); migrate_set_parameter_int(to, "multifd-channels", 16); @@ -517,6 +523,10 @@ static void test_multifd_tcp_cancel(void) return; } + if (postcopy_ram) { + migrate_set_capability(to2, "postcopy-ram", true); + } + migrate_set_parameter_int(to2, "multifd-channels", 16); migrate_set_capability(to2, "multifd", true); @@ -540,6 +550,13 @@ static void test_multifd_tcp_cancel(void) migrate_end(from, to2, true); } +static void test_multifd_postcopy_tcp_cancel(void) +{ + postcopy_ram = true; + test_multifd_tcp_cancel(); + postcopy_ram = false; +} + static void test_cancel_src_after_failed(QTestState *from, QTestState *to, const char *uri, const char *phase) { @@ -1127,6 +1144,8 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env) test_multifd_tcp_uri_none); migration_test_add("/migration/multifd/tcp/plain/cancel", test_multifd_tcp_cancel); + migration_test_add("/migration/multifd+postcopy/tcp/plain/cancel", + test_multifd_postcopy_tcp_cancel); } void migration_test_add_precopy(MigrationTestEnv *env) diff --git a/tests/qtest/migration/tls-tests.c b/tests/qtest/migration/tls-tests.c index 72f44defbb..81a2faf4c0 100644 --- a/tests/qtest/migration/tls-tests.c +++ b/tests/qtest/migration/tls-tests.c @@ -395,6 +395,19 @@ static void test_postcopy_recovery_tls_psk(void) test_postcopy_recovery_common(&args); } +static void test_multifd_postcopy_recovery_tls_psk(void) +{ + MigrateCommon args = { + .start_hook = migrate_hook_start_tls_psk_match, + .end_hook = migrate_hook_end_tls_psk, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, + }; + + test_postcopy_recovery_common(&args); +} + /* This contains preempt+recovery+tls test altogether */ static void test_postcopy_preempt_all(void) { @@ -409,6 +422,19 @@ static void test_postcopy_preempt_all(void) test_postcopy_recovery_common(&args); } +static void test_multifd_postcopy_preempt_recovery_tls_psk(void) +{ + MigrateCommon args = { + .start_hook = migrate_hook_start_tls_psk_match, + .end_hook = migrate_hook_end_tls_psk, + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, + }; + + test_postcopy_recovery_common(&args); +} + static void test_precopy_unix_tls_psk(void) { g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); @@ -657,6 +683,20 @@ static void test_multifd_tcp_tls_psk_mismatch(void) test_precopy_common(&args); } +static void test_multifd_postcopy_tcp_tls_psk_match(void) +{ + MigrateCommon args = { + .start = { + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, + }, + .listen_uri = "defer", + .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match, + .end_hook = migrate_hook_end_tls_psk, + }; + + test_precopy_common(&args); +} + #ifdef CONFIG_TASN1 static void test_multifd_tcp_tls_x509_default_host(void) { @@ -774,6 +814,10 @@ void migration_test_add_tls(MigrationTestEnv *env) test_postcopy_preempt_tls_psk); 
migration_test_add("/migration/postcopy/preempt/recovery/tls/psk", test_postcopy_preempt_all); + migration_test_add("/migration/postcopy/multifd/recovery/tls/psk", + test_multifd_postcopy_recovery_tls_psk); + migration_test_add("/migration/postcopy/multifd/preempt/recovery/tls/psk", + test_multifd_postcopy_preempt_recovery_tls_psk); } #ifdef CONFIG_TASN1 migration_test_add("/migration/precopy/unix/tls/x509/default-host", @@ -805,6 +849,8 @@ void migration_test_add_tls(MigrationTestEnv *env) test_multifd_tcp_tls_psk_match); migration_test_add("/migration/multifd/tcp/tls/psk/mismatch", test_multifd_tcp_tls_psk_mismatch); + migration_test_add("/migration/multifd+postcopy/tcp/tls/psk/match", + test_multifd_postcopy_tcp_tls_psk_match); #ifdef CONFIG_TASN1 migration_test_add("/migration/multifd/tcp/tls/x509/default-host", test_multifd_tcp_tls_x509_default_host); -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd 2025-02-28 12:17 ` [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit @ 2025-02-28 15:11 ` Fabiano Rosas 2025-03-03 9:33 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Fabiano Rosas @ 2025-02-28 15:11 UTC (permalink / raw) To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit Prasad Pandit <ppandit@redhat.com> writes: > From: Prasad Pandit <pjp@fedoraproject.org> > > Add new qtests to run postcopy migration with multifd > channels enabled. > > Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> > --- > tests/qtest/migration/compression-tests.c | 15 ++++++++ > tests/qtest/migration/postcopy-tests.c | 27 +++++++++++++ > tests/qtest/migration/precopy-tests.c | 19 ++++++++++ > tests/qtest/migration/tls-tests.c | 46 +++++++++++++++++++++++ > 4 files changed, 107 insertions(+) > > v6: Move .caps initialisations to .start object. > - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t > > diff --git a/tests/qtest/migration/compression-tests.c b/tests/qtest/migration/compression-tests.c > index 41e79f031b..592e85d69d 100644 > --- a/tests/qtest/migration/compression-tests.c > +++ b/tests/qtest/migration/compression-tests.c > @@ -42,6 +42,19 @@ static void test_multifd_tcp_zstd(void) > }; > test_precopy_common(&args); > } > + > +static void test_multifd_postcopy_tcp_zstd(void) > +{ > + MigrateCommon args = { > + .listen_uri = "defer", > + .start = { > + .caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true, > + }, > + .start_hook = migrate_hook_start_precopy_tcp_multifd_zstd, > + }; > + > + test_precopy_common(&args); > +} > #endif /* CONFIG_ZSTD */ > > #ifdef CONFIG_QATZIP > @@ -184,6 +197,8 @@ void migration_test_add_compression(MigrationTestEnv *env) > #ifdef CONFIG_ZSTD > migration_test_add("/migration/multifd/tcp/plain/zstd", > test_multifd_tcp_zstd); > + migration_test_add("/migration/multifd+postcopy/tcp/plain/zstd", > + test_multifd_postcopy_tcp_zstd); > #endif > > #ifdef CONFIG_QATZIP > diff --git a/tests/qtest/migration/postcopy-tests.c b/tests/qtest/migration/postcopy-tests.c > index 483e3ff99f..eb637f94f7 100644 > --- a/tests/qtest/migration/postcopy-tests.c > +++ b/tests/qtest/migration/postcopy-tests.c > @@ -94,6 +94,29 @@ static void migration_test_add_postcopy_smoke(MigrationTestEnv *env) > } > } > > +static void test_multifd_postcopy(void) > +{ > + MigrateCommon args = { > + .start = { > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, > + }, > + }; > + > + test_postcopy_common(&args); > +} > + > +static void test_multifd_postcopy_preempt(void) > +{ > + MigrateCommon args = { > + .start = { > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, > + .caps[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT] = true, > + }, > + }; > + > + test_postcopy_common(&args); > +} > + > void migration_test_add_postcopy(MigrationTestEnv *env) > { > migration_test_add_postcopy_smoke(env); > @@ -114,6 +137,10 @@ void migration_test_add_postcopy(MigrationTestEnv *env) > "/migration/postcopy/recovery/double-failures/reconnect", > test_postcopy_recovery_fail_reconnect); > > + migration_test_add("/migration/postcopy/multifd/plain", > + test_multifd_postcopy); > + migration_test_add("/migration/postcopy/multifd/preempt/plain", > + test_multifd_postcopy_preempt); > if (env->is_x86) { > migration_test_add("/migration/postcopy/suspend", > test_postcopy_suspend); > diff --git a/tests/qtest/migration/precopy-tests.c b/tests/qtest/migration/precopy-tests.c > index 
f8404793b8..b2b0db8076 100644 > --- a/tests/qtest/migration/precopy-tests.c > +++ b/tests/qtest/migration/precopy-tests.c > @@ -34,6 +34,7 @@ > #define DIRTYLIMIT_TOLERANCE_RANGE 25 /* MB/s */ > > static char *tmpfs; > +static bool postcopy_ram = false; > > static void test_precopy_unix_plain(void) > { > @@ -476,6 +477,11 @@ static void test_multifd_tcp_cancel(void) > migrate_ensure_non_converge(from); > migrate_prepare_for_dirty_mem(from); > > + if (postcopy_ram) { > + migrate_set_capability(from, "postcopy-ram", true); > + migrate_set_capability(to, "postcopy-ram", true); > + } > + > migrate_set_parameter_int(from, "multifd-channels", 16); > migrate_set_parameter_int(to, "multifd-channels", 16); > > @@ -517,6 +523,10 @@ static void test_multifd_tcp_cancel(void) > return; > } > > + if (postcopy_ram) { > + migrate_set_capability(to2, "postcopy-ram", true); > + } > + > migrate_set_parameter_int(to2, "multifd-channels", 16); > > migrate_set_capability(to2, "multifd", true); > @@ -540,6 +550,13 @@ static void test_multifd_tcp_cancel(void) > migrate_end(from, to2, true); > } > > +static void test_multifd_postcopy_tcp_cancel(void) > +{ > + postcopy_ram = true; > + test_multifd_tcp_cancel(); > + postcopy_ram = false; > +} > + > static void test_cancel_src_after_failed(QTestState *from, QTestState *to, > const char *uri, const char *phase) > { > @@ -1127,6 +1144,8 @@ static void migration_test_add_precopy_smoke(MigrationTestEnv *env) > test_multifd_tcp_uri_none); > migration_test_add("/migration/multifd/tcp/plain/cancel", > test_multifd_tcp_cancel); > + migration_test_add("/migration/multifd+postcopy/tcp/plain/cancel", > + test_multifd_postcopy_tcp_cancel); > } > > void migration_test_add_precopy(MigrationTestEnv *env) > diff --git a/tests/qtest/migration/tls-tests.c b/tests/qtest/migration/tls-tests.c > index 72f44defbb..81a2faf4c0 100644 > --- a/tests/qtest/migration/tls-tests.c > +++ b/tests/qtest/migration/tls-tests.c > @@ -395,6 +395,19 @@ static void test_postcopy_recovery_tls_psk(void) > test_postcopy_recovery_common(&args); > } > > +static void test_multifd_postcopy_recovery_tls_psk(void) > +{ > + MigrateCommon args = { > + .start_hook = migrate_hook_start_tls_psk_match, > + .end_hook = migrate_hook_end_tls_psk, > + .start = { > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, > + }, > + }; > + > + test_postcopy_recovery_common(&args); > +} > + > /* This contains preempt+recovery+tls test altogether */ > static void test_postcopy_preempt_all(void) > { > @@ -409,6 +422,19 @@ static void test_postcopy_preempt_all(void) > test_postcopy_recovery_common(&args); > } > > +static void test_multifd_postcopy_preempt_recovery_tls_psk(void) > +{ > + MigrateCommon args = { > + .start_hook = migrate_hook_start_tls_psk_match, > + .end_hook = migrate_hook_end_tls_psk, > + .start = { > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, > + }, > + }; > + > + test_postcopy_recovery_common(&args); > +} > + > static void test_precopy_unix_tls_psk(void) > { > g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs); > @@ -657,6 +683,20 @@ static void test_multifd_tcp_tls_psk_mismatch(void) > test_precopy_common(&args); > } > > +static void test_multifd_postcopy_tcp_tls_psk_match(void) > +{ > + MigrateCommon args = { > + .start = { > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, Missing POSTCOPY_RAM here, no? 
> + }, > + .listen_uri = "defer", > + .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match, > + .end_hook = migrate_hook_end_tls_psk, > + }; > + > + test_precopy_common(&args); > +} > + > #ifdef CONFIG_TASN1 > static void test_multifd_tcp_tls_x509_default_host(void) > { > @@ -774,6 +814,10 @@ void migration_test_add_tls(MigrationTestEnv *env) > test_postcopy_preempt_tls_psk); > migration_test_add("/migration/postcopy/preempt/recovery/tls/psk", > test_postcopy_preempt_all); > + migration_test_add("/migration/postcopy/multifd/recovery/tls/psk", > + test_multifd_postcopy_recovery_tls_psk); > + migration_test_add("/migration/postcopy/multifd/preempt/recovery/tls/psk", > + test_multifd_postcopy_preempt_recovery_tls_psk); > } > #ifdef CONFIG_TASN1 > migration_test_add("/migration/precopy/unix/tls/x509/default-host", > @@ -805,6 +849,8 @@ void migration_test_add_tls(MigrationTestEnv *env) > test_multifd_tcp_tls_psk_match); > migration_test_add("/migration/multifd/tcp/tls/psk/mismatch", > test_multifd_tcp_tls_psk_mismatch); > + migration_test_add("/migration/multifd+postcopy/tcp/tls/psk/match", > + test_multifd_postcopy_tcp_tls_psk_match); > #ifdef CONFIG_TASN1 > migration_test_add("/migration/multifd/tcp/tls/x509/default-host", > test_multifd_tcp_tls_x509_default_host); ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd 2025-02-28 15:11 ` Fabiano Rosas @ 2025-03-03 9:33 ` Prasad Pandit 0 siblings, 0 replies; 37+ messages in thread From: Prasad Pandit @ 2025-03-03 9:33 UTC (permalink / raw) To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit On Fri, 28 Feb 2025 at 20:41, Fabiano Rosas <farosas@suse.de> wrote: > > +static void test_multifd_postcopy_tcp_tls_psk_match(void) > > +{ > > + MigrateCommon args = { > > + .start = { > > + .caps[MIGRATION_CAPABILITY_MULTIFD] = true, > > Missing POSTCOPY_RAM here, no? Yes, will fix it. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 37+ messages in thread
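For reference, the fix promised above would presumably add the missing capability alongside multifd. A sketch of the corrected initializer (v8 material, not a posted patch):

    static void test_multifd_postcopy_tcp_tls_psk_match(void)
    {
        MigrateCommon args = {
            .start = {
                .caps[MIGRATION_CAPABILITY_MULTIFD] = true,
                .caps[MIGRATION_CAPABILITY_POSTCOPY_RAM] = true,  /* was missing */
            },
            .listen_uri = "defer",
            .start_hook = migrate_hook_start_multifd_tcp_tls_psk_match,
            .end_hook = migrate_hook_end_tls_psk,
        };

        test_precopy_common(&args);
    }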
* [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit ` (3 preceding siblings ...) 2025-02-28 12:17 ` [PATCH v7 4/5] tests/qtest/migration: add postcopy tests with multifd Prasad Pandit @ 2025-02-28 12:17 ` Prasad Pandit 2025-02-28 13:42 ` Peter Xu 2025-02-28 14:53 ` [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Fabiano Rosas 5 siblings, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-02-28 12:17 UTC (permalink / raw) To: qemu-devel; +Cc: peterx, farosas, berrange, Prasad Pandit From: Prasad Pandit <pjp@fedoraproject.org> When Multifd and Postcopy migration are enabled together, the Multifd threads are shut down before the Postcopy phase starts. Before this shutdown, we need to flush the Multifd channels on the source side and notify the destination migration thread to synchronise with the Multifd 'recv_x' threads, so that all the Multifd data is received and processed on the destination side before the Multifd threads are shut down. This synchronisation happens when the main migration thread waits for all the Multifd threads to receive their data. Add a 'MULTIFD_RECV_SYNC' migration command to notify the destination side to synchronise with the Multifd threads. Suggested-by: Peter Xu <peterx@redhat.com> Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> --- migration/migration.c | 3 +++ migration/multifd-nocomp.c | 21 +++++++++++++++------ migration/multifd.c | 1 + migration/multifd.h | 1 + migration/savevm.c | 13 +++++++++++++ migration/savevm.h | 1 + 6 files changed, 34 insertions(+), 6 deletions(-) v6: New patch, not from earlier versions. - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t diff --git a/migration/migration.c b/migration/migration.c index 5db9e18272..65fc4f5eed 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3401,6 +3401,9 @@ static MigIterateState migration_iteration_run(MigrationState *s) if (!in_postcopy && must_precopy <= s->threshold_size && can_switchover && qatomic_read(&s->start_postcopy)) { if (migrate_multifd()) { + multifd_send_flush(); + multifd_send_sync_main(MULTIFD_SYNC_LOCAL); + qemu_savevm_send_multifd_recv_sync(s->to_dst_file); multifd_send_shutdown(); } if (postcopy_start(s, &local_err)) { diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index d0edec7cd1..bbe07e4f7e 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -334,7 +334,7 @@ retry: * After flush, always retry.
*/ if (pages->block != block || multifd_queue_full(pages)) { - if (!multifd_send(&multifd_ram_send)) { + if (multifd_send_flush() < 0) { return false; } goto retry; @@ -387,6 +387,18 @@ bool multifd_ram_sync_per_round(void) return !migrate_multifd_flush_after_each_section(); } +int multifd_send_flush(void) +{ + if (!multifd_payload_empty(multifd_ram_send)) { + if (!multifd_send(&multifd_ram_send)) { + error_report("%s: multifd_send fail", __func__); + return -1; + } + } + + return 0; +} + int multifd_ram_flush_and_sync(QEMUFile *f) { MultiFDSyncReq req; @@ -396,11 +408,8 @@ int multifd_ram_flush_and_sync(QEMUFile *f) return 0; } - if (!multifd_payload_empty(multifd_ram_send)) { - if (!multifd_send(&multifd_ram_send)) { - error_report("%s: multifd_send fail", __func__); - return -1; - } + if ((ret = multifd_send_flush()) < 0) { + return ret; } /* File migrations only need to sync with threads */ diff --git a/migration/multifd.c b/migration/multifd.c index 2bd8604ca1..8928ca2611 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -1265,6 +1265,7 @@ static void *multifd_recv_thread(void *opaque) rcu_unregister_thread(); trace_multifd_recv_thread_end(p->id, p->packets_recved); + qemu_sem_post(&multifd_recv_state->sem_sync); return NULL; } diff --git a/migration/multifd.h b/migration/multifd.h index bff867ca6b..242b923633 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -361,6 +361,7 @@ static inline uint32_t multifd_ram_page_count(void) void multifd_ram_save_setup(void); void multifd_ram_save_cleanup(void); +int multifd_send_flush(void); int multifd_ram_flush_and_sync(QEMUFile *f); bool multifd_ram_sync_per_round(void); bool multifd_ram_sync_per_section(void); diff --git a/migration/savevm.c b/migration/savevm.c index 4046faf009..0b71e988ba 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -37,6 +37,7 @@ #include "migration/register.h" #include "migration/global_state.h" #include "migration/channel-block.h" +#include "multifd.h" #include "ram.h" #include "qemu-file.h" #include "savevm.h" @@ -90,6 +91,7 @@ enum qemu_vm_cmd { MIG_CMD_ENABLE_COLO, /* Enable COLO */ MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ + MIG_CMD_MULTIFD_RECV_SYNC, /* Sync multifd recv_x and main threads */ MIG_CMD_MAX }; @@ -109,6 +111,7 @@ static struct mig_cmd_args { [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, + [MIG_CMD_MULTIFD_RECV_SYNC] = { .len = 0, .name = "MULTIFD_RECV_SYNC" }, [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, }; @@ -1201,6 +1204,12 @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name) qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf); } +void qemu_savevm_send_multifd_recv_sync(QEMUFile *f) +{ + /* TBD: trace_savevm_send_multifd_recv_sync(); */ + qemu_savevm_command_send(f, MIG_CMD_MULTIFD_RECV_SYNC, 0, NULL); +} + bool qemu_savevm_state_blocked(Error **errp) { SaveStateEntry *se; @@ -2479,6 +2488,10 @@ static int loadvm_process_command(QEMUFile *f) case MIG_CMD_RECV_BITMAP: return loadvm_handle_recv_bitmap(mis, len); + case MIG_CMD_MULTIFD_RECV_SYNC: + multifd_recv_sync_main(); + break; + case MIG_CMD_ENABLE_COLO: return loadvm_process_enable_colo(mis); } diff --git a/migration/savevm.h b/migration/savevm.h index 7957460062..91ae703925 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -53,6 +53,7 @@ void 
qemu_savevm_send_postcopy_listen(QEMUFile *f); void qemu_savevm_send_postcopy_run(QEMUFile *f); void qemu_savevm_send_postcopy_resume(QEMUFile *f); void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); +void qemu_savevm_send_multifd_recv_sync(QEMUFile *f); void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, uint16_t len, -- 2.48.1 ^ permalink raw reply related [flat|nested] 37+ messages in thread
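For context on how such a command travels: a sketch of the QEMU_VM_COMMAND framing that qemu_savevm_command_send() puts on the main channel (simplified from the savevm code paths; the destination's loadvm_process_command() reads the same fields back and dispatches on the command value, which is how MIG_CMD_MULTIFD_RECV_SYNC above reaches multifd_recv_sync_main()):

===
/* Simplified sketch of the main-channel command framing. */
static void savevm_command_send_sketch(QEMUFile *f, uint16_t cmd,
                                       uint16_t len, uint8_t *data)
{
    qemu_put_byte(f, QEMU_VM_COMMAND);   /* section type byte */
    qemu_put_be16(f, cmd);               /* e.g. MIG_CMD_MULTIFD_RECV_SYNC */
    qemu_put_be16(f, len);               /* payload length, 0 for this command */
    if (len) {
        qemu_put_buffer(f, data, len);   /* optional payload */
    }
}
===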
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-02-28 12:17 ` [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command Prasad Pandit @ 2025-02-28 13:42 ` Peter Xu 2025-03-03 11:43 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Peter Xu @ 2025-02-28 13:42 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Fri, Feb 28, 2025 at 05:47:49PM +0530, Prasad Pandit wrote: > From: Prasad Pandit <pjp@fedoraproject.org> > > When Multifd and Postcopy migration is enabled together, > before starting Postcopy phase, Multifd threads are shutdown. > > But before this shutdown, we need to flush the Multifd channels > on the source side and notify the destination migration thread > to synchronise with the Multifd 'recv_x' threads, so that all > the Multifd data is received and processed on the destination > side, before Multifd threads are shutdown. > > This synchronisation happens when the main migration thread > waits for all the Multifd threads to receive their data. > > Add 'MULTIFD_RECV_SYNC' migration command to notify the > destination side to synchronise with Multifd threads. > > Suggested-by: Peter Xu <peterx@redhat.com> Thanks for trying to grant me the credit, but.. I suggest not having a new message at all. We should be able to do multifd's flush and sync before VM stopped in postcopy_start().. So we can drop this line. And whatever come up with, it needs to be with patch 2 because patch 2 currently is broken itself. > Signed-off-by: Prasad Pandit <pjp@fedoraproject.org> > --- > migration/migration.c | 3 +++ > migration/multifd-nocomp.c | 21 +++++++++++++++------ > migration/multifd.c | 1 + > migration/multifd.h | 1 + > migration/savevm.c | 13 +++++++++++++ > migration/savevm.h | 1 + > 6 files changed, 34 insertions(+), 6 deletions(-) > > v6: New patch, not from earlier versions. > - https://lore.kernel.org/qemu-devel/20250215123119.814345-1-ppandit@redhat.com/T/#t > > diff --git a/migration/migration.c b/migration/migration.c > index 5db9e18272..65fc4f5eed 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -3401,6 +3401,9 @@ static MigIterateState migration_iteration_run(MigrationState *s) > if (!in_postcopy && must_precopy <= s->threshold_size > && can_switchover && qatomic_read(&s->start_postcopy)) { > if (migrate_multifd()) { > + multifd_send_flush(); > + multifd_send_sync_main(MULTIFD_SYNC_LOCAL); > + qemu_savevm_send_multifd_recv_sync(s->to_dst_file); And.. this is not what I meant.. So again, there're more than one way to make sure no multifd pages will arrive during postcopy. What I actually think the easiest is to do flush and sync once in postcopy_start() as mentioned above. I think that'll serialize with postcopy messages later on the main channel, making sure all channels are flushed before moving to postcopy work. If you want to stick with shutdown channels, I'm OK. But then we at least need one message to say "the recv side finished joining all recv threads". That needs to be an ACK sent from dest to src, not src to dest. > multifd_send_shutdown(); > } > if (postcopy_start(s, &local_err)) { > diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c > index d0edec7cd1..bbe07e4f7e 100644 > --- a/migration/multifd-nocomp.c > +++ b/migration/multifd-nocomp.c > @@ -334,7 +334,7 @@ retry: > * After flush, always retry. 
> */ > if (pages->block != block || multifd_queue_full(pages)) { > - if (!multifd_send(&multifd_ram_send)) { > + if (multifd_send_flush() < 0) { > return false; > } > goto retry; > @@ -387,6 +387,18 @@ bool multifd_ram_sync_per_round(void) > return !migrate_multifd_flush_after_each_section(); > } > > +int multifd_send_flush(void) > +{ > + if (!multifd_payload_empty(multifd_ram_send)) { > + if (!multifd_send(&multifd_ram_send)) { > + error_report("%s: multifd_send fail", __func__); > + return -1; > + } > + } > + > + return 0; > +} > + > int multifd_ram_flush_and_sync(QEMUFile *f) > { > MultiFDSyncReq req; > @@ -396,11 +408,8 @@ int multifd_ram_flush_and_sync(QEMUFile *f) > return 0; > } > > - if (!multifd_payload_empty(multifd_ram_send)) { > - if (!multifd_send(&multifd_ram_send)) { > - error_report("%s: multifd_send fail", __func__); > - return -1; > - } > + if ((ret = multifd_send_flush()) < 0) { > + return ret; > } > > /* File migrations only need to sync with threads */ > diff --git a/migration/multifd.c b/migration/multifd.c > index 2bd8604ca1..8928ca2611 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -1265,6 +1265,7 @@ static void *multifd_recv_thread(void *opaque) > > rcu_unregister_thread(); > trace_multifd_recv_thread_end(p->id, p->packets_recved); > + qemu_sem_post(&multifd_recv_state->sem_sync); > > return NULL; > } > diff --git a/migration/multifd.h b/migration/multifd.h > index bff867ca6b..242b923633 100644 > --- a/migration/multifd.h > +++ b/migration/multifd.h > @@ -361,6 +361,7 @@ static inline uint32_t multifd_ram_page_count(void) > > void multifd_ram_save_setup(void); > void multifd_ram_save_cleanup(void); > +int multifd_send_flush(void); > int multifd_ram_flush_and_sync(QEMUFile *f); > bool multifd_ram_sync_per_round(void); > bool multifd_ram_sync_per_section(void); > diff --git a/migration/savevm.c b/migration/savevm.c > index 4046faf009..0b71e988ba 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -37,6 +37,7 @@ > #include "migration/register.h" > #include "migration/global_state.h" > #include "migration/channel-block.h" > +#include "multifd.h" > #include "ram.h" > #include "qemu-file.h" > #include "savevm.h" > @@ -90,6 +91,7 @@ enum qemu_vm_cmd { > MIG_CMD_ENABLE_COLO, /* Enable COLO */ > MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ > MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ > + MIG_CMD_MULTIFD_RECV_SYNC, /* Sync multifd recv_x and main threads */ > MIG_CMD_MAX > }; > > @@ -109,6 +111,7 @@ static struct mig_cmd_args { > [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, > [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, > [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, > + [MIG_CMD_MULTIFD_RECV_SYNC] = { .len = 0, .name = "MULTIFD_RECV_SYNC" }, > [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, > }; > > @@ -1201,6 +1204,12 @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name) > qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf); > } > > +void qemu_savevm_send_multifd_recv_sync(QEMUFile *f) > +{ > + /* TBD: trace_savevm_send_multifd_recv_sync(); */ > + qemu_savevm_command_send(f, MIG_CMD_MULTIFD_RECV_SYNC, 0, NULL); > +} > + > bool qemu_savevm_state_blocked(Error **errp) > { > SaveStateEntry *se; > @@ -2479,6 +2488,10 @@ static int loadvm_process_command(QEMUFile *f) > case MIG_CMD_RECV_BITMAP: > return loadvm_handle_recv_bitmap(mis, len); > > + case MIG_CMD_MULTIFD_RECV_SYNC: > + multifd_recv_sync_main(); > + break; > + > case 
MIG_CMD_ENABLE_COLO: > return loadvm_process_enable_colo(mis); > } > diff --git a/migration/savevm.h b/migration/savevm.h > index 7957460062..91ae703925 100644 > --- a/migration/savevm.h > +++ b/migration/savevm.h > @@ -53,6 +53,7 @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); > void qemu_savevm_send_postcopy_run(QEMUFile *f); > void qemu_savevm_send_postcopy_resume(QEMUFile *f); > void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); > +void qemu_savevm_send_multifd_recv_sync(QEMUFile *f); > > void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, > uint16_t len, > -- > 2.48.1 > -- Peter Xu ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-02-28 13:42 ` Peter Xu @ 2025-03-03 11:43 ` Prasad Pandit 2025-03-03 14:50 ` Peter Xu 0 siblings, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-03-03 11:43 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hello Peter, On Fri, 28 Feb 2025 at 19:13, Peter Xu <peterx@redhat.com> wrote: > We should be able to do multifd's flush and sync before VM > stopped in postcopy_start().. > > What I actually think the easiest is to do flush and sync once in > postcopy_start() as mentioned above. I think that'll serialize with > postcopy messages later on the main channel, making sure all channels are > flushed before moving to postcopy work. * Is there some specific benefit to calling 'multifd_ram_flush_and_sync()' from OR before postcopy_start()? Maybe that is missing on me. * I'll try to explain why adding a migration command makes sense: - I did call 'multifd_ram_flush_and_sync()' before calling postcopy_start() in migration_iteration_run(). After flushing the 'multifd_send' queue, all it does is send 'RAM_SAVE_FLAG_MULTIFD_FLUSH' flag on the main channel. On the destination side the 'qemu_loadvm_state_main()' function does not understand that message, because it looks for 'section_type'; And when it is not able to identify the section, it leads to an error. "Expected vmdescription section, but got %d", section_type(=0)" - ie. multifd_ram_flush_and_sync() is not reusable by itself. To make it work, we need to add a (RAM) section header to the message. Because then it'll go to ram_load_precopy() and call -> multifd_recv_sync_main(). - But sending a lone RAM section header from migration_iteration_run() OR even in postcopy_start() does not seem right. That is out-of-place, because both migration_iteration_run() and postcopy_start() have nothing to do with RAM sections. - I think 'flush' and 'sync' ought to be small individual functions/operations that are section agnostic. We should be able to call 'flush' and 'sync' from anywhere in the code without side-effects. So tying 'flush' and 'sync' with RAM sections does not seem right, because we need to be able to call 'flush' and 'sync' from other parts too, like before calling postcopy_start() OR maybe some other code parts. - Another observation is: when multifd and postcopy are enabled together and the guest VM is writing to its RAM, multifd_ram_flush_and_sync() is called only during setup, not after that. ===== 2025-03-02 18:13:26.425+0000: initiating migration 2025-03-02T18:13:26.508809Z qemu-system-x86_64: multifd_ram_flush_and_sync: called 2025-03-02 18:15:20.070+0000: shutting down, reason=migrated 2025-03-03 11:26:41.340+0000: initiating migration 2025-03-03T11:26:41.415625Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-03 11:30:22.766+0000: shutting down, reason=migrated ===== - If we untie the 'flush and sync' from RAM sections and make it a general migration command, we shall be able to call it from anywhere else. > If you want to stick with shutdown channels, I'm OK. But then we at least > need one message to say "the recv side finished joining all recv threads". > That needs to be an ACK sent from dest to src, not src to dest. * But earlier we discussed 'flush and sync' is enough for that, no? Because 'multifd_ram_flush_and_sync' only notifies the destination to sync multifd_recv threads. It does not receive any acknowledgement back. 
* And the multifd_recv_sync_main() function on the destination blocks the 'main' thread until all multifd_recv_threads (mig/dst/recv_x) have exited; only then does it proceed to accept the incoming new postcopy connection. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 37+ messages in thread
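The error Prasad describes falls out of the main-channel framing: the destination reads one section-type byte and routes on it, and a bare RAM flag is not a section type. A rough sketch of that dispatch, simplified from qemu_loadvm_state_main():

===
/* Simplified sketch; the real qemu_loadvm_state_main() has more cases
 * and proper error handling. */
static int loadvm_state_main_sketch(QEMUFile *f)
{
    uint8_t section_type;

    while ((section_type = qemu_get_byte(f)) != QEMU_VM_EOF) {
        switch (section_type) {
        case QEMU_VM_SECTION_START:
        case QEMU_VM_SECTION_FULL:
        case QEMU_VM_SECTION_PART:
        case QEMU_VM_SECTION_END:
            /* Section payload is parsed by the owning module; RAM flags
             * such as RAM_SAVE_FLAG_MULTIFD_FLUSH are only understood
             * in here, inside a "ram" section, by ram_load(). */
            break;
        case QEMU_VM_COMMAND:
            /* MIG_CMD_* commands are handled here. */
            break;
        default:
            /* A raw RAM flag read at this level is an unknown section
             * type: the failure mode described above. */
            return -EINVAL;
        }
    }
    return 0;
}
===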
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-03 11:43 ` Prasad Pandit @ 2025-03-03 14:50 ` Peter Xu 2025-03-04 8:10 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Peter Xu @ 2025-03-03 14:50 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Mon, Mar 03, 2025 at 05:13:56PM +0530, Prasad Pandit wrote: > Hello Peter, > > On Fri, 28 Feb 2025 at 19:13, Peter Xu <peterx@redhat.com> wrote: > > We should be able to do multifd's flush and sync before VM > > stopped in postcopy_start().. > > > > What I actually think the easiest is to do flush and sync once in > > postcopy_start() as mentioned above. I think that'll serialize with > > postcopy messages later on the main channel, making sure all channels are > > flushed before moving to postcopy work. > > * Is there some specific benefit to calling > 'multifd_ram_flush_and_sync()' from OR before postcopy_start()? Maybe > that is missing on me. > > * I'll try to explain why adding a migration command makes sense: > > - I did call 'multifd_ram_flush_and_sync()' before calling > postcopy_start() in migration_iteration_run(). After flushing the > 'multifd_send' queue, all it does is send > 'RAM_SAVE_FLAG_MULTIFD_FLUSH' flag on the main channel. On the > destination side the 'qemu_loadvm_state_main()' function does not > understand that message, because it looks for 'section_type'; And when > it is not able to identify the section, it leads to an error. > > "Expected vmdescription section, but got %d", section_type(=0)" > > - ie. multifd_ram_flush_and_sync() is not reusable by itself. To > make it work, we need to add a (RAM) section header to the message. > Because then it'll go to ram_load_precopy() and call -> > multifd_recv_sync_main(). > > - But sending a lone RAM section header from > migration_iteration_run() OR even in postcopy_start() does not seem > right. That is out-of-place, because both migration_iteration_run() > and postcopy_start() have nothing to do with RAM sections. > > - I think 'flush' and 'sync' ought to be small individual > functions/operations that are section agnostic. We should be able to > call 'flush' and 'sync' from anywhere in the code without > side-effects. So tying 'flush' and 'sync' with RAM sections does not > seem right, because we need to be able to call 'flush' and 'sync' from > other parts too, like before calling postcopy_start() OR maybe some > other code parts. We need the header. Maybe the easiest as of now is one more hook like qemu_savevm_state_complete_precopy_prepare(), and only use it in RAM as of now. > > - Another observation is: when multifd and postcopy are enabled > together and the guest VM is writing to its RAM, > multifd_ram_flush_and_sync() is called only during setup, not after > that. > ===== > 2025-03-02 18:13:26.425+0000: initiating migration > 2025-03-02T18:13:26.508809Z qemu-system-x86_64: > multifd_ram_flush_and_sync: called > 2025-03-02 18:15:20.070+0000: shutting down, reason=migrated > > 2025-03-03 11:26:41.340+0000: initiating migration > 2025-03-03T11:26:41.415625Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-03 11:30:22.766+0000: shutting down, reason=migrated > ===== > > - If we untie the 'flush and sync' from RAM sections and make it a > general migration command, we shall be able to call it from anywhere > else. > > > If you want to stick with shutdown channels, I'm OK. 
But then we at least > > need one message to say "the recv side finished joining all recv threads". > > That needs to be an ACK sent from dest to src, not src to dest. > > * But earlier we discussed 'flush and sync' is enough for that, no? Yes it's ok I think, but this patch didn't do that. + multifd_send_flush(); + multifd_send_sync_main(MULTIFD_SYNC_LOCAL); + qemu_savevm_send_multifd_recv_sync(s->to_dst_file); I don't think it sent RAM_SAVE_FLAG_MULTIFD_FLUSH. IIUC you need the complete multifd_ram_flush_and_sync(), and the new message is not needed. > Because 'multifd_ram_flush_and_sync' only notifies the destination to > sync multifd_recv threads. It does not receive any acknowledgement > back. If my understanding is correct, we don't need an ACK. It will make sure that when postcopy starts, all pages are flushed and consumed on the destination. Instead of me preparing the patch and the whole commit message, please take your time and think about it, justify it, and if you also think it works, put the explanation into the commit message and then we can go with it. > > * And the multifd_recv_sync_main() function on the destination blocks the > 'main' thread until all multifd_recv_threads (mig/dst/recv_x) have > exited; only then does it proceed to accept the incoming new postcopy > connection. I don't think it makes sure the threads have exited; AFAIU it does not join the threads, but syncs with them on the per-thread sync messages. Thanks, > > > Thank you. > --- > - Prasad > -- Peter Xu ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-03 14:50 ` Peter Xu @ 2025-03-04 8:10 ` Prasad Pandit 2025-03-04 14:35 ` Peter Xu 0 siblings, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-03-04 8:10 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hello Peter, On Mon, 3 Mar 2025 at 20:20, Peter Xu <peterx@redhat.com> wrote: > We need the header. * We need a section type, which is sent by qemu_savevm_command_send() as 'QEMU_VM_COMMAND'. > Maybe the easiest as of now is one more hook like > qemu_savevm_state_complete_precopy_prepare(), and only use it in RAM as of > now. * What will this helper do? > > * But earlier we discussed 'flush and sync' is enough for that, no? > > Yes it's ok I think, but this patch didn't do that. > > + multifd_send_flush(); > + multifd_send_sync_main(MULTIFD_SYNC_LOCAL); > + qemu_savevm_send_multifd_recv_sync(s->to_dst_file); > > I don't think it sent RAM_SAVE_FLAG_MULTIFD_FLUSH. IIUC you need the > complete multifd_ram_flush_and_sync(), and the new message is not needed. * If we look at multifd_ram_flush_and_sync(), it does: 1. multifd_send() <= this patch does it via multifd_send_flush() 2. multifd_send_sync_main() <= this patch also calls it above 3. send RAM_SAVE_FLAG_MULTIFD_FLUSH <= this patch sends MIG_CMD_MULTIFD_RECV_SYNC * What is missing? > Instead of me preparing the patch and the whole commit message, please take your > time and think about it, justify it, and if you also think it works, put > the explanation into the commit message and then we can go with it. * The commit message does explain the flush and sync and how the migration command helps. What else do we need to add? > > * And the multifd_recv_sync_main() function on the destination blocks the > > 'main' thread until all multifd_recv_threads (mig/dst/recv_x) have > > exited; only then does it proceed to accept the incoming new postcopy > > connection. > > I don't think it makes sure the threads have exited, * 'multifd_recv_sync_main()' blocks the main thread on the 'multifd_recv_state->sem_sync' semaphore. It is increased when multifd_recv threads exit due to the shutdown message, ie. the 'main' thread unblocks when all 'mig/dst/recv_x' threads have exited. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 37+ messages in thread
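For the destination-side behaviour debated here, a rough sketch of the sync Prasad refers to (simplified; with the posted patch, sem_sync is posted once by each recv thread on exit, whereas normally a MULTIFD_FLAG_SYNC packet posts it):

===
/* Simplified sketch of multifd_recv_sync_main(): the main thread blocks
 * once per channel on sem_sync before continuing. */
static void multifd_recv_sync_main_sketch(void)
{
    for (int i = 0; i < migrate_multifd_channels(); i++) {
        qemu_sem_wait(&multifd_recv_state->sem_sync);
    }
    /* Only now does the main thread move on, e.g. to accept the
     * incoming postcopy connection. */
}
===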
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-04 8:10 ` Prasad Pandit @ 2025-03-04 14:35 ` Peter Xu 2025-03-05 11:21 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Peter Xu @ 2025-03-04 14:35 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Tue, Mar 04, 2025 at 01:40:02PM +0530, Prasad Pandit wrote: > Hello Peter, > > On Mon, 3 Mar 2025 at 20:20, Peter Xu <peterx@redhat.com> wrote: > > We need the header. > > * We need a section type, which is sent by qemu_savevm_command_send() > as 'QEMU_VM_COMMAND'. I think we need the header; the ram code is a module. > > > Maybe the easiest as of now is one more hook like > > qemu_savevm_state_complete_precopy_prepare(), and only use it in RAM as of > > now. > > * What will this helper do? Do something similar to qemu_savevm_state_complete_precopy_iterable(), but do whatever a vmstate handler wants, so it'll be with a header. > > > > * But earlier we discussed 'flush and sync' is enough for that, no? > > > > Yes it's ok I think, but this patch didn't do that. > > > > + multifd_send_flush(); > > + multifd_send_sync_main(MULTIFD_SYNC_LOCAL); > > + qemu_savevm_send_multifd_recv_sync(s->to_dst_file); > > > > I don't think it sent RAM_SAVE_FLAG_MULTIFD_FLUSH. IIUC you need the > > complete multifd_ram_flush_and_sync(), and the new message is not needed. > > * If we look at multifd_ram_flush_and_sync(), it does: > 1. multifd_send() <= this patch does it via > multifd_send_flush() > 2. multifd_send_sync_main() <= this patch also calls it above MULTIFD_SYNC_LOCAL will not invoke MULTIFD_FLAG_SYNC, which we need. > 3. send RAM_SAVE_FLAG_MULTIFD_FLUSH <= this patch sends > MIG_CMD_MULTIFD_RECV_SYNC IMO we shouldn't make a global cmd for multifd. > > * What is missing? > > > Instead of me preparing the patch and the whole commit message, please take your > > time and think about it, justify it, and if you also think it works, put > > the explanation into the commit message and then we can go with it. > > * The commit message does explain the flush and sync and how the > migration command helps. What else do we need to add? Please consider adding details like "we need message AAA on BBB channel to serialize with CCC". Not asking that as required to merge, but my understanding is that that's what is missing and that's why none of the versions so far can make sure of it in code. Maybe that'll help you to understand how that was serialized. > > > > * And the multifd_recv_sync_main() function on the destination blocks the > > > 'main' thread until all multifd_recv_threads (mig/dst/recv_x) have > > > exited; only then does it proceed to accept the incoming new postcopy > > > connection. > > > > I don't think it makes sure the threads have exited, > > * 'multifd_recv_sync_main()' blocks the main thread on the > 'multifd_recv_state->sem_sync' semaphore. It is increased when > multifd_recv threads exit due to the shutdown message, ie. the 'main' > thread unblocks when all 'mig/dst/recv_x' threads have exited. So is it your intention to not send MULTIFD_FLAG_SYNC above? In all cases, I still think that's not the right way to do it. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-04 14:35 ` Peter Xu @ 2025-03-05 11:21 ` Prasad Pandit 2025-03-05 12:54 ` Peter Xu 0 siblings, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-03-05 11:21 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hi, On Tue, 4 Mar 2025 at 20:05, Peter Xu <peterx@redhat.com> wrote: > I think we need the header; the ram code is a module. > Do something similar to qemu_savevm_state_complete_precopy_iterable(), but do > whatever a vmstate handler wants, so it'll be with a header. * I don't fully see yet how this shall work. > Please consider adding details like "we need message AAA on BBB channel to > serialize with CCC". Not asking that as required to merge, but > my understanding is that that's what is missing and that's why none of the > versions so far can make sure of it in code. Maybe that'll help you to understand > how that was serialized. * Okay, will try. > > MULTIFD_SYNC_LOCAL will not invoke MULTIFD_FLAG_SYNC, which we need. ... > So is it your intention to not send MULTIFD_FLAG_SYNC above? > In all cases, I still think that's not the right way to do it. * It makes little difference; MULTIFD_FLAG_SYNC is also used to increase the 'multifd_recv_state->sem_sync' semaphore on the destination side, which then unblocks the 'main' thread waiting on it. === diff --git a/migration/migration.c b/migration/migration.c index 65fc4f5eed..d8c4ea0ad1 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3402,7 +3402,7 @@ static MigIterateState migration_iteration_run(MigrationState *s) && can_switchover && qatomic_read(&s->start_postcopy)) { if (migrate_multifd()) { multifd_send_flush(); - multifd_send_sync_main(MULTIFD_SYNC_LOCAL); + multifd_send_sync_main(MULTIFD_SYNC_ALL); qemu_savevm_send_multifd_recv_sync(s->to_dst_file); multifd_send_shutdown(); } diff --git a/migration/multifd.c b/migration/multifd.c index 8928ca2611..2b5bc2d478 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -1265,7 +1265,7 @@ static void *multifd_recv_thread(void *opaque) rcu_unregister_thread(); trace_multifd_recv_thread_end(p->id, p->packets_recved); - qemu_sem_post(&multifd_recv_state->sem_sync); +// qemu_sem_post(&multifd_recv_state->sem_sync); return NULL; } === host-1] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 159.46s 79 subtests passed host-2] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 164.55s 79 subtests passed === * I tried the above patch and it also works the same. I'll use this, no issues. Thank you. --- - Prasad ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-05 11:21 ` Prasad Pandit @ 2025-03-05 12:54 ` Peter Xu 2025-03-07 11:45 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Peter Xu @ 2025-03-05 12:54 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Wed, Mar 05, 2025 at 04:51:00PM +0530, Prasad Pandit wrote: > Hi, > > On Tue, 4 Mar 2025 at 20:05, Peter Xu <peterx@redhat.com> wrote: > > I think we need the header; the ram code is a module. > > Do something similar to qemu_savevm_state_complete_precopy_iterable(), but do > > whatever a vmstate handler wants, so it'll be with a header. > > * I don't fully see yet how this shall work. Another option is to add another event for precopy_notifier_list, say PRECOPY_NOTIFY_SWITCH_POSTCOPY; then make RAM register to it, send the header itself, and do the flush and sync. Let me know if you want me to write the patches. That's almost the only thing left to make it clearer.. > > > Please consider adding details like "we need message AAA on BBB channel to > > serialize with CCC". Not asking that as required to merge, but > > my understanding is that that's what is missing and that's why none of the > > versions so far can make sure of it in code. Maybe that'll help you to understand > > how that was serialized. > > * Okay, will try. > > > > > MULTIFD_SYNC_LOCAL will not invoke MULTIFD_FLAG_SYNC, which we need. > ... > > So is it your intention to not send MULTIFD_FLAG_SYNC above? > > In all cases, I still think that's not the right way to do it. > > * It makes little difference; MULTIFD_FLAG_SYNC is also used to > increase the 'multifd_recv_state->sem_sync' semaphore on the destination > side, which then unblocks the 'main' thread waiting on it. AFAIU that's the whole difference and the whole point of doing so.. > === > diff --git a/migration/migration.c b/migration/migration.c > index 65fc4f5eed..d8c4ea0ad1 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -3402,7 +3402,7 @@ static MigIterateState > migration_iteration_run(MigrationState *s) > && can_switchover && qatomic_read(&s->start_postcopy)) { > if (migrate_multifd()) { > multifd_send_flush(); > - multifd_send_sync_main(MULTIFD_SYNC_LOCAL); > + multifd_send_sync_main(MULTIFD_SYNC_ALL); > qemu_savevm_send_multifd_recv_sync(s->to_dst_file); > multifd_send_shutdown(); > } > diff --git a/migration/multifd.c b/migration/multifd.c > index 8928ca2611..2b5bc2d478 100644 > --- a/migration/multifd.c > +++ b/migration/multifd.c > @@ -1265,7 +1265,7 @@ static void *multifd_recv_thread(void *opaque) > > rcu_unregister_thread(); > trace_multifd_recv_thread_end(p->id, p->packets_recved); > - qemu_sem_post(&multifd_recv_state->sem_sync); > +// qemu_sem_post(&multifd_recv_state->sem_sync); > > return NULL; > } > === > host-1] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 159.46s 79 subtests passed > host-2] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 164.55s 79 subtests passed > === > > * I tried the above patch and it also works the same. I'll use this, no issues. We can't introduce a migration global cmd just to work the same as RAM_SAVE_FLAG_MULTIFD_FLUSH, which is a sub-cmd for a module. > > Thank you. > --- > - Prasad > -- Peter Xu ^ permalink raw reply [flat|nested] 37+ messages in thread
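If the notifier route were taken, the shape would be roughly the following. This is a hypothetical sketch: PRECOPY_NOTIFY_SWITCH_POSTCOPY is the assumed new event, and how the QEMUFile reaches the callback is also an assumption (the existing precopy_add_notifier()/PrecopyNotifyData plumbing in migration/ram.c carries only a reason code):

===
/* Hypothetical sketch of the alternative suggested above. */
static int ram_switch_to_postcopy_notify(NotifierWithReturn *n,
                                         void *data, Error **errp)
{
    PrecopyNotifyData *pnd = data;

    if (pnd->reason != PRECOPY_NOTIFY_SWITCH_POSTCOPY) { /* assumed event */
        return 0;
    }
    /* RAM would send its own section header here, then flush and sync
     * the multifd channels, avoiding any new global migration command. */
    return multifd_ram_flush_and_sync(pnd->f /* assumed member */);
}
===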
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-05 12:54 ` Peter Xu @ 2025-03-07 11:45 ` Prasad Pandit 2025-03-07 22:48 ` Peter Xu 2025-03-07 22:51 ` Peter Xu 0 siblings, 2 replies; 37+ messages in thread From: Prasad Pandit @ 2025-03-07 11:45 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hello Peter, On Wed, 5 Mar 2025 at 18:24, Peter Xu <peterx@redhat.com> wrote: > > On Tue, 4 Mar 2025 at 20:05, Peter Xu <peterx@redhat.com> wrote: > > > I think we need the header; the ram code is a module. > > > Do something similar to qemu_savevm_state_complete_precopy_iterable(), but do > > > whatever a vmstate handler wants, so it'll be with a header. > > === diff --git a/migration/migration.c b/migration/migration.c index 65fc4f5eed..da2c49c303 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -3401,9 +3401,10 @@ static MigIterateState migration_iteration_run(MigrationState *s) if (!in_postcopy && must_precopy <= s->threshold_size && can_switchover && qatomic_read(&s->start_postcopy)) { if (migrate_multifd()) { - multifd_send_flush(); - multifd_send_sync_main(MULTIFD_SYNC_LOCAL); - qemu_savevm_send_multifd_recv_sync(s->to_dst_file); +/* multifd_send_flush(); + * multifd_send_sync_main(MULTIFD_SYNC_ALL); + * qemu_savevm_send_multifd_recv_sync(s->to_dst_file); + */ + qemu_savevm_state_complete_multifd(s->to_dst_file); multifd_send_shutdown(); } if (postcopy_start(s, &local_err)) { diff --git a/migration/savevm.c b/migration/savevm.c index 0b71e988ba..c2b181b0cc 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1602,6 +1602,37 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only) return qemu_fflush(f); } +int qemu_savevm_state_complete_multifd(QEMUFile *f) +{ + int ret; + SaveStateEntry *se; + int64_t start_ts_each, end_ts_each; + + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + if (strcmp(se->idstr, "ram")) { + continue; + } + + start_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); + trace_savevm_section_start(se->idstr, se->section_id); + + save_section_header(f, se, QEMU_VM_SECTION_END); + + ret = se->ops->save_live_complete_precopy(f, se->opaque); + trace_savevm_section_end(se->idstr, se->section_id, ret); + save_section_footer(f, se); + if (ret < 0) { + qemu_file_set_error(f, ret); + return -1; + } + end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); + trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, + end_ts_each - start_ts_each); + } + + return ret; +} + /* Give an estimate of the amount left to be transferred, * the result is split into the amount for units that can and * for units that can't do postcopy.
==== Guest not dirtying RAM: =================== 2025-03-07 10:57:28.740+0000: initiating migration 2025-03-07T10:57:28.823166Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T10:58:04.450758Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T10:58:04.711523Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07 10:58:05.078+0000: shutting down, reason=migrated 2025-03-07 10:59:44.322+0000: initiating migration 2025-03-07T10:59:44.394714Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:00:20.198360Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:00:20.279135Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07 11:00:20.571+0000: shutting down, reason=migrated ==== Guest dirtying RAM: $ dd if=/dev/urandom of=/tmp/random.bin bs=256M count=32 ... ================ 2025-03-07 11:04:00.281+0000: initiating migration 2025-03-07T11:04:00.359426Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:05:51.406988Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:06:56.899920Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07 11:08:02.376+0000: shutting down, reason=migrated 2025-03-07 11:09:19.295+0000: initiating migration 2025-03-07T11:09:19.372012Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:11:24.217610Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07T11:12:35.342709Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called 2025-03-07 11:13:48.947+0000: shutting down, reason=migrated ==== * When a guest is running some workload (dirtying memory), it seems to take a little more time before switching to the Postcopy phase. > Let me know if you want me to write the patches. That's almost the only > thing left to make it clearer.. * If the above patch is not okay, it'll help if you share your version of it. Meanwhile I'll split the patch-2 and re-arrange other patches. Thank you. --- - Prasad ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-07 11:45 ` Prasad Pandit @ 2025-03-07 22:48 ` Peter Xu 2025-03-10 7:36 ` Prasad Pandit 2025-03-13 12:43 ` Prasad Pandit 0 siblings, 2 replies; 37+ messages in thread From: Peter Xu @ 2025-03-07 22:48 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Fri, Mar 07, 2025 at 05:15:03PM +0530, Prasad Pandit wrote: > Hello Peter, > > On Wed, 5 Mar 2025 at 18:24, Peter Xu <peterx@redhat.com> wrote: > > > On Tue, 4 Mar 2025 at 20:05, Peter Xu <peterx@redhat.com> wrote: > > > > I think we need the header; the ram code is a module. > > > > Do something similar to qemu_savevm_state_complete_precopy_iterable(), but do [1] > > > > whatever a vmstate handler wants, so it'll be with a header. > > > > === > diff --git a/migration/migration.c b/migration/migration.c > index 65fc4f5eed..da2c49c303 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -3401,9 +3401,10 @@ static MigIterateState > migration_iteration_run(MigrationState *s) > if (!in_postcopy && must_precopy <= s->threshold_size > && can_switchover && qatomic_read(&s->start_postcopy)) { > if (migrate_multifd()) { > - multifd_send_flush(); > - multifd_send_sync_main(MULTIFD_SYNC_LOCAL); > - qemu_savevm_send_multifd_recv_sync(s->to_dst_file); > +/* multifd_send_flush(); > + * multifd_send_sync_main(MULTIFD_SYNC_ALL); > + * qemu_savevm_send_multifd_recv_sync(s->to_dst_file); > + */ + qemu_savevm_state_complete_multifd(s->to_dst_file); > multifd_send_shutdown(); > } Please move all of them to the entry of postcopy_start(). > if (postcopy_start(s, &local_err)) { > diff --git a/migration/savevm.c b/migration/savevm.c > index 0b71e988ba..c2b181b0cc 100644 > --- a/migration/savevm.c > +++ b/migration/savevm.c > @@ -1602,6 +1602,37 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only) > return qemu_fflush(f); > } > > +int qemu_savevm_state_complete_multifd(QEMUFile *f) I still like the name I provided because it's more generic, above [1]. > +{ > + int ret; > + SaveStateEntry *se; > + int64_t start_ts_each, end_ts_each; > + > + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { > + if (strcmp(se->idstr, "ram")) { > + continue; > + } > + > + start_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); > + trace_savevm_section_start(se->idstr, se->section_id); > + > + save_section_header(f, se, QEMU_VM_SECTION_END); Maybe it should be SECTION_PART. So we provide all iterators a chance to do something right before switching to postcopy but before the VM stops. RAM can register. > + > + ret = se->ops->save_live_complete_precopy(f, se->opaque); Maybe we don't want to run the whole iteration but only flush. I think for simplicity we can make a new hook. > + trace_savevm_section_end(se->idstr, se->section_id, ret); > + save_section_footer(f, se); > + if (ret < 0) { > + qemu_file_set_error(f, ret); > + return -1; > + } > + end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); > + trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, > + end_ts_each - start_ts_each); We do not need to account the downtime here. > + } > + > + return ret; > +} > + > /* Give an estimate of the amount left to be transferred, > * the result is split into the amount for units that can and > * for units that can't do postcopy.
> diff --git a/migration/savevm.h b/migration/savevm.h > index 91ae703925..e3789984a1 100644 > --- a/migration/savevm.h > +++ b/migration/savevm.h > @@ -70,5 +70,6 @@ int qemu_load_device_state(QEMUFile *f); > int qemu_loadvm_approve_switchover(void); > int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, > bool in_postcopy); > +int qemu_savevm_state_complete_multifd(QEMUFile *f); > > #endif > ==== > > 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 164.02s 79 subtests passed > 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test > OK 164.99s 79 subtests passed > ==== > > * Does the above patch look okay? > > ==== > Guest not dirtying RAM: > =================== > 2025-03-07 10:57:28.740+0000: initiating migration > 2025-03-07T10:57:28.823166Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T10:58:04.450758Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T10:58:04.711523Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07 10:58:05.078+0000: shutting down, reason=migrated > > 2025-03-07 10:59:44.322+0000: initiating migration > 2025-03-07T10:59:44.394714Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:00:20.198360Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:00:20.279135Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07 11:00:20.571+0000: shutting down, reason=migrated > ==== > > Guest dirtying RAM: $ dd if=/dev/urandom of=/tmp/random.bin bs=256M count=32 ... > ================ > 2025-03-07 11:04:00.281+0000: initiating migration > 2025-03-07T11:04:00.359426Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:05:51.406988Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:06:56.899920Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07 11:08:02.376+0000: shutting down, reason=migrated > > 2025-03-07 11:09:19.295+0000: initiating migration > 2025-03-07T11:09:19.372012Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:11:24.217610Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07T11:12:35.342709Z qemu-system-x86_64: info: > multifd_ram_flush_and_sync: called > 2025-03-07 11:13:48.947+0000: shutting down, reason=migrated > ==== > > * When a guest is running some workload (dirtying memory), it seems to > take a little more time before switching to the Postcopy phase. > > > Let me know if you want me to write the patches. That's almost the only > > thing left to make it clearer.. > > * If the above patch is not okay, it'll help if you share your > version of it. Meanwhile I'll split the patch-2 and re-arrange other > patches. Please see above, if that doesn't help you come up with a final version, I'll write it. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-07 22:48 ` Peter Xu @ 2025-03-10 7:36 ` Prasad Pandit 0 siblings, 0 replies; 37+ messages in thread From: Prasad Pandit @ 2025-03-10 7:36 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hi, On Sat, 8 Mar 2025 at 04:18, Peter Xu <peterx@redhat.com> wrote: > Please move all of them to the entry of postcopy_start(). * Okay. > [1] > > > +int qemu_savevm_state_complete_multifd(QEMUFile *f) > I still like the name I provided because it's more generic, above [1]. === > Maybe the easiest as of now is one more hook like > qemu_savevm_state_complete_precopy_prepare(), === * ie. use the 'qemu_savevm_state_complete_precopy_prepare' name? Should it be _postcopy_prepare? >> + if (strcmp(se->idstr, "ram")) { >> + continue; >> + } >> + save_section_header(f, se, QEMU_VM_SECTION_END); > > Maybe it should be SECTION_PART. * Will try with SECTION_PART. > So we provide all iterators a chance to > do something right before switching to postcopy but before the VM stops. RAM > can register. * ...all iterators a chance to do something? The above function only processes "ram" entries -> strcmp(se->idstr, "ram"). * ...RAM can register? > > + > > + ret = se->ops->save_live_complete_precopy(f, se->opaque); > > Maybe we don't want to run the whole iteration but only flush. I think for > simplicity we can make a new hook. * ...new hook? I tried calling multifd_ram_flush_and_sync() in place of ->save_live_complete_precopy(), but it crashes QEMU. > > + end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); > > + trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, > > + end_ts_each - start_ts_each); > > We do not need to account the downtime here. > If you do flush and sync, IMHO we can keep the threads there and remove > this shutdown, as long as we are sure it'll be properly shut down during > cleanup. With the assertion in dest threads, I think it should be OK. * Okay. Thank you. --- - Prasad ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-07 22:48 ` Peter Xu 2025-03-10 7:36 ` Prasad Pandit @ 2025-03-13 12:43 ` Prasad Pandit 2025-03-13 20:08 ` Peter Xu 1 sibling, 1 reply; 37+ messages in thread From: Prasad Pandit @ 2025-03-13 12:43 UTC (permalink / raw) To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit Hello Peter, On Sat, 8 Mar 2025 at 04:18, Peter Xu <peterx@redhat.com> wrote: > [1] > Please move all of them to the entry of postcopy_start(). > I still like the name I provided because it's more generic, above [1]. > Maybe it should be SECTION_PART. So we provide all iterators a chance to > do something right before switching to postcopy but before the VM stops. RAM > can register. > Maybe we don't want to run the whole iteration but only flush. I think for > simplicity we can make a new hook. > We do not need to account the downtime here. > Please see above, if that doesn't help you come up with a final version, > I'll write it. === diff --git a/migration/migration.c b/migration/migration.c index f97bb2777f..9ef8e8ee1d 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2700,6 +2700,10 @@ static int postcopy_start(MigrationState *ms, Error **errp) QIOChannelBuffer *bioc; QEMUFile *fb; + if (migrate_multifd()) { + qemu_savevm_state_postcopy_prepare(ms->to_dst_file); + } + /* * Now we're 100% sure to switch to postcopy, so JSON writer won't be * useful anymore. Free the resources early if it is there. Clearing diff --git a/migration/ram.c b/migration/ram.c index 6fd88cbf2a..13a8c63660 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -1143,6 +1143,15 @@ static int save_zero_page(RAMState *rs, PageSearchStatus *pss, return len; } +int ram_save_zero_page(QEMUFile *f, void *opaque) +{ + RAMState *rs = *(RAMState **)opaque; + PageSearchStatus *pss = &rs->pss[RAM_CHANNEL_PRECOPY]; + ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; + + return save_zero_page(rs, pss, offset); +} + /* * @pages: the number of pages written by the control path, * < 0 - error diff --git a/migration/ram.h b/migration/ram.h index 921c39a2c5..425c9ad9fd 100644 --- a/migration/ram.h +++ b/migration/ram.h @@ -122,4 +122,5 @@ void ram_write_tracking_prepare(void); int ram_write_tracking_start(void); void ram_write_tracking_stop(void); +int ram_save_zero_page(QEMUFile *f, void *opaque); #endif diff --git a/migration/savevm.c b/migration/savevm.c index ce158c3512..9bb0f03942 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1676,6 +1676,36 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only) return qemu_fflush(f); } +int qemu_savevm_state_postcopy_prepare(QEMUFile *f) +{ + int ret = 0; + SaveStateEntry *se; + + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + if (strcmp(se->idstr, "ram")) { + continue; + } + + save_section_header(f, se, QEMU_VM_SECTION_PART); + + ram_save_zero_page(f, se->opaque); + if ((ret = multifd_ram_flush_and_sync(f)) < 0) { + return ret; + } + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); + + trace_savevm_section_end(se->idstr, se->section_id, ret); + save_section_footer(f, se); + if (ret < 0) { + qemu_file_set_error(f, ret); + return -1; + } + } + + return ret; +} + + /* Give an estimate of the amount left to be transferred, * the result is split into the amount for units that can and * for units that can't do postcopy.
diff --git a/migration/savevm.h b/migration/savevm.h index 138c39a7f9..7d4ff25ca3 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -74,4 +74,5 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, char *buf, size_t len, Error **errp); +int qemu_savevm_state_postcopy_prepare(QEMUFile *f); #endif === host-1] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 139.87s 79 subtests passed host-2] 63/63 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 137.52s 79 subtests passed 2025-03-13 12:11:01.653+0000: initiating migration 2025-03-13T12:11:01.690247Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called: 1, 0 2025-03-13T12:12:05.342194Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called: 1, 0 2025-03-13 12:12:54.688+0000: shutting down, reason=migrated 2025-03-13 12:21:10.967+0000: initiating migration 2025-03-13T12:21:10.994878Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called: 1, 0 2025-03-13T12:22:15.955266Z qemu-system-x86_64: info: multifd_ram_flush_and_sync: called: 1, 0 2025-03-13 12:23:02.107+0000: shutting down, reason=migrated === * Does this patch look okay? Thank you. --- - Prasad ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command 2025-03-13 12:43 ` Prasad Pandit @ 2025-03-13 20:08 ` Peter Xu 2025-03-17 12:30 ` Prasad Pandit 0 siblings, 1 reply; 37+ messages in thread From: Peter Xu @ 2025-03-13 20:08 UTC (permalink / raw) To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit On Thu, Mar 13, 2025 at 06:13:24PM +0530, Prasad Pandit wrote: > +int qemu_savevm_state_postcopy_prepare(QEMUFile *f) > +{ > + int ret = 0; > + SaveStateEntry *se; > + > + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { > + if (strcmp(se->idstr, "ram")) { > + continue; > + } > + > + save_section_header(f, se, QEMU_VM_SECTION_PART); > + > + ram_save_zero_page(f, se->opaque); I'll stop requesting a why here... but I think this is another example that even if all the tests pass it may not be correct. > + if ((ret = multifd_ram_flush_and_sync(f)) < 0) { > + return ret; > + } > + qemu_put_be64(f, RAM_SAVE_FLAG_EOS); > + > + trace_savevm_section_end(se->idstr, se->section_id, ret); > + save_section_footer(f, se); > + if (ret < 0) { > + qemu_file_set_error(f, ret); > + return -1; > + } > + } > + > + return ret; > +} [...] > * Does this patch look okay? I've written the relevant patches below; please review and take them if you agree with the changes. Thanks, ===8<=== From f9343dfc777ef04168443e86a1fa3922296ea563 Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx@redhat.com> Date: Thu, 13 Mar 2025 15:34:10 -0400 Subject: [PATCH 1/2] migration: Add save_postcopy_prepare() savevm handler Add a savevm handler for a module to opt in to sending extra sections right before postcopy starts, and before the VM is stopped. RAM will start to use this new savevm handler in the next patch to do flush and sync for multifd pages. Note that we choose to do it before the VM is stopped because the only potential user so far is not sensitive to VM status, and doing it before the VM is stopped avoids enlarging the postcopy downtime. It is still a bit unfortunate that we need to introduce such a new savevm handler just for the only use case; however, it's so far the cleanest. Signed-off-by: Peter Xu <peterx@redhat.com> --- include/migration/register.h | 15 +++++++++++++++ migration/savevm.h | 1 + migration/migration.c | 4 ++++ migration/savevm.c | 33 +++++++++++++++++++++++++++++++++ 4 files changed, 53 insertions(+) diff --git a/include/migration/register.h b/include/migration/register.h index c041ce32f2..b79dc81b8d 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -189,6 +189,21 @@ typedef struct SaveVMHandlers { /* This runs outside the BQL! */ + /** + * @save_postcopy_prepare + * + * This hook will be invoked on the source side right before switching + * to postcopy (before the VM is stopped). + * + * @f: QEMUFile where to send the data + * @opaque: Data pointer passed to register_savevm_live() + * @errp: Error** used to report error message + * + * Returns: true if succeeded, false if error occurred. When false is + * returned, @errp must be set.
+ */ + bool (*save_postcopy_prepare)(QEMUFile *f, void *opaque, Error **errp); + /** * @state_pending_estimate * diff --git a/migration/savevm.h b/migration/savevm.h index 138c39a7f9..2d5e9c7166 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -45,6 +45,7 @@ void qemu_savevm_state_pending_exact(uint64_t *must_precopy, void qemu_savevm_state_pending_estimate(uint64_t *must_precopy, uint64_t *can_postcopy); int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy); +bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp); void qemu_savevm_send_ping(QEMUFile *f, uint32_t value); void qemu_savevm_send_open_return_path(QEMUFile *f); int qemu_savevm_send_packaged(QEMUFile *f, const uint8_t *buf, size_t len); diff --git a/migration/migration.c b/migration/migration.c index d46e776e24..212f6b4145 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2707,6 +2707,10 @@ static int postcopy_start(MigrationState *ms, Error **errp) } } + if (!qemu_savevm_state_postcopy_prepare(ms->to_dst_file, errp)) { + return -1; + } + trace_postcopy_start(); bql_lock(); trace_postcopy_start_set_run(); diff --git a/migration/savevm.c b/migration/savevm.c index ce158c3512..23ef4c7dc9 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1523,6 +1523,39 @@ void qemu_savevm_state_complete_postcopy(QEMUFile *f) qemu_fflush(f); } +bool qemu_savevm_state_postcopy_prepare(QEMUFile *f, Error **errp) +{ + SaveStateEntry *se; + bool ret; + + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + if (!se->ops || !se->ops->save_postcopy_prepare) { + continue; + } + + if (se->ops->is_active) { + if (!se->ops->is_active(se->opaque)) { + continue; + } + } + + trace_savevm_section_start(se->idstr, se->section_id); + + save_section_header(f, se, QEMU_VM_SECTION_PART); + ret = se->ops->save_postcopy_prepare(f, se->opaque, errp); + save_section_footer(f, se); + + trace_savevm_section_end(se->idstr, se->section_id, ret); + + if (!ret) { + assert(*errp); + return false; + } + } + + return true; +} + int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) { int64_t start_ts_each, end_ts_each; -- 2.47.0 From 299e1cdd9b28802f361ed012673825685e30f965 Mon Sep 17 00:00:00 2001 From: Peter Xu <peterx@redhat.com> Date: Thu, 13 Mar 2025 15:56:01 -0400 Subject: [PATCH 2/2] migration/ram: Implement save_postcopy_prepare() Implement save_postcopy_prepare(), preparing for the enablement of both multifd and postcopy. Please see the rich comment for the rationals. Signed-off-by: Peter Xu <peterx@redhat.com> --- migration/ram.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+) diff --git a/migration/ram.c b/migration/ram.c index 424df6d9f1..119e7d3ac2 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -4420,6 +4420,42 @@ static int ram_resume_prepare(MigrationState *s, void *opaque) return 0; } +static bool ram_save_postcopy_prepare(QEMUFile *f, void *opaque, Error **errp) +{ + int ret; + + if (migrate_multifd()) { + /* + * When multifd is enabled, source QEMU needs to make sure all the + * pages queued before postcopy starts to be flushed. + * + * Meanwhile, the load of these pages must happen before switching + * to postcopy. It's because loading of guest pages (so far) in + * multifd recv threads is still non-atomic, so the load cannot + * happen with vCPUs running on destination side. + * + * This flush and sync will guarantee those pages loaded _before_ + * postcopy starts on destination. 
+         * The rationale is that this happens before the VM stops (and
+         * before source QEMU sends all the rest of the postcopy
+         * messages).  So when the destination QEMU receives the postcopy
+         * messages, it must already have received the sync message on
+         * the main channel (either RAM_SAVE_FLAG_MULTIFD_FLUSH, or
+         * RAM_SAVE_FLAG_EOS), and such a message guarantees that all
+         * previous guest pages queued in the multifd channels have been
+         * completely loaded.
+         */
+        ret = multifd_ram_flush_and_sync(f);
+        if (ret < 0) {
+            error_setg(errp, "%s: multifd flush and sync failed", __func__);
+            return false;
+        }
+    }
+
+    qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
+
+    return true;
+}
+
 void postcopy_preempt_shutdown_file(MigrationState *s)
 {
     qemu_put_be64(s->postcopy_qemufile_src, RAM_SAVE_FLAG_EOS);
@@ -4439,6 +4475,7 @@ static SaveVMHandlers savevm_ram_handlers = {
     .load_setup = ram_load_setup,
     .load_cleanup = ram_load_cleanup,
     .resume_prepare = ram_resume_prepare,
+    .save_postcopy_prepare = ram_save_postcopy_prepare,
 };

 static void ram_mig_ram_block_resized(RAMBlockNotifier *n, void *host,
--
2.47.0

--
Peter Xu

^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-13 20:08                   ` Peter Xu
@ 2025-03-17 12:30                     ` Prasad Pandit
  2025-03-17 15:00                       ` Peter Xu
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-17 12:30 UTC (permalink / raw)
  To: Peter Xu; +Cc: qemu-devel, farosas, berrange, Prasad Pandit

Hi,

On Fri, 14 Mar 2025 at 01:40, Peter Xu <peterx@redhat.com> wrote:
> > +        save_section_header(f, se, QEMU_VM_SECTION_PART);
> > +        ram_save_zero_page(f, se->opaque);
> I'll stop requesting a why here...

* Earlier in this thread you mentioned 'We need a header'. I took it
as a 'RAM page' header, not save_section_header(). Section type
(QEMU_VM_COMMAND) was sent by qemu_savevm_command_send() as well.

> but I think this is another example that even if all the tests pass
> it may not be correct.

* This is also an example of - communication is hard.

> From f9343dfc777ef04168443e86a1fa3922296ea563 Mon Sep 17 00:00:00 2001
> From: Peter Xu <peterx@redhat.com>
> Date: Thu, 13 Mar 2025 15:34:10 -0400
> Subject: [PATCH 1/2] migration: Add save_postcopy_prepare() savevm handler
[...]
> Subject: [PATCH 2/2] migration/ram: Implement save_postcopy_prepare()
[...]

* I get the infrastructural changes: they'll help to take
'section'-specific action before postcopy starts. It's not clear how
tying the flush and sync to a RAM section helps, though; because on the
destination side the 'section' is only used to call
se->ops->load_state() -> ram_load -> ram_load_precopy() ->
multifd_recv_sync_main().

* To confirm: is the benefit of this approach that the 'flush and sync'
works via the vmstate_load -> se->ops->load_state() -> ram_load ->
ram_load_precopy() sequence?

* Thank you for the patches, I'll send a revised patchset including
them. Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
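The destination-side ordering implied by that call chain can be pictured
with a small toy model (not QEMU code; the *_stub helpers are
placeholders and the flag values are only illustrative). The main channel
is a strictly ordered stream, so a flush/sync marker handled by
ram_load_precopy() is always processed before any later postcopy message
can even be read:

===
/* toy_flush_sync.c -- toy model of the destination main-channel loop;
 * every *_stub name is a placeholder, not QEMU API. */
#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

#define RAM_SAVE_FLAG_EOS           0x10    /* illustrative values */
#define RAM_SAVE_FLAG_MULTIFD_FLUSH 0x200

static uint64_t read_be64_stub(void)
{
    /* scripted stand-in for reading the (addr | flags) words */
    static const uint64_t script[] = {
        0x1000, 0x2000, RAM_SAVE_FLAG_MULTIFD_FLUSH, RAM_SAVE_FLAG_EOS,
    };
    static unsigned i;
    return script[i++ % 4];
}

static void load_page_stub(uint64_t addr)
{
    printf("load page at 0x%" PRIx64 "\n", addr);
}

static void multifd_recv_sync_main_stub(void)
{
    /* blocks until every multifd recv thread has drained the pages
     * queued before the source-side sync */
    printf("sync: all multifd recv threads drained\n");
}

int main(void)
{
    for (;;) {
        uint64_t word = read_be64_stub();
        if (word & RAM_SAVE_FLAG_MULTIFD_FLUSH) {
            multifd_recv_sync_main_stub();
            continue;
        }
        if (word & RAM_SAVE_FLAG_EOS) {
            break;  /* section done; postcopy messages come after this */
        }
        load_page_stub(word);
    }
    return 0;
}
===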
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-17 12:30                     ` Prasad Pandit
@ 2025-03-17 15:00                       ` Peter Xu
  0 siblings, 0 replies; 37+ messages in thread
From: Peter Xu @ 2025-03-17 15:00 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit

On Mon, Mar 17, 2025 at 06:00:14PM +0530, Prasad Pandit wrote:
> Hi,
>
> On Fri, 14 Mar 2025 at 01:40, Peter Xu <peterx@redhat.com> wrote:
> > > +        save_section_header(f, se, QEMU_VM_SECTION_PART);
> > > +        ram_save_zero_page(f, se->opaque);
> > I'll stop requesting a why here...
>
> * Earlier in this thread you mentioned 'We need a header'. I took it
> as a 'RAM page' header, not save_section_header(). Section type

The question is not about the header, it's about the zero page, and why
we send this zero page out of the blue. IIUC it can corrupt a page if it
reuses whatever is cached in the pss from the previous round, and the
page used to be non-zero.

> (QEMU_VM_COMMAND) was sent by qemu_savevm_command_send() as well.

But that should be for generic VM operations. We have a few outliers but
they're too special. For this case we'd better not add more outliers,
because it is neither special nor necessary.

Thanks,

--
Peter Xu

^ permalink raw reply [flat|nested] 37+ messages in thread
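The hazard described here -- stale iterator state feeding an
unconditional zero-page send -- reduces to a few lines of toy C (a
deliberately minimal model, not QEMU's actual pss/RAMBlock code):

===
/* toy model: a cached page index from the previous round is consumed
 * by a blind "this page is zero" send, clearing a non-zero page. */
#include <stdio.h>
#include <string.h>

#define PAGES     4
#define PAGE_SIZE 8

struct pss { int page; };   /* stand-in for cached search state */

/* claims pss->page is zero without verifying the page content */
static void save_zero_page_blind(struct pss *pss,
                                 char dst[PAGES][PAGE_SIZE])
{
    memset(dst[pss->page], 0, PAGE_SIZE);  /* what the receiver does */
}

int main(void)
{
    char src[PAGES][PAGE_SIZE] = { "", "", "data", "" };
    char dst[PAGES][PAGE_SIZE];
    struct pss pss = { .page = 2 };        /* leftover from last round */

    memcpy(dst, src, sizeof src);          /* precopy already copied it */
    save_zero_page_blind(&pss, dst);       /* but page 2 was NOT zero */

    printf("dst page 2: '%s' (source has 'data')\n", dst[2]);
    return 0;
}
===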
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-07 11:45             ` Prasad Pandit
  2025-03-07 22:48               ` Peter Xu
@ 2025-03-07 22:51               ` Peter Xu
  2025-03-10 14:38                 ` Fabiano Rosas
  1 sibling, 1 reply; 37+ messages in thread
From: Peter Xu @ 2025-03-07 22:51 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: qemu-devel, farosas, berrange, Prasad Pandit

On Fri, Mar 07, 2025 at 05:15:03PM +0530, Prasad Pandit wrote:
> diff --git a/migration/migration.c b/migration/migration.c
> index 65fc4f5eed..da2c49c303 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3401,9 +3401,10 @@ static MigIterateState
> migration_iteration_run(MigrationState *s)
>      if (!in_postcopy && must_precopy <= s->threshold_size
>          && can_switchover && qatomic_read(&s->start_postcopy)) {
>          if (migrate_multifd()) {
> -            multifd_send_flush();
> -            multifd_send_sync_main(MULTIFD_SYNC_LOCAL);
> -            qemu_savevm_send_multifd_recv_sync(s->to_dst_file);
> +/*          multifd_send_flush();
> + *          multifd_send_sync_main(MULTIFD_SYNC_ALL);
> + *          qemu_savevm_send_multifd_recv_sync(s->to_dst_file);
> + */
> +            qemu_savevm_state_complete_multifd(s->to_dst_file);
>              multifd_send_shutdown();

Forgot to mention one thing:

If you do the flush and sync, IMHO we can keep the threads there and
remove this shutdown, as long as we are sure they will be properly shut
down during cleanup.

With the assertion in the dest threads, I think it should be OK.

--
Peter Xu

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-07 22:51               ` Peter Xu
@ 2025-03-10 14:38                 ` Fabiano Rosas
  2025-03-10 17:08                   ` Prasad Pandit
  0 siblings, 1 reply; 37+ messages in thread
From: Fabiano Rosas @ 2025-03-10 14:38 UTC (permalink / raw)
  To: Peter Xu, Prasad Pandit; +Cc: qemu-devel, berrange, Prasad Pandit

Peter Xu <peterx@redhat.com> writes:

> On Fri, Mar 07, 2025 at 05:15:03PM +0530, Prasad Pandit wrote:
>> diff --git a/migration/migration.c b/migration/migration.c
[...]
>
> Forgot to mention one thing:
>
> If you do the flush and sync, IMHO we can keep the threads there and
> remove this shutdown, as long as we are sure they will be properly shut
> down during cleanup.
>
> With the assertion in the dest threads, I think it should be OK.

Good point. Shutdown at random places makes it difficult to protect
against cleanup races.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-10 14:38                 ` Fabiano Rosas
@ 2025-03-10 17:08                   ` Prasad Pandit
  2025-03-10 19:58                     ` Fabiano Rosas
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-10 17:08 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

On Mon, 10 Mar 2025 at 20:09, Fabiano Rosas <farosas@suse.de> wrote:
> Good point. Shutdown at random places makes it difficult to protect
> against cleanup races.

* Just to understand, how and where do these races occur?

Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-10 17:08                   ` Prasad Pandit
@ 2025-03-10 19:58                     ` Fabiano Rosas
  2025-03-11 10:01                       ` Prasad Pandit
  0 siblings, 1 reply; 37+ messages in thread
From: Fabiano Rosas @ 2025-03-10 19:58 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> On Mon, 10 Mar 2025 at 20:09, Fabiano Rosas <farosas@suse.de> wrote:
>> Good point. Shutdown at random places makes it difficult to protect
>> against cleanup races.
>
> * Just to understand, how and where do these races occur?
>

They occur when cleanup code is allowed to run when resources have not
yet been allocated, or while the resources are still being accessed.

Having the shutdown routine at a single point, when it's clear everything
else is ready for shutdown, helps not only to avoid those issues but also
to investigate them. Otherwise you'll have the same code running at
(potentially) completely different points in time, and at one of those
times the system might be in an unexpected state.

^ permalink raw reply [flat|nested] 37+ messages in thread
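This class of bug fits in a dozen lines of C. A minimal sketch, using
plain pthreads rather than anything QEMU-specific: an extra, early
shutdown path frees a buffer the worker thread is still writing, while a
single late shutdown point that first quiesces the thread cannot race:

===
/* toy model of a cleanup race: a worker still using a resource while a
 * second, early shutdown path frees it. */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct channel { int *buf; };

static void *worker(void *opaque)
{
    struct channel *ch = opaque;
    for (int i = 0; i < 1000; i++) {
        ch->buf[i % 16] = i;    /* use-after-free if cleanup ran early */
    }
    return NULL;
}

int main(void)
{
    pthread_t t;
    struct channel ch = { .buf = calloc(16, sizeof(int)) };

    pthread_create(&t, NULL, worker, &ch);

    /* WRONG: an extra shutdown path that may run while the worker is
     * still active:
     *     free(ch.buf);        // races with the worker's stores
     *
     * RIGHT: one shutdown point that first quiesces the thread: */
    pthread_join(t, NULL);      /* everything is ready for shutdown */
    free(ch.buf);
    puts("clean shutdown");
    return 0;
}
===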
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-10 19:58                     ` Fabiano Rosas
@ 2025-03-11 10:01                       ` Prasad Pandit
  2025-03-11 12:44                         ` Fabiano Rosas
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-11 10:01 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

Hi,

On Tue, 11 Mar 2025 at 01:28, Fabiano Rosas <farosas@suse.de> wrote:
> They occur when cleanup code is allowed to run when resources have not
> yet been allocated, or while the resources are still being accessed.
>
> Having the shutdown routine at a single point, when it's clear everything
> else is ready for shutdown, helps not only to avoid those issues but also
> to investigate them. Otherwise you'll have the same code running at
> (potentially) completely different points in time, and at one of those
> times the system might be in an unexpected state.

* I see. It's not clear when this would happen in a production deployment.
===
    if (migrate_multifd()) {
        multifd_send_shutdown();    <= [1]
    }

    postcopy_start(...)             <= [2]
===

* There was another scenario regarding multifd shutdown: the EOF or
shutdown message sent via [1] on each multifd connection reaches the
destination _later_ than the start of the Postcopy phase via [2]. And
this may result in multifd_recv threads on the destination overwriting
memory pages written by the postcopy thread, thus corrupting the guest
state.

* Do we have any bugs/issues where these above things happened? To see
the real circumstances under which it happened?

* There seems to be some disconnect between the kind of scenarios we
are considering and the minimal requirements for live migration: a
stable network with really good bandwidth. If we test live migration
with the guest dirtying RAM to the tune of 64M/128M/256M/512M bytes,
that assumes/implies that the network bandwidth is much more than
512Mbps, no?

Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command
  2025-03-11 10:01                       ` Prasad Pandit
@ 2025-03-11 12:44                         ` Fabiano Rosas
  0 siblings, 0 replies; 37+ messages in thread
From: Fabiano Rosas @ 2025-03-11 12:44 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Tue, 11 Mar 2025 at 01:28, Fabiano Rosas <farosas@suse.de> wrote:
>> They occur when cleanup code is allowed to run when resources have not
>> yet been allocated, or while the resources are still being accessed.
>>
>> Having the shutdown routine at a single point, when it's clear everything
>> else is ready for shutdown, helps not only to avoid those issues but also
>> to investigate them. Otherwise you'll have the same code running at
>> (potentially) completely different points in time, and at one of those
>> times the system might be in an unexpected state.
>
> * I see. It's not clear when this would happen in a production deployment.
> ===
>     if (migrate_multifd()) {
>         multifd_send_shutdown();    <= [1]
>     }
>
>     postcopy_start(...)             <= [2]
> ===
>
> * There was another scenario regarding multifd shutdown: the EOF or
> shutdown message sent via [1] on each multifd connection reaches the
> destination _later_ than the start of the Postcopy phase via [2]. And
> this may result in multifd_recv threads on the destination overwriting
> memory pages written by the postcopy thread, thus corrupting the guest
> state.

Isn't that the point? To add a sync for this, which would allow the
shutdown to not be added?

>
> * Do we have any bugs/issues where these above things happened? To see
> the real circumstances under which it happened?
>

We do. They don't come with a description of the circumstances. You're
lucky if you get a coredump. You can peruse `git log migration/multifd`;
I'd say most of the work in recent years has been solving concurrency
issues.

> * There seems to be some disconnect between the kind of scenarios we
> are considering and the minimal requirements for live migration: a
> stable network with really good bandwidth.

There's no such requirement. Besides, the topic is not failed migrations
due to lack of resources. We're talking about correctness issues that
are hard to spot. Those should always be fixed when found, independently
of what the production environment is expected to be.

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-02-28 12:17 [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Prasad Pandit
                   ` (4 preceding siblings ...)
  2025-02-28 12:17 ` [PATCH v7 5/5] migration: add MULTIFD_RECV_SYNC migration command Prasad Pandit
@ 2025-02-28 14:53 ` Fabiano Rosas
  2025-03-03 10:47   ` Prasad Pandit
  5 siblings, 1 reply; 37+ messages in thread
From: Fabiano Rosas @ 2025-02-28 14:53 UTC (permalink / raw)
  To: Prasad Pandit, qemu-devel; +Cc: peterx, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> From: Prasad Pandit <pjp@fedoraproject.org>
>
> Hello,
>
> * This series (v7) adds 'MULTIFD_RECV_SYNC' migration command. It is used
> to notify the destination migration thread to synchronise with the Multifd
> threads. This allows Multifd ('mig/dst/recv_x') threads on the destination
> to receive all their data, before they are shutdown.
>
> This series also updates the channel discovery function and qtests as
> suggested in the previous review comments.

You forgot to split patch 2. We cannot have a commit that revamps the
channel discovery logic and enables a new feature at the same time.
Changing the channel discovery affects all the migration use-cases; that
change cannot be smuggled along with the multifd+postcopy enablement.

Similarly, the multifd+postcopy enablement is a new feature that needs
to be tested and reasoned upon in isolation; it cannot bring along a
bunch of previously existing code that was shuffled around. We need to
be able to understand clearly what is done _in preparation for_ the
feature and what is done _as part of_ the feature.

Not to mention bisect and backporting. Many people will be looking at
this code in the future without any knowledge of migration, but as part
of a bisect session or when searching for missing backports in the
distros.

I also suggested moving that logic into channel.c. The code is now
well-contained enough that we don't need to be reading it every time
someone goes over the migration flow; it becomes just a helper function.

> ===
> 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 147.84s   81 subtests passed

I see postcopy/multifd/plain hanging from time to time. Probably due to
the changes in patch 5. Please take a look.

> ===
[...]

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-02-28 14:53 ` [PATCH v7 0/5] Allow to enable multifd and postcopy migration together Fabiano Rosas
@ 2025-03-03 10:47   ` Prasad Pandit
  2025-03-03 14:12     ` Peter Xu
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-03 10:47 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: qemu-devel, peterx, berrange, Prasad Pandit

Hello Fabiano,

On Fri, 28 Feb 2025 at 20:23, Fabiano Rosas <farosas@suse.de> wrote:
> You forgot to split patch 2. We cannot have a commit that revamps the
> channel discovery logic and enables a new feature at the same time.
> Changing the channel discovery affects all the migration use-cases; that
> change cannot be smuggled along with the multifd+postcopy enablement.

===
>> Continue with this patch and fix the stuff I mentioned. You can ignore
>> the first two paragraphs of that reply.
>>
>> https://lore.kernel.org/r/87y0y4tf5q.fsf@suse.de
>>
>> I still think we need to test that preempt + multifd scenario, but it
>> should be easy to write a test for that once the series is in more of a
>> final shape.
===

* I thought that the major overhaul of the channel discovery part, by
moving it to channel.c, was to be done subsequently, no? As we
discussed, we need to touch the channel discovery parts in
'migration_ioc_process_incoming' because we need to identify the
Postcopy channel when its connection comes in. So Patch-2 makes the
minimal changes that are _required_ to enable the two (multifd &
postcopy) features together.

* Whereas moving the channel discovery parts out to channel.c could be
done separately, with its own reasoning: that we are moving it out to
channel.c because it fits better there.

> Similarly, the multifd+postcopy enablement is a new feature that needs
> to be tested and reasoned upon in isolation; it cannot bring along a
> bunch of previously existing code that was shuffled around. We need to
> be able to understand clearly what is done _in preparation for_ the
> feature and what is done _as part of_ the feature.
...
> Not to mention bisect and backporting. Many people will be looking at
> this code in the future without any knowledge of migration, but as part
> of a bisect session or when searching for missing backports in the
> distros.

* I think we (you, me, Peter) are all looking at things differently.
  - In my view Patch-2 is the minimal change _required_ to enable
multifd & postcopy. In your view we are _revamping_ the channel
discovery parts while _sneaking_ in a feature that enables multifd &
postcopy together.
  - In my view Patch-5 in this series is an isolated change, because
it adds a new migration command to allow multifd threads to sync from
the source side. But Peter thinks that without that 'flush and sync'
Patch-2 is incomplete, so we should merge it back there.

* I've also shared my view about not trying to do everything in this
one series. I don't know how we move forward from here. I'm also not
sure what the final _acceptable_ patch-set should look like. I thought
a tested working patch-set is a good start. At this point I think I'll
just follow your lead.

> I also suggested moving that logic into channel.c. The code is now
> well-contained enough that we don't need to be reading it every time
> someone goes over the migration flow; it becomes just a helper
> function.

* How exactly do we want to divide patch-2 to move the channel
discovery parts to channel.c?
===
    if (!migration_has_main_and_multifd_channels()) {
    }
    ...
    } else {
        channel = CH_MAIN;
    }
===

Do we move this entire if/else block from migration_ioc_process_incoming()
to channel.c?

> > ===
> > 67/67 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test OK 147.84s   81 subtests passed
>
> I see postcopy/multifd/plain hanging from time to time. Probably due to
> the changes in patch 5. Please take a look.

* Okay. Does it hang indefinitely or for brief moments?

Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
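For the sake of discussion, here is a rough sketch of what such a helper
in channel.c might look like. This is hypothetical code, not the actual
patch: migration_channel_classify() is an invented name and the
MULTIFD_MAGIC value shown is only illustrative; only the CH_* enum and
migration_has_main_and_multifd_channels() appear in the series itself:

===
#include <stdint.h>
#include <stdbool.h>

#define MULTIFD_MAGIC 0x11223344u   /* illustrative value only */

typedef enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY } MigChannelType;

/* stub: the series tracks this state in migration.c */
static bool migration_has_main_and_multifd_channels(void)
{
    return false;
}

/* hypothetical helper: classify an incoming connection from the first
 * 4 bytes peeked off the wire (peeked, not consumed) */
static MigChannelType migration_channel_classify(uint32_t peeked_magic)
{
    if (!migration_has_main_and_multifd_channels()) {
        /* early connections identify themselves by their magic */
        if (peeked_magic == MULTIFD_MAGIC) {
            return CH_MULTIFD;
        }
        return CH_MAIN;
    }
    /* main and multifd channels are all up; the only connection that
     * can still arrive is the postcopy one, which sends no magic */
    return CH_POSTCOPY;
}
===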
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-03 10:47   ` Prasad Pandit
@ 2025-03-03 14:12     ` Peter Xu
  2025-03-04  9:47       ` Prasad Pandit
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Xu @ 2025-03-03 14:12 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: Fabiano Rosas, qemu-devel, berrange, Prasad Pandit

On Mon, Mar 03, 2025 at 04:17:53PM +0530, Prasad Pandit wrote:
> * I think we (you, me, Peter) are all looking at things differently.
>   - In my view Patch-2 is the minimal change _required_ to enable
> multifd & postcopy. In your view we are _revamping_ the channel
> discovery parts while _sneaking_ in a feature that enables multifd &
> postcopy together.
>   - In my view Patch-5 in this series is an isolated change, because
> it adds a new migration command to allow multifd threads to sync from
> the source side. But Peter thinks that without that 'flush and sync'
> Patch-2 is incomplete, so we should merge it back there.

Just to mention, my suggestion does not conflict with splitting patch 2,
as long as you keep every patch complete on its own.

Patch 5 needs to be squashed into either patch 2 or a patch split out of
patch 2, because the current patch 2 (or, after any possible split into
smaller ones, whichever of them enables the feature) is buggy.

Thanks,

--
Peter Xu

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-03 14:12     ` Peter Xu
@ 2025-03-04  9:47       ` Prasad Pandit
  2025-03-04 14:42         ` Peter Xu
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-04  9:47 UTC (permalink / raw)
  To: Peter Xu; +Cc: Fabiano Rosas, qemu-devel, berrange, Prasad Pandit

Hi,

On Mon, 3 Mar 2025 at 19:42, Peter Xu <peterx@redhat.com> wrote:
> On Mon, Mar 03, 2025 at 04:17:53PM +0530, Prasad Pandit wrote:
> > * I think we (you, me, Peter) are all looking at things differently.
> >   - In my view Patch-2 is the minimal change _required_ to enable
> > multifd & postcopy. In your view we are _revamping_ the channel
> > discovery parts while _sneaking_ in a feature that enables multifd &
> > postcopy together.
> >   - In my view Patch-5 in this series is an isolated change, because
> > it adds a new migration command to allow multifd threads to sync from
> > the source side. But Peter thinks that without that 'flush and sync'
> > Patch-2 is incomplete, so we should merge it back there.
>
> Just to mention, my suggestion does not conflict with splitting patch 2,
> as long as you keep every patch complete on its own.
>
> Patch 5 needs to be squashed into either patch 2 or a patch split out of
> patch 2, because the current patch 2 (or, after any possible split into
> smaller ones, whichever of them enables the feature) is buggy.

* I'll try to segregate the different parts; then we can discuss how to
split them across patches:

Terminology:
    _required_  => that without which migration shall not work at all.
    _essential_ => that without which there may be issues.

1. Enable Multifd and Postcopy together
    - It _requires_ removal of the Multifd capability check in
      migration/options.c.
    - It _requires_ identification of the CH_POSTCOPY connection when
      it arrives
        - so enum { CH_MAIN, CH_MULTIFD, CH_POSTCOPY } is defined.
        - To identify channels, related changes are made to the channel
          discovery part (the if .. else conditionals) in the
          migration_ioc_process_incoming() function.
        - These changes help to establish all channel connections.
          After that, the migration proceeds as usual until it's time
          to start the Postcopy phase.
    - When the time comes to start the Postcopy phase, we shut down the
      multifd channels.
        - It _requires_ calling multifd_send_shutdown().
        - It _requires_ moving the
          file_cleanup_outgoing/socket_cleanup_outgoing calls to the
          migration_cleanup() function.
    - When the Postcopy phase starts, we don't want the
      ram_save_target_page() function to call ram_save_multifd_page(),
      because migrate_multifd() is still true.
        - It _requires_ adding the !migration_in_postcopy() checks.

* The above changes are the minimal ones _required_ to enable multifd
and postcopy together, while also ensuring that migration continues to
work when those options are not enabled together. With these changes,
guest migration across two machines works without any observed failures.

2. The multifd_ram_flush_and_sync() call/command and the
   assert(!migration_in_postcopy()) call in multifd_recv_thread()
    - These calls help to ensure that no multifd data is left behind
      when the Postcopy phase starts.
    - And that multifd_recv threads shall not be active when Postcopy
      is running.
    - They protect the guests from potential state corruption.

* It is up to us to decide whether (2) is _required_ OR _essential_
for the feature. Individual opinions can vary here.

3. Revamp of the channel discovery parts by moving those bits to
   channel.c or other places.
    - This entails moving the channel discovery parts from the
      migration_ioc_process_incoming() function to somewhere else,
      because it makes more sense there and maybe it reduces complexity
      and makes the sources easier to understand.

* It is up to us to decide whether (3) is _required_ OR _essential_
for the feature. Individual opinions can vary here.

* IMHO (1) is _required_, (2) is _essential_, and (3) is neither
_required_ nor _essential_ for the - Enable multifd and postcopy
together - feature. (3) is a change completely unrelated to this
feature. Since it is an individual opinion, we all can think
differently here and that is perfectly fine. Once we have some
consensus, we can decide how to split or merge patches and move
forward.

Hope it helps. Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
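The guard mentioned under (1) -- keeping ram_save_target_page() away
from ram_save_multifd_page() once postcopy has started -- comes down to
a single condition in the page-dispatch path. A simplified sketch
against the QEMU tree (the exact hunk in the series may differ):

===
/* simplified shape of the dispatch; not the exact hunk from the series */
static int ram_save_target_page(RAMState *rs, PageSearchStatus *pss)
{
    ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS;

    if (migrate_multifd() && !migration_in_postcopy()) {
        /* precopy phase: the page may still go through a multifd channel */
        return ram_save_multifd_page(pss->block, offset);
    }

    /* postcopy has started (or multifd is off): use the main channel */
    return ram_save_page(rs, pss);
}
===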
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-04  9:47       ` Prasad Pandit
@ 2025-03-04 14:42         ` Peter Xu
  2025-03-05  7:41           ` Prasad Pandit
  0 siblings, 1 reply; 37+ messages in thread
From: Peter Xu @ 2025-03-04 14:42 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: Fabiano Rosas, qemu-devel, berrange, Prasad Pandit

On Tue, Mar 04, 2025 at 03:17:14PM +0530, Prasad Pandit wrote:
> * I'll try to segregate the different parts; then we can discuss how to
> split them across patches:
>
> Terminology:
>     _required_  => that without which migration shall not work at all.
>     _essential_ => that without which there may be issues.
>
> 1. Enable Multifd and Postcopy together
>     - It _requires_ removal of the Multifd capability check in
>       migration/options.c.
[...]
>     - When the Postcopy phase starts, we don't want the
>       ram_save_target_page() function to call ram_save_multifd_page(),
>       because migrate_multifd() is still true.
>         - It _requires_ adding the !migration_in_postcopy() checks.

IIUC Fabiano is not asking you to drop them, but to split them. A split
still "requires" them to be present, as long as they come before the
enablement patch.

For example, if you want you can put the channel changes into a separate
patch, but without enabling the feature. That single patch (after it is
applied) should keep migration working as before.

[...]
--
Peter Xu

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-04 14:42         ` Peter Xu
@ 2025-03-05  7:41           ` Prasad Pandit
  2025-03-05 13:56             ` Fabiano Rosas
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-05  7:41 UTC (permalink / raw)
  To: Peter Xu; +Cc: Fabiano Rosas, qemu-devel, berrange, Prasad Pandit

Hi,

On Tue, 4 Mar 2025 at 20:12, Peter Xu <peterx@redhat.com> wrote:
> IIUC Fabiano is not asking you to drop them, but to split them. A split
> still "requires" them to be present, as long as they come before the
> enablement patch.

* Yes, same here; even I am not suggesting to drop anything. Fabiano
mentioned the following about the changes being done:
    1. _in preparation for_ the feature (multifd+postcopy enablement)
    2. _as part of_ the feature (multifd+postcopy enablement)
    3. Concerns about git bisect and backporting of patches, i.e. not
       missing patches when backporting.

* I am thinking:
    1. The _required_ changes are the _as part of_ the feature changes.
    2. The refactoring of ram_save_target_page and the patch moving the
       MULTIFD_MAGIC/VERSION macros to multifd.h can be termed as _in
       preparation for_ the feature. Of these
        - The refactoring of the 'ram_save_target_page' function is
          already pulled upstream.
          -> https://gitlab.com/qemu-project/qemu/-/commit/bc38dc2f5f350310724fd7d4f0a09f8c3a4811fa
        - Similarly the patch moving the MULTIFD_ macros can be pulled.
          It has not changed for many revisions.
    3. The _essential_ changes are further refinement of the feature
       (multifd+postcopy enablement).

* IMO, we can split the patches as:
    - One patch for -> the _required_ OR _as part of_ the feature changes
    - One patch for -> the MULTIFD_ macros, as the _in preparation for_
      the feature change
    - One patch for -> the _essential_ changes: flush and sync related
    - Two patches for qtest-case related changes.
This division also seems helpful for the git bisect and backporting
related concerns.

* Currently we are divided over what constitutes the _required_ OR _as
part of_ the feature changes.
    - Hence the request for individual opinions towards that.

* If we want to merge the _required_/_as part of_ the feature and the
_essential_ changes together in one patch - I'm okay.
* If we want to split the _required_/_as part of_ the feature patch
further into two, we need to define exactly how we do that division. -
[???]

* I have shared my thoughts above; I won't hold on to them unduly. We
need to find a way to move forward. I'm willing to go with either way
we decide.

Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-05  7:41           ` Prasad Pandit
@ 2025-03-05 13:56             ` Fabiano Rosas
  2025-03-06  7:51               ` Prasad Pandit
  0 siblings, 1 reply; 37+ messages in thread
From: Fabiano Rosas @ 2025-03-05 13:56 UTC (permalink / raw)
  To: Prasad Pandit, Peter Xu; +Cc: qemu-devel, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hi,
>
> On Tue, 4 Mar 2025 at 20:12, Peter Xu <peterx@redhat.com> wrote:
>> IIUC Fabiano is not asking you to drop them, but to split them. A split
>> still "requires" them to be present, as long as they come before the
>> enablement patch.
>
> * Yes, same here; even I am not suggesting to drop anything. Fabiano
> mentioned the following about the changes being done:
>     1. _in preparation for_ the feature (multifd+postcopy enablement)
>     2. _as part of_ the feature (multifd+postcopy enablement)

The "channel discovery" part can be considered as a separate patch
because, in order to integrate it with the new feature, you had to
introduce the new concept of the enum CH_*, which is not a trivially
verifiable change. That change has the potential to break other
use-cases of migration, and although the tests are passing we must be
cautious about it. Simply having that part as a patch is enough
separation that an issue there won't be conflated with the
multifd+postcopy feature in general.

I also called it "in preparation" for the feature because the channel
discovery part is clearly a preexisting issue, in the sense that had we
had proper magic numbers in all channels, or some other mechanism to
differentiate channels, we wouldn't need this patch. That's a cleanup
you're doing beforehand in order to make the subsequent patches
simpler. Changing that code makes sense regardless of whether there is
a new feature being introduced.

Note that none of this is out of the ordinary; you'll find such
discussions in any thread in this community. It may feel arbitrary to
you because that's tacit knowledge we gathered along the years.

>     3. Concerns about git bisect and backporting of patches, i.e. not
>        missing patches when backporting.

Yes, whenever you can frame a code change as a pure refactoring, you
should do that. If, say, you had to change 100 lines in order to
implement your feature, but 90 of those lines are just preserving the
existing behavior, that is a good candidate to be a separate patch.

Reviewers will be more inclined to approve the patch, since reading
code and making sure it doesn't change behavior is easier than also
accounting for the new behavior simultaneously.

Maintainers moving patches around, merging, testing, etc. will have an
easier time when solving git conflicts, because the separate code is
more easily compared with the old one.

Backporters have similar issues with git, but also face questions about
the safety of bringing a patch as a prerequisite into older QEMU
versions. Imagine someone wants to enable multifd+postcopy in QEMU 9.2;
it'll be easier for them to have the code more clearly defined as
patches.

Another scenario could be if the changes to the channel discovery
actually fix a bug that we're currently not aware of. That would show
up in a bisect session, and we wouldn't want to apply a patch that
fixes a bug if the same patch also enables a feature.

> * I am thinking:
>     1. The _required_ changes are the _as part of_ the feature changes.
>     2. The refactoring of ram_save_target_page and the patch moving the
>        MULTIFD_MAGIC/VERSION macros to multifd.h can be termed as _in
>        preparation for_ the feature.
> Of these
>         - The refactoring of the 'ram_save_target_page' function is
>           already pulled upstream.
>           -> https://gitlab.com/qemu-project/qemu/-/commit/bc38dc2f5f350310724fd7d4f0a09f8c3a4811fa
>         - Similarly the patch moving the MULTIFD_ macros can be pulled.
>           It has not changed for many revisions.
>     3. The _essential_ changes are further refinement of the feature
>        (multifd+postcopy enablement).
>
> * IMO, we can split the patches as:
>     - One patch for -> the _required_ OR _as part of_ the feature changes
>     - One patch for -> the MULTIFD_ macros, as the _in preparation for_
>       the feature change
>     - One patch for -> the _essential_ changes: flush and sync related
>     - Two patches for qtest-case related changes.
> This division also seems helpful for the git bisect and backporting
> related concerns.
>
> * Currently we are divided over what constitutes the _required_ OR _as
> part of_ the feature changes.
>     - Hence the request for individual opinions towards that.
>
> * If we want to merge the _required_/_as part of_ the feature and the
> _essential_ changes together in one patch - I'm okay.
> * If we want to split the _required_/_as part of_ the feature patch
> further into two, we need to define exactly how we do that division. -
> [???]
>

We need an extra patch that reads:

migration: Refactor channel discovery mechanism

The various logical migration channels don't have a standardized way of
advertising themselves and their connections may be seen out of order
by the migration destination. When a new connection arrives, the
incoming migration currently makes use of heuristics to determine which
channel it belongs to.

The next few patches will need to change how the multifd and postcopy
capabilities interact and that affects the channel discovery heuristic.

Refactor the channel discovery heuristic to make it less opaque and
simplify the subsequent patches.

<some description of the new code which might be pertinent>
---

You'd move all of the channel discovery code into this patch. Some of
it will be unreachable because multifd is not yet allowed with postcopy,
but that's fine. You can mention it in the commit message.

About moving the code out of migration.c, it was a suggestion that
you're free to push back on. Ideally, doing the work would be faster
than arguing against it on the mailing list. But that's fine.

About the hang in the test: it doesn't reproduce often, but once it
does, it hangs forever (although I haven't waited that long).

> * I have shared my thoughts above; I won't hold on to them unduly. We
> need to find a way to move forward. I'm willing to go with either way
> we decide.
>
> Thank you.
> ---
>   - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-05 13:56             ` Fabiano Rosas
@ 2025-03-06  7:51               ` Prasad Pandit
  2025-03-06 13:48                 ` Fabiano Rosas
  0 siblings, 1 reply; 37+ messages in thread
From: Prasad Pandit @ 2025-03-06  7:51 UTC (permalink / raw)
  To: Fabiano Rosas; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

Hello Fabiano,

On Wed, 5 Mar 2025 at 19:26, Fabiano Rosas <farosas@suse.de> wrote:
> Note that none of this is out of the ordinary; you'll find such
> discussions in any thread in this community. It may feel arbitrary to
> you because that's tacit knowledge we gathered along the years.

* I understand. I don't find it arbitrary.

> We need an extra patch that reads:
>
> migration: Refactor channel discovery mechanism
>
> The various logical migration channels don't have a standardized way of
> advertising themselves and their connections may be seen out of order
> by the migration destination. When a new connection arrives, the
> incoming migration currently makes use of heuristics to determine which
> channel it belongs to.
>
> The next few patches will need to change how the multifd and postcopy
> capabilities interact and that affects the channel discovery heuristic.
>
> Refactor the channel discovery heuristic to make it less opaque and
> simplify the subsequent patches.
>
> <some description of the new code which might be pertinent>
> ---
>
> You'd move all of the channel discovery code into this patch. Some of
> it will be unreachable because multifd is not yet allowed with postcopy,
> but that's fine. You can mention it in the commit message.

Please see:
-> https://privatebin.net/?dad6f052dd986f9f#FULnfrCV29NkQpvsQyvWuU4HdYjDwFbUPbDtvLro7mwi

* Does this division look okay?

> About moving the code out of migration.c, it was a suggestion that
> you're free to push back on. Ideally, doing the work would be faster
> than arguing against it on the mailing list. But that's fine.

* Same here, I'm not against moving that code part to channel.c OR
doing the work. My suggestion has been to do that movement in another
series, and not to try to do everything in this one series.

> About the hang in the test: it doesn't reproduce often, but once it
> does, it hangs forever (although I haven't waited that long).

* Okay, I'm not seeing it, nor am I able to reproduce it, across 3
different machines. One is my laptop and the other 2 are servers
wherein I'm testing migrations of guests with 64G/128G of RAM and the
guest dirtying memory to the tune of 68M/128M/256M bytes. I'll keep an
eye on it and see if I find something.

Thank you.
---
  - Prasad

^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH v7 0/5] Allow to enable multifd and postcopy migration together
  2025-03-06  7:51               ` Prasad Pandit
@ 2025-03-06 13:48                 ` Fabiano Rosas
  0 siblings, 0 replies; 37+ messages in thread
From: Fabiano Rosas @ 2025-03-06 13:48 UTC (permalink / raw)
  To: Prasad Pandit; +Cc: Peter Xu, qemu-devel, berrange, Prasad Pandit

Prasad Pandit <ppandit@redhat.com> writes:

> Hello Fabiano,
>
[...]
>
> Please see:
> -> https://privatebin.net/?dad6f052dd986f9f#FULnfrCV29NkQpvsQyvWuU4HdYjDwFbUPbDtvLro7mwi
>
> * Does this division look okay?
>

Yes.

>> About moving the code out of migration.c, it was a suggestion that
>> you're free to push back on. Ideally, doing the work would be faster
>> than arguing against it on the mailing list. But that's fine.
>
> * Same here, I'm not against moving that code part to channel.c OR
> doing the work. My suggestion has been to do that movement in another
> series, and not to try to do everything in this one series.
>
>> About the hang in the test: it doesn't reproduce often, but once it
>> does, it hangs forever (although I haven't waited that long).
>
> * Okay, I'm not seeing it, nor am I able to reproduce it, across 3
> different machines. One is my laptop and the other 2 are servers
> wherein I'm testing migrations of guests with 64G/128G of RAM and the
> guest dirtying memory to the tune of 68M/128M/256M bytes. I'll keep an
> eye on it and see if I find something.

Usually a loaded (or slow) machine is needed to reproduce multifd
synchronization issues. Sometimes running the test in a loop, in
parallel with some other workload, helps to uncover them. The CI also
tends to have slower machines that hit these problems.

^ permalink raw reply [flat|nested] 37+ messages in thread