* [PATCH RFC 01/11] migration: Add helper to get target runstate
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 02/11] migration/block: Make late-block-active the default Peter Xu
` (10 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
In 99% cases, after QEMU migrates to dest host, it tries to detect the
target VM runstate using global_state_get_runstate().
There's one outlier so far which is Xen that won't send global state.
That's the major reason why global_state_received() check was always there
together with global_state_get_runstate().
However it's utterly confusing why global_state_received() has anything to
do with "let's start VM or not".
Provide a helper to explain it, then we have an unified entry for getting
the target dest QEMU runstate after migration.
Suggested-by: Fabiano Rosas <farosas@suse.de>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 21 +++++++++++++++++----
1 file changed, 17 insertions(+), 4 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 8c5bd0a75c..d2a6b939cf 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -135,6 +135,21 @@ static bool migration_needs_multiple_sockets(void)
return migrate_multifd() || migrate_postcopy_preempt();
}
+static RunState migration_get_target_runstate(void)
+{
+ /*
+ * When the global state is not migrated, it means we don't know the
+ * runstate of the src QEMU. We don't have much choice but assuming
+ * the VM is running. NOTE: this is pretty rare case, so far only Xen
+ * uses it.
+ */
+ if (!global_state_received()) {
+ return RUN_STATE_RUNNING;
+ }
+
+ return global_state_get_runstate();
+}
+
static bool transport_supports_multi_channels(MigrationAddress *addr)
{
if (addr->transport == MIGRATION_ADDRESS_TYPE_SOCKET) {
@@ -735,8 +750,7 @@ static void process_incoming_migration_bh(void *opaque)
* unless we really are starting the VM.
*/
if (!migrate_late_block_activate() ||
- (autostart && (!global_state_received() ||
- runstate_is_live(global_state_get_runstate())))) {
+ (autostart && runstate_is_live(migration_get_target_runstate()))) {
/* Make sure all file formats throw away their mutable metadata.
* If we get an error here, just don't restart the VM yet. */
bdrv_activate_all(&local_err);
@@ -759,8 +773,7 @@ static void process_incoming_migration_bh(void *opaque)
dirty_bitmap_mig_before_vm_start();
- if (!global_state_received() ||
- runstate_is_live(global_state_get_runstate())) {
+ if (runstate_is_live(migration_get_target_runstate())) {
if (autostart) {
vm_start();
} else {
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 02/11] migration/block: Make late-block-active the default
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
2024-12-04 0:51 ` [PATCH RFC 01/11] migration: Add helper to get target runstate Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 03/11] migration/block: Apply late-block-active behavior to postcopy Peter Xu
` (9 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
Migration capability 'late-block-active' controls when the block drives
will be activated. If enabled, block drives will only be activated until
VM starts, either src runstate was "live" (RUNNING, or SUSPENDED), or it'll
be postponed until qmp_cont().
Let's do this unconditionally. There's no harm to delay activation of
block drives. Meanwhile there's no ABI breakage if dest does it, because
src QEMU has nothing to do with it, so it's no concern on ABI breakage.
IIUC we could avoid introducing this cap when introducing it before, but
now it's still not too late to just always do it. Cap now prone to
removal, but it'll be for later patches.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 38 +++++++++++++++++++-------------------
1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index d2a6b939cf..e6db9cfc50 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -743,24 +743,6 @@ static void process_incoming_migration_bh(void *opaque)
trace_vmstate_downtime_checkpoint("dst-precopy-bh-enter");
- /* If capability late_block_activate is set:
- * Only fire up the block code now if we're going to restart the
- * VM, else 'cont' will do it.
- * This causes file locking to happen; so we don't want it to happen
- * unless we really are starting the VM.
- */
- if (!migrate_late_block_activate() ||
- (autostart && runstate_is_live(migration_get_target_runstate()))) {
- /* Make sure all file formats throw away their mutable metadata.
- * If we get an error here, just don't restart the VM yet. */
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- local_err = NULL;
- autostart = false;
- }
- }
-
/*
* This must happen after all error conditions are dealt with and
* we're sure the VM is going to be running on this host.
@@ -775,7 +757,25 @@ static void process_incoming_migration_bh(void *opaque)
if (runstate_is_live(migration_get_target_runstate())) {
if (autostart) {
- vm_start();
+ /*
+ * Block activation is always delayed until VM starts, either
+ * here (which means we need to start the dest VM right now..),
+ * or until qmp_cont() later.
+ *
+ * We used to have cap 'late-block-activate' but now we do this
+ * unconditionally, as it has no harm but only benefit. E.g.,
+ * it's not part of migration ABI on the time of disk activation.
+ *
+ * Make sure all file formats throw away their mutable
+ * metadata. If error, don't restart the VM yet.
+ */
+ bdrv_activate_all(&local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ local_err = NULL;
+ } else {
+ vm_start();
+ }
} else {
runstate_set(RUN_STATE_PAUSED);
}
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 03/11] migration/block: Apply late-block-active behavior to postcopy
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
2024-12-04 0:51 ` [PATCH RFC 01/11] migration: Add helper to get target runstate Peter Xu
2024-12-04 0:51 ` [PATCH RFC 02/11] migration/block: Make late-block-active the default Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 04/11] migration/block: Fix possible race with block_inactive Peter Xu
` (8 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
Postcopy never cared about late-block-active. However there's no mention
in the capability that it doesn't apply to postcopy.
Considering that we _assumed_ late activation is always good, do that too
for postcopy unconditionally, just like precopy. After this patch, we
should have unified the behavior across all.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/savevm.c | 23 ++++++++++-------------
1 file changed, 10 insertions(+), 13 deletions(-)
diff --git a/migration/savevm.c b/migration/savevm.c
index 98821c8120..0d52fe522c 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2137,22 +2137,19 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
trace_vmstate_downtime_checkpoint("dst-postcopy-bh-announced");
- /* Make sure all file formats throw away their mutable metadata.
- * If we get an error here, just don't restart the VM yet. */
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- local_err = NULL;
- autostart = false;
- }
-
- trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
-
dirty_bitmap_mig_before_vm_start();
if (autostart) {
- /* Hold onto your hats, starting the CPU */
- vm_start();
+ /* Make sure all file formats throw away their mutable metadata.
+ * If we get an error here, just don't restart the VM yet. */
+ bdrv_activate_all(&local_err);
+ trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
+ if (local_err) {
+ error_report_err(local_err);
+ local_err = NULL;
+ } else {
+ vm_start();
+ }
} else {
/* leave it paused and let management decide when to start the CPU */
runstate_set(RUN_STATE_PAUSED);
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 04/11] migration/block: Fix possible race with block_inactive
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (2 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 03/11] migration/block: Apply late-block-active behavior to postcopy Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 05/11] migration/block: Merge block reactivations for fail/cancel Peter Xu
` (7 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
Src QEMU sets block_inactive=true very early before the invalidation takes
place. It means if something wrong happened during setting the flag but
before reaching qemu_savevm_state_complete_precopy_non_iterable() where it
did the invalidation work, it'll make block_inactive flag inconsistent.
For example, think about when qemu_savevm_state_complete_precopy_iterable()
can fail: it will have block_inactive set to true even if all block drives
are active.
Fix that by only update the flag after the invalidation is done.
No Fixes for any commit, because it's not an issue if bdrv_activate_all()
is re-entrant upon all-active disks - false positive block_inactive can
bring nothing more than "trying to active the blocks but they're already
active". However let's still do it right to avoid the inconsistent flag
v.s. reality.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 9 +++------
migration/savevm.c | 2 ++
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index e6db9cfc50..ba5deec5bc 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2749,14 +2749,11 @@ static int migration_completion_precopy(MigrationState *s,
goto out_unlock;
}
- /*
- * Inactivate disks except in COLO, and track that we have done so in order
- * to remember to reactivate them if migration fails or is cancelled.
- */
- s->block_inactive = !migrate_colo();
migration_rate_set(RATE_LIMIT_DISABLED);
+
+ /* Inactivate disks except in COLO */
ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false,
- s->block_inactive);
+ !migrate_colo());
out_unlock:
bql_unlock();
return ret;
diff --git a/migration/savevm.c b/migration/savevm.c
index 0d52fe522c..3c75257318 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1558,6 +1558,8 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
qemu_file_set_error(f, ret);
return ret;
}
+ /* Remember that we did this */
+ s->block_inactive = true;
}
if (!in_postcopy) {
/* Postcopy stream will still be going */
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 05/11] migration/block: Merge block reactivations for fail/cancel
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (3 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 04/11] migration/block: Fix possible race with block_inactive Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 06/11] migration/block: Extend the migration_block_* API to dest side Peter Xu
` (6 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
When migration is either cancelled or failed during switchover, especially
when after the disks are inactivated, QEMU needs to remember re-activate
the disks again before vm starts.
It used to be done separately in two paths: one in qmp_migrate_cancel(),
the other one in the failure path of migration_completion().
It used to be fixed in different commits, all over the places in QEMU. So
these are the ones I see at least:
- In 2016, commit fe904ea824 ("migration: regain control of images when
migration fails to complete")
- In 2017, commit 1d2acc3162 ("migration: re-active images while migration
been canceled after inactive them")
- In 2023, commit 6dab4c93ec ("migration: Attempt disk reactivation in
more failure scenarios")
We could have done better on trying to solve this once and for all. Now it
might be a good chance because we have a better whole picture now.
Put both of the error cases together now into migration_iteration_finish(),
which will be invoked for either of the scenario.
At the meantime, cleanup block_inactive in a few ways:
- Rename it to block_active, which is easier to follow..
- Maintain the flag for the whole lifecycle of QEMU. Otherwise it's not
clear on when the flag is valid or not.
- Put it together with the operations, rather than updating the flag
separately.
Now we set the flag to TRUE from the start, showing block ownership of a
fresh started QEMU (but so far only used in migration module). Then it can
be modified by migration switchover code if invalidation happened. With
that, the flag always reflects the correct block ownership.
NOTE: it can always be racy if there's concurrent operation to change the
block activation status (e.g., qmp_cont() right after migration failure but
right before block re-activate happens), but that's not avoidable even
before this patch, so it's not part of the goal that this patch would like
to achieve, so put aside as of now.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.h | 20 +++++++++-
migration/migration.c | 86 +++++++++++++++++++++++++------------------
migration/savevm.c | 11 ++----
3 files changed, 72 insertions(+), 45 deletions(-)
diff --git a/migration/migration.h b/migration/migration.h
index 3857905c0e..b71c062ad2 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -370,8 +370,20 @@ struct MigrationState {
/* Flag set once the migration thread is running (and needs joining) */
bool migration_thread_running;
- /* Flag set once the migration thread called bdrv_inactivate_all */
- bool block_inactive;
+ /*
+ * Migration-only cache to remember the block layer activation status.
+ * Protected by BQL.
+ *
+ * We need this because..
+ *
+ * - Migration can fail after block devices are invalidated (during
+ * switchover phase). When that happens, we need to be able to
+ * recover the block drive status by re-activating them.
+ *
+ * For freshly started QEMU, block_active is initialized to TRUE
+ * reflecting the scenario where QEMU owns block device ownerships.
+ */
+ bool block_active;
/* Migration is waiting for guest to unplug device */
QemuSemaphore wait_unplug_sem;
@@ -556,4 +568,8 @@ void migration_bitmap_sync_precopy(bool last_stage);
/* migration/block-dirty-bitmap.c */
void dirty_bitmap_mig_init(void);
+/* Wrapper for block active/inactive operations */
+bool migration_block_activate(MigrationState *s);
+bool migration_block_inactivate(MigrationState *s);
+
#endif
diff --git a/migration/migration.c b/migration/migration.c
index ba5deec5bc..8f7d09ca84 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -130,6 +130,47 @@ static void migration_downtime_end(MigrationState *s)
trace_vmstate_downtime_checkpoint("src-downtime-end");
}
+bool migration_block_activate(MigrationState *s)
+{
+ Error *local_err = NULL;
+
+ assert(bql_locked());
+
+ if (s->block_active) {
+ return true;
+ }
+
+ bdrv_activate_all(&local_err);
+ if (local_err) {
+ error_report_err(local_err);
+ return false;
+ }
+
+ s->block_active = true;
+ return true;
+}
+
+bool migration_block_inactivate(MigrationState *s)
+{
+ int ret;
+
+ assert(bql_locked());
+
+ if (!s->block_active) {
+ return true;
+ }
+
+ ret = bdrv_inactivate_all();
+ if (ret) {
+ error_report("%s: bdrv_inactivate_all() failed: %d\n",
+ __func__, ret);
+ return false;
+ }
+
+ s->block_active = false;
+ return true;
+}
+
static bool migration_needs_multiple_sockets(void)
{
return migrate_multifd() || migrate_postcopy_preempt();
@@ -1552,16 +1593,6 @@ static void migrate_fd_cancel(MigrationState *s)
}
}
}
- if (s->state == MIGRATION_STATUS_CANCELLING && s->block_inactive) {
- Error *local_err = NULL;
-
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- } else {
- s->block_inactive = false;
- }
- }
}
void migration_add_notifier_mode(NotifierWithReturn *notify,
@@ -2778,31 +2809,6 @@ static void migration_completion_postcopy(MigrationState *s)
trace_migration_completion_postcopy_end_after_complete();
}
-static void migration_completion_failed(MigrationState *s,
- int current_active_state)
-{
- if (s->block_inactive && (s->state == MIGRATION_STATUS_ACTIVE ||
- s->state == MIGRATION_STATUS_DEVICE)) {
- /*
- * If not doing postcopy, vm_start() will be called: let's
- * regain control on images.
- */
- Error *local_err = NULL;
-
- bql_lock();
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- } else {
- s->block_inactive = false;
- }
- bql_unlock();
- }
-
- migrate_set_state(&s->state, current_active_state,
- MIGRATION_STATUS_FAILED);
-}
-
/**
* migration_completion: Used by migration_thread when there's not much left.
* The caller 'breaks' the loop when this returns.
@@ -2856,7 +2862,8 @@ fail:
error_free(local_err);
}
- migration_completion_failed(s, current_active_state);
+ migrate_set_state(&s->state, current_active_state,
+ MIGRATION_STATUS_FAILED);
}
/**
@@ -3286,6 +3293,11 @@ static void migration_iteration_finish(MigrationState *s)
case MIGRATION_STATUS_FAILED:
case MIGRATION_STATUS_CANCELLED:
case MIGRATION_STATUS_CANCELLING:
+ /*
+ * Re-activate the block drives if they're inactivated. Note, COLO
+ * shouldn't use block_active at all, so it should be no-op there.
+ */
+ migration_block_activate(s);
if (runstate_is_live(s->vm_old_state)) {
if (!runstate_check(RUN_STATE_SHUTDOWN)) {
vm_start();
@@ -3858,6 +3870,8 @@ static void migration_instance_init(Object *obj)
ms->state = MIGRATION_STATUS_NONE;
ms->mbps = -1;
ms->pages_per_second = -1;
+ /* Freshly started QEMU owns all the block devices */
+ ms->block_active = true;
qemu_sem_init(&ms->pause_sem, 0);
qemu_mutex_init(&ms->error_mutex);
diff --git a/migration/savevm.c b/migration/savevm.c
index 3c75257318..a335344c75 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -1549,17 +1549,14 @@ int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f,
if (inactivate_disks) {
/* Inactivate before sending QEMU_VM_EOF so that the
* bdrv_activate_all() on the other end won't fail. */
- ret = bdrv_inactivate_all();
- if (ret) {
- error_setg(&local_err, "%s: bdrv_inactivate_all() failed (%d)",
- __func__, ret);
+ if (!migration_block_inactivate(ms)) {
+ error_setg(&local_err, "%s: bdrv_inactivate_all() failed",
+ __func__);
migrate_set_error(ms, local_err);
error_report_err(local_err);
- qemu_file_set_error(f, ret);
+ qemu_file_set_error(f, -EFAULT);
return ret;
}
- /* Remember that we did this */
- s->block_inactive = true;
}
if (!in_postcopy) {
/* Postcopy stream will still be going */
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 06/11] migration/block: Extend the migration_block_* API to dest side
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (4 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 05/11] migration/block: Merge block reactivations for fail/cancel Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 14:49 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 07/11] migration/block: Apply the migration_block_* API to postcopy Peter Xu
` (5 subsequent siblings)
11 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
Extend the usage of such API to the receiver side, because such status
maintenance is also necessary there.
It's needed to avoid bdrv_inactivate_all() not being invoked when the
drives are already inactive. When invoked, it can crash QEMU. See the
issue link for more information.
So it means we need to start remember the block activation status not only
on src, but also on dest.
One thing tricky here is, when we need to consider incoming QEMU, we need
to also consider its initial status as incoming side. Due to the fact that
before switchover the incoming QEMU doesn't own the disks, we need to set
initial state to FALSE for a incoming QEMU. In this case, postcopy recover
doesn't count.
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2395
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.h | 13 +++++++++++++
migration/migration.c | 13 +++++++------
2 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/migration/migration.h b/migration/migration.h
index b71c062ad2..f477494c5d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -380,8 +380,21 @@ struct MigrationState {
* switchover phase). When that happens, we need to be able to
* recover the block drive status by re-activating them.
*
+ * - Currently bdrv_inactivate_all() is not safe to be invoked on top
+ * of invalidated drives (even if bdrv_activate_all() is actually
+ * safe to be called any time!). It means remembering this could
+ * help migration to make sure it won't invalidate twice in a row,
+ * crashing QEMU. It can happen when we migrate a PAUSED VM from
+ * host1 to host2, then migrate again to host3 without starting it.
+ * TODO: a cleaner solution is to allow safe invoke of
+ * bdrv_inactivate_all() at anytime, like bdrv_activate_all().
+ *
* For freshly started QEMU, block_active is initialized to TRUE
* reflecting the scenario where QEMU owns block device ownerships.
+ *
+ * For incoming QEMU taking a migration stream, block_active is set to
+ * FALSE reflecting that the incoming side doesn't own the block
+ * devices, not until switchover happens.
*/
bool block_active;
diff --git a/migration/migration.c b/migration/migration.c
index 8f7d09ca84..e01264168f 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -701,6 +701,12 @@ migration_incoming_state_setup(MigrationIncomingState *mis, Error **errp)
return false;
}
+ /*
+ * Newly setup QEMU, prepared for incoming migration. Mark the block
+ * active state to reflect that the src currently owns the disks.
+ */
+ migrate_get_current()->block_active = false;
+
migrate_set_state(&mis->state, current, MIGRATION_STATUS_SETUP);
return true;
}
@@ -779,7 +785,6 @@ static void qemu_start_incoming_migration(const char *uri, bool has_channels,
static void process_incoming_migration_bh(void *opaque)
{
- Error *local_err = NULL;
MigrationIncomingState *mis = opaque;
trace_vmstate_downtime_checkpoint("dst-precopy-bh-enter");
@@ -810,11 +815,7 @@ static void process_incoming_migration_bh(void *opaque)
* Make sure all file formats throw away their mutable
* metadata. If error, don't restart the VM yet.
*/
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- local_err = NULL;
- } else {
+ if (migration_block_activate(migrate_get_current())) {
vm_start();
}
} else {
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH RFC 06/11] migration/block: Extend the migration_block_* API to dest side
2024-12-04 0:51 ` [PATCH RFC 06/11] migration/block: Extend the migration_block_* API to dest side Peter Xu
@ 2024-12-04 14:49 ` Peter Xu
0 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 14:49 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, Stefan Hajnoczi
On Tue, Dec 03, 2024 at 07:51:33PM -0500, Peter Xu wrote:
> diff --git a/migration/migration.c b/migration/migration.c
> index 8f7d09ca84..e01264168f 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -701,6 +701,12 @@ migration_incoming_state_setup(MigrationIncomingState *mis, Error **errp)
> return false;
> }
>
> + /*
> + * Newly setup QEMU, prepared for incoming migration. Mark the block
> + * active state to reflect that the src currently owns the disks.
> + */
> + migrate_get_current()->block_active = false;
> +
> migrate_set_state(&mis->state, current, MIGRATION_STATUS_SETUP);
> return true;
> }
I plan to move this over to the end of qmp_migrate_incoming(), which might
be easier to follow (where we updated "once"..), to say that we initialize
the value to FALSE once and for all for an incoming instance.
I don't think there's any functional change on the fixup. The only
difference is when qemu_start_incoming_migration() can fail after
migration_incoming_state_setup(). I don't think it matter hugely on the
block_active flag itself.. as if that failure happens it means the next
qmp_migrate_incoming will overwrite the flag again with nobody being able
to touch it.. But still, if I'll post a new version I'll squash below into
this patch:
===8<===
diff --git a/migration/migration.c b/migration/migration.c
index 1a1c570c5b..bc3a29482a 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -701,12 +701,6 @@ migration_incoming_state_setup(MigrationIncomingState *mis, Error **errp)
return false;
}
- /*
- * Newly setup QEMU, prepared for incoming migration. Mark the block
- * active state to reflect that the src currently owns the disks.
- */
- migrate_get_current()->block_active = false;
-
migrate_set_state(&mis->state, current, MIGRATION_STATUS_SETUP);
return true;
}
@@ -1892,6 +1886,12 @@ void qmp_migrate_incoming(const char *uri, bool has_channels,
return;
}
+ /*
+ * Newly setup incoming QEMU. Mark the block active state to reflect
+ * that the src currently owns the disks.
+ */
+ migrate_get_current()->block_active = false;
+
once = false;
}
--
Peter Xu
^ permalink raw reply related [flat|nested] 15+ messages in thread
* [PATCH RFC 07/11] migration/block: Apply the migration_block_* API to postcopy
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (5 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 06/11] migration/block: Extend the migration_block_* API to dest side Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 08/11] tests/qtest/migration: Move more code under only_target Peter Xu
` (4 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
Postcopy also has similar error handling for re-activation of block
devices. Use the same API as precopy to do the re-activation, for both src
& dst sides.
Signed-off-by: Peter Xu <peterx@redhat.com>
---
migration/migration.c | 22 +++-------------------
migration/savevm.c | 10 ++++------
2 files changed, 7 insertions(+), 25 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index e01264168f..1a1c570c5b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2544,7 +2544,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
QIOChannelBuffer *bioc;
QEMUFile *fb;
uint64_t bandwidth = migrate_max_postcopy_bandwidth();
- bool restart_block = false;
int cur_state = MIGRATION_STATUS_ACTIVE;
if (migrate_postcopy_preempt()) {
@@ -2580,13 +2579,10 @@ static int postcopy_start(MigrationState *ms, Error **errp)
goto fail;
}
- ret = bdrv_inactivate_all();
- if (ret < 0) {
- error_setg_errno(errp, -ret, "%s: Failed in bdrv_inactivate_all()",
- __func__);
+ if (!migration_block_inactivate(ms)) {
+ error_setg(errp, "%s: Failed in bdrv_inactivate_all()", __func__);
goto fail;
}
- restart_block = true;
/*
* Cause any non-postcopiable, but iterative devices to
@@ -2656,8 +2652,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
goto fail_closefb;
}
- restart_block = false;
-
/* Now send that blob */
if (qemu_savevm_send_packaged(ms->to_dst_file, bioc->data, bioc->usage)) {
error_setg(errp, "%s: Failed to send packaged data", __func__);
@@ -2702,17 +2696,7 @@ fail_closefb:
fail:
migrate_set_state(&ms->state, MIGRATION_STATUS_POSTCOPY_ACTIVE,
MIGRATION_STATUS_FAILED);
- if (restart_block) {
- /* A failure happened early enough that we know the destination hasn't
- * accessed block devices, so we're safe to recover.
- */
- Error *local_err = NULL;
-
- bdrv_activate_all(&local_err);
- if (local_err) {
- error_report_err(local_err);
- }
- }
+ migration_block_activate(ms);
migration_call_notifiers(ms, MIG_EVENT_PRECOPY_FAILED, NULL);
bql_unlock();
return -1;
diff --git a/migration/savevm.c b/migration/savevm.c
index a335344c75..2ce5064efc 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2120,7 +2120,6 @@ static int loadvm_postcopy_handle_listen(MigrationIncomingState *mis)
static void loadvm_postcopy_handle_run_bh(void *opaque)
{
- Error *local_err = NULL;
MigrationIncomingState *mis = opaque;
trace_vmstate_downtime_checkpoint("dst-postcopy-bh-enter");
@@ -2141,12 +2140,11 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
if (autostart) {
/* Make sure all file formats throw away their mutable metadata.
* If we get an error here, just don't restart the VM yet. */
- bdrv_activate_all(&local_err);
+ bool success = migration_block_activate(migrate_get_current());
+
trace_vmstate_downtime_checkpoint("dst-postcopy-bh-cache-invalidated");
- if (local_err) {
- error_report_err(local_err);
- local_err = NULL;
- } else {
+
+ if (success) {
vm_start();
}
} else {
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 08/11] tests/qtest/migration: Move more code under only_target
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (6 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 07/11] migration/block: Apply the migration_block_* API to postcopy Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 09/11] tests/qtest/migration: Don't use hardcoded strings for -serial Peter Xu
` (3 subsequent siblings)
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
From: Fabiano Rosas <farosas@suse.de>
The only_target option's purpose is to make sure only the destination
QTestState machine is initialized. This allows the test code to retain
an already initialized source machine (e.g. for doing ping pong
migration).
We have drifted from that a bit when adding new code, so move some
lines under only_target to restore the functionality.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241125144612.16194-2-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-test.c | 44 ++++++++++++++++++++----------------
1 file changed, 25 insertions(+), 19 deletions(-)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 74d3000198..eafc2da806 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -717,7 +717,6 @@ static int test_migrate_start(QTestState **from, QTestState **to,
g_autofree gchar *arch_target = NULL;
/* options for source and target */
g_autofree gchar *arch_opts = NULL;
- g_autofree gchar *cmd_source = NULL;
g_autofree gchar *cmd_target = NULL;
const gchar *ignore_stderr;
g_autofree char *shmem_opts = NULL;
@@ -735,10 +734,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
}
}
- dst_state = (QTestMigrationState) { };
- src_state = (QTestMigrationState) { };
bootfile_create(tmpfs, args->suspend_me);
- src_state.suspend_me = args->suspend_me;
if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
memory_size = "150M";
@@ -817,27 +813,35 @@ static int test_migrate_start(QTestState **from, QTestState **to,
g_test_message("Using machine type: %s", machine);
- cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
- "-machine %s,%s "
- "-name source,debug-threads=on "
- "-m %s "
- "-serial file:%s/src_serial "
- "%s %s %s %s %s",
- kvm_opts ? kvm_opts : "",
- machine, machine_opts,
- memory_size, tmpfs,
- arch_opts ? arch_opts : "",
- arch_source ? arch_source : "",
- shmem_opts ? shmem_opts : "",
- args->opts_source ? args->opts_source : "",
- ignore_stderr);
if (!args->only_target) {
+ g_autofree gchar *cmd_source = NULL;
+
+ src_state = (QTestMigrationState) { };
+ src_state.suspend_me = args->suspend_me;
+
+ cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
+ "-machine %s,%s "
+ "-name source,debug-threads=on "
+ "-m %s "
+ "-serial file:%s/src_serial "
+ "%s %s %s %s %s",
+ kvm_opts ? kvm_opts : "",
+ machine, machine_opts,
+ memory_size, tmpfs,
+ arch_opts ? arch_opts : "",
+ arch_source ? arch_source : "",
+ shmem_opts ? shmem_opts : "",
+ args->opts_source ? args->opts_source : "",
+ ignore_stderr);
+
*from = qtest_init_with_env(QEMU_ENV_SRC, cmd_source);
qtest_qmp_set_event_callback(*from,
migrate_watch_for_events,
&src_state);
}
+ dst_state = (QTestMigrationState) { };
+
cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
"-machine %s,%s "
"-name target,debug-threads=on "
@@ -870,7 +874,9 @@ static int test_migrate_start(QTestState **from, QTestState **to,
* Always enable migration events. Libvirt always uses it, let's try
* to mimic as closer as that.
*/
- migrate_set_capability(*from, "events", true);
+ if (!args->only_target) {
+ migrate_set_capability(*from, "events", true);
+ }
migrate_set_capability(*to, "events", true);
return 0;
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 09/11] tests/qtest/migration: Don't use hardcoded strings for -serial
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (7 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 08/11] tests/qtest/migration: Move more code under only_target Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 19:11 ` Fabiano Rosas
2024-12-04 0:51 ` [PATCH RFC 10/11] tests/qtest/migration: Support cleaning up only one side of migration Peter Xu
` (2 subsequent siblings)
11 siblings, 1 reply; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
From: Fabiano Rosas <farosas@suse.de>
Stop using hardcoded strings for -serial so we can in the next patches
perform more than one migration in a row. Having the serial path
hardcoded means we cannot reuse the code when dst becomes the new src.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241125144612.16194-3-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-helpers.h | 2 +
tests/qtest/migration-helpers.c | 8 ++++
tests/qtest/migration-test.c | 68 ++++++++++++++++++---------------
3 files changed, 48 insertions(+), 30 deletions(-)
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 72dba369fb..c7a36a33d6 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -20,6 +20,7 @@ typedef struct QTestMigrationState {
bool resume_seen;
bool suspend_seen;
bool suspend_me;
+ char *serial;
} QTestMigrationState;
bool migrate_watch_for_events(QTestState *who, const char *name,
@@ -64,5 +65,6 @@ static inline bool probe_o_direct_support(const char *tmpfs)
#endif
void migration_test_add(const char *path, void (*fn)(void));
void migration_event_wait(QTestState *s, const char *target);
+char *migrate_get_unique_serial(const char *tmpfs);
#endif /* MIGRATION_HELPERS_H */
diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index 3f8ba7fa8e..7c0b54ce0e 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -528,3 +528,11 @@ void migration_event_wait(QTestState *s, const char *target)
qobject_unref(response);
} while (!found);
}
+
+char *migrate_get_unique_serial(const char *tmpfs)
+{
+ static int i;
+
+ assert(i < INT_MAX);
+ return g_strdup_printf("%s/serial_%d", tmpfs, i++);
+}
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index eafc2da806..1452778c81 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -192,9 +192,8 @@ static void bootfile_create(char *dir, bool suspend_me)
* we get an 'A' followed by an endless string of 'B's
* but on the destination we won't have the A (unless we enabled suspend/resume)
*/
-static void wait_for_serial(const char *side)
+static void wait_for_serial(const char *serialpath)
{
- g_autofree char *serialpath = g_strdup_printf("%s/%s", tmpfs, side);
FILE *serialfile = fopen(serialpath, "r");
do {
@@ -216,7 +215,7 @@ static void wait_for_serial(const char *side)
break;
default:
- fprintf(stderr, "Unexpected %d on %s serial\n", readvalue, side);
+ fprintf(stderr, "Unexpected %d on %s\n", readvalue, serialpath);
g_assert_not_reached();
}
} while (true);
@@ -818,16 +817,17 @@ static int test_migrate_start(QTestState **from, QTestState **to,
src_state = (QTestMigrationState) { };
src_state.suspend_me = args->suspend_me;
+ src_state.serial = migrate_get_unique_serial(tmpfs);
cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
"-machine %s,%s "
"-name source,debug-threads=on "
"-m %s "
- "-serial file:%s/src_serial "
+ "-serial file:%s "
"%s %s %s %s %s",
kvm_opts ? kvm_opts : "",
machine, machine_opts,
- memory_size, tmpfs,
+ memory_size, src_state.serial,
arch_opts ? arch_opts : "",
arch_source ? arch_source : "",
shmem_opts ? shmem_opts : "",
@@ -841,17 +841,18 @@ static int test_migrate_start(QTestState **from, QTestState **to,
}
dst_state = (QTestMigrationState) { };
+ dst_state.serial = migrate_get_unique_serial(tmpfs);
cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
"-machine %s,%s "
"-name target,debug-threads=on "
"-m %s "
- "-serial file:%s/dest_serial "
+ "-serial file:%s "
"-incoming %s "
"%s %s %s %s %s",
kvm_opts ? kvm_opts : "",
machine, machine_opts,
- memory_size, tmpfs, uri,
+ memory_size, dst_state.serial, uri,
arch_opts ? arch_opts : "",
arch_target ? arch_target : "",
shmem_opts ? shmem_opts : "",
@@ -911,8 +912,10 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
qtest_quit(to);
cleanup("migsocket");
- cleanup("src_serial");
- cleanup("dest_serial");
+ unlink(src_state.serial);
+ g_free(src_state.serial);
+ unlink(dst_state.serial);
+ g_free(dst_state.serial);
cleanup(FILE_TEST_FILENAME);
}
@@ -1290,7 +1293,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
" 'port': '0' } } ] } }");
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
wait_for_suspend(from, &src_state);
migrate_qmp(from, to, NULL, NULL, "{}");
@@ -1314,7 +1317,7 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to,
}
/* Make sure we get at least one "B" on destination */
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
if (uffd_feature_thread_id) {
read_blocktime(to);
@@ -1719,7 +1722,7 @@ static void test_precopy_common(MigrateCommon *args)
/* Wait for the first serial output from the source */
if (args->result == MIG_TEST_SUCCEED) {
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
wait_for_suspend(from, &src_state);
}
@@ -1796,7 +1799,7 @@ static void test_precopy_common(MigrateCommon *args)
qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
}
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
}
finish:
@@ -1871,7 +1874,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
}
migrate_ensure_converge(from);
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
if (stop_src) {
qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
@@ -1898,7 +1901,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
}
wait_for_resume(to, &dst_state);
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
if (check_offset) {
file_check_offset_region();
@@ -2041,7 +2044,7 @@ static void test_ignore_shared(void)
migrate_set_capability(to, "x-ignore-shared", true);
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
migrate_qmp(from, to, uri, NULL, "{}");
@@ -2051,7 +2054,7 @@ static void test_ignore_shared(void)
qtest_qmp_eventwait(to, "RESUME");
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
wait_for_migration_complete(from);
/* Check whether shared RAM has been really skipped */
@@ -2669,7 +2672,7 @@ static void do_test_validate_uuid(MigrateStart *args, bool should_fail)
migrate_set_capability(from, "validate-uuid", true);
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
migrate_qmp(from, to, uri, NULL, "{}");
@@ -2733,7 +2736,7 @@ static void do_test_validate_uri_channel(MigrateCommon *args)
}
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
/*
* 'uri' and 'channels' validation is checked even before the migration
@@ -2823,7 +2826,7 @@ static void test_migrate_auto_converge(void)
migrate_set_capability(from, "pause-before-switchover", true);
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
migrate_qmp(from, to, uri, NULL, "{}");
@@ -2885,7 +2888,7 @@ static void test_migrate_auto_converge(void)
qtest_qmp_eventwait(to, "RESUME");
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
wait_for_migration_complete(from);
test_migrate_end(from, to, true);
@@ -3296,7 +3299,7 @@ static void test_multifd_tcp_cancel(void)
migrate_incoming_qmp(to, "tcp:127.0.0.1:0", "{}");
/* Wait for the first serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
migrate_qmp(from, to, NULL, NULL, "{}");
@@ -3307,7 +3310,8 @@ static void test_multifd_tcp_cancel(void)
/* Make sure QEMU process "to" exited */
qtest_set_expected_status(to, EXIT_FAILURE);
qtest_wait_qemu(to);
- qtest_quit(to);
+ unlink(dst_state.serial);
+ g_free(dst_state.serial);
/*
* Ensure the source QEMU finishes its cancellation process before we
@@ -3345,7 +3349,7 @@ static void test_multifd_tcp_cancel(void)
wait_for_stop(from, &src_state);
qtest_qmp_eventwait(to2, "RESUME");
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
wait_for_migration_complete(from);
test_migrate_end(from, to2, true);
}
@@ -3488,13 +3492,16 @@ static QTestState *dirtylimit_start_vm(void)
QTestState *vm = NULL;
g_autofree gchar *cmd = NULL;
+ src_state = (QTestMigrationState) { };
+ src_state.serial = migrate_get_unique_serial(tmpfs);
+
bootfile_create(tmpfs, false);
cmd = g_strdup_printf("-accel kvm,dirty-ring-size=4096 "
"-name dirtylimit-test,debug-threads=on "
"-m 150M -smp 1 "
- "-serial file:%s/vm_serial "
+ "-serial file:%s "
"-drive file=%s,format=raw ",
- tmpfs, bootpath);
+ src_state.serial, bootpath);
vm = qtest_init(cmd);
return vm;
@@ -3503,7 +3510,8 @@ static QTestState *dirtylimit_start_vm(void)
static void dirtylimit_stop_vm(QTestState *vm)
{
qtest_quit(vm);
- cleanup("vm_serial");
+ unlink(src_state.serial);
+ g_free(src_state.serial);
}
static void test_vcpu_dirty_limit(void)
@@ -3519,7 +3527,7 @@ static void test_vcpu_dirty_limit(void)
vm = dirtylimit_start_vm();
/* Wait for the first serial output from the vm*/
- wait_for_serial("vm_serial");
+ wait_for_serial(src_state.serial);
/* Do dirtyrate measurement with calc time equals 1s */
calc_dirty_rate(vm, 1);
@@ -3612,7 +3620,7 @@ static void migrate_dirty_limit_wait_showup(QTestState *from,
migrate_set_capability(from, "pause-before-switchover", true);
/* Wait for the serial output from the source */
- wait_for_serial("src_serial");
+ wait_for_serial(src_state.serial);
}
/*
@@ -3751,7 +3759,7 @@ static void test_migrate_dirty_limit(void)
qtest_qmp_eventwait(to, "RESUME");
- wait_for_serial("dest_serial");
+ wait_for_serial(dst_state.serial);
wait_for_migration_complete(from);
test_migrate_end(from, to, true);
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH RFC 09/11] tests/qtest/migration: Don't use hardcoded strings for -serial
2024-12-04 0:51 ` [PATCH RFC 09/11] tests/qtest/migration: Don't use hardcoded strings for -serial Peter Xu
@ 2024-12-04 19:11 ` Fabiano Rosas
0 siblings, 0 replies; 15+ messages in thread
From: Fabiano Rosas @ 2024-12-04 19:11 UTC (permalink / raw)
To: Peter Xu, qemu-devel
Cc: qemu-block, Dr . David Alan Gilbert, Eric Blake, Kevin Wolf,
Vladimir Sementsov-Ogievskiy, Daniel P . Berrangé,
Andrey Drobyshev, peterx, Stefan Hajnoczi
Peter Xu <peterx@redhat.com> writes:
> From: Fabiano Rosas <farosas@suse.de>
>
> Stop using hardcoded strings for -serial so we can in the next patches
> perform more than one migration in a row. Having the serial path
> hardcoded means we cannot reuse the code when dst becomes the new src.
>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> Link: https://lore.kernel.org/r/20241125144612.16194-3-farosas@suse.de
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
> tests/qtest/migration-helpers.h | 2 +
> tests/qtest/migration-helpers.c | 8 ++++
> tests/qtest/migration-test.c | 68 ++++++++++++++++++---------------
> 3 files changed, 48 insertions(+), 30 deletions(-)
>
> diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
> index 72dba369fb..c7a36a33d6 100644
> --- a/tests/qtest/migration-helpers.h
> +++ b/tests/qtest/migration-helpers.h
> @@ -20,6 +20,7 @@ typedef struct QTestMigrationState {
> bool resume_seen;
> bool suspend_seen;
> bool suspend_me;
> + char *serial;
> } QTestMigrationState;
>
> bool migrate_watch_for_events(QTestState *who, const char *name,
> @@ -64,5 +65,6 @@ static inline bool probe_o_direct_support(const char *tmpfs)
> #endif
> void migration_test_add(const char *path, void (*fn)(void));
> void migration_event_wait(QTestState *s, const char *target);
> +char *migrate_get_unique_serial(const char *tmpfs);
>
> #endif /* MIGRATION_HELPERS_H */
> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
> index 3f8ba7fa8e..7c0b54ce0e 100644
> --- a/tests/qtest/migration-helpers.c
> +++ b/tests/qtest/migration-helpers.c
> @@ -528,3 +528,11 @@ void migration_event_wait(QTestState *s, const char *target)
> qobject_unref(response);
> } while (!found);
> }
> +
> +char *migrate_get_unique_serial(const char *tmpfs)
> +{
> + static int i;
> +
> + assert(i < INT_MAX);
> + return g_strdup_printf("%s/serial_%d", tmpfs, i++);
> +}
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index eafc2da806..1452778c81 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -192,9 +192,8 @@ static void bootfile_create(char *dir, bool suspend_me)
> * we get an 'A' followed by an endless string of 'B's
> * but on the destination we won't have the A (unless we enabled suspend/resume)
> */
> -static void wait_for_serial(const char *side)
> +static void wait_for_serial(const char *serialpath)
> {
> - g_autofree char *serialpath = g_strdup_printf("%s/%s", tmpfs, side);
> FILE *serialfile = fopen(serialpath, "r");
>
> do {
> @@ -216,7 +215,7 @@ static void wait_for_serial(const char *side)
> break;
>
> default:
> - fprintf(stderr, "Unexpected %d on %s serial\n", readvalue, side);
> + fprintf(stderr, "Unexpected %d on %s\n", readvalue, serialpath);
> g_assert_not_reached();
> }
> } while (true);
> @@ -818,16 +817,17 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>
> src_state = (QTestMigrationState) { };
> src_state.suspend_me = args->suspend_me;
> + src_state.serial = migrate_get_unique_serial(tmpfs);
>
> cmd_source = g_strdup_printf("-accel kvm%s -accel tcg "
> "-machine %s,%s "
> "-name source,debug-threads=on "
> "-m %s "
> - "-serial file:%s/src_serial "
> + "-serial file:%s "
> "%s %s %s %s %s",
> kvm_opts ? kvm_opts : "",
> machine, machine_opts,
> - memory_size, tmpfs,
> + memory_size, src_state.serial,
> arch_opts ? arch_opts : "",
> arch_source ? arch_source : "",
> shmem_opts ? shmem_opts : "",
> @@ -841,17 +841,18 @@ static int test_migrate_start(QTestState **from, QTestState **to,
> }
>
> dst_state = (QTestMigrationState) { };
> + dst_state.serial = migrate_get_unique_serial(tmpfs);
>
> cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
> "-machine %s,%s "
> "-name target,debug-threads=on "
> "-m %s "
> - "-serial file:%s/dest_serial "
> + "-serial file:%s "
> "-incoming %s "
> "%s %s %s %s %s",
> kvm_opts ? kvm_opts : "",
> machine, machine_opts,
> - memory_size, tmpfs, uri,
> + memory_size, dst_state.serial, uri,
> arch_opts ? arch_opts : "",
> arch_target ? arch_target : "",
> shmem_opts ? shmem_opts : "",
> @@ -911,8 +912,10 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
> qtest_quit(to);
>
> cleanup("migsocket");
> - cleanup("src_serial");
> - cleanup("dest_serial");
> + unlink(src_state.serial);
> + g_free(src_state.serial);
> + unlink(dst_state.serial);
> + g_free(dst_state.serial);
> cleanup(FILE_TEST_FILENAME);
> }
>
> @@ -1290,7 +1293,7 @@ static int migrate_postcopy_prepare(QTestState **from_ptr,
> " 'port': '0' } } ] } }");
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
> wait_for_suspend(from, &src_state);
>
> migrate_qmp(from, to, NULL, NULL, "{}");
> @@ -1314,7 +1317,7 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to,
> }
>
> /* Make sure we get at least one "B" on destination */
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
>
> if (uffd_feature_thread_id) {
> read_blocktime(to);
> @@ -1719,7 +1722,7 @@ static void test_precopy_common(MigrateCommon *args)
>
> /* Wait for the first serial output from the source */
> if (args->result == MIG_TEST_SUCCEED) {
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
> wait_for_suspend(from, &src_state);
> }
>
> @@ -1796,7 +1799,7 @@ static void test_precopy_common(MigrateCommon *args)
> qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
> }
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
> }
>
> finish:
> @@ -1871,7 +1874,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
> }
>
> migrate_ensure_converge(from);
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> if (stop_src) {
> qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
> @@ -1898,7 +1901,7 @@ static void test_file_common(MigrateCommon *args, bool stop_src)
> }
> wait_for_resume(to, &dst_state);
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
>
> if (check_offset) {
> file_check_offset_region();
> @@ -2041,7 +2044,7 @@ static void test_ignore_shared(void)
> migrate_set_capability(to, "x-ignore-shared", true);
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> migrate_qmp(from, to, uri, NULL, "{}");
>
> @@ -2051,7 +2054,7 @@ static void test_ignore_shared(void)
>
> qtest_qmp_eventwait(to, "RESUME");
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
> wait_for_migration_complete(from);
>
> /* Check whether shared RAM has been really skipped */
> @@ -2669,7 +2672,7 @@ static void do_test_validate_uuid(MigrateStart *args, bool should_fail)
> migrate_set_capability(from, "validate-uuid", true);
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> migrate_qmp(from, to, uri, NULL, "{}");
>
> @@ -2733,7 +2736,7 @@ static void do_test_validate_uri_channel(MigrateCommon *args)
> }
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> /*
> * 'uri' and 'channels' validation is checked even before the migration
> @@ -2823,7 +2826,7 @@ static void test_migrate_auto_converge(void)
> migrate_set_capability(from, "pause-before-switchover", true);
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> migrate_qmp(from, to, uri, NULL, "{}");
>
> @@ -2885,7 +2888,7 @@ static void test_migrate_auto_converge(void)
>
> qtest_qmp_eventwait(to, "RESUME");
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
> wait_for_migration_complete(from);
>
> test_migrate_end(from, to, true);
> @@ -3296,7 +3299,7 @@ static void test_multifd_tcp_cancel(void)
> migrate_incoming_qmp(to, "tcp:127.0.0.1:0", "{}");
>
> /* Wait for the first serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
>
> migrate_qmp(from, to, NULL, NULL, "{}");
>
> @@ -3307,7 +3310,8 @@ static void test_multifd_tcp_cancel(void)
> /* Make sure QEMU process "to" exited */
> qtest_set_expected_status(to, EXIT_FAILURE);
> qtest_wait_qemu(to);
> - qtest_quit(to);
> + unlink(dst_state.serial);
> + g_free(dst_state.serial);
>
> /*
> * Ensure the source QEMU finishes its cancellation process before we
> @@ -3345,7 +3349,7 @@ static void test_multifd_tcp_cancel(void)
> wait_for_stop(from, &src_state);
> qtest_qmp_eventwait(to2, "RESUME");
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
> wait_for_migration_complete(from);
> test_migrate_end(from, to2, true);
> }
> @@ -3488,13 +3492,16 @@ static QTestState *dirtylimit_start_vm(void)
> QTestState *vm = NULL;
> g_autofree gchar *cmd = NULL;
>
> + src_state = (QTestMigrationState) { };
> + src_state.serial = migrate_get_unique_serial(tmpfs);
> +
> bootfile_create(tmpfs, false);
> cmd = g_strdup_printf("-accel kvm,dirty-ring-size=4096 "
> "-name dirtylimit-test,debug-threads=on "
> "-m 150M -smp 1 "
> - "-serial file:%s/vm_serial "
> + "-serial file:%s "
> "-drive file=%s,format=raw ",
> - tmpfs, bootpath);
> + src_state.serial, bootpath);
>
> vm = qtest_init(cmd);
> return vm;
> @@ -3503,7 +3510,8 @@ static QTestState *dirtylimit_start_vm(void)
> static void dirtylimit_stop_vm(QTestState *vm)
> {
> qtest_quit(vm);
> - cleanup("vm_serial");
> + unlink(src_state.serial);
> + g_free(src_state.serial);
> }
>
> static void test_vcpu_dirty_limit(void)
> @@ -3519,7 +3527,7 @@ static void test_vcpu_dirty_limit(void)
> vm = dirtylimit_start_vm();
>
> /* Wait for the first serial output from the vm*/
> - wait_for_serial("vm_serial");
> + wait_for_serial(src_state.serial);
>
> /* Do dirtyrate measurement with calc time equals 1s */
> calc_dirty_rate(vm, 1);
> @@ -3612,7 +3620,7 @@ static void migrate_dirty_limit_wait_showup(QTestState *from,
> migrate_set_capability(from, "pause-before-switchover", true);
>
> /* Wait for the serial output from the source */
> - wait_for_serial("src_serial");
> + wait_for_serial(src_state.serial);
> }
>
> /*
> @@ -3751,7 +3759,7 @@ static void test_migrate_dirty_limit(void)
>
> qtest_qmp_eventwait(to, "RESUME");
>
> - wait_for_serial("dest_serial");
> + wait_for_serial(dst_state.serial);
> wait_for_migration_complete(from);
>
> test_migrate_end(from, to, true);
There's a preexisting issue in the dirty_limit test of not doing cleanup
between the two migrations that it does, causing (with this patch) one
of the serial files to get left behind, which breaks make check
SPEED=slow because the test directory cannot be removed.
Just an FYI. I'm fixing that on master and we can update this patch
afterward.
^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH RFC 10/11] tests/qtest/migration: Support cleaning up only one side of migration
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (8 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 09/11] tests/qtest/migration: Don't use hardcoded strings for -serial Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 0:51 ` [PATCH RFC 11/11] tests/qtest/migration: Test successive migrations Peter Xu
2024-12-04 17:25 ` [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
From: Fabiano Rosas <farosas@suse.de>
We don't always want to cleanup both VMs at the same time. One example
is the multifd cancel test, where there's a second migration reusing
the source VM. The next patches will add another instance, keeping the
destination VM instead.
Extract the cleanup routine from test_migrate_end() into another
function so it can be reused.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241125144612.16194-4-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-test.c | 35 +++++++++++++++++++++--------------
1 file changed, 21 insertions(+), 14 deletions(-)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 1452778c81..74645f749c 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -883,13 +883,29 @@ static int test_migrate_start(QTestState **from, QTestState **to,
return 0;
}
+static void migrate_cleanup(QTestState *from, QTestState *to)
+{
+ if (from) {
+ qtest_quit(from);
+ unlink(src_state.serial);
+ g_free(src_state.serial);
+ }
+
+ if (to) {
+ qtest_quit(to);
+ unlink(dst_state.serial);
+ g_free(dst_state.serial);
+ }
+
+ cleanup("migsocket");
+ cleanup(FILE_TEST_FILENAME);
+}
+
static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
{
unsigned char dest_byte_a, dest_byte_b, dest_byte_c, dest_byte_d;
- qtest_quit(from);
-
- if (test_dest) {
+ if (to && test_dest) {
qtest_memread(to, start_address, &dest_byte_a, 1);
/* Destination still running, wait for a byte to change */
@@ -909,14 +925,7 @@ static void test_migrate_end(QTestState *from, QTestState *to, bool test_dest)
check_guests_ram(to);
}
- qtest_quit(to);
-
- cleanup("migsocket");
- unlink(src_state.serial);
- g_free(src_state.serial);
- unlink(dst_state.serial);
- g_free(dst_state.serial);
- cleanup(FILE_TEST_FILENAME);
+ migrate_cleanup(from, to);
}
#ifdef CONFIG_GNUTLS
@@ -3309,9 +3318,7 @@ static void test_multifd_tcp_cancel(void)
/* Make sure QEMU process "to" exited */
qtest_set_expected_status(to, EXIT_FAILURE);
- qtest_wait_qemu(to);
- unlink(dst_state.serial);
- g_free(dst_state.serial);
+ migrate_cleanup(NULL, to);
/*
* Ensure the source QEMU finishes its cancellation process before we
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* [PATCH RFC 11/11] tests/qtest/migration: Test successive migrations
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (9 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 10/11] tests/qtest/migration: Support cleaning up only one side of migration Peter Xu
@ 2024-12-04 0:51 ` Peter Xu
2024-12-04 17:25 ` [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 0:51 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, peterx,
Stefan Hajnoczi
From: Fabiano Rosas <farosas@suse.de>
Add a framework for running migrations back and forth between two
guests (aka ping-pong migration). We have seen a couple of bugs that
only reproduce when a guest is migrated and the destination machine is
used as the source of a new migration.
Add a simple test that does 2 migrations and another test that uses
the late-block-activate capability, which is one area where a bug has
been found.
Note that this does not reuse any of the existing tests
(e.g. test_precopy_common), because there is too much ambiguity
regarding how to handle the hooks, args->result, stop/continue, etc.
Signed-off-by: Fabiano Rosas <farosas@suse.de>
Link: https://lore.kernel.org/r/20241125144612.16194-6-farosas@suse.de
Signed-off-by: Peter Xu <peterx@redhat.com>
---
tests/qtest/migration-test.c | 121 +++++++++++++++++++++++++++++++++++
1 file changed, 121 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 74645f749c..4955b3b9eb 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1819,6 +1819,104 @@ finish:
test_migrate_end(from, to, args->result == MIG_TEST_SUCCEED);
}
+static void migrate_dst_becomes_src(QTestState **from, QTestState **to,
+ QTestMigrationState *src_state,
+ QTestMigrationState *dst_state)
+{
+ *from = *to;
+
+ *src_state = (QTestMigrationState) { };
+ src_state->serial = dst_state->serial;
+ qtest_qmp_set_event_callback(*from, migrate_watch_for_events, src_state);
+
+ src_state->stop_seen = dst_state->stop_seen;
+}
+
+static void test_precopy_ping_pong_common(MigrateCommon *args, int n,
+ bool late_block_activate)
+{
+ QTestState *from, *to;
+ bool keep_paused = false;
+
+ g_assert(!args->live);
+
+ g_test_message("Migrating back and forth %d times", n);
+
+ for (int i = 0; i < n; i++) {
+ g_test_message("%s (%d/%d)", i % 2 ? "pong" : "ping", i + 1, n);
+
+ if (test_migrate_start(&from, &to, args->listen_uri, &args->start)) {
+ return;
+ }
+
+ if (late_block_activate) {
+ migrate_set_capability(from, "late-block-activate", true);
+ migrate_set_capability(to, "late-block-activate", true);
+
+ /*
+ * The late-block-activate capability expects that the
+ * management layer will issue a qmp_continue command on
+ * the destination once the migration finishes. In order
+ * to test the capability properly, make sure the test is
+ * not issuing spurious continue commands.
+ */
+ keep_paused = true;
+ }
+
+ if (!src_state.stop_seen) {
+ wait_for_serial(src_state.serial);
+ }
+
+ /* because of !args->live */
+ qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
+ wait_for_stop(from, &src_state);
+
+ migrate_ensure_converge(from);
+ migrate_qmp(from, to, args->connect_uri, args->connect_channels, "{}");
+
+ wait_for_migration_complete(from);
+ wait_for_migration_complete(to);
+
+ /* note check_guests_ram() requires a stopped guest */
+ check_guests_ram(to);
+
+ if (keep_paused) {
+ /*
+ * Nothing issued continue on the destination, reflect
+ * that the guest is paused in the dst_state.
+ */
+ dst_state.stop_seen = true;
+ } else {
+ qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
+ wait_for_serial(dst_state.serial);
+ dst_state.stop_seen = false;
+ }
+
+ /*
+ * Last iteration, let the guest run to make sure there's no
+ * memory corruption. The test_migrate_end() below will check
+ * the memory one last time.
+ */
+ if (i == n - 1) {
+ qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
+ wait_for_serial(dst_state.serial);
+ dst_state.stop_seen = false;
+ break;
+ }
+
+ /*
+ * Prepare for migrating back: clear the original source and
+ * switch source & destination.
+ */
+ migrate_cleanup(from, NULL);
+ migrate_dst_becomes_src(&from, &to, &src_state, &dst_state);
+
+ args->start.only_target = true;
+ }
+
+ test_migrate_end(from, to, true);
+}
+
static void file_dirty_offset_region(void)
{
g_autofree char *path = g_strdup_printf("%s/%s", tmpfs, FILE_TEST_FILENAME);
@@ -3772,6 +3870,24 @@ static void test_migrate_dirty_limit(void)
test_migrate_end(from, to, true);
}
+static void test_precopy_ping_pong(void)
+{
+ MigrateCommon args = {
+ .listen_uri = "tcp:127.0.0.1:0",
+ };
+
+ test_precopy_ping_pong_common(&args, 2, false);
+}
+
+static void test_precopy_ping_pong_late_block(void)
+{
+ MigrateCommon args = {
+ .listen_uri = "tcp:127.0.0.1:0",
+ };
+
+ test_precopy_ping_pong_common(&args, 2, true);
+}
+
static bool kvm_dirty_ring_supported(void)
{
#if defined(__linux__) && defined(HOST_X86_64)
@@ -4055,6 +4171,11 @@ int main(int argc, char **argv)
}
}
+ migration_test_add("/migration/precopy/ping-pong/plain",
+ test_precopy_ping_pong);
+ migration_test_add("/migration/precopy/ping-pong/late-block-activate",
+ test_precopy_ping_pong_late_block);
+
ret = g_test_run();
g_assert_cmpint(ret, ==, 0);
--
2.47.0
^ permalink raw reply related [flat|nested] 15+ messages in thread* Re: [PATCH RFC 00/11] migration/block: disk activation rewrite
2024-12-04 0:51 [PATCH RFC 00/11] migration/block: disk activation rewrite Peter Xu
` (10 preceding siblings ...)
2024-12-04 0:51 ` [PATCH RFC 11/11] tests/qtest/migration: Test successive migrations Peter Xu
@ 2024-12-04 17:25 ` Peter Xu
11 siblings, 0 replies; 15+ messages in thread
From: Peter Xu @ 2024-12-04 17:25 UTC (permalink / raw)
To: qemu-devel
Cc: qemu-block, Fabiano Rosas, Dr . David Alan Gilbert, Eric Blake,
Kevin Wolf, Vladimir Sementsov-Ogievskiy,
Daniel P . Berrangé, Andrey Drobyshev, Stefan Hajnoczi
On Tue, Dec 03, 2024 at 07:51:27PM -0500, Peter Xu wrote:
> migration/block: Merge block reactivations for fail/cancel
> migration/block: Extend the migration_block_* API to dest side
> migration/block: Apply the migration_block_* API to postcopy
I just noticed these three patches cannot be separate, because right after
we introduce a flag to cache disk activation status, it will need to apply
to all the code or inconsistency can happen if someone applies one of the
patches, for example.
I also overlooked that we must take qmp_cont() into the picture too, so
that will also need to use the API so the flag will be consistent.
So in the next version I'll squash the three patches into one, and also use
the new API in qmp_cont(), so the status flag should always be consistent.
Again, to block layer developers: please help if any of you know how to
make bdrv_inactivate_all() safe to be called on top of inactivated disks,
or any reasoning on why it mustn't. It could be very helpful.
--
Peter Xu
^ permalink raw reply [flat|nested] 15+ messages in thread