* [PATCH V4 00/11] fix migration of suspended runstate
@ 2023-08-29 18:17 Steve Sistare
2023-08-29 18:17 ` [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start Steve Sistare
` (10 more replies)
0 siblings, 11 replies; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:17 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Migration of a guest in the suspended runstate is broken. The incoming
migration code automatically tries to wake the guest, which is wrong;
the guest should end migration in the same runstate it started. Further,
for a restored snapshot, the automatic wakeup fails. The runstate is
RUNNING, but the guest is not. See the commit messages for the details.
Changes in V2:
* simplify "start on wakeup request"
* fix postcopy, snapshot, and background migration
* refactor fixes for each type of migration
* explicitly handled suspended events and runstate in tests
* add test for postcopy and background migration
Changes in V3:
* rebase to tip
* fix hang in new function migrate_wait_for_dirty_mem
Changes in V4:
* rebase to tip
* add patch for vm_prepare_start (thanks Peter)
* add patch to preserve cpu ticks
Steve Sistare (11):
cpus: pass runstate to vm_prepare_start
migration: preserve suspended runstate
migration: add runstate function
migration: preserve suspended for snapshot
migration: preserve suspended for bg_migration
migration: preserve cpu ticks if suspended
tests/qtest: migration events
tests/qtest: option to suspend during migration
tests/qtest: precopy migration with suspend
tests/qtest: postcopy migration with suspend
tests/qtest: background migration with suspend
backends/tpm/tpm_emulator.c | 2 +-
gdbstub/softmmu.c | 2 +-
hw/usb/hcd-ehci.c | 2 +-
hw/usb/redirect.c | 2 +-
hw/xen/xen-hvm-common.c | 2 +-
include/sysemu/runstate.h | 3 +-
migration/migration.c | 49 +++++-----
migration/migration.h | 1 +
migration/savevm.c | 9 +-
softmmu/cpu-timers.c | 36 ++++++-
softmmu/cpus.c | 8 +-
softmmu/runstate.c | 3 +
tests/migration/i386/Makefile | 5 +-
tests/migration/i386/a-b-bootblock.S | 51 +++++++++-
tests/migration/i386/a-b-bootblock.h | 22 +++--
tests/qtest/migration-helpers.c | 27 ++----
tests/qtest/migration-helpers.h | 9 +-
tests/qtest/migration-test.c | 178 +++++++++++++++++++++++++++--------
18 files changed, 299 insertions(+), 112 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
@ 2023-08-29 18:17 ` Steve Sistare
2023-08-30 15:52 ` Peter Xu
2023-08-29 18:17 ` [PATCH V4 02/11] migration: preserve suspended runstate Steve Sistare
` (9 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:17 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
When a vm in the suspended state is migrated, we must call vm_prepare_start
on the destination, so a later system_wakeup properly resumes the guest,
when main_loop_should_exit callsresume_all_vcpus. However, the runstate
should remain suspended until system_wakeup is called, so allow the caller
to pass the new state to vm_prepare_start, rather than assume the new state
is RUN_STATE_RUNNING. Modify vm state change handlers that check
RUN_STATE_RUNNING to instead use the running parameter.
No functional change.
Suggested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
backends/tpm/tpm_emulator.c | 2 +-
gdbstub/softmmu.c | 2 +-
hw/usb/hcd-ehci.c | 2 +-
hw/usb/redirect.c | 2 +-
hw/xen/xen-hvm-common.c | 2 +-
include/sysemu/runstate.h | 3 ++-
softmmu/cpus.c | 8 ++++----
7 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/backends/tpm/tpm_emulator.c b/backends/tpm/tpm_emulator.c
index 402a2d6..a8e559a 100644
--- a/backends/tpm/tpm_emulator.c
+++ b/backends/tpm/tpm_emulator.c
@@ -907,7 +907,7 @@ static void tpm_emulator_vm_state_change(void *opaque, bool running,
trace_tpm_emulator_vm_state_change(running, state);
- if (!running || state != RUN_STATE_RUNNING || !tpm_emu->relock_storage) {
+ if (!running || !tpm_emu->relock_storage) {
return;
}
diff --git a/gdbstub/softmmu.c b/gdbstub/softmmu.c
index f509b72..a43e832 100644
--- a/gdbstub/softmmu.c
+++ b/gdbstub/softmmu.c
@@ -565,7 +565,7 @@ int gdb_continue_partial(char *newstates)
}
}
- if (vm_prepare_start(step_requested)) {
+ if (vm_prepare_start(step_requested, RUN_STATE_RUNNING)) {
return 0;
}
diff --git a/hw/usb/hcd-ehci.c b/hw/usb/hcd-ehci.c
index c930c60..e436f5c 100644
--- a/hw/usb/hcd-ehci.c
+++ b/hw/usb/hcd-ehci.c
@@ -2451,7 +2451,7 @@ static void usb_ehci_vm_state_change(void *opaque, bool running, RunState state)
* USB-devices which have async handled packages have a packet in the
* ep queue to match the completion with.
*/
- if (state == RUN_STATE_RUNNING) {
+ if (running) {
ehci_advance_async_state(ehci);
}
diff --git a/hw/usb/redirect.c b/hw/usb/redirect.c
index 39fbaaa..1ec5909 100644
--- a/hw/usb/redirect.c
+++ b/hw/usb/redirect.c
@@ -1403,7 +1403,7 @@ static void usbredir_vm_state_change(void *priv, bool running, RunState state)
{
USBRedirDevice *dev = priv;
- if (state == RUN_STATE_RUNNING && dev->parser != NULL) {
+ if (running && dev->parser != NULL) {
usbredirparser_do_write(dev->parser); /* Flush any pending writes */
}
}
diff --git a/hw/xen/xen-hvm-common.c b/hw/xen/xen-hvm-common.c
index 565dc39..47e6cb1 100644
--- a/hw/xen/xen-hvm-common.c
+++ b/hw/xen/xen-hvm-common.c
@@ -623,7 +623,7 @@ void xen_hvm_change_state_handler(void *opaque, bool running,
xen_set_ioreq_server_state(xen_domid,
state->ioservid,
- (rstate == RUN_STATE_RUNNING));
+ running);
}
void xen_exit_notifier(Notifier *n, void *data)
diff --git a/include/sysemu/runstate.h b/include/sysemu/runstate.h
index 7beb29c..7d889ab 100644
--- a/include/sysemu/runstate.h
+++ b/include/sysemu/runstate.h
@@ -39,8 +39,9 @@ void vm_start(void);
* vm_prepare_start: Prepare for starting/resuming the VM
*
* @step_pending: whether any of the CPUs is about to be single-stepped by gdb
+ * @state: the vm state to setup
*/
-int vm_prepare_start(bool step_pending);
+int vm_prepare_start(bool step_pending, RunState state);
int vm_stop(RunState state);
int vm_stop_force_state(RunState state);
int vm_shutdown(void);
diff --git a/softmmu/cpus.c b/softmmu/cpus.c
index fed20ff..0a082d3 100644
--- a/softmmu/cpus.c
+++ b/softmmu/cpus.c
@@ -681,7 +681,7 @@ int vm_stop(RunState state)
* Returns -1 if the vCPUs are not to be restarted (e.g. if they are already
* running or in case of an error condition), 0 otherwise.
*/
-int vm_prepare_start(bool step_pending)
+int vm_prepare_start(bool step_pending, RunState state)
{
RunState requested;
@@ -713,14 +713,14 @@ int vm_prepare_start(bool step_pending)
qapi_event_send_resume();
cpu_enable_ticks();
- runstate_set(RUN_STATE_RUNNING);
- vm_state_notify(1, RUN_STATE_RUNNING);
+ runstate_set(state);
+ vm_state_notify(1, state);
return 0;
}
void vm_start(void)
{
- if (!vm_prepare_start(false)) {
+ if (!vm_prepare_start(false, RUN_STATE_RUNNING)) {
resume_all_vcpus();
}
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 02/11] migration: preserve suspended runstate
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
2023-08-29 18:17 ` [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start Steve Sistare
@ 2023-08-29 18:17 ` Steve Sistare
2023-08-30 16:07 ` Peter Xu
2023-08-29 18:17 ` [PATCH V4 03/11] migration: add runstate function Steve Sistare
` (8 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:17 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
A guest that is migrated in the suspended state automaticaly wakes and
continues execution. This is wrong; the guest should end migration in
the same state it started. The root cause is that the outgoing migration
code automatically wakes the guest, then saves the RUNNING runstate in
global_state_store(), hence the incoming migration code thinks the guest is
running and continues the guest if autostart is true.
On the outgoing side, do not call qemu_system_wakeup_request().
On the incoming side for precopy, prepare to start the vm, but do not
yet start it. A future system_wakeup will cause the main loop to resume
the VCPUs.
On the incoming side for postcopy, do not wake the guest, and apply the
the same logic as found in precopy: if autostart and the runstate is
RUNNING, then vm_start, else prepare to start the vm.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 4 ++--
migration/savevm.c | 15 ++++++++++-----
2 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 5528acb..5bcc761 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -496,6 +496,8 @@ static void process_incoming_migration_bh(void *opaque)
} else if (migration_incoming_colo_enabled()) {
migration_incoming_disable_colo();
vm_start();
+ } else if (global_state_get_runstate() == RUN_STATE_SUSPENDED) {
+ vm_prepare_start(false, global_state_get_runstate());
} else {
runstate_set(global_state_get_runstate());
}
@@ -2109,7 +2111,6 @@ static int postcopy_start(MigrationState *ms, Error **errp)
qemu_mutex_lock_iothread();
trace_postcopy_start_set_run();
- qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
global_state_store();
ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
if (ret < 0) {
@@ -2315,7 +2316,6 @@ static void migration_completion(MigrationState *s)
if (s->state == MIGRATION_STATUS_ACTIVE) {
qemu_mutex_lock_iothread();
s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
- qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
s->vm_old_state = runstate_get();
global_state_store();
diff --git a/migration/savevm.c b/migration/savevm.c
index a2cb885..bae0a1a 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2070,12 +2070,17 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
dirty_bitmap_mig_before_vm_start();
- if (autostart) {
- /* Hold onto your hats, starting the CPU */
- vm_start();
+ if (!global_state_received() ||
+ global_state_get_runstate() == RUN_STATE_RUNNING) {
+ if (autostart) {
+ vm_start();
+ } else {
+ runstate_set(RUN_STATE_PAUSED);
+ }
+ } else if (global_state_get_runstate() == RUN_STATE_SUSPENDED) {
+ vm_prepare_start(false, RUN_STATE_SUSPENDED);
} else {
- /* leave it paused and let management decide when to start the CPU */
- runstate_set(RUN_STATE_PAUSED);
+ runstate_set(global_state_get_runstate());
}
qemu_bh_delete(mis->bh);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 03/11] migration: add runstate function
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
2023-08-29 18:17 ` [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start Steve Sistare
2023-08-29 18:17 ` [PATCH V4 02/11] migration: preserve suspended runstate Steve Sistare
@ 2023-08-29 18:17 ` Steve Sistare
2023-08-30 16:11 ` Peter Xu
2023-08-29 18:17 ` [PATCH V4 04/11] migration: preserve suspended for snapshot Steve Sistare
` (7 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:17 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Create a subroutine for preserving the runstate after migration,
to be used in a subsequent patch. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
migration/migration.c | 37 ++++++++++++++++++++++---------------
migration/migration.h | 1 +
migration/savevm.c | 13 +------------
3 files changed, 24 insertions(+), 27 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index 5bcc761..a9ecb66 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -486,21 +486,8 @@ static void process_incoming_migration_bh(void *opaque)
dirty_bitmap_mig_before_vm_start();
- if (!global_state_received() ||
- global_state_get_runstate() == RUN_STATE_RUNNING) {
- if (autostart) {
- vm_start();
- } else {
- runstate_set(RUN_STATE_PAUSED);
- }
- } else if (migration_incoming_colo_enabled()) {
- migration_incoming_disable_colo();
- vm_start();
- } else if (global_state_get_runstate() == RUN_STATE_SUSPENDED) {
- vm_prepare_start(false, global_state_get_runstate());
- } else {
- runstate_set(global_state_get_runstate());
- }
+ migrate_set_runstate();
+
/*
* This must happen after any state changes since as soon as an external
* observer sees this event they might start to prod at the VM assuming
@@ -1143,6 +1130,26 @@ void migrate_set_state(int *state, int old_state, int new_state)
}
}
+void migrate_set_runstate(void)
+{
+ RunState state = global_state_get_runstate();
+
+ if (!global_state_received() || state == RUN_STATE_RUNNING) {
+ if (autostart) {
+ vm_start();
+ } else {
+ runstate_set(RUN_STATE_PAUSED);
+ }
+ } else if (migration_incoming_colo_enabled()) {
+ migration_incoming_disable_colo();
+ vm_start();
+ } else if (state == RUN_STATE_SUSPENDED) {
+ vm_prepare_start(false, state);
+ } else {
+ runstate_set(state);
+ }
+}
+
static void migrate_fd_cleanup(MigrationState *s)
{
qemu_bh_delete(s->cleanup_bh);
diff --git a/migration/migration.h b/migration/migration.h
index 6eea18d..45e9805 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -456,6 +456,7 @@ struct MigrationState {
};
void migrate_set_state(int *state, int old_state, int new_state);
+void migrate_set_runstate(void);
void migration_fd_process_incoming(QEMUFile *f, Error **errp);
void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp);
diff --git a/migration/savevm.c b/migration/savevm.c
index bae0a1a..eba3653 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -2070,18 +2070,7 @@ static void loadvm_postcopy_handle_run_bh(void *opaque)
dirty_bitmap_mig_before_vm_start();
- if (!global_state_received() ||
- global_state_get_runstate() == RUN_STATE_RUNNING) {
- if (autostart) {
- vm_start();
- } else {
- runstate_set(RUN_STATE_PAUSED);
- }
- } else if (global_state_get_runstate() == RUN_STATE_SUSPENDED) {
- vm_prepare_start(false, RUN_STATE_SUSPENDED);
- } else {
- runstate_set(global_state_get_runstate());
- }
+ migrate_set_runstate();
qemu_bh_delete(mis->bh);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 04/11] migration: preserve suspended for snapshot
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (2 preceding siblings ...)
2023-08-29 18:17 ` [PATCH V4 03/11] migration: add runstate function Steve Sistare
@ 2023-08-29 18:17 ` Steve Sistare
2023-08-30 16:22 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 05/11] migration: preserve suspended for bg_migration Steve Sistare
` (6 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:17 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Restoring a snapshot can break a suspended guest.
If a guest is suspended and saved to a snapshot using savevm, and qemu
is terminated and restarted with the -S option, then loadvm does not
restore the guest. The runstate is running, but the guest is not, because
vm_start was not called. The root cause is that loadvm does not restore
the runstate (eg suspended) from global_state loaded from the state file.
Restore the runstate, and allow the new state transitions that are possible.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
migration/savevm.c | 1 +
softmmu/runstate.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/migration/savevm.c b/migration/savevm.c
index eba3653..7b9c477 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3194,6 +3194,7 @@ bool load_snapshot(const char *name, const char *vmstate,
}
aio_context_acquire(aio_context);
ret = qemu_loadvm_state(f);
+ migrate_set_runstate();
migration_incoming_state_destroy();
aio_context_release(aio_context);
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index f3bd862..21d7407 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -77,6 +77,8 @@ typedef struct {
static const RunStateTransition runstate_transitions_def[] = {
{ RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
+ { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
+ { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
{ RUN_STATE_DEBUG, RUN_STATE_RUNNING },
{ RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 05/11] migration: preserve suspended for bg_migration
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (3 preceding siblings ...)
2023-08-29 18:17 ` [PATCH V4 04/11] migration: preserve suspended for snapshot Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-30 16:35 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 06/11] migration: preserve cpu ticks if suspended Steve Sistare
` (5 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Do not wake a suspended guest during bg_migration.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
migration/migration.c | 12 +++++-------
softmmu/runstate.c | 1 +
2 files changed, 6 insertions(+), 7 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index a9ecb66..303d5a6 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -3064,7 +3064,9 @@ static void bg_migration_vm_start_bh(void *opaque)
qemu_bh_delete(s->vm_start_bh);
s->vm_start_bh = NULL;
- vm_start();
+ if (!runstate_check(RUN_STATE_SUSPENDED)) {
+ vm_start();
+ }
s->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - s->downtime_start;
}
@@ -3134,16 +3136,12 @@ static void *bg_migration_thread(void *opaque)
qemu_mutex_lock_iothread();
- /*
- * If VM is currently in suspended state, then, to make a valid runstate
- * transition in vm_stop_force_state() we need to wakeup it up.
- */
- qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
s->vm_old_state = runstate_get();
global_state_store();
/* Forcibly stop VM before saving state of vCPUs and devices */
- if (vm_stop_force_state(RUN_STATE_PAUSED)) {
+ if (!runstate_check(RUN_STATE_SUSPENDED) &&
+ vm_stop_force_state(RUN_STATE_PAUSED)) {
goto fail;
}
/*
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index 21d7407..4417527 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -163,6 +163,7 @@ static const RunStateTransition runstate_transitions_def[] = {
{ RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
{ RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
{ RUN_STATE_SUSPENDED, RUN_STATE_COLO},
+ { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
{ RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
{ RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 06/11] migration: preserve cpu ticks if suspended
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (4 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 05/11] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-30 16:47 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 07/11] tests/qtest: migration events Steve Sistare
` (4 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
timers_state saved to the migration stream is stale, causing time errors
in the guest when it wakes from suspend.
To fix, maintain a shadow copy of timers_state, and update the shadow in a
pre_save handler that uses the same logic found in cpu_disable_ticks. Copy
the shadow to timers_state in post_load.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
softmmu/cpu-timers.c | 36 +++++++++++++++++++++++++++++++++++-
1 file changed, 35 insertions(+), 1 deletion(-)
diff --git a/softmmu/cpu-timers.c b/softmmu/cpu-timers.c
index 117408c..d5af317 100644
--- a/softmmu/cpu-timers.c
+++ b/softmmu/cpu-timers.c
@@ -157,6 +157,36 @@ static bool icount_shift_state_needed(void *opaque)
return icount_enabled() == 2;
}
+static int cpu_pre_save_ticks(void *opaque)
+{
+ TimersState *t = &timers_state;
+ TimersState *snap = opaque;
+
+ seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+
+ if (t->cpu_ticks_enabled) {
+ snap->cpu_ticks_offset = t->cpu_ticks_offset + cpu_get_host_ticks();
+ snap->cpu_clock_offset = cpu_get_clock_locked();
+ } else {
+ snap->cpu_ticks_offset = t->cpu_ticks_offset;
+ snap->cpu_clock_offset = t->cpu_clock_offset;
+ }
+ seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+ return 0;
+}
+
+static int cpu_post_load_ticks(void *opaque, int version_id)
+{
+ TimersState *t = &timers_state;
+ TimersState *snap = opaque;
+
+ seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+ t->cpu_ticks_offset = snap->cpu_ticks_offset;
+ t->cpu_clock_offset = snap->cpu_clock_offset;
+ seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+ return 0;
+}
+
/*
* Subsection for warp timer migration is optional, because may not be created
*/
@@ -221,6 +251,8 @@ static const VMStateDescription vmstate_timers = {
.name = "timer",
.version_id = 2,
.minimum_version_id = 1,
+ .pre_save = cpu_pre_save_ticks,
+ .post_load = cpu_post_load_ticks,
.fields = (VMStateField[]) {
VMSTATE_INT64(cpu_ticks_offset, TimersState),
VMSTATE_UNUSED(8),
@@ -269,9 +301,11 @@ TimersState timers_state;
/* initialize timers state and the cpu throttle for convenience */
void cpu_timers_init(void)
{
+ static TimersState timers_snapshot;
+
seqlock_init(&timers_state.vm_clock_seqlock);
qemu_spin_init(&timers_state.vm_clock_lock);
- vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+ vmstate_register(NULL, 0, &vmstate_timers, &timers_snapshot);
cpu_throttle_init();
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 07/11] tests/qtest: migration events
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (5 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 06/11] migration: preserve cpu ticks if suspended Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-30 17:00 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 08/11] tests/qtest: option to suspend during migration Steve Sistare
` (3 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Define a state object to capture events seen by migration tests, to allow
more events to be captured in a subsequent patch, and simplify event
checking in wait_for_migration_pass. No functional change.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
---
tests/qtest/migration-helpers.c | 24 +++++----------
tests/qtest/migration-helpers.h | 8 +++--
tests/qtest/migration-test.c | 68 +++++++++++++++++++----------------------
3 files changed, 44 insertions(+), 56 deletions(-)
diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index be00c52..b541108 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -23,26 +23,16 @@
*/
#define MIGRATION_STATUS_WAIT_TIMEOUT 120
-bool migrate_watch_for_stop(QTestState *who, const char *name,
- QDict *event, void *opaque)
-{
- bool *seen = opaque;
-
- if (g_str_equal(name, "STOP")) {
- *seen = true;
- return true;
- }
-
- return false;
-}
-
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+bool migrate_watch_for_events(QTestState *who, const char *name,
QDict *event, void *opaque)
{
- bool *seen = opaque;
+ QTestMigrationState *state = opaque;
- if (g_str_equal(name, "RESUME")) {
- *seen = true;
+ if (g_str_equal(name, "STOP")) {
+ state->stop_seen = true;
+ return true;
+ } else if (g_str_equal(name, "RESUME")) {
+ state->resume_seen = true;
return true;
}
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 009e250..59fbb83 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -15,9 +15,11 @@
#include "libqtest.h"
-bool migrate_watch_for_stop(QTestState *who, const char *name,
- QDict *event, void *opaque);
-bool migrate_watch_for_resume(QTestState *who, const char *name,
+typedef struct QTestMigrationState {
+ bool stop_seen, resume_seen;
+} QTestMigrationState;
+
+bool migrate_watch_for_events(QTestState *who, const char *name,
QDict *event, void *opaque);
G_GNUC_PRINTF(3, 4)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 62d3f37..526a1b7 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -43,8 +43,8 @@
unsigned start_address;
unsigned end_address;
static bool uffd_feature_thread_id;
-static bool got_src_stop;
-static bool got_dst_resume;
+static QTestMigrationState src_state;
+static QTestMigrationState dst_state;
/*
* An initial 3 MB offset is used as that corresponds
@@ -188,6 +188,13 @@ static void wait_for_serial(const char *side)
} while (true);
}
+static void wait_for_stop(QTestState *who, QTestMigrationState *state)
+{
+ if (!state->stop_seen) {
+ qtest_qmp_eventwait(who, "STOP");
+ }
+}
+
/*
* It's tricky to use qemu's migration event capability with qtest,
* events suddenly appearing confuse the qmp()/hmp() responses.
@@ -235,21 +242,19 @@ static void read_blocktime(QTestState *who)
qobject_unref(rsp_return);
}
+/*
+ * Wait for two changes in the migration pass count, but bail if we stop.
+ */
static void wait_for_migration_pass(QTestState *who)
{
- uint64_t initial_pass = get_migration_pass(who);
- uint64_t pass;
+ uint64_t pass, prev_pass = 0, changes = 0;
- /* Wait for the 1st sync */
- while (!got_src_stop && !initial_pass) {
- usleep(1000);
- initial_pass = get_migration_pass(who);
- }
-
- do {
+ while (changes < 2 && !src_state.stop_seen) {
usleep(1000);
pass = get_migration_pass(who);
- } while (pass == initial_pass && !got_src_stop);
+ changes += (pass != prev_pass);
+ prev_pass = pass;
+ }
}
static void check_guests_ram(QTestState *who)
@@ -586,10 +591,7 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
{
qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
- if (!got_src_stop) {
- qtest_qmp_eventwait(from, "STOP");
- }
-
+ wait_for_stop(from, &src_state);
qtest_qmp_eventwait(to, "RESUME");
}
@@ -720,8 +722,9 @@ static int test_migrate_start(QTestState **from, QTestState **to,
}
}
- got_src_stop = false;
- got_dst_resume = false;
+ dst_state = (QTestMigrationState) { };
+ src_state = (QTestMigrationState) { };
+
bootpath = g_strdup_printf("%s/bootsect", tmpfs);
if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
/* the assembled x86 boot sector should be exactly one sector large */
@@ -801,8 +804,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
if (!args->only_target) {
*from = qtest_init(cmd_source);
qtest_qmp_set_event_callback(*from,
- migrate_watch_for_stop,
- &got_src_stop);
+ migrate_watch_for_events,
+ &src_state);
}
cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
@@ -821,8 +824,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
ignore_stderr);
*to = qtest_init(cmd_target);
qtest_qmp_set_event_callback(*to,
- migrate_watch_for_resume,
- &got_dst_resume);
+ migrate_watch_for_events,
+ &dst_state);
/*
* Remove shmem file immediately to avoid memory leak in test failed case.
@@ -1516,9 +1519,7 @@ static void test_precopy_common(MigrateCommon *args)
*/
if (args->result == MIG_TEST_SUCCEED) {
qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
- if (!got_src_stop) {
- qtest_qmp_eventwait(from, "STOP");
- }
+ wait_for_stop(from, &src_state);
migrate_ensure_converge(from);
}
}
@@ -1560,9 +1561,8 @@ static void test_precopy_common(MigrateCommon *args)
*/
wait_for_migration_complete(from);
- if (!got_src_stop) {
- qtest_qmp_eventwait(from, "STOP");
- }
+ wait_for_stop(from, &src_state);
+
} else {
wait_for_migration_complete(from);
/*
@@ -1575,7 +1575,7 @@ static void test_precopy_common(MigrateCommon *args)
qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
}
- if (!got_dst_resume) {
+ if (!dst_state.resume_seen) {
qtest_qmp_eventwait(to, "RESUME");
}
@@ -1696,9 +1696,7 @@ static void test_ignore_shared(void)
migrate_wait_for_dirty_mem(from, to);
- if (!got_src_stop) {
- qtest_qmp_eventwait(from, "STOP");
- }
+ wait_for_stop(from, &src_state);
qtest_qmp_eventwait(to, "RESUME");
@@ -2139,7 +2137,7 @@ static void test_migrate_auto_converge(void)
break;
}
usleep(20);
- g_assert_false(got_src_stop);
+ g_assert_false(src_state.stop_seen);
} while (true);
/* The first percentage of throttling should be at least init_pct */
g_assert_cmpint(percentage, >=, init_pct);
@@ -2481,9 +2479,7 @@ static void test_multifd_tcp_cancel(void)
migrate_ensure_converge(from);
- if (!got_src_stop) {
- qtest_qmp_eventwait(from, "STOP");
- }
+ wait_for_stop(from, &src_state);
qtest_qmp_eventwait(to2, "RESUME");
wait_for_serial("dest_serial");
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 08/11] tests/qtest: option to suspend during migration
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (6 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 07/11] tests/qtest: migration events Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-30 17:01 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 09/11] tests/qtest: precopy migration with suspend Steve Sistare
` (2 subsequent siblings)
10 siblings, 1 reply; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Add an option to suspend the src in a-b-bootblock.S, which puts the guest
in S3 state after one round of writing to memory. The option is enabled by
poking a 1 into the suspend_me word in the boot block prior to starting the
src vm. Generate symbol offsets in a-b-bootblock.h so that the suspend_me
offset is known.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tests/migration/i386/Makefile | 5 ++--
tests/migration/i386/a-b-bootblock.S | 51 +++++++++++++++++++++++++++++++++---
tests/migration/i386/a-b-bootblock.h | 22 ++++++++++------
tests/qtest/migration-test.c | 4 +++
4 files changed, 68 insertions(+), 14 deletions(-)
diff --git a/tests/migration/i386/Makefile b/tests/migration/i386/Makefile
index 5c03241..37a72ae 100644
--- a/tests/migration/i386/Makefile
+++ b/tests/migration/i386/Makefile
@@ -4,9 +4,10 @@
.PHONY: all clean
all: a-b-bootblock.h
-a-b-bootblock.h: x86.bootsect
+a-b-bootblock.h: x86.bootsect x86.o
echo "$$__note" > header.tmp
xxd -i $< | sed -e 's/.*int.*//' >> header.tmp
+ nm x86.o | awk '{print "#define SYM_"$$3" 0x"$$1}' >> header.tmp
mv header.tmp $@
x86.bootsect: x86.boot
@@ -16,7 +17,7 @@ x86.boot: x86.o
$(CROSS_PREFIX)objcopy -O binary $< $@
x86.o: a-b-bootblock.S
- $(CROSS_PREFIX)gcc -m32 -march=i486 -c $< -o $@
+ $(CROSS_PREFIX)gcc -I.. -m32 -march=i486 -c $< -o $@
clean:
@rm -rf *.boot *.o *.bootsect
diff --git a/tests/migration/i386/a-b-bootblock.S b/tests/migration/i386/a-b-bootblock.S
index 3d464c7..62d79b2 100644
--- a/tests/migration/i386/a-b-bootblock.S
+++ b/tests/migration/i386/a-b-bootblock.S
@@ -9,6 +9,23 @@
#
# Author: dgilbert@redhat.com
+#include "migration-test.h"
+
+#define ACPI_ENABLE 0xf1
+#define ACPI_PORT_SMI_CMD 0xb2
+#define ACPI_PM_BASE 0x600
+#define PM1A_CNT_OFFSET 4
+
+#define ACPI_SCI_ENABLE 0x0001
+#define ACPI_SLEEP_TYPE 0x0400
+#define ACPI_SLEEP_ENABLE 0x2000
+#define SLEEP (ACPI_SCI_ENABLE + ACPI_SLEEP_TYPE + ACPI_SLEEP_ENABLE)
+
+#define LOW_ADDR X86_TEST_MEM_START
+#define HIGH_ADDR X86_TEST_MEM_END
+
+/* Save the suspended status at an address that is not written in the loop. */
+#define suspended (X86_TEST_MEM_START + 4)
.code16
.org 0x7c00
@@ -41,12 +58,11 @@ start: # at 0x7c00 ?
# bl keeps a counter so we limit the output speed
mov $0, %bl
mainloop:
- # Start from 1MB
- mov $(1024*1024),%eax
+ mov $LOW_ADDR,%eax
innerloop:
incb (%eax)
add $4096,%eax
- cmp $(100*1024*1024),%eax
+ cmp $HIGH_ADDR,%eax
jl innerloop
inc %bl
@@ -57,7 +73,30 @@ innerloop:
mov $0x3f8,%dx
outb %al,%dx
- jmp mainloop
+ # should this test suspend?
+ mov (suspend_me),%eax
+ cmp $0,%eax
+ je mainloop
+
+ # are we waking after suspend? do not suspend again.
+ mov $suspended,%eax
+ mov (%eax),%eax
+ cmp $1,%eax
+ je mainloop
+
+ # enable acpi
+ mov $ACPI_ENABLE,%al
+ outb %al,$ACPI_PORT_SMI_CMD
+
+ # suspend to ram
+ mov $suspended,%eax
+ movl $1,(%eax)
+ mov $SLEEP,%ax
+ mov $(ACPI_PM_BASE + PM1A_CNT_OFFSET),%dx
+ outw %ax,%dx
+ # not reached. The wakeup causes reset and restart at 0x7c00, and we
+ # do not save and restore registers as a real kernel would do.
+
# GDT magic from old (GPLv2) Grub startup.S
.p2align 2 /* force 4-byte alignment */
@@ -83,6 +122,10 @@ gdtdesc:
.word 0x27 /* limit */
.long gdt /* addr */
+ /* test launcher can poke a 1 here to exercise suspend */
+suspend_me:
+ .int 0
+
/* I'm a bootable disk */
.org 0x7dfe
.byte 0x55
diff --git a/tests/migration/i386/a-b-bootblock.h b/tests/migration/i386/a-b-bootblock.h
index b7b0fce..4d46873 100644
--- a/tests/migration/i386/a-b-bootblock.h
+++ b/tests/migration/i386/a-b-bootblock.h
@@ -4,20 +4,20 @@
* the header and the assembler differences in your patch submission.
*/
unsigned char x86_bootsect[] = {
- 0xfa, 0x0f, 0x01, 0x16, 0x78, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
+ 0xfa, 0x0f, 0x01, 0x16, 0xa4, 0x7c, 0x66, 0xb8, 0x01, 0x00, 0x00, 0x00,
0x0f, 0x22, 0xc0, 0x66, 0xea, 0x20, 0x7c, 0x00, 0x00, 0x08, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xe4, 0x92, 0x0c, 0x02,
0xe6, 0x92, 0xb8, 0x10, 0x00, 0x00, 0x00, 0x8e, 0xd8, 0x66, 0xb8, 0x41,
0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xb3, 0x00, 0xb8, 0x00, 0x00, 0x10,
0x00, 0xfe, 0x00, 0x05, 0x00, 0x10, 0x00, 0x00, 0x3d, 0x00, 0x00, 0x40,
0x06, 0x7c, 0xf2, 0xfe, 0xc3, 0x80, 0xe3, 0x3f, 0x75, 0xe6, 0x66, 0xb8,
- 0x42, 0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xeb, 0xdb, 0x8d, 0x76, 0x00,
- 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00,
- 0x00, 0x9a, 0xcf, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00,
- 0x27, 0x00, 0x60, 0x7c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
- 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
- 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
- 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
+ 0x42, 0x00, 0x66, 0xba, 0xf8, 0x03, 0xee, 0xa1, 0xaa, 0x7c, 0x00, 0x00,
+ 0x83, 0xf8, 0x00, 0x74, 0xd3, 0xb8, 0x04, 0x00, 0x10, 0x00, 0x8b, 0x00,
+ 0x83, 0xf8, 0x01, 0x74, 0xc7, 0xb0, 0xf1, 0xe6, 0xb2, 0xb8, 0x04, 0x00,
+ 0x10, 0x00, 0xc7, 0x00, 0x01, 0x00, 0x00, 0x00, 0x66, 0xb8, 0x01, 0x24,
+ 0x66, 0xba, 0x04, 0x06, 0x66, 0xef, 0x66, 0x90, 0x00, 0x00, 0x00, 0x00,
+ 0x00, 0x00, 0x00, 0x00, 0xff, 0xff, 0x00, 0x00, 0x00, 0x9a, 0xcf, 0x00,
+ 0xff, 0xff, 0x00, 0x00, 0x00, 0x92, 0xcf, 0x00, 0x27, 0x00, 0x8c, 0x7c,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
@@ -49,3 +49,9 @@ unsigned char x86_bootsect[] = {
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x55, 0xaa
};
+#define SYM_gdt 0x00007c8c
+#define SYM_gdtdesc 0x00007ca4
+#define SYM_innerloop 0x00007c3d
+#define SYM_mainloop 0x00007c38
+#define SYM_start 0x00007c00
+#define SYM_suspend_me 0x00007caa
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 526a1b7..32fea73 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -608,6 +608,8 @@ typedef struct {
bool use_dirty_ring;
const char *opts_source;
const char *opts_target;
+ /* suspend the src before migrating to dest. */
+ bool suspend_me;
} MigrateStart;
/*
@@ -725,6 +727,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
dst_state = (QTestMigrationState) { };
src_state = (QTestMigrationState) { };
+ x86_bootsect[SYM_suspend_me - SYM_start] = args->suspend_me;
+
bootpath = g_strdup_printf("%s/bootsect", tmpfs);
if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
/* the assembled x86 boot sector should be exactly one sector large */
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 09/11] tests/qtest: precopy migration with suspend
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (7 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 08/11] tests/qtest: option to suspend during migration Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-29 18:18 ` [PATCH V4 10/11] tests/qtest: postcopy " Steve Sistare
2023-08-29 18:18 ` [PATCH V4 11/11] tests/qtest: background " Steve Sistare
10 siblings, 0 replies; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Add a test case to verify that the suspended state is handled correctly
during live migration precopy. The test suspends the src, migrates, then
wakes the dest.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tests/qtest/migration-helpers.c | 3 ++
tests/qtest/migration-helpers.h | 3 +-
tests/qtest/migration-test.c | 73 +++++++++++++++++++++++++++++++++++++----
3 files changed, 71 insertions(+), 8 deletions(-)
diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
index b541108..d1fec49 100644
--- a/tests/qtest/migration-helpers.c
+++ b/tests/qtest/migration-helpers.c
@@ -31,6 +31,9 @@ bool migrate_watch_for_events(QTestState *who, const char *name,
if (g_str_equal(name, "STOP")) {
state->stop_seen = true;
return true;
+ } else if (g_str_equal(name, "SUSPEND")) {
+ state->suspend_seen = true;
+ return true;
} else if (g_str_equal(name, "RESUME")) {
state->resume_seen = true;
return true;
diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
index 59fbb83..bac8699 100644
--- a/tests/qtest/migration-helpers.h
+++ b/tests/qtest/migration-helpers.h
@@ -16,7 +16,8 @@
#include "libqtest.h"
typedef struct QTestMigrationState {
- bool stop_seen, resume_seen;
+ bool suspend_me;
+ bool stop_seen, suspend_seen, resume_seen;
} QTestMigrationState;
bool migrate_watch_for_events(QTestState *who, const char *name,
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 32fea73..cc508ef 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -135,7 +135,7 @@ static void init_bootfile(const char *bootpath, void *content, size_t len)
/*
* Wait for some output in the serial output file,
* we get an 'A' followed by an endless string of 'B's
- * but on the destination we won't have the A.
+ * but on the destination we won't have the A (unless we enabled suspend/resume)
*/
static void wait_for_serial(const char *side)
{
@@ -195,6 +195,13 @@ static void wait_for_stop(QTestState *who, QTestMigrationState *state)
}
}
+static void wait_for_suspend(QTestState *who, QTestMigrationState *state)
+{
+ if (!state->suspend_seen) {
+ qtest_qmp_eventwait(who, "SUSPEND");
+ }
+}
+
/*
* It's tricky to use qemu's migration event capability with qtest,
* events suddenly appearing confuse the qmp()/hmp() responses.
@@ -249,7 +256,7 @@ static void wait_for_migration_pass(QTestState *who)
{
uint64_t pass, prev_pass = 0, changes = 0;
- while (changes < 2 && !src_state.stop_seen) {
+ while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
usleep(1000);
pass = get_migration_pass(who);
changes += (pass != prev_pass);
@@ -545,7 +552,8 @@ static void migrate_wait_for_dirty_mem(QTestState *from,
watch_byte = qtest_readb(from, watch_address);
do {
usleep(1000 * 10);
- } while (qtest_readb(from, watch_address) == watch_byte);
+ } while (qtest_readb(from, watch_address) == watch_byte &&
+ !src_state.suspend_seen);
}
@@ -727,6 +735,7 @@ static int test_migrate_start(QTestState **from, QTestState **to,
dst_state = (QTestMigrationState) { };
src_state = (QTestMigrationState) { };
+ src_state.suspend_me = args->suspend_me;
x86_bootsect[SYM_suspend_me - SYM_start] = args->suspend_me;
bootpath = g_strdup_printf("%s/bootsect", tmpfs);
@@ -1522,8 +1531,12 @@ static void test_precopy_common(MigrateCommon *args)
* change anything.
*/
if (args->result == MIG_TEST_SUCCEED) {
- qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
- wait_for_stop(from, &src_state);
+ if (src_state.suspend_me) {
+ wait_for_suspend(from, &src_state);
+ } else {
+ qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
+ wait_for_stop(from, &src_state);
+ }
migrate_ensure_converge(from);
}
}
@@ -1565,7 +1578,11 @@ static void test_precopy_common(MigrateCommon *args)
*/
wait_for_migration_complete(from);
- wait_for_stop(from, &src_state);
+ if (src_state.suspend_me) {
+ wait_for_suspend(from, &src_state);
+ } else {
+ wait_for_stop(from, &src_state);
+ }
} else {
wait_for_migration_complete(from);
@@ -1579,10 +1596,16 @@ static void test_precopy_common(MigrateCommon *args)
qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
}
+ /* Always get RESUME first after switchover */
if (!dst_state.resume_seen) {
qtest_qmp_eventwait(to, "RESUME");
}
+ if (args->start.suspend_me) {
+ /* wakeup succeeds only if guest is suspended */
+ qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+ }
+
wait_for_serial("dest_serial");
}
@@ -1609,6 +1632,34 @@ static void test_precopy_unix_plain(void)
test_precopy_common(&args);
}
+static void test_precopy_unix_suspend_live(void)
+{
+ g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+ MigrateCommon args = {
+ .listen_uri = uri,
+ .connect_uri = uri,
+ /*
+ * despite being live, the test is fast because the src
+ * suspends immediately.
+ */
+ .live = true,
+ .start.suspend_me = true,
+ };
+
+ test_precopy_common(&args);
+}
+
+static void test_precopy_unix_suspend_notlive(void)
+{
+ g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+ MigrateCommon args = {
+ .listen_uri = uri,
+ .connect_uri = uri,
+ .start.suspend_me = true,
+ };
+
+ test_precopy_common(&args);
+}
static void test_precopy_unix_dirty_ring(void)
{
@@ -2765,7 +2816,7 @@ static bool kvm_dirty_ring_supported(void)
int main(int argc, char **argv)
{
bool has_kvm, has_tcg;
- bool has_uffd;
+ bool has_uffd, is_x86;
const char *arch;
g_autoptr(GError) err = NULL;
int ret;
@@ -2782,6 +2833,7 @@ int main(int argc, char **argv)
has_uffd = ufd_version_check();
arch = qtest_get_arch();
+ is_x86 = !strcmp(arch, "i386") || !strcmp(arch, "x86_64");
/*
* On ppc64, the test only works with kvm-hv, but not with kvm-pr and TCG
@@ -2812,6 +2864,13 @@ int main(int argc, char **argv)
module_call_init(MODULE_INIT_QOM);
+ if (is_x86) {
+ qtest_add_func("/migration/precopy/unix/suspend/live",
+ test_precopy_unix_suspend_live);
+ qtest_add_func("/migration/precopy/unix/suspend/notlive",
+ test_precopy_unix_suspend_notlive);
+ }
+
if (has_uffd) {
qtest_add_func("/migration/postcopy/plain", test_postcopy);
qtest_add_func("/migration/postcopy/recovery/plain",
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 10/11] tests/qtest: postcopy migration with suspend
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (8 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 09/11] tests/qtest: precopy migration with suspend Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
2023-08-29 18:18 ` [PATCH V4 11/11] tests/qtest: background " Steve Sistare
10 siblings, 0 replies; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Add a test case to verify that the suspended state is handled correctly by
live migration postcopy. The test suspends the src, migrates, then wakes
the dest.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tests/qtest/migration-test.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index cc508ef..6306fb0 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -599,8 +599,12 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
{
qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
- wait_for_stop(from, &src_state);
- qtest_qmp_eventwait(to, "RESUME");
+ if (src_state.suspend_me) {
+ wait_for_suspend(from, &src_state);
+ } else {
+ wait_for_stop(from, &src_state);
+ qtest_qmp_eventwait(to, "RESUME");
+ }
}
typedef struct {
@@ -1299,6 +1303,11 @@ static void migrate_postcopy_complete(QTestState *from, QTestState *to,
{
wait_for_migration_complete(from);
+ if (args->start.suspend_me) {
+ /* wakeup succeeds only if guest is suspended */
+ qtest_qmp_assert_success(to, "{'execute': 'system_wakeup'}");
+ }
+
/* Make sure we get at least one "B" on destination */
wait_for_serial("dest_serial");
@@ -1332,6 +1341,15 @@ static void test_postcopy(void)
test_postcopy_common(&args);
}
+static void test_postcopy_suspend(void)
+{
+ MigrateCommon args = {
+ .start.suspend_me = true,
+ };
+
+ test_postcopy_common(&args);
+}
+
static void test_postcopy_compress(void)
{
MigrateCommon args = {
@@ -2884,6 +2902,10 @@ int main(int argc, char **argv)
qtest_add_func("/migration/postcopy/recovery/compress/plain",
test_postcopy_recovery_compress);
}
+ if (is_x86) {
+ qtest_add_func("/migration/postcopy/suspend",
+ test_postcopy_suspend);
+ }
}
qtest_add_func("/migration/bad_dest", test_baddest);
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* [PATCH V4 11/11] tests/qtest: background migration with suspend
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
` (9 preceding siblings ...)
2023-08-29 18:18 ` [PATCH V4 10/11] tests/qtest: postcopy " Steve Sistare
@ 2023-08-29 18:18 ` Steve Sistare
10 siblings, 0 replies; 27+ messages in thread
From: Steve Sistare @ 2023-08-29 18:18 UTC (permalink / raw)
To: qemu-devel
Cc: Juan Quintela, Peter Xu, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé, Steve Sistare
Add a test case to verify that the suspended state is handled correctly by
a background migration. The test suspends the src, migrates, then wakes
the dest.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
tests/qtest/migration-test.c | 21 +++++++++++++++++++++
1 file changed, 21 insertions(+)
diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 6306fb0..5cc8b91 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -1679,6 +1679,26 @@ static void test_precopy_unix_suspend_notlive(void)
test_precopy_common(&args);
}
+static void *test_bg_suspend_start(QTestState *from, QTestState *to)
+{
+ migrate_set_capability(from, "background-snapshot", true);
+ return NULL;
+}
+
+static void test_bg_suspend(void)
+{
+ g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
+ MigrateCommon args = {
+ .listen_uri = uri,
+ .connect_uri = uri,
+ .live = true, /* runs fast, the src suspends immediately. */
+ .start.suspend_me = true,
+ .start_hook = test_bg_suspend_start
+ };
+
+ test_precopy_common(&args);
+}
+
static void test_precopy_unix_dirty_ring(void)
{
g_autofree char *uri = g_strdup_printf("unix:%s/migsocket", tmpfs);
@@ -2905,6 +2925,7 @@ int main(int argc, char **argv)
if (is_x86) {
qtest_add_func("/migration/postcopy/suspend",
test_postcopy_suspend);
+ qtest_add_func("/migration/bg/suspend", test_bg_suspend);
}
}
--
1.8.3.1
^ permalink raw reply related [flat|nested] 27+ messages in thread
* Re: [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start
2023-08-29 18:17 ` [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start Steve Sistare
@ 2023-08-30 15:52 ` Peter Xu
2023-08-30 15:56 ` Steven Sistare
0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2023-08-30 15:52 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:17:56AM -0700, Steve Sistare wrote:
> When a vm in the suspended state is migrated, we must call vm_prepare_start
> on the destination, so a later system_wakeup properly resumes the guest,
> when main_loop_should_exit callsresume_all_vcpus. However, the runstate
> should remain suspended until system_wakeup is called, so allow the caller
> to pass the new state to vm_prepare_start, rather than assume the new state
> is RUN_STATE_RUNNING. Modify vm state change handlers that check
> RUN_STATE_RUNNING to instead use the running parameter.
>
> No functional change.
>
> Suggested-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
I think all the call sites should be covered indeed, via:
qemu_add_vm_change_state_handler_prio
qdev_add_vm_change_state_handler
virtio_blk_device_realize[1653] qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
scsi_qdev_realize[289] dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
vfio_migration_init[796] migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
virtio_init[3189] vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
qemu_add_vm_change_state_handler
xen_init[106] qemu_add_vm_change_state_handler(xen_change_state_handler, NULL);
audio_init[1827] e = qemu_add_vm_change_state_handler (audio_vm_change_state_handler, s);
tpm_emulator_inst_init[978] qemu_add_vm_change_state_handler(tpm_emulator_vm_state_change,
blk_root_activate[223] blk->vmsh = qemu_add_vm_change_state_handler(blk_vm_state_changed,
gdbserver_start[384] qemu_add_vm_change_state_handler(gdb_vm_state_change, NULL);
pflash_post_load[1038] pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
qxl_realize_common[2202] qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl);
kvmclock_realize[233] qemu_add_vm_change_state_handler(kvmclock_vm_state_change, s);
kvm_pit_realizefn[298] qemu_add_vm_change_state_handler(kvm_pit_vm_state_change, s);
vapic_post_load[796] qemu_add_vm_change_state_handler(kvmvapic_vm_state_change, s);
ide_bus_register_restart_cb[2767] bus->vmstate = qemu_add_vm_change_state_handler(ide_restart_cb, bus);
kvm_arm_its_realize[122] qemu_add_vm_change_state_handler(vm_change_state_handler, s);
kvm_arm_gicv3_realize[888] qemu_add_vm_change_state_handler(vm_change_state_handler, s);
kvmppc_xive_connect[794] xive->change = qemu_add_vm_change_state_handler(
via1_post_load[971] v1s->vmstate = qemu_add_vm_change_state_handler(
e1000e_core_pci_realize[3379] qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
igb_core_pci_realize[4012] core->vmstate = qemu_add_vm_change_state_handler(igb_vm_state_change, core);
spapr_nvram_post_load[235] nvram->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
ppc_booke_timers_init[366] qemu_add_vm_change_state_handler(cpu_state_change_handler, cpu);
spapr_machine_init[3070] qemu_add_vm_change_state_handler(cpu_ppc_clock_vm_state_change,
kvm_s390_tod_realize[133] qemu_add_vm_change_state_handler(kvm_s390_tod_vm_state_change, td);
usb_ehci_realize[2540] s->vmstate = qemu_add_vm_change_state_handler(usb_ehci_vm_state_change, s);
usb_host_auto_check[1912] usb_vmstate = qemu_add_vm_change_state_handler(usb_host_vm_state, NULL);
usbredir_realize[1466] qemu_add_vm_change_state_handler(usbredir_vm_state_change, dev);
virtio_rng_device_realize[226] vrng->vmstate = qemu_add_vm_change_state_handler(virtio_rng_vm_state_change,
xen_do_ioreq_register[825] qemu_add_vm_change_state_handler(xen_hvm_change_state_handler, state);
net_init_clients[1644] qemu_add_vm_change_state_handler(net_vm_change_state_handler, NULL);
memory_global_dirty_log_stop[2978] vmstate_change = qemu_add_vm_change_state_handler(
hvf_arch_init[2036] qemu_add_vm_change_state_handler(hvf_vm_state_change, &vtimer);
kvm_arch_init_vcpu[567] qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cs);
kvm_arch_init_vcpu[2191] cpu->vmsentry = qemu_add_vm_change_state_handler(cpu_update_state, env);
sev_kvm_init[1014] qemu_add_vm_change_state_handler(sev_vm_state_change, sev);
whpx_init_vcpu[2248] qemu_add_vm_change_state_handler(whpx_cpu_update_state, cpu->env_ptr);
kvm_arch_init_vcpu[70] qemu_add_vm_change_state_handler(kvm_mips_update_state, cs);
kvm_arch_init_vcpu[891] qemu_add_vm_change_state_handler(kvm_riscv_vm_state_change, cs);
gtk_display_init[2410] qemu_add_vm_change_state_handler(gd_change_runstate, s);
qemu_spice_display_init_done[651] qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
qemu_spice_add_interface[868] qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
Looks all correct:
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start
2023-08-30 15:52 ` Peter Xu
@ 2023-08-30 15:56 ` Steven Sistare
0 siblings, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-08-30 15:56 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 11:52 AM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:17:56AM -0700, Steve Sistare wrote:
>> When a vm in the suspended state is migrated, we must call vm_prepare_start
>> on the destination, so a later system_wakeup properly resumes the guest,
>> when main_loop_should_exit callsresume_all_vcpus. However, the runstate
>> should remain suspended until system_wakeup is called, so allow the caller
>> to pass the new state to vm_prepare_start, rather than assume the new state
>> is RUN_STATE_RUNNING. Modify vm state change handlers that check
>> RUN_STATE_RUNNING to instead use the running parameter.
>>
>> No functional change.
>>
>> Suggested-by: Peter Xu <peterx@redhat.com>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>
> I think all the call sites should be covered indeed, via:
Indeed, I laboriously checked every handler, like you, and then I realized
that searching for RUN_STATE_RUNNING outside of runstate.c yielded many
fewer call sites to check!
- Steve
> qemu_add_vm_change_state_handler_prio
> qdev_add_vm_change_state_handler
> virtio_blk_device_realize[1653] qdev_add_vm_change_state_handler(dev, virtio_blk_dma_restart_cb, s);
> scsi_qdev_realize[289] dev->vmsentry = qdev_add_vm_change_state_handler(DEVICE(dev),
> vfio_migration_init[796] migration->vm_state = qdev_add_vm_change_state_handler(vbasedev->dev,
> virtio_init[3189] vdev->vmstate = qdev_add_vm_change_state_handler(DEVICE(vdev),
> qemu_add_vm_change_state_handler
> xen_init[106] qemu_add_vm_change_state_handler(xen_change_state_handler, NULL);
> audio_init[1827] e = qemu_add_vm_change_state_handler (audio_vm_change_state_handler, s);
> tpm_emulator_inst_init[978] qemu_add_vm_change_state_handler(tpm_emulator_vm_state_change,
> blk_root_activate[223] blk->vmsh = qemu_add_vm_change_state_handler(blk_vm_state_changed,
> gdbserver_start[384] qemu_add_vm_change_state_handler(gdb_vm_state_change, NULL);
> pflash_post_load[1038] pfl->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
> qxl_realize_common[2202] qemu_add_vm_change_state_handler(qxl_vm_change_state_handler, qxl);
> kvmclock_realize[233] qemu_add_vm_change_state_handler(kvmclock_vm_state_change, s);
> kvm_pit_realizefn[298] qemu_add_vm_change_state_handler(kvm_pit_vm_state_change, s);
> vapic_post_load[796] qemu_add_vm_change_state_handler(kvmvapic_vm_state_change, s);
> ide_bus_register_restart_cb[2767] bus->vmstate = qemu_add_vm_change_state_handler(ide_restart_cb, bus);
> kvm_arm_its_realize[122] qemu_add_vm_change_state_handler(vm_change_state_handler, s);
> kvm_arm_gicv3_realize[888] qemu_add_vm_change_state_handler(vm_change_state_handler, s);
> kvmppc_xive_connect[794] xive->change = qemu_add_vm_change_state_handler(
> via1_post_load[971] v1s->vmstate = qemu_add_vm_change_state_handler(
> e1000e_core_pci_realize[3379] qemu_add_vm_change_state_handler(e1000e_vm_state_change, core);
> igb_core_pci_realize[4012] core->vmstate = qemu_add_vm_change_state_handler(igb_vm_state_change, core);
> spapr_nvram_post_load[235] nvram->vmstate = qemu_add_vm_change_state_handler(postload_update_cb,
> ppc_booke_timers_init[366] qemu_add_vm_change_state_handler(cpu_state_change_handler, cpu);
> spapr_machine_init[3070] qemu_add_vm_change_state_handler(cpu_ppc_clock_vm_state_change,
> kvm_s390_tod_realize[133] qemu_add_vm_change_state_handler(kvm_s390_tod_vm_state_change, td);
> usb_ehci_realize[2540] s->vmstate = qemu_add_vm_change_state_handler(usb_ehci_vm_state_change, s);
> usb_host_auto_check[1912] usb_vmstate = qemu_add_vm_change_state_handler(usb_host_vm_state, NULL);
> usbredir_realize[1466] qemu_add_vm_change_state_handler(usbredir_vm_state_change, dev);
> virtio_rng_device_realize[226] vrng->vmstate = qemu_add_vm_change_state_handler(virtio_rng_vm_state_change,
> xen_do_ioreq_register[825] qemu_add_vm_change_state_handler(xen_hvm_change_state_handler, state);
> net_init_clients[1644] qemu_add_vm_change_state_handler(net_vm_change_state_handler, NULL);
> memory_global_dirty_log_stop[2978] vmstate_change = qemu_add_vm_change_state_handler(
> hvf_arch_init[2036] qemu_add_vm_change_state_handler(hvf_vm_state_change, &vtimer);
> kvm_arch_init_vcpu[567] qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cs);
> kvm_arch_init_vcpu[2191] cpu->vmsentry = qemu_add_vm_change_state_handler(cpu_update_state, env);
> sev_kvm_init[1014] qemu_add_vm_change_state_handler(sev_vm_state_change, sev);
> whpx_init_vcpu[2248] qemu_add_vm_change_state_handler(whpx_cpu_update_state, cpu->env_ptr);
> kvm_arch_init_vcpu[70] qemu_add_vm_change_state_handler(kvm_mips_update_state, cs);
> kvm_arch_init_vcpu[891] qemu_add_vm_change_state_handler(kvm_riscv_vm_state_change, cs);
> gtk_display_init[2410] qemu_add_vm_change_state_handler(gd_change_runstate, s);
> qemu_spice_display_init_done[651] qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
> qemu_spice_add_interface[868] qemu_add_vm_change_state_handler(vm_change_state_handler, NULL);
>
> Looks all correct:
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 02/11] migration: preserve suspended runstate
2023-08-29 18:17 ` [PATCH V4 02/11] migration: preserve suspended runstate Steve Sistare
@ 2023-08-30 16:07 ` Peter Xu
0 siblings, 0 replies; 27+ messages in thread
From: Peter Xu @ 2023-08-30 16:07 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:17:57AM -0700, Steve Sistare wrote:
> A guest that is migrated in the suspended state automaticaly wakes and
> continues execution. This is wrong; the guest should end migration in
> the same state it started. The root cause is that the outgoing migration
> code automatically wakes the guest, then saves the RUNNING runstate in
> global_state_store(), hence the incoming migration code thinks the guest is
> running and continues the guest if autostart is true.
>
> On the outgoing side, do not call qemu_system_wakeup_request().
>
> On the incoming side for precopy, prepare to start the vm, but do not
> yet start it. A future system_wakeup will cause the main loop to resume
> the VCPUs.
>
> On the incoming side for postcopy, do not wake the guest, and apply the
> the same logic as found in precopy: if autostart and the runstate is
> RUNNING, then vm_start, else prepare to start the vm.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 03/11] migration: add runstate function
2023-08-29 18:17 ` [PATCH V4 03/11] migration: add runstate function Steve Sistare
@ 2023-08-30 16:11 ` Peter Xu
0 siblings, 0 replies; 27+ messages in thread
From: Peter Xu @ 2023-08-30 16:11 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:17:58AM -0700, Steve Sistare wrote:
> Create a subroutine for preserving the runstate after migration,
> to be used in a subsequent patch. No functional change.
There is actually a functional change when postcopy=on && colo=on, but I
don't think it's a valid use case, so maybe we should fail
migrate_caps_check perhaps when both set? Not some immediate concern.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 04/11] migration: preserve suspended for snapshot
2023-08-29 18:17 ` [PATCH V4 04/11] migration: preserve suspended for snapshot Steve Sistare
@ 2023-08-30 16:22 ` Peter Xu
2023-11-13 18:32 ` Steven Sistare
0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2023-08-30 16:22 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:17:59AM -0700, Steve Sistare wrote:
> Restoring a snapshot can break a suspended guest.
>
> If a guest is suspended and saved to a snapshot using savevm, and qemu
> is terminated and restarted with the -S option, then loadvm does not
> restore the guest. The runstate is running, but the guest is not, because
> vm_start was not called. The root cause is that loadvm does not restore
> the runstate (eg suspended) from global_state loaded from the state file.
>
> Restore the runstate, and allow the new state transitions that are possible.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> migration/savevm.c | 1 +
> softmmu/runstate.c | 2 ++
> 2 files changed, 3 insertions(+)
>
> diff --git a/migration/savevm.c b/migration/savevm.c
> index eba3653..7b9c477 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3194,6 +3194,7 @@ bool load_snapshot(const char *name, const char *vmstate,
> }
> aio_context_acquire(aio_context);
> ret = qemu_loadvm_state(f);
> + migrate_set_runstate();
I see that some load_snapshot() callers manage the vm states on their own.
Take snapshot_load_job_bh() as an example:
s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
if (s->ret && orig_vm_running) {
vm_start();
}
I assume you wanted to unify the state changes here. Need to fix the
callers too?
> migration_incoming_state_destroy();
> aio_context_release(aio_context);
>
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index f3bd862..21d7407 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -77,6 +77,8 @@ typedef struct {
>
> static const RunStateTransition runstate_transitions_def[] = {
> { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
> + { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
> + { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>
> { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
> { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
Many of the call sites also starts loadvm under RUN_STATE_RESTORE_VM. Do
we need more entries for that?
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 05/11] migration: preserve suspended for bg_migration
2023-08-29 18:18 ` [PATCH V4 05/11] migration: preserve suspended for bg_migration Steve Sistare
@ 2023-08-30 16:35 ` Peter Xu
2023-11-13 18:32 ` Steven Sistare
0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2023-08-30 16:35 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:18:00AM -0700, Steve Sistare wrote:
> Do not wake a suspended guest during bg_migration.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
> migration/migration.c | 12 +++++-------
> softmmu/runstate.c | 1 +
> 2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index a9ecb66..303d5a6 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -3064,7 +3064,9 @@ static void bg_migration_vm_start_bh(void *opaque)
> qemu_bh_delete(s->vm_start_bh);
> s->vm_start_bh = NULL;
>
> - vm_start();
> + if (!runstate_check(RUN_STATE_SUSPENDED)) {
> + vm_start();
> + }
> s->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - s->downtime_start;
> }
>
> @@ -3134,16 +3136,12 @@ static void *bg_migration_thread(void *opaque)
>
> qemu_mutex_lock_iothread();
>
> - /*
> - * If VM is currently in suspended state, then, to make a valid runstate
> - * transition in vm_stop_force_state() we need to wakeup it up.
> - */
> - qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> s->vm_old_state = runstate_get();
>
> global_state_store();
> /* Forcibly stop VM before saving state of vCPUs and devices */
> - if (vm_stop_force_state(RUN_STATE_PAUSED)) {
> + if (!runstate_check(RUN_STATE_SUSPENDED) &&
> + vm_stop_force_state(RUN_STATE_PAUSED)) {
IIUC we need the vm_stop even for SUSPENDED? I think we need to make sure
all backends are stopped before we start wr-protect the guest pages, so
when wr-protect happens the guest pages should be in a consistent state.
I'd think it proper to juse reuse the helper you introduced in the previous
patches, but maybe you have a reason to not do so (I didn't see it
mentioned anywhere, though)?
> goto fail;
> }
> /*
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index 21d7407..4417527 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -163,6 +163,7 @@ static const RunStateTransition runstate_transitions_def[] = {
> { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
> { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
> { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
> + { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>
> { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
> { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
> --
> 1.8.3.1
>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
2023-08-29 18:18 ` [PATCH V4 06/11] migration: preserve cpu ticks if suspended Steve Sistare
@ 2023-08-30 16:47 ` Peter Xu
2023-09-07 15:50 ` Steven Sistare
2023-11-13 18:32 ` Steven Sistare
0 siblings, 2 replies; 27+ messages in thread
From: Peter Xu @ 2023-08-30 16:47 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
> timers_state saved to the migration stream is stale, causing time errors
> in the guest when it wakes from suspend.
Instead of having this, I'm wondering whether we should just let:
ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
stop the vm for suspended too - I think we reached a consensus that
SUSPENDED should be treated the same as running here (except the vcpu
beingg running or not).
So the more risky change is we should make runstate_is_running() cover
SUSPENDED, but of course that again can affect many other call sites.. and
I'm not sure whether it's 100% working everywhere.
I think I mentioned the other "easier" way, which is to modify
vm_stop_force_state() to take suspended:
int vm_stop_force_state(RunState state)
{
- if (runstate_is_running()) {
+ if (runstate_is_running() || runstate_is_suspended()) {
return vm_stop(state);
That resides in cpus.c but it really only affects migration, so much less
risky. Do you think this should be the better (and correct) way to go?
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 07/11] tests/qtest: migration events
2023-08-29 18:18 ` [PATCH V4 07/11] tests/qtest: migration events Steve Sistare
@ 2023-08-30 17:00 ` Peter Xu
2023-11-13 18:33 ` Steven Sistare
0 siblings, 1 reply; 27+ messages in thread
From: Peter Xu @ 2023-08-30 17:00 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:18:02AM -0700, Steve Sistare wrote:
> Define a state object to capture events seen by migration tests, to allow
> more events to be captured in a subsequent patch, and simplify event
> checking in wait_for_migration_pass. No functional change.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: Fabiano Rosas <farosas@suse.de>
> ---
> tests/qtest/migration-helpers.c | 24 +++++----------
> tests/qtest/migration-helpers.h | 8 +++--
> tests/qtest/migration-test.c | 68 +++++++++++++++++++----------------------
> 3 files changed, 44 insertions(+), 56 deletions(-)
>
> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
> index be00c52..b541108 100644
> --- a/tests/qtest/migration-helpers.c
> +++ b/tests/qtest/migration-helpers.c
> @@ -23,26 +23,16 @@
> */
> #define MIGRATION_STATUS_WAIT_TIMEOUT 120
>
> -bool migrate_watch_for_stop(QTestState *who, const char *name,
> - QDict *event, void *opaque)
> -{
> - bool *seen = opaque;
> -
> - if (g_str_equal(name, "STOP")) {
> - *seen = true;
> - return true;
> - }
> -
> - return false;
> -}
> -
> -bool migrate_watch_for_resume(QTestState *who, const char *name,
> +bool migrate_watch_for_events(QTestState *who, const char *name,
> QDict *event, void *opaque)
> {
> - bool *seen = opaque;
> + QTestMigrationState *state = opaque;
>
> - if (g_str_equal(name, "RESUME")) {
> - *seen = true;
> + if (g_str_equal(name, "STOP")) {
> + state->stop_seen = true;
> + return true;
> + } else if (g_str_equal(name, "RESUME")) {
> + state->resume_seen = true;
> return true;
> }
>
> diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
> index 009e250..59fbb83 100644
> --- a/tests/qtest/migration-helpers.h
> +++ b/tests/qtest/migration-helpers.h
> @@ -15,9 +15,11 @@
>
> #include "libqtest.h"
>
> -bool migrate_watch_for_stop(QTestState *who, const char *name,
> - QDict *event, void *opaque);
> -bool migrate_watch_for_resume(QTestState *who, const char *name,
> +typedef struct QTestMigrationState {
> + bool stop_seen, resume_seen;
> +} QTestMigrationState;
> +
> +bool migrate_watch_for_events(QTestState *who, const char *name,
> QDict *event, void *opaque);
>
> G_GNUC_PRINTF(3, 4)
> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
> index 62d3f37..526a1b7 100644
> --- a/tests/qtest/migration-test.c
> +++ b/tests/qtest/migration-test.c
> @@ -43,8 +43,8 @@
> unsigned start_address;
> unsigned end_address;
> static bool uffd_feature_thread_id;
> -static bool got_src_stop;
> -static bool got_dst_resume;
> +static QTestMigrationState src_state;
> +static QTestMigrationState dst_state;
>
> /*
> * An initial 3 MB offset is used as that corresponds
> @@ -188,6 +188,13 @@ static void wait_for_serial(const char *side)
> } while (true);
> }
>
> +static void wait_for_stop(QTestState *who, QTestMigrationState *state)
> +{
> + if (!state->stop_seen) {
> + qtest_qmp_eventwait(who, "STOP");
> + }
> +}
> +
> /*
> * It's tricky to use qemu's migration event capability with qtest,
> * events suddenly appearing confuse the qmp()/hmp() responses.
> @@ -235,21 +242,19 @@ static void read_blocktime(QTestState *who)
> qobject_unref(rsp_return);
> }
>
> +/*
> + * Wait for two changes in the migration pass count, but bail if we stop.
> + */
> static void wait_for_migration_pass(QTestState *who)
> {
> - uint64_t initial_pass = get_migration_pass(who);
> - uint64_t pass;
> + uint64_t pass, prev_pass = 0, changes = 0;
>
> - /* Wait for the 1st sync */
> - while (!got_src_stop && !initial_pass) {
> - usleep(1000);
> - initial_pass = get_migration_pass(who);
> - }
> -
> - do {
> + while (changes < 2 && !src_state.stop_seen) {
> usleep(1000);
> pass = get_migration_pass(who);
> - } while (pass == initial_pass && !got_src_stop);
> + changes += (pass != prev_pass);
> + prev_pass = pass;
> + }
Here won't it start to wait for 2 iterations every time instead of 1?
Note that previously we only wait for 1 iteration as long as not the
initial pass. And I think the change will double the counts for below..
while (args->iterations > 1) {
wait_for_migration_pass(from);
args->iterations--;
}
The event-related changes are all fine, but maybe leave this piece as before?
> }
>
> static void check_guests_ram(QTestState *who)
> @@ -586,10 +591,7 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
> {
> qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
>
> - if (!got_src_stop) {
> - qtest_qmp_eventwait(from, "STOP");
> - }
> -
> + wait_for_stop(from, &src_state);
> qtest_qmp_eventwait(to, "RESUME");
> }
>
> @@ -720,8 +722,9 @@ static int test_migrate_start(QTestState **from, QTestState **to,
> }
> }
>
> - got_src_stop = false;
> - got_dst_resume = false;
> + dst_state = (QTestMigrationState) { };
> + src_state = (QTestMigrationState) { };
> +
> bootpath = g_strdup_printf("%s/bootsect", tmpfs);
> if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
> /* the assembled x86 boot sector should be exactly one sector large */
> @@ -801,8 +804,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
> if (!args->only_target) {
> *from = qtest_init(cmd_source);
> qtest_qmp_set_event_callback(*from,
> - migrate_watch_for_stop,
> - &got_src_stop);
> + migrate_watch_for_events,
> + &src_state);
> }
>
> cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
> @@ -821,8 +824,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
> ignore_stderr);
> *to = qtest_init(cmd_target);
> qtest_qmp_set_event_callback(*to,
> - migrate_watch_for_resume,
> - &got_dst_resume);
> + migrate_watch_for_events,
> + &dst_state);
>
> /*
> * Remove shmem file immediately to avoid memory leak in test failed case.
> @@ -1516,9 +1519,7 @@ static void test_precopy_common(MigrateCommon *args)
> */
> if (args->result == MIG_TEST_SUCCEED) {
> qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
> - if (!got_src_stop) {
> - qtest_qmp_eventwait(from, "STOP");
> - }
> + wait_for_stop(from, &src_state);
> migrate_ensure_converge(from);
> }
> }
> @@ -1560,9 +1561,8 @@ static void test_precopy_common(MigrateCommon *args)
> */
> wait_for_migration_complete(from);
>
> - if (!got_src_stop) {
> - qtest_qmp_eventwait(from, "STOP");
> - }
> + wait_for_stop(from, &src_state);
> +
> } else {
> wait_for_migration_complete(from);
> /*
> @@ -1575,7 +1575,7 @@ static void test_precopy_common(MigrateCommon *args)
> qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
> }
>
> - if (!got_dst_resume) {
> + if (!dst_state.resume_seen) {
> qtest_qmp_eventwait(to, "RESUME");
> }
>
> @@ -1696,9 +1696,7 @@ static void test_ignore_shared(void)
>
> migrate_wait_for_dirty_mem(from, to);
>
> - if (!got_src_stop) {
> - qtest_qmp_eventwait(from, "STOP");
> - }
> + wait_for_stop(from, &src_state);
>
> qtest_qmp_eventwait(to, "RESUME");
>
> @@ -2139,7 +2137,7 @@ static void test_migrate_auto_converge(void)
> break;
> }
> usleep(20);
> - g_assert_false(got_src_stop);
> + g_assert_false(src_state.stop_seen);
> } while (true);
> /* The first percentage of throttling should be at least init_pct */
> g_assert_cmpint(percentage, >=, init_pct);
> @@ -2481,9 +2479,7 @@ static void test_multifd_tcp_cancel(void)
>
> migrate_ensure_converge(from);
>
> - if (!got_src_stop) {
> - qtest_qmp_eventwait(from, "STOP");
> - }
> + wait_for_stop(from, &src_state);
> qtest_qmp_eventwait(to2, "RESUME");
>
> wait_for_serial("dest_serial");
> --
> 1.8.3.1
>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 08/11] tests/qtest: option to suspend during migration
2023-08-29 18:18 ` [PATCH V4 08/11] tests/qtest: option to suspend during migration Steve Sistare
@ 2023-08-30 17:01 ` Peter Xu
0 siblings, 0 replies; 27+ messages in thread
From: Peter Xu @ 2023-08-30 17:01 UTC (permalink / raw)
To: Steve Sistare
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On Tue, Aug 29, 2023 at 11:18:03AM -0700, Steve Sistare wrote:
> Add an option to suspend the src in a-b-bootblock.S, which puts the guest
> in S3 state after one round of writing to memory. The option is enabled by
> poking a 1 into the suspend_me word in the boot block prior to starting the
> src vm. Generate symbol offsets in a-b-bootblock.h so that the suspend_me
> offset is known.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Acked-by: Peter Xu <peterx@redhat.com>
--
Peter Xu
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
2023-08-30 16:47 ` Peter Xu
@ 2023-09-07 15:50 ` Steven Sistare
2023-11-13 18:32 ` Steven Sistare
1 sibling, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-09-07 15:50 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 12:47 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
>> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
>> timers_state saved to the migration stream is stale, causing time errors
>> in the guest when it wakes from suspend.
>
> Instead of having this, I'm wondering whether we should just let:
>
> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>
> stop the vm for suspended too - I think we reached a consensus that
> SUSPENDED should be treated the same as running here (except the vcpu
> beingg running or not).
>
> So the more risky change is we should make runstate_is_running() cover
> SUSPENDED, but of course that again can affect many other call sites.. and
> I'm not sure whether it's 100% working everywhere.
>
> I think I mentioned the other "easier" way, which is to modify
> vm_stop_force_state() to take suspended:
>
> int vm_stop_force_state(RunState state)
> {
> - if (runstate_is_running()) {
> + if (runstate_is_running() || runstate_is_suspended()) {
> return vm_stop(state);
>
> That resides in cpus.c but it really only affects migration, so much less
> risky. Do you think this should be the better (and correct) way to go?
Hi Peter, thanks for your careful review. I have some other work to do, but
I will get back to this soon.
- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 04/11] migration: preserve suspended for snapshot
2023-08-30 16:22 ` Peter Xu
@ 2023-11-13 18:32 ` Steven Sistare
0 siblings, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-11-13 18:32 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 12:22 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:17:59AM -0700, Steve Sistare wrote:
>> Restoring a snapshot can break a suspended guest.
>>
>> If a guest is suspended and saved to a snapshot using savevm, and qemu
>> is terminated and restarted with the -S option, then loadvm does not
>> restore the guest. The runstate is running, but the guest is not, because
>> vm_start was not called. The root cause is that loadvm does not restore
>> the runstate (eg suspended) from global_state loaded from the state file.
>>
>> Restore the runstate, and allow the new state transitions that are possible.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> migration/savevm.c | 1 +
>> softmmu/runstate.c | 2 ++
>> 2 files changed, 3 insertions(+)
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index eba3653..7b9c477 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -3194,6 +3194,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>> }
>> aio_context_acquire(aio_context);
>> ret = qemu_loadvm_state(f);
>> + migrate_set_runstate();
>
> I see that some load_snapshot() callers manage the vm states on their own.
> Take snapshot_load_job_bh() as an example:
>
> s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
> if (s->ret && orig_vm_running) {
> vm_start();
> }
>
> I assume you wanted to unify the state changes here. Need to fix the
> callers too?
Agreed. Fixed in V5.
>> migration_incoming_state_destroy();
>> aio_context_release(aio_context);
>>
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index f3bd862..21d7407 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -77,6 +77,8 @@ typedef struct {
>>
>> static const RunStateTransition runstate_transitions_def[] = {
>> { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
>> + { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
>> + { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>>
>> { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
>> { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
>
> Many of the call sites also starts loadvm under RUN_STATE_RESTORE_VM. Do
> we need more entries for that?
Agreed. Fixed in V5.
- Steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 05/11] migration: preserve suspended for bg_migration
2023-08-30 16:35 ` Peter Xu
@ 2023-11-13 18:32 ` Steven Sistare
0 siblings, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-11-13 18:32 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 12:35 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:00AM -0700, Steve Sistare wrote:
>> Do not wake a suspended guest during bg_migration.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>> migration/migration.c | 12 +++++-------
>> softmmu/runstate.c | 1 +
>> 2 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/migration/migration.c b/migration/migration.c
>> index a9ecb66..303d5a6 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -3064,7 +3064,9 @@ static void bg_migration_vm_start_bh(void *opaque)
>> qemu_bh_delete(s->vm_start_bh);
>> s->vm_start_bh = NULL;
>>
>> - vm_start();
>> + if (!runstate_check(RUN_STATE_SUSPENDED)) {
>> + vm_start();
>> + }
>> s->downtime = qemu_clock_get_ms(QEMU_CLOCK_REALTIME) - s->downtime_start;
>> }
>>
>> @@ -3134,16 +3136,12 @@ static void *bg_migration_thread(void *opaque)
>>
>> qemu_mutex_lock_iothread();
>>
>> - /*
>> - * If VM is currently in suspended state, then, to make a valid runstate
>> - * transition in vm_stop_force_state() we need to wakeup it up.
>> - */
>> - qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>> s->vm_old_state = runstate_get();
>>
>> global_state_store();
>> /* Forcibly stop VM before saving state of vCPUs and devices */
>> - if (vm_stop_force_state(RUN_STATE_PAUSED)) {
>> + if (!runstate_check(RUN_STATE_SUSPENDED) &&
>> + vm_stop_force_state(RUN_STATE_PAUSED)) {
>
> IIUC we need the vm_stop even for SUSPENDED? I think we need to make sure
> all backends are stopped before we start wr-protect the guest pages, so
> when wr-protect happens the guest pages should be in a consistent state.
Agreed, I stop everything for suspended in V5.
> I'd think it proper to juse reuse the helper you introduced in the previous
> patches, but maybe you have a reason to not do so (I didn't see it
> mentioned anywhere, though)?
This code is outgoing. The helper applies to the incoming side.
- Steve
>> goto fail;
>> }
>> /*
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index 21d7407..4417527 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -163,6 +163,7 @@ static const RunStateTransition runstate_transitions_def[] = {
>> { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
>> { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>> { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>> + { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>>
>> { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING },
>> { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE },
>> --
>> 1.8.3.1
>>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 06/11] migration: preserve cpu ticks if suspended
2023-08-30 16:47 ` Peter Xu
2023-09-07 15:50 ` Steven Sistare
@ 2023-11-13 18:32 ` Steven Sistare
1 sibling, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-11-13 18:32 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 12:47 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:01AM -0700, Steve Sistare wrote:
>> During RUN_STATE_SUSPENDED, the cpu clock remains enabled, so the
>> timers_state saved to the migration stream is stale, causing time errors
>> in the guest when it wakes from suspend.
>
> Instead of having this, I'm wondering whether we should just let:
>
> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>
> stop the vm for suspended too - I think we reached a consensus that
> SUSPENDED should be treated the same as running here (except the vcpu
> beingg running or not).
>
> So the more risky change is we should make runstate_is_running() cover
> SUSPENDED, but of course that again can affect many other call sites.. and
> I'm not sure whether it's 100% working everywhere.
>
> I think I mentioned the other "easier" way, which is to modify
> vm_stop_force_state() to take suspended:
>
> int vm_stop_force_state(RunState state)
> {
> - if (runstate_is_running()) {
> + if (runstate_is_running() || runstate_is_suspended()) {
> return vm_stop(state);
>
> That resides in cpus.c but it really only affects migration, so much less
> risky. Do you think this should be the better (and correct) way to go?
Agreed, good idea, done in V5 - steve
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 07/11] tests/qtest: migration events
2023-08-30 17:00 ` Peter Xu
@ 2023-11-13 18:33 ` Steven Sistare
2023-11-13 19:20 ` Steven Sistare
0 siblings, 1 reply; 27+ messages in thread
From: Steven Sistare @ 2023-11-13 18:33 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 8/30/2023 1:00 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:18:02AM -0700, Steve Sistare wrote:
>> Define a state object to capture events seen by migration tests, to allow
>> more events to be captured in a subsequent patch, and simplify event
>> checking in wait_for_migration_pass. No functional change.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> Reviewed-by: Fabiano Rosas <farosas@suse.de>
>> ---
>> tests/qtest/migration-helpers.c | 24 +++++----------
>> tests/qtest/migration-helpers.h | 8 +++--
>> tests/qtest/migration-test.c | 68 +++++++++++++++++++----------------------
>> 3 files changed, 44 insertions(+), 56 deletions(-)
>>
>> diff --git a/tests/qtest/migration-helpers.c b/tests/qtest/migration-helpers.c
>> index be00c52..b541108 100644
>> --- a/tests/qtest/migration-helpers.c
>> +++ b/tests/qtest/migration-helpers.c
>> @@ -23,26 +23,16 @@
>> */
>> #define MIGRATION_STATUS_WAIT_TIMEOUT 120
>>
>> -bool migrate_watch_for_stop(QTestState *who, const char *name,
>> - QDict *event, void *opaque)
>> -{
>> - bool *seen = opaque;
>> -
>> - if (g_str_equal(name, "STOP")) {
>> - *seen = true;
>> - return true;
>> - }
>> -
>> - return false;
>> -}
>> -
>> -bool migrate_watch_for_resume(QTestState *who, const char *name,
>> +bool migrate_watch_for_events(QTestState *who, const char *name,
>> QDict *event, void *opaque)
>> {
>> - bool *seen = opaque;
>> + QTestMigrationState *state = opaque;
>>
>> - if (g_str_equal(name, "RESUME")) {
>> - *seen = true;
>> + if (g_str_equal(name, "STOP")) {
>> + state->stop_seen = true;
>> + return true;
>> + } else if (g_str_equal(name, "RESUME")) {
>> + state->resume_seen = true;
>> return true;
>> }
>>
>> diff --git a/tests/qtest/migration-helpers.h b/tests/qtest/migration-helpers.h
>> index 009e250..59fbb83 100644
>> --- a/tests/qtest/migration-helpers.h
>> +++ b/tests/qtest/migration-helpers.h
>> @@ -15,9 +15,11 @@
>>
>> #include "libqtest.h"
>>
>> -bool migrate_watch_for_stop(QTestState *who, const char *name,
>> - QDict *event, void *opaque);
>> -bool migrate_watch_for_resume(QTestState *who, const char *name,
>> +typedef struct QTestMigrationState {
>> + bool stop_seen, resume_seen;
>> +} QTestMigrationState;
>> +
>> +bool migrate_watch_for_events(QTestState *who, const char *name,
>> QDict *event, void *opaque);
>>
>> G_GNUC_PRINTF(3, 4)
>> diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
>> index 62d3f37..526a1b7 100644
>> --- a/tests/qtest/migration-test.c
>> +++ b/tests/qtest/migration-test.c
>> @@ -43,8 +43,8 @@
>> unsigned start_address;
>> unsigned end_address;
>> static bool uffd_feature_thread_id;
>> -static bool got_src_stop;
>> -static bool got_dst_resume;
>> +static QTestMigrationState src_state;
>> +static QTestMigrationState dst_state;
>>
>> /*
>> * An initial 3 MB offset is used as that corresponds
>> @@ -188,6 +188,13 @@ static void wait_for_serial(const char *side)
>> } while (true);
>> }
>>
>> +static void wait_for_stop(QTestState *who, QTestMigrationState *state)
>> +{
>> + if (!state->stop_seen) {
>> + qtest_qmp_eventwait(who, "STOP");
>> + }
>> +}
>> +
>> /*
>> * It's tricky to use qemu's migration event capability with qtest,
>> * events suddenly appearing confuse the qmp()/hmp() responses.
>> @@ -235,21 +242,19 @@ static void read_blocktime(QTestState *who)
>> qobject_unref(rsp_return);
>> }
>>
>> +/*
>> + * Wait for two changes in the migration pass count, but bail if we stop.
>> + */
>> static void wait_for_migration_pass(QTestState *who)
>> {
>> - uint64_t initial_pass = get_migration_pass(who);
>> - uint64_t pass;
>> + uint64_t pass, prev_pass = 0, changes = 0;
>>
>> - /* Wait for the 1st sync */
>> - while (!got_src_stop && !initial_pass) {
>> - usleep(1000);
>> - initial_pass = get_migration_pass(who);
>> - }
>> -
>> - do {
>> + while (changes < 2 && !src_state.stop_seen) {
>> usleep(1000);
>> pass = get_migration_pass(who);
>> - } while (pass == initial_pass && !got_src_stop);
>> + changes += (pass != prev_pass);
>> + prev_pass = pass;
>> + }
>
> Here won't it start to wait for 2 iterations every time instead of 1?
>
> Note that previously we only wait for 1 iteration as long as not the
> initial pass.
I don't think so. Both the old and new code require at least a transition from
pass 0 to 1, and pass 1 to 2, to return. With the old:
when initial_pass becomes non-zero, done with first loop
when pass changes again, done with 2nd loop
- Steve
> And I think the change will double the counts for below..
>
> while (args->iterations > 1) {
> wait_for_migration_pass(from);
> args->iterations--;
> }
>
> The event-related changes are all fine, but maybe leave this piece as before?
>
>> }
>>
>> static void check_guests_ram(QTestState *who)
>> @@ -586,10 +591,7 @@ static void migrate_postcopy_start(QTestState *from, QTestState *to)
>> {
>> qtest_qmp_assert_success(from, "{ 'execute': 'migrate-start-postcopy' }");
>>
>> - if (!got_src_stop) {
>> - qtest_qmp_eventwait(from, "STOP");
>> - }
>> -
>> + wait_for_stop(from, &src_state);
>> qtest_qmp_eventwait(to, "RESUME");
>> }
>>
>> @@ -720,8 +722,9 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>> }
>> }
>>
>> - got_src_stop = false;
>> - got_dst_resume = false;
>> + dst_state = (QTestMigrationState) { };
>> + src_state = (QTestMigrationState) { };
>> +
>> bootpath = g_strdup_printf("%s/bootsect", tmpfs);
>> if (strcmp(arch, "i386") == 0 || strcmp(arch, "x86_64") == 0) {
>> /* the assembled x86 boot sector should be exactly one sector large */
>> @@ -801,8 +804,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>> if (!args->only_target) {
>> *from = qtest_init(cmd_source);
>> qtest_qmp_set_event_callback(*from,
>> - migrate_watch_for_stop,
>> - &got_src_stop);
>> + migrate_watch_for_events,
>> + &src_state);
>> }
>>
>> cmd_target = g_strdup_printf("-accel kvm%s -accel tcg "
>> @@ -821,8 +824,8 @@ static int test_migrate_start(QTestState **from, QTestState **to,
>> ignore_stderr);
>> *to = qtest_init(cmd_target);
>> qtest_qmp_set_event_callback(*to,
>> - migrate_watch_for_resume,
>> - &got_dst_resume);
>> + migrate_watch_for_events,
>> + &dst_state);
>>
>> /*
>> * Remove shmem file immediately to avoid memory leak in test failed case.
>> @@ -1516,9 +1519,7 @@ static void test_precopy_common(MigrateCommon *args)
>> */
>> if (args->result == MIG_TEST_SUCCEED) {
>> qtest_qmp_assert_success(from, "{ 'execute' : 'stop'}");
>> - if (!got_src_stop) {
>> - qtest_qmp_eventwait(from, "STOP");
>> - }
>> + wait_for_stop(from, &src_state);
>> migrate_ensure_converge(from);
>> }
>> }
>> @@ -1560,9 +1561,8 @@ static void test_precopy_common(MigrateCommon *args)
>> */
>> wait_for_migration_complete(from);
>>
>> - if (!got_src_stop) {
>> - qtest_qmp_eventwait(from, "STOP");
>> - }
>> + wait_for_stop(from, &src_state);
>> +
>> } else {
>> wait_for_migration_complete(from);
>> /*
>> @@ -1575,7 +1575,7 @@ static void test_precopy_common(MigrateCommon *args)
>> qtest_qmp_assert_success(to, "{ 'execute' : 'cont'}");
>> }
>>
>> - if (!got_dst_resume) {
>> + if (!dst_state.resume_seen) {
>> qtest_qmp_eventwait(to, "RESUME");
>> }
>>
>> @@ -1696,9 +1696,7 @@ static void test_ignore_shared(void)
>>
>> migrate_wait_for_dirty_mem(from, to);
>>
>> - if (!got_src_stop) {
>> - qtest_qmp_eventwait(from, "STOP");
>> - }
>> + wait_for_stop(from, &src_state);
>>
>> qtest_qmp_eventwait(to, "RESUME");
>>
>> @@ -2139,7 +2137,7 @@ static void test_migrate_auto_converge(void)
>> break;
>> }
>> usleep(20);
>> - g_assert_false(got_src_stop);
>> + g_assert_false(src_state.stop_seen);
>> } while (true);
>> /* The first percentage of throttling should be at least init_pct */
>> g_assert_cmpint(percentage, >=, init_pct);
>> @@ -2481,9 +2479,7 @@ static void test_multifd_tcp_cancel(void)
>>
>> migrate_ensure_converge(from);
>>
>> - if (!got_src_stop) {
>> - qtest_qmp_eventwait(from, "STOP");
>> - }
>> + wait_for_stop(from, &src_state);
>> qtest_qmp_eventwait(to2, "RESUME");
>>
>> wait_for_serial("dest_serial");
>> --
>> 1.8.3.1
>>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH V4 07/11] tests/qtest: migration events
2023-11-13 18:33 ` Steven Sistare
@ 2023-11-13 19:20 ` Steven Sistare
0 siblings, 0 replies; 27+ messages in thread
From: Steven Sistare @ 2023-11-13 19:20 UTC (permalink / raw)
To: Peter Xu
Cc: qemu-devel, Juan Quintela, Paolo Bonzini, Thomas Huth,
Daniel P. Berrangé
On 11/13/2023 1:33 PM, Steven Sistare wrote:
> On 8/30/2023 1:00 PM, Peter Xu wrote:
>> On Tue, Aug 29, 2023 at 11:18:02AM -0700, Steve Sistare wrote:
[...]
>>> +/*
>>> + * Wait for two changes in the migration pass count, but bail if we stop.
>>> + */
>>> static void wait_for_migration_pass(QTestState *who)
>>> {
>>> - uint64_t initial_pass = get_migration_pass(who);
>>> - uint64_t pass;
>>> + uint64_t pass, prev_pass = 0, changes = 0;
>>>
>>> - /* Wait for the 1st sync */
>>> - while (!got_src_stop && !initial_pass) {
>>> - usleep(1000);
>>> - initial_pass = get_migration_pass(who);
>>> - }
>>> -
>>> - do {
>>> + while (changes < 2 && !src_state.stop_seen) {
>>> usleep(1000);
>>> pass = get_migration_pass(who);
>>> - } while (pass == initial_pass && !got_src_stop);
>>> + changes += (pass != prev_pass);
>>> + prev_pass = pass;
>>> + }
>>
>> Here won't it start to wait for 2 iterations every time instead of 1?
>>
>> Note that previously we only wait for 1 iteration as long as not the
>> initial pass.
>
> I don't think so. Both the old and new code require at least a transition from
> pass 0 to 1, and pass 1 to 2, to return. With the old:
> when initial_pass becomes non-zero, done with first loop
> when pass changes again, done with 2nd loop
OK, you refer to count of iterations that call usleep(1000), not pass count.
Yes, the new code may call usleep(1000) twice instead of once.
args->iterations at the caller is at most 2, so this is a tiny difference.
However, I could improve it like so:
static void wait_for_migration_pass(QTestState *who)
{
uint64_t pass = get_migration_pass(who);
uint64_t changes = (pass > 0);
uint64_t prev_pass;
while (changes < 2 && !src_state.stop_seen && !src_state.suspend_seen) {
usleep(1000);
prev_pass = pass;
pass = get_migration_pass(who);
changes += (pass != prev_pass);
}
}
- Steve
>> And I think the change will double the counts for below..
>>
>> while (args->iterations > 1) {
>> wait_for_migration_pass(from);
>> args->iterations--;
>> }
^ permalink raw reply [flat|nested] 27+ messages in thread
end of thread, other threads:[~2023-11-13 19:22 UTC | newest]
Thread overview: 27+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-08-29 18:17 [PATCH V4 00/11] fix migration of suspended runstate Steve Sistare
2023-08-29 18:17 ` [PATCH V4 01/11] cpus: pass runstate to vm_prepare_start Steve Sistare
2023-08-30 15:52 ` Peter Xu
2023-08-30 15:56 ` Steven Sistare
2023-08-29 18:17 ` [PATCH V4 02/11] migration: preserve suspended runstate Steve Sistare
2023-08-30 16:07 ` Peter Xu
2023-08-29 18:17 ` [PATCH V4 03/11] migration: add runstate function Steve Sistare
2023-08-30 16:11 ` Peter Xu
2023-08-29 18:17 ` [PATCH V4 04/11] migration: preserve suspended for snapshot Steve Sistare
2023-08-30 16:22 ` Peter Xu
2023-11-13 18:32 ` Steven Sistare
2023-08-29 18:18 ` [PATCH V4 05/11] migration: preserve suspended for bg_migration Steve Sistare
2023-08-30 16:35 ` Peter Xu
2023-11-13 18:32 ` Steven Sistare
2023-08-29 18:18 ` [PATCH V4 06/11] migration: preserve cpu ticks if suspended Steve Sistare
2023-08-30 16:47 ` Peter Xu
2023-09-07 15:50 ` Steven Sistare
2023-11-13 18:32 ` Steven Sistare
2023-08-29 18:18 ` [PATCH V4 07/11] tests/qtest: migration events Steve Sistare
2023-08-30 17:00 ` Peter Xu
2023-11-13 18:33 ` Steven Sistare
2023-11-13 19:20 ` Steven Sistare
2023-08-29 18:18 ` [PATCH V4 08/11] tests/qtest: option to suspend during migration Steve Sistare
2023-08-30 17:01 ` Peter Xu
2023-08-29 18:18 ` [PATCH V4 09/11] tests/qtest: precopy migration with suspend Steve Sistare
2023-08-29 18:18 ` [PATCH V4 10/11] tests/qtest: postcopy " Steve Sistare
2023-08-29 18:18 ` [PATCH V4 11/11] tests/qtest: background " Steve Sistare
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).