From: Peter Xu <peterx@redhat.com>
To: Steven Sistare <steven.sistare@oracle.com>
Cc: qemu-devel@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH V1 2/3] migration: fix suspended runstate
Date: Mon, 26 Jun 2023 14:27:32 -0400 [thread overview]
Message-ID: <ZJnYlApmsQLXBK/3@x1n> (raw)
In-Reply-To: <340b5f58-0924-6f8e-6f82-0462a5cc22cc@oracle.com>
On Fri, Jun 23, 2023 at 02:25:05PM -0400, Steven Sistare wrote:
> On 6/21/2023 4:28 PM, Peter Xu wrote:
> > On Wed, Jun 21, 2023 at 03:15:42PM -0400, Steven Sistare wrote:
> >> On 6/20/2023 5:46 PM, Peter Xu wrote:
> >>> On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> >>>> Migration of a guest in the suspended state is broken. The incoming
> >>>> migration code automatically tries to wake the guest, which IMO is
> >>>> wrong -- the guest should end migration in the same state it started.
> >>>> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
> >>>> bypasses vm_start(). The guest appears to be in the running state, but
> >>>> it is not.
> >>>>
> >>>> To fix, leave the guest in the suspended state, but call
> >>>> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> >>>> later, when the client sends a system_wakeup command.
> >>>>
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> ---
> >>>> migration/migration.c | 11 ++++-------
> >>>> softmmu/runstate.c | 1 +
> >>>> 2 files changed, 5 insertions(+), 7 deletions(-)
> >>>>
> >>>> diff --git a/migration/migration.c b/migration/migration.c
> >>>> index 17b4b47..851fe6d 100644
> >>>> --- a/migration/migration.c
> >>>> +++ b/migration/migration.c
> >>>> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
> >>>> vm_start();
> >>>> } else {
> >>>> runstate_set(global_state_get_runstate());
> >>>> + if (runstate_check(RUN_STATE_SUSPENDED)) {
> >>>> + /* Force vm_start to be called later. */
> >>>> + qemu_system_start_on_wakeup_request();
> >>>> + }
> >>>
> >>> Is this really needed, along with patch 1?
> >>>
> >>> I have a very limited knowledge on suspension, so I'm prone to making
> >>> mistakes..
> >>>
> >>> But from what I read this, qemu_system_wakeup_request() (existing one, not
> >>> after patch 1 applied) will setup wakeup_reason and kick the main thread
> >>> using qemu_notify_event(). Then IIUC the e.g. vcpu wakeups will be done in
> >>> the main thread later on after qemu_wakeup_requested() returns true.
> >>
> >> Correct, here:
> >>
> >> if (qemu_wakeup_requested()) {
> >> pause_all_vcpus();
> >> qemu_system_wakeup();
> >> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> >> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> >> resume_all_vcpus();
> >> qapi_event_send_wakeup();
> >> }
> >>
> >> However, that is not sufficient, because vm_start() was never called on the incoming
> >> side. vm_start calls the vm state notifiers for RUN_STATE_RUNNING, among other things.
> >>
> >>
> >> Without my fixes, it "works" because the outgoing migration automatically wakes a suspended
> >> guest, which sets the state to running, which is saved in global state:
> >>
> >> void migration_completion(MigrationState *s)
> >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
> >> global_state_store()
> >>
> >> Then the incoming migration calls vm_start here:
> >>
> >> migration/migration.c
> >> if (!global_state_received() ||
> >> global_state_get_runstate() == RUN_STATE_RUNNING) {
> >> if (autostart) {
> >> vm_start();
> >>
> >> vm_start must be called for correctness.
> >
> > I see. Though I had a feeling that this is still not the right way to do,
> > at least not as clean.
> >
> > One question is, would above work for postcopy when VM is suspended during
> > the switchover?
>
> Good catch, that is broken.
> I added qemu_system_start_on_wakeup_request to loadvm_postcopy_handle_run_bh
> and now it works.
>
> if (global_state_get_runstate() == RUN_STATE_RUNNING) {
> if (autostart) {
> vm_start();
> } else {
> runstate_set(RUN_STATE_PAUSED);
> }
> } else {
> runstate_set(global_state_get_runstate());
> if (runstate_check(RUN_STATE_SUSPENDED)) {
> qemu_system_start_on_wakeup_request();
> }
> }
>
> > I think I see your point that vm_start() (mostly vm_prepare_start())
> > contains a bunch of operations that maybe we must have before starting the
> > VM, but then.. should we just make that vm_start() unconditional when
> > loading VM completes? I just don't see anything won't need it (besides
> > -S), even COLO.
> >
> > So I'm wondering about something like this:
> >
> > ===8<===
> > --- a/migration/migration.c
> > +++ b/migration/migration.c
> > @@ -481,19 +481,28 @@ static void process_incoming_migration_bh(void *opaque)
> >
> > dirty_bitmap_mig_before_vm_start();
> >
> > - if (!global_state_received() ||
> > - global_state_get_runstate() == RUN_STATE_RUNNING) {
> > - if (autostart) {
> > - vm_start();
> > - } else {
> > - runstate_set(RUN_STATE_PAUSED);
> > - }
> > - } else if (migration_incoming_colo_enabled()) {
> > + if (migration_incoming_colo_enabled()) {
> > migration_incoming_disable_colo();
> > + /* COLO should always have autostart=1 or we can enforce it here */
> > + }
> > +
> > + if (autostart) {
> > + RunState run_state = global_state_get_runstate();
> > vm_start();
>
> This will resume the guest for a brief time, against the user's wishes. Not OK IMO.
Ah okay..
Can we do whatever we need in vm_prepare_start(), then? I assume these
chunks are needed:
/*
* WHPX accelerator needs to know whether we are going to step
* any CPUs, before starting the first one.
*/
if (cpus_accel->synchronize_pre_resume) {
cpus_accel->synchronize_pre_resume(step_pending);
}
/* We are sending this now, but the CPUs will be resumed shortly later */
qapi_event_send_resume();
cpu_enable_ticks();
While these may not be needed, but instead only needed if RUN_STATE_RUNNING
below (runstate_set() will be needed regardless):
runstate_set(RUN_STATE_RUNNING);
vm_state_notify(1, RUN_STATE_RUNNING);
So here's another of my attempt (this time also taking
!global_state_received() into account)...
RunState run_state;
if (migration_incoming_colo_enabled()) {
migration_incoming_disable_colo();
}
if (!global_state_received()) {
run_state = RUN_STATE_RUNNING;
} else {
run_state = global_state_get_runstate();
}
if (autostart) {
/* Part of vm_prepare_start(), may isolate into a helper? */
if (cpus_accel->synchronize_pre_resume) {
cpus_accel->synchronize_pre_resume(step_pending);
}
qapi_event_send_resume();
cpu_enable_ticks();
/* Setup the runstate on src */
runstate_set(run_state);
if (run_state == RUN_STATE_RUNNING) {
vm_state_notify(1, RUN_STATE_RUNNING);
}
} else {
runstate_set(RUN_STATE_PAUSED);
}
The whole idea is still do whatever needed here besides resuming the vcpus,
rather than leaving the whole vm_start() into system wakeup. It then can
avoid touching the system wakeup code which seems cleaner.
> > IIUC this can drop qemu_system_start_on_wakeup_request() along with the
> > other global var. Would something like it work for us?
>
> Afraid not. start_on_wake is the only correct solution I can think of.
Please check again above, I just hope we can avoid yet another global to
QEMU if possible.
Thanks,
--
Peter Xu
next prev parent reply other threads:[~2023-06-26 18:28 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-15 20:26 [PATCH V1 0/3] fix migration of suspended runstate Steve Sistare
2023-06-15 20:26 ` [PATCH V1 1/3] vl: start on wakeup request Steve Sistare
2023-06-15 20:26 ` [PATCH V1 2/3] migration: fix suspended runstate Steve Sistare
2023-06-20 21:46 ` Peter Xu
2023-06-21 19:15 ` Steven Sistare
2023-06-21 20:28 ` Peter Xu
2023-06-23 18:25 ` Steven Sistare
2023-06-23 19:56 ` Steven Sistare
2023-06-26 18:27 ` Peter Xu [this message]
2023-06-26 19:11 ` Peter Xu
2023-06-30 13:50 ` Steven Sistare
2023-07-26 20:18 ` Peter Xu
2023-08-14 18:53 ` Steven Sistare
2023-08-14 19:37 ` Peter Xu
2023-08-16 17:48 ` Steven Sistare
2023-08-17 18:19 ` Peter Xu
2023-08-24 20:53 ` Steven Sistare
2023-06-15 20:26 ` [PATCH V1 3/3] tests/qtest: live migration suspended state Steve Sistare
2023-06-21 16:45 ` Peter Xu
2023-06-21 19:39 ` Steven Sistare
2023-06-21 20:00 ` Peter Xu
2023-06-22 21:28 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZJnYlApmsQLXBK/3@x1n \
--to=peterx@redhat.com \
--cc=berrange@redhat.com \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=steven.sistare@oracle.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.