From: Steven Sistare <steven.sistare@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Thomas Huth" <thuth@redhat.com>,
	"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH V3 00/10] fix migration of suspended runstate
Date: Fri, 25 Aug 2023 13:56:24 -0400
Message-ID: <7f0c0fc8-2848-06a3-f271-6cbecbd87f4b@oracle.com>
In-Reply-To: <ZOjDmf/o77puC+OW@x1n>

On 8/25/2023 11:07 AM, Peter Xu wrote:
> On Fri, Aug 25, 2023 at 09:28:28AM -0400, Steven Sistare wrote:
>> On 8/24/2023 5:09 PM, Steven Sistare wrote:
>>> On 8/17/2023 2:23 PM, Peter Xu wrote:
>>>> On Mon, Aug 14, 2023 at 11:54:26AM -0700, Steve Sistare wrote:
>>>>> Migration of a guest in the suspended runstate is broken.  The incoming
>>>>> migration code automatically tries to wake the guest, which is wrong;
>>>>> the guest should end migration in the same runstate it started.  Further,
>>>>> for a restored snapshot, the automatic wakeup fails.  The runstate is
>>>>> RUNNING, but the guest is not.  See the commit messages for the details.
>>>>
>>>> Hi Steve,
>>>>
>>>> I drafted two small patches to show what I meant, on top of this series.
>>>> Before applying these two, one needs to revert patch 1 in this series.
>>>>
>>>> Once applied, it should also pass all three new suspend tests.  We can
>>>> continue the discussion here based on the patches.
>>>
>>> Your 2 patches look good.  I suggest we keep patch 1, and I squash patch 2
>>> into the other patches.
> 
> Yes.  Feel free to reorganize / modify /.. the changes in whatever way you
> prefer in the final patchset.
> 
>>>
>>> There is one more fix needed: on the sending side, if the state is suspended,
>>> then ticks must be disabled so the tick globals are updated before they are
>>> written to vmstate.  Otherwise, tick starts at 0 in the receiver when
>>> cpu_enable_ticks is called.
>>>
>>> -------------------------------------------
>>> diff --git a/migration/migration.c b/migration/migration.c
>> [...]
>>> -------------------------------------------
>>
>> This diff is just a rough draft.  I need to resume ticks if the migration
>> fails or is cancelled, and I am trying to push the logic into vm_stop,
>> vm_stop_force_state, and vm_start, and/or vm_prepare_start.
> 
> Yes, this sounds better than hard-coding things into the migration code, thanks.
> 
> Maybe at least all the migration related code paths should always use
> vm_stop_force_state() (e.g. save_snapshot)?
> 
> In the meantime, AFAIU we should allow runstate_is_running() to return true
> even for suspended, matching current usages of vm_start() / vm_stop().  But
> again, that risks breaking existing users.
> 
> I bet you may have a better grasp of what it should look like to solve the
> current "migrate suspended VM" problem at the minimum but hopefully still
> in a clean way, so I assume I'll just wait and see.

I found a better way.
Rather than disabling ticks, I added a pre_save handler that captures the
correct timer state into a snapshot even if the timer is running, using the
logic from cpu_disable_ticks, plus a post_load handler that copies the
snapshot back.  No changes are needed in the migration code:

------------------------------------
diff --git a/softmmu/cpu-timers.c b/softmmu/cpu-timers.c
index 117408c..d5af317 100644
--- a/softmmu/cpu-timers.c
+++ b/softmmu/cpu-timers.c
@@ -157,6 +157,36 @@ static bool icount_shift_state_needed(void *opaque)
     return icount_enabled() == 2;
 }

+static int cpu_pre_save_ticks(void *opaque)
+{
+    TimersState *t = &timers_state;
+    TimersState *snap = opaque;
+
+    seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+
+    if (t->cpu_ticks_enabled) {
+        snap->cpu_ticks_offset = t->cpu_ticks_offset + cpu_get_host_ticks();
+        snap->cpu_clock_offset = cpu_get_clock_locked();
+    } else {
+        snap->cpu_ticks_offset = t->cpu_ticks_offset;
+        snap->cpu_clock_offset = t->cpu_clock_offset;
+    }
+    seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    return 0;
+}
+
+static int cpu_post_load_ticks(void *opaque, int version_id)
+{
+    TimersState *t = &timers_state;
+    TimersState *snap = opaque;
+
+    seqlock_write_lock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    t->cpu_ticks_offset = snap->cpu_ticks_offset;
+    t->cpu_clock_offset = snap->cpu_clock_offset;
+    seqlock_write_unlock(&t->vm_clock_seqlock, &t->vm_clock_lock);
+    return 0;
+}
+
 /*
  * Subsection for warp timer migration is optional, because may not be created
  */
@@ -221,6 +251,8 @@ static const VMStateDescription vmstate_timers = {
     .name = "timer",
     .version_id = 2,
     .minimum_version_id = 1,
+    .pre_save = cpu_pre_save_ticks,
+    .post_load = cpu_post_load_ticks,
     .fields = (VMStateField[]) {
         VMSTATE_INT64(cpu_ticks_offset, TimersState),
         VMSTATE_UNUSED(8),
@@ -269,9 +301,11 @@ TimersState timers_state;
 /* initialize timers state and the cpu throttle for convenience */
 void cpu_timers_init(void)
 {
+    static TimersState timers_snapshot;
+
     seqlock_init(&timers_state.vm_clock_seqlock);
     qemu_spin_init(&timers_state.vm_clock_lock);
-    vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
+    vmstate_register(NULL, 0, &vmstate_timers, &timers_snapshot);

     cpu_throttle_init();
 }
------------------------------------
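
To make the offset arithmetic in the diff easier to follow, here is a minimal
standalone sketch of the same snapshot idea.  It is not QEMU code: ModelTimers,
the model_* functions, and the fake host counters are all hypothetical names
invented for illustration, locking is omitted, and it assumes that while ticks
are enabled the guest-visible value is "offset + host counter" (and the bare
offset while disabled).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three TimersState fields used in the diff. */
typedef struct {
    int64_t cpu_ticks_offset;
    int64_t cpu_clock_offset;
    bool cpu_ticks_enabled;
} ModelTimers;

/* Fake host counters (assumption: they replace the real host tick/clock reads). */
static int64_t model_host_ticks;
static int64_t model_host_clock;

/* Enabled: guest value = offset + host counter.  Disabled: guest value = offset. */
static int64_t model_get_ticks(const ModelTimers *t)
{
    return t->cpu_ticks_offset + (t->cpu_ticks_enabled ? model_host_ticks : 0);
}

/* Start counting: bias the offsets so the guest value continues from the offset. */
static void model_enable_ticks(ModelTimers *t)
{
    assert(!t->cpu_ticks_enabled);
    t->cpu_ticks_offset -= model_host_ticks;
    t->cpu_clock_offset -= model_host_clock;
    t->cpu_ticks_enabled = true;
}

/* Same branch structure as cpu_pre_save_ticks in the diff, minus the locking:
 * fold the host counters into the snapshot without touching the live state. */
static void model_pre_save(const ModelTimers *live, ModelTimers *snap)
{
    if (live->cpu_ticks_enabled) {
        snap->cpu_ticks_offset = live->cpu_ticks_offset + model_host_ticks;
        snap->cpu_clock_offset = live->cpu_clock_offset + model_host_clock;
    } else {
        snap->cpu_ticks_offset = live->cpu_ticks_offset;
        snap->cpu_clock_offset = live->cpu_clock_offset;
    }
}

/* Copy the snapshot into the live state on the receiving side. */
static void model_post_load(ModelTimers *live, const ModelTimers *snap)
{
    live->cpu_ticks_offset = snap->cpu_ticks_offset;
    live->cpu_clock_offset = snap->cpu_clock_offset;
}

/* Round trip: source with running ticks -> snapshot -> destination host. */
static int64_t model_roundtrip_ticks(void)
{
    ModelTimers src = {0}, snap = {0}, dst = {0};

    model_host_ticks = 1000; model_host_clock = 500;
    model_enable_ticks(&src);
    model_host_ticks = 1700; model_host_clock = 900;  /* guest ran 700 ticks */
    model_pre_save(&src, &snap);                      /* src is not stopped */

    model_host_ticks = 50; model_host_clock = 20;     /* different host counters */
    model_post_load(&dst, &snap);
    model_enable_ticks(&dst);
    return model_get_ticks(&dst);                     /* guest value carried over */
}
```

In this model the destination resumes at the guest tick count the source had
reached when the snapshot was taken, even though the source timers were never
disabled and the two hosts have unrelated counters.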

- Steve

