From: Steven Sistare <steven.sistare@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: qemu-devel@nongnu.org, "Juan Quintela" <quintela@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
"Thomas Huth" <thuth@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>
Subject: Re: [PATCH V1 2/3] migration: fix suspended runstate
Date: Thu, 24 Aug 2023 16:53:13 -0400 [thread overview]
Message-ID: <b64bae85-1665-2ea5-3690-12cd5e67620b@oracle.com> (raw)
In-Reply-To: <ZN5kmZdvc1Q1446r@x1n>
On 8/17/2023 2:19 PM, Peter Xu wrote:
> On Wed, Aug 16, 2023 at 01:48:13PM -0400, Steven Sistare wrote:
>> On 8/14/2023 3:37 PM, Peter Xu wrote:
>>> On Mon, Aug 14, 2023 at 02:53:56PM -0400, Steven Sistare wrote:
>>>>> Can we just call vm_state_notify() earlier?
>>>>
>>>> We cannot. The guest is not running yet, and will not be until later.
>>>> We cannot call notifiers that perform actions that complete, or react to,
>>>> the guest entering a running state.
>>>
>>> I tried to look at a few examples of the notifees and most of them I read
>>> do not react to "vcpu running" but "vm running" (in which case I think
>>> "suspended" mode falls into "vm running" case); most of them won't care on
>>> the RunState parameter passed in, but only the bool "running".
>>>
>>> In reality, when running=true, it must be RUNNING so far.
>>>
>>> In that case does it mean we should notify right after the switchover,
>>> since after migration the vm is indeed running only if the vcpus are not
>>> during suspend?
>>
>> I cannot parse your question, but maybe this answers it.
>> If the outgoing VM is running and not suspended, then the incoming side
>> tests for autostart==true and calls vm_start, which calls the notifiers,
>> right after the switchover.
>
> I meant IMHO SUSPENDED should be seen as "vm running" case to me, just like
> RUNNING. Then, we should invoke vm_prepare_start(), just need some touch
> ups.
>
>>
>>> One example (of possible issue) is vfio_vmstate_change(), where iiuc if we
>>> try to suspend a VM it should keep to be VFIO_DEVICE_STATE_RUNNING for that
>>> device; this kind of prove to me that SUSPEND is actually one of
>>> running=true states.
>>>
>>> If we postpone all notifiers here even after we switched over to dest qemu
>>> to the next upcoming suspend wakeup, I think it means these devices will
>>> not be in VFIO_DEVICE_STATE_RUNNING after switchover but perhaps
>>> VFIO_DEVICE_STATE_STOP.
>>
>> or VFIO_DEVICE_STATE_RESUMING, which is set in vfio_load_setup.
>> AFAIK it is OK to remain in that state until wakeup is called later.
>
> So let me provide another clue of why I think we should call
> vm_prepare_start()..
>
> Firstly, I think RESUME event should always be there right after we
> switched over, no matter suspeneded or not. I just noticed that your test
> case would work because you put "wakeup" to be before RESUME. I'm not sure
> whether that's correct. I'd bet people could rely on that RESUME to
> identify the switchover.
Yes, possibly.
> More importantly, I'm wondering whether RTC should still be running during
> the suspended mode? Sorry again if my knowledge over there is just
> limited, so correct me otherwise - but my understanding is during suspend
> mode (s1 or s3, frankly I can't tell which one this belongs..), rtc should
> still be running along with the system clock. It means we _should_ at
> least call cpu_enable_ticks() to enable rtc:
Agreed. The comment above cpu_get_ticks says:
* return the time elapsed in VM between vm_start and vm_stop
Suspending a guest does not call vm_stop, so ticks keeps running.
I also verified that experimentally.
> /*
> * enable cpu_get_ticks()
> * Caller must hold BQL which serves as mutex for vm_clock_seqlock.
> */
> void cpu_enable_ticks(void)
>
> I think that'll enable cpu_get_tsc() and make it start to work right.
>
>>
>>> Ideally I think we should here call vm_state_notify() with running=true and
>>> state=SUSPEND, but since I do see some hooks are not well prepared for
>>> SUSPEND over running=true, I'd think we should on the safe side call
>>> vm_state_notify(running=true, state=RUNNING) even for SUSPEND at switch
>>> over phase. With that IIUC it'll naturally work (e.g. when wakeup again
>>> later we just need to call no notifiers).
>>
>> Notifiers are just one piece, all the code in vm_prepare_start must be called.
>> Is it correct to call all of that long before we actually resume the CPUs in
>> wakeup? I don't know, but what is the point?
>
> The point is not only for cleaness (again, I really, really don't like that
> new global.. sorry), but also now I think we should make the vm running.
I do believe it is safe to call vm_prepare_start immediately, since the vcpus
are never running when it is called.
>> The wakeup code still needs
>> modification to conditionally resume the vcpus. The scheme would be roughly:
>>
>> loadvm_postcopy_handle_run_bh()
>> runstat = global_state_get_runstate();
>> if (runstate == RUN_STATE_RUNNING) {
>> vm_start()
>> } else if (runstate == RUN_STATE_SUSPENDED)
>> vm_prepare_start(); // the start of vm_start()
>> }
>>
>> qemu_system_wakeup_request()
>> if (some condition)
>> resume_all_vcpus(); // the remainder of vm_start()
>> else
>> runstate_set(RUN_STATE_RUNNING)
>
> No it doesn't. wakeup_reason is set there, main loop does the resuming.
> See:
>
> if (qemu_wakeup_requested()) {
> pause_all_vcpus();
> qemu_system_wakeup();
> notifier_list_notify(&wakeup_notifiers, &wakeup_reason);
> wakeup_reason = QEMU_WAKEUP_REASON_NONE;
> resume_all_vcpus();
> qapi_event_send_wakeup();
> }
Agreed, we can rely on that main loop code to call resume_all_vcpus, and not
modify qemu_system_wakeup_request. That is cleaner.
>> How is that better than my patches
>> [PATCH V3 01/10] vl: start on wakeup request
>> [PATCH V3 02/10] migration: preserve suspended runstate
>>
>> loadvm_postcopy_handle_run_bh()
>> runstate = global_state_get_runstate();
>> if (runstate == RUN_STATE_RUNNING)
>> vm_start()
>> else
>> runstate_set(runstate); // eg RUN_STATE_SUSPENDED
>>
>> qemu_system_wakeup_request()
>> if (!vm_started)
>> vm_start();
>> else
>> runstate_set(RUN_STATE_RUNNING);
>>
>> Recall this thread started with your comment "It then can avoid touching the
>> system wakeup code which seems cleaner". We still need to touch the wakeup
>> code.
>
> Let me provide some code and reply to your new patchset inlines.
Thank you, I have more comments there.
- Steve
next prev parent reply other threads:[~2023-08-24 20:53 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-06-15 20:26 [PATCH V1 0/3] fix migration of suspended runstate Steve Sistare
2023-06-15 20:26 ` [PATCH V1 1/3] vl: start on wakeup request Steve Sistare
2023-06-15 20:26 ` [PATCH V1 2/3] migration: fix suspended runstate Steve Sistare
2023-06-20 21:46 ` Peter Xu
2023-06-21 19:15 ` Steven Sistare
2023-06-21 20:28 ` Peter Xu
2023-06-23 18:25 ` Steven Sistare
2023-06-23 19:56 ` Steven Sistare
2023-06-26 18:27 ` Peter Xu
2023-06-26 19:11 ` Peter Xu
2023-06-30 13:50 ` Steven Sistare
2023-07-26 20:18 ` Peter Xu
2023-08-14 18:53 ` Steven Sistare
2023-08-14 19:37 ` Peter Xu
2023-08-16 17:48 ` Steven Sistare
2023-08-17 18:19 ` Peter Xu
2023-08-24 20:53 ` Steven Sistare [this message]
2023-06-15 20:26 ` [PATCH V1 3/3] tests/qtest: live migration suspended state Steve Sistare
2023-06-21 16:45 ` Peter Xu
2023-06-21 19:39 ` Steven Sistare
2023-06-21 20:00 ` Peter Xu
2023-06-22 21:28 ` Steven Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b64bae85-1665-2ea5-3690-12cd5e67620b@oracle.com \
--to=steven.sistare@oracle.com \
--cc=berrange@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).