From: Steven Sistare <steven.sistare@oracle.com>
To: "Peter Xu" <peterx@redhat.com>,
"Daniel P. Berrangé" <berrange@redhat.com>
Cc: qemu-devel@nongnu.org, Juan Quintela <quintela@redhat.com>,
Paolo Bonzini <pbonzini@redhat.com>,
Thomas Huth <thuth@redhat.com>, Fabiano Rosas <farosas@suse.de>,
Leonardo Bras <leobras@redhat.com>
Subject: Re: [PATCH V5 02/12] cpus: stop vm in suspended state
Date: Tue, 28 Nov 2023 08:26:08 -0500 [thread overview]
Message-ID: <4699f005-0b3c-45de-8967-0ef90b79f228@oracle.com> (raw)
In-Reply-To: <ZV4qlqxp8QVagcjG@x1n>
On 11/22/2023 11:21 AM, Peter Xu wrote:
> On Wed, Nov 22, 2023 at 09:38:06AM +0000, Daniel P. Berrangé wrote:
>> On Mon, Nov 20, 2023 at 04:44:50PM -0500, Peter Xu wrote:
>>> On Mon, Nov 20, 2023 at 03:55:54PM -0500, Steven Sistare wrote:
>>>> If we drop force, then all calls to vm_stop will completely stop the
>>>> suspended state, eg an hmp "stop" command. This causes two problems.
>>>> First, that is a change in user-visible behavior for something that
>>>> currently works,
>>>
>>> IMHO it depends on what should be the correct behavior. IOW, when VM is in
>>> SUSPENDED state and then the user sends "stop" QMP command, what should we
>>> expect?
>>
>> I would say that from a mgmtm app POV "stop" is initiating a state
>> transition, from RUN_STATE_RUNNING to RUN_STATE_PAUSED and "cont"
>> is doing the reverse from PAUSED to RUNNING.
>>
>> It is a little more complicated than that as there are some other
>> states like INMIGRATE that are conceptually equiv to RUNNING,
>> and states where the transition simply doesn't make sense.
>
> In the qemu impl, INMIGRATE is, imho, more equiv of PAUSED - do_vm_stop()
> mostly ignores every state except RUNNING (putting bdrv operations aside).
> IOW, anything besides "running" is treated as "not running".
>
> But then Paolo fixed that in 1e9981465f ("qmp: handle stop/cont in
> INMIGRATE state"), wiring that to autostart.
>
> Now we seem to find that "suspended" should actually fall within (where
> "vm" is running, but "vcpu" is not), and it seems we should treat "vm" and
> "vcpu" differently.
>
>>
>> So my question is if we're in "SUSPENDED" and someone issues "stop",
>> what state do we go into, and perhaps more importantly what state
>> do we go to in a subsequent "cont".
>
> I think we must stop the "vm", not only the "vcpu". I discussed this bit
> in the other thread more or less: it's because qemu_system_wakeup_request()
> can be called in many places, e.g. acpi_pm_tmr_timer().
>
> It means even after the VM is "stop"ped by the admin QMP (where qmp_stop()
> ignores SUSPENDED, keep the "vm" running), it can silently got waken up
> without admin even noticing it. I'm not sure what Libvirt will behave if
> it suddenly receives a QAPI_EVENT_WAKEUP randomly after a "stop".
>
>>
>> If you say SUSPENDED ---(stop)---> PAUSED ---(cont)---> SUSPENDED
>> then we create a problem, because the decision for the transition
>> out of PAUSED needs memory of the previous state.
>
> Right, that's why I think we at least need one more boolean to remember the
> suspended state, then when we switch from any STOP states into any RUN
> states, we know where to go. Here STOP states I meant anything except
> RUNNING and SUSPENDED, while RUN -> RUNNING or SUSPENDED.
>
>>
>>> My understanding is we should expect to fully stop the VM, including the
>>> ticks, for example. Keeping the ticks running even after QMP "stop"
>>> doesn't sound right, isn't it?
>>
>> The "stop" QMP command is documented as
>>
>> "Stop all guest VCPU execution"
>>
>> the devil is in the detail though, and we've not documented any detail.
>>
>> Whether or not timers keep running across stop/cont I think can be
>> argued to be an impl detail, as long as the headline goal "vcpus
>> don't execute" is satisfied.
>
> "stop" was since qemu v0.14, so I guess we can't blame the missing of
> details or any form of inaccuracy.. Obviously we do more than "stop vCPU
> executions" in the current implementation.
>
> But after we reach a consensus on how we should fix the current suspended
> problem, we may want to update the documentation to start containing more
> information.
>
>>
>>>> vs the migration code where we are fixing brokenness.
>>>
>>> This is not a migration-only bug if above holds, IMO.
>>>
>>>> Second, it does not quite work, because the state becomes
>>>> RUN_STATE_PAUSED, so the suspended state is forgotten, and the hmp "cont"
>>>> will try to set the running state. I could fix that by introducing a new
>>>> state RUN_STATE_SUSPENDED_STOPPED, but again it is a user-visible change
>>>> in existing behavior. (I even implemented that while developing, then I
>>>> realized it was not needed to fix the migration bugs.)
>>>
>>> Good point.
>>
>> We have added new guest states periodically. It is a user visible
>> change, but you could argue that it is implementing a new feature
>> ie the ability to "stop" a "suspended" guest, and so is justified.
>>
>> S3 is so little used in virt, so I'm not surprised we're finding
>> long standing edge cases that have never been thought about before.
>>
>>> Now with above comments, what's your thoughts on whether we should change
>>> the user behavior? My answer is still a yes.
>>>
>>> Maybe SUSPENDED should not be a RunState at all? SUSPENDED is guest visible
>>> behavior, while something like QMP "stop" is not guest visible. Maybe we
>>> should remember it separately?
>>
>> Yes, every time I look at this area I come away thinking that
>> the RunState enum is a mess, overloading too many different
>> concepts onto the same single field.
>>
>> Specifically "SUSPENDED" vs "RUNNING" is a reflection of guest
>> state (ie whether or not the VM is in S3), but pretty much all
>> the others are a reflection of QEMU host state. I kind of feel
>> that SUSPENDED (S3) probably shouldn't have been a RunState at
>> all. I'd probably put guest-panicked into a separate thing too.
>>
>> But we're stuck with what we have.
>
> IMO compatibility is only necessary if at least the existing code is
> running well. But now I see it a major flaw in suspended state and I can't
> see how it can go right if with current code base.. My concern is instead
> that after suspended will be used more (e.g., assuming CPR will rely on it)
> we can have more chance to confuse/oops a mgmt app like Libvirt, like I
> described above.
>
> In summary, I think a current solution to me is only to fix at least
> suspended state for good, by:
>
> - adding vm_suspended boolean to remember machine RUNNING / SUSPENDED
> state. When "cont" we need to switch to either RUNNING / SUSPENDED
> depending on the boolean.
>
> - keeping SUSPENDED state as RunState (for compatibility, otherwise we'll
> need another interface to fetch that boolean anyway), even though not
> query-able during any !RUNNING && !SUSPENDED state.. hopefully not a
> big deal
>
> - enrich ducumentation of qmp_stop/qmp_cont to describe what they really do
>
> - (with suspended working all right...) fix migration of SUSPENDED state
>
> I don't expect a lot of code changes is needed, maybe even less than the
> current series (because we don't need special knob to differenciate
> migration or non-migration callers of do_vm_stop()). IMHO this series is
> already doing some of that but just decided to ignore outside-migration
> states for suspeneded.
>
> We may want to add some test cases though to verify the suspended state
> transitions (maybe easier to put into migration-test with the new ACPI
> guest code), but optional.
FYI, here is a brief update before today's meeting. I have implemented this and
I am testing libvirt and its various save + restore commands, when the guest is
suspended running (RUN_STATE_SUSPENDED), and suspended stopped (RUN_STATE_PAUSED
with vm_was_suspended = true). There are a few failures, and I am still investigating
to see whether they can be fixed in qemu, or need a fix in libvirt.
I will send more details later.
- Steve
next prev parent reply other threads:[~2023-11-28 13:27 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-11-13 18:33 [PATCH V5 00/12] fix migration of suspended runstate Steve Sistare
2023-11-13 18:33 ` [PATCH V5 01/12] cpus: refactor vm_stop Steve Sistare
2023-11-20 13:22 ` Fabiano Rosas
2023-11-20 19:09 ` Steven Sistare
2023-11-20 19:46 ` Peter Xu
2023-11-20 19:49 ` Steven Sistare
2023-11-20 20:01 ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 02/12] cpus: stop vm in suspended state Steve Sistare
2023-11-20 14:15 ` Fabiano Rosas
2023-11-20 19:10 ` Steven Sistare
2023-11-20 19:59 ` Peter Xu
2023-11-20 20:47 ` Fabiano Rosas
2023-11-20 21:26 ` Steven Sistare
2023-11-20 20:55 ` Steven Sistare
2023-11-20 21:44 ` Peter Xu
2023-11-21 21:21 ` Steven Sistare
2023-11-21 22:47 ` Peter Xu
2023-11-22 9:38 ` Daniel P. Berrangé
2023-11-22 16:21 ` Peter Xu
2023-11-28 13:26 ` Steven Sistare [this message]
2023-11-13 18:33 ` [PATCH V5 03/12] cpus: pass runstate to vm_prepare_start Steve Sistare
2023-11-13 18:33 ` [PATCH V5 04/12] cpus: start vm in suspended state Steve Sistare
2023-11-20 17:20 ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 05/12] migration: preserve suspended runstate Steve Sistare
2023-11-20 17:30 ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 06/12] migration: preserve suspended for snapshot Steve Sistare
2023-11-20 18:13 ` Fabiano Rosas
2023-11-20 19:10 ` Steven Sistare
2023-11-13 18:33 ` [PATCH V5 07/12] migration: preserve suspended for bg_migration Steve Sistare
2023-11-20 18:18 ` Fabiano Rosas
2023-11-13 18:33 ` [PATCH V5 08/12] tests/qtest: migration events Steve Sistare
2023-11-13 18:33 ` [PATCH V5 09/12] tests/qtest: option to suspend during migration Steve Sistare
2023-11-13 18:33 ` [PATCH V5 10/12] tests/qtest: precopy migration with suspend Steve Sistare
2023-11-13 18:33 ` [PATCH V5 11/12] tests/qtest: postcopy " Steve Sistare
2023-11-13 18:34 ` [PATCH V5 12/12] tests/qtest: background " Steve Sistare
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=4699f005-0b3c-45de-8967-0ef90b79f228@oracle.com \
--to=steven.sistare@oracle.com \
--cc=berrange@redhat.com \
--cc=farosas@suse.de \
--cc=leobras@redhat.com \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=quintela@redhat.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).