From: David Hildenbrand <david@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>, qemu-devel@nongnu.org
Cc: Thomas Huth <thuth@redhat.com>,
Janosch Frank <frankja@linux.ibm.com>,
Cornelia Huck <cohuck@redhat.com>,
qemu-s390x@nongnu.org, Richard Henderson <rth@twiddle.net>
Subject: Re: [Qemu-devel] [qemu-s390x] [PATCH v3] s390x/tod: Properly stop the KVM TOD while the guest is not running
Date: Fri, 30 Nov 2018 13:49:45 +0100 [thread overview]
Message-ID: <aea268f6-50e1-1e58-cf0f-477cdaaf22f1@redhat.com> (raw)
In-Reply-To: <946aa813-c854-0c99-8721-e04ed755c6da@de.ibm.com>
On 30.11.18 13:39, Christian Borntraeger wrote:
>
>
> On 30.11.2018 10:49, David Hildenbrand wrote:
>> Just like on other architectures, we should stop the clock while the guest
>> is not running. This is already properly done for TCG. Right now, doing an
>> offline migration (stop, migrate, cont) can easily trigger stalls in the
>> guest.
>>
>> Even doing a
>> (hmp) stop
>> ... wait 2 minutes ...
>> (hmp) cont
>> will already trigger stalls.
>>
>> So whenever the guest stops, backup the KVM TOD. When continuing to run
>> the guest, restore the KVM TOD.
>>
>> One special case is starting a simple VM: Reading the TOD from KVM to
>> stop it right away until the guest is actually started means that the
>> time of any simple VM will already differ to the host time. We can
>> simply leave the TOD running and the guest won't be able to recognize
>> it.
>>
>> For migration, we actually want to keep the TOD stopped until really
>> starting the guest. To be able to catch most errors, we should however
>> try to set the TOD in addition to simply storing it. So we can still
>> catch basic migration problems.
>>
>> If anything goes wrong while backing up/restoring the TOD, we have to
>> ignore it (but print a warning). This is then basically a fallback to
>> old behavior (TOD remains running).
>>
>> I tested this very basically with an initrd:
>> 1. Start a simple VM. Observed that the TOD is kept running. Old
>> behavior.
>> 2. Ordinary live migration. Observed that the TOD is temporarily
>> stopped on the destination when setting the new value and
>> correctly started when finally starting the guest.
>> 3. Offline live migration. (stop, migrate, cont). Observed that the
>> TOD will be stopped on the source with the "stop" command. On the
>> destination, the TOD is temporarily stopped when setting the new
>> value and correctly started when finally starting the guest via
>> "cont".
>> 4. Simple stop/cont correctly stops/starts the TOD. (multiple stops
>> or conts in a row have no effect, so works as expected)
>>
>> In the future, we might want to send the guest a special kind of time sync
>> interrupt under some conditions, so it can synchronize its tod to the
>> host tod. This is interesting for migration scenarios but also when we
>> get time sync interrupts ourselves. This however will most probably have
>> to be handled in KVM (e.g. when the tods differ too much) and is not
>> desired e.g. when debugging the guest. (single stepping should not
>> result in permanent time syncs). I consider something like that an add-on
>> on top of this basic "don't break the guest" handling.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>
>> v2 -> v3:
>> - use device_class_set_parent_realize() to implement a child realize
>> function
>>
>> hw/s390x/tod-kvm.c | 102 ++++++++++++++++++++++++++++++++++++++++-
>> include/hw/s390x/tod.h | 8 +++-
>> 2 files changed, 107 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/s390x/tod-kvm.c b/hw/s390x/tod-kvm.c
>> index df564ab89c..2456bf7b24 100644
>> --- a/hw/s390x/tod-kvm.c
>> +++ b/hw/s390x/tod-kvm.c
>> @@ -10,10 +10,11 @@
>>
>> #include "qemu/osdep.h"
>> #include "qapi/error.h"
>> +#include "sysemu/sysemu.h"
>> #include "hw/s390x/tod.h"
>> #include "kvm_s390x.h"
>>
>> -static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>> +static void kvm_s390_get_tod_raw(S390TOD *tod, Error **errp)
>> {
>> int r;
>>
>> @@ -27,7 +28,17 @@ static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>> }
>> }
>>
>> -static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>> +static void kvm_s390_tod_get(const S390TODState *td, S390TOD *tod, Error **errp)
>> +{
>> + if (td->stopped) {
>> + *tod = td->base;
>> + return;
>> + }
>> +
>> + kvm_s390_get_tod_raw(tod, errp);
>> +}
>> +
>> +static void kvm_s390_set_tod_raw(const S390TOD *tod, Error **errp)
>> {
>> int r;
>>
>> @@ -41,18 +52,105 @@ static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>> }
>> }
>>
>> +static void kvm_s390_tod_set(S390TODState *td, const S390TOD *tod, Error **errp)
>> +{
>> + Error *local_err = NULL;
>> +
>> + /*
>> + * Somebody (e.g. migration) set the TOD. We'll store it into KVM to
>> + * properly detect errors now but take a look at the runstate to decide
>> + * whether really to keep the tod running. E.g. during migration, this
>> + * is the point where we want to stop the initially running TOD to fire
>> + * it back up when actually starting the migrated guest.
>> + */
>> + kvm_s390_set_tod_raw(tod, &local_err);
>> + if (local_err) {
>> + error_propagate(errp, local_err);
>> + return;
>> + }
>> +
>> + if (runstate_is_running()) {
>> + td->stopped = false;
>> + } else {
>> + td->stopped = true;
>> + td->base = *tod;
>> + }
>> +}
>> +
>> +static void kvm_s390_tod_vm_state_change(void *opaque, int running,
>> + RunState state)
>> +{
>> + S390TODState *td = opaque;
>> + Error *local_err = NULL;
>> +
>> + if (running && td->stopped) {
>> + /* Set the old TOD when running the VM - start the TOD clock. */
>> + kvm_s390_set_tod_raw(&td->base, &local_err);
>> + if (local_err) {
>> + warn_report_err(local_err);
>> + }
>> + /* Treat errors like the TOD was running all the time. */
>> + td->stopped = false;
>> + } else if (!running && !td->stopped) {
>> + /* Store the TOD when stopping the VM - stop the TOD clock. */
>> + kvm_s390_get_tod_raw(&td->base, &local_err);
>> + if (local_err) {
>> + /* Keep the TOD running in case we could not back it up. */
>> + warn_report_err(local_err);
>> + } else {
>> + td->stopped = true;
>> + }
>> + }
>> +}
>> +
>> +static void kvm_s390_tod_realize(DeviceState *dev, Error **errp)
>> +{
>> + S390TODState *td = S390_TOD(dev);
>> + S390TODClass *tdc = S390_TOD_GET_CLASS(td);
>> + Error *local_err = NULL;
>> +
>> + tdc->parent_realize(dev, &local_err);
>> + if (local_err) {
>> + error_propagate(errp, local_err);
>> + return;
>> + }
>> +
>> + /*
>> + * We need to know when the VM gets started/stopped to start/stop the TOD.
>> + * As we can never have more than one TOD instance (and that will never be
>> + * removed), registering here and never unregistering is good enough.
>> + */
>> + qemu_add_vm_change_state_handler(kvm_s390_tod_vm_state_change, td);
>> +}
>> +
>> static void kvm_s390_tod_class_init(ObjectClass *oc, void *data)
>> {
>> S390TODClass *tdc = S390_TOD_CLASS(oc);
>>
>> + device_class_set_parent_realize(DEVICE_CLASS(oc), kvm_s390_tod_realize,
>> + &tdc->parent_realize);
>> tdc->get = kvm_s390_tod_get;
>> tdc->set = kvm_s390_tod_set;
>> }
>>
>> +static void kvm_s390_tod_init(Object *obj)
>> +{
>> + S390TODState *td = S390_TOD(obj);
>> +
>> + /*
>> + * The TOD is initially running (value stored in KVM). Avoid needless
>> + * loading/storing of the TOD when starting a simple VM, so let it
>> + * run although the (never started) VM is stopped. For migration, we
>> + * will properly set the TOD later.
>> + */
>> + td->stopped = false;
>
>
> Do we have to migrate the td->stopped during migration?
>
No, this is fully handled using the runstate.
--
Thanks,
David / dhildenb
next prev parent reply other threads:[~2018-11-30 12:50 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-11-30 9:49 [Qemu-devel] [PATCH v3] s390x/tod: Properly stop the KVM TOD while the guest is not running David Hildenbrand
2018-11-30 12:11 ` Cornelia Huck
2018-11-30 12:47 ` David Hildenbrand
2018-11-30 12:39 ` [Qemu-devel] [qemu-s390x] " Christian Borntraeger
2018-11-30 12:49 ` David Hildenbrand [this message]
2018-12-04 8:27 ` [Qemu-devel] " Christian Borntraeger
2018-12-06 9:33 ` David Hildenbrand
2018-12-06 9:47 ` Cornelia Huck
2018-12-04 8:54 ` Thomas Huth
2018-12-06 9:49 ` Cornelia Huck
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aea268f6-50e1-1e58-cf0f-477cdaaf22f1@redhat.com \
--to=david@redhat.com \
--cc=borntraeger@de.ibm.com \
--cc=cohuck@redhat.com \
--cc=frankja@linux.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-s390x@nongnu.org \
--cc=rth@twiddle.net \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).