From: Cornelia Huck <cohuck@redhat.com>
To: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: David Hildenbrand <david@redhat.com>,
qemu-devel@nongnu.org, qemu-s390x@nongnu.org,
Thomas Huth <thuth@redhat.com>,
Janosch Frank <frankja@linux.ibm.com>,
Richard Henderson <rth@twiddle.net>,
Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Subject: Re: [Qemu-devel] [PATCH v3] s390x/tod: Properly stop the KVM TOD while the guest is not running
Date: Thu, 6 Dec 2018 10:47:17 +0100
Message-ID: <20181206104717.62189f59.cohuck@redhat.com>
In-Reply-To: <b1bd36ca-f0e0-4973-86b8-6dc8b2c0e4b5@de.ibm.com>
On Tue, 4 Dec 2018 09:27:21 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> On 30.11.2018 10:49, David Hildenbrand wrote:
> > Just like on other architectures, we should stop the clock while the guest
> > is not running. This is already properly done for TCG. Right now, doing an
> > offline migration (stop, migrate, cont) can easily trigger stalls in the
> > guest.
> >
> > Even doing a
> >   (hmp) stop
> >   ... wait 2 minutes ...
> >   (hmp) cont
> > will already trigger stalls.
> >
> > So whenever the guest stops, back up the KVM TOD. When continuing to run
> > the guest, restore the KVM TOD.
> >
> > One special case is starting a simple VM: Reading the TOD from KVM to
> > stop it right away until the guest is actually started means that the
> > time of any simple VM will already differ to the host time. We can
> > simply leave the TOD running and the guest won't be able to recognize
> > it.
> >
> > For migration, we actually want to keep the TOD stopped until really
> > starting the guest. To be able to catch most errors, we should however
> > try to set the TOD in addition to simply storing it. So we can still
> > catch basic migration problems.
> >
> > If anything goes wrong while backing up/restoring the TOD, we have to
> > ignore it (but print a warning). This is then basically a fallback to
> > old behavior (TOD remains running).
> >
> > I tested this very basically with an initrd:
> > 1. Start a simple VM. Observed that the TOD is kept running. Old
> > behavior.
> > 2. Ordinary live migration. Observed that the TOD is temporarily
> > stopped on the destination when setting the new value and
> > correctly started when finally starting the guest.
> > 3. Offline live migration. (stop, migrate, cont). Observed that the
> > TOD will be stopped on the source with the "stop" command. On the
> > destination, the TOD is temporarily stopped when setting the new
> > value and correctly started when finally starting the guest via
> > "cont".
> > 4. Simple stop/cont correctly stops/starts the TOD. (multiple stops
> > or conts in a row have no effect, so works as expected)
> >
> > In the future, we might want to send the guest a special kind of time sync
> > interrupt under some conditions, so it can synchronize its tod to the
> > host tod. This is interesting for migration scenarios but also when we
> > get time sync interrupts ourselves. This however will most probably have
> > to be handled in KVM (e.g. when the tods differ too much) and is not
> > desired e.g. when debugging the guest. (single stepping should not
> > result in permanent time syncs). I consider something like that an add-on
> > on top of this basic "don't break the guest" handling.
> >
> > Signed-off-by: David Hildenbrand <david@redhat.com>
>
>
> Long term, we should really work on getting the guest back in sync with the host
> TOD (e.g. on migration), since there are some advanced mechanisms that rely on all
> clocks being in sync. For example, the DASD I/O will also write time stamps,
> and in an STP complex (synced time across CECs) this can be useful for "classic"
> mainframe databases and ordering.
I think so. It sounds like a bigger effort, though.
> It is probably the right thing to do as of today as on migration we are also out
> of sync.
Nod.
>
> Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
>
> Adding Viktor in case he has concerns.
I'll go ahead and queue this now, so I don't forget about it (I plan to
send a pull request as soon as 4.0 is out.)
We can still do further changes on top.
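
For readers skimming the thread, the stop/cont behaviour described in the
quoted commit message can be sketched as a toy model. This is not the QEMU
or KVM code: the class name GuestTod, the injected host_clock callable and
the integer time units are all assumptions made for illustration. It only
shows the idea of backing up the TOD on "stop" and restoring it on "cont",
with repeated stops or conts having no effect, and with a freshly started
guest simply left running.

```python
class GuestTod:
    """Toy model of a guest TOD clock that is frozen while the guest is stopped.

    Hypothetical illustration only; none of these names exist in QEMU.
    """

    def __init__(self, host_clock):
        self._host_clock = host_clock  # callable returning the host time
        self._offset = 0               # guest TOD = host time + offset while running
        self._saved = None             # backed-up TOD while stopped, else None

    def read(self):
        # While stopped, the guest-visible TOD is the backed-up value.
        if self._saved is not None:
            return self._saved
        return self._host_clock() + self._offset

    def stop(self):
        # Back up the TOD once; further stops in a row change nothing.
        if self._saved is None:
            self._saved = self._host_clock() + self._offset

    def cont(self):
        # Restore the backed-up TOD; a cont without a prior stop changes nothing.
        if self._saved is not None:
            self._offset = self._saved - self._host_clock()
            self._saved = None
```

With a fake host clock, stopping at host time 110, waiting until 230 and
continuing leaves the guest TOD at 110, after which it advances again at the
host's rate.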