From: Gleb Natapov <gleb@redhat.com>
To: Stefan Weil <sw@weilnetz.de>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>,
Jan Kiszka <jan.kiszka@siemens.com>,
qemu-devel@nongnu.org, Luiz Capitulino <lcapitulino@redhat.com>,
Avi Kivity <avi@redhat.com>,
Anthony Liguori <anthony@codemonkey.ws>,
Paolo Bonzini <pbonzini@redhat.com>,
Eric Blake <eblake@redhat.com>
Subject: Re: [Qemu-devel] Rethinking missed tick catchup
Date: Wed, 12 Sep 2012 21:13:12 +0300
Message-ID: <20120912181312.GE25041@redhat.com>
In-Reply-To: <5050C6A0.8050800@weilnetz.de>
On Wed, Sep 12, 2012 at 07:30:08PM +0200, Stefan Weil wrote:
> On 12.09.2012 18:45, Gleb Natapov wrote:
> >On Wed, Sep 12, 2012 at 06:27:14PM +0200, Stefan Weil wrote:
> >>On 12.09.2012 15:54, Anthony Liguori wrote:
> >>>Hi,
> >>>
> >>>We've been running into a lot of problems lately with Windows guests and
> >>>I think they all ultimately could be addressed by revisiting the missed
> >>>tick catchup algorithms that we use. Mike and I spent a while talking
> >>>about it yesterday and I wanted to take the discussion to the list to
> >>>get some additional input.
> >>>
> >>>Here are the problems we're seeing:
> >>>
> >>>1) Rapid reinjection can lead to time moving faster for short bursts of
> >>> time. We've seen a number of RTC watchdog BSoDs and it's possible
> >>> that at least one cause is reinjection speed.
> >>>
> >>>2) When hibernating a host system, the guest is essentially paused
> >>> for a long period of time. This results in a very large tick catchup
> >>> while also resulting in a large skew in guest time.
> >>>
> >>> I've gotten reports of the tick catchup consuming a lot of CPU time
> >>> from rapid delivery of interrupts (although I haven't reproduced this
> >>> yet).
> >>>
> >>>3) Windows appears to have a service that periodically syncs the guest
> >>> time with the hardware clock. I've been told the resync period is an
> >>> hour. For large clock skews, this can compete with reinjection
> >>> resulting in a positive skew in time (the guest can be ahead of the
> >>> host).
> >>Nearly every modern OS (including Windows) uses NTP
> >>or some other protocol to get the time via a TCP/IP network.
> >>
> >The drifts we are talking about will take ages for NTP to fix.
> >
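To put a rough number on "ages": a minimal sketch, assuming the guest clock is only slewed (never stepped) and that the slew rate is capped at 500 ppm. That cap is the customary limit for adjtime()-style correction and is an assumption here, not something stated in this thread. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed maximum slew rate in parts per million (not from the thread). */
#define ASSUMED_MAX_SLEW_PPM 500

/*
 * Seconds of real time needed to slew away offset_s seconds of clock
 * error when the clock may run at most rate_ppm parts per million
 * fast or slow. Integer arithmetic, result truncated toward zero.
 */
static int64_t slew_duration_s(int64_t offset_s, int64_t rate_ppm)
{
    assert(offset_s >= 0 && rate_ppm > 0);
    return offset_s * 1000000 / rate_ppm;
}
```

Under that assumption, slew_duration_s(60, ASSUMED_MAX_SLEW_PPM) is 120000 seconds, roughly 33 hours, to correct a single minute of skew.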
> >>If a guest OS detects a small difference of time, it will usually
> >>accelerate or decelerate the OS clock until the time is
> >>synchronised again.
> >>
> >>Large jumps in network time will make the OS time jump, too.
> >>With a little bad luck, QEMU's reinjection will add the
> >>positive skew, no matter whether the guest is Linux or Windows.
> >>
> >As far as I know, NTP will never make the OS clock jump. The purpose of NTP
> >is to fix time gradually, so apps will not notice. npdate is used to
> >force clock synchronization, but it should be run manually.
>
> s/npdate/ntpdate. Yes, some Linux distros run it at system start,
Yes, typo.
> and it's also usual to call it every hour (poor man's NTP, uses
> less resources).
>
> >
> >>>I've been thinking about an algorithm like this to address these
> >>>problems:
> >>>
> >>>A) Limit the number of interrupts that we reinject to the equivalent of
> >>> a small period of wallclock time. Something like 60 seconds.
> >>>
> >>>B) In the event of (A), trigger a notification in QEMU. This is easy
> >>> for the RTC but harder for the in-kernel PIT. Maybe it's a good time to
> >>> revisit usage of the in-kernel PIT?
> >>>
> >>>C) On accumulated tick overflow, rely on using a qemu-ga command to
> >>> force a resync of the guest's time to the hardware wallclock time.
> >>>
> >>>D) Whenever the guest reads the wallclock time from the RTC, reset all
> >>> accumulated ticks.
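
The bookkeeping behind (A) through (D) could be sketched roughly as below. This is an illustration only, not QEMU's RTC or PIT code: every name is hypothetical, and what happens to the backlog on overflow (here it is clamped to the cap and a resync flag is raised for the caller to act on) is an assumption, since the proposal does not spell it out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* "Something like 60 seconds" from (A). */
#define CATCHUP_LIMIT_SECONDS 60

/* Hypothetical per-timer catchup state. */
typedef struct TickCatchup {
    uint32_t tick_hz;          /* guest timer frequency */
    uint64_t pending;          /* missed ticks still to reinject */
    bool     resync_requested; /* (B)/(C): cap was hit, guest needs a resync */
} TickCatchup;

/* (A): largest backlog we are willing to reinject, in ticks. */
static uint64_t catchup_limit(const TickCatchup *s)
{
    return (uint64_t)s->tick_hz * CATCHUP_LIMIT_SECONDS;
}

/*
 * Account for newly missed ticks. Returns true on overflow, in which
 * case the backlog is clamped to the cap (an assumption) and the
 * resync flag is set so the caller can notify QEMU and ask the guest
 * agent to resync.
 */
static bool catchup_add_missed(TickCatchup *s, uint64_t missed)
{
    uint64_t limit = catchup_limit(s);

    assert(s->pending <= limit);
    if (missed > limit - s->pending) {
        s->pending = limit;
        s->resync_requested = true;
        return true;
    }
    s->pending += missed;
    return false;
}

/* Take one tick from the backlog for reinjection; false if none left. */
static bool catchup_take_tick(TickCatchup *s)
{
    if (s->pending == 0) {
        return false;
    }
    s->pending--;
    return true;
}

/* (D): the guest read the wallclock, so drop the accumulated ticks. */
static void catchup_on_wallclock_read(TickCatchup *s)
{
    s->pending = 0;
}
```

With a 64 Hz timer the cap works out to 3840 ticks, so a host hibernation that misses millions of ticks leaves at most 3840 to reinject and raises the resync flag instead.
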
> >>D) makes no sense, see my comment above.
> >>
> >>Injection of additional timer interrupts should not be needed
> >>after a hibernation. The guest must handle that situation
> >>by reading either the hw clock (which must be updated
> >>by QEMU when it resumes from hibernate) or by using
> >>another time reference (like NTP, for example).
> >>
> >He is talking about host hibernation, not guest.
> >
>
> I also meant host hibernation.
Then I don't see how the guest can handle the situation, since it has
no idea that it was stopped. QEMU has no idea about host hibernation
either.
>
> Maybe the host should tell the guest that it is going to
> hibernate (ACPI event), then the guest can use its
> normal hibernate entry and recovery code, too.
QEMU does not emulate a sleep button, but even if it did, the guest may
ignore it. AFAIK libvirt migrates the VM to a file during host hibernation.
While this does not require guest cooperation, it has timekeeping issues.
--
Gleb.