From: Jan Kiszka <jan.kiszka@siemens.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: "David S. Ahern" <daahern@cisco.com>,
Glauber Costa <gcosta@redhat.com>,
kvm-devel <kvm@vger.kernel.org>
Subject: Re: kvm guest loops_per_jiffy miscalibration under host load
Date: Tue, 22 Jul 2008 10:22:00 +0200
Message-ID: <488598A8.8040104@siemens.com>
In-Reply-To: <20080722032510.GB1358@dmt.cnet>
Marcelo Tosatti wrote:
> On Sat, Jul 12, 2008 at 01:28:13PM -0600, David S. Ahern wrote:
>>> All time drift issues we were aware of are fixed in kvm-70. Can you
>>> please provide more details on how you see the time drifting with
>>> RHEL3/4 guests? It slowly but continually drifts or there are large
>>> drifts at once? Are they using TSC or ACPIPM as clocksource?
>> The attached file shows one example of the drift I am seeing. It's for a
>> 4-way RHEL3 guest started with 'nice -20'. After startup each vcpu was
>> pinned to a physical cpu using taskset. The only activity on the host is
>> this one single guest; the guest is relatively idle -- about 4% activity
>> (~1% user, ~3% system time). Host is synchronized to an ntp server; the
>> guest is not. The guest is started with the -localtime parameter. From
>> the file you can see the guest gains about 1-2 seconds every 5 minutes.
>>
>> Since it's a RHEL3 guest I believe the PIT is the only choice (how to
>> confirm?), though it does read the TSC (ie., use_tsc is 1).
>
> Since it's an SMP guest I believe it's using PIT to generate periodic
> timers and ACPI pmtimer as a clock source.
>
>>> Also, most issues we've seen could only be replicated with dyntick
>>> guests.
>>>
>>> I'll try to reproduce it locally.
>>>
>>>> In the course of it I have been launching guests with boosted priority
>>>> (both nice -20 and realtime priority (RR 1)) on a nearly 100% idle
>>>> host.
>>> Can you also see wacked bogomips without boosting the guest priority?
>> The wacked bogomips only shows up when started with real-time priority.
>> With the 'nice -20' it's sane and close to what the host shows.
>>
>> As another data point I restarted the RHEL3 guest using the -no-kvm-pit
>> and -tdf options (nice -20 priority boost). After 22 hours of uptime,
>> the guest is 29 seconds *behind* the host. Using the in-kernel pit the
>> guest time is always fast compared to the host.
>>
>> I've seen similar drifting in RHEL4 guests, but I have not spent as much
>> time investigating it yet. On ESX adding clock=pit to the boot
>> parameters for RHEL4 guests helps immensely.
>
> The problem with clock=pmtmr and clock=tsc on older 2.6 kernels is lost
> tick and irq latency adjustments, as mentioned in the VMware paper
> (http://www.vmware.com/pdf/vmware_timekeeping.pdf). They try to detect
> this and compensate by advancing the clock. But the delay between the
> host timer firing, injection of the guest irq and the actual count read
> (either tsc or pmtimer) fools these adjustments. clock=pit has no such
> lost tick detection, so is susceptible to lost ticks under load (in theory).
>
> The fact that qemu emulation is less susceptible to guest clock running
> faster than it should is because the emulated PIT timer is rearmed
> relative to alarm processing (next_expiration = current_time + count).
> But that also means it is susceptible to host load, ie. the frequency is
> virtual.
>
> The in-kernel PIT rearms relative to host clock, so the frequency is
> more reliable (next_expiration = prev_expiration + count).
The same happens under plain QEMU:

static void pit_irq_timer_update(PITChannelState *s, int64_t current_time);

static void pit_irq_timer(void *opaque)
{
    PITChannelState *s = opaque;

    pit_irq_timer_update(s, s->next_transition_time);
}
In my experience, QEMU's PIT suffers from lost ticks under load
(when some delay gets larger than 2*period).
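The two rearm formulas quoted above can be compared with a small model. This is only a sketch, not QEMU or KVM code: ticks_delivered(), rearm_policy and the constant per-tick handling latency are assumptions made for illustration, and the model does not cover the lost-tick case where a delay exceeds the period.

```c
#include <stdint.h>

/* Hypothetical names, chosen for this illustration only. */
typedef enum {
    REARM_FROM_NOW,   /* next_expiration = current_time + count    */
    REARM_FROM_PREV   /* next_expiration = prev_expiration + count */
} rearm_policy;

/*
 * Count how many ticks are handled before `duration` when every timer
 * callback runs `latency` time units after its programmed expiration.
 * Assumes 0 <= latency < period.
 */
static int64_t ticks_delivered(rearm_policy policy, int64_t period,
                               int64_t latency, int64_t duration)
{
    int64_t next_expiration = period;
    int64_t ticks = 0;

    for (;;) {
        int64_t handled_at = next_expiration + latency;

        if (handled_at >= duration)
            break;
        ticks++;

        if (policy == REARM_FROM_NOW)
            next_expiration = handled_at + period;  /* latency accumulates */
        else
            next_expiration += period;              /* latency does not accumulate */
    }
    return ticks;
}
```

With a period of 1000, a latency of 100 and a duration of 1000000 time units, this model delivers 999 ticks for REARM_FROM_PREV but only 909 for REARM_FROM_NOW, which is the "virtual frequency" effect described in the quoted text.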
I recently played a bit with QEMU's new icount feature. That one tracks
the guest progress based on a virtual instruction pointer, derives
QEMU's virtual clock from it, but also tries to keep that clock in sync
with the host by periodically adjusting its scaling factor (kind of
virtual CPU frequency tuning to keep the TSC in sync with real time).
Works quite nicely, but my feeling is that the adjustment is not 100%
stable yet.
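The idea of deriving a virtual clock from an instruction count and periodically retuning its scaling factor might look roughly like the following. This is a simplified sketch under stated assumptions, not QEMU's actual icount code: struct vclock, vclock_now(), vclock_adjust() and VCLOCK_MAX_SHIFT are hypothetical, and the scaling factor is assumed to be a power of two (one instruction counts as 2^shift ns).

```c
#include <stdint.h>

#define VCLOCK_MAX_SHIFT 10   /* assumed upper bound for the scale */

/* Hypothetical state for an instruction-count-driven virtual clock. */
struct vclock {
    int64_t base_ns;     /* virtual time at the last adjustment */
    int64_t base_insns;  /* instruction count at the last adjustment */
    int shift;           /* one instruction advances the clock by 2^shift ns */
};

/* Virtual time for a given instruction count (insns >= base_insns). */
static int64_t vclock_now(const struct vclock *c, int64_t insns)
{
    return c->base_ns + (insns - c->base_insns) * ((int64_t)1 << c->shift);
}

/*
 * Periodic adjustment: rebase the clock at the current point so it stays
 * monotonic, then slow it down if it runs ahead of the host or speed it
 * up if it lags behind.
 */
static void vclock_adjust(struct vclock *c, int64_t insns, int64_t host_ns)
{
    int64_t virt_ns = vclock_now(c, insns);

    c->base_ns = virt_ns;
    c->base_insns = insns;

    if (virt_ns > host_ns && c->shift > 0)
        c->shift--;
    else if (virt_ns < host_ns && c->shift < VCLOCK_MAX_SHIFT)
        c->shift++;
}
```

Because the scale in this sketch only moves in factor-of-two steps, the virtual clock tends to oscillate around host time rather than settle on it.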
Maybe such a pattern could be applied to kvm as well, with tsc_vmexit -
tsc_vmentry serving as "guest progress counter" (instead of icount, which
depends on QEMU's code translator).
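As a rough illustration of such a progress counter, the TSC delta between each entry and exit could simply be accumulated. Everything here is hypothetical (struct guest_progress, on_vmentry(), on_vmexit()); it is not an existing KVM interface, and the TSC values are passed in by the caller rather than read from hardware.

```c
#include <stdint.h>

/* Hypothetical per-vcpu accounting of cycles spent in guest mode. */
struct guest_progress {
    uint64_t entry_tsc;  /* TSC sampled at the most recent guest entry */
    uint64_t cycles;     /* sum of (tsc_vmexit - tsc_vmentry) so far */
};

/* Record the TSC value sampled just before entering the guest. */
static void on_vmentry(struct guest_progress *p, uint64_t tsc)
{
    p->entry_tsc = tsc;
}

/* Add the cycles the guest ran since the matching entry. */
static void on_vmexit(struct guest_progress *p, uint64_t tsc)
{
    p->cycles += tsc - p->entry_tsc;
}
```

The accumulated cycles value would then play the role that the instruction count plays for icount, advancing only while the guest actually runs.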
Jan
--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux