From: Paolo Bonzini <pbonzini@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>,
Gleb Natapov <gleb@kernel.org>,
Marcelo Tosatti <mtosatti@redhat.com>,
kvm list <kvm@vger.kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: Cleaning up the KVM clock
Date: Mon, 22 Dec 2014 11:15:08 +0100 [thread overview]
Message-ID: <5497EF2C.2070305@redhat.com> (raw)
In-Reply-To: <CALCETrV4Bhc=pwjJR-S7dfgcR7n0xKtj0o=ofOxCfJmnVFcjGw@mail.gmail.com>
On 21/12/2014 04:31, Andy Lutomirski wrote:
> I'm looking at the vdso timing code, and I'm puzzled by the pvclock
> code. My motivation is comprehensibility, performance, and
> correctness.
>
> # for i in `seq 10`; do ./timing_test_64 10 vclock_gettime 0; done
> 10000000 loops in 0.69138s = 69.14 nsec / loop
> 10000000 loops in 0.63614s = 63.61 nsec / loop
> 10000000 loops in 0.63213s = 63.21 nsec / loop
> 10000000 loops in 0.63087s = 63.09 nsec / loop
> 10000000 loops in 0.63079s = 63.08 nsec / loop
> 10000000 loops in 0.63096s = 63.10 nsec / loop
> 10000000 loops in 0.63096s = 63.10 nsec / loop
> 10000000 loops in 0.63062s = 63.06 nsec / loop
> 10000000 loops in 0.63100s = 63.10 nsec / loop
> 10000000 loops in 0.63112s = 63.11 nsec / loop
> bash-4.3# echo tsc
>> /sys/devices/system/clocksource/clocksource0/current_clocksource
> [ 45.957524] Switched to clocksource tsc
> bash-4.3# for i in `seq 10`; do ./timing_test_64 10 vclock_gettime 0;
> done10000000 loops in 0.33583s = 33.58 nsec / loop
> 10000000 loops in 0.28530s = 28.53 nsec / loop
> 10000000 loops in 0.28904s = 28.90 nsec / loop
> 10000000 loops in 0.29001s = 29.00 nsec / loop
> 10000000 loops in 0.28775s = 28.78 nsec / loop
> 10000000 loops in 0.30102s = 30.10 nsec / loop
> 10000000 loops in 0.28006s = 28.01 nsec / loop
> 10000000 loops in 0.28584s = 28.58 nsec / loop
> 10000000 loops in 0.28175s = 28.17 nsec / loop
> 10000000 loops in 0.28724s = 28.72 nsec / loop
>
> The current code is rather slow, especially compared to the tsc variant.
>
> The algorithm used by the pvclock vgetsns implementation is, approximately:
>
> cpu = getcpu;
> pvti = pointer to the relevant paravirt data
> version = pvti->version;
> rdtsc_barrier();
> tsc = rdtsc()
> delta = (tsc - x) * y >> z;
> cycles = delta + w;
> flags = pvti->flags;
> rdtsc_barrier(); <-- totally unnecessary
It's not unnecessary. The first one is a "lock", the second is an
"unlock". You can move the second rdtsc_barrier below the cpu/seqlock
check though.
>
> cpu1 = getcpu;
> if (cpu != cpu1 || the we missed the seqlock)
> retry;
>
> if (!stable)
> bail;
>
> After that, the main vclock_gettime code applies the kernel's regular
> time adjustments.
>
> First, is there any guarantee that, if pvti is marked as stable, that
> the pvti data is consistent across cpus? If so (which would be really
> nice), then we could always use vcpu 0's pvti, which would be a really
> nice cleanup.
I think you cannot because the TSCs might not be perfectly synced up
even if the rates are, but...
> If not, then the current algorithm is buggy. There is no guarantee
> that the tsc stamp we get matches the cpu whose pvti we looked at. We
> could fix that using rdtscp.
... Marcelo will have to answer this.
> I think it's also rather strange that the return value is "cycles"
> instead of nanoseconds. If the guest is using pvclock *and* ntp,
> isn't something very wrong?
It's not cycles. pvclock_get_nsec_offset returns nanoseconds, and
__pvclock_read_cycles does the same. Patches are welcome. :)
Paolo
next prev parent reply other threads:[~2014-12-22 10:15 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-12-21 3:31 Cleaning up the KVM clock Andy Lutomirski
2014-12-22 10:15 ` Paolo Bonzini [this message]
2014-12-22 13:34 ` Marcelo Tosatti
2014-12-22 14:09 ` Marcelo Tosatti
2014-12-22 16:03 ` Andy Lutomirski
2014-12-22 22:47 ` Paolo Bonzini
2014-12-22 22:49 ` Paolo Bonzini
2014-12-22 23:00 ` Andy Lutomirski
2014-12-22 23:14 ` Paolo Bonzini
2014-12-22 23:31 ` Andy Lutomirski
[not found] ` <CAAyOgsbNsDmPGzJcEuYExaswWaQoLHzkMVd+r6r-8jnGc00yTA@mail.gmail.com>
2014-12-23 10:25 ` Paolo Bonzini
2014-12-22 13:38 ` Marcelo Tosatti
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5497EF2C.2070305@redhat.com \
--to=pbonzini@redhat.com \
--cc=gleb@kernel.org \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=mtosatti@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.