From: Marcelo Tosatti <mtosatti@redhat.com>
To: Radim Krcmar <rkrcmar@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>, kvm list <kvm@vger.kernel.org>
Subject: Re: What's kvmclock's custom sched_clock for?
Date: Thu, 7 Jan 2016 18:10:45 -0200
Message-ID: <20160107201043.GA18469@amt.cnet>
In-Reply-To: <20160107151810.GA12375@potion.brq.redhat.com>

On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
> 2016-01-07 00:41-0800, Andy Lutomirski:
> > On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
> >> the host suspends. That's all that sched_clock needs, I think.
> >>
> >> So why does kvmclock have a custom sched_clock?
>
> If the host CPU has enough features, then yes, KVM can take care of
> everything and kvmclock has no advantage over TSC, even when migrating
> to a host with a different TSC frequency, as modern CPUs support TSC
> offset + scaling in guests.
>
> The problem is with antiques. Guests on old CPUs need to have more
> information on top of TSC to be able to get useful system time.
> And old KVM doesn't provide good information, so we have legacy layers
> everywhere.
>
> kvmclock in the guest can simply equal rdtsc() on modern CPUs, but we
> still want to use the kvmclock wrapper, because kvmclock can provide a
> stable clock regardless of the underlying TSC (in theory).
>
> >> On a related note, KVM doesn't pass the "invariant TSC" feature
> >> through to guests on my machine even though "invtsc" is set in QEMU
> >> and the kernel host code appears to support it. What gives?
> >
> > I think I solved part of the puzzle. KVM doesn't like to advertise
> > invtsc by default because that breaks migration. (Oddly, the end
> > result seems wrong -- with migration, the TSC doesn't stop, but it's
> > not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> > whatever.)
>
> QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
> of family/model. (CONSTANT_TSC is the same as invariant TSC as KVM
> guests don't have c-states.)
>
> > So the scheduler clock doesn't get marked stable.
>
> Stable sched clock is quite unrelated to TSC features. KVMs from the
> last few years should always give a good enough result to allow stable
> sched clock. We wanted realtime guests, and realtime Linux needs
> nohz_full, which depends on stable sched clock. The result is a huge hack.
>
> We'd need to say that migration creates powerful gravity fields to
> faithfully migrate constant/invariant TSC, but stable sched clock
> doesn't have such strict expectations about time.

Was that supposed to be a joke?

> > Is that it?
> >
> > This still doesn't explain why even explicitly trying to set invtsc
> > doesn't seem to work.
>
> Seems like a bug. My cpuid is
> 0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
> and QEMU says
> warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
>
> I'll see if it's in KVM or QEMU. (We should only forbid migrations to
> hosts with different frequency and without guest TSC scaling.)
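
For reference, the CPUID dump quoted above can be checked by hand against
the bit that QEMU's warning names (CPUID.80000007H:EDX, bit 8). The snippet
below is only a minimal sketch of that decoding; the helper name has_invtsc
and the constant INVTSC_BIT are made up for illustration and are not taken
from KVM or QEMU.

```python
# Hypothetical helper, for illustration only (not KVM or QEMU code).
# Bit position taken from the QEMU warning quoted above:
#   CPUID.80000007H:EDX.invtsc [bit 8]
INVTSC_BIT = 8


def has_invtsc(edx: int) -> bool:
    """Return True if the invariant-TSC bit is set in the given EDX value."""
    return bool((edx >> INVTSC_BIT) & 1)


if __name__ == "__main__":
    # EDX value from the dump quoted above: edx=0x00000100
    print(has_invtsc(0x00000100))
```

With the quoted value edx=0x00000100, bit 8 is set, which is consistent
with the "seems like a bug" reading: the dump reports the bit while QEMU
warns that the host does not support the feature.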