From: Marcelo Tosatti <mtosatti@redhat.com>
To: Radim Krcmar <rkrcmar@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>, kvm list <kvm@vger.kernel.org>
Subject: Re: What's kvmclock's custom sched_clock for?
Date: Thu, 7 Jan 2016 18:10:45 -0200
Message-ID: <20160107201043.GA18469@amt.cnet>
In-Reply-To: <20160107151810.GA12375@potion.brq.redhat.com>

On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
> 2016-01-07 00:41-0800, Andy Lutomirski:
> > On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
> >> the host suspends.  That's all that sched_clock needs, I think.
> >>
> >> So why does kvmclock have a custom sched_clock?
> 
> If the host CPU has enough features, then yes, KVM can take care of
> everything and kvmclock has no advantage over TSC, even when migrating
> to a host with a different TSC frequency, as modern CPUs support TSC
> offset + scaling in guests.
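
A minimal sketch of what "TSC offset + scaling" amounts to arithmetically.
It assumes a fixed-point multiplier with 48 fractional bits (the layout used
by Intel VMX; AMD SVM uses a different fixed-point format). The function and
parameter names are hypothetical and are not taken from KVM.

```python
MASK64 = (1 << 64) - 1
FRAC_BITS = 48  # assumption: 48 fractional bits, as in the Intel VMX TSC multiplier


def scaling_ratio(guest_khz, host_khz):
    """Fixed-point ratio guest_khz / host_khz with FRAC_BITS fractional bits."""
    return (guest_khz << FRAC_BITS) // host_khz


def guest_tsc(host_tsc, multiplier, offset):
    """Value a guest RDTSC would observe: scale the host TSC, then add the offset."""
    scaled = (host_tsc * multiplier) >> FRAC_BITS
    return (scaled + offset) & MASK64  # wrap to 64 bits like the hardware counter


if __name__ == "__main__":
    # Guest was started on a 2 GHz host and now runs on a 1 GHz host.
    ratio = scaling_ratio(guest_khz=2_000_000, host_khz=1_000_000)
    print(guest_tsc(host_tsc=100, multiplier=ratio, offset=-50))
```

With both knobs available, the destination host can make the guest-visible
TSC continue at the original frequency and from the original value.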
> 
> The problem is with antiques.  Guests on old CPUs need to have more
> information on top of TSC to be able to get useful system time.
> And old KVM doesn't provide good information, so we have legacy layers
> everywhere.
> 
> kvmclock in the guest can simply equal rdtsc() on modern CPUs, but we
> still want to use the kvmclock wrapper, because kvmclock can provide a
> stable clock regardless of the underlying TSC (in theory).
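
For context, a rough sketch of the computation the kvmclock wrapper performs
on top of the raw TSC. The field names follow the pvclock ABI structure
(pvclock_vcpu_time_info); the Python class and function names are
illustrative only, and the version-counter retry loop and flags handling are
omitted.

```python
from dataclasses import dataclass


@dataclass
class PvclockInfo:
    tsc_timestamp: int      # guest TSC value when the host last updated the record
    system_time: int        # nanoseconds at that same moment
    tsc_to_system_mul: int  # 32.32 fixed-point multiplier (TSC ticks -> ns)
    tsc_shift: int          # signed shift applied to the TSC delta first


def pvclock_read_ns(info, tsc):
    """Convert a raw TSC reading to nanoseconds using the host-provided parameters."""
    delta = tsc - info.tsc_timestamp
    if info.tsc_shift >= 0:
        delta <<= info.tsc_shift
    else:
        delta >>= -info.tsc_shift
    return info.system_time + ((delta * info.tsc_to_system_mul) >> 32)


if __name__ == "__main__":
    # Parameters chosen so that one TSC tick equals one nanosecond (1 GHz TSC).
    info = PvclockInfo(tsc_timestamp=1000, system_time=5000,
                       tsc_to_system_mul=1 << 31, tsc_shift=1)
    print(pvclock_read_ns(info, tsc=1500))
```

Because the host rewrites these parameters whenever the underlying TSC
changes, the guest can keep one code path regardless of host capabilities.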
> 
> >> On a related note, KVM doesn't pass the "invariant TSC" feature
> >> through to guests on my machine even though "invtsc" is set in QEMU
> >> and the kernel host code appears to support it.  What gives?
> > 
> > I think I solved part of the puzzle.  KVM doesn't like to advertise
> > invtsc by default because that breaks migration.  (Oddly, the end
> > result seems wrong -- with migration, the TSC doesn't stop, but it's
> > not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> > whatever.)
> 
> QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
> of family/model.  (CONSTANT_TSC is the same as invariant TSC as KVM
> guests don't have c-states.)
> 
> >             So the scheduler clock doesn't get marked stable.
> 
> Stable sched clock is quite unrelated to TSC features.  KVMs from the
> last few years should always give a good enough result to allow a stable
> sched clock.  We wanted realtime guests, and realtime Linux needs
> nohz_full, which depends on a stable sched clock.  The result is a huge hack.
> 
> We'd need to say that migration creates powerful gravity fields to
> faithfully migrate constant/invariant TSC, but stable sched clock
> doesn't have such strict expectations about time.

Was that supposed to be a joke? 

> > Is that it?
> > 
> > This still doesn't explain why even explicitly trying to set invtsc
> > doesn't seem to work.
> 
> Seems like a bug.  My cpuid is
>    0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
> and QEMU says
>   warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
> 
> I'll see if it's in KVM or QEMU.  (We should only forbid migrations to
> hosts with different frequency and without guest TSC scaling.)
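
The CPUID output quoted above can be checked by hand: per the QEMU warning,
invtsc is bit 8 of EDX in leaf 0x80000007. A small sketch that decodes an EDX
value (the helper name is hypothetical, and the code does not execute CPUID
itself):

```python
INVTSC_BIT = 8  # CPUID.80000007H:EDX bit 8, as named in the QEMU warning


def has_invtsc(edx):
    """Return True if the invariant-TSC bit is set in the given EDX value."""
    return bool((edx >> INVTSC_BIT) & 1)


if __name__ == "__main__":
    # EDX value from the host CPUID dump quoted in the message.
    print(has_invtsc(0x00000100))
```

For the quoted host value edx=0x00000100 the bit is set, which is consistent
with the report that the host has the feature while QEMU claims otherwise.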
