kvm.vger.kernel.org archive mirror
From: Radim Krcmar <rkrcmar@redhat.com>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Marcelo Tosatti <mtosatti@redhat.com>, kvm list <kvm@vger.kernel.org>
Subject: Re: What's kvmclock's custom sched_clock for?
Date: Thu, 7 Jan 2016 16:18:11 +0100
Message-ID: <20160107151810.GA12375@potion.brq.redhat.com>
In-Reply-To: <CALCETrV2CSRntNHD79uq=vJjTOyN_pqvzrt+K7Hx=K7rdNf1CQ@mail.gmail.com>

2016-01-07 00:41-0800, Andy Lutomirski:
> On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
>> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
>> the host suspends.  That's all that sched_clock needs, I think.
>>
>> So why does kvmclock have a custom sched_clock?

If the host CPU has enough features, then yes, KVM can take care of
everything and kvmclock has no advantage over TSC, even when migrating
to a host with a different TSC frequency, as modern CPUs support TSC
offset + scaling in guests.

The problem is with antiques.  Guests on old CPUs need to have more
information on top of TSC to be able to get useful system time.
And old KVM doesn't provide good information, so we have legacy layers
everywhere.

kvmclock in the guest can simply equal rdtsc() with modern CPUs, but we
still want to use the kvmclock wrapper, because kvmclock can provide a
stable clock regardless of the underlying TSC (in theory).
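
A minimal sketch of what "wrapper around the TSC" means here, assuming the
usual pvclock-style conversion (a base timestamp plus a shifted and scaled
TSC delta).  The struct name `pvclock_sample` and the function `pvclock_ns`
are hypothetical; the field names follow the pvclock ABI as commonly
documented, and the version/retry loop that a real reader needs is omitted.

```c
#include <stdint.h>

/* Hypothetical, trimmed-down view of the per-vCPU time data the host
 * publishes; only the fields needed for the conversion are kept. */
struct pvclock_sample {
    uint64_t tsc_timestamp;      /* guest TSC when system_time was sampled */
    uint64_t system_time;        /* nanoseconds at tsc_timestamp */
    uint32_t tsc_to_system_mul;  /* 32.32 fixed-point multiplier */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
};

/* Convert a raw TSC reading to nanoseconds:
 *   ns = system_time + ((tsc - tsc_timestamp) << shift) * mul >> 32
 * The 64x32-bit multiply is split so no 128-bit type is required. */
static uint64_t pvclock_ns(const struct pvclock_sample *s, uint64_t tsc)
{
    uint64_t delta = tsc - s->tsc_timestamp;

    if (s->tsc_shift >= 0)
        delta <<= s->tsc_shift;
    else
        delta >>= -s->tsc_shift;

    uint64_t hi = (delta >> 32) * s->tsc_to_system_mul;
    uint64_t lo = (delta & 0xffffffffu) * s->tsc_to_system_mul;

    return s->system_time + hi + (lo >> 32);
}
```

With a shift of 1 and a multiplier of 0x80000000 (0.5 in 32.32 fixed point)
the scaled delta equals the raw TSC delta, which is the "kvmclock just
equals rdtsc()" case up to the base offset.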

>> On a related note, KVM doesn't pass the "invariant TSC" feature
>> through to guests on my machine even though "invtsc" is set in QEMU
>> and the kernel host code appears to support it.  What gives?
> 
> I think I solved part of the puzzle.  KVM doesn't like to advertise
> invtsc by default because that breaks migration.  (Oddly, the end
> result seems wrong -- with migration, the TSC doesn't stop, but it's
> not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> whatever.)

QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
of family/model.  (CONSTANT_TSC is the same as invariant TSC as KVM
guests don't have c-states.)

>             So the scheduler clock doesn't get marked stable.

Stable sched clock is quite unrelated to TSC features.  KVMs from the
last few years should always give good enough results to allow a stable
sched clock.  We wanted realtime guests, and realtime Linux needs
nohz_full, which depends on a stable sched clock.  The result is a huge
hack.

We'd need to say that migration creates powerful gravity fields to
faithfully migrate constant/invariant TSC, but stable sched clock
doesn't have such strict expectations about time.

> Is that it?
> 
> This still doesn't explain why even explicitly trying to set invtsc
> doesn't seem to work.

Seems like a bug.  My cpuid is
   0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
and QEMU says
  warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]

I'll see if it's in KVM or QEMU.  (We should only forbid migrations to
hosts with a different frequency and without guest TSC scaling.)
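
For reference, a small sketch of the two checks discussed above: decoding
the invtsc bit (CPUID.80000007H:EDX bit 8, as named in the QEMU warning)
and the migration rule from the parenthetical.  The helper names and the
kHz-based parameters are hypothetical and only illustrate the logic; this
is not KVM or QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID.80000007H:EDX bit 8 advertises invariant TSC. */
#define CPUID_8000_0007_EDX_INVTSC (1u << 8)

/* True if the given EDX value from leaf 0x80000007 has invtsc set. */
static bool has_invtsc(uint32_t edx)
{
    return (edx & CPUID_8000_0007_EDX_INVTSC) != 0;
}

/* Hypothetical policy helper: a migration keeps the guest-visible TSC
 * rate if the destination runs at the same frequency or can scale the
 * guest TSC in hardware.  Only the remaining case would be forbidden. */
static bool migration_keeps_tsc_rate(uint64_t src_khz, uint64_t dst_khz,
                                     bool dst_has_tsc_scaling)
{
    return src_khz == dst_khz || dst_has_tsc_scaling;
}
```

The EDX value 0x00000100 from the cpuid dump above has exactly bit 8 set,
so `has_invtsc()` returns true for it.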

