From: Marcelo Tosatti <mtosatti@redhat.com>
To: Radim Krcmar <rkrcmar@redhat.com>
Cc: Andy Lutomirski <luto@amacapital.net>, kvm list <kvm@vger.kernel.org>
Subject: Re: What's kvmclock's custom sched_clock for?
Date: Thu, 7 Jan 2016 18:10:45 -0200
Message-ID: <20160107201043.GA18469@amt.cnet>
In-Reply-To: <20160107151810.GA12375@potion.brq.redhat.com>

On Thu, Jan 07, 2016 at 04:18:11PM +0100, Radim Krcmar wrote:
> 2016-01-07 00:41-0800, Andy Lutomirski:
> > On Wed, Jan 6, 2016 at 11:18 PM, Andy Lutomirski <luto@amacapital.net> wrote:
> >> AFAICT KVM reliably passes a monotonic TSC through to guests, even if
> >> the host suspends. That's all that sched_clock needs, I think.
> >>
> >> So why does kvmclock have a custom sched_clock?
>
> If the host CPU has enough features, then yes, KVM can take care of
> everything and kvmclock has no advantage over TSC, even when migrating
> to a TSC with a different frequency, as modern CPUs support TSC offset +
> scaling in guests.
>
> The problem is with antiques. Guests on old CPUs need to have more
> information on top of TSC to be able to get useful system time.
> And old KVM doesn't provide good information, so we have legacy layers
> everywhere.
>
> kvmclock in the guest can simply equal rdtsc() on modern CPUs, but we
> still want to use the kvmclock wrapper, because kvmclock can provide a
> stable clock regardless of the underlying TSC (in theory).
>
> >> On a related note, KVM doesn't pass the "invariant TSC" feature
> >> through to guests on my machine even though "invtsc" is set in QEMU
> >> and the kernel host code appears to support it. What gives?
> >
> > I think I solved part of the puzzle. KVM doesn't like to advertise
> > invtsc by default because that breaks migration. (Oddly, the end
> > result seems wrong -- with migration, the TSC doesn't stop, but it's
> > not constant, and X86_FEATURE_CONSTANT_TSC is nonetheless set, but
> > whatever.)
>
> QEMU probably missed that because X86_FEATURE_CONSTANT_TSC is a function
> of family/model. (CONSTANT_TSC is the same as invariant TSC as KVM
> guests don't have c-states.)
>
> > So the scheduler clock doesn't get marked stable.
>
> Stable sched clock is quite unrelated to TSC features. KVMs from the last
> few years should always give a good enough result to allow a stable sched
> clock. We wanted realtime guests, and realtime Linux needs nohz_full,
> which depends on a stable sched clock. The result is a huge hack.
>
> We'd need to say that migration creates powerful gravity fields to
> faithfully migrate constant/invariant TSC, but stable sched clock
> doesn't have such strict expectations about time.

Was that supposed to be a joke?

> > Is that it?
> >
> > This still doesn't explain why even explicitly trying to set invtsc
> > doesn't seem to work.
>
> Seems like a bug. My cpuid is
> 0x80000007 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000100
> and QEMU says
> warning: host doesn't support requested feature: CPUID.80000007H:EDX.invtsc [bit 8]
>
> I'll see if it's in KVM or QEMU. (We should only forbid migrations to
> hosts with different frequency and without guest TSC scaling.)
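
For reference, the EDX value quoted above (0x00000100) has only bit 8 set, which is the invtsc bit that the QEMU warning names. A minimal sketch of that decoding follows. The macro and function names are hypothetical, chosen for illustration; they are not taken from KVM or QEMU sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* CPUID leaf 0x80000007, EDX bit 8: invariant TSC ("invtsc").
 * Both names below are hypothetical, for illustration only. */
#define CPUID_APM_LEAF       UINT32_C(0x80000007)
#define CPUID_APM_EDX_INVTSC (UINT32_C(1) << 8)

/* Returns true if an EDX value read from leaf 0x80000007
 * advertises an invariant TSC. */
static bool edx_has_invtsc(uint32_t edx)
{
    return (edx & CPUID_APM_EDX_INVTSC) != 0;
}
```

With the quoted value, edx_has_invtsc(0x00000100) is true, so the cpuid dump itself reports invtsc. That is consistent with the warning originating in KVM or QEMU rather than in the hardware, though the sketch alone cannot tell which.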