From: Eduardo Habkost <ehabkost@redhat.com>
To: "Radim Krčmář" <rkrcmar@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: Re: invtsc + migration + TSC scaling
Date: Wed, 19 Oct 2016 11:55:02 -0200
Message-ID: <20161019135502.GP5057@thinpad.lan.raisama.net>
In-Reply-To: <20161019132752.GA2841@potion>

On Wed, Oct 19, 2016 at 03:27:52PM +0200, Radim Krčmář wrote:
> 2016-10-18 19:05-0200, Eduardo Habkost:
> > On Tue, Oct 18, 2016 at 10:52:14PM +0200, Radim Krčmář wrote:
> > [...]
> >> The main problem is that QEMU changes virtual_tsc_khz when migrating
> >> without hardware scaling, so KVM is forced to get nanoseconds wrong ...
> >>
> >> If QEMU doesn't want to keep the TSC frequency constant, then it would
> >> be better if it didn't expose TSC in CPUID -- guest would just use
> >> kvmclock without being tempted by direct TSC accesses.
> >
> > Isn't it enough to simply not expose invtsc? Aren't guests expected
> > to assume the TSC frequency can change if invtsc isn't set on
> > CPUID?
>
> There are exceptions. An OS can assume constant TSC on some models that
> QEMU emulates: coreduo, core2duo, Conroe, Penryn, n270, kvm32 and kvm64.
> The list from SDM (17.15 TIME-STAMP COUNTER):
>
> Pentium 4 processors, Intel Xeon processors (family [0FH], models [03H
> and higher]); Intel Core Solo and Intel Core Duo processors (family
> [06H], model [0EH]); the Intel Xeon processor 5100 series and Intel
> Core 2 Duo processors (family [06H], model [0FH]); Intel Core 2 and
> Intel Xeon processors (family [06H], DisplayModel [17H]); Intel Atom
> processors (family [06H], DisplayModel [1CH]))
>
> Another sad part is that Linux uses the following condition to assume
> constant TSC frequency:
>
>     if ((c->x86 == 0xf && c->x86_model >= 0x03) ||
>         (c->x86 == 0x6 && c->x86_model >= 0x0e))
>             set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
>
> which sets constant TSC for all modern processors. It's not a
> problem on real hardware, because all modern processors likely have
> invariant TSC.
>
> Fun fact: Linux shows constant_tsc flag in /proc/cpuinfo even if the
> modern CPU doesn't expose TSC in CPUID.
>
> Considering that Linux is fixed on Nehalem and newer processors, we have
> a few options for the rest:
> 1) treat TSC like invariant TSC on those models (the guest cannot use
> ACPI state, so its OS might assume that they are equivalent)
> 2) hide TSC on those models
> 3) ignore the problem
> 4) remove those models
>
> I don't know enough about QEMU design goals to guess which one is the
> most appropriate. (4) is the clear winner for me, followed by (3). :)

(4) can't be implemented because it would break existing
configurations. (3) is the current solution.

Option (2) sounds attractive to me, but seems risky. I would like
to understand the consequences for guests. What could stop
working if we remove TSC? What about kvmclock?

If we implement (2), we could even add an extra check that blocks
migration (or at least prints a warning) in case:
1) TSC is forcibly enabled in the configuration;
2) TSC scaling is not available on the destination; and
3) the family/model values match the ones on the list above.

And we could even keep TSC enabled by default for users who don't
want migration (using migratable=false).
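
To make the proposed check concrete, here is a rough sketch of the
three conditions in plain C. It is only an illustration, not QEMU code:
struct guest_cpu, its fields, and all three function names are
hypothetical, and the family/model table is my reading of the SDM list
quoted above. The Linux condition is copied alongside it to show how
much broader it is than the SDM list.

```c
#include <stdbool.h>

/* Hypothetical summary of a guest CPU configuration (not a QEMU type). */
struct guest_cpu {
    unsigned family;   /* CPUID display family */
    unsigned model;    /* CPUID display model */
    bool tsc_forced;   /* assumption: user explicitly enabled TSC */
};

/* Family/model pairs taken from the SDM list quoted above. */
static bool sdm_constant_tsc_model(unsigned family, unsigned model)
{
    if (family == 0x0f)
        return model >= 0x03;
    if (family == 0x06)
        return model == 0x0e || model == 0x0f ||
               model == 0x17 || model == 0x1c;
    return false;
}

/* The broader condition Linux uses, as quoted above. */
static bool linux_assumes_constant_tsc(unsigned family, unsigned model)
{
    return (family == 0xf && model >= 0x03) ||
           (family == 0x6 && model >= 0x0e);
}

/*
 * Conditions 1) to 3) from the list above: block migration (or warn)
 * only when all of them hold.
 */
static bool migration_is_risky(const struct guest_cpu *cpu,
                               bool dest_has_tsc_scaling)
{
    return cpu->tsc_forced &&
           !dest_has_tsc_scaling &&
           sdm_constant_tsc_model(cpu->family, cpu->model);
}
```

With this sketch, a family 06H / model 17H guest with TSC forced on
would be flagged only when the destination lacks TSC scaling, while a
newer family 06H model outside the SDM list would not be flagged even
though the Linux condition treats it as constant TSC.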
--
Eduardo