From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: xen-devel@lists.xensource.com, Jed Smith <jed@linode.com>,
	Jan Beulich <JBeulich@novell.com>,
	Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Subject: Re: Re: [PATCH] x86: unconditionally mark TSC unstable under Xen
Date: Thu, 15 Jul 2010 10:30:24 -0700
Message-ID: <4C3F45B0.305@goop.org>
In-Reply-To: <70bac91d-227c-452f-bf20-a03a86d02255@default>

On 07/15/2010 10:14 AM, Dan Magenheimer wrote:
>>> Isn't the real problem that, in a PV guest, the cpuid instructions
>>> that are testing the TSC-related CPUID bits are obtaining the actual
>>> hardware value, rather than what Xen would like the guest to believe?
>>>       
>> No, because there shouldn't be any "naked" rdtscs in the kernel.
>>
>>     
>>> IOW, isn't the correct fix to use pvcpuid instead of cpuid when
>>> xen_pvdomain() is true?
>>>       
>> Every use of cpuid in the kernel goes via the cpuid pvop, which ends up
>> doing the Xen cpuid rather than the native one.  Usermode cpuid is
>> still the "real" one, unless they explicitly use the Xen version.
>>     
> OK, then I'm confused.  Either:
> - this is one of those recent Intel boxes where all the TSCs should
>   be sync'ed but due to firmware issues they are not, in which case
>   this is a Linux bug that has already been fixed upstream; or
> - this isn't Xen 4.0+ but should be fixed in 4.0; or
> - this is Xen 4.0+ and the default tsc_mode is being overridden
>
> Otherwise, why is TSC not synchronized and pvclock always getting
> an offset of 0?

No, this bug doesn't really have anything to do with tsc sync issues.

The situation is:

    * The scheduler uses its own timebase, called sched_clock
    * We have a pvop for sched_clock
    * The Xen implementation for sched_clock counts unstolen ns, rather
      than wallclock ns, since this is (somewhat, in theory) useful
    * However, the scheduler checks to see if the tsc is stable (because
      the default sched_clock is based on the tsc), and if so, assumes
      that sched_clock is synced across all cpus - but of course the
      amount of stolen time is different for each vcpu
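
To make the last point concrete, here is a rough sketch (not the actual
kernel or Xen code; the struct and function names are made up for
illustration) of why a clock that counts unstolen ns cannot be compared
across vcpus, even when system time itself is perfectly in sync:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-vcpu accounting; not the real Xen/Linux structures. */
struct vcpu_time {
    uint64_t system_ns;  /* system time, identical on every vcpu */
    uint64_t stolen_ns;  /* time this vcpu wanted to run but could not */
};

/* A clock that counts only unstolen nanoseconds on one vcpu. */
static uint64_t unstolen_clock(const struct vcpu_time *v)
{
    assert(v->stolen_ns <= v->system_ns);
    return v->system_ns - v->stolen_ns;
}

/* Apparent skew when two vcpus read their clocks at the same instant. */
static int64_t unstolen_skew(const struct vcpu_time *a,
                             const struct vcpu_time *b)
{
    return (int64_t)unstolen_clock(a) - (int64_t)unstolen_clock(b);
}
```

With both vcpus at system_ns = 1000000, one having lost 100000 ns and the
other 400000 ns, the two clocks read 900000 and 600000: a 300000 ns skew
that has nothing to do with the tsc.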

Unfortunately, while counting unstolen time is useful for seeing
how much work got done in a timeslice, it is pretty useless for measuring
how long something was asleep (since you don't care how much
time was "stolen" while you were asleep).  And the scheduler uses the
same timebase for measuring both.

So the fix is to simply use plain Xen system time as the scheduler
clock, as that will be synced across cpus.
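
As a hedged illustration of the difference (again a sketch with made-up
names, assuming a task goes to sleep on one vcpu and is woken on another;
this is not the real sched_clock code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of one vcpu's clocks, in nanoseconds. */
struct clock_sample {
    uint64_t system_ns;  /* system time, same on every vcpu */
    uint64_t stolen_ns;  /* stolen time accumulated by this vcpu */
};

/* Reading of an unstolen-ns clock for one snapshot. */
static int64_t unstolen_ns(struct clock_sample s)
{
    assert(s.stolen_ns <= s.system_ns);
    return (int64_t)(s.system_ns - s.stolen_ns);
}

/* Sleep length as seen through the per-vcpu unstolen clock. */
static int64_t sleep_len_unstolen(struct clock_sample at_sleep,
                                  struct clock_sample at_wake)
{
    return unstolen_ns(at_wake) - unstolen_ns(at_sleep);
}

/* Sleep length as seen through plain system time. */
static int64_t sleep_len_system(struct clock_sample at_sleep,
                                struct clock_sample at_wake)
{
    return (int64_t)(at_wake.system_ns - at_sleep.system_ns);
}
```

If the task sleeps at system time 10 ms on a vcpu with 1 ms stolen, and
wakes at 15 ms on a vcpu with 8 ms stolen, the unstolen clock reports a
sleep of -2 ms, while system time reports the expected 5 ms.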

    J

