From: Jeremy Fitzhardinge <jeremy@goop.org>
To: Dan Magenheimer <dan.magenheimer@oracle.com>
Cc: john.v.morris@hp.com,
"Xen-Devel (E-mail)" <xen-devel@lists.xensource.com>
Subject: Re: Time skew on HP DL785 (and possibly other boxes)
Date: Fri, 27 Mar 2009 15:36:59 -0700
Message-ID: <49CD550B.1070908@goop.org>
In-Reply-To: <055de860-7f5f-496c-81ae-df1bf383d4bc@default>
Dan Magenheimer wrote:
> However, I'm told that it's not possible to route a clocksource
> over hypertransport, so TSCs on processors on different
> motherboards may be VERY different and apparently the
> mechanisms for synchronizing Xen system time across
> motherboards may not be up to the challenge. As a result,
> OSes and apps sensitive to time that are running on PV
> domains may be in for a rough ride on systems like this.
> (HVM domains may run into other problems because time will
> apparently stop for a "long time".)
>
I don't see what the problem is. If each individual cpu has well-known
tsc parameters (rate and offset), then a PV client will get those timing
parameters and use them to compute its time. It doesn't matter whether
the tscs are synchronized between cpus or nodes.
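
To illustrate the point, here is a minimal sketch of how a guest could turn a raw tsc reading into system time using only the parameters of the cpu it is running on. The struct is a simplified, hypothetical stand-in modelled on the per-vcpu time record Xen shares with PV guests; the names pv_time_params and pv_system_time_ns are illustrative assumptions, and a real guest would also have to re-read the record if Xen updated it mid-read, which is omitted here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified per-cpu timing record (illustration only). */
struct pv_time_params {
    uint64_t tsc_timestamp;      /* tsc value when the record was written */
    uint64_t system_time;        /* system time in ns at that same instant */
    uint32_t tsc_to_system_mul;  /* ns per (shifted) tick, 0.32 fixed point */
    int8_t   tsc_shift;          /* power-of-two pre-scale for the tsc delta */
};

/* Scale a tsc delta to nanoseconds: (delta * 2^shift * mul) >> 32. */
static uint64_t scale_tsc_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    /* Split the 64x32 multiply so no 128-bit type is needed. */
    uint64_t hi = delta >> 32;
    uint64_t lo = delta & 0xffffffffu;
    return hi * mul + ((lo * mul) >> 32);
}

/*
 * Compute system time from this cpu's own record and a tsc value read
 * on this cpu. Assumes tsc_now >= p->tsc_timestamp.
 */
static uint64_t pv_system_time_ns(const struct pv_time_params *p,
                                  uint64_t tsc_now)
{
    assert(p != 0);
    uint64_t delta = tsc_now - p->tsc_timestamp;
    return p->system_time +
           scale_tsc_delta(delta, p->tsc_to_system_mul, p->tsc_shift);
}
```

Two cpus whose tscs differ by an arbitrary offset still yield the same system time, provided each record pairs that cpu's own tsc snapshot with the common system time at the moment of the snapshot.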

Xen will need to calibrate each of them against a good reference
(hpet?), but that's no different from now. I guess it's possible that
this system has more variation and latency for hpet access, which may
mean that the calibration algorithm needs tweaking.
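
As a rough sketch of what such a calibration step produces, the function below derives a fixed-point scale from one measurement interval: the number of tsc ticks seen on a cpu and the nanoseconds that elapsed on the reference clock over the same interval. The names tsc_scale and calibrate_tsc are hypothetical and this is not Xen's actual algorithm; it only shows the arithmetic, and it assumes both deltas are nonzero and below 2^31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: ns = ((ticks * 2^shift) * mul_frac) >> 32. */
struct tsc_scale {
    uint32_t mul_frac;  /* fraction in [0.5, 1) as 0.32 fixed point */
    int8_t   shift;     /* power-of-two pre-scale applied to the tick delta */
};

/*
 * Derive a scale from one calibration interval.
 *   tsc_delta:    tsc ticks counted on this cpu during the interval
 *   ref_ns_delta: nanoseconds elapsed on the reference clock (e.g. hpet)
 */
static struct tsc_scale calibrate_tsc(uint64_t tsc_delta,
                                      uint64_t ref_ns_delta)
{
    struct tsc_scale s = { 0, 0 };
    uint64_t n = ref_ns_delta;  /* numerator: nanoseconds */
    uint64_t d = tsc_delta;     /* denominator: ticks */

    assert(n > 0 && d > 0);
    assert(n < (1ull << 31) && d < (1ull << 31));

    /* Slow tsc (>= 1 ns per tick): fold factors of two into the shift. */
    while (n >= d) {
        d <<= 1;
        s.shift++;
    }
    /* Fast tsc (< 0.5 ns per tick): same, in the other direction. */
    while ((n << 1) < d) {
        n <<= 1;
        s.shift--;
    }

    /* Now 0.5 <= n/d < 1, so the quotient fits in 32 bits. */
    s.mul_frac = (uint32_t)((n << 32) / d);
    return s;
}
```

Noisy or slow reads of the reference clock show up directly as error in ref_ns_delta, which is why higher hpet access latency or variation would affect the quality of the result.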

Of course, if the tsc rate on each cpu is changing in some
unpredictable way then that's a whole other barrel of problems. Guests
rely on Xen maintaining accurate tsc timing parameters.

> Since systems like this are targeted for consolidation
> and virtualization, I see this as a potentially big problem
> as it may appear to real Xen customers as bizarre
> non-reproducible problems, such as "make" failing,
> leading to questions about the stability and viability
> of using Xen.
>
> Comments?
>
In Linux there's this function:

/*
 * apic_is_clustered_box() -- Check if we can expect good TSC
 *
 * Thus far, the major user of this is IBM's Summit2 series:
 *
 * Clustered boxes may have unsynced TSC problems if they are
 * multi-chassis. Use available data to take a good guess.
 * If in doubt, go HPET.
 */
__cpuinit int apic_is_clustered_box(void)
{...}

This deals with Summit2 and ScaleMP vSMP systems, which also have
unsynchronized tscs across nodes. At the moment it assumes that no
non-vSMP AMD system has unsynchronized tscs; it sounds like it will need
updating for this system.

J