* Time skew on HP DL785 (and possibly other boxes)
From: Dan Magenheimer @ 2009-03-27 20:49 UTC (permalink / raw)
  To: Xen-Devel (E-mail); +Cc: john.v.morris

(Raising a yellow flag because this could turn into
a serious issue for Xen and it may take quite a bit
of work to come up with a solution.)

We recently measured Xen system time skew on an HP DL785
and found it to be horrible... nearly a quarter millisecond
worst case (with only about 10000 samples so it may get worse).
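One way to reduce such a measurement to a single worst-case number is sketched below. This is only an illustration, not the tool used for the measurement above: the function name `worst_case_skew_ns` and the sample readings are hypothetical, and the sketch assumes each round holds one system-time reading per CPU taken at (nearly) the same instant.

```python
from typing import Iterable, Sequence


def worst_case_skew_ns(rounds: Iterable[Sequence[int]]) -> int:
    """Return the largest spread (max - min) seen in any sampling round.

    Each round holds one system-time reading per CPU, in nanoseconds.
    Rounds with fewer than two readings carry no skew information.
    """
    worst = 0
    for readings in rounds:
        if len(readings) < 2:
            continue
        worst = max(worst, max(readings) - min(readings))
    return worst


# Hypothetical readings (ns) from three rounds on four CPUs.
rounds = [
    [1_000_000, 1_000_120, 1_000_040, 1_000_090],
    [2_000_000, 2_000_010, 2_240_000, 2_000_030],
    [3_000_000, 3_000_300, 3_000_100, 3_000_200],
]
print(worst_case_skew_ns(rounds))
```

Because the result is a maximum over rounds, it can only grow as more samples are added, which is why a small sample count gives a lower bound rather than the true worst case.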

This box uses 8 quad-core AMD chips connected via
HyperTransport.  BUT each chip is on a separate motherboard.
On this system HyperTransport is fast and cross-node
memory accesses are fast enough so that these NUMA systems
need not behave like NUMA systems from a memory access
perspective.  So Xen just views the system as a 32-cpu box
(other than some code in the memory allocator that tries
to allocate near-memory where possible, but silently falls
back to far-memory if necessary) and guest vcpus migrate
freely between the nodes.  (Correct?)

However, I'm told that it's not possible to route a clocksource
over HyperTransport, so TSCs on processors on different
motherboards may be VERY different and apparently the
mechanisms for synchronizing Xen system time across
motherboards may not be up to the challenge.  As a result,
OSes and apps sensitive to time that are running on PV
domains may be in for a rough ride on systems like this.
(HVM domains may run into other problems because time will
apparently stop for a "long time".)

Since systems like this are targeted for consolidation
and virtualization, I see this as a potentially big problem
as it may appear to real Xen customers as bizarre
non-reproducible problems, such as "make" failing,
leading to questions about the stability and viability
of using Xen.
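To make the "make" failure mode concrete, here is a minimal sketch of how a cross-node clock offset could confuse a timestamp comparison. Everything in it is an assumption for illustration: the per-node offsets, the helper names `read_system_time_ns` and `make_needs_rebuild`, and nanosecond-resolution file timestamps. The only behavior taken as given is that make rebuilds a target when a prerequisite's modification time is newer than the target's.

```python
def read_system_time_ns(true_time_ns: int, node_offset_ns: int) -> int:
    """System time as seen on a node whose clock is off by node_offset_ns."""
    return true_time_ns + node_offset_ns


def make_needs_rebuild(target_mtime_ns: int, prereq_mtime_ns: int) -> bool:
    """make-style staleness check: rebuild if the prerequisite is newer."""
    return prereq_mtime_ns > target_mtime_ns


# Hypothetical offsets: node A runs 250 us ahead of node B.
NODE_A_OFFSET_NS = 250_000
NODE_B_OFFSET_NS = 0

# The source file is saved while the vcpu runs on node A ...
source_saved_true_ns = 10_000_000
source_mtime = read_system_time_ns(source_saved_true_ns, NODE_A_OFFSET_NS)

# ... then the vcpu migrates and the target is built 100 us later on node B.
target_built_true_ns = source_saved_true_ns + 100_000
target_mtime = read_system_time_ns(target_built_true_ns, NODE_B_OFFSET_NS)

# The target was really built after the source, yet it looks older.
print(target_built_true_ns > source_saved_true_ns)
print(make_needs_rebuild(target_mtime, source_mtime))
```

Whether this bites in practice depends on the filesystem's timestamp granularity and on how quickly the two events follow each other, which fits the "non-reproducible" character of the symptoms.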

Comments?

Dan


