All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Dan Magenheimer" <dan.magenheimer@oracle.com>
To: Keir Fraser <keir.fraser@eu.citrix.com>,
	"Xen-Devel (E-mail)" <xen-devel@lists.xensource.com>
Cc: Dave Winchell <dwinchell@virtualiron.com>
Subject: RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing  hvm guest time)
Date: Wed, 9 Jul 2008 18:24:44 -0600	[thread overview]
Message-ID: <20080709182444984.00000003744@djm-pc> (raw)
In-Reply-To: <C4943EFB.1A911%keir.fraser@eu.citrix.com>

> > Well one suspicion I had was that very long hpet reads were
> > getting serialized, but I tried clocksource=acpi and
> > clocksource=pit and get similar skew range results.
> > In fact pit shows a max of >17000 vs hpet and acpi closer
> > to 11000.  (OTOH, I suppose it IS possible that this is
> > roughly how long it takes to read each of these platform
> > timers.)
> 
> That ought to be easy to check. I would expect that the PIT, 
> for example,
> could take a couple of microseconds to access.
> 
>  -- Keir

(I haven't seen the patch applied... since it just collects
data, it would be nice if it was applied so others could
try it.)

To follow up on this, I tried a number of tests but wasn't
able to identify the problem and have given up (for now).
In case someone else starts looking at this (or if any of
my tests suggest a solution to someone), I thought I'd
document what I tried.

PROBLEM: Xen system time skew between processors local time
and platform time is generally "small" but "sometimes" gets
quite "large".  This is important because, the larger the
skew, the more likely an hvm guest will experience time
stopping or (in some cases) time going backwards.

On my box, "small" is under 1 usec, "large" is 9-18 usec,
and "sometimes" is about one out of 500 measurements.  Note
that my box is a recent vintage Intel single-socket dual-core
("Conroe").

I suspect periodically some lock is being waited for for
a long time, or maybe an unexpected interrupt is occurring,
but I didn't find anything through code reading or
experiments.

TEST METHOD: The patch I sent on this thread collects data
whenever local_time_calibration() is run (which is 1Hz on
each processor) and "xm debug-key t" prints this data
so it can be seen with "xm dmesg".  To see the problem,
one need only boot dom0 and run xm debug-key and xm dmesg.

1) CONJECTURE: Related to how long it takes to read the
   platform timer

The max skew (and distribution) are definitely different
depending on whether clocksource=hpet or clocksource=pit.
For hpet, I am almost always seeing a max skew of 11000+
and with pit 17000+.  ONCE (over many hours of runs) I saw
a skew with hpet of 15000.  However, I added code in the
platform timer read routine (inside all locks but NOT with
interrupts off) to artificially lengthen a platform timer
read and it made no difference in the measurements

2) CONJECTURE: Max skew only occurs on some processors (e.g.
   not on the one that does the platform calibration)

Nope, if you wait long enough max skew is fairly close
on all processors (though in some cases, it seems to
take a long time... perhaps because of unbalanced load?)

3) CONJECTURE: Max skew occurs on platform timer overflow.

Possibly, but there is certainly not a 1-1 correspondence.
Sometimes there are more large skews than overflows and
sometimes less.

4) CONJECTURE: Artifact of ntpd running

Nope, same skews whether ntpd is running on dom0 or not

5) CONJECTURE: Related to frequency changes or suspends

Nope, none of these happening on my box.

6)  CONJECTURE: "Weirdness can happen" comment in time.c

Nope, this path isn't getting executed.

7) CONJECTURE: Result of natural skews between platform
timer and tsc, plus jitter.  Unfixable.

Possible, untested, not sure how.

  reply	other threads:[~2008-07-10  0:24 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-07-02 16:03 [PATCH] strictly increasing hvm guest time Dan Magenheimer
2008-07-02 16:07 ` Keir Fraser
2008-07-02 21:50   ` Dan Magenheimer
2008-07-02 22:41     ` [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time) Dan Magenheimer
2008-07-03  8:03       ` Keir Fraser
2008-07-03 16:24         ` Dan Magenheimer
2008-07-03 16:35           ` Dan Magenheimer
2008-07-03 20:03             ` Dan Magenheimer
2008-07-03 23:00               ` Keir Fraser
2008-07-04 15:11                 ` Dan Magenheimer
2008-07-04 15:22                   ` Keir Fraser
2008-07-04 19:32                     ` Dan Magenheimer
2008-07-04 19:56                       ` Keir Fraser
2008-07-10  0:24                         ` Dan Magenheimer [this message]
2008-07-10  7:40                           ` Keir Fraser
2008-07-10 22:42                             ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time)) Dan Magenheimer
2008-07-11  8:27                               ` Keir Fraser
2008-07-11 20:53                                 ` Dan Magenheimer
2008-07-11 21:27                                   ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasinghvm " Ian Pratt
2008-07-12 21:05                                     ` Dan Magenheimer
2008-07-11 21:27                                   ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm " Keir Fraser
2008-07-12 21:07                                     ` Dan Magenheimer
2008-07-19 17:51                                 ` Dan Magenheimer
2008-07-21  8:32                                   ` Keir Fraser
2008-07-22 22:27                                     ` Dan Magenheimer
2008-07-22 23:07                                       ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasinghvm " Ian Pratt
2008-07-23  0:40                                         ` Dan Magenheimer
2008-07-23  1:16                                           ` Ian Pratt
2008-07-23  6:11                                       ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20080709182444984.00000003744@djm-pc \
    --to=dan.magenheimer@oracle.com \
    --cc=dwinchell@virtualiron.com \
    --cc=keir.fraser@eu.citrix.com \
    --cc=xen-devel@lists.xensource.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.