From: "Dan Magenheimer" <dan.magenheimer@oracle.com>
To: "dan.magenheimer@oracle.com" <dan.magenheimer@oracle.com>,
Keir Fraser <keir.fraser@eu.citrix.com>,
"Xen-Devel (E-mail)" <xen-devel@lists.xensource.com>
Cc: Dave Winchell <dwinchell@virtualiron.com>
Subject: RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time)
Date: Thu, 3 Jul 2008 14:03:41 -0600 [thread overview]
Message-ID: <20080703140341906.00000003744@djm-pc> (raw)
In-Reply-To: <20080703103548140.00000003744@djm-pc>
[-- Attachment #1: Type: text/plain, Size: 1437 bytes --]
> > IMHO, it would be nice to put this patch into the tree as it
> > will be good for helping to diagnose time skew problems
> > such as the one just reported on the list.
>
> Oops! Just after I sent the above email, I checked again and
> the same machine (no reboots, no guests ever launched) now reports
> a max stime skew of 4333ns!! Methinks there might be some
> periodic glitch in the calibration code?
OK this version records not only max but also a distribution
of skew. (The code is a bit ugly... I thought about doing
something fancy with log-binary but decided a few base-10
ranges were clearer for a human to read.)
With this, I use "watch -d 'xm debug-key t; xm dmesg | tail -3'"
and can observe that (on my single-socket two-core recent-vintage
Intel box) roughly three-quarters of the skew measurements are
between 10-100nsec, roughly one-quarter are between 100ns-1us,
a couple percent are between 1us-10us and a few are >10us.
This represents an approximate distribution of how long an hvm
guest might observe time to be stopped (if it is able to repeatedly
read time values quickly enough).
So on some machines, this might be substantially worse than the
old hvm-platform-timer-built-on-tsc mechanism (though we had
no monotonicity constraint built into that).
I wonder if the >1us outliers are occurring only if the
processor has been idle for awhile, vs entirely random.
Dan
[-- Attachment #2: maxskew3.patch --]
[-- Type: application/octet-stream, Size: 5382 bytes --]
diff -r 08f77df14cba xen/arch/x86/hvm/vpt.c
--- a/xen/arch/x86/hvm/vpt.c Wed Jul 02 11:30:37 2008 +0900
+++ b/xen/arch/x86/hvm/vpt.c Thu Jul 03 13:29:54 2008 -0600
@@ -25,6 +25,8 @@
#define mode_is(d, name) \
((d)->arch.hvm_domain.params[HVM_PARAM_TIMER_MODE] == HVMPTM_##name)
+u64 max_guest_time_skew = 0;
+
void hvm_init_guest_time(struct domain *d)
{
struct pl_time *pl = &d->arch.hvm_domain.pl_time;
@@ -38,16 +40,22 @@ u64 hvm_get_guest_time(struct vcpu *v)
{
struct pl_time *pl = &v->domain->arch.hvm_domain.pl_time;
u64 now;
+ int64_t skew;
/* Called from device models shared with PV guests. Be careful. */
ASSERT(is_hvm_vcpu(v));
spin_lock(&pl->pl_time_lock);
now = get_s_time() + pl->stime_offset;
- if ( (int64_t)(now - pl->last_guest_time) >= 0 )
+ if ( ( skew = (int64_t)(now - pl->last_guest_time) ) >= 0 )
pl->last_guest_time = now;
else
+ {
+ skew = -skew;
+ if ( skew > max_guest_time_skew )
+ max_guest_time_skew = skew;
now = pl->last_guest_time;
+ }
spin_unlock(&pl->pl_time_lock);
return now + v->arch.hvm_vcpu.stime_offset;
diff -r 08f77df14cba xen/arch/x86/time.c
--- a/xen/arch/x86/time.c Wed Jul 02 11:30:37 2008 +0900
+++ b/xen/arch/x86/time.c Thu Jul 03 13:29:54 2008 -0600
@@ -69,6 +69,11 @@ static DEFINE_PER_CPU(struct cpu_time, c
/* TSC is invariant on C state entry? */
static bool_t tsc_invariant;
+
+/* record maximum skew and range to report with debug-key t */
+u64 max_stime_skew = 0;
+u64 stime_skew_zero_cnt = 0,stime_skew_10_cnt = 0, stime_skew_100_cnt = 0;
+u64 stime_skew_1000_cnt = 0,stime_skew_10000_cnt = 0, stime_skew_big_cnt = 0;
/*
* We simulate a 32-bit platform timer from the 16-bit PIT ch2 counter.
@@ -808,6 +813,7 @@ static void local_time_calibration(void
*/
s_time_t prev_local_stime, curr_local_stime;
s_time_t prev_master_stime, curr_master_stime;
+ s_time_t curr_stime_skew;
/* TSC timestamps taken during this calibration and prev calibration. */
u64 prev_tsc, curr_tsc;
@@ -831,6 +837,9 @@ static void local_time_calibration(void
/* The overall calibration scale multiplier. */
u32 calibration_mul_frac;
+ /* ignore max skew calculation on first few iterations */
+ static int skip_max_skew_calc = 100;
+
prev_tsc = t->local_tsc_stamp;
prev_local_stime = t->stime_local_stamp;
prev_master_stime = t->stime_master_stamp;
@@ -844,6 +853,40 @@ static void local_time_calibration(void
curr_local_stime = get_s_time();
rdtscll(curr_tsc);
local_irq_enable();
+
+ /*
+ * Record maximum stime skew from master processor. Note that
+ * in the case of a fast local clock, skew reflects the post-adjusted
+ * skew (see below and get_s_time()), not the actual skew. Also
+ * note that some processors may skew positive and others negative
+ * relative to master so skew between ANY pair of processors may be
+ * as much as 2x recorded max
+ */
+ if ( smp_processor_id() )
+ {
+ if ( skip_max_skew_calc > 0)
+ skip_max_skew_calc--; /* allow calibration to converge */
+ else
+ {
+ curr_stime_skew = curr_master_stime - curr_local_stime;
+ if ( (s64) curr_stime_skew < 0 )
+ curr_stime_skew = - curr_stime_skew;
+ if ( curr_stime_skew > max_stime_skew )
+ max_stime_skew = curr_stime_skew;
+ if ( !curr_stime_skew )
+ stime_skew_zero_cnt++;
+ else if ( curr_stime_skew < 10 )
+ stime_skew_10_cnt++;
+ else if ( curr_stime_skew < 100 )
+ stime_skew_100_cnt++;
+ else if ( curr_stime_skew < 1000 )
+ stime_skew_1000_cnt++;
+ else if ( curr_stime_skew < 10000 )
+ stime_skew_10000_cnt++;
+ else
+ stime_skew_big_cnt++;
+ }
+ }
#if 0
printk("PRE%d: tsc=%"PRIu64" stime=%"PRIu64" master=%"PRIu64"\n",
diff -r 08f77df14cba xen/common/keyhandler.c
--- a/xen/common/keyhandler.c Wed Jul 02 11:30:37 2008 +0900
+++ b/xen/common/keyhandler.c Thu Jul 03 13:29:54 2008 -0600
@@ -251,6 +251,9 @@ static void read_clocks(unsigned char ke
unsigned int cpu = smp_processor_id(), min_cpu, max_cpu;
u64 min, max, dif, difus;
static DEFINE_SPINLOCK(lock);
+ extern u64 max_stime_skew, max_guest_time_skew;
+ extern u64 stime_skew_zero_cnt, stime_skew_10_cnt, stime_skew_100_cnt;
+ extern u64 stime_skew_1000_cnt, stime_skew_10000_cnt, stime_skew_big_cnt;
spin_lock(&lock);
@@ -284,6 +287,14 @@ static void read_clocks(unsigned char ke
printk("Min = %"PRIu64" ; Max = %"PRIu64" ; Diff = %"PRIu64
" (%"PRIu64" microseconds)\n",
min, max, dif, difus);
+ printk("Max stime skew = %"PRIu64"ns; Max guest stoppage = %"PRIu64"ns\n",
+ max_stime_skew, max_guest_time_skew);
+ printk("stime skew counts: 0=%"PRIu64"; ",stime_skew_zero_cnt);
+ printk("-10=%"PRIu64"; ",stime_skew_10_cnt);
+ printk("-100=%"PRIu64"; ",stime_skew_100_cnt);
+ printk("-1000=%"PRIu64"; ",stime_skew_1000_cnt);
+ printk("-10000=%"PRIu64"; ",stime_skew_10000_cnt);
+ printk(">10000=%"PRIu64"\n",stime_skew_big_cnt);
}
extern void dump_runq(unsigned char key);
[-- Attachment #3: Type: text/plain, Size: 138 bytes --]
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
next prev parent reply other threads:[~2008-07-03 20:03 UTC|newest]
Thread overview: 29+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-07-02 16:03 [PATCH] strictly increasing hvm guest time Dan Magenheimer
2008-07-02 16:07 ` Keir Fraser
2008-07-02 21:50 ` Dan Magenheimer
2008-07-02 22:41 ` [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time) Dan Magenheimer
2008-07-03 8:03 ` Keir Fraser
2008-07-03 16:24 ` Dan Magenheimer
2008-07-03 16:35 ` Dan Magenheimer
2008-07-03 20:03 ` Dan Magenheimer [this message]
2008-07-03 23:00 ` Keir Fraser
2008-07-04 15:11 ` Dan Magenheimer
2008-07-04 15:22 ` Keir Fraser
2008-07-04 19:32 ` Dan Magenheimer
2008-07-04 19:56 ` Keir Fraser
2008-07-10 0:24 ` Dan Magenheimer
2008-07-10 7:40 ` Keir Fraser
2008-07-10 22:42 ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm guest time)) Dan Magenheimer
2008-07-11 8:27 ` Keir Fraser
2008-07-11 20:53 ` Dan Magenheimer
2008-07-11 21:27 ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasinghvm " Ian Pratt
2008-07-12 21:05 ` Dan Magenheimer
2008-07-11 21:27 ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasing hvm " Keir Fraser
2008-07-12 21:07 ` Dan Magenheimer
2008-07-19 17:51 ` Dan Magenheimer
2008-07-21 8:32 ` Keir Fraser
2008-07-22 22:27 ` Dan Magenheimer
2008-07-22 23:07 ` Xen system skew MUCH worse than tsc skew (was RE: RE: [PATCH] record max stime skew (was RE: [PATCH] strictly increasinghvm " Ian Pratt
2008-07-23 0:40 ` Dan Magenheimer
2008-07-23 1:16 ` Ian Pratt
2008-07-23 6:11 ` Tian, Kevin
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080703140341906.00000003744@djm-pc \
--to=dan.magenheimer@oracle.com \
--cc=dwinchell@virtualiron.com \
--cc=keir.fraser@eu.citrix.com \
--cc=xen-devel@lists.xensource.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.