xen-devel.lists.xenproject.org archive mirror
* rdtsc strangeness on upstream kernel
@ 2010-07-19 23:19 Dan Magenheimer
  2010-07-19 23:34 ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Magenheimer @ 2010-07-19 23:19 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

Hi Jeremy --

In trying to understand some strange behavior I was
seeing on a RHEL6b2 guest, I ran across an interesting
anomaly, and it seems to be true on different upstream
(pvops) kernels.  At first I thought it was a result
of the xen_sched_clock->xen_clocksource_read patch
you recently posted, but after some testing this
appears to be unrelated.

The number of rdtsc/second goes up dramatically when
there is CPU-intensive load on an upstream kernel!
I know we both observed some cases where rdtsc/sec
was very high, but I don't think we ever were able to
reproduce this consistently.

First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
kernel.

I am observing ~300 rdtsc/sec on an idle VM.  When
I run a load of:

main() {while(1);}

I am observing about 10000 rdtsc/sec!!

This is a CONFIG_HZ_1000=y kernel, so I would
expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
but 10000 makes me wonder if there is some hidden
bug.

Changing to vcpus=2, the above numbers roughly
double... ~600 rdtsc/sec idle, going up to ~20000 under load.
So clearly it is some kind of per-vcpu activity.

Any idea what kernel parts might be doing this?
I can instrument the xen rdtsc code to collect
some more information, but thought I'd get ideas
from you first.

Thanks,
Dan

P.S. Lest anyone panic about this, a rate of
10000 rdtsc/sec/vcpu should translate to only
about a 1% performance hit.  Not a catastrophe
but still worth understanding.
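
That estimate assumes each emulated rdtsc (trap into Xen, emulate,
return) costs on the order of a microsecond; the per-trap cost is
an assumption for illustration, not something I measured.  A
back-of-envelope sketch:

/* assumes ~1 usec per emulated rdtsc -- an illustrative figure,
 * not a measurement */
#include <stdio.h>

int main(void)
{
        double traps_per_sec = 10000.0;  /* observed rate per vcpu */
        double usec_per_trap = 1.0;      /* assumed emulation cost */

        /* fraction of one vcpu-second spent emulating rdtsc */
        printf("overhead ~= %.1f%%\n",
               traps_per_sec * usec_per_trap / 1e6 * 100.0);
        return 0;
}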


* Re: rdtsc strangeness on upstream kernel
  2010-07-19 23:19 rdtsc strangeness on upstream kernel Dan Magenheimer
@ 2010-07-19 23:34 ` Jeremy Fitzhardinge
  2010-07-19 23:51   ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Fitzhardinge @ 2010-07-19 23:34 UTC (permalink / raw)
  To: Dan Magenheimer; +Cc: xen-devel

On 07/19/2010 04:19 PM, Dan Magenheimer wrote:
> In trying to understand some strange behavior I was
> seeing on a RHEL6b2 guest, I ran across an interesting
> anomaly, and it seems to be true on different upstream
> (pvops) kernels.  At first I thought it was a result
> of the xen_sched_clock->xen_clocksource_read patch
> you recently posted, but after some testing this
> appears to be unrelated.
>
> The number of rdtsc/second goes up dramatically when
> there is CPU-intensive load on an upstream kernel!
>   

Are you looking at rdtsc emulation traps when running a PV domain?

> I know we both observed some cases where rdtsc/sec
> was very high, but I don't think we ever were able to
> reproduce this consistently.
>   

It would be interesting to compare that to the context switch rate (cs
column in vmstat output) to see if they correlate.

Also, how does it relate to timer interrupts?

> First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
> kernel.
>
> I am observing ~300 rdtsc/sec on an idle VM.  When
> I run a load of:
>
> main() {while(1);}
>
> I am observing about 10000 rdtsc/sec!!
>
> This is a CONFIG_HZ_1000=y kernel, so I would
> expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
> but 10000 makes me wonder if there is some hidden
> bug.
>   

Do you have preemption running?  And why HZ=1000?

Thanks,
    J


* RE: rdtsc strangeness on upstream kernel
  2010-07-19 23:34 ` Jeremy Fitzhardinge
@ 2010-07-19 23:51   ` Dan Magenheimer
       [not found]     ` <681683ff-677b-4709-911e-3d90d7dbed7c@default4C44EF08.3070103@goop.org>
  2010-07-19 23:55     ` Dan Magenheimer
  0 siblings, 2 replies; 8+ messages in thread
From: Dan Magenheimer @ 2010-07-19 23:51 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

> > The number of rdtsc/second goes up dramatically when
> > there is CPU-intensive load on an upstream kernel!
> 
> Are you looking at rdtsc emulation traps when running a PV domain?

Yes, with "for i in {0..99}; do xm debug-key s; sleep 1; done".
 
> > I know we both observed some cases where rdtsc/sec
> > was very high, but I don't think we ever were able to
> > reproduce this consistently.
> 
> It would be interesting to compare that to the context switch rate (cs
> column in vmstat output) to see if they correlate.

Doesn't appear to be related.  Vmstat continues to show
cs at about 40/sec while the rdtsc/sec is very high.

> Also, how does it relate to timer interrupts?

With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
Is it possible that the timer interrupt is using rdtsc
five times each interrupt?

> > First, this is a single vcpu, 64-bit 2.6.32 (RHEL6b2)
> > kernel.
> >
> > I am observing ~300 rdtsc/sec on an idle VM.  When
> > I run a load of:
> >
> > main() {while(1);}
> >
> > I am observing about 10000 rdtsc/sec!!
> >
> > This is a CONFIG_HZ_1000=y kernel, so I would
> > expect 1000 rdtsc/sec, or maybe 2000 rdtsc/sec,
> > but 10000 makes me wonder if there is some hidden
> > bug.
> 
> Do you have preemption running?  And why HZ=1000?

CONFIG_PREEMPT is off, though there are some other
PREEMPT-related configs that are on.  Are any others
relevant?

HZ=1000 is because this is a straight RHEL6b2 kernel,
but running as a PV guest.

Dan


* RE: rdtsc strangeness on upstream kernel
  2010-07-19 23:51   ` Dan Magenheimer
       [not found]     ` <681683ff-677b-4709-911e-3d90d7dbed7c@default4C44EF08.3070103@goop.org>
@ 2010-07-19 23:55     ` Dan Magenheimer
  2010-07-20  0:34       ` Jeremy Fitzhardinge
  1 sibling, 1 reply; 8+ messages in thread
From: Dan Magenheimer @ 2010-07-19 23:55 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.

No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
(I was using "watch -d" which defaults to updating every
two seconds, not every second.)

> Is it possible that the timer interrupt is using rdtsc
> five times each interrupt?

TEN times each interrupt.  (20000 rdtsc/sec split across two
vcpus is 10000/sec/vcpu; divided by 1000 timer irq/sec, that's
10 per interrupt.)



* Re: RE: rdtsc strangeness on upstream kernel
  2010-07-19 23:55     ` Dan Magenheimer
@ 2010-07-20  0:34       ` Jeremy Fitzhardinge
  2010-07-20  0:42         ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Fitzhardinge @ 2010-07-20  0:34 UTC (permalink / raw)
  To: Dan Magenheimer; +Cc: xen-devel

On 07/19/2010 04:55 PM, Dan Magenheimer wrote:
>> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
>> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
>>     
> No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
> (I was using "watch -d" which defaults to updating every
> two seconds, not every second.)
>
>   
>> Is it possible that the timer interrupt is using rdtsc
>> five times each interrupt?
>>     
> TEN times each interrupt.
>   

Do you have any other debugging/tracing enabled?  CONFIG_XEN_DEBUG_FS
for example?

10 sounds a little high, but not necessarily completely unreasonable,
especially if there's tracing/debugging on.

    J


* RE: RE: rdtsc strangeness on upstream kernel
  2010-07-20  0:34       ` Jeremy Fitzhardinge
@ 2010-07-20  0:42         ` Dan Magenheimer
  2010-07-20  0:44           ` Jeremy Fitzhardinge
  0 siblings, 1 reply; 8+ messages in thread
From: Dan Magenheimer @ 2010-07-20  0:42 UTC (permalink / raw)
  To: Jeremy Fitzhardinge; +Cc: xen-devel

> On 07/19/2010 04:55 PM, Dan Magenheimer wrote:
> >> With vcpus=2, timer0 goes up to about 2000 irq/sec on CPU0
> >> and timer1 on CPU1 is the same while rdtsc/sec is about 20000.
> >>
> > No, wait, correct that...timer0 and timer1 are 1000 irq/sec.
> > (I was using "watch -d" which defaults to updating every
> > two seconds, not every second.)
> >
> >
> >> Is it possible that the timer interrupt is using rdtsc
> >> five times each interrupt?
> >>
> > TEN times each interrupt.
> >
> 
> Do you have any other debugging/tracing enabled?  CONFIG_XEN_DEBUG_FS
> for example?
> 
> 10 sounds a little high, but not necessarily completely unreasonable,
> especially if there's tracing/debugging on.

Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
I used in the hypervisor last fall which should identify the
kernel locations of the rdtsc's.  I'll plug that in and
post the results tomorrow.

Dan


* Re: RE: rdtsc strangeness on upstream kernel
  2010-07-20  0:42         ` Dan Magenheimer
@ 2010-07-20  0:44           ` Jeremy Fitzhardinge
  2010-07-23 16:47             ` Dan Magenheimer
  0 siblings, 1 reply; 8+ messages in thread
From: Jeremy Fitzhardinge @ 2010-07-20  0:44 UTC (permalink / raw)
  To: Dan Magenheimer; +Cc: xen-devel

On 07/19/2010 05:42 PM, Dan Magenheimer wrote:
> Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
> I used in the hypervisor last fall which should identify the
> kernel locations of the rdtsc's.  I'll plug that in and
> post the results tomorrow.
>   

They're probably all in the same place, so you'll need to get some
calling stack to make sense of it.

    J


* RE: RE: rdtsc strangeness on upstream kernel
  2010-07-20  0:44           ` Jeremy Fitzhardinge
@ 2010-07-23 16:47             ` Dan Magenheimer
  0 siblings, 0 replies; 8+ messages in thread
From: Dan Magenheimer @ 2010-07-23 16:47 UTC (permalink / raw)
  To: Jeremy Fitzhardinge, Andrew Jones; +Cc: xen-devel

[-- Attachment #1: Type: text/plain, Size: 3591 bytes --]

> On 07/19/2010 05:42 PM, Dan Magenheimer wrote:
> > Yes, CONFIG_XEN_DEBUG_FS is on.  I dug up some instrumentation
> > I used in the hypervisor last fall which should identify the
> > kernel locations of the rdtsc's.  I'll plug that in and
> > post the results tomorrow.
> 
> They're probably all in the same place, so you'll need to get some
> calling stack to make sense of it.
> 
>     J

Sorry this took a while.  I've been hacking and instrumenting and
measuring and the results are not very good.

I'm cc'ing Andrew Jones since things look worse on stock RHEL6b2.

First, I've measured on both PV 2.6.35-rc4 ("upstream") and PV
2.6.32-37 ("rhel6b2").  I've attached the patch for rhel6b2 that
I used for measurement; the same code fragments were used for
upstream.

Rhel6b2 had HZ=1000 and had vcpus=2.  Upstream had HZ=250 and vcpus=1.

Both had the "xen_sched_clock" call changed to xen_clocksource_read.

The workload is simply "main() {while(1); }" run in the background.
With vcpus=2, I ran two in the background.

Both had CONFIG_XEN_DEBUG_FS set, but the workload should encounter
few or no kernel spinlocks and the measurements confirmed this.
Rhel6b2 had HIGH_RES_TIMERS and SCHED_HRTICK set; they were not
set on upstream... don't know if this matters.

BOTTOM LINE: A cpu-loaded vcpus=2 HZ=1000 rhel6b2 guest had close to
20000 rdtsc/sec.  Unloaded, the same guest had about 600 rdtsc/sec.
The HZ=250 upstream guest had about 4000 rdtsc/sec regardless of
whether it was loaded or not.

Breakdown: I instrumented several of the primary sources of calls
to native_read_tsc.  No rdtsc instructions were executed other than
the one in native_read_tsc.  (A small number are binary-translated
to call native_read_tsc.)

The most frequent users of rdtsc:
(A) sched_clock via xen_clocksource_read->pvclock_clocksource_read->
                   pvclock_nsec_offset->native_read_tsc
(B) clock->read via xen_clocksource_read
(C) set_next_event via xen_timerop_clockevent
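
All three of these end in the same place: a pvclock read of the
per-vcpu time record, which issues exactly one rdtsc per read.
Roughly, the logic looks like the sketch below (field names are from
struct pvclock_vcpu_time_info in pvclock-abi.h; pvclock_read and
rdtsc_raw are just illustrative names, and the barriers and the wide
multiply of the real pvclock_clocksource_read are omitted):

#include <stdint.h>

struct pvclock_vcpu_time_info {
        uint32_t version;            /* odd while Xen updates the record */
        uint32_t pad0;
        uint64_t tsc_timestamp;      /* TSC value at Xen's last update */
        uint64_t system_time;        /* ns of system time at that update */
        uint32_t tsc_to_system_mul;  /* fixed-point TSC->ns multiplier */
        int8_t   tsc_shift;
        uint8_t  flags;
        uint8_t  pad[2];
};

static uint64_t rdtsc_raw(void)      /* the instruction Xen traps for PV */
{
        uint32_t lo, hi;
        __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
        return ((uint64_t)hi << 32) | lo;
}

/* ns since boot = system_time + scaled (tsc_now - tsc_timestamp) */
static uint64_t pvclock_read(const volatile struct pvclock_vcpu_time_info *t)
{
        uint32_t version;
        uint64_t delta, ns;

        do {
                version = t->version;    /* retry if Xen updated meanwhile */
                delta = rdtsc_raw() - t->tsc_timestamp;
                if (t->tsc_shift >= 0)
                        delta <<= t->tsc_shift;
                else
                        delta >>= -t->tsc_shift;
                ns = t->system_time + ((delta * t->tsc_to_system_mul) >> 32);
        } while ((version & 1) || version != t->version);

        return ns;
}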

I measured (running the workload overnight) how frequently each
of these paths resulted in a call to native_read_tsc.  Each of
these functions goes through its path exactly once per call, so
this is also a per-second frequency count for the callers.

ALL 4672..18689
(A) 2149...8428 (~8*HZ)
(B) 1350...9456 (~(4*HZ+?)*vcpus)
(C)  524...2102 (~2*HZ)

The measurement code outputs frequency every 2**20 samples
and I did a cursory review to verify that the frequency
samples were consistent. There is some small variance,
which accounts for the fact that the numbers above (which
were from one sample) don't quite add up.  So please
consider them "trend indicators" rather than exact measurements.
I also verified the total rdtsc usage counts against the
"xm debug-key s" output from the Xen console.

Conclusions/questions:
1) Frequency of rdtsc use is much higher than expected, especially
   when HZ=1000.  Sadly, the divider= kernel parameter may still
   be necessary for RHEL6b2 on older machines where Xen must do
   rdtsc emulation... and for any guest that has migrated (or will
   migrate?)... unless this high frequency identifies a fixable bug.
2) There is some fix in rhel6b2 that hides this when vcpus are inactive.
   Should this go upstream so idle guests don't consume so much CPU?
3) Why is sched_clock called at 8*HZ?!? Why is clock->read called at
   4*HZ (on every processor)?!?  Why is set_next_event called
   at 2*HZ?!?  Even with hardware rdtsc, this seems like a potential
   (though admittedly small) performance issue.

[-- Attachment #2: linux-2.6.32-37-tsc.patch --]
[-- Type: application/octet-stream, Size: 8489 bytes --]

diff -Napur linux-2.6.32.x86_64/arch/x86/kernel/rtc.c linux-2.6.32.x86_64-tsc/arch/x86/kernel/rtc.c
--- linux-2.6.32.x86_64/arch/x86/kernel/rtc.c	2009-12-02 20:51:21.000000000 -0700
+++ linux-2.6.32.x86_64-tsc/arch/x86/kernel/rtc.c	2010-07-22 10:24:36.000000000 -0600
@@ -193,6 +193,15 @@ void read_persistent_clock(struct timesp
 
 unsigned long long native_read_tsc(void)
 {
+#define DJM_TSC_FREQ
+#ifdef DJM_TSC_FREQ
+#define INTERVAL (1024L*1024UL)
+{
+static unsigned long count=0;
+if (!(++count & (INTERVAL-1)))
+printk(KERN_ERR "native_read_tsc: count=%lu\n", count);
+}
+#endif
 	return __native_read_tsc();
 }
 EXPORT_SYMBOL(native_read_tsc);
diff -Napur linux-2.6.32.x86_64/arch/x86/xen/enlighten.c linux-2.6.32.x86_64-tsc/arch/x86/xen/enlighten.c
--- linux-2.6.32.x86_64/arch/x86/xen/enlighten.c	2010-07-07 11:02:01.000000000 -0600
+++ linux-2.6.32.x86_64-tsc/arch/x86/xen/enlighten.c	2010-07-22 15:56:11.000000000 -0600
@@ -955,7 +955,8 @@ static const struct pv_init_ops xen_init
 };
 
 static const struct pv_time_ops xen_time_ops __initdata = {
-	.sched_clock = xen_sched_clock,
+//	.sched_clock = xen_sched_clock,
+	.sched_clock = xen_schedclocksource_read,
 };
 
 static const struct pv_cpu_ops xen_cpu_ops __initdata = {
diff -Napur linux-2.6.32.x86_64/arch/x86/xen/time.c linux-2.6.32.x86_64-tsc/arch/x86/xen/time.c
--- linux-2.6.32.x86_64/arch/x86/xen/time.c	2010-07-07 11:01:46.000000000 -0600
+++ linux-2.6.32.x86_64-tsc/arch/x86/xen/time.c	2010-07-22 10:29:00.000000000 -0600
@@ -206,7 +206,83 @@ cycle_t xen_clocksource_read(void)
 {
         struct pvclock_vcpu_time_info *src;
 	cycle_t ret;
+#define DJM_TSC_FREQ
+#ifdef DJM_TSC_FREQ
+#define CPU_HZ (2993UL*1024UL*1024UL)
+#define INTERVAL (1024L*1024UL)
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "xen_clocksource_read(xen/time.c): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+}
+#endif
+
+	src = &get_cpu_var(xen_vcpu)->time;
+	ret = pvclock_clocksource_read(src);
+	put_cpu_var(xen_vcpu);
+	return ret;
+}
+
+cycle_t xen_clocksource_read2(void)
+{
+        struct pvclock_vcpu_time_info *src;
+	cycle_t ret;
+
+#ifdef DJM_TSC_FREQ
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "xen_clocksource_READ2(xen/time.c)(CALLED ONLY VIA clock->read): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+}
+#endif
+	src = &get_cpu_var(xen_vcpu)->time;
+	ret = pvclock_clocksource_read(src);
+	put_cpu_var(xen_vcpu);
+	return ret;
+}
 
+cycle_t xen_clocksource_read3(void)
+{
+        struct pvclock_vcpu_time_info *src;
+	cycle_t ret;
+
+#ifdef DJM_TSC_FREQ
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "xen_clocksource_READ3(xen/time.c)(CALLED ONLY VIA set_next_event): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+}
+#endif
+	src = &get_cpu_var(xen_vcpu)->time;
+	ret = pvclock_clocksource_read(src);
+	put_cpu_var(xen_vcpu);
+	return ret;
+}
+
+cycle_t xen_schedclocksource_read(void)
+{
+        struct pvclock_vcpu_time_info *src;
+	cycle_t ret;
+
+#ifdef DJM_TSC_FREQ
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "xen_SCHEDclocksource_READ(xen/time.c)(CALLED ONLY VIA sched_clock): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+}
+#endif
 	src = &get_cpu_var(xen_vcpu)->time;
 	ret = pvclock_clocksource_read(src);
 	put_cpu_var(xen_vcpu);
@@ -215,7 +291,7 @@ cycle_t xen_clocksource_read(void)
 
 static cycle_t xen_clocksource_get_cycles(struct clocksource *cs)
 {
-	return xen_clocksource_read();
+	return xen_clocksource_read2();
 }
 
 static void xen_read_wallclock(struct timespec *ts)
@@ -286,7 +362,7 @@ static struct clocksource xen_clocksourc
 */
 static s64 get_abs_timeout(unsigned long delta)
 {
-	return xen_clocksource_read() + delta;
+	return xen_clocksource_read3() + delta;
 }
 
 static void xen_timerop_set_mode(enum clock_event_mode mode,
diff -Napur linux-2.6.32.x86_64/arch/x86/xen/xen-ops.h linux-2.6.32.x86_64-tsc/arch/x86/xen/xen-ops.h
--- linux-2.6.32.x86_64/arch/x86/xen/xen-ops.h	2010-07-07 11:01:46.000000000 -0600
+++ linux-2.6.32.x86_64-tsc/arch/x86/xen/xen-ops.h	2010-07-22 10:36:26.000000000 -0600
@@ -45,6 +45,7 @@ void xen_setup_timer(int cpu);
 void xen_setup_runstate_info(int cpu);
 void xen_teardown_timer(int cpu);
 cycle_t xen_clocksource_read(void);
+cycle_t xen_schedclocksource_read(void);
 void xen_setup_cpu_clockevents(void);
 unsigned long xen_tsc_khz(void);
 void __init xen_time_init(void);
diff -Napur linux-2.6.32.x86_64/kernel/sched_clock.c linux-2.6.32.x86_64-tsc/kernel/sched_clock.c
--- linux-2.6.32.x86_64/kernel/sched_clock.c	2010-07-07 11:01:57.000000000 -0600
+++ linux-2.6.32.x86_64-tsc/kernel/sched_clock.c	2010-07-22 10:34:51.000000000 -0600
@@ -105,8 +105,21 @@ static u64 sched_clock_local(struct sche
 {
 	u64 now, clock, old_clock, min_clock, max_clock;
 	s64 delta;
+#define DJM_TSC_FREQ
+#ifdef DJM_TSC_FREQ
+static unsigned long count=0,last,new;
+#define CPU_HZ (2993UL*1024UL*1024UL)
+#define INTERVAL (1024L*1024UL)
+#endif
 
 again:
+#ifdef DJM_TSC_FREQ
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "sched_clock_local: count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+#endif
 	now = sched_clock();
 	delta = now - scd->tick_raw;
 	if (unlikely(delta < 0))
@@ -177,7 +190,20 @@ u64 sched_clock_cpu(int cpu)
 	WARN_ON_ONCE(!irqs_disabled());
 
 	if (sched_clock_stable)
+#ifdef DJM_TSC_FREQ
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "sched_clock_cpu: count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+#endif
 		return sched_clock();
+#ifdef DJM_TSC_FREQ
+}
+#endif
+
 
 	if (unlikely(!sched_clock_running))
 		return 0ull;
@@ -206,6 +232,16 @@ void sched_clock_tick(void)
 	WARN_ON_ONCE(!irqs_disabled());
 
 	scd = this_scd();
+#ifdef DJM_TSC_FREQ
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "sched_clock_tick (TWO): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last=new;
+}
+}
+#endif
 	now_gtod = ktime_to_ns(ktime_get());
 	now = sched_clock();
 
diff -Napur linux-2.6.32.x86_64/kernel/time/timekeeping.c linux-2.6.32.x86_64-tsc/kernel/time/timekeeping.c
--- linux-2.6.32.x86_64/kernel/time/timekeeping.c	2010-07-07 11:01:55.000000000 -0600
+++ linux-2.6.32.x86_64-tsc/kernel/time/timekeeping.c	2010-07-22 10:57:37.000000000 -0600
@@ -122,6 +122,19 @@ static inline s64 timekeeping_get_ns_raw
 
 	/* read clocksource: */
 	clock = timekeeper.clock;
+#define DJM_TSC_FREQ
+#ifdef DJM_TSC_FREQ
+#define CPU_HZ (2993UL*1024UL*1024UL)
+#define INTERVAL (1024L*1024UL)
+{
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "timekeeping_get_ns_raw(timekeeping.c)(CALLS clock->read): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+}
+#endif
 	cycle_now = clock->read(clock);
 
 	/* calculate the delta since the last update_wall_time: */
@@ -226,6 +239,14 @@ void getnstimeofday(struct timespec *ts)
 	WARN_ON(timekeeping_suspended);
 
 	do {
+#ifdef DJM_TSC_FREQ
+static unsigned long count=0,last,new;
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "getnstimeofday(timekeeping.c)(CALLS clock->read via timekeeping_get_ns): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+#endif
 		seq = read_seqbegin(&xtime_lock);
 
 		*ts = xtime;
@@ -276,10 +297,20 @@ void ktime_get_ts(struct timespec *ts)
 	struct timespec tomono;
 	unsigned int seq;
 	s64 nsecs;
+#ifdef DJM_TSC_FREQ
+static unsigned long count=0,last,new;
+#endif
 
 	WARN_ON(timekeeping_suspended);
 
 	do {
+#ifdef DJM_TSC_FREQ
+if (!count) rdtscll(last);
+if (!(++count & (INTERVAL-1))) {
+printk(KERN_ERR "ktime_get_ts(timekeeping.c)(CALLS getnstimeofday): count=%lu freq=%lu\n", count, (INTERVAL*CPU_HZ)/(rdtscll(new)-last));
+last = new;
+}
+#endif
 		seq = read_seqbegin(&xtime_lock);
 		*ts = xtime;
 		tomono = wall_to_monotonic;


