public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [patch?] s2ram + P4 + tsc = annoyance
@ 2007-12-27  8:55 Mike Galbraith
  2007-12-27 18:19 ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2007-12-27  8:55 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar

Greetings,

s2ram recently became useful here, except for the kernel's annoying
habit of disabling my P4's perfectly good TSC.

[  107.894470] CPU 1 is now offline
[  107.894474] SMP alternatives: switching to UP code
[  107.895832] CPU0 attaching sched-domain:
[  107.895836]  domain 0: span 1
[  107.895838]   groups: 1
[  107.896097] CPU1 is down
[    3.726156] Intel machine check architecture supported.
[    3.726165] Intel machine check reporting enabled on CPU#0.
[    3.726167] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.726170] CPU0: Thermal monitoring enabled
[    3.726175] Back to C!
[    3.726708] Force enabled HPET at resume
[    3.726775] Enabling non-boot CPUs ...
[    3.727049] CPU0 attaching NULL sched-domain.
[    3.727165] SMP alternatives: switching to SMP code
[    3.727858] Booting processor 1/1 eip 3000
[    3.727862] CPU 1 irqstacks, hard=b042f000 soft=b042d000
[    3.738173] Initializing CPU#1
[    3.798912] Calibrating delay using timer specific routine.. 5986.12 BogoMIPS (lpj=2993061)
[    3.798920] CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
[    3.798931] CPU: Trace cache: 12K uops, L1 D cache: 8K
[    3.798934] CPU: L2 cache: 512K
[    3.798936] CPU: Physical Processor ID: 0
[    3.798938] CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
[    3.798946] Intel machine check architecture supported.
[    3.798952] Intel machine check reporting enabled on CPU#1.
[    3.798955] CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.798959] CPU1: Thermal monitoring enabled
[    3.799161] CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
[    3.799187] checking TSC synchronization [CPU#0 -> CPU#1]:
[    3.819181] Measured 63588552840 cycles TSC warp between CPUs, turning off TSC clock.
[    3.819184] Marking TSC unstable due to: check_tsc_sync_source failed.

I wonder why I'm the only guy in the galaxy experiencing this.  Does
everybody else's clock continue to move forward across resume or
something?  Anyway, I asked it to please stop doing that, and it
complied without even exploding (unlike crabby APICs).

diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 9125efe..7b74969 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -46,7 +46,7 @@ static __cpuinit void check_tsc_warp(void)
 	cycles_t start, now, prev, end;
 	int i;
 
-	start = get_cycles_sync();
+	start = last_tsc = get_cycles_sync();
 	/*
 	 * The measurement runs for 20 msecs:
 	 */



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
       [not found] <fa./Gg6lFn4vN7sfhp2nE3LSMDm0q4@ifi.uio.no>
@ 2007-12-27 18:09 ` Robert Hancock
  2007-12-27 18:27   ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Robert Hancock @ 2007-12-27 18:09 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Ingo Molnar

Mike Galbraith wrote:
> Greetings,
> 
> s2ram recently became useful here, except for the kernel's annoying
> habit of disabling my P4's perfectly good TSC.
> 
> [  107.894470] CPU 1 is now offline
> [  107.894474] SMP alternatives: switching to UP code
> [  107.895832] CPU0 attaching sched-domain:
> [  107.895836]  domain 0: span 1
> [  107.895838]   groups: 1
> [  107.896097] CPU1 is down
> [    3.726156] Intel machine check architecture supported.
> [    3.726165] Intel machine check reporting enabled on CPU#0.
> [    3.726167] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
> [    3.726170] CPU0: Thermal monitoring enabled
> [    3.726175] Back to C!
> [    3.726708] Force enabled HPET at resume
> [    3.726775] Enabling non-boot CPUs ...
> [    3.727049] CPU0 attaching NULL sched-domain.
> [    3.727165] SMP alternatives: switching to SMP code
> [    3.727858] Booting processor 1/1 eip 3000
> [    3.727862] CPU 1 irqstacks, hard=b042f000 soft=b042d000
> [    3.738173] Initializing CPU#1
> [    3.798912] Calibrating delay using timer specific routine.. 5986.12 BogoMIPS (lpj=2993061)
> [    3.798920] CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
> [    3.798931] CPU: Trace cache: 12K uops, L1 D cache: 8K
> [    3.798934] CPU: L2 cache: 512K
> [    3.798936] CPU: Physical Processor ID: 0
> [    3.798938] CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
> [    3.798946] Intel machine check architecture supported.
> [    3.798952] Intel machine check reporting enabled on CPU#1.
> [    3.798955] CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
> [    3.798959] CPU1: Thermal monitoring enabled
> [    3.799161] CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
> [    3.799187] checking TSC synchronization [CPU#0 -> CPU#1]:
> [    3.819181] Measured 63588552840 cycles TSC warp between CPUs, turning off TSC clock.
> [    3.819184] Marking TSC unstable due to: check_tsc_sync_source failed.
> 
> I wonder why I'm the only guy in the galaxy experiencing this.  Does
> everybody else's clock continue to move forward across resume or
> something?  Anyway, I asked it to please stop doing that, and it
> complied without even exploding (unlike crabby APICs).

Are we missing some logic to resync the TSCs after resume, or something?

-- 
Robert Hancock      Saskatoon, SK, Canada
To email, remove "nospam" from hancockr@nospamshaw.ca
Home Page: http://www.roberthancock.com/


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-27  8:55 Mike Galbraith
@ 2007-12-27 18:19 ` Mike Galbraith
  2007-12-30 14:04   ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2007-12-27 18:19 UTC (permalink / raw)
  To: LKML; +Cc: Ingo Molnar


Or, reset to pristine prior to testing, though that's more lines to
accomplish the same thing.  Either way, or some other way...

If check_tsc_warp() is called after initial boot, and the TSC has in the
meantime been set (BIOS, user, silicon, elves) to a value lower than the
last stored/stale value, we blame the TSC.  Reset to pristine condition
after every test.

Signed-off-by: Mike Galbraith <efault@gmx.de>

diff --git a/arch/x86/kernel/tsc_sync.c b/arch/x86/kernel/tsc_sync.c
index 9125efe..05d8f25 100644
--- a/arch/x86/kernel/tsc_sync.c
+++ b/arch/x86/kernel/tsc_sync.c
@@ -129,24 +129,24 @@ void __cpuinit check_tsc_sync_source(int cpu)
 	while (atomic_read(&stop_count) != cpus-1)
 		cpu_relax();
 
-	/*
-	 * Reset it - just in case we boot another CPU later:
-	 */
-	atomic_set(&start_count, 0);
-
 	if (nr_warps) {
 		printk("\n");
 		printk(KERN_WARNING "Measured %Ld cycles TSC warp between CPUs,"
 				    " turning off TSC clock.\n", max_warp);
 		mark_tsc_unstable("check_tsc_sync_source failed");
-		nr_warps = 0;
-		max_warp = 0;
-		last_tsc = 0;
 	} else {
 		printk(" passed.\n");
 	}
 
 	/*
+	 * Reset it - just in case we boot another CPU later:
+	 */
+	atomic_set(&start_count, 0);
+	nr_warps = 0;
+	max_warp = 0;
+	last_tsc = 0;
+
+	/*
 	 * Let the target continue with the bootup:
 	 */
 	atomic_inc(&stop_count);



^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-27 18:09 ` [patch?] s2ram + P4 + tsc = annoyance Robert Hancock
@ 2007-12-27 18:27   ` Mike Galbraith
  2007-12-27 18:40     ` Mike Galbraith
  0 siblings, 1 reply; 11+ messages in thread
From: Mike Galbraith @ 2007-12-27 18:27 UTC (permalink / raw)
  To: Robert Hancock; +Cc: LKML, Ingo Molnar


On Thu, 2007-12-27 at 12:09 -0600, Robert Hancock wrote:

> Are we missing some logic to resync the TSCs after resume, or something?

They used to be forcibly synchronized during boot, but it seems that was
dropped in 2.6.21.

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-27 18:27   ` Mike Galbraith
@ 2007-12-27 18:40     ` Mike Galbraith
  0 siblings, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2007-12-27 18:40 UTC (permalink / raw)
  To: Robert Hancock; +Cc: LKML, Ingo Molnar


On Thu, 2007-12-27 at 19:27 +0100, Mike Galbraith wrote:
> On Thu, 2007-12-27 at 12:09 -0600, Robert Hancock wrote:
> 
> > Are we missing some logic to resync the TSCs after resume, or something?
> 
> They used to be forcibly synchronized during boot, but it seems that was
> dropped in 2.6.21.

(but out of sync isn't my problem, it's TSC having been reset)


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-27 18:19 ` Mike Galbraith
@ 2007-12-30 14:04   ` Ingo Molnar
  2007-12-30 14:08     ` Ingo Molnar
  0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2007-12-30 14:04 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML


* Mike Galbraith <efault@gmx.de> wrote:

> Or, reset to pristine prior to testing, though that's more lines to 
> accomplish the same thing.  Either way, or some other way...
> 
> If check_tsc_warp() is called after initial boot, and the TSC has in 
> the meantime been set (BIOS, user, silicon, elves) to a value lower 
> than the last stored/stale value, we blame the TSC.  Reset to pristine 
> condition after every test.

ok, i prefer this fix a bit more. (we dont want to set last_tsc outside 
of the sync_lock - which your initial patch does)

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-30 14:04   ` Ingo Molnar
@ 2007-12-30 14:08     ` Ingo Molnar
  2007-12-30 17:42       ` Mike Galbraith
  2008-01-04 16:39       ` Pavel Machek
  0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2007-12-30 14:08 UTC (permalink / raw)
  To: Mike Galbraith; +Cc: LKML, Andrew Morton


> ok, i prefer this fix a bit more. (we dont want to set last_tsc 
> outside of the sync_lock - which your initial patch does)

i've added your patch to x86.git - thanks Mike! (patch below) I think 
this would be too dangerous for v2.6.24 though - we can put it back into 
-stable for 2.6.24.1, once it had more testing?

	Ingo

----------------->
Subject: x86: fix: s2ram + P4 + tsc = annoyance
From: Mike Galbraith <efault@gmx.de>

s2ram recently became useful here, except for the kernel's annoying
habit of disabling my P4's perfectly good TSC.

[  107.894470] CPU 1 is now offline
[  107.894474] SMP alternatives: switching to UP code
[  107.895832] CPU0 attaching sched-domain:
[  107.895836]  domain 0: span 1
[  107.895838]   groups: 1
[  107.896097] CPU1 is down
[    3.726156] Intel machine check architecture supported.
[    3.726165] Intel machine check reporting enabled on CPU#0.
[    3.726167] CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.726170] CPU0: Thermal monitoring enabled
[    3.726175] Back to C!
[    3.726708] Force enabled HPET at resume
[    3.726775] Enabling non-boot CPUs ...
[    3.727049] CPU0 attaching NULL sched-domain.
[    3.727165] SMP alternatives: switching to SMP code
[    3.727858] Booting processor 1/1 eip 3000
[    3.727862] CPU 1 irqstacks, hard=b042f000 soft=b042d000
[    3.738173] Initializing CPU#1
[    3.798912] Calibrating delay using timer specific routine.. 5986.12 BogoMIPS (lpj=2993061)
[    3.798920] CPU: After generic identify, caps: bfebfbff 00000000 00000000 00000000 00004400 00000000 00000000 00000000
[    3.798931] CPU: Trace cache: 12K uops, L1 D cache: 8K
[    3.798934] CPU: L2 cache: 512K
[    3.798936] CPU: Physical Processor ID: 0
[    3.798938] CPU: After all inits, caps: bfebfbff 00000000 00000000 0000b080 00004400 00000000 00000000 00000000
[    3.798946] Intel machine check architecture supported.
[    3.798952] Intel machine check reporting enabled on CPU#1.
[    3.798955] CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
[    3.798959] CPU1: Thermal monitoring enabled
[    3.799161] CPU1: Intel(R) Pentium(R) 4 CPU 3.00GHz stepping 09
[    3.799187] checking TSC synchronization [CPU#0 -> CPU#1]:
[    3.819181] Measured 63588552840 cycles TSC warp between CPUs, turning off TSC clock.
[    3.819184] Marking TSC unstable due to: check_tsc_sync_source failed.

If check_tsc_warp() is called after initial boot, and the TSC has in the
meantime been set (BIOS, user, silicon, elves) to a value lower than the
last stored/stale value, we blame the TSC.  Reset to pristine condition
after every test.

Signed-off-by: Mike Galbraith <efault@gmx.de>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/tsc_sync.c |   16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

Index: linux-x86.q/arch/x86/kernel/tsc_sync.c
===================================================================
--- linux-x86.q.orig/arch/x86/kernel/tsc_sync.c
+++ linux-x86.q/arch/x86/kernel/tsc_sync.c
@@ -129,24 +129,24 @@ void __cpuinit check_tsc_sync_source(int
 	while (atomic_read(&stop_count) != cpus-1)
 		cpu_relax();
 
-	/*
-	 * Reset it - just in case we boot another CPU later:
-	 */
-	atomic_set(&start_count, 0);
-
 	if (nr_warps) {
 		printk("\n");
 		printk(KERN_WARNING "Measured %Ld cycles TSC warp between CPUs,"
 				    " turning off TSC clock.\n", max_warp);
 		mark_tsc_unstable("check_tsc_sync_source failed");
-		nr_warps = 0;
-		max_warp = 0;
-		last_tsc = 0;
 	} else {
 		printk(" passed.\n");
 	}
 
 	/*
+	 * Reset it - just in case we boot another CPU later:
+	 */
+	atomic_set(&start_count, 0);
+	nr_warps = 0;
+	max_warp = 0;
+	last_tsc = 0;
+
+	/*
 	 * Let the target continue with the bootup:
 	 */
 	atomic_inc(&stop_count);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-30 14:08     ` Ingo Molnar
@ 2007-12-30 17:42       ` Mike Galbraith
  2008-01-04 16:39       ` Pavel Machek
  1 sibling, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2007-12-30 17:42 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: LKML, Andrew Morton


On Sun, 2007-12-30 at 15:08 +0100, Ingo Molnar wrote:
> > ok, i prefer this fix a bit more. (we dont want to set last_tsc 
> > outside of the sync_lock - which your initial patch does)
> 
> i've added your patch to x86.git - thanks Mike! (patch below) I think 
> this would be too dangerous for v2.6.24 though - we can put it back into 
> -stable for 2.6.24.1, once it had more testing?

Methinks it's about as low a priority as exists.  Google returned me and
one other fellow...

	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2007-12-30 14:08     ` Ingo Molnar
  2007-12-30 17:42       ` Mike Galbraith
@ 2008-01-04 16:39       ` Pavel Machek
  2008-01-04 22:04         ` Mike Galbraith
  2008-01-05  9:06         ` Ingo Molnar
  1 sibling, 2 replies; 11+ messages in thread
From: Pavel Machek @ 2008-01-04 16:39 UTC (permalink / raw)
  To: Ingo Molnar; +Cc: Mike Galbraith, LKML, Andrew Morton

On Sun 2007-12-30 15:08:15, Ingo Molnar wrote:
> 
> > ok, i prefer this fix a bit more. (we dont want to set last_tsc 
> > outside of the sync_lock - which your initial patch does)
> 
> i've added your patch to x86.git - thanks Mike! (patch below) I think 
> this would be too dangerous for v2.6.24 though - we can put it back into 
> -stable for 2.6.24.1, once it had more testing?

If it is too dangerous for -final, I guess it is bad idea to push it
into -stable.

Plus it does not fix really bad bug.
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2008-01-04 16:39       ` Pavel Machek
@ 2008-01-04 22:04         ` Mike Galbraith
  2008-01-05  9:06         ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: Mike Galbraith @ 2008-01-04 22:04 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Ingo Molnar, LKML, Andrew Morton


On Fri, 2008-01-04 at 17:39 +0100, Pavel Machek wrote:

> If it is too dangerous for -final, I guess it is bad idea to push it
> into -stable.
> 
> Plus it does not fix really bad bug.

The furthest back burner is fine.  It's only an annoyance.
 
	-Mike


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [patch?] s2ram + P4 + tsc = annoyance
  2008-01-04 16:39       ` Pavel Machek
  2008-01-04 22:04         ` Mike Galbraith
@ 2008-01-05  9:06         ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2008-01-05  9:06 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Mike Galbraith, LKML, Andrew Morton


* Pavel Machek <pavel@ucw.cz> wrote:

> On Sun 2007-12-30 15:08:15, Ingo Molnar wrote:
> > 
> > > ok, i prefer this fix a bit more. (we dont want to set last_tsc 
> > > outside of the sync_lock - which your initial patch does)
> > 
> > i've added your patch to x86.git - thanks Mike! (patch below) I 
> > think this would be too dangerous for v2.6.24 though - we can put it 
> > back into -stable for 2.6.24.1, once it had more testing?
> 
> If it is too dangerous for -final, I guess it is bad idea to push it 
> into -stable.

we regularly delay bugs to the .1 release, if a fix is a bit complex and 
just needs a bit of time to be fully trusted for a stable release.

> Plus it does not fix really bad bug.

yeah, granted.

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-01-05  9:07 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <fa./Gg6lFn4vN7sfhp2nE3LSMDm0q4@ifi.uio.no>
2007-12-27 18:09 ` [patch?] s2ram + P4 + tsc = annoyance Robert Hancock
2007-12-27 18:27   ` Mike Galbraith
2007-12-27 18:40     ` Mike Galbraith
2007-12-27  8:55 Mike Galbraith
2007-12-27 18:19 ` Mike Galbraith
2007-12-30 14:04   ` Ingo Molnar
2007-12-30 14:08     ` Ingo Molnar
2007-12-30 17:42       ` Mike Galbraith
2008-01-04 16:39       ` Pavel Machek
2008-01-04 22:04         ` Mike Galbraith
2008-01-05  9:06         ` Ingo Molnar

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox