public inbox for linux-kernel@vger.kernel.org
* [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
@ 2005-09-19 19:16 john stultz
  2005-09-19 19:31 ` Andi Kleen
  2005-10-07 12:26 ` Vladimir B. Savkin
  0 siblings, 2 replies; 28+ messages in thread
From: john stultz @ 2005-09-19 19:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: lkml, Andi Kleen

Andrew,
	This patch should resolve the issue seen in bugme bug #5105, where it
is assumed that dualcore x86_64 systems have synced TSCs. This is not
the case, and alternate timesources should be used instead.

For more details, see:
http://bugzilla.kernel.org/show_bug.cgi?id=5105


Please consider for inclusion in your tree.

thanks
-john

diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
--- a/arch/x86_64/kernel/time.c
+++ b/arch/x86_64/kernel/time.c
@@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
  	   are handled in the OEM check above. */
  	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
  		return 0;
- 	/* All in a single socket - should be synchronized */
- 	if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
- 		return 0;
 #endif
  	/* Assume multi socket systems are not synchronized */
  	return num_online_cpus() > 1;
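
For readers following the hunk above, here is a minimal standalone sketch of the decision it changes. It is not the kernel function: `struct cpu_topology` and `tsc_unsynchronized()` are hypothetical stand-ins for the kernel state that `unsynchronized_tsc()` reads, and the OEM check and the preprocessor conditional around the vendor test are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel state the real function consults. */
struct cpu_topology {
    bool vendor_is_intel;      /* models boot_cpu_data.x86_vendor == X86_VENDOR_INTEL */
    int  cpus_in_first_socket; /* models cpus_weight(cpu_core_map[0]) */
    int  online_cpus;          /* models num_online_cpus() */
};

/*
 * Returns true when the TSC should be treated as unsynchronized.
 * patched == false models the code before the hunk is applied,
 * patched == true models it afterwards.
 */
static bool tsc_unsynchronized(const struct cpu_topology *t, bool patched)
{
    assert(t != NULL);

    if (t->vendor_is_intel)
        return false;

    /* The three lines removed by the patch: trust a single socket. */
    if (!patched && t->cpus_in_first_socket == t->online_cpus)
        return false;

    /* Otherwise assume any multi-CPU system is not synchronized. */
    return t->online_cpus > 1;
}
```

Under this model, a single-socket dual-core non-Intel box (2 CPUs in socket 0, 2 online) is reported as synchronized before the patch and as unsynchronized after it, which is the behaviour change under discussion.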



^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-19 19:16 [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs john stultz
@ 2005-09-19 19:31 ` Andi Kleen
  2005-09-19 19:42   ` john stultz
  2005-10-07 12:26 ` Vladimir B. Savkin
  1 sibling, 1 reply; 28+ messages in thread
From: Andi Kleen @ 2005-09-19 19:31 UTC (permalink / raw)
  To: john stultz; +Cc: Andrew Morton, lkml, Andi Kleen, discuss

On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> 	This patch should resolve the issue seen in bugme bug #5105, where it
> is assumed that dualcore x86_64 systems have synced TSCs. This is not
> the case, and alternate timesources should be used instead.


I asked AMD some time ago and they told me it was synchronized.
The TSC on K8 is C state invariant, but not P state invariant,
but P states always happen synchronized on dual cores.

So I'm not quite convinced of your explanation yet.

Most likely you are working around some other bug by switching to pmtimer,
or you just changed the timing enough because pmtimer is incredibly
slow.  It would be better to find the other bug.


> 
> For more details, see:
> http://bugzilla.kernel.org/show_bug.cgi?id=5105
> 
> 
> Please consider for inclusion in your tree.

Please don't for now.

-Andi

> 
> thanks
> -john
> 
> diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> --- a/arch/x86_64/kernel/time.c
> +++ b/arch/x86_64/kernel/time.c
> @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
>   	   are handled in the OEM check above. */
>   	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>   		return 0;
> - 	/* All in a single socket - should be synchronized */
> - 	if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> - 		return 0;
>  #endif
>   	/* Assume multi socket systems are not synchronized */
>   	return num_online_cpus() > 1;
> 
> 

-- 


* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-19 19:31 ` Andi Kleen
@ 2005-09-19 19:42   ` john stultz
  2005-09-19 19:49     ` [discuss] " Andi Kleen
  0 siblings, 1 reply; 28+ messages in thread
From: john stultz @ 2005-09-19 19:42 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, lkml, discuss

On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > the case, and alternate timesources should be used instead.
> 
> 
> I asked AMD some time ago and they told me it was synchronized.
> The TSC on K8 is C state invariant, but not P state invariant,
> but P states always happen synchronized on dual cores.
> 
> So I'm not quite convinced of your explanation yet.

Would a little userspace test checking the TSC synchronization maybe
shed additional light on the issue?

thanks
-john





* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-19 19:42   ` john stultz
@ 2005-09-19 19:49     ` Andi Kleen
  2005-09-20 18:59       ` john stultz
  0 siblings, 1 reply; 28+ messages in thread
From: Andi Kleen @ 2005-09-19 19:49 UTC (permalink / raw)
  To: john stultz; +Cc: Andi Kleen, Andrew Morton, lkml, discuss

On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
> On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > the case, and alternate timesources should be used instead.
> > 
> > 
> > I asked AMD some time ago and they told me it was synchronized.
> > The TSC on K8 is C state invariant, but not P state invariant,
> > but P states always happen synchronized on dual cores.
> > 
> > So I'm not quite convinced of your explanation yet.
> 
> Would a little userspace test checking the TSC synchronization maybe
> shed additional light on the issue?

Sure you can try it.

-Andi


* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-19 19:49     ` [discuss] " Andi Kleen
@ 2005-09-20 18:59       ` john stultz
  2005-09-21  4:03         ` Daniel Jacobowitz
  0 siblings, 1 reply; 28+ messages in thread
From: john stultz @ 2005-09-20 18:59 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Andrew Morton, lkml, discuss

On Mon, 2005-09-19 at 21:49 +0200, Andi Kleen wrote:
> On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
> > On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> > > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > > the case, and alternate timesources should be used instead.
> > > 
> > > 
> > > I asked AMD some time ago and they told me it was synchronized.
> > > The TSC on K8 is C state invariant, but not P state invariant,
> > > but P states always happen synchronized on dual cores.
> > > 
> > > So I'm not quite convinced of your explanation yet.
> > 
> > Would a little userspace test checking the TSC synchronization maybe
> > shed additional light on the issue?
> 
> Sure you can try it.

So, bugzilla.kernel.org has (temporarily at least) lost the reports from
yesterday, but from the email I got, folks using my TSC consistency
check that I posted were seeing what appears to be unsynched TSCs on
dualcore AMD systems.
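
The check posted to bugzilla is not reproduced in this thread. As a rough illustration of what such a userspace test can look like, here is a minimal sketch, assuming Linux on x86 with GCC or Clang. The helper names `pin_to_cpu`, `sample_tsc_across_cpus` and `count_backward_steps` are made up for this example. It moves one thread back and forth between two CPUs, reads the TSC after each move, and counts readings that are lower than the previous one. It can only detect offsets larger than the migration latency, so a clean result does not prove the TSCs are in sync.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <stdint.h>
#include <x86intrin.h>

/* Pin the calling thread to one CPU; returns 0 on success, -1 on failure. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Count samples that are lower than the sample taken just before them. */
static size_t count_backward_steps(const uint64_t *samples, size_t n)
{
    size_t backward = 0;

    assert(samples != NULL || n == 0);
    for (size_t i = 1; i < n; i++)
        if (samples[i] < samples[i - 1])
            backward++;
    return backward;
}

/*
 * Alternate between cpu_a and cpu_b, reading the TSC after each migration.
 * Returns the number of samples stored; stops early if pinning fails.
 */
static size_t sample_tsc_across_cpus(int cpu_a, int cpu_b,
                                     uint64_t *samples, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (pin_to_cpu((i % 2) ? cpu_b : cpu_a) != 0)
            break;
        samples[i] = __rdtsc();
    }
    return i;
}
```

A caller would collect, say, a few thousand samples with `sample_tsc_across_cpus(0, 1, buf, n)` and treat a non-zero `count_backward_steps()` result as evidence that one core's TSC is behind the other's.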

Personally I suspect that the powernow driver is putting the cores
independently into low power sleep and the TSCs are being independently
halted, causing them to become unsynchronized.

Do you still feel there is some other issue here? Any ideas for shaking
out whatever else might be in play?

thanks
-john





* RE: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
@ 2005-09-20 19:13 Langsdorf, Mark
  2005-09-20 19:24 ` Scott Lampert
  0 siblings, 1 reply; 28+ messages in thread
From: Langsdorf, Mark @ 2005-09-20 19:13 UTC (permalink / raw)
  To: john stultz, Andi Kleen; +Cc: Andrew Morton, lkml, discuss

> On Mon, 2005-09-19 at 21:49 +0200, Andi Kleen wrote:
> > On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
> > > On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
> > > > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > > > 	This patch should resolve the issue seen in bugme bug #5105,
> > > > > where it is assumed that dualcore x86_64 systems have synced
> > > > > TSCs. This is not the case, and alternate timesources should be
> > > > > used instead.
> > > > 
> > > > 
> > > > I asked AMD some time ago and they told me it was synchronized.
> > > > The TSC on K8 is C state invariant, but not P state invariant, but
> > > > P states always happen synchronized on dual cores.
> > > > 
> > > > So I'm not quite convinced of your explanation yet.
> > > 
> > > Would a little userspace test checking the TSC synchronization maybe
> > > shed additional light on the issue?
> > 
> > Sure you can try it.
> 
> So, bugzilla.kernel.org has (temporarily at least) lost the
> reports from yesterday, but from the email I got, folks using
> my TSC consistency check that I posted were seeing what
> appears to be unsynched TSCs on dualcore AMD systems.

My understanding was that each TSC on a dual-core processor
will advance individually and atomically.  They will not 
always be in synchronization.

> Personally I suspect that the powernow driver is putting the 
> cores independently into low power sleep and the TSCs are 
> being independently halted, causing them to become unsynchronized.

The powernow-k8 driver doesn't know what a low power sleep state
is, so I strongly doubt it is involved here.  It only handles
pstates.
 
-Mark Langsdorf
K8 PowerNow! Maintainer
AMD, Inc.



* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-20 19:13 [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs Langsdorf, Mark
@ 2005-09-20 19:24 ` Scott Lampert
  2005-09-20 19:30   ` john stultz
  0 siblings, 1 reply; 28+ messages in thread
From: Scott Lampert @ 2005-09-20 19:24 UTC (permalink / raw)
  To: Langsdorf, Mark; +Cc: john stultz, Andi Kleen, Andrew Morton, lkml, discuss

Langsdorf, Mark wrote:

>>On Mon, 2005-09-19 at 21:49 +0200, Andi Kleen wrote:
>>>On Mon, Sep 19, 2005 at 12:42:16PM -0700, john stultz wrote:
>>>>On Mon, 2005-09-19 at 21:31 +0200, Andi Kleen wrote:
>>>>>On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
>>>>>>This patch should resolve the issue seen in bugme bug #5105,
>>>>>>where it is assumed that dualcore x86_64 systems have synced
>>>>>>TSCs. This is not the case, and alternate timesources should be
>>>>>>used instead.
>>>>>
>>>>>I asked AMD some time ago and they told me it was synchronized.
>>>>>The TSC on K8 is C state invariant, but not P state invariant, but
>>>>>P states always happen synchronized on dual cores.
>>>>>
>>>>>So I'm not quite convinced of your explanation yet.
>>>>
>>>>Would a little userspace test checking the TSC synchronization maybe
>>>>shed additional light on the issue?
>>>
>>>Sure you can try it.
>>
>>So, bugzilla.kernel.org has (temporarily at least) lost the
>>reports from yesterday, but from the email I got, folks using
>>my TSC consistency check that I posted were seeing what
>>appears to be unsynched TSCs on dualcore AMD systems.
>
>My understanding was that each TSC on a dual-core processor
>will advance individually and atomically.  They will not
>always be in synchronization.
>
>>Personally I suspect that the powernow driver is putting the
>>cores independently into low power sleep and the TSCs are
>>being independently halted, causing them to become unsynchronized.
>
>The powernow-k8 driver doesn't know what a low power sleep state
>is, so I strongly doubt it is involved here.  It only handles
>pstates.
>
>-Mark Langsdorf
>K8 PowerNow! Maintainer
>AMD, Inc.

Just to add some end-user input here, I see the same issues regardless 
of whether I'm running with the powernow-k8 or not.  The clock problems 
seem to be unrelated to that, at least on my system.
    -Scott


* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-20 19:24 ` Scott Lampert
@ 2005-09-20 19:30   ` john stultz
  0 siblings, 0 replies; 28+ messages in thread
From: john stultz @ 2005-09-20 19:30 UTC (permalink / raw)
  To: Scott Lampert; +Cc: Langsdorf, Mark, Andi Kleen, Andrew Morton, lkml, discuss

On Tue, 2005-09-20 at 12:24 -0700, Scott Lampert wrote:
> Langsdorf, Mark wrote:
> >>Personally I suspect that the powernow driver is putting the 
> >>cores independently into low power sleep and the TSCs are 
> >>being independently halted, causing them to become unsynchronized.
> >
> >The powernow-k8 driver doesn't know what a low power sleep state
> >is, so I strongly doubt it is involved here.  It only handles
> >pstates.
> > 
> Just to add some end-user input here, I see the same issues regardless 
> of whether I'm running with the powernow-k8 or not.  The clock problems 
> seem to be unrelated to that, at least on my system.

Hmmm. Ok, I don't know the cpufreq/power management code well enough. 

I know some Intel cpus halt the TSC in C3. Could the ACPI code be
causing this? 

Could anyone with better knowledge speak to why it looks like the TSCs
are unsynced? Is my test flawed?

thanks
-john




* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-20 18:59       ` john stultz
@ 2005-09-21  4:03         ` Daniel Jacobowitz
  2005-09-21 15:15           ` Ray Bryant
  0 siblings, 1 reply; 28+ messages in thread
From: Daniel Jacobowitz @ 2005-09-21  4:03 UTC (permalink / raw)
  To: john stultz; +Cc: Andi Kleen, Andrew Morton, lkml, discuss

On Tue, Sep 20, 2005 at 11:59:45AM -0700, john stultz wrote:
> So, bugzilla.kernel.org has (temporarily at least) lost the reports from
> yesterday, but from the email I got, folks using my TSC consistency
> check that I posted were seeing what appears to be unsynched TSCs on
> dualcore AMD systems.
> 
> Personally I suspect that the powernow driver is putting the cores
> independently into low power sleep and the TSCs are being independently
> halted, causing them to become unsynchronized.
> 
> Do you still feel there is some other issue here? Any ideas for shaking
> out whatever else might be in play?

FYI, at least I have reproduced this without powernow loaded.

-- 
Daniel Jacobowitz
CodeSourcery, LLC


* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-21 15:15           ` Ray Bryant
@ 2005-09-21 15:04             ` Andi Kleen
  2005-09-21 15:46               ` Ray Bryant
  2005-09-21 20:17               ` Andrew Morton
  0 siblings, 2 replies; 28+ messages in thread
From: Andi Kleen @ 2005-09-21 15:04 UTC (permalink / raw)
  To: Ray Bryant
  Cc: Daniel Jacobowitz, john stultz, Andi Kleen, Andrew Morton, lkml,
	discuss

On Wed, Sep 21, 2005 at 10:15:08AM -0500, Ray Bryant wrote:
> On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:
> 
> >
> > FYI, at least I have reproduced this without powernow loaded.
> 
> There are cases that we are aware of where the TSC will count slower while the 
> processor is halted.    This can make TSC's get out of sync on dual cores.

Ok, thanks for the confirmation. I guess John's patch is ok then.
The drawback is a much slower to extremely slow gettimeofday (depending
on whether the chipset/BIOS has a usable HPET; most seem not to).

> 
> I wonder if you can reproduce this problem while also running a pair of cpu 
> bound tasks on your dual core box.   If you can't, then this is the culprit.
> 
> In general, however, on multisocket systems, you can't depend on TSC's being 
> synchronized between sockets, so all of this is moot.   We just have to deal 
> with it. 

We handle this, but single socket dual core was special cased because
I was told previously it should be ok.

-Andi


* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-21  4:03         ` Daniel Jacobowitz
@ 2005-09-21 15:15           ` Ray Bryant
  2005-09-21 15:04             ` Andi Kleen
  0 siblings, 1 reply; 28+ messages in thread
From: Ray Bryant @ 2005-09-21 15:15 UTC (permalink / raw)
  To: Daniel Jacobowitz; +Cc: john stultz, Andi Kleen, Andrew Morton, lkml, discuss

On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:

>
> FYI, at least I have reproduced this without powernow loaded.

There are cases that we are aware of where the TSC will count slower while the 
processor is halted.    This can make TSC's get out of sync on dual cores.

I wonder if you can reproduce this problem while also running a pair of cpu 
bound tasks on your dual core box.   If you can't, then this is the culprit.
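
To make the halted-core explanation concrete, here is a small arithmetic sketch. It is a toy model only: the function name `tsc_drift_cycles` and the idea of a single fixed "halted" counting rate are assumptions for illustration, not documented K8 behaviour.

```c
#include <assert.h>

/*
 * Toy model: while a core is halted its TSC counts at halted_hz instead of
 * running_hz. Returns how many cycles core 1 ends up behind core 0 (negative
 * if core 0 is the one that fell behind).
 */
static double tsc_drift_cycles(double halted_s_core0, double halted_s_core1,
                               double running_hz, double halted_hz)
{
    assert(halted_hz <= running_hz);

    /* Each second spent halted loses (running_hz - halted_hz) cycles. */
    double lost0 = halted_s_core0 * (running_hz - halted_hz);
    double lost1 = halted_s_core1 * (running_hz - halted_hz);

    return lost1 - lost0;
}
```

In this model the drift depends only on the difference in halted time, so two cores that never halt (for example because each runs a CPU-bound task) accumulate no drift, which is why the experiment suggested above would isolate this cause.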

In general, however, on multisocket systems, you can't depend on TSC's being 
synchronized between sockets, so all of this is moot.   We just have to deal 
with it. 

-- 
Ray Bryant
AMD Performance Labs                   Austin, Tx
512-602-0038 (o)                 512-507-7807 (c)



* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-21 15:04             ` Andi Kleen
@ 2005-09-21 15:46               ` Ray Bryant
  2005-09-22  8:00                 ` Jonas Oreland
  2005-09-21 20:17               ` Andrew Morton
  1 sibling, 1 reply; 28+ messages in thread
From: Ray Bryant @ 2005-09-21 15:46 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Daniel Jacobowitz, john stultz, Andrew Morton, lkml, discuss

On Wednesday 21 September 2005 10:04, Andi Kleen wrote:

>
> We handle this, but single socket dual core was special cased because
> I was told previously it should be ok.
>
> -Andi

AFAIK there is a processor state bit that enables/disables this behavior.
Apparently some BIOS's are setting this one way for desktop systems and the 
other way for servers.   If it is thought to be important I can track that 
down and see if it can be externally documented.  (It may actually be in the 
bios and kernel developer guide...)

-- 
Ray Bryant
AMD Performance Labs                   Austin, Tx
512-602-0038 (o)                 512-507-7807 (c)



* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-21 15:04             ` Andi Kleen
  2005-09-21 15:46               ` Ray Bryant
@ 2005-09-21 20:17               ` Andrew Morton
  1 sibling, 0 replies; 28+ messages in thread
From: Andrew Morton @ 2005-09-21 20:17 UTC (permalink / raw)
  To: Andi Kleen; +Cc: raybry, dan, johnstul, ak, linux-kernel, discuss

Andi Kleen <ak@suse.de> wrote:
>
> On Wed, Sep 21, 2005 at 10:15:08AM -0500, Ray Bryant wrote:
> > On Tuesday 20 September 2005 23:03, Daniel Jacobowitz wrote:
> > 
> > >
> > > FYI, at least I have reproduced this without powernow loaded.
> > 
> > There are cases that we are aware of where the TSC will count slower while the 
> > processor is halted.    This can make TSC's get out of sync on dual cores.

You mean a single `hlt' instruction?   I guess that rules out resyncing them.

> Ok, thanks for the confirmation. I guess John's patch is ok then.
> The drawback is a much slower to extremely slow gettimeofday (depending
> on whether the chipset/BIOS has a usable HPET; most seem not to).

That's a really big drawback.   Will this affect many CPU types?

If the user was prepared to use `idle=poll' then they could get their fast
gettimeofday() back, perhaps.



* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-21 15:46               ` Ray Bryant
@ 2005-09-22  8:00                 ` Jonas Oreland
  0 siblings, 0 replies; 28+ messages in thread
From: Jonas Oreland @ 2005-09-22  8:00 UTC (permalink / raw)
  To: Ray Bryant
  Cc: Andi Kleen, Daniel Jacobowitz, john stultz, Andrew Morton, lkml,
	discuss

Ray Bryant wrote:
> On Wednesday 21 September 2005 10:04, Andi Kleen wrote:
> 
> 
>>We handle this, but single socket dual core was special cased because
>>I was told previously it should be ok.
>>
>>-Andi
> 
> 
> AFAIK there is a processor state bit that enables/disables this behavior.
> Apparently some BIOS's are setting this one way for desktop systems and the 
> other way for servers.   If it is thought to be important I can track that 
> down and see if it can be externally documented.  (It may actually be in the 
> bios and kernel developer guide...)
> 

Hi,

This would be very good (for us single socket dual core users).
I tried a very small benchmark:

clock_gettime(CLOCK_REALTIME): elapsed 7336657 -> 733.665700ns/call
clock_gettime(CLOCK_PROCESS_CPUTIME_ID): elapsed 763247 -> 76.324700ns/call

It would be a factor of 10 faster if the TSCs were in sync.
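
The benchmark source is not included in this message. A minimal sketch of how per-call figures like the ones above can be measured, assuming a POSIX system, is shown here. The function name `ns_per_call` is made up for this example, and the call count is up to the caller. The figures above are consistent with 10000 calls per clock if the elapsed values are in nanoseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/*
 * Average cost, in nanoseconds, of one clock_gettime(clk, ...) call,
 * measured over `calls` back-to-back calls and timed with CLOCK_MONOTONIC.
 */
static double ns_per_call(clockid_t clk, long calls)
{
    struct timespec start, end, scratch;

    assert(calls > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < calls; i++)
        clock_gettime(clk, &scratch);
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)calls;
}
```

Calling `ns_per_call(CLOCK_REALTIME, 10000)` and `ns_per_call(CLOCK_PROCESS_CPUTIME_ID, 10000)` gives the two numbers to compare. The result includes loop overhead and depends on which timesource the kernel has selected, so it is only a rough comparison.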

/Jonas


* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-09-19 19:16 [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs john stultz
  2005-09-19 19:31 ` Andi Kleen
@ 2005-10-07 12:26 ` Vladimir B. Savkin
  2005-10-07 12:31   ` Andi Kleen
  1 sibling, 1 reply; 28+ messages in thread
From: Vladimir B. Savkin @ 2005-10-07 12:26 UTC (permalink / raw)
  To: lkml; +Cc: Andrew Morton, Andi Kleen, john stultz

[-- Attachment #1: Type: text/plain, Size: 1381 bytes --]

On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> Andrew,
> 	This patch should resolve the issue seen in bugme bug #5105, where it
> is assumed that dualcore x86_64 systems have synced TSCs. This is not
> the case, and alternate timesources should be used instead.
> 
> For more details, see:
> http://bugzilla.kernel.org/show_bug.cgi?id=5105

I too have a box that shows the symptoms from bugzilla entry above.
The system is Asus A8V Deluxe MB with  
"AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".

The patch below did not fix the problem, while "idle=poll" did.
Hope this helps, dmesg attached.

> 
> 
> Please consider for inclusion in your tree.
> 
> thanks
> -john
> 
> diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> --- a/arch/x86_64/kernel/time.c
> +++ b/arch/x86_64/kernel/time.c
> @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
>   	   are handled in the OEM check above. */
>   	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
>   		return 0;
> - 	/* All in a single socket - should be synchronized */
> - 	if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> - 		return 0;
>  #endif
>   	/* Assume multi socket systems are not synchronized */
>   	return num_online_cpus() > 1;
> 
> 
                                        With best regards, 
                                           Vladimir Savkin. 


[-- Attachment #2: dmesg --]
[-- Type: text/plain, Size: 13528 bytes --]

Bootdata ok (command line is root=/dev/sda3 ro idle=poll )
Linux version 2.6.13.2 (vsavkin@forum) (gcc version 3.3.5 (Debian 1:3.3.5-13)) #6 SMP Fri Oct 7 16:17:05 MSD 2005
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
 BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 00000000bffb0000 (usable)
 BIOS-e820: 00000000bffb0000 - 00000000bffc0000 (ACPI data)
 BIOS-e820: 00000000bffc0000 - 00000000bfff0000 (ACPI NVS)
 BIOS-e820: 00000000bfff0000 - 00000000c0000000 (reserved)
 BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
 BIOS-e820: 0000000100000000 - 0000000140000000 (usable)
ACPI: RSDP (v002 ACPIAM                                ) @ 0x00000000000fa7c0
ACPI: XSDT (v001 A M I  OEMXSDT  0x05000519 MSFT 0x00000097) @ 0x00000000bffb0100
ACPI: FADT (v003 A M I  OEMFACP  0x05000519 MSFT 0x00000097) @ 0x00000000bffb0290
ACPI: MADT (v001 A M I  OEMAPIC  0x05000519 MSFT 0x00000097) @ 0x00000000bffb0390
ACPI: OEMB (v001 A M I  OEMBIOS  0x05000519 MSFT 0x00000097) @ 0x00000000bffc0040
ACPI: DSDT (v001  A0036 A0036001 0x00000001 MSFT 0x0100000d) @ 0x0000000000000000
On node 0 totalpages: 1048399
  DMA zone: 3999 pages, LIFO batch:1
  Normal zone: 1044400 pages, LIFO batch:31
  HighMem zone: 0 pages, LIFO batch:1
Looks like a VIA chipset. Disabling IOMMU. Overwrite with "iommu=allowed"
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:11 APIC version 16
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:11 APIC version 16
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Setting APIC routing to flat
Using ACPI (MADT) for SMP configuration information
Allocating PCI resources starting at c0000000 (gap: c0000000:3f780000)
Built 1 zonelists
Kernel command line: root=/dev/sda3 ro idle=poll 
using polling idle threads.
Initializing CPU#0
PID hash table entries: 4096 (order: 12, 131072 bytes)
time.c: Using 1.193182 MHz PIT timer.
time.c: Detected 2002.578 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
Placing software IO TLB between 0x633d000 - 0x833d000
Memory: 4070468k/5242880k available (2282k kernel code, 122840k reserved, 1211k data, 532k init)
Calibrating delay using timer specific routine.. 4012.03 BogoMIPS (lpj=2006016)
Security Framework v1.0.0 initialized
Capability LSM initialized
Mount-cache hash table entries: 256
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 0(2) -> Node 0 -> Core 0
mtrr: v2.0 (20020519)
 tbxface-0120 [02] acpi_load_tables      : ACPI Tables successfully acquired
Parsing all Control Methods:...................................................................................................................................................
Table [DSDT](id F004) - 547 Objects with 51 Devices 147 Methods 25 Regions
ACPI Namespace successfully loaded at root ffffffff804c8000
evxfevnt-0096 [03] acpi_enable           : Transition to ACPI mode successful
Using local APIC timer interrupts.
Detected 12.516 MHz APIC timer.
Booting processor 1/2 APIC 0x1
Initializing CPU#1
Calibrating delay using timer specific routine.. 4004.57 BogoMIPS (lpj=2002289)
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 512K (64 bytes/line)
CPU 1(2) -> Node 0 -> Core 1
AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ stepping 01
CPU 1: Syncing TSC to CPU 0.
CPU 1: synchronized TSC with CPU 0 (last diff 0 cycles, maxerr 568 cycles)
Brought up 2 CPUs
time.c: Using PIT/TSC based timekeeping.
testing NMI watchdog ... OK.
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: Using configuration type 1
ACPI: Subsystem revision 20050408
evgpeblk-1016 [06] ev_create_gpe_block   : GPE 00 to 0F [_GPE] 2 regs on int 0x9
evgpeblk-1024 [06] ev_create_gpe_block   : Found 7 Wake, Enabled 0 Runtime GPEs in this block
Completing Region/Field/Buffer/Package initialization:............................................................................................................................
Initialized 24/25 Regions 44/44 Fields 41/41 Buffers 15/16 Packages (556 nodes)
Executing all Device _STA and_INI methods:.......................................................
55 Devices found containing: 55 _STA, 0 _INI methods
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (0000:00)
PCI: Probing PCI hardware (bus 00)
ACPI: Assume root bridge [\_SB_.PCI0] segment is 0
Boot video device is 0000:01:00.0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
PCI: If a device doesn't work, try "pci=routeirq".  If it helps, post a report
agpgart: Detected AGP bridge 0
agpgart: AGP aperture is 64M @ 0xdc000000
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
PCI: Bridge: 0000:00:01.0
  IO window: e000-efff
  MEM window: fbe00000-fbffffff
  PREFETCH window: e0000000-faffffff
acpi_bus-0212 [01] acpi_bus_set_power    : Device is not power manageable
PCI: Setting latency timer of device 0000:00:01.0 to 64
IA32 emulation $Id: sys_ia32.c,v 1.32 2002/03/24 13:02:28 ak Exp $
audit: initializing netlink socket (disabled)
audit(1128687641.426:1): initialized
Total HugeTLB memory allocated, 0
Initializing Cryptographic API
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
Linux agpgart interface v0.101 (c) Dave Jones
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
io scheduler noop registered
io scheduler deadline registered
io scheduler cfq registered
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
acpi_bus-0212 [01] acpi_bus_set_power    : Device is not power manageable
ACPI: PCI Interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 169
PCI: Via IRQ fixup for 0000:00:0f.1, from 255 to 9
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
    ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:pio, hdb:pio
    ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:DMA, hdd:pio
Probing IDE interface ide0...
Probing IDE interface ide1...
hdc: IC35L080AVVA07-0, ATA DISK drive
isa bounce pool size: 16 pages
ide1 at 0x170-0x177,0x376 on irq 15
hdc: max request size: 128KiB
hdc: 160836480 sectors (82348 MB) w/1863KiB Cache, CHS=65535/16/63, UDMA(100)
hdc: cache flushes supported
 hdc: hdc1 hdc2 hdc3 hdc4
libata version 1.12 loaded.
sata_via version 1.1
ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 169
PCI: Via IRQ fixup for 0000:00:0f.0, from 10 to 9
sata_via(0000:00:0f.0): routed to hard irq line 9
ata1: SATA max UDMA/133 cmd 0xC400 ctl 0xC002 bmdma 0xB000 irq 169
ata2: SATA max UDMA/133 cmd 0xB800 ctl 0xB402 bmdma 0xB008 irq 169
ata1: no device found (phy stat 00000000)
scsi0 : sata_via
ata2: dev 0 cfg 49:2f00 82:74eb 83:7fea 84:4023 85:74e8 86:3c02 87:4023 88:203f
ata2: dev 0 ATA, max UDMA/100, 488397168 sectors: lba48
ata2: dev 0 configured for UDMA/100
scsi1 : sata_via
  Vendor: ATA       Model: HDS722525VLSA80   Rev: V36O
  Type:   Direct-Access                      ANSI SCSI revision: 05
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
SCSI device sda: 488397168 512-byte hdwr sectors (250059 MB)
SCSI device sda: drive cache: write back
 sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >
Attached scsi disk sda at scsi1, channel 0, id 0, lun 0
Attached scsi generic sg0 at scsi1, channel 0, id 0, lun 0,  type 0
mice: PS/2 mouse device common for all mice
input: PC Speaker
device-mapper: 4.4.0-ioctl (2005-01-12) initialised: dm-devel@redhat.com
NET: Registered protocol family 2
IP route cache hash table entries: 262144 (order: 9, 2097152 bytes)
TCP established hash table entries: 262144 (order: 10, 4194304 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 262144 bind 65536)
TCP reno registered
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 532k freed
input: AT Translated Set 2 keyboard on isa0060/serio0
Adding 7912004k swap on /dev/sda2.  Priority:2 extents:1
Adding 3906496k swap on /dev/hdc3.  Priority:4 extents:1
802.1Q VLAN Support v1.8 Ben Greear <greearb@candelatech.com>
All bugs added by David S. Miller <davem@redhat.com>
8139too Fast Ethernet driver 0.9.27
ACPI: PCI Interrupt 0000:00:0c.0[A] -> GSI 17 (level, low) -> IRQ 177
eth0: RealTek RTL8139 at 0xffffc20000012000, 00:c0:26:a1:92:f5, IRQ 177
eth0:  Identified 8139 chip type 'RTL-8139C'
ReiserFS: sda9: found reiserfs format "3.6" with standard journal
ReiserFS: sda9: using ordered data mode
ReiserFS: sda9: journal params: device sda9, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda9: checking transaction log (sda9)
ReiserFS: sda9: Using r5 hash to sort names
ReiserFS: hdc4: found reiserfs format "3.6" with standard journal
ReiserFS: hdc4: using ordered data mode
ReiserFS: hdc4: journal params: device hdc4, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc4: checking transaction log (hdc4)
ReiserFS: hdc4: Using r5 hash to sort names
ReiserFS: sda11: found reiserfs format "3.6" with standard journal
ReiserFS: sda11: using ordered data mode
ReiserFS: sda11: journal params: device sda11, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda11: checking transaction log (sda11)
ReiserFS: sda11: Using r5 hash to sort names
ReiserFS: sda8: found reiserfs format "3.6" with standard journal
ReiserFS: sda8: using ordered data mode
ReiserFS: sda8: journal params: device sda8, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda8: checking transaction log (sda8)
ReiserFS: sda8: Using r5 hash to sort names
ReiserFS: hdc1: found reiserfs format "3.6" with standard journal
ReiserFS: hdc1: using ordered data mode
ReiserFS: hdc1: journal params: device hdc1, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc1: checking transaction log (hdc1)
ReiserFS: hdc1: Using r5 hash to sort names
ReiserFS: sda10: found reiserfs format "3.6" with standard journal
ReiserFS: sda10: using ordered data mode
ReiserFS: sda10: journal params: device sda10, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda10: checking transaction log (sda10)
ReiserFS: sda10: Using r5 hash to sort names
ReiserFS: sda5: found reiserfs format "3.6" with standard journal
ReiserFS: sda5: using ordered data mode
ReiserFS: sda5: journal params: device sda5, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda5: checking transaction log (sda5)
ReiserFS: sda5: Using r5 hash to sort names
ReiserFS: sda6: found reiserfs format "3.6" with standard journal
ReiserFS: sda6: using ordered data mode
ReiserFS: sda6: journal params: device sda6, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: sda6: checking transaction log (sda6)
ReiserFS: sda6: Using r5 hash to sort names
ReiserFS: hdc2: found reiserfs format "3.6" with standard journal
ReiserFS: hdc2: using ordered data mode
ReiserFS: hdc2: journal params: device hdc2, size 8192, journal first block 18, max trans len 1024, max batch 900, max commit age 30, max trans age 30
ReiserFS: hdc2: checking transaction log (hdc2)
ReiserFS: hdc2: Using r5 hash to sort names
eth0: link up, 100Mbps, full-duplex, lpa 0x41E1
vlan0169: add 01:00:5e:00:00:01 mcast address to master interface
vlan0170: add 01:00:5e:00:00:01 mcast address to master interface
NET: Registered protocol family 10
vlan0169: add 33:33:00:00:00:01 mcast address to master interface
vlan0169: add 33:33:ff:a1:92:f5 mcast address to master interface
vlan0170: add 33:33:00:00:00:01 mcast address to master interface
vlan0170: add 33:33:ff:a1:92:f5 mcast address to master interface
IPv6 over IPv4 tunneling driver


* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-07 12:26 ` Vladimir B. Savkin
@ 2005-10-07 12:31   ` Andi Kleen
  2005-10-07 14:15     ` Vladimir B. Savkin
  2005-10-08 10:11     ` Vladimir B. Savkin
  0 siblings, 2 replies; 28+ messages in thread
From: Andi Kleen @ 2005-10-07 12:31 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: lkml, Andrew Morton, john stultz, discuss

On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > Andrew,
> > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > the case, and alternate timesources should be used instead.
> >
> > For more details, see:
> > http://bugzilla.kernel.org/show_bug.cgi?id=5105
>
> I too have a box that shows the symptoms from bugzilla entry above.
> The system is Asus A8V Deluxe MB with
> "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
>
> The patch below did not fix the problem, while "idle=poll" did.
> Hope this helps, dmesg attached.

Are you running the latest BIOS?

-Andi

>
> > Please consider for inclusion in your tree.
> >
> > thanks
> > -john
> >
> > diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> > --- a/arch/x86_64/kernel/time.c
> > +++ b/arch/x86_64/kernel/time.c
> > @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
> >   	   are handled in the OEM check above. */
> >   	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> >   		return 0;
> > - 	/* All in a single socket - should be synchronized */
> > - 	if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> > - 		return 0;
> >  #endif
> >   	/* Assume multi socket systems are not synchronized */
> >   	return num_online_cpus() > 1;
>
> ~
>
> :wq
>
>                                         With best regards,
>                                            Vladimir Savkin.


* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-07 12:31   ` Andi Kleen
@ 2005-10-07 14:15     ` Vladimir B. Savkin
  2005-10-07 14:21       ` [discuss] " Velu Erwan
  2005-10-08 10:11     ` Vladimir B. Savkin
  1 sibling, 1 reply; 28+ messages in thread
From: Vladimir B. Savkin @ 2005-10-07 14:15 UTC (permalink / raw)
  To: Andi Kleen; +Cc: lkml, Andrew Morton, john stultz, discuss

On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> > I too have a box that shows the symptoms from bugzilla entry above.
> > The system is Asus A8V Deluxe MB with
> > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> >
> > The patch below did not fix the problem, while "idle=poll" did.
> > Hope this helps, dmesg attached.
> 
> Are you running the latest BIOS?

Well, I think not.
The Asus file download page has been unavailable since yesterday.

> 
> -Andi
> 
~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-07 14:15     ` Vladimir B. Savkin
@ 2005-10-07 14:21       ` Velu Erwan
  0 siblings, 0 replies; 28+ messages in thread
From: Velu Erwan @ 2005-10-07 14:21 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: Andi Kleen, lkml, Andrew Morton, john stultz, discuss

Vladimir B. Savkin wrote:

>Well, I think not.
>The Asus file download page has been unavailable since yesterday.
>  
>
Agreed but ftp://ftp.asus.com:/pub/ASUS/mb/socket939/a8v-deluxe is still 
available ;)


* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-07 12:31   ` Andi Kleen
  2005-10-07 14:15     ` Vladimir B. Savkin
@ 2005-10-08 10:11     ` Vladimir B. Savkin
  2005-10-10 18:03       ` john stultz
  1 sibling, 1 reply; 28+ messages in thread
From: Vladimir B. Savkin @ 2005-10-08 10:11 UTC (permalink / raw)
  To: Andi Kleen; +Cc: lkml, Andrew Morton, john stultz, discuss

On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > Andrew,
> > > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > the case, and alternate timesources should be used instead.
> > >
> > > For more details, see:
> > > http://bugzilla.kernel.org/show_bug.cgi?id=5105
> >
> > I too have a box that shows the symptoms from bugzilla entry above.
> > The system is Asus A8V Deluxe MB with
> > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> >
> > The patch below did not fix the problem, while "idle=poll" did.
> > Hope this helps, dmesg attached.
> 
> Are you running the latest BIOS?

Just upgraded to the latest BIOS (revision 1014), nothing changed.
Only with "idle=poll" do the timers run normally.

> 
> > > Please consider for inclusion in your tree.
> > >
> > > thanks
> > > -john
> > >
> > > diff --git a/arch/x86_64/kernel/time.c b/arch/x86_64/kernel/time.c
> > > --- a/arch/x86_64/kernel/time.c
> > > +++ b/arch/x86_64/kernel/time.c
> > > @@ -959,9 +959,6 @@ static __init int unsynchronized_tsc(voi
> > >   	   are handled in the OEM check above. */
> > >   	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL)
> > >   		return 0;
> > > - 	/* All in a single socket - should be synchronized */
> > > - 	if (cpus_weight(cpu_core_map[0]) == num_online_cpus())
> > > - 		return 0;
> > >  #endif
> > >   	/* Assume multi socket systems are not synchronized */
> > >   	return num_online_cpus() > 1;
> >
> > ~
~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-08 10:11     ` Vladimir B. Savkin
@ 2005-10-10 18:03       ` john stultz
  2005-10-10 18:12         ` Vladimir B. Savkin
  0 siblings, 1 reply; 28+ messages in thread
From: john stultz @ 2005-10-10 18:03 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: Andi Kleen, lkml, Andrew Morton, discuss

On Sat, 2005-10-08 at 14:11 +0400, Vladimir B. Savkin wrote:
> On Fri, Oct 07, 2005 at 02:31:46PM +0200, Andi Kleen wrote:
> > On Friday 07 October 2005 14:26, Vladimir B. Savkin wrote:
> > > On Mon, Sep 19, 2005 at 12:16:43PM -0700, john stultz wrote:
> > > > Andrew,
> > > > 	This patch should resolve the issue seen in bugme bug #5105, where it
> > > > is assumed that dualcore x86_64 systems have synced TSCs. This is not
> > > > the case, and alternate timesources should be used instead.
> > > >
> > > > For more details, see:
> > > > http://bugzilla.kernel.org/show_bug.cgi?id=5105
> > >
> > > I too have a box that shows the symptoms from bugzilla entry above.
> > > The system is Asus A8V Deluxe MB with
> > > "AMD Athlon(tm) 64 X2 Dual Core Processor 3800+".
> > >
> > > The patch below did not fix the problem, while "idle=poll" did.
> > > Hope this helps, dmesg attached.
> > 
> > Are you running the latest BIOS?
> 
> Just upgraded to the latest BIOS (revision 1014), nothing changed.
> Only with "idle=poll" do the timers run normally.

From your dmesg, it appears that there are no timesources other
than the TSC available on your hardware. So I'm guessing idle=poll is
keeping the CPUs from halting the TSC and keeping them synched.


I would think that the ACPI PM timer would be supported, but I don't see
anything about it in your dmesg. Could you make sure it is properly
configured in?

thanks
-john




* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-10 18:03       ` john stultz
@ 2005-10-10 18:12         ` Vladimir B. Savkin
  2005-10-10 18:19           ` Jonas Oreland
  0 siblings, 1 reply; 28+ messages in thread
From: Vladimir B. Savkin @ 2005-10-10 18:12 UTC (permalink / raw)
  To: john stultz; +Cc: Andi Kleen, lkml, Andrew Morton, discuss

On Mon, Oct 10, 2005 at 11:03:24AM -0700, john stultz wrote:
> From your dmesg, it appears that there are no timesources other
> than the TSC available on your hardware. So I'm guessing idle=poll is
> keeping the CPUs from halting the TSC and keeping them synched.
> 
> 
> I would think that the ACPI PM timer would be supported, but I don't see
> anything about it in your dmesg. Could you make sure it is properly
> configured in?

Yes, I tried different combinations of PM_TIMER and HPET options.
In this try, PM_TIMER was definitely enabled in the kernel config.

What kind of kernel message did you expect from a working PM timer?

~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-10 18:12         ` Vladimir B. Savkin
@ 2005-10-10 18:19           ` Jonas Oreland
  2005-10-11  7:35             ` Vladimir B. Savkin
  0 siblings, 1 reply; 28+ messages in thread
From: Jonas Oreland @ 2005-10-10 18:19 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: john stultz, Andi Kleen, lkml, Andrew Morton, discuss

Hi,

check http://bugzilla.kernel.org/show_bug.cgi?id=5283

/Jonas

Vladimir B. Savkin wrote:
> On Mon, Oct 10, 2005 at 11:03:24AM -0700, john stultz wrote:
> 
>>From your dmesg, it appears that there are no timesources other
>>than the TSC available on your hardware. So I'm guessing idle=poll is
>>keeping the CPUs from halting the TSC and keeping them synched.
>>
>>
>>I would think that the ACPI PM timer would be supported, but I don't see
>>anything about it in your dmesg. Could you make sure it is properly
>>configured in?
> 
> 
> Yes, I tried different combinations of PM_TIMER and HPET options.
> In this try, PM_TIMER was definitely enabled in the kernel config.
> 
> What kind of kernel message did you expect from a working PM timer?
> 
> ~
> :wq
>                                         With best regards, 
>                                            Vladimir Savkin. 
> 



* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-10 18:19           ` Jonas Oreland
@ 2005-10-11  7:35             ` Vladimir B. Savkin
  2005-10-11  8:06               ` Andi Kleen
  2005-10-11 16:27               ` Jonas Oreland
  0 siblings, 2 replies; 28+ messages in thread
From: Vladimir B. Savkin @ 2005-10-11  7:35 UTC (permalink / raw)
  To: Jonas Oreland; +Cc: john stultz, Andi Kleen, lkml, Andrew Morton, discuss

On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
> Hi,
> 
> check http://bugzilla.kernel.org/show_bug.cgi?id=5283

Excuse me for a possibly dumb question, but is it safe to leave TSCs
unsynchronized when using another time source?
How will other subsystems, e.g. traffic queueing disciplines, react?

~
:wq
                                        With best regards, 
                                           Vladimir Savkin. 



* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-11  7:35             ` Vladimir B. Savkin
@ 2005-10-11  8:06               ` Andi Kleen
  2005-10-11 16:27               ` Jonas Oreland
  1 sibling, 0 replies; 28+ messages in thread
From: Andi Kleen @ 2005-10-11  8:06 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: john stultz, lkml, Andrew Morton, discuss

"Vladimir B. Savkin" <master@sectorb.msk.ru> writes:

> On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
> > Hi,
> > 
> > check http://bugzilla.kernel.org/show_bug.cgi?id=5283
> 
> Excuse me for a possibly dumb question, but is it safe to leave TSCs
> unsynchronized when using another time source?
> How will other subsystems, e.g. traffic queueing disciplines, react?

They might see hiccups, but normally they all have relatively
benign failure modes, so I wouldn't worry too much.

If you use it on an Opteron with frequency scaling and multiple sockets,
it would be safer to patch them to use do_gettimeofday() or, better,
monotonic_clock(), because the differences can be very large there
(CPUs running at completely different frequencies). The drawback would
be that it would be slower. On systems without frequency scaling
you would likely see problems, if at all, only after a long uptime.

For some subsystems it is OK, e.g. the scheduler, which also uses
TSCs, specifically deals with unsynchronized clocks.

-Andi
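
As a userspace illustration of the advice above (not kernel code): the
sketch below times an interval with clock_gettime(CLOCK_MONOTONIC), a
system-wide clock, instead of reading the per-CPU TSC directly. The
helper names monotonic_ns() and elapsed_ns() are made up for this
example; inside the kernel the corresponding calls would be the
do_gettimeofday() or monotonic_clock() mentioned above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L   /* for clock_gettime() */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: read a monotonic timestamp in nanoseconds.
 * CLOCK_MONOTONIC is a system-wide clock, so two reads taken on
 * different CPUs can be subtracted safely, unlike raw TSC values
 * from unsynchronized cores. Returns -1 on error. */
static int64_t monotonic_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: nanoseconds elapsed since start_ns, where
 * start_ns came from monotonic_ns(). Returns -1 on error. */
static int64_t elapsed_ns(int64_t start_ns)
{
    int64_t now = monotonic_ns();

    if (now < 0 || start_ns < 0)
        return -1;
    return now - start_ns;
}
```

The price is the one named above: each read goes through a slower path
than a bare TSC read.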



* Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs
  2005-10-11  7:35             ` Vladimir B. Savkin
  2005-10-11  8:06               ` Andi Kleen
@ 2005-10-11 16:27               ` Jonas Oreland
  2005-10-25  7:35                 ` x86-64: Syncing dualcore cpus TSCs Jonas Oreland
  1 sibling, 1 reply; 28+ messages in thread
From: Jonas Oreland @ 2005-10-11 16:27 UTC (permalink / raw)
  To: Vladimir B. Savkin; +Cc: john stultz, Andi Kleen, lkml, Andrew Morton, discuss

Vladimir B. Savkin wrote:
> On Mon, Oct 10, 2005 at 08:19:42PM +0200, Jonas Oreland wrote:
> 
>>Hi,
>>
>>check http://bugzilla.kernel.org/show_bug.cgi?id=5283
> 
> 
> Excuse me for a possibly dumb question, but is it safe to leave TSCs
> unsynchronized when using another time source?
> How will other subsystems, e.g. traffic queueing disciplines, react?

Excuse me for a possibly dumb answer (I'm not a kernel hacker):

Yes, I would guess that this will be handled as on any other
SMP machine where the TSCs aren't in sync.

/Jonas


* x86-64: Syncing dualcore cpus TSCs
  2005-10-11 16:27               ` Jonas Oreland
@ 2005-10-25  7:35                 ` Jonas Oreland
  2005-10-25  7:42                   ` Andi Kleen
  0 siblings, 1 reply; 28+ messages in thread
From: Jonas Oreland @ 2005-10-25  7:35 UTC (permalink / raw)
  To: john stultz, Andi Kleen; +Cc: Jonas Oreland, lkml, Andrew Morton, discuss

Hi,

This might be a very bad suggestion, but here it is:

On dualcore CPUs (amd64) the TSCs get out of sync when a core executes the hlt instruction.
Booting with idle=poll means hlt is never executed, hence the TSCs stay in sync.
Booting with notsc makes the kernel use another time source, but this is slower
  (this is the default after "[PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs").

How about syncing the TSCs after hlt?

If the cost of syncing the TSCs is smaller than the cost of using another time source, this might be an alternative.

/Jonas


* Re: x86-64: Syncing dualcore cpus TSCs
  2005-10-25  7:35                 ` x86-64: Syncing dualcore cpus TSCs Jonas Oreland
@ 2005-10-25  7:42                   ` Andi Kleen
  2005-10-26  0:05                     ` David Lang
  0 siblings, 1 reply; 28+ messages in thread
From: Andi Kleen @ 2005-10-25  7:42 UTC (permalink / raw)
  To: Jonas Oreland; +Cc: john stultz, lkml, Andrew Morton, discuss

On Tuesday 25 October 2005 09:35, Jonas Oreland wrote:
> Hi,
>
> This might be a very bad suggestion, but here it is:
>
> On dualcore CPUs (amd64) the TSCs get out of sync when a core executes the
> hlt instruction. Booting with idle=poll means hlt is never executed, hence
> the TSCs stay in sync. Booting with notsc makes the kernel use another time
> source, but this is slower (this is the default after "[PATCH] x86-64: Fix
> bad assumption that dualcore cpus have synced TSCs").
>
> How about syncing the TSCs after hlt?
>
> If the cost of syncing the TSCs is smaller than the cost of using another
> time source, this might be an alternative.

I very much doubt it is. Syncing TSCs requires stopping multiple CPUs for a
longer time. It is unlikely you can make that up.

-Andi
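
The trade-off being argued here can be written down as a break-even
check. This is only a back-of-the-envelope sketch: the struct, its field
names, and every number fed into it are hypothetical and would have to
be measured on real hardware. Note that resync_ns has to include the
time during which the other CPUs are stopped, which is the cost pointed
out above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical workload description; all values are assumptions. */
struct timer_load {
    uint64_t reads_per_sec;    /* timestamp reads issued per second */
    uint64_t tsc_read_ns;      /* cost of one TSC read */
    uint64_t slow_read_ns;     /* cost of one read of the non-TSC source */
    uint64_t wakeups_per_sec;  /* hlt exits per second */
    uint64_t resync_ns;        /* cost of one TSC resync, all CPUs stalled */
};

/* Nanoseconds spent per second when using the slower time source. */
static uint64_t slow_source_cost(const struct timer_load *l)
{
    return l->reads_per_sec * l->slow_read_ns;
}

/* Nanoseconds spent per second when keeping the TSC and resyncing
 * it after every hlt exit. */
static uint64_t resync_cost(const struct timer_load *l)
{
    return l->reads_per_sec * l->tsc_read_ns +
           l->wakeups_per_sec * l->resync_ns;
}

/* True only if resyncing after hlt is cheaper overall. */
static bool resync_pays_off(const struct timer_load *l)
{
    return resync_cost(l) < slow_source_cost(l);
}
```

With many wakeups per second and an expensive resync, the second term
dominates and the check comes out false, which matches the objection
above.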


* Re: x86-64: Syncing dualcore cpus TSCs
  2005-10-25  7:42                   ` Andi Kleen
@ 2005-10-26  0:05                     ` David Lang
  0 siblings, 0 replies; 28+ messages in thread
From: David Lang @ 2005-10-26  0:05 UTC (permalink / raw)
  To: Andi Kleen; +Cc: Jonas Oreland, john stultz, lkml, Andrew Morton, discuss

On Tue, 25 Oct 2005, Andi Kleen wrote:

> On Tuesday 25 October 2005 09:35, Jonas Oreland wrote:
>> Hi,
>>
>> This might be a very bad suggestion, but here it is:
>>
>> On dualcore CPUs (amd64) the TSCs get out of sync when a core executes the
>> hlt instruction. Booting with idle=poll means hlt is never executed, hence
>> the TSCs stay in sync. Booting with notsc makes the kernel use another time
>> source, but this is slower (this is the default after "[PATCH] x86-64: Fix
>> bad assumption that dualcore cpus have synced TSCs").
>>
>> How about syncing the TSCs after hlt?
>>
>> If the cost of syncing the TSCs is smaller than the cost of using another
>> time source, this might be an alternative.
>
> I very much doubt it is. Syncing TSCs requires stopping multiple CPUs for a
> longer time. It is unlikely you can make that up.

I may be misunderstanding things, but as I understand it the reason for
calling hlt is to save power.

If you really care about the last bit of performance, then using idle=poll
to make the TSCs stay synced makes perfect sense.

It's in cases where you care about saving power that you would want to use
hlt. Can the power management be reasonably configured so that when things
are running close to full-bore hlt isn't called, when things are more idle
it switches to using hlt and a non-TSC timesource or re-syncing the TSC
on wakeup, and then if it's more idle than that it goes into the more
traditional power saving modes where it works to shut down individual CPUs
(obviously having to re-sync the TSC when they wake up)?

David Lang
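
The three-tier idea above could be sketched as a small policy function.
Everything here is hypothetical: the enum, the function name, and the
two load thresholds are placeholders for illustration, not an existing
kernel interface.

```c
/* Hypothetical idle strategies, one per tier described above. */
enum idle_mode {
    IDLE_POLL,        /* near full load: never hlt, TSCs stay in sync */
    IDLE_HLT,         /* moderately idle: hlt, plus a non-TSC
                       * timesource or a TSC resync on wakeup */
    IDLE_DEEP_SLEEP   /* mostly idle: shut down individual CPUs,
                       * resync the TSC when they wake up */
};

/* Pick an idle strategy from the current load (0..100 percent).
 * busy_pct and idle_pct are assumed tunables with idle_pct < busy_pct. */
static enum idle_mode pick_idle_mode(unsigned int load_pct,
                                     unsigned int busy_pct,
                                     unsigned int idle_pct)
{
    if (load_pct >= busy_pct)
        return IDLE_POLL;
    if (load_pct > idle_pct)
        return IDLE_HLT;
    return IDLE_DEEP_SLEEP;
}
```

For example, with thresholds of 80 and 20 percent, a load of 50 percent
selects IDLE_HLT. A real policy would also need hysteresis so that the
mode does not flap around a threshold.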

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


Thread overview: 28+ messages
2005-09-19 19:16 [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs john stultz
2005-09-19 19:31 ` Andi Kleen
2005-09-19 19:42   ` john stultz
2005-09-19 19:49     ` [discuss] " Andi Kleen
2005-09-20 18:59       ` john stultz
2005-09-21  4:03         ` Daniel Jacobowitz
2005-09-21 15:15           ` Ray Bryant
2005-09-21 15:04             ` Andi Kleen
2005-09-21 15:46               ` Ray Bryant
2005-09-22  8:00                 ` Jonas Oreland
2005-09-21 20:17               ` Andrew Morton
2005-10-07 12:26 ` Vladimir B. Savkin
2005-10-07 12:31   ` Andi Kleen
2005-10-07 14:15     ` Vladimir B. Savkin
2005-10-07 14:21       ` [discuss] " Velu Erwan
2005-10-08 10:11     ` Vladimir B. Savkin
2005-10-10 18:03       ` john stultz
2005-10-10 18:12         ` Vladimir B. Savkin
2005-10-10 18:19           ` Jonas Oreland
2005-10-11  7:35             ` Vladimir B. Savkin
2005-10-11  8:06               ` Andi Kleen
2005-10-11 16:27               ` Jonas Oreland
2005-10-25  7:35                 ` x86-64: Syncing dualcore cpus TSCs Jonas Oreland
2005-10-25  7:42                   ` Andi Kleen
2005-10-26  0:05                     ` David Lang
  -- strict thread matches above, loose matches on Subject: below --
2005-09-20 19:13 [discuss] Re: [PATCH] x86-64: Fix bad assumption that dualcore cpus have synced TSCs Langsdorf, Mark
2005-09-20 19:24 ` Scott Lampert
2005-09-20 19:30   ` john stultz
