* AMD X2 unsynced TSC fix?
@ 2006-10-27 17:15 Lee Revell
2006-10-27 20:18 ` Luca Tettamanti
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Lee Revell @ 2006-10-27 17:15 UTC (permalink / raw)
To: linux-kernel; +Cc: Andi Kleen, john stultz
Someone recently pointed out to me that a Windows "CPU driver update"
supplied by AMD fixes the unsynced TSC problem on dual core AMD64
systems.
http://www.amd.com/us-en/Processors/TechnicalResources/0,,30_182_871_13118,00.html
"The AMD Dual-Core Optimizer can help improve some PC gaming video
performance by compensating for those applications that bypass the
Windows API for timing by directly using the RDTSC (Read Time Stamp
Counter) instruction. Applications that rely on RDTSC do not benefit
from the logic in the operating system to properly account for the
effect of power management mechanisms on the rate at which a processor
core's Time Stamp Counter (TSC) is incremented. The AMD Dual-Core
Optimizer helps to correct the resulting video performance effects or
other incorrect timing effects that these applications may experience on
dual-core processor systems, by periodically adjusting the core
time-stamp-counters, so that they are synchronized."
What are the chances of Linux getting a similar fix?
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-27 17:15 AMD X2 unsynced TSC fix? Lee Revell
@ 2006-10-27 20:18 ` Luca Tettamanti
2006-10-27 23:04 ` thockin
2006-10-27 20:35 ` Andi Kleen
2006-10-27 21:58 ` Friedrich Göpel
2 siblings, 1 reply; 65+ messages in thread
From: Luca Tettamanti @ 2006-10-27 20:18 UTC (permalink / raw)
To: Lee Revell; +Cc: linux-kernel, Andi Kleen, john stultz
Lee Revell <rlrevell@joe-job.com> wrote:
> Someone recently pointed out to me that a Windows "CPU driver update"
> supplied by AMD fixes the unsynced TSC problem on dual core AMD64
> systems.
[...]
> other incorrect timing effects that these applications may experience on
> dual-core processor systems, by periodically adjusting the core
> time-stamp-counters, so that they are synchronized."
>
> What are the chances of Linux getting a similar fix?
Zero? ;)
There's always a window where the TSCs are not in sync (and userspace may
see a non-monotonic counter). Furthermore, when C'n'Q is active the TSCs
aren't updated at a fixed frequency, so userspace cannot use the TSC for
timing anyway.
Luca
--
> While we're on all of this, are we going to change "tainted" to some
> other less alarmist word?
"screwed" -- Alexander Viro
* Re: AMD X2 unsynced TSC fix?
2006-10-27 17:15 AMD X2 unsynced TSC fix? Lee Revell
2006-10-27 20:18 ` Luca Tettamanti
@ 2006-10-27 20:35 ` Andi Kleen
2006-10-27 20:41 ` Lee Revell
2006-10-27 21:58 ` Friedrich Göpel
2 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2006-10-27 20:35 UTC (permalink / raw)
To: Lee Revell; +Cc: linux-kernel, john stultz
> What are the chances of Linux getting a similar fix?
"Fix" isn't the word I would use for this particular implementation.
-Andi
* Re: AMD X2 unsynced TSC fix?
2006-10-27 20:35 ` Andi Kleen
@ 2006-10-27 20:41 ` Lee Revell
2006-10-27 21:48 ` Chris Friesen
0 siblings, 1 reply; 65+ messages in thread
From: Lee Revell @ 2006-10-27 20:41 UTC (permalink / raw)
To: Andi Kleen; +Cc: linux-kernel, john stultz
On Fri, 2006-10-27 at 13:35 -0700, Andi Kleen wrote:
> > What are the chances of Linux getting a similar fix?
>
> "Fix" isn't the word I would use for this particular implementation.
What exactly does that AMD patch do? Other OS users report that it
makes TSC usable for timing again. Does it do something really heavy
handed like disable power management features?
Lee
* Re: AMD X2 unsynced TSC fix?
2006-10-27 20:41 ` Lee Revell
@ 2006-10-27 21:48 ` Chris Friesen
2006-10-27 22:08 ` Lee Revell
0 siblings, 1 reply; 65+ messages in thread
From: Chris Friesen @ 2006-10-27 21:48 UTC (permalink / raw)
To: Lee Revell; +Cc: Andi Kleen, linux-kernel, john stultz
Lee Revell wrote:
> What exactly does that AMD patch do?
"...by periodically adjusting the core time-stamp-counters, so that they
are synchronized."
It sounds like they just periodically write a new value to the TSC.
Presumably they set the "slower" one equal to the "faster" one.
You'd likely still have windows where time might run backwards, but it
would be better than nothing.
Chris
* Re: AMD X2 unsynced TSC fix?
2006-10-27 17:15 AMD X2 unsynced TSC fix? Lee Revell
2006-10-27 20:18 ` Luca Tettamanti
2006-10-27 20:35 ` Andi Kleen
@ 2006-10-27 21:58 ` Friedrich Göpel
2 siblings, 0 replies; 65+ messages in thread
From: Friedrich Göpel @ 2006-10-27 21:58 UTC (permalink / raw)
To: linux-kernel
On 13:15 Fri 27 Oct , Lee Revell wrote:
> Someone recently pointed out to me that a Windows "CPU driver update"
> supplied by AMD fixes the unsynced TSC problem on dual core AMD64
> systems.
>
...
> What are the chances of Linux getting a similar fix?
>
> Lee
>
Hi,
This earlier post seems to suggest someone is indeed working on
something similar, if I'm understanding this correctly:
http://lkml.org/lkml/2006/10/27/27
quote:
> Jiri Bohac (jbohac@suse.cz) is currently working on a new timekeeping code for
> x86-64 that takes a significantly different approach that allows for
> precise and fast gettimeofday even on CPUs with unsynchronized TSCs.
Cheers,
Friedrich Göpel
* Re: AMD X2 unsynced TSC fix?
2006-10-27 21:48 ` Chris Friesen
@ 2006-10-27 22:08 ` Lee Revell
2006-10-28 3:58 ` Sergio Monteiro Basto
0 siblings, 1 reply; 65+ messages in thread
From: Lee Revell @ 2006-10-27 22:08 UTC (permalink / raw)
To: Chris Friesen; +Cc: Andi Kleen, linux-kernel, john stultz
On Fri, 2006-10-27 at 15:48 -0600, Chris Friesen wrote:
> Lee Revell wrote:
>
> > What exactly does that AMD patch do?
>
> "...by periodically adjusting the core time-stamp-counters, so that they
> are synchronized."
>
> It sounds like they just periodically write a new value to the TSC.
> Presumably they set the "slower" one equal to the "faster" one.
>
> You'd likely still have windows where time might run backwards, but it
> would be better than nothing.
The patch also apparently changes boot params to make the OS use the
ACPI PM timer, so it must not be a complete solution.
Lee
* Re: AMD X2 unsynced TSC fix?
2006-10-27 20:18 ` Luca Tettamanti
@ 2006-10-27 23:04 ` thockin
2006-10-28 0:00 ` Luca Tettamanti
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: thockin @ 2006-10-27 23:04 UTC (permalink / raw)
To: Luca Tettamanti; +Cc: Lee Revell, linux-kernel, Andi Kleen, john stultz
On Fri, Oct 27, 2006 at 10:18:20PM +0200, Luca Tettamanti wrote:
> Lee Revell <rlrevell@joe-job.com> wrote:
> > Someone recently pointed out to me that a Windows "CPU driver update"
> > supplied by AMD fixes the unsynced TSC problem on dual core AMD64
> > systems.
> [...]
> > other incorrect timing effects that these applications may experience on
> > dual-core processor systems, by periodically adjusting the core
> > time-stamp-counters, so that they are synchronized."
> >
> > What are the chances of Linux getting a similar fix?
>
> Zero? ;)
Wrong. We have a fix that has been under serious testing for a long time.
> There's always a window where the TSCs are not in sync (and userspace may
> see a non-monotonic counter). Furthermore, when C'n'Q is active the TSCs
> aren't updated at a fixed frequency, so userspace cannot use the TSC for
> timing anyway.
Wrong, too. We have a patch that will be coming SOON (trust me, I am
pushing hard for the author to publish it). With this patch applied you
should never see the TSC go backwards. Period. It should be monotonic
(to userspace, kernel rdtsc calls can still be wrong). CPUs should stay
very nearly in sync (again, to userspace). The overhead of this patch is
pretty minimal and costs nothing unless you actually read the TSC.
The catch is that, while it is monotonic, it is not guaranteed to be
perfectly linear. For many applications, this will be good enough. Time
will always move forward, and you won't be subject to the weird HZ
granularity gettimeofday that unsynced TSCs can show.
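To illustrate the "monotonic but not perfectly linear" trade-off only: one generic way to hand out a counter that never goes backwards is to clamp every raw reading against the largest value returned so far. This is a minimal sketch and not the unpublished patch mentioned above; the names monotonic_clamp and last_returned are invented for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state: the largest value handed out so far. */
static _Atomic uint64_t last_returned;

/*
 * Return a value that is never smaller than any value returned earlier,
 * given a raw counter reading that may jump backwards (for example after
 * the reader migrates to a CPU whose TSC lags behind).
 */
uint64_t monotonic_clamp(uint64_t raw)
{
    uint64_t prev = atomic_load(&last_returned);

    while (raw > prev) {
        /* Try to publish the new maximum; on failure prev is reloaded. */
        if (atomic_compare_exchange_weak(&last_returned, &prev, raw))
            return raw;
    }
    /* The raw reading is behind what was already seen: repeat the maximum. */
    return prev;
}
```

A reader that lands on a lagging CPU sees the same value repeated until that CPU catches up, which is exactly the loss of linearity described above. The shared variable also costs cache-line traffic between CPUs, so this is only a model of the idea.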
I'm BCCing the author to poke him more publicly.
Tim
* Re: AMD X2 unsynced TSC fix?
2006-10-27 23:04 ` thockin
@ 2006-10-28 0:00 ` Luca Tettamanti
2006-10-28 0:17 ` Lee Revell
2006-10-28 2:46 ` thockin
2006-10-28 1:04 ` Andi Kleen
2006-10-30 20:30 ` Christoph Lameter
2 siblings, 2 replies; 65+ messages in thread
From: Luca Tettamanti @ 2006-10-28 0:00 UTC (permalink / raw)
To: thockin@hockin.org; +Cc: Lee Revell, linux-kernel, Andi Kleen, john stultz
On 10/28/06, thockin@hockin.org <thockin@hockin.org> wrote:
> On Fri, Oct 27, 2006 at 10:18:20PM +0200, Luca Tettamanti wrote:
> > There's always a window where the TSCs are not in sync (and userspace may
> > see a non-monotonic counter). Furthermore, when C'n'Q is active the TSCs
> > aren't updated at a fixed frequency, so userspace cannot use the TSC for
> > timing anyway.
>
> Wrong, too. We have a patch that will be coming SOON (trust me, I am
> pushing hard for the author to publish it). With this patch applied you
> should never see the TSC go backwards. Period. It should be monotonic
> (to userspace, kernel rdtsc calls can still be wrong). CPUs should stay
> very nearly in sync (again, to userspace). The overhead of this patch is
> pretty minimal and costs nothing unless you actually read the TSC.
I know that it's possible to resync the TSCs, but:
> The catch is that, while it is monotonic, it is not guaranteed to be
> perfectly linear. For many applications, this will be good enough. Time
> will always move forward, and you won't be subject to the weird HZ
> granularity gettimeofday that unsynced TSCs can show.
As you say you cannot use it to do timing unless you disable any power
management on the CPU. Otherwise you can count the elapsed ticks but
you cannot convert the number to anything meaningful.
You may be able to emulate rdtsc for userspace but then again the
whole point of using rdtsc is that it should be uber-fast... if rdtsc
is emulated then you can just use gettimeofday (which is also
optimized to be *very* fast). No?
Luca
* Re: AMD X2 unsynced TSC fix?
2006-10-28 0:00 ` Luca Tettamanti
@ 2006-10-28 0:17 ` Lee Revell
2006-10-28 2:46 ` thockin
1 sibling, 0 replies; 65+ messages in thread
From: Lee Revell @ 2006-10-28 0:17 UTC (permalink / raw)
To: Luca Tettamanti; +Cc: thockin@hockin.org, linux-kernel, Andi Kleen, john stultz
On Sat, 2006-10-28 at 02:00 +0200, Luca Tettamanti wrote:
>
> As you say you cannot use it to do timing unless you disable any power
> management on the CPU. Otherwise you can count the elapsed ticks but
> you cannot convert the number to anything meaningful.
> You may be able to emulate rdtsc for userspace but then again the
> whole point of using rdtsc is that it should be uber-fast... if rdtsc
> is emulated then you can just use gettimeofday (which is also
> optimized to be *very* fast). No?
gettimeofday() cannot be fast if it has to use the ACPI PM timer. It's
50% slower on my shiny new "AMD Athlon(tm)64 X2 Dual Core Processor
3800+" than on my 600MHz Via C3, which in general is about a 10x slower
machine. That's a massive regression.
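A comparison like this can be reproduced with a small microbenchmark. The following is a minimal sketch, assuming a POSIX system with clock_gettime(); the function name gtod_ns_per_call is made up for the example, and the result depends heavily on which clocksource the kernel has selected.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/time.h>
#include <time.h>

/* Average cost of one gettimeofday() call, in nanoseconds.
 * Returns a negative value if iterations is not positive. */
double gtod_ns_per_call(long iterations)
{
    struct timespec start, end;
    struct timeval tv;

    if (iterations <= 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        gettimeofday(&tv, NULL);
    clock_gettime(CLOCK_MONOTONIC, &end);

    /* Elapsed wall time across the whole loop, in nanoseconds. */
    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9 +
                        (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Calling gtod_ns_per_call(1000000) on two machines, or on one machine booted with different clocksources, gives a per-call figure that can be compared directly.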
Lee
* Re: AMD X2 unsynced TSC fix?
2006-10-27 23:04 ` thockin
2006-10-28 0:00 ` Luca Tettamanti
@ 2006-10-28 1:04 ` Andi Kleen
2006-10-28 3:28 ` Lee Revell
2006-10-30 20:30 ` Christoph Lameter
2 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 1:04 UTC (permalink / raw)
To: thockin; +Cc: Luca Tettamanti, Lee Revell, linux-kernel, john stultz
> Wrong, too. We have a patch that will be coming SOON (trust me, I am
> pushing hard for the author to publish it). With this patch applied you
> should never see the TSC go backwards. Period. It should be monotonic
> (to userspace, kernel rdtsc calls can still be wrong). CPUs should stay
> very nearly in sync (again, to userspace). The overhead of this patch is
> pretty minimal and costs nothing unless you actually read the TSC.
There is another patch in the pipeline to make gettimeofday use
RDTSC in more cases by keeping the offsets per CPU
(this has nothing to do with syncing TSCs which is not possible
in the general case on several platforms)
I don't think it makes too much sense to hack on pure RDTSC when
gtod is fast enough -- RDTSC will be always icky and hard to use.
-Andi
* Re: AMD X2 unsynced TSC fix?
2006-10-28 0:00 ` Luca Tettamanti
2006-10-28 0:17 ` Lee Revell
@ 2006-10-28 2:46 ` thockin
2006-10-28 3:59 ` Andi Kleen
1 sibling, 1 reply; 65+ messages in thread
From: thockin @ 2006-10-28 2:46 UTC (permalink / raw)
To: Luca Tettamanti; +Cc: Lee Revell, linux-kernel, Andi Kleen, john stultz
On Sat, Oct 28, 2006 at 02:00:11AM +0200, Luca Tettamanti wrote:
> I know that it's possible to resync the TSCs, but:
>
> >The catch is that, while it is monotonic, it is not guaranteed to be
> >perfectly linear. For many applications, this will be good enough. Time
> >will always move forward, and you won't be subject to the weird HZ
> >granularity gettimeofday that unsynced TSCs can show.
>
> As you say you cannot use it to do timing unless you disable any power
> management on the CPU. Otherwise you can count the elapsed ticks but
> you cannot convert the number to anything meaningful.
If you have a third-party clock you can get pretty darn close.
Fortunately, we usually have an HPET, these days. You can definitely
resync and get near-linear values of RDTSC.
> You may be able to emulate rdtsc for userspace but then again the
> whole point of using rdtsc is that it should be uber-fast... if rdtsc
> is emulated then you can just use gettimeofday (which is also
> optimized to be *very* fast). No?
We're not emulating it at all. The vast vast vast majority of rdtsc calls
are nothing more than the RDTSC instruction. RDTSC is faster than
gettimeofday(), necessarily. If gettimeofday() uses RDTSC, then the
gettimeofday() vsyscall will be pretty good.
But, if I recall, i386 does not support vsyscall? 32 bit binaries on
x86_64 do not support vsyscall. There is still a need for very fast
pure RDTSC.
There are a few problems at hand. I'm not familiar with the patch Andi's
talking about but it has to solve all these problems to be really useful:
* TSC skew across CPUs at bootup (Linux handles this already)
* TSC drift across CPUs at the "same" frequency (pretty constant, minimal)
* TSC drift because of PM states, such as C1 (hlt) (semi-random, severe)
Anyway, I hope that all solutions will be considered. And I hope this
patch comes soon.
Tim
* Re: AMD X2 unsynced TSC fix?
2006-10-28 1:04 ` Andi Kleen
@ 2006-10-28 3:28 ` Lee Revell
2006-10-28 5:28 ` Willy Tarreau
0 siblings, 1 reply; 65+ messages in thread
From: Lee Revell @ 2006-10-28 3:28 UTC (permalink / raw)
To: Andi Kleen; +Cc: thockin, Luca Tettamanti, linux-kernel, john stultz
On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> I don't think it makes too much sense to hack on pure RDTSC when
> gtod is fast enough -- RDTSC will be always icky and hard to use.
I agree FWIW, our application would be happy to just use gtod if it
wasn't so slow on these machines.
Lee
* Re: AMD X2 unsynced TSC fix?
2006-10-27 22:08 ` Lee Revell
@ 2006-10-28 3:58 ` Sergio Monteiro Basto
2006-10-28 4:06 ` Andi Kleen
0 siblings, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-10-28 3:58 UTC (permalink / raw)
To: Lee Revell; +Cc: Chris Friesen, Andi Kleen, linux-kernel, john stultz
On Fri, 2006-10-27 at 18:08 -0400, Lee Revell wrote:
> On Fri, 2006-10-27 at 15:48 -0600, Chris Friesen wrote:
> > Lee Revell wrote:
> >
> > > What exactly does that AMD patch do?
> >
> > "...by periodically adjusting the core time-stamp-counters, so that they
> > are synchronized."
> >
> > It sounds like they just periodically write a new value to the TSC.
> > Presumably they set the "slower" one equal to the "faster" one.
> >
> > You'd likely still have windows where time might run backwards, but it
> > would be better than nothing.
>
> The patch also apparently changes boot params to make the OS use the
> ACPI PM timer, so it must not be a complete solution.
Hi,
As far as I can understand, it seems that my computer, which has a
Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
the hrtimers patch applied
( http://www.tglx.de/projects/hrtimers/2.6.18/ )
the kernel finds and uses a new clocksource, acpi_pm. It works stably,
though I don't deny that it could be a little slower.
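On kernels that expose the clocksource sysfs interface, the clocksource in use can be checked from userspace. The following is a minimal sketch, assuming the usual sysfs path shown in CLOCKSOURCE_PATH; the helper name read_clocksource is made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Assumed location of the active clocksource name in sysfs. */
#define CLOCKSOURCE_PATH \
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Read the first line of the file at path into buf (at most len bytes,
 * newline stripped). Returns 0 on success, -1 if the file cannot be read.
 */
int read_clocksource(const char *path, char *buf, size_t len)
{
    FILE *f;

    if (len == 0)
        return -1;
    f = fopen(path, "r");
    if (!f)
        return -1;
    if (!fgets(buf, (int)len, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    buf[strcspn(buf, "\n")] = '\0';   /* strip the trailing newline */
    return 0;
}
```

Calling read_clocksource(CLOCKSOURCE_PATH, buf, sizeof buf) would be expected to yield a name such as acpi_pm, tsc, hpet or jiffies, depending on what the kernel selected.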
Just to point out: this could be more a problem of chipsets than of CPUs
(AMD or Intel). AMD was just the first to use the x86_64 architecture :)
One last note:
I still have another minor problem, which seems (to me) related to SATA
drives. Kernel 2.6.19-rc3 has big changes to SATA and I would like to test
it, but I can't apply the hrtimers patch (as far as I can tell, half of it
seems to be in the kernel already and the other half not).
In rc3 with the jiffies clocksource, even with the boot parameter "notsc", I
have synchronization issues and many "lost timer ticks" messages, but I can't
say that this is a regression because this computer has never worked well.
Thanks,
--
Sérgio M. B.
* Re: AMD X2 unsynced TSC fix?
2006-10-28 2:46 ` thockin
@ 2006-10-28 3:59 ` Andi Kleen
2006-10-28 6:32 ` thockin
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 3:59 UTC (permalink / raw)
To: thockin, vojtech, Jiri Bohac
Cc: Luca Tettamanti, Lee Revell, linux-kernel, john stultz
On Friday 27 October 2006 19:46, thockin@hockin.org wrote:
> On Sat, Oct 28, 2006 at 02:00:11AM +0200, Luca Tettamanti wrote:
> > I know that it's possible to resync the TSCs, but:
> > >The catch is that, while it is monotonic, it is not guaranteed to be
> > >perfectly linear. For many applications, this will be good enough.
> > > Time will always move forward, and you won't be subject to the weird HZ
> > > granularity gettimeofday that unsynced TSCs can show.
> >
> > As you say you cannot use it to do timing unless you disable any power
> > management on the CPU. Otherwise you can count the elapsed ticks but
> > you cannot convert the number to anything meaningful.
>
> If you have a third-party clock you can get pretty darn close.
Not when powernow is involved on a multi socket system.
This means it could probably be made to work on a variety of systems,
but it wouldn't work on others, and I don't think it makes sense to try
to fix an interface that will never work everywhere.
> Fortunately, we usually have an HPET, these days. You can definitely
> resync and get near-linear values of RDTSC.
No we don't -- most BIOS still don't give us the HPET table
even when it is there in hardware. In the future this will change sure
but people will still run a lot of older motherboards.
> > You may be able to emulate rdtsc for userspace but then again the
> > whole point of using rdtsc is that it should be uber-fast... if rdtsc
> > is emulated then you can just use gettimeofday (which is also
> > optimized to be *very* fast). No?
>
> We're not emulating it at all. The vast vast vast majority of rdtsc calls
> are nothing more than the RDTSC instruction. RDTSC is faster than
> gettimeofday(), necessarily. If gettimeofday() uses RDTSC, then the
> gettimeofday() vsyscall will be pretty good.
Yes.
> But, if I recall, i386 does not support vsyscall?
There are ways to make it work there.
> 32 bit binaries on
> x86_64 do not support vsyscall.
And here too.
Basically you have to test for the calls in the system call vDSO
and jump off. It's a little ugly but possible. I think John had experimental
patches for this once.
> There are a few problems at hand. I'm not familiar with the patch Andi's
> talking about but it has to solve all these problems to be really useful:
It's from Jiri and Vojtech. Basically it will allow RDTSC to be used
in gettimeofday even with unsynchronized TSCs by keeping
the necessary offsets CPU local.
Drawback: for vsyscall you need RDTSCP, this means AMD F stepping
at least. But even as a syscall it will be still faster than before.
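As a rough illustration of what keeping the offsets CPU-local can mean (this is not the Jiri/Vojtech code, which is not shown in this thread): each CPU keeps its own calibration record, and a reading is converted with the record of the CPU the TSC was read on. The struct cpu_timebase and the function cpu_time_ns are hypothetical names, and a fixed per-CPU rate in kHz is an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical per-CPU calibration record. */
struct cpu_timebase {
    uint64_t base_tsc;   /* this CPU's TSC at the last calibration point */
    uint64_t base_ns;    /* reference time in ns at that same instant */
    uint32_t tsc_khz;    /* this CPU's TSC rate in kHz (assumed constant) */
};

/*
 * Convert a TSC value read on the CPU described by tb into nanoseconds.
 * The multiplication can overflow for long intervals, so the record is
 * assumed to be refreshed periodically.
 */
uint64_t cpu_time_ns(const struct cpu_timebase *tb, uint64_t tsc)
{
    uint64_t delta = tsc - tb->base_tsc;

    /* cycles * 1e6 / (cycles per millisecond) = nanoseconds */
    return tb->base_ns + delta * 1000000ULL / tb->tsc_khz;
}
```

Two CPUs whose raw TSCs differ wildly still produce the same time as long as each uses its own record. The reader must know which CPU the TSC value came from, which is presumably why RDTSCP (it returns a CPU-specific auxiliary value together with the TSC) is needed for the vsyscall case.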
> * TSC skew across CPUs at bootup (Linux handles this already)
Just not very well. There is still a significant error when it's done.
> * TSC drift across CPUs at the "same" frequency (pretty constant, minimal)
It just adds up over time.
> * TSC drift because of PM states, such as C1 (hlt) (semi-random, severe)
TSC drift with powernow -- CPUs run at different frequencies
-Andi
* Re: AMD X2 unsynced TSC fix?
2006-10-28 3:58 ` Sergio Monteiro Basto
@ 2006-10-28 4:06 ` Andi Kleen
2006-10-28 4:22 ` Sergio Monteiro Basto
2006-10-28 6:35 ` thockin
0 siblings, 2 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 4:06 UTC (permalink / raw)
To: Sergio Monteiro Basto
Cc: Lee Revell, Chris Friesen, linux-kernel, john stultz
> As far as I can understand, it seems that my computer, which has a
> Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> the hrtimers patch applied
Intel systems (except for some large highend systems) have synchronized TSCs.
Only exception so far seems to be a few systems that are
overclocked/overvolted and running outside their specification.
When you do that you're on your own and we're not interested in a bug
report.
There was also one BIOS found that had this problem, but it was old and rare
and got fixed with an upgrade.
> Just to point out: this could be more a problem of chipsets than of CPUs
> (AMD or Intel). AMD was just the first to use the x86_64 architecture :)
No.
-Andi
* Re: AMD X2 unsynced TSC fix?
2006-10-28 4:06 ` Andi Kleen
@ 2006-10-28 4:22 ` Sergio Monteiro Basto
2006-10-30 3:10 ` Sergio Monteiro Basto
2006-10-28 6:35 ` thockin
1 sibling, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-10-28 4:22 UTC (permalink / raw)
To: Andi Kleen; +Cc: Lee Revell, Chris Friesen, linux-kernel, john stultz
On Fri, 2006-10-27 at 21:06 -0700, Andi Kleen wrote:
> > As far as I can understand, it seems that my computer, which has a
> > Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> > the hrtimers patch applied
>
> Intel systems (except for some large highend systems) have synchronized TSCs.
> Only exception so far seems to be a few systems that are
> overclocked/overvolted and running outside their specification.
> When you do that you're on your own and we're not interested in a bug
> report.
and my computer :)
http://www.asrock.com/product/775Dual-880Pro.htm
http://www.asrock.com/support/CPU_Support/show.asp?Model=775Dual-880Pro
On Monday I will check whether my computer is within spec.
It seems I like to buy computers that have many problems on Linux and fix them :)
> There was also one BIOS found that had this problem, but it was old and rare
> and got fixed with an upgrade.
>
> > Just to point out: this could be more a problem of chipsets than of CPUs
> > (AMD or Intel). AMD was just the first to use the x86_64 architecture :)
>
> No.
>
> -Andi
* Re: AMD X2 unsynced TSC fix?
2006-10-28 3:28 ` Lee Revell
@ 2006-10-28 5:28 ` Willy Tarreau
2006-10-28 18:08 ` Lee Revell
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 5:28 UTC (permalink / raw)
To: Lee Revell
Cc: Andi Kleen, thockin, Luca Tettamanti, linux-kernel, john stultz
On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
> On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> > I don't think it makes too much sense to hack on pure RDTSC when
> > gtod is fast enough -- RDTSC will be always icky and hard to use.
>
> I agree FWIW, our application would be happy to just use gtod if it
> wasn't so slow on these machines.
Agreed, I had to turn about 20 dual-core servers to single core because
the only way to get a monotonic gtod made it so slow that it was not
worth using a dual-core. I initially considered buying one dual-core
AMD for my own use, but after seeing this, I'm definitely sure I won't
ever buy one as long as this problem is not fixed, as it causes too
many problems.
Willy
* Re: AMD X2 unsynced TSC fix?
2006-10-28 3:59 ` Andi Kleen
@ 2006-10-28 6:32 ` thockin
2006-10-28 9:14 ` Vojtech Pavlik
2006-10-28 18:22 ` Lee Revell
2 siblings, 0 replies; 65+ messages in thread
From: thockin @ 2006-10-28 6:32 UTC (permalink / raw)
To: Andi Kleen
Cc: vojtech, Jiri Bohac, Luca Tettamanti, Lee Revell, linux-kernel,
john stultz
On Fri, Oct 27, 2006 at 08:59:13PM -0700, Andi Kleen wrote:
> > If you have a third-party clock you can get pretty darn close.
>
> Not when powernow is involved on a multi socket system.
When CPUs are in different P-States, any resync effort will become
unsynced immediately. I agree with that. This is a further complication
that I think our code does not handle perfectly, yet.
> > Fortunately, we usually have an HPET, these days. You can definitely
> > resync and get near-linear values of RDTSC.
>
> No we don't -- most BIOS still don't give us the HPET table
> even when it is there in hardware. In the future this will change sure
> but people will still run a lot of older motherboards.
If you know where the HPET base-address-register is, can't we program it
ourselves? Even without HPET, we have PM Timer. As long as you don't
need to resync the TSCs on most gtod(), you can still do better than not
trying.
> > There are a few problems at hand. I'm not familiar with the patch Andi's
> > talking about but it has to solve all these problems to be really useful:
>
> It's from Jiri and Vojtech. Basically it will allow RDTSC to be used
> in gettimeofday even with unsynchronized TSCs by keeping
> the necessary offsets CPU local.
Offset from what? With automatic clock ramping in C1, the rate is
cycling up and down a lot.
> > * TSC drift because of PM states, such as C1 (hlt) (semi-random, severe)
>
> TSC drift with powernow -- CPUs run at different frequencies
Yeah, C1 is workaround-able, because the clock returns to full frequency,
and we never execute code in the reduced clock state. Powernow makes it
more fun. Not only do you need some offset, but you need some scalar.
Assume you resync TSCs to a clock (PM, HPET, whatever) any time any CPU
changes p-state. Then you can calculate the approximate TSC for now by:
tsc_now = tsc_at_last_resync + ((rdtsc - tsc_at_last_resync) * pstate_scalar)
Something like that. Not pretty, but still possible to get close. And
close might be good enough. As long as you can guarantee monotonicity and
approximate linearity, you can make most apps happy ENOUGH.
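The formula above can be written out as a small C function. This is only a sketch of the arithmetic, under the assumption that pstate_scalar is the ratio of the full-speed TSC rate to the current rate; the names scaled_tsc, max_khz and cur_khz are made up for the example, and integer arithmetic is used instead of a floating-point scalar.

```c
#include <stdint.h>

/*
 * tsc_now = tsc_at_last_resync
 *         + (raw_tsc - tsc_at_last_resync) * pstate_scalar
 *
 * pstate_scalar is expressed as max_khz / cur_khz (an assumption of this
 * sketch). The product delta * max_khz can overflow 64 bits if the
 * interval since the last resync is long (over an hour at 2 GHz with
 * these units), so a resync is assumed to happen well before that.
 */
uint64_t scaled_tsc(uint64_t tsc_at_last_resync, uint64_t raw_tsc,
                    uint32_t max_khz, uint32_t cur_khz)
{
    uint64_t delta = raw_tsc - tsc_at_last_resync;

    return tsc_at_last_resync + delta * max_khz / cur_khz;
}
```

With a CPU running at half speed, 500 raw cycles since the resync count as 1000 full-speed cycles; at full speed the value passes through unchanged.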
Tim
* Re: AMD X2 unsynced TSC fix?
2006-10-28 4:06 ` Andi Kleen
2006-10-28 4:22 ` Sergio Monteiro Basto
@ 2006-10-28 6:35 ` thockin
2006-10-28 6:46 ` Andrew Morton
2006-10-28 9:48 ` Andi Kleen
1 sibling, 2 replies; 65+ messages in thread
From: thockin @ 2006-10-28 6:35 UTC (permalink / raw)
To: Andi Kleen
Cc: Sergio Monteiro Basto, Lee Revell, Chris Friesen, linux-kernel,
john stultz
On Fri, Oct 27, 2006 at 09:06:12PM -0700, Andi Kleen wrote:
>
> > As far as I can understand, it seems that my computer, which has a
> > Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> > the hrtimers patch applied
>
> Intel systems (except for some large highend systems) have synchronized TSCs.
Does Intel guarantee that, or is that just what we happen to see so far?
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:35 ` thockin
@ 2006-10-28 6:46 ` Andrew Morton
2006-10-28 6:49 ` thockin
2006-10-28 9:45 ` Andi Kleen
2006-10-28 9:48 ` Andi Kleen
1 sibling, 2 replies; 65+ messages in thread
From: Andrew Morton @ 2006-10-28 6:46 UTC (permalink / raw)
To: thockin
Cc: Andi Kleen, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
On Fri, 27 Oct 2006 23:35:24 -0700
thockin@hockin.org wrote:
> On Fri, Oct 27, 2006 at 09:06:12PM -0700, Andi Kleen wrote:
> >
> > > As far as I can understand, it seems that my computer, which has a
> > > Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> > > the hrtimers patch applied
> >
> > Intel systems (except for some large highend systems) have synchronized TSCs.
>
> > Does Intel guarantee that, or is that just what we happen to see so far?
Matthias has a Xeon machine on which the TSCs are unsynced, and which are
unsyncable - write_tsc() just doesn't do anything. See thread at
http://lkml.org/lkml/2006/7/22/104
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:46 ` Andrew Morton
@ 2006-10-28 6:49 ` thockin
2006-10-28 7:13 ` Andrew Morton
2006-10-28 9:46 ` Andi Kleen
2006-10-28 9:45 ` Andi Kleen
1 sibling, 2 replies; 65+ messages in thread
From: thockin @ 2006-10-28 6:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
On Fri, Oct 27, 2006 at 11:46:15PM -0700, Andrew Morton wrote:
> On Fri, 27 Oct 2006 23:35:24 -0700
> thockin@hockin.org wrote:
>
> > On Fri, Oct 27, 2006 at 09:06:12PM -0700, Andi Kleen wrote:
> > >
> > > > As far as I can understand, it seems that my computer, which has a
> > > > Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> > > > the hrtimers patch applied
> > >
> > > Intel systems (except for some large highend systems) have synchronized TSCs.
> >
> > > Does Intel guarantee that, or is that just what we happen to see so far?
>
> Matthias has a Xeon machine on which the TSCs are unsynced, and which are
> unsyncable - write_tsc() just doesn't do anything. See thread at
> http://lkml.org/lkml/2006/7/22/104
Nothing at all, or just the low few bits are writeable? I had heard,
but never seen, that some Intel CPUs allowed only 16 writable bits
in the TSC MSR. I also heard of, but never saw, CPUs that cleared the TSC
to 0 on a write!
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:49 ` thockin
@ 2006-10-28 7:13 ` Andrew Morton
2006-10-28 7:25 ` thockin
2006-10-28 9:46 ` Andi Kleen
1 sibling, 1 reply; 65+ messages in thread
From: Andrew Morton @ 2006-10-28 7:13 UTC (permalink / raw)
To: thockin
Cc: Andi Kleen, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
On Fri, 27 Oct 2006 23:49:24 -0700
thockin@hockin.org wrote:
> On Fri, Oct 27, 2006 at 11:46:15PM -0700, Andrew Morton wrote:
> > On Fri, 27 Oct 2006 23:35:24 -0700
> > thockin@hockin.org wrote:
> >
> > > On Fri, Oct 27, 2006 at 09:06:12PM -0700, Andi Kleen wrote:
> > > >
> > > > > As far as I can understand, it seems that my computer, which has a
> > > > > Pentium D (dual core) on a VIA chipset, also has unsynchronized TSCs. With
> > > > > the hrtimers patch applied
> > > >
> > > > Intel systems (except for some large highend systems) have synchronized TSCs.
> > >
> > > > Does Intel guarantee that, or is that just what we happen to see so far?
> >
> > Matthias has a Xeon machine on which the TSCs are unsynced, and which are
> > unsyncable - write_tsc() just doesn't do anything. See thread at
> > http://lkml.org/lkml/2006/7/22/104
>
> Nothing at all, or just the low few bits are writeable?
We don't know - the tsc sync code doesn't remeasure the errors after "correcting"
them.
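One way to answer the question above would be to read the TSC back right
after the write and compare. What follows is only a rough model in Python,
not the kernel's tsc sync code: the function name, the tolerance and the
outcome labels are made up for illustration, and the two "low 32 bits"
cases (upper bits cleared, upper bits kept) are assumptions about what a
partial write might leave behind.

```python
LOW32 = 0xFFFFFFFF

def classify_tsc_write(before, requested, readback, tolerance=1_000_000):
    """Guess what a TSC write did, from a read taken shortly afterwards.

    before    -- TSC value read just before the write
    requested -- value that was written
    readback  -- TSC value read just after the write
    tolerance -- cycles the counter may have advanced in between (assumed)
    """
    def ran_on_from(start):
        # The counter keeps counting, so readback should sit a little
        # above whatever value the write actually left in the register.
        return 0 <= readback - start <= tolerance

    if ran_on_from(requested):
        return "all bits written"
    if ran_on_from(requested & LOW32):
        return "low 32 bits written, upper bits cleared"
    if ran_on_from((before & ~LOW32) | (requested & LOW32)):
        return "low 32 bits written, upper bits kept"
    if ran_on_from(0):
        return "cleared to 0"
    if ran_on_from(before):
        return "write ignored"
    return "unknown"
```

The checks are ordered from the most to the least specific outcome, so a
requested value whose upper 32 bits are zero cannot distinguish a full
write from a partial one. A probe value with bits set in both halves
avoids that ambiguity.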
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 7:13 ` Andrew Morton
@ 2006-10-28 7:25 ` thockin
0 siblings, 0 replies; 65+ messages in thread
From: thockin @ 2006-10-28 7:25 UTC (permalink / raw)
To: Andrew Morton
Cc: Andi Kleen, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
On Sat, Oct 28, 2006 at 12:13:16AM -0700, Andrew Morton wrote:
> > > http://lkml.org/lkml/2006/7/22/104
> >
> > Nothing at all, or are just the low few bits writable?
>
> We don't know - the tsc sync code doesn't remeasure the errors after "correcting"
> them.
I read the thread. Just as a challenge, I'd love to poke at such a
system, but I doubt very much I'll get the chance :)
Tim
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 3:59 ` Andi Kleen
2006-10-28 6:32 ` thockin
@ 2006-10-28 9:14 ` Vojtech Pavlik
2006-10-28 18:22 ` Lee Revell
2 siblings, 0 replies; 65+ messages in thread
From: Vojtech Pavlik @ 2006-10-28 9:14 UTC (permalink / raw)
To: Andi Kleen
Cc: thockin, Jiri Bohac, Luca Tettamanti, Lee Revell, linux-kernel,
john stultz
On Fri, Oct 27, 2006 at 08:59:13PM -0700, Andi Kleen wrote:
> > There are few problems at hand. I'm not familiar with the patch Andi's
> > talking about but it has to solve all these problems to be really useful:
>
> It's from Jiri and Vojtech. Basically it will allow to use RDTSC
> in gettimeofday even with unsynchronized TSCs by keeping
> the necessary offsets CPU local.
>
> Drawback: for vsyscall you need RDTSCP, this means AMD F stepping
> at least. But even as a syscall it will be still faster than before.
>
> > * TSC skew across CPUs at bootup (Linux handles this already)
>
> Just not very good. There is still a significant error when it's done.
>
> > * TSC drift across CPUs at the "same" frequency (pretty constant, minimal)
>
> It just adds up over time.
>
> > * TSC drift because of PM states, such as C1 (hlt) (semi-random, severe)
>
> TSC drift with powernow -- CPUs run at different frequencies
And the patch does exactly that.
It doesn't assume much about TSCs, except that they're individually
monotonic and that without a warning (cpufreq notifier, c1 state
enter/leave) the frequency doesn't change quickly. Slow frequency drift
(spread spectrum modulation, thermal effects on Xtal) is compensated for.
We are still testing the patch and fixing the issues we find, currently
with our cpufreq handling, but I believe we're well on the way to having
it working well.
--
Vojtech Pavlik
Director SuSE Labs
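A rough model of the idea described above, written in Python for brevity.
This is not the patch itself: the class and field names are invented and
the calibration values in the usage note are made up. The point is only
that each CPU converts its own TSC using its own base and rate, so raw TSC
values are never compared across CPUs, and that a frequency-change
notification re-bases the conversion before the new rate takes effect.

```python
from dataclasses import dataclass

@dataclass
class CpuClock:
    base_tsc: int  # TSC value on this CPU at the last calibration
    base_ns: int   # reference time in ns at that same moment
    khz: int       # TSC rate on this CPU in kHz since that calibration

    def now_ns(self, tsc):
        # Only cycles elapsed on *this* CPU are used; the absolute TSC
        # value is meaningless outside this CPU.
        delta = tsc - self.base_tsc
        return self.base_ns + delta * 1_000_000 // self.khz

    def recalibrate(self, tsc, new_khz):
        # On a frequency change: fold the time elapsed at the old rate
        # into the base, then continue from there at the new rate.
        self.base_ns = self.now_ns(tsc)
        self.base_tsc = tsc
        self.khz = new_khz
```

For example, two CPUs with wildly different raw counters still agree on
elapsed time: CpuClock(1_000_000, 0, 2_000_000).now_ns(2_001_000_000) and
CpuClock(5_000_000_000_000, 0, 1_000_000).now_ns(5_001_000_000_000) both
give one second.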
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:46 ` Andrew Morton
2006-10-28 6:49 ` thockin
@ 2006-10-28 9:45 ` Andi Kleen
1 sibling, 0 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 9:45 UTC (permalink / raw)
To: Andrew Morton
Cc: thockin, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
On Friday 27 October 2006 23:46, Andrew Morton wrote:
> On Fri, 27 Oct 2006 23:35:24 -0700
>
> thockin@hockin.org wrote:
> > On Fri, Oct 27, 2006 at 09:06:12PM -0700, Andi Kleen wrote:
> > > > So far, has I can understand. Seems to me that my computer which have
> > > > a Pentium D (Dual Core) on VIA chipset, also have unsynchronized TSC
> > > > and with the patch of hrtimers on
> > >
> > > Intel systems (except for some large highend systems) have synchronized
> > > TSCs.
> >
> > Does Intel guarantee that, or is that just what we happen to see, so far.
>
> Matthias has a Xeon machine on which the TSCs are unsynced, and which are
> unsyncable - write_tsc() just doesn't do anything. See thread at
> http://lkml.org/lkml/2006/7/22/104
That is a clear BIOS bug (FSBs are programmed incorrectly) and doesn't seem to
be common. In fact the BIOS bug is so bad that it's surprising the system
works at all.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:49 ` thockin
2006-10-28 7:13 ` Andrew Morton
@ 2006-10-28 9:46 ` Andi Kleen
1 sibling, 0 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 9:46 UTC (permalink / raw)
To: thockin
Cc: Andrew Morton, Sergio Monteiro Basto, Lee Revell, Chris Friesen,
linux-kernel, john stultz
> Nothing at all, or are just the low few bits writable? I had heard,
> but never seen, that some Intel CPUs only allowed 16 writable bits
> in the TSC MSR. I also heard of, but never saw, CPUs that cleared the TSC
> to 0 on a write!
Normally on Intel you can only write the low 32 bits.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 6:35 ` thockin
2006-10-28 6:46 ` Andrew Morton
@ 2006-10-28 9:48 ` Andi Kleen
1 sibling, 0 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 9:48 UTC (permalink / raw)
To: thockin
Cc: Sergio Monteiro Basto, Lee Revell, Chris Friesen, linux-kernel,
john stultz
> Does Intel guarantee that, or is that just what we happen to see, so far.
I don't think it's architecturally guaranteed, no.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 5:28 ` Willy Tarreau
@ 2006-10-28 18:08 ` Lee Revell
2006-10-28 19:14 ` thockin
2006-10-30 17:22 ` Langsdorf, Mark
2006-10-28 18:37 ` Andi Kleen
2006-10-31 11:12 ` Pádraig Brady
2 siblings, 2 replies; 65+ messages in thread
From: Lee Revell @ 2006-10-28 18:08 UTC (permalink / raw)
To: Willy Tarreau
Cc: Andi Kleen, thockin, Luca Tettamanti, linux-kernel, john stultz
On Sat, 2006-10-28 at 07:28 +0200, Willy Tarreau wrote:
> On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
> > On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> > > I don't think it makes too much sense to hack on pure RDTSC when
> > > gtod is fast enough -- RDTSC will be always icky and hard to use.
> >
> > I agree FWIW, our application would be happy to just use gtod if it
> > wasn't so slow on these machines.
>
> Agreed, I had to turn about 20 dual-core servers to single core because
> the only way to get a monotonic gtod made it so slow that it was not
> worth using a dual-core. I initially considered buying one dual-core
> AMD for my own use, but after seeing this, I'm definitely sure I won't
> ever buy one as long as this problem is not fixed, as it causes too
> many problems.
Does anyone know if the problem will really be fixed in new CPUs, as AMD
promised a year or so ago?
http://lkml.org/lkml/2005/11/4/173
Since that post, there has been Socket F and AM2 which apparently have
the same issue.
Were the AMD guys just blowing smoke?
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 3:59 ` Andi Kleen
2006-10-28 6:32 ` thockin
2006-10-28 9:14 ` Vojtech Pavlik
@ 2006-10-28 18:22 ` Lee Revell
2006-10-28 19:57 ` Vojtech Pavlik
2 siblings, 1 reply; 65+ messages in thread
From: Lee Revell @ 2006-10-28 18:22 UTC (permalink / raw)
To: Andi Kleen
Cc: thockin, vojtech, Jiri Bohac, Luca Tettamanti, linux-kernel,
john stultz
On Fri, 2006-10-27 at 20:59 -0700, Andi Kleen wrote:
> > Fortunately, we usually have an HPET, these days. You can
> definitely
> > resync and get near-linear values of RDTSC.
>
> No we don't -- most BIOS still don't give us the HPET table
> even when it is there in hardware. In the future this will change sure
> but people will still run a lot of older motherboards.
I have exactly such a system (see thread "x86-64 with nvidia MCP51
chipset: kernel does not find HPET"). Is there anything at all I can do
to make the kernel see the HPET? Can I try to guess the address? BIOS
upgrade?
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 5:28 ` Willy Tarreau
2006-10-28 18:08 ` Lee Revell
@ 2006-10-28 18:37 ` Andi Kleen
2006-10-28 19:15 ` Willy Tarreau
2006-10-31 11:12 ` Pádraig Brady
2 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 18:37 UTC (permalink / raw)
To: Willy Tarreau
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Friday 27 October 2006 22:28, Willy Tarreau wrote:
> On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
> > On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> > > I don't think it makes too much sense to hack on pure RDTSC when
> > > gtod is fast enough -- RDTSC will be always icky and hard to use.
> >
> > I agree FWIW, our application would be happy to just use gtod if it
> > wasn't so slow on these machines.
>
> Agreed, I had to turn about 20 dual-core servers to single core because
> the only way to get a monotonic gtod made it so slow that it was not
> worth using a dual-core.
Curious - what workload was that?
While gtod is time critical and often appears high on profile lists it is
normally not as time critical as you're claiming it is; especially not
time critical enough to warrant such radical action.
> I initially considered buying one dual-core
> AMD for my own use, but after seeing this, I'm definitely sure I won't
> ever buy one as long as this problem is not fixed, as it causes too
> many problems.
It's somewhat slower, but I'm not sure what "too many problems" you're
referring to.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 18:08 ` Lee Revell
@ 2006-10-28 19:14 ` thockin
2006-10-30 17:22 ` Langsdorf, Mark
1 sibling, 0 replies; 65+ messages in thread
From: thockin @ 2006-10-28 19:14 UTC (permalink / raw)
To: Lee Revell
Cc: Willy Tarreau, Andi Kleen, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 02:08:34PM -0400, Lee Revell wrote:
> Does anyone know if the problem will really be fixed in new CPUs, as AMD
> promised a year or so ago?
>
> http://lkml.org/lkml/2005/11/4/173
>
> Since that post, there has been Socket F and AM2 which apparently have
> the same issue.
>
> Were the AMD guys just blowing smoke?
I think it is coming, but still not here yet.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 18:37 ` Andi Kleen
@ 2006-10-28 19:15 ` Willy Tarreau
2006-10-28 19:18 ` thockin
` (2 more replies)
0 siblings, 3 replies; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 19:15 UTC (permalink / raw)
To: Andi Kleen
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Sat, Oct 28, 2006 at 11:37:22AM -0700, Andi Kleen wrote:
> On Friday 27 October 2006 22:28, Willy Tarreau wrote:
> > On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
> > > On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> > > > I don't think it makes too much sense to hack on pure RDTSC when
> > > > gtod is fast enough -- RDTSC will be always icky and hard to use.
> > >
> > > I agree FWIW, our application would be happy to just use gtod if it
> > > wasn't so slow on these machines.
> >
> > Agreed, I had to turn about 20 dual-core servers to single core because
> > the only way to get a monotonic gtod made it so slow that it was not
> > worth using a dual-core.
>
> Curious - what workload was that?
Two different but related workloads:
- load balancer doing between 10 and 100k gtod calls per second on a Sun
x2100 under RHEL 3. HPET was not available and the only way I found
to get a monotonic clock was to use the APIC timer IIRC (it was more
than 6 months ago, so sorry if I don't remember all the details).
- network sniffer that I tried to tune to get the highest possible packet
rates on gigabit ethernet.
> While gtod is time critical and often appears high on profile lists it is
> normally not as time critical as you're claiming it is; especially not
> time critical enough to warrant such radical action.
Yes it was, because the small gain of using a dual core with such
a workload was clearly lost by that change. IIRC, I reached 25000
sessions/s on dual core with TSC if I didn't care about the clock,
20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
it was even worse: I had 500 kpps in dual-core+TSC, 70 kpps without
TSC and 300 kpps with single-core+TSC. Since I had to buy the same
machines for both uses, this last argument was enough for me to stick
to a single core.
> > I initially considered buying one dual-core
> > AMD for my own use, but after seeing this, I'm definitely sure I won't
> > ever buy one as long as this problem is not fixed, as it causes too
> > many problems.
>
> It's somewhat slower, but I'm not sure what "too many problems" you're
> referring to.
Premature or delayed timeouts on the proxy, time measurement errors
(when the logs show that a session finishes before it begins, there's
a real problem, particularly because we use those logs for troubleshooting).
And for the sniffer, getting times wrong by about 2s was a real problem too.
I would rather have had something monotonic with little accuracy than
out-of-order packets!
This is definitely a design problem on those chips, probably because
marketing targets gamers only. And that's very sad, because they are
excellent processors!
> -Andi
Regards,
Willy
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:15 ` Willy Tarreau
@ 2006-10-28 19:18 ` thockin
2006-10-28 19:32 ` Willy Tarreau
2006-10-28 19:33 ` Andi Kleen
2006-10-28 21:00 ` Lee Revell
2 siblings, 1 reply; 65+ messages in thread
From: thockin @ 2006-10-28 19:18 UTC (permalink / raw)
To: Willy Tarreau
Cc: Andi Kleen, Lee Revell, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 09:15:15PM +0200, Willy Tarreau wrote:
> > While gtod is time critical and often appears high on profile lists it is
> > normally not as time critical as you're claiming it is; especially not
> > time critical enough to warrant such radical action.
>
> Yes it was, because the small gain of using a dual core with such
> a workload was clearly lost by that change. IIRC, I reached 25000
> sessions/s on dual core with TSC if I didn't care about the clock,
> 20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
> it was even worse : I had 500 kpps in dual-core+TSC, 70kpps without
> TSC and 300 kpps with single-core+TSC. Since I had to buy the same
> machines for both uses, this last argument was enough for me to stick
> to a single core.
Was the problem that they were not synced at poweron or that they would
drift due to power-states?
Did you try running with idle=poll, to avoid ever entering C1 state (hlt)?
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:18 ` thockin
@ 2006-10-28 19:32 ` Willy Tarreau
2006-10-28 19:42 ` thockin
0 siblings, 1 reply; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 19:32 UTC (permalink / raw)
To: thockin; +Cc: Andi Kleen, Lee Revell, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 12:18:00PM -0700, thockin@hockin.org wrote:
> On Sat, Oct 28, 2006 at 09:15:15PM +0200, Willy Tarreau wrote:
> > > While gtod is time critical and often appears high on profile lists it is
> > > normally not as time critical as you're claiming it is; especially not
> > > time critical enough to warrant such radical action.
> >
> > Yes it was, because the small gain of using a dual core with such
> > a workload was clearly lost by that change. IIRC, I reached 25000
> > sessions/s on dual core with TSC if I didn't care about the clock,
> > 20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
> > it was even worse : I had 500 kpps in dual-core+TSC, 70kpps without
> > TSC and 300 kpps with single-core+TSC. Since I had to buy the same
> > machines for both uses, this last argument was enough for me to stick
> > to a single core.
>
> Was the problem that they were not synced at poweron or that they would
> drift due to power-states?
They resynced at power up, but would constantly drift. I don't even know
if it was caused by power states. When the machine was loaded, a single
task moving across the cores could see its time jump back and forth
several times a second by an offset sometimes close to +2/-2s.
> Did you try running with idle=poll, to avoid ever entering C1 state (hlt)?
Yes, I remember trying such things. I also tried 'nohlt', completely
disabling power management, including ACPI, etc... I also tried vanilla
kernels as well as severely patched ones, but the problem remained the
same in all circumstances, that only 'notsc' could solve.
BTW, I've just found the remains of a dmesg capture taken after boot, in
case you'd like to look for anything in it.
Regards,
Willy
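The symptom described above (time jumping back and forth for a task that
moves between cores) can be looked for from user space by sampling the
clock in a tight loop and counting backward steps. A minimal sketch in
Python; the function names are hypothetical, and time.time() is used on
the assumption that it reflects the same wall clock that gettimeofday()
reports.

```python
import time

def backward_steps(samples):
    """Return (count, worst) for backward steps in a list of timestamps.

    count -- how many times a sample was earlier than the previous one
    worst -- the largest such backward jump, in the samples' own unit
    """
    count = 0
    worst = 0.0
    for prev, cur in zip(samples, samples[1:]):
        if cur < prev:
            count += 1
            worst = max(worst, prev - cur)
    return count, worst

def sample_wall_clock(n=100_000):
    # Back-to-back reads; the task is deliberately not pinned, so the
    # scheduler is free to move it between cores while sampling.
    return [time.time() for _ in range(n)]
```

Running backward_steps(sample_wall_clock()) on a loaded machine gives a
quick indication. A non-zero count is not proof of unsynced TSCs by
itself, since the wall clock can also be stepped back by an administrator
or a time daemon.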
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:15 ` Willy Tarreau
2006-10-28 19:18 ` thockin
@ 2006-10-28 19:33 ` Andi Kleen
2006-10-28 20:04 ` Willy Tarreau
2006-10-29 1:28 ` Lee Revell
2006-10-28 21:00 ` Lee Revell
2 siblings, 2 replies; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 19:33 UTC (permalink / raw)
To: Willy Tarreau
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Saturday 28 October 2006 12:15, Willy Tarreau wrote:
> Yes it was, because the small gain of using a dual core with such
> a workload was clearly lost by that change. IIRC, I reached 25000
> sessions/s on dual core with TSC if I didn't care about the clock,
> 20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
> it was even worse : I had 500 kpps in dual-core+TSC, 70kpps without
> TSC and 300 kpps with single-core+TSC. Since I had to buy the same
> machines for both uses, this last argument was enough for me to stick
> to a single core.
Ok, but it is a very specialized situation not applicable to most
others. I just say this for all the other people following the thread.
Again most workloads are not that gtod intensive.
BTW if you don't use powernow and don't use blades with thermal clock ramping
and use idle=poll then the TSCs should be synchronized on AMD dual core
and TSC gtod can be used. But it will burn a lot of power and make the system
run very hot.
>
> > > I initially considered buying one dual-core
> > > AMD for my own use, but after seeing this, I'm definitely sure I won't
> > > ever buy one as long as this problem is not fixed, as it causes too
> > > many problems.
> >
> > It's somewhat slower, but I'm not sure what "too many problems" you're
> > referring to.
>
> Premature or delayed timeouts on the proxy, time measurement errors
> (when the logs show that a session finishes before it begins, there's
> a real problem, particularly because we use those logs for
> troubleshooting). And for the sniffer, getting wrong times by about 2s was
> a real problem too. I would have preferred to get something monotonic with
> little accuracy than out of order packets !
Ah you mean you forced the kernel to use an unsynchronized TSC
for gtod during your tuning attempts and then discovered that it didn't work?
Call me surprised.
In the default configuration there shouldn't be any problems
like this, it will just run slower because the kernel falls back to a slower
time source.
> This is definitely a design problem on those chips, probably because
> marketing targets gamers only.
Last time I checked Dual core Opterons weren't marketed to gamers.
> And that's very sad, because they are
> excellent processors !
Lots of various parties are to blame here, not just AMD.
The BIOS vendors for not exposing HPET even when it is available in the
hardware. While HPET is slower than TSC too it definitely isn't nearly as
slow as pmtimer.
Possibly the Linux people for not getting per CPU TSC going quicker.
The writers of software who use gtod too often or force the kernel
to call it for each packet by carelessly using the timestamp ioctl.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:32 ` Willy Tarreau
@ 2006-10-28 19:42 ` thockin
2006-10-28 20:16 ` Willy Tarreau
0 siblings, 1 reply; 65+ messages in thread
From: thockin @ 2006-10-28 19:42 UTC (permalink / raw)
To: Willy Tarreau
Cc: Andi Kleen, Lee Revell, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 09:32:18PM +0200, Willy Tarreau wrote:
> > Was the problem that they were not synced at poweron or that they would
> > drift due to power-states?
>
> They resynced at power up, but would constantly drift. I don't even know
> if it was caused by power states. When the machine was loaded, a single
> task moving across the cores could see its time jump back and forth
> several times a second by an offset sometimes close to +2/-2s.
That sounds like C1, to me.
> > Did you try running with idle=poll, to avoid ever entering C1 state (hlt)?
>
> Yes, I remember trying such things. I also tried 'nohlt', completely
> disabling power management, including ACPI, etc... I also tried vanilla
> kernels as well as severely patched ones, but the problem remained the
> same in all circumstances, that only 'notsc' could solve.
That's exceedingly strange. On my dual-socket dual-core, I can get
roughly synced TSCs (no appreciable drift) by just using idle=poll. If
that did not work for you, I'd really want to poke at the system more.
> BTW, I've just found the remains of a dmesg capture taken after boot, in case you'd
> like to look for anything in it.
A dmesg won't be that useful, I'd actually have to poke at the system.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 18:22 ` Lee Revell
@ 2006-10-28 19:57 ` Vojtech Pavlik
2006-10-28 22:54 ` thockin
0 siblings, 1 reply; 65+ messages in thread
From: Vojtech Pavlik @ 2006-10-28 19:57 UTC (permalink / raw)
To: Lee Revell
Cc: Andi Kleen, thockin, Jiri Bohac, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 02:22:11PM -0400, Lee Revell wrote:
> On Fri, 2006-10-27 at 20:59 -0700, Andi Kleen wrote:
> > > Fortunately, we usually have an HPET, these days. You can
> > definitely
> > > resync and get near-linear values of RDTSC.
> >
> > No we don't -- most BIOS still don't give us the HPET table
> > even when it is there in hardware. In the future this will change sure
> > but people will still run a lot of older motherboards.
>
> I have exactly such a system (see thread "x86-64 with nvidia MCP51
> chipset: kernel does not find HPET"). Is there anything at all I can do
> to make the kernel see the HPET? Can I try to guess the address? BIOS
> upgrade?
In most cases where the HPET is present but not reported, it's not
configured. Usually, you need to write a chipset-specific register to
configure the address.
Finding the register, finding some free MMIO space, writing the address
to the register and telling the address to the kernel is enough.
--
Vojtech Pavlik
Director SuSE Labs
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:33 ` Andi Kleen
@ 2006-10-28 20:04 ` Willy Tarreau
2006-10-28 20:11 ` Andi Kleen
2006-10-29 1:28 ` Lee Revell
1 sibling, 1 reply; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 20:04 UTC (permalink / raw)
To: Andi Kleen
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Sat, Oct 28, 2006 at 12:33:27PM -0700, Andi Kleen wrote:
> On Saturday 28 October 2006 12:15, Willy Tarreau wrote:
>
> > Yes it was, because the small gain of using a dual core with such
> > a workload was clearly lost by that change. IIRC, I reached 25000
> > sessions/s on dual core with TSC if I didn't care about the clock,
> > 20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
> > it was even worse : I had 500 kpps in dual-core+TSC, 70kpps without
> > TSC and 300 kpps with single-core+TSC. Since I had to buy the same
> > machines for both uses, this last argument was enough for me to stick
> > to a single core.
>
> Ok, but it is a very specialized situation not applicable to most
> others. I just say this for all the other people following the thread.
> Again most workloads are not that gtod intensive.
100% agreed (fortunately !).
> BTW if you don't use powernow and don't use blades with thermal clock ramping
> and use idle=poll then the TSCs should be synchronized on AMD dual core
> and TSC gtod can be used. But it will burn a lot of power and make the system
> run very hot.
I tried to make it run like this. I was once told that by racking pizza boxes,
you get a pizza oven. I was prepared to accept it :-)
But I could not manage to keep them in sync. I even remember running
background loops to ensure that there was no idle at all, and the clocks
still managed to get out of sync! I tried to disable a lot of devices,
starting with everything likely to send interrupts with long processing
time (e.g. USB, SATA, ...), but with no success. I once thought that I
had succeeded by sticking all interrupts to one core and the tasks to the
other one, but was proved wrong after several minutes.
I really think that the hardware was doing tricks far beyond my knowledge,
because on another Sun (a V40Z), there were 4 dual cores which I never saw
out of sync even after hours of testing. But the HPET was available in it,
I don't remember if it's used by default when detected.
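For reference, sticking a task to one core as described above can be done
from user space with the scheduler affinity calls. A small sketch using
Python's os.sched_getaffinity() and os.sched_setaffinity(), which are
available on Linux only; the helper name is made up. Interrupt affinity is
a separate setting and is not covered here, and as reported above, pinning
alone did not keep the clocks in sync on the x2100.

```python
import os

def pin_to_one_cpu(pid=0):
    """Restrict a process (0 = the caller) to the lowest CPU it may use.

    Returns (target, previous) so the caller can restore the old mask
    with os.sched_setaffinity(pid, previous).
    """
    previous = set(os.sched_getaffinity(pid))
    target = min(previous)
    os.sched_setaffinity(pid, {target})
    return target, previous
```

Once pinned, every timestamp the process takes comes from the same core,
so an application that reads the counter directly no longer sees the
offset between cores. It does nothing about the rate changes caused by
power management.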
> > > > I initially considered buying one dual-core
> > > > AMD for my own use, but after seeing this, I'm definitely sure I won't
> > > > ever buy one as long as this problem is not fixed, as it causes too
> > > > many problems.
> > >
> > > It's somewhat slower, but I'm not sure what "too many problems" you're
> > > referring to.
> >
> > Premature or delayed timeouts on the proxy, time measurement errors
> > (when the logs show that a session finishes before it begins, there's
> > a real problem, particularly because we use those logs for
> > troubleshooting). And for the sniffer, getting wrong times by about 2s was
> > a real problem too. I would have preferred to get something monotonic with
> > little accuracy than out of order packets !
>
> Ah you mean you forced the kernel to use an unsynchronized TSC
> for gtod during your tuning attempts and then discovered that it didn't work?
> Call me surprised.
No I did not "force" anything at first. You take the RHEL3 CD, you install
it, reboot and watch your logs report negative times, then scratch your
head, first call Red Hat dumb ass, and after a few tests, apologize to the
poor innocent Red Hat and call the box total crap. To put it shortly
(might be useful for people who Google for it): Dual-core Sun x2100 is
unreliable out of the box under Linux.
> In the default configuration there shouldn't be any problems
> like this, it will just run slower because the kernel falls back to a slower
> time source.
You have to specify "notsc" for this. As an alternative, a NUMA kernel
worked fine too (because TSC is disabled), but it's not obvious for
anyone why a dual-core, single proc system should be considered "NUMA" !
> > This is definitely a design problem on those chips, probably because
> > marketing targets gamers only.
>
> Last time I checked Dual core Opterons weren't marketed to gamers.
Not "opterons" under this name, but AMD X2 yes. Ask google for "AMD X2"
and click on the first non-AMD site (3rd link), then check how it's
benchmarked... On the other hand, if you look for "opteron", you
immediately find more serious usages.
> > And that's very sad, because they are
> > excellent processors !
>
> Lots of various parties are to blame here, not just AMD.
>
> The BIOS vendors for not exposing HPET even when it is available in the
> hardware. While HPET is slower than TSC too it definitely isn't nearly as
> slow as pmtimer.
I'm sure that the BIOS is buggy there, because I too found it strange
that there was no HPET reported in such a system. But I found no way
to enable it by force either, as I did not know where to start
looking.
> Possibly the Linux people for not getting per CPU TSC going quicker.
>
> The writers of software who use gtod too often or force the kernel
> to call it for each packet by carelessly using the timestamp ioctl.
You can't call gtod less than once per poll() loop, unfortunately. And
believe me, I do count my syscalls because each one hits performance
by a few percent. When it comes to getting time on each packet, the
problem is the same: you're dependent on the frequency of external
events. You need to get the time once for each event. But I agree
that a per-CPU TSC could help a lot at getting monotonic clocks. I
think that using the local TSC to measure non-accurate time and
decide when to call an external source would be a great improvement.
Regards,
Willy
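The last idea above, using the local TSC only as a rough hint for when to
consult a trusted clock, could look roughly like this. It is a Python
model with invented names: slow_clock and local_counter stand in for an
expensive but monotonic time source and a cheap per-CPU counter, and
max_ticks is an arbitrary staleness bound, none of which come from an
existing implementation.

```python
class CoarseClock:
    """Serve a cached time; refresh it from the slow clock only when the
    cheap local counter says enough time passed or looks inconsistent."""

    def __init__(self, slow_clock, local_counter, max_ticks):
        self.slow_clock = slow_clock        # accurate, expensive
        self.local_counter = local_counter  # cheap, not trusted across CPUs
        self.max_ticks = max_ticks          # refresh after this many ticks
        self.last_time = slow_clock()
        self.last_ticks = local_counter()
        self.slow_calls = 1

    def now(self):
        ticks = self.local_counter()
        elapsed = ticks - self.last_ticks
        # A negative delta means the counter cannot be compared with the
        # previous read (for instance after moving to another core), so
        # treat it like an expired cache instead of trusting it.
        if elapsed < 0 or elapsed >= self.max_ticks:
            self.last_time = self.slow_clock()
            self.last_ticks = ticks
            self.slow_calls += 1
        return self.last_time
```

The returned time is only as fine-grained as max_ticks allows, but it
never goes backwards as long as the slow clock does not, which matches
the preference for monotonic over accurate expressed earlier in the
thread.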
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 20:04 ` Willy Tarreau
@ 2006-10-28 20:11 ` Andi Kleen
2006-10-28 20:36 ` Willy Tarreau
0 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2006-10-28 20:11 UTC (permalink / raw)
To: Willy Tarreau
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Saturday 28 October 2006 13:04, Willy Tarreau wrote:
> I really think that the hardware was doing tricks far beyond my knowledge,
> because on another Sun (a V40Z), there were 4 dual cores which I never saw
> out of sync even after hours of testing. But the HPET was available in it,
> I don't remember if it's used by default when detected.
I think some systems occasionally ramp the clock for thermal management,
but that should be rare.
> No I did not "force" anything at first. You take the RHEL3 CD, you install
> it, reboot and watch your logs report negative times, then scratch your
> head, first call red hat dumb ass, and after a few tests, apologize to the
> poor innocent red hat
Well they should have fixed the kernel to fall back to another clock
by backporting the appropriate fixes from mainline. I assume they
did actually.
> and call the box a total crap. To put it shortly
> (might be useful for people who Google for it) : Dual-core Sun x2100 is
> unreliable out of the box under Linux.
No that shouldn't be true with any modern kernel. It will just fall back
to HPET or more likely PMtimer.
>
> > In the default configuration there shouldn't be any problems
> > like this, it will just run slower because the kernel falls back to a
> > slower time source.
>
> You have to specify "notsc" for this.
No, the kernel should work out of the box. Some older kernels didn't
at various points of time though.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:42 ` thockin
@ 2006-10-28 20:16 ` Willy Tarreau
0 siblings, 0 replies; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 20:16 UTC (permalink / raw)
To: thockin; +Cc: Andi Kleen, Lee Revell, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 12:42:45PM -0700, thockin@hockin.org wrote:
> On Sat, Oct 28, 2006 at 09:32:18PM +0200, Willy Tarreau wrote:
> > > Was the problem that they were not synced at poweron or that they would
> > > drift due to power-states?
> >
> > They resynced at power up, but would constantly drift. I don't even know
> > if it was caused by power states. When the machine was loaded, a single
> > task moving across the cores could see its time jump back and forth
> > several times a second by an offset sometimes close to +2/-2s.
>
> That sounds like C1, to me.
OK.
> > > Did you try running with idle=poll, to avoid ever entering C1 state (hlt)?
> >
> > Yes, I remember trying such things. I also tried 'nohlt', completely
> > disabling power management, including ACPI, etc... I also tried vanilla
> > kernels as well as severely patched ones, but the problem remained the
> > same in all circumstances, that only 'notsc' could solve.
>
> That's exceedingly strange. On my dual-socket dual-core, I can get
> roughly synced TSCs (no appreciable drift) by just using idle=poll.
As I said in another mail, I thought I had won by running several busy loops
in parallel to the load, which prevented the system from either halting
or slowing down. But it was OK for a few minutes only and started going
mad again.
> If that did not work for you, I'd really want to poke at the system more.
The machine was returned to the supplier and, for other reasons, we switched
to a different maker for all of the roughly 20 machines (all single-core). I've
read somewhere that there's already a second version of the Sun x2100; I
don't know if it still exhibits the problem. Maybe they've at least fixed
the BIOS to report the HPET.
> > BTW, I've just found a remain of dmesg capture after boot in case you'd
> > like to look for anything in it.
>
> A dmesg won't be that useful, I'd actually have to poke at the system.
OK. I don't know if anyone there has one at hand, as I don't have it
anymore.
Regards,
Willy
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 20:11 ` Andi Kleen
@ 2006-10-28 20:36 ` Willy Tarreau
0 siblings, 0 replies; 65+ messages in thread
From: Willy Tarreau @ 2006-10-28 20:36 UTC (permalink / raw)
To: Andi Kleen
Cc: Lee Revell, thockin, Luca Tettamanti, linux-kernel, john stultz
On Sat, Oct 28, 2006 at 01:11:14PM -0700, Andi Kleen wrote:
> On Saturday 28 October 2006 13:04, Willy Tarreau wrote:
>
> > I really think that the hardware was doing tricks far beyond my knowledge,
> > because on another Sun (a V40Z), there were 4 dual cores which I never saw
> > out of sync even after hours of testing. But the HPET was available in it,
> > I don't remember if it's used by default when detected.
>
> I think some system occasionally ramp the clock for thermal management,
> but that should be rare.
I should say that at one point I wondered whether they were performing
some sort of automatic overclocking under load, because those machines
were noticeably faster, even in single-core mode, than other Opterons
I had tested. Since such boxes are often compared on workloads such as
SSL, doing so might have favored them in comparative benchmarks.
> > No I did not "force" anything at first. You take the RHEL3 CD, you install
> > it, reboot and watch your logs report negative times, then scratch your
> > head, first call red hat dumb ass, and after a few tests, apologize to the
> > poor innocent red hat
>
> Well they should have fixed the kernel to fall back to another clock
> by backporting the appropiate fixes from mainline. I assume they
> did actually.
But upon what trigger should they apply the fallback? I don't see
what can be detected. I see no such thing in 2.4 mainline (except
TSC resync at boot), and I can't seem to find any such fallback in 2.6
either (though I might not have looked deeply enough, as the code is more
complex there).
> > and call the box a total crap. To put it shortly
> > (might be useful for people who Google for it) : Dual-core Sun x2100 is
> > unreliable out of the box under Linux.
>
> No that shouldn't be true with any modern kernel. It will just fallback
> to HPET or more likely PMtimer.
same comment as above :-)
> >
> > > In the default configuration there shouldn't be any problems
> > > like this, it will just run slower because the kernel falls back to a
> > > slower time source.
> >
> > You have to specify "notsc" for this.
>
> No, the kernel should work out of the box. Some older kernels didn't
> at various points of time though.
Anyway, if they started providing kernels which used TSC by default,
I don't think they will change this afterwards, in order to avoid
causing regressions.
Could you please check whether the fallbacks you're talking about are
hard to backport to 2.4? Depending on their complexity and risk,
I would not be against a small backport. I think, for instance, that
automatically disabling TSC on SMP when HPET is present would not
be a terrible regression and might help on a number of occasions.
The user would then have to force the use of TSC if needed.
Regards,
Willy
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:15 ` Willy Tarreau
2006-10-28 19:18 ` thockin
2006-10-28 19:33 ` Andi Kleen
@ 2006-10-28 21:00 ` Lee Revell
2 siblings, 0 replies; 65+ messages in thread
From: Lee Revell @ 2006-10-28 21:00 UTC (permalink / raw)
To: Willy Tarreau
Cc: Andi Kleen, thockin, Luca Tettamanti, linux-kernel, john stultz
On Sat, 2006-10-28 at 21:15 +0200, Willy Tarreau wrote:
> This is definitely a design problem on those chips, probably because
> marketting targets gamers only. And that's very sad, because they are
> excellent processors !
Hmm, gamers seem to be the worst affected by this problem on other OS...
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:57 ` Vojtech Pavlik
@ 2006-10-28 22:54 ` thockin
0 siblings, 0 replies; 65+ messages in thread
From: thockin @ 2006-10-28 22:54 UTC (permalink / raw)
To: Vojtech Pavlik
Cc: Lee Revell, Andi Kleen, Jiri Bohac, Luca Tettamanti, linux-kernel,
john stultz
On Sat, Oct 28, 2006 at 09:57:39PM +0200, Vojtech Pavlik wrote:
> > > No we don't -- most BIOS still don't give us the HPET table
> > > even when it is there in hardware. In the future this will change sure
> > > but people will still run a lot of older motherboards.
> >
> > I have exactly such a system (see thread "x86-64 with nvidia MCP51
> > chipset: kernel does not find HPET"). Is there anything at all I can do
> > to make the kernel see the HPET? Can I try to guess the address? BIOS
> > upgrade?
>
> In most cases where the HPET is present but not reported, it's not
> configured. Usually, you need to write a chipset-specific register to
> configure the address.
>
> Finding the register, finding some free MMIO space, writing the address
> to the register and telling the address to the kernel is enough.
Do we want to establish a precedent for chipsets where we can find and
configure the HPET ourselves? Register them all as PCI quirks...
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 19:33 ` Andi Kleen
2006-10-28 20:04 ` Willy Tarreau
@ 2006-10-29 1:28 ` Lee Revell
1 sibling, 0 replies; 65+ messages in thread
From: Lee Revell @ 2006-10-29 1:28 UTC (permalink / raw)
To: Andi Kleen
Cc: Willy Tarreau, thockin, Luca Tettamanti, linux-kernel,
john stultz
On Sat, 2006-10-28 at 12:33 -0700, Andi Kleen wrote:
> On Saturday 28 October 2006 12:15, Willy Tarreau wrote:
>
> > Yes it was, because the small gain of using a dual core with such
> > a workload was clearly lost by that change. IIRC, I reached 25000
> > sessions/s on dual core with TSC if I didn't care about the clock,
> > 20000 without TSC, and 18000 on single core+TSC. But with the sniffer,
> > it was even worse : I had 500 kpps in dual-core+TSC, 70kpps without
> > TSC and 300 kpps with single-core+TSC. Since I had to buy the same
> > machines for both uses, this last argument was enough for me to stick
> > to a single core.
>
> Ok, but it is a very specialized situation not applicable to most
> others. I just say this for all the other people following the thread.
> Again most workloads are not that gtod intensive.
I haven't benchmarked or anything, but isn't X11 also a very
gtod-intensive workload?
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 4:22 ` Sergio Monteiro Basto
@ 2006-10-30 3:10 ` Sergio Monteiro Basto
2006-10-30 15:23 ` Andi Kleen
0 siblings, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-10-30 3:10 UTC (permalink / raw)
To: Andi Kleen; +Cc: Lee Revell, Chris Friesen, linux-kernel, john stultz
[-- Attachment #1.1: Type: text/plain, Size: 2220 bytes --]
On Sat, 2006-10-28 at 05:22 +0100, Sergio Monteiro Basto wrote:
> On Fri, 2006-10-27 at 21:06 -0700, Andi Kleen wrote:
> > > So far, has I can understand. Seems to me that my computer which have a
> > > Pentium D (Dual Core) on VIA chipset, also have unsynchronized TSC and
> > > with the patch of hrtimers on
> >
> > Intel systems (except for some large highend systems) have synchronized TSCs.
> > Only exception so far seems to be a few systems that are
> > overclocked/overvolted and running outside their specification.
> > When you do that you'e on your own and we're not interested in a bug
> > report.
>
> and my computer :)
> http://www.asrock.com/product/775Dual-880Pro.htm
> http://www.asrock.com/support/CPU_Support/show.asp?Model=775Dual-880Pro
> Monday I will checkout if my computer is under specs.
> Seems that I like buy computers with many problems on Linux and fix :)
I bought this computer at the computer shop with the best reputation in
Portugal, and I haven't changed anything.
cat /proc/cpuinfo
processor : 1
vendor_id : GenuineIntel
cpu family : 15
model : 4
model name : Intel(R) Pentium(R) D CPU 2.80GHz
stepping : 4
cpu MHz : 2793.050
cache size : 1024 KB
With 2 x 1024 KB of cache, this matches the Pentium D 820 listed at
http://www.intel.com/products/processor_number/chart/pentium_d.htm
which is supported according to
http://www.asrock.com/support/CPU_Support/show.asp?Model=775Dual-880Pro
775 Pentium D 820 2.80GHz 800MHz 2MB Smithfield All
I also see that it doesn't have Enhanced Intel SpeedStep® Technology.
I attach the x86info output here, which matches
http://processorfinder.intel.com/details.aspx?sSpec=SL88T
Another curiosity: with kernel 2.6.18.1 and the hrtimers patch, the kernel
oopses and hangs at boot if I don't pass the "notsc" option.
>
> > There was also one BIOS found that had this problem, but it was old and rare
> > and got fixed with a upgrade.
I have the latest BIOS release installed.
> >
> > > Just to point out. This could be more a problem of chipsets than CPUs
> > > (AMD or Intel). AMD just begin first using x86_64 archs :)
> >
> > No.
> >
> > -Andi
--
Sérgio M.B.
[-- Attachment #1.2: x86info.txt --]
[-- Type: text/plain, Size: 1855 bytes --]
x86info v1.17. Dave Jones 2001-2005
Feedback to <davej@redhat.com>.
Found 2 CPUs
--------------------------------------------------------------------------
CPU #1
Found unknown cache descriptors: 81 91 96
Family: 15 Model: 4 Stepping: 4 Type: 0 Brand: 0
CPU Model: Extreme Edition [A0]
Processor name string: Intel(R) Pentium(R) D CPU 2.80GHz
Feature flags:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflsh ds acpi mmx fxsr sse sse2 ss ht tm pbe sse3 monitor ds-cpl cntx-id cx16 xTPR
Extended feature flags:
SYSCALL em64t
L1 Data cache:
Size: 16KB Sectored, 8-way associative.
line size=64 bytes.
Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 128 entries.
Found unknown cache descriptors: 81 91 96
Data TLB: 4KB or 4MB pages, fully associative, 64 entries.
Processor serial: 0000-0F44-0000-0000-0000-0000
The physical package supports 2 logical processors
--------------------------------------------------------------------------
CPU #2
Found unknown cache descriptors: 81 91 96
Family: 15 Model: 4 Stepping: 4 Type: 0 Brand: 0
CPU Model: Extreme Edition [A0]
Processor name string: Intel(R) Pentium(R) D CPU 2.80GHz
Feature flags:
fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflsh ds acpi mmx fxsr sse sse2 ss ht tm pbe sse3 monitor ds-cpl cntx-id cx16 xTPR
Extended feature flags:
SYSCALL em64t
L1 Data cache:
Size: 16KB Sectored, 8-way associative.
line size=64 bytes.
Instruction TLB: 4K, 2MB or 4MB pages, fully associative, 128 entries.
Found unknown cache descriptors: 81 91 96
Data TLB: 4KB or 4MB pages, fully associative, 64 entries.
Processor serial: 0000-0F44-0000-0000-0000-0000
The physical package supports 2 logical processors
--------------------------------------------------------------------------
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 2166 bytes --]
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-30 3:10 ` Sergio Monteiro Basto
@ 2006-10-30 15:23 ` Andi Kleen
[not found] ` <1162253008.2999.9.camel@localhost.portugal>
0 siblings, 1 reply; 65+ messages in thread
From: Andi Kleen @ 2006-10-30 15:23 UTC (permalink / raw)
To: sergio
Cc: Lee Revell, Chris Friesen, linux-kernel, john stultz,
suresh.b.siddha
On Monday 30 October 2006 04:10, Sergio Monteiro Basto wrote:
> On Sat, 2006-10-28 at 05:22 +0100, Sergio Monteiro Basto wrote:
> > On Fri, 2006-10-27 at 21:06 -0700, Andi Kleen wrote:
> > > > So far, has I can understand. Seems to me that my computer which have a
> > > > Pentium D (Dual Core) on VIA chipset, also have unsynchronized TSC and
> > > > with the patch of hrtimers on
> > >
> > > Intel systems (except for some large highend systems) have synchronized TSCs.
> > > Only exception so far seems to be a few systems that are
> > > overclocked/overvolted and running outside their specification.
> > > When you do that you'e on your own and we're not interested in a bug
> > > report.
> >
> > and my computer :)
> > http://www.asrock.com/product/775Dual-880Pro.htm
> > http://www.asrock.com/support/CPU_Support/show.asp?Model=775Dual-880Pro
> > Monday I will checkout if my computer is under specs.
> > Seems that I like buy computers with many problems on Linux and fix :)
>
> I bought this computer, on computers shop that have the best credits in
> Portugal. And I don't change anything.
Can you give us a full dmesg without noapic or notsc please?
Adding Suresh to cc too because he spotted a similar problem last time.
-Andi
^ permalink raw reply [flat|nested] 65+ messages in thread
* RE: AMD X2 unsynced TSC fix?
2006-10-28 18:08 ` Lee Revell
2006-10-28 19:14 ` thockin
@ 2006-10-30 17:22 ` Langsdorf, Mark
1 sibling, 0 replies; 65+ messages in thread
From: Langsdorf, Mark @ 2006-10-30 17:22 UTC (permalink / raw)
To: Lee Revell; +Cc: linux-kernel
> > Agreed, I had to turn about 20 dual-core servers to single
> > core because the only way to get a monotonic gtod made it
> > so slow that it was not worth using a dual-core. I initially
> > considered buying one dual-core AMD for my own use, but after
> > seeing this, I'm definitely sure I won't ever buy one as
> > long as this problem is not fixed, as it causes too
> > many problems.
>
> Does anyone know if the problem will really be fixed in new
> CPUs, as AMD promised a year or so ago?
>
> http://lkml.org/lkml/2005/11/4/173
>
> Since that post, there has been Socket F and AM2 which apparently have
> the same issue.
> Were the AMD guys just blowing smoke?
AMD was not blowing smoke. Future AMD processors will have
pstate/cstate invariant TSCs detectable by a CPUID bit.
Unfortunately, those processors have not been released yet, and
I can't comment on their release timeframe, other than to say
they are on our roadmap.
-Mark Langsdorf
AMD, Inc.
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-27 23:04 ` thockin
2006-10-28 0:00 ` Luca Tettamanti
2006-10-28 1:04 ` Andi Kleen
@ 2006-10-30 20:30 ` Christoph Lameter
2 siblings, 0 replies; 65+ messages in thread
From: Christoph Lameter @ 2006-10-30 20:30 UTC (permalink / raw)
To: thockin; +Cc: Luca Tettamanti, Lee Revell, linux-kernel, Andi Kleen,
john stultz
On Fri, 27 Oct 2006, thockin@hockin.org wrote:
> Wrong, too. We have a patch that will be coming SOON (trust me, I am
> pushing hard for the author to publish it). With this patch applied you
> should never see the TSC go backwards. Period. It should be monotonic
> (to userspace, kernel rdtsc calls can still be wrong). CPUs should stay
> very nearly in sync (again, to userspace). The overhead of this patch is
> pretty minimal and costs nothing unless you actually read the TSC.
Well, why not use regular clock_gettime() instead? If you add code for TSC
processing (intercepting RDTSC from user space???) then it may be
comparable in performance to time retrieval via POSIX calls using
vsyscalls. Looks like you may start duplicating the time subsystem?
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
[not found] ` <1162253008.2999.9.camel@localhost.portugal>
@ 2006-10-31 0:14 ` Lee Revell
2006-10-31 0:25 ` john stultz
2006-10-31 2:41 ` Siddha, Suresh B
1 sibling, 1 reply; 65+ messages in thread
From: Lee Revell @ 2006-10-31 0:14 UTC (permalink / raw)
To: sergio
Cc: Andi Kleen, Chris Friesen, linux-kernel, john stultz,
suresh.b.siddha
On Tue, 2006-10-31 at 00:03 +0000, Sergio Monteiro Basto wrote:
> On Mon, 2006-10-30 at 16:23 +0100, Andi Kleen wrote:
> > Can you give us a full dmesg without noapic or notsc please?
> >
>
> yes , I send an dmesg of 2.6.18-git20, dmesg27
> and other dmesg of kernel 2.6.18.1, dmesg30
> To vanilla kernel I just add this patch:
> ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/broken-out/gregkh-pci-pci-via-irq-quirk-behaviour-change.patch
>
> > Adding Suresh to cc too because he spotted a similar problem last
> > time.
>
> Feel free to ask any test, test patches or even access to this machine.
Maybe I've been running -rt for too long but I don't see clocksource
selection - does 2.6.18 not have John Stultz's GTOD rework?
How can it know not to use TSC on machines where it's unstable?
Lee
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-31 0:14 ` Lee Revell
@ 2006-10-31 0:25 ` john stultz
0 siblings, 0 replies; 65+ messages in thread
From: john stultz @ 2006-10-31 0:25 UTC (permalink / raw)
To: Lee Revell
Cc: sergio, Andi Kleen, Chris Friesen, linux-kernel, suresh.b.siddha
On Mon, 2006-10-30 at 19:14 -0500, Lee Revell wrote:
> On Tue, 2006-10-31 at 00:03 +0000, Sergio Monteiro Basto wrote:
> > On Mon, 2006-10-30 at 16:23 +0100, Andi Kleen wrote:
> > > Can you give us a full dmesg without noapic or notsc please?
> > >
> >
> > yes , I send an dmesg of 2.6.18-git20, dmesg27
> > and other dmesg of kernel 2.6.18.1, dmesg30
> > To vanilla kernel I just add this patch:
> > ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.19-rc1/2.6.19-rc1-mm1/broken-out/gregkh-pci-pci-via-irq-quirk-behaviour-change.patch
> >
> > > Adding Suresh to cc too because he spotted a similar problem last
> > > time.
> >
> > Feel free to ask any test, test patches or even access to this machine.
>
> Maybe I've been running -rt for too long but I don't see clocksource
> selection - does 2.6.18 not have John Stultz's GTOD rework?
He's booting x86_64. I've not yet had the time to clean up and push my
x86_64 conversion to CONFIG_GENERIC_TIME. Soon, hopefully.
thanks
-john
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
[not found] ` <1162253008.2999.9.camel@localhost.portugal>
2006-10-31 0:14 ` Lee Revell
@ 2006-10-31 2:41 ` Siddha, Suresh B
2006-10-31 15:05 ` Sergio Monteiro Basto
2006-11-01 1:46 ` Sergio Monteiro Basto
1 sibling, 2 replies; 65+ messages in thread
From: Siddha, Suresh B @ 2006-10-31 2:41 UTC (permalink / raw)
To: Sergio Monteiro Basto
Cc: Andi Kleen, Lee Revell, Chris Friesen, linux-kernel, john stultz,
suresh.b.siddha
On Tue, Oct 31, 2006 at 12:03:28AM +0000, Sergio Monteiro Basto wrote:
> time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
Is this the reason why you are saying your system has unsynchronized TSCs?
Somewhere in this thread, you mentioned that lost ticks happen even
when you use "notsc".
This sounds to me like a different problem. Can you send us the output
of /proc/interrupts?
thanks,
suresh
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-28 5:28 ` Willy Tarreau
2006-10-28 18:08 ` Lee Revell
2006-10-28 18:37 ` Andi Kleen
@ 2006-10-31 11:12 ` Pádraig Brady
2006-10-31 15:31 ` Willy Tarreau
2 siblings, 1 reply; 65+ messages in thread
From: Pádraig Brady @ 2006-10-31 11:12 UTC (permalink / raw)
To: Willy Tarreau
Cc: Lee Revell, Andi Kleen, thockin, Luca Tettamanti, linux-kernel,
john stultz
Willy Tarreau wrote:
> On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
>
>>On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
>>
>>>I don't think it makes too much sense to hack on pure RDTSC when
>>>gtod is fast enough -- RDTSC will be always icky and hard to use.
>>
>>I agree FWIW, our application would be happy to just use gtod if it
>>wasn't so slow on these machines.
>
>
> Agreed, I had to turn about 20 dual-core servers to single core because
> the only way to get a monotonic gtod made it so slow that it was not
> worth using a dual-core. I initially considered buying one dual-core
> AMD for my own use, but after seeing this, I'm definitely sure I won't
> ever buy one as long as this problem is not fixed, as it causes too
> many problems.
For the record, in my previous job we were implementing
a very fast packet sniffer/timestamper using 2x3.2GHz P4 Xeons + Linux 2.4.20 (with gtod).
Very rarely we would see inter-packet times jump by (2^32)/CPU_Hz seconds
when sniffing about 1.2 million packets per second on 2 e1000 links,
which suggested a wraparound of a 32-bit comparison somewhere.
This led to the fix below, which was never picked up
(I guessed because it was addressed elsewhere?).
Note we were only interested in millisecond resolution for the timestamps,
but the approximation is very good in general, as you know the TSCs are very
close to each other when this condition happens.
Note power management was not used on our systems.
Pádraig.
diff -Naru linux-2.4.20/arch/i386/kernel/time.c linux-2.4.20-corvil/arch/i386/kernel/time.c
--- linux-2.4.20/arch/i386/kernel/time.c 2002-11-28 23:53:09.000000000 +0000
+++ linux-2.4.20-pb/arch/i386/kernel/time.c 2005-07-07 10:32:34.000000000 +0100
@@ -94,6 +94,9 @@
/* .. relative to previous jiffy (32 bits is enough) */
eax -= last_tsc_low; /* tsc_low delta */
+ if ((signed)eax < 0) { /* workaround for drifting TSCs */
+ eax = 0;
+ printk(KERN_INFO "tsc wrap around applied\n"); /* rare */
+ }
/*
* Time offset = (tsc_low delta) * fast_gettimeoffset_quotient
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-31 2:41 ` Siddha, Suresh B
@ 2006-10-31 15:05 ` Sergio Monteiro Basto
2006-11-01 1:46 ` Sergio Monteiro Basto
1 sibling, 0 replies; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-10-31 15:05 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andi Kleen, Lee Revell, Chris Friesen, linux-kernel, john stultz
On Mon, 2006-10-30 at 18:41 -0800, Siddha, Suresh B wrote:
> On Tue, Oct 31, 2006 at 12:03:28AM +0000, Sergio Monteiro Basto wrote:
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
>
> Is this the reason why you are saying your system has unsynchronized TSC?
yes
> Some where in this thread, you mentioned that Lost ticks happen even
> when you use "notsc"
yes, with the newer 2.6.19-rcX kernels
>
> This sounds to me as a different problem. Can you send us the output
> of /proc/interrupts?
For which kernel?
I am not at home right now,
but I have here the /proc/interrupts output from a 2.6.16 kernel
http://bugzilla.kernel.org/attachment.cgi?id=7927&action=view
from my bug
http://bugzilla.kernel.org/show_bug.cgi?id=6419
Tonight I can attach to bugzilla bug #6419 the /proc/interrupts output from
a 2.6.18 kernel and from a 2.6.19-rc4 kernel.
BTW: those kernels are for the x86_64 arch. I haven't tried i386 yet, but
that may be my next test.
Thanks,
--
Sérgio M. B.
>
> thanks,
> suresh
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-31 11:12 ` Pádraig Brady
@ 2006-10-31 15:31 ` Willy Tarreau
0 siblings, 0 replies; 65+ messages in thread
From: Willy Tarreau @ 2006-10-31 15:31 UTC (permalink / raw)
To: Pádraig Brady
Cc: Lee Revell, Andi Kleen, thockin, Luca Tettamanti, linux-kernel,
john stultz
On Tue, Oct 31, 2006 at 11:12:47AM +0000, Pádraig Brady wrote:
> Willy Tarreau wrote:
> > On Fri, Oct 27, 2006 at 11:28:00PM -0400, Lee Revell wrote:
> >
> >>On Fri, 2006-10-27 at 18:04 -0700, Andi Kleen wrote:
> >>
> >>>I don't think it makes too much sense to hack on pure RDTSC when
> >>>gtod is fast enough -- RDTSC will be always icky and hard to use.
> >>
> >>I agree FWIW, our application would be happy to just use gtod if it
> >>wasn't so slow on these machines.
> >
> >
> > Agreed, I had to turn about 20 dual-core servers to single core because
> > the only way to get a monotonic gtod made it so slow that it was not
> > worth using a dual-core. I initially considered buying one dual-core
> > AMD for my own use, but after seeing this, I'm definitely sure I won't
> > ever buy one as long as this problem is not fixed, as it causes too
> > many problems.
>
> For the record, in my previous job we were implementing
> a very fast packet sniffer/timestamper using 2x3.2GHz P4 Xeons + linux 2.4.20 (with gtod)
> Very rarely we would see inter packet times jump by (2^32)/CPU_Hz seconds,
> when sniffing about 1.2 million packets per second on 2 e1000 links,
> which suggested a wrap around of a 32 bit comparison somewhere.
Interesting, as in my case I was seeing jumps of about +/- 2s on a 2.2 GHz box, which
also suggests a wraparound.
> This lead to the fix below which was never picked up
> (I guessed because it was addressed elsewhere?).
> Note we were only interested in millisecond resolution for the timestamps,
> but the approximation is very good in general as you know the TSCs are very
> close to each other when this condition happens.
100% agreed.
> Note power management was not used on our systems.
>
> Pádraig.
>
> diff -Naru linux-2.4.20/arch/i386/kernel/time.c linux-2.4.20-corvil/arch/i386/kernel/time.c
> --- linux-2.4.20/arch/i386/kernel/time.c 2002-11-28 23:53:09.000000000 +0000
> +++ linux-2.4.20-pb/arch/i386/kernel/time.c 2005-07-07 10:32:34.000000000 +0100
> @@ -94,6 +94,9 @@
>
> /* .. relative to previous jiffy (32 bits is enough) */
> eax -= last_tsc_low; /* tsc_low delta */
> + if ((signed)eax < 0) { /* workaround for drifting TSCs */
> + eax = 0;
> + printk(KERN_INFO "tsc wrap around applied\n"); /* rare */
> + }
>
> /*
> * Time offset = (tsc_low delta) * fast_gettimeoffset_quotient
Cheers,
Willy
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-10-31 2:41 ` Siddha, Suresh B
2006-10-31 15:05 ` Sergio Monteiro Basto
@ 2006-11-01 1:46 ` Sergio Monteiro Basto
2006-11-01 2:44 ` Siddha, Suresh B
1 sibling, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-01 1:46 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andi Kleen, Lee Revell, Chris Friesen, linux-kernel, john stultz
[-- Attachment #1: Type: text/plain, Size: 1334 bytes --]
On Mon, 2006-10-30 at 18:41 -0800, Siddha, Suresh B wrote:
> On Tue, Oct 31, 2006 at 12:03:28AM +0000, Sergio Monteiro Basto wrote:
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
>
> Is this the reason why you are saying your system has unsynchronized TSC?
> Some where in this thread, you mentioned that Lost ticks happen even
> when you use "notsc"
>
> This sounds to me as a different problem. Can you send us the output
> of /proc/interrupts?
/proc/interrupts on kernel 2.6.18
http://bugzilla.kernel.org/attachment.cgi?id=9384&action=view
dmesg w/o notsc kernel 2.6.19-rc4
http://bugzilla.kernel.org/attachment.cgi?id=9385&action=view
/proc/interrupts kernel 2.6.19-rc4
http://bugzilla.kernel.org/attachment.cgi?id=9386&action=view
dmesg w/ notsc kernel 2.6.19-rc4
http://bugzilla.kernel.org/attachment.cgi?id=9387&action=view
/proc/interrupts kernel 2.6.19-rc4
http://bugzilla.kernel.org/attachment.cgi?id=9388&action=view
list of interrupts given by Windows XP
http://bugzilla.kernel.org/attachment.cgi?id=9389&action=view
Let me know if I can help with anything.
Thanks,
--
Sérgio M.B.
[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 2166 bytes --]
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-11-01 1:46 ` Sergio Monteiro Basto
@ 2006-11-01 2:44 ` Siddha, Suresh B
2006-11-08 0:22 ` Sergio Monteiro Basto
0 siblings, 1 reply; 65+ messages in thread
From: Siddha, Suresh B @ 2006-11-01 2:44 UTC (permalink / raw)
To: Sergio Monteiro Basto
Cc: Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, john stultz, len.brown
On Wed, Nov 01, 2006 at 01:46:48AM +0000, Sergio Monteiro Basto wrote:
> On Mon, 2006-10-30 at 18:41 -0800, Siddha, Suresh B wrote:
> > On Tue, Oct 31, 2006 at 12:03:28AM +0000, Sergio Monteiro Basto wrote:
> > > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> > > time.c: Lost 300 timer tick(s)! rip mwait_idle+0x33/0x4f)
> >
> > Is this the reason why you are saying your system has unsynchronized TSC?
> > Some where in this thread, you mentioned that Lost ticks happen even
> > when you use "notsc"
> >
> > This sounds to me as a different problem. Can you send us the output
> > of /proc/interrupts?
>
> /proc/interrupts on kernel 2.6.18
> http://bugzilla.kernel.org/attachment.cgi?id=9384&action=view
> dmesg w/o notsc kernel 2.6.19-rc4
> http://bugzilla.kernel.org/attachment.cgi?id=9385&action=view
> /proc/interrupts kernel 2.6.19-rc4
> http://bugzilla.kernel.org/attachment.cgi?id=9386&action=view
> dmesg w/ notsc kernel 2.6.19-rc4
> http://bugzilla.kernel.org/attachment.cgi?id=9387&action=view
> /proc/interrupts kernel 2.6.19-rc4
> http://bugzilla.kernel.org/attachment.cgi?id=9388&action=view
> list of interrupts give by windows XP
> http://bugzilla.kernel.org/attachment.cgi?id=9389&action=view
First of all, the "lost timer ticks" messages and the fact that "notsc"
decreases the number of lost ticks are not enough to conclude that this is a
TSC sync issue.
Some device is hogging interrupts, which results in lost timer ticks, and from
your 2.6.18 interrupts info, USB seems to be the culprit. It is probably
a side effect that "notsc" decreases the lost timer ticks.
I've copied Len, who seems to be the owner of the bug, for his thoughts.
(http://bugzilla.kernel.org/show_bug.cgi?id=6419)
thanks,
suresh
^ permalink raw reply [flat|nested] 65+ messages in thread
* Re: AMD X2 unsynced TSC fix?
2006-11-01 2:44 ` Siddha, Suresh B
@ 2006-11-08 0:22 ` Sergio Monteiro Basto
2006-11-08 19:53 ` Thomas Gleixner
0 siblings, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-08 0:22 UTC (permalink / raw)
To: Siddha, Suresh B
Cc: Andi Kleen, Lee Revell, Chris Friesen, linux-kernel, john stultz,
len.brown
[-- Attachment #1: Type: text/plain, Size: 1175 bytes --]
On Tue, 2006-10-31 at 18:44 -0800, Siddha, Suresh B wrote:
> First of all, from "lost timer ticks" messages and the fact that "notsc"
> decreases the number of ticks lost can't be concluded as a TSC sync issue.
OK, but without notsc it is a nightmare.
>
> Some device is hogging interrupts which results in lost timer ticks and from
> your 2.6.18 interrupts info, usb seems to be the culprit.. It is probably
> a side effect that "notsc" decreases the lost timer ticks..
I began using Ethernet for networking instead of usbnet, which reduced the
problems a little (I can get nvidia DRI working without problems or oopses),
but the same lost ticks still appear.
> copied Len who seems to be the owner of the bug for his thoughts..
> (http://bugzilla.kernel.org/show_bug.cgi?id=6419)
I have updated bugzilla with the dmesg from 2.6.19-rc4-mm2, which already comes
with the latest release of hrtimers, because for the first time I could
boot without a hang, with hrtimers and without the notsc boot option.
But it has a very long oops that might give you some clues.
http://bugzilla.kernel.org/show_bug.cgi?id=6419#c55
Thanks,
--
Sérgio M.B.
* Re: AMD X2 unsynced TSC fix?
2006-11-08 0:22 ` Sergio Monteiro Basto
@ 2006-11-08 19:53 ` Thomas Gleixner
2006-11-09 0:39 ` Sergio Monteiro Basto
0 siblings, 1 reply; 65+ messages in thread
From: Thomas Gleixner @ 2006-11-08 19:53 UTC (permalink / raw)
To: sergio
Cc: Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, john stultz, len.brown, Ingo Molnar,
Arjan van de Ven
On Wed, 2006-11-08 at 00:22 +0000, Sergio Monteiro Basto wrote:
> I have updated bugzilla with the dmesg from 2.6.19-RC4-mm2, which already comes
> with the latest release of hrtimers, because for the first time I could
> boot without a hang, with hrtimers and without the notsc boot option.
> But it has a very long oops that may give you some clues.
>
> http://bugzilla.kernel.org/show_bug.cgi?id=6419#c55
This one is a lock dependency problem, which is fixed in -rc5-mm1
tglx
* Re: AMD X2 unsynced TSC fix?
2006-11-08 19:53 ` Thomas Gleixner
@ 2006-11-09 0:39 ` Sergio Monteiro Basto
2006-11-09 1:13 ` john stultz
0 siblings, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-09 0:39 UTC (permalink / raw)
To: tglx
Cc: Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, john stultz, len.brown, Ingo Molnar,
Arjan van de Ven
On Wed, 2006-11-08 at 20:53 +0100, Thomas Gleixner wrote:
> This one is a lock dependency problem, which is fixed in -rc5-mm1
Yes, the oops is fixed with and without the notsc option.
Another question: hrtimers in 2.6.18 found the acpi_pm clocksource and used it.
With 2.6.19-rcX I can't get the acpi_pm clocksource, even when trying to force
it at boot with clocksource=acpi_pm. Any idea?
I ask because with this clocksource my lost ticks disappear.
Thanks,
--
Sérgio M.B.
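On kernels that use the generic clocksource code, the active clocksource can
usually be checked from user space rather than inferred from dmesg. The
following is a minimal sketch, assuming the sysfs file named in
CURRENT_CLOCKSOURCE exists on the running kernel; read_first_line is a
hypothetical helper, and on a kernel without that file it simply returns -1.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumed path; only present on kernels with the generic clocksource code. */
#define CURRENT_CLOCKSOURCE \
	"/sys/devices/system/clocksource/clocksource0/current_clocksource"

/*
 * Read the first line of a small text file into buf, without the
 * trailing newline. Returns 0 on success, -1 if the file cannot be
 * opened or is empty.
 */
static int read_first_line(const char *path, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	FILE *f = fopen(path, "r");
	if (!f)
		return -1;

	char *ok = fgets(buf, (int)len, f);
	fclose(f);
	if (!ok)
		return -1;

	buf[strcspn(buf, "\n")] = '\0';	/* strip the newline, if any */
	return 0;
}
```

Calling read_first_line(CURRENT_CLOCKSOURCE, buf, sizeof buf) would then leave
a name such as "acpi_pm" or "tsc" in buf when the file is available.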
* Re: AMD X2 unsynced TSC fix?
2006-11-09 0:39 ` Sergio Monteiro Basto
@ 2006-11-09 1:13 ` john stultz
2006-11-09 1:27 ` Sergio Monteiro Basto
2006-11-15 1:51 ` Sergio Monteiro Basto
0 siblings, 2 replies; 65+ messages in thread
From: john stultz @ 2006-11-09 1:13 UTC (permalink / raw)
To: sergio
Cc: tglx, Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, len.brown, Ingo Molnar, Arjan van de Ven
On Thu, 2006-11-09 at 00:39 +0000, Sergio Monteiro Basto wrote:
> On Wed, 2006-11-08 at 20:53 +0100, Thomas Gleixner wrote:
> > This one is a lock dependency problem, which is fixed in -rc5-mm1
>
> Yes, the oops is fixed with and without the notsc option.
> Another question: hrtimers in 2.6.18 found the acpi_pm clocksource and used it.
> With 2.6.19-rcX I can't get the acpi_pm clocksource, even when trying to force
> it at boot with clocksource=acpi_pm. Any idea?
> I ask because with this clocksource my lost ticks disappear.
Looking at the dmesg in the bugzilla:
http://bugzilla.kernel.org/show_bug.cgi?id=6419
I noticed you're using x86_64. x86_64 doesn't yet support clocksource
overrides in mainline, as it is not converted to GENERIC_TIME. (Probably
printing out such a warning if an override is used would be nice. I'll
try to get to that soon.)
Now, the code to convert x86_64 is in tglx's hrtimer patch set, so I'm
glad to hear it's working for you, however I'm not sure if it really is
solving the issue or just hiding it (as lost ticks won't affect
timekeeping when you use continuous clocksources and GENERIC_TIME).
To use the ACPI PM w/ a 2.6.19-rcX kernel, use "notsc", and you'll see
the line:
time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
Using the "notsc" option, do you continue to see lost tick messages
after bootup?
thanks
-john
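The point about continuous clocksources can be illustrated with the ACPI PM
timer: elapsed time is computed from the difference between two counter
reads, so a missed interrupt does not lose time as long as the counter is
read again before it wraps. A rough sketch, assuming a 24-bit counter (some
chipsets implement 32 bits) and using the 3.579545 MHz rate from the dmesg
line above; the function names are illustrative, not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define PM_TIMER_HZ   3579545ULL     /* from the "3.579545 MHz" dmesg line */
#define PM_TIMER_MASK 0x00FFFFFFu    /* assumption: 24-bit counter */
#define NSEC_PER_SEC  1000000000ULL

/* Cycles elapsed between two raw counter reads, tolerating one wrap. */
static uint32_t pm_timer_delta(uint32_t then, uint32_t now)
{
	return (now - then) & PM_TIMER_MASK;
}

/* Convert a cycle count to nanoseconds at the PM timer rate. */
static uint64_t pm_cycles_to_ns(uint64_t cycles)
{
	/* Guard against overflow in the multiplication below. */
	assert(cycles <= UINT64_MAX / NSEC_PER_SEC);
	return cycles * NSEC_PER_SEC / PM_TIMER_HZ;
}
```

At this rate a 24-bit counter wraps roughly every 4.7 seconds, so the scheme
only works if the counter is sampled more often than that.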
* Re: AMD X2 unsynced TSC fix?
2006-11-09 1:13 ` john stultz
@ 2006-11-09 1:27 ` Sergio Monteiro Basto
2006-11-15 1:51 ` Sergio Monteiro Basto
1 sibling, 0 replies; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-09 1:27 UTC (permalink / raw)
To: john stultz
Cc: tglx, Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, len.brown, Ingo Molnar, Arjan van de Ven
On Wed, 2006-11-08 at 17:13 -0800, john stultz wrote:
> Using the "notsc" option, do you continue to see lost tick messages
> after bootup?
With notsc, the lost ticks stop after boot-up. The biggest exception
was the last test kernel (2.6.19-RC5-mm1), in which a few lost ticks
appeared, but they seem to have stopped. I am waiting to see whether a new
one appears, but so far none has.
Thanks,
--
Sérgio M.B.
* Re: AMD X2 unsynced TSC fix?
2006-11-09 1:13 ` john stultz
2006-11-09 1:27 ` Sergio Monteiro Basto
@ 2006-11-15 1:51 ` Sergio Monteiro Basto
[not found] ` <20061115193514.41C01102C011@mail.goron.de>
1 sibling, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-15 1:51 UTC (permalink / raw)
To: john stultz
Cc: tglx, Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, len.brown, Ingo Molnar, Arjan van de Ven
On Wed, 2006-11-08 at 17:13 -0800, john stultz wrote:
> On Thu, 2006-11-09 at 00:39 +0000, Sergio Monteiro Basto wrote:
> > On Wed, 2006-11-08 at 20:53 +0100, Thomas Gleixner wrote:
> > > This one is a lock dependency problem, which is fixed in -rc5-mm1
> >
> > Yes, the oops is fixed with and without the notsc option.
> > Another question: hrtimers in 2.6.18 found the acpi_pm clocksource and used it.
> > With 2.6.19-rcX I can't get the acpi_pm clocksource, even when trying to force
> > it at boot with clocksource=acpi_pm. Any idea?
> > I ask because with this clocksource my lost ticks disappear.
>
> Looking at the dmesg in the bugzilla:
> http://bugzilla.kernel.org/show_bug.cgi?id=6419
>
> I noticed you're using x86_64.
Yes, I use _only_ x86_64; I have never tested it on i386.
> x86_64 doesn't yet support clocksource
> overrides in mainline,
Pity. Can I have an experimental patch to test it?
> as it is not converted to GENERIC_TIME. (Probably
> printing out such a warning if an override is used would be nice. I'll
> try to get to that soon.)
>
> Now, the code to convert x86_64 is in tglx's hrtimer patch set, so I'm
> glad to hear it's working for you, however I'm not sure if it really is
> solving the issue or just hiding it (as lost ticks won't affect
> timekeeping when you use continuous clocksources and GENERIC_TIME).
Well, the only kernel on which I can work (yes, I use this computer for work)
is 2.6.18 + dyntick. I think it neither hides nor solves the issue; it just
uses another resource (clocksource) that works better!
>
> To use the ACPI PM w/ a 2.6.19-rcX kernel, use "notsc", and you'll see
> the line:
> time.c: Using 3.579545 MHz WALL PM GTOD PM timer.
>
> Using the "notsc" option, do you continue to see lost tick messages
> after bootup?
I just tested 2.6.19-RC5-mm2 and it is still very unstable, even with notsc.
And yes, some lost tick messages appear after boot-up.
Just rebuilding another kernel while running yum to update other
things at the same time locked up my computer.
So I went back to kernel 2.6.18 + dyntick.
Thanks,
>
> thanks
> -john
>
>
--
Sérgio M.B.
* Re: AMD X2 unsynced TSC fix?
[not found] ` <20061115193514.41C01102C011@mail.goron.de>
@ 2006-11-16 1:38 ` Sergio Monteiro Basto
2006-11-16 1:45 ` Sergio Monteiro Basto
0 siblings, 1 reply; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-16 1:38 UTC (permalink / raw)
To: Andreas Arens, acpi devel
Cc: tglx, Siddha, Suresh B, Andi Kleen, Lee Revell, Chris Friesen,
linux-kernel, len.brown, Ingo Molnar, Arjan van de Ven,
john stultz
On Wed, 2006-11-15 at 19:40 +0100, Andreas Arens wrote:
> as I see from the dmesg on the Fedora bugzilla, your acpi tables
> don't provide an entry to the HPET timer.
> As the VIA8237 happens to have a built-in HPET, I was able to force it
> on using the
> attached patch (against 2.6.18) on an X2 system with the same
> problem, which greatly improved the system stability for me.
But I have an Intel(R) Pentium(R) D CPU 2.8 on a VIA8237.
My latest suspicion is that the root of my computer's problem is not the
processor but the VIA chipset. Your finding that the ACPI tables "don't
provide an entry to the HPET timer" fits that, but how do you know?
I didn't send the DSDT to bugzilla.
> The patch is hand-crafted from some older clock-tick kernel tree
> sources I found by googling.
>
> The thing is hackish and not suitable for mainline inclusion,
> but may be useful nonetheless.
> If you find it useful, and it helps you please let me know.
I tried your patch and it gives me these differences in dmesg (file attached):
time.c detects a different timer, but there is no improvement without the
notsc boot option, and with notsc the computer got worse.
>
--
Sérgio M.B.
[-- Attachment #1.2: dmesg30-38.diff --]
[-- Type: text/x-patch, Size: 14830 bytes --]
2c2
< Linux version 2.6.18-1.3_FC5 (root@monteirov) (gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)) #1 SMP Wed Oct 18 23:54:57 WEST 2006
---
> Linux version 2.6.18-1.13_2.6.18.2+via_8237_force_hpet_FC6 (root@monteirov) (gcc version 4.1.1 20061011 (Red Hat 4.1.1-30)) #1 SMP Thu Nov 16 00:30:41 WET 2006
24,25c24,25
< On node 0 totalpages: 254172
< DMA zone: 1746 pages, LIFO batch:0
---
> On node 0 totalpages: 254180
> DMA zone: 1754 pages, LIFO batch:0
48c48
< Built 1 zonelists. Total pages: 254172
---
> Built 1 zonelists. Total pages: 254180
52,53c52,55
< time.c: Using 3.579545 MHz WALL PM GTOD PIT/TSC timer.
< time.c: Detected 2793.051 MHz processor.
---
> 80000800
> time.c: WARNING: Enabled VIA8237 HPET at 0xfed00000.
> time.c: Using 14.318180 MHz WALL HPET GTOD HPET/TSC timer.
> time.c: Detected 2793.383 MHz processor.
55c57
< time.c: Lost 1 timer tick(s)! rip release_console_sem+0x1bc/0x232)
---
> time.c: Lost 11 timer tick(s)! rip release_console_sem+0x1b3/0x229)
69,70c71,72
< Memory: 1013704k/1048256k available (2391k kernel code, 34164k reserved, 1964k data, 204k init)
< Calibrating delay using timer specific routine.. 5595.08 BogoMIPS (lpj=11190172)
---
> Memory: 1013332k/1048256k available (2382k kernel code, 34536k reserved, 1956k data, 204k init)
> Calibrating delay using timer specific routine.. 5594.01 BogoMIPS (lpj=2797007)
86,88c88,90
< result 12468969
< Detected 12.468 MHz APIC timer.
< time.c: Lost 9 timer tick(s)! rip setup_boot_APIC_clock+0x12c/0x12f)
---
> result 12470375
> Detected 12.470 MHz APIC timer.
> time.c: Lost 35 timer tick(s)! rip setup_boot_APIC_clock+0x128/0x12f)
92c94
< Calibrating delay using timer specific routine.. 5586.53 BogoMIPS (lpj=11173068)
---
> Calibrating delay using timer specific routine.. 5586.07 BogoMIPS (lpj=2793035)
99a102
> time.c: Lost 2 timer tick(s)! rip __do_softirq+0x5c/0xf5)
101c104
< migration_cost=691
---
> migration_cost=681
103c106
< Freeing initrd memory: 1100k freed
---
> Freeing initrd memory: 1507k freed
169a173,174
> hpet0: at MMIO 0xfed00000 (virtual 0xffffffffff5fe000), IRQs 2, 8, 0
> hpet0: 3 32-bit timers, 14318180 Hz
191c196
< audit(1162252117.788:1): initialized
---
> audit(1163638320.760:1): initialized
210,211c215,216
< ACPI Error (psparse-0537): Method parse/execution failed [\_PR_.CPU1._PDC] (Node ffff8100027fb810), AE_BAD_HEADER
< ACPI Error (psparse-0537): Method parse/execution failed [\_PR_.CPU2._PDC] (Node ffff8100027fb650), AE_BAD_HEADER
---
> ACPI Error (psparse-0537): Method parse/execution failed [\_PR_.CPU1._PDC] (Node ffff8100027e0810), AE_BAD_HEADER
> ACPI Error (psparse-0537): Method parse/execution failed [\_PR_.CPU2._PDC] (Node ffff8100027e0650), AE_BAD_HEADER
249d253
< PM: Adding info for serio:serio1
250a255
> PM: Adding info for serio:serio1
259c264
< Write protecting the kernel read-only data: 451k
---
> Write protecting the kernel read-only data: 444k
261,308d265
< SCSI subsystem initialized
< libata version 2.00 loaded.
< sata_via 0000:00:0f.0: version 2.0
< ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 193
< sata_via 0000:00:0f.0: routed to hard irq line 3
< ata1: SATA max UDMA/133 cmd 0xD880 ctl 0xD802 bmdma 0xD080 irq 193
< ata2: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xD088 irq 193
< scsi0 : sata_via
< PM: Adding info for No Bus:host0
< ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
< input: ImPS/2 Generic Wheel Mouse as /class/input/input1
< ata1.00: ATA-7, max UDMA/100, 390721968 sectors: LBA48 NCQ (depth 0/32)
< ata1.00: ata1: dev 0 multi count 16
< ata1.00: configured for UDMA/100
< scsi1 : sata_via
< PM: Adding info for No Bus:host1
< ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300)
< ATA: abnormal status 0x7F on port 0xD487
< time.c: Lost 1 timer tick(s)! rip __do_softirq+0x5c/0xf5)
< PM: Adding info for No Bus:target0:0:0
< Vendor: ATA Model: Maxtor 6L200M0 Rev: BANC
< Type: Direct-Access ANSI SCSI revision: 05
< PM: Adding info for scsi:0:0:0:0
< SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
< sda: Write Protect is off
< sda: Mode Sense: 00 3a 00 00
< SCSI device sda: drive cache: write back
< SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
< sda: Write Protect is off
< sda: Mode Sense: 00 3a 00 00
< SCSI device sda: drive cache: write back
< sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 >
< sd 0:0:0:0: Attached scsi disk sda
< kjournald starting. Commit interval 5 seconds
< EXT3-fs: mounted filesystem with ordered data mode.
< SELinux: Disabled at runtime.
< SELinux: Unregistering netfilter hooks
< audit(1162252122.040:2): selinux=0 auid=4294967295
< input: PC Speaker as /class/input/input2
< via-rhine.c:v1.10-LK1.4.1 July-24-2006 Written by Donald Becker
< GSI 18 sharing vector 0xC9 and IRQ 18
< ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 23 (level, low) -> IRQ 201
< eth0: VIA Rhine II at 0xfbfffc00, 00:13:8f:6e:8f:c5, IRQ 201.
< eth0: MII PHY found at address 1, status 0x786d advertising 05e1 Link 0021.
< hdc: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
< Uniform CD-ROM driver Revision: 3.20
< shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
< sd 0:0:0:0: Attached scsi generic sg0 type 0
310,311c267,268
< GSI 19 sharing vector 0xD1 and IRQ 19
< ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 209
---
> GSI 18 sharing vector 0xC9 and IRQ 18
> ACPI: PCI Interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 201
314c271
< uhci_hcd 0000:00:10.0: irq 209, io base 0x0000ec00
---
> uhci_hcd 0000:00:10.0: irq 201, io base 0x0000ec00
322c279
< ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 209
---
> ACPI: PCI Interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 201
325c282
< uhci_hcd 0000:00:10.1: irq 209, io base 0x0000e080
---
> uhci_hcd 0000:00:10.1: irq 201, io base 0x0000e080
333c290
< ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 209
---
> ACPI: PCI Interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 201
336c293
< uhci_hcd 0000:00:10.2: irq 209, io base 0x0000e000
---
> uhci_hcd 0000:00:10.2: irq 201, io base 0x0000e000
343,345d299
< Floppy drive(s): fd0 is 1.44M
< FDC 0 is a post-1991 82077
< PM: Adding info for platform:floppy.0
347c301
< ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 209
---
> ACPI: PCI Interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 201
350c304
< uhci_hcd 0000:00:10.3: irq 209, io base 0x0000dc00
---
> uhci_hcd 0000:00:10.3: irq 201, io base 0x0000dc00
356a311
> input: ImPS/2 Generic Wheel Mouse as /class/input/input1
358,359c313,314
< PM: Adding info for No Bus:i2c-0
< ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 209
---
> ohci_hcd: 2005 April 22 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
> ACPI: PCI Interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 201
362c317
< ehci_hcd 0000:00:10.4: irq 209, io mem 0xfbfff800
---
> ehci_hcd 0000:00:10.4: irq 201, io mem 0xfbfff800
370a326,385
> SCSI subsystem initialized
> libata version 2.00 loaded.
> sata_via 0000:00:0f.0: version 2.0
> ACPI: PCI Interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 193
> sata_via 0000:00:0f.0: routed to hard irq line 3
> ata1: SATA max UDMA/133 cmd 0xD880 ctl 0xD802 bmdma 0xD080 irq 193
> ata2: SATA max UDMA/133 cmd 0xD480 ctl 0xD402 bmdma 0xD088 irq 193
> scsi0 : sata_via
> PM: Adding info for No Bus:host0
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: ATA-7, max UDMA/100, 390721968 sectors: LBA48 NCQ (depth 0/32)
> ata1.00: ata1: dev 0 multi count 16
> ata1.00: configured for UDMA/100
> scsi1 : sata_via
> PM: Adding info for No Bus:host1
> ata2: SATA link down 1.5 Gbps (SStatus 0 SControl 300)
> ATA: abnormal status 0x7F on port 0xD487
> time.c: Lost 10 timer tick(s)! rip __do_softirq+0x5c/0xf5)
> usb 5-6: new high speed USB device using ehci_hcd and address 2
> PM: Adding info for No Bus:target0:0:0
> Vendor: ATA Model: Maxtor 6L200M0 Rev: BANC
> Type: Direct-Access ANSI SCSI revision: 05
> PM: Adding info for scsi:0:0:0:0
> SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
> sda: Write Protect is off
> sda: Mode Sense: 00 3a 00 00
> SCSI device sda: drive cache: write back
> SCSI device sda: 390721968 512-byte hdwr sectors (200050 MB)
> sda: Write Protect is off
> sda: Mode Sense: 00 3a 00 00
> SCSI device sda: drive cache: write back
> sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 >
> sd 0:0:0:0: Attached scsi disk sda
> PM: Adding info for usb:5-6
> PM: Adding info for No Bus:usbdev5.2_ep00
> usb 5-6: configuration #1 chosen from 1 choice
> PM: Adding info for usb:5-6:1.0
> PM: Adding info for No Bus:usbdev5.2_ep81
> PM: Adding info for No Bus:usbdev5.2_ep02
> PM: Adding info for No Bus:usbdev5.2_ep83
> libusual: modprobe for usb-storage succeeded, but module is not present
> kjournald starting. Commit interval 5 seconds
> EXT3-fs: mounted filesystem with ordered data mode.
> SELinux: Disabled at runtime.
> SELinux: Unregistering netfilter hooks
> audit(1163638326.450:2): selinux=0 auid=4294967295
> via-rhine.c:v1.10-LK1.4.1 July-24-2006 Written by Donald Becker
> GSI 19 sharing vector 0xD1 and IRQ 19
> ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 23 (level, low) -> IRQ 209
> eth0: VIA Rhine II at 0xfbfffc00, 00:13:8f:6e:8f:c5, IRQ 209.
> eth0: MII PHY found at address 1, status 0x786d advertising 05e1 Link 45e1.
> shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
> hdc: ATAPI 48X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
> Uniform CD-ROM driver Revision: 3.20
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> input: PC Speaker as /class/input/input2
> PM: Adding info for No Bus:i2c-0
> Floppy drive(s): fd0 is 1.44M
> FDC 0 is a post-1991 82077
> PM: Adding info for platform:floppy.0
375,388d389
< usb 5-6: new high speed USB device using ehci_hcd and address 3
< PM: Adding info for usb:5-6
< PM: Adding info for No Bus:usbdev5.3_ep00
< usb 5-6: configuration #1 chosen from 1 choice
< PM: Adding info for usb:5-6:1.0
< PM: Adding info for No Bus:usbdev5.3_ep81
< PM: Adding info for No Bus:usbdev5.3_ep02
< PM: Adding info for No Bus:usbdev5.3_ep83
< Initializing USB Mass Storage driver...
< usb 2-2: new full speed USB device using uhci_hcd and address 2
< PM: Adding info for usb:2-2
< PM: Adding info for No Bus:usbdev2.2_ep00
< usb 2-2: configuration #1 chosen from 1 choice
< PM: Adding info for usb:2-2:1.0
391,402d391
< PM: Adding info for No Bus:usbdev2.2_ep85
< PM: Adding info for usb:2-2:1.1
< scsi2 : SCSI emulation for USB Mass Storage devices
< PM: Adding info for No Bus:host2
< usb-storage: device found at 3
< usb-storage: waiting for device to settle before scanning
< usbcore: registered new driver usb-storage
< USB Mass Storage support registered.
< PM: Adding info for No Bus:usbdev2.2_ep81
< PM: Adding info for No Bus:usbdev2.2_ep02
< eth1: register 'cdc_ether' at usb-0000:00:10.1-2, CDC Ethernet Device, 00:90:64:fc:ce:2b
< usbcore: registered new driver cdc_ether
407d395
< ibm_acpi: ec object not found
421,458c409,410
< PM: Adding info for No Bus:target2:0:0
< Vendor: OTi Model: CF CARD Reader Rev: 2.00
< Type: Direct-Access ANSI SCSI revision: 00
< PM: Adding info for scsi:2:0:0:0
< sd 2:0:0:0: Attached scsi removable disk sdb
< sd 2:0:0:0: Attached scsi generic sg1 type 0
< Vendor: OTi Model: SM CARD Reader Rev: 2.00
< Type: Direct-Access ANSI SCSI revision: 00
< PM: Adding info for scsi:2:0:0:1
< sd 2:0:0:1: Attached scsi removable disk sdc
< sd 2:0:0:1: Attached scsi generic sg2 type 0
< Vendor: OTi Model: SD CARD Reader Rev: 2.00
< Type: Direct-Access ANSI SCSI revision: 00
< PM: Adding info for scsi:2:0:0:2
< sd 2:0:0:2: Attached scsi removable disk sdd
< sd 2:0:0:2: Attached scsi generic sg3 type 0
< Vendor: OTi Model: MS CARD Reader Rev: 2.00
< Type: Direct-Access ANSI SCSI revision: 00
< PM: Adding info for scsi:2:0:0:3
< sd 2:0:0:3: Attached scsi removable disk sde
< sd 2:0:0:3: Attached scsi generic sg4 type 0
< PM: Adding info for No Bus:target2:0:1
< PM: Removing info for No Bus:target2:0:1
< PM: Adding info for No Bus:target2:0:2
< PM: Removing info for No Bus:target2:0:2
< PM: Adding info for No Bus:target2:0:3
< PM: Removing info for No Bus:target2:0:3
< PM: Adding info for No Bus:target2:0:4
< PM: Removing info for No Bus:target2:0:4
< PM: Adding info for No Bus:target2:0:5
< PM: Removing info for No Bus:target2:0:5
< PM: Adding info for No Bus:target2:0:6
< PM: Removing info for No Bus:target2:0:6
< PM: Adding info for No Bus:target2:0:7
< PM: Removing info for No Bus:target2:0:7
< usb-storage: device scan complete
< eth0: link up, 10Mbps, half-duplex, lpa 0x0021
< audit(1162252141.385:3): audit_pid=2024 old=0 by auid=4294967295
---
> eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
> audit(1163638345.198:3): audit_pid=1980 old=0 by auid=4294967295
462d413
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
466c417
< eth1: no IPv6 routers present
---
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
468,476c419,430
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip _spin_unlock_irq+0x2e/0x31)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
< time.c: Lost 300 timer tick(s)! rip mwait_idle+0x3f/0x54)
---
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip sha_transform+0x1c/0x1f4)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip 0x2b5b93ffd59a)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
> time.c: Lost 1201 timer tick(s)! rip mwait_idle+0x3f/0x54)
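A note on comparing the two logs: the lpj values differ by about a factor of
four while BogoMIPS stays nearly the same, which suggests the two kernels were
built with different HZ settings. The sketch below is only a reading aid,
assuming the usual relation BogoMIPS = lpj * HZ / 500000 from the kernel's
delay calibration; the function names are made up for illustration.

```c
#include <assert.h>

/* Infer HZ from a "BogoMIPS (lpj=...)" line, rounded to the nearest integer. */
static long infer_hz(double bogomips, unsigned long lpj)
{
	assert(lpj > 0);
	return (long)(bogomips * 500000.0 / (double)lpj + 0.5);
}

/* Convert a lost-tick count to milliseconds for a given HZ. */
static unsigned long ticks_to_ms(unsigned long ticks, long hz)
{
	assert(hz > 0);
	return ticks * 1000UL / (unsigned long)hz;
}
```

Applied to the values in the diff, infer_hz(5595.08, 11190172) gives 250 and
infer_hz(5594.01, 2797007) gives 1000. Under that assumption, "Lost 300 timer
tick(s)" and "Lost 1201 timer tick(s)" both correspond to roughly 1.2 seconds,
so the larger count with the HPET patch need not mean that more time was lost
per event.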
* Re: AMD X2 unsynced TSC fix?
2006-11-16 1:38 ` Sergio Monteiro Basto
@ 2006-11-16 1:45 ` Sergio Monteiro Basto
0 siblings, 0 replies; 65+ messages in thread
From: Sergio Monteiro Basto @ 2006-11-16 1:45 UTC (permalink / raw)
To: Andreas Arens
Cc: acpi devel, tglx, Siddha, Suresh B, Andi Kleen, Lee Revell,
Chris Friesen, linux-kernel, len.brown, Ingo Molnar,
Arjan van de Ven, john stultz
Yes, Andreas Arens sent the patch only to me, so I am forwarding it to the
mailing lists.
On Thu, 2006-11-16 at 01:38 +0000, Sergio Monteiro Basto wrote:
> On Wed, 2006-11-15 at 19:40 +0100, Andreas Arens wrote:
> > as I see from the dmesg on the Fedora bugzilla, your acpi tables
> > don't provide an entry to the HPET timer.
>
> > As the VIA8237 happens to have a built-in HPET, I was able to force it
> > on using the
> > attached patch (against 2.6.18) on an X2 system with the same
> > problem, which greatly improved the system stability for me.
>
> But I have an Intel(R) Pentium(R) D CPU 2.8 on a VIA8237.
> My latest suspicion is that the root of my computer's problem is not the
> processor but the VIA chipset. Your finding that the ACPI tables "don't
> provide an entry to the HPET timer" fits that, but how do you know?
> I didn't send the DSDT to bugzilla.
>
>
> > The patch is hand-crafted from some older clock-tick kernel tree
> > sources I found by googling.
> >
> > The thing is hackish and not suitable for mainline inclusion,
> > but may be useful nonetheless.
> > If you find it useful, and it helps you please let me know.
>
> I tried your patch and it gives me these differences in dmesg (file attached):
> time.c detects a different timer, but there is no improvement without the
> notsc boot option, and with notsc the computer got worse.
>
>
> >
--
Sérgio M.B.
[-- Attachment #1.2: 2_6_18_via_8237_force_hpet_enable.diff --]
[-- Type: text/x-patch, Size: 1436 bytes --]
--- linux-2.6.18-gentoo-r2/arch/x86_64/kernel/time.c.unpatched 2006-11-15 19:29:07.000000000 +0100
+++ linux-2.6.18-gentoo-r2/arch/x86_64/kernel/time.c 2006-11-15 19:30:51.000000000 +0100
@@ -42,6 +42,9 @@
 #ifdef CONFIG_X86_LOCAL_APIC
 #include <asm/apic.h>
 #endif
+#if 1
+#include <linux/pci_ids.h>
+#endif
 #ifdef CONFIG_CPU_FREQ
 static void cpufreq_delayed_get(void);
@@ -815,6 +818,48 @@
 static int hpet_init(void)
 {
 	unsigned int id;
+#if 1
+	union conf_address {
+		struct {
+			u8 reg;
+			u8 func: 3;
+			u8 dev: 5;
+			u8 bus;
+			u8 reserved:7;
+			u8 enable: 1;
+		} bits;
+		u32 dword;
+	};
+	union conf_address ca = {
+		.bits.reg = 0,
+		.bits.dev = 17,
+		.bits.enable = 1
+	};
+	union {
+		struct {
+			u8 control;
+			u8 address[3];
+		} hpet;
+		unsigned raw;
+	} hpet;
+	u32 vendor_id, control;
+
+	control = inl(0xcf8);
+	printk("%X\n", control);
+	outl(ca.dword, 0xcf8);
+	vendor_id = inl(0xcfc);
+	if (vendor_id == (PCI_VENDOR_ID_VIA + (PCI_DEVICE_ID_VIA_8237 << 16))) {
+		hpet.raw = 0xFED00000;
+		hpet.hpet.control = 0x80;
+		ca.bits.reg = 0x68;
+		outl(ca.dword, 0xcf8);
+		outl(hpet.raw, 0xcfc);
+		outl(ca.dword, 0xcf8);
+		vxtime.hpet_address = (inl(0xcfc) & 0xFFFFFF00);
+		printk(KERN_WARNING "time.c: WARNING: Enabled VIA8237 HPET "
+		       "at %#lx.\n", vxtime.hpet_address);
+	}
+#endif
 	if (!vxtime.hpet_address)
 		return -1;
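The patch builds its PCI configuration address with bitfields and relies on
their little-endian layout. As a reading aid, the sketch below computes the
same 32-bit values with explicit shifts, assuming the standard PCI
configuration mechanism #1 layout for the dword written to port 0xcf8. The
helper names are hypothetical, and the code only computes values; it performs
no port I/O.

```c
#include <assert.h>
#include <stdint.h>

/* Dword for port 0xcf8: enable bit, bus, device, function, register. */
static uint32_t pci_conf1_address(uint8_t bus, uint8_t dev,
				  uint8_t func, uint8_t reg)
{
	assert(dev < 32 && func < 8);
	return 0x80000000u |
	       ((uint32_t)bus << 16) |
	       ((uint32_t)dev << 11) |
	       ((uint32_t)func << 8) |
	       (reg & 0xFCu);		/* registers are dword aligned */
}

/*
 * Value the patch writes to register 0x68: the HPET base address in the
 * upper 24 bits and the control byte (0x80 in the patch) in the low byte.
 */
static uint32_t via_hpet_reg(uint32_t base, uint8_t control)
{
	return (base & 0xFFFFFF00u) | control;
}
```

For the device the patch addresses (bus 0, device 17, function 0), register
0x68 gives 0x80008868, and the value written for base 0xFED00000 with control
0x80 is 0xFED00080.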
end of thread, other threads:[~2006-11-16 1:45 UTC | newest]
Thread overview: 65+ messages
2006-10-27 17:15 AMD X2 unsynced TSC fix? Lee Revell
2006-10-27 20:18 ` Luca Tettamanti
2006-10-27 23:04 ` thockin
2006-10-28 0:00 ` Luca Tettamanti
2006-10-28 0:17 ` Lee Revell
2006-10-28 2:46 ` thockin
2006-10-28 3:59 ` Andi Kleen
2006-10-28 6:32 ` thockin
2006-10-28 9:14 ` Vojtech Pavlik
2006-10-28 18:22 ` Lee Revell
2006-10-28 19:57 ` Vojtech Pavlik
2006-10-28 22:54 ` thockin
2006-10-28 1:04 ` Andi Kleen
2006-10-28 3:28 ` Lee Revell
2006-10-28 5:28 ` Willy Tarreau
2006-10-28 18:08 ` Lee Revell
2006-10-28 19:14 ` thockin
2006-10-30 17:22 ` Langsdorf, Mark
2006-10-28 18:37 ` Andi Kleen
2006-10-28 19:15 ` Willy Tarreau
2006-10-28 19:18 ` thockin
2006-10-28 19:32 ` Willy Tarreau
2006-10-28 19:42 ` thockin
2006-10-28 20:16 ` Willy Tarreau
2006-10-28 19:33 ` Andi Kleen
2006-10-28 20:04 ` Willy Tarreau
2006-10-28 20:11 ` Andi Kleen
2006-10-28 20:36 ` Willy Tarreau
2006-10-29 1:28 ` Lee Revell
2006-10-28 21:00 ` Lee Revell
2006-10-31 11:12 ` Pádraig Brady
2006-10-31 15:31 ` Willy Tarreau
2006-10-30 20:30 ` Christoph Lameter
2006-10-27 20:35 ` Andi Kleen
2006-10-27 20:41 ` Lee Revell
2006-10-27 21:48 ` Chris Friesen
2006-10-27 22:08 ` Lee Revell
2006-10-28 3:58 ` Sergio Monteiro Basto
2006-10-28 4:06 ` Andi Kleen
2006-10-28 4:22 ` Sergio Monteiro Basto
2006-10-30 3:10 ` Sergio Monteiro Basto
2006-10-30 15:23 ` Andi Kleen
[not found] ` <1162253008.2999.9.camel@localhost.portugal>
2006-10-31 0:14 ` Lee Revell
2006-10-31 0:25 ` john stultz
2006-10-31 2:41 ` Siddha, Suresh B
2006-10-31 15:05 ` Sergio Monteiro Basto
2006-11-01 1:46 ` Sergio Monteiro Basto
2006-11-01 2:44 ` Siddha, Suresh B
2006-11-08 0:22 ` Sergio Monteiro Basto
2006-11-08 19:53 ` Thomas Gleixner
2006-11-09 0:39 ` Sergio Monteiro Basto
2006-11-09 1:13 ` john stultz
2006-11-09 1:27 ` Sergio Monteiro Basto
2006-11-15 1:51 ` Sergio Monteiro Basto
[not found] ` <20061115193514.41C01102C011@mail.goron.de>
2006-11-16 1:38 ` Sergio Monteiro Basto
2006-11-16 1:45 ` Sergio Monteiro Basto
2006-10-28 6:35 ` thockin
2006-10-28 6:46 ` Andrew Morton
2006-10-28 6:49 ` thockin
2006-10-28 7:13 ` Andrew Morton
2006-10-28 7:25 ` thockin
2006-10-28 9:46 ` Andi Kleen
2006-10-28 9:45 ` Andi Kleen
2006-10-28 9:48 ` Andi Kleen
2006-10-27 21:58 ` Friedrich Göpel