public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* x86 TSC time warp puzzle
@ 2005-04-02  1:43 Jonathan Lundell
  2005-04-04  8:59 ` P
  0 siblings, 1 reply; 6+ messages in thread
From: Jonathan Lundell @ 2005-04-02  1:43 UTC (permalink / raw)
  To: LKML

Well, not actually a time warp, though it feels like one.

I'm doing some real-time bit-twiddling in a driver, using the TSC to 
measure out delays on the order of hundreds of nanoseconds. Because I 
want an upper limit on the delay, I disable interrupts around it.

The logic is something like:

	local_irq_save
	out(set a bit)
	t0 = TSC
	wait while (t = (TSC - t0)) < delay_time
	out(clear the bit)
	local_irq_restore

 From time to time, when I exit the delay, t is *much* bigger than 
delay_time. If delay_time is, say, 300ns, t is usually no more than 
325ns. But every so often, t can be 2000, or 10000, or even much 
higher.

The value of t seems to depend on the CPU involved, The worst case is 
with an Intel 915GV chipset, where t approaches 500 microseconds (!).

This is with ACPI and HT disabled, to avoid confounding interactions. 
I suspected NMI, of course, but I monitored the nmi counter, and 
mostly saw nothing (from time to time a random hit, but mostly not).

The longer delay is real. I can see the bit being set/cleared in the 
pseudocode above on a scope, and when the long delay happens, the bit 
is set for a correspondingly long time.

BTW, the symptom is independent of my IO. I wrote a test case that 
does diddles nothing but reading TSC, and get the same result.

Finally, on some CPUs, at least, the extra delay appears to be 
periodic. The 500us delay happens about every second. On a different 
machine (chipset) it happens at about 5 Hz. And the characteristic 
delay on each type of machine seems consistent.

Any ideas of where to look? Other lists to inquire on?

Thanks.
-- 
/Jonathan Lundell.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: x86 TSC time warp puzzle
@ 2005-04-02  7:05 Pallipadi, Venkatesh
  2005-04-02  8:13 ` Lee Revell
  0 siblings, 1 reply; 6+ messages in thread
From: Pallipadi, Venkatesh @ 2005-04-02  7:05 UTC (permalink / raw)
  To: Jonathan Lundell, LKML


At what point are you seeing these delays? During early boot or after
boot? 
It can be SMI happening in the platform. Typically BIOS uses some SMI
polling 
to handle some devices during early boot. Though 500 microseconds sounds
a 
bit too high.

Thanks,
Venki

>-----Original Message-----
>From: linux-kernel-owner@vger.kernel.org 
>[mailto:linux-kernel-owner@vger.kernel.org] On Behalf Of 
>Jonathan Lundell
>Sent: Friday, April 01, 2005 5:43 PM
>To: LKML
>Subject: x86 TSC time warp puzzle
>
>Well, not actually a time warp, though it feels like one.
>
>I'm doing some real-time bit-twiddling in a driver, using the TSC to 
>measure out delays on the order of hundreds of nanoseconds. Because I 
>want an upper limit on the delay, I disable interrupts around it.
>
>The logic is something like:
>
>	local_irq_save
>	out(set a bit)
>	t0 = TSC
>	wait while (t = (TSC - t0)) < delay_time
>	out(clear the bit)
>	local_irq_restore
>
> From time to time, when I exit the delay, t is *much* bigger than 
>delay_time. If delay_time is, say, 300ns, t is usually no more than 
>325ns. But every so often, t can be 2000, or 10000, or even much 
>higher.
>
>The value of t seems to depend on the CPU involved, The worst case is 
>with an Intel 915GV chipset, where t approaches 500 microseconds (!).
>
>This is with ACPI and HT disabled, to avoid confounding interactions. 
>I suspected NMI, of course, but I monitored the nmi counter, and 
>mostly saw nothing (from time to time a random hit, but mostly not).
>
>The longer delay is real. I can see the bit being set/cleared in the 
>pseudocode above on a scope, and when the long delay happens, the bit 
>is set for a correspondingly long time.
>
>BTW, the symptom is independent of my IO. I wrote a test case that 
>does diddles nothing but reading TSC, and get the same result.
>
>Finally, on some CPUs, at least, the extra delay appears to be 
>periodic. The 500us delay happens about every second. On a different 
>machine (chipset) it happens at about 5 Hz. And the characteristic 
>delay on each type of machine seems consistent.
>
>Any ideas of where to look? Other lists to inquire on?
>
>Thanks.
>-- 
>/Jonathan Lundell.
>-
>To unsubscribe from this list: send the line "unsubscribe 
>linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: x86 TSC time warp puzzle
  2005-04-02  7:05 x86 TSC time warp puzzle Pallipadi, Venkatesh
@ 2005-04-02  8:13 ` Lee Revell
  2005-04-03  4:04   ` Jonathan Lundell
  0 siblings, 1 reply; 6+ messages in thread
From: Lee Revell @ 2005-04-02  8:13 UTC (permalink / raw)
  To: Pallipadi, Venkatesh; +Cc: Jonathan Lundell, LKML

On Fri, 2005-04-01 at 23:05 -0800, Pallipadi, Venkatesh wrote:
> It can be SMI happening in the platform. Typically BIOS uses some SMI
> polling 
> to handle some devices during early boot. Though 500 microseconds sounds
> a 
> bit too high.
> 

Nope, that sounds just about right.  Buggy BIOSes that implement ACPI
via SMM (or so I have been told) can stall the machine for over a
millisecond, this is why some laptops lose timer ticks at HZ=1000.  The
issue is well known by Linux audio users, as it causes big problems for
people who buy laptops for live audio use.

A list of known good/bad machines would be a tremendous help, but no one
knows the exact extent of the problem.  All Acer laptops seem to be
affected.

Hardware manufacturers (laptops anyway) don't seem to care about
anything below 1-2ms because Windows uses HZ=100 and the ASIO drivers on
that platform only go down to about ~1.5 ms latency.

Lee




^ permalink raw reply	[flat|nested] 6+ messages in thread

* RE: x86 TSC time warp puzzle
  2005-04-02  8:13 ` Lee Revell
@ 2005-04-03  4:04   ` Jonathan Lundell
  0 siblings, 0 replies; 6+ messages in thread
From: Jonathan Lundell @ 2005-04-03  4:04 UTC (permalink / raw)
  To: LKML

At 3:13 AM -0500 4/2/05, Lee Revell wrote:
>On Fri, 2005-04-01 at 23:05 -0800, Pallipadi, Venkatesh wrote:
>>  It can be SMI happening in the platform. Typically BIOS uses some SMI
>  > polling to handle some devices during early boot. Though 500 
>microseconds > sounds a bit too high.
>
>Nope, that sounds just about right.  Buggy BIOSes that implement ACPI
>via SMM (or so I have been told) can stall the machine for over a
>millisecond, this is why some laptops lose timer ticks at HZ=1000.  The
>issue is well known by Linux audio users, as it causes big problems for
>people who buy laptops for live audio use.

This is a desktop board, and this is well after boot (hours). Also, 
ACPI is disabled in the BIOS.

I suppose I can try to disable SMI via the APIC?
-- 
/Jonathan Lundell.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x86 TSC time warp puzzle
  2005-04-02  1:43 Jonathan Lundell
@ 2005-04-04  8:59 ` P
  2005-04-04 13:58   ` Joe Korty
  0 siblings, 1 reply; 6+ messages in thread
From: P @ 2005-04-04  8:59 UTC (permalink / raw)
  To: Jonathan Lundell; +Cc: LKML

Jonathan Lundell wrote:
> Well, not actually a time warp, though it feels like one.
> 
> I'm doing some real-time bit-twiddling in a driver, using the TSC to 
> measure out delays on the order of hundreds of nanoseconds. Because I 
> want an upper limit on the delay, I disable interrupts around it.
> 
> The logic is something like:
> 
>     local_irq_save
>     out(set a bit)
>     t0 = TSC
>     wait while (t = (TSC - t0)) < delay_time
>     out(clear the bit)
>     local_irq_restore
> 
>  From time to time, when I exit the delay, t is *much* bigger than 
> delay_time. If delay_time is, say, 300ns, t is usually no more than 
> 325ns. But every so often, t can be 2000, or 10000, or even much higher.
> 
> The value of t seems to depend on the CPU involved, The worst case is 
> with an Intel 915GV chipset, where t approaches 500 microseconds (!).

Probably not the same thing, but on 2.4 I was noticing
large TSC jumps, the magnitude of which was dependent on CPU speed.
They were always around 1.26ms on my 3.4GHz dual HT xeon system.
That's (2^32)/(3.4*10^9) which suggested it was a 32 bit overflow
somewhere, which pointed me at:
http://lxr.linux.no/source/arch/i386/kernel/time.c?v=2.4.28#L96
This implied the TSCs were drifting relative to each other
(even between logical CPUs on 1 package).
I worked around the problem by setting the IRQ affinity
for my ethernet IRQs (the source of the do_gettimeofday()s)
to a particular logical CPU rather than a physical CPU and also
tied the timer interrupt to CPU0.
I guess I could also maintain a last_tsc_low for each CPU also?

Pádraig.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: x86 TSC time warp puzzle
  2005-04-04  8:59 ` P
@ 2005-04-04 13:58   ` Joe Korty
  0 siblings, 0 replies; 6+ messages in thread
From: Joe Korty @ 2005-04-04 13:58 UTC (permalink / raw)
  To: P; +Cc: Jonathan Lundell, LKML

On Mon, Apr 04, 2005 at 09:59:22AM +0100, P@draigBrady.com wrote:
> Jonathan Lundell wrote:
> >Well, not actually a time warp, though it feels like one.
> >
> >I'm doing some real-time bit-twiddling in a driver, using the TSC to 
> >measure out delays on the order of hundreds of nanoseconds. Because I 
> >want an upper limit on the delay, I disable interrupts around it.
> >
> >The logic is something like:
> >
> >    local_irq_save
> >    out(set a bit)
> >    t0 = TSC
> >    wait while (t = (TSC - t0)) < delay_time
> >    out(clear the bit)
> >    local_irq_restore
> >
> > From time to time, when I exit the delay, t is *much* bigger than 
> >delay_time. If delay_time is, say, 300ns, t is usually no more than 
> >325ns. But every so often, t can be 2000, or 10000, or even much higher.
> >
> >The value of t seems to depend on the CPU involved, The worst case is 
> >with an Intel 915GV chipset, where t approaches 500 microseconds (!).


Add nmi_watchdog=0 to your boot command line.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2005-04-04 13:58 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-04-02  7:05 x86 TSC time warp puzzle Pallipadi, Venkatesh
2005-04-02  8:13 ` Lee Revell
2005-04-03  4:04   ` Jonathan Lundell
  -- strict thread matches above, loose matches on Subject: below --
2005-04-02  1:43 Jonathan Lundell
2005-04-04  8:59 ` P
2005-04-04 13:58   ` Joe Korty

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox