public inbox for linux-kernel@vger.kernel.org
* [PATCH 2.6.14] X86_64 delay resolution
@ 2005-12-09 16:34 Ross Biro
From: Ross Biro @ 2005-12-09 16:34 UTC
  To: Andrew Morton, linux-kernel@vger.kernel.org, Linus Torvalds

On x86_64 smp systems, we noticed that the amount of time udelay would
spin for varied from cpu to cpu.  In our test case, udelay(10) would
only delay for 9.7us on some cpus while on others it would delay for
the full 10us.  We tracked the problem down to an unnecessary attempt
to avoid arithmetic overflow.  Here's a fix complete with an overly
verbose comment.
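
To illustrate the truncation described above, here is a minimal user-space sketch of the two calculations. It is not the kernel code: the function names are hypothetical, and HZ = 1000 and loops_per_jiffy = 3000000 (roughly a 3 GHz cpu) are assumed values chosen for the example. It also assumes udelay(n) passes n * 0x10c6 as xloops, as the bound in the patch comment suggests.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed per-microsecond scale factor, about 2^32 / 10^6. */
#define XLOOPS_PER_USEC 0x10c6ULL

/*
 * Old formula: shift first, then scale by hz. The result is always a
 * multiple of hz, so fewer than hz loops are lost to truncation.
 */
uint64_t delay_loops_old(uint64_t xloops, uint64_t loops_per_jiffy, uint64_t hz)
{
    assert(hz > 0);
    return ((xloops * loops_per_jiffy) >> 32) * hz;
}

/* Formula from the patch: scale by hz before the shift. */
uint64_t delay_loops_new(uint64_t xloops, uint64_t loops_per_jiffy, uint64_t hz)
{
    assert(hz > 0);
    return (xloops * hz * loops_per_jiffy) >> 32;
}
```

With these assumed values, a 10us delay (xloops = 10 * 0x10c6) comes out as 29000 loops under the old formula, about 9.67us at 3 GHz, and as 29993 loops under the new one, about 9.997us. Other values of HZ or loops_per_jiffy give a different shortfall.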

--- ../linux-2.6.14/arch/x86_64/lib/delay.c     2005-10-27 20:02:08.000000000 -0400
+++ arch/x86_64/lib/delay.c     2005-12-09 11:21:33.000000000 -0500
@@ -37,9 +37,16 @@
        while((now-bclock) < loops);
 }

+/*
+ * We should be able to do this multiplication without overflowing provided
+ * the cpu is running at less than about 128 GHz.  xloops < 20000 * 0x10c6.
+ * loops_per_jiffy * HZ <= cpu_clock_speed.  So if the cpu clock speed
+ * < 2^37 = 128G = 2^64/2^27 < 2^64/0x51E6CC0 = 2^64/(20000 * 0x10c6) we
+ * will not overflow the calculation.
+ */
 inline void __const_udelay(unsigned long xloops)
 {
-       __delay(((xloops * cpu_data[raw_smp_processor_id()].loops_per_jiffy) >> 32) * HZ);
+       __delay(((xloops * HZ * cpu_data[raw_smp_processor_id()].loops_per_jiffy) >> 32));
 }

This patch is against 2.6.15-rc5-git1, but the code in question hasn't
changed for quite a while, so it should apply cleanly to most later
2.6 kernels.
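
The overflow bound from the patch comment can be checked with a small user-space sketch. The names below are hypothetical and not part of the kernel. The sketch takes the comment's limit of xloops <= 20000 * 0x10c6 as given and treats loops_per_jiffy * HZ as the cpu clock speed in Hz.

```c
#include <assert.h>
#include <stdint.h>

/* Largest xloops value, using the limit stated in the patch comment. */
#define MAX_XLOOPS (20000ULL * 0x10c6ULL)

/* Largest clock speed (in Hz) for which MAX_XLOOPS * clock_hz fits in 64 bits. */
uint64_t max_safe_clock_hz(void)
{
    return UINT64_MAX / MAX_XLOOPS;
}

/* Nonzero if xloops * HZ * loops_per_jiffy cannot overflow at this clock speed. */
int product_fits_in_64_bits(uint64_t clock_hz)
{
    assert(clock_hz > 0);
    return clock_hz <= max_safe_clock_hz();
}
```

Under these assumptions the exact limit works out to roughly 214 GHz, so the 2^37 (128G) figure in the comment is a conservative round-number bound.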

Since it's possible, although unlikely, that some device drivers
depend on the lack of resolution of the timer, the conservative thing
to do would be to apply this fix to the -mm kernels and let it sit for
a bit.

We only found this because we made another change that might have
impacted udelay, so we tested udelay to make sure it was still working
properly.  In particular, I'm not aware of any problems that can be
traced to udelay or ndelay having a resolution of HZ/2^32 seconds.

    Ross


* Re: [PATCH 2.6.14] X86_64 delay resolution
  2005-12-09 16:34 [PATCH 2.6.14] X86_64 delay resolution Ross Biro
@ 2005-12-09 23:24 ` Andi Kleen
From: Andi Kleen @ 2005-12-09 23:24 UTC
  To: Ross Biro; +Cc: linux-kernel

Ross Biro <ross.biro@gmail.com> writes:

> On x86_64 smp systems, we noticed that the amount of time udelay would
> spin for varied from cpu to cpu.  In our test case, udelay(10) would
> only delay for 9.7us on some cpus while on others it would delay for
> the full 10us.  We tracked the problem down to an unnecessary attempt
> to avoid arithmetic overflow.  Here's a fix complete with an overly
> verbose comment.

I applied it without the comment and without the extra sets
of brackets. Please submit future x86-64 patches directly
to me, sign them off, and use top-level diffs.

Thanks,
-Andi
