All of lore.kernel.org
 help / color / mirror / Atom feed
* [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))
@ 2006-03-26 21:44 Chuck Ebbert
  2006-03-26 23:18 ` Andi Kleen
  2006-03-27  0:42 ` Ray Lee
  0 siblings, 2 replies; 3+ messages in thread
From: Chuck Ebbert @ 2006-03-26 21:44 UTC (permalink / raw)
  To: Ray Lee
  Cc: Andreas Mohr, Andrew Morton, linux-kernel, Dominik Brodowski,
	John Stultz, Alan Cox, Andi Kleen

In-Reply-To: <2c0942db0603240922u7bade58lcf2a6af2af8ec6ae@mail.gmail.com>

On Fri, 24 Mar 2006 09:22:51 -0800, Ray Lee wrote:
> On 3/24/06, Andreas Mohr <andi@rhlx01.fht-esslingen.de> wrote:
> > +       loops += bclock;
> [...]
> > -       } while ((now-bclock) < loops);
> > +       } while (now < loops);
> 
> Erm, aren't you introducing an overflow problem here?
> 
> if loops is 2^32-1, bclock is 1, the old version would execute the
> proper number of times, the new one will blow out in one tick.

Yes, but the old version has a bug too.  I don't have 2.6.16-mm1, but
in 2.6.16 it's in arch/i386/kernel/timers/timer_tsc.c and
arch/x86_64/lib/delay.c:

static void delay_tsc(unsigned long loops)
{
        unsigned long bclock, now;

        rdtscl(bclock);
        do
        {
                rep_nop();
                rdtscl(now);
        } while ((now-bclock) < loops);
}

If (loops == 100000) and (bclock == 2^32-1) the loop will terminate
immediately when the low part of the TSC overflows because (now-bclock)
is a large number. That can't be right...

I'm running a system with this applied now.  I think there are still
problems if someone uses huge delays, though.  What keeps someone from
trying to delay for > 2^31 cycles?


i386 delay_tsc() will truncate delays when the TSC is within 'loops'
of overflow.  We must be able to handle TSC overflow both before and
after 'end', i.e. [1], [2] and [3] below.

                                               zero
                                                 |
case A (end < start)       start  [1]            |  [2]  end  [3]
                                                 |
case B (end > start)       start  [1]  end  [2]  |  [3]

Signed-off-by: Chuck Ebbert <76306.1226@compuserve.com>

--- 2.6.16-d2.orig/arch/i386/kernel/timers/timer_tsc.c
+++ 2.6.16-d2/arch/i386/kernel/timers/timer_tsc.c
@@ -170,14 +170,22 @@ unsigned long long sched_clock(void)
 
 static void delay_tsc(unsigned long loops)
 {
-	unsigned long bclock, now;
+	unsigned long start, end, now;
 	
-	rdtscl(bclock);
-	do
-	{
-		rep_nop();
-		rdtscl(now);
-	} while ((now-bclock) < loops);
+	rdtscl(start);
+	end = start + loops;
+
+	if (unlikely(end < start)) {
+		do {
+			rep_nop();
+			rdtscl(now);
+		} while (now > start || now < end);
+	} else {
+		do {
+			rep_nop();
+			rdtscl(now);
+		} while (now > start && now < end);
+	}
 }
 
 #ifdef CONFIG_HPET_TIMER
-- 
Chuck
"Penguins don't come from next door, they come from the Antarctic!"

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))
  2006-03-26 21:44 [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)) Chuck Ebbert
@ 2006-03-26 23:18 ` Andi Kleen
  2006-03-27  0:42 ` Ray Lee
  1 sibling, 0 replies; 3+ messages in thread
From: Andi Kleen @ 2006-03-26 23:18 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Ray Lee, Andreas Mohr, Andrew Morton, linux-kernel,
	Dominik Brodowski, John Stultz, Alan Cox


> I'm running a system with this applied now.  I think there are still
> problems if someone uses huge delays, though.  What keeps someone from
> trying to delay for > 2^31 cycles?

You shouldn't. The caller has a compile time check for that. And if you pass 
in dynamic values you get what you deserve.

-Andi

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1))
  2006-03-26 21:44 [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)) Chuck Ebbert
  2006-03-26 23:18 ` Andi Kleen
@ 2006-03-27  0:42 ` Ray Lee
  1 sibling, 0 replies; 3+ messages in thread
From: Ray Lee @ 2006-03-27  0:42 UTC (permalink / raw)
  To: Chuck Ebbert
  Cc: Andreas Mohr, Andrew Morton, linux-kernel, Dominik Brodowski,
	John Stultz, Alan Cox, Andi Kleen

On 3/26/06, Chuck Ebbert <76306.1226@compuserve.com> wrote:
> On Fri, 24 Mar 2006 09:22:51 -0800, Ray Lee wrote:
> > On 3/24/06, Andreas Mohr <andi@rhlx01.fht-esslingen.de> wrote:
> > > +       loops += bclock;
> > [...]
> > > -       } while ((now-bclock) < loops);
> > > +       } while (now < loops);
> >
> > Erm, aren't you introducing an overflow problem here?
> >
> > if loops is 2^32-1, bclock is 1, the old version would execute the
> > proper number of times, the new one will blow out in one tick.
>
> Yes, but the old version has a bug too.
[...]
> If (loops == 100000) and (bclock == 2^32-1) the loop will terminate
> immediately when the low part of the TSC overflows because (now-bclock)
> is a large number.

Er, no, it won't, because (now-bclock) won't be large.

I know thinking about math on a modulo number line such as u8/16/32 is
odd, but it's best if you just always think of "subtraction" to mean
"distance between." (Which is always true in any space or coordinate
system, even with wrap arounds.) This is the same trick used by
Andrew's ring buffers, where you let head and tail wrap around freely,
and only perform the modulo operation at dereferencing.

A simple test program will give you a better feel for what's going on
(I write a lot of these...):

#include <stdio.h>
int main() {
  unsigned int a,b,c;
  a=-1-1;
  b=1000;
  c=b-a;
  printf("%u - %u = %u\n", b, a, c);
}

ray@issola:~/work/test/overflow$ gcc -o test test.c
ray@issola:~/work/test/overflow$ ./test
1000 - 4294967294 = 1002

So, it wraps appropriately, as odd as that may seem at first blush.

Ray

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2006-03-27  0:42 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-26 21:44 [patch] fix delay_tsc (was Re: delay_tsc(): inefficient delay loop (2.6.16-mm1)) Chuck Ebbert
2006-03-26 23:18 ` Andi Kleen
2006-03-27  0:42 ` Ray Lee

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.