[parisc-linux] teaching the kernel to do division

All of lore.kernel.org
 help / color / mirror / Atom feed

* [parisc-linux] teaching the kernel to do division
@ 2003-10-23  7:21 Randolph Chung
  2003-10-23 23:39 ` [parisc-linux] " Randolph Chung
  0 siblings, 1 reply; 5+ messages in thread
From: Randolph Chung @ 2003-10-23  7:21 UTC (permalink / raw)
  To: parisc-linux

I was investigating why nanosleep() doesn't return the correct remaining
time with a 2.6 kernel... turns out the kernel doesn't seem to know 
how to do division properly :-(

kernel/posix-timers.c has:

                tsave->tv_sec = div_long_long_rem(left,
                                                  NSEC_PER_SEC,
                                                  &tsave->tv_nsec);

which eventually calls __div64_32() in lib/div64.c

if you test that function, you see that it does weird things. For
example, it tells me that:

28999591392 / 1000000000 = 3, remainder = 229787616

<sigh>

does anyone want to look into fixing it, and/or writing an optimized
version of that function for pa? :-) it needs to do basically this (but
be standalone)

uint32_t div64(uint64_t *n, uint32_t base)
{
        uint32_t rem;

        rem = *n % base;
        *n = *n / base;

        return rem;
}

thanks :)
randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [parisc-linux] Re: teaching the kernel to do division
  2003-10-23  7:21 [parisc-linux] teaching the kernel to do division Randolph Chung
@ 2003-10-23 23:39 ` Randolph Chung
  2003-10-25 11:13   ` Joel Soete
  0 siblings, 1 reply; 5+ messages in thread
From: Randolph Chung @ 2003-10-23 23:39 UTC (permalink / raw)
  To: parisc-linux

> does anyone want to look into fixing it, and/or writing an optimized
> version of that function for pa? :-) it needs to do basically this (but
> be standalone)

Here's one way to do it, but maybe it can be optimized a bit. Any
suggestions before i send it upstream?

randolph

Index: lib/div64.c
===================================================================
RCS file: /var/cvs/linux-2.6/lib/div64.c,v
retrieving revision 1.1
diff -u -p -r1.1 div64.c
--- lib/div64.c	29 Jul 2003 17:02:19 -0000	1.1
+++ lib/div64.c	23 Oct 2003 23:36:08 -0000
@@ -25,26 +25,28 @@
 
 uint32_t __div64_32(uint64_t *n, uint32_t base)
 {
-	uint32_t low, low2, high, rem;
+        uint64_t rem = *n;
+        uint64_t b = base;
+        uint64_t res = 0, d = 1;
 
-	low   = *n   & 0xffffffff;
-	high  = *n  >> 32;
-	rem   = high % (uint32_t)base;
-	high  = high / (uint32_t)base;
-	low2  = low >> 16;
-	low2 += rem << 16;
-	rem   = low2 % (uint32_t)base;
-	low2  = low2 / (uint32_t)base;
-	low   = low  & 0xffff;
-	low  += rem << 16;
-	rem   = low  % (uint32_t)base;
-	low   = low  / (uint32_t)base;
+        if (b > 0) {
+                while (b < rem) {
+                        b <<= 1;
+                        d <<= 1;
+                }
+        }
+        
+        do {
+                if (rem >= b) {
+                        rem -= b;
+                        res += d;
+                }
+                b >>= 1;
+                d >>= 1;
+        } while (d);
 
-	*n = low +
-		((uint64_t)low2 << 16) +
-		((uint64_t)high << 32);
-
-	return rem;
+        *n = res;
+        return rem;
 }
 
 EXPORT_SYMBOL(__div64_32);

-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [parisc-linux] Re: teaching the kernel to do division
  2003-10-23 23:39 ` [parisc-linux] " Randolph Chung
@ 2003-10-25 11:13   ` Joel Soete
  2003-10-25 15:42     ` Joel Soete
  2003-10-25 16:07     ` Randolph Chung
  0 siblings, 2 replies; 5+ messages in thread
From: Joel Soete @ 2003-10-25 11:13 UTC (permalink / raw)
  To: Randolph Chung; +Cc: parisc-linux

Hi Rnadolph and all,

I reach to figure out what your pb is.
This new algo is Ok and the previous formula was definitely wrong (even 
in 2.4 :( ). So it could be interesting to fix it also.

For this I start from basic math:
	Dividend = divisor * quotient + reminder
ie	D = d * q + r (D, d, q, r: being integers)

I can also split D in two 32bits words and write
	D = D1 * 2^32 + D0

But I always have only one equation with two unknown var (q and r).

Is somebody knows where can I find the right solution?
(or should I find it back in objdump from :
uint32_t div64(uint64_t *n, uint32_t base)
{
         uint32_t rem;

         rem = *n % base;
         *n = *n / base;

         return rem;
})

Thanks in advance,
	Joel


Randolph Chung wrote:
>>does anyone want to look into fixing it, and/or writing an optimized
>>version of that function for pa? :-) it needs to do basically this (but
>>be standalone)
> 
> 
> Here's one way to do it, but maybe it can be optimized a bit. Any
> suggestions before i send it upstream?
> 
> randolph
> 
> Index: lib/div64.c
> ===================================================================
> RCS file: /var/cvs/linux-2.6/lib/div64.c,v
> retrieving revision 1.1
> diff -u -p -r1.1 div64.c
> --- lib/div64.c	29 Jul 2003 17:02:19 -0000	1.1
> +++ lib/div64.c	23 Oct 2003 23:36:08 -0000
> @@ -25,26 +25,28 @@
>  
>  uint32_t __div64_32(uint64_t *n, uint32_t base)
>  {
> -	uint32_t low, low2, high, rem;
> +        uint64_t rem = *n;
> +        uint64_t b = base;
> +        uint64_t res = 0, d = 1;
>  
> -	low   = *n   & 0xffffffff;
> -	high  = *n  >> 32;
> -	rem   = high % (uint32_t)base;
> -	high  = high / (uint32_t)base;
> -	low2  = low >> 16;
> -	low2 += rem << 16;
> -	rem   = low2 % (uint32_t)base;
> -	low2  = low2 / (uint32_t)base;
> -	low   = low  & 0xffff;
> -	low  += rem << 16;
> -	rem   = low  % (uint32_t)base;
> -	low   = low  / (uint32_t)base;
> +        if (b > 0) {
> +                while (b < rem) {
> +                        b <<= 1;
> +                        d <<= 1;
> +                }
> +        }
> +        
> +        do {
> +                if (rem >= b) {
> +                        rem -= b;
> +                        res += d;
> +                }
> +                b >>= 1;
> +                d >>= 1;
> +        } while (d);
>  
> -	*n = low +
> -		((uint64_t)low2 << 16) +
> -		((uint64_t)high << 32);
> -
> -	return rem;
> +        *n = res;
> +        return rem;
>  }
>  
>  EXPORT_SYMBOL(__div64_32);
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [parisc-linux] Re: teaching the kernel to do division
  2003-10-25 11:13   ` Joel Soete
@ 2003-10-25 15:42     ` Joel Soete
  2003-10-25 16:07     ` Randolph Chung
  1 sibling, 0 replies; 5+ messages in thread
From: Joel Soete @ 2003-10-25 15:42 UTC (permalink / raw)
  To: Joel Soete; +Cc: Randolph Chung, parisc-linux

Hi all,

this test case would explaining better (i hope at least) than long 
sentences what i mean:
#include <linux/types.h>
#include <stdio.h>
#include <limits.h>

/*
/* Algo 1: the goal */
uint32_t div64(uint64_t *n, uint32_t base)
{
         uint32_t rem;

         rem = *n % base;
         *n = *n / base;

         return rem;
}

/* Algo 2: original 2.6 */
uint32_t div64(uint64_t *n, uint32_t base)
{
         uint32_t low, low2, high, rem;

         low   = *n   & 0xffffffff;
         high  = *n  >> 32;
         rem   = high % (uint32_t)base;
         high  = high / (uint32_t)base;
         low2  = low >> 16;
         low2 += rem << 16;
         rem   = low2 % (uint32_t)base;
         low2  = low2 / (uint32_t)base;
         low   = low  & 0xffff;
         low  += rem << 16;
         rem   = low  % (uint32_t)base;
         low   = low  / (uint32_t)base;

         *n = low +
                 ((uint64_t)low2 << 16) +
                 ((uint64_t)high << 32);

         return rem;
}

/* Algo 3: 2.4 method  */
#define div64(n,base)                                                  \
({                                                                     \
         unsigned long __low, __low2, __high, __rem;                    \
         __low  = (n) & 0xffffffff;                                     \
         __high = (n) >> 32;                                            \
         if (__high) {                                                  \
                 __rem   = __high % (unsigned long)base;                \
                 __high  = __high / (unsigned long)base;                \
                 __low2  = __low >> 16;                                 \
                 __low2 += __rem << 16;                                 \
                 __rem   = __low2 % (unsigned long)base;                \
                 __low2  = __low2 / (unsigned long)base;                \
                 __low   = __low & 0xffff;                              \
                 __low  += __rem << 16;                                 \
                 __rem   = __low  % (unsigned long)base;                \
                 __low   = __low  / (unsigned long)base;                \
                 n = __low  + ((long long)__low2 << 16) +               \
                         ((long long) __high << 32);                    \
         } else {                                                       \
                 __rem = __low % (unsigned long)base;                   \
                 n = (__low / (unsigned long)base);                     \
         }                                                              \
         __rem;                                                         \
})
  */

  /* Algo 4: the Randolph patch OK*/
uint32_t div64(uint64_t *n, uint32_t base)
{
         uint64_t rem = *n;
         uint64_t b = base;
         uint64_t res = 0, d = 1;

         if (b > 0) {
                 while (b < rem) {
                         b <<= 1;
                         d <<= 1;
                 }
         }

         do {
                 if (rem >= b) {
                         rem -= b;
                         res += d;
                 }
                 b >>= 1;
                 d >>= 1;
         } while (d);

         *n = res;
         return rem;
}

main()
{
         uint32_t Rem;
         uint64_t Num;

         Num=(uint64_t) 28999591392ULL;

/*
         printf("Length of Rem: %d\n", sizeof(Rem));
         printf("Length of Num: %d\n", sizeof(Num));
  */
         printf ("Num = %#018Lx (%#lld)\n", Num, Num);
         Rem = div64(&Num, 1000000000UL);

         printf ("Rem = %#012x (%#ld)\n", Rem, Rem);
}

With corresponding results:
         Algo 1 (Ok)
Num = 0x00000006c082a5e0 (28999591392)
Rem = 0x003b948de0 (999591392)

         Algo 2 (Ko)
Num = 0x00000006c082a5e0 (28999591392)
Rem = 0x000db247e0 (229787616)

         Algo 3 (Ko)
Num = 0x00000006c082a5e0 (28999591392)
Rem = 0x000db247e0 (229787616)

         Algo 4 (Ok)
Num = 0x00000006c082a5e0 (28999591392)
Rem = 0x003b948de0 (999591392)

hth,
	Joel

PS: All seems well presented in my mozilla window and hope that will stay ;)

Joel Soete wrote:
> Hi Rnadolph and all,
> 
> I reach to figure out what your pb is.
> This new algo is Ok and the previous formula was definitely wrong (even 
> in 2.4 :( ). So it could be interesting to fix it also.
> 
> For this I start from basic math:
>     Dividend = divisor * quotient + reminder
> ie    D = d * q + r (D, d, q, r: being integers)
> 
> I can also split D in two 32bits words and write
>     D = D1 * 2^32 + D0
> 
> But I always have only one equation with two unknown var (q and r).
> 
> Is somebody knows where can I find the right solution?
> (or should I find it back in objdump from :
> uint32_t div64(uint64_t *n, uint32_t base)
> {
>         uint32_t rem;
> 
>         rem = *n % base;
>         *n = *n / base;
> 
>         return rem;
> })
> 
> Thanks in advance,
>     Joel
> 
> 
> Randolph Chung wrote:
> 
>>> does anyone want to look into fixing it, and/or writing an optimized
>>> version of that function for pa? :-) it needs to do basically this (but
>>> be standalone)
>>
>>
>>
>> Here's one way to do it, but maybe it can be optimized a bit. Any
>> suggestions before i send it upstream?
>>
>> randolph
>>
>> Index: lib/div64.c
>> ===================================================================
>> RCS file: /var/cvs/linux-2.6/lib/div64.c,v
>> retrieving revision 1.1
>> diff -u -p -r1.1 div64.c
>> --- lib/div64.c    29 Jul 2003 17:02:19 -0000    1.1
>> +++ lib/div64.c    23 Oct 2003 23:36:08 -0000
>> @@ -25,26 +25,28 @@
>>  
>>  uint32_t __div64_32(uint64_t *n, uint32_t base)
>>  {
>> -    uint32_t low, low2, high, rem;
>> +        uint64_t rem = *n;
>> +        uint64_t b = base;
>> +        uint64_t res = 0, d = 1;
>>  
>> -    low   = *n   & 0xffffffff;
>> -    high  = *n  >> 32;
>> -    rem   = high % (uint32_t)base;
>> -    high  = high / (uint32_t)base;
>> -    low2  = low >> 16;
>> -    low2 += rem << 16;
>> -    rem   = low2 % (uint32_t)base;
>> -    low2  = low2 / (uint32_t)base;
>> -    low   = low  & 0xffff;
>> -    low  += rem << 16;
>> -    rem   = low  % (uint32_t)base;
>> -    low   = low  / (uint32_t)base;
>> +        if (b > 0) {
>> +                while (b < rem) {
>> +                        b <<= 1;
>> +                        d <<= 1;
>> +                }
>> +        }
>> +        +        do {
>> +                if (rem >= b) {
>> +                        rem -= b;
>> +                        res += d;
>> +                }
>> +                b >>= 1;
>> +                d >>= 1;
>> +        } while (d);
>>  
>> -    *n = low +
>> -        ((uint64_t)low2 << 16) +
>> -        ((uint64_t)high << 32);
>> -
>> -    return rem;
>> +        *n = res;
>> +        return rem;
>>  }
>>  
>>  EXPORT_SYMBOL(__div64_32);
>>
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [parisc-linux] Re: teaching the kernel to do division
  2003-10-25 11:13   ` Joel Soete
  2003-10-25 15:42     ` Joel Soete
@ 2003-10-25 16:07     ` Randolph Chung
  1 sibling, 0 replies; 5+ messages in thread
From: Randolph Chung @ 2003-10-25 16:07 UTC (permalink / raw)
  To: Joel Soete; +Cc: parisc-linux

> Is somebody knows where can I find the right solution?

think about how you would do binary long division by hand. the algorithm
is exactly the same.

the more interesting question is how can we implement a more efficient
one for hppa. $$divU in libgcc.a is almost what we want, except i only
see a 32-bit version.

randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2003-10-25 16:04 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2003-10-23  7:21 [parisc-linux] teaching the kernel to do division Randolph Chung
2003-10-23 23:39 ` [parisc-linux] " Randolph Chung
2003-10-25 11:13   ` Joel Soete
2003-10-25 15:42     ` Joel Soete
2003-10-25 16:07     ` Randolph Chung

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.