A bug in div64.s

* A bug in div64.s
@ 2005-09-12  7:21 Liu Fred-a18596
  2005-09-12  8:26 ` Paul Mackerras
  0 siblings, 1 reply; 3+ messages in thread
From: Liu Fred-a18596 @ 2005-09-12  7:21 UTC (permalink / raw)
  To: linuxppc-dev

Hello,

I found there was a bug in arch/ppc/lib/div64.s.

The original div64_32() function would produce a zero divide exception in case of div64_32(0x100000000, 0xFFFFFFFF).

Here is the bug-fixed new div64_32() function.

==============================================
/*
 * Divide a 64-bit unsigned number by a 32-bit unsigned number.
 * This routine assumes that the top 32 bits of the dividend are
 * non-zero to start with.
 * On entry, r3 points to the dividend, which gets overwritten with
 * the 64-bit quotient, and r4 contains the divisor.
 * On exit, r3 contains the remainder.
 *
 * Copyright (C) 2002 Paul Mackerras, IBM Corp.
 *
 * This program is free software; you can redistribute it and/or
 * modify it under the terms of the GNU General Public License
 * as published by the Free Software Foundation; either version
 * 2 of the License, or (at your option) any later version.
 */
#include <asm/ppc_asm.h>
#include <asm/processor.h>

_GLOBAL(__div64_32)
        lwz     r5,0(r3)        # get the dividend into r5/r6
        lwz     r6,4(r3)
        cmplw   r5,r4
        li      r7,0
        li      r8,0
        blt     1f
        divwu   r7,r5,r4        # if dividend.hi >= divisor,
        mullw   r0,r7,r4        # quotient.hi = dividend.hi / divisor
        subf.   r5,r0,r5        # dividend.hi %= divisor
        beq     3f
1:      mr      r11,r5          # here dividend.hi != 0
        andis.  r0,r5,0xc000
        bne     2f
        cntlzw  r0,r5           # we are shifting the dividend right
        li      r10,-1          # to make it < 2^32, and shifting
        srw     r10,r10,r0      # the divisor right the same amount,
        addc.   r9,r4,r10       # rounding up (so the estimate cannot
        andc    r11,r6,r10      # ever be too large, only too small)
        andc    r9,r9,r10
        mfxer   r12                    #
        rlwinm. r12, r12, 3, 31, 31    # XER[CA]
        cmpwi   r12, 1                 # carry ?
        bne     5f
        ori     r9, r9, 0x1            # carry !
5:
        or      r11,r5,r11
        rotlw   r9,r9,r0
        rotlw   r11,r11,r0
        divwu   r11,r11,r9      # then we divide the shifted quantities
2:      mullw   r10,r11,r4      # to get an estimate of the quotient,
        mulhwu  r9,r11,r4       # multiply the estimate by the divisor,
        subfc   r6,r10,r6       # take the product from the dividend,
        add     r8,r8,r11       # and add the estimate to the accumulated
        subfe.  r5,r9,r5        # quotient
        bne     1b
3:      cmplw   r6,r4
        blt     4f
        divwu   r0,r6,r4        # perform the remaining 32-bit division
        mullw   r10,r0,r4       # and get the remainder
        add     r8,r8,r0
        subf    r6,r10,r6
4:      stw     r7,0(r3)        # return the quotient in *r3
        stw     r8,4(r3)
        mr      r3,r6           # return the remainder in r3
        blr
=================================================

 

BR,
Fred

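The failing case can be traced with a small C model of the divisor-rounding step. This is only a sketch for following the arithmetic, not kernel code: the helper names clz32, rotl32 and shifted_divisor are made up for illustration, and the keep_carry flag is an assumed way of switching between an add that drops the carry and one that keeps it.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a non-zero 32-bit value (models cntlzw). */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    assert(x != 0);
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Rotate left by n bits, 0 <= n <= 31 (models rotlw). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/*
 * Hypothetical model of the shifted, rounded-up divisor that feeds the
 * divwu in the estimate step.  hi is the non-zero dividend.hi.
 * keep_carry = 0 models an add whose carry-out is dropped;
 * keep_carry = 1 models addc with XER[CA] added back in afterwards.
 */
static uint32_t shifted_divisor(uint32_t hi, uint32_t divisor, int keep_carry)
{
    unsigned shift = clz32(hi);            /* cntlzw r0,r5 */
    uint32_t mask = 0xFFFFFFFFu >> shift;  /* li r10,-1 ; srw r10,r10,r0 */
    uint32_t sum = divisor + mask;         /* add, wraps modulo 2^32 */
    uint32_t carry = sum < divisor;        /* 1 if the add overflowed */

    sum &= ~mask;                          /* andc r9,r9,r10 */
    if (keep_carry)
        sum += carry;                      /* carry re-enters at bit 0 */
    return rotl32(sum, shift);             /* rotlw r9,r9,r0 */
}

/*
 * For div64_32(0x100000000, 0xFFFFFFFF): hi = 1, shift = 31, mask = 1.
 * shifted_divisor(1, 0xFFFFFFFFu, 0) is 0, so the following divwu
 * divides by zero; shifted_divisor(1, 0xFFFFFFFFu, 1) is 0x80000000.
 */
```

Because the low bits of the sum are cleared by the mask, the carry placed at bit 0 lands at bit position `shift` after the rotate, which is where the lost 2^32 belongs once the divisor has been shifted right.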


* Re: A bug in div64.s
  2005-09-12  7:21 A bug in div64.s Liu Fred-a18596
@ 2005-09-12  8:26 ` Paul Mackerras
  0 siblings, 0 replies; 3+ messages in thread
From: Paul Mackerras @ 2005-09-12  8:26 UTC (permalink / raw)
  To: Liu Fred-a18596; +Cc: linuxppc-dev

Liu Fred-a18596 writes:

> I found there was a bug in arch/ppc/lib/div64.s.
> 
> The original div64_32() function would produce a zero divide
> exception in case of div64_32(0x100000000, 0xFFFFFFFF).

Thanks for pointing this out.

> Here is the bug-fixed new div64_32() function.

A patch would have been more helpful, so we could see immediately what
you have changed.  Also, Notes (or Outlook, or whatever mail program
you are using) has completely munged all the spacing of the code you
posted.

Anyway, I have some comments on your proposed fix:

>                addc.      r9,r4,r10 # rounding up (so the estimate cannot

I understand that the add that I had here can overflow in the case you
pointed out, so we need to preserve the carry.  But why did you use
"addc." rather than "addc" ?

>                mfxer   r12                    #
>                rlwinm. r12, r12, 3, 31, 31    # XER[CA]
>                cmpwi   r12, 1                 # carry ?
>                bne     5f
>                ori     r9, r9, 0x1            # carry !

I think just "addze r9,r9" would do instead of these 5 instructions,
wouldn't it?

Paul.
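
As a way to check the arithmetic, here is a rough C model of the whole routine with the carry added back in one step, which is what "addze r9,r9" would do. It is a sketch under assumptions, not the kernel implementation: the names div64_32_model, clz32 and rotl32 are hypothetical, the register mapping in the comments follows the posted assembly, and the multiply-and-subtract step uses plain 64-bit C arithmetic in place of the mullw/mulhwu/subfc/subfe sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Count leading zeros of a non-zero 32-bit value (models cntlzw). */
static unsigned clz32(uint32_t x)
{
    unsigned n = 0;

    assert(x != 0);
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Rotate left by n bits, 0 <= n <= 31 (models rotlw). */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/*
 * Hypothetical model: divide *n by base, store the 64-bit quotient
 * back in *n and return the 32-bit remainder.
 */
static uint32_t div64_32_model(uint64_t *n, uint32_t base)
{
    uint32_t hi = (uint32_t)(*n >> 32);    /* r5: dividend.hi */
    uint32_t lo = (uint32_t)*n;            /* r6: dividend.lo */
    uint32_t qhi = 0, qlo = 0;             /* r7, r8: quotient */

    assert(base != 0);
    if (hi >= base) {                      /* quotient.hi and hi %= base */
        qhi = hi / base;
        hi -= qhi * base;
    }
    while (hi != 0) {
        uint32_t est = hi;                 /* r11: quotient estimate */

        if ((hi & 0xC0000000u) == 0) {     /* andis. r0,r5,0xc000 */
            unsigned sh = clz32(hi);
            uint32_t mask = 0xFFFFFFFFu >> sh;
            uint32_t d = base + mask;      /* addc r9,r4,r10 */
            uint32_t carry = d < base;     /* XER[CA] */

            d = (d & ~mask) + carry;       /* andc, then addze */
            d = rotl32(d, sh);             /* divisor, rounded up */
            est = rotl32(hi | (lo & ~mask), sh) / d;
        }
        /* Subtract est * base from the 64-bit dividend. */
        uint64_t rem = (((uint64_t)hi << 32) | lo) - (uint64_t)est * base;
        hi = (uint32_t)(rem >> 32);
        lo = (uint32_t)rem;
        qlo += est;                        /* accumulate the estimate */
    }
    if (lo >= base) {                      /* remaining 32-bit division */
        uint32_t q = lo / base;
        qlo += q;
        lo -= q * base;
    }
    *n = ((uint64_t)qhi << 32) | qlo;
    return lo;
}
```

With this model the case from the report, 0x100000000 divided by 0xFFFFFFFF, gives quotient 1 and remainder 1, and the results can be compared against the compiler's own 64-bit `/` and `%` for other inputs.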

* RE: A bug in div64.s
@ 2005-09-12 10:05 Liu Fred-a18596
  0 siblings, 0 replies; 3+ messages in thread
From: Liu Fred-a18596 @ 2005-09-12 10:05 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev

Hello, Paul,

Thank you very much. Your code is really cool.

Best regards,
Fred
