* A bug in div64.s
@ 2005-09-12 7:21 Liu Fred-a18596
2005-09-12 8:26 ` Paul Mackerras
0 siblings, 1 reply; 3+ messages in thread
From: Liu Fred-a18596 @ 2005-09-12 7:21 UTC (permalink / raw)
To: linuxppc-dev
Hello, Greetings,
I found there was a bug in arch/ppc/lib/div64.s.
The original div64_32() function would produce a zero divide exception in case of div64_32(0x100000000, 0xFFFFFFFF).
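To see where the zero divisor comes from, the rounding step can be modelled in a few lines of Python. This is only a sketch: it assumes the original routine used a plain add at that point (Paul's reply later in the thread says as much), and the helper names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def cntlzw(x):
    """Count leading zeros of a 32-bit value, like the PowerPC cntlzw."""
    return 32 - x.bit_length()

def rotlw(x, n):
    """Rotate a 32-bit value left by n bits (0 <= n < 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def shifted_divisor_plain_add(dividend_hi, divisor):
    """Shifted divisor when the rounding add drops its carry (assumed original)."""
    r0 = cntlzw(dividend_hi)        # cntlzw r0,r5
    r10 = MASK32 >> r0              # li r10,-1 ; srw r10,r10,r0
    r9 = (divisor + r10) & MASK32   # plain add: the carry out of the top bit is lost
    r9 &= ~r10                      # andc r9,r9,r10
    return rotlw(r9, r0)            # rotlw r9,r9,r0

# Dividend 0x100000000 has high word 1; the divisor is 0xFFFFFFFF.
# The mask r10 is 1, so divisor + r10 wraps around to 0.
print(hex(shifted_divisor_plain_add(0x1, 0xFFFFFFFF)))  # prints 0x0
```

The value returned here is what the following divwu would use as its divisor, so this input ends up dividing by zero.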
Here is the bug-fixed new div64_32() function.
==============================================
/*
* Divide a 64-bit unsigned number by a 32-bit unsigned number.
* This routine assumes that the top 32 bits of the dividend are
* non-zero to start with.
* On entry, r3 points to the dividend, which gets overwritten with
* the 64-bit quotient, and r4 contains the divisor.
* On exit, r3 contains the remainder.
*
* Copyright (C) 2002 Paul Mackerras, IBM Corp.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version
* 2 of the License, or (at your option) any later version.
*/
#include <asm/ppc_asm.h>
#include <asm/processor.h>
_GLOBAL(__div64_32)
lwz r5,0(r3) # get the dividend into r5/r6
lwz r6,4(r3)
cmplw r5,r4
li r7,0
li r8,0
blt 1f
divwu r7,r5,r4 # if dividend.hi >= divisor,
mullw r0,r7,r4 # quotient.hi = dividend.hi / divisor
subf. r5,r0,r5 # dividend.hi %= divisor
beq 3f
1: mr r11,r5 # here dividend.hi != 0
andis. r0,r5,0xc000
bne 2f
cntlzw r0,r5 # we are shifting the dividend right
li r10,-1 # to make it < 2^32, and shifting
srw r10,r10,r0 # the divisor right the same amount,
addc. r9,r4,r10 # rounding up (so the estimate cannot
andc r11,r6,r10 # ever be too large, only too small)
andc r9,r9,r10
mfxer r12 #
rlwinm. r12, r12, 3, 31, 31 # XER[CA]
cmpwi r12, 1 # carry ?
bne 5f
ori r9, r9, 0x1 # carry !
5:
or r11,r5,r11
rotlw r9,r9,r0
rotlw r11,r11,r0
divwu r11,r11,r9 # then we divide the shifted quantities
2: mullw r10,r11,r4 # to get an estimate of the quotient,
mulhwu r9,r11,r4 # multiply the estimate by the divisor,
subfc r6,r10,r6 # take the product from the dividend,
add r8,r8,r11 # and add the estimate to the accumulated
subfe. r5,r9,r5 # quotient
bne 1b
3: cmplw r6,r4
blt 4f
divwu r0,r6,r4 # perform the remaining 32-bit division
mullw r10,r0,r4 # and get the remainder
add r8,r8,r0
subf r6,r10,r6
4: stw r7,0(r3) # return the quotient in *r3
stw r8,4(r3)
mr r3,r6 # return the remainder in r3
blr
=================================================
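For readers who want to experiment with the algorithm without PowerPC hardware, the routine can be modelled in Python and checked against divmod. This is a sketch, not the kernel code: register names follow the listing above, the carry is folded in by adding it to r9 (one possible way to handle it), and unlike the assembly the model also accepts a dividend whose top 32 bits are zero.

```python
MASK32 = 0xFFFFFFFF

def rotlw(x, n):
    """Rotate a 32-bit value left by n bits (0 <= n < 32)."""
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK32

def div64_32(dividend, divisor):
    """Return (quotient, remainder) for a 64-bit dividend and a 32-bit divisor."""
    r4 = divisor
    r5, r6 = dividend >> 32, dividend & MASK32   # dividend.hi, dividend.lo
    r7 = r8 = 0                                  # quotient.hi, quotient.lo
    if r5 >= r4:
        r7 = r5 // r4                            # quotient.hi = dividend.hi / divisor
        r5 -= r7 * r4                            # dividend.hi %= divisor
    while r5 != 0:
        r11 = r5                                 # estimate if the top two bits are set
        if not (r5 & 0xC0000000):
            r0 = 32 - r5.bit_length()            # cntlzw r0,r5
            r10 = MASK32 >> r0
            total = r4 + r10                     # round the divisor up ...
            carry = total >> 32                  # ... and remember the carry
            r9 = (total & MASK32 & ~r10) + carry
            r11 = rotlw(r5 | (r6 & ~r10), r0)    # dividend shifted right
            r9 = rotlw(r9, r0)                   # divisor shifted right, rounded up
            r11 //= r9                           # estimate, never too large
        rest = ((r5 << 32) | r6) - r11 * r4      # subtract estimate * divisor
        r5, r6 = rest >> 32, rest & MASK32
        r8 = (r8 + r11) & MASK32                 # accumulate the quotient
    if r6 >= r4:                                 # remaining 32-bit division
        q = r6 // r4
        r8 = (r8 + q) & MASK32
        r6 -= q * r4
    return (r7 << 32) | r8, r6

print(div64_32(0x100000000, 0xFFFFFFFF))  # prints (1, 1)
```

Because the shifted divisor is rounded up and the shifted dividend is rounded down, each estimate is at most the true quotient, so the subtraction never goes negative and the loop only has to add further estimates.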
BR,
Fred
* Re: A bug in div64.s
2005-09-12 7:21 A bug in div64.s Liu Fred-a18596
@ 2005-09-12 8:26 ` Paul Mackerras
0 siblings, 0 replies; 3+ messages in thread
From: Paul Mackerras @ 2005-09-12 8:26 UTC (permalink / raw)
To: Liu Fred-a18596; +Cc: linuxppc-dev
Liu Fred-a18596 writes:
> I found there was a bug in arch/ppc/lib/div64.s.
>
> The original div64_32() function would produce a zero divide
> exception in case of div64_32(0x100000000, 0xFFFFFFFF).
Thanks for pointing this out.
> Here is the bug-fixed new div64_32() function.
A patch would have been more helpful, so we could see immediately what
you have changed. Also, Notes (or Outlook, or whatever mail program
you are using) has completely munged all the spacing of the code you
posted.
Anyway, I have some comments on your proposed fix:
> addc. r9,r4,r10 # rounding up (so the estimate cannot
I understand that the add that I had here can overflow in the case you
pointed out, so we need to preserve the carry. But why did you use
"addc." rather than "addc" ?
> mfxer r12 #
> rlwinm. r12, r12, 3, 31, 31 # XER[CA]
> cmpwi r12, 1 # carry ?
> bne 5f
> ori r9, r9, 0x1 # carry !
I think just "addze r9,r9" would do instead of these 5 instructions,
wouldn't it?
Paul.
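The equivalence Paul suggests can be checked with a small Python model. This is a sketch under the assumption that r0 (the leading-zero count) lies between 2 and 31 on this path, so the mask r10 always has bit 0 set and r9 has bit 0 clear after the andc; the function names are invented for illustration.

```python
MASK32 = 0xFFFFFFFF

def carry_fixup_posted(r9, ca):
    """mfxer / rlwinm. / cmpwi / bne / ori: set bit 0 of r9 when XER[CA] is 1."""
    return (r9 | 1) if ca == 1 else r9

def carry_fixup_addze(r9, ca):
    """addze r9,r9: r9 = r9 + XER[CA], truncated to 32 bits."""
    return (r9 + ca) & MASK32

def rounded_divisor(r4, r0, fixup):
    """Round the divisor r4 up to a multiple of 2^(32 - r0), before the rotate."""
    r10 = MASK32 >> r0             # li r10,-1 ; srw r10,r10,r0
    total = r4 + r10               # addc r9,r4,r10
    ca = total >> 32               # carry out, as held in XER[CA]
    r9 = total & MASK32 & ~r10     # andc r9,r9,r10 (does not touch CA)
    return fixup(r9, ca)

# Compare both fix-ups for every shift count that can occur on this path.
for r0 in range(2, 32):
    for r4 in (1, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFE, 0xFFFFFFFF):
        assert rounded_divisor(r4, r0, carry_fixup_posted) == \
               rounded_divisor(r4, r0, carry_fixup_addze)
```

Since bit 0 of r9 is always clear at that point, OR-ing in 1 and adding the carry give the same result, which is why the single addze can stand in for the five instructions.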
* RE: A bug in div64.s
@ 2005-09-12 10:05 Liu Fred-a18596
0 siblings, 0 replies; 3+ messages in thread
From: Liu Fred-a18596 @ 2005-09-12 10:05 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linuxppc-dev
Hello, Paul,
Thank you very much. Your code is really cool.
Best regards,
Fred