* Re: Current egcs, binutils and kernel (fwd)
@ 1999-04-20 11:39 ` Geert Uytterhoeven
1999-04-20 17:05 ` David Edelsohn
` (2 more replies)
0 siblings, 3 replies; 9+ messages in thread
From: Geert Uytterhoeven @ 1999-04-20 11:39 UTC (permalink / raw)
To: Linux/PPC Development
---------- Forwarded message ----------
Date: Tue, 20 Apr 1999 13:15:41 +0200
From: Reinhard Nissl <rnissl@gmx.de>
To: Geert Uytterhoeven <Geert.Uytterhoeven@cs.kuleuven.ac.be>
Cc: "linux-apus@sunsite.auc.dk" <linux-apus@sunsite.auc.dk>
Subject: Re: Current egcs, binutils and kernel
Hi,
Geert Uytterhoeven wrote:
> On Wed, 14 Apr 1999, Reinhard Nissl wrote:
> > has anyone had success in compiling (egcs-1.1.2 and binutils-2.9.1.0.23)
> > the current APUS kernel with support for network block devices (nbd.c)?
> >
> > I get an undefined reference to __lshrdi3 from nbd_ioctl(), which looks
> > like a compiler / binutils bug.
>
> Hence a __lshrdi3() routine needs to be added to arch/ppc/kernel/misc.S.
I had a look into misc.S and found similar routines (__ashrdi3) there. Then I
searched the egcs-1.1.2 sources for files where such functions are
referenced. I found definitions in egcs-1.1.2/gcc/config/rs6000/rs6000.md, but
they are not native ppc assembler instructions. As I'm not very familiar with
*.md files and ppc assembly code, I'm currently not able to define the missing
function in misc.S myself.
I checked the kernel source diffs from version 2.2.4 to 2.2.6 for lshrdi3 and
only found a match for arch=sparc. So, is there anybody who can add the missing
function to misc.S for arch=ppc?
> Greetings,
>
> Geert
Bye.
--
Dipl.-Inform. (FH) Reinhard Nissl
mailto:rnissl@gmx.de
[[ This message was sent via the linuxppc-dev mailing list. Replies are ]]
[[ not forced back to the list, so be sure to Cc linuxppc-dev if your ]]
[[ reply is of general interest. Please check http://lists.linuxppc.org/ ]]
[[ and http://www.linuxppc.org/ for useful information before posting. ]]
* Re: Current egcs, binutils and kernel (fwd)
[not found] <Pine.LNX.4.10.9904201339320.26859-100000@mercator.cs.kuleuven.ac.be>
@ 1999-04-20 14:49 ` Franz Sirl
0 siblings, 0 replies; 9+ messages in thread
From: Franz Sirl @ 1999-04-20 14:49 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Linux/PPC Development
At 13:39 20.04.99 , Geert Uytterhoeven wrote:
>---------- Forwarded message ----------
>Date: Tue, 20 Apr 1999 13:15:41 +0200
>From: Reinhard Nissl <rnissl@gmx.de>
>To: Geert Uytterhoeven <Geert.Uytterhoeven@cs.kuleuven.ac.be>
>Cc: "linux-apus@sunsite.auc.dk" <linux-apus@sunsite.auc.dk>
>Subject: Re: Current egcs, binutils and kernel
>
>Hi,
>
>Geert Uytterhoeven wrote:
>
>> On Wed, 14 Apr 1999, Reinhard Nissl wrote:
>> > has anyone had success in compiling (egcs-1.1.2 and binutils-2.9.1.0.23)
>> > the current APUS kernel with support for network block devices (nbd.c)?
>> >
>> > I get an undefined reference to __lshrdi3 from nbd_ioctl(), which looks
>> > like a compiler / binutils bug.
>>
>> Hence a __lshrdi3() routine needs to be added to arch/ppc/kernel/misc.S.
>
>I had a look into misc.S and found similar routines (__ashrdi3) there. Then I
>searched in the egcs-1.1.2 sources for files, where such functions are
>referenced. I found definitions in egcs-1.1.2/gcc/config/rs6000/rs6000.md but
>they are not native ppc assembler instructions. As I'm not that much used to
>*.md files and ppc assembly code, I'm currently not able to define the missing
>function in misc.S myself.
>
>I checked the kernel source diffs from version 2.2.4 to 2.2.6 for lshrdi3 and
>had only success for arch=sparc. So, is there anybody who can add the missing
>function to misc.S for arch=ppc?
objdump -D libgcc.a shows:
_lshrdi3.o: file format elf32-powerpc
Disassembly of section .text:
00000000 <__lshrdi3>:
0: 7c a5 2b 79 mr. r5,r5
4: 4d 82 00 20 beqlr
8: 20 05 00 20 subfic r0,r5,32
c: 2c 00 00 00 cmpwi r0,0
10: 41 81 00 14 bgt 24 <__lshrdi3+0x24>
14: 7c 00 00 d0 neg r0,r0
18: 39 60 00 00 li r11,0
1c: 7c 6c 04 30 srw r12,r3,r0
20: 48 00 00 14 b 34 <__lshrdi3+0x34>
24: 7c 89 2c 30 srw r9,r4,r5
28: 7c 60 00 30 slw r0,r3,r0
2c: 7c 6b 2c 30 srw r11,r3,r5
30: 7d 2c 03 78 or r12,r9,r0
34: 7d 63 5b 78 mr r3,r11
38: 7d 84 63 78 mr r4,r12
3c: 4e 80 00 20 blr
__lshrdi3 is a logical right shift on a 64-bit quantity: 0 is shifted in
from the left, whereas the arithmetic right shift __ashrdi3 shifts in the
sign bit. You could also use the example code in Appendix E of the PPC601
manual.
Franz.
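To make the distinction between the two shifts concrete, here is a minimal
C sketch. The function names are made up for this illustration and do not
come from libgcc or the kernel; the arithmetic variant is written on
unsigned values so that it does not depend on how a given compiler treats
>> on negative numbers.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift: zeros enter from the left. Valid for 0 <= n <= 63. */
uint64_t shr_logical(uint64_t v, unsigned n)
{
    return v >> n;
}

/* Arithmetic right shift: copies of the sign bit (bit 63) enter from the
 * left. Valid for 0 <= n <= 63. */
uint64_t shr_arithmetic(uint64_t v, unsigned n)
{
    uint64_t shifted = v >> n;
    uint64_t fill = ~(~UINT64_C(0) >> n);   /* top n bits set; 0 when n == 0 */
    return (v >> 63) ? (shifted | fill) : shifted;
}

/* Self-check contrasting the two on a value with the sign bit set. */
void check_shift_kinds(void)
{
    uint64_t v = UINT64_C(0x8000000000000000);
    assert(shr_logical(v, 4)    == UINT64_C(0x0800000000000000));
    assert(shr_arithmetic(v, 4) == UINT64_C(0xF800000000000000));
    assert(shr_logical(v, 0) == v && shr_arithmetic(v, 0) == v);
}
```

For values whose top bit is clear, the two functions return the same
result; they only differ once the sign bit is set.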
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 11:39 ` Current egcs, binutils and kernel (fwd) Geert Uytterhoeven
@ 1999-04-20 17:05 ` David Edelsohn
1999-04-20 20:43 ` Gabriel Paubert
1999-04-20 17:45 ` Gabriel Paubert
1999-04-20 18:29 ` Gary Thomas
2 siblings, 1 reply; 9+ messages in thread
From: David Edelsohn @ 1999-04-20 17:05 UTC (permalink / raw)
To: Linux/PPC Development
The original POWER architecture with the MQ register allowed a
short sequence to provide this 64-bit functionality. The PowerPC
instruction set does not provide the necessary building blocks, so GCC
falls back to the C implementation in libgcc2.c producing the assembly
code shown in the later posting.
David
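For readers who want to see what such a C fallback can look like, here is a
rough sketch. It is not the libgcc2.c source: the name lshr64_branching is
hypothetical, and the control flow is only modelled on the three cases
visible in the __lshrdi3 disassembly posted in this thread (count of zero,
count of 32 or more, count below 32).

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit logical right shift built from 32-bit operations only.
 * Valid for 0 <= b <= 63. */
uint64_t lshr64_branching(uint64_t u, int b)
{
    uint32_t hi = (uint32_t)(u >> 32);
    uint32_t lo = (uint32_t)u;
    int bm = 32 - b;

    if (b == 0)
        return u;                   /* nothing to do (the early beqlr) */

    if (bm <= 0) {
        /* b is 32..63: the low word comes entirely from the high word */
        lo = hi >> -bm;
        hi = 0;
    } else {
        /* b is 1..31: bits leaving the high word are carried into the low */
        uint32_t carries = hi << bm;
        lo = (lo >> b) | carries;
        hi = hi >> b;
    }
    return ((uint64_t)hi << 32) | lo;
}

/* Compare against the native 64-bit shift for every legal count. */
void check_lshr64_branching(void)
{
    const uint64_t v = UINT64_C(0xFEDCBA9876543210);
    for (int b = 0; b < 64; b++)
        assert(lshr64_branching(v, b) == v >> b);
}
```

Every 32-bit shift in the sketch uses an amount between 0 and 31, which is
why the count has to be split into cases with branches.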
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 11:39 ` Current egcs, binutils and kernel (fwd) Geert Uytterhoeven
1999-04-20 17:05 ` David Edelsohn
@ 1999-04-20 17:45 ` Gabriel Paubert
1999-04-20 18:29 ` Gary Thomas
2 siblings, 0 replies; 9+ messages in thread
From: Gabriel Paubert @ 1999-04-20 17:45 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Linux/PPC Development
On Tue, 20 Apr 1999, Geert Uytterhoeven wrote:
>
> ---------- Forwarded message ----------
> Date: Tue, 20 Apr 1999 13:15:41 +0200
> From: Reinhard Nissl <rnissl@gmx.de>
> To: Geert Uytterhoeven <Geert.Uytterhoeven@cs.kuleuven.ac.be>
> Cc: "linux-apus@sunsite.auc.dk" <linux-apus@sunsite.auc.dk>
> Subject: Re: Current egcs, binutils and kernel
>
> Hi,
>
> Geert Uytterhoeven wrote:
>
> > On Wed, 14 Apr 1999, Reinhard Nissl wrote:
> > > has anyone had success in compiling (egcs-1.1.2 and binutils-2.9.1.0.23)
> > > the current APUS kernel with support for network block devices (nbd.c)?
> > >
> > > I get an undefined reference to __lshrdi3 from nbd_ioctl(), which looks
> > > like a compiler / binutils bug.
> >
> > Hence a __lshrdi3() routine needs to be added to arch/ppc/kernel/misc.S.
>
> I had a look into misc.S and found similar routines (__ashrdi3) there. Then I
> searched in the egcs-1.1.2 sources for files, where such functions are
> referenced. I found definitions in egcs-1.1.2/gcc/config/rs6000/rs6000.md but
> they are not native ppc assembler instructions. As I'm not that much used to
> *.md files and ppc assembly code, I'm currently not able to define the missing
> function in misc.S myself.
>
> I checked the kernel source diffs from version 2.2.4 to 2.2.6 for lshrdi3 and
> had only success for arch=sparc. So, is there anybody who can add the missing
> function to misc.S for arch=ppc?
I'd suggest the following patch. Note that the current versions of the
long long shifts do not work when the shift count is > 32. All good PPC
manuals have an appendix on multiple-precision shifts, and I've followed
it, apart from reordering the instructions for better superscalar
issue/execution and completion (all the code should flow perfectly
through 2 pipes). There is one exception: the arithmetic right shift is
one instruction longer but is branchless (a register is conditionally
cleared by a shift whose amount is computed by an rlwinm instruction).
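The branchless sequences can be checked on any machine with a small C model.
The sketch below is only an illustration: the helpers slw, srw and sraw are
assumed stand-ins for the PowerPC instructions (the low 6 bits of the count
are used, and amounts of 32..63 give 0, or all sign bits for sraw), and the
model_* names are invented here. Each statement mirrors one instruction of
the __lshrdi3 and __ashrdi3 routines in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* slw/srw: low 6 bits of the count are used; amounts 32..63 give 0. */
static uint32_t slw(uint32_t x, uint32_t n) { n &= 63; return n > 31 ? 0 : x << n; }
static uint32_t srw(uint32_t x, uint32_t n) { n &= 63; return n > 31 ? 0 : x >> n; }

/* sraw: amounts 32..63 give 32 copies of the sign bit. */
static uint32_t sraw(uint32_t x, uint32_t n)
{
    uint32_t sign = 0u - (x >> 31);          /* 0 or 0xFFFFFFFF */
    n &= 63;
    if (n > 31) return sign;
    return n == 0 ? x : (x >> n) | (sign << (32 - n));
}

uint64_t model_lshrdi3(uint64_t v, uint32_t count)
{
    uint32_t msw = (uint32_t)(v >> 32), lsw = (uint32_t)v;
    uint32_t r6 = 32 - count;    /* subfic r6,r5,32 */
    lsw = srw(lsw, count);       /* LSW = count > 31 ? 0 : LSW >> count */
    uint32_t r7 = count + 32;    /* addi   r7,r5,32 */
    r6 = slw(msw, r6);           /* t1 = count > 31 ? 0 : MSW << (32-count) */
    r7 = srw(msw, r7);           /* t2 = count < 32 ? 0 : MSW >> (count-32) */
    lsw |= r6;                   /* LSW |= t1 */
    msw = srw(msw, count);       /* MSW = MSW >> count */
    lsw |= r7;                   /* LSW |= t2 */
    return ((uint64_t)msw << 32) | lsw;
}

uint64_t model_ashrdi3(uint64_t v, uint32_t count)
{
    uint32_t msw = (uint32_t)(v >> 32), lsw = (uint32_t)v;
    uint32_t r6 = 32 - count;    /* subfic r6,r5,32 */
    lsw = srw(lsw, count);       /* LSW = count > 31 ? 0 : LSW >> count */
    uint32_t r7 = count + 32;    /* addi   r7,r5,32 */
    r6 = slw(msw, r6);           /* t1 = count > 31 ? 0 : MSW << (32-count) */
    uint32_t r8 = r7 & 32;       /* rlwinm: t3 = count < 32 ? 32 : 0 */
    r7 = sraw(msw, r7);          /* t2 = MSW >> (count-32), or all sign bits */
    lsw |= r6;                   /* LSW |= t1 */
    r7 = slw(r7, r8);            /* t2 = count < 32 ? 0 : t2 */
    msw = sraw(msw, count);      /* MSW = MSW >> count */
    lsw |= r7;                   /* LSW |= t2 */
    return ((uint64_t)msw << 32) | lsw;
}

/* Reference arithmetic shift that does not rely on signed >>. */
static uint64_t ref_ashr(uint64_t v, unsigned n)
{
    uint64_t fill = ~(~UINT64_C(0) >> n);    /* top n bits set; 0 when n == 0 */
    return (v >> 63) ? ((v >> n) | fill) : (v >> n);
}

/* Compare both models with the reference for counts 0..63. */
void check_models(void)
{
    const uint64_t samples[] = {
        0, 1, UINT64_C(0x8000000000000000),
        UINT64_C(0x0123456789ABCDEF), UINT64_C(0xFEDCBA9876543210)
    };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        for (unsigned n = 0; n < 64; n++) {
            assert(model_lshrdi3(samples[i], n) == samples[i] >> n);
            assert(model_ashrdi3(samples[i], n) == ref_ashr(samples[i], n));
        }
}
```

The model relies on the same property as the assembly: a 32-bit shift by
32..63 yields zero, so the terms that do not apply to a given count simply
vanish and no comparison on the count is needed.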
I've also fixed a few other oddities in the code:
- atomic_dec_and_test uses cntlzw the way God intended to evaluate
`(x==0) ? 1 : 0' without any branch
- the abs function is also branchless now (it would nevertheless be better
to use the __builtin_abs function of GCC)
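Both tricks are easy to try out in C. This is a sketch under stated
assumptions, not kernel code: count_leading_zeros32 is a portable (and
deliberately simple, looping) stand-in for the cntlzw instruction, and the
other two names are invented for the example. Only the arithmetic on top of
it is meant to mirror the two-instruction and three-instruction sequences
in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for cntlzw: number of leading zero bits, 32 when x == 0. */
unsigned count_leading_zeros32(uint32_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* cntlzw + srwi 5: the count is 0..32 and only 32 has bit 5 set,
 * so shifting right by 5 yields 1 for x == 0 and 0 otherwise. */
unsigned is_zero(uint32_t x)
{
    return count_leading_zeros32(x) >> 5;
}

/* srawi/xor/sub: mask is 0 for non-negative x and all ones for
 * negative x; (x ^ mask) - mask then negates only the negative case.
 * The value is passed as its 32-bit two's complement bit pattern. */
uint32_t abs_branchless(uint32_t x)
{
    uint32_t mask = 0u - (x >> 31);   /* srawi r4,r3,31 */
    return (x ^ mask) - mask;         /* xor, then sub  */
}

void check_tricks(void)
{
    assert(is_zero(0) == 1);
    assert(is_zero(1) == 0);
    assert(is_zero(0x80000000u) == 0);
    assert(abs_branchless((uint32_t)-5) == 5);
    assert(abs_branchless(7) == 7);
    assert(abs_branchless(0) == 0);
}
```

As with any two's complement abs, the most negative value (0x80000000) maps
to itself.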
Greetings,
Gabriel.
--- linux-2.2.6/arch/ppc/kernel/misc.S Thu Mar 11 05:30:32 1999
+++ linux/arch/ppc/kernel/misc.S Tue Apr 20 20:14:03 1999
@@ -228,10 +228,8 @@
subi r5,r5,1 /* Perform 'add' operation */
stwcx. r5,0,r3 /* Update with new value */
bne- 10b /* Retry if "reservation" (i.e. lock) lost */
- cmpi 0,r5,0 /* Return 'true' IFF 0 */
- li r3,1
- beqlr
- li r3,0
+ cntlzw r3,r5 /* Return 'true' IFF 0 */
+ srwi r3,r3,5 /* But do it the clever way */
blr
_GLOBAL(atomic_clear_mask)
10: lwarx r5,0,r4
@@ -355,38 +353,59 @@
blr
/*
- * Extended precision shifts
+ * Extended precision shifts.
+ *
+ * Updated to be valid for shift counts from 0 to 63 inclusive.
+ * -- Gabriel
*
* R3/R4 has 64 bit value
* R5 has shift count
* result in R3/R4
*
- * ashrdi3: XXXYYY/ZZZAAA -> SSSXXX/YYYZZZ
- * ashldi3: XXXYYY/ZZZAAA -> YYYZZZ/AAA000
+ * ashrdi3: arithmetic right shift (sign propagation)
+ * lshrdi3: logical right shift
+ * ashldi3: left shift
*/
_GLOBAL(__ashrdi3)
- li r6,32
- sub r6,r6,r5
- slw r7,r3,r6 /* isolate YYY */
- srw r4,r4,r5 /* isolate ZZZ */
- or r4,r4,r7 /* YYYZZZ */
- sraw r3,r3,r5 /* SSSXXX */
+ subfic r6,r5,32
+ srw r4,r4,r5 # LSW = count > 31 ? 0 : LSW >> count
+ addi r7,r5,32 # could be xori, or addi with -32
+ slw r6,r3,r6 # t1 = count > 31 ? 0 : MSW << (32-count)
+ rlwinm r8,r7,0,32 # t3 = (count < 32) ? 32 : 0
+ sraw r7,r3,r7 # t2 = MSW >> (count-32)
+ or r4,r4,r6 # LSW |= t1
+ slw r7,r7,r8 # t2 = (count < 32) ? 0 : t2
+ sraw r3,r3,r5 # MSW = MSW >> count
+ or r4,r4,r7 # LSW |= t2
blr
-
+
_GLOBAL(__ashldi3)
- li r6,32
- sub r6,r6,r5
- srw r7,r4,r6 /* isolate ZZZ */
- slw r4,r4,r5 /* AAA000 */
- slw r3,r3,r5 /* YYY--- */
- or r3,r3,r7 /* YYYZZZ */
+ subfic r6,r5,32
+ slw r3,r3,r5 # MSW = count > 31 ? 0 : MSW << count
+ addi r7,r5,32 # could be xori, or addi with -32
+ srw r6,r4,r6 # t1 = count > 31 ? 0 : LSW >> (32-count)
+ slw r7,r4,r7 # t2 = count < 32 ? 0 : LSW << (count-32)
+ or r3,r3,r6 # MSW |= t1
+ slw r4,r4,r5 # LSW = LSW << count
+ or r3,r3,r7 # MSW |= t2
+ blr
+
+_GLOBAL(__lshrdi3)
+ subfic r6,r5,32
+ srw r4,r4,r5 # LSW = count > 31 ? 0 : LSW >> count
+ addi r7,r5,32 # could be xori, or addi with -32
+ slw r6,r3,r6 # t1 = count > 31 ? 0 : MSW << (32-count)
+ srw r7,r3,r7 # t2 = count < 32 ? 0 : MSW >> (count-32)
+ or r4,r4,r6 # LSW |= t1
+ srw r3,r3,r5 # MSW = MSW >> count
+ or r4,r4,r7 # LSW |= t2
blr
_GLOBAL(abs)
- cmpi 0,r3,0
- bge 10f
- neg r3,r3
-10: blr
+ srawi r4,r3,31
+ xor r3,r3,r4
+ sub r3,r3,r4
+ blr
_GLOBAL(_get_SP)
mr r3,r1 /* Close enough */
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 11:39 ` Current egcs, binutils and kernel (fwd) Geert Uytterhoeven
1999-04-20 17:05 ` David Edelsohn
1999-04-20 17:45 ` Gabriel Paubert
@ 1999-04-20 18:29 ` Gary Thomas
2 siblings, 0 replies; 9+ messages in thread
From: Gary Thomas @ 1999-04-20 18:29 UTC (permalink / raw)
To: Geert Uytterhoeven; +Cc: Linux/PPC Development
I looked at the code in "misc.S" and it appears to me [by cursory glance]
that it already is "__lshrdi3" - i.e. the code that is there is doing a
logical shift and not an arithmetic one!
It seems to me that you could just add "__lshrdi3" as an alternate entry
point.
I need to investigate the existing code and see if maybe my impression
is just "day's end..."
On 20-Apr-99 Geert Uytterhoeven wrote:
>
> ---------- Forwarded message ----------
> Date: Tue, 20 Apr 1999 13:15:41 +0200
> From: Reinhard Nissl <rnissl@gmx.de>
> To: Geert Uytterhoeven <Geert.Uytterhoeven@cs.kuleuven.ac.be>
> Cc: "linux-apus@sunsite.auc.dk" <linux-apus@sunsite.auc.dk>
> Subject: Re: Current egcs, binutils and kernel
>
> Hi,
>
> Geert Uytterhoeven wrote:
>
>> On Wed, 14 Apr 1999, Reinhard Nissl wrote:
>> > has anyone had success in compiling (egcs-1.1.2 and binutils-2.9.1.0.23)
>> > the current APUS kernel with support for network block devices (nbd.c)?
>> >
>> > I get an undefined reference to __lshrdi3 from nbd_ioctl(), which looks
>> > like a compiler / binutils bug.
>>
>> Hence a __lshrdi3() routine needs to be added to arch/ppc/kernel/misc.S.
>
> I had a look into misc.S and found similar routines (__ashrdi3) there. Then I
> searched in the egcs-1.1.2 sources for files, where such functions are
> referenced. I found definitions in egcs-1.1.2/gcc/config/rs6000/rs6000.md but
> they are not native ppc assembler instructions. As I'm not that much used to
> *.md files and ppc assembly code, I'm currently not able to define the missing
> function in misc.S myself.
>
> I checked the kernel source diffs from version 2.2.4 to 2.2.6 for lshrdi3 and
> had only success for arch=sparc. So, is there anybody who can add the missing
> function to misc.S for arch=ppc?
>
>> Greetings,
>>
>> Geert
>
> Bye.
> --
> Dipl.-Inform. (FH) Reinhard Nissl
> mailto:rnissl@gmx.de
>
>
>
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 17:05 ` David Edelsohn
@ 1999-04-20 20:43 ` Gabriel Paubert
1999-04-20 21:14 ` David Edelsohn
1999-04-24 12:35 ` Paul Mackerras
0 siblings, 2 replies; 9+ messages in thread
From: Gabriel Paubert @ 1999-04-20 20:43 UTC (permalink / raw)
To: David Edelsohn; +Cc: Linux/PPC Development
On Tue, 20 Apr 1999, David Edelsohn wrote:
>
> The original POWER architecture with the MQ register allowed a
> short sequence to provide this 64-bit functionality. The PowerPC
> instruction set does not provide the necessary building blocks, so GCC
> falls back to the C implementation in libgcc2.c producing the assembly
> code shown in the later posting.
Yes, but the kernel is not linked with libgcc2. Anyway I've posted a patch
to correctly implement this. And then I've noticed just after that people
do more work than necessary in the cvt_df and cvt_fd routines: floating
point loads and stores never affect or depend on the FPSCR contents, so
there is no point at all in manipulating it (except adding bloat for the
sake of adding bloat).
Gabriel.
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 20:43 ` Gabriel Paubert
@ 1999-04-20 21:14 ` David Edelsohn
1999-04-20 21:26 ` Gabriel Paubert
1999-04-24 12:35 ` Paul Mackerras
1 sibling, 1 reply; 9+ messages in thread
From: David Edelsohn @ 1999-04-20 21:14 UTC (permalink / raw)
To: Gabriel Paubert; +Cc: Linux/PPC Development
>>>>> Gabriel Paubert writes:
David> The original POWER architecture with the MQ register allowed a
David> short sequence to provide this 64-bit functionality. The PowerPC
David> instruction set does not provide the necessary building blocks, so GCC
David> falls back to the C implementation in libgcc2.c producing the assembly
David> code shown in the later posting.
Gabriel> Yes, but the kernel is not linked with libgcc2. Anyway I've posted a patch
Gabriel> to correctly implement this. And then I've noticed just after that people
Gabriel> do more work than necessary in the cvt_df and cvt_fd routines: floating
Gabriel> point loads and stores never affect or depend on the FPSCR contents, so
Gabriel> there is no point at all in manipulating it (except adding bloat for the
Gabriel> sake of adding bloat).
The purpose of my earlier message was to explain what was
occurring in rs6000.md and why GCC was calling the function instead of
inlining the code. I thought that some people might actually want to
understand *WHY* the function was required and why GCC was behaving as
designed.
David
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 21:14 ` David Edelsohn
@ 1999-04-20 21:26 ` Gabriel Paubert
0 siblings, 0 replies; 9+ messages in thread
From: Gabriel Paubert @ 1999-04-20 21:26 UTC (permalink / raw)
To: David Edelsohn; +Cc: Linux/PPC Development
On Tue, 20 Apr 1999, David Edelsohn wrote:
> The purpose of my earlier message was to explain what was
> occurring in rs6000.md and why GCC was calling the function instead of
> inlining the code. I thought that some people might actually want to
> understand *WHY* the function was required and why GCC was behaving as
> designed.
Ok, calm down. But I still don't know why GCC does not expand this inline.
It's 8 or 10 instructions, without branches, so it's not much bigger than a
call, for which you have to set up 3 parameter registers and which may
clobber many more registers according to the ABI conventions.
Furthermore, if only one of the 2 result registers is needed, the sequence
can be simplified quite significantly (I'm not sure how easy it is in
current GCC to split a variable held in 2 registers so that the dead code
can be removed).
Maybe it's time to add the alternatives in rs6000.md, enabled with a
-minline-longlongshifts option. But I don't yet know enough of GCC to
implement it.
Gabriel.
* Re: Current egcs, binutils and kernel (fwd)
1999-04-20 20:43 ` Gabriel Paubert
1999-04-20 21:14 ` David Edelsohn
@ 1999-04-24 12:35 ` Paul Mackerras
1 sibling, 0 replies; 9+ messages in thread
From: Paul Mackerras @ 1999-04-24 12:35 UTC (permalink / raw)
To: paubert; +Cc: dje, linuxppc-dev
Gabriel Paubert <paubert@iram.es> wrote:
> to correctly implement this. And then I've noticed just after that people
> do more work than necessary in the cvt_df and cvt_fd routines: floating
> point loads and stores never affect or depend on the FPSCR contents, so
Arrghl, you're right. I had thought you could get overflows or other
errors being signalled, that's why I had the fpscr load/save in there,
but it appears from the manual that I was wrong.
Paul.
Thread overview: 9+ messages
[not found] <Geert.Uytterhoeven@cs.kuleuven.ac.be>
1999-04-20 11:39 ` Current egcs, binutils and kernel (fwd) Geert Uytterhoeven
1999-04-20 17:05 ` David Edelsohn
1999-04-20 20:43 ` Gabriel Paubert
1999-04-20 21:14 ` David Edelsohn
1999-04-20 21:26 ` Gabriel Paubert
1999-04-24 12:35 ` Paul Mackerras
1999-04-20 17:45 ` Gabriel Paubert
1999-04-20 18:29 ` Gary Thomas
[not found] <Pine.LNX.4.10.9904201339320.26859-100000@mercator.cs.kuleuven.ac.be>
1999-04-20 14:49 ` Franz Sirl