* [parisc-linux] Compiler switches
From: Matthew Wilcox @ 2003-02-02  3:59 UTC
To: parisc-linux

Just wondering how many of the compiler switches we really need these days.
Here's what we currently do:

cflags-y        := -D__linux__ -pipe -fno-strength-reduce

# These should be on for older toolchains or SOM toolchains that don't
# enable them by default.
cflags-y        += -mno-space-regs -mfast-indirect-calls

# No fixed-point multiply
cflags-y        += -mdisable-fpregs

# Without this, "ld -r" results in .text sections that are too big
# (> 0x40000) for branches to reach stubs.
cflags-y        += -ffunction-sections

-D__linux__ looks like it can go away.

-pipe I'm agnostic on.  Someone want to benchmark builds both with and
without it?

-fno-strength-reduce has been there since before we moved to ELF -- over 3
years.  Any bug this was working around has hopefully been long-squashed.
I think we should eliminate this and submit PRs if it finds new holes.

-mno-space-regs & -mfast-indirect-calls can also go away, I think.
I can't imagine that we ever didn't have them as default on a gcc
3.0-based compiler.

Do we still need -ffunction-sections?  I'm inclined to leave it anyway
to enable compilation with older toolchains.

-- 
"It's not Hollywood.  War is real, war is primarily not about defeat or
victory, it is about death.  I've seen thousands and thousands of dead bodies.
Do you think I want to have an academic debate on this subject?" -- Robert Fisk

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-02  4:56 UTC
To: Matthew Wilcox
Cc: parisc-linux

> -D__linux__ looks like it can go away.

Agreed.  This is defined for sure in 3.1/3.2 and later.

> -fno-strength-reduce has been there since before we moved to ELF -- over 3
> years.  Any bug this was working around has hopefully been long-squashed.
> I think we should eliminate this and submit PRs if it finds new holes.

I always wondered why this option was used.

> -mno-space-regs & -mfast-indirect-calls can also go away, I think.
> I can't imagine that we ever didn't have them as default on a gcc
> 3.0-based compiler.

These are still not the default but possibly they should be.

Actually, I can see that we can save a couple of instructions when
generating long indirect calls to a symbol reference when no space
registers is defined.

> Do we still need -ffunction-sections?  I'm inclined to leave it anyway
> to enable compilation with older toolchains.

Cross my fingers, but I believe that the distance problem for calls
is fixed in 3.2.2.  However, if there really are objects with .text larger
than 240000 bytes, then you should probably still define -ffunction-sections.

When the total code bytes exceeds the above limit (PA 1.X), gcc switches
to long indirect calls.  These are horribly inefficient.  There are better
sequences but we need some new support in gas and ld to handle the
relocations and generate appropriate stubs.  For example, the following
non-pic sequence works under hpux with the HP assembler, but not linux
or hpux with gas:

        ldil L'dest,%r1
        be R'dest(%sr4,%r1)

Long pic pc-relative and symbol difference sequences also don't work.

Dave
-- 
J. David Anglin                                  dave.anglin@nrc-cnrc.gc.ca
National Research Council of Canada              (613) 990-0752 (FAX: 952-6605)
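
The size rule Dave describes can be pictured with a small sketch. This is not GCC's code: the function name `call_kind`, the string labels and the exact comparison at the boundary are assumptions made for illustration. Only the 240000-byte figure and the switch to long indirect calls come from the message above.

```python
# Illustrative model only; nothing here is taken from GCC's pa.c.
PA1X_CODE_LIMIT = 240000  # bytes, the PA 1.X figure quoted in the thread


def call_kind(total_code_bytes: int, limit: int = PA1X_CODE_LIMIT) -> str:
    """Return which call sequence this toy model would pick."""
    if total_code_bytes < 0:
        raise ValueError("code size cannot be negative")
    # Assumption: a total exactly at the limit still gets the short form;
    # the message only says the switch happens once the limit is exceeded.
    if total_code_bytes > limit:
        return "long indirect call"
    return "pc-relative call"


if __name__ == "__main__":
    for size in (100_000, 240_000, 240_004):
        print(size, "->", call_kind(size))
```

The point of the model is that the decision depends on how much code has accumulated, not on the distance to any particular callee, which is why one large object can make every call in it more expensive.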

* Re: [parisc-linux] Compiler switches
From: Randolph Chung @ 2003-02-02  5:19 UTC
To: John David Anglin
Cc: Matthew Wilcox, parisc-linux

> Cross my fingers, but I believe that the distance problem for calls
> is fixed in 3.2.2.  However, if there really are objects with .text larger

Lamont still saw some function-section failures with Debian gcc
3.2.2-pre7 (20030128).

This one looks pretty small:
http://buildd.debian.org/fetch.php?&pkg=synopsis&ver=0.4.1cvs20030125-1&arch=hppa&stamp=1044017572&file=log&as=raw

randolph
-- 
Randolph Chung
Debian GNU/Linux Developer, hppa/ia64 ports
http://www.tausq.org/

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-02  5:49 UTC
To: tausq
Cc: willy, parisc-linux

> Lamont still saw some function-section failures with Debian gcc
> 3.2.2-pre7 (20030128)

If this is building C++ code, then the problem is likely in the linker.
Try this:

Index: emultempl/hppaelf.em
===================================================================
RCS file: /cvs/src/src/ld/emultempl/hppaelf.em,v
retrieving revision 1.24
diff -u -3 -p -r1.24 hppaelf.em
--- emultempl/hppaelf.em        30 Nov 2002 08:39:46 -0000      1.24
+++ emultempl/hppaelf.em        2 Feb 2003 05:41:38 -0000
@@ -50,7 +50,7 @@ static int need_laying_out = 0;
 /* Maximum size of a group of input sections that can be handled by
    one stub section.  A value of +/-1 indicates the bfd back-end
    should use a suitable default size.  */
-static bfd_signed_vma group_size = 1;
+static bfd_signed_vma group_size = -1;
 
 /* Stops the linker merging .text sections on a relocatable link,
    and adds millicode library to the list of input files.  */

This reduces the stub pressure by about 50%.

I also installed this yesterday to fix a related problem:

2003-01-31  John David Anglin  <dave.anglin@nrc-cnrc.gc.ca>

        * pa.c (pa_output_function_prologue, pa_output_function_epilogue):
        Move updating of total_code_bytes from prologue to epilogue.

Updating total_code_bytes in the prologue caused the call sizes to change
in a function that spanned the 240000 byte boundary.  This caused the
branch distance of any branch over a call to increase, and sometimes
exceed the range of the branch.

Dave

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-02  5:52 UTC
To: tausq
Cc: willy, parisc-linux

> http://buildd.debian.org/fetch.php?&pkg=synopsis&ver=0.4.1cvs20030125-1&arch=hppa&stamp=1044017572&file=log&as=raw

Yah, that's a stub table overflow.  Try the linker patch.

Dave

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-02  5:55 UTC
To: John David Anglin
Cc: tausq, willy, parisc-linux

> > http://buildd.debian.org/fetch.php?&pkg=synopsis&ver=0.4.1cvs20030125-1&arch=hppa&stamp=1044017572&file=log&as=raw
>
> Yah, that's a stub table overflow.  Try the linker patch.

Oh, I should mention there is a linker option to change the stub table
size.  It's "--stub-group-size=N".  For more info, "ld --help".

Dave

* Re: [parisc-linux] Compiler switches
From: Randolph Chung @ 2003-02-02  8:18 UTC
To: John David Anglin
Cc: willy, parisc-linux

In reference to a message from John David Anglin, dated Feb 02:
> > http://buildd.debian.org/fetch.php?&pkg=synopsis&ver=0.4.1cvs20030125-1&arch=hppa&stamp=1044017572&file=log&as=raw
>
> Yah, that's a stub table overflow.  Try the linker patch.

yup, this seems to work.

randolph

* Re: [parisc-linux] Compiler switches
From: Matthew Wilcox @ 2003-02-02 21:03 UTC
To: John David Anglin
Cc: Matthew Wilcox, parisc-linux

On Sat, Feb 01, 2003 at 11:56:40PM -0500, John David Anglin wrote:
> > -D__linux__ looks like it can go away.
>
> Agreed.  This is defined for sure in 3.1/3.2 and later.

Plus there's very little code which checks for __linux__ in the source
tree ;-)

> > -fno-strength-reduce has been there since before we moved to ELF -- over 3
> > years.  Any bug this was working around has hopefully been long-squashed.
> > I think we should eliminate this and submit PRs if it finds new holes.
>
> I always wondered why this option was used.

OK, I'll take it out now.

> > -mno-space-regs & -mfast-indirect-calls can also go away, I think.
> > I can't imagine that we ever didn't have them as default on a gcc
> > 3.0-based compiler.
>
> These are still not the default but possibly they should be.
>
> Actually, I can see that we can save a couple of instructions when
> generating long indirect calls to a symbol reference when no space
> registers is defined.

I think they definitely should be implied by configuring for hppa-linux.
I don't see any enthusiasm for allowing use of additional space registers
for special purposes.

> > Do we still need -ffunction-sections?  I'm inclined to leave it anyway
> > to enable compilation with older toolchains.
>
> Cross my fingers, but I believe that the distance problem for calls
> is fixed in 3.2.2.  However, if there really are objects with .text larger
> than 240000 bytes, then you should probably still define -ffunction-sections.

hmm.. part of the problem is the ld -r steps.  The current 2.5 build
process does things like:

  hppa-linux-gcc -Wp,-MD,drivers/block/.loop.o.d -D__KERNEL__ -Iinclude \
    -Wall -Wstrict-prototypes -Wno-trigraphs -O2 -fno-strict-aliasing \
    -fno-common -D__linux__ -pipe -fno-strength-reduce -mno-space-regs \
    -mfast-indirect-calls -mdisable-fpregs -ffunction-sections -march=1.1 \
    -mschedule=7100LC -fomit-frame-pointer -nostdinc -iwithprefix include \
    -DKBUILD_BASENAME=loop -DKBUILD_MODNAME=loop \
    -c -o drivers/block/loop.o drivers/block/loop.c

  hppa-linux-ld -r -o drivers/block/built-in.o drivers/block/elevator.o \
    drivers/block/ll_rw_blk.o drivers/block/ioctl.o drivers/block/genhd.o \
    drivers/block/scsi_ioctl.o drivers/block/deadline-iosched.o \
    drivers/block/rd.o drivers/block/loop.o

  hppa-linux-ld -r -o drivers/built-in.o drivers/pci/built-in.o \
    drivers/parisc/built-in.o drivers/serial/built-in.o \
    drivers/parport/built-in.o drivers/base/built-in.o \
    drivers/char/built-in.o drivers/block/built-in.o \
    drivers/misc/built-in.o drivers/net/built-in.o \
    drivers/media/built-in.o drivers/scsi/built-in.o \
    drivers/cdrom/built-in.o drivers/video/built-in.o \
    drivers/usb/built-in.o drivers/input/built-in.o \
    drivers/input/serio/built-in.o drivers/md/built-in.o \
    drivers/eisa/built-in.o

  hppa-linux-ld -T arch/parisc/vmlinux.lds.s arch/parisc/kernel/head.o \
    init/built-in.o --start-group usr/built-in.o \
    arch/parisc/kernel/pdc_cons.o arch/parisc/kernel/process.o \
    arch/parisc/mm/built-in.o arch/parisc/kernel/built-in.o \
    arch/parisc/hpux/built-in.o arch/parisc/math-emu/built-in.o \
    arch/parisc/kernel/init_task.o kernel/built-in.o mm/built-in.o \
    fs/built-in.o ipc/built-in.o security/built-in.o crypto/built-in.o \
    lib/lib.a arch/parisc/lib/lib.a \
    `hppa-linux-gcc -print-libgcc-file-name` drivers/built-in.o \
    sound/built-in.o arch/parisc/oprofile/built-in.o net/built-in.o \
    --end-group -o vmlinux

Now, which .text is limited to 240k?  loop.o, drivers/block/built-in.o,
drivers/built-in.o or vmlinux?

> When the total code bytes exceeds the above limit (PA 1.X), gcc switches
> to long indirect calls.  These are horribly inefficient.

Horribly inefficient in terms of being prefetchable?

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-02 22:02 UTC
To: Matthew Wilcox
Cc: willy, parisc-linux

> I think they definitely should be implied by configuring for hppa-linux.
> I don't see any enthusiasm for allowing use of additional space registers
> for special purposes.

Ok, I will do some testing with fast indirect and no space registers.

> hmm.. part of the problem is the ld -r steps.  The current 2.5 build
> process does things like:

ld -r is generally bad news on the PA.  The compiler selects call types
based on the distance from the call to the beginning of the current
translation unit, or function if -ffunction-sections is being used.
Relinking without using -ffunction-sections will change the distance
to the beginning of the code section for all calls.  That's where
the linker will insert a long call stub if necessary.  If linking
with ld -r creates an object with a text section larger than 240k, then
there are likely to be calls which can't reach a long branch stub for
calls external to the object.

I guess if the linker could create stubs when doing ld -r then this
problem could be avoided.  However, I think stubs are created only
when doing a final link.

Using -ffunction-sections may have some drawbacks.  Normally, related
functions are placed in the same object.  The linker does some grouping
of sections but I doubt it is optimal.  So, you might end up needing
stubs in some cases where you would want a simple {bl|b,l}.  You wouldn't
want this to happen when you have a tightly coupled pair of sibling
calls.

> Now, which .text is limited to 240k?  loop.o, drivers/block/built-in.o,
> drivers/built-in.o or vmlinux?

My understanding is that it's the size of any text sections involved
in a final link.  The linker intersperses stub groups between the
text sections that are used in any final link.  Thus, the final
text section for vmlinux can be much larger than 240k and stubs will
be provided for any branches exceeding 240k.

Thus, you shouldn't need -ffunction-sections when compiling objects
that will be prelinked using ld -r if the resultant size of the text
sections after prelinking is smaller than 240k.

> > When the total code bytes exceeds the above limit (PA 1.X), gcc switches
> > to long indirect calls.  These are horribly inefficient.
>
> Horribly inefficient in terms of being prefetchable?

No, in terms of the number of instructions involved.

Dave
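
Dave's rule of thumb for prelinked objects can be written down as a quick check. This is a minimal sketch, assuming the .text size of each object going into one ld -r step is already known; the function name, the object names and the byte counts are made up for illustration, and summing the inputs ignores any alignment padding the linker might add.

```python
from typing import Mapping

PRELINK_TEXT_LIMIT = 240000  # bytes, the "240k" figure used in the thread


def needs_function_sections(text_sizes: Mapping[str, int],
                            limit: int = PRELINK_TEXT_LIMIT) -> bool:
    """Estimate whether one 'ld -r' step calls for -ffunction-sections.

    text_sizes maps each input object to the size of its .text in bytes.
    The merged text is approximated by the plain sum of the inputs.
    """
    merged = sum(text_sizes.values())
    # The thread says the flag is unnecessary when the prelinked text is
    # smaller than the limit, so anything at or above it is flagged here.
    return merged >= limit


if __name__ == "__main__":
    # Hypothetical sizes, not measured from a real kernel build.
    small_dir = {"elevator.o": 12_000, "ll_rw_blk.o": 30_000, "loop.o": 9_000}
    big_dir = {"a.o": 150_000, "b.o": 120_000}
    print("small_dir:", needs_function_sections(small_dir))
    print("big_dir:", needs_function_sections(big_dir))
```

Because the kernel build prelinks in several stages, the check would have to be repeated at each level of built-in.o, not just for the leaf directories.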

* Re: [parisc-linux] Compiler switches
From: John David Anglin @ 2003-02-06 22:21 UTC
To: John David Anglin
Cc: willy, parisc-linux

> > I think they definitely should be implied by configuring for hppa-linux.
> > I don't see any enthusiasm for allowing use of additional space registers
> > for special purposes.
>
> Ok, I will do some testing with fast indirect and no space registers.

I have installed a GCC patch on 3.3 and trunk that changes the
hppa-unknown-linux-gnu target (no change to hppa64-unknown-linux-gnu)
to include MASK_NO_SPACE_REGISTERS in the target default.

The patch also improves the code generated in a few situations where
we unnecessarily loaded the destination space register when doing an
external branch.  As these were in rather uncommon situations, I doubt
anyone will see much difference in performance.  This really only affects
PA 1.x as PA 2.0 has the "bve" insn which sets the space register
automatically.

It's not a good idea to make fast indirect the default.  Fast indirect
calls only work when a program has static linkage (i.e., they're not
compatible with shared libraries because the PIC register is not set
when you do a fast indirect call).

Dave
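
Where the thread ends up can be summarised as a small flag selector. This is a sketch, not the kernel's Makefile logic: the function name and the version cut-offs are assumptions read off the messages above (Dave says `__linux__` is defined "for sure in 3.1/3.2 and later" and that the no-space-registers default went into 3.3 and trunk), and it simply keeps `-pipe` and `-ffunction-sections` because nobody in the thread settled those questions.

```python
from typing import List, Tuple


def parisc_cflags(gcc_version: Tuple[int, int]) -> List[str]:
    """Return a hypothetical cflags list for a given (major, minor) GCC."""
    # Kept unconditionally: fast indirect calls are not a compiler default,
    # and -fno-strength-reduce is dropped as agreed in the thread.
    flags = ["-pipe", "-mfast-indirect-calls", "-mdisable-fpregs",
             "-ffunction-sections"]
    # Assumption: compilers before 3.3 lack the new hppa-linux default.
    if gcc_version < (3, 3):
        flags.append("-mno-space-regs")
    # Assumption: be conservative for compilers older than 3.1.
    if gcc_version < (3, 1):
        flags.append("-D__linux__")
    return flags


if __name__ == "__main__":
    for version in ((3, 0), (3, 2), (3, 3)):
        print(version, " ".join(parisc_cflags(version)))
```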