public inbox for linux-kernel@vger.kernel.org
* [patch 02/2] allow gcc4 to optimize unit-at-a-time
@ 2005-12-28 11:47 Ingo Molnar
  2005-12-28 12:04 ` Jakub Jelinek
  2005-12-28 15:30 ` Andi Kleen
  0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2005-12-28 11:47 UTC (permalink / raw)
  To: lkml; +Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, Matt Mackall

allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
having a wider scope when optimizing. This also results in smaller code
when optimizing for size. (gcc4 does not have the stack footprint
problem of gcc3 compilers.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
----

 arch/i386/Makefile |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/arch/i386/Makefile
===================================================================
--- linux.orig/arch/i386/Makefile
+++ linux/arch/i386/Makefile
@@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
 GCC_VERSION			:= $(call cc-version)
 cflags-$(CONFIG_REGPARM) 	+= $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
 
-# Disable unit-at-a-time mode, it makes gcc use a lot more stack
-# due to the lack of sharing of stacklots.
-CFLAGS += $(call cc-option,-fno-unit-at-a-time)
+# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
+# a lot more stack due to the lack of sharing of stacklots:
+CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
 
 CFLAGS += $(cflags-y)
 

^ permalink raw reply	[flat|nested] 11+ messages in thread
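Editorial sketch: the patch compares `$(GCC_VERSION)` against `0400`. The snippet below illustrates how such a comparison behaves, assuming `cc-version` yields a four-digit code built from the compiler's major and minor version numbers (two digits each). The function names `gcc_version_code` and `is_pre_gcc4` are hypothetical and not part of kbuild.

```shell
#!/bin/sh
# Hypothetical helper: encode major.minor as a four-digit code, e.g. 3.4 -> 0304.
gcc_version_code() {
    printf '%02d%02d\n' "$1" "$2"
}

# Same comparison as in the patch: succeeds when the code is below 0400.
# The `[` command compares its operands as decimal integers, so the
# leading zero does not matter here.
is_pre_gcc4() {
    [ "$1" -lt 0400 ]
}

for v in "3 3" "3 4" "4 0"; do
    set -- $v                      # split "major minor" into $1 and $2
    code=$(gcc_version_code "$1" "$2")
    if is_pre_gcc4 "$code"; then
        echo "$code: add -fno-unit-at-a-time"
    else
        echo "$code: leave unit-at-a-time enabled"
    fi
done
```

Under these assumptions, only compilers whose code is below `0400` would receive the flag.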

* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 11:47 [patch 02/2] allow gcc4 to optimize unit-at-a-time Ingo Molnar
@ 2005-12-28 12:04 ` Jakub Jelinek
  2005-12-28 12:28   ` Sam Ravnborg
  2005-12-28 13:06   ` Ingo Molnar
  2005-12-28 15:30 ` Andi Kleen
  1 sibling, 2 replies; 11+ messages in thread
From: Jakub Jelinek @ 2005-12-28 12:04 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: lkml, Linus Torvalds, Andrew Morton, Arjan van de Ven,
	Matt Mackall

On Wed, Dec 28, 2005 at 12:47:01PM +0100, Ingo Molnar wrote:
> allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> having a wider scope when optimizing. This also results in smaller code
> when optimizing for size. (gcc4 does not have the stack footprint
> problem of gcc3 compilers.)
> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> Signed-off-by: Arjan van de Ven <arjan@infradead.org>
> ----
> 
>  arch/i386/Makefile |    6 +++---
>  1 files changed, 3 insertions(+), 3 deletions(-)
> 
> Index: linux/arch/i386/Makefile
> ===================================================================
> --- linux.orig/arch/i386/Makefile
> +++ linux/arch/i386/Makefile
> @@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
>  GCC_VERSION			:= $(call cc-version)
>  cflags-$(CONFIG_REGPARM) 	+= $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
>  
> -# Disable unit-at-a-time mode, it makes gcc use a lot more stack
> -# due to the lack of sharing of stacklots.
> -CFLAGS += $(call cc-option,-fno-unit-at-a-time)
> +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> +# a lot more stack due to the lack of sharing of stacklots:
> +CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)

-fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
branch).  So unless the minimum supported GCC version to compile kernel is
3.4+, you need to replace
echo "-fno-unit-at-a-time"
with
$(call cc-option,-fno-unit-at-a-time)
.

	Jakub
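Editorial sketch: the point above is that a flag must be probed before it is passed to a compiler that may predate it. The snippet below is a rough stand-in for that kind of probe, not kbuild's actual `cc-option` definition. The function name `cc_option` and the use of `true` and `false` as stub compilers are assumptions made so the example runs without any particular gcc installed.

```shell
#!/bin/sh
# Hypothetical probe: print the flag only if the given compiler command
# exits successfully when invoked with it on an empty C input.
#   $1: compiler command, $2: flag to test
cc_option() {
    cc=$1 flag=$2
    if "$cc" "$flag" -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then
        echo "$flag"
    fi
}

# Stub compilers for the demo: `true` accepts every flag, `false` rejects it.
cc_option true  -fno-unit-at-a-time   # flag is printed
cc_option false -fno-unit-at-a-time   # nothing is printed
```

With a real compiler in place of the stubs, a gcc that does not know the option would fail the probe and the flag would be silently dropped.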


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 12:04 ` Jakub Jelinek
@ 2005-12-28 12:28   ` Sam Ravnborg
  2005-12-28 13:04     ` Jakub Jelinek
  2005-12-28 13:06   ` Ingo Molnar
  1 sibling, 1 reply; 11+ messages in thread
From: Sam Ravnborg @ 2005-12-28 12:28 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ingo Molnar, lkml, Linus Torvalds, Andrew Morton,
	Arjan van de Ven, Matt Mackall

On Wed, Dec 28, 2005 at 07:04:35AM -0500, Jakub Jelinek wrote:
> > +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> > +# a lot more stack due to the lack of sharing of stacklots:
> > +CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
> 
> -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> branch).  So unless the minimum supported GCC version to compile kernel is
> 3.4+, you need to replace
> echo "-fno-unit-at-a-time"
> with
> $(call cc-option,-fno-unit-at-a-time)
The test "$(GCC_VERSION) -lt 0400" takes care of this.

	Sam


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 13:04     ` Jakub Jelinek
@ 2005-12-28 12:47       ` Sam Ravnborg
  2005-12-28 12:50         ` Sam Ravnborg
  0 siblings, 1 reply; 11+ messages in thread
From: Sam Ravnborg @ 2005-12-28 12:47 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ingo Molnar, lkml, Linus Torvalds, Andrew Morton,
	Arjan van de Ven, Matt Mackall

On Wed, Dec 28, 2005 at 08:04:35AM -0500, Jakub Jelinek wrote:
> No.
> -fno-unit-at-a-time should be used with GCCs that
> a) support it
> b) are older than GCC 4.0
> 
> The "$(GCC_VERSION) -lt 0400" test takes care of b),
> $(call cc-option,-fno-unit-at-a-time) takes care of a).

There was a reason for disabling it unconditionally in the first place.
That was due to unexpectedly huge stack usage, if I understand correctly.
Ingo's patch enables unit-at-a-time only for gcc >= 4.00, which should
have this issue fixed.

If the argument is that we should suddenly enable unit-at-a-time for
gcc before 4.00, then we should revisit the reasons why it was originally
disabled.

	Sam


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 12:47       ` Sam Ravnborg
@ 2005-12-28 12:50         ` Sam Ravnborg
  0 siblings, 0 replies; 11+ messages in thread
From: Sam Ravnborg @ 2005-12-28 12:50 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Ingo Molnar, lkml, Linus Torvalds, Andrew Morton,
	Arjan van de Ven, Matt Mackall

On Wed, Dec 28, 2005 at 01:47:04PM +0100, Sam Ravnborg wrote:
> On Wed, Dec 28, 2005 at 08:04:35AM -0500, Jakub Jelinek wrote:
> > No.
> > -fno-unit-at-a-time should be used with GCCs that
> > a) support it
> > b) are older than GCC 4.0
> > 
> > The "$(GCC_VERSION) -lt 0400" test takes care of b),
> > $(call cc-option,-fno-unit-at-a-time) takes care of a).
> 
> There was a reason for disabling it unconditionally in the first place.
> That was due to unexpectedly huge stack usage, if I understand correctly.
> Ingo's patch enables unit-at-a-time only for gcc >= 4.00, which should
> have this issue fixed.
> 
> If the argument is that we should suddenly enable unit-at-a-time for
> gcc before 4.00, then we should revisit the reasons why it was originally
> disabled.
Hi Jakub.

Reading your mail once more I understood it.
And you are right of course.

	Sam - on his way to get more coffee...


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 12:28   ` Sam Ravnborg
@ 2005-12-28 13:04     ` Jakub Jelinek
  2005-12-28 12:47       ` Sam Ravnborg
  0 siblings, 1 reply; 11+ messages in thread
From: Jakub Jelinek @ 2005-12-28 13:04 UTC (permalink / raw)
  To: Sam Ravnborg
  Cc: Ingo Molnar, lkml, Linus Torvalds, Andrew Morton,
	Arjan van de Ven, Matt Mackall

On Wed, Dec 28, 2005 at 01:28:15PM +0100, Sam Ravnborg wrote:
> On Wed, Dec 28, 2005 at 07:04:35AM -0500, Jakub Jelinek wrote:
> > > +# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
> > > +# a lot more stack due to the lack of sharing of stacklots:
> > > +CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
> > 
> > -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> > branch).  So unless the minimum supported GCC version to compile kernel is
> > 3.4+, you need to replace
> > echo "-fno-unit-at-a-time"
> > with
> > $(call cc-option,-fno-unit-at-a-time)
> The test "$(GCC_VERSION) -lt 0400" takes care of this.

No.
-fno-unit-at-a-time should be used with GCCs that
a) support it
b) are older than GCC 4.0

The "$(GCC_VERSION) -lt 0400" test takes care of b),
$(call cc-option,-fno-unit-at-a-time) takes care of a).

	Jakub
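Editorial sketch: the two conditions listed above can be combined as follows. This is a minimal illustration, not the Makefile logic itself. The function name `unit_at_a_time_flag`, the four-digit version codes, and the stub compilers `true` and `false` are assumptions for the demo.

```shell
#!/bin/sh
# Hypothetical helper combining the two conditions.
#   $1: four-digit version code (e.g. 0304), $2: compiler command to probe
unit_at_a_time_flag() {
    version=$1 cc=$2 flag=-fno-unit-at-a-time
    # b) only compilers older than 4.0 should get the flag
    [ "$version" -lt 0400 ] || return 0
    # a) only pass the flag if the compiler accepts it
    if "$cc" "$flag" -S -o /dev/null -xc /dev/null >/dev/null 2>&1; then
        echo "$flag"
    fi
}

# Stub compilers: `false` models a gcc that rejects the flag,
# `true` models one that accepts it.
unit_at_a_time_flag 0302 false   # old gcc without the option: prints nothing
unit_at_a_time_flag 0304 true    # gcc 3.4: prints -fno-unit-at-a-time
unit_at_a_time_flag 0400 true    # gcc 4.0: prints nothing
```

Note that the result of the probe is emitted with `echo`; the version test alone covers only condition b).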


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 12:04 ` Jakub Jelinek
  2005-12-28 12:28   ` Sam Ravnborg
@ 2005-12-28 13:06   ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: Ingo Molnar @ 2005-12-28 13:06 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: lkml, Linus Torvalds, Andrew Morton, Arjan van de Ven,
	Matt Mackall


* Jakub Jelinek <jakub@redhat.com> wrote:

> > +CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo "-fno-unit-at-a-time"; fi ;)
> 
> -fno-unit-at-a-time option has been introduced in GCC 3.4 (and 3.3-hammer
> branch).  So unless the minimum supported GCC version to compile kernel is
> 3.4+, you need to replace
> echo "-fno-unit-at-a-time"
> with
> $(call cc-option,-fno-unit-at-a-time)
> .

indeed - updated patch below.

	Ingo

Subject: allow gcc4 to optimize unit-at-a-time

allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
having a wider scope when optimizing. This also results in smaller code
when optimizing for size. (gcc4 does not have the stack footprint
problem of gcc3 compilers.)

Signed-off-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
----

 arch/i386/Makefile |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

Index: linux/arch/i386/Makefile
===================================================================
--- linux.orig/arch/i386/Makefile
+++ linux/arch/i386/Makefile
@@ -42,9 +42,9 @@ include $(srctree)/arch/i386/Makefile.cp
 GCC_VERSION			:= $(call cc-version)
 cflags-$(CONFIG_REGPARM) 	+= $(shell if [ $(GCC_VERSION) -ge 0300 ] ; then echo "-mregparm=3"; fi ;)
 
-# Disable unit-at-a-time mode, it makes gcc use a lot more stack
-# due to the lack of sharing of stacklots.
-CFLAGS += $(call cc-option,-fno-unit-at-a-time)
+# Disable unit-at-a-time mode on pre-gcc-4.0 compilers, it makes gcc use
+# a lot more stack due to the lack of sharing of stacklots:
+CFLAGS				+= $(shell if [ $(GCC_VERSION) -lt 0400 ] ; then echo $(call cc-option,-fno-unit-at-a-time); fi ;)
 
 CFLAGS += $(cflags-y)
 


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 11:47 [patch 02/2] allow gcc4 to optimize unit-at-a-time Ingo Molnar
  2005-12-28 12:04 ` Jakub Jelinek
@ 2005-12-28 15:30 ` Andi Kleen
  2005-12-28 15:34   ` Matt Mackall
  2005-12-28 15:41   ` Ingo Molnar
  1 sibling, 2 replies; 11+ messages in thread
From: Andi Kleen @ 2005-12-28 15:30 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, Matt Mackall,
	linux-kernel

Ingo Molnar <mingo@elte.hu> writes:

> allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> having a wider scope when optimizing. This also results in smaller code
> when optimizing for size. (gcc4 does not have the stack footprint
> problem of gcc3 compilers.)

I never had any trouble with stack footprint even with gcc 3.3 on x86-64
and unit-at-a-time and it was always enabled. 

But one caveat: turning on unit-at-a-time makes objdump -S / make
foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
cannot deal with functions being out of order in the object file. This
can be a big problem while analyzing oopses - essentially you have
to analyze the functions without source level information. And with
unit-at-a-time they become bigger so it's more difficult.

But I still think it's a good idea.

-Andi


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 15:30 ` Andi Kleen
@ 2005-12-28 15:34   ` Matt Mackall
  2005-12-28 15:41   ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: Matt Mackall @ 2005-12-28 15:34 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Ingo Molnar, Linus Torvalds, Andrew Morton, Arjan van de Ven,
	linux-kernel

On Wed, Dec 28, 2005 at 04:30:49PM +0100, Andi Kleen wrote:
> Ingo Molnar <mingo@elte.hu> writes:
> 
> > allow gcc4 compilers to optimize unit-at-a-time - which results in gcc
> > having a wider scope when optimizing. This also results in smaller code
> > when optimizing for size. (gcc4 does not have the stack footprint
> > problem of gcc3 compilers.)
> 
> I never had any trouble with stack footprint even with gcc 3.3 on x86-64
> and unit-at-a-time and it was always enabled. 

The particular offenders I remember were in lib/inflate.c running over
4K well before 4K stacks were in mainline, so I fixed it well before
anyone else got to see it.
 
> But one caveat: turning on unit-at-a-time makes objdump -S / make
> foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
> cannot deal with functions being out of order in the object file. This
> can be a big problem while analyzing oopses - essentially you have
> to analyze the functions without source level information. And with
> unit-at-a-time they become bigger so it's more difficult.

Yeah, and it also makes stuff like bloat-o-meter output go all to hell.
 
> But I still think it's a good idea.

Indeed.

-- 
Mathematics is the supreme nostalgia of our time.


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 15:30 ` Andi Kleen
  2005-12-28 15:34   ` Matt Mackall
@ 2005-12-28 15:41   ` Ingo Molnar
  2005-12-28 17:46     ` Andreas Kleen
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2005-12-28 15:41 UTC (permalink / raw)
  To: Andi Kleen
  Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, Matt Mackall,
	linux-kernel


* Andi Kleen <ak@suse.de> wrote:

> But one caveat: turning on unit-at-a-time makes objdump -S / make 
> foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump 
> cannot deal with functions being out of order in the object file. This 
> can be a big problem while analyzing oopses - essentially you have to 
> analyze the functions without source level information. And with 
> unit-at-a-time they become bigger so it's more difficult.
> 
> But I still think it's a good idea.

hm, i dont seem to have problems with DEBUG_INFO. I picked a random 
address within the kernel:

c035766f T schedule_timeout

(gdb) list *0xc035768f
0xc035768f is in schedule_timeout (kernel/timer.c:1075).
1070                     * should never happens anyway). You just have the printk()
1071                     * that will tell you if something is gone wrong and where.
1072                     */
1073                    if (timeout < 0)
1074                    {
1075                            printk(KERN_ERR "schedule_timeout: wrong timeout "
1076                                    "value %lx from %p\n", timeout,
1077                                    __builtin_return_address(0));
1078                            current->state = TASK_RUNNING;
1079                            goto out;
(gdb)

or is it something else that breaks?

	Ingo


* Re: [patch 02/2] allow gcc4 to optimize unit-at-a-time
  2005-12-28 15:41   ` Ingo Molnar
@ 2005-12-28 17:46     ` Andreas Kleen
  0 siblings, 0 replies; 11+ messages in thread
From: Andreas Kleen @ 2005-12-28 17:46 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Linus Torvalds, Andrew Morton, Arjan van de Ven, Matt Mackall,
	linux-kernel

Am Mi 28.12.2005 16:41 schrieb Ingo Molnar <mingo@elte.hu>:

>
> * Andi Kleen <ak@suse.de> wrote:
>
> > But one caveat: turning on unit-at-a-time makes objdump -S / make
> > foo/bar.lst with CONFIG_DEBUG_INFO essentially useless because objdump
> > cannot deal with functions being out of order in the object file. This
> > can be a big problem while analyzing oopses - essentially you have to
> > analyze the functions without source level information. And with
> > unit-at-a-time they become bigger so it's more difficult.
> >
> > But I still think it's a good idea.
>
> hm, i dont seem to have problems with DEBUG_INFO. I picked a random
> address within the kernel:
>
> c035766f T schedule_timeout
>
> (gdb) list *0xc035768f
> 0xc035768f is in schedule_timeout (kernel/timer.c:1075).
> 1070                     * should never happens anyway). You just have the printk()
> 1071                     * that will tell you if something is gone wrong and where.
> 1072                     */
> 1073                    if (timeout < 0)
> 1074                    {
> 1075                            printk(KERN_ERR "schedule_timeout: wrong timeout "
> 1076                                    "value %lx from %p\n", timeout,
> 1077                                    __builtin_return_address(0));
> 1078                            current->state = TASK_RUNNING;
> 1079                            goto out;
> (gdb)
>
> or is it something else that breaks?

It's objdump that breaks. Try objdump -S. gdb can deal with it, but you
can't generate mixed C/assembly listings with it, so it's hard to match
up the exact lines.

(apparently it's possible through the gdb/mi interface, but I haven't
attempted that yet)

-Andi
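Editorial sketch: the tools contrasted in this exchange can be invoked roughly as below. The helper only prints the command lines so that it runs without a kernel image; the function name `source_lookup_cmds` is hypothetical, and `vmlinux` stands for an image built with CONFIG_DEBUG_INFO.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) commands that map a kernel text
# address to source lines.
#   $1: kernel image with debug info, $2: address to look up
source_lookup_cmds() {
    image=$1 addr=$2
    # gdb resolves the address through the debug info, as in Ingo's session.
    echo "gdb -batch -ex 'list *$addr' $image"
    # addr2line prints the enclosing function (-f) and the file:line pair.
    echo "addr2line -f -e $image $addr"
    # objdump -S interleaves source with disassembly; this is the listing
    # reported above as unreliable once functions are emitted out of order.
    echo "objdump -S $image"
}

source_lookup_cmds vmlinux 0xc035768f
```

The first two commands answer "which line is this address", while `objdump -S` is the one that produces the mixed C/assembly listing under discussion.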



