* Compiling x86 with and without frame pointer
@ 2002-11-21  4:47 Keith Owens
  2002-11-21  5:06 ` Mark Mielke
                   ` (2 more replies)
  0 siblings, 3 replies; 17+ messages in thread
From: Keith Owens @ 2002-11-21  4:47 UTC (permalink / raw)
  To: linux-kernel

The conventional wisdom is that compiling x86 without frame pointers
results in smaller code.  It turns out to be the opposite: compiling
with frame pointers results in a smaller kernel.  This is with gcc
version 3.2 20020822 (Red Hat Linux Rawhide 3.2-4).

# size 2.4.20-rc2-*/vmlinux
   text    data     bss     dec     hex filename
2669584  337972  402697 3410253  34094d 2.4.20-rc2-fp/vmlinux
2676919  337972  402697 3417588  3425f4 2.4.20-rc2-nofp/vmlinux

Without frame pointers, vmlinux is about 7K bigger (7335 bytes of
text).  The difference is that code with frame pointers can use ebp to
access the stack directly, while code without frame pointers has to use
esp with an index.

With frame pointers:

00000c10 <inet_dgram_connect>:
     c10:       55                      push   %ebp
     c11:       89 e5                   mov    %esp,%ebp
     c13:       83 ec 14                sub    $0x14,%esp
     c16:       89 75 fc                mov    %esi,0xfffffffc(%ebp)
     c19:       8b 45 08                mov    0x8(%ebp),%eax
     c1c:       8b 75 0c                mov    0xc(%ebp),%esi
     c1f:       89 5d f8                mov    %ebx,0xfffffff8(%ebp)
     c22:       8b 58 18                mov    0x18(%eax),%ebx
     c25:       66 83 3e 00             cmpw   $0x0,(%esi)
     c29:       74 3d                   je     c68 <inet_dgram_connect+0x58>

Without frame pointers:

00000c10 <inet_dgram_connect>:
     c10:       83 ec 14                sub    $0x14,%esp
     c13:       8b 44 24 18             mov    0x18(%esp,1),%eax
     c17:       89 74 24 10             mov    %esi,0x10(%esp,1)
     c1b:       8b 74 24 1c             mov    0x1c(%esp,1),%esi
     c1f:       89 5c 24 0c             mov    %ebx,0xc(%esp,1)
     c23:       8b 58 18                mov    0x18(%eax),%ebx
     c26:       66 83 3e 00             cmpw   $0x0,(%esi)
     c2a:       74 44                   je     c70 <inet_dgram_connect+0x60>

The difference is that stack accesses via ebp are 3 bytes, stack
accesses via esp+index are 4 bytes.  On any function with a large
number of stack accesses, this quickly outweighs the extra prologue
code for frame pointers.
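
The byte counts can be checked against the two listings above.  What
follows is a minimal Python sketch, not part of the original
measurements, that sums the encoded length of every instruction before
the cmpw.  The opcode bytes are copied from the disassembly; the
variable and function names are illustrative only.

```python
# Opcode bytes copied from the two inet_dgram_connect listings above,
# one string per instruction, up to (but not including) the cmpw.
WITH_FP = ["55", "89 e5", "83 ec 14", "89 75 fc", "8b 45 08",
           "8b 75 0c", "89 5d f8", "8b 58 18"]
WITHOUT_FP = ["83 ec 14", "8b 44 24 18", "89 74 24 10",
              "8b 74 24 1c", "89 5c 24 0c", "8b 58 18"]


def code_bytes(instructions):
    """Total encoded length, in bytes, of a list of hex-dump strings."""
    return sum(len(insn.split()) for insn in instructions)


fp_len = code_bytes(WITH_FP)
nofp_len = code_bytes(WITHOUT_FP)
print(fp_len, nofp_len)
```

With only four stack accesses, the 3 bytes of push/mov prologue are
already repaid, which is why the cmpw sits at c25 in the first listing
and at c26 in the second.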

The smaller code will improve icache usage.  Whether this is
offset by the increased register pressure is something for
benchmarking.  Would any of the benchmarkers care to test x86 kernels
with and without frame pointers?



* Re: Compiling x86 with and without frame pointer
  2002-11-21  4:47 Compiling x86 with and without frame pointer Keith Owens
@ 2002-11-21  5:06 ` Mark Mielke
  2002-11-21  9:30   ` David Zaffiro
  2002-11-21 12:55 ` Dave Jones
  2002-11-21 17:44 ` Martin J. Bligh
  2 siblings, 1 reply; 17+ messages in thread
From: Mark Mielke @ 2002-11-21  5:06 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

A few weeks ago I was surprised to find that code compiled with
-fomit-frame-pointer reliably executed a few percent slower.
Since the functions I was testing were nowhere near big enough to
fill even the L1 cache, I wrote it off as 'the CPU is obviously
optimized to expect certain instruction sequences after call and
before ret'. Something to think about anyways...

mark


-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/



* Re: Compiling x86 with and without frame pointer
  2002-11-21  5:06 ` Mark Mielke
@ 2002-11-21  9:30   ` David Zaffiro
  2002-11-21 19:20     ` Willy Tarreau
  0 siblings, 1 reply; 17+ messages in thread
From: David Zaffiro @ 2002-11-21  9:30 UTC (permalink / raw)
  To: linux-kernel

I use -momit-leaf-frame-pointer for optimization in some of my own projects, 
instead of "-fomit-frame-pointer". For me, this results in better 
code size/speed compared to either "-fomit-frame-pointer" or no option at 
all. Actually gcc-2.95 seems to support this feature as well, but it 
never made it into the 2.95 docs... It makes debugging a lot easier too.

So anyone "caring to benchmark", could you please test the 
"-momit-leaf-frame-pointer" option for x86 as well...


Mark Mielke wrote:
> A few weeks ago I was surprised to find that code compiled with
> -fomit-frame-pointers reliably executed a few percentages slower.
> Since the functions I was testing were not anywhere big enough to
> fill even the I1 cache, I wrote it off as 'the CPU is obviously
> optimized to expect certain instruction sequences after call and
> before ret'. Something to think about anyways...



* Re: Compiling x86 with and without frame pointer
  2002-11-21  4:47 Compiling x86 with and without frame pointer Keith Owens
  2002-11-21  5:06 ` Mark Mielke
@ 2002-11-21 12:55 ` Dave Jones
  2002-11-21 14:46   ` Alan Cox
  2002-11-21 17:44 ` Martin J. Bligh
  2 siblings, 1 reply; 17+ messages in thread
From: Dave Jones @ 2002-11-21 12:55 UTC (permalink / raw)
  To: Keith Owens; +Cc: linux-kernel

On Thu, Nov 21, 2002 at 03:47:13PM +1100, Keith Owens wrote:
 > The conventional wisdom is that compiling x86 without frame pointer
 > results in smaller code.  It turns out to be the opposite, compiling
 > with frame pointers results in a smaller kernel.  gcc version 3.2
 > 20020822 (Red Hat Linux Rawhide 3.2-4).

I've been pushing a forward port of the CONFIG_FRAME_POINTER changes
that went into 2.4 for a while, but Linus hasn't taken them any of the
times I sent them.  I'll keep pushing until I get a comment..

		Dave

-- 
| Dave Jones.        http://www.codemonkey.org.uk
| SuSE Labs


* Re: Compiling x86 with and without frame pointer
  2002-11-21 12:55 ` Dave Jones
@ 2002-11-21 14:46   ` Alan Cox
  0 siblings, 0 replies; 17+ messages in thread
From: Alan Cox @ 2002-11-21 14:46 UTC (permalink / raw)
  To: Dave Jones; +Cc: Keith Owens, Linux Kernel Mailing List

On Thu, 2002-11-21 at 12:55, Dave Jones wrote:
> On Thu, Nov 21, 2002 at 03:47:13PM +1100, Keith Owens wrote:
>  > The conventional wisdom is that compiling x86 without frame pointer
>  > results in smaller code.  It turns out to be the opposite, compiling
>  > with frame pointers results in a smaller kernel.  gcc version 3.2
>  > 20020822 (Red Hat Linux Rawhide 3.2-4).
> 
> I've been pushing a forward port of the CONFIG_FRAME_POINTER changes
> that went into 2.4 for a while, but Linus hasn't taken them each time.
> I'll keep pushing until I get a comment..

Send it this way 8)



* Re: Compiling x86 with and without frame pointer
  2002-11-21  4:47 Compiling x86 with and without frame pointer Keith Owens
  2002-11-21  5:06 ` Mark Mielke
  2002-11-21 12:55 ` Dave Jones
@ 2002-11-21 17:44 ` Martin J. Bligh
  2002-11-21 23:47   ` Rudmer van Dijk
  2002-11-25  8:59   ` David Zaffiro
  2 siblings, 2 replies; 17+ messages in thread
From: Martin J. Bligh @ 2002-11-21 17:44 UTC (permalink / raw)
  To: Keith Owens, linux-kernel; +Cc: David Zaffiro

> The conventional wisdom is that compiling x86 without frame pointer
> results in smaller code.  It turns out to be the opposite, compiling
> with frame pointers results in a smaller kernel.  gcc version 3.2
> 20020822 (Red Hat Linux Rawhide 3.2-4).

I looked at 2.5.47 (with a splattering of performance patches) using 
gcc 2.95.4 (Debian Woody), on a 16-way NUMA-Q, and did some kernel
compile testing. The times to do the tests were almost identical
(within error noise), but the kernel was indeed smaller with frame
pointers:

   text    data     bss     dec     hex filename
1873293  396231  459388 2728912  29a3d0 2.5.47-mjb1/vmlinux
1427355  396875  455356 2279586  22c8a2 2.5.47-mjb1-frameptr/vmlinux

Wow ... that's quite some difference ;-)

> I use -momit-leaf-frame-pointer for optimization in some own 
> projects, instead of the "-fomit-frame-pointer". For me, this 
> results in better codesize/speed compared to both "-fomit-frame-pointer" 
> or no option at all. Actually gcc-2.95 seems to support this feature 
> as well, but it never made it into the 2.95 docs...

I tried this, but it seemed to be the same as -fomit-frame-pointer
(on 2.95 at least).

Given that leaving out -fomit-frame-pointer makes a smaller kernel
that's easier to debug, I'd say this is a good thing to do unless someone
can get *negative* benchmark results. 

M.



* Re: Compiling x86 with and without frame pointer
  2002-11-21  9:30   ` David Zaffiro
@ 2002-11-21 19:20     ` Willy Tarreau
  2002-11-21 19:32       ` Doug Ledford
  2002-11-25  8:47       ` David Zaffiro
  0 siblings, 2 replies; 17+ messages in thread
From: Willy Tarreau @ 2002-11-21 19:20 UTC (permalink / raw)
  To: David Zaffiro; +Cc: linux-kernel

On Thu, Nov 21, 2002 at 10:30:49AM +0100, David Zaffiro wrote:
> I use -momit-leaf-frame-pointer for optimization in some own projects, 
> instead of the "-fomit-frame-pointer". For me, this results in better 
> codesize/speed compared to both "-fomit-frame-pointer" or no option at 
> all. Actually gcc-2.95 seems to support this feature as well, but it 
> never made it into the 2.95 docs... It makes debugging a lot easier too.
> 
> So anyone "caring to benchmark", could you please test the 
> "-momit-leaf-frame-pointer" option for x86 as well...

Well, I tried on a 2.4.18+patches with gcc 2.95.3. bzImage is :
538481 bytes with -fomit-frame-pointer
538510 bytes with no particular flag
542137 bytes with -momit-leaf-frame-pointer.

So -fomit-frame-pointer shows the same as others' observations, but in this
particular case, -momit-leaf-frame-pointer made a slightly bigger kernel.

Didn't have time to inspect all sections, though.

Cheers,
Willy



* Re: Compiling x86 with and without frame pointer
  2002-11-21 19:20     ` Willy Tarreau
@ 2002-11-21 19:32       ` Doug Ledford
  2002-11-21 19:41         ` Willy Tarreau
  2002-11-25  8:47       ` David Zaffiro
  1 sibling, 1 reply; 17+ messages in thread
From: Doug Ledford @ 2002-11-21 19:32 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: David Zaffiro, linux-kernel

On Thu, Nov 21, 2002 at 08:20:45PM +0100, Willy Tarreau wrote:
> On Thu, Nov 21, 2002 at 10:30:49AM +0100, David Zaffiro wrote:
> > I use -momit-leaf-frame-pointer for optimization in some own projects, 
> > instead of the "-fomit-frame-pointer". For me, this results in better 
> > codesize/speed compared to both "-fomit-frame-pointer" or no option at 
> > all. Actually gcc-2.95 seems to support this feature as well, but it 
> > never made it into the 2.95 docs... It makes debugging a lot easier too.
> > 
> > So anyone "caring to benchmark", could you please test the 
> > "-momit-leaf-frame-pointer" option for x86 as well...
> 
> Well, I tried on a 2.4.18+patches with gcc 2.95.3. bzImage is :
> 538481 bytes with -fomit-frame-pointer
> 538510 bytes with no particular flag
> 542137 bytes with -momit-leaf-frame-pointer.

These numbers are useless.  Since a change in frame pointer setup changes 
the code sequences in the text section, it is likely to also change the 
maximum achieved compression.  Therefore, the sizes of the compressed 
images cannot be compared to yield any usable data; you need to 
compare the size of the uncompressed images.
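
As a toy illustration of this point, the Python sketch below builds two
synthetic byte strings (not kernel images) and compresses them with
zlib.  The data, the sizes and every name in it are assumptions made up
for the example; it only shows that a larger input with more repeated
sequences can compress to less than a smaller input without them.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable


def noise(n):
    """n pseudo-random bytes, standing in for code that compresses badly."""
    return bytes(rng.randrange(256) for _ in range(n))


# Synthetic "images", not kernel data.  image_a is larger but contains a
# repeated 3-byte sequence (55 89 e5, the push %ebp / mov %esp,%ebp
# encoding); image_b is smaller but has no repetition to exploit.
image_a = bytes.fromhex("5589e5") * 1000 + noise(1000)  # 4000 bytes
image_b = noise(3500)                                   # 3500 bytes

packed_a = zlib.compress(image_a, 9)
packed_b = zlib.compress(image_b, 9)

print(len(image_a), len(packed_a))
print(len(image_b), len(packed_b))
```

The ordering of the uncompressed sizes and the ordering of the
compressed sizes come out opposite, so one cannot be inferred from the
other.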

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  


* Re: Compiling x86 with and without frame pointer
  2002-11-21 19:32       ` Doug Ledford
@ 2002-11-21 19:41         ` Willy Tarreau
  2002-11-21 20:00           ` Doug Ledford
  0 siblings, 1 reply; 17+ messages in thread
From: Willy Tarreau @ 2002-11-21 19:41 UTC (permalink / raw)
  To: Willy Tarreau, David Zaffiro, linux-kernel

On Thu, Nov 21, 2002 at 02:32:31PM -0500, Doug Ledford wrote:
> On Thu, Nov 21, 2002 at 08:20:45PM +0100, Willy Tarreau wrote:
> > On Thu, Nov 21, 2002 at 10:30:49AM +0100, David Zaffiro wrote:
> > > I use -momit-leaf-frame-pointer for optimization in some own projects, 
> > > instead of the "-fomit-frame-pointer". For me, this results in better 
> > > codesize/speed compared to both "-fomit-frame-pointer" or no option at 
> > > all. Actually gcc-2.95 seems to support this feature as well, but it 
> > > never made it into the 2.95 docs... It makes debugging a lot easier too.
> > > 
> > > So anyone "caring to benchmark", could you please test the 
> > > "-momit-leaf-frame-pointer" option for x86 as well...
> > 
> > Well, I tried on a 2.4.18+patches with gcc 2.95.3. bzImage is :
> > 538481 bytes with -fomit-frame-pointer
> > 538510 bytes with no particular flag
> > 542137 bytes with -momit-leaf-frame-pointer.
> 
> These numbers are useless.  Since a change in frame pointer setup changes 
> the code sequences in the text section, it is likely to also change 
> maximum acheived compression.  Therefore, the size of the compressed 
> images can not be compared and result in any useable data, you need to 
> compare the size of the uncompressed images.

Yes, you're quite right about this. I had been obsessed all day with reducing
a bzImage to fit it on a diskette, and didn't immediately realise that other
people were speaking of pure vmlinux in this discussion :-)

So I retried, and the difference in vmlinux between -fomit-frame-pointer and
-momit-leaf-frame-pointer is nearly 1 kB LESS for the latter (difference
in text only). So David was right here. Please also note that the code is
really less compressible, because 1 kB less gives 4 kB more after compression.
Even after upx, the difference is still 3 kB between the two images.

Anyway, the compressed size is sometimes more relevant than the vmlinux one,
when it comes to putting it on very limited devices such as diskettes. In my
case, I don't need this extra 1 kB of RAM; I prefer to have those 4 kB of
floppy image for another NIC driver!

I haven't benchmarked anything with these options. Maybe David's suggestion
is interesting for userland where compression is rarely used.

Cheers,
Willy



* Re: Compiling x86 with and without frame pointer
  2002-11-21 19:41         ` Willy Tarreau
@ 2002-11-21 20:00           ` Doug Ledford
  0 siblings, 0 replies; 17+ messages in thread
From: Doug Ledford @ 2002-11-21 20:00 UTC (permalink / raw)
  To: Willy Tarreau; +Cc: David Zaffiro, linux-kernel

On Thu, Nov 21, 2002 at 08:41:27PM +0100, Willy Tarreau wrote:
> Yes, you're quite right about this. I had my mind obsessed all the day reducing
> a bzImage to fit it on a diskette, and didn't immediately realise that other
> people were speaking pure vmlinux in this discussion :-)

I had thought about that as well, but then my answer was that if the 
space is that important on the floppy, then we (meaning Red Hat) could 
compile our BOOT kernel with whatever option gave the smallest compressed 
image and compile installed kernels with whatever gave actual best 
performance (with a + given to kernels that have a frame pointer in the 
event of a tie or insignificant performance difference).

Of course you may be talking about a system that always boots from floppy 
and sits in some closet for years or some embedded system where that 4k in 
flash is super important, so situational decision rules apply ;-)

-- 
  Doug Ledford <dledford@redhat.com>     919-754-3700 x44233
         Red Hat, Inc. 
         1801 Varsity Dr.
         Raleigh, NC 27606
  


* Re: Compiling x86 with and without frame pointer
  2002-11-21 17:44 ` Martin J. Bligh
@ 2002-11-21 23:47   ` Rudmer van Dijk
  2002-11-25  8:59   ` David Zaffiro
  1 sibling, 0 replies; 17+ messages in thread
From: Rudmer van Dijk @ 2002-11-21 23:47 UTC (permalink / raw)
  To: Keith Owens, linux-kernel; +Cc: David Zaffiro

On Thursday 21 November 2002 18:44, Martin J. Bligh wrote:
> > The conventional wisdom is that compiling x86 without frame pointer
> > results in smaller code.  It turns out to be the opposite, compiling
> > with frame pointers results in a smaller kernel.  gcc version 3.2
> > 20020822 (Red Hat Linux Rawhide 3.2-4).
> 
> I looked at 2.5.47 (with a splattering of performance patches) using 
> gcc 2.95.4 (Debian Woody), on a 16-way NUMA-Q, and did some kernel
> compile testing. The times to do the tests were almost identical
> (within error noise), but the kernel was indeed smaller
> 
>    text    data     bss     dec     hex filename
> 1873293  396231  459388 2728912  29a3d0 2.5.47-mjb1/vmlinux
> 1427355  396875  455356 2279586  22c8a2 2.5.47-mjb1-frameptr/vmlinux
> 
> Wow ... that's quite some difference ;-)

I also tried it, but it is not that big a difference:

   text    data     bss     dec     hex filename  flags
1991125  306324  270484 2567933  272efd vmlinux    -fomit-frame-pointer
1981477  306324  270484 2558285  27094d vmlinux    
1990965  306324  270484 2567773  272e5d vmlinux    -momit-leaf-frame-pointer

this was with gcc 2.95.3 and binutils 2.12 on my lfs system
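
To make the comparison explicit, here is a small Python sketch that
computes the text-size difference of each build against the build with
no flag.  The numbers are copied from the table above; the "(no flag)"
label and the function name are only illustrative.

```python
# Rows copied from the table above: (flags, text, data, bss).
ROWS = [
    ("-fomit-frame-pointer",      1991125, 306324, 270484),
    ("(no flag)",                 1981477, 306324, 270484),
    ("-momit-leaf-frame-pointer", 1990965, 306324, 270484),
]


def text_deltas(rows, baseline="(no flag)"):
    """Map each flag to its text-size difference from the baseline build."""
    base_text = next(text for flags, text, _, _ in rows if flags == baseline)
    return {flags: text - base_text for flags, text, _, _ in rows}


deltas = text_deltas(ROWS)
for flags, delta in deltas.items():
    print(f"{flags:28s} {delta:+d} bytes of text")
```

Both flags add a little under 10 kB of text here, and data and bss are
identical across the three builds.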

        Rudmer
> 
> > I use -momit-leaf-frame-pointer for optimization in some own 
> > projects, instead of the "-fomit-frame-pointer". For me, this 
> > results in better codesize/speed compared to both "-fomit-frame-pointer" 
> > or no option at all. Actually gcc-2.95 seems to support this feature 
> > as well, but it never made it into the 2.95 docs...
> 
> I tried this, but it seemed to be the same as -fomit-frame-pointer
> (on 2.95 at least).
> 
> Given that omitting the -fomit-frame-pointer makes a smaller kernel,
> that's easier to debug, I'd say this is a good thing to do unless someone
> can get *negative* benchmark results. 
> 
> M.

* Re: Compiling x86 with and without frame pointer
  2002-11-21 19:20     ` Willy Tarreau
  2002-11-21 19:32       ` Doug Ledford
@ 2002-11-25  8:47       ` David Zaffiro
  2002-11-25  8:52         ` Willy Tarreau
  2002-11-25 15:00         ` Denis Vlasenko
  1 sibling, 2 replies; 17+ messages in thread
From: David Zaffiro @ 2002-11-25  8:47 UTC (permalink / raw)
  To: willy; +Cc: linux-kernel

I can understand why not omitting frame pointers generates more 
compressible code, since every function will start with:
	push   %ebp
	mov    %esp,%ebp
and end with:
	leave
	ret

But it's harder to find a reason why -fomit-frame-pointer is more 
compressible than -momit-leaf-frame-pointer (but it's probably related 
to a lot of mov's with the stack pointer involved), especially since 
"-momit-leaf-frame-pointer" makes a trade-off between both other 
options: it omits frame pointers for leaf functions (callees that aren't 
callers as well) and it doesn't for branch functions.
The mixture of functions with frame pointers and those without is 
probably causing the compressor to compress less optimally.
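
The leaf/branch distinction can be sketched in a few lines of Python.
This models only the description given in this message, not gcc's
actual implementation, and the call graph and all function names in it
are hypothetical placeholders.

```python
# Hypothetical call graph: function name -> functions it calls.
# None of these names come from the kernel; they are placeholders.
CALLS = {
    "handle_request": ["parse_header", "checksum"],
    "parse_header": ["checksum"],
    "checksum": [],
}


def is_leaf(name, calls):
    """A leaf function is one that calls no other function."""
    return not calls.get(name)


def keeps_frame_pointer(name, calls):
    """Model of the described behaviour: only non-leaf functions keep ebp."""
    return not is_leaf(name, calls)


plan = {name: keeps_frame_pointer(name, CALLS) for name in CALLS}
print(plan)
```

In this sketch only checksum loses its frame pointer; the two functions
that make calls keep theirs.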

Anyway it makes me wonder whether kernel compilation shouldn't be 
configurable between an "optimize for (compressed image) size" and an 
"optimize for speed" option... I'd go for speed... (and always omitting 
frame pointers doesn't seem to be as fast as omitting them only in leaf 
functions).


> Well, I tried on a 2.4.18+patches with gcc 2.95.3. bzImage is :
> 538481 bytes with -fomit-frame-pointer
> 538510 bytes with no particular flag
> 542137 bytes with -momit-leaf-frame-pointer.
> 
> So -fomit-frame-pointer shows the same as other's observation, but in this
> particular case, -momit-leaf-frame-pointer made a slightly bigger kernel.



* Re: Compiling x86 with and without frame pointer
  2002-11-25  8:47       ` David Zaffiro
@ 2002-11-25  8:52         ` Willy Tarreau
  2002-11-25 14:55           ` Denis Vlasenko
  2002-11-25 15:00         ` Denis Vlasenko
  1 sibling, 1 reply; 17+ messages in thread
From: Willy Tarreau @ 2002-11-25  8:52 UTC (permalink / raw)
  To: David Zaffiro; +Cc: willy, linux-kernel

 
> Anyway it makes me wonder, whether kernelcompilation shouldn't be 
> configurable between a "optimize for (compressed image) size" and a 
> "optimize for speed" option... I'd go for speed... (and always omitting 
> frame-pointers doesn't seem to as fast as omitting them only in leaf 
> functions).

hehe :-)
I've put this in my kernels for about 2 years now. You can also reduce the
image size with -malign-jumps=0 -mpreferred-stack-boundary=2 and -mcpu=i386.

I also use some other options, but don't have them at hand right now. But it
basically gives me slightly smaller kernels, which is pretty good for install
CDs or diskettes.

Cheers,
Willy


* Re: Compiling x86 with and without frame pointer
  2002-11-21 17:44 ` Martin J. Bligh
  2002-11-21 23:47   ` Rudmer van Dijk
@ 2002-11-25  8:59   ` David Zaffiro
  1 sibling, 0 replies; 17+ messages in thread
From: David Zaffiro @ 2002-11-25  8:59 UTC (permalink / raw)
  To: Martin J. Bligh; +Cc: Keith Owens, linux-kernel

> I looked at 2.5.47 (with a splattering of performance patches) using 
> gcc 2.95.4 (Debian Woody), on a 16-way NUMA-Q, and did some kernel
> compile testing. The times to do the tests were almost identical
> (within error noise), but the kernel was indeed smaller
> 
>    text    data     bss     dec     hex filename
> 1873293  396231  459388 2728912  29a3d0 2.5.47-mjb1/vmlinux
> 1427355  396875  455356 2279586  22c8a2 2.5.47-mjb1-frameptr/vmlinux
> 

I can't think of any reason why the data and bss parts of the kernel would 
be influenced by a frame pointer option; this seems highly illogical. It 
shouldn't make any difference as far as I can tell. Maybe you altered 
other options as well? (Could be strange compiler behaviour though.)

Keith's results seem more reliable:

# size 2.4.20-rc2-*/vmlinux
    text    data     bss     dec     hex filename
2669584  337972  402697 3410253  34094d 2.4.20-rc2-fp/vmlinux
2676919  337972  402697 3417588  3425f4 2.4.20-rc2-nofp/vmlinux



* Re: Compiling x86 with and without frame pointer
  2002-11-25 15:00         ` Denis Vlasenko
@ 2002-11-25 11:57           ` David Zaffiro
  0 siblings, 0 replies; 17+ messages in thread
From: David Zaffiro @ 2002-11-25 11:57 UTC (permalink / raw)
  To: vda; +Cc: willy, linux-kernel

>>since "-momit-leaf-frame-pointer" makes a trade-off between both
>>other options: it omits framepointers for leaf functions (callees
>>that aren't callers as well) and it doesn't for branch-functions.
> 
> 
> Which does not sound quite right for me. FP should be omitted
> only if function contains less than half dozen stack references,
> otherwise not. It does not matter whether it is a leaf function or not.

Leaf functions generally do not contain more than half a dozen 
stack references, and are generally called as often as their callers or 
more often. The slight overhead of leaf functions that do contain a dozen 
stack references is much smaller than the overhead of omitting 
frame pointers in /all/ branch functions, including those with dozens of 
stack references. Maybe gcc's optimizer could be adapted in the (near) 
future to compare either the speed or the size of the code it could 
generate, with and without frame pointer, if the compile is not a debug one.

But in the meantime, in most "userland" projects I've tested with, 
-momit-leaf-frame-pointer resulted in almost the same code size as 
compiles with frame pointers, along with more or less the same speed as 
"-fomit-frame-pointer". I wouldn't know how to benchmark kernel configs 
though, and I haven't seen anyone doing this with the frame pointer 
options yet...


> OTOH, AFAIK frame pointers make debugging easier, development kernels
> are better to be compiled with fp in every func.

Honestly, I think that's a shortcoming of the debugger if that's true. 
The debugger could store the stack pointer position after a call or 
calculate it based on sub/add/push/pop's, instead of borrowing it from 
ebp.  I'm just concerned about the extra costs (in speed and size) of 
always omitting the frame pointer.

(It shouldn't be impossible to debug regparm- and stdcall-functions as 
well, I wonder why this could be a problem at the moment. But just 
"omitting framepointers" at least doesn't mix up the (IMHO: somewhat 
thoughtlessly defined) i386 32-bit C-callingconvention.)



* Re: Compiling x86 with and without frame pointer
  2002-11-25  8:52         ` Willy Tarreau
@ 2002-11-25 14:55           ` Denis Vlasenko
  0 siblings, 0 replies; 17+ messages in thread
From: Denis Vlasenko @ 2002-11-25 14:55 UTC (permalink / raw)
  To: Willy Tarreau, David Zaffiro; +Cc: willy, linux-kernel

On 25 November 2002 06:52, Willy Tarreau wrote:
> > Anyway it makes me wonder, whether kernelcompilation shouldn't be
> > configurable between a "optimize for (compressed image) size" and a
> > "optimize for speed" option... I'd go for speed... (and always
> > omitting frame-pointers doesn't seem to as fast as omitting them
> > only in leaf functions).
>
> hehe :-)
> I've put this in my kernels for about 2 years now. You can also
> reduce the image size with -malign-jumps=0
> -mpreferred-stack-boundary=2 and -mcpu=i386.

Hehe indeed ;)
--
vda


* Re: Compiling x86 with and without frame pointer
  2002-11-25  8:47       ` David Zaffiro
  2002-11-25  8:52         ` Willy Tarreau
@ 2002-11-25 15:00         ` Denis Vlasenko
  2002-11-25 11:57           ` David Zaffiro
  1 sibling, 1 reply; 17+ messages in thread
From: Denis Vlasenko @ 2002-11-25 15:00 UTC (permalink / raw)
  To: David Zaffiro, willy; +Cc: linux-kernel

On 25 November 2002 06:47, David Zaffiro wrote:
> I can understand why not omitting framepointers generates better
> compressible code, since every function will start with:
> 	push   %ebp
> 	mov    %esp,%ebp
> and end with:
> 	leave
> 	ret
>
> But it's harder to find a reason why -fomit-frame-pointer is better
> compressible that -momit-leaf-frame-pointer (but it's probably
> related to a lot of mov's with stackpointer involved), especially
> since "-momit-leaf-frame-pointer" makes a trade-off between both
> other options: it omits framepointers for leaf functions (callees
> that aren't callers as well) and it doesn't for branch-functions.

Which does not sound quite right to me. FP should be omitted
only if a function contains fewer than half a dozen stack references,
otherwise not. It does not matter whether it is a leaf function or not.

OTOH, AFAIK frame pointers make debugging easier, so development kernels
are better compiled with fp in every function.
--
vda
