Re: Compiling x86 with and without frame pointer

From: Mark Mielke <mark@mark.mielke.cc>
To: Keith Owens <kaos@ocs.com.au>
Cc: linux-kernel@vger.kernel.org
Subject: Re: Compiling x86 with and without frame pointer
Date: Thu, 21 Nov 2002 00:06:07 -0500
Message-ID: <20021121050607.GA1554@mark.mielke.cc>
In-Reply-To: <19005.1037854033@kao2.melbourne.sgi.com>

A few weeks ago I was surprised to find that code compiled with
-fomit-frame-pointer reliably executed a few percent slower.
Since the functions I was testing were nowhere near big enough to
fill even the L1 instruction cache, I wrote it off as 'the CPU is
obviously optimized to expect certain instruction sequences after
call and before ret'. Something to think about anyway...

mark


On Thu, Nov 21, 2002 at 03:47:13PM +1100, Keith Owens wrote:
> The conventional wisdom is that compiling x86 without frame pointers
> results in smaller code.  It turns out to be the opposite: compiling
> with frame pointers results in a smaller kernel.  Measured with gcc
> version 3.2 20020822 (Red Hat Linux Rawhide 3.2-4).
> 
> # size 2.4.20-rc2-*/vmlinux
>    text    data     bss     dec     hex filename
> 2669584  337972  402697 3410253  34094d 2.4.20-rc2-fp/vmlinux
> 2676919  337972  402697 3417588  3425f4 2.4.20-rc2-nofp/vmlinux
> 
> Without frame pointers, vmlinux is about 7K bigger (7335 bytes of
> text).  The difference is that code with frame pointers can use ebp
> to directly access the stack; without frame pointers it has to use
> esp with an index.
> 
> With frame pointers:
> 
> 00000c10 <inet_dgram_connect>:
>      c10:       55                      push   %ebp
>      c11:       89 e5                   mov    %esp,%ebp
>      c13:       83 ec 14                sub    $0x14,%esp
>      c16:       89 75 fc                mov    %esi,0xfffffffc(%ebp)
>      c19:       8b 45 08                mov    0x8(%ebp),%eax
>      c1c:       8b 75 0c                mov    0xc(%ebp),%esi
>      c1f:       89 5d f8                mov    %ebx,0xfffffff8(%ebp)
>      c22:       8b 58 18                mov    0x18(%eax),%ebx
>      c25:       66 83 3e 00             cmpw   $0x0,(%esi)
>      c29:       74 3d                   je     c68 <inet_dgram_connect+0x58>
> 
> Without frame pointers:
> 
> 00000c10 <inet_dgram_connect>:
>      c10:       83 ec 14                sub    $0x14,%esp
>      c13:       8b 44 24 18             mov    0x18(%esp,1),%eax
>      c17:       89 74 24 10             mov    %esi,0x10(%esp,1)
>      c1b:       8b 74 24 1c             mov    0x1c(%esp,1),%esi
>      c1f:       89 5c 24 0c             mov    %ebx,0xc(%esp,1)
>      c23:       8b 58 18                mov    0x18(%eax),%ebx
>      c26:       66 83 3e 00             cmpw   $0x0,(%esi)
>      c2a:       74 44                   je     c70 <inet_dgram_connect+0x60>
> 
> The difference is that stack accesses via ebp are 3 bytes, stack
> accesses via esp+index are 4 bytes.  On any function with a large
> number of stack accesses, this quickly outweighs the extra prologue
> code for frame pointers.
> 
> The smaller code will improve icache usage.  Whether this is
> offset by the increased register pressure is something for
> benchmarking.  Would any of the benchmarkers care to test x86
> kernels with and without frame pointers?
> 
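
[Editor's sketch, not part of either message.] The byte-count argument
in the quoted message can be checked with a little arithmetic. The
instruction lengths below are read off the two objdump listings quoted
above, from function entry through the first conditional jump. The
names WITH_FP, WITHOUT_FP, net_saving and text_growth are hypothetical,
and the break-even model is a simplification: it assumes every stack
access fits an 8-bit displacement and it ignores the epilogue, which
the listings do not show.

```python
# Instruction lengths in bytes, taken from the quoted objdump output
# (function entry through the first conditional jump).
WITH_FP = [1, 2, 3, 3, 3, 3, 3, 3, 4, 2]   # push, mov, sub, 4 ebp accesses, mov, cmpw, je
WITHOUT_FP = [3, 4, 4, 4, 4, 3, 4, 2]      # sub, 4 esp accesses, mov, cmpw, je

# Assumptions of this simplified model (not stated in the original post):
PROLOGUE_COST = 3       # push %ebp (1 byte) + mov %esp,%ebp (2 bytes)
SAVING_PER_ACCESS = 1   # esp-relative addressing needs an extra SIB byte


def net_saving(stack_accesses):
    """Bytes saved by keeping the frame pointer, ignoring the epilogue."""
    return stack_accesses * SAVING_PER_ACCESS - PROLOGUE_COST


def text_growth(fp_text, nofp_text):
    """Growth of the text segment when frame pointers are omitted."""
    return nofp_text - fp_text


if __name__ == "__main__":
    print("excerpt with frame pointer:   ", sum(WITH_FP), "bytes")
    print("excerpt without frame pointer:", sum(WITHOUT_FP), "bytes")
    print("net saving for 4 stack accesses:", net_saving(4), "byte(s)")
    print("vmlinux text growth:", text_growth(2669584, 2676919), "bytes")
```

Under these assumptions the frame pointer pays for its prologue once a
function makes more than three stack accesses, which matches the
one-byte difference between the two excerpts. Whether that carries over
to run time is a separate question that only benchmarking can answer.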

-- 
mark@mielke.cc/markm@ncf.ca/markm@nortelnetworks.com __________________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
                       and in the darkness bind them...

                           http://mark.mielke.cc/
