From: Frederic Weisbecker <fweisbec@gmail.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: linux-kernel@vger.kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Jan Beulich <JBeulich@suse.com>,
	Arjan van de Ven <arjan@infradead.org>,
	Alexander van Heukelum <heukelum@fastmail.fm>
Subject: Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
Date: Fri, 16 Dec 2011 15:01:33 +0100
Message-ID: <20111216140130.GA22873@somewhere.redhat.com>
In-Reply-To: <20111216081915.GA28288@elte.hu>

On Fri, Dec 16, 2011 at 09:19:16AM +0100, Ingo Molnar wrote:
> 
> This patch turns on -momit-leaf-frame-pointer on x86 builds and 
> thus shrinks .text noticeably. On a defconfig-ish kernel:
> 
>    text	   data	    bss	    dec	    hex	filename
>    9843902	1935808	3649536	15429246	 eb6e7e	vmlinux.before
>    9813764	1935792	3649536	15399092	 eaf8b4	vmlinux.after
> 
> That's 0.3% off text size.
> 
> The actual win is larger than this percentage suggests: many 
> small, hot helper functions such as find_next_bit(), 
> do_raw_spin_lock() or most of the list_*() functions are leaf 
> functions and are now shorter by 2 instructions.
> 
> Probably a good chunk of the framepointers related runtime 
> overhead on common workloads is eliminated via this patch, as 
> small leaf functions execute more often than larger parent 
> functions.
> 
> The call-chains are still intact for quality backtraces and for 
> call-chain profiling (perf record -g), as the backtrace walker 
> can deduce the full backtrace from the RIP of a leaf function 
> and the parent chain.

Probably not, actually. We are going to miss the parent of those
leaf functions all the time in the stacktrace.

Consider an irq interrupting the following chain:

spin_lock() -> raw_spin_lock() -> do_raw_spin_lock()

And we do a stacktrace on top of the interrupted regs.

What we typically do is include regs->ip as the first entry
(as perf does), or make it obvious in a bug stacktrace. Then we
either purely walk through regs->bp (perf), or walk the stack and
validate each entry with regs->bp (bug stacktraces).

If do_raw_spin_lock() is a leaf function, the following happens:

1) dump regs->ip = do_raw_spin_lock()
2) then use regs->bp to find the return address. But bp was last
saved and set up in the parent, raw_spin_lock(), so the return address
we find is the parent's own, which points into spin_lock() and not
raw_spin_lock()
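
To make the two steps above concrete, here is a toy model of a pure
frame-pointer walk. It is only a sketch and not the kernel's unwinder:
struct frame, struct regs, walk(), the "caller" entry and the two
regs_* constants are all made up for illustration, and return
addresses are reduced to the name of the function they point into.

```c
#include <stddef.h>

/* One saved-bp/return-address pair, as set up by a function prologue. */
struct frame {
	int saved_bp;     /* index of the caller's frame, -1 ends the chain */
	const char *ret;  /* function that the return address points into */
};

/* The two fields of the interrupted registers that the walk uses. */
struct regs {
	const char *ip;   /* function that was executing */
	int bp;           /* index of the frame that bp points to */
};

/* Hypothetical chain: caller -> spin_lock -> raw_spin_lock -> do_raw_spin_lock */
static const struct frame stack[] = {
	{ -1, "caller" },        /* [0] spin_lock()'s frame */
	{  0, "spin_lock" },     /* [1] raw_spin_lock()'s frame */
	{  1, "raw_spin_lock" }, /* [2] do_raw_spin_lock()'s frame, if it has one */
};

/* Every function keeps a frame pointer: bp points to the leaf's own frame. */
static const struct regs regs_with_fp = { "do_raw_spin_lock", 2 };

/* Leaf frame pointer omitted: bp still points to raw_spin_lock()'s frame. */
static const struct regs regs_leaf_omitted = { "do_raw_spin_lock", 1 };

/* Record regs->ip first, then follow the bp chain (the perf-style walk). */
size_t walk(const struct regs *regs, const char **out, size_t max)
{
	size_t n = 0;

	if (n < max)
		out[n++] = regs->ip;
	for (int bp = regs->bp; bp >= 0 && n < max; bp = stack[bp].saved_bp)
		out[n++] = stack[bp].ret;
	return n;
}
```

Under these assumptions, walking from regs_with_fp yields four entries
with raw_spin_lock right after do_raw_spin_lock, while walking from
regs_leaf_omitted yields three entries and jumps straight from
do_raw_spin_lock to spin_lock.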

We are luckier with the paranoid stack walking done for bug reports,
because there we at least somehow find the return address of
do_raw_spin_lock(), but it will appear with a "?" because it won't be
validated by the frame pointer.
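
As a rough illustration of that paranoid walk, the sketch below scans
every stack slot and marks a return address as reliable only when it
sits right above the slot that bp currently points to. It is a
simplified model and not the kernel's code: struct slot, struct entry,
scan_stack(), the "caller" entry and the slot layout are assumptions
made for this example, with slot 0 being the top of the stack at the
time of the interrupt.

```c
#include <stddef.h>

/* One word on the stack: either a return address or plain data. */
struct slot {
	const char *text; /* non-NULL: return address into this function */
	int value;        /* data slots only: a saved bp as a slot index, -1 = none */
};

/* One line of the resulting trace; reliable == 0 stands for the "?" prefix. */
struct entry {
	const char *func;
	int reliable;
};

/* Stack at interrupt time, with do_raw_spin_lock() built without a frame. */
static const struct slot stack[] = {
	{ "raw_spin_lock", 0 }, /* [0] pushed by the call to do_raw_spin_lock() */
	{ NULL, 3 },            /* [1] raw_spin_lock()'s saved bp, regs->bp points here */
	{ "spin_lock", 0 },     /* [2] raw_spin_lock()'s return address */
	{ NULL, -1 },           /* [3] spin_lock()'s saved bp, end of the chain */
	{ "caller", 0 },        /* [4] spin_lock()'s return address */
};

/* Report every return address found, validated against the bp chain. */
size_t scan_stack(int bp, struct entry *out, size_t max)
{
	size_t nslots = sizeof(stack) / sizeof(stack[0]);
	size_t n = 0;

	for (size_t i = 0; i < nslots && n < max; i++) {
		if (!stack[i].text)
			continue; /* data word, not a return address */

		/* Reliable only if it is the slot right above the saved bp. */
		int reliable = (bp >= 0 && i == (size_t)bp + 1);

		out[n].func = stack[i].text;
		out[n].reliable = reliable;
		n++;
		if (reliable)
			bp = stack[bp].value; /* move on to the caller's frame */
	}
	return n;
}
```

In this model, scan_stack(1, ...) reports raw_spin_lock as unreliable
(the "?" case), followed by spin_lock and caller as reliable entries.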

I'm not sure we can work around that, unless we can find a fast way
to identify, while unwinding, which functions have had their frame
pointer stripped, in which case we can use some black magic. And even
then, we can do something reliable only if we ensure the leaf function
has no stack frame (otherwise we can't reliably find its return address).

> 
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
>  arch/x86/Makefile |    8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> Index: linux/arch/x86/Makefile
> ===================================================================
> --- linux.orig/arch/x86/Makefile
> +++ linux/arch/x86/Makefile
> @@ -72,6 +72,14 @@ else
>          KBUILD_CFLAGS += -maccumulate-outgoing-args
>  endif
>  
> +#
> +# This shrinks many small functions, we don't actually
> +# need their frame pointer, in backtraces the RIP will
> +# identify the function and the stack frame walker will
> +# find the parent function:
> +#
> +KBUILD_CFLAGS += $(call cc-option,-momit-leaf-frame-pointer)
> +
>  ifdef CONFIG_CC_STACKPROTECTOR
>  	cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
>          ifeq ($(shell $(CONFIG_SHELL) $(cc_has_sp) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)

