* [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
@ 2011-12-16 8:19 Ingo Molnar
2011-12-16 8:48 ` Andrew Morton
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Ingo Molnar @ 2011-12-16 8:19 UTC (permalink / raw)
To: linux-kernel
Cc: H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Frédéric Weisbecker, Linus Torvalds, Andrew Morton,
Jan Beulich, Arjan van de Ven, Alexander van Heukelum
This patch turns on -momit-leaf-frame-pointer on x86 builds and
thus shrinks .text noticeably. On a defconfig-ish kernel:
   text    data     bss      dec    hex filename
9843902 1935808 3649536 15429246 eb6e7e vmlinux.before
9813764 1935792 3649536 15399092 eaf8b4 vmlinux.after
That's 0.3% off text size.
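For anyone who wants to cross-check that percentage against the size output, here is a minimal sketch in C. The helper name text_savings_bp() is made up for illustration and is not part of the patch; it reports the shrinkage in hundredths of a percent so that integer arithmetic is enough.

```c
#include <assert.h>

/*
 * Hypothetical helper (not part of the patch): relative .text shrinkage
 * in hundredths of a percent, rounded down, using integer math only.
 */
unsigned long text_savings_bp(unsigned long before, unsigned long after)
{
	/* Only meaningful when the section actually shrank. */
	assert(before > 0 && after <= before);

	return (before - after) * 10000UL / before;
}
```

Fed the two text sizes above (9843902 and 9813764), the difference is 30138 bytes and the helper returns 30, i.e. roughly 0.3%.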
The actual win is larger than this percentage suggests: many
small, hot helper functions such as find_next_bit(),
do_raw_spin_lock() or most of the list_*() functions are leaf
functions and are now shorter by 2 instructions.
A good chunk of the frame-pointer-related runtime overhead on
common workloads is probably eliminated by this patch, as
small leaf functions execute more often than larger parent
functions.
The call-chains are still intact for quality backtraces and for
call-chain profiling (perf record -g), as the backtrace walker
can deduce the full backtrace from the RIP of a leaf function
and the parent chain.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
arch/x86/Makefile | 8 ++++++++
1 file changed, 8 insertions(+)
Index: linux/arch/x86/Makefile
===================================================================
--- linux.orig/arch/x86/Makefile
+++ linux/arch/x86/Makefile
@@ -72,6 +72,14 @@ else
KBUILD_CFLAGS += -maccumulate-outgoing-args
endif
+#
+# This shrinks many small functions, we don't actually
+# need their frame pointer, in backtraces the RIP will
+# identify the function and the stack frame walker will
+# find the parent function:
+#
+KBUILD_CFLAGS += $(call cc-option,-momit-leaf-frame-pointer)
+
ifdef CONFIG_CC_STACKPROTECTOR
cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
ifeq ($(shell $(CONFIG_SHELL) $(cc_has_sp) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:19 [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size Ingo Molnar
@ 2011-12-16 8:48 ` Andrew Morton
2011-12-16 8:54 ` Ingo Molnar
2011-12-16 8:53 ` Ingo Molnar
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2011-12-16 8:48 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Frédéric Weisbecker, Linus Torvalds, Jan Beulich,
Arjan van de Ven, Alexander van Heukelum
On Fri, 16 Dec 2011 09:19:16 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> This patch turns on -momit-leaf-frame-pointer on x86 builds and
> thus shrinks .text noticeably. On a defconfig-ish kernel:
>
> text data bss dec hex filename
> 9843902 1935808 3649536 15429246 eb6e7e vmlinux.before
> 9813764 1935792 3649536 15399092 eaf8b4 vmlinux.after
>
> That's 0.3% off text size.
>
> The actual win is larger than this percentage suggests: many
> small, hot helper functions such as find_next_bit(),
> do_raw_spin_lock() or most of the list_*() functions are leaf
> functions and are now shorter by 2 instructions.
>
> Probably a good chunk of the framepointers related runtime
> overhead on common workloads is eliminated via this patch, as
> small leaf functions execute more often than larger parent
> functions.
>
> The call-chains are still intact for quality backtraces and for
> call-chain profiling (perf record -g), as the backtrace walker
> can deduct the full backtrace from the RIP of a leaf function
> and the parent chain.
The only problem I can think of (apart from tickling gcc bugs) is that
it might break __builtin_return_address(n) for n>0 with frame pointers
enabled? The only code I can find which does this is
drivers/isdn/hardware/mISDN/ and ftrace.
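For readers unfamiliar with the builtin, a minimal sketch in C follows. The function name return_address_of_caller() is hypothetical, and the sketch only uses level 0, which reads the current function's own return address. Levels above 0 have to walk saved frame pointers, which is why the concern above applies to n>0 only; the GCC documentation warns that nonzero levels can have unpredictable effects, so they are deliberately not exercised here.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: report where the current
 * call will return to. noinline keeps it a real call with its own
 * return address.
 */
__attribute__((noinline)) void *return_address_of_caller(void)
{
	/* Level 0 needs no frame-pointer walk. */
	void *ret = __builtin_return_address(0);

	assert(ret != 0);
	return ret;
}
```

The returned pointer lies inside whichever function made the call, so it can be resolved to a symbol the same way a backtrace entry would be.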
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:19 [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size Ingo Molnar
2011-12-16 8:48 ` Andrew Morton
@ 2011-12-16 8:53 ` Ingo Molnar
2011-12-16 9:23 ` Jeremy Fitzhardinge
2011-12-16 14:01 ` Frederic Weisbecker
2011-12-16 14:06 ` Frederic Weisbecker
3 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2011-12-16 8:53 UTC (permalink / raw)
To: linux-kernel
Cc: H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Frédéric Weisbecker, Linus Torvalds, Andrew Morton,
Jan Beulich, Arjan van de Ven, Alexander van Heukelum,
Jeremy Fitzhardinge, Konrad Rzeszutek Wilk
* Ingo Molnar <mingo@elte.hu> wrote:
> [...]
>
> The call-chains are still intact for quality backtraces and
> for call-chain profiling (perf record -g), as the backtrace
> walker can deduct the full backtrace from the RIP of a leaf
> function and the parent chain.
Hm, noticed one complication while looking at annotated assembly
code in perf top. Code doing function calls from within asm() is
incorrectly marked 'leaf' by GCC:
ffffffff812b82d8 <arch_local_save_flags>:
ffffffff812b82d8: ff 14 25 00 d9 c1 81 callq *0xffffffff81c1d900
ffffffff812b82df: c3 retq
So all the paravirt details will have to be fixed, so that GCC
is able to see that there's a real function call done inside.
Jeremy, Konrad?
Thanks,
Ingo
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:48 ` Andrew Morton
@ 2011-12-16 8:54 ` Ingo Molnar
0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2011-12-16 8:54 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-kernel, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Frédéric Weisbecker, Linus Torvalds, Jan Beulich,
Arjan van de Ven, Alexander van Heukelum
* Andrew Morton <akpm@linux-foundation.org> wrote:
> On Fri, 16 Dec 2011 09:19:16 +0100 Ingo Molnar <mingo@elte.hu> wrote:
>
> >
> > This patch turns on -momit-leaf-frame-pointer on x86 builds and
> > thus shrinks .text noticeably. On a defconfig-ish kernel:
> >
> > text data bss dec hex filename
> > 9843902 1935808 3649536 15429246 eb6e7e vmlinux.before
> > 9813764 1935792 3649536 15399092 eaf8b4 vmlinux.after
> >
> > That's 0.3% off text size.
> >
> > The actual win is larger than this percentage suggests: many
> > small, hot helper functions such as find_next_bit(),
> > do_raw_spin_lock() or most of the list_*() functions are leaf
> > functions and are now shorter by 2 instructions.
> >
> > Probably a good chunk of the framepointers related runtime
> > overhead on common workloads is eliminated via this patch, as
> > small leaf functions execute more often than larger parent
> > functions.
> >
> > The call-chains are still intact for quality backtraces and for
> > call-chain profiling (perf record -g), as the backtrace walker
> > can deduct the full backtrace from the RIP of a leaf function
> > and the parent chain.
>
> The only problem I can think of (apart from tickling gcc bugs) is that
> it might break __builtin_return_address(n) for n>0 with frame pointers
> enabled? The only code I can find which does this is
> drivers/isdn/hardware/mISDN/ and ftrace.
Well, AFAICS it won't really 'break' it but behave as if the
leaf function got inlined into the parent function. I think we
can live with that.
Thanks,
Ingo
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:53 ` Ingo Molnar
@ 2011-12-16 9:23 ` Jeremy Fitzhardinge
2011-12-16 10:20 ` Peter Zijlstra
2011-12-16 11:46 ` Jan Beulich
0 siblings, 2 replies; 12+ messages in thread
From: Jeremy Fitzhardinge @ 2011-12-16 9:23 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Frédéric Weisbecker, Linus Torvalds, Andrew Morton,
Jan Beulich, Arjan van de Ven, Alexander van Heukelum,
Konrad Rzeszutek Wilk
On 12/16/2011 12:53 AM, Ingo Molnar wrote:
> * Ingo Molnar <mingo@elte.hu> wrote:
>
>> [...]
>>
>> The call-chains are still intact for quality backtraces and
>> for call-chain profiling (perf record -g), as the backtrace
>> walker can deduct the full backtrace from the RIP of a leaf
>> function and the parent chain.
> Hm, noticed one complication while looking at annotated assembly
> code in perf top. Code doing function calls from within asm() is
> incorrectly marked 'leaf' by GCC:
>
> ffffffff812b82d8 <arch_local_save_flags>:
> ffffffff812b82d8: ff 14 25 00 d9 c1 81 callq *0xffffffff81c1d900
> ffffffff812b82df: c3 retq
>
> So all the paravirt details will have to be fixed, so that GCC
> is able to see that there's a real function call done inside.
> Jeremy, Konrad?
Um. So the issue is that a function that contains only pvops looks like
it's a leaf to gcc and it does some leaf-function optimisation?
How can we tell gcc the asm contains a call, or otherwise suppress the
"leaf function" classification?
The alternative is to just make it a plain C-level indirect call, but
then we'd lose all the patching and callee-save optimisations.
Any suggestions?
Thanks,
J
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 9:23 ` Jeremy Fitzhardinge
@ 2011-12-16 10:20 ` Peter Zijlstra
2011-12-16 16:27 ` Richard Henderson
2011-12-16 11:46 ` Jan Beulich
1 sibling, 1 reply; 12+ messages in thread
From: Peter Zijlstra @ 2011-12-16 10:20 UTC (permalink / raw)
To: Jeremy Fitzhardinge
Cc: Ingo Molnar, linux-kernel, H. Peter Anvin, Thomas Gleixner,
Frédéric Weisbecker, Linus Torvalds, Andrew Morton,
Jan Beulich, Arjan van de Ven, Alexander van Heukelum,
Konrad Rzeszutek Wilk, rth
On Fri, 2011-12-16 at 01:23 -0800, Jeremy Fitzhardinge wrote:
> On 12/16/2011 12:53 AM, Ingo Molnar wrote:
> > * Ingo Molnar <mingo@elte.hu> wrote:
> >
> >> [...]
> >>
> >> The call-chains are still intact for quality backtraces and
> >> for call-chain profiling (perf record -g), as the backtrace
> >> walker can deduct the full backtrace from the RIP of a leaf
> >> function and the parent chain.
> > Hm, noticed one complication while looking at annotated assembly
> > code in perf top. Code doing function calls from within asm() is
> > incorrectly marked 'leaf' by GCC:
> >
> > ffffffff812b82d8 <arch_local_save_flags>:
> > ffffffff812b82d8: ff 14 25 00 d9 c1 81 callq *0xffffffff81c1d900
> > ffffffff812b82df: c3 retq
> >
> > So all the paravirt details will have to be fixed, so that GCC
> > is able to see that there's a real function call done inside.
> > Jeremy, Konrad?
>
> Um. So the issue is that a function that contains only pvops looks like
> it's a leaf to gcc and it does some leaf-function optimisation?
>
> How can we tell gcc the asm contains a call, or otherwise suppress the
> "leaf function" classification?
>
> The alternative is to just make it a plain C-level indirect call, but
> then we'd lose all the patching and callee-save optimisations.
>
> Any suggestions?
Added Richard Henderson to CC.
I only found the function __attribute__((leaf)) to explicitly mark a
function as being a leaf function, but the documentation doesn't list
the inverse of that to explicitly mark it as _not_ being one.
I haven't done a git grep on the gcc sources yet since I seem to have
misplaced my gcc.git tree.
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 9:23 ` Jeremy Fitzhardinge
2011-12-16 10:20 ` Peter Zijlstra
@ 2011-12-16 11:46 ` Jan Beulich
2011-12-16 12:00 ` Ingo Molnar
1 sibling, 1 reply; 12+ messages in thread
From: Jan Beulich @ 2011-12-16 11:46 UTC (permalink / raw)
To: Ingo Molnar, Jeremy Fitzhardinge
Cc: Peter Zijlstra, Alexander van Heukelum, fweisbec,
Arjan van de Ven, Thomas Gleixner, Andrew Morton, Linus Torvalds,
Konrad Rzeszutek Wilk, linux-kernel, H. Peter Anvin
>>> On 16.12.11 at 10:23, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> On 12/16/2011 12:53 AM, Ingo Molnar wrote:
>> * Ingo Molnar <mingo@elte.hu> wrote:
>>
>>> [...]
>>>
>>> The call-chains are still intact for quality backtraces and
>>> for call-chain profiling (perf record -g), as the backtrace
>>> walker can deduct the full backtrace from the RIP of a leaf
>>> function and the parent chain.
Are you sure about that even if the leaf function uses rBP for a
different purpose?
>> Hm, noticed one complication while looking at annotated assembly
>> code in perf top. Code doing function calls from within asm() is
>> incorrectly marked 'leaf' by GCC:
>>
>> ffffffff812b82d8 <arch_local_save_flags>:
>> ffffffff812b82d8: ff 14 25 00 d9 c1 81 callq *0xffffffff81c1d900
>> ffffffff812b82df: c3 retq
>>
>> So all the paravirt details will have to be fixed, so that GCC
>> is able to see that there's a real function call done inside.
>> Jeremy, Konrad?
If the above is not a problem, wouldn't this simply result in a skipped
function layer?
Also, iirc it's not just pv-ops that uses calls within asm()-s.
> Um. So the issue is that a function that contains only pvops looks like
> it's a leaf to gcc and it does some leaf-function optimisation?
>
> How can we tell gcc the asm contains a call, or otherwise suppress the
> "leaf function" classification?
I'm afraid you can't without adding code (i.e. a dummy function call).
Jan
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 11:46 ` Jan Beulich
@ 2011-12-16 12:00 ` Ingo Molnar
2011-12-16 15:32 ` H. Peter Anvin
0 siblings, 1 reply; 12+ messages in thread
From: Ingo Molnar @ 2011-12-16 12:00 UTC (permalink / raw)
To: Jan Beulich
Cc: Jeremy Fitzhardinge, Peter Zijlstra, Alexander van Heukelum,
fweisbec, Arjan van de Ven, Thomas Gleixner, Andrew Morton,
Linus Torvalds, Konrad Rzeszutek Wilk, linux-kernel,
H. Peter Anvin
* Jan Beulich <JBeulich@suse.com> wrote:
> >>> On 16.12.11 at 10:23, Jeremy Fitzhardinge <jeremy@goop.org> wrote:
> > On 12/16/2011 12:53 AM, Ingo Molnar wrote:
> >> * Ingo Molnar <mingo@elte.hu> wrote:
> >>
> >>> [...]
> >>>
> >>> The call-chains are still intact for quality backtraces
> >>> and for call-chain profiling (perf record -g), as the
> >>> backtrace walker can deduct the full backtrace from the
> >>> RIP of a leaf function and the parent chain.
>
> Are you sure about that even if the leaf function uses rBP for
> a different purpose?
Well, i assumed that GCC does not mess with %bp in leaf
functions - a frame pointer is barely useful if it's destroyed
spuriously in leaf functions.
A quick grep of the assembly appears to support that assumption:
$ objdump -d vmlinux | grep ',%rbp$' | cut -d: -f2- | sort | uniq -c | sort -n | tail -10
3 48 89 d5 mov %rdx,%rbp
3 4c 89 cd mov %r9,%rbp
4 48 0f 45 e8 cmovne %rax,%rbp
4 48 83 cd ff or $0xffffffffffffffff,%rbp
5 4c 89 dd mov %r11,%rbp
7 48 21 fd and %rdi,%rbp
10 48 d3 e5 shl %cl,%rbp
14 48 85 ed test %rbp,%rbp
14 48 8b 6c 24 20 mov 0x20(%rsp),%rbp
31042 48 89 e5 mov %rsp,%rbp
%rbp is not touched, except in a few special assembly glue/entry
pieces of code.
> >> Hm, noticed one complication while looking at annotated
> >> assembly code in perf top. Code doing function calls from
> >> within asm() is incorrectly marked 'leaf' by GCC:
> >>
> >> ffffffff812b82d8 <arch_local_save_flags>:
> >> ffffffff812b82d8: ff 14 25 00 d9 c1 81 callq *0xffffffff81c1d900
> >> ffffffff812b82df: c3 retq
> >>
> >> So all the paravirt details will have to be fixed, so that
> >> GCC is able to see that there's a real function call done
> >> inside. Jeremy, Konrad?
>
> If the above is not a problem, wouldn't this simply result in
> a skipped function layer?
Yeah - i guess we can live with that, as long as the frame
pointer chain is otherwise usable and walkable.
Thanks,
Ingo
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:19 [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size Ingo Molnar
2011-12-16 8:48 ` Andrew Morton
2011-12-16 8:53 ` Ingo Molnar
@ 2011-12-16 14:01 ` Frederic Weisbecker
2011-12-16 14:06 ` Frederic Weisbecker
3 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2011-12-16 14:01 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Linus Torvalds, Andrew Morton, Jan Beulich, Arjan van de Ven,
Alexander van Heukelum
On Fri, Dec 16, 2011 at 09:19:16AM +0100, Ingo Molnar wrote:
>
> This patch turns on -momit-leaf-frame-pointer on x86 builds and
> thus shrinks .text noticeably. On a defconfig-ish kernel:
>
> text data bss dec hex filename
> 9843902 1935808 3649536 15429246 eb6e7e vmlinux.before
> 9813764 1935792 3649536 15399092 eaf8b4 vmlinux.after
>
> That's 0.3% off text size.
>
> The actual win is larger than this percentage suggests: many
> small, hot helper functions such as find_next_bit(),
> do_raw_spin_lock() or most of the list_*() functions are leaf
> functions and are now shorter by 2 instructions.
>
> Probably a good chunk of the framepointers related runtime
> overhead on common workloads is eliminated via this patch, as
> small leaf functions execute more often than larger parent
> functions.
>
> The call-chains are still intact for quality backtraces and for
> call-chain profiling (perf record -g), as the backtrace walker
> can deduct the full backtrace from the RIP of a leaf function
> and the parent chain.
Probably not actually. We are going to miss the parent of those
leaf functions all the time in the stacktrace.
Consider an irq interrupting the following chain:
spin_lock() -> raw_spin_lock() -> do_raw_spin_lock()
And we do a stacktrace on top of the interrupted regs.
What we typically do is include regs->ip as the first entry
(as perf does), or make it obvious in a bug stacktrace. Then we
either walk purely through regs->bp (perf), or walk the stack and
validate the entries against regs->bp (bug stacktraces).
If do_raw_spin_lock() is a leaf function, the following happens:
1) dump regs->ip = do_raw_spin_lock()
2) then use regs->bp to find the return address, but bp still points
to the frame set up by the parent, so the return address we find is the
parent's, which points into spin_lock() and not raw_spin_lock()
We are luckier with the paranoid stack walking done for bug reports,
because there we at least find the return address of do_raw_spin_lock()
somehow, but it will appear with a "?" because it won't be validated by
the frame pointer.
I'm not sure we can work around that, unless we can find fast ways
to identify which functions are concerned by this ripped frame pointer
while we are unwinding, in which case we can use some black magic. And
still, we can do something reliable only if we ensure the leaf function has
no stackframe (otherwise we can't reliably find its return address).
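The skipped entry can be reproduced with a toy model. The sketch below is plain C and simulates a downward-growing stack in an array; it is not kernel code, and every name in it (sim_stack, sim_call, sim_prologue, trace_spin_lock_chain, the FN_* labels) is invented for illustration. It assumes the usual x86 frame layout: the saved frame pointer sits at bp and the return address one word above it.

```c
#include <assert.h>
#include <stddef.h>

/* Labels standing in for "an address inside function X". */
enum fn { FN_NONE, FN_CALLER, FN_SPIN_LOCK, FN_RAW_SPIN_LOCK, FN_DO_RAW_SPIN_LOCK };

#define SIM_STACK_WORDS 16

struct sim_stack {
	long word[SIM_STACK_WORDS];
	int sp;		/* index of the most recently pushed word */
	int bp;		/* index of the current frame's saved-bp slot, -1 = none */
};

static void sim_push(struct sim_stack *s, long v)
{
	assert(s->sp > 0);
	s->word[--s->sp] = v;
}

/* A call pushes the return address, i.e. a location inside the caller. */
static void sim_call(struct sim_stack *s, enum fn return_into)
{
	sim_push(s, return_into);
}

/* Models "push %rbp; mov %rsp,%rbp". */
static void sim_prologue(struct sim_stack *s)
{
	sim_push(s, s->bp);
	s->bp = s->sp;
}

/* Pure frame-pointer walk: report ip first, then follow the bp chain. */
static size_t sim_walk(const struct sim_stack *s, enum fn ip, int bp,
		       enum fn *out, size_t max)
{
	size_t n = 0;

	if (n < max)
		out[n++] = ip;
	while (bp >= 0 && bp + 1 < SIM_STACK_WORDS && n < max) {
		out[n++] = (enum fn)s->word[bp + 1];	/* return address */
		bp = (int)s->word[bp];			/* caller's frame */
	}
	return n;
}

/*
 * Build caller -> spin_lock -> raw_spin_lock -> do_raw_spin_lock,
 * "interrupt" inside do_raw_spin_lock and unwind from there.
 */
size_t trace_spin_lock_chain(int leaf_has_frame, enum fn *out, size_t max)
{
	struct sim_stack s = { .sp = SIM_STACK_WORDS, .bp = -1 };

	sim_call(&s, FN_CALLER);	/* enter spin_lock() */
	sim_prologue(&s);
	sim_call(&s, FN_SPIN_LOCK);	/* enter raw_spin_lock() */
	sim_prologue(&s);
	sim_call(&s, FN_RAW_SPIN_LOCK);	/* enter do_raw_spin_lock() */
	if (leaf_has_frame)
		sim_prologue(&s);	/* omitted when the leaf has no frame */

	return sim_walk(&s, FN_DO_RAW_SPIN_LOCK, s.bp, out, max);
}
```

With leaf_has_frame set, the walk yields do_raw_spin_lock, raw_spin_lock, spin_lock, caller. With it cleared, the walk yields do_raw_spin_lock, spin_lock, caller: the return address into raw_spin_lock() is still on the simulated stack, but nothing in the bp chain points at it.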
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
> arch/x86/Makefile | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> Index: linux/arch/x86/Makefile
> ===================================================================
> --- linux.orig/arch/x86/Makefile
> +++ linux/arch/x86/Makefile
> @@ -72,6 +72,14 @@ else
> KBUILD_CFLAGS += -maccumulate-outgoing-args
> endif
>
> +#
> +# This shrinks many small functions, we don't actually
> +# need their frame pointer, in backtraces the RIP will
> +# identify the function and the stack frame walker will
> +# find the parent function:
> +#
> +KBUILD_CFLAGS += $(call cc-option,-momit-leaf-frame-pointer)
> +
> ifdef CONFIG_CC_STACKPROTECTOR
> cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
> ifeq ($(shell $(CONFIG_SHELL) $(cc_has_sp) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 8:19 [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size Ingo Molnar
` (2 preceding siblings ...)
2011-12-16 14:01 ` Frederic Weisbecker
@ 2011-12-16 14:06 ` Frederic Weisbecker
3 siblings, 0 replies; 12+ messages in thread
From: Frederic Weisbecker @ 2011-12-16 14:06 UTC (permalink / raw)
To: Ingo Molnar
Cc: linux-kernel, H. Peter Anvin, Thomas Gleixner, Peter Zijlstra,
Linus Torvalds, Andrew Morton, Jan Beulich, Arjan van de Ven,
Alexander van Heukelum
On Fri, Dec 16, 2011 at 09:19:16AM +0100, Ingo Molnar wrote:
>
> This patch turns on -momit-leaf-frame-pointer on x86 builds and
> thus shrinks .text noticeably. On a defconfig-ish kernel:
>
> text data bss dec hex filename
> 9843902 1935808 3649536 15429246 eb6e7e vmlinux.before
> 9813764 1935792 3649536 15399092 eaf8b4 vmlinux.after
>
> That's 0.3% off text size.
>
> The actual win is larger than this percentage suggests: many
> small, hot helper functions such as find_next_bit(),
> do_raw_spin_lock() or most of the list_*() functions are leaf
> functions and are now shorter by 2 instructions.
>
> Probably a good chunk of the framepointers related runtime
> overhead on common workloads is eliminated via this patch, as
> small leaf functions execute more often than larger parent
> functions.
>
> The call-chains are still intact for quality backtraces and for
> call-chain profiling (perf record -g), as the backtrace walker
> can deduct the full backtrace from the RIP of a leaf function
> and the parent chain.
In the case of linked stacks (like frame pointer linking interrupt
to exception stack) we may also miss a leaf function (and not its parent).
Now perhaps we can live with all that, I don't know.
>
> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> ---
> arch/x86/Makefile | 8 ++++++++
> 1 file changed, 8 insertions(+)
>
> Index: linux/arch/x86/Makefile
> ===================================================================
> --- linux.orig/arch/x86/Makefile
> +++ linux/arch/x86/Makefile
> @@ -72,6 +72,14 @@ else
> KBUILD_CFLAGS += -maccumulate-outgoing-args
> endif
>
> +#
> +# This shrinks many small functions, we don't actually
> +# need their frame pointer, in backtraces the RIP will
> +# identify the function and the stack frame walker will
> +# find the parent function:
> +#
> +KBUILD_CFLAGS += $(call cc-option,-momit-leaf-frame-pointer)
> +
> ifdef CONFIG_CC_STACKPROTECTOR
> cc_has_sp := $(srctree)/scripts/gcc-x86_$(BITS)-has-stack-protector.sh
> ifeq ($(shell $(CONFIG_SHELL) $(cc_has_sp) $(CC) $(KBUILD_CPPFLAGS) $(biarch)),y)
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 12:00 ` Ingo Molnar
@ 2011-12-16 15:32 ` H. Peter Anvin
0 siblings, 0 replies; 12+ messages in thread
From: H. Peter Anvin @ 2011-12-16 15:32 UTC (permalink / raw)
To: Ingo Molnar
Cc: Jan Beulich, Jeremy Fitzhardinge, Peter Zijlstra,
Alexander van Heukelum, fweisbec, Arjan van de Ven,
Thomas Gleixner, Andrew Morton, Linus Torvalds,
Konrad Rzeszutek Wilk, linux-kernel
On 12/16/2011 04:00 AM, Ingo Molnar wrote:
>>
>> Are you sure about that even if the leaf function uses rBP for
>> a different purpose?
>
> Well, i assumed that GCC does not mess with %bp in leaf
> functions - a frame pointer is barely useful if it's destroyed
> spuriously in leaf functions.
>
We should verify that explicitly. gcc has every "right" to treat it as
a normal callee-saved register, but I think it is a very low priority
register in gcc's register allocation scheme.
-hpa
--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.
* Re: [PATCH] x86: Use -m-omit-leaf-frame-pointer to shrink text size
2011-12-16 10:20 ` Peter Zijlstra
@ 2011-12-16 16:27 ` Richard Henderson
0 siblings, 0 replies; 12+ messages in thread
From: Richard Henderson @ 2011-12-16 16:27 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Jeremy Fitzhardinge, Ingo Molnar, linux-kernel, H. Peter Anvin,
Thomas Gleixner, Frédéric Weisbecker, Linus Torvalds,
Andrew Morton, Jan Beulich, Arjan van de Ven,
Alexander van Heukelum, Konrad Rzeszutek Wilk
On 12/16/2011 02:20 AM, Peter Zijlstra wrote:
>> How can we tell gcc the asm contains a call, or otherwise suppress the
>> "leaf function" classification?
You can't at present.
> I only found the function __attribute__((leaf)) to explicitly mark a
> function as being a leaf function, but the documentation doesn't list
> the inverse of that to explicitly mark it as _not_ being one.
In any case, "leaf" doesn't do what you think it does -- see the docs.
I told Honza at the time that he was overloading existing terminology
without purpose, but no one came up with a better name before the
release, so it stayed that way. AFAIK, no one uses it because no one
understands it.
Do you have a proposal for what you'd like the extension to look like?
The only thing that comes to mind atm is "call" in the clobber section.
This is not likely to get written before gcc 4.8 though...
r~