* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] [not found] ` <20080724104459.GI28817@elte.hu> @ 2008-07-25 18:45 ` Hugh Dickins 2008-07-25 21:46 ` Benjamin Herrenschmidt 2008-07-28 14:41 ` Ingo Molnar 0 siblings, 2 replies; 7+ messages in thread From: Hugh Dickins @ 2008-07-25 18:45 UTC (permalink / raw) To: Ingo Molnar Cc: the arch/x86 maintainers, Linuxppc-dev, linux-kernel, Mike Travis On Thu, 24 Jul 2008, Ingo Molnar wrote: > * Hugh Dickins <hugh@veritas.com> wrote: > > > I've been using -fno-unit-at-a-time (to lessen inlining, for easier > > debugging) for a long time > > Should we perhaps enable this automatically on CONFIG_FRAME_POINTER=y > builds? Although a separate, default-off config option might be better, I'm content to go on patching the Makefile to suit my own private debug proclivities, somewhat reluctant to foist them upon others. But I must have found no-unit-at-a-time made things clearer at some point, and assume it is still being useful. Whether it deserves its own config option, hmm, not sure of that. But agree it shouldn't really go into CONFIG_FRAME_POINTER=y (though notice that already bundles no-optimize-sibling-calls). I rather think CONFIG_FRAME_POINTER shouldn't exist at all (or be a private, config-user-invisible, specific-to-a-few-arches thing): what one wants to configure is how far to sacrifice cpu performance and kernel smallness to getting a good stacktrace. Frame pointer is just an implementation detail on that, appropriate to some arches. Perhaps three settings: no stacktrace, fair stacktrace, best stacktrace. I've Cc'ed Ben and linuxppc-dev because I wonder if they're aware that several options (I got it from LATENCYTOP, but I think LOCKDEP and FTRACE and some others) are doing a "select FRAME_POINTER", which forces CONFIG_FRAME_POINTER=y on PowerPC, even though FRAME_POINTER is not an option offered on PowerPC. The resulting kernels appear to run okay, but I was surprised. > i'd not be surprised if there were more regressions in this area, it's a > seldom used build vector. There'd probably be a lot more if we weren't (recently?) supporting compilers which only manage no-unit-at-a-time. The only issue I get in my kernel builds is with sched_clock() - but I've not been trying allyesconfig or randconfig. Here's a patch: but whereas the BUILD_IRQ one seemed worth sending because the problem might crop up in other ways and waste someone's time, this one just depends on your taste. [PATCH] sched: move sched_clock before first use Move sched_clock() up to stop warning: weak declaration of `sched_clock' after first use results in unspecified behavior (if -fno-unit-at-a-time). Signed-off-by: Hugh Dickins <hugh@veritas.com> --- kernel/sched_clock.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-) --- 2.6.26-git/kernel/sched_clock.c 2008-07-18 11:33:34.000000000 +0100 +++ linux/kernel/sched_clock.c 2008-07-18 11:57:46.000000000 +0100 @@ -32,6 +32,15 @@ #include <linux/ktime.h> #include <linux/module.h> +/* + * Scheduler clock - returns current time in nanosec units. + * This is default implementation. + * Architectures and sub-architectures can override this. + */ +unsigned long long __attribute__((weak)) sched_clock(void) +{ + return (unsigned long long)jiffies * (NSEC_PER_SEC / HZ); +} #ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK @@ -321,16 +330,6 @@ EXPORT_SYMBOL_GPL(sched_clock_idle_wakeu #endif -/* - * Scheduler clock - returns current time in nanosec units. - * This is default implementation. - * Architectures and sub-architectures can override this. - */ -unsigned long long __attribute__((weak)) sched_clock(void) -{ - return (unsigned long long)jiffies * (NSEC_PER_SEC / HZ); -} - unsigned long long cpu_clock(int cpu) { unsigned long long clock; ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-25 18:45 ` CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] Hugh Dickins @ 2008-07-25 21:46 ` Benjamin Herrenschmidt 2008-07-26 11:02 ` Hugh Dickins 2008-07-28 14:41 ` Ingo Molnar 1 sibling, 1 reply; 7+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-25 21:46 UTC (permalink / raw) To: Hugh Dickins Cc: Linuxppc-dev, Ingo Molnar, the arch/x86 maintainers, linux-kernel, Mike Travis On Fri, 2008-07-25 at 19:45 +0100, Hugh Dickins wrote: > > I've Cc'ed Ben and linuxppc-dev because I wonder if they're aware > that several options (I got it from LATENCYTOP, but I think LOCKDEP > and FTRACE and some others) are doing a "select FRAME_POINTER", > which forces CONFIG_FRAME_POINTER=y on PowerPC, even though > FRAME_POINTER is not an option offered on PowerPC. The > resulting kernels appear to run okay, but I was surprised. Because the option just does nothing for us ? :-) We always have frame pointers on powerpc except in some case for leaf functions. I don't know if the option has any actual effect on the later, but I don't think we have a case where doing either way would break things. Cheers, Ben. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-25 21:46 ` Benjamin Herrenschmidt @ 2008-07-26 11:02 ` Hugh Dickins 2008-07-26 12:36 ` Benjamin Herrenschmidt 0 siblings, 1 reply; 7+ messages in thread From: Hugh Dickins @ 2008-07-26 11:02 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: Linuxppc-dev, Ingo Molnar, the arch/x86 maintainers, linux-kernel, Mike Travis On Sat, 26 Jul 2008, Benjamin Herrenschmidt wrote: > On Fri, 2008-07-25 at 19:45 +0100, Hugh Dickins wrote: > > > > I've Cc'ed Ben and linuxppc-dev because I wonder if they're aware > > that several options (I got it from LATENCYTOP, but I think LOCKDEP > > and FTRACE and some others) are doing a "select FRAME_POINTER", > > which forces CONFIG_FRAME_POINTER=y on PowerPC, even though > > FRAME_POINTER is not an option offered on PowerPC. The > > resulting kernels appear to run okay, but I was surprised. > > Because the option just does nothing for us ? :-) We always have frame > pointers on powerpc except in some case for leaf functions. I don't know > if the option has any actual effect on the later, but I don't think we > have a case where doing either way would break things. Thanks, that's reassuring. I raised the question partly because I'd noticed CONFIG_FRAME_POINTER=y does increase the size of powerpc kernels: part of that will have been because of the -fno-optimize-sibling-calls bundled in, but when I edit that out of the Makefile I'm left with text data bss dec hex filename 4773061 856632 232052 5861745 597171 FPN/vmlinux 4943653 856632 232052 6032337 5c0bd1 FPY/vmlinux Going to the first divergence between them, the 2.6.26-git6 vmlinux built without CONFIG_FRAME_POINTER has c000000000008024 <.run_init_process>: c000000000008024: 7c 08 02 a6 mflr r0 c000000000008028: fb c1 ff f0 std r30,-16(r1) c00000000000802c: eb c2 80 48 ld r30,-32696(r2) c000000000008030: f8 01 00 10 std r0,16(r1) c000000000008034: f8 21 ff 81 stdu r1,-128(r1) c000000000008038: e9 3e 80 10 ld r9,-32752(r30) c00000000000803c: f8 69 00 00 std r3,0(r9) c000000000008040: 7d 24 4b 78 mr r4,r9 c000000000008044: 38 a9 01 10 addi r5,r9,272 c000000000008048: 48 01 70 4d bl c00000000001f094 <.kernel_execve> c00000000000804c: 60 00 00 00 nop c000000000008050: 38 21 00 80 addi r1,r1,128 c000000000008054: e8 01 00 10 ld r0,16(r1) c000000000008058: eb c1 ff f0 ld r30,-16(r1) c00000000000805c: 7c 08 03 a6 mtlr r0 c000000000008060: 4e 80 00 20 blr Whereas the vmlinux built with -fno-omit-frame_pointer has c000000000008024 <.run_init_process>: c000000000008024: 7c 08 02 a6 mflr r0 c000000000008028: fb c1 ff f0 std r30,-16(r1) c00000000000802c: eb c2 80 48 ld r30,-32696(r2) c000000000008030: fb e1 ff f8 std r31,-8(r1) c000000000008034: f8 01 00 10 std r0,16(r1) c000000000008038: f8 21 ff 81 stdu r1,-128(r1) c00000000000803c: e9 3e 80 10 ld r9,-32752(r30) c000000000008040: f8 69 00 00 std r3,0(r9) c000000000008044: 7d 24 4b 78 mr r4,r9 c000000000008048: 38 a9 01 10 addi r5,r9,272 c00000000000804c: 7c 3f 0b 78 mr r31,r1 c000000000008050: 48 01 8c 91 bl c000000000020ce0 <.kernel_execve> c000000000008054: 60 00 00 00 nop c000000000008058: e8 21 00 00 ld r1,0(r1) c00000000000805c: e8 01 00 10 ld r0,16(r1) c000000000008060: eb c1 ff f0 ld r30,-16(r1) c000000000008064: eb e1 ff f8 ld r31,-8(r1) c000000000008068: 7c 08 03 a6 mtlr r0 c00000000000806c: 4e 80 00 20 blr That's for static void run_init_process(char *init_filename) { argv_init[0] = init_filename; kernel_execve(init_filename, argv_init, envp_init); } Hmm, perhaps it is doing sibling calls differently even without the explicit -fno-optimize-sibling-calls (but when I add that option, the vmlinux size does go up another 4400). Sorry, I'm most probably fussing over nothing, and wasting your time with my ignorance. Hugh ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-26 11:02 ` Hugh Dickins @ 2008-07-26 12:36 ` Benjamin Herrenschmidt 2008-07-28 13:52 ` Gabriel Paubert 0 siblings, 1 reply; 7+ messages in thread From: Benjamin Herrenschmidt @ 2008-07-26 12:36 UTC (permalink / raw) To: Hugh Dickins Cc: Linuxppc-dev, Ingo Molnar, the arch/x86 maintainers, linux-kernel, Mike Travis On Sat, 2008-07-26 at 12:02 +0100, Hugh Dickins wrote: > > Hmm, perhaps it is doing sibling calls differently even without the > explicit -fno-optimize-sibling-calls (but when I add that option, > the vmlinux size does go up another 4400). > > Sorry, I'm most probably fussing over nothing, > and wasting your time with my ignorance. No you aren't, there is indeed something happening. It looks like gcc is keeping a copy of each stack frame in r31, thus forcing to save/restore that register, along function calls, possibly to help get reliable frames for leaf functions. I don't think we use that "feature" in our backtrace code though... so it won't harm in the sense that it won't break things, but it will indeed bloat the code a little bit. Maybe we should totally disable -fno-omit-frame-pointers on powerpc ... either that or see about actually using that r31 linkage, though I'm not sure it would be that useful. I'll have to talk to our toolchain folks to figure out exactly what's going on there. Cheers, Ben. ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-26 12:36 ` Benjamin Herrenschmidt @ 2008-07-28 13:52 ` Gabriel Paubert 0 siblings, 0 replies; 7+ messages in thread From: Gabriel Paubert @ 2008-07-28 13:52 UTC (permalink / raw) To: Benjamin Herrenschmidt Cc: the arch/x86 maintainers, linux-kernel, Mike Travis, Linuxppc-dev, Hugh Dickins, Ingo Molnar On Sat, Jul 26, 2008 at 10:36:42PM +1000, Benjamin Herrenschmidt wrote: > On Sat, 2008-07-26 at 12:02 +0100, Hugh Dickins wrote: > > > > Hmm, perhaps it is doing sibling calls differently even without the > > explicit -fno-optimize-sibling-calls (but when I add that option, > > the vmlinux size does go up another 4400). > > > > Sorry, I'm most probably fussing over nothing, > > and wasting your time with my ignorance. > > No you aren't, there is indeed something happening. It looks like gcc is > keeping a copy of each stack frame in r31, thus forcing to save/restore > that register, along function calls, possibly to help get reliable > frames for leaf functions. I don't think we use that "feature" in our > backtrace code though... so it won't harm in the sense that it won't > break things, but it will indeed bloat the code a little bit. > > Maybe we should totally disable -fno-omit-frame-pointers on powerpc ... Yes. > either that or see about actually using that r31 linkage, though I'm not > sure it would be that useful. On PPC you can get reliable backtraces (modulo leaf functions, but AFAIR the frame pointer does not help, only the CFI data) without a frame pointer since the ABI makes the stack pointer chain easy to follow. The frame pointer (r31) is only necessary when there are variable size stack allocations, alloca() for example, but are they even allowed in the kernel? Regards, Gabriel ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-25 18:45 ` CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] Hugh Dickins 2008-07-25 21:46 ` Benjamin Herrenschmidt @ 2008-07-28 14:41 ` Ingo Molnar 2008-07-28 14:54 ` Hugh Dickins 1 sibling, 1 reply; 7+ messages in thread From: Ingo Molnar @ 2008-07-28 14:41 UTC (permalink / raw) To: Hugh Dickins Cc: the arch/x86 maintainers, linux-kernel, Mike Travis, Linuxppc-dev, Linus Torvalds, Arjan van de Ven * Hugh Dickins <hugh@veritas.com> wrote: > [PATCH] sched: move sched_clock before first use > > Move sched_clock() up to stop warning: weak declaration of > `sched_clock' after first use results in unspecified behavior (if > -fno-unit-at-a-time). > > Signed-off-by: Hugh Dickins <hugh@veritas.com> applied to tip/sched/urgent - thanks Hugh. > I rather think CONFIG_FRAME_POINTER shouldn't exist at all (or be a > private, config-user-invisible, specific-to-a-few-arches thing): what > one wants to configure is how far to sacrifice cpu performance and > kernel smallness to getting a good stacktrace. Frame pointer is just > an implementation detail on that, appropriate to some arches. Perhaps > three settings: no stacktrace, fair stacktrace, best stacktrace. actually, we consciously use and rely on frame pointers on x86. The runtime cost on 64-bit is miniscule and the improved backtrace output in recent kernels makes backtraces _much_ easier to interpret: Call Trace: <NMI> [<ffffffff80480779>] _raw_spin_trylock+0x19/0x50 [<ffffffff808fb2e9>] _spin_lock_irqsave+0x59/0x90 [<ffffffff80261ab4>] atomic_notifier_chain_register+0x24/0x60 [<ffffffff80262f38>] ? __profile_tick+0x58/0x90 [<ffffffff808fd1a9>] nmi_watchdog_tick+0x59/0x1e0 [<ffffffff808fc79a>] default_do_nmi+0x6a/0x220 [<ffffffff808fc9b4>] do_nmi+0x64/0xb0 [<ffffffff808fc032>] nmi+0xa2/0xc2 [<ffffffff80285bd1>] ? stopmachine+0x61/0xd0 <<EOE>> [<ffffffff8020dca9>] child_rip+0xa/0x11 [<ffffffff8020cf3e>] ? restore_args+0x0/0x30 [<ffffffff80285b70>] ? stopmachine+0x0/0xd0 [<ffffffff8020dc9f>] ? child_rip+0x0/0x11 we experimented with using dwarf2 data in the past but it proved to be very fragile in practice - we depended too much on the whims of gcc/binutils being absolutely correct, etc. Something as fundamental to the kernel's general health as backtraces must not be fragile. So the EBP based backtracing code was ported to 64-bit as well and it was improved further upon. kudos to Arjan for that. Ingo ^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] 2008-07-28 14:41 ` Ingo Molnar @ 2008-07-28 14:54 ` Hugh Dickins 0 siblings, 0 replies; 7+ messages in thread From: Hugh Dickins @ 2008-07-28 14:54 UTC (permalink / raw) To: Ingo Molnar Cc: the arch/x86 maintainers, linux-kernel, Mike Travis, Linuxppc-dev, Linus Torvalds, Arjan van de Ven On Mon, 28 Jul 2008, Ingo Molnar wrote: > * Hugh Dickins <hugh@veritas.com> wrote: > > > I rather think CONFIG_FRAME_POINTER shouldn't exist at all (or be a > > private, config-user-invisible, specific-to-a-few-arches thing): what > > one wants to configure is how far to sacrifice cpu performance and > > kernel smallness to getting a good stacktrace. Frame pointer is just > > an implementation detail on that, appropriate to some arches. Perhaps > > three settings: no stacktrace, fair stacktrace, best stacktrace. > > actually, we consciously use and rely on frame pointers on x86. The > runtime cost on 64-bit is miniscule and the improved backtrace output in > recent kernels makes backtraces _much_ easier to interpret: Just to clarify, no way was I criticizing the use of frame pointers on x86. What I don't care for is that CONFIG_FRAME_POINTER is used by common code (e.g. top level Makefile, and various debug Kconfigs), when I see it as an arch-specific technique for getting best stacktrace. Hugh ^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2008-07-28 14:54 UTC | newest] Thread overview: 7+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <Pine.LNX.4.64.0807211828470.26173@blonde.site> [not found] ` <20080724104459.GI28817@elte.hu> 2008-07-25 18:45 ` CONFIG_FRAME_POINTER [was [PATCH] x86: BUILD_IRQ say .text] Hugh Dickins 2008-07-25 21:46 ` Benjamin Herrenschmidt 2008-07-26 11:02 ` Hugh Dickins 2008-07-26 12:36 ` Benjamin Herrenschmidt 2008-07-28 13:52 ` Gabriel Paubert 2008-07-28 14:41 ` Ingo Molnar 2008-07-28 14:54 ` Hugh Dickins
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).