* Anyone working on ftrace function graph support on ARM?
@ 2009-03-24 19:38 Tim Bird
2009-03-24 20:25 ` Ingo Molnar
` (2 more replies)
0 siblings, 3 replies; 37+ messages in thread
From: Tim Bird @ 2009-03-24 19:38 UTC (permalink / raw)
To: linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar,
Abhishek Sagar, Russell King, Uwe Kleine-König
ARM and FTRACE people,
Is anyone working on function graph support for ARM?
I haven't done much ARM assembly, but the Intel mechanism
for the return hook looks fairly straightforward,
and I thought I'd take a shot at implementing it on ARM.
But if someone else is already doing it, I'd rather work
with them.
BTW - I turned on -finstrument-functions on ARM, and it seems
to work OK (with the exception being that I don't see evenly matched
calls and returns). For this latter reason, I'm going to
start with an implementation that copies the return hook
used by x86, with a fallback to using __cyg_profile_... instead
of mcount, if the hook approach proves too hard for me on ARM.
My ultimate goal is to add function duration filtering, which
is one of the nicer features of KFT (an older tracer I used
to maintain out-of-mainline). With all the ftrace and ringbuffer
support already in mainline, this shouldn't be too hard, but
I need to start with basic graph support on ARM first.
Any comments or feedback on the approach, or on current plans
to extend ftrace support on ARM, before I get too far along,
are welcome.
Thanks,
-- Tim
=============================
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=============================
^ permalink raw reply [flat|nested] 37+ messages in thread* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 19:38 Anyone working on ftrace function graph support on ARM? Tim Bird @ 2009-03-24 20:25 ` Ingo Molnar 2009-03-24 20:48 ` Tim Bird 2009-03-24 20:38 ` Uwe Kleine-König 2009-03-24 21:36 ` Frederic Weisbecker 2 siblings, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2009-03-24 20:25 UTC (permalink / raw) To: Tim Bird, Frédéric Weisbecker Cc: linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König * Tim Bird <tim.bird@am.sony.com> wrote: > My ultimate goal is to add function duration filtering, which is > one of the nicer features of KFT (an older tracer I used to > maintain out-of-mainline). With all the ftrace and ringbuffer > support already in mainline, this shouldn't be too hard, but I > need to start with basic graph support on ARM first. ah, function duration filtering - is that to only show functions which have a duration beyond a certain (runtime configurable) value? Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 20:25 ` Ingo Molnar @ 2009-03-24 20:48 ` Tim Bird 0 siblings, 0 replies; 37+ messages in thread From: Tim Bird @ 2009-03-24 20:48 UTC (permalink / raw) To: Ingo Molnar Cc: Frédéric Weisbecker, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König Ingo Molnar wrote: > * Tim Bird <tim.bird@am.sony.com> wrote: > >> My ultimate goal is to add function duration filtering, which is >> one of the nicer features of KFT (an older tracer I used to >> maintain out-of-mainline). With all the ftrace and ringbuffer >> support already in mainline, this shouldn't be too hard, but I >> need to start with basic graph support on ARM first. > > ah, function duration filtering - is that to only show functions > which have a duration beyond a certain (runtime configurable) value? Exactly. With KFT, you set a "mintime filter", and it removed from the trace buffer any functions which were less than the threshhold. This was not high cost, because you usually only removed the most recently added function in the trace buffer. This lets you focus on routines that last a long time. I used it mainly to find long-duration routines in early boot. The initcall tracer serves a similar purpose, but not at the same granularity. It greatly extends the length of time you can trace, before the buffer fills up. -- Tim ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 19:38 Anyone working on ftrace function graph support on ARM? Tim Bird 2009-03-24 20:25 ` Ingo Molnar @ 2009-03-24 20:38 ` Uwe Kleine-König 2009-03-24 21:36 ` Frederic Weisbecker 2 siblings, 0 replies; 37+ messages in thread From: Uwe Kleine-König @ 2009-03-24 20:38 UTC (permalink / raw) To: Tim Bird Cc: linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King Hello, On Tue, Mar 24, 2009 at 12:38:50PM -0700, Tim Bird wrote: > ARM and FTRACE people, > > Is anyone working on function graph support for ARM? I don't think there is anyone working in that area. You can Cc: me when your patches are ready, I will happily proof-read them. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 19:38 Anyone working on ftrace function graph support on ARM? Tim Bird 2009-03-24 20:25 ` Ingo Molnar 2009-03-24 20:38 ` Uwe Kleine-König @ 2009-03-24 21:36 ` Frederic Weisbecker 2009-03-24 21:40 ` Tim Bird ` (2 more replies) 2 siblings, 3 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-24 21:36 UTC (permalink / raw) To: Tim Bird Cc: linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König On Tue, Mar 24, 2009 at 12:38:50PM -0700, Tim Bird wrote: > ARM and FTRACE people, > > Is anyone working on function graph support for ARM? Aah, it was on my plans! But well, if you are about to do it, so don't hesitate. I don't have any arm board for now, though I have a pending one which I will probably get in few weeks, but for now I can't work on it, so knock yourself out. > I haven't done much ARM assembly, but the Intel mechanism > for the return hook looks fairly straightforward, > and I thought I'd take a shot at implementing it on ARM. Yes, ie: _Before jumping to the function entry hook, you must save the arguments for the traced function on the stack. On x86, its eax, edx and ecx. On arm, it will be r0-r3. Then you have to transmit the address of the traced function (it's on r14) and it's parent (must rely on fp for that). Then you call the entry hook and restore the old scratch/arg registers. _ On return hook it's pretty the same, (saving the scratch registers, especially the return value which should be on r0 and r1 if I'm not wrong). But you'll have to get the original return address from the return handler and then put it on pc. Well it's a very naive listing, there are sometimes some problems. For example on x86-64, I had to save even some non-scratch registers before calling the return hook, I still don't know why. > But if someone else is already doing it, I'd rather work > with them. > > BTW - I turned on -finstrument-functions on ARM, and it seems > to work OK (with the exception being that I don't see evenly matched > calls and returns). For this latter reason, I'm going to > start with an implementation that copies the return hook > used by x86, with a fallback to using __cyg_profile_... instead > of mcount, if the hook approach proves too hard for me on ARM. Be care, -finstrument-functions will add one more handler and then increase the size of the kernel, may be a lot. Moreover it will not be compatible with dynamic tracing which is designed to patch the mcount call sites. It doesn't support patching of __cyg_profile. And patching __cyg_profile call sites would mean running twice the time to patch the kernel code. Function tracing (hooks only on function entries) and dynamic tracing are already implemented on Arm. Using the mcount way shoudn't really be a problem because all is already set for that with the function tracer. Anyway if you need some help, don't hesitate! > My ultimate goal is to add function duration filtering, which > is one of the nicer features of KFT (an older tracer I used > to maintain out-of-mainline). With all the ftrace and ringbuffer > support already in mainline, this shouldn't be too hard, but > I need to start with basic graph support on ARM first. Yeah, seems a nice idea. Thanks, Frederic. > Any comments or feedback on the approach, or on current plans > to extend ftrace support on ARM, before I get too far along, > are welcome. > > Thanks, > -- Tim > > ============================= > Tim Bird > Architecture Group Chair, CE Linux Forum > Senior Staff Engineer, Sony Corporation of America > ============================= > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:36 ` Frederic Weisbecker @ 2009-03-24 21:40 ` Tim Bird 2009-03-24 21:48 ` Ingo Molnar 2009-03-24 22:29 ` Anyone working on ftrace function graph support on ARM? Abhishek Sagar 2 siblings, 0 replies; 37+ messages in thread From: Tim Bird @ 2009-03-24 21:40 UTC (permalink / raw) To: Frederic Weisbecker Cc: linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König Frederic Weisbecker wrote: > Yes, ie: > > _Before jumping to the function entry hook, you must save > the arguments for the traced function on the stack. > On x86, its eax, edx and ecx. > On arm, it will be r0-r3. > Then you have to transmit the address of the traced function > (it's on r14) and it's parent (must rely on fp for that). > Then you call the entry hook and restore the old scratch/arg > registers. > > > _ On return hook it's pretty the same, (saving the scratch > registers, especially the return value which should be on r0 > and r1 if I'm not wrong). > But you'll have to get the original return address from the > return handler and then put it on pc. > > Well it's a very naive listing, there are sometimes some problems. > For example on x86-64, I had to save even some non-scratch registers > before calling the return hook, I still don't know why. Thanks - this is very helpful. I'll send patches around when I get something going. Hopefully in the next week or two. -- Tim ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:36 ` Frederic Weisbecker 2009-03-24 21:40 ` Tim Bird @ 2009-03-24 21:48 ` Ingo Molnar 2009-03-24 21:57 ` Frederic Weisbecker 2009-03-24 22:29 ` Anyone working on ftrace function graph support on ARM? Abhishek Sagar 2 siblings, 1 reply; 37+ messages in thread From: Ingo Molnar @ 2009-03-24 21:48 UTC (permalink / raw) To: Frederic Weisbecker Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König * Frederic Weisbecker <fweisbec@gmail.com> wrote: > Well it's a very naive listing, there are sometimes some problems. > For example on x86-64, I had to save even some non-scratch > registers before calling the return hook, I still don't know why. btw., which are those registers? Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:48 ` Ingo Molnar @ 2009-03-24 21:57 ` Frederic Weisbecker 2009-03-24 22:14 ` Ingo Molnar 2009-03-25 16:00 ` Steven Rostedt 0 siblings, 2 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-24 21:57 UTC (permalink / raw) To: Ingo Molnar Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König On Tue, Mar 24, 2009 at 10:48:46PM +0100, Ingo Molnar wrote: > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > Well it's a very naive listing, there are sometimes some problems. > > For example on x86-64, I had to save even some non-scratch > > registers before calling the return hook, I still don't know why. > > btw., which are those registers? > > Ingo I would expect to only save rax,rdi,rsi,rdx,rcx,r8,r9 which are used for parameters. And I had some crashes until I append r10 and r11 which actually are scratch if I'm not wrong, but since they are scratch and are not used for arguments, I thought they didn't need to be saved. Well, I think there were some code flow cases I was missing. The complete code is: movq %rax, (%rsp) movq %rcx, 8(%rsp) movq %rdx, 16(%rsp) movq %rsi, 24(%rsp) movq %rdi, 32(%rsp) movq %r8, 40(%rsp) movq %r9, 48(%rsp) movq %r10, 56(%rsp) movq %r11, 64(%rsp) call ftrace_return_to_handler movq %rax, 72(%rsp) <-- get original return value movq 64(%rsp), %r11 movq 56(%rsp), %r10 movq 48(%rsp), %r9 movq 40(%rsp), %r8 movq 32(%rsp), %rdi movq 24(%rsp), %rsi movq 16(%rsp), %rdx movq 8(%rsp), %rcx movq (%rsp), %rax addq $72, %rsp ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:57 ` Frederic Weisbecker @ 2009-03-24 22:14 ` Ingo Molnar 2009-03-24 22:54 ` Frederic Weisbecker 2009-03-25 8:36 ` Russell King - ARM Linux 2009-03-25 16:00 ` Steven Rostedt 1 sibling, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2009-03-24 22:14 UTC (permalink / raw) To: Frederic Weisbecker Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König * Frederic Weisbecker <fweisbec@gmail.com> wrote: > On Tue, Mar 24, 2009 at 10:48:46PM +0100, Ingo Molnar wrote: > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > Well it's a very naive listing, there are sometimes some problems. > > > For example on x86-64, I had to save even some non-scratch > > > registers before calling the return hook, I still don't know why. > > > > btw., which are those registers? > > > > Ingo > > > I would expect to only save rax,rdi,rsi,rdx,rcx,r8,r9 which are > used for parameters. > And I had some crashes until I append r10 and r11 which actually > are scratch if I'm not wrong, but since they are scratch and are > not used for arguments, I thought they didn't need to be saved. > > Well, I think there were some code flow cases I was missing. Correct, r10 and r11 are clobbered registers too - and you need to save them too in mcount methods. The reason is that mcount has a special calling convention - it's not just about not destroying arguments - GCC can keep data in r10 or r11 scratch registers across function calls as well - for example for relatively static functions that are in its local optimization scope. If GCC can prove that the local scope function itself does not clobber r10/r11, it does not have to clobber them across the function call. But the mcount() callback still gets inserted. So the rule is: mcount must not destroy _any_ register state. (beyond flags) ngo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 22:14 ` Ingo Molnar @ 2009-03-24 22:54 ` Frederic Weisbecker 2009-03-25 8:36 ` Russell King - ARM Linux 1 sibling, 0 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-24 22:54 UTC (permalink / raw) To: Ingo Molnar Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König On Tue, Mar 24, 2009 at 11:14:39PM +0100, Ingo Molnar wrote: > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > On Tue, Mar 24, 2009 at 10:48:46PM +0100, Ingo Molnar wrote: > > > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > > > Well it's a very naive listing, there are sometimes some problems. > > > > For example on x86-64, I had to save even some non-scratch > > > > registers before calling the return hook, I still don't know why. > > > > > > btw., which are those registers? > > > > > > Ingo > > > > > > I would expect to only save rax,rdi,rsi,rdx,rcx,r8,r9 which are > > used for parameters. > > > And I had some crashes until I append r10 and r11 which actually > > are scratch if I'm not wrong, but since they are scratch and are > > not used for arguments, I thought they didn't need to be saved. > > > > Well, I think there were some code flow cases I was missing. > > Correct, r10 and r11 are clobbered registers too - and you need to > save them too in mcount methods. > > The reason is that mcount has a special calling convention - it's > not just about not destroying arguments - GCC can keep data in r10 > or r11 scratch registers across function calls as well - for example > for relatively static functions that are in its local optimization > scope. > > If GCC can prove that the local scope function itself does not > clobber r10/r11, it does not have to clobber them across the > function call. But the mcount() callback still gets inserted. > > So the rule is: mcount must not destroy _any_ register state. > (beyond flags) > > ngo Aah, ok, understood! Thanks. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 22:14 ` Ingo Molnar 2009-03-24 22:54 ` Frederic Weisbecker @ 2009-03-25 8:36 ` Russell King - ARM Linux 1 sibling, 0 replies; 37+ messages in thread From: Russell King - ARM Linux @ 2009-03-25 8:36 UTC (permalink / raw) To: Ingo Molnar Cc: Frederic Weisbecker, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Abhishek Sagar, Uwe Kleine-König On Tue, Mar 24, 2009 at 11:14:39PM +0100, Ingo Molnar wrote: > So the rule is: mcount must not destroy _any_ register state. > (beyond flags) As has already been discussed on the ARM lists, we have a problem with using mcount along with EABI, or OABI without frame pointers. We can not avoid destroying 'lr' - the return address for the function. OABI works around this by assuming that the parent function saved a stack frame, which means it's at a known offset from the frame pointer (provided frame pointers are enabled.) mcount reloads 'lr' before returning: ENTRY(mcount) stmdb sp!, {r0-r3, lr} mov r0, lr sub r0, r0, #MCOUNT_INSN_SIZE .globl mcount_call mcount_call: bl ftrace_stub ldr lr, [fp, #-4] @ restore lr ldmia sp!, {r0-r3, pc} So I think if Tim is expecting to be able to use this feature on EABI kernels, he's going to struggle. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:57 ` Frederic Weisbecker 2009-03-24 22:14 ` Ingo Molnar @ 2009-03-25 16:00 ` Steven Rostedt 2009-03-25 17:13 ` Frederic Weisbecker 2009-03-25 20:27 ` [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 Steven Rostedt 1 sibling, 2 replies; 37+ messages in thread From: Steven Rostedt @ 2009-03-25 16:00 UTC (permalink / raw) To: Frederic Weisbecker Cc: Ingo Molnar, Tim Bird, linux-arm-kernel, linux kernel, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König On Tue, 24 Mar 2009, Frederic Weisbecker wrote: > On Tue, Mar 24, 2009 at 10:48:46PM +0100, Ingo Molnar wrote: > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > Well it's a very naive listing, there are sometimes some problems. > > > For example on x86-64, I had to save even some non-scratch > > > registers before calling the return hook, I still don't know why. > > > > btw., which are those registers? > > > > Ingo > > > I would expect to only save rax,rdi,rsi,rdx,rcx,r8,r9 which > are used for parameters. > And I had some crashes until I append r10 and r11 which actually are > scratch if I'm not wrong, but since they are scratch and are not used for > arguments, I thought they didn't need to be saved. > > Well, I think there were some code flow cases I was missing. > > > The complete code is: > > movq %rax, (%rsp) > movq %rcx, 8(%rsp) > movq %rdx, 16(%rsp) > movq %rsi, 24(%rsp) > movq %rdi, 32(%rsp) > movq %r8, 40(%rsp) > movq %r9, 48(%rsp) > movq %r10, 56(%rsp) > movq %r11, 64(%rsp) > > call ftrace_return_to_handler > > movq %rax, 72(%rsp) <-- get original return value > movq 64(%rsp), %r11 > movq 56(%rsp), %r10 > movq 48(%rsp), %r9 > movq 40(%rsp), %r8 > movq 32(%rsp), %rdi > movq 24(%rsp), %rsi > movq 16(%rsp), %rdx > movq 8(%rsp), %rcx > movq (%rsp), %rax > addq $72, %rsp This bothers me. In PowerPC 64, all I have is: _GLOBAL(return_to_handler) /* need to save return values */ std r4, -24(r1) std r3, -16(r1) std r31, -8(r1) mr r31, r1 stdu r1, -112(r1) bl .ftrace_return_to_handler nop /* return value has real return address */ mtlr r3 ld r1, 0(r1) ld r4, -24(r1) ld r3, -16(r1) ld r31, -8(r1) /* Jump back to real return address */ blr All I save is the return values (and I'm paranoid with that, by saving both r3 and r4 and not just r3) as well as saving the stack. There should be no reason to save any other registers. This is not the same as mcount. mcount varies differently from arch to arch. But this is the return of a function. This is not a mcount call, and really has nothing to do with mcount. If you think about it, the return is coming back from a function that should have already saved all the registers that it modifies. The caller of that function (the one we will return to) should have saved any registers that are allowed to be modified by the callee. When we call our ftrace_return_to_handler function it too will save any register that it must for callees and restore it on return. Perhaps the issue you had with x86_64 was that you did not set up the stack frame properly? And by saving all those registers, it just happen to do it for you? -- Steve ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 16:00 ` Steven Rostedt @ 2009-03-25 17:13 ` Frederic Weisbecker 2009-03-25 20:27 ` [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 Steven Rostedt 1 sibling, 0 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-25 17:13 UTC (permalink / raw) To: Steven Rostedt Cc: Ingo Molnar, Tim Bird, linux-arm-kernel, linux kernel, Ingo Molnar, Abhishek Sagar, Russell King, Uwe Kleine-König On Wed, Mar 25, 2009 at 12:00:24PM -0400, Steven Rostedt wrote: > > On Tue, 24 Mar 2009, Frederic Weisbecker wrote: > > > On Tue, Mar 24, 2009 at 10:48:46PM +0100, Ingo Molnar wrote: > > > > > > * Frederic Weisbecker <fweisbec@gmail.com> wrote: > > > > > > > Well it's a very naive listing, there are sometimes some problems. > > > > For example on x86-64, I had to save even some non-scratch > > > > registers before calling the return hook, I still don't know why. > > > > > > btw., which are those registers? > > > > > > Ingo > > > > > > I would expect to only save rax,rdi,rsi,rdx,rcx,r8,r9 which > > are used for parameters. > > And I had some crashes until I append r10 and r11 which actually are > > scratch if I'm not wrong, but since they are scratch and are not used for > > arguments, I thought they didn't need to be saved. > > > > Well, I think there were some code flow cases I was missing. > > > > > > The complete code is: > > > > movq %rax, (%rsp) > > movq %rcx, 8(%rsp) > > movq %rdx, 16(%rsp) > > movq %rsi, 24(%rsp) > > movq %rdi, 32(%rsp) > > movq %r8, 40(%rsp) > > movq %r9, 48(%rsp) > > movq %r10, 56(%rsp) > > movq %r11, 64(%rsp) > > > > call ftrace_return_to_handler > > > > movq %rax, 72(%rsp) <-- get original return value > > movq 64(%rsp), %r11 > > movq 56(%rsp), %r10 > > movq 48(%rsp), %r9 > > movq 40(%rsp), %r8 > > movq 32(%rsp), %rdi > > movq 24(%rsp), %rsi > > movq 16(%rsp), %rdx > > movq 8(%rsp), %rcx > > movq (%rsp), %rax > > addq $72, %rsp > > This bothers me. In PowerPC 64, all I have is: > > _GLOBAL(return_to_handler) > /* need to save return values */ > std r4, -24(r1) > std r3, -16(r1) > std r31, -8(r1) > mr r31, r1 > stdu r1, -112(r1) > > bl .ftrace_return_to_handler > nop > > /* return value has real return address */ > mtlr r3 > > ld r1, 0(r1) > ld r4, -24(r1) > ld r3, -16(r1) > ld r31, -8(r1) > > /* Jump back to real return address */ > blr > > All I save is the return values (and I'm paranoid with that, by saving > both r3 and r4 and not just r3) as well as saving the stack. There should > be no reason to save any other registers. > > This is not the same as mcount. mcount varies differently from arch to > arch. But this is the return of a function. This is not a mcount call, and > really has nothing to do with mcount. > > If you think about it, the return is coming back from a function that > should have already saved all the registers that it modifies. The caller > of that function (the one we will return to) should have saved any > registers that are allowed to be modified by the callee. > > When we call our ftrace_return_to_handler function it too will save any > register that it must for callees and restore it on return. > > Perhaps the issue you had with x86_64 was that you did not set up the > stack frame properly? And by saving all those registers, it just happen to > do it for you? I don't know. It seems to me that the stack frame is well set. This is weird. > -- Steve > ^ permalink raw reply [flat|nested] 37+ messages in thread
* [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-03-25 16:00 ` Steven Rostedt 2009-03-25 17:13 ` Frederic Weisbecker @ 2009-03-25 20:27 ` Steven Rostedt 2009-03-25 20:45 ` Jaswinder Singh Rajput 2009-04-08 16:09 ` Ingo Molnar 1 sibling, 2 replies; 37+ messages in thread From: Steven Rostedt @ 2009-03-25 20:27 UTC (permalink / raw) To: linux kernel Cc: Frederic Weisbecker, Ingo Molnar, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner Ingo, Please pull the latest tip/tracing/function-graph tree, which can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git tip/tracing/function-graph Steven Rostedt (1): x86, function-graph: only save return values on x86_64 ---- arch/x86/kernel/entry_64.S | 19 +++---------------- 1 files changed, 3 insertions(+), 16 deletions(-) --------------------------- commit d5b754b2208b1c22691a7cd5d1a65398b5973d74 Author: Steven Rostedt <srostedt@redhat.com> Date: Wed Mar 25 14:30:04 2009 -0400 x86, function-graph: only save return values on x86_64 Impact: speed up The return to handler portion of the function graph tracer should only need to save the return values. The caller already saved off the registers that the callee can modify. The returning function already saved the registers it modified. When we call our own trace function it too will save the registers that the callee must restore. There's no reason to save off anything more that the registers used to return the values. Note, I did a complete kernel build with this modification and the function graph tracer running on x86_64. Signed-off-by: Steven Rostedt <srostedt@redhat.com> diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S index a331ec3..1ac9986 100644 --- a/arch/x86/kernel/entry_64.S +++ b/arch/x86/kernel/entry_64.S @@ -147,27 +147,14 @@ END(ftrace_graph_caller) GLOBAL(return_to_handler) subq $80, %rsp + /* Save the return values */ movq %rax, (%rsp) - movq %rcx, 8(%rsp) - movq %rdx, 16(%rsp) - movq %rsi, 24(%rsp) - movq %rdi, 32(%rsp) - movq %r8, 40(%rsp) - movq %r9, 48(%rsp) - movq %r10, 56(%rsp) - movq %r11, 64(%rsp) + movq %rdx, 8(%rsp) call ftrace_return_to_handler movq %rax, 72(%rsp) - movq 64(%rsp), %r11 - movq 56(%rsp), %r10 - movq 48(%rsp), %r9 - movq 40(%rsp), %r8 - movq 32(%rsp), %rdi - movq 24(%rsp), %rsi - movq 16(%rsp), %rdx - movq 8(%rsp), %rcx + movq 8(%rsp), %rdx movq (%rsp), %rax addq $72, %rsp retq ^ permalink raw reply related [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-03-25 20:27 ` [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 Steven Rostedt @ 2009-03-25 20:45 ` Jaswinder Singh Rajput 2009-03-25 21:26 ` Steven Rostedt 2009-04-08 16:09 ` Ingo Molnar 1 sibling, 1 reply; 37+ messages in thread From: Jaswinder Singh Rajput @ 2009-03-25 20:45 UTC (permalink / raw) To: Steven Rostedt Cc: linux kernel, Frederic Weisbecker, Ingo Molnar, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner On Wed, 2009-03-25 at 16:27 -0400, Steven Rostedt wrote: > Ingo, > > Please pull the latest tip/tracing/function-graph tree, which can be found at: > > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git > tip/tracing/function-graph > > > Steven Rostedt (1): > x86, function-graph: only save return values on x86_64 > > ---- > arch/x86/kernel/entry_64.S | 19 +++---------------- > 1 files changed, 3 insertions(+), 16 deletions(-) > --------------------------- > commit d5b754b2208b1c22691a7cd5d1a65398b5973d74 > Author: Steven Rostedt <srostedt@redhat.com> > Date: Wed Mar 25 14:30:04 2009 -0400 > > x86, function-graph: only save return values on x86_64 > > Impact: speed up > > The return to handler portion of the function graph tracer should only > need to save the return values. The caller already saved off the > registers that the callee can modify. The returning function already > saved the registers it modified. When we call our own trace function > it too will save the registers that the callee must restore. > > There's no reason to save off anything more that the registers used > to return the values. > > Note, I did a complete kernel build with this modification and the > function graph tracer running on x86_64. > > Signed-off-by: Steven Rostedt <srostedt@redhat.com> > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index a331ec3..1ac9986 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -147,27 +147,14 @@ END(ftrace_graph_caller) > GLOBAL(return_to_handler) > subq $80, %rsp > > + /* Save the return values */ > movq %rax, (%rsp) > - movq %rcx, 8(%rsp) > - movq %rdx, 16(%rsp) > - movq %rsi, 24(%rsp) > - movq %rdi, 32(%rsp) > - movq %r8, 40(%rsp) > - movq %r9, 48(%rsp) > - movq %r10, 56(%rsp) > - movq %r11, 64(%rsp) > + movq %rdx, 8(%rsp) > > call ftrace_return_to_handler > > movq %rax, 72(%rsp) > - movq 64(%rsp), %r11 > - movq 56(%rsp), %r10 > - movq 48(%rsp), %r9 > - movq 40(%rsp), %r8 > - movq 32(%rsp), %rdi > - movq 24(%rsp), %rsi > - movq 16(%rsp), %rdx > - movq 8(%rsp), %rcx > + movq 8(%rsp), %rdx > movq (%rsp), %rax > addq $72, %rsp > retq hmm, is this gonna work: movq %rax, 16(%rsp) movq 8(%rsp), %rdx movq (%rsp), %rax addq $16, %rsp retq -- JSR ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-03-25 20:45 ` Jaswinder Singh Rajput @ 2009-03-25 21:26 ` Steven Rostedt 0 siblings, 0 replies; 37+ messages in thread From: Steven Rostedt @ 2009-03-25 21:26 UTC (permalink / raw) To: Jaswinder Singh Rajput Cc: linux kernel, Frederic Weisbecker, Ingo Molnar, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner On Thu, 26 Mar 2009, Jaswinder Singh Rajput wrote: > > > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > > index a331ec3..1ac9986 100644 > > --- a/arch/x86/kernel/entry_64.S > > +++ b/arch/x86/kernel/entry_64.S > > @@ -147,27 +147,14 @@ END(ftrace_graph_caller) > > GLOBAL(return_to_handler) > > subq $80, %rsp > > > > + /* Save the return values */ > > movq %rax, (%rsp) > > - movq %rcx, 8(%rsp) > > - movq %rdx, 16(%rsp) > > - movq %rsi, 24(%rsp) > > - movq %rdi, 32(%rsp) > > - movq %r8, 40(%rsp) > > - movq %r9, 48(%rsp) > > - movq %r10, 56(%rsp) > > - movq %r11, 64(%rsp) > > + movq %rdx, 8(%rsp) > > > > call ftrace_return_to_handler > > > > movq %rax, 72(%rsp) > > - movq 64(%rsp), %r11 > > - movq 56(%rsp), %r10 > > - movq 48(%rsp), %r9 > > - movq 40(%rsp), %r8 > > - movq 32(%rsp), %rdi > > - movq 24(%rsp), %rsi > > - movq 16(%rsp), %rdx > > - movq 8(%rsp), %rcx > > + movq 8(%rsp), %rdx > > movq (%rsp), %rax > > addq $72, %rsp > > retq > > hmm, is this gonna work: > movq %rax, 16(%rsp) > movq 8(%rsp), %rdx > movq (%rsp), %rax > addq $16, %rsp > retq Interesting. I was just talking over IRC with Frederic about this. One theory about his earlier failures was not having a proper stack frame when calling the function. This is probably a bogus theory and there was probably something else that broke the code. But there is no harm in allocating 80 bytes (it worked before). Thus, I rather error on the side of least change than risk it. -- Steve ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-03-25 20:27 ` [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 Steven Rostedt 2009-03-25 20:45 ` Jaswinder Singh Rajput @ 2009-04-08 16:09 ` Ingo Molnar 2009-04-08 16:37 ` Steven Rostedt 2009-04-08 17:40 ` Frederic Weisbecker 1 sibling, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2009-04-08 16:09 UTC (permalink / raw) To: Steven Rostedt Cc: linux kernel, Frederic Weisbecker, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner (replying to old mail - working down my awful email backlog) do we still need this for -tip? Ingo * Steven Rostedt <rostedt@goodmis.org> wrote: > > Ingo, > > Please pull the latest tip/tracing/function-graph tree, which can be found at: > > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git > tip/tracing/function-graph > > > Steven Rostedt (1): > x86, function-graph: only save return values on x86_64 > > ---- > arch/x86/kernel/entry_64.S | 19 +++---------------- > 1 files changed, 3 insertions(+), 16 deletions(-) > --------------------------- > commit d5b754b2208b1c22691a7cd5d1a65398b5973d74 > Author: Steven Rostedt <srostedt@redhat.com> > Date: Wed Mar 25 14:30:04 2009 -0400 > > x86, function-graph: only save return values on x86_64 > > Impact: speed up > > The return to handler portion of the function graph tracer should only > need to save the return values. The caller already saved off the > registers that the callee can modify. The returning function already > saved the registers it modified. When we call our own trace function > it too will save the registers that the callee must restore. > > There's no reason to save off anything more that the registers used > to return the values. > > Note, I did a complete kernel build with this modification and the > function graph tracer running on x86_64. > > Signed-off-by: Steven Rostedt <srostedt@redhat.com> > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > index a331ec3..1ac9986 100644 > --- a/arch/x86/kernel/entry_64.S > +++ b/arch/x86/kernel/entry_64.S > @@ -147,27 +147,14 @@ END(ftrace_graph_caller) > GLOBAL(return_to_handler) > subq $80, %rsp > > + /* Save the return values */ > movq %rax, (%rsp) > - movq %rcx, 8(%rsp) > - movq %rdx, 16(%rsp) > - movq %rsi, 24(%rsp) > - movq %rdi, 32(%rsp) > - movq %r8, 40(%rsp) > - movq %r9, 48(%rsp) > - movq %r10, 56(%rsp) > - movq %r11, 64(%rsp) > + movq %rdx, 8(%rsp) > > call ftrace_return_to_handler > > movq %rax, 72(%rsp) > - movq 64(%rsp), %r11 > - movq 56(%rsp), %r10 > - movq 48(%rsp), %r9 > - movq 40(%rsp), %r8 > - movq 32(%rsp), %rdi > - movq 24(%rsp), %rsi > - movq 16(%rsp), %rdx > - movq 8(%rsp), %rcx > + movq 8(%rsp), %rdx > movq (%rsp), %rax > addq $72, %rsp > retq ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-04-08 16:09 ` Ingo Molnar @ 2009-04-08 16:37 ` Steven Rostedt 2009-04-08 16:41 ` Ingo Molnar 2009-04-08 17:40 ` Frederic Weisbecker 1 sibling, 1 reply; 37+ messages in thread From: Steven Rostedt @ 2009-04-08 16:37 UTC (permalink / raw) To: Ingo Molnar Cc: linux kernel, Frederic Weisbecker, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner On Wed, 8 Apr 2009, Ingo Molnar wrote: > > (replying to old mail - working down my awful email backlog) > > do we still need this for -tip? > I've ran it without issues. It will help speed up the function graph tracer since it removes a bunch of copies at every function call. But I would test it in your work suite, since I only tested it on two boxes. -- Steve ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-04-08 16:37 ` Steven Rostedt @ 2009-04-08 16:41 ` Ingo Molnar 0 siblings, 0 replies; 37+ messages in thread From: Ingo Molnar @ 2009-04-08 16:41 UTC (permalink / raw) To: Steven Rostedt Cc: linux kernel, Frederic Weisbecker, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner * Steven Rostedt <rostedt@goodmis.org> wrote: > > do we still need this for -tip? > > I've ran it without issues. It will help speed up the function > graph tracer since it removes a bunch of copies at every function > call. But I would test it in your work suite, since I only tested > it on two boxes. ok, great - i've applied it to tip:tracing/ftrace. Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 2009-04-08 16:09 ` Ingo Molnar 2009-04-08 16:37 ` Steven Rostedt @ 2009-04-08 17:40 ` Frederic Weisbecker 1 sibling, 0 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-04-08 17:40 UTC (permalink / raw) To: Ingo Molnar Cc: Steven Rostedt, linux kernel, Tim Bird, Ingo Molnar, Abhishek Sagar, Peter Zijlstra, Thomas Gleixner On Wed, Apr 08, 2009 at 06:09:15PM +0200, Ingo Molnar wrote: > > (replying to old mail - working down my awful email backlog) > > do we still need this for -tip? > > Ingo Yes. It drops a small overhead on the function graph tracer. Thanks, Frederic. > * Steven Rostedt <rostedt@goodmis.org> wrote: > > > > > Ingo, > > > > Please pull the latest tip/tracing/function-graph tree, which can be found at: > > > > git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-trace.git > > tip/tracing/function-graph > > > > > > Steven Rostedt (1): > > x86, function-graph: only save return values on x86_64 > > > > ---- > > arch/x86/kernel/entry_64.S | 19 +++---------------- > > 1 files changed, 3 insertions(+), 16 deletions(-) > > --------------------------- > > commit d5b754b2208b1c22691a7cd5d1a65398b5973d74 > > Author: Steven Rostedt <srostedt@redhat.com> > > Date: Wed Mar 25 14:30:04 2009 -0400 > > > > x86, function-graph: only save return values on x86_64 > > > > Impact: speed up > > > > The return to handler portion of the function graph tracer should only > > need to save the return values. The caller already saved off the > > registers that the callee can modify. The returning function already > > saved the registers it modified. When we call our own trace function > > it too will save the registers that the callee must restore. > > > > There's no reason to save off anything more that the registers used > > to return the values. > > > > Note, I did a complete kernel build with this modification and the > > function graph tracer running on x86_64. > > > > Signed-off-by: Steven Rostedt <srostedt@redhat.com> > > > > diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S > > index a331ec3..1ac9986 100644 > > --- a/arch/x86/kernel/entry_64.S > > +++ b/arch/x86/kernel/entry_64.S > > @@ -147,27 +147,14 @@ END(ftrace_graph_caller) > > GLOBAL(return_to_handler) > > subq $80, %rsp > > > > + /* Save the return values */ > > movq %rax, (%rsp) > > - movq %rcx, 8(%rsp) > > - movq %rdx, 16(%rsp) > > - movq %rsi, 24(%rsp) > > - movq %rdi, 32(%rsp) > > - movq %r8, 40(%rsp) > > - movq %r9, 48(%rsp) > > - movq %r10, 56(%rsp) > > - movq %r11, 64(%rsp) > > + movq %rdx, 8(%rsp) > > > > call ftrace_return_to_handler > > > > movq %rax, 72(%rsp) > > - movq 64(%rsp), %r11 > > - movq 56(%rsp), %r10 > > - movq 48(%rsp), %r9 > > - movq 40(%rsp), %r8 > > - movq 32(%rsp), %rdi > > - movq 24(%rsp), %rsi > > - movq 16(%rsp), %rdx > > - movq 8(%rsp), %rcx > > + movq 8(%rsp), %rdx > > movq (%rsp), %rax > > addq $72, %rsp > > retq ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 21:36 ` Frederic Weisbecker 2009-03-24 21:40 ` Tim Bird 2009-03-24 21:48 ` Ingo Molnar @ 2009-03-24 22:29 ` Abhishek Sagar 2009-03-24 22:48 ` Frederic Weisbecker 2 siblings, 1 reply; 37+ messages in thread From: Abhishek Sagar @ 2009-03-24 22:29 UTC (permalink / raw) To: Frederic Weisbecker Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Russell King, Uwe Kleine-König Frederic Weisbecker wrote: > Yes, ie: > > _Before jumping to the function entry hook, you must save > the arguments for the traced function on the stack. > On x86, its eax, edx and ecx. > On arm, it will be r0-r3. > Then you have to transmit the address of the traced function > (it's on r14) and it's parent (must rely on fp for that). > Then you call the entry hook and restore the old scratch/arg > registers. > Instead of just restoring the old backed-up args, lr can be fixed up inside the entry hook to point to the return hook. So when the traced function returns, it actually returns to the return hook (where we can restore the original return address). This means that -finstrument-functions is not required at all. This is analogous to how kretprobes work. The only difference here is that instead of planting a kprobe at the function entry and redirecting the function return to the profiling exit routine, we can use mcount. This is slightly more complicated to implement but can be a very efficient alternative to kretprobes. -- Abhishek ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 22:29 ` Anyone working on ftrace function graph support on ARM? Abhishek Sagar @ 2009-03-24 22:48 ` Frederic Weisbecker 2009-03-25 8:42 ` Russell King - ARM Linux 0 siblings, 1 reply; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-24 22:48 UTC (permalink / raw) To: Abhishek Sagar Cc: Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Russell King, Uwe Kleine-König On Tue, Mar 24, 2009 at 06:29:03PM -0400, Abhishek Sagar wrote: > Frederic Weisbecker wrote: > > Yes, ie: > > > > _Before jumping to the function entry hook, you must save > > the arguments for the traced function on the stack. > > On x86, its eax, edx and ecx. > > On arm, it will be r0-r3. > > Then you have to transmit the address of the traced function > > (it's on r14) and it's parent (must rely on fp for that). > > Then you call the entry hook and restore the old scratch/arg > > registers. > > > Instead of just restoring the old backed-up args, lr can be fixed up > inside the entry hook to point to the return hook. So when the traced > function returns, it actually returns to the return hook (where we can > restore the original return address). This means that > -finstrument-functions is not required at all. This is analogous to how > kretprobes work. The only difference here is that instead of planting a > kprobe at the function entry and redirecting the function return to the > profiling exit routine, we can use mcount. This is slightly more > complicated to implement but can be a very efficient alternative to > kretprobes. > -- > Abhishek > Indeed, you need to override lr, that even the only solution. I was still thinking in an x86 way with its on stack return address. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-24 22:48 ` Frederic Weisbecker @ 2009-03-25 8:42 ` Russell King - ARM Linux 2009-03-25 8:54 ` Ingo Molnar ` (2 more replies) 0 siblings, 3 replies; 37+ messages in thread From: Russell King - ARM Linux @ 2009-03-25 8:42 UTC (permalink / raw) To: Frederic Weisbecker Cc: Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König On Tue, Mar 24, 2009 at 11:48:58PM +0100, Frederic Weisbecker wrote: > On Tue, Mar 24, 2009 at 06:29:03PM -0400, Abhishek Sagar wrote: > > Instead of just restoring the old backed-up args, lr can be fixed up > > inside the entry hook to point to the return hook. So when the traced > > function returns, it actually returns to the return hook (where we can > > restore the original return address). This means that > > -finstrument-functions is not required at all. This is analogous to how > > kretprobes work. The only difference here is that instead of planting a > > kprobe at the function entry and redirecting the function return to the > > profiling exit routine, we can use mcount. This is slightly more > > complicated to implement but can be a very efficient alternative to > > kretprobes. > > -- > > Abhishek > > > > Indeed, you need to override lr, that even the only solution. > I was still thinking in an x86 way with its on stack return address. As pointed out in my previous mail, identifying where on the stack the return address is stored is only possible for OABI with frame pointers. EABI will probably be possible with the stack unwinding code, but it probably won't be cheap. The EABI unwinder is scheduled for merging during the present now-open merge window. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 8:42 ` Russell King - ARM Linux @ 2009-03-25 8:54 ` Ingo Molnar 2009-03-25 9:57 ` Russell King - ARM Linux 2009-03-25 16:41 ` Tim Bird 2009-03-25 11:41 ` Frederic Weisbecker 2009-03-25 16:34 ` Tim Bird 2 siblings, 2 replies; 37+ messages in thread From: Ingo Molnar @ 2009-03-25 8:54 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Frederic Weisbecker, Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König * Russell King - ARM Linux <linux@arm.linux.org.uk> wrote: > On Tue, Mar 24, 2009 at 11:48:58PM +0100, Frederic Weisbecker wrote: > > On Tue, Mar 24, 2009 at 06:29:03PM -0400, Abhishek Sagar wrote: > > > Instead of just restoring the old backed-up args, lr can be fixed up > > > inside the entry hook to point to the return hook. So when the traced > > > function returns, it actually returns to the return hook (where we can > > > restore the original return address). This means that > > > -finstrument-functions is not required at all. This is analogous to how > > > kretprobes work. The only difference here is that instead of planting a > > > kprobe at the function entry and redirecting the function return to the > > > profiling exit routine, we can use mcount. This is slightly more > > > complicated to implement but can be a very efficient alternative to > > > kretprobes. > > > -- > > > Abhishek > > > > > > > Indeed, you need to override lr, that even the only solution. > > I was still thinking in an x86 way with its on stack return address. > > As pointed out in my previous mail, identifying where on the stack > the return address is stored is only possible for OABI with frame > pointers. > > EABI will probably be possible with the stack unwinding code, but > it probably won't be cheap. The EABI unwinder is scheduled for > merging during the present now-open merge window. Unwinding is not realistic or desired for the function tracer - it runs in every kernel function so performance is paramount. So, if i understood you correctly, an OABI_COMPAT and FRAME_POINTERS dependency has to be added to the ARM HAVE_FUNCTION_GRAPH_TRACER Kconfig rule. Ingo ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 8:54 ` Ingo Molnar @ 2009-03-25 9:57 ` Russell King - ARM Linux 2009-03-25 10:45 ` Uwe Kleine-König 2009-03-25 16:41 ` Tim Bird 1 sibling, 1 reply; 37+ messages in thread From: Russell King - ARM Linux @ 2009-03-25 9:57 UTC (permalink / raw) To: Ingo Molnar Cc: Frederic Weisbecker, Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König On Wed, Mar 25, 2009 at 09:54:18AM +0100, Ingo Molnar wrote: > Unwinding is not realistic or desired for the function tracer - it > runs in every kernel function so performance is paramount. Which would also include the unwinder itself as well. > So, if i understood you correctly, an OABI_COMPAT and FRAME_POINTERS > dependency has to be added to the ARM HAVE_FUNCTION_GRAPH_TRACER > Kconfig rule. If we have frame pointers enabled with EABI, then it looks like it will work as well. So the dependency should be on FRAME_POINTERS for _every_ feature using the mcount code. Hmm, and it looks like the ftrace code is rather crap: ENTRY(mcount) stmdb sp!, {r0-r3, lr} ldr r0, =ftrace_trace_function ldr r2, [r0] adr r0, ftrace_stub cmp r0, r2 bne trace ldr lr, [fp, #-4] @ restore lr ldmia sp!, {r0-r3, pc} trace: ldr r1, [fp, #-4] @ lr of instrumented routine mov r0, lr sub r0, r0, #MCOUNT_INSN_SIZE mov lr, pc mov pc, r2 XXX calling a C function results in r0-r3,ip,lr being clobbered XXX mov lr, r1 @ restore lr XXX not necessarily, r1 might be some other random value ldmia sp!, {r0-r3, pc} In fact, to me the above code looks totally crap, because it's checking whether the caller is 'ftrace_stub'. It can never be 'ftrace_stub' because that is an assembly function: .globl ftrace_stub ftrace_stub: mov pc, lr and therefore gcc has no hand in adding a mcount call to it. Moreover, the _dynamic_ ftrace code does this: ENTRY(mcount) stmdb sp!, {r0-r3, lr} mov r0, lr sub r0, r0, #MCOUNT_INSN_SIZE .globl mcount_call mcount_call: bl ftrace_stub ldr lr, [fp, #-4] @ restore lr ldmia sp!, {r0-r3, pc} ENTRY(ftrace_caller) stmdb sp!, {r0-r3, lr} ldr r1, [fp, #-4] mov r0, lr sub r0, r0, #MCOUNT_INSN_SIZE .globl ftrace_call ftrace_call: bl ftrace_stub ldr lr, [fp, #-4] @ restore lr ldmia sp!, {r0-r3, pc} In other words, it pushes some words onto the stack, sets r0, calls an assembly function which does nothing but just returns, reloads lr, restores the stack and returns. This ftrace implementation looks like an exercise in slowing down execution to me with no added value. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 9:57 ` Russell King - ARM Linux @ 2009-03-25 10:45 ` Uwe Kleine-König 2009-03-25 11:21 ` Russell King - ARM Linux 0 siblings, 1 reply; 37+ messages in thread From: Uwe Kleine-König @ 2009-03-25 10:45 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Ingo Molnar, Frederic Weisbecker, Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar Hi Russell, On Wed, Mar 25, 2009 at 09:57:51AM +0000, Russell King - ARM Linux wrote: > On Wed, Mar 25, 2009 at 09:54:18AM +0100, Ingo Molnar wrote: > > Unwinding is not realistic or desired for the function tracer - it > > runs in every kernel function so performance is paramount. > > Which would also include the unwinder itself as well. > > > So, if i understood you correctly, an OABI_COMPAT and FRAME_POINTERS > > dependency has to be added to the ARM HAVE_FUNCTION_GRAPH_TRACER > > Kconfig rule. > > If we have frame pointers enabled with EABI, then it looks like it will > work as well. So the dependency should be on FRAME_POINTERS for _every_ > feature using the mcount code. > > Hmm, and it looks like the ftrace code is rather crap: > > ENTRY(mcount) > stmdb sp!, {r0-r3, lr} > ldr r0, =ftrace_trace_function > ldr r2, [r0] > adr r0, ftrace_stub > cmp r0, r2 > bne trace > ldr lr, [fp, #-4] @ restore lr > ldmia sp!, {r0-r3, pc} > > trace: > ldr r1, [fp, #-4] @ lr of instrumented routine > mov r0, lr > sub r0, r0, #MCOUNT_INSN_SIZE > mov lr, pc > mov pc, r2 > XXX calling a C function results in r0-r3,ip,lr being clobbered XXX > > mov lr, r1 @ restore lr > XXX not necessarily, r1 might be some other random value > > ldmia sp!, {r0-r3, pc} > > In fact, to me the above code looks totally crap, because it's checking > whether the caller is 'ftrace_stub'. It can never be 'ftrace_stub' > because that is an assembly function: > > .globl ftrace_stub > ftrace_stub: > mov pc, lr > > and therefore gcc has no hand in adding a mcount call to it. Hhhm. Isn't the equivalent C-Code ~ as follows: if (ftrace_trace_function != ftrace_stub) trace(some, args); return; ? ftrace_trace_function is initialised to ftrace_stub at compile time and is changed when a tracerfunction is registered. > Moreover, the _dynamic_ ftrace code does this: > > ENTRY(mcount) > stmdb sp!, {r0-r3, lr} > mov r0, lr > sub r0, r0, #MCOUNT_INSN_SIZE > > .globl mcount_call > mcount_call: > bl ftrace_stub > ldr lr, [fp, #-4] @ restore lr > ldmia sp!, {r0-r3, pc} > > ENTRY(ftrace_caller) > stmdb sp!, {r0-r3, lr} > ldr r1, [fp, #-4] > mov r0, lr > sub r0, r0, #MCOUNT_INSN_SIZE > > .globl ftrace_call > ftrace_call: > bl ftrace_stub > ldr lr, [fp, #-4] @ restore lr > ldmia sp!, {r0-r3, pc} > > In other words, it pushes some words onto the stack, sets r0, calls > an assembly function which does nothing but just returns, reloads lr, > restores the stack and returns. This ftrace implementation looks like > an exercise in slowing down execution to me with no added value. The idea is that the instruction at address mcount_call (and ftrace_call IIRC) is rewritten at run time. Still the code is not active currently (because CONFIG_DYNAMIC_FTRACE isn't selectable for ARM) and needs some care anyhow on reactivation because the way how dynamic ftrace works changed somehow. Didn't look at it up to now though. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 10:45 ` Uwe Kleine-König @ 2009-03-25 11:21 ` Russell King - ARM Linux 2009-03-25 12:09 ` Uwe Kleine-König 0 siblings, 1 reply; 37+ messages in thread From: Russell King - ARM Linux @ 2009-03-25 11:21 UTC (permalink / raw) To: Uwe Kleine-König Cc: Ingo Molnar, Frederic Weisbecker, Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar On Wed, Mar 25, 2009 at 11:45:05AM +0100, Uwe Kleine-König wrote: > Hi Russell, > > On Wed, Mar 25, 2009 at 09:57:51AM +0000, Russell King - ARM Linux wrote: > > On Wed, Mar 25, 2009 at 09:54:18AM +0100, Ingo Molnar wrote: > > > Unwinding is not realistic or desired for the function tracer - it > > > runs in every kernel function so performance is paramount. > > > > Which would also include the unwinder itself as well. > > > > > So, if i understood you correctly, an OABI_COMPAT and FRAME_POINTERS > > > dependency has to be added to the ARM HAVE_FUNCTION_GRAPH_TRACER > > > Kconfig rule. > > > > If we have frame pointers enabled with EABI, then it looks like it will > > work as well. So the dependency should be on FRAME_POINTERS for _every_ > > feature using the mcount code. > > > > Hmm, and it looks like the ftrace code is rather crap: > > > > ENTRY(mcount) > > stmdb sp!, {r0-r3, lr} > > ldr r0, =ftrace_trace_function > > ldr r2, [r0] > > adr r0, ftrace_stub > > cmp r0, r2 > > bne trace > > ldr lr, [fp, #-4] @ restore lr > > ldmia sp!, {r0-r3, pc} > > > > trace: > > ldr r1, [fp, #-4] @ lr of instrumented routine > > mov r0, lr > > sub r0, r0, #MCOUNT_INSN_SIZE > > mov lr, pc > > mov pc, r2 > > XXX calling a C function results in r0-r3,ip,lr being clobbered XXX > > > > mov lr, r1 @ restore lr > > XXX not necessarily, r1 might be some other random value > > > > ldmia sp!, {r0-r3, pc} > > > > In fact, to me the above code looks totally crap, because it's checking > > whether the caller is 'ftrace_stub'. It can never be 'ftrace_stub' > > because that is an assembly function: > > > > .globl ftrace_stub > > ftrace_stub: > > mov pc, lr > > > > and therefore gcc has no hand in adding a mcount call to it. > Hhhm. Isn't the equivalent C-Code ~ as follows: > > if (ftrace_trace_function != ftrace_stub) > trace(some, args); > return; > ? ftrace_trace_function is initialised to ftrace_stub at compile time > and is changed when a tracerfunction is registered. Correct. But my point is that there's no way for ftrace_stub to ever call mcount. So the check there is pointless. > > Moreover, the _dynamic_ ftrace code does this: > > > > ENTRY(mcount) > > stmdb sp!, {r0-r3, lr} > > mov r0, lr > > sub r0, r0, #MCOUNT_INSN_SIZE > > > > .globl mcount_call > > mcount_call: > > bl ftrace_stub > > ldr lr, [fp, #-4] @ restore lr > > ldmia sp!, {r0-r3, pc} > > > > ENTRY(ftrace_caller) > > stmdb sp!, {r0-r3, lr} > > ldr r1, [fp, #-4] > > mov r0, lr > > sub r0, r0, #MCOUNT_INSN_SIZE > > > > .globl ftrace_call > > ftrace_call: > > bl ftrace_stub > > ldr lr, [fp, #-4] @ restore lr > > ldmia sp!, {r0-r3, pc} > > > > In other words, it pushes some words onto the stack, sets r0, calls > > an assembly function which does nothing but just returns, reloads lr, > > restores the stack and returns. This ftrace implementation looks like > > an exercise in slowing down execution to me with no added value. > The idea is that the instruction at address mcount_call (and > ftrace_call IIRC) is rewritten at run time. > Still the code is not active currently (because CONFIG_DYNAMIC_FTRACE > isn't selectable for ARM) and needs some care anyhow on reactivation > because the way how dynamic ftrace works changed somehow. Didn't look > at it up to now though. Ok - it would be nice if there was a comment to explain that. Is someone going to fix the existing ftrace before trying to build stuff on top of it? ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 11:21 ` Russell King - ARM Linux @ 2009-03-25 12:09 ` Uwe Kleine-König 0 siblings, 0 replies; 37+ messages in thread From: Uwe Kleine-König @ 2009-03-25 12:09 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Ingo Molnar, Frederic Weisbecker, Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar Hi Russell, > > > Hmm, and it looks like the ftrace code is rather crap: > > > > > > ENTRY(mcount) > > > stmdb sp!, {r0-r3, lr} > > > ldr r0, =ftrace_trace_function > > > ldr r2, [r0] > > > adr r0, ftrace_stub > > > cmp r0, r2 > > > bne trace > > > ldr lr, [fp, #-4] @ restore lr > > > ldmia sp!, {r0-r3, pc} > > > > > > trace: > > > ldr r1, [fp, #-4] @ lr of instrumented routine > > > mov r0, lr > > > sub r0, r0, #MCOUNT_INSN_SIZE > > > mov lr, pc > > > mov pc, r2 > > > XXX calling a C function results in r0-r3,ip,lr being clobbered XXX > > > > > > mov lr, r1 @ restore lr > > > XXX not necessarily, r1 might be some other random value > > > > > > ldmia sp!, {r0-r3, pc} > > > > > > In fact, to me the above code looks totally crap, because it's checking > > > whether the caller is 'ftrace_stub'. It can never be 'ftrace_stub' > > > because that is an assembly function: > > > > > > .globl ftrace_stub > > > ftrace_stub: > > > mov pc, lr > > > > > > and therefore gcc has no hand in adding a mcount call to it. > > Hhhm. Isn't the equivalent C-Code ~ as follows: > > > > if (ftrace_trace_function != ftrace_stub) > > trace(some, args); > > return; > > ? ftrace_trace_function is initialised to ftrace_stub at compile time > > and is changed when a tracerfunction is registered. > > Correct. But my point is that there's no way for ftrace_stub to ever call > mcount. So the check there is pointless. Which check? ftrace_trace_function isn't the caller of mcount but a function pointer defined in kernel/trace/ftrace.c. And if this pointer changes, the if-condition becomes true. > Ok - it would be nice if there was a comment to explain that. Some comments would be nice, that's right. > Is someone going to fix the existing ftrace before trying to build stuff > on top of it? Mmh, I think you misunderstood the code, so I don't see something to fix here (at least for non-dynamic ftrace). If the next few mails show, that it's me who didn't understand the code, I will fix the problems, of course. Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 8:54 ` Ingo Molnar 2009-03-25 9:57 ` Russell King - ARM Linux @ 2009-03-25 16:41 ` Tim Bird 1 sibling, 0 replies; 37+ messages in thread From: Tim Bird @ 2009-03-25 16:41 UTC (permalink / raw) To: Ingo Molnar Cc: Russell King - ARM Linux, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König Ingo Molnar wrote: > Unwinding is not realistic or desired for the function tracer - it > runs in every kernel function so performance is paramount. > > So, if i understood you correctly, an OABI_COMPAT and FRAME_POINTERS > dependency has to be added to the ARM HAVE_FUNCTION_GRAPH_TRACER > Kconfig rule. This, unfortunately, is a problem for me. All my ARM projects are now using EABI. If the unwinding looks like it has too much overhead, I'll do some research on the -finstrument-functions (__cyg_profile_func_enter/exit) approach. I'm not sure, however, if it's possible to integrate this with the dynamic tracing mechanisms, though. So something may have to give to get this supported on ARM. -- Tim ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 8:42 ` Russell King - ARM Linux 2009-03-25 8:54 ` Ingo Molnar @ 2009-03-25 11:41 ` Frederic Weisbecker 2009-03-25 16:34 ` Tim Bird 2 siblings, 0 replies; 37+ messages in thread From: Frederic Weisbecker @ 2009-03-25 11:41 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Abhishek Sagar, Tim Bird, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König On Wed, Mar 25, 2009 at 08:42:48AM +0000, Russell King - ARM Linux wrote: > On Tue, Mar 24, 2009 at 11:48:58PM +0100, Frederic Weisbecker wrote: > > On Tue, Mar 24, 2009 at 06:29:03PM -0400, Abhishek Sagar wrote: > > > Instead of just restoring the old backed-up args, lr can be fixed up > > > inside the entry hook to point to the return hook. So when the traced > > > function returns, it actually returns to the return hook (where we can > > > restore the original return address). This means that > > > -finstrument-functions is not required at all. This is analogous to how > > > kretprobes work. The only difference here is that instead of planting a > > > kprobe at the function entry and redirecting the function return to the > > > profiling exit routine, we can use mcount. This is slightly more > > > complicated to implement but can be a very efficient alternative to > > > kretprobes. > > > -- > > > Abhishek > > > > > > > Indeed, you need to override lr, that even the only solution. > > I was still thinking in an x86 way with its on stack return address. > > As pointed out in my previous mail, identifying where on the stack the > return address is stored is only possible for OABI with frame pointers. > > EABI will probably be possible with the stack unwinding code, but it > probably won't be cheap. The EABI unwinder is scheduled for merging > during the present now-open merge window. Hm, if I'm not wrong, the function tracer already depends on frame pointer, this is necessary to retrieve the parent of the caller. ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 8:42 ` Russell King - ARM Linux 2009-03-25 8:54 ` Ingo Molnar 2009-03-25 11:41 ` Frederic Weisbecker @ 2009-03-25 16:34 ` Tim Bird 2009-03-25 17:05 ` Uwe Kleine-König 2009-03-27 12:58 ` Catalin Marinas 2 siblings, 2 replies; 37+ messages in thread From: Tim Bird @ 2009-03-25 16:34 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König Russell King - ARM Linux wrote: > As pointed out in my previous mail, identifying where on the stack the > return address is stored is only possible for OABI with frame pointers. > > EABI will probably be possible with the stack unwinding code, but it > probably won't be cheap. The EABI unwinder is scheduled for merging > during the present now-open merge window. > -finstrument-functions is looking better and better. I know it adds more overhead than the mcount call, and may wreak havoc with the dynamic ftrace mechanisms, but at least the callouts are simple, clear, and you get both entry and exit, at fixed costs. I'll take a look at the EABI unwinder to see what kind of variability it introduces (e.g. if it does a stack scan or something). Thanks very much for the information! -- Tim ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 16:34 ` Tim Bird @ 2009-03-25 17:05 ` Uwe Kleine-König 2009-03-25 17:17 ` Russell King - ARM Linux 2009-03-27 12:58 ` Catalin Marinas 1 sibling, 1 reply; 37+ messages in thread From: Uwe Kleine-König @ 2009-03-25 17:05 UTC (permalink / raw) To: Tim Bird, Russell King - ARM Linux Cc: Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar On Wed, Mar 25, 2009 at 09:34:07AM -0700, Tim Bird wrote: > Russell King - ARM Linux wrote: > > As pointed out in my previous mail, identifying where on the stack the > > return address is stored is only possible for OABI with frame pointers. > > > > EABI will probably be possible with the stack unwinding code, but it > > probably won't be cheap. The EABI unwinder is scheduled for merging > > during the present now-open merge window. EABI with frame pointers should work, too, shouldn't it? Best regards Uwe -- Pengutronix e.K. | Uwe Kleine-König | Industrial Linux Solutions | http://www.pengutronix.de/ | ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 17:05 ` Uwe Kleine-König @ 2009-03-25 17:17 ` Russell King - ARM Linux 2009-03-25 18:37 ` Tim Bird 0 siblings, 1 reply; 37+ messages in thread From: Russell King - ARM Linux @ 2009-03-25 17:17 UTC (permalink / raw) To: Uwe Kleine-König Cc: Tim Bird, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar On Wed, Mar 25, 2009 at 06:05:33PM +0100, Uwe Kleine-König wrote: > On Wed, Mar 25, 2009 at 09:34:07AM -0700, Tim Bird wrote: > > Russell King - ARM Linux wrote: > > > As pointed out in my previous mail, identifying where on the stack the > > > return address is stored is only possible for OABI with frame pointers. > > > > > > EABI will probably be possible with the stack unwinding code, but it > > > probably won't be cheap. The EABI unwinder is scheduled for merging > > > during the present now-open merge window. > EABI with frame pointers should work, too, shouldn't it? Yes, as I said at the top of my reply dated Wed, 25 Mar 2009 09:57:51 +0000 ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 17:17 ` Russell King - ARM Linux @ 2009-03-25 18:37 ` Tim Bird 2009-03-25 18:41 ` Steven Rostedt 0 siblings, 1 reply; 37+ messages in thread From: Tim Bird @ 2009-03-25 18:37 UTC (permalink / raw) To: Russell King - ARM Linux Cc: Uwe Kleine-König, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar Russell King - ARM Linux wrote: > On Wed, Mar 25, 2009 at 06:05:33PM +0100, Uwe Kleine-K�nig wrote: >> On Wed, Mar 25, 2009 at 09:34:07AM -0700, Tim Bird wrote: >>> Russell King - ARM Linux wrote: >>>> As pointed out in my previous mail, identifying where on the stack the >>>> return address is stored is only possible for OABI with frame pointers. >>>> >>>> EABI will probably be possible with the stack unwinding code, but it >>>> probably won't be cheap. The EABI unwinder is scheduled for merging >>>> during the present now-open merge window. >> EABI with frame pointers should work, too, shouldn't it? > > Yes, as I said at the top of my reply dated Wed, 25 Mar 2009 09:57:51 +0000 It turns out you can't use -pg and -fomit-frame-pointers at the same time. At least, my gcc complains about this: $ make arm-sony-linux-gnueabi-dev-gcc -g -pg -fomit-frame-pointer -o hello hello.c arm-sony-linux-gnueabi-dev-gcc: -pg and -fomit-frame-pointer are incompatible make: *** [hello] Error 1 ============================= Tim Bird Architecture Group Chair, CE Linux Forum Senior Staff Engineer, Sony Corporation of America ============================= ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 18:37 ` Tim Bird @ 2009-03-25 18:41 ` Steven Rostedt 0 siblings, 0 replies; 37+ messages in thread From: Steven Rostedt @ 2009-03-25 18:41 UTC (permalink / raw) To: Tim Bird Cc: Russell King - ARM Linux, Uwe Kleine-König, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Ingo Molnar On Wed, 25 Mar 2009, Tim Bird wrote: > Russell King - ARM Linux wrote: > > On Wed, Mar 25, 2009 at 06:05:33PM +0100, Uwe Kleine-K???nig wrote: > >> On Wed, Mar 25, 2009 at 09:34:07AM -0700, Tim Bird wrote: > >>> Russell King - ARM Linux wrote: > >>>> As pointed out in my previous mail, identifying where on the stack the > >>>> return address is stored is only possible for OABI with frame pointers. > >>>> > >>>> EABI will probably be possible with the stack unwinding code, but it > >>>> probably won't be cheap. The EABI unwinder is scheduled for merging > >>>> during the present now-open merge window. > >> EABI with frame pointers should work, too, shouldn't it? > > > > Yes, as I said at the top of my reply dated Wed, 25 Mar 2009 09:57:51 +0000 > > It turns out you can't use -pg and -fomit-frame-pointers at the same time. > At least, my gcc complains about this: > $ make > arm-sony-linux-gnueabi-dev-gcc -g -pg -fomit-frame-pointer -o hello hello.c > arm-sony-linux-gnueabi-dev-gcc: -pg and -fomit-frame-pointer are incompatible > make: *** [hello] Error 1 Correct, that is because mcount requires (as Russell earlier pointed out) the use of frame pointers. -- Steve ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-25 16:34 ` Tim Bird 2009-03-25 17:05 ` Uwe Kleine-König @ 2009-03-27 12:58 ` Catalin Marinas 2009-04-09 15:29 ` Daniel Jacobowitz 1 sibling, 1 reply; 37+ messages in thread From: Catalin Marinas @ 2009-03-27 12:58 UTC (permalink / raw) To: Tim Bird Cc: Russell King - ARM Linux, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König On Wed, 2009-03-25 at 09:34 -0700, Tim Bird wrote: > Russell King - ARM Linux wrote: > > As pointed out in my previous mail, identifying where on the stack the > > return address is stored is only possible for OABI with frame pointers. > > > > EABI will probably be possible with the stack unwinding code, but it > > probably won't be cheap. The EABI unwinder is scheduled for merging > > during the present now-open merge window. > > -finstrument-functions is looking better and better. I know it > adds more overhead than the mcount call, and may wreak havoc with > the dynamic ftrace mechanisms, but at least the callouts are > simple, clear, and you get both entry and exit, at fixed > costs. This approach would work even with Thumb-2 compiled kernels (not in mainline) where the frame pointer is useless wrt backtraces. The Thumb-2 kernel cannot currently be compiled with frame pointer because there is some inline assembly where gcc fail to allocate lower registers (in Thumb, the frame pointer is r7). Anyway, the EABI toolchain I have (from CodeSourcery) generates something like below with -pg for both ARM and Thumb code (so that it doesn't rely on the frame pointer): push {lr} bl __gnu_mcount_nc I think this will be (was?) merged into the mainline gcc for ARM. The -pg option is still incompatible with -fomit-frame-pointer but maybe it shouldn't be anymore. > I'll take a look at the EABI unwinder to see > what kind of variability it introduces (e.g. if it does a stack > scan or something). It's not cheap - it looks up the function address in a binary tree and it reads some bytecodes (either from the tree or from a different table pointed to by the tree) which it interprets to perform the equivalent of a return from the current function (reading registers from the stack and changing SP, LR). -- Catalin ^ permalink raw reply [flat|nested] 37+ messages in thread
* Re: Anyone working on ftrace function graph support on ARM? 2009-03-27 12:58 ` Catalin Marinas @ 2009-04-09 15:29 ` Daniel Jacobowitz 0 siblings, 0 replies; 37+ messages in thread From: Daniel Jacobowitz @ 2009-04-09 15:29 UTC (permalink / raw) To: Catalin Marinas Cc: Tim Bird, Russell King - ARM Linux, Frederic Weisbecker, Abhishek Sagar, linux-arm-kernel, linux kernel, Steven Rostedt, Ingo Molnar, Uwe Kleine-König Sorry for the late reply, I'm way behind on list mail. On Fri, Mar 27, 2009 at 12:58:08PM +0000, Catalin Marinas wrote: > Anyway, the EABI toolchain I have (from CodeSourcery) generates > something like below with -pg for both ARM and Thumb code (so that it > doesn't rely on the frame pointer): > > push {lr} > bl __gnu_mcount_nc > > I think this will be (was?) merged into the mainline gcc for ARM. The > -pg option is still incompatible with -fomit-frame-pointer but maybe it > shouldn't be anymore. Was merged. It'll be in GCC 4.4 and is in our current Lite releases; it was created to solve precisely this problem. It should be easy to apply to an older GCC release if desired, but it's not in existing FSF releases. -- Daniel Jacobowitz CodeSourcery ^ permalink raw reply [flat|nested] 37+ messages in thread
end of thread, other threads:[~2009-04-09 15:54 UTC | newest] Thread overview: 37+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-03-24 19:38 Anyone working on ftrace function graph support on ARM? Tim Bird 2009-03-24 20:25 ` Ingo Molnar 2009-03-24 20:48 ` Tim Bird 2009-03-24 20:38 ` Uwe Kleine-König 2009-03-24 21:36 ` Frederic Weisbecker 2009-03-24 21:40 ` Tim Bird 2009-03-24 21:48 ` Ingo Molnar 2009-03-24 21:57 ` Frederic Weisbecker 2009-03-24 22:14 ` Ingo Molnar 2009-03-24 22:54 ` Frederic Weisbecker 2009-03-25 8:36 ` Russell King - ARM Linux 2009-03-25 16:00 ` Steven Rostedt 2009-03-25 17:13 ` Frederic Weisbecker 2009-03-25 20:27 ` [PATCH][GIT PULL] x86, function-graph: only save return values on x86_64 Steven Rostedt 2009-03-25 20:45 ` Jaswinder Singh Rajput 2009-03-25 21:26 ` Steven Rostedt 2009-04-08 16:09 ` Ingo Molnar 2009-04-08 16:37 ` Steven Rostedt 2009-04-08 16:41 ` Ingo Molnar 2009-04-08 17:40 ` Frederic Weisbecker 2009-03-24 22:29 ` Anyone working on ftrace function graph support on ARM? Abhishek Sagar 2009-03-24 22:48 ` Frederic Weisbecker 2009-03-25 8:42 ` Russell King - ARM Linux 2009-03-25 8:54 ` Ingo Molnar 2009-03-25 9:57 ` Russell King - ARM Linux 2009-03-25 10:45 ` Uwe Kleine-König 2009-03-25 11:21 ` Russell King - ARM Linux 2009-03-25 12:09 ` Uwe Kleine-König 2009-03-25 16:41 ` Tim Bird 2009-03-25 11:41 ` Frederic Weisbecker 2009-03-25 16:34 ` Tim Bird 2009-03-25 17:05 ` Uwe Kleine-König 2009-03-25 17:17 ` Russell King - ARM Linux 2009-03-25 18:37 ` Tim Bird 2009-03-25 18:41 ` Steven Rostedt 2009-03-27 12:58 ` Catalin Marinas 2009-04-09 15:29 ` Daniel Jacobowitz
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox