* [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Jiri Olsa @ 2025-10-08 21:08 UTC
To: Masami Hiramatsu, Steven Rostedt, Josh Poimboeuf
Cc: Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

hi,
I'm getting no stacktrace from bpf program attached on kretprobe.multi probe
(which means on top of return fprobe) on x86.

I think we need some kind of treatment we do for rethook, AFAICS the ORC unwind
stops on return_to_handler, because the stack and the function itself are not
adjusted for unwind_recover_ret_addr call

If it's any help I pushed the bpf/selftest for that in here:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=stacktrace_test

just execute:
  # test_progs -t stacktrace_map/kretprobe_multi

thanks,
jirka
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Masami Hiramatsu @ 2025-10-12 4:09 UTC
To: Jiri Olsa
Cc: Steven Rostedt, Josh Poimboeuf, Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

On Wed, 8 Oct 2025 23:08:26 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> hi,
> I'm getting no stacktrace from bpf program attached on kretprobe.multi probe
> (which means on top of return fprobe) on x86.
>
> I think we need some kind of treatment we do for rethook, AFAICS the ORC unwind
> stops on return_to_handler, because the stack and the function itself are not
> adjusted for unwind_recover_ret_addr call
>
> If it's any help I pushed the bpf/selftest for that in here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=stacktrace_test
>
> just execute:
>   # test_progs -t stacktrace_map/kretprobe_multi

Hmm, curious. As long as we are using fgraph, the stacktrace should work.
Could this happen if the function-graph tracer is enabled too?

Thank you,

>
> thanks,
> jirka

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Jiri Olsa @ 2025-10-13 14:36 UTC
To: Masami Hiramatsu
Cc: Jiri Olsa, Steven Rostedt, Josh Poimboeuf, Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

On Sun, Oct 12, 2025 at 01:09:31PM +0900, Masami Hiramatsu wrote:
> On Wed, 8 Oct 2025 23:08:26 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > hi,
> > I'm getting no stacktrace from bpf program attached on kretprobe.multi probe
> > (which means on top of return fprobe) on x86.
> >
> > I think we need some kind of treatment we do for rethook, AFAICS the ORC unwind
> > stops on return_to_handler, because the stack and the function itself are not
> > adjusted for unwind_recover_ret_addr call
> >
> > If it's any help I pushed the bpf/selftest for that in here:
> >   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=stacktrace_test
> >
> > just execute:
> >   # test_progs -t stacktrace_map/kretprobe_multi
>
> Hmm, curious. As long as we are using fgraph, the stacktrace should work.
> Could this happen if the function-graph tracer is enabled too?

that test is just a simple kretprobe, so there should be no function-graph
tracer in the way..

I plan to check on this again later this week

jirka
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt @ 2025-10-13 17:10 UTC
To: Jiri Olsa
Cc: Masami Hiramatsu, Josh Poimboeuf, Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

On Wed, 8 Oct 2025 23:08:26 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> hi,
> I'm getting no stacktrace from bpf program attached on kretprobe.multi probe
> (which means on top of return fprobe) on x86.
>
> I think we need some kind of treatment we do for rethook, AFAICS the ORC unwind
> stops on return_to_handler, because the stack and the function itself are not
> adjusted for unwind_recover_ret_addr call

Hmm, we do have a way to retrieve the actual return caller from a location
for return_to_handler:

  See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()

Hmm, I think the x86 ORC unwinder needs to use this.

-- Steve

>
> If it's any help I pushed the bpf/selftest for that in here:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git/log/?h=stacktrace_test
>
> just execute:
>   # test_progs -t stacktrace_map/kretprobe_multi
>
> thanks,
> jirka
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Josh Poimboeuf @ 2025-10-15 16:06 UTC
To: Steven Rostedt
Cc: Jiri Olsa, Masami Hiramatsu, Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

On Mon, Oct 13, 2025 at 01:10:55PM -0400, Steven Rostedt wrote:
> On Wed, 8 Oct 2025 23:08:26 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > hi,
> > I'm getting no stacktrace from bpf program attached on kretprobe.multi probe
> > (which means on top of return fprobe) on x86.
> >
> > I think we need some kind of treatment we do for rethook, AFAICS the ORC unwind
> > stops on return_to_handler, because the stack and the function itself are not
> > adjusted for unwind_recover_ret_addr call
>
> Hmm, we do have a way to retrieve the actual return caller from a location
> for return_to_handler:
>
>   See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()
>
> Hmm, I think the x86 ORC unwinder needs to use this.

I'm confused, is that not what ftrace_graph_ret_addr() already does?

--
Josh
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt @ 2025-10-15 16:11 UTC
To: Josh Poimboeuf
Cc: Jiri Olsa, Masami Hiramatsu, Peter Zijlstra, Andrii Nakryiko, bpf, linux-trace-kernel, x86, Yonghong Song

On Wed, 15 Oct 2025 09:06:12 -0700
Josh Poimboeuf <jpoimboe@kernel.org> wrote:

> > Hmm, we do have a way to retrieve the actual return caller from a location
> > for return_to_handler:
> >
> >   See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()
> >
> > Hmm, I think the x86 ORC unwinder needs to use this.
>
> I'm confused, is that not what ftrace_graph_ret_addr() already does?

Ah yeah, that does it too. I just searched for the first function that did
the look up ;-)

Now I guess the question is, why is this not working?

-- Steve
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Feng Yang @ 2025-10-22 9:04 UTC
To: rostedt
Cc: andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, olsajiri, peterz, x86, yhs

On Wed, 15 Oct 2025 12:11:38 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:

> > > Hmm, we do have a way to retrieve the actual return caller from a location
> > > for return_to_handler:
> > >
> > >   See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()
> > >
> > > Hmm, I think the x86 ORC unwinder needs to use this.
> >
> > I'm confused, is that not what ftrace_graph_ret_addr() already does?

> Ah yeah, that does it too. I just searched for the first function that did
> the look up ;-)

> Now I guess the question is, why is this not working?

I've also encountered this issue recently. It only outputs the stack trace of return_to_handler, for example:

# bpftrace -e 'kretprobe:vfs_rea* {@[kstack]=count()}'
Attaching 1 probe...
^C

@[
    ksys_read+192
    get_perf_callchain+211
    bpf_get_stackid+101
    cleanup_module+303100
    kprobe_multi_link_prog_run+175
    fprobe_return+265
    __ftrace_return_to_handler.isra.0+433
    return_to_handler+30
]: 1

The return stack trace when directly executing samples/fprobe/fprobe_example.c is similar:
[   71.892353] return_to_handler: kernel_thread+0x71/0xa0
[   71.892356] sample_exit_handler: Return from <kernel_clone+0x4/0x470> ip = 0x000000000e0e2004 to rip = 0x00000000127e6d58 (kernel_thread+0x71/0xa0)
[   71.892361]                            __ftrace_return_to_handler.isra.0+0x1b1/0x280
[   71.892363]                            return_to_handler+0x1e/0x50

No cases were found where the ret of the ftrace_graph_ret_addr function is equal to return_handler.

Additionally, I noticed that when the x86 architecture executes perf_callchain_kernel, perf_hw_regs(regs) is false,
and it calls unwind_start(&state, current, NULL, (void *)regs->sp);
which then proceeds to __unwind_start where the check task == current is performed.
However, the ARM architecture executes kunwind_init_from_regs(&state, regs);
instead of taking the second branch with the task == current check.

I hope these phenomena can help you analyze the cause of this issue.
Thanks.
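The branch described in the message above can be illustrated with a small userspace model. This is only a sketch under assumptions: the struct, enum, and `model_*` names are hypothetical, and the check is assumed to test the always-set EFLAGS bit (bit 1); it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two pt_regs fields discussed in the thread. */
struct model_regs {
	unsigned long sp;
	unsigned long flags;
};

/* Assumption: bit 1 of EFLAGS always reads as 1 in hardware-saved registers. */
#define MODEL_EFLAGS_FIXED 0x2UL

enum model_unwind_origin {
	MODEL_UNWIND_FROM_REGS,	/* start from a full register snapshot */
	MODEL_UNWIND_FROM_SP	/* start from a bare stack pointer */
};

/* Assumed shape of the check: only hardware-saved regs carry the fixed bit. */
static bool model_perf_hw_regs(const struct model_regs *regs)
{
	return (regs->flags & MODEL_EFLAGS_FIXED) != 0;
}

/* Pick how the unwind would be started for a given register set. */
static enum model_unwind_origin model_pick_origin(const struct model_regs *regs)
{
	assert(regs != 0);
	return model_perf_hw_regs(regs) ? MODEL_UNWIND_FROM_REGS
					: MODEL_UNWIND_FROM_SP;
}
```

With flags left at zero, as for software-built registers, the model picks the stack-pointer start, which corresponds to the unwind_start(&state, current, NULL, (void *)regs->sp) call quoted above.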
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Jiri Olsa @ 2025-10-22 12:32 UTC
To: Feng Yang
Cc: rostedt, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, olsajiri, peterz, x86, yhs

On Wed, Oct 22, 2025 at 05:04:29PM +0800, Feng Yang wrote:
> On Wed, 15 Oct 2025 12:11:38 -0400 Steven Rostedt <rostedt@goodmis.org> wrote:
>
> > > > Hmm, we do have a way to retrieve the actual return caller from a location
> > > > for return_to_handler:
> > > >
> > > >   See kernel/trace/fgraph.c: ftrace_graph_get_ret_stack()
> > > >
> > > > Hmm, I think the x86 ORC unwinder needs to use this.
> > >
> > > I'm confused, is that not what ftrace_graph_ret_addr() already does?
>
> > Ah yeah, that does it too. I just searched for the first function that did
> > the look up ;-)
>
> > Now I guess the question is, why is this not working?
>
> I've also encountered this issue recently. It only outputs the stack trace of return_to_handler, for example:
>
> # bpftrace -e 'kretprobe:vfs_rea* {@[kstack]=count()}'
> Attaching 1 probe...
> ^C
>
> @[
>     ksys_read+192
>     get_perf_callchain+211
>     bpf_get_stackid+101
>     cleanup_module+303100
>     kprobe_multi_link_prog_run+175
>     fprobe_return+265
>     __ftrace_return_to_handler.isra.0+433
>     return_to_handler+30
> ]: 1

that looks messed up

>
> The return stack trace when directly executing samples/fprobe/fprobe_example.c is similar:
> [   71.892353] return_to_handler: kernel_thread+0x71/0xa0
> [   71.892356] sample_exit_handler: Return from <kernel_clone+0x4/0x470> ip = 0x000000000e0e2004 to rip = 0x00000000127e6d58 (kernel_thread+0x71/0xa0)
> [   71.892361]                            __ftrace_return_to_handler.isra.0+0x1b1/0x280
> [   71.892363]                            return_to_handler+0x1e/0x50
>
> No cases were found where the ret of the ftrace_graph_ret_addr function is equal to return_handler.
>
> Additionally, I noticed that when the x86 architecture executes perf_callchain_kernel, perf_hw_regs(regs) is false,
> and it calls unwind_start(&state, current, NULL, (void *)regs->sp);
> which then proceeds to __unwind_start where the check task == current is performed.
> However, the ARM architecture executes kunwind_init_from_regs(&state, regs);
> instead of taking the second branch with the task == current check.
>
> I hope these phenomena can help you analyze the cause of this issue.
> Thanks.
>

thanks for the report.. so above is from arm?

yes the x86_64 starts with:
  unwind_start(&state, current, NULL, (void *)regs->sp);

I seem to get reasonable stack traces on x86 with the change below,
which just initializes fields in regs that are used later on and sets
the stack so the ftrace_graph_ret_addr code is triggered during unwind

but I'm not familiar with this code, Masami, Josh, any idea?

thanks,
jirka


---
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 367da3638167..2d2bb8c37b56 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
 SYM_CODE_START(return_to_handler)
 	UNWIND_HINT_UNDEFINED
 	ANNOTATE_NOENDBR
+	push $return_to_handler
+	UNWIND_HINT_FUNC
 
 	/* Save ftrace_regs for function exit context */
 	subq $(FRAME_SIZE), %rsp
@@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
 	movq %rax, RAX(%rsp)
 	movq %rdx, RDX(%rsp)
 	movq %rbp, RBP(%rsp)
+	movq %rsp, RSP(%rsp)
+	movq $0, EFLAGS(%rsp)
+	movq $__KERNEL_CS, CS(%rsp)
 	movq %rsp, %rdi
 
 	call ftrace_return_to_handler
@@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
 	movq RDX(%rsp), %rdx
 	movq RAX(%rsp), %rax
 
-	addq $(FRAME_SIZE), %rsp
+	addq $(FRAME_SIZE) + 8, %rsp
+
 	/*
 	 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
 	 * since IBT would demand that contain ENDBR, which simply isn't so for
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt @ 2025-10-22 14:28 UTC
To: Jiri Olsa
Cc: Feng Yang, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, peterz, x86, yhs

On Wed, 22 Oct 2025 14:32:19 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> thanks for the report.. so above is from arm?
>
> yes the x86_64 starts with:
>   unwind_start(&state, current, NULL, (void *)regs->sp);
>
> I seem to get reasonable stack traces on x86 with the change below,
> which just initializes fields in regs that are used later on and sets
> the stack so the ftrace_graph_ret_addr code is triggered during unwind
>
> but I'm not familiar with this code, Masami, Josh, any idea?

Oh! This is an issue with a stack trace happening from a callback of the
exit handler?

OK, that makes much more sense. As I don't think the code handles that
properly.

>
> thanks,
> jirka
>
>
> ---
> diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> index 367da3638167..2d2bb8c37b56 100644
> --- a/arch/x86/kernel/ftrace_64.S
> +++ b/arch/x86/kernel/ftrace_64.S
> @@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
>  SYM_CODE_START(return_to_handler)
>  	UNWIND_HINT_UNDEFINED

I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit
this, it should just give up.

This is because tracing the exit of the function really doesn't fit in the
normal execution paradigm.

The entry is easy. It's the same as if the callback was called by the
function being traced. The exit is more difficult because the function
being traced has already done its return. Now the callback is in this limbo
area of being called between a return and the caller.

>  	ANNOTATE_NOENDBR
> +	push $return_to_handler
> +	UNWIND_HINT_FUNC

OK, so what happened here is that you put return_to_handler onto the
stack and told ORC that this is a normal function, and that when it
triggers to do a lookup from the handler itself.

I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?

>
>  	/* Save ftrace_regs for function exit context */
>  	subq $(FRAME_SIZE), %rsp
> @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
>  	movq %rax, RAX(%rsp)
>  	movq %rdx, RDX(%rsp)
>  	movq %rbp, RBP(%rsp)
> +	movq %rsp, RSP(%rsp)
> +	movq $0, EFLAGS(%rsp)
> +	movq $__KERNEL_CS, CS(%rsp)

Is this simulating some kind of interrupt?

>  	movq %rsp, %rdi
>
>  	call ftrace_return_to_handler

Now it gets tricky in the ftrace_return_to_handler as the first thing it
does is to pop the shadow stack, which makes the return_to_handler lookup
different, as it's no longer on the stack that the unwinder will use.

The return address will live in the "ret" variable of that function, which
the unwinder will not have access to. Yeah, this will not be easy to solve.

-- Steve

> @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
>  	movq RDX(%rsp), %rdx
>  	movq RAX(%rsp), %rax
>
> -	addq $(FRAME_SIZE), %rsp
> +	addq $(FRAME_SIZE) + 8, %rsp
> +
>  	/*
>  	 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
>  	 * since IBT would demand that contain ENDBR, which simply isn't so for
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Jiri Olsa @ 2025-10-22 20:41 UTC
To: Steven Rostedt
Cc: Jiri Olsa, Feng Yang, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, peterz, x86, yhs

On Wed, Oct 22, 2025 at 10:28:19AM -0400, Steven Rostedt wrote:
> On Wed, 22 Oct 2025 14:32:19 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > thanks for the report.. so above is from arm?
> >
> > yes the x86_64 starts with:
> >   unwind_start(&state, current, NULL, (void *)regs->sp);
> >
> > I seem to get reasonable stack traces on x86 with the change below,
> > which just initializes fields in regs that are used later on and sets
> > the stack so the ftrace_graph_ret_addr code is triggered during unwind
> >
> > but I'm not familiar with this code, Masami, Josh, any idea?
>
> Oh! This is an issue with a stack trace happening from a callback of the
> exit handler?

yes, it's triggered via:

  return_to_handler
    ftrace_return_to_handler
      fprobe_return
        kprobe_multi_link_exit_handler
          kprobe_multi_link_prog_run
            bpf_prog_run
              bpf_prog..
                bpf_get_stackid
                  get_perf_callchain
                    perf_callchain_kernel
                      unwind_start

>
> OK, that makes much more sense. As I don't think the code handles that
> properly.
>
> >
> > thanks,
> > jirka
> >
> >
> > ---
> > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> > index 367da3638167..2d2bb8c37b56 100644
> > --- a/arch/x86/kernel/ftrace_64.S
> > +++ b/arch/x86/kernel/ftrace_64.S
> > @@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
> >  SYM_CODE_START(return_to_handler)
> >  	UNWIND_HINT_UNDEFINED
>
> I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit
> this, it should just give up.
>
> This is because tracing the exit of the function really doesn't fit in the
> normal execution paradigm.
>
> The entry is easy. It's the same as if the callback was called by the
> function being traced. The exit is more difficult because the function
> being traced has already done its return. Now the callback is in this limbo
> area of being called between a return and the caller.

I followed the rethook trampoline arch_rethook_trampoline code, which does
similar stuff and gets similar treatment in unwind_recover_ret_addr
like fgraph

>
> >  	ANNOTATE_NOENDBR
> > +	push $return_to_handler
> > +	UNWIND_HINT_FUNC
>
> OK, so what happened here is that you put return_to_handler onto the
> stack and told ORC that this is a normal function, and that when it
> triggers to do a lookup from the handler itself.

together with the "push $return_to_handler" it's supposed to instruct ftrace_graph_ret_addr
to go get the 'real' return address from shadow stack

>
> I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?

if I remove the initial UNWIND_HINT_UNDEFINED I get objtool warning
about unreachable instruction

>
> >
> >  	/* Save ftrace_regs for function exit context */
> >  	subq $(FRAME_SIZE), %rsp
> > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> >  	movq %rax, RAX(%rsp)
> >  	movq %rdx, RDX(%rsp)
> >  	movq %rbp, RBP(%rsp)
> > +	movq %rsp, RSP(%rsp)
> > +	movq $0, EFLAGS(%rsp)
> > +	movq $__KERNEL_CS, CS(%rsp)
>
> Is this simulating some kind of interrupt?

there are several checks in pt_regs on these fields

  - in get_perf_callchain we check user_mode(regs) so CS has to be set
  - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set

>
> >  	movq %rsp, %rdi
> >
> >  	call ftrace_return_to_handler
>
> Now it gets tricky in the ftrace_return_to_handler as the first thing it
> does is to pop the shadow stack, which makes the return_to_handler lookup
> different, as it's no longer on the stack that the unwinder will use.

hum strange.. the resulting stack trace seems ok, I'll make it a
selftest and send it

ftrace_graph_ret_addr that checks on the 'real' return address seems
to have 2 ways of getting to it:

  i = *idx ? : task->curr_ret_stack;

I don't know how that previous pop affects this, but I'm sure it's
more complicated than this ;-)

jirka

>
> The return address will live in the "ret" variable of that function, which
> the unwinder will not have access to. Yeah, this will not be easy to solve.
>
> -- Steve
>
>
> > @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
> >  	movq RDX(%rsp), %rdx
> >  	movq RAX(%rsp), %rax
> >
> > -	addq $(FRAME_SIZE), %rsp
> > +	addq $(FRAME_SIZE) + 8, %rsp
> > +
> >  	/*
> >  	 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
> >  	 * since IBT would demand that contain ENDBR, which simply isn't so for
>
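The idea of swapping the trampoline address for the 'real' return address can be sketched in userspace as follows. This is a simplified model under assumptions, not the kernel's ftrace_graph_ret_addr(): the types, the `model_*` names, the fixed stack depth, and the placeholder trampoline address are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical placeholder for the address of return_to_handler. */
#define MODEL_RETURN_TO_HANDLER 0xdead0000UL

/* One shadow-stack entry: the original return address and the stack slot
 * it was taken from. */
struct model_ret_entry {
	unsigned long ret;
	unsigned long *retp;
};

struct model_task {
	struct model_ret_entry ret_stack[8];
	int curr_ret_stack;	/* number of live entries */
};

/*
 * If the unwinder read the trampoline address from stack slot retp, return
 * the original return address saved for that slot. Any other address is
 * passed through unchanged.
 */
static unsigned long model_ret_addr(const struct model_task *task,
				    unsigned long ret, unsigned long *retp)
{
	int i;

	assert(task != NULL);
	if (ret != MODEL_RETURN_TO_HANDLER)
		return ret;

	for (i = task->curr_ret_stack - 1; i >= 0; i--) {
		if (task->ret_stack[i].retp == retp)
			return task->ret_stack[i].ret;
	}
	return ret;	/* no matching entry: report the trampoline itself */
}
```

In this model the lookup only fires when the value read from the stack equals the trampoline address, which is the role the "push $return_to_handler" in the patch is described as playing.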
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt @ 2025-10-22 21:17 UTC
To: Jiri Olsa
Cc: Feng Yang, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, peterz, x86, yhs

On Wed, 22 Oct 2025 22:41:20 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> >
> > >  	ANNOTATE_NOENDBR
> > > +	push $return_to_handler
> > > +	UNWIND_HINT_FUNC
> >
> > OK, so what happened here is that you put return_to_handler onto the
> > stack and told ORC that this is a normal function, and that when it
> > triggers to do a lookup from the handler itself.
>
> together with the "push $return_to_handler" it's supposed to instruct ftrace_graph_ret_addr
> to go get the 'real' return address from shadow stack
>
> >
> > I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?
>
> if I remove the initial UNWIND_HINT_UNDEFINED I get objtool warning
> about unreachable instruction

Right. I was thinking we add UNWIND_HINT_RETHOOK and an
UNWIND_HINT_TYPE_RETHOOK that lets objtool know that this function is a
"return_to_hook" function and the unwinder can do something special with it.

>
> >
> > >
> > >  	/* Save ftrace_regs for function exit context */
> > >  	subq $(FRAME_SIZE), %rsp
> > > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> > >  	movq %rax, RAX(%rsp)
> > >  	movq %rdx, RDX(%rsp)
> > >  	movq %rbp, RBP(%rsp)
> > > +	movq %rsp, RSP(%rsp)
> > > +	movq $0, EFLAGS(%rsp)
> > > +	movq $__KERNEL_CS, CS(%rsp)
> >
> > Is this simulating some kind of interrupt?
>
> there are several checks in pt_regs on these fields
>
>   - in get_perf_callchain we check user_mode(regs) so CS has to be set
>   - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set

So this is a different issue. I'd rather have this added in
kprobe_multi_link_prog_run as it's the only user of it. Or have the
ftrace_regs conversion update it. This isn't something that should be done
at every call and slow everyone else down.

>
> >
> > >  	movq %rsp, %rdi
> > >
> > >  	call ftrace_return_to_handler
> >
> > Now it gets tricky in the ftrace_return_to_handler as the first thing it
> > does is to pop the shadow stack, which makes the return_to_handler lookup
> > different, as it's no longer on the stack that the unwinder will use.
>
> hum strange.. the resulting stack trace seems ok, I'll make it a
> selftest and send it
>
> ftrace_graph_ret_addr that checks on the 'real' return address seems
> to have 2 ways of getting to it:
>
>   i = *idx ? : task->curr_ret_stack;
>
> I don't know how that previous pop affects this, but I'm sure it's
> more complicated than this ;-)

Oh wait, it may be OK. I forgot I had to change the pop function to give
the data back, but it doesn't modify the task->curr_ret_stack until after
it calls all the callbacks. That's because the shadow stack still has the
data that is being passed from the entry callback. So it can't be updated
yet otherwise that data on the shadow stack will get corrupted.

Yeah, the return_to_handler should work up until the end of
ftrace_return_to_handler().

-- Steve
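The ordering described above (read the entry, run the callbacks, only then drop the entry) can be shown with a toy model. This is a sketch under assumptions: the `model_*` names and the single-callback interface are hypothetical and much simpler than the real fgraph code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_DEPTH 8

struct model_shadow {
	unsigned long ret[MODEL_DEPTH];
	int curr;	/* number of live entries */
};

/* Exit callback: receives the shadow stack so it can inspect it. */
typedef void (*model_exit_cb)(struct model_shadow *s, void *data);

/* Read the top entry without removing it. */
static unsigned long model_peek(const struct model_shadow *s)
{
	return s->curr > 0 ? s->ret[s->curr - 1] : 0;
}

/*
 * Exit path: fetch the return address, run the callback while the entry is
 * still live, and only then commit the pop.
 */
static unsigned long model_return(struct model_shadow *s, model_exit_cb cb,
				  void *data)
{
	unsigned long ret = model_peek(s);

	if (cb)
		cb(s, data);	/* the entry is still visible here */
	if (s->curr > 0)
		s->curr--;	/* commit the pop after the callback */
	return ret;
}

/* Example callback: record what a lookup would see at callback time. */
static void model_record_top(struct model_shadow *s, void *data)
{
	*(unsigned long *)data = model_peek(s);
}
```

A lookup made from inside the callback still finds the entry for the returning function, and the entry disappears only once model_return() finishes.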
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Jiri Olsa @ 2025-10-23 20:42 UTC
To: Steven Rostedt
Cc: Jiri Olsa, Feng Yang, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, peterz, x86, yhs

On Wed, Oct 22, 2025 at 05:17:11PM -0400, Steven Rostedt wrote:
> On Wed, 22 Oct 2025 22:41:20 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > >
> > > >  	ANNOTATE_NOENDBR
> > > > +	push $return_to_handler
> > > > +	UNWIND_HINT_FUNC
> > >
> > > OK, so what happened here is that you put return_to_handler onto the
> > > stack and told ORC that this is a normal function, and that when it
> > > triggers to do a lookup from the handler itself.
> >
> > together with the "push $return_to_handler" it's supposed to instruct ftrace_graph_ret_addr
> > to go get the 'real' return address from shadow stack
> >
> > >
> > > I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?
> >
> > if I remove the initial UNWIND_HINT_UNDEFINED I get objtool warning
> > about unreachable instruction
>
> Right. I was thinking we add UNWIND_HINT_RETHOOK and an
> UNWIND_HINT_TYPE_RETHOOK that lets objtool know that this function is a
> "return_to_hook" function and the unwinder can do something special with it.
>
> >
> > >
> > > >
> > > >  	/* Save ftrace_regs for function exit context */
> > > >  	subq $(FRAME_SIZE), %rsp
> > > > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> > > >  	movq %rax, RAX(%rsp)
> > > >  	movq %rdx, RDX(%rsp)
> > > >  	movq %rbp, RBP(%rsp)
> > > > +	movq %rsp, RSP(%rsp)
> > > > +	movq $0, EFLAGS(%rsp)
> > > > +	movq $__KERNEL_CS, CS(%rsp)
> > >
> > > Is this simulating some kind of interrupt?
> >
> > there are several checks in pt_regs on these fields
> >
> >   - in get_perf_callchain we check user_mode(regs) so CS has to be set
> >   - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set
>
> So this is a different issue. I'd rather have this added in
> kprobe_multi_link_prog_run as it's the only user of it. Or have the

there's also the fprobe tracer, which probably needs it as well

> ftrace_regs conversion update it. This isn't something that should be done
> at every call and slow everyone else down.

I think it's ok, but I'm not sure where to get the rsp value at that point,
perhaps we could just use the pt_regs address

jirka

>
> >
> > >
> > > >  	movq %rsp, %rdi
> > > >
> > > >  	call ftrace_return_to_handler

SNIP
* Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
From: Steven Rostedt @ 2025-10-23 20:55 UTC
To: Jiri Olsa
Cc: Feng Yang, andrii, bpf, jpoimboe, linux-trace-kernel, mhiramat, peterz, x86, yhs

On Thu, 23 Oct 2025 22:42:08 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:

> > > > > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> > > > >  	movq %rax, RAX(%rsp)
> > > > >  	movq %rdx, RDX(%rsp)
> > > > >  	movq %rbp, RBP(%rsp)
> > > > > +	movq %rsp, RSP(%rsp)
> > > > > +	movq $0, EFLAGS(%rsp)
> > > > > +	movq $__KERNEL_CS, CS(%rsp)
> > > >
> > > > Is this simulating some kind of interrupt?
> > >
> > > there are several checks in pt_regs on these fields
> > >
> > >   - in get_perf_callchain we check user_mode(regs) so CS has to be set
> > >   - in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set
> >
> > So this is a different issue. I'd rather have this added in
> > kprobe_multi_link_prog_run as it's the only user of it. Or have the
>
> there's also the fprobe tracer, which probably needs it as well
>
> > ftrace_regs conversion update it. This isn't something that should be done
> > at every call and slow everyone else down.
>
> I think it's ok, but I'm not sure where to get the rsp value at that point,
> perhaps we could just use the pt_regs address

Oh, rsp is fine to add, as that's one of the items expected for
ftrace_regs. It's the flags and CS that aren't needed.

-- Steve
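The split discussed at the end of the thread (keep rsp in the trampoline's saved registers, fill CS and flags only where a consumer needs them) might look roughly like the userspace sketch below. Everything here is an assumption for illustration: the two structs, the `model_*` names, and the conversion helper are hypothetical, and the kernel's real ftrace_regs to pt_regs conversion may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical minimal register set saved by the return trampoline. */
struct model_ftrace_regs {
	unsigned long sp, bp, ax, dx;
};

/* Hypothetical subset of pt_regs that the perf callchain code inspects. */
struct model_pt_regs {
	unsigned long sp, bp, ax, dx, cs, flags;
};

/* Placeholder kernel code-segment selector: the low two bits (RPL) are 0. */
#define MODEL_KERNEL_CS 0x10UL

/* Assumed shape of the user_mode() check: nonzero privilege bits in CS. */
static bool model_user_mode(const struct model_pt_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/*
 * Conversion done once on the consumer side, so the trampoline itself does
 * not pay for storing CS and flags on every traced return.
 */
static void model_to_pt_regs(const struct model_ftrace_regs *f,
			     struct model_pt_regs *r)
{
	assert(f != 0 && r != 0);
	r->sp = f->sp;
	r->bp = f->bp;
	r->ax = f->ax;
	r->dx = f->dx;
	r->cs = MODEL_KERNEL_CS;	/* looks like kernel mode */
	r->flags = 0;			/* not hardware-saved registers */
}
```

The point of the sketch is only the placement: stale CS or flags values are overwritten at conversion time, so the later user-mode and hardware-regs checks see defined values.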