From: Jiri Olsa <olsajiri@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Jiri Olsa <olsajiri@gmail.com>, Feng Yang <yangfeng59949@163.com>,
andrii@kernel.org, bpf@vger.kernel.org, jpoimboe@kernel.org,
linux-trace-kernel@vger.kernel.org, mhiramat@kernel.org,
peterz@infradead.org, x86@kernel.org, yhs@fb.com
Subject: Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
Date: Wed, 22 Oct 2025 22:41:20 +0200
Message-ID: <aPlBcKq7S-bD3B56@krava>
In-Reply-To: <20251022102819.7675ee7a@gandalf.local.home>
On Wed, Oct 22, 2025 at 10:28:19AM -0400, Steven Rostedt wrote:
> On Wed, 22 Oct 2025 14:32:19 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > thanks for the report.. so above is from arm?
> >
> > yes the x86_64 starts with:
> > unwind_start(&state, current, NULL, (void *)regs->sp);
> >
> > I seems to get reasonable stack traces on x86 with the change below,
> > which just initializes fields in regs that are used later on and sets
> > the stack so the ftrace_graph_ret_addr code is triggered during unwind
> >
> > but I'm not familiar with this code, Masami, Josh, any idea?
>
> Oh! This is an issue with a stack trace happening from a callback of the
> exit handler?
yes, it's triggered via:

  return_to_handler
  ftrace_return_to_handler
  fprobe_return
  kprobe_multi_link_exit_handler
  kprobe_multi_link_prog_run
  bpf_prog_run
  bpf_prog..
  bpf_get_stackid
  get_perf_callchain
  perf_callchain_kernel
  unwind_start
>
> OK, that makes much more sense. As I don't think the code handles that
> properly.
>
> >
> > thanks,
> > jirka
> >
> >
> > ---
> > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> > index 367da3638167..2d2bb8c37b56 100644
> > --- a/arch/x86/kernel/ftrace_64.S
> > +++ b/arch/x86/kernel/ftrace_64.S
> > @@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
> > SYM_CODE_START(return_to_handler)
> > UNWIND_HINT_UNDEFINED
>
> I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit
> this, it should just give up.
>
> This is because tracing the exit of the function really doesn't fit in the
> normal execution paradigm.
>
> The entry is easy. It's the same as if the callback was called by the
> function being traced. The exit is more difficult because the function
> being traced has already did its return. Now the callback is in this limbo
> area of being called between a return and the caller.
I followed the rethook trampoline code (arch_rethook_trampoline), which does
similar things and gets similar treatment in unwind_recover_ret_addr, just like fgraph
>
> > ANNOTATE_NOENDBR
> > + push $return_to_handler
> > + UNWIND_HINT_FUNC
>
> OK, so what happened here is that you put in the return_to_handle into the
> stack and told ORC that this is a normal function, and that when it
> triggers to do a lookup from the handler itself.
together with the "push $return_to_handler" it is supposed to instruct
ftrace_graph_ret_addr to go and get the 'real' return address from the shadow stack
>
> I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?
if I remove the initial UNWIND_HINT_UNDEFINED I get an objtool warning
about an unreachable instruction
>
> >
> > /* Save ftrace_regs for function exit context */
> > subq $(FRAME_SIZE), %rsp
> > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> > movq %rax, RAX(%rsp)
> > movq %rdx, RDX(%rsp)
> > movq %rbp, RBP(%rsp)
> > + movq %rsp, RSP(%rsp)
> > + movq $0, EFLAGS(%rsp)
> > + movq $__KERNEL_CS, CS(%rsp)
>
> Is this simulating some kind of interrupt?
there are several checks on these pt_regs fields:
- in get_perf_callchain we check user_mode(regs), so CS has to be set
- in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set
>
> > movq %rsp, %rdi
> >
> > call ftrace_return_to_handler
>
> Now it gets tricky in the ftrace_return_to_handler as the first thing it
> does is to pop the shadow stack, which makes the return_to_handler lookup
> different, as its no longer on the stack that the unwinder will use.
hum strange.. the resulting stack trace seems ok, I'll turn it into a
selftest and send it

ftrace_graph_ret_addr, which checks for the 'real' return address, seems
to have 2 ways of getting to it:

  i = *idx ? : task->curr_ret_stack;

I don't know how that previous pop affects this, but I'm sure it's
more complicated than this ;-)
jirka
>
> The return address will live in the "ret" variable of that function, which
> the unwinder will not have access to. Yeah, this will not be easy to solve.
>
> -- Steve
>
>
> > @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
> > movq RDX(%rsp), %rdx
> > movq RAX(%rsp), %rax
> >
> > - addq $(FRAME_SIZE), %rsp
> > + addq $(FRAME_SIZE) + 8, %rsp
> > +
> > /*
> > * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
> > * since IBT would demand that contain ENDBR, which simply isn't so for
>