From: Jiri Olsa <olsajiri@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Jiri Olsa <olsajiri@gmail.com>, Feng Yang <yangfeng59949@163.com>,
andrii@kernel.org, bpf@vger.kernel.org, jpoimboe@kernel.org,
linux-trace-kernel@vger.kernel.org, mhiramat@kernel.org,
peterz@infradead.org, x86@kernel.org, yhs@fb.com
Subject: Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
Date: Wed, 22 Oct 2025 22:41:20 +0200
Message-ID: <aPlBcKq7S-bD3B56@krava>
In-Reply-To: <20251022102819.7675ee7a@gandalf.local.home>

On Wed, Oct 22, 2025 at 10:28:19AM -0400, Steven Rostedt wrote:
> On Wed, 22 Oct 2025 14:32:19 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > thanks for the report.. so above is from arm?
> >
> > yes the x86_64 starts with:
> > unwind_start(&state, current, NULL, (void *)regs->sp);
> >
> > I seem to get reasonable stack traces on x86 with the change below,
> > which just initializes fields in regs that are used later on and sets
> > the stack so the ftrace_graph_ret_addr code is triggered during unwind
> >
> > but I'm not familiar with this code, Masami, Josh, any idea?
>
> Oh! This is an issue with a stack trace happening from a callback of the
> exit handler?

yes, it's triggered via:

  return_to_handler
    ftrace_return_to_handler
      fprobe_return
        kprobe_multi_link_exit_handler
          kprobe_multi_link_prog_run
            bpf_prog_run
              bpf_prog..
                bpf_get_stackid
                  get_perf_callchain
                    perf_callchain_kernel
                      unwind_start

>
> OK, that makes much more sense. As I don't think the code handles that
> properly.
>
> >
> > thanks,
> > jirka
> >
> >
> > ---
> > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> > index 367da3638167..2d2bb8c37b56 100644
> > --- a/arch/x86/kernel/ftrace_64.S
> > +++ b/arch/x86/kernel/ftrace_64.S
> > @@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
> > SYM_CODE_START(return_to_handler)
> > UNWIND_HINT_UNDEFINED
>
> I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit
> this, it should just give up.
>
> This is because tracing the exit of the function really doesn't fit in the
> normal execution paradigm.
>
> The entry is easy. It's the same as if the callback was called by the
> function being traced. The exit is more difficult because the function
> being traced has already done its return. Now the callback is in this limbo
> area of being called between a return and the caller.

I followed the rethook trampoline code (arch_rethook_trampoline), which does
similar stuff and gets the same treatment as fgraph in unwind_recover_ret_addr

>
> > ANNOTATE_NOENDBR
> > + push $return_to_handler
> > + UNWIND_HINT_FUNC
>
> OK, so what happened here is that you put the return_to_handler into the
> stack and told ORC that this is a normal function, and that when it
> triggers to do a lookup from the handler itself.

together with the "push $return_to_handler" it's supposed to instruct
ftrace_graph_ret_addr to go get the 'real' return address from the shadow stack

>
> I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?

if I remove the initial UNWIND_HINT_UNDEFINED I get an objtool warning
about an unreachable instruction

>
> >
> > /* Save ftrace_regs for function exit context */
> > subq $(FRAME_SIZE), %rsp
> > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> > movq %rax, RAX(%rsp)
> > movq %rdx, RDX(%rsp)
> > movq %rbp, RBP(%rsp)
> > + movq %rsp, RSP(%rsp)
> > + movq $0, EFLAGS(%rsp)
> > + movq $__KERNEL_CS, CS(%rsp)
>
> Is this simulating some kind of interrupt?

there are several checks on these pt_regs fields:
- in get_perf_callchain we check user_mode(regs) so CS has to be set
- in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set
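
A minimal user-space sketch of those two checks, for illustration only. The struct and every `_model` name are hypothetical stand-ins, and the two constants are what I believe `__KERNEL_CS` and `X86_EFLAGS_FIXED` evaluate to on x86_64 (an assumption, not taken from this patch). The real helpers are `user_mode()` and `perf_hw_regs()`. It shows why a kernel CS together with zeroed EFLAGS would send perf_callchain_kernel down the `regs->sp` path:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the three pt_regs fields the patch fills in. */
struct regs_model {
	uint64_t sp;
	uint64_t flags;
	uint64_t cs;
};

#define KERNEL_CS_MODEL    0x10u /* assumed value of __KERNEL_CS on x86_64 */
#define EFLAGS_FIXED_MODEL 0x2u  /* assumed value of X86_EFLAGS_FIXED */

/* Models user_mode(): non-zero low two bits of CS mean user mode. */
static bool user_mode_model(const struct regs_model *regs)
{
	return (regs->cs & 3) != 0;
}

/* Models perf_hw_regs(): true when the always-set EFLAGS bit is present. */
static bool perf_hw_regs_model(const struct regs_model *regs)
{
	return (regs->flags & EFLAGS_FIXED_MODEL) != 0;
}

/* True when the unwinder would be started from regs->sp alone. */
static bool unwind_from_sp_model(const struct regs_model *regs)
{
	return !user_mode_model(regs) && !perf_hw_regs_model(regs);
}
```

With `cs = KERNEL_CS_MODEL` and `flags = 0`, `unwind_from_sp_model()` returns true, which matches the `unwind_start(&state, current, NULL, (void *)regs->sp)` call quoted earlier.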
>
> > movq %rsp, %rdi
> >
> > call ftrace_return_to_handler
>
> Now it gets tricky in the ftrace_return_to_handler as the first thing it
> does is to pop the shadow stack, which makes the return_to_handler lookup
> different, as it's no longer on the stack that the unwinder will use.

hum strange.. the resulting stack trace seems ok, I'll make it a
selftest and send it

ftrace_graph_ret_addr, which checks for the 'real' return address, seems
to have 2 ways of getting to it:

  i = *idx ? : task->curr_ret_stack;

I don't know how that previous pop affects this, but I'm sure it's
more complicated than this ;-)

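
To make the "2 ways" concrete, here is a simplified user-space model of that lookup. It is a sketch under assumptions, not the kernel code: `ret_entry_model`, `ret_addr_model` and the flat array are hypothetical, and the real shadow stack layout is more involved. The `a ? : b` form is the GNU C conditional with an omitted middle operand, spelled out portably below:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, flattened stand-in for one shadow stack entry. */
struct ret_entry_model {
	unsigned long ret;   /* original return address */
	unsigned long *retp; /* stack slot that held it */
};

/*
 * Simplified model of the lookup: resume from *idx if the caller already
 * has a position, otherwise start at the current top of the shadow stack,
 * then walk down looking for the entry recorded for this stack slot.
 */
static unsigned long ret_addr_model(const struct ret_entry_model *stack,
				    int curr_top, int *idx,
				    unsigned long ret, unsigned long *retp,
				    unsigned long handler)
{
	int i;

	/* Only addresses that were replaced by the handler need a lookup. */
	if (ret != handler || !idx)
		return ret;

	/* Portable spelling of "i = *idx ? : curr_top". */
	i = *idx ? *idx : curr_top;

	while (i > 0) {
		i--;
		if (stack[i].retp == retp && stack[i].ret != handler) {
			*idx = i; /* the next lookup resumes below this entry */
			return stack[i].ret;
		}
	}
	return ret;
}
```

In this model, at least, an entry at or above `curr_top` is never visited when `*idx` is 0, which may be the effect of the earlier pop that Steve describes.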
jirka
>
> The return address will live in the "ret" variable of that function, which
> the unwinder will not have access to. Yeah, this will not be easy to solve.
>
> -- Steve
>
>
> > @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
> > movq RDX(%rsp), %rdx
> > movq RAX(%rsp), %rax
> >
> > - addq $(FRAME_SIZE), %rsp
> > + addq $(FRAME_SIZE) + 8, %rsp
> > +
> > /*
> > * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
> > * since IBT would demand that contain ENDBR, which simply isn't so for
>