public inbox for bpf@vger.kernel.org
From: Jiri Olsa <olsajiri@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Jiri Olsa <olsajiri@gmail.com>, Feng Yang <yangfeng59949@163.com>,
	andrii@kernel.org, bpf@vger.kernel.org, jpoimboe@kernel.org,
	linux-trace-kernel@vger.kernel.org, mhiramat@kernel.org,
	peterz@infradead.org, x86@kernel.org, yhs@fb.com
Subject: Re: [BUG] no ORC stacktrace from kretprobe.multi bpf program
Date: Wed, 22 Oct 2025 22:41:20 +0200
Message-ID: <aPlBcKq7S-bD3B56@krava>
In-Reply-To: <20251022102819.7675ee7a@gandalf.local.home>

On Wed, Oct 22, 2025 at 10:28:19AM -0400, Steven Rostedt wrote:
> On Wed, 22 Oct 2025 14:32:19 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
> 
> > thanks for the report.. so above is from arm?
> > 
> > yes the x86_64 starts with:
> >   unwind_start(&state, current, NULL, (void *)regs->sp);
> > 
> > I seem to get reasonable stack traces on x86 with the change below,
> > which just initializes fields in regs that are used later on and sets
> > the stack so the ftrace_graph_ret_addr code is triggered during unwind
> > 
> > but I'm not familiar with this code, Masami, Josh, any idea?
> 
> Oh! This is an issue with a stack trace happening from a callback of the
> exit handler?

yes, it's triggered via:

  return_to_handler
    ftrace_return_to_handler
      fprobe_return
        kprobe_multi_link_exit_handler
	  kprobe_multi_link_prog_run
	    bpf_prog_run
	      bpf_prog..
	        bpf_get_stackid
		  get_perf_callchain
		    perf_callchain_kernel
		      unwind_start

> 
> OK, that makes much more sense. As I don't think the code handles that
> properly.
> 
> > 
> > thanks,
> > jirka
> > 
> > 
> > ---
> > diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> > index 367da3638167..2d2bb8c37b56 100644
> > --- a/arch/x86/kernel/ftrace_64.S
> > +++ b/arch/x86/kernel/ftrace_64.S
> > @@ -353,6 +353,8 @@ STACK_FRAME_NON_STANDARD_FP(__fentry__)
> >  SYM_CODE_START(return_to_handler)
> >  	UNWIND_HINT_UNDEFINED
> 
> I believe the above UNWIND_HINT_UNDEFINED means that if ORC were to hit
> this, it should just give up.
> 
> This is because tracing the exit of the function really doesn't fit in the
> normal execution paradigm.
> 
> The entry is easy. It's the same as if the callback was called by the
> function being traced. The exit is more difficult because the function
> being traced has already done its return. Now the callback is in this limbo
> area of being called between a return and the caller.

I followed the rethook trampoline code (arch_rethook_trampoline), which does
similar stuff and gets similar treatment in unwind_recover_ret_addr as fgraph

> 
> >  	ANNOTATE_NOENDBR
> > +	push $return_to_handler
> > +	UNWIND_HINT_FUNC
> 
> OK, so what happened here is that you put in the return_to_handler into the
> stack and told ORC that this is a normal function, and that when it
> triggers to do a lookup from the handler itself.

together with the "push $return_to_handler" it's supposed to instruct
ftrace_graph_ret_addr to go get the 'real' return address from the shadow stack
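
To make the lookup concrete, here is a rough userspace model of the idea,
not the kernel code. The struct, the function name and the matching on the
stack slot pointer are simplifications I am assuming for illustration:

```c
#include <stddef.h>

/* Hypothetical stand-in for one shadow-stack entry. */
struct mock_ret_entry {
	unsigned long ret;	/* original ("real") return address */
	unsigned long *retp;	/* stack slot that used to hold it */
};

/*
 * If the value the unwinder found on the stack is the trampoline
 * address, search the shadow stack (newest entry first) for the entry
 * recorded for that stack slot and return the saved address instead.
 * Anything else is passed through unchanged.
 */
unsigned long mock_ret_addr(const struct mock_ret_entry *stack, size_t depth,
			    unsigned long trampoline, unsigned long ret,
			    unsigned long *retp)
{
	if (ret != trampoline)
		return ret;

	for (size_t i = depth; i > 0; i--) {
		if (stack[i - 1].retp == retp)
			return stack[i - 1].ret;
	}

	/* No matching entry: the trampoline address is all we have. */
	return ret;
}
```

In this model an entry that is no longer within the searched depth can't be
found, and the caller just gets the trampoline address back.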

> 
> I wonder if we could just add a new UNWIND_HINT that tells ORC to do that?

if I remove the initial UNWIND_HINT_UNDEFINED I get an objtool warning
about an unreachable instruction

> 
> >  
> >  	/* Save ftrace_regs for function exit context  */
> >  	subq $(FRAME_SIZE), %rsp
> > @@ -360,6 +362,9 @@ SYM_CODE_START(return_to_handler)
> >  	movq %rax, RAX(%rsp)
> >  	movq %rdx, RDX(%rsp)
> >  	movq %rbp, RBP(%rsp)
> > +	movq %rsp, RSP(%rsp)
> > +	movq $0, EFLAGS(%rsp)
> > +	movq $__KERNEL_CS, CS(%rsp)
> 
> Is this simulating some kind of interrupt?

there are several checks on these fields in pt_regs:

- in get_perf_callchain we check user_mode(regs) so CS has to be set
- in perf_callchain_kernel we call perf_hw_regs(regs), so EFLAGS has to be set
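
A small userspace model of those two checks, as a sketch only. The mock
struct, the helper names and the two constant values are my assumptions,
chosen to mirror what I believe the x86_64 definitions look like:

```c
#include <stdbool.h>

/* Mock of the few pt_regs fields involved; not the kernel's struct. */
struct mock_regs {
	unsigned long sp;
	unsigned long flags;
	unsigned long cs;
};

#define MOCK_KERNEL_CS		0x10UL	/* assumed value of __KERNEL_CS */
#define MOCK_EFLAGS_FIXED	0x2UL	/* assumed value of X86_EFLAGS_FIXED */

/* Models user_mode(): a non-zero privilege level in CS means user mode. */
bool mock_user_mode(const struct mock_regs *regs)
{
	return (regs->cs & 3) != 0;
}

/* Models perf_hw_regs(): the fixed EFLAGS bit marks exception-saved regs. */
bool mock_perf_hw_regs(const struct mock_regs *regs)
{
	return (regs->flags & MOCK_EFLAGS_FIXED) != 0;
}

enum mock_unwind_from { MOCK_SKIP_KERNEL, MOCK_FROM_REGS, MOCK_FROM_SP };

/* Which starting point the kernel callchain code would end up using. */
enum mock_unwind_from mock_pick_unwind_start(const struct mock_regs *regs)
{
	if (mock_user_mode(regs))
		return MOCK_SKIP_KERNEL;	/* no kernel callchain at all */
	return mock_perf_hw_regs(regs) ? MOCK_FROM_REGS : MOCK_FROM_SP;
}
```

With CS set to the kernel segment and EFLAGS zeroed, this model picks the
"start from regs->sp" path, which is the unwind_start call quoted earlier.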

> 
> >  	movq %rsp, %rdi
> >  
> >  	call ftrace_return_to_handler
> 
> Now it gets tricky in the ftrace_return_to_handler as the first thing it
> does is to pop the shadow stack, which makes the return_to_handler lookup
> different, as it's no longer on the stack that the unwinder will use.

hum strange.. the resulting stack trace seems ok, I'll make it a
selftest and send it

ftrace_graph_ret_addr, which looks up the 'real' return address, seems
to have 2 ways of getting to it:

        i = *idx ? : task->curr_ret_stack;

I don't know how that previous pop affects this, but I'm sure it's
more complicated than this ;-)
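
For readers not used to the "? :" form in that line: it is the GNU C
conditional with the middle operand omitted. A tiny sketch, with a
hypothetical function name and plain int arguments standing in for *idx
and task->curr_ret_stack:

```c
/*
 * GNU C extension: `a ? : b` evaluates to a when a is non-zero and to b
 * otherwise, so a non-zero saved index wins over the current depth.
 */
int mock_start_index(int idx, int curr_ret_stack)
{
	return idx ? : curr_ret_stack;
}
```

So the two ways are: resume from the index saved by an earlier lookup, or
start from the current top of the shadow stack when no index was saved.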

jirka


> 
> The return address will live in the "ret" variable of that function, which
> the unwinder will not have access to. Yeah, this will not be easy to solve.
> 
> -- Steve
> 
> 
> > @@ -368,7 +373,8 @@ SYM_CODE_START(return_to_handler)
> >  	movq RDX(%rsp), %rdx
> >  	movq RAX(%rsp), %rax
> >  
> > -	addq $(FRAME_SIZE), %rsp
> > +	addq $(FRAME_SIZE) + 8, %rsp
> > +
> >  	/*
> >  	 * Jump back to the old return address. This cannot be JMP_NOSPEC rdi
> >  	 * since IBT would demand that contain ENDBR, which simply isn't so for
> 
