All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jiri Olsa <olsajiri@gmail.com>
To: Jiri Olsa <olsajiri@gmail.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	Masami Hiramatsu <mhiramat@kernel.org>,
	Josh Poimboeuf <jpoimboe@kernel.org>,
	Mahe Tardy <mahe.tardy@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	bpf@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	x86@kernel.org, Yonghong Song <yhs@fb.com>,
	Song Liu <songliubraving@fb.com>,
	Andrii Nakryiko <andrii@kernel.org>
Subject: Re: [PATCH bpf-next 2/4] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path
Date: Tue, 20 Jan 2026 17:17:08 +0100	[thread overview]
Message-ID: <aW-qhJ2pige8aRl4@krava> (raw)
In-Reply-To: <aWYv6864cdO2PWbb@krava>

On Tue, Jan 13, 2026 at 12:43:39PM +0100, Jiri Olsa wrote:
> On Mon, Jan 12, 2026 at 05:07:57PM -0500, Steven Rostedt wrote:
> > On Mon, 12 Jan 2026 22:49:38 +0100
> > Jiri Olsa <jolsa@kernel.org> wrote:
> > 
> > > To recreate same stack setup for return probe as we have for entry
> > > probe, we set the instruction pointer to the attached function address,
> > > which gets us the same unwind setup and same stack trace.
> > > 
> > > With the fix, entry probe:
> > > 
> > >   # bpftrace -e 'kprobe:__x64_sys_newuname* { print(kstack)}'
> > >   Attaching 1 probe...
> > > 
> > >         __x64_sys_newuname+9
> > >         do_syscall_64+134
> > >         entry_SYSCALL_64_after_hwframe+118
> > > 
> > > return probe:
> > > 
> > >   # bpftrace -e 'kretprobe:__x64_sys_newuname* { print(kstack)}'
> > >   Attaching 1 probe...
> > > 
> > >         __x64_sys_newuname+4
> > >         do_syscall_64+134
> > >         entry_SYSCALL_64_after_hwframe+118
> > 
> > But is this really correct?
> > 
> > The stack trace of the return from __x86_sys_newuname is from offset "+4".
> > 
> > The stack trace from entry is offset "+9". Isn't it confusing that the
> > offset is likely not from the return portion of that function?
> 
> right, makes sense.. so standard kprobe actualy skips attached function
> (__x86_sys_newuname) on return probe stacktrace.. perhaps we should do
> the same for kprobe_multi
> 
> I managed to get that with the change below, but it's wrong wrt arch code,
> note the ftrace_regs_set_stack_pointer(fregs, stack + 8) .. will try to
> figure out better way when we agree on the solution
> 
> thanks,
> jirka
> 
> 
> ---
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index c56e1e63b893..b0e8ce4934e7 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -71,6 +71,9 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
>  #define ftrace_regs_set_instruction_pointer(fregs, _ip)	\
>  	do { arch_ftrace_regs(fregs)->regs.ip = (_ip); } while (0)
>  
> +#define ftrace_regs_set_stack_pointer(fregs, _sp)	\
> +	do { arch_ftrace_regs(fregs)->regs.sp = (_sp); } while (0)
> +
>  
>  static __always_inline unsigned long
>  ftrace_regs_get_return_address(struct ftrace_regs *fregs)
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 6279e0a753cf..b1510c412dcb 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -717,7 +717,8 @@ int function_graph_enter_regs(unsigned long ret, unsigned long func,
>  /* Retrieve a function return address to the trace stack on thread info.*/
>  static struct ftrace_ret_stack *
>  ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
> -			unsigned long frame_pointer, int *offset)
> +			unsigned long *stack, unsigned long frame_pointer,
> +			int *offset)
>  {
>  	struct ftrace_ret_stack *ret_stack;
>  
> @@ -762,6 +763,7 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
>  
>  	*offset += FGRAPH_FRAME_OFFSET;
>  	*ret = ret_stack->ret;
> +	*stack = (unsigned long) ret_stack->retp;
>  	trace->func = ret_stack->func;
>  	trace->overrun = atomic_read(&current->trace_overrun);
>  	trace->depth = current->curr_ret_depth;
> @@ -810,12 +812,13 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
>  	struct ftrace_ret_stack *ret_stack;
>  	struct ftrace_graph_ret trace;
>  	unsigned long bitmap;
> +	unsigned long stack;
>  	unsigned long ret;
>  	int offset;
>  	int bit;
>  	int i;
>  
> -	ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> +	ret_stack = ftrace_pop_return_trace(&trace, &ret, &stack, frame_pointer, &offset);
>  
>  	if (unlikely(!ret_stack)) {
>  		ftrace_graph_stop();
> @@ -824,8 +827,11 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
>  		return (unsigned long)panic;
>  	}
>  
> -	if (fregs)
> -		ftrace_regs_set_instruction_pointer(fregs, trace.func);
> +	if (fregs) {
> +		ftrace_regs_set_instruction_pointer(fregs, ret);
> +		ftrace_regs_set_stack_pointer(fregs, stack + 8);

actually looks like this might be better solution.. storing the proper
rsp value directly to the regs in return_to_handler

jirka


---
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index a132608265f6..971883045b75 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -368,13 +368,16 @@ SYM_CODE_START(return_to_handler)
 	subq $8, %rsp
 	UNWIND_HINT_FUNC
 
+	movq %rsp, %rdi
+	addq $8, %rdi
+
 	/* Save ftrace_regs for function exit context  */
 	subq $(FRAME_SIZE), %rsp
 
 	movq %rax, RAX(%rsp)
 	movq %rdx, RDX(%rsp)
 	movq %rbp, RBP(%rsp)
-	movq %rsp, RSP(%rsp)
+	movq %rdi, RSP(%rsp)
 	movq %rsp, %rdi
 
 	call ftrace_return_to_handler

  parent reply	other threads:[~2026-01-20 16:17 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-12 21:49 [PATCH bpf-next 0/4] x86/fgraph,bpf: Fix ORC stack unwind from kprobe_multi Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 1/4] x86/fgraph: Fix return_to_handler regs.rsp value Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 2/4] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path Jiri Olsa
2026-01-12 22:07   ` Steven Rostedt
2026-01-13 11:43     ` Jiri Olsa
2026-01-15 18:52       ` Andrii Nakryiko
2026-01-16 16:25         ` Jiri Olsa
2026-01-16 22:24           ` Andrii Nakryiko
2026-01-20 14:50             ` Steven Rostedt
2026-01-20 16:17       ` Jiri Olsa [this message]
2026-01-12 21:49 ` [PATCH bpf-next 3/4] selftests/bpf: Fix kprobe multi stacktrace_ips test Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 4/4] selftests/bpf: Allow to benchmark trigger with stacktrace Jiri Olsa
2026-01-15 18:48   ` Andrii Nakryiko
2026-01-15 18:50     ` Andrii Nakryiko
2026-01-16 16:30       ` Jiri Olsa
2026-01-16 16:26     ` Jiri Olsa
2026-01-22  8:35     ` Jiri Olsa

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aW-qhJ2pige8aRl4@krava \
    --to=olsajiri@gmail.com \
    --cc=andrii@kernel.org \
    --cc=bpf@vger.kernel.org \
    --cc=jpoimboe@kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=mahe.tardy@gmail.com \
    --cc=mhiramat@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=songliubraving@fb.com \
    --cc=x86@kernel.org \
    --cc=yhs@fb.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.