From: Jiri Olsa <olsajiri@gmail.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
Josh Poimboeuf <jpoimboe@kernel.org>,
Mahe Tardy <mahe.tardy@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
bpf@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
x86@kernel.org, Yonghong Song <yhs@fb.com>,
Song Liu <songliubraving@fb.com>,
Andrii Nakryiko <andrii@kernel.org>
Subject: Re: [PATCH bpf-next 2/4] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path
Date: Tue, 13 Jan 2026 12:43:39 +0100 [thread overview]
Message-ID: <aWYv6864cdO2PWbb@krava> (raw)
In-Reply-To: <20260112170757.4e41c0d8@gandalf.local.home>
On Mon, Jan 12, 2026 at 05:07:57PM -0500, Steven Rostedt wrote:
> On Mon, 12 Jan 2026 22:49:38 +0100
> Jiri Olsa <jolsa@kernel.org> wrote:
>
> > To recreate same stack setup for return probe as we have for entry
> > probe, we set the instruction pointer to the attached function address,
> > which gets us the same unwind setup and same stack trace.
> >
> > With the fix, entry probe:
> >
> > # bpftrace -e 'kprobe:__x64_sys_newuname* { print(kstack)}'
> > Attaching 1 probe...
> >
> > __x64_sys_newuname+9
> > do_syscall_64+134
> > entry_SYSCALL_64_after_hwframe+118
> >
> > return probe:
> >
> > # bpftrace -e 'kretprobe:__x64_sys_newuname* { print(kstack)}'
> > Attaching 1 probe...
> >
> > __x64_sys_newuname+4
> > do_syscall_64+134
> > entry_SYSCALL_64_after_hwframe+118
>
> But is this really correct?
>
> The stack trace of the return from __x86_sys_newuname is from offset "+4".
>
> The stack trace from entry is offset "+9". Isn't it confusing that the
> offset is likely not from the return portion of that function?
right, makes sense.. so standard kprobe actualy skips attached function
(__x86_sys_newuname) on return probe stacktrace.. perhaps we should do
the same for kprobe_multi
I managed to get that with the change below, but it's wrong wrt arch code,
note the ftrace_regs_set_stack_pointer(fregs, stack + 8) .. will try to
figure out better way when we agree on the solution
thanks,
jirka
---
diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index c56e1e63b893..b0e8ce4934e7 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -71,6 +71,9 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
#define ftrace_regs_set_instruction_pointer(fregs, _ip) \
do { arch_ftrace_regs(fregs)->regs.ip = (_ip); } while (0)
+#define ftrace_regs_set_stack_pointer(fregs, _sp) \
+ do { arch_ftrace_regs(fregs)->regs.sp = (_sp); } while (0)
+
static __always_inline unsigned long
ftrace_regs_get_return_address(struct ftrace_regs *fregs)
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 6279e0a753cf..b1510c412dcb 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -717,7 +717,8 @@ int function_graph_enter_regs(unsigned long ret, unsigned long func,
/* Retrieve a function return address to the trace stack on thread info.*/
static struct ftrace_ret_stack *
ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
- unsigned long frame_pointer, int *offset)
+ unsigned long *stack, unsigned long frame_pointer,
+ int *offset)
{
struct ftrace_ret_stack *ret_stack;
@@ -762,6 +763,7 @@ ftrace_pop_return_trace(struct ftrace_graph_ret *trace, unsigned long *ret,
*offset += FGRAPH_FRAME_OFFSET;
*ret = ret_stack->ret;
+ *stack = (unsigned long) ret_stack->retp;
trace->func = ret_stack->func;
trace->overrun = atomic_read(¤t->trace_overrun);
trace->depth = current->curr_ret_depth;
@@ -810,12 +812,13 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
struct ftrace_ret_stack *ret_stack;
struct ftrace_graph_ret trace;
unsigned long bitmap;
+ unsigned long stack;
unsigned long ret;
int offset;
int bit;
int i;
- ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
+ ret_stack = ftrace_pop_return_trace(&trace, &ret, &stack, frame_pointer, &offset);
if (unlikely(!ret_stack)) {
ftrace_graph_stop();
@@ -824,8 +827,11 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
return (unsigned long)panic;
}
- if (fregs)
- ftrace_regs_set_instruction_pointer(fregs, trace.func);
+ if (fregs) {
+ ftrace_regs_set_instruction_pointer(fregs, ret);
+ ftrace_regs_set_stack_pointer(fregs, stack + 8);
+ }
+
bit = ftrace_test_recursion_trylock(trace.func, ret);
/*
diff --git a/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c
index e1a9b55e07cb..852830536109 100644
--- a/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c
+++ b/tools/testing/selftests/bpf/prog_tests/stacktrace_ips.c
@@ -74,12 +74,20 @@ static void test_stacktrace_ips_kprobe_multi(bool retprobe)
load_kallsyms();
- check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 5,
- ksym_get_addr("bpf_testmod_stacktrace_test"),
- ksym_get_addr("bpf_testmod_stacktrace_test_3"),
- ksym_get_addr("bpf_testmod_stacktrace_test_2"),
- ksym_get_addr("bpf_testmod_stacktrace_test_1"),
- ksym_get_addr("bpf_testmod_test_read"));
+ if (retprobe) {
+ check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 4,
+ ksym_get_addr("bpf_testmod_stacktrace_test_3"),
+ ksym_get_addr("bpf_testmod_stacktrace_test_2"),
+ ksym_get_addr("bpf_testmod_stacktrace_test_1"),
+ ksym_get_addr("bpf_testmod_test_read"));
+ } else {
+ check_stacktrace_ips(bpf_map__fd(skel->maps.stackmap), skel->bss->stack_key, 5,
+ ksym_get_addr("bpf_testmod_stacktrace_test"),
+ ksym_get_addr("bpf_testmod_stacktrace_test_3"),
+ ksym_get_addr("bpf_testmod_stacktrace_test_2"),
+ ksym_get_addr("bpf_testmod_stacktrace_test_1"),
+ ksym_get_addr("bpf_testmod_test_read"));
+ }
cleanup:
stacktrace_ips__destroy(skel);
next prev parent reply other threads:[~2026-01-13 11:43 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-12 21:49 [PATCH bpf-next 0/4] x86/fgraph,bpf: Fix ORC stack unwind from kprobe_multi Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 1/4] x86/fgraph: Fix return_to_handler regs.rsp value Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 2/4] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path Jiri Olsa
2026-01-12 22:07 ` Steven Rostedt
2026-01-13 11:43 ` Jiri Olsa [this message]
2026-01-15 18:52 ` Andrii Nakryiko
2026-01-16 16:25 ` Jiri Olsa
2026-01-16 22:24 ` Andrii Nakryiko
2026-01-20 14:50 ` Steven Rostedt
2026-01-20 16:17 ` Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 3/4] selftests/bpf: Fix kprobe multi stacktrace_ips test Jiri Olsa
2026-01-12 21:49 ` [PATCH bpf-next 4/4] selftests/bpf: Allow to benchmark trigger with stacktrace Jiri Olsa
2026-01-15 18:48 ` Andrii Nakryiko
2026-01-15 18:50 ` Andrii Nakryiko
2026-01-16 16:30 ` Jiri Olsa
2026-01-16 16:26 ` Jiri Olsa
2026-01-22 8:35 ` Jiri Olsa
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aWYv6864cdO2PWbb@krava \
--to=olsajiri@gmail.com \
--cc=andrii@kernel.org \
--cc=bpf@vger.kernel.org \
--cc=jpoimboe@kernel.org \
--cc=linux-trace-kernel@vger.kernel.org \
--cc=mahe.tardy@gmail.com \
--cc=mhiramat@kernel.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=songliubraving@fb.com \
--cc=x86@kernel.org \
--cc=yhs@fb.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.