public inbox for linux-trace-kernel@vger.kernel.org
From: Jiri Olsa <jolsa@kernel.org>
To: Masami Hiramatsu <mhiramat@kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Josh Poimboeuf <jpoimboe@kernel.org>
Cc: Mahe Tardy <mahe.tardy@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	bpf@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
	x86@kernel.org, Yonghong Song <yhs@fb.com>,
	Song Liu <songliubraving@fb.com>,
	Andrii Nakryiko <andrii@kernel.org>
Subject: [PATCHv2 bpf-next 2/6] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path
Date: Mon, 26 Jan 2026 22:18:33 +0100	[thread overview]
Message-ID: <20260126211837.472802-3-jolsa@kernel.org> (raw)
In-Reply-To: <20260126211837.472802-1-jolsa@kernel.org>

Mahe reported that a function is missing from the stack trace taken
in a kprobe multi program. The missing function is the very first one
in the stack trace, the one that the bpf program is attached to.

  # bpftrace -e 'kprobe:__x64_sys_newuname* { print(kstack)}'
  Attaching 1 probe...

        do_syscall_64+134
        entry_SYSCALL_64_after_hwframe+118

  ('*' is used for kprobe_multi attachment)

The reason is that the previous change (the Fixes commit) fixed the
stack unwind for tracepoints, but removed the attached function address
from the stack trace of kprobe multi programs, which I also overlooked
in the related test (see the following patch).

Tracepoints and kprobe_multi have different stack setups, but use the
same unwind path. I think it's better to keep the previous change,
which fixed the tracepoint unwind, and instead change the kprobe multi
unwind as explained below.

The bpf program stack unwind calls perf_callchain_kernel for the kernel
portion, which follows one of two unwind paths based on the
X86_EFLAGS_FIXED bit in pt_regs.flags.

When the bit is set, we unwind from the stack represented by the pt_regs
argument; otherwise we unwind the currently executing stack up to the
'first_frame' boundary.
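
The path selection described above can be sketched in user space as
follows. This is a simplified model, not the kernel code: every toy_*
name is hypothetical, and only the bit value (bit 1 of EFLAGS, which
the kernel calls X86_EFLAGS_FIXED) and the clear/set operations shown
in the diff below are taken from the kernel.

```c
#include <stdbool.h>

/* Bit 1 of EFLAGS; mirrors the kernel's X86_EFLAGS_FIXED (assumed 0x2). */
#define TOY_X86_EFLAGS_FIXED 0x2UL

/* Hypothetical, heavily reduced stand-in for struct pt_regs. */
struct toy_pt_regs {
	unsigned long flags;
	unsigned long ip;
	unsigned long sp;
};

enum toy_unwind_path {
	TOY_UNWIND_FROM_REGS,		/* start from the state saved in regs */
	TOY_UNWIND_CURRENT_STACK,	/* walk current stack up to first_frame */
};

/* Model of the check that decides which unwind path is taken. */
bool toy_perf_hw_regs(const struct toy_pt_regs *regs)
{
	return regs->flags & TOY_X86_EFLAGS_FIXED;
}

enum toy_unwind_path toy_pick_path(const struct toy_pt_regs *regs)
{
	return toy_perf_hw_regs(regs) ? TOY_UNWIND_FROM_REGS
				      : TOY_UNWIND_CURRENT_STACK;
}

/* Behaviour before this patch: the bit is cleared. */
void toy_partial_regs_old(struct toy_pt_regs *regs)
{
	regs->flags &= ~TOY_X86_EFLAGS_FIXED;
}

/* Behaviour with this patch: the bit is set. */
void toy_partial_regs_new(struct toy_pt_regs *regs)
{
	regs->flags |= TOY_X86_EFLAGS_FIXED;
}
```

Calling toy_partial_regs_old() before toy_pick_path() selects
TOY_UNWIND_CURRENT_STACK, while toy_partial_regs_new() selects
TOY_UNWIND_FROM_REGS.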

The 'first_frame' value is taken from regs.rsp, but the ftrace_caller
and ftrace_regs_caller (ftrace trampoline) functions set regs.rsp to
the previous stack frame, so we skip the attached function entry.

If we switch the kprobe_multi unwind to use the X86_EFLAGS_FIXED bit,
the unwind starts at the attached function address. As another
benefit, we also cut the extra unwind cycles needed to reach the
'first_frame' boundary.
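
To illustrate both effects, here is a toy model of the two paths. It
is only a sketch under stated assumptions: the stack layout, the
addresses, the toy_* names and the two innermost frame names are made
up, and the real ORC unwinder is far more involved. The model assumes
regs.sp points at the caller's frame, as described above, and that
regs.ip holds the attached function.

```c
#include <stddef.h>

#define TOY_EFLAGS_FIXED 0x2UL	/* mirrors X86_EFLAGS_FIXED (bit 1) */

struct toy_frame {
	unsigned long sp;	/* made-up stack address of the frame */
	const char *func;
};

struct toy_regs {
	unsigned long flags;
	unsigned long sp;
	const char *ip;
};

/* Innermost frame first; the stack grows down, so sp rises outward. */
static const struct toy_frame toy_stack[] = {
	{ 0x1000, "toy_bpf_helper" },		/* hypothetical */
	{ 0x1040, "toy_ftrace_trampoline" },	/* hypothetical */
	{ 0x1080, "__x64_sys_newuname" },
	{ 0x10c0, "do_syscall_64" },
	{ 0x1100, "entry_SYSCALL_64_after_hwframe" },
};

/*
 * Fill 'out' with the reported functions and return their count.
 * '*steps' counts how many frames the walk had to visit.
 */
size_t toy_callchain(const struct toy_regs *regs, const char **out,
		     size_t max, size_t *steps)
{
	const size_t depth = sizeof(toy_stack) / sizeof(toy_stack[0]);
	size_t n = 0, i = 0;

	*steps = 0;
	if (regs->flags & TOY_EFLAGS_FIXED) {
		/* regs path: report regs->ip, then start at regs->sp. */
		if (n < max)
			out[n++] = regs->ip;
		while (i < depth && toy_stack[i].sp != regs->sp)
			i++;	/* positioning only, not an unwind step */
	}
	for (; i < depth; i++) {
		(*steps)++;
		/* first_frame path: drop frames below the boundary. */
		if (toy_stack[i].sp < regs->sp)
			continue;
		if (n < max)
			out[n++] = toy_stack[i].func;
	}
	return n;
}
```

With flags cleared and sp set to 0x10c0, the model visits all five
frames and reports only do_syscall_64 and
entry_SYSCALL_64_after_hwframe. With TOY_EFLAGS_FIXED set, it visits
two frames and reports __x64_sys_newuname first.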

The speedup can be measured with the trigger bench for kprobe_multi
programs with stacktrace support.

- trigger bench with stacktrace on current code:

        kprobe-multi   :     0.810 ± 0.001M/s
        kretprobe-multi:     0.808 ± 0.001M/s

- and with the fix:

        kprobe-multi   :     1.264 ± 0.001M/s
        kretprobe-multi:     1.401 ± 0.002M/s

With the fix, the entry probe stacktrace:

  # bpftrace -e 'kprobe:__x64_sys_newuname* { print(kstack)}'
  Attaching 1 probe...

        __x64_sys_newuname+9
        do_syscall_64+134
        entry_SYSCALL_64_after_hwframe+118

The return probe skips the attached function, because it's no longer
on the stack at the point of the unwind, which matches how a standard
kretprobe works.

  # bpftrace -e 'kretprobe:__x64_sys_newuname* { print(kstack)}'
  Attaching 1 probe...

        do_syscall_64+134
        entry_SYSCALL_64_after_hwframe+118

Fixes: 6d08340d1e35 ("Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()"")
Reported-by: Mahe Tardy <mahe.tardy@gmail.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/include/asm/ftrace.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index b08c95872eed..c56e1e63b893 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -57,7 +57,7 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
 }
 
 #define arch_ftrace_partial_regs(regs) do {	\
-	regs->flags &= ~X86_EFLAGS_FIXED;	\
+	regs->flags |= X86_EFLAGS_FIXED;	\
 	regs->cs = __KERNEL_CS;			\
 } while (0)
 
-- 
2.52.0



Thread overview: 10+ messages
2026-01-26 21:18 [PATCHv2 bpf-next 0/6] x86/fgraph,bpf: Fix ORC stack unwind from kprobe_multi Jiri Olsa
2026-01-26 21:18 ` [PATCHv2 bpf-next 1/6] x86/fgraph: Fix return_to_handler regs.rsp value Jiri Olsa
2026-01-29  0:49   ` Steven Rostedt
2026-01-26 21:18 ` Jiri Olsa [this message]
2026-01-29  0:50   ` [PATCHv2 bpf-next 2/6] x86/fgraph,bpf: Switch kprobe_multi program stack unwind to hw_regs path Steven Rostedt
2026-01-26 21:18 ` [PATCHv2 bpf-next 3/6] selftests/bpf: Fix kprobe multi stacktrace_ips test Jiri Olsa
2026-01-26 21:18 ` [PATCHv2 bpf-next 4/6] selftests/bpf: Add stacktrace ips test for kprobe/kretprobe Jiri Olsa
2026-01-26 21:18 ` [PATCHv2 bpf-next 5/6] selftests/bpf: Add stacktrace ips test for fentry/fexit Jiri Olsa
2026-01-26 21:18 ` [PATCHv2 bpf-next 6/6] selftests/bpf: Allow to benchmark trigger with stacktrace Jiri Olsa
2026-01-30 21:50 ` [PATCHv2 bpf-next 0/6] x86/fgraph,bpf: Fix ORC stack unwind from kprobe_multi patchwork-bot+netdevbpf
