* [PATCH] x86/ibt: make is_endbr() notrace
@ 2025-09-18 12:09 Menglong Dong
2025-09-18 13:05 ` Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Menglong Dong @ 2025-09-18 12:09 UTC (permalink / raw)
To: peterz, jolsa
Cc: tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt,
luto, mhiramat, ast, andrii, linux-kernel, bpf
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
kprobe_multi_link_exit_handler -> is_endbr.
It is not protected by the "bpf_prog_active", so it can't be traced by
kprobe-multi, which can cause recurring and panic the kernel. Fix it by
make it notrace.
Fixes: 72e213a7ccf9 ("x86/ibt: Clean up is_endbr()")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
arch/x86/kernel/alternative.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f67a31c77c89 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
#ifdef CONFIG_X86_KERNEL_IBT
-__noendbr bool is_endbr(u32 *val)
+__noendbr notrace bool is_endbr(u32 *val)
{
u32 endbr;
--
2.51.0
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong
@ 2025-09-18 13:05 ` Peter Zijlstra
2025-09-18 13:32 ` Menglong Dong
2025-09-19 8:52 ` Masami Hiramatsu
2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google)
2 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-18 13:05 UTC (permalink / raw)
To: Menglong Dong
Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen,
rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf
On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> kprobe_multi_link_exit_handler -> is_endbr.
>
> It is not protected by the "bpf_prog_active", so it can't be traced by
> kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> make it notrace.
This is very much a riddle wrapped in an enigma. Notably
kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
cryptic next line sufficient to explain why its a problem.
I suspect the is_endbr() you did mean is the one in
arch_ftrace_get_symaddr(), but who knows.
Also, depending on compiler insanity, it is possible the thing
out-of-lines things like __is_endbr(), getting you yet another
__fentry__ site.
Please try again.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 13:05 ` Peter Zijlstra
@ 2025-09-18 13:32 ` Menglong Dong
2025-09-18 16:02 ` Alexei Starovoitov
2025-09-18 16:56 ` Peter Zijlstra
0 siblings, 2 replies; 32+ messages in thread
From: Menglong Dong @ 2025-09-18 13:32 UTC (permalink / raw)
To: Peter Zijlstra
Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen,
rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf
On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > kprobe_multi_link_exit_handler -> is_endbr.
> >
> > It is not protected by the "bpf_prog_active", so it can't be traced by
> > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > make it notrace.
>
> This is very much a riddle wrapped in an enigma. Notably
> kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> cryptic next line sufficient to explain why its a problem.
>
> I suspect the is_endbr() you did mean is the one in
> arch_ftrace_get_symaddr(), but who knows.
Yeah, I mean
kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
arch_ftrace_get_symaddr -> is_endbr
actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
>
> Also, depending on compiler insanity, it is possible the thing
> out-of-lines things like __is_endbr(), getting you yet another
> __fentry__ site.
The panic happens when I run the bpf bench testing:
./bench kretprobe-multi-all
And skip the "is_endbr" fix this problem.
__is_endbr should be marked with "notrace" too. I slacked off
on it, as it didn't happen in my case :/
>
> Please try again.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 13:32 ` Menglong Dong
@ 2025-09-18 16:02 ` Alexei Starovoitov
2025-09-18 16:59 ` Peter Zijlstra
2025-09-18 16:56 ` Peter Zijlstra
1 sibling, 1 reply; 32+ messages in thread
From: Alexei Starovoitov @ 2025-09-18 16:02 UTC (permalink / raw)
To: Menglong Dong
Cc: Peter Zijlstra, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Thu, Sep 18, 2025 at 6:32 AM Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> >
> > This is very much a riddle wrapped in an enigma. Notably
> > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > cryptic next line sufficient to explain why its a problem.
> >
> > I suspect the is_endbr() you did mean is the one in
> > arch_ftrace_get_symaddr(), but who knows.
>
> Yeah, I mean
> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> arch_ftrace_get_symaddr -> is_endbr
> actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
All this makes sense to me.
__noendbr bool is_endbr(u32 *val) needs "notrace",
since it's in alternative.c and won't get inlined (unless LTO+luck).
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 13:32 ` Menglong Dong
2025-09-18 16:02 ` Alexei Starovoitov
@ 2025-09-18 16:56 ` Peter Zijlstra
2025-09-19 12:35 ` Masami Hiramatsu
1 sibling, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-18 16:56 UTC (permalink / raw)
To: Menglong Dong
Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen,
rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf
On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote:
> On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> >
> > This is very much a riddle wrapped in an enigma. Notably
> > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > cryptic next line sufficient to explain why its a problem.
> >
> > I suspect the is_endbr() you did mean is the one in
> > arch_ftrace_get_symaddr(), but who knows.
>
> Yeah, I mean
> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> arch_ftrace_get_symaddr -> is_endbr
> actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
>
> >
> > Also, depending on compiler insanity, it is possible the thing
> > out-of-lines things like __is_endbr(), getting you yet another
> > __fentry__ site.
>
> The panic happens when I run the bpf bench testing:
> ./bench kretprobe-multi-all
>
> And skip the "is_endbr" fix this problem.
But why does it panic? Supposedly you've done the analysis; but then
forgot to write it down?
Why is kprobe_multi_link_exit_handler() special; doesn't the issue also
exist with kprobe_multi_link_handler() ? If so, removing __fentry__
isn't going to help much, you can just stick an actual kprobe in
is_endbr(), right?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 16:02 ` Alexei Starovoitov
@ 2025-09-18 16:59 ` Peter Zijlstra
2025-09-18 17:53 ` Alexei Starovoitov
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-18 16:59 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > make it notrace.
> > >
> > > This is very much a riddle wrapped in an enigma. Notably
> > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > cryptic next line sufficient to explain why its a problem.
> > >
> > > I suspect the is_endbr() you did mean is the one in
> > > arch_ftrace_get_symaddr(), but who knows.
> >
> > Yeah, I mean
> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > arch_ftrace_get_symaddr -> is_endbr
> > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
>
> All this makes sense to me.
As written down, I'm still clueless.
> __noendbr bool is_endbr(u32 *val) needs "notrace",
> since it's in alternative.c and won't get inlined (unless LTO+luck).
notrace don't help with kprobes in general, only with __fentry__ sites.
I've still not clue why there is a panic, or why notrace would be
sufficient.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 16:59 ` Peter Zijlstra
@ 2025-09-18 17:53 ` Alexei Starovoitov
2025-09-19 1:13 ` Menglong Dong
2025-09-22 6:36 ` Peter Zijlstra
0 siblings, 2 replies; 32+ messages in thread
From: Alexei Starovoitov @ 2025-09-18 17:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > >
> > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > > >
> > > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > > make it notrace.
> > > >
> > > > This is very much a riddle wrapped in an enigma. Notably
> > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > > cryptic next line sufficient to explain why its a problem.
> > > >
> > > > I suspect the is_endbr() you did mean is the one in
> > > > arch_ftrace_get_symaddr(), but who knows.
> > >
> > > Yeah, I mean
> > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > > arch_ftrace_get_symaddr -> is_endbr
> > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> >
> > All this makes sense to me.
>
> As written down, I'm still clueless.
>
> > __noendbr bool is_endbr(u32 *val) needs "notrace",
> > since it's in alternative.c and won't get inlined (unless LTO+luck).
>
> notrace don't help with kprobes in general, only with __fentry__ sites.
Are you sure ? Last time I checked "notrace" prevents kprobing
anywhere in that function.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 17:53 ` Alexei Starovoitov
@ 2025-09-19 1:13 ` Menglong Dong
2025-09-22 6:52 ` Peter Zijlstra
2025-09-22 6:36 ` Peter Zijlstra
1 sibling, 1 reply; 32+ messages in thread
From: Menglong Dong @ 2025-09-19 1:13 UTC (permalink / raw)
To: Peter Zijlstra, Alexei Starovoitov
Cc: Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen,
Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Fri, Sep 19, 2025 at 1:54 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote:
> > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote:
> > > >
> > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > > > >
> > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > > > >
> > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > > > make it notrace.
> > > > >
> > > > > This is very much a riddle wrapped in an enigma. Notably
> > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > > > cryptic next line sufficient to explain why its a problem.
> > > > >
> > > > > I suspect the is_endbr() you did mean is the one in
> > > > > arch_ftrace_get_symaddr(), but who knows.
> > > >
> > > > Yeah, I mean
> > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > > > arch_ftrace_get_symaddr -> is_endbr
> > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> > >
> > > All this makes sense to me.
> >
> > As written down, I'm still clueless.
Ok, let me describe the problem in deetail.
First of all, it has nothing to do with kprobe. The bpf program of type
kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
about the ftrace, which means __fentry__.
Second, let me explain the recur detection of the kprobe-multi. Let's
take the is_endbr() for example. When it is hooked by the bpf program
of type kretprobe-multi, following calling chain will happen:
is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
arch_ftrace_get_symaddr -> is_endbr
Look, is_endbr() is called again during the ftrace handler, so it will
trigger the ftrace handler(__ftrace_return_to_handler) again, which
causes recurrence.
Such recurrence can be detected. In kprobe_multi_link_prog_run(),
the percpu various "bpf_prog_active" will be increased by 1 before we
run the bpf progs, and decrease by 1 after the bpf progs finish. If the
kprobe_multi_link_prog_run() is triggered again during bpf progs run,
it will check if bpf_prog_active is zero, and return directly if it is not.
Therefore, recurrence can't happen within the "bpf_prog_active" protection.
However, the calling to is_endbr() is not within that scope, which makes
the recurrence happen.
Hope I described it clearly 😉
Thanks!
Menglong Dong
> >
> > > __noendbr bool is_endbr(u32 *val) needs "notrace",
> > > since it's in alternative.c and won't get inlined (unless LTO+luck).
> >
> > notrace don't help with kprobes in general, only with __fentry__ sites.
>
> Are you sure ? Last time I checked "notrace" prevents kprobing
> anywhere in that function.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong
2025-09-18 13:05 ` Peter Zijlstra
@ 2025-09-19 8:52 ` Masami Hiramatsu
2025-09-19 8:58 ` Menglong Dong
2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google)
2 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-19 8:52 UTC (permalink / raw)
To: Menglong Dong
Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel,
bpf
On Thu, 18 Sep 2025 20:09:39 +0800
Menglong Dong <menglong8.dong@gmail.com> wrote:
> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> kprobe_multi_link_exit_handler -> is_endbr.
>
> It is not protected by the "bpf_prog_active", so it can't be traced by
> kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> make it notrace.
Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
recursion, but fgraph only detects the recursion in the entry handler.
Thus it happens in the exit handler, fprobe does not check the recursion.
But since the fprobe provides users to register callback at exit, it
should check the recursion in return path too.
Thanks,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-19 8:52 ` Masami Hiramatsu
@ 2025-09-19 8:58 ` Menglong Dong
2025-09-19 12:32 ` Masami Hiramatsu
0 siblings, 1 reply; 32+ messages in thread
From: Menglong Dong @ 2025-09-19 8:58 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel,
bpf
On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write:
> On Thu, 18 Sep 2025 20:09:39 +0800
> Menglong Dong <menglong8.dong@gmail.com> wrote:
>
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > kprobe_multi_link_exit_handler -> is_endbr.
> >
> > It is not protected by the "bpf_prog_active", so it can't be traced by
> > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > make it notrace.
>
> Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
> recursion, but fgraph only detects the recursion in the entry handler.
> Thus it happens in the exit handler, fprobe does not check the recursion.
>
> But since the fprobe provides users to register callback at exit, it
> should check the recursion in return path too.
That's a good idea to provide recursion checking for the exit handler,
which is able to solve this problem too.
If so, we don't need to check the recursion on the kprobe-multi anymore.
Do we?
Thanks!
Menglong Dong
>
> Thanks,
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong
2025-09-18 13:05 ` Peter Zijlstra
2025-09-19 8:52 ` Masami Hiramatsu
@ 2025-09-19 11:57 ` Masami Hiramatsu (Google)
2025-09-19 15:27 ` Steven Rostedt
2025-09-20 13:39 ` Menglong Dong
2 siblings, 2 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2025-09-19 11:57 UTC (permalink / raw)
To: Peter Zijlstra, Steven Rostedt, Menglong Dong
Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen,
rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf
From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
function_graph_enter_regs() prevents itself from recursion by
ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
which is called at the exit, does not prevent such recursion.
Therefore, while it can prevent recursive calls from
fgraph_ops::entryfunc(), it is not able to prevent recursive calls
to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
This can lead an unexpected recursion bug reported by Menglong.
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
-> kprobe_multi_link_exit_handler -> is_endbr.
To fix this issue, acquire ftrace_test_recursion_trylock() in the
__ftrace_return_to_handler() after unwind the shadow stack to mark
this section must prevent recursive call of fgraph inside user-defined
fgraph_ops::retfunc().
This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
fprobe on function-graph tracer"), because before that fgraph was
only used from the function graph tracer. Fprobe allowed user to run
any callbacks from fgraph after that commit.
Reported-by: Menglong Dong <menglong8.dong@gmail.com>
Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
kernel/trace/fgraph.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 1e3b32b1e82c..08dde420635b 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
unsigned long bitmap;
unsigned long ret;
int offset;
+ int bit;
int i;
ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
@@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
if (fregs)
ftrace_regs_set_instruction_pointer(fregs, ret);
+ bit = ftrace_test_recursion_trylock(trace.func, ret);
+ /*
+ * This must be succeeded because the entry handler returns before
+ * modifying the return address if it is nested. Anyway, we need to
+ * avoid calling user callbacks if it is nested.
+ */
+ if (WARN_ON_ONCE(bit < 0))
+ goto out;
+
#ifdef CONFIG_FUNCTION_GRAPH_RETVAL
trace.retval = ftrace_regs_get_return_value(fregs);
#endif
@@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
}
}
+ ftrace_test_recursion_unlock(bit);
+out:
/*
* The ftrace_graph_return() may still access the current
* ret_stack structure, we need to make sure the update of
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-19 8:58 ` Menglong Dong
@ 2025-09-19 12:32 ` Masami Hiramatsu
0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-19 12:32 UTC (permalink / raw)
To: Menglong Dong
Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf
On Fri, 19 Sep 2025 16:58:57 +0800
Menglong Dong <menglong8.dong@gmail.com> wrote:
> On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write:
> > On Thu, 18 Sep 2025 20:09:39 +0800
> > Menglong Dong <menglong8.dong@gmail.com> wrote:
> >
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > make it notrace.
> >
> > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check
> > recursion, but fgraph only detects the recursion in the entry handler.
> > Thus it happens in the exit handler, fprobe does not check the recursion.
> >
> > But since the fprobe provides users to register callback at exit, it
> > should check the recursion in return path too.
>
> That's a good idea to provide recursion checking for the exit handler,
> which is able to solve this problem too.
>
> If so, we don't need to check the recursion on the kprobe-multi anymore.
> Do we?
Yes, but *if possible*, please avoid calling such functions from fprobe
callbacks. This just prevents kernel crash from such recursion, but that
means it is not possible to trace such functions.
Thank you,
>
> Thanks!
> Menglong Dong
>
> >
> > Thanks,
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
>
>
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 16:56 ` Peter Zijlstra
@ 2025-09-19 12:35 ` Masami Hiramatsu
0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-19 12:35 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa,
kees, samitolvanen, rppt, luto, mhiramat, ast, andrii,
linux-kernel, bpf
On Thu, 18 Sep 2025 18:56:56 +0200
Peter Zijlstra <peterz@infradead.org> wrote:
> On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote:
> > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote:
> > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
> > > > kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > It is not protected by the "bpf_prog_active", so it can't be traced by
> > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by
> > > > make it notrace.
> > >
> > > This is very much a riddle wrapped in an enigma. Notably
> > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that
> > > cryptic next line sufficient to explain why its a problem.
> > >
> > > I suspect the is_endbr() you did mean is the one in
> > > arch_ftrace_get_symaddr(), but who knows.
> >
> > Yeah, I mean
> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > arch_ftrace_get_symaddr -> is_endbr
> > actually. And CONFIG_X86_KERNEL_IBT is enabled of course.
> >
> > >
> > > Also, depending on compiler insanity, it is possible the thing
> > > out-of-lines things like __is_endbr(), getting you yet another
> > > __fentry__ site.
> >
> > The panic happens when I run the bpf bench testing:
> > ./bench kretprobe-multi-all
> >
> > And skip the "is_endbr" fix this problem.
>
> But why does it panic? Supposedly you've done the analysis; but then
> forgot to write it down?
Yeah, that is an fprobe's bug. It should not panic. I sent a fix.
https://lore.kernel.org/all/175828305637.117978.4183947592750468265.stgit@devnote2/
Thank you,
>
> Why is kprobe_multi_link_exit_handler() special; doesn't the issue also
> exist with kprobe_multi_link_handler() ? If so, removing __fentry__
> isn't going to help much, you can just stick an actual kprobe in
> is_endbr(), right?
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google)
@ 2025-09-19 15:27 ` Steven Rostedt
2025-09-20 7:45 ` Jiri Olsa
2025-09-21 4:05 ` Masami Hiramatsu
2025-09-20 13:39 ` Menglong Dong
1 sibling, 2 replies; 32+ messages in thread
From: Steven Rostedt @ 2025-09-19 15:27 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Fri, 19 Sep 2025 20:57:36 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> function_graph_enter_regs() prevents itself from recursion by
> ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> which is called at the exit, does not prevent such recursion.
> Therefore, while it can prevent recursive calls from
> fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> This can lead an unexpected recursion bug reported by Menglong.
>
> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> -> kprobe_multi_link_exit_handler -> is_endbr.
So basically its if the handler for the return part calls something that it
is tracing, it can trigger the recursion?
>
> To fix this issue, acquire ftrace_test_recursion_trylock() in the
> __ftrace_return_to_handler() after unwind the shadow stack to mark
> this section must prevent recursive call of fgraph inside user-defined
> fgraph_ops::retfunc().
>
> This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> fprobe on function-graph tracer"), because before that fgraph was
> only used from the function graph tracer. Fprobe allowed user to run
> any callbacks from fgraph after that commit.
I would actually say it's because before this commit, the return handler
callers never called anything that the entry handlers didn't already call.
If there was recursion, the entry handler would catch it (and the entry
tells fgraph if the exit handler should be called).
The difference here is with fprobes, you can have the exit handler calling
functions that the entry handler does not, which exposes more cases where
recursion could happen.
>
> Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
> kernel/trace/fgraph.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 1e3b32b1e82c..08dde420635b 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> unsigned long bitmap;
> unsigned long ret;
> int offset;
> + int bit;
> int i;
>
> ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> if (fregs)
> ftrace_regs_set_instruction_pointer(fregs, ret);
>
> + bit = ftrace_test_recursion_trylock(trace.func, ret);
> + /*
> + * This must be succeeded because the entry handler returns before
> + * modifying the return address if it is nested. Anyway, we need to
> + * avoid calling user callbacks if it is nested.
> + */
> + if (WARN_ON_ONCE(bit < 0))
I'm not so sure we need the warn on here. We should probably hook it to the
recursion detection infrastructure that the function tracer has.
The reason I would say not to have the warn on, is because we don't have a
warn on for recursion happening at the entry handler. Because this now is
exposed by fprobe allowing different routines to be called at exit than
what is used in entry, it can easily be triggered.
-- Steve
> + goto out;
> +
> #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> trace.retval = ftrace_regs_get_return_value(fregs);
> #endif
> @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> }
> }
>
> + ftrace_test_recursion_unlock(bit);
> +out:
> /*
> * The ftrace_graph_return() may still access the current
> * ret_stack structure, we need to make sure the update of
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-19 15:27 ` Steven Rostedt
@ 2025-09-20 7:45 ` Jiri Olsa
2025-09-22 6:16 ` Masami Hiramatsu
2025-09-21 4:05 ` Masami Hiramatsu
1 sibling, 1 reply; 32+ messages in thread
From: Jiri Olsa @ 2025-09-20 7:45 UTC (permalink / raw)
To: Steven Rostedt
Cc: Masami Hiramatsu (Google), Peter Zijlstra, Steven Rostedt,
Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf
On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote:
> On Fri, 19 Sep 2025 20:57:36 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
>
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead an unexpected recursion bug reported by Menglong.
> >
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > -> kprobe_multi_link_exit_handler -> is_endbr.
>
> So basically its if the handler for the return part calls something that it
> is tracing, it can trigger the recursion?
>
> >
> > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > this section must prevent recursive call of fgraph inside user-defined
> > fgraph_ops::retfunc().
> >
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. Fprobe allowed user to run
> > any callbacks from fgraph after that commit.
>
> I would actually say it's because before this commit, the return handler
> callers never called anything that the entry handlers didn't already call.
> If there was recursion, the entry handler would catch it (and the entry
> tells fgraph if the exit handler should be called).
>
> The difference here is with fprobes, you can have the exit handler calling
> functions that the entry handler does not, which exposes more cases where
> recursion could happen.
so IIUC we have return kprobe multi probe on is_endbr and now we do:
is_endbr()
{ -> function_graph_enter_regs installs return probe
...
} -> __ftrace_return_to_handler
fprobe_return
kprobe_multi_link_exit_handler
is_endbr
{ -> function_graph_enter_regs installs return probe
...
} -> __ftrace_return_to_handler
fprobe_return
kprobe_multi_link_exit_handler
is_endbr
{ -> function_graph_enter_regs installs return probe
...
} -> __ftrace_return_to_handler
... recursion
with the fix:
is_endbr()
{ -> function_graph_enter_regs installs return probe
...
} -> __ftrace_return_to_handler
fprobe_return
kprobe_multi_link_exit_handler
...
is_endbr
{ -> function_graph_enter_regs
ftrace_test_recursion_trylock fails and we do NOT install return probe
...
}
there's is_endbr call also in kprobe_multi_link_handler, but it won't
trigger recursion, because function_graph_enter_regs already uses
ftrace_test_recursion_trylock
if above is correct then the fix looks good to me
Acked-by: Jiri Olsa <jolsa@kernel.org>
thanks,
jirka
>
> >
> > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> > kernel/trace/fgraph.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > unsigned long bitmap;
> > unsigned long ret;
> > int offset;
> > + int bit;
> > int i;
> >
> > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > if (fregs)
> > ftrace_regs_set_instruction_pointer(fregs, ret);
> >
> > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > + /*
> > + * This must be succeeded because the entry handler returns before
> > + * modifying the return address if it is nested. Anyway, we need to
> > + * avoid calling user callbacks if it is nested.
> > + */
> > + if (WARN_ON_ONCE(bit < 0))
>
> I'm not so sure we need the warn on here. We should probably hook it to the
> recursion detection infrastructure that the function tracer has.
>
> The reason I would say not to have the warn on, is because we don't have a
> warn on for recursion happening at the entry handler. Because this now is
> exposed by fprobe allowing different routines to be called at exit than
> what is used in entry, it can easily be triggered.
>
> -- Steve
>
>
>
> > + goto out;
> > +
> > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > trace.retval = ftrace_regs_get_return_value(fregs);
> > #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > }
> > }
> >
> > + ftrace_test_recursion_unlock(bit);
> > +out:
> > /*
> > * The ftrace_graph_return() may still access the current
> > * ret_stack structure, we need to make sure the update of
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google)
2025-09-19 15:27 ` Steven Rostedt
@ 2025-09-20 13:39 ` Menglong Dong
2025-09-21 4:06 ` Masami Hiramatsu
2025-09-22 5:19 ` Masami Hiramatsu
1 sibling, 2 replies; 32+ messages in thread
From: Menglong Dong @ 2025-09-20 13:39 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto,
mhiramat, ast, andrii, linux-kernel, bpf
On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote:
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> function_graph_enter_regs() prevents itself from recursion by
> ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> which is called at the exit, does not prevent such recursion.
> Therefore, while it can prevent recursive calls from
> fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> This can lead an unexpected recursion bug reported by Menglong.
>
> is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> -> kprobe_multi_link_exit_handler -> is_endbr.
>
> To fix this issue, acquire ftrace_test_recursion_trylock() in the
> __ftrace_return_to_handler() after unwind the shadow stack to mark
> this section must prevent recursive call of fgraph inside user-defined
> fgraph_ops::retfunc().
>
> This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> fprobe on function-graph tracer"), because before that fgraph was
> only used from the function graph tracer. Fprobe allowed user to run
> any callbacks from fgraph after that commit.
>
> Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> Cc: stable@vger.kernel.org
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
> kernel/trace/fgraph.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index 1e3b32b1e82c..08dde420635b 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> unsigned long bitmap;
> unsigned long ret;
> int offset;
> + int bit;
> int i;
>
> ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> if (fregs)
> ftrace_regs_set_instruction_pointer(fregs, ret);
>
> + bit = ftrace_test_recursion_trylock(trace.func, ret);
> + /*
> + * This must be succeeded because the entry handler returns before
> + * modifying the return address if it is nested. Anyway, we need to
> + * avoid calling user callbacks if it is nested.
> + */
> + if (WARN_ON_ONCE(bit < 0))
> + goto out;
Hi, the logic seems right, but the warning is triggered when
I try to run the bpf bench testing:
$ ./benchs/run_bench_trigger.sh kretprobe-multi-all
[ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
[ 139.509036] ------------[ cut here ]------------
[ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
[ 139.509411] Modules linked in: virtio_net
[ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
[ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
[ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
[ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
[ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
[ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
[ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
[ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
[ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
[ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
[ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
[ 139.512038] PKRU: 55555554
[ 139.512106] Call Trace:
[ 139.512177] <IRQ>
[ 139.512232] ? irq_exit_rcu+0x4/0xb0
[ 139.512322] return_to_handler+0x1e/0x50
[ 139.512422] ? idle_cpu+0x9/0x50
[ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
[ 139.512638] ? idle_cpu+0x9/0x50
[ 139.512731] ? irq_exit_rcu+0x3a/0xb0
[ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
[ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
[ 139.513101] </IRQ>
[ 139.513168] <TASK>
> +
> #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> trace.retval = ftrace_regs_get_return_value(fregs);
> #endif
> @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> }
> }
>
> + ftrace_test_recursion_unlock(bit);
> +out:
> /*
> * The ftrace_graph_return() may still access the current
> * ret_stack structure, we need to make sure the update of
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-19 15:27 ` Steven Rostedt
2025-09-20 7:45 ` Jiri Olsa
@ 2025-09-21 4:05 ` Masami Hiramatsu
2025-09-21 22:52 ` Steven Rostedt
1 sibling, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-21 4:05 UTC (permalink / raw)
To: Steven Rostedt
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Fri, 19 Sep 2025 11:27:46 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Fri, 19 Sep 2025 20:57:36 +0900
> "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
>
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead an unexpected recursion bug reported by Menglong.
> >
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > -> kprobe_multi_link_exit_handler -> is_endbr.
>
> So basically its if the handler for the return part calls something that it
> is tracing, it can trigger the recursion?
>
> >
> > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > this section must prevent recursive call of fgraph inside user-defined
> > fgraph_ops::retfunc().
> >
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. Fprobe allowed user to run
> > any callbacks from fgraph after that commit.
>
> I would actually say it's because before this commit, the return handler
> callers never called anything that the entry handlers didn't already call.
> If there was recursion, the entry handler would catch it (and the entry
> tells fgraph if the exit handler should be called).
>
> The difference here is with fprobes, you can have the exit handler calling
> functions that the entry handler does not, which exposes more cases where
> recursion could happen.
>
> >
> > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> > kernel/trace/fgraph.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > unsigned long bitmap;
> > unsigned long ret;
> > int offset;
> > + int bit;
> > int i;
> >
> > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > if (fregs)
> > ftrace_regs_set_instruction_pointer(fregs, ret);
> >
> > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > + /*
> > + * This must be succeeded because the entry handler returns before
> > + * modifying the return address if it is nested. Anyway, we need to
> > + * avoid calling user callbacks if it is nested.
> > + */
> > + if (WARN_ON_ONCE(bit < 0))
>
> I'm not so sure we need the warn on here. We should probably hook it to the
> recursion detection infrastructure that the function tracer has.
The reason I added WARN_ON here is this should not happen. The recursion
should be detected at the entry, not at exit. The __ftrace_return_to_handler()
is installed only if the entry does NOT detect the recursion. In that case
I think the exit should succeed recursion_trylock().
>
> The reason I would say not to have the warn on, is because we don't have a
> warn on for recursion happening at the entry handler. Because this now is
> exposed by fprobe allowing different routines to be called at exit than
> what is used in entry, it can easily be triggered.
At the entry, if it detect recursion, it exits soon. But here we have to
process stack unwind to get the return address. This recursion_trylock()
is to mark this is in the critical section, not detect it.
Thank you,
>
> -- Steve
>
>
>
> > + goto out;
> > +
> > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > trace.retval = ftrace_regs_get_return_value(fregs);
> > #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > }
> > }
> >
> > + ftrace_test_recursion_unlock(bit);
> > +out:
> > /*
> > * The ftrace_graph_return() may still access the current
> > * ret_stack structure, we need to make sure the update of
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-20 13:39 ` Menglong Dong
@ 2025-09-21 4:06 ` Masami Hiramatsu
2025-09-21 23:00 ` Steven Rostedt
2025-09-22 5:19 ` Masami Hiramatsu
1 sibling, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-21 4:06 UTC (permalink / raw)
To: Menglong Dong
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sat, 20 Sep 2025 21:39:25 +0800
Menglong Dong <menglong.dong@linux.dev> wrote:
> On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead an unexpected recursion bug reported by Menglong.
> >
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > -> kprobe_multi_link_exit_handler -> is_endbr.
> >
> > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > this section must prevent recursive call of fgraph inside user-defined
> > fgraph_ops::retfunc().
> >
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. Fprobe allowed user to run
> > any callbacks from fgraph after that commit.
> >
> > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> > kernel/trace/fgraph.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > unsigned long bitmap;
> > unsigned long ret;
> > int offset;
> > + int bit;
> > int i;
> >
> > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > if (fregs)
> > ftrace_regs_set_instruction_pointer(fregs, ret);
> >
> > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > + /*
> > + * This must be succeeded because the entry handler returns before
> > + * modifying the return address if it is nested. Anyway, we need to
> > + * avoid calling user callbacks if it is nested.
> > + */
> > + if (WARN_ON_ONCE(bit < 0))
> > + goto out;
>
> Hi, the logic seems right, but the warning is triggered when
> I try to run the bpf bench testing:
Hmm, this is strange. Let me check why this happens.
Thank you,
>
> $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> [ 139.509036] ------------[ cut here ]------------
> [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> [ 139.509411] Modules linked in: virtio_net
> [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
> [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> [ 139.512038] PKRU: 55555554
> [ 139.512106] Call Trace:
> [ 139.512177] <IRQ>
> [ 139.512232] ? irq_exit_rcu+0x4/0xb0
> [ 139.512322] return_to_handler+0x1e/0x50
> [ 139.512422] ? idle_cpu+0x9/0x50
> [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
> [ 139.512638] ? idle_cpu+0x9/0x50
> [ 139.512731] ? irq_exit_rcu+0x3a/0xb0
> [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
> [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
> [ 139.513101] </IRQ>
> [ 139.513168] <TASK>
>
> > +
> > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > trace.retval = ftrace_regs_get_return_value(fregs);
> > #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > }
> > }
> >
> > + ftrace_test_recursion_unlock(bit);
> > +out:
> > /*
> > * The ftrace_graph_return() may still access the current
> > * ret_stack structure, we need to make sure the update of
> >
> >
> >
>
>
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-21 4:05 ` Masami Hiramatsu
@ 2025-09-21 22:52 ` Steven Rostedt
2025-09-24 22:58 ` Masami Hiramatsu
0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2025-09-21 22:52 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sun, 21 Sep 2025 13:05:19 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> > The reason I would say not to have the warn on, is because we don't have a
> > warn on for recursion happening at the entry handler. Because this now is
> > exposed by fprobe allowing different routines to be called at exit than
> > what is used in entry, it can easily be triggered.
>
> At the entry, if it detect recursion, it exits soon. But here we have to
> process stack unwind to get the return address. This recursion_trylock()
> is to mark this is in the critical section, not detect it.
Ah, because the first instance of the exit callback sets the recursion
bit. This will cause recursed entry calls to detect the recursion bit
and return without setting the exit handler to be called.
That is, by setting the recursion bit in the exit handler, it will cause
a recursion in entry to fail before the exit is called again.
I'd like to update the comment:
+ bit = ftrace_test_recursion_trylock(trace.func, ret);
+ /*
+ * This must be succeeded because the entry handler returns before
+ * modifying the return address if it is nested. Anyway, we need to
+ * avoid calling user callbacks if it is nested.
+ */
+ if (WARN_ON_ONCE(bit < 0))
+ goto out;
+
to:
/*
* Setting the recursion bit here will cause the graph entry to
* detect recursion before the exit handle will. If the ext
* handler detects recursion, something went wrong.
*/
if (WARN_ON_ONCE(bit < 0))
-- Steve
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-21 4:06 ` Masami Hiramatsu
@ 2025-09-21 23:00 ` Steven Rostedt
2025-09-24 22:59 ` Masami Hiramatsu
0 siblings, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2025-09-21 23:00 UTC (permalink / raw)
To: Masami Hiramatsu (Google)
Cc: Menglong Dong, Peter Zijlstra, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sun, 21 Sep 2025 13:06:47 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> >
> > Hi, the logic seems right, but the warning is triggered when
> > I try to run the bpf bench testing:
>
> Hmm, this is strange. Let me check why this happens.
>
> Thank you,
>
> >
> > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> > [ 139.509036] ------------[ cut here ]------------
> > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> > [ 139.509411] Modules linked in: virtio_net
> > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
> > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> > [ 139.512038] PKRU: 55555554
> > [ 139.512106] Call Trace:
> > [ 139.512177] <IRQ>
> > [ 139.512232] ? irq_exit_rcu+0x4/0xb0
> > [ 139.512322] return_to_handler+0x1e/0x50
> > [ 139.512422] ? idle_cpu+0x9/0x50
> > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
> > [ 139.512638] ? idle_cpu+0x9/0x50
> > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0
> > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
> > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
> > [ 139.513101] </IRQ>
> > [ 139.513168] <TASK>
> >
> > > +
> > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > trace.retval = ftrace_regs_get_return_value(fregs);
> > > #endif
> > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > }
> > > }
> > >
> > > + ftrace_test_recursion_unlock(bit);
> > > +out:
> > > /*
> > > * The ftrace_graph_return() may still access the current
> > > * ret_stack structure, we need to make sure the update of
Hmm, I wonder if this has to do with the "TRANSITION BIT". The
ftrace_test_recursion_trylock() allows one level of recursion. This is
to handle the case of an interrupt happening after the recursion bit is
set and traces something before it updates the context in the preempt
count. This would cause a false positive of the recursion test. To
handle this case, it allows a single level of recursion.
I originally was going to mention this, but I still can't see how this
would affect it. Because if the entry were to allow one level of
recursion, so would the exit. I still see the entry preventing the exit
to be called. But perhaps there's an combination that I missed?
-- Steve
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-20 13:39 ` Menglong Dong
2025-09-21 4:06 ` Masami Hiramatsu
@ 2025-09-22 5:19 ` Masami Hiramatsu
1 sibling, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-22 5:19 UTC (permalink / raw)
To: Menglong Dong
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sat, 20 Sep 2025 21:39:25 +0800
Menglong Dong <menglong.dong@linux.dev> wrote:
> On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote:
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > function_graph_enter_regs() prevents itself from recursion by
> > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > which is called at the exit, does not prevent such recursion.
> > Therefore, while it can prevent recursive calls from
> > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > This can lead an unexpected recursion bug reported by Menglong.
> >
> > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > -> kprobe_multi_link_exit_handler -> is_endbr.
> >
> > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > this section must prevent recursive call of fgraph inside user-defined
> > fgraph_ops::retfunc().
> >
> > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > fprobe on function-graph tracer"), because before that fgraph was
> > only used from the function graph tracer. Fprobe allowed user to run
> > any callbacks from fgraph after that commit.
> >
> > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > Cc: stable@vger.kernel.org
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > ---
> > kernel/trace/fgraph.c | 12 ++++++++++++
> > 1 file changed, 12 insertions(+)
> >
> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index 1e3b32b1e82c..08dde420635b 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > unsigned long bitmap;
> > unsigned long ret;
> > int offset;
> > + int bit;
> > int i;
> >
> > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > if (fregs)
> > ftrace_regs_set_instruction_pointer(fregs, ret);
> >
> > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > + /*
> > + * This must be succeeded because the entry handler returns before
> > + * modifying the return address if it is nested. Anyway, we need to
> > + * avoid calling user callbacks if it is nested.
> > + */
> > + if (WARN_ON_ONCE(bit < 0))
> > + goto out;
>
> Hi, the logic seems right, but the warning is triggered when
> I try to run the bpf bench testing:
Ah, I remember that this ftrace_test_recursion_trylock() allows
one-level nesting. OK, I will remove the WARN_ON.
Thanks,
>
> $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> [ 139.509036] ------------[ cut here ]------------
> [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> [ 139.509411] Modules linked in: virtio_net
> [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
> [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> [ 139.512038] PKRU: 55555554
> [ 139.512106] Call Trace:
> [ 139.512177] <IRQ>
> [ 139.512232] ? irq_exit_rcu+0x4/0xb0
> [ 139.512322] return_to_handler+0x1e/0x50
> [ 139.512422] ? idle_cpu+0x9/0x50
> [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
> [ 139.512638] ? idle_cpu+0x9/0x50
> [ 139.512731] ? irq_exit_rcu+0x3a/0xb0
> [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
> [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
> [ 139.513101] </IRQ>
> [ 139.513168] <TASK>
>
> > +
> > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > trace.retval = ftrace_regs_get_return_value(fregs);
> > #endif
> > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > }
> > }
> >
> > + ftrace_test_recursion_unlock(bit);
> > +out:
> > /*
> > * The ftrace_graph_return() may still access the current
> > * ret_stack structure, we need to make sure the update of
> >
> >
> >
>
>
>
>
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-20 7:45 ` Jiri Olsa
@ 2025-09-22 6:16 ` Masami Hiramatsu
2025-09-22 13:38 ` Jiri Olsa
0 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-22 6:16 UTC (permalink / raw)
To: Jiri Olsa
Cc: Steven Rostedt, Masami Hiramatsu (Google), Peter Zijlstra,
Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86,
hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel,
bpf
On Sat, 20 Sep 2025 09:45:15 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote:
> > On Fri, 19 Sep 2025 20:57:36 +0900
> > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> >
> > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >
> > > function_graph_enter_regs() prevents itself from recursion by
> > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > > which is called at the exit, does not prevent such recursion.
> > > Therefore, while it can prevent recursive calls from
> > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > > This can lead an unexpected recursion bug reported by Menglong.
> > >
> > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > > -> kprobe_multi_link_exit_handler -> is_endbr.
> >
> > So basically its if the handler for the return part calls something that it
> > is tracing, it can trigger the recursion?
> >
> > >
> > > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > > this section must prevent recursive call of fgraph inside user-defined
> > > fgraph_ops::retfunc().
> > >
> > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > > fprobe on function-graph tracer"), because before that fgraph was
> > > only used from the function graph tracer. Fprobe allowed user to run
> > > any callbacks from fgraph after that commit.
> >
> > I would actually say it's because before this commit, the return handler
> > callers never called anything that the entry handlers didn't already call.
> > If there was recursion, the entry handler would catch it (and the entry
> > tells fgraph if the exit handler should be called).
> >
> > The difference here is with fprobes, you can have the exit handler calling
> > functions that the entry handler does not, which exposes more cases where
> > recursion could happen.
>
> so IIUC we have return kprobe multi probe on is_endbr and now we do:
>
> is_endbr()
> { -> function_graph_enter_regs installs return probe
> ...
> } -> __ftrace_return_to_handler
> fprobe_return
> kprobe_multi_link_exit_handler
> is_endbr
> { -> function_graph_enter_regs installs return probe
> ...
> } -> __ftrace_return_to_handler
> fprobe_return
> kprobe_multi_link_exit_handler
> is_endbr
> { -> function_graph_enter_regs installs return probe
> ...
> } -> __ftrace_return_to_handler
> ... recursion
>
>
> with the fix:
>
> is_endbr()
> { -> function_graph_enter_regs installs return probe
> ...
> } -> __ftrace_return_to_handler
> fprobe_return
> kprobe_multi_link_exit_handler
> ...
> is_endbr
> { -> function_graph_enter_regs
> ftrace_test_recursion_trylock fails and we do NOT install return probe
> ...
> }
>
>
> there's is_endbr call also in kprobe_multi_link_handler, but it won't
> trigger recursion, because function_graph_enter_regs already uses
> ftrace_test_recursion_trylock
>
>
> if above is correct then the fix looks good to me
>
> Acked-by: Jiri Olsa <jolsa@kernel.org>
Hi Jiri,
I found ftrace_test_recursion_trylock() allows one nest level, can you
make sure it is OK?
Thank you,
>
> thanks,
> jirka
>
>
> >
> > >
> > > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > ---
> > > kernel/trace/fgraph.c | 12 ++++++++++++
> > > 1 file changed, 12 insertions(+)
> > >
> > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > > index 1e3b32b1e82c..08dde420635b 100644
> > > --- a/kernel/trace/fgraph.c
> > > +++ b/kernel/trace/fgraph.c
> > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > unsigned long bitmap;
> > > unsigned long ret;
> > > int offset;
> > > + int bit;
> > > int i;
> > >
> > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > if (fregs)
> > > ftrace_regs_set_instruction_pointer(fregs, ret);
> > >
> > > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > > + /*
> > > + * This must be succeeded because the entry handler returns before
> > > + * modifying the return address if it is nested. Anyway, we need to
> > > + * avoid calling user callbacks if it is nested.
> > > + */
> > > + if (WARN_ON_ONCE(bit < 0))
> >
> > I'm not so sure we need the warn on here. We should probably hook it to the
> > recursion detection infrastructure that the function tracer has.
> >
> > The reason I would say not to have the warn on, is because we don't have a
> > warn on for recursion happening at the entry handler. Because this now is
> > exposed by fprobe allowing different routines to be called at exit than
> > what is used in entry, it can easily be triggered.
> >
> > -- Steve
> >
> >
> >
> > > + goto out;
> > > +
> > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > trace.retval = ftrace_regs_get_return_value(fregs);
> > > #endif
> > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > }
> > > }
> > >
> > > + ftrace_test_recursion_unlock(bit);
> > > +out:
> > > /*
> > > * The ftrace_graph_return() may still access the current
> > > * ret_stack structure, we need to make sure the update of
> >
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-18 17:53 ` Alexei Starovoitov
2025-09-19 1:13 ` Menglong Dong
@ 2025-09-22 6:36 ` Peter Zijlstra
1 sibling, 0 replies; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-22 6:36 UTC (permalink / raw)
To: Alexei Starovoitov
Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Thu, Sep 18, 2025 at 10:53:51AM -0700, Alexei Starovoitov wrote:
> > notrace don't help with kprobes in general, only with __fentry__ sites.
>
> Are you sure ? Last time I checked "notrace" prevents kprobing
> anywhere in that function.
#define notrace __attribute__((__no_instrument_function__))
AFAICT that only inhibits the compiler from generating __fentry__ calls.
There is NOKPROBE_SYMBOL() to explicitly exclude kprobes from a symbol.
And we have noinstr, which along with a gazillion code-gen attributes
puts things in a .noinstr.text section and that section is also excluded
from kprobe placement.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-19 1:13 ` Menglong Dong
@ 2025-09-22 6:52 ` Peter Zijlstra
2025-09-22 7:13 ` menglong.dong
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-22 6:52 UTC (permalink / raw)
To: Menglong Dong
Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote:
> Ok, let me describe the problem in deetail.
>
> First of all, it has nothing to do with kprobe. The bpf program of type
> kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
> about the ftrace, which means __fentry__.
Well, that's not confusing at all. Something called kprobe-multi not
being related to kprobes :-(
> Second, let me explain the recur detection of the kprobe-multi. Let's
> take the is_endbr() for example. When it is hooked by the bpf program
> of type kretprobe-multi, following calling chain will happen:
>
> is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> arch_ftrace_get_symaddr -> is_endbr
>
> Look, is_endbr() is called again during the ftrace handler, so it will
> trigger the ftrace handler(__ftrace_return_to_handler) again, which
> causes recurrence.
Right.
> Such recurrence can be detected. In kprobe_multi_link_prog_run(),
> the percpu various "bpf_prog_active" will be increased by 1 before we
> run the bpf progs, and decrease by 1 after the bpf progs finish. If the
> kprobe_multi_link_prog_run() is triggered again during bpf progs run,
> it will check if bpf_prog_active is zero, and return directly if it is not.
> Therefore, recurrence can't happen within the "bpf_prog_active" protection.
As I think Masami already said, the problem is the layer. You're trying
to fix an ftrace problem at the bpf layer.
> However, the calling to is_endbr() is not within that scope, which makes
> the recurrence happen.
Sorta, I'm still sketchy on the whole kprobe-multi thing.
Anyway, I don't mind making is_endbr() invisible to tracing, that might
just have security benefits too. But I think first the ftrace folks need
to figure out how to best kill that recursion, because I don't think
is_endbr is particularly special here.
It is just one more function that can emit a __fentry__ site.
Anyway, something like the below would do:
Note that without making __is_endbr() __always_inline, you run the risk
of the compiler being retarded (they often are in the face of
KASAN/UBSAN like) and deciding to out-of-line that function, resulting
in yet another __fentry__ site.
An added advantage of noinstr is that it is validated by objtool to
never call to !noinstr code. As such, you can be sure there is no
instrumentation in it.
(the below hasn't been near a compiler)
---
arch/x86/include/asm/ibt.h | 2 +-
arch/x86/kernel/alternative.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h
index 5e45d6424722..54937a527042 100644
--- a/arch/x86/include/asm/ibt.h
+++ b/arch/x86/include/asm/ibt.h
@@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void)
return 0xd6401f0f; /* nopl -42(%rax) */
}
-static inline bool __is_endbr(u32 val)
+static __always_inline bool __is_endbr(u32 val)
{
if (val == gen_endbr_poison())
return true;
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f791e7abd466 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
#ifdef CONFIG_X86_KERNEL_IBT
-__noendbr bool is_endbr(u32 *val)
+__noendbr noinstr bool is_endbr(u32 *val)
{
u32 endbr;
^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-22 6:52 ` Peter Zijlstra
@ 2025-09-22 7:13 ` menglong.dong
2025-09-22 7:19 ` Peter Zijlstra
0 siblings, 1 reply; 32+ messages in thread
From: menglong.dong @ 2025-09-22 7:13 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On 2025/9/22 14:52 Peter Zijlstra <peterz@infradead.org> write:
> On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote:
>
> > Ok, let me describe the problem in deetail.
> >
> > First of all, it has nothing to do with kprobe. The bpf program of type
> > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all
> > about the ftrace, which means __fentry__.
>
> Well, that's not confusing at all. Something called kprobe-multi not
> being related to kprobes :-(
>
> > Second, let me explain the recur detection of the kprobe-multi. Let's
> > take the is_endbr() for example. When it is hooked by the bpf program
> > of type kretprobe-multi, following calling chain will happen:
> >
> > is_endbr -> __ftrace_return_to_handler -> fprobe_return ->
> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip ->
> > arch_ftrace_get_symaddr -> is_endbr
> >
> > Look, is_endbr() is called again during the ftrace handler, so it will
> > trigger the ftrace handler(__ftrace_return_to_handler) again, which
> > causes recurrence.
>
> Right.
>
> > Such recurrence can be detected. In kprobe_multi_link_prog_run(),
> > the percpu various "bpf_prog_active" will be increased by 1 before we
> > run the bpf progs, and decrease by 1 after the bpf progs finish. If the
> > kprobe_multi_link_prog_run() is triggered again during bpf progs run,
> > it will check if bpf_prog_active is zero, and return directly if it is not.
> > Therefore, recurrence can't happen within the "bpf_prog_active" protection.
>
> As I think Masami already said, the problem is the layer. You're trying
> to fix an ftrace problem at the bpf layer.
Yeah, I see. And Masami has already posted a series for this
problem in:
https://lore.kernel.org/bpf/175852291163.307379.14414635977719513326.stgit@devnote2/
>
> > However, the calling to is_endbr() is not within that scope, which makes
> > the recurrence happen.
>
> Sorta, I'm still sketchy on the whole kprobe-multi thing.
>
> Anyway, I don't mind making is_endbr() invisible to tracing, that might
> just have security benefits too. But I think first the ftrace folks need
> to figure out how to best kill that recursion, because I don't think
> is_endbr is particularly special here.
So, does this patch seem useful after all?
OK, I'll send a V2 base on your following suggestion.
Thanks!
Menglong Dong
>
> It is just one more function that can emit a __fentry__ site.
>
> Anyway, something like the below would do:
>
> Note that without making __is_endbr() __always_inline, you run the risk
> of the compiler being retarded (they often are in the face of
> KASAN/UBSAN like) and deciding to out-of-line that function, resulting
> in yet another __fentry__ site.
>
> An added advantage of noinstr is that it is validated by objtool to
> never call to !noinstr code. As such, you can be sure there is no
> instrumentation in it.
>
> (the below hasn't been near a compiler)
>
> ---
> arch/x86/include/asm/ibt.h | 2 +-
> arch/x86/kernel/alternative.c | 2 +-
> 2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h
> index 5e45d6424722..54937a527042 100644
> --- a/arch/x86/include/asm/ibt.h
> +++ b/arch/x86/include/asm/ibt.h
> @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void)
> return 0xd6401f0f; /* nopl -42(%rax) */
> }
>
> -static inline bool __is_endbr(u32 val)
> +static __always_inline bool __is_endbr(u32 val)
> {
> if (val == gen_endbr_poison())
> return true;
> diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> index 69fb818df2ee..f791e7abd466 100644
> --- a/arch/x86/kernel/alternative.c
> +++ b/arch/x86/kernel/alternative.c
> @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
>
> #ifdef CONFIG_X86_KERNEL_IBT
>
> -__noendbr bool is_endbr(u32 *val)
> +__noendbr noinstr bool is_endbr(u32 *val)
> {
> u32 endbr;
>
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-22 7:13 ` menglong.dong
@ 2025-09-22 7:19 ` Peter Zijlstra
2025-09-22 7:21 ` Menglong Dong
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2025-09-22 7:19 UTC (permalink / raw)
To: menglong.dong
Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote:
> > Anyway, I don't mind making is_endbr() invisible to tracing, that might
> > just have security benefits too. But I think first the ftrace folks need
> > to figure out how to best kill that recursion, because I don't think
> > is_endbr is particularly special here.
>
> So, does this patch seem useful after all?
The use lies in making it harder to find/manipulate endbr things.
> OK, I'll send a V2 base on your following suggestion.
Hold off until Masami/Steve have fixed the ftrace recursion issue. After
that we can do this.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace
2025-09-22 7:19 ` Peter Zijlstra
@ 2025-09-22 7:21 ` Menglong Dong
0 siblings, 0 replies; 32+ messages in thread
From: Menglong Dong @ 2025-09-22 7:21 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar,
Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook,
Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu,
Alexei Starovoitov, Andrii Nakryiko, LKML, bpf
On 2025/9/22 15:19 Peter Zijlstra <peterz@infradead.org> write:
> On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote:
>
> > > Anyway, I don't mind making is_endbr() invisible to tracing, that might
> > > just have security benefits too. But I think first the ftrace folks need
> > > to figure out how to best kill that recursion, because I don't think
> > > is_endbr is particularly special here.
> >
> > So, does this patch seem useful after all?
>
> The use lies in making it harder to find/manipulate endbr things.
>
> > OK, I'll send a V2 base on your following suggestion.
>
> Hold off until Masami/Steve have fixed the ftrace recursion issue. After
> that we can do this.
OK!
>
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-22 6:16 ` Masami Hiramatsu
@ 2025-09-22 13:38 ` Jiri Olsa
2025-09-22 14:42 ` Steven Rostedt
2025-09-22 19:45 ` Jiri Olsa
0 siblings, 2 replies; 32+ messages in thread
From: Jiri Olsa @ 2025-09-22 13:38 UTC (permalink / raw)
To: Masami Hiramatsu
Cc: Jiri Olsa, Steven Rostedt, Peter Zijlstra, Steven Rostedt,
Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf
On Mon, Sep 22, 2025 at 03:16:55PM +0900, Masami Hiramatsu wrote:
> On Sat, 20 Sep 2025 09:45:15 +0200
> Jiri Olsa <olsajiri@gmail.com> wrote:
>
> > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote:
> > > On Fri, 19 Sep 2025 20:57:36 +0900
> > > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> > >
> > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > >
> > > > function_graph_enter_regs() prevents itself from recursion by
> > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > > > which is called at the exit, does not prevent such recursion.
> > > > Therefore, while it can prevent recursive calls from
> > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > > > This can lead an unexpected recursion bug reported by Menglong.
> > > >
> > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > > > -> kprobe_multi_link_exit_handler -> is_endbr.
> > >
> > > So basically its if the handler for the return part calls something that it
> > > is tracing, it can trigger the recursion?
> > >
> > > >
> > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > > > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > > > this section must prevent recursive call of fgraph inside user-defined
> > > > fgraph_ops::retfunc().
> > > >
> > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > > > fprobe on function-graph tracer"), because before that fgraph was
> > > > only used from the function graph tracer. Fprobe allowed user to run
> > > > any callbacks from fgraph after that commit.
> > >
> > > I would actually say it's because before this commit, the return handler
> > > callers never called anything that the entry handlers didn't already call.
> > > If there was recursion, the entry handler would catch it (and the entry
> > > tells fgraph if the exit handler should be called).
> > >
> > > The difference here is with fprobes, you can have the exit handler calling
> > > functions that the entry handler does not, which exposes more cases where
> > > recursion could happen.
> >
> > so IIUC we have return kprobe multi probe on is_endbr and now we do:
> >
> > is_endbr()
> > { -> function_graph_enter_regs installs return probe
> > ...
> > } -> __ftrace_return_to_handler
> > fprobe_return
> > kprobe_multi_link_exit_handler
> > is_endbr
> > { -> function_graph_enter_regs installs return probe
> > ...
> > } -> __ftrace_return_to_handler
> > fprobe_return
> > kprobe_multi_link_exit_handler
> > is_endbr
> > { -> function_graph_enter_regs installs return probe
> > ...
> > } -> __ftrace_return_to_handler
> > ... recursion
> >
> >
> > with the fix:
> >
> > is_endbr()
> > { -> function_graph_enter_regs installs return probe
> > ...
> > } -> __ftrace_return_to_handler
> > fprobe_return
> > kprobe_multi_link_exit_handler
> > ...
> > is_endbr
> > { -> function_graph_enter_regs
> > ftrace_test_recursion_trylock fails and we do NOT install return probe
> > ...
> > }
> >
> >
> > there's is_endbr call also in kprobe_multi_link_handler, but it won't
> > trigger recursion, because function_graph_enter_regs already uses
> > ftrace_test_recursion_trylock
> >
> >
> > if above is correct then the fix looks good to me
> >
> > Acked-by: Jiri Olsa <jolsa@kernel.org>
>
> Hi Jiri,
>
> I found ftrace_test_recursion_trylock() allows one nest level, can you
> make sure it is OK?
hum, I recall being surprised by that already in the past,
I thought we fixed that somehow, will check later today
thanks,
jirka
>
> Thank you,
>
> >
> > thanks,
> > jirka
> >
> >
> > >
> > > >
> > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com>
> > > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/
> > > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer")
> > > > Cc: stable@vger.kernel.org
> > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > ---
> > > > kernel/trace/fgraph.c | 12 ++++++++++++
> > > > 1 file changed, 12 insertions(+)
> > > >
> > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > > > index 1e3b32b1e82c..08dde420635b 100644
> > > > --- a/kernel/trace/fgraph.c
> > > > +++ b/kernel/trace/fgraph.c
> > > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > > unsigned long bitmap;
> > > > unsigned long ret;
> > > > int offset;
> > > > + int bit;
> > > > int i;
> > > >
> > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset);
> > > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > > if (fregs)
> > > > ftrace_regs_set_instruction_pointer(fregs, ret);
> > > >
> > > > + bit = ftrace_test_recursion_trylock(trace.func, ret);
> > > > + /*
> > > > + * This must be succeeded because the entry handler returns before
> > > > + * modifying the return address if it is nested. Anyway, we need to
> > > > + * avoid calling user callbacks if it is nested.
> > > > + */
> > > > + if (WARN_ON_ONCE(bit < 0))
> > >
> > > I'm not so sure we need the warn on here. We should probably hook it to the
> > > recursion detection infrastructure that the function tracer has.
> > >
> > > The reason I would say not to have the warn on, is because we don't have a
> > > warn on for recursion happening at the entry handler. Because this now is
> > > exposed by fprobe allowing different routines to be called at exit than
> > > what is used in entry, it can easily be triggered.
> > >
> > > -- Steve
> > >
> > >
> > >
> > > > + goto out;
> > > > +
> > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > > trace.retval = ftrace_regs_get_return_value(fregs);
> > > > #endif
> > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > > }
> > > > }
> > > >
> > > > + ftrace_test_recursion_unlock(bit);
> > > > +out:
> > > > /*
> > > > * The ftrace_graph_return() may still access the current
> > > > * ret_stack structure, we need to make sure the update of
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-22 13:38 ` Jiri Olsa
@ 2025-09-22 14:42 ` Steven Rostedt
2025-09-22 19:45 ` Jiri Olsa
1 sibling, 0 replies; 32+ messages in thread
From: Steven Rostedt @ 2025-09-22 14:42 UTC (permalink / raw)
To: Jiri Olsa
Cc: Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Menglong Dong,
tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt,
luto, ast, andrii, linux-kernel, bpf
On Mon, 22 Sep 2025 15:38:13 +0200
Jiri Olsa <olsajiri@gmail.com> wrote:
> > I found ftrace_test_recursion_trylock() allows one nest level, can you
> > make sure it is OK?
>
> hum, I recall being surprised by that already in the past,
> I thought we fixed that somehow, will check later today
It still does allow one level of recursion, because there's still
locations that can trace in interrupt context before the preempt count
is updated to the new context.
I looked at fixing these, but IIRC, there's several locations in
assembly that do this and it started getting messy. I guess I can try
to fix this again.
-- Steve
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-22 13:38 ` Jiri Olsa
2025-09-22 14:42 ` Steven Rostedt
@ 2025-09-22 19:45 ` Jiri Olsa
1 sibling, 0 replies; 32+ messages in thread
From: Jiri Olsa @ 2025-09-22 19:45 UTC (permalink / raw)
To: Jiri Olsa
Cc: Masami Hiramatsu, Steven Rostedt, Peter Zijlstra, Steven Rostedt,
Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees,
samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf
On Mon, Sep 22, 2025 at 03:38:13PM +0200, Jiri Olsa wrote:
> On Mon, Sep 22, 2025 at 03:16:55PM +0900, Masami Hiramatsu wrote:
> > On Sat, 20 Sep 2025 09:45:15 +0200
> > Jiri Olsa <olsajiri@gmail.com> wrote:
> >
> > > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote:
> > > > On Fri, 19 Sep 2025 20:57:36 +0900
> > > > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:
> > > >
> > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > >
> > > > > function_graph_enter_regs() prevents itself from recursion by
> > > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(),
> > > > > which is called at the exit, does not prevent such recursion.
> > > > > Therefore, while it can prevent recursive calls from
> > > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls
> > > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop.
> > > > > This can lead an unexpected recursion bug reported by Menglong.
> > > > >
> > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return
> > > > > -> kprobe_multi_link_exit_handler -> is_endbr.
> > > >
> > > > So basically its if the handler for the return part calls something that it
> > > > is tracing, it can trigger the recursion?
> > > >
> > > > >
> > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the
> > > > > __ftrace_return_to_handler() after unwind the shadow stack to mark
> > > > > this section must prevent recursive call of fgraph inside user-defined
> > > > > fgraph_ops::retfunc().
> > > > >
> > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite
> > > > > fprobe on function-graph tracer"), because before that fgraph was
> > > > > only used from the function graph tracer. Fprobe allowed user to run
> > > > > any callbacks from fgraph after that commit.
> > > >
> > > > I would actually say it's because before this commit, the return handler
> > > > callers never called anything that the entry handlers didn't already call.
> > > > If there was recursion, the entry handler would catch it (and the entry
> > > > tells fgraph if the exit handler should be called).
> > > >
> > > > The difference here is with fprobes, you can have the exit handler calling
> > > > functions that the entry handler does not, which exposes more cases where
> > > > recursion could happen.
> > >
> > > so IIUC we have return kprobe multi probe on is_endbr and now we do:
> > >
> > > is_endbr()
> > > { -> function_graph_enter_regs installs return probe
> > > ...
> > > } -> __ftrace_return_to_handler
> > > fprobe_return
> > > kprobe_multi_link_exit_handler
> > > is_endbr
> > > { -> function_graph_enter_regs installs return probe
> > > ...
> > > } -> __ftrace_return_to_handler
> > > fprobe_return
> > > kprobe_multi_link_exit_handler
> > > is_endbr
> > > { -> function_graph_enter_regs installs return probe
> > > ...
> > > } -> __ftrace_return_to_handler
> > > ... recursion
> > >
> > >
> > > with the fix:
> > >
> > > is_endbr()
> > > { -> function_graph_enter_regs installs return probe
> > > ...
> > > } -> __ftrace_return_to_handler
> > > fprobe_return
> > > kprobe_multi_link_exit_handler
> > > ...
> > > is_endbr
> > > { -> function_graph_enter_regs
> > > ftrace_test_recursion_trylock fails and we do NOT install return probe
> > > ...
> > > }
> > >
> > >
> > > there's is_endbr call also in kprobe_multi_link_handler, but it won't
> > > trigger recursion, because function_graph_enter_regs already uses
> > > ftrace_test_recursion_trylock
> > >
> > >
> > > if above is correct then the fix looks good to me
> > >
> > > Acked-by: Jiri Olsa <jolsa@kernel.org>
> >
> > Hi Jiri,
> >
> > I found ftrace_test_recursion_trylock() allows one nest level, can you
> > make sure it is OK?
we have nesting check on the kprobe multi layer making sure
the bpf program will not nest into itself
kprobe_multi_link_prog_run
bpf_prog_active check
jirka
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-21 22:52 ` Steven Rostedt
@ 2025-09-24 22:58 ` Masami Hiramatsu
0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-24 22:58 UTC (permalink / raw)
To: Steven Rostedt
Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sun, 21 Sep 2025 18:52:03 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:
> On Sun, 21 Sep 2025 13:05:19 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
>
> >
> > > The reason I would say not to have the warn on, is because we don't have a
> > > warn on for recursion happening at the entry handler. Because this now is
> > > exposed by fprobe allowing different routines to be called at exit than
> > > what is used in entry, it can easily be triggered.
> >
> > At the entry, if it detect recursion, it exits soon. But here we have to
> > process stack unwind to get the return address. This recursion_trylock()
> > is to mark this is in the critical section, not detect it.
>
> Ah, because the first instance of the exit callback sets the recursion
> bit. This will cause recursed entry calls to detect the recursion bit
> and return without setting the exit handler to be called.
>
> That is, by setting the recursion bit in the exit handler, it will cause
> a recursion in entry to fail before the exit is called again.
>
> I'd like to update the comment:
>
> + bit = ftrace_test_recursion_trylock(trace.func, ret);
> + /*
> + * This must be succeeded because the entry handler returns before
> + * modifying the return address if it is nested. Anyway, we need to
> + * avoid calling user callbacks if it is nested.
> + */
> + if (WARN_ON_ONCE(bit < 0))
> + goto out;
> +
>
> to:
>
> /*
> * Setting the recursion bit here will cause the graph entry to
> * detect recursion before the exit handle will. If the ext
> * handler detects recursion, something went wrong.
> */
> if (WARN_ON_ONCE(bit < 0))
OK, let me update it.
Thank you,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop
2025-09-21 23:00 ` Steven Rostedt
@ 2025-09-24 22:59 ` Masami Hiramatsu
0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2025-09-24 22:59 UTC (permalink / raw)
To: Steven Rostedt
Cc: Menglong Dong, Peter Zijlstra, Menglong Dong, jolsa, tglx, mingo,
bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast,
andrii, linux-kernel, bpf
On Sun, 21 Sep 2025 19:00:56 -0400
Steven Rostedt <rostedt@kernel.org> wrote:
> On Sun, 21 Sep 2025 13:06:47 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> > >
> > > Hi, the logic seems right, but the warning is triggered when
> > > I try to run the bpf bench testing:
> >
> > Hmm, this is strange. Let me check why this happens.
> >
> > Thank you,
> >
> > >
> > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all
> > > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030
> > > [ 139.509036] ------------[ cut here ]------------
> > > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0
> > > [ 139.509411] Modules linked in: virtio_net
> > > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE
> > > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014
> > > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0
> > > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00
> > > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002
> > > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003
> > > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000
> > > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319
> > > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000
> > > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
> > > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000
> > > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0
> > > [ 139.512038] PKRU: 55555554
> > > [ 139.512106] Call Trace:
> > > [ 139.512177] <IRQ>
> > > [ 139.512232] ? irq_exit_rcu+0x4/0xb0
> > > [ 139.512322] return_to_handler+0x1e/0x50
> > > [ 139.512422] ? idle_cpu+0x9/0x50
> > > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80
> > > [ 139.512638] ? idle_cpu+0x9/0x50
> > > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0
> > > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10
> > > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80
> > > [ 139.513101] </IRQ>
> > > [ 139.513168] <TASK>
> > >
> > > > +
> > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL
> > > > trace.retval = ftrace_regs_get_return_value(fregs);
> > > > #endif
> > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe
> > > > }
> > > > }
> > > >
> > > > + ftrace_test_recursion_unlock(bit);
> > > > +out:
> > > > /*
> > > > * The ftrace_graph_return() may still access the current
> > > > * ret_stack structure, we need to make sure the update of
>
>
> Hmm, I wonder if this has to do with the "TRANSITION BIT". The
> ftrace_test_recursion_trylock() allows one level of recursion. This is
> to handle the case of an interrupt happening after the recursion bit is
> set and traces something before it updates the context in the preempt
> count. This would cause a false positive of the recursion test. To
> handle this case, it allows a single level of recursion.
>
> I originally was going to mention this, but I still can't see how this
> would affect it. Because if the entry were to allow one level of
> recursion, so would the exit. I still see the entry preventing the exit
> to be called. But perhaps there's an combination that I missed?
If the fgraph allows one level of recursion, fprobe should work around
functions called from fprobe's callback, such as is_endbr(), as follows:
/* probe on enter */
call target_func()
=> function_graph_enter_regs() /* 1st lock */
-> fprobe_entry()
-> user_callback()
-> is_endbr()
=> function_graph_enter_regs() /* 2nd lock */
-> fprobe_entry()
-> user_callback()
-> is_endbr()
=> function_graph_enter_regs() /* lock failed */
/* probe on exit */
call target_func()
=> function_graph_enter_regs()
-> ftrace_push_return_trace()
/* run target_func() */
=> __ftrace_return_to_handler() /* 1st lock */
-> fprobe_exit()
-> user_callback()
-> is_endbr()
=> function_graph_enter_regs() /* 2nd lock */
-> ftrace_push_return_trace()
/* unlock 2nd */
/* run is_endbr() */
=> __ftrace_return_to_handler() /* 2nd lock again */
-> fprobe_exit()
-> user_callback()
-> is_endbr()
=> function_graph_enter_regs() /* lock failed */
/* run is_endbr() */
So it can detect the recursion.
Thank you,
--
Masami Hiramatsu (Google) <mhiramat@kernel.org>
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2025-09-24 22:59 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong
2025-09-18 13:05 ` Peter Zijlstra
2025-09-18 13:32 ` Menglong Dong
2025-09-18 16:02 ` Alexei Starovoitov
2025-09-18 16:59 ` Peter Zijlstra
2025-09-18 17:53 ` Alexei Starovoitov
2025-09-19 1:13 ` Menglong Dong
2025-09-22 6:52 ` Peter Zijlstra
2025-09-22 7:13 ` menglong.dong
2025-09-22 7:19 ` Peter Zijlstra
2025-09-22 7:21 ` Menglong Dong
2025-09-22 6:36 ` Peter Zijlstra
2025-09-18 16:56 ` Peter Zijlstra
2025-09-19 12:35 ` Masami Hiramatsu
2025-09-19 8:52 ` Masami Hiramatsu
2025-09-19 8:58 ` Menglong Dong
2025-09-19 12:32 ` Masami Hiramatsu
2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google)
2025-09-19 15:27 ` Steven Rostedt
2025-09-20 7:45 ` Jiri Olsa
2025-09-22 6:16 ` Masami Hiramatsu
2025-09-22 13:38 ` Jiri Olsa
2025-09-22 14:42 ` Steven Rostedt
2025-09-22 19:45 ` Jiri Olsa
2025-09-21 4:05 ` Masami Hiramatsu
2025-09-21 22:52 ` Steven Rostedt
2025-09-24 22:58 ` Masami Hiramatsu
2025-09-20 13:39 ` Menglong Dong
2025-09-21 4:06 ` Masami Hiramatsu
2025-09-21 23:00 ` Steven Rostedt
2025-09-24 22:59 ` Masami Hiramatsu
2025-09-22 5:19 ` Masami Hiramatsu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox