* [PATCH] x86/ibt: make is_endbr() notrace
@ 2025-09-18 12:09 Menglong Dong
2025-09-18 13:05 ` Peter Zijlstra
` (2 more replies)
0 siblings, 3 replies; 32+ messages in thread
From: Menglong Dong @ 2025-09-18 12:09 UTC (permalink / raw)
To: peterz, jolsa
Cc: tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt,
luto, mhiramat, ast, andrii, linux-kernel, bpf
is_endbr() is called in __ftrace_return_to_handler -> fprobe_return ->
kprobe_multi_link_exit_handler -> is_endbr.
It is not protected by the "bpf_prog_active", so it can't be traced by
kprobe-multi, which can cause recurring and panic the kernel. Fix it by
make it notrace.
Fixes: 72e213a7ccf9 ("x86/ibt: Clean up is_endbr()")
Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
---
arch/x86/kernel/alternative.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 69fb818df2ee..f67a31c77c89 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { }
#ifdef CONFIG_X86_KERNEL_IBT
-__noendbr bool is_endbr(u32 *val)
+__noendbr notrace bool is_endbr(u32 *val)
{
u32 endbr;
--
2.51.0
^ permalink raw reply related [flat|nested] 32+ messages in thread* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong @ 2025-09-18 13:05 ` Peter Zijlstra 2025-09-18 13:32 ` Menglong Dong 2025-09-19 8:52 ` Masami Hiramatsu 2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google) 2 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2025-09-18 13:05 UTC (permalink / raw) To: Menglong Dong Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> is_endbr. > > It is not protected by the "bpf_prog_active", so it can't be traced by > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > make it notrace. This is very much a riddle wrapped in an enigma. Notably kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that cryptic next line sufficient to explain why its a problem. I suspect the is_endbr() you did mean is the one in arch_ftrace_get_symaddr(), but who knows. Also, depending on compiler insanity, it is possible the thing out-of-lines things like __is_endbr(), getting you yet another __fentry__ site. Please try again. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 13:05 ` Peter Zijlstra @ 2025-09-18 13:32 ` Menglong Dong 2025-09-18 16:02 ` Alexei Starovoitov 2025-09-18 16:56 ` Peter Zijlstra 0 siblings, 2 replies; 32+ messages in thread From: Menglong Dong @ 2025-09-18 13:32 UTC (permalink / raw) To: Peter Zijlstra Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> is_endbr. > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > make it notrace. > > This is very much a riddle wrapped in an enigma. Notably > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > cryptic next line sufficient to explain why its a problem. > > I suspect the is_endbr() you did mean is the one in > arch_ftrace_get_symaddr(), but who knows. Yeah, I mean kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> arch_ftrace_get_symaddr -> is_endbr actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > Also, depending on compiler insanity, it is possible the thing > out-of-lines things like __is_endbr(), getting you yet another > __fentry__ site. The panic happens when I run the bpf bench testing: ./bench kretprobe-multi-all And skip the "is_endbr" fix this problem. __is_endbr should be marked with "notrace" too. I slacked off on it, as it didn't happen in my case :/ > > Please try again. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 13:32 ` Menglong Dong @ 2025-09-18 16:02 ` Alexei Starovoitov 2025-09-18 16:59 ` Peter Zijlstra 2025-09-18 16:56 ` Peter Zijlstra 1 sibling, 1 reply; 32+ messages in thread From: Alexei Starovoitov @ 2025-09-18 16:02 UTC (permalink / raw) To: Menglong Dong Cc: Peter Zijlstra, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Thu, Sep 18, 2025 at 6:32 AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 9:05 PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > This is very much a riddle wrapped in an enigma. Notably > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > cryptic next line sufficient to explain why its a problem. > > > > I suspect the is_endbr() you did mean is the one in > > arch_ftrace_get_symaddr(), but who knows. > > Yeah, I mean > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. All this makes sense to me. __noendbr bool is_endbr(u32 *val) needs "notrace", since it's in alternative.c and won't get inlined (unless LTO+luck). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 16:02 ` Alexei Starovoitov @ 2025-09-18 16:59 ` Peter Zijlstra 2025-09-18 17:53 ` Alexei Starovoitov 0 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2025-09-18 16:59 UTC (permalink / raw) To: Alexei Starovoitov Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > make it notrace. > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > cryptic next line sufficient to explain why its a problem. > > > > > > I suspect the is_endbr() you did mean is the one in > > > arch_ftrace_get_symaddr(), but who knows. > > > > Yeah, I mean > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > All this makes sense to me. As written down, I'm still clueless. > __noendbr bool is_endbr(u32 *val) needs "notrace", > since it's in alternative.c and won't get inlined (unless LTO+luck). notrace don't help with kprobes in general, only with __fentry__ sites. I've still not clue why there is a panic, or why notrace would be sufficient. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 16:59 ` Peter Zijlstra @ 2025-09-18 17:53 ` Alexei Starovoitov 2025-09-19 1:13 ` Menglong Dong 2025-09-22 6:36 ` Peter Zijlstra 0 siblings, 2 replies; 32+ messages in thread From: Alexei Starovoitov @ 2025-09-18 17:53 UTC (permalink / raw) To: Peter Zijlstra Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote: > > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > > make it notrace. > > > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > > cryptic next line sufficient to explain why its a problem. > > > > > > > > I suspect the is_endbr() you did mean is the one in > > > > arch_ftrace_get_symaddr(), but who knows. > > > > > > Yeah, I mean > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > > arch_ftrace_get_symaddr -> is_endbr > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > All this makes sense to me. > > As written down, I'm still clueless. > > > __noendbr bool is_endbr(u32 *val) needs "notrace", > > since it's in alternative.c and won't get inlined (unless LTO+luck). > > notrace don't help with kprobes in general, only with __fentry__ sites. Are you sure ? Last time I checked "notrace" prevents kprobing anywhere in that function. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 17:53 ` Alexei Starovoitov @ 2025-09-19 1:13 ` Menglong Dong 2025-09-22 6:52 ` Peter Zijlstra 2025-09-22 6:36 ` Peter Zijlstra 1 sibling, 1 reply; 32+ messages in thread From: Menglong Dong @ 2025-09-19 1:13 UTC (permalink / raw) To: Peter Zijlstra, Alexei Starovoitov Cc: Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Fri, Sep 19, 2025 at 1:54 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > > On Thu, Sep 18, 2025 at 9:59 AM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 09:02:31AM -0700, Alexei Starovoitov wrote: > > > On Thu, Sep 18, 2025 at 6:32???AM Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > > > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > > > make it notrace. > > > > > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > > > cryptic next line sufficient to explain why its a problem. > > > > > > > > > > I suspect the is_endbr() you did mean is the one in > > > > > arch_ftrace_get_symaddr(), but who knows. > > > > > > > > Yeah, I mean > > > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > > > arch_ftrace_get_symaddr -> is_endbr > > > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > > All this makes sense to me. > > > > As written down, I'm still clueless. Ok, let me describe the problem in deetail. First of all, it has nothing to do with kprobe. The bpf program of type kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all about the ftrace, which means __fentry__. Second, let me explain the recur detection of the kprobe-multi. Let's take the is_endbr() for example. When it is hooked by the bpf program of type kretprobe-multi, following calling chain will happen: is_endbr -> __ftrace_return_to_handler -> fprobe_return -> kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> arch_ftrace_get_symaddr -> is_endbr Look, is_endbr() is called again during the ftrace handler, so it will trigger the ftrace handler(__ftrace_return_to_handler) again, which causes recurrence. Such recurrence can be detected. In kprobe_multi_link_prog_run(), the percpu various "bpf_prog_active" will be increased by 1 before we run the bpf progs, and decrease by 1 after the bpf progs finish. If the kprobe_multi_link_prog_run() is triggered again during bpf progs run, it will check if bpf_prog_active is zero, and return directly if it is not. Therefore, recurrence can't happen within the "bpf_prog_active" protection. However, the calling to is_endbr() is not within that scope, which makes the recurrence happen. Hope I described it clearly 😉 Thanks! Menglong Dong > > > > > __noendbr bool is_endbr(u32 *val) needs "notrace", > > > since it's in alternative.c and won't get inlined (unless LTO+luck). > > > > notrace don't help with kprobes in general, only with __fentry__ sites. > > Are you sure ? Last time I checked "notrace" prevents kprobing > anywhere in that function. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-19 1:13 ` Menglong Dong @ 2025-09-22 6:52 ` Peter Zijlstra 2025-09-22 7:13 ` menglong.dong 0 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2025-09-22 6:52 UTC (permalink / raw) To: Menglong Dong Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote: > Ok, let me describe the problem in deetail. > > First of all, it has nothing to do with kprobe. The bpf program of type > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all > about the ftrace, which means __fentry__. Well, that's not confusing at all. Something called kprobe-multi not being related to kprobes :-( > Second, let me explain the recur detection of the kprobe-multi. Let's > take the is_endbr() for example. When it is hooked by the bpf program > of type kretprobe-multi, following calling chain will happen: > > is_endbr -> __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > > Look, is_endbr() is called again during the ftrace handler, so it will > trigger the ftrace handler(__ftrace_return_to_handler) again, which > causes recurrence. Right. > Such recurrence can be detected. In kprobe_multi_link_prog_run(), > the percpu various "bpf_prog_active" will be increased by 1 before we > run the bpf progs, and decrease by 1 after the bpf progs finish. If the > kprobe_multi_link_prog_run() is triggered again during bpf progs run, > it will check if bpf_prog_active is zero, and return directly if it is not. > Therefore, recurrence can't happen within the "bpf_prog_active" protection. As I think Masami already said, the problem is the layer. You're trying to fix an ftrace problem at the bpf layer. > However, the calling to is_endbr() is not within that scope, which makes > the recurrence happen. Sorta, I'm still sketchy on the whole kprobe-multi thing. Anyway, I don't mind making is_endbr() invisible to tracing, that might just have security benefits too. But I think first the ftrace folks need to figure out how to best kill that recursion, because I don't think is_endbr is particularly special here. It is just one more function that can emit a __fentry__ site. Anyway, something like the below would do: Note that without making __is_endbr() __always_inline, you run the risk of the compiler being retarded (they often are in the face of KASAN/UBSAN like) and deciding to out-of-line that function, resulting in yet another __fentry__ site. An added advantage of noinstr is that it is validated by objtool to never call to !noinstr code. As such, you can be sure there is no instrumentation in it. (the below hasn't been near a compiler) --- arch/x86/include/asm/ibt.h | 2 +- arch/x86/kernel/alternative.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h index 5e45d6424722..54937a527042 100644 --- a/arch/x86/include/asm/ibt.h +++ b/arch/x86/include/asm/ibt.h @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void) return 0xd6401f0f; /* nopl -42(%rax) */ } -static inline bool __is_endbr(u32 val) +static __always_inline bool __is_endbr(u32 val) { if (val == gen_endbr_poison()) return true; diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c index 69fb818df2ee..f791e7abd466 100644 --- a/arch/x86/kernel/alternative.c +++ b/arch/x86/kernel/alternative.c @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { } #ifdef CONFIG_X86_KERNEL_IBT -__noendbr bool is_endbr(u32 *val) +__noendbr noinstr bool is_endbr(u32 *val) { u32 endbr; ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-22 6:52 ` Peter Zijlstra @ 2025-09-22 7:13 ` menglong.dong 2025-09-22 7:19 ` Peter Zijlstra 0 siblings, 1 reply; 32+ messages in thread From: menglong.dong @ 2025-09-22 7:13 UTC (permalink / raw) To: Peter Zijlstra Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On 2025/9/22 14:52 Peter Zijlstra <peterz@infradead.org> write: > On Fri, Sep 19, 2025 at 09:13:15AM +0800, Menglong Dong wrote: > > > Ok, let me describe the problem in deetail. > > > > First of all, it has nothing to do with kprobe. The bpf program of type > > kprobe-multi based on fprobe, and fprobe base on fgraph. So it's all > > about the ftrace, which means __fentry__. > > Well, that's not confusing at all. Something called kprobe-multi not > being related to kprobes :-( > > > Second, let me explain the recur detection of the kprobe-multi. Let's > > take the is_endbr() for example. When it is hooked by the bpf program > > of type kretprobe-multi, following calling chain will happen: > > > > is_endbr -> __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > > > Look, is_endbr() is called again during the ftrace handler, so it will > > trigger the ftrace handler(__ftrace_return_to_handler) again, which > > causes recurrence. > > Right. > > > Such recurrence can be detected. In kprobe_multi_link_prog_run(), > > the percpu various "bpf_prog_active" will be increased by 1 before we > > run the bpf progs, and decrease by 1 after the bpf progs finish. If the > > kprobe_multi_link_prog_run() is triggered again during bpf progs run, > > it will check if bpf_prog_active is zero, and return directly if it is not. > > Therefore, recurrence can't happen within the "bpf_prog_active" protection. > > As I think Masami already said, the problem is the layer. You're trying > to fix an ftrace problem at the bpf layer. Yeah, I see. And Masami has already posted a series for this problem in: https://lore.kernel.org/bpf/175852291163.307379.14414635977719513326.stgit@devnote2/ > > > However, the calling to is_endbr() is not within that scope, which makes > > the recurrence happen. > > Sorta, I'm still sketchy on the whole kprobe-multi thing. > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > just have security benefits too. But I think first the ftrace folks need > to figure out how to best kill that recursion, because I don't think > is_endbr is particularly special here. So, does this patch seem useful after all? OK, I'll send a V2 base on your following suggestion. Thanks! Menglong Dong > > It is just one more function that can emit a __fentry__ site. > > Anyway, something like the below would do: > > Note that without making __is_endbr() __always_inline, you run the risk > of the compiler being retarded (they often are in the face of > KASAN/UBSAN like) and deciding to out-of-line that function, resulting > in yet another __fentry__ site. > > An added advantage of noinstr is that it is validated by objtool to > never call to !noinstr code. As such, you can be sure there is no > instrumentation in it. > > (the below hasn't been near a compiler) > > --- > arch/x86/include/asm/ibt.h | 2 +- > arch/x86/kernel/alternative.c | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/x86/include/asm/ibt.h b/arch/x86/include/asm/ibt.h > index 5e45d6424722..54937a527042 100644 > --- a/arch/x86/include/asm/ibt.h > +++ b/arch/x86/include/asm/ibt.h > @@ -65,7 +65,7 @@ static __always_inline __attribute_const__ u32 gen_endbr_poison(void) > return 0xd6401f0f; /* nopl -42(%rax) */ > } > > -static inline bool __is_endbr(u32 val) > +static __always_inline bool __is_endbr(u32 val) > { > if (val == gen_endbr_poison()) > return true; > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c > index 69fb818df2ee..f791e7abd466 100644 > --- a/arch/x86/kernel/alternative.c > +++ b/arch/x86/kernel/alternative.c > @@ -1108,7 +1108,7 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end) { } > > #ifdef CONFIG_X86_KERNEL_IBT > > -__noendbr bool is_endbr(u32 *val) > +__noendbr noinstr bool is_endbr(u32 *val) > { > u32 endbr; > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-22 7:13 ` menglong.dong @ 2025-09-22 7:19 ` Peter Zijlstra 2025-09-22 7:21 ` Menglong Dong 0 siblings, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2025-09-22 7:19 UTC (permalink / raw) To: menglong.dong Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote: > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > > just have security benefits too. But I think first the ftrace folks need > > to figure out how to best kill that recursion, because I don't think > > is_endbr is particularly special here. > > So, does this patch seem useful after all? The use lies in making it harder to find/manipulate endbr things. > OK, I'll send a V2 base on your following suggestion. Hold off until Masami/Steve have fixed the ftrace recursion issue. After that we can do this. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-22 7:19 ` Peter Zijlstra @ 2025-09-22 7:21 ` Menglong Dong 0 siblings, 0 replies; 32+ messages in thread From: Menglong Dong @ 2025-09-22 7:21 UTC (permalink / raw) To: Peter Zijlstra Cc: Alexei Starovoitov, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On 2025/9/22 15:19 Peter Zijlstra <peterz@infradead.org> write: > On Mon, Sep 22, 2025 at 03:13:38PM +0800, menglong.dong@linux.dev wrote: > > > > Anyway, I don't mind making is_endbr() invisible to tracing, that might > > > just have security benefits too. But I think first the ftrace folks need > > > to figure out how to best kill that recursion, because I don't think > > > is_endbr is particularly special here. > > > > So, does this patch seem useful after all? > > The use lies in making it harder to find/manipulate endbr things. > > > OK, I'll send a V2 base on your following suggestion. > > Hold off until Masami/Steve have fixed the ftrace recursion issue. After > that we can do this. OK! > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 17:53 ` Alexei Starovoitov 2025-09-19 1:13 ` Menglong Dong @ 2025-09-22 6:36 ` Peter Zijlstra 1 sibling, 0 replies; 32+ messages in thread From: Peter Zijlstra @ 2025-09-22 6:36 UTC (permalink / raw) To: Alexei Starovoitov Cc: Menglong Dong, Jiri Olsa, Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen, X86 ML, H. Peter Anvin, Kees Cook, Sami Tolvanen, Mike Rapoport, Andy Lutomirski, Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, LKML, bpf On Thu, Sep 18, 2025 at 10:53:51AM -0700, Alexei Starovoitov wrote: > > notrace don't help with kprobes in general, only with __fentry__ sites. > > Are you sure ? Last time I checked "notrace" prevents kprobing > anywhere in that function. #define notrace __attribute__((__no_instrument_function__)) AFAICT that only inhibits the compiler from generating __fentry__ calls. There is NOKPROBE_SYMBOL() to explicitly exclude kprobes from a symbol. And we have noinstr, which along with a gazillion code-gen attributes puts things in a .noinstr.text section and that section is also excluded from kprobe placement. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 13:32 ` Menglong Dong 2025-09-18 16:02 ` Alexei Starovoitov @ 2025-09-18 16:56 ` Peter Zijlstra 2025-09-19 12:35 ` Masami Hiramatsu 1 sibling, 1 reply; 32+ messages in thread From: Peter Zijlstra @ 2025-09-18 16:56 UTC (permalink / raw) To: Menglong Dong Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote: > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > This is very much a riddle wrapped in an enigma. Notably > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > cryptic next line sufficient to explain why its a problem. > > > > I suspect the is_endbr() you did mean is the one in > > arch_ftrace_get_symaddr(), but who knows. > > Yeah, I mean > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > arch_ftrace_get_symaddr -> is_endbr > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > Also, depending on compiler insanity, it is possible the thing > > out-of-lines things like __is_endbr(), getting you yet another > > __fentry__ site. > > The panic happens when I run the bpf bench testing: > ./bench kretprobe-multi-all > > And skip the "is_endbr" fix this problem. But why does it panic? Supposedly you've done the analysis; but then forgot to write it down? Why is kprobe_multi_link_exit_handler() special; doesn't the issue also exist with kprobe_multi_link_handler() ? If so, removing __fentry__ isn't going to help much, you can just stick an actual kprobe in is_endbr(), right? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 16:56 ` Peter Zijlstra @ 2025-09-19 12:35 ` Masami Hiramatsu 0 siblings, 0 replies; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-19 12:35 UTC (permalink / raw) To: Peter Zijlstra Cc: Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On Thu, 18 Sep 2025 18:56:56 +0200 Peter Zijlstra <peterz@infradead.org> wrote: > On Thu, Sep 18, 2025 at 09:32:27PM +0800, Menglong Dong wrote: > > On Thu, Sep 18, 2025 at 9:05???PM Peter Zijlstra <peterz@infradead.org> wrote: > > > > > > On Thu, Sep 18, 2025 at 08:09:39PM +0800, Menglong Dong wrote: > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > > make it notrace. > > > > > > This is very much a riddle wrapped in an enigma. Notably > > > kprobe_multi_link_exit_handler() does not call is_endbr(). Nor is that > > > cryptic next line sufficient to explain why its a problem. > > > > > > I suspect the is_endbr() you did mean is the one in > > > arch_ftrace_get_symaddr(), but who knows. > > > > Yeah, I mean > > kprobe_multi_link_exit_handler -> ftrace_get_entry_ip -> > > arch_ftrace_get_symaddr -> is_endbr > > actually. And CONFIG_X86_KERNEL_IBT is enabled of course. > > > > > > > > Also, depending on compiler insanity, it is possible the thing > > > out-of-lines things like __is_endbr(), getting you yet another > > > __fentry__ site. > > > > The panic happens when I run the bpf bench testing: > > ./bench kretprobe-multi-all > > > > And skip the "is_endbr" fix this problem. > > But why does it panic? Supposedly you've done the analysis; but then > forgot to write it down? Yeah, that is an fprobe's bug. It should not panic. I sent a fix. https://lore.kernel.org/all/175828305637.117978.4183947592750468265.stgit@devnote2/ Thank you, > > Why is kprobe_multi_link_exit_handler() special; doesn't the issue also > exist with kprobe_multi_link_handler() ? If so, removing __fentry__ > isn't going to help much, you can just stick an actual kprobe in > is_endbr(), right? > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong 2025-09-18 13:05 ` Peter Zijlstra @ 2025-09-19 8:52 ` Masami Hiramatsu 2025-09-19 8:58 ` Menglong Dong 2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google) 2 siblings, 1 reply; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-19 8:52 UTC (permalink / raw) To: Menglong Dong Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On Thu, 18 Sep 2025 20:09:39 +0800 Menglong Dong <menglong8.dong@gmail.com> wrote: > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > kprobe_multi_link_exit_handler -> is_endbr. > > It is not protected by the "bpf_prog_active", so it can't be traced by > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > make it notrace. Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check recursion, but fgraph only detects the recursion in the entry handler. Thus it happens in the exit handler, fprobe does not check the recursion. But since the fprobe provides users to register callback at exit, it should check the recursion in return path too. Thanks, -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-19 8:52 ` Masami Hiramatsu @ 2025-09-19 8:58 ` Menglong Dong 2025-09-19 12:32 ` Masami Hiramatsu 0 siblings, 1 reply; 32+ messages in thread From: Menglong Dong @ 2025-09-19 8:58 UTC (permalink / raw) To: Masami Hiramatsu Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write: > On Thu, 18 Sep 2025 20:09:39 +0800 > Menglong Dong <menglong8.dong@gmail.com> wrote: > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > kprobe_multi_link_exit_handler -> is_endbr. > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > make it notrace. > > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check > recursion, but fgraph only detects the recursion in the entry handler. > Thus it happens in the exit handler, fprobe does not check the recursion. > > But since the fprobe provides users to register callback at exit, it > should check the recursion in return path too. That's a good idea to provide recursion checking for the exit handler, which is able to solve this problem too. If so, we don't need to check the recursion on the kprobe-multi anymore. Do we? Thanks! Menglong Dong > > Thanks, > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org> > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] x86/ibt: make is_endbr() notrace 2025-09-19 8:58 ` Menglong Dong @ 2025-09-19 12:32 ` Masami Hiramatsu 0 siblings, 0 replies; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-19 12:32 UTC (permalink / raw) To: Menglong Dong Cc: peterz, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Fri, 19 Sep 2025 16:58:57 +0800 Menglong Dong <menglong8.dong@gmail.com> wrote: > On 2025/9/19 16:52 Masami Hiramatsu <mhiramat@kernel.org> write: > > On Thu, 18 Sep 2025 20:09:39 +0800 > > Menglong Dong <menglong8.dong@gmail.com> wrote: > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> > > > kprobe_multi_link_exit_handler -> is_endbr. > > > > > > It is not protected by the "bpf_prog_active", so it can't be traced by > > > kprobe-multi, which can cause recurring and panic the kernel. Fix it by > > > make it notrace. > > > > Ah, OK. This is fprobe's issue. fprobe depends on fgraph to check > > recursion, but fgraph only detects the recursion in the entry handler. > > Thus it happens in the exit handler, fprobe does not check the recursion. > > > > But since the fprobe provides users to register callback at exit, it > > should check the recursion in return path too. > > That's a good idea to provide recursion checking for the exit handler, > which is able to solve this problem too. > > If so, we don't need to check the recursion on the kprobe-multi anymore. > Do we? Yes, but *if possible*, please avoid calling such functions from fprobe callbacks. This just prevents kernel crash from such recursion, but that means it is not possible to trace such functions. Thank you, > > Thanks! > Menglong Dong > > > > > Thanks, > > > > -- > > Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong 2025-09-18 13:05 ` Peter Zijlstra 2025-09-19 8:52 ` Masami Hiramatsu @ 2025-09-19 11:57 ` Masami Hiramatsu (Google) 2025-09-19 15:27 ` Steven Rostedt 2025-09-20 13:39 ` Menglong Dong 2 siblings, 2 replies; 32+ messages in thread From: Masami Hiramatsu (Google) @ 2025-09-19 11:57 UTC (permalink / raw) To: Peter Zijlstra, Steven Rostedt, Menglong Dong Cc: jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf From: Masami Hiramatsu (Google) <mhiramat@kernel.org> function_graph_enter_regs() prevents itself from recursion by ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), which is called at the exit, does not prevent such recursion. Therefore, while it can prevent recursive calls from fgraph_ops::entryfunc(), it is not able to prevent recursive calls to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. This can lead an unexpected recursion bug reported by Menglong. is_endbr() is called in __ftrace_return_to_handler -> fprobe_return -> kprobe_multi_link_exit_handler -> is_endbr. To fix this issue, acquire ftrace_test_recursion_trylock() in the __ftrace_return_to_handler() after unwind the shadow stack to mark this section must prevent recursive call of fgraph inside user-defined fgraph_ops::retfunc(). This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer"), because before that fgraph was only used from the function graph tracer. Fprobe allowed user to run any callbacks from fgraph after that commit. Reported-by: Menglong Dong <menglong8.dong@gmail.com> Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") Cc: stable@vger.kernel.org Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> --- kernel/trace/fgraph.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c index 1e3b32b1e82c..08dde420635b 100644 --- a/kernel/trace/fgraph.c +++ b/kernel/trace/fgraph.c @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe unsigned long bitmap; unsigned long ret; int offset; + int bit; int i; ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe if (fregs) ftrace_regs_set_instruction_pointer(fregs, ret); + bit = ftrace_test_recursion_trylock(trace.func, ret); + /* + * This must be succeeded because the entry handler returns before + * modifying the return address if it is nested. Anyway, we need to + * avoid calling user callbacks if it is nested. + */ + if (WARN_ON_ONCE(bit < 0)) + goto out; + #ifdef CONFIG_FUNCTION_GRAPH_RETVAL trace.retval = ftrace_regs_get_return_value(fregs); #endif @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe } } + ftrace_test_recursion_unlock(bit); +out: /* * The ftrace_graph_return() may still access the current * ret_stack structure, we need to make sure the update of ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google) @ 2025-09-19 15:27 ` Steven Rostedt 2025-09-20 7:45 ` Jiri Olsa 2025-09-21 4:05 ` Masami Hiramatsu 2025-09-20 13:39 ` Menglong Dong 1 sibling, 2 replies; 32+ messages in thread From: Steven Rostedt @ 2025-09-19 15:27 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Fri, 19 Sep 2025 20:57:36 +0900 "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > function_graph_enter_regs() prevents itself from recursion by > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > which is called at the exit, does not prevent such recursion. > Therefore, while it can prevent recursive calls from > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > This can lead an unexpected recursion bug reported by Menglong. > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > -> kprobe_multi_link_exit_handler -> is_endbr. So basically its if the handler for the return part calls something that it is tracing, it can trigger the recursion? > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > __ftrace_return_to_handler() after unwind the shadow stack to mark > this section must prevent recursive call of fgraph inside user-defined > fgraph_ops::retfunc(). > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > fprobe on function-graph tracer"), because before that fgraph was > only used from the function graph tracer. Fprobe allowed user to run > any callbacks from fgraph after that commit. I would actually say it's because before this commit, the return handler callers never called anything that the entry handlers didn't already call. If there was recursion, the entry handler would catch it (and the entry tells fgraph if the exit handler should be called). The difference here is with fprobes, you can have the exit handler calling functions that the entry handler does not, which exposes more cases where recursion could happen. > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > Cc: stable@vger.kernel.org > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > kernel/trace/fgraph.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > index 1e3b32b1e82c..08dde420635b 100644 > --- a/kernel/trace/fgraph.c > +++ b/kernel/trace/fgraph.c > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > unsigned long bitmap; > unsigned long ret; > int offset; > + int bit; > int i; > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > if (fregs) > ftrace_regs_set_instruction_pointer(fregs, ret); > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > + /* > + * This must be succeeded because the entry handler returns before > + * modifying the return address if it is nested. Anyway, we need to > + * avoid calling user callbacks if it is nested. > + */ > + if (WARN_ON_ONCE(bit < 0)) I'm not so sure we need the warn on here. We should probably hook it to the recursion detection infrastructure that the function tracer has. The reason I would say not to have the warn on, is because we don't have a warn on for recursion happening at the entry handler. Because this now is exposed by fprobe allowing different routines to be called at exit than what is used in entry, it can easily be triggered. -- Steve > + goto out; > + > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > trace.retval = ftrace_regs_get_return_value(fregs); > #endif > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > } > } > > + ftrace_test_recursion_unlock(bit); > +out: > /* > * The ftrace_graph_return() may still access the current > * ret_stack structure, we need to make sure the update of ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-19 15:27 ` Steven Rostedt @ 2025-09-20 7:45 ` Jiri Olsa 2025-09-22 6:16 ` Masami Hiramatsu 2025-09-21 4:05 ` Masami Hiramatsu 1 sibling, 1 reply; 32+ messages in thread From: Jiri Olsa @ 2025-09-20 7:45 UTC (permalink / raw) To: Steven Rostedt Cc: Masami Hiramatsu (Google), Peter Zijlstra, Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote: > On Fri, 19 Sep 2025 20:57:36 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > function_graph_enter_regs() prevents itself from recursion by > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > which is called at the exit, does not prevent such recursion. > > Therefore, while it can prevent recursive calls from > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > This can lead an unexpected recursion bug reported by Menglong. > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > -> kprobe_multi_link_exit_handler -> is_endbr. > > So basically its if the handler for the return part calls something that it > is tracing, it can trigger the recursion? > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > this section must prevent recursive call of fgraph inside user-defined > > fgraph_ops::retfunc(). > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > fprobe on function-graph tracer"), because before that fgraph was > > only used from the function graph tracer. Fprobe allowed user to run > > any callbacks from fgraph after that commit. > > I would actually say it's because before this commit, the return handler > callers never called anything that the entry handlers didn't already call. > If there was recursion, the entry handler would catch it (and the entry > tells fgraph if the exit handler should be called). > > The difference here is with fprobes, you can have the exit handler calling > functions that the entry handler does not, which exposes more cases where > recursion could happen. so IIUC we have return kprobe multi probe on is_endbr and now we do: is_endbr() { -> function_graph_enter_regs installs return probe ... } -> __ftrace_return_to_handler fprobe_return kprobe_multi_link_exit_handler is_endbr { -> function_graph_enter_regs installs return probe ... } -> __ftrace_return_to_handler fprobe_return kprobe_multi_link_exit_handler is_endbr { -> function_graph_enter_regs installs return probe ... } -> __ftrace_return_to_handler ... recursion with the fix: is_endbr() { -> function_graph_enter_regs installs return probe ... } -> __ftrace_return_to_handler fprobe_return kprobe_multi_link_exit_handler ... is_endbr { -> function_graph_enter_regs ftrace_test_recursion_trylock fails and we do NOT install return probe ... } there's is_endbr call also in kprobe_multi_link_handler, but it won't trigger recursion, because function_graph_enter_regs already uses ftrace_test_recursion_trylock if above is correct then the fix looks good to me Acked-by: Jiri Olsa <jolsa@kernel.org> thanks, jirka > > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > Cc: stable@vger.kernel.org > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > kernel/trace/fgraph.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > index 1e3b32b1e82c..08dde420635b 100644 > > --- a/kernel/trace/fgraph.c > > +++ b/kernel/trace/fgraph.c > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > unsigned long bitmap; > > unsigned long ret; > > int offset; > > + int bit; > > int i; > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > if (fregs) > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > + /* > > + * This must be succeeded because the entry handler returns before > > + * modifying the return address if it is nested. Anyway, we need to > > + * avoid calling user callbacks if it is nested. > > + */ > > + if (WARN_ON_ONCE(bit < 0)) > > I'm not so sure we need the warn on here. We should probably hook it to the > recursion detection infrastructure that the function tracer has. > > The reason I would say not to have the warn on, is because we don't have a > warn on for recursion happening at the entry handler. Because this now is > exposed by fprobe allowing different routines to be called at exit than > what is used in entry, it can easily be triggered. > > -- Steve > > > > > + goto out; > > + > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = ftrace_regs_get_return_value(fregs); > > #endif > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > } > > } > > > > + ftrace_test_recursion_unlock(bit); > > +out: > > /* > > * The ftrace_graph_return() may still access the current > > * ret_stack structure, we need to make sure the update of > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-20 7:45 ` Jiri Olsa @ 2025-09-22 6:16 ` Masami Hiramatsu 2025-09-22 13:38 ` Jiri Olsa 0 siblings, 1 reply; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-22 6:16 UTC (permalink / raw) To: Jiri Olsa Cc: Steven Rostedt, Masami Hiramatsu (Google), Peter Zijlstra, Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sat, 20 Sep 2025 09:45:15 +0200 Jiri Olsa <olsajiri@gmail.com> wrote: > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote: > > On Fri, 19 Sep 2025 20:57:36 +0900 > > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > function_graph_enter_regs() prevents itself from recursion by > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > > which is called at the exit, does not prevent such recursion. > > > Therefore, while it can prevent recursive calls from > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > > This can lead an unexpected recursion bug reported by Menglong. > > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > > -> kprobe_multi_link_exit_handler -> is_endbr. > > > > So basically its if the handler for the return part calls something that it > > is tracing, it can trigger the recursion? > > > > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > > this section must prevent recursive call of fgraph inside user-defined > > > fgraph_ops::retfunc(). > > > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > > fprobe on function-graph tracer"), because before that fgraph was > > > only used from the function graph tracer. Fprobe allowed user to run > > > any callbacks from fgraph after that commit. > > > > I would actually say it's because before this commit, the return handler > > callers never called anything that the entry handlers didn't already call. > > If there was recursion, the entry handler would catch it (and the entry > > tells fgraph if the exit handler should be called). > > > > The difference here is with fprobes, you can have the exit handler calling > > functions that the entry handler does not, which exposes more cases where > > recursion could happen. > > so IIUC we have return kprobe multi probe on is_endbr and now we do: > > is_endbr() > { -> function_graph_enter_regs installs return probe > ... > } -> __ftrace_return_to_handler > fprobe_return > kprobe_multi_link_exit_handler > is_endbr > { -> function_graph_enter_regs installs return probe > ... > } -> __ftrace_return_to_handler > fprobe_return > kprobe_multi_link_exit_handler > is_endbr > { -> function_graph_enter_regs installs return probe > ... > } -> __ftrace_return_to_handler > ... recursion > > > with the fix: > > is_endbr() > { -> function_graph_enter_regs installs return probe > ... > } -> __ftrace_return_to_handler > fprobe_return > kprobe_multi_link_exit_handler > ... > is_endbr > { -> function_graph_enter_regs > ftrace_test_recursion_trylock fails and we do NOT install return probe > ... > } > > > there's is_endbr call also in kprobe_multi_link_handler, but it won't > trigger recursion, because function_graph_enter_regs already uses > ftrace_test_recursion_trylock > > > if above is correct then the fix looks good to me > > Acked-by: Jiri Olsa <jolsa@kernel.org> Hi Jiri, I found ftrace_test_recursion_trylock() allows one nest level, can you make sure it is OK? Thank you, > > thanks, > jirka > > > > > > > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > --- > > > kernel/trace/fgraph.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > > index 1e3b32b1e82c..08dde420635b 100644 > > > --- a/kernel/trace/fgraph.c > > > +++ b/kernel/trace/fgraph.c > > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > unsigned long bitmap; > > > unsigned long ret; > > > int offset; > > > + int bit; > > > int i; > > > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > if (fregs) > > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > > + /* > > > + * This must be succeeded because the entry handler returns before > > > + * modifying the return address if it is nested. Anyway, we need to > > > + * avoid calling user callbacks if it is nested. > > > + */ > > > + if (WARN_ON_ONCE(bit < 0)) > > > > I'm not so sure we need the warn on here. We should probably hook it to the > > recursion detection infrastructure that the function tracer has. > > > > The reason I would say not to have the warn on, is because we don't have a > > warn on for recursion happening at the entry handler. Because this now is > > exposed by fprobe allowing different routines to be called at exit than > > what is used in entry, it can easily be triggered. > > > > -- Steve > > > > > > > > > + goto out; > > > + > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > > trace.retval = ftrace_regs_get_return_value(fregs); > > > #endif > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > } > > > } > > > > > > + ftrace_test_recursion_unlock(bit); > > > +out: > > > /* > > > * The ftrace_graph_return() may still access the current > > > * ret_stack structure, we need to make sure the update of > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-22 6:16 ` Masami Hiramatsu @ 2025-09-22 13:38 ` Jiri Olsa 2025-09-22 14:42 ` Steven Rostedt 2025-09-22 19:45 ` Jiri Olsa 0 siblings, 2 replies; 32+ messages in thread From: Jiri Olsa @ 2025-09-22 13:38 UTC (permalink / raw) To: Masami Hiramatsu Cc: Jiri Olsa, Steven Rostedt, Peter Zijlstra, Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Mon, Sep 22, 2025 at 03:16:55PM +0900, Masami Hiramatsu wrote: > On Sat, 20 Sep 2025 09:45:15 +0200 > Jiri Olsa <olsajiri@gmail.com> wrote: > > > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote: > > > On Fri, 19 Sep 2025 20:57:36 +0900 > > > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > function_graph_enter_regs() prevents itself from recursion by > > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > > > which is called at the exit, does not prevent such recursion. > > > > Therefore, while it can prevent recursive calls from > > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > > > This can lead an unexpected recursion bug reported by Menglong. > > > > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > > > -> kprobe_multi_link_exit_handler -> is_endbr. > > > > > > So basically its if the handler for the return part calls something that it > > > is tracing, it can trigger the recursion? > > > > > > > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > > > this section must prevent recursive call of fgraph inside user-defined > > > > fgraph_ops::retfunc(). > > > > > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > > > fprobe on function-graph tracer"), because before that fgraph was > > > > only used from the function graph tracer. Fprobe allowed user to run > > > > any callbacks from fgraph after that commit. > > > > > > I would actually say it's because before this commit, the return handler > > > callers never called anything that the entry handlers didn't already call. > > > If there was recursion, the entry handler would catch it (and the entry > > > tells fgraph if the exit handler should be called). > > > > > > The difference here is with fprobes, you can have the exit handler calling > > > functions that the entry handler does not, which exposes more cases where > > > recursion could happen. > > > > so IIUC we have return kprobe multi probe on is_endbr and now we do: > > > > is_endbr() > > { -> function_graph_enter_regs installs return probe > > ... > > } -> __ftrace_return_to_handler > > fprobe_return > > kprobe_multi_link_exit_handler > > is_endbr > > { -> function_graph_enter_regs installs return probe > > ... > > } -> __ftrace_return_to_handler > > fprobe_return > > kprobe_multi_link_exit_handler > > is_endbr > > { -> function_graph_enter_regs installs return probe > > ... > > } -> __ftrace_return_to_handler > > ... recursion > > > > > > with the fix: > > > > is_endbr() > > { -> function_graph_enter_regs installs return probe > > ... > > } -> __ftrace_return_to_handler > > fprobe_return > > kprobe_multi_link_exit_handler > > ... > > is_endbr > > { -> function_graph_enter_regs > > ftrace_test_recursion_trylock fails and we do NOT install return probe > > ... > > } > > > > > > there's is_endbr call also in kprobe_multi_link_handler, but it won't > > trigger recursion, because function_graph_enter_regs already uses > > ftrace_test_recursion_trylock > > > > > > if above is correct then the fix looks good to me > > > > Acked-by: Jiri Olsa <jolsa@kernel.org> > > Hi Jiri, > > I found ftrace_test_recursion_trylock() allows one nest level, can you > make sure it is OK? hum, I recall being surprised by that already in the past, I thought we fixed that somehow, will check later today thanks, jirka > > Thank you, > > > > > thanks, > > jirka > > > > > > > > > > > > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > > > Cc: stable@vger.kernel.org > > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > --- > > > > kernel/trace/fgraph.c | 12 ++++++++++++ > > > > 1 file changed, 12 insertions(+) > > > > > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > > > index 1e3b32b1e82c..08dde420635b 100644 > > > > --- a/kernel/trace/fgraph.c > > > > +++ b/kernel/trace/fgraph.c > > > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > > unsigned long bitmap; > > > > unsigned long ret; > > > > int offset; > > > > + int bit; > > > > int i; > > > > > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > > if (fregs) > > > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > > > + /* > > > > + * This must be succeeded because the entry handler returns before > > > > + * modifying the return address if it is nested. Anyway, we need to > > > > + * avoid calling user callbacks if it is nested. > > > > + */ > > > > + if (WARN_ON_ONCE(bit < 0)) > > > > > > I'm not so sure we need the warn on here. We should probably hook it to the > > > recursion detection infrastructure that the function tracer has. > > > > > > The reason I would say not to have the warn on, is because we don't have a > > > warn on for recursion happening at the entry handler. Because this now is > > > exposed by fprobe allowing different routines to be called at exit than > > > what is used in entry, it can easily be triggered. > > > > > > -- Steve > > > > > > > > > > > > > + goto out; > > > > + > > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > > > trace.retval = ftrace_regs_get_return_value(fregs); > > > > #endif > > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > > } > > > > } > > > > > > > > + ftrace_test_recursion_unlock(bit); > > > > +out: > > > > /* > > > > * The ftrace_graph_return() may still access the current > > > > * ret_stack structure, we need to make sure the update of > > > > > > -- > Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-22 13:38 ` Jiri Olsa @ 2025-09-22 14:42 ` Steven Rostedt 2025-09-22 19:45 ` Jiri Olsa 1 sibling, 0 replies; 32+ messages in thread From: Steven Rostedt @ 2025-09-22 14:42 UTC (permalink / raw) To: Jiri Olsa Cc: Masami Hiramatsu, Peter Zijlstra, Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Mon, 22 Sep 2025 15:38:13 +0200 Jiri Olsa <olsajiri@gmail.com> wrote: > > I found ftrace_test_recursion_trylock() allows one nest level, can you > > make sure it is OK? > > hum, I recall being surprised by that already in the past, > I thought we fixed that somehow, will check later today It still does allow one level of recursion, because there's still locations that can trace in interrupt context before the preempt count is updated to the new context. I looked at fixing these, but IIRC, there's several locations in assembly that do this and it started getting messy. I guess I can try to fix this again. -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-22 13:38 ` Jiri Olsa 2025-09-22 14:42 ` Steven Rostedt @ 2025-09-22 19:45 ` Jiri Olsa 1 sibling, 0 replies; 32+ messages in thread From: Jiri Olsa @ 2025-09-22 19:45 UTC (permalink / raw) To: Jiri Olsa Cc: Masami Hiramatsu, Steven Rostedt, Peter Zijlstra, Steven Rostedt, Menglong Dong, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Mon, Sep 22, 2025 at 03:38:13PM +0200, Jiri Olsa wrote: > On Mon, Sep 22, 2025 at 03:16:55PM +0900, Masami Hiramatsu wrote: > > On Sat, 20 Sep 2025 09:45:15 +0200 > > Jiri Olsa <olsajiri@gmail.com> wrote: > > > > > On Fri, Sep 19, 2025 at 11:27:46AM -0400, Steven Rostedt wrote: > > > > On Fri, 19 Sep 2025 20:57:36 +0900 > > > > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > > > > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > > > > > > > function_graph_enter_regs() prevents itself from recursion by > > > > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > > > > which is called at the exit, does not prevent such recursion. > > > > > Therefore, while it can prevent recursive calls from > > > > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > > > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > > > > This can lead an unexpected recursion bug reported by Menglong. > > > > > > > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > > > > -> kprobe_multi_link_exit_handler -> is_endbr. > > > > > > > > So basically its if the handler for the return part calls something that it > > > > is tracing, it can trigger the recursion? > > > > > > > > > > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > > > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > > > > this section must prevent recursive call of fgraph inside user-defined > > > > > fgraph_ops::retfunc(). > > > > > > > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > > > > fprobe on function-graph tracer"), because before that fgraph was > > > > > only used from the function graph tracer. Fprobe allowed user to run > > > > > any callbacks from fgraph after that commit. > > > > > > > > I would actually say it's because before this commit, the return handler > > > > callers never called anything that the entry handlers didn't already call. > > > > If there was recursion, the entry handler would catch it (and the entry > > > > tells fgraph if the exit handler should be called). > > > > > > > > The difference here is with fprobes, you can have the exit handler calling > > > > functions that the entry handler does not, which exposes more cases where > > > > recursion could happen. > > > > > > so IIUC we have return kprobe multi probe on is_endbr and now we do: > > > > > > is_endbr() > > > { -> function_graph_enter_regs installs return probe > > > ... > > > } -> __ftrace_return_to_handler > > > fprobe_return > > > kprobe_multi_link_exit_handler > > > is_endbr > > > { -> function_graph_enter_regs installs return probe > > > ... > > > } -> __ftrace_return_to_handler > > > fprobe_return > > > kprobe_multi_link_exit_handler > > > is_endbr > > > { -> function_graph_enter_regs installs return probe > > > ... > > > } -> __ftrace_return_to_handler > > > ... recursion > > > > > > > > > with the fix: > > > > > > is_endbr() > > > { -> function_graph_enter_regs installs return probe > > > ... > > > } -> __ftrace_return_to_handler > > > fprobe_return > > > kprobe_multi_link_exit_handler > > > ... > > > is_endbr > > > { -> function_graph_enter_regs > > > ftrace_test_recursion_trylock fails and we do NOT install return probe > > > ... > > > } > > > > > > > > > there's is_endbr call also in kprobe_multi_link_handler, but it won't > > > trigger recursion, because function_graph_enter_regs already uses > > > ftrace_test_recursion_trylock > > > > > > > > > if above is correct then the fix looks good to me > > > > > > Acked-by: Jiri Olsa <jolsa@kernel.org> > > > > Hi Jiri, > > > > I found ftrace_test_recursion_trylock() allows one nest level, can you > > make sure it is OK? we have nesting check on the kprobe multi layer making sure the bpf program will not nest into itself kprobe_multi_link_prog_run bpf_prog_active check jirka ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-19 15:27 ` Steven Rostedt 2025-09-20 7:45 ` Jiri Olsa @ 2025-09-21 4:05 ` Masami Hiramatsu 2025-09-21 22:52 ` Steven Rostedt 1 sibling, 1 reply; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-21 4:05 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Fri, 19 Sep 2025 11:27:46 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Fri, 19 Sep 2025 20:57:36 +0900 > "Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote: > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > function_graph_enter_regs() prevents itself from recursion by > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > which is called at the exit, does not prevent such recursion. > > Therefore, while it can prevent recursive calls from > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > This can lead an unexpected recursion bug reported by Menglong. > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > -> kprobe_multi_link_exit_handler -> is_endbr. > > So basically its if the handler for the return part calls something that it > is tracing, it can trigger the recursion? > > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > this section must prevent recursive call of fgraph inside user-defined > > fgraph_ops::retfunc(). > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > fprobe on function-graph tracer"), because before that fgraph was > > only used from the function graph tracer. Fprobe allowed user to run > > any callbacks from fgraph after that commit. > > I would actually say it's because before this commit, the return handler > callers never called anything that the entry handlers didn't already call. > If there was recursion, the entry handler would catch it (and the entry > tells fgraph if the exit handler should be called). > > The difference here is with fprobes, you can have the exit handler calling > functions that the entry handler does not, which exposes more cases where > recursion could happen. > > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > Cc: stable@vger.kernel.org > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > kernel/trace/fgraph.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > index 1e3b32b1e82c..08dde420635b 100644 > > --- a/kernel/trace/fgraph.c > > +++ b/kernel/trace/fgraph.c > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > unsigned long bitmap; > > unsigned long ret; > > int offset; > > + int bit; > > int i; > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > if (fregs) > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > + /* > > + * This must be succeeded because the entry handler returns before > > + * modifying the return address if it is nested. Anyway, we need to > > + * avoid calling user callbacks if it is nested. > > + */ > > + if (WARN_ON_ONCE(bit < 0)) > > I'm not so sure we need the warn on here. We should probably hook it to the > recursion detection infrastructure that the function tracer has. The reason I added WARN_ON here is this should not happen. The recursion should be detected at the entry, not at exit. The __ftrace_return_to_handler() is installed only if the entry does NOT detect the recursion. In that case I think the exit should succeed recursion_trylock(). > > The reason I would say not to have the warn on, is because we don't have a > warn on for recursion happening at the entry handler. Because this now is > exposed by fprobe allowing different routines to be called at exit than > what is used in entry, it can easily be triggered. At the entry, if it detect recursion, it exits soon. But here we have to process stack unwind to get the return address. This recursion_trylock() is to mark this is in the critical section, not detect it. Thank you, > > -- Steve > > > > > + goto out; > > + > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = ftrace_regs_get_return_value(fregs); > > #endif > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > } > > } > > > > + ftrace_test_recursion_unlock(bit); > > +out: > > /* > > * The ftrace_graph_return() may still access the current > > * ret_stack structure, we need to make sure the update of > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-21 4:05 ` Masami Hiramatsu @ 2025-09-21 22:52 ` Steven Rostedt 2025-09-24 22:58 ` Masami Hiramatsu 0 siblings, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2025-09-21 22:52 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sun, 21 Sep 2025 13:05:19 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > The reason I would say not to have the warn on, is because we don't have a > > warn on for recursion happening at the entry handler. Because this now is > > exposed by fprobe allowing different routines to be called at exit than > > what is used in entry, it can easily be triggered. > > At the entry, if it detect recursion, it exits soon. But here we have to > process stack unwind to get the return address. This recursion_trylock() > is to mark this is in the critical section, not detect it. Ah, because the first instance of the exit callback sets the recursion bit. This will cause recursed entry calls to detect the recursion bit and return without setting the exit handler to be called. That is, by setting the recursion bit in the exit handler, it will cause a recursion in entry to fail before the exit is called again. I'd like to update the comment: + bit = ftrace_test_recursion_trylock(trace.func, ret); + /* + * This must be succeeded because the entry handler returns before + * modifying the return address if it is nested. Anyway, we need to + * avoid calling user callbacks if it is nested. + */ + if (WARN_ON_ONCE(bit < 0)) + goto out; + to: /* * Setting the recursion bit here will cause the graph entry to * detect recursion before the exit handle will. If the ext * handler detects recursion, something went wrong. */ if (WARN_ON_ONCE(bit < 0)) -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-21 22:52 ` Steven Rostedt @ 2025-09-24 22:58 ` Masami Hiramatsu 0 siblings, 0 replies; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-24 22:58 UTC (permalink / raw) To: Steven Rostedt Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sun, 21 Sep 2025 18:52:03 -0400 Steven Rostedt <rostedt@goodmis.org> wrote: > On Sun, 21 Sep 2025 13:05:19 +0900 > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > > > > The reason I would say not to have the warn on, is because we don't have a > > > warn on for recursion happening at the entry handler. Because this now is > > > exposed by fprobe allowing different routines to be called at exit than > > > what is used in entry, it can easily be triggered. > > > > At the entry, if it detect recursion, it exits soon. But here we have to > > process stack unwind to get the return address. This recursion_trylock() > > is to mark this is in the critical section, not detect it. > > Ah, because the first instance of the exit callback sets the recursion > bit. This will cause recursed entry calls to detect the recursion bit > and return without setting the exit handler to be called. > > That is, by setting the recursion bit in the exit handler, it will cause > a recursion in entry to fail before the exit is called again. > > I'd like to update the comment: > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > + /* > + * This must be succeeded because the entry handler returns before > + * modifying the return address if it is nested. Anyway, we need to > + * avoid calling user callbacks if it is nested. > + */ > + if (WARN_ON_ONCE(bit < 0)) > + goto out; > + > > to: > > /* > * Setting the recursion bit here will cause the graph entry to > * detect recursion before the exit handle will. If the ext > * handler detects recursion, something went wrong. > */ > if (WARN_ON_ONCE(bit < 0)) OK, let me update it. Thank you, -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google) 2025-09-19 15:27 ` Steven Rostedt @ 2025-09-20 13:39 ` Menglong Dong 2025-09-21 4:06 ` Masami Hiramatsu 2025-09-22 5:19 ` Masami Hiramatsu 1 sibling, 2 replies; 32+ messages in thread From: Menglong Dong @ 2025-09-20 13:39 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, mhiramat, ast, andrii, linux-kernel, bpf On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote: > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > function_graph_enter_regs() prevents itself from recursion by > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > which is called at the exit, does not prevent such recursion. > Therefore, while it can prevent recursive calls from > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > This can lead an unexpected recursion bug reported by Menglong. > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > -> kprobe_multi_link_exit_handler -> is_endbr. > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > __ftrace_return_to_handler() after unwind the shadow stack to mark > this section must prevent recursive call of fgraph inside user-defined > fgraph_ops::retfunc(). > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > fprobe on function-graph tracer"), because before that fgraph was > only used from the function graph tracer. Fprobe allowed user to run > any callbacks from fgraph after that commit. > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > Cc: stable@vger.kernel.org > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > --- > kernel/trace/fgraph.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > index 1e3b32b1e82c..08dde420635b 100644 > --- a/kernel/trace/fgraph.c > +++ b/kernel/trace/fgraph.c > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > unsigned long bitmap; > unsigned long ret; > int offset; > + int bit; > int i; > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > if (fregs) > ftrace_regs_set_instruction_pointer(fregs, ret); > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > + /* > + * This must be succeeded because the entry handler returns before > + * modifying the return address if it is nested. Anyway, we need to > + * avoid calling user callbacks if it is nested. > + */ > + if (WARN_ON_ONCE(bit < 0)) > + goto out; Hi, the logic seems right, but the warning is triggered when I try to run the bpf bench testing: $ ./benchs/run_bench_trigger.sh kretprobe-multi-all [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030 [ 139.509036] ------------[ cut here ]------------ [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0 [ 139.509411] Modules linked in: virtio_net [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014 [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0 [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00 [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002 [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003 [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000 [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319 [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000 [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000 [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0 [ 139.512038] PKRU: 55555554 [ 139.512106] Call Trace: [ 139.512177] <IRQ> [ 139.512232] ? irq_exit_rcu+0x4/0xb0 [ 139.512322] return_to_handler+0x1e/0x50 [ 139.512422] ? idle_cpu+0x9/0x50 [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80 [ 139.512638] ? idle_cpu+0x9/0x50 [ 139.512731] ? irq_exit_rcu+0x3a/0xb0 [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10 [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80 [ 139.513101] </IRQ> [ 139.513168] <TASK> > + > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > trace.retval = ftrace_regs_get_return_value(fregs); > #endif > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > } > } > > + ftrace_test_recursion_unlock(bit); > +out: > /* > * The ftrace_graph_return() may still access the current > * ret_stack structure, we need to make sure the update of > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-20 13:39 ` Menglong Dong @ 2025-09-21 4:06 ` Masami Hiramatsu 2025-09-21 23:00 ` Steven Rostedt 2025-09-22 5:19 ` Masami Hiramatsu 1 sibling, 1 reply; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-21 4:06 UTC (permalink / raw) To: Menglong Dong Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sat, 20 Sep 2025 21:39:25 +0800 Menglong Dong <menglong.dong@linux.dev> wrote: > On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote: > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > function_graph_enter_regs() prevents itself from recursion by > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > which is called at the exit, does not prevent such recursion. > > Therefore, while it can prevent recursive calls from > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > This can lead an unexpected recursion bug reported by Menglong. > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > -> kprobe_multi_link_exit_handler -> is_endbr. > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > this section must prevent recursive call of fgraph inside user-defined > > fgraph_ops::retfunc(). > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > fprobe on function-graph tracer"), because before that fgraph was > > only used from the function graph tracer. Fprobe allowed user to run > > any callbacks from fgraph after that commit. > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > Cc: stable@vger.kernel.org > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > kernel/trace/fgraph.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > index 1e3b32b1e82c..08dde420635b 100644 > > --- a/kernel/trace/fgraph.c > > +++ b/kernel/trace/fgraph.c > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > unsigned long bitmap; > > unsigned long ret; > > int offset; > > + int bit; > > int i; > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > if (fregs) > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > + /* > > + * This must be succeeded because the entry handler returns before > > + * modifying the return address if it is nested. Anyway, we need to > > + * avoid calling user callbacks if it is nested. > > + */ > > + if (WARN_ON_ONCE(bit < 0)) > > + goto out; > > Hi, the logic seems right, but the warning is triggered when > I try to run the bpf bench testing: Hmm, this is strange. Let me check why this happens. Thank you, > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030 > [ 139.509036] ------------[ cut here ]------------ > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0 > [ 139.509411] Modules linked in: virtio_net > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014 > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0 > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00 > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002 > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003 > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000 > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319 > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000 > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000 > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0 > [ 139.512038] PKRU: 55555554 > [ 139.512106] Call Trace: > [ 139.512177] <IRQ> > [ 139.512232] ? irq_exit_rcu+0x4/0xb0 > [ 139.512322] return_to_handler+0x1e/0x50 > [ 139.512422] ? idle_cpu+0x9/0x50 > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80 > [ 139.512638] ? idle_cpu+0x9/0x50 > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0 > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10 > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80 > [ 139.513101] </IRQ> > [ 139.513168] <TASK> > > > + > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = ftrace_regs_get_return_value(fregs); > > #endif > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > } > > } > > > > + ftrace_test_recursion_unlock(bit); > > +out: > > /* > > * The ftrace_graph_return() may still access the current > > * ret_stack structure, we need to make sure the update of > > > > > > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-21 4:06 ` Masami Hiramatsu @ 2025-09-21 23:00 ` Steven Rostedt 2025-09-24 22:59 ` Masami Hiramatsu 0 siblings, 1 reply; 32+ messages in thread From: Steven Rostedt @ 2025-09-21 23:00 UTC (permalink / raw) To: Masami Hiramatsu (Google) Cc: Menglong Dong, Peter Zijlstra, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sun, 21 Sep 2025 13:06:47 +0900 Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > > Hi, the logic seems right, but the warning is triggered when > > I try to run the bpf bench testing: > > Hmm, this is strange. Let me check why this happens. > > Thank you, > > > > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all > > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030 > > [ 139.509036] ------------[ cut here ]------------ > > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0 > > [ 139.509411] Modules linked in: virtio_net > > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE > > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014 > > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0 > > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00 > > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002 > > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003 > > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000 > > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319 > > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000 > > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000 > > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0 > > [ 139.512038] PKRU: 55555554 > > [ 139.512106] Call Trace: > > [ 139.512177] <IRQ> > > [ 139.512232] ? irq_exit_rcu+0x4/0xb0 > > [ 139.512322] return_to_handler+0x1e/0x50 > > [ 139.512422] ? idle_cpu+0x9/0x50 > > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80 > > [ 139.512638] ? idle_cpu+0x9/0x50 > > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0 > > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10 > > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80 > > [ 139.513101] </IRQ> > > [ 139.513168] <TASK> > > > > > + > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > > trace.retval = ftrace_regs_get_return_value(fregs); > > > #endif > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > } > > > } > > > > > > + ftrace_test_recursion_unlock(bit); > > > +out: > > > /* > > > * The ftrace_graph_return() may still access the current > > > * ret_stack structure, we need to make sure the update of Hmm, I wonder if this has to do with the "TRANSITION BIT". The ftrace_test_recursion_trylock() allows one level of recursion. This is to handle the case of an interrupt happening after the recursion bit is set and traces something before it updates the context in the preempt count. This would cause a false positive of the recursion test. To handle this case, it allows a single level of recursion. I originally was going to mention this, but I still can't see how this would affect it. Because if the entry were to allow one level of recursion, so would the exit. I still see the entry preventing the exit to be called. But perhaps there's an combination that I missed? -- Steve ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-21 23:00 ` Steven Rostedt @ 2025-09-24 22:59 ` Masami Hiramatsu 0 siblings, 0 replies; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-24 22:59 UTC (permalink / raw) To: Steven Rostedt Cc: Menglong Dong, Peter Zijlstra, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sun, 21 Sep 2025 19:00:56 -0400 Steven Rostedt <rostedt@kernel.org> wrote: > On Sun, 21 Sep 2025 13:06:47 +0900 > Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote: > > > > > > Hi, the logic seems right, but the warning is triggered when > > > I try to run the bpf bench testing: > > > > Hmm, this is strange. Let me check why this happens. > > > > Thank you, > > > > > > > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all > > > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030 > > > [ 139.509036] ------------[ cut here ]------------ > > > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0 > > > [ 139.509411] Modules linked in: virtio_net > > > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE > > > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014 > > > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0 > > > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00 > > > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002 > > > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003 > > > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000 > > > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319 > > > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000 > > > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > > > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000 > > > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0 > > > [ 139.512038] PKRU: 55555554 > > > [ 139.512106] Call Trace: > > > [ 139.512177] <IRQ> > > > [ 139.512232] ? irq_exit_rcu+0x4/0xb0 > > > [ 139.512322] return_to_handler+0x1e/0x50 > > > [ 139.512422] ? idle_cpu+0x9/0x50 > > > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80 > > > [ 139.512638] ? idle_cpu+0x9/0x50 > > > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0 > > > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10 > > > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80 > > > [ 139.513101] </IRQ> > > > [ 139.513168] <TASK> > > > > > > > + > > > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > > > trace.retval = ftrace_regs_get_return_value(fregs); > > > > #endif > > > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > > > } > > > > } > > > > > > > > + ftrace_test_recursion_unlock(bit); > > > > +out: > > > > /* > > > > * The ftrace_graph_return() may still access the current > > > > * ret_stack structure, we need to make sure the update of > > > Hmm, I wonder if this has to do with the "TRANSITION BIT". The > ftrace_test_recursion_trylock() allows one level of recursion. This is > to handle the case of an interrupt happening after the recursion bit is > set and traces something before it updates the context in the preempt > count. This would cause a false positive of the recursion test. To > handle this case, it allows a single level of recursion. > > I originally was going to mention this, but I still can't see how this > would affect it. Because if the entry were to allow one level of > recursion, so would the exit. I still see the entry preventing the exit > to be called. But perhaps there's an combination that I missed? If the fgraph allows one level of recursion, fprobe should work around functions called from fprobe's callback, such as is_endbr(), as follows: /* probe on enter */ call target_func() => function_graph_enter_regs() /* 1st lock */ -> fprobe_entry() -> user_callback() -> is_endbr() => function_graph_enter_regs() /* 2nd lock */ -> fprobe_entry() -> user_callback() -> is_endbr() => function_graph_enter_regs() /* lock failed */ /* probe on exit */ call target_func() => function_graph_enter_regs() -> ftrace_push_return_trace() /* run target_func() */ => __ftrace_return_to_handler() /* 1st lock */ -> fprobe_exit() -> user_callback() -> is_endbr() => function_graph_enter_regs() /* 2nd lock */ -> ftrace_push_return_trace() /* unlock 2nd */ /* run is_endbr() */ => __ftrace_return_to_handler() /* 2nd lock again */ -> fprobe_exit() -> user_callback() -> is_endbr() => function_graph_enter_regs() /* lock failed */ /* run is_endbr() */ So it can detect the recursion. Thank you, -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH] tracing: fgraph: Protect return handler from recursion loop 2025-09-20 13:39 ` Menglong Dong 2025-09-21 4:06 ` Masami Hiramatsu @ 2025-09-22 5:19 ` Masami Hiramatsu 1 sibling, 0 replies; 32+ messages in thread From: Masami Hiramatsu @ 2025-09-22 5:19 UTC (permalink / raw) To: Menglong Dong Cc: Peter Zijlstra, Steven Rostedt, Menglong Dong, jolsa, tglx, mingo, bp, dave.hansen, x86, hpa, kees, samitolvanen, rppt, luto, ast, andrii, linux-kernel, bpf On Sat, 20 Sep 2025 21:39:25 +0800 Menglong Dong <menglong.dong@linux.dev> wrote: > On 2025/9/19 19:57, Masami Hiramatsu (Google) wrote: > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > > > function_graph_enter_regs() prevents itself from recursion by > > ftrace_test_recursion_trylock(), but __ftrace_return_to_handler(), > > which is called at the exit, does not prevent such recursion. > > Therefore, while it can prevent recursive calls from > > fgraph_ops::entryfunc(), it is not able to prevent recursive calls > > to fgraph from fgraph_ops::retfunc(), resulting in a recursive loop. > > This can lead an unexpected recursion bug reported by Menglong. > > > > is_endbr() is called in __ftrace_return_to_handler -> fprobe_return > > -> kprobe_multi_link_exit_handler -> is_endbr. > > > > To fix this issue, acquire ftrace_test_recursion_trylock() in the > > __ftrace_return_to_handler() after unwind the shadow stack to mark > > this section must prevent recursive call of fgraph inside user-defined > > fgraph_ops::retfunc(). > > > > This is essentially a fix to commit 4346ba160409 ("fprobe: Rewrite > > fprobe on function-graph tracer"), because before that fgraph was > > only used from the function graph tracer. Fprobe allowed user to run > > any callbacks from fgraph after that commit. > > > > Reported-by: Menglong Dong <menglong8.dong@gmail.com> > > Closes: https://lore.kernel.org/all/20250918120939.1706585-1-dongml2@chinatelecom.cn/ > > Fixes: 4346ba160409 ("fprobe: Rewrite fprobe on function-graph tracer") > > Cc: stable@vger.kernel.org > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> > > --- > > kernel/trace/fgraph.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c > > index 1e3b32b1e82c..08dde420635b 100644 > > --- a/kernel/trace/fgraph.c > > +++ b/kernel/trace/fgraph.c > > @@ -815,6 +815,7 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > unsigned long bitmap; > > unsigned long ret; > > int offset; > > + int bit; > > int i; > > > > ret_stack = ftrace_pop_return_trace(&trace, &ret, frame_pointer, &offset); > > @@ -829,6 +830,15 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > if (fregs) > > ftrace_regs_set_instruction_pointer(fregs, ret); > > > > + bit = ftrace_test_recursion_trylock(trace.func, ret); > > + /* > > + * This must be succeeded because the entry handler returns before > > + * modifying the return address if it is nested. Anyway, we need to > > + * avoid calling user callbacks if it is nested. > > + */ > > + if (WARN_ON_ONCE(bit < 0)) > > + goto out; > > Hi, the logic seems right, but the warning is triggered when > I try to run the bpf bench testing: Ah, I remember that this ftrace_test_recursion_trylock() allows one-level nesting. OK, I will remove the WARN_ON. Thanks, > > $ ./benchs/run_bench_trigger.sh kretprobe-multi-all > [ 20.619642] NOTICE: Automounting of tracing to debugfs is deprecated and will be removed in 2030 > [ 139.509036] ------------[ cut here ]------------ > [ 139.509180] WARNING: CPU: 2 PID: 522 at kernel/trace/fgraph.c:839 ftrace_return_to_handler+0x2b9/0x2d0 > [ 139.509411] Modules linked in: virtio_net > [ 139.509514] CPU: 2 UID: 0 PID: 522 Comm: bench Not tainted 6.17.0-rc5-g1fe6d652bfa0 #106 NONE > [ 139.509720] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.17.0-1-1 04/01/2014 > [ 139.509948] RIP: 0010:ftrace_return_to_handler+0x2b9/0x2d0 > [ 139.510086] Code: e8 0c 08 0e 00 0f 0b 49 c7 c1 00 73 20 81 e9 d1 fe ff ff 40 f6 c6 10 75 11 49 c7 c3 ef ff ff ff ba 10 00 00 00 e9 57 fe ff ff <0f> 0b e9 a5 fe ff ff e8 1b 72 0d 01 66 66 2e 0f 1f 84 00 00 00 00 > [ 139.510536] RSP: 0018:ffffc9000012cef8 EFLAGS: 00010002 > [ 139.510664] RAX: ffff88810f709800 RBX: ffffc900007c3678 RCX: 0000000000000003 > [ 139.510835] RDX: 0000000000000008 RSI: 0000000000000018 RDI: 0000000000000000 > [ 139.511007] RBP: 0000000000000000 R08: 0000000000000034 R09: ffffffff82550319 > [ 139.511184] R10: ffffc9000012cf50 R11: fffffffffffffff7 R12: 0000000000000000 > [ 139.511357] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 > [ 139.511532] FS: 00007fe58276fb00(0000) GS:ffff8884ab3b8000(0000) knlGS:0000000000000000 > [ 139.511724] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 139.511865] CR2: 0000562a28314b67 CR3: 00000001143f9000 CR4: 0000000000750ef0 > [ 139.512038] PKRU: 55555554 > [ 139.512106] Call Trace: > [ 139.512177] <IRQ> > [ 139.512232] ? irq_exit_rcu+0x4/0xb0 > [ 139.512322] return_to_handler+0x1e/0x50 > [ 139.512422] ? idle_cpu+0x9/0x50 > [ 139.512506] ? sysvec_apic_timer_interrupt+0x69/0x80 > [ 139.512638] ? idle_cpu+0x9/0x50 > [ 139.512731] ? irq_exit_rcu+0x3a/0xb0 > [ 139.512833] ? ftrace_stub_direct_tramp+0x10/0x10 > [ 139.512961] ? sysvec_apic_timer_interrupt+0x69/0x80 > [ 139.513101] </IRQ> > [ 139.513168] <TASK> > > > + > > #ifdef CONFIG_FUNCTION_GRAPH_RETVAL > > trace.retval = ftrace_regs_get_return_value(fregs); > > #endif > > @@ -852,6 +862,8 @@ __ftrace_return_to_handler(struct ftrace_regs *fregs, unsigned long frame_pointe > > } > > } > > > > + ftrace_test_recursion_unlock(bit); > > +out: > > /* > > * The ftrace_graph_return() may still access the current > > * ret_stack structure, we need to make sure the update of > > > > > > > > > > -- Masami Hiramatsu (Google) <mhiramat@kernel.org> ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2025-09-24 22:59 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-09-18 12:09 [PATCH] x86/ibt: make is_endbr() notrace Menglong Dong 2025-09-18 13:05 ` Peter Zijlstra 2025-09-18 13:32 ` Menglong Dong 2025-09-18 16:02 ` Alexei Starovoitov 2025-09-18 16:59 ` Peter Zijlstra 2025-09-18 17:53 ` Alexei Starovoitov 2025-09-19 1:13 ` Menglong Dong 2025-09-22 6:52 ` Peter Zijlstra 2025-09-22 7:13 ` menglong.dong 2025-09-22 7:19 ` Peter Zijlstra 2025-09-22 7:21 ` Menglong Dong 2025-09-22 6:36 ` Peter Zijlstra 2025-09-18 16:56 ` Peter Zijlstra 2025-09-19 12:35 ` Masami Hiramatsu 2025-09-19 8:52 ` Masami Hiramatsu 2025-09-19 8:58 ` Menglong Dong 2025-09-19 12:32 ` Masami Hiramatsu 2025-09-19 11:57 ` [PATCH] tracing: fgraph: Protect return handler from recursion loop Masami Hiramatsu (Google) 2025-09-19 15:27 ` Steven Rostedt 2025-09-20 7:45 ` Jiri Olsa 2025-09-22 6:16 ` Masami Hiramatsu 2025-09-22 13:38 ` Jiri Olsa 2025-09-22 14:42 ` Steven Rostedt 2025-09-22 19:45 ` Jiri Olsa 2025-09-21 4:05 ` Masami Hiramatsu 2025-09-21 22:52 ` Steven Rostedt 2025-09-24 22:58 ` Masami Hiramatsu 2025-09-20 13:39 ` Menglong Dong 2025-09-21 4:06 ` Masami Hiramatsu 2025-09-21 23:00 ` Steven Rostedt 2025-09-24 22:59 ` Masami Hiramatsu 2025-09-22 5:19 ` Masami Hiramatsu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox