* [PATCH] uprobes: Fix race in uprobe_free_utask
@ 2025-01-09 14:14 Jiri Olsa
2025-01-09 14:24 ` Oleg Nesterov
` (2 more replies)
0 siblings, 3 replies; 5+ messages in thread
From: Jiri Olsa @ 2025-01-09 14:14 UTC (permalink / raw)
To: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
Daniel Borkmann
Cc: Max Makarov, bpf, Steven Rostedt, linux-kernel,
linux-trace-kernel
Max Makarov reported kernel panic [1] in perf user callchain code.
The reason for that is the race between uprobe_free_utask and bpf
profiler code doing the perf user stack unwind and is triggered
within uprobe_free_utask function:
- after current->utask is freed and
- before current->utask is set to NULL
general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
...
? die_addr+0x36/0x90
? exc_general_protection+0x217/0x420
? asm_exc_general_protection+0x26/0x30
? is_uprobe_at_func_entry+0x28/0x80
perf_callchain_user+0x20a/0x360
get_perf_callchain+0x147/0x1d0
bpf_get_stackid+0x60/0x90
bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
? __smp_call_single_queue+0xad/0x120
bpf_overflow_handler+0x75/0x110
...
asm_sysvec_apic_timer_interrupt+0x1a/0x20
RIP: 0010:__kmem_cache_free+0x1cb/0x350
...
? uprobe_free_utask+0x62/0x80
? acct_collect+0x4c/0x220
uprobe_free_utask+0x62/0x80
mm_release+0x12/0xb0
do_exit+0x26b/0xaa0
__x64_sys_exit+0x1b/0x20
do_syscall_64+0x5a/0x80
It can be easily reproduced by running following commands in
separate terminals:
# while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done
# bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
Fixing this by making sure current->utask pointer is set to NULL
before we start to release the utask object.
[1] https://github.com/grafana/pyroscope/issues/3673
Reported-by: Max Makarov <maxpain@linux.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
kernel/events/uprobes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index fa04b14a7d72..5d71ef85420c 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
if (!utask)
return;
+ t->utask = NULL;
WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
timer_delete_sync(&utask->ri_timer);
@@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
ri = free_ret_instance(ri, true /* cleanup_hprobe */);
kfree(utask);
- t->utask = NULL;
}
#define RI_TIMER_PERIOD (HZ / 10) /* 100 ms */
--
2.47.0
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
@ 2025-01-09 14:24 ` Oleg Nesterov
2025-01-09 14:41 ` Daniel Borkmann
2025-01-09 22:13 ` Andrii Nakryiko
2 siblings, 0 replies; 5+ messages in thread
From: Oleg Nesterov @ 2025-01-09 14:24 UTC (permalink / raw)
To: Jiri Olsa
Cc: Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
Daniel Borkmann, Max Makarov, bpf, Steven Rostedt, linux-kernel,
linux-trace-kernel
On 01/09, Jiri Olsa wrote:
>
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
> if (!utask)
> return;
>
> + t->utask = NULL;
> WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
>
> timer_delete_sync(&utask->ri_timer);
> @@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
> ri = free_ret_instance(ri, true /* cleanup_hprobe */);
>
> kfree(utask);
> - t->utask = NULL;
Acked-by: Oleg Nesterov <oleg@redhat.com>
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
2025-01-09 14:24 ` Oleg Nesterov
@ 2025-01-09 14:41 ` Daniel Borkmann
2025-01-09 20:49 ` Jiri Olsa
2025-01-09 22:13 ` Andrii Nakryiko
2 siblings, 1 reply; 5+ messages in thread
From: Daniel Borkmann @ 2025-01-09 14:41 UTC (permalink / raw)
To: Jiri Olsa, Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko,
Masami Hiramatsu
Cc: Max Makarov, bpf, Steven Rostedt, linux-kernel,
linux-trace-kernel
On 1/9/25 3:14 PM, Jiri Olsa wrote:
> Max Makarov reported kernel panic [1] in perf user callchain code.
>
> The reason for that is the race between uprobe_free_utask and bpf
> profiler code doing the perf user stack unwind and is triggered
> within uprobe_free_utask function:
> - after current->utask is freed and
> - before current->utask is set to NULL
>
> general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
> RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
> ...
> ? die_addr+0x36/0x90
> ? exc_general_protection+0x217/0x420
> ? asm_exc_general_protection+0x26/0x30
> ? is_uprobe_at_func_entry+0x28/0x80
> perf_callchain_user+0x20a/0x360
> get_perf_callchain+0x147/0x1d0
> bpf_get_stackid+0x60/0x90
> bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
> ? __smp_call_single_queue+0xad/0x120
> bpf_overflow_handler+0x75/0x110
> ...
> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> RIP: 0010:__kmem_cache_free+0x1cb/0x350
> ...
> ? uprobe_free_utask+0x62/0x80
> ? acct_collect+0x4c/0x220
> uprobe_free_utask+0x62/0x80
> mm_release+0x12/0xb0
> do_exit+0x26b/0xaa0
> __x64_sys_exit+0x1b/0x20
> do_syscall_64+0x5a/0x80
>
> It can be easily reproduced by running following commands in
> separate terminals:
>
> # while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done
> # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
>
> Fixing this by making sure current->utask pointer is set to NULL
> before we start to release the utask object.
Lets add Fixes tag for stable team:
Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")
> [1] https://github.com/grafana/pyroscope/issues/3673
> Reported-by: Max Makarov <maxpain@linux.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
fwiw, the other version we were potentially thinking of was below, but
just moving the t->utask NULL assignment seemed better.
Thanks,
Daniel
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c75c482d4c52..05f9cedf2691 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2835,6 +2835,8 @@ static bool is_uprobe_at_func_entry(struct pt_regs *regs)
if (!current->utask)
return false;
+ if (!current->utask->active_uprobe)
+ return false;
auprobe = current->utask->auprobe;
if (!auprobe)
^ permalink raw reply related [flat|nested] 5+ messages in thread* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
2025-01-09 14:41 ` Daniel Borkmann
@ 2025-01-09 20:49 ` Jiri Olsa
0 siblings, 0 replies; 5+ messages in thread
From: Jiri Olsa @ 2025-01-09 20:49 UTC (permalink / raw)
To: Daniel Borkmann
Cc: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
Max Makarov, bpf, Steven Rostedt, linux-kernel,
linux-trace-kernel
On Thu, Jan 09, 2025 at 03:41:26PM +0100, Daniel Borkmann wrote:
> On 1/9/25 3:14 PM, Jiri Olsa wrote:
> > Max Makarov reported kernel panic [1] in perf user callchain code.
> >
> > The reason for that is the race between uprobe_free_utask and bpf
> > profiler code doing the perf user stack unwind and is triggered
> > within uprobe_free_utask function:
> > - after current->utask is freed and
> > - before current->utask is set to NULL
> >
> > general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
> > RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
> > ...
> > ? die_addr+0x36/0x90
> > ? exc_general_protection+0x217/0x420
> > ? asm_exc_general_protection+0x26/0x30
> > ? is_uprobe_at_func_entry+0x28/0x80
> > perf_callchain_user+0x20a/0x360
> > get_perf_callchain+0x147/0x1d0
> > bpf_get_stackid+0x60/0x90
> > bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
> > ? __smp_call_single_queue+0xad/0x120
> > bpf_overflow_handler+0x75/0x110
> > ...
> > asm_sysvec_apic_timer_interrupt+0x1a/0x20
> > RIP: 0010:__kmem_cache_free+0x1cb/0x350
> > ...
> > ? uprobe_free_utask+0x62/0x80
> > ? acct_collect+0x4c/0x220
> > uprobe_free_utask+0x62/0x80
> > mm_release+0x12/0xb0
> > do_exit+0x26b/0xaa0
> > __x64_sys_exit+0x1b/0x20
> > do_syscall_64+0x5a/0x80
> >
> > It can be easily reproduced by running following commands in
> > separate terminals:
> >
> > # while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done
> > # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
> >
> > Fixing this by making sure current->utask pointer is set to NULL
> > before we start to release the utask object.
>
> Lets add Fixes tag for stable team:
>
> Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")
ugh right, thanks for finding that
jirka
>
> > [1] https://github.com/grafana/pyroscope/issues/3673
> > Reported-by: Max Makarov <maxpain@linux.com>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>
> fwiw, the other version we were potentially thinking of was below, but
> just moving the t->utask NULL assignment seemed better.
>
> Thanks,
> Daniel
>
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index c75c482d4c52..05f9cedf2691 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2835,6 +2835,8 @@ static bool is_uprobe_at_func_entry(struct pt_regs *regs)
>
> if (!current->utask)
> return false;
> + if (!current->utask->active_uprobe)
> + return false;
>
> auprobe = current->utask->auprobe;
> if (!auprobe)
^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
2025-01-09 14:24 ` Oleg Nesterov
2025-01-09 14:41 ` Daniel Borkmann
@ 2025-01-09 22:13 ` Andrii Nakryiko
2 siblings, 0 replies; 5+ messages in thread
From: Andrii Nakryiko @ 2025-01-09 22:13 UTC (permalink / raw)
To: Jiri Olsa
Cc: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
Daniel Borkmann, Max Makarov, bpf, Steven Rostedt, linux-kernel,
linux-trace-kernel
On Thu, Jan 9, 2025 at 6:14 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Max Makarov reported kernel panic [1] in perf user callchain code.
>
> The reason for that is the race between uprobe_free_utask and bpf
> profiler code doing the perf user stack unwind and is triggered
> within uprobe_free_utask function:
> - after current->utask is freed and
> - before current->utask is set to NULL
>
> general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
> RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
> ...
> ? die_addr+0x36/0x90
> ? exc_general_protection+0x217/0x420
> ? asm_exc_general_protection+0x26/0x30
> ? is_uprobe_at_func_entry+0x28/0x80
> perf_callchain_user+0x20a/0x360
> get_perf_callchain+0x147/0x1d0
> bpf_get_stackid+0x60/0x90
> bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
> ? __smp_call_single_queue+0xad/0x120
> bpf_overflow_handler+0x75/0x110
> ...
> asm_sysvec_apic_timer_interrupt+0x1a/0x20
> RIP: 0010:__kmem_cache_free+0x1cb/0x350
> ...
> ? uprobe_free_utask+0x62/0x80
> ? acct_collect+0x4c/0x220
> uprobe_free_utask+0x62/0x80
> mm_release+0x12/0xb0
> do_exit+0x26b/0xaa0
> __x64_sys_exit+0x1b/0x20
> do_syscall_64+0x5a/0x80
>
> It can be easily reproduced by running following commands in
> separate terminals:
>
> # while :; do bpftrace -e 'uprobe:/bin/ls:_start { printf("hit\n"); }' -c ls; done
> # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
>
> Fixing this by making sure current->utask pointer is set to NULL
> before we start to release the utask object.
>
> [1] https://github.com/grafana/pyroscope/issues/3673
> Reported-by: Max Makarov <maxpain@linux.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
> kernel/events/uprobes.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
ah, it's interrupt/NMI checking current->utask, makes total sense,
thanks for the fix!
Acked-by: Andrii Nakryiko <andrii@kernel.org>
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index fa04b14a7d72..5d71ef85420c 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
> if (!utask)
> return;
>
> + t->utask = NULL;
> WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
>
> timer_delete_sync(&utask->ri_timer);
> @@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
> ri = free_ret_instance(ri, true /* cleanup_hprobe */);
>
> kfree(utask);
> - t->utask = NULL;
> }
>
> #define RI_TIMER_PERIOD (HZ / 10) /* 100 ms */
> --
> 2.47.0
>
^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2025-01-09 22:13 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
2025-01-09 14:24 ` Oleg Nesterov
2025-01-09 14:41 ` Daniel Borkmann
2025-01-09 20:49 ` Jiri Olsa
2025-01-09 22:13 ` Andrii Nakryiko
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).