All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] uprobes: Fix race in uprobe_free_utask
@ 2025-01-09 14:14 Jiri Olsa
  2025-01-09 14:24 ` Oleg Nesterov
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Jiri Olsa @ 2025-01-09 14:14 UTC (permalink / raw)
  To: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
	Daniel Borkmann
  Cc: Max Makarov, bpf, Steven Rostedt, linux-kernel,
	linux-trace-kernel

Max Makarov reported kernel panic [1] in perf user callchain code.

The reason for that is the race between uprobe_free_utask and bpf
profiler code doing the perf user stack unwind and is triggered
within uprobe_free_utask function:
  - after current->utask is freed and
  - before current->utask is set to NULL

 general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
 RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
 ...
  ? die_addr+0x36/0x90
  ? exc_general_protection+0x217/0x420
  ? asm_exc_general_protection+0x26/0x30
  ? is_uprobe_at_func_entry+0x28/0x80
  perf_callchain_user+0x20a/0x360
  get_perf_callchain+0x147/0x1d0
  bpf_get_stackid+0x60/0x90
  bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
  ? __smp_call_single_queue+0xad/0x120
  bpf_overflow_handler+0x75/0x110
  ...
  asm_sysvec_apic_timer_interrupt+0x1a/0x20
 RIP: 0010:__kmem_cache_free+0x1cb/0x350
 ...
  ? uprobe_free_utask+0x62/0x80
  ? acct_collect+0x4c/0x220
  uprobe_free_utask+0x62/0x80
  mm_release+0x12/0xb0
  do_exit+0x26b/0xaa0
  __x64_sys_exit+0x1b/0x20
  do_syscall_64+0x5a/0x80

It can be easily reproduced by running following commands in
separate terminals:

  # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
  # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'

Fixing this by making sure current->utask pointer is set to NULL
before we start to release the utask object.

[1] https://github.com/grafana/pyroscope/issues/3673
Reported-by: Max Makarov <maxpain@linux.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/events/uprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index fa04b14a7d72..5d71ef85420c 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
 	if (!utask)
 		return;
 
+	t->utask = NULL;
 	WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
 
 	timer_delete_sync(&utask->ri_timer);
@@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
 		ri = free_ret_instance(ri, true /* cleanup_hprobe */);
 
 	kfree(utask);
-	t->utask = NULL;
 }
 
 #define RI_TIMER_PERIOD (HZ / 10) /* 100 ms */
-- 
2.47.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
  2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
@ 2025-01-09 14:24 ` Oleg Nesterov
  2025-01-09 14:41 ` Daniel Borkmann
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2025-01-09 14:24 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
	Daniel Borkmann, Max Makarov, bpf, Steven Rostedt, linux-kernel,
	linux-trace-kernel

On 01/09, Jiri Olsa wrote:
>
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
>  	if (!utask)
>  		return;
>
> +	t->utask = NULL;
>  	WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
>
>  	timer_delete_sync(&utask->ri_timer);
> @@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
>  		ri = free_ret_instance(ri, true /* cleanup_hprobe */);
>
>  	kfree(utask);
> -	t->utask = NULL;

Acked-by: Oleg Nesterov <oleg@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
  2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
  2025-01-09 14:24 ` Oleg Nesterov
@ 2025-01-09 14:41 ` Daniel Borkmann
  2025-01-09 20:49   ` Jiri Olsa
  2025-01-09 22:13 ` Andrii Nakryiko
  2025-01-10  8:40 ` [tip: perf/urgent] " tip-bot2 for Jiri Olsa
  3 siblings, 1 reply; 6+ messages in thread
From: Daniel Borkmann @ 2025-01-09 14:41 UTC (permalink / raw)
  To: Jiri Olsa, Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko,
	Masami Hiramatsu
  Cc: Max Makarov, bpf, Steven Rostedt, linux-kernel,
	linux-trace-kernel

On 1/9/25 3:14 PM, Jiri Olsa wrote:
> Max Makarov reported kernel panic [1] in perf user callchain code.
> 
> The reason for that is the race between uprobe_free_utask and bpf
> profiler code doing the perf user stack unwind and is triggered
> within uprobe_free_utask function:
>    - after current->utask is freed and
>    - before current->utask is set to NULL
> 
>   general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
>   RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
>   ...
>    ? die_addr+0x36/0x90
>    ? exc_general_protection+0x217/0x420
>    ? asm_exc_general_protection+0x26/0x30
>    ? is_uprobe_at_func_entry+0x28/0x80
>    perf_callchain_user+0x20a/0x360
>    get_perf_callchain+0x147/0x1d0
>    bpf_get_stackid+0x60/0x90
>    bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
>    ? __smp_call_single_queue+0xad/0x120
>    bpf_overflow_handler+0x75/0x110
>    ...
>    asm_sysvec_apic_timer_interrupt+0x1a/0x20
>   RIP: 0010:__kmem_cache_free+0x1cb/0x350
>   ...
>    ? uprobe_free_utask+0x62/0x80
>    ? acct_collect+0x4c/0x220
>    uprobe_free_utask+0x62/0x80
>    mm_release+0x12/0xb0
>    do_exit+0x26b/0xaa0
>    __x64_sys_exit+0x1b/0x20
>    do_syscall_64+0x5a/0x80
> 
> It can be easily reproduced by running following commands in
> separate terminals:
> 
>    # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
>    # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
> 
> Fixing this by making sure current->utask pointer is set to NULL
> before we start to release the utask object.

Lets add Fixes tag for stable team:

Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")

> [1] https://github.com/grafana/pyroscope/issues/3673
> Reported-by: Max Makarov <maxpain@linux.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>

fwiw, the other version we were potentially thinking of was below, but
just moving the t->utask NULL assignment seemed better.

Thanks,
Daniel

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index c75c482d4c52..05f9cedf2691 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -2835,6 +2835,8 @@ static bool is_uprobe_at_func_entry(struct pt_regs *regs)

         if (!current->utask)
                 return false;
+       if (!current->utask->active_uprobe)
+               return false;

         auprobe = current->utask->auprobe;
         if (!auprobe)

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
  2025-01-09 14:41 ` Daniel Borkmann
@ 2025-01-09 20:49   ` Jiri Olsa
  0 siblings, 0 replies; 6+ messages in thread
From: Jiri Olsa @ 2025-01-09 20:49 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
	Max Makarov, bpf, Steven Rostedt, linux-kernel,
	linux-trace-kernel

On Thu, Jan 09, 2025 at 03:41:26PM +0100, Daniel Borkmann wrote:
> On 1/9/25 3:14 PM, Jiri Olsa wrote:
> > Max Makarov reported kernel panic [1] in perf user callchain code.
> > 
> > The reason for that is the race between uprobe_free_utask and bpf
> > profiler code doing the perf user stack unwind and is triggered
> > within uprobe_free_utask function:
> >    - after current->utask is freed and
> >    - before current->utask is set to NULL
> > 
> >   general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
> >   RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
> >   ...
> >    ? die_addr+0x36/0x90
> >    ? exc_general_protection+0x217/0x420
> >    ? asm_exc_general_protection+0x26/0x30
> >    ? is_uprobe_at_func_entry+0x28/0x80
> >    perf_callchain_user+0x20a/0x360
> >    get_perf_callchain+0x147/0x1d0
> >    bpf_get_stackid+0x60/0x90
> >    bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
> >    ? __smp_call_single_queue+0xad/0x120
> >    bpf_overflow_handler+0x75/0x110
> >    ...
> >    asm_sysvec_apic_timer_interrupt+0x1a/0x20
> >   RIP: 0010:__kmem_cache_free+0x1cb/0x350
> >   ...
> >    ? uprobe_free_utask+0x62/0x80
> >    ? acct_collect+0x4c/0x220
> >    uprobe_free_utask+0x62/0x80
> >    mm_release+0x12/0xb0
> >    do_exit+0x26b/0xaa0
> >    __x64_sys_exit+0x1b/0x20
> >    do_syscall_64+0x5a/0x80
> > 
> > It can be easily reproduced by running following commands in
> > separate terminals:
> > 
> >    # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
> >    # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
> > 
> > Fixing this by making sure current->utask pointer is set to NULL
> > before we start to release the utask object.
> 
> Lets add Fixes tag for stable team:
> 
> Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")

ugh right, thanks for finding that

jirka

> 
> > [1] https://github.com/grafana/pyroscope/issues/3673
> > Reported-by: Max Makarov <maxpain@linux.com>
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> 
> fwiw, the other version we were potentially thinking of was below, but
> just moving the t->utask NULL assignment seemed better.
> 
> Thanks,
> Daniel
> 
> diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
> index c75c482d4c52..05f9cedf2691 100644
> --- a/arch/x86/events/core.c
> +++ b/arch/x86/events/core.c
> @@ -2835,6 +2835,8 @@ static bool is_uprobe_at_func_entry(struct pt_regs *regs)
> 
>         if (!current->utask)
>                 return false;
> +       if (!current->utask->active_uprobe)
> +               return false;
> 
>         auprobe = current->utask->auprobe;
>         if (!auprobe)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] uprobes: Fix race in uprobe_free_utask
  2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
  2025-01-09 14:24 ` Oleg Nesterov
  2025-01-09 14:41 ` Daniel Borkmann
@ 2025-01-09 22:13 ` Andrii Nakryiko
  2025-01-10  8:40 ` [tip: perf/urgent] " tip-bot2 for Jiri Olsa
  3 siblings, 0 replies; 6+ messages in thread
From: Andrii Nakryiko @ 2025-01-09 22:13 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Oleg Nesterov, Peter Zijlstra, Andrii Nakryiko, Masami Hiramatsu,
	Daniel Borkmann, Max Makarov, bpf, Steven Rostedt, linux-kernel,
	linux-trace-kernel

On Thu, Jan 9, 2025 at 6:14 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Max Makarov reported kernel panic [1] in perf user callchain code.
>
> The reason for that is the race between uprobe_free_utask and bpf
> profiler code doing the perf user stack unwind and is triggered
> within uprobe_free_utask function:
>   - after current->utask is freed and
>   - before current->utask is set to NULL
>
>  general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
>  RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
>  ...
>   ? die_addr+0x36/0x90
>   ? exc_general_protection+0x217/0x420
>   ? asm_exc_general_protection+0x26/0x30
>   ? is_uprobe_at_func_entry+0x28/0x80
>   perf_callchain_user+0x20a/0x360
>   get_perf_callchain+0x147/0x1d0
>   bpf_get_stackid+0x60/0x90
>   bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
>   ? __smp_call_single_queue+0xad/0x120
>   bpf_overflow_handler+0x75/0x110
>   ...
>   asm_sysvec_apic_timer_interrupt+0x1a/0x20
>  RIP: 0010:__kmem_cache_free+0x1cb/0x350
>  ...
>   ? uprobe_free_utask+0x62/0x80
>   ? acct_collect+0x4c/0x220
>   uprobe_free_utask+0x62/0x80
>   mm_release+0x12/0xb0
>   do_exit+0x26b/0xaa0
>   __x64_sys_exit+0x1b/0x20
>   do_syscall_64+0x5a/0x80
>
> It can be easily reproduced by running following commands in
> separate terminals:
>
>   # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
>   # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'
>
> Fixing this by making sure current->utask pointer is set to NULL
> before we start to release the utask object.
>
> [1] https://github.com/grafana/pyroscope/issues/3673
> Reported-by: Max Makarov <maxpain@linux.com>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  kernel/events/uprobes.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>

ah, it's interrupt/NMI checking current->utask, makes total sense,
thanks for the fix!

Acked-by: Andrii Nakryiko <andrii@kernel.org>

> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index fa04b14a7d72..5d71ef85420c 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
>         if (!utask)
>                 return;
>
> +       t->utask = NULL;
>         WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
>
>         timer_delete_sync(&utask->ri_timer);
> @@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
>                 ri = free_ret_instance(ri, true /* cleanup_hprobe */);
>
>         kfree(utask);
> -       t->utask = NULL;
>  }
>
>  #define RI_TIMER_PERIOD (HZ / 10) /* 100 ms */
> --
> 2.47.0
>

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: perf/urgent] uprobes: Fix race in uprobe_free_utask
  2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
                   ` (2 preceding siblings ...)
  2025-01-09 22:13 ` Andrii Nakryiko
@ 2025-01-10  8:40 ` tip-bot2 for Jiri Olsa
  3 siblings, 0 replies; 6+ messages in thread
From: tip-bot2 for Jiri Olsa @ 2025-01-10  8:40 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Max Makarov, Jiri Olsa, Peter Zijlstra (Intel), Oleg Nesterov,
	Andrii Nakryiko, x86, linux-kernel

The following commit has been merged into the perf/urgent branch of tip:

Commit-ID:     b583ef82b671c9a752fbe3e95bd4c1c51eab764d
Gitweb:        https://git.kernel.org/tip/b583ef82b671c9a752fbe3e95bd4c1c51eab764d
Author:        Jiri Olsa <jolsa@kernel.org>
AuthorDate:    Thu, 09 Jan 2025 15:14:40 +01:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Fri, 10 Jan 2025 09:28:01 +01:00

uprobes: Fix race in uprobe_free_utask

Max Makarov reported kernel panic [1] in perf user callchain code.

The reason for that is the race between uprobe_free_utask and bpf
profiler code doing the perf user stack unwind and is triggered
within uprobe_free_utask function:
  - after current->utask is freed and
  - before current->utask is set to NULL

 general protection fault, probably for non-canonical address 0x9e759c37ee555c76: 0000 [#1] SMP PTI
 RIP: 0010:is_uprobe_at_func_entry+0x28/0x80
 ...
  ? die_addr+0x36/0x90
  ? exc_general_protection+0x217/0x420
  ? asm_exc_general_protection+0x26/0x30
  ? is_uprobe_at_func_entry+0x28/0x80
  perf_callchain_user+0x20a/0x360
  get_perf_callchain+0x147/0x1d0
  bpf_get_stackid+0x60/0x90
  bpf_prog_9aac297fb833e2f5_do_perf_event+0x434/0x53b
  ? __smp_call_single_queue+0xad/0x120
  bpf_overflow_handler+0x75/0x110
  ...
  asm_sysvec_apic_timer_interrupt+0x1a/0x20
 RIP: 0010:__kmem_cache_free+0x1cb/0x350
 ...
  ? uprobe_free_utask+0x62/0x80
  ? acct_collect+0x4c/0x220
  uprobe_free_utask+0x62/0x80
  mm_release+0x12/0xb0
  do_exit+0x26b/0xaa0
  __x64_sys_exit+0x1b/0x20
  do_syscall_64+0x5a/0x80

It can be easily reproduced by running following commands in
separate terminals:

  # while :; do bpftrace -e 'uprobe:/bin/ls:_start  { printf("hit\n"); }' -c ls; done
  # bpftrace -e 'profile:hz:100000 { @[ustack()] = count(); }'

Fixing this by making sure current->utask pointer is set to NULL
before we start to release the utask object.

[1] https://github.com/grafana/pyroscope/issues/3673

Fixes: cfa7f3d2c526 ("perf,x86: avoid missing caller address in stack traces captured in uprobe")
Reported-by: Max Makarov <maxpain@linux.com>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Oleg Nesterov <oleg@redhat.com>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20250109141440.2692173-1-jolsa@kernel.org
---
 kernel/events/uprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index fa04b14..5d71ef8 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1915,6 +1915,7 @@ void uprobe_free_utask(struct task_struct *t)
 	if (!utask)
 		return;
 
+	t->utask = NULL;
 	WARN_ON_ONCE(utask->active_uprobe || utask->xol_vaddr);
 
 	timer_delete_sync(&utask->ri_timer);
@@ -1924,7 +1925,6 @@ void uprobe_free_utask(struct task_struct *t)
 		ri = free_ret_instance(ri, true /* cleanup_hprobe */);
 
 	kfree(utask);
-	t->utask = NULL;
 }
 
 #define RI_TIMER_PERIOD (HZ / 10) /* 100 ms */

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2025-01-10  8:40 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-01-09 14:14 [PATCH] uprobes: Fix race in uprobe_free_utask Jiri Olsa
2025-01-09 14:24 ` Oleg Nesterov
2025-01-09 14:41 ` Daniel Borkmann
2025-01-09 20:49   ` Jiri Olsa
2025-01-09 22:13 ` Andrii Nakryiko
2025-01-10  8:40 ` [tip: perf/urgent] " tip-bot2 for Jiri Olsa

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.