Re: [BUG] perf/core: Task stuck on global_ctx_data

public inbox for linux-kernel@vger.kernel.org

* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
       [not found] <aUnVfxDtLNUDJM_v@google.com>
@ 2025-12-22 23:36 ` Namhyung Kim
  2026-01-06 22:34   ` Namhyung Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2025-12-22 23:36 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Ian Rogers, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel

Added a subject prefix and CC'd LKML.

Thanks,
Namhyung

On Mon, Dec 22, 2025 at 03:34:23PM -0800, Namhyung Kim wrote:
> Hello,
> 
> I got a report that a task is stuck in perf_event_exit_task() waiting
> for global_ctx_data_rwsem.  On large systems, it'd have performance
> issues when it grabs the lock to iterate all threads in the system to
> allocate the context data.  And it'd block task exit path which is
> problematic especially under memory pressure.
> 
>   perf_event_open
>     perf_event_alloc
>       attach_perf_ctx_data
>         attach_global_ctx_data
>           percpu_down_write (global_ctx_data_rwsem)
>             for_each_process_thread
>               alloc_task_ctx_data
>                                                do_exit
>                                                  perf_event_exit_task
>                                                    percpu_down_read (global_ctx_data_rwsem)
> 
> I think attach_global_ctx_data() should skip tasks with PF_EXITING and
> it'd be nice if perf_event_exit_task() could release the ctx_data
> unconditionally.  But I'm not sure how to synchronize them properly.
> 
> Any thoughts?
> 
> Thanks,
> Namhyung
> 


* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2025-12-22 23:36 ` [BUG] perf/core: Task stuck on global_ctx_data_rwsem Namhyung Kim
@ 2026-01-06 22:34   ` Namhyung Kim
  2026-01-07  9:16     ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2026-01-06 22:34 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar
  Cc: Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Ian Rogers, Adrian Hunter, James Clark,
	linux-perf-users, linux-kernel

Hello,

On Mon, Dec 22, 2025 at 03:36:53PM -0800, Namhyung Kim wrote:
> On Mon, Dec 22, 2025 at 03:34:23PM -0800, Namhyung Kim wrote:
> > Hello,
> > 
> > I got a report that a task is stuck in perf_event_exit_task() waiting
> > for global_ctx_data_rwsem.  On large systems, it'd have performance
> > issues when it grabs the lock to iterate all threads in the system to
> > allocate the context data.  And it'd block task exit path which is
> > problematic especially under memory pressure.
> > 
> >   perf_event_open
> >     perf_event_alloc
> >       attach_perf_ctx_data
> >         attach_global_ctx_data
> >           percpu_down_write (global_ctx_data_rwsem)
> >             for_each_process_thread
> >               alloc_task_ctx_data
> >                                                do_exit
> >                                                  perf_event_exit_task
> >                                                    percpu_down_read (global_ctx_data_rwsem)
> > 
> > I think attach_global_ctx_data() should skip tasks with PF_EXITING and
> > it'd be nice if perf_event_exit_task() could release the ctx_data
> > unconditionally.  But I'm not sure how to synchronize them properly.
> > 
> > Any thoughts?

I'm curious if this makes any sense..  I feel like it needs to check the
flag again before allocation.

Thanks,
Namhyung


diff --git a/kernel/events/core.c b/kernel/events/core.c
index 376fb07d869b8b50..2a8847e95d7eb698 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5469,6 +5469,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
 	/* Allocate everything */
 	scoped_guard (rcu) {
 		for_each_process_thread(g, p) {
+			if (p->flags & PF_EXITING)
+				continue;
 			cd = rcu_dereference(p->perf_ctx_data);
 			if (cd && !cd->global) {
 				cd->global = 1;
@@ -14563,7 +14565,6 @@ void perf_event_exit_task(struct task_struct *task)
 	/*
 	 * Detach the perf_ctx_data for the system-wide event.
 	 */
-	guard(percpu_read)(&global_ctx_data_rwsem);
 	detach_task_ctx_data(task);
 }
 


* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2026-01-06 22:34   ` Namhyung Kim
@ 2026-01-07  9:16     ` Peter Zijlstra
  2026-01-07 19:01       ` Namhyung Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2026-01-07  9:16 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, linux-perf-users, linux-kernel

On Tue, Jan 06, 2026 at 02:34:40PM -0800, Namhyung Kim wrote:
> Hello,
> 
> On Mon, Dec 22, 2025 at 03:36:53PM -0800, Namhyung Kim wrote:
> > On Mon, Dec 22, 2025 at 03:34:23PM -0800, Namhyung Kim wrote:
> > > Hello,
> > > 
> > > I got a report that a task is stuck in perf_event_exit_task() waiting
> > > for global_ctx_data_rwsem.  On large systems, it'd have performance
> > > issues when it grabs the lock to iterate all threads in the system to
> > > allocate the context data.  And it'd block task exit path which is
> > > problematic especially under memory pressure.
> > > 
> > >   perf_event_open
> > >     perf_event_alloc
> > >       attach_perf_ctx_data
> > >         attach_global_ctx_data
> > >           percpu_down_write (global_ctx_data_rwsem)
> > >             for_each_process_thread
> > >               alloc_task_ctx_data
> > >                                                do_exit
> > >                                                  perf_event_exit_task
> > >                                                    percpu_down_read (global_ctx_data_rwsem)
> > > 
> > > I think attach_global_ctx_data() should skip tasks with PF_EXITING and
> > > it'd be nice if perf_event_exit_task() could release the ctx_data
> > > unconditionally.  But I'm not sure how to synchronize them properly.
> > > 
> > > Any thoughts?
> 
> I'm curious if this makes any sense..  I feel like it needs to check the
> flag again before allocation.
> 
> Thanks,
> Namhyung
> 
> 
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 376fb07d869b8b50..2a8847e95d7eb698 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5469,6 +5469,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
>  	/* Allocate everything */
>  	scoped_guard (rcu) {
>  		for_each_process_thread(g, p) {
> +			if (p->flags & PF_EXITING)
> +				continue;
>  			cd = rcu_dereference(p->perf_ctx_data);
>  			if (cd && !cd->global) {
>  				cd->global = 1;

I suppose this makes sense.

> @@ -14563,7 +14565,6 @@ void perf_event_exit_task(struct task_struct *task)
>  	/*
>  	 * Detach the perf_ctx_data for the system-wide event.
>  	 */
> -	guard(percpu_read)(&global_ctx_data_rwsem);
>  	detach_task_ctx_data(task);
>  }

This would need a comment; something like:

	/*
	 * This can be done without holding global_ctx_data_rwsem
	 * because this is done after setting PF_EXITING such that
	 * attach_global_ctx_data() will skip over this task.
	 */
	WARN_ON_ONCE(!(task->flags & PF_EXITING));

But yes, I suppose this can do. The question is however, how do you get
into this predicament to begin with? Are you creating and destroying a
lot of global LBR events or something?

Would it make sense to delay detach_global_ctx_data() for a second or
so? That is, what is your event creation pattern?


* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2026-01-07  9:16     ` Peter Zijlstra
@ 2026-01-07 19:01       ` Namhyung Kim
  2026-01-07 22:28         ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Namhyung Kim @ 2026-01-07 19:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, linux-perf-users, linux-kernel

On Wed, Jan 07, 2026 at 10:16:52AM +0100, Peter Zijlstra wrote:
> On Tue, Jan 06, 2026 at 02:34:40PM -0800, Namhyung Kim wrote:
> > Hello,
> > 
> > On Mon, Dec 22, 2025 at 03:36:53PM -0800, Namhyung Kim wrote:
> > > On Mon, Dec 22, 2025 at 03:34:23PM -0800, Namhyung Kim wrote:
> > > > Hello,
> > > > 
> > > > I got a report that a task is stuck in perf_event_exit_task() waiting
> > > > for global_ctx_data_rwsem.  On large systems, it'd have performance
> > > > issues when it grabs the lock to iterate all threads in the system to
> > > > allocate the context data.  And it'd block task exit path which is
> > > > problematic especially under memory pressure.
> > > > 
> > > >   perf_event_open
> > > >     perf_event_alloc
> > > >       attach_perf_ctx_data
> > > >         attach_global_ctx_data
> > > >           percpu_down_write (global_ctx_data_rwsem)
> > > >             for_each_process_thread
> > > >               alloc_task_ctx_data
> > > >                                                do_exit
> > > >                                                  perf_event_exit_task
> > > >                                                    percpu_down_read (global_ctx_data_rwsem)
> > > > 
> > > > I think attach_global_ctx_data() should skip tasks with PF_EXITING and
> > > > it'd be nice if perf_event_exit_task() could release the ctx_data
> > > > unconditionally.  But I'm not sure how to synchronize them properly.
> > > > 
> > > > Any thoughts?
> > 
> > I'm curious if this makes any sense..  I feel like it needs to check the
> > flag again before allocation.
> > 
> > Thanks,
> > Namhyung
> > 
> > 
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 376fb07d869b8b50..2a8847e95d7eb698 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5469,6 +5469,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
> >  	/* Allocate everything */
> >  	scoped_guard (rcu) {
> >  		for_each_process_thread(g, p) {
> > +			if (p->flags & PF_EXITING)
> > +				continue;
> >  			cd = rcu_dereference(p->perf_ctx_data);
> >  			if (cd && !cd->global) {
> >  				cd->global = 1;
> 
> I suppose this makes sense.
> 
> > @@ -14563,7 +14565,6 @@ void perf_event_exit_task(struct task_struct *task)
> >  	/*
> >  	 * Detach the perf_ctx_data for the system-wide event.
> >  	 */
> > -	guard(percpu_read)(&global_ctx_data_rwsem);
> >  	detach_task_ctx_data(task);
> >  }
> 
> This would need a comment; something like:
> 
> 	/*
> 	 * This can be done without holding global_ctx_data_rwsem
> 	 * because this is done after setting PF_EXITING such that
> 	 * attach_global_ctx_data() will skip over this task.
> 	 */
> 	WARN_ON_ONCE(!(task->flags & PF_EXITING));
> 
> But yes, I suppose this can do. The question is however, how do you get
> into this predicament to begin with? Are you creating and destroying a
> lot of global LBR events or something?

I think it's just because there are too many tasks in the system like
O(100K).  And any thread going to exit needs to wait for
attach_global_ctx_data() to finish the iteration over every task.

> 
> Would it make sense to delay detach_global_ctx_data() for a second or
> so? That is, what is your event creation pattern?

I don't think it has a special pattern, but I'm curious how we can
handle a race like below.

  attach_global_ctx_data
    check p->flags & PF_EXITING
                                              do_exit
    (preemption)                                set PF_EXITING
                                                detach_task_ctx_data()
    check p->perf_ctx_data
    attach_task_ctx_data()   ---> memory leak

Thanks,
Namhyung
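
The interleaving in the diagram above can be replayed deterministically in a
small userspace model.  This is only a sketch under stated assumptions:
struct model_task, model_exit() and replay_race() are hypothetical stand-ins
made up for illustration, not kernel code, and the "preemption" is simulated
by calling the exit path by hand between the two checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct; not the real definition. */
struct model_task {
	bool exiting;	/* models PF_EXITING */
	void *ctx_data;	/* models task->perf_ctx_data */
};

static int live_allocs;	/* allocations not yet freed */

static void *model_alloc(void)
{
	live_allocs++;
	return malloc(1);
}

static void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Exit side: set the flag, then detach whatever is attached right now. */
static void model_exit(struct model_task *t)
{
	t->exiting = true;
	model_free(t->ctx_data);
	t->ctx_data = NULL;
}

/*
 * Replay the diagram: the attach side samples the flag, the exit side
 * then runs to completion, and only afterwards does the attach side
 * publish its allocation.  Returns how many allocations nobody owns.
 */
static int replay_race(void)
{
	struct model_task t = { false, NULL };
	bool skip;
	int leaked;

	live_allocs = 0;

	skip = t.exiting;			/* check p->flags & PF_EXITING */
	model_exit(&t);				/* (preemption) exit completes */
	if (!skip && !t.ctx_data)		/* check p->perf_ctx_data */
		t.ctx_data = model_alloc();	/* attach; exit is already done */

	leaked = live_allocs;
	model_free(t.ctx_data);			/* tidy up the model itself */
	return leaked;
}
```

In this model replay_race() reports one allocation that the exit path never
gets a chance to free, which is the leak the diagram points at.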



* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2026-01-07 19:01       ` Namhyung Kim
@ 2026-01-07 22:28         ` Peter Zijlstra
  2026-01-07 22:32           ` Peter Zijlstra
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2026-01-07 22:28 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, linux-perf-users, linux-kernel

On Wed, Jan 07, 2026 at 11:01:53AM -0800, Namhyung Kim wrote:

> > But yes, I suppose this can do. The question is however, how do you get
> > into this predicament to begin with? Are you creating and destroying a
> > lot of global LBR events or something?
> 
> I think it's just because there are too many tasks in the system like
> O(100K).  And any thread going to exit needs to wait for
> attach_global_ctx_data() to finish the iteration over every task.

OMG, so many tasks ...

> > Would it make sense to delay detach_global_ctx_data() for a second or
> > so? That is, what is your event creation pattern?
> 
> I don't think it has a special pattern, but I'm curious how we can
> handle a race like below.
> 
>   attach_global_ctx_data
>     check p->flags & PF_EXITING
>                                               do_exit
>     (preemption)                                set PF_EXITING
>                                                 detach_task_ctx_data()
>     check p->perf_ctx_data
>     attach_task_ctx_data()   ---> memory leak

Oh right. Something like so perhaps?

---
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 3c2a491200c6..e5e716420eb3 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5421,9 +5421,19 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
 		return -ENOMEM;
 
 	for (;;) {
-		if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
+		if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) {
 			if (old)
 				perf_free_ctx_data_rcu(old);
+			/*
+			 * try_cmpxchg() pairs with try_cmpxchg() from
+			 * detach_task_ctx_data() such that
+			 * if we race with perf_event_exit_task(), we must
+			 * observe PF_EXITING.
+			 */
+			if (task->flags & PF_EXITING) {
+				task->perf_ctx_data = NULL;
+				perf_free_ctx_data_rcu(cd);
+			}
 			return 0;
 		}
 
@@ -5469,6 +5479,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
 	/* Allocate everything */
 	scoped_guard (rcu) {
 		for_each_process_thread(g, p) {
+			if (p->flags & PF_EXITING)
+				continue;
 			cd = rcu_dereference(p->perf_ctx_data);
 			if (cd && !cd->global) {
 				cd->global = 1;
@@ -14568,8 +14580,11 @@ void perf_event_exit_task(struct task_struct *task)
 
 	/*
 	 * Detach the perf_ctx_data for the system-wide event.
+	 *
+	 * Done without holding global_ctx_data_rwsem; typically
+	 * attach_global_ctx_data() will skip over this task, but otherwise
+	 * attach_task_ctx_data() will observe PF_EXITING.
 	 */
-	guard(percpu_read)(&global_ctx_data_rwsem);
 	detach_task_ctx_data(task);
 }
 


* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2026-01-07 22:28         ` Peter Zijlstra
@ 2026-01-07 22:32           ` Peter Zijlstra
  2026-01-08 19:56             ` Namhyung Kim
  0 siblings, 1 reply; 7+ messages in thread
From: Peter Zijlstra @ 2026-01-07 22:32 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, linux-perf-users, linux-kernel

On Wed, Jan 07, 2026 at 11:28:24PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 07, 2026 at 11:01:53AM -0800, Namhyung Kim wrote:
> 
> > > But yes, I suppose this can do. The question is however, how do you get
> > > into this predicament to begin with? Are you creating and destroying a
> > > lot of global LBR events or something?
> > 
> > I think it's just because there are too many tasks in the system like
> > O(100K).  And any thread going to exit needs to wait for
> > attach_global_ctx_data() to finish the iteration over every task.
> 
> OMG, so many tasks ...
> 
> > > Would it make sense to delay detach_global_ctx_data() for a second or
> > > so? That is, what is your event creation pattern?
> > 
> > I don't think it has a special pattern, but I'm curious how we can
> > handle a race like below.
> > 
> >   attach_global_ctx_data
> >     check p->flags & PF_EXITING
> >                                               do_exit
> >     (preemption)                                set PF_EXITING
> >                                                 detach_task_ctx_data()
> >     check p->perf_ctx_data
> >     attach_task_ctx_data()   ---> memory leak
> 
> Oh right. Something like so perhaps?
> 
> ---
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 3c2a491200c6..e5e716420eb3 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5421,9 +5421,19 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
>  		return -ENOMEM;
>  
>  	for (;;) {
> -		if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
> +		if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) {
>  			if (old)
>  				perf_free_ctx_data_rcu(old);
> +			/*
> +			 * try_cmpxchg() pairs with try_cmpxchg() from
> +			 * detach_task_ctx_data() such that
> +			 * if we race with perf_event_exit_task(), we must
> +			 * observe PF_EXITING.
> +			 */
> +			if (task->flags & PF_EXITING) {
> +				task->perf_ctx_data = NULL;
> +				perf_free_ctx_data_rcu(cd);

Ugh and now it can race and do a double free, another try_cmpxchg() is
needed here.

> +			}
>  			return 0;
>  		}
>  
> @@ -5469,6 +5479,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
>  	/* Allocate everything */
>  	scoped_guard (rcu) {
>  		for_each_process_thread(g, p) {
> +			if (p->flags & PF_EXITING)
> +				continue;
>  			cd = rcu_dereference(p->perf_ctx_data);
>  			if (cd && !cd->global) {
>  				cd->global = 1;
> @@ -14568,8 +14580,11 @@ void perf_event_exit_task(struct task_struct *task)
>  
>  	/*
>  	 * Detach the perf_ctx_data for the system-wide event.
> +	 *
> +	 * Done without holding global_ctx_data_rwsem; typically
> +	 * attach_global_ctx_data() will skip over this task, but otherwise
> +	 * attach_task_ctx_data() will observe PF_EXITING.
>  	 */
> -	guard(percpu_read)(&global_ctx_data_rwsem);
>  	detach_task_ctx_data(task);
>  }
>  


* Re: [BUG] perf/core: Task stuck on global_ctx_data_rwsem
  2026-01-07 22:32           ` Peter Zijlstra
@ 2026-01-08 19:56             ` Namhyung Kim
  0 siblings, 0 replies; 7+ messages in thread
From: Namhyung Kim @ 2026-01-08 19:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Ian Rogers, Adrian Hunter,
	James Clark, linux-perf-users, linux-kernel

On Wed, Jan 07, 2026 at 11:32:56PM +0100, Peter Zijlstra wrote:
> On Wed, Jan 07, 2026 at 11:28:24PM +0100, Peter Zijlstra wrote:
> > On Wed, Jan 07, 2026 at 11:01:53AM -0800, Namhyung Kim wrote:
> > 
> > > > But yes, I suppose this can do. The question is however, how do you get
> > > > into this predicament to begin with? Are you creating and destroying a
> > > > lot of global LBR events or something?
> > > 
> > > I think it's just because there are too many tasks in the system like
> > > O(100K).  And any thread going to exit needs to wait for
> > > attach_global_ctx_data() to finish the iteration over every task.
> > 
> > OMG, so many tasks ...
> > 
> > > > Would it make sense to delay detach_global_ctx_data() for a second or
> > > > so? That is, what is your event creation pattern?
> > > 
> > > I don't think it has a special pattern, but I'm curious how we can
> > > handle a race like below.
> > > 
> > >   attach_global_ctx_data
> > >     check p->flags & PF_EXITING
> > >                                               do_exit
> > >     (preemption)                                set PF_EXITING
> > >                                                 detach_task_ctx_data()
> > >     check p->perf_ctx_data
> > >     attach_task_ctx_data()   ---> memory leak
> > 
> > Oh right. Something like so perhaps?
> > 
> > ---
> > diff --git a/kernel/events/core.c b/kernel/events/core.c
> > index 3c2a491200c6..e5e716420eb3 100644
> > --- a/kernel/events/core.c
> > +++ b/kernel/events/core.c
> > @@ -5421,9 +5421,19 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
> >  		return -ENOMEM;
> >  
> >  	for (;;) {
> > -		if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
> > +		if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) {
> >  			if (old)
> >  				perf_free_ctx_data_rcu(old);
> > +			/*
> > +			 * try_cmpxchg() pairs with try_cmpxchg() from
> > +			 * detach_task_ctx_data() such that
> > +			 * if we race with perf_event_exit_task(), we must
> > +			 * observe PF_EXITING.
> > +			 */
> > +			if (task->flags & PF_EXITING) {
> > +				task->perf_ctx_data = NULL;
> > +				perf_free_ctx_data_rcu(cd);
> 
> Ugh and now it can race and do a double free, another try_cmpxchg() is
> needed here.

Thanks!  Something like this?

Namhyung


diff --git a/kernel/events/core.c b/kernel/events/core.c
index 376fb07d869b8b50..cf252d8f49b2b259 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5421,9 +5421,20 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
 		return -ENOMEM;
 
 	for (;;) {
-		if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
+		if (try_cmpxchg(&task->perf_ctx_data, &old, cd)) {
 			if (old)
 				perf_free_ctx_data_rcu(old);
+			/*
+			 * try_cmpxchg() pairs with try_cmpxchg() from
+			 * detach_task_ctx_data() such that
+			 * if we race with perf_event_exit_task(), we must
+			 * observe PF_EXITING.
+			 */
+			if (task->flags & PF_EXITING) {
+				/* detach_task_ctx_data() may free it already */
+				if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL))
+					perf_free_ctx_data_rcu(cd);
+			}
 			return 0;
 		}
 
@@ -5469,6 +5480,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
 	/* Allocate everything */
 	scoped_guard (rcu) {
 		for_each_process_thread(g, p) {
+			if (p->flags & PF_EXITING)
+				continue;
 			cd = rcu_dereference(p->perf_ctx_data);
 			if (cd && !cd->global) {
 				cd->global = 1;
@@ -14562,8 +14575,11 @@ void perf_event_exit_task(struct task_struct *task)
 
 	/*
 	 * Detach the perf_ctx_data for the system-wide event.
+	 *
+	 * Done without holding global_ctx_data_rwsem; typically
+	 * attach_global_ctx_data() will skip over this task, but otherwise
+	 * attach_task_ctx_data() will observe PF_EXITING.
 	 */
-	guard(percpu_read)(&global_ctx_data_rwsem);
 	detach_task_ctx_data(task);
 }
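
The ownership rule in the patch above (publish with a compare-and-swap,
re-check the exiting flag, and free only if a second compare-and-swap takes
the pointer back) can be sketched in userspace with C11 atomics.  Everything
here is an assumption made for illustration: struct model_task and the
model_*() helpers are hypothetical, atomic_exchange() stands in for the
try_cmpxchg() in detach_task_ctx_data(), the "between" hook exists only so a
racing exit can be replayed deterministically, and the default seq_cst
ordering does not model the kernel's ordering rules.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for task_struct; not the real definition. */
struct model_task {
	atomic_bool exiting;		/* models PF_EXITING */
	_Atomic(void *) ctx_data;	/* models task->perf_ctx_data */
};

static atomic_int live_allocs;	/* goes negative on a double free */

static void *model_alloc(void)
{
	void *p = malloc(1);

	if (p)
		atomic_fetch_add(&live_allocs, 1);
	return p;
}

static void model_free(void *p)
{
	atomic_fetch_sub(&live_allocs, 1);
	free(p);
}

/* Exit side: set the flag, then take ownership of whatever is attached. */
static void model_detach(struct model_task *t)
{
	void *cd;

	atomic_store(&t->exiting, true);
	cd = atomic_exchange(&t->ctx_data, NULL);
	if (cd)
		model_free(cd);
}

/* Attach side; 'between' optionally runs right after the pointer is published. */
static int model_attach(struct model_task *t, void (*between)(struct model_task *))
{
	void *cd = model_alloc();
	void *expected = NULL;

	if (!cd)
		return -1;
	if (!atomic_compare_exchange_strong(&t->ctx_data, &expected, cd)) {
		model_free(cd);		/* something is already attached */
		return 0;
	}
	if (between)
		between(t);
	if (atomic_load(&t->exiting)) {
		/* Free only if we win the pointer back; else detach freed it. */
		expected = cd;
		if (atomic_compare_exchange_strong(&t->ctx_data, &expected, NULL))
			model_free(cd);
	}
	return 0;
}

/* Returns the allocations still live once attach and exit have both run. */
static int run_scenario(bool exit_first, bool exit_in_window)
{
	struct model_task t;

	atomic_init(&t.exiting, false);
	atomic_init(&t.ctx_data, NULL);
	atomic_store(&live_allocs, 0);

	if (exit_first)
		model_detach(&t);
	model_attach(&t, exit_in_window ? model_detach : NULL);
	if (!exit_first && !exit_in_window)
		model_detach(&t);
	return atomic_load(&live_allocs);
}
```

Calling run_scenario() with the exit placed before the attach, inside the
window after the publish, or after the attach should leave the counter at
zero in each case; a leak would leave it positive and a double free would
drive it negative.  The single-threaded replay only checks the ownership
logic, not the memory-ordering argument made in the patch comment.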
 

