Date: Fri, 16 Jan 2026 16:11:09 -0800
From: Namhyung Kim
To: Peter Zijlstra, Ingo Molnar
Cc: Mark Rutland, Alexander Shishkin, Arnaldo Carvalho de Melo, LKML, Rosalie Fang
Subject: Re: [PATCH v2] perf/core: Fix slow perf_event_task_exit() with LBR callstacks
References: <20260114180130.133766-1-namhyung@kernel.org>
In-Reply-To: <20260114180130.133766-1-namhyung@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Peter,

On Wed, Jan 14, 2026 at 10:01:30AM -0800, Namhyung Kim wrote:
> I got a report that a task is stuck in perf_event_exit_task() waiting
> for global_ctx_data_rwsem. On large systems with lots of threads, it'd
> have performance issues when it grabs the lock to iterate all threads
> in the system to allocate the context data.
>
> And it'd block task exit path which is problematic especially under
> memory pressure.
>
>   perf_event_open
>     perf_event_alloc
>       attach_perf_ctx_data
>         attach_global_ctx_data
>           percpu_down_write (global_ctx_data_rwsem)
>             for_each_process_thread
>               alloc_task_ctx_data
>   do_exit
>     perf_event_exit_task
>       percpu_down_read (global_ctx_data_rwsem)
>
> It should not hold the global_ctx_data_rwsem on the exit path. Let's
> skip allocation for exiting tasks and free the data carefully.
>
> Reported-by: Rosalie Fang
> Suggested-by: Peter Zijlstra
> Signed-off-by: Namhyung Kim

It seems you merged v1 which has a sparse warning.

Thanks,
Namhyung

> ---
>  kernel/events/core.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 376fb07d869b8b50..b164e884102323f5 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -5424,6 +5424,17 @@ attach_task_ctx_data(struct task_struct *task, struct kmem_cache *ctx_cache,
>  		if (try_cmpxchg((struct perf_ctx_data **)&task->perf_ctx_data, &old, cd)) {
>  			if (old)
>  				perf_free_ctx_data_rcu(old);
> +			/*
> +			 * Above try_cmpxchg() pairs with try_cmpxchg() from
> +			 * detach_task_ctx_data() such that
> +			 * if we race with perf_event_exit_task(), we must
> +			 * observe PF_EXITING.
> +			 */
> +			if (task->flags & PF_EXITING) {
> +				/* detach_task_ctx_data() may free it already */
> +				if (try_cmpxchg(&task->perf_ctx_data, &cd, NULL))
> +					perf_free_ctx_data_rcu(cd);
> +			}
>  			return 0;
>  		}
>
> @@ -5469,6 +5480,8 @@ attach_global_ctx_data(struct kmem_cache *ctx_cache)
>  	/* Allocate everything */
>  	scoped_guard (rcu) {
>  		for_each_process_thread(g, p) {
> +			if (p->flags & PF_EXITING)
> +				continue;
>  			cd = rcu_dereference(p->perf_ctx_data);
>  			if (cd && !cd->global) {
>  				cd->global = 1;
> @@ -14562,8 +14575,11 @@ void perf_event_exit_task(struct task_struct *task)
>
>  	/*
>  	 * Detach the perf_ctx_data for the system-wide event.
> +	 *
> +	 * Done without holding global_ctx_data_rwsem; typically
> +	 * attach_global_ctx_data() will skip over this task, but otherwise
> +	 * attach_task_ctx_data() will observe PF_EXITING.
>  	 */
> -	guard(percpu_read)(&global_ctx_data_rwsem);
>  	detach_task_ctx_data(task);
>  }
>
> --
> 2.52.0.457.g6b5491de43-goog
>
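
For readers following the thread, the attach/detach handshake in the quoted patch can be sketched in userspace with C11 atomics. This is only an illustration, not the kernel code: struct task, struct ctx_data, attach_ctx() and detach_ctx() are hypothetical stand-ins, the exiting flag plays the role of PF_EXITING, and plain free() replaces perf_free_ctx_data_rcu(), so the sketch is only meant to be exercised from a single thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct perf_ctx_data. */
struct ctx_data {
	int global;
};

/* Hypothetical stand-in for the relevant task_struct fields. */
struct task {
	atomic_bool exiting;            /* plays the role of PF_EXITING */
	_Atomic(struct ctx_data *) ctx; /* plays the role of perf_ctx_data */
};

/*
 * Install a context if none is present. If the task turns out to be
 * exiting, undo our own install, unless the exit path already took it.
 */
static int attach_ctx(struct task *t)
{
	struct ctx_data *old = NULL;
	struct ctx_data *cd;

	assert(t != NULL);
	cd = calloc(1, sizeof(*cd));
	if (!cd)
		return -1;

	if (!atomic_compare_exchange_strong(&t->ctx, &old, cd)) {
		free(cd); /* a context is already installed, keep it */
		return 0;
	}

	/* The install succeeded, so re-check the exit flag. */
	if (atomic_load(&t->exiting)) {
		struct ctx_data *expected = cd;

		/* Only the side whose exchange succeeds frees the data. */
		if (atomic_compare_exchange_strong(&t->ctx, &expected, NULL))
			free(cd);
	}
	return 0;
}

/* Exit path: take whatever is installed, without any global lock. */
static void detach_ctx(struct task *t)
{
	struct ctx_data *cd;

	assert(t != NULL);
	cd = atomic_load(&t->ctx);
	while (cd && !atomic_compare_exchange_weak(&t->ctx, &cd, NULL))
		; /* cd was reloaded by the failed exchange, retry */
	free(cd); /* free(NULL) is a no-op */
}
```

Calling attach_ctx() on a task whose exiting flag is already set leaves ctx as NULL, which mirrors the intent of the PF_EXITING check: either the exit path or the attach path clears the pointer, and whichever exchange succeeds is the one that frees it. The kernel version defers the free through RCU, which this sketch leaves out.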