Date: Tue, 7 Apr 2026 09:58:11 +0200
From: Peter Zijlstra
To: Rik van Riel, Tejun Heo
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com, Ingo Molnar, Juri Lelli, Vincent Guittot, Steven Rostedt
Subject: Re: [PATCH] sched/cpuacct: fix use-after-free in cpuacct_account_field()
Message-ID: <20260407075811.GB3738010@noisy.programming.kicks-ass.net>
References: <20260404224742.56d8df3e@fangorn>
In-Reply-To: <20260404224742.56d8df3e@fangorn>

On Sat, Apr 04, 2026 at 10:47:42PM -0400, Rik van Riel wrote:
> cpuacct_css_free() calls free_percpu() on ca->cpustat and ca->cpuusage,
> then kfree(ca). However, a timer interrupt on another CPU can
> concurrently access this data through cpuacct_account_field(), which
> walks the cpuacct hierarchy via task_ca()/parent_ca() and performs
> __this_cpu_add(ca->cpustat->cpustat[index], val).
>
> The race window exists because put_css_set_locked() drops the CSS
> reference (css_put) before the css_set is RCU-freed (kfree_rcu). This
> means the CSS percpu_ref can reach zero and trigger the css_free chain
> while readers obtained the CSS pointer from the old css_set that is
> still visible via RCU.
>
> Although css_free_rwork_fn is already called after one RCU grace period,
> the css_set -> CSS reference drop in put_css_set_locked() creates a
> window where the CSS free chain races with readers still holding the
> old css_set reference.

To me this reads like a cgroup fail, not a cpuacct fail per se. But I'm
forever confused there. TJ?

> With KASAN enabled, free_percpu() unmaps shadow pages, so the
> KASAN-instrumented __this_cpu_add hits an unmapped shadow page
> (PMD=0), causing a page fault in IRQ context that cascades into an
> IRQ stack overflow.
>
> Fix this by deferring the actual freeing of percpu data and the cpuacct
> struct to an RCU callback via call_rcu(), ensuring that all concurrent
> readers in RCU read-side critical sections (including timer tick
> handlers) have completed before the memory is freed.
>
> Found in an AI driven syzkaller run. The bug did not repeat in the
> 14 hours since this patch was applied.
>
> Signed-off-by: Rik van Riel
> Assisted-by: Claude:claude-opus-4.6 syzkaller
> Fixes: 3eba0505d03a ("sched/cpuacct: Remove redundant RCU read lock")
> Cc: stable@kernel.org
> ---
>  kernel/sched/cpuacct.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index ca9d52cb1ebb..b6e7b34de616 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -28,6 +28,7 @@ struct cpuacct {
>  	/* cpuusage holds pointer to a u64-type object on every CPU */
>  	u64 __percpu *cpuusage;
>  	struct kernel_cpustat __percpu *cpustat;
> +	struct rcu_head rcu;
>  };
>  
>  static inline struct cpuacct *css_ca(struct cgroup_subsys_state *css)
> @@ -84,15 +85,22 @@ cpuacct_css_alloc(struct cgroup_subsys_state *parent_css)
>  }
>  
>  /* Destroy an existing CPU accounting group */
> -static void cpuacct_css_free(struct cgroup_subsys_state *css)
> +static void cpuacct_free_rcu(struct rcu_head *rcu)
>  {
> -	struct cpuacct *ca = css_ca(css);
> +	struct cpuacct *ca = container_of(rcu, struct cpuacct, rcu);
>  
>  	free_percpu(ca->cpustat);
>  	free_percpu(ca->cpuusage);
>  	kfree(ca);
>  }
>  
> +static void cpuacct_css_free(struct cgroup_subsys_state *css)
> +{
> +	struct cpuacct *ca = css_ca(css);
> +
> +	call_rcu(&ca->rcu, cpuacct_free_rcu);
> +}
> +
>  static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu,
>  				 enum cpuacct_stat_index index)
>  {
> -- 
> 2.52.0