From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D2CCE3BD228 for ; Wed, 24 Jun 2026 14:23:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782310994; cv=none; b=GdDeufczCrFLvcPAA5HYLiB/U9Y2D+2CELxdgKLJnqEbzLIi+UliBBhrhhN+LfPTXd5vn4P1YdWp/It8j+bnkN9Zevt//laUGUQIv4KsQnMtBdydGm8oYuC/n08gXOEs54t0mVARAhXII1ArE3/YaSNr5fUzIi+GoKnMXC4fa34= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1782310994; c=relaxed/simple; bh=jJyJKz/Z5heQz3UaZQwvMBIZf4pUmmWhaAQpIDQtPa0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=GQJgz5i+cBREpiHCMNMC5M4PdNoFl1C5bEUPwEuGT1dCG/Hb9OmhqVH8HRQL1aiR7dsjOLn6j3/N2qlsdumDOcskcLdqxeMheOzEtnGupcl97VblyQofvgy11zxg69NeKqEuNohYCWe8arhhjWYj1tmVxRtMuxTcWtG3uROIELk= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=R9lupM8B; arc=none smtp.client-ip=100.103.45.18 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="R9lupM8B" Received: by smtp.kernel.org (Postfix) with ESMTPSA id ED2FD1F000E9; Wed, 24 Jun 2026 14:23:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org; s=k20260515; t=1782310993; bh=WLMupxBZwRCVLTY4WQy1mEXLNBRM5yd9s6atKbpiNdA=; h=Date:From:To:Cc:Subject:References:In-Reply-To; b=R9lupM8BqHwWQQJwtSAxanv9TtiBCcbXoNuIy9jKW6PLooFAcWnTLJpMoUqpY5266 gtW4PlqhNPp0r5p84RQtOAminT1CAE13dfKyhoRLTgOhjny7oRuNsHxlOgcVb37veZ r2bWDAyp6hvbq6l2OShzKnEnxTfVtDFzB7JilMpyUADoN7G94kO4S6W/yuUmV2fuTe 0IeRxhHbpqCi5ciw4lC2MU8WOjXoPN7XGuV1/EziPbXkY0XDo+y+KnR1Iy21QZywgF r/wlqH3IOnsY6weKIp+63FTaDlZ0yLHwDCKO1wQ/KjNG1W1Kp0EeYPSlN4pbrjmftO kJZJW/l5P8Ecw== Date: Wed, 24 Jun 2026 15:23:05 +0100 From: Lorenzo Stoakes To: Xuewen Wang Cc: akpm@linux-foundation.org, david@kernel.org, liam@infradead.org, vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v1] mm: fix data-race in folio_batch_count() Message-ID: References: <20260624092606.1083449-1-wangxuewen@kylinos.cn> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20260624092606.1083449-1-wangxuewen@kylinos.cn> On Wed, Jun 24, 2026 at 05:26:06PM +0800, Xuewen Wang wrote: > KCSAN reports: > > BUG: KCSAN: data-race in __lru_add_drain_all / folio_batch_add_and_move Where? A syzbot report? A local run? Please specify. > > write to 0xffff98fe74c015f8 of 1 bytes by task 45153 on cpu 2: > folio_batch_add+0x30/0xe0 > > read to 0xffff98fe74c015f8 of 1 bytes by task 45175 on cpu 0: > folio_batch_count+0x0/0x10 > cpu_needs_drain+0x253/0x430 > > The write side is a per-cpu local operation (folio_batch_add on the > CPU that owns the per-cpu batch), while cpu_needs_drain() reads > another CPU's per-cpu batch without locking. Reading a slightly stale > value is harmless -- it only determines whether to schedule a drain, Then why are we adding a READ_ONCE() in such a core helper? > and a subsequent check will catch it. Where? Which check? Be specific. > > Use READ_ONCE() to annotate the read and prevent load tearing, which > also suppresses the KCSAN warning. Tearing on a single byte? Which architecture tears a single byte? I think you're actually more concerned about the value being optimised out on assumption of it not being updated elsewhere right? But acutally you're not, because everybody else uses a stack variable or _their own_ per-CPU value? Only cpu_needs_drain() is the odd one out right? > > Signed-off-by: Xuewen Wang > --- > include/linux/folio_batch.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/include/linux/folio_batch.h b/include/linux/folio_batch.h > index b45946adc50b..1e31e058e19d 100644 > --- a/include/linux/folio_batch.h > +++ b/include/linux/folio_batch.h > @@ -53,7 +53,7 @@ static inline void folio_batch_reinit(struct folio_batch *fbatch) > > static inline unsigned int folio_batch_count(const struct folio_batch *fbatch) > { > - return fbatch->nr; > + return READ_ONCE(fbatch->nr); This isn't free, you're breaking optimisations here by doing that... It feels like the wrong level of abstraction, but actually I think every other case is either stack or per-CPU _on its own CPU_ (please check), in which case we _can_ suppress the check here but I think best done with data_race(). And see the kernel bug bot report, you need to add: #include Too for that. > } > > static inline unsigned int folio_batch_space(const struct folio_batch *fbatch) > -- > 2.25.1 > Thanks, Lorenzo