From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-alma10-1.taild15c8.ts.net [100.103.45.18])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id D2CCE3BD228
	for <linux-kernel@vger.kernel.org>; Wed, 24 Jun 2026 14:23:13 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=100.103.45.18
ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116;
	t=1782310994; cv=none; b=GdDeufczCrFLvcPAA5HYLiB/U9Y2D+2CELxdgKLJnqEbzLIi+UliBBhrhhN+LfPTXd5vn4P1YdWp/It8j+bnkN9Zevt//laUGUQIv4KsQnMtBdydGm8oYuC/n08gXOEs54t0mVARAhXII1ArE3/YaSNr5fUzIi+GoKnMXC4fa34=
ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org;
	s=arc-20240116; t=1782310994; c=relaxed/simple;
	bh=jJyJKz/Z5heQz3UaZQwvMBIZf4pUmmWhaAQpIDQtPa0=;
	h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version:
	 Content-Type:Content-Disposition:In-Reply-To; b=GQJgz5i+cBREpiHCMNMC5M4PdNoFl1C5bEUPwEuGT1dCG/Hb9OmhqVH8HRQL1aiR7dsjOLn6j3/N2qlsdumDOcskcLdqxeMheOzEtnGupcl97VblyQofvgy11zxg69NeKqEuNohYCWe8arhhjWYj1tmVxRtMuxTcWtG3uROIELk=
ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=R9lupM8B; arc=none smtp.client-ip=100.103.45.18
Authentication-Results: smtp.subspace.kernel.org;
	dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="R9lupM8B"
Received: by smtp.kernel.org (Postfix) with ESMTPSA id ED2FD1F000E9;
	Wed, 24 Jun 2026 14:23:10 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel.org;
	s=k20260515; t=1782310993;
	bh=WLMupxBZwRCVLTY4WQy1mEXLNBRM5yd9s6atKbpiNdA=;
	h=Date:From:To:Cc:Subject:References:In-Reply-To;
	b=R9lupM8BqHwWQQJwtSAxanv9TtiBCcbXoNuIy9jKW6PLooFAcWnTLJpMoUqpY5266
	 gtW4PlqhNPp0r5p84RQtOAminT1CAE13dfKyhoRLTgOhjny7oRuNsHxlOgcVb37veZ
	 r2bWDAyp6hvbq6l2OShzKnEnxTfVtDFzB7JilMpyUADoN7G94kO4S6W/yuUmV2fuTe
	 0IeRxhHbpqCi5ciw4lC2MU8WOjXoPN7XGuV1/EziPbXkY0XDo+y+KnR1Iy21QZywgF
	 r/wlqH3IOnsY6weKIp+63FTaDlZ0yLHwDCKO1wQ/KjNG1W1Kp0EeYPSlN4pbrjmftO
	 kJZJW/l5P8Ecw==
Date: Wed, 24 Jun 2026 15:23:05 +0100
From: Lorenzo Stoakes <ljs@kernel.org>
To: Xuewen Wang <wangxuewen@kylinos.cn>
Cc: akpm@linux-foundation.org, david@kernel.org, liam@infradead.org, 
	vbabka@kernel.org, rppt@kernel.org, surenb@google.com, mhocko@suse.com, 
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1] mm: fix data-race in folio_batch_count()
Message-ID: <ajvkK-vcVzgll2Tm@lucifer>
References: <20260624092606.1083449-1-wangxuewen@kylinos.cn>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id: <linux-kernel.vger.kernel.org>
List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org>
List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20260624092606.1083449-1-wangxuewen@kylinos.cn>

On Wed, Jun 24, 2026 at 05:26:06PM +0800, Xuewen Wang wrote:
> KCSAN reports:
>
> BUG: KCSAN: data-race in __lru_add_drain_all / folio_batch_add_and_move

Where? A syzbot report? A local run? Please specify.

>
> write to 0xffff98fe74c015f8 of 1 bytes by task 45153 on cpu 2:
>   folio_batch_add+0x30/0xe0
>
> read to 0xffff98fe74c015f8 of 1 bytes by task 45175 on cpu 0:
>   folio_batch_count+0x0/0x10
>   cpu_needs_drain+0x253/0x430
>
> The write side is a per-cpu local operation (folio_batch_add on the
> CPU that owns the per-cpu batch), while cpu_needs_drain() reads
> another CPU's per-cpu batch without locking. Reading a slightly stale
> value is harmless -- it only determines whether to schedule a drain,

Then why are we adding a READ_ONCE() in such a core helper?

> and a subsequent check will catch it.

Where? Which check? Be specific.

>
> Use READ_ONCE() to annotate the read and prevent load tearing, which
> also suppresses the KCSAN warning.

Tearing on a single byte? Which architecture tears a single byte?

I think you're actually more concerned about the value being optimised out on
assumption of it not being updated elsewhere right?

But acutally you're not, because everybody else uses a stack variable or _their
own_ per-CPU value?

Only cpu_needs_drain() is the odd one out right?

>
> Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
> ---
>  include/linux/folio_batch.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/folio_batch.h b/include/linux/folio_batch.h
> index b45946adc50b..1e31e058e19d 100644
> --- a/include/linux/folio_batch.h
> +++ b/include/linux/folio_batch.h
> @@ -53,7 +53,7 @@ static inline void folio_batch_reinit(struct folio_batch *fbatch)
>
>  static inline unsigned int folio_batch_count(const struct folio_batch *fbatch)
>  {
> -	return fbatch->nr;
> +	return READ_ONCE(fbatch->nr);

This isn't free, you're breaking optimisations here by doing that...

It feels like the wrong level of abstraction, but actually I think every other
case is either stack or per-CPU _on its own CPU_ (please check), in which case
we _can_ suppress the check here but I think best done with data_race().

And see the kernel bug bot report, you need to add:

#include <linux/compiler.h>

Too for that.

>  }
>
>  static inline unsigned int folio_batch_space(const struct folio_batch *fbatch)
> --
> 2.25.1
>

Thanks, Lorenzo