Linux-mm Archive on lore.kernel.org
* [PATCH v2] mm: annotate data-race in cpu_needs_drain() and need_mlock_drain()
From: Xuewen Wang @ 2026-06-25  6:51 UTC
  To: akpm, liam, ljs, vbabka, jannh, pfalcato, chrisl, kasong,
	shikemeng, nphamcs, baoquan.he, baohua, youngjun.park, qi.zheng,
	shakeel.butt, axelrasmussen, yuanchu, weixugc, david
  Cc: linux-mm, linux-kernel, Xuewen Wang

KCSAN reports a data-race when cpu_needs_drain() reads another CPU's
per-cpu folio_batch->nr without locking, while the owning CPU writes
to it via folio_batch_add(). The same race exists in need_mlock_drain(),
which is called from cpu_needs_drain().

Reading a slightly stale value is harmless -- cpu_needs_drain() only
decides whether to schedule a drain, and the next iteration of
__lru_add_drain_all() will re-check.

All other callers of folio_batch_count() either use stack variables or
access their own CPU's per-cpu data where no race exists, so
data_race() is added at the call sites rather than in
folio_batch_count() itself to avoid suppressing KCSAN warnings for
future callers that may have real bugs.

Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
---
Changes in v2:
- Use data_race() instead of READ_ONCE() in folio_batch_count(), as
  suggested by Lorenzo. READ_ONCE() is unnecessary for a single-byte
  read and imposes overhead on all callers, most of which have no race.
- Move the annotation from folio_batch_count() to the actual call sites
  (cpu_needs_drain() and need_mlock_drain()) where the cross-CPU race
  occurs, rather than affecting all callers.
- Add need_mlock_drain() which has the same cross-CPU race.
- Add comments explaining why the data race is safe.
v1:
  https://lore.kernel.org/all/20260624092606.1083449-1-wangxuewen@kylinos.cn/
---
 mm/mlock.c |  2 +-
 mm/swap.c  | 12 ++++++------
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 8c227fefa2df..fbdb5018e2c3 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -232,7 +232,7 @@ void mlock_drain_remote(int cpu)
 
 bool need_mlock_drain(int cpu)
 {
-	return folio_batch_count(&per_cpu(mlock_fbatch.fbatch, cpu));
+	return data_race(folio_batch_count(&per_cpu(mlock_fbatch.fbatch, cpu)));
 }
 
 /**
diff --git a/mm/swap.c b/mm/swap.c
index 588f50d8f1a8..d046428caed6 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -828,12 +828,12 @@ static bool cpu_needs_drain(unsigned int cpu)
 	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
 
 	/* Check these in order of likelihood that they're not zero */
-	return folio_batch_count(&fbatches->lru_add) ||
-		folio_batch_count(&fbatches->lru_move_tail) ||
-		folio_batch_count(&fbatches->lru_deactivate_file) ||
-		folio_batch_count(&fbatches->lru_deactivate) ||
-		folio_batch_count(&fbatches->lru_lazyfree) ||
-		folio_batch_count(&fbatches->lru_activate) ||
+	return data_race(folio_batch_count(&fbatches->lru_add)) ||
+		data_race(folio_batch_count(&fbatches->lru_move_tail)) ||
+		data_race(folio_batch_count(&fbatches->lru_deactivate_file)) ||
+		data_race(folio_batch_count(&fbatches->lru_deactivate)) ||
+		data_race(folio_batch_count(&fbatches->lru_lazyfree)) ||
+		data_race(folio_batch_count(&fbatches->lru_activate)) ||
 		need_mlock_drain(cpu) ||
 		has_bh_in_lru(cpu, NULL);
 }
-- 
2.25.1




* Re: [PATCH v2] mm: annotate data-race in cpu_needs_drain() and need_mlock_drain()
From: Pedro Falcato @ 2026-06-25  9:31 UTC
  To: Xuewen Wang
  Cc: akpm, liam, ljs, vbabka, jannh, chrisl, kasong, shikemeng,
	nphamcs, baoquan.he, baohua, youngjun.park, qi.zheng,
	shakeel.butt, axelrasmussen, yuanchu, weixugc, david, linux-mm,
	linux-kernel

On Thu, Jun 25, 2026 at 02:51:53PM +0800, Xuewen Wang wrote:
> KCSAN reports a data-race when cpu_needs_drain() reads another CPU's
> per-cpu folio_batch->nr without locking, while the owning CPU writes
> to it via folio_batch_add(). The same race exists in need_mlock_drain(),
> which is called from cpu_needs_drain().
> 
> Reading a slightly stale value is harmless -- cpu_needs_drain() only
> decides whether to schedule a drain, and the next iteration of
> __lru_add_drain_all() will re-check.
> 
> All other callers of folio_batch_count() either use stack variables or
> access their own CPU's per-cpu data where no race exists, so
> data_race() is added at the call sites rather than in
> folio_batch_count() itself to avoid suppressing KCSAN warnings for
> future callers that may have real bugs.
> 
> Signed-off-by: Xuewen Wang <wangxuewen@kylinos.cn>
> ---
> Changes in v2:
> - Use data_race() instead of READ_ONCE() in folio_batch_count(), as
>   suggested by Lorenzo. READ_ONCE() is unnecessary for a single-byte
>   read and imposes overhead on all callers, most of which have no race.
> - Move the annotation from folio_batch_count() to the actual call sites
>   (cpu_needs_drain() and need_mlock_drain()) where the cross-CPU race
>   occurs, rather than affecting all callers.
> - Add need_mlock_drain() which has the same cross-CPU race.
> - Add comments explaining why the data race is safe.
> v1:
>   https://lore.kernel.org/all/20260624092606.1083449-1-wangxuewen@kylinos.cn/
> ---
>  mm/mlock.c |  2 +-
>  mm/swap.c  | 12 ++++++------
>  2 files changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 8c227fefa2df..fbdb5018e2c3 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -232,7 +232,7 @@ void mlock_drain_remote(int cpu)
>  
>  bool need_mlock_drain(int cpu)
>  {
> -	return folio_batch_count(&per_cpu(mlock_fbatch.fbatch, cpu));
> +	return data_race(folio_batch_count(&per_cpu(mlock_fbatch.fbatch, cpu)));
>  }
>  
>  /**
> diff --git a/mm/swap.c b/mm/swap.c
> index 588f50d8f1a8..d046428caed6 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -828,12 +828,12 @@ static bool cpu_needs_drain(unsigned int cpu)
>  	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
>  
>  	/* Check these in order of likelihood that they're not zero */
> -	return folio_batch_count(&fbatches->lru_add) ||
> -		folio_batch_count(&fbatches->lru_move_tail) ||
> -		folio_batch_count(&fbatches->lru_deactivate_file) ||
> -		folio_batch_count(&fbatches->lru_deactivate) ||
> -		folio_batch_count(&fbatches->lru_lazyfree) ||
> -		folio_batch_count(&fbatches->lru_activate) ||
> +	return data_race(folio_batch_count(&fbatches->lru_add)) ||
> +		data_race(folio_batch_count(&fbatches->lru_move_tail)) ||
> +		data_race(folio_batch_count(&fbatches->lru_deactivate_file)) ||
> +		data_race(folio_batch_count(&fbatches->lru_deactivate)) ||
> +		data_race(folio_batch_count(&fbatches->lru_lazyfree)) ||
> +		data_race(folio_batch_count(&fbatches->lru_activate)) ||
>  		need_mlock_drain(cpu) ||
>  		has_bh_in_lru(cpu, NULL);
>  }

eww.

How about:

static bool cpu_needs_drain(unsigned int cpu)
{
	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);

	/* Check these in order of likelihood that they're not zero */
	return data_race(
		folio_batch_count(&fbatches->lru_add) ||
		folio_batch_count(&fbatches->lru_move_tail) ||
		folio_batch_count(&fbatches->lru_deactivate_file) ||
		folio_batch_count(&fbatches->lru_deactivate) ||
		folio_batch_count(&fbatches->lru_lazyfree) ||
		folio_batch_count(&fbatches->lru_activate) ||
		need_mlock_drain(cpu)) ||
		has_bh_in_lru(cpu, NULL);
}

this should work equally well, while being far more aesthetically pleasing :)
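
To illustrate why the two shapes should behave the same, here is a minimal
userspace sketch. Everything in it is a hypothetical stand-in, not kernel
code: data_race() is reduced to a macro that only evaluates its argument (the
real macro additionally tells KCSAN to ignore accesses made while the
expression is evaluated), and struct fake_cpu_state, count_at(), and the two
needs_drain_*() functions are invented names that mimic the six batch counts
and the two trailing helper calls.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace stand-in (assumption): only evaluates and yields the expression.
 * It models the value semantics of data_race(), not the KCSAN side effects.
 */
#define data_race(expr) (expr)

/* Hypothetical model of one CPU's state, for illustration only. */
struct fake_cpu_state {
	unsigned char nr[6];	/* mirrors the single-byte folio_batch->nr */
	bool mlock_pending;	/* stands in for need_mlock_drain(cpu) */
	bool bh_in_lru;		/* stands in for has_bh_in_lru(cpu, NULL) */
};

/* Number of batch counts examined so far, to observe short-circuiting. */
static int reads;

static unsigned int count_at(const struct fake_cpu_state *s, int i)
{
	reads++;
	return s->nr[i];
}

/* Shape of the posted patch: one annotation per read. */
static bool needs_drain_per_call(const struct fake_cpu_state *s)
{
	return data_race(count_at(s, 0)) ||
		data_race(count_at(s, 1)) ||
		data_race(count_at(s, 2)) ||
		data_race(count_at(s, 3)) ||
		data_race(count_at(s, 4)) ||
		data_race(count_at(s, 5)) ||
		s->mlock_pending ||
		s->bh_in_lru;
}

/* Shape of the suggestion: one annotation around the racy part of the chain. */
static bool needs_drain_wrapped(const struct fake_cpu_state *s)
{
	return data_race(
		count_at(s, 0) ||
		count_at(s, 1) ||
		count_at(s, 2) ||
		count_at(s, 3) ||
		count_at(s, 4) ||
		count_at(s, 5) ||
		s->mlock_pending) ||
		s->bh_in_lru;
}
```

Because || evaluates left to right and stops at the first non-zero operand in
both forms, the two functions return the same value and examine the same
number of counts for any input. The sketch says nothing about how KCSAN
treats the nested need_mlock_drain() call in the real kernel.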

> -- 
> 2.25.1
> 

-- 
Pedro



* Re: [PATCH v2] mm: annotate data-race in cpu_needs_drain() and need_mlock_drain()
From: Lorenzo Stoakes @ 2026-06-25 14:27 UTC
  To: Pedro Falcato
  Cc: Xuewen Wang, akpm, liam, vbabka, jannh, chrisl, kasong, shikemeng,
	nphamcs, baoquan.he, baohua, youngjun.park, qi.zheng,
	shakeel.butt, axelrasmussen, yuanchu, weixugc, david, linux-mm,
	linux-kernel

On Thu, Jun 25, 2026 at 10:31:24AM +0100, Pedro Falcato wrote:
> eww.
>
> How about:
>
> static bool cpu_needs_drain(unsigned int cpu)
> {
> 	struct cpu_fbatches *fbatches = &per_cpu(cpu_fbatches, cpu);
>
> 	/* Check these in order of likelihood that they're not zero */
> 	return data_race(
> 		folio_batch_count(&fbatches->lru_add) ||
> 		folio_batch_count(&fbatches->lru_move_tail) ||
> 		folio_batch_count(&fbatches->lru_deactivate_file) ||
> 		folio_batch_count(&fbatches->lru_deactivate) ||
> 		folio_batch_count(&fbatches->lru_lazyfree) ||
> 		folio_batch_count(&fbatches->lru_activate) ||
> 		need_mlock_drain(cpu)) ||
> 		has_bh_in_lru(cpu, NULL);
> }
>
> this should work equally well, while being far more aesthetically pleasing :)

Yes...!

>
> > --
> > 2.25.1
> >
>
> --
> Pedro

Cheers, Lorenzo


