All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid()
@ 2026-06-23  8:44 Hui Zhu
  2026-06-23 11:48 ` David Hildenbrand (Arm)
  2026-06-24 21:01 ` Andrew Morton
  0 siblings, 2 replies; 5+ messages in thread
From: Hui Zhu @ 2026-06-23  8:44 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, linux-mm, linux-kernel
  Cc: Hui Zhu

From: Hui Zhu <zhuhui@kylinos.cn>

KCSAN reports a data race between page_to_nid()/folio_nid() reading
page->flags and folio_trylock()/folio_lock() concurrently doing
test_and_set_bit_lock(PG_locked, ...) on the same word, e.g.:

  BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp

The node id occupies a fixed bit-range of page->flags that is set
once at page init and never modified afterwards, so it can never
overlap with the low PG_locked/PG_waiters bits touched by the folio
lock path.

Use ASSERT_EXCLUSIVE_BITS() in memdesc_nid() to scope the exemption
to just the node-id bits, consistent with how memdesc_zonenum()
already handles the same class of race for the zone-id bits.

Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
---
Changelog:
v2:
According to the comments of David, remove useless comments and use
ASSERT_EXCLUSIVE_BITS() in memdesc_nid() instead of data_race() in
page_to_nid().

 include/linux/mm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 485df9c2dbdd..7518d6364a00 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf);
 #else
 static inline int memdesc_nid(memdesc_flags_t mdf)
 {
+	ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT);
 	return (mdf.f >> NODES_PGSHIFT) & NODES_MASK;
 }
 #endif
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid()
  2026-06-23  8:44 [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid() Hui Zhu
@ 2026-06-23 11:48 ` David Hildenbrand (Arm)
  2026-06-24 21:01 ` Andrew Morton
  1 sibling, 0 replies; 5+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-23 11:48 UTC (permalink / raw)
  To: Hui Zhu, Andrew Morton, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm, linux-kernel
  Cc: Hui Zhu

On 6/23/26 10:44, Hui Zhu wrote:
> From: Hui Zhu <zhuhui@kylinos.cn>
> 
> KCSAN reports a data race between page_to_nid()/folio_nid() reading
> page->flags and folio_trylock()/folio_lock() concurrently doing
> test_and_set_bit_lock(PG_locked, ...) on the same word, e.g.:
> 
>   BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp
> 
> The node id occupies a fixed bit-range of page->flags that is set
> once at page init and never modified afterwards, so it can never
> overlap with the low PG_locked/PG_waiters bits touched by the folio
> lock path.
> 
> Use ASSERT_EXCLUSIVE_BITS() in memdesc_nid() to scope the exemption
> to just the node-id bits, consistent with how memdesc_zonenum()
> already handles the same class of race for the zone-id bits.
> 
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid()
  2026-06-23  8:44 [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid() Hui Zhu
  2026-06-23 11:48 ` David Hildenbrand (Arm)
@ 2026-06-24 21:01 ` Andrew Morton
  2026-06-25  1:32   ` Hui Zhu
  1 sibling, 1 reply; 5+ messages in thread
From: Andrew Morton @ 2026-06-24 21:01 UTC (permalink / raw)
  To: Hui Zhu
  Cc: David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm, linux-kernel, Hui Zhu

On Tue, 23 Jun 2026 16:44:32 +0800 Hui Zhu <hui.zhu@linux.dev> wrote:

> From: Hui Zhu <zhuhui@kylinos.cn>
> 
> KCSAN reports a data race between page_to_nid()/folio_nid() reading
> page->flags and folio_trylock()/folio_lock() concurrently doing
> test_and_set_bit_lock(PG_locked, ...) on the same word, e.g.:
> 
>   BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp
> 
> The node id occupies a fixed bit-range of page->flags that is set
> once at page init and never modified afterwards, so it can never
> overlap with the low PG_locked/PG_waiters bits touched by the folio
> lock path.
> 
> Use ASSERT_EXCLUSIVE_BITS() in memdesc_nid() to scope the exemption
> to just the node-id bits, consistent with how memdesc_zonenum()
> already handles the same class of race for the zone-id bits.
> 
> ...
>
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf);
>  #else
>  static inline int memdesc_nid(memdesc_flags_t mdf)
>  {
> +	ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT);
>  	return (mdf.f >> NODES_PGSHIFT) & NODES_MASK;
>  }
>  #endif

It seems weird to be doing this against a local variable within a
random function, seemingly unrelated to the problematic functions which
you've identified.

Seems that it fooled Sashiko:
	https://sashiko.dev/#/patchset/20260623084432.701120-1-hui.zhu@linux.dev

I'm wondering what the heck is going on here?

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid()
  2026-06-24 21:01 ` Andrew Morton
@ 2026-06-25  1:32   ` Hui Zhu
  2026-06-25  1:58     ` Andrew Morton
  0 siblings, 1 reply; 5+ messages in thread
From: Hui Zhu @ 2026-06-25  1:32 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm, linux-kernel, Hui Zhu

> 
> On Tue, 23 Jun 2026 16:44:32 +0800 Hui Zhu <hui.zhu@linux.dev> wrote:
> 
> > 
> > From: Hui Zhu <zhuhui@kylinos.cn>
> >  
> >  KCSAN reports a data race between page_to_nid()/folio_nid() reading
> >  page->flags and folio_trylock()/folio_lock() concurrently doing
> >  test_and_set_bit_lock(PG_locked, ...) on the same word, e.g.:
> >  
> >  BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp
> >  
> >  The node id occupies a fixed bit-range of page->flags that is set
> >  once at page init and never modified afterwards, so it can never
> >  overlap with the low PG_locked/PG_waiters bits touched by the folio
> >  lock path.
> >  
> >  Use ASSERT_EXCLUSIVE_BITS() in memdesc_nid() to scope the exemption
> >  to just the node-id bits, consistent with how memdesc_zonenum()
> >  already handles the same class of race for the zone-id bits.
> >  
> >  ...
> > 
> >  --- a/include/linux/mm.h
> >  +++ b/include/linux/mm.h
> >  @@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf);
> >  #else
> >  static inline int memdesc_nid(memdesc_flags_t mdf)
> >  {
> >  + ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT);
> >  return (mdf.f >> NODES_PGSHIFT) & NODES_MASK;
> >  }
> >  #endif
> > 
> It seems weird to be doing this against a local variable within a
> random function, seemingly unrelated to the problematic functions which
> you've identified.
> 
> Seems that it fooled Sashiko:
>  https://sashiko.dev/#/patchset/20260623084432.701120-1-hui.zhu@linux.dev
> 
> I'm wondering what the heck is going on here?
>

Hi Andrew,

Good catch. ASSERT_EXCLUSIVE_BITS(mdf.f, ...) is checking a by-value
copy of the flags word inside memdesc_nid(), not the actual shared
page->flags/folio->flags being modified by folio_trylock(). Whatever
made it appear to suppress the KCSAN report is likely an artifact of
inlining/codegen (kcsan_atomic_next() happening to land on the real
load after inlining), not a principled fix - so Sashiko's pass is
not reassuring here.

I'll move the assertion to where the real dereference happens (at
the page_to_nid()/folio_nid() call sites) instead of inside the
by-value helper. This probably also applies to the existing
memdesc_zonenum() pattern - is that one actually verified to work,
or does it have the same issue?

Best,
Hui

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid()
  2026-06-25  1:32   ` Hui Zhu
@ 2026-06-25  1:58     ` Andrew Morton
  0 siblings, 0 replies; 5+ messages in thread
From: Andrew Morton @ 2026-06-25  1:58 UTC (permalink / raw)
  To: Hui Zhu
  Cc: David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	linux-mm, linux-kernel, Hui Zhu

On Thu, 25 Jun 2026 01:32:49 +0000 "Hui Zhu" <hui.zhu@linux.dev> wrote:

> Good catch. ASSERT_EXCLUSIVE_BITS(mdf.f, ...) is checking a by-value
> copy of the flags word inside memdesc_nid(), not the actual shared
> page->flags/folio->flags being modified by folio_trylock(). Whatever
> made it appear to suppress the KCSAN report is likely an artifact of
> inlining/codegen (kcsan_atomic_next() happening to land on the real
> load after inlining), not a principled fix - so Sashiko's pass is
> not reassuring here.

Yeah, I was wondering if the inlining accidentally gave the macro the
correct thing.  Which seems wrong - an inlined function should treat an
incoming arg purely as a local thing.  Maybe we fooled the compiler.

> I'll move the assertion to where the real dereference happens (at
> the page_to_nid()/folio_nid() call sites) instead of inside the
> by-value helper. This probably also applies to the existing
> memdesc_zonenum() pattern - is that one actually verified to work,
> or does it have the same issue?

I assume the memdesc_zonenum() code worked, for the same (poorly
understood) reason as did your patch.

Yes, moving this into the sites where we officially have access to the
shared storage seems the right approach.


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2026-06-25  1:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-23  8:44 [PATCH v2] mm: avoid KCSAN false positive in memdesc_nid() Hui Zhu
2026-06-23 11:48 ` David Hildenbrand (Arm)
2026-06-24 21:01 ` Andrew Morton
2026-06-25  1:32   ` Hui Zhu
2026-06-25  1:58     ` Andrew Morton

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.