* [PATCH] mm: avoid KCSAN false positive in page_to_nid() @ 2026-06-23 7:41 Hui Zhu 2026-06-23 7:59 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 4+ messages in thread From: Hui Zhu @ 2026-06-23 7:41 UTC (permalink / raw) To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, linux-mm, linux-kernel Cc: Hui Zhu From: Hui Zhu <zhuhui@kylinos.cn> KCSAN reports a data race between page_to_nid() reading page->flags and folio_trylock()/folio_lock() doing test_and_set_bit_lock(PG_locked, ...) on the same word from another CPU, e.g.: BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp The node id occupies a fixed, high bit-range of page->flags that is set once when the page is initialized and never modified afterwards, so it can never overlap with the low PG_locked/PG_waiters bits touched by the folio lock path. The race is therefore harmless: page_to_nid() always returns a consistent value regardless of how the read interleaves with the lock bit ops. Wrap the flags read with data_race() to tell KCSAN this race is intentional and benign, consistent with how page->page_type is already annotated for similar packed-field accesses. Signed-off-by: Hui Zhu <zhuhui@kylinos.cn> --- include/linux/mm.h | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 485df9c2dbdd..122d3b39369f 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2296,7 +2296,14 @@ static inline int memdesc_nid(memdesc_flags_t mdf) static inline int page_to_nid(const struct page *page) { - return memdesc_nid(PF_POISONED_CHECK(page)->flags); + /* + * The node id occupies a fixed high bit-range of page->flags + * that is set once at page init and never changed afterwards. + * It cannot overlap with the low PG_locked/PG_waiters bits + * that folio_lock()/folio_unlock() concurrently update, so + * this data race is benign. + */ + return memdesc_nid(data_race(PF_POISONED_CHECK(page)->flags)); } static inline int folio_nid(const struct folio *folio) -- 2.43.0 ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid KCSAN false positive in page_to_nid() 2026-06-23 7:41 [PATCH] mm: avoid KCSAN false positive in page_to_nid() Hui Zhu @ 2026-06-23 7:59 ` David Hildenbrand (Arm) 2026-06-23 10:25 ` Lorenzo Stoakes 0 siblings, 1 reply; 4+ messages in thread From: David Hildenbrand (Arm) @ 2026-06-23 7:59 UTC (permalink / raw) To: Hui Zhu, Andrew Morton, Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, linux-mm, linux-kernel Cc: Hui Zhu On 6/23/26 09:41, Hui Zhu wrote: > From: Hui Zhu <zhuhui@kylinos.cn> > > KCSAN reports a data race between page_to_nid() reading page->flags > and folio_trylock()/folio_lock() doing test_and_set_bit_lock(PG_locked, > ...) on the same word from another CPU, e.g.: > > BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp > > The node id occupies a fixed, high bit-range of page->flags that is > set once when the page is initialized and never modified afterwards, > so it can never overlap with the low PG_locked/PG_waiters bits touched > by the folio lock path. The race is therefore harmless: page_to_nid() > always returns a consistent value regardless of how the read > interleaves with the lock bit ops. > > Wrap the flags read with data_race() to tell KCSAN this race is > intentional and benign, consistent with how page->page_type is > already annotated for similar packed-field accesses. > > Signed-off-by: Hui Zhu <zhuhui@kylinos.cn> > --- > include/linux/mm.h | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 485df9c2dbdd..122d3b39369f 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2296,7 +2296,14 @@ static inline int memdesc_nid(memdesc_flags_t mdf) > > static inline int page_to_nid(const struct page *page) > { > - return memdesc_nid(PF_POISONED_CHECK(page)->flags); > + /* > + * The node id occupies a fixed high bit-range of page->flags > + * that is set once at page init and never changed afterwards. > + * It cannot overlap with the low PG_locked/PG_waiters bits > + * that folio_lock()/folio_unlock() concurrently update, so > + * this data race is benign. > + */ Do we really need this excessive comment? > + return memdesc_nid(data_race(PF_POISONED_CHECK(page)->flags)); In memdesc_zonenum() we use ASSERT_EXCLUSIVE_BITS. Can we do the same here inside memdesc_nid? diff --git a/include/linux/mm.h b/include/linux/mm.h index 69daeeab7fe8f..76d3bb54be844 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf); #else static inline int memdesc_nid(memdesc_flags_t mdf) { + ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT); return (mdf.f >> NODES_PGSHIFT) & NODES_MASK; } #endif -- Cheers, David ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid KCSAN false positive in page_to_nid() 2026-06-23 7:59 ` David Hildenbrand (Arm) @ 2026-06-23 10:25 ` Lorenzo Stoakes 2026-06-24 1:33 ` Hui Zhu 0 siblings, 1 reply; 4+ messages in thread From: Lorenzo Stoakes @ 2026-06-23 10:25 UTC (permalink / raw) To: David Hildenbrand (Arm) Cc: Hui Zhu, Andrew Morton, Liam R. Howlett, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, linux-mm, linux-kernel, Hui Zhu On Tue, Jun 23, 2026 at 09:59:45AM +0200, David Hildenbrand (Arm) wrote: > On 6/23/26 09:41, Hui Zhu wrote: > > From: Hui Zhu <zhuhui@kylinos.cn> > > > > KCSAN reports a data race between page_to_nid() reading page->flags > > and folio_trylock()/folio_lock() doing test_and_set_bit_lock(PG_locked, > > ...) on the same word from another CPU, e.g.: > > > > BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp > > > > The node id occupies a fixed, high bit-range of page->flags that is > > set once when the page is initialized and never modified afterwards, > > so it can never overlap with the low PG_locked/PG_waiters bits touched > > by the folio lock path. The race is therefore harmless: page_to_nid() > > always returns a consistent value regardless of how the read > > interleaves with the lock bit ops. > > > > Wrap the flags read with data_race() to tell KCSAN this race is > > intentional and benign, consistent with how page->page_type is > > already annotated for similar packed-field accesses. > > > > Signed-off-by: Hui Zhu <zhuhui@kylinos.cn> How did you discover this? A syzbot report? If so please include Reported-by, Closes tags. > > --- > > include/linux/mm.h | 9 ++++++++- > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 485df9c2dbdd..122d3b39369f 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2296,7 +2296,14 @@ static inline int memdesc_nid(memdesc_flags_t mdf) > > > > static inline int page_to_nid(const struct page *page) > > { > > - return memdesc_nid(PF_POISONED_CHECK(page)->flags); > > + /* > > + * The node id occupies a fixed high bit-range of page->flags > > + * that is set once at page init and never changed afterwards. > > + * It cannot overlap with the low PG_locked/PG_waiters bits > > + * that folio_lock()/folio_unlock() concurrently update, so > > + * this data race is benign. > > + */ > > Do we really need this excessive comment? Agreed, just delete it. For a trivial benign data race it's a bit much, and the commit message can cover it off for those who are curious. > > > + return memdesc_nid(data_race(PF_POISONED_CHECK(page)->flags)); > > In memdesc_zonenum() we use ASSERT_EXCLUSIVE_BITS. > > Can we do the same here inside memdesc_nid? Also agreed > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index 69daeeab7fe8f..76d3bb54be844 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf); > #else > static inline int memdesc_nid(memdesc_flags_t mdf) > { > + ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT); > return (mdf.f >> NODES_PGSHIFT) & NODES_MASK; > } > #endif > > > -- > Cheers, > > David Thanks, Lorenzo ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: avoid KCSAN false positive in page_to_nid() 2026-06-23 10:25 ` Lorenzo Stoakes @ 2026-06-24 1:33 ` Hui Zhu 0 siblings, 0 replies; 4+ messages in thread From: Hui Zhu @ 2026-06-24 1:33 UTC (permalink / raw) To: Lorenzo Stoakes, David Hildenbrand (Arm) Cc: Andrew Morton, Liam R. Howlett, Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, linux-mm, linux-kernel, Hui Zhu > > On Tue, Jun 23, 2026 at 09:59:45AM +0200, David Hildenbrand (Arm) wrote: > > > > > On 6/23/26 09:41, Hui Zhu wrote: > > From: Hui Zhu <zhuhui@kylinos.cn> > > > > KCSAN reports a data race between page_to_nid() reading page->flags > > and folio_trylock()/folio_lock() doing test_and_set_bit_lock(PG_locked, > > ...) on the same word from another CPU, e.g.: > > > > BUG: KCSAN: data-race in __lruvec_stat_mod_folio / shmem_get_folio_gfp > > > > The node id occupies a fixed, high bit-range of page->flags that is > > set once when the page is initialized and never modified afterwards, > > so it can never overlap with the low PG_locked/PG_waiters bits touched > > by the folio lock path. The race is therefore harmless: page_to_nid() > > always returns a consistent value regardless of how the read > > interleaves with the lock bit ops. > > > > Wrap the flags read with data_race() to tell KCSAN this race is > > intentional and benign, consistent with how page->page_type is > > already annotated for similar packed-field accesses. > > > > Signed-off-by: Hui Zhu <zhuhui@kylinos.cn> > > > How did you discover this? > > A syzbot report? If so please include Reported-by, Closes tags. This wasn't reported by syzbot - it was found by our internal fuzzing CI on a KCSAN-enabled build of our downstream tree (based on v6.6.121), triggered by a multi-threaded mlockall() fuzz testcase. There's no public bug id for it, so Reported-by/Closes don't apply here. Best, Hui > > > > > --- > > include/linux/mm.h | 9 ++++++++- > > 1 file changed, 8 insertions(+), 1 deletion(-) > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 485df9c2dbdd..122d3b39369f 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2296,7 +2296,14 @@ static inline int memdesc_nid(memdesc_flags_t mdf) > > > > static inline int page_to_nid(const struct page *page) > > { > > - return memdesc_nid(PF_POISONED_CHECK(page)->flags); > > + /* > > + * The node id occupies a fixed high bit-range of page->flags > > + * that is set once at page init and never changed afterwards. > > + * It cannot overlap with the low PG_locked/PG_waiters bits > > + * that folio_lock()/folio_unlock() concurrently update, so > > + * this data race is benign. > > + */ > > > > Do we really need this excessive comment? > > > Agreed, just delete it. For a trivial benign data race it's a bit much, and the > commit message can cover it off for those who are curious. > > > > > + return memdesc_nid(data_race(PF_POISONED_CHECK(page)->flags)); > > > > In memdesc_zonenum() we use ASSERT_EXCLUSIVE_BITS. > > > > Can we do the same here inside memdesc_nid? > > > Also agreed > > > > > diff --git a/include/linux/mm.h b/include/linux/mm.h > > index 69daeeab7fe8f..76d3bb54be844 100644 > > --- a/include/linux/mm.h > > +++ b/include/linux/mm.h > > @@ -2290,6 +2290,7 @@ int memdesc_nid(memdesc_flags_t mdf); > > #else > > static inline int memdesc_nid(memdesc_flags_t mdf) > > { > > + ASSERT_EXCLUSIVE_BITS(mdf.f, NODES_MASK << NODES_PGSHIFT); > > return (mdf.f >> NODES_PGSHIFT) & NODES_MASK; > > } > > #endif > > > > -- > > Cheers, > > > > David > > > Thanks, Lorenzo > ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2026-06-24 1:33 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-06-23 7:41 [PATCH] mm: avoid KCSAN false positive in page_to_nid() Hui Zhu 2026-06-23 7:59 ` David Hildenbrand (Arm) 2026-06-23 10:25 ` Lorenzo Stoakes 2026-06-24 1:33 ` Hui Zhu
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.