* [PATCH] mm: fix data race in __filemap_remove_folio / folio_mapping
       [not found] <20260322190319.85301-1-abhishek_sts8.ref@yahoo.com>
@ 2026-03-22 19:03 ` Abhishek Kumar
  2026-03-23  3:57   ` Matthew Wilcox
  0 siblings, 1 reply; 3+ messages in thread
From: Abhishek Kumar @ 2026-03-22 19:03 UTC (permalink / raw)
  To: Matthew Wilcox, Andrew Morton
  Cc: linux-mm, linux-kernel, linux-fsdevel,
      syzbot+606f94dfeaaa45124c90, Abhishek Kumar

KCSAN reports a data race between page_cache_delete() and
folio_mapping():

page_cache_delete() performs a plain store to folio->mapping:

	folio->mapping = NULL;

folio_mapping() performs a plain load from folio->mapping:

	mapping = folio->mapping;

page_cache_delete() is called from the truncation path under the i_pages
xarray lock, while folio_mapping() is called from the reclaim path
(evict_folios -> folio_evictable -> folio_mapping) under only
rcu_read_lock(), without the xarray lock.

The race is benign since the reclaim path tolerates stale values:
reading a stale non-NULL mapping simply results in a suboptimal
eviction decision.  However, the plain accesses risk store/load tearing
and allow the compiler to perform harmful optimizations (merging,
elision, or fission of the accesses).

Fix this by using WRITE_ONCE() in page_cache_delete() and READ_ONCE()
in folio_mapping() to prevent compiler misbehavior and silence the
KCSAN report.
Fixes: 2f52578f9c64 ("mm/util: Add folio_mapping() and folio_file_mapping()")
Reported-by: syzbot+606f94dfeaaa45124c90@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=606f94dfeaaa45124c90
Signed-off-by: Abhishek Kumar <abhishek_sts8@yahoo.com>
---
 mm/filemap.c | 2 +-
 mm/util.c    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 406cef06b684..bba669bd114f 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -142,7 +142,7 @@ static void page_cache_delete(struct address_space *mapping,
 	xas_store(&xas, shadow);
 	xas_init_marks(&xas);
 
-	folio->mapping = NULL;
+	WRITE_ONCE(folio->mapping, NULL);
 	/* Leave folio->index set: truncation lookup relies upon it */
 	mapping->nrpages -= nr;
 }
diff --git a/mm/util.c b/mm/util.c
index b05ab6f97e11..5ecb19ddf026 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -700,7 +700,7 @@ struct address_space *folio_mapping(const struct folio *folio)
 	if (unlikely(folio_test_swapcache(folio)))
 		return swap_address_space(folio->swap);
 
-	mapping = folio->mapping;
+	mapping = READ_ONCE(folio->mapping);
 	if ((unsigned long)mapping & FOLIO_MAPPING_FLAGS)
 		return NULL;
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 3+ messages in thread
* Re: [PATCH] mm: fix data race in __filemap_remove_folio / folio_mapping
  2026-03-22 19:03 ` [PATCH] mm: fix data race in __filemap_remove_folio / folio_mapping Abhishek Kumar
@ 2026-03-23  3:57   ` Matthew Wilcox
  2026-03-23 10:47     ` Pedro Falcato
  0 siblings, 1 reply; 3+ messages in thread
From: Matthew Wilcox @ 2026-03-23  3:57 UTC (permalink / raw)
  To: Abhishek Kumar
  Cc: Andrew Morton, linux-mm, linux-kernel, linux-fsdevel,
      syzbot+606f94dfeaaa45124c90, Axel Rasmussen, Yuanchu Xie, Wei Xu

On Mon, Mar 23, 2026 at 12:33:19AM +0530, Abhishek Kumar wrote:
> KCSAN reports a data race between page_cache_delete() and
> folio_mapping():
> 
> page_cache_delete() performs a plain store to folio->mapping:
> 	folio->mapping = NULL;
> 
> folio_mapping() performs a plain load from folio->mapping:
> 	mapping = folio->mapping;
> 
> page_cache_delete() is called from the truncation path under the i_pages
> xarray lock,

That's not relevant.  The important lock for maintaining folio->mapping
is the folio lock (see the VM_BUG_ON_FOLIO line in page_cache_delete()).
At a minimum, this changelog needs to be fixed because there's already
too much confusion around the locking rules.

> while folio_mapping() is called from the reclaim path
> (evict_folios -> folio_evictable -> folio_mapping) under only
> rcu_read_lock() without the xarray lock.

Umm.  First up, this is MGLRU-only code, right?  Adding the so-called
maintainers.

Second ... I'm really unsure how we want to handle this generally.
This could be quite the game of whack-a-mole; we have many, many places
in the kernel which dereference folio->mapping without holding a lock.

Perhaps they are all fine; but 12 of the 455 references to
folio->mapping currently have READ_ONCE attached.  That's a lot of code
to audit.

> The race is benign since the reclaim path tolerates stale values --
> reading a stale non-NULL mapping simply results in a suboptimal eviction
> decision.
> However, the plain accesses risk store/load tearing and allow
> the compiler to perform harmful optimizations (merging, elision, or
> fission of the accesses).

I think the bigger problem is reloading.  As I understand it, this code:

	struct address_space *m = folio->mapping;

	if (m && m->flags)

could end up loading 'm' twice, once before the setting to NULL and once
after.  That's more plausible than deciding to load byte-by-byte, or
whatever else these "merging, elision, or fission" words mean.

> Fix this by using WRITE_ONCE() in page_cache_delete() and READ_ONCE()
> in folio_mapping() to prevent compiler misbehavior and silence the KCSAN
> report.

Just to be clear, I don't object to the patch itself, I'm just scared
of the consequences.  And the locking comment above needs to be fixed.
But please wait a few days for discussion to play out.

^ permalink raw reply	[flat|nested] 3+ messages in thread
* Re: [PATCH] mm: fix data race in __filemap_remove_folio / folio_mapping
  2026-03-23  3:57   ` Matthew Wilcox
@ 2026-03-23 10:47     ` Pedro Falcato
  0 siblings, 0 replies; 3+ messages in thread
From: Pedro Falcato @ 2026-03-23 10:47 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Abhishek Kumar, Andrew Morton, linux-mm, linux-kernel,
      linux-fsdevel, syzbot+606f94dfeaaa45124c90, Axel Rasmussen,
      Yuanchu Xie, Wei Xu

On Mon, Mar 23, 2026 at 03:57:51AM +0000, Matthew Wilcox wrote:
> On Mon, Mar 23, 2026 at 12:33:19AM +0530, Abhishek Kumar wrote:
> > KCSAN reports a data race between page_cache_delete() and
> > folio_mapping():
> > 
> > page_cache_delete() performs a plain store to folio->mapping:
> > 	folio->mapping = NULL;
> > 
> > folio_mapping() performs a plain load from folio->mapping:
> > 	mapping = folio->mapping;
> > 
> > page_cache_delete() is called from the truncation path under the i_pages
> > xarray lock,
> 
> That's not relevant.  The important lock for maintaining folio->mapping
> is the folio lock (see the VM_BUG_ON_FOLIO line in page_cache_delete()).

Not only that, but holding invalidate_lock or i_rwsem can implicitly
make the folios stable by excluding truncation.

> At a minimum, this changelog needs to be fixed because there's already
> too much confusion around the locking rules.
> 
> > while folio_mapping() is called from the reclaim path
> > (evict_folios -> folio_evictable -> folio_mapping) under only
> > rcu_read_lock() without the xarray lock.
> 
> Umm.  First up, this is MGLRU-only code, right?  Adding the so-called
> maintainers.
> 
> Second ... I'm really unsure how we want to handle this generally.
> This could be quite the game of whack-a-mole; we have many, many places
> in the kernel which dereference folio->mapping without holding a lock.
> 
> Perhaps they are all fine; but 12 of the 455 references to
> folio->mapping currently have READ_ONCE attached.  That's a lot of code
> to audit.

Yes, and a lot of these just aren't trivial to prove 100% correct.
e.g. fs/ext2/dir.c:

	static void ext2_commit_chunk(struct folio *folio, loff_t pos, unsigned len)
	{
		struct address_space *mapping = folio->mapping;

Racey?

	int ext2_set_link(struct inode *dir, struct ext2_dir_entry_2 *de,
			struct folio *folio, struct inode *inode, bool update_times)
	{
		loff_t pos = folio_pos(folio) + offset_in_folio(folio, de);
		unsigned len = ext2_rec_len_from_disk(de->rec_len);
		int err;

		folio_lock(folio);

Maybe not, but we don't revalidate folio->mapping after the lock.
Racey?  Then you look at the ext2_set_link() callers and it is rename,
which holds i_rwsem.  This plus the reload in filemap_get_entry()
should make it sufficient (and excludes against reclaim whacking the
folio).  But there are other callers of ext2_commit_chunk(), etc, and
ext2 is by far one of the simpler filesystems out there :)

> > The race is benign since the reclaim path tolerates stale values --
> > reading a stale non-NULL mapping simply results in a suboptimal eviction
> > decision. However, the plain accesses risk store/load tearing and allow
> > the compiler to perform harmful optimizations (merging, elision, or
> > fission of the accesses).
> 
> I think the bigger problem is reloading.  As I understand it, this code:
> 
> 	struct address_space *m = folio->mapping;
> 
> 	if (m && m->flags)
> 
> could end up loading 'm' twice, once before the setting to NULL and once
> after.  That's more plausible than deciding to load byte-by-byte, or
> whatever else these "merging, elision, or fission" words mean.
> 
> > Fix this by using WRITE_ONCE() in page_cache_delete() and READ_ONCE()
> > in folio_mapping() to prevent compiler misbehavior and silence the KCSAN
> > report.
> 
> Just to be clear, I don't object to the patch itself, I'm just scared
> of the consequences.  And the locking comment above needs to be fixed.
> But please wait a few days for discussion to play out.
IMHO this looks fine, particularly as folio_mapping() is practically
mm-internal (but there are some other users in fs/, for some reason).
And it shouldn't be problematic, as AIUI KCSAN will still report
WRITE_ONCE + plain read or READ_ONCE + plain write races.

-- 
Pedro

^ permalink raw reply	[flat|nested] 3+ messages in thread