From: "Jianzhou Zhao" <luckd0g@163.com>
To: tytso@mit.edu, adilger.kernel@dilger.ca,
linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: BUG: KCSAN: data-race in _copy_to_iter / ext4_generic_delete_entry
Date: Wed, 11 Mar 2026 16:04:28 +0800 (CST)
Message-ID: <6c4b2013.6d16.19cdbecff3e.Coremail.luckd0g@163.com>
Subject: [BUG] ext4: KCSAN: data-race in _copy_to_iter / ext4_generic_delete_entry
Dear Maintainers,
We are writing to report a KCSAN-detected data race between `ext4` and the block device layer, found by our custom fuzzing tool, RacePilot. During an unlink, `ext4_generic_delete_entry()` updates the `rec_len` of the preceding directory entry with a plain 2-byte write. Concurrently, another thread calls `read()` on the raw block device backing the mounted filesystem and reaches `_copy_to_iter()`, which bulk-copies the page cache contents that back the same directory block. We observed this on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.
Call Trace & Context
==================================================================
BUG: KCSAN: data-race in _copy_to_iter / ext4_generic_delete_entry
write to 0xffff888033da2010 of 2 bytes by task 5608 on cpu 0:
ext4_generic_delete_entry+0x358/0x470 fs/ext4/namei.c:2670
ext4_delete_entry+0x16d/0x280 fs/ext4/namei.c:2724
__ext4_unlink+0x504/0x6e0 fs/ext4/namei.c:3263
ext4_unlink+0x25d/0x280 fs/ext4/namei.c:3312
vfs_unlink+0x323/0x710 fs/namei.c:5409
do_unlinkat+0x301/0x540 fs/namei.c:5480
...
__x64_sys_unlink+0x7d/0xa0 fs/namei.c:5513
read to 0xffff888033da2000 of 1377 bytes by task 4793 on cpu 1:
instrument_copy_to_user include/linux/instrumented.h:113 [inline]
copy_to_user_iter lib/iov_iter.c:29 [inline]
iterate_ubuf include/linux/iov_iter.h:31 [inline]
iterate_and_advance2 include/linux/iov_iter.h:304 [inline]
iterate_and_advance include/linux/iov_iter.h:332 [inline]
_copy_to_iter+0x210/0xf10 lib/iov_iter.c:231
copy_page_to_iter lib/iov_iter.c:412 [inline]
copy_page_to_iter+0xd1/0x150 lib/iov_iter.c:399
copy_folio_to_iter include/linux/uio.h:204 [inline]
filemap_read+0x46e/0x8f0 mm/filemap.c:2899
blkdev_read_iter+0x114/0x360 block/fops.c:868
new_sync_read fs/read_write.c:502 [inline]
vfs_read+0x5c8/0x820 fs/read_write.c:583
ksys_read+0xbe/0x190 fs/read_write.c:730
...
__x64_sys_read+0x41/0x50 fs/read_write.c:737
Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 4793 Comm: systemd-udevd Not tainted 6.18.0-08691-g2061f18ad76e-dirty #50 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================
Execution Flow & Code Context
When deleting an entry from a directory block, `ext4_generic_delete_entry()` merges the deleted target into the preceding directory entry (`pde`) by extending its `rec_len`.
```c
// fs/ext4/namei.c
int ext4_generic_delete_entry(...)
{
    ...
    if (de == de_del) {
        if (pde) {
            pde->rec_len = ext4_rec_len_to_disk( // <-- Plain concurrent 2-byte write
                ext4_rec_len_from_disk(pde->rec_len,
                                       blocksize) +
                ext4_rec_len_from_disk(de->rec_len,
                                       blocksize),
                blocksize);
            /* wipe entire dir_entry */
            memset(de, 0, ...);
        }
    ...
}
```
Meanwhile, a second process reads the underlying block device node (e.g. `/dev/sda1`, the device backing the mount). ext4 accesses directory blocks through buffers that live in the block device's page cache, so the filesystem and a reader of the raw device share the same physical pages. The `read()` system call enters `blkdev_read_iter()`, which calls `filemap_read()` without taking any ext4 lock, and `copy_page_to_iter()` then copies the very bytes that are being modified.
Root Cause Analysis
KCSAN reports a data race because the raw block device reader performs an unannotated bulk copy (`_copy_to_iter`) of a page while the filesystem updates a directory entry (`pde->rec_len`) in that same page through a plain store. Reading a raw block device underneath a mounted filesystem is inherently racy, and Linux makes no coherence guarantee for such reads. Still, because the store is a plain access, the compiler is in principle free to tear it, and KCSAN flags the access pair.
Unfortunately, we were unable to generate a reproducer for this bug.
Potential Impact
We believe this data race is benign. A reader of the raw device under a live filesystem has to accept that the data it copies may be torn or mid-update. ext4 itself relies on directory locking and journal commits, so filesystem consistency is unaffected. However, the unannotated plain write is still flagged by KCSAN.
Proposed Fix
Disabling direct block device access to mounted filesystems would be the broader architectural change. As a localized measure to address the KCSAN report on this frequently exercised path, the update of `rec_len` can be marked with `WRITE_ONCE()`, which tells the compiler to perform the store as a single access and marks it as intentionally concurrent.
```diff
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2667,12 +2667,12 @@ int ext4_generic_delete_entry(struct inode *dir,
return -EFSCORRUPTED;
if (de == de_del) {
if (pde) {
- pde->rec_len = ext4_rec_len_to_disk(
+ WRITE_ONCE(pde->rec_len, ext4_rec_len_to_disk(
ext4_rec_len_from_disk(pde->rec_len,
blocksize) +
ext4_rec_len_from_disk(de->rec_len,
blocksize),
- blocksize);
+ blocksize));
/* wipe entire dir_entry */
memset(de, 0, ext4_rec_len_from_disk(de->rec_len,
```
We would be highly honored if this could be of any help.
Best regards,
RacePilot Team