From: Jianzhou Zhao @ 2026-03-11 7:44 UTC
To: linux-kernel, linux-mm, akpm, aliceryhl, Liam.Howlett,
andrewjballance, maple-tree
Subject: [BUG] maple_tree: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk
Dear Maintainers,
We are writing to report a KCSAN-detected data race in the Linux kernel's maple tree subsystem. The bug was found by our custom fuzzing tool, RacePilot. The race occurs because `mtree_range_walk` reads tree node pivots locklessly while a writer concurrently updates those bounds through `mas_wr_slot_store`, with neither side using memory access annotations. We observed this on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.
Call Trace & Context
==================================================================
BUG: KCSAN: data-race in mas_wr_store_entry / mtree_range_walk
write to 0xffff88800c50aa30 of 8 bytes by task 214464 on cpu 1:
mas_wr_slot_store lib/maple_tree.c:3601 [inline]
mas_wr_store_entry+0xf54/0x1120 lib/maple_tree.c:3777
mas_store_prealloc+0x47c/0xa60 lib/maple_tree.c:5191
vma_iter_store_overwrite mm/vma.h:481 [inline]
commit_merge+0x3ea/0x740 mm/vma.c:766
vma_merge_existing_range mm/vma.c:980 [inline]
vma_modify+0x5ff/0xdd0 mm/vma.c:1620
vma_modify_flags+0x16c/0x1a0 mm/vma.c:1662
mprotect_fixup+0x170/0x660 mm/mprotect.c:816
do_mprotect_pkey+0x5fe/0x930 mm/mprotect.c:990
__do_sys_mprotect mm/mprotect.c:1011 [inline]
__se_sys_mprotect mm/mprotect.c:1008 [inline]
__x64_sys_mprotect+0x47/0x60 mm/mprotect.c:1008
x64_sys_call+0xc6c/0x2030 arch/x86/include/generated/asm/syscalls_64.h:11
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xae/0x2c0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
read to 0xffff88800c50aa30 of 8 bytes by task 214467 on cpu 0:
mtree_range_walk+0x368/0x630 lib/maple_tree.c:2581
mas_state_walk lib/maple_tree.c:3313 [inline]
mas_walk+0x2a4/0x400 lib/maple_tree.c:4617
lock_vma_under_rcu+0xd3/0x710 mm/mmap_lock.c:238
do_user_addr_fault arch/x86/mm/fault.c:1327 [inline]
handle_page_fault arch/x86/mm/fault.c:1476 [inline]
exc_page_fault+0x294/0x10d0 arch/x86/mm/fault.c:1532
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:618
value changed: 0x00007f571d344fff -> 0x00007f571d324fff
Reported by Kernel Concurrency Sanitizer on:
CPU: 0 UID: 0 PID: 214467 Comm: syz.8.13076 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #44 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================
Execution Flow & Code Context
During a virtual memory modification (via `mprotect`), `vma_modify` invokes `mas_store_prealloc`, which calls `mas_wr_store_entry` -> `mas_wr_slot_store` to store the new entry and adjust node boundaries. This updates the `wr_mas->pivots` array with plain assignments:
```c
// lib/maple_tree.c
static inline void mas_wr_slot_store(struct ma_wr_state *wr_mas)
{
...
if (mas->index == wr_mas->r_min) {
/* Overwriting the range and a part of the next one */
rcu_assign_pointer(slots[offset], wr_mas->entry);
wr_mas->pivots[offset] = mas->last; // <-- Plain Write
} else {
...
}
```
Concurrently, a page fault handler (`do_user_addr_fault`) tries to lock the VMA under RCU via `lock_vma_under_rcu` -> `mas_walk`, which descends into `mtree_range_walk()`. The reader locklessly accesses `pivots[offset]` to locate the range containing `mas->index`:
```c
// lib/maple_tree.c
static inline void *mtree_range_walk(struct ma_state *mas)
{
...
if (pivots[0] >= mas->index) { // <-- Plain Read
offset = 0;
max = pivots[0]; // <-- Potentially double read / torn value
goto next;
}
offset = 1;
while (offset < end) {
if (pivots[offset] >= mas->index) { // <-- Plain Read
max = pivots[offset];
break;
}
offset++;
}
...
}
```
Root Cause Analysis
A data race occurs because `mtree_range_walk()` reads the `pivots[]` boundaries locklessly in an RCU reader, while `mas_wr_slot_store()` can concurrently update the very same slots. Both accesses are compiled as plain C loads and stores, so the compiler and hardware are free to tear or reorder them; `mtree_range_walk` could therefore witness a partially updated pivot and assign a broken limit to `max`.
Furthermore, `pivots[offset]` is read twice in succession (once in the comparison and once in `max = pivots[offset]`), so the two plain loads may return different values if the writer races in between them.
Unfortunately, we were unable to generate a reproducer for this bug.
Potential Impact
A torn pivot read, or two inconsistent reads of the same pivot, can misdirect the range walk, causing the page fault path to look up the wrong VMA or derail the tree navigation logic. Under heavy concurrent memory-map activity this could manifest as a local Denial of Service (DoS): spurious faults on valid mappings, or panics from mismatched virtual mappings.
Proposed Fix
To inform the compiler that these slots are concurrently accessed by RCU readers, and to rule out load/store tearing, `WRITE_ONCE()` should be used when updating pivots in `mas_wr_slot_store`. Correspondingly, the walker should fetch each pivot exactly once via `READ_ONCE()` into a local variable and use only that local:
```diff
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2577,15 +2577,18 @@ static inline void *mtree_range_walk(struct ma_state *mas)
end = ma_data_end(node, type, pivots, max);
prev_min = min;
prev_max = max;
- if (pivots[0] >= mas->index) {
+
+ unsigned long pivot = READ_ONCE(pivots[0]);
+ if (pivot >= mas->index) {
offset = 0;
- max = pivots[0];
+ max = pivot;
goto next;
}
offset = 1;
while (offset < end) {
- if (pivots[offset] >= mas->index) {
- max = pivots[offset];
+ pivot = READ_ONCE(pivots[offset]);
+ if (pivot >= mas->index) {
+ max = pivot;
break;
}
offset++;
}
@@ -3605,10 +3613,10 @@ static inline void mas_wr_slot_store(struct ma_wr_state *wr_mas)
if (mas->index == wr_mas->r_min) {
/* Overwriting the range and a part of the next one */
rcu_assign_pointer(slots[offset], wr_mas->entry);
- wr_mas->pivots[offset] = mas->last;
+ WRITE_ONCE(wr_mas->pivots[offset], mas->last);
} else {
/* Overwriting a part of the range and the next one */
rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
- wr_mas->pivots[offset] = mas->index - 1;
+ WRITE_ONCE(wr_mas->pivots[offset], mas->index - 1);
mas->offset++; /* Keep mas accurate. */
}
} else {
@@ -3621,8 +3629,8 @@ static inline void mas_wr_slot_store(struct ma_wr_state *wr_mas)
*/
gap |= !mt_slot_locked(mas->tree, slots, offset + 2);
rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
- wr_mas->pivots[offset] = mas->index - 1;
- wr_mas->pivots[offset + 1] = mas->last;
+ WRITE_ONCE(wr_mas->pivots[offset], mas->index - 1);
+ WRITE_ONCE(wr_mas->pivots[offset + 1], mas->last);
mas->offset++; /* Keep mas accurate. */
}
```
We hope this report proves helpful.
Best regards,
RacePilot Team