public inbox for linux-mm@kvack.org
* maple_tree: KCSAN: data-race in mas_wr_node_store / mtree_range_walk
@ 2026-03-11  3:27 Jianzhou Zhao
From: Jianzhou Zhao @ 2026-03-11  3:27 UTC
  To: linux-kernel, aliceryhl, Liam.Howlett, andrewjballance, akpm,
	maple-tree, linux-mm

Subject: [BUG] maple_tree: KCSAN: data-race in mas_wr_node_store / mtree_range_walk

Dear Maintainers,

We are writing to report a KCSAN-detected data race in the Linux kernel's maple tree. It was found by our custom fuzzing tool, RacePilot. The race occurs when one task stores into a maple tree node while another task concurrently walks the tree under RCU. We observed it on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.

Call Trace & Context
==================================================================
BUG: KCSAN: data-race in mas_wr_node_store / mtree_range_walk

write to 0xffff888023e00900 of 8 bytes by task 62996 on cpu 0:
 mte_set_node_dead home/kfuzz/linux/lib/maple_tree.c:335 [inline]
 mas_put_in_tree home/kfuzz/linux/lib/maple_tree.c:1571 [inline]
 mas_replace_node home/kfuzz/linux/lib/maple_tree.c:1587 [inline]
 mas_wr_node_store+0xa5c/0xc10 home/kfuzz/linux/lib/maple_tree.c:3568
 mas_wr_store_entry+0xabd/0x1120 home/kfuzz/linux/lib/maple_tree.c:3780
 mas_store_prealloc+0x47c/0xa60 home/kfuzz/linux/lib/maple_tree.c:5191
 vma_iter_store_overwrite home/kfuzz/linux/mm/vma.h:481 [inline]
 vma_iter_store_new home/kfuzz/linux/mm/vma.h:488 [inline]
 __mmap_new_vma home/kfuzz/linux/mm/vma.c:2508 [inline]
 __mmap_region+0x12d5/0x1ef0 home/kfuzz/linux/mm/vma.c:2681
 mmap_region+0x15f/0x260 home/kfuzz/linux/mm/vma.c:2751
 do_mmap+0x754/0xcd0 home/kfuzz/linux/mm/mmap.c:558
 vm_mmap_pgoff+0x15d/0x2e0 home/kfuzz/linux/mm/util.c:587
 ksys_mmap_pgoff+0x7d/0x380 home/kfuzz/linux/mm/mmap.c:604
 __do_sys_mmap home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:89 [inline]
 __se_sys_mmap home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:82 [inline]
 __x64_sys_mmap+0x71/0xa0 home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:82
 x64_sys_call+0x1b42/0x2030 home/kfuzz/linux/arch/x86/include/generated/asm/syscalls_64.h:10
 do_syscall_x64 home/kfuzz/linux/arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xae/0x2c0 home/kfuzz/linux/arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888023e00900 of 8 bytes by task 62997 on cpu 1:
 ma_dead_node home/kfuzz/linux/lib/maple_tree.c:576 [inline]
 mtree_range_walk+0x11e/0x630 home/kfuzz/linux/lib/maple_tree.c:2594
 mas_state_walk home/kfuzz/linux/lib/maple_tree.c:3313 [inline]
 mas_walk+0x2a4/0x400 home/kfuzz/linux/lib/maple_tree.c:4617
 lock_vma_under_rcu+0xd3/0x710 home/kfuzz/linux/mm/mmap_lock.c:238
 do_user_addr_fault home/kfuzz/linux/arch/x86/mm/fault.c:1327 [inline]
 handle_page_fault home/kfuzz/linux/arch/x86/mm/fault.c:1476 [inline]
 exc_page_fault+0x294/0x10d0 home/kfuzz/linux/arch/x86/mm/fault.c:1532
 asm_exc_page_fault+0x26/0x30 home/kfuzz/linux/arch/x86/include/asm/idtentry.h:618

value changed: 0xffff88800bf0d706 -> 0xffff888023e00900

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 62997 Comm: syz.8.4355 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #42 PREEMPT(voluntary) 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================

Execution Flow & Code Context
On CPU 0, a task is modifying the maple tree that maps memory ranges, via `__mmap_region()`. The tree update path reaches `mas_wr_node_store()`, which calls `mas_replace_node()` to swap the old node for the new one. While replacing the node, `mas_put_in_tree()` calls `mte_set_node_dead()`, which marks the old node dead by making `node->parent` point to the node itself, using a plain store:
```c
// lib/maple_tree.c
static inline void mte_set_node_dead(struct maple_enode *mn)
{
	mte_to_node(mn)->parent = ma_parent_ptr(mte_to_node(mn)); // <-- Write
	smp_wmb(); /* Needed for RCU */
}
```

Simultaneously, CPU 1 handles a page fault through the lockless RCU lookup in `lock_vma_under_rcu()`. The maple tree traversal routine `mtree_range_walk()` calls `ma_dead_node()` on each node it fetches to make sure it has not stepped into a dead node. `ma_dead_node()` reads `node->parent` locklessly with a plain, unannotated load:
```c
// lib/maple_tree.c
static __always_inline bool ma_dead_node(const struct maple_node *node)
{
	struct maple_node *parent;

	/* Do not reorder reads from the node prior to the parent check */
	smp_rmb();
	parent = (void *)((unsigned long)node->parent & ~MAPLE_NODE_MASK); // <-- Lockless Read
	return (parent == node);
}
```

Root Cause Analysis
The race is on `node->parent`: the writer stores to it in `mte_set_node_dead()` to mark the node dead, while the fast-path page fault walk concurrently loads it in `ma_dead_node()` to decide whether the node is still live. Both are plain C accesses. The surrounding `smp_wmb()` and `smp_rmb()` order them against other accesses, but nothing marks the accesses themselves as intentionally concurrent.
Unfortunately, we were unable to generate a reproducer for this bug.

Potential Impact
If `ma_dead_node()` observes a torn or stale pointer because the accesses lack compiler annotations (load or store tearing, or optimizations such as value caching and hoisting), a dead node could be treated as live, or vice versa. This could lead to a use-after-free, memory corruption, infinite loops in the maple tree navigation routines, or a local denial of service (DoS) under heavy concurrent page-fault load.

Proposed Fix
To resolve this data race without slowing down the RCU walk path, we suggest marking the `node->parent` accesses with the kernel's standard concurrency annotations: the writer should use `WRITE_ONCE()` and the reader should load the pointer with `READ_ONCE()`.

```diff
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -332,7 +332,7 @@ static inline struct maple_node *mas_mn(const struct ma_state *mas)
 static inline void mte_set_node_dead(struct maple_enode *mn)
 {
-	mte_to_node(mn)->parent = ma_parent_ptr(mte_to_node(mn));
+	WRITE_ONCE(mte_to_node(mn)->parent, ma_parent_ptr(mte_to_node(mn)));
 	smp_wmb(); /* Needed for RCU */
 }
 
@@ -576,7 +576,8 @@ static __always_inline bool ma_dead_node(const struct maple_node *node)
 
 	/* Do not reorder reads from the node prior to the parent check */
 	smp_rmb();
-	parent = (void *)((unsigned long)node->parent & ~MAPLE_NODE_MASK);
+	parent = (void *)((unsigned long)READ_ONCE(node->parent) &
+			  ~MAPLE_NODE_MASK);
 	return (parent == node);
 }
```

We hope this report is helpful.

Best regards,
RacePilot Team
