From: "Jianzhou Zhao" <luckd0g@163.com>
To: linux-kernel@vger.kernel.org, aliceryhl@google.com,
	Liam.Howlett@oracle.com, andrewjballance@gmail.com,
	akpm@linux-foundation.org, maple-tree@lists.infradead.org,
	linux-mm@kvack.org
Subject: maple_tree: KCSAN: data-race in mas_wr_node_store / mtree_range_walk
Date: Wed, 11 Mar 2026 11:27:16 +0800 (CST)	[thread overview]
Message-ID: <480ffa8f.3729.19cdaef38bd.Coremail.luckd0g@163.com> (raw)


Dear Maintainers,

We are writing to report a data race in the Linux kernel detected by KCSAN. The bug was found by our custom fuzzing tool, RacePilot, and occurs in the maple tree code during concurrent node replacement and lockless (RCU) tree traversal. We observed it on Linux kernel version 6.18.0-08691-g2061f18ad76e-dirty.

Call Trace & Context
==================================================================
BUG: KCSAN: data-race in mas_wr_node_store / mtree_range_walk

write to 0xffff888023e00900 of 8 bytes by task 62996 on cpu 0:
 mte_set_node_dead home/kfuzz/linux/lib/maple_tree.c:335 [inline]
 mas_put_in_tree home/kfuzz/linux/lib/maple_tree.c:1571 [inline]
 mas_replace_node home/kfuzz/linux/lib/maple_tree.c:1587 [inline]
 mas_wr_node_store+0xa5c/0xc10 home/kfuzz/linux/lib/maple_tree.c:3568
 mas_wr_store_entry+0xabd/0x1120 home/kfuzz/linux/lib/maple_tree.c:3780
 mas_store_prealloc+0x47c/0xa60 home/kfuzz/linux/lib/maple_tree.c:5191
 vma_iter_store_overwrite home/kfuzz/linux/mm/vma.h:481 [inline]
 vma_iter_store_new home/kfuzz/linux/mm/vma.h:488 [inline]
 __mmap_new_vma home/kfuzz/linux/mm/vma.c:2508 [inline]
 __mmap_region+0x12d5/0x1ef0 home/kfuzz/linux/mm/vma.c:2681
 mmap_region+0x15f/0x260 home/kfuzz/linux/mm/vma.c:2751
 do_mmap+0x754/0xcd0 home/kfuzz/linux/mm/mmap.c:558
 vm_mmap_pgoff+0x15d/0x2e0 home/kfuzz/linux/mm/util.c:587
 ksys_mmap_pgoff+0x7d/0x380 home/kfuzz/linux/mm/mmap.c:604
 __do_sys_mmap home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:89 [inline]
 __se_sys_mmap home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:82 [inline]
 __x64_sys_mmap+0x71/0xa0 home/kfuzz/linux/arch/x86/kernel/sys_x86_64.c:82
 x64_sys_call+0x1b42/0x2030 home/kfuzz/linux/arch/x86/include/generated/asm/syscalls_64.h:10
 do_syscall_x64 home/kfuzz/linux/arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xae/0x2c0 home/kfuzz/linux/arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

read to 0xffff888023e00900 of 8 bytes by task 62997 on cpu 1:
 ma_dead_node home/kfuzz/linux/lib/maple_tree.c:576 [inline]
 mtree_range_walk+0x11e/0x630 home/kfuzz/linux/lib/maple_tree.c:2594
 mas_state_walk home/kfuzz/linux/lib/maple_tree.c:3313 [inline]
 mas_walk+0x2a4/0x400 home/kfuzz/linux/lib/maple_tree.c:4617
 lock_vma_under_rcu+0xd3/0x710 home/kfuzz/linux/mm/mmap_lock.c:238
 do_user_addr_fault home/kfuzz/linux/arch/x86/mm/fault.c:1327 [inline]
 handle_page_fault home/kfuzz/linux/arch/x86/mm/fault.c:1476 [inline]
 exc_page_fault+0x294/0x10d0 home/kfuzz/linux/arch/x86/mm/fault.c:1532
 asm_exc_page_fault+0x26/0x30 home/kfuzz/linux/arch/x86/include/asm/idtentry.h:618

value changed: 0xffff88800bf0d706 -> 0xffff888023e00900

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 UID: 0 PID: 62997 Comm: syz.8.4355 Not tainted 6.18.0-08691-g2061f18ad76e-dirty #42 PREEMPT(voluntary) 
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
==================================================================

Execution Flow & Code Context
The task on CPU 0 is modifying the maple tree that maps the process's memory ranges, via `__mmap_region()`. The update path uses `mas_wr_node_store()`, which calls `mas_replace_node()` to swap the old node for the new one. As part of the replacement, `mte_set_node_dead()` performs a plain write that points `node->parent` at the node itself, marking the node dead:
```c
// lib/maple_tree.c
static inline void mte_set_node_dead(struct maple_enode *mn)
{
	mte_to_node(mn)->parent = ma_parent_ptr(mte_to_node(mn)); // <-- Write
	smp_wmb(); /* Needed for RCU */
}
```

Simultaneously, CPU 1 handles a page fault via the lockless RCU lookup in `lock_vma_under_rcu()`. The traversal routine `mtree_range_walk()` calls `ma_dead_node()` on each node it fetches, to ensure it has not stepped into a dead node. `ma_dead_node()` fetches `node->parent` with a plain, unannotated load:
```c
// lib/maple_tree.c
static __always_inline bool ma_dead_node(const struct maple_node *node)
{
	struct maple_node *parent;

	/* Do not reorder reads from the node prior to the parent check */
	smp_rmb();
	parent = (void *)((unsigned long)node->parent & ~MAPLE_NODE_MASK); // <-- Lockless Read
	return (parent == node);
}
```

Root Cause Analysis
The data race is on `node->parent`: the writer marks the node dead with a plain store in `mte_set_node_dead()`, while the page-fault fast path concurrently reads the field in `ma_dead_node()` to decide whether the node is still live. Neither access is annotated, so the compiler is free to tear, cache, or reorder them.
Unfortunately, we were unable to generate a reproducer for this bug.

Potential Impact
If `ma_dead_node()` observes a torn or stale pointer (load/store tearing, or compiler optimizations such as caching or hoisting the load), a dead node could be treated as live, or vice versa. This could lead to a use-after-free, memory corruption, infinite loops in the maple tree navigation routines, or a local denial of service under heavy concurrent page-fault load.

Proposed Fix
To resolve the data race without hurting the performance of the RCU walk path, we suggest annotating the `node->parent` accesses: the writer should use `WRITE_ONCE()` and the reader should fetch the pointer via `READ_ONCE()`.

```diff
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -332,7 +332,7 @@ static inline struct maple_node *mas_mn(const struct ma_state *mas)
 static inline void mte_set_node_dead(struct maple_enode *mn)
 {
-	mte_to_node(mn)->parent = ma_parent_ptr(mte_to_node(mn));
+	WRITE_ONCE(mte_to_node(mn)->parent, ma_parent_ptr(mte_to_node(mn)));
 	smp_wmb(); /* Needed for RCU */
 }
 
@@ -576,7 +576,8 @@ static __always_inline bool ma_dead_node(const struct maple_node *node)
 
 	/* Do not reorder reads from the node prior to the parent check */
 	smp_rmb();
-	parent = (void *)((unsigned long)node->parent & ~MAPLE_NODE_MASK);
+	parent = (void *)((unsigned long)READ_ONCE(node->parent) &
+			  ~MAPLE_NODE_MASK);
 	return (parent == node);
 }
```

We hope this report is of some help.

Best regards,
RacePilot Team
