* [PATCH v2 00/19] maple_tree: lock checking and clean ups
@ 2026-06-30 19:08 Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
` (19 more replies)
0 siblings, 20 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
The goals of this series are:
1. Lock issue detection
A number of syzbot reports are incorrectly pointing to the mm exit as
a source of the locking error. The first three patches attempt to
help users detect errors in their locking - but they still have to use
LOCKDEP. I guess it's still down to hope and prayers.
2. Documentation fixes
The documentation was lacking clarity, there are updates to try and
help the users, especially around the erase() cases.
3. Two benign issues
The cyclic allocator may have a race, although no in-kernel user can
hit it.
The erase functions may cause allocation issues if used with the
incorrect locking type, but none are present in-tree.
Beyond these goals there are some test fixes, some general speed-up
patches targeting extra work and cycles, and dropping dead code.
v1: https://lore.kernel.org/all/20260629144145.1572283-1-liam@infradead.org/
Changes since v1:
- Relocated mas_set_parent_slots() lower so that mt_slots_locked() for rcu dereferencing - Thanks kernel build bot
- Made mas_lockdep_map() static - Thanks kernel build bot
- Added mas_lock_check() to mas_may_activate() when restoring a node to active - Thanks sashiko
- Changed mas_lock_check() aliasing to use mt_locked() - Thanks sashiko
- Updated __lock_unpin_lock() underflow check - Thanks sashiko
- Updated checking the lock sequence based on the lockdep map - Thanks sashiko
- Add a comment to mas_wr_node_store() to explain why the last slot isn't cleared. - Thanks sashiko
- Drop mas_set_parent() function as it is no longer used - Thanks sashiko
- Clear the mas_alloc_cyclic() wrap flag after storage - Thanks sashiko
- Add might_allocate() to the mtree_erase() call - Thanks Jason
- Added one more cleanup for making maple state walkable.
Liam R. Howlett (Oracle) (19):
maple_tree: Add rcu locking check when LOCKDEP is enabled
locking/lockdep: Add sequence counter to held_lock
maple_tree: Add write lock checking with lockdep sequence numbers
maple_tree: Documentation fix
maple_tree: Drop dead code from mas_extend_spanning_null()
maple_tree: Drop MAPLE_ALLOC_SLOTS
maple_tree: Clarify comments on mas_nomem()
maple_tree: Use prefetched value in mas_wr_store_type()
maple_tree: Optimise mas_wr_node_store() when not in rcu mode
maple_tree: micro optimisation of mas_wr_store_type()
maple_tree: Add bulk parent set helper
maple_tree: Catch race in mas_alloc_cyclic()
maple_tree: Document that erase may use GFP_KERNEL for allocations
maple_tree: WARN_ON_ONCE when allocations fail
maple_tree: Document erase and allocations better
maple_tree: Change two GFP flags in tests
maple_tree: Fix argument name in header
maple_tree: Avoid extra gap calculation
maple_tree: Add helper mas_make_walkable()
Documentation/core-api/maple_tree.rst | 21 +-
include/linux/lockdep.h | 3 +
include/linux/lockdep_types.h | 3 +-
include/linux/maple_tree.h | 10 +-
include/linux/sched.h | 1 +
kernel/locking/lockdep.c | 58 ++++-
lib/maple_tree.c | 353 ++++++++++++++++++--------
tools/testing/radix-tree/maple.c | 4 +-
8 files changed, 326 insertions(+), 127 deletions(-)
--
2.47.3
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
` (18 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
When CONFIG_LOCKDEP and CONFIG_RCU_STRICT_GRACE_PERIOD is enabled, check
for rcu locking issues by recording the grace period in the maple state
and checking the rcu window is still valid whenever the maple state is
reused with a state that is not MA_START or MA_PAUSED.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
include/linux/maple_tree.h | 3 +++
lib/maple_tree.c | 50 +++++++++++++++++++++++++++++++++++++-
2 files changed, 52 insertions(+), 1 deletion(-)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 4a5631906affb..11a16c3508dc0 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -485,6 +485,9 @@ struct ma_state {
unsigned char mas_flags;
unsigned char end; /* The end of the node */
enum store_type store_type; /* The type of store needed for this operation */
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+ unsigned long rcu_gp;
+#endif
};
struct ma_wr_state {
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index e52876435b77f..3dff0435c13a7 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1153,6 +1153,42 @@ static inline void mas_free(struct ma_state *mas, struct maple_enode *used)
ma_free_rcu(mte_to_node(used));
}
+void mas_lock_check(struct ma_state *mas)
+{
+
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+ if (!mas_is_active(mas))
+ return;
+
+ if (!mt_locked(mas->tree)) {
+ if (mt_in_rcu(mas->tree))
+ WARN_ON_ONCE(poll_state_synchronize_rcu(mas->rcu_gp));
+ }
+#endif
+
+}
+
+void mas_init_lock_check(struct ma_state *mas)
+{
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+ if (!mt_locked(mas->tree)) {
+ if (mt_in_rcu(mas->tree))
+ mas->rcu_gp = get_state_synchronize_rcu();
+ }
+#endif
+
+}
+
+static void mas_may_init_lock_check(struct ma_state *mas)
+{
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+ if (mas_is_start(mas) || mas_is_paused(mas))
+ mas_init_lock_check(mas);
+ else
+ mas_lock_check(mas);
+#endif
+}
+
/*
* mas_start() - Sets up maple state for operations.
* @mas: The maple state.
@@ -1171,6 +1207,7 @@ static inline struct maple_enode *mas_start(struct ma_state *mas)
if (likely(mas_is_start(mas))) {
struct maple_enode *root;
+ mas_init_lock_check(mas);
mas->min = 0;
mas->max = ULONG_MAX;
@@ -4360,6 +4397,7 @@ void *mas_walk(struct ma_state *mas)
{
void *entry;
+ mas_may_init_lock_check(mas);
if (!mas_is_active(mas) && !mas_is_start(mas))
mas->status = ma_start;
retry:
@@ -4997,6 +5035,7 @@ static void mas_may_activate(struct ma_state *mas)
mas->status = ma_start;
} else {
mas->status = ma_active;
+ mas_lock_check(mas);
}
}
@@ -5074,6 +5113,7 @@ void *mas_next(struct ma_state *mas, unsigned long max)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_next_setup(mas, max, &entry))
return entry;
@@ -5097,6 +5137,7 @@ void *mas_next_range(struct ma_state *mas, unsigned long max)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_next_setup(mas, max, &entry))
return entry;
@@ -5205,6 +5246,7 @@ void *mas_prev(struct ma_state *mas, unsigned long min)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_prev_setup(mas, min, &entry))
return entry;
@@ -5228,6 +5270,7 @@ void *mas_prev_range(struct ma_state *mas, unsigned long min)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_prev_setup(mas, min, &entry))
return entry;
@@ -5274,6 +5317,7 @@ EXPORT_SYMBOL_GPL(mt_prev);
*/
void mas_pause(struct ma_state *mas)
{
+ mas_lock_check(mas);
mas->status = ma_pause;
mas->node = NULL;
}
@@ -5382,6 +5426,7 @@ void *mas_find(struct ma_state *mas, unsigned long max)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_find_setup(mas, max, &entry))
return entry;
@@ -5409,6 +5454,7 @@ void *mas_find_range(struct ma_state *mas, unsigned long max)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_find_setup(mas, max, &entry))
return entry;
@@ -5521,6 +5567,7 @@ void *mas_find_rev(struct ma_state *mas, unsigned long min)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_find_rev_setup(mas, min, &entry))
return entry;
@@ -5547,6 +5594,7 @@ void *mas_find_range_rev(struct ma_state *mas, unsigned long min)
{
void *entry = NULL;
+ mas_may_init_lock_check(mas);
if (mas_find_rev_setup(mas, min, &entry))
return entry;
@@ -5623,7 +5671,7 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
if (!mas->sheaf && !mas->alloc)
return false;
- mas->status = ma_start;
+ mas_reset(mas);
return true;
}
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
` (17 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long, Peter Zijlstra
Add an 8 bit small sequence counter to the held_lock struct to detect if
the lock as been dropped and reacquired. This is useful when a data
structure depends on a constant locking context, but is not able to
detect locking and unlocking of the lock through its own API.
Since the __lock_unpin_lock() will no longer detect underflow by casting
the unsigned int to a signed int, update the casting code to use a temp
variable for calculations using a signed int.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/all/h3tpnj5kzcrxms5picmimtkpg4aypcpip5wbd6bt2rpdj5k7eb@nhtzs3lefrkq/
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
include/linux/lockdep.h | 3 ++
include/linux/lockdep_types.h | 3 +-
include/linux/sched.h | 1 +
kernel/locking/lockdep.c | 58 ++++++++++++++++++++++++++++++-----
4 files changed, 57 insertions(+), 8 deletions(-)
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 621566345406d..a6451ecbbe9a0 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -273,6 +273,9 @@ extern struct pin_cookie lock_pin_lock(struct lockdep_map *lock);
extern void lock_repin_lock(struct lockdep_map *lock, struct pin_cookie);
extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
+extern u32 lock_sequence(struct lockdep_map *lock);
+#define lockdep_sequence(lock) lock_sequence(&(lock)->dep_map)
+
#define lockdep_depth(tsk) (debug_locks ? (tsk)->lockdep_depth : 0)
#define lockdep_assert(cond) \
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index eae115a264885..55c4b152fedf7 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -253,7 +253,8 @@ struct held_lock {
unsigned int hardirqs_off:1;
unsigned int sync:1;
unsigned int references:11; /* 32 bits */
- unsigned int pin_count;
+ unsigned int pin_count:24;
+ unsigned int seq_count:8;
};
#else /* !CONFIG_LOCKDEP */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 373bcc0598d10..14d5ce8dd6136 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1288,6 +1288,7 @@ struct task_struct {
u64 curr_chain_key;
int lockdep_depth;
unsigned int lockdep_recursion;
+ unsigned int lockdep_seq;
struct held_lock held_locks[MAX_LOCK_DEPTH];
#endif
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2d4c5bab5af88..a69567bdd7912 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5077,7 +5077,7 @@ static int __lock_is_held(const struct lockdep_map *lock, int read);
static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
int trylock, int read, int check, int hardirqs_off,
struct lockdep_map *nest_lock, unsigned long ip,
- int references, int pin_count, int sync)
+ int references, int pin_count, int sync, int seq)
{
struct task_struct *curr = current;
struct lock_class *class = NULL;
@@ -5183,6 +5183,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
hlock->holdtime_stamp = lockstat_clock();
#endif
hlock->pin_count = pin_count;
+ hlock->seq_count = seq;
if (check_wait_context(curr, hlock))
return 0;
@@ -5388,7 +5389,7 @@ static int reacquire_held_locks(struct task_struct *curr, unsigned int depth,
hlock->read, hlock->check,
hlock->hardirqs_off,
hlock->nest_lock, hlock->acquire_ip,
- hlock->references, hlock->pin_count, 0)) {
+ hlock->references, hlock->pin_count, 0, hlock->seq_count)) {
case 0:
return 1;
case 1:
@@ -5669,14 +5670,17 @@ static void __lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie
struct held_lock *hlock = curr->held_locks + i;
if (match_held_lock(hlock, lock)) {
+ int pin_count;
+
if (WARN(!hlock->pin_count, "unpinning an unpinned lock\n"))
return;
- hlock->pin_count -= cookie.val;
+ pin_count = hlock->pin_count - cookie.val;
- if (WARN((int)hlock->pin_count < 0, "pin count corrupted\n"))
- hlock->pin_count = 0;
+ if (WARN(pin_count < 0, "pin count corrupted\n"))
+ pin_count = 0;
+ hlock->pin_count = pin_count;
return;
}
}
@@ -5684,6 +5688,24 @@ static void __lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie
WARN(1, "unpinning an unheld lock\n");
}
+static u32 __lock_sequence(struct lockdep_map *lock)
+{
+ struct task_struct *curr = current;
+ int i;
+
+ if (unlikely(!debug_locks))
+ return ~0;
+
+ for (i = 0; i < curr->lockdep_depth; i++) {
+ struct held_lock *hlock = curr->held_locks + i;
+
+ if (match_held_lock(hlock, lock))
+ return hlock->seq_count;
+ }
+
+ return ~0;
+}
+
/*
* Check whether we follow the irq-flags state precisely:
*/
@@ -5866,7 +5888,8 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
lockdep_recursion_inc();
__lock_acquire(lock, subclass, trylock, read, check,
- irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 0);
+ irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 0,
+ ++current->lockdep_seq);
lockdep_recursion_finish();
raw_local_irq_restore(flags);
}
@@ -5914,7 +5937,8 @@ void lock_sync(struct lockdep_map *lock, unsigned subclass, int read,
lockdep_recursion_inc();
__lock_acquire(lock, subclass, 0, read, check,
- irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 1);
+ irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 1,
+ ++current->lockdep_seq);
check_chain_key(current);
lockdep_recursion_finish();
raw_local_irq_restore(flags);
@@ -6000,6 +6024,26 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
}
EXPORT_SYMBOL_GPL(lock_unpin_lock);
+u32 lock_sequence(struct lockdep_map *lock)
+{
+ unsigned long flags;
+ u32 seq = ~0;
+
+ if (unlikely(!lockdep_enabled()))
+ return seq;
+
+ raw_local_irq_save(flags);
+ check_flags(flags);
+
+ lockdep_recursion_inc();
+ seq = __lock_sequence(lock);
+ lockdep_recursion_finish();
+ raw_local_irq_restore(flags);
+
+ return seq;
+}
+EXPORT_SYMBOL_GPL(lock_sequence);
+
#ifdef CONFIG_LOCK_STAT
static void print_lock_contention_bug(struct task_struct *curr,
struct lockdep_map *lock,
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
` (16 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Use the lockdep sequence numbers to ensure the write lock is not dropped
between write operations. The lockdep sequence is recorded on any walk
that starts from the top of the tree and re-checked prior to any
operation using an active node.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
include/linux/maple_tree.h | 7 +++--
lib/maple_tree.c | 58 ++++++++++++++++++++++++++++++--------
2 files changed, 51 insertions(+), 14 deletions(-)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 11a16c3508dc0..220b380cdd491 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -485,9 +485,12 @@ struct ma_state {
unsigned char mas_flags;
unsigned char end; /* The end of the node */
enum store_type store_type; /* The type of store needed for this operation */
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+#ifdef CONFIG_LOCKDEP
+ u32 ld_seq;
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
unsigned long rcu_gp;
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+#endif /* CONFIG_LOCKDEP */
};
struct ma_wr_state {
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 3dff0435c13a7..e2f7474572319 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1153,40 +1153,71 @@ static inline void mas_free(struct ma_state *mas, struct maple_enode *used)
ma_free_rcu(mte_to_node(used));
}
-void mas_lock_check(struct ma_state *mas)
+
+#ifdef CONFIG_LOCKDEP
+static struct lockdep_map *mas_lockdep_map(struct ma_state *mas)
{
+ struct maple_tree *mt = mas->tree;
+
+ if (mt_external_lock(mt))
+ return mt->ma_external_lock;
+
+ return &(mt->ma_lock).dep_map;
+}
+
+#endif
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+static void mas_lock_check(struct ma_state *mas)
+{
+#ifdef CONFIG_LOCKDEP
+ struct lockdep_map *map;
if (!mas_is_active(mas))
return;
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
if (!mt_locked(mas->tree)) {
if (mt_in_rcu(mas->tree))
WARN_ON_ONCE(poll_state_synchronize_rcu(mas->rcu_gp));
}
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+
+ map = mas_lockdep_map(mas);
+ if (map && lock_is_held(map))
+ WARN_ON_ONCE(mas->ld_seq != lock_sequence(map));
+#endif /* CONFIG_LOCKDEP */
}
-void mas_init_lock_check(struct ma_state *mas)
+static void mas_init_lock_check(struct ma_state *mas)
{
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+#ifdef CONFIG_LOCKDEP
+ struct lockdep_map *map;
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
if (!mt_locked(mas->tree)) {
if (mt_in_rcu(mas->tree))
mas->rcu_gp = get_state_synchronize_rcu();
+ return;
}
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+
+ map = mas_lockdep_map(mas);
+ if (map && lock_is_held(map))
+ mas->ld_seq = lock_sequence(mas_lockdep_map(mas));
+#endif /* CONFIG_LOCKDEP */
}
static void mas_may_init_lock_check(struct ma_state *mas)
{
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
- if (mas_is_start(mas) || mas_is_paused(mas))
+#ifdef CONFIG_LOCKDEP
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
+ if (mas_is_start(mas) || mas_is_paused(mas)) {
mas_init_lock_check(mas);
- else
- mas_lock_check(mas);
-#endif
+ return;
+ }
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+ mas_lock_check(mas);
+#endif /* CONFIG_LOCKDEP */
}
/*
@@ -4869,6 +4900,7 @@ void *mas_store(struct ma_state *mas, void *entry)
{
MA_WR_STATE(wr_mas, mas, entry);
+ mas_may_init_lock_check(mas);
trace_ma_write(TP_FCT, mas, 0, entry);
#ifdef CONFIG_DEBUG_MAPLE_TREE
if (MAS_WARN_ON(mas, mas->index > mas->last))
@@ -4927,6 +4959,7 @@ int mas_store_gfp(struct ma_state *mas, void *entry, gfp_t gfp)
MA_WR_STATE(wr_mas, mas, entry);
int ret = 0;
+ mas_may_init_lock_check(mas);
retry:
mas_wr_preallocate(&wr_mas, entry);
if (unlikely(mas_nomem(mas, gfp))) {
@@ -4957,6 +4990,7 @@ void mas_store_prealloc(struct ma_state *mas, void *entry)
{
MA_WR_STATE(wr_mas, mas, entry);
+ mas_lock_check(mas);
if (mas->store_type == wr_store_root) {
mas_wr_prealloc_setup(&wr_mas);
goto store;
@@ -4989,6 +5023,7 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
{
MA_WR_STATE(wr_mas, mas, entry);
+ mas_may_init_lock_check(mas);
mas_wr_prealloc_setup(&wr_mas);
mas->store_type = mas_wr_store_type(&wr_mas);
mas_prealloc_calc(&wr_mas, entry);
@@ -5474,7 +5509,6 @@ EXPORT_SYMBOL_GPL(mas_find_range);
static bool mas_find_rev_setup(struct ma_state *mas, unsigned long min,
void **entry)
{
-
switch (mas->status) {
case ma_active:
goto active;
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 04/19] maple_tree: Documentation fix
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (2 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
` (15 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Don't include the word flag in the quotes with the actual flag.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
Documentation/core-api/maple_tree.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index ccdd1615cf974..34964ec88d179 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -211,7 +211,7 @@ Advanced Locking
The maple tree uses a spinlock by default, but external locks can be used for
tree updates as well. To use an external lock, the tree must be initialized
-with the ``MT_FLAGS_LOCK_EXTERN flag``, this is usually done with the
+with the ``MT_FLAGS_LOCK_EXTERN`` flag, this is usually done with the
MTREE_INIT_EXT() #define, which takes an external lock as an argument.
Functions and structures
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (3 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
` (14 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
mas_extend_spanning_null() had a clause if the end of the range being
written (mas->last) is the same as the end of the existing range it is
overwriting (wr_mas->r_max), action will be taken.
This code path is not possible because the only calling function
increments mas->last (unless it's ULONG_MAX) to walk to one beyond the
write and then resets the value back to the initial value.
In the case of mas->last == ULONG_MAX, then the second part of the
statement will always be false - mas->last cannot be less than the node
max.
This code never executed and is flawed anyways (the arguments are
incorrectly ordered), so removing it is the safest action. Since the
code never executes, it is not fixing any issue so Fixes tag is not
given.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 7 -------
1 file changed, 7 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index e2f7474572319..97849db132e7a 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2996,13 +2996,6 @@ static inline void mas_extend_spanning_null(struct ma_wr_state *l_wr_mas,
if (r_mas->last < r_wr_mas->r_max)
r_mas->last = r_wr_mas->r_max;
r_mas->offset++;
- } else if ((r_mas->last == r_wr_mas->r_max) &&
- (r_mas->last < r_mas->max) &&
- !mas_slot_locked(r_mas, r_wr_mas->slots, r_mas->offset + 1)) {
- r_mas->last = mas_safe_pivot(r_mas, r_wr_mas->pivots,
- r_wr_mas->type, r_mas->offset + 1);
- r_mas->offset++;
- r_wr_mas->r_max = r_mas->last;
}
}
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (4 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
` (13 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
MAPLE_ALLOC_SLOTS is no longer used, so remove it.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
include/linux/maple_tree.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 220b380cdd491..346eae48cf15e 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -29,13 +29,11 @@
#define MAPLE_NODE_SLOTS 31 /* 256 bytes including ->parent */
#define MAPLE_RANGE64_SLOTS 16 /* 256 bytes */
#define MAPLE_ARANGE64_SLOTS 10 /* 240 bytes */
-#define MAPLE_ALLOC_SLOTS (MAPLE_NODE_SLOTS - 1)
#else
/* 32bit sizes */
#define MAPLE_NODE_SLOTS 63 /* 256 bytes including ->parent */
#define MAPLE_RANGE64_SLOTS 32 /* 256 bytes */
#define MAPLE_ARANGE64_SLOTS 21 /* 240 bytes */
-#define MAPLE_ALLOC_SLOTS (MAPLE_NODE_SLOTS - 2)
#endif /* defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) */
#define MAPLE_NODE_MASK 255UL
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (5 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
` (12 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
When an allocation completely fails, the return is false. If the
allocation succeeds or partially succeeds, return true to indicate a
retry of the operation. Note that since the lock may have been dropped,
the operation is retried from the start - including potentially
allocating more memory.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 97849db132e7a..3886b856d6e3a 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5676,10 +5676,11 @@ EXPORT_SYMBOL_GPL(mas_erase);
/**
* mas_nomem() - Check if there was an error allocating and do the allocation
- * if necessary If there are allocations, then free them.
+ * if necessary.
+ *
* @mas: The maple state
* @gfp: The GFP_FLAGS to use for allocations
- * Return: true on allocation, false otherwise.
+ * Return: False on no memory. True otherwise (partial success as well)
*/
bool mas_nomem(struct ma_state *mas, gfp_t gfp)
__must_hold(mas->tree->ma_lock)
@@ -5695,6 +5696,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
mas_alloc_nodes(mas, gfp);
}
+ /*
+ * Return false on zero forward progress. Partial allocations are kept
+ * so the retry path will attempt to get the rest.
+ */
if (!mas->sheaf && !mas->alloc)
return false;
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (6 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
` (11 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
The slot contents exist in wr_mas->content, which has less overhead than
reading the slot again.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 3886b856d6e3a..8f5a7fa10cf4c 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3279,7 +3279,7 @@ static inline void mas_wr_slot_store(struct ma_wr_state *wr_mas)
void __rcu **slots = wr_mas->slots;
bool gap = false;
- gap |= !mt_slot_locked(mas->tree, slots, offset);
+ gap |= !wr_mas->content;
gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
if (wr_mas->offset_end - offset == 1) {
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (7 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
` (10 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Clearing the entire node on the stack is unnecessary since most of the
node will be overwritten anyways. Just clear what isn't used after the
data is in place.
Benchmarking shows a speedup of 0.67% on a height 4 tree with 2048
entries.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 8f5a7fa10cf4c..1c96b3a5dab60 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3186,7 +3186,7 @@ static void mas_wr_spanning_store(struct ma_wr_state *wr_mas)
static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
{
unsigned char dst_offset, offset_end;
- unsigned char copy_size, node_pivots;
+ unsigned char copy_size, node_pivots, node_slots;
struct maple_node reuse, *newnode;
unsigned long *dst_pivots;
void __rcu **dst_slots;
@@ -3199,6 +3199,7 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
in_rcu = mt_in_rcu(mas->tree);
offset_end = wr_mas->offset_end;
node_pivots = mt_pivots[wr_mas->type];
+ node_slots = mt_slots[wr_mas->type];
/* Assume last adds an entry */
new_end = mas->end + 1 - offset_end + mas->offset;
if (mas->last == wr_mas->end_piv) {
@@ -3210,7 +3211,6 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
if (in_rcu) {
newnode = mas_pop_node(mas);
} else {
- memset(&reuse, 0, sizeof(struct maple_node));
newnode = &reuse;
}
@@ -3254,7 +3254,21 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
dst_pivots[new_end] = mas->max;
done:
- mas_leaf_set_meta(newnode, maple_leaf_64, new_end);
+ if (!in_rcu && new_end + 2 < node_slots) {
+ unsigned char clear_from = new_end + 1;
+
+ /*
+ * Note that the last slot is never cleared, since the metadata
+ * will be stored there or it has a value.
+ */
+ memset(dst_slots + clear_from, 0,
+ sizeof(void __rcu *) * (node_slots - clear_from));
+ if (clear_from < node_pivots)
+ memset(dst_pivots + clear_from, 0,
+ sizeof(unsigned long) * (node_pivots - clear_from));
+ }
+
+ mas_leaf_set_meta(newnode, wr_mas->type, new_end);
if (in_rcu) {
struct maple_enode *old_enode = mas->node;
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (8 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
` (9 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Use three new local booleans instead of reading other structures. This
has shown an increase of 0.62% on a 2048 entry tree of height 4.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 18 +++++++++++++++---
1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 1c96b3a5dab60..999a1d095e369 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3736,6 +3736,9 @@ static inline enum store_type mas_wr_store_type(struct ma_wr_state *wr_mas)
{
struct ma_state *mas = wr_mas->mas;
unsigned char new_end;
+ bool appending;
+ bool one_slot;
+ bool in_rcu;
if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
return wr_store_root;
@@ -3755,21 +3758,30 @@ static inline enum store_type mas_wr_store_type(struct ma_wr_state *wr_mas)
return wr_new_root;
new_end = mas_wr_new_end(wr_mas);
+ in_rcu = mt_in_rcu(mas->tree);
+ appending = mas->offset == mas->end;
+ one_slot = wr_mas->offset_end - mas->offset == 1;
+
/* Potential spanning rebalance collapsing a node */
if (new_end < mt_min_slots[wr_mas->type]) {
if (!mte_is_root(mas->node))
return wr_rebalance;
+ if (!in_rcu) {
+ if (appending)
+ return wr_append;
+ else if (mas->end == new_end && one_slot)
+ return wr_slot_store;
+ }
return wr_node_store;
}
if (new_end >= mt_slots[wr_mas->type])
return wr_split_store;
- if (!mt_in_rcu(mas->tree) && (mas->offset == mas->end))
+ if (!in_rcu && appending)
return wr_append;
- if ((new_end == mas->end) && (!mt_in_rcu(mas->tree) ||
- (wr_mas->offset_end - mas->offset == 1)))
+ if (new_end == mas->end && (!in_rcu || one_slot))
return wr_slot_store;
return wr_node_store;
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 11/19] maple_tree: Add bulk parent set helper
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (9 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
` (8 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Instead of calculating the parent pointer each time for a child,
cache the majority of the parent pointer and only change the slot per
child.
Drop the mas_set_parent() function since the last user has been removed.
Testing on a tree containing 2048 entries of height 4 had an increased
gain of 3.51% on nodes tracking gaps.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 97 +++++++++++++++++++++---------------------------
1 file changed, 42 insertions(+), 55 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 999a1d095e369..768193406c3db 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -455,46 +455,6 @@ enum maple_type mas_parent_type(struct ma_state *mas, struct maple_enode *enode)
return 0;
}
-/*
- * mas_set_parent() - Set the parent node and encode the slot
- * @mas: The maple state
- * @enode: The encoded maple node.
- * @parent: The encoded maple node that is the parent of @enode.
- * @slot: The slot that @enode resides in @parent.
- *
- * Slot number is encoded in the enode->parent bit 3-6 or 2-6, depending on the
- * parent type.
- */
-static inline
-void mas_set_parent(struct ma_state *mas, struct maple_enode *enode,
- const struct maple_enode *parent, unsigned char slot)
-{
- unsigned long val = (unsigned long)parent;
- unsigned long shift;
- unsigned long type;
- enum maple_type p_type = mte_node_type(parent);
-
- MAS_BUG_ON(mas, p_type == maple_dense);
- MAS_BUG_ON(mas, p_type == maple_leaf_64);
-
- switch (p_type) {
- case maple_range_64:
- case maple_arange_64:
- shift = MAPLE_PARENT_SLOT_SHIFT;
- type = MAPLE_PARENT_RANGE64;
- break;
- default:
- case maple_dense:
- case maple_leaf_64:
- shift = type = 0;
- break;
- }
-
- val &= ~MAPLE_NODE_MASK; /* Clear all node metadata in parent */
- val |= (slot << shift) | type;
- mte_to_node(enode)->parent = ma_parent_ptr(val);
-}
-
/*
* mte_parent_slot() - get the parent slot of @enode.
* @enode: The encoded maple node.
@@ -876,6 +836,42 @@ static inline void ma_set_meta_gap(struct maple_node *mn, enum maple_type mt,
meta->gap = offset;
}
+/*
+ * mas_set_parent_slots() - Bulk operation to set many slot parent pointers
+ * @mas: The maple state
+ * @parent: The encoded maple node that is the parent of @enode.
+ * @slot: The slot that of the @enode.
+ * @start_slot: The offset into @slot
+ * @count: The number of slots to set (eg: exclusive)
+ */
+static inline
+void mas_set_parent_slots(struct ma_state *mas, struct maple_enode *parent,
+ void __rcu **slots, unsigned char start_slot, unsigned char count)
+{
+ unsigned long val;
+ unsigned long shift;
+ unsigned long type;
+ enum maple_type p_type = mte_node_type(parent);
+ unsigned char i;
+
+ MAS_BUG_ON(mas, p_type != maple_range_64 &&
+ p_type != maple_arange_64);
+
+ shift = MAPLE_PARENT_SLOT_SHIFT;
+ type = MAPLE_PARENT_RANGE64;
+
+ val = (unsigned long)parent;
+ val &= ~MAPLE_NODE_MASK;
+
+ for (i = 0; i < count; i++) {
+ unsigned long pval = val | ((start_slot + i) << shift) | type;
+ struct maple_enode *child;
+
+ child = mt_slot_locked(mas->tree, slots, i);
+ mte_to_node(child)->parent = ma_parent_ptr(pval);
+ }
+}
+
/*
* mat_add() - Add a @dead_enode to the ma_topiary of a list of dead nodes.
* @mat: the ma_topiary, a linked list of dead nodes.
@@ -1608,14 +1604,10 @@ static inline void mas_adopt_children(struct ma_state *mas,
struct maple_node *node = mte_to_node(parent);
void __rcu **slots = ma_slots(node, type);
unsigned long *pivots = ma_pivots(node, type);
- struct maple_enode *child;
- unsigned char offset;
+ unsigned char end;
- offset = ma_data_end(node, type, pivots, mas->max);
- do {
- child = mas_slot_locked(mas, slots, offset);
- mas_set_parent(mas, child, parent, offset);
- } while (offset--);
+ end = ma_data_end(node, type, pivots, mas->max);
+ mas_set_parent_slots(mas, parent, slots, 0, end + 1);
}
/*
@@ -1997,15 +1989,10 @@ unsigned long node_copy(struct ma_state *mas, struct maple_node *src,
s_slots = ma_slots(src, s_mt) + start;
s_pivots = ma_pivots(src, s_mt) + start;
memcpy(d_slots, s_slots, size * sizeof(void __rcu *));
- if (!ma_is_leaf(d_mt) && s_mt == maple_copy) {
- struct maple_enode *edst = mt_mk_node(dst, d_mt);
-
- for (int i = 0; i < size; i++)
- mas_set_parent(mas,
- mt_slot_locked(mas->tree, d_slots, i),
- edst, d_start + i);
- }
+ if (!ma_is_leaf(d_mt) && s_mt == maple_copy)
+ mas_set_parent_slots(mas, mt_mk_node(dst, d_mt),
+ d_slots, d_start, size);
d_gaps = ma_gaps(dst, d_mt);
if (d_gaps) {
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (10 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:41 ` Chuck Lever
2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
` (7 subsequent siblings)
19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Chris Mason, Chuck Lever
If mas_alloc_cyclic() is called during a low memory situation, it is
possible the lock may be dropped so reclaim can occur. There is a
window where some other task may allocate the same id and cause the
mas_insert() to fail with -EEXIST. In this scenario the function will
return -EEXIST, which is not expected.
Modifying the retry on mas_nomem() to re-search for a slot means that
any race with other writes will not matter as the lock will be held
between finding the index and writing the index.
Moving the flag logic avoids cases where the flag is modified on drop
lock/reacquire or when the write fails after clearing the flag.
No existing users are exposed to this issue.
Fixes: 9b6713cc75229 ("maple_tree: Add mtree_alloc_cyclic()")
Reported-by: Chris Mason <clm@meta.com>
Cc: Chuck Lever <cel@kernel.org>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 768193406c3db..1bc177bf42f9f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3867,35 +3867,40 @@ int mas_alloc_cyclic(struct ma_state *mas, unsigned long *startp,
void *entry, unsigned long range_lo, unsigned long range_hi,
unsigned long *next, gfp_t gfp)
{
- unsigned long min = range_lo;
- int ret = 0;
-
- range_lo = max(min, *next);
- ret = mas_empty_area(mas, range_lo, range_hi, 1);
- if ((mas->tree->ma_flags & MT_FLAGS_ALLOC_WRAPPED) && ret == 0) {
- mas->tree->ma_flags &= ~MT_FLAGS_ALLOC_WRAPPED;
- ret = 1;
- }
- if (ret < 0 && range_lo > min) {
- mas_reset(mas);
- ret = mas_empty_area(mas, min, range_hi, 1);
- if (ret == 0)
- ret = 1;
- }
- if (ret < 0)
- return ret;
+ int ret;
+ unsigned long min;
+ min = range_lo;
do {
+ range_lo = max(min, *next);
+ ret = mas_empty_area(mas, range_lo, range_hi, 1);
+ if (ret < 0 && range_lo > min) {
+ mas_reset(mas);
+ ret = mas_empty_area(mas, min, range_hi, 1);
+ if (ret == 0)
+ ret = 1;
+ }
+ if (ret < 0)
+ goto out;
+
mas_insert(mas, entry);
} while (mas_nomem(mas, gfp));
- if (mas_is_err(mas))
- return xa_err(mas->node);
+ if (mas_is_err(mas)) {
+ ret = xa_err(mas->node);
+ goto out;
+ }
+
+ if ((mas->tree->ma_flags & MT_FLAGS_ALLOC_WRAPPED) && ret == 0) {
+ mas->tree->ma_flags &= ~MT_FLAGS_ALLOC_WRAPPED;
+ ret = 1;
+ }
*startp = mas->index;
*next = *startp + 1;
if (*next == 0)
mas->tree->ma_flags |= MT_FLAGS_ALLOC_WRAPPED;
+out:
mas_destroy(mas);
return ret;
}
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (11 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:32 ` Rik van Riel
2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
` (6 subsequent siblings)
19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Jason Gunthorpe, Rik van Riel
State that the mas_erase() and mtree_erase() functions may use
GFP_KERNEL on allocation retry. Don't just depend on people reading the
documentation by adding a check that will warn of the use.
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 1bc177bf42f9f..15e8081f61808 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5657,6 +5657,10 @@ EXPORT_SYMBOL_GPL(mas_find_range_rev);
* Searches for @mas->index, sets @mas->index and @mas->last to the range and
* erases that range.
*
+ * Note that erase requires allocations and will use GFP_KERNEL to do so if
+ * necessary. If the allocation fails, the internal lock will be dropped to
+ * retry.
+ *
* Return: the entry that was erased or %NULL, @mas->index and @mas->last are updated.
*/
void *mas_erase(struct ma_state *mas)
@@ -5665,13 +5669,21 @@ void *mas_erase(struct ma_state *mas)
unsigned long index = mas->index;
MA_WR_STATE(wr_mas, mas, NULL);
+ /*
+ * In low memory situations, the allocation is retried with the gfp flag
+ * GFP_KERNEL. The internal spinlock is dropped in mas_nomem(), however
+ * the external lock is not dropped.
+ */
+ if (mt_external_lock(mas->tree))
+ might_alloc(GFP_KERNEL);
+
if (!mas_is_active(mas) || !mas_is_start(mas))
mas->status = ma_start;
write_retry:
entry = mas_state_walk(mas);
if (!entry)
- return NULL;
+ goto out;
/* Must reset to ensure spanning writes of last slot are detected */
mas_reset(mas);
@@ -5682,8 +5694,10 @@ void *mas_erase(struct ma_state *mas)
goto write_retry;
}
- if (mas_is_err(mas))
+ if (mas_is_err(mas)) {
+ entry = NULL;
goto out;
+ }
mas_wr_store_entry(&wr_mas);
out:
@@ -6011,6 +6025,10 @@ EXPORT_SYMBOL(mtree_alloc_rrange);
* Erasing is the same as a walk to an entry then a store of a NULL to that
* ENTIRE range. In fact, it is implemented as such using the advanced API.
*
+ * Note that erase requires allocations and will use GFP_KERNEL to do so if
+ * necessary. If the allocation fails, the internal lock will be dropped to
+ * retry.
+ *
* Return: The entry stored at the @index or %NULL
*/
void *mtree_erase(struct maple_tree *mt, unsigned long index)
@@ -6020,6 +6038,7 @@ void *mtree_erase(struct maple_tree *mt, unsigned long index)
MA_STATE(mas, mt, index, index);
trace_ma_op(TP_FCT, &mas);
+ might_alloc(GFP_KERNEL);
mtree_lock(mt);
entry = mas_erase(&mas);
mtree_unlock(mt);
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (12 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 23:02 ` Andrew Morton
2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
` (5 subsequent siblings)
19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Rik van Riel, Jason Gunthorpe
Allocations should never fail in the circumstances that are expected to
occur. Add checks in the code to ensure the circumstances are correctly
set up by the user and warn if they are not.
Also add a warning on failure to allocate, which should never happen.
Cc: Rik van Riel <riel@surriel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 15e8081f61808..afe65ff0f8a6f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5720,6 +5720,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
if (likely(mas->node != MA_ERROR(-ENOMEM)))
return false;
+ /* Allocations can fail, don't do this. */
+ WARN_ON_ONCE(!gfpflags_allow_blocking(gfp) &&
+ mt_external_lock(mas->tree));
+
if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
mtree_unlock(mas->tree);
mas_alloc_nodes(mas, gfp);
@@ -5730,9 +5734,12 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
/*
* Return false on zero forward progress. Partial allocations are kept
- * so the retry path will attempt to get the rest.
+ * so the retry path will attempt to get the rest. The failure should
+ * not happen as we try our best to reclaim. The user would need an
+ * external lock with a non-blocking gfp in a low memory situation -
+ * which would have triggered the first warning in this function.
*/
- if (!mas->sheaf && !mas->alloc)
+ if (WARN_ON_ONCE(!mas->sheaf && !mas->alloc))
return false;
mas_reset(mas);
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 15/19] maple_tree: Document erase and allocations better
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (13 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
` (4 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Jason Gunthorpe, Rik van Riel
During a discussion on the maple tree erase process and GFP flags, Jason
suggested there be an amendment to the documentation to clarify the
situation on allocations within the tree.
The added text is an attempt to better explain that the tree may
allocate, even when erasing, and provide some guidance on how to work
around such issues.
Link: https://lore.kernel.org/all/20260617180419.GA231643@ziepe.ca/
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
Documentation/core-api/maple_tree.rst | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index 34964ec88d179..1eae24a9c18b4 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -17,7 +17,8 @@ supports iterating over a range of entries and going to the previous or next
entry in a cache-efficient manner. The tree can also be put into an RCU-safe
mode of operation which allows reading and writing concurrently. Writers must
synchronize on a lock, which can be the default spinlock, or the user can set
-the lock to an external lock of a different type.
+the lock to an external lock of a different type. Note that external locks may
+interfere with allocations in a low memory situation.
The Maple Tree maintains a small memory footprint and was designed to use
modern processor cache efficiently. The majority of the users will be able to
@@ -42,6 +43,15 @@ successful store operation within a given
code segment when allocating cannot be done. Allocations of nodes are
relatively small at around 256 bytes.
+Since the maple tree uses internal nodes that are allocated and has rules on
+data density, erasing an entry may cause allocations to occur. That is,
+erasing an entry may consume memory. Users must take care to ensure that they
+do not violate the larger system constraints on when and how memory is
+allocated. Most situations are fine to allocate, but the pre-allocation
+support is provided as a mechanism to avoid trickier situations. There is also
+the possibility of using special entries and clean up the tree later, in
+extreme circumstances.
+
.. _maple-tree-normal-api:
Normal API
@@ -63,7 +73,8 @@ success or an error code otherwise. mtree_store_range() works in the same way
but takes a range. mtree_load() is used to retrieve the entry stored at a
given index. You can use mtree_erase() to erase an entire range by only
knowing one value within that range, or mtree_store() call with an entry of
-NULL may be used to partially erase a range or many ranges at once.
+NULL may be used to partially erase a range or many ranges at once. Note that
+mtree_erase() may use GFP_KERNEL on allocations.
If you want to only store a new entry to a range (or index) if that range is
currently ``NULL``, you can use mtree_insert_range() or mtree_insert() which
@@ -163,7 +174,9 @@ You can use mas_erase() to erase an entire range by setting index and
last of the maple state to the desired range to erase. This will erase
the first range that is found in that range, set the maple state index
and last as the range that was erased and return the entry that existed
-at that location.
+at that location. Note that mas_erase() may allocate with the GFP_KERNEL flag.
+If this is not okay, consider using mas_store_gfp() and pass it a ``NULL``,
+after setting up the correct range by walking to the entry.
You can walk each entry within a range by using mas_for_each(). If you want
to walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 16/19] maple_tree: Change two GFP flags in tests
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (14 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
` (3 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
Joe Perches
The GFP flags in two tests are obviously incorrect. Make the tests
correctly run by updating the GFP flags.
Link: https://lore.kernel.org/all/d9cbb89faa5bdb71d451781d214a51ce8923a83e.camel@perches.com/
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
tools/testing/radix-tree/maple.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
index 0607913a3022d..d967e76a3c065 100644
--- a/tools/testing/radix-tree/maple.c
+++ b/tools/testing/radix-tree/maple.c
@@ -35234,7 +35234,7 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
mt_set_non_kernel(1);
/* Spanning store */
mas_set_range(&mas, 1, 100);
- MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL & GFP_NOWAIT) == 0);
+ MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_NOWAIT) == 0);
allocated = mas_allocated(&mas);
height = mas_mt_height(&mas);
MT_BUG_ON(mt, allocated != 0);
@@ -35257,7 +35257,7 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
MT_BUG_ON(mt, mas_allocated(&mas) != 0);
mas_set_range(&mas, 0, 200);
mt_set_non_kernel(1);
- MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL & GFP_NOWAIT) == 0);
+ MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_NOWAIT) == 0);
allocated = mas_allocated(&mas);
height = mas_mt_height(&mas);
MT_BUG_ON(mt, allocated != 0);
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 17/19] maple_tree: Fix argument name in header
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (15 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
` (2 subsequent siblings)
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
The mas_prev_range() function takes a min and not a max.
Fixes: 6b9e93e010204 ("maple_tree: add mas_prev_range() and mas_find_range_rev interface")
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
include/linux/maple_tree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 346eae48cf15e..5e0bd2857d941 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -576,7 +576,7 @@ void maple_tree_init(void);
void mas_destroy(struct ma_state *mas);
void *mas_prev(struct ma_state *mas, unsigned long min);
-void *mas_prev_range(struct ma_state *mas, unsigned long max);
+void *mas_prev_range(struct ma_state *mas, unsigned long min);
void *mas_next(struct ma_state *mas, unsigned long max);
void *mas_next_range(struct ma_state *mas, unsigned long max);
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 18/19] maple_tree: Avoid extra gap calculation
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (16 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
Prior to ending the ascension loop of larger operations like split,
rebalance, and spanning store the gap in the node had been calculated.
Once the node is inserted into the tree, the gap is recalculated in
mas_update_gap(). This can be avoided by creating a helper for
mas_update_gap() that accepts the known gap value, which reduces the
operations required for gap updating path.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 28 +++++++++++++++++-----------
1 file changed, 17 insertions(+), 11 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index afe65ff0f8a6f..4336a98763984 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1565,14 +1565,26 @@ static inline void mas_parent_gap(struct ma_state *mas, unsigned char offset,
goto ascend;
}
+static __always_inline void mas_update_gap_known(struct ma_state *mas,
+ unsigned long gap)
+{
+ unsigned char pslot;
+ unsigned long p_gap;
+
+ pslot = mte_parent_slot(mas->node);
+ p_gap = ma_gaps(mte_parent(mas->node),
+ mas_parent_type(mas, mas->node))[pslot];
+
+ if (p_gap != gap)
+ mas_parent_gap(mas, pslot, gap);
+}
+
/*
* mas_update_gap() - Update a nodes gaps and propagate up if necessary.
* @mas: the maple state.
*/
static inline void mas_update_gap(struct ma_state *mas)
{
- unsigned char pslot;
- unsigned long p_gap;
unsigned long max_gap;
if (!mt_is_alloc(mas->tree))
@@ -1582,13 +1594,7 @@ static inline void mas_update_gap(struct ma_state *mas)
return;
max_gap = mas_max_gap(mas);
-
- pslot = mte_parent_slot(mas->node);
- p_gap = ma_gaps(mte_parent(mas->node),
- mas_parent_type(mas, mas->node))[pslot];
-
- if (p_gap != max_gap)
- mas_parent_gap(mas, pslot, max_gap);
+ mas_update_gap_known(mas, max_gap);
}
/*
@@ -2136,8 +2142,8 @@ static inline void mas_wmb_replace(struct ma_state *mas, struct maple_copy *cp)
mas->node = mt_slot_locked(mas->tree, cp->slot, 0);
/* Insert the new data in the tree */
mas_topiary_replace(mas, old_enode, cp->height);
- if (!mte_is_leaf(mas->node))
- mas_update_gap(mas);
+ if (mt_is_alloc(mas->tree) && !mte_is_root(mas->node))
+ mas_update_gap_known(mas, cp->gap[0]);
mtree_range_walk(mas);
}
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable()
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (17 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton
19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)
A check in mas_walk() was incorrect and caused inefficient use of the
maple state. The same issue existed in mas_erase(), but was left
unfixed. Making a helper function is the obvious answer.
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
lib/maple_tree.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 4336a98763984..06cc05b79fbdc 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -261,6 +261,12 @@ static inline bool mas_is_underflow(struct ma_state *mas)
return mas->status == ma_underflow;
}
+static inline void mas_make_walkable(struct ma_state *mas)
+{
+ if (!mas_is_active(mas) && !mas_is_start(mas))
+ mas->status = ma_start;
+}
+
static __always_inline struct maple_node *mte_to_node(
const struct maple_enode *entry)
{
@@ -4446,8 +4452,7 @@ void *mas_walk(struct ma_state *mas)
void *entry;
mas_may_init_lock_check(mas);
- if (!mas_is_active(mas) && !mas_is_start(mas))
- mas->status = ma_start;
+ mas_make_walkable(mas);
retry:
entry = mas_state_walk(mas);
if (mas_is_start(mas)) {
@@ -5683,9 +5688,7 @@ void *mas_erase(struct ma_state *mas)
if (mt_external_lock(mas->tree))
might_alloc(GFP_KERNEL);
- if (!mas_is_active(mas) || !mas_is_start(mas))
- mas->status = ma_start;
-
+ mas_make_walkable(mas);
write_retry:
entry = mas_state_walk(mas);
if (!entry)
--
2.47.3
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations
2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
@ 2026-06-30 19:32 ` Rik van Riel
0 siblings, 0 replies; 24+ messages in thread
From: Rik van Riel @ 2026-06-30 19:32 UTC (permalink / raw)
To: Liam R. Howlett (Oracle), Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Jason Gunthorpe
On Tue, 2026-06-30 at 15:08 -0400, Liam R. Howlett (Oracle) wrote:
> State that the mas_erase() and mtree_erase() functions may use
> GFP_KERNEL on allocation retry. Don't just depend on people reading
> the
> documentation by adding a check that will warn of the use.
>
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Rik van Riel <riel@surriel.com>
> Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
>
Reviewed-by: Rik van Riel <riel@surriel.com>
--
All Rights Reversed.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic()
2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
@ 2026-06-30 19:41 ` Chuck Lever
0 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2026-06-30 19:41 UTC (permalink / raw)
To: Liam R. Howlett (Oracle), Andrew Morton
Cc: linux-mm, linux-kernel, maple-tree, Chris Mason
On Tue, Jun 30, 2026, at 3:08 PM, Liam R. Howlett (Oracle) wrote:
> If mas_alloc_cyclic() is called during a low memory situation, it is
> possible the lock may be dropped so reclaim can occur. There is a
> window where some other task may allocate the same id and cause the
> mas_insert() to fail with -EEXIST. In this scenario the function will
> return -EEXIST, which is not expected.
>
> Modifying the retry on mas_nomem() to re-search for a slot means that
> any race with other writes will not matter as the lock will be held
> between finding the index and writing the index.
>
> Moving the flag logic avoids cases where the flag is modified on drop
> lock/reacquire or when the write fails after clearing the flag.
>
> No existing users are exposed to this issue.
>
> Fixes: 9b6713cc75229 ("maple_tree: Add mtree_alloc_cyclic()")
> Reported-by: Chris Mason <clm@meta.com>
> Cc: Chuck Lever <cel@kernel.org>
> Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
Reviewed-by: Chuck Lever <cel@kernel.org>
--
Chuck Lever
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail
2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
@ 2026-06-30 23:02 ` Andrew Morton
0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2026-06-30 23:02 UTC (permalink / raw)
To: Liam R. Howlett (Oracle)
Cc: linux-mm, linux-kernel, maple-tree, Rik van Riel, Jason Gunthorpe
On Tue, 30 Jun 2026 15:08:38 -0400 "Liam R. Howlett (Oracle)" <liam@infradead.org> wrote:
> Allocations should never fail in the circumstances that are expected to
> occur. Add checks in the code to ensure the circumstances are correctly
> set up by the user and warn if they are not.
>
> Also add a warning on failure to allocate, which should never happen.
>
> ...
>
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5720,6 +5720,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
> if (likely(mas->node != MA_ERROR(-ENOMEM)))
> return false;
>
> + /* Allocations can fail, don't do this. */
> + WARN_ON_ONCE(!gfpflags_allow_blocking(gfp) &&
> + mt_external_lock(mas->tree));
> +
> if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
> mtree_unlock(mas->tree);
> mas_alloc_nodes(mas, gfp);
> @@ -5730,9 +5734,12 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
>
> /*
> * Return false on zero forward progress. Partial allocations are kept
> - * so the retry path will attempt to get the rest.
> + * so the retry path will attempt to get the rest. The failure should
> + * not happen as we try our best to reclaim. The user would need an
> + * external lock with a non-blocking gfp in a low memory situation -
> + * which would have triggered the first warning in this function.
> */
> - if (!mas->sheaf && !mas->alloc)
> + if (WARN_ON_ONCE(!mas->sheaf && !mas->alloc))
> return false;
>
Can either of these warnings duplicate the effect of !__GFP_NOWARN?
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] maple_tree: lock checking and clean ups
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
` (18 preceding siblings ...)
2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
@ 2026-06-30 23:05 ` Andrew Morton
19 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2026-06-30 23:05 UTC (permalink / raw)
To: Liam R. Howlett (Oracle); +Cc: linux-mm, linux-kernel, maple-tree
On Tue, 30 Jun 2026 15:08:24 -0400 "Liam R. Howlett (Oracle)" <liam@infradead.org> wrote:
> The goals of this series are:
> 1. Lock issue detection
> A number of syzbot reports are incorrectly pointing to the mm exit as
> a source of the locking error. The first three patches attempt to
> help users detect errors in their locking - but they still have to use
> LOCKDEP. I guess it's still down to hope and prayers.
>
> 2. Documentation fixes
> The documentation was lacking clarity, there are updates to try and
> help the users, especially around the erase() cases.
>
> 3. Two benign issues
> The cyclic allocator may have a race, although no in-kernel user can
> hit it.
> The erase functions may cause allocation issues if used with the
> incorrect locking type, but none are present in-tree.
>
> Beyond these goals there are some test fixes, some general speed-up
> patches targeting extra work and cycles, and dropping dead code.
Great, thanks, I'll add this to mm-new for some testing.
^ permalink raw reply [flat|nested] 24+ messages in thread
end of thread, other threads:[~2026-06-30 23:05 UTC | newest]
Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
2026-06-30 19:41 ` Chuck Lever
2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
2026-06-30 19:32 ` Rik van Riel
2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
2026-06-30 23:02 ` Andrew Morton
2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox