Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 00/19] maple_tree: lock checking and clean ups
@ 2026-06-30 19:08 Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
                   ` (19 more replies)
  0 siblings, 20 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

The goals of this series are:
1. Lock issue detection
  A number of syzbot reports are incorrectly pointing to the mm exit as
  a source of the locking error.  The first three patches attempt to
  help users detect errors in their locking - but they still have to use
  LOCKDEP.  I guess it's still down to hope and prayers.

2. Documentation fixes
  The documentation was lacking clarity, there are updates to try and
  help the users, especially around the erase() cases.

3. Two benign issues
  The cyclic allocator may have a race, although no in-kernel user can
  hit it.
  The erase functions may cause allocation issues if used with the
  incorrect locking type, but none are present in-tree.

Beyond these goals there are some test fixes, some general speed-up
patches targeting extra work and cycles, and dropping dead code.

v1: https://lore.kernel.org/all/20260629144145.1572283-1-liam@infradead.org/
Changes since v1:
 - Relocated mas_set_parent_slots() lower so that mt_slots_locked() for rcu dereferencing - Thanks kernel build bot
 - Made mas_lockdep_map() static - Thanks kernel build bot
 - Added mas_lock_check() to mas_may_activate() when restoring a node to active - Thanks sashiko
 - Changed mas_lock_check() aliasing to use mt_locked() - Thanks sashiko
 - Updated __lock_unpin_lock() underflow check - Thanks sashiko
 - Updated checking the lock sequence based on the lockdep map - Thanks sashiko
 - Add a comment to mas_wr_node_store() to explain why the last slot isn't cleared. - Thanks sashiko
 - Drop mas_set_parent() function as it is no longer used - Thanks sashiko
 - Clear the mas_alloc_cyclic() wrap flag after storage - Thanks sashiko
 - Add might_allocate() to the mtree_erase() call - Thanks Jason
 - Added one more cleanup for making maple state walkable.

Liam R. Howlett (Oracle) (19):
  maple_tree: Add rcu locking check when LOCKDEP is enabled
  locking/lockdep: Add sequence counter to held_lock
  maple_tree: Add write lock checking with lockdep sequence numbers
  maple_tree: Documentation fix
  maple_tree: Drop dead code from mas_extend_spanning_null()
  maple_tree: Drop MAPLE_ALLOC_SLOTS
  maple_tree: Clarify comments on mas_nomem()
  maple_tree: Use prefetched value in mas_wr_store_type()
  maple_tree: Optimise mas_wr_node_store() when not in rcu mode
  maple_tree: micro optimisation of mas_wr_store_type()
  maple_tree: Add bulk parent set helper
  maple_tree: Catch race in mas_alloc_cyclic()
  maple_tree: Document that erase may use GFP_KERNEL for allocations
  maple_tree: WARN_ON_ONCE when allocations fail
  maple_tree: Document erase and allocations better
  maple_tree: Change two GFP flags in tests
  maple_tree: Fix argument name in header
  maple_tree: Avoid extra gap calculation
  maple_tree: Add helper mas_make_walkable()

 Documentation/core-api/maple_tree.rst |  21 +-
 include/linux/lockdep.h               |   3 +
 include/linux/lockdep_types.h         |   3 +-
 include/linux/maple_tree.h            |  10 +-
 include/linux/sched.h                 |   1 +
 kernel/locking/lockdep.c              |  58 ++++-
 lib/maple_tree.c                      | 353 ++++++++++++++++++--------
 tools/testing/radix-tree/maple.c      |   4 +-
 8 files changed, 326 insertions(+), 127 deletions(-)

-- 
2.47.3



^ permalink raw reply	[flat|nested] 24+ messages in thread

* [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
                   ` (18 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

When CONFIG_LOCKDEP and CONFIG_RCU_STRICT_GRACE_PERIOD is enabled, check
for rcu locking issues by recording the grace period in the maple state
and checking the rcu window is still valid whenever the maple state is
reused with a state that is not MA_START or MA_PAUSED.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 include/linux/maple_tree.h |  3 +++
 lib/maple_tree.c           | 50 +++++++++++++++++++++++++++++++++++++-
 2 files changed, 52 insertions(+), 1 deletion(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 4a5631906affb..11a16c3508dc0 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -485,6 +485,9 @@ struct ma_state {
 	unsigned char mas_flags;
 	unsigned char end;		/* The end of the node */
 	enum store_type store_type;	/* The type of store needed for this operation */
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+	unsigned long rcu_gp;
+#endif
 };
 
 struct ma_wr_state {
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index e52876435b77f..3dff0435c13a7 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1153,6 +1153,42 @@ static inline void mas_free(struct ma_state *mas, struct maple_enode *used)
 	ma_free_rcu(mte_to_node(used));
 }
 
+void mas_lock_check(struct ma_state *mas)
+{
+
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+	if (!mas_is_active(mas))
+		return;
+
+	if (!mt_locked(mas->tree)) {
+		if (mt_in_rcu(mas->tree))
+			WARN_ON_ONCE(poll_state_synchronize_rcu(mas->rcu_gp));
+	}
+#endif
+
+}
+
+void mas_init_lock_check(struct ma_state *mas)
+{
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+	if (!mt_locked(mas->tree)) {
+		if (mt_in_rcu(mas->tree))
+			mas->rcu_gp = get_state_synchronize_rcu();
+	}
+#endif
+
+}
+
+static void mas_may_init_lock_check(struct ma_state *mas)
+{
+#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+	if (mas_is_start(mas) || mas_is_paused(mas))
+		mas_init_lock_check(mas);
+	else
+		mas_lock_check(mas);
+#endif
+}
+
 /*
  * mas_start() - Sets up maple state for operations.
  * @mas: The maple state.
@@ -1171,6 +1207,7 @@ static inline struct maple_enode *mas_start(struct ma_state *mas)
 	if (likely(mas_is_start(mas))) {
 		struct maple_enode *root;
 
+		mas_init_lock_check(mas);
 		mas->min = 0;
 		mas->max = ULONG_MAX;
 
@@ -4360,6 +4397,7 @@ void *mas_walk(struct ma_state *mas)
 {
 	void *entry;
 
+	mas_may_init_lock_check(mas);
 	if (!mas_is_active(mas) && !mas_is_start(mas))
 		mas->status = ma_start;
 retry:
@@ -4997,6 +5035,7 @@ static void mas_may_activate(struct ma_state *mas)
 		mas->status = ma_start;
 	} else {
 		mas->status = ma_active;
+		mas_lock_check(mas);
 	}
 }
 
@@ -5074,6 +5113,7 @@ void *mas_next(struct ma_state *mas, unsigned long max)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_next_setup(mas, max, &entry))
 		return entry;
 
@@ -5097,6 +5137,7 @@ void *mas_next_range(struct ma_state *mas, unsigned long max)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_next_setup(mas, max, &entry))
 		return entry;
 
@@ -5205,6 +5246,7 @@ void *mas_prev(struct ma_state *mas, unsigned long min)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_prev_setup(mas, min, &entry))
 		return entry;
 
@@ -5228,6 +5270,7 @@ void *mas_prev_range(struct ma_state *mas, unsigned long min)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_prev_setup(mas, min, &entry))
 		return entry;
 
@@ -5274,6 +5317,7 @@ EXPORT_SYMBOL_GPL(mt_prev);
  */
 void mas_pause(struct ma_state *mas)
 {
+	mas_lock_check(mas);
 	mas->status = ma_pause;
 	mas->node = NULL;
 }
@@ -5382,6 +5426,7 @@ void *mas_find(struct ma_state *mas, unsigned long max)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_find_setup(mas, max, &entry))
 		return entry;
 
@@ -5409,6 +5454,7 @@ void *mas_find_range(struct ma_state *mas, unsigned long max)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_find_setup(mas, max, &entry))
 		return entry;
 
@@ -5521,6 +5567,7 @@ void *mas_find_rev(struct ma_state *mas, unsigned long min)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_find_rev_setup(mas, min, &entry))
 		return entry;
 
@@ -5547,6 +5594,7 @@ void *mas_find_range_rev(struct ma_state *mas, unsigned long min)
 {
 	void *entry = NULL;
 
+	mas_may_init_lock_check(mas);
 	if (mas_find_rev_setup(mas, min, &entry))
 		return entry;
 
@@ -5623,7 +5671,7 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 	if (!mas->sheaf && !mas->alloc)
 		return false;
 
-	mas->status = ma_start;
+	mas_reset(mas);
 	return true;
 }
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
                   ` (17 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Ingo Molnar, Will Deacon, Boqun Feng, Waiman Long, Peter Zijlstra

Add an 8 bit small sequence counter to the held_lock struct to detect if
the lock as been dropped and reacquired.  This is useful when a data
structure depends on a constant locking context, but is not able to
detect locking and unlocking of the lock through its own API.

Since the __lock_unpin_lock() will no longer detect underflow by casting
the unsigned int to a signed int, update the casting code to use a temp
variable for calculations using a signed int.

Cc: Ingo Molnar <mingo@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: Waiman Long <longman@redhat.com>
Link: https://lore.kernel.org/all/h3tpnj5kzcrxms5picmimtkpg4aypcpip5wbd6bt2rpdj5k7eb@nhtzs3lefrkq/
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 include/linux/lockdep.h       |  3 ++
 include/linux/lockdep_types.h |  3 +-
 include/linux/sched.h         |  1 +
 kernel/locking/lockdep.c      | 58 ++++++++++++++++++++++++++++++-----
 4 files changed, 57 insertions(+), 8 deletions(-)

diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 621566345406d..a6451ecbbe9a0 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -273,6 +273,9 @@ extern struct pin_cookie lock_pin_lock(struct lockdep_map *lock);
 extern void lock_repin_lock(struct lockdep_map *lock, struct pin_cookie);
 extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
+extern u32 lock_sequence(struct lockdep_map *lock);
+#define lockdep_sequence(lock)	lock_sequence(&(lock)->dep_map)
+
 #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
 
 #define lockdep_assert(cond)		\
diff --git a/include/linux/lockdep_types.h b/include/linux/lockdep_types.h
index eae115a264885..55c4b152fedf7 100644
--- a/include/linux/lockdep_types.h
+++ b/include/linux/lockdep_types.h
@@ -253,7 +253,8 @@ struct held_lock {
 	unsigned int hardirqs_off:1;
 	unsigned int sync:1;
 	unsigned int references:11;					/* 32 bits */
-	unsigned int pin_count;
+	unsigned int pin_count:24;
+	unsigned int seq_count:8;
 };
 
 #else /* !CONFIG_LOCKDEP */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 373bcc0598d10..14d5ce8dd6136 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1288,6 +1288,7 @@ struct task_struct {
 	u64				curr_chain_key;
 	int				lockdep_depth;
 	unsigned int			lockdep_recursion;
+	unsigned int			lockdep_seq;
 	struct held_lock		held_locks[MAX_LOCK_DEPTH];
 #endif
 
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 2d4c5bab5af88..a69567bdd7912 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -5077,7 +5077,7 @@ static int __lock_is_held(const struct lockdep_map *lock, int read);
 static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 			  int trylock, int read, int check, int hardirqs_off,
 			  struct lockdep_map *nest_lock, unsigned long ip,
-			  int references, int pin_count, int sync)
+			  int references, int pin_count, int sync, int seq)
 {
 	struct task_struct *curr = current;
 	struct lock_class *class = NULL;
@@ -5183,6 +5183,7 @@ static int __lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 	hlock->holdtime_stamp = lockstat_clock();
 #endif
 	hlock->pin_count = pin_count;
+	hlock->seq_count = seq;
 
 	if (check_wait_context(curr, hlock))
 		return 0;
@@ -5388,7 +5389,7 @@ static int reacquire_held_locks(struct task_struct *curr, unsigned int depth,
 				    hlock->read, hlock->check,
 				    hlock->hardirqs_off,
 				    hlock->nest_lock, hlock->acquire_ip,
-				    hlock->references, hlock->pin_count, 0)) {
+				    hlock->references, hlock->pin_count, 0, hlock->seq_count)) {
 		case 0:
 			return 1;
 		case 1:
@@ -5669,14 +5670,17 @@ static void __lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie
 		struct held_lock *hlock = curr->held_locks + i;
 
 		if (match_held_lock(hlock, lock)) {
+			int pin_count;
+
 			if (WARN(!hlock->pin_count, "unpinning an unpinned lock\n"))
 				return;
 
-			hlock->pin_count -= cookie.val;
+			pin_count = hlock->pin_count - cookie.val;
 
-			if (WARN((int)hlock->pin_count < 0, "pin count corrupted\n"))
-				hlock->pin_count = 0;
+			if (WARN(pin_count < 0, "pin count corrupted\n"))
+				pin_count = 0;
 
+			hlock->pin_count = pin_count;
 			return;
 		}
 	}
@@ -5684,6 +5688,24 @@ static void __lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie
 	WARN(1, "unpinning an unheld lock\n");
 }
 
+static u32 __lock_sequence(struct lockdep_map *lock)
+{
+	struct task_struct *curr = current;
+	int i;
+
+	if (unlikely(!debug_locks))
+		return ~0;
+
+	for (i = 0; i < curr->lockdep_depth; i++) {
+		struct held_lock *hlock = curr->held_locks + i;
+
+		if (match_held_lock(hlock, lock))
+			return hlock->seq_count;
+	}
+
+	return ~0;
+}
+
 /*
  * Check whether we follow the irq-flags state precisely:
  */
@@ -5866,7 +5888,8 @@ void lock_acquire(struct lockdep_map *lock, unsigned int subclass,
 
 	lockdep_recursion_inc();
 	__lock_acquire(lock, subclass, trylock, read, check,
-		       irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 0);
+		       irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 0,
+		       ++current->lockdep_seq);
 	lockdep_recursion_finish();
 	raw_local_irq_restore(flags);
 }
@@ -5914,7 +5937,8 @@ void lock_sync(struct lockdep_map *lock, unsigned subclass, int read,
 
 	lockdep_recursion_inc();
 	__lock_acquire(lock, subclass, 0, read, check,
-		       irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 1);
+		       irqs_disabled_flags(flags), nest_lock, ip, 0, 0, 1,
+		       ++current->lockdep_seq);
 	check_chain_key(current);
 	lockdep_recursion_finish();
 	raw_local_irq_restore(flags);
@@ -6000,6 +6024,26 @@ void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie cookie)
 }
 EXPORT_SYMBOL_GPL(lock_unpin_lock);
 
+u32 lock_sequence(struct lockdep_map *lock)
+{
+	unsigned long flags;
+	u32 seq = ~0;
+
+	if (unlikely(!lockdep_enabled()))
+		return seq;
+
+	raw_local_irq_save(flags);
+	check_flags(flags);
+
+	lockdep_recursion_inc();
+	seq = __lock_sequence(lock);
+	lockdep_recursion_finish();
+	raw_local_irq_restore(flags);
+
+	return seq;
+}
+EXPORT_SYMBOL_GPL(lock_sequence);
+
 #ifdef CONFIG_LOCK_STAT
 static void print_lock_contention_bug(struct task_struct *curr,
 				      struct lockdep_map *lock,
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
                   ` (16 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Use the lockdep sequence numbers to ensure the write lock is not dropped
between write operations.  The lockdep sequence is recorded on any walk
that starts from the top of the tree and re-checked prior to any
operation using an active node.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 include/linux/maple_tree.h |  7 +++--
 lib/maple_tree.c           | 58 ++++++++++++++++++++++++++++++--------
 2 files changed, 51 insertions(+), 14 deletions(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 11a16c3508dc0..220b380cdd491 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -485,9 +485,12 @@ struct ma_state {
 	unsigned char mas_flags;
 	unsigned char end;		/* The end of the node */
 	enum store_type store_type;	/* The type of store needed for this operation */
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+#ifdef CONFIG_LOCKDEP
+	u32 ld_seq;
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
 	unsigned long rcu_gp;
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+#endif /* CONFIG_LOCKDEP */
 };
 
 struct ma_wr_state {
diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 3dff0435c13a7..e2f7474572319 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1153,40 +1153,71 @@ static inline void mas_free(struct ma_state *mas, struct maple_enode *used)
 	ma_free_rcu(mte_to_node(used));
 }
 
-void mas_lock_check(struct ma_state *mas)
+
+#ifdef CONFIG_LOCKDEP
+static struct lockdep_map *mas_lockdep_map(struct ma_state *mas)
 {
+	struct maple_tree *mt = mas->tree;
+
+	if (mt_external_lock(mt))
+		return mt->ma_external_lock;
+
+	return &(mt->ma_lock).dep_map;
+}
+
+#endif
 
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+static void mas_lock_check(struct ma_state *mas)
+{
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map *map;
 	if (!mas_is_active(mas))
 		return;
 
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
 	if (!mt_locked(mas->tree)) {
 		if (mt_in_rcu(mas->tree))
 			WARN_ON_ONCE(poll_state_synchronize_rcu(mas->rcu_gp));
 	}
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+
+	map = mas_lockdep_map(mas);
+	if (map && lock_is_held(map))
+		WARN_ON_ONCE(mas->ld_seq != lock_sequence(map));
+#endif /* CONFIG_LOCKDEP */
 
 }
 
-void mas_init_lock_check(struct ma_state *mas)
+static void mas_init_lock_check(struct ma_state *mas)
 {
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
+#ifdef CONFIG_LOCKDEP
+	struct lockdep_map *map;
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
 	if (!mt_locked(mas->tree)) {
 		if (mt_in_rcu(mas->tree))
 			mas->rcu_gp = get_state_synchronize_rcu();
+		return;
 	}
-#endif
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+
+	map = mas_lockdep_map(mas);
+	if (map && lock_is_held(map))
+		mas->ld_seq = lock_sequence(mas_lockdep_map(mas));
+#endif /* CONFIG_LOCKDEP */
 
 }
 
 static void mas_may_init_lock_check(struct ma_state *mas)
 {
-#if IS_ENABLED(CONFIG_LOCKDEP) && IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD)
-	if (mas_is_start(mas) || mas_is_paused(mas))
+#ifdef CONFIG_LOCKDEP
+#ifdef CONFIG_RCU_STRICT_GRACE_PERIOD
+	if (mas_is_start(mas) || mas_is_paused(mas)) {
 		mas_init_lock_check(mas);
-	else
-		mas_lock_check(mas);
-#endif
+		return;
+	}
+#endif /* CONFIG_RCU_STRICT_GRACE_PERIOD */
+	mas_lock_check(mas);
+#endif /* CONFIG_LOCKDEP */
 }
 
 /*
@@ -4869,6 +4900,7 @@ void *mas_store(struct ma_state *mas, void *entry)
 {
 	MA_WR_STATE(wr_mas, mas, entry);
 
+	mas_may_init_lock_check(mas);
 	trace_ma_write(TP_FCT, mas, 0, entry);
 #ifdef CONFIG_DEBUG_MAPLE_TREE
 	if (MAS_WARN_ON(mas, mas->index > mas->last))
@@ -4927,6 +4959,7 @@ int mas_store_gfp(struct ma_state *mas, void *entry, gfp_t gfp)
 	MA_WR_STATE(wr_mas, mas, entry);
 	int ret = 0;
 
+	mas_may_init_lock_check(mas);
 retry:
 	mas_wr_preallocate(&wr_mas, entry);
 	if (unlikely(mas_nomem(mas, gfp))) {
@@ -4957,6 +4990,7 @@ void mas_store_prealloc(struct ma_state *mas, void *entry)
 {
 	MA_WR_STATE(wr_mas, mas, entry);
 
+	mas_lock_check(mas);
 	if (mas->store_type == wr_store_root) {
 		mas_wr_prealloc_setup(&wr_mas);
 		goto store;
@@ -4989,6 +5023,7 @@ int mas_preallocate(struct ma_state *mas, void *entry, gfp_t gfp)
 {
 	MA_WR_STATE(wr_mas, mas, entry);
 
+	mas_may_init_lock_check(mas);
 	mas_wr_prealloc_setup(&wr_mas);
 	mas->store_type = mas_wr_store_type(&wr_mas);
 	mas_prealloc_calc(&wr_mas, entry);
@@ -5474,7 +5509,6 @@ EXPORT_SYMBOL_GPL(mas_find_range);
 static bool mas_find_rev_setup(struct ma_state *mas, unsigned long min,
 		void **entry)
 {
-
 	switch (mas->status) {
 	case ma_active:
 		goto active;
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 04/19] maple_tree: Documentation fix
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (2 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
                   ` (15 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Don't include the word flag in the quotes with the actual flag.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 Documentation/core-api/maple_tree.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index ccdd1615cf974..34964ec88d179 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -211,7 +211,7 @@ Advanced Locking
 
 The maple tree uses a spinlock by default, but external locks can be used for
 tree updates as well.  To use an external lock, the tree must be initialized
-with the ``MT_FLAGS_LOCK_EXTERN flag``, this is usually done with the
+with the ``MT_FLAGS_LOCK_EXTERN`` flag, this is usually done with the
 MTREE_INIT_EXT() #define, which takes an external lock as an argument.
 
 Functions and structures
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (3 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
                   ` (14 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

mas_extend_spanning_null() had a clause if the end of the range being
written (mas->last) is the same as the end of the existing range it is
overwriting (wr_mas->r_max), action will be taken.

This code path is not possible because the only calling function
increments mas->last (unless it's ULONG_MAX) to walk to one beyond the
write and then resets the value back to the initial value.

In the case of mas->last == ULONG_MAX, then the second part of the
statement will always be false - mas->last cannot be less than the node
max.

This code never executed and is flawed anyways (the arguments are
incorrectly ordered), so removing it is the safest action.  Since the
code never executes, it is not fixing any issue so Fixes tag is not
given.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index e2f7474572319..97849db132e7a 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -2996,13 +2996,6 @@ static inline void mas_extend_spanning_null(struct ma_wr_state *l_wr_mas,
 		if (r_mas->last < r_wr_mas->r_max)
 			r_mas->last = r_wr_mas->r_max;
 		r_mas->offset++;
-	} else if ((r_mas->last == r_wr_mas->r_max) &&
-	    (r_mas->last < r_mas->max) &&
-	    !mas_slot_locked(r_mas, r_wr_mas->slots, r_mas->offset + 1)) {
-		r_mas->last = mas_safe_pivot(r_mas, r_wr_mas->pivots,
-					     r_wr_mas->type, r_mas->offset + 1);
-		r_mas->offset++;
-		r_wr_mas->r_max = r_mas->last;
 	}
 }
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (4 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
                   ` (13 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

MAPLE_ALLOC_SLOTS is no longer used, so remove it.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 include/linux/maple_tree.h | 2 --
 1 file changed, 2 deletions(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 220b380cdd491..346eae48cf15e 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -29,13 +29,11 @@
 #define MAPLE_NODE_SLOTS	31	/* 256 bytes including ->parent */
 #define MAPLE_RANGE64_SLOTS	16	/* 256 bytes */
 #define MAPLE_ARANGE64_SLOTS	10	/* 240 bytes */
-#define MAPLE_ALLOC_SLOTS	(MAPLE_NODE_SLOTS - 1)
 #else
 /* 32bit sizes */
 #define MAPLE_NODE_SLOTS	63	/* 256 bytes including ->parent */
 #define MAPLE_RANGE64_SLOTS	32	/* 256 bytes */
 #define MAPLE_ARANGE64_SLOTS	21	/* 240 bytes */
-#define MAPLE_ALLOC_SLOTS	(MAPLE_NODE_SLOTS - 2)
 #endif /* defined(CONFIG_64BIT) || defined(BUILD_VDSO32_64) */
 
 #define MAPLE_NODE_MASK		255UL
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (5 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
                   ` (12 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

When an allocation completely fails, the return is false.  If the
allocation succeeds or partially succeeds, return true to indicate a
retry of the operation.  Note that since the lock may have been dropped,
the operation is retried from the start - including potentially
allocating more memory.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 97849db132e7a..3886b856d6e3a 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5676,10 +5676,11 @@ EXPORT_SYMBOL_GPL(mas_erase);
 
 /**
  * mas_nomem() - Check if there was an error allocating and do the allocation
- * if necessary If there are allocations, then free them.
+ * if necessary.
+ *
  * @mas: The maple state
  * @gfp: The GFP_FLAGS to use for allocations
- * Return: true on allocation, false otherwise.
+ * Return: False on no memory.  True otherwise (partial success as well)
  */
 bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 	__must_hold(mas->tree->ma_lock)
@@ -5695,6 +5696,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 		mas_alloc_nodes(mas, gfp);
 	}
 
+	/*
+	 * Return false on zero forward progress.  Partial allocations are kept
+	 * so the retry path will attempt to get the rest.
+	 */
 	if (!mas->sheaf && !mas->alloc)
 		return false;
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (6 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
                   ` (11 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

The slot contents exist in wr_mas->content, which has less overhead than
reading the slot again.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 3886b856d6e3a..8f5a7fa10cf4c 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3279,7 +3279,7 @@ static inline void mas_wr_slot_store(struct ma_wr_state *wr_mas)
 	void __rcu **slots = wr_mas->slots;
 	bool gap = false;
 
-	gap |= !mt_slot_locked(mas->tree, slots, offset);
+	gap |= !wr_mas->content;
 	gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
 
 	if (wr_mas->offset_end - offset == 1) {
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (7 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
                   ` (10 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Clearing the entire node on the stack is unnecessary since most of the
node will be overwritten anyways.  Just clear what isn't used after the
data is in place.

Benchmarking shows a speedup of 0.67% on a height 4 tree with 2048
entries.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 8f5a7fa10cf4c..1c96b3a5dab60 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3186,7 +3186,7 @@ static void mas_wr_spanning_store(struct ma_wr_state *wr_mas)
 static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
 {
 	unsigned char dst_offset, offset_end;
-	unsigned char copy_size, node_pivots;
+	unsigned char copy_size, node_pivots, node_slots;
 	struct maple_node reuse, *newnode;
 	unsigned long *dst_pivots;
 	void __rcu **dst_slots;
@@ -3199,6 +3199,7 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
 	in_rcu = mt_in_rcu(mas->tree);
 	offset_end = wr_mas->offset_end;
 	node_pivots = mt_pivots[wr_mas->type];
+	node_slots = mt_slots[wr_mas->type];
 	/* Assume last adds an entry */
 	new_end = mas->end + 1 - offset_end + mas->offset;
 	if (mas->last == wr_mas->end_piv) {
@@ -3210,7 +3211,6 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
 	if (in_rcu) {
 		newnode = mas_pop_node(mas);
 	} else {
-		memset(&reuse, 0, sizeof(struct maple_node));
 		newnode = &reuse;
 	}
 
@@ -3254,7 +3254,21 @@ static inline void mas_wr_node_store(struct ma_wr_state *wr_mas)
 		dst_pivots[new_end] = mas->max;
 
 done:
-	mas_leaf_set_meta(newnode, maple_leaf_64, new_end);
+	if (!in_rcu && new_end + 2 < node_slots) {
+		unsigned char clear_from = new_end + 1;
+
+		/*
+		 * Note that the last slot is never cleared, since the metadata
+		 * will be stored there or it has a value.
+		 */
+		memset(dst_slots + clear_from, 0,
+		       sizeof(void __rcu *) * (node_slots - clear_from));
+		if (clear_from < node_pivots)
+			memset(dst_pivots + clear_from, 0,
+			       sizeof(unsigned long) * (node_pivots - clear_from));
+	}
+
+	mas_leaf_set_meta(newnode, wr_mas->type, new_end);
 	if (in_rcu) {
 		struct maple_enode *old_enode = mas->node;
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (8 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
                   ` (9 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Use three new local booleans instead of reading other structures.  This
has shown an increase of 0.62% on a 2048 entry tree of height 4.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 1c96b3a5dab60..999a1d095e369 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3736,6 +3736,9 @@ static inline enum store_type mas_wr_store_type(struct ma_wr_state *wr_mas)
 {
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char new_end;
+	bool appending;
+	bool one_slot;
+	bool in_rcu;
 
 	if (unlikely(mas_is_none(mas) || mas_is_ptr(mas)))
 		return wr_store_root;
@@ -3755,21 +3758,30 @@ static inline enum store_type mas_wr_store_type(struct ma_wr_state *wr_mas)
 		return wr_new_root;
 
 	new_end = mas_wr_new_end(wr_mas);
+	in_rcu = mt_in_rcu(mas->tree);
+	appending = mas->offset == mas->end;
+	one_slot = wr_mas->offset_end - mas->offset == 1;
+
 	/* Potential spanning rebalance collapsing a node */
 	if (new_end < mt_min_slots[wr_mas->type]) {
 		if (!mte_is_root(mas->node))
 			return  wr_rebalance;
+		if (!in_rcu) {
+			if (appending)
+				return wr_append;
+			else if (mas->end == new_end && one_slot)
+				return wr_slot_store;
+		}
 		return wr_node_store;
 	}
 
 	if (new_end >= mt_slots[wr_mas->type])
 		return wr_split_store;
 
-	if (!mt_in_rcu(mas->tree) && (mas->offset == mas->end))
+	if (!in_rcu && appending)
 		return wr_append;
 
-	if ((new_end == mas->end) && (!mt_in_rcu(mas->tree) ||
-		(wr_mas->offset_end - mas->offset == 1)))
+	if (new_end == mas->end && (!in_rcu || one_slot))
 		return wr_slot_store;
 
 	return wr_node_store;
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 11/19] maple_tree: Add bulk parent set helper
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (9 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
                   ` (8 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Instead of calculating the parent pointer each time for a child,
cache the majority of the parent pointer and only change the slot per
child.

Drop the mas_set_parent() function since the last user has been removed.

Testing on a tree containing 2048 entries of height 4 had an increased
gain of 3.51% on nodes tracking gaps.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 97 +++++++++++++++++++++---------------------------
 1 file changed, 42 insertions(+), 55 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 999a1d095e369..768193406c3db 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -455,46 +455,6 @@ enum maple_type mas_parent_type(struct ma_state *mas, struct maple_enode *enode)
 	return 0;
 }
 
-/*
- * mas_set_parent() - Set the parent node and encode the slot
- * @mas: The maple state
- * @enode: The encoded maple node.
- * @parent: The encoded maple node that is the parent of @enode.
- * @slot: The slot that @enode resides in @parent.
- *
- * Slot number is encoded in the enode->parent bit 3-6 or 2-6, depending on the
- * parent type.
- */
-static inline
-void mas_set_parent(struct ma_state *mas, struct maple_enode *enode,
-		    const struct maple_enode *parent, unsigned char slot)
-{
-	unsigned long val = (unsigned long)parent;
-	unsigned long shift;
-	unsigned long type;
-	enum maple_type p_type = mte_node_type(parent);
-
-	MAS_BUG_ON(mas, p_type == maple_dense);
-	MAS_BUG_ON(mas, p_type == maple_leaf_64);
-
-	switch (p_type) {
-	case maple_range_64:
-	case maple_arange_64:
-		shift = MAPLE_PARENT_SLOT_SHIFT;
-		type = MAPLE_PARENT_RANGE64;
-		break;
-	default:
-	case maple_dense:
-	case maple_leaf_64:
-		shift = type = 0;
-		break;
-	}
-
-	val &= ~MAPLE_NODE_MASK; /* Clear all node metadata in parent */
-	val |= (slot << shift) | type;
-	mte_to_node(enode)->parent = ma_parent_ptr(val);
-}
-
 /*
  * mte_parent_slot() - get the parent slot of @enode.
  * @enode: The encoded maple node.
@@ -876,6 +836,42 @@ static inline void ma_set_meta_gap(struct maple_node *mn, enum maple_type mt,
 	meta->gap = offset;
 }
 
+/*
+ * mas_set_parent_slots() - Bulk operation to set many slot parent pointers
+ * @mas: The maple state
+ * @parent: The encoded maple node that is the parent of @enode.
+ * @slot: The slot that of the @enode.
+ * @start_slot: The offset into @slot
+ * @count: The number of slots to set (eg: exclusive)
+ */
+static inline
+void mas_set_parent_slots(struct ma_state *mas, struct maple_enode *parent,
+	void __rcu **slots, unsigned char start_slot, unsigned char count)
+{
+	unsigned long val;
+	unsigned long shift;
+	unsigned long type;
+	enum maple_type p_type = mte_node_type(parent);
+	unsigned char i;
+
+	MAS_BUG_ON(mas, p_type != maple_range_64 &&
+			p_type != maple_arange_64);
+
+	shift = MAPLE_PARENT_SLOT_SHIFT;
+	type = MAPLE_PARENT_RANGE64;
+
+	val = (unsigned long)parent;
+	val &= ~MAPLE_NODE_MASK;
+
+	for (i = 0; i < count; i++) {
+		unsigned long pval = val | ((start_slot + i) << shift) | type;
+		struct maple_enode *child;
+
+		child = mt_slot_locked(mas->tree, slots, i);
+		mte_to_node(child)->parent = ma_parent_ptr(pval);
+	}
+}
+
 /*
  * mat_add() - Add a @dead_enode to the ma_topiary of a list of dead nodes.
  * @mat: the ma_topiary, a linked list of dead nodes.
@@ -1608,14 +1604,10 @@ static inline void mas_adopt_children(struct ma_state *mas,
 	struct maple_node *node = mte_to_node(parent);
 	void __rcu **slots = ma_slots(node, type);
 	unsigned long *pivots = ma_pivots(node, type);
-	struct maple_enode *child;
-	unsigned char offset;
+	unsigned char end;
 
-	offset = ma_data_end(node, type, pivots, mas->max);
-	do {
-		child = mas_slot_locked(mas, slots, offset);
-		mas_set_parent(mas, child, parent, offset);
-	} while (offset--);
+	end = ma_data_end(node, type, pivots, mas->max);
+	mas_set_parent_slots(mas, parent, slots, 0, end + 1);
 }
 
 /*
@@ -1997,15 +1989,10 @@ unsigned long node_copy(struct ma_state *mas, struct maple_node *src,
 	s_slots = ma_slots(src, s_mt) + start;
 	s_pivots = ma_pivots(src, s_mt) + start;
 	memcpy(d_slots, s_slots, size * sizeof(void __rcu *));
-	if (!ma_is_leaf(d_mt) && s_mt == maple_copy) {
-		struct maple_enode *edst = mt_mk_node(dst, d_mt);
-
 
-		for (int i = 0; i < size; i++)
-			mas_set_parent(mas,
-				       mt_slot_locked(mas->tree, d_slots, i),
-				       edst, d_start + i);
-	}
+	if (!ma_is_leaf(d_mt) && s_mt == maple_copy)
+		mas_set_parent_slots(mas, mt_mk_node(dst, d_mt),
+				     d_slots, d_start, size);
 
 	d_gaps = ma_gaps(dst, d_mt);
 	if (d_gaps) {
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (10 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:41   ` Chuck Lever
  2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
                   ` (7 subsequent siblings)
  19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Chris Mason, Chuck Lever

If mas_alloc_cyclic() is called during a low memory situation, it is
possible the lock may be dropped so reclaim can occur.  There is a
window where some other task may allocate the same id and cause the
mas_insert() to fail with -EEXIST.  In this scenario the function will
return -EEXIST, which is not expected.

Modifying the retry on mas_nomem() to re-search for a slot means that
any race with other writes will not matter as the lock will be held
between finding the index and writing the index.

Moving the flag logic avoids cases where the flag is modified on drop
lock/reacquire or when the write fails after clearing the flag.

No existing users are exposed to this issue.

Fixes: 9b6713cc75229 ("maple_tree: Add mtree_alloc_cyclic()")
Reported-by: Chris Mason <clm@meta.com>
Cc: Chuck Lever <cel@kernel.org>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 43 ++++++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 768193406c3db..1bc177bf42f9f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -3867,35 +3867,40 @@ int mas_alloc_cyclic(struct ma_state *mas, unsigned long *startp,
 		void *entry, unsigned long range_lo, unsigned long range_hi,
 		unsigned long *next, gfp_t gfp)
 {
-	unsigned long min = range_lo;
-	int ret = 0;
-
-	range_lo = max(min, *next);
-	ret = mas_empty_area(mas, range_lo, range_hi, 1);
-	if ((mas->tree->ma_flags & MT_FLAGS_ALLOC_WRAPPED) && ret == 0) {
-		mas->tree->ma_flags &= ~MT_FLAGS_ALLOC_WRAPPED;
-		ret = 1;
-	}
-	if (ret < 0 && range_lo > min) {
-		mas_reset(mas);
-		ret = mas_empty_area(mas, min, range_hi, 1);
-		if (ret == 0)
-			ret = 1;
-	}
-	if (ret < 0)
-		return ret;
+	int ret;
+	unsigned long min;
 
+	min = range_lo;
 	do {
+		range_lo = max(min, *next);
+		ret = mas_empty_area(mas, range_lo, range_hi, 1);
+		if (ret < 0 && range_lo > min) {
+			mas_reset(mas);
+			ret = mas_empty_area(mas, min, range_hi, 1);
+			if (ret == 0)
+				ret = 1;
+		}
+		if (ret < 0)
+			goto out;
+
 		mas_insert(mas, entry);
 	} while (mas_nomem(mas, gfp));
-	if (mas_is_err(mas))
-		return xa_err(mas->node);
 
+	if (mas_is_err(mas)) {
+		ret = xa_err(mas->node);
+		goto out;
+	}
+
+	if ((mas->tree->ma_flags & MT_FLAGS_ALLOC_WRAPPED) && ret == 0) {
+		mas->tree->ma_flags &= ~MT_FLAGS_ALLOC_WRAPPED;
+		ret = 1;
+	}
 	*startp = mas->index;
 	*next = *startp + 1;
 	if (*next == 0)
 		mas->tree->ma_flags |= MT_FLAGS_ALLOC_WRAPPED;
 
+out:
 	mas_destroy(mas);
 	return ret;
 }
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (11 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:32   ` Rik van Riel
  2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
                   ` (6 subsequent siblings)
  19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Jason Gunthorpe, Rik van Riel

State that the mas_erase() and mtree_erase() functions may use
GFP_KERNEL on allocation retry.  Don't just depend on people reading the
documentation by adding a check that will warn of the use.

Cc: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 1bc177bf42f9f..15e8081f61808 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5657,6 +5657,10 @@ EXPORT_SYMBOL_GPL(mas_find_range_rev);
  * Searches for @mas->index, sets @mas->index and @mas->last to the range and
  * erases that range.
  *
+ * Note that erase requires allocations and will use GFP_KERNEL to do so if
+ * necessary.  If the allocation fails, the internal lock will be dropped to
+ * retry.
+ *
  * Return: the entry that was erased or %NULL, @mas->index and @mas->last are updated.
  */
 void *mas_erase(struct ma_state *mas)
@@ -5665,13 +5669,21 @@ void *mas_erase(struct ma_state *mas)
 	unsigned long index = mas->index;
 	MA_WR_STATE(wr_mas, mas, NULL);
 
+	/*
+	 * In low memory situations, the allocation is retried with the gfp flag
+	 * GFP_KERNEL.  The internal spinlock is dropped in mas_nomem(), however
+	 * the external lock is not dropped.
+	 */
+	if (mt_external_lock(mas->tree))
+		might_alloc(GFP_KERNEL);
+
 	if (!mas_is_active(mas) || !mas_is_start(mas))
 		mas->status = ma_start;
 
 write_retry:
 	entry = mas_state_walk(mas);
 	if (!entry)
-		return NULL;
+		goto out;
 
 	/* Must reset to ensure spanning writes of last slot are detected */
 	mas_reset(mas);
@@ -5682,8 +5694,10 @@ void *mas_erase(struct ma_state *mas)
 		goto write_retry;
 	}
 
-	if (mas_is_err(mas))
+	if (mas_is_err(mas)) {
+		entry = NULL;
 		goto out;
+	}
 
 	mas_wr_store_entry(&wr_mas);
 out:
@@ -6011,6 +6025,10 @@ EXPORT_SYMBOL(mtree_alloc_rrange);
  * Erasing is the same as a walk to an entry then a store of a NULL to that
  * ENTIRE range.  In fact, it is implemented as such using the advanced API.
  *
+ * Note that erase requires allocations and will use GFP_KERNEL to do so if
+ * necessary.  If the allocation fails, the internal lock will be dropped to
+ * retry.
+ *
  * Return: The entry stored at the @index or %NULL
  */
 void *mtree_erase(struct maple_tree *mt, unsigned long index)
@@ -6020,6 +6038,7 @@ void *mtree_erase(struct maple_tree *mt, unsigned long index)
 	MA_STATE(mas, mt, index, index);
 	trace_ma_op(TP_FCT, &mas);
 
+	might_alloc(GFP_KERNEL);
 	mtree_lock(mt);
 	entry = mas_erase(&mas);
 	mtree_unlock(mt);
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (12 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 23:02   ` Andrew Morton
  2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Rik van Riel, Jason Gunthorpe

Allocations should never fail in the circumstances that are expected to
occur.  Add checks in the code to ensure the circumstances are correctly
set up by the user and warn if they are not.

Also add a warning on failure to allocate, which should never happen.

Cc: Rik van Riel <riel@surriel.com>
Cc: Jason Gunthorpe <jgg@ziepe.ca>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 15e8081f61808..afe65ff0f8a6f 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -5720,6 +5720,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 	if (likely(mas->node != MA_ERROR(-ENOMEM)))
 		return false;
 
+	/* Allocations can fail, don't do this. */
+	WARN_ON_ONCE(!gfpflags_allow_blocking(gfp) &&
+		     mt_external_lock(mas->tree));
+
 	if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
 		mtree_unlock(mas->tree);
 		mas_alloc_nodes(mas, gfp);
@@ -5730,9 +5734,12 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
 
 	/*
 	 * Return false on zero forward progress.  Partial allocations are kept
-	 * so the retry path will attempt to get the rest.
+	 * so the retry path will attempt to get the rest.  The failure should
+	 * not happen as we try our best to reclaim.  The user would need an
+	 * external lock with a non-blocking gfp in a low memory situation -
+	 * which would have triggered the first warning in this function.
 	 */
-	if (!mas->sheaf && !mas->alloc)
+	if (WARN_ON_ONCE(!mas->sheaf && !mas->alloc))
 		return false;
 
 	mas_reset(mas);
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 15/19] maple_tree: Document erase and allocations better
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (13 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
                   ` (4 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Jason Gunthorpe, Rik van Riel

During a discussion on the maple tree erase process and GFP flags, Jason
suggested there be an amendment to the documentation to clarify the
situation on allocations within the tree.

The added text is an attempt to better explain that the tree may
allocate, even when erasing, and provide some guidance on how to work
around such issues.

Link: https://lore.kernel.org/all/20260617180419.GA231643@ziepe.ca/
Suggested-by: Jason Gunthorpe <jgg@ziepe.ca>
Cc: Rik van Riel <riel@surriel.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 Documentation/core-api/maple_tree.rst | 19 ++++++++++++++++---
 1 file changed, 16 insertions(+), 3 deletions(-)

diff --git a/Documentation/core-api/maple_tree.rst b/Documentation/core-api/maple_tree.rst
index 34964ec88d179..1eae24a9c18b4 100644
--- a/Documentation/core-api/maple_tree.rst
+++ b/Documentation/core-api/maple_tree.rst
@@ -17,7 +17,8 @@ supports iterating over a range of entries and going to the previous or next
 entry in a cache-efficient manner.  The tree can also be put into an RCU-safe
 mode of operation which allows reading and writing concurrently.  Writers must
 synchronize on a lock, which can be the default spinlock, or the user can set
-the lock to an external lock of a different type.
+the lock to an external lock of a different type.  Note that external locks may
+interfere with allocations in a low memory situation.
 
 The Maple Tree maintains a small memory footprint and was designed to use
 modern processor cache efficiently.  The majority of the users will be able to
@@ -42,6 +43,15 @@ successful store operation within a given
 code segment when allocating cannot be done.  Allocations of nodes are
 relatively small at around 256 bytes.
 
+Since the maple tree uses internal nodes that are allocated and has rules on
+data density, erasing an entry may cause allocations to occur.  That is,
+erasing an entry may consume memory.  Users must take care to ensure that they
+do not violate the larger system constraints on when and how memory is
+allocated.  Most situations are fine to allocate, but the pre-allocation
+support is provided as a mechanism to avoid trickier situations.  There is also
+the possibility of using special entries and clean up the tree later, in
+extreme circumstances.
+
 .. _maple-tree-normal-api:
 
 Normal API
@@ -63,7 +73,8 @@ success or an error code otherwise.  mtree_store_range() works in the same way
 but takes a range.  mtree_load() is used to retrieve the entry stored at a
 given index.  You can use mtree_erase() to erase an entire range by only
 knowing one value within that range, or mtree_store() call with an entry of
-NULL may be used to partially erase a range or many ranges at once.
+NULL may be used to partially erase a range or many ranges at once.  Note that
+mtree_erase() may use GFP_KERNEL on allocations.
 
 If you want to only store a new entry to a range (or index) if that range is
 currently ``NULL``, you can use mtree_insert_range() or mtree_insert() which
@@ -163,7 +174,9 @@ You can use mas_erase() to erase an entire range by setting index and
 last of the maple state to the desired range to erase.  This will erase
 the first range that is found in that range, set the maple state index
 and last as the range that was erased and return the entry that existed
-at that location.
+at that location.  Note that mas_erase() may allocate with the GFP_KERNEL flag.
+If this is not okay, consider using mas_store_gfp() and pass it a ``NULL``,
+after setting up the correct range by walking to the entry.
 
 You can walk each entry within a range by using mas_for_each().  If you want
 to walk each element of the tree then ``0`` and ``ULONG_MAX`` may be used as
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 16/19] maple_tree: Change two GFP flags in tests
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (14 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
                   ` (3 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle),
	Joe Perches

The GFP flags in two tests are obviously incorrect.  Make the tests
correctly run by updating the GFP flags.

Link: https://lore.kernel.org/all/d9cbb89faa5bdb71d451781d214a51ce8923a83e.camel@perches.com/
Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 tools/testing/radix-tree/maple.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/radix-tree/maple.c b/tools/testing/radix-tree/maple.c
index 0607913a3022d..d967e76a3c065 100644
--- a/tools/testing/radix-tree/maple.c
+++ b/tools/testing/radix-tree/maple.c
@@ -35234,7 +35234,7 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	mt_set_non_kernel(1);
 	/* Spanning store */
 	mas_set_range(&mas, 1, 100);
-	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL & GFP_NOWAIT) == 0);
+	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_NOWAIT) == 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
 	MT_BUG_ON(mt, allocated != 0);
@@ -35257,7 +35257,7 @@ static noinline void __init check_prealloc(struct maple_tree *mt)
 	MT_BUG_ON(mt, mas_allocated(&mas) != 0);
 	mas_set_range(&mas, 0, 200);
 	mt_set_non_kernel(1);
-	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_KERNEL & GFP_NOWAIT) == 0);
+	MT_BUG_ON(mt, mas_preallocate(&mas, ptr, GFP_NOWAIT) == 0);
 	allocated = mas_allocated(&mas);
 	height = mas_mt_height(&mas);
 	MT_BUG_ON(mt, allocated != 0);
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 17/19] maple_tree: Fix argument name in header
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (15 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

The mas_prev_range() function takes a min and not a max.

Fixes: 6b9e93e010204 ("maple_tree: add mas_prev_range() and mas_find_range_rev interface")
Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 include/linux/maple_tree.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/maple_tree.h b/include/linux/maple_tree.h
index 346eae48cf15e..5e0bd2857d941 100644
--- a/include/linux/maple_tree.h
+++ b/include/linux/maple_tree.h
@@ -576,7 +576,7 @@ void maple_tree_init(void);
 void mas_destroy(struct ma_state *mas);
 
 void *mas_prev(struct ma_state *mas, unsigned long min);
-void *mas_prev_range(struct ma_state *mas, unsigned long max);
+void *mas_prev_range(struct ma_state *mas, unsigned long min);
 void *mas_next(struct ma_state *mas, unsigned long max);
 void *mas_next_range(struct ma_state *mas, unsigned long max);
 
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 18/19] maple_tree: Avoid extra gap calculation
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (16 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
  2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

Prior to ending the ascension loop of larger operations like split,
rebalance, and spanning store the gap in the node had been calculated.
Once the node is inserted into the tree, the gap is recalculated in
mas_update_gap().  This can be avoided by creating a helper for
mas_update_gap() that accepts the known gap value, which reduces the
operations required for gap updating path.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 28 +++++++++++++++++-----------
 1 file changed, 17 insertions(+), 11 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index afe65ff0f8a6f..4336a98763984 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -1565,14 +1565,26 @@ static inline void mas_parent_gap(struct ma_state *mas, unsigned char offset,
 	goto ascend;
 }
 
+static __always_inline void mas_update_gap_known(struct ma_state *mas,
+		unsigned long gap)
+{
+	unsigned char pslot;
+	unsigned long p_gap;
+
+	pslot = mte_parent_slot(mas->node);
+	p_gap = ma_gaps(mte_parent(mas->node),
+			mas_parent_type(mas, mas->node))[pslot];
+
+	if (p_gap != gap)
+		mas_parent_gap(mas, pslot, gap);
+}
+
 /*
  * mas_update_gap() - Update a nodes gaps and propagate up if necessary.
  * @mas: the maple state.
  */
 static inline void mas_update_gap(struct ma_state *mas)
 {
-	unsigned char pslot;
-	unsigned long p_gap;
 	unsigned long max_gap;
 
 	if (!mt_is_alloc(mas->tree))
@@ -1582,13 +1594,7 @@ static inline void mas_update_gap(struct ma_state *mas)
 		return;
 
 	max_gap = mas_max_gap(mas);
-
-	pslot = mte_parent_slot(mas->node);
-	p_gap = ma_gaps(mte_parent(mas->node),
-			mas_parent_type(mas, mas->node))[pslot];
-
-	if (p_gap != max_gap)
-		mas_parent_gap(mas, pslot, max_gap);
+	mas_update_gap_known(mas, max_gap);
 }
 
 /*
@@ -2136,8 +2142,8 @@ static inline void mas_wmb_replace(struct ma_state *mas, struct maple_copy *cp)
 	mas->node = mt_slot_locked(mas->tree, cp->slot, 0);
 	/* Insert the new data in the tree */
 	mas_topiary_replace(mas, old_enode, cp->height);
-	if (!mte_is_leaf(mas->node))
-		mas_update_gap(mas);
+	if (mt_is_alloc(mas->tree) && !mte_is_root(mas->node))
+		mas_update_gap_known(mas, cp->gap[0]);
 
 	mtree_range_walk(mas);
 }
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable()
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (17 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
@ 2026-06-30 19:08 ` Liam R. Howlett (Oracle)
  2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton
  19 siblings, 0 replies; 24+ messages in thread
From: Liam R. Howlett (Oracle) @ 2026-06-30 19:08 UTC (permalink / raw)
  To: Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Liam R. Howlett (Oracle)

A check in mas_walk() was incorrect and caused inefficient use of the
maple state.  The same issue existed in mas_erase(), but was left
unfixed.  Making a helper function is the obvious answer.

Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
---
 lib/maple_tree.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index 4336a98763984..06cc05b79fbdc 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -261,6 +261,12 @@ static inline bool mas_is_underflow(struct ma_state *mas)
 	return mas->status == ma_underflow;
 }
 
+static inline void mas_make_walkable(struct ma_state *mas)
+{
+	if (!mas_is_active(mas) && !mas_is_start(mas))
+		mas->status = ma_start;
+}
+
 static __always_inline struct maple_node *mte_to_node(
 		const struct maple_enode *entry)
 {
@@ -4446,8 +4452,7 @@ void *mas_walk(struct ma_state *mas)
 	void *entry;
 
 	mas_may_init_lock_check(mas);
-	if (!mas_is_active(mas) && !mas_is_start(mas))
-		mas->status = ma_start;
+	mas_make_walkable(mas);
 retry:
 	entry = mas_state_walk(mas);
 	if (mas_is_start(mas)) {
@@ -5683,9 +5688,7 @@ void *mas_erase(struct ma_state *mas)
 	if (mt_external_lock(mas->tree))
 		might_alloc(GFP_KERNEL);
 
-	if (!mas_is_active(mas) || !mas_is_start(mas))
-		mas->status = ma_start;
-
+	mas_make_walkable(mas);
 write_retry:
 	entry = mas_state_walk(mas);
 	if (!entry)
-- 
2.47.3



^ permalink raw reply related	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations
  2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
@ 2026-06-30 19:32   ` Rik van Riel
  0 siblings, 0 replies; 24+ messages in thread
From: Rik van Riel @ 2026-06-30 19:32 UTC (permalink / raw)
  To: Liam R. Howlett (Oracle), Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Jason Gunthorpe

On Tue, 2026-06-30 at 15:08 -0400, Liam R. Howlett (Oracle) wrote:
> State that the mas_erase() and mtree_erase() functions may use
> GFP_KERNEL on allocation retry.  Don't just depend on people reading
> the
> documentation by adding a check that will warn of the use.
> 
> Cc: Jason Gunthorpe <jgg@ziepe.ca>
> Cc: Rik van Riel <riel@surriel.com>
> Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>
> 

Reviewed-by: Rik van Riel <riel@surriel.com>

-- 
All Rights Reversed.


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic()
  2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
@ 2026-06-30 19:41   ` Chuck Lever
  0 siblings, 0 replies; 24+ messages in thread
From: Chuck Lever @ 2026-06-30 19:41 UTC (permalink / raw)
  To: Liam R. Howlett (Oracle), Andrew Morton
  Cc: linux-mm, linux-kernel, maple-tree, Chris Mason


On Tue, Jun 30, 2026, at 3:08 PM, Liam R. Howlett (Oracle) wrote:
> If mas_alloc_cyclic() is called during a low memory situation, it is
> possible the lock may be dropped so reclaim can occur.  There is a
> window where some other task may allocate the same id and cause the
> mas_insert() to fail with -EEXIST.  In this scenario the function will
> return -EEXIST, which is not expected.
>
> Modifying the retry on mas_nomem() to re-search for a slot means that
> any race with other writes will not matter as the lock will be held
> between finding the index and writing the index.
>
> Moving the flag logic avoids cases where the flag is modified on drop
> lock/reacquire or when the write fails after clearing the flag.
>
> No existing users are exposed to this issue.
>
> Fixes: 9b6713cc75229 ("maple_tree: Add mtree_alloc_cyclic()")
> Reported-by: Chris Mason <clm@meta.com>
> Cc: Chuck Lever <cel@kernel.org>
> Signed-off-by: Liam R. Howlett (Oracle) <liam@infradead.org>

Reviewed-by: Chuck Lever <cel@kernel.org>

-- 
Chuck Lever


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail
  2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
@ 2026-06-30 23:02   ` Andrew Morton
  0 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2026-06-30 23:02 UTC (permalink / raw)
  To: Liam R. Howlett (Oracle)
  Cc: linux-mm, linux-kernel, maple-tree, Rik van Riel, Jason Gunthorpe

On Tue, 30 Jun 2026 15:08:38 -0400 "Liam R. Howlett (Oracle)" <liam@infradead.org> wrote:

> Allocations should never fail in the circumstances that are expected to
> occur.  Add checks in the code to ensure the circumstances are correctly
> set up by the user and warn if they are not.
> 
> Also add a warning on failure to allocate, which should never happen.
> 
> ...
>
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -5720,6 +5720,10 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
>  	if (likely(mas->node != MA_ERROR(-ENOMEM)))
>  		return false;
>  
> +	/* Allocations can fail, don't do this. */
> +	WARN_ON_ONCE(!gfpflags_allow_blocking(gfp) &&
> +		     mt_external_lock(mas->tree));
> +
>  	if (gfpflags_allow_blocking(gfp) && !mt_external_lock(mas->tree)) {
>  		mtree_unlock(mas->tree);
>  		mas_alloc_nodes(mas, gfp);
> @@ -5730,9 +5734,12 @@ bool mas_nomem(struct ma_state *mas, gfp_t gfp)
>  
>  	/*
>  	 * Return false on zero forward progress.  Partial allocations are kept
> -	 * so the retry path will attempt to get the rest.
> +	 * so the retry path will attempt to get the rest.  The failure should
> +	 * not happen as we try our best to reclaim.  The user would need an
> +	 * external lock with a non-blocking gfp in a low memory situation -
> +	 * which would have triggered the first warning in this function.
>  	 */
> -	if (!mas->sheaf && !mas->alloc)
> +	if (WARN_ON_ONCE(!mas->sheaf && !mas->alloc))
>  		return false;
>  

Can either of these warnings duplicate the effect of !__GFP_NOWARN?


^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [PATCH v2 00/19] maple_tree: lock checking and clean ups
  2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
                   ` (18 preceding siblings ...)
  2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
@ 2026-06-30 23:05 ` Andrew Morton
  19 siblings, 0 replies; 24+ messages in thread
From: Andrew Morton @ 2026-06-30 23:05 UTC (permalink / raw)
  To: Liam R. Howlett (Oracle); +Cc: linux-mm, linux-kernel, maple-tree

On Tue, 30 Jun 2026 15:08:24 -0400 "Liam R. Howlett (Oracle)" <liam@infradead.org> wrote:

> The goals of this series are:
> 1. Lock issue detection
>   A number of syzbot reports are incorrectly pointing to the mm exit as
>   a source of the locking error.  The first three patches attempt to
>   help users detect errors in their locking - but they still have to use
>   LOCKDEP.  I guess it's still down to hope and prayers.
> 
> 2. Documentation fixes
>   The documentation was lacking clarity, there are updates to try and
>   help the users, especially around the erase() cases.
> 
> 3. Two benign issues
>   The cyclic allocator may have a race, although no in-kernel user can
>   hit it.
>   The erase functions may cause allocation issues if used with the
>   incorrect locking type, but none are present in-tree.
> 
> Beyond these goals there are some test fixes, some general speed-up
> patches targeting extra work and cycles, and dropping dead code.

Great, thanks, I'll add this to mm-new for some testing.



^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2026-06-30 23:05 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-30 19:08 [PATCH v2 00/19] maple_tree: lock checking and clean ups Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 01/19] maple_tree: Add rcu locking check when LOCKDEP is enabled Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 02/19] locking/lockdep: Add sequence counter to held_lock Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 03/19] maple_tree: Add write lock checking with lockdep sequence numbers Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 04/19] maple_tree: Documentation fix Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 05/19] maple_tree: Drop dead code from mas_extend_spanning_null() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 06/19] maple_tree: Drop MAPLE_ALLOC_SLOTS Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 07/19] maple_tree: Clarify comments on mas_nomem() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 08/19] maple_tree: Use prefetched value in mas_wr_store_type() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 09/19] maple_tree: Optimise mas_wr_node_store() when not in rcu mode Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 10/19] maple_tree: micro optimisation of mas_wr_store_type() Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 11/19] maple_tree: Add bulk parent set helper Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 12/19] maple_tree: Catch race in mas_alloc_cyclic() Liam R. Howlett (Oracle)
2026-06-30 19:41   ` Chuck Lever
2026-06-30 19:08 ` [PATCH v2 13/19] maple_tree: Document that erase may use GFP_KERNEL for allocations Liam R. Howlett (Oracle)
2026-06-30 19:32   ` Rik van Riel
2026-06-30 19:08 ` [PATCH v2 14/19] maple_tree: WARN_ON_ONCE when allocations fail Liam R. Howlett (Oracle)
2026-06-30 23:02   ` Andrew Morton
2026-06-30 19:08 ` [PATCH v2 15/19] maple_tree: Document erase and allocations better Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 16/19] maple_tree: Change two GFP flags in tests Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 17/19] maple_tree: Fix argument name in header Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 18/19] maple_tree: Avoid extra gap calculation Liam R. Howlett (Oracle)
2026-06-30 19:08 ` [PATCH v2 19/19] maple_tree: Add helper mas_make_walkable() Liam R. Howlett (Oracle)
2026-06-30 23:05 ` [PATCH v2 00/19] maple_tree: lock checking and clean ups Andrew Morton

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox