linux-mm.kvack.org archive mirror
* [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements
@ 2025-04-15  2:39 Baoquan He
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable Baoquan He
                   ` (6 more replies)
  0 siblings, 7 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

These were made from code inspection in mm/vmalloc.c.

Baoquan He (5):
  mm/vmalloc.c: change purge_nodes as local static variable
  mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  mm/vmalloc: optimize function vm_unmap_aliases()
  mm/vmalloc.c: return explicit error value in alloc_vmap_area()

 mm/vmalloc.c | 68 +++++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 36 deletions(-)

-- 
2.41.0




* [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
@ 2025-04-15  2:39 ` Baoquan He
  2025-04-15 10:47   ` Uladzislau Rezki
                     ` (2 more replies)
  2025-04-15  2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
                   ` (5 subsequent siblings)
  6 siblings, 3 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

Static variable 'purge_nodes' is defined in global scope, while it's
only used in function __purge_vmap_area_lazy(). It mainly serves to
avoid allocating memory repeatedly, especially when NR_CPUS is big.

A local static variable can satisfy the demand just as well, and it
improves code readability. Hence move its definition into
__purge_vmap_area_lazy().
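
To illustrate the pattern (a minimal sketch, not the kernel code; the
function name here is made up): a large scratch object that should not
live on the stack keeps static storage, but with function-local
visibility. Callers must already be serialized, which in this case is
guaranteed by vmap_purge_lock:

    #include <linux/cpumask.h>

    static bool has_purge_work(void)
    {
            /* one shared instance, visible only where it is used */
            static cpumask_t purge_nodes;

            /* reset the shared mask before each use */
            cpumask_clear(&purge_nodes);
            /* ... set a bit for each node holding lazily freed areas ... */
            return !cpumask_empty(&purge_nodes);
    }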

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6ee7fc2ec986..aca1905d3397 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock);
 
 /* for per-CPU blocks */
 static void purge_fragmented_blocks_allcpus(void);
-static cpumask_t purge_nodes;
 
 static void
 reclaim_list_global(struct list_head *head)
@@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end,
 {
 	unsigned long nr_purged_areas = 0;
 	unsigned int nr_purge_helpers;
+	static cpumask_t purge_nodes;
 	unsigned int nr_purge_nodes;
 	struct vmap_node *vn;
 	int i;
-- 
2.41.0




* [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable Baoquan He
@ 2025-04-15  2:39 ` Baoquan He
  2025-04-15 15:25   ` Uladzislau Rezki
  2025-04-15 19:09   ` Shivank Garg
  2025-04-15  2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
                   ` (4 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

When finding a VA in vn->busy, if the VA spans several nodes and the
passed addr is not the same as va->va_start, we should scan the
vmap_nodes in reverse order, because the starting address of the VA
must be smaller than the passed addr if the addr really resides in
that VA.

E.g. on a system with nr_vmap_nodes=100,

     <----va---->
 -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
    ...   n-1   n    n+1   n+2   ...    99     0     1

The VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If
the passed addr is within 'n+2', we should try the nodes backwards,
'n+1' then 'n', and succeed very soon.

Meanwhile we still need to loop around, because a VA could span from
node 'n' through node 99 to node 0 and node 1.

Anyway, searching in reverse order can improve efficiency on systems
with many CPUs.
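
A minimal sketch of the wrap-around scan (illustrative names, not the
exact kernel code): step to the previous index modulo nr_nodes, written
so that an unsigned index never underflows at 0:

    #include <stdbool.h>

    /* Scan node indices backwards from 'start', wrapping around once. */
    static int scan_backwards(unsigned int start, unsigned int nr_nodes,
                              bool (*match)(unsigned int idx))
    {
            unsigned int i = start, j = start;

            do {
                    if (match(i))
                            return (int)i;
                    /* (i - 1) mod nr_nodes, without unsigned underflow */
            } while ((i = (i + nr_nodes - 1) % nr_nodes) != j);

            return -1;
    }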

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index aca1905d3397..488d69b56765 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 
 		if (va)
 			return va;
-	} while ((i = (i + 1) % nr_vmap_nodes) != j);
+	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
 
 	return NULL;
 }
@@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
 
 		if (va)
 			return va;
-	} while ((i = (i + 1) % nr_vmap_nodes) != j);
+	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
 
 	return NULL;
 }
-- 
2.41.0




* [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable Baoquan He
  2025-04-15  2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
@ 2025-04-15  2:39 ` Baoquan He
  2025-04-15 10:29   ` Shivank Garg
  2025-04-16 13:50   ` Uladzislau Rezki
  2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
also moved into the free vmap tree, partially or completely; that is
done in decay_va_pool_node(). When doing that, for each pool of a node,
the whole list is detached from the pool for handling, at which point
the pool is empty. It's not necessary to update the pool size each time
one VA is removed and added into the free vmap tree.

Here, change the code to update the pool size once, when attaching the
list back to the pool.
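
A worked sketch of the new bookkeeping, with hypothetical numbers:

    #include <assert.h>
    #include <stdbool.h>

    int main(void)
    {
            /* assume a pool of 8 entries and a partial decay */
            bool full_decay = false;
            unsigned long len, n_decay;

            len = n_decay = 8;      /* snapshot; the pool is emptied */
            if (!full_decay)
                    n_decay >>= 2;  /* 8 >> 2 == 2: decay ~25% */
            len -= n_decay;         /* entries that will be reattached */

            /* pool[i].len is then written once, to 6, on reattach */
            assert(n_decay == 2 && len == 6);
            return 0;
    }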

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 488d69b56765..bf735c890878 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 	LIST_HEAD(decay_list);
 	struct rb_root decay_root = RB_ROOT;
 	struct vmap_area *va, *nva;
-	unsigned long n_decay;
+	unsigned long n_decay, len;
 	int i;
 
 	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
@@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		list_replace_init(&vn->pool[i].head, &tmp_list);
 		spin_unlock(&vn->pool_lock);
 
-		if (full_decay)
-			WRITE_ONCE(vn->pool[i].len, 0);
+		len = n_decay = vn->pool[i].len;
+		WRITE_ONCE(vn->pool[i].len, 0);
 
 		/* Decay a pool by ~25% out of left objects. */
-		n_decay = vn->pool[i].len >> 2;
+		if (!full_decay)
+			n_decay >>= 2;
+		len -= n_decay;
 
 		list_for_each_entry_safe(va, nva, &tmp_list, list) {
+			if (!n_decay)
+				break;
 			list_del_init(&va->list);
 			merge_or_add_vmap_area(va, &decay_root, &decay_list);
-
-			if (!full_decay) {
-				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
-
-				if (!--n_decay)
-					break;
-			}
+			n_decay--;
 		}
 
 		/*
@@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		 * can populate the pool therefore a simple list replace
 		 * operation takes place here.
 		 */
-		if (!full_decay && !list_empty(&tmp_list)) {
+		if (!list_empty(&tmp_list)) {
 			spin_lock(&vn->pool_lock);
 			list_replace_init(&tmp_list, &vn->pool[i].head);
+			vn->pool[i].len = len;
 			spin_unlock(&vn->pool_lock);
 		}
 	}
-- 
2.41.0




* [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases()
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                   ` (2 preceding siblings ...)
  2025-04-15  2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
@ 2025-04-15  2:39 ` Baoquan He
  2025-04-15 10:44   ` Uladzislau Rezki
                     ` (2 more replies)
  2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
                   ` (2 subsequent siblings)
  6 siblings, 3 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

Remove unneeded local variables and replace them with values.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index bf735c890878..3f38a232663b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
  */
 void vm_unmap_aliases(void)
 {
-	unsigned long start = ULONG_MAX, end = 0;
-	int flush = 0;
-
-	_vm_unmap_aliases(start, end, flush);
+	_vm_unmap_aliases(ULONG_MAX, 0, 0);
 }
 EXPORT_SYMBOL_GPL(vm_unmap_aliases);
 
-- 
2.41.0




* [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                   ` (3 preceding siblings ...)
  2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
@ 2025-04-15  2:39 ` Baoquan He
  2025-04-15  6:44   ` Baoquan He
                     ` (2 more replies)
  2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki
  2025-04-15 19:06 ` Shivank Garg
  6 siblings, 3 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  2:39 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

In alloc_vmap_area(), the upper bound 'vend' is returned to indicate
whether the allocation succeeded or failed. That is not very clear.

Here, change it to return explicit error values, and check them to
judge whether the allocation succeeded.

IS_ERR_VALUE already uses unlikely() internally
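
For reference, a minimal userspace sketch of why IS_ERR_VALUE() can
classify an address-sized return value (the macro below is a simplified
form of the kernel's include/linux/err.h definition):

    #include <errno.h>
    #include <stdio.h>

    #define MAX_ERRNO 4095
    /* simplified: true for the topmost 4095 values of unsigned long */
    #define IS_ERR_VALUE(x) ((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

    int main(void)
    {
            /* an error stored in an address-sized variable */
            unsigned long addr = (unsigned long)-ENOMEM;

            /* such a value is never a valid allocation start address */
            if (IS_ERR_VALUE(addr))
                    printf("errno: %ld\n", (long)addr);     /* prints -12 */
            return 0;
    }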

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3f38a232663b..5b21cd09b2b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
 			 */
 			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
 			if (!lva)
-				return -1;
+				return -ENOMEM;
 		}
 
 		/*
@@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
 		 */
 		va->va_start = nva_start_addr + size;
 	} else {
-		return -1;
+		return -EINVAL;
 	}
 
 	if (type != FL_FIT_TYPE) {
@@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
 
 	/* Check the "vend" restriction. */
 	if (nva_start_addr + size > vend)
-		return vend;
+		return -ERANGE;
 
 	/* Update the free vmap_area. */
 	ret = va_clip(root, head, va, nva_start_addr, size);
-	if (WARN_ON_ONCE(ret))
-		return vend;
+	if (ret)
+		return ret;
 
 	return nva_start_addr;
 }
 
 /*
  * Returns a start address of the newly allocated area, if success.
- * Otherwise a vend is returned that indicates failure.
+ * Otherwise an error value is returned that indicates failure.
  */
 static __always_inline unsigned long
 __alloc_vmap_area(struct rb_root *root, struct list_head *head,
@@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
 
 	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
 	if (unlikely(!va))
-		return vend;
+		return -ENOENT;
 
 	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
-	if (nva_start_addr == vend)
-		return vend;
 
 #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
-	find_vmap_lowest_match_check(root, head, size, align);
+	if (!IS_ERR_VALUE(nva_start_addr))
+		find_vmap_lowest_match_check(root, head, size, align);
 #endif
 
 	return nva_start_addr;
@@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
 	struct vmap_area *va;
 
 	*vn_id = 0;
-	*addr = vend;
+	*addr = -EINVAL;
 
 	/*
 	 * Fallback to a global heap if not vmalloc or there
@@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	}
 
 retry:
-	if (addr == vend) {
+	if (IS_ERR_VALUE(addr)) {
 		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
 		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
 			size, align, vstart, vend);
 		spin_unlock(&free_vmap_area_lock);
 	}
 
-	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
+	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
 
 	/*
-	 * If an allocation fails, the "vend" address is
+	 * If an allocation fails, the error value is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	if (unlikely(addr == vend))
+	if (IS_ERR_VALUE(addr))
 		goto overflow;
 
 	va->va_start = addr;
@@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 
 		ret = va_clip(&free_vmap_area_root,
 			&free_vmap_area_list, va, start, size);
-		if (WARN_ON_ONCE(unlikely(ret)))
-			/* It is a BUG(), but trigger recovery instead. */
+		if ((unlikely(ret))) {
+			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
 			goto recovery;
+		}
 
 		/* Allocated area. */
 		va = vas[area];
-- 
2.41.0




* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
@ 2025-04-15  6:44   ` Baoquan He
  2025-04-15  7:22   ` Shivank Garg
  2025-04-16 14:28   ` Uladzislau Rezki
  2 siblings, 0 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15  6:44 UTC (permalink / raw)
  To: linux-mm; +Cc: akpm, urezki, linux-kernel

On 04/15/25 at 10:39am, Baoquan He wrote:
> In alloc_vmap_area(), the upper bound 'vend' is returned to indicate
> whether the allocation succeeded or failed. That is not very clear.
>
> Here, change it to return explicit error values, and check them to
> judge whether the allocation succeeded.
> 

> IS_ERR_VALUE already uses unlikely() internally
  ^^^^^^^^^
Sorry, the above line was added mistakenly from a log draft; it should be removed.

> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 3f38a232663b..5b21cd09b2b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  			 */
>  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
>  			if (!lva)
> -				return -1;
> +				return -ENOMEM;
>  		}
>  
>  		/*
> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  		 */
>  		va->va_start = nva_start_addr + size;
>  	} else {
> -		return -1;
> +		return -EINVAL;
>  	}
>  
>  	if (type != FL_FIT_TYPE) {
> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
>  
>  	/* Check the "vend" restriction. */
>  	if (nva_start_addr + size > vend)
> -		return vend;
> +		return -ERANGE;
>  
>  	/* Update the free vmap_area. */
>  	ret = va_clip(root, head, va, nva_start_addr, size);
> -	if (WARN_ON_ONCE(ret))
> -		return vend;
> +	if (ret)
> +		return ret;
>  
>  	return nva_start_addr;
>  }
>  
>  /*
>   * Returns a start address of the newly allocated area, if success.
> - * Otherwise a vend is returned that indicates failure.
> + * Otherwise an error value is returned that indicates failure.
>   */
>  static __always_inline unsigned long
>  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
>  
>  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
>  	if (unlikely(!va))
> -		return vend;
> +		return -ENOENT;
>  
>  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> -	if (nva_start_addr == vend)
> -		return vend;
>  
>  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> -	find_vmap_lowest_match_check(root, head, size, align);
> +	if (!IS_ERR_VALUE(nva_start_addr))
> +		find_vmap_lowest_match_check(root, head, size, align);
>  #endif
>  
>  	return nva_start_addr;
> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
>  	struct vmap_area *va;
>  
>  	*vn_id = 0;
> -	*addr = vend;
> +	*addr = -EINVAL;
>  
>  	/*
>  	 * Fallback to a global heap if not vmalloc or there
> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	}
>  
>  retry:
> -	if (addr == vend) {
> +	if (IS_ERR_VALUE(addr)) {
>  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
>  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
>  			size, align, vstart, vend);
>  		spin_unlock(&free_vmap_area_lock);
>  	}
>  
> -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
>  
>  	/*
> -	 * If an allocation fails, the "vend" address is
> +	 * If an allocation fails, the error value is
>  	 * returned. Therefore trigger the overflow path.
>  	 */
> -	if (unlikely(addr == vend))
> +	if (IS_ERR_VALUE(addr))
>  		goto overflow;
>  
>  	va->va_start = addr;
> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>  
>  		ret = va_clip(&free_vmap_area_root,
>  			&free_vmap_area_list, va, start, size);
> -		if (WARN_ON_ONCE(unlikely(ret)))
> -			/* It is a BUG(), but trigger recovery instead. */
> +		if ((unlikely(ret))) {
> +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
>  			goto recovery;
> +		}
>  
>  		/* Allocated area. */
>  		va = vas[area];
> -- 
> 2.41.0
> 




* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
  2025-04-15  6:44   ` Baoquan He
@ 2025-04-15  7:22   ` Shivank Garg
  2025-04-15 13:01     ` Baoquan He
  2025-04-16 14:28   ` Uladzislau Rezki
  2 siblings, 1 reply; 32+ messages in thread
From: Shivank Garg @ 2025-04-15  7:22 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> In alloc_vmap_area(), the upper bound 'vend' is returned to indicate
> whether the allocation succeeded or failed. That is not very clear.
>
> Here, change it to return explicit error values, and check them to
> judge whether the allocation succeeded.
> 
> IS_ERR_VALUE already uses unlikely() internally
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 3f38a232663b..5b21cd09b2b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  			 */
>  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
>  			if (!lva)
> -				return -1;
> +				return -ENOMEM;
>  		}
>  
>  		/*
> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  		 */
>  		va->va_start = nva_start_addr + size;
>  	} else {
> -		return -1;
> +		return -EINVAL;
>  	}

Braces around return -EINVAL seem unnecessary.
They can be dropped.

>  
>  	if (type != FL_FIT_TYPE) {
> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
>  
>  	/* Check the "vend" restriction. */
>  	if (nva_start_addr + size > vend)
> -		return vend;
> +		return -ERANGE;
>  
>  	/* Update the free vmap_area. */
>  	ret = va_clip(root, head, va, nva_start_addr, size);
> -	if (WARN_ON_ONCE(ret))
> -		return vend;
> +	if (ret)
> +		return ret;

Is it safe to remove the warning, or was it critical for debugging?

>  
>  	return nva_start_addr;
>  }
>  
>  /*
>   * Returns a start address of the newly allocated area, if success.
> - * Otherwise a vend is returned that indicates failure.
> + * Otherwise an error value is returned that indicates failure.
>   */
>  static __always_inline unsigned long
>  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
>  
>  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
>  	if (unlikely(!va))
> -		return vend;
> +		return -ENOENT;
>  
>  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> -	if (nva_start_addr == vend)
> -		return vend;
>  
>  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> -	find_vmap_lowest_match_check(root, head, size, align);
> +	if (!IS_ERR_VALUE(nva_start_addr))
> +		find_vmap_lowest_match_check(root, head, size, align);
>  #endif
>  
>  	return nva_start_addr;
> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
>  	struct vmap_area *va;
>  
>  	*vn_id = 0;
> -	*addr = vend;
> +	*addr = -EINVAL;
>  
>  	/*
>  	 * Fallback to a global heap if not vmalloc or there
> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	}
>  
>  retry:
> -	if (addr == vend) {
> +	if (IS_ERR_VALUE(addr)) {
>  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
>  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
>  			size, align, vstart, vend);
>  		spin_unlock(&free_vmap_area_lock);
>  	}
>  
> -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
>  
>  	/*
> -	 * If an allocation fails, the "vend" address is
> +	 * If an allocation fails, the error value is
>  	 * returned. Therefore trigger the overflow path.
>  	 */
> -	if (unlikely(addr == vend))
> +	if (IS_ERR_VALUE(addr))
>  		goto overflow;
>  
>  	va->va_start = addr;
> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>  
>  		ret = va_clip(&free_vmap_area_root,
>  			&free_vmap_area_list, va, start, size);
> -		if (WARN_ON_ONCE(unlikely(ret)))
> -			/* It is a BUG(), but trigger recovery instead. */
> +		if ((unlikely(ret))) {
		    ^^		   ^^
The extra parentheses are redundant and can be removed for clarity.

> +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
>  			goto recovery;
> +		}
>  
>  		/* Allocated area. */
>  		va = vas[area];




* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15  2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
@ 2025-04-15 10:29   ` Shivank Garg
  2025-04-15 14:05     ` Baoquan He
  2025-04-16 13:50   ` Uladzislau Rezki
  1 sibling, 1 reply; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 10:29 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
> also moved into the free vmap tree, partially or completely; that is
> done in decay_va_pool_node(). When doing that, for each pool of a node,
> the whole list is detached from the pool for handling, at which point
> the pool is empty. It's not necessary to update the pool size each time
> one VA is removed and added into the free vmap tree.
>
> Here, change the code to update the pool size once, when attaching the
> list back to the pool.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 488d69b56765..bf735c890878 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  	LIST_HEAD(decay_list);
>  	struct rb_root decay_root = RB_ROOT;
>  	struct vmap_area *va, *nva;
> -	unsigned long n_decay;
> +	unsigned long n_decay, len;
>  	int i;
>  
>  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		list_replace_init(&vn->pool[i].head, &tmp_list);
>  		spin_unlock(&vn->pool_lock);
>  
> -		if (full_decay)
> -			WRITE_ONCE(vn->pool[i].len, 0);
> +		len = n_decay = vn->pool[i].len;
> +		WRITE_ONCE(vn->pool[i].len, 0);
>  
>  		/* Decay a pool by ~25% out of left objects. */
> -		n_decay = vn->pool[i].len >> 2;
> +		if (!full_decay)
> +			n_decay >>= 2;
> +		len -= n_decay;
>  
>  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> +			if (!n_decay)
> +				break;
>  			list_del_init(&va->list);
>  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> -
> -			if (!full_decay) {
> -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
> -
> -				if (!--n_decay)
> -					break;
> -			}
> +			n_decay--;
>  		}
>  
>  		/*
> @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		 * can populate the pool therefore a simple list replace
>  		 * operation takes place here.
>  		 */
> -		if (!full_decay && !list_empty(&tmp_list)) {
> +		if (!list_empty(&tmp_list)) {
>  			spin_lock(&vn->pool_lock);
>  			list_replace_init(&tmp_list, &vn->pool[i].head);
> +			vn->pool[i].len = len;

Current logic uses WRITE_ONCE() to update vn->pool[i].len.
Could this lead to consistency issues?

>  			spin_unlock(&vn->pool_lock);
>  		}
>  	}




* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases()
  2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
@ 2025-04-15 10:44   ` Uladzislau Rezki
  2025-04-15 19:10   ` Shivank Garg
  2025-04-15 23:54   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-15 10:44 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:51AM +0800, Baoquan He wrote:
> Remove unneeded local variables and replace them with values.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index bf735c890878..3f38a232663b 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
>   */
>  void vm_unmap_aliases(void)
>  {
> -	unsigned long start = ULONG_MAX, end = 0;
> -	int flush = 0;
> -
> -	_vm_unmap_aliases(start, end, flush);
> +	_vm_unmap_aliases(ULONG_MAX, 0, 0);
>  }
>  EXPORT_SYMBOL_GPL(vm_unmap_aliases);
>  
> -- 
> 2.41.0
> 
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Makes sense!

--
Uladzislau Rezki



* Re: [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable Baoquan He
@ 2025-04-15 10:47   ` Uladzislau Rezki
  2025-04-15 19:08   ` Shivank Garg
  2025-04-15 23:53   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-15 10:47 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:48AM +0800, Baoquan He wrote:
> Static variable 'purge_nodes' is defined in global scope, while it's
> only used in function __purge_vmap_area_lazy(). It mainly serves to
> avoid allocating memory repeatedly, especially when NR_CPUS is big.
>
> A local static variable can satisfy the demand just as well, and it
> improves code readability. Hence move its definition into
> __purge_vmap_area_lazy().
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6ee7fc2ec986..aca1905d3397 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock);
>  
>  /* for per-CPU blocks */
>  static void purge_fragmented_blocks_allcpus(void);
> -static cpumask_t purge_nodes;
>  
>  static void
>  reclaim_list_global(struct list_head *head)
> @@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end,
>  {
>  	unsigned long nr_purged_areas = 0;
>  	unsigned int nr_purge_helpers;
> +	static cpumask_t purge_nodes;
>  	unsigned int nr_purge_nodes;
>  	struct vmap_node *vn;
>  	int i;
> -- 
> 2.41.0
> 
Well. I do not have a strong opinion here. But right, it is only
used inside the function.

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Uladzislau Rezki



* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15  7:22   ` Shivank Garg
@ 2025-04-15 13:01     ` Baoquan He
  2025-04-15 19:00       ` Shivank Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Baoquan He @ 2025-04-15 13:01 UTC (permalink / raw)
  To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel

On 04/15/25 at 12:52pm, Shivank Garg wrote:
> On 4/15/2025 8:09 AM, Baoquan He wrote:
> > In alloc_vmap_area(), the upper bound 'vend' is returned to indicate
> > whether the allocation succeeded or failed. That is not very clear.
> >
> > Here, change it to return explicit error values, and check them to
> > judge whether the allocation succeeded.
> > 
> > IS_ERR_VALUE already uses unlikely() internally
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/vmalloc.c | 34 +++++++++++++++++-----------------
> >  1 file changed, 17 insertions(+), 17 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 3f38a232663b..5b21cd09b2b4 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >  			 */
> >  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> >  			if (!lva)
> > -				return -1;
> > +				return -ENOMEM;
> >  		}
> >  
> >  		/*
> > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >  		 */
> >  		va->va_start = nva_start_addr + size;
> >  	} else {
> > -		return -1;
> > +		return -EINVAL;
> >  	}

Thanks for reviewing.

> 
> Braces around return -EINVAL seem unnecessary.
> They can be dropped.

This complies with the coding style required in '3) Placing Braces and
Spaces' of Documentation/process/coding-style.rst, because the other
branch consists of multiple statements.
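
For illustration, the case the rule covers (generic names):

    if (condition) {
            do_this();
            do_that();
    } else {
            return -EINVAL;
    }

When one branch of a conditional needs braces, the other branch gets
them too, even if it is a single statement.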

> 
> >  
> >  	if (type != FL_FIT_TYPE) {
> > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
> >  
> >  	/* Check the "vend" restriction. */
> >  	if (nva_start_addr + size > vend)
> > -		return vend;
> > +		return -ERANGE;
> >  
> >  	/* Update the free vmap_area. */
> >  	ret = va_clip(root, head, va, nva_start_addr, size);
> > -	if (WARN_ON_ONCE(ret))
> > -		return vend;
> > +	if (ret)
> > +		return ret;
> 
> Is it safe to remove the warning, or was it critical for debugging?

This comes from a reported concern: va_clip() can fail due to
NOTHING_FIT or a kmem_cache_alloc() failure. The warning here could
cause confusion, misleading people into thinking that vmap area
management has failed.

> 
> >  
> >  	return nva_start_addr;
> >  }
> >  
> >  /*
> >   * Returns a start address of the newly allocated area, if success.
> > - * Otherwise a vend is returned that indicates failure.
> > + * Otherwise an error value is returned that indicates failure.
> >   */
> >  static __always_inline unsigned long
> >  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> >  
> >  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
> >  	if (unlikely(!va))
> > -		return vend;
> > +		return -ENOENT;
> >  
> >  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> > -	if (nva_start_addr == vend)
> > -		return vend;
> >  
> >  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> > -	find_vmap_lowest_match_check(root, head, size, align);
> > +	if (!IS_ERR_VALUE(nva_start_addr))
> > +		find_vmap_lowest_match_check(root, head, size, align);
> >  #endif
> >  
> >  	return nva_start_addr;
> > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
> >  	struct vmap_area *va;
> >  
> >  	*vn_id = 0;
> > -	*addr = vend;
> > +	*addr = -EINVAL;
> >  
> >  	/*
> >  	 * Fallback to a global heap if not vmalloc or there
> > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >  	}
> >  
> >  retry:
> > -	if (addr == vend) {
> > +	if (IS_ERR_VALUE(addr)) {
> >  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
> >  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
> >  			size, align, vstart, vend);
> >  		spin_unlock(&free_vmap_area_lock);
> >  	}
> >  
> > -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> > +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
> >  
> >  	/*
> > -	 * If an allocation fails, the "vend" address is
> > +	 * If an allocation fails, the error value is
> >  	 * returned. Therefore trigger the overflow path.
> >  	 */
> > -	if (unlikely(addr == vend))
> > +	if (IS_ERR_VALUE(addr))
> >  		goto overflow;
> >  
> >  	va->va_start = addr;
> > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> >  
> >  		ret = va_clip(&free_vmap_area_root,
> >  			&free_vmap_area_list, va, start, size);
> > -		if (WARN_ON_ONCE(unlikely(ret)))
> > -			/* It is a BUG(), but trigger recovery instead. */
> > +		if ((unlikely(ret))) {
> 		    ^^		   ^^
> The extra parentheses are redundant and can be removed for clarity.

You are right, I will remove it. Thanks.

> 
> > +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
> >  			goto recovery;
> > +		}
> >  
> >  		/* Allocated area. */
> >  		va = vas[area];
> 




* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15 10:29   ` Shivank Garg
@ 2025-04-15 14:05     ` Baoquan He
  2025-04-15 19:02       ` Shivank Garg
  0 siblings, 1 reply; 32+ messages in thread
From: Baoquan He @ 2025-04-15 14:05 UTC (permalink / raw)
  To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel

On 04/15/25 at 03:59pm, Shivank Garg wrote:
> On 4/15/2025 8:09 AM, Baoquan He wrote:
> > When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
> > also moved into the free vmap tree, partially or completely; that is
> > done in decay_va_pool_node(). When doing that, for each pool of a node,
> > the whole list is detached from the pool for handling, at which point
> > the pool is empty. It's not necessary to update the pool size each time
> > one VA is removed and added into the free vmap tree.
> >
> > Here, change the code to update the pool size once, when attaching the
> > list back to the pool.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/vmalloc.c | 23 +++++++++++------------
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 488d69b56765..bf735c890878 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  	LIST_HEAD(decay_list);
> >  	struct rb_root decay_root = RB_ROOT;
> >  	struct vmap_area *va, *nva;
> > -	unsigned long n_decay;
> > +	unsigned long n_decay, len;
> >  	int i;
> >  
> >  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		list_replace_init(&vn->pool[i].head, &tmp_list);
> >  		spin_unlock(&vn->pool_lock);
> >  
> > -		if (full_decay)
> > -			WRITE_ONCE(vn->pool[i].len, 0);
> > +		len = n_decay = vn->pool[i].len;
> > +		WRITE_ONCE(vn->pool[i].len, 0);
> >  
> >  		/* Decay a pool by ~25% out of left objects. */
> > -		n_decay = vn->pool[i].len >> 2;
> > +		if (!full_decay)
> > +			n_decay >>= 2;
> > +		len -= n_decay;
> >  
> >  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> > +			if (!n_decay)
> > +				break;
> >  			list_del_init(&va->list);
> >  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> > -
> > -			if (!full_decay) {
> > -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
> > -
> > -				if (!--n_decay)
> > -					break;
> > -			}
> > +			n_decay--;
> >  		}
> >  
> >  		/*
> > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		 * can populate the pool therefore a simple list replace
> >  		 * operation takes place here.
> >  		 */
> > -		if (!full_decay && !list_empty(&tmp_list)) {
> > +		if (!list_empty(&tmp_list)) {
> >  			spin_lock(&vn->pool_lock);
> >  			list_replace_init(&tmp_list, &vn->pool[i].head);
> > +			vn->pool[i].len = len;
> 
> Current logic uses WRITE_ONCE() to update vn->pool[i].len.
> Could this lead to consistency issues?

It seems unnecessary to use WRITE_ONCE(), but I can change back to
WRITE_ONCE() just in case. Currently it's only updated in
node_alloc(), decay_va_pool_node() and purge_vmap_node(), and the
latter two run under vmap_purge_lock.
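
For reference, a rough sketch of the pairing in question (illustrative,
not the exact vmalloc code): the store is annotated for the benefit of
a lockless reader, even though the writer itself holds the pool lock:

    /* writer side, under vn->pool_lock */
    WRITE_ONCE(vn->pool[i].len, len);

    /* a lockless reader elsewhere pairs the load with READ_ONCE() */
    n = READ_ONCE(vn->pool[i].len);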

> 
> >  			spin_unlock(&vn->pool_lock);
> >  		}
> >  	}
> 




* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  2025-04-15  2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
@ 2025-04-15 15:25   ` Uladzislau Rezki
  2025-04-15 23:41     ` Baoquan He
  2025-04-15 19:09   ` Shivank Garg
  1 sibling, 1 reply; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-15 15:25 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote:
> When finding a VA in vn->busy, if the VA spans several nodes and the
> passed addr is not the same as va->va_start, we should scan the
> vmap_nodes in reverse order, because the starting address of the VA
> must be smaller than the passed addr if the addr really resides in
> that VA.
>
> E.g. on a system with nr_vmap_nodes=100,
>
>      <----va---->
>  -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
>     ...   n-1   n    n+1   n+2   ...    99     0     1
>
> The VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If
> the passed addr is within 'n+2', we should try the nodes backwards,
> 'n+1' then 'n', and succeed very soon.
>
> Meanwhile we still need to loop around, because a VA could span from
> node 'n' through node 99 to node 0 and node 1.
>
> Anyway, searching in reverse order can improve efficiency on systems
> with many CPUs.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index aca1905d3397..488d69b56765 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  
>  		if (va)
>  			return va;
> -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>  
>  	return NULL;
>  }
> @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
>  
>  		if (va)
>  			return va;
> -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>  
>  	return NULL;
>  }
> -- 
> 2.41.0
> 
It depends. Consider the situation below:

             addr
              |
        VA    V
  <------------>
<---|---|---|---|---|---|---|--->
  0   1   2   3   0   1   2   3

Basically it matters how big the VA is and how many nodes it spans.
But I agree that the assumption behind scanning backwards is more
convincing in most cases.

Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Uladzislau Rezki



* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                   ` (4 preceding siblings ...)
  2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
@ 2025-04-15 15:29 ` Uladzislau Rezki
  2025-04-15 22:55   ` Baoquan He
  2025-04-15 19:06 ` Shivank Garg
  6 siblings, 1 reply; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-15 15:29 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:47AM +0800, Baoquan He wrote:
> These were made from code inspection in mm/vmalloc.c.
> 
> Baoquan He (5):
>   mm/vmalloc.c: change purge_nodes as local static variable
>   mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
>   mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
>   mm/vmalloc: optimize function vm_unmap_aliases()
>   mm/vmalloc.c: return explicit error value in alloc_vmap_area()
> 
>  mm/vmalloc.c | 68 +++++++++++++++++++++++++---------------------------
>  1 file changed, 32 insertions(+), 36 deletions(-)
> 
> -- 
> 2.41.0
> 
I have reviewed some patches; the rest I will check tomorrow!

--
Uladzislau Rezki



* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15 13:01     ` Baoquan He
@ 2025-04-15 19:00       ` Shivank Garg
  2025-04-15 22:57         ` Baoquan He
  0 siblings, 1 reply; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:00 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel



On 4/15/2025 6:31 PM, Baoquan He wrote:
> On 04/15/25 at 12:52pm, Shivank Garg wrote:
>> On 4/15/2025 8:09 AM, Baoquan He wrote:
>>> In alloc_vmap_area(), the upper bound 'vend' is returned to indicate
>>> whether the allocation succeeded or failed. That is not very clear.
>>>
>>> Here, change it to return explicit error values, and check them to
>>> judge whether the allocation succeeded.
>>>
>>> IS_ERR_VALUE already uses unlikely() internally
>>>
>>> Signed-off-by: Baoquan He <bhe@redhat.com>
>>> ---
>>>  mm/vmalloc.c | 34 +++++++++++++++++-----------------
>>>  1 file changed, 17 insertions(+), 17 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index 3f38a232663b..5b21cd09b2b4 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>>>  			 */
>>>  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
>>>  			if (!lva)
>>> -				return -1;
>>> +				return -ENOMEM;
>>>  		}
>>>  
>>>  		/*
>>> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>>>  		 */
>>>  		va->va_start = nva_start_addr + size;
>>>  	} else {
>>> -		return -1;
>>> +		return -EINVAL;
>>>  	}
> 
> Thanks for reviewing.
> 
>>
>> Braces around return -EINVAL seem unnecessary.
>> They can be dropped.
> 
> This complies with the coding style required in '3) Placing Braces and
> Spaces' of Documentation/process/coding-style.rst, because the other
> branch consists of multiple statements.
> 
>>
>>>  
>>>  	if (type != FL_FIT_TYPE) {
>>> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
>>>  
>>>  	/* Check the "vend" restriction. */
>>>  	if (nva_start_addr + size > vend)
>>> -		return vend;
>>> +		return -ERANGE;
>>>  
>>>  	/* Update the free vmap_area. */
>>>  	ret = va_clip(root, head, va, nva_start_addr, size);
>>> -	if (WARN_ON_ONCE(ret))
>>> -		return vend;
>>> +	if (ret)
>>> +		return ret;
>>
>> Is it safe to remove the warning, or was it critical for debugging?
> 
> This comes from a reported concern: va_clip() can fail due to
> NOTHING_FIT or a kmem_cache_alloc() failure. The warning here could
> cause confusion, misleading people into thinking that vmap area
> management has failed.
> 
>>
>>>  
>>>  	return nva_start_addr;
>>>  }
>>>  
>>>  /*
>>>   * Returns a start address of the newly allocated area, if success.
>>> - * Otherwise a vend is returned that indicates failure.
>>> + * Otherwise an error value is returned that indicates failure.
>>>   */
>>>  static __always_inline unsigned long
>>>  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
>>> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
>>>  
>>>  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
>>>  	if (unlikely(!va))
>>> -		return vend;
>>> +		return -ENOENT;
>>>  
>>>  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
>>> -	if (nva_start_addr == vend)
>>> -		return vend;
>>>  
>>>  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
>>> -	find_vmap_lowest_match_check(root, head, size, align);
>>> +	if (!IS_ERR_VALUE(nva_start_addr))
>>> +		find_vmap_lowest_match_check(root, head, size, align);
>>>  #endif
>>>  
>>>  	return nva_start_addr;
>>> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
>>>  	struct vmap_area *va;
>>>  
>>>  	*vn_id = 0;
>>> -	*addr = vend;
>>> +	*addr = -EINVAL;
>>>  
>>>  	/*
>>>  	 * Fallback to a global heap if not vmalloc or there
>>> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>>>  	}
>>>  
>>>  retry:
>>> -	if (addr == vend) {
>>> +	if (IS_ERR_VALUE(addr)) {
>>>  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
>>>  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
>>>  			size, align, vstart, vend);
>>>  		spin_unlock(&free_vmap_area_lock);
>>>  	}
>>>  
>>> -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
>>> +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
>>>  
>>>  	/*
>>> -	 * If an allocation fails, the "vend" address is
>>> +	 * If an allocation fails, the error value is
>>>  	 * returned. Therefore trigger the overflow path.
>>>  	 */
>>> -	if (unlikely(addr == vend))
>>> +	if (IS_ERR_VALUE(addr))
>>>  		goto overflow;
>>>  
>>>  	va->va_start = addr;
>>> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>>>  
>>>  		ret = va_clip(&free_vmap_area_root,
>>>  			&free_vmap_area_list, va, start, size);
>>> -		if (WARN_ON_ONCE(unlikely(ret)))
>>> -			/* It is a BUG(), but trigger recovery instead. */
>>> +		if ((unlikely(ret))) {
>> 		    ^^		   ^^
>> The extra parentheses are redundant and can be removed for clarity.
> 
> You are right, I will remove it. Thanks.
> 

Please feel free to add the following in the next version.

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>

Thanks,
Shivank

>>
>>> +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
>>>  			goto recovery;
>>> +		}
>>>  
>>>  		/* Allocated area. */
>>>  		va = vas[area];
>>
> 




* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15 14:05     ` Baoquan He
@ 2025-04-15 19:02       ` Shivank Garg
  0 siblings, 0 replies; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:02 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel



On 4/15/2025 7:35 PM, Baoquan He wrote:
> On 04/15/25 at 03:59pm, Shivank Garg wrote:
>> On 4/15/2025 8:09 AM, Baoquan He wrote:
>>> When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
>>> also moved into the free vmap tree, partially or completely; that is
>>> done in decay_va_pool_node(). When doing that, for each pool of a node,
>>> the whole list is detached from the pool for handling, at which point
>>> the pool is empty. It's not necessary to update the pool size each time
>>> one VA is removed and added into the free vmap tree.
>>>
>>> Here, change the code to update the pool size once, when attaching the
>>> list back to the pool.
>>>
>>> Signed-off-by: Baoquan He <bhe@redhat.com>
>>> ---
>>>  mm/vmalloc.c | 23 +++++++++++------------
>>>  1 file changed, 11 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index 488d69b56765..bf735c890878 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>>>  	LIST_HEAD(decay_list);
>>>  	struct rb_root decay_root = RB_ROOT;
>>>  	struct vmap_area *va, *nva;
>>> -	unsigned long n_decay;
>>> +	unsigned long n_decay, len;
>>>  	int i;
>>>  
>>>  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
>>> @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>>>  		list_replace_init(&vn->pool[i].head, &tmp_list);
>>>  		spin_unlock(&vn->pool_lock);
>>>  
>>> -		if (full_decay)
>>> -			WRITE_ONCE(vn->pool[i].len, 0);
>>> +		len = n_decay = vn->pool[i].len;
>>> +		WRITE_ONCE(vn->pool[i].len, 0);
>>>  
>>>  		/* Decay a pool by ~25% out of left objects. */
>>> -		n_decay = vn->pool[i].len >> 2;
>>> +		if (!full_decay)
>>> +			n_decay >>= 2;
>>> +		len -= n_decay;
>>>  
>>>  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
>>> +			if (!n_decay)
>>> +				break;
>>>  			list_del_init(&va->list);
>>>  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
>>> -
>>> -			if (!full_decay) {
>>> -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
>>> -
>>> -				if (!--n_decay)
>>> -					break;
>>> -			}
>>> +			n_decay--;
>>>  		}
>>>  
>>>  		/*
>>> @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>>>  		 * can populate the pool therefore a simple list replace
>>>  		 * operation takes place here.
>>>  		 */
>>> -		if (!full_decay && !list_empty(&tmp_list)) {
>>> +		if (!list_empty(&tmp_list)) {
>>>  			spin_lock(&vn->pool_lock);
>>>  			list_replace_init(&tmp_list, &vn->pool[i].head);
>>> +			vn->pool[i].len = len;
>>
>> Current logic uses WRITE_ONCE() to update vn->pool[i].len.
>> Could this lead to consistency issues?
> 
> It seems unnecessary to use WRITE_ONCE(), but I can change back to
> WRITE_ONCE() just in case. Currently it's only updated in
> node_alloc(), decay_va_pool_node() and purge_vmap_node(), and the
> latter two run under vmap_purge_lock.
> 

Okay.

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>

Best,
Shivank

>>
>>>  			spin_unlock(&vn->pool_lock);
>>>  		}
>>>  	}
>>
> 




* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements
  2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                   ` (5 preceding siblings ...)
  2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki
@ 2025-04-15 19:06 ` Shivank Garg
  6 siblings, 0 replies; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:06 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> These were made from code inspection in mm/vmalloc.c.
> 
> Baoquan He (5):
>   mm/vmalloc.c: change purge_nodes as local static variable
>   mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
>   mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
>   mm/vmalloc: optimize function vm_unmap_aliases()
>   mm/vmalloc.c: return explicit error value in alloc_vmap_area()
> 
>  mm/vmalloc.c | 68 +++++++++++++++++++++++++---------------------------
>  1 file changed, 32 insertions(+), 36 deletions(-)
> 

I've tested this patch series on my AMD EPYC Zen 3 machine using:

1. lib/test_vmalloc.c: modprobe test_vmalloc
2. stress-ng: stress-ng --class memory,vm --seq 0

I did not find any issues.

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>

Thanks,
Shivank




* Re: [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_nodes as local static variable Baoquan He
  2025-04-15 10:47   ` Uladzislau Rezki
@ 2025-04-15 19:08   ` Shivank Garg
  2025-04-15 23:53   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:08 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> Static variable 'purge_nodes' is defined in global scope, while it's
> only used in function __purge_vmap_area_lazy(). It mainly serves to
> avoid allocating memory repeatedly, especially when NR_CPUS is big.
>
> A local static variable can satisfy the demand just as well, and it
> improves code readability. Hence move its definition into
> __purge_vmap_area_lazy().
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 6ee7fc2ec986..aca1905d3397 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock);
>  
>  /* for per-CPU blocks */
>  static void purge_fragmented_blocks_allcpus(void);
> -static cpumask_t purge_nodes;
>  
>  static void
>  reclaim_list_global(struct list_head *head)
> @@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end,
>  {
>  	unsigned long nr_purged_areas = 0;
>  	unsigned int nr_purge_helpers;
> +	static cpumask_t purge_nodes;
>  	unsigned int nr_purge_nodes;
>  	struct vmap_node *vn;
>  	int i;

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>



* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  2025-04-15  2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
  2025-04-15 15:25   ` Uladzislau Rezki
@ 2025-04-15 19:09   ` Shivank Garg
  1 sibling, 0 replies; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:09 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> When finding a VA in vn->busy, if the VA spans several nodes and the
> passed addr is not the same as va->va_start, we should scan the
> vmap_nodes in reverse order, because the starting address of the VA
> must be smaller than the passed addr if the addr really resides in
> that VA.
>
> E.g. on a system with nr_vmap_nodes=100,
>
>      <----va---->
>  -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
>     ...   n-1   n    n+1   n+2   ...    99     0     1
>
> The VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If
> the passed addr is within 'n+2', we should try the nodes backwards,
> 'n+1' then 'n', and succeed very soon.
>
> Meanwhile we still need to loop around, because a VA could span from
> node 'n' through node 99 to node 0 and node 1.
>
> Anyway, searching in reverse order can improve efficiency on systems
> with many CPUs.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index aca1905d3397..488d69b56765 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
>  
>  		if (va)
>  			return va;
> -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>  
>  	return NULL;
>  }
> @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
>  
>  		if (va)
>  			return va;
> -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
>  
>  	return NULL;
>  }

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>

Thanks,
Shivank




* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases()
  2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
  2025-04-15 10:44   ` Uladzislau Rezki
@ 2025-04-15 19:10   ` Shivank Garg
  2025-04-15 23:54   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Shivank Garg @ 2025-04-15 19:10 UTC (permalink / raw)
  To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel

On 4/15/2025 8:09 AM, Baoquan He wrote:
> Remove the unneeded local variables and pass the values directly.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 5 +----
>  1 file changed, 1 insertion(+), 4 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index bf735c890878..3f38a232663b 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
>   */
>  void vm_unmap_aliases(void)
>  {
> -	unsigned long start = ULONG_MAX, end = 0;
> -	int flush = 0;
> -
> -	_vm_unmap_aliases(start, end, flush);
> +	_vm_unmap_aliases(ULONG_MAX, 0, 0);
>  }
>  EXPORT_SYMBOL_GPL(vm_unmap_aliases);
>  

Reviewed-by: Shivank Garg <shivankg@amd.com>
Tested-by: Shivank Garg <shivankg@amd.com>
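
Worth spelling out for readers: the ULONG_MAX/0 pair being passed is the
usual empty-range sentinel, a range that any real interval will widen.
A minimal standalone sketch, with made-up interval values:

<snip>
#include <stdio.h>

int main(void)
{
	/* Empty-range sentinel: start above end, so the first real
	 * interval always replaces both bounds. */
	unsigned long start = ~0UL, end = 0;	/* ULONG_MAX, 0 */
	unsigned long ivs[][2] = { {0x3000, 0x4000}, {0x1000, 0x2000} };

	for (int i = 0; i < 2; i++) {
		if (ivs[i][0] < start)
			start = ivs[i][0];
		if (ivs[i][1] > end)
			end = ivs[i][1];
	}

	/* prints: flush range: [0x1000, 0x4000) */
	printf("flush range: [%#lx, %#lx)\n", start, end);
	return 0;
}
<snip>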


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements
  2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki
@ 2025-04-15 22:55   ` Baoquan He
  0 siblings, 0 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15 22:55 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel

On 04/15/25 at 05:29pm, Uladzislau Rezki wrote:
> On Tue, Apr 15, 2025 at 10:39:47AM +0800, Baoquan He wrote:
> > These were made from code inspection in mm/vmalloc.c.
> > 
> > Baoquan He (5):
> >   mm/vmalloc.c: change purge_ndoes as local static variable
> >   mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
> >   mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
> >   mm/vmalloc: optimize function vm_unmap_aliases()
> >   mm/vmalloc.c: return explicit error value in alloc_vmap_area()
> > 
> >  mm/vmalloc.c | 68 +++++++++++++++++++++++++---------------------------
> >  1 file changed, 32 insertions(+), 36 deletions(-)
> > 
> > -- 
> > 2.41.0
> > 
> I have reviewed some patches; the rest I will check tomorrow!

Thanks a lot for your quick and careful reviewing.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15 19:00       ` Shivank Garg
@ 2025-04-15 22:57         ` Baoquan He
  0 siblings, 0 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15 22:57 UTC (permalink / raw)
  To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel

On 04/16/25 at 12:30am, Shivank Garg wrote:
> 
> 
> On 4/15/2025 6:31 PM, Baoquan He wrote:
> > On 04/15/25 at 12:52pm, Shivank Garg wrote:
> >> On 4/15/2025 8:09 AM, Baoquan He wrote:
> >>> In the code of alloc_vmap_area(), the upper bound 'vend' is returned
> >>> to indicate whether the allocation succeeded or failed. That is not
> >>> very clear.
> >>>
> >>> Here, change it to return explicit error values and check them to
> >>> judge whether the allocation succeeded.
> >>>
> >>> IS_ERR_VALUE() already uses unlikely() internally.
> >>>
> >>> Signed-off-by: Baoquan He <bhe@redhat.com>
> >>> ---
> >>>  mm/vmalloc.c | 34 +++++++++++++++++-----------------
> >>>  1 file changed, 17 insertions(+), 17 deletions(-)
> >>>
> >>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> >>> index 3f38a232663b..5b21cd09b2b4 100644
> >>> --- a/mm/vmalloc.c
> >>> +++ b/mm/vmalloc.c
> >>> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >>>  			 */
> >>>  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> >>>  			if (!lva)
> >>> -				return -1;
> >>> +				return -ENOMEM;
> >>>  		}
> >>>  
> >>>  		/*
> >>> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >>>  		 */
> >>>  		va->va_start = nva_start_addr + size;
> >>>  	} else {
> >>> -		return -1;
> >>> +		return -EINVAL;
> >>>  	}
> > 
> > Thanks for reviewing.
> > 
> >>
> >> Braces around return -EINVAL seem unnecessary.
> >> They can be dropped.
> > 
> > This complies with the coding style required in section 3) Placing
> > Braces and Spaces of Documentation/process/coding-style.rst, because
> > the other branches contain multiple statements.
> > 
> >>
> >>>  
> >>>  	if (type != FL_FIT_TYPE) {
> >>> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
> >>>  
> >>>  	/* Check the "vend" restriction. */
> >>>  	if (nva_start_addr + size > vend)
> >>> -		return vend;
> >>> +		return -ERANGE;
> >>>  
> >>>  	/* Update the free vmap_area. */
> >>>  	ret = va_clip(root, head, va, nva_start_addr, size);
> >>> -	if (WARN_ON_ONCE(ret))
> >>> -		return vend;
> >>> +	if (ret)
> >>> +		return ret;
> >>
> >> Is it safe to remove the warning, or was it critical for debugging?
> > 
> > This comes from a reported concern: va_clip() can fail due to
> > NOTHING_FIT or a kmem_cache_alloc() failure. The warning here could
> > cause confusion, misleading people into thinking that vmap area
> > management has failed.
> > 
> >>
> >>>  
> >>>  	return nva_start_addr;
> >>>  }
> >>>  
> >>>  /*
> >>>   * Returns a start address of the newly allocated area, if success.
> >>> - * Otherwise a vend is returned that indicates failure.
> >>> + * Otherwise an error value is returned that indicates failure.
> >>>   */
> >>>  static __always_inline unsigned long
> >>>  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> >>> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> >>>  
> >>>  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
> >>>  	if (unlikely(!va))
> >>> -		return vend;
> >>> +		return -ENOENT;
> >>>  
> >>>  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> >>> -	if (nva_start_addr == vend)
> >>> -		return vend;
> >>>  
> >>>  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> >>> -	find_vmap_lowest_match_check(root, head, size, align);
> >>> +	if (!IS_ERR_VALUE(nva_start_addr))
> >>> +		find_vmap_lowest_match_check(root, head, size, align);
> >>>  #endif
> >>>  
> >>>  	return nva_start_addr;
> >>> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
> >>>  	struct vmap_area *va;
> >>>  
> >>>  	*vn_id = 0;
> >>> -	*addr = vend;
> >>> +	*addr = -EINVAL;
> >>>  
> >>>  	/*
> >>>  	 * Fallback to a global heap if not vmalloc or there
> >>> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >>>  	}
> >>>  
> >>>  retry:
> >>> -	if (addr == vend) {
> >>> +	if (IS_ERR_VALUE(addr)) {
> >>>  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
> >>>  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
> >>>  			size, align, vstart, vend);
> >>>  		spin_unlock(&free_vmap_area_lock);
> >>>  	}
> >>>  
> >>> -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> >>> +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
> >>>  
> >>>  	/*
> >>> -	 * If an allocation fails, the "vend" address is
> >>> +	 * If an allocation fails, the error value is
> >>>  	 * returned. Therefore trigger the overflow path.
> >>>  	 */
> >>> -	if (unlikely(addr == vend))
> >>> +	if (IS_ERR_VALUE(addr))
> >>>  		goto overflow;
> >>>  
> >>>  	va->va_start = addr;
> >>> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> >>>  
> >>>  		ret = va_clip(&free_vmap_area_root,
> >>>  			&free_vmap_area_list, va, start, size);
> >>> -		if (WARN_ON_ONCE(unlikely(ret)))
> >>> -			/* It is a BUG(), but trigger recovery instead. */
> >>> +		if ((unlikely(ret))) {
> >> 		    ^^		   ^^
> >> The extra parentheses are redundant and can be removed for clarity.
> > 
> > You are right, I will remove it. Thanks.
> > 
> 
> Please feel free to add the following in the next version.
> 
> Reviewed-by: Shivank Garg <shivankg@amd.com>
> Tested-by: Shivank Garg <shivankg@amd.com>

Thanks a lot for your careful reviewing and testing.

> 
> >>
> >>> +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
> >>>  			goto recovery;
> >>> +		}
> >>>  
> >>>  		/* Allocated area. */
> >>>  		va = vas[area];
> >>
> > 
> 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  2025-04-15 15:25   ` Uladzislau Rezki
@ 2025-04-15 23:41     ` Baoquan He
  0 siblings, 0 replies; 32+ messages in thread
From: Baoquan He @ 2025-04-15 23:41 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel

On 04/15/25 at 05:25pm, Uladzislau Rezki wrote:
> On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote:
> > When finding a VA in vn->busy, if the VA spans several nodes and the
> > passed addr is not the same as va->va_start, we should scan the vn in
> > reverse order, because the starting address of the VA must be smaller
> > than the passed addr if the addr really resides in that VA.
> > 
> > E.g. on a system with nr_vmap_nodes=100:
> > 
> >      <----va---->
> >  -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
> >     ...   n-1   n    n+1   n+2   ...    99     0     1
> > 
> > The VA resides in node 'n' while it spans 'n', 'n+1' and 'n+2'. If the
> > passed addr is within 'n+2', we should try nodes backwards through
> > 'n+1' and 'n', and succeed very soon.
> > 
> > Meanwhile we still need to loop around, because a VA could span nodes
> > from 'n' to node 99, then node 0 and node 1.
> > 
> > Anyway, changing the search to reverse order can improve efficiency on
> > systems with many CPUs.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/vmalloc.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index aca1905d3397..488d69b56765 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> >  
> >  		if (va)
> >  			return va;
> > -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> > +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
> >  
> >  	return NULL;
> >  }
> > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
> >  
> >  		if (va)
> >  			return va;
> > -	} while ((i = (i + 1) % nr_vmap_nodes) != j);
> > +	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
> >  
> >  	return NULL;
> >  }
> > -- 
> > 2.41.0
> > 
> It depends. Consider the situation below:
> 
>              addr
>               |
>         VA    V
>   <------------>
> <---|---|---|---|---|---|---|--->
>   0   1   2   3   0   1   2   3
> 
> Basically it matters how big the VA is and how many nodes it spans. But
> I agree that the assumption behind scanning backwards is more convincing
> in most cases.

Agreed, on a small system with few CPUs and a big VA, the advantage is
not apparent.
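
For reference, here is a standalone sketch of the wrap-around decrement
the patch uses (NR_NODES and the start index are made-up values); adding
NR_NODES - 1 instead of subtracting 1 keeps the unsigned index from
wrapping below zero.

<snip>
#include <stdio.h>

#define NR_NODES 8	/* hypothetical node count */

int main(void)
{
	unsigned int start = 3, i = start;

	/* Reverse ring walk: visits 3 2 1 0 7 6 5 4, then stops
	 * when the index comes back around to the start. */
	do {
		printf("%u ", i);
	} while ((i = (i + NR_NODES - 1) % NR_NODES) != start);

	printf("\n");
	return 0;
}
<snip>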

> 
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Thanks.



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable
  2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He
  2025-04-15 10:47   ` Uladzislau Rezki
  2025-04-15 19:08   ` Shivank Garg
@ 2025-04-15 23:53   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Vishal Moola (Oracle) @ 2025-04-15 23:53 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:48AM +0800, Baoquan He wrote:
> Static variable 'purge_nodes' is defined at global scope, while it's
> only used in the function __purge_vmap_area_lazy(). It mainly serves to
> avoid repeated memory allocation, especially when NR_CPUS is big.
> 
> A local static variable can satisfy the demand equally well and
> improves code readability. Hence move its definition into
> __purge_vmap_area_lazy().
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases()
  2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
  2025-04-15 10:44   ` Uladzislau Rezki
  2025-04-15 19:10   ` Shivank Garg
@ 2025-04-15 23:54   ` Vishal Moola (Oracle)
  2 siblings, 0 replies; 32+ messages in thread
From: Vishal Moola (Oracle) @ 2025-04-15 23:54 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:51AM +0800, Baoquan He wrote:
> Remove the unneeded local variables and pass the values directly.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>

Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15  2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
  2025-04-15 10:29   ` Shivank Garg
@ 2025-04-16 13:50   ` Uladzislau Rezki
  2025-04-17  2:51     ` Baoquan He
  1 sibling, 1 reply; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-16 13:50 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote:
> When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
> also moved, partially or completely, into the free vmap tree; that is
> done in decay_va_pool_node(). When doing that, for each pool of a node,
> the whole list is detached from the pool for handling. At this time,
> the pool is empty. It's not necessary to update the pool size each time
> one VA is removed and added into the free vmap tree.
> 
> Here, change the code to update the pool size once, when attaching the
> pool back.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 488d69b56765..bf735c890878 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  	LIST_HEAD(decay_list);
>  	struct rb_root decay_root = RB_ROOT;
>  	struct vmap_area *va, *nva;
> -	unsigned long n_decay;
> +	unsigned long n_decay, len;
>  	int i;
>  
>  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		list_replace_init(&vn->pool[i].head, &tmp_list);
>  		spin_unlock(&vn->pool_lock);
>  
> -		if (full_decay)
> -			WRITE_ONCE(vn->pool[i].len, 0);
> +		len = n_decay = vn->pool[i].len;
> +		WRITE_ONCE(vn->pool[i].len, 0);
>  
>  		/* Decay a pool by ~25% out of left objects. */
> -		n_decay = vn->pool[i].len >> 2;
> +		if (!full_decay)
> +			n_decay >>= 2;
> +		len -= n_decay;
>  
>  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> +			if (!n_decay)
> +				break;
>  			list_del_init(&va->list);
>  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> -
> -			if (!full_decay) {
> -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
> -
> -				if (!--n_decay)
> -					break;
> -			}
> +			n_decay--;
>  		}
>  
>  		/*
> @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		 * can populate the pool therefore a simple list replace
>  		 * operation takes place here.
>  		 */
> -		if (!full_decay && !list_empty(&tmp_list)) {
> +		if (!list_empty(&tmp_list)) {
>  			spin_lock(&vn->pool_lock);
>  			list_replace_init(&tmp_list, &vn->pool[i].head);
> +			vn->pool[i].len = len;
>  			spin_unlock(&vn->pool_lock);
>  		}
>  	}
> -- 
> 2.41.0
> 
Which Linux version is this patch based on? I cannot apply it.

Small nits:

<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c909b8fea6eb..0ae53c997219 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 	LIST_HEAD(decay_list);
 	struct rb_root decay_root = RB_ROOT;
 	struct vmap_area *va, *nva;
-	unsigned long n_decay, len;
+	unsigned long n_decay, pool_len;
 	int i;
 
 	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
@@ -2164,21 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		list_replace_init(&vn->pool[i].head, &tmp_list);
 		spin_unlock(&vn->pool_lock);
 
-		len = n_decay = vn->pool[i].len;
+		pool_len = n_decay = vn->pool[i].len;
 		WRITE_ONCE(vn->pool[i].len, 0);
 
 		/* Decay a pool by ~25% out of left objects. */
 		if (!full_decay)
 			n_decay >>= 2;
-		len -= n_decay;
+		pool_len -= n_decay;
 
 		list_for_each_entry_safe(va, nva, &tmp_list, list) {
-			if (!n_decay)
+			if (!n_decay--)
 				break;
 
 			list_del_init(&va->list);
 			merge_or_add_vmap_area(va, &decay_root, &decay_list);
-			n_decay--;
 		}
 
 		/*
@@ -2190,7 +2189,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		if (!list_empty(&tmp_list)) {
 			spin_lock(&vn->pool_lock);
 			list_replace_init(&tmp_list, &vn->pool[i].head);
-			vn->pool[i].len = len;
+			vn->pool[i].len = pool_len;
 			spin_unlock(&vn->pool_lock);
 		}
 	}
<snip>

How about the change above, on top of this patch?

a) do the decrement in the "if" statement, so no extra line is needed;
b) rename 'len' to something more obvious: 'pool_len'.
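
A standalone sketch of the resulting bookkeeping, with a made-up pool
length, may help illustrate:

<snip>
#include <stdio.h>

int main(void)
{
	unsigned long pool_len, n_decay;
	int full_decay = 0;		/* made-up scenario: partial decay */

	pool_len = n_decay = 40;	/* objects currently in the pool */

	/* Decay ~25% of the objects unless a full decay is requested. */
	if (!full_decay)
		n_decay >>= 2;
	pool_len -= n_decay;		/* objects that will be put back */

	/* prints: free 10, keep 30 */
	printf("free %lu, keep %lu\n", n_decay, pool_len);
	return 0;
}
<snip>

Note that with the "if (!n_decay--)" form, the walk still terminates
correctly once n_decay reaches zero; the stray decrement to ULONG_MAX
is harmless because the loop exits immediately.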

--
Uladzislau Rezki


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
  2025-04-15  6:44   ` Baoquan He
  2025-04-15  7:22   ` Shivank Garg
@ 2025-04-16 14:28   ` Uladzislau Rezki
  2025-04-17  3:02     ` Baoquan He
  2 siblings, 1 reply; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-16 14:28 UTC (permalink / raw)
  To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel

On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote:
> In the code of alloc_vmap_area(), the upper bound 'vend' is returned
> to indicate whether the allocation succeeded or failed. That is not
> very clear.
> 
> Here, change it to return explicit error values and check them to
> judge whether the allocation succeeded.
> 
> IS_ERR_VALUE() already uses unlikely() internally.
> 
> Signed-off-by: Baoquan He <bhe@redhat.com>
> ---
>  mm/vmalloc.c | 34 +++++++++++++++++-----------------
>  1 file changed, 17 insertions(+), 17 deletions(-)
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 3f38a232663b..5b21cd09b2b4 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  			 */
>  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
>  			if (!lva)
> -				return -1;
> +				return -ENOMEM;
>  		}
>  
>  		/*
> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
>  		 */
>  		va->va_start = nva_start_addr + size;
>  	} else {
> -		return -1;
> +		return -EINVAL;
>  	}
>  
>  	if (type != FL_FIT_TYPE) {
> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
>  
>  	/* Check the "vend" restriction. */
>  	if (nva_start_addr + size > vend)
> -		return vend;
> +		return -ERANGE;
>  
>  	/* Update the free vmap_area. */
>  	ret = va_clip(root, head, va, nva_start_addr, size);
> -	if (WARN_ON_ONCE(ret))
> -		return vend;
>
It is not clear why you remove this WARN_ON in this patch. It should be
a separate patch, or else just keep it as is. The warning here can mean
that something is really wrong, especially in the NOTHING_FIT case. So
we definitely want the warning.

> +	if (ret)
> +		return ret;
>  
>  	return nva_start_addr;
>  }
>  
>  /*
>   * Returns a start address of the newly allocated area, if success.
> - * Otherwise a vend is returned that indicates failure.
> + * Otherwise an error value is returned that indicates failure.
>   */
>  static __always_inline unsigned long
>  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
>  
>  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
>  	if (unlikely(!va))
> -		return vend;
> +		return -ENOENT;
>  
>  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> -	if (nva_start_addr == vend)
> -		return vend;
>  
>  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> -	find_vmap_lowest_match_check(root, head, size, align);
> +	if (!IS_ERR_VALUE(nva_start_addr))
>
Just keep it as it was. No need to check if addr is valid or not.

> +		find_vmap_lowest_match_check(root, head, size, align);
>  #endif
>  
>  	return nva_start_addr;
> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
>  	struct vmap_area *va;
>  
>  	*vn_id = 0;
> -	*addr = vend;
> +	*addr = -EINVAL;
>  
>  	/*
>  	 * Fallback to a global heap if not vmalloc or there
> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	}
>  
>  retry:
> -	if (addr == vend) {
> +	if (IS_ERR_VALUE(addr)) {
>  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
>  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
>  			size, align, vstart, vend);
>  		spin_unlock(&free_vmap_area_lock);
>  	}
>  
> -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
>  
>  	/*
> -	 * If an allocation fails, the "vend" address is
> +	 * If an allocation fails, the error value is
>  	 * returned. Therefore trigger the overflow path.
>  	 */
> -	if (unlikely(addr == vend))
> +	if (IS_ERR_VALUE(addr))
>  		goto overflow;
>  
>  	va->va_start = addr;
> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>  
>  		ret = va_clip(&free_vmap_area_root,
>  			&free_vmap_area_list, va, start, size);
> -		if (WARN_ON_ONCE(unlikely(ret)))
> -			/* It is a BUG(), but trigger recovery instead. */
Keep the comment.

> +		if ((unlikely(ret))) {
> +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
>  			goto recovery;
> +		}
>

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-16 13:50   ` Uladzislau Rezki
@ 2025-04-17  2:51     ` Baoquan He
  2025-04-17 16:18       ` Uladzislau Rezki
  0 siblings, 1 reply; 32+ messages in thread
From: Baoquan He @ 2025-04-17  2:51 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel

On 04/16/25 at 03:50pm, Uladzislau Rezki wrote:
> On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote:
> > When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
> > also moved, partially or completely, into the free vmap tree; that is
> > done in decay_va_pool_node(). When doing that, for each pool of a node,
> > the whole list is detached from the pool for handling. At this time,
> > the pool is empty. It's not necessary to update the pool size each time
> > one VA is removed and added into the free vmap tree.
> > 
> > Here, change the code to update the pool size once, when attaching the
> > pool back.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/vmalloc.c | 23 +++++++++++------------
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 488d69b56765..bf735c890878 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  	LIST_HEAD(decay_list);
> >  	struct rb_root decay_root = RB_ROOT;
> >  	struct vmap_area *va, *nva;
> > -	unsigned long n_decay;
> > +	unsigned long n_decay, len;
> >  	int i;
> >  
> >  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		list_replace_init(&vn->pool[i].head, &tmp_list);
> >  		spin_unlock(&vn->pool_lock);
> >  
> > -		if (full_decay)
> > -			WRITE_ONCE(vn->pool[i].len, 0);
> > +		len = n_decay = vn->pool[i].len;
> > +		WRITE_ONCE(vn->pool[i].len, 0);
> >  
> >  		/* Decay a pool by ~25% out of left objects. */
> > -		n_decay = vn->pool[i].len >> 2;
> > +		if (!full_decay)
> > +			n_decay >>= 2;
> > +		len -= n_decay;
> >  
> >  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> > +			if (!n_decay)
> > +				break;
> >  			list_del_init(&va->list);
> >  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> > -
> > -			if (!full_decay) {
> > -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
> > -
> > -				if (!--n_decay)
> > -					break;
> > -			}
> > +			n_decay--;
> >  		}
> >  
> >  		/*
> > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> >  		 * can populate the pool therefore a simple list replace
> >  		 * operation takes place here.
> >  		 */
> > -		if (!full_decay && !list_empty(&tmp_list)) {
> > +		if (!list_empty(&tmp_list)) {
> >  			spin_lock(&vn->pool_lock);
> >  			list_replace_init(&tmp_list, &vn->pool[i].head);
> > +			vn->pool[i].len = len;
> >  			spin_unlock(&vn->pool_lock);
> >  		}
> >  	}
> > -- 
> > 2.41.0
> > 
> Which Linux version is this patch based on? I cannot apply it.

I can apply them on the latest mainline kernel, next/master, and the
mm-new branch of akpm/mm.git; I checked just now.

> 
> Small nits:
> 
> <snip>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c909b8fea6eb..0ae53c997219 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  	LIST_HEAD(decay_list);
>  	struct rb_root decay_root = RB_ROOT;
>  	struct vmap_area *va, *nva;
> -	unsigned long n_decay, len;
> +	unsigned long n_decay, pool_len;
>  	int i;
>  
>  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> @@ -2164,21 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		list_replace_init(&vn->pool[i].head, &tmp_list);
>  		spin_unlock(&vn->pool_lock);
>  
> -		len = n_decay = vn->pool[i].len;
> +		pool_len = n_decay = vn->pool[i].len;
>  		WRITE_ONCE(vn->pool[i].len, 0);
>  
>  		/* Decay a pool by ~25% out of left objects. */
>  		if (!full_decay)
>  			n_decay >>= 2;
> -		len -= n_decay;
> +		pool_len -= n_decay;
>  
>  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> -			if (!n_decay)
> +			if (!n_decay--)
>  				break;
>  
>  			list_del_init(&va->list);
>  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> -			n_decay--;
>  		}
>  
>  		/*
> @@ -2190,7 +2189,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
>  		if (!list_empty(&tmp_list)) {
>  			spin_lock(&vn->pool_lock);
>  			list_replace_init(&tmp_list, &vn->pool[i].head);
> -			vn->pool[i].len = len;
> +			vn->pool[i].len = pool_len;
>  			spin_unlock(&vn->pool_lock);
>  		}
>  	}
> <snip>
> 
> How about the change above, on top of this patch?
> 
> a) do the decrement in the "if" statement, so no extra line is needed;
> b) rename 'len' to something more obvious: 'pool_len'.

All look good, will add them to this patch 3 of v2. Thanks a lot.
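
For anyone following along, the locking shape the patch relies on is
detach-under-lock, then process-unlocked. A minimal userspace sketch
with a stand-in list type (not the kernel's struct list_head; the lock
calls are elided as comments):

<snip>
#include <stdio.h>

struct node { int val; struct node *next; };

static struct node *pool;	/* shared; normally lock-protected */

static struct node *detach_all(void)
{
	/* spin_lock(&pool_lock); */
	struct node *snap = pool;
	pool = NULL;		/* the pool is now empty */
	/* spin_unlock(&pool_lock); */
	return snap;
}

int main(void)
{
	struct node a = { 1, NULL }, b = { 2, &a };
	pool = &b;

	/* Walk the private snapshot with no lock held. */
	for (struct node *n = detach_all(); n; n = n->next)
		printf("processing %d\n", n->val);
	return 0;
}
<snip>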



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-16 14:28   ` Uladzislau Rezki
@ 2025-04-17  3:02     ` Baoquan He
  2025-04-17 16:17       ` Uladzislau Rezki
  0 siblings, 1 reply; 32+ messages in thread
From: Baoquan He @ 2025-04-17  3:02 UTC (permalink / raw)
  To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel

On 04/16/25 at 04:28pm, Uladzislau Rezki wrote:
> On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote:
> > In the code of alloc_vmap_area(), the upper bound 'vend' is returned
> > to indicate whether the allocation succeeded or failed. That is not
> > very clear.
> > 
> > Here, change it to return explicit error values and check them to
> > judge whether the allocation succeeded.
> > 
> > IS_ERR_VALUE() already uses unlikely() internally.
> > 
> > Signed-off-by: Baoquan He <bhe@redhat.com>
> > ---
> >  mm/vmalloc.c | 34 +++++++++++++++++-----------------
> >  1 file changed, 17 insertions(+), 17 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 3f38a232663b..5b21cd09b2b4 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >  			 */
> >  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> >  			if (!lva)
> > -				return -1;
> > +				return -ENOMEM;
> >  		}
> >  
> >  		/*
> > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> >  		 */
> >  		va->va_start = nva_start_addr + size;
> >  	} else {
> > -		return -1;
> > +		return -EINVAL;
> >  	}
> >  
> >  	if (type != FL_FIT_TYPE) {
> > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
> >  
> >  	/* Check the "vend" restriction. */
> >  	if (nva_start_addr + size > vend)
> > -		return vend;
> > +		return -ERANGE;
> >  
> >  	/* Update the free vmap_area. */
> >  	ret = va_clip(root, head, va, nva_start_addr, size);
> > -	if (WARN_ON_ONCE(ret))
> > -		return vend;
> >
> It is not clear why you remove this WARN_ON in this patch. It should be
> a separate patch, or else just keep it as is. The warning here can mean
> that something is really wrong, especially in the NOTHING_FIT case. So
> we definitely want the warning.

I remember someone once reported that a slab allocation failure
triggered this warning, which was confusing to them. But yes, it should
be discussed in a separate post or thread; it is not appropriate to
remove it silently. I will add it back in v2.
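
For context, here is a userspace re-creation of the WARN_ON_ONCE() idiom
under discussion (the real macro lives in include/asm-generic/bug.h): it
warns at most once but still evaluates to the condition, so dropping it
changes observability, not control flow.

<snip>
#include <stdio.h>

/* Warn at most once, yet report the condition so callers can branch
 * on it (GCC statement expression, as the kernel itself uses). */
#define WARN_ON_ONCE(cond) ({				\
	static int __warned;				\
	int __ret = !!(cond);				\
	if (__ret && !__warned) {			\
		__warned = 1;				\
		fprintf(stderr, "warning: %s\n", #cond);\
	}						\
	__ret;						\
})

int main(void)
{
	for (int i = 0; i < 3; i++) {
		if (WARN_ON_ONCE(i > 0))	/* warns only at i == 1 */
			printf("condition hit at i=%d\n", i);
	}
	return 0;
}
<snip>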

> 
> > +	if (ret)
> > +		return ret;
> >  
> >  	return nva_start_addr;
> >  }
> >  
> >  /*
> >   * Returns a start address of the newly allocated area, if success.
> > - * Otherwise a vend is returned that indicates failure.
> > + * Otherwise an error value is returned that indicates failure.
> >   */
> >  static __always_inline unsigned long
> >  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> >  
> >  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
> >  	if (unlikely(!va))
> > -		return vend;
> > +		return -ENOENT;
> >  
> >  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> > -	if (nva_start_addr == vend)
> > -		return vend;
> >  
> >  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> > -	find_vmap_lowest_match_check(root, head, size, align);
> > +	if (!IS_ERR_VALUE(nva_start_addr))
> >
> Just keep it as it was. No need to check if addr is valid or not.

This is to keep consistent with the old code. Before this patch, if
va_alloc() returned vend, the function returned directly and no
find_vmap_lowest_match_check() invocation was done. I tried to keep the
behaviour unchanged. That code is for debugging, so either way is fine
with me.
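
As an aside, here is a userspace re-creation of the IS_ERR_VALUE() idiom
this series switches to (the real macro is in include/linux/err.h): the
top MAX_ERRNO values of the address space are reserved, so one unsigned
long can carry either a valid address or a negative errno.

<snip>
#include <stdio.h>

#define MAX_ERRNO	4095
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

int main(void)
{
	unsigned long addr = (unsigned long)-12;	/* -ENOMEM folded into an address */
	unsigned long ok   = 0x1000;			/* an ordinary address */

	printf("error? %d (errno %ld)\n",		/* error? 1 (errno -12) */
	       IS_ERR_VALUE(addr) ? 1 : 0, (long)addr);
	printf("error? %d\n", IS_ERR_VALUE(ok) ? 1 : 0);/* error? 0 */
	return 0;
}
<snip>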

> 
> > +		find_vmap_lowest_match_check(root, head, size, align);
> >  #endif
> >  
> >  	return nva_start_addr;
> > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
> >  	struct vmap_area *va;
> >  
> >  	*vn_id = 0;
> > -	*addr = vend;
> > +	*addr = -EINVAL;
> >  
> >  	/*
> >  	 * Fallback to a global heap if not vmalloc or there
> > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> >  	}
> >  
> >  retry:
> > -	if (addr == vend) {
> > +	if (IS_ERR_VALUE(addr)) {
> >  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
> >  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
> >  			size, align, vstart, vend);
> >  		spin_unlock(&free_vmap_area_lock);
> >  	}
> >  
> > -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> > +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
> >  
> >  	/*
> > -	 * If an allocation fails, the "vend" address is
> > +	 * If an allocation fails, the error value is
> >  	 * returned. Therefore trigger the overflow path.
> >  	 */
> > -	if (unlikely(addr == vend))
> > +	if (IS_ERR_VALUE(addr))
> >  		goto overflow;
> >  
> >  	va->va_start = addr;
> > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> >  
> >  		ret = va_clip(&free_vmap_area_root,
> >  			&free_vmap_area_list, va, start, size);
> > -		if (WARN_ON_ONCE(unlikely(ret)))
> > -			/* It is a BUG(), but trigger recovery instead. */
> Keep the comment.

OK, will add it back.

> 
> > +		if ((unlikely(ret))) {
> > +			WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
> >  			goto recovery;
> > +		}
> >
> 
> --
> Uladzislau Rezki
> 



^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-17  3:02     ` Baoquan He
@ 2025-04-17 16:17       ` Uladzislau Rezki
  0 siblings, 0 replies; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-17 16:17 UTC (permalink / raw)
  To: Baoquan He; +Cc: Uladzislau Rezki, linux-mm, akpm, linux-kernel

On Thu, Apr 17, 2025 at 11:02:06AM +0800, Baoquan He wrote:
> On 04/16/25 at 04:28pm, Uladzislau Rezki wrote:
> > On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote:
> > > In the code of alloc_vmap_area(), the upper bound 'vend' is returned
> > > to indicate whether the allocation succeeded or failed. That is not
> > > very clear.
> > > 
> > > Here, change it to return explicit error values and check them to
> > > judge whether the allocation succeeded.
> > > 
> > > IS_ERR_VALUE() already uses unlikely() internally.
> > > 
> > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > ---
> > >  mm/vmalloc.c | 34 +++++++++++++++++-----------------
> > >  1 file changed, 17 insertions(+), 17 deletions(-)
> > > 
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 3f38a232663b..5b21cd09b2b4 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> > >  			 */
> > >  			lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> > >  			if (!lva)
> > > -				return -1;
> > > +				return -ENOMEM;
> > >  		}
> > >  
> > >  		/*
> > > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
> > >  		 */
> > >  		va->va_start = nva_start_addr + size;
> > >  	} else {
> > > -		return -1;
> > > +		return -EINVAL;
> > >  	}
> > >  
> > >  	if (type != FL_FIT_TYPE) {
> > > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
> > >  
> > >  	/* Check the "vend" restriction. */
> > >  	if (nva_start_addr + size > vend)
> > > -		return vend;
> > > +		return -ERANGE;
> > >  
> > >  	/* Update the free vmap_area. */
> > >  	ret = va_clip(root, head, va, nva_start_addr, size);
> > > -	if (WARN_ON_ONCE(ret))
> > > -		return vend;
> > >
> > It is not clear why you remove this WARN_ON in this patch. It should be
> > a separate patch, or else just keep it as is. The warning here can mean
> > that something is really wrong, especially in the NOTHING_FIT case. So
> > we definitely want the warning.
> 
> I remember someone once reported that a slab allocation failure
> triggered this warning, which was confusing to them. But yes, it should
> be discussed in a separate post or thread; it is not appropriate to
> remove it silently. I will add it back in v2.
> 
Thanks!

> > 
> > > +	if (ret)
> > > +		return ret;
> > >  
> > >  	return nva_start_addr;
> > >  }
> > >  
> > >  /*
> > >   * Returns a start address of the newly allocated area, if success.
> > > - * Otherwise a vend is returned that indicates failure.
> > > + * Otherwise an error value is returned that indicates failure.
> > >   */
> > >  static __always_inline unsigned long
> > >  __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> > > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
> > >  
> > >  	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
> > >  	if (unlikely(!va))
> > > -		return vend;
> > > +		return -ENOENT;
> > >  
> > >  	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
> > > -	if (nva_start_addr == vend)
> > > -		return vend;
> > >  
> > >  #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
> > > -	find_vmap_lowest_match_check(root, head, size, align);
> > > +	if (!IS_ERR_VALUE(nva_start_addr))
> > >
> > Just keep it as it was. No need to check if addr is valid or not.
> 
> This is to keep consistent with the old code. Before this patch, if
> va_alloc() returned vend, the function returned directly and no
> find_vmap_lowest_match_check() invocation was done. I tried to keep the
> behaviour unchanged. That code is for debugging, so either way is fine
> with me.
> 
Ack. Makes sense to keep the same behaviour as it was/is.

> > 
> > > +		find_vmap_lowest_match_check(root, head, size, align);
> > >  #endif
> > >  
> > >  	return nva_start_addr;
> > > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
> > >  	struct vmap_area *va;
> > >  
> > >  	*vn_id = 0;
> > > -	*addr = vend;
> > > +	*addr = -EINVAL;
> > >  
> > >  	/*
> > >  	 * Fallback to a global heap if not vmalloc or there
> > > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
> > >  	}
> > >  
> > >  retry:
> > > -	if (addr == vend) {
> > > +	if (IS_ERR_VALUE(addr)) {
> > >  		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
> > >  		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
> > >  			size, align, vstart, vend);
> > >  		spin_unlock(&free_vmap_area_lock);
> > >  	}
> > >  
> > > -	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
> > > +	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
> > >  
> > >  	/*
> > > -	 * If an allocation fails, the "vend" address is
> > > +	 * If an allocation fails, the error value is
> > >  	 * returned. Therefore trigger the overflow path.
> > >  	 */
> > > -	if (unlikely(addr == vend))
> > > +	if (IS_ERR_VALUE(addr))
> > >  		goto overflow;
> > >  
> > >  	va->va_start = addr;
> > > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
> > >  
> > >  		ret = va_clip(&free_vmap_area_root,
> > >  			&free_vmap_area_list, va, start, size);
> > > -		if (WARN_ON_ONCE(unlikely(ret)))
> > > -			/* It is a BUG(), but trigger recovery instead. */
> > Keep the comment.
> 
> OK, will add it back.
> 
Thank you.

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-17  2:51     ` Baoquan He
@ 2025-04-17 16:18       ` Uladzislau Rezki
  0 siblings, 0 replies; 32+ messages in thread
From: Uladzislau Rezki @ 2025-04-17 16:18 UTC (permalink / raw)
  To: Baoquan He; +Cc: Uladzislau Rezki, linux-mm, akpm, linux-kernel

On Thu, Apr 17, 2025 at 10:51:44AM +0800, Baoquan He wrote:
> On 04/16/25 at 03:50pm, Uladzislau Rezki wrote:
> > On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote:
> > > When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
> > > also moved, partially or completely, into the free vmap tree; that is
> > > done in decay_va_pool_node(). When doing that, for each pool of a node,
> > > the whole list is detached from the pool for handling. At this time,
> > > the pool is empty. It's not necessary to update the pool size each time
> > > one VA is removed and added into the free vmap tree.
> > > 
> > > Here, change the code to update the pool size once, when attaching the
> > > pool back.
> > > 
> > > Signed-off-by: Baoquan He <bhe@redhat.com>
> > > ---
> > >  mm/vmalloc.c | 23 +++++++++++------------
> > >  1 file changed, 11 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index 488d69b56765..bf735c890878 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> > >  	LIST_HEAD(decay_list);
> > >  	struct rb_root decay_root = RB_ROOT;
> > >  	struct vmap_area *va, *nva;
> > > -	unsigned long n_decay;
> > > +	unsigned long n_decay, len;
> > >  	int i;
> > >  
> > >  	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
> > > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> > >  		list_replace_init(&vn->pool[i].head, &tmp_list);
> > >  		spin_unlock(&vn->pool_lock);
> > >  
> > > -		if (full_decay)
> > > -			WRITE_ONCE(vn->pool[i].len, 0);
> > > +		len = n_decay = vn->pool[i].len;
> > > +		WRITE_ONCE(vn->pool[i].len, 0);
> > >  
> > >  		/* Decay a pool by ~25% out of left objects. */
> > > -		n_decay = vn->pool[i].len >> 2;
> > > +		if (!full_decay)
> > > +			n_decay >>= 2;
> > > +		len -= n_decay;
> > >  
> > >  		list_for_each_entry_safe(va, nva, &tmp_list, list) {
> > > +			if (!n_decay)
> > > +				break;
> > >  			list_del_init(&va->list);
> > >  			merge_or_add_vmap_area(va, &decay_root, &decay_list);
> > > -
> > > -			if (!full_decay) {
> > > -				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
> > > -
> > > -				if (!--n_decay)
> > > -					break;
> > > -			}
> > > +			n_decay--;
> > >  		}
> > >  
> > >  		/*
> > > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
> > >  		 * can populate the pool therefore a simple list replace
> > >  		 * operation takes place here.
> > >  		 */
> > > -		if (!full_decay && !list_empty(&tmp_list)) {
> > > +		if (!list_empty(&tmp_list)) {
> > >  			spin_lock(&vn->pool_lock);
> > >  			list_replace_init(&tmp_list, &vn->pool[i].head);
> > > +			vn->pool[i].len = len;
> > >  			spin_unlock(&vn->pool_lock);
> > >  		}
> > >  	}
> > > -- 
> > > 2.41.0
> > > 
> > Which Linux version is this patch based on? I cannot apply it.
> 
> I can apply them on the latest mainline kernel, next/master, and the
> mm-new branch of akpm/mm.git; I checked just now.
> 
Sounds good.

--
Uladzislau Rezki


^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread

Thread overview: 32+ messages
2025-04-15  2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
2025-04-15  2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He
2025-04-15 10:47   ` Uladzislau Rezki
2025-04-15 19:08   ` Shivank Garg
2025-04-15 23:53   ` Vishal Moola (Oracle)
2025-04-15  2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
2025-04-15 15:25   ` Uladzislau Rezki
2025-04-15 23:41     ` Baoquan He
2025-04-15 19:09   ` Shivank Garg
2025-04-15  2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
2025-04-15 10:29   ` Shivank Garg
2025-04-15 14:05     ` Baoquan He
2025-04-15 19:02       ` Shivank Garg
2025-04-16 13:50   ` Uladzislau Rezki
2025-04-17  2:51     ` Baoquan He
2025-04-17 16:18       ` Uladzislau Rezki
2025-04-15  2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
2025-04-15 10:44   ` Uladzislau Rezki
2025-04-15 19:10   ` Shivank Garg
2025-04-15 23:54   ` Vishal Moola (Oracle)
2025-04-15  2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
2025-04-15  6:44   ` Baoquan He
2025-04-15  7:22   ` Shivank Garg
2025-04-15 13:01     ` Baoquan He
2025-04-15 19:00       ` Shivank Garg
2025-04-15 22:57         ` Baoquan He
2025-04-16 14:28   ` Uladzislau Rezki
2025-04-17  3:02     ` Baoquan He
2025-04-17 16:17       ` Uladzislau Rezki
2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki
2025-04-15 22:55   ` Baoquan He
2025-04-15 19:06 ` Shivank Garg
