* [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements
@ 2025-04-15 2:39 Baoquan He
  2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He
                  ` (6 more replies)
  0 siblings, 7 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

These were made from code inspection in mm/vmalloc.c.

Baoquan He (5):
  mm/vmalloc.c: change purge_ndoes as local static variable
  mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  mm/vmalloc: optimize function vm_unmap_aliases()
  mm/vmalloc.c: return explicit error value in alloc_vmap_area()

 mm/vmalloc.c | 68 +++++++++++++++++++++++++---------------------------
 1 file changed, 32 insertions(+), 36 deletions(-)

-- 
2.41.0

^ permalink raw reply	[flat|nested] 32+ messages in thread
* [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable
  2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
@ 2025-04-15 2:39 ` Baoquan He
  2025-04-15 10:47  ` Uladzislau Rezki
                    ` (2 more replies)
  2025-04-15 2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
                  ` (5 subsequent siblings)
  6 siblings, 3 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

The static variable 'purge_nodes' is defined at global scope, while it is
only used in __purge_vmap_area_lazy(). It mainly serves to avoid repeated
memory allocation, especially when NR_CPUS is big.

A local static variable satisfies that need just as well and improves
code readability. Hence move its definition into __purge_vmap_area_lazy().

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 6ee7fc2ec986..aca1905d3397 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock);
 
 /* for per-CPU blocks */
 static void purge_fragmented_blocks_allcpus(void);
-static cpumask_t purge_nodes;
 
 static void
 reclaim_list_global(struct list_head *head)
@@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end,
 {
         unsigned long nr_purged_areas = 0;
         unsigned int nr_purge_helpers;
+        static cpumask_t purge_nodes;
         unsigned int nr_purge_nodes;
         struct vmap_node *vn;
         int i;
-- 
2.41.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
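For readers wondering why the mask is static at all: cpumask_t is sized by NR_CPUS, so on large configurations it is too big for the stack and too wasteful to allocate on every purge cycle. A function-local static keeps the same single preallocated object while limiting its visibility to the only user. Below is a minimal userspace sketch of that pattern, compilable with gcc; the struct name, size and values are made up for illustration and are not kernel code.

<snip>
#include <stdio.h>
#include <string.h>

/* Stand-in for cpumask_t: a bitmap sized for a "large" NR_CPUS build. */
#define NR_CPUS_MODEL 4096
struct cpumask_model {
	unsigned long bits[NR_CPUS_MODEL / (8 * sizeof(unsigned long))];
};

/*
 * Model of the pattern in __purge_vmap_area_lazy(): the mask is too big
 * to live on the stack and would be wasteful to allocate on every call,
 * so it sits in static storage.  Making it function-local static keeps
 * the one preallocated object but narrows its scope to the only user.
 */
static int purge_model(void)
{
	static struct cpumask_model purge_mask;	/* one copy, scope-limited */
	int weight = 0;

	memset(&purge_mask, 0, sizeof(purge_mask));
	purge_mask.bits[0] = 0x5;	/* pretend nodes 0 and 2 need purging */

	for (size_t i = 0; i < sizeof(purge_mask.bits) / sizeof(purge_mask.bits[0]); i++)
		weight += __builtin_popcountl(purge_mask.bits[i]);

	return weight;
}

int main(void)
{
	printf("nodes to purge: %d (mask size: %zu bytes)\n",
	       purge_model(), sizeof(struct cpumask_model));
	return 0;
}
<snip>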
* Re: [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable 2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He @ 2025-04-15 10:47 ` Uladzislau Rezki 2025-04-15 19:08 ` Shivank Garg 2025-04-15 23:53 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-15 10:47 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:48AM +0800, Baoquan He wrote: > Static variable 'purge_ndoes' is defined in global scope, while it's > only used in function __purge_vmap_area_lazy(). It mainly serves to > avoid memory allocation repeatedly, especially when NR_CPUS is big. > > While a local static variable can also satisfy the demand, and can > improve code readibility. Hence move its definition into > __purge_vmap_area_lazy(). > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 6ee7fc2ec986..aca1905d3397 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock); > > /* for per-CPU blocks */ > static void purge_fragmented_blocks_allcpus(void); > -static cpumask_t purge_nodes; > > static void > reclaim_list_global(struct list_head *head) > @@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end, > { > unsigned long nr_purged_areas = 0; > unsigned int nr_purge_helpers; > + static cpumask_t purge_nodes; > unsigned int nr_purge_nodes; > struct vmap_node *vn; > int i; > -- > 2.41.0 > Well. I do not have a strong opinion here. But right, it is only used inside the function. Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable 2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He 2025-04-15 10:47 ` Uladzislau Rezki @ 2025-04-15 19:08 ` Shivank Garg 2025-04-15 23:53 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:08 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > Static variable 'purge_ndoes' is defined in global scope, while it's > only used in function __purge_vmap_area_lazy(). It mainly serves to > avoid memory allocation repeatedly, especially when NR_CPUS is big. > > While a local static variable can also satisfy the demand, and can > improve code readibility. Hence move its definition into > __purge_vmap_area_lazy(). > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 6ee7fc2ec986..aca1905d3397 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2128,7 +2128,6 @@ static DEFINE_MUTEX(vmap_purge_lock); > > /* for per-CPU blocks */ > static void purge_fragmented_blocks_allcpus(void); > -static cpumask_t purge_nodes; > > static void > reclaim_list_global(struct list_head *head) > @@ -2261,6 +2260,7 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end, > { > unsigned long nr_purged_areas = 0; > unsigned int nr_purge_helpers; > + static cpumask_t purge_nodes; > unsigned int nr_purge_nodes; > struct vmap_node *vn; > int i; Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable 2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He 2025-04-15 10:47 ` Uladzislau Rezki 2025-04-15 19:08 ` Shivank Garg @ 2025-04-15 23:53 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Vishal Moola (Oracle) @ 2025-04-15 23:53 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:48AM +0800, Baoquan He wrote: > Static variable 'purge_ndoes' is defined in global scope, while it's > only used in function __purge_vmap_area_lazy(). It mainly serves to > avoid memory allocation repeatedly, especially when NR_CPUS is big. > > While a local static variable can also satisfy the demand, and can > improve code readibility. Hence move its definition into > __purge_vmap_area_lazy(). > > Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order
  2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
  2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He
@ 2025-04-15 2:39 ` Baoquan He
  2025-04-15 15:25  ` Uladzislau Rezki
  2025-04-15 19:09  ` Shivank Garg
  2025-04-15 2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
                  ` (4 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

When looking up a VA in vn->busy, if the VA spans several nodes and the
passed addr is not equal to va->va_start, we should scan the vmap_nodes
in reverse order, because the starting address of the VA must be smaller
than the passed addr if addr really resides in that VA.

E.g. on a system with nr_vmap_nodes=100:

                          <----va---->
   -|-----|-----|-----|-----|-----|-----|-----|-----|-----|-
    ...   n-1    n    n+1   n+2   ...   100    0     1

The VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If the
passed addr is within 'n+2', we should try the nodes backwards, i.e.
'n+1' and then 'n', and succeed very soon.

Meanwhile we still need to loop around, because a VA could span nodes
from 'n' through node 100 to node 0 and node 1.

Anyway, changing the search to reverse order can improve efficiency on
systems with many CPUs.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index aca1905d3397..488d69b56765 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 
 		if (va)
 			return va;
-	} while ((i = (i + 1) % nr_vmap_nodes) != j);
+	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
 
 	return NULL;
 }
@@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
 
 		if (va)
 			return va;
-	} while ((i = (i + 1) % nr_vmap_nodes) != j);
+	} while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j);
 
 	return NULL;
 }
-- 
2.41.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
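A small userspace sketch of the index arithmetic being changed, assuming the commit message's example of 100 nodes with the VA starting in node 20 and the passed address landing in node 22; the node numbers and the helper below are made up and this is not the kernel lookup code. Stepping with (i + nr_vmap_nodes - 1) % nr_vmap_nodes walks the node indices backwards with wrap-around, so the node that owns va_start is reached after a few probes instead of after almost a full forward loop.

<snip>
#include <stdio.h>

#define NR_NODES 100

/*
 * Count how many nodes are probed before reaching the node that owns
 * the vmap_area (the one holding va_start), starting from the node
 * derived from 'addr' and stepping either forward or backward.
 */
static int probes(int start, int owner, int step_back)
{
	int i = start, n = 0;

	do {
		n++;
		if (i == owner)
			return n;
		i = step_back ? (i + NR_NODES - 1) % NR_NODES
			      : (i + 1) % NR_NODES;
	} while (i != start);

	return -1;	/* not found after a full loop */
}

int main(void)
{
	int addr_node = 22, owner_node = 20;

	printf("forward scan : %d probes\n", probes(addr_node, owner_node, 0));
	printf("backward scan: %d probes\n", probes(addr_node, owner_node, 1));
	return 0;
}
<snip>

Compiled and run, this prints 99 probes for the forward walk and 3 for the backward one, which is the asymmetry the patch exploits; as discussed in the follow-up below, the advantage shrinks when a single VA spans a large fraction of the nodes.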
* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order 2025-04-15 2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He @ 2025-04-15 15:25 ` Uladzislau Rezki 2025-04-15 23:41 ` Baoquan He 2025-04-15 19:09 ` Shivank Garg 1 sibling, 1 reply; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-15 15:25 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote: > When finding VA in vn->busy, if VA spans several zones and the passed > addr is not the same as va->va_start, we should scan the vn in reverse > odrdr because the starting address of VA must be smaller than the passed > addr if it really resides in the VA. > > E.g on a system nr_vmap_nodes=100, > > <----va----> > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > ... n-1 n n+1 n+2 ... 100 0 1 > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > then succeed very soon. > > Meanwhile we still need loop around because VA could spans node from 'n' > to node 100, node 0, node 1. > > Anyway, changing to find in reverse order can improve efficiency on > many CPUs system. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index aca1905d3397..488d69b56765 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > -- > 2.41.0 > It depends. Consider a below situation: addr | VA V <------------> <---|---|---|---|---|---|---|---> 0 1 2 3 0 1 2 3 basically it matters how big VA and how many nodes it spans. But i agree that an assumption to reverse back is more convinced in most cases. Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order 2025-04-15 15:25 ` Uladzislau Rezki @ 2025-04-15 23:41 ` Baoquan He 0 siblings, 0 replies; 32+ messages in thread From: Baoquan He @ 2025-04-15 23:41 UTC (permalink / raw) To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel On 04/15/25 at 05:25pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:49AM +0800, Baoquan He wrote: > > When finding VA in vn->busy, if VA spans several zones and the passed > > addr is not the same as va->va_start, we should scan the vn in reverse > > odrdr because the starting address of VA must be smaller than the passed > > addr if it really resides in the VA. > > > > E.g on a system nr_vmap_nodes=100, > > > > <----va----> > > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > > ... n-1 n n+1 n+2 ... 100 0 1 > > > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > > then succeed very soon. > > > > Meanwhile we still need loop around because VA could spans node from 'n' > > to node 100, node 0, node 1. > > > > Anyway, changing to find in reverse order can improve efficiency on > > many CPUs system. > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index aca1905d3397..488d69b56765 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > > > if (va) > > return va; > > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > > > return NULL; > > } > > -- > > 2.41.0 > > > It depends. Consider a below situation: > > addr > | > VA V > <------------> > <---|---|---|---|---|---|---|---> > 0 1 2 3 0 1 2 3 > > basically it matters how big VA and how many nodes it spans. But i > agree that an assumption to reverse back is more convinced in most > cases. Agree, on small system with few CPUs and big VA case, the advantage is not apparent. > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Thanks. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order 2025-04-15 2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He 2025-04-15 15:25 ` Uladzislau Rezki @ 2025-04-15 19:09 ` Shivank Garg 1 sibling, 0 replies; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:09 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > When finding VA in vn->busy, if VA spans several zones and the passed > addr is not the same as va->va_start, we should scan the vn in reverse > odrdr because the starting address of VA must be smaller than the passed > addr if it really resides in the VA. > > E.g on a system nr_vmap_nodes=100, > > <----va----> > -|-----|-----|-----|-----|-----|-----|-----|-----|-----|- > ... n-1 n n+1 n+2 ... 100 0 1 > > VA resides in node 'n' whereas it spans 'n', 'n+1' and 'n+2'. If passed > addr is within 'n+2', we should try nodes backwards on 'n+1' and 'n', > then succeed very soon. > > Meanwhile we still need loop around because VA could spans node from 'n' > to node 100, node 0, node 1. > > Anyway, changing to find in reverse order can improve efficiency on > many CPUs system. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index aca1905d3397..488d69b56765 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2436,7 +2436,7 @@ struct vmap_area *find_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } > @@ -2462,7 +2462,7 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr) > > if (va) > return va; > - } while ((i = (i + 1) % nr_vmap_nodes) != j); > + } while ((i = (i + nr_vmap_nodes - 1) % nr_vmap_nodes) != j); > > return NULL; > } Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Thanks, Shivank ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit
  2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
  2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He
  2025-04-15 2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He
@ 2025-04-15 2:39 ` Baoquan He
  2025-04-15 10:29  ` Shivank Garg
  2025-04-16 13:50  ` Uladzislau Rezki
  2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
                  ` (3 subsequent siblings)
  6 siblings, 2 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

When purging lazily freed vmap areas, the VAs stored in vn->pool[] are
also taken away into the free vmap tree, partially or completely; that is
done in decay_va_pool_node(). When doing that, for each pool of a node,
the whole list is detached from the pool for handling. At that point the
pool is empty, so it is not necessary to update the pool size each time a
VA is removed and added into the free vmap tree.

Change the code to update the pool size once, when attaching the pool
back.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 488d69b56765..bf735c890878 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 	LIST_HEAD(decay_list);
 	struct rb_root decay_root = RB_ROOT;
 	struct vmap_area *va, *nva;
-	unsigned long n_decay;
+	unsigned long n_decay, len;
 	int i;
 
 	for (i = 0; i < MAX_VA_SIZE_PAGES; i++) {
@@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		list_replace_init(&vn->pool[i].head, &tmp_list);
 		spin_unlock(&vn->pool_lock);
 
-		if (full_decay)
-			WRITE_ONCE(vn->pool[i].len, 0);
+		len = n_decay = vn->pool[i].len;
+		WRITE_ONCE(vn->pool[i].len, 0);
 
 		/* Decay a pool by ~25% out of left objects. */
-		n_decay = vn->pool[i].len >> 2;
+		if (!full_decay)
+			n_decay >>= 2;
+		len -= n_decay;
 
 		list_for_each_entry_safe(va, nva, &tmp_list, list) {
+			if (!n_decay)
+				break;
 			list_del_init(&va->list);
 			merge_or_add_vmap_area(va, &decay_root, &decay_list);
-
-			if (!full_decay) {
-				WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1);
-
-				if (!--n_decay)
-					break;
-			}
+			n_decay--;
 		}
 
 		/*
@@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay)
 		 * can populate the pool therefore a simple list replace
 		 * operation takes place here.
 		 */
-		if (!full_decay && !list_empty(&tmp_list)) {
+		if (!list_empty(&tmp_list)) {
 			spin_lock(&vn->pool_lock);
 			list_replace_init(&tmp_list, &vn->pool[i].head);
+			vn->pool[i].len = len;
 			spin_unlock(&vn->pool_lock);
 		}
 	}
-- 
2.41.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
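A minimal model of the new bookkeeping, under the assumption that the interesting part is the arithmetic rather than the list handling: a plain counter stands in for the detached list, the length is read once and zeroed while the list is detached, and the remainder is written back exactly once. This is an illustrative userspace sketch, not the kernel function.

<snip>
#include <stdio.h>

struct pool_model {
	unsigned long len;
};

/*
 * Model of decay_va_pool_node()'s new flow: compute the counts up front
 * and store the remaining length once, instead of decrementing it per
 * moved entry while the pool list is detached anyway.
 */
static unsigned long decay_pool(struct pool_model *pool, int full_decay,
				unsigned long *moved)
{
	unsigned long pool_len, n_decay;

	/* "Detach" the list: the pool is logically empty from here on. */
	pool_len = n_decay = pool->len;
	pool->len = 0;

	/* Decay the whole pool, or ~25% of it. */
	if (!full_decay)
		n_decay >>= 2;
	pool_len -= n_decay;

	*moved = n_decay;	/* entries handed over to the free tree */

	/* "Re-attach" the remainder and update the length exactly once. */
	pool->len = pool_len;
	return pool->len;
}

int main(void)
{
	struct pool_model pool = { .len = 10 };
	unsigned long left, moved;

	left = decay_pool(&pool, 0, &moved);
	printf("partial decay: moved %lu, %lu left in pool\n", moved, left);

	pool.len = 10;
	left = decay_pool(&pool, 1, &moved);
	printf("full decay   : moved %lu, %lu left in pool\n", moved, left);
	return 0;
}
<snip>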
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-15 2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He @ 2025-04-15 10:29 ` Shivank Garg 2025-04-15 14:05 ` Baoquan He 2025-04-16 13:50 ` Uladzislau Rezki 1 sibling, 1 reply; 32+ messages in thread From: Shivank Garg @ 2025-04-15 10:29 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > When purge lazily freed vmap areas, VA stored in vn->pool[] will also be > taken away into free vmap tree partially or completely accordingly, that > is done in decay_va_pool_node(). When doing that, for each pool of node, > the whole list is detached from the pool for handling. At this time, > that pool is empty. It's not necessary to update the pool size each time > when one VA is removed and addded into free vmap tree. > > Here change code to update the pool size when attaching the pool back. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 23 +++++++++++------------ > 1 file changed, 11 insertions(+), 12 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 488d69b56765..bf735c890878 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > LIST_HEAD(decay_list); > struct rb_root decay_root = RB_ROOT; > struct vmap_area *va, *nva; > - unsigned long n_decay; > + unsigned long n_decay, len; > int i; > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > list_replace_init(&vn->pool[i].head, &tmp_list); > spin_unlock(&vn->pool_lock); > > - if (full_decay) > - WRITE_ONCE(vn->pool[i].len, 0); > + len = n_decay = vn->pool[i].len; > + WRITE_ONCE(vn->pool[i].len, 0); > > /* Decay a pool by ~25% out of left objects. */ > - n_decay = vn->pool[i].len >> 2; > + if (!full_decay) > + n_decay >>= 2; > + len -= n_decay; > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > + if (!n_decay) > + break; > list_del_init(&va->list); > merge_or_add_vmap_area(va, &decay_root, &decay_list); > - > - if (!full_decay) { > - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); > - > - if (!--n_decay) > - break; > - } > + n_decay--; > } > > /* > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > * can populate the pool therefore a simple list replace > * operation takes place here. > */ > - if (!full_decay && !list_empty(&tmp_list)) { > + if (!list_empty(&tmp_list)) { > spin_lock(&vn->pool_lock); > list_replace_init(&tmp_list, &vn->pool[i].head); > + vn->pool[i].len = len; Current logic uses WRITE_ONCE() to update vn->pool[i].len. Could this lead to consistency issues? > spin_unlock(&vn->pool_lock); > } > } ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-15 10:29 ` Shivank Garg @ 2025-04-15 14:05 ` Baoquan He 2025-04-15 19:02 ` Shivank Garg 0 siblings, 1 reply; 32+ messages in thread From: Baoquan He @ 2025-04-15 14:05 UTC (permalink / raw) To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel On 04/15/25 at 03:59pm, Shivank Garg wrote: > On 4/15/2025 8:09 AM, Baoquan He wrote: > > When purge lazily freed vmap areas, VA stored in vn->pool[] will also be > > taken away into free vmap tree partially or completely accordingly, that > > is done in decay_va_pool_node(). When doing that, for each pool of node, > > the whole list is detached from the pool for handling. At this time, > > that pool is empty. It's not necessary to update the pool size each time > > when one VA is removed and addded into free vmap tree. > > > > Here change code to update the pool size when attaching the pool back. > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 23 +++++++++++------------ > > 1 file changed, 11 insertions(+), 12 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 488d69b56765..bf735c890878 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > LIST_HEAD(decay_list); > > struct rb_root decay_root = RB_ROOT; > > struct vmap_area *va, *nva; > > - unsigned long n_decay; > > + unsigned long n_decay, len; > > int i; > > > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > list_replace_init(&vn->pool[i].head, &tmp_list); > > spin_unlock(&vn->pool_lock); > > > > - if (full_decay) > > - WRITE_ONCE(vn->pool[i].len, 0); > > + len = n_decay = vn->pool[i].len; > > + WRITE_ONCE(vn->pool[i].len, 0); > > > > /* Decay a pool by ~25% out of left objects. */ > > - n_decay = vn->pool[i].len >> 2; > > + if (!full_decay) > > + n_decay >>= 2; > > + len -= n_decay; > > > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > > + if (!n_decay) > > + break; > > list_del_init(&va->list); > > merge_or_add_vmap_area(va, &decay_root, &decay_list); > > - > > - if (!full_decay) { > > - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); > > - > > - if (!--n_decay) > > - break; > > - } > > + n_decay--; > > } > > > > /* > > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > * can populate the pool therefore a simple list replace > > * operation takes place here. > > */ > > - if (!full_decay && !list_empty(&tmp_list)) { > > + if (!list_empty(&tmp_list)) { > > spin_lock(&vn->pool_lock); > > list_replace_init(&tmp_list, &vn->pool[i].head); > > + vn->pool[i].len = len; > > Current logic uses WRITE_ONCE() to update vn->pool[i].len. > Could this lead to consistency issues? Seems no necessary to use WRITE_ONCE(). I can change back to use WRITE_ONCE() just in case. Currently, it's only updated in node_alloc(), decay_va_pool_node(), purge_vmap_node(). And the latter two are inside vmap_purge_lock area. > > > spin_unlock(&vn->pool_lock); > > } > > } > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-15 14:05 ` Baoquan He @ 2025-04-15 19:02 ` Shivank Garg 0 siblings, 0 replies; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:02 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On 4/15/2025 7:35 PM, Baoquan He wrote: > On 04/15/25 at 03:59pm, Shivank Garg wrote: >> On 4/15/2025 8:09 AM, Baoquan He wrote: >>> When purge lazily freed vmap areas, VA stored in vn->pool[] will also be >>> taken away into free vmap tree partially or completely accordingly, that >>> is done in decay_va_pool_node(). When doing that, for each pool of node, >>> the whole list is detached from the pool for handling. At this time, >>> that pool is empty. It's not necessary to update the pool size each time >>> when one VA is removed and addded into free vmap tree. >>> >>> Here change code to update the pool size when attaching the pool back. >>> >>> Signed-off-by: Baoquan He <bhe@redhat.com> >>> --- >>> mm/vmalloc.c | 23 +++++++++++------------ >>> 1 file changed, 11 insertions(+), 12 deletions(-) >>> >>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c >>> index 488d69b56765..bf735c890878 100644 >>> --- a/mm/vmalloc.c >>> +++ b/mm/vmalloc.c >>> @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) >>> LIST_HEAD(decay_list); >>> struct rb_root decay_root = RB_ROOT; >>> struct vmap_area *va, *nva; >>> - unsigned long n_decay; >>> + unsigned long n_decay, len; >>> int i; >>> >>> for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { >>> @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) >>> list_replace_init(&vn->pool[i].head, &tmp_list); >>> spin_unlock(&vn->pool_lock); >>> >>> - if (full_decay) >>> - WRITE_ONCE(vn->pool[i].len, 0); >>> + len = n_decay = vn->pool[i].len; >>> + WRITE_ONCE(vn->pool[i].len, 0); >>> >>> /* Decay a pool by ~25% out of left objects. */ >>> - n_decay = vn->pool[i].len >> 2; >>> + if (!full_decay) >>> + n_decay >>= 2; >>> + len -= n_decay; >>> >>> list_for_each_entry_safe(va, nva, &tmp_list, list) { >>> + if (!n_decay) >>> + break; >>> list_del_init(&va->list); >>> merge_or_add_vmap_area(va, &decay_root, &decay_list); >>> - >>> - if (!full_decay) { >>> - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); >>> - >>> - if (!--n_decay) >>> - break; >>> - } >>> + n_decay--; >>> } >>> >>> /* >>> @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) >>> * can populate the pool therefore a simple list replace >>> * operation takes place here. >>> */ >>> - if (!full_decay && !list_empty(&tmp_list)) { >>> + if (!list_empty(&tmp_list)) { >>> spin_lock(&vn->pool_lock); >>> list_replace_init(&tmp_list, &vn->pool[i].head); >>> + vn->pool[i].len = len; >> >> Current logic uses WRITE_ONCE() to update vn->pool[i].len. >> Could this lead to consistency issues? > > Seems no necessary to use WRITE_ONCE(). I can change back to use > WRITE_ONCE() just in case. Currently, it's only updated in > node_alloc(), decay_va_pool_node(), purge_vmap_node(). And the latter > two are inside vmap_purge_lock area. > Okay. Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Best, Shivank >> >>> spin_unlock(&vn->pool_lock); >>> } >>> } >> > ^ permalink raw reply [flat|nested] 32+ messages in thread
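For context on the _ONCE annotations discussed above: they matter for tear-free accesses when some reader samples the value without holding the lock, not for the writers, which are serialized. A rough userspace analogue is sketched below; it is purely illustrative, the macros are simplified stand-ins for the kernel ones, and the "reader" here is not actually concurrent.

<snip>
#include <stdio.h>

/* Simplified stand-ins for the kernel annotations (illustrative only). */
#define WRITE_ONCE(x, val)	(*(volatile typeof(x) *)&(x) = (val))
#define READ_ONCE(x)		(*(volatile typeof(x) *)&(x))

struct pool_model {
	unsigned long len;
};

/* Writer side: updates are serialized (by a lock in the real code). */
static void pool_set_len(struct pool_model *p, unsigned long len)
{
	WRITE_ONCE(p->len, len);	/* single, non-torn store */
}

/* Reader side: samples the length without taking the lock. */
static unsigned long pool_peek_len(struct pool_model *p)
{
	return READ_ONCE(p->len);	/* single, non-torn load */
}

int main(void)
{
	struct pool_model p = { .len = 0 };

	pool_set_len(&p, 8);
	printf("lockless peek sees %lu entries\n", pool_peek_len(&p));
	return 0;
}
<snip>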
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-15 2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He 2025-04-15 10:29 ` Shivank Garg @ 2025-04-16 13:50 ` Uladzislau Rezki 2025-04-17 2:51 ` Baoquan He 1 sibling, 1 reply; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-16 13:50 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote: > When purge lazily freed vmap areas, VA stored in vn->pool[] will also be > taken away into free vmap tree partially or completely accordingly, that > is done in decay_va_pool_node(). When doing that, for each pool of node, > the whole list is detached from the pool for handling. At this time, > that pool is empty. It's not necessary to update the pool size each time > when one VA is removed and addded into free vmap tree. > > Here change code to update the pool size when attaching the pool back. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 23 +++++++++++------------ > 1 file changed, 11 insertions(+), 12 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 488d69b56765..bf735c890878 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > LIST_HEAD(decay_list); > struct rb_root decay_root = RB_ROOT; > struct vmap_area *va, *nva; > - unsigned long n_decay; > + unsigned long n_decay, len; > int i; > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > list_replace_init(&vn->pool[i].head, &tmp_list); > spin_unlock(&vn->pool_lock); > > - if (full_decay) > - WRITE_ONCE(vn->pool[i].len, 0); > + len = n_decay = vn->pool[i].len; > + WRITE_ONCE(vn->pool[i].len, 0); > > /* Decay a pool by ~25% out of left objects. */ > - n_decay = vn->pool[i].len >> 2; > + if (!full_decay) > + n_decay >>= 2; > + len -= n_decay; > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > + if (!n_decay) > + break; > list_del_init(&va->list); > merge_or_add_vmap_area(va, &decay_root, &decay_list); > - > - if (!full_decay) { > - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); > - > - if (!--n_decay) > - break; > - } > + n_decay--; > } > > /* > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > * can populate the pool therefore a simple list replace > * operation takes place here. > */ > - if (!full_decay && !list_empty(&tmp_list)) { > + if (!list_empty(&tmp_list)) { > spin_lock(&vn->pool_lock); > list_replace_init(&tmp_list, &vn->pool[i].head); > + vn->pool[i].len = len; > spin_unlock(&vn->pool_lock); > } > } > -- > 2.41.0 > Which Linux version this patch is based on? I can not apply it. 
Small nits: <snip> diff --git a/mm/vmalloc.c b/mm/vmalloc.c index c909b8fea6eb..0ae53c997219 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) LIST_HEAD(decay_list); struct rb_root decay_root = RB_ROOT; struct vmap_area *va, *nva; - unsigned long n_decay, len; + unsigned long n_decay, pool_len; int i; for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { @@ -2164,21 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) list_replace_init(&vn->pool[i].head, &tmp_list); spin_unlock(&vn->pool_lock); - len = n_decay = vn->pool[i].len; + pool_len = n_decay = vn->pool[i].len; WRITE_ONCE(vn->pool[i].len, 0); /* Decay a pool by ~25% out of left objects. */ if (!full_decay) n_decay >>= 2; - len -= n_decay; + pool_len -= n_decay; list_for_each_entry_safe(va, nva, &tmp_list, list) { - if (!n_decay) + if (!n_decay--) break; list_del_init(&va->list); merge_or_add_vmap_area(va, &decay_root, &decay_list); - n_decay--; } /* @@ -2190,7 +2189,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) if (!list_empty(&tmp_list)) { spin_lock(&vn->pool_lock); list_replace_init(&tmp_list, &vn->pool[i].head); - vn->pool[i].len = len; + vn->pool[i].len = pool_len; spin_unlock(&vn->pool_lock); } } <snip> on top of this? a) decay in "if" statement, no need extra line; b) rename len to something obvious - pool_len. -- Uladzislau Rezki ^ permalink raw reply related [flat|nested] 32+ messages in thread
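The two loop-exit styles in the nit above behave the same for this loop; the only difference is that the post-decrement form wraps n_decay to ULONG_MAX on the exit path, which is harmless because the value is not used afterwards. A tiny standalone check (illustrative, not the kernel loop):

<snip>
#include <stdio.h>

/* Style used in the posted patch: test, move, then decrement. */
static unsigned long style_a(unsigned long n_decay, unsigned long items)
{
	unsigned long moved = 0;

	for (unsigned long i = 0; i < items; i++) {
		if (!n_decay)
			break;
		moved++;
		n_decay--;
	}
	return moved;
}

/* Style suggested in the nit: test and decrement in one expression. */
static unsigned long style_b(unsigned long n_decay, unsigned long items)
{
	unsigned long moved = 0;

	for (unsigned long i = 0; i < items; i++) {
		if (!n_decay--)	/* test first, then decrement */
			break;
		moved++;
	}
	return moved;
}

int main(void)
{
	for (unsigned long n = 0; n <= 5; n++)
		printf("n_decay=%lu: style_a moves %lu, style_b moves %lu\n",
		       n, style_a(n, 3), style_b(n, 3));
	return 0;
}
<snip>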
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-16 13:50 ` Uladzislau Rezki @ 2025-04-17 2:51 ` Baoquan He 2025-04-17 16:18 ` Uladzislau Rezki 0 siblings, 1 reply; 32+ messages in thread From: Baoquan He @ 2025-04-17 2:51 UTC (permalink / raw) To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel On 04/16/25 at 03:50pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote: > > When purge lazily freed vmap areas, VA stored in vn->pool[] will also be > > taken away into free vmap tree partially or completely accordingly, that > > is done in decay_va_pool_node(). When doing that, for each pool of node, > > the whole list is detached from the pool for handling. At this time, > > that pool is empty. It's not necessary to update the pool size each time > > when one VA is removed and addded into free vmap tree. > > > > Here change code to update the pool size when attaching the pool back. > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 23 +++++++++++------------ > > 1 file changed, 11 insertions(+), 12 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 488d69b56765..bf735c890878 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > LIST_HEAD(decay_list); > > struct rb_root decay_root = RB_ROOT; > > struct vmap_area *va, *nva; > > - unsigned long n_decay; > > + unsigned long n_decay, len; > > int i; > > > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > list_replace_init(&vn->pool[i].head, &tmp_list); > > spin_unlock(&vn->pool_lock); > > > > - if (full_decay) > > - WRITE_ONCE(vn->pool[i].len, 0); > > + len = n_decay = vn->pool[i].len; > > + WRITE_ONCE(vn->pool[i].len, 0); > > > > /* Decay a pool by ~25% out of left objects. */ > > - n_decay = vn->pool[i].len >> 2; > > + if (!full_decay) > > + n_decay >>= 2; > > + len -= n_decay; > > > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > > + if (!n_decay) > > + break; > > list_del_init(&va->list); > > merge_or_add_vmap_area(va, &decay_root, &decay_list); > > - > > - if (!full_decay) { > > - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); > > - > > - if (!--n_decay) > > - break; > > - } > > + n_decay--; > > } > > > > /* > > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > * can populate the pool therefore a simple list replace > > * operation takes place here. > > */ > > - if (!full_decay && !list_empty(&tmp_list)) { > > + if (!list_empty(&tmp_list)) { > > spin_lock(&vn->pool_lock); > > list_replace_init(&tmp_list, &vn->pool[i].head); > > + vn->pool[i].len = len; > > spin_unlock(&vn->pool_lock); > > } > > } > > -- > > 2.41.0 > > > Which Linux version this patch is based on? I can not apply it. I can apply them on the latest mainline kernel, next/master and mm-new branch of akpm/mm.git. I checked just now. 
> > Small nits: > > <snip> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index c909b8fea6eb..0ae53c997219 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > LIST_HEAD(decay_list); > struct rb_root decay_root = RB_ROOT; > struct vmap_area *va, *nva; > - unsigned long n_decay, len; > + unsigned long n_decay, pool_len; > int i; > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > @@ -2164,21 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > list_replace_init(&vn->pool[i].head, &tmp_list); > spin_unlock(&vn->pool_lock); > > - len = n_decay = vn->pool[i].len; > + pool_len = n_decay = vn->pool[i].len; > WRITE_ONCE(vn->pool[i].len, 0); > > /* Decay a pool by ~25% out of left objects. */ > if (!full_decay) > n_decay >>= 2; > - len -= n_decay; > + pool_len -= n_decay; > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > - if (!n_decay) > + if (!n_decay--) > break; > > list_del_init(&va->list); > merge_or_add_vmap_area(va, &decay_root, &decay_list); > - n_decay--; > } > > /* > @@ -2190,7 +2189,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > if (!list_empty(&tmp_list)) { > spin_lock(&vn->pool_lock); > list_replace_init(&tmp_list, &vn->pool[i].head); > - vn->pool[i].len = len; > + vn->pool[i].len = pool_len; > spin_unlock(&vn->pool_lock); > } > } > <snip> > > on top of this? > > a) decay in "if" statement, no need extra line; > b) rename len to something obvious - pool_len. All look good, will add them to this patch 3 of v2. Thanks a lot. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit 2025-04-17 2:51 ` Baoquan He @ 2025-04-17 16:18 ` Uladzislau Rezki 0 siblings, 0 replies; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-17 16:18 UTC (permalink / raw) To: Baoquan He; +Cc: Uladzislau Rezki, linux-mm, akpm, linux-kernel On Thu, Apr 17, 2025 at 10:51:44AM +0800, Baoquan He wrote: > On 04/16/25 at 03:50pm, Uladzislau Rezki wrote: > > On Tue, Apr 15, 2025 at 10:39:50AM +0800, Baoquan He wrote: > > > When purge lazily freed vmap areas, VA stored in vn->pool[] will also be > > > taken away into free vmap tree partially or completely accordingly, that > > > is done in decay_va_pool_node(). When doing that, for each pool of node, > > > the whole list is detached from the pool for handling. At this time, > > > that pool is empty. It's not necessary to update the pool size each time > > > when one VA is removed and addded into free vmap tree. > > > > > > Here change code to update the pool size when attaching the pool back. > > > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > > --- > > > mm/vmalloc.c | 23 +++++++++++------------ > > > 1 file changed, 11 insertions(+), 12 deletions(-) > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > index 488d69b56765..bf735c890878 100644 > > > --- a/mm/vmalloc.c > > > +++ b/mm/vmalloc.c > > > @@ -2150,7 +2150,7 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > > LIST_HEAD(decay_list); > > > struct rb_root decay_root = RB_ROOT; > > > struct vmap_area *va, *nva; > > > - unsigned long n_decay; > > > + unsigned long n_decay, len; > > > int i; > > > > > > for (i = 0; i < MAX_VA_SIZE_PAGES; i++) { > > > @@ -2164,22 +2164,20 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > > list_replace_init(&vn->pool[i].head, &tmp_list); > > > spin_unlock(&vn->pool_lock); > > > > > > - if (full_decay) > > > - WRITE_ONCE(vn->pool[i].len, 0); > > > + len = n_decay = vn->pool[i].len; > > > + WRITE_ONCE(vn->pool[i].len, 0); > > > > > > /* Decay a pool by ~25% out of left objects. */ > > > - n_decay = vn->pool[i].len >> 2; > > > + if (!full_decay) > > > + n_decay >>= 2; > > > + len -= n_decay; > > > > > > list_for_each_entry_safe(va, nva, &tmp_list, list) { > > > + if (!n_decay) > > > + break; > > > list_del_init(&va->list); > > > merge_or_add_vmap_area(va, &decay_root, &decay_list); > > > - > > > - if (!full_decay) { > > > - WRITE_ONCE(vn->pool[i].len, vn->pool[i].len - 1); > > > - > > > - if (!--n_decay) > > > - break; > > > - } > > > + n_decay--; > > > } > > > > > > /* > > > @@ -2188,9 +2186,10 @@ decay_va_pool_node(struct vmap_node *vn, bool full_decay) > > > * can populate the pool therefore a simple list replace > > > * operation takes place here. > > > */ > > > - if (!full_decay && !list_empty(&tmp_list)) { > > > + if (!list_empty(&tmp_list)) { > > > spin_lock(&vn->pool_lock); > > > list_replace_init(&tmp_list, &vn->pool[i].head); > > > + vn->pool[i].len = len; > > > spin_unlock(&vn->pool_lock); > > > } > > > } > > > -- > > > 2.41.0 > > > > > Which Linux version this patch is based on? I can not apply it. > > I can apply them on the latest mainline kernel, next/master and > mm-new branch of akpm/mm.git. I checked just now. > Sounds good. -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases()
  2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                  ` (2 preceding siblings ...)
  2025-04-15 2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He
@ 2025-04-15 2:39 ` Baoquan He
  2025-04-15 10:44  ` Uladzislau Rezki
                    ` (2 more replies)
  2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He
                  ` (2 subsequent siblings)
  6 siblings, 3 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

Remove the unneeded local variables and pass the values directly.

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index bf735c890878..3f38a232663b 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
  */
 void vm_unmap_aliases(void)
 {
-	unsigned long start = ULONG_MAX, end = 0;
-	int flush = 0;
-
-	_vm_unmap_aliases(start, end, flush);
+	_vm_unmap_aliases(ULONG_MAX, 0, 0);
 }
 EXPORT_SYMBOL_GPL(vm_unmap_aliases);
 
-- 
2.41.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
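The two removed locals were the usual identity values for accumulating a flush range with min()/max(): ULONG_MAX can only shrink and 0 can only grow, so passing the literals straight down changes nothing. A small sketch of that accumulation pattern follows, assuming a simplified model of how a dirty range is gathered; it is not the actual _vm_unmap_aliases() internals.

<snip>
#include <stdio.h>
#include <limits.h>

/*
 * ULONG_MAX/0 are the identity values for range accumulation: any real
 * range shrinks 'start' and grows 'end'.  If nothing is accumulated the
 * pair stays "empty", which the caller can detect.
 */
static void account(unsigned long *start, unsigned long *end,
		    unsigned long s, unsigned long e)
{
	if (s < *start)
		*start = s;
	if (e > *end)
		*end = e;
}

int main(void)
{
	unsigned long start = ULONG_MAX, end = 0;

	account(&start, &end, 0x2000, 0x3000);
	account(&start, &end, 0x1000, 0x1800);

	if (end > start)
		printf("flush range: [%#lx, %#lx)\n", start, end);
	else
		printf("nothing to flush\n");
	return 0;
}
<snip>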
* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() 2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He @ 2025-04-15 10:44 ` Uladzislau Rezki 2025-04-15 19:10 ` Shivank Garg 2025-04-15 23:54 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-15 10:44 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:51AM +0800, Baoquan He wrote: > Remove unneeded local variables and replace them with values. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index bf735c890878..3f38a232663b 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush) > */ > void vm_unmap_aliases(void) > { > - unsigned long start = ULONG_MAX, end = 0; > - int flush = 0; > - > - _vm_unmap_aliases(start, end, flush); > + _vm_unmap_aliases(ULONG_MAX, 0, 0); > } > EXPORT_SYMBOL_GPL(vm_unmap_aliases); > > -- > 2.41.0 > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com> Makes sense! -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() 2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He 2025-04-15 10:44 ` Uladzislau Rezki @ 2025-04-15 19:10 ` Shivank Garg 2025-04-15 23:54 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:10 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > Remove unneeded local variables and replace them with values. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 5 +---- > 1 file changed, 1 insertion(+), 4 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index bf735c890878..3f38a232663b 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -2930,10 +2930,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush) > */ > void vm_unmap_aliases(void) > { > - unsigned long start = ULONG_MAX, end = 0; > - int flush = 0; > - > - _vm_unmap_aliases(start, end, flush); > + _vm_unmap_aliases(ULONG_MAX, 0, 0); > } > EXPORT_SYMBOL_GPL(vm_unmap_aliases); > Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() 2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He 2025-04-15 10:44 ` Uladzislau Rezki 2025-04-15 19:10 ` Shivank Garg @ 2025-04-15 23:54 ` Vishal Moola (Oracle) 2 siblings, 0 replies; 32+ messages in thread From: Vishal Moola (Oracle) @ 2025-04-15 23:54 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:51AM +0800, Baoquan He wrote: > Remove unneeded local variables and replace them with values. > > Signed-off-by: Baoquan He <bhe@redhat.com> Reviewed-by: Vishal Moola (Oracle) <vishal.moola@gmail.com> ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area()
  2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He
                  ` (3 preceding siblings ...)
  2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He
@ 2025-04-15 2:39 ` Baoquan He
  2025-04-15 6:44   ` Baoquan He
                    ` (2 more replies)
  2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki
  2025-04-15 19:06 ` Shivank Garg
  6 siblings, 3 replies; 32+ messages in thread

From: Baoquan He @ 2025-04-15 2:39 UTC (permalink / raw)
To: linux-mm; +Cc: akpm, urezki, linux-kernel, Baoquan He

In the code of alloc_vmap_area(), the upper bound 'vend' is returned to
indicate whether the allocation succeeded or failed. That is not very
clear.

Change it to return explicit error values and check those to judge
whether the allocation succeeded.

IS_ERR_VALUE already uses unlikely() internally

Signed-off-by: Baoquan He <bhe@redhat.com>
---
 mm/vmalloc.c | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3f38a232663b..5b21cd09b2b4 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head,
 		 */
 		lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
 		if (!lva)
-			return -1;
+			return -ENOMEM;
 	}
 
 	/*
@@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head,
 		 */
 		va->va_start = nva_start_addr + size;
 	} else {
-		return -1;
+		return -EINVAL;
 	}
 
 	if (type != FL_FIT_TYPE) {
@@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va,
 
 	/* Check the "vend" restriction. */
 	if (nva_start_addr + size > vend)
-		return vend;
+		return -ERANGE;
 
 	/* Update the free vmap_area. */
 	ret = va_clip(root, head, va, nva_start_addr, size);
-	if (WARN_ON_ONCE(ret))
-		return vend;
+	if (ret)
+		return ret;
 
 	return nva_start_addr;
 }
 
 /*
  * Returns a start address of the newly allocated area, if success.
- * Otherwise a vend is returned that indicates failure.
+ * Otherwise an error value is returned that indicates failure.
  */
 static __always_inline unsigned long
 __alloc_vmap_area(struct rb_root *root, struct list_head *head,
@@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head,
 
 	va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size);
 	if (unlikely(!va))
-		return vend;
+		return -ENOENT;
 
 	nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend);
-	if (nva_start_addr == vend)
-		return vend;
 
 #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
-	find_vmap_lowest_match_check(root, head, size, align);
+	if (!IS_ERR_VALUE(nva_start_addr))
+		find_vmap_lowest_match_check(root, head, size, align);
 #endif
 
 	return nva_start_addr;
@@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align,
 	struct vmap_area *va;
 
 	*vn_id = 0;
-	*addr = vend;
+	*addr = -EINVAL;
 
 	/*
 	 * Fallback to a global heap if not vmalloc or there
@@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	}
 
 retry:
-	if (addr == vend) {
+	if (IS_ERR_VALUE(addr)) {
 		preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node);
 		addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list,
 			size, align, vstart, vend);
 		spin_unlock(&free_vmap_area_lock);
 	}
 
-	trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend);
+	trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr));
 
 	/*
-	 * If an allocation fails, the "vend" address is
+	 * If an allocation fails, the error value is
 	 * returned. Therefore trigger the overflow path.
 	 */
-	if (unlikely(addr == vend))
+	if (IS_ERR_VALUE(addr))
 		goto overflow;
 
 	va->va_start = addr;
@@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 
 			ret = va_clip(&free_vmap_area_root,
 				&free_vmap_area_list, va, start, size);
-			if (WARN_ON_ONCE(unlikely(ret)))
-				/* It is a BUG(), but trigger recovery instead. */
+			if ((unlikely(ret))) {
+				WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
 				goto recovery;
+			}
 
 			/* Allocated area. */
 			va = vas[area];
-- 
2.41.0

^ permalink raw reply related	[flat|nested] 32+ messages in thread
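The core of the change is that the returned unsigned long now carries either a start address or a negated errno, distinguished with IS_ERR_VALUE(). A userspace sketch of that contract follows; the allocator below is a made-up stand-in, and the IS_ERR_VALUE() definition is a simplified model of the one in include/linux/err.h, where the top MAX_ERRNO values of the address space are reserved for error codes.

<snip>
#include <stdio.h>
#include <errno.h>

#define MAX_ERRNO	4095
/* Simplified model of the kernel's IS_ERR_VALUE() from include/linux/err.h. */
#define IS_ERR_VALUE(x)	((unsigned long)(x) >= (unsigned long)-MAX_ERRNO)

/*
 * Model of the new alloc_vmap_area() contract: on success the start
 * address is returned, on failure a negated errno such as -ERANGE is
 * returned in the same unsigned long.
 */
static unsigned long alloc_range(unsigned long size,
				 unsigned long vstart, unsigned long vend)
{
	unsigned long addr = vstart;	/* pretend the lowest fit is at vstart */

	if (addr + size > vend)
		return -ERANGE;		/* does not fit below the upper bound */

	return addr;
}

int main(void)
{
	unsigned long addr;

	addr = alloc_range(0x1000, 0x10000, 0x20000);
	if (IS_ERR_VALUE(addr))
		printf("allocation failed: errno %ld\n", (long)addr);
	else
		printf("allocated at %#lx\n", addr);

	addr = alloc_range(0x1000, 0x10000, 0x10800);
	if (IS_ERR_VALUE(addr))
		printf("allocation failed: errno %ld\n", (long)addr);
	else
		printf("allocated at %#lx\n", addr);
	return 0;
}
<snip>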
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He @ 2025-04-15 6:44 ` Baoquan He 2025-04-15 7:22 ` Shivank Garg 2025-04-16 14:28 ` Uladzislau Rezki 2 siblings, 0 replies; 32+ messages in thread From: Baoquan He @ 2025-04-15 6:44 UTC (permalink / raw) To: linux-mm; +Cc: akpm, urezki, linux-kernel On 04/15/25 at 10:39am, Baoquan He wrote: > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > indicate if the allocation is successful or failed. That is not very clear. > > Here change to return explicit error values and check them to judge if > allocation is successful. > > IS_ERR_VALUE already uses unlikely() internally ^^^^^^^^^ Sorry, above line was added mistakenly in log draft, should be removed. > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 34 +++++++++++++++++----------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 3f38a232663b..5b21cd09b2b4 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > if (!lva) > - return -1; > + return -ENOMEM; > } > > /* > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > va->va_start = nva_start_addr + size; > } else { > - return -1; > + return -EINVAL; > } > > if (type != FL_FIT_TYPE) { > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > /* Check the "vend" restriction. */ > if (nva_start_addr + size > vend) > - return vend; > + return -ERANGE; > > /* Update the free vmap_area. */ > ret = va_clip(root, head, va, nva_start_addr, size); > - if (WARN_ON_ONCE(ret)) > - return vend; > + if (ret) > + return ret; > > return nva_start_addr; > } > > /* > * Returns a start address of the newly allocated area, if success. > - * Otherwise a vend is returned that indicates failure. > + * Otherwise an error value is returned that indicates failure. 
> */ > static __always_inline unsigned long > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > if (unlikely(!va)) > - return vend; > + return -ENOENT; > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > - if (nva_start_addr == vend) > - return vend; > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > - find_vmap_lowest_match_check(root, head, size, align); > + if (!IS_ERR_VALUE(nva_start_addr)) > + find_vmap_lowest_match_check(root, head, size, align); > #endif > > return nva_start_addr; > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > struct vmap_area *va; > > *vn_id = 0; > - *addr = vend; > + *addr = -EINVAL; > > /* > * Fallback to a global heap if not vmalloc or there > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > } > > retry: > - if (addr == vend) { > + if (IS_ERR_VALUE(addr)) { > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > size, align, vstart, vend); > spin_unlock(&free_vmap_area_lock); > } > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > /* > - * If an allocation fails, the "vend" address is > + * If an allocation fails, the error value is > * returned. Therefore trigger the overflow path. > */ > - if (unlikely(addr == vend)) > + if (IS_ERR_VALUE(addr)) > goto overflow; > > va->va_start = addr; > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > ret = va_clip(&free_vmap_area_root, > &free_vmap_area_list, va, start, size); > - if (WARN_ON_ONCE(unlikely(ret))) > - /* It is a BUG(), but trigger recovery instead. */ > + if ((unlikely(ret))) { > + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > goto recovery; > + } > > /* Allocated area. */ > va = vas[area]; > -- > 2.41.0 > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He 2025-04-15 6:44 ` Baoquan He @ 2025-04-15 7:22 ` Shivank Garg 2025-04-15 13:01 ` Baoquan He 2025-04-16 14:28 ` Uladzislau Rezki 2 siblings, 1 reply; 32+ messages in thread From: Shivank Garg @ 2025-04-15 7:22 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > indicate if the allocation is successful or failed. That is not very clear. > > Here change to return explicit error values and check them to judge if > allocation is successful. > > IS_ERR_VALUE already uses unlikely() internally > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 34 +++++++++++++++++----------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 3f38a232663b..5b21cd09b2b4 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > if (!lva) > - return -1; > + return -ENOMEM; > } > > /* > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > va->va_start = nva_start_addr + size; > } else { > - return -1; > + return -EINVAL; > } Braces around return -EINVAL seem unnecessary. They can be dropped. > > if (type != FL_FIT_TYPE) { > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > /* Check the "vend" restriction. */ > if (nva_start_addr + size > vend) > - return vend; > + return -ERANGE; > > /* Update the free vmap_area. */ > ret = va_clip(root, head, va, nva_start_addr, size); > - if (WARN_ON_ONCE(ret)) > - return vend; > + if (ret) > + return ret; Is it safe to remove the warning, or was it critical for debugging? > > return nva_start_addr; > } > > /* > * Returns a start address of the newly allocated area, if success. > - * Otherwise a vend is returned that indicates failure. > + * Otherwise an error value is returned that indicates failure. 
> */ > static __always_inline unsigned long > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > if (unlikely(!va)) > - return vend; > + return -ENOENT; > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > - if (nva_start_addr == vend) > - return vend; > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > - find_vmap_lowest_match_check(root, head, size, align); > + if (!IS_ERR_VALUE(nva_start_addr)) > + find_vmap_lowest_match_check(root, head, size, align); > #endif > > return nva_start_addr; > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > struct vmap_area *va; > > *vn_id = 0; > - *addr = vend; > + *addr = -EINVAL; > > /* > * Fallback to a global heap if not vmalloc or there > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > } > > retry: > - if (addr == vend) { > + if (IS_ERR_VALUE(addr)) { > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > size, align, vstart, vend); > spin_unlock(&free_vmap_area_lock); > } > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > /* > - * If an allocation fails, the "vend" address is > + * If an allocation fails, the error value is > * returned. Therefore trigger the overflow path. > */ > - if (unlikely(addr == vend)) > + if (IS_ERR_VALUE(addr)) > goto overflow; > > va->va_start = addr; > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > ret = va_clip(&free_vmap_area_root, > &free_vmap_area_list, va, start, size); > - if (WARN_ON_ONCE(unlikely(ret))) > - /* It is a BUG(), but trigger recovery instead. */ > + if ((unlikely(ret))) { ^^ ^^ The extra parentheses are redundant and can be removed for clarity. > + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > goto recovery; > + } > > /* Allocated area. */ > va = vas[area]; ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 7:22 ` Shivank Garg @ 2025-04-15 13:01 ` Baoquan He 2025-04-15 19:00 ` Shivank Garg 0 siblings, 1 reply; 32+ messages in thread From: Baoquan He @ 2025-04-15 13:01 UTC (permalink / raw) To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel On 04/15/25 at 12:52pm, Shivank Garg wrote: > On 4/15/2025 8:09 AM, Baoquan He wrote: > > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > > indicate if the allocation is successful or failed. That is not very clear. > > > > Here change to return explicit error values and check them to judge if > > allocation is successful. > > > > IS_ERR_VALUE already uses unlikely() internally > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 34 +++++++++++++++++----------------- > > 1 file changed, 17 insertions(+), 17 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 3f38a232663b..5b21cd09b2b4 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > */ > > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > > if (!lva) > > - return -1; > > + return -ENOMEM; > > } > > > > /* > > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > */ > > va->va_start = nva_start_addr + size; > > } else { > > - return -1; > > + return -EINVAL; > > } Thanks for reviewing. > > Braces around return -EINVAL seem unnecessary. > They can be dropped. This complys with the codeing style required in 3) Placing Braces and Spaces of Documentation/process/coding-style.rst because other branches are multiple statements. > > > > > if (type != FL_FIT_TYPE) { > > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > > > /* Check the "vend" restriction. */ > > if (nva_start_addr + size > vend) > > - return vend; > > + return -ERANGE; > > > > /* Update the free vmap_area. */ > > ret = va_clip(root, head, va, nva_start_addr, size); > > - if (WARN_ON_ONCE(ret)) > > - return vend; > > + if (ret) > > + return ret; > > Is it safe to remove the warning, or was it critical for debugging? This comes from a reported concern because va_clip() could be failed by NOTHING_FIT or kmem_cache_alloc failure. The warning here could cause confusion misleading people to think vmap area management is failed. > > > > > return nva_start_addr; > > } > > > > /* > > * Returns a start address of the newly allocated area, if success. > > - * Otherwise a vend is returned that indicates failure. > > + * Otherwise an error value is returned that indicates failure. 
> > */ > > static __always_inline unsigned long > > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > > if (unlikely(!va)) > > - return vend; > > + return -ENOENT; > > > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > > - if (nva_start_addr == vend) > > - return vend; > > > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > > - find_vmap_lowest_match_check(root, head, size, align); > > + if (!IS_ERR_VALUE(nva_start_addr)) > > + find_vmap_lowest_match_check(root, head, size, align); > > #endif > > > > return nva_start_addr; > > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > > struct vmap_area *va; > > > > *vn_id = 0; > > - *addr = vend; > > + *addr = -EINVAL; > > > > /* > > * Fallback to a global heap if not vmalloc or there > > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > > } > > > > retry: > > - if (addr == vend) { > > + if (IS_ERR_VALUE(addr)) { > > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > > size, align, vstart, vend); > > spin_unlock(&free_vmap_area_lock); > > } > > > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > > > /* > > - * If an allocation fails, the "vend" address is > > + * If an allocation fails, the error value is > > * returned. Therefore trigger the overflow path. > > */ > > - if (unlikely(addr == vend)) > > + if (IS_ERR_VALUE(addr)) > > goto overflow; > > > > va->va_start = addr; > > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > > > ret = va_clip(&free_vmap_area_root, > > &free_vmap_area_list, va, start, size); > > - if (WARN_ON_ONCE(unlikely(ret))) > > - /* It is a BUG(), but trigger recovery instead. */ > > + if ((unlikely(ret))) { > ^^ ^^ > The extra parentheses are redundant and can be removed for clarity. You are right, I will remove it. Thanks. > > > + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > > goto recovery; > > + } > > > > /* Allocated area. */ > > va = vas[area]; > ^ permalink raw reply [flat|nested] 32+ messages in thread
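The coding-style point being cited is section 3 ("Placing Braces and Spaces") of Documentation/process/coding-style.rst: braces are normally dropped around a single statement, but not when another branch of the same conditional needs them, in which case every branch keeps its braces. A small illustration of the rule, unrelated to vmalloc.c:

	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();	/* single statement, braces kept to match the other branch */
	}

So the quoted 'else { return -EINVAL; }' is consistent with the documented style.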
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 13:01 ` Baoquan He @ 2025-04-15 19:00 ` Shivank Garg 2025-04-15 22:57 ` Baoquan He 0 siblings, 1 reply; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:00 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On 4/15/2025 6:31 PM, Baoquan He wrote: > On 04/15/25 at 12:52pm, Shivank Garg wrote: >> On 4/15/2025 8:09 AM, Baoquan He wrote: >>> In codes of alloc_vmap_area(), it returns the upper bound 'vend' to >>> indicate if the allocation is successful or failed. That is not very clear. >>> >>> Here change to return explicit error values and check them to judge if >>> allocation is successful. >>> >>> IS_ERR_VALUE already uses unlikely() internally >>> >>> Signed-off-by: Baoquan He <bhe@redhat.com> >>> --- >>> mm/vmalloc.c | 34 +++++++++++++++++----------------- >>> 1 file changed, 17 insertions(+), 17 deletions(-) >>> >>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c >>> index 3f38a232663b..5b21cd09b2b4 100644 >>> --- a/mm/vmalloc.c >>> +++ b/mm/vmalloc.c >>> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, >>> */ >>> lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); >>> if (!lva) >>> - return -1; >>> + return -ENOMEM; >>> } >>> >>> /* >>> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, >>> */ >>> va->va_start = nva_start_addr + size; >>> } else { >>> - return -1; >>> + return -EINVAL; >>> } > > Thanks for reviewing. > >> >> Braces around return -EINVAL seem unnecessary. >> They can be dropped. > > This complys with the codeing style required in 3) Placing Braces and > Spaces of Documentation/process/coding-style.rst because other branches > are multiple statements. > >> >>> >>> if (type != FL_FIT_TYPE) { >>> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, >>> >>> /* Check the "vend" restriction. */ >>> if (nva_start_addr + size > vend) >>> - return vend; >>> + return -ERANGE; >>> >>> /* Update the free vmap_area. */ >>> ret = va_clip(root, head, va, nva_start_addr, size); >>> - if (WARN_ON_ONCE(ret)) >>> - return vend; >>> + if (ret) >>> + return ret; >> >> Is it safe to remove the warning, or was it critical for debugging? > > This comes from a reported concern because va_clip() could be failed by > NOTHING_FIT or kmem_cache_alloc failure. The warning here could cause > confusion misleading people to think vmap area management is failed. > >> >>> >>> return nva_start_addr; >>> } >>> >>> /* >>> * Returns a start address of the newly allocated area, if success. >>> - * Otherwise a vend is returned that indicates failure. >>> + * Otherwise an error value is returned that indicates failure. 
>>> */ >>> static __always_inline unsigned long >>> __alloc_vmap_area(struct rb_root *root, struct list_head *head, >>> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, >>> >>> va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); >>> if (unlikely(!va)) >>> - return vend; >>> + return -ENOENT; >>> >>> nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); >>> - if (nva_start_addr == vend) >>> - return vend; >>> >>> #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK >>> - find_vmap_lowest_match_check(root, head, size, align); >>> + if (!IS_ERR_VALUE(nva_start_addr)) >>> + find_vmap_lowest_match_check(root, head, size, align); >>> #endif >>> >>> return nva_start_addr; >>> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, >>> struct vmap_area *va; >>> >>> *vn_id = 0; >>> - *addr = vend; >>> + *addr = -EINVAL; >>> >>> /* >>> * Fallback to a global heap if not vmalloc or there >>> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, >>> } >>> >>> retry: >>> - if (addr == vend) { >>> + if (IS_ERR_VALUE(addr)) { >>> preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); >>> addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, >>> size, align, vstart, vend); >>> spin_unlock(&free_vmap_area_lock); >>> } >>> >>> - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); >>> + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); >>> >>> /* >>> - * If an allocation fails, the "vend" address is >>> + * If an allocation fails, the error value is >>> * returned. Therefore trigger the overflow path. >>> */ >>> - if (unlikely(addr == vend)) >>> + if (IS_ERR_VALUE(addr)) >>> goto overflow; >>> >>> va->va_start = addr; >>> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, >>> >>> ret = va_clip(&free_vmap_area_root, >>> &free_vmap_area_list, va, start, size); >>> - if (WARN_ON_ONCE(unlikely(ret))) >>> - /* It is a BUG(), but trigger recovery instead. */ >>> + if ((unlikely(ret))) { >> ^^ ^^ >> The extra parentheses are redundant and can be removed for clarity. > > You are right, I will remove it. Thanks. > Please feel free to add following in next version. Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Thanks, Shivank >> >>> + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); >>> goto recovery; >>> + } >>> >>> /* Allocated area. */ >>> va = vas[area]; >> > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 19:00 ` Shivank Garg @ 2025-04-15 22:57 ` Baoquan He 0 siblings, 0 replies; 32+ messages in thread From: Baoquan He @ 2025-04-15 22:57 UTC (permalink / raw) To: Shivank Garg; +Cc: linux-mm, akpm, urezki, linux-kernel On 04/16/25 at 12:30am, Shivank Garg wrote: > > > On 4/15/2025 6:31 PM, Baoquan He wrote: > > On 04/15/25 at 12:52pm, Shivank Garg wrote: > >> On 4/15/2025 8:09 AM, Baoquan He wrote: > >>> In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > >>> indicate if the allocation is successful or failed. That is not very clear. > >>> > >>> Here change to return explicit error values and check them to judge if > >>> allocation is successful. > >>> > >>> IS_ERR_VALUE already uses unlikely() internally > >>> > >>> Signed-off-by: Baoquan He <bhe@redhat.com> > >>> --- > >>> mm/vmalloc.c | 34 +++++++++++++++++----------------- > >>> 1 file changed, 17 insertions(+), 17 deletions(-) > >>> > >>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c > >>> index 3f38a232663b..5b21cd09b2b4 100644 > >>> --- a/mm/vmalloc.c > >>> +++ b/mm/vmalloc.c > >>> @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > >>> */ > >>> lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > >>> if (!lva) > >>> - return -1; > >>> + return -ENOMEM; > >>> } > >>> > >>> /* > >>> @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > >>> */ > >>> va->va_start = nva_start_addr + size; > >>> } else { > >>> - return -1; > >>> + return -EINVAL; > >>> } > > > > Thanks for reviewing. > > > >> > >> Braces around return -EINVAL seem unnecessary. > >> They can be dropped. > > > > This complys with the codeing style required in 3) Placing Braces and > > Spaces of Documentation/process/coding-style.rst because other branches > > are multiple statements. > > > >> > >>> > >>> if (type != FL_FIT_TYPE) { > >>> @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > >>> > >>> /* Check the "vend" restriction. */ > >>> if (nva_start_addr + size > vend) > >>> - return vend; > >>> + return -ERANGE; > >>> > >>> /* Update the free vmap_area. */ > >>> ret = va_clip(root, head, va, nva_start_addr, size); > >>> - if (WARN_ON_ONCE(ret)) > >>> - return vend; > >>> + if (ret) > >>> + return ret; > >> > >> Is it safe to remove the warning, or was it critical for debugging? > > > > This comes from a reported concern because va_clip() could be failed by > > NOTHING_FIT or kmem_cache_alloc failure. The warning here could cause > > confusion misleading people to think vmap area management is failed. > > > >> > >>> > >>> return nva_start_addr; > >>> } > >>> > >>> /* > >>> * Returns a start address of the newly allocated area, if success. > >>> - * Otherwise a vend is returned that indicates failure. > >>> + * Otherwise an error value is returned that indicates failure. 
> >>> */ > >>> static __always_inline unsigned long > >>> __alloc_vmap_area(struct rb_root *root, struct list_head *head, > >>> @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > >>> > >>> va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > >>> if (unlikely(!va)) > >>> - return vend; > >>> + return -ENOENT; > >>> > >>> nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > >>> - if (nva_start_addr == vend) > >>> - return vend; > >>> > >>> #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > >>> - find_vmap_lowest_match_check(root, head, size, align); > >>> + if (!IS_ERR_VALUE(nva_start_addr)) > >>> + find_vmap_lowest_match_check(root, head, size, align); > >>> #endif > >>> > >>> return nva_start_addr; > >>> @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > >>> struct vmap_area *va; > >>> > >>> *vn_id = 0; > >>> - *addr = vend; > >>> + *addr = -EINVAL; > >>> > >>> /* > >>> * Fallback to a global heap if not vmalloc or there > >>> @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > >>> } > >>> > >>> retry: > >>> - if (addr == vend) { > >>> + if (IS_ERR_VALUE(addr)) { > >>> preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > >>> addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > >>> size, align, vstart, vend); > >>> spin_unlock(&free_vmap_area_lock); > >>> } > >>> > >>> - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > >>> + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > >>> > >>> /* > >>> - * If an allocation fails, the "vend" address is > >>> + * If an allocation fails, the error value is > >>> * returned. Therefore trigger the overflow path. > >>> */ > >>> - if (unlikely(addr == vend)) > >>> + if (IS_ERR_VALUE(addr)) > >>> goto overflow; > >>> > >>> va->va_start = addr; > >>> @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > >>> > >>> ret = va_clip(&free_vmap_area_root, > >>> &free_vmap_area_list, va, start, size); > >>> - if (WARN_ON_ONCE(unlikely(ret))) > >>> - /* It is a BUG(), but trigger recovery instead. */ > >>> + if ((unlikely(ret))) { > >> ^^ ^^ > >> The extra parentheses are redundant and can be removed for clarity. > > > > You are right, I will remove it. Thanks. > > > > Please feel free to add following in next version. > > Reviewed-by: Shivank Garg <shivankg@amd.com> > Tested-by: Shivank Garg <shivankg@amd.com> Thanks a lot for your careful reviewing and testing. > > >> > >>> + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > >>> goto recovery; > >>> + } > >>> > >>> /* Allocated area. */ > >>> va = vas[area]; > >> > > > ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He 2025-04-15 6:44 ` Baoquan He 2025-04-15 7:22 ` Shivank Garg @ 2025-04-16 14:28 ` Uladzislau Rezki 2025-04-17 3:02 ` Baoquan He 2 siblings, 1 reply; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-16 14:28 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote: > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > indicate if the allocation is successful or failed. That is not very clear. > > Here change to return explicit error values and check them to judge if > allocation is successful. > > IS_ERR_VALUE already uses unlikely() internally > > Signed-off-by: Baoquan He <bhe@redhat.com> > --- > mm/vmalloc.c | 34 +++++++++++++++++----------------- > 1 file changed, 17 insertions(+), 17 deletions(-) > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > index 3f38a232663b..5b21cd09b2b4 100644 > --- a/mm/vmalloc.c > +++ b/mm/vmalloc.c > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > if (!lva) > - return -1; > + return -ENOMEM; > } > > /* > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > */ > va->va_start = nva_start_addr + size; > } else { > - return -1; > + return -EINVAL; > } > > if (type != FL_FIT_TYPE) { > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > /* Check the "vend" restriction. */ > if (nva_start_addr + size > vend) > - return vend; > + return -ERANGE; > > /* Update the free vmap_area. */ > ret = va_clip(root, head, va, nva_start_addr, size); > - if (WARN_ON_ONCE(ret)) > - return vend; > Not clear why you remove this WARN_ON by this patch. It should be a separate patch or just keep it as is. The warning here can mean that something is really wrong, especially if NOTHING_FIT. So we definitely want the warning. > + if (ret) > + return ret; > > return nva_start_addr; > } > > /* > * Returns a start address of the newly allocated area, if success. > - * Otherwise a vend is returned that indicates failure. > + * Otherwise an error value is returned that indicates failure. > */ > static __always_inline unsigned long > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > if (unlikely(!va)) > - return vend; > + return -ENOENT; > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > - if (nva_start_addr == vend) > - return vend; > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > - find_vmap_lowest_match_check(root, head, size, align); > + if (!IS_ERR_VALUE(nva_start_addr)) > Just keep it as it was. No need to check if addr is valid or not. 
> + find_vmap_lowest_match_check(root, head, size, align); > #endif > > return nva_start_addr; > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > struct vmap_area *va; > > *vn_id = 0; > - *addr = vend; > + *addr = -EINVAL; > > /* > * Fallback to a global heap if not vmalloc or there > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > } > > retry: > - if (addr == vend) { > + if (IS_ERR_VALUE(addr)) { > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > size, align, vstart, vend); > spin_unlock(&free_vmap_area_lock); > } > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > /* > - * If an allocation fails, the "vend" address is > + * If an allocation fails, the error value is > * returned. Therefore trigger the overflow path. > */ > - if (unlikely(addr == vend)) > + if (IS_ERR_VALUE(addr)) > goto overflow; > > va->va_start = addr; > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > ret = va_clip(&free_vmap_area_root, > &free_vmap_area_list, va, start, size); > - if (WARN_ON_ONCE(unlikely(ret))) > - /* It is a BUG(), but trigger recovery instead. */ Keep the comment. > + if ((unlikely(ret))) { > + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > goto recovery; > + } > -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
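For reference, the two warning forms being weighed in this review differ mainly in the message they print: both emit a single backtrace the first time their condition is true and evaluate to that condition, so either can drive the recovery path directly. A rough sketch of the two shapes (asm-generic/bug.h macros; the snippet is illustrative rather than the final patch):

	/* condition and warning folded into one expression */
	if (WARN_ON_ONCE(ret))
		goto recovery;

	/* separate test, with a printf-style explanation attached */
	if (unlikely(ret)) {
		WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret);
		goto recovery;
	}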
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-16 14:28 ` Uladzislau Rezki @ 2025-04-17 3:02 ` Baoquan He 2025-04-17 16:17 ` Uladzislau Rezki 0 siblings, 1 reply; 32+ messages in thread From: Baoquan He @ 2025-04-17 3:02 UTC (permalink / raw) To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel On 04/16/25 at 04:28pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote: > > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > > indicate if the allocation is successful or failed. That is not very clear. > > > > Here change to return explicit error values and check them to judge if > > allocation is successful. > > > > IS_ERR_VALUE already uses unlikely() internally > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > --- > > mm/vmalloc.c | 34 +++++++++++++++++----------------- > > 1 file changed, 17 insertions(+), 17 deletions(-) > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > index 3f38a232663b..5b21cd09b2b4 100644 > > --- a/mm/vmalloc.c > > +++ b/mm/vmalloc.c > > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > */ > > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > > if (!lva) > > - return -1; > > + return -ENOMEM; > > } > > > > /* > > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > */ > > va->va_start = nva_start_addr + size; > > } else { > > - return -1; > > + return -EINVAL; > > } > > > > if (type != FL_FIT_TYPE) { > > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > > > /* Check the "vend" restriction. */ > > if (nva_start_addr + size > vend) > > - return vend; > > + return -ERANGE; > > > > /* Update the free vmap_area. */ > > ret = va_clip(root, head, va, nva_start_addr, size); > > - if (WARN_ON_ONCE(ret)) > > - return vend; > > > Not clear why you remove this WARN_ON by this patch. It should be > a separate patch or just keep it as is. The warning here can mean > that something is really wrong, especially if NOTHING_FIT. So we > definitely want the warning. I remember one time someone reported that the slab allocation failure triggered this warning which is confusing to them. But yes, it should be discussed in a separate post or thread, not appropriate to remove it silently. I will add it back in v2. > > > + if (ret) > > + return ret; > > > > return nva_start_addr; > > } > > > > /* > > * Returns a start address of the newly allocated area, if success. > > - * Otherwise a vend is returned that indicates failure. > > + * Otherwise an error value is returned that indicates failure. > > */ > > static __always_inline unsigned long > > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > > if (unlikely(!va)) > > - return vend; > > + return -ENOENT; > > > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > > - if (nva_start_addr == vend) > > - return vend; > > > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > > - find_vmap_lowest_match_check(root, head, size, align); > > + if (!IS_ERR_VALUE(nva_start_addr)) > > > Just keep it as it was. No need to check if addr is valid or not. This is to keep consistent with the old code. Before this patch, if va_alloc() return vend, it returns directly, no find_vmap_lowest_match_check() invocation is done. I tried to keep the behaviour unchanged. 
That code is for debugging, both is fine to me. > > > + find_vmap_lowest_match_check(root, head, size, align); > > #endif > > > > return nva_start_addr; > > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > > struct vmap_area *va; > > > > *vn_id = 0; > > - *addr = vend; > > + *addr = -EINVAL; > > > > /* > > * Fallback to a global heap if not vmalloc or there > > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > > } > > > > retry: > > - if (addr == vend) { > > + if (IS_ERR_VALUE(addr)) { > > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > > size, align, vstart, vend); > > spin_unlock(&free_vmap_area_lock); > > } > > > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > > > /* > > - * If an allocation fails, the "vend" address is > > + * If an allocation fails, the error value is > > * returned. Therefore trigger the overflow path. > > */ > > - if (unlikely(addr == vend)) > > + if (IS_ERR_VALUE(addr)) > > goto overflow; > > > > va->va_start = addr; > > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > > > ret = va_clip(&free_vmap_area_root, > > &free_vmap_area_list, va, start, size); > > - if (WARN_ON_ONCE(unlikely(ret))) > > - /* It is a BUG(), but trigger recovery instead. */ > Keep the comment. OK, will add it back. > > > + if ((unlikely(ret))) { > > + WARN_ONCE(1, "%s error: errno (%d)\n", __func__, ret); > > goto recovery; > > + } > > > > -- > Uladzislau Rezki > ^ permalink raw reply [flat|nested] 32+ messages in thread
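As context for keeping the behaviour identical: the check in question is compiled out unless the DEBUG_AUGMENT_LOWEST_MATCH_CHECK macro near the top of mm/vmalloc.c has been flipped to 1 by hand (it defaults to 0 in the mainline source, though that should be verified against the tree in question), so the !IS_ERR_VALUE() guard only matters to someone debugging the free-area augmentation. The guarded form from the patch, for reference:

	#if DEBUG_AUGMENT_LOWEST_MATCH_CHECK
		/* run the self-check only when a real address came back */
		if (!IS_ERR_VALUE(nva_start_addr))
			find_vmap_lowest_match_check(root, head, size, align);
	#endif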
* Re: [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() 2025-04-17 3:02 ` Baoquan He @ 2025-04-17 16:17 ` Uladzislau Rezki 0 siblings, 0 replies; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-17 16:17 UTC (permalink / raw) To: Baoquan He; +Cc: Uladzislau Rezki, linux-mm, akpm, linux-kernel On Thu, Apr 17, 2025 at 11:02:06AM +0800, Baoquan He wrote: > On 04/16/25 at 04:28pm, Uladzislau Rezki wrote: > > On Tue, Apr 15, 2025 at 10:39:52AM +0800, Baoquan He wrote: > > > In codes of alloc_vmap_area(), it returns the upper bound 'vend' to > > > indicate if the allocation is successful or failed. That is not very clear. > > > > > > Here change to return explicit error values and check them to judge if > > > allocation is successful. > > > > > > IS_ERR_VALUE already uses unlikely() internally > > > > > > Signed-off-by: Baoquan He <bhe@redhat.com> > > > --- > > > mm/vmalloc.c | 34 +++++++++++++++++----------------- > > > 1 file changed, 17 insertions(+), 17 deletions(-) > > > > > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c > > > index 3f38a232663b..5b21cd09b2b4 100644 > > > --- a/mm/vmalloc.c > > > +++ b/mm/vmalloc.c > > > @@ -1715,7 +1715,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > > */ > > > lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT); > > > if (!lva) > > > - return -1; > > > + return -ENOMEM; > > > } > > > > > > /* > > > @@ -1729,7 +1729,7 @@ va_clip(struct rb_root *root, struct list_head *head, > > > */ > > > va->va_start = nva_start_addr + size; > > > } else { > > > - return -1; > > > + return -EINVAL; > > > } > > > > > > if (type != FL_FIT_TYPE) { > > > @@ -1758,19 +1758,19 @@ va_alloc(struct vmap_area *va, > > > > > > /* Check the "vend" restriction. */ > > > if (nva_start_addr + size > vend) > > > - return vend; > > > + return -ERANGE; > > > > > > /* Update the free vmap_area. */ > > > ret = va_clip(root, head, va, nva_start_addr, size); > > > - if (WARN_ON_ONCE(ret)) > > > - return vend; > > > > > Not clear why you remove this WARN_ON by this patch. It should be > > a separate patch or just keep it as is. The warning here can mean > > that something is really wrong, especially if NOTHING_FIT. So we > > definitely want the warning. > > I remember one time someone reported that the slab allocation failure > triggered this warning which is confusing to them. But yes, it should be > discussed in a separate post or thread, not appropriate to remove it > silently. I will add it back in v2. > Thanks! > > > > > + if (ret) > > > + return ret; > > > > > > return nva_start_addr; > > > } > > > > > > /* > > > * Returns a start address of the newly allocated area, if success. > > > - * Otherwise a vend is returned that indicates failure. > > > + * Otherwise an error value is returned that indicates failure. > > > */ > > > static __always_inline unsigned long > > > __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > > @@ -1795,14 +1795,13 @@ __alloc_vmap_area(struct rb_root *root, struct list_head *head, > > > > > > va = find_vmap_lowest_match(root, size, align, vstart, adjust_search_size); > > > if (unlikely(!va)) > > > - return vend; > > > + return -ENOENT; > > > > > > nva_start_addr = va_alloc(va, root, head, size, align, vstart, vend); > > > - if (nva_start_addr == vend) > > > - return vend; > > > > > > #if DEBUG_AUGMENT_LOWEST_MATCH_CHECK > > > - find_vmap_lowest_match_check(root, head, size, align); > > > + if (!IS_ERR_VALUE(nva_start_addr)) > > > > > Just keep it as it was. No need to check if addr is valid or not. 
> > This is to keep consistent with the old code. Before this patch, if > va_alloc() return vend, it returns directly, no > find_vmap_lowest_match_check() invocation is done. I tried to keep the > behaviour unchanged. That code is for debugging, both is fine to me. > Ack. Makes sense to keep same behaviour as it was/is. > > > > > + find_vmap_lowest_match_check(root, head, size, align); > > > #endif > > > > > > return nva_start_addr; > > > @@ -1932,7 +1931,7 @@ node_alloc(unsigned long size, unsigned long align, > > > struct vmap_area *va; > > > > > > *vn_id = 0; > > > - *addr = vend; > > > + *addr = -EINVAL; > > > > > > /* > > > * Fallback to a global heap if not vmalloc or there > > > @@ -2012,20 +2011,20 @@ static struct vmap_area *alloc_vmap_area(unsigned long size, > > > } > > > > > > retry: > > > - if (addr == vend) { > > > + if (IS_ERR_VALUE(addr)) { > > > preload_this_cpu_lock(&free_vmap_area_lock, gfp_mask, node); > > > addr = __alloc_vmap_area(&free_vmap_area_root, &free_vmap_area_list, > > > size, align, vstart, vend); > > > spin_unlock(&free_vmap_area_lock); > > > } > > > > > > - trace_alloc_vmap_area(addr, size, align, vstart, vend, addr == vend); > > > + trace_alloc_vmap_area(addr, size, align, vstart, vend, IS_ERR_VALUE(addr)); > > > > > > /* > > > - * If an allocation fails, the "vend" address is > > > + * If an allocation fails, the error value is > > > * returned. Therefore trigger the overflow path. > > > */ > > > - if (unlikely(addr == vend)) > > > + if (IS_ERR_VALUE(addr)) > > > goto overflow; > > > > > > va->va_start = addr; > > > @@ -4753,9 +4752,10 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets, > > > > > > ret = va_clip(&free_vmap_area_root, > > > &free_vmap_area_list, va, start, size); > > > - if (WARN_ON_ONCE(unlikely(ret))) > > > - /* It is a BUG(), but trigger recovery instead. */ > > Keep the comment. > > OK, will add it back. > Thank you. -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements 2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He ` (4 preceding siblings ...) 2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He @ 2025-04-15 15:29 ` Uladzislau Rezki 2025-04-15 22:55 ` Baoquan He 2025-04-15 19:06 ` Shivank Garg 6 siblings, 1 reply; 32+ messages in thread From: Uladzislau Rezki @ 2025-04-15 15:29 UTC (permalink / raw) To: Baoquan He; +Cc: linux-mm, akpm, urezki, linux-kernel On Tue, Apr 15, 2025 at 10:39:47AM +0800, Baoquan He wrote: > These were made from code inspection in mm/vmalloc.c. > > Baoquan He (5): > mm/vmalloc.c: change purge_ndoes as local static variable > mm/vmalloc.c: find the vmap of vmap_nodes in reverse order > mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit > mm/vmalloc: optimize function vm_unmap_aliases() > mm/vmalloc.c: return explicit error value in alloc_vmap_area() > > mm/vmalloc.c | 68 +++++++++++++++++++++++++--------------------------- > 1 file changed, 32 insertions(+), 36 deletions(-) > > -- > 2.41.0 > I have review some patches, the rest i will check tomorrow! -- Uladzislau Rezki ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements 2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki @ 2025-04-15 22:55 ` Baoquan He 0 siblings, 0 replies; 32+ messages in thread From: Baoquan He @ 2025-04-15 22:55 UTC (permalink / raw) To: Uladzislau Rezki; +Cc: linux-mm, akpm, linux-kernel On 04/15/25 at 05:29pm, Uladzislau Rezki wrote: > On Tue, Apr 15, 2025 at 10:39:47AM +0800, Baoquan He wrote: > > These were made from code inspection in mm/vmalloc.c. > > > > Baoquan He (5): > > mm/vmalloc.c: change purge_ndoes as local static variable > > mm/vmalloc.c: find the vmap of vmap_nodes in reverse order > > mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit > > mm/vmalloc: optimize function vm_unmap_aliases() > > mm/vmalloc.c: return explicit error value in alloc_vmap_area() > > > > mm/vmalloc.c | 68 +++++++++++++++++++++++++--------------------------- > > 1 file changed, 32 insertions(+), 36 deletions(-) > > > > -- > > 2.41.0 > > > I have review some patches, the rest i will check tomorrow! Thanks a lot for your quick and careful reviewing. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements 2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He ` (5 preceding siblings ...) 2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki @ 2025-04-15 19:06 ` Shivank Garg 6 siblings, 0 replies; 32+ messages in thread From: Shivank Garg @ 2025-04-15 19:06 UTC (permalink / raw) To: Baoquan He, linux-mm; +Cc: akpm, urezki, linux-kernel On 4/15/2025 8:09 AM, Baoquan He wrote: > These were made from code inspection in mm/vmalloc.c. > > Baoquan He (5): > mm/vmalloc.c: change purge_ndoes as local static variable > mm/vmalloc.c: find the vmap of vmap_nodes in reverse order > mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit > mm/vmalloc: optimize function vm_unmap_aliases() > mm/vmalloc.c: return explicit error value in alloc_vmap_area() > > mm/vmalloc.c | 68 +++++++++++++++++++++++++--------------------------- > 1 file changed, 32 insertions(+), 36 deletions(-) > I've tested this patch-series on my AMD EPYC ZEN 3 machine using 1. Using lib/test_vmalloc.c: modprobe test_vmalloc 2. stress-ng: stress-ng --class memory,vm --seq 0 I did not find any issues. Reviewed-by: Shivank Garg <shivankg@amd.com> Tested-by: Shivank Garg <shivankg@amd.com> Thanks, Shivank ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2025-04-17 16:19 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-04-15 2:39 [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Baoquan He 2025-04-15 2:39 ` [PATCH 1/5] mm/vmalloc.c: change purge_ndoes as local static variable Baoquan He 2025-04-15 10:47 ` Uladzislau Rezki 2025-04-15 19:08 ` Shivank Garg 2025-04-15 23:53 ` Vishal Moola (Oracle) 2025-04-15 2:39 ` [PATCH 2/5] mm/vmalloc.c: find the vmap of vmap_nodes in reverse order Baoquan He 2025-04-15 15:25 ` Uladzislau Rezki 2025-04-15 23:41 ` Baoquan He 2025-04-15 19:09 ` Shivank Garg 2025-04-15 2:39 ` [PATCH 3/5] mm/vmalloc.c: optimize code in decay_va_pool_node() a little bit Baoquan He 2025-04-15 10:29 ` Shivank Garg 2025-04-15 14:05 ` Baoquan He 2025-04-15 19:02 ` Shivank Garg 2025-04-16 13:50 ` Uladzislau Rezki 2025-04-17 2:51 ` Baoquan He 2025-04-17 16:18 ` Uladzislau Rezki 2025-04-15 2:39 ` [PATCH 4/5] mm/vmalloc: optimize function vm_unmap_aliases() Baoquan He 2025-04-15 10:44 ` Uladzislau Rezki 2025-04-15 19:10 ` Shivank Garg 2025-04-15 23:54 ` Vishal Moola (Oracle) 2025-04-15 2:39 ` [PATCH 5/5] mm/vmalloc.c: return explicit error value in alloc_vmap_area() Baoquan He 2025-04-15 6:44 ` Baoquan He 2025-04-15 7:22 ` Shivank Garg 2025-04-15 13:01 ` Baoquan He 2025-04-15 19:00 ` Shivank Garg 2025-04-15 22:57 ` Baoquan He 2025-04-16 14:28 ` Uladzislau Rezki 2025-04-17 3:02 ` Baoquan He 2025-04-17 16:17 ` Uladzislau Rezki 2025-04-15 15:29 ` [PATCH 0/5] mm/vmalloc.c: code cleanup and improvements Uladzislau Rezki 2025-04-15 22:55 ` Baoquan He 2025-04-15 19:06 ` Shivank Garg