From: Uladzislau Rezki <urezki@gmail.com>
To: Lorenzo Stoakes <lstoakes@gmail.com>
Cc: "Uladzislau Rezki (Sony)" <urezki@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>,
Baoquan He <bhe@redhat.com>,
Christoph Hellwig <hch@infradead.org>,
Matthew Wilcox <willy@infradead.org>,
Nicholas Piggin <npiggin@gmail.com>,
Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>,
Roman Gushchin <roman.gushchin@linux.dev>
Subject: Re: [PATCH v3 1/3] mm: vmalloc: Avoid calling __find_vmap_area() twice in __vunmap()
Date: Fri, 23 Dec 2022 17:42:30 +0100 [thread overview]
Message-ID: <Y6XaduXdrA6IqEmI@pc636> (raw)
In-Reply-To: <Y6S4s2PAg6AjYKbP@lucifer>
On Thu, Dec 22, 2022 at 08:06:11PM +0000, Lorenzo Stoakes wrote:
> n Thu, Dec 22, 2022 at 08:00:20PM +0100, Uladzislau Rezki (Sony) wrote:
> > Currently the __vunmap() path calls __find_vmap_area() twice. Once on
> > entry to check that the area exists, then inside the remove_vm_area()
> > function which also performs a new search for the VA.
> >
> > In order to improvie it from a performance point of view we split
> > remove_vm_area() into two new parts:
> > - find_unlink_vmap_area() that does a search and unlink from tree;
> > - __remove_vm_area() that removes without searching.
> >
> > In this case there is no any functional change for remove_vm_area()
> > whereas vm_remove_mappings(), where a second search happens, switches
> > to the __remove_vm_area() variant where the already detached VA is
> > passed as a parameter, so there is no need to find it again.
> >
> > Performance wise, i use test_vmalloc.sh with 32 threads doing alloc
> > free on a 64-CPUs-x86_64-box:
> >
> > perf without this patch:
> > - 31.41% 0.50% vmalloc_test/10 [kernel.vmlinux] [k] __vunmap
> > - 30.92% __vunmap
> > - 17.67% _raw_spin_lock
> > native_queued_spin_lock_slowpath
> > - 12.33% remove_vm_area
> > - 11.79% free_vmap_area_noflush
> > - 11.18% _raw_spin_lock
> > native_queued_spin_lock_slowpath
> > 0.76% free_unref_page
> >
> > perf with this patch:
> > - 11.35% 0.13% vmalloc_test/14 [kernel.vmlinux] [k] __vunmap
> > - 11.23% __vunmap
> > - 8.28% find_unlink_vmap_area
> > - 7.95% _raw_spin_lock
> > 7.44% native_queued_spin_lock_slowpath
> > - 1.93% free_vmap_area_noflush
> > - 0.56% _raw_spin_lock
> > 0.53% native_queued_spin_lock_slowpath
> > 0.60% __vunmap_range_noflush
> >
> > __vunmap() consumes around ~20% less CPU cycles on this test.
> >
> > v2 -> v3:
> > - update commit message;
> > - rename the vm_remove_mappings() to the va_remove_mappings();
> > - move va-unlinking to the callers so the free_vmap_area_noflush()
> > now expects a VA that has been disconnected;
> > - eliminate a local variable in the remove_vm_area().
> >
> > Reported-by: Roman Gushchin <roman.gushchin@linux.dev>
> > Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > ---
> > mm/vmalloc.c | 77 ++++++++++++++++++++++++++++++++--------------------
> > 1 file changed, 47 insertions(+), 30 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 9e30f0b39203..eb91ecaa7277 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -1815,9 +1815,9 @@ static void drain_vmap_area_work(struct work_struct *work)
> > }
> >
> > /*
> > - * Free a vmap area, caller ensuring that the area has been unmapped
> > - * and flush_cache_vunmap had been called for the correct range
> > - * previously.
> > + * Free a vmap area, caller ensuring that the area has been unmapped,
> > + * unlinked and flush_cache_vunmap had been called for the correct
> > + * range previously.
> > */
> > static void free_vmap_area_noflush(struct vmap_area *va)
> > {
> > @@ -1825,9 +1825,8 @@ static void free_vmap_area_noflush(struct vmap_area *va)
> > unsigned long va_start = va->va_start;
> > unsigned long nr_lazy;
> >
> > - spin_lock(&vmap_area_lock);
> > - unlink_va(va, &vmap_area_root);
> > - spin_unlock(&vmap_area_lock);
> > + if (WARN_ON_ONCE(!list_empty(&va->list)))
> > + return;
> >
> > nr_lazy = atomic_long_add_return((va->va_end - va->va_start) >>
> > PAGE_SHIFT, &vmap_lazy_nr);
> > @@ -1871,6 +1870,19 @@ struct vmap_area *find_vmap_area(unsigned long addr)
> > return va;
> > }
> >
> > +static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
> > +{
> > + struct vmap_area *va;
> > +
> > + spin_lock(&vmap_area_lock);
> > + va = __find_vmap_area(addr, &vmap_area_root);
> > + if (va)
> > + unlink_va(va, &vmap_area_root);
> > + spin_unlock(&vmap_area_lock);
> > +
> > + return va;
> > +}
> > +
> > /*** Per cpu kva allocator ***/
> >
> > /*
> > @@ -2015,6 +2027,10 @@ static void free_vmap_block(struct vmap_block *vb)
> > tmp = xa_erase(&vmap_blocks, addr_to_vb_idx(vb->va->va_start));
> > BUG_ON(tmp != vb);
> >
> > + spin_lock(&vmap_area_lock);
> > + unlink_va(vb->va, &vmap_area_root);
> > + spin_unlock(&vmap_area_lock);
> > +
> > free_vmap_area_noflush(vb->va);
> > kfree_rcu(vb, rcu_head);
> > }
> > @@ -2591,6 +2607,20 @@ struct vm_struct *find_vm_area(const void *addr)
> > return va->vm;
> > }
> >
> > +static struct vm_struct *__remove_vm_area(struct vmap_area *va)
> > +{
> > + struct vm_struct *vm;
> > +
> > + if (!va || !va->vm)
> > + return NULL;
> > +
> > + vm = va->vm;
> > + kasan_free_module_shadow(vm);
> > + free_unmap_vmap_area(va);
> > +
> > + return vm;
> > +}
> > +
> > /**
> > * remove_vm_area - find and remove a continuous kernel virtual area
> > * @addr: base address
> > @@ -2603,26 +2633,10 @@ struct vm_struct *find_vm_area(const void *addr)
> > */
> > struct vm_struct *remove_vm_area(const void *addr)
> > {
> > - struct vmap_area *va;
> > -
> > might_sleep();
> >
> > - spin_lock(&vmap_area_lock);
> > - va = __find_vmap_area((unsigned long)addr, &vmap_area_root);
> > - if (va && va->vm) {
> > - struct vm_struct *vm = va->vm;
> > -
> > - va->vm = NULL;
> > - spin_unlock(&vmap_area_lock);
> > -
> > - kasan_free_module_shadow(vm);
> > - free_unmap_vmap_area(va);
> > -
> > - return vm;
> > - }
> > -
> > - spin_unlock(&vmap_area_lock);
> > - return NULL;
> > + return __remove_vm_area(
> > + find_unlink_vmap_area((unsigned long) addr));
> > }
> >
> > static inline void set_area_direct_map(const struct vm_struct *area,
> > @@ -2636,16 +2650,17 @@ static inline void set_area_direct_map(const struct vm_struct *area,
> > set_direct_map(area->pages[i]);
> > }
> >
> > -/* Handle removing and resetting vm mappings related to the vm_struct. */
> > -static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
> > +/* Handle removing and resetting vm mappings related to the VA's vm_struct. */
> > +static void va_remove_mappings(struct vmap_area *va, int deallocate_pages)
> > {
> > + struct vm_struct *area = va->vm;
> > unsigned long start = ULONG_MAX, end = 0;
> > unsigned int page_order = vm_area_page_order(area);
> > int flush_reset = area->flags & VM_FLUSH_RESET_PERMS;
> > int flush_dmap = 0;
> > int i;
> >
> > - remove_vm_area(area->addr);
> > + __remove_vm_area(va);
> >
> > /* If this is not VM_FLUSH_RESET_PERMS memory, no need for the below. */
> > if (!flush_reset)
> > @@ -2690,6 +2705,7 @@ static void vm_remove_mappings(struct vm_struct *area, int deallocate_pages)
> > static void __vunmap(const void *addr, int deallocate_pages)
> > {
> > struct vm_struct *area;
> > + struct vmap_area *va;
> >
> > if (!addr)
> > return;
> > @@ -2698,19 +2714,20 @@ static void __vunmap(const void *addr, int deallocate_pages)
> > addr))
> > return;
> >
> > - area = find_vm_area(addr);
> > - if (unlikely(!area)) {
> > + va = find_unlink_vmap_area((unsigned long)addr);
> > + if (unlikely(!va)) {
> > WARN(1, KERN_ERR "Trying to vfree() nonexistent vm area (%p)\n",
> > addr);
> > return;
> > }
> >
> > + area = va->vm;
> > debug_check_no_locks_freed(area->addr, get_vm_area_size(area));
> > debug_check_no_obj_freed(area->addr, get_vm_area_size(area));
> >
> > kasan_poison_vmalloc(area->addr, get_vm_area_size(area));
> >
> > - vm_remove_mappings(area, deallocate_pages);
> > + va_remove_mappings(va, deallocate_pages);
> >
> > if (deallocate_pages) {
> > int i;
> > --
> > 2.30.2
> >
>
> All looks good to me! Great work! Feel free to add the below to all patches in series:-
>
> Reviewed-by: Lorenzo Stoakes <lstoakes@gmail.com>
>
Added :)
Thank you for review!
--
Uladzislau Rezki
next prev parent reply other threads:[~2022-12-23 16:42 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-12-22 19:00 [PATCH v3 1/3] mm: vmalloc: Avoid calling __find_vmap_area() twice in __vunmap() Uladzislau Rezki (Sony)
2022-12-22 19:00 ` [PATCH v3 2/3] mm: vmalloc: Switch to find_unlink_vmap_area() in vm_unmap_ram() Uladzislau Rezki (Sony)
2022-12-23 8:21 ` Christoph Hellwig
2022-12-23 16:41 ` Uladzislau Rezki
2022-12-28 23:47 ` Andrew Morton
2022-12-29 12:47 ` Uladzislau Rezki
2022-12-29 23:17 ` Andrew Morton
2022-12-31 9:17 ` Uladzislau Rezki
2022-12-22 19:00 ` [PATCH v3 3/3] mm: vmalloc: Replace BUG_ON() by WARN_ON_ONCE() Uladzislau Rezki (Sony)
2022-12-23 8:21 ` Christoph Hellwig
2022-12-22 20:06 ` [PATCH v3 1/3] mm: vmalloc: Avoid calling __find_vmap_area() twice in __vunmap() Lorenzo Stoakes
2022-12-23 16:42 ` Uladzislau Rezki [this message]
2022-12-23 8:19 ` Christoph Hellwig
2022-12-23 16:43 ` Uladzislau Rezki
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=Y6XaduXdrA6IqEmI@pc636 \
--to=urezki@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=bhe@redhat.com \
--cc=hch@infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lstoakes@gmail.com \
--cc=npiggin@gmail.com \
--cc=oleksiy.avramchenko@sony.com \
--cc=roman.gushchin@linux.dev \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.