From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
Muhammad Usama Anjum <usama.anjum@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Uladzislau Rezki <urezki@gmail.com>,
Nick Terrell <terrelln@fb.com>, David Sterba <dsterba@suse.com>,
"Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, Ryan.Roberts@arm.com,
david.hildenbrand@arm.com
Subject: Re: [PATCH v2 2/3] vmalloc: Optimize vfree
Date: Fri, 20 Mar 2026 15:33:46 +0100
Message-ID: <38379a16-d596-4266-ac3b-1dbee5356add@kernel.org>
In-Reply-To: <b7959d34-6e7d-45c2-a674-f32afd057068@kernel.org>

On 3/20/26 09:39, David Hildenbrand (Arm) wrote:
> On 3/16/26 16:49, Vlastimil Babka wrote:
>>> mm/vmalloc.c | 34 +++++++++++++++++++++++++---------
>>> 1 file changed, 25 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index c607307c657a6..8b935395fb068 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3459,18 +3459,34 @@ void vfree(const void *addr)
>>>
>>> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
>>> vm_reset_perms(vm);
>>> - for (i = 0; i < vm->nr_pages; i++) {
>>> - struct page *page = vm->pages[i];
>>> +
>>> + if (vm->nr_pages) {
>>> + bool account = !(vm->flags & VM_MAP_PUT_PAGES);
>>> + unsigned long start_pfn, pfn;
>>> + struct page *page = vm->pages[0];
>>> + int nr = 1;
>>>
>>> BUG_ON(!page);
>>> - /*
>>> - * High-order allocs for huge vmallocs are split, so
>>> - * can be freed as an array of order-0 allocations
>>> - */
>>> - if (!(vm->flags & VM_MAP_PUT_PAGES))
>>> + start_pfn = page_to_pfn(page);
>>> + if (account)
>>> mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>> - __free_page(page);
>>> - cond_resched();
>>> +
>>> + for (i = 1; i < vm->nr_pages; i++) {
>>> + page = vm->pages[i];
>>> + BUG_ON(!page);
>>
>> We shouldn't be adding BUG_ON()'s. Rather demote also the pre-existing one
>> to VM_WARN_ON_ONCE() and skip gracefully.
>>
>>> + if (account)
>>> + mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>
>> I think we should be able to batch this too to use "nr"?
>
> Are we sure that pages cannot cross nodes etc? It could happen that we
> have a contig range that spans zones/nodes/etc ...

Hmm, a single order-3 allocation can't, but we could be unlucky and get the
last order-3 from zone X and the first order-3 from the adjacent zone Y.
In that case the loop would also need to check for the same zone/node.

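To make the zone/node concern concrete, here is a rough userspace sketch (not kernel code) of how a batching loop might decide where one batch ends. `struct fake_page` and `contig_batch_len()` are made-up names for illustration only; a real version would presumably work on `struct page` and compare the results of something like `page_to_nid()` and `page_zonenum()`, which is an assumption rather than anything in the posted patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct page: only the fields the sketch needs. */
struct fake_page {
	unsigned long pfn;	/* page frame number */
	int nid;		/* NUMA node id */
	int zone;		/* zone index within the node */
};

/*
 * Return how many pages starting at pages[start] form one batch: the PFNs
 * are consecutive and every page is in the same node and zone as the first.
 */
static size_t contig_batch_len(struct fake_page *const *pages,
			       size_t nr_pages, size_t start)
{
	const struct fake_page *first = pages[start];
	size_t n = 1;

	while (start + n < nr_pages) {
		const struct fake_page *p = pages[start + n];

		if (p->pfn != first->pfn + n)
			break;	/* physical gap ends the batch */
		if (p->nid != first->nid || p->zone != first->zone)
			break;	/* consecutive PFN, but another zone/node */
		n++;
	}
	return n;
}
```

With such a helper, the per-page accounting call could be issued once per batch with a count of `-n`, since all pages in a batch would share the same node.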
> Anyhow, should we try to decouple both things, providing a
> core-mm function to do the page freeing?
>
> We do have something similar, optimized unpinning of large folios,
> in unpin_user_pages_dirty_lock(). This here is a bit different.
>
>
> So what I am thinking about for this code here to do:
>
> if (!(vm->flags & VM_MAP_PUT_PAGES)) {
> 	for (i = 0; i < vm->nr_pages; i++)
> 		mod_lruvec_page_state(vm->pages[i], NR_VMALLOC, -1);
> }
> free_pages_bulk(vm->pages, vm->nr_pages);
>
>
> We could optimize the first loop to do batching where possible as well.
>
>
> free_pages_bulk() would match alloc_pages_bulk()
>
> void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>
> Internally we'd do the contig handling.
>
> Was that already discussed?

AFAIU, some of Zi's replies hinted at this direction. It would make sense, yeah.
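
For illustration, the "contig handling done internally" idea could look roughly like the userspace model below. It is only a sketch under stated assumptions: it takes plain PFNs instead of `struct page` pointers, and `free_pfns_bulk()`, `free_range_fn` and `count_pages()` are hypothetical names, not the proposed `free_pages_bulk()` itself nor any existing kernel API.

```c
#include <stddef.h>

/* Hypothetical callback standing in for a "free this PFN range" helper. */
typedef void (*free_range_fn)(unsigned long start_pfn, size_t nr, void *ctx);

/*
 * Userspace model of a bulk free: walk the array once and hand each
 * physically contiguous run of PFNs to free_range() in a single call.
 * Returns the number of free_range() calls that were made.
 */
static size_t free_pfns_bulk(const unsigned long *pfns, size_t nr_pages,
			     free_range_fn free_range, void *ctx)
{
	size_t i = 0, calls = 0;

	while (i < nr_pages) {
		size_t nr = 1;

		/* Extend the run while the next PFN directly follows. */
		while (i + nr < nr_pages && pfns[i + nr] == pfns[i] + nr)
			nr++;

		free_range(pfns[i], nr, ctx);
		calls++;
		i += nr;
	}
	return calls;
}

/* Example callback: adds up how many pages were handed over. */
static void count_pages(unsigned long start_pfn, size_t nr, void *ctx)
{
	(void)start_pfn;
	*(size_t *)ctx += nr;
}
```

The caller only sees "array in, pages freed", so vfree() would not need to know about contiguity at all; the zone/node check discussed above is left out here and would have to be added to the run detection.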

Thread overview: 20+ messages
2026-03-16 11:31 [PATCH v2 0/3] mm: Free contiguous order-0 pages efficiently Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 1/3] mm/page_alloc: Optimize free_contig_range() Muhammad Usama Anjum
2026-03-16 15:21 ` Vlastimil Babka
2026-03-16 16:02 ` Zi Yan
2026-03-16 16:19 ` Vlastimil Babka (SUSE)
2026-03-17 15:17 ` Zi Yan
2026-03-17 18:48 ` Vlastimil Babka (SUSE)
2026-03-19 22:07 ` David Hildenbrand (Arm)
2026-03-20 8:20 ` Vlastimil Babka (SUSE)
2026-03-20 12:46 ` Zi Yan
2026-03-16 16:11 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 2/3] vmalloc: Optimize vfree Muhammad Usama Anjum
2026-03-16 15:49 ` Vlastimil Babka
2026-03-17 9:36 ` Muhammad Usama Anjum
2026-03-20 8:39 ` David Hildenbrand (Arm)
2026-03-20 14:33 ` Vlastimil Babka (SUSE) [this message]
2026-03-23 11:28 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 3/3] mm/page_alloc: Optimize __free_contig_frozen_range() Muhammad Usama Anjum
2026-03-16 16:22 ` Vlastimil Babka
2026-03-20 14:26 ` Zi Yan