From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: "David Hildenbrand (Arm)" <david@kernel.org>,
Muhammad Usama Anjum <usama.anjum@arm.com>,
Andrew Morton <akpm@linux-foundation.org>,
Lorenzo Stoakes <ljs@kernel.org>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Mike Rapoport <rppt@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Michal Hocko <mhocko@suse.com>,
Brendan Jackman <jackmanb@google.com>,
Johannes Weiner <hannes@cmpxchg.org>, Zi Yan <ziy@nvidia.com>,
Uladzislau Rezki <urezki@gmail.com>,
Nick Terrell <terrelln@fb.com>, David Sterba <dsterba@suse.com>,
"Vishal Moola (Oracle)" <vishal.moola@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
bpf@vger.kernel.org, Ryan.Roberts@arm.com,
david.hildenbrand@arm.com
Subject: Re: [PATCH v2 2/3] vmalloc: Optimize vfree
Date: Fri, 20 Mar 2026 15:33:46 +0100
Message-ID: <38379a16-d596-4266-ac3b-1dbee5356add@kernel.org>
In-Reply-To: <b7959d34-6e7d-45c2-a674-f32afd057068@kernel.org>
On 3/20/26 09:39, David Hildenbrand (Arm) wrote:
> On 3/16/26 16:49, Vlastimil Babka wrote:
>>> mm/vmalloc.c | 34 +++++++++++++++++++++++++---------
>>> 1 file changed, 25 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index c607307c657a6..8b935395fb068 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3459,18 +3459,34 @@ void vfree(const void *addr)
>>>
>>> if (unlikely(vm->flags & VM_FLUSH_RESET_PERMS))
>>> vm_reset_perms(vm);
>>> - for (i = 0; i < vm->nr_pages; i++) {
>>> - struct page *page = vm->pages[i];
>>> +
>>> + if (vm->nr_pages) {
>>> + bool account = !(vm->flags & VM_MAP_PUT_PAGES);
>>> + unsigned long start_pfn, pfn;
>>> + struct page *page = vm->pages[0];
>>> + int nr = 1;
>>>
>>> BUG_ON(!page);
>>> - /*
>>> - * High-order allocs for huge vmallocs are split, so
>>> - * can be freed as an array of order-0 allocations
>>> - */
>>> - if (!(vm->flags & VM_MAP_PUT_PAGES))
>>> + start_pfn = page_to_pfn(page);
>>> + if (account)
>>> mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>> - __free_page(page);
>>> - cond_resched();
>>> +
>>> + for (i = 1; i < vm->nr_pages; i++) {
>>> + page = vm->pages[i];
>>> + BUG_ON(!page);
>>
>> We shouldn't be adding BUG_ON()'s. Rather demote also the pre-existing one
>> to VM_WARN_ON_ONCE() and skip gracefully.
>>
>>> + if (account)
>>> + mod_lruvec_page_state(page, NR_VMALLOC, -1);
>>
>> I think we should be able to batch this too to use "nr"?
>
> Are we sure that pages cannot cross nodes etc? It could happen that we
> have a contig range that spans zones/nodes/etc ...

Hmm, a single order-3 allocation can't, but we could be unlucky and get the
last order-3 block from zone X and the first order-3 block from the adjacent
zone Y. In that case the loop would also need to check that consecutive pages
belong to the same zone/node.
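
To illustrate the extra check, here is a minimal userspace sketch (not kernel
code). struct fake_page, zone_id and both helper names are made up for
illustration; in the kernel the comparison would go through whatever
zone/node lookup is appropriate. A run only continues while the PFN increases
by one and the zone matches the first page of the run:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page: only what this sketch needs. */
struct fake_page {
	unsigned long pfn;	/* page frame number */
	int zone_id;		/* assumed stand-in for a zone/node lookup */
};

/*
 * Length of the run starting at pages[start]: consecutive entries whose
 * PFNs increase by one and that share the zone of the first entry.
 */
static size_t contig_run_len(const struct fake_page *pages, size_t nr_pages,
			     size_t start)
{
	size_t len = 1;

	assert(start < nr_pages);
	while (start + len < nr_pages &&
	       pages[start + len].pfn == pages[start].pfn + len &&
	       pages[start + len].zone_id == pages[start].zone_id)
		len++;
	return len;
}

/* Number of batches the whole array would be split into. */
static size_t count_batches(const struct fake_page *pages, size_t nr_pages)
{
	size_t i = 0, batches = 0;

	while (i < nr_pages) {
		i += contig_run_len(pages, nr_pages, i);
		batches++;
	}
	return batches;
}
```

With this, two physically adjacent pages that sit on either side of a zone
boundary end up in separate batches even though their PFNs are consecutive.
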
> Anyhow, should we try to decouple both things, providing a
> core-mm function to do the page freeing?
>
> We do have something similar, optimized unpinning of large folios,
> in unpin_user_pages_dirty_lock(). This here is a bit different.
>
>
> So what I am thinking about for this code here to do:
>
> if (!(vm->flags & VM_MAP_PUT_PAGES)) {
> for (i = 0; i < vm->nr_pages; i++)
> mod_lruvec_page_state(vm->pages[i], NR_VMALLOC, -1);
> }
> free_pages_bulk(vm->pages, vm->nr_pages);
>
>
> We could optimize the first loop to do batching where possible as well.
>
>
> free_pages_bulk() would match alloc_pages_bulk()
>
> void free_pages_bulk(struct page **page_array, unsigned long nr_pages)
>
> Internally we'd do the contig handling.
>
> Was that already discussed?
AFAIU some of Zi's replies hinted at this direction. It would make sense, yeah.
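
Roughly, the decoupled shape could look like the following userspace sketch
(again not kernel code). Everything here is a hypothetical stand-in: the
nr_vmalloc[] array models the per-node counter, nid models whatever has to
match for one batched update, and bulk_free_sketch() merely counts pages
where the proposed free_pages_bulk() would do the contig handling:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 4	/* arbitrary bound for this sketch */

/* Hypothetical stand-in for struct page. */
struct fake_page {
	unsigned long pfn;
	int nid;	/* assumed key that must match within one batch */
};

static long nr_vmalloc[MAX_NODES];	/* models the per-node counter */
static unsigned int nr_updates;		/* counter updates issued */
static size_t nr_freed;			/* pages handed to the bulk free */

/* Accounting pass: one counter update per run of same-node pages. */
static void account_pages(const struct fake_page *pages, size_t nr_pages)
{
	size_t i = 0;

	while (i < nr_pages) {
		int nid = pages[i].nid;
		long batch = 0;

		assert(nid >= 0 && nid < MAX_NODES);
		while (i < nr_pages && pages[i].nid == nid) {
			batch++;
			i++;
		}
		nr_vmalloc[nid] -= batch;
		nr_updates++;
	}
}

/* Placeholder for the bulk free; real contig handling would live here. */
static void bulk_free_sketch(const struct fake_page *pages, size_t nr_pages)
{
	(void)pages;
	nr_freed += nr_pages;
}

/* Caller side: accounting and freeing are two independent steps. */
static void release_pages_sketch(const struct fake_page *pages,
				 size_t nr_pages, int account)
{
	if (account)
		account_pages(pages, nr_pages);
	bulk_free_sketch(pages, nr_pages);
}
```

The point of the split is that the caller no longer needs to know how the
pages are grouped for freeing; it only decides whether to account.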
Thread overview: 20+ messages
2026-03-16 11:31 [PATCH v2 0/3] mm: Free contiguous order-0 pages efficiently Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 1/3] mm/page_alloc: Optimize free_contig_range() Muhammad Usama Anjum
2026-03-16 15:21 ` Vlastimil Babka
2026-03-16 16:02 ` Zi Yan
2026-03-16 16:19 ` Vlastimil Babka (SUSE)
2026-03-17 15:17 ` Zi Yan
2026-03-17 18:48 ` Vlastimil Babka (SUSE)
2026-03-19 22:07 ` David Hildenbrand (Arm)
2026-03-20 8:20 ` Vlastimil Babka (SUSE)
2026-03-20 12:46 ` Zi Yan
2026-03-16 16:11 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 2/3] vmalloc: Optimize vfree Muhammad Usama Anjum
2026-03-16 15:49 ` Vlastimil Babka
2026-03-17 9:36 ` Muhammad Usama Anjum
2026-03-20 8:39 ` David Hildenbrand (Arm)
2026-03-20 14:33 ` Vlastimil Babka (SUSE) [this message]
2026-03-23 11:28 ` Muhammad Usama Anjum
2026-03-16 11:31 ` [PATCH v2 3/3] mm/page_alloc: Optimize __free_contig_frozen_range() Muhammad Usama Anjum
2026-03-16 16:22 ` Vlastimil Babka
2026-03-20 14:26 ` Zi Yan