From: Dev Jain <dev.jain@arm.com>
To: Barry Song <baohua@kernel.org>
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
catalin.marinas@arm.com, will@kernel.org,
akpm@linux-foundation.org, urezki@gmail.com,
linux-kernel@vger.kernel.org, anshuman.khandual@arm.com,
ryan.roberts@arm.com, ajd@linux.ibm.com, rppt@kernel.org,
david@kernel.org, Xueyuan.chen21@gmail.com
Subject: Re: [RFC PATCH 5/8] mm/vmalloc: map contiguous pages in batches for vmap() if possible
Date: Thu, 9 Apr 2026 15:40:38 +0530
Message-ID: <46fbd241-4d64-409a-b9dc-77e778ca088e@arm.com>
In-Reply-To: <CAGsJ_4yL3Y1Sr0MjTd6=ROC0jKf4JkCqNPODMh-m155rUFcS9g@mail.gmail.com>
On 09/04/26 3:24 am, Barry Song wrote:
> On Wed, Apr 8, 2026 at 10:03 PM Dev Jain <dev.jain@arm.com> wrote:
>>
>>
>>
>> On 08/04/26 8:21 am, Barry Song (Xiaomi) wrote:
>>> In many cases, the pages passed to vmap() may include high-order
>>> pages allocated with __GFP_COMP flags. For example, the systemheap
>>> often allocates pages in descending order: order 8, then 4, then 0.
>>> Currently, vmap() iterates over every page individually—even pages
>>> inside a high-order block are handled one by one.
>>>
>>> This patch detects high-order pages and maps them as a single
>>> contiguous block whenever possible.
>>>
>>> An alternative would be to implement a new API, vmap_sg(), but that
>>> change seems to be large in scope.
>>>
>>> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
>>> ---
>>> mm/vmalloc.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++--
>>> 1 file changed, 49 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>>> index eba436386929..e8dbfada42bc 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -3529,6 +3529,53 @@ void vunmap(const void *addr)
>>> }
>>> EXPORT_SYMBOL(vunmap);
>>>
>>> +static inline int get_vmap_batch_order(struct page **pages,
>>> + unsigned int max_steps, unsigned int idx)
>>> +{
>>> + unsigned int nr_pages;
>>> +
>>> + if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) ||
>>> + ioremap_max_page_shift == PAGE_SHIFT)
>>> + return 0;
>>> +
>>> + nr_pages = compound_nr(pages[idx]);
>>> + if (nr_pages == 1 || max_steps < nr_pages)
>>> + return 0;
>>
>> This assumes that the page array passed to vmap() will have compound pages
>> if it is a higher order allocation.
>>
>> See rb_alloc_aux_page(). It gets higher-order allocations without passing
>> GFP_COMP.
>>
>> That is why my implementation does not assume anything about the property
>> of the pages.
>
> If you’re asking about support for non-compound pages, I think
> that’s fine. My current use case is dma-buf, where pages are
> compound. I recall discussing this previously with David and
> Uladzislau.
>
> If you’re working with non-compound pages, I’m happy to add
> support in the next version. I’m also happy to reuse some of your
> code and credit you as Co-developed-by if you’re willing. I actually
> prefer your __vmap_huge() name over my
> vmap_contig_pages_range().
>
> Does that make sense to you?
Yeah, it would probably be better to have a fast path that detects compound
pages and, failing that, falls back to checking contiguity. So sure, please go
ahead and reuse some of my code; you can credit me as Co-developed-by.
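
A minimal userspace sketch of the "compound fast path, contiguity fallback"
idea, to make the intended control flow concrete. It is not code from either
patch series: struct fake_page, floor_order() and batch_order() are
hypothetical names, pages are modelled by their pfn only, and the fast path
assumes the array lists every page of a compound page in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not the kernel type. */
struct fake_page {
    unsigned long pfn;        /* page frame number */
    unsigned int comp_order;  /* non-zero only on a compound head */
};

/* Largest order such that (1 << order) <= n; n must be non-zero. */
static unsigned int floor_order(size_t n)
{
    unsigned int order = 0;

    assert(n != 0);
    while (n >>= 1)
        order++;
    return order;
}

/*
 * Return the order of the largest batch that can start at pages[idx].
 * Order 0 means "map this page on its own".
 */
static unsigned int batch_order(const struct fake_page *pages,
                                size_t nr, size_t idx)
{
    size_t left, run = 1;
    unsigned int order;

    assert(idx < nr);
    left = nr - idx;

    /* Fast path: a compound head that fits in the remaining array. */
    order = pages[idx].comp_order;
    if (order && left >= ((size_t)1 << order))
        return order;

    /* Fallback: count physically consecutive pfns starting at idx. */
    while (run < left && pages[idx + run].pfn == pages[idx].pfn + run)
        run++;

    /* Round down to a power of two, then require natural alignment. */
    order = floor_order(run);
    while (order && (pages[idx].pfn & ((1UL << order) - 1)))
        order--;
    return order;
}
```

The fallback costs one pass over the run, so a real implementation would
probably want to bound the scan; that choice is left open here.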
>
>>
>> Also it may be useful to do regression-testing for the common case of
>> vmap() with a single page (assuming it is common, I don't know), in
>> which case we may have to special case it.
>
> I agree, so I had Xueyuan test single pages and highlighted this
> in the cover letter. There is no regression: "vmap() is 5.6×
> faster when memory includes some order-8 pages, with no
> regression observed for order-0 pages."
>
>>
>> My implementation requires opting in with VM_ALLOW_HUGE_VMAP - I suspect
>> you may run into problems if you make vmap() do huge-mappings as best-effort
>> by default. I am guessing this because ...
>>
>> Drivers can operate on individual pages, so vmalloc() calls split_page()
>> and then does the block/cont mappings. This same issue should be present
>> with vmap() too? In which case if we are to do huge-mappings by default
>> then we can do split_page() after detecting contiguous chunks.
>>
>> But ... that may create problems for the caller of vmap() - vmap() has
>> now changed the properties of the pages.
>
> I don’t see this as a problem at all. Splitting pages does not
> affect physical or virtual contiguity; it only changes the
> contents of struct page objects, not the PTE/PMD mappings.
> For ioremap, there isn’t even a struct page, yet the mappings
> can still be huge.
Okay, so I was under the impression that *not* splitting the page
would be problematic.
But vmalloc splits pages because its caller can operate on
individual struct pages obtained via vmalloc_to_page(). By contrast,
the caller of vmap() decides what kind of pages to
map, so the problem I was raising does not apply. So
I guess we are fine with making vmap() do huge mappings by default.
>
> Thanks
> Barry