* [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
This patchset accelerates ioremap, vmalloc, and vmap when the memory
is physically fully or partially contiguous. Two techniques are used:
1. Avoid page table rewalk when setting PTEs/PMDs for multiple memory
segments
2. Use batched mappings wherever possible in both vmalloc and ARM64
layers
Besides accelerating the mapping path, this also enables large
mappings (PMD and cont-PTE) for vmap(), which does not support them
today.
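As a rough sketch of the first technique (simplified pseudo-C; the
"before" loop is the one removed in patch 4, the "after" call is the
unified path it introduces):

	/* Before: every huge page repeats the full pgd->p4d->pud->pmd->pte walk */
	for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
		err = vmap_range_noflush(addr, addr + (1UL << page_shift),
					 page_to_phys(pages[i]), prot,
					 page_shift);
		if (err)
			return err;
		addr += 1UL << page_shift;
	}

	/* After: a single walk whose pte/pmd cursors advance in place */
	err = vmap_pages_range_noflush_walk(addr, end, prot, pages,
					    min(page_shift, PMD_SHIFT));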
Patches 1-2 extend the arm64 vmalloc CONT_PTE mapping path to support
multiple CONT_PTE regions instead of just one.
Patch 3 extracts a common helper vmap_set_ptes() that consolidates PTE
mapping logic between the ioremap and vmalloc/vmap paths, handling both
CONT_PTE and regular PTE mappings. This prepares for the next patch.
Patch 4 extends the page table walk path to support page shifts other
than PAGE_SHIFT and eliminates the page table rewalk for huge vmalloc
mappings. The function is renamed from vmap_small_pages_range_noflush()
to vmap_pages_range_noflush_walk().
Patches 5-7 add huge vmap support for contiguous pages, including
support for non-compound pages with pfn alignment verification.
On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
the performance CPUfreq policy enabled, benchmark results:
* ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
* vmalloc(1 MB) mapping time (excluding allocation) with
VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53 us)
* vmap(100 MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
Many thanks to Xueyuan Chen for his testing efforts on RK3588 boards.
Changes since v1:
- Fix condition order and use PMD_SIZE instead of CONT_PMD_SIZE in
patch 1 (Dev Jain)
- Squash patch 3+4 and patch 5+7 (Dev Jain)
- Replace "zigzag" with "page table rewalk" in commit messages
(Dev Jain)
- Rename vmap_small_pages_range_noflush() to
vmap_pages_range_noflush_walk() (Dev Jain)
- Extract vmap_set_ptes() as a new patch to consolidate PTE mapping
logic between vmap_pte_range() and vmap_pages_pte_range(), handling
both CONT_PTE and regular mappings (Mike Rapoport)
- Support non-compound pages in get_vmap_batch_order() by falling
back to physical contiguity scanning with pfn alignment check
(Dev Jain, Uladzislau Rezki)
- In get_vmap_batch_order(), filter out orders that the architecture
cannot batch by checking arch_vmap_pte_supported_shift() directly.
This avoids overhead for orders 1-3 on ARM64 CONT_PTE with 4K
pages. (patch 5)
Barry Song (Xiaomi) (6):
arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
setup
arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
CONT_PTE
mm/vmalloc: Extend page table walk to support larger page_shift sizes
and eliminate page table rewalk
mm/vmalloc: map contiguous pages in batches for vmap() if possible
mm/vmalloc: align vm_area so vmap() can batch mappings
mm/vmalloc: Stop scanning for compound pages after encountering small
pages in vmap
Wen Jiang (1):
mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic
arch/arm64/include/asm/vmalloc.h | 6 +-
arch/arm64/mm/hugetlbpage.c | 10 ++
mm/vmalloc.c | 221 ++++++++++++++++++++++++-------
3 files changed, 189 insertions(+), 48 deletions(-)
--
2.34.1
* [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
For sizes aligned to CONT_PTE_SIZE and smaller than PMD_SIZE,
we can batch CONT_PTE settings instead of handling them individually.
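For example, on arm64 with 4 KiB base pages (CONT_PTE_SIZE = 64 KiB,
PMD_SIZE = 2 MiB), a 256 KiB size now takes the new default branch:
num_contig_ptes() returns 64 with *pgsize = PAGE_SIZE, and
arch_make_huge_pte() still applies pte_mkcont(), where previously only
exactly CONT_PTE_SIZE was recognized at this level.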
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
arch/arm64/mm/hugetlbpage.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 30772a909..d477a9dd1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -110,6 +110,12 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
contig_ptes = CONT_PTES;
break;
default:
+ if (size > 0 && size < PMD_SIZE &&
+ IS_ALIGNED(size, CONT_PTE_SIZE)) {
+ contig_ptes = size >> PAGE_SHIFT;
+ *pgsize = PAGE_SIZE;
+ break;
+ }
WARN_ON(!__hugetlb_valid_size(size));
}
@@ -359,6 +365,10 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
case CONT_PTE_SIZE:
return pte_mkcont(entry);
default:
+ if (pagesize > 0 && pagesize < PMD_SIZE &&
+ IS_ALIGNED(pagesize, CONT_PTE_SIZE))
+ return pte_mkcont(entry);
+
break;
}
pr_warn("%s: unrecognized huge page size 0x%lx\n",
--
2.34.1
* [PATCH v2 2/7] arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE hugepages,
reducing both PTE setup and TLB flush iterations.
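For example, on arm64 with 4 KiB pages, a CONT_PTE-aligned addr/pfn,
end - addr = 320 KiB, and max_page_shift = PMD_SHIFT: min3() yields
320 KiB (the PMD_SIZE/2 = 1 MiB cap keeps the result below a full PMD,
which is handled at the PMD level instead), and rounding down to a
power of two gives a 256 KiB batch, i.e. four CONT_PTE units installed
by a single set_huge_pte_at() call rather than four separate ones.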
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
arch/arm64/include/asm/vmalloc.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
index 4ec1acd3c..9eea06d0f 100644
--- a/arch/arm64/include/asm/vmalloc.h
+++ b/arch/arm64/include/asm/vmalloc.h
@@ -23,6 +23,8 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
unsigned long end, u64 pfn,
unsigned int max_page_shift)
{
+ unsigned long size;
+
/*
* If the block is at least CONT_PTE_SIZE in size, and is naturally
* aligned in both virtual and physical space, then we can pte-map the
@@ -40,7 +42,9 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
return PAGE_SIZE;
- return CONT_PTE_SIZE;
+ size = min3(end - addr, 1UL << max_page_shift, PMD_SIZE >> 1);
+ size = 1UL << (fls(size) - 1);
+ return size;
}
#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
--
2.34.1
* [PATCH v2 3/7] mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
Extract the common PTE mapping logic from vmap_pte_range() into a
shared helper vmap_set_ptes(). This handles both CONT_PTE and regular
PTE mappings in a single function, preparing for the next patch which
will extend vmap_pages_pte_range() to also use this helper.
The #ifdef CONFIG_HUGETLB_PAGE guard is moved inside vmap_set_ptes(),
so callers no longer need to handle the conditional compilation.
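For example, on arm64 with 4 KiB pages, a call on a CONT_PTE-aligned
pfn with max_page_shift >= CONT_PTE_SHIFT returns 64 KiB (or a larger
multiple once the next patches apply), so the caller advances pte and
pfn by PFN_DOWN(size) entries in a single loop step; an unaligned pfn
falls back to set_pte_at() and a PAGE_SIZE return.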
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 49 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 34 insertions(+), 15 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3e9e5156f..9bfd0aa34 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -91,6 +91,35 @@ struct vfree_deferred {
static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);
/*** Page table manipulation functions ***/
+
+/*
+ * Set PTE mappings for the given PFN. Try CONT_PTE mappings first when
+ * supported, otherwise fall back to PAGE_SIZE mappings.
+ *
+ * Return: mapping size.
+ */
+static __always_inline unsigned long vmap_set_ptes(pte_t *pte,
+ unsigned long addr, unsigned long end, u64 pfn,
+ pgprot_t prot, unsigned int max_page_shift)
+{
+#ifdef CONFIG_HUGETLB_PAGE
+ if (max_page_shift > PAGE_SHIFT) {
+ unsigned long size;
+
+ size = arch_vmap_pte_range_map_size(addr, end, pfn, max_page_shift);
+ if (size != PAGE_SIZE) {
+ pte_t entry = pfn_pte(pfn, prot);
+
+ entry = arch_make_huge_pte(entry, ilog2(size), 0);
+ set_huge_pte_at(&init_mm, addr, pte, entry, size);
+ return size;
+ }
+ }
+#endif
+ set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
+ return PAGE_SIZE;
+}
+
static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
phys_addr_t phys_addr, pgprot_t prot,
unsigned int max_page_shift, pgtbl_mod_mask *mask)
@@ -98,7 +127,8 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
pte_t *pte;
u64 pfn;
struct page *page;
- unsigned long size = PAGE_SIZE;
+ unsigned long size;
+ unsigned int steps;
if (WARN_ON_ONCE(!PAGE_ALIGNED(end - addr)))
return -EINVAL;
@@ -119,20 +149,9 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
BUG();
}
-#ifdef CONFIG_HUGETLB_PAGE
- size = arch_vmap_pte_range_map_size(addr, end, pfn, max_page_shift);
- if (size != PAGE_SIZE) {
- pte_t entry = pfn_pte(pfn, prot);
-
- entry = arch_make_huge_pte(entry, ilog2(size), 0);
- set_huge_pte_at(&init_mm, addr, pte, entry, size);
- pfn += PFN_DOWN(size);
- continue;
- }
-#endif
- set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
- pfn++;
- } while (pte += PFN_DOWN(size), addr += size, addr != end);
+ size = vmap_set_ptes(pte, addr, end, pfn, prot, max_page_shift);
+ steps = PFN_DOWN(size);
+ } while (pte += steps, pfn += steps, addr += size, addr != end);
lazy_mmu_mode_disable();
*mask |= PGTBL_PTE_MODIFIED;
--
2.34.1
* [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
vmap_pages_range_noflush_walk() (formerly vmap_small_pages_range_noflush())
provides a clean interface by taking struct page **pages and mapping them
via direct PTE iteration. This avoids the page table rewalk seen when
using vmap_range_noflush() for page_shift values other than PAGE_SHIFT.
Extend it to support larger page_shift values, and add PMD- and
contiguous-PTE mappings as well. Rename it to vmap_pages_range_noflush_walk()
since it now handles more than just small pages.
For vmalloc() allocations with VM_ALLOW_HUGE_VMAP, we no longer need to
iterate over pages one by one via vmap_range_noflush(), which would
otherwise lead to page table rewalk. The code is now unified with the
PAGE_SHIFT case by simply calling vmap_pages_range_noflush_walk().
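For example, a 4 MiB VM_ALLOW_HUGE_VMAP allocation with
page_shift = PMD_SHIFT (2 MiB PMDs on arm64/4K) previously issued two
independent vmap_range_noflush() calls, each starting over from the
pgd; it is now a single vmap_pages_range_noflush_walk() pass that
installs both PMD entries via vmap_try_huge_pmd() while its cursors
advance in place.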
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 64 +++++++++++++++++++++++++++-------------------------
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9bfd0aa34..516d40650 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -543,8 +543,10 @@ void vunmap_range(unsigned long addr, unsigned long end)
static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
+ unsigned long pfn, size;
+ unsigned int steps;
int err = 0;
pte_t *pte;
@@ -575,9 +577,10 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
break;
}
- set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
- (*nr)++;
- } while (pte++, addr += PAGE_SIZE, addr != end);
+ pfn = page_to_pfn(page);
+ size = vmap_set_ptes(pte, addr, end, pfn, prot, shift);
+ steps = PFN_DOWN(size);
+ } while (pte += steps, *nr += steps, addr += size, addr != end);
lazy_mmu_mode_disable();
*mask |= PGTBL_PTE_MODIFIED;
@@ -587,7 +590,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
pmd_t *pmd;
unsigned long next;
@@ -597,7 +600,20 @@ static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
return -ENOMEM;
do {
next = pmd_addr_end(addr, end);
- if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask))
+
+ if (shift == PMD_SHIFT) {
+ struct page *page = pages[*nr];
+ phys_addr_t phys_addr = page_to_phys(page);
+
+ if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot,
+ shift)) {
+ *mask |= PGTBL_PMD_MODIFIED;
+ *nr += 1 << (shift - PAGE_SHIFT);
+ continue;
+ }
+ }
+
+ if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (pmd++, addr = next, addr != end);
return 0;
@@ -605,7 +621,7 @@ static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
pud_t *pud;
unsigned long next;
@@ -615,7 +631,7 @@ static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
return -ENOMEM;
do {
next = pud_addr_end(addr, end);
- if (vmap_pages_pmd_range(pud, addr, next, prot, pages, nr, mask))
+ if (vmap_pages_pmd_range(pud, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (pud++, addr = next, addr != end);
return 0;
@@ -623,7 +639,7 @@ static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
p4d_t *p4d;
unsigned long next;
@@ -633,14 +649,14 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
return -ENOMEM;
do {
next = p4d_addr_end(addr, end);
- if (vmap_pages_pud_range(p4d, addr, next, prot, pages, nr, mask))
+ if (vmap_pages_pud_range(p4d, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (p4d++, addr = next, addr != end);
return 0;
}
-static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
- pgprot_t prot, struct page **pages)
+static int vmap_pages_range_noflush_walk(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages, unsigned int shift)
{
unsigned long start = addr;
pgd_t *pgd;
@@ -655,7 +671,7 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
next = pgd_addr_end(addr, end);
if (pgd_bad(*pgd))
mask |= PGTBL_PGD_MODIFIED;
- err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask);
+ err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask, shift);
if (err)
break;
} while (pgd++, addr = next, addr != end);
@@ -678,27 +694,13 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
pgprot_t prot, struct page **pages, unsigned int page_shift)
{
- unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
-
WARN_ON(page_shift < PAGE_SHIFT);
- if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
- page_shift == PAGE_SHIFT)
- return vmap_small_pages_range_noflush(addr, end, prot, pages);
-
- for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
- int err;
-
- err = vmap_range_noflush(addr, addr + (1UL << page_shift),
- page_to_phys(pages[i]), prot,
- page_shift);
- if (err)
- return err;
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC))
+ page_shift = PAGE_SHIFT;
- addr += 1UL << page_shift;
- }
-
- return 0;
+ return vmap_pages_range_noflush_walk(addr, end, prot, pages,
+ min(page_shift, PMD_SHIFT));
}
int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
--
2.34.1
* [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
In many cases, the pages passed to vmap() may include high-order
pages. For example, the DMA-BUF system heap often allocates pages in
descending order: order 8, then 4, then 0. Currently, vmap() iterates
over every page individually; even pages inside a high-order block are
handled one by one.
This patch detects physically contiguous pages (regardless of whether
they are compound or non-compound) by scanning with
num_pages_contiguous(), and maps them as a single contiguous block
whenever possible. The first page's pfn must be aligned to the
mapping order for the batched mapping to be used.
Pages with the same page_shift are coalesced and mapped via
vmap_pages_range_noflush_walk() to avoid page table rewalk.
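As a worked example on arm64 with 4 KiB pages: if
num_pages_contiguous() reports a run of 200 pages,
get_vmap_batch_order() computes order = fls(200) - 1 = 7 (128 pages,
512 KiB); arch_vmap_pte_supported_shift(512 KiB) returns CONT_PTE_SHIFT
rather than PAGE_SHIFT, so the order survives the filter, and if the
first pfn is 128-page aligned the run is mapped by one
vmap_pages_range_noflush_walk() call with shift = PAGE_SHIFT + 7. The
remaining 72 pages are picked up by the following iterations.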
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Co-developed-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 73 insertions(+), 2 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 516d40650..c30a7673e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3520,6 +3520,77 @@ void vunmap(const void *addr)
}
EXPORT_SYMBOL(vunmap);
+static inline int get_vmap_batch_order(struct page **pages,
+ unsigned int max_steps, unsigned int idx)
+{
+ unsigned int nr_contig;
+ int order;
+
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) ||
+ ioremap_max_page_shift == PAGE_SHIFT)
+ return 0;
+
+ nr_contig = num_pages_contiguous(&pages[idx], max_steps);
+ if (nr_contig < 2)
+ return 0;
+
+ order = fls(nr_contig) - 1;
+
+ if (arch_vmap_pte_supported_shift(PAGE_SIZE << order) == PAGE_SHIFT)
+ return 0;
+
+ /* Ensure the first page's pfn is aligned to the order */
+ if (!IS_ALIGNED(page_to_pfn(pages[idx]), 1 << order))
+ return 0;
+
+ return order;
+}
+
+static int __vmap_huge(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages)
+{
+ unsigned int count = (end - addr) >> PAGE_SHIFT;
+ unsigned int prev_shift = 0, idx = 0;
+ unsigned long start = addr, map_addr = addr;
+ int err;
+
+ err = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
+ PAGE_SHIFT, GFP_KERNEL);
+ if (err)
+ goto out;
+
+ for (unsigned int i = 0; i < count; ) {
+ unsigned int shift = PAGE_SHIFT +
+ get_vmap_batch_order(pages, count - i, i);
+
+ if (!i)
+ prev_shift = shift;
+
+ if (shift != prev_shift) {
+ err = vmap_pages_range_noflush_walk(map_addr, addr,
+ prot, pages + idx,
+ min(prev_shift, PMD_SHIFT));
+ if (err)
+ goto out;
+ prev_shift = shift;
+ map_addr = addr;
+ idx = i;
+ }
+
+ addr += 1UL << shift;
+ i += 1U << (shift - PAGE_SHIFT);
+ }
+
+ /* Remaining */
+ if (map_addr < end)
+ err = vmap_pages_range_noflush_walk(map_addr, end,
+ prot, pages + idx, min(prev_shift, PMD_SHIFT));
+
+out:
+ flush_cache_vmap(start, end);
+ return err;
+}
+
/**
* vmap - map an array of pages into virtually contiguous space
* @pages: array of page pointers
@@ -3563,8 +3634,8 @@ void *vmap(struct page **pages, unsigned int count,
return NULL;
addr = (unsigned long)area->addr;
- if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
- pages, PAGE_SHIFT) < 0) {
+ if (__vmap_huge(addr, addr + size, pgprot_nx(prot),
+ pages) < 0) {
vunmap(area->addr);
return NULL;
}
--
2.34.1
* [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Try to align the vmap() virtual address to PMD_SIZE, or to a large
contiguous-PTE mapping size hinted by the architecture, so
contiguous pages can be batch-mapped when setting PMD or
PTE entries.
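For example, a 100 MB vmap() first requests PMD_SIZE alignment
(2 MiB on arm64/4K) from __get_vm_area_node(); if no suitably aligned
hole exists, it retries with the architecture's contiguous-PTE hint
(64 KiB), and finally with plain PAGE_SIZE alignment, so the alignment
attempt never fails an allocation that would previously have
succeeded.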
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c30a7673e..b3389c8f1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3591,6 +3591,35 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
return err;
}
+static struct vm_struct *get_aligned_vm_area(unsigned long size, unsigned long flags)
+{
+ unsigned int shift = (size >= PMD_SIZE) ? PMD_SHIFT :
+ arch_vmap_pte_supported_shift(size);
+ struct vm_struct *vm_area = NULL;
+
+ /*
+ * Try to allocate an aligned vm_area so contiguous pages can be
+ * mapped in batches.
+ */
+ while (1) {
+ unsigned long align = 1UL << shift;
+
+ vm_area = __get_vm_area_node(size, align, PAGE_SHIFT, flags,
+ VMALLOC_START, VMALLOC_END,
+ NUMA_NO_NODE, GFP_KERNEL,
+ __builtin_return_address(0));
+ if (vm_area || shift <= PAGE_SHIFT)
+ goto out;
+ if (shift == PMD_SHIFT)
+ shift = arch_vmap_pte_supported_shift(size);
+ else if (shift > PAGE_SHIFT)
+ shift = PAGE_SHIFT;
+ }
+
+out:
+ return vm_area;
+}
+
/**
* vmap - map an array of pages into virtually contiguous space
* @pages: array of page pointers
@@ -3629,7 +3658,7 @@ void *vmap(struct page **pages, unsigned int count,
return NULL;
size = (unsigned long)count << PAGE_SHIFT;
- area = get_vm_area_caller(size, flags, __builtin_return_address(0));
+ area = get_aligned_vm_area(size, flags);
if (!area)
return NULL;
--
2.34.1
* [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Users typically allocate memory in descending page orders, e.g.
8 → 4 → 0. Once an order-0 page is encountered, subsequent
pages are likely to also be order-0, so we stop scanning
for compound pages at that point.
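For example, a vmap() of order-8 blocks followed by order-0 pages now
stops calling num_pages_contiguous() at the first order-0 page and
maps the entire remainder through a single PAGE_SHIFT-level
vmap_pages_range_noflush_walk() call, instead of probing contiguity
for every remaining page.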
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b3389c8f1..60579bfbf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3576,6 +3576,12 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
map_addr = addr;
idx = i;
}
+ /*
+ * Once small pages are encountered, the remaining pages
+ * are likely small as well
+ */
+ if (shift == PAGE_SHIFT)
+ break;
addr += 1UL << shift;
i += 1U << (shift - PAGE_SHIFT);
--
2.34.1