* [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
@ 2026-05-14 9:41 Wen Jiang
2026-05-14 9:41 ` [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup Wen Jiang
` (7 more replies)
0 siblings, 8 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
This patchset accelerates ioremap, vmalloc, and vmap when the memory
is physically fully or partially contiguous. Two techniques are used:
1. Avoid page table rewalk when setting PTEs/PMDs for multiple memory
segments
2. Use batched mappings wherever possible in both vmalloc and ARM64
layers
Besides accelerating the mapping path, this also enables large
mappings (PMD and cont-PTE) for vmap, which are currently not
supported.
Patches 1-2 extend ARM64 vmalloc CONT-PTE mapping to support multiple
CONT-PTE regions instead of just one.
Patch 3 extracts a common helper vmap_set_ptes() that consolidates PTE
mapping logic between the ioremap and vmalloc/vmap paths, handling both
CONT_PTE and regular PTE mappings. This prepares for the next patch.
Patch 4 extends the page table walk path to support page shifts other
than PAGE_SHIFT and eliminates the page table rewalk for huge vmalloc
mappings. The function is renamed from vmap_small_pages_range_noflush()
to vmap_pages_range_noflush_walk().
Patches 5-7 add huge vmap support for contiguous pages, including
support for non-compound pages with pfn alignment verification.
On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
the performance CPUfreq policy enabled, benchmark results:
* ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
* vmalloc(1 MB) mapping time (excluding allocation) with
VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53 us)
* vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
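As a quick check of the ratios above, here is an illustrative sketch
that is not part of the series. The hypothetical helper speedup_x100()
divides the "before" time by the "after" time in integer arithmetic and
returns the ratio in hundredths, rounded to nearest. Both times must use
the same unit.

```c
#include <assert.h>

/*
 * Hypothetical helper, for illustration only: returns before/after as a
 * ratio scaled by 100 and rounded to the nearest hundredth.
 */
static unsigned long speedup_x100(unsigned long before, unsigned long after)
{
	assert(after != 0);
	return (before * 100 + after / 2) / after;
}
```

Feeding it the ioremap pair (3407, 2526), the vmalloc pair converted to
ns (5000, 3530), and the vmap pair in us (1235, 149) should reproduce
the quoted factors after rounding to the precision used above.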
Many thanks to Xueyuan Chen for his testing efforts on RK3588 boards.
Changes since v1:
- Fix condition order and use PMD_SIZE instead of CONT_PMD_SIZE in
patch 1 (Dev Jain)
- Squash patch 3+4 and patch 5+7 (Dev Jain)
- Replace "zigzag" with "page table rewalk" in commit messages
(Dev Jain)
- Rename vmap_small_pages_range_noflush() to
vmap_pages_range_noflush_walk() (Dev Jain)
- Extract vmap_set_ptes() as a new patch to consolidate PTE mapping
logic between vmap_pte_range() and vmap_pages_pte_range(), handling
both CONT_PTE and regular mappings (Mike Rapoport)
- Support non-compound pages in get_vmap_batch_order() by falling
back to physical contiguity scanning with pfn alignment check
(Dev Jain, Uladzislau Rezki)
- In get_vmap_batch_order(), filter out orders that the architecture
cannot batch by checking arch_vmap_pte_supported_shift() directly.
This avoids overhead for orders 1-3 on ARM64 CONT_PTE with 4K
pages. (patch 5)
Barry Song (Xiaomi) (6):
arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE
setup
arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple
CONT_PTE
mm/vmalloc: Extend page table walk to support larger page_shift sizes
and eliminate page table rewalk
mm/vmalloc: map contiguous pages in batches for vmap() if possible
mm/vmalloc: align vm_area so vmap() can batch mappings
mm/vmalloc: Stop scanning for compound pages after encountering small
pages in vmap
Wen Jiang (1):
mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic
arch/arm64/include/asm/vmalloc.h | 6 +-
arch/arm64/mm/hugetlbpage.c | 10 ++
mm/vmalloc.c | 221 ++++++++++++++++++++++++-------
3 files changed, 189 insertions(+), 48 deletions(-)
--
2.34.1
* [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-14 9:41 ` [PATCH v2 2/7] arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE Wen Jiang
` (6 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
For sizes that are a multiple of CONT_PTE_SIZE and smaller than PMD_SIZE,
we can set up all the CONT_PTE blocks in one batch instead of handling
each block individually.
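The size check can be modelled in userspace as follows. This is a
sketch only: the constants assume arm64 with a 4K granule (16 PTEs per
CONT_PTE block, 2M PMD), and batched_contig_ptes() is a hypothetical
name that mirrors just the new default: branch of num_contig_ptes().

```c
#include <assert.h>

/* Assumed arm64 values for a 4K granule; other granules differ. */
#define PAGE_SHIFT	12
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define CONT_PTE_SIZE	(16 * PAGE_SIZE)	/* 64K */
#define PMD_SIZE	(1UL << 21)		/* 2M */
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

/*
 * Userspace model of the new default: branch (hypothetical name).
 * Returns the number of PTEs to write and stores the per-entry size in
 * *pgsize, or returns 0 if the size does not qualify for batching.
 */
static unsigned long batched_contig_ptes(unsigned long size,
					 unsigned long *pgsize)
{
	assert(pgsize);
	if (size > 0 && size < PMD_SIZE && IS_ALIGNED(size, CONT_PTE_SIZE)) {
		*pgsize = PAGE_SIZE;
		return size >> PAGE_SHIFT;
	}
	return 0;
}
```

Under these assumptions a 128K range yields 32 PTEs, while 2M and 68K
are rejected (the former is not below PMD_SIZE, the latter is not a
multiple of 64K).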
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
arch/arm64/mm/hugetlbpage.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 30772a909..d477a9dd1 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -110,6 +110,12 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize)
contig_ptes = CONT_PTES;
break;
default:
+ if (size > 0 && size < PMD_SIZE &&
+ IS_ALIGNED(size, CONT_PTE_SIZE)) {
+ contig_ptes = size >> PAGE_SHIFT;
+ *pgsize = PAGE_SIZE;
+ break;
+ }
WARN_ON(!__hugetlb_valid_size(size));
}
@@ -359,6 +365,10 @@ pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags)
case CONT_PTE_SIZE:
return pte_mkcont(entry);
default:
+ if (pagesize > 0 && pagesize < PMD_SIZE &&
+ IS_ALIGNED(pagesize, CONT_PTE_SIZE))
+ return pte_mkcont(entry);
+
break;
}
pr_warn("%s: unrecognized huge page size 0x%lx\n",
--
2.34.1
* [PATCH v2 2/7] arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
2026-05-14 9:41 ` [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-14 9:41 ` [PATCH v2 3/7] mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic Wen Jiang
` (5 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Allow arch_vmap_pte_range_map_size() to batch multiple CONT_PTE hugepages,
reducing both PTE setup and TLB flush iterations.
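The batch size selection can be sketched in userspace like this. It
assumes the earlier checks in the function have already passed (the
block is at least CONT_PTE_SIZE and aligned in both virtual and physical
space) and that PMD_SHIFT is 21, as on arm64 with a 4K granule. The
helpers min3ul() and rounddown_pow2() are hypothetical stand-ins for the
kernel's min3() and the fls()-based rounding.

```c
#include <assert.h>

#define PMD_SHIFT	21			/* assumed: arm64, 4K granule */
#define PMD_SIZE	(1UL << PMD_SHIFT)

static unsigned long min3ul(unsigned long a, unsigned long b, unsigned long c)
{
	unsigned long m = a < b ? a : b;

	return m < c ? m : c;
}

/* Clear low set bits until only the highest one remains. */
static unsigned long rounddown_pow2(unsigned long x)
{
	assert(x != 0);
	while (x & (x - 1))
		x &= x - 1;
	return x;
}

/*
 * Model of the new return value: the largest power of two that fits in
 * the remaining range, the caller's limit, and half a PMD.
 */
static unsigned long batch_map_size(unsigned long addr, unsigned long end,
				    unsigned int max_page_shift)
{
	unsigned long size = min3ul(end - addr, 1UL << max_page_shift,
				    PMD_SIZE >> 1);

	return rounddown_pow2(size);
}
```

With these values a 192K remainder maps as one 128K batch, and a range
of 2M or more is capped at 1M per call, so a full PMD is never covered
through the PTE path.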
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
arch/arm64/include/asm/vmalloc.h | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/vmalloc.h b/arch/arm64/include/asm/vmalloc.h
index 4ec1acd3c..9eea06d0f 100644
--- a/arch/arm64/include/asm/vmalloc.h
+++ b/arch/arm64/include/asm/vmalloc.h
@@ -23,6 +23,8 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
unsigned long end, u64 pfn,
unsigned int max_page_shift)
{
+ unsigned long size;
+
/*
* If the block is at least CONT_PTE_SIZE in size, and is naturally
* aligned in both virtual and physical space, then we can pte-map the
@@ -40,7 +42,9 @@ static inline unsigned long arch_vmap_pte_range_map_size(unsigned long addr,
if (!IS_ALIGNED(PFN_PHYS(pfn), CONT_PTE_SIZE))
return PAGE_SIZE;
- return CONT_PTE_SIZE;
+ size = min3(end - addr, 1UL << max_page_shift, PMD_SIZE >> 1);
+ size = 1UL << (fls(size) - 1);
+ return size;
}
#define arch_vmap_pte_range_unmap_size arch_vmap_pte_range_unmap_size
--
2.34.1
* [PATCH v2 3/7] mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
2026-05-14 9:41 ` [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup Wen Jiang
2026-05-14 9:41 ` [PATCH v2 2/7] arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-14 9:41 ` [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk Wen Jiang
` (4 subsequent siblings)
7 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
Extract the common PTE mapping logic from vmap_pte_range() into a
shared helper vmap_set_ptes(). This handles both CONT_PTE and regular
PTE mappings in a single function, preparing for the next patch which
will extend vmap_pages_pte_range() to also use this helper.
The #ifdef CONFIG_HUGETLB_PAGE guard is moved inside vmap_set_ptes(),
so callers no longer need to handle the conditional compilation.
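The calling convention is that the helper returns the size it mapped and
the caller advances pte, pfn and addr by that amount. A userspace model
of that loop is sketched below. set_ptes_model() is a hypothetical
stand-in that only decides between one 64K cont-PTE block and one 4K
page (assumed arm64 4K-granule values) and writes no page tables.

```c
#include <assert.h>

#define PAGE_SHIFT	12			/* assumed: 4K pages */
#define PAGE_SIZE	(1UL << PAGE_SHIFT)
#define CONT_PTE_SIZE	(16 * PAGE_SIZE)	/* assumed: 64K */
#define IS_ALIGNED(x, a)	(((x) & ((a) - 1)) == 0)

/* Hypothetical model: pick the mapping size for the current position. */
static unsigned long set_ptes_model(unsigned long addr, unsigned long end,
				    unsigned long pfn)
{
	if (end - addr >= CONT_PTE_SIZE &&
	    IS_ALIGNED(addr, CONT_PTE_SIZE) &&
	    IS_ALIGNED(pfn << PAGE_SHIFT, CONT_PTE_SIZE))
		return CONT_PTE_SIZE;
	return PAGE_SIZE;
}

/* Walk [addr, end) the way the caller does and count helper calls. */
static unsigned int count_map_calls(unsigned long addr, unsigned long end,
				    unsigned long pfn)
{
	unsigned int calls = 0;

	assert(addr < end && IS_ALIGNED(end - addr, PAGE_SIZE));
	do {
		unsigned long size = set_ptes_model(addr, end, pfn);
		unsigned long steps = size >> PAGE_SHIFT;

		pfn += steps;
		addr += size;
		calls++;
	} while (addr != end);

	return calls;
}
```

For an aligned 136K range this model needs four calls (two 64K blocks
plus two pages), whereas a physically misaligned 64K range falls back to
sixteen single-page calls.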
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 49 ++++++++++++++++++++++++++++++++++---------------
1 file changed, 34 insertions(+), 15 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 3e9e5156f..9bfd0aa34 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -91,6 +91,35 @@ struct vfree_deferred {
static DEFINE_PER_CPU(struct vfree_deferred, vfree_deferred);
/*** Page table manipulation functions ***/
+
+/*
+ * Set PTE mappings for the given PFN. Try CONT_PTE mappings first when
+ * supported, otherwise fall back to PAGE_SIZE mappings.
+ *
+ * Return: mapping size.
+ */
+static __always_inline unsigned long vmap_set_ptes(pte_t *pte,
+ unsigned long addr, unsigned long end, u64 pfn,
+ pgprot_t prot, unsigned int max_page_shift)
+{
+#ifdef CONFIG_HUGETLB_PAGE
+ if (max_page_shift > PAGE_SHIFT) {
+ unsigned long size;
+
+ size = arch_vmap_pte_range_map_size(addr, end, pfn, max_page_shift);
+ if (size != PAGE_SIZE) {
+ pte_t entry = pfn_pte(pfn, prot);
+
+ entry = arch_make_huge_pte(entry, ilog2(size), 0);
+ set_huge_pte_at(&init_mm, addr, pte, entry, size);
+ return size;
+ }
+ }
+#endif
+ set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
+ return PAGE_SIZE;
+}
+
static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
phys_addr_t phys_addr, pgprot_t prot,
unsigned int max_page_shift, pgtbl_mod_mask *mask)
@@ -98,7 +127,8 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
pte_t *pte;
u64 pfn;
struct page *page;
- unsigned long size = PAGE_SIZE;
+ unsigned long size;
+ unsigned int steps;
if (WARN_ON_ONCE(!PAGE_ALIGNED(end - addr)))
return -EINVAL;
@@ -119,20 +149,9 @@ static int vmap_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
BUG();
}
-#ifdef CONFIG_HUGETLB_PAGE
- size = arch_vmap_pte_range_map_size(addr, end, pfn, max_page_shift);
- if (size != PAGE_SIZE) {
- pte_t entry = pfn_pte(pfn, prot);
-
- entry = arch_make_huge_pte(entry, ilog2(size), 0);
- set_huge_pte_at(&init_mm, addr, pte, entry, size);
- pfn += PFN_DOWN(size);
- continue;
- }
-#endif
- set_pte_at(&init_mm, addr, pte, pfn_pte(pfn, prot));
- pfn++;
- } while (pte += PFN_DOWN(size), addr += size, addr != end);
+ size = vmap_set_ptes(pte, addr, end, pfn, prot, max_page_shift);
+ steps = PFN_DOWN(size);
+ } while (pte += steps, pfn += steps, addr += size, addr != end);
lazy_mmu_mode_disable();
*mask |= PGTBL_PTE_MODIFIED;
--
2.34.1
* [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
` (2 preceding siblings ...)
2026-05-14 9:41 ` [PATCH v2 3/7] mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-20 11:53 ` Mike Rapoport
2026-05-14 9:41 ` [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible Wen Jiang
` (3 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
vmap_pages_range_noflush_walk() (formerly vmap_small_pages_range_noflush())
provides a clean interface by taking struct page **pages and mapping them
via direct PTE iteration. This avoids the page table rewalk seen when
using vmap_range_noflush() for page_shift values other than PAGE_SHIFT.
Extend it to support larger page_shift values, and add PMD- and
contiguous-PTE mappings as well. Rename it to vmap_pages_range_noflush_walk()
since it now handles more than just small pages.
For vmalloc() allocations with VM_ALLOW_HUGE_VMAP, we no longer need to
iterate over pages one by one via vmap_range_noflush(), which would
otherwise lead to page table rewalk. The code is now unified with the
PAGE_SHIFT case by simply calling vmap_pages_range_noflush_walk().
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 64 +++++++++++++++++++++++++++-------------------------
1 file changed, 33 insertions(+), 31 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 9bfd0aa34..516d40650 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -543,8 +543,10 @@ void vunmap_range(unsigned long addr, unsigned long end)
static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
+ unsigned long pfn, size;
+ unsigned int steps;
int err = 0;
pte_t *pte;
@@ -575,9 +577,10 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
break;
}
- set_pte_at(&init_mm, addr, pte, mk_pte(page, prot));
- (*nr)++;
- } while (pte++, addr += PAGE_SIZE, addr != end);
+ pfn = page_to_pfn(page);
+ size = vmap_set_ptes(pte, addr, end, pfn, prot, shift);
+ steps = PFN_DOWN(size);
+ } while (pte += steps, *nr += steps, addr += size, addr != end);
lazy_mmu_mode_disable();
*mask |= PGTBL_PTE_MODIFIED;
@@ -587,7 +590,7 @@ static int vmap_pages_pte_range(pmd_t *pmd, unsigned long addr,
static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
pmd_t *pmd;
unsigned long next;
@@ -597,7 +600,20 @@ static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
return -ENOMEM;
do {
next = pmd_addr_end(addr, end);
- if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask))
+
+ if (shift == PMD_SHIFT) {
+ struct page *page = pages[*nr];
+ phys_addr_t phys_addr = page_to_phys(page);
+
+ if (vmap_try_huge_pmd(pmd, addr, next, phys_addr, prot,
+ shift)) {
+ *mask |= PGTBL_PMD_MODIFIED;
+ *nr += 1 << (shift - PAGE_SHIFT);
+ continue;
+ }
+ }
+
+ if (vmap_pages_pte_range(pmd, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (pmd++, addr = next, addr != end);
return 0;
@@ -605,7 +621,7 @@ static int vmap_pages_pmd_range(pud_t *pud, unsigned long addr,
static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
pud_t *pud;
unsigned long next;
@@ -615,7 +631,7 @@ static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
return -ENOMEM;
do {
next = pud_addr_end(addr, end);
- if (vmap_pages_pmd_range(pud, addr, next, prot, pages, nr, mask))
+ if (vmap_pages_pmd_range(pud, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (pud++, addr = next, addr != end);
return 0;
@@ -623,7 +639,7 @@ static int vmap_pages_pud_range(p4d_t *p4d, unsigned long addr,
static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
unsigned long end, pgprot_t prot, struct page **pages, int *nr,
- pgtbl_mod_mask *mask)
+ pgtbl_mod_mask *mask, unsigned int shift)
{
p4d_t *p4d;
unsigned long next;
@@ -633,14 +649,14 @@ static int vmap_pages_p4d_range(pgd_t *pgd, unsigned long addr,
return -ENOMEM;
do {
next = p4d_addr_end(addr, end);
- if (vmap_pages_pud_range(p4d, addr, next, prot, pages, nr, mask))
+ if (vmap_pages_pud_range(p4d, addr, next, prot, pages, nr, mask, shift))
return -ENOMEM;
} while (p4d++, addr = next, addr != end);
return 0;
}
-static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
- pgprot_t prot, struct page **pages)
+static int vmap_pages_range_noflush_walk(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages, unsigned int shift)
{
unsigned long start = addr;
pgd_t *pgd;
@@ -655,7 +671,7 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
next = pgd_addr_end(addr, end);
if (pgd_bad(*pgd))
mask |= PGTBL_PGD_MODIFIED;
- err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask);
+ err = vmap_pages_p4d_range(pgd, addr, next, prot, pages, &nr, &mask, shift);
if (err)
break;
} while (pgd++, addr = next, addr != end);
@@ -678,27 +694,13 @@ static int vmap_small_pages_range_noflush(unsigned long addr, unsigned long end,
int __vmap_pages_range_noflush(unsigned long addr, unsigned long end,
pgprot_t prot, struct page **pages, unsigned int page_shift)
{
- unsigned int i, nr = (end - addr) >> PAGE_SHIFT;
-
WARN_ON(page_shift < PAGE_SHIFT);
- if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC) ||
- page_shift == PAGE_SHIFT)
- return vmap_small_pages_range_noflush(addr, end, prot, pages);
-
- for (i = 0; i < nr; i += 1U << (page_shift - PAGE_SHIFT)) {
- int err;
-
- err = vmap_range_noflush(addr, addr + (1UL << page_shift),
- page_to_phys(pages[i]), prot,
- page_shift);
- if (err)
- return err;
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMALLOC))
+ page_shift = PAGE_SHIFT;
- addr += 1UL << page_shift;
- }
-
- return 0;
+ return vmap_pages_range_noflush_walk(addr, end, prot, pages,
+ min(page_shift, PMD_SHIFT));
}
int vmap_pages_range_noflush(unsigned long addr, unsigned long end,
--
2.34.1
* [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
` (3 preceding siblings ...)
2026-05-14 9:41 ` [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-20 11:53 ` Mike Rapoport
2026-05-14 9:41 ` [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings Wen Jiang
` (2 subsequent siblings)
7 siblings, 1 reply; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
In many cases, the pages passed to vmap() may include high-order
pages. For example, the system heap often allocates pages in descending
order: order 8, then 4, then 0. Currently, vmap() iterates over every
page individually; even pages inside a high-order block are handled
one by one.
This patch detects physically contiguous pages (regardless of whether
they are compound or non-compound) by scanning with
num_pages_contiguous(), and maps them as a single contiguous block
whenever possible. The first page's pfn must be aligned to the
mapping order for the batched mapping to be used.
Pages with the same page_shift are coalesced and mapped via
vmap_pages_range_noflush_walk() to avoid page table rewalk.
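The order selection can be sketched in userspace on a plain array of
pfns. This is an illustration under assumptions, not the patch code:
count_contig_pfns() is a hypothetical stand-in for
num_pages_contiguous(), and MIN_BATCH_ORDER models an architecture that
can only batch from order 4 upwards (ARM64 CONT_PTE with 4K pages).

```c
#include <assert.h>

#define MIN_BATCH_ORDER	4	/* assumed: 16 pages per CONT_PTE block */

/* Length of the physically contiguous run starting at pfns[0]. */
static unsigned int count_contig_pfns(const unsigned long *pfns,
				      unsigned int max)
{
	unsigned int n = 1;

	assert(max > 0);
	while (n < max && pfns[n] == pfns[0] + n)
		n++;
	return n;
}

static unsigned int floor_log2(unsigned int x)
{
	unsigned int r = 0;

	while (x >>= 1)
		r++;
	return r;
}

/* Returns the batch order for the run at idx, or 0 for single pages. */
static unsigned int batch_order(const unsigned long *pfns,
				unsigned int max_steps, unsigned int idx)
{
	unsigned int nr = count_contig_pfns(&pfns[idx], max_steps);
	unsigned int order;

	if (nr < 2)
		return 0;
	order = floor_log2(nr);
	if (order < MIN_BATCH_ORDER)
		return 0;	/* the architecture cannot batch this */
	if (pfns[idx] & ((1UL << order) - 1))
		return 0;	/* first pfn not aligned to the order */
	return order;
}
```

Note that, as in the description above, a misaligned first pfn falls
back to order 0 rather than retrying with a smaller order.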
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Co-developed-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++--
1 file changed, 73 insertions(+), 2 deletions(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 516d40650..c30a7673e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3520,6 +3520,77 @@ void vunmap(const void *addr)
}
EXPORT_SYMBOL(vunmap);
+static inline int get_vmap_batch_order(struct page **pages,
+ unsigned int max_steps, unsigned int idx)
+{
+ unsigned int nr_contig;
+ int order;
+
+ if (!IS_ENABLED(CONFIG_HAVE_ARCH_HUGE_VMAP) ||
+ ioremap_max_page_shift == PAGE_SHIFT)
+ return 0;
+
+ nr_contig = num_pages_contiguous(&pages[idx], max_steps);
+ if (nr_contig < 2)
+ return 0;
+
+ order = fls(nr_contig) - 1;
+
+ if (arch_vmap_pte_supported_shift(PAGE_SIZE << order) == PAGE_SHIFT)
+ return 0;
+
+ /* Ensure the first page's pfn is aligned to the order */
+ if (!IS_ALIGNED(page_to_pfn(pages[idx]), 1 << order))
+ return 0;
+
+ return order;
+}
+
+static int __vmap_huge(unsigned long addr, unsigned long end,
+ pgprot_t prot, struct page **pages)
+{
+ unsigned int count = (end - addr) >> PAGE_SHIFT;
+ unsigned int prev_shift = 0, idx = 0;
+ unsigned long map_addr = addr;
+ int err;
+
+ err = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
+ PAGE_SHIFT, GFP_KERNEL);
+ if (err)
+ goto out;
+
+ for (unsigned int i = 0; i < count; ) {
+ unsigned int shift = PAGE_SHIFT +
+ get_vmap_batch_order(pages, count - i, i);
+
+ if (!i)
+ prev_shift = shift;
+
+ if (shift != prev_shift) {
+ err = vmap_pages_range_noflush_walk(map_addr, addr,
+ prot, pages + idx,
+ min(prev_shift, PMD_SHIFT));
+ if (err)
+ goto out;
+ prev_shift = shift;
+ map_addr = addr;
+ idx = i;
+ }
+
+ addr += 1UL << shift;
+ i += 1U << (shift - PAGE_SHIFT);
+ }
+
+ /* Remaining */
+ if (map_addr < end)
+ err = vmap_pages_range_noflush_walk(map_addr, end,
+ prot, pages + idx, min(prev_shift, PMD_SHIFT));
+
+out:
+ flush_cache_vmap(addr, end);
+ return err;
+}
+
/**
* vmap - map an array of pages into virtually contiguous space
* @pages: array of page pointers
@@ -3563,8 +3634,8 @@ void *vmap(struct page **pages, unsigned int count,
return NULL;
addr = (unsigned long)area->addr;
- if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
- pages, PAGE_SHIFT) < 0) {
+ if (__vmap_huge(addr, addr + size, pgprot_nx(prot),
+ pages) < 0) {
vunmap(area->addr);
return NULL;
}
--
2.34.1
* [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
` (4 preceding siblings ...)
2026-05-14 9:41 ` [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-20 7:32 ` Uladzislau Rezki
2026-05-14 9:41 ` [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap Wen Jiang
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
7 siblings, 1 reply; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Try to align the vmap virtual address to PMD_SHIFT or a
larger PTE mapping size hinted by the architecture, so
contiguous pages can be batch-mapped when setting PMD or
PTE entries.
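The order in which alignments are tried can be sketched as follows. The
constants assume arm64 with a 4K granule, and arch_pte_shift() is a
hypothetical model of arch_vmap_pte_supported_shift() that returns the
CONT_PTE shift once the size reaches 64K. No vm_area is allocated here;
the sketch only shows which shift is attempted next after a failure.

```c
#include <assert.h>

/* Assumed arm64 values for a 4K granule. */
#define PAGE_SHIFT	12
#define CONT_PTE_SHIFT	16
#define PMD_SHIFT	21
#define CONT_PTE_SIZE	(1UL << CONT_PTE_SHIFT)
#define PMD_SIZE	(1UL << PMD_SHIFT)

/* Hypothetical model of the architecture hint. */
static unsigned int arch_pte_shift(unsigned long size)
{
	return size >= CONT_PTE_SIZE ? CONT_PTE_SHIFT : PAGE_SHIFT;
}

/* Alignment shift for the first allocation attempt. */
static unsigned int first_align_shift(unsigned long size)
{
	return size >= PMD_SIZE ? PMD_SHIFT : arch_pte_shift(size);
}

/* Alignment shift to retry with after an attempt at 'shift' failed. */
static unsigned int next_align_shift(unsigned int shift, unsigned long size)
{
	assert(shift > PAGE_SHIFT);
	if (shift == PMD_SHIFT)
		return arch_pte_shift(size);
	return PAGE_SHIFT;
}
```

Under these assumptions a 4M request tries 2M alignment, then 64K, then
4K, while an 8K request starts at page alignment and has nothing to fall
back to.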
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c30a7673e..b3389c8f1 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3591,6 +3591,35 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
return err;
}
+static struct vm_struct *get_aligned_vm_area(unsigned long size, unsigned long flags)
+{
+ unsigned int shift = (size >= PMD_SIZE) ? PMD_SHIFT :
+ arch_vmap_pte_supported_shift(size);
+ struct vm_struct *vm_area = NULL;
+
+ /*
+ * Try to allocate an aligned vm_area so contiguous pages can be
+ * mapped in batches.
+ */
+ while (1) {
+ unsigned long align = 1UL << shift;
+
+ vm_area = __get_vm_area_node(size, align, PAGE_SHIFT, flags,
+ VMALLOC_START, VMALLOC_END,
+ NUMA_NO_NODE, GFP_KERNEL,
+ __builtin_return_address(0));
+ if (vm_area || shift <= PAGE_SHIFT)
+ goto out;
+ if (shift == PMD_SHIFT)
+ shift = arch_vmap_pte_supported_shift(size);
+ else if (shift > PAGE_SHIFT)
+ shift = PAGE_SHIFT;
+ }
+
+out:
+ return vm_area;
+}
+
/**
* vmap - map an array of pages into virtually contiguous space
* @pages: array of page pointers
@@ -3629,7 +3658,7 @@ void *vmap(struct page **pages, unsigned int count,
return NULL;
size = (unsigned long)count << PAGE_SHIFT;
- area = get_vm_area_caller(size, flags, __builtin_return_address(0));
+ area = get_aligned_vm_area(size, flags);
if (!area)
return NULL;
--
2.34.1
* [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
` (5 preceding siblings ...)
2026-05-14 9:41 ` [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings Wen Jiang
@ 2026-05-14 9:41 ` Wen Jiang
2026-05-20 9:44 ` Uladzislau Rezki
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
7 siblings, 1 reply; 20+ messages in thread
From: Wen Jiang @ 2026-05-14 9:41 UTC
To: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki
Cc: baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang, Xueyuan Chen
From: "Barry Song (Xiaomi)" <baohua@kernel.org>
Users typically allocate memory in descending orders, e.g.
8 → 4 → 0. Once an order-0 page is encountered, subsequent
pages are likely to also be order-0, so we stop scanning
for compound pages at that point.
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
---
mm/vmalloc.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index b3389c8f1..60579bfbf 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3576,6 +3576,12 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
map_addr = addr;
idx = i;
}
+ /*
+ * Once small pages are encountered, the remaining pages
+ * are likely small as well
+ */
+ if (shift == PAGE_SHIFT)
+ break;
addr += 1UL << shift;
i += 1U << (shift - PAGE_SHIFT);
--
2.34.1
* Re: [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
` (6 preceding siblings ...)
2026-05-14 9:41 ` [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap Wen Jiang
@ 2026-05-19 20:17 ` Andrew Morton
2026-05-20 3:40 ` Dev Jain
` (2 more replies)
7 siblings, 3 replies; 20+ messages in thread
From: Andrew Morton @ 2026-05-19 20:17 UTC
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, urezki, baohua,
Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
> This patchset accelerates ioremap, vmalloc, and vmap when the memory
> is physically fully or partially contiguous.
>
> ...
>
> On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> the performance CPUfreq policy enabled, benchmark results:
>
> * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> * vmalloc(1 MB) mapping time (excluding allocation) with
> VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53us)
> * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
Nice.
AI review found a bunch of things to ask about:
https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
It doesn't appear that you'll be getting any more review on this
series, so please check the above questions and resend?
* Re: [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
@ 2026-05-20 3:40 ` Dev Jain
2026-05-20 7:15 ` Uladzislau Rezki
2026-05-20 12:29 ` Wen Jiang
2 siblings, 0 replies; 20+ messages in thread
From: Dev Jain @ 2026-05-20 3:40 UTC
To: Andrew Morton, Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, urezki, baohua,
Xueyuan.chen21, rppt, david, ryan.roberts, anshuman.khandual, ajd,
linux-kernel, Wen Jiang
On 20/05/26 1:47 am, Andrew Morton wrote:
> On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
>
>> This patchset accelerates ioremap, vmalloc, and vmap when the memory
>> is physically fully or partially contiguous.
>>
>> ...
>>
>> On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
>> the performance CPUfreq policy enabled, benchmark results:
>>
>> * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
>> * vmalloc(1 MB) mapping time (excluding allocation) with
>> VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53us)
>> * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
>
> Nice.
>
> AI review found a bunch of things to ask about:
> https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
>
> It doesn't appear that you'll be getting any more review on this
> series, so please check the above questions and resend?
I need to review this but am struggling to find time right now, so
please don't wait for me :)
>
* Re: [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
2026-05-20 3:40 ` Dev Jain
@ 2026-05-20 7:15 ` Uladzislau Rezki
2026-05-20 8:37 ` Wen Jiang
2026-05-20 12:29 ` Wen Jiang
2 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2026-05-20 7:15 UTC
To: Andrew Morton
Cc: Wen Jiang, linux-mm, linux-arm-kernel, catalin.marinas, will,
urezki, baohua, Xueyuan.chen21, dev.jain, rppt, david,
ryan.roberts, anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Tue, May 19, 2026 at 01:17:38PM -0700, Andrew Morton wrote:
> On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
>
> > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > is physically fully or partially contiguous.
> >
> > ...
> >
> > On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> > the performance CPUfreq policy enabled, benchmark results:
> >
> > * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> > * vmalloc(1 MB) mapping time (excluding allocation) with
> > VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53us)
> > * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
>
> Nice.
>
> AI review found a bunch of things to ask about:
> https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
>
> It doesn't appear that you'll be getting any more review on this
> series, so please check the above questions and resend?
>
Actually, I am keeping an eye on it and have done some stability testing,
so I just need some time. Fixing the AI review findings sounds good.
--
Uladzislau Rezki
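
The speedup factors quoted from the cover letter follow directly from the before/after timings. A minimal sketch of that arithmetic, where sk_speedup() is a hypothetical helper introduced only for this check:

```c
#include <assert.h>

/*
 * Speedup factor = time before / time after.
 * Both arguments must use the same unit (ns or us in the cover letter).
 * sk_speedup() is a hypothetical name, not something from the series.
 */
static double sk_speedup(double before, double after)
{
	assert(after > 0.0);
	return before / after;
}
```

Feeding in the quoted timings (3407 and 2526 ns, 5.00 and 3.53 us, 1235 and 149 us) should give factors that round to the 1.35x, 1.42x and 8.3x figures reported.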
* Re: [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings
2026-05-14 9:41 ` [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings Wen Jiang
@ 2026-05-20 7:32 ` Uladzislau Rezki
2026-05-20 7:55 ` Barry Song
0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2026-05-20 7:32 UTC (permalink / raw)
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Thu, May 14, 2026 at 05:41:07PM +0800, Wen Jiang wrote:
> From: "Barry Song (Xiaomi)" <baohua@kernel.org>
>
> Try to align the vmap virtual address to PMD_SHIFT or a
> larger PTE mapping size hinted by the architecture, so
> contiguous pages can be batch-mapped when setting PMD or
> PTE entries.
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> ---
> mm/vmalloc.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index c30a7673e..b3389c8f1 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3591,6 +3591,35 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> return err;
> }
>
> +static struct vm_struct *get_aligned_vm_area(unsigned long size, unsigned long flags)
> +{
> + unsigned int shift = (size >= PMD_SIZE) ? PMD_SHIFT :
> + arch_vmap_pte_supported_shift(size);
> + struct vm_struct *vm_area = NULL;
> +
> + /*
> + * Try to allocate an aligned vm_area so contiguous pages can be
> + * mapped in batches.
> + */
> + while (1) {
> + unsigned long align = 1UL << shift;
> +
> + vm_area = __get_vm_area_node(size, align, PAGE_SHIFT, flags,
> + VMALLOC_START, VMALLOC_END,
> + NUMA_NO_NODE, GFP_KERNEL,
> + __builtin_return_address(0));
> + if (vm_area || shift <= PAGE_SHIFT)
> + goto out;
> + if (shift == PMD_SHIFT)
> + shift = arch_vmap_pte_supported_shift(size);
> + else if (shift > PAGE_SHIFT)
> + shift = PAGE_SHIFT;
> + }
> +
> +out:
> + return vm_area;
> +}
> +
IMO, we should get rid of this while (1) loop. It looks like you need to
handle just a few cases, three?
The minimum value of shift is PAGE_SHIFT; could you please clarify when it
can be less?
--
Uladzislau Rezki
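
Why the alignment of the vm_area matters for batching can be sketched in user space: a block mapping of size 1 << shift is only possible when both the virtual address and the physical frame number are aligned to that size. SK_PAGE_SHIFT and sk_can_batch() below are hypothetical names for illustration (4K pages assumed); this is not the check the patch itself implements.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed base page shift (4K pages); illustrative only. */
#define SK_PAGE_SHIFT 12u

/*
 * Return true if a single mapping of size (1 << shift) could cover the
 * range starting at vaddr and backed by pfn, i.e. both are aligned to
 * the mapping size. Hypothetical helper, not a kernel function.
 */
static bool sk_can_batch(unsigned long vaddr, unsigned long pfn,
			 unsigned int shift)
{
	unsigned long size, pages;

	assert(shift >= SK_PAGE_SHIFT && shift < sizeof(unsigned long) * 8);
	size = 1UL << shift;			/* bytes per mapping */
	pages = 1UL << (shift - SK_PAGE_SHIFT);	/* base pages per mapping */

	return (vaddr & (size - 1)) == 0 && (pfn & (pages - 1)) == 0;
}
```

With a 2MB shift of 21, a virtual address that is off by one page fails the check even when the backing pages are perfectly contiguous, which is the case an aligned vm_area is meant to avoid.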
* Re: [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings
2026-05-20 7:32 ` Uladzislau Rezki
@ 2026-05-20 7:55 ` Barry Song
2026-05-20 9:12 ` Uladzislau Rezki
0 siblings, 1 reply; 20+ messages in thread
From: Barry Song @ 2026-05-20 7:55 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Wen Jiang, linux-mm, linux-arm-kernel, catalin.marinas, will,
akpm, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Wed, May 20, 2026 at 3:37 PM Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Thu, May 14, 2026 at 05:41:07PM +0800, Wen Jiang wrote:
> > From: "Barry Song (Xiaomi)" <baohua@kernel.org>
> >
> > Try to align the vmap virtual address to PMD_SHIFT or a
> > larger PTE mapping size hinted by the architecture, so
> > contiguous pages can be batch-mapped when setting PMD or
> > PTE entries.
> >
> > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> > Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> > ---
> > mm/vmalloc.c | 31 ++++++++++++++++++++++++++++++-
> > 1 file changed, 30 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index c30a7673e..b3389c8f1 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3591,6 +3591,35 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> > return err;
> > }
> >
> > +static struct vm_struct *get_aligned_vm_area(unsigned long size, unsigned long flags)
> > +{
> > + unsigned int shift = (size >= PMD_SIZE) ? PMD_SHIFT :
> > + arch_vmap_pte_supported_shift(size);
> > + struct vm_struct *vm_area = NULL;
> > +
> > + /*
> > + * Try to allocate an aligned vm_area so contiguous pages can be
> > + * mapped in batches.
> > + */
> > + while (1) {
> > + unsigned long align = 1UL << shift;
> > +
> > + vm_area = __get_vm_area_node(size, align, PAGE_SHIFT, flags,
> > + VMALLOC_START, VMALLOC_END,
> > + NUMA_NO_NODE, GFP_KERNEL,
> > + __builtin_return_address(0));
> > + if (vm_area || shift <= PAGE_SHIFT)
> > + goto out;
> > + if (shift == PMD_SHIFT)
> > + shift = arch_vmap_pte_supported_shift(size);
> > + else if (shift > PAGE_SHIFT)
> > + shift = PAGE_SHIFT;
> > + }
> > +
> > +out:
> > + return vm_area;
> > +}
> > +
> IMO, we should get rid of this while (1) loop. It looks like you need to
> handle just a few cases, three?
Hi Uladzislau,
I don’t quite understand what you mean — are you suggesting
calling __get_vm_area_node() three times? We try 2MB first,
then 64KB, and finally 4KB. If 2MB succeeds, there is no
reason to try 64KB. Likewise, if 64KB succeeds, there is no
need to fall back to 4KB.
>
>
> The minimum value of shift is PAGE_SHIFT; could you please clarify when it
> can be less?
I guess this should be changed to "==" ?
Thanks
Barry
* Re: [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
2026-05-20 7:15 ` Uladzislau Rezki
@ 2026-05-20 8:37 ` Wen Jiang
0 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-20 8:37 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Andrew Morton, linux-mm, linux-arm-kernel, catalin.marinas, will,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
Thanks, I'll look into it.
On Wed, May 20, 2026 at 15:15, Uladzislau Rezki <urezki@gmail.com> wrote:
> On Tue, May 19, 2026 at 01:17:38PM -0700, Andrew Morton wrote:
> > On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
> >
> > > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > > is physically fully or partially contiguous.
> > >
> > > ...
> > >
> > > On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> > > the performance CPUfreq policy enabled, benchmark results:
> > >
> > > * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> > > * vmalloc(1 MB) mapping time (excluding allocation) with
> > > VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53us)
> > > * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
> >
> > Nice.
> >
> > AI review found a bunch of things to ask about:
> > https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
> >
> > It doesn't appear that you'll be getting any more review on this
> > series, so please check the above questions and resend?
> >
> Actually, I am keeping an eye on it and have done some stability testing,
> so I just need some time. Fixing the AI review findings sounds good.
>
> --
> Uladzislau Rezki
>
* Re: [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings
2026-05-20 7:55 ` Barry Song
@ 2026-05-20 9:12 ` Uladzislau Rezki
0 siblings, 0 replies; 20+ messages in thread
From: Uladzislau Rezki @ 2026-05-20 9:12 UTC (permalink / raw)
To: Barry Song
Cc: Uladzislau Rezki, Wen Jiang, linux-mm, linux-arm-kernel,
catalin.marinas, will, akpm, Xueyuan.chen21, dev.jain, rppt,
david, ryan.roberts, anshuman.khandual, ajd, linux-kernel,
Wen Jiang
On Wed, May 20, 2026 at 03:55:02PM +0800, Barry Song wrote:
> On Wed, May 20, 2026 at 3:37 PM Uladzislau Rezki <urezki@gmail.com> wrote:
> >
> > On Thu, May 14, 2026 at 05:41:07PM +0800, Wen Jiang wrote:
> > > From: "Barry Song (Xiaomi)" <baohua@kernel.org>
> > >
> > > Try to align the vmap virtual address to PMD_SHIFT or a
> > > larger PTE mapping size hinted by the architecture, so
> > > contiguous pages can be batch-mapped when setting PMD or
> > > PTE entries.
> > >
> > > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > > Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> > > Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> > > ---
> > > mm/vmalloc.c | 31 ++++++++++++++++++++++++++++++-
> > > 1 file changed, 30 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index c30a7673e..b3389c8f1 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3591,6 +3591,35 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> > > return err;
> > > }
> > >
> > > +static struct vm_struct *get_aligned_vm_area(unsigned long size, unsigned long flags)
> > > +{
> > > + unsigned int shift = (size >= PMD_SIZE) ? PMD_SHIFT :
> > > + arch_vmap_pte_supported_shift(size);
> > > + struct vm_struct *vm_area = NULL;
> > > +
> > > + /*
> > > + * Try to allocate an aligned vm_area so contiguous pages can be
> > > + * mapped in batches.
> > > + */
> > > + while (1) {
> > > + unsigned long align = 1UL << shift;
> > > +
> > > + vm_area = __get_vm_area_node(size, align, PAGE_SHIFT, flags,
> > > + VMALLOC_START, VMALLOC_END,
> > > + NUMA_NO_NODE, GFP_KERNEL,
> > > + __builtin_return_address(0));
> > > + if (vm_area || shift <= PAGE_SHIFT)
> > > + goto out;
> > > + if (shift == PMD_SHIFT)
> > > + shift = arch_vmap_pte_supported_shift(size);
> > > + else if (shift > PAGE_SHIFT)
> > > + shift = PAGE_SHIFT;
> > > + }
> > > +
> > > +out:
> > > + return vm_area;
> > > +}
> > > +
> > IMO, we should get rid of this while (1) loop. It looks like you need to
> > handle just a few cases, three?
>
Hello, Barry!
>
> I don’t quite understand what you mean — are you suggesting
> calling __get_vm_area_node() three times? We try 2MB first,
> then 64KB, and finally 4KB. If 2MB succeeds, there is no
> reason to try 64KB. Likewise, if 64KB succeeds, there is no
> need to fall back to 4KB.
>
I mean either open-coding the three cases:
...
	if (size >= PMD_SIZE) {
		vm_area = alloc_vm_area_with_shift(PMD_SHIFT);
		if (vm_area)
			return vm_area;
	}

	shift = get_supported_shift(size);
	if (shift > PAGE_SHIFT) {
		vm_area = alloc_vm_area_with_shift(shift);
		if (vm_area)
			return vm_area;
	}

	return alloc_vm_area_with_shift(PAGE_SHIFT);
...
or putting everything into a for (i = 0; i < 3; i++) loop. That way it is
guaranteed to terminate, and it is obvious to the reader that we handle at
most three scenarios.
> >
> >
> > The minimum value of shift is PAGE_SHIFT; could you please clarify when it
> > can be less?
>
> I guess this should be changed to "==" ?
>
I assume shift cannot be less than PAGE_SHIFT :)
--
Uladzislau Rezki
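
The bounded for-loop alternative suggested above can be sketched in user space as follows. This is a sketch under stated assumptions, not the series' code: the SK_* shift values assume 4K pages with a 64KB contiguous-PTE size and a 2MB PMD, and try_alloc_fn, sk_pte_shift(), pick_alignment() and the two sample hooks are hypothetical stand-ins (the real function would call __get_vm_area_node() and arch_vmap_pte_supported_shift()).

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative shift values (assumption: 4K pages, 64KB cont-PTE, 2MB PMD). */
#define SK_PAGE_SHIFT     12u
#define SK_CONT_PTE_SHIFT 16u
#define SK_PMD_SHIFT      21u

/* Hypothetical allocation hook: nonzero means the aligned area was obtained. */
typedef int (*try_alloc_fn)(unsigned long size, unsigned int shift);

/* Hypothetical stand-in for arch_vmap_pte_supported_shift(). */
static unsigned int sk_pte_shift(unsigned long size)
{
	return size >= (1UL << SK_CONT_PTE_SHIFT) ? SK_CONT_PTE_SHIFT
						   : SK_PAGE_SHIFT;
}

/*
 * Build the list of candidate alignments (at most three, largest first)
 * and try each one once. Returns the shift that succeeded, or 0 if every
 * attempt failed. The loop bound makes termination obvious.
 */
static unsigned int pick_alignment(unsigned long size, try_alloc_fn try_alloc)
{
	unsigned int shifts[3];
	unsigned int pte_shift = sk_pte_shift(size);
	size_t n = 0, i;

	if (size >= (1UL << SK_PMD_SHIFT))
		shifts[n++] = SK_PMD_SHIFT;
	if (pte_shift > SK_PAGE_SHIFT)
		shifts[n++] = pte_shift;
	shifts[n++] = SK_PAGE_SHIFT;

	assert(n <= 3);
	for (i = 0; i < n; i++)
		if (try_alloc(size, shifts[i]))
			return shifts[i];
	return 0;
}

/* Sample hooks used to exercise the sketch. */
static int sk_alloc_any(unsigned long size, unsigned int shift)
{
	(void)size;
	(void)shift;
	return 1;	/* every alignment can be satisfied */
}

static int sk_alloc_small_only(unsigned long size, unsigned int shift)
{
	(void)size;
	return shift == SK_PAGE_SHIFT;	/* only page alignment succeeds */
}
```

Compared with the while (1) version, the candidate list also removes the need for a "shift <= PAGE_SHIFT" exit test, since the last entry is always PAGE_SHIFT.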
* Re: [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap
2026-05-14 9:41 ` [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap Wen Jiang
@ 2026-05-20 9:44 ` Uladzislau Rezki
2026-05-20 10:56 ` Wen Jiang
0 siblings, 1 reply; 20+ messages in thread
From: Uladzislau Rezki @ 2026-05-20 9:44 UTC (permalink / raw)
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Thu, May 14, 2026 at 05:41:08PM +0800, Wen Jiang wrote:
> From: "Barry Song (Xiaomi)" <baohua@kernel.org>
>
> Users typically allocate memory in descending orders, e.g.
> 8 → 4 → 0. Once an order-0 page is encountered, subsequent
> pages are likely to also be order-0, so we stop scanning
> for compound pages at that point.
>
> Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> ---
> mm/vmalloc.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index b3389c8f1..60579bfbf 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3576,6 +3576,12 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> map_addr = addr;
> idx = i;
> }
> + /*
> + * Once small pages are encountered, the remaining pages
> + * are likely small as well
> + */
> + if (shift == PAGE_SHIFT)
> + break;
>
> addr += 1UL << shift;
> i += 1U << (shift - PAGE_SHIFT);
> --
> 2.34.1
>
Can we squash this patch with
"mm/vmalloc: map contiguous pages in batches for vmap() if possible"?
--
Uladzislau Rezki
* Re: [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap
2026-05-20 9:44 ` Uladzislau Rezki
@ 2026-05-20 10:56 ` Wen Jiang
0 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-20 10:56 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, baohua,
Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel
Sure, will do.
Thanks,
Wen Jiang
On Wed, 20 May 2026 at 17:44, Uladzislau Rezki <urezki@gmail.com> wrote:
>
> On Thu, May 14, 2026 at 05:41:08PM +0800, Wen Jiang wrote:
> > From: "Barry Song (Xiaomi)" <baohua@kernel.org>
> >
> > Users typically allocate memory in descending orders, e.g.
> > 8 → 4 → 0. Once an order-0 page is encountered, subsequent
> > pages are likely to also be order-0, so we stop scanning
> > for compound pages at that point.
> >
> > Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
> > Signed-off-by: Wen Jiang <jiangwen6@xiaomi.com>
> > Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
> > ---
> > mm/vmalloc.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index b3389c8f1..60579bfbf 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3576,6 +3576,12 @@ static int __vmap_huge(unsigned long addr, unsigned long end,
> > map_addr = addr;
> > idx = i;
> > }
> > + /*
> > + * Once small pages are encountered, the remaining pages
> > + * are likely small as well
> > + */
> > + if (shift == PAGE_SHIFT)
> > + break;
> >
> > addr += 1UL << shift;
> > i += 1U << (shift - PAGE_SHIFT);
> > --
> > 2.34.1
> >
> Can we squash this patch with
> "mm/vmalloc: map contiguous pages in batches for vmap() if possible"?
>
> --
> Uladzislau Rezki
* Re: [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk
2026-05-14 9:41 ` [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk Wen Jiang
@ 2026-05-20 11:53 ` Mike Rapoport
0 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2026-05-20 11:53 UTC (permalink / raw)
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Thu, 14 May 2026 17:41:05 +0800, Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
Hi,
> vmap_pages_range_noflush_walk() (formerly vmap_small_pages_range_noflush())
> provides a clean interface by taking struct page **pages and mapping them
> via direct PTE iteration. This avoids the page table rewalk seen when
> using vmap_range_noflush() for page_shift values other than PAGE_SHIFT.
>
> Extend it to support larger page_shift values, and add PMD- and
> contiguous-PTE mappings as well. Rename it to vmap_pages_range_noflush_walk()
> since it now handles more than just small pages.
>
> For vmalloc() allocations with VM_ALLOW_HUGE_VMAP, we no longer need to
> iterate over pages one by one via vmap_range_noflush(), which would
> otherwise lead to page table rewalk. The code is now unified with the
> PAGE_SHIFT case by simply calling vmap_pages_range_noflush_walk().
After this patch we have two very similar page table walkers:
vmap_pages_range_noflush_walk() and vmap_range_noflush().
They subtly differ in the levels at which they try huge mappings and in
how they account for page table modifications, and vmap_range_noflush()
is left without support for contiguous mappings.
Is there a fundamental reason to have two page walkers?
Is there a reason not to support contiguous mappings in
vmap_range_noflush()?
--
Sincerely yours,
Mike.
* Re: [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible
2026-05-14 9:41 ` [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible Wen Jiang
@ 2026-05-20 11:53 ` Mike Rapoport
0 siblings, 0 replies; 20+ messages in thread
From: Mike Rapoport @ 2026-05-20 11:53 UTC (permalink / raw)
To: Wen Jiang
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, akpm, urezki,
baohua, Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
On Thu, 14 May 2026 17:41:06 +0800, Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 516d406507e9..c30a7673e0ac 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3520,6 +3520,77 @@ void vunmap(const void *addr)
> [ ... skip 25 lines ... ]
> +
> + return order;
> +}
> +
> +static int __vmap_huge(unsigned long addr, unsigned long end,
> + pgprot_t prot, struct page **pages)
This won't necessarily create huge mappings; maybe name it vmap_batched?
--
Sincerely yours,
Mike.
* Re: [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
2026-05-20 3:40 ` Dev Jain
2026-05-20 7:15 ` Uladzislau Rezki
@ 2026-05-20 12:29 ` Wen Jiang
2 siblings, 0 replies; 20+ messages in thread
From: Wen Jiang @ 2026-05-20 12:29 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-arm-kernel, catalin.marinas, will, urezki, baohua,
Xueyuan.chen21, dev.jain, rppt, david, ryan.roberts,
anshuman.khandual, ajd, linux-kernel, Wen Jiang
Hi Andrew,
I've reviewed all the Sashiko findings:
- Patch 2 (fls() truncation risk): Will fix. Replace fls() with
__fls() to accept unsigned long directly.
- Patch 4 (nr overflow risk): Pre-existing type choice.
- Patch 4 (missing NULL check before page_to_phys): Will fix.
Add defensive checks consistent with vmap_pages_pte_range().
- Patch 5 (flush_cache_vmap with empty range): Valid point. Will
save the original start address and use it for the final flush.
- Patch 5 (virtual address alignment not checked): Addressed by
Patch 6 in this series.
- Patch 6 (caller tracking loss and while(1) loop): Valid point.
Will pass caller as a parameter and restructure per Uladzislau's
suggestion to replace while(1) with explicit sequential attempts.
- Patch 7 (partial cache flush on early break): Same root cause as
the Patch 5 flush issue.
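
The fls() truncation risk from the first item can be reproduced in user space. In the kernel, fls() takes an unsigned int and returns a 1-based bit position (0 for an input of 0), while __fls() takes an unsigned long and returns a 0-based index (undefined for 0). The sk_* functions below are hypothetical models built on GCC/Clang builtins, not the kernel implementations:

```c
#include <assert.h>
#include <limits.h>

/* Model of fls(): unsigned int in, 1-based position of the top set bit. */
static int sk_fls(unsigned int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz(x) : 0;
}

/* Model of __fls(): unsigned long in, 0-based index; word must be nonzero. */
static unsigned long sk_fls_long(unsigned long word)
{
	assert(word != 0);
	return (sizeof(word) * CHAR_BIT - 1) -
	       (unsigned long)__builtin_clzl(word);
}

/* What happens when an unsigned long is passed to an unsigned int fls(). */
static int sk_fls_truncated(unsigned long size)
{
	return sk_fls((unsigned int)size);	/* upper bits are dropped */
}
```

On a 64-bit build, a value with only high bits set truncates to 0 in the first variant. Note that the two conventions also differ by one for the same nonzero input, so a replacement has to account for that offset.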
Will resend V3 shortly.
Thanks,
Wen
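
The empty-range flush finding can be modelled as follows. This is a simplified user-space model of the pattern described in the list above, assuming the mapping loop advances its address cursor up to the end of the range; struct sk_range and both functions are hypothetical and do not reproduce the patch's code.

```c
#include <assert.h>

/* Range that would be handed to the final cache flush. */
struct sk_range {
	unsigned long start;
	unsigned long end;
};

/* Problem pattern: the loop cursor is reused as the flush start. */
static struct sk_range sk_flush_range_buggy(unsigned long addr,
					    unsigned long end,
					    unsigned long step)
{
	struct sk_range flushed;

	assert(step != 0);
	while (addr < end)
		addr += step;		/* stands in for mapping one chunk */

	flushed.start = addr;		/* cursor is already at or past end */
	flushed.end = end;
	return flushed;
}

/* Fix described above: remember the original start before the loop. */
static struct sk_range sk_flush_range_fixed(unsigned long addr,
					    unsigned long end,
					    unsigned long step)
{
	const unsigned long start = addr;
	struct sk_range flushed;

	assert(step != 0);
	while (addr < end)
		addr += step;

	flushed.start = start;		/* covers everything that was mapped */
	flushed.end = end;
	return flushed;
}
```

In the first variant the flushed range is empty once the loop completes; saving the start address makes the flush cover the whole mapped range regardless of where the loop stops.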
On Wed, 20 May 2026 at 04:17, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Thu, 14 May 2026 17:41:01 +0800 Wen Jiang <jiangwenxiaomi@gmail.com> wrote:
>
> > This patchset accelerates ioremap, vmalloc, and vmap when the memory
> > is physically fully or partially contiguous.
> >
> > ...
> >
> > On the RK3588 8-core ARM64 SoC, with tasks pinned to a little core and
> > the performance CPUfreq policy enabled, benchmark results:
> >
> > * ioremap(1 MB): 1.35× faster (3407 ns -> 2526 ns)
> > * vmalloc(1 MB) mapping time (excluding allocation) with
> > VM_ALLOW_HUGE_VMAP: 1.42× faster (5.00 us -> 3.53us)
> > * vmap(100MB) with order-8 pages: 8.3× faster (1235 us -> 149 us)
>
> Nice.
>
> AI review found a bunch of things to ask about:
> https://sashiko.dev/#/patchset/20260514094108.2016201-1-jiangwen6@xiaomi.com
>
> It doesn't appear that you'll be getting any more review on this
> series, so please check the above questions and resend?
>
Thread overview: 20+ messages
2026-05-14 9:41 [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Wen Jiang
2026-05-14 9:41 ` [PATCH v2 1/7] arm64/hugetlb: Extend batching of multiple CONT_PTE in a single PTE setup Wen Jiang
2026-05-14 9:41 ` [PATCH v2 2/7] arm64/vmalloc: Allow arch_vmap_pte_range_map_size to batch multiple CONT_PTE Wen Jiang
2026-05-14 9:41 ` [PATCH v2 3/7] mm/vmalloc: Extract vmap_set_ptes() to consolidate PTE mapping logic Wen Jiang
2026-05-14 9:41 ` [PATCH v2 4/7] mm/vmalloc: Extend page table walk to support larger page_shift sizes and eliminate page table rewalk Wen Jiang
2026-05-20 11:53 ` Mike Rapoport
2026-05-14 9:41 ` [PATCH v2 5/7] mm/vmalloc: map contiguous pages in batches for vmap() if possible Wen Jiang
2026-05-20 11:53 ` Mike Rapoport
2026-05-14 9:41 ` [PATCH v2 6/7] mm/vmalloc: align vm_area so vmap() can batch mappings Wen Jiang
2026-05-20 7:32 ` Uladzislau Rezki
2026-05-20 7:55 ` Barry Song
2026-05-20 9:12 ` Uladzislau Rezki
2026-05-14 9:41 ` [PATCH v2 7/7] mm/vmalloc: Stop scanning for compound pages after encountering small pages in vmap Wen Jiang
2026-05-20 9:44 ` Uladzislau Rezki
2026-05-20 10:56 ` Wen Jiang
2026-05-19 20:17 ` [PATCH v2 0/7] mm/vmalloc: Speed up ioremap, vmalloc and vmap with contiguous memory Andrew Morton
2026-05-20 3:40 ` Dev Jain
2026-05-20 7:15 ` Uladzislau Rezki
2026-05-20 8:37 ` Wen Jiang
2026-05-20 12:29 ` Wen Jiang