From: Barry Song <baohua@kernel.org>
To: urezki@gmail.com
Cc: 21cnbao@gmail.com, akpm@linux-foundation.org, david@kernel.org,
	dri-devel@lists.freedesktop.org, jstultz@google.com,
	linaro-mm-sig@lists.linaro.org, linux-kernel@vger.kernel.org,
	linux-media@vger.kernel.org, linux-mm@kvack.org,
	mripard@kernel.org, sumit.semwal@linaro.org,
	xueyuan.chen21@gmail.com
Subject: Re: [PATCH] mm/vmalloc: map contiguous pages in batches for vmap() whenever possible
Date: Fri,  3 Apr 2026 17:20:28 +0800
Message-ID: <20260403092028.61257-1-baohua@kernel.org>
In-Reply-To: <aVvmxGUp2l0Tavwb@milan>


> I think so, at least the place:
> 
> <snip>
> [    2.959030] Oops: Oops: 0000 [#66] SMP NOPTI
> [    2.960004] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.18.0+ #220 PREEMPT(none)
> [    2.961781] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
> [    2.963870] BUG: unable to handle page fault for address: ffffffff3fd68118
> [    2.965383] #PF: supervisor read access in kernel mode
> [    2.966532] #PF: error_code(0x0000) - not-present page
> [    2.967682] BAD
> <snip>
> 
> but it is broken for sure:

> i += 1U << shift - "i" is an index in the page array.
> For example if order-0 you jump 4096 indices ahead.

> Should be: i += 1U << (shift - PAGE_SHIFT)

You’re right, the index has to advance in pages, not bytes.
Sorry for the slow response; it has been three months since
the last discussion.
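
To make the index fix concrete, here is a small userspace sketch
comparing the two strides. It assumes 4 KiB base pages (PAGE_SHIFT
of 12); stride_buggy() and stride_fixed() are illustrative names,
not kernel functions.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Stride from the earlier version: uses the byte shift as a page count. */
static unsigned int stride_buggy(unsigned int shift)
{
	return 1U << shift;
}

/* Corrected stride: base pages covered by one mapping of 1 << shift bytes. */
static unsigned int stride_fixed(unsigned int shift)
{
	assert(shift >= PAGE_SHIFT);
	return 1U << (shift - PAGE_SHIFT);
}
```

For an order-0 page (shift 12) the old stride skips 4096 array
slots where the corrected one advances by 1; for a 2 MiB mapping
(shift 21) the corrected stride is 512.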

> vmap_page_range() does flushing and it has instrumented KMSAN inside.
> We should follow same semantic. Also it uses ioremap_max_page_shift as
> maximum page shift policy.

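I’m not sure whether vmap() should follow ioremap()’s
ioremap_max_page_shift policy. For now the patch below only
uses it as an off switch: batching is skipped when it equals
PAGE_SHIFT. If it should also cap the mapping size, that
shouldn’t be difficult to add.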
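
As a rough sketch of what capping could look like, the helper below
limits a batch's mapping shift to a policy maximum. capped_shift()
and its parameters are hypothetical, PAGE_SHIFT of 12 is an
assumption, and this is not code from the patch.

```c
#include <assert.h>

#define PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/*
 * Hypothetical helper: turn a compound order into a mapping shift and
 * cap it at max_page_shift (the role ioremap_max_page_shift would play).
 */
static unsigned int capped_shift(unsigned int order,
				 unsigned int max_page_shift)
{
	unsigned int shift = PAGE_SHIFT + order;

	assert(max_page_shift >= PAGE_SHIFT);
	return shift > max_page_shift ? max_page_shift : shift;
}
```

If the cap is applied before the shift is used, the address and
index increments derived from that shift stay consistent with the
size actually mapped.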

I have a version queued for testing (Xueyuan is working
hard on it). Meanwhile, if you have any comments, please
feel free to share.

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 57eae99d9909..8d449e78a07a 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3513,6 +3513,60 @@ void vunmap(const void *addr)
 }
 EXPORT_SYMBOL(vunmap);
 
+#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
+static inline int get_vmap_batch_order(struct page **pages,
+		unsigned int max_steps, unsigned int idx)
+{
+	unsigned int nr_pages;
+
+	if (ioremap_max_page_shift == PAGE_SHIFT)
+		return 0;
+
+	nr_pages = compound_nr(pages[idx]);
+	if (nr_pages == 1 || max_steps < nr_pages)
+		return 0;
+
+	if (num_pages_contiguous(&pages[idx], nr_pages) == nr_pages)
+		return compound_order(pages[idx]);
+	return 0;
+}
+#else
+static inline int get_vmap_batch_order(struct page **pages,
+		unsigned int max_steps, unsigned int idx)
+{
+	return 0;
+}
+#endif
+
+static int vmap_contig_pages_range(unsigned long addr, unsigned long end,
+		pgprot_t prot, struct page **pages)
+{
+	unsigned int count = (end - addr) >> PAGE_SHIFT;
+	int err;
+
+	err = kmsan_vmap_pages_range_noflush(addr, end, prot, pages,
+						PAGE_SHIFT, GFP_KERNEL);
+	if (err)
+		goto out;
+
+	for (unsigned int i = 0; i < count; ) {
+		unsigned int shift = PAGE_SHIFT;
+
+		shift += get_vmap_batch_order(pages, count - i, i);
+		err = vmap_range_noflush(addr, addr + (1UL << shift),
+				page_to_phys(pages[i]), prot, shift);
+		if (err)
+			goto out;
+
+		addr += 1UL << shift;
+		i += 1U << (shift - PAGE_SHIFT);
+	}
+
+out:
+	flush_cache_vmap(end - ((unsigned long)count << PAGE_SHIFT), end);
+	return err;
+}
+
 /**
  * vmap - map an array of pages into virtually contiguous space
  * @pages: array of page pointers
@@ -3556,8 +3610,8 @@ void *vmap(struct page **pages, unsigned int count,
 		return NULL;
 
 	addr = (unsigned long)area->addr;
-	if (vmap_pages_range(addr, addr + size, pgprot_nx(prot),
-				pages, PAGE_SHIFT) < 0) {
+	if (vmap_contig_pages_range(addr, addr + size, pgprot_nx(prot),
+				pages) < 0) {
 		vunmap(area->addr);
 		return NULL;
 	}
-- 
2.39.3 (Apple Git-146)
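
The batching walk in the patch can be modelled in userspace to check
how many mapping calls a given page array produces. This is only a
sketch: struct fake_page, batch_order() and count_mappings() are
hypothetical stand-ins, and contiguity is checked on frame numbers
rather than through num_pages_contiguous().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct page: a frame number plus the compound
 * order recorded on a head page (0 for tail pages and order-0 pages).
 */
struct fake_page {
	unsigned long pfn;
	unsigned int order;
};

/* Mirrors get_vmap_batch_order(): order to batch at idx, or 0. */
static unsigned int batch_order(const struct fake_page *pages,
				unsigned int max_steps, unsigned int idx)
{
	unsigned int nr_pages = 1U << pages[idx].order;

	if (nr_pages == 1 || max_steps < nr_pages)
		return 0;

	/* The array entries must walk the compound page frame by frame. */
	for (unsigned int k = 1; k < nr_pages; k++)
		if (pages[idx + k].pfn != pages[idx].pfn + k)
			return 0;
	return pages[idx].order;
}

/* Walks the array like the patch's loop; returns the number of map calls. */
static unsigned int count_mappings(const struct fake_page *pages,
				   unsigned int count)
{
	unsigned int calls = 0;

	assert(pages != NULL);
	for (unsigned int i = 0; i < count; ) {
		unsigned int order = batch_order(pages, count - i, i);

		calls++;
		i += 1U << order;	/* advance in pages, not bytes */
	}
	return calls;
}
```

With an order-2 head followed by its three tail frames and one
unrelated page, the walk issues two calls; if one of the tail slots
points elsewhere, it falls back to one call per page.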

Thread overview: 12+ messages
2025-12-15  5:30 [PATCH] mm/vmalloc: map contiguous pages in batches for vmap() whenever possible Barry Song
2025-12-18 13:01 ` David Hildenbrand (Red Hat)
2025-12-18 13:54   ` Uladzislau Rezki
2025-12-18 21:24     ` Barry Song
2025-12-22 13:08       ` Uladzislau Rezki
2025-12-23 21:23         ` Barry Song
2026-01-05 16:28           ` Uladzislau Rezki
2026-04-03  9:20             ` Barry Song [this message]
2026-04-13 20:34               ` Barry Song (Xiaomi)
2025-12-18 14:00 ` Uladzislau Rezki
2025-12-18 20:05   ` Barry Song
2026-01-14 12:59     ` David Hildenbrand (Red Hat)
