* [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma()
2026-06-17 8:26 [PATCH 0/4] mm: convert to walk_page_range_vma() to eliminate find_vma() Kefeng Wang
@ 2026-06-17 8:26 ` Kefeng Wang
2026-06-17 14:54 ` Zi Yan
2026-06-17 15:01 ` David Hildenbrand (Arm)
2026-06-17 8:26 ` [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup() Kefeng Wang
` (2 subsequent siblings)
3 siblings, 2 replies; 13+ messages in thread
From: Kefeng Wang @ 2026-06-17 8:26 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm, Kefeng Wang
The mincore syscall currently takes the mmap lock for the entire
duration of the VMA lookup and page table walk. This creates
a global contention point with page faults and other mmap_lock
holders in multi-threaded applications.
mincore() is a read-only operation that only queries page
residency from a single VMA, making it an ideal candidate for
per-VMA locking. So try the per-VMA lock first, and use
walk_page_range_vma() in do_mincore() to eliminate an unnecessary
find_vma() lookup.
Unlike walk_page_range(), walk_page_range_vma() does not call
walk_page_test(), which handles VM_PFNMAP by invoking ->pte_hole()
to skip the page table walk. Without this check, PFNMAP PTEs
would be treated as present by mincore_pte_range(), changing
the returned residency status. Handle VM_PFNMAP explicitly in
do_mincore() to preserve the original behavior.
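The skip logic described above can be sketched as a small userspace model. This is only an illustration of the behavior as described in this message, not the kernel source: the struct layouts, the MODEL_VM_PFNMAP value, and the names model_walk_page_test() and count_hole() are assumptions made up for the example.

```c
#include <stddef.h>

/* Illustrative flag value; the real VM_PFNMAP is defined by the kernel. */
#define MODEL_VM_PFNMAP 0x1UL

struct model_vma {
	unsigned long vm_flags;
};

struct model_walk;

/* Hypothetical, trimmed-down stand-in for struct mm_walk_ops. */
struct model_walk_ops {
	int (*test_walk)(unsigned long start, unsigned long end,
			 struct model_walk *walk);
	int (*pte_hole)(unsigned long start, unsigned long end,
			struct model_walk *walk);
};

struct model_walk {
	const struct model_walk_ops *ops;
	struct model_vma *vma;
	int holes_reported;	/* how often ->pte_hole() was invoked */
};

/*
 * Model of the per-VMA test done by the range walker:
 * returns 0 to walk the page tables, 1 to skip this VMA,
 * or a negative value to abort the walk.
 */
int model_walk_page_test(unsigned long start, unsigned long end,
			 struct model_walk *walk)
{
	const struct model_walk_ops *ops = walk->ops;

	/* A caller-supplied ->test_walk() replaces the default check. */
	if (ops->test_walk)
		return ops->test_walk(start, end, walk);

	/* Default: report a PFNMAP VMA as a hole and skip the walk. */
	if (walk->vma->vm_flags & MODEL_VM_PFNMAP) {
		int err = 1;

		if (ops->pte_hole)
			err = ops->pte_hole(start, end, walk);
		return err ? err : 1;
	}
	return 0;
}

/* Example ->pte_hole() callback that only counts invocations. */
int count_hole(unsigned long start, unsigned long end,
	       struct model_walk *walk)
{
	(void)start;
	(void)end;
	walk->holes_reported++;
	return 0;
}
```

In this model, a walker that bypasses model_walk_page_test() never reaches the PFNMAP branch, which is why the caller has to test the flag itself to keep the old result.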
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/mincore.c | 71 +++++++++++++++++++++++++++++++++++++++-------------
1 file changed, 53 insertions(+), 18 deletions(-)
diff --git a/mm/mincore.c b/mm/mincore.c
index 296f2e3922b5..a786a073feab 100644
--- a/mm/mincore.c
+++ b/mm/mincore.c
@@ -12,6 +12,7 @@
#include <linux/gfp.h>
#include <linux/pagewalk.h>
#include <linux/mman.h>
+#include <linux/mmap_lock.h>
#include <linux/syscalls.h>
#include <linux/swap.h>
#include <linux/leafops.h>
@@ -232,34 +233,47 @@ static inline bool can_do_mincore(struct vm_area_struct *vma)
file_permission(vma->vm_file, MAY_WRITE) == 0;
}
-static const struct mm_walk_ops mincore_walk_ops = {
- .pmd_entry = mincore_pte_range,
- .pte_hole = mincore_unmapped_range,
- .hugetlb_entry = mincore_hugetlb,
- .walk_lock = PGWALK_RDLOCK,
-};
-
/*
* Do a chunk of "sys_mincore()". We've already checked
- * all the arguments, we hold the mmap semaphore: we should
- * just return the amount of info we're asked for.
+ * all the arguments, we should just return the amount of
+ * info we're asked for. The vma is already looked up and
+ * locked; vma_locked indicates whether the per-VMA lock
+ * or mmap_read_lock is held.
*/
-static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *vec)
+static long do_mincore(struct vm_area_struct *vma, unsigned long addr,
+ unsigned long pages, unsigned char *vec, bool vma_locked)
{
- struct vm_area_struct *vma;
unsigned long end;
int err;
+ struct mm_walk_ops mincore_walk_ops = {
+ .pmd_entry = mincore_pte_range,
+ .pte_hole = mincore_unmapped_range,
+ .hugetlb_entry = mincore_hugetlb,
+ .walk_lock = vma_locked ?
+ PGWALK_VMA_RDLOCK_VERIFY : PGWALK_RDLOCK,
+ };
- vma = vma_lookup(current->mm, addr);
- if (!vma)
- return -ENOMEM;
end = min(vma->vm_end, addr + (pages << PAGE_SHIFT));
if (!can_do_mincore(vma)) {
unsigned long pages = DIV_ROUND_UP(end - addr, PAGE_SIZE);
memset(vec, 1, pages);
return pages;
}
- err = walk_page_range(vma->vm_mm, addr, end, &mincore_walk_ops, vec);
+
+ /*
+ * walk_page_range_vma() does not call walk_page_test(), which
+ * handles VM_PFNMAP VMA by invoking ->pte_hole() to skip the
+ * page table walk. Without this check, PFNMAP PTEs would be
+ * treated as present by mincore_pte_range(), changing the returned
+ * residency status from the historical "not resident" to "resident".
+ * Handle VM_PFNMAP explicitly to preserve the original behavior.
+ */
+ if (vma->vm_flags & VM_PFNMAP) {
+ __mincore_unmapped_range(addr, end, vma, vec);
+ return (end - addr) >> PAGE_SHIFT;
+ }
+
+ err = walk_page_range_vma(vma, addr, end, &mincore_walk_ops, vec);
if (err < 0)
return err;
return (end - addr) >> PAGE_SHIFT;
@@ -319,13 +333,34 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len,
retval = 0;
while (pages) {
+ struct mm_struct *mm = current->mm;
+ struct vm_area_struct *vma;
+ bool vma_locked = false;
+
/*
+ * Try per-VMA lock first, fall back to mmap_read_lock.
* Do at most PAGE_SIZE entries per iteration, due to
* the temporary buffer size.
*/
- mmap_read_lock(current->mm);
- retval = do_mincore(start, min(pages, PAGE_SIZE), tmp);
- mmap_read_unlock(current->mm);
+ vma = lock_vma_under_rcu(mm, start);
+ if (vma) {
+ vma_locked = true;
+ } else {
+ mmap_read_lock(mm);
+ vma = vma_lookup(mm, start);
+ if (!vma) {
+ mmap_read_unlock(mm);
+ retval = -ENOMEM;
+ break;
+ }
+ }
+
+ retval = do_mincore(vma, start, min(pages, PAGE_SIZE), tmp, vma_locked);
+
+ if (vma_locked)
+ vma_end_read(vma);
+ else
+ mmap_read_unlock(mm);
if (retval <= 0)
break;
--
2.27.0
* Re: [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma()
2026-06-17 8:26 ` [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma() Kefeng Wang
@ 2026-06-17 14:54 ` Zi Yan
2026-06-17 15:01 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 13+ messages in thread
From: Zi Yan @ 2026-06-17 14:54 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm
On Wed Jun 17, 2026 at 4:26 AM EDT, Kefeng Wang wrote:
> The mincore syscall currently takes mmap lock for the entire
> duration of the VMA lookup and page table walk. This creates
> a global contention point with page faults and other mmap_lock
> holders in multi-threaded applications.
>
> The mincore is a read-only operation that only queries page
> residency from a single VMA, making it an ideal candidate for
> per-VMA locking, so try per-vma lock firstly and use the
> walk_page_range_vma() in do_mincore() to eliminates an unnecessary
> find_vma() lookup.
>
> Unlike walk_page_range(), walk_page_range_vma() does not call
> walk_page_test(), which handles VM_PFNMAP by invoking ->pte_hole()
> to skip the page table walk. Without this check, PFNMAP PTEs
> would be treated as present by mincore_pte_range(), changing
> the returned residency status. Handle VM_PFNMAP explicitly in
> do_mincore() to preserve the original behavior.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/mincore.c | 71 +++++++++++++++++++++++++++++++++++++++-------------
> 1 file changed, 53 insertions(+), 18 deletions(-)
>
> diff --git a/mm/mincore.c b/mm/mincore.c
> index 296f2e3922b5..a786a073feab 100644
> --- a/mm/mincore.c
> +++ b/mm/mincore.c
> @@ -12,6 +12,7 @@
> #include <linux/gfp.h>
> #include <linux/pagewalk.h>
> #include <linux/mman.h>
> +#include <linux/mmap_lock.h>
> #include <linux/syscalls.h>
> #include <linux/swap.h>
> #include <linux/leafops.h>
> @@ -232,34 +233,47 @@ static inline bool can_do_mincore(struct vm_area_struct *vma)
> file_permission(vma->vm_file, MAY_WRITE) == 0;
> }
>
> -static const struct mm_walk_ops mincore_walk_ops = {
> - .pmd_entry = mincore_pte_range,
> - .pte_hole = mincore_unmapped_range,
> - .hugetlb_entry = mincore_hugetlb,
> - .walk_lock = PGWALK_RDLOCK,
> -};
> -
> /*
> * Do a chunk of "sys_mincore()". We've already checked
> - * all the arguments, we hold the mmap semaphore: we should
> - * just return the amount of info we're asked for.
> + * all the arguments, we should just return the amount of
> + * info we're asked for. The vma is already looked up and
> + * locked; vma_locked indicates whether the per-VMA lock
> + * or mmap_read_lock is held.
> */
> -static long do_mincore(unsigned long addr, unsigned long pages, unsigned char *vec)
> +static long do_mincore(struct vm_area_struct *vma, unsigned long addr,
> + unsigned long pages, unsigned char *vec, bool vma_locked)
vma_locked is confusing me, since I thought vma_locked == false means
vma is not locked, but it actually means mmap_lock is taken instead. But
I am not sure an enum vma_lock_state {VMA_LOCKED, MM_LOCKED} is needed
here.
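For illustration, the enum alternative mentioned above might look roughly like the following. It is a standalone sketch, not proposed kernel code: enum vma_lock_state is taken from the comment above, while the MODEL_* values and pick_walk_lock() are hypothetical stand-ins for PGWALK_RDLOCK / PGWALK_VMA_RDLOCK_VERIFY and for the ternary in do_mincore().

```c
/* Which lock the caller took before calling do_mincore(). */
enum vma_lock_state {
	VMA_LOCKED,	/* per-VMA read lock is held */
	MM_LOCKED,	/* mmap_read_lock is held */
};

/* Hypothetical stand-ins for the kernel's page-walk lock modes. */
enum model_walk_lock {
	MODEL_PGWALK_RDLOCK,
	MODEL_PGWALK_VMA_RDLOCK_VERIFY,
};

/* Map the caller's lock state to the lock mode the walk should verify. */
enum model_walk_lock pick_walk_lock(enum vma_lock_state state)
{
	return state == VMA_LOCKED ? MODEL_PGWALK_VMA_RDLOCK_VERIFY
				   : MODEL_PGWALK_RDLOCK;
}
```

With an enum, the call site names the lock that is held instead of passing a bare true/false, at the cost of one more type for a two-state choice.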
> {
> - struct vm_area_struct *vma;
> unsigned long end;
> int err;
> + struct mm_walk_ops mincore_walk_ops = {
> + .pmd_entry = mincore_pte_range,
> + .pte_hole = mincore_unmapped_range,
> + .hugetlb_entry = mincore_hugetlb,
> + .walk_lock = vma_locked ?
> + PGWALK_VMA_RDLOCK_VERIFY : PGWALK_RDLOCK,
An unrelated comment about PGWALK_RDLOCK. Maybe PGWALK_MM_RDLOCK_VERIFY
is a better name since the code just verifies mmap_lock, unlike
PGWALK_WRLOCK, which requires vma_start_write(). PGWALK_WRLOCK_VERIFY
might be better named as PGWALK_VMA_WRLOCK_VERIFY.
Otherwise, LGTM.
Acked-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi
* Re: [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma()
2026-06-17 8:26 ` [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma() Kefeng Wang
2026-06-17 14:54 ` Zi Yan
@ 2026-06-17 15:01 ` David Hildenbrand (Arm)
1 sibling, 0 replies; 13+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-17 15:01 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Zi Yan, Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka,
Suren Baghdasaryan, linux-mm
On 6/17/26 10:26, Kefeng Wang wrote:
> The mincore syscall currently takes mmap lock for the entire
> duration of the VMA lookup and page table walk. This creates
> a global contention point with page faults and other mmap_lock
> holders in multi-threaded applications.
>
> The mincore is a read-only operation that only queries page
> residency from a single VMA, making it an ideal candidate for
> per-VMA locking, so try per-vma lock firstly and use the
> walk_page_range_vma() in do_mincore() to eliminates an unnecessary
> find_vma() lookup.
>
> Unlike walk_page_range(), walk_page_range_vma() does not call
> walk_page_test(), which handles VM_PFNMAP by invoking ->pte_hole()
> to skip the page table walk. Without this check, PFNMAP PTEs
> would be treated as present by mincore_pte_range(), changing
> the returned residency status. Handle VM_PFNMAP explicitly in
> do_mincore() to preserve the original behavior.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
I think you should first add a patch that converts do_mincore() to just use
walk_page_range_vma() -- which is what your patch series "mm: convert to
walk_page_range_vma() to eliminate find_vma()" wants to achieve.
Which would already be an improvement :)
Regarding the per-VMA lock, given that we already drop the mmap lock after each
VMA, concurrent mremap() would already be able to return funky results.
So that should be just fine.
--
Cheers,
David
* [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup()
2026-06-17 8:26 [PATCH 0/4] mm: convert to walk_page_range_vma() to eliminate find_vma() Kefeng Wang
2026-06-17 8:26 ` [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma() Kefeng Wang
@ 2026-06-17 8:26 ` Kefeng Wang
2026-06-17 13:25 ` David Hildenbrand (Arm)
2026-06-17 14:28 ` Zi Yan
2026-06-17 8:26 ` [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range() Kefeng Wang
2026-06-17 8:26 ` [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect() Kefeng Wang
3 siblings, 2 replies; 13+ messages in thread
From: Kefeng Wang @ 2026-06-17 8:26 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm, Kefeng Wang
In mprotect_fixup(), the PROT_NONE PFN permission check uses
walk_page_range() to walk the page table. Fortunately, the callers
always pass a start/end range that falls within a single VMA:
do_mprotect_pkey() iterates per-VMA via for_each_vma_range(),
and setup_arg_pages() passes the whole VMA.
Note that walk_page_range_vma() does not call walk_page_test().
However, prot_none_test() in prot_none_walk_ops always returns 0,
so it is safe to replace walk_page_range() with walk_page_range_vma()
to eliminate an unnecessary find_vma() lookup.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/mprotect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028c..d26e09862daa 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -753,7 +753,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
!vma_flags_test_any_mask(&new_vma_flags, VMA_ACCESS_FLAGS)) {
pgprot_t new_pgprot = vm_get_page_prot(newflags);
- error = walk_page_range(current->mm, start, end,
+ error = walk_page_range_vma(vma, start, end,
&prot_none_walk_ops, &new_pgprot);
if (error)
return error;
--
2.27.0
* Re: [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup()
2026-06-17 8:26 ` [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup() Kefeng Wang
@ 2026-06-17 13:25 ` David Hildenbrand (Arm)
2026-06-17 14:28 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-17 13:25 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Zi Yan, Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka,
Suren Baghdasaryan, linux-mm
On 6/17/26 10:26, Kefeng Wang wrote:
> In mprotect_fixup(), the PROT_NONE PFN permission check uses
> walk_page_range() to walk the page table. Fortunately, the caller
> always passes start/end that falls within a single VMA, the
> do_mprotect_pkey() iterates per-VMA via for_each_vma_range(),
> and setup_arg_pages() passes the whole VMA.
>
> Note, walk_page_test() isn't called in walk_page_range_vma(),
> however, prot_none_test() in prot_none_walk_ops always return 0,
> so it's safely replace walk_page_range() with walk_page_range_vma()
> to eliminates an unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/mprotect.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028c..d26e09862daa 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -753,7 +753,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
> !vma_flags_test_any_mask(&new_vma_flags, VMA_ACCESS_FLAGS)) {
> pgprot_t new_pgprot = vm_get_page_prot(newflags);
>
> - error = walk_page_range(current->mm, start, end,
> + error = walk_page_range_vma(vma, start, end,
> &prot_none_walk_ops, &new_pgprot);
> if (error)
> return error;
As Sashiko says, prot_none_test() seems unnecessary now.
Apart from that, LGTM.
--
Cheers,
David
* Re: [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup()
2026-06-17 8:26 ` [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup() Kefeng Wang
2026-06-17 13:25 ` David Hildenbrand (Arm)
@ 2026-06-17 14:28 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: Zi Yan @ 2026-06-17 14:28 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm
On Wed Jun 17, 2026 at 4:26 AM EDT, Kefeng Wang wrote:
> In mprotect_fixup(), the PROT_NONE PFN permission check uses
> walk_page_range() to walk the page table. Fortunately, the caller
> always passes start/end that falls within a single VMA, the
> do_mprotect_pkey() iterates per-VMA via for_each_vma_range(),
> and setup_arg_pages() passes the whole VMA.
>
> Note, walk_page_test() isn't called in walk_page_range_vma(),
> however, prot_none_test() in prot_none_walk_ops always return 0,
> so it's safely replace walk_page_range() with walk_page_range_vma()
> to eliminates an unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/mprotect.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028c..d26e09862daa 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -753,7 +753,7 @@ mprotect_fixup(struct vma_iterator *vmi, struct mmu_gather *tlb,
> !vma_flags_test_any_mask(&new_vma_flags, VMA_ACCESS_FLAGS)) {
> pgprot_t new_pgprot = vm_get_page_prot(newflags);
>
> - error = walk_page_range(current->mm, start, end,
> + error = walk_page_range_vma(vma, start, end,
> &prot_none_walk_ops, &new_pgprot);
> if (error)
> return error;
Like David and Sashiko said, a fixup like below is needed. Otherwise,
LGTM.
Reviewed-by: Zi Yan <ziy@nvidia.com>
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028cf..323949638894b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -708,16 +708,9 @@ static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
0 : -EACCES;
}
-static int prot_none_test(unsigned long addr, unsigned long next,
- struct mm_walk *walk)
-{
- return 0;
-}
-
static const struct mm_walk_ops prot_none_walk_ops = {
.pte_entry = prot_none_pte_entry,
.hugetlb_entry = prot_none_hugetlb_entry,
- .test_walk = prot_none_test,
.walk_lock = PGWALK_WRLOCK,
};
--
Best Regards,
Yan, Zi
* [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range()
2026-06-17 8:26 [PATCH 0/4] mm: convert to walk_page_range_vma() to eliminate find_vma() Kefeng Wang
2026-06-17 8:26 ` [PATCH 1/4] mm: mincore: try per-VMA lock firstly and use walk_page_range_vma() Kefeng Wang
2026-06-17 8:26 ` [PATCH 2/4] mm: mprotect: use walk_page_range_vma() in mprotect_fixup() Kefeng Wang
@ 2026-06-17 8:26 ` Kefeng Wang
2026-06-17 13:26 ` David Hildenbrand (Arm)
2026-06-17 14:35 ` Zi Yan
2026-06-17 8:26 ` [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect() Kefeng Wang
3 siblings, 2 replies; 13+ messages in thread
From: Kefeng Wang @ 2026-06-17 8:26 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm, Kefeng Wang
mlock_vma_pages_range() uses walk_page_range() to walk the
page table. Fortunately, the callers always pass a start/end range
that falls within a single VMA: apply_vma_lock_flags() iterates
per-VMA, and apply_mlockall_flags() passes the whole VMA.
Since there is no .test_walk in mlock_walk_ops and VM_PFNMAP VMAs
have already been filtered out by vma_supports_mlock(), it is safe
to replace walk_page_range() with walk_page_range_vma() to
eliminate an unnecessary find_vma() lookup.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/mlock.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/mlock.c b/mm/mlock.c
index 8c227fefa2df..97e49038d8d3 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -446,7 +446,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
vma_flags_reset_once(vma, new_vma_flags);
lru_add_drain();
- walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL);
+ walk_page_range_vma(vma, start, end, &mlock_walk_ops, NULL);
lru_add_drain();
if (vma_flags_test(new_vma_flags, VMA_IO_BIT)) {
--
2.27.0
* Re: [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range()
2026-06-17 8:26 ` [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range() Kefeng Wang
@ 2026-06-17 13:26 ` David Hildenbrand (Arm)
2026-06-17 14:35 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-17 13:26 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Zi Yan, Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka,
Suren Baghdasaryan, linux-mm
On 6/17/26 10:26, Kefeng Wang wrote:
> The mlock_vma_pages_range() uses walk_page_range() to walk the
> page table. Fortunately, the caller always passes start/end that
> falls within a single VMA, apply_vma_lock_flags() iterates per-VMA,
> and apply_mlockall_flags() passes the whole VMA.
>
> Since there is no .test_walk in mlock_walk_ops and VM_PFNMAP has
> been filtered by vma_supports_mlock(), it's safely replace
> walk_page_range() with walk_page_range_vma() to eliminates an
> unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/mlock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 8c227fefa2df..97e49038d8d3 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -446,7 +446,7 @@ static void mlock_vma_pages_range(struct vm_area_struct *vma,
> vma_flags_reset_once(vma, new_vma_flags);
>
> lru_add_drain();
> - walk_page_range(vma->vm_mm, start, end, &mlock_walk_ops, NULL);
> + walk_page_range_vma(vma, start, end, &mlock_walk_ops, NULL);
> lru_add_drain();
>
> if (vma_flags_test(new_vma_flags, VMA_IO_BIT)) {
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* Re: [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range()
2026-06-17 8:26 ` [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range() Kefeng Wang
2026-06-17 13:26 ` David Hildenbrand (Arm)
@ 2026-06-17 14:35 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: Zi Yan @ 2026-06-17 14:35 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm
On Wed Jun 17, 2026 at 4:26 AM EDT, Kefeng Wang wrote:
> The mlock_vma_pages_range() uses walk_page_range() to walk the
> page table. Fortunately, the caller always passes start/end that
> falls within a single VMA, apply_vma_lock_flags() iterates per-VMA,
> and apply_mlockall_flags() passes the whole VMA.
>
> Since there is no .test_walk in mlock_walk_ops and VM_PFNMAP has
> been filtered by vma_supports_mlock(), it's safely replace
> walk_page_range() with walk_page_range_vma() to eliminates an
> unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/mlock.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
LGTM.
Reviewed-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi
* [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect()
2026-06-17 8:26 [PATCH 0/4] mm: convert to walk_page_range_vma() to eliminate find_vma() Kefeng Wang
` (2 preceding siblings ...)
2026-06-17 8:26 ` [PATCH 3/4] mm: mlock: use walk_page_range_vma() in mlock_vma_pages_range() Kefeng Wang
@ 2026-06-17 8:26 ` Kefeng Wang
2026-06-17 13:30 ` David Hildenbrand (Arm)
2026-06-17 14:37 ` Zi Yan
3 siblings, 2 replies; 13+ messages in thread
From: Kefeng Wang @ 2026-06-17 8:26 UTC (permalink / raw)
To: Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm, Kefeng Wang
The migrate_vma_collect() uses walk_page_range() to walk the page
table. Fortunately, migrate_vma_setup() already validates that the
entire range falls within a single VMA.
Since there is no .test_walk in migrate_vma_walk_ops and VM_PFNMAP
VM_PFNMAP has filtered by migrate_vma_setup(), it's safetly replace
walk_page_range() with walk_page_range_vma() to eliminates an
unnecessary find_vma() lookup.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/migrate_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 554754eb26ff..ae39173d6a0e 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -513,7 +513,7 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
migrate->pgmap_owner);
mmu_notifier_invalidate_range_start(&range);
- walk_page_range(migrate->vma->vm_mm, migrate->start, migrate->end,
+ walk_page_range_vma(migrate->vma, migrate->start, migrate->end,
&migrate_vma_walk_ops, migrate);
mmu_notifier_invalidate_range_end(&range);
--
2.27.0
* Re: [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect()
2026-06-17 8:26 ` [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect() Kefeng Wang
@ 2026-06-17 13:30 ` David Hildenbrand (Arm)
2026-06-17 14:37 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-17 13:30 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Zi Yan, Liam R. Howlett, Lorenzo Stoakes, Vlastimil Babka,
Suren Baghdasaryan, linux-mm
On 6/17/26 10:26, Kefeng Wang wrote:
> The migrate_vma_collect() uses walk_page_range() to walk the page
> table. Fortunately, migrate_vma_setup() already validates that the
> entire range falls within a single VMA.
>
> Since there is no .test_walk in migrate_vma_walk_ops and VM_PFNMAP
> VM_PFNMAP
You mention VM_PFNMAP twice.
> has filtered by migrate_vma_setup(), it's safetly replace
s/has/was/
s/ / /
s/safetly/safe to/
> walk_page_range() with walk_page_range_vma() to eliminates an
s/eliminates/eliminate/
> unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/migrate_device.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 554754eb26ff..ae39173d6a0e 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -513,7 +513,7 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
> migrate->pgmap_owner);
> mmu_notifier_invalidate_range_start(&range);
>
> - walk_page_range(migrate->vma->vm_mm, migrate->start, migrate->end,
> + walk_page_range_vma(migrate->vma, migrate->start, migrate->end,
> &migrate_vma_walk_ops, migrate);
>
> mmu_notifier_invalidate_range_end(&range);
LGTM
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* Re: [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect()
2026-06-17 8:26 ` [PATCH 4/4] mm: migrate_device: use walk_page_range_vma() in migrate_vma_collect() Kefeng Wang
2026-06-17 13:30 ` David Hildenbrand (Arm)
@ 2026-06-17 14:37 ` Zi Yan
1 sibling, 0 replies; 13+ messages in thread
From: Zi Yan @ 2026-06-17 14:37 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: David Hildenbrand, Zi Yan, Liam R. Howlett, Lorenzo Stoakes,
Vlastimil Babka, Suren Baghdasaryan, linux-mm
On Wed Jun 17, 2026 at 4:26 AM EDT, Kefeng Wang wrote:
> The migrate_vma_collect() uses walk_page_range() to walk the page
> table. Fortunately, migrate_vma_setup() already validates that the
> entire range falls within a single VMA.
>
> Since there is no .test_walk in migrate_vma_walk_ops and VM_PFNMAP
> VM_PFNMAP has filtered by migrate_vma_setup(), it's safetly replace
> walk_page_range() with walk_page_range_vma() to eliminates an
> unnecessary find_vma() lookup.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> mm/migrate_device.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
Makes sense.
Acked-by: Zi Yan <ziy@nvidia.com>
--
Best Regards,
Yan, Zi