* [PATCH 0/5] Fix incorrect access of hugetlb pte entries
@ 2026-06-25 11:29 Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
` (5 more replies)
0 siblings, 6 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual
There are various places which use ptep_get() to get the pte entry
corresponding to a hugetlb folio. Some arches have special handling
to compute the pteval, so they provide huge_ptep_get(). Use this
helper consistently.
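As a rough illustration of why the choice of accessor matters, here is a
minimal userspace sketch, not kernel code. The entry layout, the TOY_* bits
and the toy_* functions are all invented for this example and do not match
any real architecture. The real huge_ptep_get() also takes mm and addr
arguments, which the toy version omits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these layouts are invented, not a real arch's format. */
typedef struct { uint64_t val; } pte_t;

#define TOY_PTE_PRESENT  (1ULL << 0)  /* present bit in the toy PTE format */
#define TOY_HUGE_INVALID (1ULL << 5)  /* invalid bit in the toy huge format */
#define TOY_PFN_SHIFT    12

/* Generic read: hand back the raw word unchanged. */
static inline pte_t toy_ptep_get(const pte_t *ptep)
{
	return *ptep;
}

/* Arch-style read: translate the toy huge-entry format into PTE format. */
static inline pte_t toy_huge_ptep_get(const pte_t *ptep)
{
	uint64_t raw = ptep->val;
	pte_t pte = { .val = raw & ~(TOY_HUGE_INVALID | TOY_PTE_PRESENT) };

	/* In the toy huge format, "valid" means the invalid bit is clear. */
	if (!(raw & TOY_HUGE_INVALID))
		pte.val |= TOY_PTE_PRESENT;
	return pte;
}

/* Helpers that only understand the toy PTE format. */
static inline bool toy_pte_present(pte_t pte)
{
	return pte.val & TOY_PTE_PRESENT;
}

static inline uint64_t toy_pte_pfn(pte_t pte)
{
	return pte.val >> TOY_PFN_SHIFT;
}
```

In this model a valid huge entry read through toy_ptep_get() looks
non-present to toy_pte_present(), while the same entry read through
toy_huge_ptep_get() is reported as present with the expected PFN.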
Dev Jain (5):
mm/rmap: use huge_ptep_get() in try_to_unmap_one()
mm/rmap: use huge_ptep_get() in try_to_migrate_one()
mm/migrate: use huge_ptep_get() in remove_migration_pte()
mm/page_vma_mapped: use huge_ptep_get() for hugetlb
mm/mprotect: use huge_ptep_get() for hugetlb
include/linux/hugetlb.h | 3 +++
mm/migrate.c | 6 +++++-
mm/mprotect.c | 8 +++++++-
mm/page_vma_mapped.c | 8 +++++++-
mm/rmap.c | 32 ++++++++++++++++++++------------
5 files changed, 42 insertions(+), 15 deletions(-)
--
2.43.0
^ permalink raw reply [flat|nested] 27+ messages in thread
* [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:17 ` Muchun Song
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
` (4 subsequent siblings)
5 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable
try_to_unmap_one() handles hugetlb folios when memory failure needs
to replace a poisoned hugetlb mapping with a hwpoison entry. In that
case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
in pvmw.pte, but the code reads it with ptep_get().
On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc to misbehave.
It is not clear whether this has a trivially visible effect to userspace.
Just use huge_ptep_get() for dereferencing a huge pte pointer.
Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
include/linux/hugetlb.h | 3 +++
mm/rmap.c | 16 ++++++++++------
2 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2abaf99321e90..fdb7bdf7645c5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
{
}
+pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
+ pte_t *ptep);
+
static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
unsigned long addr, pte_t *ptep)
{
diff --git a/mm/rmap.c b/mm/rmap.c
index 1c77d5dc06e9f..aa8a254efaecc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
/* Unexpected PMD-mapped THP? */
VM_BUG_ON_FOLIO(!pvmw.pte, folio);
- /*
- * Handle PFN swap PTEs, such as device-exclusive ones, that
- * actually map pages.
- */
- pteval = ptep_get(pvmw.pte);
+ address = pvmw.address;
+ if (folio_test_hugetlb(folio)) {
+ pteval = huge_ptep_get(mm, address, pvmw.pte);
+ } else {
+ /*
+ * Handle PFN swap PTEs, such as device-exclusive ones,
+ * that actually map pages.
+ */
+ pteval = ptep_get(pvmw.pte);
+ }
if (likely(pte_present(pteval))) {
pfn = pte_pfn(pteval);
} else {
@@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
}
subpage = folio_page(folio, pfn - folio_pfn(folio));
- address = pvmw.address;
anon_exclusive = folio_test_anon(folio) &&
PageAnonExclusive(subpage);
--
2.43.0
* [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:24 ` Muchun Song
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
` (3 subsequent siblings)
5 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable
try_to_migrate_one() is used by folio migration to replace a present
mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
reads the huge pte entry with ptep_get().
On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc to misbehave.
It is not clear whether this has a trivially visible effect to userspace.
Use huge_ptep_get() to dereference a huge pte pointer.
Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one into
try_to_migrate_one.
Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/rmap.c | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index aa8a254efaecc..abc3a44baaa3d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2505,11 +2505,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
/* Unexpected PMD-mapped THP? */
VM_BUG_ON_FOLIO(!pvmw.pte, folio);
- /*
- * Handle PFN swap PTEs, such as device-exclusive ones, that
- * actually map pages.
- */
- pteval = ptep_get(pvmw.pte);
+ address = pvmw.address;
+ if (folio_test_hugetlb(folio)) {
+ pteval = huge_ptep_get(mm, address, pvmw.pte);
+ } else {
+ /*
+ * Handle PFN swap PTEs, such as device-exclusive ones,
+ * that actually map pages.
+ */
+ pteval = ptep_get(pvmw.pte);
+ }
if (likely(pte_present(pteval))) {
pfn = pte_pfn(pteval);
} else {
@@ -2520,7 +2525,6 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
}
subpage = folio_page(folio, pfn - folio_pfn(folio));
- address = pvmw.address;
anon_exclusive = folio_test_anon(folio) &&
PageAnonExclusive(subpage);
--
2.43.0
* [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:32 ` Muchun Song
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
` (2 subsequent siblings)
5 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable
remove_migration_pte() converts migration entries back to present PTEs
after folio migration completes. For hugetlb folios,
page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
pvmw.pte, but the code reads it with ptep_get().
On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc to misbehave.
It is not clear whether this has a trivially visible effect to userspace.
Use huge_ptep_get() to dereference a huge pte pointer.
Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/migrate.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/mm/migrate.c b/mm/migrate.c
index d9b23909d716c..c65f0f43df7eb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -371,7 +371,11 @@ static bool remove_migration_pte(struct folio *folio,
continue;
}
#endif
- old_pte = ptep_get(pvmw.pte);
+ if (folio_test_hugetlb(folio))
+ old_pte = huge_ptep_get(vma->vm_mm, pvmw.address,
+ pvmw.pte);
+ else
+ old_pte = ptep_get(pvmw.pte);
if (rmap_walk_arg->map_unused_to_zeropage &&
try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
continue;
--
2.43.0
* [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (2 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 2:31 ` Lance Yang
2026-06-26 7:48 ` Lance Yang
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
5 siblings, 2 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable
check_pte() is the final validation step in page_vma_mapped_walk().
It reads pvmw->pte with ptep_get() to decide whether the entry maps
the PFN range being walked. For hugetlb VMAs, that pointer refers
to a hugetlb entry.
On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc to misbehave.
It is not clear whether this has a trivially visible effect to userspace.
Use huge_ptep_get() to dereference a huge pte pointer.
Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/page_vma_mapped.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 2ccbabfb2cc17..18e1d341f463c 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
{
unsigned long pfn;
- pte_t ptent = ptep_get(pvmw->pte);
+ pte_t ptent;
+
+ if (is_vm_hugetlb_page(pvmw->vma))
+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+ pvmw->pte);
+ else
+ ptent = ptep_get(pvmw->pte);
if (pvmw->flags & PVMW_MIGRATION) {
const softleaf_t entry = softleaf_from_pte(ptent);
--
2.43.0
* [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (3 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:40 ` Muchun Song
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
5 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual
prot_none_hugetlb_entry() is the hugetlb callback for the early
mprotect(PROT_NONE) PFN permission walk on x86.
The callback passes the decoded PFN to pfn_modify_allowed(). For a
hugetlb callback, the pte pointer refers to a hugetlb entry. On
architectures where hugetlb entries need huge_ptep_get(), reading that
entry with ptep_get() can make the permission check use the wrong PFN.
Use huge_ptep_get() before decoding the hugetlb PFN.
Currently there is no path which can trigger a bug: huge_ptep_get() is a
simple ptep_get() for x86, and the prot_none walk occurs only for x86.
But use the correct helper anyway.
Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/mprotect.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028cf..23779632d18bf 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
0 : -EACCES;
}
+#ifdef CONFIG_HUGETLB_PAGE
static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
unsigned long addr, unsigned long next,
struct mm_walk *walk)
{
- return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
+ pte_t entry = huge_ptep_get(walk->mm, addr, pte);
+
+ return pfn_modify_allowed(pte_pfn(entry),
*(pgprot_t *)(walk->private)) ?
0 : -EACCES;
}
+#else
+#define prot_none_hugetlb_entry NULL
+#endif
static int prot_none_test(unsigned long addr, unsigned long next,
struct mm_walk *walk)
--
2.43.0
* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (4 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-25 13:59 ` Zi Yan
2026-06-26 4:09 ` Dev Jain
5 siblings, 1 reply; 27+ messages in thread
From: Zi Yan @ 2026-06-25 13:59 UTC (permalink / raw)
To: Dev Jain, muchun.song, osalvador, akpm, ljs, david, liam
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
anshuman.khandual
On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
> There are various places which use ptep_get() to get the pte entry
> corresponding to a hugetlb folio. Some arches have special handling
I think it is better to mention s390 as a concrete example.
> to compute the pteval, so they provide huge_ptep_get(). Use this
> helper consistently.
>
> Dev Jain (5):
> mm/rmap: use huge_ptep_get() in try_to_unmap_one()
> mm/rmap: use huge_ptep_get() in try_to_migrate_one()
> mm/migrate: use huge_ptep_get() in remove_migration_pte()
> mm/page_vma_mapped: use huge_ptep_get() for hugetlb
> mm/mprotect: use huge_ptep_get() for hugetlb
>
> include/linux/hugetlb.h | 3 +++
> mm/migrate.c | 6 +++++-
> mm/mprotect.c | 8 +++++++-
> mm/page_vma_mapped.c | 8 +++++++-
> mm/rmap.c | 32 ++++++++++++++++++++------------
> 5 files changed, 42 insertions(+), 15 deletions(-)
--
Best Regards,
Yan, Zi
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-26 2:31 ` Lance Yang
2026-06-26 4:06 ` Dev Jain
2026-06-26 7:48 ` Lance Yang
1 sibling, 1 reply; 27+ messages in thread
From: Lance Yang @ 2026-06-26 2:31 UTC (permalink / raw)
To: dev.jain
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
Just one ordering thing: should this patch come first?
Patches #01-#03 only reach the new huge_ptep_get() after
page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
goes through check_pte() (still using ptep_get()).
> {
> unsigned long pfn;
>- pte_t ptent = ptep_get(pvmw->pte);
>+ pte_t ptent;
>+
>+ if (is_vm_hugetlb_page(pvmw->vma))
>+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+ pvmw->pte);
>+ else
>+ ptent = ptep_get(pvmw->pte);
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
>--
>2.43.0
>
>
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-26 3:17 ` Muchun Song
2026-06-26 4:03 ` Dev Jain
0 siblings, 1 reply; 27+ messages in thread
From: Muchun Song @ 2026-06-26 3:17 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
On 2026/6/25 19:29, Dev Jain wrote:
> try_to_unmap_one() handles hugetlb folios when memory failure needs
> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
> in pvmw.pte, but the code reads it with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
>
> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> include/linux/hugetlb.h | 3 +++
> mm/rmap.c | 16 ++++++++++------
> 2 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 2abaf99321e90..fdb7bdf7645c5 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
> {
> }
>
> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep);
Thanks so much for the fix! I'm curious, though: why do we
need to add a separate declaration for this function here?
Thanks,
Muchun
> +
> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
> unsigned long addr, pte_t *ptep)
> {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1c77d5dc06e9f..aa8a254efaecc 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> /* Unexpected PMD-mapped THP? */
> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>
> - /*
> - * Handle PFN swap PTEs, such as device-exclusive ones, that
> - * actually map pages.
> - */
> - pteval = ptep_get(pvmw.pte);
> + address = pvmw.address;
> + if (folio_test_hugetlb(folio)) {
> + pteval = huge_ptep_get(mm, address, pvmw.pte);
> + } else {
> + /*
> + * Handle PFN swap PTEs, such as device-exclusive ones,
> + * that actually map pages.
> + */
> + pteval = ptep_get(pvmw.pte);
> + }
> if (likely(pte_present(pteval))) {
> pfn = pte_pfn(pteval);
> } else {
> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> }
>
> subpage = folio_page(folio, pfn - folio_pfn(folio));
> - address = pvmw.address;
> anon_exclusive = folio_test_anon(folio) &&
> PageAnonExclusive(subpage);
>
* Re: [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-26 3:24 ` Muchun Song
0 siblings, 0 replies; 27+ messages in thread
From: Muchun Song @ 2026-06-26 3:24 UTC (permalink / raw)
To: Dev Jain
Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
>
> try_to_migrate_one() is used by folio migration to replace a present
> mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
> returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
> reads the huge pte entry with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
>
> Use huge_ptep_get() to dereference a huge pte pointer.
>
> Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one into
> try_to_migrate_one.
>
> Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Thanks.
* Re: [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-26 3:32 ` Muchun Song
0 siblings, 0 replies; 27+ messages in thread
From: Muchun Song @ 2026-06-26 3:32 UTC (permalink / raw)
To: Dev Jain
Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
>
> remove_migration_pte() converts migration entries back to present PTEs
> after folio migration completes. For hugetlb folios,
> page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
> pvmw.pte, but the code reads it with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(),
> pte_present() etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
We are dealing with migration entries here, so the issue mentioned shouldn't
be a problem on any architecture. Semantically speaking, though, we should
definitely fix this.
>
> Use huge_ptep_get() to dereference a huge pte pointer.
>
> Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Thanks
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-26 3:40 ` Muchun Song
2026-06-26 4:08 ` Dev Jain
0 siblings, 1 reply; 27+ messages in thread
From: Muchun Song @ 2026-06-26 3:40 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 2026/6/25 19:29, Dev Jain wrote:
> prot_none_hugetlb_entry() is the hugetlb callback for the early
> mprotect(PROT_NONE) PFN permission walk on x86.
>
> The callback passes the decoded PFN to pfn_modify_allowed(). For a
> hugetlb callback, the pte pointer refers to a hugetlb entry. On
> architectures where hugetlb entries need huge_ptep_get(), reading that
> entry with ptep_get() can make the permission check use the wrong PFN.
>
> Use huge_ptep_get() before decoding the hugetlb PFN.
>
> Currently there is no path which can trigger a bug: huge_ptep_get() is a
> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
> But use the correct helper anyway.
>
> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> mm/mprotect.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028cf..23779632d18bf 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
> 0 : -EACCES;
> }
>
> +#ifdef CONFIG_HUGETLB_PAGE
> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
> unsigned long addr, unsigned long next,
> struct mm_walk *walk)
> {
> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
> +
> + return pfn_modify_allowed(pte_pfn(entry),
> *(pgprot_t *)(walk->private)) ?
> 0 : -EACCES;
> }
> +#else
> +#define prot_none_hugetlb_entry NULL
This looks strange, because it defines the stub for a helper function as
NULL. How about the following diff?
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028c..4d8c1551fbce 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
static const struct mm_walk_ops prot_none_walk_ops = {
.pte_entry = prot_none_pte_entry,
+#ifdef CONFIG_HUGETLB_PAGE
.hugetlb_entry = prot_none_hugetlb_entry,
+#endif
.test_walk = prot_none_test,
.walk_lock = PGWALK_WRLOCK,
};
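To illustrate the suggestion above: in C, a member left out of a designated
initializer is zero-initialized, so omitting .hugetlb_entry under #ifdef
gives the same NULL callback as the "#define ... NULL" stub. A minimal
userspace sketch, where struct toy_walk_ops, TOY_HUGETLB and the toy_*
functions are invented stand-ins rather than the kernel definitions:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct mm_walk_ops; not the kernel layout. */
struct toy_walk_ops {
	int (*pte_entry)(unsigned long addr);
	int (*hugetlb_entry)(unsigned long addr);
};

static int toy_pte_entry(unsigned long addr)
{
	return addr ? 0 : -1;
}

/* TOY_HUGETLB plays the role of CONFIG_HUGETLB_PAGE in this sketch. */
#ifdef TOY_HUGETLB
static int toy_hugetlb_entry(unsigned long addr)
{
	return addr ? 0 : -1;
}
#endif

static const struct toy_walk_ops toy_ops = {
	.pte_entry	= toy_pte_entry,
#ifdef TOY_HUGETLB
	.hugetlb_entry	= toy_hugetlb_entry,
#endif
	/* Any member not named above is implicitly a null pointer. */
};
```

Built without TOY_HUGETLB, toy_ops.hugetlb_entry is NULL, so a walker that
checks the pointer before calling it skips the callback either way.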
Thanks,
Muchun
> +#endif
>
> static int prot_none_test(unsigned long addr, unsigned long next,
> struct mm_walk *walk)
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-26 3:17 ` Muchun Song
@ 2026-06-26 4:03 ` Dev Jain
2026-06-26 4:16 ` Muchun Song
0 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-26 4:03 UTC (permalink / raw)
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
On 26/06/26 8:47 am, Muchun Song wrote:
>
>
> On 2026/6/25 19:29, Dev Jain wrote:
>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>> in pvmw.pte, but the code reads it with ptep_get().
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>> etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>
>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> include/linux/hugetlb.h | 3 +++
>> mm/rmap.c | 16 ++++++++++------
>> 2 files changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index 2abaf99321e90..fdb7bdf7645c5 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>> {
>> }
>> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>> + pte_t *ptep);
>
> Thanks so much for the fix! I'm curious, though: why do we
> need to add a separate declaration for this function here?
For !CONFIG_HUGETLB_PAGE, the compiler complains that there is no
huge_ptep_get(). So this declaration is there to keep the compiler happy.
>
> Thanks,
> Muchun
>
>> +
>> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>> unsigned long addr, pte_t *ptep)
>> {
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> /* Unexpected PMD-mapped THP? */
>> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>> - /*
>> - * Handle PFN swap PTEs, such as device-exclusive ones, that
>> - * actually map pages.
>> - */
>> - pteval = ptep_get(pvmw.pte);
>> + address = pvmw.address;
>> + if (folio_test_hugetlb(folio)) {
>> + pteval = huge_ptep_get(mm, address, pvmw.pte);
>> + } else {
>> + /*
>> + * Handle PFN swap PTEs, such as device-exclusive ones,
>> + * that actually map pages.
>> + */
>> + pteval = ptep_get(pvmw.pte);
>> + }
>> if (likely(pte_present(pteval))) {
>> pfn = pte_pfn(pteval);
>> } else {
>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> }
>> subpage = folio_page(folio, pfn - folio_pfn(folio));
>> - address = pvmw.address;
>> anon_exclusive = folio_test_anon(folio) &&
>> PageAnonExclusive(subpage);
>>
>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 2:31 ` Lance Yang
@ 2026-06-26 4:06 ` Dev Jain
0 siblings, 0 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-26 4:06 UTC (permalink / raw)
To: Lance Yang
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 8:01 am, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>
> Just one ordering thing: should this patch come first?
>
> Patches #01-#03 only reach the new huge_ptep_get() after
> page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
> goes through check_pte() (still using ptep_get()).
You are right, but do we care? This series is not meant to add functionality.
I only sent these as a series because they are similar fixes; each patch can
be applied individually, with no dependency on the others.
>
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>> + else
>> + ptent = ptep_get(pvmw->pte);
>>
>> if (pvmw->flags & PVMW_MIGRATION) {
>> const softleaf_t entry = softleaf_from_pte(ptent);
>> --
>> 2.43.0
>>
>>
>
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 3:40 ` Muchun Song
@ 2026-06-26 4:08 ` Dev Jain
2026-06-26 4:21 ` Muchun Song
0 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-26 4:08 UTC (permalink / raw)
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 26/06/26 9:10 am, Muchun Song wrote:
>
>
> On 2026/6/25 19:29, Dev Jain wrote:
>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>> mprotect(PROT_NONE) PFN permission walk on x86.
>>
>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>> architectures where hugetlb entries need huge_ptep_get(), reading that
>> entry with ptep_get() can make the permission check use the wrong PFN.
>>
>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>
>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>> But use the correct helper anyways.
>>
>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/mprotect.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028cf..23779632d18bf 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>> 0 : -EACCES;
>> }
>> +#ifdef CONFIG_HUGETLB_PAGE
>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>> unsigned long addr, unsigned long next,
>> struct mm_walk *walk)
>> {
>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>> +
>> + return pfn_modify_allowed(pte_pfn(entry),
>> *(pgprot_t *)(walk->private)) ?
>> 0 : -EACCES;
>> }
>> +#else
>> +#define prot_none_hugetlb_entry NULL
>
> This is very strange, because we defined a stub as NULL for a helper
I was following the pattern used elsewhere: search for ".hugetlb_entry" in
the codebase and you will find others doing the same.
> function. How about the following diff?
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028c..4d8c1551fbce 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>
> static const struct mm_walk_ops prot_none_walk_ops = {
> .pte_entry = prot_none_pte_entry,
> +#ifdef CONFIG_HUGETLB_PAGE
> .hugetlb_entry = prot_none_hugetlb_entry,
> +#endif
> .test_walk = prot_none_test,
> .walk_lock = PGWALK_WRLOCK,
> };
>
> Thanks,
> Muchun
>
>> +#endif
>> static int prot_none_test(unsigned long addr, unsigned long next,
>> struct mm_walk *walk)
>
* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
@ 2026-06-26 4:09 ` Dev Jain
0 siblings, 0 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-26 4:09 UTC (permalink / raw)
To: Zi Yan, muchun.song, osalvador, akpm, ljs, david, liam
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
anshuman.khandual
On 25/06/26 7:29 pm, Zi Yan wrote:
> On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
>> There are various places which use ptep_get() to get the pte entry
>> corresponding to a hugetlb folio. Some arches have special handling
>
> I think it is better to mention s390 as a concrete example.
Sure. In case there is no v2, I would ask Andrew to change
"Some arches have special handling" to "Some arches like s390 have
special handling".
>
>> to compute the pteval, so they provide huge_ptep_get(). Use this
>> helper consistently.
>>
>> Dev Jain (5):
>> mm/rmap: use huge_ptep_get() in try_to_unmap_one()
>> mm/rmap: use huge_ptep_get() in try_to_migrate_one()
>> mm/migrate: use huge_ptep_get() in remove_migration_pte()
>> mm/page_vma_mapped: use huge_ptep_get() for hugetlb
>> mm/mprotect: use huge_ptep_get() for hugetlb
>>
>> include/linux/hugetlb.h | 3 +++
>> mm/migrate.c | 6 +++++-
>> mm/mprotect.c | 8 +++++++-
>> mm/page_vma_mapped.c | 8 +++++++-
>> mm/rmap.c | 32 ++++++++++++++++++++------------
>> 5 files changed, 42 insertions(+), 15 deletions(-)
>
>
>
>
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-26 4:03 ` Dev Jain
@ 2026-06-26 4:16 ` Muchun Song
0 siblings, 0 replies; 27+ messages in thread
From: Muchun Song @ 2026-06-26 4:16 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
> On Jun 26, 2026, at 12:03, Dev Jain <dev.jain@arm.com> wrote:
>
>
>
> On 26/06/26 8:47 am, Muchun Song wrote:
>>
>>
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>>> in pvmw.pte, but the code reads it with ptep_get().
>>>
>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>>> etc to misbehave.
>>>
>>> It is not clear whether this has a trivially visible effect to userspace.
>>>
>>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>>
>>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> include/linux/hugetlb.h | 3 +++
>>> mm/rmap.c | 16 ++++++++++------
>>> 2 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>> index 2abaf99321e90..fdb7bdf7645c5 100644
>>> --- a/include/linux/hugetlb.h
>>> +++ b/include/linux/hugetlb.h
>>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>>> {
>>> }
>>> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>>> + pte_t *ptep);
>>
>> Thanks so much for the fix! I'm curious, though: why do we
>> need to add a separate declaration for this function here?
>
> For !CONFIG_HUGETLB_PAGE, the compiler complains that there is no
> huge_ptep_get(), so this declaration is there to keep the compiler happy.
Got it. We can refer to 5d4af6195c87c6b162b7963e0ad00a214b80d764 to fix
this warning.
Muchun,
Thanks.
>
>>
>> Thanks,
>> Muchun
>>
>>> +
>>> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>>> unsigned long addr, pte_t *ptep)
>>> {
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>> /* Unexpected PMD-mapped THP? */
>>> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>>> - /*
>>> - * Handle PFN swap PTEs, such as device-exclusive ones, that
>>> - * actually map pages.
>>> - */
>>> - pteval = ptep_get(pvmw.pte);
>>> + address = pvmw.address;
>>> + if (folio_test_hugetlb(folio)) {
>>> + pteval = huge_ptep_get(mm, address, pvmw.pte);
>>> + } else {
>>> + /*
>>> + * Handle PFN swap PTEs, such as device-exclusive ones,
>>> + * that actually map pages.
>>> + */
>>> + pteval = ptep_get(pvmw.pte);
>>> + }
>>> if (likely(pte_present(pteval))) {
>>> pfn = pte_pfn(pteval);
>>> } else {
>>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>> }
>>> subpage = folio_page(folio, pfn - folio_pfn(folio));
>>> - address = pvmw.address;
>>> anon_exclusive = folio_test_anon(folio) &&
>>> PageAnonExclusive(subpage);
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 4:08 ` Dev Jain
@ 2026-06-26 4:21 ` Muchun Song
2026-06-26 4:42 ` Dev Jain
0 siblings, 1 reply; 27+ messages in thread
From: Muchun Song @ 2026-06-26 4:21 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
>
>
>
> On 26/06/26 9:10 am, Muchun Song wrote:
>>
>>
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>>
>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>>
>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>>
>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>> But use the correct helper anyways.
>>>
>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> mm/mprotect.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028cf..23779632d18bf 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>> 0 : -EACCES;
>>> }
>>> +#ifdef CONFIG_HUGETLB_PAGE
>>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>> unsigned long addr, unsigned long next,
>>> struct mm_walk *walk)
>>> {
>>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>> +
>>> + return pfn_modify_allowed(pte_pfn(entry),
>>> *(pgprot_t *)(walk->private)) ?
>>> 0 : -EACCES;
>>> }
>>> +#else
>>> +#define prot_none_hugetlb_entry NULL
>>
>> This is very strange, because we defined a stub as NULL for a helper
>
> I was following the pattern used elsewhere: search for ".hugetlb_entry" in
> the codebase and you will find others doing the same.
Okay, I understand why you want to do it that way, but I would still
recommend not following that format.
Thanks.
>
>
>> function. How about the following diff?
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028c..4d8c1551fbce 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>>
>> static const struct mm_walk_ops prot_none_walk_ops = {
>> .pte_entry = prot_none_pte_entry,
>> +#ifdef CONFIG_HUGETLB_PAGE
>> .hugetlb_entry = prot_none_hugetlb_entry,
>> +#endif
>> .test_walk = prot_none_test,
>> .walk_lock = PGWALK_WRLOCK,
>> };
>>
>> Thanks,
>> Muchun
>>
>>> +#endif
>>> static int prot_none_test(unsigned long addr, unsigned long next,
>>> struct mm_walk *walk)
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 4:21 ` Muchun Song
@ 2026-06-26 4:42 ` Dev Jain
0 siblings, 0 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-26 4:42 UTC (permalink / raw)
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 26/06/26 9:51 am, Muchun Song wrote:
>
>
>> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
>>
>>
>>
>> On 26/06/26 9:10 am, Muchun Song wrote:
>>>
>>>
>>> On 2026/6/25 19:29, Dev Jain wrote:
>>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>>>
>>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>>>
>>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>>>
>>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>>> But use the correct helper anyways.
>>>>
>>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>> ---
>>>> mm/mprotect.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>>> index 9cbf932b028cf..23779632d18bf 100644
>>>> --- a/mm/mprotect.c
>>>> +++ b/mm/mprotect.c
>>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>>> 0 : -EACCES;
>>>> }
>>>> +#ifdef CONFIG_HUGETLB_PAGE
>>>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>>> unsigned long addr, unsigned long next,
>>>> struct mm_walk *walk)
>>>> {
>>>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>>> +
>>>> + return pfn_modify_allowed(pte_pfn(entry),
>>>> *(pgprot_t *)(walk->private)) ?
>>>> 0 : -EACCES;
>>>> }
>>>> +#else
>>>> +#define prot_none_hugetlb_entry NULL
>>>
>>> This is very strange, because we defined a stub as NULL for a helper
>>
>> I was following the pattern used elsewhere: search for ".hugetlb_entry" in
>> the codebase and you will find others doing the same.
>
> Okay, I understand why you want to do it that way, but I would still
> recommend not following that format.
Okay, then I'll update v2 with the diff below.
>
> Thanks.
>
>>
>>
>>> function. How about the following diff?
>>>
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028c..4d8c1551fbce 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>>>
>>> static const struct mm_walk_ops prot_none_walk_ops = {
>>> .pte_entry = prot_none_pte_entry,
>>> +#ifdef CONFIG_HUGETLB_PAGE
>>> .hugetlb_entry = prot_none_hugetlb_entry,
>>> +#endif
>>> .test_walk = prot_none_test,
>>> .walk_lock = PGWALK_WRLOCK,
>>> };
>>>
>>> Thanks,
>>> Muchun
>>>
>>>> +#endif
>>>> static int prot_none_test(unsigned long addr, unsigned long next,
>>>> struct mm_walk *walk)
>
>
>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
2026-06-26 2:31 ` Lance Yang
@ 2026-06-26 7:48 ` Lance Yang
2026-06-26 9:14 ` Lance Yang
2026-06-26 13:23 ` Dev Jain
1 sibling, 2 replies; 27+ messages in thread
From: Lance Yang @ 2026-06-26 7:48 UTC (permalink / raw)
To: dev.jain, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> {
> unsigned long pfn;
>- pte_t ptent = ptep_get(pvmw->pte);
>+ pte_t ptent;
>+
>+ if (is_vm_hugetlb_page(pvmw->vma))
>+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+ pvmw->pte);
I think check_pte() can pass a wrong address to huge_ptep_get() ...
Not sure that is wrong in the first place. For memory failure,
page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
folio. In that case, pvmw->address need not be hugepage-aligned.
@Miaohe
For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
page_vma_mapped_walk() passes that size to hugetlb_walk():
bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
{
...
if (unlikely(is_vm_hugetlb_page(vma))) {
...
pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
...
}
...
}
hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
calculating pmdp:
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
...
if (sz == CONT_PMD_SIZE)
addr &= CONT_PMD_MASK;
pmdp = pmd_offset(pudp, addr);
pmd = READ_ONCE(*pmdp);
...
}
So for that case, pvmw->pte is calculated from the aligned addr, not
necessarily from the original pvmw->address. But check_pte() passes the
original address together with pvmw->pte:
+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+ pvmw->pte);
arm64 then uses that addr again to choose ncontig:
pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
...
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
for (i = 0; i < ncontig; i++, ptep++) {
...
}
return orig_pte;
}
static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, size_t *pgsize)
{
pgd_t *pgdp = pgd_offset(mm, addr);
p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
*pgsize = PAGE_SIZE;
p4dp = p4d_offset(pgdp, addr);
pudp = pud_offset(p4dp, addr);
pmdp = pmd_offset(pudp, addr);
if ((pte_t *)pmdp == ptep) {
*pgsize = PMD_SIZE;
return CONT_PMDS;
}
return CONT_PTES;
}
With a tail address, pmdp may no longer point at pvmw->pte, so
find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
can walk past the CONT_PMD entries, and possibly past the PMD table.
Should check_pte() pass the address matching pvmw->pte, sth like:
---8<---
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 406fd50bbd8f..58463493bd3d 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
unsigned long pfn;
pte_t ptent;
- if (is_vm_hugetlb_page(pvmw->vma))
- ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
- pvmw->pte);
- else
+ if (is_vm_hugetlb_page(pvmw->vma)) {
+ struct hstate *hstate = hstate_vma(pvmw->vma);
+ unsigned long haddr = pvmw->address & huge_page_mask(hstate);
+
+ ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
+ } else {
ptent = ptep_get(pvmw->pte);
+ }
if (pvmw->flags & PVMW_MIGRATION) {
const softleaf_t entry = softleaf_from_pte(ptent);
--
while leaving pvmw->address unchanged for page_mapped_in_vma()?
Cheers, Lance
>+ else
>+ ptent = ptep_get(pvmw->pte);
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
>--
>2.43.0
>
>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 7:48 ` Lance Yang
@ 2026-06-26 9:14 ` Lance Yang
2026-06-26 13:23 ` Dev Jain
1 sibling, 0 replies; 27+ messages in thread
From: Lance Yang @ 2026-06-26 9:14 UTC (permalink / raw)
To: dev.jain, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 2026/6/26 15:48, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>
> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>
> Not sure that is wrong in the first place. For memory failure,
> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
> folio. In that case, pvmw->address need not be hugepage-aligned.
>
> @Miaohe
>
> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
> page_vma_mapped_walk() passes that size to hugetlb_walk():
>
> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> {
> ...
> if (unlikely(is_vm_hugetlb_page(vma))) {
> ...
> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
> ...
> }
> ...
> }
>
> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
> calculating pmdp:
>
> pte_t *huge_pte_offset(struct mm_struct *mm,
> unsigned long addr, unsigned long sz)
> {
> ...
> if (sz == CONT_PMD_SIZE)
> addr &= CONT_PMD_MASK;
>
> pmdp = pmd_offset(pudp, addr);
> pmd = READ_ONCE(*pmdp);
> ...
> }
>
> So for that case, pvmw->pte is calculated from the aligned addr, not
> necessarily from the original pvmw->address. But check_pte() passes the
> original address together with pvmw->pte:
>
> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> + pvmw->pte);
In addition:
I went through all the arch code that has its own huge_ptep_get(); only
arm64 and powerpc actually use addr, and there addr has to match the
ptep, IIUC.
So I am wondering whether all huge_ptep_get() callers satisfy that
requirement.
Cheers, Lance
>
> arm64 then uses that addr again to choose ncontig:
>
> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> ...
> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
> for (i = 0; i < ncontig; i++, ptep++) {
> ...
> }
> return orig_pte;
> }
>
> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, size_t *pgsize)
> {
> pgd_t *pgdp = pgd_offset(mm, addr);
> p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
>
> *pgsize = PAGE_SIZE;
> p4dp = p4d_offset(pgdp, addr);
> pudp = pud_offset(p4dp, addr);
> pmdp = pmd_offset(pudp, addr);
> if ((pte_t *)pmdp == ptep) {
> *pgsize = PMD_SIZE;
> return CONT_PMDS;
> }
> return CONT_PTES;
> }
>
> With a tail address, pmdp may no longer point at pvmw->pte, so
> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>
> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
> can walk past the CONT_PMD entries, and possibly past the PMD table.
>
> Should check_pte() pass the address matching pvmw->pte, sth like:
>
> ---8<---
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index 406fd50bbd8f..58463493bd3d 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> unsigned long pfn;
> pte_t ptent;
>
> - if (is_vm_hugetlb_page(pvmw->vma))
> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> - pvmw->pte);
> - else
> + if (is_vm_hugetlb_page(pvmw->vma)) {
> + struct hstate *hstate = hstate_vma(pvmw->vma);
> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
> +
> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
> + } else {
> ptent = ptep_get(pvmw->pte);
> + }
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
> --
>
> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>
> Cheers, Lance
>
>> + else
>> + ptent = ptep_get(pvmw->pte);
>>
>> if (pvmw->flags & PVMW_MIGRATION) {
>> const softleaf_t entry = softleaf_from_pte(ptent);
>> --
>> 2.43.0
>>
>>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 7:48 ` Lance Yang
2026-06-26 9:14 ` Lance Yang
@ 2026-06-26 13:23 ` Dev Jain
2026-06-26 14:10 ` Lance Yang
1 sibling, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-26 13:23 UTC (permalink / raw)
To: Lance Yang, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 1:18 pm, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>
> I think check_pte() can pass a wrong address to huge_ptep_get() ...
Won't this be handled by rmap_walk_anon()/rmap_walk_file()? They are the ones
performing the rmap traversal and passing the address to
try_to_unmap_one()/folio_referenced_one() etc ...
>
> Not sure that is wrong in the first place. For memory failure,
> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
> folio. In that case, pvmw->address need not be hugepage-aligned.
>
> @Miaohe
>
> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
> page_vma_mapped_walk() passes that size to hugetlb_walk():
>
> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> {
> ...
> if (unlikely(is_vm_hugetlb_page(vma))) {
> ...
> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
> ...
> }
> ...
> }
>
> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
> calculating pmdp:
>
> pte_t *huge_pte_offset(struct mm_struct *mm,
> unsigned long addr, unsigned long sz)
> {
> ...
> if (sz == CONT_PMD_SIZE)
> addr &= CONT_PMD_MASK;
>
> pmdp = pmd_offset(pudp, addr);
> pmd = READ_ONCE(*pmdp);
> ...
> }
>
> So for that case, pvmw->pte is calculated from the aligned addr, not
> necessarily from the original pvmw->address. But check_pte() passes the
> original address together with pvmw->pte:
>
> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> + pvmw->pte);
>
> arm64 then uses that addr again to choose ncontig:
>
> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> ...
> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
> for (i = 0; i < ncontig; i++, ptep++) {
> ...
> }
> return orig_pte;
> }
>
> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> pte_t *ptep, size_t *pgsize)
> {
> pgd_t *pgdp = pgd_offset(mm, addr);
> p4d_t *p4dp;
> pud_t *pudp;
> pmd_t *pmdp;
>
> *pgsize = PAGE_SIZE;
> p4dp = p4d_offset(pgdp, addr);
> pudp = pud_offset(p4dp, addr);
> pmdp = pmd_offset(pudp, addr);
> if ((pte_t *)pmdp == ptep) {
> *pgsize = PMD_SIZE;
> return CONT_PMDS;
> }
> return CONT_PTES;
> }
>
> With a tail address, pmdp may no longer point at pvmw->pte, so
> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>
> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
> can walk past the CONT_PMD entries, and possibly past the PMD table.
>
> Should check_pte() pass the address matching pvmw->pte, sth like:
>
> ---8<---
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index 406fd50bbd8f..58463493bd3d 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> unsigned long pfn;
> pte_t ptent;
>
> - if (is_vm_hugetlb_page(pvmw->vma))
> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> - pvmw->pte);
> - else
> + if (is_vm_hugetlb_page(pvmw->vma)) {
> + struct hstate *hstate = hstate_vma(pvmw->vma);
> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
> +
> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
> + } else {
> ptent = ptep_get(pvmw->pte);
> + }
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
> --
>
> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>
> Cheers, Lance
>
>> + else
>> + ptent = ptep_get(pvmw->pte);
>>
>> if (pvmw->flags & PVMW_MIGRATION) {
>> const softleaf_t entry = softleaf_from_pte(ptent);
>> --
>> 2.43.0
>>
>>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 13:23 ` Dev Jain
@ 2026-06-26 14:10 ` Lance Yang
2026-06-26 15:26 ` Dev Jain
0 siblings, 1 reply; 27+ messages in thread
From: Lance Yang @ 2026-06-26 14:10 UTC (permalink / raw)
To: dev.jain, linmiaohe
Cc: lance.yang, muchun.song, osalvador, akpm, ljs, david, liam, riel,
vbabka, harry, jannh, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>
>
>On 26/06/26 1:18 pm, Lance Yang wrote:
>>
>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>> to a hugetlb entry.
>>>
>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>> pte_present() etc to misbehave.
>>>
>>> It is not clear whether this has a trivially visible effect to userspace.
>>>
>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>
>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> mm/page_vma_mapped.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>> {
>>> unsigned long pfn;
>>> - pte_t ptent = ptep_get(pvmw->pte);
>>> + pte_t ptent;
>>> +
>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>> + pvmw->pte);
>>
>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>
>Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>etc ...

Right, that should cover the rmap callbacks. The bit I was worried about
is page_mapped_in_vma() though.

>>
>> Not sure that is wrong in the first place. For memory failure,
>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>
>> @Miaohe

For hugetlb memory failure we start with the poisoned PFN:

	static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
	{
		...
		struct page *p = pfn_to_page(pfn);
		struct folio *folio;
		...

		folio = page_folio(p);

		...

		if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
			...
		}

		...
	}

and pass the same p down:

	static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
			unsigned long pfn, int flags)
	{
		...

		collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);

		...
	}

	static void collect_procs(const struct folio *folio, const struct page *page,
			struct list_head *tokill, int force_early)
	{
		...

		if (unlikely(folio_test_ksm(folio)))
			...
		else if (folio_test_anon(folio))
			collect_procs_anon(folio, page, tokill, force_early);
		else
			...
	}

So collect_procs_anon() still gets the poisoned page, not &folio->page:

	static void collect_procs_anon(const struct folio *folio,
			const struct page *page, struct list_head *to_kill,
			int force_early)
	{
		...

		pgoff = page_pgoff(folio, page);
		rcu_read_lock();
		for_each_process(tsk) {
			...

			anon_vma_interval_tree_foreach(vmac, &av->rb_root,
						       pgoff, pgoff) {
				...
				addr = page_mapped_in_vma(page, vma);
				...
			}
		}
		rcu_read_unlock();
		anon_vma_unlock_read(av);
	}

page_mapped_in_vma() then builds pvmw for that page:

	unsigned long page_mapped_in_vma(const struct page *page,
			struct vm_area_struct *vma)
	{
		const struct folio *folio = page_folio(page);
		struct page_vma_mapped_walk pvmw = {
			.pfn = page_to_pfn(page),
			.nr_pages = 1,
			.vma = vma,
			.flags = PVMW_SYNC,
		};

		pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
		...
	}

and page_pgoff() includes the subpage index:

	static inline pgoff_t page_pgoff(const struct folio *folio,
			const struct page *page)
	{
		return folio->index + folio_page_idx(folio, page);
	}

So if the poisoned PFN points to a tail page, pvmw->address can be offset
from the start of the hugetlb mapping by

	folio_page_idx(folio, page) << PAGE_SHIFT

Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
for that case?

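
To make the offset concrete, here is a small userspace sketch of the
arithmetic. It is not kernel code. The EX_* constants and ex_* helpers are
made up for illustration and assume 4K base pages and a 32M hugetlb page;
ex_tail_address() only mimics the effect of vma_address() being handed a
pgoff that includes the subpage index.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed example geometry: 4K base pages, one 32M hugetlb page. */
#define EX_PAGE_SHIFT	12
#define EX_HPAGE_SIZE	(UINT64_C(32) << 20)
#define EX_HPAGE_MASK	(~(EX_HPAGE_SIZE - 1))

/*
 * Address that a walk would start from when it is keyed on a subpage:
 * the start of the huge mapping plus (subpage index << PAGE_SHIFT).
 */
static inline uint64_t ex_tail_address(uint64_t map_start, uint64_t page_idx)
{
	/* The subpage index must lie inside the huge page. */
	assert(page_idx < (EX_HPAGE_SIZE >> EX_PAGE_SHIFT));
	return map_start + (page_idx << EX_PAGE_SHIFT);
}

/* Round an address down to the start of its huge page. */
static inline uint64_t ex_huge_align(uint64_t addr)
{
	return addr & EX_HPAGE_MASK;
}
```

With a mapping that starts at 0x40000000 and a poisoned page at subpage
index 5, ex_tail_address() gives 0x40005000, and ex_huge_align() brings that
back to 0x40000000, which is the address the huge pte was looked up with.
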
Cheers, Lance
>>
>> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
>> page_vma_mapped_walk() passes that size to hugetlb_walk():
>>
>> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> {
>> ...
>> if (unlikely(is_vm_hugetlb_page(vma))) {
>> ...
>> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
>> ...
>> }
>> ...
>> }
>>
>> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
>> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
>> calculating pmdp:
>>
>> pte_t *huge_pte_offset(struct mm_struct *mm,
>> unsigned long addr, unsigned long sz)
>> {
>> ...
>> if (sz == CONT_PMD_SIZE)
>> addr &= CONT_PMD_MASK;
>>
>> pmdp = pmd_offset(pudp, addr);
>> pmd = READ_ONCE(*pmdp);
>> ...
>> }
>>
>> So for that case, pvmw->pte is calculated from the aligned addr, not
>> necessarily from the original pvmw->address. But check_pte() passes the
>> original address together with pvmw->pte:
>>
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>>
>> arm64 then uses that addr again to choose ncontig:
>>
>> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>> {
>> ...
>> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>> for (i = 0; i < ncontig; i++, ptep++) {
>> ...
>> }
>> return orig_pte;
>> }
>>
>> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>> pte_t *ptep, size_t *pgsize)
>> {
>> pgd_t *pgdp = pgd_offset(mm, addr);
>> p4d_t *p4dp;
>> pud_t *pudp;
>> pmd_t *pmdp;
>>
>> *pgsize = PAGE_SIZE;
>> p4dp = p4d_offset(pgdp, addr);
>> pudp = pud_offset(p4dp, addr);
>> pmdp = pmd_offset(pudp, addr);
>> if ((pte_t *)pmdp == ptep) {
>> *pgsize = PMD_SIZE;
>> return CONT_PMDS;
>> }
>> return CONT_PTES;
>> }
>>
>> With a tail address, pmdp may no longer point at pvmw->pte, so
>> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>>
>> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
>> can walk past the CONT_PMD entries, and possibly past the PMD table.
>>
>> Should check_pte() pass the address matching pvmw->pte, sth like:
>>
>> ---8<---
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 406fd50bbd8f..58463493bd3d 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> unsigned long pfn;
>> pte_t ptent;
>>
>> - if (is_vm_hugetlb_page(pvmw->vma))
>> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> - pvmw->pte);
>> - else
>> + if (is_vm_hugetlb_page(pvmw->vma)) {
>> + struct hstate *hstate = hstate_vma(pvmw->vma);
>> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
>> +
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
>> + } else {
>> ptent = ptep_get(pvmw->pte);
>> + }
>>
>> if (pvmw->flags & PVMW_MIGRATION) {
>> const softleaf_t entry = softleaf_from_pte(ptent);
>> --
>>
>> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>>
>> Cheers, Lance
>>
>>> + else
>>> + ptent = ptep_get(pvmw->pte);
>>>
>>> if (pvmw->flags & PVMW_MIGRATION) {
>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>> --
>>> 2.43.0
>>>
>>>
>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 14:10 ` Lance Yang
@ 2026-06-26 15:26 ` Dev Jain
2026-06-26 16:46 ` Lance Yang
0 siblings, 1 reply; 27+ messages in thread
From: Dev Jain @ 2026-06-26 15:26 UTC (permalink / raw)
To: Lance Yang, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 7:40 pm, Lance Yang wrote:
>
> On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>>
>>
>> On 26/06/26 1:18 pm, Lance Yang wrote:
>>>
>>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>>> to a hugetlb entry.
>>>>
>>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>>> pte_present() etc to misbehave.
>>>>
>>>> It is not clear whether this has a trivially visible effect to userspace.
>>>>
>>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>>
>>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>>> Cc: stable@vger.kernel.org
>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>> ---
>>>> mm/page_vma_mapped.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>>> --- a/mm/page_vma_mapped.c
>>>> +++ b/mm/page_vma_mapped.c
>>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>> {
>>>> unsigned long pfn;
>>>> - pte_t ptent = ptep_get(pvmw->pte);
>>>> + pte_t ptent;
>>>> +
>>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>> + pvmw->pte);
>>>
>>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>>
>> Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>> performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>> etc ...
>
> Right, that should cover the rmap callbacks. The bit I was worried about
> is page_mapped_in_vma() though.
>
>>>
>>> Not sure that is wrong in the first place. For memory failure,
>>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>>
>>> @Miaohe
>
> For hugetlb memory failure we start with the poisoned PFN:
>
> static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
> {
> ...
> struct page *p = pfn_to_page(pfn);
> struct folio *folio;
> ...
>
> folio = page_folio(p);
>
> ...
>
> if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
> ...
> }
>
> ...
> }
>
> and pass the same p down:
>
> static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
> unsigned long pfn, int flags)
> {
> ...
>
> collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
>
> ...
> }
>
> static void collect_procs(const struct folio *folio, const struct page *page,
> struct list_head *tokill, int force_early)
> {
> ...
>
> if (unlikely(folio_test_ksm(folio)))
> ...
> else if (folio_test_anon(folio))
> collect_procs_anon(folio, page, tokill, force_early);
> else
> ...
> }
>
> So collect_procs_anon() still gets the poisoned page, not &folio->page:
>
> static void collect_procs_anon(const struct folio *folio,
> const struct page *page, struct list_head *to_kill,
> int force_early)
> {
> ...
>
> pgoff = page_pgoff(folio, page);
> rcu_read_lock();
> for_each_process(tsk) {
> ...
>
> anon_vma_interval_tree_foreach(vmac, &av->rb_root,
> pgoff, pgoff) {
> ...
> addr = page_mapped_in_vma(page, vma);
> ...
> }
> }
> rcu_read_unlock();
> anon_vma_unlock_read(av);
> }
>
> page_mapped_in_vma() then builds pvmw for that page:
>
> unsigned long page_mapped_in_vma(const struct page *page,
> struct vm_area_struct *vma)
> {
> const struct folio *folio = page_folio(page);
> struct page_vma_mapped_walk pvmw = {
> .pfn = page_to_pfn(page),
> .nr_pages = 1,
> .vma = vma,
> .flags = PVMW_SYNC,
> };
>
> pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
> ...
> }
>
> and page_pgoff() includes the subpage index:
>
> static inline pgoff_t page_pgoff(const struct folio *folio,
> const struct page *page)
> {
> return folio->index + folio_page_idx(folio, page);
> }
>
> So if the poisoned PFN points to a tail page, pvmw->address can be offset
> from the start of the hugetlb mapping by
>
> folio_page_idx(folio, page) << PAGE_SHIFT
>
> Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
> for that case?

Thanks! This looks correct.

I can indeed fix this up in check_pte. But in the memory-failure path
it has always been confusing to me for hugetlb folios why we are bothering
with the tail page. I am sure that area can also be simplified. But for
now I'll just do a simple fix here itself.

>
> Cheers, Lance
>
>>>
>>> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
>>> page_vma_mapped_walk() passes that size to hugetlb_walk():
>>>
>>> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>> {
>>> ...
>>> if (unlikely(is_vm_hugetlb_page(vma))) {
>>> ...
>>> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
>>> ...
>>> }
>>> ...
>>> }
>>>
>>> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
>>> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
>>> calculating pmdp:
>>>
>>> pte_t *huge_pte_offset(struct mm_struct *mm,
>>> unsigned long addr, unsigned long sz)
>>> {
>>> ...
>>> if (sz == CONT_PMD_SIZE)
>>> addr &= CONT_PMD_MASK;
>>>
>>> pmdp = pmd_offset(pudp, addr);
>>> pmd = READ_ONCE(*pmdp);
>>> ...
>>> }
>>>
>>> So for that case, pvmw->pte is calculated from the aligned addr, not
>>> necessarily from the original pvmw->address. But check_pte() passes the
>>> original address together with pvmw->pte:
>>>
>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>> + pvmw->pte);
>>>
>>> arm64 then uses that addr again to choose ncontig:
>>>
>>> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>>> {
>>> ...
>>> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>>> for (i = 0; i < ncontig; i++, ptep++) {
>>> ...
>>> }
>>> return orig_pte;
>>> }
>>>
>>> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>>> pte_t *ptep, size_t *pgsize)
>>> {
>>> pgd_t *pgdp = pgd_offset(mm, addr);
>>> p4d_t *p4dp;
>>> pud_t *pudp;
>>> pmd_t *pmdp;
>>>
>>> *pgsize = PAGE_SIZE;
>>> p4dp = p4d_offset(pgdp, addr);
>>> pudp = pud_offset(p4dp, addr);
>>> pmdp = pmd_offset(pudp, addr);
>>> if ((pte_t *)pmdp == ptep) {
>>> *pgsize = PMD_SIZE;
>>> return CONT_PMDS;
>>> }
>>> return CONT_PTES;
>>> }
>>>
>>> With a tail address, pmdp may no longer point at pvmw->pte, so
>>> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>>>
>>> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
>>> can walk past the CONT_PMD entries, and possibly past the PMD table.
>>>
>>> Should check_pte() pass the address matching pvmw->pte, sth like:
>>>
>>> ---8<---
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 406fd50bbd8f..58463493bd3d 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>> unsigned long pfn;
>>> pte_t ptent;
>>>
>>> - if (is_vm_hugetlb_page(pvmw->vma))
>>> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>> - pvmw->pte);
>>> - else
>>> + if (is_vm_hugetlb_page(pvmw->vma)) {
>>> + struct hstate *hstate = hstate_vma(pvmw->vma);
>>> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
>>> +
>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
>>> + } else {
>>> ptent = ptep_get(pvmw->pte);
>>> + }
>>>
>>> if (pvmw->flags & PVMW_MIGRATION) {
>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>> --
>>>
>>> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>>>
>>> Cheers, Lance
>>>
>>>> + else
>>>> + ptent = ptep_get(pvmw->pte);
>>>>
>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>> --
>>>> 2.43.0
>>>>
>>>>
>>
>>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 15:26 ` Dev Jain
@ 2026-06-26 16:46 ` Lance Yang
2026-06-27 3:54 ` Miaohe Lin
2026-06-27 7:13 ` Dev Jain
0 siblings, 2 replies; 27+ messages in thread
From: Lance Yang @ 2026-06-26 16:46 UTC (permalink / raw)
To: Dev Jain, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 2026/6/26 23:26, Dev Jain wrote:
>
>
> On 26/06/26 7:40 pm, Lance Yang wrote:
>>
>> On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>>>
>>>
>>> On 26/06/26 1:18 pm, Lance Yang wrote:
>>>>
>>>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>>>> to a hugetlb entry.
>>>>>
>>>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>>>> pte_present() etc to misbehave.
>>>>>
>>>>> It is not clear whether this has a trivially visible effect to userspace.
>>>>>
>>>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>>>
>>>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>>>> Cc: stable@vger.kernel.org
>>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>>> ---
>>>>> mm/page_vma_mapped.c | 8 +++++++-
>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>>>> --- a/mm/page_vma_mapped.c
>>>>> +++ b/mm/page_vma_mapped.c
>>>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>>> {
>>>>> unsigned long pfn;
>>>>> - pte_t ptent = ptep_get(pvmw->pte);
>>>>> + pte_t ptent;
>>>>> +
>>>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>> + pvmw->pte);
>>>>
>>>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>>>
>>> Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>>> performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>>> etc ...
>>
>> Right, that should cover the rmap callbacks. The bit I was worried about
>> is page_mapped_in_vma() though.
>>
>>>>
>>>> Not sure that is wrong in the first place. For memory failure,
>>>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>>>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>>>
>>>> @Miaohe
>>
>> For hugetlb memory failure we start with the poisoned PFN:
>>
>> static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
>> {
>> ...
>> struct page *p = pfn_to_page(pfn);
>> struct folio *folio;
>> ...
>>
>> folio = page_folio(p);
>>
>> ...
>>
>> if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
>> ...
>> }
>>
>> ...
>> }
>>
>> and pass the same p down:
>>
>> static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
>> unsigned long pfn, int flags)
>> {
>> ...
>>
>> collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
>>
>> ...
>> }
>>
>> static void collect_procs(const struct folio *folio, const struct page *page,
>> struct list_head *tokill, int force_early)
>> {
>> ...
>>
>> if (unlikely(folio_test_ksm(folio)))
>> ...
>> else if (folio_test_anon(folio))
>> collect_procs_anon(folio, page, tokill, force_early);
>> else
>> ...
>> }
>>
>> So collect_procs_anon() still gets the poisoned page, not &folio->page:
>>
>> static void collect_procs_anon(const struct folio *folio,
>> const struct page *page, struct list_head *to_kill,
>> int force_early)
>> {
>> ...
>>
>> pgoff = page_pgoff(folio, page);
>> rcu_read_lock();
>> for_each_process(tsk) {
>> ...
>>
>> anon_vma_interval_tree_foreach(vmac, &av->rb_root,
>> pgoff, pgoff) {
>> ...
>> addr = page_mapped_in_vma(page, vma);
>> ...
>> }
>> }
>> rcu_read_unlock();
>> anon_vma_unlock_read(av);
>> }
>>
>> page_mapped_in_vma() then builds pvmw for that page:
>>
>> unsigned long page_mapped_in_vma(const struct page *page,
>> struct vm_area_struct *vma)
>> {
>> const struct folio *folio = page_folio(page);
>> struct page_vma_mapped_walk pvmw = {
>> .pfn = page_to_pfn(page),
>> .nr_pages = 1,
>> .vma = vma,
>> .flags = PVMW_SYNC,
>> };
>>
>> pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
>> ...
>> }
>>
>> and page_pgoff() includes the subpage index:
>>
>> static inline pgoff_t page_pgoff(const struct folio *folio,
>> const struct page *page)
>> {
>> return folio->index + folio_page_idx(folio, page);
>> }
>>
>> So if the poisoned PFN points to a tail page, pvmw->address can be offset
>> from the start of the hugetlb mapping by
>>
>> folio_page_idx(folio, page) << PAGE_SHIFT
>>
>> Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
>> for that case?
>
> Thanks! This looks correct.
>
> I can indeed fix this up in check_pte. But in the memory-failure path
> it has always been confusing to me for hugetlb folios why we are bothering
> with the tail page. I am sure that area can also be simplified. But for
> now I'll just do a simple fix here itself.

Just thinking out loud: given that huge_ptep_get() already assumes that
addr matches the huge pte, at least on arm64, would it make sense to
have a small hugetlb wrapper around it that takes hstate and aligns
the address before calling the arch helper?

Might make the rule clearer, and a bit harder to get wrong again :)

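
Roughly what such a wrapper could look like, as a userspace mock rather than
a proposed patch. Every ex_* name here is hypothetical, the mock arch helper
drops the mm argument that the real huge_ptep_get() takes, and it only
records the address it was given so the alignment can be observed.

```c
#include <assert.h>
#include <stdint.h>

#define EX_PAGE_SHIFT	12	/* assumed 4K base pages */

typedef uint64_t ex_pte_t;	/* stand-in for pte_t */

/* Stand-in for struct hstate: only the huge page order matters here. */
struct ex_hstate {
	unsigned int order;
};

/* Last address seen by the mock arch helper, for inspection only. */
static uint64_t ex_last_arch_addr;

/* Mask that rounds an address down to the start of its huge page. */
static inline uint64_t ex_huge_page_mask(const struct ex_hstate *h)
{
	return ~((UINT64_C(1) << (h->order + EX_PAGE_SHIFT)) - 1);
}

/* Mock of the arch helper: it trusts that addr matches ptep. */
static inline ex_pte_t ex_arch_huge_ptep_get(uint64_t addr, const ex_pte_t *ptep)
{
	ex_last_arch_addr = addr;
	return *ptep;
}

/*
 * Hypothetical wrapper: align addr to the huge page described by h before
 * calling the arch helper, so callers may pass a tail address.
 */
static inline ex_pte_t ex_hugetlb_ptep_get(const struct ex_hstate *h,
					   uint64_t addr, const ex_pte_t *ptep)
{
	assert(h && ptep);
	return ex_arch_huge_ptep_get(addr & ex_huge_page_mask(h), ptep);
}
```

With order 13 (32M huge pages on 4K base pages), passing the tail address
0x40005000 makes the mock arch helper see 0x40000000. The caller's own copy
of the address, like pvmw->address, is left untouched.
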
Thanks, Lance
>
>>
>> Cheers, Lance
>>
>>>>
>>>> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
>>>> page_vma_mapped_walk() passes that size to hugetlb_walk():
>>>>
>>>> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>>> {
>>>> ...
>>>> if (unlikely(is_vm_hugetlb_page(vma))) {
>>>> ...
>>>> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
>>>> ...
>>>> }
>>>> ...
>>>> }
>>>>
>>>> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
>>>> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
>>>> calculating pmdp:
>>>>
>>>> pte_t *huge_pte_offset(struct mm_struct *mm,
>>>> unsigned long addr, unsigned long sz)
>>>> {
>>>> ...
>>>> if (sz == CONT_PMD_SIZE)
>>>> addr &= CONT_PMD_MASK;
>>>>
>>>> pmdp = pmd_offset(pudp, addr);
>>>> pmd = READ_ONCE(*pmdp);
>>>> ...
>>>> }
>>>>
>>>> So for that case, pvmw->pte is calculated from the aligned addr, not
>>>> necessarily from the original pvmw->address. But check_pte() passes the
>>>> original address together with pvmw->pte:
>>>>
>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>> + pvmw->pte);
>>>>
>>>> arm64 then uses that addr again to choose ncontig:
>>>>
>>>> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>>>> {
>>>> ...
>>>> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>>>> for (i = 0; i < ncontig; i++, ptep++) {
>>>> ...
>>>> }
>>>> return orig_pte;
>>>> }
>>>>
>>>> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>>>> pte_t *ptep, size_t *pgsize)
>>>> {
>>>> pgd_t *pgdp = pgd_offset(mm, addr);
>>>> p4d_t *p4dp;
>>>> pud_t *pudp;
>>>> pmd_t *pmdp;
>>>>
>>>> *pgsize = PAGE_SIZE;
>>>> p4dp = p4d_offset(pgdp, addr);
>>>> pudp = pud_offset(p4dp, addr);
>>>> pmdp = pmd_offset(pudp, addr);
>>>> if ((pte_t *)pmdp == ptep) {
>>>> *pgsize = PMD_SIZE;
>>>> return CONT_PMDS;
>>>> }
>>>> return CONT_PTES;
>>>> }
>>>>
>>>> With a tail address, pmdp may no longer point at pvmw->pte, so
>>>> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>>>>
>>>> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
>>>> can walk past the CONT_PMD entries, and possibly past the PMD table.
>>>>
>>>> Should check_pte() pass the address matching pvmw->pte, sth like:
>>>>
>>>> ---8<---
>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>> index 406fd50bbd8f..58463493bd3d 100644
>>>> --- a/mm/page_vma_mapped.c
>>>> +++ b/mm/page_vma_mapped.c
>>>> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>> unsigned long pfn;
>>>> pte_t ptent;
>>>>
>>>> - if (is_vm_hugetlb_page(pvmw->vma))
>>>> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>> - pvmw->pte);
>>>> - else
>>>> + if (is_vm_hugetlb_page(pvmw->vma)) {
>>>> + struct hstate *hstate = hstate_vma(pvmw->vma);
>>>> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
>>>> +
>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
>>>> + } else {
>>>> ptent = ptep_get(pvmw->pte);
>>>> + }
>>>>
>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>> --
>>>>
>>>> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>>>>
>>>> Cheers, Lance
>>>>
>>>>> + else
>>>>> + ptent = ptep_get(pvmw->pte);
>>>>>
>>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>>> --
>>>>> 2.43.0
>>>>>
>>>>>
>>>
>>>
>>
>
^ permalink raw reply [flat|nested] 27+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 16:46 ` Lance Yang
@ 2026-06-27 3:54 ` Miaohe Lin
2026-06-27 7:13 ` Dev Jain
1 sibling, 0 replies; 27+ messages in thread
From: Miaohe Lin @ 2026-06-27 3:54 UTC (permalink / raw)
To: Lance Yang, Dev Jain
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 2026/6/27 0:46, Lance Yang wrote:
>
>
> On 2026/6/26 23:26, Dev Jain wrote:
>>
>>
>> On 26/06/26 7:40 pm, Lance Yang wrote:
>>>
>>> On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>>>>
>>>>
>>>> On 26/06/26 1:18 pm, Lance Yang wrote:
>>>>>
>>>>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>>>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>>>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>>>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>>>>> to a hugetlb entry.
>>>>>>
>>>>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>>>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>>>>> pte_present() etc to misbehave.
>>>>>>
>>>>>> It is not clear whether this has a trivially visible effect to userspace.
>>>>>>
>>>>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>>>>
>>>>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>>>>> Cc: stable@vger.kernel.org
>>>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>>>> ---
>>>>>> mm/page_vma_mapped.c | 8 +++++++-
>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>>>>> --- a/mm/page_vma_mapped.c
>>>>>> +++ b/mm/page_vma_mapped.c
>>>>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>>>> {
>>>>>> unsigned long pfn;
>>>>>> - pte_t ptent = ptep_get(pvmw->pte);
>>>>>> + pte_t ptent;
>>>>>> +
>>>>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>>> + pvmw->pte);
>>>>>
>>>>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>>>>
>>>> Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>>>> performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>>>> etc ...
>>>
>>> Right, that should cover the rmap callbacks. The bit I was worried about
>>> is page_mapped_in_vma() though.
>>>
>>>>>
>>>>> Not sure that is wrong in the first place. For memory failure,
>>>>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>>>>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>>>>
>>>>> @Miaohe
>>>
>>> For hugetlb memory failure we start with the poisoned PFN:
>>>
>>> static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
>>> {
>>> ...
>>> struct page *p = pfn_to_page(pfn);
>>> struct folio *folio;
>>> ...
>>>
>>> folio = page_folio(p);
>>>
>>> ...
>>>
>>> if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
>>> ...
>>> }
>>>
>>> ...
>>> }
>>>
>>> and pass the same p down:
>>>
>>> static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
>>> unsigned long pfn, int flags)
>>> {
>>> ...
>>>
>>> collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
>>>
>>> ...
>>> }
>>>
>>> static void collect_procs(const struct folio *folio, const struct page *page,
>>> struct list_head *tokill, int force_early)
>>> {
>>> ...
>>>
>>> if (unlikely(folio_test_ksm(folio)))
>>> ...
>>> else if (folio_test_anon(folio))
>>> collect_procs_anon(folio, page, tokill, force_early);
>>> else
>>> ...
>>> }
>>>
>>> So collect_procs_anon() still gets the poisoned page, not &folio->page:
>>>
>>> static void collect_procs_anon(const struct folio *folio,
>>> const struct page *page, struct list_head *to_kill,
>>> int force_early)
>>> {
>>> ...
>>>
>>> pgoff = page_pgoff(folio, page);
>>> rcu_read_lock();
>>> for_each_process(tsk) {
>>> ...
>>>
>>> anon_vma_interval_tree_foreach(vmac, &av->rb_root,
>>> pgoff, pgoff) {
>>> ...
>>> addr = page_mapped_in_vma(page, vma);
>>> ...
>>> }
>>> }
>>> rcu_read_unlock();
>>> anon_vma_unlock_read(av);
>>> }
>>>
>>> page_mapped_in_vma() then builds pvmw for that page:
>>>
>>> unsigned long page_mapped_in_vma(const struct page *page,
>>> struct vm_area_struct *vma)
>>> {
>>> const struct folio *folio = page_folio(page);
>>> struct page_vma_mapped_walk pvmw = {
>>> .pfn = page_to_pfn(page),
>>> .nr_pages = 1,
>>> .vma = vma,
>>> .flags = PVMW_SYNC,
>>> };
>>>
>>> pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
>>> ...
>>> }
>>>
>>> and page_pgoff() includes the subpage index:
>>>
>>> static inline pgoff_t page_pgoff(const struct folio *folio,
>>> const struct page *page)
>>> {
>>> return folio->index + folio_page_idx(folio, page);
>>> }
>>>
>>> So if the poisoned PFN points to a tail page, pvmw->address can be offset
>>> from the start of the hugetlb mapping by
>>>
>>> folio_page_idx(folio, page) << PAGE_SHIFT
>>>
>>> Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
>>> for that case?
>>
>> Thanks! This looks correct.
>>
Thanks both.
>> I can indeed fix this up in check_pte. But in the memory-failure path
>> it has always been confusing to me for hugetlb folios why we are bothering
IIUC, the hugetlb tail page is used to calculate the exact hwpoisoned address,
which is then sent to userspace via SIGBUS.
>> with the tail page. I am sure that area can also be simplified. But for
>> now I'll just do a simple fix here itself.
>
> Just thinking out loud: given that huge_ptep_get() already assumes that
> addr matches the huge pte, at least on arm64, would it make sense to
> have a small hugetlb wrapper around it that takes hstate and aligns
> the address before calling the arch helper?
This proposal looks good to me, given that assumption.
Thanks.
>
> Might make the rule clearer, and a bit harder to get wrong again :)
>
> Thanks, Lance
>
>>
>>>
>>> Cheers, Lance
>>>
>>>>>
>>>>> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
>>>>> page_vma_mapped_walk() passes that size to hugetlb_walk():
>>>>>
>>>>> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>>>> {
>>>>> ...
>>>>> if (unlikely(is_vm_hugetlb_page(vma))) {
>>>>> ...
>>>>> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
>>>>> ...
>>>>> }
>>>>> ...
>>>>> }
>>>>>
>>>>> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
>>>>> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
>>>>> calculating pmdp:
>>>>>
>>>>> pte_t *huge_pte_offset(struct mm_struct *mm,
>>>>> unsigned long addr, unsigned long sz)
>>>>> {
>>>>> ...
>>>>> if (sz == CONT_PMD_SIZE)
>>>>> addr &= CONT_PMD_MASK;
>>>>>
>>>>> pmdp = pmd_offset(pudp, addr);
>>>>> pmd = READ_ONCE(*pmdp);
>>>>> ...
>>>>> }
>>>>>
>>>>> So for that case, pvmw->pte is calculated from the aligned addr, not
>>>>> necessarily from the original pvmw->address. But check_pte() passes the
>>>>> original address together with pvmw->pte:
>>>>>
>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>> + pvmw->pte);
>>>>>
>>>>> arm64 then uses that addr again to choose ncontig:
>>>>>
>>>>> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>>>>> {
>>>>> ...
>>>>> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>>>>> for (i = 0; i < ncontig; i++, ptep++) {
>>>>> ...
>>>>> }
>>>>> return orig_pte;
>>>>> }
>>>>>
>>>>> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>>>>> pte_t *ptep, size_t *pgsize)
>>>>> {
>>>>> pgd_t *pgdp = pgd_offset(mm, addr);
>>>>> p4d_t *p4dp;
>>>>> pud_t *pudp;
>>>>> pmd_t *pmdp;
>>>>>
>>>>> *pgsize = PAGE_SIZE;
>>>>> p4dp = p4d_offset(pgdp, addr);
>>>>> pudp = pud_offset(p4dp, addr);
>>>>> pmdp = pmd_offset(pudp, addr);
>>>>> if ((pte_t *)pmdp == ptep) {
>>>>> *pgsize = PMD_SIZE;
>>>>> return CONT_PMDS;
>>>>> }
>>>>> return CONT_PTES;
>>>>> }
>>>>>
>>>>> With a tail address, pmdp may no longer point at pvmw->pte, so
>>>>> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>>>>>
>>>>> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
>>>>> can walk past the CONT_PMD entries, and possibly past the PMD table.
>>>>>
>>>>> Should check_pte() pass the address matching pvmw->pte, sth like:
>>>>>
>>>>> ---8<---
>>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>>> index 406fd50bbd8f..58463493bd3d 100644
>>>>> --- a/mm/page_vma_mapped.c
>>>>> +++ b/mm/page_vma_mapped.c
>>>>> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>>> unsigned long pfn;
>>>>> pte_t ptent;
>>>>>
>>>>> - if (is_vm_hugetlb_page(pvmw->vma))
>>>>> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>> - pvmw->pte);
>>>>> - else
>>>>> + if (is_vm_hugetlb_page(pvmw->vma)) {
>>>>> + struct hstate *hstate = hstate_vma(pvmw->vma);
>>>>> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
>>>>> +
>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
>>>>> + } else {
>>>>> ptent = ptep_get(pvmw->pte);
>>>>> + }
>>>>>
>>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>>> --
>>>>>
>>>>> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>>>>>
>>>>> Cheers, Lance
>>>>>
>>>>>> + else
>>>>>> + ptent = ptep_get(pvmw->pte);
>>>>>>
>>>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>>>> --
>>>>>> 2.43.0
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>
>
> .
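
To make the tail-page point above concrete, here is a minimal userspace
sketch of the address arithmetic. It is not kernel code. It assumes 4K base
pages, a 2M hugetlb page, and a folio index counted in base pages; the
model_* names are hypothetical, and model_vma_address() is a simplified
stand-in for vma_address() without its range checks. It shows why the
address built for a poisoned tail page is offset from the hugepage start
(which is what the SIGBUS report wants) and what the aligned address looks
like (which is what the huge pte lookup wants).

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT	12		/* assumption: 4K base pages */
#define MODEL_HPAGE_SIZE	(1ULL << 21)	/* assumption: 2M hugetlb page */

/* Mirrors page_pgoff() as quoted in the thread: folio index + subpage index. */
static inline uint64_t model_page_pgoff(uint64_t folio_index, uint64_t page_idx)
{
	return folio_index + page_idx;
}

/* Simplified stand-in for vma_address(): no bounds or overflow checks. */
static inline uint64_t model_vma_address(uint64_t vm_start, uint64_t vm_pgoff,
					 uint64_t pgoff)
{
	return vm_start + ((pgoff - vm_pgoff) << MODEL_PAGE_SHIFT);
}

/* Round an address down to the start of its huge page. */
static inline uint64_t model_huge_align(uint64_t addr, uint64_t huge_size)
{
	return addr & ~(huge_size - 1);
}

/* Worked example: tail page 5 of the folio at base-page index 512. */
static inline void model_tail_address_check(void)
{
	uint64_t addr = model_vma_address(0x40000000ULL, 0,
					  model_page_pgoff(512, 5));

	/* The reported address points at the poisoned subpage ... */
	assert(addr == 0x40205000ULL);
	/* ... while the huge pte covers the aligned hugepage start. */
	assert(model_huge_align(addr, MODEL_HPAGE_SIZE) == 0x40200000ULL);
	/* The difference is the subpage index shifted by the page shift. */
	assert(addr - model_huge_align(addr, MODEL_HPAGE_SIZE) ==
	       (5ULL << MODEL_PAGE_SHIFT));
}
```

In this model the unaligned value is the one worth keeping in pvmw.address,
and only the copy handed to the huge pte helper is aligned.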
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 16:46 ` Lance Yang
2026-06-27 3:54 ` Miaohe Lin
@ 2026-06-27 7:13 ` Dev Jain
1 sibling, 0 replies; 27+ messages in thread
From: Dev Jain @ 2026-06-27 7:13 UTC (permalink / raw)
To: Lance Yang, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 10:16 pm, Lance Yang wrote:
>
>
> On 2026/6/26 23:26, Dev Jain wrote:
>>
>>
>> On 26/06/26 7:40 pm, Lance Yang wrote:
>>>
>>> On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>>>>
>>>>
>>>> On 26/06/26 1:18 pm, Lance Yang wrote:
>>>>>
>>>>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>>>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>>>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>>>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>>>>> to a hugetlb entry.
>>>>>>
>>>>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>>>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>>>>> pte_present() etc to misbehave.
>>>>>>
>>>>>> It is not clear whether this has a trivially visible effect to userspace.
>>>>>>
>>>>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>>>>
>>>>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>>>>> Cc: stable@vger.kernel.org
>>>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>>>> ---
>>>>>> mm/page_vma_mapped.c | 8 +++++++-
>>>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>>>>> --- a/mm/page_vma_mapped.c
>>>>>> +++ b/mm/page_vma_mapped.c
>>>>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>>>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>>>> {
>>>>>> unsigned long pfn;
>>>>>> - pte_t ptent = ptep_get(pvmw->pte);
>>>>>> + pte_t ptent;
>>>>>> +
>>>>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>>> + pvmw->pte);
>>>>>
>>>>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>>>>
>>>> Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>>>> performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>>>> etc ...
>>>
>>> Right, that should cover the rmap callbacks. The bit I was worried about
>>> is page_mapped_in_vma() though.
>>>
>>>>>
>>>>> Not sure that is wrong in the first place. For memory failure,
>>>>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>>>>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>>>>
>>>>> @Miaohe
>>>
>>> For hugetlb memory failure we start with the poisoned PFN:
>>>
>>> static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
>>> {
>>> ...
>>> struct page *p = pfn_to_page(pfn);
>>> struct folio *folio;
>>> ...
>>>
>>> folio = page_folio(p);
>>>
>>> ...
>>>
>>> if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
>>> ...
>>> }
>>>
>>> ...
>>> }
>>>
>>> and pass the same p down:
>>>
>>> static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
>>> unsigned long pfn, int flags)
>>> {
>>> ...
>>>
>>> collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
>>>
>>> ...
>>> }
>>>
>>> static void collect_procs(const struct folio *folio, const struct page *page,
>>> struct list_head *tokill, int force_early)
>>> {
>>> ...
>>>
>>> if (unlikely(folio_test_ksm(folio)))
>>> ...
>>> else if (folio_test_anon(folio))
>>> collect_procs_anon(folio, page, tokill, force_early);
>>> else
>>> ...
>>> }
>>>
>>> So collect_procs_anon() still gets the poisoned page, not &folio->page:
>>>
>>> static void collect_procs_anon(const struct folio *folio,
>>> const struct page *page, struct list_head *to_kill,
>>> int force_early)
>>> {
>>> ...
>>>
>>> pgoff = page_pgoff(folio, page);
>>> rcu_read_lock();
>>> for_each_process(tsk) {
>>> ...
>>>
>>> anon_vma_interval_tree_foreach(vmac, &av->rb_root,
>>> pgoff, pgoff) {
>>> ...
>>> addr = page_mapped_in_vma(page, vma);
>>> ...
>>> }
>>> }
>>> rcu_read_unlock();
>>> anon_vma_unlock_read(av);
>>> }
>>>
>>> page_mapped_in_vma() then builds pvmw for that page:
>>>
>>> unsigned long page_mapped_in_vma(const struct page *page,
>>> struct vm_area_struct *vma)
>>> {
>>> const struct folio *folio = page_folio(page);
>>> struct page_vma_mapped_walk pvmw = {
>>> .pfn = page_to_pfn(page),
>>> .nr_pages = 1,
>>> .vma = vma,
>>> .flags = PVMW_SYNC,
>>> };
>>>
>>> pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
>>> ...
>>> }
>>>
>>> and page_pgoff() includes the subpage index:
>>>
>>> static inline pgoff_t page_pgoff(const struct folio *folio,
>>> const struct page *page)
>>> {
>>> return folio->index + folio_page_idx(folio, page);
>>> }
>>>
>>> So if the poisoned PFN points to a tail page, pvmw->address can be offset
>>> from the start of the hugetlb mapping by
>>>
>>> folio_page_idx(folio, page) << PAGE_SHIFT
>>>
>>> Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
>>> for that case?
>>
>> Thanks! This looks correct.
>>
>> I can indeed fix this up in check_pte. But in the memory-failure path
>> it has always been confusing to me for hugetlb folios why we are bothering
>> with the tail page. I am sure that area can also be simplified. But for
>> now I'll just do a simple fix here itself.
>
> Just thinking out loud: given that huge_ptep_get() already assumes that
> addr matches the huge pte, at least on arm64, would it make sense to
> have a small hugetlb wrapper around it that takes hstate and aligns
> the address before calling the arch helper?
>
> Might make the rule clearer, and a bit harder to get wrong again :)
Are you suggesting something like:

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index fdb7bdf7645c..xxxxxxxxxxxx 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -825,6 +825,15 @@ static inline struct folio *filemap_lock_hugetlb_folio(struct hstate *h,
 #include <asm/hugetlb.h>
+static inline pte_t hugetlb_ptep_get(struct vm_area_struct *vma,
+				     unsigned long addr, pte_t *ptep)
+{
+	struct hstate *h = hstate_vma(vma);
+
+	return huge_ptep_get(vma->vm_mm, addr & huge_page_mask(h), ptep);
+}
+
 #ifndef is_hugepage_only_range
 static inline int is_hugepage_only_range(struct mm_struct *mm,
 					unsigned long addr, unsigned long len)
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 18e1d341f463..xxxxxxxxxxxx 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -110,8 +110,7 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 	pte_t ptent;
 	if (is_vm_hugetlb_page(pvmw->vma))
-		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
-				      pvmw->pte);
+		ptent = hugetlb_ptep_get(pvmw->vma, pvmw->address, pvmw->pte);
 	else
 		ptent = ptep_get(pvmw->pte);

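
For illustration only, the effect of aligning inside such a wrapper can be
modelled in userspace. The sketch below is not the arm64 implementation: it
assumes the 16K arm64 geometry discussed in this thread (PMD_SHIFT of 25,
CONT_PMDS of 32, CONT_PTES of 128), reduces find_num_contig() to a
comparison of PMD slot indices within one PMD table, and ignores the upper
page table levels. All model_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed arm64 geometry for 16K base pages (illustrative model only). */
#define MODEL_PMD_SHIFT		25	/* one PMD entry maps 32M */
#define MODEL_PTRS_PER_PMD	2048
#define MODEL_CONT_PMDS		32
#define MODEL_CONT_PTES		128
#define MODEL_CONT_PMD_SIZE	((uint64_t)MODEL_CONT_PMDS << MODEL_PMD_SHIFT)
#define MODEL_CONT_PMD_MASK	(~(MODEL_CONT_PMD_SIZE - 1))

/* Slot of the PMD entry covering addr within its PMD table. */
static inline unsigned int model_pmd_index(uint64_t addr)
{
	return (unsigned int)((addr >> MODEL_PMD_SHIFT) &
			      (MODEL_PTRS_PER_PMD - 1));
}

/*
 * Model of the find_num_contig() decision: pte_slot is the PMD slot that
 * the walk returned for the aligned address. If addr resolves to another
 * slot, the model falls through to the CONT_PTES answer.
 */
static inline int model_find_num_contig(uint64_t addr, unsigned int pte_slot)
{
	if (model_pmd_index(addr) == pte_slot)
		return MODEL_CONT_PMDS;
	return MODEL_CONT_PTES;
}

/* Model of the proposed wrapper: align first, then ask. */
static inline int model_aligned_num_contig(uint64_t addr, unsigned int pte_slot)
{
	return model_find_num_contig(addr & MODEL_CONT_PMD_MASK, pte_slot);
}

/* Worked example: a tail address one PMD (32M) plus one page into a 1G page. */
static inline void model_num_contig_check(void)
{
	uint64_t haddr = 0x40000000ULL;	/* CONT_PMD_SIZE aligned */
	uint64_t tail = haddr + (1ULL << MODEL_PMD_SHIFT) + 0x4000;
	unsigned int slot = model_pmd_index(haddr);

	assert(model_find_num_contig(tail, slot) == MODEL_CONT_PTES);
	assert(model_aligned_num_contig(tail, slot) == MODEL_CONT_PMDS);
}
```

A tail address that still falls inside the first 32M of the huge page
resolves to the same slot in this model, so the mismatch only appears for
subpages beyond the first PMD entry.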
>
> Thanks, Lance
>
>>
>>>
>>> Cheers, Lance
>>>
>>>>>
>>>>> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
>>>>> page_vma_mapped_walk() passes that size to hugetlb_walk():
>>>>>
>>>>> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>>>> {
>>>>> ...
>>>>> if (unlikely(is_vm_hugetlb_page(vma))) {
>>>>> ...
>>>>> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
>>>>> ...
>>>>> }
>>>>> ...
>>>>> }
>>>>>
>>>>> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
>>>>> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
>>>>> calculating pmdp:
>>>>>
>>>>> pte_t *huge_pte_offset(struct mm_struct *mm,
>>>>> unsigned long addr, unsigned long sz)
>>>>> {
>>>>> ...
>>>>> if (sz == CONT_PMD_SIZE)
>>>>> addr &= CONT_PMD_MASK;
>>>>>
>>>>> pmdp = pmd_offset(pudp, addr);
>>>>> pmd = READ_ONCE(*pmdp);
>>>>> ...
>>>>> }
>>>>>
>>>>> So for that case, pvmw->pte is calculated from the aligned addr, not
>>>>> necessarily from the original pvmw->address. But check_pte() passes the
>>>>> original address together with pvmw->pte:
>>>>>
>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>> + pvmw->pte);
>>>>>
>>>>> arm64 then uses that addr again to choose ncontig:
>>>>>
>>>>> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>>>>> {
>>>>> ...
>>>>> ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>>>>> for (i = 0; i < ncontig; i++, ptep++) {
>>>>> ...
>>>>> }
>>>>> return orig_pte;
>>>>> }
>>>>>
>>>>> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
>>>>> pte_t *ptep, size_t *pgsize)
>>>>> {
>>>>> pgd_t *pgdp = pgd_offset(mm, addr);
>>>>> p4d_t *p4dp;
>>>>> pud_t *pudp;
>>>>> pmd_t *pmdp;
>>>>>
>>>>> *pgsize = PAGE_SIZE;
>>>>> p4dp = p4d_offset(pgdp, addr);
>>>>> pudp = pud_offset(p4dp, addr);
>>>>> pmdp = pmd_offset(pudp, addr);
>>>>> if ((pte_t *)pmdp == ptep) {
>>>>> *pgsize = PMD_SIZE;
>>>>> return CONT_PMDS;
>>>>> }
>>>>> return CONT_PTES;
>>>>> }
>>>>>
>>>>> With a tail address, pmdp may no longer point at pvmw->pte, so
>>>>> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
>>>>>
>>>>> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
>>>>> can walk past the CONT_PMD entries, and possibly past the PMD table.
>>>>>
>>>>> Should check_pte() pass the address matching pvmw->pte, sth like:
>>>>>
>>>>> ---8<---
>>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>>> index 406fd50bbd8f..58463493bd3d 100644
>>>>> --- a/mm/page_vma_mapped.c
>>>>> +++ b/mm/page_vma_mapped.c
>>>>> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>>>> unsigned long pfn;
>>>>> pte_t ptent;
>>>>>
>>>>> - if (is_vm_hugetlb_page(pvmw->vma))
>>>>> - ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>>>> - pvmw->pte);
>>>>> - else
>>>>> + if (is_vm_hugetlb_page(pvmw->vma)) {
>>>>> + struct hstate *hstate = hstate_vma(pvmw->vma);
>>>>> + unsigned long haddr = pvmw->address & huge_page_mask(hstate);
>>>>> +
>>>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
>>>>> + } else {
>>>>> ptent = ptep_get(pvmw->pte);
>>>>> + }
>>>>>
>>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>>> --
>>>>>
>>>>> while leaving pvmw->address unchanged for page_mapped_in_vma()?
>>>>>
>>>>> Cheers, Lance
>>>>>
>>>>>> + else
>>>>>> + ptent = ptep_get(pvmw->pte);
>>>>>>
>>>>>> if (pvmw->flags & PVMW_MIGRATION) {
>>>>>> const softleaf_t entry = softleaf_from_pte(ptent);
>>>>>> --
>>>>>> 2.43.0
>>>>>>
>>>>>>
>>>>
>>>>
>>>
>>
>
>
Thread overview: 27+ messages
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
2026-06-26 3:17 ` Muchun Song
2026-06-26 4:03 ` Dev Jain
2026-06-26 4:16 ` Muchun Song
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
2026-06-26 3:24 ` Muchun Song
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
2026-06-26 3:32 ` Muchun Song
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
2026-06-26 2:31 ` Lance Yang
2026-06-26 4:06 ` Dev Jain
2026-06-26 7:48 ` Lance Yang
2026-06-26 9:14 ` Lance Yang
2026-06-26 13:23 ` Dev Jain
2026-06-26 14:10 ` Lance Yang
2026-06-26 15:26 ` Dev Jain
2026-06-26 16:46 ` Lance Yang
2026-06-27 3:54 ` Miaohe Lin
2026-06-27 7:13 ` Dev Jain
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
2026-06-26 3:40 ` Muchun Song
2026-06-26 4:08 ` Dev Jain
2026-06-26 4:21 ` Muchun Song
2026-06-26 4:42 ` Dev Jain
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
2026-06-26 4:09 ` Dev Jain