Linux-mm Archive on lore.kernel.org
* [PATCH 0/5] Fix incorrect access of hugetlb pte entries
@ 2026-06-25 11:29 Dev Jain
  2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual

There are various places which use ptep_get() to get the pte entry
corresponding to a hugetlb folio. Some arches have special handling
to compute the pteval, so they provide huge_ptep_get(). Use this
helper consistently.
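
As an illustration of the failure mode, here is a minimal userspace sketch.
It is a toy model, not kernel code: the toy_* names and the bit layouts are
invented for this example and do not match any real architecture. It assumes
an arch that stores hugetlb entries in a different layout from ordinary PTEs,
so that only the huge accessor returns a value the generic decoders
understand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: these types and layouts are invented, not the kernel's. */
typedef struct { uint64_t val; } toy_pte_t;

#define TOY_PRESENT		0x1ULL	/* generic layout: "present" bit */
#define TOY_PFN_SHIFT		12	/* generic layout: PFN position */
#define TOY_HUGE_INVALID	0x20ULL	/* huge layout: set means NOT present */
#define TOY_HUGE_PFN_SHIFT	20	/* huge layout: PFN position */

/* Generic accessor: hand back the stored word unchanged. */
static toy_pte_t toy_ptep_get(const toy_pte_t *ptep)
{
	assert(ptep);
	return *ptep;
}

/* Huge accessor: translate the stored huge layout into the generic one. */
static toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep)
{
	toy_pte_t raw = toy_ptep_get(ptep);
	uint64_t pfn = raw.val >> TOY_HUGE_PFN_SHIFT;
	uint64_t present = (raw.val & TOY_HUGE_INVALID) ? 0 : TOY_PRESENT;

	return (toy_pte_t){ .val = (pfn << TOY_PFN_SHIFT) | present };
}

/* Generic decoders: they only understand the generic layout. */
static bool toy_pte_present(toy_pte_t pte)
{
	return pte.val & TOY_PRESENT;
}

static uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte.val >> TOY_PFN_SHIFT;
}

/* Build a valid huge entry, as it would sit in the page table. */
static toy_pte_t toy_mk_huge(uint64_t pfn)
{
	return (toy_pte_t){ .val = pfn << TOY_HUGE_PFN_SHIFT };
}
```

In this model, reading an entry built by toy_mk_huge() through
toy_ptep_get() makes toy_pte_present() report false and toy_pte_pfn() return
an unrelated number, while toy_huge_ptep_get() yields the expected values.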

Dev Jain (5):
  mm/rmap: use huge_ptep_get() in try_to_unmap_one()
  mm/rmap: use huge_ptep_get() in try_to_migrate_one()
  mm/migrate: use huge_ptep_get() in remove_migration_pte()
  mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  mm/mprotect: use huge_ptep_get() for hugetlb

 include/linux/hugetlb.h |  3 +++
 mm/migrate.c            |  6 +++++-
 mm/mprotect.c           |  8 +++++++-
 mm/page_vma_mapped.c    |  8 +++++++-
 mm/rmap.c               | 32 ++++++++++++++++++++------------
 5 files changed, 42 insertions(+), 15 deletions(-)

-- 
2.43.0




* [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
  2026-06-26  3:17   ` Muchun Song
  2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable

try_to_unmap_one() handles hugetlb folios when memory failure needs
to replace a poisoned hugetlb mapping with a hwpoison entry. In that
case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
in pvmw.pte, but the code reads it with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc to misbehave.

It is not clear whether this has a trivially visible effect to userspace.

Just use huge_ptep_get() for dereferencing a huge pte pointer.

Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/rmap.c               | 16 ++++++++++------
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2abaf99321e90..fdb7bdf7645c5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
 {
 }
 
+pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
+		    pte_t *ptep);
+
 static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long addr, pte_t *ptep)
 {
diff --git a/mm/rmap.c b/mm/rmap.c
index 1c77d5dc06e9f..aa8a254efaecc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		/*
-		 * Handle PFN swap PTEs, such as device-exclusive ones, that
-		 * actually map pages.
-		 */
-		pteval = ptep_get(pvmw.pte);
+		address = pvmw.address;
+		if (folio_test_hugetlb(folio)) {
+			pteval = huge_ptep_get(mm, address, pvmw.pte);
+		} else {
+			/*
+			 * Handle PFN swap PTEs, such as device-exclusive ones,
+			 * that actually map pages.
+			 */
+			pteval = ptep_get(pvmw.pte);
+		}
 		if (likely(pte_present(pteval))) {
 			pfn = pte_pfn(pteval);
 		} else {
@@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		}
 
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
-		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
-- 
2.43.0




* [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
  2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
  2026-06-26  3:24   ` Muchun Song
  2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable

try_to_migrate_one() is used by folio migration to replace a present
mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
reads the huge pte entry with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc to misbehave.

It is not clear whether this has a trivially visible effect to userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one into
try_to_migrate_one.

Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/rmap.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index aa8a254efaecc..abc3a44baaa3d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2505,11 +2505,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		/*
-		 * Handle PFN swap PTEs, such as device-exclusive ones, that
-		 * actually map pages.
-		 */
-		pteval = ptep_get(pvmw.pte);
+		address = pvmw.address;
+		if (folio_test_hugetlb(folio)) {
+			pteval = huge_ptep_get(mm, address, pvmw.pte);
+		} else {
+			/*
+			 * Handle PFN swap PTEs, such as device-exclusive ones,
+			 * that actually map pages.
+			 */
+			pteval = ptep_get(pvmw.pte);
+		}
 		if (likely(pte_present(pteval))) {
 			pfn = pte_pfn(pteval);
 		} else {
@@ -2520,7 +2525,6 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		}
 
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
-		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
-- 
2.43.0




* [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
  2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
  2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
  2026-06-26  3:32   ` Muchun Song
  2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable

remove_migration_pte() converts migration entries back to present PTEs
after folio migration completes. For hugetlb folios,
page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
pvmw.pte, but the code reads it with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc to misbehave.

It is not clear whether this has a trivially visible effect to userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/migrate.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index d9b23909d716c..c65f0f43df7eb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -371,7 +371,11 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
-		old_pte = ptep_get(pvmw.pte);
+		if (folio_test_hugetlb(folio))
+			old_pte = huge_ptep_get(vma->vm_mm, pvmw.address,
+						pvmw.pte);
+		else
+			old_pte = ptep_get(pvmw.pte);
 		if (rmap_walk_arg->map_unused_to_zeropage &&
 		    try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
 			continue;
-- 
2.43.0




* [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
                   ` (2 preceding siblings ...)
  2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
  2026-06-26  2:31   ` Lance Yang
  2026-06-26  7:48   ` Lance Yang
  2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
  2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
  5 siblings, 2 replies; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable

check_pte() is the final validation step in page_vma_mapped_walk().
It reads pvmw->pte with ptep_get() to decide whether the entry maps
the PFN range being walked. For hugetlb VMAs, that pointer refers
to a hugetlb entry.

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc to misbehave.

It is not clear whether this has a trivially visible effect to userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/page_vma_mapped.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 2ccbabfb2cc17..18e1d341f463c 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
 static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 {
 	unsigned long pfn;
-	pte_t ptent = ptep_get(pvmw->pte);
+	pte_t ptent;
+
+	if (is_vm_hugetlb_page(pvmw->vma))
+		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+				      pvmw->pte);
+	else
+		ptent = ptep_get(pvmw->pte);
 
 	if (pvmw->flags & PVMW_MIGRATION) {
 		const softleaf_t entry = softleaf_from_pte(ptent);
-- 
2.43.0




* [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
                   ` (3 preceding siblings ...)
  2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
  2026-06-26  3:40   ` Muchun Song
  2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
  5 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
  To: muchun.song, osalvador, akpm, ljs, david, liam
  Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual

prot_none_hugetlb_entry() is the hugetlb callback for the early
mprotect(PROT_NONE) PFN permission walk on x86.

The callback passes the decoded PFN to pfn_modify_allowed(). For a
hugetlb callback, the pte pointer refers to a hugetlb entry. On
architectures where hugetlb entries need huge_ptep_get(), reading that
entry with ptep_get() can make the permission check use the wrong PFN.

Use huge_ptep_get() before decoding the hugetlb PFN.

Currently there is no path which can trigger a bug: huge_ptep_get() is a
simple ptep_get() for x86, and the prot_none walk occurs only for x86.
But use the correct helper anyway.

Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/mprotect.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028cf..23779632d18bf 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
 		0 : -EACCES;
 }
 
+#ifdef CONFIG_HUGETLB_PAGE
 static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
 				   unsigned long addr, unsigned long next,
 				   struct mm_walk *walk)
 {
-	return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
+	pte_t entry = huge_ptep_get(walk->mm, addr, pte);
+
+	return pfn_modify_allowed(pte_pfn(entry),
 				  *(pgprot_t *)(walk->private)) ?
 		0 : -EACCES;
 }
+#else
+#define prot_none_hugetlb_entry	NULL
+#endif
 
 static int prot_none_test(unsigned long addr, unsigned long next,
 			  struct mm_walk *walk)
-- 
2.43.0




* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
  2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
                   ` (4 preceding siblings ...)
  2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-25 13:59 ` Zi Yan
  2026-06-26  4:09   ` Dev Jain
  5 siblings, 1 reply; 21+ messages in thread
From: Zi Yan @ 2026-06-25 13:59 UTC (permalink / raw)
  To: Dev Jain, muchun.song, osalvador, akpm, ljs, david, liam
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
	rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
	j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
	anshuman.khandual

On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
> There are various places which use ptep_get() to get the pte entry
> corresponding to a hugetlb folio. Some arches have special handling

I think it is better to mention s390 as a concrete example.

> to compute the pteval, so they provide huge_ptep_get(). Use this
> helper consistently.
>
> Dev Jain (5):
>   mm/rmap: use huge_ptep_get() in try_to_unmap_one()
>   mm/rmap: use huge_ptep_get() in try_to_migrate_one()
>   mm/migrate: use huge_ptep_get() in remove_migration_pte()
>   mm/page_vma_mapped: use huge_ptep_get() for hugetlb
>   mm/mprotect: use huge_ptep_get() for hugetlb
>
>  include/linux/hugetlb.h |  3 +++
>  mm/migrate.c            |  6 +++++-
>  mm/mprotect.c           |  8 +++++++-
>  mm/page_vma_mapped.c    |  8 +++++++-
>  mm/rmap.c               | 32 ++++++++++++++++++++------------
>  5 files changed, 42 insertions(+), 15 deletions(-)




-- 
Best Regards,
Yan, Zi




* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-26  2:31   ` Lance Yang
  2026-06-26  4:06     ` Dev Jain
  2026-06-26  7:48   ` Lance Yang
  1 sibling, 1 reply; 21+ messages in thread
From: Lance Yang @ 2026-06-26  2:31 UTC (permalink / raw)
  To: dev.jain
  Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
	harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
	apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
	gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable


On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)

Just one point about ordering: should this patch come first?

Patches #01-#03 only reach the new huge_ptep_get() after
page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
goes through check_pte(), which is still using ptep_get().
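
The ordering concern can be sketched with a small userspace model. It is
only an illustration: the toy_* names and bit layouts are invented, and
toy_check() merely stands in for the role check_pte() plays, assuming the
walk rejects an entry whose decoded PFN does not match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Invented layouts for illustration; not any real architecture's. */
#define TOY_PRESENT		0x1ULL
#define TOY_PFN_SHIFT		12
#define TOY_HUGE_PFN_SHIFT	20

typedef uint64_t (*toy_reader_t)(const uint64_t *slot);

/* Plain read: returns the stored huge-layout word untranslated. */
static uint64_t toy_read_plain(const uint64_t *slot)
{
	return *slot;
}

/* Huge read: translates the stored word into the generic layout. */
static uint64_t toy_read_huge(const uint64_t *slot)
{
	return ((*slot >> TOY_HUGE_PFN_SHIFT) << TOY_PFN_SHIFT) | TOY_PRESENT;
}

/* Stand-in for the walk's final check: does the entry map want_pfn? */
static bool toy_check(const uint64_t *slot, uint64_t want_pfn,
		      toy_reader_t rd)
{
	uint64_t entry = rd(slot);

	return (entry & TOY_PRESENT) && (entry >> TOY_PFN_SHIFT) == want_pfn;
}

/*
 * Stand-in for a caller that already uses the huge read. Its body runs
 * only if the walk's check passed, so a check that still uses the plain
 * read keeps the corrected caller code from ever being reached.
 */
static bool toy_caller(const uint64_t *slot, uint64_t want_pfn,
		       toy_reader_t walk_rd, uint64_t *pfn_out)
{
	assert(slot && pfn_out);
	if (!toy_check(slot, want_pfn, walk_rd))
		return false;
	*pfn_out = toy_read_huge(slot) >> TOY_PFN_SHIFT;
	return true;
}
```

With toy_read_plain passed as walk_rd the caller returns false without
touching pfn_out; with toy_read_huge it returns true and stores the PFN.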

> {
> 	unsigned long pfn;
>-	pte_t ptent = ptep_get(pvmw->pte);
>+	pte_t ptent;
>+
>+	if (is_vm_hugetlb_page(pvmw->vma))
>+		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+				      pvmw->pte);
>+	else
>+		ptent = ptep_get(pvmw->pte);
> 
> 	if (pvmw->flags & PVMW_MIGRATION) {
> 		const softleaf_t entry = softleaf_from_pte(ptent);
>-- 
>2.43.0
>
>



* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
  2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-26  3:17   ` Muchun Song
  2026-06-26  4:03     ` Dev Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2026-06-26  3:17 UTC (permalink / raw)
  To: Dev Jain
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
	akpm, ljs, david, liam



On 2026/6/25 19:29, Dev Jain wrote:
> try_to_unmap_one() handles hugetlb folios when memory failure needs
> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
> in pvmw.pte, but the code reads it with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
>
> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>   include/linux/hugetlb.h |  3 +++
>   mm/rmap.c               | 16 ++++++++++------
>   2 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 2abaf99321e90..fdb7bdf7645c5 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>   {
>   }
>   
> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
> +		    pte_t *ptep);

Thanks so much for the fix! I'm curious, though: why do we
need to add a separate declaration for this function here?

Thanks,
Muchun

> +
>   static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>   					  unsigned long addr, pte_t *ptep)
>   {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1c77d5dc06e9f..aa8a254efaecc 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>   		/* Unexpected PMD-mapped THP? */
>   		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>   
> -		/*
> -		 * Handle PFN swap PTEs, such as device-exclusive ones, that
> -		 * actually map pages.
> -		 */
> -		pteval = ptep_get(pvmw.pte);
> +		address = pvmw.address;
> +		if (folio_test_hugetlb(folio)) {
> +			pteval = huge_ptep_get(mm, address, pvmw.pte);
> +		} else {
> +			/*
> +			 * Handle PFN swap PTEs, such as device-exclusive ones,
> +			 * that actually map pages.
> +			 */
> +			pteval = ptep_get(pvmw.pte);
> +		}
>   		if (likely(pte_present(pteval))) {
>   			pfn = pte_pfn(pteval);
>   		} else {
> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>   		}
>   
>   		subpage = folio_page(folio, pfn - folio_pfn(folio));
> -		address = pvmw.address;
>   		anon_exclusive = folio_test_anon(folio) &&
>   				 PageAnonExclusive(subpage);
>   




* Re: [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
  2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-26  3:24   ` Muchun Song
  0 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2026-06-26  3:24 UTC (permalink / raw)
  To: Dev Jain
  Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
	lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
	matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable



> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
> 
> try_to_migrate_one() is used by folio migration to replace a present
> mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
> returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
> reads the huge pte entry with ptep_get().
> 
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
> 
> It is not clear whether this has a trivially visible effect to userspace.
> 
> Use huge_ptep_get() to dereference a huge pte pointer.
> 
> Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one into
> try_to_migrate_one.
> 
> Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Acked-by: Muchun Song <muchun.song@linux.dev>

Thanks.




* Re: [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
  2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-26  3:32   ` Muchun Song
  0 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2026-06-26  3:32 UTC (permalink / raw)
  To: Dev Jain
  Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
	lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
	matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable



> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
> 
> remove_migration_pte() converts migration entries back to present PTEs
> after folio migration completes. For hugetlb folios,
> page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
> pvmw.pte, but the code reads it with ptep_get().
> 
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(),
> pte_present() etc to misbehave.
> 
> It is not clear whether this has a trivially visible effect to userspace.

We are dealing with migration entries here, so the issue mentioned shouldn't
be a problem on any architecture. Semantically speaking, though, we should
definitely fix this.

> 
> Use huge_ptep_get() to dereference a huge pte pointer.
> 
> Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>

Acked-by: Muchun Song <muchun.song@linux.dev>

Thanks




* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
  2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-26  3:40   ` Muchun Song
  2026-06-26  4:08     ` Dev Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2026-06-26  3:40 UTC (permalink / raw)
  To: Dev Jain
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
	david, liam



On 2026/6/25 19:29, Dev Jain wrote:
> prot_none_hugetlb_entry() is the hugetlb callback for the early
> mprotect(PROT_NONE) PFN permission walk on x86.
>
> The callback passes the decoded PFN to pfn_modify_allowed(). For a
> hugetlb callback, the pte pointer refers to a hugetlb entry. On
> architectures where hugetlb entries need huge_ptep_get(), reading that
> entry with ptep_get() can make the permission check use the wrong PFN.
>
> Use huge_ptep_get() before decoding the hugetlb PFN.
>
> Currently there is no path which can trigger a bug: huge_ptep_get() is a
> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
> But use the correct helper anyway.
>
> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
>   mm/mprotect.c | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028cf..23779632d18bf 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>   		0 : -EACCES;
>   }
>   
> +#ifdef CONFIG_HUGETLB_PAGE
>   static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>   				   unsigned long addr, unsigned long next,
>   				   struct mm_walk *walk)
>   {
> -	return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
> +	pte_t entry = huge_ptep_get(walk->mm, addr, pte);
> +
> +	return pfn_modify_allowed(pte_pfn(entry),
>   				  *(pgprot_t *)(walk->private)) ?
>   		0 : -EACCES;
>   }
> +#else
> +#define prot_none_hugetlb_entry	NULL

This looks strange: we are defining a NULL stub for a helper function.
How about the following diff instead?

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028c..4d8c1551fbce 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,

  static const struct mm_walk_ops prot_none_walk_ops = {
         .pte_entry              = prot_none_pte_entry,
+#ifdef CONFIG_HUGETLB_PAGE
         .hugetlb_entry          = prot_none_hugetlb_entry,
+#endif
         .test_walk              = prot_none_test,
         .walk_lock              = PGWALK_WRLOCK,
  };

Thanks,
Muchun
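
For comparison, the shape of the suggestion above can be tried in
userspace. This is a sketch under assumptions: struct toy_walk_ops is a
hypothetical stand-in for struct mm_walk_ops, TOY_CONFIG_HUGETLB_PAGE is a
made-up switch, and the toy walker is assumed to skip a callback that is
NULL. It relies only on the C rule that members omitted from a designated
initializer are zero-initialized.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the walk ops; only two callbacks modelled. */
struct toy_walk_ops {
	int (*pte_entry)(unsigned long addr);
	int (*hugetlb_entry)(unsigned long addr);
};

static int toy_pte_entry(unsigned long addr)
{
	(void)addr;
	return 0;
}

#ifdef TOY_CONFIG_HUGETLB_PAGE
/* The callback only exists when the (made-up) option is enabled. */
static int toy_hugetlb_entry(unsigned long addr)
{
	(void)addr;
	return 0;
}
#endif

static const struct toy_walk_ops toy_ops = {
	.pte_entry	= toy_pte_entry,
#ifdef TOY_CONFIG_HUGETLB_PAGE
	.hugetlb_entry	= toy_hugetlb_entry,
#endif
	/* Without the option, .hugetlb_entry is implicitly NULL. */
};

/* Toy walker: a NULL callback means there is nothing to do. */
static int toy_walk_hugetlb(const struct toy_walk_ops *ops,
			    unsigned long addr)
{
	assert(ops);
	return ops->hugetlb_entry ? ops->hugetlb_entry(addr) : 0;
}
```

Built without TOY_CONFIG_HUGETLB_PAGE, the member ends up NULL with no
"#define ... NULL" stub needed; the cost is an #ifdef inside the
initializer rather than around the function.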

> +#endif
>   
>   static int prot_none_test(unsigned long addr, unsigned long next,
>   			  struct mm_walk *walk)




* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
  2026-06-26  3:17   ` Muchun Song
@ 2026-06-26  4:03     ` Dev Jain
  2026-06-26  4:16       ` Muchun Song
  0 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-26  4:03 UTC (permalink / raw)
  To: Muchun Song
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
	akpm, ljs, david, liam



On 26/06/26 8:47 am, Muchun Song wrote:
> 
> 
> On 2026/6/25 19:29, Dev Jain wrote:
>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>> in pvmw.pte, but the code reads it with ptep_get().
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>> etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>
>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>>   include/linux/hugetlb.h |  3 +++
>>   mm/rmap.c               | 16 ++++++++++------
>>   2 files changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index 2abaf99321e90..fdb7bdf7645c5 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>>   {
>>   }
>>   
>> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>> +            pte_t *ptep);
> 
> Thanks so much for the fix! I'm curious, though: why do we
> need to add a separate declaration for this function here?

With !CONFIG_HUGETLB_PAGE, the compiler complains that there is no
huge_ptep_get(), so this declaration is there to keep the compiler happy.

> 
> Thanks,
> Muchun
> 
>> +
>>   static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>>                         unsigned long addr, pte_t *ptep)
>>   {
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>           /* Unexpected PMD-mapped THP? */
>>           VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>>   -        /*
>> -         * Handle PFN swap PTEs, such as device-exclusive ones, that
>> -         * actually map pages.
>> -         */
>> -        pteval = ptep_get(pvmw.pte);
>> +        address = pvmw.address;
>> +        if (folio_test_hugetlb(folio)) {
>> +            pteval = huge_ptep_get(mm, address, pvmw.pte);
>> +        } else {
>> +            /*
>> +             * Handle PFN swap PTEs, such as device-exclusive ones,
>> +             * that actually map pages.
>> +             */
>> +            pteval = ptep_get(pvmw.pte);
>> +        }
>>           if (likely(pte_present(pteval))) {
>>               pfn = pte_pfn(pteval);
>>           } else {
>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>           }
>>             subpage = folio_page(folio, pfn - folio_pfn(folio));
>> -        address = pvmw.address;
>>           anon_exclusive = folio_test_anon(folio) &&
>>                    PageAnonExclusive(subpage);
>>   
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  2026-06-26  2:31   ` Lance Yang
@ 2026-06-26  4:06     ` Dev Jain
  0 siblings, 0 replies; 21+ messages in thread
From: Dev Jain @ 2026-06-26  4:06 UTC (permalink / raw)
  To: Lance Yang
  Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
	harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
	ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable



On 26/06/26 8:01 am, Lance Yang wrote:
> 
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> 
> Just one ordering thing: should this patch come first?
> 
> Patches #01-#03 only reach the new huge_ptep_get() after
> page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
> goes through check_pte() (still using ptep_get()).

You are right, but do we care? This is not a series meant for adding functionality.
I just sent it as a series because they are similar fixes - the patches are to
be applied individually with no dependency.
> 
>> {
>> 	unsigned long pfn;
>> -	pte_t ptent = ptep_get(pvmw->pte);
>> +	pte_t ptent;
>> +
>> +	if (is_vm_hugetlb_page(pvmw->vma))
>> +		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> +				      pvmw->pte);
>> +	else
>> +		ptent = ptep_get(pvmw->pte);
>>
>> 	if (pvmw->flags & PVMW_MIGRATION) {
>> 		const softleaf_t entry = softleaf_from_pte(ptent);
>> -- 
>> 2.43.0
>>
>>
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
  2026-06-26  3:40   ` Muchun Song
@ 2026-06-26  4:08     ` Dev Jain
  2026-06-26  4:21       ` Muchun Song
  0 siblings, 1 reply; 21+ messages in thread
From: Dev Jain @ 2026-06-26  4:08 UTC (permalink / raw)
  To: Muchun Song
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
	david, liam



On 26/06/26 9:10 am, Muchun Song wrote:
> 
> 
> On 2026/6/25 19:29, Dev Jain wrote:
>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>> mprotect(PROT_NONE) PFN permission walk on x86.
>>
>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>> architectures where hugetlb entries need huge_ptep_get(), reading that
>> entry with ptep_get() can make the permission check use the wrong PFN.
>>
>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>
>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>> But use the correct helper anyways.
>>
>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>>   mm/mprotect.c | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028cf..23779632d18bf 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>           0 : -EACCES;
>>   }
>>   +#ifdef CONFIG_HUGETLB_PAGE
>>   static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>                      unsigned long addr, unsigned long next,
>>                      struct mm_walk *walk)
>>   {
>> -    return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>> +    pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>> +
>> +    return pfn_modify_allowed(pte_pfn(entry),
>>                     *(pgprot_t *)(walk->private)) ?
>>           0 : -EACCES;
>>   }
>> +#else
>> +#define prot_none_hugetlb_entry    NULL
> 
> This is very strange, because we defined a stub as NULL for a helper

I was following pattern elsewhere, search for ".hugetlb_entry" in the
codebase and you will find others doing the same.


> function. How about  the following diff?
> 
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028c..4d8c1551fbce 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
> 
>  static const struct mm_walk_ops prot_none_walk_ops = {
>         .pte_entry              = prot_none_pte_entry,
> +#ifdef CONFIG_HUGETLB_PAGE
>         .hugetlb_entry          = prot_none_hugetlb_entry,
> +#endif
>         .test_walk              = prot_none_test,
>         .walk_lock              = PGWALK_WRLOCK,
>  };
> 
> Thanks,
> Muchun
> 
>> +#endif
>>     static int prot_none_test(unsigned long addr, unsigned long next,
>>                 struct mm_walk *walk)
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
  2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
@ 2026-06-26  4:09   ` Dev Jain
  0 siblings, 0 replies; 21+ messages in thread
From: Dev Jain @ 2026-06-26  4:09 UTC (permalink / raw)
  To: Zi Yan, muchun.song, osalvador, akpm, ljs, david, liam
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
	rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
	j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
	anshuman.khandual



On 25/06/26 7:29 pm, Zi Yan wrote:
> On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
>> There are various places which use ptep_get() to get the pte entry
>> corresponding to a hugetlb folio. Some arches have special handling
> 
> I think it is better to mention s390 as a concrete example.

Sure. In case there is no v2, I would request Andrew to change
"Some arches have special handling" to "Some arches like s390 have
special handling".

> 
>> to compute the pteval, so they provide huge_ptep_get(). Use this
>> helper consistently.
>>
>> Dev Jain (5):
>>   mm/rmap: use huge_ptep_get() in try_to_unmap_one()
>>   mm/rmap: use huge_ptep_get() in try_to_migrate_one()
>>   mm/migrate: use huge_ptep_get() in remove_migration_pte()
>>   mm/page_vma_mapped: use huge_ptep_get() for hugetlb
>>   mm/mprotect: use huge_ptep_get() for hugetlb
>>
>>  include/linux/hugetlb.h |  3 +++
>>  mm/migrate.c            |  6 +++++-
>>  mm/mprotect.c           |  8 +++++++-
>>  mm/page_vma_mapped.c    |  8 +++++++-
>>  mm/rmap.c               | 32 ++++++++++++++++++++------------
>>  5 files changed, 42 insertions(+), 15 deletions(-)
> 
> 
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
  2026-06-26  4:03     ` Dev Jain
@ 2026-06-26  4:16       ` Muchun Song
  0 siblings, 0 replies; 21+ messages in thread
From: Muchun Song @ 2026-06-26  4:16 UTC (permalink / raw)
  To: Dev Jain
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
	akpm, ljs, david, liam



> On Jun 26, 2026, at 12:03, Dev Jain <dev.jain@arm.com> wrote:
> 
> 
> 
> On 26/06/26 8:47 am, Muchun Song wrote:
>> 
>> 
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>>> in pvmw.pte, but the code reads it with ptep_get().
>>> 
>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>>> etc to misbehave.
>>> 
>>> It is not clear whether this has a trivially visible effect to userspace.
>>> 
>>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>> 
>>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>>   include/linux/hugetlb.h |  3 +++
>>>   mm/rmap.c               | 16 ++++++++++------
>>>   2 files changed, 13 insertions(+), 6 deletions(-)
>>> 
>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>> index 2abaf99321e90..fdb7bdf7645c5 100644
>>> --- a/include/linux/hugetlb.h
>>> +++ b/include/linux/hugetlb.h
>>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>>>   {
>>>   }
>>>   +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>>> +            pte_t *ptep);
>> 
>> Thanks so much for the fix! I'm curious, though: why do we
>> need to add a separate declaration for this function here?
> 
> For !CONFIG_HUGETLB_PAGE, compiler complains that there is no huge_ptep_get.
> So this is to make compiler happy.

Got it. We can refer to 5d4af6195c87c6b162b7963e0ad00a214b80d764 to fix
this warning.

Muchun,
Thanks.

> 
>> 
>> Thanks,
>> Muchun
>> 
>>> +
>>>   static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>>>                         unsigned long addr, pte_t *ptep)
>>>   {
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>>           /* Unexpected PMD-mapped THP? */
>>>           VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>>>   -        /*
>>> -         * Handle PFN swap PTEs, such as device-exclusive ones, that
>>> -         * actually map pages.
>>> -         */
>>> -        pteval = ptep_get(pvmw.pte);
>>> +        address = pvmw.address;
>>> +        if (folio_test_hugetlb(folio)) {
>>> +            pteval = huge_ptep_get(mm, address, pvmw.pte);
>>> +        } else {
>>> +            /*
>>> +             * Handle PFN swap PTEs, such as device-exclusive ones,
>>> +             * that actually map pages.
>>> +             */
>>> +            pteval = ptep_get(pvmw.pte);
>>> +        }
>>>           if (likely(pte_present(pteval))) {
>>>               pfn = pte_pfn(pteval);
>>>           } else {
>>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>>           }
>>>             subpage = folio_page(folio, pfn - folio_pfn(folio));
>>> -        address = pvmw.address;
>>>           anon_exclusive = folio_test_anon(folio) &&
>>>                    PageAnonExclusive(subpage);




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
  2026-06-26  4:08     ` Dev Jain
@ 2026-06-26  4:21       ` Muchun Song
  2026-06-26  4:42         ` Dev Jain
  0 siblings, 1 reply; 21+ messages in thread
From: Muchun Song @ 2026-06-26  4:21 UTC (permalink / raw)
  To: Dev Jain
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
	david, liam



> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
> 
> 
> 
> On 26/06/26 9:10 am, Muchun Song wrote:
>> 
>> 
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>> 
>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>> 
>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>> 
>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>> But use the correct helper anyways.
>>> 
>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>>   mm/mprotect.c | 8 +++++++-
>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>> 
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028cf..23779632d18bf 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>>           0 : -EACCES;
>>>   }
>>>   +#ifdef CONFIG_HUGETLB_PAGE
>>>   static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>>                      unsigned long addr, unsigned long next,
>>>                      struct mm_walk *walk)
>>>   {
>>> -    return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>> +    pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>> +
>>> +    return pfn_modify_allowed(pte_pfn(entry),
>>>                     *(pgprot_t *)(walk->private)) ?
>>>           0 : -EACCES;
>>>   }
>>> +#else
>>> +#define prot_none_hugetlb_entry    NULL
>> 
>> This is very strange, because we defined a stub as NULL for a helper
> 
> I was following pattern elsewhere, search for ".hugetlb_entry" in the
> codebase and you will find others doing the same.

Okay, I understand why you want to do it that way, but I would still
recommend not following that format.
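
A minimal userspace sketch of the variant suggested above, where the
callback member is guarded in the ops initializer instead of defining the
function name as NULL. The demo_* names and DEMO_CONFIG_HUGETLB_PAGE are
hypothetical, and the walker below only models the assumption that a NULL
callback means "nothing to do for this mapping type".

```c
/* Userspace illustration only; all demo_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct mm_walk_ops; only two callbacks are modelled. */
struct demo_walk_ops {
	int (*pte_entry)(unsigned long addr);
	int (*hugetlb_entry)(unsigned long addr);
};

static int demo_pte_entry(unsigned long addr)
{
	(void)addr;
	return 1;	/* distinguishable from the "skipped" result below */
}

#ifdef DEMO_CONFIG_HUGETLB_PAGE
static int demo_hugetlb_entry(unsigned long addr)
{
	(void)addr;
	return 2;
}
#endif

/* The member is simply left out (and so zero-initialized) when config is off. */
static const struct demo_walk_ops demo_ops = {
	.pte_entry	= demo_pte_entry,
#ifdef DEMO_CONFIG_HUGETLB_PAGE
	.hugetlb_entry	= demo_hugetlb_entry,
#endif
};

/* Modelled walker: a missing callback is skipped and reported as 0. */
static int demo_walk(const struct demo_walk_ops *ops, unsigned long addr,
		     int is_hugetlb)
{
	assert(ops != NULL);
	if (is_hugetlb)
		return ops->hugetlb_entry ? ops->hugetlb_entry(addr) : 0;
	return ops->pte_entry ? ops->pte_entry(addr) : 0;
}
```

Both variants leave the member NULL when the config option is off; the
difference is only where the #ifdef lives.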

Thanks.

> 
> 
>> function. How about  the following diff?
>> 
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028c..4d8c1551fbce 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>> 
>>  static const struct mm_walk_ops prot_none_walk_ops = {
>>         .pte_entry              = prot_none_pte_entry,
>> +#ifdef CONFIG_HUGETLB_PAGE
>>         .hugetlb_entry          = prot_none_hugetlb_entry,
>> +#endif
>>         .test_walk              = prot_none_test,
>>         .walk_lock              = PGWALK_WRLOCK,
>>  };
>> 
>> Thanks,
>> Muchun
>> 
>>> +#endif
>>>     static int prot_none_test(unsigned long addr, unsigned long next,
>>>                 struct mm_walk *walk)




^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
  2026-06-26  4:21       ` Muchun Song
@ 2026-06-26  4:42         ` Dev Jain
  0 siblings, 0 replies; 21+ messages in thread
From: Dev Jain @ 2026-06-26  4:42 UTC (permalink / raw)
  To: Muchun Song
  Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
	linux-kernel, rcampbell, apopple, ziy, matthew.brost,
	joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
	nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
	jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
	david, liam



On 26/06/26 9:51 am, Muchun Song wrote:
> 
> 
>> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
>>
>>
>>
>> On 26/06/26 9:10 am, Muchun Song wrote:
>>>
>>>
>>> On 2026/6/25 19:29, Dev Jain wrote:
>>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>>>
>>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>>>
>>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>>>
>>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>>> But use the correct helper anyways.
>>>>
>>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>> ---
>>>>   mm/mprotect.c | 8 +++++++-
>>>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>>> index 9cbf932b028cf..23779632d18bf 100644
>>>> --- a/mm/mprotect.c
>>>> +++ b/mm/mprotect.c
>>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>>>           0 : -EACCES;
>>>>   }
>>>>   +#ifdef CONFIG_HUGETLB_PAGE
>>>>   static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>>>                      unsigned long addr, unsigned long next,
>>>>                      struct mm_walk *walk)
>>>>   {
>>>> -    return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>>> +    pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>>> +
>>>> +    return pfn_modify_allowed(pte_pfn(entry),
>>>>                     *(pgprot_t *)(walk->private)) ?
>>>>           0 : -EACCES;
>>>>   }
>>>> +#else
>>>> +#define prot_none_hugetlb_entry    NULL
>>>
>>> This is very strange, because we defined a stub as NULL for a helper
>>
>> I was following pattern elsewhere, search for ".hugetlb_entry" in the
>> codebase and you will find others doing the same.
> 
> Okay, I understand why you want to do it that way, but I would still
> recommend not following that format.

Okay, then I'll update v2 with the diff below.

> 
> Thanks.
> 
>>
>>
>>> function. How about  the following diff?
>>>
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028c..4d8c1551fbce 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>>>
>>>  static const struct mm_walk_ops prot_none_walk_ops = {
>>>         .pte_entry              = prot_none_pte_entry,
>>> +#ifdef CONFIG_HUGETLB_PAGE
>>>         .hugetlb_entry          = prot_none_hugetlb_entry,
>>> +#endif
>>>         .test_walk              = prot_none_test,
>>>         .walk_lock              = PGWALK_WRLOCK,
>>>  };
>>>
>>> Thanks,
>>> Muchun
>>>
>>>> +#endif
>>>>     static int prot_none_test(unsigned long addr, unsigned long next,
>>>>                 struct mm_walk *walk)
> 
> 
> 



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
  2026-06-26  2:31   ` Lance Yang
@ 2026-06-26  7:48   ` Lance Yang
  2026-06-26  9:14     ` Lance Yang
  1 sibling, 1 reply; 21+ messages in thread
From: Lance Yang @ 2026-06-26  7:48 UTC (permalink / raw)
  To: dev.jain, linmiaohe
  Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
	harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
	apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
	gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable


On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> {
> 	unsigned long pfn;
>-	pte_t ptent = ptep_get(pvmw->pte);
>+	pte_t ptent;
>+
>+	if (is_vm_hugetlb_page(pvmw->vma))
>+		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+				      pvmw->pte);

I think check_pte() can pass a wrong address to huge_ptep_get() ...

Not sure that is wrong in the first place. For memory failure,
page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
folio. In that case, pvmw->address need not be hugepage-aligned.

@Miaohe

For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
page_vma_mapped_walk() passes that size to hugetlb_walk():

bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
{
	...
	if (unlikely(is_vm_hugetlb_page(vma))) {
		...
		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
		...
	}
	...
}

hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
calculating pmdp:

pte_t *huge_pte_offset(struct mm_struct *mm,
		       unsigned long addr, unsigned long sz)
{
	...
	if (sz == CONT_PMD_SIZE)
		addr &= CONT_PMD_MASK;

	pmdp = pmd_offset(pudp, addr);
	pmd = READ_ONCE(*pmdp);
	...
}

So for that case, pvmw->pte is calculated from the aligned addr, not
necessarily from the original pvmw->address. But check_pte() passes the
original address together with pvmw->pte:

+		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+				      pvmw->pte);

arm64 then uses that addr again to choose ncontig:

pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
	...
	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
	for (i = 0; i < ncontig; i++, ptep++) {
		...
	}
	return orig_pte;
}

static int find_num_contig(struct mm_struct *mm, unsigned long addr,
			   pte_t *ptep, size_t *pgsize)
{
	pgd_t *pgdp = pgd_offset(mm, addr);
	p4d_t *p4dp;
	pud_t *pudp;
	pmd_t *pmdp;

	*pgsize = PAGE_SIZE;
	p4dp = p4d_offset(pgdp, addr);
	pudp = pud_offset(p4dp, addr);
	pmdp = pmd_offset(pudp, addr);
	if ((pte_t *)pmdp == ptep) {
		*pgsize = PMD_SIZE;
		return CONT_PMDS;
	}
	return CONT_PTES;
}

With a tail address, pmdp may no longer point at pvmw->pte, so
find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.

On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
can walk past the CONT_PMD entries, and possibly past the PMD table.
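
The mismatch can be modelled in userspace C. This is a simplified sketch,
not the arm64 code: it collapses the walk to a single PMD table, and the
geometry (PMD shift of 25, 2048 entries per table, 32 contiguous PMDs, 128
contiguous PTEs) is assumed for 16K pages, consistent with the 32 and 128
figures above. All demo_* names are hypothetical.

```c
/* Userspace model only; all demo_* names and constants are assumptions. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PMD_SHIFT		25	/* assumed: 32M per PMD entry, 16K pages */
#define DEMO_PTRS_PER_PMD	2048
#define DEMO_CONT_PMDS		32
#define DEMO_CONT_PTES		128
#define DEMO_CONT_PMD_SIZE	((uint64_t)DEMO_CONT_PMDS << DEMO_PMD_SHIFT)
#define DEMO_CONT_PMD_MASK	(~(DEMO_CONT_PMD_SIZE - 1))

typedef uint64_t demo_entry_t;

/* Index of the PMD slot that covers addr within one PMD table. */
static size_t demo_pmd_index(uint64_t addr)
{
	return (addr >> DEMO_PMD_SHIFT) & (DEMO_PTRS_PER_PMD - 1);
}

/* Models the CONT_PMD case of huge_pte_offset(): align first, then index. */
static demo_entry_t *demo_huge_pte_offset(demo_entry_t *pmd_table,
					  uint64_t addr)
{
	addr &= DEMO_CONT_PMD_MASK;
	return &pmd_table[demo_pmd_index(addr)];
}

/* Models find_num_contig(): the pointer comparison decides the count. */
static int demo_find_num_contig(demo_entry_t *pmd_table, uint64_t addr,
				demo_entry_t *ptep)
{
	assert(pmd_table != NULL && ptep != NULL);
	if (&pmd_table[demo_pmd_index(addr)] == ptep)
		return DEMO_CONT_PMDS;
	return DEMO_CONT_PTES;
}
```

In this model, an address in the first PMD entry of the contiguous range
still matches the pointer; any later tail address selects a different slot
and falls through to the CONT_PTES count, while masking the address with
DEMO_CONT_PMD_MASK restores the match.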

Should check_pte() pass the address matching pvmw->pte, something like:

---8<---
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 406fd50bbd8f..58463493bd3d 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 	unsigned long pfn;
 	pte_t ptent;

-	if (is_vm_hugetlb_page(pvmw->vma))
-		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
-				      pvmw->pte);
-	else
+	if (is_vm_hugetlb_page(pvmw->vma)) {
+		struct hstate *hstate = hstate_vma(pvmw->vma);
+		unsigned long haddr = pvmw->address & huge_page_mask(hstate);
+
+		ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
+	} else {
 		ptent = ptep_get(pvmw->pte);
+	}

 	if (pvmw->flags & PVMW_MIGRATION) {
 		const softleaf_t entry = softleaf_from_pte(ptent);
--

while leaving pvmw->address unchanged for page_mapped_in_vma()?

Cheers, Lance

>+	else
>+		ptent = ptep_get(pvmw->pte);
> 
> 	if (pvmw->flags & PVMW_MIGRATION) {
> 		const softleaf_t entry = softleaf_from_pte(ptent);
>-- 
>2.43.0
>
>


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
  2026-06-26  7:48   ` Lance Yang
@ 2026-06-26  9:14     ` Lance Yang
  0 siblings, 0 replies; 21+ messages in thread
From: Lance Yang @ 2026-06-26  9:14 UTC (permalink / raw)
  To: dev.jain, linmiaohe
  Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
	harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
	ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
	ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
	dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
	stable



On 2026/6/26 15:48, Lance Yang wrote:
> 
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> {
>> 	unsigned long pfn;
>> -	pte_t ptent = ptep_get(pvmw->pte);
>> +	pte_t ptent;
>> +
>> +	if (is_vm_hugetlb_page(pvmw->vma))
>> +		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> +				      pvmw->pte);
> 
> I think check_pte() can pass a wrong address to huge_ptep_get() ...
> 
> Not sure that is wrong in the first place. For memory failure,
> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
> folio. In that case, pvmw->address need not be hugepage-aligned.
> 
> @Miaohe
> 
> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
> page_vma_mapped_walk() passes that size to hugetlb_walk():
> 
> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> {
> 	...
> 	if (unlikely(is_vm_hugetlb_page(vma))) {
> 		...
> 		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
> 		...
> 	}
> 	...
> }
> 
> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
> calculating pmdp:
> 
> pte_t *huge_pte_offset(struct mm_struct *mm,
> 		       unsigned long addr, unsigned long sz)
> {
> 	...
> 	if (sz == CONT_PMD_SIZE)
> 		addr &= CONT_PMD_MASK;
> 
> 	pmdp = pmd_offset(pudp, addr);
> 	pmd = READ_ONCE(*pmdp);
> 	...
> }
> 
> So for that case, pvmw->pte is calculated from the aligned addr, not
> necessarily from the original pvmw->address. But check_pte() passes the
> original address together with pvmw->pte:
> 
> +		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> +				      pvmw->pte);

In addition:

I went through all the arch code that has its own huge_ptep_get(); only
arm64 and powerpc actually use addr, and there addr has to match the
ptep, IIUC.

So I am wondering whether all huge_ptep_get() callers satisfy that
requirement.

Cheers, Lance

> 
> arm64 then uses that addr again to choose ncontig:
> 
> pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
> {
> 	...
> 	ncontig = find_num_contig(mm, addr, ptep, &pgsize);
> 	for (i = 0; i < ncontig; i++, ptep++) {
> 		...
> 	}
> 	return orig_pte;
> }
> 
> static int find_num_contig(struct mm_struct *mm, unsigned long addr,
> 			   pte_t *ptep, size_t *pgsize)
> {
> 	pgd_t *pgdp = pgd_offset(mm, addr);
> 	p4d_t *p4dp;
> 	pud_t *pudp;
> 	pmd_t *pmdp;
> 
> 	*pgsize = PAGE_SIZE;
> 	p4dp = p4d_offset(pgdp, addr);
> 	pudp = pud_offset(p4dp, addr);
> 	pmdp = pmd_offset(pudp, addr);
> 	if ((pte_t *)pmdp == ptep) {
> 		*pgsize = PMD_SIZE;
> 		return CONT_PMDS;
> 	}
> 	return CONT_PTES;
> }
> 
> With a tail address, pmdp may no longer point at pvmw->pte, so
> find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
> 
> On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
> can walk past the CONT_PMD entries, and possibly past the PMD table.
> 
> Should check_pte() pass the address matching pvmw->pte, something like:
> 
> ---8<---
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index 406fd50bbd8f..58463493bd3d 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>   	unsigned long pfn;
>   	pte_t ptent;
> 
> -	if (is_vm_hugetlb_page(pvmw->vma))
> -		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> -				      pvmw->pte);
> -	else
> +	if (is_vm_hugetlb_page(pvmw->vma)) {
> +		struct hstate *hstate = hstate_vma(pvmw->vma);
> +		unsigned long haddr = pvmw->address & huge_page_mask(hstate);
> +
> +		ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
> +	} else {
>   		ptent = ptep_get(pvmw->pte);
> +	}
> 
>   	if (pvmw->flags & PVMW_MIGRATION) {
>   		const softleaf_t entry = softleaf_from_pte(ptent);
> --
> 
> while leaving pvmw->address unchanged for page_mapped_in_vma()?
> 
> Cheers, Lance
> 
>> +	else
>> +		ptent = ptep_get(pvmw->pte);
>>
>> 	if (pvmw->flags & PVMW_MIGRATION) {
>> 		const softleaf_t entry = softleaf_from_pte(ptent);
>> -- 
>> 2.43.0
>>
>>



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2026-06-26  9:14 UTC | newest]

Thread overview: 21+ messages
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
2026-06-26  3:17   ` Muchun Song
2026-06-26  4:03     ` Dev Jain
2026-06-26  4:16       ` Muchun Song
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
2026-06-26  3:24   ` Muchun Song
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
2026-06-26  3:32   ` Muchun Song
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
2026-06-26  2:31   ` Lance Yang
2026-06-26  4:06     ` Dev Jain
2026-06-26  7:48   ` Lance Yang
2026-06-26  9:14     ` Lance Yang
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
2026-06-26  3:40   ` Muchun Song
2026-06-26  4:08     ` Dev Jain
2026-06-26  4:21       ` Muchun Song
2026-06-26  4:42         ` Dev Jain
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
2026-06-26  4:09   ` Dev Jain
