* [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:17 ` Muchun Song
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
` (4 subsequent siblings)
5 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable

try_to_unmap_one() handles hugetlb folios when memory failure needs
to replace a poisoned hugetlb mapping with a hwpoison entry. In that
case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
in pvmw.pte, but the code reads it with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc. to misbehave.

It is not clear whether this has a trivially visible effect on userspace.

Just use huge_ptep_get() for dereferencing a huge pte pointer.

Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 include/linux/hugetlb.h |  3 +++
 mm/rmap.c               | 16 ++++++++++------
 2 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2abaf99321e90..fdb7bdf7645c5 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
 {
 }
 
+pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
+		    pte_t *ptep);
+
 static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
 					  unsigned long addr, pte_t *ptep)
 {
diff --git a/mm/rmap.c b/mm/rmap.c
index 1c77d5dc06e9f..aa8a254efaecc 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		/*
-		 * Handle PFN swap PTEs, such as device-exclusive ones, that
-		 * actually map pages.
-		 */
-		pteval = ptep_get(pvmw.pte);
+		address = pvmw.address;
+		if (folio_test_hugetlb(folio)) {
+			pteval = huge_ptep_get(mm, address, pvmw.pte);
+		} else {
+			/*
+			 * Handle PFN swap PTEs, such as device-exclusive ones,
+			 * that actually map pages.
+			 */
+			pteval = ptep_get(pvmw.pte);
+		}
 		if (likely(pte_present(pteval))) {
 			pfn = pte_pfn(pteval);
 		} else {
@@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		}
 
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
-		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
--
2.43.0
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-26 3:17 ` Muchun Song
2026-06-26 4:03 ` Dev Jain
0 siblings, 1 reply; 23+ messages in thread
From: Muchun Song @ 2026-06-26 3:17 UTC
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
On 2026/6/25 19:29, Dev Jain wrote:
> try_to_unmap_one() handles hugetlb folios when memory failure needs
> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
> in pvmw.pte, but the code reads it with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
>
> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>
> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> include/linux/hugetlb.h | 3 +++
> mm/rmap.c | 16 ++++++++++------
> 2 files changed, 13 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 2abaf99321e90..fdb7bdf7645c5 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
> {
> }
>
> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
> + pte_t *ptep);
Thanks so much for the fix! I'm curious, though: why do we
need to add a separate declaration for this function here?
Thanks,
Muchun
> +
> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
> unsigned long addr, pte_t *ptep)
> {
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 1c77d5dc06e9f..aa8a254efaecc 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> /* Unexpected PMD-mapped THP? */
> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>
> - /*
> - * Handle PFN swap PTEs, such as device-exclusive ones, that
> - * actually map pages.
> - */
> - pteval = ptep_get(pvmw.pte);
> + address = pvmw.address;
> + if (folio_test_hugetlb(folio)) {
> + pteval = huge_ptep_get(mm, address, pvmw.pte);
> + } else {
> + /*
> + * Handle PFN swap PTEs, such as device-exclusive ones,
> + * that actually map pages.
> + */
> + pteval = ptep_get(pvmw.pte);
> + }
> if (likely(pte_present(pteval))) {
> pfn = pte_pfn(pteval);
> } else {
> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
> }
>
> subpage = folio_page(folio, pfn - folio_pfn(folio));
> - address = pvmw.address;
> anon_exclusive = folio_test_anon(folio) &&
> PageAnonExclusive(subpage);
>
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-26 3:17 ` Muchun Song
@ 2026-06-26 4:03 ` Dev Jain
2026-06-26 4:16 ` Muchun Song
0 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-26 4:03 UTC
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
On 26/06/26 8:47 am, Muchun Song wrote:
>
>
> On 2026/6/25 19:29, Dev Jain wrote:
>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>> in pvmw.pte, but the code reads it with ptep_get().
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>> etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>
>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> include/linux/hugetlb.h | 3 +++
>> mm/rmap.c | 16 ++++++++++------
>> 2 files changed, 13 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>> index 2abaf99321e90..fdb7bdf7645c5 100644
>> --- a/include/linux/hugetlb.h
>> +++ b/include/linux/hugetlb.h
>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>> {
>> }
>> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>> + pte_t *ptep);
>
> Thanks so much for the fix! I'm curious, though: why do we
> need to add a separate declaration for this function here?

With !CONFIG_HUGETLB_PAGE, the compiler complains that huge_ptep_get() is
not declared, so the declaration is there to keep the compiler happy.

>
> Thanks,
> Muchun
>
>> +
>> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>> unsigned long addr, pte_t *ptep)
>> {
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> /* Unexpected PMD-mapped THP? */
>> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>> - /*
>> - * Handle PFN swap PTEs, such as device-exclusive ones, that
>> - * actually map pages.
>> - */
>> - pteval = ptep_get(pvmw.pte);
>> + address = pvmw.address;
>> + if (folio_test_hugetlb(folio)) {
>> + pteval = huge_ptep_get(mm, address, pvmw.pte);
>> + } else {
>> + /*
>> + * Handle PFN swap PTEs, such as device-exclusive ones,
>> + * that actually map pages.
>> + */
>> + pteval = ptep_get(pvmw.pte);
>> + }
>> if (likely(pte_present(pteval))) {
>> pfn = pte_pfn(pteval);
>> } else {
>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>> }
>> subpage = folio_page(folio, pfn - folio_pfn(folio));
>> - address = pvmw.address;
>> anon_exclusive = folio_test_anon(folio) &&
>> PageAnonExclusive(subpage);
>>
>
* Re: [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one()
2026-06-26 4:03 ` Dev Jain
@ 2026-06-26 4:16 ` Muchun Song
0 siblings, 0 replies; 23+ messages in thread
From: Muchun Song @ 2026-06-26 4:16 UTC
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable, osalvador,
akpm, ljs, david, liam
> On Jun 26, 2026, at 12:03, Dev Jain <dev.jain@arm.com> wrote:
>
>
>
> On 26/06/26 8:47 am, Muchun Song wrote:
>>
>>
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> try_to_unmap_one() handles hugetlb folios when memory failure needs
>>> to replace a poisoned hugetlb mapping with a hwpoison entry. In that
>>> case page_vma_mapped_walk() returns the pte pointer to the hugetlb folio
>>> in pvmw.pte, but the code reads it with ptep_get().
>>>
>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
>>> etc to misbehave.
>>>
>>> It is not clear whether this has a trivially visible effect to userspace.
>>>
>>> Just use huge_ptep_get() for dereferencing a huge pte pointer.
>>>
>>> Fixes: c7ab0d2fdc84 ("mm: convert try_to_unmap_one() to use page_vma_mapped_walk()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> include/linux/hugetlb.h | 3 +++
>>> mm/rmap.c | 16 ++++++++++------
>>> 2 files changed, 13 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
>>> index 2abaf99321e90..fdb7bdf7645c5 100644
>>> --- a/include/linux/hugetlb.h
>>> +++ b/include/linux/hugetlb.h
>>> @@ -1261,6 +1261,9 @@ static inline void hugetlb_count_sub(long l, struct mm_struct *mm)
>>> {
>>> }
>>> +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr,
>>> + pte_t *ptep);
>>
>> Thanks so much for the fix! I'm curious, though: why do we
>> need to add a separate declaration for this function here?
>
> With !CONFIG_HUGETLB_PAGE, the compiler complains that huge_ptep_get() is
> not declared, so the declaration is there to keep the compiler happy.
Got it. We can refer to 5d4af6195c87c6b162b7963e0ad00a214b80d764 to fix
this warning.
Muchun,
Thanks.
>
>>
>> Thanks,
>> Muchun
>>
>>> +
>>> static inline pte_t huge_ptep_clear_flush(struct vm_area_struct *vma,
>>> unsigned long addr, pte_t *ptep)
>>> {
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 1c77d5dc06e9f..aa8a254efaecc 100644
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -2095,11 +2095,16 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>> /* Unexpected PMD-mapped THP? */
>>> VM_BUG_ON_FOLIO(!pvmw.pte, folio);
>>> - /*
>>> - * Handle PFN swap PTEs, such as device-exclusive ones, that
>>> - * actually map pages.
>>> - */
>>> - pteval = ptep_get(pvmw.pte);
>>> + address = pvmw.address;
>>> + if (folio_test_hugetlb(folio)) {
>>> + pteval = huge_ptep_get(mm, address, pvmw.pte);
>>> + } else {
>>> + /*
>>> + * Handle PFN swap PTEs, such as device-exclusive ones,
>>> + * that actually map pages.
>>> + */
>>> + pteval = ptep_get(pvmw.pte);
>>> + }
>>> if (likely(pte_present(pteval))) {
>>> pfn = pte_pfn(pteval);
>>> } else {
>>> @@ -2110,7 +2115,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
>>> }
>>> subpage = folio_page(folio, pfn - folio_pfn(folio));
>>> - address = pvmw.address;
>>> anon_exclusive = folio_test_anon(folio) &&
>>> PageAnonExclusive(subpage);
* [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:24 ` Muchun Song
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
` (3 subsequent siblings)
5 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable

try_to_migrate_one() is used by folio migration to replace a present
mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
reads the huge pte entry with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
etc. to misbehave.

It is not clear whether this has a trivially visible effect on userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one() into
try_to_migrate_one().

Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/rmap.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index aa8a254efaecc..abc3a44baaa3d 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2505,11 +2505,16 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		/* Unexpected PMD-mapped THP? */
 		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
 
-		/*
-		 * Handle PFN swap PTEs, such as device-exclusive ones, that
-		 * actually map pages.
-		 */
-		pteval = ptep_get(pvmw.pte);
+		address = pvmw.address;
+		if (folio_test_hugetlb(folio)) {
+			pteval = huge_ptep_get(mm, address, pvmw.pte);
+		} else {
+			/*
+			 * Handle PFN swap PTEs, such as device-exclusive ones,
+			 * that actually map pages.
+			 */
+			pteval = ptep_get(pvmw.pte);
+		}
 		if (likely(pte_present(pteval))) {
 			pfn = pte_pfn(pteval);
 		} else {
@@ -2520,7 +2525,6 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 		}
 
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
-		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
--
2.43.0
* Re: [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one()
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-26 3:24 ` Muchun Song
0 siblings, 0 replies; 23+ messages in thread
From: Muchun Song @ 2026-06-26 3:24 UTC
To: Dev Jain
Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
>
> try_to_migrate_one() is used by folio migration to replace a present
> mapping with a migration entry. For hugetlb folios, page_vma_mapped_walk()
> returns the pte pointer to the hugetlb folio in pvmw.pte, but the code
> reads the huge pte entry with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(), pte_present()
> etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
>
> Use huge_ptep_get() to dereference a huge pte pointer.
>
> Commit a98a2f0c8ce1 copied the bug from try_to_unmap_one into
> try_to_migrate_one.
>
> Fixes: a98a2f0c8ce1 ("mm/rmap: split migration into its own function")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Thanks.
* [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
2026-06-25 11:29 ` [PATCH 1/5] mm/rmap: use huge_ptep_get() in try_to_unmap_one() Dev Jain
2026-06-25 11:29 ` [PATCH 2/5] mm/rmap: use huge_ptep_get() in try_to_migrate_one() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:32 ` Muchun Song
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
` (2 subsequent siblings)
5 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable

remove_migration_pte() converts migration entries back to present PTEs
after folio migration completes. For hugetlb folios,
page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
pvmw.pte, but the code reads it with ptep_get().

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc. to misbehave.

It is not clear whether this has a trivially visible effect on userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/migrate.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/mm/migrate.c b/mm/migrate.c
index d9b23909d716c..c65f0f43df7eb 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -371,7 +371,11 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
-		old_pte = ptep_get(pvmw.pte);
+		if (folio_test_hugetlb(folio))
+			old_pte = huge_ptep_get(vma->vm_mm, pvmw.address,
+						pvmw.pte);
+		else
+			old_pte = ptep_get(pvmw.pte);
 		if (rmap_walk_arg->map_unused_to_zeropage &&
 		    try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx))
 			continue;
--
2.43.0
* Re: [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte()
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-26 3:32 ` Muchun Song
0 siblings, 0 replies; 23+ messages in thread
From: Muchun Song @ 2026-06-26 3:32 UTC
To: Dev Jain
Cc: osalvador, akpm, ljs, david, liam, riel, vbabka, harry, jannh,
lance.yang, kas, linux-mm, linux-kernel, rcampbell, apopple, ziy,
matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
> On Jun 25, 2026, at 19:29, Dev Jain <dev.jain@arm.com> wrote:
>
> remove_migration_pte() converts migration entries back to present PTEs
> after folio migration completes. For hugetlb folios,
> page_vma_mapped_walk() returns the pte pointer to the hugetlb folio in
> pvmw.pte, but the code reads it with ptep_get().
>
> On arches which provide their own huge_ptep_get() to dereference a huge
> pte pointer, accessing via ptep_get() would cause pte_pfn(),
> pte_present() etc to misbehave.
>
> It is not clear whether this has a trivially visible effect to userspace.
We are dealing with migration entries here, so the issue mentioned shouldn't
be a problem with any of the architectures. Semantically speaking, we definitely
should fix this.
>
> Use huge_ptep_get() to dereference a huge pte pointer.
>
> Fixes: 290408d4a250 ("hugetlb: hugepage migration core")
> Cc: stable@vger.kernel.org
> Signed-off-by: Dev Jain <dev.jain@arm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Thanks
* [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (2 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 3/5] mm/migrate: use huge_ptep_get() in remove_migration_pte() Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 2:31 ` Lance Yang
2026-06-26 7:48 ` Lance Yang
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
5 siblings, 2 replies; 23+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, stable

check_pte() is the final validation step in page_vma_mapped_walk().
It reads pvmw->pte with ptep_get() to decide whether the entry maps
the PFN range being walked. For hugetlb VMAs, that pointer refers
to a hugetlb entry.

On arches which provide their own huge_ptep_get() to dereference a huge
pte pointer, accessing via ptep_get() would cause pte_pfn(),
pte_present() etc. to misbehave.

It is not clear whether this has a trivially visible effect on userspace.

Use huge_ptep_get() to dereference a huge pte pointer.

Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
Cc: stable@vger.kernel.org
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/page_vma_mapped.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 2ccbabfb2cc17..18e1d341f463c 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
 static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 {
 	unsigned long pfn;
-	pte_t ptent = ptep_get(pvmw->pte);
+	pte_t ptent;
+
+	if (is_vm_hugetlb_page(pvmw->vma))
+		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+				      pvmw->pte);
+	else
+		ptent = ptep_get(pvmw->pte);
 
 	if (pvmw->flags & PVMW_MIGRATION) {
 		const softleaf_t entry = softleaf_from_pte(ptent);
--
2.43.0
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-26 2:31 ` Lance Yang
2026-06-26 4:06 ` Dev Jain
2026-06-26 7:48 ` Lance Yang
1 sibling, 1 reply; 23+ messages in thread
From: Lance Yang @ 2026-06-26 2:31 UTC
To: dev.jain
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)

Just one ordering thing: should this patch come first?

Patches #01-#03 only reach the new huge_ptep_get() after
page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
goes through check_pte() (still using ptep_get()).

> {
> unsigned long pfn;
>- pte_t ptent = ptep_get(pvmw->pte);
>+ pte_t ptent;
>+
>+ if (is_vm_hugetlb_page(pvmw->vma))
>+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+ pvmw->pte);
>+ else
>+ ptent = ptep_get(pvmw->pte);
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
>--
>2.43.0
>
>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 2:31 ` Lance Yang
@ 2026-06-26 4:06 ` Dev Jain
0 siblings, 0 replies; 23+ messages in thread
From: Dev Jain @ 2026-06-26 4:06 UTC
To: Lance Yang
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 8:01 am, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>
> Just one ordering thing: should this patch come first?
>
> Patches #01-#03 only reach the new huge_ptep_get() after
> page_vma_mapped_walk() succeeds. But before this patch, hugetlb still
> goes through check_pte() (still using ptep_get()).
You are right, but do we care? This is not a series meant for adding functionality.
I just sent it as a series because they are similar fixes - the patches are to
be applied individually with no dependency.
>
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>> + else
>> + ptent = ptep_get(pvmw->pte);
>>
>> if (pvmw->flags & PVMW_MIGRATION) {
>> const softleaf_t entry = softleaf_from_pte(ptent);
>> --
>> 2.43.0
>>
>>
>
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
2026-06-26 2:31 ` Lance Yang
@ 2026-06-26 7:48 ` Lance Yang
2026-06-26 9:14 ` Lance Yang
2026-06-26 13:23 ` Dev Jain
1 sibling, 2 replies; 23+ messages in thread
From: Lance Yang @ 2026-06-26 7:48 UTC
To: dev.jain, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, lance.yang, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>check_pte() is the final validation step in page_vma_mapped_walk().
>It reads pvmw->pte with ptep_get() to decide whether the entry maps
>the PFN range being walked. For hugetlb VMAs, that pointer refers
>to a hugetlb entry.
>
>On arches which provide their own huge_ptep_get() to dereference a huge
>pte pointer, accessing via ptep_get() would cause pte_pfn(),
>pte_present() etc to misbehave.
>
>It is not clear whether this has a trivially visible effect to userspace.
>
>Use huge_ptep_get() to dereference a huge pte pointer.
>
>Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>Cc: stable@vger.kernel.org
>Signed-off-by: Dev Jain <dev.jain@arm.com>
>---
> mm/page_vma_mapped.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..18e1d341f463c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
> {
> unsigned long pfn;
>- pte_t ptent = ptep_get(pvmw->pte);
>+ pte_t ptent;
>+
>+ if (is_vm_hugetlb_page(pvmw->vma))
>+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>+ pvmw->pte);
I think check_pte() can pass a wrong address to huge_ptep_get() ...
Not sure that is wrong in the first place. For memory failure,
page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
folio. In that case, pvmw->address need not be hugepage-aligned.
@Miaohe
For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
page_vma_mapped_walk() passes that size to hugetlb_walk():

bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
{
	...
	if (unlikely(is_vm_hugetlb_page(vma))) {
		...
		pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
		...
	}
	...
}

hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
calculating pmdp:
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz)
{
...
if (sz == CONT_PMD_SIZE)
addr &= CONT_PMD_MASK;
pmdp = pmd_offset(pudp, addr);
pmd = READ_ONCE(*pmdp);
...
}
So for that case, pvmw->pte is calculated from the aligned addr, not
necessarily from the original pvmw->address. But check_pte() passes the
original address together with pvmw->pte:
+ ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
+ pvmw->pte);
arm64 then uses that addr again to choose ncontig:
pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
...
ncontig = find_num_contig(mm, addr, ptep, &pgsize);
for (i = 0; i < ncontig; i++, ptep++) {
...
}
return orig_pte;
}
static int find_num_contig(struct mm_struct *mm, unsigned long addr,
pte_t *ptep, size_t *pgsize)
{
pgd_t *pgdp = pgd_offset(mm, addr);
p4d_t *p4dp;
pud_t *pudp;
pmd_t *pmdp;
*pgsize = PAGE_SIZE;
p4dp = p4d_offset(pgdp, addr);
pudp = pud_offset(p4dp, addr);
pmdp = pmd_offset(pudp, addr);
if ((pte_t *)pmdp == ptep) {
*pgsize = PMD_SIZE;
return CONT_PMDS;
}
return CONT_PTES;
}
With a tail address, pmdp may no longer point at pvmw->pte, so
find_num_contig() can return CONT_PTES for a CONT_PMD HugeTLB mapping.
On 16K arm64, that changes ncontig from 32 to 128. So huge_ptep_get()
can walk past the CONT_PMD entries, and possibly past the PMD table.
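
To make the mismatch concrete, here is a minimal user-space sketch, not
kernel code. It assumes a 16K-granule geometry (PMD_SHIFT of 25, 2048
entries per PMD table, CONT_PMDS of 32, CONT_PTES of 128) and models a
single PMD table as a plain array. All model_* names and pmd_model_t are
hypothetical stand-ins for huge_pte_offset(), pmd_offset() and
find_num_contig(); the real functions walk the full page table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 16K-granule arm64 geometry, simplified for illustration. */
#define PAGE_SHIFT	14
#define PMD_SHIFT	25			/* 32 MiB per PMD entry */
#define PTRS_PER_PMD	2048
#define CONT_PTES	128
#define CONT_PMDS	32
#define PMD_SIZE	(1ULL << PMD_SHIFT)
#define CONT_PMD_SIZE	(CONT_PMDS * PMD_SIZE)	/* 1 GiB */
#define CONT_PMD_MASK	(~(CONT_PMD_SIZE - 1))

/* Hypothetical stand-in for one PMD entry. */
typedef uint64_t pmd_model_t;

/* Model of pmd_offset(): select the slot for addr in one PMD table. */
static pmd_model_t *model_pmd_offset(pmd_model_t *table, uint64_t addr)
{
	return &table[(addr >> PMD_SHIFT) & (PTRS_PER_PMD - 1)];
}

/* Model of huge_pte_offset() for sz == CONT_PMD_SIZE: align, then look up. */
static pmd_model_t *model_huge_pte_offset(pmd_model_t *table, uint64_t addr)
{
	addr &= CONT_PMD_MASK;
	return model_pmd_offset(table, addr);
}

/*
 * Model of find_num_contig(): recompute the slot from addr and compare it
 * with the pointer the caller already holds.
 */
static int model_find_num_contig(pmd_model_t *table, uint64_t addr,
				 pmd_model_t *ptep)
{
	if (model_pmd_offset(table, addr) == ptep)
		return CONT_PMDS;
	return CONT_PTES;
}
```

In this model, a tail address that lies more than one PMD_SIZE into the
huge page selects a different slot than the one returned by the aligned
lookup, so the comparison fails and the function falls back to CONT_PTES.
Passing the address masked with CONT_PMD_MASK restores the match.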

Should check_pte() pass the address matching pvmw->pte, something like:

---8<---
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 406fd50bbd8f..58463493bd3d 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -109,11 +109,14 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
 	unsigned long pfn;
 	pte_t ptent;
 
-	if (is_vm_hugetlb_page(pvmw->vma))
-		ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
-				      pvmw->pte);
-	else
+	if (is_vm_hugetlb_page(pvmw->vma)) {
+		struct hstate *hstate = hstate_vma(pvmw->vma);
+		unsigned long haddr = pvmw->address & huge_page_mask(hstate);
+
+		ptent = huge_ptep_get(pvmw->vma->vm_mm, haddr, pvmw->pte);
+	} else {
 		ptent = ptep_get(pvmw->pte);
+	}
 
 	if (pvmw->flags & PVMW_MIGRATION) {
 		const softleaf_t entry = softleaf_from_pte(ptent);
--

while leaving pvmw->address unchanged for page_mapped_in_vma()?

Cheers, Lance

>+ else
>+ ptent = ptep_get(pvmw->pte);
>
> if (pvmw->flags & PVMW_MIGRATION) {
> const softleaf_t entry = softleaf_from_pte(ptent);
>--
>2.43.0
>
>
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 7:48 ` Lance Yang
@ 2026-06-26 9:14 ` Lance Yang
2026-06-26 13:23 ` Dev Jain
1 sibling, 0 replies; 23+ messages in thread
From: Lance Yang @ 2026-06-26 9:14 UTC (permalink / raw)
To: dev.jain, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 2026/6/26 15:48, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>
> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>
> Not sure that is wrong in the first place. For memory failure,
> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
> folio. In that case, pvmw->address need not be hugepage-aligned.
>
> @Miaohe
>
> For arm64, CONT_PMD_SIZE is one supported HugeTLB size. With such a VMA,
> page_vma_mapped_walk() passes that size to hugetlb_walk():
>
> bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> {
> ...
> if (unlikely(is_vm_hugetlb_page(vma))) {
> ...
> pvmw->pte = hugetlb_walk(vma, pvmw->address, size);
> ...
> }
> ...
> }
>
> hugetlb_walk() then calls arm64 huge_pte_offset(mm, addr, sz). For
> sz == CONT_PMD_SIZE, huge_pte_offset() aligns its local addr before
> calculating pmdp:
>
> pte_t *huge_pte_offset(struct mm_struct *mm,
> unsigned long addr, unsigned long sz)
> {
> ...
> if (sz == CONT_PMD_SIZE)
> addr &= CONT_PMD_MASK;
>
> pmdp = pmd_offset(pudp, addr);
> pmd = READ_ONCE(*pmdp);
> ...
> }
>
> So for that case, pvmw->pte is calculated from the aligned addr, not
> necessarily from the original pvmw->address. But check_pte() passes the
> original address together with pvmw->pte:
>
> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
> + pvmw->pte);

In addition:

I went through all arch code that has its own huge_ptep_get(); only
arm64 and powerpc actually use addr, and there addr has to match the
ptep, IIUC.

So I am wondering whether all huge_ptep_get() callers satisfy that
requirement.

Cheers, Lance
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 7:48 ` Lance Yang
2026-06-26 9:14 ` Lance Yang
@ 2026-06-26 13:23 ` Dev Jain
2026-06-26 14:10 ` Lance Yang
1 sibling, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-26 13:23 UTC (permalink / raw)
To: Lance Yang, linmiaohe
Cc: muchun.song, osalvador, akpm, ljs, david, liam, riel, vbabka,
harry, jannh, kas, linux-mm, linux-kernel, rcampbell, apopple,
ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul, gourry,
ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On 26/06/26 1:18 pm, Lance Yang wrote:
>
> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>> check_pte() is the final validation step in page_vma_mapped_walk().
>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>> to a hugetlb entry.
>>
>> On arches which provide their own huge_ptep_get() to dereference a huge
>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>> pte_present() etc to misbehave.
>>
>> It is not clear whether this has a trivially visible effect to userspace.
>>
>> Use huge_ptep_get() to dereference a huge pte pointer.
>>
>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/page_vma_mapped.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..18e1d341f463c 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>> {
>> unsigned long pfn;
>> - pte_t ptent = ptep_get(pvmw->pte);
>> + pte_t ptent;
>> +
>> + if (is_vm_hugetlb_page(pvmw->vma))
>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>> + pvmw->pte);
>
> I think check_pte() can pass a wrong address to huge_ptep_get() ...

Won't this be handled by rmap_walk_anon()/rmap_walk_file()? They are the
ones performing the rmap traversal and passing the address to
try_to_unmap_one(), folio_referenced_one(), etc.

^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb
2026-06-26 13:23 ` Dev Jain
@ 2026-06-26 14:10 ` Lance Yang
0 siblings, 0 replies; 23+ messages in thread
From: Lance Yang @ 2026-06-26 14:10 UTC (permalink / raw)
To: dev.jain, linmiaohe
Cc: lance.yang, muchun.song, osalvador, akpm, ljs, david, liam, riel,
vbabka, harry, jannh, kas, linux-mm, linux-kernel, rcampbell,
apopple, ziy, matthew.brost, joshua.hahnjy, rakie.kim, byungchul,
gourry, ying.huang, mel, nao.horiguchi, ak, j-nomura, pfalcato,
dave.hansen, tglx, jpoimboe, ryan.roberts, anshuman.khandual,
stable
On Fri, Jun 26, 2026 at 06:53:10PM +0530, Dev Jain wrote:
>
>
>On 26/06/26 1:18 pm, Lance Yang wrote:
>>
>> On Thu, Jun 25, 2026 at 11:29:53AM +0000, Dev Jain wrote:
>>> check_pte() is the final validation step in page_vma_mapped_walk().
>>> It reads pvmw->pte with ptep_get() to decide whether the entry maps
>>> the PFN range being walked. For hugetlb VMAs, that pointer refers
>>> to a hugetlb entry.
>>>
>>> On arches which provide their own huge_ptep_get() to dereference a huge
>>> pte pointer, accessing via ptep_get() would cause pte_pfn(),
>>> pte_present() etc to misbehave.
>>>
>>> It is not clear whether this has a trivially visible effect to userspace.
>>>
>>> Use huge_ptep_get() to dereference a huge pte pointer.
>>>
>>> Fixes: ace71a19cec5 ("mm: introduce page_vma_mapped_walk()")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> mm/page_vma_mapped.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc17..18e1d341f463c 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -107,7 +107,13 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw, pmd_t *pmdvalp,
>>> static bool check_pte(struct page_vma_mapped_walk *pvmw, unsigned long pte_nr)
>>> {
>>> unsigned long pfn;
>>> - pte_t ptent = ptep_get(pvmw->pte);
>>> + pte_t ptent;
>>> +
>>> + if (is_vm_hugetlb_page(pvmw->vma))
>>> + ptent = huge_ptep_get(pvmw->vma->vm_mm, pvmw->address,
>>> + pvmw->pte);
>>
>> I think check_pte() can pass a wrong address to huge_ptep_get() ...
>
>Won't this be handled by rmap_walk_anon/rmap_walk_file - they are the ones
>performing the rmap traversal and passing address to try_to_unmap_one/folio_referenced_one
>etc ...
Right, that should cover the rmap callbacks. The bit I was worried about
is page_mapped_in_vma() though.
>>
>> Not sure that is wrong in the first place. For memory failure,
>> page_mapped_in_vma() can be called with a poisoned tail page of a hugetlb
>> folio. In that case, pvmw->address need not be hugepage-aligned.
>>
>> @Miaohe

For hugetlb memory failure we start with the poisoned PFN:

	static int try_memory_failure_hugetlb(unsigned long pfn, int flags)
	{
		...
		struct page *p = pfn_to_page(pfn);
		struct folio *folio;
		...
		folio = page_folio(p);
		...
		if (!hwpoison_user_mappings(folio, p, pfn, flags)) {
			...
		}
		...
	}

and pass the same p down:

	static bool hwpoison_user_mappings(struct folio *folio, struct page *p,
			unsigned long pfn, int flags)
	{
		...
		collect_procs(folio, p, &tokill, flags & MF_ACTION_REQUIRED);
		...
	}

	static void collect_procs(const struct folio *folio, const struct page *page,
			struct list_head *tokill, int force_early)
	{
		...
		if (unlikely(folio_test_ksm(folio)))
			...
		else if (folio_test_anon(folio))
			collect_procs_anon(folio, page, tokill, force_early);
		else
			...
	}

So collect_procs_anon() still gets the poisoned page, not &folio->page:

	static void collect_procs_anon(const struct folio *folio,
			const struct page *page, struct list_head *to_kill,
			int force_early)
	{
		...
		pgoff = page_pgoff(folio, page);
		rcu_read_lock();
		for_each_process(tsk) {
			...
			anon_vma_interval_tree_foreach(vmac, &av->rb_root,
						       pgoff, pgoff) {
				...
				addr = page_mapped_in_vma(page, vma);
				...
			}
		}
		rcu_read_unlock();
		anon_vma_unlock_read(av);
	}

page_mapped_in_vma() then builds pvmw for that page:

	unsigned long page_mapped_in_vma(const struct page *page,
			struct vm_area_struct *vma)
	{
		const struct folio *folio = page_folio(page);
		struct page_vma_mapped_walk pvmw = {
			.pfn = page_to_pfn(page),
			.nr_pages = 1,
			.vma = vma,
			.flags = PVMW_SYNC,
		};

		pvmw.address = vma_address(vma, page_pgoff(folio, page), 1);
		...
	}

and page_pgoff() includes the subpage index:

	static inline pgoff_t page_pgoff(const struct folio *folio,
			const struct page *page)
	{
		return folio->index + folio_page_idx(folio, page);
	}

So if the poisoned PFN points to a tail page, pvmw->address can be offset
from the start of the hugetlb mapping by

	folio_page_idx(folio, page) << PAGE_SHIFT

Should check_pte() pass the hugepage-aligned address to huge_ptep_get()
for that case?
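
The address arithmetic in question can be sketched in user space. This is
only an illustration under stated assumptions: 4K base pages, a folio
index counted in base pages, and a pgoff that lies inside the VMA.
struct model_vma and the model_* helpers are hypothetical and only mirror
the shape of page_pgoff(), vma_address() and the huge_page_mask()
alignment; they are not the kernel implementations.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12	/* assumed 4K base pages */

/* Hypothetical stand-in for the two VMA fields used below. */
struct model_vma {
	uint64_t vm_start;	/* first mapped virtual address */
	uint64_t vm_pgoff;	/* file/anon offset of vm_start, in pages */
};

/* Mirrors page_pgoff(): folio index plus the index of the subpage. */
static uint64_t model_page_pgoff(uint64_t folio_index, uint64_t page_idx)
{
	return folio_index + page_idx;
}

/* Simplified vma_address(): assumes pgoff >= vm_pgoff and inside the VMA. */
static uint64_t model_vma_address(const struct model_vma *vma, uint64_t pgoff)
{
	return vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
}

/* Align down to a huge page boundary; huge_size must be a power of two. */
static uint64_t model_huge_align(uint64_t addr, uint64_t huge_size)
{
	return addr & ~(huge_size - 1);
}
```

With a 2 MiB huge page whose folio index is 512 and a poisoned subpage at
index 7, the walk address lands 7 << PAGE_SHIFT bytes past the start of
the huge page, and aligning it down recovers the address that matches the
huge pte pointer.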
Cheers, Lance
^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (3 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 4/5] mm/page_vma_mapped: use huge_ptep_get() for hugetlb Dev Jain
@ 2026-06-25 11:29 ` Dev Jain
2026-06-26 3:40 ` Muchun Song
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
5 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-25 11:29 UTC (permalink / raw)
To: muchun.song, osalvador, akpm, ljs, david, liam
Cc: Dev Jain, riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual

prot_none_hugetlb_entry() is the hugetlb callback for the early
mprotect(PROT_NONE) PFN permission walk on x86.

The callback passes the decoded PFN to pfn_modify_allowed(). For a
hugetlb callback, the pte pointer refers to a hugetlb entry. On
architectures where hugetlb entries need huge_ptep_get(), reading that
entry with ptep_get() can make the permission check use the wrong PFN.

Use huge_ptep_get() before decoding the hugetlb PFN.

Currently there is no path which can trigger a bug: huge_ptep_get() is a
simple ptep_get() for x86, and the prot_none walk occurs only for x86.
But use the correct helper anyways.

Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
 mm/mprotect.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028cf..23779632d18bf 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
 		0 : -EACCES;
 }
 
+#ifdef CONFIG_HUGETLB_PAGE
 static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
 				   unsigned long addr, unsigned long next,
 				   struct mm_walk *walk)
 {
-	return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
+	pte_t entry = huge_ptep_get(walk->mm, addr, pte);
+
+	return pfn_modify_allowed(pte_pfn(entry),
 				  *(pgprot_t *)(walk->private)) ?
 		0 : -EACCES;
 }
+#else
+#define prot_none_hugetlb_entry NULL
+#endif
 
 static int prot_none_test(unsigned long addr, unsigned long next,
 			  struct mm_walk *walk)
--
2.43.0

^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-26 3:40 ` Muchun Song
2026-06-26 4:08 ` Dev Jain
0 siblings, 1 reply; 23+ messages in thread
From: Muchun Song @ 2026-06-26 3:40 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 2026/6/25 19:29, Dev Jain wrote:
> prot_none_hugetlb_entry() is the hugetlb callback for the early
> mprotect(PROT_NONE) PFN permission walk on x86.
>
> The callback passes the decoded PFN to pfn_modify_allowed(). For a
> hugetlb callback, the pte pointer refers to a hugetlb entry. On
> architectures where hugetlb entries need huge_ptep_get(), reading that
> entry with ptep_get() can make the permission check use the wrong PFN.
>
> Use huge_ptep_get() before decoding the hugetlb PFN.
>
> Currently there is no path which can trigger a bug: huge_ptep_get() is a
> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
> But use the correct helper anyways.
>
> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> mm/mprotect.c | 8 +++++++-
> 1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028cf..23779632d18bf 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
> 0 : -EACCES;
> }
>
> +#ifdef CONFIG_HUGETLB_PAGE
> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
> unsigned long addr, unsigned long next,
> struct mm_walk *walk)
> {
> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
> +
> + return pfn_modify_allowed(pte_pfn(entry),
> *(pgprot_t *)(walk->private)) ?
> 0 : -EACCES;
> }
> +#else
> +#define prot_none_hugetlb_entry NULL

This is very strange, because we defined a stub as NULL for a helper
function. How about the following diff?

diff --git a/mm/mprotect.c b/mm/mprotect.c
index 9cbf932b028c..4d8c1551fbce 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
 
 static const struct mm_walk_ops prot_none_walk_ops = {
 	.pte_entry = prot_none_pte_entry,
+#ifdef CONFIG_HUGETLB_PAGE
 	.hugetlb_entry = prot_none_hugetlb_entry,
+#endif
 	.test_walk = prot_none_test,
 	.walk_lock = PGWALK_WRLOCK,
 };
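
The two styles under discussion can be compared in a small stand-alone
sketch. Everything in it is hypothetical: struct model_walk_ops only
mimics the shape of mm_walk_ops, MODEL_HUGETLB_PAGE stands in for
CONFIG_HUGETLB_PAGE, and the callbacks are empty placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of struct mm_walk_ops; not the kernel definition. */
struct model_walk_ops {
	int (*pte_entry)(unsigned long addr);
	int (*hugetlb_entry)(unsigned long addr);
};

static int model_pte_entry(unsigned long addr)
{
	(void)addr;
	return 0;
}

#ifdef MODEL_HUGETLB_PAGE	/* stands in for CONFIG_HUGETLB_PAGE */
static int model_hugetlb_entry(unsigned long addr)
{
	(void)addr;
	return 0;
}
#endif

/* Style A: guard the initializer; an omitted member is zero-initialized. */
static const struct model_walk_ops ops_guarded = {
	.pte_entry = model_pte_entry,
#ifdef MODEL_HUGETLB_PAGE
	.hugetlb_entry = model_hugetlb_entry,
#endif
};

/* Style B: define the callback name as NULL when the option is off. */
#ifndef MODEL_HUGETLB_PAGE
#define model_hugetlb_entry NULL
#endif

static const struct model_walk_ops ops_stubbed = {
	.pte_entry = model_pte_entry,
	.hugetlb_entry = model_hugetlb_entry,
};
```

With the option off, both tables end up with a null hugetlb_entry, so the
difference in this sketch is purely about where the conditional lives:
next to the table in style A, next to the function in style B.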
Thanks,
Muchun
> +#endif
>
> static int prot_none_test(unsigned long addr, unsigned long next,
> struct mm_walk *walk)
^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 3:40 ` Muchun Song
@ 2026-06-26 4:08 ` Dev Jain
2026-06-26 4:21 ` Muchun Song
0 siblings, 1 reply; 23+ messages in thread
From: Dev Jain @ 2026-06-26 4:08 UTC (permalink / raw)
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 26/06/26 9:10 am, Muchun Song wrote:
>
>
> On 2026/6/25 19:29, Dev Jain wrote:
>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>> mprotect(PROT_NONE) PFN permission walk on x86.
>>
>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>> architectures where hugetlb entries need huge_ptep_get(), reading that
>> entry with ptep_get() can make the permission check use the wrong PFN.
>>
>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>
>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>> But use the correct helper anyways.
>>
>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/mprotect.c | 8 +++++++-
>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028cf..23779632d18bf 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>> 0 : -EACCES;
>> }
>> +#ifdef CONFIG_HUGETLB_PAGE
>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>> unsigned long addr, unsigned long next,
>> struct mm_walk *walk)
>> {
>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>> +
>> + return pfn_modify_allowed(pte_pfn(entry),
>> *(pgprot_t *)(walk->private)) ?
>> 0 : -EACCES;
>> }
>> +#else
>> +#define prot_none_hugetlb_entry NULL
>
> This is very strange, because we defined a stub as NULL for a helper

I was following the pattern used elsewhere; search for ".hugetlb_entry"
in the codebase and you will find others doing the same.

> function. How about the following diff?
>
> diff --git a/mm/mprotect.c b/mm/mprotect.c
> index 9cbf932b028c..4d8c1551fbce 100644
> --- a/mm/mprotect.c
> +++ b/mm/mprotect.c
> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>
> static const struct mm_walk_ops prot_none_walk_ops = {
> .pte_entry = prot_none_pte_entry,
> +#ifdef CONFIG_HUGETLB_PAGE
> .hugetlb_entry = prot_none_hugetlb_entry,
> +#endif
> .test_walk = prot_none_test,
> .walk_lock = PGWALK_WRLOCK,
> };
>
> Thanks,
> Muchun
>
>> +#endif
>> static int prot_none_test(unsigned long addr, unsigned long next,
>> struct mm_walk *walk)
>
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 4:08 ` Dev Jain
@ 2026-06-26 4:21 ` Muchun Song
2026-06-26 4:42 ` Dev Jain
0 siblings, 1 reply; 23+ messages in thread
From: Muchun Song @ 2026-06-26 4:21 UTC (permalink / raw)
To: Dev Jain
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
>
>
>
> On 26/06/26 9:10 am, Muchun Song wrote:
>>
>>
>> On 2026/6/25 19:29, Dev Jain wrote:
>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>>
>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>>
>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>>
>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>> But use the correct helper anyways.
>>>
>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>> ---
>>> mm/mprotect.c | 8 +++++++-
>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028cf..23779632d18bf 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>> 0 : -EACCES;
>>> }
>>> +#ifdef CONFIG_HUGETLB_PAGE
>>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>> unsigned long addr, unsigned long next,
>>> struct mm_walk *walk)
>>> {
>>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>> +
>>> + return pfn_modify_allowed(pte_pfn(entry),
>>> *(pgprot_t *)(walk->private)) ?
>>> 0 : -EACCES;
>>> }
>>> +#else
>>> +#define prot_none_hugetlb_entry NULL
>>
>> This is very strange, because we defined a stub as NULL for a helper
>
> I was following pattern elsewhere, search for ".hugetlb_entry" in the
> codebase and you will find others doing the same.
Okay, I understand why you want to do it that way, but I would still
recommend not following that format.
Thanks.
>
>
>> function. How about the following diff?
>>
>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>> index 9cbf932b028c..4d8c1551fbce 100644
>> --- a/mm/mprotect.c
>> +++ b/mm/mprotect.c
>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>>
>> static const struct mm_walk_ops prot_none_walk_ops = {
>> .pte_entry = prot_none_pte_entry,
>> +#ifdef CONFIG_HUGETLB_PAGE
>> .hugetlb_entry = prot_none_hugetlb_entry,
>> +#endif
>> .test_walk = prot_none_test,
>> .walk_lock = PGWALK_WRLOCK,
>> };
>>
>> Thanks,
>> Muchun
>>
>>> +#endif
>>> static int prot_none_test(unsigned long addr, unsigned long next,
>>> struct mm_walk *walk)
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 5/5] mm/mprotect: use huge_ptep_get() for hugetlb
2026-06-26 4:21 ` Muchun Song
@ 2026-06-26 4:42 ` Dev Jain
0 siblings, 0 replies; 23+ messages in thread
From: Dev Jain @ 2026-06-26 4:42 UTC (permalink / raw)
To: Muchun Song
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, ziy, matthew.brost,
joshua.hahnjy, rakie.kim, byungchul, gourry, ying.huang, mel,
nao.horiguchi, ak, j-nomura, pfalcato, dave.hansen, tglx,
jpoimboe, ryan.roberts, anshuman.khandual, osalvador, akpm, ljs,
david, liam
On 26/06/26 9:51 am, Muchun Song wrote:
>
>
>> On Jun 26, 2026, at 12:08, Dev Jain <dev.jain@arm.com> wrote:
>>
>>
>>
>> On 26/06/26 9:10 am, Muchun Song wrote:
>>>
>>>
>>> On 2026/6/25 19:29, Dev Jain wrote:
>>>> prot_none_hugetlb_entry() is the hugetlb callback for the early
>>>> mprotect(PROT_NONE) PFN permission walk on x86.
>>>>
>>>> The callback passes the decoded PFN to pfn_modify_allowed(). For a
>>>> hugetlb callback, the pte pointer refers to a hugetlb entry. On
>>>> architectures where hugetlb entries need huge_ptep_get(), reading that
>>>> entry with ptep_get() can make the permission check use the wrong PFN.
>>>>
>>>> Use huge_ptep_get() before decoding the hugetlb PFN.
>>>>
>>>> Currently there is no path which can trigger a bug: huge_ptep_get() is a
>>>> simple ptep_get() for x86, and the prot_none walk occurs only for x86.
>>>> But use the correct helper anyways.
>>>>
>>>> Fixes: 42e4089c7890 ("x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings")
>>>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>>>> ---
>>>> mm/mprotect.c | 8 +++++++-
>>>> 1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>>> index 9cbf932b028cf..23779632d18bf 100644
>>>> --- a/mm/mprotect.c
>>>> +++ b/mm/mprotect.c
>>>> @@ -699,14 +699,20 @@ static int prot_none_pte_entry(pte_t *pte, unsigned long addr,
>>>> 0 : -EACCES;
>>>> }
>>>> +#ifdef CONFIG_HUGETLB_PAGE
>>>> static int prot_none_hugetlb_entry(pte_t *pte, unsigned long hmask,
>>>> unsigned long addr, unsigned long next,
>>>> struct mm_walk *walk)
>>>> {
>>>> - return pfn_modify_allowed(pte_pfn(ptep_get(pte)),
>>>> + pte_t entry = huge_ptep_get(walk->mm, addr, pte);
>>>> +
>>>> + return pfn_modify_allowed(pte_pfn(entry),
>>>> *(pgprot_t *)(walk->private)) ?
>>>> 0 : -EACCES;
>>>> }
>>>> +#else
>>>> +#define prot_none_hugetlb_entry NULL
>>>
>>> This is very strange, because we defined a stub as NULL for a helper
>>
>> I was following pattern elsewhere, search for ".hugetlb_entry" in the
>> codebase and you will find others doing the same.
>
> Okay, I understand why you want to do it that way, but I would still
> recommend not following that format.
Okay, then I'll update v2 with the diff below.
>
> Thanks.
>
>>
>>
>>> function. How about the following diff?
>>>
>>> diff --git a/mm/mprotect.c b/mm/mprotect.c
>>> index 9cbf932b028c..4d8c1551fbce 100644
>>> --- a/mm/mprotect.c
>>> +++ b/mm/mprotect.c
>>> @@ -716,7 +716,9 @@ static int prot_none_test(unsigned long addr, unsigned long next,
>>>
>>> static const struct mm_walk_ops prot_none_walk_ops = {
>>> .pte_entry = prot_none_pte_entry,
>>> +#ifdef CONFIG_HUGETLB_PAGE
>>> .hugetlb_entry = prot_none_hugetlb_entry,
>>> +#endif
>>> .test_walk = prot_none_test,
>>> .walk_lock = PGWALK_WRLOCK,
>>> };
>>>
>>> Thanks,
>>> Muchun
>>>
>>>> +#endif
>>>> static int prot_none_test(unsigned long addr, unsigned long next,
>>>> struct mm_walk *walk)
>
>
>
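For readers comparing the two styles discussed above, here is a minimal user-space sketch, not kernel code. It contrasts the "#define ... NULL" stub with the "#ifdef inside the initializer" form suggested in review. Every name in it (struct demo_walk_ops, DEMO_HUGETLB, demo_walk_huge() and so on) is hypothetical and only stands in for struct mm_walk_ops and CONFIG_HUGETLB_PAGE.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CONFIG_HUGETLB_PAGE; build with -DDEMO_HUGETLB=0 to model it being off. */
#ifndef DEMO_HUGETLB
#define DEMO_HUGETLB 1
#endif

/* Hypothetical stand-in for struct mm_walk_ops (only two callbacks). */
struct demo_walk_ops {
	int (*pte_entry)(unsigned long pfn);
	int (*hugetlb_entry)(unsigned long pfn);
};

/* Toy permission check: PFNs below 100 are "allowed". */
int demo_pte_entry(unsigned long pfn)
{
	return pfn < 100 ? 0 : -1;
}

#if DEMO_HUGETLB
int demo_hugetlb_entry(unsigned long pfn)
{
	return pfn < 100 ? 0 : -1;
}
#else
/* Style A: the callback name becomes a NULL macro stub. */
#define demo_hugetlb_entry NULL
#endif

/* Style A: the initializer is unconditional and relies on the stub. */
const struct demo_walk_ops demo_ops_stub = {
	.pte_entry = demo_pte_entry,
	.hugetlb_entry = demo_hugetlb_entry,
};

/* Style B: no stub needed; the member is simply left zero-initialized. */
const struct demo_walk_ops demo_ops_ifdef = {
	.pte_entry = demo_pte_entry,
#if DEMO_HUGETLB
	.hugetlb_entry = demo_hugetlb_entry,
#endif
};

/* A walker skips the hugetlb callback when it is absent. */
int demo_walk_huge(const struct demo_walk_ops *ops, unsigned long pfn)
{
	assert(ops != NULL);
	if (!ops->hugetlb_entry)
		return 0;
	return ops->hugetlb_entry(pfn);
}
```

In this sketch both tables behave identically in either configuration, since an omitted designated initializer leaves the pointer NULL. The choice between them is therefore about readability, not behaviour.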
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
2026-06-25 11:29 [PATCH 0/5] Fix incorrect access of hugetlb pte entries Dev Jain
` (4 preceding siblings ...)
2026-06-25 11:29 ` [PATCH 5/5] mm/mprotect: " Dev Jain
@ 2026-06-25 13:59 ` Zi Yan
2026-06-26 4:09 ` Dev Jain
5 siblings, 1 reply; 23+ messages in thread
From: Zi Yan @ 2026-06-25 13:59 UTC (permalink / raw)
To: Dev Jain, muchun.song, osalvador, akpm, ljs, david, liam
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
anshuman.khandual
On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
> There are various places which use ptep_get() to get the pte entry
> corresponding to a hugetlb folio. Some arches have special handling
I think it is better to mention s390 as a concrete example.
> to compute the pteval, so they provide huge_ptep_get(). Use this
> helper consistently.
>
> Dev Jain (5):
> mm/rmap: use huge_ptep_get() in try_to_unmap_one()
> mm/rmap: use huge_ptep_get() in try_to_migrate_one()
> mm/migrate: use huge_ptep_get() in remove_migration_pte()
> mm/page_vma_mapped: use huge_ptep_get() for hugetlb
> mm/mprotect: use huge_ptep_get() for hugetlb
>
> include/linux/hugetlb.h | 3 +++
> mm/migrate.c | 6 +++++-
> mm/mprotect.c | 8 +++++++-
> mm/page_vma_mapped.c | 8 +++++++-
> mm/rmap.c | 32 ++++++++++++++++++++------------
> 5 files changed, 42 insertions(+), 15 deletions(-)
--
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH 0/5] Fix incorrect access of hugetlb pte entries
2026-06-25 13:59 ` [PATCH 0/5] Fix incorrect access of hugetlb pte entries Zi Yan
@ 2026-06-26 4:09 ` Dev Jain
0 siblings, 0 replies; 23+ messages in thread
From: Dev Jain @ 2026-06-26 4:09 UTC (permalink / raw)
To: Zi Yan, muchun.song, osalvador, akpm, ljs, david, liam
Cc: riel, vbabka, harry, jannh, lance.yang, kas, linux-mm,
linux-kernel, rcampbell, apopple, matthew.brost, joshua.hahnjy,
rakie.kim, byungchul, gourry, ying.huang, mel, nao.horiguchi, ak,
j-nomura, pfalcato, dave.hansen, tglx, jpoimboe, ryan.roberts,
anshuman.khandual
On 25/06/26 7:29 pm, Zi Yan wrote:
> On Thu Jun 25, 2026 at 7:29 AM EDT, Dev Jain wrote:
>> There are various places which use ptep_get() to get the pte entry
>> corresponding to a hugetlb folio. Some arches have special handling
>
> I think it is better to mention s390 as a concrete example.
Sure. In case there is no v2, requesting Andrew to change
"Some arches have special handling" to "Some arches like s390 have
special handling".
>
>> to compute the pteval, so they provide huge_ptep_get(). Use this
>> helper consistently.
>>
>> Dev Jain (5):
>> mm/rmap: use huge_ptep_get() in try_to_unmap_one()
>> mm/rmap: use huge_ptep_get() in try_to_migrate_one()
>> mm/migrate: use huge_ptep_get() in remove_migration_pte()
>> mm/page_vma_mapped: use huge_ptep_get() for hugetlb
>> mm/mprotect: use huge_ptep_get() for hugetlb
>>
>> include/linux/hugetlb.h | 3 +++
>> mm/migrate.c | 6 +++++-
>> mm/mprotect.c | 8 +++++++-
>> mm/page_vma_mapped.c | 8 +++++++-
>> mm/rmap.c | 32 ++++++++++++++++++++------------
>> 5 files changed, 42 insertions(+), 15 deletions(-)
>
>
>
>
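As an illustration of why the choice of accessor matters, below is a toy user-space model, not any real architecture's code. It assumes an invented layout in which a huge entry stores its PFN and present bit at different positions than a regular pte, so a plain read decodes the wrong PFN. All names and bit positions (toy_pte_t, TOY_HUGE_PFN_SHIFT, toy_huge_ptep_get() and so on) are hypothetical.

```c
#include <stdint.h>

typedef struct { uint64_t val; } toy_pte_t;

/* Invented layouts, for illustration only. */
#define TOY_PTE_PRESENT    0x1ULL	/* regular pte: present is bit 0 */
#define TOY_PTE_PFN_SHIFT  12		/* regular pte: PFN starts at bit 12 */
#define TOY_HUGE_PRESENT   0x2ULL	/* huge entry: present is bit 1 */
#define TOY_HUGE_PFN_SHIFT 20		/* huge entry: PFN starts at bit 20 */

/* Plain accessor: returns the raw entry unchanged. */
toy_pte_t toy_ptep_get(const toy_pte_t *ptep)
{
	return *ptep;
}

/* Huge accessor: converts the huge-entry layout into the regular pte layout. */
toy_pte_t toy_huge_ptep_get(const toy_pte_t *ptep)
{
	uint64_t raw = ptep->val;
	toy_pte_t pte = { 0 };

	if (raw & TOY_HUGE_PRESENT)
		pte.val = ((raw >> TOY_HUGE_PFN_SHIFT) << TOY_PTE_PFN_SHIFT) |
			  TOY_PTE_PRESENT;
	return pte;
}

/* Decoders that only understand the regular pte layout. */
uint64_t toy_pte_pfn(toy_pte_t pte)
{
	return pte.val >> TOY_PTE_PFN_SHIFT;
}

int toy_pte_present(toy_pte_t pte)
{
	return (pte.val & TOY_PTE_PRESENT) != 0;
}

/* Build a present huge entry for the given PFN in the huge layout. */
toy_pte_t toy_make_huge_entry(uint64_t pfn)
{
	toy_pte_t e = { (pfn << TOY_HUGE_PFN_SHIFT) | TOY_HUGE_PRESENT };
	return e;
}
```

In this model, decoding an entry obtained via toy_huge_ptep_get() yields the original PFN and reports it present. Decoding the raw value from toy_ptep_get() yields neither.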
^ permalink raw reply [flat|nested] 23+ messages in thread