* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 10:07 ` David Hildenbrand (Arm)
@ 2026-06-26 10:42 ` Lorenzo Stoakes
2026-06-26 11:31 ` David Hildenbrand (Arm)
2026-06-26 13:27 ` Lance Yang
` (2 subsequent siblings)
3 siblings, 1 reply; 24+ messages in thread
From: Lorenzo Stoakes @ 2026-06-26 10:42 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Wei Yang, akpm, riel, liam, vbabka, harry, jannh, ziy, sj,
balbirs, linux-mm, linux-kernel, stable, Lance Yang
On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
> On 6/24/26 08:53, Wei Yang wrote:
> > Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
> > device-private entries") introduced the concept of device-private
> > PMD entries, but did not correctly update the rmap walk code to
> > account for them.
> >
> > As a result, when page_vma_mapped_walk() encounters device-private
> > PMD entries, it takes no action other than to acquire the PMD lock
> > and exit.
> >
> > However this is highly problematic for two reasons - firstly,
> > device private entries possess a PFN so check_pmd() needs to be
> > called to ensure an overlapping PFN range.
> >
> > Secondly, and more importantly, if PVMW_MIGRATION is set the
> > caller assumes the returned entry is a migration entry, resulting
> > in memory corruption when the caller tries to interpret the device
> > private entry as such.
> >
> > In addition, commit 146287290023 ("mm/huge_memory: implement
> > device-private THP splitting") allowed device private PMDs to be
> > split like THP mappings, but again did not update this code path.
> >
> > As a result, we might race a PMD split prior to acquiring the PMD
> > lock.
> >
> > This patch addresses all of these issues by invoking check_pmd(),
> > ensuring PMVW_MIGRATION is not set and checks whether a split raced
> > us we do for PMD THP and migration entries.
> >
> > Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> > Suggested-by: David Hildenbrand <david@kernel.org>
> > Cc: David Hildenbrand <david@kernel.org>
> > Cc: Balbir Singh <balbirs@nvidia.com>
> > Cc: SeongJae Park <sj@kernel.org>
> > Cc: Zi Yan <ziy@nvidia.com>
> > Cc: Lorenzo Stoakes <ljs@kernel.org>
> > Cc: Lance Yang <lance.yang@linux.dev>
> >
> > ---
> > v4:
> > * refine subject and commit log based on Lorenzo's suggestion
> > * put pmd device-private entry handling in its own if branch,
> > suggested by Lorenzo
> >
> > v3:
> > * remove cleanup part, only fix the issue for device-private entry
> > * refine user effect description based on Lorenzo's suggestion
> >
> > v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
> > * specify the possible error case of current code and user visible effect
> > * besides fix, cleanup the pmd entry handling based on David's suggestion
> >
> > v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
> > ---
> > mm/page_vma_mapped.c | 20 +++++++++++++++-----
> > 1 file changed, 15 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> > index 2ccbabfb2cc1..17dff8aab9f9 100644
> > --- a/mm/page_vma_mapped.c
> > +++ b/mm/page_vma_mapped.c
> > @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> > /* THP pmd was split under us: handle on pte level */
> > spin_unlock(pvmw->ptl);
> > pvmw->ptl = NULL;
> > - } else if (!pmd_present(pmde)) {
> > - const softleaf_t entry = softleaf_from_pmd(pmde);
> > + } else if (pmd_is_device_private_entry(pmde)) {
> > + softleaf_t entry;
> > +
> > + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> > + pmde = *pvmw->pmd;
> > + entry = softleaf_from_pmd(pmde);
> >
> > - if (softleaf_is_device_private(entry)) {
> > - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> > + if (likely(softleaf_is_device_private(entry))) {
> > + if (pvmw->flags & PVMW_MIGRATION)
> > + return not_found(pvmw);
> > + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> > + return not_found(pvmw);
> > return true;
> > }
> > -
> > + /* device-private pmd was split under us: handle on pte level */
> > + spin_unlock(pvmw->ptl);
> > + pvmw->ptl = NULL;
> > + } else if (!pmd_present(pmde)) {
> > if ((pvmw->flags & PVMW_SYNC) &&
> > thp_vma_suitable_order(vma, pvmw->address,
> > PMD_ORDER) &&
>
> This is extremely hard to review given the existing crap handling here. I'm
> really sorry, but it makes my head hurt (I'm not kidding :) ).
>
> It's completely unclear why we only have to check for a subset of the cases
> after taking the lock.
>
> Could we simply extend the existing migration pmd handling and leave the
> !pmd_present() case for pmd_none()?
>
> That leaves no question to "which transitions are actually allowed", including
> "could we accidentally assume something is a page table when really it isn't".
>
>
> So what about something like the following?
>
> The "thp_migration_supported()" is not required when checking for
> pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>
> Untested:
>
>
> From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
> From: "David Hildenbrand (Arm)" <david@kernel.org>
> Date: Fri, 26 Jun 2026 12:03:40 +0200
> Subject: [PATCH] tmp
>
> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> ---
> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> */
> pmde = pmdp_get_lockless(pvmw->pmd);
>
> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
> + pmd_is_device_private_entry(pmde)) {
> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> pmde = *pvmw->pmd;
> - if (!pmd_present(pmde)) {
> + if (pmd_is_migration_entry(pmde)) {
> softleaf_t entry;
>
> - if (!thp_migration_supported() ||
Do we care about this? Or is !tmp_migration_supported() -> implies you
wouldn't see a migration entry here anyway?
Maybe worth a VM_WARN_ON_ONCE()?
> - !(pvmw->flags & PVMW_MIGRATION))
> + if (!(pvmw->flags & PVMW_MIGRATION))
> return not_found(pvmw);
> entry = softleaf_from_pmd(pmde);
> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> + return not_found(pvmw);
> + return true;
> + } else if (pmd_is_device_private_entry(pmde)) {
> + softleaf_t entry;
>
> - if (!softleaf_is_migration(entry) ||
> - !check_pmd(softleaf_to_pfn(entry), pvmw))
> + if (pvmw->flags & PVMW_MIGRATION)
> + return not_found(pvmw);
> + entry = softleaf_from_pmd(pmde);
> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> return not_found(pvmw);
> return true;
> + } else if (!pmd_present(pmde) ){
> + return not_found(pvmw);
> }
> if (likely(pmd_trans_huge(pmde))) {
> if (pvmw->flags & PVMW_MIGRATION)
> @@ -270,12 +280,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> spin_unlock(pvmw->ptl);
> pvmw->ptl = NULL;
> } else if (!pmd_present(pmde)) {
> - const softleaf_t entry = softleaf_from_pmd(pmde);
> -
> - if (softleaf_is_device_private(entry)) {
> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> - return true;
> - }
>
> if ((pvmw->flags & PVMW_SYNC) &&
> thp_vma_suitable_order(vma, pvmw->address,
Overall though this seems fine to me.
> --
> 2.43.0
>
>
> --
> Cheers,
>
> David
Thanks, Lorenzo
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 10:42 ` Lorenzo Stoakes
@ 2026-06-26 11:31 ` David Hildenbrand (Arm)
2026-06-26 13:24 ` Zi Yan
0 siblings, 1 reply; 24+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-26 11:31 UTC (permalink / raw)
To: Lorenzo Stoakes
Cc: Wei Yang, akpm, riel, liam, vbabka, harry, jannh, ziy, sj,
balbirs, linux-mm, linux-kernel, stable, Lance Yang
On 6/26/26 12:42, Lorenzo Stoakes wrote:
> On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>> On 6/24/26 08:53, Wei Yang wrote:
>>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>>> device-private entries") introduced the concept of device-private
>>> PMD entries, but did not correctly update the rmap walk code to
>>> account for them.
>>>
>>> As a result, when page_vma_mapped_walk() encounters device-private
>>> PMD entries, it takes no action other than to acquire the PMD lock
>>> and exit.
>>>
>>> However this is highly problematic for two reasons - firstly,
>>> device private entries possess a PFN so check_pmd() needs to be
>>> called to ensure an overlapping PFN range.
>>>
>>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>>> caller assumes the returned entry is a migration entry, resulting
>>> in memory corruption when the caller tries to interpret the device
>>> private entry as such.
>>>
>>> In addition, commit 146287290023 ("mm/huge_memory: implement
>>> device-private THP splitting") allowed device private PMDs to be
>>> split like THP mappings, but again did not update this code path.
>>>
>>> As a result, we might race a PMD split prior to acquiring the PMD
>>> lock.
>>>
>>> This patch addresses all of these issues by invoking check_pmd(),
>>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>>> us we do for PMD THP and migration entries.
>>>
>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Suggested-by: David Hildenbrand <david@kernel.org>
>>> Cc: David Hildenbrand <david@kernel.org>
>>> Cc: Balbir Singh <balbirs@nvidia.com>
>>> Cc: SeongJae Park <sj@kernel.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>>> Cc: Lance Yang <lance.yang@linux.dev>
>>>
>>> ---
>>> v4:
>>> * refine subject and commit log based on Lorenzo's suggestion
>>> * put pmd device-private entry handling in its own if branch,
>>> suggested by Lorenzo
>>>
>>> v3:
>>> * remove cleanup part, only fix the issue for device-private entry
>>> * refine user effect description based on Lorenzo's suggestion
>>>
>>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>>> * specify the possible error case of current code and user visible effect
>>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>>
>>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>>> ---
>>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>> /* THP pmd was split under us: handle on pte level */
>>> spin_unlock(pvmw->ptl);
>>> pvmw->ptl = NULL;
>>> - } else if (!pmd_present(pmde)) {
>>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>>> + } else if (pmd_is_device_private_entry(pmde)) {
>>> + softleaf_t entry;
>>> +
>>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + pmde = *pvmw->pmd;
>>> + entry = softleaf_from_pmd(pmde);
>>>
>>> - if (softleaf_is_device_private(entry)) {
>>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + if (likely(softleaf_is_device_private(entry))) {
>>> + if (pvmw->flags & PVMW_MIGRATION)
>>> + return not_found(pvmw);
>>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>>> + return not_found(pvmw);
>>> return true;
>>> }
>>> -
>>> + /* device-private pmd was split under us: handle on pte level */
>>> + spin_unlock(pvmw->ptl);
>>> + pvmw->ptl = NULL;
>>> + } else if (!pmd_present(pmde)) {
>>> if ((pvmw->flags & PVMW_SYNC) &&
>>> thp_vma_suitable_order(vma, pvmw->address,
>>> PMD_ORDER) &&
>>
>> This is extremely hard to review given the existing crap handling here. I'm
>> really sorry, but it makes my head hurt (I'm not kidding :) ).
>>
>> It's completely unclear why we only have to check for a subset of the cases
>> after taking the lock.
>>
>> Could we simply extend the existing migration pmd handling and leave the
>> !pmd_present() case for pmd_none()?
>>
>> That leaves no question to "which transitions are actually allowed", including
>> "could we accidentally assume something is a page table when really it isn't".
>>
>>
>> So what about something like the following?
>>
>> The "thp_migration_supported()" is not required when checking for
>> pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>>
>> Untested:
>>
>>
>> From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Arm)" <david@kernel.org>
>> Date: Fri, 26 Jun 2026 12:03:40 +0200
>> Subject: [PATCH] tmp
>>
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>> ---
>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> */
>> pmde = pmdp_get_lockless(pvmw->pmd);
>>
>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>> + pmd_is_device_private_entry(pmde)) {
>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> pmde = *pvmw->pmd;
>> - if (!pmd_present(pmde)) {
>> + if (pmd_is_migration_entry(pmde)) {
>> softleaf_t entry;
>>
>> - if (!thp_migration_supported() ||
>
> Do we care about this? Or is !tmp_migration_supported() -> implies you
> wouldn't see a migration entry here anyway?
Yeah, I noted above
"The "thp_migration_supported()" is not required when checking for
pmd_is_migration_entry(), as that defaults to "false" when not compiled in."
Given that
tmp_migration_supported() -> IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);$
And
pmd_is_migration_entry() -> softleaf_is_migration(softleaf_from_pmd(pmd));
whereby softleaf_from_pmd() only returns something non-none for
CONFIG_ARCH_ENABLE_THP_MIGRATION.
>
> Maybe worth a VM_WARN_ON_ONCE()?
I think it was primarily a a hack to slightly optimize code generated for
!CONFIG_ARCH_ENABLE_THP_MIGRATION, not really something for correctness as it seems.
So I think we can safely drop it. :)
--
Cheers,
David
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 11:31 ` David Hildenbrand (Arm)
@ 2026-06-26 13:24 ` Zi Yan
2026-06-26 13:32 ` Lorenzo Stoakes
0 siblings, 1 reply; 24+ messages in thread
From: Zi Yan @ 2026-06-26 13:24 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Lorenzo Stoakes, Wei Yang, akpm, riel, liam, vbabka, harry, jannh,
sj, balbirs, linux-mm, linux-kernel, stable, Lance Yang
On 26 Jun 2026, at 7:31, David Hildenbrand (Arm) wrote:
> On 6/26/26 12:42, Lorenzo Stoakes wrote:
>> On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>>> On 6/24/26 08:53, Wei Yang wrote:
>>>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>>>> device-private entries") introduced the concept of device-private
>>>> PMD entries, but did not correctly update the rmap walk code to
>>>> account for them.
>>>>
>>>> As a result, when page_vma_mapped_walk() encounters device-private
>>>> PMD entries, it takes no action other than to acquire the PMD lock
>>>> and exit.
>>>>
>>>> However this is highly problematic for two reasons - firstly,
>>>> device private entries possess a PFN so check_pmd() needs to be
>>>> called to ensure an overlapping PFN range.
>>>>
>>>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>>>> caller assumes the returned entry is a migration entry, resulting
>>>> in memory corruption when the caller tries to interpret the device
>>>> private entry as such.
>>>>
>>>> In addition, commit 146287290023 ("mm/huge_memory: implement
>>>> device-private THP splitting") allowed device private PMDs to be
>>>> split like THP mappings, but again did not update this code path.
>>>>
>>>> As a result, we might race a PMD split prior to acquiring the PMD
>>>> lock.
>>>>
>>>> This patch addresses all of these issues by invoking check_pmd(),
>>>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>>>> us we do for PMD THP and migration entries.
>>>>
>>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>>>> Cc: <stable@vger.kernel.org>
>>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>>> Suggested-by: David Hildenbrand <david@kernel.org>
>>>> Cc: David Hildenbrand <david@kernel.org>
>>>> Cc: Balbir Singh <balbirs@nvidia.com>
>>>> Cc: SeongJae Park <sj@kernel.org>
>>>> Cc: Zi Yan <ziy@nvidia.com>
>>>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>>>> Cc: Lance Yang <lance.yang@linux.dev>
>>>>
>>>> ---
>>>> v4:
>>>> * refine subject and commit log based on Lorenzo's suggestion
>>>> * put pmd device-private entry handling in its own if branch,
>>>> suggested by Lorenzo
>>>>
>>>> v3:
>>>> * remove cleanup part, only fix the issue for device-private entry
>>>> * refine user effect description based on Lorenzo's suggestion
>>>>
>>>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>>>> * specify the possible error case of current code and user visible effect
>>>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>>>
>>>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>>>> ---
>>>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>>
>>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>>>> --- a/mm/page_vma_mapped.c
>>>> +++ b/mm/page_vma_mapped.c
>>>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>>> /* THP pmd was split under us: handle on pte level */
>>>> spin_unlock(pvmw->ptl);
>>>> pvmw->ptl = NULL;
>>>> - } else if (!pmd_present(pmde)) {
>>>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>>>> + } else if (pmd_is_device_private_entry(pmde)) {
>>>> + softleaf_t entry;
>>>> +
>>>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>>> + pmde = *pvmw->pmd;
>>>> + entry = softleaf_from_pmd(pmde);
>>>>
>>>> - if (softleaf_is_device_private(entry)) {
>>>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>>> + if (likely(softleaf_is_device_private(entry))) {
>>>> + if (pvmw->flags & PVMW_MIGRATION)
>>>> + return not_found(pvmw);
>>>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>>>> + return not_found(pvmw);
>>>> return true;
>>>> }
>>>> -
>>>> + /* device-private pmd was split under us: handle on pte level */
>>>> + spin_unlock(pvmw->ptl);
>>>> + pvmw->ptl = NULL;
>>>> + } else if (!pmd_present(pmde)) {
>>>> if ((pvmw->flags & PVMW_SYNC) &&
>>>> thp_vma_suitable_order(vma, pvmw->address,
>>>> PMD_ORDER) &&
>>>
>>> This is extremely hard to review given the existing crap handling here. I'm
>>> really sorry, but it makes my head hurt (I'm not kidding :) ).
>>>
>>> It's completely unclear why we only have to check for a subset of the cases
>>> after taking the lock.
>>>
>>> Could we simply extend the existing migration pmd handling and leave the
>>> !pmd_present() case for pmd_none()?
>>>
>>> That leaves no question to "which transitions are actually allowed", including
>>> "could we accidentally assume something is a page table when really it isn't".
>>>
>>>
>>> So what about something like the following?
>>>
>>> The "thp_migration_supported()" is not required when checking for
>>> pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>>>
>>> Untested:
>>>
>>>
>>> From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>>> From: "David Hildenbrand (Arm)" <david@kernel.org>
>>> Date: Fri, 26 Jun 2026 12:03:40 +0200
>>> Subject: [PATCH] tmp
>>>
>>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>>> ---
>>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
>>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>> */
>>> pmde = pmdp_get_lockless(pvmw->pmd);
>>>
>>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>>> + pmd_is_device_private_entry(pmde)) {
>>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> pmde = *pvmw->pmd;
>>> - if (!pmd_present(pmde)) {
>>> + if (pmd_is_migration_entry(pmde)) {
>>> softleaf_t entry;
>>>
>>> - if (!thp_migration_supported() ||
>>
>> Do we care about this? Or is !tmp_migration_supported() -> implies you
>> wouldn't see a migration entry here anyway?
>
> Yeah, I noted above
>
> "The "thp_migration_supported()" is not required when checking for
> pmd_is_migration_entry(), as that defaults to "false" when not compiled in."
>
> Given that
>
> tmp_migration_supported() -> IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);$
>
> And
>
> pmd_is_migration_entry() -> softleaf_is_migration(softleaf_from_pmd(pmd));
>
> whereby softleaf_from_pmd() only returns something non-none for
> CONFIG_ARCH_ENABLE_THP_MIGRATION.
>
>>
>> Maybe worth a VM_WARN_ON_ONCE()?
>
> I think it was primarily a a hack to slightly optimize code generated for
> !CONFIG_ARCH_ENABLE_THP_MIGRATION, not really something for correctness as it seems.
>
> So I think we can safely drop it. :)
thp_migration_supported() here is legacy code[1] from v4.14 when I added
the THP migration support. IIRC, the purpose was to avoid checking
PMD migration entry if the support is not enabled, but looking at it again
today, that thp_migration_supported() is unnecessary since
is_migration_entry(pmd_to_swp_entry(*pvmw->pmd)) returns false if
!CONFIG_ARCH_ENABLE_THP_MIGRATION.
[1] https://elixir.bootlin.com/linux/v4.14/source/mm/page_vma_mapped.c#L157
Best Regards,
Yan, Zi
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 13:24 ` Zi Yan
@ 2026-06-26 13:32 ` Lorenzo Stoakes
0 siblings, 0 replies; 24+ messages in thread
From: Lorenzo Stoakes @ 2026-06-26 13:32 UTC (permalink / raw)
To: Zi Yan
Cc: David Hildenbrand (Arm), Wei Yang, akpm, riel, liam, vbabka,
harry, jannh, sj, balbirs, linux-mm, linux-kernel, stable,
Lance Yang
On Fri, Jun 26, 2026 at 09:24:06AM -0400, Zi Yan wrote:
> On 26 Jun 2026, at 7:31, David Hildenbrand (Arm) wrote:
>
> > On 6/26/26 12:42, Lorenzo Stoakes wrote:
> >> On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
> >>> On 6/24/26 08:53, Wei Yang wrote:
> >>>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
> >>>> device-private entries") introduced the concept of device-private
> >>>> PMD entries, but did not correctly update the rmap walk code to
> >>>> account for them.
> >>>>
> >>>> As a result, when page_vma_mapped_walk() encounters device-private
> >>>> PMD entries, it takes no action other than to acquire the PMD lock
> >>>> and exit.
> >>>>
> >>>> However this is highly problematic for two reasons - firstly,
> >>>> device private entries possess a PFN so check_pmd() needs to be
> >>>> called to ensure an overlapping PFN range.
> >>>>
> >>>> Secondly, and more importantly, if PVMW_MIGRATION is set the
> >>>> caller assumes the returned entry is a migration entry, resulting
> >>>> in memory corruption when the caller tries to interpret the device
> >>>> private entry as such.
> >>>>
> >>>> In addition, commit 146287290023 ("mm/huge_memory: implement
> >>>> device-private THP splitting") allowed device private PMDs to be
> >>>> split like THP mappings, but again did not update this code path.
> >>>>
> >>>> As a result, we might race a PMD split prior to acquiring the PMD
> >>>> lock.
> >>>>
> >>>> This patch addresses all of these issues by invoking check_pmd(),
> >>>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
> >>>> us we do for PMD THP and migration entries.
> >>>>
> >>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
> >>>> Cc: <stable@vger.kernel.org>
> >>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >>>> Suggested-by: David Hildenbrand <david@kernel.org>
> >>>> Cc: David Hildenbrand <david@kernel.org>
> >>>> Cc: Balbir Singh <balbirs@nvidia.com>
> >>>> Cc: SeongJae Park <sj@kernel.org>
> >>>> Cc: Zi Yan <ziy@nvidia.com>
> >>>> Cc: Lorenzo Stoakes <ljs@kernel.org>
> >>>> Cc: Lance Yang <lance.yang@linux.dev>
> >>>>
> >>>> ---
> >>>> v4:
> >>>> * refine subject and commit log based on Lorenzo's suggestion
> >>>> * put pmd device-private entry handling in its own if branch,
> >>>> suggested by Lorenzo
> >>>>
> >>>> v3:
> >>>> * remove cleanup part, only fix the issue for device-private entry
> >>>> * refine user effect description based on Lorenzo's suggestion
> >>>>
> >>>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
> >>>> * specify the possible error case of current code and user visible effect
> >>>> * besides fix, cleanup the pmd entry handling based on David's suggestion
> >>>>
> >>>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
> >>>> ---
> >>>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
> >>>> 1 file changed, 15 insertions(+), 5 deletions(-)
> >>>>
> >>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >>>> index 2ccbabfb2cc1..17dff8aab9f9 100644
> >>>> --- a/mm/page_vma_mapped.c
> >>>> +++ b/mm/page_vma_mapped.c
> >>>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> >>>> /* THP pmd was split under us: handle on pte level */
> >>>> spin_unlock(pvmw->ptl);
> >>>> pvmw->ptl = NULL;
> >>>> - } else if (!pmd_present(pmde)) {
> >>>> - const softleaf_t entry = softleaf_from_pmd(pmde);
> >>>> + } else if (pmd_is_device_private_entry(pmde)) {
> >>>> + softleaf_t entry;
> >>>> +
> >>>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >>>> + pmde = *pvmw->pmd;
> >>>> + entry = softleaf_from_pmd(pmde);
> >>>>
> >>>> - if (softleaf_is_device_private(entry)) {
> >>>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >>>> + if (likely(softleaf_is_device_private(entry))) {
> >>>> + if (pvmw->flags & PVMW_MIGRATION)
> >>>> + return not_found(pvmw);
> >>>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> >>>> + return not_found(pvmw);
> >>>> return true;
> >>>> }
> >>>> -
> >>>> + /* device-private pmd was split under us: handle on pte level */
> >>>> + spin_unlock(pvmw->ptl);
> >>>> + pvmw->ptl = NULL;
> >>>> + } else if (!pmd_present(pmde)) {
> >>>> if ((pvmw->flags & PVMW_SYNC) &&
> >>>> thp_vma_suitable_order(vma, pvmw->address,
> >>>> PMD_ORDER) &&
> >>>
> >>> This is extremely hard to review given the existing crap handling here. I'm
> >>> really sorry, but it makes my head hurt (I'm not kidding :) ).
> >>>
> >>> It's completely unclear why we only have to check for a subset of the cases
> >>> after taking the lock.
> >>>
> >>> Could we simply extend the existing migration pmd handling and leave the
> >>> !pmd_present() case for pmd_none()?
> >>>
> >>> That leaves no question to "which transitions are actually allowed", including
> >>> "could we accidentally assume something is a page table when really it isn't".
> >>>
> >>>
> >>> So what about something like the following?
> >>>
> >>> The "thp_migration_supported()" is not required when checking for
> >>> pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
> >>>
> >>> Untested:
> >>>
> >>>
> >>> From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
> >>> From: "David Hildenbrand (Arm)" <david@kernel.org>
> >>> Date: Fri, 26 Jun 2026 12:03:40 +0200
> >>> Subject: [PATCH] tmp
> >>>
> >>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
> >>> ---
> >>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
> >>> 1 file changed, 17 insertions(+), 12 deletions(-)
> >>>
> >>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >>> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
> >>> --- a/mm/page_vma_mapped.c
> >>> +++ b/mm/page_vma_mapped.c
> >>> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> >>> */
> >>> pmde = pmdp_get_lockless(pvmw->pmd);
> >>>
> >>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
> >>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
> >>> + pmd_is_device_private_entry(pmde)) {
> >>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >>> pmde = *pvmw->pmd;
> >>> - if (!pmd_present(pmde)) {
> >>> + if (pmd_is_migration_entry(pmde)) {
> >>> softleaf_t entry;
> >>>
> >>> - if (!thp_migration_supported() ||
> >>
> >> Do we care about this? Or is !tmp_migration_supported() -> implies you
> >> wouldn't see a migration entry here anyway?
> >
> > Yeah, I noted above
> >
> > "The "thp_migration_supported()" is not required when checking for
> > pmd_is_migration_entry(), as that defaults to "false" when not compiled in."
> >
> > Given that
> >
> > tmp_migration_supported() -> IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);$
> >
> > And
> >
> > pmd_is_migration_entry() -> softleaf_is_migration(softleaf_from_pmd(pmd));
> >
> > whereby softleaf_from_pmd() only returns something non-none for
> > CONFIG_ARCH_ENABLE_THP_MIGRATION.
> >
> >>
> >> Maybe worth a VM_WARN_ON_ONCE()?
> >
> > I think it was primarily a a hack to slightly optimize code generated for
> > !CONFIG_ARCH_ENABLE_THP_MIGRATION, not really something for correctness as it seems.
> >
> > So I think we can safely drop it. :)
>
> thp_migration_supported() here is legacy code[1] from v4.14 when I added
> the THP migration support. IIRC, the purpose was to avoid checking
> PMD migration entry if the support is not enabled, but looking at it again
> today, that thp_migration_supported() is unnecessary since
> is_migration_entry(pmd_to_swp_entry(*pvmw->pmd)) returns false if
> !CONFIG_ARCH_ENABLE_THP_MIGRATION.
>
> [1] https://elixir.bootlin.com/linux/v4.14/source/mm/page_vma_mapped.c#L157
Thanks guys, let's drop it then!
>
> Best Regards,
> Yan, Zi
Cheers, Lorenzo
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 10:07 ` David Hildenbrand (Arm)
2026-06-26 10:42 ` Lorenzo Stoakes
@ 2026-06-26 13:27 ` Lance Yang
2026-06-26 13:51 ` David Hildenbrand (Arm)
2026-06-27 0:38 ` Wei Yang
2026-06-27 0:04 ` Wei Yang
2026-06-27 2:07 ` Wei Yang
3 siblings, 2 replies; 24+ messages in thread
From: Lance Yang @ 2026-06-26 13:27 UTC (permalink / raw)
To: david
Cc: richard.weiyang, akpm, ljs, riel, liam, vbabka, harry, jannh, ziy,
sj, balbirs, linux-mm, linux-kernel, stable, lance.yang
On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>On 6/24/26 08:53, Wei Yang wrote:
>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>> device-private entries") introduced the concept of device-private
>> PMD entries, but did not correctly update the rmap walk code to
>> account for them.
>>
>> As a result, when page_vma_mapped_walk() encounters device-private
>> PMD entries, it takes no action other than to acquire the PMD lock
>> and exit.
>>
>> However this is highly problematic for two reasons - firstly,
>> device private entries possess a PFN so check_pmd() needs to be
>> called to ensure an overlapping PFN range.
>>
>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>> caller assumes the returned entry is a migration entry, resulting
>> in memory corruption when the caller tries to interpret the device
>> private entry as such.
>>
>> In addition, commit 146287290023 ("mm/huge_memory: implement
>> device-private THP splitting") allowed device private PMDs to be
>> split like THP mappings, but again did not update this code path.
>>
>> As a result, we might race a PMD split prior to acquiring the PMD
>> lock.
>>
>> This patch addresses all of these issues by invoking check_pmd(),
>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>> us we do for PMD THP and migration entries.
>>
>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Suggested-by: David Hildenbrand <david@kernel.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Balbir Singh <balbirs@nvidia.com>
>> Cc: SeongJae Park <sj@kernel.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>> Cc: Lance Yang <lance.yang@linux.dev>
>>
>> ---
>> v4:
>> * refine subject and commit log based on Lorenzo's suggestion
>> * put pmd device-private entry handling in its own if branch,
>> suggested by Lorenzo
>>
>> v3:
>> * remove cleanup part, only fix the issue for device-private entry
>> * refine user effect description based on Lorenzo's suggestion
>>
>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>> * specify the possible error case of current code and user visible effect
>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>
>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>> ---
>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> /* THP pmd was split under us: handle on pte level */
>> spin_unlock(pvmw->ptl);
>> pvmw->ptl = NULL;
>> - } else if (!pmd_present(pmde)) {
>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>> + } else if (pmd_is_device_private_entry(pmde)) {
>> + softleaf_t entry;
>> +
>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> + pmde = *pvmw->pmd;
>> + entry = softleaf_from_pmd(pmde);
>>
>> - if (softleaf_is_device_private(entry)) {
>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> + if (likely(softleaf_is_device_private(entry))) {
>> + if (pvmw->flags & PVMW_MIGRATION)
>> + return not_found(pvmw);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> + return not_found(pvmw);
>> return true;
>> }
>> -
>> + /* device-private pmd was split under us: handle on pte level */
>> + spin_unlock(pvmw->ptl);
>> + pvmw->ptl = NULL;
>> + } else if (!pmd_present(pmde)) {
>> if ((pvmw->flags & PVMW_SYNC) &&
>> thp_vma_suitable_order(vma, pvmw->address,
>> PMD_ORDER) &&
>
>This is extremely hard to review given the existing crap handling here. I'm
>really sorry, but it makes my head hurt (I'm not kidding :) ).
>
>It's completely unclear why we only have to check for a subset of the cases
>after taking the lock.
>
>Could we simply extend the existing migration pmd handling and leave the
>!pmd_present() case for pmd_none()?
>
>That leaves no question to "which transitions are actually allowed", including
>"could we accidentally assume something is a page table when really it isn't".
>
>
>So what about something like the following?
>
>The "thp_migration_supported()" is not required when checking for
>pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>
>Untested:
>
>
>>From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>From: "David Hildenbrand (Arm)" <david@kernel.org>
>Date: Fri, 26 Jun 2026 12:03:40 +0200
>Subject: [PATCH] tmp
>
>Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>---
> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> */
> pmde = pmdp_get_lockless(pvmw->pmd);
>
>- if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>+ if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>+ pmd_is_device_private_entry(pmde)) {
> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> pmde = *pvmw->pmd;
>- if (!pmd_present(pmde)) {
>+ if (pmd_is_migration_entry(pmde)) {
> softleaf_t entry;
>
>- if (!thp_migration_supported() ||
>- !(pvmw->flags & PVMW_MIGRATION))
>+ if (!(pvmw->flags & PVMW_MIGRATION))
> return not_found(pvmw);
> entry = softleaf_from_pmd(pmde);
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+ return not_found(pvmw);
>+ return true;
>+ } else if (pmd_is_device_private_entry(pmde)) {
>+ softleaf_t entry;
>
>- if (!softleaf_is_migration(entry) ||
>- !check_pmd(softleaf_to_pfn(entry), pvmw))
>+ if (pvmw->flags & PVMW_MIGRATION)
>+ return not_found(pvmw);
>+ entry = softleaf_from_pmd(pmde);
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> return not_found(pvmw);
> return true;
>+ } else if (!pmd_present(pmde) ){
>+ return not_found(pvmw);
> }
> if (likely(pmd_trans_huge(pmde))) {
> if (pvmw->flags & PVMW_MIGRATION)
>@@ -270,12 +280,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> spin_unlock(pvmw->ptl);
> pvmw->ptl = NULL;
> } else if (!pmd_present(pmde)) {
>- const softleaf_t entry = softleaf_from_pmd(pmde);
>-
>- if (softleaf_is_device_private(entry)) {
>- pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>- return true;
>- }
>
> if ((pvmw->flags & PVMW_SYNC) &&
> thp_vma_suitable_order(vma, pvmw->address,
>--
Might be good with this on top:
---8<---
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index cfa1230c87bb..8b7c062bd81d 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -281,7 +281,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
return not_found(pvmw);
return true;
}
- /* THP pmd was split under us: handle on pte level */
+ /* THP/device-private pmd was split under us: handle on pte level */
spin_unlock(pvmw->ptl);
pvmw->ptl = NULL;
} else if (!pmd_present(pmde)) {
--
Looks good to me as well, thanks!
^ permalink raw reply related [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 13:27 ` Lance Yang
@ 2026-06-26 13:51 ` David Hildenbrand (Arm)
2026-06-27 0:38 ` Wei Yang
1 sibling, 0 replies; 24+ messages in thread
From: David Hildenbrand (Arm) @ 2026-06-26 13:51 UTC (permalink / raw)
To: Lance Yang
Cc: richard.weiyang, akpm, ljs, riel, liam, vbabka, harry, jannh, ziy,
sj, balbirs, linux-mm, linux-kernel, stable
On 6/26/26 15:27, Lance Yang wrote:
>
> On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>> On 6/24/26 08:53, Wei Yang wrote:
>>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>>> device-private entries") introduced the concept of device-private
>>> PMD entries, but did not correctly update the rmap walk code to
>>> account for them.
>>>
>>> As a result, when page_vma_mapped_walk() encounters device-private
>>> PMD entries, it takes no action other than to acquire the PMD lock
>>> and exit.
>>>
>>> However this is highly problematic for two reasons - firstly,
>>> device private entries possess a PFN so check_pmd() needs to be
>>> called to ensure an overlapping PFN range.
>>>
>>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>>> caller assumes the returned entry is a migration entry, resulting
>>> in memory corruption when the caller tries to interpret the device
>>> private entry as such.
>>>
>>> In addition, commit 146287290023 ("mm/huge_memory: implement
>>> device-private THP splitting") allowed device private PMDs to be
>>> split like THP mappings, but again did not update this code path.
>>>
>>> As a result, we might race a PMD split prior to acquiring the PMD
>>> lock.
>>>
>>> This patch addresses all of these issues by invoking check_pmd(),
>>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>>> us we do for PMD THP and migration entries.
>>>
>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Suggested-by: David Hildenbrand <david@kernel.org>
>>> Cc: David Hildenbrand <david@kernel.org>
>>> Cc: Balbir Singh <balbirs@nvidia.com>
>>> Cc: SeongJae Park <sj@kernel.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>>> Cc: Lance Yang <lance.yang@linux.dev>
>>>
>>> ---
>>> v4:
>>> * refine subject and commit log based on Lorenzo's suggestion
>>> * put pmd device-private entry handling in its own if branch,
>>> suggested by Lorenzo
>>>
>>> v3:
>>> * remove cleanup part, only fix the issue for device-private entry
>>> * refine user effect description based on Lorenzo's suggestion
>>>
>>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>>> * specify the possible error case of current code and user visible effect
>>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>>
>>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>>> ---
>>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>> /* THP pmd was split under us: handle on pte level */
>>> spin_unlock(pvmw->ptl);
>>> pvmw->ptl = NULL;
>>> - } else if (!pmd_present(pmde)) {
>>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>>> + } else if (pmd_is_device_private_entry(pmde)) {
>>> + softleaf_t entry;
>>> +
>>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + pmde = *pvmw->pmd;
>>> + entry = softleaf_from_pmd(pmde);
>>>
>>> - if (softleaf_is_device_private(entry)) {
>>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + if (likely(softleaf_is_device_private(entry))) {
>>> + if (pvmw->flags & PVMW_MIGRATION)
>>> + return not_found(pvmw);
>>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>>> + return not_found(pvmw);
>>> return true;
>>> }
>>> -
>>> + /* device-private pmd was split under us: handle on pte level */
>>> + spin_unlock(pvmw->ptl);
>>> + pvmw->ptl = NULL;
>>> + } else if (!pmd_present(pmde)) {
>>> if ((pvmw->flags & PVMW_SYNC) &&
>>> thp_vma_suitable_order(vma, pvmw->address,
>>> PMD_ORDER) &&
>>
>> This is extremely hard to review given the existing crap handling here. I'm
>> really sorry, but it makes my head hurt (I'm not kidding :) ).
>>
>> It's completely unclear why we only have to check for a subset of the cases
>> after taking the lock.
>>
>> Could we simply extend the existing migration pmd handling and leave the
>> !pmd_present() case for pmd_none()?
>>
>> That leaves no question to "which transitions are actually allowed", including
>> "could we accidentally assume something is a page table when really it isn't".
>>
>>
>> So what about something like the following?
>>
>> The "thp_migration_supported()" is not required when checking for
>> pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>>
>> Untested:
>>
>>
>> >From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Arm)" <david@kernel.org>
>> Date: Fri, 26 Jun 2026 12:03:40 +0200
>> Subject: [PATCH] tmp
>>
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>> ---
>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> */
>> pmde = pmdp_get_lockless(pvmw->pmd);
>>
>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>> + pmd_is_device_private_entry(pmde)) {
>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> pmde = *pvmw->pmd;
>> - if (!pmd_present(pmde)) {
>> + if (pmd_is_migration_entry(pmde)) {
>> softleaf_t entry;
>>
>> - if (!thp_migration_supported() ||
>> - !(pvmw->flags & PVMW_MIGRATION))
>> + if (!(pvmw->flags & PVMW_MIGRATION))
>> return not_found(pvmw);
>> entry = softleaf_from_pmd(pmde);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> + return not_found(pvmw);
>> + return true;
>> + } else if (pmd_is_device_private_entry(pmde)) {
>> + softleaf_t entry;
>>
>> - if (!softleaf_is_migration(entry) ||
>> - !check_pmd(softleaf_to_pfn(entry), pvmw))
>> + if (pvmw->flags & PVMW_MIGRATION)
>> + return not_found(pvmw);
>> + entry = softleaf_from_pmd(pmde);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> return not_found(pvmw);
>> return true;
>> + } else if (!pmd_present(pmde) ){
>> + return not_found(pvmw);
>> }
>> if (likely(pmd_trans_huge(pmde))) {
>> if (pvmw->flags & PVMW_MIGRATION)
>> @@ -270,12 +280,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> spin_unlock(pvmw->ptl);
>> pvmw->ptl = NULL;
>> } else if (!pmd_present(pmde)) {
>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>> -
>> - if (softleaf_is_device_private(entry)) {
>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> - return true;
>> - }
>>
>> if ((pvmw->flags & PVMW_SYNC) &&
>> thp_vma_suitable_order(vma, pvmw->address,
>> --
>
> Might be good with this on top:
>
> ---8<---
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index cfa1230c87bb..8b7c062bd81d 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -281,7 +281,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> return not_found(pvmw);
> return true;
> }
> - /* THP pmd was split under us: handle on pte level */
> + /* THP/device-private pmd was split under us: handle on pte level */
> spin_unlock(pvmw->ptl);
> pvmw->ptl = NULL;
> } else if (!pmd_present(pmde)) {
> --
>
> Looks good to me as well, thanks!
Makes sense, thanks!
--
Cheers,
David
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 13:27 ` Lance Yang
2026-06-26 13:51 ` David Hildenbrand (Arm)
@ 2026-06-27 0:38 ` Wei Yang
2026-06-27 2:51 ` Lance Yang
1 sibling, 1 reply; 24+ messages in thread
From: Wei Yang @ 2026-06-27 0:38 UTC (permalink / raw)
To: Lance Yang
Cc: david, richard.weiyang, akpm, ljs, riel, liam, vbabka, harry,
jannh, ziy, sj, balbirs, linux-mm, linux-kernel, stable
On Fri, Jun 26, 2026 at 09:27:28PM +0800, Lance Yang wrote:
>
>On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>>On 6/24/26 08:53, Wei Yang wrote:
>>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>>> device-private entries") introduced the concept of device-private
>>> PMD entries, but did not correctly update the rmap walk code to
>>> account for them.
>>>
>>> As a result, when page_vma_mapped_walk() encounters device-private
>>> PMD entries, it takes no action other than to acquire the PMD lock
>>> and exit.
>>>
>>> However this is highly problematic for two reasons - firstly,
>>> device private entries possess a PFN so check_pmd() needs to be
>>> called to ensure an overlapping PFN range.
>>>
>>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>>> caller assumes the returned entry is a migration entry, resulting
>>> in memory corruption when the caller tries to interpret the device
>>> private entry as such.
>>>
>>> In addition, commit 146287290023 ("mm/huge_memory: implement
>>> device-private THP splitting") allowed device private PMDs to be
>>> split like THP mappings, but again did not update this code path.
>>>
>>> As a result, we might race a PMD split prior to acquiring the PMD
>>> lock.
>>>
>>> This patch addresses all of these issues by invoking check_pmd(),
>>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>>> us we do for PMD THP and migration entries.
>>>
>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>>> Cc: <stable@vger.kernel.org>
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Suggested-by: David Hildenbrand <david@kernel.org>
>>> Cc: David Hildenbrand <david@kernel.org>
>>> Cc: Balbir Singh <balbirs@nvidia.com>
>>> Cc: SeongJae Park <sj@kernel.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>>> Cc: Lance Yang <lance.yang@linux.dev>
>>>
>>> ---
>>> v4:
>>> * refine subject and commit log based on Lorenzo's suggestion
>>> * put pmd device-private entry handling in its own if branch,
>>> suggested by Lorenzo
>>>
>>> v3:
>>> * remove cleanup part, only fix the issue for device-private entry
>>> * refine user effect description based on Lorenzo's suggestion
>>>
>>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>>> * specify the possible error case of current code and user visible effect
>>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>>
>>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>>> ---
>>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>> /* THP pmd was split under us: handle on pte level */
>>> spin_unlock(pvmw->ptl);
>>> pvmw->ptl = NULL;
>>> - } else if (!pmd_present(pmde)) {
>>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>>> + } else if (pmd_is_device_private_entry(pmde)) {
>>> + softleaf_t entry;
>>> +
>>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + pmde = *pvmw->pmd;
>>> + entry = softleaf_from_pmd(pmde);
>>>
>>> - if (softleaf_is_device_private(entry)) {
>>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> + if (likely(softleaf_is_device_private(entry))) {
>>> + if (pvmw->flags & PVMW_MIGRATION)
>>> + return not_found(pvmw);
>>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>>> + return not_found(pvmw);
>>> return true;
>>> }
>>> -
>>> + /* device-private pmd was split under us: handle on pte level */
>>> + spin_unlock(pvmw->ptl);
>>> + pvmw->ptl = NULL;
>>> + } else if (!pmd_present(pmde)) {
>>> if ((pvmw->flags & PVMW_SYNC) &&
>>> thp_vma_suitable_order(vma, pvmw->address,
>>> PMD_ORDER) &&
>>
>>This is extremely hard to review given the existing crap handling here. I'm
>>really sorry, but it makes my head hurt (I'm not kidding :) ).
>>
>>It's completely unclear why we only have to check for a subset of the cases
>>after taking the lock.
>>
>>Could we simply extend the existing migration pmd handling and leave the
>>!pmd_present() case for pmd_none()?
>>
>>That leaves no question to "which transitions are actually allowed", including
>>"could we accidentally assume something is a page table when really it isn't".
>>
>>
>>So what about something like the following?
>>
>>The "thp_migration_supported()" is not required when checking for
>>pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>>
>>Untested:
>>
>>
>>>From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>>From: "David Hildenbrand (Arm)" <david@kernel.org>
>>Date: Fri, 26 Jun 2026 12:03:40 +0200
>>Subject: [PATCH] tmp
>>
>>Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>>---
>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>
>>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>>--- a/mm/page_vma_mapped.c
>>+++ b/mm/page_vma_mapped.c
>>@@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> */
>> pmde = pmdp_get_lockless(pvmw->pmd);
>>
>>- if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>>+ if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>>+ pmd_is_device_private_entry(pmde)) {
>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> pmde = *pvmw->pmd;
>>- if (!pmd_present(pmde)) {
>>+ if (pmd_is_migration_entry(pmde)) {
>> softleaf_t entry;
>>
>>- if (!thp_migration_supported() ||
>>- !(pvmw->flags & PVMW_MIGRATION))
>>+ if (!(pvmw->flags & PVMW_MIGRATION))
>> return not_found(pvmw);
>> entry = softleaf_from_pmd(pmde);
>>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>>+ return not_found(pvmw);
>>+ return true;
>>+ } else if (pmd_is_device_private_entry(pmde)) {
>>+ softleaf_t entry;
>>
>>- if (!softleaf_is_migration(entry) ||
>>- !check_pmd(softleaf_to_pfn(entry), pvmw))
>>+ if (pvmw->flags & PVMW_MIGRATION)
>>+ return not_found(pvmw);
>>+ entry = softleaf_from_pmd(pmde);
>>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> return not_found(pvmw);
>> return true;
>>+ } else if (!pmd_present(pmde) ){
>>+ return not_found(pvmw);
>> }
>> if (likely(pmd_trans_huge(pmde))) {
>> if (pvmw->flags & PVMW_MIGRATION)
>>@@ -270,12 +280,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> spin_unlock(pvmw->ptl);
>> pvmw->ptl = NULL;
>> } else if (!pmd_present(pmde)) {
>>- const softleaf_t entry = softleaf_from_pmd(pmde);
>>-
>>- if (softleaf_is_device_private(entry)) {
>>- pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>- return true;
>>- }
>>
>> if ((pvmw->flags & PVMW_SYNC) &&
>> thp_vma_suitable_order(vma, pvmw->address,
>>--
>
>Might be good with this on top:
>
>---8<---
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index cfa1230c87bb..8b7c062bd81d 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -281,7 +281,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> return not_found(pvmw);
> return true;
> }
>- /* THP pmd was split under us: handle on pte level */
>+ /* THP/device-private pmd was split under us: handle on pte level */
As the comment in commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration
support device-private entries") says:
Add device-private THP support...
Per my understanding, we first already setup mapping and "migrate" to device
memory. This looks a kind of place holder.
Not familiar with this. Just want to clarify, we want to treat device-private
pmd as some sort of THP or not?
> spin_unlock(pvmw->ptl);
> pvmw->ptl = NULL;
> } else if (!pmd_present(pmde)) {
>--
>
>Looks good to me as well, thanks!
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-27 0:38 ` Wei Yang
@ 2026-06-27 2:51 ` Lance Yang
0 siblings, 0 replies; 24+ messages in thread
From: Lance Yang @ 2026-06-27 2:51 UTC (permalink / raw)
To: Wei Yang
Cc: david, akpm, ljs, riel, liam, vbabka, harry, jannh, ziy, sj,
balbirs, linux-mm, linux-kernel, stable
On 2026/6/27 08:38, Wei Yang wrote:
[...]
>>
>> Might be good with this on top:
>>
>> ---8<---
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index cfa1230c87bb..8b7c062bd81d 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -281,7 +281,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> return not_found(pvmw);
>> return true;
>> }
>> - /* THP pmd was split under us: handle on pte level */
>> + /* THP/device-private pmd was split under us: handle on pte level */
>
> As the comment in commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration
> support device-private entries") says:
>
> Add device-private THP support...
>
> Per my understanding, we first already setup mapping and "migrate" to device
> memory. This looks a kind of place holder.
>
> Not familiar with this. Just want to clarify, we want to treat device-private
> pmd as some sort of THP or not?
Not a regular THP, obviously. Just the PMD-sized device-private entry case.
It can be split under us too; see commit 146287290023 ("mm/huge_memory:
implement device-private THP splitting").
Nothing deeper meant here. After taking PTL, if that PMD-sized entry
is gone, just drop to the PTE walk.
>
>> spin_unlock(pvmw->ptl);
>> pvmw->ptl = NULL;
>> } else if (!pmd_present(pmde)) {
>> --
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 10:07 ` David Hildenbrand (Arm)
2026-06-26 10:42 ` Lorenzo Stoakes
2026-06-26 13:27 ` Lance Yang
@ 2026-06-27 0:04 ` Wei Yang
2026-06-27 2:07 ` Wei Yang
3 siblings, 0 replies; 24+ messages in thread
From: Wei Yang @ 2026-06-27 0:04 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Wei Yang, akpm, ljs, riel, liam, vbabka, harry, jannh, ziy, sj,
balbirs, linux-mm, linux-kernel, stable, Lance Yang
On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>On 6/24/26 08:53, Wei Yang wrote:
>> Commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration support
>> device-private entries") introduced the concept of device-private
>> PMD entries, but did not correctly update the rmap walk code to
>> account for them.
>>
>> As a result, when page_vma_mapped_walk() encounters device-private
>> PMD entries, it takes no action other than to acquire the PMD lock
>> and exit.
>>
>> However this is highly problematic for two reasons - firstly,
>> device private entries possess a PFN so check_pmd() needs to be
>> called to ensure an overlapping PFN range.
>>
>> Secondly, and more importantly, if PVMW_MIGRATION is set the
>> caller assumes the returned entry is a migration entry, resulting
>> in memory corruption when the caller tries to interpret the device
>> private entry as such.
>>
>> In addition, commit 146287290023 ("mm/huge_memory: implement
>> device-private THP splitting") allowed device private PMDs to be
>> split like THP mappings, but again did not update this code path.
>>
>> As a result, we might race a PMD split prior to acquiring the PMD
>> lock.
>>
>> This patch addresses all of these issues by invoking check_pmd(),
>> ensuring PMVW_MIGRATION is not set and checks whether a split raced
>> us we do for PMD THP and migration entries.
>>
>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>> Cc: <stable@vger.kernel.org>
>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>> Suggested-by: David Hildenbrand <david@kernel.org>
>> Cc: David Hildenbrand <david@kernel.org>
>> Cc: Balbir Singh <balbirs@nvidia.com>
>> Cc: SeongJae Park <sj@kernel.org>
>> Cc: Zi Yan <ziy@nvidia.com>
>> Cc: Lorenzo Stoakes <ljs@kernel.org>
>> Cc: Lance Yang <lance.yang@linux.dev>
>>
>> ---
>> v4:
>> * refine subject and commit log based on Lorenzo's suggestion
>> * put pmd device-private entry handling in its own if branch,
>> suggested by Lorenzo
>>
>> v3:
>> * remove cleanup part, only fix the issue for device-private entry
>> * refine user effect description based on Lorenzo's suggestion
>>
>> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
>> * specify the possible error case of current code and user visible effect
>> * besides fix, cleanup the pmd entry handling based on David's suggestion
>>
>> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
>> ---
>> mm/page_vma_mapped.c | 20 +++++++++++++++-----
>> 1 file changed, 15 insertions(+), 5 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc1..17dff8aab9f9 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> /* THP pmd was split under us: handle on pte level */
>> spin_unlock(pvmw->ptl);
>> pvmw->ptl = NULL;
>> - } else if (!pmd_present(pmde)) {
>> - const softleaf_t entry = softleaf_from_pmd(pmde);
>> + } else if (pmd_is_device_private_entry(pmde)) {
>> + softleaf_t entry;
>> +
>> + pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> + pmde = *pvmw->pmd;
>> + entry = softleaf_from_pmd(pmde);
>>
>> - if (softleaf_is_device_private(entry)) {
>> - pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> + if (likely(softleaf_is_device_private(entry))) {
>> + if (pvmw->flags & PVMW_MIGRATION)
>> + return not_found(pvmw);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> + return not_found(pvmw);
>> return true;
>> }
>> -
>> + /* device-private pmd was split under us: handle on pte level */
>> + spin_unlock(pvmw->ptl);
>> + pvmw->ptl = NULL;
>> + } else if (!pmd_present(pmde)) {
>> if ((pvmw->flags & PVMW_SYNC) &&
>> thp_vma_suitable_order(vma, pvmw->address,
>> PMD_ORDER) &&
>
>This is extremely hard to review given the existing crap handling here. I'm
>really sorry, but it makes my head hurt (I'm not kidding :) ).
>
>It's completely unclear why we only have to check for a subset of the cases
>after taking the lock.
>
>Could we simply extend the existing migration pmd handling and leave the
>!pmd_present() case for pmd_none()?
>
>That leaves no question to "which transitions are actually allowed", including
>"could we accidentally assume something is a page table when really it isn't".
>
Consolidate all the cases in one place looks reasonable.
And make the logic clearer.
>
>So what about something like the following?
>
>The "thp_migration_supported()" is not required when checking for
>pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>
>Untested:
>
>
>>From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>From: "David Hildenbrand (Arm)" <david@kernel.org>
>Date: Fri, 26 Jun 2026 12:03:40 +0200
>Subject: [PATCH] tmp
>
>Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>---
> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> */
> pmde = pmdp_get_lockless(pvmw->pmd);
>
>- if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>+ if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>+ pmd_is_device_private_entry(pmde)) {
> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> pmde = *pvmw->pmd;
>- if (!pmd_present(pmde)) {
>+ if (pmd_is_migration_entry(pmde)) {
> softleaf_t entry;
>
>- if (!thp_migration_supported() ||
>- !(pvmw->flags & PVMW_MIGRATION))
>+ if (!(pvmw->flags & PVMW_MIGRATION))
> return not_found(pvmw);
> entry = softleaf_from_pmd(pmde);
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+ return not_found(pvmw);
>+ return true;
>+ } else if (pmd_is_device_private_entry(pmde)) {
>+ softleaf_t entry;
>
>- if (!softleaf_is_migration(entry) ||
>- !check_pmd(softleaf_to_pfn(entry), pvmw))
>+ if (pvmw->flags & PVMW_MIGRATION)
>+ return not_found(pvmw);
>+ entry = softleaf_from_pmd(pmde);
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> return not_found(pvmw);
> return true;
>+ } else if (!pmd_present(pmde) ){
>+ return not_found(pvmw);
> }
> if (likely(pmd_trans_huge(pmde))) {
> if (pvmw->flags & PVMW_MIGRATION)
>@@ -270,12 +280,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> spin_unlock(pvmw->ptl);
> pvmw->ptl = NULL;
> } else if (!pmd_present(pmde)) {
>- const softleaf_t entry = softleaf_from_pmd(pmde);
>-
>- if (softleaf_is_device_private(entry)) {
>- pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>- return true;
>- }
>
> if ((pvmw->flags & PVMW_SYNC) &&
> thp_vma_suitable_order(vma, pvmw->address,
>--
>2.43.0
>
Will prepare v5 based one this.
Thanks.
>
>--
>Cheers,
>
>David
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-26 10:07 ` David Hildenbrand (Arm)
` (2 preceding siblings ...)
2026-06-27 0:04 ` Wei Yang
@ 2026-06-27 2:07 ` Wei Yang
2026-06-27 2:59 ` Lance Yang
3 siblings, 1 reply; 24+ messages in thread
From: Wei Yang @ 2026-06-27 2:07 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Wei Yang, akpm, ljs, riel, liam, vbabka, harry, jannh, ziy, sj,
balbirs, linux-mm, linux-kernel, stable, Lance Yang
On Fri, Jun 26, 2026 at 12:07:56PM +0200, David Hildenbrand (Arm) wrote:
>On 6/24/26 08:53, Wei Yang wrote:
[...]
>
>This is extremely hard to review given the existing crap handling here. I'm
>really sorry, but it makes my head hurt (I'm not kidding :) ).
>
>It's completely unclear why we only have to check for a subset of the cases
>after taking the lock.
>
>Could we simply extend the existing migration pmd handling and leave the
>!pmd_present() case for pmd_none()?
>
>That leaves no question to "which transitions are actually allowed", including
>"could we accidentally assume something is a page table when really it isn't".
>
>
>So what about something like the following?
>
>The "thp_migration_supported()" is not required when checking for
>pmd_is_migration_entry(), as that defaults to "false" when not compiled in.
>
>Untested:
>
Hi David
I did a little adjustment like below. Want to check with you at first.
>
>>From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>From: "David Hildenbrand (Arm)" <david@kernel.org>
>Date: Fri, 26 Jun 2026 12:03:40 +0200
>Subject: [PATCH] tmp
>
>Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>---
> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
> 1 file changed, 17 insertions(+), 12 deletions(-)
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> */
> pmde = pmdp_get_lockless(pvmw->pmd);
>
>- if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>+ if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>+ pmd_is_device_private_entry(pmde)) {
> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> pmde = *pvmw->pmd;
>- if (!pmd_present(pmde)) {
>+ if (pmd_is_migration_entry(pmde)) {
> softleaf_t entry;
>
How about:
const softleaf_t entry = softleaf_from_pmd(pmde);
>- if (!thp_migration_supported() ||
>- !(pvmw->flags & PVMW_MIGRATION))
>+ if (!(pvmw->flags & PVMW_MIGRATION))
> return not_found(pvmw);
> entry = softleaf_from_pmd(pmde);
could be removed.
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+ return not_found(pvmw);
>+ return true;
>+ } else if (pmd_is_device_private_entry(pmde)) {
>+ softleaf_t entry;
The same.
>
>- if (!softleaf_is_migration(entry) ||
>- !check_pmd(softleaf_to_pfn(entry), pvmw))
>+ if (pvmw->flags & PVMW_MIGRATION)
>+ return not_found(pvmw);
>+ entry = softleaf_from_pmd(pmde);
>+ if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> return not_found(pvmw);
> return true;
>+ } else if (!pmd_present(pmde) ){
>+ return not_found(pvmw);
> }
> if (likely(pmd_trans_huge(pmde))) {
> if (pvmw->flags & PVMW_MIGRATION)
How about merge this with above? And put at the first case?
Below is what it looks like:
if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
pmd_is_device_private_entry(pmde)) {
pvmw->ptl = pmd_lock(mm, pvmw->pmd);
pmde = *pvmw->pmd;
if (likely(pmd_trans_huge(pmde))) {
if (pvmw->flags & PVMW_MIGRATION)
return not_found(pvmw);
if (!check_pmd(pmd_pfn(pmde), pvmw))
return not_found(pvmw);
return true;
} else if (pmd_is_migration_entry(pmde)) {
const softleaf_t entry = softleaf_from_pmd(pmde);
if (!(pvmw->flags & PVMW_MIGRATION))
return not_found(pvmw);
if (!check_pmd(softleaf_to_pfn(entry), pvmw))
return not_found(pvmw);
return true;
} else if (pmd_is_device_private_entry(pmde)) {
const softleaf_t entry = softleaf_from_pmd(pmde);
if (pvmw->flags & PVMW_MIGRATION)
return not_found(pvmw);
if (!check_pmd(softleaf_to_pfn(entry), pvmw))
return not_found(pvmw);
return true;
} else if (!pmd_present(pmde)) {
return not_found(pvmw);
}
/* THP pmd was split under us: handle on pte level */
spin_unlock(pvmw->ptl);
pvmw->ptl = NULL;
} else if (!pmd_present(pmde)) {
Test with split_huge_page_test/khugepaged/hmm-test/migration in selftets,
looks good.
--
Wei Yang
Help you, Help me
^ permalink raw reply [flat|nested] 24+ messages in thread* Re: [Patch mm-hotfixes v4] mm/page_vma_mapped: fix device-private PMD handling
2026-06-27 2:07 ` Wei Yang
@ 2026-06-27 2:59 ` Lance Yang
0 siblings, 0 replies; 24+ messages in thread
From: Lance Yang @ 2026-06-27 2:59 UTC (permalink / raw)
To: Wei Yang
Cc: akpm, ljs, riel, liam, vbabka, harry, jannh, ziy, sj, balbirs,
linux-mm, linux-kernel, stable, David Hildenbrand (Arm)
On 2026/6/27 10:07, Wei Yang wrote:
[...]
>
> Hi David
>
> I did a little adjustment like below. Want to check with you at first.
>
>>
>> >From 048ecd33673ec649e168fbbb97749a7c0e344fcd Mon Sep 17 00:00:00 2001
>> From: "David Hildenbrand (Arm)" <david@kernel.org>
>> Date: Fri, 26 Jun 2026 12:03:40 +0200
>> Subject: [PATCH] tmp
>>
>> Signed-off-by: David Hildenbrand (Arm) <david@kernel.org>
>> ---
>> mm/page_vma_mapped.c | 29 +++++++++++++++++------------
>> 1 file changed, 17 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>> index 2ccbabfb2cc17..ed2a23a90e8dd 100644
>> --- a/mm/page_vma_mapped.c
>> +++ b/mm/page_vma_mapped.c
>> @@ -243,21 +243,31 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>> */
>> pmde = pmdp_get_lockless(pvmw->pmd);
>>
>> - if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>> + if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde) ||
>> + pmd_is_device_private_entry(pmde)) {
>> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>> pmde = *pvmw->pmd;
>> - if (!pmd_present(pmde)) {
>> + if (pmd_is_migration_entry(pmde)) {
>> softleaf_t entry;
>>
>
> How about:
> const softleaf_t entry = softleaf_from_pmd(pmde);
>
>> - if (!thp_migration_supported() ||
>> - !(pvmw->flags & PVMW_MIGRATION))
>> + if (!(pvmw->flags & PVMW_MIGRATION))
>> return not_found(pvmw);
>> entry = softleaf_from_pmd(pmde);
>
> could be removed.
>
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> + return not_found(pvmw);
>> + return true;
>> + } else if (pmd_is_device_private_entry(pmde)) {
>> + softleaf_t entry;
>
> The same.
>
>>
>> - if (!softleaf_is_migration(entry) ||
>> - !check_pmd(softleaf_to_pfn(entry), pvmw))
>> + if (pvmw->flags & PVMW_MIGRATION)
>> + return not_found(pvmw);
>> + entry = softleaf_from_pmd(pmde);
>> + if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>> return not_found(pvmw);
>> return true;
>> + } else if (!pmd_present(pmde) ){
>> + return not_found(pvmw);
>> }
>> if (likely(pmd_trans_huge(pmde))) {
>> if (pvmw->flags & PVMW_MIGRATION)
>
> How about merge this with above? And put at the first case?
>
> Below is what it looks like:
Why add more churn to a fix with a stable tag? Cleanup can come later no?
[...]
^ permalink raw reply [flat|nested] 24+ messages in thread