Re: [PATCH] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd

From: Wei Yang <richard.weiyang@gmail.com>
To: "David Hildenbrand (Arm)" <david@kernel.org>
Cc: Balbir Singh <balbirs@nvidia.com>,
	Wei Yang <richard.weiyang@gmail.com>,
	akpm@linux-foundation.org, ljs@kernel.org, riel@surriel.com,
	liam@infradead.org, vbabka@kernel.org, harry@kernel.org,
	jannh@google.com, sj@kernel.org, ziy@nvidia.com,
	linux-mm@kvack.org, Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd
Date: Tue, 12 May 2026 14:35:42 +0000
Message-ID: <20260512143542.izpp3gu4iqxttw3f@master>
In-Reply-To: <0aab59b8-71c5-4059-8281-5dd876946528@kernel.org>

On Tue, May 12, 2026 at 02:43:54PM +0200, David Hildenbrand (Arm) wrote:
>On 5/9/26 00:48, Balbir Singh wrote:
>> On 5/8/26 11:37, Wei Yang wrote:
>>> For pmd_trans_huge() and pmd_is_migration_entry(), we do the following
>>> before returning the pmd entry:
>>>
>>>   * re-validate pmd entry
>>>   * check PVMW_MIGRATION
>>>   * check_pmd()
>>>   * handle on pte level if split under us
>>>
>>> But for a device-private pmd, we just return after pmd_lock(). This may
>>> lead to an improper situation.
>>>
>> 
>> Could you elaborate more on the improper situation?
>> 
>>> This patch fixes commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration
>>> support device-private entries") by following the same pattern as
>>> pmd_trans_huge() and pmd_is_migration_entry().
>>>
>>> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
>>> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
>>> Cc: David Hildenbrand <david@kernel.org>
>>> Cc: Balbir Singh <balbirs@nvidia.com>
>>> Cc: SeongJae Park <sj@kernel.org>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>>> Cc: <stable@vger.kernel.org>
>>> ---
>>>  mm/page_vma_mapped.c | 34 +++++++++++++++++++++++-----------
>>>  1 file changed, 23 insertions(+), 11 deletions(-)
>>>
>>> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>>> index a4d52fdb3056..5d337ea43019 100644
>>> --- a/mm/page_vma_mapped.c
>>> +++ b/mm/page_vma_mapped.c
>>> @@ -269,21 +269,33 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>>>  			spin_unlock(pvmw->ptl);
>>>  			pvmw->ptl = NULL;
>>>  		} else if (!pmd_present(pmde)) {
>>> -			const softleaf_t entry = softleaf_from_pmd(pmde);
>>> +			softleaf_t entry = softleaf_from_pmd(pmde);
>>>  
>>>  			if (softleaf_is_device_private(entry)) {
>>>  				pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>> -				return true;
>>> -			}
>>> -
>>> -			if ((pvmw->flags & PVMW_SYNC) &&
>>> -			    thp_vma_suitable_order(vma, pvmw->address,
>>> -						   PMD_ORDER) &&
>>> -			    (pvmw->nr_pages >= HPAGE_PMD_NR))
>>> -				sync_with_folio_pmd_zap(mm, pvmw->pmd);
>>> +				entry = softleaf_from_pmd(*pvmw->pmd);
>>> +
>>> +				if (softleaf_is_device_private(entry)) {
>> 
>> Do we need to check softleaf_is_device_private() twice, can't we hold the pmd
>> lock and check once?
>
>I think what we try to do here is to only grab the lock if we have verified that there is something of interest in there.
>
>I wonder if we should rewrite that whole thing to just do a pmd_same() check after grabbing the lock.
>
>Something a lot cleaner like:
>
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index a4d52fdb3056..de6a255cc847 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -242,40 +242,28 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>                 */
>                pmde = pmdp_get_lockless(pvmw->pmd);
> 
>-               if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>-                       pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>-                       pmde = *pvmw->pmd;
>-                       if (!pmd_present(pmde)) {
>-                               softleaf_t entry;
>-
>-                               if (!thp_migration_supported() ||
>-                                   !(pvmw->flags & PVMW_MIGRATION))
>-                                       return not_found(pvmw);
>-                               entry = softleaf_from_pmd(pmde);
>-
>-                               if (!softleaf_is_migration(entry) ||
>-                                   !check_pmd(softleaf_to_pfn(entry), pvmw))
>-                                       return not_found(pvmw);
>-                               return true;
>-                       }
>-                       if (likely(pmd_trans_huge(pmde))) {
>-                               if (pvmw->flags & PVMW_MIGRATION)
>-                                       return not_found(pvmw);
>-                               if (!check_pmd(pmd_pfn(pmde), pvmw))
>-                                       return not_found(pvmw);
>-                               return true;
>-                       }
>-                       /* THP pmd was split under us: handle on pte level */
>-                       spin_unlock(pvmw->ptl);
>-                       pvmw->ptl = NULL;
>-               } else if (!pmd_present(pmde)) {
>-                       const softleaf_t entry = softleaf_from_pmd(pmde);
>-
>-                       if (softleaf_is_device_private(entry)) {
>-                               pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>-                               return true;
>-                       }
>+               if (pmd_present(pmde)) {
>+                       if (!pmd_leaf(pmde))
>+                               goto pte_table;
>+                       if (pvmw->flags & PVMW_MIGRATION)
>+                               return not_found(pvmw);
>+                       if (!check_pmd(pmd_pfn(pmde), pvmw))
>+                               return not_found(pvmw);
>+               } else if (pmd_is_migration_entry(pmde)) {
>+                       softleaf_t entry = softleaf_from_pmd(pmde);
>+
>+                       if (!(pvmw->flags & PVMW_MIGRATION))
>+                               return not_found(pvmw);
>+                       if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+                               return not_found(pvmw);
>+               } else if (pmd_is_device_private_entry(pmde)) {
>+                       softleaf_t entry = softleaf_from_pmd(pmde);
> 
>+                       if (pvmw->flags & PVMW_MIGRATION)
>+                               return not_found(pvmw);
>+                       if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+                               return not_found(pvmw);
>+               } else {
>                        if ((pvmw->flags & PVMW_SYNC) &&
>                            thp_vma_suitable_order(vma, pvmw->address,
>                                                   PMD_ORDER) &&
>@@ -285,6 +273,15 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>                        step_forward(pvmw, PMD_SIZE);
>                        continue;
>                }
>+
>+               /* Double-check under PTL that the PMD didn't change. */
>+               pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>+               if (pmd_same(pmde, pmdp_get(pvmw->pmd)))
>+                       return true;
>+               spin_unlock(pvmw->ptl);
>+               pvmw->ptl = NULL;
>+               goto restart;
>+pte_table:
>                if (!map_pte(pvmw, &pmde, &ptl)) {
>                        if (!pvmw->pte)
>
>
>
>
>There is likely room to clean this up / compress it further.
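
For reference, the "check locklessly, lock, then re-check with pmd_same()"
shape of the diff above can be tried outside the kernel. This is only a
minimal userspace sketch under stated assumptions: `struct slot`,
`is_interesting()` and `lock_if_interesting()` are hypothetical names that
do not exist in mm/, a C11 atomic word stands in for the PMD, and an
`atomic_flag` spinlock stands in for the PTL.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for one PMD slot plus its page table lock. */
struct slot {
	_Atomic uint64_t entry;
	atomic_flag lock;
};

static void slot_lock(struct slot *s)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
		;
}

static void slot_unlock(struct slot *s)
{
	atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Assumption for illustration only: odd values are "of interest". */
static bool is_interesting(uint64_t v)
{
	return v & 1;
}

/*
 * Returns true with the lock held and *snap set if the entry is of
 * interest and did not change between the lockless read and taking the
 * lock. Returns false with the lock NOT held otherwise.
 */
static bool lock_if_interesting(struct slot *s, uint64_t *snap)
{
	for (;;) {
		/* Lockless snapshot, validated before paying for the lock. */
		uint64_t v = atomic_load_explicit(&s->entry, memory_order_relaxed);

		if (!is_interesting(v))
			return false;

		slot_lock(s);
		/* Re-check under the lock, like the pmd_same() test. */
		if (atomic_load_explicit(&s->entry, memory_order_relaxed) == v) {
			*snap = v;
			return true;
		}
		/* Changed under us: drop the lock and start over. */
		slot_unlock(s);
	}
}
```

The caller owns the lock only on a true return, which mirrors how
pvmw->ptl is left set only when the walk reports a match.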

I tried to compress the above logic like this; hopefully it looks cleaner:

	if (pmd_trans_huge(pmde) || pmd_is_valid_softleaf(pmde)) {
		unsigned long pfn;
		bool is_migration = pmd_is_migration_entry(pmde);
		bool for_migration = !!(pvmw->flags & PVMW_MIGRATION);

		if (is_migration != for_migration)
			return not_found(pvmw);

		if (pmd_trans_huge(pmde))
			pfn = pmd_pfn(pmde);
		else
			pfn = softleaf_to_pfn(softleaf_from_pmd(pmde));

		if (!check_pmd(pfn, pvmw))
			return not_found(pvmw);
	} else if (!pmd_present(pmde)) {
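
As a sanity check on the flag handling only, the single
`is_migration != for_migration` test can be compared against the three
per-branch PVMW_MIGRATION checks in the quoted diff. The sketch below is a
standalone userspace model, not kernel code: `enum pmd_kind` and the
function names are hypothetical, and it assumes the PMD is exactly one of
THP, migration entry, or device-private entry. It does not model
check_pmd(), pmd_leaf() vs. pmd_trans_huge(), or thp_migration_supported().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the PMD value seen by the walk. */
enum pmd_kind { KIND_THP, KIND_MIGRATION, KIND_DEVICE_PRIVATE };

/* Flag checks written out per branch, as in the quoted three-branch diff. */
static bool rejected_per_branch(enum pmd_kind kind, bool for_migration)
{
	switch (kind) {
	case KIND_THP:
		return for_migration;	/* present leaf: reject migration walks */
	case KIND_MIGRATION:
		return !for_migration;	/* migration entry: reject normal walks */
	case KIND_DEVICE_PRIVATE:
		return for_migration;	/* device-private: reject migration walks */
	}
	return true;
}

/* The compressed form: entry kind and walk kind must agree. */
static bool rejected_combined(enum pmd_kind kind, bool for_migration)
{
	bool is_migration = (kind == KIND_MIGRATION);

	return is_migration != for_migration;
}

/* Exhaustively compare both forms over all kind/flag combinations. */
static void check_schemes_agree(void)
{
	for (int kind = KIND_THP; kind <= KIND_DEVICE_PRIVATE; kind++)
		for (int flag = 0; flag <= 1; flag++)
			assert(rejected_per_branch((enum pmd_kind)kind, flag) ==
			       rejected_combined((enum pmd_kind)kind, flag));
}
```

Under those assumptions the two forms agree on all six combinations, so
the compression should not change which walks are rejected by the flag
check.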

>I'll note that this now also adds proper check_pmd() checks to pmd_is_device_private_entry().
>
>The not_found(pvmw) if check_pmd() fails is rather weird ... but likely this works because
>THPs can really only be mapped through one PMD, and we always will look at the right spot ...
>
>-- 
>Cheers,
>
>David

-- 
Wei Yang
Help you, Help me
