All of lore.kernel.org
 help / color / mirror / Atom feed
From: Lance Yang <lance.yang@linux.dev>
To: richard.weiyang@gmail.com
Cc: akpm@linux-foundation.org, david@kernel.org, ljs@kernel.org,
	riel@surriel.com, liam@infradead.org, vbabka@kernel.org,
	harry@kernel.org, jannh@google.com, balbirs@nvidia.com,
	ziy@nvidia.com, sj@kernel.org, linux-mm@kvack.org,
	lorenzo.stoakes@oracle.com, stable@vger.kernel.org,
	Lance Yang <lance.yang@linux.dev>
Subject: Re: [Patch v2] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd
Date: Tue, 16 Jun 2026 20:30:01 +0800	[thread overview]
Message-ID: <20260616123001.6501-1-lance.yang@linux.dev> (raw)
In-Reply-To: <20260616063436.20455-1-richard.weiyang@gmail.com>


On Tue, Jun 16, 2026 at 06:34:36AM +0000, Wei Yang wrote:
[...]
>diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
>index 2ccbabfb2cc1..21635fab209c 100644
>--- a/mm/page_vma_mapped.c
>+++ b/mm/page_vma_mapped.c
>@@ -243,40 +243,28 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> 		 */
> 		pmde = pmdp_get_lockless(pvmw->pmd);
> 
>-		if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
>-			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>-			pmde = *pvmw->pmd;
>-			if (!pmd_present(pmde)) {
>-				softleaf_t entry;
>-
>-				if (!thp_migration_supported() ||
>-				    !(pvmw->flags & PVMW_MIGRATION))
>-					return not_found(pvmw);
>-				entry = softleaf_from_pmd(pmde);
>-
>-				if (!softleaf_is_migration(entry) ||
>-				    !check_pmd(softleaf_to_pfn(entry), pvmw))
>-					return not_found(pvmw);
>-				return true;
>-			}
>-			if (likely(pmd_trans_huge(pmde))) {
>-				if (pvmw->flags & PVMW_MIGRATION)
>-					return not_found(pvmw);
>-				if (!check_pmd(pmd_pfn(pmde), pvmw))
>-					return not_found(pvmw);
>-				return true;
>-			}
>-			/* THP pmd was split under us: handle on pte level */
>-			spin_unlock(pvmw->ptl);
>-			pvmw->ptl = NULL;
>-		} else if (!pmd_present(pmde)) {
>-			const softleaf_t entry = softleaf_from_pmd(pmde);
>-
>-			if (softleaf_is_device_private(entry)) {
>-				pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>-				return true;
>-			}
>+		if (pmd_present(pmde)) {
>+			if (!pmd_leaf(pmde))
>+				goto pte_table;
>+			if (pvmw->flags & PVMW_MIGRATION)
>+				return not_found(pvmw);
>+			if (!check_pmd(pmd_pfn(pmde), pvmw))
>+				return not_found(pvmw);
>+		} else if (pmd_is_migration_entry(pmde)) {
>+			softleaf_t entry = softleaf_from_pmd(pmde);
>+
>+			if (!(pvmw->flags & PVMW_MIGRATION))
>+				return not_found(pvmw);

Looked at history a bit, and I wonder if this changed something old
here ...

Since 616b8371539a ("mm: thp: enable thp migration in generic path"), PMD
migration handling took PTL before doing PVMW_MIGRATION/PFN checks,
including not_found() cases. So lockless PMD read was just a filter ...

With this fix, true case gets final pmd_same() check, but this
not_found() case happens before taking PTL.

So a !PVMW_MIGRATION walker could race with someone, e.g.
remove_migration_pmd(): we make the not_found() decision from old PMD
value that still says "migration", while real *pvmw->pmd may already be
present again. We return without ever taking PTL :)

Not sure about practical fallout, but should these PMD-level not_found()
cases also take PTL and restart if PMD changed?

>+			if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+				return not_found(pvmw);
>+		} else if (pmd_is_device_private_entry(pmde)) {
>+			softleaf_t entry = softleaf_from_pmd(pmde);
> 
>+			if (pvmw->flags & PVMW_MIGRATION)
>+				return not_found(pvmw);
>+			if (!check_pmd(softleaf_to_pfn(entry), pvmw))
>+				return not_found(pvmw);
>+		} else {
> 			if ((pvmw->flags & PVMW_SYNC) &&
> 			    thp_vma_suitable_order(vma, pvmw->address,
> 						   PMD_ORDER) &&
>@@ -286,6 +274,15 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> 			step_forward(pvmw, PMD_SIZE);
> 			continue;
> 		}
>+
>+		/* Double-check under PTL that the PMD didn't change. */
>+		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>+		if (pmd_same(pmde, pmdp_get(pvmw->pmd)))
>+			return true;
>+		spin_unlock(pvmw->ptl);
>+		pvmw->ptl = NULL;
>+		goto restart;
>+pte_table:
> 		if (!map_pte(pvmw, &pmde, &ptl)) {
> 			if (!pvmw->pte)
> 				goto restart;
>-- 
>2.34.1
>

Cheers, Lance


       reply	other threads:[~2026-06-16 20:46 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260616063436.20455-1-richard.weiyang@gmail.com>
2026-06-16 12:30 ` Lance Yang [this message]
2026-06-16 23:50   ` [Patch v2] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd Wei Yang
     [not found]   ` <666dc40b-e37a-46eb-af55-7a81bc1643f1@kernel.org>
2026-06-17  2:11     ` Balbir Singh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260616123001.6501-1-lance.yang@linux.dev \
    --to=lance.yang@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=balbirs@nvidia.com \
    --cc=david@kernel.org \
    --cc=harry@kernel.org \
    --cc=jannh@google.com \
    --cc=liam@infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=richard.weiyang@gmail.com \
    --cc=riel@surriel.com \
    --cc=sj@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.