From: Lorenzo Stoakes <ljs@kernel.org>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, david@kernel.org, riel@surriel.com,
	 liam@infradead.org, vbabka@kernel.org, harry@kernel.org,
	jannh@google.com,  sj@kernel.org, ziy@nvidia.com,
	balbirs@nvidia.com, linux-mm@kvack.org,  stable@vger.kernel.org,
	Lance Yang <lance.yang@linux.dev>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd
Date: Mon, 22 Jun 2026 17:11:02 +0100
Message-ID: <ajld6RKK02Vi-LxM@lucifer>
In-Reply-To: <20260622142102.pcmr5pftshj5lvju@master>

On Mon, Jun 22, 2026 at 02:21:02PM +0000, Wei Yang wrote:
> On Mon, Jun 22, 2026 at 02:46:40PM +0100, Lorenzo Stoakes wrote:
> >+cc Lance, linux-kernel
> >
> >Your subject line is 83 characters long and is way too detailed. How about 'fix
> >device-private PMD handling'?
> >
>
> Got it.
>
> >You forgot to include linux-kernel@vger.kernel.org on the mail, lore seems to be
> >a bit broken atm but in general it's helpful to include that.
>
> Got it.
>
> So usually we send a patch to both linux-mm and linux-kernel? If so, I will
> remember that for future patches.

Yeah, it's better to cc both, as that covers cases such as kvack going wrong. :)

>
> >
> >Also it's useful to mark this as [PATCH mm-hotfixes] to make it really clear
> >it's intended as a hotfix.
> >
>
> Got it.
>
> >Some commit msg language nits:
> >
> >On Mon, Jun 22, 2026 at 01:06:51PM +0000, Wei Yang wrote:
> >> For pmd_trans_huge() and pmd_is_migration_entry(), we does following
> >> before return the pmd entry:
> >
> >Sounds better as:
> >
> >	For PMD entries that satisfy pmd_trans_huge() or pmd_is_migration_entry(), we
> >	perform the following actions:
> >
>
> Sure.
>
> >>
> >>   * re-validate pmd entry after PTL
> >>   * check PVMW_MIGRATION
> >>   * check_pmd()
> >>   * handle on pte level if split under us
> >>
> >> But for device-private pmd, we just return after pmd_lock().
> >
> >->
> >
> >	However, for device-private PMD entries, we simply acquire the PMD lock
> >	and return.
> >
>
> Sure.
>
> >Also can you please give some justification here as to why all this also applies
> >to device-private PMD? Right now it sounds hand wavey.
> >
>
> I thought the paragraph below explained it. I'm not sure what justification is preferred.

Something explaining that device-private PMDs can be split the same way THP ones
are (see the pmd_is_device_private_entry() branch of __split_huge_pmd_locked()),
so the entry has to be re-validated once the PMD lock is held.

>
> >> If a softleaf entry is present, e.g. device-private pmd, the existing
> >> code simply acquires the PMD lock and returns success even if
> >> PVMW_MIGRATION is set (indicating a migration entry is sought), meaning
> >> that the caller can incorrectly interpret the entry as something it is
> >> not, causing data corruption.
> >
> >This is repetitive, you already mentioned device-private PMD, you already
> >mentioned that it simply acquires the PMD lock.
> >
>
> Ah, I copied your suggestion from [1]. Hope I don't misunderstand it.
>
> [1]: https://lore.kernel.org/linux-mm/ajUXNjRMraKb6k2n@lucifer/

Sure, thanks, but in context with the above it ends up being a bit repetitive.

>
> >You should talk about what issue it caused and why:
> >
> >	This is particularly problematic when PVMW_MIGRATION is set (meaning a
> >	migration entry is sought), as it causes a device-private PMD entry to
> >	be returned with a different data layout, causing memory corruption.
> >
>
> This looks good. I would take this one, if you prefer.

Sure, thanks!

>
> >>
> >> This patch fixes commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration
> >> support device-private entries") by following the same pattern as
> >> pmd_trans_huge() and pmd_is_migration_entry() for device private entry.
> >
> >This is pretty useless. We see what patch it fixes in the Fixes tag, and you're
> >just repeating things you said above, I'd drop it.
> >
>
> Got it.
>
> >> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")
> >> Cc: <stable@vger.kernel.org>
> >> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> >> Suggested-by: David Hildenbrand <david@kernel.org>
> >> Cc: David Hildenbrand <david@kernel.org>
> >> Cc: Balbir Singh <balbirs@nvidia.com>
> >> Cc: SeongJae Park <sj@kernel.org>
> >> Cc: Zi Yan <ziy@nvidia.com>
> >> Cc: Lorenzo Stoakes <ljs@kernel.org>
> >>
> >> ---
> >> v3:
> >>   * remove cleanup part, only fix the issue for device-private entry
> >>   * refine user effect description based on Lorenzo's suggestion
> >> v2: https://lore.kernel.org/all/20260616063436.20455-1-richard.weiyang@gmail.com/T/#u
> >>   * specify the possible error case of current code and user visible effect
> >>   * besides fix, cleanup the pmd entry handling based on David's suggestion
> >>
> >> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
> >> ---
> >>  mm/page_vma_mapped.c | 32 ++++++++++++++++++++++----------
> >>  1 file changed, 22 insertions(+), 10 deletions(-)
> >>
> >> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >> index 2ccbabfb2cc1..8de3c6b82df6 100644
> >> --- a/mm/page_vma_mapped.c
> >> +++ b/mm/page_vma_mapped.c
> >> @@ -270,21 +270,33 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> >>  			spin_unlock(pvmw->ptl);
> >>  			pvmw->ptl = NULL;
> >>  		} else if (!pmd_present(pmde)) {
> >> -			const softleaf_t entry = softleaf_from_pmd(pmde);
> >> +			softleaf_t entry = softleaf_from_pmd(pmde);
> >>
> >>  			if (softleaf_is_device_private(entry)) {
> >>  				pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >> -				return true;
> >> -			}
> >>
> >> -			if ((pvmw->flags & PVMW_SYNC) &&
> >> -			    thp_vma_suitable_order(vma, pvmw->address,
> >> -						   PMD_ORDER) &&
> >> -			    (pvmw->nr_pages >= HPAGE_PMD_NR))
> >> -				sync_with_folio_pmd_zap(mm, pvmw->pmd);
> >> +				entry = softleaf_from_pmd(*pvmw->pmd);
> >>
> >> -			step_forward(pvmw, PMD_SIZE);
> >> -			continue;
> >> +				if (softleaf_is_device_private(entry)) {
> >
> >This is all very horrible. You have an example of how pmde is re-got in the
> >pmd_trans_huge() branch and pmd_is_device_private_entry() exists...
> >
> >We can just make this another branch and do the re-check more neatly.
> >
>
> I plan to keep the change small, but yeah it is ugly.
>
> >I enclose a patch that does that (untested, please check).
> >
> >
> >> +					if (pvmw->flags & PVMW_MIGRATION)
> >> +						return not_found(pvmw);
> >> +					if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> >> +						return not_found(pvmw);
> >> +					return true;
> >> +				}
> >> +				/* device-private pmd was split under us: handle on pte level */
> >> +				spin_unlock(pvmw->ptl);
> >> +				pvmw->ptl = NULL;
> >> +			} else {
> >> +				if ((pvmw->flags & PVMW_SYNC) &&
> >> +				    thp_vma_suitable_order(vma, pvmw->address,
> >> +							   PMD_ORDER) &&
> >> +				    (pvmw->nr_pages >= HPAGE_PMD_NR))
> >> +					sync_with_folio_pmd_zap(mm, pvmw->pmd);
> >> +
> >> +				step_forward(pvmw, PMD_SIZE);
> >> +				continue;
> >> +			}
> >>  		}
> >>  		if (!map_pte(pvmw, &pmde, &ptl)) {
> >>  			if (!pvmw->pte)
> >> --
> >> 2.34.1
> >>
> >
> >Thanks, Lorenzo
> >
> >----8<----
> >From e6a3c1c782714ed831c4d46a14bb99226423bf59 Mon Sep 17 00:00:00 2001
> >From: Wei Yang <richard.weiyang@gmail.com>
> >Date: Mon, 22 Jun 2026 13:06:51 +0000
> >Subject: [PATCH] refactored
> >
> >Signed-off-by: Lorenzo Stoakes <ljs@kernel.org>
> >---
> > mm/page_vma_mapped.c | 20 +++++++++++++++-----
> > 1 file changed, 15 insertions(+), 5 deletions(-)
> >
> >diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> >index 2ccbabfb2cc1..17dff8aab9f9 100644
> >--- a/mm/page_vma_mapped.c
> >+++ b/mm/page_vma_mapped.c
> >@@ -269,14 +269,24 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
> > 			/* THP pmd was split under us: handle on pte level */
> > 			spin_unlock(pvmw->ptl);
> > 			pvmw->ptl = NULL;
> >-		} else if (!pmd_present(pmde)) {
> >-			const softleaf_t entry = softleaf_from_pmd(pmde);
> >+		} else if (pmd_is_device_private_entry(pmde)) {
> >+			softleaf_t entry;
> >+
> >+			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >+			pmde = *pvmw->pmd;
> >+			entry = softleaf_from_pmd(pmde);
> >
> >-			if (softleaf_is_device_private(entry)) {
> >-				pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> >+			if (likely(softleaf_is_device_private(entry))) {
> >+				if (pvmw->flags & PVMW_MIGRATION)
> >+					return not_found(pvmw);
> >+				if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> >+					return not_found(pvmw);
> > 				return true;
> > 			}
> >-
> >+			/* device-private pmd was split under us: handle on pte level */
> >+			spin_unlock(pvmw->ptl);
> >+			pvmw->ptl = NULL;
> >+		} else if (!pmd_present(pmde)) {
> > 			if ((pvmw->flags & PVMW_SYNC) &&
> > 			    thp_vma_suitable_order(vma, pvmw->address,
> > 						   PMD_ORDER) &&
> >--
> >2.54.0
>
> If we prefer this way, I will check and take it.

Yeah, feel free to go ahead + use it :)
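
For anyone following along, the lock-then-revalidate flow in the refactored
branch can be modelled in userspace roughly as below. This is only a sketch
under stated assumptions: every name in it (toy_pmd, toy_walk,
toy_walk_device_private() and the enums) is hypothetical and not a kernel
API, the lock is a plain flag so the control flow can be checked
single-threaded, and the pfn comparison is a simplified stand-in for
check_pmd().

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kinds of PMD entry the walk can see. */
enum entry_kind { ENTRY_TABLE, ENTRY_DEVICE_PRIVATE };

/* Toy PMD slot: 'held' models the PMD lock as a plain flag. */
struct toy_pmd {
	bool held;
	enum entry_kind kind;
	unsigned long pfn;
};

/* Toy walk state: want_migration models PVMW_MIGRATION being set. */
struct toy_walk {
	bool want_migration;
	unsigned long pfn;	/* pfn the caller is looking for */
};

enum walk_result { WALK_FOUND, WALK_NOT_FOUND, WALK_DESCEND };

/*
 * Called after a lockless read suggested a device-private entry.
 * Returns WALK_FOUND with the lock still held, or drops the lock and
 * returns WALK_NOT_FOUND / WALK_DESCEND.
 */
enum walk_result toy_walk_device_private(struct toy_pmd *pmd,
					 const struct toy_walk *w)
{
	pmd->held = true;			/* take the lock */

	/* Re-read under the lock: the entry may have been split meanwhile. */
	if (pmd->kind == ENTRY_DEVICE_PRIVATE) {
		/* A caller seeking a migration entry must not get this one. */
		if (w->want_migration || pmd->pfn != w->pfn) {
			pmd->held = false;	/* drop the lock on failure */
			return WALK_NOT_FOUND;
		}
		return WALK_FOUND;		/* lock stays held */
	}

	/* Split under us: unlock and handle on the PTE level instead. */
	pmd->held = false;
	return WALK_DESCEND;
}
```

The point of the model is just that the success path is the only one that
returns with the lock held, and that the entry type is decided from the
value read under the lock rather than from the earlier lockless read.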

>
> --
> Wei Yang
> Help you, Help me

Thanks, Lorenzo
