Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <ljs@kernel.org>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, david@kernel.org, riel@surriel.com,
	 liam@infradead.org, vbabka@kernel.org, harry@kernel.org,
	jannh@google.com,  balbirs@nvidia.com, ziy@nvidia.com,
	sj@kernel.org, linux-mm@kvack.org,  stable@vger.kernel.org
Subject: Re: [Patch v2] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd
Date: Fri, 19 Jun 2026 11:44:13 +0100	[thread overview]
Message-ID: <ajUXNjRMraKb6k2n@lucifer> (raw)
In-Reply-To: <20260616063436.20455-1-richard.weiyang@gmail.com>

-cc wrong email

On Tue, Jun 16, 2026 at 06:34:36AM +0000, Wei Yang wrote:
> For pmd_trans_huge() and pmd_is_migration_entry(), we does following
> before return the pmd entry:
>
>   * re-validate pmd entry after PTL
>   * check PVMW_MIGRATION
>   * check_pmd()
>   * handle on pte level if split under us
>
> But for device-private pmd, we just return after pmd_lock().
>
> This may return improper entry, e.g. if we are looking for a migration
> entry, device-private entry could still be returned, which leads to data
> corruption.

I don't thik this is quite clear?

How about:

	If a softleaf entry is present, the existing code simply acquires the
	PMD lock and returns success even if PVMW_MIGRATION is set (indicating a
	migration entry is sought), meaning that the caller can incorrectly
	interpret the entry as something it is not, causing data corruption.

>
> This patch fixes commit 65edfda6f3f2 ("mm/rmap: extend rmap and migration
> support device-private entries") by following the same pattern as
> pmd_trans_huge() and pmd_is_migration_entry() for device private entry.
>
> While at it, it cleanups the pmd entry handling in page_vma_mapped_walk().
>
>   * Instead of handling trans huge/migration entry/device private entry
>     in a mixed manner, we put each case into its own if condition and
>     handle with the same pattern.
>   * Also we grab PTL and make sure pmd is not changed under us after
>     above check instead of do the check with PTL hold.
>   * restart the process if pmd is changed under us

You're doing quite a bit for a fix and you're putting it all in one place.

How about do the fix as 1 patch, and then cleanups as other ones? It helps with
review too :)

It's a general rule of thumb that if you do more than one of moving, refactoring
or changing code, to do them as separate patches so a reviewer/somebody
bisecting can clearly separate each.

Also PLEASE do not add new functionality (this lock recheck) in a fixes
patch. We'll end up backporting new logic that way.

Make the fixes bit _minimal_.

I think in general Andrew prefers separate fixes patches so I'd just make the
_minimal_ change that fixes this for the backport, and the cleanup stuff as a
separate series.

>
> Fixes: 65edfda6f3f2 ("mm/rmap: extend rmap and migration support device-private entries")

Hmm seems the device private stuff has had a rocky road of late!

I wonder if we need some more test coverage on this?

> Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
> Suggested-by: David Hildenbrand <david@kernel.org>
> Cc: David Hildenbrand <david@kernel.org>
> Cc: Balbir Singh <balbirs@nvidia.com>
> Cc: SeongJae Park <sj@kernel.org>
> Cc: Zi Yan <ziy@nvidia.com>
> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Annoying nag: You sent to my correct email ljs@kernel.org (thanks!) but also
cc'd the incorrect one, please only send to ljs@kernel.org thanks :)

> Cc: <stable@vger.kernel.org>

Be better to just have this with the Fixes tag, Andrew adds the Cc's from the
actual cc- list anyway.

>
> ---
> v2:
>   * specify the possible error case of current code and user visible effect
>   * besides fix, cleanup the pmd entry handling based on David's suggestion
>
> v1: https://lore.kernel.org/linux-mm/20260508013728.21285-1-richard.weiyang@gmail.com/
> ---
>  mm/page_vma_mapped.c | 63 +++++++++++++++++++++-----------------------
>  1 file changed, 30 insertions(+), 33 deletions(-)
>
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index 2ccbabfb2cc1..21635fab209c 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -243,40 +243,28 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>  		 */
>  		pmde = pmdp_get_lockless(pvmw->pmd);
>
> -		if (pmd_trans_huge(pmde) || pmd_is_migration_entry(pmde)) {
> -			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> -			pmde = *pvmw->pmd;
> -			if (!pmd_present(pmde)) {
> -				softleaf_t entry;
> -
> -				if (!thp_migration_supported() ||
> -				    !(pvmw->flags & PVMW_MIGRATION))
> -					return not_found(pvmw);
> -				entry = softleaf_from_pmd(pmde);
> -
> -				if (!softleaf_is_migration(entry) ||
> -				    !check_pmd(softleaf_to_pfn(entry), pvmw))
> -					return not_found(pvmw);
> -				return true;
> -			}
> -			if (likely(pmd_trans_huge(pmde))) {
> -				if (pvmw->flags & PVMW_MIGRATION)
> -					return not_found(pvmw);
> -				if (!check_pmd(pmd_pfn(pmde), pvmw))
> -					return not_found(pvmw);
> -				return true;
> -			}
> -			/* THP pmd was split under us: handle on pte level */

Don't drop critical comments like this, that's very bad.

> -			spin_unlock(pvmw->ptl);
> -			pvmw->ptl = NULL;
> -		} else if (!pmd_present(pmde)) {
> -			const softleaf_t entry = softleaf_from_pmd(pmde);
> -
> -			if (softleaf_is_device_private(entry)) {
> -				pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> -				return true;
> -			}
> +		if (pmd_present(pmde)) {

You're not checking pmd_trans_huge() at all now? Just assuming pmd_present() ==
pmd_trans_huge()?

> +			if (!pmd_leaf(pmde))
> +				goto pte_table;

OK now assuming pmd_leaf() == pmd_trans_huge()?

You didn't mention this in the commit msg? Justificaiton please?

> +			if (pvmw->flags & PVMW_MIGRATION)
> +				return not_found(pvmw);
> +			if (!check_pmd(pmd_pfn(pmde), pvmw))
> +				return not_found(pvmw);



> +		} else if (pmd_is_migration_entry(pmde)) {
> +			softleaf_t entry = softleaf_from_pmd(pmde);
> +

Err you dropped the thp_migration_supported() check? Why?

> +			if (!(pvmw->flags & PVMW_MIGRATION))
> +				return not_found(pvmw);
> +			if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> +				return not_found(pvmw);
> +		} else if (pmd_is_device_private_entry(pmde)) {
> +			softleaf_t entry = softleaf_from_pmd(pmde);
>
> +			if (pvmw->flags & PVMW_MIGRATION)
> +				return not_found(pvmw);
> +			if (!check_pmd(softleaf_to_pfn(entry), pvmw))
> +				return not_found(pvmw);

I mean it's less awful than what was there before, but as refactoring goes,
putting a massive set of branches in the middle of a long function isn't really
the best.

Can we avoid this horrible goto by separating this out into a function?

> +		} else {

Else means what exactly? A comment would be good.

I feel like in an effort to keep all this at one level of indentation you've
created a confusing else case.

Sane thing would be to:

a. have this as a separate function
b. to do:

	if (pmd_none(pmde)) {
		...
		return ...;
	}

	if (pmd_present(pmde)) {
		...
		return ...;
	}

	softleaf = softleaf_from_pmd(pmde);

	/* softleaf stuff */

	return ...;

In this function.

That way it's _very clear_ you're doing softleaf stuff.


>  			if ((pvmw->flags & PVMW_SYNC) &&
>  			    thp_vma_suitable_order(vma, pvmw->address,
>  						   PMD_ORDER) &&

Umm... previously this was done for all softleaf entries other than
device-private, now not, and you haven't, why?



> @@ -286,6 +274,15 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>  			step_forward(pvmw, PMD_SIZE);
>  			continue;
>  		}
> +
> +		/* Double-check under PTL that the PMD didn't change. */
> +		pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> +		if (pmd_same(pmde, pmdp_get(pvmw->pmd)))
> +			return true;
> +		spin_unlock(pvmw->ptl);
> +		pvmw->ptl = NULL;
> +		goto restart;

I'm not sure I really get the justification for this? seems you just added this
arbitrarily?

Again, you shouldn't be making changes like this in a fix patch.

> +pte_table:
>  		if (!map_pte(pvmw, &pmde, &ptl)) {
>  			if (!pvmw->pte)
>  				goto restart;
> --
> 2.34.1
>

This is really hard to review like this, as above please split this up properly.

Thanks, Lorenzo


  parent reply	other threads:[~2026-06-19 10:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20260616063436.20455-1-richard.weiyang@gmail.com>
2026-06-16 12:30 ` [Patch v2] mm/page_vma_mapped: revalidate and do proper check before return device-private pmd Lance Yang
2026-06-16 23:50   ` Wei Yang
2026-06-17  2:32     ` Lance Yang
2026-06-17  8:18       ` Wei Yang
2026-06-19  2:30         ` Wei Yang
2026-06-19 12:19           ` Lance Yang
2026-06-20  2:02             ` Wei Yang
     [not found]   ` <666dc40b-e37a-46eb-af55-7a81bc1643f1@kernel.org>
2026-06-17  2:11     ` Balbir Singh
2026-06-17  3:14       ` Lance Yang
2026-06-19 10:44 ` Lorenzo Stoakes [this message]
2026-06-19 10:48   ` David Hildenbrand (Arm)
2026-06-19 11:04     ` Lorenzo Stoakes
2026-06-20  2:13     ` Wei Yang
2026-06-20  2:11   ` Wei Yang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ajUXNjRMraKb6k2n@lucifer \
    --to=ljs@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=balbirs@nvidia.com \
    --cc=david@kernel.org \
    --cc=harry@kernel.org \
    --cc=jannh@google.com \
    --cc=liam@infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=richard.weiyang@gmail.com \
    --cc=riel@surriel.com \
    --cc=sj@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=vbabka@kernel.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox