Re: [PATCH] mm/migrate_device: pin large folios before splitting

The Linux Kernel Mailing List
 help / color / mirror / Atom feed

From: SJ Park <sj@kernel.org>
To: Usama Arif <usama.arif@linux.dev>
Cc: SJ Park <sj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	apopple@nvidia.com, byungchul@sk.com, david@kernel.org,
	gourry@gourry.net, joshua.hahnjy@gmail.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	matthew.brost@intel.com, rakie.kim@sk.com,
	ying.huang@linux.alibaba.com, ziy@nvidia.com,
	shakeel.butt@linux.dev, hannes@cmpxchg.org, kernel-team@meta.com,
	sashiko-bot <sashiko-bot@kernel.org>
Subject: Re: [PATCH] mm/migrate_device: pin large folios before splitting
Date: Wed,  1 Jul 2026 16:29:36 -0700	[thread overview]
Message-ID: <20260701232936.85398-1-sj@kernel.org> (raw)
In-Reply-To: <20260701140638.840773-1-usama.arif@linux.dev>

On Wed,  1 Jul 2026 07:06:38 -0700 Usama Arif <usama.arif@linux.dev> wrote:

> migrate_vma_collect_pmd() can detect a large folio while holding the PTE
> lock, then drop the PTE lock before calling migrate_vma_split_folio(). The
> split helper took its own reference, but only after the lock had already
> been dropped.
> 
> One way to hit this is device migration over a range that contains a large
> folio. The walker reads the PTE while holding the PTE lock and derives the
> folio either from a present PTE via vm_normal_page(), or from a non-present
> PTE that encodes a device-private softleaf entry. It then has to drop the
> PTE lock because split_folio() can block. Before migrate_vma_split_folio()
> gets a folio reference, concurrent reclaim, migration, or truncation can
> replace or clear the entry and drop the last reference to the folio. The
> split helper would then take a reference and lock on a stale folio pointer.
> 
> Take a temporary reference before dropping the PTE lock and pass that
> reference into migrate_vma_split_folio(). The helper consumes the
> reference, so split_folio() still sees only the expected caller pin instead
> of an extra pin that could make the split fail.
> 
> Reported-by: sashiko-bot <sashiko-bot@kernel.org>

Just curious.  Has the bot sent the review directly to you?  Or, you read it
from sashiko.dev web site but kindly giving the credit to the bot?

> Link: https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev

The link shows the comments for the entire series, so I was little bit confused
where the real finding is.  Seems like that is on the reply to the patch 5?
And today I learned Sashiko.dev provides a way to point a reply to a specific
patch, like this:

https://sashiko.dev/#/patchset/20260630164143.1595669-1-usama.arif%40linux.dev?part=5

> Fixes: 022a12deda53 ("mm/migrate_device: handle partially mapped folios during collection")

Seems merged in 6.19.  Should we Cc stable@ ?

> Signed-off-by: Usama Arif <usama.arif@linux.dev>

Reviewed-by: SJ Park <sj@kernel.org>

> ---
>  mm/migrate_device.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/mm/migrate_device.c b/mm/migrate_device.c
> index 2f8b646302c2..f5a5f699e98e 100644
> --- a/mm/migrate_device.c
> +++ b/mm/migrate_device.c
> @@ -77,6 +77,9 @@ static int migrate_vma_collect_hole(unsigned long start,
>   * @folio: the folio to split
>   * @fault_page: struct page associated with the fault if any
>   *
> + * If @folio is not the folio containing @fault_page, the caller must hold a
> + * reference on @folio. The helper consumes that reference.
> + *
>   * Returns 0 on success
>   */
>  static int migrate_vma_split_folio(struct folio *folio,
> @@ -86,10 +89,8 @@ static int migrate_vma_split_folio(struct folio *folio,
>  	struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL;
>  	struct folio *new_fault_folio = NULL;
>  
> -	if (folio != fault_folio) {
> -		folio_get(folio);
> +	if (folio != fault_folio)
>  		folio_lock(folio);
> -	}
>  
>  	ret = split_folio(folio);
>  	if (ret) {
> @@ -310,6 +311,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			if (folio_test_large(folio)) {
>  				int ret;
>  
> +				/*
> +				 * Keep the folio stable after dropping the PTE
> +				 * lock. migrate_vma_split_folio() consumes this
> +				 * reference.
> +				 */
> +				if (folio != fault_folio)
> +					folio_get(folio);
>  				lazy_mmu_mode_disable();
>  				pte_unmap_unlock(ptep, ptl);
>  				ret = migrate_vma_split_folio(folio,
> @@ -353,6 +361,13 @@ static int migrate_vma_collect_pmd(pmd_t *pmdp,
>  			if (folio && folio_test_large(folio)) {
>  				int ret;
>  
> +				/*
> +				 * Keep the folio stable after dropping the
> +				 * PTE lock. migrate_vma_split_folio() consumes
> +				 * this reference.
> +				 */

I don't mind having the above comments.  Nonetheless, having same content but
different wrapping look special to me.

If people really hate having two (incompletely) duplicated comments, what about
separating the recurring code out and adding the coment once on the function?
Maybe having a hotfix version without the comment and later refactoring with
the function can also be an option?  Just a pure loud thinking.

Anyway, thank you for making Linux safer! :)

> +				if (folio != fault_folio)
> +					folio_get(folio);
>  				lazy_mmu_mode_disable();
>  				pte_unmap_unlock(ptep, ptl);
>  				ret = migrate_vma_split_folio(folio,
> -- 
> 2.53.0-Meta


Thanks,
SJ

next prev parent reply	other threads:[~2026-07-01 23:29 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-07-01 14:06 [PATCH] mm/migrate_device: pin large folios before splitting Usama Arif
2026-07-01 16:49 ` David Hildenbrand (Arm)
2026-07-01 17:02 ` Zi Yan
2026-07-01 19:27   ` Andrew Morton
2026-07-01 20:06     ` David Hildenbrand (Arm)
2026-07-01 20:16       ` Zi Yan
2026-07-02  0:33         ` Alistair Popple
2026-07-02  7:59           ` David Hildenbrand (Arm)
2026-07-01 23:29 ` SJ Park [this message]
2026-07-02 10:47   ` Usama Arif
2026-07-02  4:45 ` Lance Yang
2026-07-02 10:56 ` Usama Arif

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260701232936.85398-1-sj@kernel.org \
    --to=sj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=byungchul@sk.com \
    --cc=david@kernel.org \
    --cc=gourry@gourry.net \
    --cc=hannes@cmpxchg.org \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kernel-team@meta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=matthew.brost@intel.com \
    --cc=rakie.kim@sk.com \
    --cc=sashiko-bot@kernel.org \
    --cc=shakeel.butt@linux.dev \
    --cc=usama.arif@linux.dev \
    --cc=ying.huang@linux.alibaba.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox