From: "David Hildenbrand (Arm)" <david@kernel.org>
To: Wei Yang <richard.weiyang@gmail.com>
Cc: akpm@linux-foundation.org, ljs@kernel.org, ziy@nvidia.com,
baolin.wang@linux.alibaba.com, Liam.Howlett@oracle.com,
npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
baohua@kernel.org, lance.yang@linux.dev, riel@surriel.com,
vbabka@kernel.org, harry@kernel.org, jannh@google.com,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
shuah@kernel.org, linux-mm@kvack.org,
Gavin Guo <gavinguo@igalia.com>
Subject: Re: [PATCH 1/2] mm/huge_memory: return true if split_huge_pmd_locked() split PMD to migration entry
Date: Wed, 29 Apr 2026 08:55:07 +0200 [thread overview]
Message-ID: <413feed4-6aab-43d9-b7e5-a9386fa79f4b@kernel.org> (raw)
In-Reply-To: <20260429024913.iepoi7cit3xnwca2@master>
On 4/29/26 04:49, Wei Yang wrote:
> On Tue, Apr 28, 2026 at 10:24:42AM +0200, David Hildenbrand (Arm) wrote:
>> On 4/26/26 11:19, Wei Yang wrote:
>>>
>>> Will adjust related places.
>>>
>>>
>>> Got it.
>>>
>>>
>>> Here is my understanding.
>>>
>>> We get here when page_vma_mapped_walk() touches a pmd entry, which can be one of three cases:
>>>
>>> * pmd_trans_huge()
>>> * pmd_is_migration_entry()
>>> * pmd_is_device_private_entry()
>>>
>>> For the first two cases, we grab pmd_lock() and then check that the condition
>>> still holds before returning. But for case 3, after grabbing pmd_lock(), it
>>> returns directly.
>>>
>>> This leaves a window for another thread to split the device-private PMD into
>>> pte mappings, IIUC. For this case, we should restart the walk here.
>>
>>
>> So what you are saying is that we should re-validate in page_vma_mapped_walk()
>> that we indeed still have a device-private entry after grabbing the lock?
>>
>> That's what we do in map_pte() through the pmd_same() check.
>>
>> Likely we should apply the same model here!
>>
>
> Below is my proposed change:
>
> diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
> index a4d52fdb3056..6e915d35ae54 100644
> --- a/mm/page_vma_mapped.c
> +++ b/mm/page_vma_mapped.c
> @@ -273,17 +273,21 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
>
> if (softleaf_is_device_private(entry)) {
> pvmw->ptl = pmd_lock(mm, pvmw->pmd);
> - return true;
> + if (pmd_same(pmde, pmdp_get_lockless(pvmw->pmd)))
> + return true;
As we have a softleaf entry, I assume we wouldn't expect any other bits
(access/dirty) to get set before we grab the lock. Verifying
softleaf_is_device_private() again would be cleaner, though.

But really, I do wonder if we should just have a "goto retry" back to the
"pmde = pmdp_get_lockless(pvmw->pmd);" instead?

And now I wonder why we don't have any check_pmd() handling in there? :/
Should we check for the pfn here?
> + /* THP pmd was split under us: handle on pte level */
> + spin_unlock(pvmw->ptl);
> + pvmw->ptl = NULL;
> + } else {
> + if ((pvmw->flags & PVMW_SYNC) &&
> + thp_vma_suitable_order(vma, pvmw->address,
> + PMD_ORDER) &&
> + (pvmw->nr_pages >= HPAGE_PMD_NR))
> + sync_with_folio_pmd_zap(mm, pvmw->pmd);
> +
> + step_forward(pvmw, PMD_SIZE);
> + continue;
> }
> -
> - if ((pvmw->flags & PVMW_SYNC) &&
> - thp_vma_suitable_order(vma, pvmw->address,
> - PMD_ORDER) &&
> - (pvmw->nr_pages >= HPAGE_PMD_NR))
> - sync_with_folio_pmd_zap(mm, pvmw->pmd);
> -
> - step_forward(pvmw, PMD_SIZE);
> - continue;
> }
> if (!map_pte(pvmw, &pmde, &ptl)) {
> if (!pvmw->pte)
>
> After this, we could simplify the logic in try_to_migrate_one() as:
>
> @@ -2471,14 +2471,10 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
> * so we can detect this scenario and properly
> * abort the walk.
> */
> - if (split_huge_pmd_locked(vma, pvmw.address,
> - pvmw.pmd, true)) {
> - page_vma_mapped_walk_done(&pvmw);
> - break;
> - }
> - flags &= ~TTU_SPLIT_HUGE_PMD;
> - page_vma_mapped_walk_restart(&pvmw);
> - continue;
> + ret = split_huge_pmd_locked(vma, pvmw.address,
> + pvmw.pmd, true);
> + page_vma_mapped_walk_done(&pvmw);
> + break;
> }
Right. But just to be clear: let's split the page_vma_mapped_walk()
re-validation (which looks like a bugfix to me) from the try_to_migrate_one()
optimization.
--
Cheers,
David
Thread overview: 11+ messages
2026-04-15 1:08 [PATCH 0/2] mm/huge_memory: optimize migration when huge PMD needs split Wei Yang
2026-04-15 1:08 ` [PATCH 1/2] mm/huge_memory: return true if split_huge_pmd_locked() split PMD to migration entry Wei Yang
2026-04-24 19:29 ` David Hildenbrand (Arm)
2026-04-26 9:19 ` Wei Yang
2026-04-28 8:24 ` David Hildenbrand (Arm)
2026-04-29 2:49 ` Wei Yang
2026-04-29 6:55 ` David Hildenbrand (Arm) [this message]
2026-05-03 0:38 ` Wei Yang
2026-05-04 12:44 ` David Hildenbrand (Arm)
2026-05-05 3:15 ` Wei Yang
2026-04-15 1:08 ` [PATCH 2/2] mm/selftests: add split_shared_pmd() Wei Yang