public inbox for linux-mm@kvack.org
* [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
@ 2026-03-10 10:54 Usama Arif
  2026-03-10 16:52 ` Zi Yan
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Usama Arif @ 2026-03-10 10:54 UTC (permalink / raw)
  To: Andrew Morton, npache, david, ziy, willy, linux-mm
  Cc: matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
	gourry, ying.huang, apopple, linux-kernel, kernel-team,
	Usama Arif

During folio migration, __folio_migrate_mapping() removes the source
folio from the deferred split queue, but the destination folio is never
re-queued.  This causes underutilized THPs to escape the shrinker after
NUMA migration, since they silently drop off the deferred split list.

Fix this by recording whether the source folio was on the deferred split
queue and its partially mapped state before move_to_new_folio() unqueues
it, and re-queuing the destination folio after a successful migration if
it was.

By the time migrate_folio_move() runs, partially mapped folios without a
pin have already been split by migrate_pages_batch().  So only two cases
remain on the deferred list at this point:
  1. Partially mapped folios with a pin (split failed).
  2. Fully mapped but potentially underused folios.
The recorded partially_mapped state is forwarded to deferred_split_folio()
so that the destination folio is correctly re-queued in both cases.

Reported-by: Johannes Weiner <hannes@cmpxchg.org>
Fixes: dafff3f4c850 ("mm: split underused THPs")
Signed-off-by: Usama Arif <usama.arif@linux.dev>
---
v1 -> v2:
- record whether source folio was on the deferred split queue before
  move_to_new_folio() (David)
- record partially mapped state and update commit message (Zi)
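
The record-then-requeue sequence described in the commit message can be
modelled in plain userspace C. This is only an illustrative sketch under
stated assumptions, not kernel code: struct model_folio, model_move(),
model_deferred_split() and model_migrate() are hypothetical stand-ins for
the folio state, move_to_new_folio(), deferred_split_folio() and
migrate_folio_move(), and locking is ignored entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two pieces of folio state the patch reads. */
struct model_folio {
	bool on_deferred_list;	/* models !list_empty(&folio->_deferred_list) */
	bool partially_mapped;	/* models folio_test_partially_mapped() */
};

/* Models deferred_split_folio(): queue the folio and forward the flag. */
static void model_deferred_split(struct model_folio *folio, bool partially_mapped)
{
	if (partially_mapped)
		folio->partially_mapped = true;
	folio->on_deferred_list = true;
}

/*
 * Models move_to_new_folio(): as a side effect the source drops off the
 * queue. Clearing partially_mapped here is an assumption of this model.
 */
static int model_move(struct model_folio *dst, struct model_folio *src)
{
	(void)dst;
	src->on_deferred_list = false;
	src->partially_mapped = false;
	return 0;
}

/* Models the patched flow: snapshot, move, then requeue the destination. */
static int model_migrate(struct model_folio *dst, struct model_folio *src)
{
	bool src_deferred_split = false;
	bool src_partially_mapped = false;
	int rc;

	assert(dst != NULL && src != NULL);

	/* Snapshot the source state before the move clears it. */
	if (src->on_deferred_list) {
		src_deferred_split = true;
		src_partially_mapped = src->partially_mapped;
	}

	rc = model_move(dst, src);
	if (rc)
		return rc;

	/* Requeue only if the source had been queued. */
	if (src_deferred_split)
		model_deferred_split(dst, src_partially_mapped);
	return 0;
}
```

Without the snapshot there would be nothing left to read after
model_move() returns, which is why the state is recorded first.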
---
 mm/migrate.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/mm/migrate.c b/mm/migrate.c
index ece77ccb2ec0..61013d258eb4 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	int rc;
 	int old_page_state = 0;
 	struct anon_vma *anon_vma = NULL;
+	bool src_deferred_split = false;
+	bool src_partially_mapped = false;
 	struct list_head *prev;
 
 	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
@@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 		goto out_unlock_both;
 	}
 
+	if (folio_test_large(src) && folio_test_large_rmappable(src) &&
+	    !data_race(list_empty(&src->_deferred_list))) {
+		src_deferred_split = true;
+		src_partially_mapped = folio_test_partially_mapped(src);
+	}
+
 	rc = move_to_new_folio(dst, src, mode);
 	if (rc)
 		goto out;
@@ -1393,6 +1401,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 	if (old_page_state & PAGE_WAS_MAPPED)
 		remove_migration_ptes(src, dst, 0);
 
+	/*
+	 * Requeue the destination folio on the deferred split queue if
+	 * the source was on the queue.  The source is unqueued in
+	 * __folio_migrate_mapping(), so we recorded the state from
+	 * before move_to_new_folio().
+	 */
+	if (src_deferred_split)
+		deferred_split_folio(dst, src_partially_mapped);
+
 out_unlock_both:
 	folio_unlock(dst);
 	folio_set_owner_migrate_reason(dst, reason);
-- 
2.47.3




* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
  2026-03-10 10:54 [PATCH v2] mm: migrate: requeue destination folio on deferred split queue Usama Arif
@ 2026-03-10 16:52 ` Zi Yan
  2026-03-11  9:23 ` David Hildenbrand (Arm)
  2026-03-12  3:18 ` Wei Yang
  2 siblings, 0 replies; 6+ messages in thread
From: Zi Yan @ 2026-03-10 16:52 UTC (permalink / raw)
  To: Usama Arif
  Cc: Andrew Morton, npache, david, willy, linux-mm, matthew.brost,
	joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
	apopple, linux-kernel, kernel-team

On 10 Mar 2026, at 6:54, Usama Arif wrote:

> During folio migration, __folio_migrate_mapping() removes the source
> folio from the deferred split queue, but the destination folio is never
> re-queued.  This causes underutilized THPs to escape the shrinker after
> NUMA migration, since they silently drop off the deferred split list.
>
> Fix this by recording whether the source folio was on the deferred split
> queue and its partially mapped state before move_to_new_folio() unqueues
> it, and re-queuing the destination folio after a successful migration if
> it was.
>
> By the time migrate_folio_move() runs, partially mapped folios without a
> pin have already been split by migrate_pages_batch().  So only two cases
> remain on the deferred list at this point:
>   1. Partially mapped folios with a pin (split failed).
>   2. Fully mapped but potentially underused folios.
> The recorded partially_mapped state is forwarded to deferred_split_folio()
> so that the destination folio is correctly re-queued in both cases.
>
> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
> Fixes: dafff3f4c850 ("mm: split underused THPs")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2:
> - record whether source folio was on the deferred split queue before
>   move_to_new_folio() (David)
> - record partially mapped state and update commit message (Zi)
> ---
>  mm/migrate.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
LGTM.

Acked-by: Zi Yan <ziy@nvidia.com>

Best Regards,
Yan, Zi



* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
  2026-03-10 10:54 [PATCH v2] mm: migrate: requeue destination folio on deferred split queue Usama Arif
  2026-03-10 16:52 ` Zi Yan
@ 2026-03-11  9:23 ` David Hildenbrand (Arm)
  2026-03-11 13:25   ` Usama Arif
  2026-03-12  3:18 ` Wei Yang
  2 siblings, 1 reply; 6+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-11  9:23 UTC (permalink / raw)
  To: Usama Arif, Andrew Morton, npache, ziy, willy, linux-mm
  Cc: matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
	gourry, ying.huang, apopple, linux-kernel, kernel-team

On 3/10/26 11:54, Usama Arif wrote:
> During folio migration, __folio_migrate_mapping() removes the source
> folio from the deferred split queue, but the destination folio is never
> re-queued.  This causes underutilized THPs to escape the shrinker after
> NUMA migration, since they silently drop off the deferred split list.
> 
> Fix this by recording whether the source folio was on the deferred split
> queue and its partially mapped state before move_to_new_folio() unqueues
> it, and re-queuing the destination folio after a successful migration if
> it was.
> 
> By the time migrate_folio_move() runs, partially mapped folios without a
> pin have already been split by migrate_pages_batch().  So only two cases
> remain on the deferred list at this point:
>   1. Partially mapped folios with a pin (split failed).
>   2. Fully mapped but potentially underused folios.
> The recorded partially_mapped state is forwarded to deferred_split_folio()
> so that the destination folio is correctly re-queued in both cases.
> 
> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
> Fixes: dafff3f4c850 ("mm: split underused THPs")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2:
> - record whether source folio was on the deferred split queue before
>   move_to_new_folio() (David)
> - record partially mapped state and update commit message (Zi)
> ---
>  mm/migrate.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ece77ccb2ec0..61013d258eb4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>  	int rc;
>  	int old_page_state = 0;
>  	struct anon_vma *anon_vma = NULL;
> +	bool src_deferred_split = false;
> +	bool src_partially_mapped = false;
>  	struct list_head *prev;
>  
>  	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>  		goto out_unlock_both;
>  	}
>  
> +	if (folio_test_large(src) && folio_test_large_rmappable(src) &&

I don't think the folio_test_large_rmappable() check is required. Other
folios we migrate here would always have _deferred_list initialized but
unused.

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David



* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
  2026-03-11  9:23 ` David Hildenbrand (Arm)
@ 2026-03-11 13:25   ` Usama Arif
  0 siblings, 0 replies; 6+ messages in thread
From: Usama Arif @ 2026-03-11 13:25 UTC (permalink / raw)
  To: David Hildenbrand (Arm), Andrew Morton, npache, ziy, willy,
	linux-mm
  Cc: matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
	gourry, ying.huang, apopple, linux-kernel, kernel-team



On 11/03/2026 12:23, David Hildenbrand (Arm) wrote:
> On 3/10/26 11:54, Usama Arif wrote:
>> During folio migration, __folio_migrate_mapping() removes the source
>> folio from the deferred split queue, but the destination folio is never
>> re-queued.  This causes underutilized THPs to escape the shrinker after
>> NUMA migration, since they silently drop off the deferred split list.
>>
>> Fix this by recording whether the source folio was on the deferred split
>> queue and its partially mapped state before move_to_new_folio() unqueues
>> it, and re-queuing the destination folio after a successful migration if
>> it was.
>>
>> By the time migrate_folio_move() runs, partially mapped folios without a
>> pin have already been split by migrate_pages_batch().  So only two cases
>> remain on the deferred list at this point:
>>   1. Partially mapped folios with a pin (split failed).
>>   2. Fully mapped but potentially underused folios.
>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>> so that the destination folio is correctly re-queued in both cases.
>>
>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2:
>> - record whether source folio was on the deferred split queue before
>>   move_to_new_folio() (David)
>> - record partially mapped state and update commit message (Zi)
>> ---
>>  mm/migrate.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ece77ccb2ec0..61013d258eb4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>  	int rc;
>>  	int old_page_state = 0;
>>  	struct anon_vma *anon_vma = NULL;
>> +	bool src_deferred_split = false;
>> +	bool src_partially_mapped = false;
>>  	struct list_head *prev;
>>  
>>  	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>  		goto out_unlock_both;
>>  	}
>>  
>> +	if (folio_test_large(src) && folio_test_large_rmappable(src) &&
> 
> I don't think the folio_test_large_rmappable() check is required. Other
> folios we migrate here would always have _deferred_list initialized but
> unused.
> 
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
> 


While auditing how the THP shrinker code handles NUMA migration, I found
that we need another fix in this area. I have sent it here:
https://lore.kernel.org/all/20260311132342.3193160-1-usama.arif@linux.dev/



* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
  2026-03-10 10:54 [PATCH v2] mm: migrate: requeue destination folio on deferred split queue Usama Arif
  2026-03-10 16:52 ` Zi Yan
  2026-03-11  9:23 ` David Hildenbrand (Arm)
@ 2026-03-12  3:18 ` Wei Yang
  2026-03-12  8:26   ` David Hildenbrand (Arm)
  2 siblings, 1 reply; 6+ messages in thread
From: Wei Yang @ 2026-03-12  3:18 UTC (permalink / raw)
  To: Usama Arif
  Cc: Andrew Morton, npache, david, ziy, willy, linux-mm, matthew.brost,
	joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
	apopple, linux-kernel, kernel-team

On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>During folio migration, __folio_migrate_mapping() removes the source
>folio from the deferred split queue, but the destination folio is never
>re-queued.  This causes underutilized THPs to escape the shrinker after
>NUMA migration, since they silently drop off the deferred split list.
>
>Fix this by recording whether the source folio was on the deferred split
>queue and its partially mapped state before move_to_new_folio() unqueues
>it, and re-queuing the destination folio after a successful migration if
>it was.
>
>By the time migrate_folio_move() runs, partially mapped folios without a
>pin have already been split by migrate_pages_batch().  So only two cases
>remain on the deferred list at this point:
>  1. Partially mapped folios with a pin (split failed).
>  2. Fully mapped but potentially underused folios.
>The recorded partially_mapped state is forwarded to deferred_split_folio()
>so that the destination folio is correctly re-queued in both cases.
>
>Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>Fixes: dafff3f4c850 ("mm: split underused THPs")
>Signed-off-by: Usama Arif <usama.arif@linux.dev>
>---
>v1 -> v2:
>- record whether source folio was on the deferred split queue before
>  move_to_new_folio() (David)
>- record partially mapped state and update commit message (Zi)
>---
> mm/migrate.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
>diff --git a/mm/migrate.c b/mm/migrate.c
>index ece77ccb2ec0..61013d258eb4 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> 	int rc;
> 	int old_page_state = 0;
> 	struct anon_vma *anon_vma = NULL;
>+	bool src_deferred_split = false;
>+	bool src_partially_mapped = false;
> 	struct list_head *prev;
> 
> 	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
>@@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> 		goto out_unlock_both;
> 	}
> 
>+	if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>+	    !data_race(list_empty(&src->_deferred_list))) {

We usually check order > 1 before accessing _deferred_list, because it
lives in subpage 2.

I am not sure why we don't do that here. Am I missing something?

>+		src_deferred_split = true;
>+		src_partially_mapped = folio_test_partially_mapped(src);
>+	}
>+
> 	rc = move_to_new_folio(dst, src, mode);
> 	if (rc)
> 		goto out;
>@@ -1393,6 +1401,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> 	if (old_page_state & PAGE_WAS_MAPPED)
> 		remove_migration_ptes(src, dst, 0);
> 
>+	/*
>+	 * Requeue the destination folio on the deferred split queue if
>+	 * the source was on the queue.  The source is unqueued in
>+	 * __folio_migrate_mapping(), so we recorded the state from
>+	 * before move_to_new_folio().
>+	 */
>+	if (src_deferred_split)
>+		deferred_split_folio(dst, src_partially_mapped);
>+
> out_unlock_both:
> 	folio_unlock(dst);
> 	folio_set_owner_migrate_reason(dst, reason);
>-- 
>2.47.3
>

-- 
Wei Yang
Help you, Help me



* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
  2026-03-12  3:18 ` Wei Yang
@ 2026-03-12  8:26   ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 6+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-12  8:26 UTC (permalink / raw)
  To: Wei Yang, Usama Arif
  Cc: Andrew Morton, npache, ziy, willy, linux-mm, matthew.brost,
	joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
	apopple, linux-kernel, kernel-team

On 3/12/26 04:18, Wei Yang wrote:
> On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>> During folio migration, __folio_migrate_mapping() removes the source
>> folio from the deferred split queue, but the destination folio is never
>> re-queued.  This causes underutilized THPs to escape the shrinker after
>> NUMA migration, since they silently drop off the deferred split list.
>>
>> Fix this by recording whether the source folio was on the deferred split
>> queue and its partially mapped state before move_to_new_folio() unqueues
>> it, and re-queuing the destination folio after a successful migration if
>> it was.
>>
>> By the time migrate_folio_move() runs, partially mapped folios without a
>> pin have already been split by migrate_pages_batch().  So only two cases
>> remain on the deferred list at this point:
>>  1. Partially mapped folios with a pin (split failed).
>>  2. Fully mapped but potentially underused folios.
>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>> so that the destination folio is correctly re-queued in both cases.
>>
>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2:
>> - record whether source folio was on the deferred split queue before
>>  move_to_new_folio() (David)
>> - record partially mapped state and update commit message (Zi)
>> ---
>> mm/migrate.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ece77ccb2ec0..61013d258eb4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> 	int rc;
>> 	int old_page_state = 0;
>> 	struct anon_vma *anon_vma = NULL;
>> +	bool src_deferred_split = false;
>> +	bool src_partially_mapped = false;
>> 	struct list_head *prev;
>>
>> 	__migrate_folio_extract(dst, &old_page_state, &anon_vma);
>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> 		goto out_unlock_both;
>> 	}
>>
>> +	if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>> +	    !data_race(list_empty(&src->_deferred_list))) {
> 
> We usually check order > 1 before accessing _deferred_list, because it
> lives in subpage 2.
> 
> I am not sure why we don't do that here. Am I missing something?

Valid point! Non-anon folios could trigger that.
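
To illustrate the concern, here is a minimal userspace sketch, not kernel
code. It assumes, as stated above, that the deferred-list state lives in
the folio's third page (index 2), so an order-1 folio, which has only two
pages, has nowhere to hold it. struct guard_page, struct guard_folio and
guard_on_deferred_list() are hypothetical names made up for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-page slot; only index 2 carries the list state here. */
struct guard_page {
	bool deferred_list_nonempty;
};

/* Hypothetical folio: 'pages' points at (1 << order) entries. */
struct guard_folio {
	unsigned int order;
	struct guard_page *pages;
};

/*
 * Check the order before touching pages[2]. For order 0 and order 1
 * folios that index would lie past the end of the array.
 */
static bool guard_on_deferred_list(const struct guard_folio *folio)
{
	assert(folio != NULL);

	if (folio->order <= 1)
		return false;
	return folio->pages[2].deferred_list_nonempty;
}
```

In this model the order test comes first, so the out-of-range slot is
never read for small folios.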

-- 
Cheers,

David


