* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
2026-03-10 10:54 [PATCH v2] mm: migrate: requeue destination folio on deferred split queue Usama Arif
@ 2026-03-10 16:52 ` Zi Yan
2026-03-11 9:23 ` David Hildenbrand (Arm)
` (2 subsequent siblings)
3 siblings, 0 replies; 10+ messages in thread
From: Zi Yan @ 2026-03-10 16:52 UTC (permalink / raw)
To: Usama Arif
Cc: Andrew Morton, npache, david, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On 10 Mar 2026, at 6:54, Usama Arif wrote:
> During folio migration, __folio_migrate_mapping() removes the source
> folio from the deferred split queue, but the destination folio is never
> re-queued. This causes underutilized THPs to escape the shrinker after
> NUMA migration, since they silently drop off the deferred split list.
>
> Fix this by recording whether the source folio was on the deferred split
> queue and its partially mapped state before move_to_new_folio() unqueues
> it, and re-queuing the destination folio after a successful migration if
> it was.
>
> By the time migrate_folio_move() runs, partially mapped folios without a
> pin have already been split by migrate_pages_batch(). So only two cases
> remain on the deferred list at this point:
> 1. Partially mapped folios with a pin (split failed).
> 2. Fully mapped but potentially underused folios.
> The recorded partially_mapped state is forwarded to deferred_split_folio()
> so that the destination folio is correctly re-queued in both cases.
>
> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
> Fixes: dafff3f4c850 ("mm: split underused THPs")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2:
> - record whether source folio was on the deferred split queue before
> move_to_folio() (David)
> - record partially mapped state and update commit message (Zi)
> ---
> mm/migrate.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
LGTM.
Acked-by: Zi Yan <ziy@nvidia.com>
Best Regards,
Yan, Zi
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: David Hildenbrand (Arm) @ 2026-03-11 9:23 UTC (permalink / raw)
To: Usama Arif, Andrew Morton, npache, ziy, willy, linux-mm
Cc: matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
gourry, ying.huang, apopple, linux-kernel, kernel-team
On 3/10/26 11:54, Usama Arif wrote:
> During folio migration, __folio_migrate_mapping() removes the source
> folio from the deferred split queue, but the destination folio is never
> re-queued. This causes underutilized THPs to escape the shrinker after
> NUMA migration, since they silently drop off the deferred split list.
>
> Fix this by recording whether the source folio was on the deferred split
> queue and its partially mapped state before move_to_new_folio() unqueues
> it, and re-queuing the destination folio after a successful migration if
> it was.
>
> By the time migrate_folio_move() runs, partially mapped folios without a
> pin have already been split by migrate_pages_batch(). So only two cases
> remain on the deferred list at this point:
> 1. Partially mapped folios with a pin (split failed).
> 2. Fully mapped but potentially underused folios.
> The recorded partially_mapped state is forwarded to deferred_split_folio()
> so that the destination folio is correctly re-queued in both cases.
>
> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
> Fixes: dafff3f4c850 ("mm: split underused THPs")
> Signed-off-by: Usama Arif <usama.arif@linux.dev>
> ---
> v1 -> v2:
> - record whether source folio was on the deferred split queue before
> move_to_folio() (David)
> - record partially mapped state and update commit message (Zi)
> ---
> mm/migrate.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/mm/migrate.c b/mm/migrate.c
> index ece77ccb2ec0..61013d258eb4 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> int rc;
> int old_page_state = 0;
> struct anon_vma *anon_vma = NULL;
> + bool src_deferred_split = false;
> + bool src_partially_mapped = false;
> struct list_head *prev;
>
> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> goto out_unlock_both;
> }
>
> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
I don't think the folio_test_large_rmappable() check is required. Other
folios we migrate here would always have _deferred_list initialized but
unused.
Acked-by: David Hildenbrand (Arm) <david@kernel.org>
--
Cheers,
David
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: Usama Arif @ 2026-03-11 13:25 UTC (permalink / raw)
To: David Hildenbrand (Arm), Andrew Morton, npache, ziy, willy,
linux-mm
Cc: matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
gourry, ying.huang, apopple, linux-kernel, kernel-team
On 11/03/2026 12:23, David Hildenbrand (Arm) wrote:
> On 3/10/26 11:54, Usama Arif wrote:
>> During folio migration, __folio_migrate_mapping() removes the source
>> folio from the deferred split queue, but the destination folio is never
>> re-queued. This causes underutilized THPs to escape the shrinker after
>> NUMA migration, since they silently drop off the deferred split list.
>>
>> Fix this by recording whether the source folio was on the deferred split
>> queue and its partially mapped state before move_to_new_folio() unqueues
>> it, and re-queuing the destination folio after a successful migration if
>> it was.
>>
>> By the time migrate_folio_move() runs, partially mapped folios without a
>> pin have already been split by migrate_pages_batch(). So only two cases
>> remain on the deferred list at this point:
>> 1. Partially mapped folios with a pin (split failed).
>> 2. Fully mapped but potentially underused folios.
>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>> so that the destination folio is correctly re-queued in both cases.
>>
>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2:
>> - record whether source folio was on the deferred split queue before
>> move_to_folio() (David)
>> - record partially mapped state and update commit message (Zi)
>> ---
>> mm/migrate.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ece77ccb2ec0..61013d258eb4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> int rc;
>> int old_page_state = 0;
>> struct anon_vma *anon_vma = NULL;
>> + bool src_deferred_split = false;
>> + bool src_partially_mapped = false;
>> struct list_head *prev;
>>
>> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> goto out_unlock_both;
>> }
>>
>> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>
> I don't think the folio_test_large_rmappable() check is required. Other
> folios we migrate here would always have _deferred_list initialized but
> unused.
>
> Acked-by: David Hildenbrand (Arm) <david@kernel.org>
>
I have been auditing the THP shrinker code with respect to NUMA migration, and
I think we need another fix for this. I have sent it here:
https://lore.kernel.org/all/20260311132342.3193160-1-usama.arif@linux.dev/
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: Wei Yang @ 2026-03-12 3:18 UTC (permalink / raw)
To: Usama Arif
Cc: Andrew Morton, npache, david, ziy, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>During folio migration, __folio_migrate_mapping() removes the source
>folio from the deferred split queue, but the destination folio is never
>re-queued. This causes underutilized THPs to escape the shrinker after
>NUMA migration, since they silently drop off the deferred split list.
>
>Fix this by recording whether the source folio was on the deferred split
>queue and its partially mapped state before move_to_new_folio() unqueues
>it, and re-queuing the destination folio after a successful migration if
>it was.
>
>By the time migrate_folio_move() runs, partially mapped folios without a
>pin have already been split by migrate_pages_batch(). So only two cases
>remain on the deferred list at this point:
> 1. Partially mapped folios with a pin (split failed).
> 2. Fully mapped but potentially underused folios.
>The recorded partially_mapped state is forwarded to deferred_split_folio()
>so that the destination folio is correctly re-queued in both cases.
>
>Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>Fixes: dafff3f4c850 ("mm: split underused THPs")
>Signed-off-by: Usama Arif <usama.arif@linux.dev>
>---
>v1 -> v2:
>- record whether source folio was on the deferred split queue before
> move_to_folio() (David)
>- record partially mapped state and update commit message (Zi)
>---
> mm/migrate.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
>diff --git a/mm/migrate.c b/mm/migrate.c
>index ece77ccb2ec0..61013d258eb4 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> int rc;
> int old_page_state = 0;
> struct anon_vma *anon_vma = NULL;
>+ bool src_deferred_split = false;
>+ bool src_partially_mapped = false;
> struct list_head *prev;
>
> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>@@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> goto out_unlock_both;
> }
>
>+ if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>+ !data_race(list_empty(&src->_deferred_list))) {
We usually check order > 1 before accessing _deferred_list, because it is in
subpage 2.
I am not sure why we don't do that here. Am I missing something?
>+ src_deferred_split = true;
>+ src_partially_mapped = folio_test_partially_mapped(src);
>+ }
>+
> rc = move_to_new_folio(dst, src, mode);
> if (rc)
> goto out;
>@@ -1393,6 +1401,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> if (old_page_state & PAGE_WAS_MAPPED)
> remove_migration_ptes(src, dst, 0);
>
>+ /*
>+ * Requeue the destination folio on the deferred split queue if
>+ * the source was on the queue. The source is unqueued in
>+ * __folio_migrate_mapping(), so we recorded the state from
>+ * before move_to_new_folio().
>+ */
>+ if (src_deferred_split)
>+ deferred_split_folio(dst, src_partially_mapped);
>+
> out_unlock_both:
> folio_unlock(dst);
> folio_set_owner_migrate_reason(dst, reason);
>--
>2.47.3
>
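For illustration only, the two hunks quoted above amount to a record-then-requeue pattern: sample the source's queue state before the move unqueues it, then apply that state to the destination once the move has succeeded. The following is a minimal user-space sketch of that idea, not kernel code. `struct model_folio`, `model_move()`, `model_requeue()` and `model_migrate()` are hypothetical names that only loosely stand in for `struct folio`, `move_to_new_folio()`, `deferred_split_folio()` and `migrate_folio_move()`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; NOT the kernel's struct folio. */
struct model_folio {
	bool on_deferred_list;	/* models !list_empty(&folio->_deferred_list) */
	bool partially_mapped;	/* models folio_test_partially_mapped() */
};

/*
 * Models the move step. On success the source drops off the queue,
 * mirroring the unqueue described for __folio_migrate_mapping().
 * Returns 0 on success, -1 on (simulated) failure.
 */
static int model_move(struct model_folio *dst, struct model_folio *src,
		      bool fail)
{
	assert(dst != NULL && src != NULL);
	if (fail)
		return -1;
	src->on_deferred_list = false;
	return 0;
}

/* Models deferred_split_folio(dst, partially_mapped). */
static void model_requeue(struct model_folio *dst, bool partially_mapped)
{
	dst->on_deferred_list = true;
	dst->partially_mapped = partially_mapped;
}

/* Record the source state first, move, then requeue the destination. */
static int model_migrate(struct model_folio *dst, struct model_folio *src,
			 bool fail)
{
	bool src_deferred_split = false;
	bool src_partially_mapped = false;
	int rc;

	if (src->on_deferred_list) {
		src_deferred_split = true;
		src_partially_mapped = src->partially_mapped;
	}

	rc = model_move(dst, src, fail);
	if (rc)
		return rc;	/* failed move: destination is not requeued */

	if (src_deferred_split)
		model_requeue(dst, src_partially_mapped);
	return 0;
}
```

Calling `model_migrate()` with a queued, partially mapped source leaves the destination queued with the same partially mapped state. A source that was never queued leaves the destination unqueued.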
--
Wei Yang
Help you, Help me
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: David Hildenbrand (Arm) @ 2026-03-12 8:26 UTC (permalink / raw)
To: Wei Yang, Usama Arif
Cc: Andrew Morton, npache, ziy, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On 3/12/26 04:18, Wei Yang wrote:
> On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>> During folio migration, __folio_migrate_mapping() removes the source
>> folio from the deferred split queue, but the destination folio is never
>> re-queued. This causes underutilized THPs to escape the shrinker after
>> NUMA migration, since they silently drop off the deferred split list.
>>
>> Fix this by recording whether the source folio was on the deferred split
>> queue and its partially mapped state before move_to_new_folio() unqueues
>> it, and re-queuing the destination folio after a successful migration if
>> it was.
>>
>> By the time migrate_folio_move() runs, partially mapped folios without a
>> pin have already been split by migrate_pages_batch(). So only two cases
>> remain on the deferred list at this point:
>> 1. Partially mapped folios with a pin (split failed).
>> 2. Fully mapped but potentially underused folios.
>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>> so that the destination folio is correctly re-queued in both cases.
>>
>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2:
>> - record whether source folio was on the deferred split queue before
>> move_to_folio() (David)
>> - record partially mapped state and update commit message (Zi)
>> ---
>> mm/migrate.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ece77ccb2ec0..61013d258eb4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> int rc;
>> int old_page_state = 0;
>> struct anon_vma *anon_vma = NULL;
>> + bool src_deferred_split = false;
>> + bool src_partially_mapped = false;
>> struct list_head *prev;
>>
>> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> goto out_unlock_both;
>> }
>>
>> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>> + !data_race(list_empty(&src->_deferred_list))) {
>
> We usually check order > 1, before accessing _deferred_list, because it is in
> subpage 2.
>
> I am not sure why we don't do it here. Do I miss something?
Valid point! Non-anon folios could trigger that.
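
As a rough illustration of the guard being discussed (a user-space sketch, not the actual fix): if _deferred_list sits in subpage 2, it only exists for folios with at least three pages, i.e. order > 1. The struct and helper names below are hypothetical and model only the fields needed to show the ordering of the checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; NOT the kernel's struct folio. */
struct model_folio {
	unsigned int order;		/* folio spans 1 << order pages */
	bool deferred_list_nonempty;	/* only meaningful when order > 1 */
};

/* The list head is assumed to live in subpage 2, so it needs 3+ pages. */
static bool model_has_deferred_list(const struct model_folio *f)
{
	assert(f != NULL);
	return f->order > 1;
}

/*
 * Check the order before looking at the list state, so that order-0 and
 * order-1 folios never have the (non-existent) list head inspected.
 */
static bool model_src_on_deferred_list(const struct model_folio *f)
{
	if (!model_has_deferred_list(f))
		return false;
	return f->deferred_list_nonempty;
}
```

With this ordering, an order-1 folio reports "not queued" regardless of what happens to be stored where the list head would otherwise be.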
--
Cheers,
David
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: Wei Yang @ 2026-06-20 7:27 UTC (permalink / raw)
To: Usama Arif
Cc: Andrew Morton, npache, david, ziy, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>During folio migration, __folio_migrate_mapping() removes the source
>folio from the deferred split queue, but the destination folio is never
>re-queued. This causes underutilized THPs to escape the shrinker after
>NUMA migration, since they silently drop off the deferred split list.
>
>Fix this by recording whether the source folio was on the deferred split
>queue and its partially mapped state before move_to_new_folio() unqueues
>it, and re-queuing the destination folio after a successful migration if
>it was.
>
>By the time migrate_folio_move() runs, partially mapped folios without a
>pin have already been split by migrate_pages_batch(). So only two cases
>remain on the deferred list at this point:
> 1. Partially mapped folios with a pin (split failed).
> 2. Fully mapped but potentially underused folios.
>The recorded partially_mapped state is forwarded to deferred_split_folio()
>so that the destination folio is correctly re-queued in both cases.
>
>Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>Fixes: dafff3f4c850 ("mm: split underused THPs")
>Signed-off-by: Usama Arif <usama.arif@linux.dev>
>---
>v1 -> v2:
>- record whether source folio was on the deferred split queue before
> move_to_folio() (David)
>- record partially mapped state and update commit message (Zi)
>---
> mm/migrate.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
>diff --git a/mm/migrate.c b/mm/migrate.c
>index ece77ccb2ec0..61013d258eb4 100644
>--- a/mm/migrate.c
>+++ b/mm/migrate.c
>@@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> int rc;
> int old_page_state = 0;
> struct anon_vma *anon_vma = NULL;
>+ bool src_deferred_split = false;
>+ bool src_partially_mapped = false;
> struct list_head *prev;
>
> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>@@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> goto out_unlock_both;
> }
>
>+ if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>+ !data_race(list_empty(&src->_deferred_list))) {
>+ src_deferred_split = true;
>+ src_partially_mapped = folio_test_partially_mapped(src);
>+ }
Hi, Usama
I am afraid there may be a race between migration and defer_split.
        A                                   B
migrate_pages_batch                 deferred_split_scan
  migrate_folio_unmap                 list_del_init(&folio->_deferred_list)
    folio_lock/folio_trylock

  migrate_folios_move
    migrate_folio_move
      list_empty(&src->_deferred_list)
                                      folio_trylock()
                                      requeue:
If the list_empty() check happens after the folio has been removed from the
deferred list but before it is requeued, we will miss this folio.
The behavior was the same before commit fafaeceb89a5, IIUC.
Do you think this can happen?
>+
> rc = move_to_new_folio(dst, src, mode);
> if (rc)
> goto out;
>@@ -1393,6 +1401,15 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
> if (old_page_state & PAGE_WAS_MAPPED)
> remove_migration_ptes(src, dst, 0);
>
>+ /*
>+ * Requeue the destination folio on the deferred split queue if
>+ * the source was on the queue. The source is unqueued in
>+ * __folio_migrate_mapping(), so we recorded the state from
>+ * before move_to_new_folio().
>+ */
>+ if (src_deferred_split)
>+ deferred_split_folio(dst, src_partially_mapped);
>+
> out_unlock_both:
> folio_unlock(dst);
> folio_set_owner_migrate_reason(dst, reason);
>--
>2.47.3
>
--
Wei Yang
Help you, Help me
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: David Hildenbrand (Arm) @ 2026-06-22 9:16 UTC (permalink / raw)
To: Wei Yang, Usama Arif
Cc: Andrew Morton, npache, ziy, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On 6/20/26 09:27, Wei Yang wrote:
> On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>> During folio migration, __folio_migrate_mapping() removes the source
>> folio from the deferred split queue, but the destination folio is never
>> re-queued. This causes underutilized THPs to escape the shrinker after
>> NUMA migration, since they silently drop off the deferred split list.
>>
>> Fix this by recording whether the source folio was on the deferred split
>> queue and its partially mapped state before move_to_new_folio() unqueues
>> it, and re-queuing the destination folio after a successful migration if
>> it was.
>>
>> By the time migrate_folio_move() runs, partially mapped folios without a
>> pin have already been split by migrate_pages_batch(). So only two cases
>> remain on the deferred list at this point:
>> 1. Partially mapped folios with a pin (split failed).
>> 2. Fully mapped but potentially underused folios.
>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>> so that the destination folio is correctly re-queued in both cases.
>>
>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>> ---
>> v1 -> v2:
>> - record whether source folio was on the deferred split queue before
>> move_to_folio() (David)
>> - record partially mapped state and update commit message (Zi)
>> ---
>> mm/migrate.c | 17 +++++++++++++++++
>> 1 file changed, 17 insertions(+)
>>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index ece77ccb2ec0..61013d258eb4 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> int rc;
>> int old_page_state = 0;
>> struct anon_vma *anon_vma = NULL;
>> + bool src_deferred_split = false;
>> + bool src_partially_mapped = false;
>> struct list_head *prev;
>>
>> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>> goto out_unlock_both;
>> }
>>
>> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>> + !data_race(list_empty(&src->_deferred_list))) {
>> + src_deferred_split = true;
>> + src_partially_mapped = folio_test_partially_mapped(src);
>> + }
>
> Hi, Usama
>
> I am afraid there maybe a race between migration and defer_split.
>
>         A                                   B
> migrate_pages_batch                 deferred_split_scan
>   migrate_folio_unmap                 list_del_init(&folio->_deferred_list)
>     folio_lock/folio_trylock
>
>   migrate_folios_move
>     migrate_folio_move
>       list_empty(&src->_deferred_list)
>                                       folio_trylock()
>                                       requeue:
>
> In case list_empty() check happens after folio removed from defer_list but
> before requeued, we will miss this folio.
deferred_split_isolate() would grab a reference through folio_try_get().
How can we migrate a folio with a raised refcount?
--
Cheers,
David
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: Wei Yang @ 2026-06-22 13:43 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Wei Yang, Usama Arif, Andrew Morton, npache, ziy, willy, linux-mm,
matthew.brost, joshua.hahnjy, hannes, rakie.kim, byungchul,
gourry, ying.huang, apopple, linux-kernel, kernel-team
On Mon, Jun 22, 2026 at 11:16:39AM +0200, David Hildenbrand (Arm) wrote:
>On 6/20/26 09:27, Wei Yang wrote:
>> On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>>> During folio migration, __folio_migrate_mapping() removes the source
>>> folio from the deferred split queue, but the destination folio is never
>>> re-queued. This causes underutilized THPs to escape the shrinker after
>>> NUMA migration, since they silently drop off the deferred split list.
>>>
>>> Fix this by recording whether the source folio was on the deferred split
>>> queue and its partially mapped state before move_to_new_folio() unqueues
>>> it, and re-queuing the destination folio after a successful migration if
>>> it was.
>>>
>>> By the time migrate_folio_move() runs, partially mapped folios without a
>>> pin have already been split by migrate_pages_batch(). So only two cases
>>> remain on the deferred list at this point:
>>> 1. Partially mapped folios with a pin (split failed).
>>> 2. Fully mapped but potentially underused folios.
>>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>>> so that the destination folio is correctly re-queued in both cases.
>>>
>>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>>> ---
>>> v1 -> v2:
>>> - record whether source folio was on the deferred split queue before
>>> move_to_folio() (David)
>>> - record partially mapped state and update commit message (Zi)
>>> ---
>>> mm/migrate.c | 17 +++++++++++++++++
>>> 1 file changed, 17 insertions(+)
>>>
>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>> index ece77ccb2ec0..61013d258eb4 100644
>>> --- a/mm/migrate.c
>>> +++ b/mm/migrate.c
>>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>> int rc;
>>> int old_page_state = 0;
>>> struct anon_vma *anon_vma = NULL;
>>> + bool src_deferred_split = false;
>>> + bool src_partially_mapped = false;
>>> struct list_head *prev;
>>>
>>> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>> goto out_unlock_both;
>>> }
>>>
>>> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>>> + !data_race(list_empty(&src->_deferred_list))) {
>>> + src_deferred_split = true;
>>> + src_partially_mapped = folio_test_partially_mapped(src);
>>> + }
>>
>> Hi, Usama
>>
>> I am afraid there maybe a race between migration and defer_split.
>>
>>         A                                   B
>> migrate_pages_batch                 deferred_split_scan
>>   migrate_folio_unmap                 list_del_init(&folio->_deferred_list)
>>     folio_lock/folio_trylock
>>
>>   migrate_folios_move
>>     migrate_folio_move
>>       list_empty(&src->_deferred_list)
>>                                       folio_trylock()
>>                                       requeue:
>>
>> In case list_empty() check happens after folio removed from defer_list but
>> before requeued, we will miss this folio.
>
>deferred_split_isolate() would grab a reference through folio_try_get().
>
>How can we migrate a folio with a raised refcount?
>
Thanks, I missed the expected_refcount check in __migrate_folio().
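
To make the argument concrete, here is a small user-space sketch (an assumption-laden model, not the kernel implementation) of why the window in the diagram cannot lead to a missed folio: the scanner takes an extra reference before the folio leaves the list, and migration backs off whenever the refcount differs from what it expects. All names below are hypothetical. The -EAGAIN return value is chosen for the model and is not taken from this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; NOT the kernel's struct folio. */
struct model_folio {
	int refcount;		/* current reference count */
	int expected_refs;	/* what migration expects to see */
	bool on_deferred_list;
};

/* Models scanner isolation: take a reference, then leave the list. */
static bool model_isolate(struct model_folio *f)
{
	assert(f != NULL);
	if (f->refcount == 0)
		return false;	/* models a failed folio_try_get() */
	f->refcount++;
	f->on_deferred_list = false;
	return true;
}

/* Models the scanner giving up: requeue first, then drop the reference. */
static void model_putback(struct model_folio *f)
{
	f->on_deferred_list = true;
	f->refcount--;
}

/* Models the expected-refcount check made before migrating. */
static int model_migrate_check(const struct model_folio *f)
{
	if (f->refcount != f->expected_refs)
		return -EAGAIN;	/* extra reference held: back off */
	return 0;
}
```

In this model, every moment at which the folio is off the list because of the scanner is also a moment at which the refcount is raised, so the migration check fails rather than observing an empty list and proceeding.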
>--
>Cheers,
>
>David
--
Wei Yang
Help you, Help me
* Re: [PATCH v2] mm: migrate: requeue destination folio on deferred split queue
From: Usama Arif @ 2026-06-22 16:44 UTC (permalink / raw)
To: Wei Yang, David Hildenbrand (Arm)
Cc: Andrew Morton, npache, ziy, willy, linux-mm, matthew.brost,
joshua.hahnjy, hannes, rakie.kim, byungchul, gourry, ying.huang,
apopple, linux-kernel, kernel-team
On 22/06/2026 14:43, Wei Yang wrote:
> On Mon, Jun 22, 2026 at 11:16:39AM +0200, David Hildenbrand (Arm) wrote:
>> On 6/20/26 09:27, Wei Yang wrote:
>>> On Tue, Mar 10, 2026 at 03:54:19AM -0700, Usama Arif wrote:
>>>> During folio migration, __folio_migrate_mapping() removes the source
>>>> folio from the deferred split queue, but the destination folio is never
>>>> re-queued. This causes underutilized THPs to escape the shrinker after
>>>> NUMA migration, since they silently drop off the deferred split list.
>>>>
>>>> Fix this by recording whether the source folio was on the deferred split
>>>> queue and its partially mapped state before move_to_new_folio() unqueues
>>>> it, and re-queuing the destination folio after a successful migration if
>>>> it was.
>>>>
>>>> By the time migrate_folio_move() runs, partially mapped folios without a
>>>> pin have already been split by migrate_pages_batch(). So only two cases
>>>> remain on the deferred list at this point:
>>>> 1. Partially mapped folios with a pin (split failed).
>>>> 2. Fully mapped but potentially underused folios.
>>>> The recorded partially_mapped state is forwarded to deferred_split_folio()
>>>> so that the destination folio is correctly re-queued in both cases.
>>>>
>>>> Reported-by: Johannes Weiner <hannes@cmpxchg.org>
>>>> Fixes: dafff3f4c850 ("mm: split underused THPs")
>>>> Signed-off-by: Usama Arif <usama.arif@linux.dev>
>>>> ---
>>>> v1 -> v2:
>>>> - record whether source folio was on the deferred split queue before
>>>> move_to_folio() (David)
>>>> - record partially mapped state and update commit message (Zi)
>>>> ---
>>>> mm/migrate.c | 17 +++++++++++++++++
>>>> 1 file changed, 17 insertions(+)
>>>>
>>>> diff --git a/mm/migrate.c b/mm/migrate.c
>>>> index ece77ccb2ec0..61013d258eb4 100644
>>>> --- a/mm/migrate.c
>>>> +++ b/mm/migrate.c
>>>> @@ -1360,6 +1360,8 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>>> int rc;
>>>> int old_page_state = 0;
>>>> struct anon_vma *anon_vma = NULL;
>>>> + bool src_deferred_split = false;
>>>> + bool src_partially_mapped = false;
>>>> struct list_head *prev;
>>>>
>>>> __migrate_folio_extract(dst, &old_page_state, &anon_vma);
>>>> @@ -1373,6 +1375,12 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
>>>> goto out_unlock_both;
>>>> }
>>>>
>>>> + if (folio_test_large(src) && folio_test_large_rmappable(src) &&
>>>> + !data_race(list_empty(&src->_deferred_list))) {
>>>> + src_deferred_split = true;
>>>> + src_partially_mapped = folio_test_partially_mapped(src);
>>>> + }
>>>
>>> Hi, Usama
>>>
>>> I am afraid there maybe a race between migration and defer_split.
>>>
>>>         A                                   B
>>> migrate_pages_batch                 deferred_split_scan
>>>   migrate_folio_unmap                 list_del_init(&folio->_deferred_list)
>>>     folio_lock/folio_trylock
>>>
>>>   migrate_folios_move
>>>     migrate_folio_move
>>>       list_empty(&src->_deferred_list)
>>>                                       folio_trylock()
>>>                                       requeue:
>>>
>>> In case list_empty() check happens after folio removed from defer_list but
>>> before requeued, we will miss this folio.
>>
>> deferred_split_isolate() would grab a reference through folio_try_get().
>>
>> How can we migrate a folio with a raised refcount?
>>
>
> Thanks, I missed expected_refcount check in __migrate_folio().
>
Thanks David for pointing it out! I have just started looking at the
mailing list for today :)
>> --
>> Cheers,
>>
>> David
>