* [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
@ 2026-05-06 1:21 Ye Liu
2026-05-06 2:05 ` Lance Yang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Ye Liu @ 2026-05-06 1:21 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Xin Hao
Cc: Ye Liu, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Andrew Morton,
linux-mm, linux-kernel
From: Ye Liu <liuye@kylinos.cn>
__khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
corresponding mm_slot. If mm_slot_alloc() fails, the function
returns with the flag set but without inserting the mm into the
khugepaged tracking structures.
This leaves the mm in an inconsistent state: it is marked as
registered (MMF_VM_HUGEPAGE set), but will never be scanned by
khugepaged. Future attempts to register the mm are skipped since
khugepaged_enter_vma() checks the flag and returns early.
Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
restoring the ability to retry registration later.
Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
Signed-off-by: Ye Liu <liuye@kylinos.cn>
---
Changes since v1:
- Add Fixes tag as suggested by Dev Jain and Lance Yang
mm/khugepaged.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7d48d4fbd5f3..60ab7c1b61dd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
return;
slot = mm_slot_alloc(mm_slot_cache);
- if (!slot)
+ if (!slot) {
+ mm_flags_clear(MMF_VM_HUGEPAGE, mm);
return;
+ }
spin_lock(&khugepaged_mm_lock);
mm_slot_insert(mm_slots_hash, mm, slot);
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu
@ 2026-05-06 2:05 ` Lance Yang
2026-05-06 5:17 ` Baolin Wang
2026-05-06 6:46 ` Dev Jain
2 siblings, 0 replies; 10+ messages in thread
From: Lance Yang @ 2026-05-06 2:05 UTC (permalink / raw)
To: ye.liu
Cc: akpm, david, ljs, xhao, liuye, ziy, baolin.wang, liam, npache,
ryan.roberts, dev.jain, baohua, lance.yang, akpm, linux-mm,
linux-kernel
On Wed, May 06, 2026 at 09:21:30AM +0800, Ye Liu wrote:
>From: Ye Liu <liuye@kylinos.cn>
>
>__khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
>corresponding mm_slot. If mm_slot_alloc() fails, the function
>returns with the flag set but without inserting the mm into the
>khugepaged tracking structures.
>
>This leaves the mm in an inconsistent state: it is marked as
>registered (MMF_VM_HUGEPAGE set), but will never be scanned by
>khugepaged. Future attempts to register the mm are skipped since
>khugepaged_enter_vma() checks the flag and returns early.
>
>Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
>restoring the ability to retry registration later.
>
>Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
Nit: the title does not exactly match the original commit subject.
It should be
Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for "struct mm_slot"")
which includes quotes around "struct mm_slot".
No need to resend; I think Andrew can fix it up when applying :)
>Signed-off-by: Ye Liu <liuye@kylinos.cn>
>---
Thanks, LGTM.
Reviewed-by: Lance Yang <lance.yang@linux.dev>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu
2026-05-06 2:05 ` Lance Yang
@ 2026-05-06 5:17 ` Baolin Wang
2026-05-06 6:46 ` Dev Jain
2 siblings, 0 replies; 10+ messages in thread
From: Baolin Wang @ 2026-05-06 5:17 UTC (permalink / raw)
To: Ye Liu, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Xin Hao
Cc: Ye Liu, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts,
Dev Jain, Barry Song, Lance Yang, Andrew Morton, linux-mm,
linux-kernel
On 5/6/26 9:21 AM, Ye Liu wrote:
> From: Ye Liu <liuye@kylinos.cn>
>
> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
> corresponding mm_slot. If mm_slot_alloc() fails, the function
> returns with the flag set but without inserting the mm into the
> khugepaged tracking structures.
>
> This leaves the mm in an inconsistent state: it is marked as
> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
> khugepaged. Future attempts to register the mm are skipped since
> khugepaged_enter_vma() checks the flag and returns early.
>
> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
> restoring the ability to retry registration later.
>
> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
> Signed-off-by: Ye Liu <liuye@kylinos.cn>
> ---
Good catch. Feel free to add:
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu
2026-05-06 2:05 ` Lance Yang
2026-05-06 5:17 ` Baolin Wang
@ 2026-05-06 6:46 ` Dev Jain
2026-05-06 10:51 ` Lance Yang
2 siblings, 1 reply; 10+ messages in thread
From: Dev Jain @ 2026-05-06 6:46 UTC (permalink / raw)
To: Ye Liu, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
Xin Hao
Cc: Ye Liu, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Barry Song, Lance Yang, Andrew Morton, linux-mm,
linux-kernel
On 06/05/26 6:51 am, Ye Liu wrote:
> From: Ye Liu <liuye@kylinos.cn>
>
> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
> corresponding mm_slot. If mm_slot_alloc() fails, the function
> returns with the flag set but without inserting the mm into the
> khugepaged tracking structures.
>
> This leaves the mm in an inconsistent state: it is marked as
> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
> khugepaged. Future attempts to register the mm are skipped since
> khugepaged_enter_vma() checks the flag and returns early.
>
> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
> restoring the ability to retry registration later.
>
> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
> Signed-off-by: Ye Liu <liuye@kylinos.cn>
> ---
> Changes since v1:
> - Add Fixes tag as suggested by Dev Jain and Lance Yang
>
> mm/khugepaged.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 7d48d4fbd5f3..60ab7c1b61dd 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
> return;
>
> slot = mm_slot_alloc(mm_slot_cache);
> - if (!slot)
> + if (!slot) {
> + mm_flags_clear(MMF_VM_HUGEPAGE, mm);
> return;
> + }
Note that, a racing khugepaged_enter_vma() may back off
when it sees that MMF_VM_HUGEPAGE is set, but then the above
clears the flag after slot alloc failure. So we end up not
registering the mm with khugepaged. But I am sure no one
cares, we are in much big trouble if slot alloc is failing.
Although one could argue the same about this patch, I will
still say it is important to fix "flag is set but not registered
with khugepaged" because that just feels wrong.
Reviewed-by: Dev Jain <dev.jain@arm.com>
>
> spin_lock(&khugepaged_mm_lock);
> mm_slot_insert(mm_slots_hash, mm, slot);
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-06 6:46 ` Dev Jain
@ 2026-05-06 10:51 ` Lance Yang
2026-05-08 21:41 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 10+ messages in thread
From: Lance Yang @ 2026-05-06 10:51 UTC (permalink / raw)
To: dev.jain
Cc: ye.liu, akpm, david, ljs, xhao, liuye, ziy, baolin.wang, liam,
npache, ryan.roberts, baohua, lance.yang, akpm, linux-mm,
linux-kernel
On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote:
>
>
>On 06/05/26 6:51 am, Ye Liu wrote:
>> From: Ye Liu <liuye@kylinos.cn>
>>
>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
>> corresponding mm_slot. If mm_slot_alloc() fails, the function
>> returns with the flag set but without inserting the mm into the
>> khugepaged tracking structures.
>>
>> This leaves the mm in an inconsistent state: it is marked as
>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
>> khugepaged. Future attempts to register the mm are skipped since
>> khugepaged_enter_vma() checks the flag and returns early.
>>
>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
>> restoring the ability to retry registration later.
>>
>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
>> Signed-off-by: Ye Liu <liuye@kylinos.cn>
>> ---
>> Changes since v1:
>> - Add Fixes tag as suggested by Dev Jain and Lance Yang
>>
>> mm/khugepaged.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 7d48d4fbd5f3..60ab7c1b61dd 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
>> return;
>>
>> slot = mm_slot_alloc(mm_slot_cache);
>> - if (!slot)
>> + if (!slot) {
>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm);
>> return;
>> + }
>
>Note that, a racing khugepaged_enter_vma() may back off
>when it sees that MMF_VM_HUGEPAGE is set, but then the above
>clears the flag after slot alloc failure. So we end up not
>registering the mm with khugepaged. But I am sure no one
>cares, we are in much big trouble if slot alloc is failing.
Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
and return, then !slot clears it again. If there is no later
khugepaged_enter_vma(), the mm still wouldn't get registered :)
>Although one could argue the same about this patch, I will
>still say it is important to fix "flag is set but not registered
>with khugepaged" because that just feels wrong.
+1 that is still better then keeping MMF_VM_HUGEPAGE set forever: any
later attempt can retry and install the mm_slot :)
Cheers, Lance
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-06 10:51 ` Lance Yang
@ 2026-05-08 21:41 ` David Hildenbrand (Arm)
2026-05-09 1:57 ` Lance Yang
2026-05-11 4:00 ` Dev Jain
0 siblings, 2 replies; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-08 21:41 UTC (permalink / raw)
To: Lance Yang, dev.jain
Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache,
ryan.roberts, baohua, akpm, linux-mm, linux-kernel
On 5/6/26 12:51, Lance Yang wrote:
>
> On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote:
>>
>>
>> On 06/05/26 6:51 am, Ye Liu wrote:
>>> From: Ye Liu <liuye@kylinos.cn>
>>>
>>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
>>> corresponding mm_slot. If mm_slot_alloc() fails, the function
>>> returns with the flag set but without inserting the mm into the
>>> khugepaged tracking structures.
>>>
>>> This leaves the mm in an inconsistent state: it is marked as
>>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
>>> khugepaged. Future attempts to register the mm are skipped since
>>> khugepaged_enter_vma() checks the flag and returns early.
>>>
>>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
>>> restoring the ability to retry registration later.
>>>
>>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
>>> Signed-off-by: Ye Liu <liuye@kylinos.cn>
>>> ---
>>> Changes since v1:
>>> - Add Fixes tag as suggested by Dev Jain and Lance Yang
>>>
>>> mm/khugepaged.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 7d48d4fbd5f3..60ab7c1b61dd 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
>>> return;
>>>
>>> slot = mm_slot_alloc(mm_slot_cache);
>>> - if (!slot)
>>> + if (!slot) {
>>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm);
>>> return;
>>> + }
>>
>> Note that, a racing khugepaged_enter_vma() may back off
>> when it sees that MMF_VM_HUGEPAGE is set, but then the above
>> clears the flag after slot alloc failure. So we end up not
>> registering the mm with khugepaged. But I am sure no one
>> cares, we are in much big trouble if slot alloc is failing.
>
> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
> and return, then !slot clears it again. If there is no later
> khugepaged_enter_vma(), the mm still wouldn't get registered :)
So why not
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5f4e009593e0..78735f34250a 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm)
/* __khugepaged_exit() must not run from under us */
VM_BUG_ON_MM(collapse_test_exit(mm), mm);
- if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
- return;
slot = mm_slot_alloc(mm_slot_cache);
if (!slot)
return;
+ if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) {
+ mm_slot_free(mm_slot_cache, slot);
+ return;
+ }
+
spin_lock(&khugepaged_mm_lock);
mm_slot_insert(mm_slots_hash, mm, slot);
/*
Arguably, on the race described above, likely the thread seeing the
MMF_VM_HUGEPAGE would likely similarly have failed the allocation.
I'm fine with either, just wanted to raise the (cleaner looking?) alternative
where we just properly back off?
--
Cheers,
David
^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-08 21:41 ` David Hildenbrand (Arm)
@ 2026-05-09 1:57 ` Lance Yang
2026-05-11 4:00 ` Dev Jain
1 sibling, 0 replies; 10+ messages in thread
From: Lance Yang @ 2026-05-09 1:57 UTC (permalink / raw)
To: david
Cc: lance.yang, dev.jain, ye.liu, akpm, ljs, xhao, liuye, ziy,
baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm,
linux-kernel
On Fri, May 08, 2026 at 11:41:34PM +0200, David Hildenbrand (Arm) wrote:
>On 5/6/26 12:51, Lance Yang wrote:
>>
>> On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote:
>>>
>>>
>>> On 06/05/26 6:51 am, Ye Liu wrote:
>>>> From: Ye Liu <liuye@kylinos.cn>
>>>>
>>>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
>>>> corresponding mm_slot. If mm_slot_alloc() fails, the function
>>>> returns with the flag set but without inserting the mm into the
>>>> khugepaged tracking structures.
>>>>
>>>> This leaves the mm in an inconsistent state: it is marked as
>>>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
>>>> khugepaged. Future attempts to register the mm are skipped since
>>>> khugepaged_enter_vma() checks the flag and returns early.
>>>>
>>>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
>>>> restoring the ability to retry registration later.
>>>>
>>>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
>>>> Signed-off-by: Ye Liu <liuye@kylinos.cn>
>>>> ---
>>>> Changes since v1:
>>>> - Add Fixes tag as suggested by Dev Jain and Lance Yang
>>>>
>>>> mm/khugepaged.c | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>> index 7d48d4fbd5f3..60ab7c1b61dd 100644
>>>> --- a/mm/khugepaged.c
>>>> +++ b/mm/khugepaged.c
>>>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
>>>> return;
>>>>
>>>> slot = mm_slot_alloc(mm_slot_cache);
>>>> - if (!slot)
>>>> + if (!slot) {
>>>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm);
>>>> return;
>>>> + }
>>>
>>> Note that, a racing khugepaged_enter_vma() may back off
>>> when it sees that MMF_VM_HUGEPAGE is set, but then the above
>>> clears the flag after slot alloc failure. So we end up not
>>> registering the mm with khugepaged. But I am sure no one
>>> cares, we are in much big trouble if slot alloc is failing.
>>
>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
>> and return, then !slot clears it again. If there is no later
>> khugepaged_enter_vma(), the mm still wouldn't get registered :)
>
>So why not
>
>diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>index 5f4e009593e0..78735f34250a 100644
>--- a/mm/khugepaged.c
>+++ b/mm/khugepaged.c
>@@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm)
>
> /* __khugepaged_exit() must not run from under us */
> VM_BUG_ON_MM(collapse_test_exit(mm), mm);
>- if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
>- return;
>
> slot = mm_slot_alloc(mm_slot_cache);
> if (!slot)
> return;
>
>+ if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) {
>+ mm_slot_free(mm_slot_cache, slot);
>+ return;
>+ }
>+
> spin_lock(&khugepaged_mm_lock);
> mm_slot_insert(mm_slots_hash, mm, slot);
> /*
>
>
>Arguably, on the race described above, likely the thread seeing the
>MMF_VM_HUGEPAGE would likely similarly have failed the allocation.
Right, LGTM!
>I'm fine with either, just wanted to raise the (cleaner looking?) alternative
>where we just properly back off?
Dev suggested the same thing[1] on v1 as well. We should have gone that
way :)
Allocating the slot first and only setting MMF_VM_HUGEPAGE after that
makes the race go away. If mm_slot_alloc() fails, there is nothing to
undo.
[1] https://lore.kernel.org/linux-mm/aed7c1d5-2189-4ee2-b0f3-ce5a3e3c2118@arm.com/
Cheers, Lance
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-08 21:41 ` David Hildenbrand (Arm)
2026-05-09 1:57 ` Lance Yang
@ 2026-05-11 4:00 ` Dev Jain
2026-05-11 5:40 ` David Hildenbrand (Arm)
1 sibling, 1 reply; 10+ messages in thread
From: Dev Jain @ 2026-05-11 4:00 UTC (permalink / raw)
To: David Hildenbrand (Arm), Lance Yang
Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache,
ryan.roberts, baohua, akpm, linux-mm, linux-kernel
On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote:
> On 5/6/26 12:51, Lance Yang wrote:
>>
>> On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote:
>>>
>>>
>>> On 06/05/26 6:51 am, Ye Liu wrote:
>>>> From: Ye Liu <liuye@kylinos.cn>
>>>>
>>>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
>>>> corresponding mm_slot. If mm_slot_alloc() fails, the function
>>>> returns with the flag set but without inserting the mm into the
>>>> khugepaged tracking structures.
>>>>
>>>> This leaves the mm in an inconsistent state: it is marked as
>>>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by
>>>> khugepaged. Future attempts to register the mm are skipped since
>>>> khugepaged_enter_vma() checks the flag and returns early.
>>>>
>>>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
>>>> restoring the ability to retry registration later.
>>>>
>>>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
>>>> Signed-off-by: Ye Liu <liuye@kylinos.cn>
>>>> ---
>>>> Changes since v1:
>>>> - Add Fixes tag as suggested by Dev Jain and Lance Yang
>>>>
>>>> mm/khugepaged.c | 4 +++-
>>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>>> index 7d48d4fbd5f3..60ab7c1b61dd 100644
>>>> --- a/mm/khugepaged.c
>>>> +++ b/mm/khugepaged.c
>>>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
>>>> return;
>>>>
>>>> slot = mm_slot_alloc(mm_slot_cache);
>>>> - if (!slot)
>>>> + if (!slot) {
>>>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm);
>>>> return;
>>>> + }
>>>
>>> Note that, a racing khugepaged_enter_vma() may back off
>>> when it sees that MMF_VM_HUGEPAGE is set, but then the above
>>> clears the flag after slot alloc failure. So we end up not
>>> registering the mm with khugepaged. But I am sure no one
>>> cares, we are in much big trouble if slot alloc is failing.
>>
>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
>> and return, then !slot clears it again. If there is no later
>> khugepaged_enter_vma(), the mm still wouldn't get registered :)
>
> So why not
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 5f4e009593e0..78735f34250a 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm)
>
> /* __khugepaged_exit() must not run from under us */
> VM_BUG_ON_MM(collapse_test_exit(mm), mm);
> - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
> - return;
>
> slot = mm_slot_alloc(mm_slot_cache);
> if (!slot)
> return;
>
> + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) {
> + mm_slot_free(mm_slot_cache, slot);
> + return;
> + }
> +
> spin_lock(&khugepaged_mm_lock);
> mm_slot_insert(mm_slots_hash, mm, slot);
> /*
>
>
> Arguably, on the race described above, likely the thread seeing the
> MMF_VM_HUGEPAGE would likely similarly have failed the allocation.
>
> I'm fine with either, just wanted to raise the (cleaner looking?) alternative
> where we just properly back off?
Yes this is also fine - I am overthinking but I wasn't going this way because ...
A process doing THP allocations will fail on the mm_flags_test_and_set everytime
after the first time. So we will have an unconditional overhead of
mm_slot_alloc in the fault handler.
Again I am overthinking : )
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-11 4:00 ` Dev Jain
@ 2026-05-11 5:40 ` David Hildenbrand (Arm)
2026-05-11 5:44 ` Dev Jain
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-11 5:40 UTC (permalink / raw)
To: Dev Jain, Lance Yang
Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache,
ryan.roberts, baohua, akpm, linux-mm, linux-kernel
On 5/11/26 06:00, Dev Jain wrote:
>
>
> On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote:
>> On 5/6/26 12:51, Lance Yang wrote:
>>>
>>>
>>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
>>> and return, then !slot clears it again. If there is no later
>>> khugepaged_enter_vma(), the mm still wouldn't get registered :)
>>
>> So why not
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 5f4e009593e0..78735f34250a 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm)
>>
>> /* __khugepaged_exit() must not run from under us */
>> VM_BUG_ON_MM(collapse_test_exit(mm), mm);
>> - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
>> - return;
>>
>> slot = mm_slot_alloc(mm_slot_cache);
>> if (!slot)
>> return;
>>
>> + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) {
>> + mm_slot_free(mm_slot_cache, slot);
>> + return;
>> + }
>> +
>> spin_lock(&khugepaged_mm_lock);
>> mm_slot_insert(mm_slots_hash, mm, slot);
>> /*
>>
>>
>> Arguably, on the race described above, likely the thread seeing the
>> MMF_VM_HUGEPAGE would likely similarly have failed the allocation.
>>
>> I'm fine with either, just wanted to raise the (cleaner looking?) alternative
>> where we just properly back off?
>
> Yes this is also fine - I am overthinking but I wasn't going this way because ...
> A process doing THP allocations will fail on the mm_flags_test_and_set everytime
> after the first time.
We should perform a mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) test before
calling the function when the flag might not be set yet: in khugepaged_enter_vma()
khugepaged_fork() should only get called once per process.
Which makes sense, because mm_flags_test_and_set() is expensive.
--
Cheers,
David
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
2026-05-11 5:40 ` David Hildenbrand (Arm)
@ 2026-05-11 5:44 ` Dev Jain
0 siblings, 0 replies; 10+ messages in thread
From: Dev Jain @ 2026-05-11 5:44 UTC (permalink / raw)
To: David Hildenbrand (Arm), Lance Yang
Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache,
ryan.roberts, baohua, akpm, linux-mm, linux-kernel
On 11/05/26 11:10 am, David Hildenbrand (Arm) wrote:
> On 5/11/26 06:00, Dev Jain wrote:
>>
>>
>> On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote:
>>> On 5/6/26 12:51, Lance Yang wrote:
>>>>
>>>>
>>>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set
>>>> and return, then !slot clears it again. If there is no later
>>>> khugepaged_enter_vma(), the mm still wouldn't get registered :)
>>>
>>> So why not
>>>
>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>>> index 5f4e009593e0..78735f34250a 100644
>>> --- a/mm/khugepaged.c
>>> +++ b/mm/khugepaged.c
>>> @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm)
>>>
>>> /* __khugepaged_exit() must not run from under us */
>>> VM_BUG_ON_MM(collapse_test_exit(mm), mm);
>>> - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm)))
>>> - return;
>>>
>>> slot = mm_slot_alloc(mm_slot_cache);
>>> if (!slot)
>>> return;
>>>
>>> + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) {
>>> + mm_slot_free(mm_slot_cache, slot);
>>> + return;
>>> + }
>>> +
>>> spin_lock(&khugepaged_mm_lock);
>>> mm_slot_insert(mm_slots_hash, mm, slot);
>>> /*
>>>
>>>
>>> Arguably, on the race described above, likely the thread seeing the
>>> MMF_VM_HUGEPAGE would likely similarly have failed the allocation.
>>>
>>> I'm fine with either, just wanted to raise the (cleaner looking?) alternative
>>> where we just properly back off?
>>
>> Yes this is also fine - I am overthinking but I wasn't going this way because ...
>> A process doing THP allocations will fail on the mm_flags_test_and_set everytime
>> after the first time.
> We should perform a mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) test before
> calling the function when the flag might not be set yet: in khugepaged_enter_vma()
Ah that slipped my mind, you are right.
>
> khugepaged_fork() should only get called once per process.
>
> Which makes sense, because mm_flags_test_and_set() is expensive.
>
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-05-11 5:45 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu
2026-05-06 2:05 ` Lance Yang
2026-05-06 5:17 ` Baolin Wang
2026-05-06 6:46 ` Dev Jain
2026-05-06 10:51 ` Lance Yang
2026-05-08 21:41 ` David Hildenbrand (Arm)
2026-05-09 1:57 ` Lance Yang
2026-05-11 4:00 ` Dev Jain
2026-05-11 5:40 ` David Hildenbrand (Arm)
2026-05-11 5:44 ` Dev Jain
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox