* [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure
@ 2026-05-06 1:21 Ye Liu
2026-05-06 2:05 ` Lance Yang
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Ye Liu @ 2026-05-06 1:21 UTC (permalink / raw)
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Xin Hao
Cc: Ye Liu, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache,
Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Andrew Morton,
linux-mm, linux-kernel
From: Ye Liu <liuye@kylinos.cn>
__khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the
corresponding mm_slot. If mm_slot_alloc() fails, the function
returns with the flag set but without inserting the mm into the
khugepaged tracking structures.
This leaves the mm in an inconsistent state: it is marked as
registered (MMF_VM_HUGEPAGE set), but will never be scanned by
khugepaged. Future attempts to register the mm are skipped since
khugepaged_enter_vma() checks the flag and returns early.
Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails,
restoring the ability to retry registration later.
Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot")
Signed-off-by: Ye Liu <liuye@kylinos.cn>
---
Changes since v1:
- Add Fixes tag as suggested by Dev Jain and Lance Yang
mm/khugepaged.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 7d48d4fbd5f3..60ab7c1b61dd 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm)
return;
slot = mm_slot_alloc(mm_slot_cache);
- if (!slot)
+ if (!slot) {
+ mm_flags_clear(MMF_VM_HUGEPAGE, mm);
return;
+ }
spin_lock(&khugepaged_mm_lock);
mm_slot_insert(mm_slots_hash, mm, slot);
--
2.43.0
^ permalink raw reply related [flat|nested] 10+ messages in thread* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu @ 2026-05-06 2:05 ` Lance Yang 2026-05-06 5:17 ` Baolin Wang 2026-05-06 6:46 ` Dev Jain 2 siblings, 0 replies; 10+ messages in thread From: Lance Yang @ 2026-05-06 2:05 UTC (permalink / raw) To: ye.liu Cc: akpm, david, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang, akpm, linux-mm, linux-kernel On Wed, May 06, 2026 at 09:21:30AM +0800, Ye Liu wrote: >From: Ye Liu <liuye@kylinos.cn> > >__khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the >corresponding mm_slot. If mm_slot_alloc() fails, the function >returns with the flag set but without inserting the mm into the >khugepaged tracking structures. > >This leaves the mm in an inconsistent state: it is marked as >registered (MMF_VM_HUGEPAGE set), but will never be scanned by >khugepaged. Future attempts to register the mm are skipped since >khugepaged_enter_vma() checks the flag and returns early. > >Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, >restoring the ability to retry registration later. > >Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") Nit: the title does not exactly match the original commit subject. It should be Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for "struct mm_slot"") which includes quotes around "struct mm_slot". No need to resend; I think Andrew can fix it up when applying :) >Signed-off-by: Ye Liu <liuye@kylinos.cn> >--- Thanks, LGTM. Reviewed-by: Lance Yang <lance.yang@linux.dev> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu 2026-05-06 2:05 ` Lance Yang @ 2026-05-06 5:17 ` Baolin Wang 2026-05-06 6:46 ` Dev Jain 2 siblings, 0 replies; 10+ messages in thread From: Baolin Wang @ 2026-05-06 5:17 UTC (permalink / raw) To: Ye Liu, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Xin Hao Cc: Ye Liu, Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song, Lance Yang, Andrew Morton, linux-mm, linux-kernel On 5/6/26 9:21 AM, Ye Liu wrote: > From: Ye Liu <liuye@kylinos.cn> > > __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the > corresponding mm_slot. If mm_slot_alloc() fails, the function > returns with the flag set but without inserting the mm into the > khugepaged tracking structures. > > This leaves the mm in an inconsistent state: it is marked as > registered (MMF_VM_HUGEPAGE set), but will never be scanned by > khugepaged. Future attempts to register the mm are skipped since > khugepaged_enter_vma() checks the flag and returns early. > > Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, > restoring the ability to retry registration later. > > Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") > Signed-off-by: Ye Liu <liuye@kylinos.cn> > --- Good catch. Feel free to add: Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com> ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu 2026-05-06 2:05 ` Lance Yang 2026-05-06 5:17 ` Baolin Wang @ 2026-05-06 6:46 ` Dev Jain 2026-05-06 10:51 ` Lance Yang 2 siblings, 1 reply; 10+ messages in thread From: Dev Jain @ 2026-05-06 6:46 UTC (permalink / raw) To: Ye Liu, Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Xin Hao Cc: Ye Liu, Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Barry Song, Lance Yang, Andrew Morton, linux-mm, linux-kernel On 06/05/26 6:51 am, Ye Liu wrote: > From: Ye Liu <liuye@kylinos.cn> > > __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the > corresponding mm_slot. If mm_slot_alloc() fails, the function > returns with the flag set but without inserting the mm into the > khugepaged tracking structures. > > This leaves the mm in an inconsistent state: it is marked as > registered (MMF_VM_HUGEPAGE set), but will never be scanned by > khugepaged. Future attempts to register the mm are skipped since > khugepaged_enter_vma() checks the flag and returns early. > > Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, > restoring the ability to retry registration later. > > Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") > Signed-off-by: Ye Liu <liuye@kylinos.cn> > --- > Changes since v1: > - Add Fixes tag as suggested by Dev Jain and Lance Yang > > mm/khugepaged.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 7d48d4fbd5f3..60ab7c1b61dd 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm) > return; > > slot = mm_slot_alloc(mm_slot_cache); > - if (!slot) > + if (!slot) { > + mm_flags_clear(MMF_VM_HUGEPAGE, mm); > return; > + } Note that, a racing khugepaged_enter_vma() may back off when it sees that MMF_VM_HUGEPAGE is set, but then the above clears the flag after slot alloc failure. So we end up not registering the mm with khugepaged. But I am sure no one cares, we are in much big trouble if slot alloc is failing. Although one could argue the same about this patch, I will still say it is important to fix "flag is set but not registered with khugepaged" because that just feels wrong. Reviewed-by: Dev Jain <dev.jain@arm.com> > > spin_lock(&khugepaged_mm_lock); > mm_slot_insert(mm_slots_hash, mm, slot); ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-06 6:46 ` Dev Jain @ 2026-05-06 10:51 ` Lance Yang 2026-05-08 21:41 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 10+ messages in thread From: Lance Yang @ 2026-05-06 10:51 UTC (permalink / raw) To: dev.jain Cc: ye.liu, akpm, david, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, lance.yang, akpm, linux-mm, linux-kernel On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote: > > >On 06/05/26 6:51 am, Ye Liu wrote: >> From: Ye Liu <liuye@kylinos.cn> >> >> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the >> corresponding mm_slot. If mm_slot_alloc() fails, the function >> returns with the flag set but without inserting the mm into the >> khugepaged tracking structures. >> >> This leaves the mm in an inconsistent state: it is marked as >> registered (MMF_VM_HUGEPAGE set), but will never be scanned by >> khugepaged. Future attempts to register the mm are skipped since >> khugepaged_enter_vma() checks the flag and returns early. >> >> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, >> restoring the ability to retry registration later. >> >> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") >> Signed-off-by: Ye Liu <liuye@kylinos.cn> >> --- >> Changes since v1: >> - Add Fixes tag as suggested by Dev Jain and Lance Yang >> >> mm/khugepaged.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 7d48d4fbd5f3..60ab7c1b61dd 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm) >> return; >> >> slot = mm_slot_alloc(mm_slot_cache); >> - if (!slot) >> + if (!slot) { >> + mm_flags_clear(MMF_VM_HUGEPAGE, mm); >> return; >> + } > >Note that, a racing khugepaged_enter_vma() may back off >when it sees that MMF_VM_HUGEPAGE is set, but then the above >clears the flag after slot alloc failure. So we end up not >registering the mm with khugepaged. But I am sure no one >cares, we are in much big trouble if slot alloc is failing. Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set and return, then !slot clears it again. If there is no later khugepaged_enter_vma(), the mm still wouldn't get registered :) >Although one could argue the same about this patch, I will >still say it is important to fix "flag is set but not registered >with khugepaged" because that just feels wrong. +1 that is still better then keeping MMF_VM_HUGEPAGE set forever: any later attempt can retry and install the mm_slot :) Cheers, Lance ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-06 10:51 ` Lance Yang @ 2026-05-08 21:41 ` David Hildenbrand (Arm) 2026-05-09 1:57 ` Lance Yang 2026-05-11 4:00 ` Dev Jain 0 siblings, 2 replies; 10+ messages in thread From: David Hildenbrand (Arm) @ 2026-05-08 21:41 UTC (permalink / raw) To: Lance Yang, dev.jain Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm, linux-kernel On 5/6/26 12:51, Lance Yang wrote: > > On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote: >> >> >> On 06/05/26 6:51 am, Ye Liu wrote: >>> From: Ye Liu <liuye@kylinos.cn> >>> >>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the >>> corresponding mm_slot. If mm_slot_alloc() fails, the function >>> returns with the flag set but without inserting the mm into the >>> khugepaged tracking structures. >>> >>> This leaves the mm in an inconsistent state: it is marked as >>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by >>> khugepaged. Future attempts to register the mm are skipped since >>> khugepaged_enter_vma() checks the flag and returns early. >>> >>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, >>> restoring the ability to retry registration later. >>> >>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") >>> Signed-off-by: Ye Liu <liuye@kylinos.cn> >>> --- >>> Changes since v1: >>> - Add Fixes tag as suggested by Dev Jain and Lance Yang >>> >>> mm/khugepaged.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>> index 7d48d4fbd5f3..60ab7c1b61dd 100644 >>> --- a/mm/khugepaged.c >>> +++ b/mm/khugepaged.c >>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm) >>> return; >>> >>> slot = mm_slot_alloc(mm_slot_cache); >>> - if (!slot) >>> + if (!slot) { >>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm); >>> return; >>> + } >> >> Note that, a racing khugepaged_enter_vma() may back off >> when it sees that MMF_VM_HUGEPAGE is set, but then the above >> clears the flag after slot alloc failure. So we end up not >> registering the mm with khugepaged. But I am sure no one >> cares, we are in much big trouble if slot alloc is failing. > > Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set > and return, then !slot clears it again. If there is no later > khugepaged_enter_vma(), the mm still wouldn't get registered :) So why not diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 5f4e009593e0..78735f34250a 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm) /* __khugepaged_exit() must not run from under us */ VM_BUG_ON_MM(collapse_test_exit(mm), mm); - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) - return; slot = mm_slot_alloc(mm_slot_cache); if (!slot) return; + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) { + mm_slot_free(mm_slot_cache, slot); + return; + } + spin_lock(&khugepaged_mm_lock); mm_slot_insert(mm_slots_hash, mm, slot); /* Arguably, on the race described above, likely the thread seeing the MMF_VM_HUGEPAGE would likely similarly have failed the allocation. I'm fine with either, just wanted to raise the (cleaner looking?) alternative where we just properly back off? -- Cheers, David ^ permalink raw reply related [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-08 21:41 ` David Hildenbrand (Arm) @ 2026-05-09 1:57 ` Lance Yang 2026-05-11 4:00 ` Dev Jain 1 sibling, 0 replies; 10+ messages in thread From: Lance Yang @ 2026-05-09 1:57 UTC (permalink / raw) To: david Cc: lance.yang, dev.jain, ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm, linux-kernel On Fri, May 08, 2026 at 11:41:34PM +0200, David Hildenbrand (Arm) wrote: >On 5/6/26 12:51, Lance Yang wrote: >> >> On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote: >>> >>> >>> On 06/05/26 6:51 am, Ye Liu wrote: >>>> From: Ye Liu <liuye@kylinos.cn> >>>> >>>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the >>>> corresponding mm_slot. If mm_slot_alloc() fails, the function >>>> returns with the flag set but without inserting the mm into the >>>> khugepaged tracking structures. >>>> >>>> This leaves the mm in an inconsistent state: it is marked as >>>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by >>>> khugepaged. Future attempts to register the mm are skipped since >>>> khugepaged_enter_vma() checks the flag and returns early. >>>> >>>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, >>>> restoring the ability to retry registration later. >>>> >>>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") >>>> Signed-off-by: Ye Liu <liuye@kylinos.cn> >>>> --- >>>> Changes since v1: >>>> - Add Fixes tag as suggested by Dev Jain and Lance Yang >>>> >>>> mm/khugepaged.c | 4 +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>>> index 7d48d4fbd5f3..60ab7c1b61dd 100644 >>>> --- a/mm/khugepaged.c >>>> +++ b/mm/khugepaged.c >>>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm) >>>> return; >>>> >>>> slot = mm_slot_alloc(mm_slot_cache); >>>> - if (!slot) >>>> + if (!slot) { >>>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm); >>>> return; >>>> + } >>> >>> Note that, a racing khugepaged_enter_vma() may back off >>> when it sees that MMF_VM_HUGEPAGE is set, but then the above >>> clears the flag after slot alloc failure. So we end up not >>> registering the mm with khugepaged. But I am sure no one >>> cares, we are in much big trouble if slot alloc is failing. >> >> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set >> and return, then !slot clears it again. If there is no later >> khugepaged_enter_vma(), the mm still wouldn't get registered :) > >So why not > >diff --git a/mm/khugepaged.c b/mm/khugepaged.c >index 5f4e009593e0..78735f34250a 100644 >--- a/mm/khugepaged.c >+++ b/mm/khugepaged.c >@@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm) > > /* __khugepaged_exit() must not run from under us */ > VM_BUG_ON_MM(collapse_test_exit(mm), mm); >- if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) >- return; > > slot = mm_slot_alloc(mm_slot_cache); > if (!slot) > return; > >+ if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) { >+ mm_slot_free(mm_slot_cache, slot); >+ return; >+ } >+ > spin_lock(&khugepaged_mm_lock); > mm_slot_insert(mm_slots_hash, mm, slot); > /* > > >Arguably, on the race described above, likely the thread seeing the >MMF_VM_HUGEPAGE would likely similarly have failed the allocation. Right, LGTM! >I'm fine with either, just wanted to raise the (cleaner looking?) alternative >where we just properly back off? Dev suggested the same thing[1] on v1 as well. We should have gone that way :) Allocating the slot first and only setting MMF_VM_HUGEPAGE after that makes the race go away. If mm_slot_alloc() fails, there is nothing to undo. [1] https://lore.kernel.org/linux-mm/aed7c1d5-2189-4ee2-b0f3-ce5a3e3c2118@arm.com/ Cheers, Lance ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-08 21:41 ` David Hildenbrand (Arm) 2026-05-09 1:57 ` Lance Yang @ 2026-05-11 4:00 ` Dev Jain 2026-05-11 5:40 ` David Hildenbrand (Arm) 1 sibling, 1 reply; 10+ messages in thread From: Dev Jain @ 2026-05-11 4:00 UTC (permalink / raw) To: David Hildenbrand (Arm), Lance Yang Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm, linux-kernel On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote: > On 5/6/26 12:51, Lance Yang wrote: >> >> On Wed, May 06, 2026 at 12:16:35PM +0530, Dev Jain wrote: >>> >>> >>> On 06/05/26 6:51 am, Ye Liu wrote: >>>> From: Ye Liu <liuye@kylinos.cn> >>>> >>>> __khugepaged_enter() sets MMF_VM_HUGEPAGE before allocating the >>>> corresponding mm_slot. If mm_slot_alloc() fails, the function >>>> returns with the flag set but without inserting the mm into the >>>> khugepaged tracking structures. >>>> >>>> This leaves the mm in an inconsistent state: it is marked as >>>> registered (MMF_VM_HUGEPAGE set), but will never be scanned by >>>> khugepaged. Future attempts to register the mm are skipped since >>>> khugepaged_enter_vma() checks the flag and returns early. >>>> >>>> Fix this by clearing MMF_VM_HUGEPAGE when mm_slot_alloc() fails, >>>> restoring the ability to retry registration later. >>>> >>>> Fixes: 16618670276a ("mm: khugepaged: avoid pointless allocation for struct mm_slot") >>>> Signed-off-by: Ye Liu <liuye@kylinos.cn> >>>> --- >>>> Changes since v1: >>>> - Add Fixes tag as suggested by Dev Jain and Lance Yang >>>> >>>> mm/khugepaged.c | 4 +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>>> index 7d48d4fbd5f3..60ab7c1b61dd 100644 >>>> --- a/mm/khugepaged.c >>>> +++ b/mm/khugepaged.c >>>> @@ -559,8 +559,10 @@ void __khugepaged_enter(struct mm_struct *mm) >>>> return; >>>> >>>> slot = mm_slot_alloc(mm_slot_cache); >>>> - if (!slot) >>>> + if (!slot) { >>>> + mm_flags_clear(MMF_VM_HUGEPAGE, mm); >>>> return; >>>> + } >>> >>> Note that, a racing khugepaged_enter_vma() may back off >>> when it sees that MMF_VM_HUGEPAGE is set, but then the above >>> clears the flag after slot alloc failure. So we end up not >>> registering the mm with khugepaged. But I am sure no one >>> cares, we are in much big trouble if slot alloc is failing. >> >> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set >> and return, then !slot clears it again. If there is no later >> khugepaged_enter_vma(), the mm still wouldn't get registered :) > > So why not > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 5f4e009593e0..78735f34250a 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm) > > /* __khugepaged_exit() must not run from under us */ > VM_BUG_ON_MM(collapse_test_exit(mm), mm); > - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) > - return; > > slot = mm_slot_alloc(mm_slot_cache); > if (!slot) > return; > > + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) { > + mm_slot_free(mm_slot_cache, slot); > + return; > + } > + > spin_lock(&khugepaged_mm_lock); > mm_slot_insert(mm_slots_hash, mm, slot); > /* > > > Arguably, on the race described above, likely the thread seeing the > MMF_VM_HUGEPAGE would likely similarly have failed the allocation. > > I'm fine with either, just wanted to raise the (cleaner looking?) alternative > where we just properly back off? Yes this is also fine - I am overthinking but I wasn't going this way because ... A process doing THP allocations will fail on the mm_flags_test_and_set everytime after the first time. So we will have an unconditional overhead of mm_slot_alloc in the fault handler. Again I am overthinking : ) > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-11 4:00 ` Dev Jain @ 2026-05-11 5:40 ` David Hildenbrand (Arm) 2026-05-11 5:44 ` Dev Jain 0 siblings, 1 reply; 10+ messages in thread From: David Hildenbrand (Arm) @ 2026-05-11 5:40 UTC (permalink / raw) To: Dev Jain, Lance Yang Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm, linux-kernel On 5/11/26 06:00, Dev Jain wrote: > > > On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote: >> On 5/6/26 12:51, Lance Yang wrote: >>> >>> >>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set >>> and return, then !slot clears it again. If there is no later >>> khugepaged_enter_vma(), the mm still wouldn't get registered :) >> >> So why not >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 5f4e009593e0..78735f34250a 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm) >> >> /* __khugepaged_exit() must not run from under us */ >> VM_BUG_ON_MM(collapse_test_exit(mm), mm); >> - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) >> - return; >> >> slot = mm_slot_alloc(mm_slot_cache); >> if (!slot) >> return; >> >> + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) { >> + mm_slot_free(mm_slot_cache, slot); >> + return; >> + } >> + >> spin_lock(&khugepaged_mm_lock); >> mm_slot_insert(mm_slots_hash, mm, slot); >> /* >> >> >> Arguably, on the race described above, likely the thread seeing the >> MMF_VM_HUGEPAGE would likely similarly have failed the allocation. >> >> I'm fine with either, just wanted to raise the (cleaner looking?) alternative >> where we just properly back off? > > Yes this is also fine - I am overthinking but I wasn't going this way because ... > A process doing THP allocations will fail on the mm_flags_test_and_set everytime > after the first time. We should perform a mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) test before calling the function when the flag might not be set yet: in khugepaged_enter_vma() khugepaged_fork() should only get called once per process. Which makes sense, because mm_flags_test_and_set() is expensive. -- Cheers, David ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure 2026-05-11 5:40 ` David Hildenbrand (Arm) @ 2026-05-11 5:44 ` Dev Jain 0 siblings, 0 replies; 10+ messages in thread From: Dev Jain @ 2026-05-11 5:44 UTC (permalink / raw) To: David Hildenbrand (Arm), Lance Yang Cc: ye.liu, akpm, ljs, xhao, liuye, ziy, baolin.wang, liam, npache, ryan.roberts, baohua, akpm, linux-mm, linux-kernel On 11/05/26 11:10 am, David Hildenbrand (Arm) wrote: > On 5/11/26 06:00, Dev Jain wrote: >> >> >> On 09/05/26 3:11 am, David Hildenbrand (Arm) wrote: >>> On 5/6/26 12:51, Lance Yang wrote: >>>> >>>> >>>> Right. A racing khugepaged_enter_vma() can see MMF_VM_HUGEPAGE is set >>>> and return, then !slot clears it again. If there is no later >>>> khugepaged_enter_vma(), the mm still wouldn't get registered :) >>> >>> So why not >>> >>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >>> index 5f4e009593e0..78735f34250a 100644 >>> --- a/mm/khugepaged.c >>> +++ b/mm/khugepaged.c >>> @@ -437,13 +437,16 @@ void __khugepaged_enter(struct mm_struct *mm) >>> >>> /* __khugepaged_exit() must not run from under us */ >>> VM_BUG_ON_MM(collapse_test_exit(mm), mm); >>> - if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) >>> - return; >>> >>> slot = mm_slot_alloc(mm_slot_cache); >>> if (!slot) >>> return; >>> >>> + if (unlikely(mm_flags_test_and_set(MMF_VM_HUGEPAGE, mm))) { >>> + mm_slot_free(mm_slot_cache, slot); >>> + return; >>> + } >>> + >>> spin_lock(&khugepaged_mm_lock); >>> mm_slot_insert(mm_slots_hash, mm, slot); >>> /* >>> >>> >>> Arguably, on the race described above, likely the thread seeing the >>> MMF_VM_HUGEPAGE would likely similarly have failed the allocation. >>> >>> I'm fine with either, just wanted to raise the (cleaner looking?) alternative >>> where we just properly back off? >> >> Yes this is also fine - I am overthinking but I wasn't going this way because ... >> A process doing THP allocations will fail on the mm_flags_test_and_set everytime >> after the first time. > We should perform a mm_flags_test(MMF_VM_HUGEPAGE, vma->vm_mm) test before > calling the function when the flag might not be set yet: in khugepaged_enter_vma() Ah that slipped my mind, you are right. > > khugepaged_fork() should only get called once per process. > > Which makes sense, because mm_flags_test_and_set() is expensive. > ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2026-05-11 5:45 UTC | newest] Thread overview: 10+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-06 1:21 [PATCH v2] mm/khugepaged: clear MMF_VM_HUGEPAGE on mm_slot_alloc() failure Ye Liu 2026-05-06 2:05 ` Lance Yang 2026-05-06 5:17 ` Baolin Wang 2026-05-06 6:46 ` Dev Jain 2026-05-06 10:51 ` Lance Yang 2026-05-08 21:41 ` David Hildenbrand (Arm) 2026-05-09 1:57 ` Lance Yang 2026-05-11 4:00 ` Dev Jain 2026-05-11 5:40 ` David Hildenbrand (Arm) 2026-05-11 5:44 ` Dev Jain
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox