* [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE
@ 2026-05-11 6:57 Wandun Chen
2026-05-11 7:17 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 6+ messages in thread
From: Wandun Chen @ 2026-05-11 6:57 UTC (permalink / raw)
To: linux-mm, linux-kernel
Cc: akpm, david, ljs, ziy, baolin.wang, liam, npache, ryan.roberts,
dev.jain, baohua, lance.yang
From: Chen Wandun <chenwandun@lixiang.com>
madvise_collapse() computes the THP-aligned window:
hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */
hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */
The following case will cause hstart > hend, and result in underflow
in the return statement, avoid it by returning -EINVAL early when
hstart > hend.
madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);
In addition, kmalloc_obj(), mmgrab() and lru_add_drain_all() are
unnecessary when hstart == hend, so skip these operations by
returning early too.
Signed-off-by: Chen Wandun <chenwandun@lixiang.com>
---
v1 --> v2:
- Rebase and resolve code conflict.
- Return -EINVAL when hstart > hend, suggested by Lorenzo.
- Drop Fixes tag, suggested by David and Lorenzo.
- Updated commit message to be more explicit, suggested by Lorenzo.
---
mm/khugepaged.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 28a843f30b32..36baab17f098 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2837,6 +2837,15 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER))
return -EINVAL;
+ hstart = ALIGN(start, HPAGE_PMD_SIZE);
+ hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);
+
+ if (hstart > hend)
+ return -EINVAL;
+
+ if (hstart == hend)
+ return 0;
+
cc = kmalloc_obj(*cc);
if (!cc)
return -ENOMEM;
@@ -2846,9 +2855,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
mmgrab(mm);
lru_add_drain_all();
- hstart = ALIGN(start, HPAGE_PMD_SIZE);
- hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);
-
for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
enum scan_result result = SCAN_FAIL;
--
2.43.0
^ permalink raw reply related [flat|nested] 6+ messages in thread* Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE 2026-05-11 6:57 [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE Wandun Chen @ 2026-05-11 7:17 ` David Hildenbrand (Arm) 2026-05-11 7:35 ` Wandun 0 siblings, 1 reply; 6+ messages in thread From: David Hildenbrand (Arm) @ 2026-05-11 7:17 UTC (permalink / raw) To: Wandun Chen, linux-mm, linux-kernel Cc: akpm, ljs, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang On 5/11/26 08:57, Wandun Chen wrote: > From: Chen Wandun <chenwandun@lixiang.com> > > madvise_collapse() computes the THP-aligned window: > > hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */ > hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */ > > The following case will cause hstart > hend, and result in underflow > in the return statement, avoid it by returning -EINVAL early when > hstart > hend. > > madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); Ok, so providing a PMD-aligned address as start will result in 0 and a non-aligned address will result in -EINVAL. Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I might have misunderstood it. That would also make the code easier ... > > In addition, kmalloc_obj(), mmgrab() and lru_add_drain_all() are > unnecessary when hstart == hend, so skip these operations by > returning early too. > > Signed-off-by: Chen Wandun <chenwandun@lixiang.com> > > --- > v1 --> v2: > - Rebase and resolve code conflict. > - Return -EINVAL when hstart > hend, suggested by Lorenzo. > - Drop Fixes tag, suggested by David and Lorenzo. > - Updated commit message to be more explicit, suggested by Lorenzo. > --- > mm/khugepaged.c | 12 +++++++++--- > 1 file changed, 9 insertions(+), 3 deletions(-) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 28a843f30b32..36baab17f098 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -2837,6 +2837,15 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start, > if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER)) > return -EINVAL; > > + hstart = ALIGN(start, HPAGE_PMD_SIZE); > + hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); > + > + if (hstart > hend) > + return -EINVAL; > + > + if (hstart == hend) > + return 0; > + ... as it would simply be: /* Nothing to collapse. */ if (hstart >= hend) return 0; @Lorenzo, or do we want to keep the (questionable : ) ) existing behavior? -- Cheers, David ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE 2026-05-11 7:17 ` David Hildenbrand (Arm) @ 2026-05-11 7:35 ` Wandun 2026-05-11 8:01 ` David Hildenbrand (Arm) 0 siblings, 1 reply; 6+ messages in thread From: Wandun @ 2026-05-11 7:35 UTC (permalink / raw) To: David Hildenbrand (Arm), linux-mm, linux-kernel Cc: akpm, ljs, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang On 5/11/26 15:17, David Hildenbrand (Arm) wrote: > On 5/11/26 08:57, Wandun Chen wrote: >> From: Chen Wandun <chenwandun@lixiang.com> >> >> madvise_collapse() computes the THP-aligned window: >> >> hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */ >> hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */ >> >> The following case will cause hstart > hend, and result in underflow >> in the return statement, avoid it by returning -EINVAL early when >> hstart > hend. >> >> madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); > Ok, so providing a PMD-aligned address as start will result in 0 and a > non-aligned address will result in -EINVAL. > > Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I > might have misunderstood it. Lorenzo suggested retuern -EINVAL for both case at the beginning, Later, Lorenzo add an correction, suggested should return 0 for compatibilty reasons for hstart == hend case. (If I haven't missed any information) Best regards, Wandun > > That would also make the code easier ... > >> In addition, kmalloc_obj(), mmgrab() and lru_add_drain_all() are >> unnecessary when hstart == hend, so skip these operations by >> returning early too. >> >> Signed-off-by: Chen Wandun <chenwandun@lixiang.com> >> >> --- >> v1 --> v2: >> - Rebase and resolve code conflict. >> - Return -EINVAL when hstart > hend, suggested by Lorenzo. >> - Drop Fixes tag, suggested by David and Lorenzo. >> - Updated commit message to be more explicit, suggested by Lorenzo. >> --- >> mm/khugepaged.c | 12 +++++++++--- >> 1 file changed, 9 insertions(+), 3 deletions(-) >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 28a843f30b32..36baab17f098 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -2837,6 +2837,15 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start, >> if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER)) >> return -EINVAL; >> >> + hstart = ALIGN(start, HPAGE_PMD_SIZE); >> + hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); >> + >> + if (hstart > hend) >> + return -EINVAL; >> + >> + if (hstart == hend) >> + return 0; >> + > ... as it would simply be: > > /* Nothing to collapse. */ > if (hstart >= hend) > return 0; > @Lorenzo, or do we want to keep the (questionable : ) ) existing behavior? > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE 2026-05-11 7:35 ` Wandun @ 2026-05-11 8:01 ` David Hildenbrand (Arm) 2026-05-11 16:21 ` Lorenzo Stoakes 0 siblings, 1 reply; 6+ messages in thread From: David Hildenbrand (Arm) @ 2026-05-11 8:01 UTC (permalink / raw) To: Wandun, linux-mm, linux-kernel Cc: akpm, ljs, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang On 5/11/26 09:35, Wandun wrote: > > > On 5/11/26 15:17, David Hildenbrand (Arm) wrote: >> On 5/11/26 08:57, Wandun Chen wrote: >>> From: Chen Wandun <chenwandun@lixiang.com> >>> >>> madvise_collapse() computes the THP-aligned window: >>> >>> hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */ >>> hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */ >>> >>> The following case will cause hstart > hend, and result in underflow >>> in the return statement, avoid it by returning -EINVAL early when >>> hstart > hend. >>> >>> madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); >> Ok, so providing a PMD-aligned address as start will result in 0 and a >> non-aligned address will result in -EINVAL. >> >> Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I >> might have misunderstood it. > Lorenzo suggested retuern -EINVAL for both case at the beginning, > Later, Lorenzo add an correction, suggested should return 0 for > compatibilty reasons for hstart == hend case. > (If I haven't missed any information) Let's wait for Lorenzo's confirmation. I think the important part is that we cannot have a situation where start < end (given that madvise() consumes a length). Because, there we really should have returned -EINVAL. For start <= end, if there is nothing suitable to collapse, I'd say we'd just consistently return 0. -- Cheers, David ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE 2026-05-11 8:01 ` David Hildenbrand (Arm) @ 2026-05-11 16:21 ` Lorenzo Stoakes 2026-05-13 3:24 ` Wandun 0 siblings, 1 reply; 6+ messages in thread From: Lorenzo Stoakes @ 2026-05-11 16:21 UTC (permalink / raw) To: David Hildenbrand (Arm) Cc: Wandun, linux-mm, linux-kernel, akpm, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang On Mon, May 11, 2026 at 10:01:39AM +0200, David Hildenbrand (Arm) wrote: > On 5/11/26 09:35, Wandun wrote: > > > > > > On 5/11/26 15:17, David Hildenbrand (Arm) wrote: > >> On 5/11/26 08:57, Wandun Chen wrote: > >>> From: Chen Wandun <chenwandun@lixiang.com> > >>> > >>> madvise_collapse() computes the THP-aligned window: > >>> > >>> hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */ > >>> hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */ > >>> > >>> The following case will cause hstart > hend, and result in underflow > >>> in the return statement, avoid it by returning -EINVAL early when > >>> hstart > hend. > >>> > >>> madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); > >> Ok, so providing a PMD-aligned address as start will result in 0 and a > >> non-aligned address will result in -EINVAL. > >> > >> Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I > >> might have misunderstood it. > > Lorenzo suggested retuern -EINVAL for both case at the beginning, > > Later, Lorenzo add an correction, suggested should return 0 for > > compatibilty reasons for hstart == hend case. > > (If I haven't missed any information) > > Let's wait for Lorenzo's confirmation. :) thanks. See below but TL;DR I convinced myself that actually, I agree with David... I hadn't really examined the madvise() <-> madvise_collapse() logic closely enough but yeah. Return 0 for both cases. > > I think the important part is that we cannot have a situation where start < end > (given that madvise() consumes a length). Because, there we really should have > returned -EINVAL. > > For start <= end, if there is nothing suitable to collapse, I'd say we'd just > consistently return 0. Right so madvise_vma_behavior() should be called with range->[start, end] tied to the VMA (under VMA lock we assert this also). So what we're really talking about is hstart, hend. Really we should NEVER have aligned the addresses for the user, that was the real mistake here. But that ship has sailed... Since we do: hstart = ALIGN(start, HPAGE_PMD_SIZE); hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); That means e.g. start = <some PMD aligned addr> + 1 end = start + <not enough to get it to the next hpage> Results in hstart > hend, so the user must have given incorrect input. hstart == hend would be e.g.: start = <some PMD aligned addr> + 1 end = start + HPAGE_PMD_SIZE Which is still an invalid input. I honestly wish we just required that the user provided PMD aligned ranges and we didn't align for them, it's stupid that we do. So the real question is, what constitutes an actual invalid input here? We've made a mess and now we have to decide how we interpret it... :) I still feel that hstart > hend is an error. Since we are aligning things the stupid way we do, we are treating start, end as _bounds_. So we are saying 'turn everything in the bounded range [start, end) into huge pages'. So we use ALIGN() on start so nothing BEFORE start gets converted. And we use ALIGN_DOWN() on end so nothing AFTER end gets converted. But if you do: (first case above) PMD aligned PMD aligned | <-----------------> | You're kinda doing something stupid obviously, because that range cannot span any huge pages, you don't even cross a boundary, and importantly - your range _isn't even large enough to include a single page_. With: PMD aligned PMD aligned | <------------------------|-> You are crossing a boundary but not enough to get a page. But you might well have a large enough range to span a single page... But OTOH... PMD aligned PMD aligned | <---|-> Would get you the same as is equally silly. Yeah ok this is a long way round of coming to the same conclusion as David, godamnit, and I so wanted to disagree here :P Since both get you nothing, and the input was valid _to madvise()_ let's just return 0 for both cases. > > -- > Cheers, > > David Cheers, Lorenzo ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE 2026-05-11 16:21 ` Lorenzo Stoakes @ 2026-05-13 3:24 ` Wandun 0 siblings, 0 replies; 6+ messages in thread From: Wandun @ 2026-05-13 3:24 UTC (permalink / raw) To: Lorenzo Stoakes, David Hildenbrand (Arm) Cc: linux-mm, linux-kernel, akpm, ziy, baolin.wang, liam, npache, ryan.roberts, dev.jain, baohua, lance.yang On 5/12/26 00:21, Lorenzo Stoakes wrote: > On Mon, May 11, 2026 at 10:01:39AM +0200, David Hildenbrand (Arm) wrote: >> On 5/11/26 09:35, Wandun wrote: >>> >>> On 5/11/26 15:17, David Hildenbrand (Arm) wrote: >>>> On 5/11/26 08:57, Wandun Chen wrote: >>>>> From: Chen Wandun <chenwandun@lixiang.com> >>>>> >>>>> madvise_collapse() computes the THP-aligned window: >>>>> >>>>> hstart = ALIGN(start, HPAGE_PMD_SIZE); /* round up */ >>>>> hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); /* round down */ >>>>> >>>>> The following case will cause hstart > hend, and result in underflow >>>>> in the return statement, avoid it by returning -EINVAL early when >>>>> hstart > hend. >>>>> >>>>> madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE); >>>> Ok, so providing a PMD-aligned address as start will result in 0 and a >>>> non-aligned address will result in -EINVAL. >>>> >>>> Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I >>>> might have misunderstood it. >>> Lorenzo suggested retuern -EINVAL for both case at the beginning, >>> Later, Lorenzo add an correction, suggested should return 0 for >>> compatibilty reasons for hstart == hend case. >>> (If I haven't missed any information) >> Let's wait for Lorenzo's confirmation. > :) thanks. > > See below but TL;DR I convinced myself that actually, I agree with David... > > I hadn't really examined the madvise() <-> madvise_collapse() logic closely > enough but yeah. Return 0 for both cases. > >> I think the important part is that we cannot have a situation where start < end >> (given that madvise() consumes a length). Because, there we really should have >> returned -EINVAL. >> >> For start <= end, if there is nothing suitable to collapse, I'd say we'd just >> consistently return 0. > Right so madvise_vma_behavior() should be called with range->[start, end] tied > to the VMA (under VMA lock we assert this also). > > So what we're really talking about is hstart, hend. > > Really we should NEVER have aligned the addresses for the user, that was the > real mistake here. But that ship has sailed... > > Since we do: > > hstart = ALIGN(start, HPAGE_PMD_SIZE); > hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE); > > That means e.g. > > start = <some PMD aligned addr> + 1 > end = start + <not enough to get it to the next hpage> > > Results in hstart > hend, so the user must have given incorrect input. > > hstart == hend would be e.g.: > > start = <some PMD aligned addr> + 1 > end = start + HPAGE_PMD_SIZE > > Which is still an invalid input. I honestly wish we just required that the user > provided PMD aligned ranges and we didn't align for them, it's stupid that we > do. > > So the real question is, what constitutes an actual invalid input here? We've > made a mess and now we have to decide how we interpret it... :) > > I still feel that hstart > hend is an error. Since we are aligning things the > stupid way we do, we are treating start, end as _bounds_. So we are saying 'turn > everything in the bounded range [start, end) into huge pages'. > > So we use ALIGN() on start so nothing BEFORE start gets converted. And we use > ALIGN_DOWN() on end so nothing AFTER end gets converted. > > But if you do: > > (first case above) > > PMD aligned PMD aligned > | <-----------------> | > > You're kinda doing something stupid obviously, because that range cannot span > any huge pages, you don't even cross a boundary, and importantly - your range > _isn't even large enough to include a single page_. > > With: > > PMD aligned PMD aligned > | <------------------------|-> > > You are crossing a boundary but not enough to get a page. But you might well > have a large enough range to span a single page... > > But OTOH... > > PMD aligned PMD aligned > | <---|-> > > Would get you the same as is equally silly. > > Yeah ok this is a long way round of coming to the same conclusion as David, > godamnit, and I so wanted to disagree here :P > > Since both get you nothing, and the input was valid _to madvise()_ let's just > return 0 for both cases. Thanks for taking the time to walk through this, Lorenzo and David. Now we've agreed both cases should consistently return 0, I'll send a v3 that simply bails out with 0 when hstart >= hend. Best regards, Wandun > >> -- >> Cheers, >> >> David > Cheers, Lorenzo ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2026-05-13 3:24 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-05-11 6:57 [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE Wandun Chen 2026-05-11 7:17 ` David Hildenbrand (Arm) 2026-05-11 7:35 ` Wandun 2026-05-11 8:01 ` David Hildenbrand (Arm) 2026-05-11 16:21 ` Lorenzo Stoakes 2026-05-13 3:24 ` Wandun
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox