Linux-mm Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Wandun <chenwandun1@gmail.com>
To: Lorenzo Stoakes <ljs@kernel.org>,
	"David Hildenbrand (Arm)" <david@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, ziy@nvidia.com,
	baolin.wang@linux.alibaba.com, liam@infradead.org,
	npache@redhat.com, ryan.roberts@arm.com, dev.jain@arm.com,
	baohua@kernel.org, lance.yang@linux.dev
Subject: Re: [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE
Date: Wed, 13 May 2026 11:24:14 +0800	[thread overview]
Message-ID: <93275e30-ef8b-4ca0-9854-206b4232d90c@gmail.com> (raw)
In-Reply-To: <agHxP3iZd8RdHqSw@lucifer>



On 5/12/26 00:21, Lorenzo Stoakes wrote:
> On Mon, May 11, 2026 at 10:01:39AM +0200, David Hildenbrand (Arm) wrote:
>> On 5/11/26 09:35, Wandun wrote:
>>>
>>> On 5/11/26 15:17, David Hildenbrand (Arm) wrote:
>>>> On 5/11/26 08:57, Wandun Chen wrote:
>>>>> From: Chen Wandun <chenwandun@lixiang.com>
>>>>>
>>>>> madvise_collapse() computes the THP-aligned window:
>>>>>
>>>>>       hstart = ALIGN(start, HPAGE_PMD_SIZE);     /* round up */
>>>>>       hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);    /* round down */
>>>>>
>>>>> The following case will cause hstart > hend, and result in underflow
>>>>> in the return statement, avoid it by returning -EINVAL early when
>>>>> hstart > hend.
>>>>>
>>>>>       madvise(PMD-aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);
>>>> Ok, so providing a PMD-aligned address as start will result in 0 and a
>>>> non-aligned address will result in -EINVAL.
>>>>
>>>> Didn't Lorenzo agree that just returning 0 in both cases would be clearer? But I
>>>> might have misunderstood it.
>>> Lorenzo suggested retuern -EINVAL for both case at the beginning,
>>> Later, Lorenzo add an correction, suggested should return 0 for
>>> compatibilty reasons for hstart == hend case.
>>> (If I haven't missed any information)
>> Let's wait for Lorenzo's confirmation.
> :) thanks.
>
> See below but TL;DR I convinced myself that actually, I agree with David...
>
> I hadn't really examined the madvise() <-> madvise_collapse() logic closely
> enough but yeah. Return 0 for both cases.
>
>> I think the important part is that we cannot have a situation where start < end
>> (given that madvise() consumes a length). Because, there we really should have
>> returned -EINVAL.
>>
>> For start <= end, if there is nothing suitable to collapse, I'd say we'd just
>> consistently return 0.
> Right so madvise_vma_behavior() should be called with range->[start, end] tied
> to the VMA (under VMA lock we assert this also).
>
> So what we're really talking about is hstart, hend.
>
> Really we should NEVER have aligned the addresses for the user, that was the
> real mistake here. But that ship has sailed...
>
> Since we do:
>
> hstart = ALIGN(start, HPAGE_PMD_SIZE);
> hend = ALIGN_DOWN(end, HPAGE_PMD_SIZE);
>
> That means e.g.
>
> start = <some PMD aligned addr> + 1
> end = start + <not enough to get it to the next hpage>
>
> Results in hstart > hend, so the user must have given incorrect input.
>
> hstart == hend would be e.g.:
>
> start = <some PMD aligned addr> + 1
> end = start + HPAGE_PMD_SIZE
>
> Which is still an invalid input. I honestly wish we just required that the user
> provided PMD aligned ranges and we didn't align for them, it's stupid that we
> do.
>
> So the real question is, what constitutes an actual invalid input here? We've
> made a mess and now we have to decide how we interpret it... :)
>
> I still feel that hstart > hend is an error. Since we are aligning things the
> stupid way we do, we are treating start, end as _bounds_. So we are saying 'turn
> everything in the bounded range [start, end) into huge pages'.
>
> So we use ALIGN() on start so nothing BEFORE start gets converted. And we use
> ALIGN_DOWN() on end so nothing AFTER end gets converted.
>
> But if you do:
>
> (first case above)
>
> PMD aligned                  PMD aligned
>      |  <----------------->      |
>
> You're kinda doing something stupid obviously, because that range cannot span
> any huge pages, you don't even cross a boundary, and importantly - your range
> _isn't even large enough to include a single page_.
>
> With:
>
> PMD aligned                  PMD aligned
>      |  <------------------------|->
>
> You are crossing a boundary but not enough to get a page. But you might well
> have a large enough range to span a single page...
>
> But OTOH...
>
> PMD aligned                  PMD aligned
>      |                       <---|->
>
> Would get you the same as is equally silly.
>
> Yeah ok this is a long way round of coming to the same conclusion as David,
> godamnit, and I so wanted to disagree here :P
>
> Since both get you nothing, and the input was valid _to madvise()_ let's just
> return 0 for both cases.
Thanks for taking the time to walk through this, Lorenzo and David.

Now we've agreed both cases should consistently return 0, I'll send
a v3 that simply bails out with 0 when hstart >= hend.

Best regards,
Wandun
>
>> --
>> Cheers,
>>
>> David
> Cheers, Lorenzo



      reply	other threads:[~2026-05-13  3:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-11  6:57 [PATCH v2] mm/khugepaged: avoid underflow in madvise_collapse for sub-PMD MADV_COLLAPSE Wandun Chen
2026-05-11  7:17 ` David Hildenbrand (Arm)
2026-05-11  7:35   ` Wandun
2026-05-11  8:01     ` David Hildenbrand (Arm)
2026-05-11 16:21       ` Lorenzo Stoakes
2026-05-13  3:24         ` Wandun [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=93275e30-ef8b-4ca0-9854-206b4232d90c@gmail.com \
    --to=chenwandun1@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=lance.yang@linux.dev \
    --cc=liam@infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=npache@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox