The Linux Kernel Mailing List
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <ljs@kernel.org>
To: Chen Wandun <chenwandun1@gmail.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
	 akpm@linux-foundation.org, shuah@kernel.org, zokeefe@google.com,
	 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH 1/2] mm/khugepaged: fix spurious -EINVAL from sub-PMD MADV_COLLAPSE range
Date: Fri, 8 May 2026 16:02:31 +0100	[thread overview]
Message-ID: <af30otSSGbPxHbxM@lucifer> (raw)
In-Reply-To: <9eea2afb-8c35-47eb-b1de-6a08503c9679@kernel.org>

On Fri, May 08, 2026 at 02:27:37PM +0200, David Hildenbrand (Arm) wrote:
> On 5/7/26 09:05, Chen Wandun wrote:
> > madvise_collapse() computes the THP-aligned window:
> >
> >   hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK  /* round up  */
> >   hend   =  end   &  HPAGE_PMD_MASK                    /* round down */
> >
> > Previously this was done after kmalloc_obj(), so problem arose when
> > the range contained no complete PMD-aligned window (hstart >= hend).
> >
> > When hstart > hend, (hend - hstart) wraps unsigned to a huge value, the
> > final comparison fails and -EINVAL is returned instead of 0.  Consider

I think both should return -EINVAL.

> > two single-page calls on a 2 MiB-aligned address:
> >
> >     /* hstart == hend == aligned  ->  0 == 0  ->  returns 0 */
> >     madvise(aligned, PAGE_SIZE, MADV_COLLAPSE);

What's aligned? You're putting a random variable name in there? Presumably a PMD-aligned address?

> >
> >     /* hstart = aligned + 2MiB, hend = aligned
> >      * (hend - hstart) wraps unsigned  ->  returns -EINVAL */
> >     madvise(aligned + PAGE_SIZE, PAGE_SIZE, MADV_COLLAPSE);
> >
> > Both calls cover less than one THP and collapse nothing; both should
> > return 0.

Disagree.

>
> Okay, so we talk about a "userspace is being stupid" scenario.

Yes!

I feel that -EINVAL is correct for hend > hstart, and I think it might even be a
userland A[BP]I break to change it (maybe somebody, somewhere is being foolish
enough to use this to also validate input ranges).

The weirdness is when hstart == hend being 0 but that's sort of established
behaviour I guess.

>
> >
> > In addition, kmalloc_obj(), mmgrab() and lru_add_drain_all() were all
> > called before discovering there was nothing to do, only for the code
> > to kfree() and return immediately after.
>
> Just a comment as you motivate here why this is suboptimal: we do not care about
> a "userspace is being stupid" scenario being fast.

Yes, in general - so what? The user is doing stupid things, so the user wins
stupid prizes?

>
> >
> > Fix both by computing hstart/hend after thp_vma_allowable_order() but
> > before kmalloc_obj(), and returning 0 early when hstart >= hend.
> >
> > Fixes: 7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
>
> Fixes: is likely ok, but I don't think we want to treat this as a hotfix or CC
> stable.

I'm not sure I want a fixes here, this isn't really fixing anything. This isn't
a bug afaik, it's just us not handling this brilliantly, but (possibly by
mistake) getting the right output.

>
> > Signed-off-by: Chen Wandun <chenwandun@lixiang.com>

I put this patch through AI detection and it's telling me there's an 80% chance
this whole thing is LLM-generated, which is making me grumpy.

Can you confirm that this is, in fact, your own work? Plagiarism is not a nice
thing to do, and THP doesn't need more traffic, we're overloaded as it is.

> > ---
> >  mm/khugepaged.c | 9 ++++++---
> >  1 file changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> > index b8452dbdb043..92473d93e837 100644
> > --- a/mm/khugepaged.c
> > +++ b/mm/khugepaged.c
> > @@ -2836,6 +2836,12 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> >  	if (!thp_vma_allowable_order(vma, vma->vm_flags, TVA_FORCED_COLLAPSE, PMD_ORDER))
> >  		return -EINVAL;
> >
> > +	hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> > +	hend = end & HPAGE_PMD_MASK;

See below re: conflict.

> > +
> > +	if (hstart >= hend)
> > +		return 0;

	if (hstart > hend)
		return -EINVAL;
	/* For compatibility, users may rely on this. */
	if (hstart == hend)
		return 0;

Is probably better.

But I'm not sure what the point is if we're already doing this behaviour?

> > +
> >  	cc = kmalloc_obj(*cc);
> >  	if (!cc)
> >  		return -ENOMEM;
> > @@ -2845,9 +2851,6 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
> >  	mmgrab(mm);
> >  	lru_add_drain_all();
> >
> > -	hstart = (start + ~HPAGE_PMD_MASK) & HPAGE_PMD_MASK;
> > -	hend = end & HPAGE_PMD_MASK;
> > -
> >  	for (addr = hstart; addr < hend; addr += HPAGE_PMD_SIZE) {
> >  		enum scan_result result = SCAN_FAIL;
> >
>
> In general, LGTM, but see for conflict:
> https://lore.kernel.org/all/20260409014323.2385982-1-ye.liu@linux.dev/

Please use mm-unstable as a basis for your mm work Chen, this is something you
need to fix, the patch above has been around for a while and is in
mm-unstable.

You have patches in mm already so you should know better by now.

But I'm really not sure I'm in favour of this anyway. I'll defer to David but
this feels useless to me.

>
>
> --
> Cheers,
>
> David

Thanks, Lorenzo

  reply	other threads:[~2026-05-08 15:02 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-05-07  7:05 [PATCH 0/2] mm/khugepaged: fix sub-PMD MADV_COLLAPSE range handling Chen Wandun
2026-05-07  7:05 ` [PATCH 1/2] mm/khugepaged: fix spurious -EINVAL from sub-PMD MADV_COLLAPSE range Chen Wandun
2026-05-08 12:27   ` David Hildenbrand (Arm)
2026-05-08 15:02     ` Lorenzo Stoakes [this message]
2026-05-08 15:04       ` Lorenzo Stoakes
2026-05-09  7:53         ` Wandun
2026-05-08 19:29       ` David Hildenbrand (Arm)
2026-05-09  7:04       ` Wandun
2026-05-09  5:56     ` Wandun
2026-05-07  7:05 ` [PATCH 2/2] selftests/mm: add MADV_COLLAPSE sub-PMD range tests Chen Wandun
2026-05-08 12:23   ` David Hildenbrand (Arm)
2026-05-08 15:03     ` Lorenzo Stoakes
2026-05-09  9:47 ` [PATCH 0/2] mm/khugepaged: fix sub-PMD MADV_COLLAPSE range handling Lance Yang
2026-05-11  2:06   ` Wandun

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=af30otSSGbPxHbxM@lucifer \
    --to=ljs@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=chenwandun1@gmail.com \
    --cc=david@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shuah@kernel.org \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox