The Linux Kernel Mailing List
From: Lorenzo Stoakes <ljs@kernel.org>
To: Jann Horn <jannh@google.com>
Cc: fujunjie <fujunjie1@qq.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 "Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Shuah Khan <shuah@kernel.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 linux-kselftest@vger.kernel.org
Subject: Re: [PATCH] mm/mremap: unmap full fixed target for multi-VMA moves
Date: Mon, 11 May 2026 17:00:27 +0100
Message-ID: <agH7ONBSX7XLzS6q@lucifer>
In-Reply-To: <CAG48ez17evhhcoDKF6AoPqzYZwGuFCkVAsc-0bBzevf0LHYNxA@mail.gmail.com>

On Mon, May 11, 2026 at 05:40:24PM +0200, Jann Horn wrote:
> On Mon, May 11, 2026 at 5:32 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> > On Mon, May 11, 2026 at 05:19:50PM +0200, Jann Horn wrote:
> > > On Mon, May 11, 2026 at 5:05 PM Lorenzo Stoakes <ljs@kernel.org> wrote:
> > > > Hmmm I think it's a bit debatable honestly. The ability to handle there being
> > > > gaps is a _new thing_, so there are no semantics to speak of. Previously
> > > > mremap() simply required that you only span across a single VMA.
> > >
> > > FWIW, I think mremap() on a source region with gaps is such a
> > > hazardous operation that nearly no userspace code should be doing it -
> > > gaps are areas in which any mmap() call without a fixed address could
> > > place unrelated mappings (unless stack VMAs are involved, which would
> > > also be a weird scenario), so to use it safely, you have to, among
> > > other things, make sure not to use libc malloc() at a time when that
> > > could place an allocation in the gap (which means you also can't use
> > > printf(), and so on, unless you have swapped out the memory
> > > allocator), and make sure that you have no other threads that could be
> > > doing that, and so on. There are rare circumstances under which it
> > > could be safe, but I think it is almost always better to have a
> > > PROT_NONE anonymous VMA or such as a placeholder.
> >
> > Well, we're holding the mmap write lock so none of that could happen
> > _during_ the operation right?
>
> Not during the operation, but right before the operation. So from the
> userspace perspective, you have to know that there are no concurrent
> threads that could be creating memory mappings at non-fixed addresses,
> and you have to know that no mappings can have been created in the
> memory range between when you checked that it's empty and when you
> make the syscall.

That's a very good point :)
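
For readers following along, the placeholder approach Jann describes above (keep a PROT_NONE anonymous VMA where the gap would be) can be sketched roughly as below. This is only an illustration, assuming Linux and a single reservation; the helper names `reserve_region()`, `commit_range()` and `in_range()` are made up for this example and are not from the patch or the thread.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: the system page size as a size_t. */
static size_t page_size(void)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    return (size_t)ps;
}

/*
 * Hypothetical helper: reserve len bytes of address space with no access
 * rights. The whole range is now a VMA, so a non-fixed mmap() elsewhere in
 * the process cannot be placed inside it.
 */
static void *reserve_region(size_t len)
{
    return mmap(NULL, len, PROT_NONE,
                MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
}

/*
 * Hypothetical helper: turn [base + off, base + off + len) of a reservation
 * into usable read/write memory. MAP_FIXED replaces that part of the
 * placeholder in place. Returns 0 on success, -1 on failure.
 */
static int commit_range(void *base, size_t off, size_t len)
{
    void *want = (char *)base + off;
    void *got = mmap(want, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED, -1, 0);
    return got == want ? 0 : -1;
}

/* Hypothetical helper: does p fall inside [base + off, base + off + len)? */
static int in_range(const void *p, const void *base, size_t off, size_t len)
{
    uintptr_t lo = (uintptr_t)base + off;
    uintptr_t v = (uintptr_t)p;
    return v >= lo && v < lo + len;
}
```

With a three-page reservation and only pages 0 and 2 committed, the middle page stays a PROT_NONE placeholder rather than a true gap, so a later malloc() or non-fixed mmap() in any thread cannot land there.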

But I guess that applies to any operation that works over a range of mappings
anyway (madvise() also lets you do this, though it'll give an error code
_at the end_, _after having done the operations_, if there are gaps).

So madvise() can have the exact same thing happen, right? Which is... fun :)
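
The madvise() behaviour mentioned here can be observed with a small experiment. The sketch below assumes Linux and a private anonymous mapping, and uses MADV_DONTNEED only because its effect is easy to see (the pages read back as zero afterwards). The struct and function names are invented for this illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* What one madvise() call over a range containing a hole reported and did. */
struct gap_result {
    int rc;              /* madvise() return value, or -2 if setup failed */
    int err;             /* errno captured right after the call */
    unsigned char first; /* first byte of page 0 after the call */
    unsigned char last;  /* first byte of page 2 after the call */
};

/*
 * Map three pages, fill them with 0xaa, punch out the middle page, then
 * apply MADV_DONTNEED across the whole three-page range.
 */
static struct gap_result madvise_over_gap(void)
{
    struct gap_result res = { .rc = -2, .err = 0, .first = 0xff, .last = 0xff };
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;

    unsigned char *base = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) {
        res.err = errno;
        return res;
    }

    memset(base, 0xaa, 3 * page);
    munmap(base + page, page);            /* create the gap */

    errno = 0;
    res.rc = madvise(base, 3 * page, MADV_DONTNEED);
    res.err = errno;

    /* Pages on both sides of the gap are still mapped and readable. */
    res.first = base[0];
    res.last = base[2 * page];

    munmap(base, page);
    munmap(base + 2 * page, page);
    return res;
}
```

On the kernels I would expect this to run on, the call reports failure with ENOMEM while both surrounding pages have already been discarded, which matches the "error at the end, after doing the work" description. Note the experiment itself is only safe because nothing else in the process maps memory between the munmap() and the madvise().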

I actually wonder if we shouldn't just change this to disallow gaps. It'd
simplify the code and we could even do the check upfront in one pass. It's
doubtful anybody is relying on the gaps behaviour for anything real.

>
> > You might debate also the fact we hold that for an extended period.
>
> Eh, I mean, that's also true if you call mmap() with MAP_FIXED on a
> gigantic virtual address region with lots of populated PTEs in it or
> such, or if you call mprotect() on a big region. I don't think it is
> problematic that there are some very chonky MM syscalls you can make
> that will hold the mmap_lock for a long time, as long as we don't
> expect heavily multithreaded code to be doing those operations
> frequently. (Being able to keep the mmap lock held in write mode for a
> long time can be useful as an exploitation trick in some cases, but I
> don't think that's easy enough to fix to be worth the trouble of
> addressing it.)

Yeah true.

>
> > Honestly I probably shouldn't have allowed for this, I've had to do at
> > least one fixup relating to it I seem to recall and the semantics are
> > _clearly_ confusing.
> >
> > >
> > > I think the right documentation for this is "do not use this on a
> > > source region with gaps, it is technically possible but extremely
> > > hazardous".
> >
> > Yeah, I mean on reflection, allowing it was probably a mistake.
> >
> > The real use case was 'my VMAs are fragmented and I don't want to have to
> > know about VMA merge rules in order to move them', i.e. no gaps.
> >
> > I will do a manpage update and indicate that it probably shouldn't be used
> > but if it is, the semantics are such that gaps are not propagated (i.e. it
> > is as if you mremap()'d each individually).
>
> Thanks!

Cheers, Lorenzo
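
As an illustration of the "as if you mremap()'d each individually" wording quoted above, here is a userspace sketch that moves each piece with its own mremap() call, keeping its offset relative to the base. It does not use the multi-VMA move itself, so it should behave the same on older kernels. It assumes Linux and page-aligned pieces; `struct piece`, `move_piece()` and `move_pieces()` are hypothetical names for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical description of one mapped piece inside a source region. */
struct piece {
    size_t off; /* offset from the region base, page aligned */
    size_t len; /* length in bytes, a multiple of the page size */
};

/*
 * Move one piece from src to dst, replacing whatever is mapped at dst.
 * MREMAP_FIXED must be combined with MREMAP_MAYMOVE. Returns 0 on success.
 */
static int move_piece(void *src, void *dst, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0 && len > 0 && len % (size_t)ps == 0);

    void *p = mremap(src, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dst);
    return p == dst ? 0 : -1;
}

/*
 * Move each piece on its own. Target ranges that correspond to gaps between
 * pieces are never touched, so anything already mapped there stays mapped.
 * Stops at the first failure; pieces moved before that stay moved.
 */
static int move_pieces(char *src_base, char *dst_base,
                       const struct piece *pieces, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (move_piece(src_base + pieces[i].off,
                       dst_base + pieces[i].off, pieces[i].len) != 0)
            return -1;
    }
    return 0;
}
```

If the target is first reserved as a PROT_NONE placeholder, the part of the target that lines up with a source gap remains the placeholder after the moves, which is my reading of "gaps are not propagated".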

