From: "Harry Yoo (Oracle)" <harry@kernel.org>
To: "Denis M. Karpov" <komlomal@gmail.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>,
	rppt@kernel.org, akpm@linux-foundation.org,
	Liam.Howlett@oracle.com, ljs@kernel.org, vbabka@kernel.org,
	jannh@google.com, peterx@redhat.com, pfalcato@suse.de,
	brauner@kernel.org, viro@zeniv.linux.org.uk, jack@suse.cz,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
Date: Thu, 9 Apr 2026 11:51:11 +0900
Message-ID: <adcUHx27oZLftSjS@hyeyoo>
In-Reply-To: <CADtiZd3AcY2LwPfCkb782TaY1h50dTPpEzhfLSyMeQeXG6VXfA@mail.gmail.com>

On Wed, Apr 08, 2026 at 11:09:00AM +0300, Denis M. Karpov wrote:
> > Hmm, but it looks a bit strange to check a capability for an address
> > that is already mapped by mmap(). Why is this required?
>
> Actually, it's not obvious to me either, but I may miss something.
> My intent was to replace the current restrictive check with a more flexible one.

Technically, it's less restrictive only if start < mmap_min_addr
(setting aside the discussion of whether this is an appropriate check).

Otherwise (start >= mmap_min_addr), isn't it more restrictive? (The process
would now need the capability to register an existing VMA with userfaultfd.)
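
To make the comparison concrete, here is a rough userspace model of the two
checks. It is only a sketch: MODEL_MMAP_MIN_ADDR, old_check(), new_check() and
demo() are hypothetical names, mmap_min_addr and dac_mmap_min_addr are assumed
to be equal, and only the CAP_SYS_RAWIO part of security_mmap_addr() is
modelled. Other LSM hooks are left out, so the model says nothing about
whether the new check is stricter at or above mmap_min_addr.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for mmap_min_addr and dac_mmap_min_addr (assumed equal). */
#define MODEL_MMAP_MIN_ADDR 65536UL

/* Current behaviour: a hard floor; capabilities are never consulted. */
static int old_check(unsigned long start)
{
	return start < MODEL_MMAP_MIN_ADDR ? -EINVAL : 0;
}

/*
 * Proposed behaviour, modelling only the capability part of
 * security_mmap_addr(): low addresses need CAP_SYS_RAWIO.
 * Other LSM hooks are deliberately not modelled.
 */
static int new_check(unsigned long start, bool has_cap_sys_rawio)
{
	if (start < MODEL_MMAP_MIN_ADDR && !has_cap_sys_rawio)
		return -EPERM;
	return 0;
}

/* Walks through the cases discussed above. */
static void demo(void)
{
	/* Below the floor: the old check always refuses. */
	assert(old_check(4096) == -EINVAL);
	/* Below the floor: the new check depends on the capability. */
	assert(new_check(4096, false) == -EPERM);
	assert(new_check(4096, true) == 0);
	/* At the floor: both pass in this capability-only model. */
	assert(old_check(MODEL_MMAP_MIN_ADDR) == 0);
	assert(new_check(MODEL_MMAP_MIN_ADDR, false) == 0);
}
```

Note that the failure code also changes in this model, from -EINVAL to -EPERM,
for a low address without the capability.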

> I think performing this check here allows us to deny invalid requests early,
> before locks or VMA lookups occur.

But we're not trying to optimize this path, and we shouldn't add checks
for the sake of optimization without a proper explanation.

> Removing this check entirely would also allow using UFFD in cases where a task
> drops privileges after the initial mmap(). This seems reasonable because the
> VMA already exists, i.e. kernel already allowed this mapping.

Yeah, that seems reasonable to me.

IOW, I don't think "creating a VMA on a specific address (w/ proper
capabilities) is okay but once it is registered to userfaultfd,
it becomes a security hole" is a valid argument.

And we don't unmap those mappings when the process loses the capability
to map them anyway.

> In the [BUG] thread discussion

Was it a private discussion? I can't find Andrea's emails on the thread.

> Andrea Arcangeli also suggested adding a check for
> FIRST_USER_ADDRESS to handle architectural constraints.

Again, what's the point of checking this on the VMA that is already created?
*checks why FIRST_USER_ADDRESS was introduced*

commit e2cdef8c847b480529b7e26991926aab4be008e6
Author: Hugh Dickins <hugh@veritas.com>
Date:   Tue Apr 19 13:29:19 2005 -0700

    [PATCH] freepgt: free_pgtables from FIRST_USER_ADDRESS

    The patches to free_pgtables by vma left problems on any architectures which
    leave some user address page table entries unencapsulated by vma.  Andi has
    fixed the 32-bit vDSO on x86_64 to use a vma.  Now fix arm (and arm26), whose
    first PAGE_SIZE is reserved (perhaps) for machine vectors.

    Our calls to free_pgtables must not touch that area, and exit_mmap's
    BUG_ON(nr_ptes) must allow that arm's get_pgd_slow may (or may not) have
    allocated an extra page table, which its free_pgd_slow would free later.

    FIRST_USER_PGD_NR has misled me and others: until all the arches define
    FIRST_USER_ADDRESS instead, a hack in mmap.c to derive one from t'other.  This
    patch fixes the bugs, the remaining patches just clean it up.

    Signed-off-by: Hugh Dickins <hugh@veritas.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

Oh, OK. There might be a raw mapping without a VMA below FIRST_USER_ADDRESS.

Adding such a check wouldn't hurt... but if there is no VMA, you can't
register the range to userfaultfd anyway?
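
One way to poke at this from userspace is sketched below. It is an untested
illustration, not a selftest: register_unmapped_range() is a hypothetical
helper that maps one page only to learn a free address, unmaps it, and then
tries UFFDIO_REGISTER on the hole. I would expect the ioctl to fail, but the
helper does not assume a particular errno, and it reports -1 when userfaultfd
is unavailable or not permitted in the environment.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Returns 1 if UFFDIO_REGISTER succeeded on a range with no VMA,
 * 0 if the kernel refused it, and -1 if the experiment could not run
 * (for example, userfaultfd is unavailable or not permitted here).
 */
static int register_unmapped_range(void)
{
	long page = sysconf(_SC_PAGESIZE);
	int flags = O_CLOEXEC | O_NONBLOCK;
#ifdef UFFD_USER_MODE_ONLY
	/* Lets unprivileged callers create a uffd on newer kernels. */
	flags |= UFFD_USER_MODE_ONLY;
#endif
	int fd = (int)syscall(SYS_userfaultfd, flags);
	if (fd < 0)
		return -1;

	struct uffdio_api api = { .api = UFFD_API, .features = 0 };
	if (ioctl(fd, UFFDIO_API, &api) < 0) {
		close(fd);
		return -1;
	}

	/* Map one page only to learn a free address, then drop the VMA. */
	void *addr = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (addr == MAP_FAILED || munmap(addr, (size_t)page) < 0) {
		close(fd);
		return -1;
	}

	struct uffdio_register reg = {
		.range = {
			.start = (unsigned long)addr,
			.len = (unsigned long)page,
		},
		.mode = UFFDIO_REGISTER_MODE_MISSING,
	};
	int ret = ioctl(fd, UFFDIO_REGISTER, &reg);

	close(fd);
	return ret == 0 ? 1 : 0;
}
```

If the helper ever returned 1, that would mean a range with no VMA was
accepted, which is the case the FIRST_USER_ADDRESS check would be guarding
against.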

> Andrea, could you please comment on this? Specifically, would a
> check against FIRST_USER_ADDRESS be sufficient here, or do we still
> need to check caps?
> 
> On Wed, Apr 8, 2026 at 6:21 AM Harry Yoo (Oracle) <harry@kernel.org> wrote:
> >
> > On Tue, Apr 07, 2026 at 11:14:42AM +0300, Denis M. Karpov wrote:
> > > The current implementation of validate_range() in fs/userfaultfd.c
> > > performs a hard check against mmap_min_addr without considering
> > > capabilities, but the mmap() syscall uses security_mmap_addr()
> > > which allows privileged processes (with CAP_SYS_RAWIO) to map below
> > > mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
> > > dac_mmap_min_addr variable which can be changed with
> > > /proc/sys/vm/mmap_min_addr.
> > >
> > > Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
> > > with -EINVAL for valid memory areas that were successfully mapped
> > > below mmap_min_addr even with appropriate capabilities.
> > >
> > > This prevents apps like binary compilers from using UFFD for valid memory
> > > regions mapped by application.
> > >
> > > Replace the rigid mmap_min_addr check with security_mmap_addr() to align
> > > userfaultfd with the standard kernel memory mapping security policy.
> >
> > Perhaps worth adding
> >
> > Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
> >
> > > Signed-off-by: Denis M. Karpov <komlomal@gmail.com>
> > >
> > > ---
> > >  fs/userfaultfd.c | 4 +---
> > >  1 file changed, 1 insertion(+), 3 deletions(-)
> > >
> > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > > index bdc84e521..dbfe5b2a0 100644
> > > --- a/fs/userfaultfd.c
> > > +++ b/fs/userfaultfd.c
> > > @@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
> > >               return -EINVAL;
> > >       if (!len)
> > >               return -EINVAL;
> > > -     if (start < mmap_min_addr)
> > > -             return -EINVAL;
> > >       if (start >= task_size)
> > >               return -EINVAL;
> > >       if (len > task_size - start)
> > >               return -EINVAL;
> > >       if (start + len <= start)
> > >               return -EINVAL;
> > > -     return 0;
> > > +     return security_mmap_addr(start);
> >
> > > Hmm, but it looks a bit strange to check a capability for an address
> > > that is already mapped by mmap(). Why is this required?

-- 
Cheers,
Harry / Hyeonggon



