* [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
From: Denis M. Karpov @ 2026-04-07 8:14 UTC
To: rppt, akpm, Liam.Howlett, ljs
Cc: vbabka, jannh, peterx, pfalcato, brauner, viro, jack, linux-mm, linux-fsdevel, linux-kernel, Denis M. Karpov

The current implementation of validate_range() in fs/userfaultfd.c
performs a hard check against mmap_min_addr without considering
capabilities, but the mmap() syscall uses security_mmap_addr()
which allows privileged processes (with CAP_SYS_RAWIO) to map below
mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
dac_mmap_min_addr variable which can be changed with
/proc/sys/vm/mmap_min_addr.

Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
with -EINVAL for valid memory areas that were successfully mapped
below mmap_min_addr even with appropriate capabilities.

This prevents apps like binary compilers from using UFFD for valid memory
regions mapped by application.

Replace the rigid mmap_min_addr check with security_mmap_addr() to align
userfaultfd with the standard kernel memory mapping security policy.

Signed-off-by: Denis M. Karpov <komlomal@gmail.com>

---
Initial RFC following the discussion on the [BUG] thread.
Link: https://lore.kernel.org/all/CADtiZd0tWysx5HMCUnOXfSHB7PXAuXg1Mh4eY_hUmH29S=sejg@mail.gmail.com/
---
 fs/userfaultfd.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index bdc84e521..dbfe5b2a0 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
 		return -EINVAL;
 	if (!len)
 		return -EINVAL;
-	if (start < mmap_min_addr)
-		return -EINVAL;
 	if (start >= task_size)
 		return -EINVAL;
 	if (len > task_size - start)
 		return -EINVAL;
 	if (start + len <= start)
 		return -EINVAL;
-	return 0;
+	return security_mmap_addr(start);
 }
 
 static __always_inline int validate_range(struct mm_struct *mm,
-- 
2.47.3
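
To make the difference between the two checks concrete, here is a small userspace model in C. It is only a sketch, not kernel code: `struct policy_model`, `model_old_check()` and `model_new_check()` are hypothetical names introduced for illustration, and the model covers only the capability part of security_mmap_addr() (the cap_mmap_addr() step named in the commit message), ignoring whatever other LSM hooks may add.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel state the two checks consult. */
struct policy_model {
	uint64_t mmap_min_addr;     /* limit used by the old hard check */
	uint64_t dac_mmap_min_addr; /* limit consulted by cap_mmap_addr() */
	bool has_cap_sys_rawio;     /* does the caller hold CAP_SYS_RAWIO? */
};

/* Old behaviour: any start below mmap_min_addr is rejected with -EINVAL. */
int model_old_check(const struct policy_model *p, uint64_t start)
{
	return start < p->mmap_min_addr ? -EINVAL : 0;
}

/*
 * Proposed behaviour (capability part only): a low start is rejected with
 * -EPERM unless the caller holds CAP_SYS_RAWIO.
 */
int model_new_check(const struct policy_model *p, uint64_t start)
{
	if (start < p->dac_mmap_min_addr && !p->has_cap_sys_rawio)
		return -EPERM;
	return 0;
}
```

With both limits set to 65536, a caller holding CAP_SYS_RAWIO that passes start = 4096 gets -EINVAL from the old model and 0 from the new one. The same call without the capability gets -EPERM from the new model.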
* Re: [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
From: Harry Yoo (Oracle) @ 2026-04-08 3:21 UTC
To: Denis M. Karpov
Cc: rppt, akpm, Liam.Howlett, ljs, vbabka, jannh, peterx, pfalcato, brauner, viro, jack, linux-mm, linux-fsdevel, linux-kernel

On Tue, Apr 07, 2026 at 11:14:42AM +0300, Denis M. Karpov wrote:
> The current implementation of validate_range() in fs/userfaultfd.c
> performs a hard check against mmap_min_addr without considering
> capabilities, but the mmap() syscall uses security_mmap_addr()
> which allows privileged processes (with CAP_SYS_RAWIO) to map below
> mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
> dac_mmap_min_addr variable which can be changed with
> /proc/sys/vm/mmap_min_addr.
>
> Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
> with -EINVAL for valid memory areas that were successfully mapped
> below mmap_min_addr even with appropriate capabilities.
>
> This prevents apps like binary compilers from using UFFD for valid memory
> regions mapped by application.
>
> Replace the rigid mmap_min_addr check with security_mmap_addr() to align
> userfaultfd with the standard kernel memory mapping security policy.

Perhaps worth adding

Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")

> Signed-off-by: Denis M. Karpov <komlomal@gmail.com>
>
> ---
>  fs/userfaultfd.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index bdc84e521..dbfe5b2a0 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
>  		return -EINVAL;
>  	if (!len)
>  		return -EINVAL;
> -	if (start < mmap_min_addr)
> -		return -EINVAL;
>  	if (start >= task_size)
>  		return -EINVAL;
>  	if (len > task_size - start)
>  		return -EINVAL;
>  	if (start + len <= start)
>  		return -EINVAL;
> -	return 0;
> +	return security_mmap_addr(start);

Hmm, but it looks a bit strange to check the capability for an address
that is already mapped by mmap(). Why is this required?

>  }

-- 
Cheers,
Harry / Hyeonggon
* Re: [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
From: Denis M. Karpov @ 2026-04-08 8:09 UTC
To: Harry Yoo (Oracle), Andrea Arcangeli
Cc: rppt, akpm, Liam.Howlett, ljs, vbabka, jannh, peterx, pfalcato, brauner, viro, jack, linux-mm, linux-fsdevel, linux-kernel

> Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")

Thank you, I will add this Fixes tag in the next patch.

> Hmm, but it looks a bit strange to check the capability for an address
> that is already mapped by mmap(). Why is this required?

Actually, it's not obvious to me either, but I may be missing something.
My intent was to replace the current restrictive check with a more
flexible one. I think performing this check here allows us to deny
invalid requests early, before locks or VMA lookups occur.

Removing this check entirely would also allow using UFFD in cases where
a task drops privileges after the initial mmap(). This seems reasonable
because the VMA already exists, i.e. the kernel already allowed this
mapping.

In the [BUG] thread discussion, Andrea Arcangeli also suggested adding a
check for FIRST_USER_ADDRESS to handle architectural constraints.

Andrea, could you please comment on this? Specifically, would a check
against FIRST_USER_ADDRESS be sufficient here, or do we still need to
check caps?

On Wed, Apr 8, 2026 at 6:21 AM Harry Yoo (Oracle) <harry@kernel.org> wrote:
>
> On Tue, Apr 07, 2026 at 11:14:42AM +0300, Denis M. Karpov wrote:
> > The current implementation of validate_range() in fs/userfaultfd.c
> > performs a hard check against mmap_min_addr without considering
> > capabilities, but the mmap() syscall uses security_mmap_addr()
> > which allows privileged processes (with CAP_SYS_RAWIO) to map below
> > mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
> > dac_mmap_min_addr variable which can be changed with
> > /proc/sys/vm/mmap_min_addr.
> >
> > Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
> > with -EINVAL for valid memory areas that were successfully mapped
> > below mmap_min_addr even with appropriate capabilities.
> >
> > This prevents apps like binary compilers from using UFFD for valid memory
> > regions mapped by application.
> >
> > Replace the rigid mmap_min_addr check with security_mmap_addr() to align
> > userfaultfd with the standard kernel memory mapping security policy.
>
> Perhaps worth adding
>
> Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
>
> > Signed-off-by: Denis M. Karpov <komlomal@gmail.com>
> >
> > ---
> >  fs/userfaultfd.c | 4 +---
> >  1 file changed, 1 insertion(+), 3 deletions(-)
> >
> > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> > index bdc84e521..dbfe5b2a0 100644
> > --- a/fs/userfaultfd.c
> > +++ b/fs/userfaultfd.c
> > @@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
> >  		return -EINVAL;
> >  	if (!len)
> >  		return -EINVAL;
> > -	if (start < mmap_min_addr)
> > -		return -EINVAL;
> >  	if (start >= task_size)
> >  		return -EINVAL;
> >  	if (len > task_size - start)
> >  		return -EINVAL;
> >  	if (start + len <= start)
> >  		return -EINVAL;
> > -	return 0;
> > +	return security_mmap_addr(start);
>
> Hmm, but it looks a bit strange to check the capability for an address
> that is already mapped by mmap(). Why is this required?
>
> >  }
>
> --
> Cheers,
> Harry / Hyeonggon
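
The case under discussion (a range that mmap() has already accepted, then passed to UFFDIO_REGISTER) could be probed from userspace roughly as follows. This is a hedged sketch rather than the reproducer from the [BUG] thread: `try_register_fixed()` is a hypothetical helper name, and the result depends on the running kernel, the vm.mmap_min_addr setting, the caller's capabilities, and whether unprivileged userfaultfd is permitted.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/userfaultfd.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes at the fixed address `addr`, then try
 * to register that range with userfaultfd. Returns 0 on success or the
 * negative errno of the first step that failed.
 */
int try_register_fixed(unsigned long addr, size_t len)
{
	struct uffdio_api api = { .api = UFFD_API };
	struct uffdio_register reg = {
		.range = { .start = addr, .len = len },
		.mode = UFFDIO_REGISTER_MODE_MISSING,
	};
	int ret = 0;
	int fd;
	void *p;

	/* The mapping itself is subject to the mmap() address policy. */
	p = mmap((void *)addr, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED_NOREPLACE, -1, 0);
	if (p == MAP_FAILED)
		return -errno;
	if (p != (void *)addr) {
		/* Kernels without MAP_FIXED_NOREPLACE treat addr as a hint. */
		munmap(p, len);
		return -EEXIST;
	}

	fd = (int)syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
	if (fd < 0) {
		ret = -errno;
		goto out_unmap;
	}

	/* The API handshake must succeed before UFFDIO_REGISTER is usable. */
	if (ioctl(fd, UFFDIO_API, &api) < 0 ||
	    ioctl(fd, UFFDIO_REGISTER, &reg) < 0)
		ret = -errno;

	close(fd);
out_unmap:
	munmap(p, len);
	return ret;
}
```

Calling `try_register_fixed(4096, 4096)` from a process that is allowed to map that page gives a quick way to see which error, if any, the registration step produces on a given kernel.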
* Re: [RFC PATCH] userfaultfd: allow registration of ranges below mmap_min_addr
From: Usama Arif @ 2026-04-08 12:36 UTC
To: Denis M. Karpov
Cc: Usama Arif, rppt, akpm, Liam.Howlett, ljs, vbabka, jannh, peterx, pfalcato, brauner, viro, jack, linux-mm, linux-fsdevel, linux-kernel

On Tue, 7 Apr 2026 11:14:42 +0300 "Denis M. Karpov" <komlomal@gmail.com> wrote:

> The current implementation of validate_range() in fs/userfaultfd.c
> performs a hard check against mmap_min_addr without considering
> capabilities, but the mmap() syscall uses security_mmap_addr()
> which allows privileged processes (with CAP_SYS_RAWIO) to map below
> mmap_min_addr. Furthermore, security_mmap_addr()->cap_mmap_addr() uses
> dac_mmap_min_addr variable which can be changed with
> /proc/sys/vm/mmap_min_addr.
>
> Because userfaultfd uses a different check, UFFDIO_REGISTER may fail
> with -EINVAL for valid memory areas that were successfully mapped
> below mmap_min_addr even with appropriate capabilities.
>
> This prevents apps like binary compilers from using UFFD for valid memory
> regions mapped by application.
>
> Replace the rigid mmap_min_addr check with security_mmap_addr() to align
> userfaultfd with the standard kernel memory mapping security policy.
>
> Signed-off-by: Denis M. Karpov <komlomal@gmail.com>
>
> ---
> Initial RFC following the discussion on the [BUG] thread.
> Link: https://lore.kernel.org/all/CADtiZd0tWysx5HMCUnOXfSHB7PXAuXg1Mh4eY_hUmH29S=sejg@mail.gmail.com/
> ---
>  fs/userfaultfd.c | 4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index bdc84e521..dbfe5b2a0 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1238,15 +1238,13 @@ static __always_inline int validate_unaligned_range(
>  		return -EINVAL;
>  	if (!len)
>  		return -EINVAL;
> -	if (start < mmap_min_addr)
> -		return -EINVAL;
>  	if (start >= task_size)
>  		return -EINVAL;
>  	if (len > task_size - start)
>  		return -EINVAL;
>  	if (start + len <= start)
>  		return -EINVAL;
> -	return 0;
> +	return security_mmap_addr(start);

Is this introducing an ABI change? The old code returned -EINVAL when
start was below mmap_min_addr. The new code calls security_mmap_addr(),
which returns -EPERM when the caller lacks CAP_SYS_RAWIO. Existing
userspace callers checking specifically for -EINVAL would see different
behavior when start is below mmap_min_addr.

>  }
>
>  static __always_inline int validate_range(struct mm_struct *mm,
> --
> 2.47.3
>
>
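
The ABI concern can be illustrated from the caller's side with a small C sketch. The enum and `classify_register_errno()` are hypothetical names, not part of any userfaultfd API. The point is only that a caller which branches on EINVAL alone would send the low-address case down a different path once the kernel reports EPERM for it.

```c
#include <errno.h>

/* Hypothetical categories a caller might derive from UFFDIO_REGISTER. */
enum reg_outcome {
	REG_OK,
	REG_BAD_RANGE,     /* EINVAL: what the old low-address check returned */
	REG_NOT_PERMITTED, /* EPERM: what the security_mmap_addr() path returns */
	REG_OTHER,
};

/* Classify the errno value (positive, as found in errno; 0 for success). */
enum reg_outcome classify_register_errno(int err)
{
	switch (err) {
	case 0:
		return REG_OK;
	case EINVAL:
		return REG_BAD_RANGE;
	case EPERM:
		return REG_NOT_PERMITTED;
	default:
		return REG_OTHER;
	}
}
```

A caller written against the old behaviour that handles only `REG_BAD_RANGE` would need an extra branch for `REG_NOT_PERMITTED` to treat an unprivileged low-address registration the same way as before.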