* [PATCH v3 2/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd
2025-06-20 1:24 [PATCH v3 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman
2025-06-20 1:24 ` [PATCH v3 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman
@ 2025-06-20 1:24 ` Tal Zussman
2025-06-20 1:24 ` [PATCH v3 3/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman
2025-06-20 1:24 ` [PATCH v3 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET Tal Zussman
3 siblings, 0 replies; 5+ messages in thread
From: Tal Zussman @ 2025-06-20 1:24 UTC (permalink / raw)
To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand,
Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli
Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman
Currently, a VMA registered with a uffd can be unregistered through a
different uffd associated with the same mm_struct.
The existing behavior is slightly broken and may incorrectly reject
unregistering some VMAs due to the following check:
if (!vma_can_userfault(cur, cur->vm_flags, wp_async))
goto out_unlock;
where wp_async is derived from ctx, not from cur. For example, a
file-backed VMA registered with wp_async enabled and UFFD_WP mode cannot
be unregistered through a uffd that does not have wp_async enabled.
Rather than fix this and maintain this odd behavior, make unregistration
stricter by requiring VMAs to be unregistered through the same uffd they
were registered with. Additionally, reorder the BUG() checks to avoid
the aforementioned wp_async issue in them. Convert the existing check to
VM_WARN_ON_ONCE() as BUG_ON() is deprecated.
This change slightly modifies the ABI. It should not be backported to
-stable. It is expected that no one depends on this behavior, and no
such cases are known.
While at it, correct the comment for the no userfaultfd case. This seems to
be a copy-paste artifact from the analogous userfaultfd_register() check.
Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization")
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
---
fs/userfaultfd.c | 17 +++++++++++------
1 file changed, 11 insertions(+), 6 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 22f4bf956ba1..8e7fb2a7a6aa 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1467,6 +1467,14 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^
!!(cur->vm_flags & __VM_UFFD_FLAGS));
+ /*
+ * Prevent unregistering through a different userfaultfd than
+ * the one used for registration.
+ */
+ if (cur->vm_userfaultfd_ctx.ctx &&
+ cur->vm_userfaultfd_ctx.ctx != ctx)
+ goto out_unlock;
+
/*
* Check not compatible vmas, not strictly required
* here as not compatible vmas cannot have an
@@ -1490,15 +1498,12 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
for_each_vma_range(vmi, vma, end) {
cond_resched();
- BUG_ON(!vma_can_userfault(vma, vma->vm_flags, wp_async));
-
- /*
- * Nothing to do: this vma is already registered into this
- * userfaultfd and with the right tracking mode too.
- */
+ /* VMA not registered with userfaultfd. */
if (!vma->vm_userfaultfd_ctx.ctx)
goto skip;
+ VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx != ctx);
+ VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async));
WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
if (vma->vm_start > start)
--
2.39.5
^ permalink raw reply related [flat|nested] 5+ messages in thread
* [PATCH v3 3/4] userfaultfd: remove (VM_)BUG_ON()s
2025-06-20 1:24 [PATCH v3 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman
2025-06-20 1:24 ` [PATCH v3 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman
2025-06-20 1:24 ` [PATCH v3 2/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd Tal Zussman
@ 2025-06-20 1:24 ` Tal Zussman
2025-06-20 1:24 ` [PATCH v3 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET Tal Zussman
3 siblings, 0 replies; 5+ messages in thread
From: Tal Zussman @ 2025-06-20 1:24 UTC (permalink / raw)
To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand,
Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli
Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman
BUG_ON() is deprecated [1]. Convert all the BUG_ON()s and VM_BUG_ON()s
to use VM_WARN_ON_ONCE().
There are a few additional cases that are converted or modified:
- Convert the printk(KERN_WARNING ...) in handle_userfault() to use
pr_warn().
- Convert the WARN_ON_ONCE()s in move_pages() to use VM_WARN_ON_ONCE(),
as the relevant conditions are already checked in validate_range() in
move_pages()'s caller.
- Convert the VM_WARN_ON()'s in move_pages() to VM_WARN_ON_ONCE(). These
cases should never happen and are similar to those in mfill_atomic()
and mfill_atomic_hugetlb(), which were previously BUG_ON()s.
move_pages() was added later than those functions and makes use of
VM_WARN_ON() as a replacement for the deprecated BUG_ON(), but.
VM_WARN_ON_ONCE() is likely a better direct replacement.
- Convert the WARN_ON() for !VM_MAYWRITE in userfaultfd_unregister() and
userfaultfd_register_range() to VM_WARN_ON_ONCE(). This condition is
enforced in userfaultfd_register() so it should never happen, and can
be converted to a debug check.
[1] https://www.kernel.org/doc/html/v6.15/process/coding-style.html#use-warn-rather-than-bug
Signed-off-by: Tal Zussman <tz2294@columbia.edu>
---
fs/userfaultfd.c | 59 ++++++++++++++++++++++++------------------------
mm/userfaultfd.c | 68 +++++++++++++++++++++++++++-----------------------------
2 files changed, 62 insertions(+), 65 deletions(-)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 8e7fb2a7a6aa..771e81ea4ef6 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -165,14 +165,14 @@ static void userfaultfd_ctx_get(struct userfaultfd_ctx *ctx)
static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx)
{
if (refcount_dec_and_test(&ctx->refcount)) {
- VM_BUG_ON(spin_is_locked(&ctx->fault_pending_wqh.lock));
- VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh));
- VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock));
- VM_BUG_ON(waitqueue_active(&ctx->fault_wqh));
- VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock));
- VM_BUG_ON(waitqueue_active(&ctx->event_wqh));
- VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock));
- VM_BUG_ON(waitqueue_active(&ctx->fd_wqh));
+ VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_pending_wqh.lock));
+ VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_pending_wqh));
+ VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_wqh.lock));
+ VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_wqh));
+ VM_WARN_ON_ONCE(spin_is_locked(&ctx->event_wqh.lock));
+ VM_WARN_ON_ONCE(waitqueue_active(&ctx->event_wqh));
+ VM_WARN_ON_ONCE(spin_is_locked(&ctx->fd_wqh.lock));
+ VM_WARN_ON_ONCE(waitqueue_active(&ctx->fd_wqh));
mmdrop(ctx->mm);
kmem_cache_free(userfaultfd_ctx_cachep, ctx);
}
@@ -383,12 +383,12 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
if (!ctx)
goto out;
- BUG_ON(ctx->mm != mm);
+ VM_WARN_ON_ONCE(ctx->mm != mm);
/* Any unrecognized flag is a bug. */
- VM_BUG_ON(reason & ~__VM_UFFD_FLAGS);
+ VM_WARN_ON_ONCE(reason & ~__VM_UFFD_FLAGS);
/* 0 or > 1 flags set is a bug; we expect exactly 1. */
- VM_BUG_ON(!reason || (reason & (reason - 1)));
+ VM_WARN_ON_ONCE(!reason || (reason & (reason - 1)));
if (ctx->features & UFFD_FEATURE_SIGBUS)
goto out;
@@ -411,12 +411,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
* to be sure not to return SIGBUS erroneously on
* nowait invocations.
*/
- BUG_ON(vmf->flags & FAULT_FLAG_RETRY_NOWAIT);
+ VM_WARN_ON_ONCE(vmf->flags & FAULT_FLAG_RETRY_NOWAIT);
#ifdef CONFIG_DEBUG_VM
if (printk_ratelimit()) {
- printk(KERN_WARNING
- "FAULT_FLAG_ALLOW_RETRY missing %x\n",
- vmf->flags);
+ pr_warn("FAULT_FLAG_ALLOW_RETRY missing %x\n",
+ vmf->flags);
dump_stack();
}
#endif
@@ -602,7 +601,7 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
*/
out:
atomic_dec(&ctx->mmap_changing);
- VM_BUG_ON(atomic_read(&ctx->mmap_changing) < 0);
+ VM_WARN_ON_ONCE(atomic_read(&ctx->mmap_changing) < 0);
userfaultfd_ctx_put(ctx);
}
@@ -710,7 +709,7 @@ void dup_userfaultfd_fail(struct list_head *fcs)
struct userfaultfd_ctx *ctx = fctx->new;
atomic_dec(&octx->mmap_changing);
- VM_BUG_ON(atomic_read(&octx->mmap_changing) < 0);
+ VM_WARN_ON_ONCE(atomic_read(&octx->mmap_changing) < 0);
userfaultfd_ctx_put(octx);
userfaultfd_ctx_put(ctx);
@@ -1317,8 +1316,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
do {
cond_resched();
- BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^
- !!(cur->vm_flags & __VM_UFFD_FLAGS));
+ VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^
+ !!(cur->vm_flags & __VM_UFFD_FLAGS));
/* check not compatible vmas */
ret = -EINVAL;
@@ -1372,7 +1371,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx,
found = true;
} for_each_vma_range(vmi, cur, end);
- BUG_ON(!found);
+ VM_WARN_ON_ONCE(!found);
ret = userfaultfd_register_range(ctx, vma, vm_flags, start, end,
wp_async);
@@ -1464,8 +1463,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
do {
cond_resched();
- BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^
- !!(cur->vm_flags & __VM_UFFD_FLAGS));
+ VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^
+ !!(cur->vm_flags & __VM_UFFD_FLAGS));
/*
* Prevent unregistering through a different userfaultfd than
@@ -1487,7 +1486,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
found = true;
} for_each_vma_range(vmi, cur, end);
- BUG_ON(!found);
+ VM_WARN_ON_ONCE(!found);
vma_iter_set(&vmi, start);
prev = vma_prev(&vmi);
@@ -1504,7 +1503,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx,
VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx != ctx);
VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async));
- WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
+ VM_WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE));
if (vma->vm_start > start)
start = vma->vm_start;
@@ -1569,7 +1568,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
* len == 0 means wake all and we don't want to wake all here,
* so check it again to be sure.
*/
- VM_BUG_ON(!range.len);
+ VM_WARN_ON_ONCE(!range.len);
wake_userfault(ctx, &range);
ret = 0;
@@ -1626,7 +1625,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
return -EFAULT;
if (ret < 0)
goto out;
- BUG_ON(!ret);
+ VM_WARN_ON_ONCE(!ret);
/* len == 0 would wake all */
range.len = ret;
if (!(uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE)) {
@@ -1681,7 +1680,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
if (ret < 0)
goto out;
/* len == 0 would wake all */
- BUG_ON(!ret);
+ VM_WARN_ON_ONCE(!ret);
range.len = ret;
if (!(uffdio_zeropage.mode & UFFDIO_ZEROPAGE_MODE_DONTWAKE)) {
range.start = uffdio_zeropage.range.start;
@@ -1793,7 +1792,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
goto out;
/* len == 0 would wake all */
- BUG_ON(!ret);
+ VM_WARN_ON_ONCE(!ret);
range.len = ret;
if (!(uffdio_continue.mode & UFFDIO_CONTINUE_MODE_DONTWAKE)) {
range.start = uffdio_continue.range.start;
@@ -1850,7 +1849,7 @@ static inline int userfaultfd_poison(struct userfaultfd_ctx *ctx, unsigned long
goto out;
/* len == 0 would wake all */
- BUG_ON(!ret);
+ VM_WARN_ON_ONCE(!ret);
range.len = ret;
if (!(uffdio_poison.mode & UFFDIO_POISON_MODE_DONTWAKE)) {
range.start = uffdio_poison.range.start;
@@ -2111,7 +2110,7 @@ static int new_userfaultfd(int flags)
struct file *file;
int fd;
- BUG_ON(!current->mm);
+ VM_WARN_ON_ONCE(!current->mm);
/* Check the UFFD_* constants for consistency. */
BUILD_BUG_ON(UFFD_USER_MODE_ONLY & UFFD_SHARED_FCNTL_FLAGS);
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index bc473ad21202..240578bed181 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -561,7 +561,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
}
while (src_addr < src_start + len) {
- BUG_ON(dst_addr >= dst_start + len);
+ VM_WARN_ON_ONCE(dst_addr >= dst_start + len);
/*
* Serialize via vma_lock and hugetlb_fault_mutex.
@@ -602,7 +602,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
if (unlikely(err == -ENOENT)) {
up_read(&ctx->map_changing_lock);
uffd_mfill_unlock(dst_vma);
- BUG_ON(!folio);
+ VM_WARN_ON_ONCE(!folio);
err = copy_folio_from_user(folio,
(const void __user *)src_addr, true);
@@ -614,7 +614,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
dst_vma = NULL;
goto retry;
} else
- BUG_ON(folio);
+ VM_WARN_ON_ONCE(folio);
if (!err) {
dst_addr += vma_hpagesize;
@@ -635,9 +635,9 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
out:
if (folio)
folio_put(folio);
- BUG_ON(copied < 0);
- BUG_ON(err > 0);
- BUG_ON(!copied && !err);
+ VM_WARN_ON_ONCE(copied < 0);
+ VM_WARN_ON_ONCE(err > 0);
+ VM_WARN_ON_ONCE(!copied && !err);
return copied ? copied : err;
}
#else /* !CONFIG_HUGETLB_PAGE */
@@ -711,12 +711,12 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
/*
* Sanitize the command parameters:
*/
- BUG_ON(dst_start & ~PAGE_MASK);
- BUG_ON(len & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(dst_start & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(len & ~PAGE_MASK);
/* Does the address range wrap, or is the span zero-sized? */
- BUG_ON(src_start + len <= src_start);
- BUG_ON(dst_start + len <= dst_start);
+ VM_WARN_ON_ONCE(src_start + len <= src_start);
+ VM_WARN_ON_ONCE(dst_start + len <= dst_start);
src_addr = src_start;
dst_addr = dst_start;
@@ -775,7 +775,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
while (src_addr < src_start + len) {
pmd_t dst_pmdval;
- BUG_ON(dst_addr >= dst_start + len);
+ VM_WARN_ON_ONCE(dst_addr >= dst_start + len);
dst_pmd = mm_alloc_pmd(dst_mm, dst_addr);
if (unlikely(!dst_pmd)) {
@@ -818,7 +818,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
up_read(&ctx->map_changing_lock);
uffd_mfill_unlock(dst_vma);
- BUG_ON(!folio);
+ VM_WARN_ON_ONCE(!folio);
kaddr = kmap_local_folio(folio, 0);
err = copy_from_user(kaddr,
@@ -832,7 +832,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
flush_dcache_folio(folio);
goto retry;
} else
- BUG_ON(folio);
+ VM_WARN_ON_ONCE(folio);
if (!err) {
dst_addr += PAGE_SIZE;
@@ -852,9 +852,9 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
out:
if (folio)
folio_put(folio);
- BUG_ON(copied < 0);
- BUG_ON(err > 0);
- BUG_ON(!copied && !err);
+ VM_WARN_ON_ONCE(copied < 0);
+ VM_WARN_ON_ONCE(err > 0);
+ VM_WARN_ON_ONCE(!copied && !err);
return copied ? copied : err;
}
@@ -940,11 +940,11 @@ int mwriteprotect_range(struct userfaultfd_ctx *ctx, unsigned long start,
/*
* Sanitize the command parameters:
*/
- BUG_ON(start & ~PAGE_MASK);
- BUG_ON(len & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(start & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(len & ~PAGE_MASK);
/* Does the address range wrap, or is the span zero-sized? */
- BUG_ON(start + len <= start);
+ VM_WARN_ON_ONCE(start + len <= start);
mmap_read_lock(dst_mm);
@@ -1709,15 +1709,13 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
ssize_t moved = 0;
/* Sanitize the command parameters. */
- if (WARN_ON_ONCE(src_start & ~PAGE_MASK) ||
- WARN_ON_ONCE(dst_start & ~PAGE_MASK) ||
- WARN_ON_ONCE(len & ~PAGE_MASK))
- goto out;
+ VM_WARN_ON_ONCE(src_start & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(dst_start & ~PAGE_MASK);
+ VM_WARN_ON_ONCE(len & ~PAGE_MASK);
/* Does the address range wrap, or is the span zero-sized? */
- if (WARN_ON_ONCE(src_start + len <= src_start) ||
- WARN_ON_ONCE(dst_start + len <= dst_start))
- goto out;
+ VM_WARN_ON_ONCE(src_start + len < src_start);
+ VM_WARN_ON_ONCE(dst_start + len < dst_start);
err = uffd_move_lock(mm, dst_start, src_start, &dst_vma, &src_vma);
if (err)
@@ -1867,9 +1865,9 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
up_read(&ctx->map_changing_lock);
uffd_move_unlock(dst_vma, src_vma);
out:
- VM_WARN_ON(moved < 0);
- VM_WARN_ON(err > 0);
- VM_WARN_ON(!moved && !err);
+ VM_WARN_ON_ONCE(moved < 0);
+ VM_WARN_ON_ONCE(err > 0);
+ VM_WARN_ON_ONCE(!moved && !err);
return moved ? moved : err;
}
@@ -1956,10 +1954,10 @@ int userfaultfd_register_range(struct userfaultfd_ctx *ctx,
for_each_vma_range(vmi, vma, end) {
cond_resched();
- BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async));
- BUG_ON(vma->vm_userfaultfd_ctx.ctx &&
- vma->vm_userfaultfd_ctx.ctx != ctx);
- WARN_ON(!(vma->vm_flags & VM_MAYWRITE));
+ VM_WARN_ON_ONCE(!vma_can_userfault(vma, vm_flags, wp_async));
+ VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx &&
+ vma->vm_userfaultfd_ctx.ctx != ctx);
+ VM_WARN_ON_ONCE(!(vma->vm_flags & VM_MAYWRITE));
/*
* Nothing to do: this vma is already registered into this
@@ -2035,8 +2033,8 @@ void userfaultfd_release_all(struct mm_struct *mm,
prev = NULL;
for_each_vma(vmi, vma) {
cond_resched();
- BUG_ON(!!vma->vm_userfaultfd_ctx.ctx ^
- !!(vma->vm_flags & __VM_UFFD_FLAGS));
+ VM_WARN_ON_ONCE(!!vma->vm_userfaultfd_ctx.ctx ^
+ !!(vma->vm_flags & __VM_UFFD_FLAGS));
if (vma->vm_userfaultfd_ctx.ctx != ctx) {
prev = vma;
continue;
--
2.39.5
^ permalink raw reply related [flat|nested] 5+ messages in thread