* [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups @ 2025-06-07 6:39 Tal Zussman 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman ` (3 more replies) 0 siblings, 4 replies; 15+ messages in thread From: Tal Zussman @ 2025-06-07 6:39 UTC (permalink / raw) To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman Two fixes and two cleanups for userfaultfd. I added a patch converting BUG_ON()s in the userfaultfd code to VM_WARN_ON_ONCE() this time around. Note that the third patch yields a small change in the ABI, but we seem to have concluded that that's acceptable in this case. --- Changes in v2: - Remove Pavel Emelyanov <xemul@parallels.com> from To: due to bouncing email. - Propagate tags. (David, Peter) - Add a patch converting userfaultfd BUG_ON()s to VM_WARN_ON_ONCE(). - Move the "different uffd" check in Patch 3 (prev. Patch 2) before the vma_can_userfault() check due to the wp_async bug, as per James. - Change the added BUG_ON() in Patch 3 to a VM_WARN_ON_ONCE, as per James and David. - Reorder the assertions in Patch 3 to simplify them and avoid the wp_async bug, as per James. - Update the Patch 3 commit message to include more details, as per Peter. - Link to v1: https://lore.kernel.org/r/20250603-uffd-fixes-v1-0-9c638c73f047@columbia.edu --- Tal Zussman (4): userfaultfd: correctly prevent registering VM_DROPPABLE regions userfaultfd: remove (VM_)BUG_ON()s userfaultfd: prevent unregistering VMAs through a different userfaultfd userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET fs/userfaultfd.c | 76 +++++++++++++++++++++++-------------------- include/linux/userfaultfd_k.h | 6 +--- mm/userfaultfd.c | 66 ++++++++++++++++++------------------- 3 files changed, 74 insertions(+), 74 deletions(-) --- base-commit: 546b1c9e93c2bb8cf5ed24e0be1c86bb089b3253 change-id: 20250531-uffd-fixes-142331b15e63 Best regards, -- Tal Zussman <tz2294@columbia.edu> ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions 2025-06-07 6:39 [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman @ 2025-06-07 6:40 ` Tal Zussman 2025-06-07 14:40 ` Jason A. Donenfeld 2025-06-07 22:04 ` Andrew Morton 2025-06-07 6:40 ` [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman ` (2 subsequent siblings) 3 siblings, 2 replies; 15+ messages in thread From: Tal Zussman @ 2025-06-07 6:40 UTC (permalink / raw) To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman vma_can_userfault() masks off non-userfaultfd VM flags from vm_flags. The vm_flags & VM_DROPPABLE test will then always be false, incorrectly allowing VM_DROPPABLE regions to be registered with userfaultfd. Additionally, vm_flags is not guaranteed to correspond to the actual VMA's flags. Fix this test by checking the VMA's flags directly. Link: https://lore.kernel.org/linux-mm/5a875a3a-2243-4eab-856f-bc53ccfec3ea@redhat.com/ Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Tal Zussman <tz2294@columbia.edu> --- include/linux/userfaultfd_k.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index 75342022d144..f3b3d2c9dd5e 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -218,7 +218,7 @@ static inline bool vma_can_userfault(struct vm_area_struct *vma, { vm_flags &= __VM_UFFD_FLAGS; - if (vm_flags & VM_DROPPABLE) + if (vma->vm_flags & VM_DROPPABLE) return false; if ((vm_flags & VM_UFFD_MINOR) && -- 2.39.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman @ 2025-06-07 14:40 ` Jason A. Donenfeld 2025-06-07 22:04 ` Andrew Morton 1 sibling, 0 replies; 15+ messages in thread From: Jason A. Donenfeld @ 2025-06-07 14:40 UTC (permalink / raw) To: Tal Zussman Cc: Andrew Morton, Peter Xu, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Sat, Jun 07, 2025 at 02:40:00AM -0400, Tal Zussman wrote: > vma_can_userfault() masks off non-userfaultfd VM flags from vm_flags. > The vm_flags & VM_DROPPABLE test will then always be false, incorrectly > allowing VM_DROPPABLE regions to be registered with userfaultfd. > > Additionally, vm_flags is not guaranteed to correspond to the actual > VMA's flags. Fix this test by checking the VMA's flags directly. > > Link: https://lore.kernel.org/linux-mm/5a875a3a-2243-4eab-856f-bc53ccfec3ea@redhat.com/ > Fixes: 9651fcedf7b9 ("mm: add MAP_DROPPABLE for designating always lazily freeable mappings") > Acked-by: David Hildenbrand <david@redhat.com> > Acked-by: Peter Xu <peterx@redhat.com> > Signed-off-by: Tal Zussman <tz2294@columbia.edu> Nice catch and thanks for fixing this. Acked-by: Jason A. Donenfeld <Jason@zx2c4.com> ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman 2025-06-07 14:40 ` Jason A. Donenfeld @ 2025-06-07 22:04 ` Andrew Morton 2025-06-09 15:02 ` Peter Xu 1 sibling, 1 reply; 15+ messages in thread From: Andrew Morton @ 2025-06-07 22:04 UTC (permalink / raw) To: Tal Zussman Cc: Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Sat, 07 Jun 2025 02:40:00 -0400 Tal Zussman <tz2294@columbia.edu> wrote: > vma_can_userfault() masks off non-userfaultfd VM flags from vm_flags. > The vm_flags & VM_DROPPABLE test will then always be false, incorrectly > allowing VM_DROPPABLE regions to be registered with userfaultfd. > > Additionally, vm_flags is not guaranteed to correspond to the actual > VMA's flags. Fix this test by checking the VMA's flags directly. Wondering if we should backport this. afaict we don't know the userspace impact of this because nobody has tried it! ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions 2025-06-07 22:04 ` Andrew Morton @ 2025-06-09 15:02 ` Peter Xu 0 siblings, 0 replies; 15+ messages in thread From: Peter Xu @ 2025-06-09 15:02 UTC (permalink / raw) To: Andrew Morton Cc: Tal Zussman, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Sat, Jun 07, 2025 at 03:04:38PM -0700, Andrew Morton wrote: > On Sat, 07 Jun 2025 02:40:00 -0400 Tal Zussman <tz2294@columbia.edu> wrote: > > > vma_can_userfault() masks off non-userfaultfd VM flags from vm_flags. > > The vm_flags & VM_DROPPABLE test will then always be false, incorrectly > > allowing VM_DROPPABLE regions to be registered with userfaultfd. > > > > Additionally, vm_flags is not guaranteed to correspond to the actual > > VMA's flags. Fix this test by checking the VMA's flags directly. > > Wondering if we should backport this. afaict we don't know the > userspace impact of this because nobody has tried it! Yes that's fair question. Per my limited understanding of MAP_DROPPABLE (even if as a generic flag), I'd be surprised if someone tries to enable userfaultfd on it, being succeeded or not.. or requiring that to properly fail on any stable branches. AFAIU that's the only possible effect we can expect from a backport. IMHO for this case we can avoid backporting until anyone requested with an explicit use case. Thanks, -- Peter Xu ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-07 6:39 [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman @ 2025-06-07 6:40 ` Tal Zussman 2025-06-10 7:26 ` David Hildenbrand 2025-06-10 13:11 ` Peter Xu 2025-06-07 6:40 ` [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd Tal Zussman 2025-06-07 6:40 ` [PATCH v2 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET Tal Zussman 3 siblings, 2 replies; 15+ messages in thread From: Tal Zussman @ 2025-06-07 6:40 UTC (permalink / raw) To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman BUG_ON() is deprecated [1]. Convert all the BUG_ON()s and VM_BUG_ON()s to use VM_WARN_ON_ONCE(). While at it, also convert the WARN_ON_ONCE()s in move_pages() to use VM_WARN_ON_ONCE(), as the relevant conditions are already checked in validate_range() in move_pages()'s caller. [1] https://www.kernel.org/doc/html/v6.15/process/coding-style.html#use-warn-rather-than-bug Signed-off-by: Tal Zussman <tz2294@columbia.edu> --- fs/userfaultfd.c | 59 +++++++++++++++++++++++++------------------------- mm/userfaultfd.c | 66 +++++++++++++++++++++++++++----------------------------- 2 files changed, 61 insertions(+), 64 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 22f4bf956ba1..80c95c712266 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -165,14 +165,14 @@ static void userfaultfd_ctx_get(struct userfaultfd_ctx *ctx) static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx) { if (refcount_dec_and_test(&ctx->refcount)) { - VM_BUG_ON(spin_is_locked(&ctx->fault_pending_wqh.lock)); - VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh)); - VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock)); - VM_BUG_ON(waitqueue_active(&ctx->fault_wqh)); - VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock)); - VM_BUG_ON(waitqueue_active(&ctx->event_wqh)); - VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock)); - VM_BUG_ON(waitqueue_active(&ctx->fd_wqh)); + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_pending_wqh.lock)); + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_pending_wqh)); + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_wqh.lock)); + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_wqh)); + VM_WARN_ON_ONCE(spin_is_locked(&ctx->event_wqh.lock)); + VM_WARN_ON_ONCE(waitqueue_active(&ctx->event_wqh)); + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fd_wqh.lock)); + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fd_wqh)); mmdrop(ctx->mm); kmem_cache_free(userfaultfd_ctx_cachep, ctx); } @@ -383,12 +383,12 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) if (!ctx) goto out; - BUG_ON(ctx->mm != mm); + VM_WARN_ON_ONCE(ctx->mm != mm); /* Any unrecognized flag is a bug. */ - VM_BUG_ON(reason & ~__VM_UFFD_FLAGS); + VM_WARN_ON_ONCE(reason & ~__VM_UFFD_FLAGS); /* 0 or > 1 flags set is a bug; we expect exactly 1. */ - VM_BUG_ON(!reason || (reason & (reason - 1))); + VM_WARN_ON_ONCE(!reason || (reason & (reason - 1))); if (ctx->features & UFFD_FEATURE_SIGBUS) goto out; @@ -411,12 +411,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) * to be sure not to return SIGBUS erroneously on * nowait invocations. */ - BUG_ON(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); + VM_WARN_ON_ONCE(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); #ifdef CONFIG_DEBUG_VM if (printk_ratelimit()) { - printk(KERN_WARNING - "FAULT_FLAG_ALLOW_RETRY missing %x\n", - vmf->flags); + pr_warn("FAULT_FLAG_ALLOW_RETRY missing %x\n", + vmf->flags); dump_stack(); } #endif @@ -602,7 +601,7 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx, */ out: atomic_dec(&ctx->mmap_changing); - VM_BUG_ON(atomic_read(&ctx->mmap_changing) < 0); + VM_WARN_ON_ONCE(atomic_read(&ctx->mmap_changing) < 0); userfaultfd_ctx_put(ctx); } @@ -710,7 +709,7 @@ void dup_userfaultfd_fail(struct list_head *fcs) struct userfaultfd_ctx *ctx = fctx->new; atomic_dec(&octx->mmap_changing); - VM_BUG_ON(atomic_read(&octx->mmap_changing) < 0); + VM_WARN_ON_ONCE(atomic_read(&octx->mmap_changing) < 0); userfaultfd_ctx_put(octx); userfaultfd_ctx_put(ctx); @@ -1317,8 +1316,8 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, do { cond_resched(); - BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^ - !!(cur->vm_flags & __VM_UFFD_FLAGS)); + VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^ + !!(cur->vm_flags & __VM_UFFD_FLAGS)); /* check not compatible vmas */ ret = -EINVAL; @@ -1372,7 +1371,7 @@ static int userfaultfd_register(struct userfaultfd_ctx *ctx, found = true; } for_each_vma_range(vmi, cur, end); - BUG_ON(!found); + VM_WARN_ON_ONCE(!found); ret = userfaultfd_register_range(ctx, vma, vm_flags, start, end, wp_async); @@ -1464,8 +1463,8 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, do { cond_resched(); - BUG_ON(!!cur->vm_userfaultfd_ctx.ctx ^ - !!(cur->vm_flags & __VM_UFFD_FLAGS)); + VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^ + !!(cur->vm_flags & __VM_UFFD_FLAGS)); /* * Check not compatible vmas, not strictly required @@ -1479,7 +1478,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, found = true; } for_each_vma_range(vmi, cur, end); - BUG_ON(!found); + VM_WARN_ON_ONCE(!found); vma_iter_set(&vmi, start); prev = vma_prev(&vmi); @@ -1490,7 +1489,7 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, for_each_vma_range(vmi, vma, end) { cond_resched(); - BUG_ON(!vma_can_userfault(vma, vma->vm_flags, wp_async)); + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); /* * Nothing to do: this vma is already registered into this @@ -1564,7 +1563,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx, * len == 0 means wake all and we don't want to wake all here, * so check it again to be sure. */ - VM_BUG_ON(!range.len); + VM_WARN_ON_ONCE(!range.len); wake_userfault(ctx, &range); ret = 0; @@ -1621,7 +1620,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, return -EFAULT; if (ret < 0) goto out; - BUG_ON(!ret); + VM_WARN_ON_ONCE(!ret); /* len == 0 would wake all */ range.len = ret; if (!(uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE)) { @@ -1676,7 +1675,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx, if (ret < 0) goto out; /* len == 0 would wake all */ - BUG_ON(!ret); + VM_WARN_ON_ONCE(!ret); range.len = ret; if (!(uffdio_zeropage.mode & UFFDIO_ZEROPAGE_MODE_DONTWAKE)) { range.start = uffdio_zeropage.range.start; @@ -1788,7 +1787,7 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg) goto out; /* len == 0 would wake all */ - BUG_ON(!ret); + VM_WARN_ON_ONCE(!ret); range.len = ret; if (!(uffdio_continue.mode & UFFDIO_CONTINUE_MODE_DONTWAKE)) { range.start = uffdio_continue.range.start; @@ -1845,7 +1844,7 @@ static inline int userfaultfd_poison(struct userfaultfd_ctx *ctx, unsigned long goto out; /* len == 0 would wake all */ - BUG_ON(!ret); + VM_WARN_ON_ONCE(!ret); range.len = ret; if (!(uffdio_poison.mode & UFFDIO_POISON_MODE_DONTWAKE)) { range.start = uffdio_poison.range.start; @@ -2106,7 +2105,7 @@ static int new_userfaultfd(int flags) struct file *file; int fd; - BUG_ON(!current->mm); + VM_WARN_ON_ONCE(!current->mm); /* Check the UFFD_* constants for consistency. */ BUILD_BUG_ON(UFFD_USER_MODE_ONLY & UFFD_SHARED_FCNTL_FLAGS); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index bc473ad21202..41e67ded5a6e 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -561,7 +561,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb( } while (src_addr < src_start + len) { - BUG_ON(dst_addr >= dst_start + len); + VM_WARN_ON_ONCE(dst_addr >= dst_start + len); /* * Serialize via vma_lock and hugetlb_fault_mutex. @@ -602,7 +602,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb( if (unlikely(err == -ENOENT)) { up_read(&ctx->map_changing_lock); uffd_mfill_unlock(dst_vma); - BUG_ON(!folio); + VM_WARN_ON_ONCE(!folio); err = copy_folio_from_user(folio, (const void __user *)src_addr, true); @@ -614,7 +614,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb( dst_vma = NULL; goto retry; } else - BUG_ON(folio); + VM_WARN_ON_ONCE(folio); if (!err) { dst_addr += vma_hpagesize; @@ -635,9 +635,9 @@ static __always_inline ssize_t mfill_atomic_hugetlb( out: if (folio) folio_put(folio); - BUG_ON(copied < 0); - BUG_ON(err > 0); - BUG_ON(!copied && !err); + VM_WARN_ON_ONCE(copied < 0); + VM_WARN_ON_ONCE(err > 0); + VM_WARN_ON_ONCE(!copied && !err); return copied ? copied : err; } #else /* !CONFIG_HUGETLB_PAGE */ @@ -711,12 +711,12 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, /* * Sanitize the command parameters: */ - BUG_ON(dst_start & ~PAGE_MASK); - BUG_ON(len & ~PAGE_MASK); + VM_WARN_ON_ONCE(dst_start & ~PAGE_MASK); + VM_WARN_ON_ONCE(len & ~PAGE_MASK); /* Does the address range wrap, or is the span zero-sized? */ - BUG_ON(src_start + len <= src_start); - BUG_ON(dst_start + len <= dst_start); + VM_WARN_ON_ONCE(src_start + len <= src_start); + VM_WARN_ON_ONCE(dst_start + len <= dst_start); src_addr = src_start; dst_addr = dst_start; @@ -775,7 +775,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, while (src_addr < src_start + len) { pmd_t dst_pmdval; - BUG_ON(dst_addr >= dst_start + len); + VM_WARN_ON_ONCE(dst_addr >= dst_start + len); dst_pmd = mm_alloc_pmd(dst_mm, dst_addr); if (unlikely(!dst_pmd)) { @@ -818,7 +818,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, up_read(&ctx->map_changing_lock); uffd_mfill_unlock(dst_vma); - BUG_ON(!folio); + VM_WARN_ON_ONCE(!folio); kaddr = kmap_local_folio(folio, 0); err = copy_from_user(kaddr, @@ -832,7 +832,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, flush_dcache_folio(folio); goto retry; } else - BUG_ON(folio); + VM_WARN_ON_ONCE(folio); if (!err) { dst_addr += PAGE_SIZE; @@ -852,9 +852,9 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx, out: if (folio) folio_put(folio); - BUG_ON(copied < 0); - BUG_ON(err > 0); - BUG_ON(!copied && !err); + VM_WARN_ON_ONCE(copied < 0); + VM_WARN_ON_ONCE(err > 0); + VM_WARN_ON_ONCE(!copied && !err); return copied ? copied : err; } @@ -940,11 +940,11 @@ int mwriteprotect_range(struct userfaultfd_ctx *ctx, unsigned long start, /* * Sanitize the command parameters: */ - BUG_ON(start & ~PAGE_MASK); - BUG_ON(len & ~PAGE_MASK); + VM_WARN_ON_ONCE(start & ~PAGE_MASK); + VM_WARN_ON_ONCE(len & ~PAGE_MASK); /* Does the address range wrap, or is the span zero-sized? */ - BUG_ON(start + len <= start); + VM_WARN_ON_ONCE(start + len <= start); mmap_read_lock(dst_mm); @@ -1709,15 +1709,13 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, ssize_t moved = 0; /* Sanitize the command parameters. */ - if (WARN_ON_ONCE(src_start & ~PAGE_MASK) || - WARN_ON_ONCE(dst_start & ~PAGE_MASK) || - WARN_ON_ONCE(len & ~PAGE_MASK)) - goto out; + VM_WARN_ON_ONCE(src_start & ~PAGE_MASK); + VM_WARN_ON_ONCE(dst_start & ~PAGE_MASK); + VM_WARN_ON_ONCE(len & ~PAGE_MASK); /* Does the address range wrap, or is the span zero-sized? */ - if (WARN_ON_ONCE(src_start + len <= src_start) || - WARN_ON_ONCE(dst_start + len <= dst_start)) - goto out; + VM_WARN_ON_ONCE(src_start + len < src_start); + VM_WARN_ON_ONCE(dst_start + len < dst_start); err = uffd_move_lock(mm, dst_start, src_start, &dst_vma, &src_vma); if (err) @@ -1867,9 +1865,9 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, up_read(&ctx->map_changing_lock); uffd_move_unlock(dst_vma, src_vma); out: - VM_WARN_ON(moved < 0); - VM_WARN_ON(err > 0); - VM_WARN_ON(!moved && !err); + VM_WARN_ON_ONCE(moved < 0); + VM_WARN_ON_ONCE(err > 0); + VM_WARN_ON_ONCE(!moved && !err); return moved ? moved : err; } @@ -1956,9 +1954,9 @@ int userfaultfd_register_range(struct userfaultfd_ctx *ctx, for_each_vma_range(vmi, vma, end) { cond_resched(); - BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async)); - BUG_ON(vma->vm_userfaultfd_ctx.ctx && - vma->vm_userfaultfd_ctx.ctx != ctx); + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vm_flags, wp_async)); + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx && + vma->vm_userfaultfd_ctx.ctx != ctx); WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); /* @@ -2035,8 +2033,8 @@ void userfaultfd_release_all(struct mm_struct *mm, prev = NULL; for_each_vma(vmi, vma) { cond_resched(); - BUG_ON(!!vma->vm_userfaultfd_ctx.ctx ^ - !!(vma->vm_flags & __VM_UFFD_FLAGS)); + VM_WARN_ON_ONCE(!!vma->vm_userfaultfd_ctx.ctx ^ + !!(vma->vm_flags & __VM_UFFD_FLAGS)); if (vma->vm_userfaultfd_ctx.ctx != ctx) { prev = vma; continue; -- 2.39.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-07 6:40 ` [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman @ 2025-06-10 7:26 ` David Hildenbrand 2025-06-17 23:10 ` Tal Zussman 2025-06-10 13:11 ` Peter Xu 1 sibling, 1 reply; 15+ messages in thread From: David Hildenbrand @ 2025-06-10 7:26 UTC (permalink / raw) To: Tal Zussman, Andrew Morton, Peter Xu, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel On 07.06.25 08:40, Tal Zussman wrote: > BUG_ON() is deprecated [1]. Convert all the BUG_ON()s and VM_BUG_ON()s > to use VM_WARN_ON_ONCE(). > > While at it, also convert the WARN_ON_ONCE()s in move_pages() to use > VM_WARN_ON_ONCE(), as the relevant conditions are already checked in > validate_range() in move_pages()'s caller. > > [1] https://www.kernel.org/doc/html/v6.15/process/coding-style.html#use-warn-rather-than-bug > > Signed-off-by: Tal Zussman <tz2294@columbia.edu> > --- [...] > if (ctx->features & UFFD_FEATURE_SIGBUS) > goto out; > @@ -411,12 +411,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) > * to be sure not to return SIGBUS erroneously on > * nowait invocations. > */ > - BUG_ON(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); > + VM_WARN_ON_ONCE(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); > #ifdef CONFIG_DEBUG_VM > if (printk_ratelimit()) { > - printk(KERN_WARNING > - "FAULT_FLAG_ALLOW_RETRY missing %x\n", > - vmf->flags); > + pr_warn("FAULT_FLAG_ALLOW_RETRY missing %x\n", > + vmf->flags); You didn't cover that in the patch description. I do wonder if we really still want the dump_stack() here and could simplify to pr_warn_ratelimited(). But I recall that the stack was helpful at least once for me (well, I was able to reproduce and could have figured it out differently.). [...] > err = uffd_move_lock(mm, dst_start, src_start, &dst_vma, &src_vma); > if (err) > @@ -1867,9 +1865,9 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, > up_read(&ctx->map_changing_lock); > uffd_move_unlock(dst_vma, src_vma); > out: > - VM_WARN_ON(moved < 0); > - VM_WARN_ON(err > 0); > - VM_WARN_ON(!moved && !err); > + VM_WARN_ON_ONCE(moved < 0); > + VM_WARN_ON_ONCE(err > 0); > + VM_WARN_ON_ONCE(!moved && !err); > return moved ? moved : err; Here you convert VM_WARN_ON to VM_WARN_ON_ONCE without stating it in the description (including the why). > @@ -1956,9 +1954,9 @@ int userfaultfd_register_range(struct userfaultfd_ctx *ctx, > for_each_vma_range(vmi, vma, end) { > cond_resched(); > > - BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async)); > - BUG_ON(vma->vm_userfaultfd_ctx.ctx && > - vma->vm_userfaultfd_ctx.ctx != ctx); > + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vm_flags, wp_async)); > + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx && > + vma->vm_userfaultfd_ctx.ctx != ctx); > WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); Which raises the question, why this here should still be a WARN -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-10 7:26 ` David Hildenbrand @ 2025-06-17 23:10 ` Tal Zussman 2025-06-23 15:18 ` David Hildenbrand 0 siblings, 1 reply; 15+ messages in thread From: Tal Zussman @ 2025-06-17 23:10 UTC (permalink / raw) To: David Hildenbrand Cc: Andrew Morton, Peter Xu, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Tue, Jun 10, 2025 at 3:26 AM David Hildenbrand <david@redhat.com> wrote: > > On 07.06.25 08:40, Tal Zussman wrote: > > > > if (ctx->features & UFFD_FEATURE_SIGBUS) > > goto out; > > @@ -411,12 +411,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) > > * to be sure not to return SIGBUS erroneously on > > * nowait invocations. > > */ > > - BUG_ON(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); > > + VM_WARN_ON_ONCE(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); > > #ifdef CONFIG_DEBUG_VM > > if (printk_ratelimit()) { > > - printk(KERN_WARNING > > - "FAULT_FLAG_ALLOW_RETRY missing %x\n", > > - vmf->flags); > > + pr_warn("FAULT_FLAG_ALLOW_RETRY missing %x\n", > > + vmf->flags); > > You didn't cover that in the patch description. > > I do wonder if we really still want the dump_stack() here and could > simplify to > > pr_warn_ratelimited(). > > But I recall that the stack was helpful at least once for me (well, I > was able to reproduce and could have figured it out differently.). I'll include this in the description as well. Given that you found the stack helpful before, I'll leave it as-is for now. > [...] > > > err = uffd_move_lock(mm, dst_start, src_start, &dst_vma, &src_vma); > > if (err) > > @@ -1867,9 +1865,9 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, > > up_read(&ctx->map_changing_lock); > > uffd_move_unlock(dst_vma, src_vma); > > out: > > - VM_WARN_ON(moved < 0); > > - VM_WARN_ON(err > 0); > > - VM_WARN_ON(!moved && !err); > > + VM_WARN_ON_ONCE(moved < 0); > > + VM_WARN_ON_ONCE(err > 0); > > + VM_WARN_ON_ONCE(!moved && !err); > > return moved ? moved : err; > > > Here you convert VM_WARN_ON to VM_WARN_ON_ONCE without stating it in the > description (including the why). Will update in the description. These checks represent impossible conditions and are analogous to the BUG_ON()s in move_pages(), but were added later. So instead of BUG_ON(), they use VM_WARN_ON() as a replacement when VM_WARN_ON_ONCE() is likely a better fit, as per other conversions. > > @@ -1956,9 +1954,9 @@ int userfaultfd_register_range(struct userfaultfd_ctx *ctx, > > for_each_vma_range(vmi, vma, end) { > > cond_resched(); > > > > - BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async)); > > - BUG_ON(vma->vm_userfaultfd_ctx.ctx && > > - vma->vm_userfaultfd_ctx.ctx != ctx); > > + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vm_flags, wp_async)); > > + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx && > > + vma->vm_userfaultfd_ctx.ctx != ctx); > > WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); > > Which raises the question, why this here should still be a WARN After checking it, it looks like the relevant condition is enforced in the registration path, so this can be converted to a debug check. I'll update the patch accordingly. However, I think the checks in userfaultfd_register() may be redundant. First, it checks !(cur->vm_flags & VM_MAYWRITE). Then, after a hugetlb check, it checks ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE)), which should never be hit. Am I missing something? Additionally, the comment above the first !(cur->vm_flags & VM_MAYWRITE) check is a bit confusing. It seems to be based on the fact that file-backed MAP_SHARED mappings without write permissions will not have VM_MAYWRITE set, while MAP_PRIVATE mappings will always(?) have it set, but doesn't say it as explicitly. Am I understanding this check correctly? Thanks for the review! > -- > Cheers, > > David / dhildenb > ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-17 23:10 ` Tal Zussman @ 2025-06-23 15:18 ` David Hildenbrand 0 siblings, 0 replies; 15+ messages in thread From: David Hildenbrand @ 2025-06-23 15:18 UTC (permalink / raw) To: Tal Zussman Cc: Andrew Morton, Peter Xu, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On 18.06.25 01:10, Tal Zussman wrote: > On Tue, Jun 10, 2025 at 3:26 AM David Hildenbrand <david@redhat.com> wrote: >> >> On 07.06.25 08:40, Tal Zussman wrote: >>> >>> if (ctx->features & UFFD_FEATURE_SIGBUS) >>> goto out; >>> @@ -411,12 +411,11 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason) >>> * to be sure not to return SIGBUS erroneously on >>> * nowait invocations. >>> */ >>> - BUG_ON(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); >>> + VM_WARN_ON_ONCE(vmf->flags & FAULT_FLAG_RETRY_NOWAIT); >>> #ifdef CONFIG_DEBUG_VM >>> if (printk_ratelimit()) { >>> - printk(KERN_WARNING >>> - "FAULT_FLAG_ALLOW_RETRY missing %x\n", >>> - vmf->flags); >>> + pr_warn("FAULT_FLAG_ALLOW_RETRY missing %x\n", >>> + vmf->flags); >> >> You didn't cover that in the patch description. >> >> I do wonder if we really still want the dump_stack() here and could >> simplify to >> >> pr_warn_ratelimited(). >> >> But I recall that the stack was helpful at least once for me (well, I >> was able to reproduce and could have figured it out differently.). > > I'll include this in the description as well. Given that you found the stack > helpful before, I'll leave it as-is for now. > >> [...] >> >>> err = uffd_move_lock(mm, dst_start, src_start, &dst_vma, &src_vma); >>> if (err) >>> @@ -1867,9 +1865,9 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start, >>> up_read(&ctx->map_changing_lock); >>> uffd_move_unlock(dst_vma, src_vma); >>> out: >>> - VM_WARN_ON(moved < 0); >>> - VM_WARN_ON(err > 0); >>> - VM_WARN_ON(!moved && !err); >>> + VM_WARN_ON_ONCE(moved < 0); >>> + VM_WARN_ON_ONCE(err > 0); >>> + VM_WARN_ON_ONCE(!moved && !err); >>> return moved ? moved : err; >> >> >> Here you convert VM_WARN_ON to VM_WARN_ON_ONCE without stating it in the >> description (including the why). > > Will update in the description. These checks represent impossible conditions > and are analogous to the BUG_ON()s in move_pages(), but were added later. > So instead of BUG_ON(), they use VM_WARN_ON() as a replacement when > VM_WARN_ON_ONCE() is likely a better fit, as per other conversions. > >>> @@ -1956,9 +1954,9 @@ int userfaultfd_register_range(struct userfaultfd_ctx *ctx, >>> for_each_vma_range(vmi, vma, end) { >>> cond_resched(); >>> >>> - BUG_ON(!vma_can_userfault(vma, vm_flags, wp_async)); >>> - BUG_ON(vma->vm_userfaultfd_ctx.ctx && >>> - vma->vm_userfaultfd_ctx.ctx != ctx); >>> + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vm_flags, wp_async)); >>> + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx && >>> + vma->vm_userfaultfd_ctx.ctx != ctx); >>> WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); >> >> Which raises the question, why this here should still be a WARN > > After checking it, it looks like the relevant condition is enforced in the > registration path, so this can be converted to a debug check. I'll update > the patch accordingly. Sorry for the late reply. Yeah, a lot of that probably needs a cleanup. > > However, I think the checks in userfaultfd_register() may be redundant. > First, it checks !(cur->vm_flags & VM_MAYWRITE). Then, after a hugetlb > check, it checks > ((vm_flags & VM_UFFD_WP) && !(cur->vm_flags & VM_MAYWRITE)), which should > never be hit. Yes, likely. > > Am I missing something? > > Additionally, the comment above the first !(cur->vm_flags & VM_MAYWRITE) > check is a bit confusing. It seems to be based on the fact that file-backed > MAP_SHARED mappings without write permissions will not have VM_MAYWRITE set, > while MAP_PRIVATE mappings will always(?) have it set, but doesn't say it as > explicitly. Am I understanding this check correctly? private anon mappings should always have VM_MAYWRITE. (there is no driver to really change that). For other mappings (MAP_SHARED), VM_MAYWRITE is cleared for R/O files I think. VM_MAYWRITE really just tells you that you can do mprotect(PROT_WRITE) later, to effectively set VM_WRITE. So if VM_MAYWRITE is clear, we cannot possibly write to such mappings later (not even after mprotect()). -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-07 6:40 ` [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman 2025-06-10 7:26 ` David Hildenbrand @ 2025-06-10 13:11 ` Peter Xu 2025-06-10 13:17 ` David Hildenbrand 1 sibling, 1 reply; 15+ messages in thread From: Peter Xu @ 2025-06-10 13:11 UTC (permalink / raw) To: Tal Zussman Cc: Andrew Morton, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Sat, Jun 07, 2025 at 02:40:01AM -0400, Tal Zussman wrote: > BUG_ON() is deprecated [1]. Convert all the BUG_ON()s and VM_BUG_ON()s > to use VM_WARN_ON_ONCE(). > > While at it, also convert the WARN_ON_ONCE()s in move_pages() to use > VM_WARN_ON_ONCE(), as the relevant conditions are already checked in > validate_range() in move_pages()'s caller. > > [1] https://www.kernel.org/doc/html/v6.15/process/coding-style.html#use-warn-rather-than-bug > > Signed-off-by: Tal Zussman <tz2294@columbia.edu> > --- > fs/userfaultfd.c | 59 +++++++++++++++++++++++++------------------------- > mm/userfaultfd.c | 66 +++++++++++++++++++++++++++----------------------------- > 2 files changed, 61 insertions(+), 64 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 22f4bf956ba1..80c95c712266 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -165,14 +165,14 @@ static void userfaultfd_ctx_get(struct userfaultfd_ctx *ctx) > static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx) > { > if (refcount_dec_and_test(&ctx->refcount)) { > - VM_BUG_ON(spin_is_locked(&ctx->fault_pending_wqh.lock)); > - VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh)); > - VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock)); > - VM_BUG_ON(waitqueue_active(&ctx->fault_wqh)); > - VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock)); > - VM_BUG_ON(waitqueue_active(&ctx->event_wqh)); > - VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock)); > - VM_BUG_ON(waitqueue_active(&ctx->fd_wqh)); > + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_pending_wqh.lock)); > + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_pending_wqh)); > + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_wqh.lock)); > + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_wqh)); > + VM_WARN_ON_ONCE(spin_is_locked(&ctx->event_wqh.lock)); > + VM_WARN_ON_ONCE(waitqueue_active(&ctx->event_wqh)); > + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fd_wqh.lock)); > + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fd_wqh)); > mmdrop(ctx->mm); > kmem_cache_free(userfaultfd_ctx_cachep, ctx); I didn't follow closely on the latest discussions on BUG_ON, but here I just stumbled on top of this chunk, it does look like a slight overkill using tons of bools for each of them.. even if the doc suggested WARN_ON_ONCE(). David might have a better picture of what's our plan for mm to properly assert while reducing the overhead as much as possible. For this specific one, if we really want to convert we could also merge them into one, so one bool to cover all. -- Peter Xu ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s 2025-06-10 13:11 ` Peter Xu @ 2025-06-10 13:17 ` David Hildenbrand 0 siblings, 0 replies; 15+ messages in thread From: David Hildenbrand @ 2025-06-10 13:17 UTC (permalink / raw) To: Peter Xu, Tal Zussman Cc: Andrew Morton, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On 10.06.25 15:11, Peter Xu wrote: > On Sat, Jun 07, 2025 at 02:40:01AM -0400, Tal Zussman wrote: >> BUG_ON() is deprecated [1]. Convert all the BUG_ON()s and VM_BUG_ON()s >> to use VM_WARN_ON_ONCE(). >> >> While at it, also convert the WARN_ON_ONCE()s in move_pages() to use >> VM_WARN_ON_ONCE(), as the relevant conditions are already checked in >> validate_range() in move_pages()'s caller. >> >> [1] https://www.kernel.org/doc/html/v6.15/process/coding-style.html#use-warn-rather-than-bug >> >> Signed-off-by: Tal Zussman <tz2294@columbia.edu> >> --- >> fs/userfaultfd.c | 59 +++++++++++++++++++++++++------------------------- >> mm/userfaultfd.c | 66 +++++++++++++++++++++++++++----------------------------- >> 2 files changed, 61 insertions(+), 64 deletions(-) >> >> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c >> index 22f4bf956ba1..80c95c712266 100644 >> --- a/fs/userfaultfd.c >> +++ b/fs/userfaultfd.c >> @@ -165,14 +165,14 @@ static void userfaultfd_ctx_get(struct userfaultfd_ctx *ctx) >> static void userfaultfd_ctx_put(struct userfaultfd_ctx *ctx) >> { >> if (refcount_dec_and_test(&ctx->refcount)) { >> - VM_BUG_ON(spin_is_locked(&ctx->fault_pending_wqh.lock)); >> - VM_BUG_ON(waitqueue_active(&ctx->fault_pending_wqh)); >> - VM_BUG_ON(spin_is_locked(&ctx->fault_wqh.lock)); >> - VM_BUG_ON(waitqueue_active(&ctx->fault_wqh)); >> - VM_BUG_ON(spin_is_locked(&ctx->event_wqh.lock)); >> - VM_BUG_ON(waitqueue_active(&ctx->event_wqh)); >> - VM_BUG_ON(spin_is_locked(&ctx->fd_wqh.lock)); >> - VM_BUG_ON(waitqueue_active(&ctx->fd_wqh)); >> + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_pending_wqh.lock)); >> + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_pending_wqh)); >> + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fault_wqh.lock)); >> + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fault_wqh)); >> + VM_WARN_ON_ONCE(spin_is_locked(&ctx->event_wqh.lock)); >> + VM_WARN_ON_ONCE(waitqueue_active(&ctx->event_wqh)); >> + VM_WARN_ON_ONCE(spin_is_locked(&ctx->fd_wqh.lock)); >> + VM_WARN_ON_ONCE(waitqueue_active(&ctx->fd_wqh)); >> mmdrop(ctx->mm); >> kmem_cache_free(userfaultfd_ctx_cachep, ctx); > > I didn't follow closely on the latest discussions on BUG_ON, but here I > just stumbled on top of this chunk, it does look like a slight overkill > using tons of bools for each of them.. even if the doc suggested > WARN_ON_ONCE(). > > David might have a better picture of what's our plan for mm to properly > assert while reducing the overhead as much as possible. There is currently still a discussion whether VM_WARN_ON an VM_WARN_ON_ONCE could be unified. In a CONFIG_DEBUG_VM kernel, the overhead of a couple of booleans is usually the least concern (everything is big and slow already) :) > > For this specific one, if we really want to convert we could also merge > them into one, so one bool to cover all. One loses precision, but yeah, they are supposed to be found during early testing, in which case one can usually reproduce + debug fairly easily. -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd 2025-06-07 6:39 [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman 2025-06-07 6:40 ` [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman @ 2025-06-07 6:40 ` Tal Zussman 2025-06-10 7:30 ` David Hildenbrand 2025-06-07 6:40 ` [PATCH v2 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET Tal Zussman 3 siblings, 1 reply; 15+ messages in thread From: Tal Zussman @ 2025-06-07 6:40 UTC (permalink / raw) To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman Currently, a VMA registered with a uffd can be unregistered through a different uffd associated with the same mm_struct. The existing behavior is slightly broken and may incorrectly reject unregistering some VMAs due to the following check: if (!vma_can_userfault(cur, cur->vm_flags, wp_async)) goto out_unlock; where wp_async is derived from ctx, not from cur. For example, a file-backed VMA registered with wp_async enabled and UFFD_WP mode cannot be unregistered through a uffd that does not have wp_async enabled. Rather than fix this and maintain this odd behavior, make unregistration stricter by requiring VMAs to be unregistered through the same uffd they were registered with. Additionally, reorder the WARN() checks to avoid the aforementioned wp_async issue in the WARN()s. This change slightly modifies the ABI. It should not be backported to -stable. While at it, correct the comment for the no userfaultfd case. This seems to be a copy-paste artifact from the analogous userfaultfd_register() check. Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Signed-off-by: Tal Zussman <tz2294@columbia.edu> --- fs/userfaultfd.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 80c95c712266..10e8037f5216 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1466,6 +1466,16 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^ !!(cur->vm_flags & __VM_UFFD_FLAGS)); + /* + * Check that this VMA isn't already owned by a different + * userfaultfd. This provides for more strict behavior by + * preventing a VMA registered with a userfaultfd from being + * unregistered through a different userfaultfd. + */ + if (cur->vm_userfaultfd_ctx.ctx && + cur->vm_userfaultfd_ctx.ctx != ctx) + goto out_unlock; + /* * Check not compatible vmas, not strictly required * here as not compatible vmas cannot have an @@ -1489,15 +1499,14 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, for_each_vma_range(vmi, vma, end) { cond_resched(); - VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); - /* - * Nothing to do: this vma is already registered into this - * userfaultfd and with the right tracking mode too. + * Nothing to do: this vma is not registered with userfaultfd. */ if (!vma->vm_userfaultfd_ctx.ctx) goto skip; + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx != ctx); + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); if (vma->vm_start > start) -- 2.39.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
* Re: [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd 2025-06-07 6:40 ` [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd Tal Zussman @ 2025-06-10 7:30 ` David Hildenbrand 2025-06-17 20:50 ` Tal Zussman 0 siblings, 1 reply; 15+ messages in thread From: David Hildenbrand @ 2025-06-10 7:30 UTC (permalink / raw) To: Tal Zussman, Andrew Morton, Peter Xu, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel On 07.06.25 08:40, Tal Zussman wrote: > Currently, a VMA registered with a uffd can be unregistered through a > different uffd associated with the same mm_struct. > > The existing behavior is slightly broken and may incorrectly reject > unregistering some VMAs due to the following check: > > if (!vma_can_userfault(cur, cur->vm_flags, wp_async)) > goto out_unlock; > > where wp_async is derived from ctx, not from cur. For example, a file-backed > VMA registered with wp_async enabled and UFFD_WP mode cannot be unregistered > through a uffd that does not have wp_async enabled. > > Rather than fix this and maintain this odd behavior, make unregistration > stricter by requiring VMAs to be unregistered through the same uffd they > were registered with. Additionally, reorder the WARN() checks to avoid > the aforementioned wp_async issue in the WARN()s. > > This change slightly modifies the ABI. It should not be backported to > -stable. Probably add that the expectation is that nobody really depends on this behavior, and that no such cases are known. > > While at it, correct the comment for the no userfaultfd case. This seems to > be a copy-paste artifact from the analogous userfaultfd_register() check. > > Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") Fixes should come before anything else in a series (Andrew even prefers a separate series for fixes vs. follow-up cleanups). > Signed-off-by: Tal Zussman <tz2294@columbia.edu> > --- > fs/userfaultfd.c | 17 +++++++++++++---- > 1 file changed, 13 insertions(+), 4 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 80c95c712266..10e8037f5216 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -1466,6 +1466,16 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, > VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^ > !!(cur->vm_flags & __VM_UFFD_FLAGS)); > > + /* > + * Check that this VMA isn't already owned by a different > + * userfaultfd. This provides for more strict behavior by > + * preventing a VMA registered with a userfaultfd from being > + * unregistered through a different userfaultfd. > + */ Probably we can shorted to: /* * Prevent unregistering through another userfaultfd than used for * registering. */ ? > + if (cur->vm_userfaultfd_ctx.ctx && > + cur->vm_userfaultfd_ctx.ctx != ctx) > + goto out_unlock; > + > /* > * Check not compatible vmas, not strictly required > * here as not compatible vmas cannot have an > @@ -1489,15 +1499,14 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, > for_each_vma_range(vmi, vma, end) { > cond_resched(); > > - VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); > - > /* > - * Nothing to do: this vma is already registered into this > - * userfaultfd and with the right tracking mode too. > + * Nothing to do: this vma is not registered with userfaultfd. > */ Maybe /* VMA not registered with userfaultfd. */ The "skip" below is rather clear. :) > if (!vma->vm_userfaultfd_ctx.ctx) > goto skip; > > + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx != ctx); > + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); > WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); > > if (vma->vm_start > start) > Acked-by: David Hildenbrand <david@redhat.com> -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 15+ messages in thread
* Re: [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd 2025-06-10 7:30 ` David Hildenbrand @ 2025-06-17 20:50 ` Tal Zussman 0 siblings, 0 replies; 15+ messages in thread From: Tal Zussman @ 2025-06-17 20:50 UTC (permalink / raw) To: David Hildenbrand Cc: Andrew Morton, Peter Xu, Jason A. Donenfeld, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli, linux-mm, linux-kernel, linux-fsdevel On Tue, Jun 10, 2025 at 3:31 AM David Hildenbrand <david@redhat.com> wrote: > > On 07.06.25 08:40, Tal Zussman wrote: > > Currently, a VMA registered with a uffd can be unregistered through a > > different uffd associated with the same mm_struct. > > > > The existing behavior is slightly broken and may incorrectly reject > > unregistering some VMAs due to the following check: > > > > if (!vma_can_userfault(cur, cur->vm_flags, wp_async)) > > goto out_unlock; > > > > where wp_async is derived from ctx, not from cur. For example, a file-backed > > VMA registered with wp_async enabled and UFFD_WP mode cannot be unregistered > > through a uffd that does not have wp_async enabled. > > > > Rather than fix this and maintain this odd behavior, make unregistration > > stricter by requiring VMAs to be unregistered through the same uffd they > > were registered with. Additionally, reorder the WARN() checks to avoid > > the aforementioned wp_async issue in the WARN()s. > > > > This change slightly modifies the ABI. It should not be backported to > > -stable. > > Probably add that the expectation is that nobody really depends on this > behavior, and that no such cases are known. > > > > > While at it, correct the comment for the no userfaultfd case. This seems to > > be a copy-paste artifact from the analogous userfaultfd_register() check. > > > > Fixes: 86039bd3b4e6 ("userfaultfd: add new syscall to provide memory externalization") > > Fixes should come before anything else in a series (Andrew even prefers > a separate series for fixes vs. follow-up cleanups). > > > Signed-off-by: Tal Zussman <tz2294@columbia.edu> > > --- > > fs/userfaultfd.c | 17 +++++++++++++---- > > 1 file changed, 13 insertions(+), 4 deletions(-) > > > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > > index 80c95c712266..10e8037f5216 100644 > > --- a/fs/userfaultfd.c > > +++ b/fs/userfaultfd.c > > @@ -1466,6 +1466,16 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, > > VM_WARN_ON_ONCE(!!cur->vm_userfaultfd_ctx.ctx ^ > > !!(cur->vm_flags & __VM_UFFD_FLAGS)); > > > > + /* > > + * Check that this VMA isn't already owned by a different > > + * userfaultfd. This provides for more strict behavior by > > + * preventing a VMA registered with a userfaultfd from being > > + * unregistered through a different userfaultfd. > > + */ > > Probably we can shorted to: > > /* > * Prevent unregistering through another userfaultfd than used for > * registering. > */ > > ? > > > + if (cur->vm_userfaultfd_ctx.ctx && > > + cur->vm_userfaultfd_ctx.ctx != ctx) > > + goto out_unlock; > > + > > /* > > * Check not compatible vmas, not strictly required > > * here as not compatible vmas cannot have an > > @@ -1489,15 +1499,14 @@ static int userfaultfd_unregister(struct userfaultfd_ctx *ctx, > > for_each_vma_range(vmi, vma, end) { > > cond_resched(); > > > > - VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); > > - > > /* > > - * Nothing to do: this vma is already registered into this > > - * userfaultfd and with the right tracking mode too. > > + * Nothing to do: this vma is not registered with userfaultfd. > > */ > > Maybe > > /* VMA not registered with userfaultfd. */ > > The "skip" below is rather clear. :) > > > if (!vma->vm_userfaultfd_ctx.ctx) > > goto skip; > > > > + VM_WARN_ON_ONCE(vma->vm_userfaultfd_ctx.ctx != ctx); > > + VM_WARN_ON_ONCE(!vma_can_userfault(vma, vma->vm_flags, wp_async)); > > WARN_ON(!(vma->vm_flags & VM_MAYWRITE)); > > > > if (vma->vm_start > start) > > > > Acked-by: David Hildenbrand <david@redhat.com> Thanks! Will update with the suggested comment + commit message changes and move this patch before the VM_WARN changes. > -- > Cheers, > > David / dhildenb > ^ permalink raw reply [flat|nested] 15+ messages in thread
* [PATCH v2 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET 2025-06-07 6:39 [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman ` (2 preceding siblings ...) 2025-06-07 6:40 ` [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd Tal Zussman @ 2025-06-07 6:40 ` Tal Zussman 3 siblings, 0 replies; 15+ messages in thread From: Tal Zussman @ 2025-06-07 6:40 UTC (permalink / raw) To: Andrew Morton, Peter Xu, Jason A. Donenfeld, David Hildenbrand, Alexander Viro, Christian Brauner, Jan Kara, Andrea Arcangeli Cc: linux-mm, linux-kernel, linux-fsdevel, Tal Zussman UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET have been unused since they were added in commit 932b18e0aec6 ("userfaultfd: linux/userfaultfd_k.h"). Remove them and the associated BUILD_BUG_ON() checks. Acked-by: David Hildenbrand <david@redhat.com> Acked-by: Peter Xu <peterx@redhat.com> Signed-off-by: Tal Zussman <tz2294@columbia.edu> --- fs/userfaultfd.c | 2 -- include/linux/userfaultfd_k.h | 4 ---- 2 files changed, 6 deletions(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 10e8037f5216..ef054b3154b2 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -2118,8 +2118,6 @@ static int new_userfaultfd(int flags) /* Check the UFFD_* constants for consistency. */ BUILD_BUG_ON(UFFD_USER_MODE_ONLY & UFFD_SHARED_FCNTL_FLAGS); - BUILD_BUG_ON(UFFD_CLOEXEC != O_CLOEXEC); - BUILD_BUG_ON(UFFD_NONBLOCK != O_NONBLOCK); if (flags & ~(UFFD_SHARED_FCNTL_FLAGS | UFFD_USER_MODE_ONLY)) return -EINVAL; diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index f3b3d2c9dd5e..ccad58602846 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -30,11 +30,7 @@ * from userfaultfd, in order to leave a free define-space for * shared O_* flags. */ -#define UFFD_CLOEXEC O_CLOEXEC -#define UFFD_NONBLOCK O_NONBLOCK - #define UFFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK) -#define UFFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS) /* * Start with fault_pending_wqh and fault_wqh so they're more likely -- 2.39.5 ^ permalink raw reply related [flat|nested] 15+ messages in thread
end of thread, other threads:[~2025-06-23 15:18 UTC | newest] Thread overview: 15+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2025-06-07 6:39 [PATCH v2 0/4] mm: userfaultfd: assorted fixes and cleanups Tal Zussman 2025-06-07 6:40 ` [PATCH v2 1/4] userfaultfd: correctly prevent registering VM_DROPPABLE regions Tal Zussman 2025-06-07 14:40 ` Jason A. Donenfeld 2025-06-07 22:04 ` Andrew Morton 2025-06-09 15:02 ` Peter Xu 2025-06-07 6:40 ` [PATCH v2 2/4] userfaultfd: remove (VM_)BUG_ON()s Tal Zussman 2025-06-10 7:26 ` David Hildenbrand 2025-06-17 23:10 ` Tal Zussman 2025-06-23 15:18 ` David Hildenbrand 2025-06-10 13:11 ` Peter Xu 2025-06-10 13:17 ` David Hildenbrand 2025-06-07 6:40 ` [PATCH v2 3/4] userfaultfd: prevent unregistering VMAs through a different userfaultfd Tal Zussman 2025-06-10 7:30 ` David Hildenbrand 2025-06-17 20:50 ` Tal Zussman 2025-06-07 6:40 ` [PATCH v2 4/4] userfaultfd: remove UFFD_CLOEXEC, UFFD_NONBLOCK, and UFFD_FLAGS_SET Tal Zussman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).