public inbox for linux-mm@kvack.org
* [PATCH] mm/userfaultfd: re-validate vma in mfill_atomic() loop under CONFIG_PER_VMA_LOCK
@ 2026-03-16  7:00 Deepanshu Kartikey
  2026-03-16 10:59 ` Mike Rapoport
  0 siblings, 1 reply; 2+ messages in thread
From: Deepanshu Kartikey @ 2026-03-16  7:00 UTC (permalink / raw)
  To: akpm, rppt, peterx
  Cc: linux-mm, linux-kernel, Deepanshu Kartikey,
	syzbot+e24a2e34fad0efbac047, Deepanshu Kartikey

Under CONFIG_PER_VMA_LOCK, mfill_atomic() holds only a per-VMA read
lock across its page-by-page copy loop. A concurrent UFFDIO_UNREGISTER
can acquire mmap_write_lock() and split the VMA mid-loop via
__split_vma(), which calls vma_start_write() via __vma_enter_locked().

The split happens in the race window between CHECK 1 and vm_refcnt++
in vma_start_read(). During this window vm_refcnt equals the base
attached value, so vma_start_write() sees no readers and proceeds
immediately without waiting, shrinking vma->vm_end in place.

Both seqnum checks in vma_start_read() miss this: after
mmap_write_unlock(), mm_lock_seq has been incremented past vm_lock_seq,
so the two are unequal and the now-split VMA is returned to
mfill_atomic().

On the next iteration, mfill_atomic_install_pte() calls
folio_add_new_anon_rmap() with state.dst_addr >= vma->vm_end,
triggering its sanity check:

  address < vma->vm_start || address + (nr << PAGE_SHIFT) > vma->vm_end
  WARNING: mm/rmap.c:1682 folio_add_new_anon_rmap+0x5fe/0x14b0

Fix this by checking on each loop iteration whether state.dst_addr
has fallen outside state.vma. If so, release the stale vma, update
dst_start and len to reflect the current position, and re-lookup the
vma via mfill_get_vma().

Reported-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=e24a2e34fad0efbac047
Tested-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
---
 mm/userfaultfd.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 9ffc80d0a51b..ab73c2106c38 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -910,6 +910,22 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 
 	while (state.src_addr < src_start + len) {
 		VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len);
+		/*
+		* Under CONFIG_PER_VMA_LOCK, a concurrent UFFDIO_UNREGISTER can
+		* split state.vma while we hold only the per-VMA read lock. The
+		* split shrinks vma->vm_end in place, causing dst_addr to fall
+		* outside the VMA bounds. Re-validate dst_addr on each iteration
+		* and re-lookup the vma if it has been split.
+		*/
+		if (state.dst_addr < state.vma->vm_start ||
+		    state.dst_addr >= state.vma->vm_end) {
+			mfill_put_vma(&state);
+			state.dst_start = state.dst_addr;
+			state.len = dst_start + len - state.dst_addr;
+			err = mfill_get_vma(&state);
+			if (err)
+				break;
+		}
 
 		err = mfill_get_pmd(&state);
 		if (err)
-- 
2.43.0




* Re: [PATCH] mm/userfaultfd: re-validate vma in mfill_atomic() loop under CONFIG_PER_VMA_LOCK
  2026-03-16  7:00 [PATCH] mm/userfaultfd: re-validate vma in mfill_atomic() loop under CONFIG_PER_VMA_LOCK Deepanshu Kartikey
@ 2026-03-16 10:59 ` Mike Rapoport
  0 siblings, 0 replies; 2+ messages in thread
From: Mike Rapoport @ 2026-03-16 10:59 UTC (permalink / raw)
  To: Deepanshu Kartikey
  Cc: akpm, peterx, linux-mm, linux-kernel, syzbot+e24a2e34fad0efbac047

Hi,

On Mon, Mar 16, 2026 at 12:30:39PM +0530, Deepanshu Kartikey wrote:
> Under CONFIG_PER_VMA_LOCK, mfill_atomic() holds only a per-VMA read
> lock across its page-by-page copy loop. A concurrent UFFDIO_UNREGISTER
> can acquire mmap_write_lock() and split the VMA mid-loop via
> __split_vma(), which calls vma_start_write() via __vma_enter_locked().
> 
> The split happens in the race window between CHECK 1 and vm_refcnt++
> in vma_start_read(). During this window vm_refcnt equals the base
> attached value, so vma_start_write() sees no readers and proceeds
> immediately without waiting, shrinking vma->vm_end in place.
> 
> Both seqnum checks in vma_start_read() miss this: after
> mmap_write_unlock(), mm_lock_seq has been incremented past vm_lock_seq,
> so the two are unequal and the now-split VMA is returned to
> mfill_atomic().
> 
> On the next iteration, mfill_atomic_install_pte() calls
> folio_add_new_anon_rmap() with state.dst_addr >= vma->vm_end,
> triggering its sanity check:
> 
>   address < vma->vm_start || address + (nr << PAGE_SHIFT) > vma->vm_end
>   WARNING: mm/rmap.c:1682 folio_add_new_anon_rmap+0x5fe/0x14b0
> 
> Fix this by checking on each loop iteration whether state.dst_addr
> has fallen outside state.vma. If so, release the stale vma, update
> dst_start and len to reflect the current position, and re-lookup the
> vma via mfill_get_vma().
> 
> Reported-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=e24a2e34fad0efbac047
> Tested-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey <Kartikey406@gmail.com>
> ---
>  mm/userfaultfd.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

There's a simpler fix Harry posted:
https://lore.kernel.org/all/abehBY7QakYF9bK4@hyeyoo

and it keeps the original behaviour, where the copy bails out if the
VMA was changed underneath it.

> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 9ffc80d0a51b..ab73c2106c38 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -910,6 +910,22 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
>  
>  	while (state.src_addr < src_start + len) {
>  		VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len);
> +		/*
> +		* Under CONFIG_PER_VMA_LOCK, a concurrent UFFDIO_UNREGISTER can
> +		* split state.vma while we hold only the per-VMA read lock. The
> +		* split shrinks vma->vm_end in place, causing dst_addr to fall
> +		* outside the VMA bounds. Re-validate dst_addr on each iteration
> +		* and re-lookup the vma if it has been split.
> +		*/
> +		if (state.dst_addr < state.vma->vm_start ||
> +		    state.dst_addr >= state.vma->vm_end) {
> +			mfill_put_vma(&state);
> +			state.dst_start = state.dst_addr;
> +			state.len = dst_start + len - state.dst_addr;
> +			err = mfill_get_vma(&state);
> +			if (err)
> +				break;
> +		}
>  
>  		err = mfill_get_pmd(&state);
>  		if (err)
> -- 
> 2.43.0
> 

-- 
Sincerely yours,
Mike.


