From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 16 Mar 2026 12:59:27 +0200
From: Mike Rapoport
To: Deepanshu Kartikey
Cc: akpm@linux-foundation.org, peterx@redhat.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,
	syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
Subject: Re: [PATCH] mm/userfaultfd: re-validate vma in mfill_atomic() loop under CONFIG_PER_VMA_LOCK
References: <20260316070039.549506-1-kartikey406@gmail.com>
In-Reply-To: <20260316070039.549506-1-kartikey406@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On Mon, Mar 16, 2026 at 12:30:39PM +0530, Deepanshu Kartikey wrote:
> Under CONFIG_PER_VMA_LOCK, mfill_atomic() holds only a per-VMA read
> lock across its page-by-page copy loop. A concurrent UFFDIO_UNREGISTER
> can acquire mmap_write_lock() and split the VMA mid-loop via
> __split_vma(), which calls vma_start_write() via __vma_enter_locked().
>
> The split happens in the race window between CHECK 1 and vm_refcnt++
> in vma_start_read(). During this window vm_refcnt equals the base
> attached value, so vma_start_write() sees no readers and proceeds
> immediately without waiting, shrinking vma->vm_end in place.
>
> Both seqnum checks in vma_start_read() miss this because, after
> mmap_write_unlock(), mm_lock_seq is incremented past vm_lock_seq,
> making them unequal, so a split VMA is returned to mfill_atomic().
>
> On the next iteration, mfill_atomic_install_pte() calls
> folio_add_new_anon_rmap() with state.dst_addr >= vma->vm_end,
> triggering its sanity check:
>
>   address < vma->vm_start || address + (nr << 12) > vma->vm_end
>   WARNING: mm/rmap.c:1682 folio_add_new_anon_rmap+0x5fe/0x14b0
>
> Fix this by checking on each loop iteration whether state.dst_addr
> has fallen outside state.vma. If so, release the stale vma, update
> dst_start and len to reflect the current position, and re-lookup the
> vma via mfill_get_vma().
>
> Reported-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=e24a2e34fad0efbac047
> Tested-by: syzbot+e24a2e34fad0efbac047@syzkaller.appspotmail.com
> Signed-off-by: Deepanshu Kartikey
> ---
>  mm/userfaultfd.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)

There's a simpler fix Harry posted:

https://lore.kernel.org/all/abehBY7QakYF9bK4@hyeyoo

and it keeps the original behaviour, where the copy bails out if the
VMA was changed underneath it.

> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 9ffc80d0a51b..ab73c2106c38 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -910,6 +910,22 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
>
>  	while (state.src_addr < src_start + len) {
>  		VM_WARN_ON_ONCE(state.dst_addr >= dst_start + len);
> +		/*
> +		 * Under CONFIG_PER_VMA_LOCK, a concurrent UFFDIO_UNREGISTER can
> +		 * split state.vma while we hold only the per-VMA read lock. The
> +		 * split shrinks vma->vm_end in place, causing dst_addr to fall
> +		 * outside the VMA bounds. Re-validate dst_addr on each iteration
> +		 * and re-lookup the vma if it has been split.
> +		 */
> +		if (state.dst_addr < state.vma->vm_start ||
> +		    state.dst_addr >= state.vma->vm_end) {
> +			mfill_put_vma(&state);
> +			state.dst_start = state.dst_addr;
> +			state.len = dst_start + len - state.dst_addr;
> +			err = mfill_get_vma(&state);
> +			if (err)
> +				break;
> +		}
>
>  		err = mfill_get_pmd(&state);
>  		if (err)
> --
> 2.43.0
>

-- 
Sincerely yours,
Mike.