From: Mike Rapoport <rppt@kernel.org>
To: Peter Xu <peterx@redhat.com>
Cc: linux-mm@kvack.org, Andrea Arcangeli <aarcange@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
David Hildenbrand <david@redhat.com>,
Hugh Dickins <hughd@google.com>,
James Houghton <jthoughton@google.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Michal Hocko <mhocko@suse.com>,
Muchun Song <muchun.song@linux.dev>,
Nikita Kalyazin <kalyazin@amazon.com>,
Oscar Salvador <osalvador@suse.de>,
Paolo Bonzini <pbonzini@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Shuah Khan <shuah@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-kernel@vger.kernel.org, kvm@vger.kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH RFC 05/17] userfaultfd: retry copying with locks dropped in mfill_atomic_pte_copy()
Date: Sun, 8 Feb 2026 12:01:11 +0200 [thread overview]
Message-ID: <aYhe55zEqvuyng8z@kernel.org> (raw)
In-Reply-To: <aYEVxD_vY-qimMNL@x1.local>
On Mon, Feb 02, 2026 at 04:23:16PM -0500, Peter Xu wrote:
> Hi, Mike,
>
> On Tue, Jan 27, 2026 at 09:29:24PM +0200, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> >
> > Implementation of UFFDIO_COPY for anonymous memory might fail to copy
> > data data from userspace buffer when the destination VMA is locked
> > (either with mm_lock or with per-VMA lock).
> >
> > In that case, mfill_atomic() releases the locks, retries copying the
> > data with locks dropped and then re-locks the destination VMA and
> > re-establishes PMD.
> >
> > Since this retry-reget dance is only relevant for UFFDIO_COPY and it
> > never happens for other UFFDIO_ operations, make it a part of
> > mfill_atomic_pte_copy() that actually implements UFFDIO_COPY for
> > anonymous memory.
> >
> > shmem implementation will be updated later and the loop in
> > mfill_atomic() will be adjusted afterwards.
>
> Thanks for the refactoring. Looks good to me in general, only some
> nitpicks inline.
>
> >
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > ---
> > mm/userfaultfd.c | 70 +++++++++++++++++++++++++++++++-----------------
> > 1 file changed, 46 insertions(+), 24 deletions(-)
> >
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 45d8f04aaf4f..01a2b898fa40 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -404,35 +404,57 @@ static int mfill_copy_folio_locked(struct folio *folio, unsigned long src_addr)
> > return ret;
> > }
> >
> > +static int mfill_copy_folio_retry(struct mfill_state *state, struct folio *folio)
> > +{
> > + unsigned long src_addr = state->src_addr;
> > + void *kaddr;
> > + int err;
> > +
> > + /* retry copying with mm_lock dropped */
> > + mfill_put_vma(state);
> > +
> > + kaddr = kmap_local_folio(folio, 0);
> > + err = copy_from_user(kaddr, (const void __user *) src_addr, PAGE_SIZE);
> > + kunmap_local(kaddr);
> > + if (unlikely(err))
> > + return -EFAULT;
> > +
> > + flush_dcache_folio(folio);
> > +
> > + /* reget VMA and PMD, they could change underneath us */
> > + err = mfill_get_vma(state);
> > + if (err)
> > + return err;
> > +
> > + err = mfill_get_pmd(state);
> > + if (err)
> > + return err;
> > +
> > + return 0;
> > +}
> > +
> > static int mfill_atomic_pte_copy(struct mfill_state *state)
> > {
> > - struct vm_area_struct *dst_vma = state->vma;
> > unsigned long dst_addr = state->dst_addr;
> > unsigned long src_addr = state->src_addr;
> > uffd_flags_t flags = state->flags;
> > - pmd_t *dst_pmd = state->pmd;
> > struct folio *folio;
> > int ret;
> >
> > - if (!state->folio) {
> > - ret = -ENOMEM;
> > - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, dst_vma,
> > - dst_addr);
> > - if (!folio)
> > - goto out;
> > + folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, state->vma, dst_addr);
> > + if (!folio)
> > + return -ENOMEM;
> >
> > - ret = mfill_copy_folio_locked(folio, src_addr);
> > + ret = -ENOMEM;
> > + if (mem_cgroup_charge(folio, state->vma->vm_mm, GFP_KERNEL))
> > + goto out_release;
> >
> > + ret = mfill_copy_folio_locked(folio, src_addr);
> > + if (unlikely(ret)) {
> > /* fallback to copy_from_user outside mmap_lock */
> > - if (unlikely(ret)) {
> > - ret = -ENOENT;
> > - state->folio = folio;
> > - /* don't free the page */
> > - goto out;
> > - }
> > - } else {
> > - folio = state->folio;
> > - state->folio = NULL;
> > + ret = mfill_copy_folio_retry(state, folio);
>
> Yes, I agree this should work and should avoid the previous ENOENT
> processing that might be hard to follow. It'll move the complexity into
> mfill_state though (e.g., now it's unknown on the vma lock state after this
> function returns..), but I guess it's fine.
When this function returns success VMA is locked. If the function fails it
does not matter if the VMA is locked.
I'll add some comments.
> > + if (ret)
> > + goto out_release;
> > }
> >
> > /*
> > @@ -442,17 +464,16 @@ static int mfill_atomic_pte_copy(struct mfill_state *state)
> > */
> > __folio_mark_uptodate(folio);
>
> Since success path should make sure vma lock held when reaching here, but
> now with mfill_copy_folio_retry()'s presence it's not as clear as before,
> maybe we add an assertion for that here before installing ptes? No strong
> feelings.
I'll add comments.
> >
> > - ret = -ENOMEM;
> > - if (mem_cgroup_charge(folio, dst_vma->vm_mm, GFP_KERNEL))
> > - goto out_release;
> > -
> > - ret = mfill_atomic_install_pte(dst_pmd, dst_vma, dst_addr,
> > + ret = mfill_atomic_install_pte(state->pmd, state->vma, dst_addr,
> > &folio->page, true, flags);
> > if (ret)
> > goto out_release;
> > out:
> > return ret;
> > out_release:
> > + /* Don't return -ENOENT so that our caller won't retry */
> > + if (ret == -ENOENT)
> > + ret = -EFAULT;
>
> I recall the code removed is the only path that can return ENOENT? Then
> maybe this line isn't needed?
I didn't want to audit all potential errors and this is a temporal safety
measure to avoid breaking biscection. This is anyway removed in the
following patches.
> > folio_put(folio);
> > goto out;
> > }
> > @@ -907,7 +928,8 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
> > break;
> > }
> >
> > - mfill_put_vma(&state);
> > + if (state.vma)
>
> I wonder if we should move this check into mfill_put_vma() directly, it
> might be overlooked if we'll put_vma in other paths otherwise.
Yeah, I'll check this.
--
Sincerely yours,
Mike.
next prev parent reply other threads:[~2026-02-08 10:01 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-01-27 19:29 [PATCH RFC 00/17] mm, kvm: allow uffd suppot in guest_memfd Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 01/17] userfaultfd: introduce mfill_copy_folio_locked() helper Mike Rapoport
2026-02-03 17:45 ` Peter Xu
2026-02-08 9:49 ` Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 02/17] userfaultfd: introduce struct mfill_state Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 03/17] userfaultfd: introduce mfill_get_pmd() helper Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 04/17] userfaultfd: introduce mfill_get_vma() and mfill_put_vma() Mike Rapoport
2026-02-02 21:49 ` Peter Xu
2026-02-08 9:54 ` Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 05/17] userfaultfd: retry copying with locks dropped in mfill_atomic_pte_copy() Mike Rapoport
2026-02-02 21:23 ` Peter Xu
2026-02-08 10:01 ` Mike Rapoport [this message]
2026-01-27 19:29 ` [PATCH RFC 06/17] userfaultfd: move vma_can_userfault out of line Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 07/17] userfaultfd: introduce vm_uffd_ops Mike Rapoport
2026-02-02 21:36 ` Peter Xu
2026-02-08 10:13 ` Mike Rapoport
2026-02-11 19:35 ` Peter Xu
2026-02-15 17:47 ` Mike Rapoport
2026-02-18 21:34 ` Peter Xu
2026-01-27 19:29 ` [PATCH RFC 08/17] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 09/17] userfaultfd: introduce vm_uffd_ops->alloc_folio() Mike Rapoport
2026-02-02 22:13 ` Peter Xu
2026-02-08 10:22 ` Mike Rapoport
2026-02-11 19:37 ` Peter Xu
2026-01-27 19:29 ` [PATCH RFC 10/17] shmem, userfaultfd: implement shmem uffd operations using vm_uffd_ops Mike Rapoport
2026-02-03 17:40 ` Peter Xu
2026-02-08 10:35 ` Mike Rapoport
2026-02-11 20:00 ` Peter Xu
2026-02-15 17:45 ` Mike Rapoport
2026-02-18 21:45 ` Peter Xu
2026-01-27 19:29 ` [PATCH RFC 11/17] userfaultfd: mfill_atomic() remove retry logic Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 12/17] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 13/17] mm: introduce VM_FAULT_UFFD_MISSING " Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 14/17] KVM: guest_memfd: implement userfaultfd minor mode Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 15/17] KVM: guest_memfd: implement userfaultfd missing mode Mike Rapoport
2026-03-04 17:12 ` Nikita Kalyazin
2026-03-04 17:17 ` Nikita Kalyazin
2026-01-27 19:29 ` [PATCH RFC 16/17] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
2026-01-27 19:29 ` [PATCH RFC 17/17] KVM: selftests: test userfaultfd missing " Mike Rapoport
2026-02-03 20:56 ` [PATCH RFC 00/17] mm, kvm: allow uffd suppot in guest_memfd Peter Xu
2026-02-09 15:35 ` David Hildenbrand (Arm)
2026-02-11 6:04 ` Mike Rapoport
2026-02-11 9:52 ` David Hildenbrand (Arm)
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=aYhe55zEqvuyng8z@kernel.org \
--to=rppt@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=axelrasmussen@google.com \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=hughd@google.com \
--cc=jthoughton@google.com \
--cc=kalyazin@amazon.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=seanjc@google.com \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.