From: Mike Rapoport <rppt@kernel.org>
To: James Houghton <jthoughton@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Andrea Arcangeli <aarcange@redhat.com>,
Axel Rasmussen <axelrasmussen@google.com>,
Baolin Wang <baolin.wang@linux.alibaba.com>,
David Hildenbrand <david@kernel.org>,
Hugh Dickins <hughd@google.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
"Matthew Wilcox (Oracle)" <willy@infradead.org>,
Michal Hocko <mhocko@suse.com>,
Muchun Song <muchun.song@linux.dev>,
Nikita Kalyazin <kalyazin@amazon.com>,
Oscar Salvador <osalvador@suse.de>,
Paolo Bonzini <pbonzini@redhat.com>, Peter Xu <peterx@redhat.com>,
Sean Christopherson <seanjc@google.com>,
Shuah Khan <shuah@kernel.org>,
Suren Baghdasaryan <surenb@google.com>,
Vlastimil Babka <vbabka@suse.cz>,
kvm@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH v2 12/15] mm: generalize handling of userfaults in __do_fault()
Date: Fri, 27 Mar 2026 14:31:06 +0300 [thread overview]
Message-ID: <acZqemRy3ltBWGL2@kernel.org> (raw)
In-Reply-To: <CADrL8HVUJ5FL97d9ytxp2WXos6HS+U+ycpsi5VxffsW9vacr9Q@mail.gmail.com>
On Thu, Mar 26, 2026 at 06:55:15PM -0700, James Houghton wrote:
> On Fri, Mar 6, 2026 at 9:19 AM Mike Rapoport <rppt@kernel.org> wrote:
> >
> > From: Peter Xu <peterx@redhat.com>
> >
> > When a VMA is registered with userfaulfd, its ->fault()
> > method should check if a folio exists in the page cache and call
> > handle_userfault() with appropriate mode:
> >
> > - VM_UFFD_MINOR if VMA is registered in minor mode and the folio exists
> > - VM_UFFD_MISSING if VMA is registered in missing mode and the folio
> > does not exist
> >
> > Instead of calling handle_userfault() directly from a specific ->fault()
> > handler, call __do_userfault() helper from the generic __do_fault().
> >
> > For VMAs registered with userfaultfd the new __do_userfault() helper
> > will check if the folio is found in the page cache using
> > vm_uffd_ops->get_folio_noalloc() and call handle_userfault() with the
> > appropriate mode.
> >
> > Make vm_uffd_ops->get_folio_noalloc() required method for non-anonymous
> > VMAs mapped at PTE level.
> >
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> > ---
> > mm/memory.c | 43 +++++++++++++++++++++++++++++++++++++++++++
> > mm/shmem.c | 12 ------------
> > mm/userfaultfd.c | 8 ++++++++
> > 3 files changed, 51 insertions(+), 12 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 07778814b4a8..e2183c44d70b 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5328,6 +5328,41 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf)
> > return VM_FAULT_OOM;
> > }
> >
> > +#ifdef CONFIG_USERFAULTFD
> > +static vm_fault_t __do_userfault(struct vm_fault *vmf)
> > +{
> > + struct vm_area_struct *vma = vmf->vma;
> > + struct inode *inode;
> > + struct folio *folio;
> > +
> > + if (!(userfaultfd_missing(vma) || userfaultfd_minor(vma)))
> > + return 0;
> > +
> > + inode = file_inode(vma->vm_file);
> > + folio = vma->vm_ops->uffd_ops->get_folio_noalloc(inode, vmf->pgoff);
>
> If you do away with the change you made to vma_can_userfault(), please
> add a WARN_ON_ONCE() here if uffd_ops or uffd_ops->get_folio_noalloc
> are not present.
>
> > + if (!IS_ERR_OR_NULL(folio)) {
> > + /*
> > + * TODO: provide a flag for get_folio_noalloc() to avoid
> > + * locking (or even the extra reference?)
> > + */
>
> I think a whole new op is better than adding a flag. Something like
> uffd_ops->folio_present().
Right now it follows what shmem does, let's keep it as TODO for future
optimizations.
> > + folio_unlock(folio);
> > + folio_put(folio);
> > + if (userfaultfd_minor(vma))
> > + return handle_userfault(vmf, VM_UFFD_MINOR);
> > + } else {
> > + if (userfaultfd_missing(vma))
> > + return handle_userfault(vmf, VM_UFFD_MISSING);
> > + }
> > +
> > + return 0;
> > +}
> > +#else
> > +static inline vm_fault_t __do_userfault(struct vm_fault *vmf)
> > +{
> > + return 0;
> > +}
> > +#endif
> > +
> > /*
> > * The mmap_lock must have been held on entry, and may have been
> > * released depending on flags and vma->vm_ops->fault() return value.
> > @@ -5360,6 +5395,14 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
> > return VM_FAULT_OOM;
> > }
> >
> > + /*
> > + * If this is an userfaultfd trap, process it in advance before
>
> "If this is a userfault"
Indeed :)
> > + * triggering the genuine fault handler.
> > + */
> > + ret = __do_userfault(vmf);
> > + if (ret)
> > + return ret;
> > +
> > ret = vma->vm_ops->fault(vmf);
> > if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
> > VM_FAULT_DONE_COW)))
> > diff --git a/mm/shmem.c b/mm/shmem.c
> > index 68620caaf75f..239545352cd2 100644
> > --- a/mm/shmem.c
> > +++ b/mm/shmem.c
> > @@ -2489,13 +2489,6 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> > fault_mm = vma ? vma->vm_mm : NULL;
> >
> > folio = filemap_get_entry(inode->i_mapping, index);
> > - if (folio && vma && userfaultfd_minor(vma)) {
> > - if (!xa_is_value(folio))
> > - folio_put(folio);
> > - *fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
> > - return 0;
> > - }
> > -
> > if (xa_is_value(folio)) {
> > error = shmem_swapin_folio(inode, index, &folio,
> > sgp, gfp, vma, fault_type);
> > @@ -2540,11 +2533,6 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
> > * Fast cache lookup and swap lookup did not find it: allocate.
> > */
> >
> > - if (vma && userfaultfd_missing(vma)) {
> > - *fault_type = handle_userfault(vmf, VM_UFFD_MISSING);
> > - return 0;
> > - }
> > -
> > /* Find hugepage orders that are allowed for anonymous shmem and tmpfs. */
> > orders = shmem_allowable_huge_orders(inode, vma, index, write_end, false);
> > if (orders > 0) {
> > diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> > index 7cd7c5d1ce84..2ac5fad0ed6c 100644
> > --- a/mm/userfaultfd.c
> > +++ b/mm/userfaultfd.c
> > @@ -2045,6 +2045,14 @@ bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
> > !vma_is_anonymous(vma))
> > return false;
> >
> > + /*
> > + * File backed memory with PTE level mappigns must implement
>
> "File-backed VMAs (except HugeTLB VMAs) must implement
> ops->get_folio_alloc() to support userfaults, as all userfaults in
> file-backed VMAs wlll call ops->get_folio_alloc() to determine the
> userfault type."
I like (except HugeTLB) more than "PTE ...", but it's not here to determine
the userfaulfd type, it's here to ensure we can __do_userfault() for a
memory type.
/*
* File backed VMAs (except HugeTLB) must implement
* ops->get_folio_noalloc() because it's required by
* __do_userfault() in page fault handling.
*/
> > + * ops->get_folio_noalloc()
> > + */
> > + if (!vma_is_anonymous(vma) && !is_vm_hugetlb_page(vma) &&
> > + !ops->get_folio_noalloc)
>
> With the separate folio_present() (or whatever) op, this check becomes
> more obvious. (Looking at this right now, it looks like we're checking
> for UFFDIO_CONTINUE/minor fault support here, but that's not really
> what's going on.)
>
> Honestly I'm not 100% sure if this check should be here. If it's going
> to be here, then we should probably also have checks like these here:
>
> if (vm_flags & VM_UFFD_MINOR && !ops->get_folio_noalloc)
>
> and
>
> if (vm_flags & VM_UFFD_MISSING && !ops->alloc_folio)
>
> So, maybe we should just leave it up to ops->can_userfault() to make
> sure we're doing the right thing?
This must be in the core so it can reject VMAs that can't be used in
__do_userfault() without complicating hot path in the page fault handler
with extra conditions.
>
> > + return false;
> > +
> > return ops->can_userfault(vma, vm_flags);
> > }
> >
> > --
> > 2.51.0
> >
--
Sincerely yours,
Mike.
next prev parent reply other threads:[~2026-03-27 11:31 UTC|newest]
Thread overview: 40+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-06 17:18 [PATCH v2 00/15] mm, kvm: allow uffd support in guest_memfd Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 01/15] userfaultfd: introduce mfill_copy_folio_locked() helper Mike Rapoport
2026-03-20 11:58 ` David Hildenbrand (Arm)
2026-03-06 17:18 ` [PATCH v2 02/15] userfaultfd: introduce struct mfill_state Mike Rapoport
2026-03-20 12:43 ` David Hildenbrand (Arm)
2026-03-22 10:03 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 03/15] userfaultfd: introduce mfill_get_pmd() helper Mike Rapoport
2026-03-20 12:55 ` David Hildenbrand (Arm)
2026-03-22 10:22 ` Mike Rapoport
2026-03-30 15:21 ` David Hildenbrand (Arm)
2026-03-06 17:18 ` [PATCH v2 04/15] userfaultfd: introduce mfill_get_vma() and mfill_put_vma() Mike Rapoport
[not found] ` <abe1FHyYinvfLYnw@hyeyoo>
2026-03-16 7:48 ` [PATCH v2 4/15] " Harry Yoo
2026-03-16 8:05 ` Deepanshu Kartikey
2026-03-16 8:36 ` Harry Yoo
2026-03-16 8:52 ` Deepanshu Kartikey
2026-03-06 17:18 ` [PATCH v2 05/15] userfaultfd: retry copying with locks dropped in mfill_atomic_pte_copy() Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 06/15] userfaultfd: move vma_can_userfault out of line Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 07/15] userfaultfd: introduce vm_uffd_ops Mike Rapoport
2026-03-11 18:49 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 08/15] shmem, userfaultfd: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
2026-03-26 23:43 ` James Houghton
2026-03-27 0:26 ` Andrew Morton
2026-03-27 7:12 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 09/15] userfaultfd: introduce vm_uffd_ops->alloc_folio() Mike Rapoport
2026-03-27 0:07 ` James Houghton
2026-03-27 7:17 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 10/15] shmem, userfaultfd: implement shmem uffd operations using vm_uffd_ops Mike Rapoport
2026-03-27 1:13 ` James Houghton
2026-03-27 7:46 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 11/15] userfaultfd: mfill_atomic(): remove retry logic Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 12/15] mm: generalize handling of userfaults in __do_fault() Mike Rapoport
2026-03-27 1:55 ` James Houghton
2026-03-27 11:31 ` Mike Rapoport [this message]
2026-03-06 17:18 ` [PATCH v2 13/15] KVM: guest_memfd: implement userfaultfd operations Mike Rapoport
2026-03-27 2:33 ` James Houghton
2026-03-27 11:47 ` Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 14/15] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
2026-03-06 17:18 ` [PATCH v2 15/15] KVM: selftests: test userfaultfd missing " Mike Rapoport
2026-03-06 22:21 ` [PATCH v2 00/15] mm, kvm: allow uffd support in guest_memfd Andrew Morton
2026-03-26 23:23 ` Andrew Morton
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=acZqemRy3ltBWGL2@kernel.org \
--to=rppt@kernel.org \
--cc=Liam.Howlett@oracle.com \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=axelrasmussen@google.com \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@kernel.org \
--cc=hughd@google.com \
--cc=jthoughton@google.com \
--cc=kalyazin@amazon.com \
--cc=kvm@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-kselftest@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=lorenzo.stoakes@oracle.com \
--cc=mhocko@suse.com \
--cc=muchun.song@linux.dev \
--cc=osalvador@suse.de \
--cc=pbonzini@redhat.com \
--cc=peterx@redhat.com \
--cc=seanjc@google.com \
--cc=shuah@kernel.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.