* [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults
@ 2025-11-25 18:38 Mike Rapoport
  2025-11-25 18:38 ` [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line Mike Rapoport
                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

Hi,

These patches allow guest_memfd to notify userspace about minor page
faults using userfaultfd and let userspace resolve these page faults
using UFFDIO_CONTINUE.

To allow UFFDIO_CONTINUE outside of the core mm I added a get_folio()
callback to vm_ops that allows an address space backing a VMA to return a
folio that exists in its page cache (patch 2).

In order for guest_memfd to notify userspace about page faults, there is a
new VM_FAULT_UFFD_MINOR that a ->fault() handler can return to inform the
page fault handler that it needs to call handle_userfault() to complete the
fault (patch 3).
 
Patch 4 plumbs these new goodies into guest_memfd.

This series is the minimal change I've been able to come up with to allow
integration of guest_memfd with uffd. Refactoring uffd and making the
mfill_atomic() flow more linear would have been a nice improvement, but
it's way out of the scope of enabling uffd with guest_memfd.

v2 changes:
* rename ->get_shared_folio() to ->get_folio()
* hardwire VM_FAULT_UFFD_MINOR to 0 when CONFIG_USERFAULTFD=n

v1: https://patch.msgid.link/20251123102707.559422-1-rppt@kernel.org
* Introduce VM_FAULT_UFFD_MINOR to avoid exporting handle_userfault()
* Simplify vma_can_mfill_atomic()
* Rename get_pagecache_folio() to get_shared_folio() and use inode
  instead of vma as its argument

rfc: https://patch.msgid.link/20251117114631.2029447-1-rppt@kernel.org

Mike Rapoport (Microsoft) (4):
  userfaultfd: move vma_can_userfault out of line
  userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
  mm: introduce VM_FAULT_UFFD_MINOR fault reason
  guest_memfd: add support for userfaultfd minor mode

Nikita Kalyazin (1):
  KVM: selftests: test userfaultfd minor for guest_memfd

 include/linux/mm.h                            |   9 ++
 include/linux/mm_types.h                      |  10 +-
 include/linux/userfaultfd_k.h                 |  36 +-----
 mm/memory.c                                   |   2 +
 mm/shmem.c                                    |  20 +++-
 mm/userfaultfd.c                              |  80 +++++++++++---
 .../testing/selftests/kvm/guest_memfd_test.c  | 103 ++++++++++++++++++
 virt/kvm/guest_memfd.c                        |  28 +++++
 8 files changed, 236 insertions(+), 52 deletions(-)


base-commit: 6a23ae0a96a600d1d12557add110e0bb6e32730c
-- 
2.50.1



* [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line
  2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
@ 2025-11-25 18:38 ` Mike Rapoport
  2025-11-26 15:05   ` Liam R. Howlett
  2025-11-25 18:38 ` [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

vma_can_userfault() has grown pretty big and it's not called on a
performance-critical path.

Move it out of line.

No functional changes.

Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 include/linux/userfaultfd_k.h | 36 ++---------------------------------
 mm/userfaultfd.c              | 34 +++++++++++++++++++++++++++++++++
 2 files changed, 36 insertions(+), 34 deletions(-)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26a..e4f43e7b063f 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -208,40 +208,8 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return vma->vm_flags & __VM_UFFD_FLAGS;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma,
-				     vm_flags_t vm_flags,
-				     bool wp_async)
-{
-	vm_flags &= __VM_UFFD_FLAGS;
-
-	if (vma->vm_flags & VM_DROPPABLE)
-		return false;
-
-	if ((vm_flags & VM_UFFD_MINOR) &&
-	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
-		return false;
-
-	/*
-	 * If wp async enabled, and WP is the only mode enabled, allow any
-	 * memory type.
-	 */
-	if (wp_async && (vm_flags == VM_UFFD_WP))
-		return true;
-
-#ifndef CONFIG_PTE_MARKER_UFFD_WP
-	/*
-	 * If user requested uffd-wp but not enabled pte markers for
-	 * uffd-wp, then shmem & hugetlbfs are not supported but only
-	 * anonymous.
-	 */
-	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
-		return false;
-#endif
-
-	/* By default, allow any of anon|shmem|hugetlb */
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
-}
+bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
+		       bool wp_async);
 
 static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
 {
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index af61b95c89e4..8dc964389b0d 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -1977,6 +1977,40 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
 	return moved ? moved : err;
 }
 
+bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
+		       bool wp_async)
+{
+	vm_flags &= __VM_UFFD_FLAGS;
+
+	if (vma->vm_flags & VM_DROPPABLE)
+		return false;
+
+	if ((vm_flags & VM_UFFD_MINOR) &&
+	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
+		return false;
+
+	/*
+	 * If wp async enabled, and WP is the only mode enabled, allow any
+	 * memory type.
+	 */
+	if (wp_async && (vm_flags == VM_UFFD_WP))
+		return true;
+
+#ifndef CONFIG_PTE_MARKER_UFFD_WP
+	/*
+	 * If user requested uffd-wp but not enabled pte markers for
+	 * uffd-wp, then shmem & hugetlbfs are not supported but only
+	 * anonymous.
+	 */
+	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
+		return false;
+#endif
+
+	/* By default, allow any of anon|shmem|hugetlb */
+	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
+	    vma_is_shmem(vma);
+}
+
 static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
 				     vm_flags_t vm_flags)
 {
-- 
2.50.1



* [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
  2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
  2025-11-25 18:38 ` [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line Mike Rapoport
@ 2025-11-25 18:38 ` Mike Rapoport
  2025-11-26 10:21   ` David Hildenbrand (Red Hat)
  2025-11-26 15:11   ` Liam R. Howlett
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
                   ` (2 subsequent siblings)
  4 siblings, 2 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

When userspace resolves a page fault in a shmem VMA with UFFDIO_CONTINUE,
it needs to get a folio that already exists in the pagecache backing
that VMA.

Instead of using shmem_get_folio() for that, add a get_folio() method to
'struct vm_operations_struct' that will return a folio if it exists in
the VMA's pagecache at the given pgoff.

Implement the get_folio() method for shmem and slightly refactor
userfaultfd's mfill_atomic() and mfill_atomic_pte_continue() to support
this new API.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
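The get_folio() contract and the error mapping described in the commit
message above can be modelled in plain userspace C. This is only a sketch
under stated assumptions: every toy_* name is hypothetical, the ERR_PTR
helpers are simplified stand-ins for the kernel ones, and nothing here
touches real folios or locking.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define TOY_MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *ptr) { return (long)(intptr_t)ptr; }
static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-TOY_MAX_ERRNO;
}

#define TOY_NR_SLOTS 4

struct toy_folio { int id; };
struct toy_inode { struct toy_folio *cache[TOY_NR_SLOTS]; };

/* Toy analogue of the new vm_ops callback (hypothetical type). */
struct toy_vm_ops {
	struct toy_folio *(*get_folio)(struct toy_inode *inode,
				       unsigned long pgoff);
};

/* Hypothetical backend: look up only, never allocate. */
static struct toy_folio *toy_get_folio(struct toy_inode *inode,
				       unsigned long pgoff)
{
	if (pgoff >= TOY_NR_SLOTS || !inode->cache[pgoff])
		return ERR_PTR(-ENOENT);
	return inode->cache[pgoff];
}

/* CONTINUE is only possible when the callback is provided. */
static int toy_can_continue(const struct toy_vm_ops *ops)
{
	return ops && ops->get_folio;
}

/* Same error mapping as the patched mfill_atomic_pte_continue(). */
static int toy_continue(const struct toy_vm_ops *ops,
			struct toy_inode *inode, unsigned long pgoff)
{
	struct toy_folio *folio = ops->get_folio(inode, pgoff);

	if (IS_ERR(folio) || !folio) {
		if (!folio || PTR_ERR(folio) == -ENOENT)
			return -EFAULT;	/* caller expects -EFAULT */
		return (int)PTR_ERR(folio);
	}
	return 0;	/* the real code goes on to install the PTE */
}
```

A missing folio (NULL or -ENOENT) is reported as -EFAULT, any other
error is passed through unchanged, and a mapping without the callback is
rejected before the lookup is attempted.
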
 include/linux/mm.h |  9 ++++++++
 mm/shmem.c         | 18 ++++++++++++++++
 mm/userfaultfd.c   | 52 +++++++++++++++++++++++++++++-----------------
 3 files changed, 60 insertions(+), 19 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7c79b3369b82..c8647707d75b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -690,6 +690,15 @@ struct vm_operations_struct {
 	struct page *(*find_normal_page)(struct vm_area_struct *vma,
 					 unsigned long addr);
 #endif /* CONFIG_FIND_NORMAL_PAGE */
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Called by userfault to resolve UFFDIO_CONTINUE request.
+	 * Should return the folio found at pgoff in the VMA's pagecache if it
+	 * exists or ERR_PTR otherwise.
+	 * The returned folio is locked and with reference held.
+	 */
+	struct folio *(*get_folio)(struct inode *inode, pgoff_t pgoff);
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/mm/shmem.c b/mm/shmem.c
index 58701d14dd96..e16c7c8c3e1e 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3263,6 +3263,18 @@ int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 	shmem_inode_unacct_blocks(inode, 1);
 	return ret;
 }
+
+static struct folio *shmem_get_folio_noalloc(struct inode *inode, pgoff_t pgoff)
+{
+	struct folio *folio;
+	int err;
+
+	err = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
+	if (err)
+		return ERR_PTR(err);
+
+	return folio;
+}
 #endif /* CONFIG_USERFAULTFD */
 
 #ifdef CONFIG_TMPFS
@@ -5295,6 +5307,9 @@ static const struct vm_operations_struct shmem_vm_ops = {
 	.set_policy     = shmem_set_policy,
 	.get_policy     = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.get_folio	= shmem_get_folio_noalloc,
+#endif
 };
 
 static const struct vm_operations_struct shmem_anon_vm_ops = {
@@ -5304,6 +5319,9 @@ static const struct vm_operations_struct shmem_anon_vm_ops = {
 	.set_policy     = shmem_set_policy,
 	.get_policy     = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.get_folio	= shmem_get_folio_noalloc,
+#endif
 };
 
 int shmem_init_fs_context(struct fs_context *fc)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 8dc964389b0d..9f0f879b603a 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -388,15 +388,12 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 	struct page *page;
 	int ret;
 
-	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
+	folio = dst_vma->vm_ops->get_folio(inode, pgoff);
 	/* Our caller expects us to return -EFAULT if we failed to find folio */
-	if (ret == -ENOENT)
-		ret = -EFAULT;
-	if (ret)
-		goto out;
-	if (!folio) {
-		ret = -EFAULT;
-		goto out;
+	if (IS_ERR_OR_NULL(folio)) {
+		if (PTR_ERR(folio) == -ENOENT || !folio)
+			return -EFAULT;
+		return PTR_ERR(folio);
 	}
 
 	page = folio_file_page(folio, pgoff);
@@ -411,13 +408,12 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 		goto out_release;
 
 	folio_unlock(folio);
-	ret = 0;
-out:
-	return ret;
+	return 0;
+
 out_release:
 	folio_unlock(folio);
 	folio_put(folio);
-	goto out;
+	return ret;
 }
 
 /* Handles UFFDIO_POISON for all non-hugetlb VMAs. */
@@ -694,6 +690,15 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
 	return err;
 }
 
+static __always_inline bool vma_can_mfill_atomic(struct vm_area_struct *vma,
+						 uffd_flags_t flags)
+{
+	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
+		return vma->vm_ops && vma->vm_ops->get_folio;
+
+	return vma_is_anonymous(vma) || vma_is_shmem(vma);
+}
+
 static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 					    unsigned long dst_start,
 					    unsigned long src_start,
@@ -766,10 +771,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 		return  mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
 					     src_start, len, flags);
 
-	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
-		goto out_unlock;
-	if (!vma_is_shmem(dst_vma) &&
-	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
+	if (!vma_can_mfill_atomic(dst_vma, flags))
 		goto out_unlock;
 
 	while (src_addr < src_start + len) {
@@ -1985,9 +1987,21 @@ bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
 	if (vma->vm_flags & VM_DROPPABLE)
 		return false;
 
-	if ((vm_flags & VM_UFFD_MINOR) &&
-	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
-		return false;
+	if (vm_flags & VM_UFFD_MINOR) {
+		/*
+		 * If only MINOR mode is requested and we can request an
+		 * existing folio from VMA's page cache, allow it
+		 */
+		if (vm_flags == VM_UFFD_MINOR && vma->vm_ops &&
+		    vma->vm_ops->get_folio)
+			return true;
+		/*
+		 * Only hugetlb and shmem can support MINOR mode in combination
+		 * with other modes
+		 */
+		if (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma))
+			return false;
+	}
 
 	/*
 	 * If wp async enabled, and WP is the only mode enabled, allow any
-- 
2.50.1



* [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
  2025-11-25 18:38 ` [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line Mike Rapoport
  2025-11-25 18:38 ` [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
@ 2025-11-25 18:38 ` Mike Rapoport
  2025-11-25 19:21   ` Peter Xu
                     ` (6 more replies)
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
  2025-11-25 18:38 ` [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
  4 siblings, 7 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

When a VMA is registered with userfaultfd in minor mode, its ->fault()
method should check if a folio exists in the page cache and if yes
->fault() should call handle_userfault(VM_UFFD_MINOR).

Instead of calling handle_userfault() directly from a specific ->fault()
implementation, introduce a new fault reason VM_FAULT_UFFD_MINOR that will
notify the core page fault handler that it should call
handle_userfault(VM_UFFD_MINOR) to complete a page fault.

Replace a call to handle_userfault(VM_UFFD_MINOR) in shmem and use the
new VM_FAULT_UFFD_MINOR there instead.

For configurations that don't enable CONFIG_USERFAULTFD,
VM_FAULT_UFFD_MINOR is set to 0.

Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
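The dispatch described in the commit message above can be sketched as a
small userspace model. It is an illustration only: the TOY_* constants,
toy_vma and the toy_* functions are hypothetical stand-ins, and
TOY_USERFAULTFD plays the role of CONFIG_USERFAULTFD. The real
handle_userfault() does far more than bump a counter.

```c
#include <assert.h>

#ifndef TOY_USERFAULTFD
#define TOY_USERFAULTFD 1	/* stand-in for CONFIG_USERFAULTFD */
#endif

typedef unsigned int toy_fault_t;

#define TOY_FAULT_NOPAGE	0x000100u
#define TOY_FAULT_RETRY		0x000400u
#if TOY_USERFAULTFD
#define TOY_FAULT_UFFD_MINOR	0x008000u
#else
#define TOY_FAULT_UFFD_MINOR	0x000000u	/* the test below is never true */
#endif

struct toy_vma {
	int minor_armed;	/* registered in minor mode */
	int folio_present;	/* folio already in the page cache */
};

static int toy_userfault_calls;

/* Stand-in for handle_userfault(): only records that it was called. */
static toy_fault_t toy_handle_userfault(void)
{
	toy_userfault_calls++;
	return TOY_FAULT_RETRY;
}

/* Stand-in for a ->fault() handler: report the minor fault via a flag. */
static toy_fault_t toy_fault(const struct toy_vma *vma)
{
	if (vma->folio_present && vma->minor_armed)
		return TOY_FAULT_UFFD_MINOR;
	return TOY_FAULT_NOPAGE;	/* toy: pretend the page was mapped */
}

/* Stand-in for the core handler: it alone calls into userfaultfd. */
static toy_fault_t toy_do_fault(const struct toy_vma *vma)
{
	toy_fault_t ret = toy_fault(vma);

	if (ret & TOY_FAULT_UFFD_MINOR)
		return toy_handle_userfault();
	return ret;
}
```

With the flag defined as 0, the bit test in toy_do_fault() is always
false, so the userfault call is unreachable without any #ifdef at the
call site.
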
 include/linux/mm_types.h | 10 +++++++++-
 mm/memory.c              |  2 ++
 mm/shmem.c               |  2 +-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 90e5790c318f..df71b057111b 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1523,6 +1523,8 @@ typedef __bitwise unsigned int vm_fault_t;
  *				fsync() to complete (for synchronous page faults
  *				in DAX)
  * @VM_FAULT_COMPLETED:		->fault completed, meanwhile mmap lock released
+ * @VM_FAULT_UFFD_MINOR:	->fault did not modify page tables and needs
+ *				handle_userfault(VM_UFFD_MINOR) to complete
  * @VM_FAULT_HINDEX_MASK:	mask HINDEX value
  *
  */
@@ -1540,6 +1542,11 @@ enum vm_fault_reason {
 	VM_FAULT_DONE_COW       = (__force vm_fault_t)0x001000,
 	VM_FAULT_NEEDDSYNC      = (__force vm_fault_t)0x002000,
 	VM_FAULT_COMPLETED      = (__force vm_fault_t)0x004000,
+#ifdef CONFIG_USERFAULTFD
+	VM_FAULT_UFFD_MINOR	= (__force vm_fault_t)0x008000,
+#else
+	VM_FAULT_UFFD_MINOR	= (__force vm_fault_t)0x000000,
+#endif
 	VM_FAULT_HINDEX_MASK    = (__force vm_fault_t)0x0f0000,
 };
 
@@ -1564,7 +1571,8 @@ enum vm_fault_reason {
 	{ VM_FAULT_FALLBACK,            "FALLBACK" },	\
 	{ VM_FAULT_DONE_COW,            "DONE_COW" },	\
 	{ VM_FAULT_NEEDDSYNC,           "NEEDDSYNC" },	\
-	{ VM_FAULT_COMPLETED,           "COMPLETED" }
+	{ VM_FAULT_COMPLETED,           "COMPLETED" },	\
+	{ VM_FAULT_UFFD_MINOR,		"UFFD_MINOR" },	\
 
 struct vm_special_mapping {
 	const char *name;	/* The name, e.g. "[vdso]". */
diff --git a/mm/memory.c b/mm/memory.c
index b59ae7ce42eb..94acbac8cefb 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -5279,6 +5279,8 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
 	}
 
 	ret = vma->vm_ops->fault(vmf);
+	if (unlikely(ret & VM_FAULT_UFFD_MINOR))
+		return handle_userfault(vmf, VM_UFFD_MINOR);
 	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
 			    VM_FAULT_DONE_COW)))
 		return ret;
diff --git a/mm/shmem.c b/mm/shmem.c
index e16c7c8c3e1e..a9a31c0b5979 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2461,7 +2461,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 	if (folio && vma && userfaultfd_minor(vma)) {
 		if (!xa_is_value(folio))
 			folio_put(folio);
-		*fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
+		*fault_type = VM_FAULT_UFFD_MINOR;
 		return 0;
 	}
 
-- 
2.50.1



* [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
                   ` (2 preceding siblings ...)
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
@ 2025-11-25 18:38 ` Mike Rapoport
  2025-11-26 10:25   ` David Hildenbrand (Red Hat)
                     ` (4 more replies)
  2025-11-25 18:38 ` [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
  4 siblings, 5 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest

From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>

userfaultfd notifications about minor page faults are used for live
migration and snapshotting of VMs with memory backed by shared hugetlbfs
or tmpfs mappings, as described in detail in commit 7677f7fd8be7
("userfaultfd: add minor fault registration mode").

To use the same mechanism for VMs that use guest_memfd to map their memory,
guest_memfd should support userfaultfd minor mode.

Extend the ->fault() method of guest_memfd with the ability to notify the
core page fault handler that a page fault requires
handle_userfault(VM_UFFD_MINOR) to complete, and add an implementation of
->get_folio() to guest_memfd vm_ops.

Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
 virt/kvm/guest_memfd.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index ffadc5ee8e04..2a2b076293f9 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -4,6 +4,7 @@
 #include <linux/kvm_host.h>
 #include <linux/pagemap.h>
 #include <linux/anon_inodes.h>
+#include <linux/userfaultfd_k.h>
 
 #include "kvm_mm.h"
 
@@ -369,6 +370,12 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
 		return vmf_error(err);
 	}
 
+	if (userfaultfd_minor(vmf->vma)) {
+		folio_unlock(folio);
+		folio_put(folio);
+		return VM_FAULT_UFFD_MINOR;
+	}
+
 	if (WARN_ON_ONCE(folio_test_large(folio))) {
 		ret = VM_FAULT_SIGBUS;
 		goto out_folio;
@@ -390,8 +397,29 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
 	return ret;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
+{
+	struct folio *folio;
+
+	folio = kvm_gmem_get_folio(inode, pgoff);
+	if (IS_ERR_OR_NULL(folio))
+		return folio;
+
+	if (!folio_test_uptodate(folio)) {
+		clear_highpage(folio_page(folio, 0));
+		kvm_gmem_mark_prepared(folio);
+	}
+
+	return folio;
+}
+#endif
+
 static const struct vm_operations_struct kvm_gmem_vm_ops = {
 	.fault = kvm_gmem_fault_user_mapping,
+#ifdef CONFIG_USERFAULTFD
+	.get_folio	= kvm_gmem_get_folio,
+#endif
 };
 
 static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
-- 
2.50.1



* [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd
  2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
                   ` (3 preceding siblings ...)
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
@ 2025-11-25 18:38 ` Mike Rapoport
  2025-11-26 15:23   ` Liam R. Howlett
  2025-11-26 16:49   ` Nikita Kalyazin
  4 siblings, 2 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-25 18:38 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Mike Rapoport, Nikita Kalyazin,
	Paolo Bonzini, Peter Xu, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest

From: Nikita Kalyazin <kalyazin@amazon.com>

The test demonstrates that a minor userfaultfd event in guest_memfd can
be resolved via a memcpy followed by a UFFDIO_CONTINUE ioctl.

Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
---
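The test fills the page through a second, unregistered mapping of the
same fd before issuing UFFDIO_CONTINUE. The aliasing property it relies
on can be sketched without userfaultfd or KVM at all. The helper below is
hypothetical and assumes an ordinary temporary file in place of
guest_memfd.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Returns 1 if bytes written through one MAP_SHARED mapping are visible
 * through another mapping of the same file, 0 if not, -1 on setup failure.
 */
static int shared_alias_visible(void)
{
	long page = sysconf(_SC_PAGESIZE);
	FILE *tmp = tmpfile();
	unsigned char *registered, *alias;
	int seen = -1;

	if (!tmp)
		return -1;
	if (page <= 0 || ftruncate(fileno(tmp), page) != 0)
		goto out_close;

	/* Plays the role of 'mem' in the selftest. */
	registered = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED,
			  fileno(tmp), 0);
	if (registered == MAP_FAILED)
		goto out_close;

	/* Plays the role of 'mem_nofault' in the selftest. */
	alias = mmap(NULL, page, PROT_READ | PROT_WRITE, MAP_SHARED,
		     fileno(tmp), 0);
	if (alias == MAP_FAILED)
		goto out_unmap;

	/* Populate through the alias, observe through the other mapping. */
	memset(alias, 0xaa, page);
	seen = registered[0] == 0xaa && registered[page - 1] == 0xaa;

	munmap(alias, page);
out_unmap:
	munmap(registered, page);
out_close:
	fclose(tmp);
	return seen;
}
```

In the selftest the read through the registered mapping additionally
blocks until UFFDIO_CONTINUE installs the page table entry. The sketch
only shows that both mappings share the same page cache contents.
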
 .../testing/selftests/kvm/guest_memfd_test.c  | 103 ++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index e7d9aeb418d3..a5d3ed21d7bb 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -10,13 +10,17 @@
 #include <errno.h>
 #include <stdio.h>
 #include <fcntl.h>
+#include <pthread.h>
 
 #include <linux/bitmap.h>
 #include <linux/falloc.h>
 #include <linux/sizes.h>
+#include <linux/userfaultfd.h>
 #include <sys/mman.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
 
 #include "kvm_util.h"
 #include "test_util.h"
@@ -254,6 +258,104 @@ static void test_guest_memfd_flags(struct kvm_vm *vm)
 	}
 }
 
+struct fault_args {
+	char *addr;
+	volatile char value;
+};
+
+static void *fault_thread_fn(void *arg)
+{
+	struct fault_args *args = arg;
+
+	/* Trigger page fault */
+	args->value = *args->addr;
+	return NULL;
+}
+
+static void test_uffd_minor(int fd, size_t total_size)
+{
+	struct uffdio_api uffdio_api = {
+		.api = UFFD_API,
+		.features = UFFD_FEATURE_MINOR_GENERIC,
+	};
+	struct uffdio_register uffd_reg;
+	struct uffdio_continue uffd_cont;
+	struct uffd_msg msg;
+	struct fault_args args;
+	pthread_t fault_thread;
+	void *mem, *mem_nofault, *buf = NULL;
+	int uffd, ret;
+	off_t offset = page_size;
+	void *fault_addr;
+
+	ret = posix_memalign(&buf, page_size, total_size);
+	TEST_ASSERT_EQ(ret, 0);
+
+	memset(buf, 0xaa, total_size);
+
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
+	TEST_ASSERT(uffd != -1, "userfaultfd creation should succeed");
+
+	ret = ioctl(uffd, UFFDIO_API, &uffdio_api);
+	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_API) should succeed");
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmap should succeed");
+
+	mem_nofault = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem_nofault != MAP_FAILED, "mmap should succeed");
+
+	uffd_reg.range.start = (unsigned long)mem;
+	uffd_reg.range.len = total_size;
+	uffd_reg.mode = UFFDIO_REGISTER_MODE_MINOR;
+	ret = ioctl(uffd, UFFDIO_REGISTER, &uffd_reg);
+	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_REGISTER) should succeed");
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
+			offset, page_size);
+	TEST_ASSERT(!ret, "fallocate(PUNCH_HOLE) should succeed");
+
+	fault_addr = mem + offset;
+	args.addr = fault_addr;
+
+	ret = pthread_create(&fault_thread, NULL, fault_thread_fn, &args);
+	TEST_ASSERT(ret == 0, "pthread_create should succeed");
+
+	ret = read(uffd, &msg, sizeof(msg));
+	TEST_ASSERT(ret != -1, "read from userfaultfd should succeed");
+	TEST_ASSERT(msg.event == UFFD_EVENT_PAGEFAULT, "event type should be pagefault");
+	TEST_ASSERT((void *)(msg.arg.pagefault.address & ~(page_size - 1)) == fault_addr,
+		    "pagefault should occur at expected address");
+
+	memcpy(mem_nofault + offset, buf + offset, page_size);
+
+	uffd_cont.range.start = (unsigned long)fault_addr;
+	uffd_cont.range.len = page_size;
+	uffd_cont.mode = 0;
+	ret = ioctl(uffd, UFFDIO_CONTINUE, &uffd_cont);
+	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_CONTINUE) should succeed");
+
+	/*
+	 * wait for fault_thread to finish to make sure fault happened and was
+	 * resolved before we verify the values
+	 */
+	ret = pthread_join(fault_thread, NULL);
+	TEST_ASSERT(ret == 0, "pthread_join should succeed");
+
+	TEST_ASSERT(args.value == *(char *)(mem_nofault + offset),
+		    "memory should contain the value that was copied");
+	TEST_ASSERT(args.value == *(char *)(mem + offset),
+		    "no further fault is expected");
+
+	ret = munmap(mem_nofault, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+	free(buf);
+	close(uffd);
+}
+
 #define gmem_test(__test, __vm, __flags)				\
 do {									\
 	int fd = vm_create_guest_memfd(__vm, page_size * 4, __flags);	\
@@ -273,6 +375,7 @@ static void __test_guest_memfd(struct kvm_vm *vm, uint64_t flags)
 		if (flags & GUEST_MEMFD_FLAG_INIT_SHARED) {
 			gmem_test(mmap_supported, vm, flags);
 			gmem_test(fault_overflow, vm, flags);
+			gmem_test(uffd_minor, vm, flags);
 		} else {
 			gmem_test(fault_private, vm, flags);
 		}
-- 
2.50.1



* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
@ 2025-11-25 19:21   ` Peter Xu
  2025-11-27 11:18     ` Mike Rapoport
  2025-11-26 10:19   ` David Hildenbrand (Red Hat)
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 33+ messages in thread
From: Peter Xu @ 2025-11-25 19:21 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin,
	Paolo Bonzini, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

Hi, Mike,

On Tue, Nov 25, 2025 at 08:38:38PM +0200, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> When a VMA is registered with userfaulfd in minor mode, its ->fault()
> method should check if a folio exists in the page cache and if yes
> ->fault() should call handle_userfault(VM_UFFD_MISSING).

s/MISSING/MINOR/

> 
> Instead of calling handle_userfault() directly from a specific ->fault()
> implementation introduce new fault reason VM_FAULT_UFFD_MINOR that will
> notify the core page fault handler that it should call
> handle_userfaultfd(VM_UFFD_MISSING) to complete a page fault.

Same.

> 
> Replace a call to handle_userfault(VM_UFFD_MISSING) in shmem and use the

Same.

> new VM_FAULT_UFFD_MINOR there instead.

Personally I'd keep the fault path as simple as possible, because that's
the more frequently used path (rather than when userfaultfd is armed). I
also find it a slight pity that even with the flags introduced, it only
solves the MINOR problem, not MISSING.

If it were me, I'd simply export handle_userfault()..  I confess I still
don't know why exporting it is a problem, but maybe I missed something.

Only my two cents.  Feel free to go with whatever way you prefer.

Thanks,

-- 
Peter Xu



* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
@ 2025-11-26  4:24 kernel test robot
  0 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-26  4:24 UTC (permalink / raw)
  To: oe-kbuild; +Cc: lkp

:::::: 
:::::: Manual check reason: "low confidence bisect report"
:::::: 

BCC: lkp@intel.com
CC: oe-kbuild-all@lists.linux.dev
In-Reply-To: <20251125183840.2368510-4-rppt@kernel.org>
References: <20251125183840.2368510-4-rppt@kernel.org>
TO: Mike Rapoport <rppt@kernel.org>

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-4-rppt%40kernel.org
patch subject: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
:::::: branch date: 10 hours ago
:::::: commit date: 10 hours ago
config: riscv-allnoconfig-bpf (https://download.01.org/0day-ci/archive/20251126/202511261233.RgfmIwhI-lkp@intel.com/config)
compiler: riscv64-linux-gcc (GCC) 15.1.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251126/202511261233.RgfmIwhI-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/r/202511261233.RgfmIwhI-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from ./include/trace/define_trace.h:132,
                    from ./include/trace/events/f2fs.h:2407,
                    from fs/f2fs/super.c:41:
   ./include/trace/events/f2fs.h: In function 'trace_raw_output_f2fs_mmap':
>> ./include/trace/stages/stage3_trace_output.h:70:37: error: expected expression before ',' token
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   ./include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
     219 |         trace_event_printf(iter, print);                                \
         |                                  ^~~~~
   ./include/trace/events/f2fs.h:1432:9: note: in expansion of macro 'TP_printk'
    1432 |         TP_printk("dev = (%d,%d), ino = %lu, index = %lu, flags: %s, ret: %s",
         |         ^~~~~~~~~
   ./include/trace/events/f2fs.h:1436:17: note: in expansion of macro '__print_flags'
    1436 |                 __print_flags(__entry->ret, "|", VM_FAULT_RESULT_TRACE))
         |                 ^~~~~~~~~~~~~


vim +70 ./include/trace/stages/stage3_trace_output.h

1bc191051dca28 include/trace/stages/stage3_defines.h Linus Torvalds          2022-03-23  65  
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  66) #undef __print_flags
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  67) #define __print_flags(flag, delim, flag_array...)			\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  68) 	({								\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  69) 		static const struct trace_print_flags __flags[] =	\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03 @70) 			{ flag_array, { -1, NULL }};			\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  71) 		trace_print_flags_seq(p, delim, flag, __flags);	\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  72) 	})
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  73) 

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
  2025-11-25 19:21   ` Peter Xu
@ 2025-11-26 10:19   ` David Hildenbrand (Red Hat)
  2025-11-26 12:47   ` kernel test robot
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 10:19 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	Hugh Dickins, James Houghton, Liam R. Howlett, Lorenzo Stoakes,
	Michal Hocko, Nikita Kalyazin, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

> @@ -1564,7 +1571,8 @@ enum vm_fault_reason {
>   	{ VM_FAULT_FALLBACK,            "FALLBACK" },	\
>   	{ VM_FAULT_DONE_COW,            "DONE_COW" },	\
>   	{ VM_FAULT_NEEDDSYNC,           "NEEDDSYNC" },	\
> -	{ VM_FAULT_COMPLETED,           "COMPLETED" }
> +	{ VM_FAULT_COMPLETED,           "COMPLETED" },	\
> +	{ VM_FAULT_UFFD_MINOR,		"UFFD_MINOR" },	\
>   
>   struct vm_special_mapping {
>   	const char *name;	/* The name, e.g. "[vdso]". */
> diff --git a/mm/memory.c b/mm/memory.c
> index b59ae7ce42eb..94acbac8cefb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5279,6 +5279,8 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>   	}
>   
>   	ret = vma->vm_ops->fault(vmf);
> +	if (unlikely(ret & VM_FAULT_UFFD_MINOR))
> +		return handle_userfault(vmf, VM_UFFD_MINOR);
>   	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
>   			    VM_FAULT_DONE_COW)))

If we want to reduce the overhead on the fast path, we can simply do

if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
		    VM_FAULT_DONE_COW | VM_FAULT_UFFD_MINOR))) {
	if (unlikely(ret & VM_FAULT_UFFD_MINOR))
		return handle_userfault(vmf, VM_UFFD_MINOR);
	return ret;
}

Maybe the compiler already does that to improve the likely case.
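
For illustration, the two forms can be compared in a small userspace
mock. Everything here is hypothetical: the DEMO_* bit values do not
match the real VM_FAULT_* flags, demo_handle_userfault() only returns a
marker value, and demo_unlikely() assumes a GCC/Clang style
__builtin_expect().

```c
#include <assert.h>

/* Hypothetical flag values; the real VM_FAULT_* bits differ. */
#define DEMO_FAULT_ERROR	0x01u
#define DEMO_FAULT_NOPAGE	0x02u
#define DEMO_FAULT_RETRY	0x04u
#define DEMO_FAULT_DONE_COW	0x08u
#define DEMO_FAULT_UFFD_MINOR	0x10u
#define DEMO_HANDLED_BY_UFFD	0x100u

#define DEMO_SLOW_MASK	(DEMO_FAULT_ERROR | DEMO_FAULT_NOPAGE | \
			 DEMO_FAULT_RETRY | DEMO_FAULT_DONE_COW)

/* Branch hint, assuming a GCC/Clang compatible compiler. */
#define demo_unlikely(x)	__builtin_expect(!!(x), 0)

/* Stand-in for handle_userfault(): only reports that it was called. */
static unsigned int demo_handle_userfault(void)
{
	return DEMO_HANDLED_BY_UFFD;
}

/* Form from the patch: a separate test ahead of the existing one. */
static unsigned int demo_after_fault_v1(unsigned int ret)
{
	if (demo_unlikely(ret & DEMO_FAULT_UFFD_MINOR))
		return demo_handle_userfault();
	if (demo_unlikely(ret & DEMO_SLOW_MASK))
		return ret;
	return 0;	/* fast path: carry on with the returned page */
}

/* Suggested form: one test on the fast path, uffd check nested inside. */
static unsigned int demo_after_fault_v2(unsigned int ret)
{
	if (demo_unlikely(ret & (DEMO_SLOW_MASK | DEMO_FAULT_UFFD_MINOR))) {
		if (demo_unlikely(ret & DEMO_FAULT_UFFD_MINOR))
			return demo_handle_userfault();
		return ret;
	}
	return 0;	/* fast path: carry on with the returned page */
}

/* Compare both forms over every combination of the demo bits. */
static int demo_forms_agree(void)
{
	unsigned int ret;

	for (ret = 0; ret < 0x20u; ret++)
		assert(demo_after_fault_v1(ret) == demo_after_fault_v2(ret));
	return 1;
}
```

Both forms return the same value for every input in the mock; the only
difference is how many tests the common case has to pass through.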

LGTM

-- 
Cheers

David


* Re: [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
  2025-11-25 18:38 ` [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
@ 2025-11-26 10:21   ` David Hildenbrand (Red Hat)
  2025-11-26 15:11   ` Liam R. Howlett
  1 sibling, 0 replies; 33+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 10:21 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	Hugh Dickins, James Houghton, Liam R. Howlett, Lorenzo Stoakes,
	Michal Hocko, Nikita Kalyazin, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On 11/25/25 19:38, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> When userspace resolves a page fault in a shmem VMA with UFFDIO_CONTINUE
> it needs to get a folio that already exists in the pagecache backing
> that VMA.
> 
> Instead of using shmem_get_folio() for that, add a get_folio() method to
> 'struct vm_operations_struct' that will return a folio if it exists in
> the VMA's pagecache at the given pgoff.
> 
> Implement get_folio() method for shmem and slightly refactor
> userfaultfd's mfill_atomic() and mfill_atomic_pte_continue() to support
> this new API.
> 
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---

Acked-by: David Hildenbrand (Red Hat) <david@kernel.org>

-- 
Cheers

David


* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
@ 2025-11-26 10:25   ` David Hildenbrand (Red Hat)
  2025-11-26 15:22   ` Liam R. Howlett
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-26 10:25 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	Hugh Dickins, James Houghton, Liam R. Howlett, Lorenzo Stoakes,
	Michal Hocko, Nikita Kalyazin, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On 11/25/25 19:38, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> userfaultfd notifications about minor page faults are used for live
> migration and snapshotting of VMs with memory backed by shared hugetlbfs or
> tmpfs mappings, as described in detail in commit 7677f7fd8be7 ("userfaultfd:
> add minor fault registration mode").
> 
> To use the same mechanism for VMs that use guest_memfd to map their memory,
> guest_memfd should support userfaultfd minor mode.
> 
> Extend the ->fault() method of guest_memfd with the ability to notify the
> core page fault handler that a page fault requires
> handle_userfault(VM_UFFD_MINOR) to complete, and add an implementation of
> ->get_folio() to the guest_memfd vm_ops.
> 
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---

No exports and still looks clean to me, nice. :)

-- 
Cheers

David


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
  2025-11-25 19:21   ` Peter Xu
  2025-11-26 10:19   ` David Hildenbrand (Red Hat)
@ 2025-11-26 12:47   ` kernel test robot
  2025-11-26 15:19   ` Liam R. Howlett
                     ` (3 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-26 12:47 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: llvm, oe-kbuild-all

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-4-rppt%40kernel.org
patch subject: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
config: arm64-randconfig-002-20251126 (https://download.01.org/0day-ci/archive/20251126/202511262027.beB7ZuYw-lkp@intel.com/config)
compiler: clang version 19.1.7 (https://github.com/llvm/llvm-project cd708029e0b2869e80abe31ddb175f7c35361f90)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251126/202511262027.beB7ZuYw-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511262027.beB7ZuYw-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from fs/dax.c:30:
   In file included from include/trace/events/fs_dax.h:208:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:256:
>> include/trace/events/fs_dax.h:50:3: error: expected expression
      50 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^
   include/trace/stages/stage3_trace_output.h:70:16: note: expanded from macro '__print_flags'
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   In file included from fs/dax.c:30:
   In file included from include/trace/events/fs_dax.h:208:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:256:
   include/trace/events/fs_dax.h:134:3: error: expected expression
     134 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^
   include/trace/stages/stage3_trace_output.h:70:16: note: expanded from macro '__print_flags'
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   2 errors generated.


vim +50 include/trace/events/fs_dax.h

282a8e0391c377 Ross Zwisler            2017-02-22    9  
282a8e0391c377 Ross Zwisler            2017-02-22   10  DECLARE_EVENT_CLASS(dax_pmd_fault_class,
f42003917b4569 Dave Jiang              2017-02-22   11  	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
f42003917b4569 Dave Jiang              2017-02-22   12  		pgoff_t max_pgoff, int result),
f42003917b4569 Dave Jiang              2017-02-22   13  	TP_ARGS(inode, vmf, max_pgoff, result),
282a8e0391c377 Ross Zwisler            2017-02-22   14  	TP_STRUCT__entry(
282a8e0391c377 Ross Zwisler            2017-02-22   15  		__field(unsigned long, ino)
282a8e0391c377 Ross Zwisler            2017-02-22   16  		__field(unsigned long, vm_start)
282a8e0391c377 Ross Zwisler            2017-02-22   17  		__field(unsigned long, vm_end)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18   18  		__field(vm_flags_t, vm_flags)
282a8e0391c377 Ross Zwisler            2017-02-22   19  		__field(unsigned long, address)
282a8e0391c377 Ross Zwisler            2017-02-22   20  		__field(pgoff_t, pgoff)
282a8e0391c377 Ross Zwisler            2017-02-22   21  		__field(pgoff_t, max_pgoff)
282a8e0391c377 Ross Zwisler            2017-02-22   22  		__field(dev_t, dev)
282a8e0391c377 Ross Zwisler            2017-02-22   23  		__field(unsigned int, flags)
282a8e0391c377 Ross Zwisler            2017-02-22   24  		__field(int, result)
282a8e0391c377 Ross Zwisler            2017-02-22   25  	),
282a8e0391c377 Ross Zwisler            2017-02-22   26  	TP_fast_assign(
282a8e0391c377 Ross Zwisler            2017-02-22   27  		__entry->dev = inode->i_sb->s_dev;
282a8e0391c377 Ross Zwisler            2017-02-22   28  		__entry->ino = inode->i_ino;
f42003917b4569 Dave Jiang              2017-02-22   29  		__entry->vm_start = vmf->vma->vm_start;
f42003917b4569 Dave Jiang              2017-02-22   30  		__entry->vm_end = vmf->vma->vm_end;
f42003917b4569 Dave Jiang              2017-02-22   31  		__entry->vm_flags = vmf->vma->vm_flags;
d8a849e1bc1237 Dave Jiang              2017-02-22   32  		__entry->address = vmf->address;
d8a849e1bc1237 Dave Jiang              2017-02-22   33  		__entry->flags = vmf->flags;
d8a849e1bc1237 Dave Jiang              2017-02-22   34  		__entry->pgoff = vmf->pgoff;
282a8e0391c377 Ross Zwisler            2017-02-22   35  		__entry->max_pgoff = max_pgoff;
282a8e0391c377 Ross Zwisler            2017-02-22   36  		__entry->result = result;
282a8e0391c377 Ross Zwisler            2017-02-22   37  	),
282a8e0391c377 Ross Zwisler            2017-02-22   38  	TP_printk("dev %d:%d ino %#lx %s %s address %#lx vm_start "
282a8e0391c377 Ross Zwisler            2017-02-22   39  			"%#lx vm_end %#lx pgoff %#lx max_pgoff %#lx %s",
282a8e0391c377 Ross Zwisler            2017-02-22   40  		MAJOR(__entry->dev),
282a8e0391c377 Ross Zwisler            2017-02-22   41  		MINOR(__entry->dev),
282a8e0391c377 Ross Zwisler            2017-02-22   42  		__entry->ino,
282a8e0391c377 Ross Zwisler            2017-02-22   43  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
282a8e0391c377 Ross Zwisler            2017-02-22   44  		__print_flags(__entry->flags, "|", FAULT_FLAG_TRACE),
282a8e0391c377 Ross Zwisler            2017-02-22   45  		__entry->address,
282a8e0391c377 Ross Zwisler            2017-02-22   46  		__entry->vm_start,
282a8e0391c377 Ross Zwisler            2017-02-22   47  		__entry->vm_end,
282a8e0391c377 Ross Zwisler            2017-02-22   48  		__entry->pgoff,
282a8e0391c377 Ross Zwisler            2017-02-22   49  		__entry->max_pgoff,
282a8e0391c377 Ross Zwisler            2017-02-22  @50  		__print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
282a8e0391c377 Ross Zwisler            2017-02-22   51  	)
282a8e0391c377 Ross Zwisler            2017-02-22   52  )
282a8e0391c377 Ross Zwisler            2017-02-22   53  
282a8e0391c377 Ross Zwisler            2017-02-22   54  #define DEFINE_PMD_FAULT_EVENT(name) \
282a8e0391c377 Ross Zwisler            2017-02-22   55  DEFINE_EVENT(dax_pmd_fault_class, name, \
f42003917b4569 Dave Jiang              2017-02-22   56  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
282a8e0391c377 Ross Zwisler            2017-02-22   57  		pgoff_t max_pgoff, int result), \
f42003917b4569 Dave Jiang              2017-02-22   58  	TP_ARGS(inode, vmf, max_pgoff, result))
282a8e0391c377 Ross Zwisler            2017-02-22   59  
282a8e0391c377 Ross Zwisler            2017-02-22   60  DEFINE_PMD_FAULT_EVENT(dax_pmd_fault);
282a8e0391c377 Ross Zwisler            2017-02-22   61  DEFINE_PMD_FAULT_EVENT(dax_pmd_fault_done);
282a8e0391c377 Ross Zwisler            2017-02-22   62  
653b2ea3396fda Ross Zwisler            2017-02-22   63  DECLARE_EVENT_CLASS(dax_pmd_load_hole_class,
f42003917b4569 Dave Jiang              2017-02-22   64  	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   65) 		struct folio *zero_folio,
653b2ea3396fda Ross Zwisler            2017-02-22   66  		void *radix_entry),
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   67) 	TP_ARGS(inode, vmf, zero_folio, radix_entry),
653b2ea3396fda Ross Zwisler            2017-02-22   68  	TP_STRUCT__entry(
653b2ea3396fda Ross Zwisler            2017-02-22   69  		__field(unsigned long, ino)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18   70  		__field(vm_flags_t, vm_flags)
653b2ea3396fda Ross Zwisler            2017-02-22   71  		__field(unsigned long, address)
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   72) 		__field(struct folio *, zero_folio)
653b2ea3396fda Ross Zwisler            2017-02-22   73  		__field(void *, radix_entry)
653b2ea3396fda Ross Zwisler            2017-02-22   74  		__field(dev_t, dev)
653b2ea3396fda Ross Zwisler            2017-02-22   75  	),
653b2ea3396fda Ross Zwisler            2017-02-22   76  	TP_fast_assign(
653b2ea3396fda Ross Zwisler            2017-02-22   77  		__entry->dev = inode->i_sb->s_dev;
653b2ea3396fda Ross Zwisler            2017-02-22   78  		__entry->ino = inode->i_ino;
f42003917b4569 Dave Jiang              2017-02-22   79  		__entry->vm_flags = vmf->vma->vm_flags;
f42003917b4569 Dave Jiang              2017-02-22   80  		__entry->address = vmf->address;
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   81) 		__entry->zero_folio = zero_folio;
653b2ea3396fda Ross Zwisler            2017-02-22   82  		__entry->radix_entry = radix_entry;
653b2ea3396fda Ross Zwisler            2017-02-22   83  	),
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   84) 	TP_printk("dev %d:%d ino %#lx %s address %#lx zero_folio %p "
653b2ea3396fda Ross Zwisler            2017-02-22   85  			"radix_entry %#lx",
653b2ea3396fda Ross Zwisler            2017-02-22   86  		MAJOR(__entry->dev),
653b2ea3396fda Ross Zwisler            2017-02-22   87  		MINOR(__entry->dev),
653b2ea3396fda Ross Zwisler            2017-02-22   88  		__entry->ino,
653b2ea3396fda Ross Zwisler            2017-02-22   89  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
653b2ea3396fda Ross Zwisler            2017-02-22   90  		__entry->address,
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   91) 		__entry->zero_folio,
653b2ea3396fda Ross Zwisler            2017-02-22   92  		(unsigned long)__entry->radix_entry
653b2ea3396fda Ross Zwisler            2017-02-22   93  	)
653b2ea3396fda Ross Zwisler            2017-02-22   94  )
653b2ea3396fda Ross Zwisler            2017-02-22   95  
653b2ea3396fda Ross Zwisler            2017-02-22   96  #define DEFINE_PMD_LOAD_HOLE_EVENT(name) \
653b2ea3396fda Ross Zwisler            2017-02-22   97  DEFINE_EVENT(dax_pmd_load_hole_class, name, \
f42003917b4569 Dave Jiang              2017-02-22   98  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   99) 		struct folio *zero_folio, void *radix_entry), \
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26  100) 	TP_ARGS(inode, vmf, zero_folio, radix_entry))
653b2ea3396fda Ross Zwisler            2017-02-22  101  
653b2ea3396fda Ross Zwisler            2017-02-22  102  DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole);
653b2ea3396fda Ross Zwisler            2017-02-22  103  DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole_fallback);
282a8e0391c377 Ross Zwisler            2017-02-22  104  
a9c42b33ed8096 Ross Zwisler            2017-05-08  105  DECLARE_EVENT_CLASS(dax_pte_fault_class,
a9c42b33ed8096 Ross Zwisler            2017-05-08  106  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, int result),
a9c42b33ed8096 Ross Zwisler            2017-05-08  107  	TP_ARGS(inode, vmf, result),
a9c42b33ed8096 Ross Zwisler            2017-05-08  108  	TP_STRUCT__entry(
a9c42b33ed8096 Ross Zwisler            2017-05-08  109  		__field(unsigned long, ino)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18  110  		__field(vm_flags_t, vm_flags)
a9c42b33ed8096 Ross Zwisler            2017-05-08  111  		__field(unsigned long, address)
a9c42b33ed8096 Ross Zwisler            2017-05-08  112  		__field(pgoff_t, pgoff)
a9c42b33ed8096 Ross Zwisler            2017-05-08  113  		__field(dev_t, dev)
a9c42b33ed8096 Ross Zwisler            2017-05-08  114  		__field(unsigned int, flags)
a9c42b33ed8096 Ross Zwisler            2017-05-08  115  		__field(int, result)
a9c42b33ed8096 Ross Zwisler            2017-05-08  116  	),
a9c42b33ed8096 Ross Zwisler            2017-05-08  117  	TP_fast_assign(
a9c42b33ed8096 Ross Zwisler            2017-05-08  118  		__entry->dev = inode->i_sb->s_dev;
a9c42b33ed8096 Ross Zwisler            2017-05-08  119  		__entry->ino = inode->i_ino;
a9c42b33ed8096 Ross Zwisler            2017-05-08  120  		__entry->vm_flags = vmf->vma->vm_flags;
a9c42b33ed8096 Ross Zwisler            2017-05-08  121  		__entry->address = vmf->address;
a9c42b33ed8096 Ross Zwisler            2017-05-08  122  		__entry->flags = vmf->flags;
a9c42b33ed8096 Ross Zwisler            2017-05-08  123  		__entry->pgoff = vmf->pgoff;
a9c42b33ed8096 Ross Zwisler            2017-05-08  124  		__entry->result = result;
a9c42b33ed8096 Ross Zwisler            2017-05-08  125  	),
a9c42b33ed8096 Ross Zwisler            2017-05-08  126  	TP_printk("dev %d:%d ino %#lx %s %s address %#lx pgoff %#lx %s",
a9c42b33ed8096 Ross Zwisler            2017-05-08  127  		MAJOR(__entry->dev),
a9c42b33ed8096 Ross Zwisler            2017-05-08  128  		MINOR(__entry->dev),
a9c42b33ed8096 Ross Zwisler            2017-05-08  129  		__entry->ino,
a9c42b33ed8096 Ross Zwisler            2017-05-08  130  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
a9c42b33ed8096 Ross Zwisler            2017-05-08  131  		__print_flags(__entry->flags, "|", FAULT_FLAG_TRACE),
a9c42b33ed8096 Ross Zwisler            2017-05-08  132  		__entry->address,
a9c42b33ed8096 Ross Zwisler            2017-05-08  133  		__entry->pgoff,
a9c42b33ed8096 Ross Zwisler            2017-05-08  134  		__print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
a9c42b33ed8096 Ross Zwisler            2017-05-08  135  	)
a9c42b33ed8096 Ross Zwisler            2017-05-08  136  )
a9c42b33ed8096 Ross Zwisler            2017-05-08  137  
a9c42b33ed8096 Ross Zwisler            2017-05-08  138  #define DEFINE_PTE_FAULT_EVENT(name) \
a9c42b33ed8096 Ross Zwisler            2017-05-08  139  DEFINE_EVENT(dax_pte_fault_class, name, \
a9c42b33ed8096 Ross Zwisler            2017-05-08  140  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, int result), \
a9c42b33ed8096 Ross Zwisler            2017-05-08  141  	TP_ARGS(inode, vmf, result))
a9c42b33ed8096 Ross Zwisler            2017-05-08  142  
a9c42b33ed8096 Ross Zwisler            2017-05-08  143  DEFINE_PTE_FAULT_EVENT(dax_pte_fault);
a9c42b33ed8096 Ross Zwisler            2017-05-08  144  DEFINE_PTE_FAULT_EVENT(dax_pte_fault_done);
678c9fd0430a14 Ross Zwisler            2017-05-08  145  DEFINE_PTE_FAULT_EVENT(dax_load_hole);
71eab6dfd91eab Jan Kara                2017-11-01  146  DEFINE_PTE_FAULT_EVENT(dax_insert_pfn_mkwrite_no_entry);
71eab6dfd91eab Jan Kara                2017-11-01  147  DEFINE_PTE_FAULT_EVENT(dax_insert_pfn_mkwrite);
a9c42b33ed8096 Ross Zwisler            2017-05-08  148  
d14a3f48a152b7 Ross Zwisler            2017-05-08  149  DECLARE_EVENT_CLASS(dax_writeback_range_class,
d14a3f48a152b7 Ross Zwisler            2017-05-08  150  	TP_PROTO(struct inode *inode, pgoff_t start_index, pgoff_t end_index),
d14a3f48a152b7 Ross Zwisler            2017-05-08  151  	TP_ARGS(inode, start_index, end_index),
d14a3f48a152b7 Ross Zwisler            2017-05-08  152  	TP_STRUCT__entry(
d14a3f48a152b7 Ross Zwisler            2017-05-08  153  		__field(unsigned long, ino)
d14a3f48a152b7 Ross Zwisler            2017-05-08  154  		__field(pgoff_t, start_index)
d14a3f48a152b7 Ross Zwisler            2017-05-08  155  		__field(pgoff_t, end_index)
d14a3f48a152b7 Ross Zwisler            2017-05-08  156  		__field(dev_t, dev)
d14a3f48a152b7 Ross Zwisler            2017-05-08  157  	),
d14a3f48a152b7 Ross Zwisler            2017-05-08  158  	TP_fast_assign(
d14a3f48a152b7 Ross Zwisler            2017-05-08  159  		__entry->dev = inode->i_sb->s_dev;
d14a3f48a152b7 Ross Zwisler            2017-05-08  160  		__entry->ino = inode->i_ino;
d14a3f48a152b7 Ross Zwisler            2017-05-08  161  		__entry->start_index = start_index;
d14a3f48a152b7 Ross Zwisler            2017-05-08  162  		__entry->end_index = end_index;
d14a3f48a152b7 Ross Zwisler            2017-05-08  163  	),
d14a3f48a152b7 Ross Zwisler            2017-05-08  164  	TP_printk("dev %d:%d ino %#lx pgoff %#lx-%#lx",
d14a3f48a152b7 Ross Zwisler            2017-05-08  165  		MAJOR(__entry->dev),
d14a3f48a152b7 Ross Zwisler            2017-05-08  166  		MINOR(__entry->dev),
d14a3f48a152b7 Ross Zwisler            2017-05-08  167  		__entry->ino,
d14a3f48a152b7 Ross Zwisler            2017-05-08  168  		__entry->start_index,
d14a3f48a152b7 Ross Zwisler            2017-05-08  169  		__entry->end_index
d14a3f48a152b7 Ross Zwisler            2017-05-08  170  	)
d14a3f48a152b7 Ross Zwisler            2017-05-08  171  )
d14a3f48a152b7 Ross Zwisler            2017-05-08  172  
d14a3f48a152b7 Ross Zwisler            2017-05-08  173  #define DEFINE_WRITEBACK_RANGE_EVENT(name) \
d14a3f48a152b7 Ross Zwisler            2017-05-08  174  DEFINE_EVENT(dax_writeback_range_class, name, \
d14a3f48a152b7 Ross Zwisler            2017-05-08  175  	TP_PROTO(struct inode *inode, pgoff_t start_index, pgoff_t end_index),\
d14a3f48a152b7 Ross Zwisler            2017-05-08  176  	TP_ARGS(inode, start_index, end_index))
d14a3f48a152b7 Ross Zwisler            2017-05-08  177  
d14a3f48a152b7 Ross Zwisler            2017-05-08  178  DEFINE_WRITEBACK_RANGE_EVENT(dax_writeback_range);
d14a3f48a152b7 Ross Zwisler            2017-05-08  179  DEFINE_WRITEBACK_RANGE_EVENT(dax_writeback_range_done);
d14a3f48a152b7 Ross Zwisler            2017-05-08  180  
f9bc3a07539bc8 Ross Zwisler            2017-05-08  181  TRACE_EVENT(dax_writeback_one,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  182  	TP_PROTO(struct inode *inode, pgoff_t pgoff, pgoff_t pglen),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  183  	TP_ARGS(inode, pgoff, pglen),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  184  	TP_STRUCT__entry(
f9bc3a07539bc8 Ross Zwisler            2017-05-08  185  		__field(unsigned long, ino)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  186  		__field(pgoff_t, pgoff)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  187  		__field(pgoff_t, pglen)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  188  		__field(dev_t, dev)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  189  	),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  190  	TP_fast_assign(
f9bc3a07539bc8 Ross Zwisler            2017-05-08  191  		__entry->dev = inode->i_sb->s_dev;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  192  		__entry->ino = inode->i_ino;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  193  		__entry->pgoff = pgoff;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  194  		__entry->pglen = pglen;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  195  	),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  196  	TP_printk("dev %d:%d ino %#lx pgoff %#lx pglen %#lx",
f9bc3a07539bc8 Ross Zwisler            2017-05-08  197  		MAJOR(__entry->dev),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  198  		MINOR(__entry->dev),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  199  		__entry->ino,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  200  		__entry->pgoff,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  201  		__entry->pglen
f9bc3a07539bc8 Ross Zwisler            2017-05-08  202  	)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  203  )
f9bc3a07539bc8 Ross Zwisler            2017-05-08  204  
282a8e0391c377 Ross Zwisler            2017-02-22  205  #endif /* _TRACE_FS_DAX_H */
282a8e0391c377 Ross Zwisler            2017-02-22  206  
282a8e0391c377 Ross Zwisler            2017-02-22  207  /* This part must be outside protection */
282a8e0391c377 Ross Zwisler            2017-02-22 @208  #include <trace/define_trace.h>

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line
  2025-11-25 18:38 ` [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line Mike Rapoport
@ 2025-11-26 15:05   ` Liam R. Howlett
  0 siblings, 0 replies; 33+ messages in thread
From: Liam R. Howlett @ 2025-11-26 15:05 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest,
	David Hildenbrand (Red Hat)

* Mike Rapoport <rppt@kernel.org> [251125 13:39]:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> vma_can_userfault() has grown pretty big and it's not called on a
> performance-critical path.
> 
> Move it out of line.
> 
> No functional changes.
> 
> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  include/linux/userfaultfd_k.h | 36 ++---------------------------------
>  mm/userfaultfd.c              | 34 +++++++++++++++++++++++++++++++++
>  2 files changed, 36 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index c0e716aec26a..e4f43e7b063f 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -208,40 +208,8 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
>  	return vma->vm_flags & __VM_UFFD_FLAGS;
>  }
>  
> -static inline bool vma_can_userfault(struct vm_area_struct *vma,
> -				     vm_flags_t vm_flags,
> -				     bool wp_async)
> -{
> -	vm_flags &= __VM_UFFD_FLAGS;
> -
> -	if (vma->vm_flags & VM_DROPPABLE)
> -		return false;
> -
> -	if ((vm_flags & VM_UFFD_MINOR) &&
> -	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
> -		return false;
> -
> -	/*
> -	 * If wp async enabled, and WP is the only mode enabled, allow any
> -	 * memory type.
> -	 */
> -	if (wp_async && (vm_flags == VM_UFFD_WP))
> -		return true;
> -
> -#ifndef CONFIG_PTE_MARKER_UFFD_WP
> -	/*
> -	 * If user requested uffd-wp but not enabled pte markers for
> -	 * uffd-wp, then shmem & hugetlbfs are not supported but only
> -	 * anonymous.
> -	 */
> -	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
> -		return false;
> -#endif
> -
> -	/* By default, allow any of anon|shmem|hugetlb */
> -	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
> -	    vma_is_shmem(vma);
> -}
> +bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
> +		       bool wp_async);
>  
>  static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
>  {
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index af61b95c89e4..8dc964389b0d 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -1977,6 +1977,40 @@ ssize_t move_pages(struct userfaultfd_ctx *ctx, unsigned long dst_start,
>  	return moved ? moved : err;
>  }
>  
> +bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
> +		       bool wp_async)
> +{
> +	vm_flags &= __VM_UFFD_FLAGS;
> +
> +	if (vma->vm_flags & VM_DROPPABLE)
> +		return false;
> +
> +	if ((vm_flags & VM_UFFD_MINOR) &&
> +	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
> +		return false;
> +
> +	/*
> +	 * If wp async enabled, and WP is the only mode enabled, allow any
> +	 * memory type.
> +	 */
> +	if (wp_async && (vm_flags == VM_UFFD_WP))
> +		return true;
> +
> +#ifndef CONFIG_PTE_MARKER_UFFD_WP
> +	/*
> +	 * If user requested uffd-wp but not enabled pte markers for
> +	 * uffd-wp, then shmem & hugetlbfs are not supported but only
> +	 * anonymous.
> +	 */
> +	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
> +		return false;
> +#endif
> +
> +	/* By default, allow any of anon|shmem|hugetlb */
> +	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
> +	    vma_is_shmem(vma);
> +}
> +
>  static void userfaultfd_set_vm_flags(struct vm_area_struct *vma,
>  				     vm_flags_t vm_flags)
>  {
> -- 
> 2.50.1
> 


* Re: [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE
  2025-11-25 18:38 ` [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
  2025-11-26 10:21   ` David Hildenbrand (Red Hat)
@ 2025-11-26 15:11   ` Liam R. Howlett
  1 sibling, 0 replies; 33+ messages in thread
From: Liam R. Howlett @ 2025-11-26 15:11 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

* Mike Rapoport <rppt@kernel.org> [251125 13:39]:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> When userspace resolves a page fault in a shmem VMA with UFFDIO_CONTINUE
> it needs to get a folio that already exists in the pagecache backing
> that VMA.
> 
> Instead of using shmem_get_folio() for that, add a get_folio() method to
> 'struct vm_operations_struct' that will return a folio if it exists in
> the VMA's pagecache at given pgoff.
> 
> Implement get_folio() method for shmem and slightly refactor
> userfaultfd's mfill_atomic() and mfill_atomic_pte_continue() to support
> this new API.
> 
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  include/linux/mm.h |  9 ++++++++
>  mm/shmem.c         | 18 ++++++++++++++++
>  mm/userfaultfd.c   | 52 +++++++++++++++++++++++++++++-----------------
>  3 files changed, 60 insertions(+), 19 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 7c79b3369b82..c8647707d75b 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -690,6 +690,15 @@ struct vm_operations_struct {
>  	struct page *(*find_normal_page)(struct vm_area_struct *vma,
>  					 unsigned long addr);
>  #endif /* CONFIG_FIND_NORMAL_PAGE */
> +#ifdef CONFIG_USERFAULTFD
> +	/*
> +	 * Called by userfault to resolve UFFDIO_CONTINUE request.
> +	 * Should return the folio found at pgoff in the VMA's pagecache if it
> +	 * exists or ERR_PTR otherwise.
> +	 * The returned folio is locked and with reference held.
> +	 */
> +	struct folio *(*get_folio)(struct inode *inode, pgoff_t pgoff);
> +#endif
>  };
>  
>  #ifdef CONFIG_NUMA_BALANCING
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 58701d14dd96..e16c7c8c3e1e 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -3263,6 +3263,18 @@ int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
>  	shmem_inode_unacct_blocks(inode, 1);
>  	return ret;
>  }
> +
> +static struct folio *shmem_get_folio_noalloc(struct inode *inode, pgoff_t pgoff)
> +{
> +	struct folio *folio;
> +	int err;
> +
> +	err = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	return folio;
> +}
>  #endif /* CONFIG_USERFAULTFD */
>  
>  #ifdef CONFIG_TMPFS
> @@ -5295,6 +5307,9 @@ static const struct vm_operations_struct shmem_vm_ops = {
>  	.set_policy     = shmem_set_policy,
>  	.get_policy     = shmem_get_policy,
>  #endif
> +#ifdef CONFIG_USERFAULTFD
> +	.get_folio	= shmem_get_folio_noalloc,
> +#endif
>  };
>  
>  static const struct vm_operations_struct shmem_anon_vm_ops = {
> @@ -5304,6 +5319,9 @@ static const struct vm_operations_struct shmem_anon_vm_ops = {
>  	.set_policy     = shmem_set_policy,
>  	.get_policy     = shmem_get_policy,
>  #endif
> +#ifdef CONFIG_USERFAULTFD
> +	.get_folio	= shmem_get_folio_noalloc,
> +#endif
>  };
>  
>  int shmem_init_fs_context(struct fs_context *fc)
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 8dc964389b0d..9f0f879b603a 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -388,15 +388,12 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
>  	struct page *page;
>  	int ret;
>  
> -	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
> +	folio = dst_vma->vm_ops->get_folio(inode, pgoff);
>  	/* Our caller expects us to return -EFAULT if we failed to find folio */
> -	if (ret == -ENOENT)
> -		ret = -EFAULT;
> -	if (ret)
> -		goto out;
> -	if (!folio) {
> -		ret = -EFAULT;
> -		goto out;
> +	if (IS_ERR_OR_NULL(folio)) {
> +		if (PTR_ERR(folio) == -ENOENT || !folio)
> +			return -EFAULT;
> +		return PTR_ERR(folio);
>  	}
>  
>  	page = folio_file_page(folio, pgoff);
> @@ -411,13 +408,12 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
>  		goto out_release;
>  
>  	folio_unlock(folio);
> -	ret = 0;
> -out:
> -	return ret;
> +	return 0;
> +
>  out_release:
>  	folio_unlock(folio);
>  	folio_put(folio);
> -	goto out;
> +	return ret;

I really like this part.
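
For reference, the early-return error mapping in this hunk can be modelled in user space. This is only a sketch under stated assumptions: the sketch_* helpers are hypothetical stand-ins for ERR_PTR(), PTR_ERR() and IS_ERR_OR_NULL(), and the folio is treated as an opaque pointer.

```c
#include <assert.h>	/* assert() is handy for the checks mentioned below */
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: errno values are encoded in the top 4095 pointer values. */
#define SKETCH_MAX_ERRNO 4095

/* Hypothetical stand-in for ERR_PTR(): encode an errno in a pointer. */
static inline void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Hypothetical stand-in for PTR_ERR(): decode the errno again. */
static inline long sketch_ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* Hypothetical stand-in for IS_ERR_OR_NULL(). */
static inline bool sketch_is_err_or_null(const void *ptr)
{
	return !ptr || (uintptr_t)ptr >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Map a get_folio()-style result to the errno the caller expects. */
static int sketch_continue_status(const void *folio)
{
	if (sketch_is_err_or_null(folio)) {
		/* "No folio", in either form, is reported as -EFAULT. */
		if (!folio || sketch_ptr_err(folio) == -ENOENT)
			return -EFAULT;
		/* Any other error is passed through unchanged. */
		return (int)sketch_ptr_err(folio);
	}
	return 0;	/* a real folio: the caller would go on to map it */
}
```

With these definitions, assert(sketch_continue_status(sketch_err_ptr(-ENOENT)) == -EFAULT) holds, while an -ENOMEM result is returned as is.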

>  }
>  
>  /* Handles UFFDIO_POISON for all non-hugetlb VMAs. */
> @@ -694,6 +690,15 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
>  	return err;
>  }
>  
> +static __always_inline bool vma_can_mfill_atomic(struct vm_area_struct *vma,
> +						 uffd_flags_t flags)
> +{
> +	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
> +		return vma->vm_ops && vma->vm_ops->get_folio;
> +
> +	return vma_is_anonymous(vma) || vma_is_shmem(vma);
> +}
> +
>  static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
>  					    unsigned long dst_start,
>  					    unsigned long src_start,
> @@ -766,10 +771,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
>  		return  mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
>  					     src_start, len, flags);
>  
> -	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
> -		goto out_unlock;
> -	if (!vma_is_shmem(dst_vma) &&
> -	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
> +	if (!vma_can_mfill_atomic(dst_vma, flags))
>  		goto out_unlock;
>  
>  	while (src_addr < src_start + len) {
> @@ -1985,9 +1987,21 @@ bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
>  	if (vma->vm_flags & VM_DROPPABLE)
>  		return false;
>  
> -	if ((vm_flags & VM_UFFD_MINOR) &&
> -	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
> -		return false;
> +	if (vm_flags & VM_UFFD_MINOR) {
> +		/*
> +		 * If only MINOR mode is requested and we can request an
> +		 * existing folio from VMA's page cache, allow it
> +		 */
> +		if (vm_flags == VM_UFFD_MINOR && vma->vm_ops &&
> +		    vma->vm_ops->get_folio)
> +			return true;
> +		/*
> +		 * Only hugetlb and shmem can support MINOR mode in combination
> +		 * with other modes
> +		 */
> +		if (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma))
> +			return false;
> +	}
>  
>  	/*
>  	 * If wp async enabled, and WP is the only mode enabled, allow any
> -- 
> 2.50.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
                     ` (2 preceding siblings ...)
  2025-11-26 12:47   ` kernel test robot
@ 2025-11-26 15:19   ` Liam R. Howlett
  2025-11-26 16:49   ` Nikita Kalyazin
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 33+ messages in thread
From: Liam R. Howlett @ 2025-11-26 15:19 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest,
	David Hildenbrand (Red Hat)

* Mike Rapoport <rppt@kernel.org> [251125 13:39]:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> When a VMA is registered with userfaultfd in minor mode, its ->fault()
> method should check if a folio exists in the page cache and if yes
> ->fault() should call handle_userfault(VM_UFFD_MINOR).
> 
> Instead of calling handle_userfault() directly from a specific ->fault()
> implementation introduce new fault reason VM_FAULT_UFFD_MINOR that will
> notify the core page fault handler that it should call
> handle_userfault(VM_UFFD_MINOR) to complete a page fault.
> 
> Replace a call to handle_userfault(VM_UFFD_MINOR) in shmem and use the
> new VM_FAULT_UFFD_MINOR there instead.
> 
> For configurations that don't enable CONFIG_USERFAULTFD,
> VM_FAULT_UFFD_MINOR is set to 0.
> 
> Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Same nit as David, but the rest looks good.

> ---
>  include/linux/mm_types.h | 10 +++++++++-
>  mm/memory.c              |  2 ++
>  mm/shmem.c               |  2 +-
>  3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 90e5790c318f..df71b057111b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1523,6 +1523,8 @@ typedef __bitwise unsigned int vm_fault_t;
>   *				fsync() to complete (for synchronous page faults
>   *				in DAX)
>   * @VM_FAULT_COMPLETED:		->fault completed, meanwhile mmap lock released
> + * @VM_FAULT_UFFD_MINOR:	->fault did not modify page tables and needs
> + *				handle_userfault(VM_UFFD_MINOR) to complete
>   * @VM_FAULT_HINDEX_MASK:	mask HINDEX value
>   *
>   */
> @@ -1540,6 +1542,11 @@ enum vm_fault_reason {
>  	VM_FAULT_DONE_COW       = (__force vm_fault_t)0x001000,
>  	VM_FAULT_NEEDDSYNC      = (__force vm_fault_t)0x002000,
>  	VM_FAULT_COMPLETED      = (__force vm_fault_t)0x004000,
> +#ifdef CONFIG_USERFAULTFD
> +	VM_FAULT_UFFD_MINOR	= (__force vm_fault_t)0x008000,
> +#else
> +	VM_FAULT_UFFD_MINOR	= (__force vm_fault_t)0x000000,
> +#endif
>  	VM_FAULT_HINDEX_MASK    = (__force vm_fault_t)0x0f0000,
>  };
>  
> @@ -1564,7 +1571,8 @@ enum vm_fault_reason {
>  	{ VM_FAULT_FALLBACK,            "FALLBACK" },	\
>  	{ VM_FAULT_DONE_COW,            "DONE_COW" },	\
>  	{ VM_FAULT_NEEDDSYNC,           "NEEDDSYNC" },	\
> -	{ VM_FAULT_COMPLETED,           "COMPLETED" }
> +	{ VM_FAULT_COMPLETED,           "COMPLETED" },	\
> +	{ VM_FAULT_UFFD_MINOR,		"UFFD_MINOR" },	\
>  
>  struct vm_special_mapping {
>  	const char *name;	/* The name, e.g. "[vdso]". */
> diff --git a/mm/memory.c b/mm/memory.c
> index b59ae7ce42eb..94acbac8cefb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5279,6 +5279,8 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>  	}
>  
>  	ret = vma->vm_ops->fault(vmf);
> +	if (unlikely(ret & VM_FAULT_UFFD_MINOR))
> +		return handle_userfault(vmf, VM_UFFD_MINOR);
>  	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
>  			    VM_FAULT_DONE_COW)))

I have the same concern as David here about adding instructions to the
faults that are not UFFD_MINOR. I suspect the compiler will remove the
statement completely when UFFD is disabled (the check then becomes
ret & 0), but it might be worth looking closer at the case where uffd is
enabled. The result won't look as clean, but it might make the assembly
better.
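
To illustrate the "ret & 0" point, here is a minimal user-space sketch. All names are hypothetical: SKETCH_USERFAULTFD stands in for CONFIG_USERFAULTFD and the sketch_* identifiers are not kernel symbols. It only shows why the test is a constant when the flag is defined as 0; it says nothing about the code generated when uffd is enabled.

```c
#include <assert.h>	/* assert() is handy for the checks mentioned below */
#include <stdbool.h>

/* Hypothetical stand-in for CONFIG_USERFAULTFD; defaults to "enabled". */
#ifndef SKETCH_USERFAULTFD
#define SKETCH_USERFAULTFD 1
#endif

typedef unsigned int sketch_fault_t;

enum sketch_fault_reason {
	SKETCH_FAULT_RETRY	= 0x000400,
#if SKETCH_USERFAULTFD
	SKETCH_FAULT_UFFD_MINOR	= 0x008000,
#else
	SKETCH_FAULT_UFFD_MINOR	= 0x000000,
#endif
};

/* True when a ->fault() result asks for the userfaultfd minor path. */
static inline bool sketch_needs_uffd_minor(sketch_fault_t ret)
{
	/*
	 * With the flag defined as 0 this is "ret & 0", a constant false,
	 * so a compiler is free to drop the branch that depends on it.
	 */
	return (ret & SKETCH_FAULT_UFFD_MINOR) != 0;
}
```

In the default (enabled) configuration, assert(sketch_needs_uffd_minor(SKETCH_FAULT_UFFD_MINOR)) holds and a plain SKETCH_FAULT_RETRY result does not take the branch. Building with -DSKETCH_USERFAULTFD=0 makes the helper return false for every input.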

>  		return ret;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e16c7c8c3e1e..a9a31c0b5979 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2461,7 +2461,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>  	if (folio && vma && userfaultfd_minor(vma)) {
>  		if (!xa_is_value(folio))
>  			folio_put(folio);
> -		*fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
> +		*fault_type = VM_FAULT_UFFD_MINOR;
>  		return 0;
>  	}
>  
> -- 
> 2.50.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
  2025-11-26 10:25   ` David Hildenbrand (Red Hat)
@ 2025-11-26 15:22   ` Liam R. Howlett
  2025-11-26 16:21   ` kernel test robot
                     ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Liam R. Howlett @ 2025-11-26 15:22 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

* Mike Rapoport <rppt@kernel.org> [251125 13:39]:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> userfaultfd notifications about minor page faults are used for live
> migration and snapshotting of VMs with memory backed by shared hugetlbfs
> or tmpfs mappings, as described in detail in commit 7677f7fd8be7
> ("userfaultfd: add minor fault registration mode").
> 
> To use the same mechanism for VMs that use guest_memfd to map their memory,
> guest_memfd should support userfaultfd minor mode.
> 
> Extend the ->fault() method of guest_memfd with the ability to notify the
> core page fault handler that a page fault requires
> handle_userfault(VM_UFFD_MINOR) to complete, and add an implementation of
> ->get_folio() to guest_memfd vm_ops.
> 
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  virt/kvm/guest_memfd.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index ffadc5ee8e04..2a2b076293f9 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -4,6 +4,7 @@
>  #include <linux/kvm_host.h>
>  #include <linux/pagemap.h>
>  #include <linux/anon_inodes.h>
> +#include <linux/userfaultfd_k.h>
>  
>  #include "kvm_mm.h"
>  
> @@ -369,6 +370,12 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>  		return vmf_error(err);
>  	}
>  
> +	if (userfaultfd_minor(vmf->vma)) {
> +		folio_unlock(folio);
> +		folio_put(folio);
> +		return VM_FAULT_UFFD_MINOR;
> +	}
> +
>  	if (WARN_ON_ONCE(folio_test_large(folio))) {
>  		ret = VM_FAULT_SIGBUS;
>  		goto out_folio;
> @@ -390,8 +397,29 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>  	return ret;
>  }
>  
> +#ifdef CONFIG_USERFAULTFD
> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
> +{
> +	struct folio *folio;
> +
> +	folio = kvm_gmem_get_folio(inode, pgoff);
> +	if (IS_ERR_OR_NULL(folio))
> +		return folio;
> +
> +	if (!folio_test_uptodate(folio)) {
> +		clear_highpage(folio_page(folio, 0));
> +		kvm_gmem_mark_prepared(folio);
> +	}
> +
> +	return folio;
> +}
> +#endif
> +
>  static const struct vm_operations_struct kvm_gmem_vm_ops = {
>  	.fault = kvm_gmem_fault_user_mapping,
> +#ifdef CONFIG_USERFAULTFD
> +	.get_folio	= kvm_gmem_get_folio,
> +#endif
>  };
>  
>  static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
> -- 
> 2.50.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd
  2025-11-25 18:38 ` [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
@ 2025-11-26 15:23   ` Liam R. Howlett
  2025-11-26 16:49   ` Nikita Kalyazin
  1 sibling, 0 replies; 33+ messages in thread
From: Liam R. Howlett @ 2025-11-26 15:23 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

* Mike Rapoport <rppt@kernel.org> [251125 13:39]:
> From: Nikita Kalyazin <kalyazin@amazon.com>
> 
> The test demonstrates that a minor userfaultfd event in guest_memfd can
> be resolved via a memcpy followed by a UFFDIO_CONTINUE ioctl.
> 
> Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

Acked-by: Liam R. Howlett <Liam.Howlett@oracle.com>

> ---
>  .../testing/selftests/kvm/guest_memfd_test.c  | 103 ++++++++++++++++++
>  1 file changed, 103 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
> index e7d9aeb418d3..a5d3ed21d7bb 100644
> --- a/tools/testing/selftests/kvm/guest_memfd_test.c
> +++ b/tools/testing/selftests/kvm/guest_memfd_test.c
> @@ -10,13 +10,17 @@
>  #include <errno.h>
>  #include <stdio.h>
>  #include <fcntl.h>
> +#include <pthread.h>
>  
>  #include <linux/bitmap.h>
>  #include <linux/falloc.h>
>  #include <linux/sizes.h>
> +#include <linux/userfaultfd.h>
>  #include <sys/mman.h>
>  #include <sys/types.h>
>  #include <sys/stat.h>
> +#include <sys/syscall.h>
> +#include <sys/ioctl.h>
>  
>  #include "kvm_util.h"
>  #include "test_util.h"
> @@ -254,6 +258,104 @@ static void test_guest_memfd_flags(struct kvm_vm *vm)
>  	}
>  }
>  
> +struct fault_args {
> +	char *addr;
> +	volatile char value;
> +};
> +
> +static void *fault_thread_fn(void *arg)
> +{
> +	struct fault_args *args = arg;
> +
> +	/* Trigger page fault */
> +	args->value = *args->addr;
> +	return NULL;
> +}
> +
> +static void test_uffd_minor(int fd, size_t total_size)
> +{
> +	struct uffdio_api uffdio_api = {
> +		.api = UFFD_API,
> +		.features = UFFD_FEATURE_MINOR_GENERIC,
> +	};
> +	struct uffdio_register uffd_reg;
> +	struct uffdio_continue uffd_cont;
> +	struct uffd_msg msg;
> +	struct fault_args args;
> +	pthread_t fault_thread;
> +	void *mem, *mem_nofault, *buf = NULL;
> +	int uffd, ret;
> +	off_t offset = page_size;
> +	void *fault_addr;
> +
> +	ret = posix_memalign(&buf, page_size, total_size);
> +	TEST_ASSERT_EQ(ret, 0);
> +
> +	memset(buf, 0xaa, total_size);
> +
> +	uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
> +	TEST_ASSERT(uffd != -1, "userfaultfd creation should succeed");
> +
> +	ret = ioctl(uffd, UFFDIO_API, &uffdio_api);
> +	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_API) should succeed");
> +
> +	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> +	TEST_ASSERT(mem != MAP_FAILED, "mmap should succeed");
> +
> +	mem_nofault = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> +	TEST_ASSERT(mem_nofault != MAP_FAILED, "mmap should succeed");
> +
> +	uffd_reg.range.start = (unsigned long)mem;
> +	uffd_reg.range.len = total_size;
> +	uffd_reg.mode = UFFDIO_REGISTER_MODE_MINOR;
> +	ret = ioctl(uffd, UFFDIO_REGISTER, &uffd_reg);
> +	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_REGISTER) should succeed");
> +
> +	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
> +			offset, page_size);
> +	TEST_ASSERT(!ret, "fallocate(PUNCH_HOLE) should succeed");
> +
> +	fault_addr = mem + offset;
> +	args.addr = fault_addr;
> +
> +	ret = pthread_create(&fault_thread, NULL, fault_thread_fn, &args);
> +	TEST_ASSERT(ret == 0, "pthread_create should succeed");
> +
> +	ret = read(uffd, &msg, sizeof(msg));
> +	TEST_ASSERT(ret != -1, "read from userfaultfd should succeed");
> +	TEST_ASSERT(msg.event == UFFD_EVENT_PAGEFAULT, "event type should be pagefault");
> +	TEST_ASSERT((void *)(msg.arg.pagefault.address & ~(page_size - 1)) == fault_addr,
> +		    "pagefault should occur at expected address");
> +
> +	memcpy(mem_nofault + offset, buf + offset, page_size);
> +
> +	uffd_cont.range.start = (unsigned long)fault_addr;
> +	uffd_cont.range.len = page_size;
> +	uffd_cont.mode = 0;
> +	ret = ioctl(uffd, UFFDIO_CONTINUE, &uffd_cont);
> +	TEST_ASSERT(ret != -1, "ioctl(UFFDIO_CONTINUE) should succeed");
> +
> +	/*
> +	 * wait for fault_thread to finish to make sure fault happened and was
> +	 * resolved before we verify the values
> +	 */
> +	ret = pthread_join(fault_thread, NULL);
> +	TEST_ASSERT(ret == 0, "pthread_join should succeed");
> +
> +	TEST_ASSERT(args.value == *(char *)(mem_nofault + offset),
> +		    "memory should contain the value that was copied");
> +	TEST_ASSERT(args.value == *(char *)(mem + offset),
> +		    "no further fault is expected");
> +
> +	ret = munmap(mem_nofault, total_size);
> +	TEST_ASSERT(!ret, "munmap should succeed");
> +
> +	ret = munmap(mem, total_size);
> +	TEST_ASSERT(!ret, "munmap should succeed");
> +	free(buf);
> +	close(uffd);
> +}
> +
>  #define gmem_test(__test, __vm, __flags)				\
>  do {									\
>  	int fd = vm_create_guest_memfd(__vm, page_size * 4, __flags);	\
> @@ -273,6 +375,7 @@ static void __test_guest_memfd(struct kvm_vm *vm, uint64_t flags)
>  		if (flags & GUEST_MEMFD_FLAG_INIT_SHARED) {
>  			gmem_test(mmap_supported, vm, flags);
>  			gmem_test(fault_overflow, vm, flags);
> +			gmem_test(uffd_minor, vm, flags);
>  		} else {
>  			gmem_test(fault_private, vm, flags);
>  		}
> -- 
> 2.50.1
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
  2025-11-26 10:25   ` David Hildenbrand (Red Hat)
  2025-11-26 15:22   ` Liam R. Howlett
@ 2025-11-26 16:21   ` kernel test robot
  2025-11-26 16:49   ` Nikita Kalyazin
  2025-11-28  3:27   ` kernel test robot
  4 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-26 16:21 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: llvm, oe-kbuild-all

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-5-rppt%40kernel.org
patch subject: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
config: x86_64-randconfig-011-20251126 (https://download.01.org/0day-ci/archive/20251127/202511270012.rGTVhLaw-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251127/202511270012.rGTVhLaw-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511270012.rGTVhLaw-lkp@intel.com/

All errors (new ones prefixed by >>):

>> arch/x86/kvm/../../../virt/kvm/guest_memfd.c:401:22: error: redefinition of 'kvm_gmem_get_folio'
     401 | static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
         |                      ^
   arch/x86/kvm/../../../virt/kvm/guest_memfd.c:100:22: note: previous definition is here
     100 | static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
         |                      ^
   1 error generated.


vim +/kvm_gmem_get_folio +401 arch/x86/kvm/../../../virt/kvm/guest_memfd.c

   399	
   400	#ifdef CONFIG_USERFAULTFD
 > 401	static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
   402	{
   403		struct folio *folio;
   404	
   405		folio = kvm_gmem_get_folio(inode, pgoff);
   406		if (IS_ERR_OR_NULL(folio))
   407			return folio;
   408	
   409		if (!folio_test_uptodate(folio)) {
   410			clear_highpage(folio_page(folio, 0));
   411			kvm_gmem_mark_prepared(folio);
   412		}
   413	
   414		return folio;
   415	}
   416	#endif
   417	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
                     ` (3 preceding siblings ...)
  2025-11-26 15:19   ` Liam R. Howlett
@ 2025-11-26 16:49   ` Nikita Kalyazin
  2025-11-28  1:48   ` kernel test robot
  2025-11-28  3:07   ` kernel test robot
  6 siblings, 0 replies; 33+ messages in thread
From: Nikita Kalyazin @ 2025-11-26 16:49 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest,
	David Hildenbrand (Red Hat)



On 25/11/2025 18:38, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> When a VMA is registered with userfaultfd in minor mode, its ->fault()
> method should check if a folio exists in the page cache and if yes
> ->fault() should call handle_userfault(VM_UFFD_MINOR).
> 
> Instead of calling handle_userfault() directly from a specific ->fault()
> implementation introduce new fault reason VM_FAULT_UFFD_MINOR that will
> notify the core page fault handler that it should call
> handle_userfault(VM_UFFD_MINOR) to complete a page fault.
> 
> Replace a call to handle_userfault(VM_UFFD_MINOR) in shmem and use the
> new VM_FAULT_UFFD_MINOR there instead.
> 
> For configurations that don't enable CONFIG_USERFAULTFD,
> VM_FAULT_UFFD_MINOR is set to 0.
> 
> Suggested-by: David Hildenbrand (Red Hat) <david@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
>   include/linux/mm_types.h | 10 +++++++++-
>   mm/memory.c              |  2 ++
>   mm/shmem.c               |  2 +-
>   3 files changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index 90e5790c318f..df71b057111b 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -1523,6 +1523,8 @@ typedef __bitwise unsigned int vm_fault_t;
>    *                             fsync() to complete (for synchronous page faults
>    *                             in DAX)
>    * @VM_FAULT_COMPLETED:                ->fault completed, meanwhile mmap lock released
> + * @VM_FAULT_UFFD_MINOR:       ->fault did not modify page tables and needs
> + *                             handle_userfault(VM_UFFD_MINOR) to complete
>    * @VM_FAULT_HINDEX_MASK:      mask HINDEX value
>    *
>    */
> @@ -1540,6 +1542,11 @@ enum vm_fault_reason {
>          VM_FAULT_DONE_COW       = (__force vm_fault_t)0x001000,
>          VM_FAULT_NEEDDSYNC      = (__force vm_fault_t)0x002000,
>          VM_FAULT_COMPLETED      = (__force vm_fault_t)0x004000,
> +#ifdef CONFIG_USERFAULTFD
> +       VM_FAULT_UFFD_MINOR     = (__force vm_fault_t)0x008000,
> +#else
> +       VM_FAULT_UFFD_MINOR     = (__force vm_fault_t)0x000000,
> +#endif
>          VM_FAULT_HINDEX_MASK    = (__force vm_fault_t)0x0f0000,
>   };
> 
> @@ -1564,7 +1571,8 @@ enum vm_fault_reason {
>          { VM_FAULT_FALLBACK,            "FALLBACK" },   \
>          { VM_FAULT_DONE_COW,            "DONE_COW" },   \
>          { VM_FAULT_NEEDDSYNC,           "NEEDDSYNC" },  \
> -       { VM_FAULT_COMPLETED,           "COMPLETED" }
> +       { VM_FAULT_COMPLETED,           "COMPLETED" },  \
> +       { VM_FAULT_UFFD_MINOR,          "UFFD_MINOR" }, \

It looks like we have to keep the last element comma-less, otherwise I'm 
seeing compile errors somewhere in fs/dax.c.
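
A likely explanation, sketched in user space: if a consumer expands the list inside an array initializer and appends its own terminating entry (as, to my understanding, the tracing flag-printing helpers do), a trailing comma in the list turns into ", ," and fails to parse. The names below are hypothetical and only model that pattern.

```c
#include <assert.h>	/* assert() is handy for the checks mentioned below */
#include <stddef.h>

struct sketch_flag_name {
	unsigned long mask;
	const char *name;
};

/* The list ends without a comma: each user decides what follows it. */
#define SKETCH_FAULT_NAMES			\
	{ 0x004000, "COMPLETED" },		\
	{ 0x008000, "UFFD_MINOR" }

/*
 * The consumer appends its own terminator after the list. Had the list
 * ended in a comma, this initializer would contain ", ," and not compile.
 */
static const struct sketch_flag_name sketch_names[] = {
	SKETCH_FAULT_NAMES, { 0, NULL }
};

/* Return the name for an exact mask, or NULL if it is not in the table. */
static const char *sketch_flag_name(unsigned long mask)
{
	const struct sketch_flag_name *entry;

	for (entry = sketch_names; entry->name; entry++)
		if (entry->mask == mask)
			return entry->name;
	return NULL;
}
```

Here assert(sketch_flag_name(0x1) == NULL) holds, and looking up 0x008000 returns the "UFFD_MINOR" entry.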

> 
>   struct vm_special_mapping {
>          const char *name;       /* The name, e.g. "[vdso]". */
> diff --git a/mm/memory.c b/mm/memory.c
> index b59ae7ce42eb..94acbac8cefb 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -5279,6 +5279,8 @@ static vm_fault_t __do_fault(struct vm_fault *vmf)
>          }
> 
>          ret = vma->vm_ops->fault(vmf);
> +       if (unlikely(ret & VM_FAULT_UFFD_MINOR))
> +               return handle_userfault(vmf, VM_UFFD_MINOR);
>          if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
>                              VM_FAULT_DONE_COW)))
>                  return ret;
> diff --git a/mm/shmem.c b/mm/shmem.c
> index e16c7c8c3e1e..a9a31c0b5979 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -2461,7 +2461,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
>          if (folio && vma && userfaultfd_minor(vma)) {
>                  if (!xa_is_value(folio))
>                          folio_put(folio);
> -               *fault_type = handle_userfault(vmf, VM_UFFD_MINOR);
> +               *fault_type = VM_FAULT_UFFD_MINOR;
>                  return 0;
>          }
> 
> --
> 2.50.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
                     ` (2 preceding siblings ...)
  2025-11-26 16:21   ` kernel test robot
@ 2025-11-26 16:49   ` Nikita Kalyazin
  2025-11-27 10:36     ` Mike Rapoport
  2025-11-28  3:27   ` kernel test robot
  4 siblings, 1 reply; 33+ messages in thread
From: Nikita Kalyazin @ 2025-11-26 16:49 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest



On 25/11/2025 18:38, Mike Rapoport wrote:
> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> 
> userfaultfd notifications about minor page faults are used for live
> migration and snapshotting of VMs with memory backed by shared hugetlbfs
> or tmpfs mappings, as described in detail in commit 7677f7fd8be7
> ("userfaultfd: add minor fault registration mode").
> 
> To use the same mechanism for VMs that use guest_memfd to map their memory,
> guest_memfd should support userfaultfd minor mode.
> 
> Extend the ->fault() method of guest_memfd with the ability to notify the
> core page fault handler that a page fault requires
> handle_userfault(VM_UFFD_MINOR) to complete, and add an implementation of
> ->get_folio() to guest_memfd vm_ops.
> 
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
>   virt/kvm/guest_memfd.c | 28 ++++++++++++++++++++++++++++
>   1 file changed, 28 insertions(+)
> 
> diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
> index ffadc5ee8e04..2a2b076293f9 100644
> --- a/virt/kvm/guest_memfd.c
> +++ b/virt/kvm/guest_memfd.c
> @@ -4,6 +4,7 @@
>   #include <linux/kvm_host.h>
>   #include <linux/pagemap.h>
>   #include <linux/anon_inodes.h>
> +#include <linux/userfaultfd_k.h>
> 
>   #include "kvm_mm.h"
> 
> @@ -369,6 +370,12 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>                  return vmf_error(err);
>          }
> 
> +       if (userfaultfd_minor(vmf->vma)) {
> +               folio_unlock(folio);
> +               folio_put(folio);
> +               return VM_FAULT_UFFD_MINOR;
> +       }
> +
>          if (WARN_ON_ONCE(folio_test_large(folio))) {
>                  ret = VM_FAULT_SIGBUS;
>                  goto out_folio;
> @@ -390,8 +397,29 @@ static vm_fault_t kvm_gmem_fault_user_mapping(struct vm_fault *vmf)
>          return ret;
>   }
> 
> +#ifdef CONFIG_USERFAULTFD
> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)

We have to name it differently, otherwise it clashes with the existing 
one in this file.
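
One way this could look, as a user-space sketch only: the new callback gets a distinct name and wraps the existing lookup instead of redefining it. Everything here is hypothetical (the toy folio, the sketch_* names, the fixed four-entry cache) and it is not a proposal for the actual function name.

```c
#include <assert.h>	/* assert() is handy for the checks mentioned below */
#include <stdbool.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Toy folio: one page of data plus an "up to date" flag. */
struct sketch_folio {
	bool uptodate;
	unsigned char data[SKETCH_PAGE_SIZE];
};

static struct sketch_folio sketch_cache[4];

/* Stand-in for the pre-existing lookup helper that already owns the name. */
static struct sketch_folio *sketch_get_folio(unsigned long pgoff)
{
	if (pgoff >= sizeof(sketch_cache) / sizeof(sketch_cache[0]))
		return NULL;
	return &sketch_cache[pgoff];
}

/* New callback under a distinct name, so it can call the helper above. */
static struct sketch_folio *sketch_get_folio_for_uffd(unsigned long pgoff)
{
	struct sketch_folio *folio = sketch_get_folio(pgoff);

	if (!folio)
		return NULL;

	/* Zero the page and mark it prepared the first time it is handed out. */
	if (!folio->uptodate) {
		memset(folio->data, 0, sizeof(folio->data));
		folio->uptodate = true;
	}
	return folio;
}
```

With distinct names the inner call resolves to the lookup helper rather than to the wrapper itself, and assert(sketch_get_folio_for_uffd(1)->uptodate) holds after the first call.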

> +{
> +       struct folio *folio;
> +
> +       folio = kvm_gmem_get_folio(inode, pgoff);

                   ^^

> +       if (IS_ERR_OR_NULL(folio))
> +               return folio;
> +
> +       if (!folio_test_uptodate(folio)) {
> +               clear_highpage(folio_page(folio, 0));
> +               kvm_gmem_mark_prepared(folio);
> +       }
> +
> +       return folio;
> +}
> +#endif
> +
>   static const struct vm_operations_struct kvm_gmem_vm_ops = {
>          .fault = kvm_gmem_fault_user_mapping,
> +#ifdef CONFIG_USERFAULTFD
> +       .get_folio      = kvm_gmem_get_folio,
> +#endif
>   };
> 
>   static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
> --
> 2.50.1
> 


^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd
  2025-11-25 18:38 ` [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
  2025-11-26 15:23   ` Liam R. Howlett
@ 2025-11-26 16:49   ` Nikita Kalyazin
  2025-11-27 10:39     ` Mike Rapoport
  1 sibling, 1 reply; 33+ messages in thread
From: Nikita Kalyazin @ 2025-11-26 16:49 UTC (permalink / raw)
  To: Mike Rapoport, linux-mm
  Cc: Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Baolin Wang,
	David Hildenbrand, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest



On 25/11/2025 18:38, Mike Rapoport wrote:
> From: Nikita Kalyazin <kalyazin@amazon.com>
> 
> The test demonstrates that a minor userfaultfd event in guest_memfd can
> be resolved via a memcpy followed by a UFFDIO_CONTINUE ioctl.
> 
> Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
> Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> ---
>   .../testing/selftests/kvm/guest_memfd_test.c  | 103 ++++++++++++++++++
>   1 file changed, 103 insertions(+)
> 
> diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
> index e7d9aeb418d3..a5d3ed21d7bb 100644
> --- a/tools/testing/selftests/kvm/guest_memfd_test.c
> +++ b/tools/testing/selftests/kvm/guest_memfd_test.c
> @@ -10,13 +10,17 @@
>   #include <errno.h>
>   #include <stdio.h>
>   #include <fcntl.h>
> +#include <pthread.h>
> 
>   #include <linux/bitmap.h>
>   #include <linux/falloc.h>
>   #include <linux/sizes.h>
> +#include <linux/userfaultfd.h>
>   #include <sys/mman.h>
>   #include <sys/types.h>
>   #include <sys/stat.h>
> +#include <sys/syscall.h>
> +#include <sys/ioctl.h>
> 
>   #include "kvm_util.h"
>   #include "test_util.h"
> @@ -254,6 +258,104 @@ static void test_guest_memfd_flags(struct kvm_vm *vm)
>          }
>   }
> 
> +struct fault_args {
> +       char *addr;
> +       volatile char value;
> +};
> +
> +static void *fault_thread_fn(void *arg)
> +{
> +       struct fault_args *args = arg;
> +
> +       /* Trigger page fault */
> +       args->value = *args->addr;
> +       return NULL;
> +}
> +
> +static void test_uffd_minor(int fd, size_t total_size)
> +{
> +       struct uffdio_api uffdio_api = {
> +               .api = UFFD_API,
> +               .features = UFFD_FEATURE_MINOR_GENERIC,

Should it be UFFD_FEATURE_MINOR_SHMEM instead?
UFFD_FEATURE_MINOR_GENERIC was removed in v1.

> +       };
> +       struct uffdio_register uffd_reg;
> +       struct uffdio_continue uffd_cont;
> +       struct uffd_msg msg;
> +       struct fault_args args;
> +       pthread_t fault_thread;
> +       void *mem, *mem_nofault, *buf = NULL;
> +       int uffd, ret;
> +       off_t offset = page_size;
> +       void *fault_addr;
> +
> +       ret = posix_memalign(&buf, page_size, total_size);
> +       TEST_ASSERT_EQ(ret, 0);
> +
> +       memset(buf, 0xaa, total_size);
> +
> +       uffd = syscall(__NR_userfaultfd, O_CLOEXEC);
> +       TEST_ASSERT(uffd != -1, "userfaultfd creation should succeed");
> +
> +       ret = ioctl(uffd, UFFDIO_API, &uffdio_api);
> +       TEST_ASSERT(ret != -1, "ioctl(UFFDIO_API) should succeed");
> +
> +       mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> +       TEST_ASSERT(mem != MAP_FAILED, "mmap should succeed");
> +
> +       mem_nofault = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
> +       TEST_ASSERT(mem_nofault != MAP_FAILED, "mmap should succeed");
> +
> +       uffd_reg.range.start = (unsigned long)mem;
> +       uffd_reg.range.len = total_size;
> +       uffd_reg.mode = UFFDIO_REGISTER_MODE_MINOR;
> +       ret = ioctl(uffd, UFFDIO_REGISTER, &uffd_reg);
> +       TEST_ASSERT(ret != -1, "ioctl(UFFDIO_REGISTER) should succeed");
> +
> +       ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
> +                       offset, page_size);
> +       TEST_ASSERT(!ret, "fallocate(PUNCH_HOLE) should succeed");
> +
> +       fault_addr = mem + offset;
> +       args.addr = fault_addr;
> +
> +       ret = pthread_create(&fault_thread, NULL, fault_thread_fn, &args);
> +       TEST_ASSERT(ret == 0, "pthread_create should succeed");
> +
> +       ret = read(uffd, &msg, sizeof(msg));
> +       TEST_ASSERT(ret != -1, "read from userfaultfd should succeed");
> +       TEST_ASSERT(msg.event == UFFD_EVENT_PAGEFAULT, "event type should be pagefault");
> +       TEST_ASSERT((void *)(msg.arg.pagefault.address & ~(page_size - 1)) == fault_addr,
> +                   "pagefault should occur at expected address");
> +
> +       memcpy(mem_nofault + offset, buf + offset, page_size);
> +
> +       uffd_cont.range.start = (unsigned long)fault_addr;
> +       uffd_cont.range.len = page_size;
> +       uffd_cont.mode = 0;
> +       ret = ioctl(uffd, UFFDIO_CONTINUE, &uffd_cont);
> +       TEST_ASSERT(ret != -1, "ioctl(UFFDIO_CONTINUE) should succeed");
> +
> +       /*
> +        * wait for fault_thread to finish to make sure fault happened and was
> +        * resolved before we verify the values
> +        */
> +       ret = pthread_join(fault_thread, NULL);
> +       TEST_ASSERT(ret == 0, "pthread_join should succeed");
> +
> +       TEST_ASSERT(args.value == *(char *)(mem_nofault + offset),
> +                   "memory should contain the value that was copied");
> +       TEST_ASSERT(args.value == *(char *)(mem + offset),
> +                   "no further fault is expected");
> +
> +       ret = munmap(mem_nofault, total_size);
> +       TEST_ASSERT(!ret, "munmap should succeed");
> +
> +       ret = munmap(mem, total_size);
> +       TEST_ASSERT(!ret, "munmap should succeed");
> +       free(buf);
> +       close(uffd);
> +}
> +
>   #define gmem_test(__test, __vm, __flags)                               \
>   do {                                                                   \
>          int fd = vm_create_guest_memfd(__vm, page_size * 4, __flags);   \
> @@ -273,6 +375,7 @@ static void __test_guest_memfd(struct kvm_vm *vm, uint64_t flags)
>                  if (flags & GUEST_MEMFD_FLAG_INIT_SHARED) {
>                          gmem_test(mmap_supported, vm, flags);
>                          gmem_test(fault_overflow, vm, flags);
> +                       gmem_test(uffd_minor, vm, flags);
>                  } else {
>                          gmem_test(fault_private, vm, flags);
>                  }
> --
> 2.50.1
> 



* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-26 16:49   ` Nikita Kalyazin
@ 2025-11-27 10:36     ` Mike Rapoport
  2025-11-27 11:19       ` Nikita Kalyazin
  2025-11-27 11:27       ` David Hildenbrand (Red Hat)
  0 siblings, 2 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-27 10:36 UTC (permalink / raw)
  To: Nikita Kalyazin
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On Wed, Nov 26, 2025 at 04:49:31PM +0000, Nikita Kalyazin wrote:
> On 25/11/2025 18:38, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > 
> > +#ifdef CONFIG_USERFAULTFD
> > +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
> 
> We have to name it differently, otherwise it clashes with the existing one
> in this file.

It's all David's fault! ;-P
How about kvm_gmem_get_prepared_folio() ?

-- 
Sincerely yours,
Mike.


* Re: [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd
  2025-11-26 16:49   ` Nikita Kalyazin
@ 2025-11-27 10:39     ` Mike Rapoport
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-27 10:39 UTC (permalink / raw)
  To: Nikita Kalyazin
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On Wed, Nov 26, 2025 at 04:49:46PM +0000, Nikita Kalyazin wrote:
> On 25/11/2025 18:38, Mike Rapoport wrote:
> > From: Nikita Kalyazin <kalyazin@amazon.com>
> > 
> > +static void test_uffd_minor(int fd, size_t total_size)
> > +{
> > +       struct uffdio_api uffdio_api = {
> > +               .api = UFFD_API,
> > +               .features = UFFD_FEATURE_MINOR_GENERIC,
> 
> Should it be UFFD_FEATURE_MINOR_SHMEM instead? UFFD_FEATURE_MINOR_GENERIC
> was removed in the v1.

I'll drop .features completely, the checks in UFFDIO_REGISTER are
sufficient.
 

-- 
Sincerely yours,
Mike.


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 19:21   ` Peter Xu
@ 2025-11-27 11:18     ` Mike Rapoport
  2025-11-27 14:10       ` Peter Xu
  0 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2025-11-27 11:18 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin,
	Paolo Bonzini, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

On Tue, Nov 25, 2025 at 02:21:16PM -0500, Peter Xu wrote:
> Hi, Mike,
> 
> On Tue, Nov 25, 2025 at 08:38:38PM +0200, Mike Rapoport wrote:
> > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > 
> > When a VMA is registered with userfaultfd in minor mode, its ->fault()
> > method should check if a folio exists in the page cache and if yes
> > ->fault() should call handle_userfault(VM_UFFD_MISSING).
> 
> s/MISSING/MINOR/

Thanks, fixed. 

> > new VM_FAULT_UFFD_MINOR there instead.
> 
> Personally I'd keep the fault path as simple as possible, because that's
> the more frequently used path (rather than when userfaultfd is armed). I
> also see it slightly a pity that even with flags introduced, it only solves
> the MINOR problem, not MISSING.

With David's suggestion the likely path remains unchanged.

As for MISSING, let's take baby steps. We have enough space in
vm_fault_reason for UFFD_MISSING if we'd want to pull handle_userfault()
from shmem and hugetlb.
 
> If it's me, I'd simply export handle_userfault()..  I confess I still don't
> know why exporting it is a problem, but maybe I missed something.

It's not only about export, it's also about not requiring ->fault()
methods for pte-mapped memory to call handle_userfault().

> Only my two cents.  Feel free to go with whatever way you prefer.
> 
> Thanks,
> 
> -- 
> Peter Xu
> 

-- 
Sincerely yours,
Mike.


* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-27 10:36     ` Mike Rapoport
@ 2025-11-27 11:19       ` Nikita Kalyazin
  2025-11-27 19:04         ` Mike Rapoport
  2025-11-27 11:27       ` David Hildenbrand (Red Hat)
  1 sibling, 1 reply; 33+ messages in thread
From: Nikita Kalyazin @ 2025-11-27 11:19 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest



On 27/11/2025 10:36, Mike Rapoport wrote:
> On Wed, Nov 26, 2025 at 04:49:31PM +0000, Nikita Kalyazin wrote:
>> On 25/11/2025 18:38, Mike Rapoport wrote:
>>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>>
>>> +#ifdef CONFIG_USERFAULTFD
>>> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
>>
>> We have to name it differently, otherwise it clashes with the existing one
>> in this file.
> 
> It's all David's fault! ;-P
> How about kvm_gmem_get_prepared_folio() ?

I'm afraid it may not be ideal, since preparedness tracking is due to be
removed from guest_memfd at some point [1].  Would it be too bad to put
an indication of userfaultfd in the name somehow, given that it's already
guarded by the config?

[1] 
https://lore.kernel.org/linux-coco/20251113230759.1562024-1-michael.roth@amd.com

> 
> --
> Sincerely yours,
> Mike.



* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-27 10:36     ` Mike Rapoport
  2025-11-27 11:19       ` Nikita Kalyazin
@ 2025-11-27 11:27       ` David Hildenbrand (Red Hat)
  1 sibling, 0 replies; 33+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-11-27 11:27 UTC (permalink / raw)
  To: Mike Rapoport, Nikita Kalyazin
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, Hugh Dickins, James Houghton, Liam R. Howlett,
	Lorenzo Stoakes, Michal Hocko, Paolo Bonzini, Peter Xu,
	Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On 11/27/25 11:36, Mike Rapoport wrote:
> On Wed, Nov 26, 2025 at 04:49:31PM +0000, Nikita Kalyazin wrote:
>> On 25/11/2025 18:38, Mike Rapoport wrote:
>>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>>
>>> +#ifdef CONFIG_USERFAULTFD
>>> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
>>
>> We have to name it differently, otherwise it clashes with the existing one
>> in this file.
> 
> It's all David's fault! ;-P

As usual :)

> How about kvm_gmem_get_prepared_folio() ?

Or maybe just spell out that it is for vm_ops

kvm_gmem_vm_ops_get_folio()

-- 
Cheers

David


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-27 11:18     ` Mike Rapoport
@ 2025-11-27 14:10       ` Peter Xu
  2025-11-30 11:05         ` Mike Rapoport
  0 siblings, 1 reply; 33+ messages in thread
From: Peter Xu @ 2025-11-27 14:10 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin,
	Paolo Bonzini, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

On Thu, Nov 27, 2025 at 01:18:10PM +0200, Mike Rapoport wrote:
> On Tue, Nov 25, 2025 at 02:21:16PM -0500, Peter Xu wrote:
> > Hi, Mike,
> > 
> > On Tue, Nov 25, 2025 at 08:38:38PM +0200, Mike Rapoport wrote:
> > > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > > 
> > > When a VMA is registered with userfaultfd in minor mode, its ->fault()
> > > method should check if a folio exists in the page cache and if yes
> > > ->fault() should call handle_userfault(VM_UFFD_MISSING).
> > 
> > s/MISSING/MINOR/
> 
> Thanks, fixed. 
> 
> > > new VM_FAULT_UFFD_MINOR there instead.
> > 
> > Personally I'd keep the fault path as simple as possible, because that's
> > the more frequently used path (rather than when userfaultfd is armed). I
> > also see it slightly a pity that even with flags introduced, it only solves
> > the MINOR problem, not MISSING.
> 
> With David's suggestion the likely path remains unchanged.

It is not about the likely path, it's about introducing flags into the core
path, which makes it harder to follow when that's not strictly required.

Meanwhile, personally I'm also not sure if we should have "unlikely" here..
My gut feeling is in reality we will only have two major use cases:

  (a) when userfaultfd minor isn't in the picture

  (b) when userfaultfd minor registered and actively being used (e.g. in a
      postcopy process)

Then without likely/unlikely, IIUC the hardware should optimize for whichever
path is selected, hence both (a) and (b) perform almost equally well.

My guess is that after adding unlikely, (a) works well but (b) works badly.
We may need to measure it; IIUC it's part of the reason why we sometimes do
not encourage "likely/unlikely".  But that's only my guess, some numbers
would be more helpful.

One thing we can try: add "unlikely", then run sequential MINOR fault
trapping on shmem and measure the time it takes.  We'd better make sure we
don't regress perf there.  I wonder if James / Axel would care about it -
QEMU doesn't yet support minor, but will soon, and we will also prefer
better perf from the start.
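
For reference, a minimal sketch of what the hint under discussion amounts
to in plain C, assuming GCC or Clang.  The macros follow the usual
__builtin_expect() pattern; pick_path_hinted() and pick_path_plain() are
hypothetical stand-ins for a userfaultfd_minor()-style check and are not
taken from the series.  The hint only influences code layout, so both
variants return the same result; any difference would show up only in
timing, which is why numbers are needed.

```c
#include <assert.h>
#include <stdbool.h>

/* Branch hints in the usual __builtin_expect() form (GCC/Clang only). */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

enum fault_path { PATH_REGULAR, PATH_UFFD_MINOR };

/* Hypothetical: minor-mode check annotated as the cold path. */
enum fault_path pick_path_hinted(bool minor_registered)
{
	if (unlikely(minor_registered))
		return PATH_UFFD_MINOR;
	return PATH_REGULAR;
}

/* Hypothetical: same check, left to the branch predictor. */
enum fault_path pick_path_plain(bool minor_registered)
{
	if (minor_registered)
		return PATH_UFFD_MINOR;
	return PATH_REGULAR;
}
```

Comparing the two under case (a) and case (b) workloads would be one way to
get the numbers asked for above.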

> 
> As for MISSING, let's take baby steps. We have enough space in
> vm_fault_reason for UFFD_MISSING if we'd want to pull handle_userfault()
> from shmem and hugetlb.

Yep.

>  
> > If it's me, I'd simply export handle_userfault()..  I confess I still don't
> > know why exporting it is a problem, but maybe I missed something.
> 
> It's not only about export, it's also about not requiring ->fault()
> methods for pte-mapped memory to call handle_userfault().

I also don't see it as a problem.. it's what shmem used to do.  Maybe it's a
personal preference?  If so, I don't have a strong opinion.

Just to mention, if we want, I think we have at least one more option to do
the same thing, but without even introducing a new flag to ->fault()
retval.

That is, when we have get_folio() around, we can essentially do two faults
in sequence, one lighter than the real one, only for minor vmas, something
like (I didn't think deeper, so only a rough idea shown):

__do_fault():
  if (uffd_minor(vma)) {
    ...
    folio = vma->get_folio(...);
    if (folio)
       return handle_userfault(vmf, VM_UFFD_MINOR);
    // fallthrough, which implies a cache miss
  }
  ret = vma->vm_ops->fault(vmf);
  ...

The risk of the above is also perf-wise, but from another angle: it might
slow down the page cache miss case, and only when MINOR is registered (on a
cache miss we'd now need to call both get_folio() and fault()).
However that's likely a less critical case than the unlikely one, and I'm
also guessing that, because get_folio() and fault() share code, the code will
be preheated and it may not be measurable even if we write it like that.

Then maybe we can avoid this new flag completely but also achieve the same
goal.
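
The two-step idea above can be modelled in user space as a rough sketch.
Everything here is hypothetical (struct mock_vma, the mock_* helpers and
the enum values are illustration only, not kernel API); it just shows the
control flow: for a minor-registered VMA, a lookup-only get_folio() runs
first and a hit is reported as a minor fault, while a miss falls through to
the regular ->fault() handler.

```c
#include <assert.h>
#include <stddef.h>

#define NR_PAGES 4

enum fault_result { FAULT_MAPPED, FAULT_UFFD_MINOR, FAULT_SIGBUS };

/* Hypothetical stand-in for a VMA plus its backing page cache. */
struct mock_vma {
	int uffd_minor;		/* registered in minor mode? */
	int cached[NR_PAGES];	/* 1 if the page cache holds this index */
	int get_folio_calls;	/* counters to make the cost visible */
	int fault_calls;
};

/* Lookup only: reports whether a folio exists, never allocates. */
int mock_get_folio(struct mock_vma *vma, size_t pgoff)
{
	vma->get_folio_calls++;
	return pgoff < NR_PAGES && vma->cached[pgoff];
}

/* Regular fault handler: fills the cache on a miss, then maps. */
enum fault_result mock_fault(struct mock_vma *vma, size_t pgoff)
{
	vma->fault_calls++;
	if (pgoff >= NR_PAGES)
		return FAULT_SIGBUS;
	vma->cached[pgoff] = 1;
	return FAULT_MAPPED;
}

/* Two-step dispatch: light lookup first, real fault only on a miss. */
enum fault_result mock_do_fault(struct mock_vma *vma, size_t pgoff)
{
	if (vma->uffd_minor && mock_get_folio(vma, pgoff))
		return FAULT_UFFD_MINOR;
	/* fallthrough: cache miss, or minor mode not registered */
	return mock_fault(vma, pgoff);
}
```

The counters show the cost mentioned above: on a cache miss in a
minor-registered VMA both mock_get_folio() and mock_fault() run, while a VMA
without minor mode never calls the lookup.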

Thanks,

-- 
Peter Xu



* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-27 11:19       ` Nikita Kalyazin
@ 2025-11-27 19:04         ` Mike Rapoport
  2025-11-28 12:15           ` Nikita Kalyazin
  0 siblings, 1 reply; 33+ messages in thread
From: Mike Rapoport @ 2025-11-27 19:04 UTC (permalink / raw)
  To: Nikita Kalyazin
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest

On Thu, Nov 27, 2025 at 11:19:35AM +0000, Nikita Kalyazin wrote:
> 
> 
> On 27/11/2025 10:36, Mike Rapoport wrote:
> > On Wed, Nov 26, 2025 at 04:49:31PM +0000, Nikita Kalyazin wrote:
> > > On 25/11/2025 18:38, Mike Rapoport wrote:
> > > > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > > > 
> > > > +#ifdef CONFIG_USERFAULTFD
> > > > +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
> > > 
> > > We have to name it differently, otherwise it clashes with the existing one
> > > in this file.
> > 
> > It's all David's fault! ;-P
> > How about kvm_gmem_get_prepared_folio() ?
> 
> I'm afraid it may not be ideal due to preparedness tracking being removed
> from guest_memfd at some point [1].  Would it be too bad to add an
> indication to userfaultfd in the name somehow given that it's already
> guarded by the config?

Hmm, shmem also has this clash. There I picked shmem_get_folio_noalloc()
because that describes well what it does: look up the folio in the page cache,
grab it if it's there, or return -ENOENT if it's missing.
That's also what hugetlb does for uffd minor fault.

The guest_memfd implementation I copied from one of the older postings
allocates the folio if it's not in the page cache and it seems to me that
it also should only look up existing folios to keep uffd minor semantics
uniform.
 
Then it makes sense also to name the vm_ops method get_folio_noalloc().
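
A user-space sketch of the "noalloc" semantics described above, under
loud assumptions: struct mock_cache, struct mock_folio and
mock_get_folio_noalloc() are hypothetical names for illustration, and the
int return with an out-parameter is a simplification, not the signature
used in the series.  The point is only the contract: take a reference on
an existing folio, return -ENOENT on a miss, and never populate the cache.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define NR_SLOTS 4

/* Hypothetical folio with a bare reference count. */
struct mock_folio {
	int refcount;
};

/* Hypothetical page cache: a NULL slot means nothing is cached there. */
struct mock_cache {
	struct mock_folio *slots[NR_SLOTS];
};

/*
 * Look up the folio at @index.  On a hit, take a reference, store the
 * folio in @foliop and return 0.  On a miss, return -ENOENT and leave
 * the cache untouched (no allocation).
 */
int mock_get_folio_noalloc(struct mock_cache *cache, size_t index,
			   struct mock_folio **foliop)
{
	struct mock_folio *folio = NULL;

	if (index < NR_SLOTS)
		folio = cache->slots[index];

	*foliop = folio;
	if (!folio)
		return -ENOENT;

	folio->refcount++;
	return 0;
}
```

With this contract a caller resolving a minor fault can tell "present in the
page cache" apart from "missing" without the lookup itself creating the folio.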

-- 
Sincerely yours,
Mike.


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
                     ` (4 preceding siblings ...)
  2025-11-26 16:49   ` Nikita Kalyazin
@ 2025-11-28  1:48   ` kernel test robot
  2025-11-28  3:07   ` kernel test robot
  6 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-28  1:48 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: llvm, oe-kbuild-all

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-4-rppt%40kernel.org
patch subject: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20251128/202511280257.T9fSJoDF-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251128/202511280257.T9fSJoDF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511280257.T9fSJoDF-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from fs/dax.c:30:
   In file included from include/trace/events/fs_dax.h:208:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:256:
>> include/trace/events/fs_dax.h:50:3: error: expected expression
      50 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^
   include/trace/stages/stage3_trace_output.h:70:16: note: expanded from macro '__print_flags'
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   In file included from fs/dax.c:30:
   In file included from include/trace/events/fs_dax.h:208:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:256:
   include/trace/events/fs_dax.h:134:3: error: expected expression
     134 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^
   include/trace/stages/stage3_trace_output.h:70:16: note: expanded from macro '__print_flags'
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   2 errors generated.
--
   In file included from fs/f2fs/super.c:41:
   In file included from include/trace/events/f2fs.h:2407:
   In file included from include/trace/define_trace.h:132:
   In file included from include/trace/trace_events.h:256:
>> include/trace/events/f2fs.h:1436:3: error: expected expression
    1436 |                 __print_flags(__entry->ret, "|", VM_FAULT_RESULT_TRACE))
         |                 ^
   include/trace/stages/stage3_trace_output.h:70:16: note: expanded from macro '__print_flags'
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   1 error generated.


vim +50 include/trace/events/fs_dax.h

282a8e0391c377 Ross Zwisler            2017-02-22    9  
282a8e0391c377 Ross Zwisler            2017-02-22   10  DECLARE_EVENT_CLASS(dax_pmd_fault_class,
f42003917b4569 Dave Jiang              2017-02-22   11  	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
f42003917b4569 Dave Jiang              2017-02-22   12  		pgoff_t max_pgoff, int result),
f42003917b4569 Dave Jiang              2017-02-22   13  	TP_ARGS(inode, vmf, max_pgoff, result),
282a8e0391c377 Ross Zwisler            2017-02-22   14  	TP_STRUCT__entry(
282a8e0391c377 Ross Zwisler            2017-02-22   15  		__field(unsigned long, ino)
282a8e0391c377 Ross Zwisler            2017-02-22   16  		__field(unsigned long, vm_start)
282a8e0391c377 Ross Zwisler            2017-02-22   17  		__field(unsigned long, vm_end)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18   18  		__field(vm_flags_t, vm_flags)
282a8e0391c377 Ross Zwisler            2017-02-22   19  		__field(unsigned long, address)
282a8e0391c377 Ross Zwisler            2017-02-22   20  		__field(pgoff_t, pgoff)
282a8e0391c377 Ross Zwisler            2017-02-22   21  		__field(pgoff_t, max_pgoff)
282a8e0391c377 Ross Zwisler            2017-02-22   22  		__field(dev_t, dev)
282a8e0391c377 Ross Zwisler            2017-02-22   23  		__field(unsigned int, flags)
282a8e0391c377 Ross Zwisler            2017-02-22   24  		__field(int, result)
282a8e0391c377 Ross Zwisler            2017-02-22   25  	),
282a8e0391c377 Ross Zwisler            2017-02-22   26  	TP_fast_assign(
282a8e0391c377 Ross Zwisler            2017-02-22   27  		__entry->dev = inode->i_sb->s_dev;
282a8e0391c377 Ross Zwisler            2017-02-22   28  		__entry->ino = inode->i_ino;
f42003917b4569 Dave Jiang              2017-02-22   29  		__entry->vm_start = vmf->vma->vm_start;
f42003917b4569 Dave Jiang              2017-02-22   30  		__entry->vm_end = vmf->vma->vm_end;
f42003917b4569 Dave Jiang              2017-02-22   31  		__entry->vm_flags = vmf->vma->vm_flags;
d8a849e1bc1237 Dave Jiang              2017-02-22   32  		__entry->address = vmf->address;
d8a849e1bc1237 Dave Jiang              2017-02-22   33  		__entry->flags = vmf->flags;
d8a849e1bc1237 Dave Jiang              2017-02-22   34  		__entry->pgoff = vmf->pgoff;
282a8e0391c377 Ross Zwisler            2017-02-22   35  		__entry->max_pgoff = max_pgoff;
282a8e0391c377 Ross Zwisler            2017-02-22   36  		__entry->result = result;
282a8e0391c377 Ross Zwisler            2017-02-22   37  	),
282a8e0391c377 Ross Zwisler            2017-02-22   38  	TP_printk("dev %d:%d ino %#lx %s %s address %#lx vm_start "
282a8e0391c377 Ross Zwisler            2017-02-22   39  			"%#lx vm_end %#lx pgoff %#lx max_pgoff %#lx %s",
282a8e0391c377 Ross Zwisler            2017-02-22   40  		MAJOR(__entry->dev),
282a8e0391c377 Ross Zwisler            2017-02-22   41  		MINOR(__entry->dev),
282a8e0391c377 Ross Zwisler            2017-02-22   42  		__entry->ino,
282a8e0391c377 Ross Zwisler            2017-02-22   43  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
282a8e0391c377 Ross Zwisler            2017-02-22   44  		__print_flags(__entry->flags, "|", FAULT_FLAG_TRACE),
282a8e0391c377 Ross Zwisler            2017-02-22   45  		__entry->address,
282a8e0391c377 Ross Zwisler            2017-02-22   46  		__entry->vm_start,
282a8e0391c377 Ross Zwisler            2017-02-22   47  		__entry->vm_end,
282a8e0391c377 Ross Zwisler            2017-02-22   48  		__entry->pgoff,
282a8e0391c377 Ross Zwisler            2017-02-22   49  		__entry->max_pgoff,
282a8e0391c377 Ross Zwisler            2017-02-22  @50  		__print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
282a8e0391c377 Ross Zwisler            2017-02-22   51  	)
282a8e0391c377 Ross Zwisler            2017-02-22   52  )
282a8e0391c377 Ross Zwisler            2017-02-22   53  
282a8e0391c377 Ross Zwisler            2017-02-22   54  #define DEFINE_PMD_FAULT_EVENT(name) \
282a8e0391c377 Ross Zwisler            2017-02-22   55  DEFINE_EVENT(dax_pmd_fault_class, name, \
f42003917b4569 Dave Jiang              2017-02-22   56  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
282a8e0391c377 Ross Zwisler            2017-02-22   57  		pgoff_t max_pgoff, int result), \
f42003917b4569 Dave Jiang              2017-02-22   58  	TP_ARGS(inode, vmf, max_pgoff, result))
282a8e0391c377 Ross Zwisler            2017-02-22   59  
282a8e0391c377 Ross Zwisler            2017-02-22   60  DEFINE_PMD_FAULT_EVENT(dax_pmd_fault);
282a8e0391c377 Ross Zwisler            2017-02-22   61  DEFINE_PMD_FAULT_EVENT(dax_pmd_fault_done);
282a8e0391c377 Ross Zwisler            2017-02-22   62  
653b2ea3396fda Ross Zwisler            2017-02-22   63  DECLARE_EVENT_CLASS(dax_pmd_load_hole_class,
f42003917b4569 Dave Jiang              2017-02-22   64  	TP_PROTO(struct inode *inode, struct vm_fault *vmf,
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   65) 		struct folio *zero_folio,
653b2ea3396fda Ross Zwisler            2017-02-22   66  		void *radix_entry),
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   67) 	TP_ARGS(inode, vmf, zero_folio, radix_entry),
653b2ea3396fda Ross Zwisler            2017-02-22   68  	TP_STRUCT__entry(
653b2ea3396fda Ross Zwisler            2017-02-22   69  		__field(unsigned long, ino)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18   70  		__field(vm_flags_t, vm_flags)
653b2ea3396fda Ross Zwisler            2017-02-22   71  		__field(unsigned long, address)
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   72) 		__field(struct folio *, zero_folio)
653b2ea3396fda Ross Zwisler            2017-02-22   73  		__field(void *, radix_entry)
653b2ea3396fda Ross Zwisler            2017-02-22   74  		__field(dev_t, dev)
653b2ea3396fda Ross Zwisler            2017-02-22   75  	),
653b2ea3396fda Ross Zwisler            2017-02-22   76  	TP_fast_assign(
653b2ea3396fda Ross Zwisler            2017-02-22   77  		__entry->dev = inode->i_sb->s_dev;
653b2ea3396fda Ross Zwisler            2017-02-22   78  		__entry->ino = inode->i_ino;
f42003917b4569 Dave Jiang              2017-02-22   79  		__entry->vm_flags = vmf->vma->vm_flags;
f42003917b4569 Dave Jiang              2017-02-22   80  		__entry->address = vmf->address;
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   81) 		__entry->zero_folio = zero_folio;
653b2ea3396fda Ross Zwisler            2017-02-22   82  		__entry->radix_entry = radix_entry;
653b2ea3396fda Ross Zwisler            2017-02-22   83  	),
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   84) 	TP_printk("dev %d:%d ino %#lx %s address %#lx zero_folio %p "
653b2ea3396fda Ross Zwisler            2017-02-22   85  			"radix_entry %#lx",
653b2ea3396fda Ross Zwisler            2017-02-22   86  		MAJOR(__entry->dev),
653b2ea3396fda Ross Zwisler            2017-02-22   87  		MINOR(__entry->dev),
653b2ea3396fda Ross Zwisler            2017-02-22   88  		__entry->ino,
653b2ea3396fda Ross Zwisler            2017-02-22   89  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
653b2ea3396fda Ross Zwisler            2017-02-22   90  		__entry->address,
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   91) 		__entry->zero_folio,
653b2ea3396fda Ross Zwisler            2017-02-22   92  		(unsigned long)__entry->radix_entry
653b2ea3396fda Ross Zwisler            2017-02-22   93  	)
653b2ea3396fda Ross Zwisler            2017-02-22   94  )
653b2ea3396fda Ross Zwisler            2017-02-22   95  
653b2ea3396fda Ross Zwisler            2017-02-22   96  #define DEFINE_PMD_LOAD_HOLE_EVENT(name) \
653b2ea3396fda Ross Zwisler            2017-02-22   97  DEFINE_EVENT(dax_pmd_load_hole_class, name, \
f42003917b4569 Dave Jiang              2017-02-22   98  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, \
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26   99) 		struct folio *zero_folio, void *radix_entry), \
c93012d849c9e3 Matthew Wilcox (Oracle  2024-03-26  100) 	TP_ARGS(inode, vmf, zero_folio, radix_entry))
653b2ea3396fda Ross Zwisler            2017-02-22  101  
653b2ea3396fda Ross Zwisler            2017-02-22  102  DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole);
653b2ea3396fda Ross Zwisler            2017-02-22  103  DEFINE_PMD_LOAD_HOLE_EVENT(dax_pmd_load_hole_fallback);
282a8e0391c377 Ross Zwisler            2017-02-22  104  
a9c42b33ed8096 Ross Zwisler            2017-05-08  105  DECLARE_EVENT_CLASS(dax_pte_fault_class,
a9c42b33ed8096 Ross Zwisler            2017-05-08  106  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, int result),
a9c42b33ed8096 Ross Zwisler            2017-05-08  107  	TP_ARGS(inode, vmf, result),
a9c42b33ed8096 Ross Zwisler            2017-05-08  108  	TP_STRUCT__entry(
a9c42b33ed8096 Ross Zwisler            2017-05-08  109  		__field(unsigned long, ino)
bfbe71109fa40e Lorenzo Stoakes         2025-06-18  110  		__field(vm_flags_t, vm_flags)
a9c42b33ed8096 Ross Zwisler            2017-05-08  111  		__field(unsigned long, address)
a9c42b33ed8096 Ross Zwisler            2017-05-08  112  		__field(pgoff_t, pgoff)
a9c42b33ed8096 Ross Zwisler            2017-05-08  113  		__field(dev_t, dev)
a9c42b33ed8096 Ross Zwisler            2017-05-08  114  		__field(unsigned int, flags)
a9c42b33ed8096 Ross Zwisler            2017-05-08  115  		__field(int, result)
a9c42b33ed8096 Ross Zwisler            2017-05-08  116  	),
a9c42b33ed8096 Ross Zwisler            2017-05-08  117  	TP_fast_assign(
a9c42b33ed8096 Ross Zwisler            2017-05-08  118  		__entry->dev = inode->i_sb->s_dev;
a9c42b33ed8096 Ross Zwisler            2017-05-08  119  		__entry->ino = inode->i_ino;
a9c42b33ed8096 Ross Zwisler            2017-05-08  120  		__entry->vm_flags = vmf->vma->vm_flags;
a9c42b33ed8096 Ross Zwisler            2017-05-08  121  		__entry->address = vmf->address;
a9c42b33ed8096 Ross Zwisler            2017-05-08  122  		__entry->flags = vmf->flags;
a9c42b33ed8096 Ross Zwisler            2017-05-08  123  		__entry->pgoff = vmf->pgoff;
a9c42b33ed8096 Ross Zwisler            2017-05-08  124  		__entry->result = result;
a9c42b33ed8096 Ross Zwisler            2017-05-08  125  	),
a9c42b33ed8096 Ross Zwisler            2017-05-08  126  	TP_printk("dev %d:%d ino %#lx %s %s address %#lx pgoff %#lx %s",
a9c42b33ed8096 Ross Zwisler            2017-05-08  127  		MAJOR(__entry->dev),
a9c42b33ed8096 Ross Zwisler            2017-05-08  128  		MINOR(__entry->dev),
a9c42b33ed8096 Ross Zwisler            2017-05-08  129  		__entry->ino,
a9c42b33ed8096 Ross Zwisler            2017-05-08  130  		__entry->vm_flags & VM_SHARED ? "shared" : "private",
a9c42b33ed8096 Ross Zwisler            2017-05-08  131  		__print_flags(__entry->flags, "|", FAULT_FLAG_TRACE),
a9c42b33ed8096 Ross Zwisler            2017-05-08  132  		__entry->address,
a9c42b33ed8096 Ross Zwisler            2017-05-08  133  		__entry->pgoff,
a9c42b33ed8096 Ross Zwisler            2017-05-08  134  		__print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
a9c42b33ed8096 Ross Zwisler            2017-05-08  135  	)
a9c42b33ed8096 Ross Zwisler            2017-05-08  136  )
a9c42b33ed8096 Ross Zwisler            2017-05-08  137  
a9c42b33ed8096 Ross Zwisler            2017-05-08  138  #define DEFINE_PTE_FAULT_EVENT(name) \
a9c42b33ed8096 Ross Zwisler            2017-05-08  139  DEFINE_EVENT(dax_pte_fault_class, name, \
a9c42b33ed8096 Ross Zwisler            2017-05-08  140  	TP_PROTO(struct inode *inode, struct vm_fault *vmf, int result), \
a9c42b33ed8096 Ross Zwisler            2017-05-08  141  	TP_ARGS(inode, vmf, result))
a9c42b33ed8096 Ross Zwisler            2017-05-08  142  
a9c42b33ed8096 Ross Zwisler            2017-05-08  143  DEFINE_PTE_FAULT_EVENT(dax_pte_fault);
a9c42b33ed8096 Ross Zwisler            2017-05-08  144  DEFINE_PTE_FAULT_EVENT(dax_pte_fault_done);
678c9fd0430a14 Ross Zwisler            2017-05-08  145  DEFINE_PTE_FAULT_EVENT(dax_load_hole);
71eab6dfd91eab Jan Kara                2017-11-01  146  DEFINE_PTE_FAULT_EVENT(dax_insert_pfn_mkwrite_no_entry);
71eab6dfd91eab Jan Kara                2017-11-01  147  DEFINE_PTE_FAULT_EVENT(dax_insert_pfn_mkwrite);
a9c42b33ed8096 Ross Zwisler            2017-05-08  148  
d14a3f48a152b7 Ross Zwisler            2017-05-08  149  DECLARE_EVENT_CLASS(dax_writeback_range_class,
d14a3f48a152b7 Ross Zwisler            2017-05-08  150  	TP_PROTO(struct inode *inode, pgoff_t start_index, pgoff_t end_index),
d14a3f48a152b7 Ross Zwisler            2017-05-08  151  	TP_ARGS(inode, start_index, end_index),
d14a3f48a152b7 Ross Zwisler            2017-05-08  152  	TP_STRUCT__entry(
d14a3f48a152b7 Ross Zwisler            2017-05-08  153  		__field(unsigned long, ino)
d14a3f48a152b7 Ross Zwisler            2017-05-08  154  		__field(pgoff_t, start_index)
d14a3f48a152b7 Ross Zwisler            2017-05-08  155  		__field(pgoff_t, end_index)
d14a3f48a152b7 Ross Zwisler            2017-05-08  156  		__field(dev_t, dev)
d14a3f48a152b7 Ross Zwisler            2017-05-08  157  	),
d14a3f48a152b7 Ross Zwisler            2017-05-08  158  	TP_fast_assign(
d14a3f48a152b7 Ross Zwisler            2017-05-08  159  		__entry->dev = inode->i_sb->s_dev;
d14a3f48a152b7 Ross Zwisler            2017-05-08  160  		__entry->ino = inode->i_ino;
d14a3f48a152b7 Ross Zwisler            2017-05-08  161  		__entry->start_index = start_index;
d14a3f48a152b7 Ross Zwisler            2017-05-08  162  		__entry->end_index = end_index;
d14a3f48a152b7 Ross Zwisler            2017-05-08  163  	),
d14a3f48a152b7 Ross Zwisler            2017-05-08  164  	TP_printk("dev %d:%d ino %#lx pgoff %#lx-%#lx",
d14a3f48a152b7 Ross Zwisler            2017-05-08  165  		MAJOR(__entry->dev),
d14a3f48a152b7 Ross Zwisler            2017-05-08  166  		MINOR(__entry->dev),
d14a3f48a152b7 Ross Zwisler            2017-05-08  167  		__entry->ino,
d14a3f48a152b7 Ross Zwisler            2017-05-08  168  		__entry->start_index,
d14a3f48a152b7 Ross Zwisler            2017-05-08  169  		__entry->end_index
d14a3f48a152b7 Ross Zwisler            2017-05-08  170  	)
d14a3f48a152b7 Ross Zwisler            2017-05-08  171  )
d14a3f48a152b7 Ross Zwisler            2017-05-08  172  
d14a3f48a152b7 Ross Zwisler            2017-05-08  173  #define DEFINE_WRITEBACK_RANGE_EVENT(name) \
d14a3f48a152b7 Ross Zwisler            2017-05-08  174  DEFINE_EVENT(dax_writeback_range_class, name, \
d14a3f48a152b7 Ross Zwisler            2017-05-08  175  	TP_PROTO(struct inode *inode, pgoff_t start_index, pgoff_t end_index),\
d14a3f48a152b7 Ross Zwisler            2017-05-08  176  	TP_ARGS(inode, start_index, end_index))
d14a3f48a152b7 Ross Zwisler            2017-05-08  177  
d14a3f48a152b7 Ross Zwisler            2017-05-08  178  DEFINE_WRITEBACK_RANGE_EVENT(dax_writeback_range);
d14a3f48a152b7 Ross Zwisler            2017-05-08  179  DEFINE_WRITEBACK_RANGE_EVENT(dax_writeback_range_done);
d14a3f48a152b7 Ross Zwisler            2017-05-08  180  
f9bc3a07539bc8 Ross Zwisler            2017-05-08  181  TRACE_EVENT(dax_writeback_one,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  182  	TP_PROTO(struct inode *inode, pgoff_t pgoff, pgoff_t pglen),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  183  	TP_ARGS(inode, pgoff, pglen),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  184  	TP_STRUCT__entry(
f9bc3a07539bc8 Ross Zwisler            2017-05-08  185  		__field(unsigned long, ino)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  186  		__field(pgoff_t, pgoff)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  187  		__field(pgoff_t, pglen)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  188  		__field(dev_t, dev)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  189  	),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  190  	TP_fast_assign(
f9bc3a07539bc8 Ross Zwisler            2017-05-08  191  		__entry->dev = inode->i_sb->s_dev;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  192  		__entry->ino = inode->i_ino;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  193  		__entry->pgoff = pgoff;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  194  		__entry->pglen = pglen;
f9bc3a07539bc8 Ross Zwisler            2017-05-08  195  	),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  196  	TP_printk("dev %d:%d ino %#lx pgoff %#lx pglen %#lx",
f9bc3a07539bc8 Ross Zwisler            2017-05-08  197  		MAJOR(__entry->dev),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  198  		MINOR(__entry->dev),
f9bc3a07539bc8 Ross Zwisler            2017-05-08  199  		__entry->ino,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  200  		__entry->pgoff,
f9bc3a07539bc8 Ross Zwisler            2017-05-08  201  		__entry->pglen
f9bc3a07539bc8 Ross Zwisler            2017-05-08  202  	)
f9bc3a07539bc8 Ross Zwisler            2017-05-08  203  )
f9bc3a07539bc8 Ross Zwisler            2017-05-08  204  
282a8e0391c377 Ross Zwisler            2017-02-22  205  #endif /* _TRACE_FS_DAX_H */
282a8e0391c377 Ross Zwisler            2017-02-22  206  
282a8e0391c377 Ross Zwisler            2017-02-22  207  /* This part must be outside protection */
282a8e0391c377 Ross Zwisler            2017-02-22 @208  #include <trace/define_trace.h>

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
                     ` (5 preceding siblings ...)
  2025-11-28  1:48   ` kernel test robot
@ 2025-11-28  3:07   ` kernel test robot
  6 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-28  3:07 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: oe-kbuild-all

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-4-rppt%40kernel.org
patch subject: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
config: x86_64-rhel-9.4 (https://download.01.org/0day-ci/archive/20251128/202511280432.OrK2ulRs-lkp@intel.com/config)
compiler: gcc-14 (Debian 14.2.0-19) 14.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251128/202511280432.OrK2ulRs-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511280432.OrK2ulRs-lkp@intel.com/

All errors (new ones prefixed by >>):

   In file included from include/trace/define_trace.h:132,
                    from include/trace/events/fs_dax.h:208,
                    from fs/dax.c:30:
   include/trace/events/fs_dax.h: In function 'trace_raw_output_dax_pmd_fault_class':
>> include/trace/stages/stage3_trace_output.h:70:37: error: expected expression before ',' token
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
     219 |         trace_event_printf(iter, print);                                \
         |                                  ^~~~~
   include/trace/events/fs_dax.h:38:9: note: in expansion of macro 'TP_printk'
      38 |         TP_printk("dev %d:%d ino %#lx %s %s address %#lx vm_start "
         |         ^~~~~~~~~
   include/trace/events/fs_dax.h:50:17: note: in expansion of macro '__print_flags'
      50 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^~~~~~~~~~~~~
   include/trace/events/fs_dax.h: In function 'trace_raw_output_dax_pte_fault_class':
>> include/trace/stages/stage3_trace_output.h:70:37: error: expected expression before ',' token
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
     219 |         trace_event_printf(iter, print);                                \
         |                                  ^~~~~
   include/trace/events/fs_dax.h:126:9: note: in expansion of macro 'TP_printk'
     126 |         TP_printk("dev %d:%d ino %#lx %s %s address %#lx pgoff %#lx %s",
         |         ^~~~~~~~~
   include/trace/events/fs_dax.h:134:17: note: in expansion of macro '__print_flags'
     134 |                 __print_flags(__entry->result, "|", VM_FAULT_RESULT_TRACE)
         |                 ^~~~~~~~~~~~~
--
   In file included from include/trace/define_trace.h:132,
                    from include/trace/events/f2fs.h:2407,
                    from fs/f2fs/super.c:41:
   include/trace/events/f2fs.h: In function 'trace_raw_output_f2fs_mmap':
>> include/trace/stages/stage3_trace_output.h:70:37: error: expected expression before ',' token
      70 |                         { flag_array, { -1, NULL }};                    \
         |                                     ^
   include/trace/trace_events.h:219:34: note: in definition of macro 'DECLARE_EVENT_CLASS'
     219 |         trace_event_printf(iter, print);                                \
         |                                  ^~~~~
   include/trace/events/f2fs.h:1432:9: note: in expansion of macro 'TP_printk'
    1432 |         TP_printk("dev = (%d,%d), ino = %lu, index = %lu, flags: %s, ret: %s",
         |         ^~~~~~~~~
   include/trace/events/f2fs.h:1436:17: note: in expansion of macro '__print_flags'
    1436 |                 __print_flags(__entry->ret, "|", VM_FAULT_RESULT_TRACE))
         |                 ^~~~~~~~~~~~~


vim +70 include/trace/stages/stage3_trace_output.h

1bc191051dca28 include/trace/stages/stage3_defines.h Linus Torvalds          2022-03-23  65  
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  66) #undef __print_flags
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  67) #define __print_flags(flag, delim, flag_array...)			\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  68) 	({								\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  69) 		static const struct trace_print_flags __flags[] =	\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03 @70) 			{ flag_array, { -1, NULL }};			\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  71) 		trace_print_flags_seq(p, delim, flag, __flags);	\
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  72) 	})
af6b9668e85ffd include/trace/stages/stage3_defines.h Steven Rostedt (Google  2022-03-03  73) 
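
The report does not name the root cause. GCC typically prints "expected expression before ',' token" at this spot when the flag list handed to __print_flags() ends with a trailing comma or expands to nothing, because the macro itself supplies the comma in front of the { -1, NULL } sentinel. The following is a minimal userspace sketch of that pattern, not kernel code: struct demo_print_flags, DEMO_RESULT_TRACE, DEMO_FLAG_TABLE and demo_format_flags() are hypothetical names and the flag values are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct trace_print_flags. */
struct demo_print_flags {
	unsigned long mask;
	const char *name;
};

/*
 * Hypothetical flag list in the style of VM_FAULT_RESULT_TRACE.
 * The last entry must NOT end with a comma: the table macro below
 * supplies the comma that separates the list from the sentinel.
 */
#define DEMO_RESULT_TRACE	\
	{ 0x1, "OOM" },		\
	{ 0x2, "SIGBUS" },	\
	{ 0x4, "RETRY" }

/* Append the sentinel entry after the caller's list. */
#define DEMO_FLAG_TABLE(...) { __VA_ARGS__, { -1UL, NULL } }

static const struct demo_print_flags demo_flags[] =
	DEMO_FLAG_TABLE(DEMO_RESULT_TRACE);

/* Join the names of all set flags with delim; stops at the sentinel. */
static void demo_format_flags(char *buf, size_t len, unsigned long flags,
			      const char *delim,
			      const struct demo_print_flags *tbl)
{
	size_t used = 0;

	assert(len > 0);
	buf[0] = '\0';
	for (; tbl->name; tbl++) {
		int n;

		if (!(flags & tbl->mask))
			continue;
		n = snprintf(buf + used, len - used, "%s%s",
			     used ? delim : "", tbl->name);
		if (n < 0 || (size_t)n >= len - used)
			break;	/* output was truncated, stop here */
		used += (size_t)n;
	}
}
```

Adding a comma after the last entry of DEMO_RESULT_TRACE, or hiding that entry behind a conditional that leaves the previous entry's comma dangling, leaves an empty initializer in front of the sentinel and the table no longer compiles.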

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
                     ` (3 preceding siblings ...)
  2025-11-26 16:49   ` Nikita Kalyazin
@ 2025-11-28  3:27   ` kernel test robot
  4 siblings, 0 replies; 33+ messages in thread
From: kernel test robot @ 2025-11-28  3:27 UTC (permalink / raw)
  To: Mike Rapoport; +Cc: llvm, oe-kbuild-all

Hi Mike,

kernel test robot noticed the following build errors:

[auto build test ERROR on 6a23ae0a96a600d1d12557add110e0bb6e32730c]

url:    https://github.com/intel-lab-lkp/linux/commits/Mike-Rapoport/userfaultfd-move-vma_can_userfault-out-of-line/20251126-024059
base:   6a23ae0a96a600d1d12557add110e0bb6e32730c
patch link:    https://lore.kernel.org/r/20251125183840.2368510-5-rppt%40kernel.org
patch subject: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20251128/202511280435.kT9zhWyV-lkp@intel.com/config)
compiler: clang version 20.1.8 (https://github.com/llvm/llvm-project 87f0227cb60147a26a1eeb4fb06e3b505e9c7261)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251128/202511280435.kT9zhWyV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511280435.kT9zhWyV-lkp@intel.com/

All errors (new ones prefixed by >>):

>> arch/x86/kvm/../../../virt/kvm/guest_memfd.c:401:22: error: redefinition of 'kvm_gmem_get_folio'
     401 | static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
         |                      ^
   arch/x86/kvm/../../../virt/kvm/guest_memfd.c:100:22: note: previous definition is here
     100 | static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
         |                      ^
   1 error generated.


vim +/kvm_gmem_get_folio +401 arch/x86/kvm/../../../virt/kvm/guest_memfd.c

   399	
   400	#ifdef CONFIG_USERFAULTFD
 > 401	static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
   402	{
   403		struct folio *folio;
   404	
   405		folio = kvm_gmem_get_folio(inode, pgoff);
   406		if (IS_ERR_OR_NULL(folio))
   407			return folio;
   408	
   409		if (!folio_test_uptodate(folio)) {
   410			clear_highpage(folio_page(folio, 0));
   411			kvm_gmem_mark_prepared(folio);
   412		}
   413	
   414		return folio;
   415	}
   416	#endif
   417	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode
  2025-11-27 19:04         ` Mike Rapoport
@ 2025-11-28 12:15           ` Nikita Kalyazin
  0 siblings, 0 replies; 33+ messages in thread
From: Nikita Kalyazin @ 2025-11-28 12:15 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Paolo Bonzini,
	Peter Xu, Sean Christopherson, Shuah Khan, Suren Baghdasaryan,
	Vlastimil Babka, linux-kernel, kvm, linux-kselftest



On 27/11/2025 19:04, Mike Rapoport wrote:
> On Thu, Nov 27, 2025 at 11:19:35AM +0000, Nikita Kalyazin wrote:
>>
>>
>> On 27/11/2025 10:36, Mike Rapoport wrote:
>>> On Wed, Nov 26, 2025 at 04:49:31PM +0000, Nikita Kalyazin wrote:
>>>> On 25/11/2025 18:38, Mike Rapoport wrote:
>>>>> From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
>>>>>
>>>>> +#ifdef CONFIG_USERFAULTFD
>>>>> +static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t pgoff)
>>>>
>>>> We have to name it differently, otherwise it clashes with the existing one
>>>> in this file.
>>>
>>> It's all David's fault! ;-P
>>> How about kvm_gmem_get_prepared_folio() ?
>>
>> I'm afraid it may not be ideal due to preparedness tracking being removed
>> from guest_memfd at some point [1].  Would it be too bad to add an
>> indication to userfaultfd in the name somehow given that it's already
>> guarded by the config?
> 
> Hmm, shmem also has this clash. There I picked shmem_get_folio_noalloc()
> because that describes well what it does: lookup folio in the page cache,
> grab it if it's there or return -ENOENT if it's missing.
> That's also what hugetlb does for uffd minor fault.
> 
> The guest_memfd implementation I copied from one of the older postings
> allocates the folio if it's not in the page cache and it seems to me that
> it also should only look up existing folios to keep uffd minor semantics
> uniform.

I can't see a reason for guest_memfd to deviate from shmem and hugetlb
here, so it makes sense to me.

> 
> Then it makes sense also to name the vm_ops method get_folio_noalloc().
> 
> --
> Sincerely yours,
> Mike.
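
The difference between the two lookup flavours discussed above (look up only and return -ENOENT on a miss, versus allocate on a miss) can be sketched in userspace as follows. This is only an illustrative model under stated assumptions: struct demo_folio, struct demo_cache, demo_get_folio_noalloc() and demo_get_folio() are hypothetical names, and a fixed array stands in for the page cache.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_NR_SLOTS 8

/* Hypothetical stand-ins for a folio and the page cache that holds it. */
struct demo_folio {
	int refcount;
};

struct demo_cache {
	struct demo_folio *slots[DEMO_NR_SLOTS];
};

/* Lookup only: grab the folio if it is cached, -ENOENT if it is missing. */
static int demo_get_folio_noalloc(struct demo_cache *cache, size_t index,
				  struct demo_folio **out)
{
	if (index >= DEMO_NR_SLOTS)
		return -EINVAL;
	if (!cache->slots[index])
		return -ENOENT;

	cache->slots[index]->refcount++;
	*out = cache->slots[index];
	return 0;
}

/* Lookup or allocate: a miss populates the cache before grabbing. */
static int demo_get_folio(struct demo_cache *cache, size_t index,
			  struct demo_folio **out)
{
	struct demo_folio *folio;
	int ret;

	ret = demo_get_folio_noalloc(cache, index, out);
	if (ret != -ENOENT)
		return ret;

	folio = calloc(1, sizeof(*folio));
	if (!folio)
		return -ENOMEM;

	folio->refcount = 1;		/* reference held by the cache */
	cache->slots[index] = folio;
	return demo_get_folio_noalloc(cache, index, out);
}
```

In this model only the allocating variant changes the cache contents, which is why a lookup-only helper matches the minor-fault case: the fault is "minor" precisely because the folio is already present.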



* Re: [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason
  2025-11-27 14:10       ` Peter Xu
@ 2025-11-30 11:05         ` Mike Rapoport
  0 siblings, 0 replies; 33+ messages in thread
From: Mike Rapoport @ 2025-11-30 11:05 UTC (permalink / raw)
  To: Peter Xu
  Cc: linux-mm, Andrea Arcangeli, Andrew Morton, Axel Rasmussen,
	Baolin Wang, David Hildenbrand, Hugh Dickins, James Houghton,
	Liam R. Howlett, Lorenzo Stoakes, Michal Hocko, Nikita Kalyazin,
	Paolo Bonzini, Sean Christopherson, Shuah Khan,
	Suren Baghdasaryan, Vlastimil Babka, linux-kernel, kvm,
	linux-kselftest, David Hildenbrand (Red Hat)

On Thu, Nov 27, 2025 at 09:10:56AM -0500, Peter Xu wrote:
> On Thu, Nov 27, 2025 at 01:18:10PM +0200, Mike Rapoport wrote:
> > On Tue, Nov 25, 2025 at 02:21:16PM -0500, Peter Xu wrote:
> > > Hi, Mike,
> > > 
> > > On Tue, Nov 25, 2025 at 08:38:38PM +0200, Mike Rapoport wrote:
> > > > From: "Mike Rapoport (Microsoft)" <rppt@kernel.org>
> > > > 
> > > > When a VMA is registered with userfaultfd in minor mode, its ->fault()
> > > > method should check if a folio exists in the page cache and if yes
> > > > ->fault() should call handle_userfault(VM_UFFD_MISSING).
> > > 
> > > s/MISSING/MINOR/
> > 
> > Thanks, fixed. 
> > 
> > > > new VM_FAULT_UFFD_MINOR there instead.
> > > 
> > > Personally I'd keep the fault path as simple as possible, because that's
> > > the more frequently used path (rather than when userfaultfd is armed). I
> > > also see it slightly a pity that even with flags introduced, it only solves
> > > the MINOR problem, not MISSING.
> > 
> > With David's suggestion the likely path remains unchanged.
> 
> It is not about the likely, it's about introducing flags into core path
> that makes the core path harder to follow, when it's not strictly required.
 
	ret = vma->vm_ops->fault(vmf);
	if (unlikely(ret & (VM_FAULT_ERROR | VM_FAULT_NOPAGE | VM_FAULT_RETRY |
			    VM_FAULT_DONE_COW | VM_FAULT_UFFD_MINOR))) {
		if (ret & VM_FAULT_UFFD_MINOR)
			return handle_userfault(vmf, VM_UFFD_MINOR);
		return ret;
	}

isn't hard to follow and it's cleaner than adding EXPORT_SYMBOL that is not
strictly required.
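
To make the control flow of the snippet above concrete, here is a small userspace model of the same dispatch. It is a sketch only: the DEMO_FAULT_* bit values are invented, struct demo_vmf and the handlers are hypothetical, and demo_handle_userfault() merely counts calls and returns a retry code instead of putting a task on a wait queue.

```c
typedef unsigned int demo_fault_t;

/* Invented bit values, for illustration only. */
#define DEMO_FAULT_ERROR	0x01u
#define DEMO_FAULT_NOPAGE	0x02u
#define DEMO_FAULT_RETRY	0x04u
#define DEMO_FAULT_DONE_COW	0x08u
#define DEMO_FAULT_UFFD_MINOR	0x10u

struct demo_vmf {
	int uffd_calls;		/* how often the uffd path was taken */
};

typedef demo_fault_t (*demo_fault_fn)(struct demo_vmf *vmf);

/* Stand-in for handle_userfault(): count the call, ask for a retry. */
static demo_fault_t demo_handle_userfault(struct demo_vmf *vmf)
{
	vmf->uffd_calls++;
	return DEMO_FAULT_RETRY;
}

/* Core path: one mask test covers every "not a plain success" result. */
static demo_fault_t demo_do_fault(struct demo_vmf *vmf, demo_fault_fn fault)
{
	demo_fault_t ret = fault(vmf);

	if (ret & (DEMO_FAULT_ERROR | DEMO_FAULT_NOPAGE | DEMO_FAULT_RETRY |
		   DEMO_FAULT_DONE_COW | DEMO_FAULT_UFFD_MINOR)) {
		if (ret & DEMO_FAULT_UFFD_MINOR)
			return demo_handle_userfault(vmf);
		return ret;
	}
	return ret;
}

/* Three hypothetical ->fault() handlers. */
static demo_fault_t demo_fault_ok(struct demo_vmf *vmf)
{
	(void)vmf;
	return 0;
}

static demo_fault_t demo_fault_minor(struct demo_vmf *vmf)
{
	(void)vmf;
	return DEMO_FAULT_UFFD_MINOR;
}

static demo_fault_t demo_fault_error(struct demo_vmf *vmf)
{
	(void)vmf;
	return DEMO_FAULT_ERROR;
}
```

The point of the model is that a handler which never returns the new bit takes exactly the same branches as before; only the already rare slow path gains one extra test.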

> Meanwhile, personally I'm also not sure if we should have "unlikely" here..
> My gut feeling is in reality we will only have two major use cases:
> 
>   (a) when userfaultfd minor isn't in the picture
> 
>   (b) when userfaultfd minor registered and actively being used (e.g. in a
>       postcopy process)
> 
> Then without likely, IIUC the hardware should optimize path selected hence
> both a+b performs almost equally well.

unlikely() adds a branch that hardware will predict correctly if
UFFD_MINOR is actively used.

But even a mispredicted branch is nothing compared to putting a task on a
wait queue and waiting for userspace to react to the fault notification
before handle_userfault() returns control to the fault handler.
 
> Just to mention, if we want, I think we have at least one more option to do
> the same thing, but without even introducing a new flag to ->fault()
> retval.
> 
> That is, when we have get_folio() around, we can essentially do two faults
> in sequence, one lighter than the real one, only for minor vmas, something
> like (I didn't think deeper, so only a rough idea shown):
> 
> __do_fault():
>   if (uffd_minor(vma)) {
>     ...
>     folio = vma->get_folio(...);
>     if (folio)
>        return handle_userfault(vmf, VM_UFFD_MINOR);
>     // fallthrough, which implies a cache miss
>   }
>   ret = vma->vm_ops->fault(vmf);

That's something to consider for the future, especially if we'd be able to
pull out MISSING handling as well from ->fault() handlers.
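
For comparison, the two-step variant quoted above can be modelled in userspace like this. It is a rough sketch under stated assumptions, not a proposal: struct demo_vma, its callbacks and enum demo_path are hypothetical, the cache is a four-entry array, and the reference a real get_folio() would take (and have to drop again) is left out.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_NR_PAGES 4

enum demo_path {
	DEMO_PATH_USERFAULT,	/* minor fault handed to userspace */
	DEMO_PATH_FAULT,	/* regular ->fault() handler ran */
};

struct demo_vma {
	bool uffd_minor;			/* registered in minor mode? */
	void *cache[DEMO_NR_PAGES];		/* stand-in for the page cache */
	void *(*get_folio)(struct demo_vma *vma, size_t pgoff);
	int (*fault)(struct demo_vma *vma, size_t pgoff);
};

/* Lookup-only callback: non-NULL means the page is already cached. */
static void *demo_lookup(struct demo_vma *vma, size_t pgoff)
{
	return pgoff < DEMO_NR_PAGES ? vma->cache[pgoff] : NULL;
}

/* Regular fault handler: populates the cache on a miss. */
static int demo_fault(struct demo_vma *vma, size_t pgoff)
{
	static char page;

	if (pgoff >= DEMO_NR_PAGES)
		return -1;
	vma->cache[pgoff] = &page;
	return 0;
}

/* Light lookup first, only for minor-mode VMAs; otherwise fall through. */
static enum demo_path demo_do_fault(struct demo_vma *vma, size_t pgoff)
{
	if (vma->uffd_minor && vma->get_folio &&
	    vma->get_folio(vma, pgoff))
		return DEMO_PATH_USERFAULT;

	/* Cache miss, or userfaultfd minor mode is not armed. */
	vma->fault(vma, pgoff);
	return DEMO_PATH_FAULT;
}
```

In this shape the ->fault() return value needs no new flag, at the cost of a second cache lookup on the minor-mode path.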

> Thanks,
> -- 
> Peter Xu

-- 
Sincerely yours,
Mike.


end of thread, other threads:[~2025-11-30 11:05 UTC | newest]

Thread overview: 33+ messages
2025-11-25 18:38 [PATCH v2 0/5] mm, kvm: add guest_memfd support for uffd minor faults Mike Rapoport
2025-11-25 18:38 ` [PATCH v2 1/5] userfaultfd: move vma_can_userfault out of line Mike Rapoport
2025-11-26 15:05   ` Liam R. Howlett
2025-11-25 18:38 ` [PATCH v2 2/5] userfaultfd, shmem: use a VMA callback to handle UFFDIO_CONTINUE Mike Rapoport
2025-11-26 10:21   ` David Hildenbrand (Red Hat)
2025-11-26 15:11   ` Liam R. Howlett
2025-11-25 18:38 ` [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason Mike Rapoport
2025-11-25 19:21   ` Peter Xu
2025-11-27 11:18     ` Mike Rapoport
2025-11-27 14:10       ` Peter Xu
2025-11-30 11:05         ` Mike Rapoport
2025-11-26 10:19   ` David Hildenbrand (Red Hat)
2025-11-26 12:47   ` kernel test robot
2025-11-26 15:19   ` Liam R. Howlett
2025-11-26 16:49   ` Nikita Kalyazin
2025-11-28  1:48   ` kernel test robot
2025-11-28  3:07   ` kernel test robot
2025-11-25 18:38 ` [PATCH v2 4/5] guest_memfd: add support for userfaultfd minor mode Mike Rapoport
2025-11-26 10:25   ` David Hildenbrand (Red Hat)
2025-11-26 15:22   ` Liam R. Howlett
2025-11-26 16:21   ` kernel test robot
2025-11-26 16:49   ` Nikita Kalyazin
2025-11-27 10:36     ` Mike Rapoport
2025-11-27 11:19       ` Nikita Kalyazin
2025-11-27 19:04         ` Mike Rapoport
2025-11-28 12:15           ` Nikita Kalyazin
2025-11-27 11:27       ` David Hildenbrand (Red Hat)
2025-11-28  3:27   ` kernel test robot
2025-11-25 18:38 ` [PATCH v2 5/5] KVM: selftests: test userfaultfd minor for guest_memfd Mike Rapoport
2025-11-26 15:23   ` Liam R. Howlett
2025-11-26 16:49   ` Nikita Kalyazin
2025-11-27 10:39     ` Mike Rapoport
  -- strict thread matches above, loose matches on Subject: below --
2025-11-26  4:24 [PATCH v2 3/5] mm: introduce VM_FAULT_UFFD_MINOR fault reason kernel test robot
