linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] mm/userfaultfd: modulize memory types
@ 2025-10-14 23:14 Peter Xu
  2025-10-14 23:14 ` [PATCH v4 1/4] mm: Introduce vm_uffd_ops API Peter Xu
                   ` (4 more replies)
  0 siblings, 5 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-14 23:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	David Hildenbrand, Hugh Dickins, Michal Hocko, Ujwal Kundur,
	Liam R . Howlett, peterx, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

[based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]

v4:
- Some cleanups within vma_can_userfault() [David]
- Rename uffd_get_folio() to minor_get_folio() [David]
- Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]

v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com

This series is an alternative proposal of what Nikita proposed here on the
initial three patches:

  https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com

This is not yet relevant to any guest-memfd support, but paving way for it.
Here, the major goal is to make kernel modules be able to opt-in with any
form of userfaultfd supports, like guest-memfd.  This alternative option
should hopefully be cleaner, and avoid leaking userfault details into
vm_ops.fault().

It also means this series does not depend on anything.  It's a pure
refactoring of userfaultfd internals to provide a generic API, so that
other types of files, especially RAM based, can support userfaultfd without
touching mm/ at all.

To achieve it, this series introduced a file operation called vm_uffd_ops.
The ops needs to be provided when a file type supports any of userfaultfd.

With that, I moved both hugetlbfs and shmem over, whenever possible.  So
far due to concerns on exposing an uffd_copy() API, the MISSING faults are
still separately processed and can only be done within mm/.  Hugetlbfs kept
its special paths untouched.

An example of shmem uffd_ops:

static const struct vm_uffd_ops shmem_uffd_ops = {
	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
					BIT(_UFFDIO_ZEROPAGE) |
					BIT(_UFFDIO_WRITEPROTECT) |
					BIT(_UFFDIO_CONTINUE) |
					BIT(_UFFDIO_POISON),
	.minor_get_folio	=	shmem_uffd_get_folio,
};

To show another sample, this is the patch that Nikita posted to implement
minor fault for guest-memfd (on top of older versions of this series):

  https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/

No functional change expected at all after the whole series applied.  There
might be some slightly stricter check on uffd ops here and there in the
last patch, but that really shouldn't stand out anywhere to anyone.

For testing: besides the cross-compilation tests, I did also try with
uffd-stress in a VM to measure any perf difference before/after the change;
The static call becomes a pointer now.  I really cannot measure anything
different, which is more or less expected.

Comments welcomed, thanks.

Peter Xu (4):
  mm: Introduce vm_uffd_ops API
  mm/shmem: Support vm_uffd_ops API
  mm/hugetlb: Support vm_uffd_ops API
  mm: Apply vm_uffd_ops API to core mm

 include/linux/mm.h            |   9 +++
 include/linux/userfaultfd_k.h |  75 +++++++++++----------
 mm/hugetlb.c                  |  18 +++++
 mm/shmem.c                    |  24 +++++++
 mm/userfaultfd.c              | 120 +++++++++++++++++++++++++++-------
 5 files changed, 189 insertions(+), 57 deletions(-)

-- 
2.50.1



^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v4 1/4] mm: Introduce vm_uffd_ops API
  2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
@ 2025-10-14 23:14 ` Peter Xu
  2025-10-20 14:18   ` David Hildenbrand
  2025-10-14 23:14 ` [PATCH v4 2/4] mm/shmem: Support " Peter Xu
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-14 23:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	David Hildenbrand, Hugh Dickins, Michal Hocko, Ujwal Kundur,
	Liam R . Howlett, peterx, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

Currently, most of the userfaultfd features are implemented directly in the
core mm.  It will invoke VMA specific functions whenever necessary.  So far
it is fine because it almost only interacts with shmem and hugetlbfs.

Introduce a generic userfaultfd API extension for vm_operations_struct,
so that any code that implements vm_operations_struct (including kernel
modules that can be compiled separately from the kernel core) can support
userfaults without modifying the core files.

With this API applied, if a module wants to support userfaultfd, the
module should only need to properly define vm_uffd_ops and hook it to
vm_operations_struct, instead of changing anything in core mm.

This API will not work for anonymous memory. Handling of userfault
operations for anonymous memory remains unchanged in core mm.

Due to a security concern while reviewing older versions of this series
[1], uffd_copy() will be temprorarily removed.  IOW, MISSING-capable memory
types can only be hard-coded and implemented in mm/.  It would also affect
UFFDIO_COPY and UFFDIO_ZEROPAGE.  Other functions should still be able to
be provided from vm_uffd_ops.

Introduces the API only so that existing userfaultfd users can be moved
over without breaking them.

[1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@redhat.com/

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h            |  9 +++++++++
 include/linux/userfaultfd_k.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5c01c4b59ca67..011962130c148 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -620,6 +620,8 @@ struct vm_fault {
 					 */
 };
 
+struct vm_uffd_ops;
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -705,6 +707,13 @@ struct vm_operations_struct {
 	struct page *(*find_normal_page)(struct vm_area_struct *vma,
 					 unsigned long addr);
 #endif /* CONFIG_FIND_NORMAL_PAGE */
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Userfaultfd related ops.  Modules need to define this to support
+	 * userfaultfd.
+	 */
+	const struct vm_uffd_ops *userfaultfd_ops;
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26aa..b5b4f3f174b32 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -92,6 +92,35 @@ enum mfill_atomic_mode {
 	NR_MFILL_ATOMIC_MODES,
 };
 
+/* VMA userfaultfd operations */
+struct vm_uffd_ops {
+	/**
+	 * @supported_ioctls: userfaultfd ioctls supported in bitmask.
+	 *
+	 * Userfaultfd ioctls supported by the module.  Below will always
+	 * be supported by default whenever a module provides vm_uffd_ops:
+	 *
+	 *   _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+	 *
+	 * The module needs to provide all the rest optionally supported
+	 * ioctls as a bitmask.  For example, a module needs to set the bit
+	 * BIT(_UFFDIO_CONTINUE) to support userfaultfd minor faults.
+	 */
+	unsigned long supported_ioctls;
+	/**
+	 * minor_get_folio: Handler to resolve UFFDIO_CONTINUE request.
+	 * Must be specified if _UFFDIO_CONTINUE is set.
+	 *
+	 * @inode: the inode for folio lookup
+	 * @pgoff: the pgoff of the folio
+	 * @folio: returned folio pointer
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*minor_get_folio)(struct inode *inode, pgoff_t pgoff,
+			       struct folio **folio);
+};
+
 #define MFILL_ATOMIC_MODE_BITS (const_ilog2(NR_MFILL_ATOMIC_MODES - 1) + 1)
 #define MFILL_ATOMIC_BIT(nr) BIT(MFILL_ATOMIC_MODE_BITS + (nr))
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 2/4] mm/shmem: Support vm_uffd_ops API
  2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
  2025-10-14 23:14 ` [PATCH v4 1/4] mm: Introduce vm_uffd_ops API Peter Xu
@ 2025-10-14 23:14 ` Peter Xu
  2025-10-20 14:18   ` David Hildenbrand
  2025-10-14 23:15 ` [PATCH v4 3/4] mm/hugetlb: " Peter Xu
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-14 23:14 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	David Hildenbrand, Hugh Dickins, Michal Hocko, Ujwal Kundur,
	Liam R . Howlett, peterx, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

Add support for the new vm_uffd_ops API for shmem.  Note that this only
introduces the support, the API is not yet used by core mm.

It only needs a separate minor_get_folio() definition but that's oneliner.

Due to the limitation of the current vm_uffd_ops on MISSING mode support,
the shmem UFFDIO_COPY/ZEROPAGE process are still hard-coded in mm/.

Cc: Hugh Dickins <hughd@google.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/shmem.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/mm/shmem.c b/mm/shmem.c
index b50ce7dbc84a0..0be112689f9e5 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3146,6 +3146,13 @@ static inline struct inode *shmem_get_inode(struct mnt_idmap *idmap,
 #endif /* CONFIG_TMPFS_QUOTA */
 
 #ifdef CONFIG_USERFAULTFD
+
+static int shmem_uffd_get_folio(struct inode *inode, pgoff_t pgoff,
+				struct folio **folio)
+{
+	return shmem_get_folio(inode, pgoff, 0, folio, SGP_NOALLOC);
+}
+
 int shmem_mfill_atomic_pte(pmd_t *dst_pmd,
 			   struct vm_area_struct *dst_vma,
 			   unsigned long dst_addr,
@@ -5189,6 +5196,17 @@ static int shmem_error_remove_folio(struct address_space *mapping,
 	return 0;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static const struct vm_uffd_ops shmem_uffd_ops = {
+	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
+					BIT(_UFFDIO_ZEROPAGE) |
+					BIT(_UFFDIO_WRITEPROTECT) |
+					BIT(_UFFDIO_CONTINUE) |
+					BIT(_UFFDIO_POISON),
+	.minor_get_folio	=	shmem_uffd_get_folio,
+};
+#endif
+
 static const struct address_space_operations shmem_aops = {
 	.dirty_folio	= noop_dirty_folio,
 #ifdef CONFIG_TMPFS
@@ -5291,6 +5309,9 @@ static const struct vm_operations_struct shmem_vm_ops = {
 	.set_policy     = shmem_set_policy,
 	.get_policy     = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &shmem_uffd_ops,
+#endif
 };
 
 static const struct vm_operations_struct shmem_anon_vm_ops = {
@@ -5300,6 +5321,9 @@ static const struct vm_operations_struct shmem_anon_vm_ops = {
 	.set_policy     = shmem_set_policy,
 	.get_policy     = shmem_get_policy,
 #endif
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &shmem_uffd_ops,
+#endif
 };
 
 int shmem_init_fs_context(struct fs_context *fc)
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 3/4] mm/hugetlb: Support vm_uffd_ops API
  2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
  2025-10-14 23:14 ` [PATCH v4 1/4] mm: Introduce vm_uffd_ops API Peter Xu
  2025-10-14 23:14 ` [PATCH v4 2/4] mm/shmem: Support " Peter Xu
@ 2025-10-14 23:15 ` Peter Xu
  2025-10-20 14:19   ` David Hildenbrand
  2025-10-14 23:15 ` [PATCH v4 4/4] mm: Apply vm_uffd_ops API to core mm Peter Xu
  2025-10-20 13:34 ` [PATCH v4 0/4] mm/userfaultfd: modulize memory types David Hildenbrand
  4 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-14 23:15 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	David Hildenbrand, Hugh Dickins, Michal Hocko, Ujwal Kundur,
	Liam R . Howlett, peterx, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

Add support for the new vm_uffd_ops API for hugetlb.  Note that this only
introduces the support, the API is not yet used by core mm.

Due to legacy reasons, it's still not trivial to move hugetlb completely to
the API.  But it will still use supported_ioctls properly on the API with
all the implementations still hard coded in mm/.

Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Acked-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 mm/hugetlb.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 657b9facba280..ad98876d338e5 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5503,6 +5503,21 @@ static vm_fault_t hugetlb_vm_op_fault(struct vm_fault *vmf)
 	return 0;
 }
 
+#ifdef CONFIG_USERFAULTFD
+static const struct vm_uffd_ops hugetlb_uffd_ops = {
+	/* _UFFDIO_ZEROPAGE not supported */
+	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
+					BIT(_UFFDIO_WRITEPROTECT) |
+					BIT(_UFFDIO_CONTINUE) |
+					BIT(_UFFDIO_POISON),
+	/*
+	 * Hugetlbfs still has its own hard-coded handler in userfaultfd,
+	 * due to limitations similar to vm_operations_struct.fault().
+	 * TODO: generalize it to use the API functions.
+	 */
+};
+#endif
+
 /*
  * When a new function is introduced to vm_operations_struct and added
  * to hugetlb_vm_ops, please consider adding the function to shm_vm_ops.
@@ -5516,6 +5531,9 @@ const struct vm_operations_struct hugetlb_vm_ops = {
 	.close = hugetlb_vm_op_close,
 	.may_split = hugetlb_vm_op_split,
 	.pagesize = hugetlb_vm_op_pagesize,
+#ifdef CONFIG_USERFAULTFD
+	.userfaultfd_ops = &hugetlb_uffd_ops,
+#endif
 };
 
 static pte_t make_huge_pte(struct vm_area_struct *vma, struct folio *folio,
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v4 4/4] mm: Apply vm_uffd_ops API to core mm
  2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
                   ` (2 preceding siblings ...)
  2025-10-14 23:15 ` [PATCH v4 3/4] mm/hugetlb: " Peter Xu
@ 2025-10-14 23:15 ` Peter Xu
  2025-10-20 13:34 ` [PATCH v4 0/4] mm/userfaultfd: modulize memory types David Hildenbrand
  4 siblings, 0 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-14 23:15 UTC (permalink / raw)
  To: linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	David Hildenbrand, Hugh Dickins, Michal Hocko, Ujwal Kundur,
	Liam R . Howlett, peterx, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

Move userfaultfd core to use new vm_uffd_ops API. After this change, file
systems that implement vm_operations_struct can start using new API for
userfaultfd operations.

When at it, moving vma_can_userfault() into mm/userfaultfd.c instead,
because it's getting too big.  It's only used in slow paths so it shouldn't
be an issue.  Move the pte marker check before wp_async, which might be
more intuitive because wp_async depends on pte markers.  That shouldn't
cause any functional change though because only one check would take effect
depending on whether pte marker was selected in config.

This will also remove quite some hard-coded checks for either shmem or
hugetlbfs.  Now all the old checks should still work but with vm_uffd_ops.

Note that anonymous memory will still need to be processed separately
because it doesn't have vm_ops at all.

Reviewed-by: James Houghton <jthoughton@google.com>
Acked-by: Mike Rapoport <rppt@kernel.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/userfaultfd_k.h |  46 ++++---------
 mm/userfaultfd.c              | 120 +++++++++++++++++++++++++++-------
 2 files changed, 109 insertions(+), 57 deletions(-)

diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index b5b4f3f174b32..983efaecfdcb0 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -126,9 +126,14 @@ struct vm_uffd_ops {
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
 #define MFILL_ATOMIC_MODE_MASK ((__force uffd_flags_t) (MFILL_ATOMIC_BIT(0) - 1))
 
+static inline enum mfill_atomic_mode uffd_flags_get_mode(uffd_flags_t flags)
+{
+	return (__force enum mfill_atomic_mode)(flags & MFILL_ATOMIC_MODE_MASK);
+}
+
 static inline bool uffd_flags_mode_is(uffd_flags_t flags, enum mfill_atomic_mode expected)
 {
-	return (flags & MFILL_ATOMIC_MODE_MASK) == ((__force uffd_flags_t) expected);
+	return uffd_flags_get_mode(flags) == expected;
 }
 
 static inline uffd_flags_t uffd_flags_set_mode(uffd_flags_t flags, enum mfill_atomic_mode mode)
@@ -237,41 +242,16 @@ static inline bool userfaultfd_armed(struct vm_area_struct *vma)
 	return vma->vm_flags & __VM_UFFD_FLAGS;
 }
 
-static inline bool vma_can_userfault(struct vm_area_struct *vma,
-				     vm_flags_t vm_flags,
-				     bool wp_async)
+static inline const struct vm_uffd_ops *vma_get_uffd_ops(struct vm_area_struct *vma)
 {
-	vm_flags &= __VM_UFFD_FLAGS;
-
-	if (vma->vm_flags & VM_DROPPABLE)
-		return false;
-
-	if ((vm_flags & VM_UFFD_MINOR) &&
-	    (!is_vm_hugetlb_page(vma) && !vma_is_shmem(vma)))
-		return false;
-
-	/*
-	 * If wp async enabled, and WP is the only mode enabled, allow any
-	 * memory type.
-	 */
-	if (wp_async && (vm_flags == VM_UFFD_WP))
-		return true;
-
-#ifndef CONFIG_PTE_MARKER_UFFD_WP
-	/*
-	 * If user requested uffd-wp but not enabled pte markers for
-	 * uffd-wp, then shmem & hugetlbfs are not supported but only
-	 * anonymous.
-	 */
-	if ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma))
-		return false;
-#endif
-
-	/* By default, allow any of anon|shmem|hugetlb */
-	return vma_is_anonymous(vma) || is_vm_hugetlb_page(vma) ||
-	    vma_is_shmem(vma);
+	if (vma->vm_ops && vma->vm_ops->userfaultfd_ops)
+		return vma->vm_ops->userfaultfd_ops;
+	return NULL;
 }
 
+bool vma_can_userfault(struct vm_area_struct *vma,
+		       unsigned long vm_flags, bool wp_async);
+
 static inline bool vma_has_uffd_without_event_remap(struct vm_area_struct *vma)
 {
 	struct userfaultfd_ctx *uffd_ctx = vma->vm_userfaultfd_ctx.ctx;
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 00122f42718cc..b50db51e75d5b 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -20,6 +20,61 @@
 #include "internal.h"
 #include "swap.h"
 
+static unsigned long uffd_get_supported_modes(struct vm_area_struct *vma)
+{
+	const struct vm_uffd_ops *uffd_ops = vma_get_uffd_ops(vma);
+	unsigned long ioctls, supported = 0;
+
+	if (!uffd_ops)
+		return 0;
+
+	ioctls = uffd_ops->supported_ioctls;
+
+	if (ioctls & (BIT(_UFFDIO_COPY) | BIT(_UFFDIO_ZEROPAGE)))
+		supported |= VM_UFFD_MISSING;
+	if (ioctls & BIT(_UFFDIO_CONTINUE))
+		supported |= VM_UFFD_MINOR;
+	if (ioctls & BIT(_UFFDIO_WRITEPROTECT))
+		supported |= VM_UFFD_WP;
+
+	return supported;
+}
+
+bool vma_can_userfault(struct vm_area_struct *vma, vm_flags_t vm_flags,
+		       bool wp_async)
+{
+	unsigned long supported = 0;
+
+	if (vma->vm_flags & VM_DROPPABLE)
+		return false;
+
+	vm_flags &= __VM_UFFD_FLAGS;
+
+	/*
+	 * If user requested uffd-wp but not enabled pte markers for
+	 * uffd-wp, then any file system (like shmem or hugetlbfs) are not
+	 * supported but only anonymous.
+	 */
+	if (!IS_ENABLED(CONFIG_PTE_MARKER_UFFD_WP) &&
+	    ((vm_flags & VM_UFFD_WP) && !vma_is_anonymous(vma)))
+		return false;
+
+	/*
+	 * If wp async enabled, and WP is the only mode enabled, allow any
+	 * memory type.
+	 */
+	if (wp_async && (vm_flags == VM_UFFD_WP))
+		return true;
+
+	if (vma_is_anonymous(vma))
+		/* Anonymous has no page cache, MINOR not supported */
+		supported = VM_UFFD_MISSING | VM_UFFD_WP;
+	else if (vma_get_uffd_ops(vma))
+		supported = uffd_get_supported_modes(vma);
+
+	return !(vm_flags & ~supported);
+}
+
 static __always_inline
 bool validate_dst_vma(struct vm_area_struct *dst_vma, unsigned long dst_end)
 {
@@ -382,13 +437,17 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd,
 				     unsigned long dst_addr,
 				     uffd_flags_t flags)
 {
+	const struct vm_uffd_ops *uffd_ops = vma_get_uffd_ops(dst_vma);
 	struct inode *inode = file_inode(dst_vma->vm_file);
 	pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
 	struct folio *folio;
 	struct page *page;
 	int ret;
 
-	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
+	if (WARN_ON_ONCE(!uffd_ops || !uffd_ops->minor_get_folio))
+		return -EINVAL;
+
+	ret = uffd_ops->minor_get_folio(inode, pgoff, &folio);
 	/* Our caller expects us to return -EFAULT if we failed to find folio */
 	if (ret == -ENOENT)
 		ret = -EFAULT;
@@ -504,18 +563,6 @@ static __always_inline ssize_t mfill_atomic_hugetlb(
 	u32 hash;
 	struct address_space *mapping;
 
-	/*
-	 * There is no default zero huge page for all huge page sizes as
-	 * supported by hugetlb.  A PMD_SIZE huge pages may exist as used
-	 * by THP.  Since we can not reliably insert a zero page, this
-	 * feature is not supported.
-	 */
-	if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) {
-		up_read(&ctx->map_changing_lock);
-		uffd_mfill_unlock(dst_vma);
-		return -EINVAL;
-	}
-
 	src_addr = src_start;
 	dst_addr = dst_start;
 	copied = 0;
@@ -694,6 +741,41 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd,
 	return err;
 }
 
+static inline bool
+vma_uffd_ops_supported(struct vm_area_struct *vma, uffd_flags_t flags)
+{
+	enum mfill_atomic_mode mode = uffd_flags_get_mode(flags);
+	const struct vm_uffd_ops *uffd_ops;
+	unsigned long supported_ioctls;
+
+	if ((flags & MFILL_ATOMIC_WP) && !(vma->vm_flags & VM_UFFD_WP))
+		return false;
+
+	/* Anonymous supports everything except CONTINUE */
+	if (vma_is_anonymous(vma))
+		return mode != MFILL_ATOMIC_CONTINUE;
+
+	uffd_ops = vma_get_uffd_ops(vma);
+	if (!uffd_ops)
+		return false;
+
+	supported_ioctls = uffd_ops->supported_ioctls;
+	switch (mode) {
+	case MFILL_ATOMIC_COPY:
+		return supported_ioctls & BIT(_UFFDIO_COPY);
+	case MFILL_ATOMIC_ZEROPAGE:
+		return supported_ioctls & BIT(_UFFDIO_ZEROPAGE);
+	case MFILL_ATOMIC_CONTINUE:
+		if (!(vma->vm_flags & VM_SHARED))
+			return false;
+		return supported_ioctls & BIT(_UFFDIO_CONTINUE);
+	case MFILL_ATOMIC_POISON:
+		return supported_ioctls & BIT(_UFFDIO_POISON);
+	default:
+		return false;
+	}
+}
+
 static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 					    unsigned long dst_start,
 					    unsigned long src_start,
@@ -752,11 +834,7 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 	    dst_vma->vm_flags & VM_SHARED))
 		goto out_unlock;
 
-	/*
-	 * validate 'mode' now that we know the dst_vma: don't allow
-	 * a wrprotect copy if the userfaultfd didn't register as WP.
-	 */
-	if ((flags & MFILL_ATOMIC_WP) && !(dst_vma->vm_flags & VM_UFFD_WP))
+	if (!vma_uffd_ops_supported(dst_vma, flags))
 		goto out_unlock;
 
 	/*
@@ -766,12 +844,6 @@ static __always_inline ssize_t mfill_atomic(struct userfaultfd_ctx *ctx,
 		return  mfill_atomic_hugetlb(ctx, dst_vma, dst_start,
 					     src_start, len, flags);
 
-	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
-		goto out_unlock;
-	if (!vma_is_shmem(dst_vma) &&
-	    uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE))
-		goto out_unlock;
-
 	while (src_addr < src_start + len) {
 		pmd_t dst_pmdval;
 
-- 
2.50.1



^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
                   ` (3 preceding siblings ...)
  2025-10-14 23:15 ` [PATCH v4 4/4] mm: Apply vm_uffd_ops API to core mm Peter Xu
@ 2025-10-20 13:34 ` David Hildenbrand
  2025-10-20 14:12   ` Peter Xu
  4 siblings, 1 reply; 23+ messages in thread
From: David Hildenbrand @ 2025-10-20 13:34 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Liam R . Howlett,
	Oscar Salvador, Suren Baghdasaryan, Andrea Arcangeli

On 15.10.25 01:14, Peter Xu wrote:
> [based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
> 
> v4:
> - Some cleanups within vma_can_userfault() [David]
> - Rename uffd_get_folio() to minor_get_folio() [David]
> - Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
> 
> v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
> v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
> v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
> 
> This series is an alternative proposal of what Nikita proposed here on the
> initial three patches:
> 
>    https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
> 
> This is not yet relevant to any guest-memfd support, but paving way for it.
> Here, the major goal is to make kernel modules be able to opt-in with any
> form of userfaultfd supports, like guest-memfd.  This alternative option
> should hopefully be cleaner, and avoid leaking userfault details into
> vm_ops.fault().
> 
> It also means this series does not depend on anything.  It's a pure
> refactoring of userfaultfd internals to provide a generic API, so that
> other types of files, especially RAM based, can support userfaultfd without
> touching mm/ at all.
> 
> To achieve it, this series introduced a file operation called vm_uffd_ops.
> The ops needs to be provided when a file type supports any of userfaultfd.
> 
> With that, I moved both hugetlbfs and shmem over, whenever possible.  So
> far due to concerns on exposing an uffd_copy() API, the MISSING faults are
> still separately processed and can only be done within mm/.  Hugetlbfs kept
> its special paths untouched.
> 
> An example of shmem uffd_ops:
> 
> static const struct vm_uffd_ops shmem_uffd_ops = {
> 	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
> 					BIT(_UFFDIO_ZEROPAGE) |
> 					BIT(_UFFDIO_WRITEPROTECT) |
> 					BIT(_UFFDIO_CONTINUE) |
> 					BIT(_UFFDIO_POISON),
> 	.minor_get_folio	=	shmem_uffd_get_folio,
> };

This looks better than the previous version to me.

Long term the goal should be to move all hugetlb/shmem specific stuff 
out of mm/hugetlb.c and of course, we won't be adding any new ones to 
mm/userfaultfd.c

I agree with Liam that a better interface could be providing default 
handlers for the separate ioctls [1], but there is always the option to 
evolve this interface into something like that later.


[1] 
https://lkml.kernel.org/r/frnos5jtmlqvzpcrredcoummuzvllweku5dgp5ii5in6epwnw5@anu4dqsz6shy
-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-20 13:34 ` [PATCH v4 0/4] mm/userfaultfd: modulize memory types David Hildenbrand
@ 2025-10-20 14:12   ` Peter Xu
  2025-10-21 15:51     ` Liam R. Howlett
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-20 14:12 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, Mike Rapoport, Muchun Song,
	Nikita Kalyazin, Vlastimil Babka, Axel Rasmussen, Andrew Morton,
	James Houghton, Lorenzo Stoakes, Hugh Dickins, Michal Hocko,
	Ujwal Kundur, Liam R . Howlett, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

On Mon, Oct 20, 2025 at 03:34:47PM +0200, David Hildenbrand wrote:
> On 15.10.25 01:14, Peter Xu wrote:
> > [based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
> > 
> > v4:
> > - Some cleanups within vma_can_userfault() [David]
> > - Rename uffd_get_folio() to minor_get_folio() [David]
> > - Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
> > 
> > v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
> > v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
> > v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
> > 
> > This series is an alternative proposal of what Nikita proposed here on the
> > initial three patches:
> > 
> >    https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
> > 
> > This is not yet relevant to any guest-memfd support, but paving way for it.
> > Here, the major goal is to make kernel modules be able to opt-in with any
> > form of userfaultfd supports, like guest-memfd.  This alternative option
> > should hopefully be cleaner, and avoid leaking userfault details into
> > vm_ops.fault().
> > 
> > It also means this series does not depend on anything.  It's a pure
> > refactoring of userfaultfd internals to provide a generic API, so that
> > other types of files, especially RAM based, can support userfaultfd without
> > touching mm/ at all.
> > 
> > To achieve it, this series introduced a file operation called vm_uffd_ops.
> > The ops needs to be provided when a file type supports any of userfaultfd.
> > 
> > With that, I moved both hugetlbfs and shmem over, whenever possible.  So
> > far due to concerns on exposing an uffd_copy() API, the MISSING faults are
> > still separately processed and can only be done within mm/.  Hugetlbfs kept
> > its special paths untouched.
> > 
> > An example of shmem uffd_ops:
> > 
> > static const struct vm_uffd_ops shmem_uffd_ops = {
> > 	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
> > 					BIT(_UFFDIO_ZEROPAGE) |
> > 					BIT(_UFFDIO_WRITEPROTECT) |
> > 					BIT(_UFFDIO_CONTINUE) |
> > 					BIT(_UFFDIO_POISON),
> > 	.minor_get_folio	=	shmem_uffd_get_folio,
> > };
> 
> This looks better than the previous version to me.
> 
> Long term the goal should be to move all hugetlb/shmem specific stuff out of
> mm/hugetlb.c and of course, we won't be adding any new ones to
> mm/userfaultfd.c
> 
> I agree with Liam that a better interface could be providing default
> handlers for the separate ioctls [1], but there is always the option to
> evolve this interface into something like that later.

Thanks for accepting this current form.

> 
> 
> [1] https://lkml.kernel.org/r/frnos5jtmlqvzpcrredcoummuzvllweku5dgp5ii5in6epwnw5@anu4dqsz6shy

I have replied to that, here:

https://lore.kernel.org/all/aOVEDii4HPB6outm@x1.local/

If we ignore hugetlbfs, most of the hooks may not be needed, as explained.

If we introduce hooks only for hugetlbfs, IMHO it's going backwards.  When
we want to get rid of hugetlbfs paths, we will have something more to get
rid of..

We can wait until some use cases to appear, that may help us to better
understand what might be more suitable to be abstracted. Especially, if
there will be any consumer for missing faults.  However if guest-memfd has
it optional, my expectation so far is there'll be no real demand on that
for the foreseeable future.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 1/4] mm: Introduce vm_uffd_ops API
  2025-10-14 23:14 ` [PATCH v4 1/4] mm: Introduce vm_uffd_ops API Peter Xu
@ 2025-10-20 14:18   ` David Hildenbrand
  0 siblings, 0 replies; 23+ messages in thread
From: David Hildenbrand @ 2025-10-20 14:18 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Liam R . Howlett,
	Oscar Salvador, Suren Baghdasaryan, Andrea Arcangeli

On 15.10.25 01:14, Peter Xu wrote:
> Currently, most of the userfaultfd features are implemented directly in the
> core mm.  It will invoke VMA specific functions whenever necessary.  So far
> it is fine because it almost only interacts with shmem and hugetlbfs.
> 
> Introduce a generic userfaultfd API extension for vm_operations_struct,
> so that any code that implements vm_operations_struct (including kernel
> modules that can be compiled separately from the kernel core) can support
> userfaults without modifying the core files.
> 
> With this API applied, if a module wants to support userfaultfd, the
> module should only need to properly define vm_uffd_ops and hook it to
> vm_operations_struct, instead of changing anything in core mm.
> 
> This API will not work for anonymous memory. Handling of userfault
> operations for anonymous memory remains unchanged in core mm.
> 
> Due to a security concern while reviewing older versions of this series
> [1], uffd_copy() will be temprorarily removed.  IOW, MISSING-capable memory
> types can only be hard-coded and implemented in mm/.  It would also affect
> UFFDIO_COPY and UFFDIO_ZEROPAGE.  Other functions should still be able to
> be provided from vm_uffd_ops.
> 
> Introduces the API only so that existing userfaultfd users can be moved
> over without breaking them.
> 
> [1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@redhat.com/
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 2/4] mm/shmem: Support vm_uffd_ops API
  2025-10-14 23:14 ` [PATCH v4 2/4] mm/shmem: Support " Peter Xu
@ 2025-10-20 14:18   ` David Hildenbrand
  0 siblings, 0 replies; 23+ messages in thread
From: David Hildenbrand @ 2025-10-20 14:18 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Liam R . Howlett,
	Oscar Salvador, Suren Baghdasaryan, Andrea Arcangeli

On 15.10.25 01:14, Peter Xu wrote:
> Add support for the new vm_uffd_ops API for shmem.  Note that this only
> introduces the support, the API is not yet used by core mm.
> 
> It only needs a separate minor_get_folio() definition but that's oneliner.
> 
> Due to the limitation of the current vm_uffd_ops on MISSING mode support,
> the shmem UFFDIO_COPY/ZEROPAGE process are still hard-coded in mm/.
> 
> Cc: Hugh Dickins <hughd@google.com>
> Acked-by: Mike Rapoport <rppt@kernel.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 3/4] mm/hugetlb: Support vm_uffd_ops API
  2025-10-14 23:15 ` [PATCH v4 3/4] mm/hugetlb: " Peter Xu
@ 2025-10-20 14:19   ` David Hildenbrand
  0 siblings, 0 replies; 23+ messages in thread
From: David Hildenbrand @ 2025-10-20 14:19 UTC (permalink / raw)
  To: Peter Xu, linux-kernel, linux-mm
  Cc: Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Liam R . Howlett,
	Oscar Salvador, Suren Baghdasaryan, Andrea Arcangeli

On 15.10.25 01:15, Peter Xu wrote:
> Add support for the new vm_uffd_ops API for hugetlb.  Note that this only
> introduces the support, the API is not yet used by core mm.
> 
> Due to legacy reasons, it's still not trivial to move hugetlb completely to
> the API.  But it will still use supported_ioctls properly on the API with
> all the implementations still hard coded in mm/.
> 
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: Oscar Salvador <osalvador@suse.de>
> Acked-by: Mike Rapoport <rppt@kernel.org>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---

Acked-by: David Hildenbrand <david@redhat.com>

-- 
Cheers

David / dhildenb



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-20 14:12   ` Peter Xu
@ 2025-10-21 15:51     ` Liam R. Howlett
  2025-10-21 16:28       ` Peter Xu
  0 siblings, 1 reply; 23+ messages in thread
From: Liam R. Howlett @ 2025-10-21 15:51 UTC (permalink / raw)
  To: Peter Xu
  Cc: David Hildenbrand, linux-kernel, linux-mm, Mike Rapoport,
	Muchun Song, Nikita Kalyazin, Vlastimil Babka, Axel Rasmussen,
	Andrew Morton, James Houghton, Lorenzo Stoakes, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

* Peter Xu <peterx@redhat.com> [251020 10:12]:
> On Mon, Oct 20, 2025 at 03:34:47PM +0200, David Hildenbrand wrote:
> > On 15.10.25 01:14, Peter Xu wrote:
> > > [based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
> > > 
> > > v4:
> > > - Some cleanups within vma_can_userfault() [David]
> > > - Rename uffd_get_folio() to minor_get_folio() [David]
> > > - Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
> > > 
> > > v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
> > > v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
> > > v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
> > > 
> > > This series is an alternative proposal of what Nikita proposed here on the
> > > initial three patches:
> > > 
> > >    https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
> > > 
> > > This is not yet relevant to any guest-memfd support, but paving way for it.
> > > Here, the major goal is to make kernel modules be able to opt-in with any
> > > form of userfaultfd supports, like guest-memfd.  This alternative option
> > > should hopefully be cleaner, and avoid leaking userfault details into
> > > vm_ops.fault().
> > > 
> > > It also means this series does not depend on anything.  It's a pure
> > > refactoring of userfaultfd internals to provide a generic API, so that
> > > other types of files, especially RAM based, can support userfaultfd without
> > > touching mm/ at all.
> > > 
> > > To achieve it, this series introduced a file operation called vm_uffd_ops.
> > > The ops needs to be provided when a file type supports any of userfaultfd.
> > > 
> > > With that, I moved both hugetlbfs and shmem over, whenever possible.  So
> > > far due to concerns on exposing an uffd_copy() API, the MISSING faults are
> > > still separately processed and can only be done within mm/.  Hugetlbfs kept
> > > its special paths untouched.
> > > 
> > > An example of shmem uffd_ops:
> > > 
> > > static const struct vm_uffd_ops shmem_uffd_ops = {
> > > 	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
> > > 					BIT(_UFFDIO_ZEROPAGE) |
> > > 					BIT(_UFFDIO_WRITEPROTECT) |
> > > 					BIT(_UFFDIO_CONTINUE) |
> > > 					BIT(_UFFDIO_POISON),
> > > 	.minor_get_folio	=	shmem_uffd_get_folio,
> > > };

I think you forgot to add the link to the guest_memfd implementation [1]
to your cover letter.

> > 
> > This looks better than the previous version to me.
> > 
> > Long term the goal should be to move all hugetlb/shmem specific stuff out of
> > mm/hugetlb.c and of course, we won't be adding any new ones to
> > mm/userfaultfd.c
> > 
> > I agree with Liam that a better interface could be providing default
> > handlers for the separate ioctls [1], but there is always the option to
> > evolve this interface into something like that later.
> 
> Thanks for accepting this current form.
> 
> > 
> > 
> > [1] https://lkml.kernel.org/r/frnos5jtmlqvzpcrredcoummuzvllweku5dgp5ii5in6epwnw5@anu4dqsz6shy
> 
> I have replied to that, here:
> 
> https://lore.kernel.org/all/aOVEDii4HPB6outm@x1.local/
> 
> If we ignore hugetlbfs, most of the hooks may not be needed, as explained.

Those were examples.

Hooks allow for all the memory type checking to go away in the code,
which allows for more readable code and less operations per call.

> 
> If we introduce hooks only for hugetlbfs, IMHO it's going backwards.  When
> we want to get rid of hugetlbfs paths, we will have something more to get
> rid of..

This is just wrong.

It is far easier to remove one function pointer than go through all the
code and remove the checks for hugetlbfs.

Are you thinking the hooks will just point to the generic function?
This is the only way I can see your statement making sense.  That's not
the idea I'm trying to communicate.

The idea is that you split the functions into parts that everyone does
and special parts, then call them in the correct sequence for each type.
New types need new special parts while using the generic code for the
majority of the work.

In this way, the memory types are modularized into function pointers
that all use common code without adding complexity.  In fact, knowing
implicitly which context from call path means we don't need to check the
types and should be able to reduce the complexity.

Then adding a new memory type will call almost all the same functions
except for special areas.

Removing old memory types would me removing the special areas only - and
maybe a function pointer if they are the only user.

The current patch set does not modularizing memory, it is creating a
middleware level where we have to parse a value to figure out what to
do.

These patches DO expose a method for memory types to be coded in a
kernel module, which is fundamentally different than modularizing the
memory types.  Different enough to be glossed over on a ML by looking at
the subject alone.

Yes, one value is better than two values, but no magic values is ideal.

Is it a significant amount of work to remove the magic value by
fragmenting the code into memory type specific function pointers?

IOW, instead of decoding the value to figure out where to route calls,
just expose the calls directly in the function pointer layer that you
are creating?  What is the minimum amount of function pointers to get
the guest_memfd to work without this value being parsed?

[1].  https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/

Regards,
Liam



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-21 15:51     ` Liam R. Howlett
@ 2025-10-21 16:28       ` Peter Xu
  2025-10-30 17:13         ` Liam R. Howlett
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-21 16:28 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

On Tue, Oct 21, 2025 at 11:51:33AM -0400, Liam R. Howlett wrote:
> * Peter Xu <peterx@redhat.com> [251020 10:12]:
> > On Mon, Oct 20, 2025 at 03:34:47PM +0200, David Hildenbrand wrote:
> > > On 15.10.25 01:14, Peter Xu wrote:
> > > > [based on latest akpm/mm-new of Oct 14th, commit 36c6c5ce1b275]
> > > > 
> > > > v4:
> > > > - Some cleanups within vma_can_userfault() [David]
> > > > - Rename uffd_get_folio() to minor_get_folio() [David]
> > > > - Remove uffd_features in vm_uffd_ops, deduce it from supported ioctls [David]
> > > > 
> > > > v1: https://lore.kernel.org/r/20250620190342.1780170-1-peterx@redhat.com
> > > > v2: https://lore.kernel.org/r/20250627154655.2085903-1-peterx@redhat.com
> > > > v3: https://lore.kernel.org/r/20250926211650.525109-1-peterx@redhat.com
> > > > 
> > > > This series is an alternative proposal of what Nikita proposed here on the
> > > > initial three patches:
> > > > 
> > > >    https://lore.kernel.org/r/20250404154352.23078-1-kalyazin@amazon.com
> > > > 
> > > > This is not yet relevant to any guest-memfd support, but paving way for it.
> > > > Here, the major goal is to make kernel modules be able to opt-in with any
> > > > form of userfaultfd supports, like guest-memfd.  This alternative option
> > > > should hopefully be cleaner, and avoid leaking userfault details into
> > > > vm_ops.fault().
> > > > 
> > > > It also means this series does not depend on anything.  It's a pure
> > > > refactoring of userfaultfd internals to provide a generic API, so that
> > > > other types of files, especially RAM based, can support userfaultfd without
> > > > touching mm/ at all.
> > > > 
> > > > To achieve it, this series introduced a file operation called vm_uffd_ops.
> > > > The ops needs to be provided when a file type supports any of userfaultfd.
> > > > 
> > > > With that, I moved both hugetlbfs and shmem over, whenever possible.  So
> > > > far due to concerns on exposing an uffd_copy() API, the MISSING faults are
> > > > still separately processed and can only be done within mm/.  Hugetlbfs kept
> > > > its special paths untouched.
> > > > 
> > > > An example of shmem uffd_ops:
> > > > 
> > > > static const struct vm_uffd_ops shmem_uffd_ops = {
> > > > 	.supported_ioctls	=	BIT(_UFFDIO_COPY) |
> > > > 					BIT(_UFFDIO_ZEROPAGE) |
> > > > 					BIT(_UFFDIO_WRITEPROTECT) |
> > > > 					BIT(_UFFDIO_CONTINUE) |
> > > > 					BIT(_UFFDIO_POISON),
> > > > 	.minor_get_folio	=	shmem_uffd_get_folio,
> > > > };
> 
> I think you forgot to add the link to the guest_memfd implementation [1]
> to your cover letter.

I didn't.

https://lore.kernel.org/all/20251014231501.2301398-1-peterx@redhat.com/

    To show another sample, this is the patch that Nikita posted to implement
    minor fault for guest-memfd (on top of older versions of this series):

      https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/


> 
> > > 
> > > This looks better than the previous version to me.
> > > 
> > > Long term the goal should be to move all hugetlb/shmem specific stuff out of
> > > mm/hugetlb.c and of course, we won't be adding any new ones to
> > > mm/userfaultfd.c
> > > 
> > > I agree with Liam that a better interface could be providing default
> > > handlers for the separate ioctls [1], but there is always the option to
> > > evolve this interface into something like that later.
> > 
> > Thanks for accepting this current form.
> > 
> > > 
> > > 
> > > [1] https://lkml.kernel.org/r/frnos5jtmlqvzpcrredcoummuzvllweku5dgp5ii5in6epwnw5@anu4dqsz6shy
> > 
> > I have replied to that, here:
> > 
> > https://lore.kernel.org/all/aOVEDii4HPB6outm@x1.local/
> > 
> > If we ignore hugetlbfs, most of the hooks may not be needed, as explained.
> 
> Those were examples.
> 
> Hooks allow for all the memory type checking to go away in the code,
> which allows for more readable code and less operations per call.
> 
> > 
> > If we introduce hooks only for hugetlbfs, IMHO it's going backwards.  When
> > we want to get rid of hugetlbfs paths, we will have something more to get
> > rid of..
> 
> This is just wrong.
> 
> It is far easier to remove one function pointer than go through all the
> code and remove the checks for hugetlbfs.
> 
> Are you thinking the hooks will just point to the generic function?
> This is the only way I can see your statement making sense.  That's not
> the idea I'm trying to communicate.
> 
> The idea is that you split the functions into parts that everyone does
> and special parts, then call them in the correct sequence for each type.
> New types need new special parts while using the generic code for the
> majority of the work.
> 
> In this way, the memory types are modularized into function pointers
> that all use common code without adding complexity.  In fact, knowing
> implicitly which context from call path means we don't need to check the
> types and should be able to reduce the complexity.
> 
> Then adding a new memory type will call almost all the same functions
> except for special areas.
> 
> Removing old memory types would me removing the special areas only - and
> maybe a function pointer if they are the only user.
> 
> The current patch set does not modularizing memory, it is creating a
> middleware level where we have to parse a value to figure out what to
> do.
> 
> These patches DO expose a method for memory types to be coded in a
> kernel module, which is fundamentally different than modularizing the
> memory types.  Different enough to be glossed over on a ML by looking at
> the subject alone.
> 
> Yes, one value is better than two values, but no magic values is ideal.
> 
> Is it a significant amount of work to remove the magic value by
> fragmenting the code into memory type specific function pointers?
> 
> IOW, instead of decoding the value to figure out where to route calls,
> just expose the calls directly in the function pointer layer that you
> are creating?  What is the minimum amount of function pointers to get
> the guest_memfd to work without this value being parsed?
> 
> [1].  https://lore.kernel.org/all/114133f5-0282-463d-9d65-3143aa658806@amazon.com/

I don't know what you're looking for.

I think I got most acks from userfaultfd developers whoever were active in
the past few years, ever since v1...

Then, we got some concern on uffd_copy() API being complicated, it's fine,
I dropped it.

We got some other concern on having a function returning folio pointer.  We
talked it all through, luckily, even if I do not know what really happened.

Now, I really don't know what you're suggesting here.

Can you send some patches and show us the code, help everyone to support
guest-memfd minor fault, please?

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-21 16:28       ` Peter Xu
@ 2025-10-30 17:13         ` Liam R. Howlett
  2025-10-30 18:00           ` Nikita Kalyazin
  2025-10-30 19:07           ` Peter Xu
  0 siblings, 2 replies; 23+ messages in thread
From: Liam R. Howlett @ 2025-10-30 17:13 UTC (permalink / raw)
  To: Peter Xu
  Cc: David Hildenbrand, linux-kernel, linux-mm, Mike Rapoport,
	Muchun Song, Nikita Kalyazin, Vlastimil Babka, Axel Rasmussen,
	Andrew Morton, James Houghton, Lorenzo Stoakes, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

* Peter Xu <peterx@redhat.com> [251021 12:28]:

...

> Can you send some patches and show us the code, help everyone to support
> guest-memfd minor fault, please?

Patches are here:

https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem

This is actually modularized memory types.  That means there is no
hugetlb.h or shmem.h included in mm/userfaultfd.c code.

uffd_flag_t has been removed.  This was turning into a middleware and
it is not necessary.  Neither is supported_ioctls.

hugetlb now uses the same functions as every other memory type,
including anon memory.

Any memory type can change functionality without adding instructions or
flags or anything to some other code.

This code passes uffd-unit-test and uffd-wp-mremap (skipped the swap
tests).

guest-memfd can implement whatever it needs to (or use others
implementations), like shmem_uffd_ops here:

static const struct vm_uffd_ops shmem_uffd_ops = {
        .copy                   =       shmem_mfill_atomic_pte_copy,
        .zeropage               =       shmem_mfill_atomic_pte_zeropage,
        .cont                   =       shmem_mfill_atomic_pte_continue,
        .poison                 =       mfill_atomic_pte_poison,
        .writeprotect           =       uffd_writeprotect,
        .is_dst_valid           =       shmem_is_dst_valid,
        .increment              =       mfill_size,
        .failed_do_unlock       =       uffd_failed_do_unlock,
        .page_shift             =       uffd_page_shift,
        .complete_register      =       uffd_complete_register,
};   

Where guest-memfd needs to write the one function:
guest_memfd_pte_continue(), from what I understand.

Obviously some of the shmem_ functions would need to be added to a
header, or such.

And most of that can come from shmem_mfill_atomic_pte_continue(), from
what I understand.  This is about 40 lines of code, but may require
exposing some shmem functions to keep the code that compact.

So we don't need to expose getting a folio to a module, or decode any
special flags or whatever.  We just call the function that needs to be
called on the vma that is found.

If anyone has tests I can use for guest-memfd and instructions on
guest-memfd setup, I'll just write it instead of expanding the
userfaultfd middleware.

Thanks,
Liam



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 17:13         ` Liam R. Howlett
@ 2025-10-30 18:00           ` Nikita Kalyazin
  2025-10-30 19:07           ` Peter Xu
  1 sibling, 0 replies; 23+ messages in thread
From: Nikita Kalyazin @ 2025-10-30 18:00 UTC (permalink / raw)
  To: Liam R. Howlett, Peter Xu, David Hildenbrand, linux-kernel,
	linux-mm, Mike Rapoport, Muchun Song, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli



On 30/10/2025 17:13, Liam R. Howlett wrote:
> * Peter Xu <peterx@redhat.com> [251021 12:28]:
> 
> ...
> 
>> Can you send some patches and show us the code, help everyone to support
>> guest-memfd minor fault, please?
> 
> Patches are here:
> 
> https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> 
> This is actually modularized memory types.  That means there is no
> hugetlb.h or shmem.h included in mm/userfaultfd.c code.
> 
> uffd_flag_t has been removed.  This was turning into a middleware and
> it is not necessary.  Neither is supported_ioctls.
> 
> hugetlb now uses the same functions as every other memory type,
> including anon memory.
> 
> Any memory type can change functionality without adding instructions or
> flags or anything to some other code.
> 
> This code passes uffd-unit-test and uffd-wp-mremap (skipped the swap
> tests).
> 
> guest-memfd can implement whatever it needs to (or use others
> implementations), like shmem_uffd_ops here:
> 
> static const struct vm_uffd_ops shmem_uffd_ops = {
>          .copy                   =       shmem_mfill_atomic_pte_copy,
>          .zeropage               =       shmem_mfill_atomic_pte_zeropage,
>          .cont                   =       shmem_mfill_atomic_pte_continue,
>          .poison                 =       mfill_atomic_pte_poison,
>          .writeprotect           =       uffd_writeprotect,
>          .is_dst_valid           =       shmem_is_dst_valid,
>          .increment              =       mfill_size,
>          .failed_do_unlock       =       uffd_failed_do_unlock,
>          .page_shift             =       uffd_page_shift,
>          .complete_register      =       uffd_complete_register,
> };
> 
> Where guest-memfd needs to write the one function:
> guest_memfd_pte_continue(), from what I understand.
> 
> Obviously some of the shmem_ functions would need to be added to a
> header, or such.
> 
> And most of that can come from shmem_mfill_atomic_pte_continue(), from
> what I understand.  This is about 40 lines of code, but may require
> exposing some shmem functions to keep the code that compact.
> 
> So we don't need to expose getting a folio to a module, or decode any
> special flags or whatever.  We just call the function that needs to be
> called on the vma that is found.
> 
> If anyone has tests I can use for guest-memfd and instructions on
> guest-memfd setup, I'll just write it instead of expanding the
> userfaultfd middleware.

I used to use [1] as a test.  The test function can be called from inside

         if (flags & GUEST_MEMFD_FLAG_MMAP) {
                 test_mmap_supported(fd, page_size, total_size);
                 test_fault_overflow(fd, page_size, total_size);
         } else {

in your tree (tools/testing/selftests/kvm/guest_memfd_test.c).

CONFIG_KVM_GUEST_MEMFD=y should be sufficient to enable guest_memfd support.

[1] https://lore.kernel.org/kvm/20250404154352.23078-7-kalyazin@amazon.com/

> 
> Thanks,
> Liam
> 



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 17:13         ` Liam R. Howlett
  2025-10-30 18:00           ` Nikita Kalyazin
@ 2025-10-30 19:07           ` Peter Xu
  2025-10-30 19:55             ` Peter Xu
  2025-10-30 20:24             ` Liam R. Howlett
  1 sibling, 2 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-30 19:07 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

On Thu, Oct 30, 2025 at 01:13:24PM -0400, Liam R. Howlett wrote:
> * Peter Xu <peterx@redhat.com> [251021 12:28]:
> 
> ...
> 
> > Can you send some patches and show us the code, help everyone to support
> > guest-memfd minor fault, please?
> 
> Patches are here:
> 
> https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem

Great!  Finally we have something solid to discuss on top.

Yes, I'm extremely happy to see whatever code there is, I'm happy to review
it.  I'm happy to see it rolling.  If it is better, we can adopt it.

> 
> This is actually modularized memory types.  That means there is no
> hugetlb.h or shmem.h included in mm/userfaultfd.c code.
> 
> uffd_flag_t has been removed.  This was turning into a middleware and
> it is not necessary.  Neither is supported_ioctls.
> 
> hugetlb now uses the same functions as every other memory type,
> including anon memory.
> 
> Any memory type can change functionality without adding instructions or
> flags or anything to some other code.
> 
> This code passes uffd-unit-test and uffd-wp-mremap (skipped the swap
> tests).
> 
> guest-memfd can implement whatever it needs to (or use others
> implementations), like shmem_uffd_ops here:

I didn't look at the patches in details, however I already commented
previously on a similar comment you left.  Since you have solid code this
time, let me ask one by one clearly this time inline:

> 
> static const struct vm_uffd_ops shmem_uffd_ops = {
>         .copy                   =       shmem_mfill_atomic_pte_copy,

This is optional, if you did the convertion it's perfect (though it's buggy
right now, more below).  Yes, UFFDIO_COPY might be a good fit for a global
API like this, however that's the least useful, as I mentioned, I do not
expect a new user..

It means, what you did on this may not grow any user.  The whole change may
not be useful to anyone..

Then I see what you introduced as the API:

struct vm_uffd_ops {
	int (*copy)(struct uffd_info *info);
	int (*zeropage)(struct uffd_info *info);
	int (*cont)(struct uffd_info *info);
	int (*poison)(struct uffd_info *info);
	int (*writeprotect)(struct uffd_info *info);
	/* Required features below */
	ssize_t (*is_dst_valid)(struct vm_area_struct *dst_vma,
				unsigned long dst_start, unsigned long len);
	unsigned long (*increment)(struct vm_area_struct *vma);
	ssize_t (*failed_do_unlock)(struct uffd_info *info);
	unsigned int (*page_shift)(struct vm_area_struct *src_vma);
	void (*complete_register)(struct vm_area_struct *vma);
};

The copy() interface (and most of the rest) takes something called
uffd_info*.  Then it looks like this:

struct uffd_info {
	unsigned long dst_addr;			/* Target address */
	unsigned long src_addr;			/* Source address */
	unsigned long len;			/* Total length to copy */
	unsigned long src_last;			/* starting src_addr + len */
	unsigned long dst_last;			/* starting dst_addr + len */
	struct folio *foliop;			/* folio pointer for retry */
	struct userfaultfd_ctx *ctx;		/* The userfaultfd context */

	struct vm_area_struct *dst_vma;		/* Target vma */
	unsigned long increment;		/* Size of each operation */
	bool wp;				/* Operation is requesting write protection */
	const struct vm_uffd_ops *uffd_ops;	/* The vma uffd_ops pointer */
	int (*op)(struct uffd_info *info);	/* The operation to perform on the dst */
	long copied;
};

You went almost mad when I introduced uffd_copy() in v1.  It might be
because there used to have pmd* and something around pgtables.  However
I'll still need to question whether this is a better and easier to adopt
interface if a mem type wants to opt-in any uffd features.

So are you happy yourself now with above complicated struct that memtype
needs to implement and support?

>         .zeropage               =       shmem_mfill_atomic_pte_zeropage,

Why a memory type needs to provide a separate hook to inject a zeropage?
It should almost always be the same of UFFDIO_COPY except copying.

>         .cont                   =       shmem_mfill_atomic_pte_continue,

It's OK to have it.  However said that, what we really need is "fetching a
cache folio".  I'll need to check how you exposed the userfaultfd helpers
so that it will support mem types to opt-in this.  To me, this is really an
overkill.

Shmem impl:

static int shmem_mfill_atomic_pte_continue(struct uffd_info *info)
{
	struct vm_area_struct *dst_vma = info->dst_vma;
	struct inode *inode = file_inode(dst_vma->vm_file);
	pgoff_t pgoff = linear_page_index(dst_vma, info->dst_addr);
	pmd_t *dst_pmd;
	struct folio *folio;
	struct page *page;
	int ret;

	ret = uffd_get_dst_pmd(dst_vma, info->dst_addr, &dst_pmd);
	if (ret)
		return ret;

	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
	/* Our caller expects us to return -EFAULT if we failed to find folio */
	if (ret == -ENOENT)
		ret = -EFAULT;
	if (ret)
		goto out;
	if (!folio) {
		ret = -EFAULT;
		goto out;
	}

	page = folio_file_page(folio, pgoff);
	if (PageHWPoison(page)) {
		ret = -EIO;
		goto out_release;
	}

	ret = mfill_atomic_install_pte(dst_pmd, dst_vma, info->dst_addr,
				       page, false, info->wp);
	if (ret)
		goto out_release;

	folio_unlock(folio);
	ret = 0;
out:
	return ret;
out_release:
	folio_unlock(folio);
	folio_put(folio);
	goto out;
}

So are you sure this is better than the oneliner?

In your new API, the driver also needs to operate on pmd*.  Is it a concern
to you?  Maybe you don't think it's a concern now, even if you used to
think uffd_copy() has concerns exposing pmd* pointers?

The current series I proposed is not only simpler, but only expose folio*.
That's at least safer at least from your theory, is that right?

>         .poison                 =       mfill_atomic_pte_poison,

Why this is needed if UFFDIO_POISON should exactly do the same thing for
each memory type?

>         .writeprotect           =       uffd_writeprotect,

Same question.  Wr-protect over a pagecache folio should really behave the
same.  Why do you need to introduce it?

After all, even in your branch you reused change_protection(), where the
hugetlb special operations reside.  I don't see much point on why it's
needed.

>         .is_dst_valid           =       shmem_is_dst_valid,

It's definitely not obvious what this is for.

Looks like it's trying to verify vma validity.  However then I see shmem
impl has this:

static ssize_t shmem_is_dst_valid(struct vm_area_struct *dst_vma,
		unsigned long dst_start, unsigned long len)
{
	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
	    return -EINVAL;

	return 0;
}

Why shmem allows anon vma?

>         .increment              =       mfill_size,

This is almost only useful for hugetlbfs, same to page_shift and
complete_register.

It's ok if you want to clean hugetlbfs.  But if we want to fetch the
granule of the folios inside a vma, IMHO we should make it a vma attribute,
not something special to userfaultfd.

>         .failed_do_unlock       =       uffd_failed_do_unlock,

You'd better at least change the name of it.  It so far unlocks some
misterious locks then try to copy the folio without mmap/vma lock.

inline ssize_t
uffd_failed_do_unlock(struct uffd_info *info)
{
	ssize_t err;
	void *kaddr;

	up_read(&info->ctx->map_changing_lock);
	uffd_mfill_unlock(info->dst_vma);
	VM_WARN_ON_ONCE(!info->foliop);

	kaddr = kmap_local_folio(info->foliop, 0);
	err = copy_from_user(kaddr, (const void __user *) info->src_addr,
			     PAGE_SIZE);
	kunmap_local(kaddr);
	if (unlikely(err))
		return -EFAULT;

	flush_dcache_folio(info->foliop);
	return 0;
}

How do the mem type know what locks to unlock if they do not lock it first
themselves?

>         .page_shift             =       uffd_page_shift,
>         .complete_register      =       uffd_complete_register,

Hugetlbfs specific hooks.  It's OK if you prefer rewritting code for
hugetlbfs.  I don't have objections.

> };   
> 
> Where guest-memfd needs to write the one function:
> guest_memfd_pte_continue(), from what I understand.

You did mention what is required in your new API:

	/* Required features below */
	ssize_t (*is_dst_valid)(struct vm_area_struct *dst_vma,
				unsigned long dst_start, unsigned long len);
	unsigned long (*increment)(struct vm_area_struct *vma);
	ssize_t (*failed_do_unlock)(struct uffd_info *info);
	unsigned int (*page_shift)(struct vm_area_struct *src_vma);
	void (*complete_register)(struct vm_area_struct *vma);

So guest-memfd needs to implement these 6 APIs to support minor fault.

> 
> Obviously some of the shmem_ functions would need to be added to a
> header, or such.
> 
> And most of that can come from shmem_mfill_atomic_pte_continue(), from
> what I understand.  This is about 40 lines of code, but may require
> exposing some shmem functions to keep the code that compact.
> 
> So we don't need to expose getting a folio to a module, or decode any
> special flags or whatever.  We just call the function that needs to be

I didn't reply before, but I don't think supported_ioctls is special flags
or magic values.  They're simply flags showing what the mem type supports
on the ioctls.  One can set it wrong, but I don't think what you proposed
with above 6 APIs would be easier to get right.  Any module can also make
things wrong in any of above.

UFFDIO_CONTINUE itself is definitely getting much more complicated, it used
to only set a flag in supported_ioctls, provide a oneliner to fetch a
cache.  Now it needs all above 6 APIs, one of them taking uffd_info* which
further contains 10+ fields to process.

> called on the vma that is found.
> 
> If anyone has tests I can use for guest-memfd and instructions on
> guest-memfd setup, I'll just write it instead of expanding the
> userfaultfd middleware.

Personally, this whole thing is an overkill to me:

$ git diff --stat 63f84ba525ea04ef376eac851efce2f82dd05f21..HEAD
 fs/userfaultfd.c              |  14 +--
 include/linux/hugetlb.h       |  21 ----
 include/linux/mm.h            |  11 ++
 include/linux/shmem_fs.h      |  14 ---
 include/linux/userfaultfd_k.h | 108 ++++++++++------
 mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
 mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
 mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
 8 files changed, 962 insertions(+), 679 deletions(-)

I can wait for some second opinions, but if you are confident this will be
welcomed by others, I suggest you prepare a minimum changeset preparing
support for guest-memfd minor fault, then post it formally upstream.

Thanks,

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 19:07           ` Peter Xu
@ 2025-10-30 19:55             ` Peter Xu
  2025-10-30 20:23               ` Lorenzo Stoakes
  2025-10-30 20:52               ` Liam R. Howlett
  2025-10-30 20:24             ` Liam R. Howlett
  1 sibling, 2 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-30 19:55 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

On Thu, Oct 30, 2025 at 03:07:18PM -0400, Peter Xu wrote:
> > Patches are here:
> > 
> > https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> 
> Great!  Finally we have something solid to discuss on top.
> 
> Yes, I'm extremely happy to see whatever code there is, I'm happy to review
> it.  I'm happy to see it rolling.  If it is better, we can adopt it.

So here is a summary of why I think my proposal is better:

- Much less code

  I think this is crystal clear..  I'm pasting once more in this summary
  email on what your proposal touches:

 fs/userfaultfd.c              |  14 +--
 include/linux/hugetlb.h       |  21 ----
 include/linux/mm.h            |  11 ++
 include/linux/shmem_fs.h      |  14 ---
 include/linux/userfaultfd_k.h | 108 ++++++++++------
 mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
 mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
 mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
 8 files changed, 962 insertions(+), 679 deletions(-)

- Much less future code

  The new proposal needs at least 6 APIs to implement even minor fault..
  One of the API needs to be implemented with uffd_info* which further
  includes 10+ fields to process.  It means we'll have a bunch of
  duplicated code in the future if new things pop up, so it's not only
  about what we merge.

- Much less exported functions to modules

  My solution, after exposing vm_uffd_ops, doesn't need to export any
  function.

  Your solution needs to export a lot of new functions to modules.  I
  didn't pay a lot of attention but the list should at least include these
  10 functions:
  
        void uffd_complete_register(struct vm_area_struct *vma);
        unsigned int uffd_page_shift(struct vm_area_struct *vma);
        int uffd_writeprotect(struct uffd_info *info);
        ssize_t uffd_failed_do_unlock(struct uffd_info *info);
        int uffd_atomic_pte_copy(struct folio *folio, unsigned long src_addr);
        unsigned long mfill_size(struct vm_area_struct *vma)
        int mfill_atomic_pte_poison(struct uffd_info *info);
        int mfill_atomic_pte_copy(struct uffd_info *info);
        int mfill_atomic_pte_zeropage(struct uffd_info *info);
        ssize_t uffd_get_dst_pmd(struct vm_area_struct *dst_vma, unsigned long dst_addr,pmd_t **dst_pmd);

  It's simply unnecessary.

- Less error prone

  At least to support minor fault, my solution only needs one hook fetching
  page cache, then set the CONTINUE ioctl in the supported_ioctls.

- Safer

  Your code allows to operate on pmd* in a module??? That's too risky and
  mm can explode!  Isn't it?

- Do not build new codes on top of hugetlbfs

  AFAICT, more than half of your solution's API is trying to service
  hugetlbfs.  IMHO that's the wrong way to go.  I explained to you multiple
  times.  We should either keep hugetlbfs alone, or having hugetlbfs adopt
  mm APIs instead.  We shouldn't build any new code only trying to service
  hugetlbfsv1 but nobody else.  We shouldn't introduce new mm API only to
  service hugetlbfs.

- Much less risk of breaking things

  I'm pretty sure my code introduce zero or very little bug, if there's
  one, I'll fix it, but really, likely not, because the changes are
  straightforward.

  Your changes are huge.  I would not be surprised you break things here
  and there.  I hope at least you will be around fixing them when it
  happens, even if we're not sure the benefits of most of the changes.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 19:55             ` Peter Xu
@ 2025-10-30 20:23               ` Lorenzo Stoakes
  2025-10-30 21:13                 ` Peter Xu
  2025-10-30 20:52               ` Liam R. Howlett
  1 sibling, 1 reply; 23+ messages in thread
From: Lorenzo Stoakes @ 2025-10-30 20:23 UTC (permalink / raw)
  To: Peter Xu
  Cc: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli, conduct

+cc CoC

Peter,

I'm sorry but your reply here is completely out of line.

I know tensions can run high sometimes, but this is a _good faith_ effort to try
to find a way forward.

Please take a step back and show some respect for the fact that Liam has put
VERY significant effort in preparing this after you _repeatedly_ asked him to
show him code.

I am starting to worry that your approach here is to bat off criticism by trying
to wear reviewers down and that's really not a good thing.

Again, this is _good faith_. Nobody is trying to unreasonably push back on these
changes, we are just trying to find the best solution possible.

Comments like:

'Your code allows to operate on pmd* in a module??? That's too risky and mm can
explode!  Isn't it?'

and 'that's the wrong way to go. I explained to you multiple times.'

and 'I'm pretty sure my code introduce zero or very little bug, if there's one, I'll
fix it, but really, likely not, because the changes are straightforward.'

vs. 'Your changes are huge.  I would not be surprised you break things here and
there.  I hope at least you will be around fixing them when it happens, even if
we're not sure the benefits of most of the changes.'

are just _entirely_ unhelpful and really unacceptable.

I have an extremely heavy workload at the moment anyway, but honestly
interactions like this have seriously put me off being involved in this review
personally.

Do we really want this to be how review in mm or the kernel is?

Is that really the culture we want to have here?

Thanks, Lorenzo


On Thu, Oct 30, 2025 at 03:55:44PM -0400, Peter Xu wrote:
> On Thu, Oct 30, 2025 at 03:07:18PM -0400, Peter Xu wrote:
> > > Patches are here:
> > >
> > > https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> >
> > Great!  Finally we have something solid to discuss on top.
> >
> > Yes, I'm extremely happy to see whatever code there is, I'm happy to review
> > it.  I'm happy to see it rolling.  If it is better, we can adopt it.
>
> So here is a summary of why I think my proposal is better:
>
> - Much less code
>
>   I think this is crystal clear..  I'm pasting once more in this summary
>   email on what your proposal touches:
>
>  fs/userfaultfd.c              |  14 +--
>  include/linux/hugetlb.h       |  21 ----
>  include/linux/mm.h            |  11 ++
>  include/linux/shmem_fs.h      |  14 ---
>  include/linux/userfaultfd_k.h | 108 ++++++++++------
>  mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
>  mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
>  mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
>  8 files changed, 962 insertions(+), 679 deletions(-)
>
> - Much less future code
>
>   The new proposal needs at least 6 APIs to implement even minor fault..
>   One of the API needs to be implemented with uffd_info* which further
>   includes 10+ fields to process.  It means we'll have a bunch of
>   duplicated code in the future if new things pop up, so it's not only
>   about what we merge.
>
> - Much less exported functions to modules
>
>   My solution, after exposing vm_uffd_ops, doesn't need to export any
>   function.
>
>   Your solution needs to export a lot of new functions to modules.  I
>   didn't pay a lot of attention but the list should at least include these
>   10 functions:
>
>         void uffd_complete_register(struct vm_area_struct *vma);
>         unsigned int uffd_page_shift(struct vm_area_struct *vma);
>         int uffd_writeprotect(struct uffd_info *info);
>         ssize_t uffd_failed_do_unlock(struct uffd_info *info);
>         int uffd_atomic_pte_copy(struct folio *folio, unsigned long src_addr);
>         unsigned long mfill_size(struct vm_area_struct *vma)
>         int mfill_atomic_pte_poison(struct uffd_info *info);
>         int mfill_atomic_pte_copy(struct uffd_info *info);
>         int mfill_atomic_pte_zeropage(struct uffd_info *info);
>         ssize_t uffd_get_dst_pmd(struct vm_area_struct *dst_vma, unsigned long dst_addr,pmd_t **dst_pmd);
>
>   It's simply unnecessary.
>
> - Less error prone
>
>   At least to support minor fault, my solution only needs one hook fetching
>   page cache, then set the CONTINUE ioctl in the supported_ioctls.
>
> - Safer
>
>   Your code allows to operate on pmd* in a module??? That's too risky and
>   mm can explode!  Isn't it?
>
> - Do not build new codes on top of hugetlbfs
>
>   AFAICT, more than half of your solution's API is trying to service
>   hugetlbfs.  IMHO that's the wrong way to go.  I explained to you multiple
>   times.  We should either keep hugetlbfs alone, or having hugetlbfs adopt
>   mm APIs instead.  We shouldn't build any new code only trying to service
>   hugetlbfsv1 but nobody else.  We shouldn't introduce new mm API only to
>   service hugetlbfs.
>
> - Much less risk of breaking things
>
>   I'm pretty sure my code introduce zero or very little bug, if there's
>   one, I'll fix it, but really, likely not, because the changes are
>   straightforward.
>
>   Your changes are huge.  I would not be surprised you break things here
>   and there.  I hope at least you will be around fixing them when it
>   happens, even if we're not sure the benefits of most of the changes.
>
> --
> Peter Xu
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 19:07           ` Peter Xu
  2025-10-30 19:55             ` Peter Xu
@ 2025-10-30 20:24             ` Liam R. Howlett
  2025-10-30 21:26               ` Peter Xu
  1 sibling, 1 reply; 23+ messages in thread
From: Liam R. Howlett @ 2025-10-30 20:24 UTC (permalink / raw)
  To: Peter Xu
  Cc: David Hildenbrand, linux-kernel, linux-mm, Mike Rapoport,
	Muchun Song, Nikita Kalyazin, Vlastimil Babka, Axel Rasmussen,
	Andrew Morton, James Houghton, Lorenzo Stoakes, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

* Peter Xu <peterx@redhat.com> [251030 15:07]:
> On Thu, Oct 30, 2025 at 01:13:24PM -0400, Liam R. Howlett wrote:
> > * Peter Xu <peterx@redhat.com> [251021 12:28]:
> > 
> > ...
> > 
> > > Can you send some patches and show us the code, help everyone to support
> > > guest-memfd minor fault, please?
> > 
> > Patches are here:
> > 
> > https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> 
> Great!  Finally we have something solid to discuss on top.
> 
> Yes, I'm extremely happy to see whatever code there is, I'm happy to review
> it.  I'm happy to see it rolling.  If it is better, we can adopt it.
> 
> > 
> > This is actually modularized memory types.  That means there is no
> > hugetlb.h or shmem.h included in mm/userfaultfd.c code.
> > 
> > uffd_flag_t has been removed.  This was turning into a middleware and
> > it is not necessary.  Neither is supported_ioctls.
> > 
> > hugetlb now uses the same functions as every other memory type,
> > including anon memory.
> > 
> > Any memory type can change functionality without adding instructions or
> > flags or anything to some other code.
> > 
> > This code passes uffd-unit-test and uffd-wp-mremap (skipped the swap
> > tests).
> > 
> > guest-memfd can implement whatever it needs to (or use others
> > implementations), like shmem_uffd_ops here:
> 
> I didn't look at the patches in details, however I already commented
> previously on a similar comment you left.  Since you have solid code this
> time, let me ask one by one clearly this time inline:
> 
> > 
> > static const struct vm_uffd_ops shmem_uffd_ops = {
> >         .copy                   =       shmem_mfill_atomic_pte_copy,
> 
> This is optional, if you did the convertion it's perfect (though it's buggy
> right now, more below).  Yes, UFFDIO_COPY might be a good fit for a global
> API like this, however that's the least useful, as I mentioned, I do not
> expect a new user..
> 
> It means, what you did on this may not grow any user.  The whole change may
> not be useful to anyone..

We currently have two users, that's why it's here.  It is useful because
we don't have hugetlb in the code anymore.  We don't even have shmem
code in the userfaultfd code.

> 
> Then I see what you introduced as the API:
> 
> struct vm_uffd_ops {
> 	int (*copy)(struct uffd_info *info);
> 	int (*zeropage)(struct uffd_info *info);
> 	int (*cont)(struct uffd_info *info);
> 	int (*poison)(struct uffd_info *info);
> 	int (*writeprotect)(struct uffd_info *info);
> 	/* Required features below */
> 	ssize_t (*is_dst_valid)(struct vm_area_struct *dst_vma,
> 				unsigned long dst_start, unsigned long len);
> 	unsigned long (*increment)(struct vm_area_struct *vma);
> 	ssize_t (*failed_do_unlock)(struct uffd_info *info);
> 	unsigned int (*page_shift)(struct vm_area_struct *src_vma);
> 	void (*complete_register)(struct vm_area_struct *vma);
> };
> 
> The copy() interface (and most of the rest) takes something called
> uffd_info*.  Then it looks like this:
> 
> struct uffd_info {
> 	unsigned long dst_addr;			/* Target address */
> 	unsigned long src_addr;			/* Source address */
> 	unsigned long len;			/* Total length to copy */
> 	unsigned long src_last;			/* starting src_addr + len */
> 	unsigned long dst_last;			/* starting dst_addr + len */
> 	struct folio *foliop;			/* folio pointer for retry */
> 	struct userfaultfd_ctx *ctx;		/* The userfaultfd context */
> 
> 	struct vm_area_struct *dst_vma;		/* Target vma */
> 	unsigned long increment;		/* Size of each operation */
> 	bool wp;				/* Operation is requesting write protection */
> 	const struct vm_uffd_ops *uffd_ops;	/* The vma uffd_ops pointer */
> 	int (*op)(struct uffd_info *info);	/* The operation to perform on the dst */
> 	long copied;
> };
> 
> You went almost mad when I introduced uffd_copy() in v1.  It might be
> because there used to have pmd* and something around pgtables.  However
> I'll still need to question whether this is a better and easier to adopt
> interface if a mem type wants to opt-in any uffd features.
> 
> So are you happy yourself now with above complicated struct that memtype
> needs to implement and support?

The entire thing is crazy, really.  I don't think modules should be
doing any of this.  It's nuts.  How can you register for a memory type
and not know how to deal with the memory type?  It makes no sense in the
first place.

The only reason that this is approaching acceptable is that memfd
already does the things it needs to do with it.  So we're trying to hand
something back to someone who already made the damn thing, but has no
way of finding it themselves.

Otherwise, this is stupid.

> 
> >         .zeropage               =       shmem_mfill_atomic_pte_zeropage,
> 
> Why a memory type needs to provide a separate hook to inject a zeropage?
> It should almost always be the same of UFFDIO_COPY except copying.

This also has two users.

> 
> >         .cont                   =       shmem_mfill_atomic_pte_continue,
> 
> It's OK to have it.  However said that, what we really need is "fetching a
> cache folio".  I'll need to check how you exposed the userfaultfd helpers
> so that it will support mem types to opt-in this.  To me, this is really an
> overkill.

This is literally doing what your subject said: modularize memory types.

> 
> Shmem impl:
> 
> static int shmem_mfill_atomic_pte_continue(struct uffd_info *info)
> {
> 	struct vm_area_struct *dst_vma = info->dst_vma;
> 	struct inode *inode = file_inode(dst_vma->vm_file);
> 	pgoff_t pgoff = linear_page_index(dst_vma, info->dst_addr);
> 	pmd_t *dst_pmd;
> 	struct folio *folio;
> 	struct page *page;
> 	int ret;
> 
> 	ret = uffd_get_dst_pmd(dst_vma, info->dst_addr, &dst_pmd);
> 	if (ret)
> 		return ret;
> 
> 	ret = shmem_get_folio(inode, pgoff, 0, &folio, SGP_NOALLOC);
> 	/* Our caller expects us to return -EFAULT if we failed to find folio */
> 	if (ret == -ENOENT)
> 		ret = -EFAULT;
> 	if (ret)
> 		goto out;
> 	if (!folio) {
> 		ret = -EFAULT;
> 		goto out;
> 	}
> 
> 	page = folio_file_page(folio, pgoff);
> 	if (PageHWPoison(page)) {
> 		ret = -EIO;
> 		goto out_release;
> 	}
> 
> 	ret = mfill_atomic_install_pte(dst_pmd, dst_vma, info->dst_addr,
> 				       page, false, info->wp);
> 	if (ret)
> 		goto out_release;
> 
> 	folio_unlock(folio);
> 	ret = 0;
> out:
> 	return ret;
> out_release:
> 	folio_unlock(folio);
> 	folio_put(folio);
> 	goto out;
> }
> 
> So are you sure this is better than the oneliner?
> 
> In your new API, the driver also needs to operate on pmd*.  Is it a concern
> to you?  Maybe you don't think it's a concern now, even if you used to
> think uffd_copy() has concerns exposing pmd* pointers?
> 
> The current series I proposed is not only simpler, but only expose folio*.
> That's at least safer at least from your theory, is that right?

I disagree, it's adding yet-another-mode to the code instead of function
pointers.  When do we stop bolting on and start fixing existing issues?

This change stands on its own without guest_memfd support.  Yours is an
anti-pattern.

> 
> >         .poison                 =       mfill_atomic_pte_poison,
> 
> Why this is needed if UFFDIO_POISON should exactly do the same thing for
> each memory type?

Yet, it doesn't so there's a function pointer for it.

> 
> >         .writeprotect           =       uffd_writeprotect,
> 
> Same question.  Wr-protect over a pagecache folio should really behave the
> same.  Why do you need to introduce it?

hugetlb is different.

> 
> After all, even in your branch you reused change_protection(), where the
> hugetlb special operations reside.  I don't see much point on why it's
> needed.

hugetlb uses hugetlb_change_protection().

> 
> >         .is_dst_valid           =       shmem_is_dst_valid,
> 
> It's definitely not obvious what this is for.

is_dst_valid checks to see if the destination is valid.  I can add a
comment if you want.

> 
> Looks like it's trying to verify vma validity.  However then I see shmem
> impl has this:
> 
> static ssize_t shmem_is_dst_valid(struct vm_area_struct *dst_vma,
> 		unsigned long dst_start, unsigned long len)
> {
> 	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
> 	    return -EINVAL;
> 
> 	return 0;
> }
> 
> Why shmem allows anon vma?

We can drop the !vma_is_shmem() check from here, but I just moved them.
This is an existing check today.


> 
> >         .increment              =       mfill_size,
> 
> This is almost only useful for hugetlbfs, same to page_shift and
> complete_register.

Correct.

> 
> It's ok if you want to clean hugetlbfs.  But if we want to fetch the
> granule of the folios inside a vma, IMHO we should make it a vma attribute,
> not something special to userfaultfd.

If there is a user outside of uffd, then we can move it.  Until then,
let's not pollute outside of mm/userfaultfd.c please.

> 
> >         .failed_do_unlock       =       uffd_failed_do_unlock,
> 
> You'd better at least change the name of it.  It so far unlocks some
> misterious locks then try to copy the folio without mmap/vma lock.
> 
> inline ssize_t
> uffd_failed_do_unlock(struct uffd_info *info)
> {
> 	ssize_t err;
> 	void *kaddr;
> 
> 	up_read(&info->ctx->map_changing_lock);
> 	uffd_mfill_unlock(info->dst_vma);
> 	VM_WARN_ON_ONCE(!info->foliop);
> 
> 	kaddr = kmap_local_folio(info->foliop, 0);
> 	err = copy_from_user(kaddr, (const void __user *) info->src_addr,
> 			     PAGE_SIZE);
> 	kunmap_local(kaddr);
> 	if (unlikely(err))
> 		return -EFAULT;
> 
> 	flush_dcache_folio(info->foliop);
> 	return 0;
> }
> 
> How do the mem type know what locks to unlock if they do not lock it first
> themselves?

We can change this if you want to move the lock to the internal code,
but the failed exit due to the folio mess that copy makes required this
block of code before.

hugetlb has a different block of code.


> 
> >         .page_shift             =       uffd_page_shift,
> >         .complete_register      =       uffd_complete_register,
> 
> Hugetlbfs specific hooks.  It's OK if you prefer rewritting code for
> hugetlbfs.  I don't have objections.

Yes, hugetlb is a memory type and so it is also modularized.

> 
> > };   
> > 
> > Where guest-memfd needs to write the one function:
> > guest_memfd_pte_continue(), from what I understand.
> 
> You did mention what is required in your new API:
> 
> 	/* Required features below */
> 	ssize_t (*is_dst_valid)(struct vm_area_struct *dst_vma,
> 				unsigned long dst_start, unsigned long len);
> 	unsigned long (*increment)(struct vm_area_struct *vma);
> 	ssize_t (*failed_do_unlock)(struct uffd_info *info);
> 	unsigned int (*page_shift)(struct vm_area_struct *src_vma);
> 	void (*complete_register)(struct vm_area_struct *vma);
> 
> So guest-memfd needs to implement these 6 APIs to support minor fault.

Or, you know, call the existing functions.

> 
> > 
> > Obviously some of the shmem_ functions would need to be added to a
> > header, or such.
> > 
> > And most of that can come from shmem_mfill_atomic_pte_continue(), from
> > what I understand.  This is about 40 lines of code, but may require
> > exposing some shmem functions to keep the code that compact.
> > 
> > So we don't need to expose getting a folio to a module, or decode any
> > special flags or whatever.  We just call the function that needs to be
> 
> I didn't reply before, but I don't think supported_ioctls is special flags
> or magic values.  They're simply flags showing what the mem type supports
> on the ioctls.  One can set it wrong, but I don't think what you proposed
> with above 6 APIs would be easier to get right.  Any module can also make
> things wrong in any of above.
> 
> UFFDIO_CONTINUE itself is definitely getting much more complicated, it used
> to only set a flag in supported_ioctls, provide a oneliner to fetch a
> cache.  Now it needs all above 6 APIs, one of them taking uffd_info* which
> further contains 10+ fields to process.
> 
> > called on the vma that is found.
> > 
> > If anyone has tests I can use for guest-memfd and instructions on
> > guest-memfd setup, I'll just write it instead of expanding the
> > userfaultfd middleware.
> 
> Personally, this whole thing is an overkill to me:

Absolutely.  And it needed to be done.  You are welcome.

> 
> $ git diff --stat 63f84ba525ea04ef376eac851efce2f82dd05f21..HEAD
>  fs/userfaultfd.c              |  14 +--
>  include/linux/hugetlb.h       |  21 ----
>  include/linux/mm.h            |  11 ++
>  include/linux/shmem_fs.h      |  14 ---
>  include/linux/userfaultfd_k.h | 108 ++++++++++------
>  mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
>  mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
>  mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
>  8 files changed, 962 insertions(+), 679 deletions(-)


Right, so the existing code a huge mess today and you won't fix
anything, ever.

We currently have a uffd_flags_t that sets wp or not then passes it
through to set another part of the flag to pass it through again.  This
is stupid.

And then it gets more stupid.  We then check that flag exactly once for
the second argument in a global function and NEVER LOOK AT THAT PART
AGAIN.  So we now have a type to contain a boolean, but we keep passing
through the uffd_flags_t, so we can see if wp is set or not.

And you're fine with it.  In fact, let's add another one.  I mean, we
tried for two but people didn't like two so lets whine and whine and
whine until people get frustrated and let you push a second one of those
gems into the code you REFUSED TO MAINTAIN.

This is what owning a problem looks like: I removed the uufd_flags_t,
because it's stupid.  I removed all the hugetlb checks because I
modularized the memory types.

I'm happy to address changes, but I'm not happy to accept more
middleware and "it's not part of the patch set" to address any problem
as you push more trash into an already horrible code base.

We need to fix things too.

So I'm fixing it.
Liam


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 19:55             ` Peter Xu
  2025-10-30 20:23               ` Lorenzo Stoakes
@ 2025-10-30 20:52               ` Liam R. Howlett
  2025-10-30 21:33                 ` Peter Xu
  1 sibling, 1 reply; 23+ messages in thread
From: Liam R. Howlett @ 2025-10-30 20:52 UTC (permalink / raw)
  To: Peter Xu
  Cc: David Hildenbrand, linux-kernel, linux-mm, Mike Rapoport,
	Muchun Song, Nikita Kalyazin, Vlastimil Babka, Axel Rasmussen,
	Andrew Morton, James Houghton, Lorenzo Stoakes, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli

* Peter Xu <peterx@redhat.com> [251030 15:56]:
> On Thu, Oct 30, 2025 at 03:07:18PM -0400, Peter Xu wrote:
> > > Patches are here:
> > > 
> > > https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> > 
> > Great!  Finally we have something solid to discuss on top.
> > 
> > Yes, I'm extremely happy to see whatever code there is, I'm happy to review
> > it.  I'm happy to see it rolling.  If it is better, we can adopt it.
> 
> So here is a summary of why I think my proposal is better:
> 
> - Much less code
> 
>   I think this is crystal clear..  I'm pasting once more in this summary
>   email on what your proposal touches:
> 
>  fs/userfaultfd.c              |  14 +--
>  include/linux/hugetlb.h       |  21 ----
>  include/linux/mm.h            |  11 ++
>  include/linux/shmem_fs.h      |  14 ---
>  include/linux/userfaultfd_k.h | 108 ++++++++++------
>  mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
>  mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
>  mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
>  8 files changed, 962 insertions(+), 679 deletions(-)

Including the following highlights:
 -#include <linux/hugetlb.h>

and

 -typedef unsigned int __bitwise uffd_flags_t;


> 
> - Much less future code
> 
>   The new proposal needs at least 6 APIs to implement even minor fault..
>   One of the API needs to be implemented with uffd_info* which further
>   includes 10+ fields to process.  It means we'll have a bunch of
>   duplicated code in the future if new things pop up, so it's not only
>   about what we merge.

You can reuse existing functions if there is no change.

> 
> - Much less exported functions to modules

I haven't exported anything.  You asked for code and I provided it.
This doesn't do what guest_memfd needs as it is.  This is all clean up
you wouldn't do.

> 
>   My solution, after exposing vm_uffd_ops, doesn't need to export any
>   function.
> 
>   Your solution needs to export a lot of new functions to modules.  I
>   didn't pay a lot of attention but the list should at least include these
>   10 functions:
>   
>         void uffd_complete_register(struct vm_area_struct *vma);
>         unsigned int uffd_page_shift(struct vm_area_struct *vma);
>         int uffd_writeprotect(struct uffd_info *info);
>         ssize_t uffd_failed_do_unlock(struct uffd_info *info);
>         int uffd_atomic_pte_copy(struct folio *folio, unsigned long src_addr);
>         unsigned long mfill_size(struct vm_area_struct *vma)
>         int mfill_atomic_pte_poison(struct uffd_info *info);
>         int mfill_atomic_pte_copy(struct uffd_info *info);
>         int mfill_atomic_pte_zeropage(struct uffd_info *info);
>         ssize_t uffd_get_dst_pmd(struct vm_area_struct *dst_vma, unsigned long dst_addr,pmd_t **dst_pmd);
> 
>   It's simply unnecessary.

Maybe we don't export any of them.  Maybe there's another function
pointer that could be checked or overwritten?  We can do that without a
flag or setting or wahtever the name you used for your flag was.

> 
> - Less error prone
> 
>   At least to support minor fault, my solution only needs one hook fetching
>   page cache, then set the CONTINUE ioctl in the supported_ioctls.

Your code just adds more junk to uffd, and fails to modularize anything
beyond getting a folio.  And you only support certain types and places.

> 
> - Safer
> 
>   Your code allows to operate on pmd* in a module??? That's too risky and
>   mm can explode!  Isn't it?

Again. I didn't export anything.

> 
> - Do not build new codes on top of hugetlbfs
> 
>   AFAICT, more than half of your solution's API is trying to service
>   hugetlbfs.  IMHO that's the wrong way to go.  I explained to you multiple
>   times.  We should either keep hugetlbfs alone, or having hugetlbfs adopt
>   mm APIs instead.  We shouldn't build any new code only trying to service
>   hugetlbfsv1 but nobody else.  We shouldn't introduce new mm API only to
>   service hugetlbfs.

Ignoring hugetlb exists is a problem.  I have removed the hugetlb from
being included in uffd while you have left it in its own loop.  This
doesn't build new things on hugetlb, it moves the code for hugetlb out
of mm/userfaultfd.c - how is it building on hugetlb?

Believe it or not, hugetlb is a memory type.

Certainly smaller changes are inherently less bug prone.  I honestly
think all of what I have here needs to be done, regardless of memfd
support.  I cannot believe that things were allowed to be pushed this
far.  I do not think they should be allowed to go further.

> 
> - Much less risk of breaking things
> 
>   I'm pretty sure my code introduce zero or very little bug, if there's
>   one, I'll fix it, but really, likely not, because the changes are
>   straightforward.
> 
>   Your changes are huge.  I would not be surprised you break things here
>   and there.  I hope at least you will be around fixing them when it
>   happens, even if we're not sure the benefits of most of the changes.

I have always been prompt in fixing my issues and have taken
responsibility for anything I've written.  I maintain the maple tree and
other areas of mm.  I have no plans of leaving Linux and I hope not to
die.

I can maintain mm/userfaultfd.c, if that helps.  I didn't feel like I
knew the area enough before, but I'm learning a lot doing this.

Thanks,
Liam



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 20:23               ` Lorenzo Stoakes
@ 2025-10-30 21:13                 ` Peter Xu
  2025-10-30 21:27                   ` Peter
  0 siblings, 1 reply; 23+ messages in thread
From: Peter Xu @ 2025-10-30 21:13 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Hugh Dickins,
	Michal Hocko, Ujwal Kundur, Oscar Salvador, Suren Baghdasaryan,
	Andrea Arcangeli, conduct

On Thu, Oct 30, 2025 at 08:23:03PM +0000, Lorenzo Stoakes wrote:
> +cc CoC
> 
> Peter,
> 
> I'm sorry but your reply here is completely out of line.
> 
> I know tensions can run high sometimes, but this is a _good faith_ effort to try
> to find a way forward.
> 
> Please take a step back and show some respect for the fact that Liam has put
> VERY significant effort in preparing this after you _repeatedly_ asked him to
> show him code.
> 
> I am starting to worry that your approach here is to bat off criticism by trying
> to wear reviewers down and that's really not a good thing.
> 
> Again, this is _good faith_. Nobody is trying to unreasonably push back on these
> changes, we are just trying to find the best solution possible.
> 
> Comments like:
> 
> 'Your code allows to operate on pmd* in a module??? That's too risky and mm can
> explode!  Isn't it?'
> 
> and 'that's the wrong way to go. I explained to you multiple times.'
> 
> and 'I'm pretty sure my code introduce zero or very little bug, if there's one, I'll
> fix it, but really, likely not, because the changes are straightforward.'
> 
> vs. 'Your changes are huge.  I would not be surprised you break things here and
> there.  I hope at least you will be around fixing them when it happens, even if
> we're not sure the benefits of most of the changes.'
> 
> are just _entirely_ unhelpful and really unacceptable.
> 
> I have an extremely heavy workload at the moment anyway, but honestly
> interactions like this have seriously put me off being involved in this review
> personally.
> 
> Do we really want this to be how review in mm or the kernel is?
> 
> Is that really the culture we want to have here?

Gosh.. Seriously?

I'm ok if this needs to be audited.  I have all the previous discussions in
the cover letter as links.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 20:24             ` Liam R. Howlett
@ 2025-10-30 21:26               ` Peter Xu
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-30 21:26 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

Let me reply in short.

On Thu, Oct 30, 2025 at 04:24:46PM -0400, Liam R. Howlett wrote:
> Right, so the existing code a huge mess today and you won't fix
> anything, ever.

IMHO fix is the wrong word.  Cleanup it is.  I agree the current
userfaultfd code isn't extremely easy to follow.

> 
> We currently have a uffd_flags_t that sets wp or not then passes it
> through to set another part of the flag to pass it through again.  This
> is stupid.

I feel like you're applying very strongly personal preference to this piece
of code.  I won't say that is stupid.  Axel introduced it, and I think it's
still an improvement comparing to before.

> 
> And then it gets more stupid.  We then check that flag exactly once for
> the second argument in a global function and NEVER LOOK AT THAT PART
> AGAIN.  So we now have a type to contain a boolean, but we keep passing
> through the uffd_flags_t, so we can see if wp is set or not.
> 
> And you're fine with it.  In fact, let's add another one.  I mean, we
> tried for two but people didn't like two so lets whine and whine and
> whine until people get frustrated and let you push a second one of those
> gems into the code you REFUSED TO MAINTAIN.

What I refuse to maintain?  I don't really follow.

> 
> This is what owning a problem looks like: I removed the uufd_flags_t,
> because it's stupid.  I removed all the hugetlb checks because I
> modularized the memory types.
> 
> I'm happy to address changes, but I'm not happy to accept more
> middleware and "it's not part of the patch set" to address any problem
> as you push more trash into an already horrible code base.
> 
> We need to fix things too.
> 
> So I'm fixing it.

Let's wait for a 2nd opinion on the approaches.

As I said, I'm OK if everyone likes your solution and if I'm the only one
NACKing it.  If we can support guest-memfd finally whoever adds that, it's
not so bad.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 21:13                 ` Peter Xu
@ 2025-10-30 21:27                   ` Peter
  0 siblings, 0 replies; 23+ messages in thread
From: Peter @ 2025-10-30 21:27 UTC (permalink / raw)
  To: Peter Xu
  Cc: Lorenzo Stoakes, Liam R. Howlett, David Hildenbrand, linux-kernel,
	linux-mm, Mike Rapoport, Muchun Song, Nikita Kalyazin,
	Vlastimil Babka, Axel Rasmussen, Andrew Morton, James Houghton,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli, conduct

thank you


On Thu, 30 Oct 2025 at 21:24, Peter Xu <peterx@redhat.com> wrote:
>
> On Thu, Oct 30, 2025 at 08:23:03PM +0000, Lorenzo Stoakes wrote:
> > +cc CoC
> >
> > Peter,
> >
> > I'm sorry but your reply here is completely out of line.
> >
> > I know tensions can run high sometimes, but this is a _good faith_ effort to try
> > to find a way forward.
> >
> > Please take a step back and show some respect for the fact that Liam has put
> > VERY significant effort in preparing this after you _repeatedly_ asked him to
> > show him code.
> >
> > I am starting to worry that your approach here is to bat off criticism by trying
> > to wear reviewers down and that's really not a good thing.
> >
> > Again, this is _good faith_. Nobody is trying to unreasonably push back on these
> > changes, we are just trying to find the best solution possible.
> >
> > Comments like:
> >
> > 'Your code allows to operate on pmd* in a module??? That's too risky and mm can
> > explode!  Isn't it?'
> >
> > and 'that's the wrong way to go. I explained to you multiple times.'
> >
> > and 'I'm pretty sure my code introduce zero or very little bug, if there's one, I'll
> > fix it, but really, likely not, because the changes are straightforward.'
> >
> > vs. 'Your changes are huge.  I would not be surprised you break things here and
> > there.  I hope at least you will be around fixing them when it happens, even if
> > we're not sure the benefits of most of the changes.'
> >
> > are just _entirely_ unhelpful and really unacceptable.
> >
> > I have an extremely heavy workload at the moment anyway, but honestly
> > interactions like this have seriously put me off being involved in this review
> > personally.
> >
> > Do we really want this to be how review in mm or the kernel is?
> >
> > Is that really the culture we want to have here?
>
> Gosh.. Seriously?
>
> I'm ok if this needs to be audited.  I have all the previous discussions in
> the cover letter as links.
>
> --
> Peter Xu
>
>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v4 0/4] mm/userfaultfd: modulize memory types
  2025-10-30 20:52               ` Liam R. Howlett
@ 2025-10-30 21:33                 ` Peter Xu
  0 siblings, 0 replies; 23+ messages in thread
From: Peter Xu @ 2025-10-30 21:33 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, linux-kernel, linux-mm,
	Mike Rapoport, Muchun Song, Nikita Kalyazin, Vlastimil Babka,
	Axel Rasmussen, Andrew Morton, James Houghton, Lorenzo Stoakes,
	Hugh Dickins, Michal Hocko, Ujwal Kundur, Oscar Salvador,
	Suren Baghdasaryan, Andrea Arcangeli

On Thu, Oct 30, 2025 at 04:52:26PM -0400, Liam R. Howlett wrote:
> * Peter Xu <peterx@redhat.com> [251030 15:56]:
> > On Thu, Oct 30, 2025 at 03:07:18PM -0400, Peter Xu wrote:
> > > > Patches are here:
> > > > 
> > > > https://git.infradead.org/?p=users/jedix/linux-maple.git;a=shortlog;h=refs/heads/modularized_mem
> > > 
> > > Great!  Finally we have something solid to discuss on top.
> > > 
> > > Yes, I'm extremely happy to see whatever code there is, I'm happy to review
> > > it.  I'm happy to see it rolling.  If it is better, we can adopt it.
> > 
> > So here is a summary of why I think my proposal is better:
> > 
> > - Much less code
> > 
> >   I think this is crystal clear..  I'm pasting once more in this summary
> >   email on what your proposal touches:
> > 
> >  fs/userfaultfd.c              |  14 +--
> >  include/linux/hugetlb.h       |  21 ----
> >  include/linux/mm.h            |  11 ++
> >  include/linux/shmem_fs.h      |  14 ---
> >  include/linux/userfaultfd_k.h | 108 ++++++++++------
> >  mm/hugetlb.c                  | 359 +++++++++++++++++++++++++++++++++++++++++++++--------
> >  mm/shmem.c                    | 245 ++++++++++++++++++++++++------------
> >  mm/userfaultfd.c              | 869 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---------------------------------------------------------------------
> >  8 files changed, 962 insertions(+), 679 deletions(-)
> 
> Including the following highlights:
>  -#include <linux/hugetlb.h>
> 
> and
> 
>  -typedef unsigned int __bitwise uffd_flags_t;

So you spent ~1000 LOC for these as highlights..  It's ok, I'll wait for a
second opinion.

> 
> 
> > 
> > - Much less future code
> > 
> >   The new proposal needs at least 6 APIs to implement even minor fault..
> >   One of the API needs to be implemented with uffd_info* which further
> >   includes 10+ fields to process.  It means we'll have a bunch of
> >   duplicated code in the future if new things pop up, so it's not only
> >   about what we merge.
> 
> You can reuse existing functions if there is no change.
> 
> > 
> > - Much less exported functions to modules
> 
> I haven't exported anything.  You asked for code and I provided it.
> This doesn't do what guest_memfd needs as it is.  This is all clean up
> you wouldn't do.
> 
> > 
> >   My solution, after exposing vm_uffd_ops, doesn't need to export any
> >   function.
> > 
> >   Your solution needs to export a lot of new functions to modules.  I
> >   didn't pay a lot of attention but the list should at least include these
> >   10 functions:
> >   
> >         void uffd_complete_register(struct vm_area_struct *vma);
> >         unsigned int uffd_page_shift(struct vm_area_struct *vma);
> >         int uffd_writeprotect(struct uffd_info *info);
> >         ssize_t uffd_failed_do_unlock(struct uffd_info *info);
> >         int uffd_atomic_pte_copy(struct folio *folio, unsigned long src_addr);
> >         unsigned long mfill_size(struct vm_area_struct *vma)
> >         int mfill_atomic_pte_poison(struct uffd_info *info);
> >         int mfill_atomic_pte_copy(struct uffd_info *info);
> >         int mfill_atomic_pte_zeropage(struct uffd_info *info);
> >         ssize_t uffd_get_dst_pmd(struct vm_area_struct *dst_vma, unsigned long dst_addr,pmd_t **dst_pmd);
> > 
> >   It's simply unnecessary.
> 
> Maybe we don't export any of them.  Maybe there's another function
> pointer that could be checked or overwritten?  We can do that without a
> flag or setting or wahtever the name you used for your flag was.

You need to export them when guest-memfd will be involved, am I right?

> 
> > 
> > - Less error prone
> > 
> >   At least to support minor fault, my solution only needs one hook fetching
> >   page cache, then set the CONTINUE ioctl in the supported_ioctls.
> 
> Your code just adds more junk to uffd, and fails to modularize anything
> beyond getting a folio.  And you only support certain types and places.

So would CoC accepts "junk" in this context?

> 
> > 
> > - Safer
> > 
> >   Your code allows to operate on pmd* in a module??? That's too risky and
> >   mm can explode!  Isn't it?
> 
> Again. I didn't export anything.
> 
> > 
> > - Do not build new codes on top of hugetlbfs
> > 
> >   AFAICT, more than half of your solution's API is trying to service
> >   hugetlbfs.  IMHO that's the wrong way to go.  I explained to you multiple
> >   times.  We should either keep hugetlbfs alone, or having hugetlbfs adopt
> >   mm APIs instead.  We shouldn't build any new code only trying to service
> >   hugetlbfsv1 but nobody else.  We shouldn't introduce new mm API only to
> >   service hugetlbfs.
> 
> Ignoring hugetlb exists is a problem.  I have removed the hugetlb from
> being included in uffd while you have left it in its own loop.  This
> doesn't build new things on hugetlb, it moves the code for hugetlb out
> of mm/userfaultfd.c - how is it building on hugetlb?

The APIs you introduced is building for hugetlb.  If without hugetlb, more
than half of the API is not needed.

> 
> Believe it or not, hugetlb is a memory type.
> 
> Certainly smaller changes are inherently less bug prone.  I honestly
> think all of what I have here needs to be done, regardless of memfd
> support.  I cannot believe that things were allowed to be pushed this
> far.  I do not think they should be allowed to go further.
> 
> > 
> > - Much less risk of breaking things
> > 
> >   I'm pretty sure my code introduce zero or very little bug, if there's
> >   one, I'll fix it, but really, likely not, because the changes are
> >   straightforward.
> > 
> >   Your changes are huge.  I would not be surprised you break things here
> >   and there.  I hope at least you will be around fixing them when it
> >   happens, even if we're not sure the benefits of most of the changes.
> 
> I have always been prompt in fixing my issues and have taken
> responsibility for anything I've written.  I maintain the maple tree and
> other areas of mm.  I have no plans of leaving Linux and I hope not to
> die.
> 
> I can maintain mm/userfaultfd.c, if that helps.  I didn't feel like I
> knew the area enough before, but I'm learning a lot doing this.

I'm not maintainer of userfaultfd, I'm a reviewer.  I was almost just
trying to help, in reality that is also true that obviously I don't make
decisions.

I definitely think you can at least propose add yourself as a reviewer if
you like to start looking after userfaultfds, or even M if Andrew likes it.

You're already listed as core mm R so I don't see much difference if it's
only a R addition.

-- 
Peter Xu



^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2025-10-30 21:33 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-14 23:14 [PATCH v4 0/4] mm/userfaultfd: modulize memory types Peter Xu
2025-10-14 23:14 ` [PATCH v4 1/4] mm: Introduce vm_uffd_ops API Peter Xu
2025-10-20 14:18   ` David Hildenbrand
2025-10-14 23:14 ` [PATCH v4 2/4] mm/shmem: Support " Peter Xu
2025-10-20 14:18   ` David Hildenbrand
2025-10-14 23:15 ` [PATCH v4 3/4] mm/hugetlb: " Peter Xu
2025-10-20 14:19   ` David Hildenbrand
2025-10-14 23:15 ` [PATCH v4 4/4] mm: Apply vm_uffd_ops API to core mm Peter Xu
2025-10-20 13:34 ` [PATCH v4 0/4] mm/userfaultfd: modulize memory types David Hildenbrand
2025-10-20 14:12   ` Peter Xu
2025-10-21 15:51     ` Liam R. Howlett
2025-10-21 16:28       ` Peter Xu
2025-10-30 17:13         ` Liam R. Howlett
2025-10-30 18:00           ` Nikita Kalyazin
2025-10-30 19:07           ` Peter Xu
2025-10-30 19:55             ` Peter Xu
2025-10-30 20:23               ` Lorenzo Stoakes
2025-10-30 21:13                 ` Peter Xu
2025-10-30 21:27                   ` Peter
2025-10-30 20:52               ` Liam R. Howlett
2025-10-30 21:33                 ` Peter Xu
2025-10-30 20:24             ` Liam R. Howlett
2025-10-30 21:26               ` Peter Xu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).