linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
@ 2025-04-30 19:54 Lorenzo Stoakes
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
                   ` (5 more replies)
  0 siblings, 6 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan --cc=Michal Hocko

During the mmap() of a file-backed mapping, we invoke the underlying
driver's f_op->mmap() callback in order to perform driver/file system
initialisation of the underlying VMA.

This has been a source of issues in the past, including a significant
security concern relating to unwinding of error state discovered by Jann
Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour") which performed the recent, significant, rework of
mmap() as a whole.

However, we have had a fly in the ointment remain - drivers have a great
deal of freedom in the .mmap() hook to manipulate VMA state (as well as
page table state).

This can be problematic, as we can no longer reason sensibly about VMA
state once the call is complete (the ability to do - anything - here does
rather interfere with that).

In addition, callers may choose to do odd or unusual things which might
interfere with subsequent steps in the mmap() process, and it may do so and
then raise an error, requiring very careful unwinding of state about which
we can make no assumptions.

Rather than providing such an open-ended interface, this series provides an
alternative, far more restrictive one - we expose a whitelist of fields
which can be adjusted by the driver, along with immutable state upon which
the driver can make such decisions:

struct vma_proto {
	/* Immutable state. */
	struct mm_struct *mm;
	unsigned long start;
	unsigned long end;

	/* Mutable fields. Populated with initial state. */
	pgoff_t pgoff;
	struct file *file;
	vm_flags_t flags;
	pgprot_t page_prot;

	/* Write-only fields. */
	const struct vm_operations_struct *vm_ops;
	void *private_data;
};

The mmap logic then updates the state used to either merge with a VMA or
establish a new VMA based upon this logic.

This is achieved via new f_op hook .vma_proto(), which is, importantly,
invoked very early on in the mmap() process.

If an error arises, we can very simply abort the operation with very little
unwinding of state required.

The existing logic contains another, related, peccadillo - since the
.mmap() callback might do anything, it may also cause a previously
unmergeable VMA to become mergeable with adjacent VMAs.

Right now the logic will retry a merge like this only if the driver changes
VMA flags, and changes them in such a way that a merge might succeed (that
is, the flags are not 'special', that is do not contain any of the flags
specified in VM_SPECIAL).

This has been the source of a great deal of pain for a while - it is hard
to reason about an .mmap() callback that might do - anything - but it's
also hard to reason about setting up a VMA and writing to the maple tree,
only to do it again utilising a great deal of shared state.

Since .mmap_proto() sets fields before the first merge is even attempted,
the use of this callback obviates the need for this retry merge logic.

A driver can specify either .mmap_proto(), .mmap() or both. This provides
maximum flexibility.

In researching this change, I examined every .mmap() callback, and
discovered only a very few that set VMA state in such a way that a. the VMA
flags changed and b. this would be mergeable.

In the majority of cases, it turns out that drivers are mapping kernel
memory and thus ultimately set VM_PFNMAP, VM_MIXEDMAP, or other unmergeable
VM_SPECIAL flags.

Of those that remain I identified a number of cases which are only
applicable in DAX, setting the VM_HUGEPAGE flag:

* dax_mmap()
* erofs_file_mmap()
* ext4_file_mmap()
* xfs_file_mmap()

For this remerge to not occur and to impact users, each of these cases
would require a user to mmap() files using DAX, in parts, immediately
adjacent to one another.

This is a very unlikely usecase and so it does not appear to be worthwhile
to adjust this functionality accordingly.

We can, however, very quickly do so if needed by simply adding an
.mmap_proto() hook to these as required.

There are two further non-DAX cases I idenitfied:

* orangefs_file_mmap() - Clears VM_RAND_READ if set, replacing with
  VM_SEQ_READ.
* usb_stream_hwdep_mmap() - Sets VM_DONTDUMP.

Both of these cases again seem very unlikely to be mmap()'d immediately
adjacent to one another in a fashion that would result in a merge.

Finally, we are left with a viable case:

* secretmem_mmap() - Set VM_LOCKED, VM_DONTDUMP.

This is viable enough that the mm selftests trigger the logic as a matter
of course. Therefore, this series replace the .secretmem_mmap() hook with
.secret_mmap_proto().

Lorenzo Stoakes (3):
  mm: introduce new .mmap_proto() f_op callback
  mm: secretmem: convert to .mmap_proto() hook
  mm/vma: remove mmap() retry merge

 include/linux/fs.h               |  7 +++
 include/linux/mm_types.h         | 24 ++++++++
 mm/memory.c                      |  3 +-
 mm/mmap.c                        |  2 +-
 mm/secretmem.c                   | 14 ++---
 mm/vma.c                         | 99 +++++++++++++++++++++++++++-----
 tools/testing/vma/vma_internal.h | 38 ++++++++++++
 7 files changed, 164 insertions(+), 23 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
@ 2025-04-30 19:54 ` Lorenzo Stoakes
  2025-04-30 19:59   ` Lorenzo Stoakes
                     ` (2 more replies)
  2025-04-30 19:54 ` [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook Lorenzo Stoakes
                   ` (4 subsequent siblings)
  5 siblings, 3 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan --cc=Michal Hocko

Provide a means by which drivers can specify which fields of those
permitted to be changed should be altered to prior to mmap()'ing a
range (which may either result from a merge or from mapping an entirely new
VMA).

Doing so is substantially safer than the existing .mmap() calback which
provides unrestricted access to the part-constructed VMA and permits
drivers and file systems to do 'creative' things which makes it hard to
reason about the state of the VMA after the function returns.

The existing .mmap() callback's freedom has caused a great deal of issues,
especially in error handling, as unwinding the mmap() state has proven to
be non-trivial and caused significant issues in the past, for instance
those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour").

It also necessitates a second attempt at merge once the .mmap() callback
has completed, which has caused issues in the past, is awkward, adds
overhead and is difficult to reason about.

The .mmap_proto() callback eliminates this requirement, as we can update
fields prior to even attempting the first merge. It is safer, as we heavily
restrict what can actually be modified, and being invoked very early in the
mmap() process, error handling can be performed safely with very little
unwinding of state required.

Update vma userland test stubs to account for changes.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/linux/fs.h               |  7 +++
 include/linux/mm_types.h         | 24 +++++++++
 mm/memory.c                      |  3 +-
 mm/mmap.c                        |  2 +-
 mm/vma.c                         | 87 +++++++++++++++++++++++++++++++-
 tools/testing/vma/vma_internal.h | 38 ++++++++++++++
 6 files changed, 158 insertions(+), 3 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..f8ccdf5419fc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2169,6 +2169,7 @@ struct file_operations {
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
+	int (*mmap_proto)(struct vma_proto *);
 } __randomize_layout;
 
 /* Supports async buffered reads */
@@ -2243,6 +2244,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 	return file->f_op->mmap(file, vma);
 }
 
+/* Does the file have an .mmap() hook? */
+static inline bool file_has_mmap_hook(struct file *file)
+{
+	return file->f_op->mmap || file->f_op->mmap_proto;
+}
+
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e76bade9ebb1..b21d01efc541 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -763,6 +763,30 @@ struct vma_numab_state {
 	int prev_scan_seq;
 };
 
+/*
+ * Describes a prototype VMA that is about to be mmap()'ed. Drivers may choose
+ * to manipulate non-const fields, which will cause those fields to be updated
+ * in the resultant VMA.
+ *
+ * Helper functions are not required for manipulating any field.
+ */
+struct vma_proto {
+	/* Immutable state. */
+	struct mm_struct *mm;
+	unsigned long start;
+	unsigned long end;
+
+	/* Mutable fields. Populated with initial state. */
+	pgoff_t pgoff;
+	struct file *file;
+	vm_flags_t flags;
+	pgprot_t page_prot;
+
+	/* Write-only fields. */
+	const struct vm_operations_struct *vm_ops;
+	void *private_data;
+};
+
 /*
  * This struct describes a virtual memory area. There is one of these
  * per VM-area/task. A VM area is any part of the process virtual memory
diff --git a/mm/memory.c b/mm/memory.c
index 68c1d962d0ad..98a20565825b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -527,10 +527,11 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 		dump_page(page, "bad pte");
 	pr_alert("addr:%px vm_flags:%08lx anon_vma:%px mapping:%px index:%lx\n",
 		 (void *)addr, vma->vm_flags, vma->anon_vma, mapping, index);
-	pr_alert("file:%pD fault:%ps mmap:%ps read_folio:%ps\n",
+	pr_alert("file:%pD fault:%ps mmap:%ps mmap_proto: %ps read_folio:%ps\n",
 		 vma->vm_file,
 		 vma->vm_ops ? vma->vm_ops->fault : NULL,
 		 vma->vm_file ? vma->vm_file->f_op->mmap : NULL,
+		 vma->vm_file ? vma->vm_file->f_op->mmap_proto : NULL,
 		 mapping ? mapping->a_ops->read_folio : NULL);
 	dump_stack();
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
diff --git a/mm/mmap.c b/mm/mmap.c
index 81dd962a1cfc..411309c7b235 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -475,7 +475,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 				vm_flags &= ~VM_MAYEXEC;
 			}
 
-			if (!file->f_op->mmap)
+			if (!file_has_mmap_hook(file))
 				return -ENODEV;
 			if (vm_flags & (VM_GROWSDOWN|VM_GROWSUP))
 				return -EINVAL;
diff --git a/mm/vma.c b/mm/vma.c
index 1f2634b29568..76bd3a67ce0f 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -17,6 +17,11 @@ struct mmap_state {
 	unsigned long pglen;
 	unsigned long flags;
 	struct file *file;
+	pgprot_t page_prot;
+
+	/* User-defined fields, perhaps updated by .mmap_proto(). */
+	const struct vm_operations_struct *vm_ops;
+	void *vm_private_data;
 
 	unsigned long charged;
 	bool retry_merge;
@@ -40,6 +45,7 @@ struct mmap_state {
 		.pglen = PHYS_PFN(len_),				\
 		.flags = flags_,					\
 		.file = file_,						\
+		.page_prot = vm_get_page_prot(flags_),			\
 	}
 
 #define VMG_MMAP_STATE(name, map_, vma_)				\
@@ -2384,7 +2390,17 @@ static int __mmap_new_file_vma(struct mmap_state *map,
 	struct vma_iterator *vmi = map->vmi;
 	int error;
 
+	VM_WARN_ON(!file_has_mmap_hook(map->file));
+
 	vma->vm_file = get_file(map->file);
+
+	/*
+	 * The caller might only define .mmap_proto(), in which case we have
+	 * nothing further to do.
+	 */
+	if (!map->file->f_op->mmap)
+		return 0;
+
 	error = mmap_file(vma->vm_file, vma);
 	if (error) {
 		fput(vma->vm_file);
@@ -2441,7 +2457,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
 	vma_iter_config(vmi, map->addr, map->end);
 	vma_set_range(vma, map->addr, map->end, map->pgoff);
 	vm_flags_init(vma, map->flags);
-	vma->vm_page_prot = vm_get_page_prot(map->flags);
+	vma->vm_page_prot = map->page_prot;
 
 	if (vma_iter_prealloc(vmi, vma)) {
 		error = -ENOMEM;
@@ -2528,6 +2544,69 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma)
 	vma_set_page_prot(vma);
 }
 
+/* Does the driver backing this implement an .mmap_proto() hook? */
+static bool have_mmap_proto_hook(struct mmap_state *map)
+{
+	struct file *file = map->file;
+
+	return file && file->f_op->mmap_proto;
+}
+
+/*
+ * Invoke the f_op->mmap_proto() callback for a file-backed mapping that
+ * specifies it.
+ *
+ * This is called prior to any merge attempt, and updates whitelisted fields
+ * that are permitted to be updated by the caller.
+ *
+ * All but user-defined fields will be pre-populated with original values
+ *
+ * Returns 0 on success, or an error code otherwise.
+ */
+static int call_proto(struct mmap_state *map)
+{
+	int err;
+	const struct file_operations *f_op = map->file->f_op;
+	struct vma_proto proto = {
+		.mm = map->mm,
+		.start = map->addr,
+		.end = map->end,
+
+		.pgoff = map->pgoff,
+		.file = map->file,
+		.flags = map->flags,
+	};
+
+	/* Invoke the hook. */
+	err = f_op->mmap_proto(&proto);
+	if (err)
+		return err;
+
+	/* Update fields permitted to be changed. */
+	map->pgoff = proto.pgoff;
+	map->file = proto.file;
+	map->flags = proto.flags;
+	map->page_prot = proto.page_prot;
+	/* User-defined fields. */
+	map->vm_ops = proto.vm_ops;
+	map->vm_private_data = proto.private_data;
+
+	return 0;
+}
+
+static void set_vma_user_defined_fields(struct vm_area_struct *vma,
+		struct mmap_state *map)
+{
+	/*
+	 * If the .mmap() handler set these, that takes precedent (indicated by
+	 * the vma fields being non-empty).
+	 */
+	if (map->vm_ops && vma->vm_ops == &vma_dummy_vm_ops)
+		vma->vm_ops = map->vm_ops;
+	if (!vma->vm_private_data)
+		vma->vm_private_data = map->vm_private_data;
+}
+
 static unsigned long __mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
 		struct list_head *uf)
@@ -2537,8 +2616,11 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 	int error;
 	VMA_ITERATOR(vmi, mm, addr);
 	MMAP_STATE(map, mm, &vmi, addr, len, pgoff, vm_flags, file);
+	bool have_proto = have_mmap_proto_hook(&map);
 
 	error = __mmap_prepare(&map, uf);
+	if (!error && have_proto)
+		error = call_proto(&map);
 	if (error)
 		goto abort_munmap;
 
@@ -2556,6 +2638,9 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 			goto unacct_error;
 	}
 
+	if (have_proto)
+		set_vma_user_defined_fields(vma, &map);
+
 	/* If flags changed, we might be able to merge, so try again. */
 	if (map.retry_merge) {
 		struct vm_area_struct *merged;
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 198abe66de5a..56a49d455949 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -253,8 +253,40 @@ struct mm_struct {
 	unsigned long flags; /* Must use atomic bitops to access */
 };
 
+struct vm_area_struct;
+
+/*
+ * Describes a prototype VMA that is about to be mmap()'ed. Drivers may choose
+ * to manipulate non-const fields, which will cause those fields to be updated
+ * in the resultant VMA.
+ *
+ * Helper functions are not required for manipulating any field.
+ */
+struct vma_proto {
+	/* Immutable state. */
+	struct mm_struct *mm;
+	unsigned long start;
+	unsigned long end;
+
+	/* Mutable fields. Populated with initial state. */
+	pgoff_t pgoff;
+	struct file *file;
+	vm_flags_t flags;
+	pgprot_t page_prot;
+
+	/* Write-only fields. */
+	const struct vm_operations_struct *vm_ops;
+	void *private_data;
+};
+
+struct file_operations {
+	int (*mmap)(struct file *, struct vm_area_struct *);
+	int (*mmap_proto)(struct vma_proto *);
+};
+
 struct file {
 	struct address_space	*f_mapping;
+	const struct file_operations	*f_op;
 };
 
 #define VMA_LOCK_OFFSET	0x40000000
@@ -1405,4 +1437,10 @@ static inline void free_anon_vma_name(struct vm_area_struct *vma)
 	(void)vma;
 }
 
+/* Does the file have an .mmap() hook? */
+static inline bool file_has_mmap_hook(struct file *file)
+{
+	return file->f_op->mmap || file->f_op->mmap_proto;
+}
+
 #endif	/* __MM_VMA_INTERNAL_H */
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
@ 2025-04-30 19:54 ` Lorenzo Stoakes
  2025-04-30 19:59   ` Lorenzo Stoakes
  2025-04-30 19:54 ` [RFC PATCH 3/3] mm/vma: remove mmap() retry merge Lorenzo Stoakes
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan --cc=Michal Hocko

Secretmem has a simple .mmap() hook which is easily converted to the new
.mmap_proto() callback.

In addition, importantly, it's a rare instance of a mergeable VMA mapping
which adjusts parameters which affect merge compatibility. By using the
.mmap_proto() callback there's no longer any need to retry the merge later
as we can simply set the correct flags from the start.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/secretmem.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 1b0a214ee558..64fc0890a28b 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -120,18 +120,18 @@ static int secretmem_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int secretmem_mmap(struct file *file, struct vm_area_struct *vma)
+static int secretmem_mmap_proto(struct vma_proto *proto)
 {
-	unsigned long len = vma->vm_end - vma->vm_start;
+	unsigned long len = proto->end - proto->start;
 
-	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
+	if ((proto->flags & (VM_SHARED | VM_MAYSHARE)) == 0)
 		return -EINVAL;
 
-	if (!mlock_future_ok(vma->vm_mm, vma->vm_flags | VM_LOCKED, len))
+	if (!mlock_future_ok(proto->mm, proto->flags | VM_LOCKED, len))
 		return -EAGAIN;
 
-	vm_flags_set(vma, VM_LOCKED | VM_DONTDUMP);
-	vma->vm_ops = &secretmem_vm_ops;
+	proto->flags |= VM_LOCKED | VM_DONTDUMP;
+	proto->vm_ops = &secretmem_vm_ops;
 
 	return 0;
 }
@@ -143,7 +143,7 @@ bool vma_is_secretmem(struct vm_area_struct *vma)
 
 static const struct file_operations secretmem_fops = {
 	.release	= secretmem_release,
-	.mmap		= secretmem_mmap,
+	.mmap_proto	= secretmem_mmap_proto,
 };
 
 static int secretmem_migrate_folio(struct address_space *mapping,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 3/3] mm/vma: remove mmap() retry merge
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
  2025-04-30 19:54 ` [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook Lorenzo Stoakes
@ 2025-04-30 19:54 ` Lorenzo Stoakes
  2025-04-30 19:59   ` Lorenzo Stoakes
  2025-04-30 19:58 ` [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:54 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan --cc=Michal Hocko

We have now introduced a mechanism that obviates the need for a reattempted
merge via the f_op->mmap_proto() hook, so eliminate this functionality
altogether.

The retry merge logic has been the cause of a great deal of complexity in
the past and required a great deal of careful manoeuvring of code to ensure
its continued and correct functionality.

It has also recently been involved in an issue surrounding maple tree
state, which again points to its problematic nature.

We make it much easier to reason about mmap() logic by eliminating this and
simply writing a VMA once. This also opens the doors to future
optimisation and improvement in the mmap() logic.

For any device or file system which encounters unwanted VMA fragmentation
as a result of this change (that is, having not implemented .mmap_proto
hooks), the issue is easily resolvable by doing so.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/vma.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/mm/vma.c b/mm/vma.c
index 76bd3a67ce0f..40c98f88472e 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -24,7 +24,6 @@ struct mmap_state {
 	void *vm_private_data;
 
 	unsigned long charged;
-	bool retry_merge;
 
 	struct vm_area_struct *prev;
 	struct vm_area_struct *next;
@@ -2423,8 +2422,6 @@ static int __mmap_new_file_vma(struct mmap_state *map,
 			!(map->flags & VM_MAYWRITE) &&
 			(vma->vm_flags & VM_MAYWRITE));
 
-	/* If the flags change (and are mergeable), let's retry later. */
-	map->retry_merge = vma->vm_flags != map->flags && !(vma->vm_flags & VM_SPECIAL);
 	map->flags = vma->vm_flags;
 
 	return 0;
@@ -2641,17 +2638,6 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 	if (have_proto)
 		set_vma_user_defined_fields(vma, &map);
 
-	/* If flags changed, we might be able to merge, so try again. */
-	if (map.retry_merge) {
-		struct vm_area_struct *merged;
-		VMG_MMAP_STATE(vmg, &map, vma);
-
-		vma_iter_config(map.vmi, map.addr, map.end);
-		merged = vma_merge_existing_range(&vmg);
-		if (merged)
-			vma = merged;
-	}
-
 	__mmap_complete(&map, vma);
 
 	return addr;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
                   ` (2 preceding siblings ...)
  2025-04-30 19:54 ` [RFC PATCH 3/3] mm/vma: remove mmap() retry merge Lorenzo Stoakes
@ 2025-04-30 19:58 ` Lorenzo Stoakes
  2025-04-30 19:59 ` Lorenzo Stoakes
  2025-04-30 21:29 ` Jann Horn
  5 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:58 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Christian Brauner,
	Jan Kara

Apologies, made a mistake that I wrongly assumed tooling would catch... let me
try sending this again...

On Wed, Apr 30, 2025 at 08:54:10PM +0100, Lorenzo Stoakes wrote:
> During the mmap() of a file-backed mapping, we invoke the underlying
> driver's f_op->mmap() callback in order to perform driver/file system
> initialisation of the underlying VMA.
> 
> This has been a source of issues in the past, including a significant
> security concern relating to unwinding of error state discovered by Jann
> Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> error path behaviour") which performed the recent, significant, rework of
> mmap() as a whole.
> 
> However, we have had a fly in the ointment remain - drivers have a great
> deal of freedom in the .mmap() hook to manipulate VMA state (as well as
> page table state).
> 
> This can be problematic, as we can no longer reason sensibly about VMA
> state once the call is complete (the ability to do - anything - here does
> rather interfere with that).
> 
> In addition, callers may choose to do odd or unusual things which might
> interfere with subsequent steps in the mmap() process, and it may do so and
> then raise an error, requiring very careful unwinding of state about which
> we can make no assumptions.
> 
> Rather than providing such an open-ended interface, this series provides an
> alternative, far more restrictive one - we expose a whitelist of fields
> which can be adjusted by the driver, along with immutable state upon which
> the driver can make such decisions:
> 
> struct vma_proto {
> 	/* Immutable state. */
> 	struct mm_struct *mm;
> 	unsigned long start;
> 	unsigned long end;
> 
> 	/* Mutable fields. Populated with initial state. */
> 	pgoff_t pgoff;
> 	struct file *file;
> 	vm_flags_t flags;
> 	pgprot_t page_prot;
> 
> 	/* Write-only fields. */
> 	const struct vm_operations_struct *vm_ops;
> 	void *private_data;
> };
> 
> The mmap logic then updates the state used to either merge with a VMA or
> establish a new VMA based upon this logic.
> 
> This is achieved via new f_op hook .vma_proto(), which is, importantly,
> invoked very early on in the mmap() process.
> 
> If an error arises, we can very simply abort the operation with very little
> unwinding of state required.
> 
> The existing logic contains another, related, peccadillo - since the
> .mmap() callback might do anything, it may also cause a previously
> unmergeable VMA to become mergeable with adjacent VMAs.
> 
> Right now the logic will retry a merge like this only if the driver changes
> VMA flags, and changes them in such a way that a merge might succeed (that
> is, the flags are not 'special', that is do not contain any of the flags
> specified in VM_SPECIAL).
> 
> This has been the source of a great deal of pain for a while - it is hard
> to reason about an .mmap() callback that might do - anything - but it's
> also hard to reason about setting up a VMA and writing to the maple tree,
> only to do it again utilising a great deal of shared state.
> 
> Since .mmap_proto() sets fields before the first merge is even attempted,
> the use of this callback obviates the need for this retry merge logic.
> 
> A driver can specify either .mmap_proto(), .mmap() or both. This provides
> maximum flexibility.
> 
> In researching this change, I examined every .mmap() callback, and
> discovered only a very few that set VMA state in such a way that a. the VMA
> flags changed and b. this would be mergeable.
> 
> In the majority of cases, it turns out that drivers are mapping kernel
> memory and thus ultimately set VM_PFNMAP, VM_MIXEDMAP, or other unmergeable
> VM_SPECIAL flags.
> 
> Of those that remain I identified a number of cases which are only
> applicable in DAX, setting the VM_HUGEPAGE flag:
> 
> * dax_mmap()
> * erofs_file_mmap()
> * ext4_file_mmap()
> * xfs_file_mmap()
> 
> For this remerge to not occur and to impact users, each of these cases
> would require a user to mmap() files using DAX, in parts, immediately
> adjacent to one another.
> 
> This is a very unlikely usecase and so it does not appear to be worthwhile
> to adjust this functionality accordingly.
> 
> We can, however, very quickly do so if needed by simply adding an
> .mmap_proto() hook to these as required.
> 
> There are two further non-DAX cases I idenitfied:
> 
> * orangefs_file_mmap() - Clears VM_RAND_READ if set, replacing with
>   VM_SEQ_READ.
> * usb_stream_hwdep_mmap() - Sets VM_DONTDUMP.
> 
> Both of these cases again seem very unlikely to be mmap()'d immediately
> adjacent to one another in a fashion that would result in a merge.
> 
> Finally, we are left with a viable case:
> 
> * secretmem_mmap() - Set VM_LOCKED, VM_DONTDUMP.
> 
> This is viable enough that the mm selftests trigger the logic as a matter
> of course. Therefore, this series replace the .secretmem_mmap() hook with
> .secret_mmap_proto().
> 
> Lorenzo Stoakes (3):
>   mm: introduce new .mmap_proto() f_op callback
>   mm: secretmem: convert to .mmap_proto() hook
>   mm/vma: remove mmap() retry merge
> 
>  include/linux/fs.h               |  7 +++
>  include/linux/mm_types.h         | 24 ++++++++
>  mm/memory.c                      |  3 +-
>  mm/mmap.c                        |  2 +-
>  mm/secretmem.c                   | 14 ++---
>  mm/vma.c                         | 99 +++++++++++++++++++++++++++-----
>  tools/testing/vma/vma_internal.h | 38 ++++++++++++
>  7 files changed, 164 insertions(+), 23 deletions(-)
> 
> -- 
> 2.49.0
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
                   ` (3 preceding siblings ...)
  2025-04-30 19:58 ` [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
@ 2025-04-30 19:59 ` Lorenzo Stoakes
  2025-04-30 21:29 ` Jann Horn
  5 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

During the mmap() of a file-backed mapping, we invoke the underlying
driver's f_op->mmap() callback in order to perform driver/file system
initialisation of the underlying VMA.

This has been a source of issues in the past, including a significant
security concern relating to unwinding of error state discovered by Jann
Horn, as fixed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour") which performed the recent, significant, rework of
mmap() as a whole.

However, we have had a fly in the ointment remain - drivers have a great
deal of freedom in the .mmap() hook to manipulate VMA state (as well as
page table state).

This can be problematic, as we can no longer reason sensibly about VMA
state once the call is complete (the ability to do - anything - here does
rather interfere with that).

In addition, callers may choose to do odd or unusual things which might
interfere with subsequent steps in the mmap() process, and it may do so and
then raise an error, requiring very careful unwinding of state about which
we can make no assumptions.

Rather than providing such an open-ended interface, this series provides an
alternative, far more restrictive one - we expose a whitelist of fields
which can be adjusted by the driver, along with immutable state upon which
the driver can make such decisions:

struct vma_proto {
	/* Immutable state. */
	struct mm_struct *mm;
	unsigned long start;
	unsigned long end;

	/* Mutable fields. Populated with initial state. */
	pgoff_t pgoff;
	struct file *file;
	vm_flags_t flags;
	pgprot_t page_prot;

	/* Write-only fields. */
	const struct vm_operations_struct *vm_ops;
	void *private_data;
};

The mmap logic then updates the state used to either merge with a VMA or
establish a new VMA based upon this logic.

This is achieved via new f_op hook .vma_proto(), which is, importantly,
invoked very early on in the mmap() process.

If an error arises, we can very simply abort the operation with very little
unwinding of state required.

The existing logic contains another, related, peccadillo - since the
.mmap() callback might do anything, it may also cause a previously
unmergeable VMA to become mergeable with adjacent VMAs.

Right now the logic will retry a merge like this only if the driver changes
VMA flags, and changes them in such a way that a merge might succeed (that
is, the flags are not 'special', that is do not contain any of the flags
specified in VM_SPECIAL).

This has been the source of a great deal of pain for a while - it is hard
to reason about an .mmap() callback that might do - anything - but it's
also hard to reason about setting up a VMA and writing to the maple tree,
only to do it again utilising a great deal of shared state.

Since .mmap_proto() sets fields before the first merge is even attempted,
the use of this callback obviates the need for this retry merge logic.

A driver can specify either .mmap_proto(), .mmap() or both. This provides
maximum flexibility.

In researching this change, I examined every .mmap() callback, and
discovered only a very few that set VMA state in such a way that a. the VMA
flags changed and b. this would be mergeable.

In the majority of cases, it turns out that drivers are mapping kernel
memory and thus ultimately set VM_PFNMAP, VM_MIXEDMAP, or other unmergeable
VM_SPECIAL flags.

Of those that remain I identified a number of cases which are only
applicable in DAX, setting the VM_HUGEPAGE flag:

* dax_mmap()
* erofs_file_mmap()
* ext4_file_mmap()
* xfs_file_mmap()

For this remerge to not occur and to impact users, each of these cases
would require a user to mmap() files using DAX, in parts, immediately
adjacent to one another.

This is a very unlikely usecase and so it does not appear to be worthwhile
to adjust this functionality accordingly.

We can, however, very quickly do so if needed by simply adding an
.mmap_proto() hook to these as required.

There are two further non-DAX cases I idenitfied:

* orangefs_file_mmap() - Clears VM_RAND_READ if set, replacing with
  VM_SEQ_READ.
* usb_stream_hwdep_mmap() - Sets VM_DONTDUMP.

Both of these cases again seem very unlikely to be mmap()'d immediately
adjacent to one another in a fashion that would result in a merge.

Finally, we are left with a viable case:

* secretmem_mmap() - Set VM_LOCKED, VM_DONTDUMP.

This is viable enough that the mm selftests trigger the logic as a matter
of course. Therefore, this series replace the .secretmem_mmap() hook with
.secret_mmap_proto().

Lorenzo Stoakes (3):
  mm: introduce new .mmap_proto() f_op callback
  mm: secretmem: convert to .mmap_proto() hook
  mm/vma: remove mmap() retry merge

 include/linux/fs.h               |  7 +++
 include/linux/mm_types.h         | 24 ++++++++
 mm/memory.c                      |  3 +-
 mm/mmap.c                        |  2 +-
 mm/secretmem.c                   | 14 ++---
 mm/vma.c                         | 99 +++++++++++++++++++++++++++-----
 tools/testing/vma/vma_internal.h | 38 ++++++++++++
 7 files changed, 164 insertions(+), 23 deletions(-)

-- 
2.49.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
@ 2025-04-30 19:59   ` Lorenzo Stoakes
  2025-04-30 21:44   ` Jann Horn
  2025-04-30 21:58   ` David Hildenbrand
  2 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

Provide a means by which drivers can specify which fields of those
permitted to be changed should be altered to prior to mmap()'ing a
range (which may either result from a merge or from mapping an entirely new
VMA).

Doing so is substantially safer than the existing .mmap() calback which
provides unrestricted access to the part-constructed VMA and permits
drivers and file systems to do 'creative' things which makes it hard to
reason about the state of the VMA after the function returns.

The existing .mmap() callback's freedom has caused a great deal of issues,
especially in error handling, as unwinding the mmap() state has proven to
be non-trivial and caused significant issues in the past, for instance
those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
error path behaviour").

It also necessitates a second attempt at merge once the .mmap() callback
has completed, which has caused issues in the past, is awkward, adds
overhead and is difficult to reason about.

The .mmap_proto() callback eliminates this requirement, as we can update
fields prior to even attempting the first merge. It is safer, as we heavily
restrict what can actually be modified, and being invoked very early in the
mmap() process, error handling can be performed safely with very little
unwinding of state required.

Update vma userland test stubs to account for changes.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 include/linux/fs.h               |  7 +++
 include/linux/mm_types.h         | 24 +++++++++
 mm/memory.c                      |  3 +-
 mm/mmap.c                        |  2 +-
 mm/vma.c                         | 87 +++++++++++++++++++++++++++++++-
 tools/testing/vma/vma_internal.h | 38 ++++++++++++++
 6 files changed, 158 insertions(+), 3 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 016b0fe1536e..f8ccdf5419fc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2169,6 +2169,7 @@ struct file_operations {
 	int (*uring_cmd)(struct io_uring_cmd *ioucmd, unsigned int issue_flags);
 	int (*uring_cmd_iopoll)(struct io_uring_cmd *, struct io_comp_batch *,
 				unsigned int poll_flags);
+	int (*mmap_proto)(struct vma_proto *);
 } __randomize_layout;
 
 /* Supports async buffered reads */
@@ -2243,6 +2244,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 	return file->f_op->mmap(file, vma);
 }
 
+/* Does the file have an .mmap() hook? */
+static inline bool file_has_mmap_hook(struct file *file)
+{
+	return file->f_op->mmap || file->f_op->mmap_proto;
+}
+
 extern ssize_t vfs_read(struct file *, char __user *, size_t, loff_t *);
 extern ssize_t vfs_write(struct file *, const char __user *, size_t, loff_t *);
 extern ssize_t vfs_copy_file_range(struct file *, loff_t , struct file *,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index e76bade9ebb1..b21d01efc541 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -763,6 +763,30 @@ struct vma_numab_state {
 	int prev_scan_seq;
 };
 
+/*
+ * Describes a prototype VMA that is about to be mmap()'ed. Drivers may choose
+ * to manipulate non-const fields, which will cause those fields to be updated
+ * in the resultant VMA.
+ *
+ * Helper functions are not required for manipulating any field.
+ */
+struct vma_proto {
+	/* Immutable state. */
+	struct mm_struct *mm;
+	unsigned long start;
+	unsigned long end;
+
+	/* Mutable fields. Populated with initial state. */
+	pgoff_t pgoff;
+	struct file *file;
+	vm_flags_t flags;
+	pgprot_t page_prot;
+
+	/* Write-only fields. */
+	const struct vm_operations_struct *vm_ops;
+	void *private_data;
+};
+
 /*
  * This struct describes a virtual memory area. There is one of these
  * per VM-area/task. A VM area is any part of the process virtual memory
diff --git a/mm/memory.c b/mm/memory.c
index 68c1d962d0ad..98a20565825b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -527,10 +527,11 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 		dump_page(page, "bad pte");
 	pr_alert("addr:%px vm_flags:%08lx anon_vma:%px mapping:%px index:%lx\n",
 		 (void *)addr, vma->vm_flags, vma->anon_vma, mapping, index);
-	pr_alert("file:%pD fault:%ps mmap:%ps read_folio:%ps\n",
+	pr_alert("file:%pD fault:%ps mmap:%ps mmap_proto: %ps read_folio:%ps\n",
 		 vma->vm_file,
 		 vma->vm_ops ? vma->vm_ops->fault : NULL,
 		 vma->vm_file ? vma->vm_file->f_op->mmap : NULL,
+		 vma->vm_file ? vma->vm_file->f_op->mmap_proto : NULL,
 		 mapping ? mapping->a_ops->read_folio : NULL);
 	dump_stack();
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
diff --git a/mm/mmap.c b/mm/mmap.c
index 81dd962a1cfc..411309c7b235 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -475,7 +475,7 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 				vm_flags &= ~VM_MAYEXEC;
 			}
 
-			if (!file->f_op->mmap)
+			if (!file_has_mmap_hook(file))
 				return -ENODEV;
 			if (vm_flags & (VM_GROWSDOWN|VM_GROWSUP))
 				return -EINVAL;
diff --git a/mm/vma.c b/mm/vma.c
index 1f2634b29568..76bd3a67ce0f 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -17,6 +17,11 @@ struct mmap_state {
 	unsigned long pglen;
 	unsigned long flags;
 	struct file *file;
+	pgprot_t page_prot;
+
+	/* User-defined fields, perhaps updated by .mmap_proto(). */
+	const struct vm_operations_struct *vm_ops;
+	void *vm_private_data;
 
 	unsigned long charged;
 	bool retry_merge;
@@ -40,6 +45,7 @@ struct mmap_state {
 		.pglen = PHYS_PFN(len_),				\
 		.flags = flags_,					\
 		.file = file_,						\
+		.page_prot = vm_get_page_prot(flags_),			\
 	}
 
 #define VMG_MMAP_STATE(name, map_, vma_)				\
@@ -2384,7 +2390,17 @@ static int __mmap_new_file_vma(struct mmap_state *map,
 	struct vma_iterator *vmi = map->vmi;
 	int error;
 
+	VM_WARN_ON(!file_has_mmap_hook(map->file));
+
 	vma->vm_file = get_file(map->file);
+
+	/*
+	 * The caller might only define .mmap_proto(), in which case we have
+	 * nothing further to do.
+	 */
+	if (!map->file->f_op->mmap)
+		return 0;
+
 	error = mmap_file(vma->vm_file, vma);
 	if (error) {
 		fput(vma->vm_file);
@@ -2441,7 +2457,7 @@ static int __mmap_new_vma(struct mmap_state *map, struct vm_area_struct **vmap)
 	vma_iter_config(vmi, map->addr, map->end);
 	vma_set_range(vma, map->addr, map->end, map->pgoff);
 	vm_flags_init(vma, map->flags);
-	vma->vm_page_prot = vm_get_page_prot(map->flags);
+	vma->vm_page_prot = map->page_prot;
 
 	if (vma_iter_prealloc(vmi, vma)) {
 		error = -ENOMEM;
@@ -2528,6 +2544,69 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma)
 	vma_set_page_prot(vma);
 }
 
+/* Does the driver backing this implement an .mmap_proto() hook? */
+static bool have_mmap_proto_hook(struct mmap_state *map)
+{
+	struct file *file = map->file;
+
+	return file && file->f_op->mmap_proto;
+}
+
+/*
+ * Invoke the f_op->mmap_proto() callback for a file-backed mapping that
+ * specifies it.
+ *
+ * This is called prior to any merge attempt, and updates whitelisted fields
+ * that are permitted to be updated by the caller.
+ *
+ * All but user-defined fields will be pre-populated with original values
+ *
+ * Returns 0 on success, or an error code otherwise.
+ */
+static int call_proto(struct mmap_state *map)
+{
+	int err;
+	const struct file_operations *f_op = map->file->f_op;
+	struct vma_proto proto = {
+		.mm = map->mm,
+		.start = map->addr,
+		.end = map->end,
+
+		.pgoff = map->pgoff,
+		.file = map->file,
+		.flags = map->flags,
+	};
+
+	/* Invoke the hook. */
+	err = f_op->mmap_proto(&proto);
+	if (err)
+		return err;
+
+	/* Update fields permitted to be changed. */
+	map->pgoff = proto.pgoff;
+	map->file = proto.file;
+	map->flags = proto.flags;
+	map->page_prot = proto.page_prot;
+	/* User-defined fields. */
+	map->vm_ops = proto.vm_ops;
+	map->vm_private_data = proto.private_data;
+
+	return 0;
+}
+
+static void set_vma_user_defined_fields(struct vm_area_struct *vma,
+		struct mmap_state *map)
+{
+	/*
+	 * If the .mmap() handler set these, that takes precedent (indicated by
+	 * the vma fields being non-empty).
+	 */
+	if (map->vm_ops && vma->vm_ops == &vma_dummy_vm_ops)
+		vma->vm_ops = map->vm_ops;
+	if (!vma->vm_private_data)
+		vma->vm_private_data = map->vm_private_data;
+}
+
 static unsigned long __mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
 		struct list_head *uf)
@@ -2537,8 +2616,11 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 	int error;
 	VMA_ITERATOR(vmi, mm, addr);
 	MMAP_STATE(map, mm, &vmi, addr, len, pgoff, vm_flags, file);
+	bool have_proto = have_mmap_proto_hook(&map);
 
 	error = __mmap_prepare(&map, uf);
+	if (!error && have_proto)
+		error = call_proto(&map);
 	if (error)
 		goto abort_munmap;
 
@@ -2556,6 +2638,9 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 			goto unacct_error;
 	}
 
+	if (have_proto)
+		set_vma_user_defined_fields(vma, &map);
+
 	/* If flags changed, we might be able to merge, so try again. */
 	if (map.retry_merge) {
 		struct vm_area_struct *merged;
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 198abe66de5a..56a49d455949 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -253,8 +253,40 @@ struct mm_struct {
 	unsigned long flags; /* Must use atomic bitops to access */
 };
 
+struct vm_area_struct;
+
+/*
+ * Describes a prototype VMA that is about to be mmap()'ed. Drivers may choose
+ * to manipulate non-const fields, which will cause those fields to be updated
+ * in the resultant VMA.
+ *
+ * Helper functions are not required for manipulating any field.
+ */
+struct vma_proto {
+	/* Immutable state. */
+	struct mm_struct *mm;
+	unsigned long start;
+	unsigned long end;
+
+	/* Mutable fields. Populated with initial state. */
+	pgoff_t pgoff;
+	struct file *file;
+	vm_flags_t flags;
+	pgprot_t page_prot;
+
+	/* Write-only fields. */
+	const struct vm_operations_struct *vm_ops;
+	void *private_data;
+};
+
+struct file_operations {
+	int (*mmap)(struct file *, struct vm_area_struct *);
+	int (*mmap_proto)(struct vma_proto *);
+};
+
 struct file {
 	struct address_space	*f_mapping;
+	const struct file_operations	*f_op;
 };
 
 #define VMA_LOCK_OFFSET	0x40000000
@@ -1405,4 +1437,10 @@ static inline void free_anon_vma_name(struct vm_area_struct *vma)
 	(void)vma;
 }
 
+/* Does the file have an .mmap() hook? */
+static inline bool file_has_mmap_hook(struct file *file)
+{
+	return file->f_op->mmap || file->f_op->mmap_proto;
+}
+
 #endif	/* __MM_VMA_INTERNAL_H */
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook
  2025-04-30 19:54 ` [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook Lorenzo Stoakes
@ 2025-04-30 19:59   ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

Secretmem has a simple .mmap() hook which is easily converted to the new
.mmap_proto() callback.

In addition, importantly, it's a rare instance of a mergeable VMA mapping
which adjusts parameters which affect merge compatibility. By using the
.mmap_proto() callback there's no longer any need to retry the merge later
as we can simply set the correct flags from the start.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/secretmem.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/mm/secretmem.c b/mm/secretmem.c
index 1b0a214ee558..64fc0890a28b 100644
--- a/mm/secretmem.c
+++ b/mm/secretmem.c
@@ -120,18 +120,18 @@ static int secretmem_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
-static int secretmem_mmap(struct file *file, struct vm_area_struct *vma)
+static int secretmem_mmap_proto(struct vma_proto *proto)
 {
-	unsigned long len = vma->vm_end - vma->vm_start;
+	unsigned long len = proto->end - proto->start;
 
-	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0)
+	if ((proto->flags & (VM_SHARED | VM_MAYSHARE)) == 0)
 		return -EINVAL;
 
-	if (!mlock_future_ok(vma->vm_mm, vma->vm_flags | VM_LOCKED, len))
+	if (!mlock_future_ok(proto->mm, proto->flags | VM_LOCKED, len))
 		return -EAGAIN;
 
-	vm_flags_set(vma, VM_LOCKED | VM_DONTDUMP);
-	vma->vm_ops = &secretmem_vm_ops;
+	proto->flags |= VM_LOCKED | VM_DONTDUMP;
+	proto->vm_ops = &secretmem_vm_ops;
 
 	return 0;
 }
@@ -143,7 +143,7 @@ bool vma_is_secretmem(struct vm_area_struct *vma)
 
 static const struct file_operations secretmem_fops = {
 	.release	= secretmem_release,
-	.mmap		= secretmem_mmap,
+	.mmap_proto	= secretmem_mmap_proto,
 };
 
 static int secretmem_migrate_folio(struct address_space *mapping,
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [RFC PATCH 3/3] mm/vma: remove mmap() retry merge
  2025-04-30 19:54 ` [RFC PATCH 3/3] mm/vma: remove mmap() retry merge Lorenzo Stoakes
@ 2025-04-30 19:59   ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-04-30 19:59 UTC (permalink / raw)
  To: Andrew Morton
  Cc: David Hildenbrand, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Jann Horn,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

We have now introduced a mechanism that obviates the need for a reattempted
merge via the f_op->mmap_proto() hook, so eliminate this functionality
altogether.

The retry merge logic has been the cause of a great deal of complexity in
the past and required a great deal of careful manoeuvring of code to ensure
its continued and correct functionality.

It has also recently been involved in an issue surrounding maple tree
state, which again points to its problematic nature.

We make it much easier to reason about mmap() logic by eliminating this and
simply writing a VMA once. This also opens the doors to future
optimisation and improvement in the mmap() logic.

For any device or file system which encounters unwanted VMA fragmentation
as a result of this change (that is, having not implemented .mmap_proto
hooks), the issue is easily resolvable by doing so.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 mm/vma.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/mm/vma.c b/mm/vma.c
index 76bd3a67ce0f..40c98f88472e 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -24,7 +24,6 @@ struct mmap_state {
 	void *vm_private_data;
 
 	unsigned long charged;
-	bool retry_merge;
 
 	struct vm_area_struct *prev;
 	struct vm_area_struct *next;
@@ -2423,8 +2422,6 @@ static int __mmap_new_file_vma(struct mmap_state *map,
 			!(map->flags & VM_MAYWRITE) &&
 			(vma->vm_flags & VM_MAYWRITE));
 
-	/* If the flags change (and are mergeable), let's retry later. */
-	map->retry_merge = vma->vm_flags != map->flags && !(vma->vm_flags & VM_SPECIAL);
 	map->flags = vma->vm_flags;
 
 	return 0;
@@ -2641,17 +2638,6 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 	if (have_proto)
 		set_vma_user_defined_fields(vma, &map);
 
-	/* If flags changed, we might be able to merge, so try again. */
-	if (map.retry_merge) {
-		struct vm_area_struct *merged;
-		VMG_MMAP_STATE(vmg, &map, vma);
-
-		vma_iter_config(map.vmi, map.addr, map.end);
-		merged = vma_merge_existing_range(&vmg);
-		if (merged)
-			vma = merged;
-	}
-
 	__mmap_complete(&map, vma);
 
 	return addr;
-- 
2.49.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
  2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
                   ` (4 preceding siblings ...)
  2025-04-30 19:59 ` Lorenzo Stoakes
@ 2025-04-30 21:29 ` Jann Horn
  2025-05-01 10:27   ` Lorenzo Stoakes
  5 siblings, 1 reply; 22+ messages in thread
From: Jann Horn @ 2025-04-30 21:29 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

On Wed, Apr 30, 2025 at 9:59 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> A driver can specify either .mmap_proto(), .mmap() or both. This provides
> maximum flexibility.

Just to check I understand the intent correctly: The idea here is that
.mmap_proto() is, at least for now, only for drivers who want to
trigger merging, right? If a driver doesn't need merging, the normal
.mmap() handler can still do all the things it was able to do before
(like changing the ->vm_file pointer through vma_set_file(), or
changing VMA flags in allowed ways)?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
  2025-04-30 19:59   ` Lorenzo Stoakes
@ 2025-04-30 21:44   ` Jann Horn
  2025-05-01 10:37     ` Lorenzo Stoakes
  2025-04-30 21:58   ` David Hildenbrand
  2 siblings, 1 reply; 22+ messages in thread
From: Jann Horn @ 2025-04-30 21:44 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Christian Brauner,
	Jan Kara

On Wed, Apr 30, 2025 at 9:54 PM Lorenzo Stoakes
<lorenzo.stoakes@oracle.com> wrote:
> Provide a means by which drivers can specify which fields of those
> permitted to be changed should be altered to prior to mmap()'ing a
> range (which may either result from a merge or from mapping an entirely new
> VMA).
>
> Doing so is substantially safer than the existing .mmap() calback which
> provides unrestricted access to the part-constructed VMA and permits
> drivers and file systems to do 'creative' things which makes it hard to
> reason about the state of the VMA after the function returns.
>
> The existing .mmap() callback's freedom has caused a great deal of issues,
> especially in error handling, as unwinding the mmap() state has proven to
> be non-trivial and caused significant issues in the past, for instance
> those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> error path behaviour").
>
> It also necessitates a second attempt at merge once the .mmap() callback
> has completed, which has caused issues in the past, is awkward, adds
> overhead and is difficult to reason about.
>
> The .mmap_proto() callback eliminates this requirement, as we can update
> fields prior to even attempting the first merge. It is safer, as we heavily
> restrict what can actually be modified, and being invoked very early in the
> mmap() process, error handling can be performed safely with very little
> unwinding of state required.

I wonder if this requires adjustments to the existing users of
call_mmap() that use call_mmap() for forwarding mmap operations to
some kind of backing file. In particular fuse_passthrough_mmap(),
which I think can operate on fairly arbitrary user-supplied backing
files (for context, I think fuse_backing_open() allows root to just
provide an fd to be used as backing file).

I guess the easiest approach would be to add bailouts to those if an
->mmap_proto handler exists for now, and revisit this if we ever want
to use ->mmap_proto for more normal types of files?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
  2025-04-30 19:59   ` Lorenzo Stoakes
  2025-04-30 21:44   ` Jann Horn
@ 2025-04-30 21:58   ` David Hildenbrand
  2025-05-01 10:23     ` Lorenzo Stoakes
                       ` (2 more replies)
  2 siblings, 3 replies; 22+ messages in thread
From: David Hildenbrand @ 2025-04-30 21:58 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: Liam R . Howlett, Vlastimil Babka, Mike Rapoport, Jann Horn,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Suren Baghdasaryan,
	Michal Hocko

On 30.04.25 21:54, Lorenzo Stoakes wrote:
> Provide a means by which drivers can specify which fields of those
> permitted to be changed should be altered to prior to mmap()'ing a
> range (which may either result from a merge or from mapping an entirely new
> VMA).
> 
> Doing so is substantially safer than the existing .mmap() calback which
> provides unrestricted access to the part-constructed VMA and permits
> drivers and file systems to do 'creative' things which makes it hard to
> reason about the state of the VMA after the function returns.
> 
> The existing .mmap() callback's freedom has caused a great deal of issues,
> especially in error handling, as unwinding the mmap() state has proven to
> be non-trivial and caused significant issues in the past, for instance
> those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> error path behaviour").
> 
> It also necessitates a second attempt at merge once the .mmap() callback
> has completed, which has caused issues in the past, is awkward, adds
> overhead and is difficult to reason about.
> 
> The .mmap_proto() callback eliminates this requirement, as we can update
> fields prior to even attempting the first merge. It is safer, as we heavily
> restrict what can actually be modified, and being invoked very early in the
> mmap() process, error handling can be performed safely with very little
> unwinding of state required.
> 
> Update vma userland test stubs to account for changes.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>


I really don't like the "proto" terminology. :)

[yes, David and his naming :P ]

No, the problem is that it is fairly unintuitive what is happening here.

Coming from a different direction, the callback is trigger after 
__mmap_prepare() ... could we call it "->mmap_prepare" or something like 
that? (mmap_setup, whatever)

Maybe mmap_setup and vma_setup_param? Just a thought ...


In general (although it's late in Germany), it does sound like an 
interesting approach.

How feasiable is it to remove ->mmap in the long run, and would we maybe 
need other callbacks to make that possible?


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 21:58   ` David Hildenbrand
@ 2025-05-01 10:23     ` Lorenzo Stoakes
  2025-05-01 12:17       ` Mike Rapoport
  2025-05-01 13:51     ` Liam R. Howlett
  2025-05-05 13:29     ` Christian Brauner
  2 siblings, 1 reply; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 10:23 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Andrew Morton, Liam R . Howlett, Vlastimil Babka, Mike Rapoport,
	Jann Horn, Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Suren Baghdasaryan,
	Michal Hocko

On Wed, Apr 30, 2025 at 11:58:14PM +0200, David Hildenbrand wrote:
> On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > Provide a means by which drivers can specify which fields of those
> > permitted to be changed should be altered to prior to mmap()'ing a
> > range (which may either result from a merge or from mapping an entirely new
> > VMA).
> >
> > Doing so is substantially safer than the existing .mmap() calback which
> > provides unrestricted access to the part-constructed VMA and permits
> > drivers and file systems to do 'creative' things which makes it hard to
> > reason about the state of the VMA after the function returns.
> >
> > The existing .mmap() callback's freedom has caused a great deal of issues,
> > especially in error handling, as unwinding the mmap() state has proven to
> > be non-trivial and caused significant issues in the past, for instance
> > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > error path behaviour").
> >
> > It also necessitates a second attempt at merge once the .mmap() callback
> > has completed, which has caused issues in the past, is awkward, adds
> > overhead and is difficult to reason about.
> >
> > The .mmap_proto() callback eliminates this requirement, as we can update
> > fields prior to even attempting the first merge. It is safer, as we heavily
> > restrict what can actually be modified, and being invoked very early in the
> > mmap() process, error handling can be performed safely with very little
> > unwinding of state required.
> >
> > Update vma userland test stubs to account for changes.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>
>
> I really don't like the "proto" terminology. :)
>
> [yes, David and his naming :P ]
>
> No, the problem is that it is fairly unintuitive what is happening here.
>
> Coming from a different direction, the callback is trigger after
> __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> that? (mmap_setup, whatever)
>
> Maybe mmap_setup and vma_setup_param? Just a thought ...

Haha that's fine, I'm not sure I love 'proto' either to be honest, naming is
hard...

I would rather not refer to VMA's at all to be honest, if I had my way, no
driver would ever have access to a VMA at all...

But mmap_setup() or mmap_prepare() sound good!

>
>
> In general (although it's late in Germany), it does sound like an
> interesting approach.

Thanks! Appreciate it :) I really want to attack this, as I _hate_ how we
effectively allow drivers to do _anything_ with VMAs like this.

Yes, hate-driven development...

Locking this down is just a generally good idea I think!

Was late in UK too when I sent :P hence why I managed to not send it properly
the first time... (sorry again...)

>
> How feasiable is it to remove ->mmap in the long run, and would we maybe
> need other callbacks to make that possible?

I do think it is, because we can do this super-incrementally, and I'm willing to
put in the legwork to gradually move drivers over.

I think it might be folio-like in taking some time, but we'll get there
(obviously _nowhere near_ the impact of that work, a mere humble effort, but
comparable somewhat in this regard).

I actually took the time to look through ~350 or so .mmap() callbacks so it's
not so crazy to delve in either.

Re: other callbacks, yes I suspect we will need. But I think we are fine to
start with this and add as needed.

I suspect esp. given Jann's comments we might want to make .mmap_prepare() and
.mmap() mutualy exclusive actually. Idea of allowing them both wass flexibility
but I think is more downside than up.

We can then add additional callbacks as needed. Also good for the transition
away from .mmap() which I really want to absolutely deprecate.

>
>
> --
> Cheers,
>
> David / dhildenb
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
  2025-04-30 21:29 ` Jann Horn
@ 2025-05-01 10:27   ` Lorenzo Stoakes
  2025-05-01 14:33     ` Lorenzo Stoakes
  0 siblings, 1 reply; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 10:27 UTC (permalink / raw)
  To: Jann Horn
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

On Wed, Apr 30, 2025 at 11:29:46PM +0200, Jann Horn wrote:
> On Wed, Apr 30, 2025 at 9:59 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > A driver can specify either .mmap_proto(), .mmap() or both. This provides
> > maximum flexibility.
>
> Just to check I understand the intent correctly: The idea here is that
> .mmap_proto() is, at least for now, only for drivers who want to
> trigger merging, right? If a driver doesn't need merging, the normal
> .mmap() handler can still do all the things it was able to do before
> (like changing the ->vm_file pointer through vma_set_file(), or
> changing VMA flags in allowed ways)?

No, the intent is that this form the basis of an entirely new set of callbacks
to use _instead_ of .mmap().

The vma_set_file() semantics would need to be changed actually, I will update
logic to do this implicitly when the user sets vma_proto (or whatever this gets
renamed to :P)->file - we can do this automagically.

However the first use is indeed to be able to remove this merge retry. I have
gone through .mmap() callbacks and identified the only one that seems
meaningfully to care, so it's a great first use.

Two birds with one stone, and forming the foundatio for future work to prevent
drivers from having carte blache to do whatever on .mmap() wrt vma's.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 21:44   ` Jann Horn
@ 2025-05-01 10:37     ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 10:37 UTC (permalink / raw)
  To: Jann Horn
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Christian Brauner,
	Jan Kara

On Wed, Apr 30, 2025 at 11:44:15PM +0200, Jann Horn wrote:
> On Wed, Apr 30, 2025 at 9:54 PM Lorenzo Stoakes
> <lorenzo.stoakes@oracle.com> wrote:
> > Provide a means by which drivers can specify which fields of those
> > permitted to be changed should be altered to prior to mmap()'ing a
> > range (which may either result from a merge or from mapping an entirely new
> > VMA).
> >
> > Doing so is substantially safer than the existing .mmap() calback which
> > provides unrestricted access to the part-constructed VMA and permits
> > drivers and file systems to do 'creative' things which makes it hard to
> > reason about the state of the VMA after the function returns.
> >
> > The existing .mmap() callback's freedom has caused a great deal of issues,
> > especially in error handling, as unwinding the mmap() state has proven to
> > be non-trivial and caused significant issues in the past, for instance
> > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > error path behaviour").
> >
> > It also necessitates a second attempt at merge once the .mmap() callback
> > has completed, which has caused issues in the past, is awkward, adds
> > overhead and is difficult to reason about.
> >
> > The .mmap_proto() callback eliminates this requirement, as we can update
> > fields prior to even attempting the first merge. It is safer, as we heavily
> > restrict what can actually be modified, and being invoked very early in the
> > mmap() process, error handling can be performed safely with very little
> > unwinding of state required.
>
> I wonder if this requires adjustments to the existing users of
> call_mmap() that use call_mmap() for forwarding mmap operations to
> some kind of backing file. In particular fuse_passthrough_mmap(),
> which I think can operate on fairly arbitrary user-supplied backing
> files (for context, I think fuse_backing_open() allows root to just
> provide an fd to be used as backing file).

Yeah the fact these exist is just another example of us being far, far, far
too permissive on this stuff imo.

I mean it's useful ofc, but the fact you have multiple layers of being able
to do _anything_ isn't great...

>
> I guess the easiest approach would be to add bailouts to those if an
> ->mmap_proto handler exists for now, and revisit this if we ever want
> to use ->mmap_proto for more normal types of files?

Yeah good point, luckily we abstract to call_mmap(), will have that bail
out in that case, thanks!

I think by implication we shouldn't allow .mmap_proto() and .mmap() to
co-exist, rather in future we can add additional callbacks as needed (see
discussion with David).

Will respin accordingly... :)

Thanks for taking a look, much appreciated to both you and David! :)

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-05-01 10:23     ` Lorenzo Stoakes
@ 2025-05-01 12:17       ` Mike Rapoport
  2025-05-01 13:00         ` Lorenzo Stoakes
  0 siblings, 1 reply; 22+ messages in thread
From: Mike Rapoport @ 2025-05-01 12:17 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Andrew Morton, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Christian Brauner,
	Jan Kara, Suren Baghdasaryan, Michal Hocko

On Thu, May 01, 2025 at 11:23:32AM +0100, Lorenzo Stoakes wrote:
> On Wed, Apr 30, 2025 at 11:58:14PM +0200, David Hildenbrand wrote:
> > On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > > Provide a means by which drivers can specify which fields of those
> > > permitted to be changed should be altered to prior to mmap()'ing a
> > > range (which may either result from a merge or from mapping an entirely new
> > > VMA).
> > >
> > > Doing so is substantially safer than the existing .mmap() calback which
> > > provides unrestricted access to the part-constructed VMA and permits
> > > drivers and file systems to do 'creative' things which makes it hard to
> > > reason about the state of the VMA after the function returns.
> > >
> > > The existing .mmap() callback's freedom has caused a great deal of issues,
> > > especially in error handling, as unwinding the mmap() state has proven to
> > > be non-trivial and caused significant issues in the past, for instance
> > > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > > error path behaviour").
> > >
> > > It also necessitates a second attempt at merge once the .mmap() callback
> > > has completed, which has caused issues in the past, is awkward, adds
> > > overhead and is difficult to reason about.
> > >
> > > The .mmap_proto() callback eliminates this requirement, as we can update
> > > fields prior to even attempting the first merge. It is safer, as we heavily
> > > restrict what can actually be modified, and being invoked very early in the
> > > mmap() process, error handling can be performed safely with very little
> > > unwinding of state required.
> > >
> > > Update vma userland test stubs to account for changes.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> >
> >
> > I really don't like the "proto" terminology. :)
> >
> > [yes, David and his naming :P ]
> >
> > No, the problem is that it is fairly unintuitive what is happening here.
> >
> > Coming from a different direction, the callback is trigger after
> > __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> > that? (mmap_setup, whatever)
> >
> > Maybe mmap_setup and vma_setup_param? Just a thought ...
> 
> Haha that's fine, I'm not sure I love 'proto' either to be honest, naming is
> hard...
> 
> I would rather not refer to VMA's at all to be honest, if I had my way, no
> driver would ever have access to a VMA at all...
> 
> But mmap_setup() or mmap_prepare() sound good!

+1

and struct vm_area_desc maybe? 

> >
> >
> > In general (although it's late in Germany), it does sound like an
> > interesting approach.
> 
> Thanks! Appreciate it :) I really want to attack this, as I _hate_ how we
> effectively allow drivers to do _anything_ with VMAs like this.
> 
> Yes, hate-driven development...

Just move vm_area_struct to mm/internal.h and let them cope :-D

-- 
Sincerely yours,
Mike.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-05-01 12:17       ` Mike Rapoport
@ 2025-05-01 13:00         ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 13:00 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: David Hildenbrand, Andrew Morton, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Christian Brauner,
	Jan Kara, Suren Baghdasaryan, Michal Hocko

On Thu, May 01, 2025 at 03:17:07PM +0300, Mike Rapoport wrote:
> On Thu, May 01, 2025 at 11:23:32AM +0100, Lorenzo Stoakes wrote:
> > On Wed, Apr 30, 2025 at 11:58:14PM +0200, David Hildenbrand wrote:
> > > On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > > > Provide a means by which drivers can specify which fields of those
> > > > permitted to be changed should be altered to prior to mmap()'ing a
> > > > range (which may either result from a merge or from mapping an entirely new
> > > > VMA).
> > > >
> > > > Doing so is substantially safer than the existing .mmap() calback which
> > > > provides unrestricted access to the part-constructed VMA and permits
> > > > drivers and file systems to do 'creative' things which makes it hard to
> > > > reason about the state of the VMA after the function returns.
> > > >
> > > > The existing .mmap() callback's freedom has caused a great deal of issues,
> > > > especially in error handling, as unwinding the mmap() state has proven to
> > > > be non-trivial and caused significant issues in the past, for instance
> > > > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > > > error path behaviour").
> > > >
> > > > It also necessitates a second attempt at merge once the .mmap() callback
> > > > has completed, which has caused issues in the past, is awkward, adds
> > > > overhead and is difficult to reason about.
> > > >
> > > > The .mmap_proto() callback eliminates this requirement, as we can update
> > > > fields prior to even attempting the first merge. It is safer, as we heavily
> > > > restrict what can actually be modified, and being invoked very early in the
> > > > mmap() process, error handling can be performed safely with very little
> > > > unwinding of state required.
> > > >
> > > > Update vma userland test stubs to account for changes.
> > > >
> > > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > >
> > >
> > > I really don't like the "proto" terminology. :)
> > >
> > > [yes, David and his naming :P ]
> > >
> > > No, the problem is that it is fairly unintuitive what is happening here.
> > >
> > > Coming from a different direction, the callback is trigger after
> > > __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> > > that? (mmap_setup, whatever)
> > >
> > > Maybe mmap_setup and vma_setup_param? Just a thought ...
> >
> > Haha that's fine, I'm not sure I love 'proto' either to be honest, naming is
> > hard...
> >
> > I would rather not refer to VMA's at all to be honest, if I had my way, no
> > driver would ever have access to a VMA at all...
> >
> > But mmap_setup() or mmap_prepare() sound good!
>
> +1
>
> and struct vm_area_desc maybe?

That's nice actually thanks, will do!

>
> > >
> > >
> > > In general (although it's late in Germany), it does sound like an
> > > interesting approach.
> >
> > Thanks! Appreciate it :) I really want to attack this, as I _hate_ how we
> > effectively allow drivers to do _anything_ with VMAs like this.
> >
> > Yes, hate-driven development...
>
> Just move vm_area_struct to mm/internal.h and let them cope :-D

Haha oh man the dream. Though it'd be vma.h of course :P

>
> --
> Sincerely yours,
> Mike.

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 21:58   ` David Hildenbrand
  2025-05-01 10:23     ` Lorenzo Stoakes
@ 2025-05-01 13:51     ` Liam R. Howlett
  2025-05-01 13:57       ` Lorenzo Stoakes
  2025-05-05 13:29     ` Christian Brauner
  2 siblings, 1 reply; 22+ messages in thread
From: Liam R. Howlett @ 2025-05-01 13:51 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Lorenzo Stoakes, Andrew Morton, Vlastimil Babka, Mike Rapoport,
	Jann Horn, Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Suren Baghdasaryan,
	Michal Hocko

* David Hildenbrand <david@redhat.com> [250430 17:58]:
> On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > Provide a means by which drivers can specify which fields of those
> > permitted to be changed should be altered to prior to mmap()'ing a
> > range (which may either result from a merge or from mapping an entirely new
> > VMA).
> > 
> > Doing so is substantially safer than the existing .mmap() calback which
> > provides unrestricted access to the part-constructed VMA and permits
> > drivers and file systems to do 'creative' things which makes it hard to
> > reason about the state of the VMA after the function returns.
> > 
> > The existing .mmap() callback's freedom has caused a great deal of issues,
> > especially in error handling, as unwinding the mmap() state has proven to
> > be non-trivial and caused significant issues in the past, for instance
> > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > error path behaviour").
> > 
> > It also necessitates a second attempt at merge once the .mmap() callback
> > has completed, which has caused issues in the past, is awkward, adds
> > overhead and is difficult to reason about.
> > 
> > The .mmap_proto() callback eliminates this requirement, as we can update
> > fields prior to even attempting the first merge. It is safer, as we heavily
> > restrict what can actually be modified, and being invoked very early in the
> > mmap() process, error handling can be performed safely with very little
> > unwinding of state required.
> > 
> > Update vma userland test stubs to account for changes.
> > 
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> 
> I really don't like the "proto" terminology. :)
> 
> [yes, David and his naming :P ]
> 
> No, the problem is that it is fairly unintuitive what is happening here.
> 
> Coming from a different direction, the callback is trigger after
> __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> that? (mmap_setup, whatever)
> 
> Maybe mmap_setup and vma_setup_param? Just a thought ...

Although I don't really mind what we call this, I don't like the flags
name.  Can we qualify it with vm_flags?  It looks dumb most of the time
but we have had variables named "flags" set to the wrong flag type make
it through code review and into the kernel.

That is, we may see people set a struct vma_proto proto later do
proto.flags = map_flags.  It sounds stupid here, but we have had cases
of exactly this making it through to a kernel release.

I bring this up here because it may influence the prefix of the setup
call, or vice versa... and not _just_ to derail another renaming.

> 
> 
> In general (although it's late in Germany), it does sound like an
> interesting approach.
> 
> How feasiable is it to remove ->mmap in the long run, and would we maybe
> need other callbacks to make that possible?
> 
> 
> -- 
> Cheers,
> 
> David / dhildenb
> 

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-05-01 13:51     ` Liam R. Howlett
@ 2025-05-01 13:57       ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 13:57 UTC (permalink / raw)
  To: Liam R. Howlett, David Hildenbrand, Andrew Morton,
	Vlastimil Babka, Mike Rapoport, Jann Horn, Pedro Falcato,
	linux-fsdevel, linux-kernel, linux-mm, Alexander Viro,
	Christian Brauner, Jan Kara, Suren Baghdasaryan, Michal Hocko

On Thu, May 01, 2025 at 09:51:26AM -0400, Liam R. Howlett wrote:
> * David Hildenbrand <david@redhat.com> [250430 17:58]:
> > On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > > Provide a means by which drivers can specify which fields of those
> > > permitted to be changed should be altered to prior to mmap()'ing a
> > > range (which may either result from a merge or from mapping an entirely new
> > > VMA).
> > >
> > > Doing so is substantially safer than the existing .mmap() calback which
> > > provides unrestricted access to the part-constructed VMA and permits
> > > drivers and file systems to do 'creative' things which makes it hard to
> > > reason about the state of the VMA after the function returns.
> > >
> > > The existing .mmap() callback's freedom has caused a great deal of issues,
> > > especially in error handling, as unwinding the mmap() state has proven to
> > > be non-trivial and caused significant issues in the past, for instance
> > > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > > error path behaviour").
> > >
> > > It also necessitates a second attempt at merge once the .mmap() callback
> > > has completed, which has caused issues in the past, is awkward, adds
> > > overhead and is difficult to reason about.
> > >
> > > The .mmap_proto() callback eliminates this requirement, as we can update
> > > fields prior to even attempting the first merge. It is safer, as we heavily
> > > restrict what can actually be modified, and being invoked very early in the
> > > mmap() process, error handling can be performed safely with very little
> > > unwinding of state required.
> > >
> > > Update vma userland test stubs to account for changes.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> >
> >
> > I really don't like the "proto" terminology. :)
> >
> > [yes, David and his naming :P ]
> >
> > No, the problem is that it is fairly unintuitive what is happening here.
> >
> > Coming from a different direction, the callback is trigger after
> > __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> > that? (mmap_setup, whatever)
> >
> > Maybe mmap_setup and vma_setup_param? Just a thought ...
>
> Although I don't really mind what we call this, I don't like the flags
> name.  Can we qualify it with vm_flags?  It looks dumb most of the time
> but we have had variables named "flags" set to the wrong flag type make
> it through code review and into the kernel.

Sure, will do!

>
> That is, we may see people set a struct vma_proto proto later do
> proto.flags = map_flags.  It sounds stupid here, but we have had cases
> of exactly this making it through to a kernel release.
>
> I bring this up here because it may influence the prefix of the setup
> call, or vice versa... and not _just_ to derail another renaming.

;)

Yeah, 'flags' is one of the more ambigious names in the kernel
generally... I did go back and forth on this one but this is a good point,
and it's an easy mistake to make, sadly...

>
> >
> >
> > In general (although it's late in Germany), it does sound like an
> > interesting approach.
> >
> > How feasiable is it to remove ->mmap in the long run, and would we maybe
> > need other callbacks to make that possible?
> >
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >
>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook
  2025-05-01 10:27   ` Lorenzo Stoakes
@ 2025-05-01 14:33     ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-01 14:33 UTC (permalink / raw)
  To: Jann Horn
  Cc: Andrew Morton, David Hildenbrand, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm,
	Alexander Viro, Christian Brauner, Jan Kara, Matthew Wilcox

On Thu, May 01, 2025 at 11:27:51AM +0100, Lorenzo Stoakes wrote:
> On Wed, Apr 30, 2025 at 11:29:46PM +0200, Jann Horn wrote:
> > On Wed, Apr 30, 2025 at 9:59 PM Lorenzo Stoakes
> > <lorenzo.stoakes@oracle.com> wrote:
> > > A driver can specify either .mmap_proto(), .mmap() or both. This provides
> > > maximum flexibility.
> >
> > Just to check I understand the intent correctly: The idea here is that
> > .mmap_proto() is, at least for now, only for drivers who want to
> > trigger merging, right? If a driver doesn't need merging, the normal
> > .mmap() handler can still do all the things it was able to do before
> > (like changing the ->vm_file pointer through vma_set_file(), or
> > changing VMA flags in allowed ways)?
>
> No, the intent is that this form the basis of an entirely new set of callbacks
> to use _instead_ of .mmap().
>
> The vma_set_file() semantics would need to be changed actually, I will update
> logic to do this implicitly when the user sets vma_proto (or whatever this gets
> renamed to :P)->file - we can do this automagically.

Actually there is no need, as the file pointer we pass to the caller does
not have an incremented reference count (it is pinned by the .mmap() call),
we only increment it once it's in the VMA so this is unnecessary.

>
> However the first use is indeed to be able to remove this merge retry. I have
> gone through .mmap() callbacks and identified the only one that seems
> meaningfully to care, so it's a great first use.
>
> Two birds with one stone, and forming the foundatio for future work to prevent
> drivers from having carte blache to do whatever on .mmap() wrt vma's.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-04-30 21:58   ` David Hildenbrand
  2025-05-01 10:23     ` Lorenzo Stoakes
  2025-05-01 13:51     ` Liam R. Howlett
@ 2025-05-05 13:29     ` Christian Brauner
  2025-05-06 10:01       ` Lorenzo Stoakes
  2 siblings, 1 reply; 22+ messages in thread
From: Christian Brauner @ 2025-05-05 13:29 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Lorenzo Stoakes, Andrew Morton, Liam R . Howlett, Vlastimil Babka,
	Mike Rapoport, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm, Alexander Viro, Jan Kara,
	Suren Baghdasaryan, Michal Hocko

On Wed, Apr 30, 2025 at 11:58:14PM +0200, David Hildenbrand wrote:
> On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > Provide a means by which drivers can specify which fields of those
> > permitted to be changed should be altered to prior to mmap()'ing a
> > range (which may either result from a merge or from mapping an entirely new
> > VMA).
> > 
> > Doing so is substantially safer than the existing .mmap() calback which
> > provides unrestricted access to the part-constructed VMA and permits
> > drivers and file systems to do 'creative' things which makes it hard to
> > reason about the state of the VMA after the function returns.
> > 
> > The existing .mmap() callback's freedom has caused a great deal of issues,
> > especially in error handling, as unwinding the mmap() state has proven to
> > be non-trivial and caused significant issues in the past, for instance
> > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > error path behaviour").
> > 
> > It also necessitates a second attempt at merge once the .mmap() callback
> > has completed, which has caused issues in the past, is awkward, adds
> > overhead and is difficult to reason about.
> > 
> > The .mmap_proto() callback eliminates this requirement, as we can update
> > fields prior to even attempting the first merge. It is safer, as we heavily
> > restrict what can actually be modified, and being invoked very early in the
> > mmap() process, error handling can be performed safely with very little
> > unwinding of state required.
> > 
> > Update vma userland test stubs to account for changes.
> > 
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> 
> I really don't like the "proto" terminology. :)
> 
> [yes, David and his naming :P ]
> 
> No, the problem is that it is fairly unintuitive what is happening here.
> 
> Coming from a different direction, the callback is trigger after
> __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> that? (mmap_setup, whatever)
> 
> Maybe mmap_setup and vma_setup_param? Just a thought ...
> 
> 
> In general (although it's late in Germany), it does sound like an
> interesting approach.
> 
> How feasiable is it to remove ->mmap in the long run, and would we maybe
> need other callbacks to make that possible?

If mm needs new file operations that aim to replace the old ->mmap() I
want the old method to be ripped out within a reasonable time frame. I
don't want to have ->mmap() and ->mmap_$new() hanging around for the
next 5 years. We have enough of that already. And it would be great to
be clear whether that replacement can actually happen.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback
  2025-05-05 13:29     ` Christian Brauner
@ 2025-05-06 10:01       ` Lorenzo Stoakes
  0 siblings, 0 replies; 22+ messages in thread
From: Lorenzo Stoakes @ 2025-05-06 10:01 UTC (permalink / raw)
  To: Christian Brauner
  Cc: David Hildenbrand, Andrew Morton, Liam R . Howlett,
	Vlastimil Babka, Mike Rapoport, Jann Horn, Pedro Falcato,
	linux-fsdevel, linux-kernel, linux-mm, Alexander Viro, Jan Kara,
	Suren Baghdasaryan, Michal Hocko

On Mon, May 05, 2025 at 03:29:08PM +0200, Christian Brauner wrote:
> On Wed, Apr 30, 2025 at 11:58:14PM +0200, David Hildenbrand wrote:
> > On 30.04.25 21:54, Lorenzo Stoakes wrote:
> > > Provide a means by which drivers can specify which fields of those
> > > permitted to be changed should be altered to prior to mmap()'ing a
> > > range (which may either result from a merge or from mapping an entirely new
> > > VMA).
> > >
> > > Doing so is substantially safer than the existing .mmap() calback which
> > > provides unrestricted access to the part-constructed VMA and permits
> > > drivers and file systems to do 'creative' things which makes it hard to
> > > reason about the state of the VMA after the function returns.
> > >
> > > The existing .mmap() callback's freedom has caused a great deal of issues,
> > > especially in error handling, as unwinding the mmap() state has proven to
> > > be non-trivial and caused significant issues in the past, for instance
> > > those addressed in commit 5de195060b2e ("mm: resolve faulty mmap_region()
> > > error path behaviour").
> > >
> > > It also necessitates a second attempt at merge once the .mmap() callback
> > > has completed, which has caused issues in the past, is awkward, adds
> > > overhead and is difficult to reason about.
> > >
> > > The .mmap_proto() callback eliminates this requirement, as we can update
> > > fields prior to even attempting the first merge. It is safer, as we heavily
> > > restrict what can actually be modified, and being invoked very early in the
> > > mmap() process, error handling can be performed safely with very little
> > > unwinding of state required.
> > >
> > > Update vma userland test stubs to account for changes.
> > >
> > > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> >
> >
> > I really don't like the "proto" terminology. :)
> >
> > [yes, David and his naming :P ]
> >
> > No, the problem is that it is fairly unintuitive what is happening here.
> >
> > Coming from a different direction, the callback is trigger after
> > __mmap_prepare() ... could we call it "->mmap_prepare" or something like
> > that? (mmap_setup, whatever)
> >
> > Maybe mmap_setup and vma_setup_param? Just a thought ...
> >
> >
> > In general (although it's late in Germany), it does sound like an
> > interesting approach.
> >
> > How feasiable is it to remove ->mmap in the long run, and would we maybe
> > need other callbacks to make that possible?
>
> If mm needs new file operations that aim to replace the old ->mmap() I
> want the old method to be ripped out within a reasonable time frame. I
> don't want to have ->mmap() and ->mmap_$new() hanging around for the
> next 5 years. We have enough of that already. And it would be great to
> be clear whether that replacement can actually happen.

Ack, I am committed to making these changes myself and will be putting effort
into doing this (and obviously liaising with others along these lines). And
absolutely it can, and will happen.

This has been an ongoing bugbear for me for some time, and recent issues
relating to a direct consequence of the overly permissive callback (that is,
relating to the merge retry we have to do afterwards) brought it into sharp
relief and precipitated this action.

So you can take this as a personal commitment, and you know who to shout
at if needed ;)

Cheers, Lorenzo

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2025-05-06 10:02 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-30 19:54 [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
2025-04-30 19:54 ` [RFC PATCH 1/3] mm: introduce new .mmap_proto() f_op callback Lorenzo Stoakes
2025-04-30 19:59   ` Lorenzo Stoakes
2025-04-30 21:44   ` Jann Horn
2025-05-01 10:37     ` Lorenzo Stoakes
2025-04-30 21:58   ` David Hildenbrand
2025-05-01 10:23     ` Lorenzo Stoakes
2025-05-01 12:17       ` Mike Rapoport
2025-05-01 13:00         ` Lorenzo Stoakes
2025-05-01 13:51     ` Liam R. Howlett
2025-05-01 13:57       ` Lorenzo Stoakes
2025-05-05 13:29     ` Christian Brauner
2025-05-06 10:01       ` Lorenzo Stoakes
2025-04-30 19:54 ` [RFC PATCH 2/3] mm: secretmem: convert to .mmap_proto() hook Lorenzo Stoakes
2025-04-30 19:59   ` Lorenzo Stoakes
2025-04-30 19:54 ` [RFC PATCH 3/3] mm/vma: remove mmap() retry merge Lorenzo Stoakes
2025-04-30 19:59   ` Lorenzo Stoakes
2025-04-30 19:58 ` [RFC PATCH 0/3] eliminate mmap() retry merge, add .mmap_proto hook Lorenzo Stoakes
2025-04-30 19:59 ` Lorenzo Stoakes
2025-04-30 21:29 ` Jann Horn
2025-05-01 10:27   ` Lorenzo Stoakes
2025-05-01 14:33     ` Lorenzo Stoakes

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).