linux-fsdevel.vger.kernel.org archive mirror
* [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
@ 2025-06-09  9:24 Lorenzo Stoakes
  2025-06-09 10:18 ` Vlastimil Babka
                   ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09  9:24 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm

Nested file systems, that is those which invoke call_mmap() within their
own f_op->mmap() handlers, may encounter underlying file systems which
provide the f_op->mmap_prepare() hook introduced by commit
c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").

We have a chicken-and-egg scenario here - until all file systems are
converted to using .mmap_prepare(), we cannot convert these nested
handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.

So we have to do it the other way round - invoke the .mmap_prepare() hook
from an .mmap() one.

In order to do so, we need to convert VMA state into a struct vm_area_desc
descriptor, invoke the underlying file system's f_op->mmap_prepare()
callback with a pointer to it, and then set VMA state accordingly and
safely.

This patch achieves this via the compat_vma_mmap_prepare() function, which
we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
passed in file pointer.

We place the fundamental logic into mm/vma.c where VMA manipulation
belongs. We also update the VMA userland tests to accommodate the changes.

The compat_vma_mmap_prepare() function and its associated machinery are
temporary, and will be removed once the conversion of file systems is
complete.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Reported-by: Jann Horn <jannh@google.com>
Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback")
---
 include/linux/fs.h               |  6 +++--
 mm/mmap.c                        | 39 +++++++++++++++++++++++++++
 mm/vma.c                         | 46 +++++++++++++++++++++++++++++++-
 mm/vma.h                         |  4 +++
 tools/testing/vma/vma_internal.h | 16 +++++++++++
 5 files changed, 108 insertions(+), 3 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 05abdabe9db7..8fe41a2b7527 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2274,10 +2274,12 @@ static inline bool file_has_valid_mmap_hooks(struct file *file)
 	return true;
 }
 
+int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
+
 static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	if (WARN_ON_ONCE(file->f_op->mmap_prepare))
-		return -EINVAL;
+	if (file->f_op->mmap_prepare)
+		return compat_vma_mmap_prepare(file, vma);
 
 	return file->f_op->mmap(file, vma);
 }
diff --git a/mm/mmap.c b/mm/mmap.c
index 09c563c95112..0755cb5d89d1 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1891,3 +1891,42 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
 	vm_unacct_memory(charge);
 	goto loop_out;
 }
+
+/**
+ * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
+ * existing VMA
+ * @file: The file which possesses an f_op->mmap_prepare() hook
+ * @vma; The VMA to apply the .mmap_prepare() hook to.
+ *
+ * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
+ * 'wrapper' file systems invoke a nested mmap hook of an underlying file.
+ *
+ * Until all filesystems are converted to use .mmap_prepare(), we must be
+ * conservative and continue to invoke these 'wrapper' filesystems using the
+ * deprecated .mmap() hook.
+ *
+ * However we have a problem if the underlying file system possesses an
+ * .mmap_prepare() hook, as we are in a different context when we invoke the
+ * .mmap() hook, already having a VMA to deal with.
+ *
+ * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
+ * establishes a struct vm_area_desc descriptor, passes it to the underlying
+ * .mmap_prepare() hook, and applies any changes the hook makes.
+ *
+ * Once the conversion of filesystems is complete this function will no longer
+ * be required and will be removed.
+ *
+ * Returns: 0 on success, or the error returned by the .mmap_prepare() hook.
+ */
+int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
+{
+	struct vm_area_desc desc;
+	int err;
+
+	err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc));
+	if (err)
+		return err;
+	set_vma_from_desc(vma, &desc);
+
+	return 0;
+}
diff --git a/mm/vma.c b/mm/vma.c
index 01b1d26d87b4..d771750f8f76 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -3153,7 +3153,6 @@ int __vm_munmap(unsigned long start, size_t len, bool unlock)
 	return ret;
 }
 
-
 /* Insert vm structure into process list sorted by address
  * and into the inode's i_mmap tree.  If vm_file is non-NULL
  * then i_mmap_rwsem is taken here.
@@ -3195,3 +3194,48 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
 
 	return 0;
 }
+
+/*
+ * Temporary helper functions for file systems which wrap an invocation of
+ * f_op->mmap() but which might have an underlying file system which implements
+ * f_op->mmap_prepare().
+ */
+
+struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
+		struct vm_area_desc *desc)
+{
+	desc->mm = vma->vm_mm;
+	desc->start = vma->vm_start;
+	desc->end = vma->vm_end;
+
+	desc->pgoff = vma->vm_pgoff;
+	desc->file = vma->vm_file;
+	desc->vm_flags = vma->vm_flags;
+	desc->page_prot = vma->vm_page_prot;
+
+	desc->vm_ops = NULL;
+	desc->private_data = NULL;
+
+	return desc;
+}
+
+void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc)
+{
+	/*
+	 * Since we're invoking .mmap_prepare() despite having a partially
+	 * established VMA, we must take care to handle setting fields
+	 * correctly.
+	 */
+
+	/* Mutable fields. Populated with initial state. */
+	vma->vm_pgoff = desc->pgoff;
+	if (vma->vm_file != desc->file)
+		vma_set_file(vma, desc->file);
+	if (vma->vm_flags != desc->vm_flags)
+		vm_flags_set(vma, desc->vm_flags);
+	vma->vm_page_prot = desc->page_prot;
+
+	/* User-defined fields. */
+	vma->vm_ops = desc->vm_ops;
+	vma->vm_private_data = desc->private_data;
+}
diff --git a/mm/vma.h b/mm/vma.h
index 0db066e7a45d..afd6cc026658 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -570,4 +570,8 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
 int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift);
 #endif
 
+struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
+		struct vm_area_desc *desc);
+void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc);
+
 #endif	/* __MM_VMA_H */
diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
index 77b2949d874a..675a55216607 100644
--- a/tools/testing/vma/vma_internal.h
+++ b/tools/testing/vma/vma_internal.h
@@ -159,6 +159,14 @@ typedef __bitwise unsigned int vm_fault_t;
 
 #define ASSERT_EXCLUSIVE_WRITER(x)
 
+/**
+ * swap - swap values of @a and @b
+ * @a: first value
+ * @b: second value
+ */
+#define swap(a, b) \
+	do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
+
 struct kref {
 	refcount_t refcount;
 };
@@ -1479,4 +1487,12 @@ static inline vm_flags_t ksm_vma_flags(const struct mm_struct *, const struct fi
 	return vm_flags;
 }
 
+static inline void vma_set_file(struct vm_area_struct *vma, struct file *file)
+{
+	/* Changing an anonymous vma with this is illegal */
+	get_file(file);
+	swap(vma->vm_file, file);
+	fput(file);
+}
+
 #endif	/* __MM_VMA_INTERNAL_H */
-- 
2.49.0
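
The flow the patch describes (snapshot the VMA into a descriptor, run the underlying hook on the descriptor, copy the result back) can be sketched in userland C. This is only an illustrative model, not kernel code: every model_* and lower_* name and MODEL_VM_EXTRA is hypothetical, the structures keep just a few fields, and file reference counting and VMA locking are left out. The flag update is modelled as a bitwise OR on the assumption that vm_flags_set() only adds bits.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-ins for the kernel structures. */
struct model_file;
struct model_vm_ops { int unused; };

struct model_vma {
	unsigned long vm_start, vm_end;
	unsigned long vm_pgoff, vm_flags;
	struct model_file *vm_file;
	const struct model_vm_ops *vm_ops;
	void *vm_private_data;
};

struct model_desc {
	unsigned long start, end;		/* not copied back to the VMA */
	unsigned long pgoff, vm_flags;		/* the hook may change these */
	struct model_file *file;
	const struct model_vm_ops *vm_ops;	/* filled in by the hook */
	void *private_data;
};

struct model_fops {
	int (*mmap)(struct model_file *file, struct model_vma *vma);
	int (*mmap_prepare)(struct model_desc *desc);
};

struct model_file { const struct model_fops *f_op; };

/* Snapshot VMA state into a descriptor, as vma_to_desc() does in the patch. */
static struct model_desc *model_vma_to_desc(const struct model_vma *vma,
					    struct model_desc *desc)
{
	desc->start = vma->vm_start;
	desc->end = vma->vm_end;
	desc->pgoff = vma->vm_pgoff;
	desc->vm_flags = vma->vm_flags;
	desc->file = vma->vm_file;
	desc->vm_ops = NULL;
	desc->private_data = NULL;
	return desc;
}

/* Copy the hook's changes back, mirroring set_vma_from_desc(). */
static void model_set_vma_from_desc(struct model_vma *vma,
				    const struct model_desc *desc)
{
	vma->vm_pgoff = desc->pgoff;
	vma->vm_file = desc->file;	 /* the patch uses vma_set_file() here */
	vma->vm_flags |= desc->vm_flags; /* assumption: set-only, like vm_flags_set() */
	vma->vm_ops = desc->vm_ops;
	vma->vm_private_data = desc->private_data;
}

static int model_compat_mmap_prepare(struct model_file *file,
				     struct model_vma *vma)
{
	struct model_desc desc;
	int err;

	err = file->f_op->mmap_prepare(model_vma_to_desc(vma, &desc));
	if (err)
		return err;
	model_set_vma_from_desc(vma, &desc);
	return 0;
}

/* Dispatch as the patched call_mmap() does: prefer .mmap_prepare() if set. */
static int model_call_mmap(struct model_file *file, struct model_vma *vma)
{
	if (file->f_op->mmap_prepare)
		return model_compat_mmap_prepare(file, vma);
	return file->f_op->mmap(file, vma);
}

/* Hypothetical underlying file system that only implements .mmap_prepare(). */
#define MODEL_VM_EXTRA 0x100UL
static const struct model_vm_ops lower_vm_ops = { 0 };

static int lower_mmap_prepare(struct model_desc *desc)
{
	desc->vm_flags |= MODEL_VM_EXTRA;
	desc->vm_ops = &lower_vm_ops;
	return 0;
}

static const struct model_fops lower_fops = {
	.mmap = NULL,
	.mmap_prepare = lower_mmap_prepare,
};
```

A wrapper file system's .mmap() handler would correspond to a caller of model_call_mmap() on the lower file; the VMA's start and end are deliberately never written back, matching the patch.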



* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
@ 2025-06-09 10:18 ` Vlastimil Babka
  2025-06-09 10:27   ` Lorenzo Stoakes
  2025-06-09 11:35 ` Pedro Falcato
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Vlastimil Babka @ 2025-06-09 10:18 UTC
  To: Lorenzo Stoakes, Andrew Morton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Jann Horn, Pedro Falcato, linux-fsdevel, linux-kernel, linux-mm

On 6/9/25 11:24 AM, Lorenzo Stoakes wrote:
> Nested file systems, that is those which invoke call_mmap() within their
> own f_op->mmap() handlers, may encounter underlying file systems which
> provide the f_op->mmap_prepare() hook introduced by commit
> c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> 
> We have a chicken-and-egg scenario here - until all file systems are
> converted to using .mmap_prepare(), we cannot convert these nested
> handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
> 
> So we have to do it the other way round - invoke the .mmap_prepare() hook
> from an .mmap() one.
> 
> In order to do so, we need to convert VMA state into a struct vm_area_desc
> descriptor, invoking the underlying file system's f_op->mmap_prepare()
> callback passing a pointer to this, and then setting VMA state accordingly
> and safely.
> 
> This patch achieves this via the compat_vma_mmap_prepare() function, which
> we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> passed in file pointer.
> 
> We place the fundamental logic into mm/vma.c where VMA manipulation
> belongs. We also update the VMA userland tests to accommodate the changes.
> 
> The compat_vma_mmap_prepare() function and its associated machinery is
> temporary, and will be removed once the conversion of file systems is
> complete.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").

So this is a hotfix for 6.16-rc1 but doesn't need cc: stable.
Also, probably nothing yet wraps the filesystems with .mmap_prepare? But
good to have this handled within 6.16.

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>



* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09 10:18 ` Vlastimil Babka
@ 2025-06-09 10:27   ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 10:27 UTC
  To: Vlastimil Babka
  Cc: Andrew Morton, Alexander Viro, Christian Brauner, Jan Kara,
	Liam R . Howlett, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm

Andrew - to be clear, this should be a hotfix against 6.16-rc1 :>) Thanks!

On Mon, Jun 09, 2025 at 12:18:40PM +0200, Vlastimil Babka wrote:
> On 6/9/25 11:24 AM, Lorenzo Stoakes wrote:
> > Nested file systems, that is those which invoke call_mmap() within their
> > own f_op->mmap() handlers, may encounter underlying file systems which
> > provide the f_op->mmap_prepare() hook introduced by commit
> > c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> >
> > We have a chicken-and-egg scenario here - until all file systems are
> > converted to using .mmap_prepare(), we cannot convert these nested
> > handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
> >
> > So we have to do it the other way round - invoke the .mmap_prepare() hook
> > from an .mmap() one.
> >
> > In order to do so, we need to convert VMA state into a struct vm_area_desc
> > descriptor, invoking the underlying file system's f_op->mmap_prepare()
> > callback passing a pointer to this, and then setting VMA state accordingly
> > and safely.
> >
> > This patch achieves this via the compat_vma_mmap_prepare() function, which
> > we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> > passed in file pointer.
> >
> > We place the fundamental logic into mm/vma.c where VMA manipulation
> > belongs. We also update the VMA userland tests to accommodate the changes.
> >
> > The compat_vma_mmap_prepare() function and its associated machinery is
> > temporary, and will be removed once the conversion of file systems is
> > complete.
> >
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Reported-by: Jann Horn <jannh@google.com>
> > Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> > Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
>
> So this is a hotfix for 6.16-rc1 but doesn't need cc: stable.
> Also probably nothing wraps yet the filesystems with .mmap_prepare? But
> good to have this handled within 6.16.

Only secretmem uses this, so nothing is affected yet, BUT I want to make
changes to filesystems in 6.17 which are blocked by this problem (I actually
have a bunch of changes queued up ready), so it's really important to get this
hotfixed.

Also, given the change in use of callback is going to touch a ton of filesystems
this is important for backporting purposes, in case anything needs
backporting there.

So from that point of view it's important to have it in 6.16 also.

>
> Reviewed-by: Vlastimil Babka <vbabka@suse.cz>
>

Thanks!


* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
  2025-06-09 10:18 ` Vlastimil Babka
@ 2025-06-09 11:35 ` Pedro Falcato
  2025-06-09 11:45   ` Lorenzo Stoakes
  2025-06-09 15:40 ` kernel test robot
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 11+ messages in thread
From: Pedro Falcato @ 2025-06-09 11:35 UTC
  To: Lorenzo Stoakes
  Cc: Andrew Morton, Alexander Viro, Christian Brauner, Jan Kara,
	Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-fsdevel,
	linux-kernel, linux-mm

On Mon, Jun 09, 2025 at 10:24:13AM +0100, Lorenzo Stoakes wrote:
> Nested file systems, that is those which invoke call_mmap() within their
> own f_op->mmap() handlers, may encounter underlying file systems which
> provide the f_op->mmap_prepare() hook introduced by commit
> c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> 
> We have a chicken-and-egg scenario here - until all file systems are
> converted to using .mmap_prepare(), we cannot convert these nested
> handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
> 
> So we have to do it the other way round - invoke the .mmap_prepare() hook
> from an .mmap() one.
> 
> In order to do so, we need to convert VMA state into a struct vm_area_desc
> descriptor, invoking the underlying file system's f_op->mmap_prepare()
> callback passing a pointer to this, and then setting VMA state accordingly
> and safely.
> 
> This patch achieves this via the compat_vma_mmap_prepare() function, which
> we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> passed in file pointer.
> 
> We place the fundamental logic into mm/vma.c where VMA manipulation
> belongs. We also update the VMA userland tests to accommodate the changes.
> 
> The compat_vma_mmap_prepare() function and its associated machinery is
> temporary, and will be removed once the conversion of file systems is
> complete.
>

Thanks, this is annoying but looks mostly cromulent!

> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> ---
>  include/linux/fs.h               |  6 +++--
>  mm/mmap.c                        | 39 +++++++++++++++++++++++++++
>  mm/vma.c                         | 46 +++++++++++++++++++++++++++++++-
>  mm/vma.h                         |  4 +++
>  tools/testing/vma/vma_internal.h | 16 +++++++++++
>  5 files changed, 108 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 05abdabe9db7..8fe41a2b7527 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2274,10 +2274,12 @@ static inline bool file_has_valid_mmap_hooks(struct file *file)
>  	return true;
>  }
>  
> +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
> +
>  static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> -	if (WARN_ON_ONCE(file->f_op->mmap_prepare))
> -		return -EINVAL;
> +	if (file->f_op->mmap_prepare)
> +		return compat_vma_mmap_prepare(file, vma);
>  
>  	return file->f_op->mmap(file, vma);
>  }
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 09c563c95112..0755cb5d89d1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1891,3 +1891,42 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>  	vm_unacct_memory(charge);
>  	goto loop_out;
>  }
> +
> +/**
> + * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
> + * existing VMA
> + * @file: The file which possesses an f_op->mmap_prepare() hook
> + * @vma; The VMA to apply the .mmap_prepare() hook to.
> + *
> + * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
> + * 'wrapper' file systems invoke a nested mmap hook of an underlying file.
> + *
> + * Until all filesystems are converted to use .mmap_prepare(), we must be
> + * conservative and continue to invoke these 'wrapper' filesystems using the
> + * deprecated .mmap() hook.
> + *
> + * However we have a problem if the underlying file system possesses an
> + * .mmap_prepare() hook, as we are in a different context when we invoke the
> + * .mmap() hook, already having a VMA to deal with.
> + *
> + * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
> + * establishes a struct vm_area_desc descriptor, passes to the underlying
> + * .mmap_prepare() hook and applies any changes performed by it.
> + *
> + * Once the conversion of filesystems is complete this function will no longer
> + * be required and will be removed.
> + *
> + * Returns: 0 on success or error.
> + */
> +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
> +{
> +	struct vm_area_desc desc;
> +	int err;
> +
> +	err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc));
> +	if (err)
> +		return err;
> +	set_vma_from_desc(vma, &desc);
> +
> +	return 0;
> +}
> diff --git a/mm/vma.c b/mm/vma.c
> index 01b1d26d87b4..d771750f8f76 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -3153,7 +3153,6 @@ int __vm_munmap(unsigned long start, size_t len, bool unlock)
>  	return ret;
>  }
>  
> -
>  /* Insert vm structure into process list sorted by address
>   * and into the inode's i_mmap tree.  If vm_file is non-NULL
>   * then i_mmap_rwsem is taken here.
> @@ -3195,3 +3194,48 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
>  
>  	return 0;
>  }
> +
> +/*
> + * Temporary helper functions for file systems which wrap an invocation of
> + * f_op->mmap() but which might have an underlying file system which implements
> + * f_op->mmap_prepare().
> + */
> +
> +struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> +		struct vm_area_desc *desc)
> +{
> +	desc->mm = vma->vm_mm;
> +	desc->start = vma->vm_start;
> +	desc->end = vma->vm_end;
> +
> +	desc->pgoff = vma->vm_pgoff;
> +	desc->file = vma->vm_file;
> +	desc->vm_flags = vma->vm_flags;
> +	desc->page_prot = vma->vm_page_prot;
> +
> +	desc->vm_ops = NULL;
> +	desc->private_data = NULL;
> +
> +	return desc;
> +}
> +
> +void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc)
> +{
> +	/*
> +	 * Since we're invoking .mmap_prepare() despite having a partially
> +	 * established VMA, we must take care to handle setting fields
> +	 * correctly.
> +	 */
> +
> +	/* Mutable fields. Populated with initial state. */
> +	vma->vm_pgoff = desc->pgoff;
> +	if (vma->vm_file != desc->file)
> +		vma_set_file(vma, desc->file);
> +	if (vma->vm_flags != desc->vm_flags)
> +		vm_flags_set(vma, desc->vm_flags);

I think we don't need vm_flags_set in this case, since the VMA isn't exposed yet.
__vm_flags_mod should work just fine. Of course this isn't a big deal, but I would
like it if we reduced vm_flags_set to core mm and conceptually attached things.
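
A rough userland sketch of the distinction drawn above, assuming that vm_flags_set() takes the VMA write lock before OR-ing bits in, whereas __vm_flags_mod() sets and clears bits without any locking. The model_* names and the write_locked counter are hypothetical stand-ins; the real helpers operate on struct vm_area_struct.

```c
#include <assert.h>

typedef unsigned long vm_flags_t;

struct model_vma {
	vm_flags_t flags;
	int write_locked;	/* counts modelled VMA write-lock acquisitions */
};

/* Modelled on vm_flags_set(): take the write lock, then OR the bits in. */
static void model_vm_flags_set(struct model_vma *vma, vm_flags_t flags)
{
	vma->write_locked++;
	vma->flags |= flags;
}

/* Modelled on __vm_flags_mod(): no locking, set and clear in one step. */
static void model_vm_flags_mod_nolock(struct model_vma *vma,
				      vm_flags_t set, vm_flags_t clear)
{
	vma->flags = (vma->flags | set) & ~clear;
}
```

For a VMA that nothing else can reach yet, the lock-free variant yields the same flag bits without the locking step; the set-only helper cannot clear a bit.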

In any case, with or without that addressed:

Reviewed-by: Pedro Falcato <pfalcato@suse.de>

-- 
Pedro


* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09 11:35 ` Pedro Falcato
@ 2025-06-09 11:45   ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 11:45 UTC
  To: Pedro Falcato
  Cc: Andrew Morton, Alexander Viro, Christian Brauner, Jan Kara,
	Liam R . Howlett, Vlastimil Babka, Jann Horn, linux-fsdevel,
	linux-kernel, linux-mm

On Mon, Jun 09, 2025 at 12:35:40PM +0100, Pedro Falcato wrote:
> On Mon, Jun 09, 2025 at 10:24:13AM +0100, Lorenzo Stoakes wrote:
> > Nested file systems, that is those which invoke call_mmap() within their
> > own f_op->mmap() handlers, may encounter underlying file systems which
> > provide the f_op->mmap_prepare() hook introduced by commit
> > c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> >
> > We have a chicken-and-egg scenario here - until all file systems are
> > converted to using .mmap_prepare(), we cannot convert these nested
> > handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
> >
> > So we have to do it the other way round - invoke the .mmap_prepare() hook
> > from an .mmap() one.
> >
> > In order to do so, we need to convert VMA state into a struct vm_area_desc
> > descriptor, invoking the underlying file system's f_op->mmap_prepare()
> > callback passing a pointer to this, and then setting VMA state accordingly
> > and safely.
> >
> > This patch achieves this via the compat_vma_mmap_prepare() function, which
> > we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> > passed in file pointer.
> >
> > We place the fundamental logic into mm/vma.c where VMA manipulation
> > belongs. We also update the VMA userland tests to accommodate the changes.
> >
> > The compat_vma_mmap_prepare() function and its associated machinery is
> > temporary, and will be removed once the conversion of file systems is
> > complete.
> >
>
> Thanks, this is annoying but looks mostly cromulent!

You do love that word :P I have to look it up every time... Maybe time to
up my vocabulary?? ;)

>
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> > Reported-by: Jann Horn <jannh@google.com>
> > Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> > Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
> > ---
> >  include/linux/fs.h               |  6 +++--
> >  mm/mmap.c                        | 39 +++++++++++++++++++++++++++
> >  mm/vma.c                         | 46 +++++++++++++++++++++++++++++++-
> >  mm/vma.h                         |  4 +++
> >  tools/testing/vma/vma_internal.h | 16 +++++++++++
> >  5 files changed, 108 insertions(+), 3 deletions(-)
> >
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index 05abdabe9db7..8fe41a2b7527 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -2274,10 +2274,12 @@ static inline bool file_has_valid_mmap_hooks(struct file *file)
> >  	return true;
> >  }
> >
> > +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
> > +
> >  static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
> >  {
> > -	if (WARN_ON_ONCE(file->f_op->mmap_prepare))
> > -		return -EINVAL;
> > +	if (file->f_op->mmap_prepare)
> > +		return compat_vma_mmap_prepare(file, vma);
> >
> >  	return file->f_op->mmap(file, vma);
> >  }
> > diff --git a/mm/mmap.c b/mm/mmap.c
> > index 09c563c95112..0755cb5d89d1 100644
> > --- a/mm/mmap.c
> > +++ b/mm/mmap.c
> > @@ -1891,3 +1891,42 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
> >  	vm_unacct_memory(charge);
> >  	goto loop_out;
> >  }
> > +
> > +/**
> > + * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
> > + * existing VMA
> > + * @file: The file which possesses an f_op->mmap_prepare() hook
> > + * @vma; The VMA to apply the .mmap_prepare() hook to.
> > + *
> > + * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
> > + * 'wrapper' file systems invoke a nested mmap hook of an underlying file.
> > + *
> > + * Until all filesystems are converted to use .mmap_prepare(), we must be
> > + * conservative and continue to invoke these 'wrapper' filesystems using the
> > + * deprecated .mmap() hook.
> > + *
> > + * However we have a problem if the underlying file system possesses an
> > + * .mmap_prepare() hook, as we are in a different context when we invoke the
> > + * .mmap() hook, already having a VMA to deal with.
> > + *
> > + * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
> > + * establishes a struct vm_area_desc descriptor, passes to the underlying
> > + * .mmap_prepare() hook and applies any changes performed by it.
> > + *
> > + * Once the conversion of filesystems is complete this function will no longer
> > + * be required and will be removed.
> > + *
> > + * Returns: 0 on success or error.
> > + */
> > +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
> > +{
> > +	struct vm_area_desc desc;
> > +	int err;
> > +
> > +	err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc));
> > +	if (err)
> > +		return err;
> > +	set_vma_from_desc(vma, &desc);
> > +
> > +	return 0;
> > +}
> > diff --git a/mm/vma.c b/mm/vma.c
> > index 01b1d26d87b4..d771750f8f76 100644
> > --- a/mm/vma.c
> > +++ b/mm/vma.c
> > @@ -3153,7 +3153,6 @@ int __vm_munmap(unsigned long start, size_t len, bool unlock)
> >  	return ret;
> >  }
> >
> > -
> >  /* Insert vm structure into process list sorted by address
> >   * and into the inode's i_mmap tree.  If vm_file is non-NULL
> >   * then i_mmap_rwsem is taken here.
> > @@ -3195,3 +3194,48 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
> >
> >  	return 0;
> >  }
> > +
> > +/*
> > + * Temporary helper functions for file systems which wrap an invocation of
> > + * f_op->mmap() but which might have an underlying file system which implements
> > + * f_op->mmap_prepare().
> > + */
> > +
> > +struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> > +		struct vm_area_desc *desc)
> > +{
> > +	desc->mm = vma->vm_mm;
> > +	desc->start = vma->vm_start;
> > +	desc->end = vma->vm_end;
> > +
> > +	desc->pgoff = vma->vm_pgoff;
> > +	desc->file = vma->vm_file;
> > +	desc->vm_flags = vma->vm_flags;
> > +	desc->page_prot = vma->vm_page_prot;
> > +
> > +	desc->vm_ops = NULL;
> > +	desc->private_data = NULL;
> > +
> > +	return desc;
> > +}
> > +
> > +void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc)
> > +{
> > +	/*
> > +	 * Since we're invoking .mmap_prepare() despite having a partially
> > +	 * established VMA, we must take care to handle setting fields
> > +	 * correctly.
> > +	 */
> > +
> > +	/* Mutable fields. Populated with initial state. */
> > +	vma->vm_pgoff = desc->pgoff;
> > +	if (vma->vm_file != desc->file)
> > +		vma_set_file(vma, desc->file);
> > +	if (vma->vm_flags != desc->vm_flags)
> > +		vm_flags_set(vma, desc->vm_flags);
>
> I think we don't need vm_flags_set in this case, since the VMA isn't exposed yet.
> __vm_flags_mod should work just fine. Of course this isn't a big deal, but I would
> like it if we reduced vm_flags_set to core mm and conceptually attached things.

Yeah I considered doing something like this, but I really want to keep this as
close as possible to how .mmap() handlers ordinarily do this, even though it's
a bit unnecessary here.

This will (genuinely!) be temporary either way, and we can address the
unnecessary use of vm_flags_set() as a whole elsewhere.

>
> In any case, with or without that addressed:
>
> Reviewed-by: Pedro Falcato <pfalcato@suse.de>

Thanks!

>
> --
> Pedro


* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
  2025-06-09 10:18 ` Vlastimil Babka
  2025-06-09 11:35 ` Pedro Falcato
@ 2025-06-09 15:40 ` kernel test robot
  2025-06-09 15:45   ` Lorenzo Stoakes
  2025-06-09 15:44 ` Lorenzo Stoakes
  2025-06-09 16:13 ` kernel test robot
  4 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-06-09 15:40 UTC
  To: Lorenzo Stoakes, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, Alexander Viro,
	Christian Brauner, Jan Kara, Liam R . Howlett, Vlastimil Babka,
	Jann Horn, Pedro Falcato, linux-fsdevel, linux-kernel

Hi Lorenzo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems/20250609-172628
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250609092413.45435-1-lorenzo.stoakes%40oracle.com
patch subject: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
config: arc-randconfig-002-20250609 (https://download.01.org/0day-ci/archive/20250609/202506092301.jUMzAZW1-lkp@intel.com/config)
compiler: arc-linux-gcc (GCC) 10.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250609/202506092301.jUMzAZW1-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506092301.jUMzAZW1-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> Warning: mm/mmap.c:1921 function parameter 'vma' not described in 'compat_vma_mmap_prepare'
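
For context, kernel-doc recognises a parameter description only when it is written as '@name:' with a colon, so a line such as '@vma;' leaves the parameter undocumented and triggers a warning of this kind. A minimal sketch of the expected comment shape follows; example_apply() and its parameters are hypothetical and not part of the patch.

```c
#include <assert.h>

/**
 * example_apply() - Add an offset to a base value.
 * @base: The starting value.
 * @offset: The amount to add; note the ':' directly after the name.
 *
 * Return: The sum of @base and @offset.
 */
static int example_apply(int base, int offset)
{
	return base + offset;
}
```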

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
                   ` (2 preceding siblings ...)
  2025-06-09 15:40 ` kernel test robot
@ 2025-06-09 15:44 ` Lorenzo Stoakes
  2025-06-09 16:52   ` Lorenzo Stoakes
  2025-06-09 16:13 ` kernel test robot
  4 siblings, 1 reply; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 15:44 UTC
  To: Andrew Morton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm

Andrew - I typo'd a ';' when there should be a ':' below, could you fix
that up or would you want a fix-patch for that?

I highlight where the issue is below.

Thanks!

On Mon, Jun 09, 2025 at 10:24:13AM +0100, Lorenzo Stoakes wrote:
> Nested file systems, that is those which invoke call_mmap() within their
> own f_op->mmap() handlers, may encounter underlying file systems which
> provide the f_op->mmap_prepare() hook introduced by commit
> c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback").
>
> We have a chicken-and-egg scenario here - until all file systems are
> converted to using .mmap_prepare(), we cannot convert these nested
> handlers, as we can't call f_op->mmap from an .mmap_prepare() hook.
>
> So we have to do it the other way round - invoke the .mmap_prepare() hook
> from an .mmap() one.
>
> In order to do so, we need to convert VMA state into a struct vm_area_desc
> descriptor, invoke the underlying file system's f_op->mmap_prepare()
> callback with a pointer to it, and then set VMA state accordingly and
> safely.
>
> This patch achieves this via the compat_vma_mmap_prepare() function, which
> we invoke from call_mmap() if f_op->mmap_prepare() is specified in the
> passed in file pointer.
>
> We place the fundamental logic into mm/vma.c where VMA manipulation
> belongs. We also update the VMA userland tests to accommodate the changes.
>
> The compat_vma_mmap_prepare() function and its associated machinery are
> temporary, and will be removed once the conversion of file systems is
> complete.
>
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> Reported-by: Jann Horn <jannh@google.com>
> Closes: https://lore.kernel.org/linux-mm/CAG48ez04yOEVx1ekzOChARDDBZzAKwet8PEoPM4Ln3_rk91AzQ@mail.gmail.com/
> Fixes: c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file callback")
> ---
>  include/linux/fs.h               |  6 +++--
>  mm/mmap.c                        | 39 +++++++++++++++++++++++++++
>  mm/vma.c                         | 46 +++++++++++++++++++++++++++++++-
>  mm/vma.h                         |  4 +++
>  tools/testing/vma/vma_internal.h | 16 +++++++++++
>  5 files changed, 108 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 05abdabe9db7..8fe41a2b7527 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -2274,10 +2274,12 @@ static inline bool file_has_valid_mmap_hooks(struct file *file)
>  	return true;
>  }
>
> +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma);
> +
>  static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>  {
> -	if (WARN_ON_ONCE(file->f_op->mmap_prepare))
> -		return -EINVAL;
> +	if (file->f_op->mmap_prepare)
> +		return compat_vma_mmap_prepare(file, vma);
>
>  	return file->f_op->mmap(file, vma);
>  }
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 09c563c95112..0755cb5d89d1 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -1891,3 +1891,42 @@ __latent_entropy int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm)
>  	vm_unacct_memory(charge);
>  	goto loop_out;
>  }
> +
> +/**
> + * compat_vma_mmap_prepare() - Apply the file's .mmap_prepare() hook to an
> + * existing VMA
> + * @file: The file which possesses an f_op->mmap_prepare() hook
> + * @vma; The VMA to apply the .mmap_prepare() hook to.
          ^
          |---- should be a :

:)

> + *
> + * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
> + * 'wrapper' file systems invoke a nested mmap hook of an underlying file.
> + *
> + * Until all filesystems are converted to use .mmap_prepare(), we must be
> + * conservative and continue to invoke these 'wrapper' filesystems using the
> + * deprecated .mmap() hook.
> + *
> + * However we have a problem if the underlying file system possesses an
> + * .mmap_prepare() hook, as we are in a different context when we invoke the
> + * .mmap() hook, already having a VMA to deal with.
> + *
> + * compat_vma_mmap_prepare() is a compatibility function that takes VMA state,
> + * establishes a struct vm_area_desc descriptor, passes it to the underlying
> + * .mmap_prepare() hook and applies any changes performed by it.
> + *
> + * Once the conversion of filesystems is complete this function will no longer
> + * be required and will be removed.
> + *
> + * Returns: 0 on success, or the error returned by .mmap_prepare().
> + */
> +int compat_vma_mmap_prepare(struct file *file, struct vm_area_struct *vma)
> +{
> +	struct vm_area_desc desc;
> +	int err;
> +
> +	err = file->f_op->mmap_prepare(vma_to_desc(vma, &desc));
> +	if (err)
> +		return err;
> +	set_vma_from_desc(vma, &desc);
> +
> +	return 0;
> +}
> diff --git a/mm/vma.c b/mm/vma.c
> index 01b1d26d87b4..d771750f8f76 100644
> --- a/mm/vma.c
> +++ b/mm/vma.c
> @@ -3153,7 +3153,6 @@ int __vm_munmap(unsigned long start, size_t len, bool unlock)
>  	return ret;
>  }
>
> -
>  /* Insert vm structure into process list sorted by address
>   * and into the inode's i_mmap tree.  If vm_file is non-NULL
>   * then i_mmap_rwsem is taken here.
> @@ -3195,3 +3194,48 @@ int insert_vm_struct(struct mm_struct *mm, struct vm_area_struct *vma)
>
>  	return 0;
>  }
> +
> +/*
> + * Temporary helper functions for file systems which wrap an invocation of
> + * f_op->mmap() but which might have an underlying file system which implements
> + * f_op->mmap_prepare().
> + */
> +
> +struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> +		struct vm_area_desc *desc)
> +{
> +	desc->mm = vma->vm_mm;
> +	desc->start = vma->vm_start;
> +	desc->end = vma->vm_end;
> +
> +	desc->pgoff = vma->vm_pgoff;
> +	desc->file = vma->vm_file;
> +	desc->vm_flags = vma->vm_flags;
> +	desc->page_prot = vma->vm_page_prot;
> +
> +	desc->vm_ops = NULL;
> +	desc->private_data = NULL;
> +
> +	return desc;
> +}
> +
> +void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc)
> +{
> +	/*
> +	 * Since we're invoking .mmap_prepare() despite having a partially
> +	 * established VMA, we must take care to handle setting fields
> +	 * correctly.
> +	 */
> +
> +	/* Mutable fields. Populated with initial state. */
> +	vma->vm_pgoff = desc->pgoff;
> +	if (vma->vm_file != desc->file)
> +		vma_set_file(vma, desc->file);
> +	if (vma->vm_flags != desc->vm_flags)
> +		vm_flags_set(vma, desc->vm_flags);
> +	vma->vm_page_prot = desc->page_prot;
> +
> +	/* User-defined fields. */
> +	vma->vm_ops = desc->vm_ops;
> +	vma->vm_private_data = desc->private_data;
> +}
> diff --git a/mm/vma.h b/mm/vma.h
> index 0db066e7a45d..afd6cc026658 100644
> --- a/mm/vma.h
> +++ b/mm/vma.h
> @@ -570,4 +570,8 @@ int create_init_stack_vma(struct mm_struct *mm, struct vm_area_struct **vmap,
>  int relocate_vma_down(struct vm_area_struct *vma, unsigned long shift);
>  #endif
>
> +struct vm_area_desc *vma_to_desc(struct vm_area_struct *vma,
> +		struct vm_area_desc *desc);
> +void set_vma_from_desc(struct vm_area_struct *vma, struct vm_area_desc *desc);
> +
>  #endif	/* __MM_VMA_H */
> diff --git a/tools/testing/vma/vma_internal.h b/tools/testing/vma/vma_internal.h
> index 77b2949d874a..675a55216607 100644
> --- a/tools/testing/vma/vma_internal.h
> +++ b/tools/testing/vma/vma_internal.h
> @@ -159,6 +159,14 @@ typedef __bitwise unsigned int vm_fault_t;
>
>  #define ASSERT_EXCLUSIVE_WRITER(x)
>
> +/**
> + * swap - swap values of @a and @b
> + * @a: first value
> + * @b: second value
> + */
> +#define swap(a, b) \
> +	do { typeof(a) __tmp = (a); (a) = (b); (b) = __tmp; } while (0)
> +
>  struct kref {
>  	refcount_t refcount;
>  };
> @@ -1479,4 +1487,12 @@ static inline vm_flags_t ksm_vma_flags(const struct mm_struct *, const struct fi
>  	return vm_flags;
>  }
>
> +static inline void vma_set_file(struct vm_area_struct *vma, struct file *file)
> +{
> +	/* Changing an anonymous vma with this is illegal */
> +	get_file(file);
> +	swap(vma->vm_file, file);
> +	fput(file);
> +}
> +
>  #endif	/* __MM_VMA_INTERNAL_H */
> --
> 2.49.0
>
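
The flow described in the commit message (build a descriptor from the VMA, run the hook, copy the result back) can be sketched in plain userspace C. This is only an illustration under stated assumptions: every `demo_*` type and function below is a hypothetical stand-in rather than kernel API, and fields such as the mm pointer, page protection and private data are left out.

```c
#include <assert.h>
#include <stddef.h>

struct demo_desc;

/* Hypothetical stand-in for struct file. */
struct demo_file {
	int refcount;
	int (*mmap_prepare)(struct demo_desc *desc);
};

/* Hypothetical stand-in for struct vm_area_struct. */
struct demo_vma {
	unsigned long start, end, pgoff, flags;
	struct demo_file *file;
	const void *ops;
};

/* Hypothetical stand-in for struct vm_area_desc. */
struct demo_desc {
	unsigned long start, end;	/* read-only for the hook */
	unsigned long pgoff, flags;	/* seeded from the VMA, may be changed */
	struct demo_file *file;		/* seeded from the VMA, may be changed */
	const void *ops;		/* starts NULL, set by the hook */
};

static struct demo_desc *demo_vma_to_desc(const struct demo_vma *vma,
					  struct demo_desc *desc)
{
	desc->start = vma->start;
	desc->end = vma->end;
	desc->pgoff = vma->pgoff;
	desc->flags = vma->flags;
	desc->file = vma->file;
	desc->ops = NULL;
	return desc;
}

/* Take a reference on the new file before dropping the old one. */
static void demo_vma_set_file(struct demo_vma *vma, struct demo_file *file)
{
	struct demo_file *old = vma->file;

	assert(file != NULL && old != NULL);
	file->refcount++;
	vma->file = file;
	old->refcount--;
}

static void demo_set_vma_from_desc(struct demo_vma *vma,
				   const struct demo_desc *desc)
{
	vma->pgoff = desc->pgoff;
	if (vma->file != desc->file)
		demo_vma_set_file(vma, desc->file);
	vma->flags |= desc->flags;	/* this sketch only ever adds bits */
	vma->ops = desc->ops;
}

/* Returns 0 on success, or whatever error the hook returned. */
static int demo_compat_mmap_prepare(struct demo_file *file,
				    struct demo_vma *vma)
{
	struct demo_desc desc;
	int err = file->mmap_prepare(demo_vma_to_desc(vma, &desc));

	if (err)
		return err;	/* the VMA is left untouched on failure */
	demo_set_vma_from_desc(vma, &desc);
	return 0;
}

/* Example hooks: one succeeds and adds a flag, one fails. */
static const int demo_ops_marker = 0;

static int demo_hook_ok(struct demo_desc *desc)
{
	desc->flags |= 0x4UL;
	desc->ops = &demo_ops_marker;
	return 0;
}

static int demo_hook_fail(struct demo_desc *desc)
{
	desc->flags |= 0x8UL;	/* discarded, since the hook fails */
	return -1;
}
```

In this sketch the hook only ever sees the descriptor, so a failing hook cannot leave the VMA half-modified: its changes to the descriptor are simply dropped.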

* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09 15:40 ` kernel test robot
@ 2025-06-09 15:45   ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 15:45 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrew Morton, oe-kbuild-all, Linux Memory Management List,
	Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel

On Mon, Jun 09, 2025 at 11:40:25PM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build warnings:
>
> [auto build test WARNING on akpm-mm/mm-everything]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems/20250609-172628
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/20250609092413.45435-1-lorenzo.stoakes%40oracle.com
> patch subject: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
> config: arc-randconfig-002-20250609 (https://download.01.org/0day-ci/archive/20250609/202506092301.jUMzAZW1-lkp@intel.com/config)
> compiler: arc-linux-gcc (GCC) 10.5.0
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250609/202506092301.jUMzAZW1-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506092301.jUMzAZW1-lkp@intel.com/
>
> All warnings (new ones prefixed by >>):
>
> >> Warning: mm/mmap.c:1921 function parameter 'vma' not described in 'compat_vma_mmap_prepare'
>

Thanks, this is due to a silly typo: I put ';' where I meant ':' :)

I asked Andrew to fix it in https://lore.kernel.org/linux-mm/dddd402f-1705-41a2-8806-543d0bfff5bc@lucifer.local/
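
The warning comes from kernel-doc, which only treats a line as a parameter description when the parameter name is followed by a colon. A minimal sketch of a well-formed comment, using a hypothetical function `demo_add()` that is not part of the patch:

```c
/**
 * demo_add() - Add two integers (hypothetical example function).
 * @a: The first operand.
 * @b: The second operand. Writing "@b;" here instead would leave 'b'
 *     undescribed, as happened with the "@vma;" line in the patch.
 *
 * Return: The sum of @a and @b.
 */
int demo_add(int a, int b)
{
	return a + b;
}
```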

> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki


Cheers, Lorenzo

* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
                   ` (3 preceding siblings ...)
  2025-06-09 15:44 ` Lorenzo Stoakes
@ 2025-06-09 16:13 ` kernel test robot
  2025-06-09 16:22   ` Lorenzo Stoakes
  4 siblings, 1 reply; 11+ messages in thread
From: kernel test robot @ 2025-06-09 16:13 UTC (permalink / raw)
  To: Lorenzo Stoakes, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List, Alexander Viro,
	Christian Brauner, Jan Kara, Liam R . Howlett, Vlastimil Babka,
	Jann Horn, Pedro Falcato, linux-fsdevel, linux-kernel

Hi Lorenzo,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]

url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems/20250609-172628
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20250609092413.45435-1-lorenzo.stoakes%40oracle.com
patch subject: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
config: arm-randconfig-002-20250609 (https://download.01.org/0day-ci/archive/20250610/202506100000.34KZcoZ5-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250610/202506100000.34KZcoZ5-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202506100000.34KZcoZ5-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: compat_vma_mmap_prepare
   >>> referenced by shm.c
   >>>               ipc/shm.o:(shm_mmap) in archive vmlinux.a
   >>> referenced by backing-file.c
   >>>               fs/backing-file.o:(backing_file_mmap) in archive vmlinux.a
   >>> referenced by nommu.c
   >>>               mm/nommu.o:(do_mmap) in archive vmlinux.a
   >>> referenced 2 more times

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09 16:13 ` kernel test robot
@ 2025-06-09 16:22   ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 16:22 UTC (permalink / raw)
  To: kernel test robot
  Cc: Andrew Morton, llvm, oe-kbuild-all, Linux Memory Management List,
	Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel

On Tue, Jun 10, 2025 at 12:13:08AM +0800, kernel test robot wrote:
> Hi Lorenzo,
>
> kernel test robot noticed the following build errors:
>
> [auto build test ERROR on akpm-mm/mm-everything]
>
> url:    https://github.com/intel-lab-lkp/linux/commits/Lorenzo-Stoakes/mm-add-mmap_prepare-compatibility-layer-for-nested-file-systems/20250609-172628
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
> patch link:    https://lore.kernel.org/r/20250609092413.45435-1-lorenzo.stoakes%40oracle.com
> patch subject: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
> config: arm-randconfig-002-20250609 (https://download.01.org/0day-ci/archive/20250610/202506100000.34KZcoZ5-lkp@intel.com/config)
> compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
> reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20250610/202506100000.34KZcoZ5-lkp@intel.com/reproduce)
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <lkp@intel.com>
> | Closes: https://lore.kernel.org/oe-kbuild-all/202506100000.34KZcoZ5-lkp@intel.com/
>
> All errors (new ones prefixed by >>):
>
> >> ld.lld: error: undefined symbol: compat_vma_mmap_prepare
>    >>> referenced by shm.c
>    >>>               ipc/shm.o:(shm_mmap) in archive vmlinux.a
>    >>> referenced by backing-file.c
>    >>>               fs/backing-file.o:(backing_file_mmap) in archive vmlinux.a
>    >>> referenced by nommu.c
>    >>>               mm/nommu.o:(do_mmap) in archive vmlinux.a
>    >>> referenced 2 more times

Yeah of course yet again it's nommu doing what it always does - causing
maintenance pain for little to no apparent benefit.

I'll send a fix-patch.

>
> --
> 0-DAY CI Kernel Test Service
> https://github.com/intel/lkp-tests/wiki

* Re: [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems
  2025-06-09 15:44 ` Lorenzo Stoakes
@ 2025-06-09 16:52   ` Lorenzo Stoakes
  0 siblings, 0 replies; 11+ messages in thread
From: Lorenzo Stoakes @ 2025-06-09 16:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Alexander Viro, Christian Brauner, Jan Kara, Liam R . Howlett,
	Vlastimil Babka, Jann Horn, Pedro Falcato, linux-fsdevel,
	linux-kernel, linux-mm

On Mon, Jun 09, 2025 at 04:44:15PM +0100, Lorenzo Stoakes wrote:
> Andrew - I typo'd a ';' where there should be a ':' below. Could you fix
> that up, or would you prefer a fix-patch?
> 
> I highlight where the issue is below.
> 
> Thanks!

On second thoughts, given the silly nommu issue, will do a quick respin.

Sorry for the noise.


Thread overview: 11+ messages
2025-06-09  9:24 [PATCH] mm: add mmap_prepare() compatibility layer for nested file systems Lorenzo Stoakes
2025-06-09 10:18 ` Vlastimil Babka
2025-06-09 10:27   ` Lorenzo Stoakes
2025-06-09 11:35 ` Pedro Falcato
2025-06-09 11:45   ` Lorenzo Stoakes
2025-06-09 15:40 ` kernel test robot
2025-06-09 15:45   ` Lorenzo Stoakes
2025-06-09 15:44 ` Lorenzo Stoakes
2025-06-09 16:52   ` Lorenzo Stoakes
2025-06-09 16:13 ` kernel test robot
2025-06-09 16:22   ` Lorenzo Stoakes