public inbox for linux-fsdevel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Clemens Ladisch <clemens@ladisch.de>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	"K . Y . Srinivasan" <kys@microsoft.com>,
	Haiyang Zhang <haiyangz@microsoft.com>,
	Wei Liu <wei.liu@kernel.org>, Dexuan Cui <decui@microsoft.com>,
	Long Li <longli@microsoft.com>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Maxime Coquelin <mcoquelin.stm32@gmail.com>,
	Alexandre Torgue <alexandre.torgue@foss.st.com>,
	Miquel Raynal <miquel.raynal@bootlin.com>,
	Richard Weinberger <richard@nod.at>,
	Vignesh Raghavendra <vigneshr@ti.com>,
	Bodo Stroesser <bostroesser@gmail.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	David Howells <dhowells@redhat.com>,
	Marc Dionne <marc.dionne@auristor.com>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
	David Hildenbrand <david@kernel.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>, Jann Horn <jannh@google.com>,
	Pedro Falcato <pfalcato@suse.de>,
	linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
	linux-hyperv@vger.kernel.org,
	linux-stm32@st-md-mailman.stormreply.com,
	linux-arm-kernel@lists.infradead.org,
	linux-mtd@lists.infradead.org, linux-staging@lists.linux.dev,
	linux-scsi@vger.kernel.org, target-devel@vger.kernel.org,
	linux-afs@lists.infradead.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Ryan Roberts <ryan.roberts@arm.com>
Subject: [PATCH v4 08/21] mm: add vm_ops->mapped hook
Date: Fri, 20 Mar 2026 22:39:34 +0000	[thread overview]
Message-ID: <4c5e98297eb0aae9565c564e1c296a112702f144.1774045440.git.ljs@kernel.org> (raw)
In-Reply-To: <cover.1774045440.git.ljs@kernel.org>

Previously, when a driver needed to do something like establish a
reference count, it could do so in the mmap hook in the knowledge that the
mapping would succeed.

With the introduction of f_op->mmap_prepare this is no longer the case, as
it is invoked prior to actually establishing the mapping.

mmap_prepare is not appropriate for this kind of thing as it is called
before any merge might take place, and after which an error might occur
meaning resources could be leaked.

To take this into account, introduce a new vm_ops->mapped callback which
is invoked when the VMA is first mapped (though notably - not when it is
merged - which is correct and mirrors existing mmap/open/close behaviour).

We do better that vm_ops->open() here, as this callback can return an
error, at which point the VMA will be unmapped.

Note that vm_ops->mapped() is invoked after any mmap action is complete
(such as I/O remapping).

We intentionally do not expose the VMA at this point, exposing only the
fields that could be used, and an output parameter in case the operation
needs to update the vma->vm_private_data field.

In order to deal with stacked filesystems which invoke inner filesystem's
mmap() invocations, add __compat_vma_mapped() and invoke it on vfs_mmap()
(via compat_vma_mmap()) to ensure that the mapped callback is handled when
an mmap() caller invokes a nested filesystem's mmap_prepare() callback.

Update the mmap_prepare documentation to describe the mapped hook and make
it clear what its intended use is.

The vm_ops->mapped() call is handled by the mmap complete logic to ensure
the same code paths are handled by both the compatibility and VMA layers.

Additionally, update VMA userland test headers to reflect the change.

Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
 Documentation/filesystems/mmap_prepare.rst | 15 ++++
 include/linux/fs.h                         |  9 ++-
 include/linux/mm.h                         | 17 ++++
 mm/util.c                                  | 90 +++++++++++++++-------
 mm/vma.c                                   |  1 -
 tools/testing/vma/include/dup.h            | 17 ++++
 6 files changed, 120 insertions(+), 29 deletions(-)

diff --git a/Documentation/filesystems/mmap_prepare.rst b/Documentation/filesystems/mmap_prepare.rst
index ae484d371861..f14b35ee11d5 100644
--- a/Documentation/filesystems/mmap_prepare.rst
+++ b/Documentation/filesystems/mmap_prepare.rst
@@ -25,6 +25,21 @@ That is - no resources should be allocated nor state updated to reflect that a
 mapping has been established, as the mapping may either be merged, or fail to be
 mapped after the callback is complete.
 
+Mapped callback
+---------------
+
+If resources need to be allocated per-mapping, or state such as a reference
+count needs to be manipulated, this should be done using the ``vm_ops->mapped``
+hook, which itself should be set by the >mmap_prepare hook.
+
+This callback is only invoked if a new mapping has been established and was not
+merged with any other, and is invoked at a point where no error may occur before
+the mapping is established.
+
+You may return an error to the callback itself, which will cause the mapping to
+become unmapped and an error returned to the mmap() caller. This is useful if
+resources need to be allocated, and that allocation might fail.
+
 How To Use
 ==========
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index a2628a12bd2b..c390f5c667e3 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2059,13 +2059,20 @@ static inline bool can_mmap_file(struct file *file)
 }
 
 int compat_vma_mmap(struct file *file, struct vm_area_struct *vma);
+int __vma_check_mmap_hook(struct vm_area_struct *vma);
 
 static inline int vfs_mmap(struct file *file, struct vm_area_struct *vma)
 {
+	int err;
+
 	if (file->f_op->mmap_prepare)
 		return compat_vma_mmap(file, vma);
 
-	return file->f_op->mmap(file, vma);
+	err = file->f_op->mmap(file, vma);
+	if (err)
+		return err;
+
+	return __vma_check_mmap_hook(vma);
 }
 
 static inline int vfs_mmap_prepare(struct file *file, struct vm_area_desc *desc)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index da94edb287cd..ad1b8c3c0cfd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -777,6 +777,23 @@ struct vm_operations_struct {
 	 * Context: User context.  May sleep.  Caller holds mmap_lock.
 	 */
 	void (*close)(struct vm_area_struct *vma);
+	/**
+	 * @mapped: Called when the VMA is first mapped in the MM. Not called if
+	 * the new VMA is merged with an adjacent VMA.
+	 *
+	 * The @vm_private_data field is an output field allowing the user to
+	 * modify vma->vm_private_data as necessary.
+	 *
+	 * ONLY valid if set from f_op->mmap_prepare. Will result in an error if
+	 * set from f_op->mmap.
+	 *
+	 * Returns %0 on success, or an error otherwise. On error, the VMA will
+	 * be unmapped.
+	 *
+	 * Context: User context.  May sleep.  Caller holds mmap_lock.
+	 */
+	int (*mapped)(unsigned long start, unsigned long end, pgoff_t pgoff,
+		      const struct file *file, void **vm_private_data);
 	/* Called any time before splitting to check if it's allowed */
 	int (*may_split)(struct vm_area_struct *vma, unsigned long addr);
 	int (*mremap)(struct vm_area_struct *vma);
diff --git a/mm/util.c b/mm/util.c
index a187f3ab4437..df95ae41e09b 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1163,33 +1163,7 @@ void flush_dcache_folio(struct folio *folio)
 EXPORT_SYMBOL(flush_dcache_folio);
 #endif
 
-/**
- * compat_vma_mmap() - Apply the file's .mmap_prepare() hook to an
- * existing VMA and execute any requested actions.
- * @file: The file which possesss an f_op->mmap_prepare() hook.
- * @vma: The VMA to apply the .mmap_prepare() hook to.
- *
- * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
- * stacked filesystems invoke a nested mmap hook of an underlying file.
- *
- * Until all filesystems are converted to use .mmap_prepare(), we must be
- * conservative and continue to invoke these stacked filesystems using the
- * deprecated .mmap() hook.
- *
- * However we have a problem if the underlying file system possesses an
- * .mmap_prepare() hook, as we are in a different context when we invoke the
- * .mmap() hook, already having a VMA to deal with.
- *
- * compat_vma_mmap() is a compatibility function that takes VMA state,
- * establishes a struct vm_area_desc descriptor, passes to the underlying
- * .mmap_prepare() hook and applies any changes performed by it.
- *
- * Once the conversion of filesystems is complete this function will no longer
- * be required and will be removed.
- *
- * Returns: 0 on success or error.
- */
-int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
+static int __compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct vm_area_desc desc = {
 		.mm = vma->vm_mm,
@@ -1221,8 +1195,49 @@ int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
 	set_vma_from_desc(vma, &desc);
 	return mmap_action_complete(vma, action);
 }
+
+/**
+ * compat_vma_mmap() - Apply the file's .mmap_prepare() hook to an
+ * existing VMA and execute any requested actions.
+ * @file: The file which possesss an f_op->mmap_prepare() hook.
+ * @vma: The VMA to apply the .mmap_prepare() hook to.
+ *
+ * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
+ * stacked filesystems invoke a nested mmap hook of an underlying file.
+ *
+ * Until all filesystems are converted to use .mmap_prepare(), we must be
+ * conservative and continue to invoke these stacked filesystems using the
+ * deprecated .mmap() hook.
+ *
+ * However we have a problem if the underlying file system possesses an
+ * .mmap_prepare() hook, as we are in a different context when we invoke the
+ * .mmap() hook, already having a VMA to deal with.
+ *
+ * compat_vma_mmap() is a compatibility function that takes VMA state,
+ * establishes a struct vm_area_desc descriptor, passes to the underlying
+ * .mmap_prepare() hook and applies any changes performed by it.
+ *
+ * Once the conversion of filesystems is complete this function will no longer
+ * be required and will be removed.
+ *
+ * Returns: 0 on success or error.
+ */
+int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	return __compat_vma_mmap(file, vma);
+}
 EXPORT_SYMBOL(compat_vma_mmap);
 
+int __vma_check_mmap_hook(struct vm_area_struct *vma)
+{
+	/* vm_ops->mapped is not valid if mmap() is specified. */
+	if (vma->vm_ops && WARN_ON_ONCE(vma->vm_ops->mapped))
+		return -EINVAL;
+
+	return 0;
+}
+EXPORT_SYMBOL(__vma_check_mmap_hook);
+
 static void set_ps_flags(struct page_snapshot *ps, const struct folio *folio,
 			 const struct page *page)
 {
@@ -1311,11 +1326,32 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page)
 	}
 }
 
+static int call_vma_mapped(struct vm_area_struct *vma)
+{
+	const struct vm_operations_struct *vm_ops = vma->vm_ops;
+	void *vm_private_data = vma->vm_private_data;
+	int err;
+
+	if (!vm_ops || !vm_ops->mapped)
+		return 0;
+
+	err = vm_ops->mapped(vma->vm_start, vma->vm_end, vma->vm_pgoff,
+			     vma->vm_file, &vm_private_data);
+	if (err)
+		return err;
+
+	if (vm_private_data != vma->vm_private_data)
+		vma->vm_private_data = vm_private_data;
+	return 0;
+}
+
 static int mmap_action_finish(struct vm_area_struct *vma,
 			      struct mmap_action *action, int err)
 {
 	size_t len;
 
+	if (!err)
+		err = call_vma_mapped(vma);
 	if (!err && action->success_hook)
 		err = action->success_hook(vma);
 
diff --git a/mm/vma.c b/mm/vma.c
index 86f947c90442..2d631db272e6 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2782,7 +2782,6 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
 
 	if (have_mmap_prepare && allocated_new) {
 		error = mmap_action_complete(vma, &desc.action);
-
 		if (error)
 			return error;
 	}
diff --git a/tools/testing/vma/include/dup.h b/tools/testing/vma/include/dup.h
index bf8cffe1ea58..b4b12fc742c1 100644
--- a/tools/testing/vma/include/dup.h
+++ b/tools/testing/vma/include/dup.h
@@ -643,6 +643,23 @@ struct vm_operations_struct {
 	 * Context: User context.  May sleep.  Caller holds mmap_lock.
 	 */
 	void (*close)(struct vm_area_struct *vma);
+	/**
+	 * @mapped: Called when the VMA is first mapped in the MM. Not called if
+	 * the new VMA is merged with an adjacent VMA.
+	 *
+	 * The @vm_private_data field is an output field allowing the user to
+	 * modify vma->vm_private_data as necessary.
+	 *
+	 * ONLY valid if set from f_op->mmap_prepare. Will result in an error if
+	 * set from f_op->mmap.
+	 *
+	 * Returns %0 on success, or an error otherwise. On error, the VMA will
+	 * be unmapped.
+	 *
+	 * Context: User context.  May sleep.  Caller holds mmap_lock.
+	 */
+	int (*mapped)(unsigned long start, unsigned long end, pgoff_t pgoff,
+		      const struct file *file, void **vm_private_data);
 	/* Called any time before splitting to check if it's allowed */
 	int (*may_split)(struct vm_area_struct *vma, unsigned long addr);
 	int (*mremap)(struct vm_area_struct *vma);
-- 
2.53.0


  parent reply	other threads:[~2026-03-20 22:40 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-20 22:39 [PATCH v4 00/21] mm: expand mmap_prepare functionality and usage Lorenzo Stoakes (Oracle)
2026-03-20 22:39 ` [PATCH v4 01/21] mm: various small mmap_prepare cleanups Lorenzo Stoakes (Oracle)
2026-03-24 10:46   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 02/21] mm: add documentation for the mmap_prepare file operation callback Lorenzo Stoakes (Oracle)
2026-03-20 22:39 ` [PATCH v4 03/21] mm: document vm_operations_struct->open the same as close() Lorenzo Stoakes (Oracle)
2026-03-20 22:39 ` [PATCH v4 04/21] mm: avoid deadlock when holding rmap on mmap_prepare error Lorenzo Stoakes (Oracle)
2026-03-24 10:55   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 05/21] mm: switch the rmap lock held option off in compat layer Lorenzo Stoakes (Oracle)
2026-03-24 14:26   ` Vlastimil Babka (SUSE)
2026-03-24 16:35     ` Lorenzo Stoakes (Oracle)
2026-03-20 22:39 ` [PATCH v4 06/21] mm/vma: remove superfluous map->hold_file_rmap_lock Lorenzo Stoakes (Oracle)
2026-03-24 14:31   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 07/21] mm: have mmap_action_complete() handle the rmap lock and unmap Lorenzo Stoakes (Oracle)
2026-03-24 14:38   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` Lorenzo Stoakes (Oracle) [this message]
2026-03-24 15:32   ` [PATCH v4 08/21] mm: add vm_ops->mapped hook Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 09/21] fs: afs: revert mmap_prepare() change Lorenzo Stoakes (Oracle)
2026-03-25  9:06   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 10/21] fs: afs: restore mmap_prepare implementation Lorenzo Stoakes (Oracle)
2026-03-25  9:47   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 11/21] mm: add mmap_action_simple_ioremap() Lorenzo Stoakes (Oracle)
2026-03-25  9:58   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 12/21] misc: open-dice: replace deprecated mmap hook with mmap_prepare Lorenzo Stoakes (Oracle)
2026-03-25 10:04   ` Vlastimil Babka (SUSE)
2026-03-25 10:14   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 13/21] hpet: " Lorenzo Stoakes (Oracle)
2026-03-25 10:17   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 14/21] mtdchar: replace deprecated mmap hook with mmap_prepare, clean up Lorenzo Stoakes (Oracle)
2026-03-25 10:20   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 15/21] stm: replace deprecated mmap hook with mmap_prepare Lorenzo Stoakes (Oracle)
2026-03-25 10:24   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 16/21] staging: vme_user: " Lorenzo Stoakes (Oracle)
2026-03-25 10:34   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 17/21] mm: allow handling of stacked mmap_prepare hooks in more drivers Lorenzo Stoakes (Oracle)
2026-03-25 13:43   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 18/21] drivers: hv: vmbus: replace deprecated mmap hook with mmap_prepare Lorenzo Stoakes (Oracle)
2026-03-23  4:16   ` Michael Kelley
2026-03-23  9:13     ` Lorenzo Stoakes (Oracle)
2026-03-25 13:57   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 19/21] uio: replace deprecated mmap hook with mmap_prepare in uio_info Lorenzo Stoakes (Oracle)
2026-03-25 14:13   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 20/21] mm: add mmap_action_map_kernel_pages[_full]() Lorenzo Stoakes (Oracle)
2026-03-26 10:44   ` Vlastimil Babka (SUSE)
2026-03-20 22:39 ` [PATCH v4 21/21] mm: on remap assert that input range within the proposed VMA Lorenzo Stoakes (Oracle)
2026-03-26 10:46   ` Vlastimil Babka (SUSE)
2026-03-21  2:42 ` [PATCH v4 00/21] mm: expand mmap_prepare functionality and usage Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4c5e98297eb0aae9565c564e1c296a112702f144.1774045440.git.ljs@kernel.org \
    --to=ljs@kernel.org \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=alexandre.torgue@foss.st.com \
    --cc=arnd@arndb.de \
    --cc=bostroesser@gmail.com \
    --cc=brauner@kernel.org \
    --cc=clemens@ladisch.de \
    --cc=corbet@lwn.net \
    --cc=david@kernel.org \
    --cc=decui@microsoft.com \
    --cc=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=haiyangz@microsoft.com \
    --cc=jack@suse.cz \
    --cc=jannh@google.com \
    --cc=kys@microsoft.com \
    --cc=linux-afs@lists.infradead.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-mtd@lists.infradead.org \
    --cc=linux-scsi@vger.kernel.org \
    --cc=linux-staging@lists.linux.dev \
    --cc=linux-stm32@st-md-mailman.stormreply.com \
    --cc=longli@microsoft.com \
    --cc=marc.dionne@auristor.com \
    --cc=martin.petersen@oracle.com \
    --cc=mcoquelin.stm32@gmail.com \
    --cc=mhocko@suse.com \
    --cc=miquel.raynal@bootlin.com \
    --cc=pfalcato@suse.de \
    --cc=richard@nod.at \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=surenb@google.com \
    --cc=target-devel@vger.kernel.org \
    --cc=vbabka@kernel.org \
    --cc=vigneshr@ti.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=wei.liu@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox