* FAILED: patch "[PATCH] mm: avoid deadlock when holding rmap on mmap_prepare error" failed to apply to 7.0-stable tree
@ 2026-04-27 16:43 gregkh
2026-04-29 5:36 ` [PATCH 7.0.y 1/2] mm: various small mmap_prepare cleanups Sasha Levin
From: gregkh @ 2026-04-27 16:43 UTC (permalink / raw)
To: ljs, akpm, alexander.shishkin, alexandre.torgue, arnd,
bostroesser, brauner, clemens, corbet, david, decui, dhowells,
gregkh, haiyangz, jack, jannh, kys, liam.howlett, longli,
marc.dionne, martin.petersen, mcoquelin.stm32, mhocko,
miquel.raynal, pfalcato, richard, rppt, ryan.roberts, stable,
surenb, vbabka, vigneshr, viro, wei.liu
Cc: stable
The patch below does not apply to the 7.0-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable@vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-7.0.y
git checkout FETCH_HEAD
git cherry-pick -x f96e1d5f15b7c854a6a9ec1225d68a12fe7dcda6
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable@vger.kernel.org>' --in-reply-to '2026042747-cardinal-pellet-094f@gregkh' --subject-prefix 'PATCH 7.0.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f96e1d5f15b7c854a6a9ec1225d68a12fe7dcda6 Mon Sep 17 00:00:00 2001
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Date: Fri, 20 Mar 2026 22:39:30 +0000
Subject: [PATCH] mm: avoid deadlock when holding rmap on mmap_prepare error
Commit ac0a3fc9c07d ("mm: add ability to take further action in
vm_area_desc") added the ability for drivers to instruct mm to take actions
after the .mmap_prepare callback is complete.
To make life simpler and safer, this is done before the VMA/mmap write lock
is dropped but when the VMA is completely established.
So on error, we simply munmap() the VMA.
As part of this implementation, unfortunately a horrible hack had to be
implemented to support some questionable behaviour hugetlb relies upon -
namely, that the file rmap lock is held until the operation is complete.
The implementation, for convenience, did this in mmap_action_finish() so
both the VMA and mmap_prepare compatibility layer paths would have this
correctly handled.
However, it turns out there is a mistake here - the rmap lock cannot be
held on munmap, as free_pgtables() -> unlink_file_vma_batch_add() ->
unlink_file_vma_batch_process() takes the file rmap lock.
This therefore introduces a potential deadlock.
Resolve this by leaving it to callers to handle the unmap.
The compatibility layer does not support this rmap behaviour, so we simply
have it unmap on error after calling mmap_action_complete().
In the VMA implementation, we only perform the unmap after the rmap lock is
dropped.
This resolves the issue by ensuring the rmap lock is always dropped when
the unmap occurs.
Link: https://lkml.kernel.org/r/d44248be9da68258b07c2c59d4e73485ee0ca943.1774045440.git.ljs@kernel.org
Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
diff --git a/mm/util.c b/mm/util.c
index 73c97a748d8e..a2cfa0d77c35 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1215,7 +1215,13 @@ int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
return err;
set_vma_from_desc(vma, &desc);
- return mmap_action_complete(vma, &desc.action);
+ err = mmap_action_complete(vma, &desc.action);
+ if (err) {
+ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+ do_munmap(current->mm, vma->vm_start, len, NULL);
+ }
+ return err;
}
EXPORT_SYMBOL(compat_vma_mmap);
@@ -1316,10 +1322,6 @@ static int mmap_action_finish(struct vm_area_struct *vma,
* invoked if we do NOT merge, so we only clean up the VMA we created.
*/
if (err) {
- const size_t len = vma_pages(vma) << PAGE_SHIFT;
-
- do_munmap(current->mm, vma->vm_start, len, NULL);
-
if (action->error_hook) {
/* We may want to filter the error. */
err = action->error_hook(err);
diff --git a/mm/vma.c b/mm/vma.c
index 1e2996a12d7f..4095834dce09 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2735,9 +2735,9 @@ static int call_action_complete(struct mmap_state *map,
struct mmap_action *action,
struct vm_area_struct *vma)
{
- int ret;
+ int err;
- ret = mmap_action_complete(vma, action);
+ err = mmap_action_complete(vma, action);
/* If we held the file rmap we need to release it. */
if (map->hold_file_rmap_lock) {
@@ -2745,7 +2745,14 @@ static int call_action_complete(struct mmap_state *map,
i_mmap_unlock_write(file->f_mapping);
}
- return ret;
+
+ if (err) {
+ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+ do_munmap(current->mm, vma->vm_start, len, NULL);
+ }
+
+ return err;
}
static unsigned long __mmap_region(struct file *file, unsigned long addr,
* [PATCH 7.0.y 1/2] mm: various small mmap_prepare cleanups
2026-04-27 16:43 FAILED: patch "[PATCH] mm: avoid deadlock when holding rmap on mmap_prepare error" failed to apply to 7.0-stable tree gregkh
@ 2026-04-29 5:36 ` Sasha Levin
2026-04-29 5:36 ` [PATCH 7.0.y 2/2] mm: avoid deadlock when holding rmap on mmap_prepare error Sasha Levin
From: Sasha Levin @ 2026-04-29 5:36 UTC (permalink / raw)
To: stable
Cc: Lorenzo Stoakes (Oracle), Vlastimil Babka (SUSE),
Alexander Shishkin, Alexandre Torgue, Al Viro, Arnd Bergmann,
Bodo Stroesser, Christian Brauner, Clemens Ladisch,
David Hildenbrand, David Howells, Dexuan Cui, Greg Kroah-Hartman,
Haiyang Zhang, Jan Kara, Jann Horn, Jonathan Corbet,
K. Y. Srinivasan, Liam Howlett, Long Li, Marc Dionne,
Martin K. Petersen, Maxime Coquelin, Michal Hocko, Mike Rapoport,
Miquel Raynal, Pedro Falcato, Richard Weinberger, Ryan Roberts,
Suren Baghdasaryan, Vignesh Raghavendra, Wei Liu, Andrew Morton,
Sasha Levin
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
[ Upstream commit 3e4bb2706817710d9461394da8b75be79981586b ]
Patch series "mm: expand mmap_prepare functionality and usage", v4.
This series expands the mmap_prepare functionality, which is intended to
replace the deprecated f_op->mmap hook which has been the source of bugs
and security issues for some time.
This series starts with some cleanup of existing mmap_prepare logic, then
adds documentation for the mmap_prepare call to make it easier for
filesystem and driver writers to understand how it works.
It then importantly adds a vm_ops->mapped hook, a key feature that was
missing from mmap_prepare previously - this is invoked when a driver which
specifies mmap_prepare has successfully been mapped but not merged with
another VMA.
mmap_prepare is invoked prior to a merge being attempted, so you cannot
manipulate state such as reference counts as if it were a new mapping.
The vm_ops->mapped hook allows a driver to perform tasks required at this
stage, and provides symmetry against subsequent vm_ops->open,close calls.
The series uses this to correct the afs implementation which wrongly
manipulated reference count at mmap_prepare time.
It then adds an mmap_prepare equivalent of vm_iomap_memory() -
mmap_action_simple_ioremap(), then uses this to update a number of drivers.
It then splits out the mmap_prepare compatibility layer (which allows for
invocation of mmap_prepare hooks in an mmap() hook) in such a way as to
allow for more incremental implementation of mmap_prepare hooks.
It then uses this to extend mmap_prepare usage in drivers.
Finally it adds an mmap_prepare equivalent of vm_map_pages(), which lays
the foundation for future work which will extend mmap_prepare to DMA
coherent mappings.
This patch (of 21):
Rather than passing arbitrary fields, pass a vm_area_desc pointer to mmap
prepare functions, and an action and vma pointer to mmap complete, in order
to put all the action-specific logic in the function actually doing the
work.
Additionally, allow mmap prepare functions to return an error so we can
error out as soon as possible if there is something logically incorrect in
the input.
Update remap_pfn_range_prepare() to properly check the input range for the
CoW case.
Also remove io_remap_pfn_range_complete(), as we can simply set up the
fields correctly in io_remap_pfn_range_prepare() and use
remap_pfn_range_complete() for this.
While we're here, make remap_pfn_range_prepare_vma() a little neater, and
pass mmap_action directly to call_action_complete().
Then, update compat_vma_mmap() to perform its logic directly, as
__compat_vma_map() is not used by anything so we don't need to export it.
Also update compat_vma_mmap() to use vfs_mmap_prepare() rather than
calling the mmap_prepare op directly.
Finally, update the VMA userland tests to reflect the changes.
Link: https://lkml.kernel.org/r/cover.1774045440.git.ljs@kernel.org
Link: https://lkml.kernel.org/r/99f408e4694f44ab12bdc55fe0bd9685d3bd1117.1774045440.git.ljs@kernel.org
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Stable-dep-of: f96e1d5f15b7 ("mm: avoid deadlock when holding rmap on mmap_prepare error")
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
include/linux/fs.h | 2 -
include/linux/mm.h | 7 +-
mm/internal.h | 32 ++++----
mm/memory.c | 45 +++++++----
mm/util.c | 121 +++++++++++++-----------------
mm/vma.c | 24 +++---
tools/testing/vma/include/dup.h | 7 +-
tools/testing/vma/include/stubs.h | 8 +-
8 files changed, 126 insertions(+), 120 deletions(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8b3dd145b25ec..a2628a12bd2bc 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2058,8 +2058,6 @@ static inline bool can_mmap_file(struct file *file)
return true;
}
-int __compat_vma_mmap(const struct file_operations *f_op,
- struct file *file, struct vm_area_struct *vma);
int compat_vma_mmap(struct file *file, struct vm_area_struct *vma);
static inline int vfs_mmap(struct file *file, struct vm_area_struct *vma)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index abb4963c1f064..0ceba2c86d9c1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4078,10 +4078,9 @@ static inline void mmap_action_ioremap_full(struct vm_area_desc *desc,
mmap_action_ioremap(desc, desc->start, start_pfn, vma_desc_size(desc));
}
-void mmap_action_prepare(struct mmap_action *action,
- struct vm_area_desc *desc);
-int mmap_action_complete(struct mmap_action *action,
- struct vm_area_struct *vma);
+int mmap_action_prepare(struct vm_area_desc *desc);
+int mmap_action_complete(struct vm_area_struct *vma,
+ struct mmap_action *action);
/* Look up the first VMA which exactly match the interval vm_start ... vm_end */
static inline struct vm_area_struct *find_exact_vma(struct mm_struct *mm,
diff --git a/mm/internal.h b/mm/internal.h
index 546114d3ee448..7d7890ab06b80 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1747,26 +1747,28 @@ int walk_page_range_debug(struct mm_struct *mm, unsigned long start,
void dup_mm_exe_file(struct mm_struct *mm, struct mm_struct *oldmm);
int dup_mmap(struct mm_struct *mm, struct mm_struct *oldmm);
-void remap_pfn_range_prepare(struct vm_area_desc *desc, unsigned long pfn);
-int remap_pfn_range_complete(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn, unsigned long size, pgprot_t pgprot);
+int remap_pfn_range_prepare(struct vm_area_desc *desc);
+int remap_pfn_range_complete(struct vm_area_struct *vma,
+ struct mmap_action *action);
-static inline void io_remap_pfn_range_prepare(struct vm_area_desc *desc,
- unsigned long orig_pfn, unsigned long size)
+static inline int io_remap_pfn_range_prepare(struct vm_area_desc *desc)
{
+ struct mmap_action *action = &desc->action;
+ const unsigned long orig_pfn = action->remap.start_pfn;
+ const pgprot_t orig_pgprot = action->remap.pgprot;
+ const unsigned long size = action->remap.size;
const unsigned long pfn = io_remap_pfn_range_pfn(orig_pfn, size);
+ int err;
- return remap_pfn_range_prepare(desc, pfn);
-}
+ action->remap.start_pfn = pfn;
+ action->remap.pgprot = pgprot_decrypted(orig_pgprot);
+ err = remap_pfn_range_prepare(desc);
+ if (err)
+ return err;
-static inline int io_remap_pfn_range_complete(struct vm_area_struct *vma,
- unsigned long addr, unsigned long orig_pfn, unsigned long size,
- pgprot_t orig_prot)
-{
- const unsigned long pfn = io_remap_pfn_range_pfn(orig_pfn, size);
- const pgprot_t prot = pgprot_decrypted(orig_prot);
-
- return remap_pfn_range_complete(vma, addr, pfn, size, prot);
+ /* Remap does the actual work. */
+ action->type = MMAP_REMAP_PFN;
+ return 0;
}
#endif /* __MM_INTERNAL_H */
diff --git a/mm/memory.c b/mm/memory.c
index c65e82c86fed7..e03522c2bea63 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3105,26 +3105,34 @@ static int do_remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
}
#endif
-void remap_pfn_range_prepare(struct vm_area_desc *desc, unsigned long pfn)
+int remap_pfn_range_prepare(struct vm_area_desc *desc)
{
- /*
- * We set addr=VMA start, end=VMA end here, so this won't fail, but we
- * check it again on complete and will fail there if specified addr is
- * invalid.
- */
- get_remap_pgoff(vma_desc_is_cow_mapping(desc), desc->start, desc->end,
- desc->start, desc->end, pfn, &desc->pgoff);
+ const struct mmap_action *action = &desc->action;
+ const unsigned long start = action->remap.start;
+ const unsigned long end = start + action->remap.size;
+ const unsigned long pfn = action->remap.start_pfn;
+ const bool is_cow = vma_desc_is_cow_mapping(desc);
+ int err;
+
+ err = get_remap_pgoff(is_cow, start, end, desc->start, desc->end, pfn,
+ &desc->pgoff);
+ if (err)
+ return err;
+
vma_desc_set_flags_mask(desc, VMA_REMAP_FLAGS);
+ return 0;
}
-static int remap_pfn_range_prepare_vma(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn, unsigned long size)
+static int remap_pfn_range_prepare_vma(struct vm_area_struct *vma,
+ unsigned long addr, unsigned long pfn,
+ unsigned long size)
{
- unsigned long end = addr + PAGE_ALIGN(size);
+ const unsigned long end = addr + PAGE_ALIGN(size);
+ const bool is_cow = is_cow_mapping(vma->vm_flags);
int err;
- err = get_remap_pgoff(is_cow_mapping(vma->vm_flags), addr, end,
- vma->vm_start, vma->vm_end, pfn, &vma->vm_pgoff);
+ err = get_remap_pgoff(is_cow, addr, end, vma->vm_start, vma->vm_end,
+ pfn, &vma->vm_pgoff);
if (err)
return err;
@@ -3157,10 +3165,15 @@ int remap_pfn_range(struct vm_area_struct *vma, unsigned long addr,
}
EXPORT_SYMBOL(remap_pfn_range);
-int remap_pfn_range_complete(struct vm_area_struct *vma, unsigned long addr,
- unsigned long pfn, unsigned long size, pgprot_t prot)
+int remap_pfn_range_complete(struct vm_area_struct *vma,
+ struct mmap_action *action)
{
- return do_remap_pfn_range(vma, addr, pfn, size, prot);
+ const unsigned long start = action->remap.start;
+ const unsigned long pfn = action->remap.start_pfn;
+ const unsigned long size = action->remap.size;
+ const pgprot_t prot = action->remap.pgprot;
+
+ return do_remap_pfn_range(vma, start, pfn, size, prot);
}
/**
diff --git a/mm/util.c b/mm/util.c
index b05ab6f97e110..62ddf9eabb1f6 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1134,43 +1134,6 @@ void flush_dcache_folio(struct folio *folio)
EXPORT_SYMBOL(flush_dcache_folio);
#endif
-/**
- * __compat_vma_mmap() - See description for compat_vma_mmap()
- * for details. This is the same operation, only with a specific file operations
- * struct which may or may not be the same as vma->vm_file->f_op.
- * @f_op: The file operations whose .mmap_prepare() hook is specified.
- * @file: The file which backs or will back the mapping.
- * @vma: The VMA to apply the .mmap_prepare() hook to.
- * Returns: 0 on success or error.
- */
-int __compat_vma_mmap(const struct file_operations *f_op,
- struct file *file, struct vm_area_struct *vma)
-{
- struct vm_area_desc desc = {
- .mm = vma->vm_mm,
- .file = file,
- .start = vma->vm_start,
- .end = vma->vm_end,
-
- .pgoff = vma->vm_pgoff,
- .vm_file = vma->vm_file,
- .vma_flags = vma->flags,
- .page_prot = vma->vm_page_prot,
-
- .action.type = MMAP_NOTHING, /* Default */
- };
- int err;
-
- err = f_op->mmap_prepare(&desc);
- if (err)
- return err;
-
- mmap_action_prepare(&desc.action, &desc);
- set_vma_from_desc(vma, &desc);
- return mmap_action_complete(&desc.action, vma);
-}
-EXPORT_SYMBOL(__compat_vma_mmap);
-
/**
* compat_vma_mmap() - Apply the file's .mmap_prepare() hook to an
* existing VMA and execute any requested actions.
@@ -1199,7 +1162,31 @@ EXPORT_SYMBOL(__compat_vma_mmap);
*/
int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
{
- return __compat_vma_mmap(file->f_op, file, vma);
+ struct vm_area_desc desc = {
+ .mm = vma->vm_mm,
+ .file = file,
+ .start = vma->vm_start,
+ .end = vma->vm_end,
+
+ .pgoff = vma->vm_pgoff,
+ .vm_file = vma->vm_file,
+ .vma_flags = vma->flags,
+ .page_prot = vma->vm_page_prot,
+
+ .action.type = MMAP_NOTHING, /* Default */
+ };
+ int err;
+
+ err = vfs_mmap_prepare(file, &desc);
+ if (err)
+ return err;
+
+ err = mmap_action_prepare(&desc);
+ if (err)
+ return err;
+
+ set_vma_from_desc(vma, &desc);
+ return mmap_action_complete(vma, &desc.action);
}
EXPORT_SYMBOL(compat_vma_mmap);
@@ -1283,8 +1270,8 @@ void snapshot_page(struct page_snapshot *ps, const struct page *page)
}
}
-static int mmap_action_finish(struct mmap_action *action,
- const struct vm_area_struct *vma, int err)
+static int mmap_action_finish(struct vm_area_struct *vma,
+ struct mmap_action *action, int err)
{
/*
* If an error occurs, unmap the VMA altogether and return an error. We
@@ -1316,37 +1303,38 @@ static int mmap_action_finish(struct mmap_action *action,
/**
* mmap_action_prepare - Perform preparatory setup for an VMA descriptor
* action which need to be performed.
- * @desc: The VMA descriptor to prepare for @action.
- * @action: The action to perform.
+ * @desc: The VMA descriptor to prepare for its @desc->action.
+ *
+ * Returns: %0 on success, otherwise error.
*/
-void mmap_action_prepare(struct mmap_action *action,
- struct vm_area_desc *desc)
+int mmap_action_prepare(struct vm_area_desc *desc)
{
- switch (action->type) {
+ switch (desc->action.type) {
case MMAP_NOTHING:
- break;
+ return 0;
case MMAP_REMAP_PFN:
- remap_pfn_range_prepare(desc, action->remap.start_pfn);
- break;
+ return remap_pfn_range_prepare(desc);
case MMAP_IO_REMAP_PFN:
- io_remap_pfn_range_prepare(desc, action->remap.start_pfn,
- action->remap.size);
- break;
+ return io_remap_pfn_range_prepare(desc);
}
+
+ WARN_ON_ONCE(1);
+ return -EINVAL;
}
EXPORT_SYMBOL(mmap_action_prepare);
/**
* mmap_action_complete - Execute VMA descriptor action.
- * @action: The action to perform.
* @vma: The VMA to perform the action upon.
+ * @action: The action to perform.
*
* Similar to mmap_action_prepare().
*
* Return: 0 on success, or error, at which point the VMA will be unmapped.
*/
-int mmap_action_complete(struct mmap_action *action,
- struct vm_area_struct *vma)
+int mmap_action_complete(struct vm_area_struct *vma,
+ struct mmap_action *action)
+
{
int err = 0;
@@ -1354,25 +1342,22 @@ int mmap_action_complete(struct mmap_action *action,
case MMAP_NOTHING:
break;
case MMAP_REMAP_PFN:
- err = remap_pfn_range_complete(vma, action->remap.start,
- action->remap.start_pfn, action->remap.size,
- action->remap.pgprot);
+ err = remap_pfn_range_complete(vma, action);
break;
case MMAP_IO_REMAP_PFN:
- err = io_remap_pfn_range_complete(vma, action->remap.start,
- action->remap.start_pfn, action->remap.size,
- action->remap.pgprot);
+ /* Should have been delegated. */
+ WARN_ON_ONCE(1);
+ err = -EINVAL;
break;
}
- return mmap_action_finish(action, vma, err);
+ return mmap_action_finish(vma, action, err);
}
EXPORT_SYMBOL(mmap_action_complete);
#else
-void mmap_action_prepare(struct mmap_action *action,
- struct vm_area_desc *desc)
+int mmap_action_prepare(struct vm_area_desc *desc)
{
- switch (action->type) {
+ switch (desc->action.type) {
case MMAP_NOTHING:
break;
case MMAP_REMAP_PFN:
@@ -1380,11 +1365,13 @@ void mmap_action_prepare(struct mmap_action *action,
WARN_ON_ONCE(1); /* nommu cannot handle these. */
break;
}
+
+ return 0;
}
EXPORT_SYMBOL(mmap_action_prepare);
-int mmap_action_complete(struct mmap_action *action,
- struct vm_area_struct *vma)
+int mmap_action_complete(struct vm_area_struct *vma,
+ struct mmap_action *action)
{
int err = 0;
@@ -1399,7 +1386,7 @@ int mmap_action_complete(struct mmap_action *action,
break;
}
- return mmap_action_finish(action, vma, err);
+ return mmap_action_finish(vma, action, err);
}
EXPORT_SYMBOL(mmap_action_complete);
#endif
diff --git a/mm/vma.c b/mm/vma.c
index c8df5f561ad7d..cae154a43f555 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2613,15 +2613,18 @@ static void __mmap_complete(struct mmap_state *map, struct vm_area_struct *vma)
vma_set_page_prot(vma);
}
-static void call_action_prepare(struct mmap_state *map,
- struct vm_area_desc *desc)
+static int call_action_prepare(struct mmap_state *map,
+ struct vm_area_desc *desc)
{
- struct mmap_action *action = &desc->action;
+ int err;
- mmap_action_prepare(action, desc);
+ err = mmap_action_prepare(desc);
+ if (err)
+ return err;
- if (action->hide_from_rmap_until_complete)
+ if (desc->action.hide_from_rmap_until_complete)
map->hold_file_rmap_lock = true;
+ return 0;
}
/*
@@ -2645,7 +2648,9 @@ static int call_mmap_prepare(struct mmap_state *map,
if (err)
return err;
- call_action_prepare(map, desc);
+ err = call_action_prepare(map, desc);
+ if (err)
+ return err;
/* Update fields permitted to be changed. */
map->pgoff = desc->pgoff;
@@ -2700,13 +2705,12 @@ static bool can_set_ksm_flags_early(struct mmap_state *map)
}
static int call_action_complete(struct mmap_state *map,
- struct vm_area_desc *desc,
+ struct mmap_action *action,
struct vm_area_struct *vma)
{
- struct mmap_action *action = &desc->action;
int ret;
- ret = mmap_action_complete(action, vma);
+ ret = mmap_action_complete(vma, action);
/* If we held the file rmap we need to release it. */
if (map->hold_file_rmap_lock) {
@@ -2768,7 +2772,7 @@ static unsigned long __mmap_region(struct file *file, unsigned long addr,
__mmap_complete(&map, vma);
if (have_mmap_prepare && allocated_new) {
- error = call_action_complete(&map, &desc, vma);
+ error = call_action_complete(&map, &desc.action, vma);
if (error)
return error;
diff --git a/tools/testing/vma/include/dup.h b/tools/testing/vma/include/dup.h
index 3078ff1487d3f..6299c76c3b7db 100644
--- a/tools/testing/vma/include/dup.h
+++ b/tools/testing/vma/include/dup.h
@@ -1093,9 +1093,12 @@ static inline int __compat_vma_mmap(const struct file_operations *f_op,
if (err)
return err;
- mmap_action_prepare(&desc.action, &desc);
+ err = mmap_action_prepare(&desc);
+ if (err)
+ return err;
+
set_vma_from_desc(vma, &desc);
- return mmap_action_complete(&desc.action, vma);
+ return mmap_action_complete(vma, &desc.action);
}
static inline int compat_vma_mmap(struct file *file,
diff --git a/tools/testing/vma/include/stubs.h b/tools/testing/vma/include/stubs.h
index 947a3a0c25665..fef111b6817e0 100644
--- a/tools/testing/vma/include/stubs.h
+++ b/tools/testing/vma/include/stubs.h
@@ -81,13 +81,13 @@ static inline void free_anon_vma_name(struct vm_area_struct *vma)
{
}
-static inline void mmap_action_prepare(struct mmap_action *action,
- struct vm_area_desc *desc)
+static inline int mmap_action_prepare(struct vm_area_desc *desc)
{
+ return 0;
}
-static inline int mmap_action_complete(struct mmap_action *action,
- struct vm_area_struct *vma)
+static inline int mmap_action_complete(struct vm_area_struct *vma,
+ struct mmap_action *action)
{
return 0;
}
--
2.53.0
* [PATCH 7.0.y 2/2] mm: avoid deadlock when holding rmap on mmap_prepare error
2026-04-29 5:36 ` [PATCH 7.0.y 1/2] mm: various small mmap_prepare cleanups Sasha Levin
@ 2026-04-29 5:36 ` Sasha Levin
From: Sasha Levin @ 2026-04-29 5:36 UTC (permalink / raw)
To: stable
Cc: Lorenzo Stoakes (Oracle), Vlastimil Babka (SUSE),
Alexander Shishkin, Alexandre Torgue, Al Viro, Arnd Bergmann,
Bodo Stroesser, Christian Brauner, Clemens Ladisch,
David Hildenbrand, David Howells, Dexuan Cui, Greg Kroah-Hartman,
Haiyang Zhang, Jan Kara, Jann Horn, Jonathan Corbet,
K. Y. Srinivasan, Liam Howlett, Long Li, Marc Dionne,
Martin K. Petersen, Maxime Coquelin, Michal Hocko, Mike Rapoport,
Miquel Raynal, Pedro Falcato, Richard Weinberger, Ryan Roberts,
Suren Baghdasaryan, Vignesh Raghavendra, Wei Liu, Andrew Morton,
Sasha Levin
From: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
[ Upstream commit f96e1d5f15b7c854a6a9ec1225d68a12fe7dcda6 ]
Commit ac0a3fc9c07d ("mm: add ability to take further action in
vm_area_desc") added the ability for drivers to instruct mm to take actions
after the .mmap_prepare callback is complete.
To make life simpler and safer, this is done before the VMA/mmap write lock
is dropped but when the VMA is completely established.
So on error, we simply munmap() the VMA.
As part of this implementation, unfortunately a horrible hack had to be
implemented to support some questionable behaviour hugetlb relies upon -
namely, that the file rmap lock is held until the operation is complete.
The implementation, for convenience, did this in mmap_action_finish() so
both the VMA and mmap_prepare compatibility layer paths would have this
correctly handled.
However, it turns out there is a mistake here - the rmap lock cannot be
held on munmap, as free_pgtables() -> unlink_file_vma_batch_add() ->
unlink_file_vma_batch_process() takes the file rmap lock.
This therefore introduces a potential deadlock.
Resolve this by leaving it to callers to handle the unmap.
The compatibility layer does not support this rmap behaviour, so we simply
have it unmap on error after calling mmap_action_complete().
In the VMA implementation, we only perform the unmap after the rmap lock is
dropped.
This resolves the issue by ensuring the rmap lock is always dropped when
the unmap occurs.
Link: https://lkml.kernel.org/r/d44248be9da68258b07c2c59d4e73485ee0ca943.1774045440.git.ljs@kernel.org
Fixes: ac0a3fc9c07d ("mm: add ability to take further action in vm_area_desc")
Signed-off-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Acked-by: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Bodo Stroesser <bostroesser@gmail.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Clemens Ladisch <clemens@ladisch.de>
Cc: David Hildenbrand <david@kernel.org>
Cc: David Howells <dhowells@redhat.com>
Cc: Dexuan Cui <decui@microsoft.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Jann Horn <jannh@google.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Long Li <longli@microsoft.com>
Cc: Marc Dionne <marc.dionne@auristor.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Miquel Raynal <miquel.raynal@bootlin.com>
Cc: Pedro Falcato <pfalcato@suse.de>
Cc: Richard Weinberger <richard@nod.at>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Vignesh Raghavendra <vigneshr@ti.com>
Cc: Vlastimil Babka (SUSE) <vbabka@kernel.org>
Cc: Wei Liu <wei.liu@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
mm/util.c | 12 +++++++-----
mm/vma.c | 13 ++++++++++---
2 files changed, 17 insertions(+), 8 deletions(-)
diff --git a/mm/util.c b/mm/util.c
index 62ddf9eabb1f6..e2a51e3cfb249 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1186,7 +1186,13 @@ int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
return err;
set_vma_from_desc(vma, &desc);
- return mmap_action_complete(vma, &desc.action);
+ err = mmap_action_complete(vma, &desc.action);
+ if (err) {
+ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+ do_munmap(current->mm, vma->vm_start, len, NULL);
+ }
+ return err;
}
EXPORT_SYMBOL(compat_vma_mmap);
@@ -1279,10 +1285,6 @@ static int mmap_action_finish(struct vm_area_struct *vma,
* invoked if we do NOT merge, so we only clean up the VMA we created.
*/
if (err) {
- const size_t len = vma_pages(vma) << PAGE_SHIFT;
-
- do_munmap(current->mm, vma->vm_start, len, NULL);
-
if (action->error_hook) {
/* We may want to filter the error. */
err = action->error_hook(err);
diff --git a/mm/vma.c b/mm/vma.c
index cae154a43f555..3f55bc42e7be5 100644
--- a/mm/vma.c
+++ b/mm/vma.c
@@ -2708,9 +2708,9 @@ static int call_action_complete(struct mmap_state *map,
struct mmap_action *action,
struct vm_area_struct *vma)
{
- int ret;
+ int err;
- ret = mmap_action_complete(vma, action);
+ err = mmap_action_complete(vma, action);
/* If we held the file rmap we need to release it. */
if (map->hold_file_rmap_lock) {
@@ -2718,7 +2718,14 @@ static int call_action_complete(struct mmap_state *map,
i_mmap_unlock_write(file->f_mapping);
}
- return ret;
+
+ if (err) {
+ const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+ do_munmap(current->mm, vma->vm_start, len, NULL);
+ }
+
+ return err;
}
static unsigned long __mmap_region(struct file *file, unsigned long addr,
--
2.53.0