linux-mm.kvack.org archive mirror
* [RFC PATCH 0/6] Remove XA_ZERO from error recovery of
@ 2025-08-15 19:10 Liam R. Howlett
  2025-08-15 19:10 ` [RFC PATCH 1/6] mm/mmap: Move exit_mmap() trace point Liam R. Howlett
                   ` (7 more replies)
  0 siblings, 8 replies; 35+ messages in thread
From: Liam R. Howlett @ 2025-08-15 19:10 UTC (permalink / raw)
  To: David Hildenbrand, Lorenzo Stoakes
  Cc: maple-tree, linux-mm, linux-kernel, Vlastimil Babka,
	Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Andrew Morton,
	Jann Horn, Pedro Falcato, Charan Teja Kalla, shikemeng, kasong,
	nphamcs, bhe, baohua, chrisl, Matthew Wilcox, Liam R. Howlett

Before you read on, please take a moment to acknowledge that David
Hildenbrand asked for this, so I'm blaming mostly him :)

It is possible for dup_mmap() to fail while allocating or setting up a
vma after the maple tree of the oldmm has been copied.  Today, the slot
of the vma that failed is overwritten with an XA_ZERO entry so that the
exact location of the failure does not need to be communicated through
to exit_mmap().
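
As a rough illustration of the marker approach described above, here is
a minimal user-space sketch.  It is not kernel code: a fixed array
stands in for the maple tree, and toy_vma, TOY_ZERO, toy_dup() and
toy_teardown_until_marker() are hypothetical names invented for this
example.

```c
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a vma; not the kernel structure. */
struct toy_vma { int id; };

/*
 * Sentinel playing the role of the XA_ZERO entry: a non-NULL value
 * that is not a usable vma pointer.
 */
static struct toy_vma toy_zero_marker;
#define TOY_ZERO (&toy_zero_marker)

/*
 * Copy the old tree slot for slot, then "set up" each entry.
 * fail_at simulates a set-up failure at that index (-1 for none).
 * On failure the failing slot is overwritten with the sentinel, so
 * the caller does not need to be told where the failure happened.
 */
static int toy_dup(struct toy_vma *new_tree[TOY_SLOTS],
		   struct toy_vma *old_tree[TOY_SLOTS], int fail_at)
{
	for (int i = 0; i < TOY_SLOTS; i++)
		new_tree[i] = old_tree[i];

	for (int i = 0; i < TOY_SLOTS; i++) {
		if (!new_tree[i])
			continue;
		if (i == fail_at) {
			new_tree[i] = TOY_ZERO;
			return -1;
		}
		/* Per-entry set-up would happen here. */
	}
	return 0;
}

/*
 * Teardown that relies on the marker: clean entries until the
 * sentinel is found.  Returns the number of entries cleaned.
 */
static int toy_teardown_until_marker(struct toy_vma *tree[TOY_SLOTS])
{
	int cleaned = 0;

	for (int i = 0; i < TOY_SLOTS; i++) {
		if (tree[i] == TOY_ZERO)
			break;
		if (tree[i]) {
			tree[i] = NULL;
			cleaned++;
		}
	}
	return cleaned;
}
```

Between toy_dup() failing and toy_teardown_until_marker() running, any
other reader of the array can come across TOY_ZERO and must know not to
treat it as a vma.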

However, a race exists in the tear down process because dup_mmap()
drops the mmap lock before exit_mmap() can remove the partially set up
vma tree.  This means that other tasks may get to the mm tree and find
an invalid vma pointer (the XA_ZERO entry), even though the mm is
marked as MMF_OOM_SKIP and MMF_UNSTABLE.

To remove the race fully, the tree must be cleaned up before dropping
the lock.  This is accomplished by extracting the vma cleanup in
exit_mmap() and changing the required functions to pass through the vma
search limit.
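
The ordering described above can be sketched in user space as follows.
This is only an illustration under stated assumptions, not the code in
this series: an int flag stands in for the mmap lock, a fixed array
stands in for the vma tree, and toy_mm, toy_cleanup_to_limit() and
toy_dup_recover() are hypothetical names.  The sketch also assumes that
the cleanup leaves the tree empty.

```c
#include <stddef.h>

#define TOY_SLOTS 8

/* Hypothetical stand-in for a vma; not the kernel structure. */
struct toy_vma { int id; };

struct toy_mm {
	int locked;			 /* stand-in for the mmap lock */
	struct toy_vma *tree[TOY_SLOTS]; /* stand-in for the vma tree */
};

/*
 * Clean the entries that were set up, i.e. those below limit, and
 * empty the whole tree.  Must be called with the "lock" held.
 * Returns the number of set-up entries cleaned, or -1 if unlocked.
 */
static int toy_cleanup_to_limit(struct toy_mm *mm, int limit)
{
	int cleaned = 0;

	if (!mm->locked)
		return -1;

	for (int i = 0; i < TOY_SLOTS; i++) {
		if (mm->tree[i] && i < limit)
			cleaned++;	/* only these were set up */
		mm->tree[i] = NULL;
	}
	return cleaned;
}

/*
 * Duplicate old_tree into mm.  fail_at simulates a set-up failure at
 * that index (-1 for none).  On failure the tree is cleaned up, with
 * the failure index passed as the limit, before the "lock" is dropped.
 */
static int toy_dup_recover(struct toy_mm *mm,
			   struct toy_vma *old_tree[TOY_SLOTS], int fail_at)
{
	int ret = 0;

	mm->locked = 1;
	for (int i = 0; i < TOY_SLOTS; i++)
		mm->tree[i] = old_tree[i];

	for (int i = 0; i < TOY_SLOTS; i++) {
		if (mm->tree[i] && i == fail_at) {
			toy_cleanup_to_limit(mm, i);
			ret = -1;
			break;
		}
		/* Per-entry set-up would happen here. */
	}
	mm->locked = 0;
	return ret;
}
```

In this sketch a reader that takes the lock after a failed
toy_dup_recover() finds an empty tree rather than a marker entry.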

This does run the risk of increasing the possibility of finding no vmas
(which is already possible!) in code that isn't careful.

The passing of so many limits and variables was such a mess when the
dup_mmap() was introduced that it was avoided in favour of the XA_ZERO
entry marker, but since the swap case was the second time we've hit
cases of walking an almost-dead mm, here's the alternative to checking
MMF_UNSTABLE before wandering into other mm structs.

[1].  https://lore.kernel.org/all/2e8df53b-d953-43fb-9c69-7d7d60e95c9a@redhat.com/

Liam R. Howlett (6):
  mm/mmap: Move exit_mmap() trace point
  mm/mmap: Abstract vma clean up from exit_mmap()
  mm/vma: Add limits to unmap_region() for vmas
  mm/memory: Add tree limit to free_pgtables()
  mm/vma: Add page table limit to unmap_region()
  mm: Change dup_mmap() recovery

 mm/internal.h |  4 ++-
 mm/memory.c   | 13 ++++-----
 mm/mmap.c     | 80 ++++++++++++++++++++++++++++++++++-----------------
 mm/vma.c      | 10 +++++--
 mm/vma.h      |  1 +
 5 files changed, 70 insertions(+), 38 deletions(-)

-- 
2.47.2




Thread overview: 35+ messages
2025-08-15 19:10 [RFC PATCH 0/6] Remove XA_ZERO from error recovery of Liam R. Howlett
2025-08-15 19:10 ` [RFC PATCH 1/6] mm/mmap: Move exit_mmap() trace point Liam R. Howlett
2025-08-19 18:27   ` Lorenzo Stoakes
2025-08-21 21:12   ` Chris Li
2025-08-15 19:10 ` [RFC PATCH 2/6] mm/mmap: Abstract vma clean up from exit_mmap() Liam R. Howlett
2025-08-19 18:38   ` Lorenzo Stoakes
2025-09-03 19:56     ` Liam R. Howlett
2025-09-04 15:21       ` Lorenzo Stoakes
2025-08-15 19:10 ` [RFC PATCH 3/6] mm/vma: Add limits to unmap_region() for vmas Liam R. Howlett
2025-08-19 18:48   ` Lorenzo Stoakes
2025-09-03 19:57     ` Liam R. Howlett
2025-09-04 15:23       ` Lorenzo Stoakes
2025-08-15 19:10 ` [RFC PATCH 4/6] mm/memory: Add tree limit to free_pgtables() Liam R. Howlett
2025-08-18 15:36   ` Lorenzo Stoakes
2025-08-18 15:54     ` Liam R. Howlett
2025-08-19 19:14   ` Lorenzo Stoakes
2025-09-03 20:19     ` Liam R. Howlett
2025-09-04 10:20       ` David Hildenbrand
2025-09-04 15:36         ` Lorenzo Stoakes
2025-09-09 17:19         ` Liam R. Howlett
2025-09-04 15:33       ` Lorenzo Stoakes
2025-08-15 19:10 ` [RFC PATCH 5/6] mm/vma: Add page table limit to unmap_region() Liam R. Howlett
2025-08-19 19:27   ` Lorenzo Stoakes
2025-08-15 19:10 ` [RFC PATCH 6/6] mm: Change dup_mmap() recovery Liam R. Howlett
2025-08-18 15:12   ` Lorenzo Stoakes
2025-08-18 15:29     ` Lorenzo Stoakes
2025-08-19 20:33   ` Lorenzo Stoakes
2025-09-04  0:13     ` Liam R. Howlett
2025-09-04 15:40       ` Lorenzo Stoakes
2025-08-15 19:49 ` [RFC PATCH 0/6] Remove XA_ZERO from error recovery of Jann Horn
2025-08-18 15:48   ` Liam R. Howlett
2025-08-18  9:44 ` David Hildenbrand
2025-08-18 14:26   ` Charan Teja Kalla
2025-08-18 14:54     ` Liam R. Howlett
2025-08-18 15:47   ` Lorenzo Stoakes
