The Linux Kernel Mailing List
* Re: [PATCH v1 2/3] mm: process_mrelease: skip LRU movement for exclusive file folios
From: Liam R. Howlett @ 2026-05-08 20:57 UTC
  To: David Hildenbrand (Arm)
  Cc: Michal Hocko, Minchan Kim, Suren Baghdasaryan, akpm, hca,
	linux-s390, brauner, linux-mm, linux-kernel, timmurray,
	Liam R. Howlett

On 26/04/30 08:08AM, David Hildenbrand (Arm) wrote:
> On 4/29/26 16:44, Michal Hocko wrote:
> > On Wed 29-04-26 15:07:04, David Hildenbrand wrote:
> >> On 4/29/26 12:33, Michal Hocko wrote:
> >>>
> >>> While the oom is the only current kernel user of MMF_UNSTABLE (in the
> >>> sense that it sets the flag), the flag should denote that any page faults
> >>> are unreliable, because they might fault in fresh memory and the user would
> >>> lose the previous content without knowing it. Not sure MMF_OOM_REAPING
> >>> would reflect that reality better.
> >>
> >> We use it for failed fork() as well, but that's slightly different semantics (no
> >> real page faults ever made sense).
> 
> Well, there is a difference: a failed-fork process was never scheduled and will
> never get scheduled.
> 
> In fact, we added the MMF_UNSTABLE to the fork path in
> 
> commit 64c37e134b120fb462fb4a80694bfb8e7be77b14
> Author: Liam R. Howlett <liam@infradead.org>
> Date:   Mon Jan 27 12:02:21 2025 -0500
> 
>     kernel: be more careful about dup_mmap() failures and uprobe registering
> 
>     If a memory allocation fails during dup_mmap(), the maple tree can be left
>     in an unsafe state for other iterators besides the exit path.  All the
>     locks are dropped before the exit_mmap() call (in mm/mmap.c), but the
>     incomplete mm_struct can be reached through (at least) the rmap finding
>     the vmas which have a pointer back to the mm_struct.
> 
>     Up to this point, there have been no issues with being able to find an
>     mm_struct that was only partially initialised.  Syzbot was able to make
>     the incomplete mm_struct fail with recent forking changes, so it has been
>     proven unsafe to use the mm_struct that hasn't been initialised, as
>     referenced in the link below.
> 
>     Although 8ac662f5da19f ("fork: avoid inappropriate uprobe access to
>     invalid mm") fixed the uprobe access, it does not completely remove the
>     race.
> 
>     This patch sets the MMF_OOM_SKIP to avoid the iteration of the vmas on the
>     oom side (even though this is extremely unlikely to be selected as an oom
>     victim in the race window), and sets MMF_UNSTABLE to avoid other potential
>     users from using a partially initialised mm_struct.
> 
> Which was later changed in
> 
> commit 43873af772f8138c5cb4b76dde9c26339e89be3b
> Author: Liam R. Howlett <liam@infradead.org>
> Date:   Wed Jan 21 11:49:42 2026 -0500
> 
>     mm: change dup_mmap() recovery
> 
>     When the dup_mmap() fails during the vma duplication or setup, don't write
>     the XA_ZERO entry in the vma tree.  Instead, destroy the tree and free the
>     new resources, leaving an empty vma tree.
> 
>     Using XA_ZERO introduced races where the vma could be found between
>     dup_mmap() dropping all locks and exit_mmap() taking the locks.  The race
>     can occur because the mm can be reached through the other trees via
>     successfully copied vmas and other methods such as the swapoff code.
> ...
> 
> and I am not sure if MMF_UNSTABLE is still required, as we don't leave these
> stale VMA copies in the maple tree.
> 
> The process might now look like just another process that is getting torn down.
> 
> But we'd have to learn from Liam :)

Yes, it will be a zero entry tree now.

I left the flag to indicate that it's an unstable mm, not for faulting
in but so that it is skipped in OOM events and process_mrelease, since
neither should bother doing anything in the window between the dup_mm()
failure and exit_mmap(), during which the write lock is dropped.

We can safely drop the flag now if you want to, because everything has
to deal with an empty vma tree anyways - a race can occur between a call
to unmap everything and the task seg faulting.

> 
> 
> > 
> > The bottom line is the same. Make sure PF fails rather than silently
> > provide potentially corrupted data.
> > 
> >> Looking at the original patch here, using MMF_OOM_REAPING to modify zapping
> >> behavior would be clearer than MMF_UNSTABLE, I guess.
> > 
> > Ohh, you mean to add a new flag, right?
> 
> We could do that as well, if it's of any help.

I really think this goes back to the life cycle of the mm being somewhat
difficult to figure out.  I'm fine with another flag.

Thanks,
Liam
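
The following is a minimal userspace sketch of the behaviour discussed
above, offered only as an illustration. It is not kernel code and is not
taken from any message in this thread. It assumes a toy model in which a
flat array stands in for the vma tree, and every name in it (toy_mm,
TOY_MMF_UNSTABLE, TOY_MMF_OOM_SKIP, toy_dup_failed, toy_reap, toy_fault)
is hypothetical. It shows three things: a failed duplication leaves an
empty vma set, a reaper skips an mm that is marked, and a fault on an
unstable mm fails instead of handing back fresh memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this toy model; not the kernel's definitions. */
#define TOY_MMF_UNSTABLE  (1u << 0)
#define TOY_MMF_OOM_SKIP  (1u << 1)

#define TOY_MAX_VMAS 8

/* A toy address space: a flat array stands in for the tree of vmas. */
struct toy_mm {
	unsigned int flags;
	size_t nr_vmas;
	size_t vma_pages[TOY_MAX_VMAS];	/* pages mapped by each vma */
};

/*
 * Model of the recovery described in the thread: after a failed
 * duplication, every copied vma is discarded so the set is left empty.
 * Marking the mm is optional here, mirroring the open question of
 * whether the flag is still needed.
 */
void toy_dup_failed(struct toy_mm *mm, bool mark_unstable)
{
	assert(mm != NULL);
	mm->nr_vmas = 0;
	if (mark_unstable)
		mm->flags |= TOY_MMF_UNSTABLE | TOY_MMF_OOM_SKIP;
}

/*
 * Toy reaper: returns the number of pages it released. It bails out
 * early when told to skip; otherwise it walks whatever vmas exist,
 * which is a no-op for an empty set.
 */
size_t toy_reap(struct toy_mm *mm)
{
	size_t released = 0;

	assert(mm != NULL);
	if (mm->flags & TOY_MMF_OOM_SKIP)
		return 0;

	/* Mark the mm before zapping so that a racing fault fails. */
	mm->flags |= TOY_MMF_UNSTABLE;

	for (size_t i = 0; i < mm->nr_vmas; i++) {
		released += mm->vma_pages[i];
		mm->vma_pages[i] = 0;
	}
	return released;
}

/*
 * Toy fault handler: returns 0 on success and -1 when the fault must
 * fail rather than silently provide fresh, zeroed memory.
 */
int toy_fault(const struct toy_mm *mm)
{
	assert(mm != NULL);
	return (mm->flags & TOY_MMF_UNSTABLE) ? -1 : 0;
}
```

In this model, calling toy_dup_failed(&mm, false) followed by
toy_reap(&mm) returns 0 because there are no vmas left to walk. That is
the toy equivalent of the point that callers already have to cope with
an empty vma tree, with or without the flag.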



* Re: [PATCH v1 2/3] mm: process_mrelease: skip LRU movement for exclusive file folios
From: David Hildenbrand (Arm) @ 2026-05-11 13:05 UTC
  To: Liam R. Howlett
  Cc: Michal Hocko, Minchan Kim, Suren Baghdasaryan, akpm, hca,
	linux-s390, brauner, linux-mm, linux-kernel, timmurray,
	Liam R. Howlett

>>
>> But we'd have to learn from Liam :)
> 
> Yes, it will be a zero entry tree now.
> 
> I left the flag to indicate that it's an unstable mm, not for faulting
> in but so that it is skipped in OOM events and process_mrelease, since
> neither should bother doing anything in the window between the dup_mm()
> failure and exit_mmap(), during which the write lock is dropped.
> 
> We can safely drop the flag now if you want to, because everything has
> to deal with an empty vma tree anyways - a race can occur between a call
> to unmap everything and the task seg faulting.

Thanks for confirming!

> 
>>
>>
>>>
>>> The bottom line is the same. Make sure PF fails rather than silently
>>> provide potentially corrupted data.
>>>
>>>
>>> Ohh, you mean to add a new flag, right?
>>
>> We could do that as well, if it's of any help.
> 
> I really think this goes back to the life cycle of the mm being somewhat
> difficult to figure out.  

Agreed.

> I'm fine with another flag.

Right, alternatively we could just turn the unstable flag into an "OOM is
hiding in the bushes to reap this MM" flag.

-- 
Cheers,

David


* Re: [PATCH v1 2/3] mm: process_mrelease: skip LRU movement for exclusive file folios
From: Michal Hocko @ 2026-05-13  6:47 UTC
  To: Liam R. Howlett
  Cc: David Hildenbrand (Arm), Minchan Kim, Suren Baghdasaryan, akpm,
	hca, linux-s390, brauner, linux-mm, linux-kernel, timmurray,
	Liam R. Howlett

On Fri 08-05-26 22:57:31, Liam R. Howlett wrote:
> We can safely drop the flag now if you want to, because everything has
> to deal with an empty vma tree anyways - a race can occur between a call
> to unmap everything and the task seg faulting.

Let's just drop it if it doesn't serve any real need anymore.
-- 
Michal Hocko
SUSE Labs
