From: Jann Horn <jannh@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	 Vlastimil Babka <vbabka@suse.cz>,
	Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	 Michal Hocko <mhocko@suse.com>,
	linux-mm@kvack.org, Peter Xu <peterx@redhat.com>,
	 linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH 1/2] mm/memory: ensure fork child sees coherent memory snapshot
Date: Tue, 3 Jun 2025 21:09:14 +0200
Message-ID: <CAG48ez2NX-L0Wq-DQDB2vb3CvOJ1uTmJOqmbMW=FOTtxVoouxg@mail.gmail.com>
In-Reply-To: <db2268f0-7885-471d-94a3-8ae4641ba2e5@redhat.com>

On Tue, Jun 3, 2025 at 8:37 PM David Hildenbrand <david@redhat.com> wrote:
> On 03.06.25 20:29, Matthew Wilcox wrote:
> > On Tue, Jun 03, 2025 at 08:21:02PM +0200, Jann Horn wrote:
> >> When fork() encounters possibly-pinned pages, those pages are immediately
> >> copied instead of just marking PTEs to make CoW happen later. If the parent
> >> is multithreaded, this can cause the child to see memory contents that are
> >> inconsistent in multiple ways:
> >>
> >> 1. We are copying the contents of a page with a memcpy() while userspace
> >>     may be writing to it. This can cause the resulting data in the child to
> >>     be inconsistent.
> >> 2. After we've copied this page, future writes to other pages may
> >>     continue to be visible to the child while future writes to this page are
> >>     no longer visible to the child.
> >>
> >> This means the child could theoretically see incoherent states where
> >> allocator freelists point to objects that are actually in use or stuff like
> >> that. A mitigating factor is that, unless userspace already has a deadlock
> >> bug, userspace can pretty much only observe such issues when fancy lockless
> >> data structures are used (because if another thread was in the middle of
> >> mutating data during fork() and the post-fork child tried to take the mutex
> >> protecting that data, it might wait forever).
> >
> > Um, OK, but isn't that expected behaviour?  POSIX says:
> >
> > : A process shall be created with a single thread. If a multi-threaded
> > : process calls fork(), the new process shall contain a replica of the
> > : calling thread and its entire address space, possibly including the
> > : states of mutexes and other resources. Consequently, the application
> > : shall ensure that the child process only executes async-signal-safe
> > : operations until such time as one of the exec functions is successful.
> >
> > It's always been my understanding that you really, really shouldn't call
> > fork() from a multithreaded process.
>
> I have the same recollection, but rather because of concurrent O_DIRECT
> and locking (pthread_atfork ...).
>
> Using the allocator example above: what makes sure that no other thread
> is halfway through modifying allocator state? You really have to sync
> somehow before calling fork() -- e.g., grabbing allocator locks in
> pthread_atfork().

Yeah, like what glibc does for its malloc implementation to prevent
allocator calls from racing with fork(), so that malloc() keeps
working after fork(), even though POSIX says that the libc doesn't
have to guarantee that.
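
As a rough illustration of that pattern (not glibc's actual code), here is a
minimal sketch of taking a lock in pthread_atfork() handlers so that no other
thread is halfway through an update when fork() snapshots the address space.
The names state_lock, state_a, state_b and fork_sees_consistent_state() are
made up for this example, and unlocking the mutex in the child handler is the
conventional approach rather than something POSIX strictly promises.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state standing in for allocator metadata.
 * Invariant: state_a == state_b whenever state_lock is not held. */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static long state_a, state_b;
static atomic_int stop_writer;
static pthread_once_t atfork_once = PTHREAD_ONCE_INIT;

/* prepare handler: runs in the forking thread before fork() */
static void lock_before_fork(void)  { pthread_mutex_lock(&state_lock); }
/* parent and child handlers: run after fork() in the respective process */
static void unlock_after_fork(void) { pthread_mutex_unlock(&state_lock); }

static void register_handlers(void)
{
    pthread_atfork(lock_before_fork, unlock_after_fork, unlock_after_fork);
}

/* Second thread that keeps mutating the state under the lock. */
static void *writer(void *arg)
{
    (void)arg;
    while (!atomic_load(&stop_writer)) {
        pthread_mutex_lock(&state_lock);
        state_a++;
        state_b++;
        pthread_mutex_unlock(&state_lock);
    }
    return NULL;
}

/* Returns 0 if the child observed the invariant, 1 if not, -1 on error. */
int fork_sees_consistent_state(void)
{
    pthread_t tid;
    int status = 0;

    /* Register once; registering twice would lock the mutex twice. */
    pthread_once(&atfork_once, register_handlers);
    atomic_store(&stop_writer, 0);
    if (pthread_create(&tid, NULL, writer, NULL) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the lock was released by the child handler, so this
         * cannot wait on a thread that does not exist here. */
        pthread_mutex_lock(&state_lock);
        int ok = (state_a == state_b);
        pthread_mutex_unlock(&state_lock);
        _exit(ok ? 0 : 1);
    }

    atomic_store(&stop_writer, 1);
    pthread_join(tid, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : 1;
}
```

This only helps for state whose writers all take the lock; lockless data
structures get no protection from it.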

> For Linux we document in the man page
>
> "After a fork() in a multithreaded program, the child can safely call
> only async-signal-safe functions (see signal-safety(7)) until such time
> as it calls execve(2)."
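
For reference, a minimal sketch of a child that stays within that rule: after
fork() it calls only execve() and, on failure, _exit(), both of which are
listed in signal-safety(7). The helper name spawn_and_wait() is made up for
this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: run path with argv and return its exit status,
 * or -1 if fork/waitpid failed or the child did not exit normally. */
int spawn_and_wait(const char *path, char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: no malloc(), no stdio, no locks; just exec. */
        execve(path, argv, environ);
        _exit(127);             /* only reached if execve() failed */
    }
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling it with "/bin/sh" and the arguments { "sh", "-c", "exit 7", NULL }
should return 7 on a system that has /bin/sh.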
