All of lore.kernel.org
 help / color / mirror / Atom feed
* PIDFD_THREAD behavior for thread-group leaders
@ 2025-03-06 11:41 Christian Brauner
  2025-03-06 12:17 ` Oleg Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Christian Brauner @ 2025-03-06 11:41 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: linux-fsdevel, Jeff Layton, Lennart Poettering, Daan De Meyer,
	Mike Yuan

Oleg,

I've been thinking about the multi-threaded exec case where a
non-thread-group leader task execs and assumes the thread-group leaders
struct pid.

Back when we implemented support for PIDFD_THREAD we ended up with the
decision that if userspace holds:

pidfd_leader_thread = pidfd_open(<thread-group-leader-pid>, PIDFD_THREAD)

that exit notification is not strictly defined if a non-thread-group
leader thread execs: If poll is called before the exec happened, then an
exit notification may be observed and if aftewards no exit notification
is generated for the old thread-group leader. Of if exit for the old
thread-group leader was observed but poll is called again then it would
block again.

I was wondering why the following snippet wouldn't work to ensure that
PIDFD_THREAD pidfds for thread-group leaders wouldn't be woken with
spurious exits:

diff --git a/kernel/exit.c b/kernel/exit.c
index 9916305e34d3..b79ded1b3bf5 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -745,8 +745,11 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
        /*
         * sub-thread or delay_group_leader(), wake up the
         * PIDFD_THREAD waiters.
+        *
+        * The thread-group leader will be taken over by the execing
+        * task so don't cause spurious wakeups.
         */
-       if (!thread_group_empty(tsk))
+       if (!thread_group_empty(tsk) && (tsk->signal->notify_count >= 0))
                do_notify_pidfd(tsk);

        if (unlikely(tsk->ptrace)) {

Because that would seem more consistent to me. The downside would be
that if userspace performed a series of multi-threaded exec for
non-thread-group leader threads then waiters wouldn't get woken. But I
think that's probably ok.

To handle this case we could later think about whether we can instead
start generating a separate poll (POLLPRI?) event when exec happens.

I'm probably missing something very obvious why that won't work.

Christian

^ permalink raw reply related	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2025-03-06 12:48 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-03-06 11:41 PIDFD_THREAD behavior for thread-group leaders Christian Brauner
2025-03-06 12:17 ` Oleg Nesterov
2025-03-06 12:48   ` Christian Brauner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.