All of lore.kernel.org
 help / color / mirror / Atom feed
From: Matthew Bobrowski <repnop@google.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Alois Wohlschlager <alwoju@gmx.de>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Alexey Gladkov <legion@kernel.org>, Jens Axboe <axboe@kernel.dk>,
	David Hildenbrand <david@redhat.com>,
	Rolf Eike Beer <eb@emlix.com>,
	Ran Xiaokai <ran.xiaokai@zte.com.cn>, Jan Kara <jack@suse.cz>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] pid: Allow creation of pidfds to threads
Date: Wed, 13 Apr 2022 14:46:20 +1000	[thread overview]
Message-ID: <YlZVnMk39kFHF0Kp@google.com> (raw)
In-Reply-To: <20220401094225.epianrz3afrhfrqp@wittgenstein>

On Fri, Apr 01, 2022 at 11:42:25AM +0200, Christian Brauner wrote:
> On Fri, Apr 01, 2022 at 11:00:27AM +0200, Alois Wohlschlager wrote:
> > Hello Christian,
> > 
> > > We originally blocked this because it is not as easy as simply allowing
> > > pidfds to be created for non-thread-group leaders.
> > > For a start, pidfd_poll() currently doens't work if pidfd_task() isn't a
> > > thread-group leader
> > 
> > I did notice the hang there, that's why my patch changes pidfd_poll to return
> > error on tasks which are not thread-group leaders. IIRC, waiting on specific
> > threads is not supported by Linux at all, so I don't see a problem with not
> > supporting it here either.
> 
> In general, it would be quite neat if we could get notified about thread
> exit through poll though. That'd be pretty useful. But maybe it's indeed
> ok to just not support this (for now at least).
> 
> I know that systemd is using pidfds in their event loop so I'd need to
> see whether they'd want support for this behavior.
> 
> > 
> > > and you'll just hang for CLONE_PIDFD | CLONE_THREAD.
> > 
> > No, CLONE_PIDFD | CLONE_THREAD behavior is unchanged, it will still fail with
> > EINVAL. I actually confirmed this by double-checking right now.
> 
> I just used the two flags as a shorthand for pidfds referring to
> threads. That might've been misleading here.
> 
> > 
> > > So at least that needs to be adapated as well and there's likely a bunch
> > > of other corner-cases I'm forgetting about.
> > 
> > I'd be happy to hear about other corner-cases so I can fix them.
> 
> I need to play with this patch a little and see what current
> expectations we do have in the code.
> 
> There are various consumers of pidfds and they all have been added with
> the assumption that a pidfd refers to a thread-group leader. We should
> go through them and see whether changing them to operate on threads is
> sane before we can just switch the generic helper.
> 
> Bot process_madvise() and process_mrelease() should be fine to operate
> on threads afaict from the discussion when they were added.
> 
> For pidfd_send_signal() we likely want to at least consider adding the
> ability to send a thread-specific signal, i.e. supporting tgkill()
> behavior. As it stands it currently only supports kill()-like behavior
> where the signal that gets sent is thread-group directed.
> 
> I roughly had originally envisioned this to be supportable through the
> addition of a new flag to pidfd_send_signal() so callers would be able
> to select whether to send a thread-specific signal or not. What do
> people think of that?

Sorry, I've been on parental leave for the last couple of months and
I'm now playing catch-up.

For the fanotify API i.e. FAN_REPORT_PIDFD, I don't see there being
any issues with supporting/returning pidfds which belong to
non-thread-group leaders. In saying that, for this to be useful from
the fanotify API POV, I definitely do think we should consider
supporting the ability to send thread-specific signals via
pidfd_send_signal(). Adding this extension through the optional flag
parameter makes sense to me.

> > > Do you have a concrete use-case you want this for?
> > 
> > My use-case is basically making pidfd_getfd actually useful for its intended
> > purpose: there is a seccomp_unotify-based supervisor that wants to obtain a
> > file descriptor from its guest. This currently does not work if the action to
> > be forwarded to the supervisor is performed in a secondary thread, since there
> > is no way to obtain the required pidfd.
> 
> Yeah, I'm well aware of that. I've been working around this limitation
> in our implementation for the seccomp notifier for quite a long time
> when intercepting the bpf()-syscall.

/M

  reply	other threads:[~2022-04-13  4:46 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-03-31 19:02 [PATCH] pid: Allow creation of pidfds to threads Alois Wohlschlager
2022-04-01  7:09 ` Christian Brauner
2022-04-01  9:00   ` Alois Wohlschlager
2022-04-01  9:42     ` Christian Brauner
2022-04-13  4:46       ` Matthew Bobrowski [this message]
2022-04-14  8:11         ` Alois Wohlschlager

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YlZVnMk39kFHF0Kp@google.com \
    --to=repnop@google.com \
    --cc=akpm@linux-foundation.org \
    --cc=alwoju@gmx.de \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=david@redhat.com \
    --cc=eb@emlix.com \
    --cc=ebiederm@xmission.com \
    --cc=jack@suse.cz \
    --cc=legion@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=peterz@infradead.org \
    --cc=ran.xiaokai@zte.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.