* [PATCH 0/3] man2: document Linux v6.9 pidfd-related changes
@ 2024-07-09 2:13 Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 1/3] clone.2: document CLONE_PIDFD | CLONE_THREAD Kir Kolyshkin
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Kir Kolyshkin @ 2024-07-09 2:13 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Kir Kolyshkin, linux-man, Christian Brauner, Oleg Nesterov
Linux 6.9 added support for PID file descriptors that refer to threads
(rather than processes). This series documents that support; the text is
mostly taken from the https://github.com/brauner/man-pages-md project,
with some details from the kernel commit messages.
Kir Kolyshkin (3):
clone.2: document CLONE_PIDFD | CLONE_THREAD
pidfd_open.2: add PIDFD_THREAD and poll nuances
pidfd_send_signal.2: describe flags
man/man2/clone.2 | 22 +++++++--------
man/man2/pidfd_open.2 | 36 ++++++++++++++++++++----
man/man2/pidfd_send_signal.2 | 53 ++++++++++++++++++++++++++++++++++--
3 files changed, 90 insertions(+), 21 deletions(-)
--
2.45.2
* [PATCH 1/3] clone.2: document CLONE_PIDFD | CLONE_THREAD
2024-07-09 2:13 [PATCH 0/3] man2: document Linux v6.9 pidfd-related changes Kir Kolyshkin
@ 2024-07-09 2:13 ` Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 3/3] pidfd_send_signal.2: describe flags Kir Kolyshkin
2 siblings, 0 replies; 7+ messages in thread
From: Kir Kolyshkin @ 2024-07-09 2:13 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Kir Kolyshkin, linux-man, Christian Brauner, Oleg Nesterov
Available since Linux 6.9 [1]. Documented in [2] (added by [3]).
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=83b290c9e3b5d95891f
[2]: https://github.com/brauner/man-pages-md/blob/main/clone.md
[3]: https://github.com/brauner/man-pages-md/pull/4
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
---
man/man2/clone.2 | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/man/man2/clone.2 b/man/man2/clone.2
index 90ab5cadf..e3f6364f6 100644
--- a/man/man2/clone.2
+++ b/man/man2/clone.2
@@ -895,10 +895,16 @@ .SS The flags mask
.BR clone ().
.RE
.IP
-It is currently not possible to use this flag together with
-.B CLONE_THREAD.
-This means that the process identified by the PID file descriptor
-will always be a thread group leader.
+If
+.B CLONE_PIDFD
+is specified together with
+.BR CLONE_THREAD ,
+the obtained PID file descriptor refers to a specific thread,
+as opposed to a thread-group leader if
+.B CLONE_THREAD
+is not specified.
+This feature is available since Linux 6.9.
+.\" commit 83b290c9e3b5d95891f43a4aeadf6071cbff25d3
.IP
If the obsolete
.B CLONE_DETACHED
@@ -1416,14 +1422,6 @@ .SH ERRORS
.I flags
mask.
.TP
-.B EINVAL
-.B CLONE_PIDFD
-was specified together with
-.B CLONE_THREAD
-in the
-.I flags
-mask.
-.TP
.BR "EINVAL " "(" clone "() only)"
.B CLONE_PIDFD
was specified together with
--
2.45.2
* [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances
2024-07-09 2:13 [PATCH 0/3] man2: document Linux v6.9 pidfd-related changes Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 1/3] clone.2: document CLONE_PIDFD | CLONE_THREAD Kir Kolyshkin
@ 2024-07-09 2:13 ` Kir Kolyshkin
2024-07-09 9:42 ` Oleg Nesterov
2024-07-09 2:13 ` [PATCH 3/3] pidfd_send_signal.2: describe flags Kir Kolyshkin
2 siblings, 1 reply; 7+ messages in thread
From: Kir Kolyshkin @ 2024-07-09 2:13 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Kir Kolyshkin, linux-man, Christian Brauner, Oleg Nesterov
The PIDFD_THREAD flag for pidfd_open(2) was added in Linux 6.9 (see [1]).
The nuances of using poll/epoll/select with a pidfd referring to a
process vs a thread are described in the merge commit [2].
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64bef697d33b
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b5683a37c881
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
---
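For anyone who wants to try the flag while reviewing, here is a minimal
sketch of opening a thread-specific pidfd. It assumes a kernel that has
PIDFD_THREAD (Linux 6.9 or later). The helper name open_thread_pidfd()
is hypothetical, and the fallback definitions are assumptions that only
matter when the installed headers do not provide the constants.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PIDFD_THREAD
#define PIDFD_THREAD O_EXCL	/* same value as in the kernel's <linux/pidfd.h> */
#endif

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434	/* assumption: generic syscall number */
#endif

/*
 * Hypothetical helper: open a pidfd that refers to the thread whose
 * TID is tid.  Returns the file descriptor, or -1 with errno set
 * (EINVAL is expected on kernels older than 6.9).
 */
static int
open_thread_pidfd(pid_t tid)
{
	return (int) syscall(SYS_pidfd_open, tid, PIDFD_THREAD);
}
```

Calling open_thread_pidfd(syscall(SYS_gettid)) from a non-leader thread
should succeed on 6.9 and later, whereas passing the same TID with flags
set to 0 is expected to fail, since without PIDFD_THREAD pid must refer
to a thread-group leader.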
man/man2/pidfd_open.2 | 36 ++++++++++++++++++++++++++++++------
1 file changed, 30 insertions(+), 6 deletions(-)
diff --git a/man/man2/pidfd_open.2 b/man/man2/pidfd_open.2
index c027afe67..c0c0809f4 100644
--- a/man/man2/pidfd_open.2
+++ b/man/man2/pidfd_open.2
@@ -4,7 +4,7 @@
.\"
.TH pidfd_open 2 (date) "Linux man-pages (unreleased)"
.SH NAME
-pidfd_open \- obtain a file descriptor that refers to a process
+pidfd_open \- obtain a file descriptor that refers to a task
.SH LIBRARY
Standard C library
.RI ( libc ", " \-lc )
@@ -25,24 +25,32 @@ .SH DESCRIPTION
The
.BR pidfd_open ()
system call creates a file descriptor that refers to
-the process whose PID is specified in
+the task whose PID is specified in
.IR pid .
The file descriptor is returned as the function result;
the close-on-exec flag is set on the file descriptor.
.P
The
.I flags
-argument either has the value 0, or contains the following flag:
+argument either has the value 0, or contains the following flags:
.TP
.BR PIDFD_NONBLOCK " (since Linux 5.10)"
.\" commit 4da9af0014b51c8b015ed8c622440ef28912efe6
Return a nonblocking file descriptor.
-If the process referred to by the file descriptor has not yet terminated,
+If the task referred to by the file descriptor has not yet terminated,
then an attempt to wait on the file descriptor using
.BR waitid (2)
will immediately return the error
.B EAGAIN
rather than blocking.
+.TP
+.BR PIDFD_THREAD " (since Linux 6.9)"
+.\" commit 64bef697d33b75fc06c5789b3f8108680271529f
+Create a pidfd that refers to a specific thread, rather than a process
+(thread-group leader). If this flag is not set,
+.I pid
+must refer to a process.
+.P
.SH RETURN VALUE
On success,
.BR pidfd_open ()
@@ -155,13 +163,29 @@ .SS Use cases for PID file descriptors
.BR select (2),
and
.BR epoll (7).
-When the process that it refers to terminates,
-these interfaces indicate the file descriptor as readable.
+These interfaces indicate the PID file descriptor as readable
+when the task has exited.
Note, however, that in the current implementation,
nothing can be read from the file descriptor
.RB ( read (2)
on the file descriptor fails with the error
.BR EINVAL ).
+.IP
+The behavior depends on whether the file descriptor refers
+to a process (thread-group leader) or a thread (see
+.B PIDFD_THREAD
+above):
+.RS
+.IP \[bu] 3
+For a thread-group leader, the polling task is woken if the
+thread-group is empty. In other words, if the thread-group
+leader task exits when there are still threads alive in its
+thread-group, the polling task will not be woken when the
+thread-group leader exits, but rather when the last thread in the
+thread-group exits.
+.IP \[bu]
+For a thread, the polling task is woken if the thread exits.
+.RE
.IP \[bu]
If the PID file descriptor refers to a child of the calling process,
then it can be waited on using
--
2.45.2
* [PATCH 3/3] pidfd_send_signal.2: describe flags
2024-07-09 2:13 [PATCH 0/3] man2: document Linux v6.9 pidfd-related changes Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 1/3] clone.2: document CLONE_PIDFD | CLONE_THREAD Kir Kolyshkin
2024-07-09 2:13 ` [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances Kir Kolyshkin
@ 2024-07-09 2:13 ` Kir Kolyshkin
2 siblings, 0 replies; 7+ messages in thread
From: Kir Kolyshkin @ 2024-07-09 2:13 UTC (permalink / raw)
To: Alejandro Colomar
Cc: Kir Kolyshkin, linux-man, Christian Brauner, Oleg Nesterov
These flags were added in Linux 6.9 (see [1]), and are documented in
[2].
The text added is a modified version of [3], removing some repetition
and adapting from markdown to mandoc.
[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e1fb1dc08e73
[2]: https://github.com/brauner/man-pages-md/blob/main/pidfd_send_signal.md
[3]: https://github.com/brauner/man-pages-md/pull/2
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
---
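As a quick way to exercise the new flags, the sketch below probes a
pidfd with the null signal (signal 0) in a chosen scope, so nothing is
actually delivered. The helper name probe_scope() is hypothetical, and
the fallback definitions are assumptions used only when the installed
headers lack the constants. On kernels older than 6.9 a nonzero flags
value is expected to fail with EINVAL.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef PIDFD_SIGNAL_THREAD
#define PIDFD_SIGNAL_THREAD		(1UL << 0)
#define PIDFD_SIGNAL_THREAD_GROUP	(1UL << 1)
#define PIDFD_SIGNAL_PROCESS_GROUP	(1UL << 2)
#endif

#ifndef SYS_pidfd_open
#define SYS_pidfd_open 434		/* assumption: generic syscall number */
#endif
#ifndef SYS_pidfd_send_signal
#define SYS_pidfd_send_signal 424	/* assumption: generic syscall number */
#endif

/*
 * Hypothetical helper: send the null signal through pidfd with the
 * given scope flag.  Returns 0 on success, or -1 with errno set.
 */
static int
probe_scope(int pidfd, unsigned int flags)
{
	return (int) syscall(SYS_pidfd_send_signal, pidfd, 0, NULL, flags);
}
```

For example, with a pidfd obtained for the calling process,
probe_scope(pidfd, PIDFD_SIGNAL_THREAD_GROUP) asks for thread-group
scope explicitly instead of relying on how the pidfd was created.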
man/man2/pidfd_send_signal.2 | 53 ++++++++++++++++++++++++++++++++++--
1 file changed, 50 insertions(+), 3 deletions(-)
diff --git a/man/man2/pidfd_send_signal.2 b/man/man2/pidfd_send_signal.2
index c7aecbf96..11d81dbe2 100644
--- a/man/man2/pidfd_send_signal.2
+++ b/man/man2/pidfd_send_signal.2
@@ -77,8 +77,55 @@ .SH DESCRIPTION
.P
The
.I flags
-argument is reserved for future use;
-currently, this argument must be specified as 0.
+argument allows modifying the scope of the signal. By
+default, the scope of the signal will be inferred from the
+.I pidfd
+argument. For example, if
+.I pidfd
+refers to a specific thread, i.e., the
+.I pidfd
+was created through
+.BR pidfd_open (2)
+passing the
+.B PIDFD_THREAD
+flag
+or through
+.BR clone3 (2)
+using the
+.B CLONE_PIDFD
+flag together with the
+.B CLONE_THREAD
+flag, then passing
+.IR pidfd " to"
+.BR pidfd_send_signal (2)
+and leaving the
+.IR flags " argument as"
+.B 0
+will cause the signal to be sent to the specific thread referenced by the
+.IR pidfd .
+.TP
+.BR PIDFD_SIGNAL_THREAD " (since Linux 6.9)"
+.\" commit e1fb1dc08e73466830612bcf2f9f72180965c9ba
+Ensure that the signal is sent to the specific thread referenced by
+.IR pidfd .
+.TP
+.BR PIDFD_SIGNAL_THREAD_GROUP " (since Linux 6.9)"
+.\" commit e1fb1dc08e73466830612bcf2f9f72180965c9ba
+If
+.I pidfd
+refers to a thread-group leader, ensure that the signal is
+sent to the thread-group, even if
+.I pidfd
+was created to refer to a specific thread.
+.TP
+.BR PIDFD_SIGNAL_PROCESS_GROUP " (since Linux 6.9)"
+.\" commit e1fb1dc08e73466830612bcf2f9f72180965c9ba
+If
+.I pidfd
+refers to a process-group leader, ensure that the signal is
+sent to the process-group, even if
+.I pidfd
+was created to refer to a specific thread or to a thread-group leader.
.SH RETURN VALUE
On success,
.BR pidfd_send_signal ()
@@ -102,7 +149,7 @@ .SH ERRORS
.TP
.B EINVAL
.I flags
-is not 0.
+is not valid.
.TP
.B EPERM
The calling process does not have permission to send the signal
--
2.45.2
* Re: [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances
2024-07-09 2:13 ` [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances Kir Kolyshkin
@ 2024-07-09 9:42 ` Oleg Nesterov
2024-07-09 22:06 ` Kirill Kolyshkin
0 siblings, 1 reply; 7+ messages in thread
From: Oleg Nesterov @ 2024-07-09 9:42 UTC (permalink / raw)
To: Kir Kolyshkin; +Cc: Alejandro Colomar, linux-man, Christian Brauner
Hi Kir,
On 07/08, Kir Kolyshkin wrote:
>
> [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64bef697d33b
The changelog says:
pidfd: implement PIDFD_THREAD flag for pidfd_open()
With this flag:
- pidfd_open() doesn't require that the target task must be
a thread-group leader
- pidfd_poll() succeeds when the task exits and becomes a
zombie (iow, passes exit_notify()), even if it is a leader
and thread-group is not empty.
This means that the behaviour of pidfd_poll(PIDFD_THREAD,
pid-of-group-leader) is not well defined if it races with
exec() from its sub-thread; pidfd_poll() can succeed or not
depending on whether pidfd_task_exited() is called before
or after exchange_tids().
> +The behavior depends on whether the file descriptor refers
> +to a process (thread-group leader) or a thread (see
> +.B PIDFD_THREAD
> +above):
> +.RS
> +.IP \[bu] 3
> +For a thread-group leader, the polling task is woken if the
> +thread-group is empty. In other words, if the thread-group
> +leader task exits when there are still threads alive in its
> +thread-group, the polling task will not be woken when the
> +thread-group leader exits, but rather when the last thread in the
> +thread-group exits.
so this part is not accurate.
See also 43f0df54c96fa5a ("pidfd_poll: report POLLHUP when pid_task() == NULL")
which adds another feature.
Oleg.
* Re: [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances
2024-07-09 9:42 ` Oleg Nesterov
@ 2024-07-09 22:06 ` Kirill Kolyshkin
2024-07-09 22:25 ` Oleg Nesterov
0 siblings, 1 reply; 7+ messages in thread
From: Kirill Kolyshkin @ 2024-07-09 22:06 UTC (permalink / raw)
To: Oleg Nesterov; +Cc: Alejandro Colomar, linux-man, Christian Brauner
On Tue, Jul 9, 2024 at 2:43 AM Oleg Nesterov <oleg@redhat.com> wrote:
>
> Hi Kir,
>
> On 07/08, Kir Kolyshkin wrote:
> >
> > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64bef697d33b
>
> The changelog says:
>
> pidfd: implement PIDFD_THREAD flag for pidfd_open()
>
> With this flag:
>
> - pidfd_open() doesn't require that the target task must be
> a thread-group leader
>
> - pidfd_poll() succeeds when the task exits and becomes a
> zombie (iow, passes exit_notify()), even if it is a leader
> and thread-group is not empty.
>
> This means that the behaviour of pidfd_poll(PIDFD_THREAD,
> pid-of-group-leader) is not well defined if it races with
> exec() from its sub-thread; pidfd_poll() can succeed or not
> depending on whether pidfd_task_exited() is called before
> or after exchange_tids().
>
> > +The behavior depends on whether the file descriptor refers
> > +to a process (thread-group leader) or a thread (see
> > +.B PIDFD_THREAD
> > +above):
> > +.RS
> > +.IP \[bu] 3
> > +For a thread-group leader, the polling task is woken if the
> > +thread-group is empty. In other words, if the thread-group
> > +leader task exits when there are still threads alive in its
> > +thread-group, the polling task will not be woken when the
> > +thread-group leader exits, but rather when the last thread in the
> > +thread-group exits.
>
> so this part is not accurate.
I copied the above text almost verbatim from the merge commit
description (commit b5683a37c881).
Until it's clarified, let's remove this text, adding a TODO instead.
>
> See also 43f0df54c96fa5a ("pidfd_poll: report POLLHUP when pid_task() == NULL")
> which adds another feature.
>
> Oleg.
>
* Re: [PATCH 2/3] pidfd_open.2: add PIDFD_THREAD and poll nuances
2024-07-09 22:06 ` Kirill Kolyshkin
@ 2024-07-09 22:25 ` Oleg Nesterov
0 siblings, 0 replies; 7+ messages in thread
From: Oleg Nesterov @ 2024-07-09 22:25 UTC (permalink / raw)
To: Kirill Kolyshkin; +Cc: Alejandro Colomar, linux-man, Christian Brauner
On 07/09, Kirill Kolyshkin wrote:
>
> On Tue, Jul 9, 2024 at 2:43 AM Oleg Nesterov <oleg@redhat.com> wrote:
> >
> > Hi Kir,
> >
> > On 07/08, Kir Kolyshkin wrote:
> > >
> > > [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=64bef697d33b
> >
> > The changelog says:
> >
> > pidfd: implement PIDFD_THREAD flag for pidfd_open()
> >
> > With this flag:
> >
> > - pidfd_open() doesn't require that the target task must be
> > a thread-group leader
> >
> > - pidfd_poll() succeeds when the task exits and becomes a
> > zombie (iow, passes exit_notify()), even if it is a leader
> > and thread-group is not empty.
> >
> > This means that the behaviour of pidfd_poll(PIDFD_THREAD,
> > pid-of-group-leader) is not well defined if it races with
> > exec() from its sub-thread; pidfd_poll() can succeed or not
> > depending on whether pidfd_task_exited() is called before
> > or after exchange_tids().
> >
> > > +The behavior depends on whether the file descriptor refers
> > > +to a process (thread-group leader) or a thread (see
> > > +.B PIDFD_THREAD
> > > +above):
> > > +.RS
> > > +.IP \[bu] 3
> > > +For a thread-group leader, the polling task is woken if the
> > > +thread-group is empty. In other words, if the thread-group
> > > +leader task exits when there are still threads alive in its
> > > +thread-group, the polling task will not be woken when the
> > > +thread-group leader exits, but rather when the last thread in the
> > > +thread-group exits.
> >
> > so this part is not accurate.
>
> I copied the above text almost verbatim from the merge commit
> description (commit b5683a37c881).
And b5683a37c881 says
For thread-group leader pidfds ...
...
For thread-specific pidfds the polling task is woken if the
thread exits.
I think that "thread-specific pidfds" means pidfd created with the
PIDFD_THREAD flag.
> Until it's clarified, let's remove this text, adding a TODO instead.
OK. but you can also look at the (trivial) code:
	static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts)
	{
		struct pid *pid = pidfd_pid(file);
		bool thread = file->f_flags & PIDFD_THREAD;
		struct task_struct *task;
		__poll_t poll_flags = 0;

		poll_wait(file, &pid->wait_pidfd, pts);
		/*
		 * Depending on PIDFD_THREAD, inform pollers when the thread
		 * or the whole thread-group exits.
		 */
		guard(rcu)();
		task = pid_task(pid, PIDTYPE_PID);
		if (!task)
			poll_flags = EPOLLIN | EPOLLRDNORM | EPOLLHUP;
		else if (task->exit_state && (thread || thread_group_empty(task)))
			poll_flags = EPOLLIN | EPOLLRDNORM;

		return poll_flags;
	}
please note that the thread_group_empty() check has no effect if
bool thread == f_flags & PIDFD_THREAD is true.
In this case pidfd_poll() succeeds if the target task has already
exited (passed exit_notify, so ->exit_state is not zero), no matter if
it is a leader or not.
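
The condition in pidfd_poll() can also be read as a small truth table.
The following is only a user-space model of that decision, not kernel
code: struct poll_model, the MODEL_* constants, and model_pidfd_poll()
are all hypothetical names introduced for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the inputs that pidfd_poll() looks at. */
struct poll_model {
	bool task_exists;	/* pid_task() returned a non-NULL task */
	bool exited;		/* task->exit_state is nonzero */
	bool thread;		/* pidfd was opened with PIDFD_THREAD */
	bool group_empty;	/* thread_group_empty(task) is true */
};

/* Stand-ins for (EPOLLIN | EPOLLRDNORM) and EPOLLHUP. */
enum { MODEL_IN = 1, MODEL_HUP = 2 };

/* Mirror the if/else-if chain of pidfd_poll() on the modelled inputs. */
static int
model_pidfd_poll(const struct poll_model *m)
{
	if (!m->task_exists)
		return MODEL_IN | MODEL_HUP;
	if (m->exited && (m->thread || m->group_empty))
		return MODEL_IN;
	return 0;
}
```

In this model, a leader that has exited while other threads are still
alive reports MODEL_IN only when thread is true; with thread false the
result stays 0 until the group is empty.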
Oleg.