From: Oleg Nesterov <oleg@redhat.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>,
"Eric W. Biederman" <ebiederm@xmission.com>,
Tycho Andersen <tycho@tycho.pizza>,
linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] pidfd_poll: report POLLHUP when pid_task() == NULL
Date: Mon, 5 Feb 2024 15:08:49 +0100 [thread overview]
Message-ID: <20240205140848.GA15853@redhat.com> (raw)
In-Reply-To: <20240203-freuden-frucht-b598f8cca27d@brauner>
On 02/03, Christian Brauner wrote:
>
> On Sat, Feb 03, 2024 at 01:04:26PM +0100, Oleg Nesterov wrote:
> >
> > - wake_up_all(&pid->wait_pidfd);
> > + __wake_up(&pid->wait_pidfd, TASK_NORMAL, 0,
> > + poll_to_key(EPOLLIN | EPOLLRDNORM));
>
>
> Ok, care to just send me a full patch for this?
Will do in a minute,
> Second, I agree that this looks ugly. ;/
Agreed ;)
> So here's a very likely a stupid idea. To make that clean we essentially
> need kernel private information that can't be set in userspace (Btw,
> look at EPOLL_URING_WAKE which is similar in that it cannot be set from
> userspace. It's not the same thing ofc but it is a private bit.). Which
> is the gist of your proposal in a way.
>
> So we would have to grab a new private bit in the epoll flag space.
Agreed, but just in case:
- EPOLLMSG (ab)used by this patch can't "leak" to userspace even
if it was (erroneously) set in pollfd.events
- If EPOLLMSG was set by userspace nothing bad can happen, just
poll(non-PIDFD_THREAD-pidfd) will get the spurious wakeups.
So. I am attaching the patch for the record, in case we return to this
later.
It seems to work fine, but when I look into fs/eventpoll.c I suspect
it is not epoll friendly. I _think_ that the neccessary fix is trivial,
ep_item_poll() should just copy pt->_key to epi->event.events after
vfs_poll(), but I am not sure. So lets forget it for now.
Oleg.
---
diff --git a/include/linux/pid.h b/include/linux/pid.h
index 8124d57752b9..7467cdb9735b 100644
--- a/include/linux/pid.h
+++ b/include/linux/pid.h
@@ -74,7 +74,7 @@ struct pid *pidfd_pid(const struct file *file);
struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags);
struct task_struct *pidfd_get_task(int pidfd, unsigned int *flags);
int pidfd_prepare(struct pid *pid, unsigned int flags, struct file **ret);
-void do_notify_pidfd(struct task_struct *task);
+void do_notify_pidfd(struct task_struct *task, bool thread);
static inline struct pid *get_pid(struct pid *pid)
{
diff --git a/kernel/exit.c b/kernel/exit.c
index 493647fd7c07..c31f36d3a1ed 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -744,7 +744,7 @@ static void exit_notify(struct task_struct *tsk, int group_dead)
* PIDFD_THREAD waiters.
*/
if (!thread_group_empty(tsk))
- do_notify_pidfd(tsk);
+ do_notify_pidfd(tsk, true);
if (unlikely(tsk->ptrace)) {
int sig = thread_group_leader(tsk) &&
diff --git a/kernel/fork.c b/kernel/fork.c
index 8d08a2d1b095..3b4474ff6f4a 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2081,6 +2081,15 @@ static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts)
struct task_struct *task;
__poll_t poll_flags = 0;
+ if (pts && pts->_qproc) {
+ /*
+ * We are not registered yet. Update pts->_key to mark us as
+ * a non POLLHUP-only PIDFD_THREAD waiter for do_notify_pidfd,
+ * _pollwait() will copy this _key to poll_table_entry->key.
+ */
+ if (thread && (pts->_key & (EPOLLIN | EPOLLRDNORM)))
+ pts->_key |= EPOLLMSG;
+ }
poll_wait(file, &pid->wait_pidfd, pts);
/*
* Depending on PIDFD_THREAD, inform pollers when the thread
diff --git a/kernel/signal.c b/kernel/signal.c
index c3fac06937e2..f51070dec132 100644
--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2019,14 +2019,15 @@ int send_sigqueue(struct sigqueue *q, struct pid *pid, enum pid_type type)
return ret;
}
-void do_notify_pidfd(struct task_struct *task)
+void do_notify_pidfd(struct task_struct *task, bool thread)
{
+ /* see the usage of EPOLLMSG in pidfd_poll() */
+ __poll_t m = thread ? EPOLLMSG : EPOLLIN | EPOLLRDNORM;
struct pid *pid = task_pid(task);
WARN_ON(task->exit_state == 0);
- __wake_up(&pid->wait_pidfd, TASK_NORMAL, 0,
- poll_to_key(EPOLLIN | EPOLLRDNORM));
+ __wake_up(&pid->wait_pidfd, TASK_NORMAL, 0, poll_to_key(m));
}
/*
@@ -2056,7 +2057,7 @@ bool do_notify_parent(struct task_struct *tsk, int sig)
* non-PIDFD_THREAD waiters.
*/
if (thread_group_empty(tsk))
- do_notify_pidfd(tsk);
+ do_notify_pidfd(tsk, false);
if (sig != SIGCHLD) {
/*
next prev parent reply other threads:[~2024-02-05 14:10 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-02-02 13:11 [PATCH 0/3] pidfd_poll: report POLLHUP when pid_task() == NULL Oleg Nesterov
2024-02-02 13:12 ` [PATCH 1/3] " Oleg Nesterov
2024-02-02 14:44 ` Christian Brauner
2024-02-02 15:16 ` Christian Brauner
2024-02-02 16:07 ` Oleg Nesterov
2024-02-02 17:24 ` Christian Brauner
2024-02-02 19:05 ` Oleg Nesterov
2024-02-02 19:50 ` Christian Brauner
2024-02-03 12:04 ` Oleg Nesterov
2024-02-03 16:46 ` Christian Brauner
2024-02-05 14:08 ` Oleg Nesterov [this message]
2024-02-02 13:12 ` [PATCH 2/3] pidfd: kill the no longer needed do_notify_pidfd() in de_thread() Oleg Nesterov
2024-02-02 13:12 ` [PATCH 3/3] pid: kill the obsolete PIDTYPE_PID code in transfer_pid() Oleg Nesterov
2024-02-02 15:49 ` [PATCH 0/3] pidfd_poll: report POLLHUP when pid_task() == NULL Christian Brauner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240205140848.GA15853@redhat.com \
--to=oleg@redhat.com \
--cc=brauner@kernel.org \
--cc=ebiederm@xmission.com \
--cc=linux-api@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=luto@amacapital.net \
--cc=tycho@tycho.pizza \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.