linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Andy Lutomirski <luto@amacapital.net>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	Tycho Andersen <tycho@tycho.pizza>,
	linux-api@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/3] pidfd_poll: report POLLHUP when pid_task() == NULL
Date: Sat, 3 Feb 2024 13:04:26 +0100	[thread overview]
Message-ID: <20240203120425.GA30029@redhat.com> (raw)
In-Reply-To: <20240202190529.GA28818@redhat.com>

Christian, I apologize for my terse and unclear emails yesterday,
I was a bit busy.

On 02/02, Oleg Nesterov wrote:
>
> On 02/02, Christian Brauner wrote:
> >
> > > I think we need a simpler patch. I was going to send it as 4/4, but I'd
> > > like to think more, _perhaps_ we can also discriminate the PIDFD_THREAD
> > > and non-PIDFD_THREAD waiters. I'll try to make the patch(es) tomorrow or
> >
> > Right, I didn't go that far.

OK, so lets forget about the PIDFD_THREAD waiters for the moment.
Then everything is trivial, please see below.

> > > 	1. we can't use wake_up_poll(), it passes nr_exclusive => 1
> >
> > Bah. So we need the same stuff we did for io_uring and use
> > __wake_up() directly. Or we add wake_up_all_poll() and convert the other
> > three callsites:
>
> ...
>
> > +#define wake_up_all_poll(x, m)                                                 \
> > +       __wake_up(x, TASK_NORMAL, 0, poll_to_key(m))
>
> Agreed, but I think this + s/wake_up/wake_up_all_poll/ conversions
> need a separate patch.

And if it was not clear I like this change! In fact I thought about
the new helper too, but I didn't realize that it already have the
users.

> > -void do_notify_pidfd(struct task_struct *task)
> > +void pidfd_wake_up_poll(struct task_struct *task, bool dead)
> >  {
> > -	struct pid *pid;
> > -
> >  	WARN_ON(task->exit_state == 0);
> > -	pid = task_pid(task);
> > -	wake_up_all(&pid->wait_pidfd);
> > +	WARN_ON(mask == 0);
> > +	wake_up_all_poll(&task_pid(task)->wait_pidfd,
> > +			 EPOLLIN | EPOLLRDNORM | dead ? EPOLLHUP : 0);
>
> No...
>
> This is still overcomplicated and is not right.
                                ^^^^^^^^^^^^^^^^
Sorry, sorry, I misread your change, "dead" is always false so it has
no effect and thus the change is correct.

But why do we need this arg? All we need is the trivial one-liner:

--- a/kernel/signal.c
+++ b/kernel/signal.c
@@ -2025,7 +2025,8 @@ void do_notify_pidfd(struct task_struct *task)
 
 	WARN_ON(task->exit_state == 0);
 	pid = task_pid(task);
-	wake_up_all(&pid->wait_pidfd);
+	__wake_up(&pid->wait_pidfd, TASK_NORMAL, 0,
+		  poll_to_key(EPOLLIN | EPOLLRDNORM));
 }
 
 /*

and I was going to send the patch above as 4/4, but then decided
to delay it, see below.

We can rename do_notify_pidfd() if you wish, and of course the
new wake_up_all_poll() helper you proposed makes sense, but this
is another story.

As for __change_pid(). In this case wake_up_all() is fine, we can
change it to use wake_up_all_poll() too for consistency, but this
is not strictly necessary and in fact "key = 0" makes a bit more
sense imo.

And just in case... previously I said that (perhaps) it can use
__wake_up_pollfree() but no, this would be obviously wrong.

------------------------------------------------------------------
Now let's recall about the PIDFD_THREAD waiters. exit_notify() does
		
	/*
	 * sub-thread or delay_group_leader(), wake up the
	 * PIDFD_THREAD waiters.
	 */
	if (!thread_group_empty(tsk))
		do_notify_pidfd(tsk);

and it would be nice to not wakeup the non-PIDFD_THREAD waiters.

I was thinking about something like the changes below but

	- I am NOT sure it will work! I need to read the code
	  in fs/select.c

	- in fact I am not sure this makes a lot of sense, and
	  the hack in pidfd_poll() doesn't look very nice even
	  _if_ it can work.

--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2081,6 +2081,13 @@ static __poll_t pidfd_poll(struct file *file, struct poll_table_struct *pts)
 	struct task_struct *task;
 	__poll_t poll_flags = 0;
 
+	if (thread && pts && pts->_qproc) {
+		// We are not registered yet. Update the key to mark
+		// us a a PIDFD_THREAD waiter, __pollwait() will copy
+		// this ->_key to poll_table_entry->key.
+		if (pts->_key & EPOLLIN) // exclude the POLLHUP-only waiters
+			pts->_key |= EPOLLMSG; // random flag
+	}
 	poll_wait(file, &pid->wait_pidfd, pts);
 	/*
 	 * Depending on PIDFD_THREAD, inform pollers when the thread

Now, do_notify_pidfd() can do

	if (!thread_group_empty(tsk))
		mask = EPOLLMSG; // matches the hack in pidfd_poll
	else
		mask = EPOLLIN | EPOLLRDNORM;

	__wake_up(..., poll_to_key(mask));

Yes, in this case it makes more sense to pass !thread_group_empty()
as a "bool thread" argument.

---------------------------------------------------------------------

What do you think?

I am starting to think that I shouldn't have delayed the 1st trivial
change. Feel free to push it, with or without rename, with or without
the new wake_up_all_poll() helper, I am fine either way. But please
don't add the new "int dead" argument, afaics it makes no sense.

Thanks,

Oleg.


  parent reply	other threads:[~2024-02-03 12:05 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-02 13:11 [PATCH 0/3] pidfd_poll: report POLLHUP when pid_task() == NULL Oleg Nesterov
2024-02-02 13:12 ` [PATCH 1/3] " Oleg Nesterov
2024-02-02 14:44   ` Christian Brauner
2024-02-02 15:16     ` Christian Brauner
2024-02-02 16:07     ` Oleg Nesterov
2024-02-02 17:24       ` Christian Brauner
2024-02-02 19:05         ` Oleg Nesterov
2024-02-02 19:50           ` Christian Brauner
2024-02-03 12:04           ` Oleg Nesterov [this message]
2024-02-03 16:46             ` Christian Brauner
2024-02-05 14:08               ` Oleg Nesterov
2024-02-02 13:12 ` [PATCH 2/3] pidfd: kill the no longer needed do_notify_pidfd() in de_thread() Oleg Nesterov
2024-02-02 13:12 ` [PATCH 3/3] pid: kill the obsolete PIDTYPE_PID code in transfer_pid() Oleg Nesterov
2024-02-02 15:49 ` [PATCH 0/3] pidfd_poll: report POLLHUP when pid_task() == NULL Christian Brauner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240203120425.GA30029@redhat.com \
    --to=oleg@redhat.com \
    --cc=brauner@kernel.org \
    --cc=ebiederm@xmission.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@amacapital.net \
    --cc=tycho@tycho.pizza \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).