public inbox for linux-kernel@vger.kernel.org
From: Oleg Nesterov <oleg@redhat.com>
To: Scott James Remnant <scott@canonical.com>
Cc: Roland McGrath <roland@redhat.com>, Ingo Molnar <mingo@elte.hu>,
	Casey Dahlin <cdahlin@redhat.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Randy Dunlap <randy.dunlap@oracle.com>,
	Davide Libenzi <davidel@xmailserver.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>
Subject: Re: [RESEND][RFC PATCH v2] waitfd
Date: Sat, 10 Jan 2009 23:24:34 +0100
Message-ID: <20090110222434.GA24414@redhat.com>
In-Reply-To: <1231618407.11642.196.camel@quest>

On 01/10, Scott James Remnant wrote:
>
> On Sat, 2009-01-10 at 19:13 +0100, Oleg Nesterov wrote:
>
> > I never argued with this. And, let me repeat. I am not arguing against
> > waitfd! Actually, I always try to avoid the "do we need this feature"
> > discussions.
> >
> Unless I'm misinterpreting you, you're saying that you don't understand
> why we should change any current behaviour?  My post is attempting to
> illustrate why we should.

Scott. How many times should I repeat: I am _not_ arguing against
waitfd.

But to clarify, I am not voting for it either. I don't really care,
except that I do care about the code if it is going to be merged; that
is why I entered this thread.

> > What I disagree with is that waitfd adds functionality which does
> > not exist currently.
> >
> I'm not saying that it doesn't at all; in fact I gave an example of how
> you implement the exact same functionality today.

This means I was confused, because I thought your point was that we can't
poll for children without signalfd. All I asked was: why do you think so?
I do understand that waitfd can be handy.

> In fact, because main loops use select()/poll(), for the SIGCHLD case
> you'd never use signalfd() at all!
>
> Unless I'm missing something, the following two examples are identical
> in behaviour:
>
> using signalfd:
> ...
> using pselect:

Yes, and that is why I mentioned that ppoll() alone is enough.
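
For illustration only (this is not code from the thread), here is a
minimal sketch of waiting for a child with ppoll() and no signalfd. The
names wait_child_with_ppoll() and on_sigchld() are made up for this
example, and the 5 second timeout is just a safety net, not part of the
technique. SIGCHLD stays blocked except while ppoll() sleeps, so an exit
that happens between waitpid() and ppoll() is not lost.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: SIGCHLD only has to interrupt ppoll(). */
static void on_sigchld(int sig)
{
	(void)sig;
}

/*
 * Hypothetical helper: fork a child that exits with exit_code, sleep in
 * ppoll() until it is gone, and return the reaped exit code (-1 on error).
 */
int wait_child_with_ppoll(int exit_code)
{
	struct sigaction sa, old_sa;
	sigset_t block, saved, in_ppoll;
	int status = 0, ret = -1;
	pid_t pid, r;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigchld;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, &old_sa) < 0)
		return -1;

	/* Keep SIGCHLD blocked everywhere except inside ppoll(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &block, &saved) < 0) {
		sigaction(SIGCHLD, &old_sa, NULL);
		return -1;
	}
	in_ppoll = saved;
	sigdelset(&in_ppoll, SIGCHLD);

	pid = fork();
	if (pid == 0)
		_exit(exit_code);

	while (pid > 0) {
		struct timespec timeout = { 5, 0 };	/* safety net only */

		r = waitpid(pid, &status, WNOHANG);
		if (r == pid) {
			ret = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
			break;
		}
		if (r < 0)
			break;
		/*
		 * If the child exits right here, SIGCHLD stays pending and
		 * ppoll() returns at once with EINTR instead of sleeping.
		 */
		if (ppoll(NULL, 0, &timeout, &in_ppoll) < 0 && errno != EINTR)
			break;
	}

	sigprocmask(SIG_SETMASK, &saved, NULL);
	sigaction(SIGCHLD, &old_sa, NULL);
	return ret;
}
```

A real main loop would pass its own pollfd array instead of NULL and
would reap with waitpid(-1, ...) rather than a single pid.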

> But the pselect() version is neater.  Which is why I started the
> previous reply off with "why have signalfd() at all?"

Unlike waitfd, there are things which we simply cannot do without signalfd,
even if we have ppoll/pselect. For example: wait for a signal without
dequeueing it.
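
A rough sketch of that case, under the assumption that SIGUSR1 is free
to use in the calling process. The function name wait_without_dequeue()
is invented for this example. poll() on the signalfd reports the signal
as readable, and because nothing ever calls read() on the descriptor,
sigpending() still shows it afterwards.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <signal.h>
#include <sys/signalfd.h>
#include <unistd.h>

/*
 * Returns 1 if poll() on a signalfd reported SIGUSR1 and the signal is
 * still pending afterwards (we waited without dequeueing it), 0 if not,
 * -1 on error. SIGUSR1 is left blocked and pending on return.
 */
int wait_without_dequeue(void)
{
	sigset_t mask, pending;
	struct pollfd pfd;
	int sfd, ret = -1;

	sigemptyset(&mask);
	sigaddset(&mask, SIGUSR1);

	/* Block the signal so it stays queued instead of being delivered. */
	if (sigprocmask(SIG_BLOCK, &mask, NULL) < 0)
		return -1;

	sfd = signalfd(-1, &mask, SFD_CLOEXEC);
	if (sfd < 0)
		return -1;

	/* Make the signal pending for the calling thread. */
	if (raise(SIGUSR1) != 0)
		goto out;

	pfd.fd = sfd;
	pfd.events = POLLIN;
	pfd.revents = 0;
	if (poll(&pfd, 1, 1000) != 1 || !(pfd.revents & POLLIN)) {
		ret = 0;
		goto out;
	}

	/* No read() on sfd happened, so the signal must still be pending. */
	if (sigpending(&pending) < 0)
		goto out;
	ret = sigismember(&pending, SIGUSR1) ? 1 : 0;
out:
	close(sfd);
	return ret;
}
```

With ppoll()/pselect() alone the only way to learn about the signal is
to let it be delivered to a handler, which consumes it.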

> One of them was attempting to explain what you don't understand here,
> I'll try and be more verbose...
> ...
> Calling waitpid() does not clear the pending signal.
>
> This is the important bit.
>
> If a further process dies while we're inside the waitpid() loop, we will
> most likely reap that straight away.  But this does not clear the
> pending signal.  The main loop will be woken up again, even though it
> does not need to be.
>
> Thus:
>
>  - child process #1 dies
>  - main loop woken up by SIGCHLD
>  - pending status of signal cleared
>  - enter wait loop
>  - child process #2 dies
>  - SIGCHLD pending again
>  - waitpid() called first time, child process #1 reaped
>  - waitpid() called second time, child process #2 reaped
>    (SIGCHLD still pending)
>  - waitpid() called third time, no child processes remain
>  - exit wait loop
>  - back to top of main loop, immediately woken up by pending SIGCHLD
>  - pending status of signal cleared
>  - enter wait loop
>  - waitpid() called first time, but no child processes remain
>    (we reaped it last time round)
>  - exit wait loop
>  - back to top of main loop, sleep

Scott, I don't really understand why you are trying to explain all
this to me. I do understand this. At least I hope ;)

Yes this is possible, and I see no problems here.

>  - SIGCHLD not pending, but waitpid() will not block
>
>    This is true in all example usage; after you've called the read() on
>    the signalfd - or the pselect() has woken, SIGCHLD is probably no
>    longer pending but waitpid() will not block
>
>    Compare with select() behaviour; if you fail to read() from the fd,
>    select() wakes up yet again
>
>  - SIGCHLD pending, but waitpid() will block
>
>    This is true if you exhaust the wait queue in a loop,

... and this too.

> All SIGCHLD is useful for is to get your main loop out of
> select()/poll(); you must always exhaust the wait queue every time you
> have woken up.

Yes, and yes, and yes. Scott, I am sorry, I failed to read to the end
so perhaps I missed something ;)

> --- kernel/signal.c~	2009-01-10 20:04:50.000000000 +0000
> +++ kernel/signal.c	2009-01-10 20:05:24.000000000 +0000
> @@ -816,8 +816,10 @@
>  	 * exactly one non-rt signal, so that we can get more
>  	 * detailed information about the cause of the signal.
>  	 */
> -	if (legacy_queue(pending, sig))
> +	if (legacy_queue(pending, sig)) {
> +		signalfd_notify(t, sig);
>  		return 0;
> +	}

I'd prefer not to discuss this here, but I am not sure I understand.
There should be no threads which need the wakeup from here, and
I can't see how this change can help.

> A more orthogonal example would be pselect().  That implemented, in the
> kernel, a syscall that it actually wasn't possible to implement in
> userspace

Yes, exactly,

> The argument for waitfd() or similar in the kernel is because there are
> races in userspace that we can't solve.

And now I don't understand you again. Please show me which races we
_can not_ solve in userspace without waitfd?

Yes, we can race with exiting children while doing waitpid() in a loop,
so we may make an unnecessary syscall. But please do not tell me _this_
is the race we can't solve. This is _harmless_, unlike the problems
with the poor user-space implementations of pselect/ppoll.

Oleg.

