All of lore.kernel.org
 help / color / mirror / Atom feed
From: Casey Dahlin <cdahlin@redhat.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: Linux Kernel <linux-kernel@vger.kernel.org>,
	Scott James Remnant <scott@netsplit.com>
Subject: Re: [RFC PATCH] waitfd: file descriptor to wait on child processes
Date: Tue, 09 Dec 2008 15:09:41 -0500	[thread overview]
Message-ID: <493ED085.9010701@redhat.com> (raw)
In-Reply-To: <alpine.DEB.1.10.0812091127160.17144@alien.or.mcafeemobile.com>

Davide Libenzi wrote:
> On Tue, 9 Dec 2008, Casey Dahlin wrote:
>
>   
>> This is essentially my first kernel patch, so be nice :)
>>
>> Linux already has signalfd, timerfd, and eventfd to expose signals, timers,
>> and events via a file descriptor. This patch is a working prototype for a
>> fourth: waitfd. It pretty much does what the name suggests: reading from it
>> yields a series of status ints (as would be written into the second argument
>> of waitpid) for child processes that have changed state. It takes essentially
>> the same arguments as waitpid (for now) and supports the same set of features.
>>
>> This is far from ready for merge. Some things that are wrong with it:
>> * Waitpid's argument scheme probably isn't the best for this. By default it
>> makes it easiest to wait on a single child, which is not often useful in this
>> case. Waiting on all children or children in a particular process group is
>> possible, but not a particular, explicit set of children, which we probably
>> want (and which will complicate the implementation significantly).
>> * The prototype for peek_waitpid is obviously in the wrong place, but I
>> haven't found a good home for it.
>> * Waitid's semantics have slightly altered: passing NULL as the pointer to
>> siginfo_t with WNOWAIT will now return successfully instead of throwing
>> EFAULT. I don't know if that means I broke it or fixed it :)
>> * peek_waitpid may not be required at all now. I can probably trick sys_wait4
>> or sys_waitid into giving me what I want (or I could always just make do_wait
>> non-static).
>>
>> Please provide thoughts.
>>     
>
> What's wrong in having a signalfd on SIGCHLD, than doing waitpid() once 
> you get the signal? Do you have cases where this wouldn't be OK?
>   

When the child doesn't send SIGCHLD (clone() lets you do evil things :)

Seriously though, that may be an option. Scott (CC'd ) is primarily the 
consumer of this, so he can better comment.

> About the code, eventually, you really want to report the PID of the 
> exited child, together with the status. So maybe a non-compat-requiring 
> struct would be better to be returned by read(2).
> Also ...
>
>
>
>
>   
>> +static unsigned int waitfd_poll(struct file *file, poll_table *wait)
>> +{
>> +    struct waitfd_ctx *ctx = file->private_data;
>> +    long value;
>> +
>> +    poll_wait(file, &current->signal->wait_chldexit, wait);
>> +
>> +    value = peek_waitpid(ctx->upid, ctx->ops);
>> +    if (value > 0) {
>> +        return POLLIN;
>> +    } if (value == -ECHILD) {
>> +        return POLLIN;
>> +    }
>>     
>
> Trust the compiler, it's pretty good in not having you to add Perl-like 
> extra brackets ;)
> This also looks wierd:
>
> 	} if (value == -ECHILD) {
>
> So maybe:
>
> 	return value > 0 || value == -ECHILD ? POLLIN: 0;
>
>
>   

Yeah, I ripped this up because during debugging poll was behaving oddly, 
and I had been instrumenting the code with printks. It can go back to 
being 1 line now.

>
>
>   
>> +/*
>> + * Returns a multiple of the size of a "struct waitfd_siginfo", or a negative
>> + * error code. The "count" parameter must be at least the size of a
>> + * "struct waitfd_siginfo".
>> + */
>>     
>
> Really? ...
>
>
>   

Wow... and I read this patch before I sent it, too *facepalm*

The function did briefly return a siginfo_t (I implemented on top of 
waitid first).

>   
>> +static ssize_t waitfd_read(struct file *file, char __user *buf, size_t count,
>> +                 loff_t *ppos)
>> +{
>> +    struct waitfd_ctx *ctx = file->private_data;
>> +    int __user *stat_addr = (int *)buf;
>> +    int nonblock = file->f_flags & O_NONBLOCK ? WNOHANG: 0;
>> +    ssize_t ret, total = 0;
>> +
>> +    count /= sizeof(int);
>> +    if (!count)
>> +        return -EINVAL;
>> +
>> +    do {
>> +        ret = sys_wait4(ctx->upid, stat_addr, ctx->ops | nonblock,
>> +                NULL);
>> +        if (ret == 0)
>> +            ret = -EAGAIN;
>> +        if (ret == -ECHILD)
>> +            ret = 0;
>> +        if (ret <= 0)
>> +            break;
>> +
>> +        stat_addr++;
>> +        total += sizeof(struct siginfo);
>> +        nonblock = WNOHANG;
>> +    } while (--count);
>> +
>> +    return total ? total: ret;
>> +}
>>     
>
> ... looks like you're returning a sequence of status ints, with wrong data 
> size returned by read(2).
>
>
>   
More leftovers from above.
>
>   
>> +
>> +static const struct file_operations waitfd_fops = {
>> +    .release    = waitfd_release,
>> +    .poll        = waitfd_poll,
>> +    .read        = waitfd_read,
>> +};
>> +
>> +asmlinkage long sys_waitfd(pid_t upid, int ops)
>>     
>
> Please leave space for extra flags for fds, otherwise Uli will have to 
> make another sys_waitfd3().
>
>
>   
ack'd.
>   
>> +long peek_waitpid(pid_t upid, int options)
>> +{
>> +    struct pid *pid = NULL;
>> +    enum pid_type type;
>> +    long ret;
>> +
>> +    if (options & ~(WNOHANG|WUNTRACED|WCONTINUED|
>> +            __WNOTHREAD|__WCLONE|__WALL))
>> +        return -EINVAL;
>> +
>> +    options |= WNOHANG | WNOWAIT;
>> +
>> +    if (upid == -1)
>> +        type = PIDTYPE_MAX;
>> +    else if (upid < 0) {
>> +        type = PIDTYPE_PGID;
>> +        pid = find_get_pid(-upid);
>> +    } else if (upid == 0) {
>> +        type = PIDTYPE_PGID;
>> +        pid = get_pid(task_pgrp(current));
>> +    } else /* upid > 0 */ {
>> +        type = PIDTYPE_PID;
>> +        pid = find_get_pid(upid);
>> +    }
>> +
>> +    ret = do_wait(type, pid, options | WEXITED, NULL, NULL, NULL);
>> +    put_pid(pid);
>> +
>> +    /* avoid REGPARM breakage on x86: */
>> +    asmlinkage_protect(4, ret, upid, options);
>> +    return ret;
>> +}
>>     
>
> Given that you blinded copied this from sys_wait4(), you may want to at 
> least try to make sys_wait4() to re-use the new function.
>   

Good idea.

> Also, your patch does not apply to latest Linus tree. Which one was the 
> base?
>
>   
This is the last commit before mine in my git repo:

commit f7a8db89c1f42e504bb12d2ae399cd96f755a7db
Merge: 6f84b4d... c49b9f2...
Author: Linus Torvalds <torvalds@linux-foundation.org>
Date:   Mon Dec 8 19:52:43 2008 -0800

    Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6
   
    * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-2.6:
      tproxy: fixe a possible read from an invalid location in the 
socket match
      zd1211rw: use unaligned safe memcmp() in-place of compare_ether_addr()
          (...lots and lots of changes snipped...)

--CJD



  reply	other threads:[~2008-12-09 20:09 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2008-12-09 17:00 [RFC PATCH] waitfd: file descriptor to wait on child processes Casey Dahlin
2008-12-09 17:05 ` Alan Cox
2008-12-09 17:12   ` Scott James Remnant
2008-12-09 18:46   ` Casey Dahlin
2008-12-09 19:04     ` Alan Cox
2008-12-09 19:21       ` Casey Dahlin
2008-12-09 19:41 ` Davide Libenzi
2008-12-09 20:09   ` Casey Dahlin [this message]
2008-12-12 23:28   ` Scott James Remnant
2008-12-13  4:29     ` Davide Libenzi
2008-12-13  8:43       ` Scott James Remnant
2008-12-13 18:39         ` Davide Libenzi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=493ED085.9010701@redhat.com \
    --to=cdahlin@redhat.com \
    --cc=davidel@xmailserver.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=scott@netsplit.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.