* [PATCH] signalfd: retrieve multiple signals with one read() call
@ 2007-05-20 0:07 Davi Arnaut
2007-05-20 19:17 ` Davide Libenzi
2007-05-21 4:02 ` Andrew Morton
0 siblings, 2 replies; 8+ messages in thread
From: Davi Arnaut @ 2007-05-20 0:07 UTC (permalink / raw)
To: Andrew Morton; +Cc: Davide Libenzi, Linus Torvalds, Linux Kernel Mailing List
Hi,
Gathering signals in bulk enables server applications to drain a signal
queue (almost full of realtime signals) more efficiently by reducing the
syscall and file look-up overhead.
Very similar to the sigtimedwait4() call described by Niels Provos,
Chuck Lever, and Stephen Tweedie in a paper entitled "Analyzing the
Overload Behavior of a Simple Web Server". The paper lists more details
and advantages.
Signed-off-by: Davi E. M. Arnaut <davi@haxent.com.br>
diff --git a/fs/signalfd.c b/fs/signalfd.c
index 7cfeab4..f6303c5 100644
--- a/fs/signalfd.c
+++ b/fs/signalfd.c
@@ -11,6 +11,8 @@
* Now using anonymous inode source.
* Thanks to Oleg Nesterov for useful code review and suggestions.
* More comments and suggestions from Arnd Bergmann.
+ * Sat May 19, 2007: Davi E. M. Arnaut <davi@haxent.com.br>
+ * Retrieve multiple signals with one read() call
*/
#include <linux/file.h>
@@ -206,6 +208,64 @@ static int signalfd_copyinfo(struct signalfd_siginfo __user *uinfo,
return err ? -EFAULT: sizeof(*uinfo);
}
+static ssize_t signalfd_dequeue(struct signalfd_ctx *ctx, siginfo_t *info,
+ int nonblock)
+{
+ int locked;
+ ssize_t ret;
+ struct signalfd_lockctx lk;
+ DECLARE_WAITQUEUE(wait, current);
+
+ locked = signalfd_lock(ctx, &lk);
+ if (!locked)
+ return 0;
+
+ ret = dequeue_signal(lk.tsk, &ctx->sigmask, info);
+ switch (ret) {
+ case 0:
+ if (!nonblock)
+ break;
+ ret = -EAGAIN;
+ default:
+ signalfd_unlock(&lk);
+ return ret;
+ }
+
+ add_wait_queue(&ctx->wqh, &wait);
+ for (;;) {
+ set_current_state(TASK_INTERRUPTIBLE);
+ ret = dequeue_signal(lk.tsk, &ctx->sigmask, info);
+ if (ret != 0)
+ break;
+ if (signal_pending(current)) {
+ ret = -ERESTARTSYS;
+ break;
+ }
+ signalfd_unlock(&lk);
+ schedule();
+ locked = signalfd_lock(ctx, &lk);
+ if (unlikely(!locked)) {
+ /*
+ * Let the caller read zero byte, ala socket
+ * recv() when the peer disconnect. This test
+ * must be done before doing a dequeue_signal(),
+ * because if the sighand has been orphaned,
+ * the dequeue_signal() call is going to crash.
+ */
+ ret = 0;
+ break;
+ }
+ }
+
+ remove_wait_queue(&ctx->wqh, &wait);
+ __set_current_state(TASK_RUNNING);
+
+ if (likely(locked))
+ signalfd_unlock(&lk);
+
+ return ret;
+}
+
/*
* Returns either the size of a "struct signalfd_siginfo", or zero if the
* sighand we are attached to, has been orphaned. The "count" parameter
@@ -215,55 +275,30 @@ static ssize_t signalfd_read(struct file *file, char __user *buf, size_t count,
loff_t *ppos)
{
struct signalfd_ctx *ctx = file->private_data;
- ssize_t res = 0;
- int locked, signo;
+ struct signalfd_siginfo __user *siginfo;
+ int nonblock = file->f_flags & O_NONBLOCK;
+ ssize_t ret, total = 0;
siginfo_t info;
- struct signalfd_lockctx lk;
- DECLARE_WAITQUEUE(wait, current);
- if (count < sizeof(struct signalfd_siginfo))
+ count /= sizeof(struct signalfd_siginfo);
+ if (!count)
return -EINVAL;
- locked = signalfd_lock(ctx, &lk);
- if (!locked)
- return 0;
- res = -EAGAIN;
- signo = dequeue_signal(lk.tsk, &ctx->sigmask, &info);
- if (signo == 0 && !(file->f_flags & O_NONBLOCK)) {
- add_wait_queue(&ctx->wqh, &wait);
- for (;;) {
- set_current_state(TASK_INTERRUPTIBLE);
- signo = dequeue_signal(lk.tsk, &ctx->sigmask, &info);
- if (signo != 0)
- break;
- if (signal_pending(current)) {
- res = -ERESTARTSYS;
- break;
- }
- signalfd_unlock(&lk);
- schedule();
- locked = signalfd_lock(ctx, &lk);
- if (unlikely(!locked)) {
- /*
- * Let the caller read zero byte, ala socket
- * recv() when the peer disconnect. This test
- * must be done before doing a dequeue_signal(),
- * because if the sighand has been orphaned,
- * the dequeue_signal() call is going to crash.
- */
- res = 0;
- break;
- }
- }
- remove_wait_queue(&ctx->wqh, &wait);
- __set_current_state(TASK_RUNNING);
- }
- if (likely(locked))
- signalfd_unlock(&lk);
- if (likely(signo))
- res = signalfd_copyinfo((struct signalfd_siginfo __user *) buf,
- &info);
- return res;
+ siginfo = (struct signalfd_siginfo __user *) buf;
+
+ do {
+ ret = signalfd_dequeue(ctx, &info, nonblock);
+ if (unlikely(ret <= 0))
+ break;
+ ret = signalfd_copyinfo(siginfo, &info);
+ if (ret < 0)
+ break;
+ siginfo++;
+ total += ret;
+ nonblock = 1;
+ } while (--count);
+
+ return total ? total : ret;
}
static const struct file_operations signalfd_fops = {
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-20 0:07 [PATCH] signalfd: retrieve multiple signals with one read() call Davi Arnaut
@ 2007-05-20 19:17 ` Davide Libenzi
2007-05-21 4:02 ` Andrew Morton
1 sibling, 0 replies; 8+ messages in thread
From: Davide Libenzi @ 2007-05-20 19:17 UTC (permalink / raw)
To: Davi Arnaut; +Cc: Andrew Morton, Linus Torvalds, Linux Kernel Mailing List
On Sat, 19 May 2007, Davi Arnaut wrote:
> Hi,
>
> Gathering signals in bulk enables server applications to drain a signal
> queue (almost full of realtime signals) more efficiently by reducing the
> syscall and file look-up overhead.
>
> Very similar to the sigtimedwait4() call described by Niels Provos,
> Chuck Lever, and Stephen Tweedie in a paper entitled "Analyzing the
> Overload Behavior of a Simple Web Server". The paper lists more details
> and advantages.
That paper is a pre-epoll thing. Nowadays I see no value in using RT
signals as main event gathering mechanism ;)
> Signed-off-by: Davi E. M. Arnaut <davi@haxent.com.br>
The patch looks otherwise good to me, and I see no reasons why we
shoulnd't be fetching more than one signal if available.
Acked-by: Davide Libenzi <davidel@xmailserver.org>
- Davide
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-20 0:07 [PATCH] signalfd: retrieve multiple signals with one read() call Davi Arnaut
2007-05-20 19:17 ` Davide Libenzi
@ 2007-05-21 4:02 ` Andrew Morton
2007-05-21 4:14 ` Davide Libenzi
1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2007-05-21 4:02 UTC (permalink / raw)
To: Davi Arnaut; +Cc: Davide Libenzi, Linus Torvalds, Linux Kernel Mailing List
On Sat, 19 May 2007 21:07:11 -0300 Davi Arnaut <davi@haxent.com.br> wrote:
> Hi,
>
> Gathering signals in bulk enables server applications to drain a signal
> queue (almost full of realtime signals) more efficiently by reducing the
> syscall and file look-up overhead.
>
> Very similar to the sigtimedwait4() call described by Niels Provos,
> Chuck Lever, and Stephen Tweedie in a paper entitled "Analyzing the
> Overload Behavior of a Simple Web Server". The paper lists more details
> and advantages.
>
static ssize_t signalfd_read(struct file *file, char __user *buf, size_t count,
loff_t *ppos)
{
struct signalfd_ctx *ctx = file->private_data;
struct signalfd_siginfo __user *siginfo;
int nonblock = file->f_flags & O_NONBLOCK;
ssize_t ret, total = 0;
siginfo_t info;
count /= sizeof(struct signalfd_siginfo);
if (!count)
return -EINVAL;
siginfo = (struct signalfd_siginfo __user *) buf;
do {
ret = signalfd_dequeue(ctx, &info, nonblock);
if (unlikely(ret <= 0))
break;
ret = signalfd_copyinfo(siginfo, &info);
if (ret < 0)
break;
siginfo++;
total += ret;
nonblock = 1;
} while (--count);
return total ? total : ret;
}
If 'count' is not a multiple of sizeof(struct signalfd_siginfo)), the read()
will return the next smallest multiple of `count'.
That is, unless `count' happens to be less than 1*sizeof(struct
signalfd_siginfo)), in which case we return -EINVAL.
This seems inconsistent.
Also, I'm desperately hunting for the place where we zero out that local
siginfo_t, and I ain't finding it. Someone please convince me that we're
not leaking bits of kernel memory out to userspace in that thing.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-21 4:02 ` Andrew Morton
@ 2007-05-21 4:14 ` Davide Libenzi
2007-05-21 4:20 ` Davide Libenzi
2007-05-21 4:26 ` Andrew Morton
0 siblings, 2 replies; 8+ messages in thread
From: Davide Libenzi @ 2007-05-21 4:14 UTC (permalink / raw)
To: Andrew Morton; +Cc: Davi Arnaut, Linus Torvalds, Linux Kernel Mailing List
On Sun, 20 May 2007, Andrew Morton wrote:
> If 'count' is not a multiple of sizeof(struct signalfd_siginfo)), the read()
> will return the next smallest multiple of `count'.
>
> That is, unless `count' happens to be less than 1*sizeof(struct
> signalfd_siginfo)), in which case we return -EINVAL.
>
> This seems inconsistent.
I think it fits the rule "buffer must be big enough for at least one sigingo".
We use the special return 0; as indicator that the process we were
monitoring signals, detached the sighand.
> Also, I'm desperately hunting for the place where we zero out that local
> siginfo_t, and I ain't finding it. Someone please convince me that we're
> not leaking bits of kernel memory out to userspace in that thing.
Hmm, __clear_user()?
- Davide
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-21 4:14 ` Davide Libenzi
@ 2007-05-21 4:20 ` Davide Libenzi
2007-05-21 4:26 ` Andrew Morton
1 sibling, 0 replies; 8+ messages in thread
From: Davide Libenzi @ 2007-05-21 4:20 UTC (permalink / raw)
To: Andrew Morton; +Cc: Davi Arnaut, Linus Torvalds, Linux Kernel Mailing List
On Sun, 20 May 2007, Davide Libenzi wrote:
> I think it fits the rule "buffer must be big enough for at least one sigingo".
^^^^^^^
/me wonders what sizeof(sigingo) is :)
- Davide
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-21 4:14 ` Davide Libenzi
2007-05-21 4:20 ` Davide Libenzi
@ 2007-05-21 4:26 ` Andrew Morton
2007-05-21 5:05 ` Davide Libenzi
1 sibling, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2007-05-21 4:26 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Davi Arnaut, Linus Torvalds, Linux Kernel Mailing List
On Sun, 20 May 2007 21:14:38 -0700 (PDT) Davide Libenzi <davidel@xmailserver.org> wrote:
> On Sun, 20 May 2007, Andrew Morton wrote:
>
> > If 'count' is not a multiple of sizeof(struct signalfd_siginfo)), the read()
> > will return the next smallest multiple of `count'.
> >
> > That is, unless `count' happens to be less than 1*sizeof(struct
> > signalfd_siginfo)), in which case we return -EINVAL.
> >
> > This seems inconsistent.
>
> I think it fits the rule "buffer must be big enough for at least one sigingo".
> We use the special return 0; as indicator that the process we were
> monitoring signals, detached the sighand.
>
hm. Kernel violates proper read() semantics in many places. Looks like we
just did it again.
>
> > Also, I'm desperately hunting for the place where we zero out that local
> > siginfo_t, and I ain't finding it. Someone please convince me that we're
> > not leaking bits of kernel memory out to userspace in that thing.
>
> Hmm, __clear_user()?
oic, yes, that thing. Usually we'd zero out the on-stack struct, assemble
it then copy out the whole thing. I guess doing it the way you have saves
a few instructions. But it's the cache hit against *uinfo which will have
most of the cost, and we can't do anything about that.
Unless we just remove the __clear_user() altogether. Who said that "Unused
memebers should be zero"?
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-21 4:26 ` Andrew Morton
@ 2007-05-21 5:05 ` Davide Libenzi
2007-05-21 5:12 ` Andrew Morton
0 siblings, 1 reply; 8+ messages in thread
From: Davide Libenzi @ 2007-05-21 5:05 UTC (permalink / raw)
To: Andrew Morton; +Cc: Davi Arnaut, Linus Torvalds, Linux Kernel Mailing List
On Sun, 20 May 2007, Andrew Morton wrote:
> > I think it fits the rule "buffer must be big enough for at least one sigingo".
> > We use the special return 0; as indicator that the process we were
> > monitoring signals, detached the sighand.
> >
>
> hm. Kernel violates proper read() semantics in many places. Looks like we
> just did it again.
I think we can have the check that "if size == 0 return 0". The above
cited return-0-on-detch would still apply for enough sized buffers. So:
1) size == 0, return 0 (POSIX wants this)
2) size < sizeof(signalfd_siginfo), return EINVAL
3) size >= sizeof(signalfd_siginfo) && DETACH, return 0
The signalfd falls into what POSIX defined as "special file", and can
return a lower-than-size result.
> Unless we just remove the __clear_user() altogether. Who said that "Unused
> memebers should be zero"?
Because it is a typically used value for still-unused/reserved members?
Better than random values I think ;)
Members validity is driven by si_code & SI_MASK anyway.
- Davide
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] signalfd: retrieve multiple signals with one read() call
2007-05-21 5:05 ` Davide Libenzi
@ 2007-05-21 5:12 ` Andrew Morton
0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2007-05-21 5:12 UTC (permalink / raw)
To: Davide Libenzi; +Cc: Davi Arnaut, Linus Torvalds, Linux Kernel Mailing List
On Sun, 20 May 2007 22:05:00 -0700 (PDT) Davide Libenzi <davidel@xmailserver.org> wrote:
> On Sun, 20 May 2007, Andrew Morton wrote:
>
> > > I think it fits the rule "buffer must be big enough for at least one sigingo".
> > > We use the special return 0; as indicator that the process we were
> > > monitoring signals, detached the sighand.
> > >
> >
> > hm. Kernel violates proper read() semantics in many places. Looks like we
> > just did it again.
>
> I think we can have the check that "if size == 0 return 0". The above
> cited return-0-on-detch would still apply for enough sized buffers. So:
>
> 1) size == 0, return 0 (POSIX wants this)
>
> 2) size < sizeof(signalfd_siginfo), return EINVAL
>
> 3) size >= sizeof(signalfd_siginfo) && DETACH, return 0
>
> The signalfd falls into what POSIX defined as "special file", and can
> return a lower-than-size result.
>
hm, well. I'd suggest that we do what makes most sense, rather than
warping things to try to obey the letter of posix.
>
> > Unless we just remove the __clear_user() altogether. Who said that "Unused
> > memebers should be zero"?
>
> Because it is a typically used value for still-unused/reserved members?
> Better than random values I think ;)
> Members validity is driven by si_code & SI_MASK anyway.
Sure. And it'd be a bit rude to return 128 from the read() but to only
have written to a few bytes of the user's memory.
otoh, only-writing-a-few-bytes will be usefully quicker than zapping the
whole 128b, particularly on small-cacheline CPUs.
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2007-05-21 5:15 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-05-20 0:07 [PATCH] signalfd: retrieve multiple signals with one read() call Davi Arnaut
2007-05-20 19:17 ` Davide Libenzi
2007-05-21 4:02 ` Andrew Morton
2007-05-21 4:14 ` Davide Libenzi
2007-05-21 4:20 ` Davide Libenzi
2007-05-21 4:26 ` Andrew Morton
2007-05-21 5:05 ` Davide Libenzi
2007-05-21 5:12 ` Andrew Morton
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox