From: Michael Kerrisk <mtk.manpages@googlemail.com>
To: Davide Libenzi <davidel@xmailserver.org>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [patch/rfc] eventfd semaphore-like behavior
Date: Mon, 9 Feb 2009 20:45:10 +1300 [thread overview]
Message-ID: <cfd18e0f0902082345o7e36f4deo27fc3466aba66dad@mail.gmail.com> (raw)
In-Reply-To: <alpine.DEB.1.10.0902041349180.6214@alien.or.mcafeemobile.com>
On Thu, Feb 5, 2009 at 11:58 AM, Davide Libenzi <davidel@xmailserver.org> wrote:
> People started using eventfd in scnarios where before where using pipes.
> Many of them use eventfds in a semaphore-like way, like they were before
> with pipes. The problem with eventfd is that a read() on the fd returns
> and wipes the whole counter, making the use of it as semaphore a little
> bit more cumbersome. You can do a read() followed by a write() of
> COUNTER-1, but IMO it's pretty easy and cheap to make this work w/out
> extra steps. This patch introduces a new eventfd flag that tells eventfd
> to only dequeue 1 from the counter, allowing simple read/write to make it
> behave like a semaphore.
> Simple test here:
>
> http://www.xmailserver.org/eventfd-sem.c
>
>
> Signed-off-by: Davide Libenzi <davidel@xmailserver.org>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Applied this patch against 2.6.29-rc3, and it works as I would expect.
A question or two.... This change is rather specific to a single use
case, so I wonder
a) Are there use cases that require the ability to read an arbitrary
number of units off the eventfd -- i.e., read N units off the eventfd,
rather than just 1?
b) Is it desirable to be able to toggle the EFD_SEMAPHORE behavior on
and off for an eventfd?
It's difficult to see how these use cases could be accommodated in the
current API, but I just thought it worth raising the ideas.
Cheers,
Michael
> - Davide
>
>
> ---
> fs/eventfd.c | 20 +++++++++++---------
> include/linux/eventfd.h | 12 +++++++++++-
> 2 files changed, 22 insertions(+), 10 deletions(-)
>
> Index: linux-2.6.mod/fs/eventfd.c
> ===================================================================
> --- linux-2.6.mod.orig/fs/eventfd.c 2009-02-03 18:13:33.000000000 -0800
> +++ linux-2.6.mod/fs/eventfd.c 2009-02-04 12:16:39.000000000 -0800
> @@ -28,6 +28,7 @@ struct eventfd_ctx {
> * issue a wakeup.
> */
> __u64 count;
> + unsigned int flags;
> };
>
> /*
> @@ -87,22 +88,20 @@ static ssize_t eventfd_read(struct file
> {
> struct eventfd_ctx *ctx = file->private_data;
> ssize_t res;
> - __u64 ucnt;
> + __u64 ucnt = 0;
> DECLARE_WAITQUEUE(wait, current);
>
> if (count < sizeof(ucnt))
> return -EINVAL;
> spin_lock_irq(&ctx->wqh.lock);
> res = -EAGAIN;
> - ucnt = ctx->count;
> - if (ucnt > 0)
> + if (ctx->count > 0)
> res = sizeof(ucnt);
> else if (!(file->f_flags & O_NONBLOCK)) {
> __add_wait_queue(&ctx->wqh, &wait);
> for (res = 0;;) {
> set_current_state(TASK_INTERRUPTIBLE);
> if (ctx->count > 0) {
> - ucnt = ctx->count;
> res = sizeof(ucnt);
> break;
> }
> @@ -117,8 +116,9 @@ static ssize_t eventfd_read(struct file
> __remove_wait_queue(&ctx->wqh, &wait);
> __set_current_state(TASK_RUNNING);
> }
> - if (res > 0) {
> - ctx->count = 0;
> + if (likely(res > 0)) {
> + ucnt = (ctx->flags & EFD_SEMAPHORE) ? 1 : ctx->count;
> + ctx->count -= ucnt;
> if (waitqueue_active(&ctx->wqh))
> wake_up_locked_poll(&ctx->wqh, POLLOUT);
> }
> @@ -166,7 +166,7 @@ static ssize_t eventfd_write(struct file
> __remove_wait_queue(&ctx->wqh, &wait);
> __set_current_state(TASK_RUNNING);
> }
> - if (res > 0) {
> + if (likely(res > 0)) {
> ctx->count += ucnt;
> if (waitqueue_active(&ctx->wqh))
> wake_up_locked_poll(&ctx->wqh, POLLIN);
> @@ -207,7 +207,7 @@ SYSCALL_DEFINE2(eventfd2, unsigned int,
> BUILD_BUG_ON(EFD_CLOEXEC != O_CLOEXEC);
> BUILD_BUG_ON(EFD_NONBLOCK != O_NONBLOCK);
>
> - if (flags & ~(EFD_CLOEXEC | EFD_NONBLOCK))
> + if (flags & ~EFD_FLAGS_SET)
> return -EINVAL;
>
> ctx = kmalloc(sizeof(*ctx), GFP_KERNEL);
> @@ -216,13 +216,14 @@ SYSCALL_DEFINE2(eventfd2, unsigned int,
>
> init_waitqueue_head(&ctx->wqh);
> ctx->count = count;
> + ctx->flags = flags;
>
> /*
> * When we call this, the initialization must be complete, since
> * anon_inode_getfd() will install the fd.
> */
> fd = anon_inode_getfd("[eventfd]", &eventfd_fops, ctx,
> - flags & (O_CLOEXEC | O_NONBLOCK));
> + flags & EFD_SHARED_FCNTL_FLAGS);
> if (fd < 0)
> kfree(ctx);
> return fd;
> @@ -232,3 +233,4 @@ SYSCALL_DEFINE1(eventfd, unsigned int, c
> {
> return sys_eventfd2(count, 0);
> }
> +
> Index: linux-2.6.mod/include/linux/eventfd.h
> ===================================================================
> --- linux-2.6.mod.orig/include/linux/eventfd.h 2009-02-03 18:13:33.000000000 -0800
> +++ linux-2.6.mod/include/linux/eventfd.h 2009-02-04 12:16:32.000000000 -0800
> @@ -13,10 +13,20 @@
> /* For O_CLOEXEC and O_NONBLOCK */
> #include <linux/fcntl.h>
>
> -/* Flags for eventfd2. */
> +/*
> + * CAREFUL: Check include/asm-generic/fcntl.h when defining
> + * new flags, since they might collide with O_* ones. We want
> + * to re-use O_* flags that couldn't possibly have a meaning
> + * from eventfd, in order to leave a free define-space for
> + * shared O_* flags.
> + */
> +#define EFD_SEMAPHORE (1 << 0)
> #define EFD_CLOEXEC O_CLOEXEC
> #define EFD_NONBLOCK O_NONBLOCK
>
> +#define EFD_SHARED_FCNTL_FLAGS (O_CLOEXEC | O_NONBLOCK)
> +#define EFD_FLAGS_SET (EFD_SHARED_FCNTL_FLAGS | EFD_SEMAPHORE)
> +
> struct file *eventfd_fget(int fd);
> int eventfd_signal(struct file *file, int n);
>
>
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
git://git.kernel.org/pub/scm/docs/man-pages/man-pages.git
man-pages online: http://www.kernel.org/doc/man-pages/online_pages.html
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
next prev parent reply other threads:[~2009-02-09 7:45 UTC|newest]
Thread overview: 36+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-02-04 22:58 [patch/rfc] eventfd semaphore-like behavior Davide Libenzi
[not found] ` <alpine.DEB.1.10.0902041349180.6214-GPJ85BhbkB8RepQJljzAVbITYcZ0+W3JAL8bYrjMMd8@public.gmane.org>
2009-02-04 23:05 ` Andrew Morton
2009-02-04 23:05 ` Andrew Morton
[not found] ` <20090204150507.665b5b7c.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-04 23:18 ` Davide Libenzi
2009-02-04 23:18 ` Davide Libenzi
[not found] ` <alpine.DEB.1.10.0902041509060.6214-GPJ85BhbkB8RepQJljzAVbITYcZ0+W3JAL8bYrjMMd8@public.gmane.org>
2009-02-04 23:24 ` Andrew Morton
2009-02-04 23:24 ` Andrew Morton
[not found] ` <20090204152434.c8f65d52.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-04 23:27 ` Davide Libenzi
2009-02-04 23:27 ` Davide Libenzi
[not found] ` <alpine.DEB.1.10.0902041526590.6214-GPJ85BhbkB8RepQJljzAVbITYcZ0+W3JAL8bYrjMMd8@public.gmane.org>
2009-02-04 23:48 ` Michael Kerrisk
2009-02-04 23:48 ` Michael Kerrisk
2009-02-04 23:55 ` Andrew Morton
2009-02-04 23:55 ` Andrew Morton
[not found] ` <20090204155514.6abbdc8f.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-04 23:59 ` Michael Kerrisk
2009-02-04 23:59 ` Michael Kerrisk
[not found] ` <cfd18e0f0902041559l4c40d9b6k704068337a00df7a-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-02-05 0:02 ` Davide Libenzi
2009-02-05 0:02 ` Davide Libenzi
[not found] ` <alpine.DEB.1.10.0902041600170.6214-GPJ85BhbkB8RepQJljzAVbITYcZ0+W3JAL8bYrjMMd8@public.gmane.org>
2010-08-30 5:21 ` Michael Kerrisk
2010-08-30 5:21 ` Michael Kerrisk
[not found] ` <AANLkTim_seWr0LZJyC+dPBy-N+_NkzTx21vte7qrwunC-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2010-08-30 5:24 ` Michael Kerrisk
2010-08-30 5:24 ` Michael Kerrisk
2010-08-30 14:22 ` Davide Libenzi
2009-02-05 0:18 ` Andrew Morton
2009-02-05 0:18 ` Andrew Morton
[not found] ` <20090204161816.b93a4588.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
2009-02-05 0:25 ` Davide Libenzi
2009-02-05 0:25 ` Davide Libenzi
[not found] ` <alpine.DEB.1.10.0902041621360.6214-GPJ85BhbkB8RepQJljzAVbITYcZ0+W3JAL8bYrjMMd8@public.gmane.org>
2009-02-05 0:34 ` Andrew Morton
2009-02-05 0:34 ` Andrew Morton
2009-02-05 0:26 ` Michael Kerrisk
2009-02-04 23:45 ` Ulrich Drepper
2009-02-04 23:45 ` Ulrich Drepper
[not found] ` <a36005b50902041545qb9ce459y710bb3b9b8949b17-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2009-02-04 23:49 ` Michael Kerrisk
2009-02-04 23:49 ` Michael Kerrisk
2009-02-09 7:45 ` Michael Kerrisk [this message]
2009-02-09 17:39 ` Davide Libenzi
[not found] <499E527B.2030702@fastmq.com>
2009-02-20 7:00 ` Martin Sustrik
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cfd18e0f0902082345o7e36f4deo27fc3466aba66dad@mail.gmail.com \
--to=mtk.manpages@googlemail.com \
--cc=akpm@linux-foundation.org \
--cc=davidel@xmailserver.org \
--cc=linux-kernel@vger.kernel.org \
--cc=mtk.manpages@gmail.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.