From: Anthony Liguori <anthony@codemonkey.ws>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
	Anthony Liguori <aliguori@us.ibm.com>,
	Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	qemu-devel <qemu-devel@nongnu.org>, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] Use qemu_eventfd for POSIX AIO
Date: Tue, 27 Sep 2011 09:36:31 -0500
Message-ID: <4E81DF6F.8080905@codemonkey.ws>
In-Reply-To: <4E81DDDF.8050201@siemens.com>

On 09/27/2011 09:29 AM, Jan Kiszka wrote:
> On 2011-09-27 16:07, Anthony Liguori wrote:
>> On 09/27/2011 08:56 AM, Jan Kiszka wrote:
>>> Move qemu_eventfd unmodified to oslib-posix and use it for signaling
>>> POSIX AIO completions. If native eventfd support is available, this
>>> avoids multiple read accesses to drain multiple pending signals. As
>>> before we use a pipe if eventfd is not supported.
>>>
>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>> ---
>>>    os-posix.c         |   32 --------------------------------
>>>    oslib-posix.c      |   32 +++++++++++++++++++++++++++++++-
>>>    posix-aio-compat.c |   12 ++++++++----
>>>    3 files changed, 39 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/os-posix.c b/os-posix.c
>>> index dbf3b24..a918895 100644
>>> --- a/os-posix.c
>>> +++ b/os-posix.c
>>> @@ -45,10 +45,6 @@
>>>    #include<sys/syscall.h>
>>>    #endif
>>>
>>> -#ifdef CONFIG_EVENTFD
>>> -#include <sys/eventfd.h>
>>> -#endif
>>> -
>>>    static struct passwd *user_pwd;
>>>    static const char *chroot_dir;
>>>    static int daemonize;
>>> @@ -333,34 +329,6 @@ void os_set_line_buffering(void)
>>>        setvbuf(stdout, NULL, _IOLBF, 0);
>>>    }
>>>
>>> -/*
>>> - * Creates an eventfd that looks like a pipe and has EFD_CLOEXEC set.
>>> - */
>>> -int qemu_eventfd(int fds[2])
>>> -{
>>> -#ifdef CONFIG_EVENTFD
>>> -    int ret;
>>> -
>>> -    ret = eventfd(0, 0);
>>> -    if (ret >= 0) {
>>> -        fds[0] = ret;
>>> -        qemu_set_cloexec(ret);
>>> -        if ((fds[1] = dup(ret)) == -1) {
>>> -            close(ret);
>>> -            return -1;
>>> -        }
>>> -        qemu_set_cloexec(fds[1]);
>>> -        return 0;
>>> -    }
>>> -
>>> -    if (errno != ENOSYS) {
>>> -        return -1;
>>> -    }
>>> -#endif
>>> -
>>> -    return qemu_pipe(fds);
>>> -}
>>> -
>>>    int qemu_create_pidfile(const char *filename)
>>>    {
>>>        char buffer[128];
>>> diff --git a/oslib-posix.c b/oslib-posix.c
>>> index a304fb0..8ef7bd7 100644
>>> --- a/oslib-posix.c
>>> +++ b/oslib-posix.c
>>> @@ -47,7 +47,9 @@ extern int daemon(int, int);
>>>    #include "trace.h"
>>>    #include "qemu_socket.h"
>>>
>>> -
>>> +#ifdef CONFIG_EVENTFD
>>> +#include <sys/eventfd.h>
>>> +#endif
>>>
>>>    int qemu_daemon(int nochdir, int noclose)
>>>    {
>>> @@ -139,6 +141,34 @@ int qemu_pipe(int pipefd[2])
>>>        return ret;
>>>    }
>>>
>>> +/*
>>> + * Creates an eventfd that looks like a pipe and has EFD_CLOEXEC set.
>>> + */
>>> +int qemu_eventfd(int fds[2])
>>> +{
>>> +#ifdef CONFIG_EVENTFD
>>> +    int ret;
>>> +
>>> +    ret = eventfd(0, 0);
>>> +    if (ret >= 0) {
>>> +        fds[0] = ret;
>>> +        qemu_set_cloexec(ret);
>>> +        if ((fds[1] = dup(ret)) == -1) {
>>> +            close(ret);
>>> +            return -1;
>>> +        }
>>> +        qemu_set_cloexec(fds[1]);
>>> +        return 0;
>>> +    }
>>> +
>>> +    if (errno != ENOSYS) {
>>> +        return -1;
>>> +    }
>>> +#endif
>>> +
>>> +    return qemu_pipe(fds);
>>> +}
>>> +
>>
>> I think it's a bit dangerous to implement eventfd() in terms of pipe().
>>
>> You don't expect to handle EAGAIN with eventfd() whereas you have to handle it
>> with pipe().
>
> EAGAIN is returned on eventfd read if no event is pending and the fd is
> non-blocking - just as we configure it.
>
>>
>> Moreover, the eventfd() counter is not lossy (practically speaking) whereas if
>> you use pipe() as a counter, it will be lossy in practice.
>>
>> This is why posix aio uses pipe() and not eventfd().
>
> I don't get this yet. eventfd is lossy by default. It only decreases the
> counter on read if you specify EFD_SEMAPHORE - which we do not do.

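/* fd is non-blocking, so write() to a full pipe fails with EAGAIN */
uint64_t value = 1;
uint64_t count = 0;
uint64_t i;

for (i = 0; i < (1ULL << 32); i++) {
    write(fd, &value, sizeof(value));
}

/* stop at the first failed read so a stale value is not counted */
while (read(fd, &value, sizeof(value)) == sizeof(value)) {
    count += value;
}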

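With eventfd, count == 2^32.  With pipe, count == 8192 (assuming the default
64 KiB pipe buffer on Linux, which holds 8192 8-byte values; the remaining
writes fail with EAGAIN).

That's because each '1' is stored in the pipe buffer, whereas with eventfd a
counter is just incremented.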
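
To make the difference concrete, here is a scaled-down sketch of the same
experiment.  It assumes Linux with EFD_NONBLOCK available, and the helper
names (write_then_drain, count_via_eventfd, count_via_pipe) are made up for
illustration; they are not part of QEMU.  The pipe result depends on the pipe
buffer size, so treat the exact number as system-dependent.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Write up to n ones to wfd, then drain rfd.  Both fds must be
 * non-blocking.  Returns the sum of all values read back. */
static uint64_t write_then_drain(int rfd, int wfd, uint64_t n)
{
    uint64_t value = 1, count = 0, i;

    for (i = 0; i < n; i++) {
        if (write(wfd, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
            /* Full pipe: this event and all later ones would be lost. */
            break;
        }
    }

    /* Stop at the first failed read (EAGAIN once nothing is pending). */
    while (read(rfd, &value, sizeof(value)) == (ssize_t)sizeof(value)) {
        count += value;
    }
    return count;
}

/* Hypothetical helper: signal n events through an eventfd. */
uint64_t count_via_eventfd(uint64_t n)
{
    int fd = eventfd(0, EFD_NONBLOCK);
    uint64_t count;

    assert(fd >= 0);
    count = write_then_drain(fd, fd, n);
    close(fd);
    return count;
}

/* Hypothetical helper: signal n events through a non-blocking pipe. */
uint64_t count_via_pipe(uint64_t n)
{
    int fds[2];
    uint64_t count;
    int rc = pipe(fds);

    assert(rc == 0);
    fcntl(fds[0], F_SETFL, O_NONBLOCK);
    fcntl(fds[1], F_SETFL, O_NONBLOCK);
    count = write_then_drain(fds[0], fds[1], n);
    close(fds[0]);
    close(fds[1]);
    return count;
}
```

Calling both helpers with n = 200000 from a small main(), I would expect
count_via_eventfd() to return 200000 and count_via_pipe() to return far less
(8192 with a 64 KiB pipe buffer), since the pipe stops accepting writes once
it is full.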

Not sure what you mean re: EFD_SEMAPHORE.  EFD_SEMAPHORE basically means any 
non-zero value is returned as 1 and the counter is decremented by 1.  Without 
EFD_SEMAPHORE, the count is returned and the counter is reset to 0.
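
The two read modes can be compared with a small sketch like the one below.
It assumes Linux 2.6.30 or later (where EFD_SEMAPHORE exists), and
drain_reads is a hypothetical helper written for this example, not a QEMU
function.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <sys/types.h>
#include <unistd.h>

/* Add n to a fresh eventfd counter created with the given flags, then
 * drain it.  Returns the number of successful reads; *first receives the
 * value delivered by the first read (0 if nothing could be read). */
unsigned drain_reads(uint64_t n, int flags, uint64_t *first)
{
    int fd = eventfd(0, flags | EFD_NONBLOCK);
    uint64_t value = n;
    unsigned reads = 0;

    assert(fd >= 0);
    *first = 0;

    /* A single write adds n to the kernel-side counter. */
    if (write(fd, &value, sizeof(value)) != (ssize_t)sizeof(value)) {
        close(fd);
        return 0;
    }

    /* Read until the counter is zero and read() fails with EAGAIN. */
    while (read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value)) {
        if (reads++ == 0) {
            *first = value;
        }
    }

    close(fd);
    return reads;
}
```

With n = 5, drain_reads(5, 0, &first) should need one read and set first to
5, while drain_reads(5, EFD_SEMAPHORE, &first) should need five reads and set
first to 1.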

Regards,

Anthony Liguori

>
> Jan
>
