From: Anthony Liguori <anthony@codemonkey.ws>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
Anthony Liguori <aliguori@us.ibm.com>,
Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>,
Marcelo Tosatti <mtosatti@redhat.com>,
qemu-devel <qemu-devel@nongnu.org>, Avi Kivity <avi@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] Use qemu_eventfd for POSIX AIO
Date: Tue, 27 Sep 2011 09:36:31 -0500
Message-ID: <4E81DF6F.8080905@codemonkey.ws>
In-Reply-To: <4E81DDDF.8050201@siemens.com>

On 09/27/2011 09:29 AM, Jan Kiszka wrote:
> On 2011-09-27 16:07, Anthony Liguori wrote:
>> On 09/27/2011 08:56 AM, Jan Kiszka wrote:
>>> Move qemu_eventfd unmodified to oslib-posix and use it for signaling
>>> POSIX AIO completions. If native eventfd support is available, this
>>> avoids multiple read accesses to drain multiple pending signals. As
>>> before we use a pipe if eventfd is not supported.
>>>
>>> Signed-off-by: Jan Kiszka<jan.kiszka@siemens.com>
>>> ---
>>> os-posix.c | 32 --------------------------------
>>> oslib-posix.c | 32 +++++++++++++++++++++++++++++++-
>>> posix-aio-compat.c | 12 ++++++++----
>>> 3 files changed, 39 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/os-posix.c b/os-posix.c
>>> index dbf3b24..a918895 100644
>>> --- a/os-posix.c
>>> +++ b/os-posix.c
>>> @@ -45,10 +45,6 @@
>>> #include<sys/syscall.h>
>>> #endif
>>>
>>> -#ifdef CONFIG_EVENTFD
>>> -#include<sys/eventfd.h>
>>> -#endif
>>> -
>>> static struct passwd *user_pwd;
>>> static const char *chroot_dir;
>>> static int daemonize;
>>> @@ -333,34 +329,6 @@ void os_set_line_buffering(void)
>>> setvbuf(stdout, NULL, _IOLBF, 0);
>>> }
>>>
>>> -/*
>>> - * Creates an eventfd that looks like a pipe and has EFD_CLOEXEC set.
>>> - */
>>> -int qemu_eventfd(int fds[2])
>>> -{
>>> -#ifdef CONFIG_EVENTFD
>>> - int ret;
>>> -
>>> - ret = eventfd(0, 0);
>>> - if (ret>= 0) {
>>> - fds[0] = ret;
>>> - qemu_set_cloexec(ret);
>>> - if ((fds[1] = dup(ret)) == -1) {
>>> - close(ret);
>>> - return -1;
>>> - }
>>> - qemu_set_cloexec(fds[1]);
>>> - return 0;
>>> - }
>>> -
>>> - if (errno != ENOSYS) {
>>> - return -1;
>>> - }
>>> -#endif
>>> -
>>> - return qemu_pipe(fds);
>>> -}
>>> -
>>> int qemu_create_pidfile(const char *filename)
>>> {
>>> char buffer[128];
>>> diff --git a/oslib-posix.c b/oslib-posix.c
>>> index a304fb0..8ef7bd7 100644
>>> --- a/oslib-posix.c
>>> +++ b/oslib-posix.c
>>> @@ -47,7 +47,9 @@ extern int daemon(int, int);
>>> #include "trace.h"
>>> #include "qemu_socket.h"
>>>
>>> -
>>> +#ifdef CONFIG_EVENTFD
>>> +#include<sys/eventfd.h>
>>> +#endif
>>>
>>> int qemu_daemon(int nochdir, int noclose)
>>> {
>>> @@ -139,6 +141,34 @@ int qemu_pipe(int pipefd[2])
>>> return ret;
>>> }
>>>
>>> +/*
>>> + * Creates an eventfd that looks like a pipe and has EFD_CLOEXEC set.
>>> + */
>>> +int qemu_eventfd(int fds[2])
>>> +{
>>> +#ifdef CONFIG_EVENTFD
>>> + int ret;
>>> +
>>> + ret = eventfd(0, 0);
>>> + if (ret>= 0) {
>>> + fds[0] = ret;
>>> + qemu_set_cloexec(ret);
>>> + if ((fds[1] = dup(ret)) == -1) {
>>> + close(ret);
>>> + return -1;
>>> + }
>>> + qemu_set_cloexec(fds[1]);
>>> + return 0;
>>> + }
>>> +
>>> + if (errno != ENOSYS) {
>>> + return -1;
>>> + }
>>> +#endif
>>> +
>>> + return qemu_pipe(fds);
>>> +}
>>> +
>>
>> I think it's a bit dangerous to implement eventfd() in terms of pipe().
>>
>> You don't expect to handle EAGAIN with eventfd() whereas you have to handle it
>> with pipe().
>
> EAGAIN is returned on eventfd read if no event is pending and the fd is
> non-blocking - just as we configure it.
>
>>
>> Moreover, the eventfd() counter is not lossy (practically speaking) whereas if
>> you use pipe() as a counter, it will be lossy in practice.
>>
>> This is why posix aio uses pipe() and not eventfd().
>
> I don't get this yet. eventfd is lossy by default. It only decreases the
> counter on read if you specify EFD_SEMAPHORE - which we do not do.
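uint64_t value, count = 0, i;

/* fd is non-blocking, so writes to a full pipe fail with EAGAIN */
for (i = 0; i < (1ULL << 32); i++) {
    value = 1;
    write(fd, &value, sizeof(value));
}

/* only add values that were actually read */
while (read(fd, &value, sizeof(value)) > 0) {
    count += value;
}

With eventfd, count == 2^32. With pipe, count == 8192 (assuming the usual
64KB pipe buffer, which holds 8192 8-byte values).

That's because each '1' is stored in the pipe buffer, whereas with eventfd a
counter is just incremented.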
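To make the comparison concrete, here is a minimal, Linux-only sketch of the same experiment with a much smaller iteration count. The helper names (write_then_drain, count_via_eventfd, count_via_pipe) are made up for illustration and are not part of QEMU; the exact number a pipe retains depends on its buffer size, so the sketch only assumes it is smaller than the number of writes.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: switch a descriptor to non-blocking mode. */
static void set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    assert(flags >= 0);
    int rc = fcntl(fd, F_SETFL, flags | O_NONBLOCK);
    assert(rc == 0);
    (void)rc;
}

/* Write n 8-byte ones to wfd, ignoring failures, then drain rfd and
 * return the sum of every value that was actually read back. */
static uint64_t write_then_drain(int rfd, int wfd, uint64_t n)
{
    uint64_t value, count = 0;

    for (uint64_t i = 0; i < n; i++) {
        value = 1;
        ssize_t rc = write(wfd, &value, sizeof(value));
        (void)rc; /* EAGAIN on a full pipe is deliberately ignored */
    }
    while (read(rfd, &value, sizeof(value)) == (ssize_t)sizeof(value)) {
        count += value;
    }
    return count;
}

/* eventfd: every write adds to one kernel counter, nothing is dropped. */
uint64_t count_via_eventfd(uint64_t n)
{
    int fd = eventfd(0, EFD_NONBLOCK);
    assert(fd >= 0);
    uint64_t count = write_then_drain(fd, fd, n);
    close(fd);
    return count;
}

/* pipe: every write occupies 8 bytes of buffer; writes past the
 * buffer capacity fail and are lost. */
uint64_t count_via_pipe(uint64_t n)
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);
    (void)rc;
    set_nonblocking(fds[0]);
    set_nonblocking(fds[1]);
    uint64_t count = write_then_drain(fds[0], fds[1], n);
    close(fds[0]);
    close(fds[1]);
    return count;
}
```

Calling count_via_eventfd(100000) should give back 100000, while count_via_pipe(100000) should give back only as many ones as fit in the pipe buffer (8192 with a 64KB buffer).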
Not sure what you mean re: EFD_SEMAPHORE. With EFD_SEMAPHORE, a read of any
non-zero counter returns 1 and decrements the counter by 1. Without
EFD_SEMAPHORE, a read returns the whole count and resets the counter to 0.
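The two read modes can be checked with a small Linux-only sketch like the one below. The struct and function names (drain_result, drain_eventfd) are hypothetical and chosen only for this illustration: it loads a counter value into a fresh non-blocking eventfd, then records what the first read returns and how many reads succeed before EAGAIN.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical result type: value of the first read, and the number of
 * successful reads before the eventfd reports EAGAIN. */
struct drain_result {
    uint64_t first_value;
    unsigned reads;
};

/* Create a non-blocking eventfd with the given extra flags (0 or
 * EFD_SEMAPHORE), add `initial` to its counter, then read until empty. */
struct drain_result drain_eventfd(int extra_flags, uint64_t initial)
{
    struct drain_result r = { 0, 0 };
    uint64_t value;

    int fd = eventfd(0, EFD_NONBLOCK | extra_flags);
    assert(fd >= 0);

    ssize_t rc = write(fd, &initial, sizeof(initial));
    assert(rc == (ssize_t)sizeof(initial));
    (void)rc;

    while (read(fd, &value, sizeof(value)) == (ssize_t)sizeof(value)) {
        if (r.reads == 0) {
            r.first_value = value;
        }
        r.reads++;
    }
    close(fd);
    return r;
}
```

With a counter of 3, drain_eventfd(0, 3) should need a single read that returns 3, whereas drain_eventfd(EFD_SEMAPHORE, 3) should need three reads that each return 1.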
Regards,
Anthony Liguori
>
> Jan
>