* Comparing the aio and epoll event frameworks.
@ 2003-05-19 23:33 John Myers
  2003-05-20  0:38 ` Davide Libenzi
  0 siblings, 1 reply; 12+ messages in thread
From: John Myers @ 2003-05-19 23:33 UTC (permalink / raw)
  To: linux-aio; +Cc: linux-kernel

The following documents my understanding of the differences between
the epoll and aio event frameworks.  These differences stem from the
fact that epoll is designed for use by single threaded callers,
whereas aio is designed for use by multithreaded, thread pool callers.

I do not intend to criticise either design choice--each model (single
threaded vs. thread pool) has its uses and each model has requirements
of the event framework which conflict with the requirements of the
other.

The single threaded model has the advantage of more efficient use of a
single CPU and its associated cache.  A single threaded caller also
tends to have fewer locking issues to deal with.  As a result,
correctly written single threaded code tends to have a higher
throughput per CPU than thread pool code.

The thread pool model permits the application developer to write
blocking code.  Asynchronous code takes more time to write and debug,
especially when one is starting from an existing code base with
blocking code and when one needs to use third party libraries with
blocking APIs.  The thread pool model permits one to code
asynchronously only that 5% of the code where the program waits over
95% of the time, leaving the worker threads to deal with the rest.
The reduction in throughput one pays over the single threaded model is
effectively insurance against having the entire server stalled by an
overlooked blocking call or a page fault.

The biggest difference between the two frameworks is in the
cancellation semantics.  epoll gives single threaded callers a
guarantee that after a legitimate cancel (EPOLL_CTL_DEL) operation
returns to the caller there is no possibility of an in-progress event
for the canceled request being delivered through a subsequent call to
epoll_wait().  This meets a desire for single threaded callers to
not have to deal with cancel/complete races and permits them to
immediately free their application-side per-connection state.
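
A minimal sketch of that guarantee from a single threaded caller's point of view. The struct conn type and the function name are hypothetical, and epoll_create1() and the NULL event argument to EPOLL_CTL_DEL come from later kernels than the ones discussed in this thread. An event is made pending, the fd is removed, the state is freed at once, and a following epoll_wait() reports nothing for it.

```c
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-connection state owned by the application. */
struct conn {
    int fd;
};

/*
 * Make an event pending, cancel interest with EPOLL_CTL_DEL, free the
 * state right away, then poll once more.  Returns the number of events
 * epoll_wait() reports after the cancel, or -1 on setup failure.
 */
int cancel_then_free_demo(void)
{
    int sv[2], epfd, n = -1;
    struct epoll_event ev = {0}, out;
    struct conn *c = NULL;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create1(0);
    c = malloc(sizeof(*c));
    if (epfd < 0 || c == NULL)
        goto out;
    c->fd = sv[0];

    ev.events = EPOLLIN;
    ev.data.ptr = c;                 /* the event would carry this pointer */
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, c->fd, &ev) < 0)
        goto out;

    /* Data arrives, so an event is pending before the cancel. */
    if (write(sv[1], "x", 1) != 1)
        goto out;

    /* The cancel.  Once it returns, the state can be freed immediately. */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, c->fd, NULL) < 0)
        goto out;
    free(c);
    c = NULL;

    /* Non-blocking poll: nothing refers to the freed state any more. */
    n = epoll_wait(epfd, &out, 1, 0);
out:
    free(c);
    if (epfd >= 0)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

The sketch only covers the single threaded case described above. With several threads in epoll_wait(), another thread may already hold the event when the cancel is issued.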

Thread pool applications, on the other hand, have to deal with
cancel/complete races anyway.  Some other thread could have read the
event immediately before the cancel call.  For this reason, aio cancel
does not bother removing pending completions from the event ring.
When io_cancel() is called on an operation that has already delivered
its completion event, has a completion event in the ring, or is not
cancelable, it returns -EAGAIN.  A thread pool application can deal
with this easily by waiting on a condition variable which is signaled 
by the thread that picked up the event.  A single threaded application
cannot block, so has to handle this condition by writing asynchronous
tear-down code.
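
A rough sketch of the thread pool side of this, using pthreads only. Nothing here talks to the kernel: fake_cancel() is a stand-in that always loses the race and returns -EAGAIN, and struct request, worker() and cancel_and_teardown() are hypothetical names. The point is only the shape of the tear-down: the canceling thread blocks on a condition variable until the thread that picked up the completion signals it, and only then frees the request.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical per-request state; not part of any kernel API. */
struct request {
    pthread_mutex_t lock;
    pthread_cond_t  done_cv;
    bool            done;    /* set by the thread that reaps the completion */
};

/* Stand-in for a cancel call that lost the race with completion. */
static int fake_cancel(struct request *req)
{
    (void)req;
    return -EAGAIN;
}

/* Plays the pool thread that picked up the completion event. */
static void *worker(void *arg)
{
    struct request *req = arg;

    pthread_mutex_lock(&req->lock);
    req->done = true;
    pthread_cond_signal(&req->done_cv);
    pthread_mutex_unlock(&req->lock);
    return NULL;
}

/* Cancel a request and free it only once it is safe.  Returns 0 on success. */
int cancel_and_teardown(void)
{
    struct request *req = malloc(sizeof(*req));
    pthread_t tid;

    if (req == NULL)
        return -1;
    pthread_mutex_init(&req->lock, NULL);
    pthread_cond_init(&req->done_cv, NULL);
    req->done = false;

    if (pthread_create(&tid, NULL, worker, req) != 0) {
        free(req);
        return -1;
    }

    if (fake_cancel(req) == -EAGAIN) {
        /* Blocking is fine here: other pool threads keep the server going. */
        pthread_mutex_lock(&req->lock);
        while (!req->done)
            pthread_cond_wait(&req->done_cv, &req->lock);
        pthread_mutex_unlock(&req->lock);
    }

    pthread_join(tid, NULL);
    pthread_cond_destroy(&req->done_cv);
    pthread_mutex_destroy(&req->lock);
    free(req);
    return 0;
}
```

A single threaded caller has no second thread to signal the condition variable, which is why it has to express the same wait as asynchronous tear-down code instead.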

Note that the fact that aio supports uncancelable operations (such as
every aio operation currently implemented in the base kernel) means
that single threaded callers which use such operations would need to
write this asynchronous tear-down code anyway.  Should aio later add
any cancelable operations that wouldn't be available through epoll, it
may want to add for single threaded callers a variant of io_cancel()
that removes any associated event from the ring.

Another difference between the two frameworks is in the prevention (or
lack thereof) of multiple simultaneous events for an operation/fd.
epoll effectively assumes that its caller will finish processing a
returned event before making a subsequent call to epoll_wait().  If
multiple threads each call epoll_wait() on the same eventpoll fd, the
application can easily end up with multiple threads simultaneously
handling identical events.  Worse, the application cannot tell how
many of these duplicate events are outstanding, so tear-down becomes
nigh impossible.  epoll_wait() was designed to only be called from a
single thread and it uses this design aspect to optimize for its usual
case of a single submission/add generating multiple events.
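
A small sketch of how the same readiness can be handed out more than once. It assumes a level-triggered registration (no EPOLLET) and uses two back-to-back epoll_wait() calls in one thread to stand in for two pool threads; the function name is hypothetical. Because the first "thread" has not yet consumed the data, the second call reports the same fd again.

```c
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Poll a level-triggered fd twice without reading in between.
 * Returns how many times the same readiness was reported, or -1 on failure.
 */
int count_duplicate_reports(void)
{
    int sv[2], epfd, reports = -1, seen = 0, i;
    struct epoll_event ev = {0}, out;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0)
        goto out;

    ev.events = EPOLLIN;             /* level-triggered: no EPOLLET */
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) < 0)
        goto out;
    if (write(sv[1], "x", 1) != 1)
        goto out;

    /* Each iteration stands in for a different thread calling epoll_wait(). */
    for (i = 0; i < 2; i++) {
        if (epoll_wait(epfd, &out, 1, 0) == 1 && out.data.fd == sv[0])
            seen++;
    }
    reports = seen;
out:
    if (epfd >= 0)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return reports;
}
```

Nothing in either returned event tells the second caller that another handler is already working on the fd, which is the tear-down problem described above.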

aio keeps a one event per submission rule to avoid such problems for
thread pool callers.  The tradeoff is that applications that want
subsequent events have to keep repaying the cost of submission.
Should the cost of submission turn out to be significant (I'm not
convinced it is with respect to the cost of handling the event) some
of this could be amortized by extending the aio framework with a
method for a thread which has obtained an intermediate event to
"re-arm" the operation once the thread has finished processing it.
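
A sketch of the one event per submission rule, written against the raw Linux aio system calls through syscall(2) so that no libaio is needed. The helper names, the temporary file path and the buffered 4-byte read are all assumptions made for illustration, and the sketch needs a kernel built with aio support. One submission yields exactly one completion; a second event only appears after the caller pays for a second io_submit().

```c
#include <linux/aio_abi.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Submit one 4-byte read at offset 0.  Returns io_submit()'s result. */
static long submit_read(aio_context_t ctx, int fd, char *buf)
{
    struct iocb cb;
    struct iocb *list[1] = { &cb };

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_lio_opcode = IOCB_CMD_PREAD;
    cb.aio_buf = (uint64_t)(uintptr_t)buf;
    cb.aio_nbytes = 4;
    cb.aio_offset = 0;
    return syscall(SYS_io_submit, ctx, 1L, list);
}

/* Reap up to 4 events, waiting for at least min_nr of them. */
static long reap(aio_context_t ctx, long min_nr, struct timespec *timeout)
{
    struct io_event events[4];

    return syscall(SYS_io_getevents, ctx, min_nr, 4L, events, timeout);
}

/*
 * counts[0]: events after the first submission
 * counts[1]: events when polling again without resubmitting
 * counts[2]: events after a second submission
 * Returns 0 on success, -1 on failure.
 */
int one_event_per_submission(long counts[3])
{
    char path[] = "/tmp/aio_demo_XXXXXX";   /* hypothetical scratch file */
    char buf[4];
    struct timespec zero = { 0, 0 };
    aio_context_t ctx = 0;
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;
    unlink(path);
    if (write(fd, "data", 4) != 4)
        goto out;
    if (syscall(SYS_io_setup, 8L, &ctx) < 0)
        goto out;

    if (submit_read(ctx, fd, buf) != 1)
        goto destroy;
    counts[0] = reap(ctx, 1, NULL);     /* wait for the single completion */
    counts[1] = reap(ctx, 0, &zero);    /* nothing more without resubmitting */

    if (submit_read(ctx, fd, buf) != 1)
        goto destroy;
    counts[2] = reap(ctx, 1, NULL);
    rc = 0;
destroy:
    syscall(SYS_io_destroy, ctx);
out:
    close(fd);
    return rc;
}
```

The second submit_read() call is the repeated submission cost that a "re-arm" method would amortize.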



* Re: Comparing the aio and epoll event frameworks.
  2003-05-19 23:33 Comparing the aio and epoll event frameworks John Myers
@ 2003-05-20  0:38 ` Davide Libenzi
  2003-05-20  1:10   ` Dan Kegel
  0 siblings, 1 reply; 12+ messages in thread
From: Davide Libenzi @ 2003-05-20  0:38 UTC (permalink / raw)
  To: John Myers; +Cc: linux-aio, linux-kernel

On Mon, 19 May 2003, John Myers wrote:

> The following documents my understanding of the differences between
> the epoll and aio event frameworks.  These differences stem from the
> fact that epoll is designed for use by single threaded callers,
> whereas aio is designed for use by multithreaded, thread pool callers.
>
> I do not intend to criticise either design choice--each model (single
> threaded vs. thread pool) has its uses and each model has requirements
> of the event framework which conflict with the requirements of the
> other.

Hi John, you seem to have missed a few episodes of the epoll saga. You can
use epoll in either Edge Triggered or Level Triggered mode, and in LT mode
epoll is basically a super-poll. You can use it with blocking and
non-blocking fds. You can call it from many threads, and (with LT mode) you
don't even need to reach EAGAIN (actually, even with ET you don't need to
reach EAGAIN, but I'm not willing to restart discussions that have already
happened 25 times on lkml). You can easily do thread pooling too. As
a matter of fact, a pretty famous online gaming company is using epoll
together with a thread pooling implementation, and the last time they
contacted me they were easily handling more than 150K fds with that
model.

John, do not cast an API as only working in a single environment. Is
poll/select a single threading API? A thread pooling one? I'd say both,
since you can choose the model that better fits your needs. Same thing for
epoll.

About the single shot feature: I had a discussion here with the
guy that wrote kqueue, and I was telling him about my wish to keep epoll as
simple as possible, since people worked with poll/select for many years and
they did not commit suicide because of the lack of extended features.
Adding a single shot feature to epoll takes about 5 lines of code,
comments included :) You know how many requests I had? Zero, nada.



- Davide



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:10   ` Dan Kegel
@ 2003-05-20  0:46     ` William Lee Irwin III
  2003-05-20  0:52       ` Davide Libenzi
  2003-05-20  0:47     ` Davide Libenzi
  1 sibling, 1 reply; 12+ messages in thread
From: William Lee Irwin III @ 2003-05-20  0:46 UTC (permalink / raw)
  To: Dan Kegel; +Cc: Davide Libenzi, John Myers, linux-aio, linux-kernel

Davide Libenzi wrote:
>> Adding a single shot feature to epoll takes about 5 lines of code,
>> comments included :) You know how many requests I had? Zero, nada.

On Mon, May 19, 2003 at 06:10:21PM -0700, Dan Kegel wrote:
> I thought edge triggered epoll *was* single-shot.
> - Dan

fs/eventpoll.c suggests "epoll" stands for "eventpoll" as opposed to
"edge-triggered". Davide, did the LT additions prompt the renaming or
was this always the case?


-- wli


* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:10   ` Dan Kegel
  2003-05-20  0:46     ` William Lee Irwin III
@ 2003-05-20  0:47     ` Davide Libenzi
  2003-05-20  1:02       ` William Lee Irwin III
  2003-05-20  1:22       ` Dan Kegel
  1 sibling, 2 replies; 12+ messages in thread
From: Davide Libenzi @ 2003-05-20  0:47 UTC (permalink / raw)
  To: Dan Kegel; +Cc: John Myers, linux-aio, Linux Kernel Mailing List

On Mon, 19 May 2003, Dan Kegel wrote:

> Davide Libenzi wrote:
> > Adding a single shot feature to epoll takes about 5 lines of code,
> > comments included :) You know how many requests I had? Zero, nada.
>
> I thought edge triggered epoll *was* single-shot.

By single shot I mean that once you receive one event, you will not
receive more events for that fd unless you rearm it. With edge triggered
epoll, suppose you receive 1000 bytes of data and you get an event
(EPOLLIN). If after 10 seconds you receive another 1000 bytes, you will
receive another event. This is not single shot.
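
A sketch of that distinction, under two assumptions that go beyond this thread: it uses the EPOLLONESHOT flag and EPOLL_CTL_MOD rearming that later kernels (2.6.2 onward) provide, and the function name and parameters are hypothetical. Two 1000-byte bursts are written to a socket pair and the events seen by a non-blocking epoll_wait() after each burst are counted.

```c
#include <stdint.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send two 1000-byte bursts and count the events reported for them.
 * flags is OR-ed into EPOLLIN (e.g. EPOLLET, or EPOLLET | EPOLLONESHOT).
 * If rearm is non-zero, the fd is rearmed with EPOLL_CTL_MOD after each event.
 * Returns the event count, or -1 on failure.
 */
int events_for_two_bursts(uint32_t flags, int rearm)
{
    int sv[2], epfd, events = -1, seen = 0, i;
    struct epoll_event ev = {0}, out;
    char chunk[1000];

    memset(chunk, 'x', sizeof(chunk));
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    epfd = epoll_create1(0);
    if (epfd < 0)
        goto out;

    ev.events = EPOLLIN | flags;
    ev.data.fd = sv[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, sv[0], &ev) < 0)
        goto out;

    for (i = 0; i < 2; i++) {
        if (write(sv[1], chunk, sizeof(chunk)) != (ssize_t)sizeof(chunk))
            goto out;
        if (epoll_wait(epfd, &out, 1, 0) == 1) {
            seen++;
            /* One-shot users must ask explicitly for the next event. */
            if (rearm && epoll_ctl(epfd, EPOLL_CTL_MOD, sv[0], &ev) < 0)
                goto out;
        }
    }
    events = seen;
out:
    if (epfd >= 0)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return events;
}
```

With EPOLLET alone, each burst is a new edge and produces its own event. With EPOLLET | EPOLLONESHOT and no rearm, the second burst stays silent until the caller rearms the fd.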


- Davide



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  0:46     ` William Lee Irwin III
@ 2003-05-20  0:52       ` Davide Libenzi
  0 siblings, 0 replies; 12+ messages in thread
From: Davide Libenzi @ 2003-05-20  0:52 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Dan Kegel, John Myers, linux-aio, Linux Kernel Mailing List

On Mon, 19 May 2003, William Lee Irwin III wrote:

> Davide Libenzi wrote:
> >> Adding a single shot feature to epoll takes about 5 lines of code,
> >> comments included :) You know how many requests I had? Zero, nada.
>
> On Mon, May 19, 2003 at 06:10:21PM -0700, Dan Kegel wrote:
> > I thought edge triggered epoll *was* single-shot.
> > - Dan
>
> fs/eventpoll.c suggests "epoll" stands for "eventpoll" as opposed to
> "edge-triggered". Davide, did the LT additions prompt the renaming or
> was this always the case?

It was both actually :) It meant event-poll and it was also edge-triggered.
Now you can have it level-triggered on a per-fd basis. The epoll name was
not a good one from the beginning though :)


- Davide



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:22       ` Dan Kegel
@ 2003-05-20  0:58         ` Davide Libenzi
  0 siblings, 0 replies; 12+ messages in thread
From: Davide Libenzi @ 2003-05-20  0:58 UTC (permalink / raw)
  To: Dan Kegel; +Cc: Linux Kernel Mailing List

On Mon, 19 May 2003, Dan Kegel wrote:

> > For single shot I mean that once you receive one event, you will not
> > receive more events for that fd if you do not rearm it. Suppose you
> > receive 1000 bytes of data and you get an event (EPOLLIN). If after 10
> > seconds you receive another 1000 bytes, you will receive another event.
> > This is not single shot.
>
> Oh, ok.  I much prefer plain old edge triggered, anyway.  It does
> the right thing with less fuss.

If someone shows a practical case where you cannot live without it,
implementing it is trivial.



- Davide



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  0:47     ` Davide Libenzi
@ 2003-05-20  1:02       ` William Lee Irwin III
  2003-05-20  1:37         ` Dan Kegel
  2003-05-20  1:22       ` Dan Kegel
  1 sibling, 1 reply; 12+ messages in thread
From: William Lee Irwin III @ 2003-05-20  1:02 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Dan Kegel, John Myers, linux-aio, Linux Kernel Mailing List

Davide Libenzi wrote:
>>> Adding a single shot feature to epoll takes about 5 lines of code,
>>> comments included :) You know how many requests I had? Zero, nada.

On Mon, 19 May 2003, Dan Kegel wrote:
>> I thought edge triggered epoll *was* single-shot.

On Mon, May 19, 2003 at 05:47:15PM -0700, Davide Libenzi wrote:
> For single shot I mean that once you receive one event, you will not
> receive more events for that fd if you do not rearm it. Suppose you
> receive 1000 bytes of data and you get an event (EPOLLIN). If after 10
> seconds you receive another 1000 bytes, you will receive another event.
> This is not single shot.

I think this would be useful for network daemons that would like to
fairly schedule responses (i.e. not re-arm until a client on a given fd
deserves a turn again). IRC daemons would appear to be a perfect
candidate for such. OTOH you may want to wait until someone is writing
such a beast so "it will be used" instead of "it is potentially useful".


-- wli


* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  0:38 ` Davide Libenzi
@ 2003-05-20  1:10   ` Dan Kegel
  2003-05-20  0:46     ` William Lee Irwin III
  2003-05-20  0:47     ` Davide Libenzi
  0 siblings, 2 replies; 12+ messages in thread
From: Dan Kegel @ 2003-05-20  1:10 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: John Myers, linux-aio, linux-kernel

Davide Libenzi wrote:
> Adding a single shot feature to epoll takes about 5 lines of code,
> comments included :) You know how many requests I had? Zero, nada.

I thought edge triggered epoll *was* single-shot.
- Dan

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:37         ` Dan Kegel
@ 2003-05-20  1:15           ` William Lee Irwin III
  2003-05-20  2:06             ` Dan Kegel
  0 siblings, 1 reply; 12+ messages in thread
From: William Lee Irwin III @ 2003-05-20  1:15 UTC (permalink / raw)
  To: Dan Kegel
  Cc: Davide Libenzi, John Myers, linux-aio, Linux Kernel Mailing List

William Lee Irwin III wrote:
>> I think this would be useful for network daemons that would like to
>> fairly schedule responses (i.e. not re-arm until a client on a given fd
>> deserves a turn again). IRC daemons would appear to be a perfect
>> candidate for such.  ...

On Mon, May 19, 2003 at 06:37:49PM -0700, Dan Kegel wrote:
> No need.  The plain old edge triggered behavior can handle this
> nicely.

AIUI after the iospace on an fd is exhausted the event will be re-armed.
It could probably be taken and then ignored until the client deserves a
response again. Is that what you had in mind?

(Don't take this too far; I'm in hypothetical land and am not pushing for
the feature hard if at all.)


-- wli


* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  0:47     ` Davide Libenzi
  2003-05-20  1:02       ` William Lee Irwin III
@ 2003-05-20  1:22       ` Dan Kegel
  2003-05-20  0:58         ` Davide Libenzi
  1 sibling, 1 reply; 12+ messages in thread
From: Dan Kegel @ 2003-05-20  1:22 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: John Myers, linux-aio, Linux Kernel Mailing List

Davide Libenzi wrote:
> On Mon, 19 May 2003, Dan Kegel wrote:
> 
> 
>>Davide Libenzi wrote:
>>
>>>Adding a single shot feature to epoll takes about 5 lines of code,
>>>comments included :) You know how many requests I had? Zero, nada.
>>
>>I thought edge triggered epoll *was* single-shot.
> 
> 
> For single shot I mean that once you receive one event, you will not
> receive more events for that fd if you do not rearm it. Suppose you
> receive 1000 bytes of data and you get an event (EPOLLIN). If after 10
> seconds you receive another 1000 bytes, you will receive another event.
> This is not single shot.

Oh, ok.  I much prefer plain old edge triggered, anyway.  It does
the right thing with less fuss.
- Dan


-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:02       ` William Lee Irwin III
@ 2003-05-20  1:37         ` Dan Kegel
  2003-05-20  1:15           ` William Lee Irwin III
  0 siblings, 1 reply; 12+ messages in thread
From: Dan Kegel @ 2003-05-20  1:37 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Davide Libenzi, John Myers, linux-aio, Linux Kernel Mailing List

William Lee Irwin III wrote:
> Davide Libenzi wrote:
> 
>>>>Adding a single shot feature to epoll takes about 5 lines of code,
>>>>comments included :) You know how many requests I had? Zero, nada.
> 
> 
> On Mon, 19 May 2003, Dan Kegel wrote:
> 
>>>I thought edge triggered epoll *was* single-shot.
> 
> 
> On Mon, May 19, 2003 at 05:47:15PM -0700, Davide Libenzi wrote:
> 
>>For single shot I mean that once you receive one event, you will not
>>receive more events for that fd if you do not rearm it. Suppose you
>>receive 1000 bytes of data and you get an event (EPOLLIN). If after 10
>>seconds you receive another 1000 bytes, you will receive another event.
>>This is not single shot.
> 
> 
> I think this would be useful for network daemons that would like to
> fairly schedule responses (i.e. not re-arm until a client on a given fd
> deserves a turn again). IRC daemons would appear to be a perfect
> candidate for such.  ...

No need.  The plain old edge triggered behavior can handle this
nicely.
- Dan


-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



* Re: Comparing the aio and epoll event frameworks.
  2003-05-20  1:15           ` William Lee Irwin III
@ 2003-05-20  2:06             ` Dan Kegel
  0 siblings, 0 replies; 12+ messages in thread
From: Dan Kegel @ 2003-05-20  2:06 UTC (permalink / raw)
  To: William Lee Irwin III
  Cc: Davide Libenzi, John Myers, linux-aio, Linux Kernel Mailing List

William Lee Irwin III wrote:
> William Lee Irwin III wrote:
> 
>>>I think this would be useful for network daemons that would like to
>>>fairly schedule responses (i.e. not re-arm until a client on a given fd
>>>deserves a turn again). IRC daemons would appear to be a perfect
>>>candidate for such.  ...
> 
> 
> On Mon, May 19, 2003 at 06:37:49PM -0700, Dan Kegel wrote:
> 
>>No need.  The plain old edge triggered behavior can handle this
>>nicely.
> 
> 
> AIUI after the iospace on an fd is exhausted the event will be re-armed.
> It could probably be taken and then ignored until the client deserves a
> response again. Is that what you had in mind?

In edge-triggered mode, epoll will deliver an event only when events warrant it (sic).
If you decide to starve a client for a while, that client's fd
will only get an event or two as the last bits of I/O to it
occur; after that, no more events will come in unless you do
some I/O.

So I guess I'm saying "remember the fact that you got the event, but
don't do anything about it until you feel like it".
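
A sketch of that "remember it, act later" approach with edge-triggered epoll. The struct client type, its readable flag and both function names are hypothetical bookkeeping invented for the example. The event is only recorded when it arrives; while the client is held back no further event shows up because no new I/O happens; when the client's turn comes, the fd is drained until EAGAIN.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical per-client bookkeeping kept by the application. */
struct client {
    int  fd;
    bool readable;   /* an edge was seen and has not been acted on yet */
};

/* Read until EAGAIN, then forget the remembered edge.  Returns bytes read. */
static long drain(struct client *c)
{
    char buf[256];
    long total = 0;
    ssize_t n;

    while ((n = read(c->fd, buf, sizeof(buf))) > 0)
        total += n;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        c->readable = false;
    return total;
}

/* Returns the bytes read once the client is finally served, or -1 on failure. */
long deferred_service_demo(void)
{
    int sv[2], epfd;
    long total = -1;
    struct epoll_event ev = {0}, out;
    struct client c = { -1, false };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    c.fd = sv[0];
    epfd = epoll_create1(0);
    if (epfd < 0 ||
        fcntl(c.fd, F_SETFL, fcntl(c.fd, F_GETFL) | O_NONBLOCK) < 0)
        goto out;

    ev.events = EPOLLIN | EPOLLET;
    ev.data.ptr = &c;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, c.fd, &ev) < 0)
        goto out;
    if (write(sv[1], "hello", 5) != 5)
        goto out;

    /* The edge arrives: only remember it. */
    if (epoll_wait(epfd, &out, 1, 0) == 1)
        ((struct client *)out.data.ptr)->readable = true;

    /* While the client is held back, no new I/O means no new event. */
    if (epoll_wait(epfd, &out, 1, 0) != 0)
        goto out;

    /* The client's turn comes: act on the remembered event. */
    if (c.readable)
        total = drain(&c);
out:
    if (epfd >= 0)
        close(epfd);
    close(sv[0]);
    close(sv[1]);
    return total;
}
```

The scheduling policy that decides when a client "deserves a turn" is left out; the sketch only shows that the fairness decision can live in the application without a single shot mode.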
- Dan

-- 
Dan Kegel
http://www.kegel.com
http://counter.li.org/cgi-bin/runscript/display-person.cgi?user=78045



Thread overview (12+ messages):
2003-05-19 23:33 Comparing the aio and epoll event frameworks John Myers
2003-05-20  0:38 ` Davide Libenzi
2003-05-20  1:10   ` Dan Kegel
2003-05-20  0:46     ` William Lee Irwin III
2003-05-20  0:52       ` Davide Libenzi
2003-05-20  0:47     ` Davide Libenzi
2003-05-20  1:02       ` William Lee Irwin III
2003-05-20  1:37         ` Dan Kegel
2003-05-20  1:15           ` William Lee Irwin III
2003-05-20  2:06             ` Dan Kegel
2003-05-20  1:22       ` Dan Kegel
2003-05-20  0:58         ` Davide Libenzi
