From: Benjamin LaHaise <bcrl@kvack.org>
To: Zach Brown <zach.brown@oracle.com>
Cc: David Miller <davem@davemloft.net>,
Evgeniy Polyakov <johnpol@2ka.mipt.ru>,
linux-kernel@vger.kernel.org, netdev@vger.kernel.org
Subject: Re: [RFC 1/4] kevent: core files.
Date: Thu, 27 Jul 2006 18:02:38 -0400
Message-ID: <20060727220238.GE16971@kvack.org>
In-Reply-To: <44C933D2.4040406@oracle.com>

On Thu, Jul 27, 2006 at 02:44:50PM -0700, Zach Brown wrote:
>
> >> int kevent_getevents(int event_fd, struct ukevent *events,
> >> int min_events, int max_events,
> >> struct timeval *timeout);
> >
> > You've just reinvented io_getevents().
>
> Well, that's certainly one inflammatory way to put it. I would describe
> it as suggesting that the kevents collection interface not lose the
> nicer properties of io_getevents().
Perhaps, but there seems to be a lot of talk about introducing new APIs
where it isn't entirely clear that they are needed. Sorry if that sounded
rather acerbic.
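
For reference, the overlap is in the collection contract: io_getevents(2)
takes a context, a minimum count (min_nr), a maximum count (nr), an output
array and a timeout, and the proposed kevent_getevents() quoted above takes
min_events and max_events in the same roles. The sketch below is a minimal
user-space model of that min/max contract, not kernel code. The names
demo_queue, demo_event and demo_getevents are hypothetical, and where a real
call would sleep until the minimum is met or the timeout expires, this model
simply returns 0.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_QUEUE_LEN 16

/* Hypothetical stand-in for a completed event; not the real struct ukevent. */
struct demo_event {
    int id;
};

/* Simulated in-kernel queue of ready events (assumption for illustration). */
struct demo_queue {
    struct demo_event ev[DEMO_QUEUE_LEN];
    size_t head;   /* index of the next event to hand out */
    size_t count;  /* number of events currently queued */
};

/*
 * Model of the min/max contract: copy out at most max_events, but only
 * if at least min_events are ready. A real call would block here until
 * the minimum is met or the timeout expires; this sketch returns 0.
 */
static int demo_getevents(struct demo_queue *q, struct demo_event *out,
                          int min_events, int max_events)
{
    int n = 0;

    assert(min_events >= 0 && min_events <= max_events);

    if (q->count < (size_t)min_events)
        return 0; /* not enough events yet: the real call would sleep */

    while (n < max_events && q->count > 0) {
        out[n++] = q->ev[q->head];
        q->head = (q->head + 1) % DEMO_QUEUE_LEN;
        q->count--;
    }
    return n;
}
```

With three events queued, asking for a minimum of four yields nothing, while
asking for between one and two returns the two oldest events and leaves one
behind.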
> > What exactly are we getting from
> > reinventing this (aside from breaking existing apps and creating more of
> > an API mess)?
>
> A generic event collection interface that isn't so strongly bound to the
> existing semantics of io_setup() and io_submit(). It can be a file
> descriptor instead of a mysterious cookie/pointer to the mapped region,
> to start.
Things were like that at one point in time, but file descriptors turn out
to introduce a huge gaping security hole with SUID programs. The problem
is that any event context is closely tied to the address space of the
thread issuing the syscalls, and file descriptors do not have this close
binding.
> Sure, so maybe we experiment with these things in the context of the
> kevent patches and maybe merge them back into the AIO paths if in the
> end that's the right thing to do. I see no problem with separating
> development from the existing code.
Excellent!
> >> epoll and kevent both have the notion of an event type that always
> >> creates an event at the time of the collection syscall while the event
> >> source is on a ready list. Think of epoll calling ->poll(POLLOUT) for
> >> an empty socket buffer at every sys_epoll_wait() call. We can't have
> >> some source constantly spewing into the ring :/. We could fix this by
> >> the API requiring that level events can *only* be collected through the
> >> syscall interface. userspace could call into the collection syscall
> >> every N events collected through the ring, say. N would be tuned to
> >> amortize the syscall cost and still provide fairness or latency for the
> >> level sources. I'd be fine with that, especially when it's hidden off
> >> in glibc.
> >
> > This is exactly why I think level triggered events are nasty. It's
> > impossible to do cleanly without requiring a syscall.
>
> I'm not convinced that it isn't possible to get a sufficiently clean
> interface that involves the mix.
My argument is that this approach introduces a slow path into the heavily
loaded server case. If you can show me how to avoid that, I'd be happy to
see such an implementation. =-)
> > As soon as you allow queueing events up in kernel space, it becomes
> > necessary to do another syscall after pulling events out of the queue,
> > which is a waste of CPU cycles when you're under heavy load (exactly the
> > point at which you want the system to be its most efficient).
>
> If we've just consumed a full ring worth of events, and done real work
> with them, I'm not convinced that an empty syscall is going to be that
> painful. If we're really under load it might well return some newly
> arrived events. It becomes a mix of ring completions and syscall
> completions.
Except that you're not usually pulling a full ring worth of events at a
time, more often just one. One of the driving forces behind AIO use is
in realtime apps, where you don't want to eat occasional spikes in the
latency of request processing; you just want to eat the highest priority
event, then work on the next. By keeping each step small and manageable,
the properties of the system are much easier to predict. Yes, batching
can be helpful performance-wise, but it is somewhat opposite to the design
criteria that need to be considered. The right way to cope with that may
be to have two different modes of operation that trade off one way or the
other on the batching question.
-ben
--
"Time is of no importance, Mr. President, only life is important."
Don't Email: <dont@kvack.org>.