netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: lkml <linux-kernel@vger.kernel.org>
Cc: David Miller <davem@davemloft.net>,
	Ulrich Drepper <drepper@redhat.com>,
	Evgeniy Polyakov <johnpol@2ka.mipt.ru>,
	netdev <netdev@vger.kernel.org>
Subject: [0/4] kevent: generic event processing subsystem.
Date: Wed, 26 Jul 2006 13:18:14 +0400	[thread overview]
Message-ID: <11539054941027@2ka.mipt.ru> (raw)
In-Reply-To: <20060726062817.GA20636@2ka.mipt.ru>



Kevent subsystem incorporates several AIO/kqueue design notes and ideas.
Kevent can be used both for edge and level notifications. It supports
socket notifications (accept, send, recv), network AIO (aio_send(),
aio_recv() and aio_sendfile()), inode notifications (create/remove),
generic poll()/select() notifications and timer notifications.

There are several object in the kevent system:
storage - each source of events (socket, inode, timer, aio, anything) has
	structure kevent_storage incorporated into it, which is basically a list
	of registered interests for this source of events.
user - it is abstraction which holds all requested kevents. It is
	similar to FreeBSD's kqueue.
kevent - set of interests for given source of events or storage.

When kevent is queued into storage, it will live there until removed by
kevent_dequeue(). When some activity is noticed in given storage, it
scans it's kevent_storage->list for kevents which match activity event.
If kevents are found and they are not already in the
kevent_user->ready_list, they will be added there at the end.

ioctl(WAIT) (or appropriate syscall) will wait until either requested
number of kevents are ready or timeout elapsed or at least one kevent is
ready, it's behaviour depends on parameters.

It is possible to have one-shot kevents, which are automatically removed
when are ready.

Any event can be added/removed/modified by ioctl or special controlling
syscall.

Network AIO is based on kevent and works as usual kevent storage on top
of inode.
When new socket is created it is associated with that inode and when
some activity is detected appropriate notifications are generated and
kevent_naio_callback() is called.
When new kevent is being registered, network AIO ->enqueue() callback
simply marks itself like usual socket event watcher. It also locks
physical userspace pages in memory and stores appropriate pointers in
private kevent structure. I have not created additional DMA memory
allocation methods, like Ulrich described in his article, so I handle it
inside NAIO which has some overhead (I posted get_user_pages()
sclability graph some time ago).
Network AIO callback gets pointers to userspace pages and tries to copy
data from receiving skb queue into them using protocol specific
callback. This callback is very similar to ->recvmsg(), so they could
share a lot in future (as far as I recall it worked only with hardware
capable to do checksumming, I'm a bit lazy).

Both network and aio implementation work on top of hooks inside
appropriate state machines, but not as repeated call design (curect AIO) 
or special thread (SGI AIO). AIO work was stopped, since I was unable to 
achieve the same speed as synchronous read 
(maximum speeds were 2Gb/sec vs. 2.1 GB/sec for aio and sync IO accordingly
when reading data from the cache).
Network aio_sendfile() works lazily - it asynchronously populates pages
into the VFS cache (which can be used for various tricks with adaptive
readahead) and then uses usual ->sendfile() callback.

I have not created an interface for userspace events (like Solaris), 
since right now I do not see it's usefullness, but if there is
requirements for that it is quite easy with kevents.

Patches currently include ifdefs and kevent can be disabled in config,
when things are settled that can be removed.

1. kevent homepage.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=kevent

2. network aio homepage.
http://tservice.net.ru/~s0mbre/old/?section=projects&item=naio

3. LWN.net published a very good article about kevent.
http://lwn.net/Articles/172844/

Thank you.

Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru>


  reply	other threads:[~2006-07-26  8:55 UTC|newest]

Thread overview: 57+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <44C66FC9.3050402@redhat.com>
2006-07-25 22:01 ` async network I/O, event channels, etc David Miller
2006-07-25 22:55   ` Nicholas Miell
2006-07-26  6:28   ` Evgeniy Polyakov
2006-07-26  9:18     ` Evgeniy Polyakov [this message]
2006-07-26  9:18       ` [1/4] kevent: core files Evgeniy Polyakov
2006-07-26  9:18         ` [2/4] kevent: network AIO, socket notifications Evgeniy Polyakov
2006-07-26  9:18           ` [3/4] kevent: AIO, aio_sendfile() implementation Evgeniy Polyakov
2006-07-26  9:18             ` [4/4] kevent: poll/select() notifications. Timer notifications Evgeniy Polyakov
2006-07-26 10:00             ` [3/4] kevent: AIO, aio_sendfile() implementation Christoph Hellwig
2006-07-26 10:08               ` Evgeniy Polyakov
2006-07-26 10:13                 ` Christoph Hellwig
2006-07-26 10:25                   ` Evgeniy Polyakov
2006-07-26 10:04             ` Christoph Hellwig
2006-07-26 10:12               ` David Miller
2006-07-26 10:15                 ` Christoph Hellwig
2006-07-26 20:21                   ` Phillip Susi
2006-07-26 14:14                 ` Avi Kivity
2006-07-26 10:19               ` Evgeniy Polyakov
2006-07-26 10:30                 ` Christoph Hellwig
2006-07-26 14:28                   ` Ulrich Drepper
2006-07-26 16:22                     ` Badari Pulavarty
2006-07-27  6:49                       ` Sébastien Dugué
2006-07-27 15:28                         ` Badari Pulavarty
2006-07-27 18:14                           ` Zach Brown
2006-07-27 18:29                             ` Badari Pulavarty
2006-07-27 18:44                               ` Ulrich Drepper
2006-07-27 21:02                                 ` Badari Pulavarty
2006-07-28  7:31                                   ` Sébastien Dugué
2006-07-28 12:58                                   ` Sébastien Dugué
2006-08-11 19:45                                     ` Ulrich Drepper
2006-08-12 18:29                                       ` Kernel patches enabling better POSIX AIO (Was Re: [3/4] kevent: AIO, aio_sendfile) Suparna Bhattacharya
2006-08-12 19:10                                         ` Ulrich Drepper
2006-08-12 19:28                                           ` Jakub Jelinek
2006-09-04 14:37                                             ` Sébastien Dugué
2006-08-14  7:02                                           ` Suparna Bhattacharya
2006-08-14 16:38                                             ` Ulrich Drepper
2006-08-15  2:06                                               ` Nicholas Miell
2006-09-04 14:36                                           ` Sébastien Dugué
2006-09-04 14:28                                         ` Sébastien Dugué
2006-07-28  7:29                                 ` [3/4] kevent: AIO, aio_sendfile() implementation Sébastien Dugué
2006-07-31 10:11                                 ` Suparna Bhattacharya
2006-07-28  7:26                           ` Sébastien Dugué
2006-07-26 10:31         ` [1/4] kevent: core files Andrew Morton
2006-07-26 10:37           ` Evgeniy Polyakov
2006-07-26 10:44         ` Evgeniy Polyakov
2006-07-27  6:10     ` async network I/O, event channels, etc David Miller
2006-07-27  7:49       ` Evgeniy Polyakov
2006-07-27  8:02         ` David Miller
2006-07-27  8:09           ` Jens Axboe
2006-07-27  8:11             ` Jens Axboe
2006-07-27  8:20               ` David Miller
2006-07-27  8:29                 ` Jens Axboe
2006-07-27  8:37                   ` David Miller
2006-07-27  8:39                     ` Jens Axboe
2006-07-27  8:58           ` Evgeniy Polyakov
2006-07-27  9:31             ` David Miller
2006-07-27  9:37               ` Evgeniy Polyakov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=11539054941027@2ka.mipt.ru \
    --to=johnpol@2ka.mipt.ru \
    --cc=davem@davemloft.net \
    --cc=drepper@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=netdev@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).