From: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
To: Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Cc: David Miller <davem@davemloft.net>,
Ulrich Drepper <drepper@redhat.com>,
Andrew Morton <akpm@osdl.org>,
Evgeniy Polyakov <johnpol@2ka.mipt.ru>,
netdev <netdev@vger.kernel.org>,
Zach Brown <zach.brown@oracle.com>,
Christoph Hellwig <hch@infradead.org>,
Chase Venters <chase.venters@clientec.com>,
Johann Borck <johann.borck@densedata.com>,
linux-kernel@vger.kernel.org, Jeff Garzik <jeff@garzik.org>
Subject: [take24 1/6] kevent: Description.
Date: Thu, 9 Nov 2006 11:23:56 +0300
Message-ID: <1163060636219@2ka.mipt.ru>
In-Reply-To: <11630606361046@2ka.mipt.ru>
Description.
diff --git a/Documentation/kevent.txt b/Documentation/kevent.txt
new file mode 100644
index 0000000..ca49e4b
--- /dev/null
+++ b/Documentation/kevent.txt
@@ -0,0 +1,186 @@
+Description.
+
+int kevent_ctl(int fd, unsigned int cmd, unsigned int num, struct ukevent *arg);
+
+fd - is the file descriptor referring to the kevent queue to manipulate.
+It is created by opening the "/dev/kevent" char device, which is registered with
+a dynamic minor number and the major number assigned to misc devices.
+
+cmd - is the requested operation. It can be one of the following:
+ KEVENT_CTL_ADD - add event notification
+ KEVENT_CTL_REMOVE - remove event notification
+ KEVENT_CTL_MODIFY - modify existing notification
+
+num - number of struct ukevent in the array pointed to by arg
+arg - array of struct ukevent
+
+When called, kevent_ctl will carry out the operation specified in the
+cmd parameter.
+-------------------------------------------------------------------------------
+
+ int kevent_get_events(int ctl_fd, unsigned int min_nr, unsigned int max_nr,
+ __u64 timeout, struct ukevent *buf, unsigned flags)
+
+ctl_fd - file descriptor referring to the kevent queue
+min_nr - minimum number of completed events that kevent_get_events will block
+ waiting for
+max_nr - number of struct ukevent in buf
+timeout - number of nanoseconds to wait before returning fewer than min_nr
+ events. If this is -1, then wait forever.
+buf - pointer to an array of struct ukevent.
+flags - unused
+
+kevent_get_events will wait up to timeout nanoseconds for at least min_nr
+completed events, copying completed struct ukevents to buf and deleting any
+KEVENT_REQ_ONESHOT event requests. In nonblocking mode it returns as many
+events as possible, but not more than max_nr. In blocking mode it waits until
+the timeout expires or at least min_nr events are ready.
+-------------------------------------------------------------------------------
+
+ int kevent_wait(int ctl_fd, unsigned int num, __u64 timeout)
+
+ctl_fd - file descriptor referring to the kevent queue
+num - number of processed kevents
+timeout - number of nanoseconds to wait until there is free space in the
+ kevent queue
+
+This syscall waits until either the timeout expires or at least one event
+becomes ready. It also copies num events into a special ring buffer and
+requeues them (or removes them, depending on flags).
+-------------------------------------------------------------------------------
+
+ int kevent_ring_init(int ctl_fd, struct kevent_ring *ring, unsigned int num)
+
+ctl_fd - file descriptor referring to the kevent queue
+num - size of the ring buffer in events
+
+ struct kevent_ring
+ {
+ unsigned int ring_kidx;
+ struct ukevent event[0];
+ };
+
+ring_kidx - is an index in the ring buffer where kernel will put new events
+ when kevent_wait() or kevent_get_events() is called
+
+Example userspace code (ring_buffer.c) can be found on the project's homepage.
+
+Each kevent syscall can be a so-called cancellation point in glibc, i.e. when
+a thread is cancelled inside a kevent syscall, it can be safely removed
+and no events will be lost, since each syscall (kevent_wait() or
+kevent_get_events()) copies events into a special ring buffer, accessible
+from other threads or even processes (if shared memory is used).
+
+When a kevent is removed (not dequeued when it is ready, but just removed),
+it is not copied into the ring buffer even if it was ready: if it is
+removed, no one cares about it (otherwise the user would wait until it becomes
+ready and get it the usual way through kevent_get_events() or kevent_wait()),
+so there is no need to copy it to the ring buffer.
+
+With a userspace ring buffer, events in the ring buffer can be replaced
+without the knowledge of the thread currently reading them (when another
+thread calls kevent_get_events() or kevent_wait()), so appropriate locking
+between threads or processes that can simultaneously access the same ring
+buffer is required.
+-------------------------------------------------------------------------------
+
+The bulk of the interface is handled through the ukevent struct.
+It is used to add event requests, modify existing event requests,
+specify which event requests to remove, and return completed events.
+
+struct ukevent contains the following members:
+
+struct kevent_id id
+ Id of this request, e.g. socket number, file descriptor and so on
+__u32 type
+ Event type, e.g. KEVENT_SOCK, KEVENT_INODE, KEVENT_TIMER and so on
+__u32 event
+ Event itself, e.g. SOCK_ACCEPT, INODE_CREATED, TIMER_FIRED
+__u32 req_flags
+ Per-event request flags:
+
+ KEVENT_REQ_ONESHOT
+ event will be removed when it is ready
+
+ KEVENT_REQ_WAKEUP_ONE
+ When several threads wait on the same kevent queue and have requested the
+ same event, for example 'wake me up when a new client has connected,
+ so I can call accept()', then all threads will be awakened when a new
+ client has connected, but only one of them can process the data. This
+ problem is known as the thundering herd problem. Events which have this
+ flag set will not be marked as ready (and the corresponding threads will
+ not be awakened) if at least one event has already been marked.
+
+ KEVENT_REQ_ET
+ Edge Triggered behaviour. It is an optimisation which allows a ready
+ and dequeued (i.e. copied to userspace) event to be moved back into the
+ set of interest for the given storage (socket, inode and so on). It is
+ very useful for cases when the same event should be used many times
+ (like reading from a pipe). It is similar to epoll()'s EPOLLET flag.
+
+ KEVENT_REQ_LAST_CHECK
+ If set, allows a last check to be performed on the kevent (by calling the
+ appropriate callback) when the kevent is marked as ready and has been
+ removed from the ready queue. If the kevent is confirmed to be ready
+ (k->callbacks.callback(k) returns true), it is copied to userspace;
+ otherwise it is requeued back to storage. The second (checking) call is
+ performed with this bit cleared, so the callback can detect whether it
+ was called from kevent_storage_ready() (bit is set) or
+ kevent_dequeue_ready() (bit is cleared). If the kevent is requeued,
+ the bit is set again.
+
+__u32 ret_flags
+ Per-event return flags
+
+ KEVENT_RET_BROKEN
+ Kevent is broken
+
+ KEVENT_RET_DONE
+ Kevent processing was finished successfully
+
+ KEVENT_RET_COPY_FAILED
+ Kevent was not copied into ring buffer due to some error conditions.
+
+__u32 ret_data
+ Event return data. The event originator fills it with anything it likes
+ (for example, timer notifications put the number of milliseconds when the
+ timer has fired).
+union { __u32 user[2]; void *ptr; }
+ User's data. It is not used, just copied to/from user. The whole structure
+ is aligned to 8 bytes already, so the last union is aligned properly.
+
+-------------------------------------------------------------------------------
+
+Usage
+
+For KEVENT_CTL_ADD, all fields relevant to the event type must be filled
+(id, type, possibly event, req_flags).
+After kevent_ctl(..., KEVENT_CTL_ADD, ...) returns, each struct's ret_flags
+should be checked to see if the event is already broken or done.
+
+For KEVENT_CTL_MODIFY, the id, req_flags, user and event fields must be
+set and an existing kevent request must have matching id and user fields. If
+a match is found, req_flags and event are replaced with the newly supplied
+values and requeueing is started, so the modified kevent can be checked and
+possibly marked as ready immediately. If a match can't be found, the
+passed in ukevent's ret_flags has KEVENT_RET_BROKEN set. KEVENT_RET_DONE is
+always set.
+
+For KEVENT_CTL_REMOVE, the id and user fields must be set and an existing
+kevent request must have matching id and user fields. If a match is found,
+the kevent request is removed. If a match can't be found, the passed in
+ukevent's ret_flags has KEVENT_RET_BROKEN set. KEVENT_RET_DONE is always set.
+
+For kevent_get_events, the entire structure is returned.
+
+-------------------------------------------------------------------------------
+
+Usage cases
+
+kevent_timer
+struct ukevent should contain the following fields:
+ type - KEVENT_TIMER
+ event - KEVENT_TIMER_FIRED
+ req_flags - KEVENT_REQ_ONESHOT if you want to fire that timer only once
+ id.raw[0] - number of seconds after commit when this timer should expire
+ id.raw[1] - number of nanoseconds in addition to the seconds
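
A minimal userspace sketch of filling the timer request described in the
kevent_timer usage case. It only prepares the structure and makes no syscall.
The struct layout below is an assumption reconstructed from the member list in
the description, the numeric values of KEVENT_TIMER, KEVENT_TIMER_FIRED and
KEVENT_REQ_ONESHOT are placeholders (the real ones live in the kevent header of
the patch set), fill_timer_request() is a hypothetical helper, and the
nanosecond part is assumed to go into id.raw[1].

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Placeholder values, NOT taken from the patch set's header. */
#define KEVENT_TIMER        2
#define KEVENT_TIMER_FIRED  0x1
#define KEVENT_REQ_ONESHOT  0x1

/* Assumed layout: two 32-bit words identifying the request. */
struct kevent_id {
	uint32_t raw[2];
};

/* Assumed layout, following the member list in the description. */
struct ukevent {
	struct kevent_id id;
	uint32_t type;
	uint32_t event;
	uint32_t req_flags;
	uint32_t ret_flags;
	uint32_t ret_data;
	union {
		uint32_t user[2];
		void *ptr;
	};
};

/*
 * Hypothetical helper: prepare a timer request that expires sec seconds
 * plus nsec nanoseconds after it is added with KEVENT_CTL_ADD.
 */
void fill_timer_request(struct ukevent *uk, uint32_t sec, uint32_t nsec,
			int oneshot)
{
	assert(uk != NULL);
	assert(nsec < 1000000000u);	/* nanoseconds stay below one second */

	/* Clear ret_flags, ret_data and the user data before filling in. */
	memset(uk, 0, sizeof(*uk));

	uk->type = KEVENT_TIMER;
	uk->event = KEVENT_TIMER_FIRED;
	uk->req_flags = oneshot ? KEVENT_REQ_ONESHOT : 0;
	uk->id.raw[0] = sec;	/* seconds until expiry */
	uk->id.raw[1] = nsec;	/* assumed: additional nanoseconds */
}
```

A caller would declare a struct ukevent, call fill_timer_request(&uk, 2,
500000000, 1) for a one-shot timer 2.5 seconds ahead, and then pass &uk with
num = 1 to kevent_ctl(fd, KEVENT_CTL_ADD, ...), checking ret_flags afterwards.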