* [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
@ 2014-02-03 0:30 Nathaniel Yazdani
0 siblings, 0 replies; 9+ messages in thread
From: Nathaniel Yazdani @ 2014-02-03 0:30 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, linux-kernel
Hi everyone,
This patch series adds support for read(), write(), and ioctl() operations
on eventpolls as well as an associated userspace structure to format the
eventpoll entries delivered via read()/write() buffers. The new structure,
struct epoll, differs from struct epoll_event mainly in that struct epoll
also holds the associated file descriptor. Using the normal I/O interface
to manipulate eventpolls is much neater than using epoll-specific syscalls
while also allowing for greater flexibility (theoretically, pipes could
be used to filter access). Specifically, write() creates, modifies, and/or
removes event entries stored in the supplied buffer, using the userspace
identifier to check whether an entry exists and removing it if no events
are set to trigger it, while read() simply waits for enough events to fill
the provided buffer. As timeout control is essential for polling to be
practical, ioctl() is used to configure an optional timeout, which is
infinite by default.
 Documentation/ioctl/ioctl-number.txt |   1 +
 fs/eventpoll.c                       | 534 ++++++++++++++++++++++++-----------
 include/uapi/linux/eventpoll.h       |  10 +
 3 files changed, 384 insertions(+), 161 deletions(-)
* [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
@ 2014-02-03 2:17 Nathaniel Yazdani
2014-02-03 9:43 ` Clemens Ladisch
2014-02-03 19:33 ` Andy Lutomirski
0 siblings, 2 replies; 9+ messages in thread
From: Nathaniel Yazdani @ 2014-02-03 2:17 UTC (permalink / raw)
To: viro; +Cc: linux-fsdevel, linux-kernel
Hi everyone,
This patch series adds support for read(), write(), and ioctl() operations
on eventpolls as well as an associated userspace structure to format the
eventpoll entries delivered via read()/write() buffers. The new structure,
struct epoll, differs from struct epoll_event mainly in that it also holds
the associated file descriptor. Using the normal I/O interface to manipulate
eventpolls is much neater than using epoll-specific syscalls while also
allowing for greater flexibility (theoretically, pipes could be used to
filter access). Specifically, write() creates, modifies, and/or removes event
entries stored in the supplied buffer, using the userspace identifier to
check whether an entry exists and removing it if no events are set to trigger
it, while read() simply waits for enough events to fill the provided buffer.
As timeout control is essential for polling to be practical, ioctl() is used
to configure an optional timeout, which is infinite by default.
 Documentation/ioctl/ioctl-number.txt |   1 +
 fs/eventpoll.c                       | 534 ++++++++++++++++++++++++-----------
 include/uapi/linux/eventpoll.h       |  10 +
 3 files changed, 384 insertions(+), 161 deletions(-)
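
For illustration, the write() semantics described in the cover letter (create, modify, or remove an entry depending on whether it exists and whether any events are set) can be approximated in userspace with the existing epoll_ctl() call. This is only a sketch under assumptions: the cover letter does not show the definition of struct epoll, so the struct epoll_entry layout, its field names, and the helper epoll_apply_entry() are hypothetical, and the sketch keys entries on the file descriptor (as epoll_ctl() does) rather than on the userspace identifier.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/epoll.h>

/* Hypothetical layout: the cover letter only says the new structure also
 * carries the file descriptor, so these fields are an assumption. */
struct epoll_entry {
    int      fd;      /* file descriptor the entry refers to */
    uint32_t events;  /* EPOLL* mask; 0 means "remove this entry" */
    uint64_t data;    /* opaque userspace identifier */
};

/* Apply one entry roughly the way the proposed write() is described:
 * modify it if present, create it if absent, remove it if no events are set.
 * Returns 0 on success, -1 with errno set on failure. */
static int epoll_apply_entry(int epfd, const struct epoll_entry *e)
{
    struct epoll_event ev = { .events = e->events, .data = { .u64 = e->data } };

    if (e->events == 0) {
        /* Removing an entry that is not registered is treated as a no-op. */
        if (epoll_ctl(epfd, EPOLL_CTL_DEL, e->fd, NULL) == -1 && errno != ENOENT)
            return -1;
        return 0;
    }
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, e->fd, &ev) == 0)
        return 0;               /* entry existed and was updated */
    if (errno != ENOENT)
        return -1;              /* a real error, not "missing entry" */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, e->fd, &ev);
}
```

Calling epoll_apply_entry() in a loop over a buffer of entries would mimic one write() of several entries, at the cost of one or two syscalls per entry instead of one syscall in total.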
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 2:17 Nathaniel Yazdani
@ 2014-02-03 9:43 ` Clemens Ladisch
2014-02-03 19:34 ` Nathaniel Yazdani
2014-02-03 19:33 ` Andy Lutomirski
1 sibling, 1 reply; 9+ messages in thread
From: Clemens Ladisch @ 2014-02-03 9:43 UTC (permalink / raw)
To: Nathaniel Yazdani, viro; +Cc: linux-fsdevel, linux-kernel
Nathaniel Yazdani wrote:
> Using the normal I/O interface to manipulate eventpolls is much neater
> than using epoll-specific syscalls
But it introduces a _second_ API, which is epoll-specific too, and does
not use the standard semantics either.
> while also allowing for greater flexibility (theoretically, pipes could
> be used to filter access).
I do not understand this.
> read() simply waits for enough events to fill the provided buffer.
The usual semantics of read() are to return a partially filled buffer if
it would block otherwise, i.e., blocking is done only if the returned
buffer would have been empty.
> As timeout control is essential for polling to be practical, ioctl() is
> used to configure an optional timeout
This is what the timeout parameter of poll() and friends is for.
Regards,
Clemens
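
As a point of comparison for the two objections above, the existing epoll_wait() call already behaves the way read() usually does: it returns as soon as at least one event is ready, even if the buffer has room for more, and it takes its timeout as a parameter. A minimal sketch, in which the wrapper name wait_some() is an assumption made for illustration:

```c
#include <errno.h>
#include <sys/epoll.h>

/* Wait for up to `max` events, returning as soon as at least one is ready
 * or after `timeout_ms` milliseconds (-1 blocks indefinitely, 0 polls).
 * Returns the number of events stored in `out`, 0 on timeout, -1 on error.
 * Note: after a signal interruption the wait is simply restarted with the
 * full timeout, so the total wait can exceed `timeout_ms`. */
static int wait_some(int epfd, struct epoll_event *out, int max, int timeout_ms)
{
    int n;

    do {
        n = epoll_wait(epfd, out, max, timeout_ms);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

With one ready descriptor and room for eight events, wait_some() returns 1 immediately rather than blocking until the buffer is full.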
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 9:43 ` Clemens Ladisch
@ 2014-02-03 19:34 ` Nathaniel Yazdani
0 siblings, 0 replies; 9+ messages in thread
From: Nathaniel Yazdani @ 2014-02-03 19:34 UTC (permalink / raw)
To: Clemens Ladisch; +Cc: viro, linux-fsdevel, linux-kernel
On 2/3/14, Clemens Ladisch <clemens@ladisch.de> wrote:
> Nathaniel Yazdani wrote:
>> Using the normal I/O interface to manipulate eventpolls is much neater
>> than using epoll-specific syscalls
>
> But it introduces a _second_ API, which is epoll-specific too, and does
> not use the standard semantics either.
>
>> while also allowing for greater flexibility (theoretically, pipes could
>> be used to filter access).
>
> I do not understand this.
The idea here was that if epoll is controlled by read()/write(), then
a program could be written so that it expects the epoll to be dup()ed
to a second file descriptor, using one exclusively for writing and the
other exclusively for reading. That way, if an application is in
debug mode, for example, it could start up a thread to replace
those two file descriptors with pipes, so that thread would then
be able to tee, preprocess, or do whatever else to the epoll
streams.
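
The first half of this idea, one epoll instance reached through two descriptors, can be sketched with the existing API, since a dup()ed epoll descriptor refers to the same instance. The struct epoll_pair type and the epoll_pair_open() helper are names assumed for illustration. With the current syscalls neither descriptor could be swapped for a pipe, because epoll_ctl() and epoll_wait() require a real epoll descriptor; that substitution would only become possible with a read()/write() interface like the one proposed.

```c
#include <sys/epoll.h>
#include <unistd.h>

struct epoll_pair {
    int ctl_fd;   /* used only for registering interest (epoll_ctl) */
    int wait_fd;  /* used only for collecting events (epoll_wait) */
};

/* Create one epoll instance reachable through two descriptors.
 * Returns 0 on success, -1 with errno set on failure. */
static int epoll_pair_open(struct epoll_pair *p)
{
    p->ctl_fd = epoll_create1(0);
    if (p->ctl_fd == -1)
        return -1;
    /* Both descriptors share the same underlying epoll instance. */
    p->wait_fd = dup(p->ctl_fd);
    if (p->wait_fd == -1) {
        close(p->ctl_fd);
        return -1;
    }
    return 0;
}
```

An entry added through ctl_fd is reported by epoll_wait() on wait_fd, which is the split between a "write side" and a "read side" that the message describes.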
>> read() simply waits for enough events to fill the provided buffer.
>
> The usual semantics of read() are to return a partially filled buffer if
> it would block otherwise, i.e., blocking is done only if the returned
> buffer would have been empty.
>
>> As timeout control is essential for polling to be practical, ioctl() is
>> used to configure an optional timeout
>
> This is what the timeout parameter of poll() and friends is for.
I admit that part of this approach isn't the best.
Either way I appreciate your feedback,
Nathaniel Yazdani
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 2:17 Nathaniel Yazdani
2014-02-03 9:43 ` Clemens Ladisch
@ 2014-02-03 19:33 ` Andy Lutomirski
2014-02-03 19:42 ` Nathaniel Yazdani
1 sibling, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2014-02-03 19:33 UTC (permalink / raw)
To: Nathaniel Yazdani, viro; +Cc: linux-fsdevel, linux-kernel
On 02/02/2014 06:17 PM, Nathaniel Yazdani wrote:
> Hi everyone,
>
> This patch series adds support for read(), write(), and ioctl() operations
> on eventpolls as well as an associated userspace structure to format the
> eventpoll entries delivered via read()/write() buffers. The new structure,
> struct epoll, differs from struct epoll_event mainly in that it also holds
> the associated file descriptor. Using the normal I/O interface to manipulate
> eventpolls is much neater than using epoll-specific syscalls while also
> allowing for greater flexibility (theoretically, pipes could be used to
> filter access). Specifically, write() creates, modifies, and/or removes event
> entries stored in the supplied buffer, using the userspace identifier to
> check whether an entry exists and removing it if no events are set to trigger
> it, while read() simply waits for enough events to fill the provided buffer.
> As timeout control is essential for polling to be practical, ioctl() is used
> to configure an optional timeout, which is infinite by default.
If major changes are made to the epoll API, I want a way to do a bunch
of EPOLL_CTL_MODs and a wait, all in one syscall. Even better: allow
more flexible timeouts (CLOCK_MONOTONIC, CLOCK_REALTIME, etc) at the
same time.
Since this can't do that, I'm not terribly inspired.
--Andy
>
>  Documentation/ioctl/ioctl-number.txt |   1 +
>  fs/eventpoll.c                       | 534 ++++++++++++++++++++++++-----------
>  include/uapi/linux/eventpoll.h       |  10 +
>  3 files changed, 384 insertions(+), 161 deletions(-)
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 19:33 ` Andy Lutomirski
@ 2014-02-03 19:42 ` Nathaniel Yazdani
2014-02-03 19:56 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: Nathaniel Yazdani @ 2014-02-03 19:42 UTC (permalink / raw)
To: Andy Lutomirski; +Cc: viro, linux-fsdevel, linux-kernel
On 2/3/14, Andy Lutomirski <luto@amacapital.net> wrote:
> On 02/02/2014 06:17 PM, Nathaniel Yazdani wrote:
>> Hi everyone,
>>
>> This patch series adds support for read(), write(), and ioctl() operations
>> on eventpolls as well as an associated userspace structure to format the
>> eventpoll entries delivered via read()/write() buffers. The new structure,
>> struct epoll, differs from struct epoll_event mainly in that it also holds
>> the associated file descriptor. Using the normal I/O interface to manipulate
>> eventpolls is much neater than using epoll-specific syscalls while also
>> allowing for greater flexibility (theoretically, pipes could be used to
>> filter access). Specifically, write() creates, modifies, and/or removes event
>> entries stored in the supplied buffer, using the userspace identifier to
>> check whether an entry exists and removing it if no events are set to trigger
>> it, while read() simply waits for enough events to fill the provided buffer.
>> As timeout control is essential for polling to be practical, ioctl() is used
>> to configure an optional timeout, which is infinite by default.
>
> If major changes are made to the epoll API, I want a way to do a bunch
> of EPOLL_CTL_MODs and a wait, all in one syscall. Even better: allow
> more flexible timeouts (CLOCK_MONOTONIC, CLOCK_REALTIME, etc) at the
> same time.
>
> Since this can't do that, I'm not terribly inspired.
>
> --Andy
So are you saying that those features you mentioned are specifically sought
after for the kernel? If so I'd like to take a crack at some of them, may as
well get some use out of my new knowledge of epoll internals :)
Thanks for your input,
Nate Yazdani
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 19:42 ` Nathaniel Yazdani
@ 2014-02-03 19:56 ` Andy Lutomirski
2014-02-03 21:51 ` Eric Wong
0 siblings, 1 reply; 9+ messages in thread
From: Andy Lutomirski @ 2014-02-03 19:56 UTC (permalink / raw)
To: Nathaniel Yazdani; +Cc: Al Viro, Linux FS Devel, linux-kernel@vger.kernel.org
On Mon, Feb 3, 2014 at 11:42 AM, Nathaniel Yazdani
<n1ght.4nd.d4y@gmail.com> wrote:
> On 2/3/14, Andy Lutomirski <luto@amacapital.net> wrote:
>> On 02/02/2014 06:17 PM, Nathaniel Yazdani wrote:
>>> Hi everyone,
>>>
>>> This patch series adds support for read(), write(), and ioctl() operations
>>> on eventpolls as well as an associated userspace structure to format the
>>> eventpoll entries delivered via read()/write() buffers. The new structure,
>>> struct epoll, differs from struct epoll_event mainly in that it also holds
>>> the associated file descriptor. Using the normal I/O interface to manipulate
>>> eventpolls is much neater than using epoll-specific syscalls while also
>>> allowing for greater flexibility (theoretically, pipes could be used to
>>> filter access). Specifically, write() creates, modifies, and/or removes event
>>> entries stored in the supplied buffer, using the userspace identifier to
>>> check whether an entry exists and removing it if no events are set to trigger
>>> it, while read() simply waits for enough events to fill the provided buffer.
>>> As timeout control is essential for polling to be practical, ioctl() is used
>>> to configure an optional timeout, which is infinite by default.
>>
>> If major changes are made to the epoll API, I want a way to do a bunch
>> of EPOLL_CTL_MODs and a wait, all in one syscall. Even better: allow
>> more flexible timeouts (CLOCK_MONOTONIC, CLOCK_REALTIME, etc) at the
>> same time.
>>
>> Since this can't do that, I'm not terribly inspired.
>>
>> --Andy
>
> So are you saying that those features you mentioned are specifically sought
> after for the kernel? If so I'd like to take a crack at some of them, may as
> well get some use out of my new knowledge of epoll internals :)
If by "sought after", you mean "is there at least one epoll user who
wants them", then yes :)
I think that EPOLLET and EPOLLONESHOT are giant hacks, and that what
everyone really wants is the ability to very efficiently toggle events
on and off. The ability to do it simultaneously and inexpensively
with epoll_wait would make it happen.
--Andy
>
> Thanks for your input,
> Nate Yazdani
--
Andy Lutomirski
AMA Capital Management, LLC
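
To make the "toggle events on and off" point concrete: with the current API, toggling is done with one EPOLL_CTL_MOD call per descriptor, separate from the wait itself. The sketch below shows that pattern; the helper name epoll_toggle() and its parameters are assumptions made for illustration.

```c
#include <stdint.h>
#include <sys/epoll.h>

/* Enable or disable event reporting for a descriptor that is already
 * registered with `epfd`, without removing it from the interest list.
 * While disabled, the mask is empty; the kernel still reports EPOLLERR
 * and EPOLLHUP, which cannot be masked out.
 * Returns 0 on success, -1 with errno set on failure. */
static int epoll_toggle(int epfd, int fd, uint32_t events_when_on, int on)
{
    struct epoll_event ev = {
        .events = on ? events_when_on : 0,
        .data = { .fd = fd },
    };

    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

Each toggle here is its own syscall, so a loop that toggles N descriptors and then waits issues N + 1 syscalls; folding the toggles into the wait is what the message above asks for.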
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 19:56 ` Andy Lutomirski
@ 2014-02-03 21:51 ` Eric Wong
2014-02-03 22:06 ` Andy Lutomirski
0 siblings, 1 reply; 9+ messages in thread
From: Eric Wong @ 2014-02-03 21:51 UTC (permalink / raw)
To: Andy Lutomirski
Cc: Nathaniel Yazdani, Al Viro, Linux FS Devel,
linux-kernel@vger.kernel.org
Andy Lutomirski <luto@amacapital.net> wrote:
> >> On 02/02/2014 06:17 PM, Nathaniel Yazdani wrote:
> > So are you saying that those features you mentioned are specifically sought
> > after for the kernel? If so I'd like to take a crack at some of them, may as
> > well get some use out of my new knowledge of epoll internals :)
>
> If by "sought after", you mean "is there at least one epoll user who
> wants them", then yes :)
>
> I think that EPOLLET and EPOLLONESHOT are giant hacks, and that what
> everyone really wants is the ability to very efficiently toggle events
> on and off. The ability to do it simultaneously and inexpensively
> with epoll_wait would make it happen.
Everybody using single-threaded epoll, you mean? I suppose there's
quite a few of those.
I've pondered an epoll_xchg syscall which would behave like *BSD kevent
to satisfy single-threaded users, but never got around to it. All my
epoll uses are multithreaded w/ oneshot nowadays, so xchg would only
save one syscall per thread.
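
The message only names the epoll_xchg idea, so the following is a purely hypothetical userspace emulation of what a kevent-like call might look like: a list of changes is applied, then the caller waits. The struct epoll_change type, the function name epoll_xchg_emulated(), and its parameter order are all assumptions; nothing here reflects an actual kernel interface.

```c
#include <sys/epoll.h>

/* One pending change, mirroring the arguments of epoll_ctl(). */
struct epoll_change {
    int op;                  /* EPOLL_CTL_ADD, EPOLL_CTL_MOD or EPOLL_CTL_DEL */
    int fd;                  /* target file descriptor */
    struct epoll_event ev;   /* new event mask and user data */
};

/* Apply `nchanges` changes in order, then wait for events.
 * Stops at the first failing change and returns -1 with errno set;
 * otherwise returns what epoll_wait() returns. */
static int epoll_xchg_emulated(int epfd,
                               struct epoll_change *changes, int nchanges,
                               struct epoll_event *events, int maxevents,
                               int timeout_ms)
{
    for (int i = 0; i < nchanges; i++) {
        if (epoll_ctl(epfd, changes[i].op, changes[i].fd,
                      &changes[i].ev) == -1)
            return -1;
    }
    return epoll_wait(epfd, events, maxevents, timeout_ms);
}
```

This emulation still issues nchanges + 1 syscalls; the saving discussed in the thread would come only from doing the same work inside a single syscall.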
* Re: [RFC PATCH 0/3] epoll: read(),write(),ioctl() interface
2014-02-03 21:51 ` Eric Wong
@ 2014-02-03 22:06 ` Andy Lutomirski
0 siblings, 0 replies; 9+ messages in thread
From: Andy Lutomirski @ 2014-02-03 22:06 UTC (permalink / raw)
To: Eric Wong
Cc: Nathaniel Yazdani, Al Viro, Linux FS Devel,
linux-kernel@vger.kernel.org
On Mon, Feb 3, 2014 at 1:51 PM, Eric Wong <normalperson@yhbt.net> wrote:
> Andy Lutomirski <luto@amacapital.net> wrote:
>> >> On 02/02/2014 06:17 PM, Nathaniel Yazdani wrote:
>> > So are you saying that those features you mentioned are specifically sought
>> > after for the kernel? If so I'd like to take a crack at some of them, may as
>> > well get some use out of my new knowledge of epoll internals :)
>>
>> If by "sought after", you mean "is there at least one epoll user who
>> wants them", then yes :)
>>
>> I think that EPOLLET and EPOLLONESHOT are giant hacks, and that what
>> everyone really wants is the ability to very efficiently toggle events
>> on and off. The ability to do it simultaneously and inexpensively
>> with epoll_wait would make it happen.
>
> Everybody using single-threaded epoll, you mean? I suppose there's
> quite a few of those.
>
> I've pondered an epoll_xchg syscall which would behave like *BSD kevent
> to satisfy single-threaded users, but never got around to it. All my
> epoll uses are multithreaded w/ oneshot nowadays, so xchg would only
> save one syscall per thread.
Even for multithreaded, the ability to rearm EPOLLONESHOT entries
without extra syscalls would probably be useful.
--Andy
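
For context, rearming an EPOLLONESHOT entry with the existing API looks roughly like the sketch below: once an event has been delivered the entry stays registered but disabled, and a separate EPOLL_CTL_MOD call is needed to enable it again. The helper names oneshot_add() and oneshot_rearm() are assumptions made for illustration.

```c
#include <sys/epoll.h>

/* Register `fd` for a single readable notification. */
static int oneshot_add(int epfd, int fd)
{
    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLONESHOT,
        .data = { .fd = fd },
    };

    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Re-enable an entry after its one-shot event was delivered. This is the
 * extra syscall per handled event that the message above refers to. */
static int oneshot_rearm(int epfd, int fd)
{
    struct epoll_event ev = {
        .events = EPOLLIN | EPOLLONESHOT,
        .data = { .fd = fd },
    };

    return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}
```

In a multithreaded server a worker would typically call oneshot_rearm() after it has finished handling the descriptor, so that no other thread is woken for the same descriptor in the meantime.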