public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* disk-based fds in select/poll
@ 2001-06-04 20:43 Pierre Phaneuf
  2001-06-04 21:42 ` Alan Cox
  0 siblings, 1 reply; 5+ messages in thread
From: Pierre Phaneuf @ 2001-06-04 20:43 UTC (permalink / raw)
  To: linux-kernel


Pardon me if some parts of this seem clueless. While I'm no newbie in
userland, kernelspace I don't play in very often...

It's fairly widely-known that select/poll returns immediately when
testing a filesystem-based file descriptor for writability or
readability.

On top of this, even when in non-blocking mode, read() could block if
the pages needed aren't in core. sendfile() behaves in a similar way.

What would be needed to alleviate this?

I am thinking that a read() (or sendfile()) that would block because the
pages aren't in core should instead post a request for the pages to be
loaded (some kind of readahead mecanism?) and return immediately (maybe
having given some data that *was* in core). A subsequent read() could
have the data available, but not necessarily (again, it should give
whatever it has in core, but return immediately).

sendfile() would be a lot more tricky to fix in that way I guess, but
could still be possible (the destination fd would be unwritable for a
while, until the transfer is finished). Also, the complexity would be
higher (instead of simply causing readahead to happen (which might
anyway), it would have to trigger the readahead, then get notification
of when the pages are in core to send over, all the while preventing
data from being written to the destination fd in some way).

In the mean time, I was also wondering if issuing smaller read()
requests in a row might give me a better chance of success. I *know*
read() will block, but if I only ask for, say, a page of data (rather
than asking for the full data and relying on the non-blocking to return
EAGAIN (like it should, IMHO!)), it shouldn't take too long, and could
possibly trigger some readahead to be done by the kernel, right?

Or will the readahead be done "on my own time", read() only returning
after the whole thing (my request + readahead) has been done?

I remember seeing a suggestion by Linus for an event-based I/O
interface, similar to kqueue on FreeBSD but much simpler. I'd just say
"I want it too!", ok? :-)

I know about the mincore() trick with mmap()'d files, but with small
files, mmap()ing might not make sense (could be very often).

SGI's AIO might be a solution here, does it use threads? I'm trying to
avoid context switching as much as possible, to keep the CPU cache as
warm as possible.

Well, I might not have the choice to use threads, after all...

(sorry if this message got in twice, I used an NNTP gateway the previous
time, I don't think it got through)

-- 
Pierre Phaneuf

^ permalink raw reply	[flat|nested] 5+ messages in thread
* re: disk-based fds in select/poll
@ 2001-06-05  0:27 Dan Kegel
  0 siblings, 0 replies; 5+ messages in thread
From: Dan Kegel @ 2001-06-05  0:27 UTC (permalink / raw)
  To: Pierre Phaneuf, linux-kernel@vger.kernel.org

Pierre Phaneuf <pp@ludusdesign.com> wrote:
> It's fairly widely-known that select/poll returns immediately when
> testing a filesystem-based file descriptor for writability or
> readability.
> 
> On top of this, even when in non-blocking mode, read() could block if
> the pages needed aren't in core. sendfile() behaves in a similar way.
> 
> What would be needed to alleviate this?
> ...
> I remember seeing a suggestion by Linus for an event-based I/O
> interface, similar to kqueue on FreeBSD but much simpler. I'd just say
> "I want it too!", ok? :-)
>
> SGI's AIO might be a solution here, does it use threads? I'm trying to
> avoid context switching as much as possible, to keep the CPU cache as
> warm as possible.

IMHO, you want AIO.  SGI's is fine for now.  I hear rumors that there will be
something even better coming in 2.5, though I have no details.

Or you could use explicit userspace threads... say, divide up your
network connections among 8 or so threads.  Then if one thread blocks,
the others are there to usefully soak up the CPU time.

Readiness events for readahead completion on disk files used to 
seem like a neat idea to me, but now AIO seems more appealing
in the long run, since they handle random access properly.

- Dan

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2001-06-05  0:27 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2001-06-04 20:43 disk-based fds in select/poll Pierre Phaneuf
2001-06-04 21:42 ` Alan Cox
2001-06-04 22:42   ` Pierre Phaneuf
2001-06-04 23:16     ` Alan Cox
  -- strict thread matches above, loose matches on Subject: below --
2001-06-05  0:27 Dan Kegel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox