public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Pierre Phaneuf <pp@ludusdesign.com>
To: linux-kernel@vger.kernel.org
Subject: disk-based fds in select/poll
Date: Mon, 04 Jun 2001 16:43:38 -0400	[thread overview]
Message-ID: <3B1BF2FA.B74A1B78@ludusdesign.com> (raw)


Pardon me if some parts of this seem clueless. While I'm no newbie in
userland, kernelspace I don't play in very often...

It's fairly widely-known that select/poll returns immediately when
testing a filesystem-based file descriptor for writability or
readability.

On top of this, even when in non-blocking mode, read() could block if
the pages needed aren't in core. sendfile() behaves in a similar way.

What would be needed to alleviate this?

I am thinking that a read() (or sendfile()) that would block because the
pages aren't in core should instead post a request for the pages to be
loaded (some kind of readahead mecanism?) and return immediately (maybe
having given some data that *was* in core). A subsequent read() could
have the data available, but not necessarily (again, it should give
whatever it has in core, but return immediately).

sendfile() would be a lot more tricky to fix in that way I guess, but
could still be possible (the destination fd would be unwritable for a
while, until the transfer is finished). Also, the complexity would be
higher (instead of simply causing readahead to happen (which might
anyway), it would have to trigger the readahead, then get notification
of when the pages are in core to send over, all the while preventing
data from being written to the destination fd in some way).

In the mean time, I was also wondering if issuing smaller read()
requests in a row might give me a better chance of success. I *know*
read() will block, but if I only ask for, say, a page of data (rather
than asking for the full data and relying on the non-blocking to return
EAGAIN (like it should, IMHO!)), it shouldn't take too long, and could
possibly trigger some readahead to be done by the kernel, right?

Or will the readahead be done "on my own time", read() only returning
after the whole thing (my request + readahead) has been done?

I remember seeing a suggestion by Linus for an event-based I/O
interface, similar to kqueue on FreeBSD but much simpler. I'd just say
"I want it too!", ok? :-)

I know about the mincore() trick with mmap()'d files, but with small
files, mmap()ing might not make sense (could be very often).

SGI's AIO might be a solution here, does it use threads? I'm trying to
avoid context switching as much as possible, to keep the CPU cache as
warm as possible.

Well, I might not have the choice to use threads, after all...

(sorry if this message got in twice, I used an NNTP gateway the previous
time, I don't think it got through)

-- 
Pierre Phaneuf

             reply	other threads:[~2001-06-04 20:53 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-06-04 20:43 Pierre Phaneuf [this message]
2001-06-04 21:42 ` disk-based fds in select/poll Alan Cox
2001-06-04 22:42   ` Pierre Phaneuf
2001-06-04 23:16     ` Alan Cox
  -- strict thread matches above, loose matches on Subject: below --
2001-06-05  0:27 Dan Kegel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3B1BF2FA.B74A1B78@ludusdesign.com \
    --to=pp@ludusdesign.com \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox