From: Andreas Dilger <adilger@sun.com>
To: lustre-devel@lists.lustre.org
Subject: [Lustre-devel] read ahead
Date: Tue, 11 Dec 2007 16:47:08 -0700
Message-ID: <20071211234708.GT3214@webber.adilger.int>
In-Reply-To: <18270.55813.185376.299242@gargle.gargle.HOWL>

On Dec 11, 2007  21:42 +0300, Nikita Danilov wrote:
> Peter Braam writes:
>  > Can anyone tell me if read ahead in Lustre includes "early return" 
>  > features.  I mean that if I read 4K and readahead decides to fetch 1M 
>  > will my request get serviced when the first 4K arrives?  Is this important?
> 
> Currently read system call will proceed when the first RPC (including
> first 4K page and some number of read-ahead pages) is serviced:
> generic_file_read() waits on a page lock, and lock is released by
> completion routine (ll_ap_completion()).

Another thing worth mentioning here is that if this is the FIRST 4kB read
from the file, then only that 4kB will be returned in the RPC, because
readahead hasn't done linear vs. random IO detection yet.  If it is the
second read (and linear) then the client will get the _rest_ of the 1MB
and will have to wait for that second RPC to complete.  For subsequent
reads the readahead will of course prefetch the pages.
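
The sequence above can be sketched as a toy model of how many pages a
single-page read has to wait for.  This is only an illustration under
stated assumptions, not the actual Lustre code: struct ra_model,
ra_model_read() and RPC_PAGES are hypothetical names, and the model
assumes 4kB pages, 1MB RPCs, and that "the rest of the 1MB" means up to
the end of the 1MB-aligned chunk.

```c
#include <assert.h>
#include <stddef.h>

#define RPC_PAGES 256UL /* assumed: 1MB RPC / 4kB pages */

/* Hypothetical per-file state for this sketch; not Lustre's real structures. */
struct ra_model {
    unsigned long next_page;    /* page expected next if access is linear */
    unsigned long cached_start; /* first page held by the client */
    unsigned long cached_end;   /* one past the last page held */
    unsigned int reads_seen;    /* number of reads modelled so far */
};

/*
 * Model one single-page read.  Returns how many pages are in the RPC the
 * caller has to wait for, or 0 if the page is already on the client.
 */
unsigned long ra_model_read(struct ra_model *s, unsigned long page)
{
    unsigned long waited = 0;
    int linear, cached;

    assert(s != NULL);
    linear = s->reads_seen > 0 && page == s->next_page;
    cached = page >= s->cached_start && page < s->cached_end;

    if (cached) {
        /* Already prefetched: no wait.  Model the asynchronous prefetch
         * by keeping the cached range one RPC ahead of the reader. */
        if (linear && s->cached_end < page + 1 + RPC_PAGES)
            s->cached_end = page + 1 + RPC_PAGES;
    } else if (linear) {
        /* Second, linear read: wait for the rest of the 1MB chunk. */
        unsigned long chunk_end = (page / RPC_PAGES + 1) * RPC_PAGES;

        waited = chunk_end - page;
        s->cached_end = chunk_end;
    } else {
        /* First read, or a jump: no pattern known yet, fetch one page. */
        s->cached_start = page;
        s->cached_end = page + 1;
        waited = 1;
    }

    s->next_page = page + 1;
    s->reads_seen++;
    return waited;
}
```

With a zero-initialised struct ra_model, reading pages 0, 1, 2 in order
waits for 1, 255 and 0 pages respectively in this model.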

For random reads, the code does distinguish between e.g. reads of 16
sequential pages (64kB generally) issued at non-consecutive offsets and
16 sequential 4kB page reads.  The former will NOT start readahead,
while the latter will.
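
A minimal sketch of that distinction, assuming the detector only looks at
whether each request starts where the previous one ended.  The names
seq_detect, seq_detect_access() and the threshold SEQ_MIN_CONSECUTIVE are
hypothetical and chosen for illustration; the real heuristics in the
client may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold: readahead starts on the 2nd consecutive request. */
#define SEQ_MIN_CONSECUTIVE 2U

/* Hypothetical detector state; zero-initialise before first use. */
struct seq_detect {
    unsigned long next_page;  /* where a sequential request would start */
    unsigned int consecutive; /* length of the current sequential run */
};

/*
 * Record a request covering npages pages starting at page index start.
 * Returns 1 if the run of back-to-back requests is long enough to start
 * readahead, 0 otherwise.
 */
int seq_detect_access(struct seq_detect *d, unsigned long start,
                      unsigned long npages)
{
    assert(d != NULL && npages > 0);

    if (d->consecutive > 0 && start == d->next_page)
        d->consecutive++;   /* continues the previous request */
    else
        d->consecutive = 1; /* first request, or a jump elsewhere */

    d->next_page = start + npages;
    return d->consecutive >= SEQ_MIN_CONSECUTIVE;
}
```

In this sketch, 16-page requests at offsets such as 0, 1000 and 5000
never trigger readahead, because each one resets the run, while sixteen
back-to-back single-page requests trigger it from the second request on.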

Two areas where our readahead is lacking are:
- strided reads (stride detection may turn the above 16 x 4kB reads at
  non-consecutive offsets from "random" IO into a pattern where the
  client will prefetch pages, depending on access pattern, and will
  avoid prefetch of data the client is not expecting to use)
- limiting the readahead to the rate at which the client is actually
  consuming it (currently, once we detect sequential reads, the readahead
  window eventually grows to the maximum even if this is far more than
  the client needs)
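
For the strided case, one possible approach (an assumption for
illustration, not something the client does today) is to remember the gap
between successive request offsets and predict the next offset once the
same gap has repeated.  stride_detect, stride_detect_access() and
STRIDE_MIN_REPEATS are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed threshold: the same gap must be seen this many times in a row. */
#define STRIDE_MIN_REPEATS 2U

/* Hypothetical detector state; zero-initialise before first use. */
struct stride_detect {
    long last_start;      /* page index of the previous request */
    long stride;          /* most recently observed gap, in pages */
    unsigned int repeats; /* how many times in a row that gap was seen */
    int have_last;        /* 0 until the first request is recorded */
};

/*
 * Record a request starting at page index start.  Returns 1 and stores
 * the predicted next start in *next once a stable non-zero stride has
 * been observed; returns 0 (leaving *next untouched) otherwise.
 */
int stride_detect_access(struct stride_detect *d, long start, long *next)
{
    assert(d != NULL && next != NULL);

    if (d->have_last) {
        long gap = start - d->last_start;

        if (gap != 0 && gap == d->stride) {
            d->repeats++;    /* same gap again */
        } else {
            d->stride = gap; /* new candidate stride */
            d->repeats = 1;
        }
    }
    d->last_start = start;
    d->have_last = 1;

    if (d->stride != 0 && d->repeats >= STRIDE_MIN_REPEATS) {
        *next = start + d->stride;
        return 1;
    }
    return 0;
}
```

In this sketch, requests at pages 0, 100 and 200 yield a prediction of
page 300 on the third request, so only the pages the application is
expected to touch would be prefetched rather than everything in between.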

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
