linux-fsdevel.vger.kernel.org archive mirror
From: Jamie Lokier <jamie@shareable.org>
To: Evgeniy Polyakov <zbr@ioremap.net>
Cc: Phillip Susi <psusi@cfl.rr.com>,
	linux-fsdevel@vger.kernel.org,
	Linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: readahead on directories
Date: Wed, 21 Apr 2010 19:51:24 +0100
Message-ID: <20100421185124.GM27575@shareable.org>
In-Reply-To: <20100421183853.GA14897@ioremap.net>

Evgeniy Polyakov wrote:
> On Wed, Apr 21, 2010 at 05:12:11PM +0100, Jamie Lokier (jamie@shareable.org) wrote:
> > A quick skim of fs/{ext3,ext4}/dir.c finds a call to
> > page_cache_sync_readahead.  Doesn't that do any reading ahead? :-)
> 
> It goes down to the fs callbacks for data reading, which is not
> applicable to directories.
> 
> To implement directory 'readahead' we use a separate thread to call
> readdir(). It is damn slow indeed, but it can populate the cache in
> advance of actual data reading. As a higher level crunch there is a
> 'find' running in the background.

Fwiw, I found that sorting directories by inode and reading them in
that order helps to reduce seeks, some 10 years ago.  I implemented
something like 'find' which works like that, keeping a queue of
directories to read and things to open/stat, ordered by inode number
seen in d_ino before open/stat and st_ino after.  However it did not
try to readahead the blocks inside a directory, or sort operations by
block number.  It reduced some 'find'-like operations to about a
quarter of the time on cold cache.  I still use that program sometimes
before "git status" ;-)  Google "treescan" and "lokier" if you're
interested in trying it (though I use 0.7 which isn't published).
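
For illustration only, here is a minimal sketch of the inode-ordered
idea in Python.  It is not the treescan code; the function name
inode_ordered_walk and the single-heap design are assumptions made for
this example.  It keeps one queue of pending paths keyed by the d_ino
value that readdir returns (DirEntry.inode() in Python), and always
stats the lowest-numbered pending entry next.

```python
import heapq
import os
import stat


def inode_ordered_walk(root):
    """Yield (path, stat_result) for root and everything below it.

    Pending paths sit in a min-heap keyed by inode number, so the next
    lstat() always goes to the lowest inode seen so far.  Hypothetical
    helper for illustration; not taken from treescan.
    """
    # The root's inode is unknown before it is stat'ed, so use 0 as key.
    pending = [(0, root)]
    while pending:
        _, path = heapq.heappop(pending)
        try:
            st = os.lstat(path)
        except OSError:
            continue  # entry vanished or is unreadable; skip it
        yield path, st
        if not stat.S_ISDIR(st.st_mode):
            continue
        try:
            with os.scandir(path) as entries:
                for entry in entries:
                    # entry.inode() is d_ino from the directory entry,
                    # available on Linux without a stat call.
                    heapq.heappush(pending, (entry.inode(), entry.path))
        except OSError:
            continue  # directory unreadable; skip its contents


if __name__ == "__main__":
    for path, st in inode_ordered_walk("."):
        print(st.st_ino, path)
```

Like the program described above, this makes no attempt to read ahead
the blocks inside a directory or to sort by block number; it only
orders the stat calls.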

> > > I don't actually care to have the contents of the
> > > directories returned, so readdir() does more than I need in that
> > > respect, and also it performs a blocking read of one disk block at a
> > > time, which is horribly slow with a cold cache.
> > 
> > I/O is probably the biggest cost, so it's more important to get
> > the I/O pattern you want than worrying about return values you'll discard.
> > 
> > If readdir() calls are slowed by lots of calls and libc, consider
> > using the getdirentries system call directly.
> 
> It is not about readdir(). Plain read() is synchronous too. But a
> filesystem can respond to readahead calls and read the block after the
> current one, while it won't do this for the next direntry.

I'm surprised it makes much difference, as directories are usually not
very large anyway.

But if it does, go on, try FIEMAP and blockdev reading, you know you
want to :-)

-- Jamie


Thread overview: 34+ messages
2010-04-19 15:51 readahead on directories Phillip Susi
2010-04-21  0:44 ` Jamie Lokier
2010-04-21 14:57   ` Phillip Susi
2010-04-21 16:12     ` Jamie Lokier
2010-04-21 18:10       ` Phillip Susi
2010-04-21 20:22         ` Jamie Lokier
2010-04-21 20:59           ` Phillip Susi
2010-04-21 22:06             ` Jamie Lokier
2010-04-22  7:01               ` Brad Boyer
2010-04-22 14:26               ` Phillip Susi
2010-04-22 17:53                 ` Jamie Lokier
2010-04-22 19:23                   ` Phillip Susi
2010-04-22 20:35                     ` Jamie Lokier
2010-04-22 21:22                       ` Phillip Susi
2010-04-22 22:43                         ` Jamie Lokier
2010-04-23  4:13                           ` Phillip Susi
2010-04-21 18:38       ` Evgeniy Polyakov
2010-04-21 18:51         ` Jamie Lokier [this message]
2010-04-21 18:56           ` Evgeniy Polyakov
2010-04-21 20:02             ` Jamie Lokier
2010-04-21 20:21               ` Evgeniy Polyakov
2010-04-21 20:39                 ` Jamie Lokier
2010-04-21 19:23           ` Phillip Susi
2010-04-21 20:01             ` Jamie Lokier
2010-04-21 20:13               ` Phillip Susi
2010-04-21 20:37                 ` Jamie Lokier
2010-05-07 13:38 ` unified page and buffer cache? (was: readahead on directories) Phillip Susi
2010-05-07 13:53   ` Matthew Wilcox
2010-05-07 15:45     ` unified page and buffer cache? Phillip Susi
2010-05-07 18:30       ` Matthew Wilcox
2010-05-08  0:50         ` Phillip Susi
2010-05-08  0:46       ` tytso
2010-05-08  0:54         ` Phillip Susi
2010-05-08 12:52           ` tytso
