Re: readahead on directories


From: Jamie Lokier <jamie@shareable.org>
To: Phillip Susi <psusi@cfl.rr.com>
Cc: Evgeniy Polyakov <zbr@ioremap.net>,
	linux-fsdevel@vger.kernel.org,
	Linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: readahead on directories
Date: Wed, 21 Apr 2010 21:01:04 +0100
Message-ID: <20100421200104.GT27575@shareable.org>
In-Reply-To: <4BCF509E.2040903@cfl.rr.com>

Phillip Susi wrote:
> On 4/21/2010 2:51 PM, Jamie Lokier wrote:
> > Fwiw, I found sorting directories by inode and reading them in that
> > order help to reduce seeks, some 10 years ago.  I implemented
> > something like 'find' which works like that, keeping a queue of
> > directories to read and things to open/stat, ordered by inode number
> > seen in d_ino before open/stat and st_ino after.  However it did not
> > try to readahead the blocks inside a directory, or sort operations by
> > block number.  It reduced some 'find'-like operations to about a
> > quarter of the time on cold cache.  I still use that program sometimes
> > before "git status" ;-)  Google "treescan" and "lokier" if you're
> > interested in trying it (though I use 0.7 which isn't published).
> 
> That helps with open()ing or stat()ing the files since you access the
> inodes in order, but ureadahead already preloads all of the inode tables
> so this won't help.

It helps a little with data access too, because block group
locality tends to follow inode numbers.  Don't read inodes and data
in the same batch though.
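
A minimal sketch of the inode-ordering idea, written in Python for brevity.
It is not the treescan code; the function name is hypothetical.  It collects
the d_ino of each directory entry first, then stats the entries in ascending
d_ino order, and keeps that pass separate from any later data reads.

```python
import os

def stat_in_inode_order(directory):
    """Stat every entry of `directory` in ascending d_ino order.

    Returns a list of (d_ino, path, stat_result) tuples.
    Hypothetical helper for illustration only.
    """
    # Pass 1: read the directory and remember each entry's inode number
    # as reported by the directory itself (d_ino), without stat()ing yet.
    with os.scandir(directory) as it:
        entries = [(entry.inode(), entry.path) for entry in it]

    # Order the pending stat calls by inode number.
    entries.sort()

    # Pass 2: stat in that order.  File data would be read in a separate,
    # later batch rather than interleaved here.
    return [(d_ino, path, os.lstat(path)) for d_ino, path in entries]
```

Whether this reduces seeks depends on the filesystem laying out inodes
roughly in inode-number order, which is an assumption, not a guarantee.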

> >> it is not about readdir(). Plain read() is synchronous too. But
> >> filesystem can respond to readahead calls and read next block to current
> >> one, while it won't do this for next direntry.
> > 
> > I'm surprised it makes much difference, as directories are usually not
> > very large anyway.
> 
> That's just it; it doesn't help.  That's why I want to readahead() all
> of the directories at once instead of reading them one block at a time.

OK, this discussion has become a bit confused.  The text above refers to
needing to asynchronously read the next block in a directory, but if
directories are small then that's not important.

> > But if it does, go on, try FIEMAP and blockdev reading, you know you
> > want to :-)
> 
> Why reinvent the wheel when that's readahead()'s job?  As a workaround
> I'm about to try just threading all of the calls to open().

The FIEMAP suggestion applies only if you think you need to issue reads for
multiple blocks in the _same_ directory in parallel.  From what you say,
I doubt that's important.

FIEMAP is not relevant for reading different directories in parallel.
You'd still have to thread the FIEMAP calls for that - it's a
different problem.

> Each one will queue a read and block, but with them all doing so at
> once should fill the queue with plenty of reads.  It is inefficient,
> but better than one block at a time.

That was my first suggestion: threads with readdir().  I thought it had
been rejected, hence the further discussion.

(Actually I would use clone + open + getdirentries + tiny userspace
stack to avoid using tons of memory.  But that's just a tweak, only to
be used if the threading is effective.)
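
As a rough illustration of the "threads with readdir()" approach, here is a
sketch in Python using an ordinary thread pool.  It does not implement the
clone + getdirentries tweak; the function names and the worker count are
assumptions chosen for the example.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def list_dir(path):
    """Read one directory; return (path, [(name, d_ino), ...])."""
    try:
        with os.scandir(path) as it:
            return path, [(entry.name, entry.inode()) for entry in it]
    except OSError:
        # Unreadable or vanished directory: report it as empty.
        return path, []

def read_dirs_in_parallel(paths, max_workers=16):
    """Read many directories at once, one blocking read per worker thread.

    Each worker blocks on its own directory, so several reads can be
    queued at the same time instead of one after another.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(list_dir, paths))
```

For example, read_dirs_in_parallel(["/usr/lib", "/usr/share"]) returns a
dict mapping each path to its entries.  How much this helps on a cold cache
would need to be measured.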

-- Jamie
