From: Jamie Lokier <jamie@shareable.org>
To: Phillip Susi <psusi@cfl.rr.com>
Cc: Evgeniy Polyakov <zbr@ioremap.net>,
linux-fsdevel@vger.kernel.org,
Linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: readahead on directories
Date: Wed, 21 Apr 2010 21:01:04 +0100
Message-ID: <20100421200104.GT27575@shareable.org>
In-Reply-To: <4BCF509E.2040903@cfl.rr.com>
Phillip Susi wrote:
> On 4/21/2010 2:51 PM, Jamie Lokier wrote:
> > Fwiw, I found sorting directories by inode and reading them in that
> > order help to reduce seeks, some 10 years ago. I implemented
> > something like 'find' which works like that, keeping a queue of
> > directories to read and things to open/stat, ordered by inode number
> > seen in d_ino before open/stat and st_ino after. However it did not
> > try to readahead the blocks inside a directory, or sort operations by
> > block number. It reduced some 'find'-like operations to about a
> > quarter of the time on cold cache. I still use that program sometimes
> > before "git status" ;-) Google "treescan" and "lokier" if you're
> > interested in trying it (though I use 0.7 which isn't published).
>
> That helps with open()ing or stat()ing the files since you access the
> inodes in order, but ureadahead already preloads all of the inode tables
> so this won't help.
It helps a little with data access too, because block group locality
tends to follow inode numbers. Don't read inodes and data in the same
batch, though.
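A minimal sketch of the inode-ordered approach described above, written
in Python for brevity. This is not the treescan program; the function
names are hypothetical, and it assumes a filesystem where inode numbers
roughly track on-disk location (as on ext2/3/4). It sorts entries by the
inode number from the directory entry (d_ino), stat()s them in that
order, and only afterwards reads file data in a separate pass.

```python
import os
import stat


def stat_in_inode_order(directory):
    """Return (name, stat_result) pairs, lstat()ing entries in d_ino order."""
    # os.scandir() exposes the inode number stored in the directory entry,
    # so the sort itself normally needs no stat() calls on Linux.
    with os.scandir(directory) as it:
        entries = sorted(it, key=lambda e: e.inode())
    # Pass 1: touch inodes only, in ascending inode-number order.
    return [(e.name, os.lstat(e.path)) for e in entries]


def read_data_in_inode_order(directory):
    """Read every regular file, in st_ino order; returns total bytes read."""
    stats = stat_in_inode_order(directory)
    # Pass 2: data reads are kept in a separate batch from the inode reads.
    regular = sorted(
        (st.st_ino, name) for name, st in stats if stat.S_ISREG(st.st_mode)
    )
    total = 0
    for _ino, name in regular:
        with open(os.path.join(directory, name), "rb") as f:
            total += len(f.read())
    return total
```

Whether this saves seeks depends on the filesystem's allocation policy
and on what is already cached, so it is worth measuring on a cold cache
before relying on it.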
> >> it is not about readdir(). Plain read() is synchronous too. But
> >> filesystem can respond to readahead calls and read next block to current
> >> one, while it won't do this for next direntry.
> >
> > I'm surprised it makes much difference, as directories are usually not
> > very large anyway.
>
> That's just it; it doesn't help. That's why I want to readahead() all
> of the directories at once instead of reading them one block at a time.
Ok, this discussion has got a bit confused. The text above refers to
needing to asynchronously read the next block in a directory, but if
directories are small then that's not important.
> > But if it does, go on, try FIEMAP and blockdev reading, you know you
> > want to :-)
>
> Why reinvent the wheel when that's readahead()'s job? As a workaround
> I'm about to try just threading all of the calls to open().
The FIEMAP suggestion only applies if you think you need to issue reads
for multiple blocks in the _same_ directory in parallel. From what you
say, I doubt that's important.

FIEMAP is not relevant for reading different directories in parallel.
You'd still have to thread the FIEMAP calls for that - it's a
different problem.
> Each one will queue a read and block, but with them all doing so at
> once should fill the queue with plenty of reads. It is inefficient,
> but better than one block at a time.
That was my first suggestion: threads with readdir(). I thought it had
been rejected, hence the further discussion.

(Actually I would use clone + open + getdirentries + a tiny userspace
stack to avoid using tons of memory. But that's just a tweak, only to
be used if the threading is effective.)
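The threaded approach discussed above can be sketched roughly as
follows, in Python rather than with raw clone(). The function name and
the worker count are hypothetical. Each worker blocks in its own
directory read, so several reads can sit in the I/O queue at once;
CPython releases its global lock during os.listdir(), so the calls do
overlap.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def list_directories_in_parallel(directories, max_workers=16):
    """Read many directories concurrently; returns {path: names or None}."""

    def read_one(path):
        try:
            # Blocks until this directory's blocks are read; other worker
            # threads keep their own reads queued in the meantime.
            return path, os.listdir(path)
        except OSError:
            # Unreadable or vanished directory: record it and carry on.
            return path, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(read_one, directories))
```

A rough analogue of the "tiny userspace stack" tweak would be calling
threading.stack_size() with a small value before the pool is created,
though the minimum and granularity it accepts are platform dependent.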
-- Jamie