linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Phillip Susi <psusi@cfl.rr.com>
To: Jamie Lokier <jamie@shareable.org>
Cc: linux-fsdevel@vger.kernel.org,
	Linux-kernel <linux-kernel@vger.kernel.org>
Subject: Re: readahead on directories
Date: Wed, 21 Apr 2010 14:10:54 -0400	[thread overview]
Message-ID: <4BCF3FAE.7090206@cfl.rr.com> (raw)
In-Reply-To: <20100421161211.GC27575@shareable.org>

On 4/21/2010 12:12 PM, Jamie Lokier wrote:
> Asynchronous is available: Use clone or pthreads.

Synchronous in another process is not the same as async.  It seems I'm
going to have to do this for now as a workaround, but one of the reasons
that aio was created was to avoid the inefficiencies this introduces.
Why create a new thread context, switch to it, put a request in the
queue, then sleep, when you could just drop the request in the queue in
the original thread and move on?

> A quick skim of fs/{ext3,ext4}/dir.c finds a call to
> page_cache_sync_readahead.  Doesn't that do any reading ahead? :-)

Unfortunately it does not help when it is synchronous.  The process
still sleeps until it has fetched the blocks it needs.  I believe that
code just ends up doing a single 4kb read if the directory is no larger
than that, or if it is, then it reads up to readahead_size.  It puts the
request in the queue then sleeps until all the data has been read, even
if only the first 4kb was required before readdir() could return.

This means that a single thread calling readdir() is still going to
block reading the directory before it can move on to trying to read
other directories that are also needed.

> I/O is the probably the biggest cost, so it's more important to get
> the I/O pattern you want than worrying about return values you'll discard.

True, but it would be nice not to waste cpu cycles copying unneeded data
around.

> If not, fs/ext4/namei.c:ext4_dir_inode_operations points to
> ext4_fiemap.  So you may have luck calling FIEMAP or FIBMAP on the
> directory, and then reading blocks using the block device.  I'm not
> sure if the cache loaded via the block device (when mounted) will then
> be used for directory lookups.

Yes, I had considered that.  ureadahead already makes use of ext2fslibs
to open the block device and read the inode tables so they are already
in the cache for later use.  It seems a bit silly to do that though,
when that is exactly what readahead() SHOULD do for you.

  reply	other threads:[~2010-04-21 19:12 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-04-19 15:51 readahead on directories Phillip Susi
2010-04-21  0:44 ` Jamie Lokier
2010-04-21 14:57   ` Phillip Susi
2010-04-21 16:12     ` Jamie Lokier
2010-04-21 18:10       ` Phillip Susi [this message]
2010-04-21 20:22         ` Jamie Lokier
2010-04-21 20:59           ` Phillip Susi
2010-04-21 22:06             ` Jamie Lokier
2010-04-22  7:01               ` Brad Boyer
2010-04-22 14:26               ` Phillip Susi
2010-04-22 17:53                 ` Jamie Lokier
2010-04-22 19:23                   ` Phillip Susi
2010-04-22 20:35                     ` Jamie Lokier
2010-04-22 21:22                       ` Phillip Susi
2010-04-22 22:43                         ` Jamie Lokier
2010-04-23  4:13                           ` Phillip Susi
2010-04-21 18:38       ` Evgeniy Polyakov
2010-04-21 18:51         ` Jamie Lokier
2010-04-21 18:56           ` Evgeniy Polyakov
2010-04-21 20:02             ` Jamie Lokier
2010-04-21 20:21               ` Evgeniy Polyakov
2010-04-21 20:39                 ` Jamie Lokier
2010-04-21 19:23           ` Phillip Susi
2010-04-21 20:01             ` Jamie Lokier
2010-04-21 20:13               ` Phillip Susi
2010-04-21 20:37                 ` Jamie Lokier
2010-05-07 13:38 ` unified page and buffer cache? (was: readahead on directories) Phillip Susi
2010-05-07 13:53   ` Matthew Wilcox
2010-05-07 15:45     ` unified page and buffer cache? Phillip Susi
2010-05-07 18:30       ` Matthew Wilcox
2010-05-08  0:50         ` Phillip Susi
2010-05-08  0:46       ` tytso
2010-05-08  0:54         ` Phillip Susi
2010-05-08 12:52           ` tytso

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4BCF3FAE.7090206@cfl.rr.com \
    --to=psusi@cfl.rr.com \
    --cc=jamie@shareable.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).