linux-ext4.vger.kernel.org archive mirror
From: Ted Ts'o <tytso@mit.edu>
To: Bernd Schubert <bernd.schubert@fastmail.fm>
Cc: Bernd Schubert <bernd.schubert@itwm.fraunhofer.de>,
	linux-ext4@vger.kernel.org, adilger@whamcloud.com,
	colyli@gmail.com
Subject: Re: [PATCH 2/3] ext4 directory index: read-ahead blocks v2
Date: Sun, 17 Jul 2011 20:23:14 -0400
Message-ID: <20110718002314.GD2717@thunk.org>
In-Reply-To: <31F2FDF7-1CED-4831-87E1-861752A6899A@mit.edu>

> On Jul 16, 2011, at 9:02 PM, Bernd Schubert wrote:
> 
> > I don't understand it either yet why we have so many, but each directory
> > has about 20 to 30 index blocks

OK, I think I know what's going on.  Those aren't 20-30 index blocks;
those are 20-30 leaf blocks.  Your directories are approximately
80-120k each, right?

So what your patch is doing is constantly doing readahead to bring the
*entire* directory into the buffer cache any time you do a dx_probe.
That's definitely not what we would want to enable by default, but I
really don't like the idea of adding Yet Another Mount option.  It
expands our testing effort, and the reality is very few people will
take advantage of the mount option.

How about this?  What if we don't actually perform readahead, but
instead try to look up all of the blocks to see if they are in the
buffer cache using sb_find_get_block().  If a block is in the buffer
cache, it will get touched, so it will be less likely to be evicted
from the page cache.  So for a workload like yours, it should do what
you want.  But it won't cause all of the pages to get pulled in after
the first reference of the directory in question.
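
A rough userspace sketch of the "look up, don't read" idea, under
heavy assumptions: this is not kernel code, and toy_cache,
toy_find_get_block(), toy_bread() and toy_touch_cached() are all
hypothetical names modelling a tiny LRU cache.  The only point it
illustrates is that a lookup-only pass refreshes blocks that are
already cached and never triggers I/O for the ones that are not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SLOTS 4	/* deliberately tiny so eviction is easy to see */

/* Toy stand-in for the buffer cache: fixed slots with LRU timestamps. */
struct toy_cache {
	unsigned long block[CACHE_SLOTS];
	unsigned long last_used[CACHE_SLOTS];
	bool valid[CACHE_SLOTS];
	unsigned long clock;	/* logical time, bumped on every touch */
	unsigned long reads;	/* simulated disk reads */
};

/* Lookup only: a hit refreshes the slot's age, a miss does nothing. */
static bool toy_find_get_block(struct toy_cache *c, unsigned long blk)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		if (c->valid[i] && c->block[i] == blk) {
			c->last_used[i] = ++c->clock;
			return true;
		}
	}
	return false;
}

/* Read path: a miss costs one simulated read and evicts the LRU slot. */
static void toy_bread(struct toy_cache *c, unsigned long blk)
{
	size_t victim = 0;

	if (toy_find_get_block(c, blk))
		return;
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		if (!c->valid[i]) {
			victim = i;	/* free slot, no eviction needed */
			break;
		}
		if (c->last_used[i] < c->last_used[victim])
			victim = i;
	}
	c->block[victim] = blk;
	c->valid[victim] = true;
	c->last_used[victim] = ++c->clock;
	c->reads++;
}

/*
 * Touch every directory block that is already cached, without reading
 * any that are not.  Returns the number of blocks found in the cache.
 */
static size_t toy_touch_cached(struct toy_cache *c,
			       const unsigned long *blocks, size_t n)
{
	size_t hits = 0;

	assert(blocks != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (toy_find_get_block(c, blocks[i]))
			hits++;
	return hits;
}
```

In this model, a directory whose blocks are touched on every probe
stays resident while unrelated blocks age out, yet the first probe of
a cold directory costs no extra reads.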

I'm still worried about the case of a very large directory (say an
unreaped tmp directory that has grown to be tens of megabytes).  If a
program does a sequential scan through the directory doing a
"readdir+stat" (for example, a tmp cleaner or someone running the
command "ls -sF"), we probably shouldn't be trying to keep all of those
directory blocks in memory.  So if a sequential scan is detected, that
should probably suppress the calls to sb_find_get_block().
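
One possible shape for such a check, purely as an illustration: the
message does not say how a sequential scan would be detected, so the
heuristic below (a run of consecutive block numbers), the
SEQ_THRESHOLD value, and the names scan_state and scan_note_access()
are all assumptions, written as plain userspace C rather than ext4
code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SEQ_THRESHOLD 8	/* hypothetical: run length that counts as a scan */

/* Per-directory state; in a real design this would live somewhere
 * per-inode or per-open-file, which is left open here. */
struct scan_state {
	unsigned long last_block;
	unsigned int run;	/* consecutive "next block" accesses seen */
	bool primed;		/* false until the first access is recorded */
};

/*
 * Record an access to directory block blk.  Returns true when the
 * recent accesses look like a sequential scan, in which case the
 * caller would skip the cache-touching lookups.
 */
static bool scan_note_access(struct scan_state *s, unsigned long blk)
{
	assert(s != NULL);

	if (s->primed && blk == s->last_block + 1)
		s->run++;		/* extends the sequential run */
	else if (!(s->primed && blk == s->last_block))
		s->run = 0;		/* a jump: start over */
	/* a repeat of the same block leaves the run unchanged */

	s->last_block = blk;
	s->primed = true;
	return s->run >= SEQ_THRESHOLD;
}
```

With this sketch, random lookups never trip the threshold, while a
walk through consecutive blocks does after SEQ_THRESHOLD steps and
stays suppressed until the access pattern jumps elsewhere.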

					- Ted


Thread overview: 13+ messages
2011-06-20 20:28 [PATCH 0/3] directory index patch set v2 Bernd Schubert
2011-06-20 20:28 ` [PATCH 1/3] ext4: Fix compilation with -DDX_DEBUG v2 Bernd Schubert
     [not found]   ` <4E0012A4.1010608@gmail.com>
2011-06-21 15:26     ` Bernd Schubert
2011-07-16 23:41   ` Ted Ts'o
2011-07-19 15:02     ` Bernd Schubert
2011-07-20 21:25       ` Jan Kara
2011-06-20 20:28 ` [PATCH 2/3] ext4 directory index: read-ahead blocks v2 Bernd Schubert
2011-07-16 23:59   ` Ted Ts'o
2011-07-17  1:02     ` Bernd Schubert
2011-07-17 13:12       ` Theodore Tso
2011-07-18  0:23         ` Ted Ts'o [this message]
2011-07-19 14:22           ` Bernd Schubert
2011-06-20 20:29 ` [PATCH 3/3] dx read-ahead: Map blocks with a single semaphore lock Bernd Schubert
