From: Radek Pazdera <rpazdera@redhat.com>
To: Andreas Dilger <adilger@dilger.ca>
Cc: "Theodore Ts'o" <tytso@mit.edu>,
linux-ext4@vger.kernel.org, "Lukáš Czerner" <lczerner@redhat.com>
Subject: Re: [RFC] Optimizing readdir()
Date: Tue, 29 Jan 2013 17:38:46 +0100 [thread overview]
Message-ID: <20130129163846.GA17169@laptop.brq.redhat.com> (raw)
In-Reply-To: <B52E6F5A-F266-433C-AE80-2E78FD6920B9@dilger.ca>
[-- Attachment #1: ext4m.txt --]
[-- Type: text/plain, Size: 1775 bytes --]
On Tue, Jan 15, 2013 at 03:44:57PM -0700, Andreas Dilger wrote:
>Having an upper limit on the directory cache is OK too. Read all
>of the entries that fit into the cache size, sort them, and return
>them to the caller. When the caller has processed all of those
>entries, read another batch, sort it, return this list, repeat.
>
>As long as the list is piecewise ordered, I suspect it would gain
>most of the benefit of linear ordering (sequential inode table
>reads, avoiding repeated lookups of blocks). Maybe worthwhile if
>you could test this out?
I did the tests last week. I modified the spd_readdir preload to
read at most $SPD_READDIR_CACHE_LIMIT entries, sort them and repeat.
The patch is here:
http://www.stud.fit.vutbr.cz/~xpazde00/soubory/dir-index-test-ext4/
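
For illustration, the batching scheme works roughly as follows (a Python
sketch of the idea only, not the actual spd_readdir code; the function
name and default limit are made up for this example):

```python
# Piecewise-ordered directory reading: fetch at most `limit` entries,
# sort the batch by inode number, hand them out, then refill.
import os
from itertools import islice

def batched_sorted_scandir(path, limit=1000):
    """Yield (inode, name) pairs, sorted within batches of `limit`."""
    with os.scandir(path) as it:
        while True:
            batch = [(e.inode(), e.name) for e in islice(it, limit)]
            if not batch:
                break
            # Sort only this batch, so memory stays bounded by
            # `limit` entries while inode-table reads stay mostly
            # sequential within each batch.
            batch.sort()
            yield from batch
```

With limit=0 treated as "no limit" (as in the test runs below), the whole
directory would be sorted in one batch.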
I tested it with the limit set to 0 (i.e., no limit), 1000, 10000,
50000, and completely without the preload. The test runs were
performed on the same directory, so the results shouldn't be
affected by positioning on disk.
Directory sizes ranged from 10k to 1.5M entries. Each test was run
twice: the first run with metadata only (empty files), the second with
4096B of data in each file.
Here are the results:
0B files:
http://www.stud.fit.vutbr.cz/~xpazde00/soubory/dir-index-test-ext4/0B-files
4096B files:
http://www.stud.fit.vutbr.cz/~xpazde00/soubory/dir-index-test-ext4/4096B-files/
The times decrease as the cache limit increases. The differences are
bigger in the case of 4096B files, where the data blocks start to
evict the inode tables. However, copying is still more than twice as
slow for 1.5M files when 50000 entries are cached.
It might be interesting to test what happens when the size of the
files in the directory increases.
Best regards,
Radek
Thread overview: 9+ messages
2013-01-13 15:22 [RFC] Optimizing readdir() Radek Pazdera
2013-01-14 4:51 ` Theodore Ts'o
2013-01-14 6:09 ` Stewart Smith
2013-01-15 7:21 ` Radek Pazdera
2013-01-15 22:44 ` Andreas Dilger
2013-01-17 15:53 ` Radek Pazdera
2013-01-29 16:38 ` Radek Pazdera [this message]
2013-01-30 11:34 ` Lukáš Czerner
2013-02-02 19:45 ` Andreas Dilger