From: Ric Wheeler <ricwheeler@gmail.com>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Florian Weimer <fweimer@bfk.de>,
Eric Sandeen <sandeen@redhat.com>,
Phillip Susi <psusi@cfl.rr.com>,
"linux-ext4@vger.kernel.org" <linux-ext4@vger.kernel.org>
Subject: Re: Large directories and poor order correlation
Date: Tue, 15 Mar 2011 07:23:14 -0400
Message-ID: <4D7F4C22.9060801@gmail.com>
In-Reply-To: <4C11D2E5-75CD-4A9F-A534-EEC16CDD836B@mit.edu>

On 03/15/2011 07:06 AM, Theodore Tso wrote:
> On Mar 15, 2011, at 3:59 AM, Florian Weimer wrote:
>
>> * Eric Sandeen:
>>
>>> No, because htree (dir_index) dirs return names in hash-value
>>> order, not inode number order. i.e. "at random."
>>>
>>> As you say, sorting by inode number will work much better...
>> The dpkg folks tested this and it turns out that you get better
>> results if you open the file and use FIBMAP to get the first block
>> number, and sort by that. You could sort by inode number before the
>> open/fstat calls, but it does not seem to help much.
> It depends on which problem you are trying to solve. If this is a cold
> cache situation, and the inode cache is empty, then sorting by inode
> number will help since otherwise you'll be seeking all over just to
> read in the inode structures. This is true for any kind of readdir+stat
> combination, whether it's ls -l, or du or readdir + FIBMAP (I'd
> recommend using FIEMAP these days, though).
>
> However, if you need to suck in the information for a large number of
> small files (such as all of the files in /var/lib/dpkg/info), then sure, sorting
> on the block number can help reduce seeks on the data blocks side of
> things.
>
> So in an absolutely cold cache situation, what I'd recommend is readdir,
> sort by inode, FIEMAP, sort by block, and then read in the dpkg files.
> Of course an RPM partisan might say, "it would help if you guys had
> used a real database instead of ab(using) the file system." And then
> the dpkg guys could complain about what happens when RPM has to
> deal with corrupted rpm database, and how this allows dpkg to use
> shell scripts to access their package information. Life is full of tradeoffs.
>
> -- Ted
>
I have tested both sorting techniques with very large directories.
Most of the gain came with the simple sorting by inode number, but of course
this relies on the file system allocation policy having a correlation between
the inode numbers and layout (i.e., higher inode numbers correspond to higher
block numbers).
Note that you can get the inode number used in this sorting without doing any
stat calls.
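
To illustrate the point about avoiding stat calls, here is a minimal sketch
in Python, not something taken from dpkg or any other tool in this thread.
It assumes a Unix system, where os.scandir() exposes the d_ino value that
readdir(3) has already returned, so no per-file stat call is made. The
function name names_by_inode is hypothetical.

```python
import os


def names_by_inode(dirpath):
    """Return the entry names in dirpath, ordered by inode number.

    DirEntry.inode() reports the d_ino field that readdir(3) already
    returned, so on Unix this loop does not stat() each entry.
    """
    with os.scandir(dirpath) as entries:
        pairs = [(entry.inode(), entry.name) for entry in entries]
    pairs.sort()  # ascending inode number
    return [name for _, name in pairs]
```

A caller would then open or stat the files in the returned order, for
example `for name in names_by_inode("/var/lib/dpkg/info"): ...`. How much
this helps depends on the allocation policy correlation mentioned above.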
Sorting by first block number also works well, but it costs extra syscalls
per file (probably two: open and FIBMAP?).
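
The two-pass approach (inode order first, then first-block order) can be
sketched as follows. This is only an illustration under several assumptions:
it uses the older FIBMAP ioctl rather than FIEMAP to keep the code short, it
takes FIBMAP = 1 from <linux/fs.h>, and it assumes Linux. FIBMAP normally
needs CAP_SYS_RAWIO and is not supported on every file system, so the sketch
falls back to the inode number when the ioctl fails. The function names
first_block and names_by_first_block are hypothetical.

```python
import fcntl
import os
import struct

FIBMAP = 1  # _IO(0x00, 1) in <linux/fs.h>; assumed, Linux only


def first_block(path):
    """Return the physical block of logical block 0, or None if unavailable."""
    fd = os.open(path, os.O_RDONLY)
    try:
        arg = struct.pack("i", 0)  # in: logical block 0, out: physical block
        try:
            out = fcntl.ioctl(fd, FIBMAP, arg)
        except OSError:
            return None  # not privileged, or FIBMAP unsupported here
        return struct.unpack("i", out)[0]
    finally:
        os.close(fd)


def names_by_first_block(dirpath):
    """Return regular-file names ordered by first data block.

    Files whose block number cannot be read are placed last, in inode order.
    """
    with os.scandir(dirpath) as entries:
        # Pass 1: inode order, so the open() calls below touch the inode
        # table roughly sequentially. is_file() is usually answered from
        # d_type without a stat call.
        by_inode = sorted(
            (e.inode(), e.name)
            for e in entries
            if e.is_file(follow_symlinks=False)
        )
    keyed = []
    for ino, name in by_inode:
        block = first_block(os.path.join(dirpath, name))
        # Pass 2 key: real block numbers first, inode fallback afterwards.
        key = (block is None, ino if block is None else block)
        keyed.append((key, name))
    keyed.sort()
    return [name for _, name in keyed]
```

The extra cost is visible in the loop: one open and one ioctl per file,
on top of the readdir pass.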
Ric