From: Hans Reiser <reiser@namesys.com>
To: "Ragnar Kjørstad" <reiserfs@ragnark.vestdata.no>
Cc: Daniel Phillips <phillips@bonn-fries.net>,
linux-kernel@vger.kernel.org, reiserfs-dev@namesys.com,
Nikita Danilov <god@namesys.com>,
green@thebsh.namesys.com
Subject: Re: [reiserfs-dev] Re: Ext2 directory index: ALS paper and benchmarks
Date: Sat, 08 Dec 2001 00:01:20 +0300 [thread overview]
Message-ID: <3C112E20.2080105@namesys.com> (raw)
In-Reply-To: <E16BjYc-0000hS-00@starship.berlin> <3C0EE8DD.3080108@namesys.com> <20011206122753.A9253@vestdata.no> <E16CNHk-0000u4-00@starship.berlin> <20011207174726.B6640@vestdata.no>
Ragnar Kjørstad wrote:
>On Fri, Dec 07, 2001 at 04:51:33PM +0100, Daniel Phillips wrote:
>
>>I did try R5 in htree, and at least a dozen other hashes. R5 was the worst
>>of the bunch, in terms of uniformity of distribution, and caused a measurable
>>slowdown in Htree performance. (Not an order of magnitude, mind you,
>>something closer to 15%.)
>>
>
>That sounds reasonable.
>
You are more dependent on hash uniformity than we are. We have a
balancing algorithm that manages space, you use hashing to manage your
space. It is a weakness of your approach (which is not to say your
approach is a bad one).
>
>
>
>>An alternative way of looking at this is, rather than R5 causing an order of
>>magnitude improvement for certain cases, something else is causing an order
>>of magnitude slowdown for common cases. I'd suggest attempting to root that
>>out.
>>
>
>In the cases I've studied more closely (e.g. maildir cases) the problem
>with reiserfs and e.g. the tea hash is that there is no common ordering
>between directory entries, stat-data and file-data.
>
>When new files are created in a directory, the file-data tend to be
>allocated somewhere after the last allocated file in the directory. The
>ordering of the directory-entry and the stat-data (hmm, both?) are
>
no, actually this is a problem for v3. stat data are time of creation
ordered (very roughly speaking)
and directory entries are hash ordered, meaning that ls -l suffers a
major performance penalty.
This might well affect our performance vs. htree, I don't know where
Daniel puts his stat data.
This matter is receiving attention in V4, and Nikita and I need to have
a seminar on it next week.
>
>however dependent on the hash. So, with something like the tea hash the
>new file will be inserted in the middle of the directory.
>
>
>In addition to the random lookup type reads, there are three other common
>scenarios for reading the files:
>
>1 Reading them in the same order they were created
>The cache will probably not be 100% effective on the
>directory/stat-data, because it's beeing accessed in a random-like
>order. Read-ahead for the file-data on the other hand will be effective.
>
>2 Reading the files in filename-order
>Some applications (say, ls -l) may do this, but I doubt it's a very
>common accesspattern. Cache-hit for directory-data will be poor, and
>cache-hit for file-data will be poor unless the files were created in
>the same order.
>
>3 Reading the files in readdir() order.
>This is what I think is the most common access-pattern. I would expect a
>lot of programs (e.g. mail clients using maildir) to read the directory
>and for every filename stat the file and read the data. This will be in
>optimal order for directory-caching, but more importantly it will be
>random-order like access for the file-data.
>
>I think scenario nr 3 is the one that matters, and I think it is this
>scenario that makes r5 faster than tea in real-life applications on
>reiserfs. (allthough most numbers available are from benchmarks and not
>real life applications).
>
>The directory content is likely to all fit in cache even for a fairly
>large directory, so cache-misses are not that much of a problem. The
>file-data itself however, will suffer if read-ahead can't start reading
>the next file from disk while the first one is beeing processed.
>
>
>
>I'm counting on Hans or someone else from the reiserfs team to correct
>me if I'm wrong.
>
>
Users who want to speedup reiserfs V3 read/stat performance can do so by
copying directories after creating them, and this way readdir order
equals stat data order. Sad, I know. Only a really fanatic sysadmin is
going to create his reiserfs installs using a master image that has
experienced a cp, but it will make things significantly faster if he
does. Green, add this to the FAQ.
We need to fix this, it is a missed opportunity for higher performance.
V4 I hope.
Hans
next prev parent reply other threads:[~2001-12-07 21:02 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2001-12-05 21:26 Ext2 directory index: ALS paper and benchmarks Daniel Phillips
2001-12-06 3:41 ` Hans Reiser
2001-12-06 3:54 ` Daniel Phillips
2001-12-06 3:56 ` Hans Reiser
2001-12-06 4:08 ` Daniel Phillips
2001-12-06 13:44 ` Hans Reiser
2001-12-06 17:22 ` Daniel Phillips
2001-12-07 0:13 ` [reiserfs-dev] " Hans Reiser
2001-12-07 4:39 ` Daniel Phillips
2001-12-07 12:36 ` Hans Reiser
2001-12-07 14:35 ` Daniel Phillips
2001-12-07 20:16 ` Hans Reiser
2001-12-06 11:27 ` Ragnar Kjørstad
2001-12-07 15:51 ` Daniel Phillips
2001-12-07 16:47 ` Ragnar Kjørstad
2001-12-07 17:41 ` Daniel Phillips
2001-12-07 18:03 ` Ragnar Kjørstad
2001-12-07 18:18 ` Daniel Phillips
2001-12-07 21:10 ` Hans Reiser
2001-12-07 21:12 ` Hans Reiser
2001-12-07 18:32 ` Andrew Morton
2001-12-07 19:46 ` Daniel Phillips
2001-12-07 20:00 ` Andrew Morton
2001-12-08 7:19 ` Linus Torvalds
2001-12-08 17:32 ` Daniel Phillips
2001-12-08 17:54 ` Jeff Garzik
2001-12-09 3:27 ` Daniel Phillips
2001-12-09 4:19 ` Linus Torvalds
2001-12-09 16:29 ` Alan Cox
2001-12-09 20:13 ` Daniel Phillips
2001-12-10 6:27 ` Linus Torvalds
2001-12-10 6:49 ` Alexander Viro
2001-12-10 8:32 ` Alan Cox
2001-12-10 16:14 ` Daniel Phillips
2001-12-08 20:28 ` Hans Reiser
2001-12-08 21:10 ` Ragnar Kjørstad
2001-12-07 21:01 ` Hans Reiser [this message]
2001-12-07 22:56 ` Ragnar Kjørstad
2001-12-08 0:15 ` Hans Reiser
2001-12-08 19:16 ` Ragnar Kjørstad
2001-12-08 19:55 ` Hans Reiser
2001-12-09 2:47 ` Daniel Phillips
2001-12-09 2:39 ` Daniel Phillips
2001-12-08 18:02 ` Jeremy Fitzhardinge
2001-12-09 2:24 ` Daniel Phillips
2001-12-07 3:19 ` Cameron Simpson
2001-12-07 10:54 ` Hans Reiser
2001-12-07 14:53 ` Daniel Phillips
2001-12-07 20:33 ` Hans Reiser
2001-12-07 13:06 ` [reiserfs-dev] " Ragnar Kjørstad
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3C112E20.2080105@namesys.com \
--to=reiser@namesys.com \
--cc=god@namesys.com \
--cc=green@thebsh.namesys.com \
--cc=linux-kernel@vger.kernel.org \
--cc=phillips@bonn-fries.net \
--cc=reiserfs-dev@namesys.com \
--cc=reiserfs@ragnark.vestdata.no \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox