From: Hans Reiser <reiser@namesys.com>
To: "Ragnar Kjørstad" <reiserfs@ragnark.vestdata.no>
Cc: Daniel Phillips <phillips@bonn-fries.net>,
linux-kernel@vger.kernel.org, reiserfs-dev@namesys.com,
Nikita Danilov <god@namesys.com>,
green@thebsh.namesys.com
Subject: Re: [reiserfs-dev] Re: Ext2 directory index: ALS paper and benchmarks
Date: Sat, 08 Dec 2001 00:01:20 +0300 [thread overview]
Message-ID: <3C112E20.2080105@namesys.com> (raw)
In-Reply-To: <E16BjYc-0000hS-00@starship.berlin> <3C0EE8DD.3080108@namesys.com> <20011206122753.A9253@vestdata.no> <E16CNHk-0000u4-00@starship.berlin> <20011207174726.B6640@vestdata.no>
Ragnar Kjørstad wrote:
>On Fri, Dec 07, 2001 at 04:51:33PM +0100, Daniel Phillips wrote:
>
>>I did try R5 in htree, and at least a dozen other hashes. R5 was the worst
>>of the bunch, in terms of uniformity of distribution, and caused a measurable
>>slowdown in Htree performance. (Not an order of magnitude, mind you,
>>something closer to 15%.)
>>
>
>That sounds reasonable.
>
You are more dependent on hash uniformity than we are. We have a
balancing algorithm that manages space, you use hashing to manage your
space. It is a weakness of your approach (which is not to say your
approach is a bad one).
>
>
>
>>An alternative way of looking at this is, rather than R5 causing an order of
>>magnitude improvement for certain cases, something else is causing an order
>>of magnitude slowdown for common cases. I'd suggest attempting to root that
>>out.
>>
>
>In the cases I've studied more closely (e.g. maildir cases) the problem
>with reiserfs and e.g. the tea hash is that there is no common ordering
>between directory entries, stat-data and file-data.
>
>When new files are created in a directory, the file-data tend to be
>allocated somewhere after the last allocated file in the directory. The
>ordering of the directory-entry and the stat-data (hmm, both?) are
>
no, actually this is a problem for v3. stat data are time of creation
ordered (very roughly speaking)
and directory entries are hash ordered, meaning that ls -l suffers a
major performance penalty.
This might well affect our performance vs. htree, I don't know where
Daniel puts his stat data.
This matter is receiving attention in V4, and Nikita and I need to have
a seminar on it next week.
>
>however dependent on the hash. So, with something like the tea hash the
>new file will be inserted in the middle of the directory.
>
>
>In addition to the random lookup type reads, there are three other common
>scenarios for reading the files:
>
>1 Reading them in the same order they were created
>The cache will probably not be 100% effective on the
>directory/stat-data, because it's beeing accessed in a random-like
>order. Read-ahead for the file-data on the other hand will be effective.
>
>2 Reading the files in filename-order
>Some applications (say, ls -l) may do this, but I doubt it's a very
>common accesspattern. Cache-hit for directory-data will be poor, and
>cache-hit for file-data will be poor unless the files were created in
>the same order.
>
>3 Reading the files in readdir() order.
>This is what I think is the most common access-pattern. I would expect a
>lot of programs (e.g. mail clients using maildir) to read the directory
>and for every filename stat the file and read the data. This will be in
>optimal order for directory-caching, but more importantly it will be
>random-order like access for the file-data.
>
>I think scenario nr 3 is the one that matters, and I think it is this
>scenario that makes r5 faster than tea in real-life applications on
>reiserfs. (allthough most numbers available are from benchmarks and not
>real life applications).
>
>The directory content is likely to all fit in cache even for a fairly
>large directory, so cache-misses are not that much of a problem. The
>file-data itself however, will suffer if read-ahead can't start reading
>the next file from disk while the first one is beeing processed.
>
>
>
>I'm counting on Hans or someone else from the reiserfs team to correct
>me if I'm wrong.
>
>
Users who want to speedup reiserfs V3 read/stat performance can do so by
copying directories after creating them, and this way readdir order
equals stat data order. Sad, I know. Only a really fanatic sysadmin is
going to create his reiserfs installs using a master image that has
experienced a cp, but it will make things significantly faster if he
does. Green, add this to the FAQ.
We need to fix this, it is a missed opportunity for higher performance.
V4 I hope.
Hans
next prev parent reply other threads:[~2001-12-07 21:02 UTC|newest]
Thread overview: 50+ messages / expand[flat|nested] mbox.gz Atom feed top
2001-12-05 21:26 Ext2 directory index: ALS paper and benchmarks Daniel Phillips
2001-12-06 3:41 ` Hans Reiser
2001-12-06 3:54 ` Daniel Phillips
2001-12-06 3:56 ` Hans Reiser
2001-12-06 4:08 ` Daniel Phillips
2001-12-06 13:44 ` Hans Reiser
2001-12-06 17:22 ` Daniel Phillips
2001-12-07 0:13 ` [reiserfs-dev] " Hans Reiser
2001-12-07 4:39 ` Daniel Phillips
2001-12-07 12:36 ` Hans Reiser
2001-12-07 14:35 ` Daniel Phillips
2001-12-07 20:16 ` Hans Reiser
2001-12-06 11:27 ` Ragnar Kjørstad
2001-12-07 15:51 ` Daniel Phillips
2001-12-07 16:47 ` Ragnar Kjørstad
2001-12-07 17:41 ` Daniel Phillips
2001-12-07 18:03 ` Ragnar Kjørstad
2001-12-07 18:18 ` Daniel Phillips
2001-12-07 21:10 ` Hans Reiser
2001-12-07 21:12 ` Hans Reiser
2001-12-07 18:32 ` Andrew Morton
2001-12-07 19:46 ` Daniel Phillips
2001-12-07 20:00 ` Andrew Morton
2001-12-08 7:19 ` Linus Torvalds
2001-12-08 17:32 ` Daniel Phillips
2001-12-08 17:54 ` Jeff Garzik
2001-12-09 3:27 ` Daniel Phillips
2001-12-09 4:19 ` Linus Torvalds
2001-12-09 16:29 ` Alan Cox
2001-12-09 20:13 ` Daniel Phillips
2001-12-10 6:27 ` Linus Torvalds
2001-12-10 6:49 ` Alexander Viro
2001-12-10 8:32 ` Alan Cox
2001-12-10 16:14 ` Daniel Phillips
2001-12-08 20:28 ` Hans Reiser
2001-12-08 21:10 ` Ragnar Kjørstad
2001-12-07 21:01 ` Hans Reiser [this message]
2001-12-07 22:56 ` Ragnar Kjørstad
2001-12-08 0:15 ` Hans Reiser
2001-12-08 19:16 ` Ragnar Kjørstad
2001-12-08 19:55 ` Hans Reiser
2001-12-09 2:47 ` Daniel Phillips
2001-12-09 2:39 ` Daniel Phillips
2001-12-08 18:02 ` Jeremy Fitzhardinge
2001-12-09 2:24 ` Daniel Phillips
2001-12-07 3:19 ` Cameron Simpson
2001-12-07 10:54 ` Hans Reiser
2001-12-07 14:53 ` Daniel Phillips
2001-12-07 20:33 ` Hans Reiser
2001-12-07 13:06 ` [reiserfs-dev] " Ragnar Kjørstad
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=3C112E20.2080105@namesys.com \
--to=reiser@namesys.com \
--cc=god@namesys.com \
--cc=green@thebsh.namesys.com \
--cc=linux-kernel@vger.kernel.org \
--cc=phillips@bonn-fries.net \
--cc=reiserfs-dev@namesys.com \
--cc=reiserfs@ragnark.vestdata.no \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.