From: Hans Reiser <reiser@namesys.com>
To: Daniel Phillips <phillips@bonn-fries.net>
Cc: cs@zip.com.au, linux-kernel@vger.kernel.org, reiserfs-dev@namesys.com
Subject: Re: Ext2 directory index: ALS paper and benchmarks
Date: Fri, 07 Dec 2001 23:33:40 +0300
Message-ID: <3C1127A4.6070701@namesys.com>
In-Reply-To: <E16BjYc-0000hS-00@starship.berlin> <20011207141913.A26225@zapff.research.canon.com.au> <3C109FE3.5070107@namesys.com> <E16CMN6-0000t8-00@starship.berlin>
Daniel Phillips wrote:
>On December 7, 2001 11:54 am, Hans Reiser wrote:
>
>>Cameron Simpson wrote:
>>
>>>On Thu, Dec 06, 2001 at 06:41:17AM +0300, Hans Reiser <reiser@namesys.com> wrote:
>>>
>>>| Have you ever seen an application that creates millions of files create
>>>| them in random order?
>>>
>>>I can readily imagine one. An app which stashes things sent by random
>>>other things (usenet/email attachment trollers? security cameras taking
>>>thousands of still photos a day?). Mail services like hotmail, with a
>>>zillion mail spools, being made and deleted and accessed at random...
>>>
>>Ok, they exist, but they are the 20% not the 80% case, and for that
>>reason preserving order in hashing is a legitimate optimization.
>>
>
>At least, I think you ought to make a random hash the default. You're
>suffering badly on the 'random name' case, which I don't think is all that
>rare. I'll run that test again with some of your hashes and see what happens.
>
>>If names are truly random ordered, then the only optimization that can
>>help is compression so as to cause the working set to still fit into RAM.
>>
>
>You appear to be mixing up the idea of random characters in the names with
>random processing order. IMHO, the exact characters in a file name should
>not affect processing efficiency at all, and I went out of my way to make
>that true with HTree.
>
If the characters in the name determine the point of insertion, and the
extent to which processing order correlates with the point of insertion
determines how well caching works, then do you see my viewpoint?
Sure, nobody "should" have to engage in locality of reference, but God
was not concerned somehow, and so disk drives make us all get very
worried about locality of reference.
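
To make the caching argument above concrete, here is a toy sketch (not
from this thread, and not ReiserFS or HTree code). It assumes a directory
is a sorted run of leaf blocks, that a name's block is its rank in key
order, and that a small LRU cache holds a few blocks. ENTRIES_PER_BLOCK,
CACHE_BLOCKS, the two key functions, and the file names are all
hypothetical, chosen only for illustration.

```python
import hashlib
import random
from collections import OrderedDict

ENTRIES_PER_BLOCK = 100  # assumed: directory entries per leaf block
CACHE_BLOCKS = 8         # assumed: leaf blocks that fit in the cache


def ordered_key(name):
    """Order-preserving key: the characters decide the insertion point."""
    return name.encode()


def scrambled_key(name):
    """Randomizing key: insertion point is unrelated to the characters."""
    return hashlib.sha256(name.encode()).digest()


def count_misses(names, key_fn):
    """Process names in the given order; return LRU misses on leaf blocks."""
    # A name's leaf block is its rank in key order, divided into blocks.
    ranked = sorted(names, key=key_fn)
    block_of = {name: rank // ENTRIES_PER_BLOCK for rank, name in enumerate(ranked)}
    cache = OrderedDict()  # block number -> None, least recently used first
    misses = 0
    for name in names:
        block = block_of[name]
        if block in cache:
            cache.move_to_end(block)
        else:
            misses += 1
            cache[block] = None
            if len(cache) > CACHE_BLOCKS:
                cache.popitem(last=False)  # evict least recently used
    return misses


# Names created in sequence, and the same names in a shuffled order.
sequential = [f"file{i:06d}" for i in range(10000)]
shuffled = sequential[:]
random.Random(0).shuffle(shuffled)

sequential_ordered = count_misses(sequential, ordered_key)
sequential_scrambled = count_misses(sequential, scrambled_key)
shuffled_ordered = count_misses(shuffled, ordered_key)

print("sequential order, order-preserving key:", sequential_ordered)
print("sequential order, randomizing key:     ", sequential_scrambled)
print("shuffled order, order-preserving key:  ", shuffled_ordered)
```

In the first case each of the 100 blocks is faulted in exactly once. In
the other two, most lookups should miss, because processing order no
longer correlates with the point of insertion. The third case is the one
where neither kind of key helps, since the processing order itself is
random.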
>
>
>On the other hand, the processing order of names does and will always matter
>a great deal in terms of cache footprint.
>
>I should have done random stat benchmarks too, since we'll really see the
>effects of processing order there. I'll put that on my to-do list.
>
>--
>Daniel
>
>
We should give Yura and Green a chance to run some benchmarks before I
get into too much analyzing. I have learned not to comment before
seeing complete benchmarks.
Hans