From: Daniel Phillips <phillips@bonn-fries.net>
To: Goswin Brederlow <goswin.brederlow@student.uni-tuebingen.de>,
Pavel Machek <pavel@suse.cz>
Cc: Bill Crawford <billc@netcomuk.co.uk>,
Linux Kernel <linux-kernel@vger.kernel.org>,
"H. Peter Anvin" <hpa@transmeta.com>,
Daniel Phillips <phillips@innominate.de>
Subject: Re: Hashing and directories
Date: Fri, 27 Apr 2001 18:20:19 +0200 [thread overview]
Message-ID: <01042718201900.01336@starship> (raw)
In-Reply-To: <3A959BFD.B18F833@netcomuk.co.uk> <20000101020213.D28@(none)> <87ofvcv3dj.fsf@mose.informatik.uni-tuebingen.de>
In-Reply-To: <87ofvcv3dj.fsf@mose.informatik.uni-tuebingen.de>
On Thursday 08 March 2001 13:42, Goswin Brederlow wrote:
> >>>>> " " == Pavel Machek <pavel@suse.cz> writes:
> > Hi!
> >
> >> I was hoping to point out that in real life, most systems that
> >> need to access large numbers of files are already designed to
> >> do some kind of hashing, or at least to divide-and-conquer by
> >> using multi-level directory structures.
> >>
> > Yes -- because their workaround kernel slowness.
> >
> > I had to do this kind of hashing because kernel disliked 70000
> > html files (copy of train time tables).
> >
> > BTW try rm * with 70000 files in directory -- command line will
> > overflow.
>
> There are filesystems that use btrees (reiserfs) or hashing (affs) for
> directories.
>
> That way you get a O(log(n)) or even O(1) access time for
> files. Saddly the hashtable for affs depends on the blocksize and
> linux AFAIK only allows far too small block sizes (512 byte) for affs.
> It was designed for floppies, so the lack of dynamically resizing hash
> tables is excused.
>
> What also could be done is to keed directories sorted. Creating of
> files would cost O(N) time but a lookup could be done in
> O(log(log(n))) most of the time with reasonable name distribution.
> This could be done with ext2 without breaking any compatibility. One
> would need to convert (sort all directories) every time the FS was
> mounted RW by an older ext2, but otherwise nothing changes.
>
> Would you like to write support for this?
Hi, I missed this whole thread at the time, ironically, because I was
too busy working on my hash-keyed directory index.
How do you get log(log(n))? The best I can do is logb(n), with
b=large. For practical purposes this is O(1).
The only problem I ran into is the mismatch between the storage order
of the sorted directory and that of the inodes, which causes thrashing
in the inode table. I was able to eliminate this thrashing completely
from user space by processing the files in inode order. I went on to
examine ways of eliminating the thrashing without help from user space,
and eventually came up with a good way of doing that that relies on
setting an inode allocation target that corresponds loosely to the sort
order.
The thrashing doesn't hurt much anyway compared to the current N**2
behaviour. For a million files I saw a 4X slowdown for delete vs
create. Create shows no thrashing because it works in storage order
in the inodes, with the directory blocks themselves being smaller by
a factor of 6-7, so not contributing significantly to the cache
pressure. Compare this to the 150 times slowdown you see with normal
Ext2, creating 100,000 files.
--
Daniel
next prev parent reply other threads:[~2001-04-27 16:20 UTC|newest]
Thread overview: 31+ messages / expand[flat|nested] mbox.gz Atom feed top
2001-02-22 23:08 Hashing and directories Bill Crawford
2000-01-01 2:02 ` Pavel Machek
2001-03-01 20:54 ` Alexander Viro
2001-03-01 21:05 ` H. Peter Anvin
2001-03-01 21:13 ` Alexander Viro
2001-03-01 21:24 ` H. Peter Anvin
2001-03-02 9:04 ` Pavel Machek
2001-03-02 12:01 ` Oystein Viggen
2001-03-02 12:26 ` Tobias Ringstrom
2001-03-02 12:58 ` David Weinehall
2001-03-02 19:33 ` Tim Wright
2001-03-12 10:05 ` Herbert Xu
2001-03-12 10:43 ` Xavier Bestel
2001-03-01 21:23 ` Andreas Dilger
2001-03-01 21:26 ` Bill Crawford
2001-03-01 21:05 ` Tigran Aivazian
2001-03-02 8:56 ` Pavel Machek
2001-03-07 0:37 ` Jamie Lokier
2001-03-07 4:03 ` Linus Torvalds
2001-03-07 13:41 ` Jamie Lokier
2001-03-02 9:00 ` Pavel Machek
2001-03-03 0:03 ` Bill Crawford
2001-03-08 12:42 ` Goswin Brederlow
2001-04-27 16:20 ` Daniel Phillips [this message]
2001-02-22 23:22 ` H. Peter Anvin
2001-02-22 23:54 ` Bill Crawford
2001-03-10 11:22 ` Kai Henningsen
-- strict thread matches above, loose matches on Subject: below --
2001-03-07 15:56 Manfred Spraul
2001-03-07 16:10 ` Jamie Lokier
2001-03-07 16:23 ` Manfred Spraul
2001-03-07 18:21 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=01042718201900.01336@starship \
--to=phillips@bonn-fries.net \
--cc=billc@netcomuk.co.uk \
--cc=goswin.brederlow@student.uni-tuebingen.de \
--cc=hpa@transmeta.com \
--cc=linux-kernel@vger.kernel.org \
--cc=pavel@suse.cz \
--cc=phillips@innominate.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.