public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Daniel Phillips <phillips@bonn-fries.net>
To: Goswin Brederlow <goswin.brederlow@student.uni-tuebingen.de>,
	Pavel Machek <pavel@suse.cz>
Cc: Bill Crawford <billc@netcomuk.co.uk>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	"H. Peter Anvin" <hpa@transmeta.com>,
	Daniel Phillips <phillips@innominate.de>
Subject: Re: Hashing and directories
Date: Fri, 27 Apr 2001 18:20:19 +0200	[thread overview]
Message-ID: <01042718201900.01336@starship> (raw)
In-Reply-To: <3A959BFD.B18F833@netcomuk.co.uk> <20000101020213.D28@(none)> <87ofvcv3dj.fsf@mose.informatik.uni-tuebingen.de>
In-Reply-To: <87ofvcv3dj.fsf@mose.informatik.uni-tuebingen.de>

On Thursday 08 March 2001 13:42, Goswin Brederlow wrote:
> >>>>> " " == Pavel Machek <pavel@suse.cz> writes:
>      > Hi!
>      >
>     >> I was hoping to point out that in real life, most systems that
>     >> need to access large numbers of files are already designed to
>     >> do some kind of hashing, or at least to divide-and-conquer by
>     >> using multi-level directory structures.
>     >>
>      > Yes -- because their workaround kernel slowness.
>      >
>      > I had to do this kind of hashing because kernel disliked 70000
>      > html files (copy of train time tables).
>      >
>      > BTW try rm * with 70000 files in directory -- command line will
>      > overflow.
>
> There are filesystems that use btrees (reiserfs) or hashing (affs) for
> directories.
>
> That way you get a O(log(n)) or even O(1) access time for
> files. Saddly the hashtable for affs depends on the blocksize and
> linux AFAIK only allows far too small block sizes (512 byte) for affs.
> It was designed for floppies, so the lack of dynamically resizing hash
> tables is excused.
>
> What also could be done is to keed directories sorted. Creating of
> files would cost O(N) time but a lookup could be done in
> O(log(log(n))) most of the time with reasonable name distribution.
> This could be done with ext2 without breaking any compatibility. One
> would need to convert (sort all directories) every time the FS was
> mounted RW by an older ext2, but otherwise nothing changes.
>
> Would you like to write support for this?

Hi, I missed this whole thread at the time, ironically, because I was 
too busy working on my hash-keyed directory index.

How do you get log(log(n))?  The best I can do is logb(n), with
b=large.  For practical purposes this is O(1).

The only problem I ran into is the mismatch between the storage order
of the sorted directory and that of the inodes, which causes thrashing
in the inode table.  I was able to eliminate this thrashing completely
from user space by processing the files in inode order.  I went on to 
examine ways of eliminating the thrashing without help from user space,
and eventually came up with a good way of doing that that relies on
setting an inode allocation target that corresponds loosely to the sort
order.

The thrashing doesn't hurt much anyway compared to the current N**2
behaviour.  For a million files I saw a 4X slowdown for delete vs
create.  Create shows no thrashing because it works in storage order
in the inodes, with the directory blocks themselves being smaller by
a factor of 6-7, so not contributing significantly to the cache
pressure.  Compare this to the 150 times slowdown you see with normal
Ext2, creating 100,000 files.

--
Daniel

  reply	other threads:[~2001-04-27 16:20 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2001-02-22 23:08 Hashing and directories Bill Crawford
2000-01-01  2:02 ` Pavel Machek
2001-03-01 20:54   ` Alexander Viro
2001-03-01 21:05     ` H. Peter Anvin
2001-03-01 21:13       ` Alexander Viro
2001-03-01 21:24         ` H. Peter Anvin
2001-03-02  9:04         ` Pavel Machek
2001-03-02 12:01           ` Oystein Viggen
2001-03-02 12:26             ` Tobias Ringstrom
2001-03-02 12:58           ` David Weinehall
2001-03-02 19:33           ` Tim Wright
2001-03-12 10:05           ` Herbert Xu
2001-03-12 10:43             ` Xavier Bestel
2001-03-01 21:23       ` Andreas Dilger
2001-03-01 21:26       ` Bill Crawford
2001-03-01 21:05     ` Tigran Aivazian
2001-03-02  8:56       ` Pavel Machek
2001-03-07  0:37         ` Jamie Lokier
2001-03-07  4:03           ` Linus Torvalds
2001-03-07 13:41             ` Jamie Lokier
2001-03-02  9:00     ` Pavel Machek
2001-03-03  0:03   ` Bill Crawford
2001-03-08 12:42   ` Goswin Brederlow
2001-04-27 16:20     ` Daniel Phillips [this message]
2001-02-22 23:22 ` H. Peter Anvin
2001-02-22 23:54   ` Bill Crawford
2001-03-10 11:22 ` Kai Henningsen
  -- strict thread matches above, loose matches on Subject: below --
2001-03-07 15:56 Manfred Spraul
2001-03-07 16:10 ` Jamie Lokier
2001-03-07 16:23   ` Manfred Spraul
2001-03-07 18:21     ` Linus Torvalds

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=01042718201900.01336@starship \
    --to=phillips@bonn-fries.net \
    --cc=billc@netcomuk.co.uk \
    --cc=goswin.brederlow@student.uni-tuebingen.de \
    --cc=hpa@transmeta.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pavel@suse.cz \
    --cc=phillips@innominate.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox