From: Daniel Phillips <phillips@arcor.de>
To: Helge Hafting <helgehaf@aitel.hist.no>,
ext2-devel@lists.sourceforge.net, linux-kernel@vger.kernel.org
Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories
Date: Fri, 21 Jun 2002 18:23:13 +0200 [thread overview]
Message-ID: <E17LRBp-0001ar-00@starship> (raw)
In-Reply-To: <3D12CFC5.38DD9245@aitel.hist.no>
On Friday 21 June 2002 09:03, Helge Hafting wrote:
> Daniel Phillips wrote:
>
> > I ran a bakeoff between your new half-md4 and dx_hack_hash on Ext2. As
> > predicted, half-md4 does produce very even bucket distributions. For 200,000
> > creates:
> >
> > half-md4: 2872 avg bytes filled per 4k block (70%)
> > dx_hack_hash: 2853 avg bytes filled per 4k block (69%)
> >
> > but guess which was faster overall?
> >
> > half-md4: user 0.43 system 6.88 real 0:07.33 CPU 99%
> > dx_hack_hash: user 0.43 system 6.40 real 0:06.82 CPU 100%
> >
> > This is quite reproducible: dx_hack_hash is always faster by about 6%. This
> > must be due entirely to the difference in hashing cost, since half-md4
> > produces measurably better distributions. Now what do we do?
>
> No surprise that the worse distribution is faster - you get less
> io when fewer blocks are used. Which means a bad distribution beats
> a good one _until_ blocks start to really fill up and collide. 2.8K per
> 4K block is only 70% full. I guess the better hash wins
> if you force a higher fill rate?
Hashing in htree doesn't work like that - what you're thinking about
is a traditional fixed-size hash table. HTree is a btree that uses
hashes of names as keys. Each block has a variable amount of the key
space assigned to it, initially just one block for the entire key
space. When that block fills up, its entries and its key space are
split into two, then those blocks start to fill up, get split, and
so on.
So more even key distribution means the key space gets split more
evenly, and blocks are more likely to fill up evenly, meaning less
splitting, fewer blocks in total, and less IO.
A hash function that distributes keys better should give somewhat
better performance, and that has indeed been my experience. But
in the case of half-MD4, the improvement we get from better
distribution is wiped out by the higher cost of computing the hash
function.[1] Which is not to say that the work is without value.
The beautiful distribution given by the half-MD4 hash gives us
something to aim at, we just have to be more efficient about it.
I should note that HTree isn't hugely sensitive to bad hash
functions, at least not at the outset when a directory is growing.
The worst that happens is every leaf block ends up half-full with
a performance hit of just a few percent. However, over time with
many insertions and deletions the hash space can get cut up into
smaller and smaller pieces, so leaf blocks become less and less
full. A more uniform hash function will slow this process down a
great deal, but it will not stop it completely. The proper way
to deal with long term key space fragmentation is to implement
coalesce-on-delete, which is in progress.
[1] CPU cost in filesystem operations *is* important - a lot more
important than commonly thought. Here we have yet another example
where CPU cost in filesystem operations dominates IO time, and
indeed, since directory operations are performed almost entirely
in cache, the quadratic cost of linear directory lookup is almost
entirely CPU cost.
--
Daniel
next prev parent reply other threads:[~2002-06-21 16:23 UTC|newest]
Thread overview: 42+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-06-18 16:08 Shrinking ext3 directories DervishD
2002-06-18 16:10 ` Austin Gonyou
2002-06-18 16:39 ` Andreas Dilger
2002-06-18 19:39 ` DervishD
2002-06-18 19:34 ` DervishD
2002-06-18 16:21 ` Padraig Brady
2002-06-18 16:54 ` David Lang
2002-06-18 19:35 ` DervishD
2002-06-18 21:50 ` Stephen C. Tweedie
2002-06-18 22:18 ` Alexander Viro
2002-06-19 9:38 ` DervishD
2002-06-19 10:37 ` Stephen C. Tweedie
2002-06-19 17:03 ` [Ext2-devel] " Christopher Li
2002-06-19 20:10 ` Stephen C. Tweedie
2002-06-19 20:34 ` Stephen C. Tweedie
2002-06-19 20:13 ` Andrew Morton
2002-06-19 22:43 ` Stephen C. Tweedie
2002-06-19 23:54 ` Stephen C. Tweedie
2002-06-21 3:28 ` Daniel Phillips
2002-06-21 7:03 ` Helge Hafting
2002-06-21 14:02 ` Daniel Phillips
2002-06-24 7:12 ` Helge Hafting
2002-06-21 16:23 ` Daniel Phillips [this message]
2002-06-21 15:06 ` Stephen C. Tweedie
2002-07-04 4:48 ` Daniel Phillips
2002-07-04 14:15 ` jlnance
2002-07-05 2:11 ` Daniel Phillips
2002-06-22 5:53 ` Andreas Dilger
2002-06-22 20:59 ` Daniel Phillips
2002-06-23 0:01 ` Daniel Phillips
2002-06-23 7:57 ` Daniel Phillips
2002-06-19 22:49 ` Daniel Phillips
2002-06-20 0:24 ` Andreas Dilger
2002-06-20 9:34 ` Stephen C. Tweedie
2002-06-20 10:18 ` Andreas Dilger
2002-06-20 13:45 ` Daniel Phillips
2002-06-21 14:54 ` Ville Herva
2002-06-21 15:08 ` Stephen C. Tweedie
2002-06-21 15:38 ` Ville Herva
2002-06-21 16:15 ` Stephen C. Tweedie
2002-06-21 18:44 ` Ville Herva
2002-06-20 16:26 ` Bill Davidsen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=E17LRBp-0001ar-00@starship \
--to=phillips@arcor.de \
--cc=ext2-devel@lists.sourceforge.net \
--cc=helgehaf@aitel.hist.no \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox