public inbox for linux-kernel@vger.kernel.org
From: "Stephen C. Tweedie" <sct@redhat.com>
To: Daniel Phillips <phillips@bonn-fries.net>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
	Andrew Morton <akpm@zip.com.au>,
	Christopher Li <chrisl@gnuchina.org>,
	Linux-kernel <linux-kernel@vger.kernel.org>,
	ext2-devel@lists.sourceforge.net
Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories
Date: Fri, 21 Jun 2002 16:06:59 +0100
Message-ID: <20020621160659.C2805@redhat.com>
In-Reply-To: <E17LF65-0001K4-00@starship>; from phillips@bonn-fries.net on Fri, Jun 21, 2002 at 05:28:28AM +0200

Hi,

On Fri, Jun 21, 2002 at 05:28:28AM +0200, Daniel Phillips wrote:

> I ran a bakeoff between your new half-md4 and dx_hack_hash on Ext2.  As 
> predicted, half-md4 does produce very even bucket distributions.  For 200,000 
> creates:
> 
>    half-md4:        2872 avg bytes filled per 4k block (70%)
>    dx_hack_hash:    2853 avg bytes filled per 4k block (69%)
> 
> but guess which was faster overall?
> 
>    half-md4:        user 0.43 system 6.88 real 0:07.33 CPU 99%
>    dx_hack_hash:    user 0.43 system 6.40 real 0:06.82 CPU 100%
> 
> This is quite reproducible: dx_hack_hash is always faster by about 6%.  This 
> must be due entirely to the difference in hashing cost, since half-md4 
> produces measurably better distributions.  Now what do we do?
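The figures quoted above can be cross-checked with a little arithmetic. This is only a sketch: the variable and function names are illustrative and do not come from the thread, and the numbers are copied straight from the quoted message.

```python
# Figures copied from the quoted benchmark (200,000 creates, 4k blocks).
BLOCK_SIZE = 4096
CREATES = 200_000

fill_bytes = {"half-md4": 2872, "dx_hack_hash": 2853}
system_s = {"half-md4": 6.88, "dx_hack_hash": 6.40}
real_s = {"half-md4": 7.33, "dx_hack_hash": 6.82}


def fill_fraction(name):
    """Average fraction of each 4k block that is filled."""
    return fill_bytes[name] / BLOCK_SIZE


def relative_saving(times):
    """Fraction of half-md4's time that dx_hack_hash saves."""
    return (times["half-md4"] - times["dx_hack_hash"]) / times["half-md4"]


def usec_per_create(name):
    """Wall-clock microseconds per create."""
    return real_s[name] / CREATES * 1e6


for name in fill_bytes:
    print(f"{name}: fill {fill_fraction(name):.1%}, "
          f"{usec_per_create(name):.2f} usec/create")
print(f"system time saved: {relative_saving(system_s):.1%}")
print(f"real time saved:   {relative_saving(real_s):.1%}")
```

On these numbers the fill levels differ by well under one percentage point, while the timing gap comes out a little under 7%, in the same range as the "about 6%" quoted. The wall-clock times correspond to roughly 34 to 37 usec per create.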

I want to get this thing tested!  

There are far too many factors for this to be resolved very quickly.
In reality, there will be a lot of disk cost under load which you
don't see in benchmarks, too.  We also know for a fact that the early
hashes used in Reiserfs were quick but were vulnerable to terribly bad
behaviour under certain application workloads.  With the half-md4, at
least we can expect decent worst-case behaviour unless we're under
active attack (i.e. only malicious apps get hurt).

I think the md4 is a safer bet until we know more, so I'd vote that we
stick with the ext3 cvs code which uses hash version #1 for that, and
defer anything else until we've seen more --- the hash versioning lets
us do that safely.
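
As a rough illustration of why a stored hash version lets the choice be deferred, here is a minimal sketch. Nearly everything in it is an assumption made for illustration: both hash functions are toy stand-ins (not dx_hack_hash and not the kernel's half-md4), the names are hypothetical, and version 0 is an invented number. Only "version #1 means half-md4" comes from the message.

```python
import hashlib


def toy_fast_hash(name: bytes) -> int:
    """Cheap multiplicative hash. Stand-in only, NOT the real dx_hack_hash."""
    h = 0
    for b in name:
        h = (h * 31 + b) & 0xFFFFFFFF
    return h


def toy_strong_hash(name: bytes) -> int:
    """Truncated SHA-256. Stand-in only, NOT the kernel's half-md4."""
    return int.from_bytes(hashlib.sha256(name).digest()[:4], "little")


# Hypothetical version table: each directory records which hash built its
# index, so new hashes can be added without breaking existing directories.
HASH_BY_VERSION = {
    0: toy_fast_hash,    # assumed number for the cheap hash
    1: toy_strong_hash,  # "hash version #1" in the message is half-md4
}


def dir_hash(hash_version: int, name: bytes) -> int:
    """Hash a file name using the version recorded for its directory."""
    try:
        fn = HASH_BY_VERSION[hash_version]
    except KeyError:
        raise ValueError(f"unknown hash version {hash_version}") from None
    return fn(name)
```

Because lookups always go through the version recorded with the directory, a later decision to prefer a different hash would only affect newly indexed directories in this model.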

> By the way, I'm running about 37 usec per create here, on a 1GHz/1GB PIII, 
> with Ext2.  I think most of the difference vs your timings is that your test 
> code is eating a lot of cpu.

I was getting nearer to 50usec system time, but on an athlon k7-700,
so those timings are pretty comparable.  Mine was ext3, too, which
accounts for a bit.  The difference between that and wall-clock time
was all just idle time, which I think was due to using "touch"/"rm"
--- i.e. there was a lot of inode table write activity due to the files
being created/deleted, and that was forcing a journal wrap before the
end of the test.  That effect is not visible on ext2, of course.

--Stephen


Thread overview: 42+ messages
2002-06-18 16:08 Shrinking ext3 directories DervishD
2002-06-18 16:10 ` Austin Gonyou
2002-06-18 16:39   ` Andreas Dilger
2002-06-18 19:39     ` DervishD
2002-06-18 19:34   ` DervishD
2002-06-18 16:21 ` Padraig Brady
2002-06-18 16:54   ` David Lang
2002-06-18 19:35   ` DervishD
2002-06-18 21:50 ` Stephen C. Tweedie
2002-06-18 22:18   ` Alexander Viro
2002-06-19  9:38     ` DervishD
2002-06-19 10:37     ` Stephen C. Tweedie
2002-06-19 17:03       ` [Ext2-devel] " Christopher Li
2002-06-19 20:10         ` Stephen C. Tweedie
2002-06-19 20:34           ` Stephen C. Tweedie
2002-06-19 20:13         ` Andrew Morton
2002-06-19 22:43           ` Stephen C. Tweedie
2002-06-19 23:54             ` Stephen C. Tweedie
2002-06-21  3:28               ` Daniel Phillips
2002-06-21  7:03                 ` Helge Hafting
2002-06-21 14:02                   ` Daniel Phillips
2002-06-24  7:12                     ` Helge Hafting
2002-06-21 16:23                   ` Daniel Phillips
2002-06-21 15:06                 ` Stephen C. Tweedie [this message]
2002-07-04  4:48                   ` Daniel Phillips
2002-07-04 14:15                     ` jlnance
2002-07-05  2:11                       ` Daniel Phillips
2002-06-22  5:53                 ` Andreas Dilger
2002-06-22 20:59                   ` Daniel Phillips
2002-06-23  0:01                     ` Daniel Phillips
2002-06-23  7:57                     ` Daniel Phillips
2002-06-19 22:49         ` Daniel Phillips
2002-06-20  0:24           ` Andreas Dilger
2002-06-20  9:34           ` Stephen C. Tweedie
2002-06-20 10:18             ` Andreas Dilger
2002-06-20 13:45               ` Daniel Phillips
2002-06-21 14:54                 ` Ville Herva
2002-06-21 15:08                   ` Stephen C. Tweedie
2002-06-21 15:38                     ` Ville Herva
2002-06-21 16:15                       ` Stephen C. Tweedie
2002-06-21 18:44                         ` Ville Herva
2002-06-20 16:26               ` Bill Davidsen
