From: "Stephen C. Tweedie" <sct@redhat.com>
To: Daniel Phillips <phillips@bonn-fries.net>
Cc: "Stephen C. Tweedie" <sct@redhat.com>,
	Andrew Morton <akpm@zip.com.au>,
	Christopher Li <chrisl@gnuchina.org>,
	Linux-kernel <linux-kernel@vger.kernel.org>,
	ext2-devel@lists.sourceforge.net
Subject: Re: [Ext2-devel] Re: Shrinking ext3 directories
Date: Fri, 21 Jun 2002 16:06:59 +0100
Message-ID: <20020621160659.C2805@redhat.com>
In-Reply-To: <E17LF65-0001K4-00@starship>; from phillips@bonn-fries.net on Fri, Jun 21, 2002 at 05:28:28AM +0200

Hi,

On Fri, Jun 21, 2002 at 05:28:28AM +0200, Daniel Phillips wrote:

> I ran a bakeoff between your new half-md4 and dx_hack_hash on Ext2.  As 
> predicted, half-md4 does produce very even bucket distributions.  For 200,000 
> creates:
> 
>    half-md4:        2872 avg bytes filled per 4k block (70%)
>    dx_hack_hash:    2853 avg bytes filled per 4k block (69%)
> 
> but guess which was faster overall?
> 
>    half-md4:        user 0.43 system 6.88 real 0:07.33 CPU 99%
>    dx_hack_hash:    user 0.43 system 6.40 real 0:06.82 CPU 100%
> 
> This is quite reproducible: dx_hack_hash is always faster by about 6%.  This 
> must be due entirely to the difference in hashing cost, since half-md4 
> produces measurably better distributions.  Now what do we do?

I want to get this thing tested!  

There are far too many factors for this to be resolved very quickly.
In reality, there will be a lot of disk cost under load which you
don't see in benchmarks, too.  We also know for a fact that the early
hashes used in Reiserfs were quick but were vulnerable to terribly bad
behaviour under certain application workloads.  With the half-md4, at
least we can expect decent worst-case behaviour unless we're under
active attack (i.e. only malicious apps get hurt).

I think the md4 is a safer bet until we know more, so I'd vote that we
stick with the ext3 cvs code which uses hash version #1 for that, and
defer anything else until we've seen more --- the hash versioning lets
us do that safely.
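
To illustrate why a stored hash version makes it safe to defer the choice, here is a minimal C sketch of dispatching on a per-directory version tag.  Everything in it is an assumption for illustration: the enum names, the dir_hash() helper, and the two stand-in hash functions (djb2 and FNV-1a) are hypothetical and are not dx_hack_hash, half-md4, or the actual ext3 cvs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical version tags; the real on-disk values may differ. */
enum dir_hash_version {
    DIR_HASH_V0 = 0,   /* stand-in for a cheap, fast hash */
    DIR_HASH_V1 = 1    /* stand-in for a stronger, slower hash */
};

/* Stand-in "fast" hash (djb2).  This is NOT dx_hack_hash. */
static uint32_t hash_fast(const char *name, size_t len)
{
    uint32_t h = 5381;
    for (size_t i = 0; i < len; i++)
        h = h * 33 + (unsigned char)name[i];
    return h;
}

/* Stand-in "strong" hash (32-bit FNV-1a).  This is NOT half-md4. */
static uint32_t hash_strong(const char *name, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)name[i];
        h *= 16777619u;
    }
    return h;
}

/*
 * Hash a name with the function selected by the directory's stored
 * version.  Returns 0 on success and -1 for a version this code does
 * not know, so the caller never misreads an index built with a
 * different hash.
 */
static int dir_hash(int version, const char *name, size_t len, uint32_t *out)
{
    assert(name != NULL && out != NULL);
    switch (version) {
    case DIR_HASH_V0:
        *out = hash_fast(name, len);
        return 0;
    case DIR_HASH_V1:
        *out = hash_strong(name, len);
        return 0;
    default:
        return -1;
    }
}
```

Because each directory records which version built its index, a new hash can be added later as another case without disturbing directories that already exist.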

> By the way, I'm running about 37 usec per create here, on a 1GHz/1GB PIII, 
> with Ext2.  I think most of the difference vs your timings is that your test 
> code is eating a lot of cpu.

I was getting nearer to 50 usec system time, but on an Athlon K7-700,
so those timings are pretty comparable.  Mine was ext3, too, which
accounts for a bit.  The difference between that and wall-clock time
was all just idle time, which I think was due to using "touch"/"rm"
--- i.e. there was a lot of inode table write activity due to the files
being created/deleted, and that was forcing a journal wrap before the
end of the test.  That effect is not visible on ext2, of course.
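
As a rough cross-check of the figures quoted in this thread, the following C sketch re-derives a block fill percentage and a per-create cost.  It assumes "4k" means 4096 bytes and that per-create cost is simply elapsed time divided by the 200,000 creates; the helper names are hypothetical.

```c
#include <assert.h>

/* Percentage of a block that is filled, given the average bytes used. */
static double fill_percent(double bytes_used, double block_size)
{
    assert(block_size > 0.0);
    return 100.0 * bytes_used / block_size;
}

/* Average cost of one create, in microseconds. */
static double usec_per_create(double seconds, long creates)
{
    assert(creates > 0);
    return seconds * 1e6 / (double)creates;
}
```

For example, usec_per_create(7.33, 200000) takes the half-md4 real time from the quoted run and comes out at about 36.65 usec, and fill_percent(2872, 4096) comes out just over 70.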

--Stephen

