linux-ext4.vger.kernel.org archive mirror
From: tytso@mit.edu
To: Christian Kujau <lists@nerdbynature.de>
Cc: jim owens <jowens@hp.com>, Larry McVoy <lm@bitmover.com>,
	jfs-discussion@lists.sourceforge.net,
	linux-nilfs@vger.kernel.org, xfs@oss.sgi.com,
	reiserfs-devel@vger.kernel.org,
	Peter Grandi <pg_jf2@jf2.for.sabi.co.UK>,
	ext-users <ext3-users@redhat.com>,
	linux-ext4@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [Jfs-discussion] benchmark results
Date: Sun, 27 Dec 2009 17:33:07 -0500
Message-ID: <20091227223307.GA4429@thunk.org>
In-Reply-To: <alpine.DEB.2.01.0912271346240.3483@bogon.housecafe.de>

On Sun, Dec 27, 2009 at 01:55:26PM -0800, Christian Kujau wrote:
> On Sun, 27 Dec 2009 at 14:50, jim owens wrote:
> > And I don't even care about comparing 2 filesystems, I only care about
> > timing 2 versions of code in the single filesystem I am working on,
> > and forgetting about hardware cache effects has screwed me there.  
> 
> Not me, I'm comparing filesystems - and when the HBA or whatever plays 
> tricks and "sync" doesn't flush all the data, it'll do so for every tested 
> filesystem. Of course, filesystems could handle "sync" differently, and 
> they probably do, hence the different times they take to complete. That's 
> what my tests are about: timing comparison (does that still fall under 
> the "benchmark" category?), not functional comparison. That's left as a 
> task for the reader of these results: "hm, filesystem xy is so much faster 
> when doing foo, why is that? And am I willing to sacrifice e.g. proper 
> syncs to gain more speed?"

Yes, but given that many of the file systems have almost *exactly* the
same bandwidth measurement for the "cp" test, and said bandwidth
measurement is 5 times the disk bandwidth as measured by hdparm, it
makes me suspect that you are doing this:

/bin/time /bin/cp -r /source/tree /filesystem-under-test
sync
/bin/time /bin/rm -rf /filesystem-under-test/tree
sync

etc.

It is *a* measurement, but the question is whether it's a useful
comparison.  Consider two different file systems.  One file system
which does a very good job making sure that file writes are done
contiguously to disk, minimizing seek overhead --- and another file
system which is really crappy at disk allocation, and writes the files
to random locations all over the disk.  If you are only measuring the
"cp", then the fact that filesystem 'A' has a very good layout, and is
able to write things to disk very efficiently, and filesystem 'B' has
files written in a really horrible way, won't be measured by your
test.  This is especially true if, for example, you have 8GB of memory
and you are copying 4GB worth of data.

You might notice it if you include the "sync" in the timing, i.e.:

/bin/time /bin/sh -c "/bin/cp -r /source/tree /filesystem-under-test;/bin/sync"
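
For scripted runs, the same idea can be wrapped in a small helper.
This is only a sketch: the function name timed_copy_with_sync is made
up for illustration, and it reports whole seconds from date(1) rather
than the finer-grained output of time(1).

```shell
#!/bin/sh
# Hypothetical helper (not from the original tests): time a recursive
# copy *including* the sync, so data still sitting in the page cache
# is counted against the filesystem under test.
timed_copy_with_sync() {
    src=$1
    dst=$2
    start=$(date +%s)
    cp -r "$src" "$dst" || return 1
    sync                      # the clock keeps running during the flush
    end=$(date +%s)
    echo $((end - start))     # elapsed wall-clock seconds
}
```

Usage would look like "timed_copy_with_sync /source/tree
/filesystem-under-test", run once per filesystem being compared.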

> Again, I don't argue with "hardware caches will have effects", but that's 
> not the point of these tests. Of course hardware is different, but 
> filesystems are too and I'm testing filesystems (on the same hardware).

The question is whether your tests are doing the best job of measuring
how good the filesystem really is.  If your workload is one where you
will only be copying file sets much smaller than your memory, and you
don't care about when the data actually hits the disk, only when
"/bin/cp" returns, then sure, do whatever you want.  But if you want
the tests to have meaning if, for example, you have 2GB of memory and
you are copying 8GB of data, or if later on you will be continuously
streaming data to the disk, and sooner or later the need to write data
to the disk will start slowing down your real-life workload, then not
including the time to do the sync in the time to copy your file set
may cause you to assume that filesystems 'A' and 'B' are identical in
performance, and then your filesystem comparison will end up
misleading you.
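
A quick sanity check along these lines is to compare the size of the
file set with the amount of RAM before trusting an unsynced number.
The sketch below assumes both sizes are given in kilobytes; the
function name data_fits_in_ram is hypothetical, and it is a rough
heuristic, since the kernel will not use all of RAM for page cache.

```shell
#!/bin/sh
# Hypothetical check: succeeds (exit status 0) when the data set is
# smaller than RAM, i.e. when a "cp" could plausibly return before
# much of the data has reached the disk.
data_fits_in_ram() {
    data_kb=$1
    ram_kb=$2
    [ "$data_kb" -lt "$ram_kb" ]
}

# Example with the figures mentioned earlier: 4GB of data, 8GB of RAM.
if data_fits_in_ram 4194304 8388608; then
    echo "file set fits in memory; time the sync as well"
fi
```

On Linux the two arguments could be taken from "du -sk /source/tree"
and the MemTotal line of /proc/meminfo, both of which report
kilobytes.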

The bottom line is that it's very hard to do good comparisons that are
useful in the general case.

Best regards,

						- Ted
