From: Chris Mason <chris.mason@oracle.com>
To: "Jose R. Santos" <jrs@us.ibm.com>
Cc: linux-ext4@vger.kernel.org
Subject: Re: compilebench numbers for ext4
Date: Thu, 25 Oct 2007 14:43:55 -0400
Message-ID: <20071025144355.583a8f88@think.oraclecorp.com>
In-Reply-To: <20071025103449.2e358220@gara>

On Thu, 25 Oct 2007 10:34:49 -0500
"Jose R. Santos" <jrs@us.ibm.com> wrote:

> On Mon, 22 Oct 2007 19:31:04 -0400
> Chris Mason <chris.mason@oracle.com> wrote:
> 
> > Hello everyone,
> > 
> > I recently posted some performance numbers for Btrfs with different
> > blocksizes, and to help establish a baseline I did comparisons with
> > Ext3.
> > 
> > The graphs, numbers and a basic description of compilebench are
> > here:
> > 
> > http://oss.oracle.com/~mason/blocksizes/
> 
> I've been playing a bit with the workload and I have a couple of
> comments.
> 
> 1) I find the averaging of results at the end of the run misleading
> unless you run a high number of directories.  A single very good
> result due to page caching effects seems to skew the final results
> output.  Have you considered providing output of the standard
> deviation of the data points as well in order to show how widely the
> results are spread?

This is the main reason I keep the output from each run.  Stdev would
definitely help as well, I'll put it on the todo list.
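
As a rough sketch of what reporting the spread alongside the average
could look like (the helper name and the sample numbers below are made
up for illustration, this is not compilebench code):

```python
import statistics


def summarize(results_mb_s):
    """Return (mean, sample standard deviation) for a list of per-run results.

    A single run has no spread to measure, so its stdev is reported as 0.0.
    """
    mean = statistics.mean(results_mb_s)
    stdev = statistics.stdev(results_mb_s) if len(results_mb_s) > 1 else 0.0
    return mean, stdev


# Hypothetical per-run throughput in MB/s. The last value stands in for
# one run that was served mostly from the page cache.
runs = [20.0, 22.0, 21.0, 60.0]
mean, stdev = summarize(runs)
print(f"avg {mean:.2f} MB/s, stdev {stdev:.2f} MB/s over {len(runs)} runs")
```

With numbers like these, the one cached run pulls the average well
above the other three, and the large stdev makes that visible.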

> 
> 2) You mentioned that one of the goals of the benchmark is to measure
> locality during directory aging, but the workload seems too well
> ordered to truly age the filesystem.  At least that's what I can gather
> from the output the benchmark spits out.  It may be that I'm not
> understanding the relationship between INITIAL_DIRS and RUNS, but the
> workload seems to be localized to do operations on a single dir at a
> time.  Just wondering if this is truly stressing allocation algorithms
> in a significant or realistic way.

A good question.  compilebench has two modes, and the default is better
at aging than the run I graphed on ext4.  compilebench isn't trying to
fragment individual files; instead it is trying to fragment locality
and lower the overall performance of a directory tree.

In the default run, the patch, clean, and compile operations end up
changing around groups of files in a somewhat random fashion (at least
from the FS point of view).  But, it is still a workload where a good
FS should be able to maintain locality and provide consistent results
over time.

The ext4 numbers I sent here are from compilebench --makej, which is a
shorter and less complex run.  It has a few simple phases:

* create some number of kernel trees sequentially
* write new files into those trees in random order
* read three of the trees
* delete all the trees

It is a very basic test that can give you a picture of directory
layout, writeback performance and overall locality.
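
To make the shape of those phases concrete, here is a toy sketch.  It is
not compilebench itself: the function name, file names, sizes and counts
are all made up, and it does no timing, which the real benchmark
obviously does for each phase.

```python
import os
import random
import shutil
import tempfile


def toy_makej(root, num_trees=4, files_per_tree=8, new_per_tree=4, seed=0):
    """Run four toy phases under `root`; return the bytes read in phase 3."""
    rng = random.Random(seed)
    trees = []

    # Phase 1: create the trees one after another.
    for t in range(num_trees):
        tree = os.path.join(root, f"tree-{t}")
        os.makedirs(tree)
        for f in range(files_per_tree):
            with open(os.path.join(tree, f"src-{f}.c"), "wb") as fh:
                fh.write(b"x" * 4096)
        trees.append(tree)

    # Phase 2: write new files into those trees in random order.
    targets = [(tree, n) for tree in trees for n in range(new_per_tree)]
    rng.shuffle(targets)
    for tree, n in targets:
        with open(os.path.join(tree, f"obj-{n}.o"), "wb") as fh:
            fh.write(b"y" * 4096)

    # Phase 3: read back a few of the trees (at most three here).
    bytes_read = 0
    for tree in trees[:3]:
        for name in sorted(os.listdir(tree)):
            with open(os.path.join(tree, name), "rb") as fh:
                bytes_read += len(fh.read())

    # Phase 4: delete all the trees.
    for tree in trees:
        shutil.rmtree(tree)
    return bytes_read


with tempfile.TemporaryDirectory() as root:
    print(toy_makej(root))
```

The point of the random ordering in phase 2 is that the new files land
in trees that were laid out sequentially in phase 1, so the read and
delete phases show how well the FS kept each tree together.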

> 
> If I understand how compilebench works, directories would be allocated
> within one or two block group boundaries so the data and metadata
> would be in very close proximity.  I assume that doing random lookups
> through the entire file set would show some weakness in the ext3
> metadata layout.

Probably.

> 
> I really want to use seekwatcher to test some of the stuff that I'm
> doing for the flex_bg feature but it barfs on me on my test machine.
> 
> running :sleep 10:
> done running sleep 10
> Device: /dev/sdh
>   Total:                     0 events (dropped 0),     1368 KiB data
> blktrace done
> Traceback (most recent call last):
>   File "/usr/bin/seekwatcher", line 534, in ?
>     add_range(hist, step, start, size)
>   File "/usr/bin/seekwatcher", line 522, in add_range
>     val = hist[slot]
> IndexError: list index out of range

I don't think you have any events in the trace.  Try this instead:

echo 3 > /proc/sys/vm/drop_caches
seekwatcher -t find-trace -d /dev/xxxx -p 'find /usr/local -type f'

> 
> This is running on a PPC64/gentoo combination.  Don't know if this
> means anything to you.  I have a very basic algorithm to take
> advantage of block group metadata grouping and want to be able to
> better visualize how different IO patterns take advantage of or are
> hurt by the feature.

I wanted to benchmark flexbg too, but couldn't quite figure out the
correct patch combination ;)

> 
> > To match the ext4 numbers with Btrfs, I'd probably have to turn off
> > data checksumming...
> > 
> > But oddly enough I saw very bad ext4 read throughput even when
> > reading a single kernel tree (outside of compilebench).  The time
> > to read the tree was almost 2x ext3.  Have others seen similar
> > problems?
> > 
> > I think the ext4 delete times are so much better than ext3 because
> > this is a single threaded test.  delayed allocation is able to get
> > everything into a few extents, and these all end up in the inode.
> > So, the delete phase only needs to seek around in small directories
> > and seek to well grouped inodes.  ext3 probably had to seek all
> > over for the direct/indirect blocks.
> > 
> > So, tomorrow I'll run a few tests with delalloc and mballoc
> > independently, but if there are other numbers people are interested
> > in, please let me know.
> > 
> > (test box was a desktop machine with single sata drive, barriers
> > were not used).
> 
> More details please....
> 
> 1. CPU info (type, count, speed)

Dual core 3GHz x86-64

> 2. Memory info (mostly amount)

2GB

> 3. Disk info (partition size, disk rpms, interface, internal cache
> size)

SAMSUNG HD160JJ (sataII w/ncq), the FS was on a 40GB lvm volume.
Single spindle.

> 4. Benchmark cmdline parameters.

mkdir ext4
compilebench --makej -D /mnt -d /dev/mapper/xxxx -t ext4/trace -i 20 >& ext4/out

-chris
