From: Dave Chinner <david@fromorbit.com>
To: xfs@oss.sgi.com
Subject: Re: [PATCH 0/12] xfs: delayed logging V6
Date: Mon, 24 May 2010 10:30:39 +1000
Message-ID: <20100524003039.GB12087@dastard>
In-Reply-To: <1274138668-1662-1-git-send-email-david@fromorbit.com>

On Tue, May 18, 2010 at 09:24:16AM +1000, Dave Chinner wrote:
> 
> Hi Folks,
> 
> This is version 6 of the delayed logging series and is the first
> release candidate for inclusion in the xfs-dev tree and 2.6.35-rc1.

BTW, here are a couple of quick benchmarks I've run over the last
couple of days to check comparative performance. I found that the
previous scalability testing I did was limited by two factors:

	1. Only 4 AGs in the test filesystem, so only 4-way
	parallelism on allocation/freeing. Hence it won't scale to 8
	threads no matter what I do....
	2. lockdep checking limits scalability to around 4 threads.

So I re-ran the pure-metadata, sequential create/remove fs_mark
tests I've previously run, with the following results. Barriers
were disabled on both XFS and ext4, and XFS was also configured
with:

MKFS_OPTIONS="-l size=128m -d agcount=16"
MOUNT_OPTIONS="-o [no]delaylog,logbsize=262144,nobarrier"

(./fs_mark -S0 -n 100000 -s 0 -d /mnt/scratch/0 ...)
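For reference, fs_mark runs one worker thread per -d directory it is
given, so the multi-threaded runs elided by the "..." above would look
roughly like the sketch below. The per-thread /mnt/scratch/$i paths and
the loop itself are assumptions extrapolated from the quoted command;
this just prints the invocations rather than running them:

```shell
#!/bin/bash
# Sketch of the multi-threaded fs_mark sweep: fs_mark starts one worker
# thread per -d directory, so N threads means N target directories.
# The /mnt/scratch/$i paths are assumptions based on the quoted command.

build_dirs() {
	# emit one "-d /mnt/scratch/$i" argument pair per requested thread
	local n=$1 args="" i
	for ((i = 0; i < n; i++)); do
		args="$args -d /mnt/scratch/$i"
	done
	echo "$args"
}

for nthreads in 1 2 4 8; do
	# -S0: no syncs, -n: files per thread, -s 0: zero-length files
	echo "./fs_mark -S0 -n 100000 -s 0$(build_dirs "$nthreads")"
done
```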

		   fs_mark rate
threads	     nodelaylog	      delaylog		ext4
  1		17k/s		19k/s		33k/s
  2		30k/s		35k/s		66k/s
  4		42k/s		63k/s		80k/s
  8		39k/s		97k/s		45k/s

This shows that pure metadata operations scale much, much better,
especially for multithreaded workloads. The log throughput at 8
threads is over 3x lower (150MB/s vs 480MB/s) for a 2.5x improvement
in performance.

Also worth noting is that the performance is competitive with ext4
and exceeds it at higher parallelism. Equally notable are the
disk subsystem requirements to sustain this performance. These
numbers were read off pmchart graphs with a 5s sample interval, so
they should be considered ballpark figures (i.e. close, but not
perfectly accurate):

			   IOPS @ MB/s
threads		nodelaylog      delaylog	  ext4
  1		1.0k @ 280	 50 @ 10	   50 @ 20
  2		2.0k @ 460	100 @ 20	  500 @ 75
  4		2.5k @ 520	300 @ 50	 6.5k @ 150
  8		3.7k @ 480	900 @ 150	 9.8k @ 180

We can see why the current XFS journalling mode is slow - it requires
500MB/s of log throughput to get to 40k creates/s, and almost all
the IOPS are servicing log IO.

ext4, on the other hand, really strains the IOP capability of the
disk subsystem, and that is the limiting factor at more than two
threads. It's an interesting progression, too: the iops go up by an
order of magnitude each time the thread count doubles, until the
disk subsystem's IOPS limit is reached at 8 threads.

The best IO behaviour comes from the delayed logging version of XFS,
with the lowest bandwidth and iops to sustain the highest
performance. All the IO is to the log - no metadata is written to
disk at all, which is the way this test should execute.  As a result,
the delayed logging code was the only configuration not limited by
the IO subsystem - instead it was completely CPU bound (8 CPUs
worth)...

However, it's not all roses, as dbench will show:

# MKFS_OPTIONS="-l size=128m -d agcount=16" MOUNT_OPTIONS="-o nodelaylog,nobarrier" ./bench 1 dave dave dbenchmulti

Throughput is in MB/s, latency in ms.

		  nodelaylog		  delaylog
Threads		thruput  max-lat	thruput	max-lat
   1		153.011   45.450	157.036  59.685
   2		319.534   18.096	330.062  41.458
  10		1350.31   46.631	726.075 303.434
  20		1497.93  365.092	547.380 541.223
 100		1410.42 2488.105	477.964 177.471
 200		1232.97  297.982	457.641 447.060
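The ./bench script above is a local test harness, so the exact dbench
invocations aren't shown. A plain dbench sweep over the same client
counts might look roughly like the sketch below; the runtime and mount
point are assumptions, and the commands are only printed here rather
than executed:

```shell
#!/bin/bash
# Sketch of a dbench client-count sweep matching the table above.
# The ./bench harness is site-local; -t (runtime in seconds) and the
# target mount point are assumed values, not taken from the report.

for clients in 1 2 10 20 100 200; do
	echo "dbench -t 300 -D /mnt/scratch $clients"
done
```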

There is no difference at 1-2 threads (within the error margin of
dbench), but delayed logging shows significant throughput reductions
(up to ~65% degradation) at higher thread counts. This appears to be
due to the unoptimised log force implementation that the delayed
logging code currently has. I'll probably use dbench over the next few
weeks as a measure to try to bring this part of the delayed logging
code up to the same performance.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com


