Linux RAID subsystem development
From: Stan Hoeppner <stan@hardwarefreak.com>
To: Peter Landmann <sfrazt@googlemail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: RAID 5 doesn't scale
Date: Wed, 03 Apr 2013 08:18:52 -0500
Message-ID: <515C2C3C.3020400@hardwarefreak.com>
In-Reply-To: <loom.20130403T122905-373@post.gmane.org>

On 4/3/2013 6:00 AM, Peter Landmann wrote:

You didn't mention your stripe_cache_size value.  It makes a big
difference.  Make sure it's at least 4096.  The default is 256.  Writing
to sysfs requires root.

~$ /bin/echo 4096 > /sys/block/md[X]/md/stripe_cache_size
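
A larger stripe cache costs RAM.  The kernel md documentation gives the
footprint as page_size * nr_disks * stripe_cache_size.  A rough sketch of
that arithmetic follows; the 5-drive count and 4 KiB page size are
assumptions for illustration, and stripe_cache_bytes is a hypothetical
helper, not part of any tool.

```shell
#!/bin/sh
# Estimate the RAM held by the md stripe cache, in bytes.
# Formula (kernel md documentation):
#   memory = page_size * nr_disks * stripe_cache_size
# Arguments: cache entries, member disks, page size in bytes (default 4096).
stripe_cache_bytes() {
    entries=$1
    disks=$2
    page=${3:-4096}
    echo $(( entries * disks * page ))
}

# Assumed example: 5 member drives, 4 KiB pages.
echo "default 256: $(( $(stripe_cache_bytes 256 5) / 1048576 )) MiB"
echo "tuned 4096:  $(( $(stripe_cache_bytes 4096 5) / 1048576 )) MiB"
```

Under those assumptions the default works out to 5 MiB and 4096 entries
to 80 MiB, which is cheap on any machine driving an SSD array.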

> FIO settings:
> bs=4096
> iodepth=248
> direct=1
> continue_on_error=1
> rw=randwrite
> ioengine=libaio
> norandommap
> refill_buffers
> group_reporting

> numjobs=1

^^^^^^^^^^^  Even when using AIO you're still serialized when using a
single thread, regardless of queue depth.  Thus there is non-trivial
latency between IO operations.  Retest with only these global parameters
to get some concurrency.  Along with a larger stripe cache your numbers
should go up substantially.  This test runs 4 threads/core to ensure you
saturate md with IO.

[global]
zero_buffers
numjobs=24
thread
group_reporting
blocksize=4096
ioengine=libaio
iodepth=16
direct=1
size=8G
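
The numbers in that job file can be sanity checked with a little
arithmetic.  This is only a sketch: the 6-core figure is inferred from
24 jobs at 4 threads/core and is an assumption, and fio_numjobs and
max_in_flight are hypothetical helper names.

```shell
#!/bin/sh
# Jobs to launch for a given threads-per-core target.
# Arguments: core count (default: output of nproc), threads per core (default 4).
fio_numjobs() {
    cores=${1:-$(nproc)}
    per_core=${2:-4}
    echo $(( cores * per_core ))
}

# Upper bound on IOs in flight: each job keeps at most iodepth requests queued.
# Arguments: numjobs, iodepth.
max_in_flight() {
    echo $(( $1 * $2 ))
}

echo "numjobs for 6 cores:      $(fio_numjobs 6)"
echo "in flight, job file above: $(max_in_flight 24 16)"
echo "in flight, original test:  $(max_in_flight 1 248)"
```

The two in-flight totals are of the same order, so any gain should come
from having 24 submitters instead of one, not from a deeper queue.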

> So you have an idea why the real performance is only 50% of the theoretical 
> performance? 

Three reasons:  IO latency, limited stripe_cache_size, and parity
read-modify-write (RMW).

> No cpu core is at its limits.

Because you're not cycle limited but latency limited.  With this FIO
test your CPU burn should increase a bit.

> As i said in my other post. I would be interested to solve the problem but i 
> have problems to identify it.

Note also that you're doing 4KB random writes against RAID5.  This is
going to generate substantial RMW cycles.  The Intel X25-M G2 is not a
speed demon.  Its published max 4KB IOPS throughput is for purely
random writes, not the read+write pattern created by parity RMW.  So
while your random read should get a nice jump with this test, your
random write may not improve as much.  The limitation here is a function
of the SSD controller on the X25-M G2, not md/RAID5.  If you test 5
drives in md/RAID0 you'll see a bump in random write IOPS.
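
To make the RMW cost concrete, here is a minimal sketch of a RAID5
parity update.  One byte stands in for each 4KB block, the values are
arbitrary, and rmw_parity is a hypothetical helper; this illustrates the
XOR arithmetic only, not md's actual code path.

```shell
#!/bin/sh
# RAID5 parity is the XOR of the data blocks in a stripe.
d0=$(( 0x5A )); d1=$(( 0xA7 )); d2=$(( 0x0F ))
parity=$(( d0 ^ d1 ^ d2 ))

# Read-modify-write: new parity = old parity XOR old data XOR new data.
# Arguments: old_parity old_data new_data.
rmw_parity() {
    echo $(( $1 ^ $2 ^ $3 ))
}

# Overwrite d1. This needs the old d1 and the old parity read back first.
new_d1=$(( 0x3C ))
new_parity=$(rmw_parity "$parity" "$d1" "$new_d1")

# Cross-check against recomputing parity from the whole updated stripe.
full_parity=$(( d0 ^ new_d1 ^ d2 ))
[ "$new_parity" -eq "$full_parity" ] && echo "parity consistent"

# Device IOs per small write: RAID5 RMW = 2 reads + 2 writes, RAID0 = 1 write.
raid5_rmw_ios=$(( 2 + 2 ))
raid0_ios=1
echo "RAID5 RMW: $raid5_rmw_ios device IOs per write, RAID0: $raid0_ios"
```

Each small write therefore hits the drives with a mixed read+write
pattern, which is not what the published pure random write figure
measures.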

-- 
Stan

