From: Corey Hickey <bugfood-ml@fatooh.org>
To: stan@hardwarefreak.com
Cc: Peter Grandi <pg@lxra2.for.sabi.co.UK>,
	Linux RAID <linux-raid@vger.kernel.org>
Subject: Re: RAID 5: low sequential write performance?
Date: Mon, 17 Jun 2013 10:14:14 -0700
Message-ID: <51BF43E6.4000900@fatooh.org>
In-Reply-To: <51BF1BAC.9020701@hardwarefreak.com>

On 2013-06-17 07:22, Stan Hoeppner wrote:
> On 6/17/2013 1:39 AM, Corey Hickey wrote:
> 
>> 32768 seems to be the maximum for the stripe cache. I'm quite happy to
>> spend 32 MB for this. 256 KB seems quite low, especially since it's only
>> half the default chunk size.
> 
> FULL STOP.  Your stripe cache is consuming *384MB* of RAM, not 32MB.
> Check your actual memory consumption.  The value plugged into
> stripe_cache_size is not a byte value.  The value specifies the number
> of data elements in the stripe cache array.  Each element is #disks*4KB
> in size.  The formula for calculating memory consumed by the stripe
> cache is:
> 
> (num_of_disks * 4KB) * stripe_cache_size
> 
> In your case this would be
> 
> (3 * 4KB) * 32768 = 384MB

I'm actually seeing a bit more memory difference: 401-402 MB when going
from 256 to 32768, on a mostly idle system, so maybe there's
something else coming into play.
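
For reference, the arithmetic from the formula quoted above can be checked
with a small shell sketch. It assumes 4 KiB pages and the 3-disk array from
this thread; stripe_cache_mib is a hypothetical helper name, not an existing
tool.

```shell
# stripe_cache_mib: hypothetical helper, not part of mdadm or the kernel.
# Assumption: each stripe cache entry holds one 4 KiB page per member disk.
stripe_cache_mib() {
    disks=$1
    entries=$2
    page_kib=4
    # (disks * 4 KiB) * entries, converted from KiB to MiB
    echo $(( disks * page_kib * entries / 1024 ))
}

stripe_cache_mib 3 32768   # prints 384
stripe_cache_mib 3 256     # prints 3
```

By this estimate, going from 256 to 32768 should account for about 381 MB,
which leaves roughly 20 MB of the observed difference unexplained.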

Still, your formula does make more sense. Apparently the idea of the
value being in KB is a common misconception, possibly perpetuated by this:

https://raid.wiki.kernel.org/index.php/Performance
---
# Set stripe-cache_size for RAID5.
echo "Setting stripe_cache_size to 16 MiB for /dev/md3"
echo 16384 > /sys/block/md3/md/stripe_cache_size
---
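
A less misleading message could report the entry count together with the
estimated memory. The following is only a sketch: md_stripe_cache_report is
a hypothetical helper, and it assumes the md sysfs directory exposes both
stripe_cache_size and raid_disks, and that each entry costs one page per
member disk, as in the formula quoted earlier.

```shell
# md_stripe_cache_report: hypothetical helper for illustration only.
# Argument: the md sysfs directory, e.g. /sys/block/md3/md
md_stripe_cache_report() {
    md_dir=$1
    entries=$(cat "$md_dir/stripe_cache_size")   # number of cache entries, not KB
    disks=$(cat "$md_dir/raid_disks")            # member disks in the array
    page_bytes=$(getconf PAGESIZE)               # typically 4096
    mib=$(( entries * disks * page_bytes / 1048576 ))
    echo "$entries entries x $disks disks x $page_bytes bytes = $mib MiB"
}

# Example invocation (requires an md RAID5/6 array):
#   md_stripe_cache_report /sys/block/md3/md
```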

Is 256 really a reasonable default? Given what I've been seeing, it
appears that 256 is either unreasonably low or I have something else wrong.

>> mkfs.xfs /dev/m3
>> direct: 89.8 MB/s  not direct: 90.0 MB/s
> 
> You didn't align XFS.  Though with large streaming writes it won't
> matter much as md and the block layer will fill the stripes.  However,
> XFS' big advantage is parallel IO and you're testing serial IO.  Fire up
> 4 O_DIRECT threads/processes and compare to EXT4 w/4 write threads.  The
> throughput gap will increase until you run out of hardware.

This will be something to test next time I rebuild my "real" array.

Thanks,
Corey
