From: Stan Hoeppner <stan@hardwarefreak.com>
To: John Robinson <john.robinson@anonymous.org.uk>
Cc: thomas@fjellstrom.ca, Tommy Apel Hansen <tommyapeldk@gmail.com>,
	Chris Murphy <lists@colorremedies.com>,
	linux-raid Raid <linux-raid@vger.kernel.org>
Subject: Re: recommended way to add ssd cache to mdraid array
Date: Wed, 16 Jan 2013 15:29:08 -0600
Message-ID: <50F71BA4.40900@hardwarefreak.com>
In-Reply-To: <50F66BDB.3000203@anonymous.org.uk>

On 1/16/2013 2:59 AM, John Robinson wrote:

> At the same time, if in the real world you're doing streaming writes of
> dozens of MB/s, I would expect that write caching would turn a good
> proportion of the writes into full-stripe writes.

The filesystem dictates whether a write is stripe aligned, and whether
it fills a full stripe.  If the filesystem issues multiple partial
stripe writes, no type or manner of write caching is going to turn them
into a single stripe aligned write.  That's not possible.

On that note, a great many people on this list mistakenly configure,
optimize, and test their arrays for a magical chunk/stripe size,
apparently oblivious to the fact that most writes with most workloads
are not going to be full stripe writes, or stripe aligned whatsoever:

1.  Journal writes can be aligned, but usually don't fill a full stripe
2.  Metadata writes to the directory trees are often unaligned
3.  File appends or modify-in-place ops are never aligned

The only instance in which one will always get full stripe writes is
when creating and writing a new file whose size is a multiple of the
full stripe width.  I.e. one must be performing allocation to fill full
stripes.  Most workloads don't do this.
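
To make the alignment point concrete, here is a rough sketch that
splits a single write into whole stripes and leftover partial bytes.
It is only an illustration: the 512 KiB chunk size, the 4 data disks
(e.g. a 6-drive RAID6), and the function name are assumptions, not
values taken from anyone's array in this thread.

```python
def classify_write(offset, length, chunk_size, data_disks):
    """Return (full_stripes, partial_bytes) for one write.

    offset and length are in bytes, relative to the start of the array.
    """
    stripe = chunk_size * data_disks       # data bytes in one full stripe
    end = offset + length
    first_boundary = -(-offset // stripe)  # round offset up to a stripe boundary
    last_boundary = end // stripe          # round end down to a stripe boundary
    full_stripes = max(0, last_boundary - first_boundary)
    partial_bytes = length - full_stripes * stripe
    return full_stripes, partial_bytes


KIB = 1024
CHUNK = 512 * KIB   # assumed chunk size
DATA_DISKS = 4      # assumed: 6-drive RAID6 = 4 data + 2 parity

# Aligned write of exactly one stripe width
print(classify_write(0, 2048 * KIB, CHUNK, DATA_DISKS))
# Same size, but starting 4 KiB into the stripe
print(classify_write(4 * KIB, 2048 * KIB, CHUNK, DATA_DISKS))
# Small append-style write
print(classify_write(0, 64 * KIB, CHUNK, DATA_DISKS))
```

With these assumed values only the first write covers a full stripe.
The second is the same size but straddles two stripes, so both ends
are partial stripe writes.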

This is the reason why RAID6 performs so horribly with mixed read/write
workloads.  Using Thomas' example, while he was doing a streaming read
of a media file and simultaneously doing non-aligned writes from a P2P
or other application, md was performing a read-modify-write (RMW)
operation for each write, adding substantially to the seek burden on the
drives.  RAID5/6 use rotating parity, so each write also incurred an
extra seek on each of two drives, competing with the read seeks of his
streaming app.  Consumer 7.2K drives aren't designed to handle this type
of random seek load with good performance.

With RAID10, or RAID0 over RAID1, there is no RMW penalty for partial
stripe width writes, and no extra seek burden for parity writes, as
described above for RAID5/6.  Such an array doesn't cause the playback
stutter, because the disks can service the read and write requests
without running out of head seek bandwidth the way parity arrays do due
to RMW and parity block writes.
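
The difference in disk operations can be sketched with a simplified
textbook model.  This is not md's actual stripe handling, which depends
on the kernel version and the stripe cache.  The function names and the
4-data-disk geometry are assumptions for illustration only.

```python
def parity_write_ios(chunks_written, data_disks, parity_disks=2):
    """Return (reads, writes) for one write within a single parity stripe.

    Simplified model: a full stripe write needs no reads.  A partial
    write either reads the old data plus old parity (read-modify-write)
    or reads the untouched data chunks (reconstruct-write), whichever
    needs fewer reads.
    """
    if chunks_written >= data_disks:
        return 0, data_disks + parity_disks
    rmw_reads = chunks_written + parity_disks  # old data + old parity
    rcw_reads = data_disks - chunks_written    # chunks not being rewritten
    return min(rmw_reads, rcw_reads), chunks_written + parity_disks


def mirror_write_ios(chunks_written, copies=2):
    """Return (reads, writes) for a striped mirror: no reads, one write per copy."""
    return 0, chunks_written * copies


DATA_DISKS = 4  # assumed: 6-drive RAID6 = 4 data + 2 parity

# One-chunk partial write: parity array vs. striped mirror
print("RAID6  partial:", parity_write_ios(1, DATA_DISKS))
print("RAID10 partial:", mirror_write_ios(1))
# Full stripe write on the parity array
print("RAID6  full   :", parity_write_ios(DATA_DISKS, DATA_DISKS))
```

Under this model a one-chunk write costs the parity array three reads
plus three writes, against two writes and no reads for the mirror.
Each of those operations is a potential seek competing with the
streaming reads.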

In summary, with Thomas' old disk system, he would have most likely
avoided the playback stutter simply by using a non-parity RAID level.

I'm constantly amazed by the fact that so many people here using parity
RAID don't understand the performance impact of these basic parity RAID
IO behaviors, and how striping actually works, and the fact that most
often they're not writing full stripes, and thus not benefiting from
their spindle count.

-- 
Stan

