linux-raid.vger.kernel.org archive mirror
From: Stan Hoeppner <stan@hardwarefreak.com>
To: CoolCold <coolthecold@gmail.com>
Cc: Daniel Pocock <daniel@pocock.com.au>,
	David Brown <david.brown@hesbynett.no>,
	Roberto Spadim <roberto@spadim.com.br>,
	Phil Turmel <philip@turmel.org>,
	Marcus Sorensen <shadowsor@gmail.com>,
	linux-raid@vger.kernel.org
Subject: Re: md RAID with enterprise-class SATA or SAS drives
Date: Mon, 21 May 2012 13:51:21 -0500
Message-ID: <4FBA8EA9.40203@hardwarefreak.com>
In-Reply-To: <CAGqmV7oJg8vwKPJEYJhPANzaN-xxVW6Lw2gLTEKmMfG=pqCHuA@mail.gmail.com>

On 5/21/2012 10:20 AM, CoolCold wrote:
> On Sat, May 12, 2012 at 2:28 AM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>> On 5/11/2012 3:16 AM, Daniel Pocock wrote:
>>
> [snip]
>> That's the one scenario where I abhor using md raid, as I mentioned.  At
>> least, a boot raid 1 pair.  Using layered md raid 1 + 0, or 1 + linear
>> is a great solution for many workloads.  Ask me why I say raid 1 + 0
>> instead of raid 10.
> So, I'm asking - why?

Neil pointed out quite some time ago that the md RAID 1/5/6/10 code runs
as a single kernel thread.  Thus when running heavy IO workloads across
many rust disks or a few SSDs, the md thread becomes CPU bound, as it
can only execute on a single core, just as with any other single thread.

This issue is becoming more relevant as folks move to the latest
generation of server CPUs that trade clock speed for higher core count.
Imagine the surprise of the OP who buys a dual socket box with 2x 16
core AMD Interlagos 2.0GHz CPUs, 256GB RAM, and 32 SSDs in md RAID 10,
only to find he can get only a tiny fraction of the SSD throughput.
Upon investigation he finds a single md thread pegging one core while
the rest are relatively idle but for the application itself.

As I understand Neil's explanation, the md RAID 0 and linear code don't
run as separate kernel threads, but merely pass offsets to the block
layer, which is fully threaded.  Thus, by layering md RAID 0 over md
RAID 1 pairs, the striping load is spread over all cores.  The same
holds for linear, which likewise avoids the single thread bottleneck.
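To make the layering concrete, here is a rough sketch that only builds
the mdadm command lines for RAID 1 pairs with a RAID 0 (or linear) array
on top.  It runs nothing.  The function name, the device names, and the
/dev/mdN numbering are all made up for illustration, and chunk size and
metadata options are left at their defaults.

```python
from typing import List


def layered_raid_plan(disks: List[str], top_level: str = "0") -> List[str]:
    """Return mdadm commands for RAID 1 pairs joined by a RAID 0 or linear array.

    Hypothetical helper: it only formats command strings and never runs them.
    top_level is "0" for striping or "linear" for concatenation.
    """
    if len(disks) < 4 or len(disks) % 2:
        raise ValueError("need an even number of disks, at least 4")
    if top_level not in ("0", "linear"):
        raise ValueError("top_level must be '0' or 'linear'")

    commands = []
    mirrors = []
    # One RAID 1 array per pair of disks: /dev/md1, /dev/md2, ...
    for i in range(0, len(disks), 2):
        md = f"/dev/md{i // 2 + 1}"
        mirrors.append(md)
        commands.append(
            f"mdadm --create {md} --level=1 --raid-devices=2 "
            f"{disks[i]} {disks[i + 1]}"
        )
    # The top array stripes or concatenates the mirrors.
    commands.append(
        f"mdadm --create /dev/md0 --level={top_level} "
        f"--raid-devices={len(mirrors)} " + " ".join(mirrors)
    )
    return commands


if __name__ == "__main__":
    for cmd in layered_raid_plan(["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]):
        print(cmd)
```

Passing top_level="linear" gives the 1 + linear variant instead of
1 + 0.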

This layering can be done with any md RAID level, creating RAID50s and
RAID60s, or concatenations of RAID5/6, as well as of RAID 10.

And it shouldn't take anywhere near 32 modern SSDs to saturate a single
2GHz core with md RAID 10.  It's likely fewer than 8 SSDs, which yield
~400K IOPS, but I haven't done verification testing myself at this point.
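For a sense of scale, the back-of-envelope arithmetic behind that guess
can be written out.  This is only a sketch using the untested figures
above (2GHz core, ~400K IOPS from 8 SSDs); the function name is
hypothetical, and the result is the per-I/O cycle budget at which one
thread would exactly fill one core, not a measured cost of the md code.

```python
def cycles_per_io(core_hz: float, iops: float) -> float:
    """Cycles one core can spend on each I/O at the given I/O rate."""
    return core_hz / iops


CORE_HZ = 2.0e9        # single 2GHz core, from the text above
TOTAL_IOPS = 400_000   # ~400K IOPS, an untested estimate
NUM_SSDS = 8           # SSD count assumed for that estimate

budget = cycles_per_io(CORE_HZ, TOTAL_IOPS)
per_ssd_iops = TOTAL_IOPS / NUM_SSDS

print(f"cycle budget per I/O on one core: {budget:.0f}")
print(f"implied IOPS per SSD: {per_ssd_iops:.0f}")
```

If the md thread needs more cycles per request than that budget, the
single core becomes the limit before the SSDs do.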

-- 
Stan
