From: Stan Hoeppner <stan@hardwarefreak.com>
To: Roberto Spadim <roberto@spadim.com.br>
Cc: CoolCold <coolthecold@gmail.com>,
Daniel Pocock <daniel@pocock.com.au>,
David Brown <david.brown@hesbynett.no>,
Phil Turmel <philip@turmel.org>,
Marcus Sorensen <shadowsor@gmail.com>,
linux-raid@vger.kernel.org
Subject: Re: md RAID with enterprise-class SATA or SAS drives
Date: Mon, 21 May 2012 14:05:01 -0500
Message-ID: <4FBA91DD.7010307@hardwarefreak.com>
In-Reply-To: <CABYL=TqBeC+D_FHBGNO5WmdhP5zArQsNVY4v1xwHy9Zz0w4M1w@mail.gmail.com>
On 5/21/2012 1:54 PM, Roberto Spadim wrote:
> hum, does anyone could explain what a 'multi thread' version of raid1
> could be implemented?
> for example, how to scale it? and why this new implementation could
> scale it better
I just did below. You layer a stripe over many RAID 1 pairs. A single
md RAID 1 pair isn't enough to saturate a single core so there is no
gain to be had by trying to thread the RAID 1 code.
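
A minimal sketch of that layering, for illustration only. It builds
(but does not run) the mdadm command lines for a RAID 0 stripe over
RAID 1 pairs. The helper name, the device names, the md numbering and
the 64 KiB chunk size are all assumptions made for the example, not
recommendations:

```python
from typing import List, Sequence


def layered_raid10_commands(
    disks: Sequence[str],
    chunk_kib: int = 64,   # assumed chunk size in KiB, purely illustrative
    stripe_md: int = 0,    # assumed md number for the top-level stripe
    first_mirror_md: int = 1,
) -> List[List[str]]:
    """Return mdadm argv lists for RAID 0 layered over RAID 1 pairs.

    Nothing is executed here; the caller decides whether to run them.
    """
    if len(disks) < 4 or len(disks) % 2 != 0:
        raise ValueError("need an even number of disks, at least 4")

    commands: List[List[str]] = []
    mirrors: List[str] = []

    # One RAID 1 array per pair of adjacent disks.
    for i in range(0, len(disks), 2):
        md = f"/dev/md{first_mirror_md + i // 2}"
        mirrors.append(md)
        commands.append(
            ["mdadm", "--create", md, "--level=1", "--raid-devices=2",
             disks[i], disks[i + 1]]
        )

    # One RAID 0 stripe across all of the mirrors.
    commands.append(
        ["mdadm", "--create", f"/dev/md{stripe_md}", "--level=0",
         f"--chunk={chunk_kib}", f"--raid-devices={len(mirrors)}", *mirrors]
    )
    return commands


if __name__ == "__main__":
    # Hypothetical device names; substitute your own before using any output.
    for argv in layered_raid10_commands(
        ["/dev/sda", "/dev/sdb", "/dev/sdc", "/dev/sdd"]
    ):
        print(" ".join(argv))
```

Swapping --level=0 for --level=linear in the last command should give
the concatenation variant instead of the stripe.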
--
Stan
> 2012/5/21 Stan Hoeppner <stan@hardwarefreak.com>:
>> On 5/21/2012 10:20 AM, CoolCold wrote:
>>> On Sat, May 12, 2012 at 2:28 AM, Stan Hoeppner <stan@hardwarefreak.com> wrote:
>>>> On 5/11/2012 3:16 AM, Daniel Pocock wrote:
>>>>
>>> [snip]
>>>> That's the one scenario where I abhor using md raid, as I mentioned. At
>>>> least, a boot raid 1 pair. Using layered md raid 1 + 0, or 1 + linear
>>>> is a great solution for many workloads. Ask me why I say raid 1 + 0
>>>> instead of raid 10.
>>> So, I'm asking - why?
>>
>> Neil pointed out quite some time ago that the md RAID 1/5/6/10 code runs
>> as a single kernel thread. Thus when running heavy IO workloads across
>> many rust disks or a few SSDs, the md thread becomes CPU bound, as it
>> can only execute on a single core, just as with any other single thread.
>>
>> This issue is becoming more relevant as folks move to the latest
>> generation of server CPUs that trade clock speed for higher core count.
>> Imagine the surprise of the op who buys a dual socket box with 2x 16
>> core AMD Interlagos 2.0GHz CPUs, 256GB RAM, and 32 SSDs in md RAID 10,
>> only to find he can only get a tiny fraction of the SSD throughput.
>> Upon investigation he finds a single md thread peaking one core while
>> the rest are relatively idle but for the application itself.
>>
>> As I understand Neil's explanation, the md RAID 0 and linear code don't
>> run as separate kernel threads, but merely pass offsets to the block
>> layer, which is fully threaded. Thus, by layering md RAID 0 over md
>> RAID 1 pairs, the striping load is spread over all cores. Same with
>> linear, avoiding the single thread bottleneck.
>>
>> This layering can be done with any md RAID level, creating RAID50s and
>> RAID60s, or concatenations of RAID5/6, as well as of RAID 10.
>>
>> And it shouldn't take anywhere near 32 modern SSDs to saturate a single
>> 2GHz core with md RAID 10. It's likely less than 8 SSDs, which yield
>> ~400K IOPS, but I haven't done verification testing myself at this point.
>>
>> --
>> Stan