From: David Brown <david.brown@hesbynett.no>
To: stan@hardwarefreak.com
Cc: NeilBrown <neilb@suse.de>, CoolCold <coolthecold@gmail.com>,
	Daniel Pocock <daniel@pocock.com.au>,
	Roberto Spadim <roberto@spadim.com.br>,
	Phil Turmel <philip@turmel.org>,
	Marcus Sorensen <shadowsor@gmail.com>,
	linux-raid@vger.kernel.org
Subject: Re: md RAID with enterprise-class SATA or SAS drives
Date: Wed, 23 May 2012 21:49:29 +0200
Message-ID: <4FBD3F49.5060005@hesbynett.no>
In-Reply-To: <4FBCE2C2.6030909@hardwarefreak.com>

On 23/05/12 15:14, Stan Hoeppner wrote:
> On 5/22/2012 2:29 AM, David Brown wrote:
>
>> But in general, it's important to do some real-world testing to
>> establish whether or not there really is a bottleneck here.  It is
>> counter-productive for Stan (or anyone else) to advise against raid10 or
>> raid5/6 because of a single-thread bottleneck if it doesn't actually
>> slow things down in practice.
>
> Please reread precisely what I stated earlier:
>
> "Neil pointed out quite some time ago that the md RAID 1/5/6/10 code
> runs as a single kernel thread.  Thus when running heavy IO workloads
> across many rust disks or a few SSDs, the md thread becomes CPU bound,
> as it can only execute on a single core, just as with any other single
> thread."
>
> Note "heavy IO workloads".  The real world testing upon which I based my
> recommendation is in this previous thread on linux-raid, of which I was
> a participant.
>
> Mark Delfman did the testing which revealed this md RAID thread
> scalability problem using 4 PCIe enterprise SSDs:
>
> http://marc.info/?l=linux-raid&m=131307849530290&w=2
>
>> On the other hand, if it /is/ a hindrance to
>> scaling, then it is important for Neil and other experts to think about
>> how to change the architecture of md raid to scale better.  And
>
> More thorough testing and identification of the problem is definitely
> required.  Apparently few people are currently running md RAID 1/5/6/10
> across multiple ultra high performance SSDs, people who actually need
> every single ounce of IOPS out of each device in the array.  But this
> trend will increase.  I'd guess those currently building md 1/5/6/10
> arrays w/ many SSDs simply don't *need* every ounce of IOPS, or more
> would be complaining about the single-core thread limit already.
>
>> somewhere in between there can be guidelines to help users - something
>> like "for an average server, single-threading will saturate raid5
>> performance at 8 disks, raid6 performance at 6 disks, and raid10 at 10
>> disks, beyond which you should use raid0 or linear striping over two or
>> more arrays".
>
> This isn't feasible due to the myriad possible combinations of hardware.
> And you simply won't see this problem with SRDs (spinning rust disks)
> until you have hundreds of them in a single array.  It requires over 200
> 15K SRDs in RAID 10 to generate only 30K random IOPS.  Just about any
> single x86 core can handle that, probably even a 1.6GHz Atom.  This
> issue mainly affects SSD arrays, where even 8 midrange consumer SATA3
> SSDs in RAID 10 can generate over 400K IOPS, 200K real and 200K mirror data.
>
>> Of course, to do such testing, someone would need a big machine with
>> lots of disks, which is not otherwise in use!
>
> Shouldn't require anything that heavy.  I would guess that one should be
> able to reveal the thread bottleneck with a low freq dual core desktop
> system with an HBA such as the LSI 9211-8i @320K IOPS, and 8 Sandforce
> 2200 based SSDs @40K write IOPS each.
>

It looks like Shaohua Li has done some testing, found that there is a 
slow-down even with just 2 or 4 disks, and has written patches to fix it 
(for raid1 and raid10 so far), which is very nice.
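
For anyone who wants to sanity-check the figures quoted above, here is a rough back-of-envelope sketch in Python. It only reproduces the arithmetic and says nothing about where the md thread actually saturates. The function names are hypothetical, and the per-device figures of 150 IOPS for a 15K disk and 50,000 IOPS for a midrange SSD are assumptions chosen to match the quoted totals, not measurements.

```python
def device_level_iops(n_devices, iops_per_device):
    """Aggregate I/O operations per second issued to all member devices."""
    return n_devices * iops_per_device


def raid10_write_split(total_device_iops, copies=2):
    """Split device-level write IOPS into unique ("real") and mirror-copy IOPS.

    Assumes every write is stored `copies` times, as in a 2-copy RAID 10.
    """
    real = total_device_iops // copies
    return real, total_device_iops - real


def devices_needed(target_iops, iops_per_device):
    """Smallest number of devices whose aggregate IOPS reaches the target."""
    return -(-target_iops // iops_per_device)  # ceiling division


# Assumed per-device figures (not from the thread): 150 IOPS per 15K disk,
# 50,000 IOPS per midrange SSD.
print(devices_needed(30_000, 150))                            # 15K disks for 30K IOPS
print(raid10_write_split(device_level_iops(8, 50_000)))       # 8 SSDs in RAID 10
print(device_level_iops(8, 40_000))                           # 8 SSDs at 40K write IOPS
```

On these assumptions, the 8-SSD test rig Stan describes would issue as many write IOPS as the quoted 320K limit of the HBA, which is why such a small system might already be enough to show the bottleneck.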