From: David Brown <david@westcontrol.com>
To: linux-raid@vger.kernel.org
Subject: Re: high throughput storage server?
Date: Tue, 22 Feb 2011 10:49:18 +0100
Message-ID: <ik00s5$9do$1@dough.gmane.org>
In-Reply-To: <Pine.GSO.4.64.1102221027330.7398@montezuma.acc.umu.se>

On 22/02/2011 10:30, Mattias Wadenstein wrote:
> On Tue, 22 Feb 2011, David Brown wrote:
>
>> On 21/02/2011 22:51, Stan Hoeppner wrote:
>>> Mattias Wadenstein put forth on 2/21/2011 4:25 AM:
>>>> On Fri, 18 Feb 2011, Stan Hoeppner wrote:
>>>>
>>>>> Mattias Wadenstein put forth on 2/18/2011 7:49 AM:
>>>>>
>>>>> RAID 5/6 need not apply due the abysmal RMW partial stripe write
>>>>> penalty, unless of course you're doing almost no writes. But in that
>>>>> case, how did the data get there in the first place? :)
>>>
>>>> Actually, that's probably the common case for data analysis load. Lots
>>>> of random reads, but only occasional sequential writes when you add a
>>>> new file/fileset. So raid 5/6 performance-wise works out pretty much as
>>>> a stripe of n-[12] disks.
>>>
>>> RAID5/6 have decent single streaming read performance, but sub optimal
>>> random read, less than sub optimal streaming write, and abysmal random
>>> write performance. They exhibit poor random read performance with high
>>> client counts when compared to RAID0 or RAID10. Additionally, with an
>>> analysis "cluster" designed for overall high utilization (no idle
>>> nodes), one node will be uploading data sets while others are doing
>>> analysis. Thus you end up with a mixed simultaneous random read and
>>> streaming write workload on the server. RAID10 will give many times the
>>> throughput in this case compared to RAID5/6, which will bog down rapidly
>>> under such a workload.
>>>
>>
>> I'm a little confused here. It's easy to see why RAID5/6 have very
>> poor random write performance - you need at least two reads and two
>> writes for a single write access. It's also easy to see that streaming
>> reads will be good, as you can read from most of the disks in parallel.
>>
>> However, I can't see that streaming writes would be so bad - you have
>> to write slightly more than for a RAID0 write, since you have the
>> parity data too, but the parity is calculated in advance without the
>> need of any reads, and all the writes are in parallel. So you get the
>> streamed write performance of n-[12] disks. Contrast this with RAID10
>> where you have to write out all data twice - you get the performance
>> of n/2 disks.
>
> It's fine as long as you have only a few streaming writes, if you go up
> to many streams things might start breaking down.
>

That's always going to be the case when you have a lot of writes at the
same time.  Perhaps RAID5/6 makes matters a little worse by requiring a
certain ordering on the writes to ensure consistency (maybe you have to
write a whole stripe before starting a new stripe?  I don't know how md
raid balances performance and consistency here).  I think the choice of
file system is likely to have a bigger impact in such cases.

>> I also cannot see why random reads would be bad - I would expect that
>> to be of similar speed to a RAID0 setup. The only exception would be
>> if you've got atime enabled, and each random read was also causing a
>> small write - then it would be terrible.
>>
>> Or am I missing something here?
>
> The thing I think you are missing is crappy implementations in several
> HW raid controllers. For linux software raid the situation is quite
> sanely as you describe in my experience.
>

Ah, okay.  Thanks!
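
For reference, the arithmetic in the quoted text above can be sketched
roughly as follows. This is an idealized back-of-the-envelope model, not
a description of how md raid actually schedules I/O: it assumes
full-stripe streaming writes, two-way mirroring for RAID10, no caching,
and a plain read-modify-write for small writes. The function names are
made up for illustration.

```python
# Idealized model only: ignores caches, stripe alignment problems and
# controller quirks. Function names are hypothetical.

PARITY_DISKS = {"raid0": 0, "raid5": 1, "raid6": 2}


def streaming_write_disks(level: str, n: int) -> float:
    """Disks' worth of useful bandwidth for full-stripe streaming writes."""
    if level == "raid10":
        # Assumes two-way mirrors: every block is written twice.
        return n / 2
    parity = PARITY_DISKS[level]
    if n <= parity:
        raise ValueError("not enough disks for " + level)
    # Parity is computed from the data in hand, so no reads are needed;
    # only the parity disks' share of the bandwidth is lost.
    return float(n - parity)


def small_write_ios(level: str) -> tuple:
    """(reads, writes) for one small random write using read-modify-write."""
    if level == "raid0":
        return (0, 1)
    if level == "raid10":
        return (0, 2)  # one write per mirror half
    parity = PARITY_DISKS[level]
    # Read old data plus each old parity block, then write them all back.
    return (1 + parity, 1 + parity)


if __name__ == "__main__":
    for level in ("raid0", "raid5", "raid6", "raid10"):
        print(level, streaming_write_disks(level, 8), small_write_ios(level))
```

With 8 disks this gives 7 and 6 disks' worth of streaming write
bandwidth for RAID5 and RAID6 against 4 for RAID10, while a small random
write costs 2 reads + 2 writes on RAID5 against 2 writes on RAID10.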


