linux-raid.vger.kernel.org archive mirror
From: Michael Evans <mjevans1983@gmail.com>
To: linux-raid@vger.kernel.org
Subject: Re: The huge different performance of sequential read between RAID0 and RAID5
Date: Thu, 28 Jan 2010 22:05:47 -0800	[thread overview]
Message-ID: <4877c76c1001282205m280049b8y34253fc8f5062d0f@mail.gmail.com> (raw)
In-Reply-To: <20100128152755.GA23933@cthulhu.home.robinhill.me.uk>

On Thu, Jan 28, 2010 at 7:27 AM, Robin Hill <robin@robinhill.me.uk> wrote:
> On Thu Jan 28, 2010 at 09:55:05AM -0500, Yuehai Xu wrote:
>
>> 2010/1/28 Gabor Gombas <gombasg@sztaki.hu>:
>> > On Thu, Jan 28, 2010 at 09:31:23AM -0500, Yuehai Xu wrote:
>> >
>> >> >> md0 : active raid5 sdh1[7] sdg1[5] sdf1[4] sde1[3] sdd1[2] sdc1[1] sdb1[0]
>> >> >>       631353600 blocks level 5, 64k chunk, algorithm 2 [7/6] [UUUUUU_]
>> > [...]
>> >
>> >> I don't think any of my drive fail because there is no "F" in my
>> >> /proc/mdstat output
>> >
>> > It's not failed, it's simply missing. Either it was unavailable when the
>> > array was assembled, or you've explicitly created/assembled the array
>> > with a missing drive.
>>
>> I noticed that, thanks! Is it usual that at the beginning of each
>> setup, there is one missing drive?
>>
> Yes - in order to make the array available as quickly as possible, it is
> initially created as a degraded array.  The recovery is then run to
> add in the extra disk.  Otherwise all disks would need to be written
> before the array became available.
>
>> >
>> >> How do you know my RAID5 array has one drive missing?
>> >
>> > Look at the above output: there are just 6 of the 7 drives available,
>> > and the underscore also means a missing drive.
>> >
>> >> I tried to setup RAID5 with 5 disks, 3 disks, after each setup,
>> >> recovery has always been done.
>> >
>> > Of course.
>> >
>> >> However, if I format my md0 with such command:
>> >> mkfs.ext3 -b 4096 -E stride=16 -E stripe-width=*** /dev/XXXX, the
>> >> performance for RAID5 becomes usual, at about 200~300M/s.
>> >
>> > I suppose in that case you had all the disks present in the array.
>>
>> Yes, I did my test after the recovery, in that case, does the "missing
>> drive" hurt the performance?
>>
> If you had a missing drive in the array when running the test, then this
> would definitely affect the performance (as the array would need to do
> parity calculations for most stripes).  However, as you've not actually
> given the /proc/mdstat output for the array post-recovery then I don't
> know whether or not this was the case.
>
> Generally, I wouldn't expect the RAID5 array to be that much slower than
> a RAID0.  You'd best check that the various parameters (chunk size,
> stripe cache size, readahead, etc) are the same for both arrays, as
> these can have a major impact on performance.
>
> Cheers,
>    Robin
> --
>     ___
>    ( ' }     |       Robin Hill        <robin@robinhill.me.uk> |
>   / / )      | Little Jim says ....                            |
>  // !!       |      "He fallen in de water !!"                 |
>
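A quick way to compare the parameters Robin mentions across both arrays
(a sketch only; the device name is an assumption):

```shell
mdadm --detail /dev/md0 | grep -i 'chunk'   # chunk size
cat /sys/block/md0/md/stripe_cache_size     # raid-5 stripe cache (raid-5 only)
blockdev --getra /dev/md0                   # readahead, in 512-byte sectors
```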

A more valid test would run as follows:

Create a raid-5 array from all the test drives and let the initial
resync complete (alternatively, you can zero the drives any way you
like and pass --assume-clean, which is only safe if they really are all
zeros).
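As a sketch of that first step (device names are assumptions; this
DESTROYS any data on them):

```shell
# Normal creation: a 7-drive raid-5 with a 64k chunk, then wait for the
# initial resync to finish (watch /proc/mdstat).
mdadm --create /dev/md0 --level=5 --raid-devices=7 --chunk=64 \
      /dev/sd[b-h]1

# Alternative: if the members have already been zeroed, --assume-clean
# skips the resync -- all-zero data with all-zero parity is consistent.
# mdadm --create /dev/md0 --level=5 --raid-devices=7 --chunk=64 \
#       --assume-clean /dev/sd[b-h]1
```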

Run any tests you like.
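For example, a simple sequential-read test (a sketch; the device name
and readahead value are assumptions -- the point is to use identical
settings for both arrays):

```shell
# Pin readahead to the same value for every run; it strongly affects
# sequential read numbers.
blockdev --setra 4096 /dev/md0

# Sequential read, bypassing the page cache so repeats are comparable.
dd if=/dev/md0 of=/dev/null bs=1M count=4096 iflag=direct
```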

Stop the array and run --zero-superblock on each member device.
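In mdadm terms that step might look like (device names assumed):

```shell
mdadm --stop /dev/md0
# Erase the md superblock on each former member so the next --create
# starts from a clean slate.
mdadm --zero-superblock /dev/sd[b-h]1
```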

Create a striped array (raid 0) using all but one of the test drives.
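For example, keeping the same chunk size as the raid-5 run (device
names assumed):

```shell
# Six of the seven drives, same 64k chunk as before.
mdadm --create /dev/md0 --level=0 --raid-devices=6 --chunk=64 \
      /dev/sd[b-g]1
```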

Since you dropped the drive's worth of storage that would be dedicated
to parity in the raid-5 setup, you're now benchmarking the same amount
of /data/ storage; you've simply given up one drive's worth of
redundancy data (at the cost of losing the array if any single drive
fails).

Then run the same benchmarks.

Why is this a fairer comparison than throwing all the drives at it in
raid-0 mode as well?  Because it provides the same resulting storage
size.
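A concrete (hypothetical) example with seven 100 GiB drives shows the
capacities line up:

```shell
# Seven 100 GiB drives in raid-5: one drive's capacity goes to parity.
raid5_usable=$(( (7 - 1) * 100 ))   # 600 GiB
# Six of those drives in raid-0: no parity, all capacity is data.
raid0_usable=$(( 6 * 100 ))         # 600 GiB
echo "$raid5_usable $raid0_usable"
```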


What I suspect you'll find is very similar read performance, and
measurably, though perhaps tolerably, worse write performance from
raid-5.
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Thread overview: 9+ messages
2010-01-28  3:16 The huge different performance of sequential read between RAID0 and RAID5 Yuehai Xu
2010-01-28  7:06 ` Gabor Gombas
2010-01-28 14:31   ` Yuehai Xu
2010-01-28 14:41     ` Gabor Gombas
2010-01-28 14:55       ` Yuehai Xu
2010-01-28 15:27         ` Robin Hill
2010-01-29  6:05           ` Michael Evans [this message]
2010-01-29 11:53             ` Goswin von Brederlow
2010-01-30  7:03               ` Michael Evans
