Linux RAID subsystem development
* AW: Raid 1 vs 5 ?
@ 2004-05-07  5:10 Martin Bene
  2004-05-07  5:52 ` John Lange
  2004-06-09 13:27 ` Mauricio
  0 siblings, 2 replies; 12+ messages in thread
From: Martin Bene @ 2004-05-07  5:10 UTC (permalink / raw)
  To: LinuxRaid

> Which one is the better choice and what are the trade offs? Or is
> another configuration more sensible? I'm under the impression that you
> shouldn't (can't?) boot from RAID 5.

Depends very much on what you're going to do with the system - I've
found raid5 to have a high performance impact for database applications with
frequent updates (where you end up with lots of small writes scattered
all over the partition). If write speed isn't too important, the space
savings may well make raid5 more attractive.

True, you can't boot directly off raid5, but you can have a /boot on
raid1 and the rest of the system on raid5. Also, you definitely should
consider putting swap on raid1: otherwise a failure of the swap disk
will bring your system down. (Don't put swap on raid5 - same performance
issue as mentioned above.)

A minimal configuration for 4 disks optimized for max space could be
like this, though you might want separate raid5s for /, /usr, /var.

Each disk partitioned alike:
	1	30MB 
	2	1/2 size_of_swap_
	5	rest_of_disk

Now you can create mds on the disk:
	md0	raid1 sda1 sda2 sda3 sda4
	md1	raid1 sda1 sdb1
	md2	raid1 sdc1 sdd1
	md3	raid5 sda5 sdb5 sdc5 sdd5

	md0	/boot
	md1	swap
	md2	swap
	md3	/

You've got a small 4-disk raid1 as /boot, so each of your disks can be
bootable.
Swap is on two 2-disk raid1 arrays so your system can survive failure
of a disk used for swap.
Main data storage is on 4-disk raid5.

Bye, Martin


* Re: AW: Raid 1 vs 5 ?
  2004-05-07  5:10 AW: Raid 1 vs 5 ? Martin Bene
@ 2004-05-07  5:52 ` John Lange
  2004-05-07 14:29   ` Maarten van den Berg
  2004-06-09 13:27 ` Mauricio
  1 sibling, 1 reply; 12+ messages in thread
From: John Lange @ 2004-05-07  5:52 UTC (permalink / raw)
  To: Martin Bene; +Cc: LinuxRaid

Thank you Martin. You have some really great insights there.

A couple of things about the raid sets confused me:

> Each disk partitioned alike:
> 	1	30MB 
> 	2	1/2 size_of_swap_
> 	5	rest_of_disk
> 
> Now you can create mds on the disk:
> 	md0	raid1 sda1 sda2 sda3 sda4
> 	md1	raid1 sda1 sdb1
> 	md2	raid1 sdc1 sdd1
> 	md3	raid5 sda5 sdb5 sdc5 sdd5

First, why do we skip sdx3 and sdx4 on each disk and go directly to sdx5
for partition numbers?

Second, I'm very confused by the way you divided up the raid sets....
I'm thinking you erred? I'm such a newb it's possible I really don't
understand what's going on so hopefully you can verify.

md0: did you mean sda1 sdb1 sdc1 sdd1 ?
md1: did you mean sda2 sdb2 ?
md2: did you mean sdc2 sdd2 ?
md3: did you mean sda5 sdb5 sdc5 sdd5 ?

And last, is it possible to build the system from the beginning on RAID?
I'm using slackware. I see there is a section in the how-to for
converting a red hat system after the fact but obviously it would be
easier if I didn't have to do that.

Thanks very much Martin.

John Lange


* AW: Raid 1 vs 5 ?
@ 2004-05-07  6:14 Martin Bene
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Bene @ 2004-05-07  6:14 UTC (permalink / raw)
  To: John Lange; +Cc: LinuxRaid

> First, why do we skip sdx3 and sdx4 on each disk and go 
> directly to sdx5
> for partition numbers?

Just habit I guess; usually I've got a couple of data partitions, so I
will have to have some extended partitions; to make things more similar
between systems I tend to put "special" stuff (boot, swap, possibly a
rescue system) on primary partitions and have / as the first extended
partition.

> Second, I'm very confused by the way you divided up the raid sets....
> I'm thinking you erred? I'm such a newb it's possible I really don't
> understand what's going on so hopefully you can verify.
> 
> md0: did you mean sda1 sdb1 sdc1 sdd1 ?
> md1: did you mean sda2 sdb2 ?
> md2: did you mean sdc2 sdd2 ?
> md3: did you mean sda5 sdb5 sdc5 sdd5 ?

ARGH - never write to mailing lists without adequate amounts of coffee.
You're of course right, I messed up.
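
For reference, the corrected assignments can be written down and
sanity-checked. This is a minimal Python sketch, not something posted in
the thread: the check_layout helper is hypothetical and assumes plain sdXN
device names. It flags an array with two members on the same physical disk,
or a partition claimed by two arrays, which are the two mistakes in the
first draft.

```python
# Corrected layout as confirmed above: array -> (level, members, use).
LAYOUT = {
    "md0": ("raid1", ["sda1", "sdb1", "sdc1", "sdd1"], "/boot"),
    "md1": ("raid1", ["sda2", "sdb2"], "swap"),
    "md2": ("raid1", ["sdc2", "sdd2"], "swap"),
    "md3": ("raid5", ["sda5", "sdb5", "sdc5", "sdd5"], "/"),
}


def check_layout(layout):
    """Return a list of problems found in an array layout (hypothetical helper)."""
    problems = []
    owner = {}  # partition -> array that already uses it
    for md, (_level, members, _use) in layout.items():
        # "sda1" -> "sda"; assumes three-letter sdX disk names.
        disks = [member[:3] for member in members]
        if len(set(disks)) != len(disks):
            problems.append(f"{md}: two members on the same disk")
        for member in members:
            if member in owner:
                problems.append(f"{member} used by both {owner[member]} and {md}")
            owner[member] = md
    return problems


print(check_layout(LAYOUT))  # an empty list means no problems were found
```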

> And last, is it possible to build the system from the 
> beginning on RAID?

Yes :-)

> I'm using slackware. I see there is a section in the how-to for
> converting a red hat system after the fact but obviously it would be
> easier if I didn't have to do that.

Problem is, it depends on the installer / installation process. I know
that the Red Hat installer supports setting up raid, and I've often set up
Gentoo on raid, but it's been too long since I installed Slackware, so I
don't know if/how to go about installing right onto raid.

Bye, Martin



* Re: AW: Raid 1 vs 5 ?
  2004-05-07  5:52 ` John Lange
@ 2004-05-07 14:29   ` Maarten van den Berg
  2004-05-07 15:15     ` John Lange
  0 siblings, 1 reply; 12+ messages in thread
From: Maarten van den Berg @ 2004-05-07 14:29 UTC (permalink / raw)
  To: LinuxRaid

On Friday 07 May 2004 07:52, John Lange wrote:
> Thank you Martin. You have some really great insights there.
>
> A couple of things about the raid sets confused me:
> > Each disk partitioned alike:
> > 	1	30MB
> > 	2	1/2 size_of_swap_
> > 	5	rest_of_disk
> >
> > Now you can create mds on the disk:
> > 	md0	raid1 sda1 sda2 sda3 sda4
> > 	md1	raid1 sda1 sdb1
> > 	md2	raid1 sdc1 sdd1
> > 	md3	raid5 sda5 sdb5 sdc5 sdd5
>
> First, why do we skip sdx3 and sdx4 on each disk and go directly to sdx5
> for partition numbers?

That is the first number a logical partition gets, as opposed to primary.

> Second, I'm very confused by the way you divided up the raid sets....
> I'm thinking you erred? I'm such a newb it's possible I really don't
> understand what's going on so hopefully you can verify.

I concur, there are errors. This is probably what he meant:

> md0: did you mean sda1 sdb1 sdc1 sdd1 ?

Yes. It was obviously a typo.

> md1: did you mean sda2 sdb2 ?
> md2: did you mean sdc2 sdd2 ?

If it were me, I'd put the entire swap (not 1/2 size_of_swap) on a four-way
raid 1, just as with /boot. It's way simpler.

> md3: did you mean sda5 sdb5 sdc5 sdd5 ?
>
> And last, is it possible to build the system from the beginning on RAID?
> I'm using slackware. I see there is a section in the how-to for
> converting a red hat system after the fact but obviously it would be
> easier if I didn't have to do that.

What proved the easiest for me is installing the OS on a temporary scrap
hard disk, building the raid sets from there on the real target disks, and
copying the system over. There is a more complicated way too: it involves
setting failed-disk status on the drive holding your original data.

Maarten



* Re: AW: Raid 1 vs 5 ?
  2004-05-07 14:29   ` Maarten van den Berg
@ 2004-05-07 15:15     ` John Lange
  0 siblings, 0 replies; 12+ messages in thread
From: John Lange @ 2004-05-07 15:15 UTC (permalink / raw)
  To: LinuxRaid

Thank you all for your advice. Very helpful and informative.

I have one final followup.

As this RAID 5 array will consist of 4 serial ATA disks of nearly 80 GB
each, should I be using a larger chunk-size than 32? The software raid
how-to hints that it should be larger but doesn't go into detail on what
size should be used.

By my calculation this will be a 225G (4 disk) raid 5 array.

((4-1)* 75G) = 225G
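
The same arithmetic as a minimal Python sketch. The function name
usable_capacity is made up for illustration; sizes are in whatever unit you
pass in, and filesystem and md superblock overhead are ignored.

```python
def usable_capacity(level, n_disks, member_size):
    """Usable size of an array built from n_disks equal-sized members."""
    if level == "raid0":
        return n_disks * member_size        # striping only, no redundancy
    if level == "raid1":
        return member_size                  # every member holds a full copy
    if level == "raid5":
        if n_disks < 3:
            raise ValueError("raid5 needs at least 3 members")
        return (n_disks - 1) * member_size  # one member's worth goes to parity
    raise ValueError(f"unsupported level: {level}")


print(usable_capacity("raid5", 4, 75))  # 225, matching the figure above
print(usable_capacity("raid1", 4, 30))  # 30: a four-way mirror stores one copy
```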

Googling for it turned up all sorts of conflicting advice on
chunk-size.

Regards,

John Lange

-- 
John Lange
BigHostBox.com
(204) 885 0872



* Re: AW: Raid 1 vs 5 ?
  2004-05-07  5:10 AW: Raid 1 vs 5 ? Martin Bene
  2004-05-07  5:52 ` John Lange
@ 2004-06-09 13:27 ` Mauricio
  2004-06-09 14:57   ` Robin Bowes
  1 sibling, 1 reply; 12+ messages in thread
From: Mauricio @ 2004-06-09 13:27 UTC (permalink / raw)
  To: LinuxRaid

At 07:10 +0200 5/7/04, Martin Bene wrote:
>> Which one is the better choice and what are the trade offs? Or is
>> another configuration more sensible? I'm under the impression that you
>> shouldn't (can't?) boot from RAID 5.
>
> Depends very much on what you're going to do with the system - I've
> found a high performance impact of raid5 for database applications with
> frequent updates (where you end up with lots of small writes scattered

	I've always thought raid1 would be slower than, say, raid0+1 
or raid5.  I guess I am wrong then. =)

> all over the partition). If write speed isn't too important, the space
> savings may well make raid5 more attractive.

	What about raid0+1 vs raid5? What is the difference?  In my 
setup, I plan on using the raid just to store user data; the machine 
would boot (/, /usr, swap, /var) from an unraid disk.


* Re: AW: Raid 1 vs 5 ?
  2004-06-09 13:27 ` Mauricio
@ 2004-06-09 14:57   ` Robin Bowes
  2004-06-09 15:50     ` Robin Bowes
  2004-06-09 15:59     ` Guy
  0 siblings, 2 replies; 12+ messages in thread
From: Robin Bowes @ 2004-06-09 14:57 UTC (permalink / raw)
  To: Mauricio; +Cc: LinuxRaid

On Wed, June 9, 2004 14:27, Mauricio said:
> At 07:10 +0200 5/7/04, Martin Bene wrote:
>>> Which one is the better choice and what are the trade offs? Or is
>>> another configuration more sensible? I'm under the impression that you
>>> shouldn't (can't?) boot from RAID 5.
>>
>> Depends very much on what you're going to do with the system - I've
>> found a high performance impact of raid5 for database applications with
>> frequent updates (where you end up with lots of small writes scattered
>
> 	I've always thought raid1 would be slower than, say, raid0+1
> or raid5.  I guess I am wrong then. =)

Mauricio,

I'm not expert but...

You've got to look at what data is actually being read/written/transferred to get some
idea of performance. Bear in mind too that this can be different with hardware vs.
software raid.

Take RAID1 as an example. With hardware RAID, the data is written to the RAID card which
is responsible for writing a copy to each of the mirrored drives. With Software RAID,
the software must write each copy of the data to each mirrored disk. Assuming a
two-drive mirror then Software RAID will use twice as much bus bandwidth.

Now consider RAID5. Here, with a hardware controller all of the data is written to the
RAID card which in turn calculates parity and stripes the data over the disks. With
software RAID, the software calculates parity and writes the data across all the
mirrored drives. The only additional bus traffic for software RAID is the parity data.

Anyway, my point is that it is not always obvious which combination of RAID levels /
hardware / software is fastest. As Martin said, it also depends on how the system is
being used.

>> all over the partition). If write speed isn't too important, the space
>> savings may well make raid5 more attractive.
>
> 	What about raid0+1 vs raid5? What is the difference?  In my
> setup, I plan on using the raid just to store user data; the machine
> would boot (/, /usr, swap, /var) from an unraid disk.

I'm in the process of setting up something similar. In my case, I have 6 x 250GB SATA
drives which I will spread across two Promise controllers (3 per controller). I want to
use (software) RAID5 to maximise disk capacity but it is not possible to boot from
RAID5. My solution will be something like this:

(All disks partitioned the same)

Partition 1:   3GB
Partition 2:   247GB

I will create the following arrays:

Array 1:
/ (root file system)
3GB RAID1 array built from D1, D4, with D2 as spare.

Array 2:
swap
3GB RAID1 array built from D3, D6 with D5 as spare

Array 3:
everything else
1235GB RAID5 array built from all six disks.

I plan to do an initial install onto Array 1, then use EVMS or LVM to create logical
volumes on Array 3 onto which I will migrate stuff like /usr, /home, etc.

Partition 1 is sized at 3GB as I have 1.5GB of physical RAM and have therefore sized
swap at 2x this figure.

There are a few possible variations on this...

One esoteric variation I may play around with (just because I can!) is to create Array 1
from just two disks and create Array 2 as Raid0+1 from four partitions to give better
performance for swap. I'm not sure if this is even possible.

I could also keep Partition 1 to 1GB and let Linux stripe the swap over three disks
giving 3GB swap and a 1GB root partition. I'm not keen on this as the system will hang
if one of the swap disks fails (or so I'm led to believe).

Anyway, hope that gives you some food for thought.

R.
-- 
http://robinbowes.com



* Re: AW: Raid 1 vs 5 ?
  2004-06-09 14:57   ` Robin Bowes
@ 2004-06-09 15:50     ` Robin Bowes
  2004-06-09 15:59     ` Guy
  1 sibling, 0 replies; 12+ messages in thread
From: Robin Bowes @ 2004-06-09 15:50 UTC (permalink / raw)
  To: linux-raid

On Wed, June 9, 2004 15:57, Robin Bowes said:
>
> One esoteric variation I may play around with (just because I can!) is to create Array 1
> from just two disks and create Array 2 as Raid0+1 from four partitions to give better
> performance for swap. I'm not sure if this is even possible.

Of course, I could always just do what Martin suggested initially, i.e. create a couple
of RAID1 arrays and stripe the swap across them.

R.
-- 
http://robinbowes.com



* RE: AW: Raid 1 vs 5 ?
  2004-06-09 14:57   ` Robin Bowes
  2004-06-09 15:50     ` Robin Bowes
@ 2004-06-09 15:59     ` Guy
  2004-06-09 16:19       ` Robin Bowes
  2004-06-09 22:55       ` Neil Brown
  1 sibling, 2 replies; 12+ messages in thread
From: Guy @ 2004-06-09 15:59 UTC (permalink / raw)
  To: 'Robin Bowes', 'Mauricio'; +Cc: 'LinuxRaid'

You said:
"Now consider RAID5. Here, with a hardware controller all of the data is
written to the RAID card which in turn calculates parity and stripes the
data over the disks. With software RAID, the software calculates parity and
writes the data across all the mirrored drives. The only additional bus
traffic for software RAID is the parity data."

I believe this is wrong:
"The only additional bus traffic for software RAID is the parity data."

It is true if 100% of a stripe is being changed/written.  
If you update less than 100% of a stripe the software RAID must read the
full blocks being changed and the parity block.  Factor out the old data
from the parity then compute a new parity.  Then write the new blocks.

Example:
	Your array will have 6 disks.  You don't state your block size, so
let's assume 64K.  Your stripe size will be 5*64K or 320K.  Now if you were
to write 1 byte to your array this is what will happen:
Read the 64K block that contains the 1 byte.
Read the 64K parity block.
Factor out the 64K data block from the parity block.
Merge your 1 byte into the 64K data block.
Compute a new 64K parity block.
Write the new 64K block that contains your 1 byte.
Write the 64K parity block.

As you can see, your 1 byte requires reading 128K from 2 different disks, and
then writing 128K to the same 2 disks.

Now if you were to write 2 bytes and were unlucky, both bytes would be on 2
different data blocks, this is what would happen:
Read the 64K block that contains your first byte.
Read the other 64K block that contains your second byte.
Read the 64K parity block.
Factor out the 2 64K data blocks from the parity block.
Merge your first byte into the first 64K data block.
Merge your second byte into the second 64K data block.
Compute a new 64K parity block.
Write the new 64K block that contains your first byte.
Write the new 64K block that contains your second byte.
Write the 64K parity block.

As you can see, your 2 bytes require reading 192K from 3 different disks,
and then writing 192K to the same 3 disks.

I don't know how md really does this.  I have not looked at the code.
Another choice would be to read 100% of the stripe, apply your updates (1
byte in my example), then compute the parity, then write the changed blocks.
This would be simpler (I think) but on larger arrays (lots of disks) it
would cause much more disk IO.  In my first example it would read 320K and
write 128K.
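
Both approaches yield the same parity block, because RAID5 parity is the
XOR of the data blocks in a stripe. A small Python sketch of the two update
paths follows. It uses toy 8-byte blocks instead of real ones, the function
names are illustrative, and it is not md's code.

```python
from functools import reduce
from operator import xor


def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def parity_read_modify_write(old_parity, old_data, new_data):
    # Factor the old data out of the parity, then fold the new data in.
    return xor_blocks(old_parity, old_data, new_data)


def parity_reconstruct(data_blocks):
    # Recompute parity from every data block in the stripe.
    return xor_blocks(*data_blocks)


# 5 data blocks, as in a stripe of the 6-disk RAID5 discussed above.
stripe = [bytes([i] * 8) for i in range(1, 6)]
parity = parity_reconstruct(stripe)

# Change one byte in the third data block.
changed = bytearray(stripe[2])
changed[0] = 0xFF
new_block = bytes(changed)

rmw = parity_read_modify_write(parity, stripe[2], new_block)
stripe[2] = new_block

# Both paths must agree on the new parity.
assert rmw == parity_reconstruct(stripe)
```

The difference between the two paths is only in how many blocks have to be
read from disk first, not in the result.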

About your swap plans...
If you were going to use 4 disks, create 2 RAID1 arrays, give both to swap.
Swap will stripe across both arrays.  I believe this would give better
performance than creating a single RAID1/0 array.
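
The kernel interleaves swap across areas that share the same priority, so
the two arrays need equal pri= values. A hedged Python sketch that renders
the /etc/fstab entries: the device names /dev/md1 and /dev/md2 follow
Martin's layout earlier in the thread and are assumptions for any other
system, and swap_fstab_lines is a made-up helper.

```python
def swap_fstab_lines(devices, priority=1):
    """Render one fstab swap entry per device, all with the same priority."""
    return [f"{dev}\tnone\tswap\tsw,pri={priority}\t0 0" for dev in devices]


# Assumed device names; substitute the arrays that actually hold swap.
for line in swap_fstab_lines(["/dev/md1", "/dev/md2"]):
    print(line)
```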

Guy


* RE: AW: Raid 1 vs 5 ?
  2004-06-09 15:59     ` Guy
@ 2004-06-09 16:19       ` Robin Bowes
  2004-06-09 22:55       ` Neil Brown
  1 sibling, 0 replies; 12+ messages in thread
From: Robin Bowes @ 2004-06-09 16:19 UTC (permalink / raw)
  To: Guy; +Cc: 'Mauricio', 'LinuxRaid'

On Wed, June 9, 2004 16:59, Guy said:
> You said:
> "Now consider RAID5. Here, with a hardware controller all of the data is
> written to the RAID card which in turn calculates parity and stripes the
> data over the disks. With software RAID, the software calculates parity and
> writes the data across all the mirrored drives. The only additional bus
> traffic for software RAID is the parity data."
>
> I believe this is wrong:

[snip]

Like I also said:

>>
>> I'm not expert but...
>>

Hmm. I can't spell either! :o)

> About your swap plans...
> If you were going to use 4 disks, create 2 RAID1 arrays, give both to swap.
> Swap will stripe across both arrays.  I believe this would give better
> performance than creating a single RAID1/0 array.
>

I came to the same conclusion myself 30 secs after hitting "send".

Thanks for the suggestion.

R.
-- 
http://robinbowes.com



* RE: AW: Raid 1 vs 5 ?
  2004-06-09 15:59     ` Guy
  2004-06-09 16:19       ` Robin Bowes
@ 2004-06-09 22:55       ` Neil Brown
  2004-06-09 23:39         ` Guy
  1 sibling, 1 reply; 12+ messages in thread
From: Neil Brown @ 2004-06-09 22:55 UTC (permalink / raw)
  To: Guy; +Cc: 'Robin Bowes', 'Mauricio', 'LinuxRaid'

On Wednesday June 9, bugzilla@watkins-home.com wrote:
> You said:
> "Now consider RAID5. Here, with a hardware controller all of the data is
> written to the RAID card which in turn calculates parity and stripes the
> data over the disks. With software RAID, the software calculates parity and
> writes the data across all the mirrored drives. The only additional bus
> traffic for software RAID is the parity data."
> 
> I believe this is wrong:
> "The only additional bus traffic for software RAID is the parity data."
> 
> It is true if 100% of a stripe is being changed/written.  
> If you update less than 100% of a stripe the software RAID must read the
> full blocks being changed and the parity block.  Factor out the old data
> from the parity then compute a new parity.  Then write the new blocks.
> 
> Example:
> 	Your array will have 6 disks.  You don't state your block size, so
> let's assume 64K.  Your stripe size will be 5*64K or 320K.  Now if you were
> to write 1 byte to your array this is what will happen:
> Read the 64K block that contains the 1 byte.
> Read the 64K parity block.
> Factor out the 64K data block from the parity block.
> Merge your 1 byte into the 64K data block.
> Compute a new 64K parity block.
> Write the new 64K block that contains your 1 byte.
> Write the 64K parity block.

This is mostly correct, except that it won't be a 64k block.  It will
normally be a 4k block.  Your chunksize is irrelevant. 
In 2.6, md will do a PAGE_SIZE read/write, which is 4k on x86.
In 2.4, md will do read/writes that match the filesystem blocksize,
which is most often 4k these days.

> 
> As you can see, your 1 byte requires reading 128K from 2 different disks, and
> then writing 128K to the same 2 disks.

So that's 8k, twice.

> 
> I don't know how md really does this.  I have not looked at the code.
> Another choice would be to read 100% of the stripe, apply your updates (1
> byte in my example), then compute the parity, then write the changed
> blocks.

md sometimes does a "read-modify-write" cycle like your first example,
and sometimes does a "reconstruct-write" cycle like your second
example.  It chooses the option that generates the fewest IO requests.
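
A simplified model of that choice, as a Python sketch. This is not md's
actual code: it only counts the reads each strategy needs for one stripe,
assuming nothing is already cached. The function name and the tie-breaking
rule are assumptions.

```python
def plan_stripe_write(n_data_disks, blocks_written, block_size=4096):
    """Pick the cheaper parity update and return (strategy, bytes_read, bytes_written)."""
    if not 1 <= blocks_written <= n_data_disks:
        raise ValueError("blocks_written must be between 1 and n_data_disks")
    # Read-modify-write: read the old copy of each changed block plus old parity.
    rmw_reads = blocks_written + 1
    # Reconstruct-write: read every data block that is NOT being changed.
    rcw_reads = n_data_disks - blocks_written
    if rmw_reads <= rcw_reads:  # tie goes to read-modify-write (arbitrary)
        strategy, reads = "read-modify-write", rmw_reads
    else:
        strategy, reads = "reconstruct-write", rcw_reads
    writes = blocks_written + 1  # changed data blocks plus the parity block
    return strategy, reads * block_size, writes * block_size


# 6-disk RAID5 (5 data blocks per stripe), one 4k block changed:
print(plan_stripe_write(5, 1))  # ('read-modify-write', 8192, 8192)
# Full-stripe write: nothing needs to be read first.
print(plan_stripe_write(5, 5))  # ('reconstruct-write', 0, 24576)
```

The first call reproduces the "8k, twice" figure above: two 4k reads
followed by two 4k writes.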

NeilBrown


* RE: AW: Raid 1 vs 5 ?
  2004-06-09 22:55       ` Neil Brown
@ 2004-06-09 23:39         ` Guy
  0 siblings, 0 replies; 12+ messages in thread
From: Guy @ 2004-06-09 23:39 UTC (permalink / raw)
  To: 'Neil Brown'; +Cc: 'LinuxRaid'

Sorry I was somewhat wrong.  Thanks for setting me straight!
md is "smarter" than I thought.  Cool.

But now I have a question.
If md does 4k I/Os, is there a reason to create the array with larger blocks?

I have tried block sizes from 1k to maybe 256K and did not notice any real
difference.  My testing was very crude!  And I did not try every block size.
Not even sure I tried 4K.  64K seemed best on my system.

Thanks,
Guy


end of thread, other threads:[~2004-06-09 23:39 UTC | newest]

Thread overview: 12+ messages
2004-05-07  5:10 AW: Raid 1 vs 5 ? Martin Bene
2004-05-07  5:52 ` John Lange
2004-05-07 14:29   ` Maarten van den Berg
2004-05-07 15:15     ` John Lange
2004-06-09 13:27 ` Mauricio
2004-06-09 14:57   ` Robin Bowes
2004-06-09 15:50     ` Robin Bowes
2004-06-09 15:59     ` Guy
2004-06-09 16:19       ` Robin Bowes
2004-06-09 22:55       ` Neil Brown
2004-06-09 23:39         ` Guy
  -- strict thread matches above, loose matches on Subject: below --
2004-05-07  6:14 Martin Bene
