From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bill Davidsen
Subject: Re: Software based SATA RAID-5 expandable arrays?
Date: Thu, 12 Jul 2007 23:54:13 -0400
Message-ID: <4696F765.2040403@tmr.com>
References: <190877.16439.qm@web54104.mail.re2.yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-7; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
In-Reply-To: <190877.16439.qm@web54104.mail.re2.yahoo.com>
Sender: linux-raid-owner@vger.kernel.org
To: Michael
Cc: Daniel Korstad, linux-raid@vger.kernel.org
List-Id: linux-raid.ids

Michael wrote:
> SuSE uses its own version of cron which is different than everything else I have seen, and the documentation is horrible. However, they provide a wonderful X Windows utility that helps set them up... the problem I'm having is figuring out what to run. When I try to run "/sys/block/md0/md/sync_action" at a prompt it shoots out a permission denied, even though I have su'd or am logged in as root. Very annoying. You mention check vs. repair... which brings me to my last issue in setting up this machine. How do you send an email when a check fails, when SMART reports a problem, or when a RAID drive fails? How do you auto-repair if the check fails?
>
The command is echo! As in

   echo check >/sys/block/md0/md/sync_action

Read the man page on what happens if you echo "repair" instead of "check" there, which might be more what you want to do. Only you can decide.
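To make that concrete, here is a rough sketch of a check/repair cycle; the device names and addresses below are only examples, so adjust them to your setup:

   # start a background consistency check; progress shows up in /proc/mdstat
   echo check > /sys/block/md0/md/sync_action
   # when it completes, see how many inconsistent sectors were found
   cat /sys/block/md0/md/mismatch_cnt
   # if that is non-zero, have md rewrite the inconsistent stripes
   echo repair > /sys/block/md0/md/sync_action

As for the email question: mdadm can mail you itself when a drive fails if you put a MAILADDR line in /etc/mdadm.conf and run it in monitor mode, and smartd (from the smartmontools package) can do the same for SMART errors:

   # in /etc/mdadm.conf:    MAILADDR you@example.com
   mdadm --monitor --scan --daemonise
   # in /etc/smartd.conf:   /dev/sda -a -m you@example.com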
> These are the last things I need to do for my Linux server to work right... after I get all of this done, I will change the boot to go to the command prompt instead of X Windows, and I will leave it in the corner of my room, hopefully not to be touched for as long as possible.
>
> ----- Original Message ----
> From: Bill Davidsen
> To: Daniel Korstad
> Cc: Michael; linux-raid@vger.kernel.org
> Sent: Wednesday, July 11, 2007 10:21:42 AM
> Subject: Re: Software based SATA RAID-5 expandable arrays?
>
> Daniel Korstad wrote:
>> You have lots of options. This will be a lengthy response and will give some ideas for just some of the options...
>>
> Just a few thoughts below interspersed with your comments.
>
>> For my server, I started out with a single drive. I later migrated to a RAID 1 mirror (after having to deal with reinstalls following drive failures, I wised up). Since I already had an OS that I wanted to keep, my RAID-1 setup was a bit more involved. I followed this migration guide to get there:
>> http://wiki.clug.org.za/wiki/RAID-1_in_a_hurry_with_grub_and_mdadm
>>
>> Since you are starting from scratch, it should be easier for you. Most distros have an installer that will guide you through the process. When you get to hard drive partitioning, look for an "advanced" option, a "review and modify partition layout" option, or something similar; otherwise it might just guess at what you want, and that would not be RAID. In this advanced partition setup you will be able to create your RAID. First make equal-sized partitions on both physical drives. For example, carve out a 100M partition on each of the two physical OS drives, then make a RAID 1 md0 from that pair of partitions and make it your /boot. Do this again for the other partitions you want RAIDed. You can do this for /boot, /var, /home, /tmp, /usr. This separation can be nice in case a user fills /home/foo with crap, since it will not affect other parts of the OS, or if the mail spool fills up, it will not hang the OS. Only problem is
> determining how big to make them during the install. At a minimum, I would do three partitions: /boot, swap, and /. This means all the others (/var, /home, /tmp, /usr) are in the / partition, but this way you don't have to worry about sizing them all correctly.
>
>> For the simplest setup, I would do RAID 1 for /boot (md0), swap (md1), and / (md2). (Alternatively, you could make a swap file in / and not have a swap partition; tons of options...) Do you need to RAID your swap? Well, I would RAID it or make a swap file within a RAID partition. If you don't, and your system is using swap, and you lose a drive that has swap information/a swap partition on it, you might have issues depending on how important the information on the failed drive was. Your system might hang.
>>
> Note that RAID-10 generally performs better than mirroring, particularly when more than a few drives are involved. This can have performance implications for swap, when large i/o pushes program pages out of memory. The other side of that coin is that "recovery CDs" don't seem to know how to use RAID-10 swap, which might be an issue on some systems.
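(For reference, the OS mirrors Daniel describes can also be created by hand after the install. A minimal sketch, assuming the two drives are /dev/sda and /dev/sdb and that the matching partition pairs already exist, with their type set to FD (Linux raid autodetect):

   mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/sdb1   # /boot
   mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2   # swap
   mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3   # /
   mkswap /dev/md1

The distro installer does essentially the same thing behind the scenes when you build the arrays in its partitioning screen.)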
>> After you go through the install and have a bootable OS running on mdadm RAID, I would test it to make sure grub was installed correctly on both physical drives. If grub is not installed on both drives, and down the road you lose the one drive that had grub, you will have a system that will not boot even though the second drive has a copy of all the files. If this were to happen, you can recover by booting with a live Linux CD or rescue disk and manually installing grub. For example, say you only had grub installed on hda and it failed; boot with a live Linux CD and type (assuming /dev/hdd is the surviving second drive):
>> grub
>> device (hd0) /dev/hdd
>> root (hd0,0)
>> setup (hd0)
>> quit
>> You say you are using two 500G drives for the OS. You don't necessarily have to use all the space for the OS. You can make your partitions, take the leftover space, and throw it into a logical volume. This logical volume would not be fault tolerant, but it would be the sum of the leftover capacity from both drives. For example, you use 100M for /boot, 200G for /, and 2G for swap. Make a standard ext3 partition out of the remaining space on each drive and put the pair in a logical volume, giving you over 500G to play with for non-critical crap.
>>
>> Why do I use RAID6? For the extra redundancy, and I have 10 drives in my array.
>> I have been an advocate for RAID 6, especially with ever-increasing drive capacities and when the number of drives in the array is above, say, six:
>> http://www.intel.com/technology/magazine/computing/RAID-6-0505.htm
>>
> Other configurations will perform better for writes; know your i/o performance requirements.
>
>> http://storageadvisors.adaptec.com/2005/10/13/raid-5-pining-for-the-fjords/
>> "...for using RAID-6, the single biggest reason is based on the chance of drive errors during an array rebuild after just a single drive failure. Rebuilding the data on a failed drive requires that all the other data on the other drives be pristine and error free. If there is a single error in a single sector, then the data for the corresponding sector on the replacement drive cannot be reconstructed. Data is lost. In the drive industry, the measurement of how often this occurs is called the Bit Error Rate (BER). Simple calculations will show that the chance of data loss due to BER is much greater than all the other reasons combined. Also, PATA and SATA drives have historically had much greater BERs, i.e., more bit errors per drive, than SCSI and SAS drives, causing some vendors to recommend RAID-6 for SATA drives if they're used for mission critical data."
>>
>> Since you are using only four drives for your data array, the overhead for RAID6 (two drives for parity) might not be worth it.
>>
>> With four drives you would be just fine with a RAID5.
>> However, I would set up a cron job to run the check every once in a while. Add this to your crontab...
>>
>> # check for bad blocks once a week (every Mon at 2:30am); if bad blocks are found, they are corrected from parity information
>> 30 2 * * Mon echo check > /sys/block/md0/md/sync_action
>>
>> With this, you will keep hidden bad blocks to a minimum, and when a drive fails you won't likely be bitten by hidden bad blocks during a rebuild.
>>
> I think a comment on "check" vs. "repair" is appropriate here. At the least, "see the man page" is appropriate.
>
>> For your data array, I would make one partition of type Linux raid (FD) spanning the whole drive on each physical drive. Then create your RAID:
>>
>> mdadm --create /dev/md3 -l 5 -n 4 /dev/<disk1> /dev/<disk2> /dev/<disk3> /dev/<disk4>   <--- the /dev/md3 can be whatever you want and will depend on how many previous RAID arrays you have, so long as you use a number not currently in use.
>>
>> My filesystem of choice is XFS, but you get to pick your own poison:
>> mkfs.xfs -f /dev/md3
>>
>> Mount the device:
>> mount /dev/md3 /foo
>>
>> I would edit your /etc/fstab to have it automounted at each startup.
>>
>> Dan.
>>
> Other misc comments: mirroring your boot partition on drives which the BIOS won't use is a waste of bytes. If you have more than, say, four drives fail to function, you probably have a system problem other than disk. And some BIOS versions will boot a secondary drive if the primary fails hard, but not if it has a parity or other error, which can enter a retry loop (I *must* keep trying to boot). This behavior can be seen on at least one major server line from a big-name vendor; it's not just cheap desktops. The solution, ugly as it is, is to use the firmware "RAID" on the motherboard controller for boot, and I have several systems with low-cost small PATA drives in a mirror just for boot (after which they are spun down with hdparm settings) for this reason.
>
> Really good notes, people should hang onto them!
>
--
bill davidsen
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979
-
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html