From: Bill Davidsen
Subject: Re: Software based SATA RAID-5 expandable arrays?
Date: Fri, 13 Jul 2007 14:18:13 -0400
Message-ID: <4697C1E5.80304@tmr.com>
To: Michael
Cc: Daniel Korstad, linux-raid

Michael wrote:
> RESPONSE
>
> I had everything working, but it is evident that when I installed SuSe
> the first time, check and repair were not included in the package :(
> I did not use ">>", I used ">", as was incorrectly stated in much of
> the documentation I set up from.
>
Doesn't matter, either will work and most people just use ">".

> The thing that made me suspect check and repair weren't part of SuSe
> was that typing "check" or "repair" at the command prompt produced
> nothing but a message that there was no such command.  In addition,
> man check and man repair were also missing.
>
One more time, "check" and "repair" are not commands, they are character
strings! You are using the echo command to write those strings into the
md control interface in sysfs. If you type exactly what people have sent
you, it will work.

> BROKEN!
>
> I did an auto update of the SuSe machine, which ended up replacing the
> kernel.  It added the new entries to the boot choices, but the mount
> information was not transferred.  SuSe also deleted the original
> kernel boot setup.  When SuSe looked at the drives individually, it
> found that none of them was recognizable.  Therefore, when I rebooted
> the machine this morning after the update, I got the errors and was
> dumped to a basic prompt with limited ability to do anything.  I know
> I need to manually remount the drives, but it's going to be a
> challenge since I have not done this before.  The answer is that I
> either have to change distros (which I am tempted to do) or fix the
> current one.  Please do not bother providing any solutions, for I
> simply have to RTFM (which I haven't had time to do).
>
> I think I am going to set my machines up again: the first two drives
> with identical boot partitions, but not mirrored.  I can then manually
> run a "tree" copy to update the second drive as I grow the system,
> after successful and needed updates.  That would give me a fallback
> after any update, by simply swapping the SATA cables from the first
> boot drive to the second.  I am assuming this will work.  I can then
> RAID-6 (or 5) the setup and recopy my files (yes, I haven't deleted
> them, because I am not confident in my ability with Linux yet).
> Hopefully I will just be able to remount the other 4 drives, because
> they're a simple RAID 5 array.
>
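You asked us not to bother with solutions for the boot mess, but the
RAID-5 part really is the easy bit: a healthy array comes back with one
assemble and one mount from a rescue prompt or a fresh install.  Roughly
like this (untested here; the member devices and mount point are only
examples, substitute your own):

   # let mdadm find the members by their superblocks
   mdadm --assemble --scan
   # or name them explicitly, e.g. if the array was md3
   mdadm --assemble /dev/md3 /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1
   mount /dev/md3 /mnt/data

Then cat /proc/mdstat will tell you whether the array came up clean.
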
> SUSE's COMPLETE FAILURES
>
> This frustration with SuSe, the lack of a simple reliable update
> utility, and the failures I have experienced have discouraged me from
> using SuSe at all.  It's got some amazing tools that keep me from
> constantly looking up documentation, posting to forums, or going to
> IRC, but the unreliable upgrade process is a deal breaker for me.
> It's simply too much work to manually update everything.  This project
> had a simple goal, which was to provide an easy and cheap solution for
> an unlimited NAS service.
>
> SUPPORT
>
> In addition, SuSe's IRC help channel is among the worst I have
> encountered.  The level of support is often very good, but the level
> of harassment, flames and simply childish behavior overwhelms almost
> any attempt at providing support.  I have no problem giving back to
> the community when I learn enough to do so, but I will not be mocked
> for my inability to understand a new and very in-depth system.  In
> fact, I tend to go to the wonderful Gentoo IRC for my answers.  That
> channel is amazing, the people patient and encouraging, and the level
> of knowledge the best I have experienced.  Outside the original
> incident, it has been an amazing resource.  I feel highly confident
> asking questions about RAID here, because I know you guys are actually
> RUNNING the kind of systems I am attempting to build.
>
> ----- Original Message ----
> From: Daniel Korstad
> To: big.green.jelly.bean
> Cc: davidsen; linux-raid
> Sent: Friday, July 13, 2007 11:22:45 AM
> Subject: RE: Software based SATA RAID-5 expandable arrays?
>
> To run it manually;
>
> echo check >> /sys/block/md0/md/sync_action
>
> then you can check the status with;
>
> cat /proc/mdstat
>
> Or continually watch it, if you want (kind of boring though :) );
>
> watch cat /proc/mdstat
>
> This will refresh every 2 sec.
>
> In my original email I suggested using a crontab so you don't need to
> remember to do this every once in a while.
>
> Run (I did this as root);
>
> crontab -e
>
> This will allow you to edit your crontab.  Now paste this command in
> there;
>
> 30 2 * * Mon echo check >> /sys/block/md0/md/sync_action
>
> If you want you can add comments.  I like to comment my stuff since I
> have lots of stuff in mine; just make sure the lines start with '#' so
> your system knows they are comments and not commands it should run;
>
> #check for bad blocks once a week (every Mon at 2:30am)
> #if bad blocks are found, they are corrected from parity information
>
> After you have put this in your crontab, write and quit with this
> command;
>
> :wq
>
> It should come back with this;
> [root@gateway ~]# crontab -e
> crontab: installing new crontab
>
> Now you can look at your cron table (without editing) with this;
>
> crontab -l
>
> It should return something like this, depending on whether you added
> comments and how you scheduled your command;
>
> #check for bad blocks once a week (every Mon at 2:30am)
> #if bad blocks are found, they are corrected from parity information
> 30 2 * * Mon echo check >> /sys/block/md0/md/sync_action
>
> For more info on crontab and the syntax for times (I just did a google
> and grabbed the first couple of links...);
> http://www.tech-geeks.org/contrib/mdrone/cron&crontab-howto.htm
> http://ubuntuforums.org/showthread.php?t=102626&highlight=cron
>
> Cheers,
> Dan.
>
> -----Original Message-----
> From: Michael [mailto:big_green_jelly_bean@yahoo.com]
> Sent: Thursday, July 12, 2007 5:43 PM
> To: Bill Davidsen; Daniel Korstad
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Software based SATA RAID-5 expandable arrays?
>
> SuSe uses its own version of cron, which is different from everything
> else I have seen, and the documentation is horrible.  However, they
> provide a wonderful X Windows utility that helps set jobs up... the
> problem I'm having is figuring out what to run.  When I try to run
> "/sys/block/md0/md/sync_action" at a prompt it gives a permission
> denied error even though I have su'd or am logged in as root.  Very
> annoying.  You mention check vs. repair... which brings me to my last
> issue in setting up this machine.  How do you send an email when a
> check fails, when SMART reports a problem, or when a RAID drive fails?
> How do you auto repair if the check fails?
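Two quick notes on that last paragraph.  The "permission denied" is
because you are trying to *execute* the sysfs file; it is not a program,
you write a string into it, which is all the echo does.  And for mail
when an array fails or degrades, mdadm's monitor mode will do it (smartd
can do the same for SMART warnings).  Roughly like this, run as root,
where the mail address is only a placeholder:

   # scrub the array: write the word into sysfs, don't run the file
   echo check > /sys/block/md0/md/sync_action
   cat /proc/mdstat

   # run mdadm as a monitoring daemon, mailing on fail/degraded events
   mdadm --monitor --scan --daemonise --mail=you@example.com

Most distros will also start the monitor for you at boot if you put a
MAILADDR line in /etc/mdadm.conf; for SMART mail, see the -m directive
in /etc/smartd.conf and the smartd man page.
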
> These are the last things I need to do for my Linux server to work
> right... after I get all of this done, I will change the boot to go to
> the command prompt instead of X Windows, and I will leave the machine
> in the corner of my room, hopefully not to be touched for as long as
> possible.
>
> ----- Original Message ----
> From: Bill Davidsen
> To: Daniel Korstad
> Cc: Michael; linux-raid@vger.kernel.org
> Sent: Wednesday, July 11, 2007 10:21:42 AM
> Subject: Re: Software based SATA RAID-5 expandable arrays?
>
> Daniel Korstad wrote:
>> You have lots of options.  This will be a lengthy response and will
>> give just some ideas for just some of the options...
>>
> Just a few thoughts below interspersed with your comments.
>
>> For my server, I started out with a single drive.  I later migrated
>> to a RAID 1 mirror (after having to deal with reinstalls after drive
>> failures, I wised up).  Since I already had an OS that I wanted to
>> keep, my RAID-1 setup was a bit more involved.  I followed this
>> migration guide to get there;
>> http://wiki.clug.org.za/wiki/RAID-1_in_a_hurry_with_grub_and_mdadm
>>
>> Since you are starting from scratch, it should be easier for you.
>> Most distros have an installer that will guide you through the
>> process.  When you get to hard drive partitioning, look for an
>> advanced option, a "review and modify partition layout" option, or
>> something similar; otherwise it might just guess at what you want,
>> and that would not be RAID.  In this advanced partition setup you
>> will be able to create your RAID.  First you make equal-size
>> partitions on both physical drives.  For example, carve out a 100M
>> partition on each of the two physical OS drives, then make a RAID 1
>> md0 from that pair of partitions and make it your /boot.  Do this
>> again for the other partitions you want to have RAIDed.  You can do
>> this for /boot, /var, /home, /tmp, /usr.  The separation can be nice:
>> if a user fills /home/foo with crap it will not affect other parts of
>> the OS, and if the mail spool fills up, it will not hang the OS.  The
>> only problem is
>>
> determining how big to make them during the install.  At a minimum, I
> would do three partitions: /boot, swap, and /.  This means all the
> others (/var, /home, /tmp, /usr) live in the / partition, but this way
> you don't have to worry about sizing them all correctly.
>
>> For the simplest setup, I would do RAID 1 for /boot (md0), swap
>> (md1), and / (md2).  (Alternatively, you could make a swap file in /
>> and not have a swap partition; tons of options...)  Do you need to
>> RAID your swap?  Well, I would RAID it or make a swap file within a
>> RAID partition.  If you don't, and your system is using swap, and you
>> lose a drive that has swap information on it, you might have issues
>> depending on how important the information on the failed drive was.
>> Your system might hang.
>>
> Note that RAID-10 generally performs better than mirroring,
> particularly when more than a few drives are involved.  This can have
> performance implications for swap, when large i/o pushes program pages
> out of memory.  The other side of that coin is that "recovery CDs"
> don't seem to know how to use RAID-10 swap, which might be an issue on
> some systems.
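If you do go the RAID-10 route for swap, it is only a few commands.
Something like this sketch, where the partition names are just examples
from a two-drive layout:

   # md RAID-10 works fine on two devices; add --layout=f2 if you want
   # the far layout's faster sequential reads
   mdadm --create /dev/md1 --level=10 --raid-devices=2 /dev/sda2 /dev/sdb2
   mkswap /dev/md1
   swapon /dev/md1

   # and in /etc/fstab:
   /dev/md1   none   swap   sw   0 0
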
>
>> After you go through the install and have a bootable OS running on
>> mdadm RAID, I would test it to make sure grub was installed correctly
>> to both physical drives.  If grub is not installed to both drives,
>> and down the road you lose the one drive that has grub, you will have
>> a system that will not boot even though the second drive holds a copy
>> of all the files.  If this were to happen, you can recover by booting
>> a live Linux CD or recovery disk and installing grub manually.  For
>> example, say you only had grub installed to hda and it failed; boot
>> with a live Linux CD and type (assuming /dev/hdd is the surviving
>> second drive);
>> grub
>> device (hd0) /dev/hdd
>> root (hd0,0)
>> setup (hd0)
>> quit
>>
>> You say you are using two 500G drives for the OS.  You don't
>> necessarily have to use all the space for the OS.  You can make your
>> partitions, take the leftover space, and throw it into a logical
>> volume.  This logical volume would not be fault tolerant, but it
>> would be the sum of the leftover capacity of both drives.  For
>> example, you use 100M for /boot, 200G for /, and 2G for swap.  Make a
>> standard ext3 partition from the remaining space on each drive and
>> put the pair in a logical volume, giving you over 500G to play with
>> for non-critical crap.
>>
>> Why do I use RAID6?  For the extra redundancy, and I have 10 drives
>> in my array.
>> I have been an advocate of RAID 6, especially with ever-increasing
>> drive capacities, once the number of drives in the array is above,
>> say, six;
>> http://www.intel.com/technology/magazine/computing/RAID-6-0505.htm
>>
> Other configurations will perform better for writes; know your i/o
> performance requirements.
>
>> http://storageadvisors.adaptec.com/2005/10/13/raid-5-pining-for-the-fjords/
>> "...for using RAID-6, the single biggest reason is based on the
>> chance of drive errors during an array rebuild after just a single
>> drive failure. Rebuilding the data on a failed drive requires that
>> all the other data on the other drives be pristine and error free. If
>> there is a single error in a single sector, then the data for the
>> corresponding sector on the replacement drive cannot be
>> reconstructed. Data is lost. In the drive industry, the measurement
>> of how often this occurs is called the Bit Error Rate (BER). Simple
>> calculations will show that the chance of data loss due to BER is
>> much greater than all the other reasons combined. Also, PATA and SATA
>> drives have historically had much greater BERs, i.e., more bit errors
>> per drive, than SCSI and SAS drives, causing some vendors to
>> recommend RAID-6 for SATA drives if they're used for mission critical
>> data."
>>
>> Since you are using only four drives for your data array, the
>> overhead of RAID6 (two drives for parity) might not be worth it.
>>
>> With four drives you would be just fine with a RAID5.
>> However, I would make a cron job to run the check every once in a
>> while.  Add this to your crontab...
>>
>> #check for bad blocks once a week (every Mon at 2:30am)
>> #if bad blocks are found, they are corrected from parity information
>> 30 2 * * Mon echo check >> /sys/block/md0/md/sync_action
>>
>> With this, you will keep hidden bad blocks to a minimum, and when a
>> drive fails you won't likely be bitten by hidden bad blocks during a
>> rebuild.
>>
> I think a comment on "check" vs. "repair" is appropriate here. At the
> least "see the man page" is appropriate.
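To save a trip to the man page: writing "check" makes md read every
block, fix any unreadable sectors from the redundancy, and count (but
not correct) inconsistencies between the copies or parity; "repair" does
the same and rewrites the inconsistencies as well.  Roughly, assuming
the array is md0:

   echo check > /sys/block/md0/md/sync_action    # read-only scrub, fixes read errors
   cat /sys/block/md0/md/mismatch_cnt            # non-zero means inconsistencies were found
   echo repair > /sys/block/md0/md/sync_action   # scrub and rewrite inconsistent parity/copies
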
>
>> For your data array, I would make one partition of type Linux raid
>> autodetect (FD), one whole-drive partition on each physical drive.
>> Then create your RAID.
>>
>> mdadm --create /dev/md3 -l 5 -n 4 /dev/<drive1> /dev/<drive2> /dev/<drive3> /dev/<drive4>
>> The /dev/md3 can be whatever you want, and will depend on how many
>> previous RAID arrays you have, so long as you use a number not
>> currently in use.
>>
>> My filesystem of choice is XFS, but you get to pick your own poison:
>> mkfs.xfs -f /dev/md3
>>
>> Mount the device:
>> mount /dev/md3 /foo
>>
>> I would edit your /etc/fstab to have it automounted at each startup.
>>
>> Dan.
>>
> Other misc comments: mirroring your boot partition on drives which the
> BIOS won't use is a waste of bytes. If you have more than, say, four
> drives fail to function you probably have a system problem other than
> disk. And some BIOS versions will boot a secondary drive if the
> primary fails hard, but not if it has a parity or other error, which
> can enter a retry loop (I *must* keep trying to boot). This behavior
> can be seen on at least one major server line from a big-name vendor;
> it's not just cheap desktops. The solution, ugly as it is, is to use
> the firmware "RAID" on the motherboard controller for boot, and for
> this reason I have several systems with low-cost small PATA drives
> mirrored just for boot (after which they are spun down with hdparm
> settings).
>
> Really good notes, people should hang onto them!

-- 
bill davidsen
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979