From: Bill Davidsen
Subject: Re: raid5 software vs hardware: parity calculations?
Date: Tue, 16 Jan 2007 00:06:31 -0500
Message-ID: <45AC5D57.50001@tmr.com>
References: <2A887D754684B6703B52E126@emerald.sei.cmu.edu> <45A917B8.2060706@tmr.com> <45AB9DC6.50509@tmr.com> <45ABAA43.2000902@robinbowes.com> <45AC1DD9.9070402@panix.com>
In-Reply-To: <45AC1DD9.9070402@panix.com>
To: berk walker
Cc: dean gaudet, Robin Bowes, linux-raid@vger.kernel.org

berk walker wrote:
> dean gaudet wrote:
>> On Mon, 15 Jan 2007, Robin Bowes wrote:
>>
>>> I'm running RAID6 instead of RAID5+1 - I've had a couple of
>>> instances where a drive has failed in a RAID5+1 array and a second
>>> has failed during the rebuild after the hot-spare had kicked in.
>>
>> if the failures were read errors without losing the entire disk (the
>> typical case) then new kernels are much better -- on a read error md
>> will reconstruct the sectors from the other disks and attempt to
>> write them back.
>>
>> you can also run monthly "checks"...
>>
>> echo check >/sys/block/mdX/md/sync_action
>>
>> it'll read the entire array (parity included) and correct read
>> errors as they're discovered.
>>
>> -dean
>
> Could I get a pointer as to how I can do this "check" on my FC5
> [BLAG] system? I can find no appropriate "check", nor "md", available
> to me. It would be a "good thing" if I were able to find potentially
> weak spots, rewrite them to good sectors, and know that it might be
> time for a new drive.

Grab a recent mdadm source; it's part of that. (There's a concrete
sketch at the end of this message.)

> All of my arrays have drives of approx the same mfg date, so the
> possibility of more than one going bad at the same time cannot be
> ignored.

Never can, but it is highly unlikely, given the MTBF of modern drives.
And when you consider total drive failures, as opposed to bad sectors,
the odds get even smaller (rough numbers below). There is no perfect
way to avoid ever losing data, only ways to reduce the chance,
balancing the cost of data loss against the cost of hardware. Current
Linux will rewrite bad sectors; whole-drive failures are an argument
for spares.
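To make dean's "check" concrete: a minimal sketch, assuming a kernel
new enough to expose the md sysfs interface (sync_action appeared
around 2.6.16, so check what your FC5 box is actually running).
Dropped into /etc/cron.monthly, something like this scrubs every
array once a month:

  #!/bin/sh
  # Kick off a scrub of every md array: md reads all members, parity
  # included, and rewrites any unreadable sectors it finds.
  for f in /sys/block/md*/md/sync_action; do
      [ -e "$f" ] || continue   # no arrays, or kernel lacks the interface
      echo check > "$f"
  done

When it finishes, "cat /sys/block/mdX/md/mismatch_cnt" reports how
many inconsistent stripes were found; zero means the array checked
out clean.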
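And on the odds of a double failure, a back-of-the-envelope sketch
with made-up numbers (5% annual failure rate per drive, a one-day
rebuild window, three surviving drives). Drives from the same
manufacturing batch fail in correlated ways -- exactly your worry --
so treat this as a floor, not a guarantee:

  awk 'BEGIN {
      afr    = 0.05       # assumed annual failure rate per drive
      window = 24 / 8760  # one-day rebuild, as a fraction of a year
      n      = 3          # surviving drives that must outlive the rebuild
      printf "chance of a second failure during rebuild: ~%.5f\n",
             1 - (1 - afr * window) ^ n
  }'

That prints roughly 0.00041 -- small, but not zero, which is why
spares (or RAID6) are still worth having.

-- 
bill davidsen
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979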