From mboxrd@z Thu Jan 1 00:00:00 1970
From: Bill Davidsen
Subject: Re: If your using large Sata drives in raid 5/6 ....
Date: Fri, 05 Feb 2010 12:40:22 -0500
Message-ID: <4B6C5806.9040108@tmr.com>
References: <87f94c371002021440o3b30414bk3a7ccf9d2fa9b8af@mail.gmail.com>
 <87f94c371002021446y38dce6fds6acca2b4919ad773@mail.gmail.com>
 <4B698365.1040007@anonymous.org.uk>
 <4B6C3B7D.2090502@tmr.com>
 <4B6C4E7F.2080501@anonymous.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
In-Reply-To: <4B6C4E7F.2080501@anonymous.org.uk>
Sender: linux-raid-owner@vger.kernel.org
To: John Robinson
Cc: Linux RAID
List-Id: linux-raid.ids

John Robinson wrote:
> On 05/02/2010 15:38, Bill Davidsen wrote:
>> John Robinson wrote:
> [...]
>>> What sums I've done, on the basis of a 1 in 10^15 bit unrecoverable
>>> error rate, suggest you've a 1 in 63 chance of getting an
>>> uncorrectable error while reading the whole surface of their 2TB
>>> disc. Read the whole disc 44 times and you've a 50/50 chance of
>>> hitting an uncorrectable error.
>>>
>> Rethink that, virtually all errors happen during write, reading is
>> non-destructive, in terms of what's on the drive. So it's valid after
>> write or it isn't, but having been written correctly, other than
>> failures in the media (including mechanical parts) or electronics,
>> the chances of "going bad" are probably vanishingly small.
>
> They're quite small, at 1 in 10^15 bits read. On 1GB discs, you
> probably could call it vanishingly small. But now with 1TB and larger
> discs, I wouldn't characterise it as vanishingly small. It's entirely
> on the basis of the given specs that I did my calculations.
>
> Bear in mind that the operation of the disc is now deliberately
> designed to use ECC all the time. Have a look at the vast numbers you
> get from the SMART data for ECC errors corrected. I just checked a
> 160GB single-platter disc with 4500 power-on hours; it quotes
> 200,000,000 hardware ECC errors recovered.

I don't know how to read the POH SMART reports for Seagate. I just
checked a server which has been up 167 days most recently, and up for
all but two weeks (moves and such) of the last four years. It shows 622
POH, and the others in the same RAID-5 array show times from 1600 to
470. Two report ECC rates of 50-60 million in four years, the other 6.
Yes, six. None show any relocates.

My set of WD 1TB drives showed no relocates in a year, and no errors
(the field may not be shown if it is zero).

I keep a table of MD5 sums for all significant files on the arrays, and
haven't seen an error in years. Since I do a "check" regularly, I know
all sectors are being read.

My main issue with your post was the "read 44 times", as explained in
another reply, not your original calculation.

--
Bill Davidsen
  "We can't solve today's problems by using the same thinking we
   used in creating them." - Einstein
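
For anyone who wants to check the quoted figures, here is a minimal Python sketch of the arithmetic. It assumes a 2TB disc holds 2 x 10^12 bytes and that every bit read is an independent trial at the quoted 1 in 10^15 rate. The function names are made up for illustration. Treating repeated reads as independent is exactly the assumption disputed in this thread, so the read-count result is only as good as that model.

```python
import math

BITS_PER_BYTE = 8


def p_error_full_read(capacity_bytes, bit_error_rate):
    """Chance of at least one unrecoverable error in one full-surface read.

    Assumes each bit fails independently at bit_error_rate.
    log1p/expm1 avoid the rounding loss of computing 1 - (1 - r)**n directly.
    """
    bits = capacity_bytes * BITS_PER_BYTE
    return -math.expm1(bits * math.log1p(-bit_error_rate))


def reads_for_probability(capacity_bytes, bit_error_rate, target):
    """Smallest number of full reads whose cumulative error chance >= target.

    Assumes every full read is an independent trial (the disputed assumption).
    """
    p = p_error_full_read(capacity_bytes, bit_error_rate)
    return math.ceil(math.log1p(-target) / math.log1p(-p))


if __name__ == "__main__":
    capacity = 2 * 10**12   # assumed: 2TB in decimal bytes
    rate = 1e-15            # quoted spec: 1 unrecoverable error per 10^15 bits
    p = p_error_full_read(capacity, rate)
    print("one full read: about 1 in %d" % round(1 / p))
    print("reads for 50/50:", reads_for_probability(capacity, rate, 0.5))
```

Under those assumptions the sketch gives about 1 in 63 for a single full read and 44 reads for even odds, which agrees with the quoted sums. It says nothing about whether the model fits real drives.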