From mboxrd@z Thu Jan 1 00:00:00 1970
From: Ed W
Subject: Re: 3TB drives failure rate
Date: Sun, 28 Oct 2012 16:47:29 +0000
Message-ID: <508D61A1.7020106@wildgooses.com>
References: <11510711257.20121028131527@oudeis.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: QUOTED-PRINTABLE
Return-path:
In-Reply-To: <11510711257.20121028131527@oudeis.org>
Sender: linux-raid-owner@vger.kernel.org
To: Rainer Fügenstein
Cc: linux-raid@vger.kernel.org
List-Id: linux-raid.ids

On 28/10/2012 12:15, Rainer Fügenstein wrote:
> when trying to upgrade my raid5 with 4 Western digital caviar green
> 3TB drives [WDC WD30EZRX-00MMMB0] (3 brandnew, 1 about 4months old),
> the "old" drive and one of the brand new ones failed with
> unrecoverable read errors and about 70 reallocated sectors each. the
> failures already occured during the initial resync after creating the
> raid.
>
> until now I was very fond of WD caviar green drives, but after this
> 50% failure rate I'm not very eager to restore data from the backup.
>
> what is your experience with 3TB drives, WD and others?
>
> (low power drives appreciated, performance is not an issue)
>

I think there is clearly serial correlation in drive failures and this
tends to cause people to have brand love/hate stories.

I bought 9x Samsung 2TB green things about 2 years back (to go in an 8x
NAS + 1 spare). I think I had to return 4 almost immediately due to
either out of box reallocation warning, or that appeared within 2
weeks. Probably if I hadn't been looking I wouldn't have noticed these
warnings and then been one of those groaning about Samsung when probably
they all expired within a few weeks of each other. The RMA'd drives
have all been fine and the whole array seems ok some years later (tested
weekly).

Note that I think I got 2x drives from a different supplier
(hence different batch), so that implies something like 4 out of 7 in a
given batch were "worrying", but the next 4 from a new batch showed no
obvious problems.

I think this fits with the idea that the spinning disk failure curve has
a bump in the first few weeks, then stays flat until some years later
when it peaks again...

My conclusion:
- RAID6 for data that is highly valuable (and where performance is
  acceptable).
- Thrash the drives initially for some weeks before you accept them into
  production.
- Although highly debated, I believe that failures are likely to be
  correlated in time: when one drive goes there is a high probability of
  losing others in the next 24 hours. Take precautions as you see fit,
  e.g. regular backups, hot/warm spares, etc.
- Green consumer drives are likely satisfactorily reliable for most
  uses, with the caveat that you accept they will fail catastrophically
  eventually (just like your enterprise drive will). We can debate the
  relative life of each, but it's almost certainly just a linear
  factor...

Good luck

Ed W
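
The failure curve described above (a bump in the first few weeks, a flat
stretch, then a second peak years later) is usually called a "bathtub
curve". As a rough illustration only, it can be sketched as the sum of a
falling Weibull hazard, a constant floor, and a rising Weibull hazard.
Every number here is invented to give the right shape; none of it is
measured drive data, and the function names are hypothetical.

```python
def weibull_hazard(t, shape, scale):
    """Weibull hazard rate h(t) = (k / lam) * (t / lam) ** (k - 1), t > 0.

    shape < 1 gives a falling rate (early failures),
    shape > 1 gives a rising rate (wear-out).
    """
    if t <= 0:
        raise ValueError("t must be positive")
    return (shape / scale) * (t / scale) ** (shape - 1)


def bathtub_hazard(t_weeks):
    """Illustrative failure rate per week for a drive of age t_weeks.

    All parameters are assumptions chosen for shape, not fitted values.
    """
    infant = weibull_hazard(t_weeks, shape=0.5, scale=200.0)    # early bump
    random_floor = 0.0005                                       # flat middle
    wear_out = weibull_hazard(t_weeks, shape=5.0, scale=300.0)  # late peak
    return infant + random_floor + wear_out


if __name__ == "__main__":
    for week in (1, 4, 52, 104, 208, 312):
        print(f"week {week:>3}: hazard ~ {bathtub_hazard(week):.5f} per week")
```

With a shape like this, the case for thrashing drives for a few weeks
before production is that it burns through the steep early part of the
curve while the data is still safe elsewhere.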