Re: Questions about bitrot and RAID 5/6

From: Phil Turmel <philip@turmel.org>
To: Chris Murphy <lists@colorremedies.com>,
	"linux-raid@vger.kernel.org List" <linux-raid@vger.kernel.org>
Subject: Re: Questions about bitrot and RAID 5/6
Date: Thu, 23 Jan 2014 13:53:42 -0500
Message-ID: <52E16536.2070608@turmel.org>
In-Reply-To: <DE020E0C-E6EC-48E9-8D7B-09F5A65A2DF5@colorremedies.com>

Hi Chris,

On 01/23/2014 12:28 PM, Chris Murphy wrote:
> It's a fair point. I've recently run across some claims on a separate
> forum with hardware raid5 arrays containing all enterprise drives,
> with regular scrubs, yet with such excessive implosions that some
> integrators have moved to raid6 and completely discount the use of
> raid5. The use case is video production. This sounds suspiciously
> like microcode or raid firmware bugs to me. I just don't see how ~6-8
> enterprise drives in a raid5 translates into significantly higher
> array collapses that then essentially vanish when it's raid6.

I just wanted to address this one point.  Raid6 is many orders of
magnitude more robust than raid5 in the rebuild case.  Let me illustrate:

How to lose data in a raid5:

1) Experience unrecoverable read errors on two of the N drives at the
same *time* and same *sector offset* of the two drives.  Absurdly
improbable.  On the order of 1x10^-36 for 1T consumer-grade drives.

2a) Experience hardware failure on one drive followed by 2b) an
unrecoverable read error in another drive.  You can expect a hardware
failure rate of a few percent per year.  Then, when rebuilding on the
replacement drive, the odds skyrocket.  On large arrays, the odds of
data loss are little different from the odds of a hardware failure in
the first place.
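
To put a rough number on case 2b, here is a minimal sketch, not a
definitive model. It assumes independent bit errors at a constant rate of
1 in 10^14 bits read (a figure commonly quoted on consumer drive
datasheets, not one taken from this message), and that a rebuild reads
every bit of every surviving drive exactly once. The function name and
the 8-drive example are made up for illustration.

```python
import math

def rebuild_ure_probability(surviving_drives, drive_bytes, ure_per_bit=1e-14):
    """Probability of at least one unrecoverable read error (URE) while
    reading every bit of every surviving drive once.

    Assumes independent bit errors at a constant rate, which is a
    simplification of how real drives fail.
    """
    bits_read = surviving_drives * drive_bytes * 8
    # 1 - (1 - p)^bits, computed stably for tiny p via log1p/expm1
    return -math.expm1(bits_read * math.log1p(-ure_per_bit))

if __name__ == "__main__":
    # Hypothetical 8-drive raid5 with one drive failed: 7 drives must be read.
    for size_tb in (1, 2, 4):
        p = rebuild_ure_probability(7, size_tb * 10**12)
        print(f"{size_tb} TB drives: P(URE during rebuild) ~ {p:.2f}")
```

Under these assumptions the chance of hitting a read error grows quickly
with drive size and drive count, which is the sense in which the odds
"skyrocket" during a rebuild.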

How to lose data in a raid6:

1) Experience unrecoverable read errors on *three* of the N drives at
the same *time* and same *sector offset* of the drives.  Even more
absurdly improbable.  On the order of 1x10^-58 for 1T consumer-grade drives.

2) Experience hardware failure on one drive followed by unrecoverable
read errors on two of the remaining drives at the same *time* and same
*sector offset* of the two drives.  Again, absurdly improbable.  Same as
for the raid5 case "1".

3) Experience hardware failure on two drives followed by an
unrecoverable read error in another drive.  As with raid5 on large
arrays, you probably can't complete the rebuild error-free.  But the
odds of this event are subject to management--quick response to case "2"
greatly reduces the odds of case "3".
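
The effect of response time on case 3 can be sketched as follows. This is
an illustration under stated assumptions, not a measured result: drives
fail independently at a constant rate, the 3% annual failure rate is just
one value within "a few percent per year", and the function name and the
8-drive example are hypothetical.

```python
import math

def second_failure_probability(remaining_drives, days_degraded, afr=0.03):
    """Probability that at least one of the remaining drives fails while
    the array is degraded.

    Assumes independent drives with a constant failure rate derived from
    the annual failure rate (afr); real failures are often correlated.
    """
    hazard_per_year = -math.log1p(-afr)  # constant-rate equivalent of afr
    exposure_years = remaining_drives * days_degraded / 365.0
    return -math.expm1(-hazard_per_year * exposure_years)

if __name__ == "__main__":
    # Hypothetical 8-drive raid6 with one drive already failed: 7 remain.
    for days in (1, 7, 30):
        p = second_failure_probability(7, days)
        print(f"{days:2d} days degraded: P(second failure) ~ {p:.4f}")
```

In this model the risk scales almost linearly with the time spent
degraded, so replacing the first failed drive within a day rather than a
month cuts the exposure to case 3 by roughly a factor of thirty.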

It is no accident that raid5 is becoming much less popular.

Phil
