Re: mismatch_cnt questions

From: "H. Peter Anvin" <hpa@zytor.com>
To: "H. Peter Anvin" <hpa@zytor.com>
Cc: Eyal Lebedinsky <eyal@eyal.emu.id.au>, Neil Brown <neilb@suse.de>,
	Christian Pernegger <pernegger@gmail.com>,
	linux-raid@vger.kernel.org
Subject: Re: mismatch_cnt questions
Date: Wed, 07 Mar 2007 23:00:31 -0800
Message-ID: <45EFB48F.3050101@zytor.com>
In-Reply-To: <45EFAE65.9050608@zytor.com>

H. Peter Anvin wrote:
> Eyal Lebedinsky wrote:
>> Neil Brown wrote:
>> [trim Q re how resync fixes data]
>>> For raid1 we 'fix' an inconsistency by arbitrarily choosing one copy
>>> and writing it over all other copies.
>>> For raid5 we assume the data is correct and update the parity.
>>
>> Can raid6 identify the bad block (two parity blocks could allow this
>> if only one block has bad data in a stripe)? If so, does it?
>>
>> This will surely mean more value for raid6 than just the two-disk-failure
>> protection.
>>
> 
> No.  It's not mathematically possible.
> 

Okay, I've thought about it, and I got it wrong the first time 
(off-the-cuff misapplication of the pigeonhole principle.)

It apparently *is* possible (for notation and algebra rules, see my paper):

Let's assume we know that exactly one of the data (Dn) drives is corrupt
(ignoring the case of P or Q corruption for now.)  That means that
instead of Dn we read a corrupt value, Xn.  Note that which data drive
is corrupt (n) is not known.

Let P' and Q' be the syndromes computed over the corrupt set.

P+P' = Dn+Xn
Q+Q' = g^n Dn + g^n Xn		g = {02}

Q+Q' = g^n (Dn+Xn)

By assumption, Dn != Xn, so P+P' = Dn+Xn != {00}.
g^n is *never* {00}, so Q+Q' = g^n (Dn+Xn) != {00}.

(Q+Q')/(P+P') = [g^n (Dn+Xn)]/(Dn+Xn) = g^n

Since n is known to be in the range [0,255), we thus have:

n = log_g((Q+Q')/(P+P'))

... which is a well-defined relation.
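
The derivation above can be sketched in a few lines of Python. This is
only an illustration, not the md driver's code: it assumes the field
polynomial x^8 + x^4 + x^3 + x^2 + 1 (0x11d) with g = {02}, works on a
single byte position, and the helper names (gf_mul, syndromes,
locate_bad_data_drive) are made up for this example.

```python
# GF(2^8) arithmetic, assuming the RAID-6 field polynomial
# x^8 + x^4 + x^3 + x^2 + 1 (0x11d) and generator g = {02}.
EXP = [0] * 255   # EXP[i] = g^i
LOG = [0] * 256   # LOG[g^i] = i (LOG[0] is never used)
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1               # multiply by g = {02}
    if x & 0x100:
        x ^= 0x11d        # reduce modulo the field polynomial

def gf_mul(a, b):
    """Multiply two field elements via the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]

def syndromes(data):
    """Return (P, Q) for one byte position across the data drives."""
    p = q = 0
    for n, d in enumerate(data):
        p ^= d                      # P = sum of Dn
        q ^= gf_mul(EXP[n], d)      # Q = sum of g^n Dn
    return p, q

def locate_bad_data_drive(data, p, q):
    """Hypothetical helper: n = log_g((Q+Q')/(P+P')).

    Only meaningful under the assumption that exactly one data
    drive is corrupt; returns None if P+P' or Q+Q' is {00}.
    """
    p2, q2 = syndromes(data)
    dp, dq = p ^ p2, q ^ q2
    if dp == 0 or dq == 0:
        return None
    return (LOG[dq] - LOG[dp]) % 255

good = [0x11, 0x22, 0x33, 0x44, 0x55]
p, q = syndromes(good)              # stored P and Q

bad = list(good)
bad[3] ^= 0x5a                      # corrupt drive 3: Xn = Dn + {5a}

n = locate_bad_data_drive(bad, p, q)
p_bad, _ = syndromes(bad)
repaired = bad[n] ^ p ^ p_bad       # Dn = Xn + (P+P')
print(n, repaired == good[3])       # prints "3 True"
```

The last step uses P+P' = Dn+Xn from above: once n is known, adding
P+P' to the corrupt byte recovers the original one.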

For the case where either the P or the Q drive is corrupt (and the
data drives are all good), this is easily detected: if P is the corrupt
drive, Q+Q' = {00}; similarly, if Q is the corrupt drive, P+P' = {00}.
Obviously, if P+P' = Q+Q' = {00}, then as far as RAID-6 can discover,
there is no corruption in the drive set.
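
Putting the cases together, the decision for one byte position could be
sketched in Python as follows. The function name classify and its return
values are invented for this illustration, the field polynomial 0x11d
with g = {02} is an assumption, and the result is only valid if at most
one drive is corrupt at that position.

```python
def build_log_table():
    """LOG[g^i] = i in GF(2^8), assuming polynomial 0x11d and g = {02}."""
    log = [0] * 256
    x = 1
    for i in range(255):
        log[x] = i
        x <<= 1
        if x & 0x100:
            x ^= 0x11d
    return log

LOG = build_log_table()

def classify(dp, dq):
    """Classify one byte position from dp = P+P' and dq = Q+Q'.

    Assumes at most one drive is corrupt at this position.
    """
    if dp == 0 and dq == 0:
        return ("none", None)       # nothing RAID-6 can discover
    if dq == 0:
        return ("P", None)          # only P disagrees
    if dp == 0:
        return ("Q", None)          # only Q disagrees
    # Both disagree: a data drive, n = log_g(dq/dp)
    return ("data", (LOG[dq] - LOG[dp]) % 255)

print(classify(0x00, 0x00))   # ('none', None)
print(classify(0x5a, 0x00))   # ('P', None)
print(classify(0x00, 0x5a))   # ('Q', None)
print(classify(0x01, 0x04))   # ('data', 2), since {04} = g^2 * {01}
```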

So, yes, RAID-6 *can* detect single drive corruption, and even tell you
which drive it is, if you're willing to compute a full syndrome set (P',
Q') on every read (as well as on every write.)

Note: RAID-6 cannot detect 2-drive corruption, unless of course the 
corruption is in different byte positions.  If multiple corresponding 
byte positions are corrupt, then the algorithm above will generally 
point you to a completely innocent drive.
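
A small Python sketch of that failure mode, under the same assumptions
as the math above (g = {02}; the field polynomial 0x11d is assumed) and
with made-up helper names. The error values {06} and {05} are picked
deliberately so that the computed index lands on an existing drive;
other error pairs may instead give an index beyond the last data drive
or make the damage look like P or Q corruption.

```python
# GF(2^8) tables, assuming polynomial 0x11d and generator g = {02}.
EXP = [0] * 255
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]

def syndromes(data):
    """Return (P, Q) for one byte position across the data drives."""
    p = q = 0
    for n, d in enumerate(data):
        p ^= d
        q ^= gf_mul(EXP[n], d)
    return p, q

good = [0x11, 0x22, 0x33, 0x44, 0x55]
p, q = syndromes(good)

# Corrupt drives 0 and 1 at the same byte position.
bad = list(good)
bad[0] ^= 0x06
bad[1] ^= 0x05

p2, q2 = syndromes(bad)
dp, dq = p ^ p2, q ^ q2                 # dp = {03}, dq = {0c} = g^2 * {03}
accused = (LOG[dq] - LOG[dp]) % 255     # single-corruption formula
print(accused)                          # prints 2, an innocent drive
```

Drive 2 holds good data here, so "repairing" it would add a third
error to the stripe.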

	-hpa
