Re: raid6 check/repair - Thiemo Nagel


From: Thiemo Nagel <thiemo.nagel@ph.tum.de>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: raid6 check/repair
Date: Thu, 22 Nov 2007 17:51:01 +0100
Message-ID: <4745B375.4030500@ph.tum.de>
In-Reply-To: <18244.64972.172685.796502@notabene.brown>

Dear Neil,

thank you very much for your detailed answer.

Neil Brown wrote:
> While it is possible to use the RAID6 P+Q information to deduce which
> data block is wrong if it is known that either 0 or 1 datablocks is 
> wrong, it is *not* possible to deduce which block or blocks are wrong
> if it is possible that more than 1 data block is wrong.

If I'm not mistaken, this is only partly correct.  Using P+Q redundancy,
it *is* possible to distinguish three cases:
a) exactly zero bad blocks
b) exactly one bad block
c) more than one bad block

Of course, it is only possible to recover from b), but one *can* tell
whether the situation is a), b), or c) and act accordingly.
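
To illustrate the three-way distinction, here is a minimal sketch in
Python.  It is not the md implementation.  It assumes the usual RAID6
arithmetic (GF(2^8) with polynomial 0x11d and generator 2, P as the XOR
of the data blocks, Q as the XOR of g^i * D_i); the names compute_pq and
classify_stripe are made up for this example.

```python
# GF(2^8) log/antilog tables for polynomial 0x11d, generator 2.
# (Assumed field parameters; the table is doubled to avoid a modulo.)
EXP = [0] * 510
LOG = [0] * 256
x = 1
for i in range(255):
    EXP[i] = x
    EXP[i + 255] = x
    LOG[x] = i
    x <<= 1
    if x & 0x100:
        x ^= 0x11d


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def compute_pq(data):
    """Return (P, Q) for a list of equal-length data blocks (bytes)."""
    size = len(data[0])
    p = bytearray(size)
    q = bytearray(size)
    for i, block in enumerate(data):
        coeff = EXP[i]  # g**i for data block i
        for k, byte in enumerate(block):
            p[k] ^= byte
            q[k] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def classify_stripe(data, p, q):
    """Return ("ok", None), ("single", where) or ("multiple", None).

    `where` is "P", "Q", or the index of the suspect data block.
    """
    p_calc, q_calc = compute_pq(data)
    suspects = set()
    for k in range(len(p)):
        sp = p[k] ^ p_calc[k]  # P syndrome for byte k
        sq = q[k] ^ q_calc[k]  # Q syndrome for byte k
        if sp == 0 and sq == 0:
            continue  # this byte position is consistent
        if sq == 0:
            suspects.add("P")
        elif sp == 0:
            suspects.add("Q")
        else:
            # One bad data block z gives sq == g**z * sp,
            # so z is the difference of the logarithms.
            z = (LOG[sq] - LOG[sp]) % 255
            suspects.add(z if z < len(data) else "invalid")
    if not suspects:
        return ("ok", None)
    if len(suspects) == 1 and "invalid" not in suspects:
        return ("single", suspects.pop())
    return ("multiple", None)
```

One caveat on the sketch: several corrupted blocks can in principle
produce syndromes that look like a single bad block, so a "single"
result is strong evidence rather than proof.  Requiring every
inconsistent byte position to point at the same block, as above, makes
such a coincidence unlikely for random corruption.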

> As it is quite possible for a write to be aborted in the middle 
> (during unexpected power down) with an unknown number of blocks in a 
> given stripe updated but others not, we do not know how many blocks 
> might be "wrong" so we cannot try to recover some wrong block.

As already mentioned, in my opinion one can distinguish between 0, 1,
and >1 bad blocks, and that is sufficient.

> Doing so would quite possibly corrupt a block that is not wrong.

I don't think additional corruption could be introduced, since recovery
would only be done for the case of exactly one bad block.
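
Once exactly one data block has been singled out, rewriting it is the
same XOR reconstruction that is used for a failed drive.  A minimal
sketch in Python, with hypothetical helper names (xor_parity,
rebuild_from_p) and assuming that P and all other data blocks are
intact:

```python
def xor_parity(blocks):
    """Return the byte-wise XOR of equal-length blocks (the P block)."""
    parity = bytearray(len(blocks[0]))
    for block in blocks:
        for k, byte in enumerate(block):
            parity[k] ^= byte
    return bytes(parity)


def rebuild_from_p(data, p, bad_index):
    """Recompute data[bad_index] from P and the remaining data blocks."""
    others = [blk for i, blk in enumerate(data) if i != bad_index]
    return xor_parity([p] + others)
```

The content of the suspect block is never read, so whatever garbage it
holds cannot leak into the result.  A careful implementation would
presumably re-check Q after the rewrite and leave the stripe alone if it
still does not match.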

> 
> [...]
> 
> As I said above - there is no solution that works in all cases.

I fully agree.  When more than one block is corrupted, and you don't 
know which are the corrupted blocks, you're lost.

> If more than one block is corrupt, and you don't know which ones, 
> then you lose and there is no way around that.

Sure.

The point I'm trying to make is that there is a specific case in which
recovery is possible (exactly one bad block), and that implementing
recovery for that case will not hurt in any way.

> RAID is not designed to protect against bad RAM, bad cables, chipset 
> bugs, driver bugs etc.  It is only designed to protect against drive 
> failure, where the drive failure is apparent.  i.e. a read must 
> return either the same data that was last written, or a failure 
> indication. Anything else is beyond the design parameters for RAID.

I'm taking a more pragmatic approach here.  In my opinion, RAID should
"just protect my data": against drive failure, yes, of course, but if it
can also help me in case of occasional data corruption, I'd happily take
that too, especially if it doesn't cost extra... ;-)

Kind regards,

Thiemo

