Re: Bad blocks are killing us!

From: Dieter Stueken <stueken@conterra.de>
To: linux-raid@vger.kernel.org
Subject: Re: Bad blocks are killing us!
Date: Mon, 22 Nov 2004 09:22:06 +0100
Message-ID: <41A1A1AE.7060909@conterra.de>
In-Reply-To: <200411180147.iAI1l5N02116@www.watkins-home.com>

Guy Watkins wrote:
> ... but the md-level
> approach might be better.  But I'm not sure I see the point of
> it---unless you have raid 6 with multiple parity blocks, if a disk
> actually has the wrong information recorded on it I don't think you
> can detect which drive is bad, just that one of them is."
> 
> If there is a parity block that does not match the data, true you do not
> know which device has the wrong data.  However, if you do not "correct" the
> parity, when a device fails, it will be constructed differently than it was
> before it failed.  This will just cause more corrupt data.  The parity must
> be made consistent with whatever data is on the data blocks to prevent this
> corrosion of data.  With RAID6 it should be possible to determine which
> block is wrong.  It would be a pain in the @$$, but I think it would be
> doable.  I will explain my theory if someone asks.

This is exactly the same conflict a single drive has with an unreadable sector.
It marks the sector as bad and cannot fulfill any read request until the data
is rewritten or erased. The single drive cannot (and should never try to!)
silently replace the bad sector with a spare sector, as it cannot recover the
content.

Likewise, the RAID system cannot solve this problem automagically, and never
should, as the former content cannot be deduced any more. But notice that we
have two very different problems to examine: The above problem arises if all
disks of the RAID system claim to read correct data, whereas the parity
information tells us that one of them must be wrong. Unless we have RAID6 to
recover from such single errors, the data is LOST and cannot be recovered.
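
To make this concrete, here is a minimal Python sketch of a single XOR parity
stripe. It is not md code; the block contents and the helper xor_blocks() are
made up for illustration. It shows that the mismatch is detected, but that
"fixing" any one of the blocks (data or parity) yields an equally consistent
stripe, so the parity alone cannot say which disk lied.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# Hypothetical stripe: three data blocks plus one XOR parity block.
data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
parity = xor_blocks(data)

# Disk 1 silently returns wrong data and reports no read error.
corrupted = [data[0], b"\x33\x45", data[2]]

# A consistent stripe XORs to all zero bytes; this one does not.
mismatch = xor_blocks(corrupted + [parity])
stripe_consistent = not any(mismatch)

# Applying the mismatch to ANY single block makes the stripe consistent
# again, so every block is an equally plausible culprit.
candidates = [xor_blocks([blk, mismatch]) for blk in corrupted + [parity]]
```

Only candidates[1] happens to equal the original data here, but nothing in the
stripe itself tells the RAID layer that.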

This is very different from the situation where one of the disks DOES report
an internal CRC error. In this case your data CAN be recovered reliably from the
parity information and, in most cases, successfully written back to the disk.
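
A small Python sketch of that recoverable case, again with invented block
values and hypothetical helper names (xor_blocks, reconstruct), not the real
md code path: because the failing disk is known, its block is simply the XOR
of all surviving blocks of the stripe.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def reconstruct(blocks, failed):
    """Rebuild the block at index `failed` from all other blocks.

    `blocks` holds the data blocks followed by the parity block; the entry
    at `failed` is ignored because that disk reported a read error.
    """
    survivors = [blk for i, blk in enumerate(blocks) if i != failed]
    return xor_blocks(survivors)

data = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]
stripe = data + [xor_blocks(data)]

stripe_read = list(stripe)
stripe_read[1] = None                    # disk 1 reported a CRC error

recovered = reconstruct(stripe_read, failed=1)
stripe_read[1] = recovered               # stands in for the write-back
```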

But there is also a difference between RAID and the individual disk here:
whereas the disk always reads the CRC data for a sector to verify its integrity,
the RAID system does not check the validity of the parity information by
default (this is why the idea of data scans came up in the first place). So, if
a scan discovers bad parity information, the only action that can (and must!) be
taken is to tag this piece of data as invalid. It is not enough just to log that
information somewhere; it is even more important to prevent further reads of
this piece of lost data. Otherwise the definitely invalid data may be read again
without any notice, may even get written back, and thus turn into "valid" data
even though it has become garbage.
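
Roughly what I have in mind, as a toy Python model. ToyArray, LostDataError
and the bad_stripes set are hypothetical names for illustration only; nothing
like this exists in md as far as I know. I also assume here that a full
rewrite of the stripe with fresh data is what clears the tag.

```python
from functools import reduce

class LostDataError(IOError):
    """Raised when a stripe that was tagged invalid is read."""

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

class ToyArray:
    def __init__(self, stripes):
        self.stripes = stripes       # each stripe: data blocks + parity
        self.bad_stripes = set()     # stripes whose content is lost

    def scrub(self):
        """Check every stripe; tag those whose parity does not match."""
        for n, stripe in enumerate(self.stripes):
            if any(xor_blocks(stripe)):
                self.bad_stripes.add(n)
        return sorted(self.bad_stripes)

    def read(self, n, disk):
        """Refuse to hand out data from a stripe known to be invalid."""
        if n in self.bad_stripes:
            raise LostDataError(f"stripe {n} is tagged invalid")
        return self.stripes[n][disk]

    def write_stripe(self, n, data_blocks):
        """Rewriting the whole stripe with new data clears the tag."""
        self.stripes[n] = list(data_blocks) + [xor_blocks(data_blocks)]
        self.bad_stripes.discard(n)

good = [b"\x01", b"\x02", b"\x03"]
array = ToyArray([
    good + [xor_blocks(good)],       # consistent stripe
    [b"\x01", b"\x02", b"\xff"],     # parity block does not match the data
])
tagged = array.scrub()
```

After the scrub, reading from stripe 0 still works, while any read from
stripe 1 raises LostDataError until the stripe is rewritten.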

People often argue for some spare sector management, which would supposedly
solve all problems. I think this is an illusion. Spare sectors are only useful
when WRITING data fails, not when a read fails or data loss has occurred. This
is already handled sufficiently well within the individual disks (I think). If
your disk gives write errors, you either have a very old one without internal
spare sector management, or your disk has already run out of spare sectors.
Read errors are much more frequent than write errors and thus a much more
important issue.

Dieter Stüken.
-- 
Dieter Stüken, con terra GmbH, Münster
     stueken@conterra.de
     http://www.conterra.de/
     (0)251-7474-501
