Re: [RFE] Please, add optional RAID1 feature (= chunk checksums) to make it more robust

From: Asdo <asdo@shiftmail.org>
To: Jaromir Capik <jcapik@redhat.com>
Cc: linux-raid <linux-raid@vger.kernel.org>
Subject: Re: [RFE] Please, add optional RAID1 feature (= chunk checksums) to make it more robust
Date: Wed, 18 Jul 2012 18:28:59 +0200
Message-ID: <5006E44B.50402@shiftmail.org>
In-Reply-To: <a62c8158-aae7-4924-8408-e0302ff4e9b3@zmail15.collab.prod.int.phx2.redhat.com>

On 07/18/12 13:01, Jaromir Capik wrote:
> Hello.
>
> I'd like to ask you to implement the following ...
>
> The current RAID1 solution is not robust enough to protect the data
> against random data corruptions. Such corruptions usually happen
> when an unreadable sector is found by the drive's electronics
> and when the drive's trying to reallocate the sector to the spare area.
> There's no guarantee that the reallocated data will always match
> the original stored data since the drive sometimes can't read the data
> correctly even with several retries. That unfortunately completely masks
> the issue, because the sector can be read by the OS without problems
> even if it doesn't contain correct data. Would it be possible
> to implement chunk checksums to avoid such data corruptions?
> If a corrupted chunk is encountered, it would be taken from the second
> drive and immediately synced back. This would have a small performance
> and capacity impact (1 sector per chunk to minimize performance impact
> caused by unaligned granularity = 0.78% of the capacity with 64k chunks).
>
> Please, let me know if you find my request reasonable or not.
>
> Thanks in advance.
>
> Regards,
> Jaromir.
>
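
For reference, the scheme requested above can be sketched roughly as
follows. This is only an illustration, not md code: the Mirror class,
read_with_repair() and overhead_percent() are hypothetical names, CRC32
stands in for whatever checksum would really be used, and 512-byte
sectors are assumed.

```python
import zlib

SECTOR = 512        # bytes; assumed sector size
CHUNK = 64 * 1024   # bytes; the 64k chunk size from the request


def overhead_percent(chunk_bytes=CHUNK, sector_bytes=SECTOR):
    """Capacity spent on one checksum sector per chunk, in percent."""
    return 100.0 * sector_bytes / chunk_bytes


class Mirror:
    """Hypothetical in-memory stand-in for one RAID1 member."""

    def __init__(self):
        self.chunks = {}  # chunk index -> (data, stored CRC32)

    def write(self, idx, data):
        self.chunks[idx] = (bytes(data), zlib.crc32(data))

    def read(self, idx):
        return self.chunks[idx]


def read_with_repair(mirrors, idx):
    """Return the first copy whose checksum matches.

    Members tried before the good copy failed their check, so they are
    rewritten from the good copy before returning.
    """
    bad = []
    for member in mirrors:
        data, stored = member.read(idx)
        if zlib.crc32(data) == stored:
            for broken in bad:
                broken.write(idx, data)  # sync the good copy back
            return data
        bad.append(member)
    raise OSError("no member holds a valid copy of chunk %d" % idx)


if __name__ == "__main__":
    a, b = Mirror(), Mirror()
    payload = b"x" * CHUNK
    a.write(0, payload)
    b.write(0, payload)
    # Simulate a silently corrupted chunk on the first member.
    a.chunks[0] = (b"y" + payload[1:], a.chunks[0][1])
    assert read_with_repair([a, b], 0) == payload
    print("overhead: %.2f%%" % overhead_percent())
```

With 64k chunks and one 512-byte sector per chunk, overhead_percent()
gives 512 / 65536 = 0.78125%, which matches the 0.78% quoted above.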

What you are asking for is a very invasive change: conceptually, in
man-hours, in performance, in on-disk format and in space. It also
really belongs in another layer, preferably below the RAID (btrfs and
zfs do this above it, though). This should probably be a DM/LVM
project.

Drives do this already: they have checksums (search for Reed-Solomon).
If the checksums are not long enough for you, you should use different
drives. In my life I have never seen a "silent data corruption" like
the one you describe.

Also, statistically speaking, if a drive's checksum returns a false
positive, the drive is very likely dying: it takes a great many bit
flips to get past the Reed-Solomon check, so other sectors on the same
drive have almost certainly returned read errors already, and you
should have replaced the drive long ago.
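
To put a rough shape on that argument, here is a small sketch. It
assumes each failing read slips past the drive's ECC independently with
some probability p_silent; both the independence and any concrete value
of p_silent are assumptions made for illustration, not measured drive
figures, and the function name is made up.

```python
def expected_detected_before_silent(p_silent):
    """Mean number of detected read errors before the first silent one.

    Models failing reads as independent trials that each pass the ECC
    undetected with probability p_silent (a geometric distribution), so
    the mean count of detected failures before the first undetected one
    is (1 - p_silent) / p_silent.
    """
    if not 0.0 < p_silent <= 1.0:
        raise ValueError("p_silent must be in (0, 1]")
    return (1.0 - p_silent) / p_silent


if __name__ == "__main__":
    # Hypothetical value, chosen only to show the scale of the effect.
    print(expected_detected_before_silent(1e-6))
```

Under that model, a hypothetical p_silent of one in a million means
roughly a million reported read errors, on average, before the first
silent one, which is why a drive in that state should already have been
flagged and replaced.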
