From: Neil Brown <neilb@suse.de>
To: Theodore Tso <tytso@MIT.EDU>
Cc: Peter Rabbitson <rabbit+list@rabbit.us>,
Bill Davidsen <davidsen@tmr.com>, Bas van Schaik <bas@tuxes.nl>,
linux-raid@vger.kernel.org
Subject: Re: Redundancy check using "echo check > sync_action": error reporting?
Date: Tue, 25 Mar 2008 15:44:44 +1100
Message-ID: <18408.33596.390197.465267@notabene.brown>
In-Reply-To: message from Theodore Tso on Saturday March 22
On Saturday March 22, tytso@MIT.EDU wrote:
> On Fri, Mar 21, 2008 at 06:35:43PM +0100, Peter Rabbitson wrote:
> >
> > Of course it would be possible to instruct md to always read all
> > data+parity chunks and make a comparison on every read. The performance
> > would not be much to write home about though.
>
> Yeah, and that's probably the real problem with this scheme. You
> basically reduce the read bandwidth of your array down to a single
> (slowest) disk --- basically the same reason why RAID-2 is a
> commercial failure.
Exactly.
>
> I suspect the best thing we *can* do is for filesystems that
> include checksums in the metadata and/or the data blocks, is if the
> CRC doesn't match, to have the filesystem tell the RAID subsystem,
> "um, could you send me copies of the data from all of the RAID-1
> mirrors, and see if one of the copies from the mirrors causes a valid
> checksum". Something similar could be done with RAID-5/RAID-6 arrays,
> if the fs layer could ask the RAID subsystem, "the external checksum
> for this block is bad; can you recalculate it from all available
> parity stripes assuming the data stripe is invalid".
Something along these lines would be very appropriate I think.
Particularly for raid1.
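
For illustration only, the raid1 case could be sketched as follows. This is a minimal sketch, not anything md or a filesystem implements today: the function name pick_valid_copy is hypothetical, and CRC32 from Python's zlib stands in for whatever checksum the filesystem actually stores.

```python
import zlib


def pick_valid_copy(copies, expected_crc):
    """Return (mirror_index, data) for the first copy whose CRC32 matches.

    `copies` is a list of bytes objects, one per RAID-1 mirror.
    Returns None when no mirror holds data matching the checksum.
    """
    for index, data in enumerate(copies):
        if zlib.crc32(data) == expected_crc:
            return index, data
    return None


# Hypothetical example: mirror 0 has silently flipped bits, mirror 1 is intact.
good = b"filesystem block" * 4
expected = zlib.crc32(good)
bad = bytearray(good)
bad[5] ^= 0xFF
copies = [bytes(bad), good]

result = pick_valid_copy(copies, expected)
```

Here result names mirror 1 as the copy to return to the filesystem; mirror 0 could then be rewritten from it.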
For raid5/raid6 it is possible that a valid block in the same stripe
was read and written before the faulty block was read. This would
correct the parity so when the bad block was found, there would be no
way to recover the correct data.
Still, having the possibility of recovery might be better than not
having it.
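
The raid5 hazard can be shown with plain XOR parity. This is a sketch under a stated assumption: the intervening write recomputes parity from all data blocks currently on disk (a reconstruct-write). The block contents and the helper xor_blocks are made up for the example.

```python
from functools import reduce


def xor_blocks(*blocks):
    """XOR equally sized blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# A three-data-disk stripe with its parity, as originally written.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks(d0, d1, d2)

# d1 is silently corrupted on disk; no media error is reported.
d1_on_disk = b"BxBB"

# Case 1: parity is still the original, so d1 can be rebuilt from the rest.
recovered_before = xor_blocks(d0, d2, parity)

# Case 2: d0 is rewritten first, and parity is recomputed from what is on
# disk (assumption: reconstruct-write), which folds the corruption in.
d0_new = b"ZZZZ"
parity_new = xor_blocks(d0_new, d1_on_disk, d2)
recovered_after = xor_blocks(d0_new, d2, parity_new)
```

In case 1 the rebuild yields the original d1. In case 2 it yields only the corrupted bytes, because the new parity now agrees with them.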
>
> As far as the question of how often this happens, where a disk
> silently corrupts a block without returning a media error, it
> definitely happens. Larry McVoy tells a story of periodically running
> a per-file CRC across a backup/archival filesystem, and was able to
> detect files that had not been modified changing out from under him.
> One way this can happen is if the disk accidentally writes some block
> to the wrong location on disk; the blockguard extension and various
> enterprise databases (since they can control their db-specific on-disk
> format) will encode the intended location of a block in their
> per-block checksums, to detect this specific type of failure, which
> should be a broad hint that this sort of thing can and does happen.
The "address data was corrupted" is certainly a credible possibility.
I remember reading that SCSI has a parity check for data, but not for
the command, which includes the storage address.
With the raid6 algorithm, we can tell which device has an error
(assuming only one device does) for each byte in the block.
If this returns the same device for every byte in the block, it is
probably reasonable to assume that exactly that device's block is bad.
Still, if we only do that on the monthly 'check', it could be too
late.
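
A sketch of the per-byte identification, for illustration. It assumes the usual raid6 construction: P is the XOR of the data blocks and Q is the sum of g^i * D_i over GF(2^8) with generator g = 2 and polynomial 0x11d. The function names are hypothetical and this is not the md code. With a single bad data device z, the P and Q mismatches for a byte are e and g^z * e, so z is the difference of their logarithms.

```python
# Build exp/log tables for GF(2^8), polynomial 0x11d, generator 2.
EXP = [0] * 255
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= 0x11D


def gf_mul(a, b):
    """Multiply two elements of GF(2^8)."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % 255]


def compute_pq(data_blocks):
    """Compute the P and Q blocks for a list of equally sized data blocks."""
    size = len(data_blocks[0])
    p, q = bytearray(size), bytearray(size)
    for i, block in enumerate(data_blocks):
        coeff = EXP[i % 255]  # g^i
        for j, byte in enumerate(block):
            p[j] ^= byte
            q[j] ^= gf_mul(coeff, byte)
    return bytes(p), bytes(q)


def locate_bad_device(data_blocks, p_stored, q_stored):
    """Per byte: None if consistent, 'P', 'Q', or the suspect data index.

    Only meaningful if at most one device is wrong for that byte.
    """
    p_calc, q_calc = compute_pq(data_blocks)
    verdicts = []
    for pc, qc, ps, qs in zip(p_calc, q_calc, p_stored, q_stored):
        dp, dq = pc ^ ps, qc ^ qs
        if dp == 0 and dq == 0:
            verdicts.append(None)
        elif dq == 0:
            verdicts.append("P")
        elif dp == 0:
            verdicts.append("Q")
        else:
            verdicts.append((LOG[dq] - LOG[dp]) % 255)
    return verdicts


# Hypothetical stripe: four data devices, then device 2 is corrupted.
stripe = [b"aaaa", b"bbbb", b"cccc", b"dddd"]
p_blk, q_blk = compute_pq(stripe)
damaged = list(stripe)
damaged[2] = b"XYZW"
verdicts = locate_bad_device(damaged, p_blk, q_blk)
```

If every byte of the block blames the same device, as it does here, that device's block is the likely culprit. An index beyond the number of data devices would suggest more than one device is wrong.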
I'm not sure that "surviving some data corruptions, if you are lucky"
is really better than surviving none. We don't want to provide a
false sense of security.... but maybe RAID already does that.
A filesystem that always writes full stripes, never over-writes
valid data, and (optionally) stores checksums for everything is
looking more and more appealing. The trouble is, I don't seem to have
enough "spare time" :-)
NeilBrown