From: Bron Gondwana <brong@fastmail.fm>
To: Joe Peterson <lavajoe@gentoo.org>
Cc: Gerald Nowitzky <Nowitzky@igne.de>, linux-btrfs@vger.kernel.org
Subject: Re: single disk reed solomon codes
Date: Sun, 20 Jul 2008 11:21:18 +1000
Message-ID: <20080720012118.GA4899@brong.net>
In-Reply-To: <4882678F.50004@gentoo.org>

On Sat, Jul 19, 2008 at 04:15:43PM -0600, Joe Peterson wrote:
> Gerald Nowitzky wrote:
> > When a HDD drive reads a sector from disk, it does a
> > whole bunch of error recognition and correction measures. Usually there are,
> > at least, two layers of error correction with different bit spreads on it.
> > *If* this still isn't enough, it is very likely that the whole sector will
> > come back completely spoiled, or, much more likely, won't come back at all
> > and the drive will report a read error.
>
> With larger and larger disks, it is increasingly likely we will see
> undetected/uncorrected errors (the drive bit error rates are not
> improving - 1 in 10^17 is typical). It is clear we cannot rely
> completely on the hardware to catch everything. Also, errors that
> happen in the hardware between the drive and the CPU can be caused by
> bad cables, interfaces, etc.
>
> For even single disk systems (even without mirroring), it is still valid
> to have some means of verifying integrity. It is far better to know an
> error occurred and which files are affected than to have it happen
> silently. If caught, undetected errors will be less likely to migrate
> onto backups over time and slowly corrupt data there too, making
> eventual recovery impossible. That's why btrfs's checksums are so cool!
>
> See my blog for my personal experiences with silent hard disk errors:
>
> http://planet.gentoo.org/developers/lavajoe/

I've seen an interesting discussion elsewhere about this very issue, in
the context of retrofitting some sort of checksumming support to FFS.
The suggestion was to make every 128th block a checksum block for the
previous 127 blocks (scale to your liking). Without changing the
filesystem format _at_all_ you could still checksum, so long as you read
in 128 blocks at a time. This isn't a major problem, since you'll
probably want that sort of readahead anyway.

Of course - that's just error detection, not error correction.
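
To make the layout concrete, here is a rough sketch of the 127+1 idea in
Python. It is only an illustration, not what the FFS discussion actually
proposed in detail: the 4096-byte block size, the choice of CRC32, and the
function names make_checksum_block() and verify_group() are all my own
assumptions.

```python
import struct
import zlib

BLOCK_SIZE = 4096            # assumed block size, not from the original discussion
GROUP = 128                  # 127 data blocks followed by 1 checksum block
DATA_PER_GROUP = GROUP - 1


def make_checksum_block(data_blocks):
    """Pack one CRC32 per data block into a single zero-padded block."""
    assert len(data_blocks) == DATA_PER_GROUP
    crcs = [zlib.crc32(block) for block in data_blocks]
    packed = struct.pack("<%dI" % DATA_PER_GROUP, *crcs)
    return packed.ljust(BLOCK_SIZE, b"\0")


def verify_group(blocks):
    """Return the indices of data blocks whose CRC32 does not match.

    `blocks` is a full group read in one go: 127 data blocks, then the
    checksum block. This only detects damage; it cannot repair it.
    """
    assert len(blocks) == GROUP
    stored = struct.unpack_from("<%dI" % DATA_PER_GROUP, blocks[-1])
    return [i for i, block in enumerate(blocks[:-1])
            if zlib.crc32(block) != stored[i]]
```

With 4-byte CRCs the 127 checksums take only 508 bytes, so the checksum
block has plenty of room left over. Note that damage to the checksum block
itself shows up as mismatches on otherwise good data blocks, so a real
design would want to protect that block as well.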

Bron.