From: Bron Gondwana <brong@fastmail.fm>
To: Joe Peterson <lavajoe@gentoo.org>
Cc: Gerald Nowitzky <Nowitzky@igne.de>, linux-btrfs@vger.kernel.org
Subject: Re: single disk reed solomon codes
Date: Sun, 20 Jul 2008 11:21:18 +1000
Message-ID: <20080720012118.GA4899@brong.net>
In-Reply-To: <4882678F.50004@gentoo.org>

On Sat, Jul 19, 2008 at 04:15:43PM -0600, Joe Peterson wrote:
> Gerald Nowitzky wrote:
> > When a HDD drive reads a sector from disk, it does a
> > whole bunch of error recognition and correction measures. Usually there are, 
> > at least, two layers of error correction with different bit spreads on it. 
> > *If* this still isn't enough, it is very likely that the whole sector will 
> > come back completely spoiled, or, much more likely, won't come back at all 
> > and the drive will report a read error.
> 
> With larger and larger disks, it is increasingly likely we will see
> undetected/uncorrected errors (the drive bit error rates are not
> improving - 1 in 10^17 is typical).  It is clear we cannot rely
> completely on the hardware to catch everything.  Also, errors that
> happen in the hardware between the drive and the CPU can be caused by
> bad cables, interfaces, etc.
> 
> For even single disk systems (even without mirroring), it is still valid
> to have some means of verifying integrity.  It is far better to know an
> error occurred and which files are affected than to have it happen
> silently.  If caught, undetected errors will be less likely to migrate
> onto backups over time and slowly corrupt data there too, making
> eventual recovery impossible.  That's why btrfs's checksums are so cool!
> 
> See my blog for my personal experiences with silent hard disk errors:
> 
> 	http://planet.gentoo.org/developers/lavajoe/

I've seen an interesting discussion elsewhere about this very issue, in
the context of retrofitting some sort of checksumming support to FFS.

The suggestion was to make every 128th block a checksum block for the
previous 127 blocks (scale to your liking).  Without changing the
filesystem format _at_all_ you could still checksum, so long as you read
in 128 blocks at a time.  This isn't a major problem, since you'll
probably want that sort of readahead anyway.

Of course - that's just error detection, not error correction.
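
A minimal sketch of that layout in Python, as an illustration only: the
4096-byte block size, the use of CRC32, and the function names
build_checksum_block and verify_group are assumptions, not details from
the discussion mentioned above.

```python
import struct
import zlib

BLOCK_SIZE = 4096             # assumed block size in bytes
GROUP = 128                   # 127 data blocks followed by 1 checksum block
DATA_PER_GROUP = GROUP - 1
CRC_FORMAT = "<%dI" % DATA_PER_GROUP   # 127 little-endian 32-bit values


def build_checksum_block(data_blocks):
    """Pack one CRC32 per data block into a single zero-padded block."""
    if len(data_blocks) != DATA_PER_GROUP:
        raise ValueError("expected %d data blocks" % DATA_PER_GROUP)
    crcs = [zlib.crc32(block) for block in data_blocks]
    packed = struct.pack(CRC_FORMAT, *crcs)      # 127 * 4 = 508 bytes
    return packed.ljust(BLOCK_SIZE, b"\0")


def verify_group(group_blocks):
    """Return the indices of data blocks whose CRC32 does not match.

    group_blocks is a full group as read from disk: 127 data blocks
    with the checksum block last.
    """
    if len(group_blocks) != GROUP:
        raise ValueError("expected %d blocks" % GROUP)
    stored = struct.unpack_from(CRC_FORMAT, group_blocks[-1])
    return [
        i
        for i, block in enumerate(group_blocks[:-1])
        if zlib.crc32(block) != stored[i]
    ]
```

Like the scheme itself, this only reports which blocks in a group look
bad; it has no way to repair them.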

Bron.

