From: Bron Gondwana <brong@fastmail.fm>
To: Joe Peterson <lavajoe@gentoo.org>
Cc: Gerald Nowitzky <Nowitzky@igne.de>, linux-btrfs@vger.kernel.org
Subject: Re: single disk reed solomon codes
Date: Sun, 20 Jul 2008 11:21:18 +1000
Message-ID: <20080720012118.GA4899@brong.net>
In-Reply-To: <4882678F.50004@gentoo.org>

On Sat, Jul 19, 2008 at 04:15:43PM -0600, Joe Peterson wrote:
> Gerald Nowitzky wrote:
> > When a HDD drive reads a sector from disk, it does a
> > whole bunch of error recognition and correction measures. Usually there are, 
> > at least, two layers of error correction with different bit spreads on it. 
> > *If* this still isn't enough, it is very likely that the whole sector will 
> > come back completely spoiled, or, much more likely, won't come back at all 
> > and the drive will report a read error.
> 
> With larger and larger disks, it is increasingly likely we will see
> undetected/uncorrected errors (the drive bit error rates are not
> improving - 1 in 10^17 is typical).  It is clear we cannot rely
> completely on the hardware to catch everything.  Also, errors that
> happen in the hardware between the drive and the CPU can be caused by
> bad cables, interfaces, etc.
> 
> For even single disk systems (even without mirroring), it is still valid
> to have some means of verifying integrity.  It is far better to know an
> error occurred and which files are affected than to have it happen
> silently.  If caught, undetected errors will be less likely to migrate
> onto backups over time and slowly corrupt data there too, making
> eventual recovery impossible.  That's why btrfs's checksums are so cool!
> 
> See my blog for my personal experiences with silent hard disk errors:
> 
> 	http://planet.gentoo.org/developers/lavajoe/

I've seen an interesting discussion elsewhere about this very issue, in
the context of retrofitting some sort of checksumming support to FFS.

The suggestion was to make every 128th block a checksum block for the
previous 127 blocks (scale to your liking).  Without changing the
filesystem format _at_all_ you could still checksum so long as you read
in 128 blocks at a time.  This isn't a major problem, since you'll
probably want that sort of readahead anyway.

Of course - that's just error detection, not error correction.
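
To make the layout concrete, here is a rough sketch of the idea, not
anything taken from the FFS discussion itself.  The 4096-byte block
size, the choice of CRC32, the little-endian packing and the function
names are all my own assumptions for illustration; the only thing from
the suggestion is "127 data blocks, then one checksum block".

```python
import struct
import zlib

BLOCK_SIZE = 4096            # assumed block size in bytes
GROUP = 128                  # 127 data blocks + 1 checksum block
DATA_PER_GROUP = GROUP - 1


def build_checksum_block(data_blocks):
    """Pack one CRC32 per data block into a single zero-padded block."""
    if len(data_blocks) != DATA_PER_GROUP:
        raise ValueError("expected %d data blocks" % DATA_PER_GROUP)
    crcs = [zlib.crc32(block) for block in data_blocks]
    packed = struct.pack("<%dI" % DATA_PER_GROUP, *crcs)  # 127 * 4 bytes
    return packed.ljust(BLOCK_SIZE, b"\0")


def verify_group(group):
    """Return the indices of data blocks whose CRC32 does not match.

    `group` is the raw bytes of one full 128-block read.
    """
    if len(group) != GROUP * BLOCK_SIZE:
        raise ValueError("a group must be read as 128 whole blocks")
    blocks = [group[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
              for i in range(GROUP)]
    stored = struct.unpack_from("<%dI" % DATA_PER_GROUP, blocks[-1])
    return [i for i, block in enumerate(blocks[:-1])
            if zlib.crc32(block) != stored[i]]
```

A reader would call verify_group() on each 128-block read and treat a
non-empty result as an I/O error for those blocks.  As noted above,
this only tells you which blocks are bad; it cannot repair them.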

Bron.
