From: Ric Wheeler <ric@emc.com>
To: Hans Reiser <reiser@namesys.com>
Cc: Gregory Maxwell <gmaxwell@gmail.com>,
	Edward Shishkin <edward@namesys.com>,
	Tom Reinhart <rhino_tom@hotmail.com>,
	reiserfs-list@namesys.com
Subject: Re: the " 'official' point of view" expressed by kernelnewbies.org
Date: Wed, 16 Aug 2006 11:08:27 -0400
Message-ID: <44E334EB.5000001@emc.com>
In-Reply-To: <44E29DD9.7030302@namesys.com>

Hans Reiser wrote:
> I am skeptical that bitflip errors above the storage layer are as common
> as the ZFS authors say, and their statistics that I have seen somehow
> lack a lot of detail about how they were gathered.  If, say, a device
> with 100 errors counts as 100 instances for their statistics.....  Well,
> it would be nice to know how they were gathered.  Next time I meet them
> I must ask.
>   
I think that most big vendors have a lot of information about failure 
rates on drives, but cannot actually share the details in public (due to 
NDAs with the suppliers).

One thing that we are trying to do is to get some of the more 
"community" oriented people at Seagate Research to come out and talk to 
people about which types of errors are reasonable to code against.  
The current idea is to get everyone (i.e., Linux IO people, file system 
people and these vendors) in the same place a couple of days before the 
next FAST conference.  (See the USENIX page for details on FAST at 
http://www.usenix.org/events/fast07/cfp/).

I will say that media errors tend to be larger than single bit errors, 
i.e. you will lose a set of sectors instead of seeing a single bit flip 
on one sector (remember that the drive vendors do extensive ECC at their 
level).  What their ECC will not fix is something like junk settling on 
the platter or a really bad error like a bad disk head.

> That said, if users want it, there should be a plugin that checks the bits.
>
> I agree that stripe awareness and the need to signal the underlying raid
> that a block needs to be recovered is important.  Checksumming at the fs
> level seems like a reasonable plugin.
>
> I have no opinion on the computational cost of ECC vs. checksums, I will
> trust that you are correct.
>   

What we can (and should) do is to make sure that we detect errors much 
better than we do today.  I think that ECC would be overkill, but we 
certainly could do simple checksums for strategic parts of the file 
system data.
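
As a rough illustration of the kind of simple checksum meant here, the 
sketch below stores a CRC32 next to each sector and recomputes it on 
read.  The 512-byte sector size, the helper names and the choice of 
CRC32 are assumptions made for the example, not a description of any 
existing file system.  It exercises both failure shapes discussed above: 
a single bit flip and a run of sectors that comes back as junk.

```python
import zlib

SECTOR_SIZE = 512  # assumed sector size, for illustration only


def seal(sectors):
    """Return a list of (data, crc) pairs, one per sector."""
    return [(data, zlib.crc32(data)) for data in sectors]


def find_bad_sectors(sealed):
    """Return indices of sectors whose stored CRC no longer matches."""
    return [i for i, (data, crc) in enumerate(sealed)
            if zlib.crc32(data) != crc]


def flip_bit(data, bit):
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit // 8] ^= 1 << (bit % 8)
    return bytes(buf)


# Eight sectors of distinct, non-zero content.
sectors = [bytes([i + 1]) * SECTOR_SIZE for i in range(8)]
sealed = seal(sectors)

# Case 1: a single bit flip in sector 2.
one_flip = list(sealed)
one_flip[2] = (flip_bit(sealed[2][0], 100), sealed[2][1])

# Case 2: sectors 4..6 read back as zeros, as after a media error.
lost_run = list(sealed)
for i in range(4, 7):
    lost_run[i] = (bytes(SECTOR_SIZE), sealed[i][1])

print(find_bad_sectors(sealed))    # [] (no damage)
print(find_bad_sectors(one_flip))  # [2]
print(find_bad_sectors(lost_run))  # [4, 5, 6]
```

A checksum like this only detects the damage; recovering the data still 
depends on a redundant copy somewhere else (RAID, a mirror, a backup).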

Also note that there is work underway in the SCSI space on something 
called "block guard" that defines some extra bytes per disk sector for 
application level data.  That could be used for per block sanity 
checking information, but how to get at it from the file system is an 
interesting question.
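
To make the idea of extra per-sector bytes concrete, here is a minimal 
sketch of what such a trailer could carry.  The 8-byte layout (a 16-bit 
guard CRC, a 16-bit application tag and a 32-bit reference tag holding 
the low bits of the block address), the CRC polynomial and all function 
names are assumptions for illustration; the actual proposal may differ 
in its details.

```python
import struct

SECTOR_SIZE = 512      # assumed data bytes per sector
GUARD_POLY = 0x8BB7    # assumed 16-bit CRC polynomial for the guard field


def crc16(data, poly=GUARD_POLY):
    """Bitwise, MSB-first CRC-16 with zero initial value and no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ poly) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def protect(data, lba, app_tag=0):
    """Append 8 bytes: guard CRC, application tag, low 32 bits of the LBA."""
    if len(data) != SECTOR_SIZE:
        raise ValueError("expected exactly one sector of data")
    trailer = struct.pack(">HHI", crc16(data), app_tag, lba & 0xFFFFFFFF)
    return data + trailer


def check(sector, expected_lba):
    """Return a list of problems found in a protected sector (empty if OK)."""
    data, trailer = sector[:SECTOR_SIZE], sector[SECTOR_SIZE:]
    guard, _app_tag, ref = struct.unpack(">HHI", trailer)
    problems = []
    if guard != crc16(data):
        problems.append("guard mismatch")
    if ref != (expected_lba & 0xFFFFFFFF):
        problems.append("misplaced block")
    return problems


good = protect(b"\xab" * SECTOR_SIZE, lba=1000)
corrupted = b"\x00" + good[1:]       # damage the first data byte

print(check(good, 1000))       # []
print(check(good, 1001))       # ['misplaced block']
print(check(corrupted, 1000))  # ['guard mismatch']
```

The reference tag is what lets the reader catch a block that is intact 
but was written to, or read from, the wrong location, which a plain 
data checksum cannot see.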

Val's write-up at LWN on the file system workshop, and the comments on 
it, have an active discussion of this kind of thing 
(http://lwn.net/Articles/190222/).

ric


