linux-raid.vger.kernel.org archive mirror
From: david@lang.hm
To: Mattias Wadenstein <maswan@acc.umu.se>
Cc: Neil Brown <neilb@suse.de>, David Chinner <dgc@sgi.com>,
	Avi Kivity <avi@argo.co.il>,
	linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: limits on raid
Date: Thu, 21 Jun 2007 09:48:38 -0700 (PDT)
Message-ID: <Pine.LNX.4.64.0706210947090.31603@asgard.lang.hm>
In-Reply-To: <Pine.GSO.4.64.0706211417180.15647@montezuma.acc.umu.se>

On Thu, 21 Jun 2007, Mattias Wadenstein wrote:

> On Thu, 21 Jun 2007, Neil Brown wrote:
>
>>  I have that - apparently naive - idea that drives use strong checksum,
>>  and will never return bad data, only good data or an error.  If this
>>  isn't right, then it would really help to understand what the cause of
>>  other failures are before working out how to handle them....
>
> In theory, that's how storage should work. In practice, silent data 
> corruption does happen. If not from the disks themselves, somewhere along the 
> path of cables, controllers, drivers, buses, etc. If you add in fcal, you'll 
> get even more sources of failure, but usually you can avoid SANs (if you care 
> about your data).

heh, the pitch I get from the self-proclaimed experts is that if you care 
about your data you put it on the SAN (so you can take advantage of the 
more expensive disk arrays, various backup advantages, and replication 
features that tend to be focused on the SAN because it's a big target)

David Lang

> Well, here are a couple of the issues that I've seen myself:
>
> A hw-raid controller returning every 64th bit as 0, no matter what's on disk. 
> With no error condition at all. (I've also heard from a colleague about this 
> on every 64k, but not seen that myself.)
>
> An fcal switch occasionally resetting, garbling the blocks in transit with 
> random data. Lost a few TB of user data that way.
>
> Add to this the random driver breakage that happens now and then. I've also 
> had a few broken filesystems due to in-memory corruption due to bad ram, not 
> sure there is much hope of fixing that though.
>
> Also, this presentation is pretty worrying on the frequency of silent data 
> corruption:
>
> https://indico.desy.de/contributionDisplay.py?contribId=65&sessionId=42&confId=257
>
> /Mattias Wadenstein
>
>
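
The hw-raid failure quoted above (every 64th bit read back as 0, with no
error reported) is the kind of fault that only a checksum kept above the
storage path can catch. The following is a minimal sketch, not anything
proposed in the thread: the helper names are hypothetical, the bit
position within each 64-bit word is an assumption, and SHA-256 stands in
for whatever checksum a real filesystem or application would use.

```python
import hashlib
from typing import Tuple


def clear_every_64th_bit(data: bytes) -> bytes:
    """Simulate the faulty controller: force one bit per 64-bit word to 0.

    Assumption: the affected bit is the lowest bit of the last byte of
    each 8-byte word. The original report does not say which bit it was.
    """
    out = bytearray(data)
    for i in range(7, len(out), 8):
        out[i] &= 0xFE
    return bytes(out)


def write_block(payload: bytes) -> Tuple[bytes, str]:
    """Return the payload plus a digest computed before it hits storage."""
    return payload, hashlib.sha256(payload).hexdigest()


def verify_block(payload: bytes, digest: str) -> bool:
    """Recompute the digest on read and compare with the stored one."""
    return hashlib.sha256(payload).hexdigest() == digest


# 64 bytes of all-ones, so every cleared bit actually changes the data.
original = bytes([0xFF]) * 64
stored, digest = write_block(original)

# The "controller" returns damaged data without signalling any error.
read_back = clear_every_64th_bit(stored)

print("clean read verifies:  ", verify_block(stored, digest))
print("damaged read verifies:", verify_block(read_back, digest))
```

The check only helps if the digest is stored and verified somewhere the
faulty component cannot touch it consistently; if a block happens to have
the affected bits already clear, the read is unchanged and verifies, which
is the correct result.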
