linux-raid.vger.kernel.org archive mirror
* OT: silent data corruption reading from hard drives
@ 2012-08-01 12:02 matt
  2012-08-01 13:03 ` Roman Mamedov
  2012-08-02  0:56 ` Stan Hoeppner
  0 siblings, 2 replies; 25+ messages in thread
From: matt @ 2012-08-01 12:02 UTC (permalink / raw)
  To: linux-raid

Quick intro: Last year I had problems with an md array that
inexplicably kept showing a mismatch_cnt in the tens of thousands.
After a week or two of hardware swapping and such, I narrowed it down
to bad reads from the hard drive block devices.  I used scripts that
would repeatedly do something like this on all my drives:
      dd if=/dev/sdk1 bs=1024 count=50000000 | md5sum -b
Some devices would intermittently get different results.  I ended up
resolving (?) it by replacing the cheapo (Syba) SATA controller cards
with other cheapo (Rosewill) ones.  I've been fine for about a year
since then.
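
A minimal sketch of the repeated-read check described above, assuming
Python 3 and read access to the device.  Instead of one checksum for the
whole read, it hashes fixed-size chunks so that an unstable read can be
narrowed down to an offset.  The function names chunk_digests and
compare_passes and the 1 MiB default chunk size are illustrative choices,
not anything from the original scripts.

```python
import hashlib


def chunk_digests(path, chunk_size=1024 * 1024, max_chunks=None):
    """Read `path` sequentially and return one MD5 hex digest per chunk."""
    digests = []
    with open(path, "rb") as dev:
        while max_chunks is None or len(digests) < max_chunks:
            block = dev.read(chunk_size)
            if not block:  # end of device or file
                break
            digests.append(hashlib.md5(block).hexdigest())
    return digests


def compare_passes(path, passes=3, chunk_size=1024 * 1024, max_chunks=None):
    """Read `path` several times and report chunks whose digest changed.

    Returns a dict mapping chunk index to the set of distinct digests
    seen for that chunk; an empty dict means every pass agreed.
    """
    seen = {}
    for _ in range(passes):
        digests = chunk_digests(path, chunk_size, max_chunks)
        for index, digest in enumerate(digests):
            seen.setdefault(index, set()).add(digest)
    return {index: ds for index, ds in seen.items() if len(ds) > 1}
```

Something like compare_passes("/dev/sdk1", passes=5) would then list the
chunk indexes that did not read back identically; multiplying an index by
the chunk size gives the byte offset.  One caveat: repeated reads of a
small region may be served from the kernel page cache rather than the
drive, so either read more data than fits in RAM (as the roughly 50 GB dd
above likely does) or drop the caches between passes.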

But now it's just started happening again.  Although this isn't an md
question per se, I'm hoping some of you raid/kernel/storage gurus can
give me some tips on how to trace this in a better way than my haphazard
method last year.  Is there any way to detect these bad reads when they
happen?  (Apparently not?)  What about finding out if the cause is the
motherboard, the controller card, the device driver, or the kernel?
(Besides swapping hardware?)  Can the md layer help out in this regard?
Are there known bugs or hardware nuances that relate to this?  Is
silent data corruption like this simply to be expected when using cheap
commodity hardware?

Thanks for reading...

matt

Thread overview: 25+ messages
2012-08-01 12:02 OT: silent data corruption reading from hard drives matt
2012-08-01 13:03 ` Roman Mamedov
2012-08-02  0:56 ` Stan Hoeppner
2012-08-02  1:07   ` Roberto Spadim
2012-08-02  1:14     ` Roberto Spadim
2012-08-02  1:27   ` Adam Goryachev
2012-08-02  1:35     ` Roberto Spadim
2012-08-02  3:23     ` Stan Hoeppner
2012-08-02 13:02     ` Drew
2012-08-02  3:19   ` Roman Mamedov
2012-08-02  7:51     ` Stan Hoeppner
2012-08-02  8:06       ` Roman Mamedov
2012-08-02  9:29         ` Stan Hoeppner
2012-08-02 12:26         ` Iustin Pop
2012-08-02 16:59         ` listy
2012-08-02 17:04           ` Roberto Spadim
2012-08-02 17:13             ` Jeff Johnson
2012-08-02 17:19               ` Roman Mamedov
2012-08-02 17:25                 ` Roberto Spadim
2012-08-02 17:22               ` Roberto Spadim
     [not found]           ` <501AB9D8.1030404@turmel.org>
2012-08-02 18:32             ` listy
2012-08-03 13:36               ` Phil Turmel
2012-08-15 21:55                 ` Peter Grandi
2012-08-16  7:30                   ` Oliver Schinagl
     [not found]                     ` <CABYL=TqU6qvDK-CuFak42iVNj0v4OcvALXOnr=6XLM4HyXfGkw@mail.gmail.com>
2012-08-16 14:33                       ` Roberto Spadim
