linux-raid.vger.kernel.org archive mirror
From: matt <listy@fastmail.fm>
To: linux-raid@vger.kernel.org
Subject: OT: silent data corruption reading from hard drives
Date: Wed, 01 Aug 2012 08:02:38 -0400
Message-ID: <50191ADE.10809@fastmail.fm>

Quick intro: Last year I was having problems with an md array that
kept reporting a mismatch_cnt in the tens of thousands, inexplicably.
After a week or two of hardware swapping and such, I narrowed it down
to bad reads from the hard drive block devices.  I used scripts that
would repeatedly do something like this on all my drives:
      dd if=/dev/sdk1 bs=1024 count=50000000 |md5sum -b
Some devices would intermittently get different results.  I ended up
resolving (?) it by replacing the cheapo (Syba) SATA controller cards
with other cheapo (Rosewill) ones.  I've been fine for about a year
since then.
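
A minimal sketch of the repeated dd | md5sum check described above, not
the poster's actual script.  The function name check_reads, its
arguments, and the DD_EXTRA variable are hypothetical.  One caveat: a
read that fits in RAM may be served from the page cache on later
passes, so either read more data than you have memory or pass GNU dd's
iflag=direct through DD_EXTRA.

```shell
#!/bin/sh
# check_reads DEVICE [PASSES] [COUNT]
# Reads the first COUNT 1 KiB blocks of DEVICE PASSES times and compares
# the MD5 of every pass against the first one.
# DD_EXTRA (hypothetical) carries extra dd options, e.g. "iflag=direct".
check_reads() {
    dev=$1
    passes=${2:-3}
    count=${3:-50000000}
    first=""
    status=0
    i=1
    while [ "$i" -le "$passes" ]; do
        # md5sum prints "<hash> *-"; keep only the hash field
        sum=$(dd if="$dev" bs=1024 count="$count" $DD_EXTRA 2>/dev/null \
              | md5sum -b | cut -d' ' -f1)
        echo "pass $i: $sum"
        if [ -z "$first" ]; then
            first=$sum
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on $dev at pass $i"
            status=1
        fi
        i=$((i + 1))
    done
    [ "$status" -eq 0 ] && echo "consistent: $dev"
    return "$status"
}
```

For example, `DD_EXTRA=iflag=direct check_reads /dev/sdk1 5` would read
the same region five times and return non-zero if any pass differs.  A
mismatch only says that a read went wrong somewhere between the platter
and userspace; it does not say which component is at fault.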

But now it's just started happening again.  Although this isn't an md
question per se, I'm hoping some of you raid/kernel/storage gurus can
give me some tips on how to trace this in a better way than my haphazard
method last year.  Is there any way to detect these bad reads when they
happen?  (Apparently not?)  What about finding out if the cause is the
motherboard, the controller card, the device driver, or the kernel?
(Besides swapping hardware?)  Can the md layer help out in this regard?
Are there known bugs or hardware nuances that relate to this?  Is
silent data corruption like this simply to be expected when using cheap
commodity hardware?

Thanks for reading...

matt


Thread overview: 25+ messages
2012-08-01 12:02 matt [this message]
2012-08-01 13:03 ` OT: silent data corruption reading from hard drives Roman Mamedov
2012-08-02  0:56 ` Stan Hoeppner
2012-08-02  1:07   ` Roberto Spadim
2012-08-02  1:14     ` Roberto Spadim
2012-08-02  1:27   ` Adam Goryachev
2012-08-02  1:35     ` Roberto Spadim
2012-08-02  3:23     ` Stan Hoeppner
2012-08-02 13:02     ` Drew
2012-08-02  3:19   ` Roman Mamedov
2012-08-02  7:51     ` Stan Hoeppner
2012-08-02  8:06       ` Roman Mamedov
2012-08-02  9:29         ` Stan Hoeppner
2012-08-02 12:26         ` Iustin Pop
2012-08-02 16:59         ` listy
2012-08-02 17:04           ` Roberto Spadim
2012-08-02 17:13             ` Jeff Johnson
2012-08-02 17:19               ` Roman Mamedov
2012-08-02 17:25                 ` Roberto Spadim
2012-08-02 17:22               ` Roberto Spadim
     [not found]           ` <501AB9D8.1030404@turmel.org>
2012-08-02 18:32             ` listy
2012-08-03 13:36               ` Phil Turmel
2012-08-15 21:55                 ` Peter Grandi
2012-08-16  7:30                   ` Oliver Schinagl
     [not found]                     ` <CABYL=TqU6qvDK-CuFak42iVNj0v4OcvALXOnr=6XLM4HyXfGkw@mail.gmail.com>
2012-08-16 14:33                       ` Roberto Spadim
