linux-raid.vger.kernel.org archive mirror
* Debugging a strange array corruption
@ 2010-12-14  8:10 Brad Campbell
From: Brad Campbell @ 2010-12-14  8:10 UTC
  To: RAID Linux

G'day all,

I have a 10 x 1TB drive RAID-6 here. It's been great for ages, but recently I've seen nasty random 
corruption across the entire array that I cannot pin down.

The machine also has a number of RAID-1 arrays and a RAID-5, which are all behaving perfectly.

The machine has 16GB of RAM, so all my read tests are done with dd bs=1G count=20 to make sure I'm 
actually hitting the disk somewhere.

The array is partitioned into three approximately equal partitions.

If I do something like -

for i in `seq 3` ; do dd if=/dev/md0p1 bs=1G count=20 | md5sum ; done

- I get three completely different checksums.
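
One way to narrow this down, sketched here rather than taken from the test above, is to checksum the same region in fixed-size chunks on two passes and list which chunks disagree. The function names chunk_sums and compare_passes and their arguments are made up for illustration. As with the dd test above, the region read should be larger than RAM, otherwise the second pass may be served from the page cache.

```shell
#!/bin/sh
# Sketch only: function names and arguments are illustrative.

# chunk_sums DEVICE CHUNK_MIB COUNT
# Prints "<chunk index> <md5>" for each of COUNT chunks of CHUNK_MIB MiB.
chunk_sums() {
    dev=$1; chunk_mib=$2; count=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        # skip is counted in bs-sized (1 MiB) blocks
        sum=$(dd if="$dev" bs=1M skip=$((i * chunk_mib)) count="$chunk_mib" 2>/dev/null \
              | md5sum | cut -d' ' -f1)
        echo "$i $sum"
        i=$((i + 1))
    done
}

# compare_passes DEVICE CHUNK_MIB COUNT
# Reads the same region twice and prints the index of every chunk whose
# checksum differs between the two passes. Prints nothing if they agree.
compare_passes() {
    a=$(mktemp); b=$(mktemp)
    chunk_sums "$1" "$2" "$3" > "$a"
    chunk_sums "$1" "$2" "$3" > "$b"
    # diff lines starting with "<" come from the first pass
    diff "$a" "$b" | awk '/^</ {print $2}'
    rm -f "$a" "$b"
}

# Example (same 20 GiB region as the dd test, in 1 GiB chunks):
#   compare_passes /dev/md0p1 1024 20
```

If the mismatching chunk indices change from run to run, that is consistent with the garbage being introduced on the read path rather than stored at fixed locations on the array.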

The filesystems are unmounted and the array is idle.

I've run the same test individually on all 10 disks in the array and they all appear to give 
consistent data. Reading anything from the array gives me mostly correct data with intermittent garbage.
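
For reference, the per-disk test can be written as a loop along these lines. This is a sketch, not the exact commands that were run: the function names are invented, and the device names in the example are placeholders for the actual array members.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, device names are placeholders.

# read_sum DEVICE SIZE_MIB
# Prints the md5 of the first SIZE_MIB MiB of DEVICE.
read_sum() {
    dd if="$1" bs=1M count="$2" 2>/dev/null | md5sum | cut -d' ' -f1
}

# check_members SIZE_MIB PASSES DEVICE...
# Reads each device PASSES times and prints "OK <device>" if every pass
# gave the same checksum, or "MISMATCH <device>" otherwise.
check_members() {
    size_mib=$1; passes=$2; shift 2
    for dev in "$@"; do
        first=$(read_sum "$dev" "$size_mib")
        status=OK
        n=1
        while [ "$n" -lt "$passes" ]; do
            [ "$(read_sum "$dev" "$size_mib")" = "$first" ] || status=MISMATCH
            n=$((n + 1))
        done
        echo "$status $dev"
    done
}

# Example (20 GiB from each member, three passes; substitute real members):
#   check_members 20480 3 /dev/sdX /dev/sdY
```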

I've tried both a 2.6.36.[12] kernel, and I'm currently running 2.6.37-rc5-git3 with the same odd 
results.

All the disks pass long SMART tests. They all checksum correctly from end to end with repeated 
sequential runs.

No libata errors in the logs.

The drives are all on separate channels. 8 are on a pair of Marvell 88SX7042 controllers and 2 are 
on a SIL3132. This has occurred since I upgraded the mainboard (and kernel at the same time - 
nothing like throwing more variables in the mix) and its effects were subtle enough that I missed 
them until it had successfully rotated out all of my good backups with broken data. Lesson learned.

I'm stumped and I don't even know where to begin. I've never seen something like this happen without 
a bad disk, controller or cable, and those are easy to diagnose.

Regards,
-- 
Dolphins are so intelligent that within a few weeks they can
train Americans to stand at the edge of the pool and throw them
fish.

