public inbox for linux-scsi@vger.kernel.org
From: Marc Bejarano <beej@alum.mit.edu>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: linux-scsi@vger.kernel.org, linux-raid@vger.kernel.org
Subject: Re: data corruption: ext3/lvm2/md/mptsas/vitesse/seagate
Date: Fri, 07 Mar 2008 17:40:40 -0500
Message-ID: <200803101519.m2AFJCJS032509@colby.verdasys.com>
In-Reply-To: <1204848652.3062.100.camel@localhost.localdomain>

hi, james.  thanks so much for taking the time to dig into this! :)

At 19:10 3/6/2008, James Bottomley wrote:
 >On Thu, 2008-03-06 at 16:08 -0500, Marc Bejarano wrote:
 >> i've been doing burn-in on a new server i had hoped to deploy months
 >> ago and can't seem to figure out the cause of data corruption i've
 >> been seeing.  the SAS controller is an LSI SAS3801E connected to an
 >> xTore XJ-SA12-316 SAS enclosures (vitesse expanders) full of seagate
 >> 7200.10 750-GB SATA drives.
 >>
 >> the corruption is occurring in ext3 filesystems that live on top of
 >> an lvm2 RAID 0 stripe composed of 16 2-drive md RAID 1 sets.  the
 >> corruption has been detected both by MySQL noticing bad checksums and
 >> also by using md's "check" (sync_action) for RAID 1 consistency.
 >
 >Actually, the RAID-1 might be the most useful.  Is there anything
 >significant about the differing data?

it looks like contiguous sectors of misplaced data.

 >Do od dumps of the corrupt
 >sectors in both halves of the mirror and see what actually appears in
 >the data ... it might turn out to be useful.

my colleague (who has been banging his head against this for far
longer than he'd like) has been getting at the data via a pread64()
of the actual mysql data file.  multiple pread64()'s end up giving
him both halves of the mirror.

 >Things like how long the
 >data corruption is (are the two sectors different, or is it just a run
 >of a few bytes within them) can be useful in tracking the source of the
 >corruption.
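A minimal sketch of the comparison being asked for, assuming the two halves of the mirror have already been read into equal-length byte buffers and that the sector size is 512 bytes. The function name `differing_sector_runs` is hypothetical and not part of any tool mentioned in this thread; it only reports which runs of sectors differ, so one can see whether the damage is whole sectors or a few bytes.

```python
SECTOR = 512  # assumed sector size in bytes


def differing_sector_runs(good: bytes, bad: bytes, sector: int = SECTOR):
    """Return a list of (first_sector, sector_count) for each contiguous
    run of sectors whose contents differ between the two buffers."""
    if len(good) != len(bad):
        raise ValueError("buffers must be the same length")
    runs = []
    start = None  # index of the first sector in the current differing run
    n_sectors = (len(good) + sector - 1) // sector
    for i in range(n_sectors):
        lo, hi = i * sector, (i + 1) * sector
        same = good[lo:hi] == bad[lo:hi]
        if not same and start is None:
            start = i                       # a differing run begins here
        elif same and start is not None:
            runs.append((start, i - start))  # the run ended at sector i-1
            start = None
    if start is not None:                   # run extends to end of buffer
        runs.append((start, n_sectors - start))
    return runs
```

A result of a single long run would suggest whole misplaced sectors, while many one-sector runs would point toward scattered byte-level damage.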

here is a cut of an email he wrote me:
===
In one instance of mirroring out-of-sync-ness, the disk with the bad
data looked as follows:

"a" is a currently undetermined offset into the block device divisible by 16K.

a + 0x00000: "header of 16K mysql/innodb page # 178812066 followed by good data"

a + 0x02600: **BAD DATA**: "header of 16K mysql/innodb page # 178812067", should be at a+0x04000, followed by old version of first 6656 bytes of page 178812067

a + 0x04000: "header of 16K mysql/innodb page # 178812067 followed by correct current copy of page"

It looks to me like mysql/innodb "page" 178812067 at some point was 
written to the wrong spot, and subsequently a newer version of page 
178812067 got written out again, but to the proper spot.

In another instance of out-of-sync-ness, the bad disk looked as 
follows.  The bad disk was in a completely different md raid1 
"device", and if it needs to be said explicitly, was a totally 
different physical drive.

b + 0x00000: "header of 16K mysql/innodb page 309713974 followed by good data"

b + 0x03600: **BAD DATA**: "header of 16K mysql/innodb page 309713975", should be at b+0x04000, followed by first 10752 == 21*512 bytes of current correct value of page per disk with good copy

b + 0x06000: correct current last part of page 309713975 in proper place.

This is hard to explain.  It looks like page 309713975 got written 
out to the proper spot, but then the first 10752 bytes got written 
out again to the wrong spot?!?
===
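The offsets in the two reports above can be checked with a little arithmetic. The sketch below assumes 512-byte sectors and the 16K page size stated in the report; the helper name `describe` is hypothetical. It computes how far ahead of its proper location (offset 0x4000) each stray copy landed, and where the stray copy ends.

```python
SECTOR = 512        # assumed sector size in bytes
PAGE = 16 * 1024    # 16K InnoDB page size, as stated in the report


def describe(stray_offset, stray_len, page=PAGE, sector=SECTOR):
    """Summarise a stray copy that starts stray_offset bytes into the
    region and is stray_len bytes long; its proper start is `page`."""
    shift = page - stray_offset  # how many bytes too early it was written
    return {
        "shift_bytes": shift,
        "shift_sectors": shift // sector,
        "len_sectors": stray_len // sector,
        "stray_end": stray_offset + stray_len,
        # True only if start, shift and length are all whole sectors
        "sector_aligned": all(v % sector == 0
                              for v in (stray_offset, shift, stray_len)),
    }


case_a = describe(0x2600, 6656)    # first instance
case_b = describe(0x3600, 10752)   # second instance
print(case_a)
print(case_b)
```

Under these assumptions both stray copies start and end on sector boundaries. The first landed 13 sectors early and ends exactly at a+0x04000. The second landed 5 sectors early and ends at b+0x06000, which matches the offset where the report says correct data resumes.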

 >Do you happen to have the absolute block number (and relative block
 >number---relative to the partition start) of the corruption?

no.  can you suggest an easy way to get that?
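One possible approach, offered as an assumption and not something proposed in this thread, is the Linux FIBMAP ioctl, which maps a logical block of a file to a block number on the device holding the filesystem. In this setup that device would be the LVM logical volume, so getting from there to an md device or a physical disk would still require the LVM and md layout. The constants below are the values from `<linux/fs.h>`; the helper names are hypothetical, and FIBMAP normally requires root.

```python
import fcntl
import os
import struct

FIBMAP = 1    # _IO(0x00, 1) in <linux/fs.h>
FIGETBSZ = 2  # _IO(0x00, 2) in <linux/fs.h>


def logical_block(offset, block_size):
    """Index of the filesystem block containing byte `offset` of a file."""
    return offset // block_size


def fs_block_for_offset(path, offset):
    """Return (fs_block_number, block_size) for the given file offset.
    The block number is relative to the device the filesystem sits on;
    0 means the block is a hole.  Typically needs root privileges."""
    fd = os.open(path, os.O_RDONLY)
    try:
        # Ask the filesystem for its block size.
        raw = fcntl.ioctl(fd, FIGETBSZ, struct.pack("i", 0))
        block_size = struct.unpack("i", raw)[0]
        # FIBMAP takes the logical block in and returns the mapped block.
        arg = struct.pack("i", logical_block(offset, block_size))
        mapped = struct.unpack("i", fcntl.ioctl(fd, FIBMAP, arg))[0]
        return mapped, block_size
    finally:
        os.close(fd)
```

Multiplying the returned block number by the block size gives a byte offset into the logical volume, which is the "relative" figure; the absolute on-disk location depends on layers this sketch does not inspect.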

 >Of course, confirming
 >that git head has this problem too, so we could rule out patches added
 >to the RHEL kernel would be useful ...

we're not currently git-enabled, but i suppose it wouldn't take too
long to become so.  would testing with the latest kernel.org snapshot
(currently 2.6.25-rc4-git2 from this morning) be good enough?  or
were you hoping for a test with stuff from scsi-misc?

cheers,
marc

