Re: data corruption: ext3/lvm2/md/mptsas/vitesse/seagate

linux-raid.vger.kernel.org archive mirror

From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: Marc Bejarano <beej@alum.mit.edu>
Cc: linux-scsi@vger.kernel.org, linux-raid@vger.kernel.org,
	Grant Grundler <grundler@google.com>
Subject: Re: data corruption: ext3/lvm2/md/mptsas/vitesse/seagate
Date: Mon, 10 Mar 2008 10:36:26 -0500
Message-ID: <1205163386.2941.14.camel@localhost.localdomain>
In-Reply-To: <200803101519.m2AFJCJS032509@colby.verdasys.com>

On Fri, 2008-03-07 at 17:40 -0500, Marc Bejarano wrote:
> In another instance of out-of-sync-ness, the bad disk looked as 
> follows.  The bad disk was in a completely different md raid1 
> "device", and if it needs to be said explicitly, was a totally 
> different physical drive.
> 
> b + 0x00000: "header of 16K mysql/innodb page 309713974 followed by
> good data"
> 
> b + 0x03600: **BAD DATA**: "header of 16K mysql/innodb page 
> 309713975", should be at b+0x04000, followed by first 10752 == 21*512 
> bytes of current correct value of page per disk with good copy
> 
> b + 0x06000: correct current last part of page 309713975 in proper
> place.
> 
> This is hard to explain.  It looks like page 309713975 got written 
> out to the proper spot, but then the first 10752 bytes got written 
> out again to the wrong spot?!?

I'm afraid you're not going to like this, but this pattern of corruption
is almost conclusive evidence of a disk problem with head positioning.

The reason is that the block layer and everything below it write out in
units of what they see as the logical block size (usually 4k, but in any
case the block size of the underlying filesystem you have mysql on).

Seeing an odd number of 512 byte sectors out of position like that (21
in your case), when a linux logical block is required to be a power of
two number of sectors, means it can't really have come from the kernel:
we always deal in power of two units of the underlying 512 byte sectors
all the way from block, through md, to the low level SCSI driver.
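
To make the arithmetic concrete, here is a rough Python sketch that
checks the offsets and length from your report against a few candidate
block sizes.  The 1k/2k/4k list is my assumption about what the
filesystem could be using, and the function names are made up for
illustration.

```python
SECTOR = 512  # bytes per sector, as used throughout this thread

# Values taken from the report quoted above.
BAD_OFFSET = 0x03600       # where the stray copy of the page header landed
EXPECTED_OFFSET = 0x04000  # where that 16K page should start
BAD_LENGTH = 10752         # bytes of page data written at the wrong spot


def is_power_of_two(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0


def aligned(value, block_size):
    """True if value is a whole multiple of block_size.

    block_size must be a power-of-two number of 512 byte sectors.
    """
    if block_size % SECTOR or not is_power_of_two(block_size // SECTOR):
        raise ValueError("block size must be a power-of-two sector count")
    return value % block_size == 0


def report(block_sizes=(1024, 2048, 4096)):
    """Return (block_size, bad_offset_ok, bad_length_ok, expected_ok) rows.

    The candidate block sizes are an assumption, not something read
    from the affected system.
    """
    return [
        (bs, aligned(BAD_OFFSET, bs), aligned(BAD_LENGTH, bs),
         aligned(EXPECTED_OFFSET, bs))
        for bs in block_sizes
    ]


if __name__ == "__main__":
    print("misplaced length in sectors:", BAD_LENGTH // SECTOR)
    print("misplaced offset in sectors:", BAD_OFFSET // SECTOR)
    for row in report():
        print(row)
```

Both the misplaced offset and the misplaced length are odd sector
counts, so they fail the alignment check for every block size larger
than one sector, while the expected offset passes for all of them.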

It's still theoretically possible that something went wrong in the
actual HBA, but I'd place most of my money on a disk fault.  The drives
you have, the Seagate 7200.10, were the first to use perpendicular
recording, so it could be they have head positioning errors with the new
technology.  There's also a lot of talk on the internet about
performance issues with the various revisions of their firmware:

http://www.fluffles.net/articles/seagate-AAK-firmware

Just as a matter of interest, what version of firmware do you have?  You
can get this with

hdparm -I /dev/sd<whatever>

I'm afraid the only way to confirm this theory definitively will be with
the destructive disktest from autotest (it was actually constructed to
check for drive head positioning errors), as Grant explained:

> If you can destroy (and later restore) the data on one or more
> of the disks, you might consider running disktest from:
>    http://test.kernel.org/autotest/
> 
> I've parked an SVN snapshot on:
>    http://iou.parisc-linux.org/~grundler/autotest-20080307.tgz
> 
> See autotest/tests/disktest/ . IIRC this test will tag each 512 byte
> "sector" it writes to a file and will read back those tags later to
> verify the sectors made it to media.
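
For anyone who wants to see the idea before pulling the tarball, the
following is a rough Python sketch of the tagging technique Grant
describes.  It is not the autotest disktest code: the tag layout, the
names, and the use of an in-memory buffer in place of a real disk are
all my assumptions, chosen only for illustration.

```python
import io
import struct

SECTOR = 512     # bytes per sector
MAGIC = b"TAG1"  # hypothetical marker, not disktest's real format


def make_sector(index):
    """Build one 512 byte sector tagged with its own sector number."""
    header = MAGIC + struct.pack("<Q", index)
    return header + bytes(SECTOR - len(header))


def write_tagged(f, count):
    """Write `count` tagged sectors starting at the beginning of f."""
    f.seek(0)
    for i in range(count):
        f.write(make_sector(i))


def verify(f, count):
    """Return (position, found_tag) pairs for sectors in the wrong place.

    found_tag is None when the sector carries no valid tag at all.
    """
    f.seek(0)
    bad = []
    for pos in range(count):
        data = f.read(SECTOR)
        if len(data) != SECTOR or data[:4] != MAGIC:
            bad.append((pos, None))
            continue
        (found,) = struct.unpack("<Q", data[4:12])
        if found != pos:
            bad.append((pos, found))
    return bad


def simulate_misplaced_write(f, src, dst):
    """Mimic a positioning error: sector `src` lands on top of `dst`."""
    f.seek(dst * SECTOR)
    f.write(make_sector(src))


if __name__ == "__main__":
    disk = io.BytesIO()  # stand-in for a real block device
    write_tagged(disk, 16)
    simulate_misplaced_write(disk, src=8, dst=5)
    print(verify(disk, 16))
```

Because every sector records where it was supposed to go, a sector that
lands in the wrong place identifies itself on read-back, which is
exactly the kind of misplacement seen in the report above.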

James
