From: Adam Newham <adam@thenewhams.com>
To: linux-raid@vger.kernel.org
Subject: How do I repair a checksum error in the superblock?
Date: Thu, 23 Sep 2010 16:30:20 -0700
Message-ID: <4C9BE30C.9060301@thenewhams.com>


I've got a sick RAID-5 array and I'm looking for advice on the best way
to fix it. I've Googled the hell out of it and read the FAQ, and I think
I know what I need to do, but I want to make sure, as I'd rather not have
to restore the data from backups (they're incomplete and restoring would
be very time consuming).

The machine is configured as follows:

    * 4 x 1 TB drives (SATA) - software RAID-5, with LVM consuming all
      3TB and then ext3 on top giving 2.7 TB
    * 1 x OS drive (IDE) (I actually have 1x drive with RHEL5 and
      another with Ubuntu which with the newer kernel is a lot more
      friendly with my motherboard)
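
As a quick sanity check on the sizes above, here is a minimal sketch of the
RAID-5 capacity arithmetic. It assumes the per-device size is the
"Used Dev Size" value (in KiB) that mdadm --examine reports further down in
this message; the variable names are made up for illustration.

```shell
# RAID-5 keeps one device's worth of parity, so usable space is (n - 1) devices.
raid_devices=4
used_dev_kib=976759936          # "Used Dev Size" from mdadm --examine, in KiB

array_kib=$(( (raid_devices - 1) * used_dev_kib ))
echo "array size: ${array_kib} KiB"

# Convert KiB to TiB (1 TiB = 1024^3 KiB) to compare with the ~2.7 figure.
awk -v k="$array_kib" 'BEGIN { printf "about %.2f TiB\n", k / (1024 * 1024 * 1024) }'
```

The KiB result matches the "Array Size" line in the --examine output, and the
binary-unit figure is where the 2.7 comes from.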


Basically, the machine died due to a bad motherboard and DIMM.
During a boot a disk check was performed and at 1.6% the kernel
panicked. I re-installed the OS and I'm now trying to recover the
RAID. It looks like I have 3 problems.

    * When the original OS was installed, the OS drive was located on
      /dev/hda[x]. Under the new OS (Ubuntu 10.04), it now shows up at
      /dev/sda[x]. The RAID was originally located on /dev/sd[abcd].
      With the OS drive at /dev/sda[x], the RAID members now appear at
      /dev/sd[bcde]. I modified the /etc/mdadm/mdadm.conf file to
      reflect this. I could probably get round this by going back to the
      RHEL5 OS, but it would be nice to know how to handle it properly.

At the moment I have fixed it by modifying the /etc/mdadm/mdadm.conf file
as follows:

DEVICE /dev/sd[bcde]1
ARRAY /dev/md0 level=raid5 num-devices=4 UUID=08558923:881d9efd:464c249d:988d2ec6
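
To illustrate the renaming described above: a minimal sketch, assuming the
classic Linux sd numbering (block major 8, 16 minor numbers per disk). Under
that assumption, moving the members from sd[abcd] to sd[bcde] shifts each
member's minor number by 16. The helper name sd_minor is hypothetical.

```shell
# Hypothetical helper: minor number of partition $2 on disk sd$1,
# assuming major 8 with 16 minors reserved per disk.
sd_minor() {
    letter=$1
    part=$2
    # printf with a leading quote yields the character code; 'a' is 97.
    index=$(( $(printf '%d' "'$letter") - 97 ))
    echo $(( index * 16 + part ))
}

# First partition on each disk, old and new names side by side.
for d in a b c d e; do
    echo "sd${d}1 -> 8:$(sd_minor "$d" 1)"
done
```

With this numbering, the partition now called /dev/sde1 is 8:65, while 8:49
is whatever the current system calls /dev/sdd1.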

    * The next problem (and my main one) is that one of the
      drives (/dev/sde) has a checksum error in the superblock. So when
      I try to assemble the array, I get the following:

sudo mdadm --assemble --verbose /dev/md0
mdadm: looking for devices for /dev/md0
mdadm: /dev/sde1 is identified as a member of /dev/md0, slot 3.
mdadm: /dev/sdd1 is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sdc1 is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdb1 is identified as a member of /dev/md0, slot 0.
mdadm: added /dev/sdc1 to /dev/md0 as 1
mdadm: added /dev/sdd1 to /dev/md0 as 2
mdadm: failed to add /dev/sde1 to /dev/md0: Invalid argument
mdadm: added /dev/sdb1 to /dev/md0 as 0
mdadm: /dev/md0 assembled from 3 drives - not enough to start the array while not clean - consider --force.

/var/log/messages contains the following:

md: sde1 does not have a valid v0.90 superblock, not importing!
md: md_import_device returned -22

If I dump out the info for the drive (/dev/sde1) I see the following:

sudo mdadm --examine /dev/sde1
/dev/sde1:
           Magic : a92b4efc
         Version : 00.90.03
            UUID : 08558923:881d9efd:464c249d:988d2ec6
   Creation Time : Mon Nov  3 17:42:21 2008
      Raid Level : raid5
   Used Dev Size : 976759936 (931.51 GiB 1000.20 GB)
      Array Size : 2930279808 (2794.53 GiB 3000.61 GB)
    Raid Devices : 4
   Total Devices : 4
Preferred Minor : 0

     Update Time : Sun Aug 15 12:33:06 2010
           State : active
  Active Devices : 4
Working Devices : 4
  Failed Devices : 0
   Spare Devices : 0
        Checksum : e828e258 - expected e828e260
          Events : 143

          Layout : left-symmetric
      Chunk Size : 64K

       Number   Major   Minor   RaidDevice State
this     3       8       49        3      active sync   /dev/sdd1

    0     0       8        1        0      active sync   /dev/sda1
    1     1       8       17        1      active sync   /dev/sdb1
    2     2       8       33        2      active sync   /dev/sdc1
    3     3       8       49        3      active sync   /dev/sdd1
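
For reference, the two checksum values in the output above can be compared
directly. This is only a sketch of the arithmetic; it assumes (not confirmed
anywhere in this message) that the v0.90 superblock checksum is an additive
sum over the superblock's 32-bit words, in which case a small power-of-two
difference would be consistent with a single flipped bit somewhere in the
superblock rather than wholesale corruption.

```shell
stored=0xe828e258      # "Checksum" value recorded in the superblock
expected=0xe828e260    # value mdadm computed from the superblock contents

# Arithmetic difference and bitwise XOR of the two values.
diff=$(( $expected - $stored ))
xor=$(( $expected ^ $stored ))

printf 'difference: %d\n' "$diff"
printf 'xor:        0x%x\n' "$xor"
```

The difference is 8, i.e. a single power of two, which fits the bad DIMM
mentioned earlier but does not prove it.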

How do I fix this? Googling suggests recreating the array over the
top of the existing one and specifying the UUID. Or should I force the
assemble with 3 drives? There is also an --update option, which updates
the metadata on the disk?

    * The last problem is that I believe that one of the drives has
      additional metadata. This caused Ubuntu to see an additional
      partition /dev/md0lp1 in addition to /dev/md0. What is the best
      way of removing it?
