From: Carl Karsten <carl@personnelware.com>
To: "Majed B." <majedb@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: reconstruct raid superblock
Date: Thu, 17 Dec 2009 10:17:13 -0600
Message-ID: <549053140912170817n3b3818fan5ad483b520d0cb53@mail.gmail.com>
In-Reply-To: <70ed7c3e0912170740y771602bfqb682c39887edbd7b@mail.gmail.com>
On Thu, Dec 17, 2009 at 9:40 AM, Majed B. <majedb@gmail.com> wrote:
> I'm assuming you ran the command with the 2 external disks added to the array.
> One question before proceeding: When you removed these 2 externals,
> were there any changes on the array? Did you add/delete/modify any
> files or rename them?
Shut down the box, unplugged the drives, booted the box.
>
> What do you mean the 2 externals have had mkfs run on them? Is this
> AFTER you removed the disks from the array? If so, they're useless
> now.
That's what I figured.
>
> The names of the disks have changed, and their names in the superblock
> are different from what udev is now reporting:
> sde was previously named sdg
> sdf is sdf
> sdb is sdb
> sdc is sdc
> sdd is sdd
>
> According to the listing above, you have superblock info on: sdb, sdc,
> sdd, sde, sdf; 5 disks out of 7 -- one of which is a spare.
> sdb was a spare and according to other disks' info, it didn't resync
> so it has no useful data to aid in recovery.
> So you're left with 4 out of 6 disks + 1 spare.
>
> You have a chance of running the array in degraded mode using sde,
> sdc, sdd, sdf, assuming these disks are sane.
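(A sanity check worth doing before forcing anything: compare event counters
and roles across the members to see which disks agree -- something like

  mdadm -E /dev/sd[b-f] | grep -E '^/dev|Events|this'

where the disk range is whatever actually has superblocks; the members
sharing the highest event count are the ones to assemble from.)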
>
> Try running this command: mdadm -Af /dev/md0 /dev/sde /dev/sdc /dev/sdd /dev/sdf
mdadm: forcing event count in /dev/sdf(1) from 97276 upto 580158
mdadm: /dev/md0 has been started with 4 drives (out of 6).
>
> then check: cat /proc/mdstat
root@dhcp128:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : active raid6 sdf[1] sde[5] sdd[3] sdc[2]
5860549632 blocks level 6, 64k chunk, algorithm 2 [6/4] [_UUU_U]
unused devices: <none>
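(Not captured here, but the usual companion to /proc/mdstat would be

  mdadm --detail /dev/md0

which shows the array state, UUID, and which slots are missing in one place.)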
>
> If the remaining disks are sane, it should run the array in degraded
> mode. Hopefully.
dmesg shows:
[31828.093953] md: md0 stopped.
[31838.929607] md: bind<sdc>
[31838.931455] md: bind<sdd>
[31838.932073] md: bind<sde>
[31838.932376] md: bind<sdf>
[31838.973346] raid5: device sdf operational as raid disk 1
[31838.973349] raid5: device sde operational as raid disk 5
[31838.973351] raid5: device sdd operational as raid disk 3
[31838.973353] raid5: device sdc operational as raid disk 2
[31838.973787] raid5: allocated 6307kB for md0
[31838.974165] raid5: raid level 6 set md0 active with 4 out of 6 devices, algorithm 2
[31839.066014] RAID5 conf printout:
[31839.066016] --- rd:6 wd:4
[31839.066018] disk 1, o:1, dev:sdf
[31839.066020] disk 2, o:1, dev:sdc
[31839.066022] disk 3, o:1, dev:sdd
[31839.066024] disk 5, o:1, dev:sde
[31839.066066] md0: detected capacity change from 0 to 6001202823168
[31839.066188] md0: p1
root@dhcp128:/media# fdisk -l /dev/md0
Disk /dev/md0: 6001.2 GB, 6001202823168 bytes
255 heads, 63 sectors/track, 729604 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x96af0591
Device Boot Start End Blocks Id System
/dev/md0p1 1 182401 1465136001 83 Linux
and now the bad news:
mount /dev/md0p1 md0p1
mount: wrong fs type, bad option, bad superblock on /dev/md0p1
[32359.038796] raid5: Disk failure on sde, disabling device.
[32359.038797] raid5: Operation continuing on 3 devices.
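(One read-only thing that could still be tried -- assuming the filesystem
on md0p1 was ext2/ext3 -- is a check against a backup superblock:

  mke2fs -n /dev/md0p1            # dry run: prints backup superblock locations, writes nothing
  e2fsck -n -b 32768 /dev/md0p1   # read-only fsck using a backup superblock

With sde dropping out as well, only 3 of the 6 members are left, below the
4 a six-disk raid6 needs, so the odds aren't good.)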
>
> If that doesn't work, I'd say you're better off scrapping & restoring
> your data back onto a new array rather than wasting more time fiddling
> with superblocks.
Yep. Starting that now.
This is exactly what I was expecting - very few things to try (basically one)
and a very clear pass/fail test.
Thanks for helping me get through this.
>
> On Thu, Dec 17, 2009 at 6:06 PM, Carl Karsten <carl@personnelware.com> wrote:
>> I brought back the 2 externals, which have had mkfs run on them, but
>> maybe the extra superblocks will help (doubt it, but couldn't hurt)
>>
>> root@dhcp128:/media# mdadm -E /dev/sd[a-z]
>> mdadm: No md superblock detected on /dev/sda.
>> /dev/sdb:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 6
>> Preferred Minor : 0
>>
>> Update Time : Tue Mar 31 23:08:02 2009
>> State : clean
>> Active Devices : 5
>> Working Devices : 6
>> Failed Devices : 1
>> Spare Devices : 1
>> Checksum : a4fbb93a - correct
>> Events : 8430
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 6 8 16 6 spare /dev/sdb
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 8 64 1 active sync /dev/sde
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 80 5 active sync /dev/sdf
>> 6 6 8 16 6 spare /dev/sdb
>> /dev/sdc:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a59452db - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 2 8 32 2 active sync /dev/sdc
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sdd:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a59452ed - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 3 8 48 3 active sync /dev/sdd
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sde:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a5945321 - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 5 8 96 5 active sync /dev/sdg
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sdf:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 5
>> Preferred Minor : 0
>>
>> Update Time : Wed Apr 8 11:13:32 2009
>> State : clean
>> Active Devices : 5
>> Working Devices : 5
>> Failed Devices : 1
>> Spare Devices : 0
>> Checksum : a5085415 - correct
>> Events : 97276
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 1 8 80 1 active sync /dev/sdf
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 8 80 1 active sync /dev/sdf
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> mdadm: No md superblock detected on /dev/sdg.
>>
>>
>>
>> On Thu, Dec 17, 2009 at 8:39 AM, Majed B. <majedb@gmail.com> wrote:
>>> You can't just copy a superblock from another disk and change a few
>>> bytes to make it identify a different disk.
>>>
>>> To check which disks belong to an array, do this:
>>> mdadm -E /dev/sd[a-z]
>>>
>>> The disks that you get info from belong to the existing array(s).
>>>
>>> In the first email you sent you included an examine output for one of
>>> the disks that listed another disk as a spare (sdb). The output of
>>> examine should shed more light.
>>>
>>> On Thu, Dec 17, 2009 at 5:15 PM, Carl Karsten <carl@personnelware.com> wrote:
>>>> On Thu, Dec 17, 2009 at 4:35 AM, Majed B. <majedb@gmail.com> wrote:
>>>>> I have misread the information you've provided, so allow me to correct myself:
>>>>>
>>>>> You're running a RAID6 array, with 2 disks lost/failed. Any disk loss
>>>>> after that will cause data loss since you have no redundancy (2 disks
>>>>> died).
>>>>
>>>> Right - but I am not sure whether data loss has occurred, where by "data"
>>>> I mean the data stored on the raid, not the raid metadata.
>>>>
>>>> My guess is I need to copy the raid superblock from one of the other
>>>> disks (say sdb), find the bytes that identify the disk, and change them
>>>> from sdb to sda.
>>>>
>>>>>
>>>>> I believe it's still possible to reassemble the array, but you only
>>>>> need to remove the MBR. See this page for information:
>>>>> http://www.cyberciti.biz/faq/linux-how-to-uninstall-grub/
>>>>> dd if=/dev/zero of=/dev/sdX bs=446 count=1
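(If anyone tries this, saving the original sector first makes it
reversible, e.g.

  dd if=/dev/sdX of=/root/sdX-first-sector.bak bs=512 count=1

where the backup path is just an example.)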
>>>>>
>>>>> Before proceeding, provide the output of cat /proc/mdstat
>>>>
>>>> root@dhcp128:~# cat /proc/mdstat
>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>> [raid4] [raid10]
>>>> unused devices: <none>
>>>>
>>>>
>>>>> Is the array currently running degraded or is it suspended?
>>>>
>>>> um, not running, not sure I would call it suspended.
>>>>
>>>>> What happened to the spare disk assigned?
>>>>
>>>> I don't understand.
>>>>
>>>>> Did it finish resyncing
>>>>> before you installed grub on the wrong disk?
>>>>
>>>> I think so.
>>>>
>>>> I am fairly sure I could assemble the array before I installed grub.
>>>>
>>>>>
>>>>> On Thu, Dec 17, 2009 at 8:21 AM, Majed B. <majedb@gmail.com> wrote:
>>>>>> If your other disks are sane and you are able to run a degraded array, then
>>>>>> you can remove grub using dd and then re-add the disk to the array.
>>>>>>
>>>>>> To clear the first 1MB of the disk:
>>>>>> dd if=/dev/zero of=/dev/sdx bs=1M count=1
>>>>>> Replace sdx with the disk name that has grub.
>>>>>>
>>>>>> On Dec 17, 2009 6:53 AM, "Carl Karsten" <carl@personnelware.com> wrote:
>>>>>>
>>>>>> I took over a box that had 1 IDE boot drive and 6 SATA raid drives (4
>>>>>> internal, 2 external). I believe the 2 externals were redundant, so
>>>>>> they could be removed. So I did, and mkfs-ed them. Then I installed
>>>>>> Ubuntu to the IDE drive and installed grub to sda, which turns out to
>>>>>> be the first SATA disk. That would be fine if the raid were on sda1,
>>>>>> but it is on sda, and now the raid won't assemble. No surprise, and I
>>>>>> do have a backup of the data spread across 5 external drives. But
>>>>>> before I abandon the array, I am wondering if I can fix it by
>>>>>> recreating mdadm's metadata on sda, given I have sd[bcd] to work with.
>>>>>>
>>>>>> any suggestions?
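(For the record, the two usual options when an md superblock is lost are a
forced assemble from the surviving members, or -- as a last resort --
recreating the metadata with the exact original geometry, roughly:

  mdadm --assemble --force /dev/md0 <surviving members>
  mdadm --create /dev/md0 --metadata=0.90 --level=6 --raid-devices=6 \
        --chunk=64 --assume-clean <disks in original order, "missing" for absent ones>

The second only rewrites metadata but is easy to get wrong; the forced
assemble near the top of this message is what was actually used.)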
>>>>>>
>>>>>> root@dhcp128:~# mdadm --examine /dev/sd[abcd]
>>>>>> mdadm: No md superblock detected on /dev/sda.
>>>>>> /dev/sdb:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 6
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Tue Mar 31 23:08:02 2009
>>>>>> State : clean
>>>>>> Active Devices : 5
>>>>>> Working Devices : 6
>>>>>> Failed Devices : 1
>>>>>> Spare Devices : 1
>>>>>> Checksum : a4fbb93a - correct
>>>>>> Events : 8430
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 6 8 16 6 spare /dev/sdb
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 8 64 1 active sync /dev/sde
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 80 5 active sync
>>>>>> 6 6 8 16 6 spare /dev/sdb
>>>>>> /dev/sdc:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Sun Jul 12 11:31:47 2009
>>>>>> State : clean
>>>>>> Active Devices : 4
>>>>>> Working Devices : 4
>>>>>> Failed Devices : 2
>>>>>> Spare Devices : 0
>>>>>> Checksum : a59452db - correct
>>>>>> Events : 580158
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 2 8 32 2 active sync /dev/sdc
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 0 0 1 faulty removed
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 96 5 active sync
>>>>>> /dev/sdd:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Sun Jul 12 11:31:47 2009
>>>>>> State : clean
>>>>>> Active Devices : 4
>>>>>> Working Devices : 4
>>>>>> Failed Devices : 2
>>>>>> Spare Devices : 0
>>>>>> Checksum : a59452ed - correct
>>>>>> Events : 580158
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 3 8 48 3 active sync /dev/sdd
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 0 0 1 faulty removed
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 96 5 active sync
>>>>>>
>>>>>> --
>>>>>> Carl K
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Majed B.
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Carl K
>>>>
>>>
>>>
>>>
>>> --
>>> Majed B.
>>>
>>>
>>
>>
>>
>> --
>> Carl K
>>
>
>
>
> --
> Majed B.
>
>
--
Carl K