From: Carl Karsten <carl@personnelware.com>
To: "Majed B." <majedb@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: reconstruct raid superblock
Date: Thu, 17 Dec 2009 10:17:13 -0600
Message-ID: <549053140912170817n3b3818fan5ad483b520d0cb53@mail.gmail.com>
In-Reply-To: <70ed7c3e0912170740y771602bfqb682c39887edbd7b@mail.gmail.com>
On Thu, Dec 17, 2009 at 9:40 AM, Majed B. <majedb@gmail.com> wrote:
> I'm assuming you ran the command with the 2 external disks added to the array.
> One question before proceeding: When you removed these 2 externals,
> were there any changes on the array? Did you add/delete/modify any
> files or rename them?
Shut down the box, unplugged the drives, booted the box.
>
> What do you mean the 2 externals have had mkfs run on them? Is this
> AFTER you removed the disks from the array? If so, they're useless
> now.
That's what I figured.
>
> The names of the disks have changed, and their names in the superblock
> are different from what udev is now reporting:
> sde was previously named sdg
> sdf is sdf
> sdb is sdb
> sdc is sdc
> sdd is sdd
>
> According to the listing above, you have superblock info on: sdb, sdc,
> sdd, sde, sdf; 5 disks out of 7 -- one of which is a spare.
> sdb was a spare and according to other disks' info, it didn't resync
> so it has no useful data to aid in recovery.
> So you're left with 4 out of 6 disks + 1 spare.
>
> You have a chance of running the array in degraded mode using sde,
> sdc, sdd, sdf, assuming these disks are sane.
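(A sanity check worth doing before forcing anything: compare event counters
and roles across the members to see which disks agree -- something like

  mdadm -E /dev/sd[b-f] | grep -E '^/dev|Events|this'

where the disk range is whatever actually has superblocks; the members
sharing the highest event count are the ones to assemble from.)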
>
> Try running this command: mdadm -Af /dev/md0 /dev/sde /dev/sdc /dev/sdd /dev/sdf
mdadm: forcing event count in /dev/sdf(1) from 97276 upto 580158
mdadm: /dev/md0 has been started with 4 drives (out of 6).
>
> then check: cat /proc/mdstat
root@dhcp128:~# cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
[raid4] [raid10]
md0 : active raid6 sdf[1] sde[5] sdd[3] sdc[2]
5860549632 blocks level 6, 64k chunk, algorithm 2 [6/4] [_UUU_U]
unused devices: <none>
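(Not captured here, but the usual companion to /proc/mdstat would be

  mdadm --detail /dev/md0

which shows the array state, UUID, and which slots are missing in one place.)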
>
> If the remaining disks are sane, it should run the array in degraded
> mode. Hopefully.
dmesg shows:
[31828.093953] md: md0 stopped.
[31838.929607] md: bind<sdc>
[31838.931455] md: bind<sdd>
[31838.932073] md: bind<sde>
[31838.932376] md: bind<sdf>
[31838.973346] raid5: device sdf operational as raid disk 1
[31838.973349] raid5: device sde operational as raid disk 5
[31838.973351] raid5: device sdd operational as raid disk 3
[31838.973353] raid5: device sdc operational as raid disk 2
[31838.973787] raid5: allocated 6307kB for md0
[31838.974165] raid5: raid level 6 set md0 active with 4 out of 6 devices, algorithm 2
[31839.066014] RAID5 conf printout:
[31839.066016] --- rd:6 wd:4
[31839.066018] disk 1, o:1, dev:sdf
[31839.066020] disk 2, o:1, dev:sdc
[31839.066022] disk 3, o:1, dev:sdd
[31839.066024] disk 5, o:1, dev:sde
[31839.066066] md0: detected capacity change from 0 to 6001202823168
[31839.066188] md0: p1
root@dhcp128:/media# fdisk -l /dev/md0
Disk /dev/md0: 6001.2 GB, 6001202823168 bytes
255 heads, 63 sectors/track, 729604 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x96af0591
Device Boot Start End Blocks Id System
/dev/md0p1 1 182401 1465136001 83 Linux
and now the bad news:
mount /dev/md0p1 md0p1
mount: wrong fs type, bad option, bad superblock on /dev/md0p1
[32359.038796] raid5: Disk failure on sde, disabling device.
[32359.038797] raid5: Operation continuing on 3 devices.
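(One read-only thing that could still be tried -- assuming the filesystem
on md0p1 was ext2/ext3 -- is a check against a backup superblock:

  mke2fs -n /dev/md0p1            # dry run: prints backup superblock locations, writes nothing
  e2fsck -n -b 32768 /dev/md0p1   # read-only fsck using a backup superblock

With sde dropping out as well, only 3 of the 6 members are left, below the
4 a six-disk raid6 needs, so the odds aren't good.)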
>
> If that doesn't work, I'd say you're better off scrapping & restoring
> your data back onto a new array rather than wasting more time fiddling
> with superblocks.
Yep. Starting that now.
This is exactly what I was expecting - very few things to try (basically one)
and a very clear pass/fail test.
Thanks for helping me get through this.
>
> On Thu, Dec 17, 2009 at 6:06 PM, Carl Karsten <carl@personnelware.com> wrote:
>> I brought back the 2 externals, which have had mkfs run on them, but
>> maybe the extra superblocks will help (doubt it, but couldn't hurt)
>>
>> root@dhcp128:/media# mdadm -E /dev/sd[a-z]
>> mdadm: No md superblock detected on /dev/sda.
>> /dev/sdb:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 6
>> Preferred Minor : 0
>>
>> Update Time : Tue Mar 31 23:08:02 2009
>> State : clean
>> Active Devices : 5
>> Working Devices : 6
>> Failed Devices : 1
>> Spare Devices : 1
>> Checksum : a4fbb93a - correct
>> Events : 8430
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 6 8 16 6 spare /dev/sdb
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 8 64 1 active sync /dev/sde
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 80 5 active sync /dev/sdf
>> 6 6 8 16 6 spare /dev/sdb
>> /dev/sdc:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a59452db - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 2 8 32 2 active sync /dev/sdc
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sdd:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a59452ed - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 3 8 48 3 active sync /dev/sdd
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sde:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 4
>> Preferred Minor : 0
>>
>> Update Time : Sun Jul 12 11:31:47 2009
>> State : clean
>> Active Devices : 4
>> Working Devices : 4
>> Failed Devices : 2
>> Spare Devices : 0
>> Checksum : a5945321 - correct
>> Events : 580158
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 5 8 96 5 active sync /dev/sdg
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 0 0 1 faulty removed
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> /dev/sdf:
>> Magic : a92b4efc
>> Version : 00.90.00
>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>> Creation Time : Wed Mar 25 21:04:08 2009
>> Raid Level : raid6
>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>> Raid Devices : 6
>> Total Devices : 5
>> Preferred Minor : 0
>>
>> Update Time : Wed Apr 8 11:13:32 2009
>> State : clean
>> Active Devices : 5
>> Working Devices : 5
>> Failed Devices : 1
>> Spare Devices : 0
>> Checksum : a5085415 - correct
>> Events : 97276
>>
>> Chunk Size : 64K
>>
>> Number Major Minor RaidDevice State
>> this 1 8 80 1 active sync /dev/sdf
>>
>> 0 0 8 0 0 active sync /dev/sda
>> 1 1 8 80 1 active sync /dev/sdf
>> 2 2 8 32 2 active sync /dev/sdc
>> 3 3 8 48 3 active sync /dev/sdd
>> 4 4 0 0 4 faulty removed
>> 5 5 8 96 5 active sync /dev/sdg
>> mdadm: No md superblock detected on /dev/sdg.
>>
>>
>>
>> On Thu, Dec 17, 2009 at 8:39 AM, Majed B. <majedb@gmail.com> wrote:
>>> You can't just copy a superblock from another disk and change a few
>>> bytes to make it identify a different disk.
>>>
>>> To check which disks belong to an array, do this:
>>> mdadm -E /dev/sd[a-z]
>>>
>>> The disks that you get info from belong to the existing array(s).
>>>
>>> In the first email you sent you included an examine output for one of
>>> the disks that listed another disk as a spare (sdb). The output of
>>> examine should shed more light.
>>>
>>> On Thu, Dec 17, 2009 at 5:15 PM, Carl Karsten <carl@personnelware.com> wrote:
>>>> On Thu, Dec 17, 2009 at 4:35 AM, Majed B. <majedb@gmail.com> wrote:
>>>>> I have misread the information you've provided, so allow me to correct myself:
>>>>>
>>>>> You're running a RAID6 array, with 2 disks lost/failed. Any disk loss
>>>>> after that will cause data loss since you have no redundancy (2 disks
>>>>> died).
>>>>
>>>> Right - but I am not sure whether data loss has occurred, where by "data"
>>>> I mean the data stored on the raid, not the raid metadata.
>>>>
>>>> My guess is I need to copy the raid superblock from one of the other
>>>> disks (say sdb), find the bytes that identify the disk, and change them
>>>> from sdb to sda.
>>>>
>>>>>
>>>>> I believe it's still possible to reassemble the array, but you only
>>>>> need to remove the MBR. See this page for information:
>>>>> http://www.cyberciti.biz/faq/linux-how-to-uninstall-grub/
>>>>> dd if=/dev/zero of=/dev/sdX bs=446 count=1
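(If anyone tries this, saving the original sector first makes it
reversible, e.g.

  dd if=/dev/sdX of=/root/sdX-first-sector.bak bs=512 count=1

where the backup path is just an example.)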
>>>>>
>>>>> Before proceeding, provide the output of cat /proc/mdstat
>>>>
>>>> root@dhcp128:~# cat /proc/mdstat
>>>> Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5]
>>>> [raid4] [raid10]
>>>> unused devices: <none>
>>>>
>>>>
>>>>> Is the array currently running degraded or is it suspended?
>>>>
>>>> um, not running, not sure I would call it suspended.
>>>>
>>>>> What happened to the spare disk assigned?
>>>>
>>>> I don't understand.
>>>>
>>>>> Did it finish resyncing
>>>>> before you installed grub on the wrong disk?
>>>>
>>>> I think so.
>>>>
>>>> I am fairly sure I could assemble the array before I installed grub.
>>>>
>>>>>
>>>>> On Thu, Dec 17, 2009 at 8:21 AM, Majed B. <majedb@gmail.com> wrote:
>>>>>> If your other disks are sane and you are able to run a degraded array, then
>>>>>> you can remove grub using dd and then re-add the disk to the array.
>>>>>>
>>>>>> To clear the first 1MB of the disk:
>>>>>> dd if=/dev/zero of=/dev/sdx bs=1M count=1
>>>>>> Replace sdx with the disk name that has grub.
>>>>>>
>>>>>> On Dec 17, 2009 6:53 AM, "Carl Karsten" <carl@personnelware.com> wrote:
>>>>>>
>>>>>> I took over a box that had 1 IDE boot drive and 6 SATA raid drives (4
>>>>>> internal, 2 external). I believe the 2 externals were redundant, so
>>>>>> they could be removed. So I did, and mkfs-ed them. Then I installed
>>>>>> Ubuntu to the IDE drive and installed grub to sda, which turns out to
>>>>>> be the first SATA disk. That would be fine if the raid were on sda1,
>>>>>> but it is on sda, and now the raid won't assemble. No surprise, and I
>>>>>> do have a backup of the data spread across 5 external drives. But
>>>>>> before I abandon the array, I am wondering if I can fix it by
>>>>>> recreating mdadm's metadata on sda, given I have sd[bcd] to work with.
>>>>>>
>>>>>> any suggestions?
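(For the record, the two usual options when an md superblock is lost are a
forced assemble from the surviving members, or -- as a last resort --
recreating the metadata with the exact original geometry, roughly:

  mdadm --assemble --force /dev/md0 <surviving members>
  mdadm --create /dev/md0 --metadata=0.90 --level=6 --raid-devices=6 \
        --chunk=64 --assume-clean <disks in original order, "missing" for absent ones>

The second only rewrites metadata but is easy to get wrong; the forced
assemble near the top of this message is what was actually used.)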
>>>>>>
>>>>>> root@dhcp128:~# mdadm --examine /dev/sd[abcd]
>>>>>> mdadm: No md superblock detected on /dev/sda.
>>>>>> /dev/sdb:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 6
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Tue Mar 31 23:08:02 2009
>>>>>> State : clean
>>>>>> Active Devices : 5
>>>>>> Working Devices : 6
>>>>>> Failed Devices : 1
>>>>>> Spare Devices : 1
>>>>>> Checksum : a4fbb93a - correct
>>>>>> Events : 8430
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 6 8 16 6 spare /dev/sdb
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 8 64 1 active sync /dev/sde
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 80 5 active sync
>>>>>> 6 6 8 16 6 spare /dev/sdb
>>>>>> /dev/sdc:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Sun Jul 12 11:31:47 2009
>>>>>> State : clean
>>>>>> Active Devices : 4
>>>>>> Working Devices : 4
>>>>>> Failed Devices : 2
>>>>>> Spare Devices : 0
>>>>>> Checksum : a59452db - correct
>>>>>> Events : 580158
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 2 8 32 2 active sync /dev/sdc
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 0 0 1 faulty removed
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 96 5 active sync
>>>>>> /dev/sdd:
>>>>>> Magic : a92b4efc
>>>>>> Version : 00.90.00
>>>>>> UUID : 8d0cf436:3fc2d2ef:93d71b24:b036cc6b
>>>>>> Creation Time : Wed Mar 25 21:04:08 2009
>>>>>> Raid Level : raid6
>>>>>> Used Dev Size : 1465137408 (1397.26 GiB 1500.30 GB)
>>>>>> Array Size : 5860549632 (5589.06 GiB 6001.20 GB)
>>>>>> Raid Devices : 6
>>>>>> Total Devices : 4
>>>>>> Preferred Minor : 0
>>>>>>
>>>>>> Update Time : Sun Jul 12 11:31:47 2009
>>>>>> State : clean
>>>>>> Active Devices : 4
>>>>>> Working Devices : 4
>>>>>> Failed Devices : 2
>>>>>> Spare Devices : 0
>>>>>> Checksum : a59452ed - correct
>>>>>> Events : 580158
>>>>>>
>>>>>> Chunk Size : 64K
>>>>>>
>>>>>> Number Major Minor RaidDevice State
>>>>>> this 3 8 48 3 active sync /dev/sdd
>>>>>>
>>>>>> 0 0 8 0 0 active sync /dev/sda
>>>>>> 1 1 0 0 1 faulty removed
>>>>>> 2 2 8 32 2 active sync /dev/sdc
>>>>>> 3 3 8 48 3 active sync /dev/sdd
>>>>>> 4 4 0 0 4 faulty removed
>>>>>> 5 5 8 96 5 active sync
>>>>>>
>>>>>> --
>>>>>> Carl K
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Majed B.
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Carl K
>>>>
>>>
>>>
>>>
>>> --
>>> Majed B.
>>>
>>>
>>
>>
>>
>> --
>> Carl K
>>
>
>
>
> --
> Majed B.
>
>
--
Carl K