* Recovering from two raid superblocks on the same disks
From: Jeff Johnson @ 2012-05-28 7:14 UTC
To: linux-raid
Greetings,
I am looking at a unique situation and trying to successfully recover
1TB of very critical data.
The md raid in question is a 12-drive RAID-10 shared between two
identical nodes via a SAS link. Originally the 12 drives were configured
as two six-drive RAID-10 volumes using the entire disk device (no
partitions on the member drives). That configuration was later scrapped
in favor of a single 12-drive RAID-10; in the new configuration a single
partition was created on each drive and that partition, rather than the
whole disk, was used as the RAID member device (sdb1 vs sdb).
One of the systems still had the old two six-drive RAID-10 mdadm.conf
file in /etc. Due to a power outage both systems went down and then
rebooted. When that system, the one with the old mdadm.conf file, came
back up, md referenced the file, saw the intact old superblocks at the
beginning of the drives, and started an assemble and resync of those two
six-drive RAID-10 volumes. The resync process got to 40% before it was
stopped.
The other system managed to enumerate the drives and see the partition
maps prior to the other node assembling the old superblock config. I can
still see the newer md superblocks that start on the partition boundary
rather than the beginning of the physical drive.
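One read-only way to tell the two generations of metadata apart from the
surviving node is to examine the whole-disk device and the partition
separately; a quick sketch (sdc is just an example member, and the UUID
and Creation Time fields are what distinguish old from new):

mdadm --examine /dev/sdc     # whatever metadata answers on the whole-disk device
mdadm --examine /dev/sdc1    # whatever metadata answers on the partition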
It appears that md's overwrite protection was effectively circumvented:
the old superblocks matched the old mdadm.conf file, and md never looked
for the conflicting superblocks at the partition boundaries.
Both versions, old and new, were RAID-10. It appears that the errant
resync of the old configuration didn't corrupt the newer RAID config,
since the drives were allocated in the same order and the same drives
were paired as mirrors in both the old and new configs. I am guessing
that, because the striping layer is RAID-0, there was no stripe parity
to recalculate and the data on the drives was left uncorrupted. This is
conjecture on my part.
Old config:
RAID-10, /dev/md0, /dev/sd[bcdefg]
RAID-10, /dev/md1, /dev/sd[hijklm]
New config:
RAID-10, /dev/md0, /dev/sd[bcdefghijklm]1
It appears that the old superblock survived in the ~17KB gap between
the physical start of the disk and the start boundary of partition 1,
where the new superblock was written.
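For what it's worth, the size of that gap can be confirmed from the
partition table; a read-only sketch (sdc as an example drive, sector
units so the offset is unambiguous):

parted /dev/sdc unit s print     # start sector of partition 1, in 512-byte sectors
fdisk -lu /dev/sdc               # same information from fdisk, if preferred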
I was able to still see the partitions on the other node. I was able to
read the new config superblocks from 11 of the 12 drives. UUIDs, state,
all seem to be correct.
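Before going any further it may be worth capturing that state for all
twelve members in one pass; a harmless read-only loop along these lines
(the device list is taken from the new config, and sdb1 is expected to
report no superblock):

for d in /dev/sd[bcdefghijklm]1; do
    echo "== $d"
    mdadm --examine "$d" | egrep 'UUID|Events|Update Time'
done

Eleven matching UUIDs and event counts, plus an error for sdb1, would
confirm the picture described above.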
Three questions:
1) Has anyone seen a situation like this before?
2) Is it possible that, since the mirrored pairs were allocated in the
same order, the data was not overwritten?
3) What is the best way to assemble and run a 12-drive RAID-10 with
member drive 0 (sdb1) seemingly blank (no superblock)?
The current state of the 12-drive volume is: (note: sdb1 has no
superblock but the drive is physically fine)
/dev/sdc1:
Magic : a92b4efc
Version : 0.90.00
UUID : 852267e0:095a343c:f4f590ad:3333cb43
Creation Time : Tue Feb 14 18:56:08 2012
Raid Level : raid10
Used Dev Size : 586059136 (558.91 GiB 600.12 GB)
Array Size : 3516354816 (3353.46 GiB 3600.75 GB)
Raid Devices : 12
Total Devices : 12
Preferred Minor : 0
Update Time : Sat May 26 12:05:11 2012
State : clean
Active Devices : 12
Working Devices : 12
Failed Devices : 0
Spare Devices : 0
Checksum : 21bca4ce - correct
Events : 26
Layout : near=2
Chunk Size : 32K
Number Major Minor RaidDevice State
this 1 8 33 1 active sync /dev/sdc1
0 0 8 17 0 active sync
1 1 8 33 1 active sync /dev/sdc1
2 2 8 49 2 active sync /dev/sdd1
3 3 8 65 3 active sync /dev/sde1
4 4 8 81 4 active sync /dev/sdf1
5 5 8 97 5 active sync /dev/sdg1
6 6 8 113 6 active sync /dev/sdh1
7 7 8 129 7 active sync /dev/sdi1
8 8 8 145 8 active sync /dev/sdj1
9 9 8 161 9 active sync /dev/sdk1
10 10 8 177 10 active sync /dev/sdl1
11 11 8 193 11 active sync /dev/sdm1
I could just run 'mdadm -A --uuid=852267e0095a343cf4f590ad3333cb43
/dev/sd[bcdefghijklm]1 --run' but I feel better seeking advice and
consensus before doing anything.
I have never seen a situation like this before. It seems like there
might be one correct way to get the data back and many ways of losing
the data for good. Any advice or feedback is greatly appreciated!
--Jeff
--
------------------------------
Jeff Johnson
Manager
Aeon Computing
jeff.johnson@aeoncomputing.com
www.aeoncomputing.com
t: 858-412-3810 x101 f: 858-412-3845
m: 619-204-9061
4905 Morena Boulevard, Suite 1313 - San Diego, CA 92117
* Re: Recovering from two raid superblocks on the same disks
From: NeilBrown @ 2012-05-28 11:40 UTC
To: Jeff Johnson; +Cc: linux-raid
On Mon, 28 May 2012 00:14:55 -0700 Jeff Johnson
<jeff.johnson@aeoncomputing.com> wrote:
> Greetings,
>
> I am looking at a unique situation and trying to successfully recover
> 1TB of very critical data.
>
> The md raid in question is a 12-drive RAID-10 shared between two
> identical nodes via a SAS link. Originally the 12 drives were configured
> as two six-drive RAID-10 volumes using the entire disk device (no
> partitions on the member drives). That configuration was later scrapped
> in favor of a single 12-drive RAID-10; in the new configuration a single
> partition was created on each drive and that partition, rather than the
> whole disk, was used as the RAID member device (sdb1 vs sdb).
>
> One of the systems still had the old two six-drive RAID-10 mdadm.conf
> file in /etc. Due to a power outage both systems went down and then
> rebooted. When that system, the one with the old mdadm.conf file, came
> back up, md referenced the file, saw the intact old superblocks at the
> beginning of the drives, and started an assemble and resync of those two
> six-drive RAID-10 volumes. The resync process got to 40% before it was
> stopped.
>
> The other system managed to enumerate the drives and see the partition
> maps prior to the other node assembling the old superblock config. I can
> still see the newer md superblocks that start on the partition boundary
> rather than the beginning of the physical drive.
>
> It appears that md's overwrite protection was effectively circumvented:
> the old superblocks matched the old mdadm.conf file, and md never looked
> for the conflicting superblocks at the partition boundaries.
>
> Both versions, old and new, were RAID-10. It appears that the errant
> resync of the old configuration didn't corrupt the newer RAID config,
> since the drives were allocated in the same order and the same drives
> were paired as mirrors in both the old and new configs. I am guessing
> that, because the striping layer is RAID-0, there was no stripe parity
> to recalculate and the data on the drives was left uncorrupted. This is
> conjecture on my part.
>
> Old config:
> RAID-10, /dev/md0, /dev/sd[bcdefg]
> RAID-10, /dev/md1, /dev/sd[hijklm]
>
> New config:
> RAID-10, /dev/md0, /dev/sd[bcdefghijklm]1
>
> It appears that the old superblock survived in the ~17KB gap between
> the physical start of the disk and the start boundary of partition 1,
> where the new superblock was written.
>
> I was able to still see the partitions on the other node. I was able to
> read the new config superblocks from 11 of the 12 drives. UUIDs, state,
> all seem to be correct.
>
> Three questions:
>
> 1) Has anyone seen a situation like this before?
I haven't.
> 2) Is it possible that, since the mirrored pairs were allocated in the
> same order, the data was not overwritten?
Certainly possible.
> 3) What is the best way to assemble and run a 12-drive RAID-10 with
> member drive 0 (sdb1) seemingly blank (no superblock)?
It would be good to work out exactly why sdb1 is blank, as knowing that
might provide a useful insight into the overall situation. However, it
probably isn't critical.
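One read-only way to poke at that question, assuming the usual 0.90
metadata placement (a 64KB-aligned block within the last 64-128KB of the
device); this is only a sketch and nothing in it writes to the disk:

size=$(blockdev --getsz /dev/sdb1)        # size in 512-byte sectors
sb=$(( (size & ~127) - 128 ))             # 64KB-aligned offset near the end
dd if=/dev/sdb1 bs=512 skip=$sb count=8 2>/dev/null | hexdump -C | head
# An intact 0.90 superblock starts with the magic bytes fc 4e 2b a9
# (a92b4efc stored little-endian); all zeros would suggest it was wiped.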
The --assemble command you list below should be perfectly safe and allow
read access without risking any corruption.
If you
echo 1 > /sys/module/md_mod/parameters/start_ro
then it will be even safer (if that is possible). It will certainly not write
anything until you write to the array yourself.
You can then 'fsck -n', 'mount -o ro' and copy any super-critical files before
proceeding.
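Putting those steps together, the whole read-only pass might look
something like this (a sketch; the md device name and /mnt mount point
are assumptions, and the member list is the one from the new config):

echo 1 > /sys/module/md_mod/parameters/start_ro   # newly started arrays stay read-only
mdadm -A /dev/md0 --uuid=852267e0:095a343c:f4f590ad:3333cb43 --run /dev/sd[bcdefghijklm]1
cat /proc/mdstat                                  # expect md0 active with 11 of 12 members
fsck -n /dev/md0                                  # report-only check, changes nothing
mount -o ro /dev/md0 /mnt                         # copy the critical data off first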
I would then probably
echo check > /sys/block/md0/md/sync_action
just to see if everything is ok (low mismatch count expected).
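The result of that check can be read back afterwards; a small sketch:

cat /proc/mdstat                      # shows the check progressing
cat /sys/block/md0/md/mismatch_cnt    # 0, or very close to it, is the hoped-for answer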
I also recommend removing the old superblocks.
mdadm --zero-superblock /dev/sdc --metadata=0.90
will look for a 0.90 superblock on sdc and if it finds one, it will erase it.
You should first double check with
mdadm --examine --metadata=0.90 /dev/sdc
to ensure that is the one you want to remove
(without the --metadata=0.90 it will look for other metadata, and you might
not want it to do that without you checking first).
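A cautious sequence for that cleanup might look like the following. It
is only a sketch: the whole-disk device list is assumed from the old
config, and the important part is confirming that the UUID reported on
each whole-disk device belongs to the old arrays, not to the new
852267e0:095a343c:f4f590ad:3333cb43 array, before erasing anything.

# 1) Verify what 0.90 metadata each whole-disk device still reports.
for d in /dev/sd[bcdefghijklm]; do
    echo "== $d"
    mdadm --examine --metadata=0.90 "$d" | egrep 'UUID|Creation Time'
done
# 2) Only for drives whose reported UUID is confirmed to be the OLD array's:
mdadm --zero-superblock --metadata=0.90 /dev/sdc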
Good luck,
NeilBrown
>
> The current state of the 12-drive volume is: (note: sdb1 has no
> superblock but the drive is physically fine)
>
> /dev/sdc1:
> Magic : a92b4efc
> Version : 0.90.00
> UUID : 852267e0:095a343c:f4f590ad:3333cb43
> Creation Time : Tue Feb 14 18:56:08 2012
> Raid Level : raid10
> Used Dev Size : 586059136 (558.91 GiB 600.12 GB)
> Array Size : 3516354816 (3353.46 GiB 3600.75 GB)
> Raid Devices : 12
> Total Devices : 12
> Preferred Minor : 0
>
> Update Time : Sat May 26 12:05:11 2012
> State : clean
> Active Devices : 12
> Working Devices : 12
> Failed Devices : 0
> Spare Devices : 0
> Checksum : 21bca4ce - correct
> Events : 26
>
> Layout : near=2
> Chunk Size : 32K
>
> Number Major Minor RaidDevice State
> this 1 8 33 1 active sync /dev/sdc1
>
> 0 0 8 17 0 active sync
> 1 1 8 33 1 active sync /dev/sdc1
> 2 2 8 49 2 active sync /dev/sdd1
> 3 3 8 65 3 active sync /dev/sde1
> 4 4 8 81 4 active sync /dev/sdf1
> 5 5 8 97 5 active sync /dev/sdg1
> 6 6 8 113 6 active sync /dev/sdh1
> 7 7 8 129 7 active sync /dev/sdi1
> 8 8 8 145 8 active sync /dev/sdj1
> 9 9 8 161 9 active sync /dev/sdk1
> 10 10 8 177 10 active sync /dev/sdl1
> 11 11 8 193 11 active sync /dev/sdm1
>
> I could just run 'mdadm -A --uuid=852267e0095a343cf4f590ad3333cb43
> /dev/sd[bcdefghijklm]1 --run' but I feel better seeking advice and
> consensus before doing anything.
>
> I have never seen a situation like this before. It seems like there
> might be one correct way to get the data back and many ways of losing
> the data for good. Any advice or feedback is greatly appreciated!
>
> --Jeff
>