From: NeilBrown <neilb@suse.de>
To: "Muskiewicz, Stephen C" <Stephen_Muskiewicz@uml.edu>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Need help recovering RAID5 array
Date: Tue, 9 Aug 2011 09:12:14 +1000
Message-ID: <20110809091214.4a830696@notabene.brown>
In-Reply-To: <D32562736F94C445958A7CC900A95568265503@PORSCHE.fs.uml.edu>
On Mon, 8 Aug 2011 17:41:34 +0000 "Muskiewicz, Stephen C"
<Stephen_Muskiewicz@uml.edu> wrote:
> I tried creating a symlink /dev/md/tsongas_archive to /dev/md/51 but still got the "no suitable drives" error when trying to assemble (using both /dev/md/51 or /dev/md/tsongas_archive)
>
> >
> > When you can access the server again, could you report:
> >
> > cat /proc/mdstat
> > grep md /proc/partitions
> > ls -l /dev/md*
> >
> > and maybe
> > mdadm -Ds
> > mdadm -Es
> > cat /etc/mdadm.conf
> >
> > just for completeness.
> >
> >
> > It certainly looks like your data is all there but maybe not appearing
> > exactly where you expect it.
> >
>
> Here it all is:
>
> [root@libthumper1 ~]# cat /proc/mdstat
> Personalities : [raid1] [raid6] [raid5] [raid4]
> md53 : active raid5 sdae1[0] sds1[8](S) sdai1[9](S) sdk1[10] sdam1[6] sdo1[5] sdau1[4] sdaq1[3] sdw1[2] sdaa1[1]
> 3418686208 blocks super 1.0 level 5, 128k chunk, algorithm 2 [8/8] [UUUUUUUU]
>
> md52 : active raid5 sdad1[0] sdf1[11](S) sdz1[10](S) sdb1[12] sdn1[8] sdj1[7] sdal1[6] sdah1[5] sdat1[4] sdap1[3] sdv1[2] sdr1[1]
> 4395453696 blocks super 1.0 level 5, 128k chunk, algorithm 2 [10/10] [UUUUUUUUUU]
>
> md0 : active raid1 sdac2[0] sdy2[1]
> 480375552 blocks [2/2] [UU]
>
> unused devices: <none>
>
> [root@libthumper1 ~]# grep md /proc/partitions
> 9 0 480375552 md0
> 9 52 4395453696 md52
> 9 53 3418686208 md53
>
>
> [root@libthumper1 ~]# ls -l /dev/md*
> brw-r----- 1 root disk 9, 0 Aug 4 15:25 /dev/md0
> lrwxrwxrwx 1 root root 5 Aug 4 15:25 /dev/md51 -> md/51
> lrwxrwxrwx 1 root root 5 Aug 4 15:25 /dev/md52 -> md/52
> lrwxrwxrwx 1 root root 5 Aug 4 15:25 /dev/md53 -> md/53
>
> /dev/md:
> total 0
> brw-r----- 1 root disk 9, 51 Aug 4 15:25 51
> brw-r----- 1 root disk 9, 52 Aug 4 15:25 52
> brw-r----- 1 root disk 9, 53 Aug 4 15:25 53
>
> [root@libthumper1 ~]# mdadm -Ds
> ARRAY /dev/md0 level=raid1 num-devices=2 metadata=0.90 UUID=e30f5b25:6dc28a02:1b03ab94:da5913ed
> ARRAY /dev/md52 level=raid5 num-devices=10 metadata=1.00 spares=2 name=vmware_storage UUID=c436b591:01a4be5f:2736d7dd:3b97d872
> ARRAY /dev/md53 level=raid5 num-devices=8 metadata=1.00 spares=2 name=backup_mirror UUID=9bb89570:675f47be:2fe2f481:ebc33388
>
> [root@libthumper1 ~]# mdadm -Es
> ARRAY /dev/md2 level=raid1 num-devices=6 UUID=d08b45a4:169e4351:02cff74a:c70fcb00
> ARRAY /dev/md0 level=raid1 num-devices=2 UUID=e30f5b25:6dc28a02:1b03ab94:da5913ed
> ARRAY /dev/md/tsongas_archive level=raid5 metadata=1.0 num-devices=8 UUID=41aa414e:cfe1a5ae:3768e4ef:0084904e name=tsongas_archive
> ARRAY /dev/md/vmware_storage level=raid5 metadata=1.0 num-devices=10 UUID=c436b591:01a4be5f:2736d7dd:3b97d872 name=vmware_storage
> ARRAY /dev/md/backup_mirror level=raid5 metadata=1.0 num-devices=8 UUID=9bb89570:675f47be:2fe2f481:ebc33388 name=backup_mirror
>
> [root@libthumper1 ~]# cat /etc/mdadm.conf
>
> # mdadm.conf written out by anaconda
> DEVICE partitions
> MAILADDR sysadmins
> MAILFROM root@libthumper1.uml.edu
> ARRAY /dev/md0 level=raid1 num-devices=2 uuid=e30f5b25:6dc28a02:1b03ab94:da5913ed
> ARRAY /dev/md/51 level=raid5 num-devices=8 spares=2 name=tsongas_archive uuid=41aa414e:cfe1a5ae:3768e4ef:0084904e
> ARRAY /dev/md/52 level=raid5 num-devices=10 spares=2 name=vmware_storage uuid=c436b591:01a4be5f:2736d7dd:3b97d872
> ARRAY /dev/md/53 level=raid5 num-devices=8 spares=2 name=backup_mirror uuid=9bb89570:675f47be:2fe2f481:ebc33388
>
> It looks like the md51 device isn't appearing in /proc/partitions, not sure why that is?
>
> I also just noticed the /dev/md2 that appears in the mdadm -Es output, not sure what that is but I don't recognize it as anything that was previously on that box. (There is no /dev/md2 device file). Not sure if that is related at all or just a red herring...
>
> For good measure, here's some actual mdadm -E output for the specific drives (I won't include all as they all seem to be about the same):
>
> [root@libthumper1 ~]# mdadm -E /dev/sd[qui]1
> /dev/sdi1:
> Magic : a92b4efc
> Version : 1.0
> Feature Map : 0x0
> Array UUID : 41aa414e:cfe1a5ae:3768e4ef:0084904e
> Name : tsongas_archive
> Creation Time : Thu Feb 24 11:43:37 2011
> Raid Level : raid5
> Raid Devices : 8
>
> Avail Dev Size : 976767728 (465.76 GiB 500.11 GB)
> Array Size : 6837372416 (3260.31 GiB 3500.73 GB)
> Used Dev Size : 976767488 (465.76 GiB 500.10 GB)
> Super Offset : 976767984 sectors
> State : clean
> Device UUID : 750e6410:661d4838:0a5f7581:7c110cf1
>
> Update Time : Thu Aug 4 06:41:23 2011
> Checksum : 20bb0567 - correct
> Events : 18446744073709551615
...
>
> Is that huge number for the event count perhaps a problem?
Could be. That number is 0xffff,ffff,ffff,ffff, i.e. 2^64-1.
It cannot get any bigger than that.
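
Just to illustrate where the "-1" in your --force output below comes from (a
tiny standalone sketch, not mdadm code): the events field is an unsigned 64-bit
counter, and the same all-ones bit pattern reads as -1 if it is ever treated as
a signed value:

/* why an events count of 2^64-1 can show up as "-1" */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	uint64_t events = UINT64_MAX;	/* 0xffffffffffffffff, i.e. 2^64-1 */

	printf("unsigned: %llu\n", (unsigned long long)events);
	printf("signed:   %lld\n", (long long)events);	/* -1 on two's-complement machines */
	return 0;
}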
> >
>
> OK so I tried with the --force and here's what I got (BTW the device names are different from my original email since I didn't have access to the server before, but I used the real device names exactly as when I originally created the array, sorry for any confusion)
>
> mdadm -A /dev/md/51 --force /dev/sdq1 /dev/sdu1 /dev/sdao1 /dev/sdas1 /dev/sdag1 /dev/sdi1 /dev/sdm1 /dev/sda1 /dev/sdak1 /dev/sde1
>
> mdadm: forcing event count in /dev/sdq1(0) from -1 upto -1
> mdadm: forcing event count in /dev/sdu1(1) from -1 upto -1
> mdadm: forcing event count in /dev/sdao1(2) from -1 upto -1
> mdadm: forcing event count in /dev/sdas1(3) from -1 upto -1
> mdadm: forcing event count in /dev/sdag1(4) from -1 upto -1
> mdadm: forcing event count in /dev/sdi1(5) from -1 upto -1
> mdadm: forcing event count in /dev/sdm1(6) from -1 upto -1
> mdadm: forcing event count in /dev/sda1(7) from -1 upto -1
> mdadm: failed to RUN_ARRAY /dev/md/51: Input/output error
and sometimes "2^64-1" looks like "-1".
We just need to replace that "-1" with a more useful number.
It looks like the "--force" might have made a little bit of a mess, but we
should be able to recover it.
Could you apply the following patch, build a new 'mdadm', and then run:

  mdadm -S /dev/md/51
  mdadm -A /dev/md/51 --update=summaries -vv /dev/sdq1 /dev/sdu1 /dev/sdao1 /dev/sdas1 /dev/sdag1 /dev/sdi1 /dev/sdm1 /dev/sda1 /dev/sdak1 /dev/sde1

If that doesn't work, repeat the same two commands but add "--force" to the
second. Make sure you keep the "-vv" in both cases, then report the results.
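
(A minimal sketch of the whole sequence, assuming you work in the unpacked
mdadm source tree, where "make" leaves the freshly built binary as ./mdadm;
the patch file name here is just a placeholder:

  patch -p1 < events.patch
  make
  ./mdadm -S /dev/md/51
  ./mdadm -A /dev/md/51 --update=summaries -vv /dev/sdq1 /dev/sdu1 /dev/sdao1 /dev/sdas1 /dev/sdag1 /dev/sdi1 /dev/sdm1 /dev/sda1 /dev/sdak1 /dev/sde1

Be sure to run the freshly built ./mdadm rather than any copy already installed
on the system, otherwise the new "summaries" handling won't be used.)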
I wonder how the event count got that high. There aren't enough seconds
since the birth of the universe for it to have happened naturally...
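(2^64-1 is roughly 1.8x10^19, while the universe is only of the order of
4x10^17 seconds old, so even one event per second since the Big Bang would
fall short by a factor of about forty.)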
Thanks,
NeilBrown
diff --git a/super1.c b/super1.c
index 35e92a3..4a3341a 100644
--- a/super1.c
+++ b/super1.c
@@ -803,6 +803,8 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
 					   __le64_to_cpu(sb->data_size));
 	} else if (strcmp(update, "_reshape_progress")==0)
 		sb->reshape_position = __cpu_to_le64(info->reshape_progress);
+	else if (strcmp(update, "summaries") == 0)
+		sb->events = __cpu_to_le64(4);
 	else
 		rv = -1;
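
(If the assemble then succeeds and the array runs, "mdadm -E /dev/sdq1 | grep Events"
should show the count back at a small sane value, 4 as written by the patch, or a
little higher once the array has been running, instead of 2^64-1.)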