From: "Thomas J. Baker" <tjb@unh.edu>
To: Neil Brown <neilb@suse.de>
Cc: linux-raid@vger.kernel.org
Subject: Re: Any hope for a 27 disk RAID6+1HS array with four disks reporting "No md superblock detected"?
Date: Fri, 06 Feb 2009 15:32:05 -0500
Message-ID: <1233952325.9786.20.camel@localhost.localdomain>
In-Reply-To: <18827.51024.270348.20624@notabene.brown>

On Fri, 2009-02-06 at 16:14 +1100, Neil Brown wrote:
> On Wednesday February 4, tjb@unh.edu wrote:
> > Any help greatly appreciated. Here are the details:
> 
> Hmm.....
> 
> The limit on the number of devices in a 0.90 array is 27, despite the
> fact that the manual page says '28'.
> 
> And the only limit that is enforced is that the number of raid_disks
> is limited to 27.  So when you added a hot spare to your array, bad
> things started happening.
> 
> I'd better fix that code and documentation.
> 
> But the issue at the moment is fixing your array.
> It appears that all slots (0-26) are present except 
> 6,8,24
> 
> It seems likely that 
>   6 is on sdh1
>   8 is on sdj1
>  24 is on sdz1 ... or sds1.   They seem to move around a bit.
> 
> If only 2 were missing you would be able to bring the array up.
> But with 3 missing - not.
> 
> So we will need to recreate the array.  This should preserve all your
> old data.
> 
> The command you will need is
> 
> mdadm --create /dev/md0 -l6 -n27  .... list of device names.....
> 
> Getting the correct list of device names is tricky, but quite possible
> if you exercise due care.
> 
> The final list should have 27 entries, 2 of which should be the word
> "missing".
> 
> When you do this it will create a degraded array.  As the array is
> degraded, no resync will happen, so the data on the array will not be
> changed, only the metadata.
> 
> So if the list of devices turns out to be wrong, it isn't the end of
> the world.  Just stop the array and try again with a different list.
> 
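> For example, assuming the array gets created as /dev/md0, retrying with
> a different list would look something like
> 
>    mdadm --stop /dev/md0
>    mdadm --create /dev/md0 -l6 -n27  .... corrected list of device names.....
> 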
> So: how to get the list.
> Start with the output of 
>    ./examineRAIDDisks | grep -E '^(/dev|this)'
> 
> Based on your current output, the start of this will be:
> 
>                                   vvv
> /dev/sdb1:
> this     0       8       17        0      active sync   /dev/sdb1
> /dev/sdc1:
> this     1       8       33        1      active sync   /dev/sdc1
> /dev/sdd1:
> this     2       8       49        2      active sync   /dev/sdd1
> /dev/sde1:
> this     3       8       65        3      active sync   /dev/sde1
> /dev/sdf1:
> this     4       8       81        4      active sync   /dev/sdf1
> /dev/sdg1:
> this     5       8       97        5      active sync   /dev/sdg1
> /dev/sdi1:
> this     7       8      129        7      active sync   /dev/sdi1
> /dev/sdk1:
> this     9       8      161        9      active sync   /dev/sdk1
>                                   ^^^
> 
> however if you have rebooted and particularly if you have moved any
> drives, this could be different now.
> 
> The information that is important is the
> /dev/sdX1:
> line and the 5th column of the other line, which I have highlighted.
> Ignore the device name at the end of the lines (column 8); it is
> just confusing.
> 
> The 5th column number tells you where in the array the /dev device
> should live.
> So from the above information, the first few devices in your list
> would be
> 
>  /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 missing
>  /dev/sdi1 missing /dev/sdk1
> 
> If you follow this process on the complete output of the run, you will
> get a list with 27 entries, 3 of which will be the word 'missing'.
> You need to replace one of the 'missing' entries with a device that is
> not listed but probably belongs at that place in the order,
> e.g. sdh1 in place of the first 'missing'.
> 
> This command might help you
> 
>   ./examineRAIDDisks  |
>    grep -E '^(/dev|this)'  | awk 'NF==1 {d=$1; sub(/:$/, "", d)} NF==8 {print $5, d}' |
>    sort -n | awk 'BEGIN {l=-1} $1 != l+1 {print l+1, "missing"} {print; l = $1}'
> 
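> Run on the sample output above, the first part of that list should come
> out looking something like
> 
>    0 /dev/sdb1
>    1 /dev/sdc1
>    2 /dev/sdd1
>    3 /dev/sde1
>    4 /dev/sdf1
>    5 /dev/sdg1
>    6 missing
>    7 /dev/sdi1
>    8 missing
>    9 /dev/sdk1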
> 
> If you use the --create command as described above to create the array
> you will probably have all your data accessible.  Use "fsck" or
> whatever to check.  Do *not* add any other drives to the array until
> you are sure that you are happy with the data that you have found.  If
> it doesn't look right, try a different drive in place of the 'missing'.
> 
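> A read-only check is the safe way to do that first, e.g. (assuming an
> ext2/ext3 filesystem directly on /dev/md0)
> 
>    fsck -n /dev/md0
> 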
> When you are happy, add two more drives to the array to get redundancy
> back (it will have to recover the drives) but *do not* add any more
> spares.  Leave it with a total of 27 devices.  If you add a spare, you
> will have problems again.
> 
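> For example, if sdh1 and sdj1 turn out to be the two devices that were
> left out, that would be something like
> 
>    mdadm /dev/md0 --add /dev/sdh1 /dev/sdj1
> 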
> If any of this isn't clear, please ask for clarification.
> 
> Good luck.
> 
> NeilBrown

Thanks for the info. I think I follow everything. One last question
before really trying it: is this what is expected when I actually run
the command, i.e. the warnings about the previous array, etc.?

[root@node002 ~]# ./recoverRAID 
mdadm --create /dev/md0 --verbose --level=6
--raid-devices=27 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 missing /dev/sdi1 missing /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 /dev/sdw1 /dev/sdx1 /dev/sdy1 /dev/sdz1 /dev/sdaa1 /dev/sdab1 /dev/sdac1 /dev/sdp1 /dev/sdq1 /dev/sdr1 missing /dev/sdt1 /dev/sdu1
mdadm: layout defaults to left-symmetric
mdadm: chunk size defaults to 64K
mdadm: /dev/sdb1 appears to contain an ext2fs file system
    size=-295395124K  mtime=Fri Nov 20 19:36:27 1931
mdadm: /dev/sdb1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdc1 appears to contain an ext2fs file system
    size=-1265904192K  mtime=Tue Dec 23 15:07:10 2008
mdadm: /dev/sdc1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdd1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sde1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdf1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdg1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdi1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdk1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdl1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdm1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdn1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdo1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdw1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdx1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdy1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdz1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdaa1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdab1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdac1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdp1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdq1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdr1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdt1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: /dev/sdu1 appears to contain an ext2fs file system
    size=-1265903936K  mtime=Sun Mar  1 20:48:00 2009
mdadm: /dev/sdu1 appears to be part of a raid array:
    level=raid6 devices=27 ctime=Thu Jun 28 05:16:13 2007
mdadm: size set to 292961216K
Continue creating array? n
mdadm: create aborted.
[root@node002 ~]# 

Thanks,

tjb
-- 
=======================================================================
| Thomas Baker                                  email: tjb@unh.edu    |
| Systems Programmer                                                  |
| Research Computing Center                     voice: (603) 862-4490 |
| University of New Hampshire                     fax: (603) 862-1761 |
| 332 Morse Hall                                                      |
| Durham, NH 03824 USA              http://wintermute.sr.unh.edu/~tjb |
=======================================================================

