Linux RAID subsystem development
From: NeilBrown <neilb@suse.de>
To: Eli Morris <ermorris@ucsc.edu>
Cc: linux-raid@vger.kernel.org
Subject: Re: 4 out of 16 drives show up as 'removed'
Date: Fri, 9 Dec 2011 09:50:33 +1100
Message-ID: <20111209095033.06136f92@notabene.brown>
In-Reply-To: <D1EB6C45-E127-4EC6-BFDD-BDFCC4607981@ucsc.edu>

On Thu, 8 Dec 2011 13:42:44 -0800 Eli Morris <ermorris@ucsc.edu> wrote:

> 
> On Dec 8, 2011, at 12:59 PM, NeilBrown wrote:
> 
> > On Thu, 8 Dec 2011 12:39:10 -0800 Eli Morris <ermorris@ucsc.edu> wrote:
> > 
> >> 
> > 
> >> 
> >> and here is the verbose assemble output:
> >> 
> >> [root@stratus log]# mdadm --verbose --assemble /dev/md5 --force /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1 /dev/sdj1 /dev/sdk1 /dev/sdl1 /dev/sdm1 /dev/sdn1 /dev/sdo1 
> >> mdadm: looking for devices for /dev/md5
> >> mdadm: /dev/sda1 is identified as a member of /dev/md5, slot 0.
> >> mdadm: /dev/sdb1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: /dev/sdc1 is identified as a member of /dev/md5, slot 2.
> >> mdadm: /dev/sdd1 is identified as a member of /dev/md5, slot 3.
> >> mdadm: /dev/sde1 is identified as a member of /dev/md5, slot 4.
> >> mdadm: /dev/sdf1 is identified as a member of /dev/md5, slot 5.
> >> mdadm: /dev/sdg1 is identified as a member of /dev/md5, slot 6.
> >> mdadm: /dev/sdh1 is identified as a member of /dev/md5, slot 7.
> >> mdadm: /dev/sdi1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: /dev/sdj1 is identified as a member of /dev/md5, slot 9.
> >> mdadm: /dev/sdk1 is identified as a member of /dev/md5, slot 10.
> >> mdadm: /dev/sdl1 is identified as a member of /dev/md5, slot 11.
> >> mdadm: /dev/sdm1 is identified as a member of /dev/md5, slot 12.
> >> mdadm: /dev/sdn1 is identified as a member of /dev/md5, slot 13.
> >> mdadm: /dev/sdo1 is identified as a member of /dev/md5, slot -1.
> >> mdadm: no uptodate device for slot 1 of /dev/md5
> >> mdadm: added /dev/sdc1 to /dev/md5 as 2
> >> mdadm: added /dev/sdd1 to /dev/md5 as 3
> >> mdadm: added /dev/sde1 to /dev/md5 as 4
> >> mdadm: added /dev/sdf1 to /dev/md5 as 5
> >> mdadm: added /dev/sdg1 to /dev/md5 as 6
> >> mdadm: added /dev/sdh1 to /dev/md5 as 7
> >> mdadm: no uptodate device for slot 8 of /dev/md5
> >> mdadm: added /dev/sdj1 to /dev/md5 as 9
> >> mdadm: added /dev/sdk1 to /dev/md5 as 10
> >> mdadm: added /dev/sdl1 to /dev/md5 as 11
> >> mdadm: added /dev/sdm1 to /dev/md5 as 12
> >> mdadm: added /dev/sdn1 to /dev/md5 as 13
> >> mdadm: no uptodate device for slot 14 of /dev/md5
> >> mdadm: no uptodate device for slot 15 of /dev/md5
> >> mdadm: added /dev/sdb1 to /dev/md5 as -1
> >> mdadm: added /dev/sdi1 to /dev/md5 as -1
> >> mdadm: failed to add /dev/sdo1 to /dev/md5: Device or resource busy
> >> mdadm: added /dev/sda1 to /dev/md5 as 0
> >> mdadm: /dev/md5 assembled from 12 drives and 2 spares - not enough to start the array.
> >> 
> >> 
> > 
> > Thanks.
> > 
> > I know what the 'busy' thing is now.
> > sdo1 appears to be the 'same' as some other device in some way.
> > 
> > Also it looks like you might have turned some drives into spares
> > unintentionally, though I'm not sure.
> > 
> > Could you please send "mdadm --examine" output for all of these drives and
> > I'll have a look.
> > 
> > Thanks,
> > NeilBrown
> > 
> > 
> > 
> 
> Thanks Neil. I wasn't sure if you wanted the output of all the drives or just the 'removed' ones, so here is the output for all the drives in the array.
> 
> Just FYI, I don't know what I could have done to make these spares. Between when things worked fine and when they did not, I did not make any hardware or configuration changes to the array.
> 

Thanks.  I did want it all (it is always better to give too much than too
little - so thanks).

Those devices have been turned into spares.  Maybe an "--add" command did it,
or possibly even a "--re-add", though it shouldn't.  Newer versions of mdadm
are more careful about this.
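
The affected devices can be picked out of the verbose assemble output quoted
above, where they are reported in slot -1.  A minimal sketch, assuming the
output has been saved to a file or is piped in on stdin; the function name
list_spare_members is made up for illustration and is not part of mdadm:

```shell
# Print every device that mdadm's verbose assemble output reports in slot -1,
# i.e. a member whose metadata no longer records a real slot in the array.
# Reads the mdadm output on stdin; prints one device path per line.
list_spare_members() {
    sed -n 's|^mdadm: \(/dev/[^ ]*\) is identified as a member of .*, slot -1\.$|\1|p'
}
```

Fed the output quoted above, this should list /dev/sdb1, /dev/sdi1 and
/dev/sdo1.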

You need to re-"Create" the array.  This doesn't affect the data, just writes
new metadata.
It looks like it is safe to assume that none of the devices have been
renamed.  However if you have any reason to believe that the devices don't
belong in the array in the 'obvious' order, you should let me know or adjust
the command below accordingly.

You want to create the array exactly as it was, and you want to make sure
it doesn't immediately start to resync, just in case something goes wrong and
we want to try again.

All the 'Data Offset's are the same and are 2048 sectors (1M), which is the
current default, so that is good.
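
If you want to repeat that check yourself, something like the following
should do.  It is only a sketch: it assumes the "Data Offset : N sectors"
line that "mdadm --examine" prints for 1.x metadata, and the function name
data_offsets_match is invented for this example.

```shell
# Succeed (exit 0) only if every "Data Offset" line in the concatenated
# "mdadm --examine" output on stdin carries exactly the same value.
# Fails if the values differ or if no such line is present at all.
data_offsets_match() {
    awk -F' : ' '
        /^[ \t]*Data Offset/ { seen[$2] = 1 }   # remember each distinct value
        END {
            n = 0
            for (k in seen) n++
            exit (n == 1 ? 0 : 1)
        }'
}
```

It could be used as
  for d in /dev/sd[a-p]1; do mdadm --examine "$d"; done | data_offsets_match
and then testing the exit status.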

So:
  mdadm --create /dev/md5 -l5 --layout=left-symmetric --chunk=512 \
  --raid-disks=16 --assume-clean /dev/sd[a-p]1

This will over-write all the metadata but not touch the data.
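
Since the old metadata is gone once that runs, it may be worth keeping a copy
of the current "mdadm --examine" output first.  A rough sketch, not a required
step: save_examine, the output directory and the MDADM override variable are
all assumptions made for this example.

```shell
# Save "mdadm --examine" output for each given device into its own file.
# $1 is the output directory, the remaining arguments are device paths.
# MDADM may be set to use a different binary (hypothetical override).
save_examine() {
    out_dir=$1
    shift
    mkdir -p "$out_dir" || return 1
    for dev in "$@"; do
        # One file per device, named after the device node, e.g. sda1.examine
        "${MDADM:-mdadm}" --examine "$dev" \
            > "$out_dir/$(basename "$dev").examine" 2>&1
    done
}
```

For example: save_examine /root/md5-metadata /dev/sd[a-p]1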

Then you probably want to
  fsck -n /dev/md5

to make sure it looks good.  If it does,

 echo check > /sys/block/md5/md/sync_action

That will read all blocks and make sure parity is correct.  When it finishes,
check
   /sys/block/md5/md/mismatch_cnt

If this is zero or close to zero, then it is looking very good.
If it is a lot more than zero (say > 10000) then we probably need to think
again.
If it is small but non-zero, then
   echo repair > /sys/block/md5/md/sync_action
will fix it up.
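
The wait-then-decide steps above could be scripted roughly as follows.  This
is a sketch under assumptions: the function names are invented, the check is
taken to be finished when sync_action reads "idle", and everything from 1 up
to 10000 is treated as "small", which is a simplification of the advice
above.

```shell
# Block until the given sync_action file reads "idle", i.e. no check,
# resync or repair is running.
# $1: path to sync_action, $2: poll interval in seconds (default 60).
wait_for_idle() {
    while [ "$(cat "$1")" != "idle" ]; do
        sleep "${2:-60}"
    done
}

# Map a mismatch_cnt value to a suggested next step.
# The 10000 cut-off is the figure given above; the exact boundary between
# "small" and "a lot" is an assumption of this sketch.
classify_mismatch() {
    if [ "$1" -eq 0 ]; then
        echo good
    elif [ "$1" -gt 10000 ]; then
        echo rethink
    else
        echo repair
    fi
}
```

Usage would be along the lines of
  wait_for_idle /sys/block/md5/md/sync_action
  classify_mismatch "$(cat /sys/block/md5/md/mismatch_cnt)"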

If fsck showed any issues, run
  fsck -f /dev/md5
to fix them, then mount the filesystem and all should be good.

What version of mdadm do you have?

Thanks,
NeilBrown

