linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: Albert Pauw <albert.pauw@gmail.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: ddf failed disk disappears after adding spare
Date: Wed, 15 Aug 2012 09:39:04 +1000
Message-ID: <20120815093904.79b9ac8c@notabene.brown>
In-Reply-To: <5018E8FF.6030402@gmail.com>


On Wed, 01 Aug 2012 10:29:51 +0200 Albert Pauw <albert.pauw@gmail.com> wrote:

> Hi Neil,
> 
> here is a procedure which shows you another problem. It has to do with 
> the table produced at the end of the mdadm -E output, showing the disks 
> and their status. It seems that when a disk has failed and another is 
> added, the failed one disappears from the table.
> 
> Hope you can find the problem and fix it.
> 
> Regards,
> 
> Albert
> 
> Here is the exact procedure which shows the problem:
> 
> Create a container with 5 disks:
> 
> mdadm -CR /dev/md127 -e ddf -l container -n 5 /dev/loop[1-5]
> 
>   Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K /dev/loop1 Global-Spare/Online
>           1    6de79cb6    479232K /dev/loop2 Global-Spare/Online
>           2    b5fd1d6c    479232K /dev/loop3 Global-Spare/Online
>           3    0be2d310    479232K /dev/loop4 Global-Spare/Online
>           4    5d8ac3d0    479232K /dev/loop5 Global-Spare/Online
> 
> 
> Create a RAID 5 set of 3 disks in container:
> 
> mdadm -CR /dev/md0 -l 5 -n 3 /dev/md127
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K /dev/loop1      active/Online
>           1    6de79cb6    479232K /dev/loop2      active/Online
>           2    b5fd1d6c    479232K /dev/loop3      active/Online
>           3    0be2d310    479232K /dev/loop4 Global-Spare/Online
>           4    5d8ac3d0    479232K /dev/loop5 Global-Spare/Online
> 
> 
> Create a RAID 1 set of 2 disks in container:
> 
> mdadm -CR /dev/md1 -l 1 -n 2 /dev/md127
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K /dev/loop1      active/Online
>           1    6de79cb6    479232K /dev/loop2      active/Online
>           2    b5fd1d6c    479232K /dev/loop3      active/Online
>           3    0be2d310    479232K /dev/loop4      active/Online
>           4    5d8ac3d0    479232K /dev/loop5      active/Online
> 
> 
> Fail first disk in RAID 5 set:
> 
> mdadm -f /dev/md0 /dev/loop1
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K /dev/loop1      active/Offline, Failed
>           1    6de79cb6    479232K /dev/loop2      active/Online
>           2    b5fd1d6c    479232K /dev/loop3      active/Online
>           3    0be2d310    479232K /dev/loop4      active/Online
>           4    5d8ac3d0    479232K /dev/loop5      active/Online
> 
> 
> Remove failed disk:
> 
> mdadm -r /dev/md0 /dev/loop1
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K                 active/Offline, Failed, Missing
>           1    6de79cb6    479232K /dev/loop2      active/Online
>           2    b5fd1d6c    479232K /dev/loop3      active/Online
>           3    0be2d310    479232K /dev/loop4      active/Online
>           4    5d8ac3d0    479232K /dev/loop5      active/Online
> 
> 
> Add failed disk back:
> 
> mdadm -a --force /dev/md0 /dev/loop1
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    d1c8c16e    479232K /dev/loop1      active/Offline, Failed, Missing
>           1    6de79cb6    479232K /dev/loop2      active/Online
>           2    b5fd1d6c    479232K /dev/loop3      active/Online
>           3    0be2d310    479232K /dev/loop4      active/Online
>           4    5d8ac3d0    479232K /dev/loop5      active/Online
> 
> 
> Add spare disk to container:
> 
> mdadm -a --force /dev/md0 /dev/loop6
> 
> Physical Disks : 5
>        Number    RefNo      Size       Device      Type/State
>           0    6de79cb6    479232K /dev/loop2      active/Online
>           1    b5fd1d6c    479232K /dev/loop3      active/Online
>           2    0be2d310    479232K /dev/loop4      active/Online
>           3    5d8ac3d0    479232K /dev/loop5      active/Online
>           4    1dcfe3cf    479232K /dev/loop6      active/Online, Rebuilding
> 
> This is wrong! Physical disks should be 6 now!

Whenever we add a device to the ddf we currently remove any record of any
failed and missing device.  We have to forget about devices that have
disappeared at some stage, and this seems like a good place.
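
To make that behaviour concrete, here is a minimal sketch in shell and awk.
It is a toy model only, not mdadm's code: the helper name prune_and_add and
the simplified row format are assumptions made for illustration.  It drops
every record marked both Failed and Missing, appends the new device, and
renumbers what is left.

```shell
#!/bin/sh
# Toy model (NOT mdadm source): when a device is added, forget every
# physical-disk record whose state is both Failed and Missing, then
# append the new device and renumber the remaining rows.
# Input rows on stdin: "<refno> <device> <state...>"
# prune_and_add is a hypothetical helper name.
prune_and_add() {
    new_refno=$1
    new_dev=$2
    awk -v refno="$new_refno" -v dev="$new_dev" '
        # Keep a row unless it is marked both Failed and Missing.
        !(/Failed/ && /Missing/) { rows[n++] = $0 }
        END {
            rows[n++] = refno " " dev " active/Online, Rebuilding"
            for (i = 0; i < n; i++) print i, rows[i]
        }
    '
}

# State just before loop6 is added: loop1 has been re-added but is
# still recorded as Failed, Missing.
prune_and_add 1dcfe3cf /dev/loop6 <<'EOF'
d1c8c16e /dev/loop1 active/Offline, Failed, Missing
6de79cb6 /dev/loop2 active/Online
b5fd1d6c /dev/loop3 active/Online
0be2d310 /dev/loop4 active/Online
5d8ac3d0 /dev/loop5 active/Online
EOF
```

With this input the model prints five rows, numbered 0 to 4, with loop1 gone
and loop6 last, which matches the table reported in the quoted procedure.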

The problem here is that a device that is in the array is marked as
'missing'.  This is due to the bug I mentioned in the previous email.  The
current workaround is to zero the device's superblock (--zero-superblock)
before adding it.
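
The inconsistent state, a device that is present yet still flagged Missing,
can be spotted in the physical-disk table with a small awk filter.  This is
a sketch only: present_but_missing is a hypothetical helper name, and it
assumes each table row is on one line in the layout shown in this thread
(Number, RefNo, Size, Device, Type/State).

```shell
#!/bin/sh
# Hypothetical helper: print devices that have a /dev path in the
# "Physical Disks" table but are nevertheless flagged Missing.
# Assumes one row per line, with the device path in the fourth column.
present_but_missing() {
    awk '$4 ~ /^\/dev\// && /Missing/ { print $4 }'
}

# Example using rows copied from the tables in this thread.
present_but_missing <<'EOF'
          0    d1c8c16e    479232K /dev/loop1      active/Offline, Failed, Missing
          1    6de79cb6    479232K /dev/loop2      active/Online
          2    b5fd1d6c    479232K /dev/loop3      active/Online
EOF
```

On the sample rows this prints /dev/loop1.  On a live system the input would
come from mdadm -E on a member device instead of the here-document.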

> 
> Remove the failed disk (which is missing from the list now!) again, zero 
> the superblock and add it again:
> 
> mdadm -r /dev/md0 /dev/loop1
> mdadm --zero-superblock /dev/loop1
> mdadm -a --force /dev/md0 /dev/loop1
> 
> 
> Physical Disks : 6
>        Number    RefNo      Size       Device      Type/State
>           0    6de79cb6    479232K /dev/loop2      active/Online
>           1    b5fd1d6c    479232K /dev/loop3      active/Online
>           2    0be2d310    479232K /dev/loop4      active/Online
>           3    5d8ac3d0    479232K /dev/loop5      active/Online
>           4    1dcfe3cf    479232K /dev/loop6      active/Online
>           5    8147a3ef    479232K /dev/loop1 Global-Spare/Online
> 
> And there they are, all 6 of them.
> 
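
The three workaround commands quoted above can be wrapped in a small shell
function.  This is a sketch, not an mdadm feature: readd_clean is a
hypothetical helper name and the MDADM variable is an assumption added so
that the command can be replaced by a stub for a dry run.  The mdadm options
are exactly the ones used in the quoted procedure.

```shell
#!/bin/sh
# Sketch of the workaround from this thread: remove the device, zero
# its superblock, then add it back.
# MDADM may be overridden with a stub command for a dry run.
MDADM=${MDADM:-mdadm}

# readd_clean ARRAY DEVICE   (hypothetical helper name)
readd_clean() {
    array=$1
    dev=$2
    # Stop at the first step that fails, so the device is never
    # re-added without its superblock having been zeroed.
    $MDADM -r "$array" "$dev" &&
    $MDADM --zero-superblock "$dev" &&
    $MDADM -a --force "$array" "$dev"
}
```

For the case in this thread it would be called as
readd_clean /dev/md0 /dev/loop1, which needs root privileges and wipes the
metadata on the named device.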

NeilBrown
