From: NeilBrown <neilb@suse.de>
To: "Rémi Rérolle" <rrerolle@lacie.com>
Cc: linux-raid@vger.kernel.org
Subject: Re: mdadm: can't removed failed/detached drives when using metadata 1.x
Date: Mon, 14 Feb 2011 14:27:01 +1100
Message-ID: <20110214142701.28950b00@notabene.brown>
In-Reply-To: <4D54040C.4040201@lacie.com>
On Thu, 10 Feb 2011 16:28:12 +0100 Rémi Rérolle <rrerolle@lacie.com> wrote:
> Hi Neil,
>
> I recently came across what I believe is a regression in mdadm, which
> has been introduced in version 3.1.3.
>
> It seems that, when using metadata 1.x, the handling of failed/detached
> drives isn't effective anymore.
>
> Here's a quick example:
>
> [root@GrosCinq ~]# mdadm -C /dev/md4 -l1 -n2 --metadata=1.0 /dev/sdc1
> /dev/sdd1
> mdadm: array /dev/md4 started.
> [root@GrosCinq ~]#
> [root@GrosCinq ~]# mdadm --wait /dev/md4
> [root@GrosCinq ~]#
> [root@GrosCinq ~]# mdadm -D /dev/md4
> /dev/md4:
> Version : 1.0
> Creation Time : Thu Feb 10 13:56:31 2011
> Raid Level : raid1
> Array Size : 1953096 (1907.64 MiB 1999.97 MB)
> Used Dev Size : 1953096 (1907.64 MiB 1999.97 MB)
> Raid Devices : 2
> Total Devices : 2
> Persistence : Superblock is persistent
>
> Update Time : Thu Feb 10 13:56:46 2011
> State : clean
> Active Devices : 2
> Working Devices : 2
> Failed Devices : 0
> Spare Devices : 0
>
> Name : GrosCinq:4 (local to host GrosCinq)
> UUID : bbfef508:252e7ce1:c95d4a03:8beb3cbd
> Events : 17
>
> Number Major Minor RaidDevice State
> 0 8 1 0 active sync /dev/sdc1
> 1 8 49 1 active sync /dev/sdd1
>
> [root@GrosCinq ~]# mdadm --fail /dev/md4 /dev/sdc1
> mdadm: set /dev/sdc1 faulty in /dev/md4
> [root@GrosCinq ~]#
> [root@GrosCinq ~]# mdadm -D /dev/md4 | tail -n 6
>
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 1 8 49 1 active sync /dev/sdd1
>
> 0 8 1 - faulty spare /dev/sdc1
> [root@GrosCinq ~]#
> [root@GrosCinq ~]# mdadm --remove /dev/md4 failed
> [root@GrosCinq ~]#
> [root@GrosCinq ~]# mdadm -D /dev/md4 | tail -n 6
>
> Number Major Minor RaidDevice State
> 0 0 0 0 removed
> 1 8 49 1 active sync /dev/sdd1
>
> 0 8 1 - faulty spare /dev/sdc1
> [root@GrosCinq ~]#
>
> This is with mdadm 3.1.4, 3.1.3 or even 3.2, but not 3.1.2. I did a git
> bisect to try and isolate the regression and it appears the guilty
> commit is :
>
> b3b4e8a : "Avoid skipping devices where removing all faulty/detached
> devices."
>
> As stated in the commit, this is only true with metadata 1.x. With 0.9,
> there is no problem. I also tested with detached drives as well as
> raid5/6 and encountered the same issue. Actually, with detached drives,
> it's even more annoying, since using --remove detached is the only way
> to remove the device without restarting the array. For a failed drive,
> there is still the possibility to use the device name.
>
> Do you have any idea of the reason behind that regression ? Shall this
> patch only apply in the case of 0.9 metadata ?
>
> Regards,
>
Thanks for the report, and especially for bisecting it down to the erroneous
commit!
This patch should fix the regression. I'll ensure it is in all future
releases.
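
For illustration only, here is a minimal sketch of the kind of failure the
diff below addresses. It is not the mdadm code: scan_faulty(), struct
scan_result and the two check helpers are hypothetical names, and it assumes
the scan records the matching slot number in jnext, so that a hit in slot 0
looks the same as "nothing found" when the test is "jnext == 0". The "found"
flag plays the role of the "next != dv" test in the patch.

```c
#include <stddef.h>

/* Hypothetical result of one scan over the device slots. */
struct scan_result {
	int jnext;	/* slot to resume from; 0 if the scan found nothing */
	int found;	/* explicit "a device matched" flag */
};

/* Scan slots [start, n) of a 0/1 "faulty" table and stop at the first hit. */
static struct scan_result scan_faulty(const int *faulty, int n, int start)
{
	struct scan_result r = { 0, 0 };
	int j;

	for (j = start; j < n; j++) {
		if (!faulty[j])
			continue;
		r.jnext = j;	/* revisit the same slot on the next pass */
		r.found = 1;	/* independent of the slot number */
		break;
	}
	return r;
}

/* Old-style test: a zero slot index doubles as "nothing found". */
static int old_check_found(struct scan_result r)
{
	return r.jnext != 0;
}

/* New-style test: rely on the explicit flag instead of the index. */
static int new_check_found(struct scan_result r)
{
	return r.found;
}
```

With the faulty device in slot 0, as in the -D output quoted above, the
old-style test reports "nothing found" while the flag-based test reports a
match. For any other slot the two tests agree.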
Thanks,
NeilBrown
diff --git a/Manage.c b/Manage.c
index 481c165..8c86a53 100644
--- a/Manage.c
+++ b/Manage.c
@@ -421,7 +421,7 @@ int Manage_subdevs(char *devname, int fd,
 				dnprintable = dvname;
 				break;
 			}
-			if (jnext == 0)
+			if (next != dv)
 				continue;
 		} else if (strcmp(dv->devname, "detached") == 0) {
 			if (dv->disposition != 'r' && dv->disposition != 'f') {
@@ -461,7 +461,7 @@ int Manage_subdevs(char *devname, int fd,
 				dnprintable = dvname;
 				break;
 			}
-			if (jnext == 0)
+			if (next != dv)
 				continue;
 		} else if (strcmp(dv->devname, "missing") == 0) {
 			if (dv->disposition != 'a' || dv->re_add == 0) {