From: NeilBrown <neilb@suse.de>
To: Xiao Ni <xni@redhat.com>
Cc: Joe Lawrence <joe.lawrence@stratus.com>,
linux-raid@vger.kernel.org,
Bill Kuzeja <william.kuzeja@stratus.com>
Subject: Re: RAID1 removing failed disk returns EBUSY
Date: Mon, 2 Feb 2015 17:36:01 +1100
Message-ID: <20150202173601.1ab02927@notabene.brown>
In-Reply-To: <371504811.2053160.1422533656432.JavaMail.zimbra@redhat.com>
On Thu, 29 Jan 2015 07:14:16 -0500 (EST) Xiao Ni <xni@redhat.com> wrote:
>
>
> ----- Original Message -----
> > From: "NeilBrown" <neilb@suse.de>
> > To: "Xiao Ni" <xni@redhat.com>
> > Cc: "Joe Lawrence" <joe.lawrence@stratus.com>, linux-raid@vger.kernel.org, "Bill Kuzeja" <william.kuzeja@stratus.com>
> > Sent: Thursday, January 29, 2015 11:52:17 AM
> > Subject: Re: RAID1 removing failed disk returns EBUSY
> >
> > On Sun, 18 Jan 2015 21:33:50 -0500 (EST) Xiao Ni <xni@redhat.com> wrote:
> >
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Joe Lawrence" <joe.lawrence@stratus.com>
> > > > To: "Xiao Ni" <xni@redhat.com>
> > > > Cc: "NeilBrown" <neilb@suse.de>, linux-raid@vger.kernel.org, "Bill
> > > > Kuzeja" <william.kuzeja@stratus.com>
> > > > Sent: Friday, January 16, 2015 11:10:31 PM
> > > > Subject: Re: RAID1 removing failed disk returns EBUSY
> > > >
> > > > On Fri, 16 Jan 2015 00:20:12 -0500
> > > > Xiao Ni <xni@redhat.com> wrote:
> > > > >
> > > > > Hi Joe
> > > > >
> > > > > Thanks for reminding me. I didn't do that. Now it can remove
> > > > > successfully after writing
> > > > > "idle" to sync_action.
> > > > >
> > > > > > I wrongly thought that the patch referenced in this mail
> > > > > > fixed the problem.
> > > >
> > > > So it sounds like even with 3.18 and a new mdadm, this bug still
> > > > persists?
> > > >
> > > > -- Joe
> > > >
> > > > --
> > >
> > > Hi Joe
> > >
> > > I'm a little confused now. Does the patch
> > > 45eaf45dfa4850df16bc2e8e7903d89021137f40 from linux-stable
> > > resolve the problem?
> > >
> > > My environment is:
> > >
> > > [root@dhcp-12-133 mdadm]# mdadm --version
> > > mdadm - v3.3.2-18-g93d3bd3 - 18th December 2014 (this is the newest
> > > upstream)
> > > [root@dhcp-12-133 mdadm]# uname -r
> > > 3.18.2
> > >
> > >
> > > My steps are:
> > >
> > > [root@dhcp-12-133 mdadm]# lsblk
> > > sdb 8:16 0 931.5G 0 disk
> > > └─sdb1 8:17 0 5G 0 part
> > > sdc 8:32 0 186.3G 0 disk
> > > sdd 8:48 0 931.5G 0 disk
> > > └─sdd1 8:49 0 5G 0 part
> > > [root@dhcp-12-133 mdadm]# mdadm -CR /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdd1
> > > --assume-clean
> > > mdadm: Note: this array has metadata at the start and
> > > may not be suitable as a boot device. If you plan to
> > > store '/boot' on this device please ensure that
> > > your boot-loader understands md/v1.x metadata, or use
> > > --metadata=0.90
> > > mdadm: Defaulting to version 1.2 metadata
> > > mdadm: array /dev/md0 started.
> > >
> > > Then I unplug the disk.
> > >
> > > [root@dhcp-12-133 mdadm]# lsblk
> > > sdc 8:32 0 186.3G 0 disk
> > > sdd 8:48 0 931.5G 0 disk
> > > └─sdd1 8:49 0 5G 0 part
> > > └─md0 9:0 0 5G 0 raid1
> > > [root@dhcp-12-133 mdadm]# echo faulty > /sys/block/md0/md/dev-sdb1/state
> > > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > > -bash: echo: write error: Device or resource busy
> > > [root@dhcp-12-133 mdadm]# echo idle > /sys/block/md0/md/sync_action
> > > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > >
> >
> > I cannot reproduce this - using linux 3.18.2. I'd be surprised if mdadm
> > version affects things.
>
> Hi Neil
>
> I'm very curious, because I can reproduce it on my machine 100% of the time.
>
> >
> > This error (Device or resource busy) implies that rdev->raid_disk is >= 0
> > (tested in state_store()).
> >
> > ->raid_disk is set to -1 by remove_and_add_spares() providing:
> > 1/ it isn't Blocked (which is very unlikely)
> > 2/ hot_remove_disk succeeds, which it will if nr_pending is zero, and
> > 3/ nr_pending is zero.
>
> I remember I have tried to check those reasons. But it really is reason 1,
> which is very unlikely.
>
> I added some debug code to the function array_state_show:
>
> array_state_show(struct mddev *mddev, char *page) {
>         enum array_state st = inactive;
>         struct md_rdev *rdev;
>
>         rdev_for_each_rcu(rdev, mddev) {
>                 printk(KERN_ALERT "search for %s\n", rdev->bdev->bd_disk->disk_name);
>                 if (test_bit(Blocked, &rdev->flags))
>                         printk(KERN_ALERT "rdev is Blocked\n");
>                 else
>                         printk(KERN_ALERT "rdev is not Blocked\n");
>         }
>
> After I ran echo 1 > /sys/block/sdc/device/delete, I ran this command:
>
> [root@dhcp-12-133 md]# cat /sys/block/md0/md/array_state
> read-auto
^^^^^^^^^
I think that is half the explanation.
You must have the md_mod.start_ro parameter set to '1'.
> [root@dhcp-12-133 md]# dmesg
> [ 2679.559185] search for sdc
> [ 2679.559189] rdev is Blocked
> [ 2679.559190] search for sdb
> [ 2679.559190] rdev is not Blocked
>
> So sdc is Blocked
and that is the other half - thanks.
(yes, I was wrong. Sometimes it is easier than being right, but still
yields results).

When a device fails, it is Blocked until the metadata is updated to record
the failure.  This ensures that no writes succeed without writing to that
device, until we are certain that no read will try reading from that device,
even after a crash/restart.

Blocked is cleared after the metadata is written, but read-auto (and
read-only) arrays never write out their metadata.  So Blocked doesn't get
cleared.

When you "echo idle > .../sync_action" one of the side effects is to switch
from 'read-auto' to fully active.  This allows the metadata to be written,
Blocked to be cleared, and the device to be removed.

If you

    echo none > /sys/block/md0/md/dev-sdc/slot

first, then the remove will work.

We could possibly fix it with something like the following, but I'm not sure
I like it.  There is no guarantee that I can see which would ensure the
superblock got updated before the first write if the array switches to
read/write.

NeilBrown

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 9233c71138f1..b3d1e8e5e067 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -7528,7 +7528,7 @@ static int remove_and_add_spares(struct mddev *mddev,
 	rdev_for_each(rdev, mddev)
 		if ((this == NULL || rdev == this) &&
 		    rdev->raid_disk >= 0 &&
-		    !test_bit(Blocked, &rdev->flags) &&
+		    (!test_bit(Blocked, &rdev->flags) || mddev->ro) &&
 		    (test_bit(Faulty, &rdev->flags) ||
 		     ! test_bit(In_sync, &rdev->flags)) &&
 		    atomic_read(&rdev->nr_pending)==0) {