From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xiao Ni
Subject: Re: RAID1 removing failed disk returns EBUSY
Date: Thu, 29 Jan 2015 07:14:16 -0500 (EST)
Message-ID: <371504811.2053160.1422533656432.JavaMail.zimbra@redhat.com>
References: <20141027162748.593451be@jlaw-desktop.mno.stratus.com>
 <20141117100349.1d1ae1fa@notabene.brown>
 <54B663EC.8090607@redhat.com>
 <20150115082210.31bd3ea5@jlaw-desktop.mno.stratus.com>
 <2054919975.10444188.1421385612513.JavaMail.zimbra@redhat.com>
 <20150116101031.30c04df3@jlaw-desktop.mno.stratus.com>
 <1924199853.11308787.1421634830810.JavaMail.zimbra@redhat.com>
 <20150129145217.1cb31d5c@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Return-path:
In-Reply-To: <20150129145217.1cb31d5c@notabene.brown>
Sender: linux-raid-owner@vger.kernel.org
To: NeilBrown
Cc: Joe Lawrence , linux-raid@vger.kernel.org, Bill Kuzeja
List-Id: linux-raid.ids

----- Original Message -----
> From: "NeilBrown"
> To: "Xiao Ni"
> Cc: "Joe Lawrence" , linux-raid@vger.kernel.org, "Bill Kuzeja"
> Sent: Thursday, January 29, 2015 11:52:17 AM
> Subject: Re: RAID1 removing failed disk returns EBUSY
>
> On Sun, 18 Jan 2015 21:33:50 -0500 (EST) Xiao Ni wrote:
>
> >
> > ----- Original Message -----
> > > From: "Joe Lawrence"
> > > To: "Xiao Ni"
> > > Cc: "NeilBrown" , linux-raid@vger.kernel.org, "Bill Kuzeja"
> > > Sent: Friday, January 16, 2015 11:10:31 PM
> > > Subject: Re: RAID1 removing failed disk returns EBUSY
> > >
> > > On Fri, 16 Jan 2015 00:20:12 -0500
> > > Xiao Ni wrote:
> > > >
> > > > Hi Joe
> > > >
> > > > Thanks for reminding me. I didn't do that. Now the disk can be
> > > > removed successfully after writing "idle" to sync_action.
> > > >
> > > > I wrongly thought that the patch referenced in this mail fixed
> > > > the problem.
> > >
> > > So it sounds like even with 3.18 and a new mdadm, this bug still
> > > persists?
> > >
> > > -- Joe
> > >
> > > --
> >
> > Hi Joe
> >
> > I'm a little confused now. Does the patch
> > 45eaf45dfa4850df16bc2e8e7903d89021137f40 from linux-stable
> > resolve the problem?
> >
> > My environment is:
> >
> > [root@dhcp-12-133 mdadm]# mdadm --version
> > mdadm - v3.3.2-18-g93d3bd3 - 18th December 2014 (this is the newest
> > upstream)
> > [root@dhcp-12-133 mdadm]# uname -r
> > 3.18.2
> >
> > My steps are:
> >
> > [root@dhcp-12-133 mdadm]# lsblk
> > sdb      8:16   0 931.5G  0 disk
> > └─sdb1   8:17   0     5G  0 part
> > sdc      8:32   0 186.3G  0 disk
> > sdd      8:48   0 931.5G  0 disk
> > └─sdd1   8:49   0     5G  0 part
> > [root@dhcp-12-133 mdadm]# mdadm -CR /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdd1 --assume-clean
> > mdadm: Note: this array has metadata at the start and
> >     may not be suitable as a boot device.  If you plan to
> >     store '/boot' on this device please ensure that
> >     your boot-loader understands md/v1.x metadata, or use
> >     --metadata=0.90
> > mdadm: Defaulting to version 1.2 metadata
> > mdadm: array /dev/md0 started.
> >
> > Then I unplug the disk.
> >
> > [root@dhcp-12-133 mdadm]# lsblk
> > sdc      8:32   0 186.3G  0 disk
> > sdd      8:48   0 931.5G  0 disk
> > └─sdd1   8:49   0     5G  0 part
> >   └─md0  9:0    0     5G  0 raid1
> > [root@dhcp-12-133 mdadm]# echo faulty > /sys/block/md0/md/dev-sdb1/state
> > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > -bash: echo: write error: Device or resource busy
> > [root@dhcp-12-133 mdadm]# echo idle > /sys/block/md0/md/sync_action
> > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> >
>
> I cannot reproduce this - using linux 3.18.2.  I'd be surprised if mdadm
> version affects things.

Hi Neil

I'm very curious, because I can reproduce it on my machine 100% of the time.

>
> This error (Device or resource busy) implies that rdev->raid_disk is >= 0
> (tested in state_store()).
>
> ->raid_disk is set to -1 by remove_and_add_spares() providing:
>   1/ it isn't Blocked (which is very unlikely)
>   2/ hot_remove_disk succeeds, which it will if nr_pending is zero, and
>   3/ nr_pending is zero.

I remember I tried to check those reasons. It really is reason 1, the one
which is very unlikely. I added some code in the function array_state_show:

array_state_show(struct mddev *mddev, char *page)
{
	enum array_state st = inactive;
	struct md_rdev *rdev;

	rdev_for_each_rcu(rdev, mddev) {
		printk(KERN_ALERT "search for %s\n", rdev->bdev->bd_disk->disk_name);
		if (test_bit(Blocked, &rdev->flags))
			printk(KERN_ALERT "rdev is Blocked\n");
		else
			printk(KERN_ALERT "rdev is not Blocked\n");
	}

After I ran "echo 1 > /sys/block/sdc/device/delete", I ran:

[root@dhcp-12-133 md]# cat /sys/block/md0/md/array_state
read-auto
[root@dhcp-12-133 md]# dmesg
[ 2679.559185] search for sdc
[ 2679.559189] rdev is Blocked
[ 2679.559190] search for sdb
[ 2679.559190] rdev is not Blocked

So sdc is Blocked.

>
> So it seems most likely that either:
>   1/ nr_pending is non-zero, or
>   2/ remove_and_add_spares() didn't run.
>
> nr_pending can only get set if IO is generated, and your sequence of steps
> doesn't show any IO.  It is possible that something else (e.g. started by udev)
> triggered some IO.  How long that IO can stay pending might depend on exactly
> how you unplug the device.
> In my tests I used
>    echo 1 > /sys/block/sdXX/../../delete
> which may have a different effect to what you do.
>
> However the fact that writing 'idle' to sync_action releases the device seems
> to suggest the nr_pending has dropped to zero.  So either
>   - remove_and_add_spares didn't run, or
>   - remove_and_add_spares ran during a small window when nr_pending was
>     elevated, and then didn't run again when nr_pending was reduced to zero.
>
> Ahh.... that rings bells....
>
> I have the following patch in the SLES kernel which I have not applied to
> mainline yet (and given how old it is, that is really slack of me).
>
> Can you apply the following and see if the symptom goes away please?

I have tried the patch; the problem still exists.

>
> Thanks,
> NeilBrown
>
> From: Hannes Reinecke
> Date: Thu, 26 Jul 2012 11:12:18 +0200
> Subject: [PATCH] md: wakeup thread upon rdev_dec_pending()
>
> After each call to rdev_dec_pending() we should wake up the
> md thread if the device is found to be faulty.
> Otherwise we'll incur heavy delays on failing devices.
>
> Signed-off-by: Neil Brown
> Signed-off-by: Hannes Reinecke
>
> diff --git a/drivers/md/md.h b/drivers/md/md.h
> index 03cec5bdcaae..4cc2f59b2994 100644
> --- a/drivers/md/md.h
> +++ b/drivers/md/md.h
> @@ -439,13 +439,6 @@ struct mddev {
>  	void (*sync_super)(struct mddev *mddev, struct md_rdev *rdev);
>  };
>
> -static inline void rdev_dec_pending(struct md_rdev *rdev, struct mddev *mddev)
> -{
> -	int faulty = test_bit(Faulty, &rdev->flags);
> -	if (atomic_dec_and_test(&rdev->nr_pending) && faulty)
> -		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> -}
> -
>  static inline void md_sync_acct(struct block_device *bdev, unsigned long nr_sectors)
>  {
>  	atomic_add(nr_sectors, &bdev->bd_contains->bd_disk->sync_io);
> @@ -624,4 +617,14 @@ static inline int mddev_check_plugged(struct mddev *mddev)
>  	return !!blk_check_plugged(md_unplug, mddev,
>  				   sizeof(struct blk_plug_cb));
>  }
> +
> +static inline void rdev_dec_pending(struct md_rdev *rdev, struct mddev *mddev)
> +{
> +	int faulty = test_bit(Faulty, &rdev->flags);
> +	if (atomic_dec_and_test(&rdev->nr_pending) && faulty) {
> +		set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
> +		md_wakeup_thread(mddev->thread);
> +	}
> +}
> +
>  #endif /* _MD_MD_H */
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More
majordomo info at http://vger.kernel.org/majordomo-info.html