From: Xiao Ni
Subject: Re: RAID1 removing failed disk returns EBUSY
Date: Tue, 3 Feb 2015 03:10:56 -0500 (EST)
Message-ID: <1914953233.3814567.1422951056539.JavaMail.zimbra@redhat.com>
References: <20141027162748.593451be@jlaw-desktop.mno.stratus.com>
 <20150115082210.31bd3ea5@jlaw-desktop.mno.stratus.com>
 <2054919975.10444188.1421385612513.JavaMail.zimbra@redhat.com>
 <20150116101031.30c04df3@jlaw-desktop.mno.stratus.com>
 <1924199853.11308787.1421634830810.JavaMail.zimbra@redhat.com>
 <20150129145217.1cb31d5c@notabene.brown>
 <371504811.2053160.1422533656432.JavaMail.zimbra@redhat.com>
 <20150202173601.1ab02927@notabene.brown>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
In-Reply-To: <20150202173601.1ab02927@notabene.brown>
Sender: linux-raid-owner@vger.kernel.org
To: NeilBrown
Cc: Joe Lawrence, linux-raid@vger.kernel.org, Bill Kuzeja
List-Id: linux-raid.ids

----- Original Message -----
> From: "NeilBrown"
> To: "Xiao Ni"
> Cc: "Joe Lawrence", linux-raid@vger.kernel.org, "Bill Kuzeja"
> Sent: Monday, February 2, 2015 2:36:01 PM
> Subject: Re: RAID1 removing failed disk returns EBUSY
>
> On Thu, 29 Jan 2015 07:14:16 -0500 (EST) Xiao Ni wrote:
>
> >
> > ----- Original Message -----
> > > From: "NeilBrown"
> > > To: "Xiao Ni"
> > > Cc: "Joe Lawrence", linux-raid@vger.kernel.org, "Bill Kuzeja"
> > > Sent: Thursday, January 29, 2015 11:52:17 AM
> > > Subject: Re: RAID1 removing failed disk returns EBUSY
> > >
> > > On Sun, 18 Jan 2015 21:33:50 -0500 (EST) Xiao Ni wrote:
> > >
> > > >
> > > > ----- Original Message -----
> > > > > From: "Joe Lawrence"
> > > > > To: "Xiao Ni"
> > > > > Cc: "NeilBrown", linux-raid@vger.kernel.org, "Bill Kuzeja"
> > > > > Sent: Friday, January 16, 2015 11:10:31 PM
> > > > > Subject: Re: RAID1 removing failed disk returns EBUSY
> > > > >
> > > > > On Fri, 16 Jan 2015
00:20:12 -0500 Xiao Ni wrote:
> > > > > >
> > > > > > Hi Joe
> > > > > >
> > > > > > Thanks for reminding me. I hadn't done that. Now the disk can be
> > > > > > removed successfully after writing "idle" to sync_action.
> > > > > >
> > > > > > I wrongly thought that the patch referenced in this mail fixes
> > > > > > the problem.
> > > > >
> > > > > So it sounds like even with 3.18 and a new mdadm, this bug still
> > > > > persists?
> > > > >
> > > > > -- Joe
> > > >
> > > > Hi Joe
> > > >
> > > > I'm a little confused now. Does the patch
> > > > 45eaf45dfa4850df16bc2e8e7903d89021137f40 from linux-stable
> > > > resolve the problem?
> > > >
> > > > My environment is:
> > > >
> > > > [root@dhcp-12-133 mdadm]# mdadm --version
> > > > mdadm - v3.3.2-18-g93d3bd3 - 18th December 2014 (this is the
> > > > newest upstream)
> > > > [root@dhcp-12-133 mdadm]# uname -r
> > > > 3.18.2
> > > >
> > > > My steps are:
> > > >
> > > > [root@dhcp-12-133 mdadm]# lsblk
> > > > sdb      8:16   0 931.5G  0 disk
> > > > └─sdb1   8:17   0     5G  0 part
> > > > sdc      8:32   0 186.3G  0 disk
> > > > sdd      8:48   0 931.5G  0 disk
> > > > └─sdd1   8:49   0     5G  0 part
> > > > [root@dhcp-12-133 mdadm]# mdadm -CR /dev/md0 -l1 -n2 /dev/sdb1 /dev/sdd1
> > > > --assume-clean
> > > > mdadm: Note: this array has metadata at the start and
> > > >     may not be suitable as a boot device.  If you plan to
> > > >     store '/boot' on this device please ensure that
> > > >     your boot-loader understands md/v1.x metadata, or use
> > > >     --metadata=0.90
> > > > mdadm: Defaulting to version 1.2 metadata
> > > > mdadm: array /dev/md0 started.
> > > >
> > > > Then I unplug the disk.
> > > >
> > > > [root@dhcp-12-133 mdadm]# lsblk
> > > > sdc      8:32   0 186.3G  0 disk
> > > > sdd      8:48   0 931.5G  0 disk
> > > > └─sdd1   8:49   0     5G  0 part
> > > >   └─md0  9:0    0     5G  0 raid1
> > > > [root@dhcp-12-133 mdadm]# echo faulty > /sys/block/md0/md/dev-sdb1/state
> > > > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > > > -bash: echo: write error: Device or resource busy
> > > > [root@dhcp-12-133 mdadm]# echo idle > /sys/block/md0/md/sync_action
> > > > [root@dhcp-12-133 mdadm]# echo remove > /sys/block/md0/md/dev-sdb1/state
> > > >
> > >
> > > I cannot reproduce this - using Linux 3.18.2.  I'd be surprised if the
> > > mdadm version affects things.
> >
> > Hi Neil
> >
> > I'm very curious, because it reproduces on my machine 100% of the time.
> >
> > >
> > > This error (Device or resource busy) implies that rdev->raid_disk is >= 0
> > > (tested in state_store()).
> > >
> > > ->raid_disk is set to -1 by remove_and_add_spares() providing:
> > >  1/ it isn't Blocked (which is very unlikely)
> > >  2/ hot_remove_disk succeeds, which it will if nr_pending is zero, and
> > >  3/ nr_pending is zero.
> >
> > I remember I tried to check those reasons, and it really is reason 1,
> > the one which is very unlikely.
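
As a side note, the gate Neil describes can be sketched as a small
stand-alone model (the names below are illustrative, not the kernel's;
only the boolean logic mirrors the condition in remove_and_add_spares()):

```python
from dataclasses import dataclass

@dataclass
class Rdev:
    """Toy stand-in for the kernel's struct md_rdev (illustrative only)."""
    raid_disk: int    # slot in the array, -1 once detached
    blocked: bool     # Blocked flag: failure not yet recorded in metadata
    faulty: bool      # Faulty flag
    in_sync: bool     # In_sync flag
    nr_pending: int   # outstanding I/O requests

def can_detach(rdev: Rdev) -> bool:
    # Mirrors the gate above: the device must occupy a slot, must not be
    # Blocked, must be Faulty or out of sync, and must have no pending I/O.
    return (rdev.raid_disk >= 0
            and not rdev.blocked
            and (rdev.faulty or not rdev.in_sync)
            and rdev.nr_pending == 0)

# A failed disk whose failure was never written to the superblock:
# Blocked stays set, so "echo remove" keeps returning EBUSY.
stuck = Rdev(raid_disk=0, blocked=True, faulty=True, in_sync=False,
             nr_pending=0)
print(can_detach(stuck))   # False

# Once the metadata write clears Blocked, the same device becomes removable.
stuck.blocked = False
print(can_detach(stuck))   # True
```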
> >
> > I added some code in the function array_state_show:
> >
> > array_state_show(struct mddev *mddev, char *page)
> > {
> > 	enum array_state st = inactive;
> > 	struct md_rdev *rdev;
> >
> > 	rdev_for_each_rcu(rdev, mddev) {
> > 		printk(KERN_ALERT "search for %s\n",
> > 		       rdev->bdev->bd_disk->disk_name);
> > 		if (test_bit(Blocked, &rdev->flags))
> > 			printk(KERN_ALERT "rdev is Blocked\n");
> > 		else
> > 			printk(KERN_ALERT "rdev is not Blocked\n");
> > 	}
> >
> > After "echo 1 > /sys/block/sdc/device/delete", I ran:
> >
> > [root@dhcp-12-133 md]# cat /sys/block/md0/md/array_state
> > read-auto
>   ^^^^^^^^^
>
> I think that is half the explanation.
> You must have the md_mod.start_ro parameter set to '1'.
>
> > [root@dhcp-12-133 md]# dmesg
> > [ 2679.559185] search for sdc
> > [ 2679.559189] rdev is Blocked
> > [ 2679.559190] search for sdb
> > [ 2679.559190] rdev is not Blocked
> >
> > So sdc is Blocked
>
> and that is the other half - thanks.
> (Yes, I was wrong.  Sometimes that is easier than being right, but it
> still yields results.)
>
> When a device fails, it is Blocked until the metadata is updated to record
> the failure.  This ensures that no writes succeed without writing to that
> device, until we are certain that no read will try reading from that
> device, even after a crash/restart.
>
> Blocked is cleared after the metadata is written, but read-auto (and
> read-only) devices never write out their metadata.  So Blocked doesn't get
> cleared.
>
> When you "echo idle > .../sync_action", one of the side effects is to
> switch from 'read-auto' to fully active.  This allows the metadata to be
> written, Blocked to be cleared, and the device to be removed.
>
> If you
>   echo none > /sys/block/md0/md/dev-sdc/slot
>
> first, then the remove will work.
>
> We could possibly fix it with something like the following, but I'm not
> sure I like it.
There is no guarantee that I can see which would ensure the
> superblock got updated before the first write if the array switched to
> read/write.
>
> NeilBrown
>
> diff --git a/drivers/md/md.c b/drivers/md/md.c
> index 9233c71138f1..b3d1e8e5e067 100644
> --- a/drivers/md/md.c
> +++ b/drivers/md/md.c
> @@ -7528,7 +7528,7 @@ static int remove_and_add_spares(struct mddev *mddev,
>  	rdev_for_each(rdev, mddev)
>  		if ((this == NULL || rdev == this) &&
>  		    rdev->raid_disk >= 0 &&
> -		    !test_bit(Blocked, &rdev->flags) &&
> +		    (!test_bit(Blocked, &rdev->flags) || mddev->ro) &&
>  		    (test_bit(Faulty, &rdev->flags) ||
>  		     ! test_bit(In_sync, &rdev->flags)) &&
>  		    atomic_read(&rdev->nr_pending)==0) {
>

Hi Neil

I have tried the patch and it fixes the problem. I'm sorry that I can't
offer better advice on the approach; I'm not familiar with the metadata
part of md. I'll try to find more time to read the md code.

Best Regards
Xiao
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html