* 2.6.11-rc4 md loops on missing drives
@ 2005-02-15 13:38 Brad Campbell
From: Brad Campbell @ 2005-02-15 13:38 UTC (permalink / raw)
  To: RAID Linux

G'day all,

I have just finished my shiny new RAID-6 box. 15 x 250GB SATA drives.
While doing some failure testing (inadvertently due to libata SMART causing command errors) I 
dropped 3 drives out of the array in sequence.
md coped with the first two (as it should), but after the third one dropped out I got the errors below 
spinning continuously in my syslog until I managed to stop the array with mdadm --stop /dev/md0.

I'm not really sure how it's supposed to cope with losing more disks than planned, but filling the 
syslog with nastiness is not very polite.
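
For anyone who wants to trigger this deliberately rather than waiting for libata to do it for them, 
something along these lines should work while the array is resyncing (the device names are just 
placeholders, not the members that actually dropped here):

   # with a resync already running, mark members faulty until the array
   # has lost more than the two drives RAID-6 can tolerate
   mdadm --fail /dev/md0 /dev/sdX
   mdadm --fail /dev/md0 /dev/sdY
   mdadm --fail /dev/md0 /dev/sdZ

   # the kernel log then fills with the "syncing RAID array" messages below;
   # stopping the array was the only way I found to make it stop
   mdadm --stop /dev/md0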

This box takes _ages_ (like between 6 and 10 hours) to rebuild the array, but I'm willing to run some 
tests if anyone has particular RAID-6 stuff they want tested before I put it into service.
I do plan on a couple of days burn-in testing before I really load it up anyway.

The last disk is missing at the moment as I'm short one disk due to a Maxtor dropping its bundle 
after about 5000 hours.

I'm using today's BK kernel plus the libata and libata-dev trees. The drives are all on Promise 
SATA150TX4 controllers.


Feb 15 17:58:28 storage1 kernel: .<6>md: syncing RAID array md0
Feb 15 17:58:28 storage1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Feb 15 17:58:28 storage1 kernel: md: using maximum available idle IO bandwith (but not more than 
200000 KB/sec) for reconstruction.
Feb 15 17:58:28 storage1 kernel: md: using 128k window, over a total of 245117312 blocks.
Feb 15 17:58:28 storage1 kernel: md: md0: sync done.
Feb 15 17:58:28 storage1 kernel: .<6>md: syncing RAID array md0
Feb 15 17:58:28 storage1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Feb 15 17:58:28 storage1 kernel: md: using maximum available idle IO bandwith (but not more than 
200000 KB/sec) for reconstruction.
Feb 15 17:58:28 storage1 kernel: md: using 128k window, over a total of 245117312 blocks.
Feb 15 17:58:28 storage1 kernel: md: md0: sync done.
Feb 15 17:58:28 storage1 kernel: .<6>md: syncing RAID array md0
Feb 15 17:58:28 storage1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Feb 15 17:58:28 storage1 kernel: md: using maximum available idle IO bandwith (but not more than 
200000 KB/sec) for reconstruction.
Feb 15 17:58:28 storage1 kernel: md: using 128k window, over a total of 245117312 blocks.
Feb 15 17:58:28 storage1 kernel: md: md0: sync done.
Feb 15 17:58:28 storage1 kernel: .<6>md: syncing RAID array md0
Feb 15 17:58:28 storage1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Feb 15 17:58:28 storage1 kernel: md: using maximum available idle IO bandwith (but not more than 
200000 KB/sec) for reconstruction.
Feb 15 17:58:28 storage1 kernel: md: using 128k window, over a total of 245117312 blocks.
Feb 15 17:58:28 storage1 kernel: md: md0: sync done.
Feb 15 17:58:28 storage1 kernel: .<6>md: syncing RAID array md0
Feb 15 17:58:28 storage1 kernel: md: minimum _guaranteed_ reconstruction speed: 1000 KB/sec/disc.
Feb 15 17:58:28 storage1 kernel: md: using maximum available idle IO bandwith (but not more than 
200000 KB/sec) for reconstruction.
<to infinity and beyond>

Existing RAID config below. Failing any two additional drives via I/O errors is what triggers this issue.

storage1:/home/brad# mdadm --detail /dev/md0
/dev/md0:
         Version : 00.90.01
   Creation Time : Tue Feb 15 22:00:16 2005
      Raid Level : raid6
      Array Size : 3186525056 (3038.91 GiB 3263.00 GB)
     Device Size : 245117312 (233.76 GiB 251.00 GB)
    Raid Devices : 15
   Total Devices : 15
Preferred Minor : 0
     Persistence : Superblock is persistent

     Update Time : Tue Feb 15 17:17:36 2005
           State : clean, degraded, resyncing
  Active Devices : 14
Working Devices : 14
  Failed Devices : 1
   Spare Devices : 0

      Chunk Size : 128K

  Rebuild Status : 0% complete

            UUID : 11217f79:ac676966:279f2816:f5678084
          Events : 0.40101

     Number   Major   Minor   RaidDevice State
        0       8        0        0      active sync   /dev/devfs/scsi/host0/bus0/target0/lun0/disc
        1       8       16        1      active sync   /dev/devfs/scsi/host1/bus0/target0/lun0/disc
        2       8       32        2      active sync   /dev/devfs/scsi/host2/bus0/target0/lun0/disc
        3       8       48        3      active sync   /dev/devfs/scsi/host3/bus0/target0/lun0/disc
        4       8       64        4      active sync   /dev/devfs/scsi/host4/bus0/target0/lun0/disc
        5       8       80        5      active sync   /dev/devfs/scsi/host5/bus0/target0/lun0/disc
        6       8       96        6      active sync   /dev/devfs/scsi/host6/bus0/target0/lun0/disc
        7       8      112        7      active sync   /dev/devfs/scsi/host7/bus0/target0/lun0/disc
        8       8      128        8      active sync   /dev/devfs/scsi/host8/bus0/target0/lun0/disc
        9       8      144        9      active sync   /dev/devfs/scsi/host9/bus0/target0/lun0/disc
       10       8      160       10      active sync   /dev/devfs/scsi/host10/bus0/target0/lun0/disc
       11       8      176       11      active sync   /dev/devfs/scsi/host11/bus0/target0/lun0/disc
       12       8      192       12      active sync   /dev/devfs/scsi/host12/bus0/target0/lun0/disc
       13       8      208       13      active sync   /dev/devfs/scsi/host13/bus0/target0/lun0/disc
       14       0        0        -      removed

       15       8      224        -      faulty   /dev/devfs/scsi/host14/bus0/target0/lun0/disc

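For completeness, the array was put together with something like this (a rough sketch rather than the 
exact command I used; the short /dev/sd* names stand in for the devfs paths above):

   # 15-slot RAID-6, 128K chunks, with the dead Maxtor's slot left out
   mdadm --create /dev/md0 --level=6 --raid-devices=15 --chunk=128 \
         /dev/sd[a-n] missing
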
Regards,
Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams


* Re: 2.6.11-rc4 md loops on missing drives
@ 2005-02-16  0:27 ` Neil Brown
From: Neil Brown @ 2005-02-16  0:27 UTC (permalink / raw)
  To: Brad Campbell; +Cc: RAID Linux

On Tuesday February 15, brad@wasp.net.au wrote:
> G'day all,
> 
> I'm not really sure how it's supposed to cope with losing more disks than planned, but filling the 
> syslog with nastiness is not very polite.

Thanks for the bug report.  There are actually a few problems relating
to resync/recovery when an array (raid 5 or 6) has lost too many
devices.
This patch should fix them.

NeilBrown

------------------------------------------------
Make raid5 and raid6 robust against failure during recovery.

Two problems are fixed here.
1/ if the array is known to require a resync (parity update),
  but there are too many failed devices, the resync cannot complete
  but will be retried indefinitely.
2/ if the array has too many failed drives to be usable and a spare is
  available, reconstruction will be attempted, but cannot work.  This
  also is retried indefinitely.


Signed-off-by: Neil Brown <neilb@cse.unsw.edu.au>

### Diffstat output
 ./drivers/md/md.c        |   12 ++++++------
 ./drivers/md/raid5.c     |   13 +++++++++++++
 ./drivers/md/raid6main.c |   12 ++++++++++++
 3 files changed, 31 insertions(+), 6 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-02-16 11:25:25.000000000 +1100
+++ ./drivers/md/md.c	2005-02-16 11:25:31.000000000 +1100
@@ -3655,18 +3655,18 @@ void md_check_recovery(mddev_t *mddev)
 
 		/* no recovery is running.
 		 * remove any failed drives, then
-		 * add spares if possible
+		 * add spares if possible.
+		 * Spare are also removed and re-added, to allow
+		 * the personality to fail the re-add.
 		 */
-		ITERATE_RDEV(mddev,rdev,rtmp) {
+		ITERATE_RDEV(mddev,rdev,rtmp)
 			if (rdev->raid_disk >= 0 &&
-			    rdev->faulty &&
+			    (rdev->faulty || ! rdev->in_sync) &&
 			    atomic_read(&rdev->nr_pending)==0) {
 				if (mddev->pers->hot_remove_disk(mddev, rdev->raid_disk)==0)
 					rdev->raid_disk = -1;
 			}
-			if (!rdev->faulty && rdev->raid_disk >= 0 && !rdev->in_sync)
-				spares++;
-		}
+
 		if (mddev->degraded) {
 			ITERATE_RDEV(mddev,rdev,rtmp)
 				if (rdev->raid_disk < 0

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2005-02-16 11:25:25.000000000 +1100
+++ ./drivers/md/raid5.c	2005-02-16 11:25:31.000000000 +1100
@@ -1491,6 +1491,15 @@ static int sync_request (mddev_t *mddev,
 		unplug_slaves(mddev);
 		return 0;
 	}
+	/* if there is 1 or more failed drives and we are trying
+	 * to resync, then assert that we are finished, because there is
+	 * nothing we can do.
+	 */
+	if (mddev->degraded >= 1 && test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
+		int rv = (mddev->size << 1) - sector_nr;
+		md_done_sync(mddev, rv, 1);
+		return rv;
+	}
 
 	x = sector_nr;
 	chunk_offset = sector_div(x, sectors_per_chunk);
@@ -1882,6 +1891,10 @@ static int raid5_add_disk(mddev_t *mddev
 	int disk;
 	struct disk_info *p;
 
+	if (mddev->degraded > 1)
+		/* no point adding a device */
+		return 0;
+
 	/*
 	 * find the disk ...
 	 */

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2005-02-16 11:25:25.000000000 +1100
+++ ./drivers/md/raid6main.c	2005-02-16 11:25:31.000000000 +1100
@@ -1650,6 +1650,15 @@ static int sync_request (mddev_t *mddev,
 		unplug_slaves(mddev);
 		return 0;
 	}
+	/* if there are 2 or more failed drives and we are trying
+	 * to resync, then assert that we are finished, because there is
+	 * nothing we can do.
+	 */
+	if (mddev->degraded >= 2 && test_bit(MD_RECOVERY_SYNC, &mddev->recovery)) {
+		int rv = (mddev->size << 1) - sector_nr;
+		md_done_sync(mddev, rv, 1);
+		return rv;
+	}
 
 	x = sector_nr;
 	chunk_offset = sector_div(x, sectors_per_chunk);
@@ -2048,6 +2057,9 @@ static int raid6_add_disk(mddev_t *mddev
 	int disk;
 	struct disk_info *p;
 
+	if (mddev->degraded > 2)
+		/* no point adding a device */
+		return 0;
 	/*
 	 * find the disk ...
 	 */


* Re: 2.6.11-rc4 md loops on missing drives
@ 2005-02-21  4:32   ` Brad Campbell
From: Brad Campbell @ 2005-02-21  4:32 UTC (permalink / raw)
  To: Neil Brown; +Cc: RAID Linux

Neil Brown wrote:
> On Tuesday February 15, brad@wasp.net.au wrote:
> 
>>G'day all,
>>
>>I'm not really sure how it's supposed to cope with losing more disks than planned, but filling the 
>>syslog with nastiness is not very polite.
> 
> 
> Thanks for the bug report.  There are actually a few problems relating
> to resync/recovery when an array (raid 5 or 6) has lost too many
> devices.
> This patch should fix them.
> 

I applied your latest array of 9 patches to a vanilla BK kernel and did very, very horrible things 
to it while it was rebuilding. I can confirm that it does indeed tidy up the resync issues.

Ta!
Brad
-- 
"Human beings, who are almost unique in having the ability
to learn from the experience of others, are also remarkable
for their apparent disinclination to do so." -- Douglas Adams
