linux-raid.vger.kernel.org archive mirror
* [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes
@ 2006-07-31  7:31 NeilBrown
  2006-07-31  7:31 ` [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md NeilBrown
                   ` (8 more replies)
  0 siblings, 9 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel

Following are 9 patches for md in 2.6-mm-latest that should be
targeted for 2.6.19 (not 2.6.18 material).

Nothing really remarkable, so I'll leave them to speak for themselves.

NeilBrown


 [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md
 [PATCH 002 of 9] md: Fix a comment that is wrong in raid5.h
 [PATCH 003 of 9] md: Factor out part of raid1d into a separate function.
 [PATCH 004 of 9] md: Factor out part of raid10d into a separate function.
 [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags
 [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data.
 [PATCH 007 of 9] md: Remove 'working_disks' from raid10 state
 [PATCH 008 of 9] md: Remove working_disks from raid1 state data
 [PATCH 009 of 9] md: Improve locking around error handling.


* [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
@ 2006-07-31  7:31 ` NeilBrown
  2006-07-31  7:31 ` [PATCH 002 of 9] md: Fix a comment that is wrong in raid5.h NeilBrown
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


From: Adrian Bunk <bunk@stusta.de>

This patch contains the scheduled removal of the START_ARRAY ioctl for md.

Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./Documentation/feature-removal-schedule.txt |    9 --
 ./drivers/md/md.c                            |   82 ---------------------------
 ./include/linux/compat_ioctl.h               |    1 
 ./include/linux/raid/md_u.h                  |    2 
 4 files changed, 1 insertion(+), 93 deletions(-)

diff .prev/Documentation/feature-removal-schedule.txt ./Documentation/feature-removal-schedule.txt
--- .prev/Documentation/feature-removal-schedule.txt	2006-07-31 16:32:06.000000000 +1000
+++ ./Documentation/feature-removal-schedule.txt	2006-07-31 16:32:06.000000000 +1000
@@ -96,15 +96,6 @@ Who:    Arjan van de Ven
 
 ---------------------------
 
-What:	START_ARRAY ioctl for md
-When:	July 2006
-Files:	drivers/md/md.c
-Why:	Not reliable by design - can fail when most needed.
-	Alternatives exist
-Who:	NeilBrown <neilb@suse.de>
-
----------------------------
-
 What:   eepro100 network driver
 When:   January 2007
 Why:    replaced by the e100 driver

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-07-31 16:32:06.000000000 +1000
+++ ./drivers/md/md.c	2006-07-31 16:32:06.000000000 +1000
@@ -3427,67 +3427,6 @@ static void autorun_devices(int part)
 	printk(KERN_INFO "md: ... autorun DONE.\n");
 }
 
-/*
- * import RAID devices based on one partition
- * if possible, the array gets run as well.
- */
-
-static int autostart_array(dev_t startdev)
-{
-	char b[BDEVNAME_SIZE];
-	int err = -EINVAL, i;
-	mdp_super_t *sb = NULL;
-	mdk_rdev_t *start_rdev = NULL, *rdev;
-
-	start_rdev = md_import_device(startdev, 0, 0);
-	if (IS_ERR(start_rdev))
-		return err;
-
-
-	/* NOTE: this can only work for 0.90.0 superblocks */
-	sb = (mdp_super_t*)page_address(start_rdev->sb_page);
-	if (sb->major_version != 0 ||
-	    sb->minor_version != 90 ) {
-		printk(KERN_WARNING "md: can only autostart 0.90.0 arrays\n");
-		export_rdev(start_rdev);
-		return err;
-	}
-
-	if (test_bit(Faulty, &start_rdev->flags)) {
-		printk(KERN_WARNING 
-			"md: can not autostart based on faulty %s!\n",
-			bdevname(start_rdev->bdev,b));
-		export_rdev(start_rdev);
-		return err;
-	}
-	list_add(&start_rdev->same_set, &pending_raid_disks);
-
-	for (i = 0; i < MD_SB_DISKS; i++) {
-		mdp_disk_t *desc = sb->disks + i;
-		dev_t dev = MKDEV(desc->major, desc->minor);
-
-		if (!dev)
-			continue;
-		if (dev == startdev)
-			continue;
-		if (MAJOR(dev) != desc->major || MINOR(dev) != desc->minor)
-			continue;
-		rdev = md_import_device(dev, 0, 0);
-		if (IS_ERR(rdev))
-			continue;
-
-		list_add(&rdev->same_set, &pending_raid_disks);
-	}
-
-	/*
-	 * possibly return codes
-	 */
-	autorun_devices(0);
-	return 0;
-
-}
-
-
 static int get_version(void __user * arg)
 {
 	mdu_version_t ver;
@@ -4246,27 +4185,6 @@ static int md_ioctl(struct inode *inode,
 		goto abort;
 	}
 
-
-	if (cmd == START_ARRAY) {
-		/* START_ARRAY doesn't need to lock the array as autostart_array
-		 * does the locking, and it could even be a different array
-		 */
-		static int cnt = 3;
-		if (cnt > 0 ) {
-			printk(KERN_WARNING
-			       "md: %s(pid %d) used deprecated START_ARRAY ioctl. "
-			       "This will not be supported beyond July 2006\n",
-			       current->comm, current->pid);
-			cnt--;
-		}
-		err = autostart_array(new_decode_dev(arg));
-		if (err) {
-			printk(KERN_WARNING "md: autostart failed!\n");
-			goto abort;
-		}
-		goto done;
-	}
-
 	err = mddev_lock(mddev);
 	if (err) {
 		printk(KERN_INFO 

diff .prev/include/linux/compat_ioctl.h ./include/linux/compat_ioctl.h
--- .prev/include/linux/compat_ioctl.h	2006-07-31 16:32:06.000000000 +1000
+++ ./include/linux/compat_ioctl.h	2006-07-31 16:32:06.000000000 +1000
@@ -120,7 +120,6 @@ COMPATIBLE_IOCTL(PROTECT_ARRAY)
 ULONG_IOCTL(HOT_ADD_DISK)
 ULONG_IOCTL(SET_DISK_FAULTY)
 COMPATIBLE_IOCTL(RUN_ARRAY)
-ULONG_IOCTL(START_ARRAY)
 COMPATIBLE_IOCTL(STOP_ARRAY)
 COMPATIBLE_IOCTL(STOP_ARRAY_RO)
 COMPATIBLE_IOCTL(RESTART_ARRAY_RW)

diff .prev/include/linux/raid/md_u.h ./include/linux/raid/md_u.h
--- .prev/include/linux/raid/md_u.h	2006-07-31 16:32:06.000000000 +1000
+++ ./include/linux/raid/md_u.h	2006-07-31 16:32:06.000000000 +1000
@@ -41,7 +41,7 @@
 
 /* usage */
 #define RUN_ARRAY		_IOW (MD_MAJOR, 0x30, mdu_param_t)
-#define START_ARRAY		_IO (MD_MAJOR, 0x31)
+/*  0x31 was START_ARRAY  */
 #define STOP_ARRAY		_IO (MD_MAJOR, 0x32)
 #define STOP_ARRAY_RO		_IO (MD_MAJOR, 0x33)
 #define RESTART_ARRAY_RW	_IO (MD_MAJOR, 0x34)


* [PATCH 002 of 9] md: Fix a comment that is wrong in raid5.h
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
  2006-07-31  7:31 ` [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md NeilBrown
@ 2006-07-31  7:31 ` NeilBrown
  2006-07-31  7:32 ` [PATCH 003 of 9] md: Factor out part of raid1d into a separate function NeilBrown
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:31 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel



Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./include/linux/raid/raid5.h |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff .prev/include/linux/raid/raid5.h ./include/linux/raid/raid5.h
--- .prev/include/linux/raid/raid5.h	2006-07-31 16:33:02.000000000 +1000
+++ ./include/linux/raid/raid5.h	2006-07-31 16:33:02.000000000 +1000
@@ -195,8 +195,9 @@ struct stripe_head {
  * it to the count of prereading stripes.
  * When write is initiated, or the stripe refcnt == 0 (just in case) we
  * clear the PREREAD_ACTIVE flag and decrement the count
- * Whenever the delayed queue is empty and the device is not plugged, we
- * move any strips from delayed to handle and clear the DELAYED flag and set PREREAD_ACTIVE.
+ * Whenever the 'handle' queue is empty and the device is not plugged, we
+ * move any strips from delayed to handle and clear the DELAYED flag and set
+ * PREREAD_ACTIVE.
  * In stripe_handle, if we find pre-reading is necessary, we do it if
  * PREREAD_ACTIVE is set, else we set DELAYED which will send it to the delayed queue.
  * HANDLE gets cleared if stripe_handle leave nothing locked.


* [PATCH 003 of 9] md: Factor out part of raid1d into a separate function.
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
  2006-07-31  7:31 ` [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md NeilBrown
  2006-07-31  7:31 ` [PATCH 002 of 9] md: Fix a comment that is wrong in raid5.h NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-07-31  7:32 ` [PATCH 004 of 9] md: Factor out part of raid10d " NeilBrown
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


raid1d has too many nested blocks, so take the fix_read_error
functionality out into a separate function.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |  161 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 89 insertions(+), 72 deletions(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2006-07-31 17:23:57.000000000 +1000
+++ ./drivers/md/raid1.c	2006-07-31 17:24:24.000000000 +1000
@@ -1359,6 +1359,89 @@ static void sync_request_write(mddev_t *
  *	3.	Performs writes following reads for array syncronising.
  */
 
+static void fix_read_error(conf_t *conf, int read_disk,
+			   sector_t sect, int sectors)
+{
+	mddev_t *mddev = conf->mddev;
+	while(sectors) {
+		int s = sectors;
+		int d = read_disk;
+		int success = 0;
+		int start;
+		mdk_rdev_t *rdev;
+
+		if (s > (PAGE_SIZE>>9))
+			s = PAGE_SIZE >> 9;
+
+		do {
+			rdev = conf->mirrors[d].rdev;
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags) &&
+			    sync_page_io(rdev->bdev,
+					 sect + rdev->data_offset,
+					 s<<9,
+					 conf->tmppage, READ))
+				success = 1;
+			else {
+				d++;
+				if (d == conf->raid_disks)
+					d = 0;
+			}
+		} while (!success && d != read_disk);
+
+		if (!success) {
+			/* Cannot read from anywhere -- bye bye array */
+			md_error(mddev, conf->mirrors[read_disk].rdev);
+			break;
+		}
+		/* write it back and re-read */
+		start = d;
+		while (d != read_disk) {
+			if (d==0)
+				d = conf->raid_disks;
+			d--;
+			rdev = conf->mirrors[d].rdev;
+			atomic_add(s, &rdev->corrected_errors);
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags)) {
+				if (sync_page_io(rdev->bdev,
+						 sect + rdev->data_offset,
+						 s<<9, conf->tmppage, WRITE)
+				    == 0)
+					/* Well, this device is dead */
+					md_error(mddev, rdev);
+			}
+		}
+		d = start;
+		while (d != read_disk) {
+			char b[BDEVNAME_SIZE];
+			if (d==0)
+				d = conf->raid_disks;
+			d--;
+			rdev = conf->mirrors[d].rdev;
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags)) {
+				if (sync_page_io(rdev->bdev,
+						 sect + rdev->data_offset,
+						 s<<9, conf->tmppage, READ)
+				    == 0)
+					/* Well, this device is dead */
+					md_error(mddev, rdev);
+				else
+					printk(KERN_INFO
+					       "raid1:%s: read error corrected "
+					       "(%d sectors at %llu on %s)\n",
+					       mdname(mddev), s,
+					       (unsigned long long)sect +
+					           rdev->data_offset,
+					       bdevname(rdev->bdev, b));
+			}
+		}
+		sectors -= s;
+		sect += s;
+	}
+}
+
 static void raid1d(mddev_t *mddev)
 {
 	r1bio_t *r1_bio;
@@ -1451,80 +1534,14 @@ static void raid1d(mddev_t *mddev)
 			 * This is all done synchronously while the array is
 			 * frozen
 			 */
-			sector_t sect = r1_bio->sector;
-			int sectors = r1_bio->sectors;
-			freeze_array(conf);
-			if (mddev->ro == 0) while(sectors) {
-				int s = sectors;
-				int d = r1_bio->read_disk;
-				int success = 0;
-
-				if (s > (PAGE_SIZE>>9))
-					s = PAGE_SIZE >> 9;
-
-				do {
-					rdev = conf->mirrors[d].rdev;
-					if (rdev &&
-					    test_bit(In_sync, &rdev->flags) &&
-					    sync_page_io(rdev->bdev,
-							 sect + rdev->data_offset,
-							 s<<9,
-							 conf->tmppage, READ))
-						success = 1;
-					else {
-						d++;
-						if (d == conf->raid_disks)
-							d = 0;
-					}
-				} while (!success && d != r1_bio->read_disk);
-
-				if (success) {
-					/* write it back and re-read */
-					int start = d;
-					while (d != r1_bio->read_disk) {
-						if (d==0)
-							d = conf->raid_disks;
-						d--;
-						rdev = conf->mirrors[d].rdev;
-						atomic_add(s, &rdev->corrected_errors);
-						if (rdev &&
-						    test_bit(In_sync, &rdev->flags)) {
-							if (sync_page_io(rdev->bdev,
-									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, WRITE) == 0)
-								/* Well, this device is dead */
-								md_error(mddev, rdev);
-						}
-					}
-					d = start;
-					while (d != r1_bio->read_disk) {
-						if (d==0)
-							d = conf->raid_disks;
-						d--;
-						rdev = conf->mirrors[d].rdev;
-						if (rdev &&
-						    test_bit(In_sync, &rdev->flags)) {
-							if (sync_page_io(rdev->bdev,
-									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, READ) == 0)
-								/* Well, this device is dead */
-								md_error(mddev, rdev);
-							else
-								printk(KERN_INFO "raid1:%s: read error corrected (%d sectors at %llu on %s)\n",
-								       mdname(mddev), s, (unsigned long long)(sect + rdev->data_offset), bdevname(rdev->bdev, b));
-						}
-					}
-				} else {
-					/* Cannot read from anywhere -- bye bye array */
-					md_error(mddev, conf->mirrors[r1_bio->read_disk].rdev);
-					break;
-				}
-				sectors -= s;
-				sect += s;
+			if (mddev->ro == 0) {
+				freeze_array(conf);
+				fix_read_error(conf, r1_bio->read_disk,
+					       r1_bio->sector,
+					       r1_bio->sectors);
+				unfreeze_array(conf);
 			}
 
-			unfreeze_array(conf);
-
 			bio = r1_bio->bios[r1_bio->read_disk];
 			if ((disk=read_balance(conf, r1_bio)) == -1) {
 				printk(KERN_ALERT "raid1: %s: unrecoverable I/O"


* [PATCH 004 of 9] md: Factor out part of raid10d into a separate function.
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (2 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 003 of 9] md: Factor out part of raid1d into a separate function NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-08-01 17:15   ` Bill Davidsen
  2006-07-31  7:32 ` [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags NeilBrown
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


raid10d has too many nested blocks, so take the fix_read_error
functionality out into a separate function.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c |  213 ++++++++++++++++++++++++++------------------------
 1 file changed, 115 insertions(+), 98 deletions(-)

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2006-07-31 17:23:57.000000000 +1000
+++ ./drivers/md/raid10.c	2006-07-31 17:24:34.000000000 +1000
@@ -1350,9 +1350,119 @@ static void recovery_request_write(mddev
  *
  *	1.	Retries failed read operations on working mirrors.
  *	2.	Updates the raid superblock when problems encounter.
- *	3.	Performs writes following reads for array syncronising.
+ *	3.	Performs writes following reads for array synchronising.
  */
 
+static void fix_read_error(conf_t *conf, mddev_t *mddev, r10bio_t *r10_bio)
+{
+	int sect = 0; /* Offset from r10_bio->sector */
+	int sectors = r10_bio->sectors;
+	mdk_rdev_t*rdev;
+	while(sectors) {
+		int s = sectors;
+		int sl = r10_bio->read_slot;
+		int success = 0;
+		int start;
+
+		if (s > (PAGE_SIZE>>9))
+			s = PAGE_SIZE >> 9;
+
+		rcu_read_lock();
+		do {
+			int d = r10_bio->devs[sl].devnum;
+			rdev = rcu_dereference(conf->mirrors[d].rdev);
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags)) {
+				atomic_inc(&rdev->nr_pending);
+				rcu_read_unlock();
+				success = sync_page_io(rdev->bdev,
+						       r10_bio->devs[sl].addr +
+						       sect + rdev->data_offset,
+						       s<<9,
+						       conf->tmppage, READ);
+				rdev_dec_pending(rdev, mddev);
+				rcu_read_lock();
+				if (success)
+					break;
+			}
+			sl++;
+			if (sl == conf->copies)
+				sl = 0;
+		} while (!success && sl != r10_bio->read_slot);
+		rcu_read_unlock();
+
+		if (!success) {
+			/* Cannot read from anywhere -- bye bye array */
+			int dn = r10_bio->devs[r10_bio->read_slot].devnum;
+			md_error(mddev, conf->mirrors[dn].rdev);
+			break;
+		}
+
+		start = sl;
+		/* write it back and re-read */
+		rcu_read_lock();
+		while (sl != r10_bio->read_slot) {
+			int d;
+			if (sl==0)
+				sl = conf->copies;
+			sl--;
+			d = r10_bio->devs[sl].devnum;
+			rdev = rcu_dereference(conf->mirrors[d].rdev);
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags)) {
+				atomic_inc(&rdev->nr_pending);
+				rcu_read_unlock();
+				atomic_add(s, &rdev->corrected_errors);
+				if (sync_page_io(rdev->bdev,
+						 r10_bio->devs[sl].addr +
+						 sect + rdev->data_offset,
+						 s<<9, conf->tmppage, WRITE)
+				    == 0)
+					/* Well, this device is dead */
+					md_error(mddev, rdev);
+				rdev_dec_pending(rdev, mddev);
+				rcu_read_lock();
+			}
+		}
+		sl = start;
+		while (sl != r10_bio->read_slot) {
+			int d;
+			if (sl==0)
+				sl = conf->copies;
+			sl--;
+			d = r10_bio->devs[sl].devnum;
+			rdev = rcu_dereference(conf->mirrors[d].rdev);
+			if (rdev &&
+			    test_bit(In_sync, &rdev->flags)) {
+				char b[BDEVNAME_SIZE];
+				atomic_inc(&rdev->nr_pending);
+				rcu_read_unlock();
+				if (sync_page_io(rdev->bdev,
+						 r10_bio->devs[sl].addr +
+						 sect + rdev->data_offset,
+						 s<<9, conf->tmppage, READ) == 0)
+					/* Well, this device is dead */
+					md_error(mddev, rdev);
+				else
+					printk(KERN_INFO
+					       "raid10:%s: read error corrected"
+					       " (%d sectors at %llu on %s)\n",
+					       mdname(mddev), s,
+					       (unsigned long long)sect+
+					            rdev->data_offset,
+					       bdevname(rdev->bdev, b));
+
+				rdev_dec_pending(rdev, mddev);
+				rcu_read_lock();
+			}
+		}
+		rcu_read_unlock();
+
+		sectors -= s;
+		sect += s;
+	}
+}
+
 static void raid10d(mddev_t *mddev)
 {
 	r10bio_t *r10_bio;
@@ -1413,105 +1523,12 @@ static void raid10d(mddev_t *mddev)
 			 * This is all done synchronously while the array is
 			 * frozen.
 			 */
-			int sect = 0; /* Offset from r10_bio->sector */
-			int sectors = r10_bio->sectors;
-			freeze_array(conf);
-			if (mddev->ro == 0) while(sectors) {
-				int s = sectors;
-				int sl = r10_bio->read_slot;
-				int success = 0;
-
-				if (s > (PAGE_SIZE>>9))
-					s = PAGE_SIZE >> 9;
-
-				rcu_read_lock();
-				do {
-					int d = r10_bio->devs[sl].devnum;
-					rdev = rcu_dereference(conf->mirrors[d].rdev);
-					if (rdev &&
-					    test_bit(In_sync, &rdev->flags)) {
-						atomic_inc(&rdev->nr_pending);
-						rcu_read_unlock();
-						success = sync_page_io(rdev->bdev,
-								       r10_bio->devs[sl].addr +
-								       sect + rdev->data_offset,
-								       s<<9,
-								       conf->tmppage, READ);
-						rdev_dec_pending(rdev, mddev);
-						rcu_read_lock();
-						if (success)
-							break;
-					}
-					sl++;
-					if (sl == conf->copies)
-						sl = 0;
-				} while (!success && sl != r10_bio->read_slot);
-				rcu_read_unlock();
-
-				if (success) {
-					int start = sl;
-					/* write it back and re-read */
-					rcu_read_lock();
-					while (sl != r10_bio->read_slot) {
-						int d;
-						if (sl==0)
-							sl = conf->copies;
-						sl--;
-						d = r10_bio->devs[sl].devnum;
-						rdev = rcu_dereference(conf->mirrors[d].rdev);
-						if (rdev &&
-						    test_bit(In_sync, &rdev->flags)) {
-							atomic_inc(&rdev->nr_pending);
-							rcu_read_unlock();
-							atomic_add(s, &rdev->corrected_errors);
-							if (sync_page_io(rdev->bdev,
-									 r10_bio->devs[sl].addr +
-									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, WRITE) == 0)
-								/* Well, this device is dead */
-								md_error(mddev, rdev);
-							rdev_dec_pending(rdev, mddev);
-							rcu_read_lock();
-						}
-					}
-					sl = start;
-					while (sl != r10_bio->read_slot) {
-						int d;
-						if (sl==0)
-							sl = conf->copies;
-						sl--;
-						d = r10_bio->devs[sl].devnum;
-						rdev = rcu_dereference(conf->mirrors[d].rdev);
-						if (rdev &&
-						    test_bit(In_sync, &rdev->flags)) {
-							atomic_inc(&rdev->nr_pending);
-							rcu_read_unlock();
-							if (sync_page_io(rdev->bdev,
-									 r10_bio->devs[sl].addr +
-									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, READ) == 0)
-								/* Well, this device is dead */
-								md_error(mddev, rdev);
-							else
-								printk(KERN_INFO "raid10:%s: read error corrected (%d sectors at %llu on %s)\n",
-								       mdname(mddev), s, (unsigned long long)(sect+rdev->data_offset), bdevname(rdev->bdev, b));
-
-							rdev_dec_pending(rdev, mddev);
-							rcu_read_lock();
-						}
-					}
-					rcu_read_unlock();
-				} else {
-					/* Cannot read from anywhere -- bye bye array */
-					md_error(mddev, conf->mirrors[r10_bio->devs[r10_bio->read_slot].devnum].rdev);
-					break;
-				}
-				sectors -= s;
-				sect += s;
+			if (mddev->ro == 0) {
+				freeze_array(conf);
+				fix_read_error(conf, mddev, r10_bio);
+				unfreeze_array(conf);
 			}
 
-			unfreeze_array(conf);
-
 			bio = r10_bio->devs[r10_bio->read_slot].bio;
 			r10_bio->devs[r10_bio->read_slot].bio =
 				mddev->ro ? IO_BLOCKED : NULL;


* [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (3 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 004 of 9] md: Factor out part of raid10d " NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-07-31 15:33   ` Ingo Oeser
  2006-07-31  7:32 ` [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data NeilBrown
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


Instead of magic numbers (0,1,2,3) in sb_dirty, we now have
some well defined bit flags:
MD_CHANGE_DEVS
   Some device state has changed requiring superblock update
   on all devices.
MD_CHANGE_CLEAN
   The array has transitioned from 'clean' to 'dirty' or back,
   requiring a superblock update on active devices, but possibly
   not on spares.
MD_CHANGE_PENDING
   A superblock update is underway.  

We wait for an update to complete by waiting for all flags to
be clear.  A flag can be set at any time, even during an update,
without risk that the change will be lost.

Stop exporting md_update_sb - it isn't needed.


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c           |   68 +++++++++++++++++++++++---------------------
 ./drivers/md/multipath.c    |    3 -
 ./drivers/md/raid1.c        |    2 -
 ./drivers/md/raid10.c       |    2 -
 ./drivers/md/raid5.c        |    8 ++---
 ./include/linux/raid/md.h   |    1 
 ./include/linux/raid/md_k.h |    6 +++
 7 files changed, 49 insertions(+), 41 deletions(-)

diff .prev/drivers/md/md.c ./drivers/md/md.c
--- .prev/drivers/md/md.c	2006-07-31 17:23:57.000000000 +1000
+++ ./drivers/md/md.c	2006-07-31 17:24:34.000000000 +1000
@@ -1587,7 +1587,7 @@ static void sync_sbs(mddev_t * mddev, in
 	}
 }
 
-void md_update_sb(mddev_t * mddev)
+static void md_update_sb(mddev_t * mddev, int force_change)
 {
 	int err;
 	struct list_head *tmp;
@@ -1597,18 +1597,25 @@ void md_update_sb(mddev_t * mddev)
 
 repeat:
 	spin_lock_irq(&mddev->write_lock);
-	sync_req = mddev->in_sync;
-	mddev->utime = get_seconds();
-	if (mddev->sb_dirty == 3)
+
+	set_bit(MD_CHANGE_PENDING, &mddev->flags);
+	if (test_and_clear_bit(MD_CHANGE_DEVS, &mddev->flags))
+		force_change = 1;
+	if (test_and_clear_bit(MD_CHANGE_CLEAN, &mddev->flags))
 		/* just a clean<-> dirty transition, possibly leave spares alone,
 		 * though if events isn't the right even/odd, we will have to do
 		 * spares after all
 		 */
 		nospares = 1;
+	if (force_change)
+		nospares = 0;
+
+	sync_req = mddev->in_sync;
+	mddev->utime = get_seconds();
 
 	/* If this is just a dirty<->clean transition, and the array is clean
 	 * and 'events' is odd, we can roll back to the previous clean state */
-	if (mddev->sb_dirty == 3
+	if (nospares
 	    && (mddev->in_sync && mddev->recovery_cp == MaxSector)
 	    && (mddev->events & 1))
 		mddev->events--;
@@ -1639,7 +1646,6 @@ repeat:
 		MD_BUG();
 		mddev->events --;
 	}
-	mddev->sb_dirty = 2;
 	sync_sbs(mddev, nospares);
 
 	/*
@@ -1647,7 +1653,7 @@ repeat:
 	 * nonpersistent superblocks
 	 */
 	if (!mddev->persistent) {
-		mddev->sb_dirty = 0;
+		clear_bit(MD_CHANGE_PENDING, &mddev->flags);
 		spin_unlock_irq(&mddev->write_lock);
 		wake_up(&mddev->sb_wait);
 		return;
@@ -1684,20 +1690,20 @@ repeat:
 			break;
 	}
 	md_super_wait(mddev);
-	/* if there was a failure, sb_dirty was set to 1, and we re-write super */
+	/* if there was a failure, MD_CHANGE_DEVS was set, and we re-write super */
 
 	spin_lock_irq(&mddev->write_lock);
-	if (mddev->in_sync != sync_req|| mddev->sb_dirty == 1) {
+	if (mddev->in_sync != sync_req ||
+	    test_bit(MD_CHANGE_DEVS, &mddev->flags)) {
 		/* have to write it out again */
 		spin_unlock_irq(&mddev->write_lock);
 		goto repeat;
 	}
-	mddev->sb_dirty = 0;
+	clear_bit(MD_CHANGE_PENDING, &mddev->flags);
 	spin_unlock_irq(&mddev->write_lock);
 	wake_up(&mddev->sb_wait);
 
 }
-EXPORT_SYMBOL_GPL(md_update_sb);
 
 /* words written to sysfs files may, or my not, be \n terminated.
  * We want to accept with case. For this we use cmd_match.
@@ -1770,7 +1776,7 @@ state_store(mdk_rdev_t *rdev, const char
 		else {
 			mddev_t *mddev = rdev->mddev;
 			kick_rdev_from_array(rdev);
-			md_update_sb(mddev);
+			md_update_sb(mddev, 1);
 			md_new_event(mddev);
 			err = 0;
 		}
@@ -2413,7 +2419,7 @@ array_state_store(mddev_t *mddev, const 
 			spin_lock_irq(&mddev->write_lock);
 			if (atomic_read(&mddev->writes_pending) == 0) {
 				mddev->in_sync = 1;
-				mddev->sb_dirty = 1;
+				set_bit(MD_CHANGE_CLEAN, &mddev->flags);
 			}
 			spin_unlock_irq(&mddev->write_lock);
 		} else {
@@ -2425,7 +2431,7 @@ array_state_store(mddev_t *mddev, const 
 	case active:
 		if (mddev->pers) {
 			restart_array(mddev);
-			mddev->sb_dirty = 0;
+			clear_bit(MD_CHANGE_CLEAN, &mddev->flags);
 			wake_up(&mddev->sb_wait);
 			err = 0;
 		} else {
@@ -2530,7 +2536,7 @@ size_store(mddev_t *mddev, const char *b
 
 	if (mddev->pers) {
 		err = update_size(mddev, size);
-		md_update_sb(mddev);
+		md_update_sb(mddev, 1);
 	} else {
 		if (mddev->size == 0 ||
 		    mddev->size > size)
@@ -3098,8 +3104,8 @@ static int do_md_run(mddev_t * mddev)
 	
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	
-	if (mddev->sb_dirty)
-		md_update_sb(mddev);
+	if (mddev->flags)
+		md_update_sb(mddev, 0);
 
 	set_capacity(disk, mddev->array_size<<1);
 
@@ -3262,10 +3268,10 @@ static int do_md_stop(mddev_t * mddev, i
 			if (mddev->ro)
 				mddev->ro = 0;
 		}
-		if (!mddev->in_sync || mddev->sb_dirty) {
+		if (!mddev->in_sync || mddev->flags) {
 			/* mark array as shutdown cleanly */
 			mddev->in_sync = 1;
-			md_update_sb(mddev);
+			md_update_sb(mddev, 1);
 		}
 		if (mode == 1)
 			set_disk_ro(disk, 1);
@@ -3734,7 +3740,7 @@ static int hot_remove_disk(mddev_t * mdd
 		goto busy;
 
 	kick_rdev_from_array(rdev);
-	md_update_sb(mddev);
+	md_update_sb(mddev, 1);
 	md_new_event(mddev);
 
 	return 0;
@@ -3811,7 +3817,7 @@ static int hot_add_disk(mddev_t * mddev,
 
 	rdev->raid_disk = -1;
 
-	md_update_sb(mddev);
+	md_update_sb(mddev, 1);
 
 	/*
 	 * Kick recovery, maybe this spare has to be added to the
@@ -3942,7 +3948,8 @@ static int set_array_info(mddev_t * mdde
 
 	mddev->max_disks     = MD_SB_DISKS;
 
-	mddev->sb_dirty      = 1;
+	mddev->flags         = 0;
+	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 
 	mddev->default_bitmap_offset = MD_SB_BYTES >> 9;
 	mddev->bitmap_offset = 0;
@@ -4111,7 +4118,7 @@ static int update_array_info(mddev_t *md
 			mddev->bitmap_offset = 0;
 		}
 	}
-	md_update_sb(mddev);
+	md_update_sb(mddev, 1);
 	return rv;
 }
 
@@ -4947,12 +4954,12 @@ void md_write_start(mddev_t *mddev, stru
 		spin_lock_irq(&mddev->write_lock);
 		if (mddev->in_sync) {
 			mddev->in_sync = 0;
-			mddev->sb_dirty = 3;
+			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
 			md_wakeup_thread(mddev->thread);
 		}
 		spin_unlock_irq(&mddev->write_lock);
 	}
-	wait_event(mddev->sb_wait, mddev->sb_dirty==0);
+	wait_event(mddev->sb_wait, mddev->flags==0);
 }
 
 void md_write_end(mddev_t *mddev)
@@ -5222,7 +5229,6 @@ void md_do_sync(mddev_t *mddev)
 				    !test_bit(In_sync, &rdev->flags) &&
 				    rdev->recovery_offset < mddev->curr_resync)
 					rdev->recovery_offset = mddev->curr_resync;
-			mddev->sb_dirty = 1;
 		}
 	}
 
@@ -5279,7 +5285,7 @@ void md_check_recovery(mddev_t *mddev)
 	}
 
 	if ( ! (
-		mddev->sb_dirty ||
+		mddev->flags ||
 		test_bit(MD_RECOVERY_NEEDED, &mddev->recovery) ||
 		test_bit(MD_RECOVERY_DONE, &mddev->recovery) ||
 		(mddev->safemode == 1) ||
@@ -5295,14 +5301,14 @@ void md_check_recovery(mddev_t *mddev)
 		if (mddev->safemode && !atomic_read(&mddev->writes_pending) &&
 		    !mddev->in_sync && mddev->recovery_cp == MaxSector) {
 			mddev->in_sync = 1;
-			mddev->sb_dirty = 3;
+			set_bit(MD_CHANGE_CLEAN, &mddev->flags);
 		}
 		if (mddev->safemode == 1)
 			mddev->safemode = 0;
 		spin_unlock_irq(&mddev->write_lock);
 
-		if (mddev->sb_dirty)
-			md_update_sb(mddev);
+		if (mddev->flags)
+			md_update_sb(mddev, 0);
 
 
 		if (test_bit(MD_RECOVERY_RUNNING, &mddev->recovery) &&
@@ -5321,7 +5327,7 @@ void md_check_recovery(mddev_t *mddev)
 				/* activate any spares */
 				mddev->pers->spare_active(mddev);
 			}
-			md_update_sb(mddev);
+			md_update_sb(mddev, 1);
 
 			/* if array is no-longer degraded, then any saved_raid_disk
 			 * information must be scrapped

diff .prev/drivers/md/multipath.c ./drivers/md/multipath.c
--- .prev/drivers/md/multipath.c	2006-07-31 17:23:57.000000000 +1000
+++ ./drivers/md/multipath.c	2006-07-31 17:24:34.000000000 +1000
@@ -253,7 +253,7 @@ static void multipath_error (mddev_t *md
 			char b[BDEVNAME_SIZE];
 			clear_bit(In_sync, &rdev->flags);
 			set_bit(Faulty, &rdev->flags);
-			mddev->sb_dirty = 1;
+			set_bit(MD_CHANGE_DEVS, &mddev->flags);
 			conf->working_disks--;
 			printk(KERN_ALERT "multipath: IO failure on %s,"
 				" disabling IO path. \n	Operation continuing"
@@ -470,7 +470,6 @@ static int multipath_run (mddev_t *mddev
 	}
 
 	conf->raid_disks = mddev->raid_disks;
-	mddev->sb_dirty = 1;
 	conf->mddev = mddev;
 	spin_lock_init(&conf->device_lock);
 	INIT_LIST_HEAD(&conf->retry_list);

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2006-07-31 17:24:24.000000000 +1000
+++ ./drivers/md/raid1.c	2006-07-31 17:24:34.000000000 +1000
@@ -966,7 +966,7 @@ static void error(mddev_t *mddev, mdk_rd
 	}
 	clear_bit(In_sync, &rdev->flags);
 	set_bit(Faulty, &rdev->flags);
-	mddev->sb_dirty = 1;
+	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid1: Disk failure on %s, disabling device. \n"
 		"	Operation continuing on %d devices\n",
 		bdevname(rdev->bdev,b), conf->working_disks);

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2006-07-31 17:24:34.000000000 +1000
+++ ./drivers/md/raid10.c	2006-07-31 17:24:34.000000000 +1000
@@ -960,7 +960,7 @@ static void error(mddev_t *mddev, mdk_rd
 	}
 	clear_bit(In_sync, &rdev->flags);
 	set_bit(Faulty, &rdev->flags);
-	mddev->sb_dirty = 1;
+	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid10: Disk failure on %s, disabling device. \n"
 		"	Operation continuing on %d devices\n",
 		bdevname(rdev->bdev,b), conf->working_disks);

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2006-07-31 17:23:57.000000000 +1000
+++ ./drivers/md/raid5.c	2006-07-31 17:24:34.000000000 +1000
@@ -696,7 +696,7 @@ static void error(mddev_t *mddev, mdk_rd
 	PRINTK("raid5: error called\n");
 
 	if (!test_bit(Faulty, &rdev->flags)) {
-		mddev->sb_dirty = 1;
+		set_bit(MD_CHANGE_DEVS, &mddev->flags);
 		if (test_bit(In_sync, &rdev->flags)) {
 			conf->working_disks--;
 			mddev->degraded++;
@@ -2781,9 +2781,9 @@ static sector_t reshape_request(mddev_t 
 		wait_event(conf->wait_for_overlap,
 			   atomic_read(&conf->reshape_stripes)==0);
 		mddev->reshape_position = conf->expand_progress;
-		mddev->sb_dirty = 1;
+		set_bit(MD_CHANGE_DEVS, &mddev->flags);
 		md_wakeup_thread(mddev->thread);
-		wait_event(mddev->sb_wait, mddev->sb_dirty == 0 ||
+		wait_event(mddev->sb_wait, mddev->flags == 0 ||
 			   kthread_should_stop());
 		spin_lock_irq(&conf->device_lock);
 		conf->expand_lo = mddev->reshape_position;
@@ -3605,7 +3605,7 @@ static int raid5_start_reshape(mddev_t *
 	mddev->degraded = (conf->raid_disks - conf->previous_raid_disks) - added_devices;
 	mddev->raid_disks = conf->raid_disks;
 	mddev->reshape_position = 0;
-	mddev->sb_dirty = 1;
+	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 
 	clear_bit(MD_RECOVERY_SYNC, &mddev->recovery);
 	clear_bit(MD_RECOVERY_CHECK, &mddev->recovery);

diff .prev/include/linux/raid/md.h ./include/linux/raid/md.h
--- .prev/include/linux/raid/md.h	2006-07-31 17:23:57.000000000 +1000
+++ ./include/linux/raid/md.h	2006-07-31 17:24:34.000000000 +1000
@@ -93,7 +93,6 @@ extern int sync_page_io(struct block_dev
 extern void md_do_sync(mddev_t *mddev);
 extern void md_new_event(mddev_t *mddev);
 
-extern void md_update_sb(mddev_t * mddev);
 
 #endif 
 

diff .prev/include/linux/raid/md_k.h ./include/linux/raid/md_k.h
--- .prev/include/linux/raid/md_k.h	2006-07-31 17:23:57.000000000 +1000
+++ ./include/linux/raid/md_k.h	2006-07-31 17:24:34.000000000 +1000
@@ -114,7 +114,11 @@ struct mddev_s
 	dev_t				unit;
 	int				md_minor;
 	struct list_head 		disks;
-	int				sb_dirty;
+	unsigned long			flags;
+#define MD_CHANGE_DEVS	0	/* Some device status has changed */
+#define MD_CHANGE_CLEAN 1	/* transition to or from 'clean' */
+#define MD_CHANGE_PENDING 2	/* superblock update in progress */
+
 	int				ro;
 
 	struct gendisk			*gendisk;


* [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data.
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (4 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-07-31  7:43   ` Michael Tokarev
  2006-07-31  7:32 ` [PATCH 007 of 9] md: Remove 'working_disks' from raid10 state NeilBrown
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


They are not needed.
conf->failed_disks is the same as mddev->degraded
and
conf->working_disks is conf->raid_disks - mddev->degraded.
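
Wherever the old counters were consulted, the same values can be derived
on the fly; an illustrative (not verbatim) example:

    int working_disks = conf->raid_disks - mddev->degraded;
    int failed_disks  = mddev->degraded;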

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid5.c         |   20 ++++++++------------
 ./include/linux/raid/raid5.h |    2 +-
 2 files changed, 9 insertions(+), 13 deletions(-)

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2006-07-31 17:24:34.000000000 +1000
+++ ./drivers/md/raid5.c	2006-07-31 17:24:35.000000000 +1000
@@ -698,9 +698,7 @@ static void error(mddev_t *mddev, mdk_rd
 	if (!test_bit(Faulty, &rdev->flags)) {
 		set_bit(MD_CHANGE_DEVS, &mddev->flags);
 		if (test_bit(In_sync, &rdev->flags)) {
-			conf->working_disks--;
 			mddev->degraded++;
-			conf->failed_disks++;
 			clear_bit(In_sync, &rdev->flags);
 			/*
 			 * if recovery was running, make sure it aborts.
@@ -711,7 +709,7 @@ static void error(mddev_t *mddev, mdk_rd
 		printk (KERN_ALERT
 			"raid5: Disk failure on %s, disabling device."
 			" Operation continuing on %d devices\n",
-			bdevname(rdev->bdev,b), conf->working_disks);
+			bdevname(rdev->bdev,b), conf->raid_disks - mddev->degraded);
 	}
 }
 
@@ -3074,6 +3072,7 @@ static int run(mddev_t *mddev)
 	mdk_rdev_t *rdev;
 	struct disk_info *disk;
 	struct list_head *tmp;
+	int working_disks = 0;
 
 	if (mddev->level != 5 && mddev->level != 4 && mddev->level != 6) {
 		printk(KERN_ERR "raid5: %s: raid level not set to 4/5/6 (%d)\n",
@@ -3176,14 +3175,14 @@ static int run(mddev_t *mddev)
 			printk(KERN_INFO "raid5: device %s operational as raid"
 				" disk %d\n", bdevname(rdev->bdev,b),
 				raid_disk);
-			conf->working_disks++;
+			working_disks++;
 		}
 	}
 
 	/*
 	 * 0 for a fully functional array, 1 or 2 for a degraded array.
 	 */
-	mddev->degraded = conf->failed_disks = conf->raid_disks - conf->working_disks;
+	mddev->degraded = conf->raid_disks - working_disks;
 	conf->mddev = mddev;
 	conf->chunk_size = mddev->chunk_size;
 	conf->level = mddev->level;
@@ -3218,7 +3217,7 @@ static int run(mddev_t *mddev)
 	if (mddev->degraded > conf->max_degraded) {
 		printk(KERN_ERR "raid5: not enough operational devices for %s"
 			" (%d/%d failed)\n",
-			mdname(mddev), conf->failed_disks, conf->raid_disks);
+			mdname(mddev), mddev->degraded, conf->raid_disks);
 		goto abort;
 	}
 
@@ -3375,7 +3374,7 @@ static void status (struct seq_file *seq
 	int i;
 
 	seq_printf (seq, " level %d, %dk chunk, algorithm %d", mddev->level, mddev->chunk_size >> 10, mddev->layout);
-	seq_printf (seq, " [%d/%d] [", conf->raid_disks, conf->working_disks);
+	seq_printf (seq, " [%d/%d] [", conf->raid_disks, conf->raid_disks - mddev->degraded);
 	for (i = 0; i < conf->raid_disks; i++)
 		seq_printf (seq, "%s",
 			       conf->disks[i].rdev &&
@@ -3397,8 +3396,8 @@ static void print_raid5_conf (raid5_conf
 		printk("(conf==NULL)\n");
 		return;
 	}
-	printk(" --- rd:%d wd:%d fd:%d\n", conf->raid_disks,
-		 conf->working_disks, conf->failed_disks);
+	printk(" --- rd:%d wd:%d\n", conf->raid_disks,
+		 conf->raid_disks - conf->mddev->degraded);
 
 	for (i = 0; i < conf->raid_disks; i++) {
 		char b[BDEVNAME_SIZE];
@@ -3422,8 +3421,6 @@ static int raid5_spare_active(mddev_t *m
 		    && !test_bit(Faulty, &tmp->rdev->flags)
 		    && !test_bit(In_sync, &tmp->rdev->flags)) {
 			mddev->degraded--;
-			conf->failed_disks--;
-			conf->working_disks++;
 			set_bit(In_sync, &tmp->rdev->flags);
 		}
 	}
@@ -3593,7 +3590,6 @@ static int raid5_start_reshape(mddev_t *
 			if (raid5_add_disk(mddev, rdev)) {
 				char nm[20];
 				set_bit(In_sync, &rdev->flags);
-				conf->working_disks++;
 				added_devices++;
 				rdev->recovery_offset = 0;
 				sprintf(nm, "rd%d", rdev->raid_disk);

diff .prev/include/linux/raid/raid5.h ./include/linux/raid/raid5.h
--- .prev/include/linux/raid/raid5.h	2006-07-31 17:23:56.000000000 +1000
+++ ./include/linux/raid/raid5.h	2006-07-31 17:24:35.000000000 +1000
@@ -214,7 +214,7 @@ struct raid5_private_data {
 	struct disk_info	*spare;
 	int			chunk_size, level, algorithm;
 	int			max_degraded;
-	int			raid_disks, working_disks, failed_disks;
+	int			raid_disks;
 	int			max_nr_stripes;
 
 	/* used during an expand */


* [PATCH 007 of 9] md: Remove 'working_disks' from raid10 state
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (5 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-07-31  7:32 ` [PATCH 008 of 9] md: Remove working_disks from raid1 state data NeilBrown
  2006-07-31  7:32 ` [PATCH 009 of 9] md: Improve locking around error handling NeilBrown
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


It isn't needed as mddev->degraded contains equivalent info.
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c         |   12 ++++--------
 ./include/linux/raid/raid10.h |    1 -
 2 files changed, 4 insertions(+), 9 deletions(-)

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2006-07-31 17:24:34.000000000 +1000
+++ ./drivers/md/raid10.c	2006-07-31 17:24:35.000000000 +1000
@@ -921,7 +921,7 @@ static void status(struct seq_file *seq,
 			seq_printf(seq, " %d far-copies", conf->far_copies);
 	}
 	seq_printf(seq, " [%d/%d] [", conf->raid_disks,
-						conf->working_disks);
+					conf->raid_disks - mddev->degraded);
 	for (i = 0; i < conf->raid_disks; i++)
 		seq_printf(seq, "%s",
 			      conf->mirrors[i].rdev &&
@@ -941,7 +941,7 @@ static void error(mddev_t *mddev, mdk_rd
 	 * else mark the drive as failed
 	 */
 	if (test_bit(In_sync, &rdev->flags)
-	    && conf->working_disks == 1)
+	    && conf->raid_disks-mddev->degraded == 1)
 		/*
 		 * Don't fail the drive, just return an IO error.
 		 * The test should really be more sophisticated than
@@ -952,7 +952,6 @@ static void error(mddev_t *mddev, mdk_rd
 		return;
 	if (test_bit(In_sync, &rdev->flags)) {
 		mddev->degraded++;
-		conf->working_disks--;
 		/*
 		 * if recovery is running, make sure it aborts.
 		 */
@@ -963,7 +962,7 @@ static void error(mddev_t *mddev, mdk_rd
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid10: Disk failure on %s, disabling device. \n"
 		"	Operation continuing on %d devices\n",
-		bdevname(rdev->bdev,b), conf->working_disks);
+		bdevname(rdev->bdev,b), conf->raid_disks - mddev->degraded);
 }
 
 static void print_conf(conf_t *conf)
@@ -976,7 +975,7 @@ static void print_conf(conf_t *conf)
 		printk("(!conf)\n");
 		return;
 	}
-	printk(" --- wd:%d rd:%d\n", conf->working_disks,
+	printk(" --- wd:%d rd:%d\n", conf->raid_disks - conf->mddev->degraded,
 		conf->raid_disks);
 
 	for (i = 0; i < conf->raid_disks; i++) {
@@ -1035,7 +1034,6 @@ static int raid10_spare_active(mddev_t *
 		if (tmp->rdev
 		    && !test_bit(Faulty, &tmp->rdev->flags)
 		    && !test_bit(In_sync, &tmp->rdev->flags)) {
-			conf->working_disks++;
 			mddev->degraded--;
 			set_bit(In_sync, &tmp->rdev->flags);
 		}
@@ -2035,8 +2033,6 @@ static int run(mddev_t *mddev)
 			mddev->queue->max_sectors = (PAGE_SIZE>>9);
 
 		disk->head_position = 0;
-		if (!test_bit(Faulty, &rdev->flags) && test_bit(In_sync, &rdev->flags))
-			conf->working_disks++;
 	}
 	conf->raid_disks = mddev->raid_disks;
 	conf->mddev = mddev;

diff .prev/include/linux/raid/raid10.h ./include/linux/raid/raid10.h
--- .prev/include/linux/raid/raid10.h	2006-07-31 17:23:56.000000000 +1000
+++ ./include/linux/raid/raid10.h	2006-07-31 17:24:35.000000000 +1000
@@ -16,7 +16,6 @@ struct r10_private_data_s {
 	mddev_t			*mddev;
 	mirror_info_t		*mirrors;
 	int			raid_disks;
-	int			working_disks;
 	spinlock_t		device_lock;
 
 	/* geometry */


* [PATCH 008 of 9] md: Remove working_disks from raid1 state data
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (6 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 007 of 9] md: Remove 'working_disks' from raid10 state NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  2006-07-31  7:32 ` [PATCH 009 of 9] md: Improve locking around error handling NeilBrown
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


It is equivalent to conf->raid_disks - conf->mddev->degraded.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c         |   28 ++++++++++++----------------
 ./include/linux/raid/raid1.h |    1 -
 2 files changed, 12 insertions(+), 17 deletions(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2006-07-31 17:24:34.000000000 +1000
+++ ./drivers/md/raid1.c	2006-07-31 17:24:36.000000000 +1000
@@ -271,7 +271,7 @@ static int raid1_end_read_request(struct
 	 */
 	update_head_pos(mirror, r1_bio);
 
-	if (uptodate || conf->working_disks <= 1) {
+	if (uptodate || (conf->raid_disks - conf->mddev->degraded) <= 1) {
 		/*
 		 * Set R1BIO_Uptodate in our master bio, so that
 		 * we will return a good error code for to the higher
@@ -929,7 +929,7 @@ static void status(struct seq_file *seq,
 	int i;
 
 	seq_printf(seq, " [%d/%d] [", conf->raid_disks,
-						conf->working_disks);
+		   conf->raid_disks - mddev->degraded);
 	for (i = 0; i < conf->raid_disks; i++)
 		seq_printf(seq, "%s",
 			      conf->mirrors[i].rdev &&
@@ -950,7 +950,7 @@ static void error(mddev_t *mddev, mdk_rd
 	 * else mark the drive as failed
 	 */
 	if (test_bit(In_sync, &rdev->flags)
-	    && conf->working_disks == 1)
+	    && (conf->raid_disks - mddev->degraded) == 1)
 		/*
 		 * Don't fail the drive, act as though we were just a
 		 * normal single drive
@@ -958,7 +958,6 @@ static void error(mddev_t *mddev, mdk_rd
 		return;
 	if (test_bit(In_sync, &rdev->flags)) {
 		mddev->degraded++;
-		conf->working_disks--;
 		/*
 		 * if recovery is running, make sure it aborts.
 		 */
@@ -969,7 +968,7 @@ static void error(mddev_t *mddev, mdk_rd
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid1: Disk failure on %s, disabling device. \n"
 		"	Operation continuing on %d devices\n",
-		bdevname(rdev->bdev,b), conf->working_disks);
+		bdevname(rdev->bdev,b), conf->raid_disks - mddev->degraded);
 }
 
 static void print_conf(conf_t *conf)
@@ -982,7 +981,7 @@ static void print_conf(conf_t *conf)
 		printk("(!conf)\n");
 		return;
 	}
-	printk(" --- wd:%d rd:%d\n", conf->working_disks,
+	printk(" --- wd:%d rd:%d\n", conf->raid_disks - conf->mddev->degraded,
 		conf->raid_disks);
 
 	for (i = 0; i < conf->raid_disks; i++) {
@@ -1019,7 +1018,6 @@ static int raid1_spare_active(mddev_t *m
 		if (tmp->rdev 
 		    && !test_bit(Faulty, &tmp->rdev->flags)
 		    && !test_bit(In_sync, &tmp->rdev->flags)) {
-			conf->working_disks++;
 			mddev->degraded--;
 			set_bit(In_sync, &tmp->rdev->flags);
 		}
@@ -1887,15 +1885,11 @@ static int run(mddev_t *mddev)
 			blk_queue_max_sectors(mddev->queue, PAGE_SIZE>>9);
 
 		disk->head_position = 0;
-		if (!test_bit(Faulty, &rdev->flags) && test_bit(In_sync, &rdev->flags))
-			conf->working_disks++;
 	}
 	conf->raid_disks = mddev->raid_disks;
 	conf->mddev = mddev;
 	spin_lock_init(&conf->device_lock);
 	INIT_LIST_HEAD(&conf->retry_list);
-	if (conf->working_disks == 1)
-		mddev->recovery_cp = MaxSector;
 
 	spin_lock_init(&conf->resync_lock);
 	init_waitqueue_head(&conf->wait_barrier);
@@ -1903,11 +1897,6 @@ static int run(mddev_t *mddev)
 	bio_list_init(&conf->pending_bio_list);
 	bio_list_init(&conf->flushing_bio_list);
 
-	if (!conf->working_disks) {
-		printk(KERN_ERR "raid1: no operational mirrors for %s\n",
-			mdname(mddev));
-		goto out_free_conf;
-	}
 
 	mddev->degraded = 0;
 	for (i = 0; i < conf->raid_disks; i++) {
@@ -1920,6 +1909,13 @@ static int run(mddev_t *mddev)
 			mddev->degraded++;
 		}
 	}
+	if (mddev->degraded == conf->raid_disks) {
+		printk(KERN_ERR "raid1: no operational mirrors for %s\n",
+			mdname(mddev));
+		goto out_free_conf;
+	}
+	if (conf->raid_disks - mddev->degraded == 1)
+		mddev->recovery_cp = MaxSector;
 
 	/*
 	 * find the first working one and use it as a starting point

diff .prev/include/linux/raid/raid1.h ./include/linux/raid/raid1.h
--- .prev/include/linux/raid/raid1.h	2006-07-31 17:23:55.000000000 +1000
+++ ./include/linux/raid/raid1.h	2006-07-31 17:24:36.000000000 +1000
@@ -30,7 +30,6 @@ struct r1_private_data_s {
 	mddev_t			*mddev;
 	mirror_info_t		*mirrors;
 	int			raid_disks;
-	int			working_disks;
 	int			last_used;
 	sector_t		next_seq_sect;
 	spinlock_t		device_lock;


* [PATCH 009 of 9] md: Improve locking around error handling.
  2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
                   ` (7 preceding siblings ...)
  2006-07-31  7:32 ` [PATCH 008 of 9] md: Remove working_disks from raid1 state data NeilBrown
@ 2006-07-31  7:32 ` NeilBrown
  8 siblings, 0 replies; 16+ messages in thread
From: NeilBrown @ 2006-07-31  7:32 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid, linux-kernel


The error handling routines don't use proper locking, and
so two concurrent errors could trigger a problem.
So:
  - use test-and-set and test-and-clear to synchronise
    the In_sync bits with the ->degraded count (sketched below)
  - use the spinlock to protect updates to the
    degraded count (we could use an atomic_t, but that
    would be a bigger change in code and isn't
    really justified)
  - remove unnecessary locking in raid5
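
A minimal sketch of the resulting pattern, as it appears in the raid1,
raid10 and raid5 hunks below:

    if (test_and_clear_bit(In_sync, &rdev->flags)) {
        unsigned long flags;
        /* only the thread that actually cleared In_sync adjusts
         * ->degraded, and it does so under device_lock */
        spin_lock_irqsave(&conf->device_lock, flags);
        mddev->degraded++;
        spin_unlock_irqrestore(&conf->device_lock, flags);
    }
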
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c  |   16 +++++++++++-----
 ./drivers/md/raid10.c |   12 ++++++++----
 ./drivers/md/raid5.c  |   20 ++++++++++++--------
 3 files changed, 31 insertions(+), 17 deletions(-)

diff .prev/drivers/md/raid1.c ./drivers/md/raid1.c
--- .prev/drivers/md/raid1.c	2006-07-31 17:24:36.000000000 +1000
+++ ./drivers/md/raid1.c	2006-07-31 17:24:36.000000000 +1000
@@ -956,14 +956,16 @@ static void error(mddev_t *mddev, mdk_rd
 		 * normal single drive
 		 */
 		return;
-	if (test_bit(In_sync, &rdev->flags)) {
+	if (test_and_clear_bit(In_sync, &rdev->flags)) {
+		unsigned long flags;
+		spin_lock_irqsave(&conf->device_lock, flags);
 		mddev->degraded++;
+		spin_unlock_irqrestore(&conf->device_lock, flags);
 		/*
 		 * if recovery is running, make sure it aborts.
 		 */
 		set_bit(MD_RECOVERY_ERR, &mddev->recovery);
 	}
-	clear_bit(In_sync, &rdev->flags);
 	set_bit(Faulty, &rdev->flags);
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid1: Disk failure on %s, disabling device. \n"
@@ -1017,9 +1019,11 @@ static int raid1_spare_active(mddev_t *m
 		tmp = conf->mirrors + i;
 		if (tmp->rdev 
 		    && !test_bit(Faulty, &tmp->rdev->flags)
-		    && !test_bit(In_sync, &tmp->rdev->flags)) {
+		    && !test_and_set_bit(In_sync, &tmp->rdev->flags)) {
+			unsigned long flags;
+			spin_lock_irqsave(&conf->device_lock, flags);
 			mddev->degraded--;
-			set_bit(In_sync, &tmp->rdev->flags);
+			spin_unlock_irqrestore(&conf->device_lock, flags);
 		}
 	}
 
@@ -2034,7 +2038,7 @@ static int raid1_reshape(mddev_t *mddev)
 	mirror_info_t *newmirrors;
 	conf_t *conf = mddev_to_conf(mddev);
 	int cnt, raid_disks;
-
+	unsigned long flags;
 	int d, d2;
 
 	/* Cannot change chunk_size, layout, or level */
@@ -2093,7 +2097,9 @@ static int raid1_reshape(mddev_t *mddev)
 	kfree(conf->poolinfo);
 	conf->poolinfo = newpoolinfo;
 
+	spin_lock_irqsave(&conf->device_lock, flags);
 	mddev->degraded += (raid_disks - conf->raid_disks);
+	spin_unlock_irqrestore(&conf->device_lock, flags);
 	conf->raid_disks = mddev->raid_disks = raid_disks;
 	mddev->delta_disks = 0;
 

diff .prev/drivers/md/raid10.c ./drivers/md/raid10.c
--- .prev/drivers/md/raid10.c	2006-07-31 17:24:35.000000000 +1000
+++ ./drivers/md/raid10.c	2006-07-31 17:24:36.000000000 +1000
@@ -950,14 +950,16 @@ static void error(mddev_t *mddev, mdk_rd
 		 * really dead" tests...
 		 */
 		return;
-	if (test_bit(In_sync, &rdev->flags)) {
+	if (test_and_clear_bit(In_sync, &rdev->flags)) {
+		unsigned long flags;
+		spin_lock_irqsave(&conf->device_lock, flags);
 		mddev->degraded++;
+		spin_unlock_irqrestore(&conf->device_lock, flags);
 		/*
 		 * if recovery is running, make sure it aborts.
 		 */
 		set_bit(MD_RECOVERY_ERR, &mddev->recovery);
 	}
-	clear_bit(In_sync, &rdev->flags);
 	set_bit(Faulty, &rdev->flags);
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT "raid10: Disk failure on %s, disabling device. \n"
@@ -1033,9 +1035,11 @@ static int raid10_spare_active(mddev_t *
 		tmp = conf->mirrors + i;
 		if (tmp->rdev
 		    && !test_bit(Faulty, &tmp->rdev->flags)
-		    && !test_bit(In_sync, &tmp->rdev->flags)) {
+		    && !test_and_set_bit(In_sync, &tmp->rdev->flags)) {
+			unsigned long flags;
+			spin_lock_irqsave(&conf->device_lock, flags);
 			mddev->degraded--;
-			set_bit(In_sync, &tmp->rdev->flags);
+			spin_unlock_irqrestore(&conf->device_lock, flags);
 		}
 	}
 

diff .prev/drivers/md/raid5.c ./drivers/md/raid5.c
--- .prev/drivers/md/raid5.c	2006-07-31 17:24:35.000000000 +1000
+++ ./drivers/md/raid5.c	2006-07-31 17:24:36.000000000 +1000
@@ -636,7 +636,6 @@ static int raid5_end_write_request (stru
  	struct stripe_head *sh = bi->bi_private;
 	raid5_conf_t *conf = sh->raid_conf;
 	int disks = sh->disks, i;
-	unsigned long flags;
 	int uptodate = test_bit(BIO_UPTODATE, &bi->bi_flags);
 
 	if (bi->bi_size)
@@ -654,7 +653,6 @@ static int raid5_end_write_request (stru
 		return 0;
 	}
 
-	spin_lock_irqsave(&conf->device_lock, flags);
 	if (!uptodate)
 		md_error(conf->mddev, conf->disks[i].rdev);
 
@@ -662,8 +660,7 @@ static int raid5_end_write_request (stru
 	
 	clear_bit(R5_LOCKED, &sh->dev[i].flags);
 	set_bit(STRIPE_HANDLE, &sh->state);
-	__release_stripe(conf, sh);
-	spin_unlock_irqrestore(&conf->device_lock, flags);
+	release_stripe(sh);
 	return 0;
 }
 
@@ -697,9 +694,11 @@ static void error(mddev_t *mddev, mdk_rd
 
 	if (!test_bit(Faulty, &rdev->flags)) {
 		set_bit(MD_CHANGE_DEVS, &mddev->flags);
-		if (test_bit(In_sync, &rdev->flags)) {
+		if (test_and_clear_bit(In_sync, &rdev->flags)) {
+			unsigned long flags;
+			spin_lock_irqsave(&conf->device_lock, flags);
 			mddev->degraded++;
-			clear_bit(In_sync, &rdev->flags);
+			spin_unlock_irqrestore(&conf->device_lock, flags);
 			/*
 			 * if recovery was running, make sure it aborts.
 			 */
@@ -3419,9 +3418,11 @@ static int raid5_spare_active(mddev_t *m
 		tmp = conf->disks + i;
 		if (tmp->rdev
 		    && !test_bit(Faulty, &tmp->rdev->flags)
-		    && !test_bit(In_sync, &tmp->rdev->flags)) {
+		    && !test_and_set_bit(In_sync, &tmp->rdev->flags)) {
+			unsigned long flags;
+			spin_lock_irqsave(&conf->device_lock, flags);
 			mddev->degraded--;
-			set_bit(In_sync, &tmp->rdev->flags);
+			spin_unlock_irqrestore(&conf->device_lock, flags);
 		}
 	}
 	print_raid5_conf(conf);
@@ -3557,6 +3558,7 @@ static int raid5_start_reshape(mddev_t *
 	struct list_head *rtmp;
 	int spares = 0;
 	int added_devices = 0;
+	unsigned long flags;
 
 	if (mddev->degraded ||
 	    test_bit(MD_RECOVERY_RUNNING, &mddev->recovery))
@@ -3598,7 +3600,9 @@ static int raid5_start_reshape(mddev_t *
 				break;
 		}
 
+	spin_lock_irqsave(&conf->device_lock, flags);
 	mddev->degraded = (conf->raid_disks - conf->previous_raid_disks) - added_devices;
+	spin_unlock_irqrestore(&conf->device_lock, flags);
 	mddev->raid_disks = conf->raid_disks;
 	mddev->reshape_position = 0;
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);


* Re: [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data.
  2006-07-31  7:32 ` [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data NeilBrown
@ 2006-07-31  7:43   ` Michael Tokarev
  0 siblings, 0 replies; 16+ messages in thread
From: Michael Tokarev @ 2006-07-31  7:43 UTC (permalink / raw)
  To: NeilBrown; +Cc: Andrew Morton, linux-raid, linux-kernel

NeilBrown wrote:
> They are not needed.
> conf->failed_disks is the same as mddev->degraded

By the way, `failed_disks' is more understandable than `degraded'
in this context.  "Degraded" usually refers to the state of the
array in question, when failed_disks > 0.

That is to say: I'd rename degraded back to failed_disks, here and
in the rest of the raid drivers... ;)

/mjt


* Re: [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags
  2006-07-31  7:32 ` [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags NeilBrown
@ 2006-07-31 15:33   ` Ingo Oeser
  2006-08-01 17:12     ` Bill Davidsen
  0 siblings, 1 reply; 16+ messages in thread
From: Ingo Oeser @ 2006-07-31 15:33 UTC (permalink / raw)
  To: NeilBrown; +Cc: Andrew Morton, linux-raid, linux-kernel

Hi Neil,

I think the names in this patch don't match the description at all.
May I suggest different ones?

On Monday, 31. July 2006 09:32, NeilBrown wrote:
> 
> Instead of magic numbers (0,1,2,3) in sb_dirty, we have
> some flags instead:
> MD_CHANGE_DEVS
>    Some device state has changed requiring superblock update
>    on all devices.

MD_SB_STALE or MD_SB_NEED_UPDATE

> MD_CHANGE_CLEAN
>    The array has transitions from 'clean' to 'dirty' or back,
>    requiring a superblock update on active devices, but possibly
>    not on spares

Maybe split this into MD_SB_DIRTY and MD_SB_CLEAN ?

> MD_CHANGE_PENDING
>    A superblock update is underway.  

MD_SB_PENDING_UPDATE


Regards

Ingo Oeser
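
For readers following the naming debate who haven't looked at the patch:
the change replaces a small integer sb_dirty field that took the values
0-3 with individual bits in a flags word, so each condition can be set,
tested and cleared independently.  A minimal userspace sketch of that
style of interface follows - the bit numbers and helper functions here
are illustrative, not the actual md code:

#include <stdio.h>

/* Independent bit flags instead of one magic-number state field.
 * Names follow the patch description; bit values are illustrative. */
enum {
	MD_CHANGE_DEVS    = 0,  /* device state changed: update all superblocks */
	MD_CHANGE_CLEAN   = 1,  /* clean<->dirty transition: active devices only */
	MD_CHANGE_PENDING = 2,  /* a superblock update is underway */
};

static unsigned long md_flags;

static void set_flag(int bit)   { md_flags |=  1UL << bit; }
static void clear_flag(int bit) { md_flags &= ~(1UL << bit); }
static int  test_flag(int bit)  { return !!(md_flags & (1UL << bit)); }

int main(void)
{
	set_flag(MD_CHANGE_DEVS);
	set_flag(MD_CHANGE_PENDING);

	if (test_flag(MD_CHANGE_DEVS))
		puts("write out every superblock");

	clear_flag(MD_CHANGE_PENDING);
	printf("update still pending: %d\n", test_flag(MD_CHANGE_PENDING));
	return 0;
}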

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags
  2006-07-31 15:33   ` Ingo Oeser
@ 2006-08-01 17:12     ` Bill Davidsen
  2006-08-02 22:52       ` Doug Ledford
  0 siblings, 1 reply; 16+ messages in thread
From: Bill Davidsen @ 2006-08-01 17:12 UTC (permalink / raw)
  To: Ingo Oeser; +Cc: NeilBrown, Andrew Morton, linux-raid, linux-kernel

Ingo Oeser wrote:

>Hi Neil,
>
>I think the names in this patch don't match the description at all.
>May I suggest different ones?
>
>On Monday, 31. July 2006 09:32, NeilBrown wrote:
>  
>
>>Instead of magic numbers (0,1,2,3) in sb_dirty, we have
>>some flags instead:
>>MD_CHANGE_DEVS
>>   Some device state has changed requiring superblock update
>>   on all devices.
>>    
>>
>
>MD_SB_STALE or MD_SB_NEED_UPDATE
>  
>
I think STALE is better; it is unambiguous.

>  
>
>>MD_CHANGE_CLEAN
>>   The array has transitions from 'clean' to 'dirty' or back,
>>   requiring a superblock update on active devices, but possibly
>>   not on spares
>>    
>>
>
>Maybe split this into MD_SB_DIRTY and MD_SB_CLEAN ?
>  
>
I don't think the split is beneficial, but I don't care for the name 
much. Some name like SB_UPDATE_NEEDED or the like might be better.

>  
>
>>MD_CHANGE_PENDING
>>   A superblock update is underway.  
>>    
>>
>
>MD_SB_PENDING_UPDATE
>
>  
>
I would have said UPDATE_PENDING, but either is more descriptive than 
the original.

Neil - the logic in this code is pretty complex; all the help you can 
give the occasional reader by using very descriptive names for things 
makes the code easier to follow and reduces your "question due to 
misunderstanding" load.

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 004 of 9] md: Factor out part of raid10d into a separate function.
  2006-07-31  7:32 ` [PATCH 004 of 9] md: Factor out part of raid10d " NeilBrown
@ 2006-08-01 17:15   ` Bill Davidsen
  2006-08-01 20:27     ` Neil Brown
  0 siblings, 1 reply; 16+ messages in thread
From: Bill Davidsen @ 2006-08-01 17:15 UTC (permalink / raw)
  To: NeilBrown; +Cc: Andrew Morton, linux-raid, linux-kernel

don't think this is better, NeilBrown wrote:

>raid10d has toooo many nested block, so take the fix_read_error
>functionality out into a separate function.
>  
>

Definite improvement in readability. Will all versions of the compiler 
do something appropriate WRT inlining or not?

-- 
bill davidsen <davidsen@tmr.com>
  CTO TMR Associates, Inc
  Doing interesting things with small computers since 1979


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 004 of 9] md: Factor out part of raid10d into a separate function.
  2006-08-01 17:15   ` Bill Davidsen
@ 2006-08-01 20:27     ` Neil Brown
  0 siblings, 0 replies; 16+ messages in thread
From: Neil Brown @ 2006-08-01 20:27 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: Andrew Morton, linux-raid, linux-kernel

On Tuesday August 1, davidsen@tmr.com wrote:
> don't think this is better, NeilBrown wrote:
> 
> >raid10d has toooo many nested block, so take the fix_read_error
> >functionality out into a separate function.
> >  
> >
> 
> Definite improvement in readability. Will all versions of the compiler 
> do something appropriate WRT inlining or not?

As the separated function is called about once in a blue moon, it
hardly matters.  I'd probably rather it wasn't inlined so as to be
sure it doesn't clutter the L1 cache when it isn't needed, but that's
the sort of thing I really want to leave to the compiler.

Maybe it would be good to stick an 'unlikely' or 'likely' in raid10d
to tell the compiler how likely a read error is...

NeilBrown
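
For anyone who hasn't met them, likely() and unlikely() are thin
wrappers around GCC's __builtin_expect(), hinting which way a branch
usually goes so the compiler can lay out the cold path out of line.
A small sketch of the kind of annotation being suggested - the
condition and function names are illustrative, not the raid10d code:

#include <stdio.h>

/* Userspace stand-ins for the kernel's likely()/unlikely() macros. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

static void fix_read_error(void)
{
	/* Rarely taken slow path; keeping it out of the hot loop is the
	 * cache argument made above. */
	fprintf(stderr, "handling read error\n");
}

int main(void)
{
	int read_failed = 0;    /* illustrative flag, not the real test */

	if (unlikely(read_failed))
		fix_read_error();
	else
		puts("fast path: no read error");
	return 0;
}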

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags
  2006-08-01 17:12     ` Bill Davidsen
@ 2006-08-02 22:52       ` Doug Ledford
  0 siblings, 0 replies; 16+ messages in thread
From: Doug Ledford @ 2006-08-02 22:52 UTC (permalink / raw)
  To: Bill Davidsen
  Cc: Ingo Oeser, NeilBrown, Andrew Morton, linux-raid, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 2777 bytes --]

On Tue, 2006-08-01 at 13:12 -0400, Bill Davidsen wrote:
> Ingo Oeser wrote:
> 
> >Hi Neil,
> >
> >I think the names in this patch don't match the description at all.
> >May I suggest different ones?
> >
> >On Monday, 31. July 2006 09:32, NeilBrown wrote:
> >  
> >
> >>Instead of magic numbers (0,1,2,3) in sb_dirty, we have
> >>some flags instead:
> >>MD_CHANGE_DEVS
> >>   Some device state has changed requiring superblock update
> >>   on all devices.
> >>    
> >>
> >
> >MD_SB_STALE or MD_SB_NEED_UPDATE
> >  
> >
> I think STALE is better; it is unambiguous.
> >  
> >
> >>MD_CHANGE_CLEAN
> >>   The array has transitions from 'clean' to 'dirty' or back,
> >>   requiring a superblock update on active devices, but possibly
> >>   not on spares
> >>    
> >>
> >
> >Maybe split this into MD_SB_DIRTY and MD_SB_CLEAN ?
> >  
> >
> I don't think the split is beneficial, but I don't care for the name 
> much. Some name like SB_UPDATE_NEEDED or the like might be better.

Yeah, no split.  The absence of a dirty flag denotes clean.

> >  
> >
> >>MD_CHANGE_PENDING
> >>   A superblock update is underway.  
> >>    
> >>
> >
> >MD_SB_PENDING_UPDATE
> >
> >  
> >
> I would have said UPDATE_PENDING, but either is more descriptive than 
> the original.
> 
> Neil - the logic in this code is pretty complex; all the help you can 
> give the occasional reader by using very descriptive names for things 
> makes the code easier to follow and reduces your "question due to 
> misunderstanding" load.

Well, if you don't mind the length you could do:

MD_SB_WRITEOUT_ALL	/* we need to flush all superblocks due
                         * to a device change. */
MD_SB_WRITEOUT_SOME	/* we need to flush the active devices
                         * superblocks due to normal write operations */
MD_SB_WRITEOUT_PENDING	/* the flush is underway */

Trying to come up with descriptive names for this really points out
how having both the "I need something" and "I'm handling something"
flags in the same namespace sucks.

Anyway, I don't think Neil's original names were that bad; they just
describe the condition that precipitated the state rather than the
current state, implying that a reader of the code should be thinking
about what caused the state, more than the state itself, when deciding
on the appropriate course of action.  It may be preferable to leave
them that way to push readers towards that sort of analysis - whatever
Neil thinks is best on that.

-- 
Doug Ledford <dledford@redhat.com>
              GPG KeyID: CFBFF194
              http://people.redhat.com/dledford

Infiniband specific RPMs available at
              http://people.redhat.com/dledford/Infiniband

[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2006-08-02 22:52 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-07-31  7:31 [PATCH 000 of 9] md: Introduction - assorted cleanup and minor fixes NeilBrown
2006-07-31  7:31 ` [PATCH 001 of 9] md: The scheduled removal of the START_ARRAY ioctl for md NeilBrown
2006-07-31  7:31 ` [PATCH 002 of 9] md: Fix a comment that is wrong in raid5.h NeilBrown
2006-07-31  7:32 ` [PATCH 003 of 9] md: Factor out part of raid1d into a separate function NeilBrown
2006-07-31  7:32 ` [PATCH 004 of 9] md: Factor out part of raid10d " NeilBrown
2006-08-01 17:15   ` Bill Davidsen
2006-08-01 20:27     ` Neil Brown
2006-07-31  7:32 ` [PATCH 005 of 9] md: Replace magic numbers in sb_dirty with well defined bit flags NeilBrown
2006-07-31 15:33   ` Ingo Oeser
2006-08-01 17:12     ` Bill Davidsen
2006-08-02 22:52       ` Doug Ledford
2006-07-31  7:32 ` [PATCH 006 of 9] md: Remove the working_disks and failed_disks from raid5 state data NeilBrown
2006-07-31  7:43   ` Michael Tokarev
2006-07-31  7:32 ` [PATCH 007 of 9] md: Remove 'working_disks' from raid10 state NeilBrown
2006-07-31  7:32 ` [PATCH 008 of 9] md: Remove working_disks from raid1 state data NeilBrown
2006-07-31  7:32 ` [PATCH 009 of 9] md: Improve locking around error handling NeilBrown

This is a public inbox; see mirroring instructions
for how to clone and mirror all data and code used for this inbox,
as well as URLs for NNTP newsgroup(s).