linux-raid.vger.kernel.org archive mirror
* [PATCH md 000 of 14] Introduction
@ 2005-12-01  3:22 NeilBrown
  2005-12-01  3:22 ` [PATCH md 001 of 14] Support check-without-repair of raid10 arrays NeilBrown
                   ` (13 more replies)
  0 siblings, 14 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid

Following are 14 more patches for md in 2.6.15-rc2-mm1

None are appropriate for 2.6.15 - all should wait for 2.6.16 to open.

With these, the handling of read errors by overwriting with correct
data is supported for all relevant raid levels.  User-requested
checking of redundant information is also supported for all levels,
and the number of affected blocks is reported in sysfs (though this
number is only approximate).

/proc/mdstat becomes pollable so that mdadm can get timely
notification of changes without polling at a high frequency.

There are also a substantial number of code cleanups.

 [PATCH md 001 of 14] Support check-without-repair of raid10 arrays
 [PATCH md 002 of 14] Allow raid1 to check consistency
 [PATCH md 003 of 14] Make sure read error on last working drive of raid1 actually returns failure.
 [PATCH md 004 of 14] auto-correct correctable read errors in raid10
 [PATCH md 005 of 14] raid10 read-error handling - resync and read-only
 [PATCH md 006 of 14] Make /proc/mdstat pollable.
 [PATCH md 007 of 14] Clean up 'page' related names in md
 [PATCH md 008 of 14] Convert md to use kzalloc throughout
 [PATCH md 009 of 14] Tidy up raid5/6 hash table code.
 [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic
 [PATCH md 011 of 14] Convert recently exported symbol to GPL
 [PATCH md 012 of 14] Break out of a loop that doesn't need to run to completion.
 [PATCH md 013 of 14] Remove personality numbering from md.
 [PATCH md 014 of 14] Fix possible problem in raid1/raid10 error overwriting.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH md 001 of 14] Support check-without-repair of raid10 arrays
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
@ 2005-12-01  3:22 ` NeilBrown
  2005-12-01  3:22 ` [PATCH md 002 of 14] Allow raid1 to check consistency NeilBrown
                   ` (12 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


Also keep a count of the number of errors found.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c |    4 ++++
 1 file changed, 4 insertions(+)

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-11-28 10:12:52.000000000 +1100
+++ ./drivers/md/raid10.c	2005-11-28 14:01:41.000000000 +1100
@@ -1206,6 +1206,10 @@ static void sync_request_write(mddev_t *
 				break;
 		if (j == vcnt)
 			continue;
+		mddev->resync_mismatches += r10_bio->sectors;
+		if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
+			/* Don't fix anything. */
+			continue;
 		/* Ok, we need to write this bio
 		 * First we need to fixup bv_offset, bv_len and
 		 * bi_vecs, as the read request might have corrupted these
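
The check-versus-repair decision in this hunk is small but easy to misread:
mismatches are always counted, and only the overwrite is skipped in 'check'
mode.  A minimal userspace sketch of the same logic (hypothetical names, a
plain counter standing in for mddev->resync_mismatches):

```c
#include <assert.h>
#include <string.h>

/* Illustrative stand-in for the raid10 sync path: compare a copy
 * against the known-good data, count mismatched sectors, and only
 * schedule a rewrite when repairing (not merely checking).  These
 * names are invented for the example, not the kernel's. */
enum { SECTOR_SIZE = 512 };

struct sync_stats {
	unsigned long resync_mismatches;  /* sectors found to differ */
	int writes_scheduled;             /* rewrites we would issue */
};

void check_copy(const unsigned char *good, const unsigned char *copy,
		int sectors, int check_only, struct sync_stats *st)
{
	if (memcmp(good, copy, (size_t)sectors * SECTOR_SIZE) == 0)
		return;                       /* identical: nothing to do */
	st->resync_mismatches += sectors; /* counted even in check mode */
	if (check_only)
		return;                       /* 'check': report, don't fix */
	st->writes_scheduled++;           /* 'repair': overwrite bad copy */
}
```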


* [PATCH md 002 of 14] Allow raid1 to check consistency
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
  2005-12-01  3:22 ` [PATCH md 001 of 14] Support check-without-repair of raid10 arrays NeilBrown
@ 2005-12-01  3:22 ` NeilBrown
  2005-12-01 22:34   ` Andrew Morton
  2005-12-01  3:23 ` [PATCH md 003 of 14] Make sure read error on last working drive of raid1 actually returns failure NeilBrown
                   ` (11 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:22 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


When performing a user-requested 'check' or 'repair', we read all
readable devices and compare the contents.  We only write to blocks
which had read errors, or blocks whose content differs from the
first good device found.
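
The compare-and-selectively-rewrite policy can be sketched in plain C.
This is an illustrative model only: 'struct mirror', 'check_mirrors' and
the fixed sizes are invented for the example, not the kernel's data
structures.

```c
#include <assert.h>
#include <string.h>

/* Toy model of the raid1 check/repair pass: each mirror either
 * produced a buffer (read succeeded) or not; the first good read is
 * 'primary', and we rewrite only mirrors whose read failed or whose
 * data differs from the primary. */
enum { NDEV = 4, BLK = 4096 };

struct mirror {
	int read_ok;
	unsigned char data[BLK];
	int rewrite;       /* output: would we write this mirror back? */
};

/* Returns the primary index, or -1 if no device was readable. */
int check_mirrors(struct mirror m[NDEV], unsigned long *mismatches)
{
	int primary = -1, i;
	for (i = 0; i < NDEV; i++)
		if (m[i].read_ok) { primary = i; break; }
	if (primary < 0)
		return -1;     /* nothing readable: the block is lost */
	for (i = 0; i < NDEV; i++) {
		if (i == primary)
			continue;
		if (!m[i].read_ok)
			m[i].rewrite = 1;             /* read error: overwrite */
		else if (memcmp(m[i].data, m[primary].data, BLK)) {
			(*mismatches)++;              /* content differs */
			m[i].rewrite = 1;
		}
	}
	return primary;
}
```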



Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |  158 +++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 129 insertions(+), 29 deletions(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-11-28 15:08:48.000000000 +1100
+++ ./drivers/md/raid1.c	2005-11-28 17:20:37.000000000 +1100
@@ -106,15 +106,30 @@ static void * r1buf_pool_alloc(gfp_t gfp
 	}
 	/*
 	 * Allocate RESYNC_PAGES data pages and attach them to
-	 * the first bio;
-	 */
-	bio = r1_bio->bios[0];
-	for (i = 0; i < RESYNC_PAGES; i++) {
-		page = alloc_page(gfp_flags);
-		if (unlikely(!page))
-			goto out_free_pages;
-
-		bio->bi_io_vec[i].bv_page = page;
+	 * the first bio.
+	 * If this is a user-requested check/repair, allocate
+	 * RESYNC_PAGES for each bio.
+	 */
+	if (test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery))
+		j = pi->raid_disks;
+	else
+		j = 1;
+	while(j--) {
+		bio = r1_bio->bios[j];
+		for (i = 0; i < RESYNC_PAGES; i++) {
+			page = alloc_page(gfp_flags);
+			if (unlikely(!page))
+				goto out_free_pages;
+
+			bio->bi_io_vec[i].bv_page = page;
+		}
+	}
+	/* If not a user-requested check/repair, copy the page pointers to all bios */
+	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
+		for (i=0; i<RESYNC_PAGES ; i++)
+			for (j=1; j<pi->raid_disks; j++)
+				r1_bio->bios[j]->bi_io_vec[i].bv_page =
+					r1_bio->bios[0]->bi_io_vec[i].bv_page;
 	}
 
 	r1_bio->master_bio = NULL;
@@ -122,8 +137,10 @@ static void * r1buf_pool_alloc(gfp_t gfp
 	return r1_bio;
 
 out_free_pages:
-	for ( ; i > 0 ; i--)
-		__free_page(bio->bi_io_vec[i-1].bv_page);
+	for (i=0; i < RESYNC_PAGES ; i++)
+		for (j=0 ; j < pi->raid_disks; j++)
+			__free_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
+	j = -1;
 out_free_bio:
 	while ( ++j < pi->raid_disks )
 		bio_put(r1_bio->bios[j]);
@@ -134,14 +151,16 @@ out_free_bio:
 static void r1buf_pool_free(void *__r1_bio, void *data)
 {
 	struct pool_info *pi = data;
-	int i;
+	int i,j;
 	r1bio_t *r1bio = __r1_bio;
-	struct bio *bio = r1bio->bios[0];
 
-	for (i = 0; i < RESYNC_PAGES; i++) {
-		__free_page(bio->bi_io_vec[i].bv_page);
-		bio->bi_io_vec[i].bv_page = NULL;
-	}
+	for (i = 0; i < RESYNC_PAGES; i++)
+		for (j = pi->raid_disks; j-- ;) {
+			if (j == 0 ||
+			    r1bio->bios[j]->bi_io_vec[i].bv_page !=
+			    r1bio->bios[0]->bi_io_vec[i].bv_page)
+				__free_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
+		}
 	for (i=0 ; i < pi->raid_disks; i++)
 		bio_put(r1bio->bios[i]);
 
@@ -1076,13 +1095,16 @@ abort:
 static int end_sync_read(struct bio *bio, unsigned int bytes_done, int error)
 {
 	r1bio_t * r1_bio = (r1bio_t *)(bio->bi_private);
+	int i;
 
 	if (bio->bi_size)
 		return 1;
 
-	if (r1_bio->bios[r1_bio->read_disk] != bio)
-		BUG();
-	update_head_pos(r1_bio->read_disk, r1_bio);
+	for (i=r1_bio->mddev->raid_disks; i--; )
+		if (r1_bio->bios[i] == bio)
+			break;
+	BUG_ON(i < 0);
+	update_head_pos(i, r1_bio);
 	/*
 	 * we have read a block, now it needs to be re-written,
 	 * or re-read if the read failed.
@@ -1090,7 +1112,9 @@ static int end_sync_read(struct bio *bio
 	 */
 	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
 		set_bit(R1BIO_Uptodate, &r1_bio->state);
-	reschedule_retry(r1_bio);
+
+	if (atomic_dec_and_test(&r1_bio->remaining))
+		reschedule_retry(r1_bio);
 	return 0;
 }
 
@@ -1133,9 +1157,65 @@ static void sync_request_write(mddev_t *
 	bio = r1_bio->bios[r1_bio->read_disk];
 
 
-	/*
-	 * schedule writes
-	 */
+	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
+		/* We have read all readable devices.  If we haven't
+		 * got the block, then there is no hope left.
+		 * If we have, then we want to do a comparison
+		 * and skip the write if everything is the same.
+		 * If any blocks failed to read, then we need to
+		 * attempt an over-write
+		 */
+		int primary;
+		if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
+			for (i=0; i<mddev->raid_disks; i++)
+				if (r1_bio->bios[i]->bi_end_io == end_sync_read)
+					md_error(mddev, conf->mirrors[i].rdev);
+
+			md_done_sync(mddev, r1_bio->sectors, 1);
+			put_buf(r1_bio);
+			return;
+		}
+		for (primary=0; primary<mddev->raid_disks; primary++)
+			if (r1_bio->bios[primary]->bi_end_io == end_sync_read &&
+			    test_bit(BIO_UPTODATE, &r1_bio->bios[primary]->bi_flags)) {
+				r1_bio->bios[primary]->bi_end_io = NULL;
+				break;
+			}
+		r1_bio->read_disk = primary;
+		for (i=0; i<mddev->raid_disks; i++)
+			if (r1_bio->bios[i]->bi_end_io == end_sync_read &&
+			    test_bit(BIO_UPTODATE, &r1_bio->bios[i]->bi_flags)) {
+				int j;
+				int vcnt = r1_bio->sectors >> (PAGE_SHIFT- 9);
+				struct bio *pbio = r1_bio->bios[primary];
+				struct bio *sbio = r1_bio->bios[i];
+				for (j = vcnt; j-- ; )
+					if (memcmp(page_address(pbio->bi_io_vec[j].bv_page),
+						   page_address(sbio->bi_io_vec[j].bv_page),
+						   PAGE_SIZE))
+						break;
+				if (j >= 0)
+					mddev->resync_mismatches += r1_bio->sectors;
+				if (j < 0 || test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
+					sbio->bi_end_io = NULL;
+				else {
+					/* fixup the bio for reuse */
+					sbio->bi_vcnt = vcnt;
+					sbio->bi_size = r1_bio->sectors << 9;
+					sbio->bi_idx = 0;
+					sbio->bi_phys_segments = 0;
+					sbio->bi_hw_segments = 0;
+					sbio->bi_hw_front_size = 0;
+					sbio->bi_hw_back_size = 0;
+					sbio->bi_flags &= ~(BIO_POOL_MASK - 1);
+					sbio->bi_flags |= 1 << BIO_UPTODATE;
+					sbio->bi_next = NULL;
+					sbio->bi_sector = r1_bio->sector +
+						conf->mirrors[i].rdev->data_offset;
+					sbio->bi_bdev = conf->mirrors[i].rdev->bdev;
+				}
+			}
+	}
 	if (!test_bit(R1BIO_Uptodate, &r1_bio->state)) {
 		/* ouch - failed to read all of that.
 		 * Try some synchronous reads of other devices to get
@@ -1215,6 +1295,10 @@ static void sync_request_write(mddev_t *
 			idx ++;
 		}
 	}
+
+	/*
+	 * schedule writes
+	 */
 	atomic_set(&r1_bio->remaining, 1);
 	for (i = 0; i < disks ; i++) {
 		wbio = r1_bio->bios[i];
@@ -1617,10 +1701,10 @@ static sector_t sync_request(mddev_t *md
 		for (i=0 ; i < conf->raid_disks; i++) {
 			bio = r1_bio->bios[i];
 			if (bio->bi_end_io) {
-				page = r1_bio->bios[0]->bi_io_vec[bio->bi_vcnt].bv_page;
+				page = bio->bi_io_vec[bio->bi_vcnt].bv_page;
 				if (bio_add_page(bio, page, len, 0) == 0) {
 					/* stop here */
-					r1_bio->bios[0]->bi_io_vec[bio->bi_vcnt].bv_page = page;
+					bio->bi_io_vec[bio->bi_vcnt].bv_page = page;
 					while (i > 0) {
 						i--;
 						bio = r1_bio->bios[i];
@@ -1640,12 +1724,28 @@ static sector_t sync_request(mddev_t *md
 		sync_blocks -= (len>>9);
 	} while (r1_bio->bios[disk]->bi_vcnt < RESYNC_PAGES);
  bio_full:
-	bio = r1_bio->bios[r1_bio->read_disk];
 	r1_bio->sectors = nr_sectors;
 
-	md_sync_acct(conf->mirrors[r1_bio->read_disk].rdev->bdev, nr_sectors);
+	/* For a user-requested sync, we read all readable devices and do a
+	 * compare
+	 */
+	if (test_bit(MD_RECOVERY_REQUESTED, &mddev->recovery)) {
+		atomic_set(&r1_bio->remaining, read_targets);
+		for (i=0; i<conf->raid_disks; i++) {
+			bio = r1_bio->bios[i];
+			if (bio->bi_end_io == end_sync_read) {
+				md_sync_acct(conf->mirrors[i].rdev->bdev, nr_sectors);
+				generic_make_request(bio);
+			}
+		}
+	} else {
+		atomic_set(&r1_bio->remaining, 1);
+		bio = r1_bio->bios[r1_bio->read_disk];
+		md_sync_acct(conf->mirrors[r1_bio->read_disk].rdev->bdev,
+			     nr_sectors);
+		generic_make_request(bio);
 
-	generic_make_request(bio);
+	}
 
 	return nr_sectors;
 }


* [PATCH md 003 of 14] Make sure read error on last working drive of raid1 actually returns failure.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
  2005-12-01  3:22 ` [PATCH md 001 of 14] Support check-without-repair of raid10 arrays NeilBrown
  2005-12-01  3:22 ` [PATCH md 002 of 14] Allow raid1 to check consistency NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 004 of 14] auto-correct correctable read errors in raid10 NeilBrown
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


We are inadvertently setting the R1BIO_Uptodate bit on read errors
when we decide not to try correcting (because there are no other
working devices).  This means that the read error is reported to
the client as success.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-11-28 17:20:37.000000000 +1100
+++ ./drivers/md/raid1.c	2005-11-29 10:28:21.000000000 +1100
@@ -284,7 +284,8 @@ static int raid1_end_read_request(struct
 		 * user-side. So if something waits for IO, then it will
 		 * wait for the 'master' bio.
 		 */
-		set_bit(R1BIO_Uptodate, &r1_bio->state);
+		if (uptodate)
+			set_bit(R1BIO_Uptodate, &r1_bio->state);
 
 		raid_end_bio_io(r1_bio);
 	} else {
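
A toy model of the bug and its one-line fix, with a plain flags word
standing in for the r1_bio state bits (names here are hypothetical, not
the kernel's):

```c
#include <assert.h>

/* Before the fix, the handler set the "up to date" bit
 * unconditionally, so a failed read on the last working mirror was
 * reported to the caller as success. */
#define R1BIO_UPTODATE 0x1

int end_read_buggy(int uptodate, unsigned *state)
{
	(void)uptodate;               /* bug: success reported regardless */
	*state |= R1BIO_UPTODATE;
	return *state & R1BIO_UPTODATE;
}

int end_read_fixed(int uptodate, unsigned *state)
{
	if (uptodate)                 /* the fix: test before setting */
		*state |= R1BIO_UPTODATE;
	return *state & R1BIO_UPTODATE;
}
```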


* [PATCH md 004 of 14] auto-correct correctable read errors in raid10
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (2 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 003 of 14] Make sure read error on last working drive of raid1 actually returns failure NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 005 of 14] raid10 read-error handling - resync and read-only NeilBrown
                   ` (9 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


Largely just a cross-port from raid1.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c         |  127 +++++++++++++++++++++++++++++++++++++-----
 ./include/linux/raid/raid10.h |    2 
 2 files changed, 114 insertions(+), 15 deletions(-)

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-11-29 10:27:40.000000000 +1100
+++ ./drivers/md/raid10.c	2005-11-29 11:04:22.000000000 +1100
@@ -209,6 +209,7 @@ static void reschedule_retry(r10bio_t *r
 
 	spin_lock_irqsave(&conf->device_lock, flags);
 	list_add(&r10_bio->retry_list, &conf->retry_list);
+	conf->nr_queued ++;
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 
 	md_wakeup_thread(mddev->thread);
@@ -254,9 +255,9 @@ static int raid10_end_read_request(struc
 	/*
 	 * this branch is our 'one mirror IO has finished' event handler:
 	 */
-	if (!uptodate)
-		md_error(r10_bio->mddev, conf->mirrors[dev].rdev);
-	else
+	update_head_pos(slot, r10_bio);
+
+	if (uptodate) {
 		/*
 		 * Set R10BIO_Uptodate in our master bio, so that
 		 * we will return a good error code to the higher
@@ -267,15 +268,8 @@ static int raid10_end_read_request(struc
 		 * wait for the 'master' bio.
 		 */
 		set_bit(R10BIO_Uptodate, &r10_bio->state);
-
-	update_head_pos(slot, r10_bio);
-
-	/*
-	 * we have only one bio on the read side
-	 */
-	if (uptodate)
 		raid_end_bio_io(r10_bio);
-	else {
+	} else {
 		/*
 		 * oops, read error:
 		 */
@@ -714,6 +708,33 @@ static void allow_barrier(conf_t *conf)
 	wake_up(&conf->wait_barrier);
 }
 
+static void freeze_array(conf_t *conf)
+{
+	/* stop syncio and normal IO and wait for everything to
+	 * go quiet.
+	 * We increment barrier and nr_waiting, and then
+	 * wait until barrier+nr_pending match nr_queued+2
+	 */
+	spin_lock_irq(&conf->resync_lock);
+	conf->barrier++;
+	conf->nr_waiting++;
+	wait_event_lock_irq(conf->wait_barrier,
+			    conf->barrier+conf->nr_pending == conf->nr_queued+2,
+			    conf->resync_lock,
+			    raid10_unplug(conf->mddev->queue));
+	spin_unlock_irq(&conf->resync_lock);
+}
+
+static void unfreeze_array(conf_t *conf)
+{
+	/* reverse the effect of the freeze */
+	spin_lock_irq(&conf->resync_lock);
+	conf->barrier--;
+	conf->nr_waiting--;
+	wake_up(&conf->wait_barrier);
+	spin_unlock_irq(&conf->resync_lock);
+}
+
 static int make_request(request_queue_t *q, struct bio * bio)
 {
 	mddev_t *mddev = q->queuedata;
@@ -1338,6 +1359,7 @@ static void raid10d(mddev_t *mddev)
 			break;
 		r10_bio = list_entry(head->prev, r10bio_t, retry_list);
 		list_del(head->prev);
+		conf->nr_queued--;
 		spin_unlock_irqrestore(&conf->device_lock, flags);
 
 		mddev = r10_bio->mddev;
@@ -1350,6 +1372,78 @@ static void raid10d(mddev_t *mddev)
 			unplug = 1;
 		} else {
 			int mirror;
+			/* we got a read error. Maybe the drive is bad.  Maybe just
+			 * the block and we can fix it.
+			 * We freeze all other IO, and try reading the block from
+			 * other devices.  When we find one, we re-write
+			 * and check whether that fixes the read error.
+			 * This is all done synchronously while the array is
+			 * frozen.
+			 */
+			int sect = 0; /* Offset from r10_bio->sector */
+			int sectors = r10_bio->sectors;
+			freeze_array(conf);
+			if (mddev->ro == 0) while(sectors) {
+				int s = sectors;
+				int sl = r10_bio->read_slot;
+				int success = 0;
+
+				if (s > (PAGE_SIZE>>9))
+					s = PAGE_SIZE >> 9;
+
+				do {
+					int d = r10_bio->devs[sl].devnum;
+					rdev = conf->mirrors[d].rdev;
+					if (rdev &&
+					    test_bit(In_sync, &rdev->flags) &&
+					    sync_page_io(rdev->bdev,
+							 r10_bio->devs[sl].addr +
+							 sect + rdev->data_offset,
+							 s<<9,
+							 conf->tmppage, READ))
+						success = 1;
+					else {
+						sl++;
+						if (sl == conf->copies)
+							sl = 0;
+					}
+				} while (!success && sl != r10_bio->read_slot);
+
+				if (success) {
+					/* write it back and re-read */
+					while (sl != r10_bio->read_slot) {
+						int d;
+						if (sl==0)
+							sl = conf->copies;
+						sl--;
+						d = r10_bio->devs[sl].devnum;
+						rdev = conf->mirrors[d].rdev;
+						if (rdev &&
+						    test_bit(In_sync, &rdev->flags)) {
+							if (sync_page_io(rdev->bdev,
+									 r10_bio->devs[sl].addr +
+									 sect + rdev->data_offset,
+									 s<<9, conf->tmppage, WRITE) == 0 ||
+							    sync_page_io(rdev->bdev,
+									 r10_bio->devs[sl].addr +
+									 sect + rdev->data_offset,
+									 s<<9, conf->tmppage, READ) == 0) {
+								/* Well, this device is dead */
+								md_error(mddev, rdev);
+							}
+						}
+					}
+				} else {
+					/* Cannot read from anywhere -- bye bye array */
+					md_error(mddev, conf->mirrors[r10_bio->devs[r10_bio->read_slot].devnum].rdev);
+					break;
+				}
+				sectors -= s;
+				sect += s;
+			}
+
+			unfreeze_array(conf);
+
 			bio = r10_bio->devs[r10_bio->read_slot].bio;
 			r10_bio->devs[r10_bio->read_slot].bio = NULL;
 			bio_put(bio);
@@ -1793,22 +1887,24 @@ static int run(mddev_t *mddev)
 	 * bookkeeping area. [whatever we allocate in run(),
 	 * should be freed in stop()]
 	 */
-	conf = kmalloc(sizeof(conf_t), GFP_KERNEL);
+	conf = kzalloc(sizeof(conf_t), GFP_KERNEL);
 	mddev->private = conf;
 	if (!conf) {
 		printk(KERN_ERR "raid10: couldn't allocate memory for %s\n",
 			mdname(mddev));
 		goto out;
 	}
-	memset(conf, 0, sizeof(*conf));
-	conf->mirrors = kmalloc(sizeof(struct mirror_info)*mddev->raid_disks,
+	conf->mirrors = kzalloc(sizeof(struct mirror_info)*mddev->raid_disks,
 				 GFP_KERNEL);
 	if (!conf->mirrors) {
 		printk(KERN_ERR "raid10: couldn't allocate memory for %s\n",
 		       mdname(mddev));
 		goto out_free_conf;
 	}
-	memset(conf->mirrors, 0, sizeof(struct mirror_info)*mddev->raid_disks);
+
+	conf->tmppage = alloc_page(GFP_KERNEL);
+	if (!conf->tmppage)
+		goto out_free_conf;
 
 	conf->near_copies = nc;
 	conf->far_copies = fc;
@@ -1918,6 +2014,7 @@ static int run(mddev_t *mddev)
 out_free_conf:
 	if (conf->r10bio_pool)
 		mempool_destroy(conf->r10bio_pool);
+	put_page(conf->tmppage);
 	kfree(conf->mirrors);
 	kfree(conf);
 	mddev->private = NULL;

diff ./include/linux/raid/raid10.h~current~ ./include/linux/raid/raid10.h
--- ./include/linux/raid/raid10.h~current~	2005-11-29 10:27:40.000000000 +1100
+++ ./include/linux/raid/raid10.h	2005-11-29 11:04:56.000000000 +1100
@@ -42,6 +42,7 @@ struct r10_private_data_s {
 	spinlock_t		resync_lock;
 	int nr_pending;
 	int nr_waiting;
+	int nr_queued;
 	int barrier;
 	sector_t		next_resync;
 	int			fullsync;  /* set to 1 if a full sync is needed,
@@ -53,6 +54,7 @@ struct r10_private_data_s {
 
 	mempool_t *r10bio_pool;
 	mempool_t *r10buf_pool;
+	struct page		*tmppage;
 };
 
 typedef struct r10_private_data_s conf_t;
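
The retry loop added in raid10d can be reduced to a small userspace
function: starting from the slot whose read failed, try each copy in
order, wrapping at 'copies'.  Here 'ok[]' stands in for the result of
sync_page_io() on each copy; the function name is invented for the
example.

```c
#include <assert.h>

/* Returns the slot that finally supplied the data, or -1 if no copy
 * was readable (in which case the kernel fails the device and gives
 * up on the array for this block). */
int find_good_copy(int read_slot, int copies, const int ok[])
{
	int sl = read_slot;
	do {
		if (ok[sl])
			return sl;    /* got the block from this copy */
		if (++sl == copies)
			sl = 0;       /* wrap around the copy list */
	} while (sl != read_slot);
	return -1;                    /* nowhere left to read from */
}
```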


* [PATCH md 005 of 14] raid10 read-error handling - resync and read-only
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (3 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 004 of 14] auto-correct correctable read errors in raid10 NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 006 of 14] Make /proc/mdstat pollable NeilBrown
                   ` (8 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


Add in correct read-error handling for resync and read-only
situations.
When read-only, we don't over-write, so we need to mark the failed
drive in the r10_bio so we don't re-try it.
During resync, we always read all blocks, so if there is a read error,
we simply over-write it with the good block that we found (assuming
we found one).

Note that the recovery case still isn't handled in an interesting way.
There is nothing useful to do for the 2-copies case.  If there are 3
or more copies, then we could try reading from one of the non-missing
copies, but this is a bit complicated and would very rarely be used,
so I'm leaving it for now.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c         |   56 ++++++++++++++++++++++++++----------------
 ./include/linux/raid/raid10.h |    7 +++++
 2 files changed, 42 insertions(+), 21 deletions(-)

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-11-29 14:07:05.000000000 +1100
+++ ./drivers/md/raid10.c	2005-11-29 14:11:01.000000000 +1100
@@ -172,7 +172,7 @@ static void put_all_bios(conf_t *conf, r
 
 	for (i = 0; i < conf->copies; i++) {
 		struct bio **bio = & r10_bio->devs[i].bio;
-		if (*bio)
+		if (*bio && *bio != IO_BLOCKED)
 			bio_put(*bio);
 		*bio = NULL;
 	}
@@ -500,6 +500,7 @@ static int read_balance(conf_t *conf, r1
 		disk = r10_bio->devs[slot].devnum;
 
 		while ((rdev = rcu_dereference(conf->mirrors[disk].rdev)) == NULL ||
+		       r10_bio->devs[slot].bio == IO_BLOCKED ||
 		       !test_bit(In_sync, &rdev->flags)) {
 			slot++;
 			if (slot == conf->copies) {
@@ -517,6 +518,7 @@ static int read_balance(conf_t *conf, r1
 	slot = 0;
 	disk = r10_bio->devs[slot].devnum;
 	while ((rdev=rcu_dereference(conf->mirrors[disk].rdev)) == NULL ||
+	       r10_bio->devs[slot].bio == IO_BLOCKED ||
 	       !test_bit(In_sync, &rdev->flags)) {
 		slot ++;
 		if (slot == conf->copies) {
@@ -537,6 +539,7 @@ static int read_balance(conf_t *conf, r1
 
 
 		if ((rdev=rcu_dereference(conf->mirrors[ndisk].rdev)) == NULL ||
+		    r10_bio->devs[nslot].bio == IO_BLOCKED ||
 		    !test_bit(In_sync, &rdev->flags))
 			continue;
 
@@ -1104,7 +1107,6 @@ abort:
 
 static int end_sync_read(struct bio *bio, unsigned int bytes_done, int error)
 {
-	int uptodate = test_bit(BIO_UPTODATE, &bio->bi_flags);
 	r10bio_t * r10_bio = (r10bio_t *)(bio->bi_private);
 	conf_t *conf = mddev_to_conf(r10_bio->mddev);
 	int i,d;
@@ -1119,7 +1121,10 @@ static int end_sync_read(struct bio *bio
 		BUG();
 	update_head_pos(i, r10_bio);
 	d = r10_bio->devs[i].devnum;
-	if (!uptodate)
+
+	if (test_bit(BIO_UPTODATE, &bio->bi_flags))
+		set_bit(R10BIO_Uptodate, &r10_bio->state);
+	else if (!test_bit(MD_RECOVERY_SYNC, &conf->mddev->recovery))
 		md_error(r10_bio->mddev,
 			 conf->mirrors[d].rdev);
 
@@ -1209,25 +1214,30 @@ static void sync_request_write(mddev_t *
 	fbio = r10_bio->devs[i].bio;
 
 	/* now find blocks with errors */
-	for (i=first+1 ; i < conf->copies ; i++) {
-		int vcnt, j, d;
+	for (i=0 ; i < conf->copies ; i++) {
+		int  j, d;
+		int vcnt = r10_bio->sectors >> (PAGE_SHIFT-9);
 
-		if (!test_bit(BIO_UPTODATE, &r10_bio->devs[i].bio->bi_flags))
-			continue;
-		/* We know that the bi_io_vec layout is the same for
-		 * both 'first' and 'i', so we just compare them.
-		 * All vec entries are PAGE_SIZE;
-		 */
 		tbio = r10_bio->devs[i].bio;
-		vcnt = r10_bio->sectors >> (PAGE_SHIFT-9);
-		for (j = 0; j < vcnt; j++)
-			if (memcmp(page_address(fbio->bi_io_vec[j].bv_page),
-				   page_address(tbio->bi_io_vec[j].bv_page),
-				   PAGE_SIZE))
-				break;
-		if (j == vcnt)
+
+		if (tbio->bi_end_io != end_sync_read)
+			continue;
+		if (i == first)
 			continue;
-		mddev->resync_mismatches += r10_bio->sectors;
+		if (test_bit(BIO_UPTODATE, &r10_bio->devs[i].bio->bi_flags)) {
+			/* We know that the bi_io_vec layout is the same for
+			 * both 'first' and 'i', so we just compare them.
+			 * All vec entries are PAGE_SIZE;
+			 */
+			for (j = 0; j < vcnt; j++)
+				if (memcmp(page_address(fbio->bi_io_vec[j].bv_page),
+					   page_address(tbio->bi_io_vec[j].bv_page),
+					   PAGE_SIZE))
+					break;
+			if (j == vcnt)
+				continue;
+			mddev->resync_mismatches += r10_bio->sectors;
+		}
 		if (test_bit(MD_RECOVERY_CHECK, &mddev->recovery))
 			/* Don't fix anything. */
 			continue;
@@ -1308,7 +1318,10 @@ static void recovery_request_write(mddev
 
 	atomic_inc(&conf->mirrors[d].rdev->nr_pending);
 	md_sync_acct(conf->mirrors[d].rdev->bdev, wbio->bi_size >> 9);
-	generic_make_request(wbio);
+	if (test_bit(R10BIO_Uptodate, &r10_bio->state))
+		generic_make_request(wbio);
+	else
+		bio_endio(wbio, wbio->bi_size, -EIO);
 }
 
 
@@ -1445,7 +1458,8 @@ static void raid10d(mddev_t *mddev)
 			unfreeze_array(conf);
 
 			bio = r10_bio->devs[r10_bio->read_slot].bio;
-			r10_bio->devs[r10_bio->read_slot].bio = NULL;
+			r10_bio->devs[r10_bio->read_slot].bio =
+				mddev->ro ? IO_BLOCKED : NULL;
 			bio_put(bio);
 			mirror = read_balance(conf, r10_bio);
 			if (mirror == -1) {

diff ./include/linux/raid/raid10.h~current~ ./include/linux/raid/raid10.h
--- ./include/linux/raid/raid10.h~current~	2005-11-29 14:07:05.000000000 +1100
+++ ./include/linux/raid/raid10.h	2005-11-29 12:09:11.000000000 +1100
@@ -104,6 +104,13 @@ struct r10bio_s {
 	} devs[0];
 };
 
+/* when we get a read error on a read-only array, we redirect to another
+ * device without failing the first device, or trying to over-write to
+ * correct the read error.  To keep track of bad blocks on a per-bio
+ * level, we store IO_BLOCKED in the appropriate 'bios' pointer
+ */
+#define IO_BLOCKED ((struct bio*)1)
+
 /* bits for r10bio.state */
 #define	R10BIO_Uptodate	0
 #define	R10BIO_IsSync	1
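
The IO_BLOCKED trick can be demonstrated in isolation: an impossible
pointer value marks a per-copy slot as "do not retry" without keeping a
real bio around.  'slot_usable' below is a hypothetical stand-in for the
tests that read_balance() and put_all_bios() perform.

```c
#include <assert.h>
#include <stddef.h>

struct bio;                       /* opaque, as in the kernel */
#define IO_BLOCKED ((struct bio *)1)

int slot_usable(struct bio *b)
{
	/* NULL means "no request here"; IO_BLOCKED means "this copy
	 * already failed on a read-only array"; only a real bio
	 * pointer counts as usable. */
	return b != NULL && b != IO_BLOCKED;
}
```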


* [PATCH md 006 of 14] Make /proc/mdstat pollable.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (4 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 005 of 14] raid10 read-error handling - resync and read-only NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01 22:39   ` Andrew Morton
  2005-12-01  3:23 ` [PATCH md 007 of 14] Clean up 'page' related names in md NeilBrown
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


With this patch it is possible to poll /proc/mdstat to detect
arrays appearing or disappearing, to detect failures,
recovery starting, recovery completing, and devices being
added and removed.

It is similar to the poll-ability of /proc/mounts, though different in that:

 - We always report that the file is readable (because, face it, it is,
   even if only for EOF).
 - We report POLLPRI when there is a change, so that select() can detect
   it as an exceptional event.  Not only are these exceptional events,
   but that is the mechanism the current 'mdadm' uses to watch for
   events (it also polls after a timeout).
 - We also report POLLERR, like /proc/mounts.
 - Finally, we only reset the per-file event counter when the start of
   the file is read, rather than when poll() returns an event.  This is
   more robust, as it means an fd will continue to report activity to
   poll/select until the program clearly responds to that activity.

md_new_event takes an 'mddev' which isn't currently used, but it will
be soon.
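
The event-count scheme reads naturally as a few lines of plain C.  This
is a userspace sketch, not the kernel code: the POLL* values and
function names are stand-ins, and the real implementation uses an
atomic_t and a wait queue.

```c
#include <assert.h>

/* A global event count bumped on every interesting event, and a
 * per-open-file snapshot refreshed only when the start of the file is
 * re-read, so poll() keeps signalling until the reader actually looks. */
#define POLLIN_BIT   0x1
#define POLLPRI_BIT  0x2
#define POLLERR_BIT  0x4

static int md_event_count;                /* global event counter */

struct mdstat_file { int event; };        /* per-open snapshot */

void md_new_event(void) { md_event_count++; }

void mdstat_read_start(struct mdstat_file *f)
{
	f->event = md_event_count;    /* reading offset 0 acks events */
}

int mdstat_poll(const struct mdstat_file *f)
{
	int mask = POLLIN_BIT;        /* the file is always readable */
	if (f->event != md_event_count)
		mask |= POLLERR_BIT | POLLPRI_BIT;
	return mask;
}
```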

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c |   81 ++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 76 insertions(+), 5 deletions(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-12-01 13:56:49.000000000 +1100
+++ ./drivers/md/md.c	2005-12-01 13:57:01.000000000 +1100
@@ -42,6 +42,7 @@
 #include <linux/devfs_fs_kernel.h>
 #include <linux/buffer_head.h> /* for invalidate_bdev */
 #include <linux/suspend.h>
+#include <linux/poll.h>
 
 #include <linux/init.h>
 
@@ -134,6 +135,24 @@ static struct block_device_operations md
 static int start_readonly;
 
 /*
+ * We have a system wide 'event count' that is incremented
+ * on any 'interesting' event, and readers of /proc/mdstat
+ * can use 'poll' or 'select' to find out when the event
+ * count increases.
+ *
+ * Events are:
+ *  start array, stop array, error, add device, remove device,
+ *  start build, activate spare
+ */
+DECLARE_WAIT_QUEUE_HEAD(md_event_waiters);
+static atomic_t md_event_count;
+void md_new_event(mddev_t *mddev)
+{
+	atomic_inc(&md_event_count);
+	wake_up(&md_event_waiters);
+}
+
+/*
  * Enables to iterate over all existing md arrays
  * all_mddevs_lock protects this list.
  */
@@ -2111,6 +2130,7 @@ static int do_md_run(mddev_t * mddev)
 	mddev->queue->make_request_fn = mddev->pers->make_request;
 
 	mddev->changed = 1;
+	md_new_event(mddev);
 	return 0;
 }
 
@@ -2238,6 +2258,7 @@ static int do_md_stop(mddev_t * mddev, i
 		printk(KERN_INFO "md: %s switched to read-only mode.\n",
 			mdname(mddev));
 	err = 0;
+	md_new_event(mddev);
 out:
 	return err;
 }
@@ -2712,6 +2733,7 @@ static int hot_remove_disk(mddev_t * mdd
 
 	kick_rdev_from_array(rdev);
 	md_update_sb(mddev);
+	md_new_event(mddev);
 
 	return 0;
 busy:
@@ -2802,7 +2824,7 @@ static int hot_add_disk(mddev_t * mddev,
 	 */
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
-
+	md_new_event(mddev);
 	return 0;
 
 abort_unbind_export:
@@ -3523,6 +3545,7 @@ void md_error(mddev_t *mddev, mdk_rdev_t
 	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
 	md_wakeup_thread(mddev->thread);
+	md_new_event(mddev);
 }
 
 /* seq_file implementation /proc/mdstat */
@@ -3663,12 +3686,17 @@ static void md_seq_stop(struct seq_file 
 		mddev_put(mddev);
 }
 
+struct mdstat_info {
+	int event;
+};
+
 static int md_seq_show(struct seq_file *seq, void *v)
 {
 	mddev_t *mddev = v;
 	sector_t size;
 	struct list_head *tmp2;
 	mdk_rdev_t *rdev;
+	struct mdstat_info *mi = seq->private;
 	int i;
 	struct bitmap *bitmap;
 
@@ -3681,6 +3709,7 @@ static int md_seq_show(struct seq_file *
 
 		spin_unlock(&pers_lock);
 		seq_printf(seq, "\n");
+		mi->event = atomic_read(&md_event_count);
 		return 0;
 	}
 	if (v == (void*)2) {
@@ -3789,16 +3818,52 @@ static struct seq_operations md_seq_ops 
 static int md_seq_open(struct inode *inode, struct file *file)
 {
 	int error;
+	struct mdstat_info *mi = kmalloc(sizeof(*mi), GFP_KERNEL);
+	if (mi == NULL)
+		return -ENOMEM;
 
 	error = seq_open(file, &md_seq_ops);
+	if (error)
+		kfree(mi);
+	else {
+		struct seq_file *p = file->private_data;
+		p->private = mi;
+		mi->event = atomic_read(&md_event_count);
+	}
 	return error;
 }
 
+static int md_seq_release(struct inode *inode, struct file *file)
+{
+	struct seq_file *m = file->private_data;
+	struct mdstat_info *mi = m->private;
+	m->private = NULL;
+	kfree(mi);
+	return seq_release(inode, file);
+}
+
+static unsigned int mdstat_poll(struct file *filp, poll_table *wait)
+{
+	struct seq_file *m = filp->private_data;
+	struct mdstat_info *mi = m->private;
+	int mask;
+
+	poll_wait(filp, &md_event_waiters, wait);
+
+	/* always allow read */
+	mask = POLLIN | POLLRDNORM;
+
+	if (mi->event != atomic_read(&md_event_count))
+		mask |= POLLERR | POLLPRI;
+	return mask;
+}
+
 static struct file_operations md_seq_fops = {
 	.open           = md_seq_open,
 	.read           = seq_read,
 	.llseek         = seq_lseek,
-	.release	= seq_release,
+	.release	= md_seq_release,
+	.poll		= mdstat_poll,
 };
 
 int register_md_personality(int pnum, mdk_personality_t *p)
@@ -4068,7 +4133,11 @@ static void md_do_sync(mddev_t *mddev)
 
 		j += sectors;
 		if (j>1) mddev->curr_resync = j;
-
+		if (last_check == 0)
+			/* this is the earliest that rebuild will be
+			 * visible in /proc/mdstat
+			 */
+			md_new_event(mddev);
 
 		if (last_check + window > io_sectors || j == max_sectors)
 			continue;
@@ -4254,6 +4323,7 @@ void md_check_recovery(mddev_t *mddev)
 			mddev->recovery = 0;
 			/* flag recovery needed just to double check */
 			set_bit(MD_RECOVERY_NEEDED, &mddev->recovery);
+			md_new_event(mddev);
 			goto unlock;
 		}
 		/* Clear some bits that don't mean anything, but
@@ -4291,6 +4361,7 @@ void md_check_recovery(mddev_t *mddev)
 						sprintf(nm, "rd%d", rdev->raid_disk);
 						sysfs_create_link(&mddev->kobj, &rdev->kobj, nm);
 						spares++;
+						md_new_event(mddev);
 					} else
 						break;
 				}
@@ -4323,9 +4394,9 @@ void md_check_recovery(mddev_t *mddev)
 					mdname(mddev));
 				/* leave the spares where they are, it shouldn't hurt */
 				mddev->recovery = 0;
-			} else {
+			} else
 				md_wakeup_thread(mddev->sync_thread);
-			}
+			md_new_event(mddev);
 		}
 	unlock:
 		mddev_unlock(mddev);
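The `mdstat_poll` handler added above always reports `POLLIN | POLLRDNORM` and raises `POLLPRI | POLLERR` only when the event count has moved, so a userspace consumer (mdadm, for instance) keeps the file open, waits for the exceptional bits, then re-reads from offset 0 — re-reading is what refreshes the per-open `mi->event` snapshot so the next poll blocks again. A minimal hedged sketch of such a consumer (the helper name and structure are illustrative, not taken from mdadm):

```c
#include <assert.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Open `path`, poll it once for up to `timeout_ms`, and return the
 * revents mask (0 on timeout, -1 on error).  For /proc/mdstat with the
 * patch above, POLLIN is always set; a caller interested in "something
 * changed" must test POLLPRI (or POLLERR) specifically. */
int poll_once(const char *path, int timeout_ms)
{
	struct pollfd pfd;
	int ret;

	pfd.fd = open(path, O_RDONLY);
	if (pfd.fd < 0)
		return -1;
	pfd.events = POLLIN | POLLPRI;

	ret = poll(&pfd, 1, timeout_ms);
	close(pfd.fd);
	if (ret < 0)
		return -1;
	return ret == 0 ? 0 : pfd.revents;
}
```

A real monitor would loop: poll, check `revents & POLLPRI`, `lseek(fd, 0, SEEK_SET)`, read the whole file, repeat — without the high-frequency re-read loop that motivated this patch.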

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH md 007 of 14] Clean up 'page' related names in md
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (5 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 006 of 14] Make /proc/mdstat pollable NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 008 of 14] Convert md to use kzalloc throughout NeilBrown
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


Substitute:

  page_cache_get -> get_page
  page_cache_release -> put_page
  PAGE_CACHE_SHIFT -> PAGE_SHIFT
  PAGE_CACHE_SIZE -> PAGE_SIZE
  PAGE_CACHE_MASK -> PAGE_MASK
  __free_page -> put_page

because we aren't using the page cache; we are just using pages.
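The renamed helpers are plain page-reference operations: `get_page` takes an extra reference, `put_page` drops one and frees the page when the count reaches zero. A userspace sketch of those semantics (a stand-in type, not kernel code) looks like:

```c
#include <assert.h>
#include <stdlib.h>

/* Illustrative stand-in for struct page: only the refcount matters here. */
struct fake_page {
	int refcount;
};

struct fake_page *fake_alloc_page(void)
{
	struct fake_page *p = calloc(1, sizeof(*p));
	if (p)
		p->refcount = 1;	/* the allocation holds one reference */
	return p;
}

void fake_get_page(struct fake_page *p)
{
	p->refcount++;		/* cf. get_page(): pin the page */
}

/* cf. put_page(): drop a reference; returns 1 if this drop freed it. */
int fake_put_page(struct fake_page *p)
{
	if (--p->refcount == 0) {
		free(p);
		return 1;
	}
	return 0;
}
```

This is why `__free_page -> put_page` is safe in the hunks below: on a page with a single reference, dropping the last reference frees it.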


Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c    |   44 ++++++++++++++++++++++----------------------
 ./drivers/md/md.c        |    2 +-
 ./drivers/md/raid0.c     |    2 +-
 ./drivers/md/raid1.c     |   10 +++++-----
 ./drivers/md/raid10.c    |    8 ++++----
 ./drivers/md/raid5.c     |    4 ++--
 ./drivers/md/raid6main.c |    6 +++---
 7 files changed, 38 insertions(+), 38 deletions(-)

diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
--- ./drivers/md/bitmap.c~current~	2005-12-01 13:56:49.000000000 +1100
+++ ./drivers/md/bitmap.c	2005-12-01 13:57:54.000000000 +1100
@@ -341,7 +341,7 @@ static int write_page(struct bitmap *bit
 		/* add to list to be waited for by daemon */
 		struct page_list *item = mempool_alloc(bitmap->write_pool, GFP_NOIO);
 		item->page = page;
-		page_cache_get(page);
+		get_page(page);
 		spin_lock(&bitmap->write_lock);
 		list_add(&item->list, &bitmap->complete_pages);
 		spin_unlock(&bitmap->write_lock);
@@ -357,10 +357,10 @@ static struct page *read_page(struct fil
 	struct inode *inode = file->f_mapping->host;
 	struct page *page = NULL;
 	loff_t isize = i_size_read(inode);
-	unsigned long end_index = isize >> PAGE_CACHE_SHIFT;
+	unsigned long end_index = isize >> PAGE_SHIFT;
 
-	PRINTK("read bitmap file (%dB @ %Lu)\n", (int)PAGE_CACHE_SIZE,
-			(unsigned long long)index << PAGE_CACHE_SHIFT);
+	PRINTK("read bitmap file (%dB @ %Lu)\n", (int)PAGE_SIZE,
+			(unsigned long long)index << PAGE_SHIFT);
 
 	page = read_cache_page(inode->i_mapping, index,
 			(filler_t *)inode->i_mapping->a_ops->readpage, file);
@@ -368,7 +368,7 @@ static struct page *read_page(struct fil
 		goto out;
 	wait_on_page_locked(page);
 	if (!PageUptodate(page) || PageError(page)) {
-		page_cache_release(page);
+		put_page(page);
 		page = ERR_PTR(-EIO);
 		goto out;
 	}
@@ -376,14 +376,14 @@ static struct page *read_page(struct fil
 	if (index > end_index) /* we have read beyond EOF */
 		*bytes_read = 0;
 	else if (index == end_index) /* possible short read */
-		*bytes_read = isize & ~PAGE_CACHE_MASK;
+		*bytes_read = isize & ~PAGE_MASK;
 	else
-		*bytes_read = PAGE_CACHE_SIZE; /* got a full page */
+		*bytes_read = PAGE_SIZE; /* got a full page */
 out:
 	if (IS_ERR(page))
 		printk(KERN_ALERT "md: bitmap read error: (%dB @ %Lu): %ld\n",
-			(int)PAGE_CACHE_SIZE,
-			(unsigned long long)index << PAGE_CACHE_SHIFT,
+			(int)PAGE_SIZE,
+			(unsigned long long)index << PAGE_SHIFT,
 			PTR_ERR(page));
 	return page;
 }
@@ -558,7 +558,7 @@ static void bitmap_mask_state(struct bit
 		spin_unlock_irqrestore(&bitmap->lock, flags);
 		return;
 	}
-	page_cache_get(bitmap->sb_page);
+	get_page(bitmap->sb_page);
 	spin_unlock_irqrestore(&bitmap->lock, flags);
 	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
 	switch (op) {
@@ -569,7 +569,7 @@ static void bitmap_mask_state(struct bit
 		default: BUG();
 	}
 	kunmap(bitmap->sb_page);
-	page_cache_release(bitmap->sb_page);
+	put_page(bitmap->sb_page);
 }
 
 /*
@@ -622,12 +622,12 @@ static void bitmap_file_unmap(struct bit
 
 	while (pages--)
 		if (map[pages]->index != 0) /* 0 is sb_page, release it below */
-			page_cache_release(map[pages]);
+			put_page(map[pages]);
 	kfree(map);
 	kfree(attr);
 
 	if (sb_page)
-		page_cache_release(sb_page);
+		put_page(sb_page);
 }
 
 static void bitmap_stop_daemon(struct bitmap *bitmap);
@@ -654,7 +654,7 @@ static void drain_write_queues(struct bi
 
 	while ((item = dequeue_page(bitmap))) {
 		/* don't bother to wait */
-		page_cache_release(item->page);
+		put_page(item->page);
 		mempool_free(item, bitmap->write_pool);
 	}
 
@@ -763,7 +763,7 @@ static void bitmap_file_set_bit(struct b
 
 	/* make sure the page stays cached until it gets written out */
 	if (! (get_page_attr(bitmap, page) & BITMAP_PAGE_DIRTY))
-		page_cache_get(page);
+		get_page(page);
 
  	/* set the bit */
 	kaddr = kmap_atomic(page, KM_USER0);
@@ -938,7 +938,7 @@ static int bitmap_init_from_disk(struct 
 				if (ret) {
 					kunmap(page);
 					/* release, page not in filemap yet */
-					page_cache_release(page);
+					put_page(page);
 					goto out;
 				}
 			}
@@ -1043,7 +1043,7 @@ int bitmap_daemon_work(struct bitmap *bi
 			/* skip this page unless it's marked as needing cleaning */
 			if (!((attr=get_page_attr(bitmap, page)) & BITMAP_PAGE_CLEAN)) {
 				if (attr & BITMAP_PAGE_NEEDWRITE) {
-					page_cache_get(page);
+					get_page(page);
 					clear_page_attr(bitmap, page, BITMAP_PAGE_NEEDWRITE);
 				}
 				spin_unlock_irqrestore(&bitmap->lock, flags);
@@ -1057,13 +1057,13 @@ int bitmap_daemon_work(struct bitmap *bi
 					default:
 						bitmap_file_kick(bitmap);
 					}
-					page_cache_release(page);
+					put_page(page);
 				}
 				continue;
 			}
 
 			/* grab the new page, sync and release the old */
-			page_cache_get(page);
+			get_page(page);
 			if (lastpage != NULL) {
 				if (get_page_attr(bitmap, lastpage) & BITMAP_PAGE_NEEDWRITE) {
 					clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
@@ -1078,7 +1078,7 @@ int bitmap_daemon_work(struct bitmap *bi
 					spin_unlock_irqrestore(&bitmap->lock, flags);
 				}
 				kunmap(lastpage);
-				page_cache_release(lastpage);
+				put_page(lastpage);
 				if (err)
 					bitmap_file_kick(bitmap);
 			} else
@@ -1133,7 +1133,7 @@ int bitmap_daemon_work(struct bitmap *bi
 			spin_unlock_irqrestore(&bitmap->lock, flags);
 		}
 
-		page_cache_release(lastpage);
+		put_page(lastpage);
 	}
 
 	return err;
@@ -1184,7 +1184,7 @@ static void bitmap_writeback_daemon(mdde
 		PRINTK("finished page writeback: %p\n", page);
 
 		err = PageError(page);
-		page_cache_release(page);
+		put_page(page);
 		if (err) {
 			printk(KERN_WARNING "%s: bitmap file writeback "
 			       "failed (page %lu): %d\n",

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-12-01 13:57:01.000000000 +1100
+++ ./drivers/md/md.c	2005-12-01 13:57:54.000000000 +1100
@@ -339,7 +339,7 @@ static int alloc_disk_sb(mdk_rdev_t * rd
 static void free_disk_sb(mdk_rdev_t * rdev)
 {
 	if (rdev->sb_page) {
-		page_cache_release(rdev->sb_page);
+		put_page(rdev->sb_page);
 		rdev->sb_loaded = 0;
 		rdev->sb_page = NULL;
 		rdev->sb_offset = 0;

diff ./drivers/md/raid0.c~current~ ./drivers/md/raid0.c
--- ./drivers/md/raid0.c~current~	2005-12-01 13:56:49.000000000 +1100
+++ ./drivers/md/raid0.c	2005-12-01 13:57:54.000000000 +1100
@@ -361,7 +361,7 @@ static int raid0_run (mddev_t *mddev)
 	 * chunksize should be used in that case.
 	 */
 	{
-		int stripe = mddev->raid_disks * mddev->chunk_size / PAGE_CACHE_SIZE;
+		int stripe = mddev->raid_disks * mddev->chunk_size / PAGE_SIZE;
 		if (mddev->queue->backing_dev_info.ra_pages < 2* stripe)
 			mddev->queue->backing_dev_info.ra_pages = 2* stripe;
 	}

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-12-01 13:56:50.000000000 +1100
+++ ./drivers/md/raid1.c	2005-12-01 13:57:54.000000000 +1100
@@ -139,7 +139,7 @@ static void * r1buf_pool_alloc(gfp_t gfp
 out_free_pages:
 	for (i=0; i < RESYNC_PAGES ; i++)
 		for (j=0 ; j < pi->raid_disks; j++)
-			__free_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
+			put_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
 	j = -1;
 out_free_bio:
 	while ( ++j < pi->raid_disks )
@@ -159,7 +159,7 @@ static void r1buf_pool_free(void *__r1_b
 			if (j == 0 ||
 			    r1bio->bios[j]->bi_io_vec[i].bv_page !=
 			    r1bio->bios[0]->bi_io_vec[i].bv_page)
-				__free_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
+				put_page(r1bio->bios[j]->bi_io_vec[i].bv_page);
 		}
 	for (i=0 ; i < pi->raid_disks; i++)
 		bio_put(r1bio->bios[i]);
@@ -386,7 +386,7 @@ static int raid1_end_write_request(struc
 /* FIXME bio has been freed!!! */
 			int i = bio->bi_vcnt;
 			while (i--)
-				__free_page(bio->bi_io_vec[i].bv_page);
+				put_page(bio->bi_io_vec[i].bv_page);
 		}
 		/* clear the bitmap if all writes complete successfully */
 		bitmap_endwrite(r1_bio->mddev->bitmap, r1_bio->sector,
@@ -732,7 +732,7 @@ static struct page **alloc_behind_pages(
 do_sync_io:
 	if (pages)
 		for (i = 0; i < bio->bi_vcnt && pages[i]; i++)
-			__free_page(pages[i]);
+			put_page(pages[i]);
 	kfree(pages);
 	PRINTK("%dB behind alloc failed, doing sync I/O\n", bio->bi_size);
 	return NULL;
@@ -1892,7 +1892,7 @@ out_free_conf:
 		if (conf->r1bio_pool)
 			mempool_destroy(conf->r1bio_pool);
 		kfree(conf->mirrors);
-		__free_page(conf->tmppage);
+		put_page(conf->tmppage);
 		kfree(conf->poolinfo);
 		kfree(conf);
 		mddev->private = NULL;

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-12-01 13:56:50.000000000 +1100
+++ ./drivers/md/raid10.c	2005-12-01 13:57:55.000000000 +1100
@@ -134,10 +134,10 @@ static void * r10buf_pool_alloc(gfp_t gf
 
 out_free_pages:
 	for ( ; i > 0 ; i--)
-		__free_page(bio->bi_io_vec[i-1].bv_page);
+		put_page(bio->bi_io_vec[i-1].bv_page);
 	while (j--)
 		for (i = 0; i < RESYNC_PAGES ; i++)
-			__free_page(r10_bio->devs[j].bio->bi_io_vec[i].bv_page);
+			put_page(r10_bio->devs[j].bio->bi_io_vec[i].bv_page);
 	j = -1;
 out_free_bio:
 	while ( ++j < nalloc )
@@ -157,7 +157,7 @@ static void r10buf_pool_free(void *__r10
 		struct bio *bio = r10bio->devs[j].bio;
 		if (bio) {
 			for (i = 0; i < RESYNC_PAGES; i++) {
-				__free_page(bio->bi_io_vec[i].bv_page);
+				put_page(bio->bi_io_vec[i].bv_page);
 				bio->bi_io_vec[i].bv_page = NULL;
 			}
 			bio_put(bio);
@@ -2015,7 +2015,7 @@ static int run(mddev_t *mddev)
 	 * maybe...
 	 */
 	{
-		int stripe = conf->raid_disks * mddev->chunk_size / PAGE_CACHE_SIZE;
+		int stripe = conf->raid_disks * mddev->chunk_size / PAGE_SIZE;
 		stripe /= conf->near_copies;
 		if (mddev->queue->backing_dev_info.ra_pages < 2* stripe)
 			mddev->queue->backing_dev_info.ra_pages = 2* stripe;

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2005-12-01 13:56:50.000000000 +1100
+++ ./drivers/md/raid5.c	2005-12-01 13:57:55.000000000 +1100
@@ -167,7 +167,7 @@ static void shrink_buffers(struct stripe
 		if (!p)
 			continue;
 		sh->dev[i].page = NULL;
-		page_cache_release(p);
+		put_page(p);
 	}
 }
 
@@ -1955,7 +1955,7 @@ memory = conf->max_nr_stripes * (sizeof(
 	 */
 	{
 		int stripe = (mddev->raid_disks-1) * mddev->chunk_size
-			/ PAGE_CACHE_SIZE;
+			/ PAGE_SIZE;
 		if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe)
 			mddev->queue->backing_dev_info.ra_pages = 2 * stripe;
 	}

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2005-12-01 13:56:50.000000000 +1100
+++ ./drivers/md/raid6main.c	2005-12-01 13:57:55.000000000 +1100
@@ -186,7 +186,7 @@ static void shrink_buffers(struct stripe
 		if (!p)
 			continue;
 		sh->dev[i].page = NULL;
-		page_cache_release(p);
+		put_page(p);
 	}
 }
 
@@ -2069,7 +2069,7 @@ static int run(mddev_t *mddev)
 	 */
 	{
 		int stripe = (mddev->raid_disks-2) * mddev->chunk_size
-			/ PAGE_CACHE_SIZE;
+			/ PAGE_SIZE;
 		if (mddev->queue->backing_dev_info.ra_pages < 2 * stripe)
 			mddev->queue->backing_dev_info.ra_pages = 2 * stripe;
 	}
@@ -2084,7 +2084,7 @@ abort:
 	if (conf) {
 		print_raid6_conf(conf);
 		if (conf->spare_page)
-			page_cache_release(conf->spare_page);
+			put_page(conf->spare_page);
 		if (conf->stripe_hashtbl)
 			free_pages((unsigned long) conf->stripe_hashtbl,
 							HASH_PAGES_ORDER);


* [PATCH md 008 of 14] Convert md to use kzalloc throughout
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (6 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 007 of 14] Clean up 'page' related names in md NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01 22:42   ` Andrew Morton
  2005-12-01  3:23 ` [PATCH md 009 of 14] Tidy up raid5/6 hash table code NeilBrown
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


Replace multiple kmalloc/memset pairs with kzalloc calls.
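The transformation is purely mechanical: `kzalloc(size, flags)` is equivalent to `kmalloc(size, flags)` followed by `memset(ptr, 0, size)` on success. A userspace model of the before/after pair (mock names, calloc standing in for the kernel allocator):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef int gfp_t;		/* stand-in for the kernel's gfp_t */

/* The new pattern: allocate and zero in one step. */
void *mock_kzalloc(size_t size, gfp_t flags)
{
	(void)flags;		/* no GFP semantics in userspace */
	return calloc(1, size);
}

/* The old pattern this patch removes, for comparison. */
void *mock_kmalloc_memset(size_t size, gfp_t flags)
{
	void *p = malloc(size);

	(void)flags;
	if (p)
		memset(p, 0, size);
	return p;
}
```

Besides being shorter, the combined call avoids the easy-to-make bug of memset'ing the wrong size (note the raid0 hunk below, where the old memset of `conf->devlist` used a different size expression than its kmalloc).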

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c    |   11 +++--------
 ./drivers/md/linear.c    |    3 +--
 ./drivers/md/md.c        |   10 +++-------
 ./drivers/md/multipath.c |   10 +++-------
 ./drivers/md/raid0.c     |    9 ++-------
 ./drivers/md/raid1.c     |   20 ++++++--------------
 ./drivers/md/raid10.c    |    6 ++----
 ./drivers/md/raid5.c     |    8 ++++----
 8 files changed, 24 insertions(+), 53 deletions(-)

diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
--- ./drivers/md/bitmap.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/bitmap.c	2005-12-01 13:59:53.000000000 +1100
@@ -887,12 +887,10 @@ static int bitmap_init_from_disk(struct 
 	if (!bitmap->filemap)
 		goto out;
 
-	bitmap->filemap_attr = kmalloc(sizeof(long) * num_pages, GFP_KERNEL);
+	bitmap->filemap_attr = kzalloc(sizeof(long) * num_pages, GFP_KERNEL);
 	if (!bitmap->filemap_attr)
 		goto out;
 
-	memset(bitmap->filemap_attr, 0, sizeof(long) * num_pages);
-
 	oldindex = ~0L;
 
 	for (i = 0; i < chunks; i++) {
@@ -1557,12 +1555,10 @@ int bitmap_create(mddev_t *mddev)
 
 	BUG_ON(file && mddev->bitmap_offset);
 
-	bitmap = kmalloc(sizeof(*bitmap), GFP_KERNEL);
+	bitmap = kzalloc(sizeof(*bitmap), GFP_KERNEL);
 	if (!bitmap)
 		return -ENOMEM;
 
-	memset(bitmap, 0, sizeof(*bitmap));
-
 	spin_lock_init(&bitmap->lock);
 	bitmap->mddev = mddev;
 
@@ -1603,12 +1599,11 @@ int bitmap_create(mddev_t *mddev)
 #ifdef INJECT_FATAL_FAULT_1
 	bitmap->bp = NULL;
 #else
-	bitmap->bp = kmalloc(pages * sizeof(*bitmap->bp), GFP_KERNEL);
+	bitmap->bp = kzalloc(pages * sizeof(*bitmap->bp), GFP_KERNEL);
 #endif
 	err = -ENOMEM;
 	if (!bitmap->bp)
 		goto error;
-	memset(bitmap->bp, 0, pages * sizeof(*bitmap->bp));
 
 	bitmap->flags |= BITMAP_ACTIVE;
 

diff ./drivers/md/linear.c~current~ ./drivers/md/linear.c
--- ./drivers/md/linear.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/linear.c	2005-12-01 13:59:53.000000000 +1100
@@ -121,11 +121,10 @@ static int linear_run (mddev_t *mddev)
 	sector_t curr_offset;
 	struct list_head *tmp;
 
-	conf = kmalloc (sizeof (*conf) + mddev->raid_disks*sizeof(dev_info_t),
+	conf = kzalloc (sizeof (*conf) + mddev->raid_disks*sizeof(dev_info_t),
 			GFP_KERNEL);
 	if (!conf)
 		goto out;
-	memset(conf, 0, sizeof(*conf) + mddev->raid_disks*sizeof(dev_info_t));
 	mddev->private = conf;
 
 	cnt = 0;

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/md.c	2005-12-01 13:59:53.000000000 +1100
@@ -228,12 +228,10 @@ static mddev_t * mddev_find(dev_t unit)
 	}
 	spin_unlock(&all_mddevs_lock);
 
-	new = (mddev_t *) kmalloc(sizeof(*new), GFP_KERNEL);
+	new = (mddev_t *) kzalloc(sizeof(*new), GFP_KERNEL);
 	if (!new)
 		return NULL;
 
-	memset(new, 0, sizeof(*new));
-
 	new->unit = unit;
 	if (MAJOR(unit) == MD_MAJOR)
 		new->md_minor = MINOR(unit);
@@ -1620,12 +1618,11 @@ static mdk_rdev_t *md_import_device(dev_
 	mdk_rdev_t *rdev;
 	sector_t size;
 
-	rdev = (mdk_rdev_t *) kmalloc(sizeof(*rdev), GFP_KERNEL);
+	rdev = (mdk_rdev_t *) kzalloc(sizeof(*rdev), GFP_KERNEL);
 	if (!rdev) {
 		printk(KERN_ERR "md: could not alloc mem for new device!\n");
 		return ERR_PTR(-ENOMEM);
 	}
-	memset(rdev, 0, sizeof(*rdev));
 
 	if ((err = alloc_disk_sb(rdev)))
 		goto abort_free;
@@ -3497,11 +3494,10 @@ mdk_thread_t *md_register_thread(void (*
 {
 	mdk_thread_t *thread;
 
-	thread = kmalloc(sizeof(mdk_thread_t), GFP_KERNEL);
+	thread = kzalloc(sizeof(mdk_thread_t), GFP_KERNEL);
 	if (!thread)
 		return NULL;
 
-	memset(thread, 0, sizeof(mdk_thread_t));
 	init_waitqueue_head(&thread->wqueue);
 
 	thread->run = run;

diff ./drivers/md/multipath.c~current~ ./drivers/md/multipath.c
--- ./drivers/md/multipath.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/multipath.c	2005-12-01 13:59:53.000000000 +1100
@@ -41,9 +41,7 @@ static mdk_personality_t multipath_perso
 static void *mp_pool_alloc(gfp_t gfp_flags, void *data)
 {
 	struct multipath_bh *mpb;
-	mpb = kmalloc(sizeof(*mpb), gfp_flags);
-	if (mpb) 
-		memset(mpb, 0, sizeof(*mpb));
+	mpb = kzalloc(sizeof(*mpb), gfp_flags);
 	return mpb;
 }
 
@@ -444,7 +442,7 @@ static int multipath_run (mddev_t *mddev
 	 * should be freed in multipath_stop()]
 	 */
 
-	conf = kmalloc(sizeof(multipath_conf_t), GFP_KERNEL);
+	conf = kzalloc(sizeof(multipath_conf_t), GFP_KERNEL);
 	mddev->private = conf;
 	if (!conf) {
 		printk(KERN_ERR 
@@ -452,9 +450,8 @@ static int multipath_run (mddev_t *mddev
 			mdname(mddev));
 		goto out;
 	}
-	memset(conf, 0, sizeof(*conf));
 
-	conf->multipaths = kmalloc(sizeof(struct multipath_info)*mddev->raid_disks,
+	conf->multipaths = kzalloc(sizeof(struct multipath_info)*mddev->raid_disks,
 				   GFP_KERNEL);
 	if (!conf->multipaths) {
 		printk(KERN_ERR 
@@ -462,7 +459,6 @@ static int multipath_run (mddev_t *mddev
 			mdname(mddev));
 		goto out_free_conf;
 	}
-	memset(conf->multipaths, 0, sizeof(struct multipath_info)*mddev->raid_disks);
 
 	conf->working_disks = 0;
 	ITERATE_RDEV(mddev,rdev,tmp) {

diff ./drivers/md/raid0.c~current~ ./drivers/md/raid0.c
--- ./drivers/md/raid0.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/raid0.c	2005-12-01 13:59:53.000000000 +1100
@@ -113,21 +113,16 @@ static int create_strip_zones (mddev_t *
 	}
 	printk("raid0: FINAL %d zones\n", conf->nr_strip_zones);
 
-	conf->strip_zone = kmalloc(sizeof(struct strip_zone)*
+	conf->strip_zone = kzalloc(sizeof(struct strip_zone)*
 				conf->nr_strip_zones, GFP_KERNEL);
 	if (!conf->strip_zone)
 		return 1;
-	conf->devlist = kmalloc(sizeof(mdk_rdev_t*)*
+	conf->devlist = kzalloc(sizeof(mdk_rdev_t*)*
 				conf->nr_strip_zones*mddev->raid_disks,
 				GFP_KERNEL);
 	if (!conf->devlist)
 		return 1;
 
-	memset(conf->strip_zone, 0,sizeof(struct strip_zone)*
-				   conf->nr_strip_zones);
-	memset(conf->devlist, 0,
-	       sizeof(mdk_rdev_t*) * conf->nr_strip_zones * mddev->raid_disks);
-
 	/* The first zone must contain all devices, so here we check that
 	 * there is a proper alignment of slots to devices and find them all
 	 */

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/raid1.c	2005-12-01 13:59:53.000000000 +1100
@@ -61,10 +61,8 @@ static void * r1bio_pool_alloc(gfp_t gfp
 	int size = offsetof(r1bio_t, bios[pi->raid_disks]);
 
 	/* allocate a r1bio with room for raid_disks entries in the bios array */
-	r1_bio = kmalloc(size, gfp_flags);
-	if (r1_bio)
-		memset(r1_bio, 0, size);
-	else
+	r1_bio = kzalloc(size, gfp_flags);
+	if (!r1_bio)
 		unplug_slaves(pi->mddev);
 
 	return r1_bio;
@@ -710,13 +708,11 @@ static struct page **alloc_behind_pages(
 {
 	int i;
 	struct bio_vec *bvec;
-	struct page **pages = kmalloc(bio->bi_vcnt * sizeof(struct page *),
+	struct page **pages = kzalloc(bio->bi_vcnt * sizeof(struct page *),
 					GFP_NOIO);
 	if (unlikely(!pages))
 		goto do_sync_io;
 
-	memset(pages, 0, bio->bi_vcnt * sizeof(struct page *));
-
 	bio_for_each_segment(bvec, bio, i) {
 		pages[i] = alloc_page(GFP_NOIO);
 		if (unlikely(!pages[i]))
@@ -1769,19 +1765,16 @@ static int run(mddev_t *mddev)
 	 * bookkeeping area. [whatever we allocate in run(),
 	 * should be freed in stop()]
 	 */
-	conf = kmalloc(sizeof(conf_t), GFP_KERNEL);
+	conf = kzalloc(sizeof(conf_t), GFP_KERNEL);
 	mddev->private = conf;
 	if (!conf)
 		goto out_no_mem;
 
-	memset(conf, 0, sizeof(*conf));
-	conf->mirrors = kmalloc(sizeof(struct mirror_info)*mddev->raid_disks, 
+	conf->mirrors = kzalloc(sizeof(struct mirror_info)*mddev->raid_disks,
 				 GFP_KERNEL);
 	if (!conf->mirrors)
 		goto out_no_mem;
 
-	memset(conf->mirrors, 0, sizeof(struct mirror_info)*mddev->raid_disks);
-
 	conf->tmppage = alloc_page(GFP_KERNEL);
 	if (!conf->tmppage)
 		goto out_no_mem;
@@ -1991,13 +1984,12 @@ static int raid1_reshape(mddev_t *mddev,
 		kfree(newpoolinfo);
 		return -ENOMEM;
 	}
-	newmirrors = kmalloc(sizeof(struct mirror_info) * raid_disks, GFP_KERNEL);
+	newmirrors = kzalloc(sizeof(struct mirror_info) * raid_disks, GFP_KERNEL);
 	if (!newmirrors) {
 		kfree(newpoolinfo);
 		mempool_destroy(newpool);
 		return -ENOMEM;
 	}
-	memset(newmirrors, 0, sizeof(struct mirror_info)*raid_disks);
 
 	raise_barrier(conf);
 

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/raid10.c	2005-12-01 13:59:53.000000000 +1100
@@ -59,10 +59,8 @@ static void * r10bio_pool_alloc(gfp_t gf
 	int size = offsetof(struct r10bio_s, devs[conf->copies]);
 
 	/* allocate a r10bio with room for raid_disks entries in the bios array */
-	r10_bio = kmalloc(size, gfp_flags);
-	if (r10_bio)
-		memset(r10_bio, 0, size);
-	else
+	r10_bio = kzalloc(size, gfp_flags);
+	if (!r10_bio)
 		unplug_slaves(conf->mddev);
 
 	return r10_bio;

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2005-12-01 14:01:30.000000000 +1100
+++ ./drivers/md/raid5.c	2005-12-01 14:01:46.000000000 +1100
@@ -1826,12 +1826,12 @@ static int run(mddev_t *mddev)
 		return -EIO;
 	}
 
-	mddev->private = kmalloc (sizeof (raid5_conf_t)
-				  + mddev->raid_disks * sizeof(struct disk_info),
-				  GFP_KERNEL);
+	mddev->private = kzalloc(sizeof (raid5_conf_t)
+				 + mddev->raid_disks * sizeof(struct disk_info),
+				 GFP_KERNEL);
 	if ((conf = mddev->private) == NULL)
 		goto abort;
-	memset (conf, 0, sizeof (*conf) + mddev->raid_disks * sizeof(struct disk_info) );
+
 	conf->mddev = mddev;
 
 	if ((conf->stripe_hashtbl = (struct stripe_head **) __get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)


* [PATCH md 009 of 14] Tidy up raid5/6 hash table code.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (7 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 008 of 14] Convert md to use kzalloc throughout NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic NeilBrown
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


- replace open-coded hash chain with hlist macros
- Fix hash-table size at one page - it is already quite
  generous, so there will never be a need to use multiple
  pages, so no need for __get_free_pages

No functional change.
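The open-coded `hash_next`/`hash_pprev` pair being removed is exactly the shape of the kernel's hlist: a singly linked chain whose nodes carry a back-pointer to the previous `next` slot, so deletion needs no head pointer. A minimal userspace re-implementation of the two operations used here (illustrative, not the kernel headers):

```c
#include <assert.h>
#include <stddef.h>

struct hlist_head { struct hlist_node *first; };
struct hlist_node { struct hlist_node *next, **pprev; };

/* cf. insert_hash(): push onto the front of the chain. */
static void hlist_add_head(struct hlist_node *n, struct hlist_head *h)
{
	n->next = h->first;
	if (h->first)
		h->first->pprev = &n->next;
	h->first = n;
	n->pprev = &h->first;
}

/* cf. remove_hash(): unlink if hashed, leaving the node reusable.
 * pprev == NULL plays the role the old code's hash_pprev check played. */
static void hlist_del_init(struct hlist_node *n)
{
	if (n->pprev) {
		*n->pprev = n->next;
		if (n->next)
			n->next->pprev = n->pprev;
		n->next = NULL;
		n->pprev = NULL;
	}
}
```

With `pprev` pointing at the previous node's `next` field (or at `head->first`), deletion is O(1) and identical whether the node is first in the bucket or not — which is why the old four-line `remove_hash` collapses to one `hlist_del_init` call.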

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid5.c         |   40 +++++++++++++------------------------
 ./drivers/md/raid6main.c     |   46 +++++++++++++++----------------------------
 ./include/linux/raid/raid5.h |    4 +--
 3 files changed, 33 insertions(+), 57 deletions(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2005-12-01 14:01:46.000000000 +1100
+++ ./drivers/md/raid5.c	2005-12-01 14:01:59.000000000 +1100
@@ -35,12 +35,10 @@
 #define STRIPE_SHIFT		(PAGE_SHIFT - 9)
 #define STRIPE_SECTORS		(STRIPE_SIZE>>9)
 #define	IO_THRESHOLD		1
-#define HASH_PAGES		1
-#define HASH_PAGES_ORDER	0
-#define NR_HASH			(HASH_PAGES * PAGE_SIZE / sizeof(struct stripe_head *))
+#define NR_HASH			(PAGE_SIZE / sizeof(struct hlist_head))
 #define HASH_MASK		(NR_HASH - 1)
 
-#define stripe_hash(conf, sect)	((conf)->stripe_hashtbl[((sect) >> STRIPE_SHIFT) & HASH_MASK])
+#define stripe_hash(conf, sect)	(&((conf)->stripe_hashtbl[((sect) >> STRIPE_SHIFT) & HASH_MASK]))
 
 /* bio's attached to a stripe+device for I/O are linked together in bi_sector
  * order without overlap.  There may be several bio's per stripe+device, and
@@ -113,29 +111,21 @@ static void release_stripe(struct stripe
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 }
 
-static void remove_hash(struct stripe_head *sh)
+static inline void remove_hash(struct stripe_head *sh)
 {
 	PRINTK("remove_hash(), stripe %llu\n", (unsigned long long)sh->sector);
 
-	if (sh->hash_pprev) {
-		if (sh->hash_next)
-			sh->hash_next->hash_pprev = sh->hash_pprev;
-		*sh->hash_pprev = sh->hash_next;
-		sh->hash_pprev = NULL;
-	}
+	hlist_del_init(&sh->hash);
 }
 
-static __inline__ void insert_hash(raid5_conf_t *conf, struct stripe_head *sh)
+static inline void insert_hash(raid5_conf_t *conf, struct stripe_head *sh)
 {
-	struct stripe_head **shp = &stripe_hash(conf, sh->sector);
+	struct hlist_head *hp = stripe_hash(conf, sh->sector);
 
 	PRINTK("insert_hash(), stripe %llu\n", (unsigned long long)sh->sector);
 
 	CHECK_DEVLOCK();
-	if ((sh->hash_next = *shp) != NULL)
-		(*shp)->hash_pprev = &sh->hash_next;
-	*shp = sh;
-	sh->hash_pprev = shp;
+	hlist_add_head(&sh->hash, hp);
 }
 
 
@@ -228,10 +218,11 @@ static inline void init_stripe(struct st
 static struct stripe_head *__find_stripe(raid5_conf_t *conf, sector_t sector)
 {
 	struct stripe_head *sh;
+	struct hlist_node *hn;
 
 	CHECK_DEVLOCK();
 	PRINTK("__find_stripe, sector %llu\n", (unsigned long long)sector);
-	for (sh = stripe_hash(conf, sector); sh; sh = sh->hash_next)
+	hlist_for_each_entry(sh, hn, stripe_hash(conf, sector), hash)
 		if (sh->sector == sector)
 			return sh;
 	PRINTK("__stripe %llu not in cache\n", (unsigned long long)sector);
@@ -1834,9 +1825,8 @@ static int run(mddev_t *mddev)
 
 	conf->mddev = mddev;
 
-	if ((conf->stripe_hashtbl = (struct stripe_head **) __get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
+	if ((conf->stripe_hashtbl = kzalloc(PAGE_SIZE, GFP_KERNEL)) == NULL)
 		goto abort;
-	memset(conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
 
 	spin_lock_init(&conf->device_lock);
 	init_waitqueue_head(&conf->wait_for_stripe);
@@ -1971,9 +1961,7 @@ memory = conf->max_nr_stripes * (sizeof(
 abort:
 	if (conf) {
 		print_raid5_conf(conf);
-		if (conf->stripe_hashtbl)
-			free_pages((unsigned long) conf->stripe_hashtbl,
-							HASH_PAGES_ORDER);
+		kfree(conf->stripe_hashtbl);
 		kfree(conf);
 	}
 	mddev->private = NULL;
@@ -1990,7 +1978,7 @@ static int stop(mddev_t *mddev)
 	md_unregister_thread(mddev->thread);
 	mddev->thread = NULL;
 	shrink_stripes(conf);
-	free_pages((unsigned long) conf->stripe_hashtbl, HASH_PAGES_ORDER);
+	kfree(conf->stripe_hashtbl);
 	blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/
 	sysfs_remove_group(&mddev->kobj, &raid5_attrs_group);
 	kfree(conf);
@@ -2018,12 +2006,12 @@ static void print_sh (struct stripe_head
 static void printall (raid5_conf_t *conf)
 {
 	struct stripe_head *sh;
+	struct hlist_node *hn;
 	int i;
 
 	spin_lock_irq(&conf->device_lock);
 	for (i = 0; i < NR_HASH; i++) {
-		sh = conf->stripe_hashtbl[i];
-		for (; sh; sh = sh->hash_next) {
+		hlist_for_each_entry(sh, hn, &conf->stripe_hashtbl[i], hash) {
 			if (sh->raid_conf != conf)
 				continue;
 			print_sh(sh);

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./drivers/md/raid6main.c	2005-12-01 14:01:58.000000000 +1100
@@ -40,12 +40,10 @@
 #define STRIPE_SHIFT		(PAGE_SHIFT - 9)
 #define STRIPE_SECTORS		(STRIPE_SIZE>>9)
 #define	IO_THRESHOLD		1
-#define HASH_PAGES		1
-#define HASH_PAGES_ORDER	0
-#define NR_HASH			(HASH_PAGES * PAGE_SIZE / sizeof(struct stripe_head *))
+#define NR_HASH			(PAGE_SIZE / sizeof(struct hlist_head))
 #define HASH_MASK		(NR_HASH - 1)
 
-#define stripe_hash(conf, sect)	((conf)->stripe_hashtbl[((sect) >> STRIPE_SHIFT) & HASH_MASK])
+#define stripe_hash(conf, sect)	(&((conf)->stripe_hashtbl[((sect) >> STRIPE_SHIFT) & HASH_MASK]))
 
 /* bio's attached to a stripe+device for I/O are linked together in bi_sector
  * order without overlap.  There may be several bio's per stripe+device, and
@@ -132,29 +130,21 @@ static void release_stripe(struct stripe
 	spin_unlock_irqrestore(&conf->device_lock, flags);
 }
 
-static void remove_hash(struct stripe_head *sh)
+static inline void remove_hash(struct stripe_head *sh)
 {
 	PRINTK("remove_hash(), stripe %llu\n", (unsigned long long)sh->sector);
 
-	if (sh->hash_pprev) {
-		if (sh->hash_next)
-			sh->hash_next->hash_pprev = sh->hash_pprev;
-		*sh->hash_pprev = sh->hash_next;
-		sh->hash_pprev = NULL;
-	}
+	hlist_del_init(&sh->hash);
 }
 
-static __inline__ void insert_hash(raid6_conf_t *conf, struct stripe_head *sh)
+static inline void insert_hash(raid6_conf_t *conf, struct stripe_head *sh)
 {
-	struct stripe_head **shp = &stripe_hash(conf, sh->sector);
+	struct hlist_head *hp = stripe_hash(conf, sh->sector);
 
 	PRINTK("insert_hash(), stripe %llu\n", (unsigned long long)sh->sector);
 
 	CHECK_DEVLOCK();
-	if ((sh->hash_next = *shp) != NULL)
-		(*shp)->hash_pprev = &sh->hash_next;
-	*shp = sh;
-	sh->hash_pprev = shp;
+	hlist_add_head(&sh->hash, hp);
 }
 
 
@@ -247,10 +237,11 @@ static inline void init_stripe(struct st
 static struct stripe_head *__find_stripe(raid6_conf_t *conf, sector_t sector)
 {
 	struct stripe_head *sh;
+	struct hlist_node *hn;
 
 	CHECK_DEVLOCK();
 	PRINTK("__find_stripe, sector %llu\n", (unsigned long long)sector);
-	for (sh = stripe_hash(conf, sector); sh; sh = sh->hash_next)
+	hlist_for_each_entry(sh, hn, stripe_hash(conf, sector), hash)
 		if (sh->sector == sector)
 			return sh;
 	PRINTK("__stripe %llu not in cache\n", (unsigned long long)sector);
@@ -1931,17 +1922,15 @@ static int run(mddev_t *mddev)
 		return -EIO;
 	}
 
-	mddev->private = kmalloc (sizeof (raid6_conf_t)
-				  + mddev->raid_disks * sizeof(struct disk_info),
-				  GFP_KERNEL);
+	mddev->private = kzalloc(sizeof (raid6_conf_t)
+				 + mddev->raid_disks * sizeof(struct disk_info),
+				 GFP_KERNEL);
 	if ((conf = mddev->private) == NULL)
 		goto abort;
-	memset (conf, 0, sizeof (*conf) + mddev->raid_disks * sizeof(struct disk_info) );
 	conf->mddev = mddev;
 
-	if ((conf->stripe_hashtbl = (struct stripe_head **) __get_free_pages(GFP_ATOMIC, HASH_PAGES_ORDER)) == NULL)
+	if ((conf->stripe_hashtbl = kzalloc(PAGE_SIZE, GFP_KERNEL)) == NULL)
 		goto abort;
-	memset(conf->stripe_hashtbl, 0, HASH_PAGES * PAGE_SIZE);
 
 	conf->spare_page = alloc_page(GFP_KERNEL);
 	if (!conf->spare_page)
@@ -2085,9 +2074,7 @@ abort:
 		print_raid6_conf(conf);
 		if (conf->spare_page)
 			put_page(conf->spare_page);
-		if (conf->stripe_hashtbl)
-			free_pages((unsigned long) conf->stripe_hashtbl,
-							HASH_PAGES_ORDER);
+		kfree(conf->stripe_hashtbl);
 		kfree(conf);
 	}
 	mddev->private = NULL;
@@ -2104,7 +2091,7 @@ static int stop (mddev_t *mddev)
 	md_unregister_thread(mddev->thread);
 	mddev->thread = NULL;
 	shrink_stripes(conf);
-	free_pages((unsigned long) conf->stripe_hashtbl, HASH_PAGES_ORDER);
+	kfree(conf->stripe_hashtbl);
 	blk_sync_queue(mddev->queue); /* the unplug fn references 'conf'*/
 	kfree(conf);
 	mddev->private = NULL;
@@ -2131,12 +2118,12 @@ static void print_sh (struct seq_file *s
 static void printall (struct seq_file *seq, raid6_conf_t *conf)
 {
 	struct stripe_head *sh;
+	struct hlist_node *hn;
 	int i;
 
 	spin_lock_irq(&conf->device_lock);
 	for (i = 0; i < NR_HASH; i++) {
-		sh = conf->stripe_hashtbl[i];
-		for (; sh; sh = sh->hash_next) {
+		hlist_for_each_entry(sh, hn, &conf->stripe_hashtbl[i], hash) {
 			if (sh->raid_conf != conf)
 				continue;
 			print_sh(seq, sh);

diff ./include/linux/raid/raid5.h~current~ ./include/linux/raid/raid5.h
--- ./include/linux/raid/raid5.h~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./include/linux/raid/raid5.h	2005-12-01 14:01:58.000000000 +1100
@@ -126,7 +126,7 @@
  */
 
 struct stripe_head {
-	struct stripe_head	*hash_next, **hash_pprev; /* hash pointers */
+	struct hlist_node	hash;
 	struct list_head	lru;			/* inactive_list or handle_list */
 	struct raid5_private_data	*raid_conf;
 	sector_t		sector;			/* sector of this row */
@@ -204,7 +204,7 @@ struct disk_info {
 };
 
 struct raid5_private_data {
-	struct stripe_head	**stripe_hashtbl;
+	struct hlist_head	*stripe_hashtbl;
 	mddev_t			*mddev;
 	struct disk_info	*spare;
 	int			chunk_size, level, algorithm;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (8 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 009 of 14] Tidy up raid5/6 hash table code NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01 22:46   ` Andrew Morton
  2005-12-01  3:23 ` [PATCH md 011 of 14] Convert recently exported symbol to GPL NeilBrown
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid



Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/bitmap.c |   44 +++++++++++++++++++++-----------------------
 1 file changed, 21 insertions(+), 23 deletions(-)

diff ./drivers/md/bitmap.c~current~ ./drivers/md/bitmap.c
--- ./drivers/md/bitmap.c~current~	2005-12-01 14:02:54.000000000 +1100
+++ ./drivers/md/bitmap.c	2005-12-01 14:02:50.000000000 +1100
@@ -406,11 +406,11 @@ int bitmap_update_sb(struct bitmap *bitm
 		return 0;
 	}
 	spin_unlock_irqrestore(&bitmap->lock, flags);
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	sb->events = cpu_to_le64(bitmap->mddev->events);
 	if (!bitmap->mddev->degraded)
 		sb->events_cleared = cpu_to_le64(bitmap->mddev->events);
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	return write_page(bitmap, bitmap->sb_page, 1);
 }
 
@@ -421,7 +421,7 @@ void bitmap_print_sb(struct bitmap *bitm
 
 	if (!bitmap || !bitmap->sb_page)
 		return;
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	printk(KERN_DEBUG "%s: bitmap file superblock:\n", bmname(bitmap));
 	printk(KERN_DEBUG "         magic: %08x\n", le32_to_cpu(sb->magic));
 	printk(KERN_DEBUG "       version: %d\n", le32_to_cpu(sb->version));
@@ -440,7 +440,7 @@ void bitmap_print_sb(struct bitmap *bitm
 	printk(KERN_DEBUG "     sync size: %llu KB\n",
 			(unsigned long long)le64_to_cpu(sb->sync_size)/2);
 	printk(KERN_DEBUG "max write behind: %d\n", le32_to_cpu(sb->write_behind));
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 }
 
 /* read the superblock from the bitmap file and initialize some bitmap fields */
@@ -466,7 +466,7 @@ static int bitmap_read_sb(struct bitmap 
 		return err;
 	}
 
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 
 	if (bytes_read < sizeof(*sb)) { /* short read */
 		printk(KERN_INFO "%s: bitmap file superblock truncated\n",
@@ -535,7 +535,7 @@ success:
 		bitmap->events_cleared = bitmap->mddev->events;
 	err = 0;
 out:
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	if (err)
 		bitmap_print_sb(bitmap);
 	return err;
@@ -560,7 +560,7 @@ static void bitmap_mask_state(struct bit
 	}
 	get_page(bitmap->sb_page);
 	spin_unlock_irqrestore(&bitmap->lock, flags);
-	sb = (bitmap_super_t *)kmap(bitmap->sb_page);
+	sb = (bitmap_super_t *)kmap_atomic(bitmap->sb_page, KM_USER0);
 	switch (op) {
 		case MASK_SET: sb->state |= bits;
 				break;
@@ -568,7 +568,7 @@ static void bitmap_mask_state(struct bit
 				break;
 		default: BUG();
 	}
-	kunmap(bitmap->sb_page);
+	kunmap_atomic(sb, KM_USER0);
 	put_page(bitmap->sb_page);
 }
 
@@ -854,6 +854,7 @@ static int bitmap_init_from_disk(struct 
 	unsigned long bytes, offset, dummy;
 	int outofdate;
 	int ret = -ENOSPC;
+	void *paddr;
 
 	chunks = bitmap->chunks;
 	file = bitmap->file;
@@ -899,8 +900,6 @@ static int bitmap_init_from_disk(struct 
 		bit = file_page_offset(i);
 		if (index != oldindex) { /* this is a new page, read it in */
 			/* unmap the old page, we're done with it */
-			if (oldpage != NULL)
-				kunmap(oldpage);
 			if (index == 0) {
 				/*
 				 * if we're here then the superblock page
@@ -923,18 +922,18 @@ static int bitmap_init_from_disk(struct 
 
 			oldindex = index;
 			oldpage = page;
-			kmap(page);
 
 			if (outofdate) {
 				/*
 				 * if bitmap is out of date, dirty the
 			 	 * whole page and write it out
 				 */
-				memset(page_address(page) + offset, 0xff,
+				paddr = kmap_atomic(page, KM_USER0);
+				memset(paddr + offset, 0xff,
 				       PAGE_SIZE - offset);
+				kunmap_atomic(paddr, KM_USER0);
 				ret = write_page(bitmap, page, 1);
 				if (ret) {
-					kunmap(page);
 					/* release, page not in filemap yet */
 					put_page(page);
 					goto out;
@@ -943,10 +942,12 @@ static int bitmap_init_from_disk(struct 
 
 			bitmap->filemap[bitmap->file_pages++] = page;
 		}
+		paddr = kmap_atomic(page, KM_USER0);
 		if (bitmap->flags & BITMAP_HOSTENDIAN)
-			b = test_bit(bit, page_address(page));
+			b = test_bit(bit, paddr);
 		else
-			b = ext2_test_bit(bit, page_address(page));
+			b = ext2_test_bit(bit, paddr);
+		kunmap_atomic(paddr, KM_USER0);
 		if (b) {
 			/* if the disk bit is set, set the memory bit */
 			bitmap_set_memory_bits(bitmap, i << CHUNK_BLOCK_SHIFT(bitmap),
@@ -961,9 +962,6 @@ static int bitmap_init_from_disk(struct 
 	ret = 0;
 	bitmap_mask_state(bitmap, BITMAP_STALE, MASK_UNSET);
 
-	if (page) /* unmap the last page */
-		kunmap(page);
-
 	if (bit_cnt) { /* Kick recovery if any bits were set */
 		set_bit(MD_RECOVERY_NEEDED, &bitmap->mddev->recovery);
 		md_wakeup_thread(bitmap->mddev->thread);
@@ -1019,6 +1017,7 @@ int bitmap_daemon_work(struct bitmap *bi
 	int err = 0;
 	int blocks;
 	int attr;
+	void *paddr;
 
 	if (bitmap == NULL)
 		return 0;
@@ -1075,14 +1074,12 @@ int bitmap_daemon_work(struct bitmap *bi
 					set_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);
 					spin_unlock_irqrestore(&bitmap->lock, flags);
 				}
-				kunmap(lastpage);
 				put_page(lastpage);
 				if (err)
 					bitmap_file_kick(bitmap);
 			} else
 				spin_unlock_irqrestore(&bitmap->lock, flags);
 			lastpage = page;
-			kmap(page);
 /*
 			printk("bitmap clean at page %lu\n", j);
 */
@@ -1105,10 +1102,12 @@ int bitmap_daemon_work(struct bitmap *bi
 						  -1);
 
 				/* clear the bit */
+				paddr = kmap_atomic(page, KM_USER0);
 				if (bitmap->flags & BITMAP_HOSTENDIAN)
-					clear_bit(file_page_offset(j), page_address(page));
+					clear_bit(file_page_offset(j), paddr);
 				else
-					ext2_clear_bit(file_page_offset(j), page_address(page));
+					ext2_clear_bit(file_page_offset(j), paddr);
+				kunmap_atomic(paddr, KM_USER0);
 			}
 		}
 		spin_unlock_irqrestore(&bitmap->lock, flags);
@@ -1116,7 +1115,6 @@ int bitmap_daemon_work(struct bitmap *bi
 
 	/* now sync the final page */
 	if (lastpage != NULL) {
-		kunmap(lastpage);
 		spin_lock_irqsave(&bitmap->lock, flags);
 		if (get_page_attr(bitmap, lastpage) &BITMAP_PAGE_NEEDWRITE) {
 			clear_page_attr(bitmap, lastpage, BITMAP_PAGE_NEEDWRITE);


* [PATCH md 011 of 14] Convert recently exported symbol to GPL
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (9 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 012 of 14] Break out of a loop that doesn't need to run to completion NeilBrown
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


...because that seems to be the preferred practice these days.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/md.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/md.c	2005-12-01 14:03:07.000000000 +1100
@@ -478,7 +478,7 @@ int sync_page_io(struct block_device *bd
 	bio_put(bio);
 	return ret;
 }
-EXPORT_SYMBOL(sync_page_io);
+EXPORT_SYMBOL_GPL(sync_page_io);
 
 static int read_disk_sb(mdk_rdev_t * rdev, int size)
 {


* [PATCH md 012 of 14] Break out of a loop that doesn't need to run to completion.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (10 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 011 of 14] Convert recently exported symbol to GPL NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:23 ` [PATCH md 013 of 14] Remove personality numbering from md NeilBrown
  2005-12-01  3:24 ` [PATCH md 014 of 14] Fix possible problem in raid1/raid10 error overwriting NeilBrown
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid



Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid10.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/raid10.c	2005-12-01 14:03:16.000000000 +1100
@@ -1672,8 +1672,10 @@ static sector_t sync_request(mddev_t *md
 				for (j=0; j<conf->copies;j++) {
 					int d = r10_bio->devs[j].devnum;
 					if (conf->mirrors[d].rdev == NULL ||
-					    test_bit(Faulty, &conf->mirrors[d].rdev->flags))
+					    test_bit(Faulty, &conf->mirrors[d].rdev->flags)) {
 						still_degraded = 1;
+						break;
+					}
 				}
 				must_sync = bitmap_start_sync(mddev->bitmap, sect,
 							      &sync_blocks, still_degraded);


* [PATCH md 013 of 14] Remove personality numbering from md.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (11 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 012 of 14] Break out of a loop that doesn't need to run to completion NeilBrown
@ 2005-12-01  3:23 ` NeilBrown
  2005-12-01  3:24 ` [PATCH md 014 of 14] Fix possible problem in raid1/raid10 error overwriting NeilBrown
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:23 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


md supports multiple different RAID levels, each implemented
by a 'personality' (which is often in a separate
module).
These personalities have fairly artificial 'numbers'.  The numbers
are used to:
 1- provide an index into an array where the various personalities
    are recorded
 2- identify the module (via an alias) which implements a particular
    personality.

Neither of these uses really justifies the existence of personality numbers.
The array can be replaced by a linked list which is searched (array lookup
only happens very rarely).  Module identification can be done using an
alias based on level rather than on 'personality' number.

The current 'raid5' module supports two levels (4 and 5) but only one
personality.  This slight awkwardness (previously handled in the mapping
from level to personality) is better resolved by allowing raid5
to register two personalities.

With this change in place, the core md module does not need to have an
exhaustive list of all possible personalities, so other personalities
can be added independently.

This patch also moves the check for a non-zero chunksize
into the ->run routines of the personalities that need it,
rather than keeping it in the core md code.  A side effect is that
'faulty' and 'linear' no longer need a chunk-size set.

Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/faulty.c       |    8 ++--
 ./drivers/md/linear.c       |   10 +++--
 ./drivers/md/md.c           |   79 ++++++++++++++++----------------------------
 ./drivers/md/multipath.c    |   11 ++----
 ./drivers/md/raid0.c        |   14 +++++--
 ./drivers/md/raid1.c        |    9 ++---
 ./drivers/md/raid10.c       |   16 +++++---
 ./drivers/md/raid5.c        |   34 ++++++++++++++++--
 ./drivers/md/raid6main.c    |   10 +++--
 ./include/linux/raid/md.h   |    4 +-
 ./include/linux/raid/md_k.h |   63 +++++------------------------------
 ./init/do_mounts_md.c       |   22 +++++-------
 12 files changed, 125 insertions(+), 155 deletions(-)

diff ./drivers/md/faulty.c~current~ ./drivers/md/faulty.c
--- ./drivers/md/faulty.c~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./drivers/md/faulty.c	2005-12-01 14:03:25.000000000 +1100
@@ -316,9 +316,10 @@ static int stop(mddev_t *mddev)
 	return 0;
 }
 
-static mdk_personality_t faulty_personality =
+static struct mdk_personality faulty_personality =
 {
 	.name		= "faulty",
+	.level		= LEVEL_FAULTY,
 	.owner		= THIS_MODULE,
 	.make_request	= make_request,
 	.run		= run,
@@ -329,15 +330,16 @@ static mdk_personality_t faulty_personal
 
 static int __init raid_init(void)
 {
-	return register_md_personality(FAULTY, &faulty_personality);
+	return register_md_personality(&faulty_personality);
 }
 
 static void raid_exit(void)
 {
-	unregister_md_personality(FAULTY);
+	unregister_md_personality(&faulty_personality);
 }
 
 module_init(raid_init);
 module_exit(raid_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-10"); /* faulty */
+MODULE_ALIAS("md-level--5");

diff ./drivers/md/linear.c~current~ ./drivers/md/linear.c
--- ./drivers/md/linear.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/linear.c	2005-12-01 14:03:25.000000000 +1100
@@ -351,9 +351,10 @@ static void linear_status (struct seq_fi
 }
 
 
-static mdk_personality_t linear_personality=
+static struct mdk_personality linear_personality =
 {
 	.name		= "linear",
+	.level		= LEVEL_LINEAR,
 	.owner		= THIS_MODULE,
 	.make_request	= linear_make_request,
 	.run		= linear_run,
@@ -363,16 +364,17 @@ static mdk_personality_t linear_personal
 
 static int __init linear_init (void)
 {
-	return register_md_personality (LINEAR, &linear_personality);
+	return register_md_personality (&linear_personality);
 }
 
 static void linear_exit (void)
 {
-	unregister_md_personality (LINEAR);
+	unregister_md_personality (&linear_personality);
 }
 
 
 module_init(linear_init);
 module_exit(linear_exit);
 MODULE_LICENSE("GPL");
-MODULE_ALIAS("md-personality-1"); /* LINEAR */
+MODULE_ALIAS("md-personality-1"); /* LINEAR - deprecated */
+MODULE_ALIAS("md-level--1");

diff ./drivers/md/md.c~current~ ./drivers/md/md.c
--- ./drivers/md/md.c~current~	2005-12-01 14:03:07.000000000 +1100
+++ ./drivers/md/md.c	2005-12-01 14:03:25.000000000 +1100
@@ -68,7 +68,7 @@
 static void autostart_arrays (int part);
 #endif
 
-static mdk_personality_t *pers[MAX_PERSONALITY];
+static LIST_HEAD(pers_list);
 static DEFINE_SPINLOCK(pers_lock);
 
 /*
@@ -303,6 +303,15 @@ static mdk_rdev_t * find_rdev(mddev_t * 
 	return NULL;
 }
 
+static struct mdk_personality *find_pers(int level)
+{
+	struct mdk_personality *pers;
+	list_for_each_entry(pers, &pers_list, list)
+		if (pers->level == level)
+			return pers;
+	return NULL;
+}
+
 static inline sector_t calc_dev_sboffset(struct block_device *bdev)
 {
 	sector_t size = bdev->bd_inode->i_size >> BLOCK_SIZE_BITS;
@@ -1744,7 +1753,7 @@ static void analyze_sbs(mddev_t * mddev)
 static ssize_t
 level_show(mddev_t *mddev, char *page)
 {
-	mdk_personality_t *p = mddev->pers;
+	struct mdk_personality *p = mddev->pers;
 	if (p == NULL && mddev->raid_disks == 0)
 		return 0;
 	if (mddev->level >= 0)
@@ -1960,11 +1969,12 @@ static int start_dirty_degraded;
 
 static int do_md_run(mddev_t * mddev)
 {
-	int pnum, err;
+	int err;
 	int chunk_size;
 	struct list_head *tmp;
 	mdk_rdev_t *rdev;
 	struct gendisk *disk;
+	struct mdk_personality *pers;
 	char b[BDEVNAME_SIZE];
 
 	if (list_empty(&mddev->disks))
@@ -1981,20 +1991,8 @@ static int do_md_run(mddev_t * mddev)
 		analyze_sbs(mddev);
 
 	chunk_size = mddev->chunk_size;
-	pnum = level_to_pers(mddev->level);
 
-	if ((pnum != MULTIPATH) && (pnum != RAID1)) {
-		if (!chunk_size) {
-			/*
-			 * 'default chunksize' in the old md code used to
-			 * be PAGE_SIZE, baaad.
-			 * we abort here to be on the safe side. We don't
-			 * want to continue the bad practice.
-			 */
-			printk(KERN_ERR 
-				"no chunksize specified, see 'man raidtab'\n");
-			return -EINVAL;
-		}
+	if (chunk_size) {
 		if (chunk_size > MAX_CHUNK_SIZE) {
 			printk(KERN_ERR "too big chunk_size: %d > %d\n",
 				chunk_size, MAX_CHUNK_SIZE);
@@ -2030,10 +2028,7 @@ static int do_md_run(mddev_t * mddev)
 	}
 
 #ifdef CONFIG_KMOD
-	if (!pers[pnum])
-	{
-		request_module("md-personality-%d", pnum);
-	}
+	request_module("md-level-%d", mddev->level);
 #endif
 
 	/*
@@ -2055,14 +2050,14 @@ static int do_md_run(mddev_t * mddev)
 		return -ENOMEM;
 
 	spin_lock(&pers_lock);
-	if (!pers[pnum] || !try_module_get(pers[pnum]->owner)) {
+	pers = find_pers(mddev->level);
+	if (!pers || !try_module_get(pers->owner)) {
 		spin_unlock(&pers_lock);
-		printk(KERN_WARNING "md: personality %d is not loaded!\n",
-		       pnum);
+		printk(KERN_WARNING "md: personality for level %d is not loaded!\n",
+		       mddev->level);
 		return -EINVAL;
 	}
-
-	mddev->pers = pers[pnum];
+	mddev->pers = pers;
 	spin_unlock(&pers_lock);
 
 	mddev->recovery = 0;
@@ -3693,15 +3688,14 @@ static int md_seq_show(struct seq_file *
 	struct list_head *tmp2;
 	mdk_rdev_t *rdev;
 	struct mdstat_info *mi = seq->private;
-	int i;
 	struct bitmap *bitmap;
 
 	if (v == (void*)1) {
+		struct mdk_personality *pers;
 		seq_printf(seq, "Personalities : ");
 		spin_lock(&pers_lock);
-		for (i = 0; i < MAX_PERSONALITY; i++)
-			if (pers[i])
-				seq_printf(seq, "[%s] ", pers[i]->name);
+		list_for_each_entry(pers, &pers_list, list)
+			seq_printf(seq, "[%s] ", pers->name);
 
 		spin_unlock(&pers_lock);
 		seq_printf(seq, "\n");
@@ -3862,35 +3856,20 @@ static struct file_operations md_seq_fop
 	.poll		= mdstat_poll,
 };
 
-int register_md_personality(int pnum, mdk_personality_t *p)
+int register_md_personality(struct mdk_personality *p)
 {
-	if (pnum >= MAX_PERSONALITY) {
-		printk(KERN_ERR
-		       "md: tried to install personality %s as nr %d, but max is %lu\n",
-		       p->name, pnum, MAX_PERSONALITY-1);
-		return -EINVAL;
-	}
-
 	spin_lock(&pers_lock);
-	if (pers[pnum]) {
-		spin_unlock(&pers_lock);
-		return -EBUSY;
-	}
-
-	pers[pnum] = p;
-	printk(KERN_INFO "md: %s personality registered as nr %d\n", p->name, pnum);
+	list_add_tail(&p->list, &pers_list);
+	printk(KERN_INFO "md: %s personality registered for level %d\n", p->name, p->level);
 	spin_unlock(&pers_lock);
 	return 0;
 }
 
-int unregister_md_personality(int pnum)
+int unregister_md_personality(struct mdk_personality *p)
 {
-	if (pnum >= MAX_PERSONALITY)
-		return -EINVAL;
-
-	printk(KERN_INFO "md: %s personality unregistered\n", pers[pnum]->name);
+	printk(KERN_INFO "md: %s personality unregistered\n", p->name);
 	spin_lock(&pers_lock);
-	pers[pnum] = NULL;
+	list_del_init(&p->list);
 	spin_unlock(&pers_lock);
 	return 0;
 }

diff ./drivers/md/multipath.c~current~ ./drivers/md/multipath.c
--- ./drivers/md/multipath.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/multipath.c	2005-12-01 14:03:25.000000000 +1100
@@ -35,9 +35,6 @@
 #define	NR_RESERVED_BUFS	32
 
 
-static mdk_personality_t multipath_personality;
-
-
 static void *mp_pool_alloc(gfp_t gfp_flags, void *data)
 {
 	struct multipath_bh *mpb;
@@ -553,9 +550,10 @@ static int multipath_stop (mddev_t *mdde
 	return 0;
 }
 
-static mdk_personality_t multipath_personality=
+static struct mdk_personality multipath_personality =
 {
 	.name		= "multipath",
+	.level		= LEVEL_MULTIPATH,
 	.owner		= THIS_MODULE,
 	.make_request	= multipath_make_request,
 	.run		= multipath_run,
@@ -568,15 +566,16 @@ static mdk_personality_t multipath_perso
 
 static int __init multipath_init (void)
 {
-	return register_md_personality (MULTIPATH, &multipath_personality);
+	return register_md_personality (&multipath_personality);
 }
 
 static void __exit multipath_exit (void)
 {
-	unregister_md_personality (MULTIPATH);
+	unregister_md_personality (&multipath_personality);
 }
 
 module_init(multipath_init);
 module_exit(multipath_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-7"); /* MULTIPATH */
+MODULE_ALIAS("md-level--4");

diff ./drivers/md/raid0.c~current~ ./drivers/md/raid0.c
--- ./drivers/md/raid0.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/raid0.c	2005-12-01 14:03:25.000000000 +1100
@@ -275,7 +275,11 @@ static int raid0_run (mddev_t *mddev)
 	mdk_rdev_t *rdev;
 	struct list_head *tmp;
 
-	printk("%s: setting max_sectors to %d, segment boundary to %d\n",
+	if (mddev->chunk_size == 0) {
+		printk(KERN_ERR "md/raid0: non-zero chunk size required.\n");
+		return -EINVAL;
+	}
+	printk(KERN_INFO "%s: setting max_sectors to %d, segment boundary to %d\n",
 	       mdname(mddev),
 	       mddev->chunk_size >> 9,
 	       (mddev->chunk_size>>1)-1);
@@ -507,9 +511,10 @@ static void raid0_status (struct seq_fil
 	return;
 }
 
-static mdk_personality_t raid0_personality=
+static struct mdk_personality raid0_personality=
 {
 	.name		= "raid0",
+	.level		= 0,
 	.owner		= THIS_MODULE,
 	.make_request	= raid0_make_request,
 	.run		= raid0_run,
@@ -519,15 +524,16 @@ static mdk_personality_t raid0_personali
 
 static int __init raid0_init (void)
 {
-	return register_md_personality (RAID0, &raid0_personality);
+	return register_md_personality (&raid0_personality);
 }
 
 static void raid0_exit (void)
 {
-	unregister_md_personality (RAID0);
+	unregister_md_personality (&raid0_personality);
 }
 
 module_init(raid0_init);
 module_exit(raid0_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-2"); /* RAID0 */
+MODULE_ALIAS("md-level-0");

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-12-01 13:59:53.000000000 +1100
+++ ./drivers/md/raid1.c	2005-12-01 14:03:25.000000000 +1100
@@ -47,7 +47,6 @@
  */
 #define	NR_RAID1_BIOS 256
 
-static mdk_personality_t raid1_personality;
 
 static void unplug_slaves(mddev_t *mddev);
 
@@ -2035,9 +2034,10 @@ static void raid1_quiesce(mddev_t *mddev
 }
 
 
-static mdk_personality_t raid1_personality =
+static struct mdk_personality raid1_personality =
 {
 	.name		= "raid1",
+	.level		= 1,
 	.owner		= THIS_MODULE,
 	.make_request	= make_request,
 	.run		= run,
@@ -2055,15 +2055,16 @@ static mdk_personality_t raid1_personali
 
 static int __init raid_init(void)
 {
-	return register_md_personality(RAID1, &raid1_personality);
+	return register_md_personality(&raid1_personality);
 }
 
 static void raid_exit(void)
 {
-	unregister_md_personality(RAID1);
+	unregister_md_personality(&raid1_personality);
 }
 
 module_init(raid_init);
 module_exit(raid_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-3"); /* RAID1 */
+MODULE_ALIAS("md-level-1");

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-12-01 14:03:16.000000000 +1100
+++ ./drivers/md/raid10.c	2005-12-01 14:03:25.000000000 +1100
@@ -1883,11 +1883,11 @@ static int run(mddev_t *mddev)
 	int nc, fc;
 	sector_t stride, size;
 
-	if (mddev->level != 10) {
-		printk(KERN_ERR "raid10: %s: raid level not set correctly... (%d)\n",
-		       mdname(mddev), mddev->level);
-		goto out;
+	if (mddev->chunk_size == 0) {
+		printk(KERN_ERR "md/raid10: non-zero chunk size required.\n");
+		return -EINVAL;
 	}
+
 	nc = mddev->layout & 255;
 	fc = (mddev->layout >> 8) & 255;
 	if ((nc*fc) <2 || (nc*fc) > mddev->raid_disks ||
@@ -2072,9 +2072,10 @@ static void raid10_quiesce(mddev_t *mdde
 	}
 }
 
-static mdk_personality_t raid10_personality =
+static struct mdk_personality raid10_personality =
 {
 	.name		= "raid10",
+	.level		= 10,
 	.owner		= THIS_MODULE,
 	.make_request	= make_request,
 	.run		= run,
@@ -2090,15 +2091,16 @@ static mdk_personality_t raid10_personal
 
 static int __init raid_init(void)
 {
-	return register_md_personality(RAID10, &raid10_personality);
+	return register_md_personality(&raid10_personality);
 }
 
 static void raid_exit(void)
 {
-	unregister_md_personality(RAID10);
+	unregister_md_personality(&raid10_personality);
 }
 
 module_init(raid_init);
 module_exit(raid_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-9"); /* RAID10 */
+MODULE_ALIAS("md-level-10");

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2005-12-01 14:01:59.000000000 +1100
+++ ./drivers/md/raid5.c	2005-12-01 14:03:25.000000000 +1100
@@ -2186,9 +2186,10 @@ static void raid5_quiesce(mddev_t *mddev
 	}
 }
 
-static mdk_personality_t raid5_personality=
+static struct mdk_personality raid5_personality =
 {
 	.name		= "raid5",
+	.level		= 5,
 	.owner		= THIS_MODULE,
 	.make_request	= make_request,
 	.run		= run,
@@ -2203,17 +2204,40 @@ static mdk_personality_t raid5_personali
 	.quiesce	= raid5_quiesce,
 };
 
-static int __init raid5_init (void)
+static struct mdk_personality raid4_personality =
 {
-	return register_md_personality (RAID5, &raid5_personality);
+	.name		= "raid4",
+	.level		= 4,
+	.owner		= THIS_MODULE,
+	.make_request	= make_request,
+	.run		= run,
+	.stop		= stop,
+	.status		= status,
+	.error_handler	= error,
+	.hot_add_disk	= raid5_add_disk,
+	.hot_remove_disk= raid5_remove_disk,
+	.spare_active	= raid5_spare_active,
+	.sync_request	= sync_request,
+	.resize		= raid5_resize,
+	.quiesce	= raid5_quiesce,
+};
+
+static int __init raid5_init(void)
+{
+	register_md_personality(&raid5_personality);
+	register_md_personality(&raid4_personality);
+	return 0;
 }
 
-static void raid5_exit (void)
+static void raid5_exit(void)
 {
-	unregister_md_personality (RAID5);
+	unregister_md_personality(&raid5_personality);
+	unregister_md_personality(&raid4_personality);
 }
 
 module_init(raid5_init);
 module_exit(raid5_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-4"); /* RAID5 */
+MODULE_ALIAS("md-level-5");
+MODULE_ALIAS("md-level-4");

diff ./drivers/md/raid6main.c~current~ ./drivers/md/raid6main.c
--- ./drivers/md/raid6main.c~current~	2005-12-01 14:01:58.000000000 +1100
+++ ./drivers/md/raid6main.c	2005-12-01 14:03:25.000000000 +1100
@@ -2304,9 +2304,10 @@ static void raid6_quiesce(mddev_t *mddev
 	}
 }
 
-static mdk_personality_t raid6_personality=
+static struct mdk_personality raid6_personality =
 {
 	.name		= "raid6",
+	.level		= 6,
 	.owner		= THIS_MODULE,
 	.make_request	= make_request,
 	.run		= run,
@@ -2321,7 +2322,7 @@ static mdk_personality_t raid6_personali
 	.quiesce	= raid6_quiesce,
 };
 
-static int __init raid6_init (void)
+static int __init raid6_init(void)
 {
 	int e;
 
@@ -2329,15 +2330,16 @@ static int __init raid6_init (void)
 	if ( e )
 		return e;
 
-	return register_md_personality (RAID6, &raid6_personality);
+	return register_md_personality(&raid6_personality);
 }
 
 static void raid6_exit (void)
 {
-	unregister_md_personality (RAID6);
+	unregister_md_personality(&raid6_personality);
 }
 
 module_init(raid6_init);
 module_exit(raid6_exit);
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("md-personality-8"); /* RAID6 */
+MODULE_ALIAS("md-level-6");

diff ./include/linux/raid/md.h~current~ ./include/linux/raid/md.h
--- ./include/linux/raid/md.h~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./include/linux/raid/md.h	2005-12-01 14:03:25.000000000 +1100
@@ -71,8 +71,8 @@
  */
 #define MD_PATCHLEVEL_VERSION           3
 
-extern int register_md_personality (int p_num, mdk_personality_t *p);
-extern int unregister_md_personality (int p_num);
+extern int register_md_personality (struct mdk_personality *p);
+extern int unregister_md_personality (struct mdk_personality *p);
 extern mdk_thread_t * md_register_thread (void (*run) (mddev_t *mddev),
 				mddev_t *mddev, const char *name);
 extern void md_unregister_thread (mdk_thread_t *thread);

diff ./include/linux/raid/md_k.h~current~ ./include/linux/raid/md_k.h
--- ./include/linux/raid/md_k.h~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./include/linux/raid/md_k.h	2005-12-01 14:03:25.000000000 +1100
@@ -18,62 +18,19 @@
 /* and dm-bio-list.h is not under include/linux because.... ??? */
 #include "../../../drivers/md/dm-bio-list.h"
 
-#define MD_RESERVED       0UL
-#define LINEAR            1UL
-#define RAID0             2UL
-#define RAID1             3UL
-#define RAID5             4UL
-#define TRANSLUCENT       5UL
-#define HSM               6UL
-#define MULTIPATH         7UL
-#define RAID6		  8UL
-#define	RAID10		  9UL
-#define FAULTY		  10UL
-#define MAX_PERSONALITY   11UL
-
 #define	LEVEL_MULTIPATH		(-4)
 #define	LEVEL_LINEAR		(-1)
 #define	LEVEL_FAULTY		(-5)
 
+/* we need a value for 'no level specified' and 0
+ * means 'raid0', so we need something else.  This is
+ * for internal use only
+ */
+#define	LEVEL_NONE		(-1000000)
+
 #define MaxSector (~(sector_t)0)
 #define MD_THREAD_NAME_MAX 14
 
-static inline int pers_to_level (int pers)
-{
-	switch (pers) {
-		case FAULTY:		return LEVEL_FAULTY;
-		case MULTIPATH:		return LEVEL_MULTIPATH;
-		case HSM:		return -3;
-		case TRANSLUCENT:	return -2;
-		case LINEAR:		return LEVEL_LINEAR;
-		case RAID0:		return 0;
-		case RAID1:		return 1;
-		case RAID5:		return 5;
-		case RAID6:		return 6;
-		case RAID10:		return 10;
-	}
-	BUG();
-	return MD_RESERVED;
-}
-
-static inline int level_to_pers (int level)
-{
-	switch (level) {
-		case LEVEL_FAULTY: return FAULTY;
-		case LEVEL_MULTIPATH: return MULTIPATH;
-		case -3: return HSM;
-		case -2: return TRANSLUCENT;
-		case LEVEL_LINEAR: return LINEAR;
-		case 0: return RAID0;
-		case 1: return RAID1;
-		case 4:
-		case 5: return RAID5;
-		case 6: return RAID6;
-		case 10: return RAID10;
-	}
-	return MD_RESERVED;
-}
-
 typedef struct mddev_s mddev_t;
 typedef struct mdk_rdev_s mdk_rdev_t;
 
@@ -140,12 +97,10 @@ struct mdk_rdev_s
 					 */
 };
 
-typedef struct mdk_personality_s mdk_personality_t;
-
 struct mddev_s
 {
 	void				*private;
-	mdk_personality_t		*pers;
+	struct mdk_personality		*pers;
 	dev_t				unit;
 	int				md_minor;
 	struct list_head 		disks;
@@ -266,9 +221,11 @@ static inline void md_sync_acct(struct b
         atomic_add(nr_sectors, &bdev->bd_contains->bd_disk->sync_io);
 }
 
-struct mdk_personality_s
+struct mdk_personality
 {
 	char *name;
+	int level;
+	struct list_head list;
 	struct module *owner;
 	int (*make_request)(request_queue_t *q, struct bio *bio);
 	int (*run)(mddev_t *mddev);

diff ./init/do_mounts_md.c~current~ ./init/do_mounts_md.c
--- ./init/do_mounts_md.c~current~	2005-12-01 13:59:45.000000000 +1100
+++ ./init/do_mounts_md.c	2005-12-01 14:03:25.000000000 +1100
@@ -17,7 +17,7 @@ static int __initdata raid_noautodetect,
 static struct {
 	int minor;
 	int partitioned;
-	int pers;
+	int level;
 	int chunk;
 	char *device_names;
 } md_setup_args[MAX_MD_DEVS] __initdata;
@@ -47,7 +47,7 @@ extern int mdp_major;
  */
 static int __init md_setup(char *str)
 {
-	int minor, level, factor, fault, pers, partitioned = 0;
+	int minor, level, factor, fault, partitioned = 0;
 	char *pername = "";
 	char *str1;
 	int ent;
@@ -78,7 +78,7 @@ static int __init md_setup(char *str)
 	}
 	if (ent >= md_setup_ents)
 		md_setup_ents++;
-	switch (get_option(&str, &level)) {	/* RAID Personality */
+	switch (get_option(&str, &level)) {	/* RAID level */
 	case 2: /* could be 0 or -1.. */
 		if (level == 0 || level == LEVEL_LINEAR) {
 			if (get_option(&str, &factor) != 2 ||	/* Chunk Size */
@@ -86,16 +86,12 @@ static int __init md_setup(char *str)
 				printk(KERN_WARNING "md: Too few arguments supplied to md=.\n");
 				return 0;
 			}
-			md_setup_args[ent].pers = level;
+			md_setup_args[ent].level = level;
 			md_setup_args[ent].chunk = 1 << (factor+12);
-			if (level ==  LEVEL_LINEAR) {
-				pers = LINEAR;
+			if (level ==  LEVEL_LINEAR)
 				pername = "linear";
-			} else {
-				pers = RAID0;
+			else
 				pername = "raid0";
-			}
-			md_setup_args[ent].pers = pers;
 			break;
 		}
 		/* FALL THROUGH */
@@ -103,7 +99,7 @@ static int __init md_setup(char *str)
 		str = str1;
 		/* FALL THROUGH */
 	case 0:
-		md_setup_args[ent].pers = 0;
+		md_setup_args[ent].level = LEVEL_NONE;
 		pername="super-block";
 	}
 
@@ -190,10 +186,10 @@ static void __init md_setup_drive(void)
 			continue;
 		}
 
-		if (md_setup_args[ent].pers) {
+		if (md_setup_args[ent].level != LEVEL_NONE) {
 			/* non-persistent */
 			mdu_array_info_t ainfo;
-			ainfo.level = pers_to_level(md_setup_args[ent].pers);
+			ainfo.level = md_setup_args[ent].level;
 			ainfo.size = 0;
 			ainfo.nr_disks =0;
 			ainfo.raid_disks =0;

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH md 014 of 14] Fix possible problem in raid1/raid10 error overwriting.
  2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
                   ` (12 preceding siblings ...)
  2005-12-01  3:23 ` [PATCH md 013 of 14] Remove personality numbering from md NeilBrown
@ 2005-12-01  3:24 ` NeilBrown
  13 siblings, 0 replies; 22+ messages in thread
From: NeilBrown @ 2005-12-01  3:24 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid


The code that overwrites and re-reads to address read errors
in raid1/raid10 currently assumes that the re-read will
not alter the buffer, which may still be needed to write to
the next device.  This is not a safe assumption to make.

So we split the loops into an overwrite loop and a separate re-read
loop, so that the writing is complete before reading is attempted.

Cc: Paul Clements <paul.clements@steeleye.com>
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./drivers/md/raid1.c  |   38 ++++++++++++++++++++++++++++++--------
 ./drivers/md/raid10.c |   22 ++++++++++++++++++----
 2 files changed, 48 insertions(+), 12 deletions(-)

diff ./drivers/md/raid1.c~current~ ./drivers/md/raid1.c
--- ./drivers/md/raid1.c~current~	2005-12-01 14:03:25.000000000 +1100
+++ ./drivers/md/raid1.c	2005-12-01 14:03:40.000000000 +1100
@@ -1252,6 +1252,7 @@ static void sync_request_write(mddev_t *
 			} while (!success && d != r1_bio->read_disk);
 
 			if (success) {
+				int start = d;
 				/* write it back and re-read */
 				set_bit(R1BIO_Uptodate, &r1_bio->state);
 				while (d != r1_bio->read_disk) {
@@ -1265,14 +1266,23 @@ static void sync_request_write(mddev_t *
 							 sect + rdev->data_offset,
 							 s<<9,
 							 bio->bi_io_vec[idx].bv_page,
-							 WRITE) == 0 ||
-					    sync_page_io(rdev->bdev,
+							 WRITE) == 0)
+						md_error(mddev, rdev);
+				}
+				d = start;
+				while (d != r1_bio->read_disk) {
+					if (d == 0)
+						d = conf->raid_disks;
+					d--;
+					if (r1_bio->bios[d]->bi_end_io != end_sync_read)
+						continue;
+					rdev = conf->mirrors[d].rdev;
+					if (sync_page_io(rdev->bdev,
 							 sect + rdev->data_offset,
 							 s<<9,
 							 bio->bi_io_vec[idx].bv_page,
-							 READ) == 0) {
+							 READ) == 0)
 						md_error(mddev, rdev);
-					}
 				}
 			} else {
 				char b[BDEVNAME_SIZE];
@@ -1444,6 +1454,7 @@ static void raid1d(mddev_t *mddev)
 
 				if (success) {
 					/* write it back and re-read */
+					int start = d;
 					while (d != r1_bio->read_disk) {
 						if (d==0)
 							d = conf->raid_disks;
@@ -1453,13 +1464,24 @@ static void raid1d(mddev_t *mddev)
 						    test_bit(In_sync, &rdev->flags)) {
 							if (sync_page_io(rdev->bdev,
 									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, WRITE) == 0 ||
-							    sync_page_io(rdev->bdev,
+									 s<<9, conf->tmppage, WRITE) == 0)
+								/* Well, this device is dead */
+								md_error(mddev, rdev);
+						}
+					}
+					d = start;
+					while (d != r1_bio->read_disk) {
+						if (d==0)
+							d = conf->raid_disks;
+						d--;
+						rdev = conf->mirrors[d].rdev;
+						if (rdev &&
+						    test_bit(In_sync, &rdev->flags)) {
+							if (sync_page_io(rdev->bdev,
 									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, READ) == 0) {
+									 s<<9, conf->tmppage, READ) == 0)
 								/* Well, this device is dead */
 								md_error(mddev, rdev);
-							}
 						}
 					}
 				} else {

diff ./drivers/md/raid10.c~current~ ./drivers/md/raid10.c
--- ./drivers/md/raid10.c~current~	2005-12-01 14:03:25.000000000 +1100
+++ ./drivers/md/raid10.c	2005-12-01 14:03:41.000000000 +1100
@@ -1421,6 +1421,7 @@ static void raid10d(mddev_t *mddev)
 				} while (!success && sl != r10_bio->read_slot);
 
 				if (success) {
+					int start = sl;
 					/* write it back and re-read */
 					while (sl != r10_bio->read_slot) {
 						int d;
@@ -1434,14 +1435,27 @@ static void raid10d(mddev_t *mddev)
 							if (sync_page_io(rdev->bdev,
 									 r10_bio->devs[sl].addr +
 									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, WRITE) == 0 ||
-							    sync_page_io(rdev->bdev,
+									 s<<9, conf->tmppage, WRITE) == 0)
+								/* Well, this device is dead */
+								md_error(mddev, rdev);
+						}
+					}
+					sl = start;
+					while (sl != r10_bio->read_slot) {
+						int d;
+						if (sl==0)
+							sl = conf->copies;
+						sl--;
+						d = r10_bio->devs[sl].devnum;
+						rdev = conf->mirrors[d].rdev;
+						if (rdev &&
+						    test_bit(In_sync, &rdev->flags)) {
+							if (sync_page_io(rdev->bdev,
 									 r10_bio->devs[sl].addr +
 									 sect + rdev->data_offset,
-									 s<<9, conf->tmppage, READ) == 0) {
+									 s<<9, conf->tmppage, READ) == 0)
 								/* Well, this device is dead */
 								md_error(mddev, rdev);
-							}
 						}
 					}
 				} else {


* Re: [PATCH md 002 of 14] Allow raid1 to check consistency
  2005-12-01  3:22 ` [PATCH md 002 of 14] Allow raid1 to check consistency NeilBrown
@ 2005-12-01 22:34   ` Andrew Morton
  2005-12-05 23:30     ` Neil Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2005-12-01 22:34 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

NeilBrown <neilb@suse.de> wrote:
>
> +	if (test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery))
>  +		j = pi->raid_disks;
>  +	else
>  +		j = 1;
>  +	while(j--) {
>  +		bio = r1_bio->bios[j];
>  +		for (i = 0; i < RESYNC_PAGES; i++) {
>  +			page = alloc_page(gfp_flags);
>  +			if (unlikely(!page))
>  +				goto out_free_pages;
>  +
>  +			bio->bi_io_vec[i].bv_page = page;
>  +		}
>  +	}
>  +	/* If not user-requests, copy the page pointers to all bios */
>  +	if (!test_bit(MD_RECOVERY_REQUESTED, &pi->mddev->recovery)) {
>  +		for (i=0; i<RESYNC_PAGES ; i++)
>  +			for (j=1; j<pi->raid_disks; j++)
>  +				r1_bio->bios[j]->bi_io_vec[i].bv_page =
>  +					r1_bio->bios[0]->bi_io_vec[i].bv_page;
>   	}
>   
>   	r1_bio->master_bio = NULL;
>  @@ -122,8 +137,10 @@ static void * r1buf_pool_alloc(gfp_t gfp
>   	return r1_bio;
>   
>   out_free_pages:
>  -	for ( ; i > 0 ; i--)
>  -		__free_page(bio->bi_io_vec[i-1].bv_page);
>  +	for (i=0; i < RESYNC_PAGES ; i++)
>  +		for (j=0 ; j < pi->raid_disks; j++)
>  +			__free_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
>  +	j = -1;
>   out_free_bio:

Are you sure the error handling here is correct?

a) we loop up to RESYNC_PAGES, but the allocation loop may not have got
   that far

b) we loop in the ascending-index direction, but the allocating loop
   loops in the descending-index direction.

c) we loop up to pi->raid_disks, but the allocating loop may have
   done `j = 1;'.

d) there was a d), but I forgot what it was.


* Re: [PATCH md 006 of 14] Make /proc/mdstat pollable.
  2005-12-01  3:23 ` [PATCH md 006 of 14] Make /proc/mdstat pollable NeilBrown
@ 2005-12-01 22:39   ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2005-12-01 22:39 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

NeilBrown <neilb@suse.de> wrote:
>
> +DECLARE_WAIT_QUEUE_HEAD(md_event_waiters);
>  

static scope?


* Re: [PATCH md 008 of 14] Convert md to use kzalloc throughout
  2005-12-01  3:23 ` [PATCH md 008 of 14] Convert md to use kzalloc throughout NeilBrown
@ 2005-12-01 22:42   ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2005-12-01 22:42 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

NeilBrown <neilb@suse.de> wrote:
>
> -	conf = kmalloc (sizeof (*conf) + mddev->raid_disks*sizeof(dev_info_t),
>  +	conf = kzalloc (sizeof (*conf) + mddev->raid_disks*sizeof(dev_info_t),
>  -	new = (mddev_t *) kmalloc(sizeof(*new), GFP_KERNEL);
>  +	new = (mddev_t *) kzalloc(sizeof(*new), GFP_KERNEL);
>  -	rdev = (mdk_rdev_t *) kmalloc(sizeof(*rdev), GFP_KERNEL);
>  +	rdev = (mdk_rdev_t *) kzalloc(sizeof(*rdev), GFP_KERNEL);

It'd be nice to nuke the unneeded cast while we're there.

<edits the diff>

OK, I did that.


* Re: [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic
  2005-12-01  3:23 ` [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic NeilBrown
@ 2005-12-01 22:46   ` Andrew Morton
  2005-12-05 23:43     ` Neil Brown
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Morton @ 2005-12-01 22:46 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

NeilBrown <neilb@suse.de> wrote:
>
> +				paddr = kmap_atomic(page, KM_USER0);
>  +				memset(paddr + offset, 0xff,
>   				       PAGE_SIZE - offset);

This page which is being altered is a user-visible one, no?  A pagecache
page?

We must always run flush_dcache_page() against a page when the kernel
modifies a user-visible page's contents.  Otherwise, on some architectures,
the modification won't be visible at the different virtual address.


* Re: [PATCH md 002 of 14] Allow raid1 to check consistency
  2005-12-01 22:34   ` Andrew Morton
@ 2005-12-05 23:30     ` Neil Brown
  2005-12-06  3:50       ` Andrew Morton
  0 siblings, 1 reply; 22+ messages in thread
From: Neil Brown @ 2005-12-05 23:30 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid

On Thursday December 1, akpm@osdl.org wrote:
> NeilBrown <neilb@suse.de> wrote:
> >   out_free_pages:
> >  -	for ( ; i > 0 ; i--)
> >  -		__free_page(bio->bi_io_vec[i-1].bv_page);
> >  +	for (i=0; i < RESYNC_PAGES ; i++)
> >  +		for (j=0 ; j < pi->raid_disks; j++)
> >  +			__free_page(r1_bio->bios[j]->bi_io_vec[i].bv_page);
> >  +	j = -1;
> >   out_free_bio:
> 
> Are you sure the error handling here is correct?

Uhmm.. maybe?

> 
> a) we loop up to RESYNC_PAGES, but the allocation loop may not have got
>    that far
> 
> b) we loop in the ascending-index direction, but the allocating loop
>    loops in the descending-index direction.
> 
> c) we loop up to pi->raid_disks, but the allocating loop may have
>    done `j = 1;'.

As it is a well-known fact that all deallocation routines in Linux
accept a NULL argument, and as error handling is not a critical path,
and as the structures are zeroed when allocated, I chose simply to
free every possibly allocated page rather than keep track of exactly
where we were up to.
Unfortunately not all well-known facts are true :-(

> 
> d) there was a d), but I forgot what it was.

Maybe 'd' was  
   __free_page does not accept 'NULL' as an argument, though 
   free_page does (but it wants an address I think...).

But I have since changed this code to use "put_page", and put_page
doesn't like NULL either..

Would you accept:
------------------
Signed-off-by: Neil Brown <neilb@suse.de>

### Diffstat output
 ./mm/swap.c |    2 ++
 1 file changed, 2 insertions(+)

diff ./mm/swap.c~current~ ./mm/swap.c
--- ./mm/swap.c~current~	2005-12-06 10:29:16.000000000 +1100
+++ ./mm/swap.c	2005-12-06 10:29:25.000000000 +1100
@@ -36,6 +36,8 @@ int page_cluster;
 
 void put_page(struct page *page)
 {
+	if (unlikely(page==NULL))
+		return;
 	if (unlikely(PageCompound(page))) {
 		page = (struct page *)page_private(page);
 		if (put_page_testzero(page)) {

--------------------

Or should I open code this in md ?

NeilBrown


* Re: [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic
  2005-12-01 22:46   ` Andrew Morton
@ 2005-12-05 23:43     ` Neil Brown
  0 siblings, 0 replies; 22+ messages in thread
From: Neil Brown @ 2005-12-05 23:43 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-raid

On Thursday December 1, akpm@osdl.org wrote:
> NeilBrown <neilb@suse.de> wrote:
> >
> > +				paddr = kmap_atomic(page, KM_USER0);
> >  +				memset(paddr + offset, 0xff,
> >   				       PAGE_SIZE - offset);
> 
> This page which is being altered is a user-visible one, no?  A pagecache
> page?

It can be, but also may not be..

> 
> We must always run flush_dcache_page() against a page when the kernel
> modifies a user-visible page's contents.  Otherwise, on some architectures,
> the modification won't be visible at the different virtual address.

Ok, I'll add that at an appropriate place, thanks.

NeilBrown


* Re: [PATCH md 002 of 14] Allow raid1 to check consistency
  2005-12-05 23:30     ` Neil Brown
@ 2005-12-06  3:50       ` Andrew Morton
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Morton @ 2005-12-06  3:50 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Neil Brown <neilb@suse.de> wrote:
>
> Would you accept:
>  ------------------
>  Signed-off-by: Neil Brown <neilb@suse.de>
> 
>  ### Diffstat output
>   ./mm/swap.c |    2 ++
>   1 file changed, 2 insertions(+)
> 
>  diff ./mm/swap.c~current~ ./mm/swap.c
>  --- ./mm/swap.c~current~	2005-12-06 10:29:16.000000000 +1100
>  +++ ./mm/swap.c	2005-12-06 10:29:25.000000000 +1100
>  @@ -36,6 +36,8 @@ int page_cluster;
>   
>   void put_page(struct page *page)
>   {
>  +	if (unlikely(page==NULL))
>  +		return;
>   	if (unlikely(PageCompound(page))) {
>   		page = (struct page *)page_private(page);
>   		if (put_page_testzero(page)) {
> 

eek.  That's an all-over-the-place-fast-path!

> 
>  Or should I open code this in md ?

Yes please.



Thread overview: 22+ messages
2005-12-01  3:22 [PATCH md 000 of 14] Introduction NeilBrown
2005-12-01  3:22 ` [PATCH md 001 of 14] Support check-without-repair of raid10 arrays NeilBrown
2005-12-01  3:22 ` [PATCH md 002 of 14] Allow raid1 to check consistency NeilBrown
2005-12-01 22:34   ` Andrew Morton
2005-12-05 23:30     ` Neil Brown
2005-12-06  3:50       ` Andrew Morton
2005-12-01  3:23 ` [PATCH md 003 of 14] Make sure read error on last working drive of raid1 actually returns failure NeilBrown
2005-12-01  3:23 ` [PATCH md 004 of 14] auto-correct correctable read errors in raid10 NeilBrown
2005-12-01  3:23 ` [PATCH md 005 of 14] raid10 read-error handling - resync and read-only NeilBrown
2005-12-01  3:23 ` [PATCH md 006 of 14] Make /proc/mdstat pollable NeilBrown
2005-12-01 22:39   ` Andrew Morton
2005-12-01  3:23 ` [PATCH md 007 of 14] Clean up 'page' related names in md NeilBrown
2005-12-01  3:23 ` [PATCH md 008 of 14] Convert md to use kzalloc throughout NeilBrown
2005-12-01 22:42   ` Andrew Morton
2005-12-01  3:23 ` [PATCH md 009 of 14] Tidy up raid5/6 hash table code NeilBrown
2005-12-01  3:23 ` [PATCH md 010 of 14] Convert various kmap calls to kmap_atomic NeilBrown
2005-12-01 22:46   ` Andrew Morton
2005-12-05 23:43     ` Neil Brown
2005-12-01  3:23 ` [PATCH md 011 of 14] Convert recently exported symbol to GPL NeilBrown
2005-12-01  3:23 ` [PATCH md 012 of 14] Break out of a loop that doesn't need to run to completion NeilBrown
2005-12-01  3:23 ` [PATCH md 013 of 14] Remove personality numbering from md NeilBrown
2005-12-01  3:24 ` [PATCH md 014 of 14] Fix possible problem in raid1/raid10 error overwriting NeilBrown
