From: NeilBrown <neilb@suse.de>
To: linux-raid@vger.kernel.org
Subject: [md PATCH 04/16] md: simplify raid10 read_balance
Date: Wed, 11 May 2011 16:30:30 +1000
Message-ID: <20110511063030.21263.75346.stgit@notabene.brown>
In-Reply-To: <20110511062743.21263.72802.stgit@notabene.brown>

raid10 read_balance has two different loops for looking through
possible devices to choose the best.
Collapse those into one loop and generally make the code more
readable.

Signed-off-by: NeilBrown <neilb@suse.de>
---
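
For reviewers who want the shape of the new code at a glance, below is a
minimal, standalone sketch of the collapsed loop (not the kernel code
itself): one pass over the copies that skips unusable devices, returns the
first usable slot when balancing is disabled, and otherwise remembers the
slot with the shortest seek distance.  The struct, names and toy data are
illustrative only; the real function also handles IO_BLOCKED slots, RCU
dereferencing, the near_copies/far_copies special cases and nr_pending
accounting.

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

struct copy {
	int in_sync;		/* is this device usable? */
	long addr;		/* sector this copy would read from */
	long head_position;	/* last known head position of the device */
};

/* Return the slot to read from, or -1 if no device is usable. */
static int pick_slot(const struct copy *copies, int ncopies, int do_balance)
{
	long best_dist = LONG_MAX;
	int best_slot = -1;
	int slot;

	for (slot = 0; slot < ncopies; slot++) {
		long dist;

		if (!copies[slot].in_sync)
			continue;	/* skip failed/missing devices */
		if (!do_balance)
			return slot;	/* resync active: first usable copy wins */

		dist = labs(copies[slot].addr - copies[slot].head_position);
		if (dist < best_dist) {	/* remember the closest head so far */
			best_dist = dist;
			best_slot = slot;
		}
	}
	return best_slot;
}

int main(void)
{
	struct copy copies[] = {
		{ 1, 1000, 5000 },	/* usable, head far away */
		{ 1, 1000, 1100 },	/* usable, head nearby: wins when balancing */
		{ 0, 1000, 1000 },	/* not in sync: always skipped */
	};

	printf("balanced pick:   slot %d\n", pick_slot(copies, 3, 1));
	printf("no-balance pick: slot %d\n", pick_slot(copies, 3, 0));
	return 0;
}

Folding the "no balancing during resync" case into the same loop via the
do_balance flag is what lets the second copy of the device-scanning code
go away.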

 drivers/md/raid10.c |  110 ++++++++++++++++++++-------------------------------
 1 files changed, 44 insertions(+), 66 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 8e94626..8ea0aca 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -488,13 +488,19 @@ static int raid10_mergeable_bvec(struct request_queue *q,
 static int read_balance(conf_t *conf, r10bio_t *r10_bio)
 {
 	const sector_t this_sector = r10_bio->sector;
-	int disk, slot, nslot;
+	int disk, slot;
 	const int sectors = r10_bio->sectors;
-	sector_t new_distance, current_distance;
+	sector_t new_distance, best_dist;
 	mdk_rdev_t *rdev;
+	int do_balance;
+	int best_slot;
 
 	raid10_find_phys(conf, r10_bio);
 	rcu_read_lock();
+retry:
+	best_slot = -1;
+	best_dist = MaxSector;
+	do_balance = 1;
 	/*
 	 * Check if we can balance. We can balance on the whole
 	 * device if no resync is going on (recovery is ok), or below
@@ -502,86 +508,58 @@ static int read_balance(conf_t *conf, r10bio_t *r10_bio)
 	 * above the resync window.
 	 */
 	if (conf->mddev->recovery_cp < MaxSector
-	    && (this_sector + sectors >= conf->next_resync)) {
-		/* make sure that disk is operational */
-		slot = 0;
-		disk = r10_bio->devs[slot].devnum;
-
-		while ((rdev = rcu_dereference(conf->mirrors[disk].rdev)) == NULL ||
-		       r10_bio->devs[slot].bio == IO_BLOCKED ||
-		       !test_bit(In_sync, &rdev->flags)) {
-			slot++;
-			if (slot == conf->copies) {
-				slot = 0;
-				disk = -1;
-				break;
-			}
-			disk = r10_bio->devs[slot].devnum;
-		}
-		goto rb_out;
-	}
-
+	    && (this_sector + sectors >= conf->next_resync))
+		do_balance = 0;
 
-	/* make sure the disk is operational */
-	slot = 0;
-	disk = r10_bio->devs[slot].devnum;
-	while ((rdev=rcu_dereference(conf->mirrors[disk].rdev)) == NULL ||
-	       r10_bio->devs[slot].bio == IO_BLOCKED ||
-	       !test_bit(In_sync, &rdev->flags)) {
-		slot ++;
-		if (slot == conf->copies) {
-			disk = -1;
-			goto rb_out;
-		}
+	for (slot = 0; slot < conf->copies ; slot++) {
+		if (r10_bio->devs[slot].bio == IO_BLOCKED)
+			continue;
 		disk = r10_bio->devs[slot].devnum;
-	}
-
-
-	current_distance = abs(r10_bio->devs[slot].addr -
-			       conf->mirrors[disk].head_position);
-
-	/* Find the disk whose head is closest,
-	 * or - for far > 1 - find the closest to partition beginning */
-
-	for (nslot = slot; nslot < conf->copies; nslot++) {
-		int ndisk = r10_bio->devs[nslot].devnum;
-
-
-		if ((rdev=rcu_dereference(conf->mirrors[ndisk].rdev)) == NULL ||
-		    r10_bio->devs[nslot].bio == IO_BLOCKED ||
-		    !test_bit(In_sync, &rdev->flags))
+		rdev = rcu_dereference(conf->mirrors[disk].rdev);
+		if (rdev == NULL)
+			continue;
+		if (!test_bit(In_sync, &rdev->flags))
 			continue;
 
+		if (!do_balance)
+			break;
+
 		/* This optimisation is debatable, and completely destroys
 		 * sequential read speed for 'far copies' arrays.  So only
 		 * keep it for 'near' arrays, and review those later.
 		 */
-		if (conf->near_copies > 1 && !atomic_read(&rdev->nr_pending)) {
-			disk = ndisk;
-			slot = nslot;
+		if (conf->near_copies > 1 && !atomic_read(&rdev->nr_pending))
 			break;
-		}
 
 		/* for far > 1 always use the lowest address */
 		if (conf->far_copies > 1)
-			new_distance = r10_bio->devs[nslot].addr;
+			new_distance = r10_bio->devs[slot].addr;
 		else
-			new_distance = abs(r10_bio->devs[nslot].addr -
-					   conf->mirrors[ndisk].head_position);
-		if (new_distance < current_distance) {
-			current_distance = new_distance;
-			disk = ndisk;
-			slot = nslot;
+			new_distance = abs(r10_bio->devs[slot].addr -
+					   conf->mirrors[disk].head_position);
+		if (new_distance < best_dist) {
+			best_dist = new_distance;
+			best_slot = slot;
 		}
 	}
+	if (slot == conf->copies)
+		slot = best_slot;
 
-rb_out:
-	r10_bio->read_slot = slot;
-/*	conf->next_seq_sect = this_sector + sectors;*/
-
-	if (disk >= 0 && (rdev=rcu_dereference(conf->mirrors[disk].rdev))!= NULL)
-		atomic_inc(&conf->mirrors[disk].rdev->nr_pending);
-	else
+	if (slot >= 0) {
+		disk = r10_bio->devs[slot].devnum;
+		rdev = rcu_dereference(conf->mirrors[disk].rdev);
+		if (!rdev)
+			goto retry;
+		atomic_inc(&rdev->nr_pending);
+		if (test_bit(Faulty, &rdev->flags)) {
+			/* Cannot risk returning a device that failed
+			 * before we inc'ed nr_pending
+			 */
+			rdev_dec_pending(rdev, conf->mddev);
+			goto retry;
+		}
+		r10_bio->read_slot = slot;
+	} else
 		disk = -1;
 	rcu_read_unlock();
 


