From: NeilBrown <neilb@suse.de>
To: linux-raid@vger.kernel.org
Subject: [md PATCH 05/26] md/raid5: need_this_block: tidy/fix last condition.
Date: Wed, 04 Feb 2015 08:42:28 +1100
Message-ID: <20150203214228.3448.70867.stgit@notabene.brown>
In-Reply-To: <20150203213948.3448.80258.stgit@notabene.brown>

The last condition in need_this_block() is unclear and overcautious.

There are two related issues here.

If a partial write is destined for a missing device, then
neither RMW nor RCW can work directly.  We must first read all
the available blocks.  Only then can the missing blocks be
calculated, and then the parity update performed.
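
As an illustration only (this is not code from raid5.c), the sequence
can be sketched with a toy single-parity stripe: two surviving data
blocks, one missing data block, and XOR parity.  The names BLK,
xor_into() and partial_write_to_missing() are hypothetical and exist
only for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BLK 8	/* toy block size in bytes, chosen only for this sketch */

/* XOR one block into another, byte by byte. */
static void xor_into(unsigned char *dst, const unsigned char *src)
{
	size_t i;

	for (i = 0; i < BLK; i++)
		dst[i] ^= src[i];
}

/*
 * Apply a partial write (len bytes at offset off) to the block of a
 * missing device.  d0 and d2 are the surviving data blocks and parity
 * is the current XOR parity.  On return, missing_out holds the full new
 * content of the missing block and parity has been updated to match.
 */
static void partial_write_to_missing(const unsigned char *d0,
				     const unsigned char *d2,
				     unsigned char *parity,
				     unsigned char *missing_out,
				     const unsigned char *new_bytes,
				     size_t off, size_t len)
{
	assert(off + len <= BLK);

	/* 1. Reconstruct the old content: missing = P ^ d0 ^ d2. */
	memcpy(missing_out, parity, BLK);
	xor_into(missing_out, d0);
	xor_into(missing_out, d2);

	/* 2. Merge the partial write into the reconstructed block. */
	memcpy(missing_out + off, new_bytes, len);

	/* 3. Recompute parity over the complete set of data blocks. */
	memcpy(parity, d0, BLK);
	xor_into(parity, missing_out);
	xor_into(parity, d2);
}
```

Step 1 is the reason every other block has to be read: without the old
content of the missing block there is nothing to merge the partial
write into.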

If RMW is not an option, then there is a complication even
without partial writes.  If we would need to read a missing
device to perform the reconstruction, then we must first read every
available block so the missing device data can be computed.
This is the case for RAID6 (which currently does not support
RMW) and for times when we don't trust the parity (after a crash)
and so are in the process of resyncing it.

So make these two cases clearer and separate, and perform
the relevant tests more thoroughly.
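
For readers who want the two tests in isolation, here is a simplified
userspace model.  It is a sketch, not the kernel code: struct model_dev
and model_must_read_all() are hypothetical names, the booleans stand in
for the R5_UPTODATE and R5_OVERWRITE flag bits, and rcw_forced stands in
for "RAID6, or the stripe is at or beyond recovery_cp".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-device state the patch inspects. */
struct model_dev {
	bool towrite;	/* a write is queued for this device */
	bool uptodate;	/* models R5_UPTODATE */
	bool overwrite;	/* models R5_OVERWRITE (whole block is rewritten) */
};

/*
 * Return true if every available block must be read.  fdev[] describes
 * the failed devices (at most two) and failed is how many there are.
 */
static bool model_must_read_all(const struct model_dev *fdev, int failed,
				bool rcw_forced)
{
	int i;

	assert(failed >= 0 && failed <= 2);

	/* Case 1: partial write to a failed device that is not up to date. */
	for (i = 0; i < failed; i++)
		if (fdev[i].towrite && !fdev[i].uptodate && !fdev[i].overwrite)
			return true;

	/* Case 2 only applies when reconstruct-write is being forced. */
	if (!rcw_forced)
		return false;

	/* A failed device that RCW would have to read must be computed. */
	for (i = 0; i < failed; i++)
		if (!fdev[i].uptodate && !fdev[i].overwrite)
			return true;

	return false;
}
```

In this model a failed device with no queued write triggers the full
read only when rcw_forced is set, which is the distinction between the
two cases described above.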

Signed-off-by: NeilBrown <neilb@suse.de>
---
 drivers/md/raid5.c |   42 ++++++++++++++++++++++++++++++++----------
 1 file changed, 32 insertions(+), 10 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index bb42551c1a42..a03cf2d889bf 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -2902,6 +2902,7 @@ static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
 	struct r5dev *dev = &sh->dev[disk_idx];
 	struct r5dev *fdev[2] = { &sh->dev[s->failed_num[0]],
 				  &sh->dev[s->failed_num[1]] };
+	int i;
 
 
 	if (test_bit(R5_LOCKED, &dev->flags) ||
@@ -2949,16 +2950,37 @@ static int need_this_block(struct stripe_head *sh, struct stripe_head_state *s,
 		 * and there is no need to delay that.
 		 */
 		return 0;
-	if (
-	     (sh->raid_conf->level <= 5 && fdev[0]->towrite &&
-	      !test_bit(R5_OVERWRITE, &fdev[0]->flags)) ||
-	     ((sh->raid_conf->level == 6 ||
-	       sh->sector >= sh->raid_conf->mddev->recovery_cp)
-	      &&
-	      (s->to_write - s->non_overwrite <
-	       sh->raid_conf->raid_disks - sh->raid_conf->max_degraded)
-	      ))
-		return 1;
+
+	for (i = 0; i < s->failed; i++) {
+		if (fdev[i]->towrite &&
+		    !test_bit(R5_UPTODATE, &fdev[i]->flags) &&
+		    !test_bit(R5_OVERWRITE, &fdev[i]->flags))
+			/* If we have a partial write to a failed
+			 * device, then we will need to reconstruct
+			 * the content of that device, so all other
+			 * devices must be read.
+			 */
+			return 1;
+	}
+
+	/* If we are forced to do a reconstruct-write, either because
+	 * the current RAID6 implementation only supports that, or
+	 * because parity cannot be trusted and we are currently
+	 * recovering it, there is extra need to be careful.
+	 * If one of the devices that we would need to read, because
+	 * it is not being overwritten (and maybe not written at all)
+	 * is missing/faulty, then we need to read everything we can.
+	 */
+	if (sh->raid_conf->level != 6 &&
+	    sh->sector < sh->raid_conf->mddev->recovery_cp)
+		/* reconstruct-write isn't being forced */
+		return 0;
+	for (i = 0; i < s->failed; i++) {
+		if (!test_bit(R5_UPTODATE, &fdev[i]->flags) &&
+		    !test_bit(R5_OVERWRITE, &fdev[i]->flags))
+			return 1;
+	}
+
 	return 0;
 }
 