Re: [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption

linux-raid.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: NeilBrown <neilb@suse.de>
To: jiao hui <jiaohui@bwstor.com.cn>
Cc: linux-raid <linux-raid@vger.kernel.org>,
	guomingyang <guomingyang@nrchpc.ac.cn>,
	zhaomeng <zhaomeng@bwstor.com.cn>
Subject: Re: [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption
Date: Wed, 30 Jul 2014 13:39:29 +1000	[thread overview]
Message-ID: <20140730133929.7798e250@notabene.brown> (raw)
In-Reply-To: <CAEuG+XJhbjGDWhFTdX0h+be-TNVUQUfgWPZVcaxwT4wvzoPbLA@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 3419 bytes --]

On Tue, 29 Jul 2014 14:50:04 +0800 jiao hui <jiaohui@bwstor.com.cn> wrote:

> Hi neil,
> 
> The patch works. I test it on Centos 7.0 for fifty rounds， no
> consistency issue found。
> 
> Best Regards.
> jiaohui
> 

Thanks for testing.

I looked again and compared with raid10 and decided that your fix actually
was better.  If you are recovering two drives at once and only one fails, you
patch will do the right thing, but mine won't.

I'll be submitting the following.

Thanks,
NeilBrown

From b628438e59827e710df20c27fea680cbe1870272 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Wed, 30 Jul 2014 13:24:50 +1000
Subject: [PATCH] md/raid1,raid10: always abort recover on write error.

Currently we don't abort recovery on a write error if the write error
to the recovering device was triggerd by normal IO (as opposed to
recovery IO).

This means that for one bitmap region, the recovery might write to the
recovering device for a few sectors, then not bother for subsequent
sectors (as it never writes to failed devices).  In this case
the bitmap bit will be cleared, but it really shouldn't.

The result is that if the recovering device fails and is then re-added
(after fixing whatever hardware problem triggerred the failure),
the second recovery won't redo the region it was in the middle of,
so some of the device will not be recovered properly.

If we abort the recovery, the region being processes will be cancelled
(bit not cleared) and the whole region will be retried.

As the bug can result in data corruption the patch is suitable for
-stable.  For kernels prior to 3.11 there is a conflict in raid10.c
which will require care.

Original-from: jiao hui <jiaohui@bwstor.com.cn>
Reported-and-tested-by: jiao hui <jiaohui@bwstor.com.cn>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@vger.kernel.org

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 56e24c072b62..d7690f86fdb9 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1501,12 +1501,12 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
 		mddev->degraded++;
 		set_bit(Faulty, &rdev->flags);
 		spin_unlock_irqrestore(&conf->device_lock, flags);
-		/*
-		 * if recovery is running, make sure it aborts.
-		 */
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	} else
 		set_bit(Faulty, &rdev->flags);
+	/*
+	 * if recovery is running, make sure it aborts.
+	 */
+	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);
 	printk(KERN_ALERT
 	       "md/raid1:%s: Disk failure on %s, disabling device.\n"
diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index cb882aae9e20..b08c18871323 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1684,13 +1684,12 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
 		spin_unlock_irqrestore(&conf->device_lock, flags);
 		return;
 	}
-	if (test_and_clear_bit(In_sync, &rdev->flags)) {
+	if (test_and_clear_bit(In_sync, &rdev->flags))
 		mddev->degraded++;
-			/*
-		 * if recovery is running, make sure it aborts.
-		 */
-		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
-	}
+	/*
+	 * If recovery is running, make sure it aborts.
+	 */
+	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 	set_bit(Blocked, &rdev->flags);
 	set_bit(Faulty, &rdev->flags);
 	set_bit(MD_CHANGE_DEVS, &mddev->flags);

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

     prev parent reply	other threads:[~2014-07-30  3:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-28  8:09 [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption jiao hui
2014-07-28  8:23 ` jiao hui
2014-07-29  2:44 ` NeilBrown
2014-07-29  6:50   ` jiao hui
2014-07-30  3:39     ` NeilBrown [this message]

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:56e24c072b6 dfblob:d7690f86fdb dfblob:cb882aae9e2
dfblob:b08c1887132 )
 OR (
bs:"md/raid1,raid10: always abort recover on write error." )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140730133929.7798e250@notabene.brown \
    --to=neilb@suse.de \
    --cc=guomingyang@nrchpc.ac.cn \
    --cc=jiaohui@bwstor.com.cn \
    --cc=linux-raid@vger.kernel.org \
    --cc=zhaomeng@bwstor.com.cn \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).