From: NeilBrown <neilb@suse.de>
To: jiao hui <jiaohui@bwstor.com.cn>
Cc: linux-raid@vger.kernel.org, guomingyang@nrchpc.ac.cn,
zhaomeng@bwstor.com.cn
Subject: Re: [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption
Date: Tue, 29 Jul 2014 12:44:15 +1000
Message-ID: <20140729124415.60ecdf3d@notabene.brown>
In-Reply-To: <1406534973.21454.3.camel@fedws>
On Mon, 28 Jul 2014 16:09:33 +0800 jiao hui <jiaohui@bwstor.com.cn> wrote:
> From 1fdbfb8552c00af55d11d7a63cdafbdf1749ff63 Mon Sep 17 00:00:00 2001
> From: Jiao Hui <simonjiaoh@gmail.com>
> Date: Mon, 28 Jul 2014 11:57:20 +0800
> Subject: [PATCH] md/raid1: always set MD_RECOVERY_INTR flag in raid1 error handler to avoid potential data corruption
>
> In the recovery of raid1 with a bitmap, if a bitmap bit has the NEEDED or RESYNC flag set,
> actual resync I/O will happen. The sync_thread checks each rdev; if any rdev is missing
> or has the FAULTY flag, the array is still_degraded and the NEEDED flag of the bitmap bit
> is not cleared. Otherwise, the NEEDED flag is cleared and the RESYNC flag is set. The RESYNC
> flag is cleared in bitmap_cond_end_sync or bitmap_close_sync.
>
> If the only disk being recovered fails again while raid1 recovery is in progress,
> the resync_thread can't find a non-In_sync disk to write to, so the remaining recovery is skipped.
> The RAID1 error handler only sets the MD_RECOVERY_INTR flag when an In_sync disk fails. But the disk
> being recovered is non-In_sync, so md_do_sync does not get the INTR signal to break out, and
> mddev->curr_resync is updated to max_sectors (mddev->dev_sectors). When the raid1 personality
> tries to finish the resync process, no bitmap bit with the RESYNC flag is set back to NEEDED,
> and bitmap_close_sync clears the RESYNC flag. When the disk is added back, the area from
> the offset of the last recovery to the end of the bitmap chunk is skipped by the resync_thread forever.
>
> Signed-off-by: JiaoHui <jiaohui@bwstor.com.cn>
>
> ---
> drivers/md/raid1.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index aacf6bf..51d06eb 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -1391,16 +1391,16 @@ static void error(struct mddev *mddev, struct md_rdev *rdev)
>  		return;
>  	}
>  	set_bit(Blocked, &rdev->flags);
> +	/*
> +	 * if recovery is running, make sure it aborts.
> +	 */
> +	set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>  	if (test_and_clear_bit(In_sync, &rdev->flags)) {
>  		unsigned long flags;
>  		spin_lock_irqsave(&conf->device_lock, flags);
>  		mddev->degraded++;
>  		set_bit(Faulty, &rdev->flags);
>  		spin_unlock_irqrestore(&conf->device_lock, flags);
> -		/*
> -		 * if recovery is running, make sure it aborts.
> -		 */
> -		set_bit(MD_RECOVERY_INTR, &mddev->recovery);
>  	} else
>  		set_bit(Faulty, &rdev->flags);
>  	set_bit(MD_CHANGE_DEVS, &mddev->flags);
Hi,
thanks for the report and the patch.
If the recovery process gets a write error it will abort the current bitmap
region by calling bitmap_end_sync() in end_sync_write().
However, you are talking about a different situation, where a normal IO write
gets an error and fails a drive. Then the recovery aborts without aborting
the current bitmap region.
I think I would rather fix the bug by calling bitmap_end_sync() at the place
where the recovery decides to abort, as in the following patch.
Would you be able to test it please and confirm that it works?
A similar fix will probably be needed for raid10.
Thanks,
NeilBrown
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 56e24c072b62..4f007a410f4b 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -2668,9 +2668,11 @@ static sector_t sync_request(struct mddev *mddev, sector_t sector_nr, int *skipp
 	if (write_targets == 0 || read_targets == 0) {
 		/* There is nowhere to write, so all non-sync
-		 * drives must be failed - so we are finished
+		 * drives must be failed - so we are finished.
+		 * But abort the current bitmap region though.
 		 */
 		sector_t rv;
+		bitmap_end_sync(mddev->bitmap, sector_nr, &sync_blocks, 1);
 		if (min_bad > 0)
 			max_sector = sector_nr + min_bad;
 		rv = max_sector - sector_nr;
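
As a rough illustration of the bitmap states discussed in this thread, the
following is a toy model in plain C, not kernel code. The struct and the
toy_* helpers are hypothetical names invented for this sketch; they only
mimic the NEEDED/RESYNC transitions described above, under the assumption
that ending a sync with "aborted" set puts NEEDED back on a chunk that had
RESYNC set.

```c
#include <stdbool.h>

/* Toy per-chunk state. The field names echo the NEEDED/RESYNC flags from
 * the thread; this is NOT the kernel's bitmap counter layout. */
struct toy_chunk {
	bool needed;	/* chunk still has to be resynced */
	bool resync;	/* a resync of this chunk has been started */
};

/* Hypothetical stand-in for starting sync on a chunk.
 * Returns true if the chunk requires resync I/O. */
static bool toy_start_sync(struct toy_chunk *c, bool still_degraded)
{
	if (c->resync)
		return true;
	if (c->needed) {
		if (!still_degraded) {
			/* NEEDED -> RESYNC only when no device is missing */
			c->resync = true;
			c->needed = false;
		}
		return true;
	}
	return false;
}

/* Hypothetical stand-in for ending sync on a chunk. */
static void toy_end_sync(struct toy_chunk *c, bool aborted)
{
	if (!c->resync)
		return;
	c->resync = false;
	if (aborted)
		c->needed = true;	/* remember that the chunk is unfinished */
}

/* Models a recovery that stops partway through one chunk, then reports
 * whether that chunk would be resynced when the disk is added back. */
static bool toy_chunk_retried(bool abort_on_stop)
{
	struct toy_chunk c = { .needed = true, .resync = false };

	toy_start_sync(&c, false);		/* recovery begins on this chunk */
	/* ... the target disk fails here, chunk only partly written ... */
	toy_end_sync(&c, abort_on_stop);	/* recovery gives up */
	return toy_start_sync(&c, false);	/* disk re-added later */
}
```

In this model toy_chunk_retried(false) returns false, which corresponds to
the reported bug (the partly recovered chunk is never revisited), while
toy_chunk_retried(true) returns true, which corresponds to ending the
current region as aborted before giving up.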