linux-raid.vger.kernel.org archive mirror
* Problem with disk replacement
       [not found] <45ae92bb.4ffb.13ff2092e1d.Coremail.13691222965@163.com>
@ 2013-07-18 16:46 ` qindehua
  2013-07-22  3:04   ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: qindehua @ 2013-07-18 16:46 UTC (permalink / raw)
  To: linux-raid, neilb

Hi Neil,
I am using kernel 3.9.6 and found that data may be lost while replacing a RAID disk using --replace.

Here are the steps of the test case:
1. Create a 3-drive RAID5 array and wait for the resync to finish (I use 1GB disks in a VirtualBox machine)
2. Calculate the md5sum of the whole array: dd if=/dev/md1 bs=1M iflag=direct | md5sum
3. Add two spare disks to the array: mdadm /dev/md1 -a /dev/sdi -a /dev/sdj
4. Use mdadm --replace on one disk, then --fail another:
   mdadm /dev/md1 --replace /dev/sdb; sleep 3; mdadm /dev/md1 -f /dev/sdc
5. Wait for the recovery to finish
6. Recalculate the md5sum of the whole array: dd if=/dev/md1 bs=1M iflag=direct | md5sum

The md5sum from step 6 is NOT identical to that of step 2, which means the data was corrupted.
I found that in step 4, when the second disk was failed, the replacement process stopped and
recovery of the failed disk started; afterwards the replacement did not continue.

Regards,
Qin Dehua

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-18 16:46 ` Problem with disk replacement qindehua
@ 2013-07-22  3:04   ` NeilBrown
  2013-07-22 10:23     ` qindehua
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2013-07-22  3:04 UTC (permalink / raw)
  To: qindehua; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3822 bytes --]

On Fri, 19 Jul 2013 00:46:22 +0800 (CST) qindehua <13691222965@163.com> wrote:

> Hi Neil,
> I am using kernel 3.9.6 and found that data may be lost while replacing a RAID disk using --replace.
> 
> Here are the steps of the test case:
> 1. Create a 3-drive RAID5 array and wait for the resync to finish (I use 1GB disks in a VirtualBox machine)
> 2. Calculate the md5sum of the whole array: dd if=/dev/md1 bs=1M iflag=direct | md5sum
> 3. Add two spare disks to the array: mdadm /dev/md1 -a /dev/sdi -a /dev/sdj
> 4. Use mdadm --replace on one disk, then --fail another:
>    mdadm /dev/md1 --replace /dev/sdb; sleep 3; mdadm /dev/md1 -f /dev/sdc
> 5. Wait for the recovery to finish
> 6. Recalculate the md5sum of the whole array: dd if=/dev/md1 bs=1M iflag=direct | md5sum
> 
> The md5sum from step 6 is NOT identical to that of step 2, which means the data was corrupted.
> I found that in step 4, when the second disk was failed, the replacement process stopped and
> recovery of the failed disk started; afterwards the replacement did not continue.

Hi Qin,
 thanks for the report.  I can easily reproduce the bug.

I think this will fix it.  Could you please test and confirm that it fixes
the problem for you too?

Thanks,
NeilBrown


From: NeilBrown <neilb@suse.de>
Date: Mon, 22 Jul 2013 12:57:21 +1000
Subject: [PATCH] md/raid5: fix interaction of 'replace' and 'recovery'.

If a device in a RAID4/5/6 is being replaced while another is being
recovered, then the writes to the replacement device currently don't
happen, resulting in corruption when the replacement completes and the
new drive takes over.

This is because the replacement writes are only triggered when
's.replacing' is set and not when the similar 's.sync' is set (which
is the case during resync and recovery - it means all devices need to
be read).

So schedule those writes when s.replacing is set as well.

In this case we cannot use "STRIPE_INSYNC" to record that the
replacement has happened as that is needed for recording that any
parity calculation is complete.  So introduce STRIPE_REPLACED to
record if the replacement has happened.

This bug was introduced in commit 9a3e1101b827a59ac9036a672f5fa8d5279d0fe2
(md/raid5:  detect and handle replacements during recovery.)
which introduced replacement for raid5.
That was in 3.3-rc3, so any stable kernel since then would benefit
from this fix.

Cc: stable@vger.kernel.org (3.3+)
Reported-by: qindehua <13691222965@163.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2bf094a..f0aa7abd 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3607,8 +3607,8 @@ static void handle_stripe(struct stripe_head *sh)
 			handle_parity_checks5(conf, sh, &s, disks);
 	}
 
-	if (s.replacing && s.locked == 0
-	    && !test_bit(STRIPE_INSYNC, &sh->state)) {
+	if ((s.replacing || s.syncing) && s.locked == 0
+	    && !test_bit(STRIPE_REPLACED, &sh->state)) {
 		/* Write out to replacement devices where possible */
 		for (i = 0; i < conf->raid_disks; i++)
 			if (test_bit(R5_UPTODATE, &sh->dev[i].flags) &&
@@ -3617,7 +3617,9 @@ static void handle_stripe(struct stripe_head *sh)
 				set_bit(R5_LOCKED, &sh->dev[i].flags);
 				s.locked++;
 			}
-		set_bit(STRIPE_INSYNC, &sh->state);
+		if (s.replacing)
+			set_bit(STRIPE_INSYNC, &sh->state);
+		set_bit(STRIPE_REPLACED, &sh->state);
 	}
 	if ((s.syncing || s.replacing) && s.locked == 0 &&
 	    test_bit(STRIPE_INSYNC, &sh->state)) {
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index b0b663b..70c4932 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -306,6 +306,7 @@ enum {
 	STRIPE_SYNC_REQUESTED,
 	STRIPE_SYNCING,
 	STRIPE_INSYNC,
+	STRIPE_REPLACED,
 	STRIPE_PREREAD_ACTIVE,
 	STRIPE_DELAYED,
 	STRIPE_DEGRADED,

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-22  3:04   ` NeilBrown
@ 2013-07-22 10:23     ` qindehua
  2013-07-23  3:21       ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: qindehua @ 2013-07-22 10:23 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,
I have tested with this patch, and it fixes the problem.
I have also tried various combinations of failing and replacing disks,
including echoing "idle" to /sys/block/md1/md/sync_action, and they all work fine.

Regards,
Qin Dehua

At 2013-07-22 11:04:15,NeilBrown <neilb@suse.de> wrote:
>Hi Qin,
> thanks for the report.  I can easily reproduce the bug.
>
>I think this will fix it.  Could you please test and confirm that it fixes
>the problem for you too?
>
>Thanks,
>NeilBrown

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-22 10:23     ` qindehua
@ 2013-07-23  3:21       ` NeilBrown
  2013-07-24 12:48         ` qindehua
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2013-07-23  3:21 UTC (permalink / raw)
  To: qindehua; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 3549 bytes --]

On Mon, 22 Jul 2013 18:23:57 +0800 (CST) qindehua <qindehua@163.com> wrote:

> Hi Neil,
> I have tested with this patch, and it fixes the problem.
> I have also tried various combinations of failing and replacing disks,
> including echoing "idle" to /sys/block/md1/md/sync_action, and they all work fine.
> 
> Regards,
> Qin Dehua

Thanks for confirming, and for all of your testing!

I'll forward the patch to Linus shortly.

NeilBrown




[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-23  3:21       ` NeilBrown
@ 2013-07-24 12:48         ` qindehua
  2013-07-25  6:45           ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: qindehua @ 2013-07-24 12:48 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,
I found another problem: the raid5 thread may run into an endless loop in a rare circumstance.

I have tested both with and without the patch "md/raid5: fix interaction of 'replace' and 'recovery'";
the problem does not happen without the patch, so it is the patch that introduces this problem.

Here are the quick steps to reproduce the problem:
1. Create a 3-drive RAID5 array with --assume-clean (I use 1GB disks in a VirtualBox machine)
   mdadm -C /dev/md1 -l 5 -n 3 --assume-clean /dev/sd[b-d]

2. Change speed_limit_min and speed_limit_max to small numbers:
   echo 1 > /proc/sys/dev/raid/speed_limit_min
   echo 10 > /proc/sys/dev/raid/speed_limit_max

3. Add three spare disks to the array:
   mdadm /dev/md1 -a /dev/sde -a /dev/sdf -a /dev/sdg

4. Replace the three array disks with this procedure:
   for disk in sdb sdc sdd
   do
       mdadm /dev/md1 --replace /dev/$disk
       echo "idle" > /sys/block/md1/md/sync_action
       sleep 3
   done

After step 4 the md1_raid5 process ran into an endless loop with 99% CPU usage.
The problem does not happen if step 2 is skipped, nor if only two disks are replaced.

Regards,
Qin Dehua

At 2013-07-23 11:21:24,NeilBrown <neilb@suse.de> wrote:
>On Mon, 22 Jul 2013 18:23:57 +0800 (CST) qindehua <qindehua@163.com> wrote:
>
>> Hi Neil,
>> I have tested with this patch, and it fixes the problem.
>> I have also tried various combinations of failing and replacing disks,
>> with echo idle to /sys/block/md1/md/sync_action, they all work fine.
>> 
>> Regards,
>> Qin Dehua
>
>Thanks for confirming, and for all of your testing!
>
>I'll forward the patch to Linus shortly.
>
>NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-24 12:48         ` qindehua
@ 2013-07-25  6:45           ` NeilBrown
  2013-07-25  7:32             ` NeilBrown
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2013-07-25  6:45 UTC (permalink / raw)
  To: qindehua; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 6721 bytes --]

On Wed, 24 Jul 2013 20:48:27 +0800 (CST) qindehua <qindehua@163.com> wrote:

> Hi Neil,
> I found another problem: the raid5 thread may run into an endless loop in a rare circumstance.
> 
> I have tested both with and without the patch "md/raid5: fix interaction of 'replace' and 'recovery'";
> the problem does not happen without the patch, so it is the patch that introduces this problem.
> 
> Here are the quick steps to reproduce the problem:
> 1. Create a 3-drive RAID5 array with --assume-clean (I use 1GB disks in a VirtualBox machine)
>    mdadm -C /dev/md1 -l 5 -n 3 --assume-clean /dev/sd[b-d]
> 
> 2. Change speed_limit_min and speed_limit_max to small numbers:
>    echo 1 > /proc/sys/dev/raid/speed_limit_min
>    echo 10 > /proc/sys/dev/raid/speed_limit_max
> 
> 3. Add three spare disks to the array:
>    mdadm /dev/md1 -a /dev/sde -a /dev/sdf -a /dev/sdg
> 
> 4. Replace the three array disks with this procedure:
>    for disk in sdb sdc sdd
>    do
>        mdadm /dev/md1 --replace /dev/$disk
>        echo "idle" > /sys/block/md1/md/sync_action
>        sleep 3
>    done
> 
> After step 4 the md1_raid5 process ran into an endless loop with 99% CPU usage.
> The problem does not happen if step 2 is skipped, nor if only two disks are replaced.

Thanks a lot for the testing!

This was missing from the previous patch:

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 1c3b279..e6e24c3 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3524,6 +3524,7 @@ static void handle_stripe(struct stripe_head *sh)
 		    test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
 			set_bit(STRIPE_SYNCING, &sh->state);
 			clear_bit(STRIPE_INSYNC, &sh->state);
+			clear_bit(STRIPE_REPLACED, &sh->state);
 		}
 		spin_unlock(&sh->stripe_lock);
 	}


and that omission caused the problem you saw.

However after looking at the code more closely I'm not sure that is the only
problem.  So I won't push out a revised patch until I am sure.
Maybe I don't actually need the new "STRIPE_REPLACED" flag after all.

Thanks,
NeilBrown




[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 828 bytes --]

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-25  6:45           ` NeilBrown
@ 2013-07-25  7:32             ` NeilBrown
  2013-08-16 10:26               ` qindehua
  0 siblings, 1 reply; 9+ messages in thread
From: NeilBrown @ 2013-07-25  7:32 UTC (permalink / raw)
  To: qindehua; +Cc: linux-raid

[-- Attachment #1: Type: text/plain, Size: 6447 bytes --]

On Thu, 25 Jul 2013 16:45:31 +1000 NeilBrown <neilb@suse.de> wrote:

> Thanks a lot for the testing!
> 
> This was missing from the previous patch:
> 
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 1c3b279..e6e24c3 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -3524,6 +3524,7 @@ static void handle_stripe(struct stripe_head *sh)
>  		    test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
>  			set_bit(STRIPE_SYNCING, &sh->state);
>  			clear_bit(STRIPE_INSYNC, &sh->state);
> +			clear_bit(STRIPE_REPLACED, &sh->state);
>  		}
>  		spin_unlock(&sh->stripe_lock);
>  	}
> 
> 
> and that omission caused the problem you saw.
> 
> However after looking at the code more closely I'm not sure that is the only
> problem.  So I won't push out a revised patch until I am sure.
> Maybe I don't actually need the new "STRIPE_REPLACED" flag after all.
> 

Hi again.

Here is the patch that I think is correct and hope to submit tomorrow.

NeilBrown


From f94c0b6658c7edea8bc19d13be321e3860a3fa54 Mon Sep 17 00:00:00 2001
From: NeilBrown <neilb@suse.de>
Date: Mon, 22 Jul 2013 12:57:21 +1000
Subject: [PATCH] md/raid5: fix interaction of 'replace' and 'recovery'.

If a device in a RAID4/5/6 is being replaced while another is being
recovered, then the writes to the replacement device currently don't
happen, resulting in corruption when the replacement completes and the
new drive takes over.

This is because the replacement writes are only triggered when
's.replacing' is set and not when the similar 's.sync' is set (which
is the case during resync and recovery - it means all devices need to
be read).

So schedule those writes when s.replacing is set as well.

In this case we cannot use "STRIPE_INSYNC" to record that the
replacement has happened as that is needed for recording that any
parity calculation is complete.  So introduce STRIPE_REPLACED to
record if the replacement has happened.

For safety we should also check that STRIPE_COMPUTE_RUN is not set.
This has a similar effect to the "s.locked == 0" test.  The latter
ensures that no IO has been flagged but not yet started.  The former
checks whether any parity calculation has been flagged but not yet
started.  We must wait for both of these to complete before
triggering the 'replace'.

Add a similar test to the subsequent check for "are we finished yet".
This possibly isn't needed (it is subsumed in the STRIPE_INSYNC test),
but it makes it more obvious that the REPLACE will happen before we
think we are finished.

Finally, if a NeedReplace device is not UPTODATE then that is an
error, and we really must trigger a warning.

This bug was introduced in commit 9a3e1101b827a59ac9036a672f5fa8d5279d0fe2
(md/raid5:  detect and handle replacements during recovery.)
which introduced replacement for raid5.
That was in 3.3-rc3, so any stable kernel since then would benefit
from this fix.

Cc: stable@vger.kernel.org (3.3+)
Reported-by: qindehua <13691222965@163.com>
Tested-by: qindehua <qindehua@163.com>
Signed-off-by: NeilBrown <neilb@suse.de>

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2bf094a..78ea443 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3462,6 +3462,7 @@ static void handle_stripe(struct stripe_head *sh)
 		    test_and_clear_bit(STRIPE_SYNC_REQUESTED, &sh->state)) {
 			set_bit(STRIPE_SYNCING, &sh->state);
 			clear_bit(STRIPE_INSYNC, &sh->state);
+			clear_bit(STRIPE_REPLACED, &sh->state);
 		}
 		spin_unlock(&sh->stripe_lock);
 	}
@@ -3607,19 +3608,23 @@ static void handle_stripe(struct stripe_head *sh)
 			handle_parity_checks5(conf, sh, &s, disks);
 	}
 
-	if (s.replacing && s.locked == 0
-	    && !test_bit(STRIPE_INSYNC, &sh->state)) {
+	if ((s.replacing || s.syncing) && s.locked == 0
+	    && !test_bit(STRIPE_COMPUTE_RUN, &sh->state)
+	    && !test_bit(STRIPE_REPLACED, &sh->state)) {
 		/* Write out to replacement devices where possible */
 		for (i = 0; i < conf->raid_disks; i++)
-			if (test_bit(R5_UPTODATE, &sh->dev[i].flags) &&
-			    test_bit(R5_NeedReplace, &sh->dev[i].flags)) {
+			if (test_bit(R5_NeedReplace, &sh->dev[i].flags)) {
+				WARN_ON(!test_bit(R5_UPTODATE, &sh->dev[i].flags));
 				set_bit(R5_WantReplace, &sh->dev[i].flags);
 				set_bit(R5_LOCKED, &sh->dev[i].flags);
 				s.locked++;
 			}
-		set_bit(STRIPE_INSYNC, &sh->state);
+		if (s.replacing)
+			set_bit(STRIPE_INSYNC, &sh->state);
+		set_bit(STRIPE_REPLACED, &sh->state);
 	}
 	if ((s.syncing || s.replacing) && s.locked == 0 &&
+	    !test_bit(STRIPE_COMPUTE_RUN, &sh->state) &&
 	    test_bit(STRIPE_INSYNC, &sh->state)) {
 		md_done_sync(conf->mddev, STRIPE_SECTORS, 1);
 		clear_bit(STRIPE_SYNCING, &sh->state);
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index b0b663b..70c4932 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -306,6 +306,7 @@ enum {
 	STRIPE_SYNC_REQUESTED,
 	STRIPE_SYNCING,
 	STRIPE_INSYNC,
+	STRIPE_REPLACED,
 	STRIPE_PREREAD_ACTIVE,
 	STRIPE_DELAYED,
 	STRIPE_DEGRADED,


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: Problem with disk replacement
  2013-07-25  7:32             ` NeilBrown
@ 2013-08-16 10:26               ` qindehua
  0 siblings, 0 replies; 9+ messages in thread
From: qindehua @ 2013-08-16 10:26 UTC (permalink / raw)
  To: NeilBrown; +Cc: linux-raid

Hi Neil,

There is a WARN_ON in the patch "md/raid5: fix interaction of 'replace' and 'recovery'"
that can be triggered by the following test. Is this a problem?

Here are the steps to produce the WARNING messages:
1. create 3-drives RAID5 with --force
   mdadm -C /dev/md1 --force -l 5 -n 3 /dev/sd[b-d]

2. while the raid is resyncing, replace one disk with the following commands:
   mdadm /dev/md1 -a /dev/sde
   mdadm /dev/md1 --replace /dev/sdb
   echo "idle" > /sys/block/md1/md/sync_action

After a while the raid5 module prints many WARNING messages from the WARN_ON introduced by the patch "md/raid5: fix interaction of 'replace' and 'recovery'". The code context is:

	if ((s.replacing || s.syncing) && s.locked == 0
	    && !test_bit(STRIPE_COMPUTE_RUN, &sh->state)
	    && !test_bit(STRIPE_REPLACED, &sh->state)) {
		/* Write out to replacement devices where possible */
		for (i = 0; i < conf->raid_disks; i++)
			if (test_bit(R5_NeedReplace, &sh->dev[i].flags)) {
				WARN_ON(!test_bit(R5_UPTODATE, &sh->dev[i].flags));
				set_bit(R5_WantReplace, &sh->dev[i].flags);
				set_bit(R5_LOCKED, &sh->dev[i].flags);
				s.locked++;
			}

The WARNING messages:
[ 1667.432071] ------------[ cut here ]------------
[ 1667.433522] WARNING: at drivers/md/raid5.c:3611 handle_stripe+0x1175/0x1e15()
[ 1667.435566] Hardware name: VirtualBox
[ 1667.438395] Modules linked in: iscsi_scst(O) scst_vdisk(O) scst(O) device_buffer(O) mem_mgnt(PO) fuse
[ 1667.441955] Pid: 4670, comm: md1_raid5 Tainted: P        W  O 3.9.6 #1
[ 1667.443551] Call Trace:
[ 1667.445093]  [<ffffffff8103f0ea>] warn_slowpath_common+0x7a/0xc0
[ 1667.445093]  [<ffffffff8103f145>] warn_slowpath_null+0x15/0x20
[ 1667.447816]  [<ffffffff815d8479>] handle_stripe+0x1175/0x1e15
[ 1667.449638]  [<ffffffff812fe5da>] ? rb_erase_init+0x23/0x30
[ 1667.451527]  [<ffffffff81126572>] ? kmem_cache_alloc+0x62/0x140
[ 1667.453506]  [<ffffffff812e8b52>] ? sg_set_page+0x29/0x3f
[ 1667.455304]  [<ffffffff812e9445>] ? __blk_segment_map_sg+0x209/0x22c
[ 1667.457224]  [<ffffffff812e96fa>] ? blk_rq_map_sg+0x292/0x29e
[ 1667.459684]  [<ffffffff814aa2c2>] ? ahci_qc_prep+0x132/0x180
[ 1667.461397]  [<ffffffff81859699>] ? _raw_spin_unlock_irqrestore+0x9/0x10
[ 1667.462912]  [<ffffffff815e13b4>] ? test_ti_thread_flag+0x24/0x26
[ 1667.464560]  [<ffffffff815e1952>] ? test_tsk_thread_flag+0x24/0x26
[ 1667.466207]  [<ffffffff815e1971>] ? signal_pending+0x1d/0x27
[ 1667.468156]  [<ffffffff815fba4e>] ? md_check_recovery+0x60/0xcc3
[ 1667.470134]  [<ffffffff81455292>] ? put_device+0x12/0x20
[ 1667.471449]  [<ffffffff81472c2a>] ? scsi_request_fn+0x9a/0x4a0
[ 1667.473183]  [<ffffffff812fd3f9>] ? blk_rq_sectors+0x18/0x1d
[ 1667.475422]  [<ffffffff81077ebd>] ? dequeue_task_fair+0x40d/0x920
[ 1667.477118]  [<ffffffff810744b0>] ? set_next_entity+0xa0/0xc0
[ 1667.477118]  [<ffffffff810754b3>] ? pick_next_task_fair+0x63/0x140
[ 1667.480396]  [<ffffffff815c9b0e>] ? list_del_init+0x24/0x26
[ 1667.483181]  [<ffffffff815f7907>] ? md_revalidate+0x30/0x30
[ 1667.484935]  [<ffffffff815dc76c>] handle_active_stripes+0x7f/0xeb
[ 1667.486274]  [<ffffffff815dc93c>] raid5d+0x164/0x20a
[ 1667.488076]  [<ffffffff815f7b9b>] md_thread+0x294/0x2b3
[ 1667.488076]  [<ffffffff81062750>] ? add_wait_queue+0x60/0x60
[ 1667.491532]  [<ffffffff81061ebb>] kthread+0xbb/0xc0
[ 1667.493266]  [<ffffffff81061e00>] ? flush_kthread_work+0x120/0x120
[ 1667.494953]  [<ffffffff8186122c>] ret_from_fork+0x7c/0xb0
[ 1667.496337]  [<ffffffff81061e00>] ? flush_kthread_work+0x120/0x120
[ 1667.498159] ---[ end trace 6c53bec353931415 ]---

Regards,
Qin Dehua

At 2013-07-25 15:32:53,NeilBrown <neilb@suse.de> wrote:
>On Thu, 25 Jul 2013 16:45:31 +1000 NeilBrown <neilb@suse.de> wrote:
>
>Hi again.
>
>Here is the patch which I think is correct and I hope to submit tomorrow.
>
>NeilBrown
>
>
>From f94c0b6658c7edea8bc19d13be321e3860a3fa54 Mon Sep 17 00:00:00 2001
>From: NeilBrown <neilb@suse.de>
>Date: Mon, 22 Jul 2013 12:57:21 +1000
>Subject: [PATCH] md/raid5: fix interaction of 'replace' and 'recovery'.
>
>If a device in a RAID4/5/6 is being replaced while another is being
>recovered, then the writes to the replacement device currently don't
>happen, resulting in corruption when the replacement completes and the
>new drive takes over.
>
>This is because the replacement writes are only triggered when
>'s.replacing' is set and not when the similar 's.sync' is set (which
>is the case during resync and recovery - it means all devices need to
>be read).
>
>So schedule those writes when s.replacing is set as well.
>
>In this case we cannot use "STRIPE_INSYNC" to record that the
>replacement has happened as that is needed for recording that any
>parity calculation is complete.  So introduce STRIPE_REPLACED to
>record if the replacement has happened.
>
>For safety we should also check that STRIPE_COMPUTE_RUN is not set.
>This has a similar effect to the "s.locked == 0" test.  The latter
>ensures that no IO has been flagged but not started.  The former
>checks that no parity calculation has been flagged but not started.
>We must wait for both of these to complete before triggering the
>'replace'.
>
>[remainder of commit message and patch snipped; quoted in full earlier in the thread]
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2013-08-16 10:26 UTC | newest]

Thread overview: 9+ messages
     [not found] <45ae92bb.4ffb.13ff2092e1d.Coremail.13691222965@163.com>
2013-07-18 16:46 ` Problem with disk replacement qindehua
2013-07-22  3:04   ` NeilBrown
2013-07-22 10:23     ` qindehua
2013-07-23  3:21       ` NeilBrown
2013-07-24 12:48         ` qindehua
2013-07-25  6:45           ` NeilBrown
2013-07-25  7:32             ` NeilBrown
2013-08-16 10:26               ` qindehua
2013-07-18 16:57 Qin Dehua
