linux-raid.vger.kernel.org archive mirror
From: NeilBrown <neilb@suse.de>
To: "Kwolek, Adam" <adam.kwolek@intel.com>
Cc: "linux-raid@vger.kernel.org" <linux-raid@vger.kernel.org>
Subject: Re: Raid0 expansion problem in md
Date: Wed, 14 Dec 2011 15:42:19 +1100	[thread overview]
Message-ID: <20111214154219.6eadf590@notabene.brown> (raw)
In-Reply-To: <79556383A0E1384DB3A3903742AAC04A05F9CF@IRSMSX101.ger.corp.intel.com>


On Tue, 13 Dec 2011 15:45:30 +0000 "Kwolek, Adam" <adam.kwolek@intel.com>
wrote:

> Hi Neil,
> 
> On the latest md neil_for-linus branch I've found a raid0 migration problem.
> During OLCE everything goes fine in user space, but in the kernel the reshape process does not move forward.
> /older md works fine/
> 
> It gets stuck in md in reshape_request() at this line (near raid5.c:3957):
>     wait_event(conf->wait_for_overlap, atomic_read(&conf->reshape_stripes)==0);
> 
> I've found that this problem is a side effect of the patch:
>     md/raid5: abort any pending parity operations when array fails.
> and of the line it added:
>      sh->reconstruct_state = 0;
> 
> During OLCE we enter this branch because the condition
>     if (s.failed > conf->max_degraded)
> is true with these values:
>      locked=1 uptodate=5 to_read=0 to_write=0 failed=2 failed_num=4,1
> 
> and sh->reconstruct_state is reset from 6 (reconstruct_state_result) to 0 (reconstruct_state_idle).
> When sh->reconstruct_state is not reset, raid0 migration executes without a problem.
> The problem is probably that the code for finishing reconstruction (around raid5.c:3300) is never executed.
> 
> In our case the field s.failed should not reach 2, but we get that value with failed_num = 4,1.
> It seems that '1' is the failed disk for the stripe in the old array geometry and '4' is the failed disk for the stripe in the new array geometry.
> This means that degradation during reshape is counted twice /final stripe degradation is the sum of the old and new geometry degradation/.
> When we read (from the old geometry) and write (to the new geometry) a degraded stripe and the degradation is at different positions (the raid0 OLCE case), analyse_stripe() gives
> us false failure information. Possibly we should have old_failed and new_failed counters to know in which geometry (old/new) a failure occurs.
> 
> 
> Here is reproduction script:
> 
> export IMSM_NO_PLATFORM=1
> #create container
> mdadm -C /dev/md/imsm0 -amd -e imsm -n 4 /dev/sdb /dev/sdc /dev/sde /dev/sdd -R
> #create array
> mdadm -C /dev/md/raid0vol_0 -amd -l 0 --chunk 64 --size 1048 -n 1 /dev/sdb -R --force
> #start reshape
> mdadm --grow /dev/md/imsm0 --raid-devices 4
> 
> 
> Please let me know your opinion.
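[Adam's per-geometry counting idea above can be sketched in plain C. This is purely illustrative, not kernel code: the struct, field, and function names here are hypothetical, and real raid5.c state tracking is far more involved.]

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative sketch of separate old/new geometry failure counters,
 * so that a stripe read in the old geometry and written in the new
 * one does not see its two (different) missing positions summed into
 * a single 'failed' count.  All names are hypothetical. */
struct stripe_state {
	int old_failed;	/* failures counted against the old geometry */
	int new_failed;	/* failures counted against the new geometry */
};

static void count_failure(struct stripe_state *s, bool in_new_geometry)
{
	if (in_new_geometry)
		s->new_failed++;
	else
		s->old_failed++;
}

/* The stripe is only truly failed if one geometry alone exceeds
 * max_degraded, not the sum across both geometries. */
static bool stripe_failed(const struct stripe_state *s, int max_degraded)
{
	return s->old_failed > max_degraded ||
	       s->new_failed > max_degraded;
}
```

[With such a split, the failed=2 / failed_num=4,1 case from the report would count as one failure in each geometry, so neither geometry alone would exceed max_degraded.]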

Thanks for the excellent problem report.

I think it is best fixed by the following patch.
I also need to fix up the calculation of 'degraded' so it doesn't say '2' in
this case, which is confusing.  Then I'll commit the fixes.


Thanks,
NeilBrown

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 31670f8..858fdbb 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -3065,11 +3065,17 @@ static void analyse_stripe(struct stripe_head *sh, struct stripe_head_state *s)
 			}
 		} else if (test_bit(In_sync, &rdev->flags))
 			set_bit(R5_Insync, &dev->flags);
-		else {
+		else if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
 			/* in sync if before recovery_offset */
-			if (sh->sector + STRIPE_SECTORS <= rdev->recovery_offset)
-				set_bit(R5_Insync, &dev->flags);
-		}
+			set_bit(R5_Insync, &dev->flags);
+		else if (test_bit(R5_UPTODATE, &dev->flags) &&
+			 test_bit(R5_Expanded, &dev->flags))
+			/* If we've reshaped into here, we assume it is Insync.
+			 * We will shortly update recovery_offset to make
+			 * it official.
+			 */
+			set_bit(R5_Insync, &dev->flags);
+
 		if (rdev && test_bit(R5_WriteError, &dev->flags)) {
 			clear_bit(R5_Insync, &dev->flags);
 			if (!test_bit(Faulty, &rdev->flags)) {
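[The condition chain in the patched analyse_stripe() can be modelled as a small standalone function. This is a simplified sketch: the flag names mirror raid5.c, but the struct and function here are hypothetical, not the kernel code.]

```c
#include <assert.h>
#include <stdbool.h>

#define STRIPE_SECTORS 8ULL	/* simplified stand-in for the kernel constant */

/* Minimal model of the per-device state consulted by the patch. */
struct dev_model {
	bool in_sync;	/* rdev carries the In_sync flag */
	bool uptodate;	/* R5_UPTODATE */
	bool expanded;	/* R5_Expanded: data written here by the reshape */
};

static bool r5_insync(const struct dev_model *d,
		      unsigned long long sector,
		      unsigned long long recovery_offset)
{
	if (d->in_sync)
		return true;
	if (sector + STRIPE_SECTORS <= recovery_offset)
		/* in sync if before recovery_offset */
		return true;
	if (d->uptodate && d->expanded)
		/* reshaped into here: assume in sync; recovery_offset
		 * is updated shortly afterwards to make it official */
		return true;
	return false;
}
```

[The third branch is the new one: it lets a stripe just written by the reshape count as in-sync even though recovery_offset has not yet been advanced past it, which is what unblocks the raid0 OLCE case above.]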


Thread overview: 3+ messages
2011-12-13 15:45 Raid0 expansion problem in md Kwolek, Adam
2011-12-14  4:42 ` NeilBrown [this message]
2011-12-14 12:15   ` Kwolek, Adam
