* Degraded Array
From: Leslie Rhorer @ 2010-12-04 2:42 UTC (permalink / raw)
To: linux-raid
Hello everyone.
I was just growing one of my RAID6 arrays from 13 to 14
members. The array growth had passed its critical stage and had been
growing for several minutes when the system came to a screeching halt. I
hit the big red switch, and when the system rebooted, the array assembled,
but two members are missing. One of the members is the new drive and the
other is the 13th drive in the RAID set. Of course, the array can run well
enough with only 12 members, but it's definitely not the best situation,
especially since the re-shape will take another day and a half. Is it best
I go ahead and leave the array in its current state until the re-shape is
done, or should I go ahead and add back the two failed drives?
--
* Re: Degraded Array
From: Majed B. @ 2010-12-04 4:26 UTC (permalink / raw)
To: lrhorer; +Cc: linux-raid
You have a degraded array now with 1 disk down. If you proceed, more
disks might pop out due to errors.
It's best to back up your data, run a check on the array, fix it, then
try to resume the reshape.
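For reference, an md consistency check is normally driven through sysfs; a
minimal sketch, assuming the array is /dev/md0 (the device name is only an
example, not taken from this thread):

    # start a read-only consistency check of the array
    echo check > /sys/block/md0/md/sync_action
    # watch progress
    cat /proc/mdstat
    # mismatch count reported after the check completes
    cat /sys/block/md0/md/mismatch_cnt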
On Sat, Dec 4, 2010 at 5:42 AM, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
>
> Hello everyone.
>
> I was just growing one of my RAID6 arrays from 13 to 14
> members. The array growth had passed its critical stage and had been
> growing for several minutes when the system came to a screeching halt. It
> hit the big red switch, and when the system rebooted, the array assembled,
> but two members are missing. One of the members is the new drive and the
> other is the 13th drive in the RAID set. Of course, the array can run well
> enough with only 12 members, but it’s definitely not the best situation,
> especially since the re-shape will take another day and a half. Is it best
> I go ahead and leave the array in its current state until the re-shape is
> done, or should I go ahead and add back the two failed drives?
>
--
Majed B.
* RE: Degraded Array
From: Leslie Rhorer @ 2010-12-04 4:47 UTC (permalink / raw)
To: linux-raid
> -----Original Message-----
> From: Majed B. [mailto:majedb@gmail.com]
> Sent: Friday, December 03, 2010 10:27 PM
> To: lrhorer@satx.rr.com
> Cc: linux-raid@vger.kernel.org
> Subject: Re: Degraded Array
>
> You have a degraded array now with 1 disk down. If you proceed, more
> disks might pop out due to errors.
Well, sort of. A significant fraction of the data is now striped
across 12 + 0 drives, rather than 11 + 1. There are no errors occurring on
the drives, although of course an unrecoverable error could happen at any
time.
> It's best to backup your data, run a check on the array, fix it then
The data is backed up. Except in extreme circumstances, I would
never start a re-shape without a current backup.
> run a check on the array, fix it then, try to resume the reshape.
The array is in good health, other than the two kicked drives. I'm not sure
I understand what you mean, though. I'm asking about the two offline
drives. Should I add the 13th back? It still has substantially the same
data as the other 12 drives, discounting the amount that has been
re-written. If so, how can I safely stop the array re-shape and re-add the
drive? (This is under mdadm 2.6.7.2.)
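For context, the reshape and member state can at least be inspected before
deciding; a minimal sketch, with /dev/md0 and /dev/sdX1 as placeholder names:

    # reshape progress, and which members are active, spare or failed
    cat /proc/mdstat
    mdadm --detail /dev/md0
    # superblock of a kicked drive, to compare its event count with the rest
    mdadm --examine /dev/sdX1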
>
> On Sat, Dec 4, 2010 at 5:42 AM, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> >
> > Hello everyone.
> >
> > I was just growing one of my RAID6 arrays from 13 to 14
> > members. The array growth had passed its critical stage and had been
> > growing for several minutes when the system came to a screeching halt.
> > It hit the big red switch, and when the system rebooted, the array assembled,
I meant to type *I*, not *It*.
> > but two members are missing. One of the members is the new drive and the
> > other is the 13th drive in the RAID set. Of course, the array can run well
> > enough with only 12 members, but its definitely not the best situation,
> > especially since the re-shape will take another day and a half. Is it best
> > I go ahead and leave the array in its current state until the re-shape is
> > done, or should I go ahead and add back the two failed drives?
> >
>
>
>
> --
> Majed B.
* Re: Degraded Array
From: Neil Brown @ 2010-12-04 6:44 UTC (permalink / raw)
To: Majed B.; +Cc: lrhorer, linux-raid
On Sat, 4 Dec 2010 07:26:36 +0300 "Majed B." <majedb@gmail.com> wrote:
> You have a degraded array now with 1 disk down. If you proceed, more
> disks might pop out due to errors.
>
> It's best to backup your data, run a check on the array, fix it then
> try to resume the reshape.
Backups are always a good idea, but are sometimes impractical.
I don't think running a 'check' would help at all. A 'reshape' will do much
the same sort of work, and more.
It isn't strictly true that the array is '1 disk down'. Parts of it are 1
disk down, parts are 2 disks down. As the reshape progresses more and more
will be 2 disks down. We don't really want that.
This case isn't really handled well at present. You want to do a 'recovery'
and a 'reshape' at the same time. This is quite possible, but doesn't
currently happen when you restart a reshape in the middle (added to my todo
list).
I suggest you:
- apply the patch below to mdadm.
- assemble the array with --update=revert-reshape. You should give
it a --backup-file too.
- let the reshape complete so you are back to 13 devices.
- add a spare and let it recover
- then add a spare and reshape the array.
Of course, you need to be running a new enough kernel to be able to decrease
the number of devices in a raid5 (see the command sketch below).
NeilBrown
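A rough sketch of the suggested sequence, once a patched mdadm is built; the
device names (/dev/md0, /dev/sd[a-n]1, /dev/sdX1, /dev/sdY1) and backup-file
paths below are placeholders, not values taken from this thread:

    # revert the half-finished grow back to 13 devices
    mdadm --stop /dev/md0
    mdadm --assemble /dev/md0 --update=revert-reshape \
          --backup-file=/root/md0-revert.backup /dev/sd[a-n]1
    # once the reverted reshape finishes, add a spare and let it recover
    mdadm /dev/md0 --add /dev/sdX1
    # then add another device and grow back to 14 members
    mdadm /dev/md0 --add /dev/sdY1
    mdadm --grow /dev/md0 --raid-devices=14 --backup-file=/root/md0-grow.backup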
>
> On Sat, Dec 4, 2010 at 5:42 AM, Leslie Rhorer <lrhorer@satx.rr.com> wrote:
> >
> > Hello everyone.
> >
> > I was just growing one of my RAID6 arrays from 13 to 14
> > members. The array growth had passed its critical stage and had been
> > growing for several minutes when the system came to a screeching halt. It
> > hit the big red switch, and when the system rebooted, the array assembled,
> > but two members are missing. One of the members is the new drive and the
> > other is the 13th drive in the RAID set. Of course, the array can run well
> > enough with only 12 members, but it’s definitely not the best situation,
> > especially since the re-shape will take another day and a half. Is it best
> > I go ahead and leave the array in its current state until the re-shape is
> > done, or should I go ahead and add back the two failed drives?
> >
>
>
>
> --
> Majed B.
commit 12bab17f765a4130c7bd133a0bbb3b83f3f492b0
Author: NeilBrown <neilb@suse.de>
Date: Sat Dec 4 17:37:14 2010 +1100
Support reverting of reshape.
Allow --update=revert-reshape to do what you would expect.
FIXME
needs review. Think about interface and use cases.
Document.
diff --git a/Assemble.c b/Assemble.c
index afd4e60..c034e37 100644
--- a/Assemble.c
+++ b/Assemble.c
@@ -592,6 +592,12 @@ int Assemble(struct supertype *st, char *mddev,
/* Ok, no bad inconsistancy, we can try updating etc */
bitmap_done = 0;
content->update_private = NULL;
+ if (update && strcmp(update, "revert-reshape") == 0 &&
+ (content->reshape_active == 0 || content->delta_disks <= 0)) {
+ fprintf(stderr, Name ": Cannot revert-reshape on this array\n");
+ close(mdfd);
+ return 1;
+ }
for (tmpdev = devlist; tmpdev; tmpdev=tmpdev->next) if (tmpdev->used == 1) {
char *devname = tmpdev->devname;
struct stat stb;
diff --git a/mdadm.c b/mdadm.c
index 08e8ea4..7cf51b5 100644
--- a/mdadm.c
+++ b/mdadm.c
@@ -662,6 +662,8 @@ int main(int argc, char *argv[])
continue;
if (strcmp(update, "devicesize")==0)
continue;
+ if (strcmp(update, "revert-reshape")==0)
+ continue;
if (strcmp(update, "byteorder")==0) {
if (ss) {
fprintf(stderr, Name ": must not set metadata type with --update=byteorder.\n");
@@ -688,7 +690,8 @@ int main(int argc, char *argv[])
}
fprintf(outf, "Valid --update options are:\n"
" 'sparc2.2', 'super-minor', 'uuid', 'name', 'resync',\n"
- " 'summaries', 'homehost', 'byteorder', 'devicesize'.\n");
+ " 'summaries', 'homehost', 'byteorder', 'devicesize',\n"
+ " 'revert-reshape'.\n");
exit(outf == stdout ? 0 : 2);
case O(INCREMENTAL,NoDegraded):
diff --git a/super0.c b/super0.c
index ae3e885..01d5cfa 100644
--- a/super0.c
+++ b/super0.c
@@ -545,6 +545,19 @@ static int update_super0(struct supertype *st, struct mdinfo *info,
}
if (strcmp(update, "_reshape_progress")==0)
sb->reshape_position = info->reshape_progress;
+ if (strcmp(update, "revert-reshape") == 0 &&
+ sb->minor_version > 90 && sb->delta_disks != 0) {
+ int tmp;
+ sb->raid_disks -= sb->delta_disks;
+ sb->delta_disks = - sb->delta_disks;
+ tmp = sb->new_layout;
+ sb->new_layout = sb->layout;
+ sb->layout = tmp;
+
+ tmp = sb->new_chunk;
+ sb->new_chunk = sb->chunk_size;
+ sb->chunk_size = tmp;
+ }
sb->sb_csum = calc_sb0_csum(sb);
return rv;
diff --git a/super1.c b/super1.c
index 0eb0323..805777e 100644
--- a/super1.c
+++ b/super1.c
@@ -781,6 +781,19 @@ static int update_super1(struct supertype *st, struct mdinfo *info,
}
if (strcmp(update, "_reshape_progress")==0)
sb->reshape_position = __cpu_to_le64(info->reshape_progress);
+ if (strcmp(update, "revert-reshape") == 0 && sb->delta_disks) {
+ __u32 temp;
+ sb->raid_disks = __cpu_to_le32(__le32_to_cpu(sb->raid_disks) + __le32_to_cpu(sb->delta_disks));
+ sb->delta_disks = __cpu_to_le32(-__le32_to_cpu(sb->delta_disks));
+ printf("REverted to %d\n", (int)__le32_to_cpu(sb->delta_disks));
+ temp = sb->new_layout;
+ sb->new_layout = sb->layout;
+ sb->layout = temp;
+
+ temp = sb->new_chunk;
+ sb->new_chunk = sb->chunksize;
+ sb->chunksize = temp;
+ }
sb->sb_csum = calc_sb_1_csum(sb);
return rv;
* RE: Degraded Array
From: Leslie Rhorer @ 2010-12-04 8:53 UTC (permalink / raw)
To: 'Neil Brown', 'Majed B.'; +Cc: linux-raid
> -----Original Message-----
> From: linux-raid-owner@vger.kernel.org [mailto:linux-raid-
> owner@vger.kernel.org] On Behalf Of Neil Brown
> Sent: Saturday, December 04, 2010 12:45 AM
> To: Majed B.
> Cc: lrhorer@satx.rr.com; linux-raid@vger.kernel.org
> Subject: Re: Degraded Array
>
> On Sat, 4 Dec 2010 07:26:36 +0300 "Majed B." <majedb@gmail.com> wrote:
>
> > You have a degraded array now with 1 disk down. If you proceed, more
> > disks might pop out due to errors.
> >
> > It's best to backup your data, run a check on the array, fix it then
> > try to resume the reshape.
>
> Backups are always a good idea, but are sometimes impractical.
I always have backups. I have a backup system whose RAID array is
always kept a bit bigger than the primary server's. Every morning at 04:00 I
run an rsync (well, the system does, of course).
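That sort of nightly mirror is typically just a cron entry driving rsync; a
minimal sketch with invented paths and hostname:

    # /etc/cron.d/raid-backup  (illustrative only; paths and host are made up)
    # m h dom mon dow user  command
    0 4 * * * root rsync -a --delete /raid/ backuphost:/raid-backup/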
> I don't think running a 'check' would help at all. A 'reshape' will do much
> the same sort of work, and more.
>
> It isn't strictly true that the array is '1 disk down'. Parts of it are 1
> disk down, parts are 2 disks down. As the reshape progresses more and more
> will be 2 disks down. We don't really want that.
Well, I'm not too fussed if there is no better option.
> This case isn't really handled well at present. You want to do a 'recovery'
> and a 'reshape' at the same time. This is quite possible, but doesn't
> currently happen when you restart a reshape in the middle (added to my todo
> list).
>
> I suggest you:
> - apply the patch below to mdadm.
> - assemble the array with --update=revert-reshape. You should give
> it a --backup-file too.
> - let the reshape complete so you are back to 13 devices.
> - add a spare and let it recovery
> - then add a spare and reshape the array.
>
> Of course you needed to be running a new enough kernel to be able decrease
> the number of devices in a raid5.
I don't think I am. Mdadm 2.6.7.2 and kernel 2.6.26-2-amd64.
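For what it's worth, the running versions are easy to confirm:

    mdadm --version    # or: mdadm -V
    uname -r           # running kernel release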
* RE: Degraded Array
From: Leslie Rhorer @ 2010-12-11 4:29 UTC (permalink / raw)
To: 'Neil Brown', 'Majed B.'; +Cc: linux-raid
Well, that was painful, and more than a little odd. As I reported
before, the system halted dead during the re-shape from 13 disks to 14 on
the RAID6 array of the main server. The array reassembled after reboot, but
with only 12 drives. I'm pretty sure one drive was missing because it (the
14th) wasn't in mdadm.conf, because of course I had not put it there, yet.
I'm not exactly sure why the 12th wasn't assembled in the array. Anyway,
during the continued re-shape, it halted again. I brought it back up again,
and it eventually completed the re-shape, but with hundreds of thousands of
reported inconsistencies. I re-added the two faulted drives one at a time
and the recovery finished both times without apparent error. When it was
done, I started looking at the file system, and it was a mess. At one
point, XFS crashed altogether. I ran xfs_repair, and it found numerous
problems at the file system level. Several files were lost. I ran a cmp
between every file on the backup and on the RAID array I had just re-shaped,
and nearly every large file was corrupted. Most small files were intact,
but a few of them were also toast. The large files were not totally
unreadable, however. In fact most of the videos were mostly intact, but
with frequent video breakups, stutters, and drop-outs encountered on every
file I checked that had failed the compare. I then ran an rsync against the
corrupted file system with the --checksum option, but it did not copy most
of the files back from the backup, although it did copy quite a few. Weird.
Checking a few of the known bad files with md5sum, every pair had different
checksums. I also checked a few apparently good files, and every pair of those
had matching checksums. I ran another cmp, piping the list of failures to a log
file, and then used the list to copy the remaining failed files back to the
main array. Finally, I did one last cmp between the two, and every file
passed except those which were expected not to.
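A rough sketch of that compare-and-restore pass, with invented mount points
(/raid for the reshaped array, /backup for the backup copy):

    # log every file that differs from its backup copy
    cd /backup && find . -type f | while read -r f; do
        cmp -s "$f" "/raid/$f" || echo "$f" >> /tmp/corrupt.list
    done
    # copy only the files on that list back from the backup
    rsync -a --files-from=/tmp/corrupt.list /backup/ /raid/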
I have no idea what could have caused this, but given the symptoms it seems
likely the stripes on one of the drives were accidentally put in the wrong
place while the re-shape took place, or something like that. On the up
side, the arrays have never performed better. I'm very pleased. Running
two TCP transfers at once over a 1000M Ethernet link, the transfers topped
out at over 928 Mbps. Single TCP transfers managed better than 800 Mbps.
Some intra-machine processes topped out at nearly 2200 Mbps. There is no
sign at all of any corruption post re-shape.