linux-raid.vger.kernel.org archive mirror
* Upgrading Raid1 array to have 2. disks.
@ 2003-05-06 23:24 Anders Fugmann
  2003-05-06 23:46 ` Mads Peter Bach
  2003-05-07  0:00 ` Neil Brown
  0 siblings, 2 replies; 8+ messages in thread
From: Anders Fugmann @ 2003-05-06 23:24 UTC (permalink / raw)
  To: linux-raid

Hi,

Sorry if this is the wrong list to post this on.

I have made an error when creating a new raid1 array as I specified that
the raid should consist of only one disk. Now, I would like to add 
another disk, but mdadm insists on adding it as a spare disk, offering 
no data redundancy.

Before:
md0 : active raid1 ide/host0/bus0/target0/lun0/part1[0]
       1048704 blocks [1/1] [U]

After running
# mdadm --add /dev/md/0 /dev/ide/host0/bus1/target0/lun0/part1

md0 : active raid1 ide/host0/bus1/target0/lun0/part1[1] 
ide/host0/bus0/target0/lun0/part1[0]
       1048704 blocks [1/1] [U]


I have read the man pages and tried googling without success.
I do not wish to destroy the data on the disks and recreate the array,
though that might be a last resort. Can anyone help me convince mdadm to
add the second partition as an active disk?

Regards
Anders Fugmann

P.s.
Please CC me, as I'm not subscribed to the list.




* Re: Upgrading Raid1 array to have 2. disks.
  2003-05-06 23:24 Upgrading Raid1 array to have 2. disks Anders Fugmann
@ 2003-05-06 23:46 ` Mads Peter Bach
  2003-05-07  0:00 ` Neil Brown
  1 sibling, 0 replies; 8+ messages in thread
From: Mads Peter Bach @ 2003-05-06 23:46 UTC (permalink / raw)
  To: Anders Fugmann; +Cc: linux-raid

Anders Fugmann wrote:

> Sorry if this is the wrong list to post this on.

It's the right list.

> I have made an error when creating a new raid1 array as I specified that
> the raid should consist of only one disk. Now, I would like to add 
> another disk, but mdadm insists on adding it as a spare disk, offering 
> no data redundancy.

> I have read the man pages and tried googling without success.
> I do not wish to destroy the data on the disks and recreate the array,
> though that might be a last resort. Can anyone help me convince mdadm to
> add the second partition as an active disk?

You could keep your data by doing this:

Remove the current spare:

mdadm --remove /dev/md/0 /dev/ide/host0/bus1/target0/lun0/part1

Create a new degraded array on that partition with:

mdadm --create /dev/md/1 --level=1 --raid-devices=2 \
    /dev/ide/host0/bus1/target0/lun0/part1 missing

Then create a filesystem on the new array, mount it, and copy the data over.
Finally, stop the old array and add its disk to the new array.
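
A rough sketch of those last steps (untested; the filesystem type and
mount points below are only examples, and the device names are taken
from your /proc/mdstat output):

mkfs.ext3 /dev/md/1                  # or whatever filesystem you prefer
mkdir -p /mnt/newraid
mount /dev/md/1 /mnt/newraid
cp -a /mnt/oldraid/. /mnt/newraid/   # /mnt/oldraid = wherever /dev/md/0 is mounted
umount /mnt/oldraid
mdadm --stop /dev/md/0               # stop the old, now unused array
mdadm --add /dev/md/1 /dev/ide/host0/bus0/target0/lun0/part1
                                     # the old disk becomes the second mirror half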


-- 
Mads Peter Bach
Systemadministrator,  Det Humanistiske Fakultet, Aalborg Universitet
Kroghstræde 3 - 5.111, DK-9220 Aalborg Øst - (+45) 96358062
# whois MPB1-DK@whois.dk-hostmaster.dk



* Re: Upgrading Raid1 array to have 2. disks.
  2003-05-06 23:24 Upgrading Raid1 array to have 2. disks Anders Fugmann
  2003-05-06 23:46 ` Mads Peter Bach
@ 2003-05-07  0:00 ` Neil Brown
  2003-05-08 19:11   ` Anders Fugmann
  2003-05-20  0:33   ` raid5, 2 drives dead at same time,kernel will Oops? 3tcdgwg3
  1 sibling, 2 replies; 8+ messages in thread
From: Neil Brown @ 2003-05-07  0:00 UTC (permalink / raw)
  To: Anders Fugmann; +Cc: linux-raid

On Wednesday May 7, afu@fugmann.dhs.org wrote:
> Hi,
> 
> Sorry if this is the wrong list to post this on.
> 
> I have made an error when creating a new raid1 array as I specified that
> the raid should consist of only one disk. Now, I would like to add 
> another disk, but mdadm insists on adding it as a spare disk, offering 
> no data redundancy.
> 
> Before:
> md0 : active raid1 ide/host0/bus0/target0/lun0/part1[0]
>        1048704 blocks [1/1] [U]
> 
> After running
> # mdadm --add /dev/md/0 /dev/ide/host0/bus1/target0/lun0/part1
> 
> md0 : active raid1 ide/host0/bus1/target0/lun0/part1[1] 
> ide/host0/bus0/target0/lun0/part1[0]
>        1048704 blocks [1/1] [U]
> 
> 
> I have read the man pages and tried googling without success.
> I do not wish to destroy the data on the disks and recreate the array,
> though that might be a last resort. Can anyone help me convince mdadm to
> add the second partition as an active disk?

You cannot add a new data drive to a raid1 array.
If you shut down the array and re-create it, you should not lose data.
The correct command to use would be something like:
  mdadm --create /dev/md0 --level=1 -n 2 /dev/ide/host0/bus0/target0/lun0/part1 missing

Note the word "missing" at the end.  This tells mdadm to create a
2-drive array with one missing drive (and so only one active drive).
This will have the same data as the old md0.
Now "mdadm --add" should do what you want.

NeilBrown




* Re: Upgrading Raid1 array to have 2. disks.
  2003-05-07  0:00 ` Neil Brown
@ 2003-05-08 19:11   ` Anders Fugmann
  2003-05-20  0:33   ` raid5, 2 drives dead at same time,kernel will Oops? 3tcdgwg3
  1 sibling, 0 replies; 8+ messages in thread
From: Anders Fugmann @ 2003-05-08 19:11 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Neil Brown wrote:
> You cannot add a new data drive to a raid1 array.
> If you shut down the array and re-create it, you should not lose data.
> The correct command to use would be something like:
>   mdadm --create /dev/md0 --level=1 -n 2 /dev/ide/host0/bus0/target0/lun0/part1 missing
> 
> Note the word "missing" at the end.  This tells mdadm to create a
> 2-drive array with one missing drive (and so only one active drive).
> This will have the same data as the old md0.
> Now "mdadm --add" should do what you want.

Thanks for the replies.
The method above worked like a charm. The data on the array even survived
the operation.

> 
> NeilBrown

Regards
Anders Fugmann



* raid5, 2 drives dead at same time,kernel will Oops?
  2003-05-07  0:00 ` Neil Brown
  2003-05-08 19:11   ` Anders Fugmann
@ 2003-05-20  0:33   ` 3tcdgwg3
  2003-05-21  2:42     ` Neil Brown
  1 sibling, 1 reply; 8+ messages in thread
From: 3tcdgwg3 @ 2003-05-20  0:33 UTC (permalink / raw)
  To: linux-raid

Hi,

I am trying to simulate a case where two drives
in an array fail at the same time.
I use two IDE drives and try to create a
raid5 array with 4 arms, created as follows:

/dev/hdc1
/dev/hde1
/dev/hdc2
/dev/hde2

This is just for a test; I know that creating two arms on
one hard drive doesn't make much sense.
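
For reference, an array like this can be created with something along
the lines of (the exact options I used may have differed slightly):

mdadm --create /dev/md0 --level=5 --raid-devices=4 \
      /dev/hdc1 /dev/hde1 /dev/hdc2 /dev/hde2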


Anyway, when I run this array and power off one
of the hard drives (/dev/hde) to simulate two arms failing
at the same time in an array, I get a system Oops. I am using
a 2.4.18 kernel.

Can anyone tell me if this is normal, or if there is a fix for this?

Thanks in advance.

Here is the output:

=================================
hde: drive not ready for command

md: updating md0 RAID superblock on device

md: (skipping faulty hde2 )

md: hdc2 [events: 00000002]<6>(write) hdc2's sb offset: 3469952

md: (skipping faulty hde1 )

md: hdc1 [events: 00000002]<6>(write) hdc1's sb offset: 16064896

md0: no spare disk to reconstruct array! -- continuing in degraded mode

md: recovery thread finished ...

md: recovery thread got woken up ...

md0: no spare disk to reconstruct array! -- continuing in degraded mode

md: recovery thread finished ...

sector=561a8 i=0 00000000 00000000 8132e840 0

kernel BUG at raid5.c:309!

kupdated(7): Oops

================================================




* Re: raid5, 2 drives dead at same time,kernel will Oops?
  2003-05-20  0:33   ` raid5, 2 drives dead at same time,kernel will Oops? 3tcdgwg3
@ 2003-05-21  2:42     ` Neil Brown
  2003-05-22  1:04       ` 3tcdgwg3
  0 siblings, 1 reply; 8+ messages in thread
From: Neil Brown @ 2003-05-21  2:42 UTC (permalink / raw)
  To: 3tcdgwg3; +Cc: linux-raid

On Monday May 19, 3tcdgwg3@prodigy.net wrote:
> Hi,
> 
> I am trying to simulate a case where two drives
> in an array fail at the same time.
> I use two IDE drives and try to create a
> raid5 array with 4 arms, created as follows:
> 
> /dev/hdc1
> /dev/hde1
> /dev/hdc2
> /dev/hde2
> 
> This is just for a test; I know that creating two arms on
> one hard drive doesn't make much sense.
> 
> 
> Anyway, when I run this array and power off one
> of the hard drives (/dev/hde) to simulate two arms failing
> at the same time in an array, I get a system Oops. I am using
> a 2.4.18 kernel.
> 
> Can anyone tell me if this is normal, or if there is a fix for this?
> 

Congratulations and thanks.  You have managed to trigger a bug that
no-one else has found.

The following patch (against 2.4.20) should fix it.  If you can test
and confirm I would really appreciate it.

NeilBrown


------------------------------------------------------------
Handle concurrent failure of two drives in raid5

If two drives both fail during a write request, raid5 doesn't
cope properly and will eventually oops.

With this patch, blocks that have already been 'written'
are failed when double drive failure is noticed, as well as
blocks that are about to be written.

 ----------- Diffstat output ------------
 ./drivers/md/raid5.c |   10 +++++++++-
 1 files changed, 9 insertions(+), 1 deletion(-)

diff ./drivers/md/raid5.c~current~ ./drivers/md/raid5.c
--- ./drivers/md/raid5.c~current~	2003-05-21 12:42:07.000000000 +1000
+++ ./drivers/md/raid5.c	2003-05-21 12:37:37.000000000 +1000
@@ -882,7 +882,7 @@ static void handle_stripe(struct stripe_
 	/* check if the array has lost two devices and, if so, some requests might
 	 * need to be failed
 	 */
-	if (failed > 1 && to_read+to_write) {
+	if (failed > 1 && to_read+to_write+written) {
 		for (i=disks; i--; ) {
 			/* fail all writes first */
 			if (sh->bh_write[i]) to_write--;
@@ -891,6 +891,14 @@ static void handle_stripe(struct stripe_
 				bh->b_reqnext = return_fail;
 				return_fail = bh;
 			}
+			/* and fail all 'written' */
+			if (sh->bh_written[i]) written--;
+			while ((bh = sh->bh_written[i])) {
+				sh->bh_written[i] = bh->b_reqnext;
+				bh->b_reqnext = return_fail;
+				return_fail = bh;
+			}
+
 			/* fail any reads if this device is non-operational */
 			if (!conf->disks[i].operational) {
 				spin_lock_irq(&conf->device_lock);


* Re: raid5, 2 drives dead at same time,kernel will Oops?
  2003-05-21  2:42     ` Neil Brown
@ 2003-05-22  1:04       ` 3tcdgwg3
  2003-05-30 20:33         ` 3tcdgwg3
  0 siblings, 1 reply; 8+ messages in thread
From: 3tcdgwg3 @ 2003-05-22  1:04 UTC (permalink / raw)
  To: Neil Brown; +Cc: linux-raid

Neil,
Preliminary tests look good; I will test more
when I have time.

Thanks,
-Will.


* Re: raid5, 2 drives dead at same time,kernel will Oops?
  2003-05-22  1:04       ` 3tcdgwg3
@ 2003-05-30 20:33         ` 3tcdgwg3
  0 siblings, 0 replies; 8+ messages in thread
From: 3tcdgwg3 @ 2003-05-30 20:33 UTC (permalink / raw)
  To: 3tcdgwg3, Neil Brown; +Cc: linux-raid

Hi,

I have some other issues under this "more than one
arm broken in a raid5 array" condition. The next
most important one is this:

If I have two arrays, where the first is a raid5 with a resync in
progress and the second is another raid5 whose resync is scheduled to
start after the first array has finished syncing: if I then kill two
arms in the first raid5 array, the resync stops but is never aborted,
so the second raid5 array never gets a chance to start its resync.
Is there a fix for this?
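
In case it helps with reproducing this: instead of powering off a drive,
the two arms can also be failed by hand with something like the following
(device names only as an example):

mdadm /dev/md0 --fail /dev/hde1
mdadm /dev/md0 --fail /dev/hde2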

Thanks

-W





end of thread, other threads:[~2003-05-30 20:33 UTC | newest]

Thread overview: 8+ messages
2003-05-06 23:24 Upgrading Raid1 array to have 2. disks Anders Fugmann
2003-05-06 23:46 ` Mads Peter Bach
2003-05-07  0:00 ` Neil Brown
2003-05-08 19:11   ` Anders Fugmann
2003-05-20  0:33   ` raid5, 2 drives dead at same time,kernel will Oops? 3tcdgwg3
2003-05-21  2:42     ` Neil Brown
2003-05-22  1:04       ` 3tcdgwg3
2003-05-30 20:33         ` 3tcdgwg3
