linux-raid.vger.kernel.org archive mirror
* replace disk in raid5 without linux noticing?
@ 2006-04-19 14:31 Dexter Filmore
  2006-04-19 16:31 ` Shai
  0 siblings, 1 reply; 12+ messages in thread
From: Dexter Filmore @ 2006-04-19 14:31 UTC (permalink / raw)
  To: linux-raid

Let's say a disk in an array starts reporting SMART errors but is still
functional.
Instead of waiting for it to fail completely, which would start a resync and
stress the other disks, could I clone that disk to a fresh one, take the array
offline and replace the disk?

 
-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.12
GCS d--(+)@ s-:+ a- C+++(++++) UL+>++++ P+>++ L+++>++++ E-- W++ N o? K-
w--(---) !O M+ V- PS++(+) PE(-) Y++ PGP t++(---)@ 5 X+(++) R+(++) tv--(+)@ 
b++(+++) DI+++ D G++ e* h>++ r%>* y?
------END GEEK CODE BLOCK------

http://www.stop1984.com
http://www.againsttcpa.com


* Re: replace disk in raid5 without linux noticing?
  2006-04-19 14:31 replace disk in raid5 without linux noticing? Dexter Filmore
@ 2006-04-19 16:31 ` Shai
  2006-04-19 17:03   ` Ming Zhang
  2006-04-21 18:23   ` Dexter Filmore
  0 siblings, 2 replies; 12+ messages in thread
From: Shai @ 2006-04-19 16:31 UTC (permalink / raw)
  To: Dexter Filmore; +Cc: linux-raid

On 4/19/06, Dexter Filmore <Dexter.Filmore@gmx.de> wrote:
> Let's say a disk in an array starts yielding smart errors but is still
> functional.
> So instead of waiting for it to fail completely and start a sync and stress
> the other disks, could I clone that disk to a fresh one, put the array
> offline and replace the disk?

Hi,

Why can't you just mark that drive as failed, remove it and hot-add a
new drive to replace the failed drive?

Shai


* Re: replace disk in raid5 without linux noticing?
  2006-04-19 16:31 ` Shai
@ 2006-04-19 17:03   ` Ming Zhang
  2006-04-19 17:41     ` Brendan Conoboy
  2006-04-21 18:23   ` Dexter Filmore
  1 sibling, 1 reply; 12+ messages in thread
From: Ming Zhang @ 2006-04-19 17:03 UTC (permalink / raw)
  To: Shai; +Cc: Dexter Filmore, linux-raid

On Wed, 2006-04-19 at 18:31 +0200, Shai wrote:
> On 4/19/06, Dexter Filmore <Dexter.Filmore@gmx.de> wrote:
> > Let's say a disk in an array starts yielding smart errors but is still
> > functional.
> > So instead of waiting for it to fail completely and start a sync and stress
> > the other disks, could I clone that disk to a fresh one, put the array
> > offline and replace the disk?
> 
> Hi,
> 
> Why can't you just mark that drive as failed, remove it and hotadd a
> new drive to replace the failed drive?

Because a background rebuild is slower than a disk-to-disk copy, and his
disk is still fully functional.
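
A minimal sketch of the difference, assuming a hypothetical 4-disk RAID5
with XOR parity and a single stripe of made-up data: a rebuild has to read
every other member to recompute the missing block, while a clone reads only
the disk being replaced. The block contents and the disk index are
illustrative, not taken from any real array.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte strings together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe of a hypothetical 4-disk RAID5: three data blocks plus parity.
data = [b"AAAA", b"BBBB", b"CCCC"]
stripe = data + [xor_blocks(data)]

failing = 1  # index of the disk that reports SMART errors (assumption)

# Rebuild: the block is recomputed from every *other* member of the stripe.
survivors = [blk for i, blk in enumerate(stripe) if i != failing]
rebuilt = xor_blocks(survivors)
rebuild_reads = len(survivors)

# Clone: the block is read straight from the still-working disk.
cloned = stripe[failing]
clone_reads = 1

print(rebuilt == cloned, rebuild_reads, clone_reads)
```

Both paths end with the same block; the rebuild simply touches all the
remaining disks to get there.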



* Re: replace disk in raid5 without linux noticing?
  2006-04-19 17:03   ` Ming Zhang
@ 2006-04-19 17:41     ` Brendan Conoboy
  2006-04-19 18:16       ` Ming Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Brendan Conoboy @ 2006-04-19 17:41 UTC (permalink / raw)
  To: mingz; +Cc: Shai, Dexter Filmore, linux-raid

Ming Zhang wrote:
>> Why can't you just mark that drive as failed, remove it and hotadd a
>> new drive to replace the failed drive?
> 
> because background rebuild is slower than disk to disk copy, since his
> disk is still fully functional.

Wouldn't it be great if every disk in a RAID volume were in its own way 
a degraded RAID1 device without a mirror?  Then when any drive started 
generating recoverable errors and warnings a mirror could be allocated 
without any downtime.  You can certainly generate a layout like this 
manually, but it would be nice to have that sort of feature out of the 
box (and without the performance hit!).  This would help a great deal in 
a situation such as Dexter's.

-Brendan (synk@swcp.com)


* Re: replace disk in raid5 without linux noticing?
  2006-04-19 17:41     ` Brendan Conoboy
@ 2006-04-19 18:16       ` Ming Zhang
  2006-04-20 15:22         ` Gabor Gombas
  0 siblings, 1 reply; 12+ messages in thread
From: Ming Zhang @ 2006-04-19 18:16 UTC (permalink / raw)
  To: Brendan Conoboy; +Cc: Shai, Dexter Filmore, linux-raid

On Wed, 2006-04-19 at 10:41 -0700, Brendan Conoboy wrote:
> Ming Zhang wrote:
> >> Why can't you just mark that drive as failed, remove it and hotadd a
> >> new drive to replace the failed drive?
> > 
> > because background rebuild is slower than disk to disk copy, since his
> > disk is still fully functional.
> 
> Wouldn't it be great if every disk in a RAID volume were in its own way 
> a degraded RAID1 device without a mirror?  Then when any drive started 
> generating recoverable errors and warnings a mirror could be allocated 
> without any downtime.  You can certainly generate a layout like this 
> manually, but it would be nice to have that sort of feature out of the 
> box (and without the performance hit!).  This would help a great deal in 
> a situation such as Dexter's.

Is this possible?
* stop the RAID5
* set up a mirror between the current disk X and a newly added disk Y, with
X as the primary (which means copying X to Y until fully synced, and reading
only from X until that finishes); this mirror must not put any metadata or
marker on the existing disk;
* add this mirror to the RAID5
* start the RAID5;

... the mirror continues copying data from X to Y; once that ends:

* stop the RAID5
* split the mirror
* put disk Y back into the RAID5
* restart the RAID5.

Since this is a mirror, all metadata is the same. It would be even better
if there were no need to stop the RAID5 to do this.

Maybe MD can already do this, but I do not know.
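
The mirror behaviour described above can be sketched as a toy model. This
is not how md implements RAID1; the class name SyncingMirror and its
methods are hypothetical and only illustrate the rule "read from X until Y
is fully synced, keep already-copied blocks up to date on both".

```python
class SyncingMirror:
    """Toy model: X is the primary, Y is filled in block by block."""

    def __init__(self, x_blocks):
        self.x = list(x_blocks)
        self.y = [None] * len(self.x)
        self.copied = 0  # blocks [0, copied) are already on Y

    def read(self, i):
        # Until the sync finishes, reads are served from X only.
        return self.x[i]

    def write(self, i, value):
        # Writes always reach X. They reach Y only for blocks that were
        # already copied; the sync picks up the rest when it gets there.
        self.x[i] = value
        if i < self.copied:
            self.y[i] = value

    def sync_one(self):
        # Copy the next not-yet-copied block from X to Y.
        if self.copied < len(self.x):
            self.y[self.copied] = self.x[self.copied]
            self.copied += 1

    @property
    def in_sync(self):
        return self.copied == len(self.x) and self.x == self.y


mirror = SyncingMirror(["a", "b", "c", "d"])
mirror.sync_one()       # block 0 is copied
mirror.write(0, "A")    # already copied: written to X and Y
mirror.write(3, "D")    # not yet copied: written to X only for now
while not mirror.in_sync:
    mirror.sync_one()
print(mirror.y)
```

Once in_sync is true, Y holds everything X does and the mirror could be
split, which is the point at which the second half of the procedure starts.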


* Re: replace disk in raid5 without linux noticing?
  2006-04-19 18:16       ` Ming Zhang
@ 2006-04-20 15:22         ` Gabor Gombas
  2006-04-20 15:24           ` Ming Zhang
  0 siblings, 1 reply; 12+ messages in thread
From: Gabor Gombas @ 2006-04-20 15:22 UTC (permalink / raw)
  To: Ming Zhang; +Cc: Brendan Conoboy, Shai, Dexter Filmore, linux-raid

On Wed, Apr 19, 2006 at 02:16:10PM -0400, Ming Zhang wrote:

> is this possible? 
> * stop RAID5
> * set a mirror between current disk X and a new added disk Y, and X as
> primary one (which means copy X to Y to full sync, and before this ends,
> only read from X); also this mirror will not have any metadata or mark
> on existing disk;

The mirror should be created without persistent superblocks (obviously),
but --build does not seem to allow RAID1.

> * add this mirror to RAID5
> * start RAID5;
> 
> ... mirror will continue copy data from X to Y, once end
> 
> * stop RAID5
> * split mirror
> * put DISK Y back to RAID5
> * restart RAID5.
> 
> since this is a mirror, all metadata are same. it will be even greater
> if no need to stop raid5 to do this.

The process seems rather fragile. If I created a RAID5 array to protect
my data, I most certainly would not like to perform so many steps where I
can mess up.

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------


* Re: replace disk in raid5 without linux noticing?
  2006-04-20 15:22         ` Gabor Gombas
@ 2006-04-20 15:24           ` Ming Zhang
  0 siblings, 0 replies; 12+ messages in thread
From: Ming Zhang @ 2006-04-20 15:24 UTC (permalink / raw)
  To: Gabor Gombas; +Cc: Brendan Conoboy, Shai, Dexter Filmore, linux-raid

On Thu, 2006-04-20 at 17:22 +0200, Gabor Gombas wrote:
> On Wed, Apr 19, 2006 at 02:16:10PM -0400, Ming Zhang wrote:
> 
> > is this possible? 
> > * stop RAID5
> > * set a mirror between current disk X and a new added disk Y, and X as
> > primary one (which means copy X to Y to full sync, and before this ends,
> > only read from X); also this mirror will not have any metadata or mark
> > on existing disk;
> 
> The mirror should be created without persistent superblocks (obviously),
> but --build does not seem to allow RAID1.
> 
> > * add this mirror to RAID5
> > * start RAID5;
> > 
> > ... mirror will continue copy data from X to Y, once end
> > 
> > * stop RAID5
> > * split mirror
> > * put DISK Y back to RAID5
> > * restart RAID5.
> > 
> > since this is a mirror, all metadata are same. it will be even greater
> > if no need to stop raid5 to do this.
> 
> The process seems rather fragile. If I created a RAID5 array to protect
> my data I most certainly would not like to perform so much steps where I
> can mess up.

It can be scripted, and then there is little chance of it failing.


Or, if I understand the current RAID5 bitmap correctly, you can remove disk X
from the array, dd disk X to Y, and then add disk Y to the array.
The bitmap can then handle the resync at a much lower cost.
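
A toy model of the bitmap idea, under loud assumptions: chunks are
represented as list entries, the write-intent bitmap as a set of chunk
indices, and the function name resync_with_bitmap is hypothetical. Whether
md would actually accept the cloned disk Y for a bitmap-based resync
instead of a full one is not settled in this thread.

```python
def resync_with_bitmap(clone, current, dirty):
    """Bring `clone` up to date by rewriting only the chunks in `dirty`."""
    for chunk in sorted(dirty):
        clone[chunk] = current[chunk]
    return len(dirty)


# What disk X held, chunk by chunk, at the moment it was removed.
at_removal = ["c0", "c1", "c2", "c3", "c4", "c5"]
disk_y = list(at_removal)  # stand-in for the dd copy from X to Y

# While the array runs without that member, writes change what the slot
# should hold and set the matching bits in the write-intent bitmap.
should_hold = list(at_removal)
bitmap = set()
for chunk, value in [(1, "c1'"), (4, "c4'"), (1, "c1''")]:
    should_hold[chunk] = value
    bitmap.add(chunk)

rewritten = resync_with_bitmap(disk_y, should_hold, bitmap)
print(rewritten, "of", len(disk_y), "chunks rewritten")
```

The saving depends entirely on how many chunks were written while the
member was out; a busy array during a long dd would dirty far more of them.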





* Re: replace disk in raid5 without linux noticing?
  2006-04-19 16:31 ` Shai
  2006-04-19 17:03   ` Ming Zhang
@ 2006-04-21 18:23   ` Dexter Filmore
  2006-04-21 22:25     ` Carlos Carvalho
  1 sibling, 1 reply; 12+ messages in thread
From: Dexter Filmore @ 2006-04-21 18:23 UTC (permalink / raw)
  To: Shai; +Cc: linux-raid

On Wednesday, 19 April 2006 18:31, Shai wrote:
> On 4/19/06, Dexter Filmore <Dexter.Filmore@gmx.de> wrote:
> > Let's say a disk in an array starts yielding smart errors but is still
> > functional.
> > So instead of waiting for it to fail completely and start a sync and
> > stress the other disks, could I clone that disk to a fresh one, put the
> > array offline and replace the disk?
>
> Hi,
>
> Why can't you just mark that drive as failed, remove it and hotadd a
> new drive to replace the failed drive?

Well, resync stresses the other disks a lot. If that can be avoided, I'd 
rather do so.


* Re: replace disk in raid5 without linux noticing?
  2006-04-21 18:23   ` Dexter Filmore
@ 2006-04-21 22:25     ` Carlos Carvalho
  2006-04-22 15:08       ` Martin Cracauer
  0 siblings, 1 reply; 12+ messages in thread
From: Carlos Carvalho @ 2006-04-21 22:25 UTC (permalink / raw)
  To: linux-raid

Dexter Filmore (Dexter.Filmore@gmx.de) wrote on 21 April 2006 20:23:
 >On Wednesday, 19 April 2006 18:31, Shai wrote:
 >> On 4/19/06, Dexter Filmore <Dexter.Filmore@gmx.de> wrote:
 >> > Let's say a disk in an array starts yielding smart errors but is still
 >> > functional.
 >> > So instead of waiting for it to fail completely and start a sync and
 >> > stress the other disks, could I clone that disk to a fresh one, put the
 >> > array offline and replace the disk?
 >>
 >> Hi,
 >>
 >> Why can't you just mark that drive as failed, remove it and hotadd a
 >> new drive to replace the failed drive?
 >
 >Well, resync stresses the other disks a lot. If that can be avoided, I'd 
 >rather do so.

I'm not sure I understand what you want but you certainly can:

stop the array
dd warning disk => new one
remove warning disk
assemble the array again with the new disk

The inconvenience is that you don't have the array during the copy.

Perhaps you could also just change the first step to "put the array in
read-only mode". This of course means your filesystem must be mounted
ro but at least you can read from it. For this to work md must not
make any changes to the disks, in particular it must not change the
superblocks if a disk fails. I don't know if this is what happens.
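
The copy step ("dd warning disk => new one") can be sketched in Python
with a checksum to confirm the clone matches. This is only an illustration:
ordinary temporary files stand in for the two disks, and the helper names
clone and sha256_of are hypothetical. On a real system the source must not
change during the copy, which is why the array is stopped or read-only.

```python
import hashlib
import os
import tempfile


def clone(src_path, dst_path, chunk=1024 * 1024):
    """Copy src to dst chunk by chunk; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(chunk)
            if not block:
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def sha256_of(path, chunk=1024 * 1024):
    """Re-read a file and return its SHA-256."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk), b""):
            digest.update(block)
    return digest.hexdigest()


# Temporary files stand in for the old and new disks (assumption).
with tempfile.TemporaryDirectory() as tmp:
    old = os.path.join(tmp, "old_disk.img")
    new = os.path.join(tmp, "new_disk.img")
    with open(old, "wb") as f:
        f.write(os.urandom(2 * 1024 * 1024 + 123))
    copied = clone(old, new)
    verified = sha256_of(new) == copied

print(verified)
```

Because the copy is byte for byte, the new disk also carries the old
disk's md superblock, which is what lets the array be assembled with it
afterwards.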


* Re: replace disk in raid5 without linux noticing?
  2006-04-21 22:25     ` Carlos Carvalho
@ 2006-04-22 15:08       ` Martin Cracauer
  2006-04-22 17:48         ` Carlos Carvalho
  0 siblings, 1 reply; 12+ messages in thread
From: Martin Cracauer @ 2006-04-22 15:08 UTC (permalink / raw)
  To: Carlos Carvalho; +Cc: linux-raid

> stop the array
> dd warning disk => new one
> remove warning disk
> assemble the array again with the new disk
> 
> The inconvenience is that you don't have the array during the copy.

Stopping the array and restarting it as readonly will give you access
to the data while that copy is in progress.

Martin
-- 
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Martin Cracauer <cracauer@cons.org>   http://www.cons.org/cracauer/
FreeBSD - where you want to go, today.      http://www.freebsd.org/


* Re: replace disk in raid5 without linux noticing?
  2006-04-22 15:08       ` Martin Cracauer
@ 2006-04-22 17:48         ` Carlos Carvalho
  2006-04-23 16:43           ` Martin Cracauer
  0 siblings, 1 reply; 12+ messages in thread
From: Carlos Carvalho @ 2006-04-22 17:48 UTC (permalink / raw)
  To: Martin Cracauer; +Cc: linux-raid

Martin Cracauer (cracauer@cons.org) wrote on 22 April 2006 11:08:
 >> stop the array
 >> dd warning disk => new one
 >> remove warning disk
 >> assemble the array again with the new disk
 >> 
 >> The inconvenience is that you don't have the array during the copy.
 >
 >Stopping the array and restarting it as readonly will give you access
 >to the data while that copy is in progress.

Yes but then you could just switch it to read-only without stopping.


* Re: replace disk in raid5 without linux noticing?
  2006-04-22 17:48         ` Carlos Carvalho
@ 2006-04-23 16:43           ` Martin Cracauer
  0 siblings, 0 replies; 12+ messages in thread
From: Martin Cracauer @ 2006-04-23 16:43 UTC (permalink / raw)
  To: Carlos Carvalho; +Cc: Martin Cracauer, linux-raid

Carlos Carvalho wrote on Sat, Apr 22, 2006 at 02:48:23PM -0300: 
> Martin Cracauer (cracauer@cons.org) wrote on 22 April 2006 11:08:
>  >> stop the array
>  >> dd warning disk => new one
>  >> remove warning disk
>  >> assemble the array again with the new disk
>  >> 
>  >> The inconvenience is that you don't have the array during the copy.
>  >
>  >Stopping the array and restarting it as readonly will give you access
>  >to the data while that copy is in progress.
> 
> Yes but then you could just switch it to read-only without stopping.

I believe that would work for the whole operation: make the filesystem
read-only, then the md device read-only, copy the disk, and then unmount
and stop the md device to restart it with the new disk.

If the final disk change involves a powerdown and putting the new disk
on the physical interface that the old one was on it should be
transparent.
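
The sequence above can be written down as a dry-run plan. The sketch below
only prints commands and runs nothing. The device names, the mount point
and the function name readonly_swap_plan are all hypothetical; the plan
also assumes the old disk is disconnected before the final assemble, since
the clone carries the same superblock.

```python
import shlex


def readonly_swap_plan(md, mountpoint, old, new, others):
    """Return the command sequence for a read-only clone and swap."""
    return [
        ["mount", "-o", "remount,ro", mountpoint],  # filesystem read-only
        ["mdadm", "--readonly", md],                # md device read-only
        ["dd", f"if={old}", f"of={new}", "bs=1M"],  # copy the disk
        ["umount", mountpoint],
        ["mdadm", "--stop", md],
        # The old disk is assumed to be physically removed at this point.
        ["mdadm", "--assemble", md, *others, new],
    ]


plan = readonly_swap_plan(
    md="/dev/md0",
    mountpoint="/mnt/raid",
    old="/dev/sdb1",
    new="/dev/sdd1",
    others=["/dev/sda1", "/dev/sdc1"],
)
for command in plan:
    print(shlex.join(command))
```

Keeping the plan as data makes it easy to review each step before anything
touches the array, which speaks to the concern about fragile manual steps.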

%%

BTW, last time I tested a Linux software RAID-5 by ripping out an
active disk, I noticed that while the filesystem stayed up and usable,
a system call that was in progress at the time never returned and
blocked forever.

Is that a known behaviour?

Martin
