* Replace drive in RAID5 without losing redundancy?
@ 2007-03-05 10:55 Ralf Müller
2007-03-05 22:29 ` Neil Brown
0 siblings, 1 reply; 7+ messages in thread
From: Ralf Müller @ 2007-03-05 10:55 UTC (permalink / raw)
To: linux-raid
Hi
The day before I grew a RAID5 of four 300GB disks by replacing the 300GB
drives with 750GB ones. As far as I can see, the recommended way to do that
is to kick a drive out of the array and let a spare drive take over. For
sensitive data this is scary - at least for me - because I lose
redundancy for the whole duration of the rebuild. Everything worked fine, so
no harm was done, but is there a way to replace a disk without losing
redundancy?
Is it possible to mark a disk as "to be replaced by an existing spare",
then migrate to the spare disk and kick the old disk _after_ migration
has been done? Or not even kick - but mark as new spare.
This feature would be even more interesting for partially degraded
arrays with read problems on different disks that do not affect the
same RAID stripe (so far I have seen one array in such a state, and it
was able to recover by automatically rewriting the unreadable
sectors). The point is - I would have felt better if I had been able to
mark the old disks read-only and migrate disk by disk without losing
the partially damaged redundancy.
Best Regards
Ralf Mueller
--
Van Roy's Law: -------------------------------------------------------
An unbreakable toy is useful for breaking other toys.
* Re: Replace drive in RAID5 without losing redundancy?
From: Neil Brown @ 2007-03-05 22:29 UTC (permalink / raw)
To: Ralf Müller; +Cc: linux-raid
On Monday March 5, ralf@bj-ig.de wrote:
>
> Is it possible to mark a disk as "to be replaced by an existing spare",
> then migrate to the spare disk and kick the old disk _after_ migration
> has been done? Or not even kick - but mark as new spare.
No, this is not possible yet.
You can get nearly all the way there by:
- add an internal bitmap.
- fail one drive
- --build a raid1 with that drive (and the other missing)
- re-add the raid1 into the raid5
- add the new drive to the raid1
- wait for resync
This has just a very tiny window when the array is degraded. The
bitmap allows the re-added drive to be resynced very quickly.
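The six steps above can be sketched as a dry-run plan. This is only a sketch under stated assumptions: the device names (/dev/md0 for the raid5, /dev/md1 for the temporary raid1, /dev/sdb1 for the old drive, /dev/sde1 for the new one) and the helper name plan_replacement are hypothetical, and the function only builds the mdadm command lines, it does not run them. The options used (--grow --bitmap=internal, --fail, --remove, --build, --re-add, --add) are standard mdadm options, but check them against the mdadm version in use before trying this on real data.

```python
def plan_replacement(array, old_dev, new_dev, mirror):
    """Return the commands for the six steps, in order, as argv lists.

    All device names are supplied by the caller (hypothetical in the
    example below). Nothing is executed here.
    """
    return [
        # 1. add an internal write-intent bitmap to the raid5
        ["mdadm", "--grow", array, "--bitmap=internal"],
        # 2. fail the drive to be replaced and take it out of the raid5
        ["mdadm", array, "--fail", old_dev, "--remove", old_dev],
        # 3. --build a superblock-less raid1 from that drive, other slot missing
        ["mdadm", "--build", mirror, "--level=1", "--raid-devices=2",
         old_dev, "missing"],
        # 4. re-add the raid1 to the raid5; the bitmap keeps this resync short
        ["mdadm", array, "--re-add", mirror],
        # 5. add the new drive to the raid1, which starts the mirror resync
        ["mdadm", mirror, "--add", new_dev],
        # 6. wait for resync: progress is visible in /proc/mdstat
        ["cat", "/proc/mdstat"],
    ]


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for argv in plan_replacement("/dev/md0", "/dev/sdb1", "/dev/sde1", "/dev/md1"):
        print(" ".join(argv))
```

Printing the plan first and running each command by hand keeps the short degraded window between steps 2 and 4 under the operator's control.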
The next step should be:
- fail out the raid1
- disassemble the raid1
- re-add the new drive to the raid5
However that won't work as the superblock will be in the wrong place
on the new drive.
It wouldn't be hard to relocate the superblock, but there currently is
no code to do this.
NeilBrown
* Re: Replace drive in RAID5 without losing redundancy?
From: dean gaudet @ 2007-03-06 7:37 UTC (permalink / raw)
To: Neil Brown; +Cc: Ralf Müller, linux-raid
On Tue, 6 Mar 2007, Neil Brown wrote:
> On Monday March 5, ralf@bj-ig.de wrote:
> >
> > Is it possible to mark a disk as "to be replaced by an existing spare",
> > then migrate to the spare disk and kick the old disk _after_ migration
> > has been done? Or not even kick - but mark as new spare.
>
> No, this is not possible yet.
> You can get nearly all the way there by:
>
> - add an internal bitmap.
> - fail one drive
> - --build a raid1 with that drive (and the other missing)
> - re-add the raid1 into the raid5
> - add the new drive to the raid1
> - wait for resync
i have an example at
<http://arctic.org/~dean/proactive-raid5-disk-replacement.txt>... plus
discussion as to why this isn't the best solution.
-dean
* Re: Replace drive in RAID5 without losing redundancy?
From: Ralf Müller @ 2007-03-06 12:02 UTC (permalink / raw)
To: dean gaudet; +Cc: Neil Brown, linux-raid
On 06.03.2007 at 08:37, dean gaudet wrote:
>
> On Tue, 6 Mar 2007, Neil Brown wrote:
>
>> On Monday March 5, ralf@bj-ig.de wrote:
>>>
>>> Is it possible to mark a disk as "to be replaced by an existing spare",
>>> then migrate to the spare disk and kick the old disk _after_ migration
>>> has been done? Or not even kick - but mark as new spare.
>>
>> No, this is not possible yet.
>> You can get nearly all the way there by:
>>
>> - add an internal bitmap.
>> - fail one drive
>> - --build a raid1 with that drive (and the other missing)
>> - re-add the raid1 into the raid5
>> - add the new drive to the raid1
>> - wait for resync
>
> i have an example at
> <http://arctic.org/~dean/proactive-raid5-disk-replacement.txt>... plus
> discussion as to why this isn't the best solution.
Oops - you are right. While the raid1 is resyncing there is no redundancy
for the disk being replaced, and therefore I lose a lot of redundancy
in the raid5 above it. When I replace a disk that is in a good state,
everything is fine. When I replace one with read problems, I simply
lose.
Not as good as I first thought ...
A built-in solution would be able to avoid such problems, right?
Ralf Mueller
--
Van Roy's Law: -------------------------------------------------------
An unbreakable toy is useful for breaking other toys.
* Re: Replace drive in RAID5 without losing redundancy?
From: Bill Davidsen @ 2007-03-07 15:14 UTC (permalink / raw)
To: Neil Brown; +Cc: Ralf Müller, linux-raid
Neil Brown wrote:
> On Monday March 5, ralf@bj-ig.de wrote:
>
>> Is it possible to mark a disk as "to be replaced by an existing spare",
>> then migrate to the spare disk and kick the old disk _after_ migration
>> has been done? Or not even kick - but mark as new spare.
>>
>
> No, this is not possible yet.
I was thinking about this, and looked at the code a bit. It would seem
(to someone who doesn't have to write the code) that the capabilities are
all there. There is rebuild on spare, write-mostly, and therefore a
"migrate" might be done by just using the pieces cleverly.
What I was thinking is to trigger the code to rebuild on spare, but to
only rebuild the data from parity if the drive being replaced were
unreadable. That should make the process run much faster. Any write
would present a different problem: I would say the data would go to the
new drive, and if a bitmap was enabled the bit would be set for the old
drive but no write done. If the new drive failed during the migrate, the
old drive could then be resynced. Without a bitmap you DO want to write
the old drive if it's good, DON'T if you think it's failing.
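The read side of this idea can be sketched as a toy model. This is only an illustration under loud assumptions, not md code: blocks are small integers, parity always sits in the last column (real RAID5 rotates it, which does not change the XOR arithmetic), and the helper names make_stripes and migrate are hypothetical. Each block is copied straight from the old drive when it is readable, and rebuilt from the other members plus parity only when it is not.

```python
from functools import reduce
from operator import xor


def make_stripes(data_rows):
    """Append a parity block (XOR of the data blocks) to every stripe."""
    return [row + [reduce(xor, row)] for row in data_rows]


def migrate(stripes, old_idx, unreadable):
    """Copy member old_idx to a new drive, stripe by stripe.

    unreadable is the set of stripe numbers the old drive cannot read.
    Returns (blocks written to the new drive, number rebuilt from parity).
    """
    new_drive, rebuilt = [], 0
    for n, stripe in enumerate(stripes):
        if n in unreadable:
            # Slow path: XOR of all other members reproduces the lost block.
            others = [b for i, b in enumerate(stripe) if i != old_idx]
            new_drive.append(reduce(xor, others))
            rebuilt += 1
        else:
            # Fast path: plain copy from the drive being replaced.
            new_drive.append(stripe[old_idx])
    return new_drive, rebuilt


if __name__ == "__main__":
    stripes = make_stripes([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
    # Member 1 is being replaced; its block in stripe 1 is unreadable.
    print(migrate(stripes, old_idx=1, unreadable={1}))
```

In this model the array keeps full redundancy for every stripe the old drive can still read, which is the property the thread is asking for.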
Thoughts on this?
--
bill davidsen <davidsen@tmr.com>
CTO TMR Associates, Inc
Doing interesting things with small computers since 1979
* Re: Replace drive in RAID5 without losing redundancy?
From: Ralf Müller @ 2007-03-08 14:56 UTC (permalink / raw)
To: Bill Davidsen; +Cc: Neil Brown, linux-raid
On 07.03.2007 at 16:14, Bill Davidsen wrote:
> Neil Brown wrote:
>> On Monday March 5, ralf@bj-ig.de wrote:
>>> Is it possible to mark a disk as "to be replaced by an existing spare",
>>> then migrate to the spare disk and kick the old disk _after_ migration
>>> has been done? Or not even kick - but mark as new spare.
>>
>> No, this is not possible yet.
>
> I was thinking about this, and looked at the code a bit.
> [ ... ]
> What I was thinking is to trigger the code to rebuild on spare, but
> to only rebuild the data from parity if the drive being replaced
> were unreadable.
Sounds good to me ...
> That should make the process run much faster. Any write would
> present a different problem, I would say the data would go to the
> new drive, and if bitmap was enabled the bit would be set for the
> old drive but no write done.
But then you need to copy the whole block that is covered by a bit in
the bitmap to the new drive on every write that is done to the raid -
am I right? For small random writes this could be a real performance hit.
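The size of that hit can be estimated with simple arithmetic. The sketch below assumes the worst case described above: every small write lands in a different bitmap chunk and forces a copy of that whole chunk. Whether md would really have to behave this way is the open question in this mail, and the 4 MiB bitmap chunk and 4 KiB write size are hypothetical figures chosen only for illustration.

```python
def worst_case_copy(bitmap_chunk, write_size, n_writes):
    """Return (bytes written by the user, bytes copied, amplification factor).

    Assumes every write hits a distinct bitmap chunk and each such write
    forces a copy of the whole chunk to the new drive.
    """
    written = write_size * n_writes
    copied = bitmap_chunk * n_writes
    return written, copied, copied // written


if __name__ == "__main__":
    MiB, KiB = 1024 * 1024, 1024
    # Hypothetical sizes: 4 MiB per bitmap bit, 4 KiB random writes.
    print(worst_case_copy(4 * MiB, 4 * KiB, 1000))
```

Under these assumptions the copy traffic is about a thousand times the user's write traffic, so the cost depends heavily on how coarse the bitmap is.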
> If the new drive failed during the migrate the old drive could then
> be resynced. Without a bitmap you DO want to write the old drive if
> it's good, DON'T if you think it's failing.
Without a bitmap I would suggest trying to write to the old _and_ the
new drive.
In case of a write error on the old drive it seems best to me not to
kick it - as would normally be done - but to ignore the error and hope
that the next time one tries to write this sector it is still not
writable (just in case we have to stop the migration for any reason).
Or is there any way to mark a sector as invalid on a raid without a
bitmap?
If the new drive fails, it should be kicked as usual ...
Ok ... for someone who has no idea what the raid code actually does this
was quite a lot of text ... hopefully it was not only noise.
Ralf
* Re: Replace drive in RAID5 without losing redundancy?
From: Ralf Müller @ 2007-03-06 9:18 UTC (permalink / raw)
To: Neil Brown
On 05.03.2007 at 23:29, Neil Brown wrote:
> On Monday March 5, ralf@bj-ig.de wrote:
>>
>> Is it possible to mark a disk as "to be replaced by an existing spare",
>> then migrate to the spare disk and kick the old disk _after_ migration
>> has been done? Or not even kick - but mark as new spare.
>
> No, this is not possible yet.
It's a pity - this would be a cool feature. It would take a lot of the
worry out of RAID5 maintenance. Even when you have a backup, it takes
hours upon hours to restore the data from it when a big RAID crashes.
Do you know if such a thing is on a ToDo list somewhere?
> You can get nearly all the way there by:
>
> - add an internal bitmap.
> [ ... ]
> - disassemble the raid1
> - re-add the new drive to the raid5
> However that won't work as the superblock will be in the wrong place
> on the new drive.
> It wouldn't be hard to relocate the superblock, but there currently is
> no code to do this.
There has been an idea in private mail to initially create the RAID5 out
of single-disk RAID1s. For migration one could add a disk to one of the
RAID1s, let it sync and fail out the old disk. The open question was
whether a RAID1 with only one disk would propagate read errors to the
RAID5 above, so that the RAID5 is able to rewrite the sector, or would
simply fail the single drive left.
Best regards
Ralf Mueller
--
Van Roy's Law: -------------------------------------------------------
An unbreakable toy is useful for breaking other toys.