* Reducing the number of devices in a degraded RAID-5
From: Andreas Klauer @ 2017-05-22 12:53 UTC
  To: linux-raid

Hi,

this is not a recovery question, no real data involved. Thanks for helping!

Suppose you have a failing drive in RAID-5 but you wanted to move to 
fewer drives anyway, so one way or another you're going to reduce 
the number of drives in your RAID.

Given a RAID 5 with 5 drives            [UUUUU]
Reducing it by one drive results in     [UUUU] + Spare
Okay.

Given a degraded RAID 5 with 5 drives   [_UUUU]
Reducing it by one drive results in     [_UUU] + Spare
Still okay? Rebuild must be started manually.

It seems reducing a degraded RAID is a bad idea, 
since there is no redundancy for a very long time.

So what you might end up doing is a three step process:

-> [_UUUU] (Degraded)

Step 1: Add another drive (redundancy first)

-> [UUUUU]
    ^ added drive

Step 2: Reduce by one drive

-> [UUUU] + Spare

Step 3: --replace the previously added drive
        (if the spare happened to be one of the drives you wanted to keep)

-> [UUUU]
    ^ former spare

This way you keep redundancy throughout, but it takes a very long time: 
three separate rebuild/reshape passes instead of just one.
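
For reference, the three steps as mdadm commands might look roughly like 
the sketch below. The device names (/dev/mdX, /dev/newdisk, /dev/keepdisk), 
the backup-file path and the new array size are placeholders, not taken 
from a real array:

 # Step 1: restore redundancy first (recovery onto the new disk)
 mdadm /dev/mdX --add /dev/newdisk
 # Step 2: shrink the exported size, then reshape to one fewer device
 mdadm --grow /dev/mdX --array-size=<new size>
 mdadm --grow /dev/mdX --backup-file=/root/mdX.backup --raid-devices=4
 # Step 3: replace the temporarily added disk with the disk that
 # became the spare in Step 2 (if it is one you want to keep)
 mdadm /dev/mdX --replace /dev/newdisk --with /dev/keepdisk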

Steps to reproduce the [_UUUU] -> [_UUU] + Spare case:
(using linux 4.10, mdadm 4.0)

# truncate -s 100M 1.img 2.img 3.img 4.img
# devices=$(for f in ?.img; do losetup --find --show "$f"; done)
# mdadm --create /dev/md42 --level=5 --raid-devices=5 missing $devices
md42 : active raid5 loop4[4] loop3[3] loop2[2] loop1[1]
      405504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [_UUUU]
# mdadm --grow /dev/md42 --array-size 304128
# mdadm --grow /dev/md42 --backup-file=md42.backup --raid-devices=4
md42 : active raid5 loop4[4](S) loop3[3] loop2[2] loop1[1]
      304128 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
# not rebuilding until you re-add the spare
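
(Where the 304128 comes from: the 5-device array exports 405504 blocks 
across 4 data members, i.e. 101376 blocks per member, so the same members 
as a 4-device RAID-5 hold 3 x 101376 = 304128 blocks. The --array-size 
shrink has to come first, otherwise mdadm refuses the --raid-devices 
reduction because it would truncate the array.)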

Is it possible to do [_UUUU] -> [UUUU] in a single step?
I haven't found a way. Any ideas?

Regards
Andreas Klauer


* Re: Reducing the number of devices in a degraded RAID-5
From: NeilBrown @ 2017-05-24  2:12 UTC
  To: Andreas Klauer, linux-raid


On Mon, May 22 2017, Andreas Klauer wrote:

> Hi,
>
> this is not a recovery question, no real data involved. Thanks for helping!
>
> Suppose you have a failing drive in RAID-5 but you wanted to move to 
> fewer drives anyway, so one way or another you're going to reduce 
> the number of drives in your RAID.
>
> Given a RAID 5 with 5 drives            [UUUUU]
> Reducing it by one drive results in     [UUUU] + Spare
> Okay.
>
> Given a degraded RAID 5 with 5 drives   [_UUUU]
> Reducing it by one drive results in     [_UUU] + Spare
> Still okay? Rebuild must be started manually.
>
> It seems reducing a degraded RAID is a bad idea, 
> since there is no redundancy for a very long time.
>
> So what you might end up doing is a three step process:
>
> -> [_UUUU] (Degraded)
>
> Step 1: Add another drive (redundancy first)
>
> -> [UUUUU]
>     ^ added drive
>
> Step 2: Reduce by one drive
>
> -> [UUUU] + Spare
>
> Step 3: --replace the previously added drive
>         (if the spare happened to be one of the drives you wanted to keep)
>
> -> [UUUU]
>     ^ former spare
>
> This way you keep redundancy throughout, but it takes a very long time: 
> three separate rebuild/reshape passes instead of just one.
>
> Steps to reproduce the [_UUUU] -> [_UUU] + Spare case:
> (using linux 4.10, mdadm 4.0)
>
> # truncate -s 100M 1.img 2.img 3.img 4.img
> # devices=$(for f in ?.img; do losetup --find --show "$f"; done)
> # mdadm --create /dev/md42 --level=5 --raid-devices=5 missing $devices
> md42 : active raid5 loop4[4] loop3[3] loop2[2] loop1[1]
>       405504 blocks super 1.2 level 5, 512k chunk, algorithm 2 [5/4] [_UUUU]
> # mdadm --grow /dev/md42 --array-size 304128
> # mdadm --grow /dev/md42 --backup-file=md42.backup --raid-devices=4
> md42 : active raid5 loop4[4](S) loop3[3] loop2[2] loop1[1]
>       304128 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [_UUU]
> # not rebuilding until you re-add the spare
>
> Is it possible to do [_UUUU] -> [UUUU] in a single step?
> I haven't found a way. Any ideas?

I had hoped that
  mdadm --grow /dev/md42 --backup-file=... --raid-devices=4 --add /dev/loop4

would work, but it doesn't.
What does work is:
 # start with a degraded array, device 0 missing
 mdadm --grow /dev/md42 --array-size=.....
 echo frozen > /sys/block/md42/md/sync_action
 mdadm /dev/md42 --add /dev/loop0
 echo 0 > /sys/block/md42/md/dev-loop0/slot
 mdadm --grow /dev/md42 --backup-file=... --raid-devices=4
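
Spelled out with comments, using the sizes and backup-file name from the 
loop-device example earlier in the thread, and a fifth loop device 
/dev/loop0 standing in for the missing slot (a sketch, not a recipe):

 # shrink the exported size so the data fits on 3 data members
 mdadm --grow /dev/md42 --array-size=304128
 # freeze recovery so nothing starts while the new disk is added
 echo frozen > /sys/block/md42/md/sync_action
 # the new disk goes in as a spare for now
 mdadm /dev/md42 --add /dev/loop0
 # move it from spare into the missing slot 0
 echo 0 > /sys/block/md42/md/dev-loop0/slot
 # reshape down to 4 devices; this one pass also populates slot 0
 mdadm --grow /dev/md42 --backup-file=md42.backup --raid-devices=4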


NeilBrown




* Re: Reducing the number of devices in a degraded RAID-5
From: Andreas Klauer @ 2017-05-25  7:24 UTC
  To: NeilBrown; +Cc: linux-raid

On Wed, May 24, 2017 at 12:12:32PM +1000, NeilBrown wrote:
> What does work is:
>  # start with a degraded array, device 0 missing
>  mdadm --grow /dev/md42 --array-size=.....
>  echo frozen > /sys/block/md42/md/sync_action
>  mdadm /dev/md42 --add /dev/loop0
>  echo 0 > /sys/block/md42/md/dev-loop0/slot
>  mdadm --grow /dev/md42 --backup-file=... --raid-devices=4

Wow. Thanks.

This should merge Step 1+2, but not Step 3, right?

I really need to take a closer look at the things in /sys/.../md/...

Seems like you can do great things with it... terrible, yes, but great.
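
For anyone else following along, the kind of thing I mean (attribute names 
from md's sysfs interface; the exact set varies with kernel version, so 
treat this as a sketch):

 ls /sys/block/md42/md/
 cat /sys/block/md42/md/sync_action
 cat /sys/block/md42/md/raid_disks
 cat /sys/block/md42/md/degraded
 cat /sys/block/md42/md/dev-loop1/slot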

(
  Step 3 would be turning slot 4 to-be-spare into slot 0, 
  without --add ing another device at all.

  That's what would happen if /dev/loop0 was actually backed 
  by and thus identical with the slot4 device.

  But that's playing dirty.
)

Regards
Andreas Klauer


* Re: Reducing the number of devices in a degraded RAID-5
From: NeilBrown @ 2017-05-26  5:13 UTC
  To: Andreas Klauer; +Cc: linux-raid


On Thu, May 25 2017, Andreas Klauer wrote:

> On Wed, May 24, 2017 at 12:12:32PM +1000, NeilBrown wrote:
>> What does work is:
>>  # start with a degraded array, device 0 missing
>>  mdadm --grow /dev/md42 --array-size=.....
>>  echo frozen > /sys/block/md42/md/sync_action
>>  mdadm /dev/md42 --add /dev/loop0
>>  echo 0 > /sys/block/md42/md/dev-loop0/slot
>>  mdadm --grow /dev/md42 --backup-file=... --raid-devices=4
>
> Wow. Thanks.
>
> This should merge Step 1+2, but not Step 3, right?

Right.  Doing step 3 at the same time is not possible.

NeilBrown


>
> I really need to take a closer look at the things in /sys/.../md/...
>
> Seems like you can do great things with it... terrible, yes, but great.
>
> (
>   Step 3 would be turning slot 4 to-be-spare into slot 0, 
>   without --add ing another device at all.
>
>   That's what would happen if /dev/loop0 was actually backed 
>   by and thus identical with the slot4 device.
>
>   But that's playing dirty.
> )
>
> Regards
> Andreas Klauer


