recovering failed and unrecognizable RAID5 during mdadm --grow without backup

Linux RAID subsystem development
* recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Claudiu Rad @ 2016-05-12  6:22 UTC
  To: linux-raid

hello all,

i am a desperate guy who 'successfully' made a chain of mistakes 
leading to a real personal disaster. i need to recover as much of this 
as i can, as total data loss is really not acceptable.
the short story: i had a poorly performing 4x4TB RAID5 (full drives 
allocated to RAID5, apart from the small RAID1 partitions for boot) + 
LVM. after reading a few articles on the internet, i figured i should 
try some chunk size 'optimizations', and read that this can be done 
with my version of mdadm and my kernel (machine running debian 7.9).
the mistakes:

 1. no backup of 10TB of data. this is a remote rented server, and i
    didn't have any easy way to do backups
 2. i ran mdadm --grow -c 128 /dev/md2 and it complained about
    --backup-file. i ran the command again with the file placed in
    /root/...txt, which is on a partition inside the vg0 that fills
    /dev/md2, thus defeating the purpose. the chunk size had been
    automatically set to 512K before; i was trying to reduce it
 3. the command returned almost immediately. i had no idea that it
    would trigger a background process, although that is now obvious.
    i then tried to see what it had done, but after one ls, a second ls
    in the root partition hung. my web server panel (webmin) hung at
    'waiting for...'; i tried connecting to a new shell, and after i
    provided credentials it hung with no cursor. i thought that my
    always-running monitoring system and some other constant I/O
    processes running with higher priority were clogging the system,
    which now had lower throughput due to the parameter change, and that
    all I/O capacity was used up because of this and maybe my
    experiments with the scheduler. the nginx webserver actually seemed
    to be working properly, and it had nice -10 attached, which led me
    to this conclusion. another mistake
 4. after a few minutes of an unresponsive machine, i decided to send a
    soft CTRL+ALT+DELETE restart signal from the datacenter control
    panel, but apparently it didn't work, so i decided there was no way
    out of this situation other than a hard restart (system reset).
    this was my final and biggest mistake, not knowing that the array
    was reshaping. the system won't boot, and the datacenter's rescue
    (network boot) system can't see/assemble the /dev/md2 array

i guess i really did my best to destroy a working array (working, apart 
from my not being satisfied with its performance and its apparent 
degradation over time). in the rescue system, this is what i see so far:


root@rescue ~ # mdadm --detail --scan
ARRAY /dev/md/0 metadata=1.2 name=rescue:0 UUID=63b58acc:19623c52:c1134929:5d592d29
ARRAY /dev/md/1 metadata=1.2 name=rescue:1 UUID=94713b26:3eca44bc:dee330c8:23443240

root@rescue ~ # mdadm --examine --scan
ARRAY /dev/md/0  metadata=1.2 UUID=63b58acc:19623c52:c1134929:5d592d29 name=rescue:0
ARRAY /dev/md/1  metadata=1.2 UUID=94713b26:3eca44bc:dee330c8:23443240 name=rescue:1
ARRAY /dev/md/2  metadata=1.2 UUID=a935894f:be435fc0:589c1c7f:d5454b43 name=rescue:2
(so here the array appears)

root@rescue ~ # cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sda2[0] sdd2[3] sdc2[2] sdb2[1]
       523968 blocks super 1.2 [4/4] [UUUU]
md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
       16768896 blocks super 1.2 [4/4] [UUUU]

root@rescue ~ # mdadm --assemble --scan
mdadm: /dev/md/0 has been started with 4 drives.
mdadm: /dev/md/1 has been started with 4 drives.
mdadm: Failed to restore critical section for reshape, sorry.
        Possibly you needed to specify the --backup-file
Segmentation fault
(this segmentation fault is weird)

root@rescue ~ # mdadm --assemble --scan --invalid-backup
mdadm: /dev/md/2: Need a backup file to complete reshape of this array.
mdadm: Please provided one with "--backup-file=..."

root@rescue ~ # mdadm -V
mdadm - v3.3.2 - 21st August 2014


now.. what is the best thing i can do to recover as much of my array as 
possible? the backup file is trapped inside the / partition in the vg0 
in the array. i estimate the --grow had been running for about 10 
minutes when i forced the reboot. how can this be reconstructed 
properly? i have broken it enough, and i don't want to make any other 
move without asking experts.

please, help. this is my greatest nightmare :(

-- 
Claudiu


* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Phil Turmel @ 2016-05-12 18:58 UTC
  To: Claudiu Rad, linux-raid

On 05/12/2016 02:22 AM, Claudiu Rad wrote:
> hello all,

Please show the examine for the individual partitions of the raid5:

mdadm --examine /dev/sd[a-d]3

{ Replace the '3' if appropriate.  You don't say what partition numbers
your raid5 is on. }

You will need to manually assemble (not create !) your array with a
backup file outside the raid5, and the --invalid-backup option to
abandon the backup file you can't get to.  You will likely have some
unavoidable corruption at the reshape position due to this.

Phil

* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Claudiu Rad-Lohanel @ 2016-05-12 20:09 UTC
  To: Phil Turmel, linux-raid



On 5/12/2016 9:58 PM, Phil Turmel wrote:
> Please show the examine for the individual partitions of the raid5:
>
> mdadm --examine /dev/sd[a-d]3
>

root@rescue ~ # mdadm --examine /dev/sd[a-d]3
/dev/sda3:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : a935894f:be435fc0:589c1c7f:d5454b43
            Name : rescue:2  (local to host rescue)
   Creation Time : Mon Apr 14 15:22:47 2014
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 7779167887 (3709.40 GiB 3982.93 GB)
      Array Size : 11668750848 (11128.19 GiB 11948.80 GB)
   Used Dev Size : 7779167232 (3709.40 GiB 3982.93 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=655 sectors
           State : active
     Device UUID : 9bd5271f:9cb24f1f:f27b2d29:71320066

   Reshape pos'n : 49152 (48.01 MiB 50.33 MB)
   New Chunksize : 64K

     Update Time : Wed May 11 16:19:38 2016
        Checksum : 286cd938 - correct
          Events : 11526

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 0
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdb3:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : a935894f:be435fc0:589c1c7f:d5454b43
            Name : rescue:2  (local to host rescue)
   Creation Time : Mon Apr 14 15:22:47 2014
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 7779167887 (3709.40 GiB 3982.93 GB)
      Array Size : 11668750848 (11128.19 GiB 11948.80 GB)
   Used Dev Size : 7779167232 (3709.40 GiB 3982.93 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=655 sectors
           State : active
     Device UUID : fe992c5f:cf125d01:9bb8e3f7:572aef37

   Reshape pos'n : 49152 (48.01 MiB 50.33 MB)
   New Chunksize : 64K

     Update Time : Wed May 11 16:19:38 2016
        Checksum : eb24325e - correct
          Events : 11526

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 1
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdc3:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : a935894f:be435fc0:589c1c7f:d5454b43
            Name : rescue:2  (local to host rescue)
   Creation Time : Mon Apr 14 15:22:47 2014
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 7779167887 (3709.40 GiB 3982.93 GB)
      Array Size : 11668750848 (11128.19 GiB 11948.80 GB)
   Used Dev Size : 7779167232 (3709.40 GiB 3982.93 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=655 sectors
           State : active
     Device UUID : 0eb93951:876cbbad:46c6004c:0101f3ca

   Reshape pos'n : 49152 (48.01 MiB 50.33 MB)
   New Chunksize : 64K

     Update Time : Wed May 11 16:19:38 2016
        Checksum : 70b08f7d - correct
          Events : 11526

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 2
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)
/dev/sdd3:
           Magic : a92b4efc
         Version : 1.2
     Feature Map : 0x4
      Array UUID : a935894f:be435fc0:589c1c7f:d5454b43
            Name : rescue:2  (local to host rescue)
   Creation Time : Mon Apr 14 15:22:47 2014
      Raid Level : raid5
    Raid Devices : 4

  Avail Dev Size : 7779167887 (3709.40 GiB 3982.93 GB)
      Array Size : 11668750848 (11128.19 GiB 11948.80 GB)
   Used Dev Size : 7779167232 (3709.40 GiB 3982.93 GB)
     Data Offset : 262144 sectors
    Super Offset : 8 sectors
    Unused Space : before=262064 sectors, after=655 sectors
           State : active
     Device UUID : 957d7ddb:dc6de4e7:feb6fb1f:7776adcc

   Reshape pos'n : 49152 (48.01 MiB 50.33 MB)
   New Chunksize : 64K

     Update Time : Wed May 11 16:19:38 2016
        Checksum : ad2bb8a - correct
          Events : 11526

          Layout : left-symmetric
      Chunk Size : 512K

    Device Role : Active device 3
    Array State : AAAA ('A' == active, '.' == missing, 'R' == replacing)

> You will need to manually assemble (not create !) your array with a
> backup file outside the raid5, and the --invalid-backup option to
> abandon the backup file you can't get to.  You will likely have some
> unavoidable corruption at the reshape position due to this.

i am waiting for your input on this and how to continue. it seems that i 
actually set the new chunk size to 64K, not 128K as i remembered. 
clearly i wasn't thinking straight when i did all this..
should i be worried that the reshape position is so close to the 
beginning of the volume? maybe the LVM vg0 metadata is lost? (i am just 
assuming, i don't know much about how and where LVM stores info about 
its volumes).
the backup file is there, inside the array; if i could reach it somehow 
i could feed it to mdadm and it would probably go well afterwards.

anyway, if only that data is lost, i don't care; what is really 
important are some LVM volumes probably placed much further inside the 
array.

thank you phil!

-- 
jazzman


* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Phil Turmel @ 2016-05-12 20:23 UTC
  To: Claudiu Rad-Lohanel, linux-raid

On 05/12/2016 04:09 PM, Claudiu Rad-Lohanel wrote:
> 
> 
> On 5/12/2016 9:58 PM, Phil Turmel wrote:
>> Please show the examine for the individual partitions of the raid5:
>>
>> mdadm --examine /dev/sd[a-d]3
>>
> 
> root@rescue ~ # mdadm --examine /dev/sd[a-d]3
> /dev/sda3:

Ok.  Nothing outlandish.
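
One of the fields worth comparing in that output is the event count, since mdadm generally expects the members' counts to agree before it will assemble without --force. A minimal sketch of such a check, assuming the --examine text is piped in on stdin (the function name distinct_events is hypothetical, not part of mdadm):

```shell
# Count how many different "Events" values appear in mdadm --examine output
# read from stdin. A result of 1 means every member reports the same count.
distinct_events() {
  awk '$1 == "Events" && $2 == ":" { print $3 }' | sort -u | wc -l | tr -d ' '
}

# Intended use on the rescue system (needs root and the real partitions):
#   mdadm --examine /dev/sd[a-d]3 | distinct_events
```

For the four members shown in this thread, all of which report Events : 11526, the result would be 1.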

>> You will need to manually assemble (not create !) your array with a
>> backup file outside the raid5, and the --invalid-backup option to
>> abandon the backup file you can't get to.  You will likely have some
>> unavoidable corruption at the reshape position due to this.
> 
> i am waiting for your input on this and how to continue. it seems that i
> actually set the new chunk size to 64K, not 128K as i remembered.
> clearly i wasn't thinking straight when i did all this..
> should i be worried that the reshape position is so close to the
> beginning of the volume? maybe the LVM vg0 metadata is lost? (i am just
> assuming, i don't know much about how and where LVM stores info about
> its volumes).

It just didn't get very far.
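
To put a rough number on that, the figures from the --examine output can be compared directly. This is only a sketch of the arithmetic, assuming both "Reshape pos'n" and "Array Size" are reported in KiB (49152 KiB is the 48 MiB that mdadm prints alongside) and that a 4-device RAID5 holds 3 data chunks per stripe; the variable names are illustrative:

```shell
reshape_kib=49152        # "Reshape pos'n" from --examine, assumed to be KiB
array_kib=11668750848    # "Array Size" from --examine, assumed to be KiB
data_disks=3             # 4-device RAID5: 3 data chunks + 1 parity per stripe

reshape_mib=$((reshape_kib / 1024))
per_disk_mib=$((reshape_kib / data_disks / 1024))
ppm=$((reshape_kib * 1000000 / array_kib))

echo "reshaped so far: ${reshape_mib} MiB of array space"
echo "per member device: ${per_disk_mib} MiB"
echo "fraction of the array: about ${ppm} parts per million"
```

Under those assumptions the interrupted reshape had covered 48 MiB of array address space, or 16 MiB on each member, out of roughly 11 TiB.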

> the backup file is there, inside the array; if i could reach it somehow
> i could feed it to mdadm and it would probably go well afterwards.

No way to get to it without assembling, and you can't assemble
error-free without it.  Sorry.

> anyway, if only that data is lost, i don't care; what is really
> important are some LVM volumes probably placed much further inside the
> array.

They are likely to be fine, then.

> thank you phil!

You're welcome.

You should mount your /boot array somewhere convenient, then:

mdadm -Av /dev/md2 --invalid-backup \
  --backup-file=/mount/path/to/boot/newbackupfile \
  /dev/sd[a-d]3

If that fails, repeat with the --force option included.  If that fails,
show us everything it prints out.
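
The sequence above (mount /boot, assemble, retry with --force) can be sketched as a dry run that only prints the commands for review. Several things here are assumptions rather than facts from the thread: that /dev/md1 (the small RAID1 in /proc/mdstat) is the /boot array, the mount point /mnt/boot, the file name md2-reshape.bak, and the use of /dev/md2, which is the array name given in the original post. The helper assemble_cmd is hypothetical.

```shell
# Dry run: print the commands instead of executing them.
BOOT_ARRAY=/dev/md1                      # assumption: the small RAID1 is /boot
BOOT_MNT=/mnt/boot                       # hypothetical mount point
BACKUP_FILE=$BOOT_MNT/md2-reshape.bak    # hypothetical name, outside the raid5

assemble_cmd() {
  # $1 is optional: pass --force for the second attempt.
  printf 'mdadm -Av /dev/md2 --invalid-backup %s--backup-file=%s /dev/sd[a-d]3\n' \
    "${1:+$1 }" "$BACKUP_FILE"
}

echo "mkdir -p $BOOT_MNT && mount $BOOT_ARRAY $BOOT_MNT"
assemble_cmd            # first attempt
assemble_cmd --force    # only if the first attempt fails
```

Printing first and running by hand keeps each step visible, which matters here because every real invocation writes to the member disks.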

If it succeeds, the reshape will be continuing in the background.  While
that is going on, you may mount the array and grab backups of the most
critical content.  Just in case :-)

It will probably take a very long time.  Look at /proc/mdstat to check
the progress.

Phil


[parent not found: <7cf56631-7909-6a92-f0b2-05dd02722ee8@misalpina.net>]

* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Phil Turmel @ 2016-05-13 14:04 UTC
  To: Claudiu Rad, linux-raid

On 05/12/2016 05:37 PM, Claudiu Rad wrote:

> how can i safely stop this reshape and assuming my / partition inside
> the array is sane enough restart the actual server normally after
> fsck-ing all volumes?

Well, your root is inside the array.  So you won't be able to boot
without the array assembling in the initramfs, which needs manual
intervention to supply the backup file.

If your hosting service provides access to the boot console, you could
do this with "rd.shell" or whatever kernel option tells your initramfs
to drop into a rescue console.  Then you could manually assemble and
resume booting.  This would be true even if you had properly specified a
backup file outside the array.

I suspect this was one of the motivations for the enhancement to modern
mdadm & kernels to avoid the backup file for many cases by manipulating
the data offset.  Such a reshape in progress can be automatically
assembled without special options during boot.  Plus it is faster -- one
third fewer I/O operations.

Phil

* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Phil Turmel @ 2016-05-13 14:11 UTC
  To: Claudiu Rad, linux-raid

On 05/13/2016 10:04 AM, Phil Turmel wrote:
> On 05/12/2016 05:37 PM, Claudiu Rad wrote:
> 
>> how can i safely stop this reshape and assuming my / partition inside
>> the array is sane enough restart the actual server normally after
>> fsck-ing all volumes?
> 
> Well, your root is inside the array.  So you won't be able to boot
> without the array assembling in the initramfs, which needs manual
> intervention to supply the backup file.

Actually, if it still isn't too far into the reshape, you could use
--revert-reshape.  Then it'll reshape what it has done so far back to
the original chunk size.  That might be quicker than finishing the
reshape.  Then you could reboot into your normal OS.

Whether you finish the reshape or revert it, no reshape may be in
progress when you boot into your normal OS, unless you can use the
initramfs console.

Phil

* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Claudiu Rad-Lohanel @ 2016-05-13 14:26 UTC
  To: Phil Turmel, linux-raid


On 5/13/2016 5:11 PM, Phil Turmel wrote:
> On 05/13/2016 10:04 AM, Phil Turmel wrote:
>> On 05/12/2016 05:37 PM, Claudiu Rad wrote:
>>
>>> how can i safely stop this reshape and assuming my / partition inside
>>> the array is sane enough restart the actual server normally after
>>> fsck-ing all volumes?
>> Well, your root is inside the array.  So you won't be able to boot
>> without the array assembling in the initramfs, which needs manual
>> intervention to supply the backup file.
> Actually, if it still isn't too far into the reshape, you could use
> --revert-reshape.  Then it'll reshape what it has done so far back to
> the original chunk size.  That might be quicker than finishing the
> reshape.  Then you could reboot into your normal OS.

this is interesting.

1. i can't find any documentation anywhere about the --revert-reshape 
option. what should the command look like? is it available in v3.3.2?

2. is it safe?

-- 
jazzman


* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Andreas Klauer @ 2016-05-13 14:39 UTC
  To: Claudiu Rad-Lohanel; +Cc: Phil Turmel, linux-raid

On Fri, May 13, 2016 at 05:26:02PM +0300, Claudiu Rad-Lohanel wrote:
> 1. i can't find any documentation anywhere about the --revert-reshape 
> option. what should the command look like? is it available in v3.3.2?

IIRC it's --assemble --update=revert-reshape and it's not mentioned 
in the man page...
 
> 2. is it safe?

Theoretically...
You could do this with overlays.

https://raid.wiki.kernel.org/index.php/Recovering_a_failed_software_RAID#Making_the_harddisks_read-only_using_an_overlay_file

That's what I'd also recommend when running fsck... 
all of these things write on your disks without providing an undo function.
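
A minimal sketch of the overlay idea from that wiki page, for one member: a sparse file attached to a loop device receives all writes, and a device-mapper snapshot target presents the combination as a new block device. The helper overlay_table, the file name overlay-sda3, the 4G size, and the mapping name sda3-overlay are hypothetical; the real steps are left as comments because they need root and the actual disks, and the overlay file must live on storage outside the array.

```shell
# Build the device-mapper table line for a copy-on-write overlay.
# Arguments: size in 512-byte sectors, origin device, loop device backing
# the overlay file. "P 8" asks for a persistent snapshot with 8-sector
# (4 KiB) chunks.
overlay_table() {
  printf '0 %s snapshot %s %s P 8\n' "$1" "$2" "$3"
}

# Per-member steps on the rescue system (not executed here):
#   truncate -s 4G overlay-sda3                 # sparse file; size is a guess
#   loop=$(losetup -f --show overlay-sda3)
#   size=$(blockdev --getsz /dev/sda3)
#   overlay_table "$size" /dev/sda3 "$loop" | dmsetup create sda3-overlay
# Then assemble from /dev/mapper/sda3-overlay etc. instead of /dev/sd[a-d]3.

overlay_table 2048 /dev/sda3 /dev/loop0   # example with a made-up size
```

With this in place, mdadm and fsck write into the overlay files, so a failed experiment can be discarded by removing the mappings and starting again.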

Regards
Andreas Klauer

* Re: recovering failed and unrecognizable RAID5 during mdadm --grow without backup
From: Claudiu Rad-Lohanel @ 2016-05-13 15:33 UTC
  To: Andreas Klauer; +Cc: Phil Turmel, linux-raid


On 5/13/2016 5:39 PM, Andreas Klauer wrote:
> On Fri, May 13, 2016 at 05:26:02PM +0300, Claudiu Rad-Lohanel wrote:
>> 1. i can't find any documentation anywhere about the --revert-reshape
>> option. what should the command look like? is it available in v3.3.2?
> IIRC it's --assemble --update=revert-reshape and it's not mentioned
> in the man page...

thanks. that seems to do it, but it hangs and doesn't want to advance:

root@rescue /mnt # cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 dm-1[0] dm-3[4] dm-2[2] dm-0[1]
       11668750848 blocks super 1.2 level 5, 64k chunk, algorithm 2 [4/4] [UUUU]
       [===================>.]  reshape = 99.9% (3889567232/3889583616) finish=7338.6min speed=0K/sec
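
The two numbers in parentheses show how much is actually left. A small sketch that pulls them out of the progress line and subtracts them, assuming they are per-device positions in KiB (the total is half of "Used Dev Size", which --examine reports in sectors):

```shell
# Progress line copied from /proc/mdstat above.
line='[===================>.]  reshape = 99.9% (3889567232/3889583616) finish=7338.6min speed=0K/sec'

# Extract "done total" from the (done/total) pair.
progress=$(printf '%s\n' "$line" | sed -n 's|.*(\([0-9]*\)/\([0-9]*\)).*|\1 \2|p')
done_kib=${progress% *}
total_kib=${progress#* }

remaining_kib=$((total_kib - done_kib))
remaining_mib=$((remaining_kib / 1024))
echo "remaining: ${remaining_kib} KiB (${remaining_mib} MiB) per device"
```

The remainder works out to 16384 KiB, or 16 MiB per device. That equals one third of the 48 MiB reshape position reported by --examine, which would be consistent with the revert only having to undo the part reshaped before the reset; the speed=0K/sec figure means even that small remainder is not moving.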


-- 
jazzman

