* 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
@ 2021-06-25 12:08 Jason Flood
2021-06-25 13:59 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: Jason Flood @ 2021-06-25 12:08 UTC (permalink / raw)
To: linux-raid
I started with a 4x4TB-disk RAID5 array and, over a few years, changed all
the drives to 8TB (WD Red - I hadn't seen the warnings before now, but it
looks like these ones are OK). I then successfully migrated it to RAID6, but
it was left with a non-standard layout, so I ran:
sudo mdadm --grow /dev/md0 --raid-devices=4
--backup-file=/root/raid5backup --layout=normalize
After a few days it reached 99% complete, but then the "hours remaining"
counter started counting up. A few days later I had to power the system down
before I could get a backup of the non-critical data (I couldn't get hold of
enough storage quickly enough, but it wouldn't be catastrophic to lose it),
and now the four drives are in standby, with the array thinking it is RAID0.
Running:
sudo mdadm --assemble /dev/md0 /dev/sd[bcde]
responds with:
mdadm: /dev/md0 assembled from 4 drives - not enough to start the
array while not clean - consider --force.
It appears to be similar to https://marc.info/?t=155492912100004&r=1&w=2,
but before trying --force I was considering using overlay files, as I'm not
sure of the risk of damage. The set-up process documented in the
"Recovering a damaged RAID" wiki article is excellent; however, the latter
part of the process isn't clear to me. If successful, are the overlay files
written back to the disks like a virtual machine snapshot, or is the process
stopped, the overlays removed and the process repeated, knowing that it now
has a low risk of damage?
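(For reference, the set-up half from the wiki boils down to roughly the
following per member disk - the paths, names and size here are only
placeholders for illustration:
truncate -s 8001563222016 /overlays/sdb.ovl    # sparse copy-on-write file
loop=$(losetup --find --show /overlays/sdb.ovl)
size=$(blockdev --getsz /dev/sdb)
echo "0 $size snapshot /dev/sdb $loop P 8" | dmsetup create sdb-overlay
and then assembling against the /dev/mapper/*-overlay devices instead of the
real disks, so all writes land in the sparse files and the disks stay
untouched.)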
System details follow. Thanks for any help.
=================================================================================
user@host:~$ uname -a
Linux conan 5.4.0-74-generic #83-Ubuntu SMP Sat May 8 02:35:39 UTC 2021
x86_64 x86_64 x86_64 GNU/Linux
=================================================================================
user@host:~$ mdadm --version
mdadm - v4.1 - 2018-10-01
=================================================================================
user@host:~$ sudo smartctl -H -i -l scterc /dev/sda
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-74-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Samsung based SSDs
Device Model: Samsung SSD 860 EVO M.2 250GB
Serial Number: S413NX0K707647T
LU WWN Device Id: 5 002538 e40528ae8
Firmware Version: RVT21B6Q
User Capacity: 250,059,350,016 bytes [250 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: M.2
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 20 10:44:10 2021 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SCT Error Recovery Control:
Read: Disabled
Write: Disabled
=================================================================================
user@host:~$ sudo smartctl -H -i -l scterc /dev/sdb
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-74-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD80EFAX-68KNBN0
Serial Number: VGJM3NXK
LU WWN Device Id: 5 000cca 0bee4dfda
Firmware Version: 81.00A81
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 20 10:44:10 2021 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
=================================================================================
user@host:~$ sudo smartctl -H -i -l scterc /dev/sdc
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-74-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: WDC WD80EFBX-68AZZN0
Serial Number: VRG5YT4K
LU WWN Device Id: 5 000cca 0c2c2b5a4
Firmware Version: 85.00A85
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Form Factor: 3.5 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 20 10:44:11 2021 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
=================================================================================
user@host:~$ sudo smartctl -H -i -l scterc /dev/sdd
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-74-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD80EFAX-68KNBN0
Serial Number: VAGV1WLL
LU WWN Device Id: 5 000cca 099cbd8be
Firmware Version: 81.00A81
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 20 10:44:12 2021 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
=================================================================================
user@host:~$ sudo smartctl -H -i -l scterc /dev/sde
smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.0-74-generic] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red
Device Model: WDC WD80EFAX-68LHPN0
Serial Number: 7SJ5W2KW
LU WWN Device Id: 5 000cca 252deda87
Firmware Version: 83.H0A83
User Capacity: 8,001,563,222,016 bytes [8.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Sun Jun 20 10:44:12 2021 AEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
SCT Error Recovery Control:
Read: 70 (7.0 seconds)
Write: 70 (7.0 seconds)
=================================================================================
user@host:~$ sudo mdadm --examine /dev/sdb
/dev/sdb:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xd
Array UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Name : Universe:0
Creation Time : Thu Jul 13 01:11:22 2017
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 15627794096 (7451.91 GiB 8001.43 GB)
Array Size : 15627793408 (14903.83 GiB 16002.86 GB)
Used Dev Size : 15627793408 (7451.91 GiB 8001.43 GB)
Data Offset : 259072 sectors
Super Offset : 8 sectors
Unused Space : before=258992 sectors, after=688 sectors
State : active
Device UUID : eee9201e:d9769906:b6dccda1:b1f35abe
Internal Bitmap : 8 sectors from superblock
Reshape pos'n : 39006208 (37.20 GiB 39.94 GB)
Delta Devices : -1 (5->4)
New Layout : left-symmetric
Update Time : Fri Jun 18 08:56:43 2021
Bad Block Log : 512 entries available at offset 24 sectors - bad blocks present.
Checksum : de1db60e - correct
Events : 184251
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 0
Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing)
=================================================================================
user@host:~$ sudo mdadm --examine /dev/sdc
/dev/sdc:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xd
Array UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Name : Universe:0
Creation Time : Thu Jul 13 01:11:22 2017
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 15627794096 (7451.91 GiB 8001.43 GB)
Array Size : 15627793408 (14903.83 GiB 16002.86 GB)
Used Dev Size : 15627793408 (7451.91 GiB 8001.43 GB)
Data Offset : 259072 sectors
Super Offset : 8 sectors
Unused Space : before=258992 sectors, after=688 sectors
State : active
Device UUID : 7ed45d83:84db8f79:e3aadf4b:a88212d1
Internal Bitmap : 8 sectors from superblock
Reshape pos'n : 39006208 (37.20 GiB 39.94 GB)
Delta Devices : -1 (5->4)
New Layout : left-symmetric
Update Time : Fri Jun 18 08:56:43 2021
Bad Block Log : 512 entries available at offset 24 sectors - bad blocks present.
Checksum : 731a6e9f - correct
Events : 184251
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 1
Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing)
=================================================================================
user@host:~$ sudo mdadm --examine /dev/sdd
/dev/sdd:
Magic : a92b4efc
Version : 1.2
Feature Map : 0xd
Array UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Name : Universe:0
Creation Time : Thu Jul 13 01:11:22 2017
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 15627794096 (7451.91 GiB 8001.43 GB)
Array Size : 15627793408 (14903.83 GiB 16002.86 GB)
Used Dev Size : 15627793408 (7451.91 GiB 8001.43 GB)
Data Offset : 259072 sectors
Super Offset : 8 sectors
Unused Space : before=258992 sectors, after=688 sectors
State : active
Device UUID : 015b3ea0:9b3a38d2:a860f58a:34c19985
Internal Bitmap : 8 sectors from superblock
Reshape pos'n : 39006208 (37.20 GiB 39.94 GB)
Delta Devices : -1 (5->4)
New Layout : left-symmetric
Update Time : Fri Jun 18 08:56:43 2021
Bad Block Log : 512 entries available at offset 24 sectors - bad blocks present.
Checksum : dc4048b8 - correct
Events : 184251
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 2
Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing)
=================================================================================
user@host:~$ sudo mdadm --examine /dev/sde
/dev/sde:
Magic : a92b4efc
Version : 1.2
Feature Map : 0x5
Array UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Name : Universe:0
Creation Time : Thu Jul 13 01:11:22 2017
Raid Level : raid6
Raid Devices : 4
Avail Dev Size : 15627794096 (7451.91 GiB 8001.43 GB)
Array Size : 15627793408 (14903.83 GiB 16002.86 GB)
Used Dev Size : 15627793408 (7451.91 GiB 8001.43 GB)
Data Offset : 259072 sectors
Super Offset : 8 sectors
Unused Space : before=258992 sectors, after=688 sectors
State : active
Device UUID : bf9e316b:5910c7ca:1fd799e3:41a349b3
Internal Bitmap : 8 sectors from superblock
Reshape pos'n : 39006208 (37.20 GiB 39.94 GB)
Delta Devices : -1 (5->4)
New Layout : left-symmetric
Update Time : Fri Jun 18 08:56:43 2021
Bad Block Log : 512 entries available at offset 24 sectors
Checksum : 2616ba80 - correct
Events : 184251
Layout : left-symmetric-6
Chunk Size : 512K
Device Role : Active device 3
Array State : AAAA. ('A' == active, '.' == missing, 'R' == replacing)
=================================================================================
user@host:~$ sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Raid Level : raid0
Total Devices : 4
Persistence : Superblock is persistent
State : inactive
Working Devices : 4
Delta Devices : -1, (1->0)
New Level : raid6
New Layout : left-symmetric
New Chunksize : 512K
Name : Universe:0
UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Events : 184251
Number Major Minor RaidDevice
- 8 64 - /dev/sde
- 8 32 - /dev/sdc
- 8 48 - /dev/sdd
- 8 16 - /dev/sdb
=================================================================================
user@host:~$ sudo lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 9.1M 1 loop /snap/canonical-livepatch/98
loop1 7:1 0 9.1M 1 loop /snap/canonical-livepatch/99
loop2 7:2 0 99.4M 1 loop /snap/core/11187
loop3 7:3 0 99.2M 1 loop /snap/core/11167
loop4 7:4 0 55.4M 1 loop /snap/core18/2066
loop5 7:5 0 70.4M 1 loop /snap/lxd/19647
loop7 7:7 0 217.5M 1 loop /snap/nextcloud/28088
loop8 7:8 0 67.6M 1 loop /snap/lxd/20326
loop9 7:9 0 217.5M 1 loop /snap/nextcloud/27920
loop10 7:10 0 55.5M 1 loop /snap/core18/2074
sda 8:0 0 232.9G 0 disk
├─sda1 8:1 0 512M 0 part /boot/efi
└─sda2 8:2 0 232.4G 0 part /
sdb 8:16 0 7.3T 0 disk
sdc 8:32 0 7.3T 0 disk
sdd 8:48 0 7.3T 0 disk
sde 8:64 0 7.3T 0 disk
=================================================================================
user@host:~$ cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : inactive sdc[7](S) sdb[6](S) sde[4](S) sdd[5](S)
31255588192 blocks super 1.2
=================================================================================
user@host:~$ sudo mdadm --stop /dev/md0
mdadm: stopped /dev/md0
user@host:~$ sudo mdadm --assemble /dev/md0 /dev/sd[bcde]
mdadm: /dev/md0 assembled from 4 drives - not enough to start the array
while not clean - consider --force.
* Re: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-25 12:08 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare Jason Flood
@ 2021-06-25 13:59 ` Phil Turmel
2021-06-26 11:09 ` Jason Flood
0 siblings, 1 reply; 7+ messages in thread
From: Phil Turmel @ 2021-06-25 13:59 UTC (permalink / raw)
To: Jason Flood, linux-raid
Good morning Jason,
Good report. Comments inline.
On 6/25/21 8:08 AM, Jason Flood wrote:
> I started with a 4x4TB disk RAID5 array and, over a few years changed all
> the drives to 8TB (WD Red - I hadn't seen the warnings before now, but it
> looks like these ones are OK). I then successfully migrated it to RAID6, but
> it then had a non-standard layout, so I ran:
> sudo mdadm --grow /dev/md0 --raid-devices=4
> --backup-file=/root/raid5backup --layout=normalize
Ugh. You don't have to use a backup file unless mdadm tells you to.
Now you are stuck with it.
> After a few days it reached 99% complete, but then the "hours remaining"
> counter started counting up. After a few days I had to power the system down
> before I could get a backup of the non-critical data (Couldn't get hold of
> enough storage quickly enough, but it wouldn't be catastrophic to lose it),
> and now the four drives are in standby, with the array thinking it is RAID0.
> Running:
> sudo mdadm --assemble /dev/md0 /dev/sd[bcde]
> responds with:
> mdadm: /dev/md0 assembled from 4 drives - not enough to start the
> array while not clean - consider --force.
You have to specify the backup file on assembly if a reshape using one
was interrupted.
> It appears to be similar to https://marc.info/?t=155492912100004&r=1&w=2,
> but before trying --force I was considering using overlay files as I'm not
> sure of the risk of damage. The set-up process that is documented in the "
> Recovering a damaged RAID" Wiki article is excellent, however the latter
> part of the process isn't clear to me. If successful, are the overlay files
> written to the disk like a virtual machine snapshot, or is the process
> stopped, the overlays removed and the process repeated, knowing that it now
> has a low risk of damage?
Using --force is very low risk on assembly. I would try it (without
overlays, and with backup file specified) before you do anything else.
Odds of success are high.
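Concretely, something along these lines (adjust the device list and the
backup file path to whatever you actually used for the grow):

sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde] \
    --backup-file=/root/raid5backup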
Also try the flags to treat the backup file as garbage if its contents
don't match what mdadm expects.
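If memory serves, the flag is --invalid-backup, i.e. roughly the same
command with it appended:

sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde] \
    --backup-file=/root/raid5backup --invalid-backup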
Report back here after the above.
> System details follow. Thanks for any help.
[details trimmed]
Your report of the details was excellent. Thanks for helping us help you.
Phil
* RE: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-25 13:59 ` Phil Turmel
@ 2021-06-26 11:09 ` Jason Flood
2021-06-26 13:13 ` antlists
0 siblings, 1 reply; 7+ messages in thread
From: Jason Flood @ 2021-06-26 11:09 UTC (permalink / raw)
To: 'Phil Turmel', linux-raid
Thanks for that, Phil - I think I'm starting to piece it all together now. I was going from a 4-disk RAID5 to 4-disk RAID6, so from my reading the backup file was recommended. The non-standard layout meant that the array had over 20TB usable, but standardising the layout reduced that to 16TB. In that case the reshape starts at the end so the critical section (and so the backup file) may have been in progress at the 99% complete point when it failed, hence the need to specify the backup file for the assemble command.
I ran "sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde] --backup-file=/root/raid5backup":
mdadm: looking for devices for /dev/md0
mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
mdadm: /dev/sdc is identified as a member of /dev/md0, slot 1.
mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
mdadm: Marking array /dev/md0 as 'clean'
mdadm: /dev/md0 has an active reshape - checking if critical section needs to be restored
mdadm: No backup metadata on /root/raid5backup
mdadm: added /dev/sdc to /dev/md0 as 1
mdadm: added /dev/sdd to /dev/md0 as 2
mdadm: added /dev/sde to /dev/md0 as 3
mdadm: no uptodate device for slot 4 of /dev/md0
mdadm: added /dev/sdb to /dev/md0 as 0
mdadm: Need to backup 3072K of critical section..
mdadm: /dev/md0 has been started with 4 drives (out of 5).
=============================================================
sudo mdadm --detail /dev/md0
/dev/md0:
Version : 1.2
Creation Time : Thu Jul 13 01:11:22 2017
Raid Level : raid6
Array Size : 15627793408 (14903.83 GiB 16002.86 GB)
Used Dev Size : 7813896704 (7451.91 GiB 8001.43 GB)
Raid Devices : 4
Total Devices : 4
Persistence : Superblock is persistent
Intent Bitmap : Internal
Update Time : Sat Jun 26 19:40:16 2021
State : clean, reshaping
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Layout : left-symmetric-6
Chunk Size : 512K
Consistency Policy : bitmap
Reshape Status : 99% complete
Delta Devices : -1, (5->4)
New Layout : left-symmetric
Name : Universe:0
UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
Events : 184255
Number Major Minor RaidDevice State
6 8 16 0 active sync /dev/sdb
7 8 32 1 active sync /dev/sdc
5 8 48 2 active sync /dev/sdd
4 8 64 3 active sync /dev/sde
=============================================================
cat /proc/mdstat
Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10]
md0 : active raid6 sdb[6] sde[4] sdd[5] sdc[7]
15627793408 blocks super 1.2 level 6, 512k chunk, algorithm 18 [4/3] [UUUU]
[===================>.] reshape = 99.7% (7794393600/7813896704) finish=52211434.6min speed=0K/sec
bitmap: 14/30 pages [56KB], 131072KB chunk
=============================================================
The array mounts and the files are all intact, but it is still sitting at 99% complete with 52 million minutes to finish and counting up. The "No backup metadata" message made me suspicious that it is stuck because it can't write to /root/raid5backup (and looking at it now, I should have put it somewhere more sensible as I'm using sudo, but I used it in the RAID5 to RAID6 process and it was happy). It does seem to have modified the file, though:
stat raid5backup
File: raid5backup
Size: 3149824 Blocks: 6152 IO Block: 4096 regular file
Device: 802h/2050d Inode: 1572897 Links: 1
Access: (0600/-rw-------) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2021-06-26 19:39:16.739983712 +1000
Modify: 2021-06-26 19:40:16.778498938 +1000
Change: 2021-06-26 19:40:16.778498938 +1000
Birth: -
=============================================================
But I believe those times are from when I first ran the assemble command - it's 20:30 now. I couldn't find a flag to conditionally treat the backup file as garbage - just the --invalid-backup "I know it's garbage" option. Given that the assemble isn't complaining about needing to restore the critical section, is my next step something like:
sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde] --backup-file=raidbackup --invalid-backup
Thanks again, Phil. I haven't been using Linux seriously for very long, so this has been a steep learning curve for me.
Jason
* Re: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-26 11:09 ` Jason Flood
@ 2021-06-26 13:13 ` antlists
2021-06-26 14:28 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: antlists @ 2021-06-26 13:13 UTC (permalink / raw)
To: Jason Flood, 'Phil Turmel', linux-raid
On 26/06/2021 12:09, Jason Flood wrote:
> Reshape Status : 99% complete
> Delta Devices : -1, (5->4)
> New Layout : left-symmetric
>
> Name : Universe:0
> UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
> Events : 184255
>
> Number Major Minor RaidDevice State
> 6 8 16 0 active sync /dev/sdb
> 7 8 32 1 active sync /dev/sdc
> 5 8 48 2 active sync /dev/sdd
> 4 8 64 3 active sync /dev/sde
Phil will know much more about this than me, but I did notice that the
system thinks there should be FIVE raid drives. Is that an mdadm bug?
That would explain the failure to assemble - it thinks there's a drive
missing. And while I don't think we've had data-eating trouble,
reshaping a parity raid has caused quite a lot of grief for people over
the years ...
However, you're running a recent Ubuntu and mdadm - that should all have
been fixed by now.
Cheers,
Wol
* Re: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-26 13:13 ` antlists
@ 2021-06-26 14:28 ` Phil Turmel
2021-06-27 11:09 ` Jason Flood
0 siblings, 1 reply; 7+ messages in thread
From: Phil Turmel @ 2021-06-26 14:28 UTC (permalink / raw)
To: antlists, Jason Flood, linux-raid
Good morning Jason, Wol,
On 6/26/21 9:13 AM, antlists wrote:
> On 26/06/2021 12:09, Jason Flood wrote:
>> Reshape Status : 99% complete
>> Delta Devices : -1, (5->4)
>> New Layout : left-symmetric
>>
>> Name : Universe:0
>> UUID : 3eee8746:8a3bf425:afb9b538:daa61b29
>> Events : 184255
>>
>> Number Major Minor RaidDevice State
>> 6 8 16 0 active sync /dev/sdb
>> 7 8 32 1 active sync /dev/sdc
>> 5 8 48 2 active sync /dev/sdd
>> 4 8 64 3 active sync /dev/sde
>
> Phil will know much more about this than me, but I did notice that the
> system thinks there should be FIVE raid drives. Is that an mdadm bug?
Not a bug, but a reshape from a degraded array with a reduction in space.
> That would explain the failure to assemble - it thinks there's a drive
> missing. And while I don't think we've had data-eating trouble,
> reshaping a parity raid has caused quite a lot of grief for people over
> the years ...
I've never tried it starting from a degraded array. Might be a corner
case bug not yet exposed.
> However, you're running a recent Ubuntu and mdadm - that should all have
> been fixed by now.
Indeed.
> Cheers,
> Wol
On 6/26/21 7:09 AM, Jason Flood wrote:
> Thanks for that, Phil - I think I'm starting to piece it all together
> now. I was going from a 4-disk RAID5 to 4-disk RAID6, so from my reading
> the backup file was recommended. The non-standard layout meant that the
> array had over 20TB usable, but standardising the layout reduced that to
> 16TB. In that case the reshape starts at the end so the critical section
> (and so the backup file) may have been in progress at the 99% complete
> point when it failed, hence the need to specify the backup file for the
> assemble command.
>
> I ran "sudo mdadm --assemble --verbose --force /dev/md0 /dev/sd[bcde]
--backup-file=/root/raid5backup":
>
> mdadm: looking for devices for /dev/md0
> mdadm: /dev/sdb is identified as a member of /dev/md0, slot 0.
> mdadm: /dev/sdc is identified as a member of /dev/md0, slot 1.
> mdadm: /dev/sdd is identified as a member of /dev/md0, slot 2.
> mdadm: /dev/sde is identified as a member of /dev/md0, slot 3.
> mdadm: Marking array /dev/md0 as 'clean'
> mdadm: /dev/md0 has an active reshape - checking if critical section
> needs to be restored
> mdadm: No backup metadata on /root/raid5backup
> mdadm: added /dev/sdc to /dev/md0 as 1
> mdadm: added /dev/sdd to /dev/md0 as 2
> mdadm: added /dev/sde to /dev/md0 as 3
> mdadm: no uptodate device for slot 4 of /dev/md0
> mdadm: added /dev/sdb to /dev/md0 as 0
> mdadm: Need to backup 3072K of critical section..
> mdadm: /dev/md0 has been started with 4 drives (out of 5).
>
So force was sufficient to assemble. But you are still stuck at 99%.
Look at the output of ps to see if mdmon is still running (that is the
background process that actually reshapes stripe by stripe). If not,
look in your logs for clues as to why it died.
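For example, something like this (exact process names vary, so cast a wide
net; the syslog path assumes Ubuntu):

ps aux | grep -E '[m]dmon|[m]dadm|[m]d0'
dmesg | grep -iE 'md0|raid6'
sudo grep -i md0 /var/log/syslog | tail -n 50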
If you can't find anything significant, the next step would be to back up
the currently functioning array to another system/drive collection and
start from scratch. I wouldn't trust anything else with the information
available.
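For the copy itself, assuming the filesystem is mounted somewhere like
/mnt/array and you can scrape together a destination, a plain rsync is
enough (paths here are only examples):

rsync -aHAX --info=progress2 /mnt/array/ /mnt/backup/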
Phil
ps. Convention on kernel.org mailing lists is to NOT top-post, and to
trim unnecessary context.
* RE: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-26 14:28 ` Phil Turmel
@ 2021-06-27 11:09 ` Jason Flood
2021-06-28 14:24 ` Phil Turmel
0 siblings, 1 reply; 7+ messages in thread
From: Jason Flood @ 2021-06-27 11:09 UTC (permalink / raw)
To: 'Phil Turmel', 'antlists', linux-raid
Good morning Phil, Wol,
> So force was sufficient to assemble. But you are still stuck at 99%.
> Look at the output of ps to see if mdmon is still running (that is the background process that actually reshapes stripe by stripe). If not, look in your logs for clues as to why it died.
> If you can't find anything significant, the next step would be to backup the currently functioning array to another system/drive collection and start from scratch. I wouldn't trust anything else with the information available.
> Phil
Will do, thanks. I have a few assignments due next weekend so I may not be able to report back for a week.
> ps. Convention on kernel.org mailing lists is to NOT top-post, and to trim unnecessary context.
Sorry. First time on a mailing list since well before Outlook was invented!
Thanks again, Phil and Wol.
* Re: 4-disk RAID6 (non-standard layout) normalise hung, now all disks spare
2021-06-27 11:09 ` Jason Flood
@ 2021-06-28 14:24 ` Phil Turmel
0 siblings, 0 replies; 7+ messages in thread
From: Phil Turmel @ 2021-06-28 14:24 UTC (permalink / raw)
To: Jason Flood, linux-raid; +Cc: 'antlists'
Good morning Jason,
On 6/27/21 7:09 AM, Jason Flood wrote:
> Good morning Phil, Wol,
>
>> So force was sufficient to assemble. But you are still stuck at 99%.
>
>> Look at the output of ps to see if mdmon is still running (that is the background process that actually reshapes stripe by stripe). If not, look in your logs for clues as to why it died.
>
>> If you can't find anything significant, the next step would be to backup the currently functioning array to another system/drive collection and start from scratch. I wouldn't trust anything else with the information available.
>
>> Phil
>
> Will do, thanks. I have a few assignments due next weekend so I may not be able to report back for a week.
No worries.
>> ps. Convention on kernel.org mailing lists is to NOT top-post, and to trim unnecessary context.
>
> Sorry. First time on a mailing list since well before Outlook was invented!
No worries. Also note that many mailing lists disagree with this. And
with kernel.org's convention to CC: all participants. You almost can't
avoid getting flamed one way or the other. (:
> Thanks again, Phil and Wol.
You're welcome.
Phil